Getting Started with NeMo Safe Synthesizer#
This guide covers the prerequisites, environment setup, and first API and CLI calls you need to start generating synthetic data with NeMo Safe Synthesizer.
Prerequisites#
Before using NeMo Safe Synthesizer, complete the NeMo Microservices Installation to install the CLI/SDK and deploy the platform.
NeMo Safe Synthesizer has the following additional requirements:
An NVIDIA GPU with 80GB+ VRAM
Sufficient disk space for generated datasets (50GB+ recommended)
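As a quick way to check these two requirements on the target machine, the sketch below queries GPU memory with nvidia-smi and free disk space with df. The thresholds and the helper name meets_minimum are illustrative assumptions, not part of NeMo Safe Synthesizer, and the script only inspects the first GPU and the filesystem of the current directory.

```shell
# Thresholds based on the requirements listed above.
# Reported totals for "80GB" GPUs vary slightly, so the VRAM threshold is approximate.
MIN_VRAM_MIB=80000
MIN_DISK_GB=50

# meets_minimum VALUE MINIMUM (hypothetical helper)
# Succeeds only when VALUE is a non-negative integer >= MINIMUM.
meets_minimum() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge "$2" ]
}

# Total memory (MiB) of the first GPU, if nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  vram_mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1 | tr -d ' ')
  if meets_minimum "$vram_mib" "$MIN_VRAM_MIB"; then
    echo "GPU memory OK: ${vram_mib} MiB"
  else
    echo "GPU memory may be insufficient: '${vram_mib}' MiB reported"
  fi
else
  echo "nvidia-smi not found; skipping GPU check"
fi

# Free space (whole GB) on the filesystem that holds the current directory.
free_gb=$(df -Pk . | awk 'NR==2 {print int($4 / 1024 / 1024)}')
if meets_minimum "$free_gb" "$MIN_DISK_GB"; then
  echo "Disk space OK: ${free_gb} GB free"
else
  echo "Less than ${MIN_DISK_GB} GB free: ${free_gb} GB"
fi
```

Run it from the directory where you expect datasets to be written, since df reports on the filesystem that holds the current directory.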
Set Environment Variables#
Configure your API key for accessing NVIDIA’s inference endpoints:
export NIM_API_KEY=<build.nvidia.com-api-key> # Required for PII replacement
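To confirm the variable is visible in the current shell without echoing the secret itself, something like the following can help. The function name check_api_key is a hypothetical helper for illustration; it reports only whether the key is set and how long it is.

```shell
# check_api_key (hypothetical helper)
# Reports whether NIM_API_KEY is set without printing its value.
check_api_key() {
  if [ -z "${NIM_API_KEY:-}" ]; then
    echo "NIM_API_KEY is not set" >&2
    return 1
  fi
  echo "NIM_API_KEY is set (${#NIM_API_KEY} characters)"
}

check_api_key || echo "Export NIM_API_KEY before running jobs that use PII replacement."
```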
Tip
For production deployments, store your API keys as platform secrets rather than environment variables. See Managing Secrets for details.
Using the API#
To confirm that the NeMo Safe Synthesizer jobs API is reachable, list the jobs in the default workspace:
curl -X GET -H "Content-type: application/json" localhost:8080/v2/workspaces/default/safe-synthesizer/jobs
You should see an empty array [] if you have not created any jobs yet.
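If the request returns nothing or an error, it helps to look at the HTTP status code as well as the body. The sketch below wraps the same request and separates the two. The BASE_URL variable and the helper names are assumptions for illustration; the default host and port are taken from the example above.

```shell
# Assumption: the platform listens on localhost:8080, as in the example above.
BASE_URL="${BASE_URL:-http://localhost:8080}"
JOBS_URL="${BASE_URL}/v2/workspaces/default/safe-synthesizer/jobs"

# The response is captured as "<body>\n<status>"; these helpers split it.
status_of() { printf '%s\n' "$1" | tail -n 1; }
body_of()   { printf '%s\n' "$1" | sed '$d'; }

# Turn a status code into a short hint.
describe_status() {
  case "$1" in
    200) echo "API reachable" ;;
    000) echo "no response; is the platform running on this host and port?" ;;
    *)   echo "unexpected HTTP status $1" ;;
  esac
}

# -w appends the status code on its own line; curl reports 000 when it cannot connect.
response=$(curl -sS --max-time 5 -w '\n%{http_code}' \
  -H "Content-type: application/json" "$JOBS_URL" 2>/dev/null)

describe_status "$(status_of "$response")"
body_of "$response"
```

With no jobs created yet, the body printed on the last line should be the empty array described above.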
Using the CLI#
You can also interact with NeMo Safe Synthesizer using the nmp CLI:
# List jobs
nmp safe-synthesizer jobs list
# Create a job (from a config file)
nmp safe-synthesizer jobs create --input-file config.json
# Create a job (inline JSON)
nmp safe-synthesizer jobs create --input-data '{"spec": {...}}'
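A malformed config file is an easy mistake to catch before submitting. The sketch below checks that the file parses as JSON and only then calls the create command shown above. It assumes python3 is available on the PATH; the CONFIG_FILE variable and the is_valid_json helper are hypothetical names, and the check covers JSON syntax only, not whether the job spec itself is valid.

```shell
# Assumption: the job config lives in config.json unless CONFIG_FILE says otherwise.
CONFIG_FILE="${CONFIG_FILE:-config.json}"

# is_valid_json FILE (hypothetical helper)
# Succeeds when FILE exists and parses as JSON. Checks syntax only.
is_valid_json() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

if is_valid_json "$CONFIG_FILE"; then
  nmp safe-synthesizer jobs create --input-file "$CONFIG_FILE"
else
  echo "$CONFIG_FILE is missing or is not valid JSON; job not submitted" >&2
fi
```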
Next Steps#
Run one of our tutorials to create your first synthetic dataset:
Safe Synthesizer 101 Tutorial - A beginner-friendly introduction
PII Replacement Deep Dive - A walkthrough of various PII removal methods
Differential Privacy - Generate differentially private synthetic data
Troubleshooting#
GPU Memory Issues: Ensure your GPU has adequate VRAM for the models you plan to use. Check GPU usage with nvidia-smi.
Storage Space: Ensure you have adequate disk space for models and datasets (50GB+ recommended).
For general platform troubleshooting (port conflicts, health checks, etc.), see the Installation guide.