Getting Started with NeMo Safe Synthesizer#

This guide covers the prerequisites, environment setup, and first API and CLI calls for generating synthetic data with NeMo Safe Synthesizer.

Prerequisites#

Before using NeMo Safe Synthesizer, complete the NeMo Microservices Installation to install the CLI/SDK and deploy the platform.

NeMo Safe Synthesizer has the following additional requirements:

  • An NVIDIA GPU with at least 80 GB of VRAM

  • Sufficient disk space for generated datasets (50 GB or more recommended)
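
The requirements above can be checked before installation with a short preflight script. This is a minimal sketch, not part of the NeMo tooling: the `report` helper and the threshold variables are illustrative names, and the 80000 MiB threshold is an approximation of "80 GB", since `nvidia-smi` reports memory in MiB and the exact figure varies by GPU model.

```shell
# Preflight sketch: compare detected resources against the documented minimums.
# MIN_VRAM_MIB approximates 80 GB (assumption); nvidia-smi reports MiB.
MIN_VRAM_MIB=80000
MIN_DISK_GB=50

# report LABEL ACTUAL REQUIRED: print OK or LOW for an integer comparison.
report() {
  if [ "$2" -ge "$3" ]; then
    echo "OK: $1 ($2 >= $3)"
  else
    echo "LOW: $1 ($2 < $3)"
  fi
}

# GPU memory: use the largest total memory across all visible GPUs.
if command -v nvidia-smi >/dev/null 2>&1; then
  vram_mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | sort -n | tail -n 1)
  report "GPU memory (MiB)" "${vram_mib:-0}" "$MIN_VRAM_MIB"
else
  echo "SKIP: nvidia-smi not found; cannot check GPU memory"
fi

# Free space on the filesystem that holds the current directory, in GB.
free_kb=$(df -Pk . | awk 'NR==2 {print $4}')
report "free disk (GB)" "$((free_kb / 1024 / 1024))" "$MIN_DISK_GB"
```

Run the script from the directory where you plan to store datasets, since `df` is evaluated against the current directory.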

Set Environment Variables#

Configure your API key for accessing NVIDIA’s inference endpoints:

export NIM_API_KEY=<build.nvidia.com-api-key>  # Required for PII replacement
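
Scripts that depend on this variable can fail fast when it is missing. The following is a minimal sketch; `require_env` is a hypothetical helper name, not a command provided by the platform.

```shell
# require_env NAME: return non-zero if the named variable is unset or empty.
# Illustrative helper; pass only trusted variable names, since it uses eval.
require_env() {
  eval "_value=\${$1:-}"
  if [ -z "$_value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}

require_env NIM_API_KEY || echo "Set NIM_API_KEY before running PII replacement."
```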

Tip

For production deployments, store your API keys as platform secrets rather than environment variables. See Managing Secrets for details.


Using the API#

Test the NeMo Safe Synthesizer jobs API:

curl -X GET -H "Content-type: application/json" localhost:8080/v2/workspaces/default/safe-synthesizer/jobs

You should see an empty array [] if you have not created any jobs yet.
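
To tell an unreachable platform apart from an error response, the same request can be wrapped so that the HTTP status is reported alongside the body. This is a sketch under assumptions: `BASE_URL`, `fetch_jobs`, and `interpret` are illustrative names, the default address mirrors the `localhost:8080` used above, and a `200` status for a successful listing is assumed rather than taken from the API reference.

```shell
# BASE_URL is an assumed variable; the default matches the address used above.
BASE_URL="${BASE_URL:-http://localhost:8080}"
JOBS_URL="$BASE_URL/v2/workspaces/default/safe-synthesizer/jobs"

# fetch_jobs: print the response body, then the HTTP status on a final line.
fetch_jobs() {
  curl -sS --max-time 10 -w '\n%{http_code}' "$JOBS_URL"
}

# interpret RESPONSE: the last line is the status code, the rest is the body.
interpret() {
  status=$(printf '%s\n' "$1" | tail -n 1)
  body=$(printf '%s\n' "$1" | sed '$d')
  case "$status" in
    200) echo "jobs API reachable; response: $body" ;;
    *)   echo "unexpected status $status; response: $body" ;;
  esac
}

if response=$(fetch_jobs 2>/dev/null); then
  interpret "$response"
else
  echo "request failed; check that the platform is running at $BASE_URL"
fi
```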


Using the CLI#

You can also interact with NeMo Safe Synthesizer using the nmp CLI:

# List jobs
nmp safe-synthesizer jobs list

# Create a job (from a config file)
nmp safe-synthesizer jobs create --input-file config.json

# Create a job (inline JSON)
nmp safe-synthesizer jobs create --input-data '{"spec": {...}}'
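
When creating a job from a file, it can help to confirm that the file parses as JSON before submitting it. The sketch below assumes `python3` is on your `PATH`; `validate_json` and `CONFIG_FILE` are illustrative names. It checks syntax only and does not validate the job spec itself.

```shell
# validate_json FILE: succeed only if FILE exists and parses as JSON.
# Illustrative helper; relies on Python's built-in json.tool module.
validate_json() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

CONFIG_FILE="${CONFIG_FILE:-config.json}"

if validate_json "$CONFIG_FILE"; then
  # Same create command as above, run only after the syntax check passes.
  nmp safe-synthesizer jobs create --input-file "$CONFIG_FILE"
else
  echo "not valid JSON (or file missing): $CONFIG_FILE"
fi
```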

Next Steps#

Run one of the tutorials to create your first synthetic dataset.


Troubleshooting#

GPU Memory Issues: Ensure your GPU has enough VRAM for the models you plan to use (the prerequisites call for 80 GB or more). Run nvidia-smi to check current GPU memory usage.

Storage Space: Ensure you have enough free disk space for models and datasets (50 GB or more recommended).

For general platform troubleshooting (port conflicts, health checks, etc.), see the Installation guide.