Quick Start

This guide walks you through setting up inference and running your first Data Designer job.

Prerequisites

  • Access to an NMP deployment

  • An API key for a model provider (e.g., NVIDIA Build, OpenAI)

Step 1: Install the SDK

Install the NMP SDK with Data Designer support:

pip install "nemo-microservices[data-designer]"

The [data-designer] extra includes the data_designer.config package for building configurations.
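To confirm that the install worked before moving on, a quick import check can help. The following is a minimal sketch using only the standard library. The helper name `is_installed` is made up for illustration, and the module names are assumed from the imports used later in this guide.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Assumed top-level import names, taken from the imports used in this guide.
for name in ("nemo_microservices", "data_designer"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either module is reported as missing, re-run the install command in the same Python environment you plan to use.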

Step 2: Initialize the SDK

Create an SDK client that points to your NMP deployment. This example reads the deployment URL from the NMP_BASE_URL environment variable:

import os
from nemo_microservices import NeMoMicroservices

sdk = NeMoMicroservices(
    base_url=os.environ["NMP_BASE_URL"],
    workspace="default"
)
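If NMP_BASE_URL is not set, `os.environ["NMP_BASE_URL"]` raises a bare `KeyError`. The sketch below shows one way to fail with a clearer message. The helper `require_env` is hypothetical and not part of the SDK.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "export it before initializing the SDK."
        )
    return value


# Example usage (commented out so the snippet runs without the variable set):
# base_url = require_env("NMP_BASE_URL")
```

You could then pass the returned value as `base_url` when constructing the client.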

Step 3: Configure Inference

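Data Designer routes all inference through the Inference Gateway service, so you must register a model provider before running a job. This takes two steps: store the provider's API key as a secret, then create a model provider that references that secret.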

Store Your API Key

Use the Secrets service to securely store your API key:

sdk.secrets.create(
    name="nvidia-api-key",
    data="<your-api-key>",  # Replace with your provider API key; do not commit it to source control
    description="NVIDIA Build API key"
)

Create a Model Provider

Create a model provider that points to your inference endpoint and references the stored secret by name:

sdk.inference.providers.create(
    name="build-nvidia",
    description="NVIDIA Build API provider",
    host_url="https://integrate.api.nvidia.com",
    api_key_secret_name="nvidia-api-key"
)
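Step 4 refers to this provider as "default/build-nvidia". Judging from the values used in this guide, that string appears to combine the workspace ("default") with the provider name ("build-nvidia"). The sketch below builds such a reference under that assumption. The helper `provider_ref` is hypothetical and not part of the SDK.

```python
def provider_ref(workspace: str, provider_name: str) -> str:
    """Build a "<workspace>/<provider name>" reference (format assumed from this guide)."""
    for part in (workspace, provider_name):
        if not part or "/" in part:
            raise ValueError(f"Invalid reference component: {part!r}")
    return f"{workspace}/{provider_name}"


print(provider_ref("default", "build-nvidia"))  # default/build-nvidia
```

Deriving the reference from the same two values you passed to the SDK avoids typos when you rename a provider or switch workspaces.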

Step 4: Build a Configuration

Use the data_designer.config package to define your dataset:

import data_designer.config as dd

# Define model configuration
model_configs = [
    dd.ModelConfig(
        provider="default/build-nvidia",  # Model provider created in Step 3, prefixed with its workspace
        model="nvidia/nemotron-3-nano-30b-a3b",  # Model name as exposed by the provider
        alias="text",
        inference_parameters=dd.ChatCompletionInferenceParams(
            temperature=1.0,
            top_p=1.0,
        ),
    )
]

# Create config builder
config_builder = dd.DataDesignerConfigBuilder(model_configs)

# Add columns
config_builder.add_column(
    dd.SamplerColumnConfig(
        name="category",
        sampler_type=dd.SamplerType.CATEGORY,
        params=dd.CategorySamplerParams(
            values=["Electronics", "Clothing", "Books"]
        ),
    )
)

config_builder.add_column(
    dd.LLMTextColumnConfig(
        name="product_name",
        prompt="Generate a creative product name for a {{ category }} product.",
        model_alias="text",
    )
)
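The two columns above are linked: the prompt for product_name refers to the sampled category value through the {{ category }} placeholder. The sketch below imitates that flow locally to show the idea. It is not how Data Designer is implemented. The helpers `render` and `sample_rows` are hypothetical, and the real service presumably uses a full template engine rather than this regular expression.

```python
import random
import re

CATEGORIES = ["Electronics", "Clothing", "Books"]
PROMPT = "Generate a creative product name for a {{ category }} product."

# Matches placeholders such as {{ category }} or {{category}}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, row: dict) -> str:
    """Replace each {{ name }} placeholder with the matching value from row."""

    def substitute(match):
        key = match.group(1)
        if key not in row:
            raise KeyError(f"No column named {key!r} in this row")
        return str(row[key])

    return _PLACEHOLDER.sub(substitute, template)


def sample_rows(n: int, seed: int = 0) -> list:
    """Draw n rows, each with a category picked uniformly at random."""
    rng = random.Random(seed)
    return [{"category": rng.choice(CATEGORIES)} for _ in range(n)]


# Each printed line is the prompt a model would receive for one row.
for row in sample_rows(3):
    print(render(PROMPT, row))
```

A placeholder that names a column you have not defined raises an error here, which mirrors a common mistake when building a configuration.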

Step 5: Preview Your Dataset

Use the preview method to generate a small sample, so you can check the output and adjust the configuration before committing to a full run:

from nemo_microservices.data_designer.client import NeMoDataDesignerClient

client = NeMoDataDesignerClient(sdk=sdk)

preview = client.preview(config_builder)

# Display a sample record
preview.display_sample_record()

# Access as DataFrame
df = preview.dataset
print(df.head())

# View analysis
preview.analysis.to_report()
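Before scaling up, it can be worth checking that sampled values are spread as expected and that no generated field came back empty. The sketch below runs on a plain list of dictionaries. It assumes `preview.dataset` is a pandas DataFrame (as `df.head()` suggests), in which case `df.to_dict("records")` would give this shape. The helper `summarize` and the records are made up for illustration.

```python
from collections import Counter


def summarize(records, sampled_column, generated_column):
    """Count sampled values and count rows whose generated text is empty."""
    counts = Counter(record[sampled_column] for record in records)
    empty = sum(
        1
        for record in records
        if not str(record.get(generated_column) or "").strip()
    )
    return counts, empty


# Made-up records standing in for df.to_dict("records").
records = [
    {"category": "Books", "product_name": "PageTurner Pro"},
    {"category": "Books", "product_name": ""},
    {"category": "Clothing", "product_name": "CloudStride Hoodie"},
]

counts, empty = summarize(records, "category", "product_name")
print(dict(counts))
print(f"rows with empty product_name: {empty}")
```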

Step 6: Generate Full Dataset

Once you are satisfied with the preview, submit a job to generate the full dataset:

job = client.create(config_builder, num_records=1000)

# Wait for completion
job.wait_until_done()

# Load results
dataset = job.load_dataset()
analysis = job.load_analysis()

print(dataset.head())
analysis.to_report()
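After loading the results, you may want to keep a local copy. The sketch below writes rows to a JSON Lines file using only the standard library. It assumes the loaded dataset can be converted to a list of dictionaries, for example with `dataset.to_dict("records")` if it is a pandas DataFrame. The helpers `write_jsonl` and `read_jsonl` and the sample rows are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line and return the number of rows written."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dictionaries."""
    with Path(path).open("r", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Made-up rows standing in for the generated dataset.
rows = [
    {"category": "Electronics", "product_name": "VoltNest Charger"},
    {"category": "Books", "product_name": "Inkwell Atlas"},
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "products.jsonl"
    written = write_jsonl(rows, target)
    restored = read_jsonl(target)

print(f"wrote {written} rows; round trip matches: {restored == rows}")
```

JSON Lines keeps one record per line, which makes large datasets easy to stream or append to.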

Next Steps