# Guardrail Tutorials
Use the following tutorials to learn how to accomplish common guardrail tasks using the NeMo Guardrails microservice.
> **Tip:** The tutorials reference `NMP_BASE_URL`, whose value depends on the ingress configuration of your cluster.
## Using an External Endpoint
If you do not have access to GPUs, you can use the NIM microservices hosted on build.nvidia.com for the tutorials instead of deploying them manually. Use the Inference Gateway service to create an NVIDIA Build `ModelProvider` that routes requests to `https://integrate.api.nvidia.com`.
### Step 1
Install the required packages.

```bash
pip install -q nemo-microservices
```
Instantiate the `NeMoMicroservices` SDK client.

```python
import os

from nemo_microservices import NeMoMicroservices

sdk = NeMoMicroservices(base_url=os.environ["NMP_BASE_URL"], workspace="default")
```
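If `NMP_BASE_URL` is unset, the `os.environ[...]` lookup above raises a bare `KeyError`. A minimal fail-fast sketch (the `get_base_url` helper is hypothetical, not part of the SDK) gives a clearer error:

```python
import os


def get_base_url(env=os.environ) -> str:
    """Read the platform base URL, failing fast with a clear message."""
    url = env.get("NMP_BASE_URL")
    if not url:
        raise RuntimeError("Set NMP_BASE_URL to your cluster's ingress URL")
    # Trim a trailing slash so later path joins stay predictable.
    return url.rstrip("/")
```

You can then instantiate the client with `NeMoMicroservices(base_url=get_base_url(), workspace="default")`.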
### Step 2
Use the Secrets service to store your NGC API key securely. Replace the `data` value with your key.

```python
sdk.secrets.create(
    name="nvidia-api-key",
    data="<your-ngc-api-key>",
    description="NVIDIA Build API key",
)
```
### Step 3
Create a `ModelProvider`, which represents a reachable inference endpoint. When you create a `ModelProvider`, model entities are automatically discovered and created for each model available at the endpoint.

```python
sdk.inference.providers.create(
    name="build-nvidia",
    description="NVIDIA Build API provider",
    host_url="https://integrate.api.nvidia.com",
    api_key_secret_name="nvidia-api-key",
)
```
### Step 4
Use model entity references (in the form `workspace/model_name`) in your guardrail configurations. Requests are routed automatically through the Inference Gateway for your `ModelProvider`.

```python
guardrails_config = {
    "models": [
        {
            "type": "content_safety",
            "engine": "nim",
            "model": "default/nvidia-llama-3-1-nemoguard-8b-content-safety",
        }
    ],
    # ... rest of config
}
```
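A malformed model reference in the configuration surfaces only at request time. A small validation sketch can catch it earlier; the `validate_model_refs` helper below is hypothetical, not part of the SDK:

```python
def validate_model_refs(config: dict) -> None:
    """Ensure every model entry uses a 'workspace/model_name' reference."""
    for entry in config.get("models", []):
        ref = entry.get("model", "")
        workspace, sep, model_name = ref.partition("/")
        if not (workspace and sep and model_name):
            raise ValueError(f"expected 'workspace/model_name', got {ref!r}")
```

Calling `validate_model_refs(guardrails_config)` before submitting the configuration fails fast on references missing the workspace prefix.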
- Use content safety checks to detect and block harmful content.
- Configure parallel rails for input and output guardrails.
- Run safety checks on multimodal data with the NeMo Guardrails microservice.
- Configure checks for SQL, XSS, template, and code injection.