# Guardrail Tutorials
Use the following tutorials to learn how to accomplish common guardrail tasks using the NeMo Guardrails microservice.
> **Tip:** The tutorials reference `NMP_BASE_URL`, whose value depends on the ingress configuration of your cluster.
## Using an External Endpoint
If you do not have access to GPUs, you can use the NIM microservices hosted on build.nvidia.com for the tutorials instead of deploying them manually. Use the Inference Gateway service to create an NVIDIA Build `ModelProvider` that routes requests to `https://integrate.api.nvidia.com`.
### Step 1
Install the required packages.

```bash
pip install -q nemo-microservices
```
Instantiate the `NeMoMicroservices` SDK client.

```python
import os

from nemo_microservices import NeMoMicroservices

sdk = NeMoMicroservices(base_url=os.environ["NMP_BASE_URL"], workspace="default")
```
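If `NMP_BASE_URL` is unset, the `os.environ[...]` lookup above raises a bare `KeyError`. A minimal fail-fast sketch (the `get_base_url` helper is hypothetical, not part of the SDK) gives a clearer error:

```python
import os


def get_base_url(env=os.environ) -> str:
    """Read the platform base URL, failing fast with a clear message."""
    url = env.get("NMP_BASE_URL")
    if not url:
        raise RuntimeError("Set NMP_BASE_URL to your cluster's ingress URL")
    # Trim a trailing slash so later path joins stay predictable.
    return url.rstrip("/")
```

You can then instantiate the client with `NeMoMicroservices(base_url=get_base_url(), workspace="default")`.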
### Step 2
Use the Secrets service to store your NGC API key securely. Replace the `data` value with your key.

```python
sdk.secrets.create(
    name="nvidia-api-key",
    data="<your-ngc-api-key>",
    description="NVIDIA Build API key",
)
```
### Step 3
Create a `ModelProvider`, which represents a reachable inference endpoint. When you create a `ModelProvider`, model entities are automatically discovered and created for each model available at the endpoint.

```python
sdk.inference.providers.create(
    name="build-nvidia",
    description="NVIDIA Build API provider",
    host_url="https://integrate.api.nvidia.com",
    api_key_secret_name="nvidia-api-key",
)
```
### Step 4
Use model entity references (in the form `workspace/model_name`) in your guardrail configurations. Requests are routed automatically through the Inference Gateway for your `ModelProvider`.

```python
guardrails_config = {
    "models": [
        {
            "type": "content_safety",
            "engine": "nim",
            "model": "default/nvidia-llama-3-1-nemoguard-8b-content-safety",
        }
    ],
    # ... rest of config
}
```
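A malformed model reference in the configuration surfaces only at request time. A small validation sketch can catch it earlier; the `validate_model_refs` helper below is hypothetical, not part of the SDK:

```python
def validate_model_refs(config: dict) -> None:
    """Ensure every model entry uses a 'workspace/model_name' reference."""
    for entry in config.get("models", []):
        ref = entry.get("model", "")
        workspace, sep, model_name = ref.partition("/")
        if not (workspace and sep and model_name):
            raise ValueError(f"expected 'workspace/model_name', got {ref!r}")
```

Calling `validate_model_refs(guardrails_config)` before submitting the configuration fails fast on references missing the workspace prefix.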
- Use content safety checks to detect and block harmful content.
- Configure parallel rails for input and output guardrails.
- Run safety checks on multimodal data with the NeMo Guardrails microservice.
- Configure checks for SQL, XSS, template, and code injection.