NeMo Guardrails Terminology#

The following terms identify the key concepts related to managing LLM safety by using the NeMo Guardrails microservice.

Models#

There are two groups of models you can define in a guardrail configuration.

Main model

The primary LLM that generates a response to a user prompt. The NeMo Guardrails service automatically uses the model specified in an incoming request (for example, `model: "meta/llama-3.1-8b-instruct"`).

When using self-check rails in your guardrail configuration, the main model performs the check.

Task model

The LLM used for a specific guardrail task on the user input and LLM output (for example, content safety or topic control checks). A guardrail configuration can use multiple task models.

LLM provider

The hosted or managed service for using an LLM. In most cases, this is likely a locally-hosted NIM or the NVIDIA API Catalog managed service, but the NeMo Guardrails service also supports external endpoints such as the OpenAI API.

LLM engine

The underlying runtime that controls how the NeMo Guardrails service communicates with an LLM. If you self-host a NIM, or use the NVIDIA API Catalog managed service, use `nim`. If you use the OpenAI API, use `openai`.
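Putting these terms together, the `models` section of a guardrail configuration might look like the following sketch. The model names and the `content_safety` task type are illustrative; consult your deployment for the exact models available:

```yaml
models:
  # Main model: generates responses to user prompts.
  # The engine "nim" targets a self-hosted NIM or the NVIDIA API Catalog.
  - type: main
    engine: nim
    model: meta/llama-3.1-8b-instruct

  # Task model: runs a specific guardrail check (illustrative example).
  - type: content_safety
    engine: nim
    model: nvidia/llama-3.1-nemoguard-8b-content-safety
```

Each entry pairs an LLM engine (`nim` or `openai`) with a model identifier, so a single configuration can mix a main model with one or more task models.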

Guardrail Configurations#

Guardrail configuration

A configuration object stored in the database that defines how to perform guardrail checks. The configuration specifies information about the model(s) to use and the rails to apply to user input and LLM output.

Rail (or guardrail)

The configuration that controls the interaction with an LLM, potentially modifying or blocking content at a specific point during request processing. Rails are triggered at different points during the handling of a request:

| Rail type | Trigger point | Purpose |
|---|---|---|
| Input rails | When the user input is received. | Validate, filter, or modify user input. |
| Retrieval rails | After RAG retrieval completes. | Process retrieved chunks. |
| Dialog rails | After canonical user intent is computed. | Control conversation flow. |
| Execution rails | Before and after action execution. | Control tool and action calls. |
| Output rails | When the LLM output is generated. | Validate, filter, or modify bot responses. |
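In a guardrail configuration, rails are declared per trigger point in a `rails` section. The following sketch shows the overall shape; the flow names are illustrative examples, and the exact flows depend on the rails you enable:

```yaml
rails:
  input:
    flows:
      - self check input      # run before the main model sees the input
  retrieval:
    flows:
      - check retrieved chunks  # illustrative name for a retrieval-stage flow
  output:
    flows:
      - self check output     # run before the response reaches the user
```

Each key corresponds to one of the rail types in the table above, and each flow names a guardrail action to run at that point.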

Input rail

A rail applied to user input before it reaches the main model. An input rail can reject the input, stopping any additional processing, or alter the input (for example, masking potentially sensitive data). A guardrail configuration can contain multiple input rails.

Retrieval rail

Applied to the retrieved chunks in the case of a retrieval augmented generation (RAG) deployment. A retrieval rail can reject a chunk, preventing it from being used to prompt the LLM, or alter the relevant chunks, such as masking potentially sensitive data.

Dialog rail

A rail that influences how the LLM is prompted. Dialog rails operate on canonical form messages and determine the next action to take: for example, whether the LLM should generate the next step or a response, or whether NeMo Guardrails should return a predefined response instead.

Output rail

A rail applied to the LLM output before returning it to the user. An output rail can reject the output, ensuring the LLM output isn't returned to the user, or alter the output (for example, masking potentially sensitive data). A guardrail configuration can contain multiple output rails.

Flow

A named action in an input or output rail that defines the guardrail action to perform (for example, `self check input`).

Task prompt

A prompt template associated with a specific flow that defines the instructions to give the model that performs the action.

Prompt template variable

A templated placeholder in a task prompt that is populated with actual content at runtime (for example, `{{ user_input }}`).
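A task prompt and its template variables come together in the `prompts` section of a guardrail configuration. The following is an illustrative sketch for a `self check input` flow; the policy wording is an assumption, not a prescribed prompt:

```yaml
prompts:
  - task: self_check_input
    content: |
      Your task is to determine whether the user message below
      complies with the usage policy.

      User message: "{{ user_input }}"

      Should the user message be blocked? Answer Yes or No.
```

At runtime, `{{ user_input }}` is replaced with the actual user message before the prompt is sent to the model that performs the check.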

Refusal text

The pre-defined message returned if the NeMo Guardrails service blocks a request. By default, this value is “I’m sorry, I can’t respond to that.”.

Storage#

Configuration store

The storage backend (Postgres) for guardrail configurations. Configurations can also be loaded into the database when the service starts up, either from a directory specified by the `CONFIG_STORE_PATH` environment variable or from a Kubernetes ConfigMap.
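When loading from a directory, `CONFIG_STORE_PATH` typically points at a tree in which each subdirectory is one guardrail configuration. The layout below is an illustrative sketch; the configuration names are hypothetical:

```
$CONFIG_STORE_PATH/
├── default/
│   └── config.yml          # models, rails, and prompts for this configuration
└── content-safety/
    ├── config.yml
    └── prompts.yml         # task prompts can also live in a separate file
```

Each subdirectory name becomes the identifier used to select that configuration in requests to the service.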