Hugging Face API Key Secret#

NeMo Customizer supports downloading models from Hugging Face Hub, including private models. This guide explains how to create and configure a Kubernetes secret containing your Hugging Face API token for use with NeMo Customizer.

Prerequisites#

A Hugging Face account with an API token
Access to create Kubernetes secrets in your cluster
NeMo Customizer deployed or being deployed

Create Hugging Face API Token#

Log in to Hugging Face
Navigate to Settings > Access Tokens
Create a new token with Read permissions (or Write if you plan to upload models)
Copy the token value

Create Kubernetes Secret#

Create a Kubernetes secret containing your Hugging Face API token:

kubectl create secret generic hf-api-secret \
  --from-literal=HF_API_KEY=<your-huggingface-token> \
  --namespace <your-customizer-namespace>

Alternatively, create the secret using a YAML manifest:

apiVersion: v1
kind: Secret
metadata:
  name: hf-api-secret
  namespace: <your-customizer-namespace>
type: Opaque
stringData:
  HF_API_KEY: <your-huggingface-token>
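
If you generate the manifest from a script instead of writing it by hand, the following is a minimal sketch of one way to do it. It relies on the fact that Kubernetes stores Secret values base64-encoded under `data`, while `stringData` is a convenience field that accepts plain text. The helper name `build_hf_secret` and the placeholder token and namespace are assumptions for illustration, not part of NeMo Customizer.

```python
import base64
import json


def build_hf_secret(token, namespace, name="hf-api-secret", key="HF_API_KEY"):
    """Return a Secret manifest equivalent to the YAML above, using `data`.

    Hypothetical helper for illustration; the defaults mirror this guide.
    """
    if not token:
        raise ValueError("token must be non-empty")
    # Values under `data` must be base64-encoded strings.
    encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {key: encoded},
    }


if __name__ == "__main__":
    # Placeholder values only; never commit a real token.
    print(json.dumps(build_hf_secret("hf_example_token", "nemo"), indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed output can be saved to a file and applied directly.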

Configure NeMo Customizer#

To enable Hugging Face model downloads, configure the following in your values.yaml:

customizer:
  customizerConfig:
    hfTargetDownload:
      enabled: true
    hfApiSecretName: hf-api-secret
    hfApiSecretKey: HF_API_KEY

Configuration Parameters#

hfTargetDownload.enabled: Set to true to enable downloading models from Hugging Face Hub
hfApiSecretName: The name of the Kubernetes secret containing your Hugging Face API token
hfApiSecretKey: The key within the secret that contains the API token (default: HF_API_KEY)
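
As a rough pre-deployment check, the sketch below inspects an already-parsed values dictionary for the three parameters listed above. The function name `check_hf_values` and its messages are hypothetical, and the check assumes the key layout shown in this guide. It only looks at the structure and does not contact the cluster or Hugging Face.

```python
from typing import List


def check_hf_values(values: dict) -> List[str]:
    """Return a list of problems found in the Hugging Face settings.

    Hypothetical helper; assumes the key layout shown in this guide.
    """
    problems = []
    # Use `or {}` so an empty YAML section (parsed as None) is handled.
    customizer = values.get("customizer") or {}
    cfg = customizer.get("customizerConfig") or {}
    download = cfg.get("hfTargetDownload") or {}

    if download.get("enabled") is not True:
        problems.append("hfTargetDownload.enabled is not true")
    if not cfg.get("hfApiSecretName"):
        problems.append("hfApiSecretName is not set")
    # hfApiSecretKey may be omitted; the guide lists HF_API_KEY as its default.
    key = cfg.get("hfApiSecretKey", "HF_API_KEY")
    if not key:
        problems.append("hfApiSecretKey is empty")
    return problems


example = {
    "customizer": {
        "customizerConfig": {
            "hfTargetDownload": {"enabled": True},
            "hfApiSecretName": "hf-api-secret",
            "hfApiSecretKey": "HF_API_KEY",
        }
    }
}
print(check_hf_values(example))
```

To run it against a real values.yaml, load the file with a YAML parser of your choice and pass the resulting dictionary to the function.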

Verify Configuration#

After deploying with the Hugging Face secret configured, you can verify it’s working by:

Checking that the secret is referenced by the Customizer pods:

kubectl describe pod <customizer-pod-name> -n <namespace> | grep hf-api-secret

Testing model download by creating a customization target that references a Hugging Face model:

from nemo_microservices import NeMoMicroservices

client = NeMoMicroservices(base_url="http://your-nemo-url")

target = client.customization.targets.create(
    name="my-model",
    namespace="default",
    model_uri="hf://google/gemma-2b-it",
    # ... other parameters
)

Use Cases#

Download Private or Gated Models: Access private repositories and gated models (e.g., Llama, Gemma) that require authentication
Avoid Rate Limits: Authenticated requests have higher rate limits than anonymous requests
Upload Customized Models: With write permissions, upload fine-tuned models back to Hugging Face Hub

Security Best Practices#

Store the secret in the same namespace as your Customizer deployment, because pods can only reference secrets in their own namespace
Use Kubernetes RBAC to restrict access to the secret
Rotate tokens periodically
Use separate tokens for different environments (dev, staging, production)

Troubleshooting#

Models Fail to Download#

Verify the secret exists and is accessible:

kubectl get secret hf-api-secret -n <namespace>

Check Customizer pod logs for authentication errors:

kubectl logs <customizer-pod-name> -n <namespace> | grep -iE "huggingface|hf|authentication"

Ensure hfTargetDownload.enabled is set to true in your values file
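
If you collect logs from a script and not from a shell, the log filter above can be reproduced in a few lines. This is a minimal sketch that uses the same three search terms, matched case-insensitively. The function name `filter_auth_lines` is hypothetical and the sample log lines are invented for illustration; real Customizer log messages will differ.

```python
import re

# Same three search terms as the grep command above, case-insensitive.
AUTH_PATTERN = re.compile(r"huggingface|hf|authentication", re.IGNORECASE)


def filter_auth_lines(lines):
    """Return the lines that mention Hugging Face or authentication."""
    return [line.rstrip("\n") for line in lines if AUTH_PATTERN.search(line)]


# Invented sample lines for illustration; real Customizer logs will differ.
sample = [
    "INFO starting job\n",
    "ERROR HuggingFace download failed: 401 Unauthorized\n",
    "WARN authentication token missing\n",
]
for line in filter_auth_lines(sample):
    print(line)
```

On real output, pass any iterable of lines, such as the captured output of `kubectl logs` split into lines, to `filter_auth_lines`.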

Token Permissions#

If you encounter permission errors, verify your token has the necessary permissions:

Read: Required for downloading models
Write: Required for uploading models to Hugging Face Hub