Hugging Face API Key Secret

NeMo Customizer supports downloading models from Hugging Face Hub, including private models. This guide explains how to create and configure a Kubernetes secret containing your Hugging Face API token for use with NeMo Customizer.

Prerequisites

  • A Hugging Face account with an API token

  • Access to create Kubernetes secrets in your cluster

  • NeMo Customizer deployed or being deployed

Create Hugging Face API Token

  1. Log in to Hugging Face

  2. Navigate to Settings > Access Tokens

  3. Create a new token with Read permissions (or Write if you plan to upload models)

  4. Copy the token value
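
Before storing the token in a secret, it can help to confirm that the Hub accepts it. The following is a minimal sketch using only the Python standard library. It assumes the Hub's `whoami-v2` endpoint at the URL shown; the function names `build_whoami_request` and `whoami` are hypothetical helpers, not part of NeMo Customizer.

```python
import json
import urllib.request

# Assumption: the Hub "whoami" endpoint; confirm against the Hugging Face Hub API docs.
WHOAMI_URL = "https://huggingface.co/api/whoami-v2"


def build_whoami_request(token: str) -> urllib.request.Request:
    """Build an authenticated request asking the Hub which account owns the token."""
    # A copied token with a stray space or newline is a common cause of 401 errors.
    if not token or token != token.strip():
        raise ValueError("token is empty or has surrounding whitespace")
    return urllib.request.Request(
        WHOAMI_URL, headers={"Authorization": f"Bearer {token}"}
    )


def whoami(token: str, timeout: float = 10.0) -> dict:
    """Send the request and return the parsed JSON response.

    urllib raises urllib.error.HTTPError if the Hub rejects the token.
    """
    with urllib.request.urlopen(build_whoami_request(token), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `whoami("<your-huggingface-token>")` from a machine with internet access should return the account details for a valid token and raise an error for an invalid one.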

Create Kubernetes Secret

Create a Kubernetes secret containing your Hugging Face API token:

kubectl create secret generic hf-api-secret \
  --from-literal=HF_API_KEY=<your-huggingface-token> \
  --namespace <your-customizer-namespace>

Alternatively, create the secret using a YAML manifest:

apiVersion: v1
kind: Secret
metadata:
  name: hf-api-secret
  namespace: <your-customizer-namespace>
type: Opaque
stringData:
  HF_API_KEY: <your-huggingface-token>
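
The manifest above uses `stringData`, which accepts the token as plain text; Kubernetes stores it base64-encoded under `data`. If you generate manifests from a script, the sketch below builds the equivalent `data` form as JSON. The helper name `build_secret_manifest`, the namespace `nemo`, and the token value are placeholders for illustration only.

```python
import base64
import json


def build_secret_manifest(name: str, namespace: str, key: str, token: str) -> dict:
    """Return a Secret manifest that uses the base64-encoded `data` field."""
    encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {key: encoded},
    }


# Placeholder namespace and token; substitute your own values.
manifest = build_secret_manifest("hf-api-secret", "nemo", "HF_API_KEY", "hf_example_token")
print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed output can be piped to `kubectl apply -f -`.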

Configure NeMo Customizer

To enable Hugging Face model downloads, configure the following in your values.yaml:

customizer:
  customizerConfig:
    hfTargetDownload:
      enabled: true
    hfApiSecretName: hf-api-secret
    hfApiSecretKey: HF_API_KEY

Configuration Parameters

  • hfTargetDownload.enabled: Set to true to enable downloading models from Hugging Face Hub

  • hfApiSecretName: The name of the Kubernetes secret containing your Hugging Face API token

  • hfApiSecretKey: The key within the secret that contains the API token (default: HF_API_KEY)
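
As a quick sanity check before deploying, the parameters above can be validated against an already-parsed values file. This is a minimal sketch that assumes the nesting shown in the `values.yaml` example and takes the parsed content as a plain dictionary; `check_hf_values` is a hypothetical helper, not part of the Helm chart.

```python
def check_hf_values(values: dict) -> list[str]:
    """Return a list of problems in the Hugging Face settings (empty if none)."""
    problems = []
    customizer = values.get("customizer") or {}
    config = customizer.get("customizerConfig") or {}
    download = config.get("hfTargetDownload") or {}

    if download.get("enabled") is not True:
        problems.append("hfTargetDownload.enabled is not true")
    if not config.get("hfApiSecretName"):
        problems.append("hfApiSecretName is not set")
    # hfApiSecretKey is optional because it defaults to HF_API_KEY.
    return problems


example = {
    "customizer": {
        "customizerConfig": {
            "hfTargetDownload": {"enabled": True},
            "hfApiSecretName": "hf-api-secret",
        }
    }
}
print(check_hf_values(example))
```

An empty list means both required settings are present; any other result names the setting to fix.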

Verify Configuration

After deploying with the Hugging Face secret configured, you can verify it’s working by:

  1. Checking that the secret is referenced by the Customizer pods (as a mounted volume or an environment variable source):

    kubectl describe pod <customizer-pod-name> -n <namespace> | grep hf-api-secret
    
  2. Testing model download by creating a customization target that references a Hugging Face model:

    from nemo_microservices import NeMoMicroservices
    
    client = NeMoMicroservices(base_url="http://your-nemo-url")
    
    target = client.customization.targets.create(
        name="my-model",
        namespace="default",
        model_uri="hf://google/gemma-2b-it",
        # ... other parameters
    )
    

Use Cases

  • Download Private or Gated Models: Access private repositories and gated models (e.g., Llama, Gemma) that require authentication

  • Avoid Rate Limits: Authenticated requests have higher rate limits than anonymous requests

  • Upload Customized Models: With write permissions, upload fine-tuned models back to Hugging Face Hub

Security Best Practices

  • Store the secret in the same namespace as your Customizer deployment; pods can only reference secrets in their own namespace

  • Use Kubernetes RBAC to restrict access to the secret

  • Rotate tokens periodically

  • Use separate tokens for different environments (dev, staging, production)
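
To illustrate the RBAC recommendation, the sketch below generates a Role that permits reading only the Hugging Face secret. The helper name `build_secret_reader_role`, the Role name, and the namespace `nemo` are placeholders; adapt them to your cluster's conventions.

```python
import json


def build_secret_reader_role(namespace: str, secret_name: str = "hf-api-secret") -> dict:
    """Return a Role manifest that allows `get` on one named secret only."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": f"{secret_name}-reader", "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # core API group, which contains Secret
                "resources": ["secrets"],
                "resourceNames": [secret_name],  # limit the rule to this secret
                "verbs": ["get"],
            }
        ],
    }


# Placeholder namespace; substitute the namespace of your Customizer deployment.
role = build_secret_reader_role("nemo")
print(json.dumps(role, indent=2))
```

A Role grants nothing on its own; it takes effect only once a RoleBinding attaches it to a user, group, or service account.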

Troubleshooting

Models Fail to Download

  • Verify the secret exists and is accessible:

    kubectl get secret hf-api-secret -n <namespace>
    
  • Check Customizer pod logs for authentication errors:

    kubectl logs <customizer-pod-name> -n <namespace> | grep -i "huggingface\|hf\|authentication"
    
  • Ensure hfTargetDownload.enabled is set to true in your values file
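
If the secret exists but downloads still fail, the stored value itself may be the problem, for example a wrong key name or a token saved with a trailing newline. The sketch below inspects the JSON produced by `kubectl get secret hf-api-secret -n <namespace> -o json`. Both helper names are hypothetical, and the sample JSON is fabricated for illustration.

```python
import base64
import json


def read_secret_key(secret_json: str, key: str = "HF_API_KEY") -> str:
    """Decode one key from the JSON output of `kubectl get secret ... -o json`."""
    data = json.loads(secret_json).get("data") or {}
    if key not in data:
        raise KeyError(f"secret has no key {key!r}; found: {sorted(data)}")
    return base64.b64decode(data[key]).decode("utf-8")


def token_problems(token: str) -> list[str]:
    """Report common formatting problems without printing the token itself."""
    problems = []
    if not token:
        problems.append("token is empty")
    elif token != token.strip():
        problems.append("token has leading or trailing whitespace")
    return problems


# Illustrative input: a secret whose token was stored with a trailing newline.
sample = json.dumps(
    {"data": {"HF_API_KEY": base64.b64encode(b"hf_example_token\n").decode("ascii")}}
)
print(token_problems(read_secret_key(sample)))
```

If the key passed to `read_secret_key` does not match `hfApiSecretKey` in your values file, the `KeyError` message lists the keys that the secret actually contains.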

Token Permissions

If you encounter permission errors, verify your token has the necessary permissions:

  • Read: Required for downloading models

  • Write: Required for uploading models to Hugging Face Hub