Metric Job Management#
Note
Performance Tuning: You can improve evaluation throughput by setting job.params.parallelism, which controls the number of concurrent requests sent to the model. A typical default is 16, but you may need to adjust it based on your model’s capacity and rate limits.
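As a rough illustration only, the fragment below shows where a parallelism value might sit in a job configuration; the job_spec name and the surrounding fields are placeholders, not part of the actual API.
# Illustrative configuration fragment (only "params.parallelism" comes from this note;
# everything else is a placeholder): parallelism sets how many requests run at once.
job_spec = {
    # ... target, metric, and dataset configuration omitted ...
    "params": {
        "parallelism": 16,  # raise or lower to match your model's capacity and rate limits
    },
}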
Monitor Job#
Monitor the status of a job.
import time

job_status = client.evaluation.metric_jobs.get_status(name=job.name)
while job_status.status in ("active", "pending", "created"):
    time.sleep(10)  # poll every 10 seconds until the job reaches a terminal state
    job_status = client.evaluation.metric_jobs.get_status(name=job.name)
    print("status:", job_status.status, job_status.status_details)
print(job_status)
Visit Troubleshooting NeMo Evaluator for help with job failures.
Fetch Job Logs#
Get JSON logs with pagination. Logs are available while a job is active and after it terminates.
logs_response = client.evaluation.metric_jobs.get_logs(name=job.name)
for log_entry in logs_response.data:
    print(f"[{log_entry.timestamp}] {log_entry.message.strip()}")

# Handle pagination
while logs_response.next_page:
    logs_response = client.evaluation.metric_jobs.get_logs(
        name=job.name,
        page_cursor=logs_response.next_page
    )
    for log_entry in logs_response.data:
        print(f"[{log_entry.timestamp}] {log_entry.message.strip()}")
View Evaluation Results#
Evaluation results are available once the evaluation job completes successfully. Visit Metric Results for details on fetching evaluation results.
Download Job Artifacts#
Files generated during job execution are available for download. Job artifacts are useful for inspecting the details of an evaluation.
artifacts_archive = client.evaluation.metric_jobs.results.artifacts.download(
    name=job.name, workspace=workspace
)
artifacts_archive.write_to_file("evaluation_artifacts.tar.gz")
print("Saved artifacts to evaluation_artifacts.tar.gz")
Extract the files from the tarball with the following command; an artifacts directory will be created.
tar -xf evaluation_artifacts.tar.gz
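If you prefer to extract the archive from Python rather than the shell, the standard library tarfile module does the same thing; this is an optional alternative, not part of the Evaluator API.
import tarfile

# Equivalent to `tar -xf evaluation_artifacts.tar.gz`: extracts into the current
# directory, creating the artifacts directory contained in the archive.
with tarfile.open("evaluation_artifacts.tar.gz") as archive:
    archive.extractall()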