Setting Up Your Library
Create system prompts, test prompts, and datasets.
The Library holds the prompts used in your evaluations. There are three types of library items.
System Prompts
System prompts define the persona or instructions given to a model before it sees the user's message. Think of them as the "role" the model plays during evaluation.
- Name — a short label (e.g., "Customer Support Agent")
- Description — optional context for your team
- System Prompt — the full text sent as the system message
- Tags — for filtering and organization
Test Prompts
Test prompts are the actual user messages models respond to. Each test prompt is an atomic evaluation unit — models are scored on each one independently.
- Title — a descriptive label
- Prompt — the message sent to the model
- Dataset — optional grouping (see below)
- Tags — for filtering
Datasets
Datasets are lightweight collections that group test prompts. When building a suite, you can select an entire dataset instead of picking individual prompts. A test prompt can belong to at most one dataset.
Bulk Import
You can import up to 500 items at once for system prompts and test prompts. Use the import button on the respective library page.