Caching
How response caching works and when caches are invalidated.
PeerLM caches model responses to save credits and speed up repeat runs. Cache hits are completely free.
How Cache Keys Work
Each response is cached under a SHA-256 hash of six components:
- model ID
- model version
- system prompt ID
- system prompt content hash
- test prompt ID
- test prompt content hash
This means a cached response is only reused when the exact same model version sees the exact same prompts.
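A minimal sketch of how such a key might be derived. The `|` separator, the UTF-8 encoding, and the function name are illustrative assumptions, not PeerLM's actual implementation:

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash prompt text so any edit produces a different value."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cache_key(model_id: str, model_version: str,
              system_prompt_id: str, system_prompt_text: str,
              test_prompt_id: str, test_prompt_text: str) -> str:
    """Combine the six components into a single SHA-256 cache key."""
    parts = [
        model_id,
        model_version,
        system_prompt_id,
        content_hash(system_prompt_text),
        test_prompt_id,
        content_hash(test_prompt_text),
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
```

Because every component is part of the hash input, changing any one of them (a new model version, an edited prompt) yields a different key, and the lookup misses.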
Automatic Invalidation
Cached responses are automatically invalidated when:
- You edit a prompt — changing the text changes its content hash
- A model version updates — new versions get a new model version identifier
You don't need to manually clear caches. The system handles invalidation through content hashing.
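To illustrate the first case: because the key includes a content hash of the prompt text, even a one-character edit produces a different hash, so the old cache entry is simply never looked up again. A sketch (the hashing scheme is an assumption, as above):

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash prompt text; any edit yields a different value."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


before = content_hash("Summarize the article in one sentence.")
after = content_hash("Summarize the article in two sentences.")
# The edited prompt hashes to a new value, so its cache key changes
# and the next run is a cache miss for that prompt.
assert before != after
```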
Cache Hits in Results
After a run completes, the summary cards show how many responses were cache hits. The credit cost shown is the net cost after cache savings.
Tip: If you're iterating on criteria or evaluator selection but keeping the same prompts and generators, the Generate phase will be fully cached. Only the Evaluate phase incurs new credits.
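The net-cost figure described above can be thought of as summing credits only over responses that were not cache hits. A hypothetical sketch (the function and data shape are illustrative, not a PeerLM API):

```python
def net_credit_cost(responses: list[tuple[int, bool]]) -> int:
    """Sum credits for non-cached responses only.

    responses: list of (credits, was_cache_hit) pairs;
    cache hits cost nothing.
    """
    return sum(credits for credits, was_hit in responses if not was_hit)


# Three responses, one served from cache: only the misses are billed.
cost = net_credit_cost([(2, False), (2, True), (3, False)])
```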