May 2026
TypeScript, LLM Evals, OpenRouter

chure

An early TypeScript SDK for defining and running text-based LLM benchmarks through OpenRouter.

chure keeps the workflow small and direct: write prompt-based evals, run them against one or more models through OpenRouter, and inspect the results.

What It Does

The SDK currently supports text-only benchmarks, scored with built-in exact-match and substring evaluators or with custom evaluation functions. It can write benchmark results to JSON and includes a pretty-printer for reviewing output in the terminal.

Technical Implementation

Benchmark Definitions

Benchmarks are defined as typed TypeScript objects. Each benchmark can include a system prompt and a set of eval cases with prompts, expected answers, and evaluator rules.
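
As a rough illustration of what a typed benchmark definition could look like, the sketch below declares a benchmark with a system prompt and two eval cases. The type names (`Benchmark`, `EvalCase`, `EvaluatorRule`) and field names are assumptions made for this example, not chure's published API.

```typescript
// Hypothetical shapes for illustration; chure's real type names may differ.
type EvaluatorRule =
  | { kind: "exact" }
  | { kind: "includes" }
  | { kind: "custom"; evaluate: (output: string, expected: string) => boolean };

interface EvalCase {
  prompt: string;           // text sent to the model
  expected: string;         // reference answer used by the evaluator
  evaluator: EvaluatorRule; // how the output is scored
}

interface Benchmark {
  name: string;
  systemPrompt?: string;    // optional, shared by every case
  cases: EvalCase[];
}

const capitals: Benchmark = {
  name: "capitals",
  systemPrompt: "Answer with a single word.",
  cases: [
    {
      prompt: "What is the capital of France?",
      expected: "Paris",
      evaluator: { kind: "exact" },
    },
    {
      prompt: "Name the capital of Japan.",
      expected: "Tokyo",
      evaluator: { kind: "includes" },
    },
  ],
};
```

Keeping the definition as a plain typed object means the compiler catches a missing `expected` field or a misspelled evaluator kind before any model is called.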

Evaluators

The evaluator layer supports exact-match checks, substring inclusion checks, and custom function evaluators for cases that need project-specific scoring logic.
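
The three evaluator styles can be sketched as plain functions. This is a minimal sketch under stated assumptions: the `Evaluator` signature, the whitespace trimming in `exactMatch`, the case-insensitive comparison in `includesMatch`, and the `withinTolerance` helper are all choices made for this example and are not taken from chure itself.

```typescript
// Assumed signature: an evaluator sees the model output and the expected answer.
type Evaluator = (output: string, expected: string) => boolean;

// Exact match. Trimming surrounding whitespace is an assumption of this sketch.
const exactMatch: Evaluator = (output, expected) =>
  output.trim() === expected.trim();

// Substring inclusion. Case-insensitivity is an assumption of this sketch.
const includesMatch: Evaluator = (output, expected) =>
  output.toLowerCase().includes(expected.toLowerCase());

// A custom evaluator (hypothetical): passes when both strings parse as numbers
// and differ by no more than `tolerance`.
const withinTolerance = (tolerance: number): Evaluator => (output, expected) => {
  const got = Number.parseFloat(output);
  const want = Number.parseFloat(expected);
  return (
    Number.isFinite(got) &&
    Number.isFinite(want) &&
    Math.abs(got - want) <= tolerance
  );
};
```

A custom evaluator like `withinTolerance(0.01)` is useful when a model may answer "3.14" where the reference says "3.1415", a case that neither exact match nor substring inclusion would accept.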

OpenRouter Integration

The runner accepts an OpenRouter API key and a list of models, so the same benchmark set can be run against several LLMs and the results compared.
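
To make the multi-model idea concrete, here is a minimal sketch that sends one prompt to each model in a list through OpenRouter's chat completions endpoint. The function names (`askModel`, `compareModels`) and the injectable `fetchImpl` parameter are hypothetical and only serve this example; chure's actual runner may be structured differently.

```typescript
// Narrow fetch-like type so the sketch can run against a stub in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

// Falls back to the runtime's global fetch (Node 18+ or a browser).
const defaultFetch: FetchLike = (url, init) =>
  (globalThis as unknown as { fetch: FetchLike }).fetch(url, init);

async function askModel(
  apiKey: string,
  model: string,
  prompt: string,
  fetchImpl: FetchLike = defaultFetch,
): Promise<string> {
  const response = await fetchImpl(OPENROUTER_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  });
  if (!response.ok) {
    throw new Error(`OpenRouter request failed with status ${response.status}`);
  }
  const data = (await response.json()) as {
    choices?: { message?: { content?: string } }[];
  };
  // Treat a missing completion as an empty answer rather than crashing.
  return data.choices?.[0]?.message?.content ?? "";
}

// Runs the same prompt against every model, one after another.
async function compareModels(
  apiKey: string,
  models: string[],
  prompt: string,
  fetchImpl: FetchLike = defaultFetch,
): Promise<Record<string, string>> {
  const outputs: Record<string, string> = {};
  for (const model of models) {
    outputs[model] = await askModel(apiKey, model, prompt, fetchImpl);
  }
  return outputs;
}
```

The sketch calls models sequentially to keep the control flow obvious; a real runner might issue requests concurrently and add retries or rate limiting.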

Key Features

  • Prompt-based benchmark definitions with TypeScript types
  • Multi-model runs through OpenRouter
  • Exact match, includes, and custom evaluators
  • JSON output for saving benchmark results
  • Pretty-printed summaries for terminal review
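
The last two features, JSON output and terminal summaries, might look roughly like the sketch below. The `CaseResult` shape and the `toJson` and `formatSummary` helpers are assumptions made for illustration and do not describe chure's actual output format.

```typescript
// Hypothetical result record for a single eval case run against one model.
interface CaseResult {
  model: string;
  prompt: string;
  output: string;
  passed: boolean;
}

// Serialize results as indented JSON, ready to be written to a file.
function toJson(results: CaseResult[]): string {
  return JSON.stringify(results, null, 2);
}

// Build a one-line-per-model pass-rate summary for the terminal.
function formatSummary(results: CaseResult[]): string {
  const tallies: Record<string, { passed: number; total: number }> = {};
  for (const result of results) {
    const tally = tallies[result.model] ?? { passed: 0, total: 0 };
    tally.total += 1;
    if (result.passed) tally.passed += 1;
    tallies[result.model] = tally;
  }
  return Object.keys(tallies)
    .map((model) => `${model}: ${tallies[model].passed}/${tallies[model].total} passed`)
    .join("\n");
}
```

Returning strings instead of writing to disk or the console directly keeps both helpers easy to test; the caller decides where the output goes.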