> ## Documentation Index
> Fetch the complete documentation index at: https://niceeval.com/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# niceeval/expect matchers and custom assertion reference

> Reference for niceeval/expect: includes, equals, matches, similarity, satisfies. Chain .gate() or .atLeast(0.7), and build custom matchers with makeAssertion.

The `niceeval/expect` module provides a small set of composable matchers that you pass to `t.check()` and `t.require()`. Each matcher is a function that returns an `Assertion`, a typed scoring function of shape `(value: unknown) => number | Promise<number>`, and carries a default severity (`gate` or `soft`) that determines how a failure affects the eval outcome. You can override the severity on any matcher by chaining `.gate()` or `.atLeast(0.7)`.

```ts theme={null}
import { includes, equals, matches, similarity, satisfies } from "niceeval/expect";
```

***

## How matchers are used

You pass matchers as the second argument to `t.check()` or `t.require()`:

```ts theme={null}
// t.check: records the result; execution continues regardless
t.check(t.reply, includes("confirmed"));
t.check(turn.data, equals({ intent: "refund" }));

// t.require: throws immediately if the assertion fails; use for preconditions
t.require(turn.status, equals("completed"));
```

The difference between `check` and `require`:

| Method      | On failure                          | Use for                                       |
| ----------- | ----------------------------------- | --------------------------------------------- |
| `t.check`   | Records result, execution continues | Most assertions                               |
| `t.require` | Throws immediately, aborts the test | Preconditions where continuing is meaningless |

***

## Matchers

### includes

```ts theme={null}
includes(substring: string | RegExp): Assertion
```

Asserts that `value` (coerced to a string) contains `substring`, or matches it when `substring` is a `RegExp`. Matching is case-sensitive when you pass a string.
**Default severity:** `gate`

```ts theme={null}
t.check(t.reply, includes("Paris"));
t.check(t.reply, includes(/order #\d+/i));
```

Chain `.atLeast(0.7)` if you want this to score rather than hard-fail:

```ts theme={null}
t.check(t.reply, includes("recommended").atLeast(0.7));
```

***

### equals

```ts theme={null}
equals(expected: unknown): Assertion
```

Asserts deep structural equality between `value` and `expected`. Works on primitives, plain objects, arrays, and nested structures. Equivalent to a recursive `JSON.stringify`-style comparison (order-insensitive for object keys, order-sensitive for arrays).

**Default severity:** `gate`

```ts theme={null}
t.check(turn.data, equals({ intent: "refund" }));
t.check(turn.data, equals(["a", "b", "c"]));
```

For partial matching (asserting a subset of keys), use `satisfies` with a custom predicate instead of `equals`.

***

### matches

```ts theme={null}
matches(schema: StandardSchema): Assertion
```

Validates `value` against a [Standard Schema](https://standardschema.dev/) compatible schema. Zod, Valibot, ArkType, and other Standard Schema-compliant libraries work out of the box.

**Default severity:** `gate`

```ts theme={null}
import { z } from "zod";

t.check(
  turn.data,
  matches(z.object({ intent: z.enum(["refund", "ship", "track"]) })),
);
```

Returns score `1` if validation passes, `0` if it fails. When the schema produces a parse error, the error message is attached to the recorded assertion for easy debugging.

***

### similarity

```ts theme={null}
similarity(expected: string, opts?: SimilarityOpts): Assertion
```

Scores `value` (coerced to a string) against `expected` using **normalized Levenshtein distance**, returning a score between `0` (completely different) and `1` (identical). Useful when you want to reward approximate correctness rather than enforce an exact match.
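To give a feel for what a normalized Levenshtein score looks like, here is a minimal sketch. The normalization shown (one minus the edit distance divided by the length of the longer string) is an assumption, since this page does not say which variant niceeval uses. `levenshtein` and `normalizedSimilarity` are hypothetical helper names for illustration, not exports of `niceeval/expect`.

```typescript
// Hypothetical helper: classic dynamic-programming edit distance.
function levenshtein(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of `a` and the first j chars of `b`.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost, // substitution (free when chars match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: 1 - distance / max(length), so identical strings
// score 1 and strings with nothing in common score 0.
function normalizedSimilarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1; // two empty strings are identical
  return 1 - levenshtein(a, b) / longest;
}
```

Under this assumed normalization, `"kitten"` versus `"sitting"` has an edit distance of 3 over a longest length of 7, giving a score of roughly 0.57, which would fail an `.atLeast(0.8)` threshold.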
**Default severity:** `soft`

```ts theme={null}
t.check(t.reply, similarity("The capital of France is Paris.").atLeast(0.8));
```

Chain `.gate()` if a minimum similarity must be met for the eval to pass:

```ts theme={null}
t.check(t.reply, similarity(expectedSql).gate().atLeast(0.95));
```

**`opts` fields:**

Minimum score (0–1) required to pass. Defaults to the threshold set by chaining `.atLeast(n)` on the returned assertion; if neither is specified the assertion always records the raw score as a soft metric.

***

### satisfies

```ts theme={null}
satisfies(predicate: (value: unknown) => boolean, label?: string): Assertion
```

Asserts `value` satisfies an arbitrary predicate function. The optional `label` string appears in reports and logs to make the assertion's intent readable.

**Default severity:** `gate`

```ts theme={null}
t.check(turn.data, satisfies((d) => (d as any).total > 0, "total is positive"));
t.check(t.usage.outputTokens, satisfies((n) => (n as number) < 10_000, "output not verbose"));
```

The predicate receives the raw `value` without type coercion, so cast as needed inside the function. Return `true` for pass, `false` for fail.

***

## Chaining severity

Every matcher returns an `Assertion` object that exposes `.gate()` and `.atLeast(0.7)` methods, letting you override the default severity inline:

```ts theme={null}
// Downgrade an includes assertion from gate to soft:
t.check(t.reply, includes("optional detail").atLeast(0.7));

// Upgrade a similarity assertion from soft to gate:
t.check(t.reply, similarity(expected).gate());
```

Severity controls how a failure affects the eval **outcome**:

| Severity | Failure effect                  | Strict mode (`--strict`) |
| -------- | ------------------------------- | ------------------------ |
| `gate`   | Eval is `failed`                | Same                     |
| `soft`   | Eval is `passed` (not `failed`) | Eval is `failed`         |

Use `gate` for correctness requirements and `soft` for quality metrics you want to track but not block on by default.
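The severity table above can be read as a small decision rule. The sketch below is one way to express it, not niceeval's actual implementation. `RecordedResult`, `passed`, and `evalOutcome` are hypothetical names, and the pass rule is an assumption: with a threshold, a result passes when its score reaches the threshold; without one, a gate needs a score of `1` and a soft result is only recorded.

```typescript
// Hypothetical shapes for illustration only; niceeval's internal types may differ.
type Severity = "gate" | "soft";

interface RecordedResult {
  severity: Severity;
  score: number;      // normalized to [0, 1]
  threshold?: number; // set via .atLeast(n), if any
}

// Assumed pass rule: compare against the threshold when one is set;
// otherwise a gate requires a perfect score and a soft result never fails.
function passed(r: RecordedResult): boolean {
  if (r.threshold !== undefined) return r.score >= r.threshold;
  return r.severity === "gate" ? r.score >= 1 : true;
}

function evalOutcome(results: RecordedResult[], strict = false): "passed" | "failed" {
  for (const r of results) {
    if (passed(r)) continue;
    // A failed gate always fails the eval; a failed soft result
    // fails it only in strict mode.
    if (r.severity === "gate" || strict) return "failed";
  }
  return "passed";
}
```

For example, a single soft result with score `0.5` and threshold `0.7` leaves the eval `passed` by default and turns it `failed` when `strict` is `true`, matching the `soft` row of the table.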
***

## The Assertion type

An `Assertion` is a scoring function with attached metadata:

```ts theme={null}
type Assertion = {
  (value: unknown): number | Promise<number>;
  name: string;
  severity: "gate" | "soft";
  gate(): Assertion;
  soft(): Assertion;
  atLeast(threshold: number): Assertion; // available on similarity
};
```

Scores are normalized to `[0, 1]`:

* `1`: fully passing
* `0`: fully failing
* Values between 0 and 1: partial credit (primarily used by `similarity` and judge assertions)

For `gate` assertions, any score below `1` is treated as a failure. For `soft` assertions, scores are recorded and compared against the configured threshold (set via `.atLeast(n)`).

***

## Custom matchers with makeAssertion

When the built-in matchers don't cover your use case, create a custom assertion with `makeAssertion`. Your scoring function receives the raw value and must return a number (synchronously or as a Promise).

```ts theme={null}
import { makeAssertion } from "niceeval/expect";

function jsonValid(): Assertion {
  return makeAssertion({
    name: "jsonValid",
    severity: "gate",
    score: (value) => {
      try {
        JSON.parse(String(value));
        return 1; // parsed cleanly
      } catch {
        return 0; // not valid JSON
      }
    },
  });
}

t.check(t.reply, jsonValid());
```

`makeAssertion` takes an object with three fields:

* `name`: A short identifier shown in reports and logs when this assertion fails. Use a descriptive name that makes the failure reason self-evident.
* `severity`: The default severity for this assertion. Can be overridden by callers via `.gate()` and `.atLeast(0.7)` chaining, just like built-in matchers.
* `score`: The scoring function. Receives the raw value and must return a number in `[0, 1]`. Async scoring functions (e.g. calling an external API) are fully supported.

```ts theme={null}
score: async (value) => {
  const res = await externalGrader.grade(String(value));
  return res.score / 100;
},
```

### Async custom matchers

Custom matchers support `async` scoring natively. The runner awaits all assertions before computing the final outcome.
```ts theme={null}
// embedText and cosineSimilarity stand for helpers you supply yourself.
function semanticallySimilar(reference: string): Assertion {
  return makeAssertion({
    name: "semanticallySimilar",
    severity: "soft",
    score: async (value) => {
      const embedding1 = await embedText(String(value));
      const embedding2 = await embedText(reference);
      // Cosine similarity can be negative; clamp so the score stays in [0, 1].
      return Math.max(0, cosineSimilarity(embedding1, embedding2));
    },
  });
}

t.check(t.reply, semanticallySimilar("A greedy algorithm for interval scheduling.").atLeast(0.75));
```

***

## Quick reference

| Matcher                                    | Default severity | Score type       | Best for                             |
| ------------------------------------------ | ---------------- | ---------------- | ------------------------------------ |
| `includes(str \| RegExp)`                  | gate             | binary (0 or 1)  | Keyword / pattern presence           |
| `equals(expected)`                         | gate             | binary (0 or 1)  | Exact value or structural match      |
| `matches(schema)`                          | gate             | binary (0 or 1)  | Schema / type validation             |
| `similarity(expected)`                     | soft             | continuous (0–1) | Near-match text, approximate answers |
| `satisfies(predicate, label?)`             | gate             | binary (0 or 1)  | Arbitrary logic                      |
| `makeAssertion({ name, severity, score })` | configurable     | continuous (0–1) | Any custom requirement               |