# Install niceeval: setup, scaffolding, and configuration

> Install niceeval as a dev dependency, scaffold evals/ and niceeval.config.ts with npx niceeval init, and configure your agent credentials.

Setting up niceeval takes three steps: verify your prerequisites, install the package, and run `npx niceeval init` to scaffold your eval directory and config file. This page walks through each step in detail and covers every configuration option you're likely to need before your first run.

## Prerequisites

Before installing niceeval, make sure your environment meets the following requirements.

**Node.js** — niceeval requires a modern version of Node.js with ESM support. Use the version specified in your project's `.nvmrc` or `package.json` `engines` field, or install the current LTS release from [nodejs.org](https://nodejs.org).

**Docker** — required only for sandbox evals (coding agents like Claude Code, Codex, and bub that need an isolated filesystem). Install [Docker Desktop](https://www.docker.com/products/docker-desktop/) or Docker Engine and confirm it's running before you attempt a sandbox eval. In-process and HTTP agent evals work without Docker.

<Note>
  If Docker is unavailable when you run a sandbox eval, niceeval stops immediately with a clear error message. It will not silently fall back to a different backend.
</Note>
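
Both prerequisites can be checked from a small script before you run anything. The sketch below is not part of niceeval: `ASSUMED_MIN_NODE_MAJOR` is a hypothetical threshold chosen for illustration (this page does not state an exact minimum), and the Docker check relies only on the exit code of the standard `docker info` command.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical minimum for illustration; use the version your project pins.
const ASSUMED_MIN_NODE_MAJOR = 18;

/** Parses the major version out of a string such as "20.11.1". */
function nodeMajor(version: string): number {
  const [major = ""] = version.split(".");
  return Number.parseInt(major, 10);
}

/** Returns true when `docker info` exits with status 0 (daemon reachable). */
function isDockerRunning(): boolean {
  const result = spawnSync("docker", ["info"], {
    stdio: "ignore",
    timeout: 10_000, // give up after 10 seconds instead of hanging
  });
  // status is null when the binary is missing or the call timed out,
  // and non-zero when the daemon is not reachable.
  return result.status === 0;
}

const major = nodeMajor(process.versions.node);
console.log(`Node.js major version: ${major} (assumed minimum: ${ASSUMED_MIN_NODE_MAJOR})`);
console.log(`Docker reachable: ${isDockerRunning()}`);
```

A `false` result for Docker only matters if you plan to run sandbox evals; in-process and HTTP agent evals do not need it.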

## Install niceeval

Add niceeval as a dev dependency using your preferred package manager:

<CodeGroup>
  ```shell npm
  npm install -D niceeval
  ```

  ```shell yarn
  yarn add --dev niceeval
  ```

  ```shell pnpm
  pnpm add -D niceeval
  ```
</CodeGroup>

## Scaffold your project

Run the init command to generate your eval directory and config file:

```shell
npx niceeval init
```

`init` inspects your project layout and creates a minimal but functional starting point. After it completes, your project contains:

```
your-project/
├─ niceeval.config.ts           ← central configuration
└─ evals/
   ├─ hello.eval.ts             ← example: conversational eval
   └─ fixtures/
      └─ button/                ← example: sandbox coding-agent eval
         ├─ PROMPT.md
         ├─ EVAL.ts
         └─ package.json
```

The generated `hello.eval.ts` and `fixtures/button/` are illustrative examples — read through them to understand the eval shape, then replace or delete them when you're ready to write your own.

<Tip>
  Eval IDs are derived automatically from file paths. The file `evals/weather/brooklyn.eval.ts` gets the ID `weather/brooklyn`. You never declare an ID by hand — renaming the file is all it takes to change it.
</Tip>
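
The path-to-ID mapping described in the tip can be pictured with a small function. This is an illustrative sketch, not niceeval's actual implementation: `evalIdFromPath` is a hypothetical name, and the real derivation may treat edge cases differently.

```typescript
// Illustrative only: mirrors the rule "evals/<path>.eval.ts -> <path>".
function evalIdFromPath(filePath: string, evalsDir = "evals"): string {
  // Tolerate Windows-style separators.
  const normalized = filePath.replace(/\\/g, "/");
  const prefix = `${evalsDir}/`;
  // Drop the leading evals directory when present.
  const relativePath = normalized.startsWith(prefix)
    ? normalized.slice(prefix.length)
    : normalized;
  // Drop the ".eval.ts" suffix to get the ID.
  return relativePath.replace(/\.eval\.ts$/, "");
}

console.log(evalIdFromPath("evals/weather/brooklyn.eval.ts")); // weather/brooklyn
console.log(evalIdFromPath("evals/hello.eval.ts")); // hello
```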

## Configure niceeval

Open the generated `niceeval.config.ts` and review the available options:

```ts
// niceeval.config.ts
import { defineConfig } from "niceeval";
import { Console, JUnit } from "niceeval/reporters";

export default defineConfig({
  // LLM used for t.judge.* assertions
  judge: { model: "anthropic/claude-haiku-4-5" },

  // reporters that run after every eval suite
  reporters: [Console(), JUnit(".niceeval/junit.xml")],

  // how many evals to run in parallel
  maxConcurrency: 8,

  // per-eval timeout in milliseconds (5 minutes)
  timeoutMs: 300_000,

  // "auto" uses a cloud sandbox token if present, otherwise Docker
  sandbox: "auto",
});
```

| Option           | Type                   | Description                                      |
| ---------------- | ---------------------- | ------------------------------------------------ |
| `judge`          | `{ model: string }`    | The LLM model used for `t.judge.*` assertions    |
| `reporters`      | `Reporter[]`           | Reporters that emit results after every run      |
| `maxConcurrency` | `number`               | Maximum number of evals running at the same time |
| `timeoutMs`      | `number`               | Per-eval timeout in milliseconds                 |
| `sandbox`        | `"auto"` \| `"docker"` | Sandbox backend for coding-agent evals           |
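
The `"auto"` behavior noted in the config comment (use a cloud sandbox token if one is present, otherwise Docker) amounts to a short decision. The sketch below is a guess at that logic for illustration only: `resolveSandbox` and the variable name `EXAMPLE_SANDBOX_TOKEN` are hypothetical, since this page does not name the token niceeval looks for.

```typescript
type SandboxSetting = "auto" | "docker";
type SandboxBackend = "cloud" | "docker";

// Hypothetical variable name, used only for this illustration.
const ASSUMED_TOKEN_VAR = "EXAMPLE_SANDBOX_TOKEN";

function resolveSandbox(
  setting: SandboxSetting,
  env: Record<string, string | undefined>,
): SandboxBackend {
  // An explicit "docker" setting always wins.
  if (setting === "docker") return "docker";
  // "auto": choose the cloud backend only when a non-empty token is present.
  return env[ASSUMED_TOKEN_VAR] ? "cloud" : "docker";
}

console.log(resolveSandbox("auto", {})); // docker
console.log(resolveSandbox("auto", { EXAMPLE_SANDBOX_TOKEN: "abc" })); // cloud
console.log(resolveSandbox("docker", { EXAMPLE_SANDBOX_TOKEN: "abc" })); // docker
```

Setting `sandbox: "docker"` explicitly is the simpler choice when you want the same backend locally and in CI regardless of which tokens happen to be set.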

Agents are selected in experiment files, which pair an agent with the number of runs:

```ts
import { defineExperiment } from "niceeval";
import myAgent from "./agents/my-agent.js";

export default defineExperiment({
  agent: myAgent,
  runs: 1,
});
```

## Set environment variables

niceeval does not store or manage secrets. Credentials come from environment variables, which your agent adapters read at runtime.

**For Claude Code** (Anthropic coding agent):

```shell
export ANTHROPIC_API_KEY=sk-ant-...
```

**For Codex** (OpenAI coding agent):

```shell
export OPENAI_API_KEY=sk-...
```

**For custom HTTP agents**, add whatever variables your adapter reads — for example:

```shell
export AGENT_URL=https://my-agent.example.com
```

<Warning>
  Never commit API keys to your repository. For local development, keep `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` in a `.env` file that is listed in `.gitignore`. For automated runs, store them as CI secrets (for example `secrets.ANTHROPIC_API_KEY` in GitHub Actions).
</Warning>
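
Because credentials arrive through the environment, a custom adapter can fail fast with a readable message when a variable is missing, rather than failing later inside a request. A minimal sketch, assuming a hypothetical helper named `requireEnv` that is not part of niceeval:

```typescript
// Hypothetical helper: returns the variable's value or throws a clear error.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage with an explicit env object, so the example runs without real secrets.
const agentUrl = requireEnv("AGENT_URL", {
  AGENT_URL: "https://my-agent.example.com",
});
console.log(agentUrl);
```

In a real adapter you would omit the second argument so the helper reads from `process.env`.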

For CI, pass the relevant secrets through the workflow `env` block:

```yaml
# .github/workflows/evals.yml
- run: npx niceeval exp ci --strict --junit .niceeval/junit.xml
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

## Verify the installation

Confirm that the CLI is reachable and can discover your evals:

```shell
# list all discovered evals without running them
npx niceeval list
```

`npx niceeval list` reads `niceeval.config.ts` and scans your `evals/` directory. A successful run prints each eval's ID, description, and registered agent. If it exits with an error, check that `niceeval.config.ts` exists at the repository root and that your `evals/` directory contains at least one `*.eval.ts` file.
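
If `list` finds nothing, it can help to confirm independently which files match the `*.eval.ts` pattern. The sketch below is a stand-alone approximation of that scan using only Node.js built-ins. It is not niceeval's discovery code, `findEvalFiles` and `isEvalFile` are hypothetical names, and it deliberately ignores fixture-style evals such as `fixtures/button/EVAL.ts`.

```typescript
import { existsSync, readdirSync } from "node:fs";
import { join, relative, sep } from "node:path";

/** True for file names that end in ".eval.ts". */
function isEvalFile(name: string): boolean {
  return name.endsWith(".eval.ts");
}

/** Recursively collects *.eval.ts files under root, as "/"-separated relative paths. */
function findEvalFiles(root: string): string[] {
  if (!existsSync(root)) return []; // a missing directory yields no evals
  const found: string[] = [];
  const walk = (dir: string): void => {
    for (const entry of readdirSync(dir, { withFileTypes: true })) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        // Skip installed dependencies inside fixture folders.
        if (entry.name !== "node_modules") walk(full);
      } else if (isEvalFile(entry.name)) {
        found.push(relative(root, full).split(sep).join("/"));
      }
    }
  };
  walk(root);
  return found.sort();
}

console.log(findEvalFiles("evals"));
```

An empty array here points to a missing `evals/` directory or to files that do not use the `.eval.ts` suffix.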

You can also do a dry run — this resolves every eval and prints what would execute, without calling any agent or sandbox:

```shell
npx niceeval exp local --dry
```

Once `list` and `--dry` both succeed, you're ready to run your first eval. Head to the [Quickstart](/quickstart) for a step-by-step walkthrough of all three eval types.
