Getting Started

Installation

bun add @trayn/agent-sdk

Quick Start

Record a workflow in the Trayn web app. Trayn generates a task definition (goal, verifiers) from your recording and stores it alongside the sandbox. Then point the SDK at the sandbox URL:

trayn --url https://app.trayn.ai/sandbox/desk.zoho.com/abc123 --max-steps 15 --reps 2

The SDK fetches the task definition from the backend, runs your agent against the sandbox, grades the result, and stores memories for learning.

Programmatic Usage

import { harness } from "@trayn/agent-sdk";

const h = harness({
  agentargs: {
    agent_name: "my-agent",
    make_agent: () => ({
      get_action: async (obs) => {
        // obs.goal: fetched from the backend task definition
        // obs.url: current sandbox page URL
        // obs.axtree_txt: accessibility tree with element IDs
        return ["click('42')", { think: "clicking add to cart" }];
      },
    }),
  },
  url_override: "https://app.trayn.ai/sandbox/desk.zoho.com/abc123",
});

await h.run();
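The hard-coded click('42') above stands in for real decision logic. As a minimal sketch of what a get_action body might do, the helper below looks up an element ID in obs.axtree_txt and builds the same [action, metadata] pair. The Observation interface, findElementId, and chooseClick are hypothetical names, not part of the SDK, and the line format "[42] button 'Add to cart'" is an assumption about how the accessibility tree is rendered; check a real observation before relying on it.

```typescript
// Only the fields mentioned in the example above; the SDK's real
// observation object may carry more. (Hypothetical type, not exported by the SDK.)
interface Observation {
  goal: string;
  url: string;
  axtree_txt: string;
}

// Same shape as the value returned by get_action in the example above.
type AgentAction = [string, { think: string }];

// ASSUMPTION: each accessibility-tree line looks like "[42] button 'Add to cart'",
// i.e. an element ID in square brackets followed by a role and a label.
function findElementId(axtree: string, label: string): string | null {
  const needle = label.toLowerCase();
  for (const line of axtree.split("\n")) {
    const match = line.match(/^\s*\[([^\]]+)\]\s*(.*)$/);
    if (match && (match[2] ?? "").toLowerCase().includes(needle)) {
      return match[1] ?? null;
    }
  }
  return null;
}

// Hypothetical policy: click the first element whose line contains `label`.
// Returns null when nothing matches so the caller can pick a fallback action.
function chooseClick(obs: Observation, label: string): AgentAction | null {
  const id = findElementId(obs.axtree_txt, label);
  if (id === null) {
    return null;
  }
  return [`click('${id}')`, { think: `clicking "${label}" (element ${id})` }];
}
```

Inside make_agent, get_action could call chooseClick(obs, "Add to cart") and decide what to return when the result is null.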

How Tasks Work

1. Record: capture a workflow on a real website with the browser extension.
2. Generate: Trayn AI analyzes the recording and creates a task definition with goal text, verifiers, and expected actions.
3. Edit: adjust the goal and verifiers in the web app; edits are persisted.
4. Run: when you provide a sandbox URL, the SDK fetches the latest task definition from the backend.

Tasks are stored in S3 at sandbox-tasks/{sessionId}/task.json. The SDK reads this via fetchTaskFromUrl(). If you edit a task locally with --confirm, changes are pushed back to the backend via pushTaskToRemote().
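To make the storage layout concrete, here is a small sketch that maps a sandbox URL to the task.json key described above. Both function names are hypothetical and are not SDK exports. It assumes the final path segment of the sandbox URL (abc123 in the earlier examples) is the sessionId, which the text above does not state explicitly.

```typescript
// ASSUMPTION: a sandbox URL has the form /sandbox/<site>/<sessionId>,
// and the last path segment is the session ID used in the S3 key.
function sessionIdFromSandboxUrl(sandboxUrl: string): string {
  const segments = new URL(sandboxUrl).pathname
    .split("/")
    .filter((segment) => segment.length > 0);
  const last = segments[segments.length - 1];
  if (segments[0] !== "sandbox" || segments.length < 3 || last === undefined) {
    throw new Error(`Not a sandbox URL: ${sandboxUrl}`);
  }
  return last;
}

// Key layout taken from the text above: sandbox-tasks/{sessionId}/task.json
function taskKeyForSession(sessionId: string): string {
  return `sandbox-tasks/${sessionId}/task.json`;
}

const exampleKey = taskKeyForSession(
  sessionIdFromSandboxUrl("https://app.trayn.ai/sandbox/desk.zoho.com/abc123"),
);
console.log(exampleKey);
```

In normal use you would not build this key yourself, since fetchTaskFromUrl() handles the lookup; the sketch only illustrates where a given recording's task definition lives.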

Curated Benchmarks

Pre-built benchmark suites are available for testing without recordings:

trayn --task omnizon --max-steps 15 --reps 2
trayn --task omnizon --filter 8 --max-steps 15 --reps 2

These suites use local task files bundled with the SDK, so they need no recording. URL-based fetching from the backend remains the primary workflow.

Benchmark Suites

Type	Domain	Tasks
omnizon	E-commerce	10
dashdish	Food delivery	11
staynb	Travel booking	9
fly-unified	Flight booking	14
gocalendar	Calendar	10
gomail	Email	8
networkin	Social network	10
opendining	Reservations	10
topwork	Freelance	9
udriver	Ride sharing	11
zilloft	Real estate	10
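If you want to run every suite in turn, the table can be turned into a list of CLI invocations. The sketch below is illustrative only: SUITES and benchmarkCommand are hypothetical names, and it assumes every suite accepts the same --task, --max-steps, and --reps flags shown for omnizon above.

```typescript
// Suite names and task counts copied from the table above.
const SUITES = new Map<string, number>([
  ["omnizon", 10],
  ["dashdish", 11],
  ["staynb", 9],
  ["fly-unified", 14],
  ["gocalendar", 10],
  ["gomail", 8],
  ["networkin", 10],
  ["opendining", 10],
  ["topwork", 9],
  ["udriver", 11],
  ["zilloft", 10],
]);

// Builds the command line for one suite, using the flag values from the
// examples above as defaults.
// ASSUMPTION: all suites take the same flags as omnizon.
function benchmarkCommand(suite: string, maxSteps = 15, reps = 2): string {
  if (!SUITES.has(suite)) {
    throw new Error(`Unknown suite: ${suite}`);
  }
  return `trayn --task ${suite} --max-steps ${maxSteps} --reps ${reps}`;
}

const allCommands = Array.from(SUITES.keys()).map((suite) => benchmarkCommand(suite));
const totalTasks = Array.from(SUITES.values()).reduce((sum, count) => sum + count, 0);

console.log(allCommands.join("\n"));
console.log(`${SUITES.size} suites, ${totalTasks} tasks in total`);
```

The generated lines can be pasted into a shell script or passed to a process runner of your choice.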