Trayn (trayn.ai)

Getting Started

Train, evaluate, and improve your browser agents with Trayn.

Installation

bun add @trayn/agent-sdk

Quick Start

Record a workflow in the Trayn web app. Trayn generates a task definition (goal, verifiers) from your recording and stores it alongside the sandbox. Then point the SDK at the sandbox URL:

trayn --url https://app.trayn.ai/sandbox/desk.zoho.com/abc123 --max-steps 15 --reps 2

The SDK fetches the task definition from the backend, runs your agent against the sandbox, grades the result, and stores memories for learning.

Programmatic Usage

import { harness } from "@trayn/agent-sdk";
 
const h = harness({
  agentargs: {
    agent_name: "my-agent",
    make_agent: () => ({
      get_action: async (obs) => {
        // obs.goal: fetched from the backend task definition
        // obs.url: current sandbox page URL
        // obs.axtree_txt: accessibility tree with element IDs
        return ["click('42')", { think: "clicking add to cart" }];
      },
    }),
  },
  url_override: "https://app.trayn.ai/sandbox/desk.zoho.com/abc123",
});
 
await h.run();
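
The get_action callback above returns a hard-coded element ID. As a rough sketch of how an agent might pick the ID from obs.axtree_txt instead, the helper below scans the tree text for a line with a matching role and label. It assumes each element appears on its own line as [id] role 'name', which is a guess at the serialization rather than something documented here. findElementId and clickAction are hypothetical helpers, not part of @trayn/agent-sdk.

```typescript
// Hypothetical helpers; not part of @trayn/agent-sdk.
// ASSUMPTION: axtree lines look like: [42] button 'Add to cart'
const AXTREE_LINE = /^\s*\[([^\]]+)\]\s+(\S+)\s+'(.*)'/;

// Returns the ID of the first element whose role matches exactly and whose
// name contains the wanted text (case-insensitive), or null if none matches.
function findElementId(axtreeTxt: string, role: string, name: string): string | null {
  const wanted = name.toLowerCase();
  for (const line of axtreeTxt.split("\n")) {
    const match = AXTREE_LINE.exec(line);
    if (!match) continue;
    const [, elementId, lineRole, lineName] = match;
    if (lineRole === role && lineName.toLowerCase().includes(wanted)) {
      return elementId;
    }
  }
  return null;
}

// Builds the same action string format as the example above: click('42')
function clickAction(elementId: string): string {
  return `click('${elementId}')`;
}

const sampleTree = [
  "[1] RootWebArea 'Shop'",
  "  [17] link 'Home'",
  "  [42] button 'Add to cart'",
].join("\n");

const matchedId = findElementId(sampleTree, "button", "add to cart");
console.log(matchedId ? clickAction(matchedId) : "no match");
```

Inside get_action, the returned string would take the place of the literal "click('42')", with a fallback action for the case where nothing matches.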

How Tasks Work

  1. Record: Use the browser extension to record a workflow on a real website
  2. Generate: Trayn AI analyzes the recording and creates a task definition with goal text, verifiers, and expected actions
  3. Edit: You can edit the goal and verifiers in the web app; edits are persisted
  4. Run: The SDK fetches the latest task definition from the backend when you provide a sandbox URL

Tasks are stored in S3 at sandbox-tasks/{sessionId}/task.json. The SDK reads this via fetchTaskFromUrl(). If you edit a task locally with --confirm, changes are pushed back to the backend via pushTaskToRemote().
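
To make the storage layout concrete, the sketch below builds the S3 key from a session ID using the sandbox-tasks/{sessionId}/task.json pattern given above. It also guesses that the session ID is the final path segment of the sandbox URL (abc123 in the Quick Start example). That mapping is an assumption, and taskKey and sessionIdFromSandboxUrl are illustrative names, not SDK functions.

```typescript
// Illustrative only; the SDK's fetchTaskFromUrl() may resolve tasks differently.

// Key layout taken from the text: sandbox-tasks/{sessionId}/task.json
function taskKey(sessionId: string): string {
  if (sessionId.length === 0 || sessionId.includes("/")) {
    throw new Error(`invalid session ID: "${sessionId}"`);
  }
  return `sandbox-tasks/${sessionId}/task.json`;
}

// ASSUMPTION: the session ID is the last non-empty path segment of the sandbox URL.
function sessionIdFromSandboxUrl(sandboxUrl: string): string {
  const segments = new URL(sandboxUrl).pathname
    .split("/")
    .filter((segment) => segment.length > 0);
  const last = segments[segments.length - 1];
  if (last === undefined) {
    throw new Error(`no path segments in URL: ${sandboxUrl}`);
  }
  return last;
}

const exampleUrl = "https://app.trayn.ai/sandbox/desk.zoho.com/abc123";
console.log(taskKey(sessionIdFromSandboxUrl(exampleUrl)));
```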

Curated Benchmarks

Pre-built benchmark suites are available for testing without recordings:

trayn --task omnizon --max-steps 15 --reps 2
trayn --task omnizon --filter 8 --max-steps 15 --reps 2

These suites run from local task files bundled with the SDK, so they need no recording or sandbox URL. URL-based fetching from the backend remains the primary workflow.

Benchmark Suites

Type          Domain            Tasks
omnizon       E-commerce        10
dashdish      Food delivery     11
staynb        Travel booking    9
fly-unified   Flight booking    14
gocalendar    Calendar          10
gomail        Email             8
networkin     Social network    10
opendining    Reservations      10
topwork       Freelance         9
udriver       Ride sharing      11
zilloft       Real estate       10
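
For scripting runs across several suites, the table can be mirrored as data. The sketch below uses only the flags shown in this section (--task, --max-steps, --reps). The SUITES constant and benchmarkCommand helper are illustrative names, and the task counts are copied from the table rather than read from the SDK.

```typescript
// Illustrative mirror of the suite table; counts are copied from the docs, not queried.
const SUITES: Record<string, { domain: string; tasks: number }> = {
  omnizon: { domain: "E-commerce", tasks: 10 },
  dashdish: { domain: "Food delivery", tasks: 11 },
  staynb: { domain: "Travel booking", tasks: 9 },
  "fly-unified": { domain: "Flight booking", tasks: 14 },
  gocalendar: { domain: "Calendar", tasks: 10 },
  gomail: { domain: "Email", tasks: 8 },
  networkin: { domain: "Social network", tasks: 10 },
  opendining: { domain: "Reservations", tasks: 10 },
  topwork: { domain: "Freelance", tasks: 9 },
  udriver: { domain: "Ride sharing", tasks: 11 },
  zilloft: { domain: "Real estate", tasks: 10 },
};

// Builds a CLI invocation in the same shape as the examples above.
function benchmarkCommand(suite: string, maxSteps: number, reps: number): string {
  if (!Object.prototype.hasOwnProperty.call(SUITES, suite)) {
    throw new Error(`unknown suite: ${suite}`);
  }
  return `trayn --task ${suite} --max-steps ${maxSteps} --reps ${reps}`;
}

const totalTasks = Object.values(SUITES).reduce((sum, suite) => sum + suite.tasks, 0);
console.log(`${Object.keys(SUITES).length} suites, ${totalTasks} tasks`);
console.log(benchmarkCommand("omnizon", 15, 2));
```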
