Some workflows are too complex to describe in words alone — especially in healthcare software and legacy web apps where the interaction sequence matters and the UI has unusual patterns. Interactive script building lets you perform the workflow yourself, with Libretto recording every click, fill, and navigation. Your agent then reads the action log and translates what you did into a reusable automation script.

Example prompt

I’m gonna show you a workflow in the eclinicalworks EHR to get a patient’s primary insurance ID. Use libretto skill to turn it into a playwright script that takes patient name and dob as input to get back the insurance ID. URL is …

You perform the workflow once in the browser. Libretto captures the full action log. The agent reads it and writes a script that can run the same steps programmatically.

Step-by-step process

1. Agent opens a headed browser

npx libretto open https://your-ehr.example.com --headed
The browser opens on your screen. You’re in control from here.

2. You perform the workflow manually

Navigate through the application exactly as you would normally. Log in, search for a patient, open their record, find the insurance tab — whatever the workflow requires. Libretto records everything happening in the browser as you go.

3. Libretto records your actions

Every click, form fill, and navigation is written to the action log at:
.libretto/sessions/<session>/actions.jsonl
Each entry captures the action type, the element you interacted with, the value you typed (if any), and the URL of the page at that moment.

4. Agent reads what you did

npx libretto actions
The agent reads the action log to reconstruct your workflow step by step. It can also query the log with jq to find specific actions or filter by page:
# See all fill actions (form inputs you typed into)
jq 'select(.action == "fill")' .libretto/sessions/<session>/actions.jsonl

# See the last 20 actions
tail -n 20 .libretto/sessions/<session>/actions.jsonl | jq .
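
The same filtering can be done in TypeScript when the agent needs the results inside a script rather than in the shell. The sketch below relies only on what this page states: one JSON object per line, with `action`, `value`, and `url` fields. The helper names (`parseActionLog`, `byAction`, `lastN`) are illustrative and not part of Libretto.

```typescript
// Minimal shape for the fields used here; real entries carry more fields.
type ActionEntry = {
  action: string;
  value?: string;
  url?: string;
};

// Parse JSONL text: one JSON object per non-empty line.
function parseActionLog(jsonl: string): ActionEntry[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as ActionEntry);
}

// Equivalent of: jq 'select(.action == "fill")'
function byAction(entries: ActionEntry[], action: string): ActionEntry[] {
  return entries.filter((entry) => entry.action === action);
}

// Equivalent of: tail -n 20
function lastN(entries: ActionEntry[], n: number): ActionEntry[] {
  return n <= 0 ? [] : entries.slice(-n);
}
```

To use it, pass in the contents of actions.jsonl, for example the string returned by Node's `fs.readFileSync(path, "utf8")`.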

5. Agent inspects network traffic

npx libretto network
The agent cross-references your actions with the network requests they triggered. This helps identify the underlying API calls so the workflow can be converted to direct requests if needed.
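
One way to cross-reference the two logs is by time window: treat each request that starts shortly after an action, and before the next action, as a candidate for that action. This page does not document the network log format or any timestamp field, so the `ts` fields, the `NetworkRequest` shape, and the 2-second window below are all assumptions made for illustration.

```typescript
// Hypothetical shapes: neither `ts` nor NetworkRequest comes from Libretto's docs.
type TimedAction = { action: string; url: string; ts: number }; // ts in ms
type NetworkRequest = { method: string; url: string; ts: number }; // ts in ms

// For each action, collect requests that started within `windowMs` after it
// and before the next action began.
function requestsPerAction(
  actions: TimedAction[],
  requests: NetworkRequest[],
  windowMs = 2000,
): Map<TimedAction, NetworkRequest[]> {
  const sorted = [...actions].sort((a, b) => a.ts - b.ts);
  const result = new Map<TimedAction, NetworkRequest[]>();
  sorted.forEach((action, i) => {
    const nextTs = sorted[i + 1]?.ts ?? Infinity;
    const end = Math.min(action.ts + windowMs, nextTs);
    result.set(
      action,
      requests.filter((r) => r.ts >= action.ts && r.ts < end),
    );
  });
  return result;
}
```

A fixed window is a heuristic: background polling can fall inside it and slow responses can fall outside it, so the matches are candidates to inspect rather than a definitive mapping.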

6. Agent writes the workflow file

Using the action log as a blueprint, the agent writes a TypeScript workflow that replicates your steps:
import { workflow } from "libretto";

type Input = {
  patientName: string;
  dob: string; // YYYY-MM-DD
};

type Output = {
  primaryInsuranceId: string;
};

export const getInsuranceId = workflow<Input, Output>(
  async (ctx, input): Promise<Output> => {
    const { page } = ctx;

    await page.goto("https://your-ehr.example.com");

    // Search for the patient by name and DOB
    await page.locator("#patientSearch").fill(input.patientName);
    await page.locator("#dobField").fill(input.dob);
    await page.locator("button[type='submit']").click();

    // Open the patient record
    await page.locator(".patient-result").first().click();

    // Navigate to insurance tab
    await page.locator("#insuranceTab").click();

    // Extract the primary insurance ID
    const insuranceId = await page
      .locator("[data-field='primaryInsuranceId']")
      .textContent();

    return { primaryInsuranceId: insuranceId?.trim() ?? "" };
  },
);
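
The `?? ""` fallback in the workflow above returns an empty string when the element has no text, which can make a failed extraction look like a successful run. One possible alternative, sketched here with a hypothetical helper name, is to fail loudly instead:

```typescript
// Hypothetical helper: turn the raw textContent() result into a usable value,
// throwing instead of silently returning "" when nothing was found.
function requireText(raw: string | null, fieldName: string): string {
  const text = raw?.trim() ?? "";
  if (text === "") {
    throw new Error(`No text found for ${fieldName}`);
  }
  return text;
}
```

In the workflow, the last line would then read `return { primaryInsuranceId: requireText(insuranceId, "primaryInsuranceId") };`, assuming a thrown error is how you want the run to report a missing ID.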

7. Agent validates with a headless run

npx libretto run ./get-insurance-id.ts getInsuranceId --headless \
  --params '{"patientName": "Jane Smith", "dob": "1985-04-12"}'
The agent confirms that the output returned by the run matches what you expect, for example the insurance ID you saw on screen during the manual session.
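
A minimal sketch of that check follows. How `libretto run` reports its output is not described on this page, so the function takes the result as an already-parsed `unknown` value; the name `checkOutput` is illustrative.

```typescript
type Output = { primaryInsuranceId: string };

// Compare a run's result with the value seen during the manual session.
// Returns a list of problems; an empty list means the output matches.
function checkOutput(actual: unknown, expected: Output): string[] {
  if (typeof actual !== "object" || actual === null) {
    return ["output is not an object"];
  }
  const id = (actual as Record<string, unknown>).primaryInsuranceId;
  if (typeof id !== "string") {
    return ["primaryInsuranceId is missing or not a string"];
  }
  if (id === "") {
    return ["primaryInsuranceId is empty"];
  }
  if (id !== expected.primaryInsuranceId) {
    return [`expected ${expected.primaryInsuranceId}, got ${id}`];
  }
  return [];
}
```

Checking for an empty string separately matters here because the workflow falls back to `""` when the element has no text.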

What the action log captures

Each entry in actions.jsonl is a JSON object on its own line. User-initiated events include:
  • action — what happened: click, dblclick, fill, goto, etc.
  • bestSemanticSelector — the most stable selector for the element you interacted with
  • value — the text you typed or option you selected
  • url — the page URL at the time of the action
  • nearbyText — visible text near the element, for human context
Agent-initiated events include the Playwright locator used and whether the action succeeded.
The bestSemanticSelector field in user action entries is the most reliable selector to use in generated code. It is the canonical identifier Libretto chose for the element based on the DOM structure, so prefer it over the targetSelector field or raw CSS paths when writing workflow code.
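
To make the field list concrete, here is a sketch of how a user action entry might be typed and turned into a line of Playwright code. It assumes `bestSemanticSelector` is a string that `page.locator()` accepts, which this page does not confirm. `UserAction` and `toPlaywrightLine` are illustrative names, the optionality of each field is a guess, and only the `click`, `dblclick`, `fill`, and `goto` actions listed above are handled.

```typescript
// Fields as listed above; which ones are optional is an assumption.
type UserAction = {
  action: string;
  bestSemanticSelector?: string;
  value?: string;
  url: string;
  nearbyText?: string;
};

// Render one entry as a line of Playwright code, or null if unsupported.
function toPlaywrightLine(entry: UserAction): string | null {
  // JSON.stringify yields a correctly escaped string literal.
  const q = (s: string) => JSON.stringify(s);
  if (entry.action === "goto") {
    return `await page.goto(${q(entry.url)});`;
  }
  if (!entry.bestSemanticSelector) return null;
  const locator = `page.locator(${q(entry.bestSemanticSelector)})`;
  switch (entry.action) {
    case "click":
      return `await ${locator}.click();`;
    case "dblclick":
      return `await ${locator}.dblclick();`;
    case "fill":
      return `await ${locator}.fill(${q(entry.value ?? "")});`;
    default:
      return null;
  }
}
```

In a real script, typed values such as a patient name would normally be replaced with workflow inputs rather than hard-coded literals.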

When to use this approach

  • The workflow involves many steps and is hard to describe concisely
  • You’re working with EHR, healthcare, or legacy enterprise software with unusual UI patterns
  • You want to delegate the “figure out the selectors” work entirely to the agent
  • The workflow involves conditional paths that are easier to show than explain

One-shot script generation

Let the agent explore a site and build the workflow without your involvement.

Debugging workflows

Reproduce failures, inspect live page state, and fix broken automations.
