Using Backbend

Day-to-day work in the dashboard: creating agents, chatting, scheduling work, editing memory.

Dashboard tour

The signed-in app has one home page and a sidebar of nine sections. Here’s what each one is for and when you’d use it.

Home: Chat with the main agent, see all your agents as cards, and watch today’s spend at a glance. This is where you spend most of your time.
Agents: Full list of every agent (main + sub-agents). Each row links to a detail page with Chat, Tasks, Settings, Memory, and Logs tabs.
Tasks: Every scheduled task across every agent, grouped by agent. Edit the cron expression, preview next runs, kick off a manual run, see run history.
LLM: API keys for the LLM providers your agents use. Stored encrypted at rest with a key derived from your account secret.
Connections: OAuth-linked services. Click Connect on a row, approve in the popup, and your agents inherit access.
Memory: Browse, edit, or wipe what each agent has remembered. Useful when you want to seed an agent with facts or correct a memory it stored wrong.
Tools: Built-in and custom tool definitions. Toggle which tools an agent has access to from the agent’s Settings tab.
Skills: Reusable named playbooks (markdown + small frontmatter) that the agent can invoke. Think saved prompts on steroids.
Testing: A sandbox for poking at agents in isolation before pushing changes to the live one. See Testing sandbox.

Top-of-home chat

The chat input at the top of Home always talks to your main agent. Suggestion chips above the input fire common prompts (“what did I miss today,” “summarize my morning,” …) so you don’t have to type the same thing every day.

Agent cards

Below the chat, each agent shows up as a card with avatar, run state, tasks count, today’s spend, and the most recent task runs. Click anywhere on the card to open the detail page; the triple-dot menu has Pause/Resume, Settings, and Delete.

Agents

Creating an agent

Hit + New agent on the home page or the Agents list. Required: a name. Optional: a system prompt (what this agent is for and how it should behave), a starter template (research, secretary, code reviewer, and a few others), and a profile picture.

Behind the scenes the platform provisions an isolated container with the agent’s memory, gives it your account LLM keys and connections, and boots the run loop. Time to first response: under 10 seconds.

System prompts

The system prompt is the agent’s persistent identity. A good system prompt is two paragraphs at most: what the agent does, what tone it uses, what it should refuse. Don’t put instructions for individual tasks here; those go in the task prompt.

You can edit the system prompt any time. Changes apply on the next message; running tasks finish on the old prompt.

Lifecycle states

Running: The default state. Agent answers chat and runs tasks.
Starting: The container is provisioning. Lasts a few seconds; if it stays stuck for more than 30 seconds, check that you have an LLM key configured.
Paused: Container is stopped. Scheduled tasks don’t fire and the chat input is disabled. Memory and configuration are preserved; this is “turn off without losing state.”
Failed: The container crashed or refused to start. Click the agent to see the failure reason and a Resume button.
Deleting: Tear-down is in progress. The card stays in the grid (dimmed) until the container is gone, then disappears.

Pausing and deleting

The triple-dot menu on each agent card has Pause / Resume, Settings, and Delete. Deleting an agent removes its container, its memory, and its task history; this is permanent. The main agent can be paused but not deleted.

Settings

The Settings tab on the agent detail page covers: name, system prompt, default LLM model, tool access, connection scope (which connections this agent can use), trigger setup (iMessage, email, webhook), and per-agent quotas (max spend per day, max concurrent tasks). Changes apply on the next message; no restart needed.

Logs

The Logs tab shows raw model calls, tool invocations, and any errors the agent emitted. You usually won’t need this; when you do (debugging a stuck task, for example), it’s the source of truth.

Chatting with an agent

Every agent has a chat surface. On the home page the chat box at the top talks to your main agent. Open an agent’s detail page and the same input talks to that agent.

Streaming responses

Responses stream token by token. You can interrupt at any time by pressing Esc or the stop button; the agent remembers the partial response and you can pick up the conversation right where it stopped.

Model picker

The pill above the input lets you switch which LLM the agent uses for this message. The default is whatever you set in the agent’s settings; the picker only overrides it for the current message, so you can try a single response against a different model without changing the agent permanently.

Attachments

You can paste or drop files into the input. Images go straight into the model context if the model supports vision; PDFs and text files are read and inlined. There’s a 25 MB per-file limit.

Markdown rendering

The agent’s replies render as Markdown: headings, lists, tables, code blocks with syntax highlighting, links, and inline math. Long code blocks get a Copy button automatically.

Conversation history

Conversations persist forever. Scroll up to see what you said last month; jump to a specific date with the calendar in the chat header. The agent has access to the same history when composing replies, so “what did I ask you about that Acme deal” works without you needing to find the original message.

Tasks and scheduling

What a task is

A task is a saved instruction the agent runs, either on demand or on a schedule. A task has a name, a prompt (what the agent should do when it fires), and an optional cron expression. Tasks live on a specific agent; the agent’s tools, memory, and connections are all available when the task runs.

Manual runs

Click the run button on any task to kick off a one-off execution. The run shows up in the task’s history with a green/red status dot. Useful for testing the prompt before you schedule it, or for ad-hoc “run this now” jobs.

Scheduled runs

Tasks accept standard 5-field cron expressions:

*  *  *  *  *
|  |  |  |  +-- day of week (0-6, Sunday=0)
|  |  |  +----- month (1-12)
|  |  +-------- day of month (1-31)
|  +----------- hour (0-23)
+-------------- minute (0-59)

Common shortcuts the picker exposes:

0 8 * * *: every day at 8 AM
0 8 * * 1-5: weekdays at 8 AM
*/15 * * * *: every 15 minutes
0 9 1 * *: 9 AM on the first of each month
30 14 * * 5: 2:30 PM every Friday

The Tasks page preview shows the next five fire times so you can sanity-check before saving. Timezone follows your account timezone; edit it in Settings if you’ve moved.
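
To make the preview concrete, here is a minimal Python sketch of how the next fire times for a 5-field expression could be computed. It is not Backbend’s actual scheduler: the function names are made up for illustration, it only understands the forms shown above (*, */n, a-b, comma lists, plain numbers), and it works on naive datetimes, so it does not model the account timezone.

```python
from datetime import datetime, timedelta

# (lowest, highest) legal value for minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def parse_field(field: str, lo: int, hi: int) -> set[int]:
    """Expand one field such as '*', '*/15', '1-5', '1,15' or '8' into a set."""
    values: set[int] = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            start, end = (int(x) for x in part.split("-"))
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def next_runs(expr: str, after: datetime, count: int = 5) -> list[datetime]:
    """Return the next `count` minutes strictly after `after` that match `expr`."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 cron fields")
    minutes, hours, days, months, weekdays = (
        parse_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    both_day_fields_set = fields[2] != "*" and fields[4] != "*"

    runs: list[datetime] = []
    t = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    deadline = t + timedelta(days=366)  # give up after scanning one year
    while len(runs) < count and t < deadline:
        cron_weekday = (t.weekday() + 1) % 7  # Python: Monday=0, cron: Sunday=0
        day_ok, weekday_ok = t.day in days, cron_weekday in weekdays
        # Classic cron rule: if both day fields are restricted, either may match.
        day_match = (day_ok or weekday_ok) if both_day_fields_set else (day_ok and weekday_ok)
        if t.minute in minutes and t.hour in hours and t.month in months and day_match:
            runs.append(t)
        t += timedelta(minutes=1)
    return runs
```

For example, next_runs("0 8 * * 1-5", datetime(2024, 1, 5, 9, 0), count=3) starts on a Friday after 8 AM, skips the weekend, and returns 8 AM on January 8, 9, and 10.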

Run history

Every run records:

Start time, duration, end time.
Token usage (input + output, per model).
Cost in USD, computed from the active LLM provider’s rates.
Status: completed, failed, or stuck (no progress in 5 minutes).
The agent’s output and a transcript of every tool call.

Failed runs include a short reason and the last few tool calls so you can usually diagnose without opening the full transcript. Click into a run for the complete picture.
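
As a rough illustration of how the cost and the stuck status in a run record might be derived, here is a small Python sketch. The type and function names are hypothetical, and the rates table is a placeholder with invented numbers; real rates come from whichever LLM provider is active. Only the 5-minute threshold is taken from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# From the list above: a run counts as stuck after 5 minutes without progress.
STUCK_AFTER = timedelta(minutes=5)

# Placeholder USD rates per million tokens. These numbers are invented.
EXAMPLE_RATES = {"example-model": {"input": 3.00, "output": 15.00}}


@dataclass
class ModelUsage:
    """Token usage for one model within a single run (hypothetical shape)."""
    model: str
    input_tokens: int
    output_tokens: int


def run_cost_usd(usage: list[ModelUsage], rates: dict = EXAMPLE_RATES) -> float:
    """Sum input and output token costs across every model used in the run."""
    total = 0.0
    for u in usage:
        rate = rates[u.model]
        total += u.input_tokens / 1_000_000 * rate["input"]
        total += u.output_tokens / 1_000_000 * rate["output"]
    return round(total, 6)


def is_stuck(last_progress: datetime, now: datetime) -> bool:
    """True once the run has gone STUCK_AFTER or longer without progress."""
    return now - last_progress >= STUCK_AFTER
```

Keeping usage per model matters because one run can call more than one model, and each is billed at its own rate.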

Failure handling

A task that fails three times in a row gets auto-paused so a broken integration doesn’t burn tokens overnight. You get a notification, fix the cause, and resume from the Tasks page.
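
The auto-pause rule amounts to a counter of consecutive failures. The Python sketch below shows one way it could work; the names are hypothetical, and two details are assumptions rather than documented behavior: a successful run resets the counter (implied by “in a row”), and resuming clears it.

```python
from dataclasses import dataclass

MAX_CONSECUTIVE_FAILURES = 3  # from the text: three failures in a row


@dataclass
class TaskState:
    """Minimal per-task bookkeeping (hypothetical shape)."""
    consecutive_failures: int = 0
    paused: bool = False


def record_run(state: TaskState, succeeded: bool) -> TaskState:
    """Update the failure streak after a run and pause the task if needed."""
    if succeeded:
        state.consecutive_failures = 0  # assumption: success resets the streak
    else:
        state.consecutive_failures += 1
        if state.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            state.paused = True
    return state


def resume(state: TaskState) -> TaskState:
    """Un-pause the task; clearing the streak here is an assumption."""
    state.paused = False
    state.consecutive_failures = 0
    return state
```

Under these assumptions, fail, fail, succeed, fail, fail leaves the task active; one more failure pauses it.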

Concurrency

By default an agent runs one task at a time. If a task is already running when the next cron fire arrives, the new run queues. Turn on parallel runs in the agent’s Settings if you have tasks that don’t depend on each other.
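
The queueing behavior can be pictured with a small Python model. This is only a sketch: the class and method names are invented, and first-in, first-out ordering for queued runs is an assumption the text does not confirm.

```python
from collections import deque
from typing import Optional


class AgentScheduler:
    """Toy model of one agent's run slots (hypothetical, not the real scheduler)."""

    def __init__(self, parallel: bool = False):
        self.parallel = parallel          # the "parallel runs" setting
        self.running: set[str] = set()    # run ids currently executing
        self.queued: deque[str] = deque() # run ids waiting for a free slot

    def fire(self, run_id: str) -> str:
        """Called when a cron fire or manual run arrives."""
        if self.running and not self.parallel:
            self.queued.append(run_id)
            return "queued"
        self.running.add(run_id)
        return "running"

    def finish(self, run_id: str) -> Optional[str]:
        """Mark a run as done; start and return the next queued run, if any."""
        self.running.discard(run_id)
        if self.queued and not self.running:
            next_id = self.queued.popleft()  # assumption: FIFO order
            self.running.add(next_id)
            return next_id
        return None
```

With the default setting, a second fire() while a run is active returns "queued"; with parallel=True both runs start immediately.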

Memory

Each agent has a persistent memory store that survives across conversations, tasks, and restarts. The agent reads it at the start of every run and writes new entries as it learns things.

Types of memory

User memory: Facts about you: your role, your preferences, your tone, what you do for a living, who’s on your team. Used by the agent to tailor responses.
Feedback memory: Corrections you’ve given the agent. “Don’t summarize at the end of every reply,” “always use metric units,” etc. The agent honors these on every future turn.
Project memory: Context about ongoing work: deadlines, stakeholders, decisions made, things to watch.
Reference memory: Pointers to external systems: “bugs go in this Linear project,” “oncall watches this Grafana board,” “design lives in this Figma file.”

Editing memory

Open Dashboard → Memory and pick an agent. Each memory file shows up as an editable card: rename, rewrite, delete, or add new entries. The agent picks up changes on its next turn; you don’t need to restart anything.

Seeding new agents

When you create a new agent, its memory starts empty. To save time, you can copy a starter memory pack (Software engineer, Founder, Operator, …) from the templates dropdown. These are short, opinionated defaults that the agent immediately refines as you work with it.

Isolation

Main-agent memory and sub-agent memory don’t cross-talk. That’s deliberate: when you launch a sub-agent for a specific job, you don’t want its scratch work polluting the main agent’s long-term view of you.

How the agent writes memory

The agent writes memory using a built-in tool. You can see every write in the Logs tab. If the agent writes something wrong, edit it directly in the Memory page or tell the agent what’s wrong (“forget that I work at Acme, I left in March”); it’ll delete or update the entry.

Testing sandbox

The Testing page is a sandbox for poking at agents without touching their real memory or tool surface. Useful when you’re iterating on a system prompt and don’t want to pollute the live agent’s history.

How it works

Pick an agent and click Clone for testing.
A fork is provisioned with the same configuration (system prompt, tool list, connection scope) but empty memory.
Iterate against the clone: try different prompts, see how it handles edge cases, push tool calls you’d be nervous to push to the real agent.
When you’re happy, click Promote to real to copy the clone’s configuration back to the original agent. Memory stays on the clone (you don’t want to overwrite real memory with test data).
Delete the clone when you’re done.
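
The clone and promote steps above come down to two rules: configuration is copied, memory is not. The Python sketch below models that with hypothetical types and function names; it illustrates the data flow only and is not how the platform provisions forks.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """The parts of an agent that a clone copies (per the steps above)."""
    system_prompt: str
    tools: list[str]
    connection_scope: list[str]


@dataclass
class Agent:
    name: str
    config: AgentConfig
    memory: dict[str, str] = field(default_factory=dict)


def clone_for_testing(agent: Agent) -> Agent:
    """Fork the agent: same configuration, empty memory."""
    return Agent(
        name=f"{agent.name} (test)",  # naming scheme is an assumption
        config=copy.deepcopy(agent.config),
        memory={},
    )


def promote_to_real(clone: Agent, original: Agent) -> None:
    """Copy the clone's configuration back; the original's memory is untouched."""
    original.config = copy.deepcopy(clone.config)
```

The deep copies keep the clone and the original from sharing tool or connection lists, so edits on one side never leak to the other until you promote.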

When you’d use this

Rewriting a system prompt and A/B testing it on real conversation history.
Testing a new tool or skill without touching the live agent.
Diagnosing why an agent behaves a certain way (clone, replay, see what changes).