Integrations
Enable code sandboxes, image generation, web browsing, payments, and custom plugins.
Overview
Integrations extend what your agent can do. With them it can browse the web, write and run code, generate images, collect payments, schedule messages, automate websites, and connect to external services through custom plugins.
Some capabilities are built in and always available. Others can be enabled per-instance through the Integrations page.
Built-in capabilities (always available)
These tools are available on every instance by default. No setup required.
Web browsing and search
The agent has full, unrestricted internet access. It can:
- Search the web — Find information on any topic using real-time search
- Read web pages — Navigate to any URL and extract the content
- Follow links — Browse across multiple pages to find what it needs
When you ask the agent something the Knowledge Base doesn't cover, it can look the answer up online.
Persistent browser (Sprites)
The agent has its own dedicated Chrome browser that maintains state between conversations:
- Navigate websites — Open any URL, read page content, interact with elements
- Fill out forms — Complete forms on any website with supplied data
- Log into accounts — Sign into web services and stay logged in across sessions. Cookies, sessions, and profiles persist.
- Automate multi-step workflows — Click through sequences, fill forms, download files — up to 40 steps per automation
- Extract structured data — Pull specific data from websites into clean, structured formats (JSON)
- Save and replay automations — When the agent completes a browser task successfully, it saves the workflow. Next time a similar task comes up, it replays the automation faster and more reliably.
The browser hibernates when idle and wakes up in 1-3 seconds when needed.
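To make "clean, structured formats (JSON)" concrete, here is a minimal sketch of what extraction produces. It is not how the agent's browser works internally: the agent operates on live pages, while this example parses a small hard-coded HTML snippet with Python's standard library. The page content and the field names `name` and `price_kes` are assumptions made up for illustration.

```python
import json
from html.parser import HTMLParser


class PriceTableParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[list[str]] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            if not self.rows:  # tolerate a <td> that appears without a <tr>
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Only text inside a cell is kept; whitespace between tags is ignored.
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def extract_products(html: str) -> list[dict]:
    """Turn two-column table rows into records (field names are hypothetical)."""
    parser = PriceTableParser()
    parser.feed(html)
    return [
        {"name": row[0], "price_kes": float(row[1])}
        for row in parser.rows
        if len(row) == 2
    ]


# Hypothetical page content standing in for a supplier's price list.
page = """
<table>
  <tr><td>Maize flour 2kg</td><td>210</td></tr>
  <tr><td>Cooking oil 1L</td><td>345.50</td></tr>
</table>
"""

print(json.dumps(extract_products(page), indent=2))
```

The result is a list of records with consistent keys, which is easier to pass to a script or a report than raw page text.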
Knowledge Base search
The agent automatically searches your uploaded documents when answering questions. See Knowledge Base for details.
Skills and memory
- Save and reuse skills — The agent can store procedures and workflows it learns, then recall them for future tasks
- Working notes — The agent saves notes for context it needs to remember across conversation turns
- Browser skills — Saved browser automations that can be replayed
Sub-agent delegation
For complex tasks, the agent can spawn parallel sub-agents — each with access to the full toolkit (minus messaging). This lets it research multiple topics simultaneously, process data while browsing the web, or work on several parts of a task at once.
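As a rough analogy for this fan-out pattern, the sketch below runs one worker per topic in parallel and gathers the results. It is a conceptual model only, not the agent's actual delegation mechanism: `research` and `delegate` are hypothetical names, and the worker returns a placeholder string rather than doing real research.

```python
from concurrent.futures import ThreadPoolExecutor


def research(topic: str) -> str:
    """Hypothetical stand-in for the work a single sub-agent would do."""
    return f"summary of {topic}"


def delegate(topics: list[str]) -> dict[str, str]:
    """Run one worker per topic in parallel and collect results by topic."""
    with ThreadPoolExecutor(max_workers=max(1, len(topics))) as pool:
        # pool.map preserves input order, so results line up with topics.
        results = list(pool.map(research, topics))
    return dict(zip(topics, results))


print(delegate(["supplier prices", "shipping times", "competitor offers"]))
```

The parent waits for all workers to finish and then combines their outputs, which is the same shape as an agent merging sub-agent findings into a single reply.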
File processing
The agent can read and extract content from uploaded files including Word documents, Excel spreadsheets, PowerPoint presentations, text files, and Markdown.
Installable toolkits
These capabilities can be enabled per-instance from the Integrations page.
Code sandboxes (Daytona)
Give your agent a persistent coding environment where it can write and run code.
What it can do:
- Create persistent sandboxes in Python, TypeScript, or JavaScript
- Execute shell commands and scripts
- Read, write, and manage files in the sandbox
- Export generated files and send them back via WhatsApp
- Run data analysis, generate reports, create charts, process spreadsheets
- Build and run scripts of any complexity
Example use cases:
- "Analyze this CSV and tell me the top 10 customers by revenue"
- "Write a script that converts these prices from USD to KES"
- "Generate a bar chart comparing our monthly sales"
- "Process this spreadsheet and create a summary report"
Sandboxes are isolated for security and can be configured with network access controls.
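For a sense of what the agent might write for the first use case above, here is a minimal sketch of a "top customers by revenue" script using only the Python standard library. The column names `customer` and `revenue` and the sample rows are assumptions; a real script would depend on the layout of the file you upload.

```python
import csv
import io
from collections import defaultdict


def top_customers(csv_text: str, n: int = 10) -> list[tuple[str, float]]:
    """Return the n customers with the highest total revenue, largest first."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Column names "customer" and "revenue" are assumed for this example.
        totals[row["customer"]] += float(row["revenue"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Hypothetical sample data standing in for an uploaded CSV file.
sample = """customer,revenue
Acme,1200.50
Baraka Stores,800
Acme,300
Zuri Ltd,950.25
"""

for name, total in top_customers(sample, n=3):
    print(f"{name}: {total:.2f}")
```

Repeat customers are summed before ranking, so a customer with several small orders can still outrank one with a single large order.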
Image generation (Fal.ai)
Let your agent create, edit, and enhance images.
What it can do:
- Generate images from text — Describe what you want and the agent creates it
- Edit existing images — Modify uploaded images based on instructions
- Upscale images — Enhance low-resolution images for better clarity and detail
Example use cases:
- "Create a promotional poster for our weekend sale"
- "Generate a product mockup with our logo"
- "Upscale this low-res photo so we can use it on our website"
- "Edit this image to remove the background"
Payments (M-Pesa)
Collect payments, process refunds, and manage payouts through M-Pesa. See Payments for the full guide.
What it can do:
- Send STK push payment prompts to customers during conversations
- Process refunds for completed transactions
- Request payouts to till numbers or phone numbers
Scheduling
Schedule messages, broadcasts, and follow-ups. See Scheduling & Broadcasts for the full guide.
What it can do:
- Schedule one-time or recurring messages
- Create AI-personalized broadcasts to contact groups
- Set follow-up reminders that the agent sends automatically
- Manage and cancel scheduled tasks
Installing a toolkit
- Go to the Integrations page in your instance sidebar
- Browse available integrations
- Click Install on the one you want
- Follow any configuration steps
Once installed, the toolkit is active and your agent can use its tools immediately.
Granting services to your instance
Some integrations are installed at the organization level. To make them available to a specific instance:
- Go to your instance's Integrations page
- In the Granted Services section, grant the service access
Once access is granted, the agent on that instance can use the service's tools.
Each granted service shows its name, when it was granted, and an Active status badge.
Custom plugins
Plugins let you extend the agent with any custom functionality — connect it to your CRM, your project management tool, your internal API, or any third-party service.
When a plugin is installed, its tools become available to the agent just like built-in tools. The agent can call them during conversations based on what the user is asking for.
See the Developer Overview for details on building custom plugins.
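The actual plugin interface is documented in the Developer Overview and is not reproduced here. As a mental model only, the sketch below shows the general idea of a tool as a name, a description, and a handler that sits in the same registry as every other tool. All names in it (`Tool`, `register`, `call_tool`, `lookup_order`) are hypothetical and do not come from the platform's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """Hypothetical tool record: a name, a description, and a handler."""
    name: str
    description: str
    handler: Callable[..., str]


# In this model, built-in and plugin tools share a single registry.
registry: dict[str, Tool] = {}


def register(tool: Tool) -> None:
    """Make a tool available under its name."""
    registry[tool.name] = tool


def call_tool(name: str, **kwargs) -> str:
    """Look up a tool by name and run its handler with the given arguments."""
    return registry[name].handler(**kwargs)


def lookup_order(order_id: str) -> str:
    # Stand-in for a call to your CRM or internal API.
    return f"Order {order_id}: shipped"


register(Tool("lookup_order", "Look up an order's status by ID", lookup_order))
print(call_tool("lookup_order", order_id="A-1042"))
```

In this model the description matters as much as the handler, because it is what lets an agent decide when a tool is relevant to the user's request.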
Combining capabilities
The real power comes from combining tools. Your agent can chain capabilities together in ways that would take a human multiple steps and multiple applications:
- Research + Code + Report: Search the web for data → write a Python script to analyze it → generate a chart → send the chart as a WhatsApp image
- Browser + Payment: Log into a supplier's portal → check prices → calculate a quote → collect payment via M-Pesa
- Image + Schedule: Generate a promotional image → schedule it as a broadcast to your VIP customer group → follow up a day later to check engagement
- Browse + Knowledge + Response: Read a customer's order on your web portal → cross-reference with your knowledge base → compose a detailed status update
These combinations are examples, not a fixed list. Describe what you need in the system prompt or in conversation, and the agent works out which tools to use and in what order.
Next steps
- Settings & Account — Manage your account, API keys, and organization
- Troubleshooting — Fix common issues
- Developer Overview — Build custom plugins to extend the agent