Choosing an AI sandbox provider in 2026

Jan 30, 2026

Written by Simon Spurrier

For more than 18 months we have been running our AI code agents on our own Firecracker microVMs with custom orchestration. This worked great, and was the fastest and most reliable option at the time, back when the concept of sandboxes for remote AI agents was only just emerging.

Since launching cto.new late last year, this has supported tens of thousands of users merging hundreds of thousands of pull requests generated by our agents.

Going into 2026, it’s time for an upgrade, and that means handing off our sandbox infrastructure to one of the many startups that have sprung up to specialise in this part of the stack.

Pseudo terminal support

Well before Claude Code, we realised that LLMs are really powerful when you give them full access to an Ubuntu system via a terminal.

To handle never-ending edge cases, and to let the LLM use a terminal the way it learned to during post-training, our solution needed to be close to feature-complete, including support for long-running processes and interactive commands.

To solve this, we used node-pty, which gave the LLM access to a pseudo terminal running on our microVMs. This also meant our users could interact directly with the same terminal in our web app.
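To make the idea concrete, here is a minimal sketch of running a command attached to a pseudo terminal. It uses Python's standard `pty` module rather than node-pty, purely so the example is self-contained. The function name `run_in_pty` and its `timeout` parameter are hypothetical and are not taken from our codebase.

```python
import os
import pty
import select
import subprocess


def run_in_pty(command: list[str], timeout: float = 5.0) -> tuple[int, str]:
    """Run `command` attached to a pseudo terminal; return (exit_code, output).

    Illustrative only: a real agent terminal would also forward input,
    handle window resizing, and stream output instead of buffering it.
    """
    master_fd, slave_fd = pty.openpty()
    try:
        try:
            # The child sees the slave end as its terminal (stdin/stdout/stderr).
            proc = subprocess.Popen(
                command,
                stdin=slave_fd,
                stdout=slave_fd,
                stderr=slave_fd,
                close_fds=True,
            )
        finally:
            # The parent only needs the master end.
            os.close(slave_fd)

        chunks: list[bytes] = []
        while True:
            ready, _, _ = select.select([master_fd], [], [], timeout)
            if not ready:
                # No output within `timeout` seconds: give up on the command.
                proc.kill()
                break
            try:
                data = os.read(master_fd, 4096)
            except OSError:
                # On Linux, reading the master raises EIO once the child
                # side of the terminal has been closed.
                break
            if not data:
                break
            chunks.append(data)

        exit_code = proc.wait()
        return exit_code, b"".join(chunks).decode(errors="replace")
    finally:
        os.close(master_fd)
```

Because the child process is attached to a terminal rather than a pipe, programs that check `isatty()` behave as they would in an interactive shell, which is the property an agent's bash tool depends on.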

This was a complex piece of work throughout our stack and was the basis for our early product-market fit in remote code agents. While we could simply run node-pty on any sandbox provider, we wanted to remove this complexity as part of our sandbox handoff, and assumed it would be well supported.

Surprisingly, some widely used sandbox providers did not have the pseudo terminal support needed for a bash tool that can handle interactive and long-running commands. For example, after getting pretty deep into our evaluation of Modal, a service that supports many AI code agents, we discovered a server-side issue causing inconsistent truncation and streaming of standard output (we opened an issue with our findings). This raises the question: are services built on Modal seriously compromising on terminal use?

Examples of services that appear to have better pseudo terminal support include Daytona, E2B, and Sprites.

Snapshots

Snapshots, or checkpoints, capture the filesystem and memory state of a sandbox at a point in time so that it can be restored to that state later on.

Snapshots can be useful for several things in code agents. They can serve as persistence of the state of a codebase or application (especially where development is only done in a linear fashion), although git is much more commonly used for this. They can also be used before running unsafe operations in case something goes wrong.

In our case, we are most interested in using snapshots for fast, consistent development environment startup.

One of the reasons ‘serious’ software development has not made the leap into remote environments is that remote development environments are harder to predict, control, and keep available. When working locally, setting up your environment can be tricky at first, but then you can just start your machine and pick up where you left off. Remote development environments are usually ephemeral and less tactile.

We’re solving that by combining snapshots with ‘setup agents’ to automatically manage and fast-start development environments of arbitrary complexity. The experience will be just like opening your laptop and getting started right away.

This also applies to ‘vibe coding’. One limit on how complex a vibe-coded app can become is the set of environment and codebase guardrails, which exist partly so that the development environment doesn’t become outdated or require custom changes. Removing these limitations means your vibe-coded app doesn’t hit a brick wall when you start getting serious.

Snapshot branching, the ability to start new sandboxes from a snapshot, is still an emerging capability: at most sandbox providers it is on the roadmap or in beta. Modal leads on this capability.
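As a rough illustration of what branching means, the sketch below models a snapshot as a plain directory copy and starts two independent sandboxes from it. This is a filesystem-only toy. It does not capture memory state and it is not any provider's API. All function and directory names here are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def take_snapshot(sandbox_dir: Path, snapshot_root: Path, name: str) -> Path:
    """Copy a sandbox's files into a snapshot directory (treated as read-only)."""
    snapshot_dir = snapshot_root / name
    shutil.copytree(sandbox_dir, snapshot_dir, symlinks=True)
    return snapshot_dir


def branch_from_snapshot(snapshot_dir: Path, sandboxes_root: Path, name: str) -> Path:
    """Start a new, independent sandbox whose files begin as a copy of the snapshot."""
    new_sandbox = sandboxes_root / name
    shutil.copytree(snapshot_dir, new_sandbox, symlinks=True)
    return new_sandbox


def demo() -> dict[str, str]:
    """Snapshot a prepared environment once, then branch two sandboxes from it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)

        # A "prepared" environment, e.g. after a setup agent has run.
        base = root / "base"
        base.mkdir()
        (base / "app.txt").write_text("v1")

        snapshot = take_snapshot(base, root / "snapshots", "after-setup")
        agent_a = branch_from_snapshot(snapshot, root / "sandboxes", "agent-a")
        agent_b = branch_from_snapshot(snapshot, root / "sandboxes", "agent-b")

        # Changes in one branch do not affect the snapshot or the other branch.
        (agent_a / "app.txt").write_text("v2 from agent-a")

        return {
            "snapshot": (snapshot / "app.txt").read_text(),
            "agent-a": (agent_a / "app.txt").read_text(),
            "agent-b": (agent_b / "app.txt").read_text(),
        }
```

Calling `demo()` shows the property that matters: the environment is prepared once, and each new sandbox starts from that state without affecting the others. A real provider would presumably avoid full copies, for example with copy-on-write storage, but that detail is outside this sketch.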

Later, we might use containers to serve our users’ applications at a small scale, where snapshots will allow for instant startup. In the distant future, snapshots could even be useful to allow end users to change the code of the application they are using.

Where we ended up

Our two core requirements - a stable pseudo terminal and snapshot branching - were difficult to find in the same product. This was surprising, given that we view them as critical to a great AI-powered remote software development experience.

Ultimately, we prioritised a stable pseudo terminal, broad technical flexibility, and a product and team that appear to be moving fast at the frontier of AI sandboxes, and we settled on Daytona. The vast majority of our workloads now run on Daytona’s infrastructure.

We are now working with early access to Daytona’s snapshot branching feature to realise our vision of how AI-powered software development should be done in remote environments, and we plan to roll that out to our users in the coming weeks.