February 18, 2026 · 12 min read · Infrastructure, AI Development, Remote Dev

I Type “swarm” and Four AI Agents Wake Up on a Server 2,000 Miles Away

Most developers run one AI coding session and call it cutting edge. I run four in parallel on a headless server and connect from my MacBook with one command, or from my iPhone in bed. The agents never sleep. Here’s what that setup actually looks like.

It's 11 PM. I'm on my couch. I pick up my iPhone, open Blink Shell, and type one word. Two seconds later, I'm looking at four AI coding agents running on a server that hasn't restarted in weeks. One is mid-way through building a feature. Another finished a bug fix while I was at dinner. A third is running tests. The fourth is waiting for my next instruction.

I didn't open a laptop. I didn't launch an IDE. I didn't re-explain anything to anyone. The agents remembered exactly where they were because they never left.

The core idea: Your AI agents shouldn't live on your laptop. They should live on infrastructure that never sleeps, never restarts, and never forgets what it was doing—and you should be able to reach them from anywhere.

The Ceiling of Local Development

Here's the workflow most people have: open a terminal, start Claude Code, work on a problem, close the laptop when they're done. The session dies. Tomorrow they open it again, re-explain the context, and lose the first twenty minutes getting back to where they were.

Now multiply that by four projects across three companies. That's not a workflow. That's a bottleneck wearing a trenchcoat.

  • Resource contention: Your laptop splits RAM between your browser, editor, Slack, and the AI agent fighting for what's left
  • Session fragility: Close the lid, lose the context. OS update? Gone. Power outage? Start over.
  • Sequential thinking: One session means one task at a time. You become the bottleneck across every project.

The answer isn't a faster laptop. It's removing the laptop from the equation entirely.

The Architecture

Everything runs on a single headless Ubuntu server. No desktop environment. No GUI. No mouse cursor rendering on a screen nobody's looking at. Just a terminal, and every resource dedicated to actual work.

The Stack

  • Server: 24 CPU cores, 188 GB RAM (using less than 5% at any time)
  • tmux: Persistent terminal multiplexer (sessions survive disconnections)
  • Tailscale: Private encrypted mesh network (no public IPs exposed)
  • SSH + Keepalive: Persistent connections (tunnels that don't drop)
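
The keepalive piece of the stack can be sketched as a thin wrapper around ssh. This is a minimal sketch, not my exact configuration: the host alias `swarm-server`, the helper names, and the 30-second / 4-probe values are assumptions chosen for illustration. `ServerAliveInterval` and `ServerAliveCountMax` are standard OpenSSH client options.

```shell
#!/bin/sh
# Hypothetical host alias; replace with your server's Tailscale name or IP.
SWARM_HOST="${SWARM_HOST:-swarm-server}"

# Print the ssh options that keep an idle connection alive.
# ServerAliveInterval: seconds between keepalive probes sent by the client.
# ServerAliveCountMax: unanswered probes tolerated before ssh gives up.
keepalive_opts() {
    printf '%s\n' "-o ServerAliveInterval=${1:-30} -o ServerAliveCountMax=${2:-4}"
}

# Connect with keepalives enabled.
# SSH_BIN can be overridden to preview the command instead of running it.
swarm_ssh() {
    # Word splitting of the options string is intentional here.
    "${SSH_BIN:-ssh}" $(keepalive_opts) "$SWARM_HOST" "$@"
}

# Usage:
#   swarm_ssh            # interactive login with keepalives
#   swarm_ssh uptime     # run one remote command
```

The same two options can live in `~/.ssh/config` under a `Host` entry so that every connection to the server picks them up.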

The server runs a tmux session called swarm with four windows—OC1, OC2, OC3, OC4—each running an independent Claude Code instance. These agents are working whether I'm watching or not.
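
Here is a rough sketch of how a session like this could be created on the server. It assumes the Claude Code CLI is on the PATH as `claude`. The function name `create_swarm` and the `TMUX_BIN` override are hypothetical helpers for illustration, not my actual startup script.

```shell
#!/bin/sh
# TMUX_BIN can be overridden to preview the commands instead of running them.
tmx() { "${TMUX_BIN:-tmux}" "$@"; }

# Create a detached session named "swarm" with windows OC1..OC4,
# each starting the agent command given as $1.
create_swarm() {
    agent_cmd="${1:-claude}"   # assumption: Claude Code is launched as `claude`
    # -d: do not attach; -s: session name; -n: name of the first window
    tmx new-session -d -s swarm -n OC1 "$agent_cmd"
    for n in 2 3 4; do
        # -t swarm: (trailing colon) next free window index in session swarm
        tmx new-window -d -t swarm: -n "OC${n}" "$agent_cmd"
    done
}

# Usage (run once on the server, e.g. after a reboot):
#   create_swarm
```

Because the session is created detached, nothing here depends on a client being connected. A window closes when its command exits, so pass a shell instead of `claude` if you want the window to outlive the agent.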

The Key Insight: Grouped Sessions

Here's the problem nobody tells you about tmux: when multiple clients attach to the same session, they share its active window. Switch windows on one screen, and every other screen switches too. If you're trying to watch four agents independently, that's useless.

The fix is grouped sessions—a tmux feature most people don't know exists. Each connection gets its own session that shares the same windows but maintains an independent view. I can watch OC1 on my MacBook while my iPhone shows OC3, and neither affects the other.

# Each pane gets its own grouped session
tmux new-session -t swarm -s mac-oc1   # Independent view 1
tmux new-session -t swarm -s mac-oc2   # Independent view 2
tmux new-session -t swarm -s mac-oc3   # Independent view 3
tmux new-session -t swarm -s mac-oc4   # Independent view 4

# Same windows, but each grouped session tracks its own active window
# Switch OC1 → OC2 in one pane, others don't move
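
The commands above always create a new session, which fails if the name is already taken. A slightly more forgiving variant, as a sketch: `new-session -A` attaches if the named view already exists, and a chained `select-window` points that view at one agent. The helper name `grouped_view` and the `TMUX_BIN` override are hypothetical.

```shell
#!/bin/sh
# TMUX_BIN can be overridden to preview the command instead of running it.
tmx() { "${TMUX_BIN:-tmux}" "$@"; }

# Attach to (or create) a grouped view of the shared session and focus one window.
#   $1 = view/session name, e.g. mac-oc1
#   $2 = window to focus,   e.g. OC1
grouped_view() {
    view="$1"
    window="$2"
    # -A: attach if the view already exists, otherwise create it
    # -t swarm: join swarm's session group (same windows, separate active window)
    # \; chains a second tmux command in the same invocation
    tmx new-session -A -s "$view" -t swarm \; select-window -t "${view}:${window}"
}

# Usage (on the server, one per pane):
#   grouped_view mac-oc1 OC1
#   grouped_view mac-oc3 OC3
```

Selecting a window inside `mac-oc1` changes only that view. The other grouped sessions keep whatever window they were showing.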

One Word to Connect

On my MacBook, I open a terminal and type:

swarm

A script fires. iTerm2 opens a new window. It splits into a 2×2 grid. Each pane SSHs into the server and attaches to its own grouped session. In about two seconds, I'm looking at four independent agents, each in its own quadrant.
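
The part of that script worth sketching is the command each pane runs, because that is where the quoting gets delicate. The sketch below only builds and prints those four commands. The iTerm2 window splitting is not shown, and the host alias `swarm-server` and the function names are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical host alias; replace with your server's Tailscale name or IP.
SWARM_HOST="${SWARM_HOST:-swarm-server}"

# Print the command that pane number $1 (1-4) should run.
pane_command() {
    n="$1"
    # The backslash before ';' must survive until the remote shell, so that
    # tmux (not a shell) receives the ';' as its command separator.
    remote="tmux new-session -A -s mac-oc${n} -t swarm \\; select-window -t mac-oc${n}:OC${n}"
    # ssh -t forces a TTY so tmux can draw its interface.
    # Single quotes keep the local shell from touching the remote command.
    printf "ssh -t %s '%s'\n" "$SWARM_HOST" "$remote"
}

# Print all four pane commands, one per line.
swarm_commands() {
    for n in 1 2 3 4; do
        pane_command "$n"
    done
}

# Usage:
#   swarm_commands      # inspect what each pane would run
```

Printing the commands first makes it easy to paste one into a terminal and confirm it attaches correctly before wiring all four into a window-splitting script.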

The View

  • OC1: Building features
  • OC2: Fixing bugs
  • OC3: Running tests
  • OC4: Researching APIs

Each one picks up exactly where it left off. Mid-conversation. Mid-task. Mid-thought. Nothing to re-explain. Nothing to reload. Close my laptop and come back three hours later—same state, same context, same progress.

Connecting from My Phone

The same system works from my iPhone using Blink Shell. A saved snippet connects me to the server with its own grouped session. I can review what the agents have been doing, send a quick command, check build output—all from my phone.

This sounds like a convenience. It's actually a superpower. Ideas don't wait for you to be at your desk. Being able to reach your agents from anywhere means the system truly runs continuously, not just during office hours.

Real scenario: I was at a restaurant when I remembered a critical edge case in a feature we were shipping. Pulled out my phone, opened Blink, attached to OC2, and told the agent to handle it. By the time dessert arrived, the fix was deployed. That doesn't happen with a laptop-bound workflow.

What Four Parallel Agents Actually Look Like

Here's a real workday scenario. I need to ship a feature for Tensor Solutions, fix a production bug in Mentor Agile, write test coverage for a shared library, and research an API integration for PARC Solutions. On a traditional setup, these are sequential. One at a time. Context switching between each.

With the swarm:

1. OC1 (Building): Scaffolding new components, writing business logic, iterating on the implementation
2. OC2 (Debugging): Reading logs, tracing the production issue, deploying the fix
3. OC3 (Testing): Analyzing existing code paths, generating test cases, running the suite
4. OC4 (Researching): Reading documentation, prototyping the integration, writing up findings

All four run simultaneously. I glance between quadrants, provide direction when needed, approve changes, and move on. What used to take a full day of sequential work compresses into a couple of hours of parallel execution.

Previewing Without a GUI

The server has no browser. No desktop. So how do you see what the agents build?

SSH tunneling. One command forwards a port from the server to your local machine:

ssh your-server -L 3000:localhost:3000

Now localhost:3000 in your Mac's browser shows the app running on the server. The dev server runs remotely. The browser runs locally. The SSH tunnel connects them invisibly. This is standard SSH local port forwarding, the same pattern engineering teams use for remote development environments at scale.
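
If you only want the tunnel and not a remote shell, a small helper keeps it tidy. This is a sketch under assumptions: `forward_port` is a hypothetical name and `your-server` is the placeholder host from the command above. `-N` (no remote command) and `-L` (local forward) are standard OpenSSH flags.

```shell
#!/bin/sh
# Placeholder host, matching the command above.
SWARM_HOST="${SWARM_HOST:-your-server}"

# Forward a local port to the same port on the server.
#   $1 = port number (defaults to 3000)
forward_port() {
    port="${1:-3000}"
    # Reject anything that is not a plain number.
    case "$port" in
        ''|*[!0-9]*)
            echo "usage: forward_port PORT" >&2
            return 2
            ;;
    esac
    # -N: open the tunnel without running a remote command
    # -L local_port:host:remote_port (host is resolved on the server side)
    "${SSH_BIN:-ssh}" -N -L "${port}:localhost:${port}" "$SWARM_HOST"
}

# Usage:
#   forward_port 3000     # then open http://localhost:3000 locally
```

The tunnel stays open until you interrupt it with Ctrl-C, so it fits well in a spare terminal tab.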

Why Headless Is an Advantage, Not a Limitation

Running without a desktop environment isn't a compromise. It's an optimization.

With Desktop Environment

  • 2–4 GB RAM consumed idle
  • Window manager, compositor running
  • Font renderers, clipboard services
  • More attack surface
  • More things that can break

Headless (What I Run)

  • Every byte goes to agents
  • Minimal services, minimal overhead
  • Smaller footprint, fewer vulnerabilities
  • More stable, fewer crashes
  • Production-grade by design

This is the same reason production servers almost universally run headless. The terminal isn't a limitation of the setup. It's the point.

The Persistence Advantage

The most underrated part of this architecture is persistence.

Local development is fragile. A laptop restart, an OS update, a power outage: any of these kills your session and your context. On the server, tmux sessions survive dropped connections, closed laptops, and flaky networks. Only a server reboot or killing the tmux server itself ends them. I've had sessions running continuously for weeks.

1. Close laptop
2. Agents keep running
3. Open laptop
4. Instant reconnect

When you close your laptop, your agents don't notice. They're not running on your laptop. They're running on a machine that's always on, always connected, always available. You're just a window into their world—not the world itself.
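
One way to see that persistence for yourself is to ask tmux what each window is running without attaching at all. A minimal sketch, assuming the session is named swarm as above. `swarm_status` and the `TMUX_BIN` override are hypothetical names, while `has-session`, `list-windows`, and the `#{window_name}` and `#{pane_current_command}` formats are standard tmux.

```shell
#!/bin/sh
# TMUX_BIN can be overridden to test the function without a running tmux.
tmx() { "${TMUX_BIN:-tmux}" "$@"; }

# Print one line per window: its name and the command running in it.
# Returns non-zero if the shared session no longer exists.
swarm_status() {
    if tmx has-session -t swarm 2>/dev/null; then
        tmx list-windows -t swarm -F '#{window_name}: #{pane_current_command}'
    else
        echo "swarm session not found (server rebooted or tmux server stopped?)"
        return 1
    fi
}

# Usage (on the server, or remotely via ssh):
#   swarm_status
```

If a window shows a shell instead of the agent, that agent has exited and needs restarting. Everything else is still exactly where you left it.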

The Mental Shift

The real change isn't technical. It's mental.

When you have four agents running in parallel on a server that never sleeps, you stop thinking about development as something you do and start thinking about it as something you direct. You become the architect, the reviewer, the decision-maker. The agents handle the implementation.

Before: “Let me write this component, then debug that API, then write those tests, then research that library...”

After: “OC1, build the component. OC2, debug the API. OC3, write the tests. OC4, research the library. I'll review in an hour.”

Your job shifts from writing code to making decisions. Which feature matters most? Which bug is blocking users? Which integration unlocks the next phase? You answer those questions and point the agents at the work. They execute while you think about what's next.

What Getting Here Required

I won't pretend this is a weekend project. Getting here meant:

  • Understanding SSH, tmux, and terminal multiplexing beyond the basics
  • Setting up Tailscale for encrypted private networking across every device
  • Writing automation scripts that wire up four SSH connections into a single command
  • Debugging the kind of quoting issues that make you question your career choices
  • Discovering grouped tmux sessions—a feature most tmux users have never heard of
  • Building teardown scripts that clean up gracefully without killing active work

Most of the time was spent not on the architecture itself, but on making the experience feel effortless. Anyone can SSH into a server. The art is making it so seamless that you forget there's a server at all.
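
The teardown idea from the list above can be sketched in a few lines: remove the per-device view sessions and leave the shared session alone. This is an illustration rather than my actual script. The `mac-oc` prefix follows the session names used earlier, and `teardown_views` and `TMUX_BIN` are hypothetical names.

```shell
#!/bin/sh
# TMUX_BIN can be overridden to test the function without a running tmux.
tmx() { "${TMUX_BIN:-tmux}" "$@"; }

# Kill only the view sessions whose names start with the given prefix.
#   $1 = session name prefix (defaults to mac-oc)
teardown_views() {
    prefix="${1:-mac-oc}"
    tmx list-sessions -F '#{session_name}' | while IFS= read -r name; do
        case "$name" in
            swarm)
                # Never touch the shared session that owns the agents' windows.
                ;;
            "$prefix"*)
                # Removing a grouped session drops only that view; the windows
                # stay alive because the swarm session still links to them.
                tmx kill-session -t "$name"
                ;;
        esac
    done
}

# Usage (on the server):
#   teardown_views            # remove mac-oc1..mac-oc4
#   teardown_views phone      # remove views created from the phone
```

Matching on a prefix rather than killing everything except swarm keeps views from other devices intact.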

The Point

Most developers are running AI coding tools the same way they ran text editors ten years ago: one window, one task, one machine, and everything resets when you close the lid.

The infrastructure to go beyond that exists today. Headless servers. Persistent sessions. Encrypted mesh networking. Parallel agents with independent views. None of this is theoretical. None of it requires special hardware. It requires the willingness to set it up and the patience to get the details right.

The takeaway

Your AI agents should work harder than you do. They should run while you sleep, persist while you travel, and be reachable from whatever device is in your hand. The tools exist. The patterns are proven. The only question is whether you'll build the infrastructure that lets them run at full speed—or keep closing your laptop and starting over tomorrow.