32,000 lines of code in 60 days: notes from building beam.page

by mk · beam.page · May 2026

Beam.page is a static-hosting platform whose front door is an MCP server, with a thin web UI alongside for the things that benefit from one. You talk to whichever AI client you already use — Claude, ChatGPT, Codex, Cursor, Gemini CLI — and it calls the platform's tools; your site lands at <slug>.beam.page or a custom domain. You can change anything later by asking, through any of the same clients, on the go. Here is the story of building it over the last sixty days.

The pipeline

The pipeline that built it was loosely modelled on what Anthropic published about agentic software development, with modifications I made along the way. It runs: spec, build, QA (given the diff), deploy to a UAT environment, integration tests, commit, prod release. Each round takes a couple of hours. I have done a hundred and fifty-four of them.
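As a rough sketch of that sequence, and not the actual pipeline: the stage names below follow the order listed above, while `run_round`, `RoundResult`, and the idea that each stage is a callable returning True or False are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Stage order as described in the text; the identifiers are illustrative.
STAGES = [
    "spec",
    "build",
    "qa",
    "deploy_uat",
    "integration_tests",
    "commit",
    "prod_release",
]


@dataclass
class RoundResult:
    completed: list[str] = field(default_factory=list)
    failed_at: Optional[str] = None


def run_round(handlers: dict[str, Callable[[], bool]]) -> RoundResult:
    """Run each stage in order and stop at the first one that reports failure."""
    result = RoundResult()
    for stage in STAGES:
        if not handlers[stage]():
            result.failed_at = stage
            break
        result.completed.append(stage)
    return result
```

Stopping at the first failed stage means a round that fails its integration tests never reaches commit or prod release.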

I did not want to start from zero. I wanted a clear architecture for the pipeline to follow. So before there was a pipeline, I sat with an LLM and we sketched the skeleton — bare-bones frontend and backend, the whole shape of it. Just enough for it to fill in, not so much that there was nothing left for it to do. The trick is one we used to play on the early models: show them a few examples of how the JSON should look, and only then would they produce JSON of their own.

Goal first, not prescription. The starting point of any run is the goal — what I want to be true at the end. I tried being more prescriptive early on, spelling out which files to touch and how, and it kept getting in the way of what came after. What did vary, usefully, was how much code each stage saw. Some stages do better with the whole codebase in front of them. Others do better when handed a narrow brief and the freedom to make a quick edit.

The integration tests are another job for a model. They produce a stream of inputs and outputs — every request the test made, every response it got back. A fresh invocation of the LLM reads the dump the way I would walk a Postman collection, request by request, and flags anything that does not look right. It tells me what it found. I decide whether I agree. Sometimes we argue.

Git was not enough memory. Each round leaves a trail of files behind — what was done, what wasn't, and when. The pipeline reads them at the start of the next round so it knows where it is. Git captures part of the history but not all of it: what was tried, what was abandoned, what the spec said before any code was written. The pipeline keeps its own spec reports and build reports alongside the code.
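One way such a trail could be read back at the start of a round is sketched below. The file naming (`round-N.json`) and the `round`, `done`, and `abandoned` fields are hypothetical; the post does not say how the reports are stored.

```python
import json
from pathlib import Path


def load_round_reports(report_dir: Path, limit: int = 5) -> list[dict]:
    """Return the most recent `limit` round reports, oldest first."""
    reports = [
        json.loads(path.read_text(encoding="utf-8"))
        for path in report_dir.glob("round-*.json")
    ]
    # Sort on the round number stored in the report, not on the file name,
    # so that round 10 sorts after round 9.
    reports.sort(key=lambda r: r["round"])
    return reports[-limit:]


def summarise(reports: list[dict]) -> str:
    """Build a short text summary that could be handed to the next round."""
    return "\n".join(
        f"round {r['round']}: done={r['done']} abandoned={r['abandoned']}"
        for r in reports
    )
```

The point of the `abandoned` field in this sketch is the gap named above: git records what was committed, not what was tried and dropped.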

Security plays by different rules now. I have a software engineer's grounding in security: enough for the basics, not enough to call myself an expert. The landscape has changed. A fair number of attacks now come from models that probe code faster than any human can. So I do the same on the defending side: at intervals I take the entire codebase and feed it to whichever frontier model is best that month, with one instruction. The pipeline fixes what the model finds. I keep looking for more.

Numbers. 26,809 lines of Python; 5,175 lines of TypeScript and React; a single 3,507-line integration test file; 3,049 lines of markdown spec across half a dozen design documents. On AWS: 50 Lambda functions, 17 DynamoDB tables, 3 CloudFront distributions. The dead-letter queue has zero messages in it. I have written approximately none of it.

On the code

When I look at the code now, I see two questions sitting on top of each other. One is how the whole project is organised — how files relate, what can be handed over as a coherent chunk. The other is the inside of each file — how it's written, where the comments are, what the model reads when it opens it. Both bend a bit when the next reader is a model and not a person.

Structured for the handoff. The code is organised so the pipeline can pass chunks of varying size — a whole module, a single file, a related set, the entire codebase. I think a lot about how to make those chunks coherent. I don't have a settled answer on the right level of granularity; it varies by task.

Written for the model to read. My old instinct was that less code is always better. That is still true for human readers. But the next reader of every file is now the model, and what makes a file useful to it isn't quite the same as what makes it tidy for a person: clearer structure, comments that say why, predictable shapes the model can grab onto. Smaller-or-larger isn't the right axis any more. Useful-to-the-model is.

Architecture, also for the model. The same logic applies one level up. The shape of the system — what services exist, how alerts flow, where logs land — bends toward what an agent can act on. The clearest example is the second pipeline beam runs after shipping. It checks every page uploaded for things that shouldn't be there using a mix of deterministic and non-deterministic methods, and watches the infrastructure for jobs that fired when they shouldn't have or didn't fire when they should. Everything funnels into a Slack channel — and an agent reads the channel and tells me what's worth knowing about.

On the API

The API is designed to be read by an agent, not by a human. That is easy to miss, but the shape of an API matters more when an agent is reading it.

Slim docs, chatty responses. There is no fat manual to read before starting. Every response from the API and the MCP server is chatty in a way that would feel rude on a normal API. _comment fields explain what just happened. brief.next arrays spell out the next moves as REST calls, and brief.nextMcp arrays spell out the same as MCP tool calls. A capabilities block on every project response says what beam can do and, importantly, what it cannot.

A response looks something like this:

{
  "brief": {
    "summary": "Viewing project hello-beam with 10 pages and 9 assets.",
    "next": [
      "POST /projects/hello-beam/edit {...}",
      "GET /projects/hello-beam/pages/_root"
    ],
    "nextMcp": [
      "edit(projectId='hello-beam', page='_root', edits=[...])",
      "page(op='get', projectId='hello-beam', slug='_root')"
    ]
  }
}

The agent reads it the way you might read a sign at a junction. It doesn't have to have memorised the road.
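A minimal sketch of how a client might read those hints, using the sample response above. The `brief`, `next`, and `nextMcp` keys come from the post; `next_moves` and `split_rest_hint` are hypothetical helpers, not part of beam's API.

```python
import json

SAMPLE = """
{
  "brief": {
    "summary": "Viewing project hello-beam with 10 pages and 9 assets.",
    "next": [
      "POST /projects/hello-beam/edit {...}",
      "GET /projects/hello-beam/pages/_root"
    ],
    "nextMcp": [
      "edit(projectId='hello-beam', page='_root', edits=[...])",
      "page(op='get', projectId='hello-beam', slug='_root')"
    ]
  }
}
"""


def next_moves(response: dict, use_mcp: bool) -> list[str]:
    """Return the suggested next steps for the transport the client is using."""
    brief = response.get("brief", {})
    key = "nextMcp" if use_mcp else "next"
    return list(brief.get(key, []))


def split_rest_hint(hint: str) -> tuple[str, str]:
    """Split a hint like 'GET /path' into (method, path), dropping any body text."""
    method, rest = hint.split(" ", 1)
    path = rest.split(" ", 1)[0]
    return method, path


response = json.loads(SAMPLE)
for hint in next_moves(response, use_mcp=False):
    print(split_rest_hint(hint))
```

A response with no `brief` block yields an empty list, so a client written this way degrades to having no hints instead of failing.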

Why I built it

My sister has a website on a server somewhere, and a habit of asking me to just change something on it. Then just add a section. Then just do one more thing. For her, each ask is a small change on a screen. For me it was a couple of hours per round-trip, ending with a different ask before the first one was finished.

You are using the word "just" WRONG. Or rather: you are using it the way everyone uses it, including me, when we do not want to think too hard about what something will take. "Just" compresses hours into a syllable. We let it because the alternative is admitting how long things actually take, and nobody really wants to do that.

Beam.page is the answer to "just". You ask in plain English, in whichever AI client you already use. The AI talks to beam. The website appears. You ask for changes and the changes happen. The asking is the doing, or close enough to feel like it.

A few things weren't obvious from the start.

Editing wasn't there at first. The first version of beam could only re-upload whole files. A surgical-edit endpoint that takes find-and-replace pairs fixed that. Now you ask for a typo fix and the typo gets fixed.
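The idea of find-and-replace pairs can be sketched as below. This is an illustration, not beam's endpoint: `apply_edits` is a hypothetical name, and the rule that each "find" string must match exactly once is an assumption chosen to keep edits unambiguous.

```python
def apply_edits(text: str, edits: list[tuple[str, str]]) -> str:
    """Apply (find, replace) pairs in order, requiring exactly one match each."""
    for find, replace in edits:
        if not find:
            raise ValueError("find string must not be empty")
        count = text.count(find)
        if count != 1:
            raise ValueError(
                f"expected exactly one match for {find!r}, found {count}"
            )
        text = text.replace(find, replace, 1)
    return text


page = "<h1>Welcom</h1><p>Hello</p>"
print(apply_edits(page, [("Welcom<", "Welcome<")]))
```

Sending a short pair like this costs a few dozen characters, where re-uploading the whole file costs the whole file.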

Images broke things. Base64 of even a small photo eats the context window before anything useful can happen. Web chat clients felt it hardest; CLI clients had more room. A drop zone fixed it: the LLM mints a short-lived URL, the user uploads from any device, and the LLM gets back the URL plus an optional thumbnail. Full bytes stay where they are.
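The post does not say how the short-lived URL is minted. One common approach, sketched here purely as an assumption, is an HMAC-signed token that carries its own expiry. `SECRET`, `mint_upload_token`, and `verify_upload_token` are hypothetical names.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical key; a real service would load this from a secret store.
SECRET = b"demo-secret"


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def mint_upload_token(upload_id: str, ttl_seconds: int,
                      now: Optional[float] = None) -> str:
    """Return 'upload_id.expires.signature', valid for ttl_seconds."""
    start = time.time() if now is None else now
    expires = int(start + ttl_seconds)
    payload = f"{upload_id}.{expires}"
    return f"{payload}.{_sign(payload)}"


def verify_upload_token(token: str, now: Optional[float] = None) -> bool:
    """Check the signature first, then the expiry time."""
    try:
        upload_id, expires_str, sig = token.rsplit(".", 2)
        expires = int(expires_str)
    except ValueError:
        return False
    expected = _sign(f"{upload_id}.{expires}")
    # Constant-time comparison on bytes avoids leaking how much of the
    # signature matched.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    current = time.time() if now is None else now
    return current < expires
```

The token would be appended to an upload URL; only the URL and an optional thumbnail go back to the model, so the image bytes never enter the context window.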

Snapshots, for when the LLM gets it wrong. Every project keeps point-in-time snapshots. When a round goes the wrong way, you roll back to the previous one.
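In miniature, and only as a sketch: the `Project` class below is hypothetical and keeps snapshots in memory, whereas the real platform presumably persists them. It shows the shape of the operation, which is to copy the state before a change and restore the copy when the change goes wrong.

```python
class Project:
    """Toy model of a project whose pages map slug to HTML."""

    def __init__(self, pages: dict[str, str]):
        self.pages = dict(pages)
        self._snapshots: list[dict[str, str]] = []

    def snapshot(self) -> int:
        """Store a copy of the current pages and return the snapshot's index."""
        self._snapshots.append(dict(self.pages))
        return len(self._snapshots) - 1

    def rollback(self) -> None:
        """Restore the most recent snapshot."""
        if not self._snapshots:
            raise LookupError("no snapshot to roll back to")
        self.pages = dict(self._snapshots[-1])


project = Project({"_root": "<h1>v1</h1>"})
project.snapshot()
project.pages["_root"] = "<h1>broken</h1>"
project.rollback()
print(project.pages["_root"])
```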

A way in for apps. When beam.page itself wanted a small React app glued in, the static-only model bent. An /app/ convention lets a project reserve a slug to serve an SPA from.

Beam is a delight. I should probably not say so, being the one who built it; I will say it anyway. I publish lolasquared.com on it. I publish beam.page on it. (Also lola-the-cat.com, my cat's website, where I test new functionality.) When a copy-change idea hits at the wrong moment — about seventy-five thousand times a day — I open my phone, ask whichever AI client has the MCP connected, and the change appears. It is wonderful. I am as surprised as anyone.

What I think

We are in an exciting moment with this technology, and I am not sure anyone fully knows how to use it yet. I don't have a settled view on how, and I am suspicious of anyone who claims they do.

The gain for non-technical users is large. The power to make is huge, and — for me at least — addictive. The tiring-but-rewarding work of writing the code can be outsourced to a pipeline that runs for hours while you do other things. What you have left is the thing you actually wanted to think about. What you want, how you want it, what you need.

Bridging the gap is the interesting part. Some of the tools I have seen over the last few months feel simultaneously too much and too little. Too much because the gap they try to bridge is so wide that what comes out the other side does not quite reach the level a technical person would want. Too little because the environment is not there for anything genuinely complex. Beam.page tries to land somewhere in the middle. Whether it has, you can tell me.

There are heuristics floating around — give the model a place to start, validate with a fresh look from the LLM, keep the human in the loop where the human is good. None of them is a recipe. This isn't one either.

The website is up. My sister has, finally, stopped asking. Lola the cat has developed a taste for fame and wants to know when she gets an Instagram.

Try it

Go to beam.page for setup instructions, or wire your AI client directly:

MCP server: https://api.beam.page/mcp
REST API: https://api.beam.page
LLM instructions: https://www.beam.page/llm.txt