I Hate AI, so I Built a Compiler

I burned through 1.5 billion tokens this month. Let me put that in perspective for you. That's roughly $2,000. Enough to train a small model. Enough to generate an entire library. Enough to build something that could theoretically change the world.

And I shipped zero code. Well, actually, I ended up on the negative side.

So, here is what my month looked like. Ever spent 12 million tokens only to realize the AI missed a comma? Well, it happened to me. I spent a month generating, iterating, deleting, and regenerating. I had complete architectures that I threw away. I had systems that I couldn't trust. I had ideas that I couldn't execute. I was like a chef who kept cooking and throwing the food away because the plate wasn't arranged correctly.

On Friday afternoon, after realizing the codebase was crap and taking down 65,000 lines of code, I decided I was done. So, this weekend… I built Ampersand.

### The Problem With Current AI-Assisted Development

The current AI ecosystem is broken. We're burning billions of tokens on verbose, ambiguous languages when we could be using something efficient, deterministic, and built for the hardware we actually have.

Most languages today are:

* Token-inefficient: Too much syntax for too little meaning. You need 50 tokens to say something that should take 5.
* Ambiguous: Multiple ways to do the same thing. The AI has to guess which one you meant. It usually guesses wrong.
* Memory-unaware: Dynamic allocation. Garbage collection. Hidden costs. The hardware is screaming "I'M RIGHT HERE" and the language is like "nah, let me abstract that away."
* Turing-complete by default: Infinite loops. Unbounded execution. Impossible to verify. We accept this as normal, but imagine if your car had a feature that let it drive forever in a circle with no way to stop. That's programming languages.

### The Solution: Ampersand

Ampersand is a deterministic dataflow language with 7 enforced memory laws. It's designed to be:

* Token-efficient: Graph-based. Minimal syntax.
You describe data flow, not control flow. The AI just connects dots instead of writing paragraphs.
* AI-friendly: Predictable. Bounded. Explicit. The AI doesn't need to guess what you meant because there's only one way to say it.
* Hardware-aware: Memory is pre-allocated. No hidden costs. The compiler knows exactly how much memory your program will use before it runs. Imagine that.

Sounds good to me.

### The Architecture (Or, The Part Where I Sound Like I Know What I'm Talking About)

The Ampersand compiler follows a classic pipeline design:

Source Code (.adg) → Lexer → Parser → IR Graph → Memory Arena → Execution

Each stage does exactly one thing and does it well. No magic. No hidden complexity. Just pure, deterministic data transformation.

### The 7 Memory Laws (I Made These Up But They Work)

* BOUNDS: Every memory access verifies offset + size ≤ region capacity. You can't read or write outside your allocated space. Revolutionary.
* WORM: Immutables are written exactly once (UNINITIALIZED → INITIALIZED). After that, they're locked. Any write attempt triggers a WORM violation. Yes, I named it after the computer worm. Yes, it's terrifying.
* STATE: Every node has defined initialization. You can't use something before it's ready. Basic, but you'd be surprised how many languages get this wrong.
* PORT: Multi-input ops require exact arity. No variable arguments. No optional parameters. You get exactly what you asked for, and you will like it.
* OVERFLOW: Division/modulo by zero is caught. I didn't want to, but the compiler forced me to. Actually, I wanted to. It's the right thing to do.
* TERMINATION: Lifecycle must end via explicit end node. No infinite loops. No hanging processes. Your program will stop. I promise.
* ARENA: Single OS reservation, no heap after init. Everything is pre-allocated. No malloc. No new. No memory fragmentation. Just one big, contiguous block of memory, reserved once up front.
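
To make a few of these laws concrete, here is a minimal sketch of how BOUNDS, WORM, STATE, and OVERFLOW checks could look in C++. This is not Ampersand's actual source: `Region`, `ImmutableSlot`, and `checked_div` are hypothetical names invented for illustration, and a `std::vector` stands in for the pre-allocated arena, which the real design would not use after init.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical illustration only; none of these names come from Ampersand.
class Region {
public:
    // std::vector is a stand-in for a slice of the pre-allocated arena.
    explicit Region(std::size_t capacity) : bytes_(capacity, 0) {}

    // BOUNDS: offset + size must not exceed capacity.
    // Phrased as a subtraction so offset + size cannot wrap around.
    void check_bounds(std::size_t offset, std::size_t size) const {
        if (offset > bytes_.size() || size > bytes_.size() - offset)
            throw std::out_of_range("BOUNDS violation");
    }

    void write_i64(std::size_t offset, std::int64_t value) {
        check_bounds(offset, sizeof value);
        std::memcpy(bytes_.data() + offset, &value, sizeof value);
    }

    std::int64_t read_i64(std::size_t offset) const {
        check_bounds(offset, sizeof(std::int64_t));
        std::int64_t value;
        std::memcpy(&value, bytes_.data() + offset, sizeof value);
        return value;
    }

private:
    std::vector<std::uint8_t> bytes_;
};

// WORM + STATE: one write moves the slot from uninitialized to initialized.
class ImmutableSlot {
public:
    void write(std::int64_t value) {
        if (initialized_) throw std::logic_error("WORM violation");
        value_ = value;
        initialized_ = true;
    }

    std::int64_t read() const {
        if (!initialized_) throw std::logic_error("STATE violation");
        return value_;
    }

private:
    std::int64_t value_ = 0;
    bool initialized_ = false;
};

// OVERFLOW: division by zero is rejected before it reaches the hardware.
inline std::int64_t checked_div(std::int64_t a, std::int64_t b) {
    if (b == 0) throw std::domain_error("OVERFLOW violation");
    return a / b;
}
```

The sketch throws exceptions so each violation is easy to see in isolation. A runtime that crashes immediately on a violation, as described for the function region, could just as well abort instead.
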
### The Memory Model (Or, The Part That Actually Impresses Me)

Ampersand skips the general-purpose heap allocator and asks the OS directly for one monolithic block of memory:

```cpp
#ifdef _WIN32
    // VirtualAlloc returns NULL on failure.
    base = (uint8_t*)VirtualAlloc(nullptr, total, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
#else
    // mmap returns MAP_FAILED (not nullptr) on failure.
    base = (uint8_t*)mmap(nullptr, total, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
#endif
```

This is the part that makes me feel like a hacker. I'm literally telling the OS "give me this memory and don't touch it." The OS is like "okay" and I'm like "good."

The regions:

* Immutable Region (3%): WORM (Write Once, Read Many) constants. These are your configuration values, your loop boundaries, your magic numbers. They're written once and then locked. Forever.
* Variable Region (70%): Transient data that resets every lifecycle. These are your loop counters, your temporary values, your "things that change." They exist for one cycle and then they're gone. No leftovers.
* Function Region (27%): Execution context with hard memory limits. Each function gets a specific amount of memory. If it exceeds it, the program crashes immediately. No silent corruption.

### Example Program: Counting to 100

```
graph counter;
entry start;
result end { end: true };
immutable limit { value: 100 };
variable current;
function add { op: add };
edge start -> current;
edge current -> add;
edge limit -> add;
edge add -> current;
edge current -> end;
edge limit -> end;
```

This program demonstrates the beauty of dataflow programming:

* Immutables provide constant boundaries (`limit = 100`). They never change. They're written once and then they're locked. If you try to modify them, the program crashes. This is a feature, not a bug.
* Variables track mutable state (`current`). They exist for one cycle and then they're reset. No stale values. No initialization issues.
* Functions perform operations (`add`). They're pure. They don't mutate anything. They just transform data.
* Edges define data flow.
The program is literally just a graph. You can see the data moving from node to node. No hidden control flow. No magic.
* Entry/Result nodes handle execution and termination. The program starts at entry and ends at result. That's it. That's the whole story.

### The 48-Hour Build (Where Everything Went Wrong and Then Right)

What I built:

* Complete lexer for .adg files
* Parser building IR graph
* Memory arena with OS-level allocation
* Immutable dictionary with WORM enforcement
* Lifecycle engine with cycle-based execution
* Topological sort for dependency resolution
* 7 memory laws enforced at runtime
* Cross-platform support (Windows/Linux)
* JSON and trace output modes

What I learned:

* C++ memory management is brutal but rewarding: I spent two hours debugging what I thought was a memory leak, only to find out I had forgotten to zero-initialize a struct. The compiler didn't catch it. The runtime didn't catch it. I caught it because I stared at the code for so long I started hallucinating.
* Compiler design requires extreme discipline: You can't just "make it work." You have to make it work correctly. Every edge case matters. Every boundary condition matters. The difference between a compiler and a script is that a compiler can't afford to be wrong.
* AI is a tool, not a replacement for understanding: I spent 1.5 billion tokens generating code I didn't understand. When I finally sat down and wrote the compiler myself, I learned more in 48 hours than I did in the previous month. The AI can write code, but it can't build systems. That's still my job.
* Building from scratch teaches more than any tutorial: I've been reading about compilers for years. I understood the theory. But until I actually built one, I didn't understand the practice. The difference between theory and practice is that in theory, there is no difference. In practice, there is.

### The Code (Or, The Part Where I Admit It's a Mess)

The compiler is written in ~2,000 lines of C++. It's not production-ready. The README is a disaster.
The documentation is non-existent. The examples are probably broken. But it exists. And that's more than I had on Friday morning.

I learned C++ from scratch to write this. I'd never written a line of production C++ before Friday. By Sunday night, I'd written a compiler. Did I do it correctly? Probably not. Is it working? Yes. Do I understand why? Not entirely.

I'm not going to pretend this code is beautiful or perfect or even good. It's functional. The parser works. The lexer works. The memory arena works. The lifecycle engine works. Everything works. It just might not be pretty.

### The Stats

* Total code: ~2,000 lines of C++
* Total build time: 48 hours
* Total sleep: None
* Total caffeine: Enough to kill a small horse
* Total AI tokens burned before starting: 1.5 billion
* Total AI tokens burned during build: Zero (if we don't count the UI for my mini-IDE, if it even qualifies as one)
* Total times I said "I have no idea what I'm doing": 47
* Total times I was right: 47

### Why This Matters

Ampersand represents a different approach to programming. It's not a "better" language. It's a different kind of language.

* Memory is explicit: You know exactly where your data lives and how much space it takes. No hidden allocations. No GC pauses. No surprises.
* Execution is bounded: Your program will terminate. It has to. The language won't let you write infinite loops. This makes verification possible.
* Immutability is enforced: Once a value is set, it can't change. This eliminates a whole class of bugs. No race conditions. No unexpected mutations. No "where did this value come from?"
* Dataflow is visual: The program is a graph. You can see the data flowing from node to node. This makes it easier to understand and easier to debug.

### For AI-Assisted Development

This is where Ampersand really shines:

* Token efficiency: The AI uses fewer tokens to express the same logic. This means faster iteration and lower costs.
* Deterministic behavior: The AI doesn't need to guess what you meant.
There's only one way to do things. This means fewer hallucinations and more correct code.
* Bounded execution: The AI doesn't need to reason about termination. The language handles it. This means simpler prompts and more reliable outputs.
* Hardware-aware: The AI can optimize for actual memory constraints. This means better performance and more efficient use of resources.

### What's Next

v1.0 Roadmap:

* Documentation that actually exists
* Working examples
* Performance benchmarks
* Self-hosting (bootstrap)
* Sleep (for the developer)

For the language:

* String operations (library nodes)
* More operators
* Visual editor
* Standard library
* Community adoption

### Get Involved

* Repo: github.com/jasonkerama-dev/Ampersand
* License: MIT
* Contributions: Bug reports, README improvements, and feature requests welcome

I'm serious about the README improvements. I literally don't know how to write it. I spent 48 hours writing a compiler and now I can't write a simple README. Help.

### The Confession (Where I Tell You the Truth)

I did not, in fact, know what I was doing. I just started typing and praying. The compiler works anyway. I don't know how.

There were moments when I genuinely believed I was having a stroke. The code was compiling. My brain was not. I stared at segmentation faults like they were ancient runes. I debugged memory leaks by sprinkling print statements everywhere and hoping.

At 2 AM on Sunday, the compiler finally compiled a program. I cried. Not because I was happy, but because I was so exhausted that any emotion would have caused a breakdown.

The markdown file is a disaster. The documentation is non-existent. The examples are probably broken. I pushed it to GitHub at 2 AM with a headache and 48 hours of sleep debt.

But it works. And that's more than I had on Friday morning.
### The Post-Script (Where I Do the Smart Thing)

If I don't respond to your messages tomorrow, I am either:

* a) In a pseudo-comatose state from sleep deprivation
* b) Having a really bad day from sleep deprivation
* c) Both

Please don't take it personally. I just spent 48 hours building a compiler, learning C++ from scratch, and burning whatever was left of my sanity. I will respond when I can see straight again.

:::tip
* P.S. To everyone who said I should learn C++ before building a compiler: I did. It just took 48 hours.
* P.P.S. To everyone who said I shouldn't use AI to generate 1.5 billion tokens: you were right. I didn't listen. I'm not sorry.
:::

View original source — Hacker Noon ↗
