How I run multiple $10K MRR companies on a $20/month tech stack

Last night, I was rejected from yet another pitch night. It was just the pre-interview, and the problem wasn't my product. I already have MRR. I already have users who depend on it every day.

The feedback was simply: "What do you even need funding for? You're doing pretty well without it."

I hear this time and time again when I try to grow my ideas. Running lean is in my DNA. I've built tools you might have used, and niche products you probably haven't. That obsession with efficiency leads to successful bootstrapping, and honestly, a lot of VCs hate that.

Keeping costs near zero gives you the exact same runway as getting a million dollars in funding with a massive burn rate. It's less stressful, it keeps your architecture incredibly simple, and it gives you adequate time to find product-market fit without the pressure of a board breathing down your neck.

If you are tired of the modern "Enterprise" boilerplate, here is the exact playbook of how I build my companies to run on nearly nothing.

Use a lean server

The naive way to launch a web app in 2026 is to fire up AWS, provision an EKS cluster, set up an RDS instance, configure a NAT Gateway, and accidentally spend $300 a month before a single user has even looked at your landing page.

The smart way is to rent a single Virtual Private Server (VPS). First thing I do is get a cheap, reliable box. Forget AWS. You aren't going to need it, and their control panel is a labyrinth designed to extract billing upgrades. I use Linode or DigitalOcean. Pay no more than $5 to $10 a month.

1GB of RAM sounds terrifying to modern web developers, but it is plenty if you know what you are doing. If you need a little breathing room, just use a swapfile.

The goal is to serve requests, not to maintain infrastructure. When you have one server, you know exactly where the logs are, exactly why it crashed, and exactly how to restart it.

Use a lean language

Now you have constraints. You only have a gigabyte of memory. You could run Python or Ruby as your main backend language—but why would you? You'll spend half your RAM just booting the interpreter and managing `gunicorn` workers.

I write my backends in Go. Go is far more performant than Python or Ruby for typical web workloads, it's statically typed, and (crucially for 2026) it is incredibly easy for LLMs to reason about. But the real magic of Go is the deployment process. There is no `pip install` dependency hell. There is no virtual environment. You compile your entire application into a single, statically linked binary on your laptop, `scp` it to your $5 server, and run it.
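
To make the single-binary idea concrete, here is a minimal sketch of a standard-library-only web service. The `/healthz` route, the `serve` argument, port 8080, and the host name in the build comment are all assumptions made up for this example, not details of the author's setup.

```go
// main.go: a minimal web service that compiles to one binary.
//
// Build for a Linux server from a laptop, then copy it over
// (user@example-host and /opt/app are hypothetical):
//
//	CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o app .
//	scp app user@example-host:/opt/app/app
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"os"
)

// newMux wires up the routes. /healthz is a hypothetical endpoint.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return mux
}

// probe sends one in-memory GET request to h and returns the
// status code and response body, without opening a socket.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	if len(os.Args) > 1 && os.Args[1] == "serve" {
		// ":8080" is an arbitrary port chosen for the example.
		log.Fatal(http.ListenAndServe(":8080", newMux()))
	}
	// Without "serve", run a quick self-check and exit.
	code, body := probe(newMux(), "/healthz")
	fmt.Printf("%d %s", code, body)
}
```

Setting `CGO_ENABLED=0` keeps the build free of C library dependencies, which is what usually makes the resulting binary fully static. Running `./app serve` on the server starts listening; running it with no arguments only performs the self-check.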

Use Local AI for long-running tasks

If you have a graphics card sitting somewhere in your house, you already have unlimited AI credits.

When I was building eh-trade.ca, I had a specific problem: I needed to perform deep, qualitative stock market research on thousands of companies, summarizing massive quarterly reports. The naive solution is to throw all of this at the OpenAI API. I could have paid hundreds of dollars in API credits, only to find a logic bug in my prompt loop that required me to run the whole batch over again.

Instead, I'm running vLLM on a dusty $900 graphics card (an RTX 3090 with 24GB of VRAM) I bought off Facebook Marketplace. It's an upfront investment, sure, but I never have to pay a toll to an AI provider for batch processing again.

For local AI, you have a distinct upgrade path:

Start with Ollama. It sets up in one command (`ollama run qwen3:32b`) and lets you try out dozens of models instantly. It's the perfect environment for iterating on prompts.

Move to vLLM for production. Once you have a system that works, Ollama becomes a bottleneck for concurrent requests. vLLM locks your GPU to one model, but it is drastically faster because it uses PagedAttention. Structure your system so you send 8 or 16 async requests simultaneously. vLLM will batch them together in GPU memory, and all 16 will finish in roughly the same time it takes to process one.
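
The "send 8 or 16 requests at once" pattern might look like the sketch below. It assumes vLLM is serving its OpenAI-compatible API at `http://localhost:8000`; the model name `my-model` and the prompts are placeholders, and `chatOnce` and `fanOut` are helper names invented for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"sync"
)

type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

type chatResponse struct {
	Choices []struct {
		Message chatMessage `json:"message"`
	} `json:"choices"`
}

// chatOnce posts one prompt to an OpenAI-compatible
// /v1/chat/completions endpoint and returns the reply text.
func chatOnce(baseURL, model, prompt string) (string, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return "", err
	}
	resp, err := http.Post(baseURL+"/v1/chat/completions", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	var out chatResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	if len(out.Choices) == 0 {
		return "", fmt.Errorf("response had no choices")
	}
	return out.Choices[0].Message.Content, nil
}

// fanOut runs call for every prompt with at most limit calls in flight.
// Results and errors come back in the same order as prompts.
func fanOut(prompts []string, limit int, call func(string) (string, error)) ([]string, []error) {
	if limit < 1 {
		limit = 1
	}
	results := make([]string, len(prompts))
	errs := make([]error, len(prompts))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, p := range prompts {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit calls are already running
		go func(i int, p string) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i], errs[i] = call(p)
		}(i, p)
	}
	wg.Wait()
	return results, errs
}

func main() {
	// Placeholder prompts, endpoint, and model name.
	prompts := []string{"Summarize report A", "Summarize report B"}
	results, errs := fanOut(prompts, 16, func(p string) (string, error) {
		return chatOnce("http://localhost:8000", "my-model", p)
	})
	for i := range prompts {
		if errs[i] != nil {
			fmt.Println("error:", errs[i])
			continue
		}
		fmt.Println(results[i])
	}
}
```

The client only has to keep several requests open at the same time; the batching itself happens inside the server. Taking the per-prompt call as a function argument also lets you swap in a fake while debugging the loop, so a logic bug does not cost a full GPU run.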

Use OpenRouter for your Fast/Smart LLM

You can't do everything locally. Sometimes you need the absolute cutting-edge reasoning of Claude 3.5 Sonnet or GPT-4o for user-facing, low-latency chat interactions.

Instead of juggling billing accounts, API keys, and rate limits for Anthropic, Google, and OpenAI, I just use OpenRouter. You write one OpenAI-compatible integration in your code, and you instantly get access to every major frontier model. More importantly, it allows for seamless fallback routing. If Anthropic's API goes down on a Tuesday afternoon (which happens), my app automatically falls back to an equivalent OpenAI model. My users never see an error screen, and I don't have to write complex retry logic.
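
As a rough illustration of fallback routing, the sketch below builds a request that lists models in priority order. It assumes OpenRouter's chat completions endpoint accepts a `models` array for this purpose, which is how its model-routing documentation describes it; check the current docs before relying on the field. The model slugs, the `OPENROUTER_API_KEY` variable name, and the helper names are choices made for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// routedRequest lists candidate models in priority order.
// Assumption: the "models" field triggers provider-side fallback.
type routedRequest struct {
	Models   []string  `json:"models"`
	Messages []message `json:"messages"`
}

// buildPayload encodes one user prompt plus the ordered fallback list.
func buildPayload(prompt string, models []string) ([]byte, error) {
	return json.Marshal(routedRequest{
		Models:   models,
		Messages: []message{{Role: "user", Content: prompt}},
	})
}

// newRequest prepares the authenticated HTTP call; it does not send it.
func newRequest(apiKey string, payload []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost,
		"https://openrouter.ai/api/v1/chat/completions", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	payload, err := buildPayload("Hello", []string{
		"anthropic/claude-3.5-sonnet", // tried first
		"openai/gpt-4o",               // used if the first is unavailable
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	key := os.Getenv("OPENROUTER_API_KEY")
	if key == "" {
		// No key configured: show the payload instead of calling the API.
		fmt.Println("dry run, payload:", string(payload))
		return
	}
	req, err := newRequest(key, payload)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("call failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

Because the fallback order lives in the request body, the application code stays a single HTTP call with no provider-specific branches.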

Use SQLite for everything

I always start a new venture using `sqlite3` as the main database. Hear me out, this is not as insane as you think. The enterprise mindset dictates that you need an out-of-process database server. But the truth is, a local SQLite file queried through in-process C function calls skips the TCP network hop to a remote Postgres server entirely, which can make small queries orders of magnitude faster.

"But what about concurrency?" you ask. Many people think SQLite locks the whole database on every write. That is only true of the default rollback-journal mode. You just need to turn on Write-Ahead Logging (WAL). Execute this pragma once when you open the database: `PRAGMA journal_mode=WAL;`. Boom. Readers no longer block writers. Writers no longer block readers. Writes are still applied one at a time, but they are quick enough that you can easily handle thousands of concurrent users off a single `.db` file on an NVMe drive.

Conclusion

The tech industry wants you to believe that building a real business requires complex orchestration, massive monthly AWS bills, and millions in venture capital. It doesn't.

By utilizing a single VPS, statically compiled binaries, local GPU hardware for batch AI tasks, and the raw speed of SQLite, you can bootstrap a highly scalable startup that costs less than the price of a few coffees a month. You add infinite runway to your project, giving yourself the time to actually solve your users' problems instead of sweating your burn rate.


Source: https://stevehanov.ca/blog/how-i-run-multiple-10k-mrr-companies-on-a-20month-tech-stack