I had a very short, very strong love affair with Fable 5. Until the U.S. government took it away from me.
Which makes me think: we shouldn't be dependent on it. So let's explore the ways to be independent, cost-effective, and run an amazing business — with all the time freedom we want and costs kept low.
Start here: right now, you rent your intelligence.
Every email your AI drafts, every list it cleans, every transcript it summarizes — you pay a meter that someone else controls. You can almost hear it ticking. Today it's cheap. But cheap is a decision the provider made, and they can un-make it overnight — the way mine was un-made.
Most small businesses have quietly wired their whole operation to a price they don't set and can't predict. That's not a tool. That's a landlord.
The line everyone gets wrong
The fear is: "I can't run real AI without a data center." You're picturing the frontier — the biggest, smartest model in the world. You don't need it. Almost nothing you do every day needs it.
Put a 40-minute sales call in front of a model running on a normal laptop and it hands you five clean bullet points in seconds — offline, on a plane, for nothing. No meter. No invoice. No internet.
That's the bulk of small-business AI work: sorting leads, tagging emails, summarizing calls, drafting a first version you'll rewrite anyway. A normal computer handles all of it now — that line quietly got crossed about six months ago.
So the skill isn't "buy the smartest thing." It's matching the right-sized model to the job. Big rented intelligence for the few things that ship and decide. Small, free, local intelligence for the hundred things that just need doing.
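One way to picture that matching rule is a tiny router. This is a minimal sketch, not a product: the task names and the "local" and "cloud" labels are made up for illustration, and your own list of routine jobs will differ.

```python
# Hypothetical task names, for illustration only: the routine jobs
# a small local model can handle.
LOCAL_TASKS = {"sort_leads", "tag_email", "summarize_call", "first_draft"}


def pick_model(task: str, high_stakes: bool = False) -> str:
    """Return "local" for routine work, "cloud" for anything that ships or decides."""
    # Unknown tasks default to the cloud, so nothing important is
    # quietly handed to the smaller model.
    if high_stakes or task not in LOCAL_TASKS:
        return "cloud"
    return "local"


for task in ("tag_email", "summarize_call", "contract_review"):
    print(task, "->", pick_model(task))
```

The design choice worth copying is the default: when in doubt, the job goes to the bigger model, and you move it to the local list only after you've seen the small one handle it.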
Here's the math nobody shows you
Right now, most small businesses pay a flat monthly fee for AI — call it $200. Here's the part nobody mentions: at real, metered prices, the work you're actually running would cost far more. For a team leaning on it all day, the true usage runs toward $5,000 a month. The provider is eating the difference — for now.
That's the trap. A flat fee feels safe, so you wire your whole operation to it. But a flat fee is an introductory offer, not a law of nature. The day the pricing flips to "pay for what you use" — and for a lot of tools, that day is coming — your bill doesn't creep from $200. It leaps toward what you were really using all along.
So the real question isn't "how do I shave my $200." It's: what happens when the $200 becomes $5,000 — and am I ready?
Here's the hedge. A one-time machine — about $4,000 — runs your own models. It can't do everything the frontier does, and you wouldn't want it to. But the bulk of that $5,000 of usage — the sorting, summarizing, tagging, transcribing, first-drafting — it does for free.
| When metered pricing hits | The number |
|---|---|
| Your real usage, at API prices | ~$5,000 / month |
| What your own machine absorbs (the bulk) | ~$3,000 / month |
| What stays in the cloud (the few frontier jobs) | ~$2,000 / month |
| The machine that does it | ~$4,000, once |
A one-time $4,000 buys back about $3,000 every month. It pays for itself in roughly six weeks, then saves you on the order of $36,000 a year for as long as the machine runs. The honest part: it's not the whole $5,000. About 40% of your usage, roughly $2,000 a month, is frontier-grade and stays in the cloud. Anyone promising a $4,000 box replaces all of it is selling you something.
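If you want to rerun the numbers with your own figures, the arithmetic fits in a few lines. This is a rough sketch using the illustrative numbers from the table; the 60% local share is simply $3,000 out of $5,000, and the function name is made up for this example. It ignores electricity and your setup time.

```python
def payback(machine_cost: float, monthly_usage: float, local_share: float):
    """Return (monthly savings, cloud remainder, payback in weeks, annual savings)."""
    monthly_savings = monthly_usage * local_share        # work the machine absorbs
    cloud_remainder = monthly_usage - monthly_savings    # frontier jobs that stay rented
    payback_weeks = machine_cost / monthly_savings * 52 / 12
    annual_savings = monthly_savings * 12
    return monthly_savings, cloud_remainder, payback_weeks, annual_savings


# Illustrative figures from the table above, not measurements.
savings, cloud, weeks, annual = payback(4000, 5000, 0.60)
print(f"Saved per month: ${savings:,.0f}")
print(f"Still in the cloud: ${cloud:,.0f}")
print(f"Payback: {weeks:.1f} weeks")
print(f"Saved per year: ${annual:,.0f}")
```

With the table's numbers the payback comes out a little under six weeks. Try a lower local share, say 0.3, to see how the answer changes if less of your work turns out to be routine.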
You can't stop the meter from being switched on. You can make sure that when it is, most of your work is already running on a machine it can't touch. That's the difference between renting your intelligence and owning it.
The exact setup
Hardware
Free tier: any computer with 16GB of memory. Runs a 12-billion-parameter model — the sweet spot for small-business work.
Serious tier: a Mac Studio (or equivalent PC) with 128GB unified memory, about $4,000. Runs models up to 120 billion parameters, always on, reachable from your phone or laptop.
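A back-of-the-envelope check on those two tiers: memory needed is roughly parameters times bits per weight. The sketch below assumes a 4-bit quantized model at about 4.5 bits per weight (the extra half bit covers the scaling data those formats store) and adds 20% for the runtime and working context. Both figures are assumptions for illustration; real files vary by format.

```python
def model_memory_gb(params_billions: float,
                    bits_per_weight: float = 4.5,   # assumed: typical 4-bit format
                    overhead: float = 1.2) -> float:  # assumed: 20% runtime headroom
    """Estimate the memory, in GB, needed to run a quantized model."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9


for size in (12, 120):
    print(f"{size}B parameters: about {model_memory_gb(size):.0f} GB")
```

Under these assumptions a 12-billion-parameter model needs about 8 GB and a 120-billion-parameter model about 81 GB, which is why the first fits on a 16GB computer and the second needs the 128GB machine.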
Software — four steps, one evening
Install the runtime first. Ollama (terminal) or LM Studio (click-to-run, no terminal). Ten minutes. Everyone does this backwards and hunts for models first — don't.
Pull one model. Start with Qwen — strong, free, cleanly licensed. If you only ever learn one, learn that one.
Turn on quantization. The "Q4" (4-bit) setting compresses the model to roughly a quarter to a third of its full 16-bit size, like saving a photo as a high-quality JPEG: your eye can't tell the difference. It's how something that "needs a server" runs on a laptop.
Give it tools, not just brains. Web search, file access. A small model with tools beats a giant one without them. The model is the engine; tools are the wheels.
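Once the runtime and a model are installed, talking to it from a script takes a dozen lines. This is a minimal sketch, assuming Ollama is running on its default local address and that you pulled a Qwen model. The tag `qwen2.5:7b` and the function names are assumptions for this example; substitute whatever tag you actually pulled.

```python
import json
import urllib.request

# Ollama's default local endpoint. Nothing here leaves your machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "qwen2.5:7b") -> urllib.request.Request:
    """Package a prompt as a POST request for the local Ollama server."""
    # "stream": False asks for one complete JSON reply instead of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(transcript: str, model: str = "qwen2.5:7b") -> str:
    """Ask the local model for a five-bullet summary of a call transcript."""
    prompt = "Summarize this sales call in five bullet points:\n\n" + transcript
    with urllib.request.urlopen(build_request(prompt, model), timeout=300) as resp:
        return json.loads(resp.read())["response"]
```

To use it, read a transcript from a file and call `summarize(text)`. If the call fails with a connection error, the runtime isn't running yet; start Ollama and try again.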
The three things you actually get
Privacy. The data never leaves your machine. For anyone in finance, health, or law, that's what lets you say yes to a client you'd otherwise have to turn away.
Zero marginal cost. Once the machine is paid for, volume is free. You stop rationing. You stop asking "can we afford to run this on everything?" You just run it.
Nobody can turn it off. No price hike, no policy change, no outage. It works on a plane, in a basement, on the worst internet in the world. While your competitors sweat the next price jump, your machine just hums along.
That last one is the real point. It's a generator in the garage. You hope you never need it. You sleep better owning it.
The control move
Here's why this is a sales conversation, not a tech one.
Selling from need makes you weak. Win rates drop the moment you're the one who can't walk away. The same is true of how you run the business behind the sale. Put your whole operation on a tool someone else prices, and you've handed them leverage over your margins.
You don't have to rip out the cloud. The smart setup is both: the big rented model for the few things that truly need the best, your own free model for the rest. Cloud is the grid. Local is the generator. You want both.
Own part of your stack — not because the meter has jumped yet, but so it never gets to decide for you.
That's control. And control is the whole game.
The Sprint Club is where we build this with you — the tools, and the people who help you install them: https://strategysprints.com
