a local model, wired into your editor in a minute.
oi is the thin layer between a coding llm running on your machine and the places you actually write code — your terminal and vs code. install the cli, point it at ollama or llama.cpp, and start. no account, no keys, no code leaving your laptop.
→ scanning for a local llm runtime…
✓ found ollama at http://localhost:11434
✓ detected model · codellama:13b (7.4 gb)
✓ detected model · qwen2.5-coder:7b (4.7 gb)
→ default model set to codellama:13b
✓ oi is ready — nothing leaves this machine.
oi generating a patch with codellama:13b …
+ validate req.body.email against zod schema
+ return 422 on parse failure
apply this patch? [y/n]
cloud assistants are powerful — and they make you give up three things.
you hand over your source, you pay by the token, and you trust someone else to keep the model stable and online. for a lot of work that trade is fine. for the rest of it (confidential code, tight budgets, offline machines) it isn't. oi gives you the same daily assistant without the trade.
per token — local inference on your own hardware
on-device — prompts and code stay on localhost
upvotes on the hacker news thread asking for exactly this
built for the people the cloud leaves out.
indie hackers
ship without watching a token meter. a local model handles the day-to-day refactors and boilerplate at a fixed cost — your gpu — so a heavy week doesn't mean a heavy bill.
privacy-sensitive teams
for regulated, client-confidential or proprietary codebases, nothing can leave the building. oi keeps every prompt and diff on localhost, so legal and security stay happy.
offline & air-gapped
on a plane, on a secure network, or just off the grid — oi works with no connection at all once the model is pulled. the assistant is always there.
cost control
predictable spend instead of usage-based surprises. run the model you already have on hardware you already own, and there is no per-request charge at all.
four steps from install to coding.
install the cli
brew install oi, or curl the one-line installer. the cli is the engine — it talks to your local runtime and powers the editor extension.
run oi setup
oi scans for ollama or llama.cpp, lists the models you've pulled, and links one as your default. nothing is uploaded to the cloud; it just wires up what's already on your machine.
the vs code extension auto-detects it
open vs code and the extension finds your running oi setup automatically. the chat panel, quick actions and commit generation light up — no keys, no account.
code locally
every prompt runs against your model on localhost. ask questions, generate, refactor, draft commits — all private, all free, all offline-ready.
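for the curious, here is a rough sketch of what the model scan in `oi setup` might look like against ollama. it is not oi's actual source. `GET /api/tags` is ollama's documented endpoint for listing pulled models; the names `JsonFetcher`, `formatSize`, `describeModels` and `listLocalModels` are hypothetical, and the output format simply mirrors the setup lines shown earlier.

```typescript
// Subset of the JSON returned by Ollama's GET /api/tags endpoint.
interface OllamaModel {
  name: string; // e.g. "qwen2.5-coder:7b"
  size: number; // size on disk, in bytes
}

interface TagsResponse {
  models?: OllamaModel[];
}

// Hypothetical helper type: any function that fetches a URL and
// resolves to parsed JSON. Injecting it keeps the sketch testable offline.
type JsonFetcher = (url: string) => Promise<unknown>;

// Format a byte count in decimal gigabytes, e.g. "4.7 gb".
function formatSize(bytes: number): string {
  return `${(bytes / 1e9).toFixed(1)} gb`;
}

// Build one status line per detected model.
function describeModels(models: OllamaModel[]): string[] {
  return models.map(
    (m) => `✓ detected model · ${m.name} (${formatSize(m.size)})`,
  );
}

// Ask a local Ollama server which models have been pulled.
// The default base URL is Ollama's standard local address.
async function listLocalModels(
  fetchJson: JsonFetcher,
  baseUrl: string = "http://localhost:11434",
): Promise<OllamaModel[]> {
  const body = (await fetchJson(`${baseUrl}/api/tags`)) as TagsResponse;
  return body.models ?? [];
}
```

on node 18 or later, a fetcher could be as small as `(url) => fetch(url).then((r) => r.json())`. the request only ever goes to localhost, which is the point.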
welcome to oi
a coding llm that runs on your machine. private, powerful, yours — no cloud, no per-token bill, no code leaving your laptop.
install the cli
one command — brew install oi, or curl the installer.
run oi setup
it finds your ollama or llama.cpp model and links it.
code locally
open vs code. the extension auto-detects oi and the chat panel is ready.
takes about a minute · works offline
message
feat(auth): add slug validation + 422 on bad input
validate route params against the slug regex and reject malformed requests early instead of failing in the handler.
+ src/lib/validators.ts
~ src/routes/posts.ts
~ src/routes/users.ts
the model earns its keep in the boring places too.
oi reads your staged diff and drafts a real conventional-commit message — subject and body grounded in what actually changed. it's the kind of small, constant task a local model is perfect for: fast, free, and never sending your diff anywhere.
- drafts from the diff, not a template
- available in vs code and from the cli (oi commit)
- the diff stays inside your repo
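as an illustration of drafting from the diff, here is a minimal sketch of how a staged diff could be turned into a request for a local ollama model. this is an assumption about the approach, not oi's implementation. the request shape (`model`, `prompt`, `stream`) follows ollama's `POST /api/generate` endpoint; the function names, the prompt wording and the subject-line check are hypothetical.

```typescript
// Conventional Commits subject, e.g. "feat(auth): add slug validation".
// The type list is the commonly used set, not something oi prescribes.
const SUBJECT_RE =
  /^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([\w.-]+\))?!?: \S.*$/;

// Check that a model's first output line looks like a valid subject.
function isConventionalSubject(line: string): boolean {
  return SUBJECT_RE.test(line);
}

// Turn a staged diff (for example the output of `git diff --staged`)
// into a prompt. The instructions here are illustrative wording.
function buildCommitPrompt(stagedDiff: string): string {
  return [
    "Write a Conventional Commits message for the staged diff below.",
    "First line: type(scope): subject, at most 72 characters.",
    "Then a blank line and a short body describing what changed and why.",
    "",
    stagedDiff.trim(),
  ].join("\n");
}

// Body for Ollama's POST /api/generate, with streaming turned off so
// the whole message arrives in a single JSON response.
function buildGenerateRequest(model: string, stagedDiff: string) {
  return { model, prompt: buildCommitPrompt(stagedDiff), stream: false };
}
```

posting that body to `http://localhost:11434/api/generate` returns JSON whose `response` field holds the drafted message. running `isConventionalSubject` on its first line is a cheap guard before showing it to you.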
same daily assistant. very different trade.
private, powerful, yours.
free cli and vs code extension. point oi at the model you already run and keep every line of code on your own machine.