About
Prompts are dependencies. Treat them that way.
Every other artifact your app depends on has a name, a version, a hash, a lockfile, and a review process. The string that drives your LLM call is, more often than not, a triple-quoted literal in chat.py. blogus is a thin layer that closes that gap.
The shape of the problem
Almost every team shipping LLM-powered features starts the same way: a prompt is an inline string, embedded right next to the SDK call that uses it. That works until the team grows. Once two engineers can change the same prompt, the trouble starts: silent edits buried in unrelated PRs, copy-pasted variants that diverge over time, evals that pass against one version while production runs another. Prompts behave like business logic but live like comments.
What blogus does
blogus is a Python CLI plus library. It does four things in order:
- Scan looks at the repo and lists every OpenAI / Anthropic call it can identify, with file and line, and flags whether the call uses an inline string or a versioned prompt.
- Version means promoting an inline string into a
.promptfile: YAML frontmatter for name, model id, temperature, and variables, followed by a Jinja-style template. - Lock writes
prompts.lock: a SHA256 over each template and the commit it was generated from. The lock is the artifact a reviewer can sign off on. - Sync rewrites the original code so it loads the versioned prompt by name, with a
# @blogus:summarize sha256:...marker comment.blogus verifyreads those markers in CI and fails the build if anything drifts.
Why "package.lock for prompts" is the right metaphor
npm and uv solved the same problem for source dependencies: declared version, resolved hash, machine-readable manifest, fail-loud verification. None of that infrastructure cares about who wrote the code — only that what is on disk matches what was reviewed. A prompt lockfile gives you the same property for the strings that drive model behaviour.
What blogus deliberately is not
blogus does not host your prompts. It does not run as a service. It does not sit between your application and the model. It does not collect traces, A/B test, or score outputs in production. Those are real problems, but they belong to other tools. blogus only cares about one moment: between the time a prompt changes and the time it ships, has the change been reviewed?
How it interacts with what you already use
The lock file is plain YAML; it diffs cleanly. The .prompt files are plain YAML; reviewers do not need a new editor. blogus verify is a single non-network command that returns a real exit code, so it slots into pre-commit, GitHub Actions, or any other CI runner without ceremony. If you want a browser-based view, uvx --with blogus[web] blogus-web serves one locally on port 8000.
What we built and what is still ahead
The CLI ships eleven commands today: scan, init, prompts, exec, analyze, test, lock, verify, check, fix, and demo. The optional blogus[tui] extra ships an interactive terminal walkthrough we use for demos and onboarding. The optional blogus[web] extra ships a local browser UI.
Ahead of us: deeper language coverage in the scanner, more analysis and evaluation primitives that respect the same "files first" rule, and tighter integrations with the test runners teams already use.
Who Skelf-Research is
Skelf-Research is a small research-and-tooling group focused on the workflow problems that show up once an LLM ships behind a real product. blogus is one of those problems made small enough to put a name on. The source lives on GitHub under the MIT license; the package is published to PyPI as blogus.
How to engage
If you ship anything with a model in it: uvx blogus scan in a repo and see what falls out. If you have ideas about how the scanner should behave on a language we do not handle well: open an issue. If you want to integrate the lock semantics into a different review tool you already use: the file format is small, public, and stable enough to build on. The documentation covers the format details, the CLI reference, and the recommended CI wiring.