Testing a Neovim plugin in CI is one of those problems that looks like it should be solved and isn’t. If you’re writing a Vim plugin — or worse, a Neovim distribution — you have three choices today, and every one of them has a sharp edge:
- Install Neovim system-wide on the runner and shell out to
nvim --headless -c 'PlenaryBustedDirectory ...'. Works, but you own the install/update dance, and the runner OS decides which Neovim you get. - Vendor plenary.nvim into every plugin repo so its test harness is present at test time. That means a submodule (or a lazy clone in CI) for every project, plus you still have to install Neovim itself.
- Ship busted as your Lua test framework. Now you need a matching Lua interpreter that isn’t Neovim’s LuaJIT, and your tests can’t call Neovim APIs directly — they need shims. If your plugin is Neovim code, half the reason to have tests just went away.
neospec is the tool I wanted for exactly this problem. It’s a single Go binary that manages its own Neovim, runs Lua tests inside the real editor, and emits reports in the formats your CI already parses.
The self-contained Neovim story
The load-bearing idea in neospec is that the tool itself owns the Neovim binary. You point it at your test files; it downloads the version you asked for from neovim/neovim releases, caches it under ~/.cache/neospec/<version>/<os>/<arch>/, and runs your tests inside that. First run is one HTTPS download (~28 MB per version). Every subsequent run is instant — the binary is on disk and the version-pin means you get the same one every time.
That single decision knocks out a whole class of “works on my machine” bugs:
- Runner OS drift — an Ubuntu runner’s
apt install neovimand a macOS runner’sbrew install neovimdo not give you the same version. neospec’s cached binary does. - Nightly-vs-stable drift — pin
--neovim-version=v0.10.4inneospec.tomland stop worrying about a nightly upgrade breaking your tests without your commit. - Ephemeral CI cache invalidations — the download is cheap enough that missing the cache one build in ten is not a real cost.
neospec run --neovim-version=v0.10.4
neospec cache list # what's on disk
neospec cache clean # free the ~28 MB × N
Test isolation via a clean XDG env
Every neospec run spins up Neovim with a fresh XDG_CONFIG_HOME, XDG_DATA_HOME, and XDG_STATE_HOME under a temp directory. Your tests cannot read or mutate your real ~/.config/nvim — which is exactly what you want when the plugin under test is one you also use as a human. It also means the same test suite runs identically on a fresh CI runner and on your laptop, because both start from an empty Neovim state.
Why not just use busted?
Busted is a good tool. It’s also solving a slightly different problem — it’s a Lua test framework, not a Neovim test framework. Two consequences follow from that:
- You need a matching Lua runtime. Neovim uses LuaJIT with a specific set of embedded APIs.
lua5.1on your PATH isn’t the same runtime. When your test doesrequire("myplugin")and the plugin callsvim.api.nvim_buf_get_lines(...), that call has to hit the real Neovim, not a shim. - Vendoring is the ergonomic price. Every plugin repo that uses busted-based testing (e.g. via plenary.nvim’s
PlenaryBustedDirectory) needs plenary reachable at test time — as a submodule, as a lazy clone, or as a package that the runner installs. neospec is a single binary; there is nothing to vendor.
The neospec harness reads like busted on purpose — describe / it / before_each / after_each / pending all mean what a busted user would expect them to mean — so a plugin author already fluent in busted doesn’t have to relearn anything. What changes is what runs the tests.
-- test/parser_spec.lua
describe("parser", function()
local subject
before_each(function() subject = require("myplugin.parser").new() end)
it("tokenizes identifiers", function()
local tokens = subject.tokenize("foo bar")
assert.equals(2, #tokens)
assert.equals("foo", tokens[1].value)
end)
it("rejects invalid syntax", function()
assert.has_error(function() subject.parse("!!!invalid") end)
end)
end)
Standard assertion set — equals, not_equals, is_true/is_false, is_nil/is_not_nil, has_error, matches. Each takes an optional final msg for a custom failure line.
Coverage without source rewriting
The coverage story is where neospec earns the “innovative” label. Most language coverage tools instrument source — you install a plugin, or your test runner rewrites the file on the way in. That approach has real costs: your line numbers can drift, tools that read your source (LSPs, other coverage tools) get confused by the rewritten output, and you have to trust the rewrite to be correct.
neospec uses Lua’s built-in debug.sethook API. The hook fires on every executed line, records the hit count, and gets uninstalled at the end of the run. Your source is not touched. Line numbers are the ones you see in your editor. Your LSP is not confused.
Reports come out in the formats CI already accepts:
| Format | Where it plugs in |
|---|---|
| LCOV | Codecov, Coveralls, most editor coverage overlays |
| Cobertura XML | GitLab CI, Jenkins, GitHub’s MishaKav/jest-coverage-comment and friends |
| Coveralls JSON | Coveralls direct upload |
| JUnit XML | GitHub Actions test-report annotations, GitLab test tab |
| Console | Human-readable summary at the bottom of the terminal output |
neospec run --format=console --format=lcov --format=junit
# reports land in coverage/lcov.info, coverage/junit.xml, and stdout
The threshold gate
A one-line CI merge gate:
neospec run --threshold=80
# exits non-zero if coverage < 80%
That’s how you keep coverage from silently drifting down. Every PR either holds the line or explains why the number moved. The gate isn’t opinionated about how you get there — the failure message is just “coverage 76.3% < 80.0% threshold” and it’s on the author to decide whether that’s a fix-it-in-this-PR situation or a shifting-goalposts situation.
GitHub Action
The action shape mirrors the CLI one-to-one, so what you validate locally is what runs in CI:
- uses: jedi-knights/neospec@v0
with:
neovim-version: stable # or nightly, or v0.10.4
formats: console,lcov,junit
threshold: "80"
- uses: jedi-knights/coverage-badge@v1
# reads coverage/lcov.info and updates the README badge on push
Two things worth flagging about the CI story:
- The action installs nothing on the runner except the neospec binary itself. It does not
apt install neovim. It does not clone plenary. The runner ends the job with the same package state it started with, minus a temp directory neospec cleaned up. - JUnit output surfaces per-test detail in the GitHub Actions summary. You do not need a separate test-report action to see which spec failed and why — the annotations attach to the workflow run directly.
Where this fits
If you write Neovim plugins and you’ve been putting off writing tests because the setup felt like it wasn’t worth it — this is the tool that changes that math. One binary. One config file. One Action. No system-nvim install, no vendored plenary, no headless shell scripts. Your Lua code runs against a real, version-pinned Neovim, with coverage, in every runner, every time.
That’s the piece I want more Neovim authors to internalize: the testing infrastructure is no longer the reason not to test your plugin.
Where to find it
- Repo: github.com/jedi-knights/neospec
- Releases: github.com/jedi-knights/neospec/releases
- Coverage badge companion Action: github.com/jedi-knights/coverage-badge
- Neovim upstream releases: github.com/neovim/neovim/releases
- Comparison — plenary.nvim’s test harness: github.com/nvim-lua/plenary.nvim
- Comparison — busted: lunarmodules.github.io/busted