Using Maia and Leela Chess Zero to Find Repertoire Gaps
A novel idea for chess developers
Building a “gap analysis” tool for my Repertoire Builder has been on my roadmap for early 2026. I had already integrated Maia into Chessboard Magic and published a blog about it, but the actual spark for how to approach gap analysis came from somewhere completely unexpected.
I was watching the Chessbrah Hippo speedrun (as one does), and while drifting off, it suddenly hit me:
Since Maia is trained on millions of real Lichess games, could it scan a player’s repertoire and highlight the high-probability moves they don’t have?
It was a silly moment of inspiration, but the idea made perfect sense once it landed. And I thought it might be food for thought for anyone developing chess tools or experimenting with AI-driven analysis.
The Classic Approach (And Why It’s Heavy)
Before this idea appeared, I had been planning something more traditional—and certainly more technically demanding. It’s a fairly common way developers think about repertoire analysis, especially when looking for missed moves or building opening explorers.
The general outline usually looks like this:
1. Downloading large Lichess datasets
Lichess publishes monthly PGN dumps containing millions of games.
A single month is roughly 30 GB, and meaningful analysis often requires:
- multiple months or years
- rating-based splits
- time control filtering
This alone is a major undertaking.
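To give a feel for the filtering step, here is a minimal sketch that reads only the header tags of each game and keeps games inside a rating band and time-control range. It assumes the `WhiteElo`, `BlackElo` and `TimeControl` tags found in Lichess PGN exports; the function names and the thresholds are hypothetical, and a real pipeline would stream the compressed dump rather than a list of lines.

```python
import re
from typing import Dict, Iterable, Iterator, Optional

# Matches a PGN tag pair such as: [WhiteElo "1510"]
HEADER_RE = re.compile(r'^\[(\w+) "(.*)"\]$')


def iter_game_headers(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict of header tags per game, ignoring the movetext."""
    headers: Dict[str, str] = {}
    for raw in lines:
        match = HEADER_RE.match(raw.strip())
        if not match:
            continue  # blank line or movetext
        key, value = match.groups()
        # Assumption: every game starts with an Event tag.
        if key == "Event" and headers:
            yield headers
            headers = {}
        headers[key] = value
    if headers:
        yield headers


def base_seconds(time_control: str) -> Optional[int]:
    """Return the base time of '300+3' as 300, or None if it is not numeric."""
    base = time_control.split("+")[0]
    return int(base) if base.isdigit() else None


def keep_game(headers: Dict[str, str], min_elo: int, max_elo: int,
              min_base: int, max_base: int) -> bool:
    """Keep games whose average rating and base time fall inside the ranges."""
    try:
        white = int(headers["WhiteElo"])
        black = int(headers["BlackElo"])
    except (KeyError, ValueError):
        return False  # missing or non-numeric rating
    base = base_seconds(headers.get("TimeControl", "-"))
    if base is None:
        return False
    average = (white + black) / 2
    return min_elo <= average < max_elo and min_base <= base <= max_base
```

Even this small filter has to run over every game in every month you download, which is part of why the classic approach gets expensive quickly.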
2. Building a FEN move-frequency index
From those PGNs, you would:
- reconstruct all positions
- count the most common replies
- bucket moves by rating range
- prune noise
- store everything in a fast lookup structure
It’s a valid approach, but it results in a very large, specialised database and ongoing engineering work.
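As a rough illustration of the indexing steps above, the sketch below counts replies per position and rating bucket, prunes rare moves, and returns the most common replies. It assumes the positions have already been reconstructed from the PGNs into (FEN, UCI move, average rating) tuples; all names, the 200-point bucket width and the pruning threshold are hypothetical choices, and an in-memory dict stands in for the fast lookup structure.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# (FEN before the move, move in UCI notation, average rating of both players)
Observation = Tuple[str, str, int]
IndexKey = Tuple[str, int]


def position_key(fen: str) -> str:
    """Drop the halfmove and fullmove counters so transpositions share a key."""
    return " ".join(fen.split()[:4])


def rating_bucket(rating: int, width: int = 200) -> int:
    """Map a rating to the lower bound of its bucket, e.g. 1550 -> 1400."""
    return (rating // width) * width


def build_index(observations: Iterable[Observation],
                min_count: int = 2) -> Dict[IndexKey, Dict[str, int]]:
    """Count replies per (position, rating bucket) and prune rare moves."""
    counts: Dict[IndexKey, Counter] = defaultdict(Counter)
    for fen, move, rating in observations:
        counts[(position_key(fen), rating_bucket(rating))][move] += 1
    index: Dict[IndexKey, Dict[str, int]] = {}
    for key, counter in counts.items():
        kept = {move: n for move, n in counter.items() if n >= min_count}
        if kept:
            index[key] = kept
    return index


def top_replies(index: Dict[IndexKey, Dict[str, int]], fen: str, rating: int,
                limit: int = 3) -> List[Tuple[str, float]]:
    """Return the most common replies with their share of the kept games."""
    moves = index.get((position_key(fen), rating_bucket(rating)), {})
    total = sum(moves.values())
    ranked = sorted(moves.items(), key=lambda kv: kv[1], reverse=True)[:limit]
    return [(move, n / total) for move, n in ranked]
```

The logic itself is short. The cost comes from running it over hundreds of millions of positions and keeping the result queryable.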
3. Handling backend scale
Analysing 1,000–5,000 repertoire positions per user means:
- heavy query loads
- caching layers
- scheduled pre-processing jobs
- storage and maintenance overhead
It works—but it’s heavy.
Note: Why Not Use the Lichess Opening Explorer API Instead?
A natural alternative is to query the Lichess API directly, but:
- strict rate limits
- restrictions on automated bulk lookups
- slowdowns under heavy usage
mean it cannot serve as the backbone for automated repertoire analysis.
It’s perfect for manual exploration, but not for thousands of FEN requests.
A New Perspective: Let Maia Do the Work
This is where the Hippo-speedrun-inspired thought completely shifted my approach.
Maia is trained directly on millions of Lichess games. Instead of constructing a massive FEN database yourself, Maia effectively compresses that statistical knowledge into a single model.
So instead of building and querying your own index to answer:
“What moves do humans usually play from this position?”
You can simply ask:
“Maia, what would a human play here?”
And this works beautifully for gap detection.
What Maia gives you
Maia outputs:
- a list of legal moves
- each with a probability describing how likely a human is to choose it
So whenever Maia gives a high-probability move that is not in the user’s repertoire, that becomes a clear gap.
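The gap check itself can be very small. Here is a minimal sketch, assuming the model's output for one position (with the opponent to move) has already been turned into a dict of UCI move to probability. The function name `find_gaps`, the 10% threshold and the probabilities in the usage example are illustrative assumptions, not Maia's actual interface or output.

```python
from typing import Dict, List, Set, Tuple


def find_gaps(move_probs: Dict[str, float], repertoire_moves: Set[str],
              threshold: float = 0.10) -> List[Tuple[str, float]]:
    """Return likely human moves that the repertoire has no answer for.

    move_probs: predicted probability for each legal move in the position.
    repertoire_moves: opponent moves the repertoire already covers here.
    """
    gaps = [
        (move, prob)
        for move, prob in move_probs.items()
        if prob >= threshold and move not in repertoire_moves
    ]
    # Most likely gaps first, so the user sees the biggest holes at the top.
    return sorted(gaps, key=lambda gap: gap[1], reverse=True)


# Made-up probabilities for the position after 1. e4, for illustration only.
predicted = {"e7e5": 0.42, "c7c5": 0.31, "e7e6": 0.12, "c7c6": 0.08, "d7d5": 0.07}
covered = {"e7e5", "c7c5"}
print(find_gaps(predicted, covered))  # [('e7e6', 0.12)]
```

The threshold is the main tuning knob: lower it and the tool reports more sidelines, raise it and only the most common omissions show up.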
All computation happens in the user’s browser
Because Maia runs through ONNX directly in the client, you get:
- no servers
- no databases
- no Cloud Functions
- no API calls or rate limits
- instant analysis
- zero operational cost
Every user processes their repertoire on their own device.
Note: Maia is weaker in the early opening
Since Maia is trained on real human games—not curated theory—its opening choices can drift from theoretical best practice.
This means:
- moves 1–10 often reflect human habits rather than theory
- early positions need stabilising from another data source
- lower-rated Maia models particularly inherit amateur inaccuracies
A practical solution is to use:
- Lichess opening explorer data for the first few moves
- Maia for middle-game and late-opening gap detection
Together, they provide a stronger and more balanced output.
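One simple way to combine the two sources is to switch on the move number stored in the FEN. The sketch below is an assumption about how that could look: it prefers explorer data up to a cutoff move when any is available and falls back to Maia otherwise. The names `pick_source` and `normalise`, and the cutoff of move 10, are hypothetical; the explorer input is assumed to be a plain dict of move to game count that you fetched or cached separately.

```python
from typing import Dict, Tuple


def fullmove_number(fen: str) -> int:
    """The sixth FEN field is the fullmove number (starts at 1)."""
    return int(fen.split()[5])


def normalise(counts: Dict[str, int]) -> Dict[str, float]:
    """Turn per-move game counts into probabilities."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {move: n / total for move, n in counts.items()}


def pick_source(fen: str, explorer_counts: Dict[str, int],
                maia_probs: Dict[str, float],
                explorer_until_move: int = 10) -> Tuple[str, Dict[str, float]]:
    """Use explorer frequencies early in the game, Maia afterwards."""
    explorer_probs = normalise(explorer_counts)
    if fullmove_number(fen) <= explorer_until_move and explorer_probs:
        return "explorer", explorer_probs
    # Past the cutoff, or no explorer data for this position.
    return "maia", maia_probs
```

Because only the early positions need explorer data, the number of lookups stays small compared with querying it for every position in the repertoire.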
Pairing with Leela Chess Zero
Lc0 adds another perspective:
- Maia: what humans tend to play
- Lc0: what a neural network engine evaluates as strongest
This lets you:
- detect gaps
- evaluate the quality of moves
- give users both human-guided and engine-guided insights
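To show how the two signals might be combined, here is a sketch that labels each gap by how much it gives up compared with the engine's best move. It assumes you already have, for the position, an expected score between 0 and 1 for each candidate move from the mover's point of view, however you obtained it from Lc0. `GapReport`, `classify_gaps` and the 0.05 margin are hypothetical names and values, not part of either project.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class GapReport:
    move: str
    human_prob: float   # how likely a human is to play the move
    score_loss: float   # best expected score minus this move's expected score
    label: str


def classify_gaps(gaps: List[Tuple[str, float]],
                  engine_scores: Dict[str, float],
                  mistake_margin: float = 0.05) -> List[GapReport]:
    """Label each uncovered move as sound or dubious using engine scores."""
    if not engine_scores:
        return []
    best = max(engine_scores.values())
    reports = []
    for move, prob in gaps:
        if move not in engine_scores:
            continue  # no engine data for this move
        loss = best - engine_scores[move]
        label = "sound" if loss < mistake_margin else "dubious"
        reports.append(GapReport(move, prob, loss, label))
    return reports
```

A "sound" gap suggests the user needs a proper line against a good move, while a "dubious" gap is a common mistake where the user mostly needs to know how to punish it.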
Why This Matters
This approach turns a once-heavy engineering problem into something elegant and lightweight.
- No large PGN downloads
- No huge FEN indexes
- No backend infrastructure
- No rate-limited API calls
- Instant in-browser inference
- Human-relevant move suggestions
This makes gap analysis far more accessible for both developers and users.
Final Thoughts
For years, I assumed that serious repertoire gap detection required:
- enormous datasets
- data pipelines
- backend servers
- constant maintenance
But once Maia was running in the browser, a simpler idea became obvious:
Let the model be the database—and use Lichess to support the early opening where needed.
It’s a lightweight, scalable, and surprisingly powerful strategy.
If you're developing chess tools, I think there’s a lot of potential in this direction.
If you experiment with it, I’d love to see what you create.
I hope you found this an interesting read and idea. If you have any questions, let me know in the comments or send me a DM.
References
- Lichess Database
https://database.lichess.org
Monthly PGN and CSV dumps used for large-scale research, training, and statistical modelling.
- Lichess Opening Explorer API
https://lichess.org/api#tag/Opening-Explorer
Provides opening move frequencies. Excellent for manual lookup, but rate limits prevent bulk automated use.
- Maia Chess Project
https://maiachess.com
A suite of neural networks trained to predict human moves at different rating levels.
- Maia GitHub Repository
https://github.com/CSSLab/maia-chess
Provides the full codebase, model details, and training methodology.
- Leela Chess Zero
https://lczero.org
Open-source neural-network chess engine based on reinforcement learning and self-play.
- ONNX Runtime
https://onnxruntime.ai
Cross-platform inference system enabling models like Maia to run directly in the browser.