How Chess Masti AI works
Most "AI chess coaches" are an LLM with a board diagram pasted into the prompt. They hallucinate. They invent moves that aren't legal in the position. They confidently misidentify pieces. They tell you the rook on f1 is hanging when there's no rook on f1.
Chess Masti AI is built the other way around. The engine runs first. The LLM only ever paraphrases what the engine already said. Then a separate validator checks the paraphrase against the actual board before you read it.
1. The engine runs first
When you load a game, Stockfish 17 runs in your browser as a WebAssembly worker — no server round-trip, no rate limit. For each move the engine produces an evaluation, the best continuation, and the next-best alternatives. We layer three more passes on top of the raw eval:
- Tactical motif detection: pins, forks, skewers, discovered attacks, back-rank patterns.
- Candidate-move gap analysis: how big was the gap between the move you played and Stockfish's recommendation, and was the recommendation a single forcing line or a quiet positional choice?
- Branch-point analysis: was this move the position's pivot — where the game's evaluation decisively turned?
The LLM never sees the bare position. It sees the engine's structured verdict.
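To make "structured verdict" concrete, here is a minimal sketch of the candidate-move gap pass. The `EngineLine` and `Verdict` shapes, the field names, and the 150-centipawn "only move" margin are all assumptions for illustration, not Chess Masti AI's actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EngineLine:
    move: str  # candidate move in SAN, e.g. "Nxe5"
    cp: int    # evaluation in centipawns, from the mover's point of view


@dataclass
class Verdict:
    played: str
    best: str
    gap_cp: int      # how much worse the played move is than the best move
    only_move: bool  # True when the best move stands far above the runner-up


def build_verdict(lines: List[EngineLine], played: str,
                  only_move_margin: int = 150) -> Verdict:
    """Summarise multi-PV engine output; `lines` must include the played move."""
    ranked = sorted(lines, key=lambda line: line.cp, reverse=True)
    best = ranked[0]
    played_line = next((line for line in ranked if line.move == played), None)
    if played_line is None:
        raise ValueError(f"played move {played!r} is not among the engine lines")
    runner_up_cp = ranked[1].cp if len(ranked) > 1 else best.cp
    return Verdict(
        played=played,
        best=best.move,
        gap_cp=best.cp - played_line.cp,
        only_move=(best.cp - runner_up_cp) >= only_move_margin,
    )


lines = [EngineLine("Nxe5", 210), EngineLine("Qe2", 30), EngineLine("h3", -20)]
verdict = build_verdict(lines, played="h3")
print(verdict.gap_cp, verdict.only_move)  # 230 True
```

A record like this, rather than a board diagram, is the kind of input the language model would be asked to paraphrase.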
2. Claude turns the verdict into language
Two models, picked by latency budget:
- Claude Sonnet (Anthropic) handles deep, multi-paragraph analysis. It receives the engine output and your historical context — playing style, study goals, favourite openings — and writes the coaching response.
- Claude Haiku handles follow-up chat with sub-5-second responses. The first analysis call seeds a server-side context cache keyed by contextId, so subsequent questions don't repay the full token cost.
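The contextId idea can be sketched with an in-memory dictionary. This is a stand-in under stated assumptions: the class name, method names, and storage are hypothetical, and the real cache lives server-side and may well work differently.

```python
import uuid
from typing import Dict, List


class ContextCache:
    """In-memory stand-in for a server-side cache keyed by contextId."""

    def __init__(self) -> None:
        self._store: Dict[str, List[str]] = {}

    def seed(self, engine_verdict: str, player_profile: str) -> str:
        # First analysis call: store the expensive context once, return its key.
        context_id = uuid.uuid4().hex
        self._store[context_id] = [engine_verdict, player_profile]
        return context_id

    def follow_up_prompt(self, context_id: str, question: str) -> List[str]:
        # Later calls send only the key and the new question; the stored
        # context is re-attached here instead of being uploaded again.
        context = self._store.get(context_id)
        if context is None:
            raise KeyError(f"unknown or expired contextId: {context_id}")
        return context + [question]


cache = ContextCache()
cid = cache.seed("best: Nxe5, gap: 230cp", "style: attacking")
print(cache.follow_up_prompt(cid, "Why not Qe2?"))
```

The client holds only the short key, so each follow-up question travels light while the model still sees the full analysis context.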
The system prompt makes one thing explicit: never invent a chess fact. If the engine didn't say it, don't write it.
3. The hallucination validator
LLMs still drift. So before any coaching response renders, a hallucination validator parses it for every reference to a piece, square, or move, and checks each one against the live chess.js board state.
If the response says "the bishop on c4 attacks f7," the validator confirms there is a bishop on c4 and that f7 is on its diagonal. If the response suggests Nxe5 as a candidate, the validator confirms that a knight can legally capture on e5 in the current position. Claims that don't check out are rewritten or dropped.
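A bishop claim of that kind reduces to two checks: is the piece there, and is the line open? The production validator works against chess.js; the sketch below is a standalone Python stand-in with hypothetical function names, and it only covers bishop attacks.

```python
from typing import Dict


def parse_fen_board(fen: str) -> Dict[str, str]:
    """Map occupied squares (e.g. 'c4') to piece letters; uppercase is White."""
    board: Dict[str, str] = {}
    for rank_index, row in enumerate(fen.split()[0].split("/")):
        file_index = 0
        for ch in row:
            if ch.isdigit():
                file_index += int(ch)  # a run of empty squares
            else:
                board["abcdefgh"[file_index] + str(8 - rank_index)] = ch
                file_index += 1
    return board


def bishop_attacks(board: Dict[str, str], origin: str, target: str) -> bool:
    """True if a bishop stands on `origin` and has an open diagonal to `target`."""
    if board.get(origin, "").lower() != "b":
        return False  # the claimed bishop is not on that square
    file_delta = ord(target[0]) - ord(origin[0])
    rank_delta = int(target[1]) - int(origin[1])
    if file_delta == 0 or abs(file_delta) != abs(rank_delta):
        return False  # the two squares do not share a diagonal
    file_step = 1 if file_delta > 0 else -1
    rank_step = 1 if rank_delta > 0 else -1
    file_code = ord(origin[0]) + file_step
    rank = int(origin[1]) + rank_step
    while chr(file_code) + str(rank) != target:
        if chr(file_code) + str(rank) in board:
            return False  # a piece blocks the diagonal before the target
        file_code += file_step
        rank += rank_step
    return True


# Position after 1.e4 e5 2.Nf3 Nc6 3.Bc4
fen = "r1bqkbnr/pppp1ppp/2n5/4p3/2B1P3/5N2/PPPP1PPP/RNBQK2R b KQkq - 3 3"
board = parse_fen_board(fen)
print(bishop_attacks(board, "c4", "f7"))  # True: d5 and e6 are empty
print(bishop_attacks(board, "c4", "h7"))  # False: h7 is not on a c4 diagonal
print(bishop_attacks(board, "c4", "g8"))  # False: the pawn on f7 blocks the line
```

A claim that returns False is exactly the kind that would be rewritten or dropped before it reaches you.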
This is the layer most "AI coaches" don't have. It's also the reason you can trust the output enough to act on it during a game.
4. Targeted training, inline
A coaching response that just explains your mistake is half the loop. The other half is doing more reps on positions like the one you just got wrong.
So three puzzles render directly inside the coaching message: same chat bubble, same chess board, no tab switch. They're pulled from a Neo4j graph of 100,000+ Lichess puzzles filtered to popularity ≥ 60, plays ≥ 50, rating deviation ≤ 120. Retrieval is a graph traversal (your skill band × the relevant tactical theme), followed by a cosine-similarity re-ranking of 49-dimensional FEN feature vectors against the position you just lost. You train on the geometry of your specific mistake, not a generic "back-rank tactics" bucket.
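The filter-then-re-rank step can be sketched in a few lines. The `Puzzle` shape and function names are hypothetical, the graph traversal is assumed to have already produced `candidates`, and the toy vectors have 3 dimensions rather than 49 because how the real features are derived from a FEN is not described here.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Puzzle:
    puzzle_id: str
    popularity: int
    plays: int
    rating_deviation: int
    features: List[float]  # hypothetical position feature vector


def passes_quality_filter(p: Puzzle) -> bool:
    # Thresholds quoted in the text: popularity >= 60, plays >= 50, deviation <= 120.
    return p.popularity >= 60 and p.plays >= 50 and p.rating_deviation <= 120


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rerank(candidates: List[Puzzle], target: List[float], k: int = 3) -> List[Puzzle]:
    """Keep puzzles that pass the filter, most similar to the target first."""
    eligible = [p for p in candidates if passes_quality_filter(p)]
    eligible.sort(key=lambda p: cosine_similarity(p.features, target), reverse=True)
    return eligible[:k]


lost_position = [1.0, 0.0, 1.0]
candidates = [
    Puzzle("far", 80, 200, 90, [0.0, 1.0, 0.0]),
    Puzzle("close", 90, 500, 80, [1.0, 0.0, 1.0]),
    Puzzle("unpopular", 40, 500, 80, [1.0, 0.0, 1.0]),  # fails popularity >= 60
    Puzzle("partial", 70, 100, 100, [1.0, 1.0, 0.0]),
]
print([p.puzzle_id for p in rerank(candidates, lost_position)])
# ['close', 'partial', 'far']
```

Cosine similarity compares direction rather than magnitude, which suits a "same shape of position" notion of closeness.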
Solve them, the SM-2 spaced-repetition scheduler files them away, and the next time a similar shape comes up the loop is shorter.
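For readers who haven't met SM-2, here is a minimal sketch of one common reading of the algorithm. The `CardState` name is hypothetical, and how a puzzle attempt maps to the 0-5 quality grade is an assumption left to the caller. Implementations differ on details such as whether the ease factor changes after a failed review; this version updates it.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 "E-Factor", never allowed below 1.3


def sm2_review(state: CardState, quality: int) -> CardState:
    """Apply one review; quality runs from 0 (blackout) to 5 (perfect recall)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        repetitions, interval = 0, 1  # failed: restart the repetition sequence
    elif state.repetitions == 0:
        repetitions, interval = 1, 1
    elif state.repetitions == 1:
        repetitions, interval = 2, 6
    else:
        repetitions = state.repetitions + 1
        interval = round(state.interval_days * state.ease)
    miss = 5 - quality
    ease = state.ease + (0.1 - miss * (0.08 + miss * 0.02))
    return CardState(repetitions, interval, max(1.3, ease))


state = CardState()
for grade in (5, 5, 5):
    state = sm2_review(state, grade)
    print(state.interval_days)  # 1, then 6, then 16
```

Successful solves push a puzzle further into the future; a miss brings it back the next day.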
5. The rest of the surface
- Twin Bot runs on Maia-2 (NeurIPS 2024), a neural network trained to predict human moves at a target Elo. It can optionally be seeded with a public Lichess player's opening repertoire.
- Live play uses Lichess OAuth 2.0 PKCE with dual-SSE streams.
- Opponent scouting ingests a Lichess or Chess.com username and returns opening trees, repertoire collisions, a Stalker Score exploitability index, tilt and timeout psychology profiles, and a shareable SVG card.
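On the live-play item: the PKCE part of an OAuth 2.0 login comes down to a random verifier and its SHA-256 challenge (the S256 method from RFC 7636). The sketch below shows only that pair; the function names are hypothetical, and the authorization redirect and token exchange are omitted.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random URL-safe verifier; 32 random bytes encode to 43 characters."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()          # kept by the client
challenge = make_code_challenge(verifier)  # sent with the authorization request
print(len(verifier), len(challenge))  # 43 43
```

The client sends the challenge up front and reveals the verifier only when exchanging the authorization code, so an intercepted code is useless on its own.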