Architecture

Chess Masti AI is built around one mental model: every chess fact in a coaching response must be derivable from an authoritative source, not from the LLM's parameters. This page walks through the pieces that enforce that.

Application layer

Next.js 15 (App Router for API routes and new content surfaces, Pages Router for legacy interactive pages), written in TypeScript and deployed on Vercel, with Sentry for error monitoring. Auth is a signed JWT in an httpOnly cookie (cm_session); email/password sign-in stores bcrypt password hashes; Google OAuth is server-routed through chessmasti.com so the browser never hits Firebase domains directly.

Persistence is split across three layers:

  • Firestore (server-side, via Firebase Admin SDK) for user accounts, saved games, and the openings repertoire.
  • IndexedDB (browser, via idb) for puzzle progress and SM-2 spaced-repetition state.
  • Neo4j Aura for the puzzle graph (described below).

Engine layer (in-browser)

Stockfish 17 ships as a WebAssembly worker. It evaluates positions on your machine, not on our servers, which means no rate limit, no quota, and no privacy concern about your game data leaving the browser for analysis. We layer three structured passes on top of raw eval output:

  1. Tactical motif detection — pins, forks, skewers, discovered attacks, back-rank patterns, deflections.
  2. Candidate-move gap analysis — how large was the eval gap between the move played and Stockfish's preferred line, and was the preferred line forcing or quiet?
  3. Branch-point analysis — was this move the position's decision pivot?

The LLM is never given the raw board. It receives a structured digest from this layer.
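
As an illustration, the digest could be a small typed record built from engine output. The sketch below is an assumption about its shape, not the project's actual schema: every type, field name, and the buildDigest helper are hypothetical, and evaluations are taken to be centipawns from the mover's point of view.

```typescript
// Hypothetical digest shape; all names here are illustrative assumptions.
type Motif =
  | "pin" | "fork" | "skewer"
  | "discovered-attack" | "back-rank" | "deflection";

interface EngineLine {
  san: string;      // move in SAN
  evalCp: number;   // evaluation in centipawns, from the mover's point of view
  forcing: boolean; // whether the line is forcing rather than quiet
}

interface MoveDigest {
  ply: number;
  played: string;
  preferred: string;
  evalGapCp: number;           // how much worse the played move is than the preferred one
  preferredIsForcing: boolean;
  motifs: Motif[];
  isBranchPoint: boolean;
}

// Combine the three passes into one record for the LLM prompt.
function buildDigest(
  ply: number,
  played: EngineLine,
  best: EngineLine,
  motifs: Motif[],
  isBranchPoint: boolean,
): MoveDigest {
  return {
    ply,
    played: played.san,
    preferred: best.san,
    // Clamp at zero so a move that matches or beats the preferred line shows no gap.
    evalGapCp: Math.max(0, best.evalCp - played.evalCp),
    preferredIsForcing: best.forcing,
    motifs,
    isBranchPoint,
  };
}
```

The point of a record like this is that the LLM only ever sees pre-computed facts (gap size, motif labels, branch-point flag) and never has to read a board on its own.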

LLM layer (server-side)

Two tiers, selected by latency budget:

  • Flagship: Claude Sonnet. Deep multi-section game analysis. Used by /api/enhanced-analysis.
  • Fast: Claude Haiku. Sub-5-second follow-up chat. Used by /api/chat. The flagship analysis call writes a server-side context cache keyed by contextId; subsequent fast-tier calls hit that cache instead of reconstructing from scratch.
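
The cache hand-off between the two tiers can be pictured with a keyed store. This is a minimal sketch under stated assumptions: the in-memory Map stands in for whatever server-side store is actually used, and AnalysisContext, saveContext, and loadContext are hypothetical names.

```typescript
// Hypothetical cached payload; the real contents are not specified here.
interface AnalysisContext {
  fen: string;    // position the analysis refers to
  digest: string; // serialized analysis summary reused by follow-up chat
}

// In-memory stand-in for the server-side context cache.
const contextCache = new Map<string, AnalysisContext>();

// Called by the flagship analysis path after it finishes.
function saveContext(contextId: string, ctx: AnalysisContext): void {
  contextCache.set(contextId, ctx);
}

// Called by the fast chat path; null means the caller has no cached context.
function loadContext(contextId: string): AnalysisContext | null {
  return contextCache.get(contextId) ?? null;
}
```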

Every server-side LLM call funnels through a single callLLM(tier) function. Callsites pass a tier ("flagship" | "fast"), never a model name — that boundary is what makes model upgrades safe.
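
A sketch of that boundary, assuming a signature richer than the one named above: the prompt and transport parameters, the resolveModel helper, and the placeholder model identifiers are all hypothetical. The transport is injected so the example runs without a network call or SDK.

```typescript
type Tier = "flagship" | "fast";

// Hypothetical tier-to-model table. The identifiers are placeholders, not real
// model names; upgrading a model means editing only this table.
const MODEL_FOR_TIER: Record<Tier, string> = {
  flagship: "flagship-model-id-placeholder",
  fast: "fast-model-id-placeholder",
};

function resolveModel(tier: Tier): string {
  return MODEL_FOR_TIER[tier];
}

// Assumed transport signature: sends a prompt to a named model.
type Transport = (model: string, prompt: string) => Promise<string>;

// Callsites pass a tier; only this function ever sees a model name.
async function callLLM(tier: Tier, prompt: string, send: Transport): Promise<string> {
  return send(resolveModel(tier), prompt);
}
```

Because Tier is a closed union, a callsite that tries to pass a model name instead of a tier fails at compile time.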

The system prompt is centralized and versioned (PROMPT_VERSION), so before/after evals on coaching quality are reproducible.

Validator layer

This is the piece that most "AI chess coach" products skip. After the LLM returns and before the response renders, a validator:

  1. Parses the response for every piece reference (e.g. "the rook on f1"), square reference (e.g. "control of d4"), and SAN move (e.g. "11. d5!").
  2. Loads the live chess.js board state for the position the user is asking about.
  3. Checks each claim. Is there a rook on f1? Is d5 a legal move in this position? Does the bishop on c4 actually attack h7?
  4. Discards or rewrites claims that don't check out.

The validator runs unconditionally. It is the reason coaching output is safe to act on during a game.
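
To make the piece-reference check concrete, here is a minimal sketch that covers only claims of the form "the rook on f1". It reads the board directly from a FEN string as a stand-in for the chess.js lookup, so that the example is self-contained. Move legality and attack checks are not covered, and pieceAt and checkPieceClaims are hypothetical names.

```typescript
// Return the FEN piece letter on a square (uppercase = White), or null if empty.
function pieceAt(fen: string, square: string): string | null {
  const ranks = (fen.split(" ")[0] ?? "").split("/"); // rank 8 comes first in FEN
  const file = square.charCodeAt(0) - "a".charCodeAt(0);
  const row = ranks[8 - Number(square[1])] ?? "";
  let col = 0;
  for (let i = 0; i < row.length; i++) {
    const ch = row.charAt(i);
    if (ch >= "1" && ch <= "8") {
      col += Number(ch);               // a digit is a run of empty squares
      if (col > file) return null;     // target square lies inside that run
    } else {
      if (col === file) return ch;
      col += 1;
    }
  }
  return null;
}

const PIECE_LETTER: Record<string, string> = {
  king: "k", queen: "q", rook: "r", bishop: "b", knight: "n", pawn: "p",
};

interface PieceClaim { piece: string; square: string; ok: boolean }

// Extract "<piece> on <square>" claims and check each against the position.
// Colour is ignored because the phrase itself does not state one.
function checkPieceClaims(text: string, fen: string): PieceClaim[] {
  const re = /\b(king|queen|rook|bishop|knight|pawn) on ([a-h][1-8])\b/gi;
  const claims: PieceClaim[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const piece = (m[1] ?? "").toLowerCase();
    const square = (m[2] ?? "").toLowerCase();
    const actual = pieceAt(fen, square);
    claims.push({
      piece,
      square,
      ok: actual !== null && actual.toLowerCase() === PIECE_LETTER[piece],
    });
  }
  return claims;
}
```

In the starting position, "the rook on f1" fails this check because f1 holds a bishop; a claim flagged this way is what step 4 would discard or rewrite.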

Maia-2 microservice

Twin Bot uses Maia-2 (NeurIPS 2024), a neural network trained on Lichess data to predict human moves at a target Elo. We don't run the model in our application backend — it lives as a FastAPI/PyTorch microservice on Hugging Face Spaces, called via an API proxy from Vercel. A daily cron pings the Hugging Face endpoint to keep it warm. Twin Bot can optionally be seeded with a public Lichess username's opening tree so the bot mirrors a specific player's repertoire as well as their target rating.

Retrieval layer (Neo4j)

The puzzle graph in Neo4j Aura holds 100,000+ Lichess puzzles, filtered by quality (popularity ≥ 60, plays ≥ 50, rating deviation ≤ 120) and structured across 46 tactical themes and 4 difficulty bands.
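
Expressed as a predicate, the quality filter is three threshold checks. The thresholds come from the figures above; the PuzzleRow shape and its field names are assumptions made for illustration.

```typescript
// Hypothetical row shape for a puzzle before it is loaded into the graph.
interface PuzzleRow {
  id: string;
  popularity: number;
  plays: number;
  ratingDeviation: number;
}

// Keep only well-played, well-liked puzzles with a stable rating.
function passesQualityFilter(p: PuzzleRow): boolean {
  return p.popularity >= 60 && p.plays >= 50 && p.ratingDeviation <= 120;
}
```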

Recommendation is a two-stage process:

  1. Graph traversal. Given the user's skill band, the tactical theme of their mistake, and their recent solve history, traverse the graph to find candidate puzzles in the right band and theme.
  2. 49-dimensional FEN cosine-similarity re-ranking. Project the FEN of the position the user just lost into a 49-dimensional vector, project each candidate puzzle's start FEN into the same space, rank by cosine similarity, return the top 3.
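
The re-ranking stage can be sketched as follows, assuming the vectors have already been computed. How a FEN is projected into the 49-dimensional space is not described here, so the sketch takes plain number arrays of any equal length; Candidate and rerank are hypothetical names.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), defined as 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector lengths differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

interface Candidate {
  id: string;
  vector: number[]; // projection of the puzzle's start FEN
}

// Score every candidate from the graph traversal against the query position
// and keep the k most similar (k = 3 matches the "top 3" described above).
function rerank(query: number[], candidates: Candidate[], k: number = 3): Candidate[] {
  return candidates
    .map((c) => ({ c, score: cosineSimilarity(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.c);
}
```

Cosine similarity compares direction rather than magnitude, so two positions whose feature vectors differ only in overall scale still rank as close matches.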

The output: training that matches the geometry of your mistake, not just its category.

Live play

Lichess OAuth 2.0 PKCE for sign-in. Dual-SSE streams (one for the game, one for board events) mirror the Lichess live game into the app, so you can analyse, chat with the coach, and play in the same surface.

Things we don't do

No paid tier. No advertising. No selling game data. No fine-tuned proprietary chess LLM (a fine-tuned LLM still hallucinates; the validator is what we trust).