Search engine analysis · 2026-04-27

fr vs pickbrain

Two CLI tools that index your AI coding sessions and call themselves "search". One does keyword matching. The other does meaning matching.

Both tools point at the same corpus — Claude, Codex, and other coding agent transcripts. Both call themselves "search". But only one of them does semantic search. fr is a Tantivy BM25 engine with edit-distance-1 fuzzy fallback — purely lexical. pickbrain is a multi-vector neural retriever built on Stanford's XTR-WARP — actually semantic.

Two different problems wearing the same shape.

fr

fast-resume — Python + Tantivy · keyword search

  Backend: Tantivy — Rust BM25 engine (Lucene-style)
  Storage: Tantivy index directory
  Query: BM25 (boosted ×5) ⊕ fuzzy term query (edit distance 1, prefix mode)
  Filters: agent:, dir:, date: with ! negation and comma-OR
  Adapters: Claude · Codex · Copilot CLI · Copilot VS Code · Crush · OpenCode · OMP · Pi · Vibe
  Model: none — pure lexical
pickbrain

witchcraft — Rust + Candle (XTR-WARP) · semantic + hybrid

  Backend: Witchcraft — clean-room reimplementation of Stanford's XTR-WARP
  Storage: single SQLite file with FTS5 + 4-bit packed residual blobs
  Query: T5 encoder → 128-dim per-token vectors → MaxSim over centroid-pruned candidates → optional FTS5 fusion via RRF
  Filters: --session UUID, programmatic SQL filter
  Adapters: Claude Code · Codex · Slack history
  Model: Google DeepMind's XTR — T5 encoder, GGUF-quantized; Metal/OpenVINO/fbgemm backends
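
That "4-bit packed residual blobs" line is the WARP-style storage trick: each token vector is stored as a centroid ID plus a coarsely quantized residual. A minimal numpy sketch of the idea (the symmetric scale and int8 staging are illustrative assumptions, not witchcraft's actual on-disk layout):

```python
import numpy as np

def quantize_residual(vec, centroid):
    """Encode vec as a 4-bit residual around its nearest centroid."""
    residual = vec - centroid
    scale = max(float(np.abs(residual).max()) / 7.0, 1e-12)  # map to [-7, 7]
    codes = np.clip(np.round(residual / scale), -8, 7).astype(np.int8)
    return codes, scale  # on disk, two 4-bit codes would be packed per byte

def dequantize(codes, scale, centroid):
    return centroid + codes.astype(np.float32) * scale

rng = np.random.default_rng(0)
vec = rng.normal(size=128).astype(np.float32)
centroid = vec + rng.normal(scale=0.1, size=128).astype(np.float32)  # nearby k-means centroid
codes, scale = quantize_residual(vec, centroid)
print(float(np.abs(vec - dequantize(codes, scale, centroid)).max()))  # small reconstruction error
```

At 128 dims that is 64 bytes of residual plus a centroid ID per token, versus 512 bytes for raw f32, which is the kind of compression that lets a whole transcript corpus live in one SQLite file.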

What happens between query and result.

fr · query path

01 · parse_query()
  Strip agent:, dir:, and date: keywords from the input; the remainder is free text, fed to two branches joined by boolean OR (both lexical):
  • BM25 ×5: Tantivy parse_query over title + content, boosted
  • Fuzzy term: edit distance ≤ 1, prefix=true, per token
02 · Apply filters
  Term-set / regex / range queries on the agent, dir, and timestamp fields.
03 · Sort & return
  Auto: empty query → newest, otherwise relevance. Or explicit newest / oldest / relevance.
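
The BM25 branch is classic Okapi scoring over term statistics. A self-contained sketch of the formula (k1 = 1.2 and b = 0.75 are the conventional defaults; Tantivy's real implementation is Rust, this is only the math):

```python
import math
from collections import Counter

def bm25(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Classic Okapi BM25 of one document against a bag of query terms."""
    n, avgdl = len(corpus), sum(len(d) for d in corpus) / len(corpus)
    tf = Counter(doc_tokens)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        f = tf[term]
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc_tokens) / avgdl))
    return score

corpus = [d.split() for d in ("resolved JWT validator bypass",
                              "refactored the Zustand store")]
print(bm25("auth middleware fix".split(), corpus[0], corpus))  # 0.0: no shared tokens
print(bm25("JWT validator".split(), corpus[0], corpus))        # > 0: lexical overlap
```

The zero on the first line is the vocabulary-mismatch story in miniature: the fuzzy branch rescues near-spellings at edit distance 1, never paraphrases.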

pickbrain · query path

01 · embedder.embed(q)
  Tokenize → T5 encoder forward pass → drop low-norm tokens (norm < 1.0) → L2-normalize → matrix [n_tokens × 128]. The embedded query then feeds two branches:
  • match_centroids(): top-32 centroids per query token; ≤ 40k candidates; decompress 4-bit residuals; MaxSim aggregate
  • fulltext_search(): SQLite FTS5 BM25 (hybrid mode only)
  ⊕ fused via Reciprocal Rank Fusion (k = 60)
02 · Apply SQL filter
  Optional rowid filter via a temporary-table join.
03 · Hydrate & return
  Score-sorted document chunks with a sub-index for span highlighting.
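
The scoring core is compact: for each query token, take its best dot product against the chunk's token vectors, then sum those maxima. A minimal numpy rendering of MaxSim (witchcraft runs the same computation over centroid-pruned, dequantized candidates rather than full matrices):

```python
import numpy as np

def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: the best doc-token match per query token, summed.

    Both matrices are row-wise L2-normalized, so dot product = cosine.
    """
    sim = query_vecs @ doc_vecs.T        # [n_query_tokens, n_doc_tokens]
    return float(sim.max(axis=1).sum())  # best match per query token, summed

rng = np.random.default_rng(2)
q = rng.normal(size=(4, 128));   q /= np.linalg.norm(q, axis=1, keepdims=True)
d = rng.normal(size=(300, 128)); d /= np.linalg.norm(d, axis=1, keepdims=True)
print(maxsim(q, d))  # one scalar per (query, chunk) pair; chunks are ranked by it
```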

The detail underneath each box.

Search type
  fr: lexical BM25 + 1-edit fuzzy
  pickbrain: semantic XTR multi-vector + optional BM25
Embedding model
  fr: none
  pickbrain: XTR (Google DeepMind), T5 encoder, 128-dim per token
Document representation
  fr: tokenized text via Tantivy's default tokenizer; raw on filter fields
  pickbrain: matrix of token embeddings per chunk; 4-bit packed residuals around k-means centroids
Index structure
  fr: Tantivy inverted index (Lucene-style postings)
  pickbrain: LSM-tree of centroid generations · L0=1024, fanout=16
Scoring algorithm
  fr: BM25 from Tantivy + Jaro-Winkler post-rerank for sub-chunks
  pickbrain: per-token MaxSim (à la ColBERT/XTR), with missing-token imputation for pruned candidates
Hybrid mode
  fr: same engine; "hybrid" = exact ⊕ fuzzy, both branches still lexical
  pickbrain: two engines fused via RRF: semantic + SQLite FTS5 BM25
Long-doc strategy
  fr: whole content field, BM25 length-normalized
  pickbrain: sliding window (2048 tokens, 256 stride); overlapping embeddings summed before normalizing (see the sketch after this table)
Typo tolerance
  fr: yes (fuzzy distance=1, prefix=true)
  pickbrain: yes (via subword tokenization)
Synonym matching
  fr: no
  pickbrain: yes (embedding similarity is the synonym mechanism)
Concept matching
  fr: no
  pickbrain: yes (token-level late interaction)
Filter syntax
  fr: agent:claude,!codex dir:proj date:<1d (rich, native)
  pickbrain: --session UUID + SQL filter (sparse, programmatic)
Build cost
  fr: low (no model, no GPU)
  pickbrain: one T5 encoder pass per chunk (Metal/OpenVINO accelerates)
Query latency
  fr: sub-millisecond Tantivy lookup
  pickbrain: ~21 ms p95 on NFCorpus (M2 Max), end-to-end incl. encoder
Quality benchmark
  fr: not measured against IR benchmarks
  pickbrain: ~0.33 NDCG@10 on NFCorpus (matches reference XTR-WARP)
Source corpora
  fr: 9 coding agents
  pickbrain: Claude Code, Codex, Slack history
UI
  fr: Textual TUI with live indexing & preview
  pickbrain: plain CLI; piped output
Operational footprint
  fr: tiny (pip install, runs anywhere)
  pickbrain: ~250 MB T5 weights · Cargo + Make build · GPU helpful
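
For the long-doc row above: a sketch of the sliding-window bookkeeping, assuming a per-token encoder pass is available as encode(), a hypothetical stand-in for the T5 forward pass:

```python
import numpy as np

def embed_long(tokens, encode, window=2048, stride=256, dim=128):
    """Overlapping-window embedding: tokens covered by several windows get
    their vectors summed, then every row is L2-normalized (as in the table)."""
    acc = np.zeros((len(tokens), dim), dtype=np.float32)
    for start in range(0, len(tokens), stride):
        piece = tokens[start:start + window]
        acc[start:start + len(piece)] += encode(piece)  # [len(piece), dim]
        if start + window >= len(tokens):
            break  # tail fully covered
    return acc / np.maximum(np.linalg.norm(acc, axis=1, keepdims=True), 1e-9)

# toy encoder: random vectors stand in for the T5 pass
rng = np.random.default_rng(1)
encode = lambda piece: rng.normal(size=(len(piece), 128)).astype(np.float32)
print(embed_long(list(range(5000)), encode).shape)  # (5000, 128)
```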

Same query, different results.

Each row pairs a query you might type while looking for a past session with the wording the transcript actually contains. Where the query and the transcript share no common tokens, only embedding-space similarity can recover the match.

"auth middleware fix" → "resolved JWT validator bypass"
  fr: miss (no shared tokens) · pickbrain: hit (embedding similarity)
"react state management" → "refactored the Zustand store"
  fr: miss · pickbrain: hit
"off-by-one" → "the loop terminated one iteration early"
  fr: miss · pickbrain: hit
"that thing about caching" → "memoized the embedder output across calls"
  fr: miss · pickbrain: hit
"auth midware" → "authentication middleware"
  fr: hit (fuzzy, edit distance 1) · pickbrain: hit (subword tokens)
"getUserById" → "defined getUserById in user.service.ts"
  fr: hit (exact term, BM25 is fast) · pickbrain: hit (via the hybrid FTS5 path)
"agent:claude dir:foo api auth" → filter syntax, not transcript wording
  fr: native (parse_query handles the keywords) · pickbrain: partial (SQL filter only, no DSL)
(empty query) → browse mode, list everything
  fr: native (newest-first sort) · pickbrain: n/a (the embedder needs ≥ 4 chars)
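
On the fr side, the miss/hit column reduces to a set intersection. A quick check over rows from this table (ignoring fr's fuzzy branch, which additionally rescues one-typo forms like "auth midware"):

```python
pairs = [
    ("auth middleware fix", "resolved JWT validator bypass"),
    ("react state management", "refactored the Zustand store"),
    ("getUserById", "defined getUserById in user.service.ts"),
]
for query, transcript in pairs:
    shared = set(query.lower().split()) & set(transcript.lower().split())
    print(f"{query!r}: {'hit' if shared else 'miss'} {sorted(shared)}")
# 'auth middleware fix': miss []
# 'react state management': miss []
# 'getUserById': hit ['getuserbyid']
```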

Reach for which one, when.

fr when you remember the literal words
  • You remember an exact identifier — getUserById, RUST_BACKTRACE, a CLI flag.
  • You want to slice by metadata. The agent:claude,!codex dir:proj date:<1d grammar is fluent and has no equivalent on the pickbrain side (a toy parser for it follows this list).
  • You want broad agent coverage — nine adapters out of the box.
  • You want to scroll, skim, and live-preview. The Textual TUI is built for that.
  • You don't want to set up a Rust toolchain, download model weights, or use any GPU.
  • You don't want false positives from semantic drift cluttering the result list.
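
A toy parser for that filter grammar, to make the negation and comma-OR semantics concrete (the function and names are illustrative, not fr's actual source):

```python
import re

FILTER_KEYS = ("agent", "dir", "date")

def parse_query(raw):
    """Split 'agent:claude,!codex dir:proj auth' into filters + free text."""
    filters, free = {}, []
    for token in raw.split():
        m = re.match(rf"({'|'.join(FILTER_KEYS)}):(\S+)", token)
        if m:
            key, spec = m.groups()
            # comma joins OR-alternatives; a leading ! negates that value
            filters[key] = [(v.lstrip("!"), v.startswith("!")) for v in spec.split(",")]
        else:
            free.append(token)
    return filters, " ".join(free)

print(parse_query("agent:claude,!codex dir:proj date:<1d auth midware"))
# ({'agent': [('claude', False), ('codex', True)],
#   'dir': [('proj', False)],
#   'date': [('<1d', False)]}, 'auth midware')
```
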
pickbrain when you remember the meaning
  • You phrase the query conceptually — "the auth thing", "where I fixed the caching bug", "the React state refactor".
  • The transcript may use a different vocabulary than you do now (synonyms, paraphrases, alternate framings).
  • You want a single ranked list combining semantic + BM25 evidence. RRF hybrid mode delivers that natively.
  • You're calling search from inside Claude Code or Codex as a skill — programmatic, not interactive.
  • You want token-level highlighting that points at the matching span inside a long chunk.
  • You're OK paying ~21 ms per query and ~250 MB of model weights on disk.

Different tools for different recalls.

Calling fr a "semantic search" engine overstates it — it's a polished BM25-with-fuzzy index over a wide range of agent transcripts, with the richer filter syntax of the two and the only TUI. What it cannot do is bridge vocabulary mismatch: if your query and the transcript share no tokens, fr never finds the match.

pickbrain, riding on Witchcraft's XTR-WARP reimplementation, does actual neural multi-vector retrieval. MaxSim aggregation over per-token embeddings is the architecture that gives ColBERT and XTR their strong scores on IR benchmarks; WARP centroid pruning + 4-bit residual storage keeps it under 25 ms end-to-end on a laptop. Hybrid mode adds BM25 evidence via RRF, so you don't lose lexical precision for exact identifiers.
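
Reciprocal Rank Fusion itself is a one-liner per list: with the quoted k = 60, the fused score of a document is the sum of 1/(60 + rank) over every ranked list it appears in. A sketch:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum of 1 / (k + rank(d)) per list."""
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["c17", "c04", "c99"]      # MaxSim order
lexical  = ["c04", "c88", "c17"]      # FTS5 BM25 order
print(rrf_fuse([semantic, lexical]))  # ['c04', 'c17', 'c88', 'c99']
```

Because only ranks matter, RRF never has to reconcile MaxSim scores with BM25 scores on a shared scale, which is exactly why it works as glue between two unrelated engines.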

Recommendation: if you want the best semantic search across AI sessions specifically, pickbrain isn't merely better — it's the only contender. Use fr for fast scrolling, exact-identifier lookups, and metadata slicing across all your coding agents. The two tools answer different questions over the same corpus.