Why most languages just work
Recall - language-agnostic
The embedder (paraphrase-multilingual-MiniLM-L12-v2, 50+ languages) maps
same-meaning text to nearby vectors across languages. A Russian query finds
an English fact, no translation.

Intents & extraction - English anchors (warm)
Messages are classified against English semantic anchors; the embedder projects
your language next to them, so “was sind meine Ziele” lands on the same
goals_query anchor as “what are my open goals”.

Cold lexical path - self-compiling (ALD)
As the warm tier classifies your messages, it logs which n-grams co-fired
with which anchor and distils the high-precision ones into
$PMB_HOME/lang/auto.yaml. Your common phrasings then classify cold, at
regex speed, using rules generated from your own traffic rather than from a pack.
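
The distillation step described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the function names (ngrams, distil, compile_rules, classify_cold), the min_count and min_precision thresholds, and the schedule_query anchor are all hypothetical, and the rules are kept in memory because the layout of auto.yaml is not described here. The warm tier is represented only by its log of (message, anchor) pairs.

```python
import re
from collections import Counter, defaultdict


def ngrams(text, n_max=3):
    """Yield word n-grams (1..n_max words) from lowercased text."""
    words = re.findall(r"\w+", text.lower())
    for n in range(1, n_max + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def distil(log, min_count=2, min_precision=1.0):
    """Keep n-grams that were seen often enough and fired (almost) only with one anchor.

    `log` is a list of (message, anchor) pairs, as a warm tier might record them.
    Both thresholds are assumptions chosen for this example.
    """
    total = Counter()                 # n-gram -> number of messages containing it
    by_anchor = defaultdict(Counter)  # n-gram -> anchor -> number of messages
    for message, anchor in log:
        for gram in set(ngrams(message)):  # count each n-gram once per message
            total[gram] += 1
            by_anchor[gram][anchor] += 1

    rules = defaultdict(list)
    for gram, count in total.items():
        anchor, hits = by_anchor[gram].most_common(1)[0]
        if count >= min_count and hits / count >= min_precision:
            rules[anchor].append(gram)
    return {anchor: sorted(grams) for anchor, grams in rules.items()}


def compile_rules(rules):
    """Compile each anchor's n-grams into a single word-bounded regex."""
    return {
        anchor: re.compile(r"\b(?:" + "|".join(re.escape(g) for g in grams) + r")\b")
        for anchor, grams in rules.items()
    }


def classify_cold(text, compiled):
    """Return the first anchor whose pattern matches, or None to fall back to the warm tier."""
    normalized = " ".join(re.findall(r"\w+", text.lower()))
    for anchor, pattern in compiled.items():
        if pattern.search(normalized):
            return anchor
    return None


# Hypothetical warm-tier log: messages with the anchor each one was classified to.
log = [
    ("was sind meine Ziele", "goals_query"),
    ("zeig mir meine Ziele", "goals_query"),
    ("was sind meine Termine", "schedule_query"),
    ("zeig mir meine Termine", "schedule_query"),
]
compiled = compile_rules(distil(log))
```

With this log, n-grams such as "meine" or "was sind" appear under both anchors and are dropped, while "ziele" and "meine ziele" survive for goals_query. A later "Was sind meine Ziele?" then matches by regex alone, and a phrasing never seen before returns None, which is where the warm tier would take over.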