— The embedding model —
The demo uses the MiniLM-L6-v2 embedding model https://lnkd.in/eCwAzH_h, deployed on a self-hosted Weaviate Kubernetes cluster.
This model is widely considered to offer the best speed-to-quality trade-off among sentence-transformer models.
— Bigger models are better —
Notice that much larger models (likely requiring GPUs) now top the MTEB leaderboard for the retrieval task https://lnkd.in/efmNJyTP
— Indexing time —
Also notice that indexing is quite slow (around one document per second) on a single :) CPU.
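Throughput like this is easy to measure yourself. A minimal sketch of the timing loop, using a hypothetical stand-in `embed` function (a real run would call the actual model, e.g. MiniLM-L6-v2, instead):

```python
import time

def embed(text):
    # Hypothetical stand-in for a real embedding call
    # (a real setup would invoke the MiniLM-L6-v2 model here).
    return [float(ord(c) % 7) for c in text[:384]]

docs = ["a comfy mattress", "a wooden desk", "a reading lamp"] * 100

start = time.perf_counter()
vectors = [embed(d) for d in docs]
elapsed = time.perf_counter() - start

docs_per_sec = len(docs) / elapsed
print(f"{docs_per_sec:.0f} docs/sec")
```

With a real model on a single CPU, the same loop is what exposes the ~1 doc/sec bottleneck.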
— Quality —
Quality looks inferior to the same demo with PaLM 2, OpenAI, or Cohere embeddings.
For instance, check where a mattress ranks for the query “something to sleep on”:
– MiniLM-L6-v2 (not on the first page!): https://lnkd.in/eUQnVBXV
– OpenAI (1st position): https://lnkd.in/eVdYpC-P
– PaLM2 (1st position): https://lnkd.in/e4FFVcUj
– Cohere (2nd position): https://lnkd.in/eb3yCw-C
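The ranking differences above come down to how close the query vector lands to each product vector, typically measured by cosine similarity. A minimal sketch with toy 3-dimensional vectors (the numbers are made up for illustration; real MiniLM-L6-v2 vectors are 384-dimensional):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings (illustrative values only, not real model output).
query = [0.9, 0.1, 0.2]  # "something to sleep on"
products = {
    "mattress":  [0.8, 0.2, 0.1],
    "desk lamp": [0.1, 0.9, 0.3],
    "bookshelf": [0.2, 0.3, 0.9],
}

ranked = sorted(products, key=lambda p: cosine(query, products[p]), reverse=True)
print(ranked)  # with these toy vectors, "mattress" comes out on top
```

A stronger embedding model places semantically related texts (“mattress”, “something to sleep on”) closer together, which is exactly what pushes the mattress to position 1 in the OpenAI and PaLM 2 demos.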