Fine-tuned query spell correction without model training?

— Typo corrected results —

If you’re using vector search, you already get some spell tolerance: a query with typos, or even one written phonetically, can still return relevant results.

(If you’re not using vector search, well, you should!)

— Generic typo corrected query —

But sometimes, you’d rather let users choose their preferred typo correction from a list before starting the search.

This is really easy nowadays.

Here are the manual steps:
– Open your GPT-3.5/GPT-4/ChatGPT sandbox
– Prompt the AI with a sentence asking it to fix your query. Something like:

Fix the typo on the following query:
A: plise fixe the tippo
Answers:
A: Please fix the typo

That’s it. But of course, you want to automate corrections in your search, and it remains quite easy.
You just have to call the OpenAI API from your autocomplete Ajax code, and display corrections in a list.
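Here is a minimal Python sketch of the server side of that autocomplete call. It is an illustration under assumptions, not a definitive implementation: `build_prompt`, `parse_corrections` and `suggest_corrections` are hypothetical helper names, and `complete` stands for whatever function sends a prompt to your LLM provider and returns the generated text.

```python
import re
from typing import Callable, List

# Matches a leading list marker such as "1.", "2)", "-" or "*".
_LIST_MARKER = re.compile(r"^\s*(?:[-*]|\d+[.)])\s*")


def build_prompt(query: str, n: int = 10) -> str:
    """Ask the LLM for up to n candidate corrections, one per line."""
    return (
        "Fix the typo on the following query. "
        f"Return up to {n} candidate corrections, one per line.\n"
        f"Query: {query}\n"
        "Corrections:"
    )


def parse_corrections(raw: str, n: int = 10) -> List[str]:
    """Turn the raw LLM text into a clean, de-duplicated list."""
    seen = set()
    corrections: List[str] = []
    for line in raw.splitlines():
        text = _LIST_MARKER.sub("", line).strip()
        key = text.lower()
        if not text or key in seen:
            continue  # skip blank lines and case-insensitive duplicates
        seen.add(key)
        corrections.append(text)
    return corrections[:n]


def suggest_corrections(
    query: str,
    complete: Callable[[str], str],  # assumption: wraps your LLM API call
    n: int = 10,
) -> List[str]:
    """Return the list of corrections to display in the autocomplete box."""
    return parse_corrections(complete(build_prompt(query, n)), n)
```

The Ajax handler would then return the output of `suggest_corrections` as JSON. Keeping the LLM call behind `complete` makes it easy to swap providers or to test without network access.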

— A bit more sophisticated with vector search —

A better way is to use vector search to reorder the corrections based on your database content.
A kind of “fine-tuned typo correction”, with no model training involved.

Here is the idea:
– Retrieve 10 corrections from the LLM
– For each correction, perform a search on the vector database
– Collect the score of each search
– Reorder the 10 corrections by score
– Display the top-scoring corrections
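The steps above can be sketched in a few lines of Python. This is only a sketch under assumptions: `rerank_corrections` is a hypothetical name, `search_scores` stands for your own wrapper around the vector database, and scores are assumed to be similarities (higher is better). If your database reports distances instead, invert them first. Using the best hit as the score of a correction is one possible choice; an average over the top hits would work too.

```python
from typing import Callable, List, Sequence, Tuple


def rerank_corrections(
    corrections: Sequence[str],
    search_scores: Callable[[str], Sequence[float]],  # assumption: your vector search wrapper
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Order LLM corrections by how well they match the database content."""
    scored: List[Tuple[str, float]] = []
    for correction in corrections:
        scores = search_scores(correction)
        # Score a correction by its best hit; no hits at all counts as 0.0.
        best = max(scores, default=0.0)
        scored.append((correction, best))
    # Highest score first; ties keep the order returned by the LLM.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

A correction that is valid English but matches nothing in your content sinks to the bottom of the list, which is what ties the suggestions to your data.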

Generative search does something similar, in the opposite order. For instance, to “fine-tune” a question answering system, you perform a search first, then call an LLM with a prompt containing the results.

To fine-tune a spell correction, you need to call the LLM first, then call the vector database.
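To make the difference in call order concrete, here is a small Python sketch of both pipelines. All names are hypothetical, and `search` and `llm` are assumed to be thin wrappers around your vector database and your LLM provider; the prompts are only examples.

```python
from typing import Callable, List, Tuple

Hit = Tuple[str, float]              # (document text, similarity score)
Search = Callable[[str], List[Hit]]  # assumption: wraps the vector database
Llm = Callable[[str], str]           # assumption: wraps the LLM API


def answer_question(question: str, search: Search, llm: Llm) -> str:
    """Generative search: search first, then call the LLM with the results."""
    context = "\n".join(text for text, _ in search(question))
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")


def rank_corrections(query: str, llm: Llm, search: Search) -> List[str]:
    """Spell correction: call the LLM first, then the vector database."""
    raw = llm(
        "Fix the typo on the following query, one correction per line:\n"
        f"{query}"
    )
    candidates = [line.strip() for line in raw.splitlines() if line.strip()]

    def best_score(candidate: str) -> float:
        # Best similarity among the hits; 0.0 when nothing matches.
        return max((score for _, score in search(candidate)), default=0.0)

    return sorted(candidates, key=best_score, reverse=True)
```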

WPSOLR + Weaviate + Vespa: https://www.wpsolr.com

#spellcorrection #spellchecking #generativeai #wpsolr #weaviate #vespasearch