— Typo corrected results —
If you’re using vector search, you already get some form of spell tolerance. You will still get relevant results, even for a query written phonetically.
(If you’re not using vector search, well, you should!)
— Generic typo corrected query —
But sometimes, you’d rather let the user pick their preferred typo correction from a list before starting the search.
This is really easy nowadays.
Here are the manual steps:
– Open your GPT3.5/GPT4/ChatGPT sandbox
– Prompt the AI with a proper sentence to fix your query. Something like:
That’s it. But of course, you want to automate corrections in your search, and it remains quite easy.
You just have to call the OpenAI API from your autocomplete Ajax code, and display corrections in a list.
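As a minimal sketch of the automated version, the Python below builds a correction prompt, sends it to an LLM, and parses the reply into a list that the autocomplete can display. The prompt wording, the function names `build_prompt` and `suggest_corrections`, and the `complete` callable are all assumptions made for illustration: `complete` stands in for whatever code of yours calls the OpenAI API (or another LLM) and returns the raw text reply.

```python
import re
from typing import Callable, List


def build_prompt(query: str, max_suggestions: int = 5) -> str:
    """Hypothetical prompt wording; adjust it to your own use case."""
    return (
        f"Suggest up to {max_suggestions} spelling corrections for the search "
        f"query below. Reply with one correction per line and nothing else.\n"
        f"Query: {query}"
    )


def suggest_corrections(
    query: str,
    complete: Callable[[str], str],
    max_suggestions: int = 5,
) -> List[str]:
    """Ask the LLM for corrections and return a clean, de-duplicated list.

    `complete` is a placeholder: any function that takes a prompt and returns
    the model's text reply (for example a thin wrapper around your LLM client).
    """
    reply = complete(build_prompt(query, max_suggestions))
    corrections: List[str] = []
    seen = set()
    for line in reply.splitlines():
        # Strip list markers such as "1. ", "- " or "* " and surrounding quotes.
        text = re.sub(r"^\s*(?:\d+[.)]|[-*•])\s*", "", line).strip().strip("\"'")
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            corrections.append(text)
    return corrections[:max_suggestions]
```

Keeping the LLM call behind a plain callable means the parsing can be tested without network access, and the model or provider can be swapped without touching the autocomplete code.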
— A bit more sophisticated with vector search —
A better way would be to use the vector search to reorder the corrections, based on your database content.
Some kind of “fine-tuned typo correction”.
Here is the idea:
– Retrieve 10 corrections from the LLM
– For each correction, perform a search on the vector database
– Collect the score of each search
– Reorder the 10 corrections by their scores
– Display the top scored corrections
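The steps above can be sketched as follows. The code assumes a `search` callable that runs one query against your vector database and returns the relevance scores of its hits, with higher meaning better. That callable, the function name `rerank_corrections`, and the choice to rank each correction by its best hit are illustrative assumptions, not part of any particular database client.

```python
from typing import Callable, List, Sequence, Tuple


def rerank_corrections(
    corrections: Sequence[str],
    search: Callable[[str], Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Order LLM corrections by how well each one matches the indexed content.

    `search` is a placeholder for a vector database query. It is assumed to
    return relevance scores where higher is better; if your database returns
    distances instead, negate them before returning.
    """
    scored = []
    for correction in corrections:
        scores = search(correction)
        # Assumption: a correction is as good as its best hit.
        # A correction with no hits gets the lowest possible score.
        best = max(scores) if scores else float("-inf")
        scored.append((correction, best))
    # The sort is stable, so ties keep the LLM's original order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Note that this runs one vector search per correction, so 10 corrections mean 10 queries for each user input. That may be worth batching or caching in an autocomplete context.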
Generative search does something similar. For instance, to fine-tune a question answering system, you perform a search first, then call an LLM with a prompt containing the results.
To fine-tune a spell correction, you need to call the LLM first, then call the vector database.
WPSOLR + Weaviate + Vespa: https://www.wpsolr.com
#spellcorrection #spellchecking #generativeai #wpsolr #weaviate #vespasearch