News guides for WordPress & WooCommerce

Typo correction was a pure nightmare for decades, but no more thanks to LLMs

1. Just prompt #ChatGPT to fix a query with almost any kind of typo, and it will do it with ease.
Prompt: “”” Fix the typo on the following queries: A: plise fixe the tippo B: veri plised of my last aquisission “””
Answers: “”” A: Please fix the typo B: Very pleased with my last acquisition “””
2. You can also get typo-proof search results with a vector search. Embeddings encode concepts rather than words, which makes them quite insensitive to spelling errors.
3. And last, you can add a nice “Did you mean?” prompt to your vector search, to get typo correction and typo-proof results at the same time.
For more examples of typo-insensitive vector search with WPSOLR + Weaviate + #gpt3: https://www.wpsolr.com
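The prompt format in point 1 is easy to generate and parse programmatically. Below is a minimal sketch: `build_typo_prompt` and `parse_answers` are hypothetical helper names, and the call to the model itself is deliberately left out, since no specific API is described here.

```python
from string import ascii_uppercase


def build_typo_prompt(queries):
    """Build a 'fix the typo' prompt, labelling each query A, B, C, ..."""
    if len(queries) > len(ascii_uppercase):
        raise ValueError("this sketch supports at most 26 queries per prompt")
    lines = ["Fix the typo on the following queries:"]
    for label, query in zip(ascii_uppercase, queries):
        lines.append(f"{label}: {query}")
    return "\n".join(lines)


def parse_answers(answer_text):
    """Parse 'A: ...' lines from the model's answer into a dict keyed by label."""
    fixed = {}
    for line in answer_text.splitlines():
        label, sep, text = line.partition(":")
        label = label.strip()
        # Keep only lines that start with a single uppercase letter label.
        if sep and len(label) == 1 and label in ascii_uppercase:
            fixed[label] = text.strip()
    return fixed


prompt = build_typo_prompt(["plise fixe the tippo", "veri plised of my last aquisission"])
answers = parse_answers("A: Please fix the typo\nB: Very pleased with my last acquisition")
```

The parsed dictionary can then feed a “Did you mean?” suggestion next to the search results.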

Can the Vespa engine be used for WooCommerce search?

The answer is yes, of course. It has all the features required for an e-commerce search, inverted index and vectors included. But being so flexible in architecture and features with configuration files, CLIs and APIs, can Vespa be adapted to a WooCommerce or WordPress plugin? For a small to medium e-commerce site, a search plugin must provide great features, without too many headaches. We cannot ask a site owner or a small agency to master tens to hundreds of concepts. Can we make most of Vespa’s concepts invisible enough, without losing too much flexibility and power? This is what we are currently investigating: how Vespa’s architecture can be simplified to match the features already integrated in our search plugin with #elasticsearch or Weaviate, like: filters, facets, sort, analysers, vectorizers, vector similarity.

A nice example of the need for hybrid BM25/vector search !

Pure generative AI’s vector search or similarity is impressive, but e-commerce often needs less creativity and results that stick more closely to the query. With this example of Alexandru Rada ✅, the question (aka product content here) contains features that should be boosted or filtered. The categories are very important, so here the make and model should be used to boost results. Also, with hybrid similarity on keywords, the boosts might not even have been necessary. Often e-commerce search should give more weight to keywords than to vectors, just for that. Below are several live demos of WooCommerce search powered by Weaviate. Visit: – Hybrid search (50% weight on keywords): – Pure vector search with OpenAI and Cohere – CLIP Text & image search https://lnkd.in/dzucnPtZ #wpsolr #woocommerce

Building a recommender system from ChatGPT?

Recommenders are built on user breadcrumbs: what a user visited, clicked, added to a basket, bought … Could we use #ChatGPT generative power to recommend the next user action based on user’s past actions, in a prompt-based sequence? — 2 examples after a search on strawberries– “Given that someone likes strawberries, what would be 5 other domains of interest for this person?” 1. Gardening 2. Cooking 3. Food Preservation 4. Nutrition 5. Farming “Given that someone likes strawberries, can you build a recipe and produce here a list of 10 other items from the recipe?” Strawberry Shortcake Recipe Ingredients: 1. 2 cups all-purpose flour 2. 1/4 cup sugar 3. 4 teaspoons baking powder 4. 1/2 teaspoon salt 5. 1/2 cup cold butter, cut into small pieces 6.

Generative search with WooCommerce and ChatGPT: the fun recommendation systems killer?

– The problem – Let’s face it, we’re exposed to recommendations all day long. And they are often pretty dull and easy to discard. And not fun at all. – A bad use case – Let’s say I’m looking for strawberries in a retail shop: I can see a long list of different items, more or less related. I can also see some more items recommended because other clients bought them with strawberries. Bo…ring classical scenario !! – A creative and exciting use case – Instead, let’s imagine that 3 recipes with strawberries are displayed alongside the results or suggestions… Or 1 recipe with the first 10 ingredients… Or nutritional information about the items… Or a classification based on their nutriscore… Or a poem…

AutoML to tune a WooCommerce vector search, without labels?

As a developer without real experience with ML, fine-tuning or distilling a SBERT model is a huge leap. We’ve all heard about learning to rank for tuning information retrieval, but it requires lots of labeled data (hundreds to thousands). (Which a small to medium WooCommerce shop owner most probably cannot provide.) Jo Kristian Bergum’s tutorial is a breakthrough for me, because it also provides 3 notebooks to start playing with the following concepts on a WooCommerce shop: 1. Create questions (the labels), from a generative model (T5), and the WooCommerce product titles, descriptions, categories and attributes. 2. Classify questions/results as positive/negative, with a simple search on the actual vector database containing the already indexed WooCommerce products (embeddings already built with a SBERT bi-encoder model) 3. Train a large cross-encoder (“distillation”) from the positive/negative questions/results,

2023 will be the year of E-commerce faceted, filtered, vector search !

E-Commerce search is built from keywords, filters (for stock availability for instance), and aggregations (for attribute facets). Keyword search is usually tuned with analysers (stemming, lemmatization, N-Grams…). But LLMs and vector search are now mature enough to replace this keyword tuning with semantics. Therefore, e-commerce search will either be: – An inverted search augmented with LLM embeddings and vector search – A vector search augmented with filters and aggregations And we indeed now see a convergence of Lucene and vector engines: the former starting to use vectors, the latter starting to use filters/aggregations. You can install and compare Elasticsearch, Solr, Algolia and Weaviate on your own WooCommerce: https://wpsolr.com #wpsolr #woocommerce #weaviate #solr #elasticsearch

This is why backing LLMs like ChatGPT with one or several backend services is not optional in production

LLMs give you the flexibility of natural languages to express your query and format your answers. But this comes with the same flaws as human interactions: fuzzy interpretations and lack of truthfulness. By plugging a database before/after the query/answers, one can build much-needed safeguards. An example is Question Answering with a vector database: – Use a SBERT model to vectorize the query – Retrieve some context for the query from the database with cosine similarity – Build a prompt from the query and context – Send the prompt to the QnA model In that example, the context is retrieved from a previously indexed database. For instance from WooCommerce product descriptions and attributes. This limits the answers to a subset of the model’s universe, the
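The four steps above can be sketched in a few lines. This is only an illustration under loud assumptions: `embed` is a toy bag-of-words stand-in for a real SBERT encoder, the "database" is a plain Python list, and the final call to the QnA model is left out.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an SBERT encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Return the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context):
    """Build a prompt that restricts the answer to the retrieved context."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only the context below.\nContext:\n{joined}\nQuestion: {query}"


# Hypothetical product descriptions standing in for an indexed WooCommerce catalog.
products = [
    "red wool socks for winter",
    "waterproof hiking boots for bad weather",
    "cotton summer t-shirt",
]
context = retrieve("boots for bad weather", products, top_k=1)
prompt = build_prompt("boots for bad weather", context)
# `prompt` would then be sent to the QnA model (not shown).
```

Because the prompt only contains retrieved product content, the model is nudged to answer from the catalog rather than from everything it has seen in training.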

My 5 cents on BEIR benchmarks in a production environment…

BEIR benchmarks are great as absolute landmarks for theoretical research. But measuring improvements over an existing search matters much more for production systems. (One cannot tell a WooCommerce owner that his shiny new search is the best on BEIR, while at the same time the shop orders dropped significantly.) As a dog food recipe, I’m wondering how I could: 1. A/B compare search implementations My clients can tell almost immediately whether a search is better or not than the previous one. But they cannot prove it, because they cannot measure it without great effort and cost. 2. Introduce event feedback in the model (CTR, #orders, #baskets, orders value …) 3. Automate the fine-tuning of a search in Weaviate, ideally with the push of a button As a summary, I’d
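As an illustration of point 1, here is a minimal sketch of how two search implementations could be compared on click-through rate. It uses a standard two-proportion z-test, which is my assumption of a reasonable starting point rather than a method described in this post; the click counts are made up.

```python
import math


def ctr(clicks, searches):
    """Click-through rate: share of searches followed by a click."""
    return clicks / searches if searches else 0.0


def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z statistic for the difference in CTR between variant B and variant A."""
    p_a, p_b = ctr(clicks_a, n_a), ctr(clicks_b, n_b)
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se if se else 0.0


# Made-up numbers: the current search (A) versus a new vector search (B).
z = two_proportion_z(clicks_a=120, n_a=1000, clicks_b=150, n_b=1000)
# |z| above roughly 1.96 suggests a difference at the usual 5% significance level.
```

The same comparison can be run on orders or basket value per search instead of clicks.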

WooCommerce is the most comprehensive e-commerce platform. But it craves a modern search

With the rise of AI, all e-commerce shops should get a great faceted search, augmented with a great semantic search. E-commerce shops need both, as facets are the standard for navigating among product attributes, while semantic search prevents most empty results due to keyword mismatch. There are plenty of technologies, APIs, or plugins for search and Question Answering. But most are 100% classic, or 100% AI, or require a heavy integration, or lack open source choices. WPSOLR lets e-commerce owners combine the best of all worlds with a ready-to-use package. One can choose classic/hybrid/AI, combine several searches on different sections of a site, and choose closed embedding APIs, open source models or custom trained models. WPSOLR + Weaviate + classic|hybrid|AI search: https://wpsolr.com #wpsolr #weaviate #vectorsearch #aisearch #woocommerce

Why is vector search alone sometimes not enough for WooCommerce shops?

If you’re a shop owner, you know how different the front-end and back-end searches are. On the front-end, vector search is a must. It prevents showing many empty results by looking into concepts rather than keywords. On the back-end, precision is a must. When looking for orders or product ids, the exact results must be on top. This is why, for a retailer, we’ve set up a dual configuration: – 1 Elasticsearch index hosted at Elastic – 1 Weaviate index hosted at Google Cloud Kubernetes, with a multilingual model – 1 view to power admin searches with Elasticsearch – 1 view to power suggestions and faceted search with Weaviate Another benefit is the lower cost of an Elasticsearch index with hundreds of thousands of orders/products, compared to the higher cost

WooCommerce image search with AI

Often, an image is worth a thousand words. When you are a retailer, search is often disappointing because the images in the results do not match the keywords. Showing images of red socks instead of striped multicoloured socks containing a touch of red is not good enough for your visitors. An AI model named “CLIP” was trained to match images to text queries. Even better, you can define how close the search should stay to product images or to product texts. Not only is your product featured image matched, but also all images included in your texts, internal links and external links. #WPSOLR + Weaviate + WooCommerce Image search : https://wpsolr.com #wpsolr #search #ai #vectorsearch #weaviate #woocommerce #wordpress

Search concepts rather than keywords. No synonyms needed

Search by keywords is very good when your catalog contains the keywords. But many times, search by keywords cannot retrieve your products. With AI searching concepts rather than keywords, your visitors will always get results. Because your catalog contains many more concepts than keywords. Therefore, synonyms are not needed anymore: the AI search already masters your language, including synonyms, stemming, stop words, N-Grams, and so on. #WPSOLR + Weaviate + search by concepts: https://wpsolr.com #wpsolr #search #ai #vectorsearch #weaviate #woocommerce #wordpress

Typo tolerance with AI for WooCommerce

What is more frustrating than showing a potential customer no results, just because of a typo or a misspelling? Well, our search AI is able to recover from almost all possible typos without breaking a sweat. No more “No results”. Ever. No long training or tuning. It works instantly because AIs understand your language. #WPSOLR + Weaviate + Typo tolerance: https://wpsolr.com #wpsolr #search #ai #vectorsearch #weaviate #woocommerce #wordpress

Search WooCommerce in 100+ languages, without translations

Your WordPress or WooCommerce shop is visible worldwide. Why should your visitors be forced to speak English? Choose a multilingual AI model like Cohere, or one among tens at Hugging Face, and let your customers query your product catalog in their native language. WPSOLR + Weaviate + Multilingual: https://wpsolr.com #wpsolr #search #ai #vectorsearch #weaviate #woocommerce #wordpress

Cohere embeddings were indeed considered, on average, more accurate than OpenAI embeddings by a WooCommerce client

But of course, it entirely depends on the data. No solution is absolute. This is why it is so important to be able to compare AI models before deploying a vector search. Vertex Matching Engine (ANN similarity search) was also considered, but the lack of aggregations (faceting) was a no go, unfortunately. I hope it will be added in the future. (Facets are the #1 feature on e-Commerce search) WPSOLR & Cohere: https://wpsolr.com #wpsolr #woocommerce #cohere #vectorsearch

When a WooCommerce retailer uses WPSOLR search to compare OpenAI, Cohere, and Elasticsearch. In 5 languages

Nobody can escape the hype: a client of WPSOLR was considering switching to a “#chatgpt” search. But how does it compare to the current search? And which AI model is better? To escape the deadlock, WPSOLR proposed to install a private Elementor page with 3 search boxes. Each search box displays live suggestions from Elasticsearch, or Weaviate with OpenAI or Cohere embeddings. Within 2 days, the test was set. The client is now ready to compare a long list of keywords to the 3 sets of results. And you, still wondering? Contact us: https://wpsolr.com #wpsolr #weaviate #openai #gpt #cohere #LLM

Comparing classic search, vector search, and Hybrid search: is it even possible (part 4: multilingual)?

(This is a follow up of Part 3 https://www.wpsolr.com/comparing-classic-search-vector-search-and-hybrid-search-is-it-even-possible-part-3/) Multilingual search is a huge topic nowadays, especially for WooCommerce in a globalised market. Your site is in plain English, but with the automated translations in Chrome, it is perfectly fine for visitors worldwide. But what about search? What if your visitors look for “Running shoes” in French, or in German, or in Chinese, or in Hebrew ? With some multilingual LLMs, this is not a problem anymore. Your visitors will get the same results for the same keywords in many languages. Here are a few examples with keywords “Running shoes”, powered by Weaviate and a Cohere multilingual LLM on a WooCommerce Cloudways: – English: https://demo-woocommerce-flatsome-cloudways-2k-cohere.wpsolr.com/?s=running+shoes&post_type=product – French: https://demo-woocommerce-flatsome-cloudways-2k-cohere.wpsolr.com/?s=chaussures+de+course&post_type=product –

Comparing classic search, vector search, and Hybrid search: is it even possible (part 3: typo tolerance)?

(This is a follow up of Part 2 https://www.wpsolr.com/comparing-classic-search-vector-search-and-hybrid-search-is-it-even-possible-part-2/) An incredible and underrated effect of AI search is typo tolerance. LLMs are very strong at “understanding” the meaning of a sentence, hence resilient to typos. Here are two live examples with 2 typos in the same sentence (you can see them side by side in the screenshot of this post): – Plain classic search without any tuning: https://demo-woocommerce-flatsome-cloudways.wpsolr.com/?s=fotwear+outdor+bad+weather&post_type=product – Vector search with Weaviate OpenAI embeddings: https://demo-woocommerce-flatsome-cloudways-2k-openai.wpsolr.com/?s=fotwear+outdor+bad+weather&post_type=product Not only does the vector search fix the typos, it also returns great results. WPSOLR: https://wpsolr.com #wpsolr #weaviate #openai #gpt3 #cohere #huggingface

A WooCommerce Hybrid vector search live demo with Weaviate & OpenAI embeddings

Description: – Hybrid search (BM25 sparse search & dense vector search) – WooCommerce with the Flatsome theme is hosted on Cloudways – WPSOLR plugin is installed and configured – Weaviate is installed on a Google Cloud #Kubernetes cluster https://weaviate.io/developers/weaviate/installation/kubernetes – The data vectorization is performed by the OpenAI embeddings API v2 https://openai.com/blog/new-and-improved-embedding-model/ – Search, filters, facets, sorting, and pagination are performed by query/data similarity within the Weaviate database Hybrid demo link: https://demo-woocommerce-flatsome-cloudways-2k-hybrid.wpsolr.com/shop/ All WPSOLR demos: https://www.wpsolr.com/wpsolr-demos/ Hybrid documentation: https://www.wpsolr.com/guide/configuration-step-by-step-schematic/configure-your-indexes/create-weaviate-index/hybrid-search/ #wpsolr #weaviate #woocommerce #vectorsearch #vectordatabase #openai

Hybrid search, and OpenAI Questions Answering, are now available with WPSOLR 22.9 !

It’s Christmas almost every day with Weaviate and WPSOLR 🙂 Release documentation: https://www.wpsolr.com/forums/topic/release-22-9/ You can try several engines live at https://wpsolr.com, in the header search boxes: – Search & suggestions with Algolia – Suggestions with Weaviate + OpenAI GPT – Suggestions with Weaviate + Cohere – Question & Answering with Weaviate + OpenAI #wpsolr #weaviate #woocommerce #wordpress #vectorsearch #openai #gpt3 #hybridsearch #aisearch

How to help your sceptical clients decide whether their search is better with AI ?

People are already aware, thanks to #ChatGPT, of what AI can do in general. People also know that their competitors are willing to use it soon. But people are not always convinced that this will apply to them. Because they think they have special needs that will not be tackled as well as the general use case. And there are also lots of examples of a promising new technology not doing so well in the long term: js frameworks dead after a few years, java applets, … The key for you is to prove that their use case is a fit for AI search. And for that, you just have to compare your AI search to their current search stack. This is where WPSOLR comes

Comparing classic search, vector search, and Hybrid search: is it even possible (part 2: multi-engine real-time indexing)?

(This is a follow up of Part 1) Indexing several indexes, from several search engines, is already difficult. But you also need to do it in real-time, to ensure your search, suggestions, or recommendations are up to date. So what? Do you have to run a batch every night to fully/incrementally reindex all your data? Nope. WPSOLR handles that automatically, as shown in the screenshot, in real-time, for: – Elasticsearch – OpenSearch – Apache Solr – Algolia – Weaviate with OpenAI, Cohere, Hugging Face, Transformers embeddings – Google Retail API WPSOLR: https://wpsolr.com #wpsolr #elasticsearch #opensearch #apachesolr #algolia #weaviate #openai #gpt3 #cohere #huggingface

Comparing classic search, vector search, and Hybrid search: is it even possible (part 1: multi-engine installation)?

It looks impossible at first sight. You will have to: – Install each search engine server – Create each index with the right schema – Ingest your real life data in each index – Install each search engine client – Configure search, suggestions, filtering, sort, pagination for each of them – Bind each suggestion to a dedicated search box And you are right, this is impossible … 😜 Joking! With WPSOLR, you will be able to compare: – Elasticsearch – OpenSearch – Apache Solr – Algolia – Weaviate with OpenAI, Cohere, Hugging Face, Transformers embeddings – Google Retail API Compare a dual classic/vector live suggestion right now: https://wpsolr.com #wpsolr #elasticsearch #opensearch #apachesolr #algolia #weaviate #openai #gpt3 #cohere #huggingface

A WooCommerce vector search live demo with Weaviate & Cohere (LLM) embeddings

Description: – WooCommerce with the Flatsome theme is hosted on Cloudways – WPSOLR plugin is installed and configured – Weaviate is installed on a Google Cloud Kubernetes cluster https://weaviate.io/developers/weaviate/installation/kubernetes/ – The data vectorization is performed by a Cohere LLM model https://weaviate.io/developers/weaviate/modules/retriever-vectorizer-modules/text2vec-cohere – Search, filters, facets, sorting, and pagination are performed by query/data similarity within the Weaviate database Demo link: https://demo-woocommerce-flatsome-cloudways-2k-cohere.wpsolr.com/shop/ WPSOLR: https://wpsolr.com #wpsolr #weaviate #woocommerce #vectorsearch #vectordatabase #cohere

A WooCommerce vector search live demo with Weaviate & CLIP (text & image) embeddings

Description: – WooCommerce with the Flatsome theme is hosted on Cloudways – WPSOLR plugin is installed and configured – Weaviate is installed on a Google Cloud Kubernetes cluster https://weaviate.io/developers/weaviate/installation/kubernetes/ – The data vectorization is performed by a CLIP model https://weaviate.io/developers/weaviate/modules/retriever-vectorizer-modules/multi2vec-clip/ – Search, filters, facets, sorting, and pagination are performed by query/data similarity within the Weaviate database Demo link: https://demo-woocommerce-flatsome-cloudways-2k-clip.wpsolr.com/shop/ WPSOLR: https://wpsolr.com #wpsolr #weaviate #woocommerce #vectorsearch #vectordatabase #clipmodel

HyDE could be a challenger to SBERT !

The problem: SBERT models https://www.sbert.net/ are trained to build a similarity embedding between a query and a passage. The problem is that queries and passages are very different in length, form and semantics. A solution with HyDE ? Why not instead match a passage generated from the query by a model like OpenAI #GPT to passages in a vector database like Weaviate: compare the generated passage embedding to (near) all database embeddings? Another module for Weaviate in preparation? WPSOLR + Weaviate: https://www.wpsolr.com/guide/configuration-step-by-step-schematic/configure-your-indexes/create-weaviate-index/ #wpsolr #weaviate #nlp #gpt #vectordatabase #vectorsearch #sbert
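A toy sketch of the idea: instead of embedding the short query, embed a hypothetical passage generated from it, then rank the stored passages against that. Everything here is an assumption for illustration: `embed` is a bag-of-words stand-in for a real encoder, and `fake_generate` is a hard-coded stand-in for a generative model call.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a real encoder: bag-of-words counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(text, passages, top_k=1):
    """Rank passages by similarity to the given text."""
    vector = embed(text)
    return sorted(passages, key=lambda p: cosine(vector, embed(p)), reverse=True)[:top_k]


def fake_generate(query):
    """Hypothetical stand-in for a generative model writing a passage about the query."""
    return f"These {query} are lightweight trainers with cushioned soles made for jogging."


def hyde_search(query, passages, generate, top_k=1):
    """HyDE-style search: embed a generated passage rather than the raw query."""
    return search(generate(query), passages, top_k)


passages = [
    "Lightweight trainers with cushioned soles for jogging",
    "Leather office shoes with laces",
    "Wool hat",
]
direct = search("running shoes", passages)
hyde = hyde_search("running shoes", passages, fake_generate)
```

In this toy setup the raw query only overlaps with the office shoes, while the generated passage lands on the trainers: passage-to-passage comparison sidesteps the query/passage mismatch.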

A WooCommerce vector search live demo with Weaviate & OpenAI embeddings.

Demo link: https://demo-woocommerce-flatsome-cloudways-2k-openai.wpsolr.com/shop/ Description: – WooCommerce with the Flatsome theme is hosted on Cloudways – WPSOLR plugin is installed and configured – Weaviate is installed on a Google Cloud Kubernetes cluster https://weaviate.io/developers/weaviate/installation/kubernetes/ – The data vectorization is performed by the OpenAI embeddings API v2 https://openai.com/blog/new-and-improved-embedding-model/ – Search, filters, facets, sorting, and pagination are performed by query/data similarity within the Weaviate database #wpsolr #weaviate #woocommerce #vectorsearch #vectordatabase #openai

What are the four stages of search?

– Classical search – Anything around BM25 statistical scoring. Including #elasticsearch , Apache #solr, Algolia, and WPSOLR https://www.wpsolr.com. – Classical search AI augmented – Still the classical engines, but with a pre-indexing phase to extract some semantic features. Including WPSOLR https://www.wpsolr.com/guide/configuration-step-by-step-schematic/activate-extensions/extension-nlp/ – Vector search pre-trained – This includes all vector databases like SeMI Technologies Weaviate, Pinecone, Vespa, Qdrant, with a pre-trained LLM vectorizer. See https://www.wpsolr.com/guide/configuration-step-by-step-schematic/configure-your-indexes/create-weaviate-index/ – Vector search fine-tuned – This includes all vector databases mentioned earlier, but with a fine-tuned LLM vectorizer. None of them come with an automatic pipeline to fine-tune the model. Or perhaps Google Retail search API https://www.wpsolr.com/guide/configuration-step-by-step-schematic/configure-your-indexes/create-a-google-retail-index/ #wpsolr #ai #google #pipeline #vectorsearch #finetuning #largelanguagemodels #searchengines #elasticsearch #apachesolr #algolia #weaviate #pinecone

Which models are best for your WooCommerce vector search: Hugging Face, OpenAI, or Cohere ?

You do not have to wonder or spend weeks prototyping anymore. With WPSOLR’s SeMI Technologies Weaviate integration, you can just: – Install docker Weaviate locally with the 3 modules (or soon connect to a free/paid Weaviate hosting service) – Create 3 indexes – Index your data – Compare your 3 searches immediately The 3 models come with filters, facets, sort, and pagination. And with the new Hybrid BM25/vector search too. WPSOLR with Weaviate documentation: https://www.wpsolr.com/guide/configuration-step-by-step-schematic/configure-your-indexes/create-weaviate-index/ #wpsolr #weaviate #openai #cohere #huggingface #vectorsearch #wordpress #woocommerce

Finally, a definitive confirmation from Jina AI that large DL models are quick learners.

And a big motivation to add a thin layer of fine-tuning (labeling?) to #wpsolr‘s WooCommerce SeMI Technologies search integration ! It’s pretty straightforward with OpenAI fine-tuning API and fine-tuned models. But Jina AI finetuner is also very promising. WPSOLR and Weaviate: https://www.wpsolr.com/feature-weaviate/ OpenAI fine-tuning: https://beta.openai.com/docs/guides/fine-tuning Jina AI fine-tuning: https://finetuner.jina.ai/ #wpsolr #finetuning #jinaai #openai #weaviate #search

Finally, our Google Retail search for WooCommerce documentation is here !

WPSOLR documentation: https://www.wpsolr.com/guide/configuration-step-by-step-schematic/configure-your-indexes/create-a-google-retail-index/ #wpsolr #woocommerce #retail #search #retailsearch

Documentation for WPSOLR’s SeMI Technologies Weaviate vector search integration is here!

Modules already supported, with full WooCommerce integration (suggestions, facets, filters, pagination, sort, …): – Weaviate Transformers module – Weaviate CLIP module – Weaviate Hugging Face Endpoints module – Weaviate OpenAI Embeddings v2 module – Weaviate Cohere Embeddings module – Weaviate Questions Answering module (and more) WPSOLR Documentation: https://www.wpsolr.com/guide/configuration-step-by-step-schematic/configure-your-indexes/create-weaviate-index/ #wpsolr #weaviate #huggingface #openai #cohere #embeddings #woocommerce #largelanguagemodels

Hybrid search RRF ranking (Reciprocal Rank Fusion) is the first choice for both Weaviate and Vespa.

(And they are also the two OSS vector search databases supporting aggregation, which makes them, in my opinion, the best contenders for e-commerce vector search) Vespa presentation: SeMI Technologies Weaviate presentation: https://weaviate.io/blog/2023/01/Hybrid-Search-Explained.html WPSOLR for WooCommerce with Weaviate: https://www.wpsolr.com/ #wpsolr #woocommerce #hybridsearch #vectorsearch #vespasearch #weaviate
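For readers who have not met RRF before, here is a minimal sketch of the general formula: each document scores the sum of 1 / (k + rank) over every result list it appears in. The constant k=60 is a commonly used default and an assumption here, not a value confirmed for either engine; the product ids are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list with RRF."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a BM25 search and a vector search.
bm25 = ["socks-red", "socks-striped", "boots"]
vector = ["socks-striped", "scarf", "socks-red"]
fused = reciprocal_rank_fusion([bm25, vector])
```

RRF only looks at ranks, so the BM25 and vector scores never need to be put on a common scale, which is a large part of its appeal for hybrid search.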

Hybrid search is the new hot topic, and is now ready to use with WPSOLR & WooCommerce !

SeMI Technologies released Weaviate v1.17.0 with a new hybrid search, and an alpha setting to set how much of the search is pure dense/vector or pure sparse/inverted. This is great for e-Commerce, as vector search can be a little bit too “imaginative”. The hybrid search should help results stick to the product content. WPSOLR with Hybrid search: https://www.wpsolr.com/guide/configuration-step-by-step-schematic/configure-your-indexes/create-weaviate-index/hybrid-search/ Weaviate v1.17.0: https://github.com/semi-technologies/weaviate/releases/tag/v1.17.0 Weaviate Hybrid operator: https://weaviate.io/developers/weaviate/current/graphql-references/vector-search-parameters.html#hybrid #wpsolr #hybridsearch #woocommerce #weaviate #ml #vectorsearch #sparsesearch #invertedsearch
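To give an intuition of what an alpha setting does, here is a simplified sketch that blends normalized keyword and vector scores, with alpha=1 meaning pure vector and alpha=0 meaning pure keyword. This is an illustration only: Weaviate’s internal fusion is not reproduced here, and the function names and scores are hypothetical.

```python
def normalize(scores):
    """Min-max normalize a dict of doc id -> score into the [0, 1] range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(keyword, vector, alpha=0.5):
    """Blend scores: alpha=1.0 is pure vector, alpha=0.0 is pure keyword."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    kw, vec = normalize(keyword), normalize(vector)
    docs = set(kw) | set(vec)
    # A document missing from one result list contributes 0 for that side.
    return {d: alpha * vec.get(d, 0.0) + (1 - alpha) * kw.get(d, 0.0) for d in docs}


# Made-up scores: BM25 scores are unbounded, vector similarities are in [0, 1].
keyword = {"a": 12.0, "b": 3.0}
vector = {"a": 0.2, "b": 0.9, "c": 0.5}
balanced = hybrid_scores(keyword, vector, alpha=0.5)
```

Lowering alpha pulls the ranking toward exact keyword matches, which is the “stick to the product content” effect described above.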

Is hybrid search an easy trick to better fine-tuned LLM search?

– When LLMs suck – LLMs (Large Language Models) do not rank well on recent or specialised corpora, as they are trained on oldish and general datasets. – Don’t understand? Parrot instead – Hybrid search with BM25 can compensate for that: after all, when you do not understand a concept (vector similarity) you can always recognise some keywords (pattern matching). – Fine tuning – Fine-tuned models will always be better than hybrid or pure BM25 on a predefined test dataset, by definition. But only if you have time, money, skills, and probably labeled data to do so. – Conclusion – Training an LLM on the last 2 weeks of discoveries in Quantum gravity will probably be out of reach for most people. This is where BM25

Vector databases are lacking Pay as You Go OSS embeddings

Vector databases are lacking Pay as You Go OSS embeddings… – Cannot start? – You’ve carefully selected your favourite vector database. You’re ready to start, but wait a minute … how do you produce the vectors (aka embeddings) ? – Expert fix? – No problem according to tutorials: choose a model among thousands, install it with docker, copy a few lines of python, and that’s it. – Production ready? – Let’s be honest: it looks like an impossible task for most people, who just want to use the wheel, not reinvent it. And it is certainly not good for production usage, which requires security, scalability and much more. – APIs? – APIs are the universal fix nowadays. There is certainly an API to save

Will 2023 be the start of the end of pure search engines?

– What? – Who is using search nowadays when recommenders secretly choose for us in the background? Personally, I do confess not having used the search bar in YouTube for a very long time. I just browse (a lot) and (sometimes) click on what’s presented to me. – How? – And for search engines also there is a similar trend with personalised search. Which is a fancy way to say “reordered results based on user past actions (or lack of actions)”. But even stronger is the convergence of once distinct engines into a single unified search & recommendation engine. For instance: – Algolia search & recommendations – Google Retail search & recommendations – Recombee recommendations & search – Why? – I suspect that the recent

OpenAI Embeddings API v2 is truly discounted !

Check out the picture, and compare the 17th of December to the 18th of December. Both figures come from the same automated tests on WPSOLR’s Weaviate module for OpenAI. – Embeddings V1: $0.40 – Embeddings v2: $0.0 That’s impressive. OpenAI embeddings new pricing: https://beta.openai.com/docs/guides/embeddings/what-are-embeddings WPSOLR: https://www.wpsolr.com #wpsolr #openai #embeddings #vectorsearch

Cohere large multi-language model is now available to WooCommerce search and discover !

We’re proud to announce that WPSOLR 22.8 just released the new SeMI Technologies Weaviate module for Cohere’s multilingual-22-12 model. Cohere multi-language embeddings: https://docs.cohere.ai/docs/multilingual-language-models WPSOLR 22.8: https://www.wpsolr.com/forums/topic/release-22-8/ Weaviate Cohere module: https://weaviate.io/developers/weaviate/current/retriever-vectorizer-modules/text2vec-cohere.html #wpsolr #cohere #weaviate #vectorsearch #ml #woocommerce

The new OpenAI embeddings API v2 is now integrated with our SeMI Technologies Weaviate extension

Get instantly cheaper and better embeddings for WooCommerce WPSOLR & Weaviate: https://www.wpsolr.com/feature-weaviate/ Weaviate & OpenAI embeddings: https://weaviate.io/developers/weaviate/current/retriever-vectorizer-modules/text2vec-openai.html #wpsolr #search #woocommerce #weaviate #openai #embeddings #vectorsearch

OpenAI releases its embeddings API v2 at 90% discount !

4 improvements: – 90% discount – Texts can be 4 times longer, from 2048 to 8192 tokens – Embedding size is also much smaller, “making the new embeddings more cost effective in working with vector databases” – A single model now: text-embedding is replacing text-search, code-search, and text-similarity And SeMI Technologies Weaviate is already preparing the switch https://github.com/semi-technologies/weaviate/issues/2449 WPSOLR + OpenAI: https://www.wpsolr.com/feature-weaviate/ OpenAI announcement: https://openai.com/blog/new-and-improved-embedding-model/ #wpsolr #openai #weaviate #search #vectorsearch #embeddings

Did you know that only two Open source Vector search engines are fit to e-commerce?

The main features of e-commerce search are filters and facets (aggregations). While most vector search engines are now able to filter results with more or less complex syntax, it is surprising that only two are able to aggregate results: – Yahoo Vespa – SeMI Technologies Weaviate (Please let me know in comments for others) Can anyone imagine a semantic search that does not provide faceted results on the left, as invented and promoted by Amazon ? WPSOLR & Weaviate: https://www.wpsolr.com/feature-weaviate/ Weaviate aggregation: https://weaviate.io/developers/weaviate/current/graphql-references/aggregate.html Vespa aggregation: https://docs.vespa.ai/en/grouping.html

Multimodal WooCommerce text to image search has landed !

We’ve just completed the integration of SeMI Technologies Weaviate’s CLIP module with WooCommerce front-end products search. Visitors can now explore and retrieve the full catalog images from text keywords, including featured images, gallery images, or even external images embedded in product descriptions. WPSOLR 22.7: https://www.wpsolr.com/forums/topic/release-22-7/ Weaviate CLIP module: https://weaviate.io/developers/weaviate/current/retriever-vectorizer-modules/multi2vec-clip.html #wpsolr #woocommerce #huggingface #weaviate #multimodal #search

Image discovery is so important for e-commerce, but reserved to high-end websites

With Weaviate for Hugging Face models like CLIP https://weaviate.io/developers/weaviate/current/retriever-vectorizer-modules/multi2vec-clip.html or ResNet https://weaviate.io/developers/weaviate/current/retriever-vectorizer-modules/img2vec-neural.html, image similarity search is now a dream come true for almost any WooCommerce. WPSOLR’s latest news on the SeMI Technologies Weaviate front ! 1 – Hugging Face CLIP models We’ve just completed the integration of Hugging Face CLIP models with WooCommerce front-end products search. Visitors can now explore and retrieve the full catalog images from text keywords, including featured images, gallery images, or even external images embedded in product descriptions. 2 – Image discovery by similarity Next release will include some brand new and revolutionary features. 2.1 – A new image filter This filter will let visitors upload an image to retrieve similar catalog images, and also filter images with facets. 2.2

WPSOLR 22.6 with the brand new Google Retail search API has landed

Also included, the new SeMI Technologies Weaviate Hugging Face endpoints API. More on WPSOLR 22.6: https://www.wpsolr.com/forums/topic/release-22-6/ More on Google Retail search: https://www.wpsolr.com/feature-google-retail/ More on Weaviate: https://www.wpsolr.com/feature-weaviate/ Live WooCommerce demo with Google Retail search: https://lnkd.in/dNz83eQw #wpsolr #retail #google #retail #ecommerce #woocommerce #weaviate

WPSOLR 22.7 will add the new SeMI Technologies Weaviate Question Answering module for OpenAI

WPSOLR 22.7 will add the new SeMI Technologies Weaviate Question Answering module for OpenAI https://weaviate.io/developers/weaviate/current/reader-generator-modules/qna-openai.html You will be able to choose any Transformer Vectorizer https://weaviate.io/developers/weaviate/current/modules/index.html, including the OpenAI embedding https://weaviate.io/developers/weaviate/current/retriever-vectorizer-modules/text2vec-openai.html. Notice that the Q&A transformer module is already integrated into WPSOLR ! WPSOLR 22.7: https://www.wpsolr.com/forums/topic/release-22-7/ WPSOLR and Weaviate: https://www.wpsolr.com/feature-weaviate/ #wpsolr #weaviate #openai #ml #woocommerce #wordpress #plugin

First-party cookies and Data Privacy: DIY !

At #wpsolr, we’ve been asked many times to add personalization and recommendations to our existing search engines. Like Algolia or Google Retail. Or to our planned new engines. Like Recombee and Amazon Web Services (AWS) Personalize. It is not an easy task, as it requires a different kind of events ingestion for each engine. But above all, it implies storing and sending some kind of user session. And, to be honest, we stalled on that part. Do we use: – Google analytics javascript pixels API? – Google Tag manager – Engine’s delivered javascript pixels API? – Engine’s delivered backend event API? We started the integration, then stopped, then restarted. But User Privacy has now become such a big deal, and will never disappear, on the