How to use Weaviate with any Huggingface vectorization model


For more info about Weaviate,  check out our documentation.


If you’ve ever wanted to use Weaviate but were worried that you couldn’t pair it with the vectorization model best suited to your data, I have just the thing for you.

In this notebook/guide, I detail the steps and code needed to set up Weaviate with any Huggingface vectorization model.


  1. Choose from a wide selection of Huggingface models using the official rankings page.
  2. Create your own transformers inference container to be used by Weaviate to vectorize the data. Learn how to add your chosen Huggingface model.
  3. Start up the containers and create the class that will use your new vectorizer model.
  4. Send the data to Weaviate, where it will be automatically vectorized by the custom model.

Voilà! You can now send queries to your Weaviate installation that will return the correct and relevant indexed objects.


Check out the notebook here.

Or read the following :


This guide is dedicated to importing data into your Weaviate server. For this guide, the data will come from the ms_marco dataset, but you can use any type of data you want.
We also want to be able to use any model as a Weaviate vectorizer. To find the right one for you, you can use the Huggingface rankings page. I chose the “BAAI/bge-small-en-v1.5” model for this guide because it is one of the smallest models among the better-performing ones.
According to the t2v-transformers-models page on GitHub, we need to create a custom inference container that contains our model so that Weaviate can use it.

Create the containers

To create this container, create a file named “Dockerfile_custom_model” with the following content :
FROM semitechnologies/transformers-inference:custom
RUN MODEL_NAME=BAAI/bge-small-en-v1.5 ./download.py
You can use any Huggingface model instead of “BAAI/bge-small-en-v1.5” as the “MODEL_NAME” value (for example, sentence-transformers/all-MiniLM-L6-v2).
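If you plan to try several models, the Dockerfile can be generated instead of edited by hand. The following is a minimal sketch that assumes the same base image and download.py script as above. The render_dockerfile helper and its model-id check are hypothetical and not part of Weaviate.

```python
import re

# Base image used in this guide for custom transformers inference containers.
BASE_IMAGE = "semitechnologies/transformers-inference:custom"


def render_dockerfile(model_name: str) -> str:
    """Return the Dockerfile text for one Huggingface model id (hypothetical helper)."""
    # Assumption: model ids look like "name" or "org/name" and only contain
    # letters, digits, "_", "." and "-". Anything else is rejected.
    if not re.fullmatch(r"[\w.\-]+(/[\w.\-]+)?", model_name):
        raise ValueError(f"unexpected model id: {model_name!r}")
    return (
        f"FROM {BASE_IMAGE}\n"
        f"RUN MODEL_NAME={model_name} ./download.py\n"
    )


dockerfile_text = render_dockerfile("BAAI/bge-small-en-v1.5")
print(dockerfile_text)
```

Writing dockerfile_text to a file named “Dockerfile_custom_model” gives the same file as the one shown above.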
You can then create a “docker-compose.yml” file with this content :
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:1.21.3
    ports:
    - 8080:8080
    volumes:
    - ./weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-transformers'
      ENABLE_MODULES: 'text2vec-transformers'
      CLUSTER_HOSTNAME: 'node1'
  t2v-transformers:
    build:
      context: .
      dockerfile: Dockerfile_custom_model
    image: weaviate_custom_model
    environment:
      ENABLE_CUDA: '0'
This file defines a separate container named “t2v-transformers” that contains our model. Its image is built from the previously created “Dockerfile_custom_model”.
If you want to use the GPU, set this environment variable on the “t2v-transformers” service instead :
      ENABLE_CUDA: '1'
You can now start the containers using :
docker-compose up -d

The Weaviate container is now running at the url : http://localhost:8080.
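The containers may need some time before they accept requests, in particular while the inference container loads the model. The sketch below polls Weaviate's readiness endpoint (/v1/.well-known/ready) until it answers. The helper names, the timeout values and the injectable probe argument are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ready(base_url: str) -> bool:
    """Return True if the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/.well-known/ready", timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout or a non-2xx answer: not ready yet.
        return False


def wait_until_ready(
    base_url: str,
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    probe: Callable[[str], bool] = http_ready,
) -> bool:
    """Call `probe` repeatedly until it succeeds or `timeout_s` has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(base_url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling wait_until_ready("http://localhost:8080") before connecting avoids sending requests to a server that is still starting.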

WARNING : The container doesn’t have any type of security since this guide is set up for testing purposes. If you want to set up a Weaviate production server, you should add authentication and TLS protection : https://weaviate.io/developers/weaviate/configuration/authentication.

Connect to Weaviate

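You can connect to your Weaviate server with the weaviate module (the code in this guide uses its v3 weaviate.Client API), installed using :
pip install weaviate-client
import weaviate

# Connect to the local test instance started above
client = weaviate.Client(
    url="http://localhost:8080",
)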

Create a class

We can then create the class. A class is a collection of the elements you index. If properties are associated with the vectors you import, you can define them in the class. Each property is defined by a name and a datatype.
class_obj = {
    "class": "Sentences",
    # Let the custom model vectorize the objects of this class
    "vectorizer": "text2vec-transformers",
    "properties": [
        {
            "name": "passage",
            "dataType": ["text[]"]
        },
        {
            "name": "answer",
            "dataType": ["text[]"]
        },
        {
            "name": "query",
            "dataType": ["text"]
        }
    ]
}

client.schema.create_class(class_obj)
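If your own data has different columns, the class definition can be built from a simple mapping of property names to data types. This is a minimal sketch: the build_class helper is hypothetical, and it only accepts the two data types used in this guide, although Weaviate supports more.

```python
from typing import Dict

# Only the data types used in this guide; this restriction belongs to the sketch.
ALLOWED_TYPES = {"text", "text[]"}


def build_class(
    name: str,
    properties: Dict[str, str],
    vectorizer: str = "text2vec-transformers",
) -> dict:
    """Build a class definition from a {property name: data type} mapping."""
    for prop, data_type in properties.items():
        if data_type not in ALLOWED_TYPES:
            raise ValueError(f"{prop}: data type {data_type!r} is not handled by this sketch")
    return {
        "class": name,
        "vectorizer": vectorizer,
        "properties": [
            {"name": prop, "dataType": [data_type]}
            for prop, data_type in properties.items()
        ],
    }


sentences_class = build_class(
    "Sentences",
    {"passage": "text[]", "answer": "text[]", "query": "text"},
)
```

The resulting dictionary has the same shape as the class object used in this guide and can be passed to client.schema.create_class in the same way.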

If you need to, you can delete the class using :
client.schema.delete_class("Sentences")

Load the data

The properties of an object are returned to the user when a query finds a match. The ms_marco dataset has “answer”, “query” and “passage” columns, but if you use your own data you can define whichever properties you need, as many or as few as you want.

In this step, you gather the properties of each object in the elements array. The embeddings themselves are created later by Weaviate, when the objects are imported.

from datasets import load_dataset

# Import the "ms_marco" dataset and load the sentences
dataset = load_dataset("ms_marco", 'v1.1')
passages_data = dataset["train"]['passages']
answers_data = dataset["train"]['answers']
query_data = dataset["train"]['query']

elements = []

# Select the first 50 entries of the dataset
for i in range(50):
    element = {}
    passage = passages_data[i]['passage_text']
    answer = answers_data[i]
    query = query_data[i]

    # Store the properties; Weaviate creates the embedding at import time
    element["Passage"] = passage
    element["Answer"] = answer
    element["Query"] = query

    elements.append(element)
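Before importing, it can help to check that each element matches the data types declared in the class (“text[]” for the passage and answer, “text” for the query). The following sketch is only an illustration of such a check; validate_element is a hypothetical helper, not a Weaviate function.

```python
from typing import Any, Dict, List


def validate_element(element: Dict[str, Any]) -> List[str]:
    """Return the problems found in one element; an empty list means it looks fine."""
    problems = []
    # "Passage" and "Answer" map to text[] properties: lists of strings.
    for key in ("Passage", "Answer"):
        value = element.get(key)
        if not (isinstance(value, list) and all(isinstance(s, str) for s in value)):
            problems.append(f"{key} should be a list of strings (text[])")
    # "Query" maps to a text property: a single string.
    if not isinstance(element.get("Query"), str):
        problems.append("Query should be a string (text)")
    return problems


sample = {
    "Passage": ["First passage.", "Second passage."],
    "Answer": ["An answer."],
    "Query": "a question",
}
print(validate_element(sample))
```

Running this check over the elements array before the import makes type mismatches visible early instead of at import time.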


Import data

You can use the Weaviate batch function to add all the elements of the elements array to the “Sentences” class. This will send the elements to Weaviate 10 at a time.
client.batch.configure(batch_size=10)  # Send the objects in batches of 10

with client.batch as batch:
    # Batch import all sentences
    for d in elements:
        properties = {
            "passage": d["Passage"],
            "answer": d["Answer"],
            "query": d["Query"],
        }

        batch.add_data_object(properties, "Sentences")
The data sent to the Weaviate container will automatically be vectorized by our Huggingface model. If we didn’t use a vectorizer module, we could pass our own vectors through the optional vector parameter of the add_data_object method (you can check the documentation for more).
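A batch import can partially fail, for instance if the inference container is unreachable. The sketch below collects error messages from a batch response. The response shape it reads (a list of items with result, errors and error keys) is an assumption based on the v3 Python client documentation, and both helper names are hypothetical.

```python
from typing import List, Optional


def collect_batch_errors(results: Optional[List[dict]]) -> List[str]:
    """Extract error messages from a batch response (assumed response shape)."""
    messages = []
    for item in results or []:
        errors = item.get("result", {}).get("errors", {})
        for err in errors.get("error", []):
            # Each error is expected to be a dict with a "message" key.
            if isinstance(err, dict):
                messages.append(err.get("message", str(err)))
            else:
                messages.append(str(err))
    return messages


def report_batch_errors(results: Optional[List[dict]]) -> None:
    """Print every error message found in a batch response."""
    for message in collect_batch_errors(results):
        print(f"batch error: {message}")


# Possible usage with the client from this guide (not executed here):
# client.batch.configure(batch_size=10, callback=report_batch_errors)
```

Passing the function as the batch callback reports failures per batch, so a failed import does not go unnoticed.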

Send queries

Now that you have added the vectors, you can send queries to the Weaviate database. In this case, the query is “What to do with food”.
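import json

query = "What to do with food"

# Only return objects whose vector is within a distance of 0.6 of the query
nearText = {
  "concepts": [query],
  "distance": 0.6
}

result = (
    client.query.get(
        "Sentences", ["passage", "answer", "query"]
    )
    .with_near_text(nearText)
    .with_additional(["distance"])
    .do()
)

print(json.dumps(result, indent=4))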
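The "distance" value in the query is a maximum distance between the query vector and an object vector. Assuming the class uses Weaviate's default cosine metric, the distance is 1 minus the cosine similarity, so 0 means the same direction and 2 means opposite directions. The toy vectors and helper names below are illustrative only; real embeddings have many more dimensions.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, computed as 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def within_threshold(
    query_vec: Sequence[float],
    candidate_vec: Sequence[float],
    max_distance: float = 0.6,
) -> bool:
    """Return True if the candidate would pass a 'distance' filter of max_distance."""
    return cosine_distance(query_vec, candidate_vec) <= max_distance


print(within_threshold([1.0, 0.0], [1.0, 1.0]))  # 45 degrees apart
print(within_threshold([1.0, 0.0], [0.0, 1.0]))  # 90 degrees apart
```

A lower threshold returns fewer but closer matches, while a higher one lets less related objects through.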
If you want to check that a vector was correctly created, you can add a ‘vector’ element to the array passed to the “with_additional” method.
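The query result is a nested dictionary, with the matches stored under data, Get and then the class name. The sketch below pulls the matches out of such a response. extract_hits is a hypothetical helper, and the sample response is made up to show the shape, not real output.

```python
from typing import Any, Dict, List


def extract_hits(result: Dict[str, Any], class_name: str = "Sentences") -> List[Dict[str, Any]]:
    """Return the matched objects of a Get query, or raise if the response has errors."""
    if "errors" in result:
        raise RuntimeError(f"query failed: {result['errors']}")
    return result.get("data", {}).get("Get", {}).get(class_name) or []


# Made-up response, only to illustrate the structure.
sample_result = {
    "data": {
        "Get": {
            "Sentences": [
                {
                    "query": "example question",
                    "answer": ["example answer"],
                    "passage": ["example passage"],
                    "_additional": {"distance": 0.31},
                }
            ]
        }
    }
}

for hit in extract_hits(sample_result):
    print(hit["query"], hit["_additional"]["distance"])
```

Fields requested through “with_additional”, such as the vector, appear under the “_additional” key of each match.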
Check out our complete Weaviate installation guide.