Skip to content

Sentence Transformers

Setup a products table. Copy from the example data provided by the extension.

Ensure vectorize.embedding_svc_url is set to the URL of the vector-serve container.

If you're running this example using the docker-compose.yaml file from this repo, it should look like this:

SHOW vectorize.embedding_service_url;
(1 row)

If you are not running in docker, then you will need to change the url to the appropriate location. If that is localhost, it would look like this;

ALTER SYSTEM SET vectorize.embedding_svc_url TO 'http://localhost:3000/v1/embeddings';

Then reload Postgres configurations:

SELECT pg_reload_conf();

Create an example table if it does not already exist.

CREATE TABLE products (LIKE vectorize.example_products INCLUDING ALL);
INSERT INTO products SELECT * FROM vectorize.example_products;
SELECT * FROM products limit 2;
 product_id | product_name |                      description                       |        last_updated_at        
          1 | Pencil       | Utensil used for writing and often works best on paper | 2023-07-26 17:20:43.639351-05
          2 | Laptop Stand | Elevated platform for laptops, enhancing ergonomics    | 2023-07-26 17:20:43.639351-05

Create a job to vectorize the products table. We'll specify the tables primary key (product_id) and the columns that we want to search (product_name and description).

SELECT vectorize.table(
    job_name    => 'product_search_hf',
    "table"     => 'products',
    primary_key => 'product_id',
    columns     => ARRAY['product_name', 'description'],
    transformer => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1',
    scheduler   => 'realtime'

This adds a new column to your table, in our case it is named product_search_embeddings, then populates that data with the transformed embeddings from the product_name and description columns.

Then search,

    job_name        => 'product_search_hf',
    query           => 'accessories for mobile devices',
    return_columns  => ARRAY['product_id', 'product_name'],
    num_results     => 3

 {"product_id": 13, "product_name": "Phone Charger", "similarity_score": 0.8147814132322894}
 {"product_id": 6, "product_name": "Backpack", "similarity_score": 0.7743061352550308}
 {"product_id": 11, "product_name": "Stylus Pen", "similarity_score": 0.7709902653575383}