
EMBED

Description

Generates an embedding vector that represents the semantic information of the input text. The vector can be used for similarity calculation, retrieval, and similar scenarios.

Syntax

EMBED([<resource_name>], <text>)

Parameters

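| Parameter | Description |
| --- | --- |
| <resource_name> | The name of the resource to use. Optional, as indicated by the square brackets in the syntax. |
| <text> | The text to generate the embedding vector for |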
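
The two call forms implied by the syntax can be sketched as follows. The resource name `embed_resource_name` is the placeholder used in the example on this page and is assumed to refer to a resource that already exists in your cluster.

```sql
-- Form 1: pass the resource name explicitly as the first argument.
SELECT EMBED('embed_resource_name', 'travel reimbursement policy');

-- Form 2: omit the resource name and rely on the session default.
SET default_ai_resource = 'embed_resource_name';
SELECT EMBED('travel reimbursement policy');
```

The second form keeps queries shorter when every call in a session uses the same resource.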

Return Value

The return type is ARRAY, representing the generated vector.

Returns NULL if the input value is NULL.

The result is generated by a large language model, so the returned values are not guaranteed to be identical from one call to the next.
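
A quick way to inspect the return value is sketched below. It assumes that a default resource has already been set through `default_ai_resource` and that the `ARRAY_SIZE` function is available in your version. The vector length depends on the model behind the resource, so no expected output is shown.

```sql
-- Assumes: SET default_ai_resource = 'embed_resource_name'; has already been run.

-- A NULL input returns NULL, so this check is expected to be true.
SELECT EMBED(CAST(NULL AS STRING)) IS NULL AS is_null;

-- Number of elements in the returned ARRAY.
SELECT ARRAY_SIZE(EMBED('leave request policy')) AS dimensions;
```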

Example

The following table simulates a company's code of conduct.

CREATE TABLE knowledge_base (
    id BIGINT,
    title STRING,
    content STRING,
    embedding ARRAY<FLOAT> COMMENT 'Generated embedding vectors by the EMBED function'
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 4
PROPERTIES (
    "replication_num" = "1"
);

SET default_ai_resource = 'embed_resource_name';

-- `embedding` holds the vector that EMBED generates from a short tag summarizing each row's content.
INSERT INTO knowledge_base (id, title, content, embedding) VALUES
(1, "Travel Reimbursement Policy",
"Employees must submit a reimbursement request within 7 days after the business trip, with invoices and travel approval attached.",
EMBED("travel reimbursement policy")),
(2, "Leave Policy",
"Employees must apply for leave in the system in advance. If the leave is longer than three days, approval from the direct manager is required.",
EMBED("leave request policy")),
(3, "VPN User Guide",
"To access the internal network, employees must use VPN. For the first login, download and install the client and configure the certificate.",
EMBED("VPN guide intranet access")),
(4, "Meeting Room Reservation",
"Meeting rooms can be reserved in advance through the OA system, with time and number of participants specified.",
EMBED("meeting room booking reservation")),
(5, "Procurement Request Process",
"Departments must fill out a procurement request form for purchasing items. If the amount exceeds $5000, financial approval is required.",
EMBED("procurement request process finance"));
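
As an optional sanity check after loading (a sketch, assuming the `ARRAY_SIZE` function is available in your version), the query below lists the length of each stored vector. Distance functions compare vectors element by element, so every row is expected to report the same length as long as all vectors came from the same resource.

```sql
-- List the number of elements stored in each row's embedding.
SELECT id, title, ARRAY_SIZE(embedding) AS dimensions
FROM knowledge_base
ORDER BY id;
```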

By vectorizing the text, you can perform operations such as:

  1. Question answering retrieval (with COSINE_DISTANCE)
SELECT
    id, title, content,
    COSINE_DISTANCE(embedding, EMBED("How to apply for travel reimbursement?")) AS score
FROM knowledge_base
ORDER BY score ASC
LIMIT 2;
+------+-----------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+--------------------+
| id | title | content | score |
+------+-----------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+--------------------+
| 1 | Travel Reimbursement Policy | Employees must submit a reimbursement request within 7 days after the business trip, with invoices and travel approval attached. | 0.4463210454563673 |
| 5 | Procurement Request Process | Departments must fill out a procurement request form for purchasing items. If the amount exceeds $5000, financial approval is required. | 0.5726841578491431 |
+------+-----------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+--------------------+
  2. Problem analysis matching (with L2_DISTANCE)
SELECT
    id, title, content,
    L2_DISTANCE(embedding, EMBED("How to access the company intranet")) AS distance
FROM knowledge_base
ORDER BY distance ASC
LIMIT 2;
+------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------------------------+--------------------+
| id | title | content | distance |
+------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------------------------+--------------------+
| 3 | VPN User Guide | To access the internal network, employees must use VPN. For the first login, download and install the client and configure the certificate. | 0.5838271122253775 |
| 1 | Travel Reimbursement Policy | Employees must submit a reimbursement request within 7 days after the business trip, with invoices and travel approval attached. | 1.272394695975331 |
+------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------------------------+--------------------+
  3. Text relevance matching and recommendation based on article content (with INNER_PRODUCT)
SELECT
    id, title, content,
    INNER_PRODUCT(embedding, EMBED("Leave system request leader approval")) AS score
FROM knowledge_base
WHERE id != 2
ORDER BY score DESC
LIMIT 2;
+------+-----------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+---------------------+
| id | title | content | score |
+------+-----------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+---------------------+
| 5 | Procurement Request Process | Departments must fill out a procurement request form for purchasing items. If the amount exceeds $5000, financial approval is required. | 0.33268885332504 |
| 4 | Meeting Room Reservation | Meeting rooms can be reserved in advance through the OA system, with time and number of participants specified. | 0.29224032230852487 |
+------+-----------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+---------------------+
  4. Find content with minimal differences (with L1_DISTANCE)
SELECT
    id, title, content,
    L1_DISTANCE(embedding, EMBED("Procurement application process")) AS distance
FROM knowledge_base
ORDER BY distance ASC
LIMIT 3;
+------+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+
| id | title | content | distance |
+------+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+
| 5 | Procurement Request Process | Departments must fill out a procurement request form for purchasing items. If the amount exceeds $5000, financial approval is required. | 18.66882028897362 |
| 4 | Meeting Room Reservation | Meeting rooms can be reserved in advance through the OA system, with time and number of participants specified. | 30.90449328294426 |
| 2 | Leave Policy | Employees must apply for leave in the system in advance. If the leave is longer than three days, approval from the direct manager is required. | 31.060405636536416 |
+------+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+