Description
I found a specific query that crashes the PostgreSQL server process.
I reproduced this on PostgreSQL 15 and 16 on my Debian installation, and it also crashes PostgreSQL in the latest Docker image (ghcr.io/postgresml/postgresml:2.7.3).
To reproduce, run the following statements in order:
SELECT pgml.load_dataset('ag_news');
DROP TABLE IF EXISTS phrases2;
CREATE TABLE phrases2 (
id serial PRIMARY KEY,
phrase text,
embedding vector(384)
);
INSERT INTO phrases2 (phrase, embedding)
SELECT text, pgml.embed('all-MiniLM-L6-v2', text)::vector
FROM pgml.ag_news
LIMIT 10000;
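As an optional sanity check (not part of the original reproduction), the import can be verified before reconnecting. This is a minimal sketch that assumes the pgvector function `vector_dims` is available, which should be the case wherever the `vector` type is installed:

```sql
-- Optional check, not part of the original report:
-- confirm how many rows were imported and that the stored embeddings have the declared dimension.
SELECT count(*) AS imported_rows
FROM phrases2;

SELECT vector_dims(embedding) AS dims
FROM phrases2
LIMIT 1;
```

Given the `LIMIT 10000` and the `vector(384)` column, the first query should report at most 10000 rows and the second should report 384.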
After the import into phrases2 completes, exit the psql client and start a new psql session. Then run this query:
WITH Embeddings AS (
SELECT
pgml.embed('all-MiniLM-L6-v2', p.phrase) AS embedding,
id
FROM
phrases2 p
limit 45
)
SELECT
p.id,
p.phrase,
1 - (p.embedding <=> e.embedding::vector) AS similarity
FROM
Embeddings e
JOIN
phrases2 p ON true
ORDER BY
(p.embedding <=> e.embedding::vector)
limit 10;
This aborts the server process and produces the following log output:
: CommandLine Error: Option 'nvptx-no-f16-math' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
2023-11-29 20:30:51.296 UTC [29] LOG: server process (PID 162) was terminated by signal 6: Aborted
I found that initializing the embedding model first mitigates the bug. If you start a new psql session and run the following, the crash does not occur:
SELECT pgml.embed('all-MiniLM-L6-v2', 'initialize the framework')::vector;
WITH Embeddings AS (
SELECT
pgml.embed('all-MiniLM-L6-v2', p.phrase) AS embedding,
id
FROM
phrases2 p
limit 45
)
SELECT
p.id,
p.phrase,
1 - (p.embedding <=> e.embedding::vector) AS similarity
FROM
Embeddings e
JOIN
phrases2 p ON true
ORDER BY
(p.embedding <=> e.embedding::vector)
limit 10;
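A possible further diagnostic, offered only as an untested hypothesis: the "LLVM ERROR" lines suggest that two copies of LLVM may be registering the same options in one backend process, and PostgreSQL's own JIT compiler is LLVM-based. Whether JIT is actually involved here is an assumption, not something this report confirms. A sketch for checking it in a fresh session, using the standard `jit` setting:

```sql
-- Assumption (unconfirmed): the abort is related to PostgreSQL's LLVM-based JIT.
SHOW jit;        -- report whether JIT compilation is enabled for this session
SET jit = off;   -- disable JIT for the current session only

-- Then re-run the failing query from above in this same session,
-- without the warm-up pgml.embed call, and see whether the abort still occurs.
```

If the query then completes, that would point toward the JIT interaction; if it still aborts, the hypothesis is wrong and the warm-up call remains the only known mitigation.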