Performing embedding inference as part of an UPDATE crashes the server within the docker container with an "illegal instruction" #1515

Open
@johnwthomson

It looks like pgvector needs to be built for the host machine inside the container on first boot. Otherwise, "illegal instruction" errors crash the database.

I can reproduce the error as follows. This UPDATE crashes the server:

UPDATE document_chunks
SET embedding = pgml.embed('intfloat/e5-large-v2', 'passage: ' || chunk_text),
    embedding_model = 'intfloat/e5-large-v2'
WHERE chunk_id = 1;

The query fails with:

error communicating with database: unexpected end of file

In contrast, the following SELECT

SELECT pgml.embed('intfloat/e5-large-v2', 'passage: ' || chunk_text) 
FROM document_chunks 
WHERE chunk_id = 1;

generates the embedding as expected, so the crash only occurs when the result is written to the embedding column.

I've pulled /var/log/postgresql/postgresql-15-main.log and found the following error response:

2024-06-06 23:21:44.980 UTC [22] LOG:  server process (PID 3633) was terminated by signal 4: Illegal instruction
2024-06-06 23:21:44.980 UTC [22] DETAIL:  Failed process was running:
                UPDATE document_chunks
                SET
                    embedding = pgml.embed('intfloat/e5-large-v2', 'passage: ' || chunk_text),
                    embedding_model = 'intfloat/e5-large-v2'
                WHERE
                    chunk_id = 1;

I'm running a brand new database in Ubuntu using the following command:
docker run -d --name postgresml -v /storage/data/postgresml_data/_data2/:/var/lib/postgresql/ --gpus "device=1" -p 5433:5432 -p 8000:8000 ghcr.io/postgresml/postgresml:2.8.2 bash -c "sudo -u postgresml bash -c 'while true; do sleep 1000; done'"

After some discussion with Lev on Discord, I captured a stack trace from the crashing backend process, which showed the following:

Thread 1 "postgres" received signal SIGILL, Illegal instruction.
0x00007cb54708e8ae in array_to_vector (fcinfo=<optimized out>) at src/vector.c:501
501                             result->x[i] = DatumGetFloat4(elemsp[i]);

Solution / workaround: Compile pgvector inside the Docker container. Rebuilding it from source fixed the issue.
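For anyone hitting the same crash, the rebuild can be sketched roughly as below. This is a sketch under assumptions, not the exact commands used here: the pgvector tag (`v0.7.0`), the apt package names, the `/tmp/pgvector` build directory, and the `run` / `rebuild_pgvector` helper names are all illustrative and should be adjusted to match the pgvector version the image actually ships.

```shell
#!/usr/bin/env bash
# Sketch: rebuild pgvector from source inside the running container (as root).
# ASSUMPTIONS (not from the report): the tag, the package names, the build dir.
PGVECTOR_TAG="${PGVECTOR_TAG:-v0.7.0}"   # hypothetical; match the installed extension version

run() {
    # With DRY_RUN=1, print each command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rebuild_pgvector() {
    # Install a toolchain and the PostgreSQL 15 server headers, then build and install.
    run apt-get update &&
    run apt-get install -y build-essential git postgresql-server-dev-15 &&
    run git clone --branch "$PGVECTOR_TAG" https://github.com/pgvector/pgvector.git /tmp/pgvector &&
    run make -C /tmp/pgvector &&
    run make -C /tmp/pgvector install
}
```

Inside the container (for example via `docker exec -it postgresml bash`), source the script and call `rebuild_pgvector`, or run it with `DRY_RUN=1` first to see the commands. PostgreSQL probably needs a restart afterwards so that backends load the new library. As far as I know, pgvector's Makefile compiles with `-march=native` by default, which would explain why a build on the host CPU behaves differently from a binary built elsewhere.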

Recommended action: Please update the Docker image so that it compiles pgvector on first boot, ensuring the binary matches the host processor. (I'm running an 8-year-old i7 Extreme processor, which is presumably incompatible with the prebuilt binary.)
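To check whether a given host is likely to be affected, one option is to list which SIMD extensions the CPU advertises. The sketch below assumes a Linux x86 host with `/proc/cpuinfo`. The set of flags it checks is a guess, since the trace does not show which instruction was actually illegal, and `has_cpu_flag` and `CPUINFO` are names made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: report which SIMD extensions the CPU advertises.
# CPUINFO can be overridden to point at a saved copy of /proc/cpuinfo.
CPUINFO="${CPUINFO:-/proc/cpuinfo}"

has_cpu_flag() {
    # Succeeds if the first "flags" line lists the flag as a whole word.
    grep -m1 '^flags' "$CPUINFO" | grep -qw -- "$1"
}

# ASSUMPTION: these are plausible candidates, not a confirmed list.
for flag in sse4_2 avx avx2 fma avx512f; do
    if has_cpu_flag "$flag"; then
        echo "$flag: supported"
    else
        echo "$flag: missing"
    fi
done
```

A flag reported as missing on the host would be consistent with a binary that was compiled on a newer build machine and uses instructions the host cannot execute.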
