Editor's notes #17
@@ -2,44 +2,143 @@
- PostgresML is a Proof of Concept to create the simplest end-to-end machine learning system. We're building on the shoulders of giants, namely Postgres which is arguably the most robust storage and compute engine that exists, and we're coupling that with Python machine learning libraries (and their c implementations) to prototype different machine learning workflows.
+ PostgresML is an end-to-end machine learning system. Using only SQL, it allows to train models and run online predictions, alongside normal queries, directly using the data in your databases.
I like proof of concept. It seems to have played well in your pgcat debut. I think name dropping Postgres/Python helps buy some credibility that we're not completely insanely trying to build this from scratch.
Yeah that's true, fair enough
pgml is not compatible with plpython: if both pgml and plpython are used in the same session, PostgreSQL crashes. Minimal reproducible code:

```sql
SELECT pgml.embed('intfloat/e5-small', 'hi mom');
create or replace function pyudf() returns int as $$ return 0 $$ language 'plpython3u';
```

The call stack:

```
Stack trace of thread 161970:
#0  0x00007efc1429edb8 PyImport_Import (libpython3.9.so.1.0 + 0x9edb8)
#1  0x00007efc1429f125 PyImport_ImportModule (libpython3.9.so.1.0 + 0x9f125)
#2  0x00007efb04b0f496 n/a (plpython3.so + 0x10496)
#3  0x00007efb04b1039d plpython3_validator (plpython3.so + 0x1139d)
#4  0x0000559d0cdbc5c2 OidFunctionCall1Coll (postgres + 0x6465c2)
#5  0x0000559d0c9d68bb ProcedureCreate (postgres + 0x2608bb)
#6  0x0000559d0ca5030c CreateFunction (postgres + 0x2da30c)
#7  0x0000559d0ce1c730 n/a (postgres + 0x6a6730)
#8  0x0000559d0cc5a030 standard_ProcessUtility (postgres + 0x4e4030)
#9  0x0000559d0cc545ed n/a (postgres + 0x4de5ed)
#10 0x0000559d0cc546e7 n/a (postgres + 0x4de6e7)
#11 0x0000559d0cc54beb PortalRun (postgres + 0x4debeb)
#12 0x0000559d0cc55249 n/a (postgres + 0x4df249)
#13 0x0000559d0cc576f0 PostgresMain (postgres + 0x4e16f0)
#14 0x0000559d0cbc3e9c n/a (postgres + 0x44de9c)
#15 0x0000559d0cbc50aa PostmasterMain (postgres + 0x44f0aa)
#16 0x0000559d0c8ce7d2 main (postgres + 0x1587d2)
#17 0x00007efc18427cd0 n/a (libc.so.6 + 0x27cd0)
#18 0x00007efc18427d8a __libc_start_main (libc.so.6 + 0x27d8a)
#19 0x0000559d0c8cee15 _start (postgres + 0x158e15)
```

This happens because PostgreSQL loads libraries with dlopen(RTLD_GLOBAL). Some symbols are then resolved into the previously opened .so file, while others use a relative offset in pgml.so, which causes a null-pointer crash.

This commit hides all symbols except the UDF symbols (those ending with `_wrapper`) and the magic symbols (`_PG_init`, `Pg_magic_func`), so that dlopen(RTLD_GLOBAL) resolves the symbols to the correct location.
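The export rule in this commit message (keep symbols ending in `_wrapper` plus `_PG_init` and `Pg_magic_func`, hide everything else) can be sketched as a small filter that also renders an anonymous GNU ld version script. This is only an illustration of the rule: the commit message does not say which mechanism the commit actually uses, and the names `EXPORT_PATTERNS`, `is_exported`, and `render_version_script` are hypothetical.

```python
import fnmatch

# Patterns taken from the commit message: the two magic symbols and
# every UDF symbol ending in `_wrapper`. Everything else stays local.
EXPORT_PATTERNS = ["_PG_init", "Pg_magic_func", "*_wrapper"]


def is_exported(symbol: str) -> bool:
    """Return True if the symbol would stay globally visible under the rule."""
    return any(fnmatch.fnmatchcase(symbol, p) for p in EXPORT_PATTERNS)


def render_version_script(patterns=EXPORT_PATTERNS) -> str:
    """Render an anonymous GNU ld version script for the given patterns.

    Assumption: a version script is one possible way to hide symbols;
    it is not confirmed to be what the commit does.
    """
    lines = ["{", "  global:"]
    lines += [f"    {p};" for p in patterns]
    lines += ["  local:", "    *;", "};"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for name in ["embed_wrapper", "_PG_init", "PyImport_Import"]:
        print(name, "exported" if is_exported(name) else "hidden")
    print(render_version_script())
```

With a rule like this, a Python C-API symbol such as `PyImport_Import` would no longer be exported from the extension's shared object, which is the kind of symbol that appears at the top of the stack trace above.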
Some edits on the README.