Commit 3c3893e

Link to docs from FAQ (#521)
Co-authored-by: Montana Low <montana.low@gmail.com>
1 parent d0e696c commit 3c3893e

6 files changed, with 74 additions and 48 deletions.

pgml-dashboard/README.md

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+# PostgresML Dashboard
+
+PostgresML provides a dashboard with analytical views of the training data and model performance, as well as integrated notebooks for rapid iteration. It is primarily written in Rust using [Rocket](https://rocket.rs/) as a lightweight web framework and [SQLx](https://github.com/launchbadge/sqlx) to interact with the database.
+
+Please see the [online documentation](https://postgresml.org/user_guides/setup/quick_start_with_docker/) for general information on installing or deploying PostgresML. This document is intended to help developers set up a local copy of the dashboard.
+
+## Requirements
+
+The dashboard requires a Postgres database with the [pgml-extension](https://github.com/postgresml/postgresml/tree/master/pgml-extension) to generate the core schema. See that subproject for developer setup.
+
+We develop and test this web application on Linux, OS X, and Windows using WSL2.
+
+## Build process
+
+You'll need to specify a database url for the extension to interact with via an environment variable:
+
+```commandline
+export DATABASE_URL=postgres://user_name:password@localhost:5432/database_name
+```
+
+Build and run:
+
+```commandline
+cargo run
+```
+
+Incremental and automatic compilation for development cycles is supported with:
+
+```commandline
+cargo watch --exec run
+```
+
+Run tests:
+```commandline
+cargo test
+```
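
The build process in the README above depends on `DATABASE_URL` being exported before `cargo run`. As a minimal sketch, a shell guard like the following could catch a missing or malformed value early. The function `check_database_url` is a hypothetical helper and not part of PostgresML, and the two accepted URL schemes are an assumption about what the connection string looks like.

```bash
# Hypothetical helper (not part of PostgresML): sanity-check the connection
# URL before starting the dashboard.
check_database_url() {
    # Use the first argument if given, otherwise fall back to $DATABASE_URL.
    url="${1:-${DATABASE_URL:-}}"
    case "$url" in
        # Assumption: the URL uses one of the two common Postgres schemes.
        postgres://*|postgresql://*)
            return 0 ;;
        "")
            echo "DATABASE_URL is not set" >&2
            return 1 ;;
        *)
            echo "DATABASE_URL does not look like a Postgres URL: $url" >&2
            return 1 ;;
    esac
}

# Usage: only start the dashboard when the check passes.
# check_database_url && cargo run
```

The check only looks at the scheme prefix; it does not verify that the database is reachable or that the extension is installed.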

pgml-docs/docs/about/faq.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ Postgres is widely considered mission critical, and some of the most [reliable](

 *How good are the models?*

-Model quality is often a trade-off between compute resources and incremental quality improvements. Sometimes a few thousands training examples and an off the shelf algorithm can deliver significant business value after a few seconds of training. PostgresML allows stakeholders to choose several different algorithms to get the most bang for the buck, or invest in more computationally intensive techniques as necessary. In addition, PostgresML automatically applies best practices for data cleaning like imputing missing values by default and normalizing data to prevent common problems in production.
+Model quality is often a trade-off between compute resources and incremental quality improvements. Sometimes a few thousand training examples and an off-the-shelf algorithm can deliver significant business value after a few seconds of training. PostgresML allows stakeholders to choose several [different algorithms](/user_guides/training/algorithm_selection/) to get the most bang for the buck, or invest in more computationally intensive techniques as necessary. In addition, PostgresML can automatically apply best practices for [data cleaning](/user_guides/training/preprocessing/) like imputing missing values by default and normalizing features to prevent common problems in production.

 PostgresML doesn't help with reformulating a business problem into a machine learning problem. Like most things in life, the ultimate in quality will be a concerted effort of experts working over time. PostgresML is intended to establish successful patterns for those experts to collaborate around while leveraging the expertise of open source and research communities.

pgml-docs/docs/user_guides/setup/v2/installation.md

Lines changed: 15 additions & 38 deletions
@@ -216,46 +216,23 @@ That's it, PostgresML is ready. You can validate the installation by running:
 (1 row)
 ```

-## Dashboard
+## Run the dashboard

-The dashboard is a Django application. Installing it requires no special dependencies or commands:
+The dashboard is a web app that can be run against any Postgres database with the extension installed. There is a Dockerfile included with the source code if you wish to run it as a container. Basic installation can be achieved with:

+1. Clone the repo (if you haven't already for the extension):
+```bash
+git clone https://github.com/postgresml/postgresml && cd postgresml/pgml-dashboard
+```

-=== ":material-linux: :material-microsoft: Linux & WSL"
-
-Install Python if you don't have it already:
-
-```bash
-sudo apt-get update && \
-sudo apt-get install python3 python3-pip python3-virtualenv
-```
-
-=== ":material-apple: Mac"
-
-Install Python if you don't have it already:
-
-```bash
-brew install python3
-```
-
-1. Clone our repository (if you haven't already):
-
-```bash
-git clone https://github.com/postgresml/postgresml && \
-cd postgresml/pgml-dashboard
-```
-
-2. Setup a virtual environment (recommended but not required):
-
-```bash
-virtualenv venv && \
-source venv/bin/activate
-```
+2. Set the `DATABASE_URL` environment variable:
+```bash
+export DATABASE_URL=postgres://user_name:password@localhost:5432/database_name
+```

-3. Run the dashboard:
+3. Build and run the web application:
+```bash
+cargo run
+```

-```bash
-pip3 install -r requirements.txt && \
-python manage.py migrate && \
-python manage.py runserver
-```
+The dashboard can be packaged for distribution. You'll need to copy the static files along with the `target/release` directory to your server.
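
The packaging note above says to ship the static files together with `target/release`. A rough sketch of that step, under stated assumptions, might bundle both into one archive. The function `package_dashboard` is a hypothetical helper, and the directory name `static` is an assumption, since the commit does not say where the static files live.

```bash
# Hypothetical helper (not part of PostgresML): bundle a release build and
# the static files into a single archive for copying to a server.
package_dashboard() {
    src="$1"   # path to the pgml-dashboard checkout
    out="$2"   # path of the archive to create
    # Assumption: static assets live in a directory named "static".
    tar -czf "$out" -C "$src" target/release static
}

# Usage, after producing a release build with `cargo build --release`:
# package_dashboard "$PWD" /tmp/pgml-dashboard.tar.gz
```

Archiving relative to the checkout keeps the paths inside the tarball identical to the layout the build produced, so the unpacked copy mirrors the development tree.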

pgml-extension/README.md

Lines changed: 20 additions & 7 deletions
@@ -1,16 +1,16 @@
-# PostgresML 2.0
+# PostgresML Extension

-PostgresML is a PostgreSQL extension providing end-to-end machine learning inside your database. The extension is written in Rust using `tcdi/pgx` and provides LightGBM, XGBoost and [Linfa](https://github.com/rust-ml/linfa) algorithms.
+PostgresML is a PostgreSQL extension providing end-to-end machine learning inside your database. The extension is primarily written in Rust using [pgx](https://github.com/tcdi/pgx) and provides a SQL interface to various machine learning algorithm implementations such as [XGBoost](https://github.com/dmlc/xgboost), [LightGBM](https://github.com/microsoft/LightGBM), and [other classical methods](https://github.com/rust-ml/linfa).

-A backwards compatibility layer to Scikit-learn is provided as well, so the entirety of Scikit, XGBoost and LightGBM are available via the standard Scikit interface using Python. The Python layer is written using `pyo3`.
+Python seems to be the de facto ML industry standard, so we also include "reference" implementations of classical algorithms from Scikit-learn for comparison with the Rust implementations in terms of runtime performance and correctness. The Python integration is written using `pyo3`.

-See [our blog](https://postgresml.org/blog/postgresml-is-moving-to-rust-for-our-2.0-release/) for a performance comparison to Python.
+See [our blog](https://postgresml.org/blog/postgresml-is-moving-to-rust-for-our-2.0-release/) for a performance comparison and further motivations.

 ## Requirements

-PostgresML 2.0 requires Python 3.7 or above and the Rust compiler and toolchain. You can download the Rust compiler [here](https://rust-lang.org).
+PostgresML requires Python 3.7 or above and the Rust compiler and toolchain. You can download the Rust compiler [here](https://rust-lang.org).

-We develop this extension on Ubuntu, so it'll work best there, but it's very likely to work on other distros as well. Windows is only supported through WSL2. It's been tested and it works. Mac OS is also supported.
+We develop and test this extension on Linux, OS X, and Windows using WSL2.

 ## Dependencies

@@ -36,7 +36,7 @@ If your system comes with Python 3.6 or lower, you'll need to install `libpython

 ## Update postgresql.conf

-PostgresML 2.0 requires to be loaded as a shared library. For local development, this is in `~/.pgx/data-13/postgresql.conf`:
+PostgresML needs to be loaded as a shared library. For local development, this is in `~/.pgx/data-13/postgresql.conf`:

 ```
 shared_preload_libraries = 'pgml' # (change requires restart)
@@ -54,6 +54,19 @@ shared_preload_libraries = 'pgml' # (change requires restart)
 7. `SELECT * FROM pgml.train('Project name', 'regression', 'pgml.diabetes', 'target', 'xgboost');`
 8. `SELECT target, pgml.predict('Project name', ARRAY[age, sex, bmi, bp, s1, s2, s3, s4, s5, s6]) FROM pgml.diabetes LIMIT 10;`

+## Testing
+
+Run unit tests:
+```commandline
+cargo test
+```
+
+Run integration tests:
+```commandline
+cargo pgx run --release
+psql -h localhost -p 28813 -d pgml -f tests/test.sql -P pager
+```
+
 ## Packaging

 This requires Docker. Once Docker is installed, you can run:

pgml-extension/examples/image_classification.sql

Lines changed: 1 addition & 1 deletion
@@ -95,6 +95,6 @@ SELECT * FROM pgml.deploy('Handwritten Digits', 'rollback');
 SELECT * FROM pgml.deploy('Handwritten Digits', 'best_score', 'svm');

 -- check out the improved predictions
-SELECT target, pgml.predict('Handwritten Digits', image) AS prediction
+SELECT target, pgml.predict('Handwritten Digits', image::FLOAT4[]) AS prediction
 FROM pgml.digits
 LIMIT 10;

pgml-extension/tests/test.sql

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 --- Usage:
 ---
 --- $ cargo pgx run --release
---- $ psql -P pager-off -h localhost -p 28813 -d pgml -f tests/test.sql
+--- $ psql -h localhost -p 28813 -d pgml -f tests/test.sql -P pager
 ---
 \set ON_ERROR_STOP true
 \timing on
