Commit 9ca66d9

add careers
1 parent 961c1b1 commit 9ca66d9

159 files changed

+173
-127
lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -108,7 +108,7 @@ SELECT pgml.transform(
 ```
 
 ## Tabular data
-- [47+ classification and regression algorithms](https://postgresml.org/docs/guides/training/algorithm_selection)
+- [47+ classification and regression algorithms](https://postgresml.org/docs/training/algorithm_selection)
 - [8 - 40X faster inference than HTTP based model serving](https://postgresml.org/blog/postgresml-is-8x-faster-than-python-http-microservices)
 - [Millions of transactions per second](https://postgresml.org/blog/scaling-postgresml-to-one-million-requests-per-second)
 - [Horizontal scalability](https://github.com/postgresml/pgcat)
@@ -154,7 +154,7 @@ docker run \
 sudo -u postgresml psql -d postgresml
 ```
 
-For more details, take a look at our [Quick Start with Docker](https://postgresml.org/docs/guides/developer-docs/quick-start-with-docker) documentation.
+For more details, take a look at our [Quick Start with Docker](https://postgresml.org/docs/developer-docs/quick-start-with-docker) documentation.
 
 # Getting Started
 

packages/cargo-pgml-components/src/local_dev.rs

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ static PG_PGVECTOR: &str = "
 static PG_PGML: &str = "To install PostgresML into your PostgreSQL database,
 follow the instructions on:
 
-\thttps://postgresml.org/docs/guides/setup/v2/installation
+\thttps://postgresml.org/docs/setup/v2/installation
 ";
 
 #[cfg(target_os = "linux")]

pgml-dashboard/README.md

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@
 
 PostgresML provides a dashboard with analytical views of the training data and model performance, as well as integrated notebooks for rapid iteration. It is primarily written in Rust using [Rocket](https://rocket.rs/) as a lightweight web framework and [SQLx](https://github.com/launchbadge/sqlx) to interact with the database.
 
-Please see the [quick start instructions](https://postgresml.org/docs/guides/getting-started/sign-up) for general information on installing or deploying PostgresML. A [developer guide](https://postgresml.org/developer_guide/overview/) is also available for those who would like to contribute.
+Please see the [quick start instructions](https://postgresml.org/docs/getting-started/sign-up) for general information on installing or deploying PostgresML. A [developer guide](https://postgresml.org/developer_guide/overview/) is also available for those who would like to contribute.

pgml-dashboard/content/blog/generating-llm-embeddings-with-open-source-models-in-postgresml.md

Lines changed: 1 addition & 1 deletion
@@ -121,7 +121,7 @@ LIMIT 5;
 
 ## Generating embeddings from natural language text
 
-PostgresML provides a simple interface to generate embeddings from text in your database. You can use the [`pgml.embed`](https://postgresml.org/docs/guides/transformers/embeddings) function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached on your connection process for reuse. You can see a list of potential good candidate models to generate embeddings on the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
+PostgresML provides a simple interface to generate embeddings from text in your database. You can use the [`pgml.embed`](https://postgresml.org/docs/transformers/embeddings) function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached on your connection process for reuse. You can see a list of potential good candidate models to generate embeddings on the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
 
 Since our corpus of documents (movie reviews) are all relatively short and similar in style, we don't need a large model. <code>[intfloat/e5-small](https://huggingface.co/intfloat/e5-small)</code> will be a good first attempt. The great thing about PostgresML is you can always regenerate your embeddings later to experiment with different embedding models.
 

pgml-dashboard/content/docs/about/faq.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ Postgres is widely considered mission critical, and some of the most [reliable](
 
 *How good are the models?*
 
-Model quality is often a trade-off between compute resources and incremental quality improvements. Sometimes a few thousands training examples and an off the shelf algorithm can deliver significant business value after a few seconds of training. PostgresML allows stakeholders to choose several [different algorithms](/docs/guides/training/algorithm_selection/) to get the most bang for the buck, or invest in more computationally intensive techniques as necessary. In addition, PostgresML can automatically apply best practices for [data cleaning](/docs/guides/training/preprocessing/)) like imputing missing values by default and normalizing features to prevent common problems in production.
+Model quality is often a trade-off between compute resources and incremental quality improvements. Sometimes a few thousands training examples and an off the shelf algorithm can deliver significant business value after a few seconds of training. PostgresML allows stakeholders to choose several [different algorithms](/docs/training/algorithm_selection/) to get the most bang for the buck, or invest in more computationally intensive techniques as necessary. In addition, PostgresML can automatically apply best practices for [data cleaning](/docs/training/preprocessing/)) like imputing missing values by default and normalizing features to prevent common problems in production.
 
 PostgresML doesn't help with reformulating a business problem into a machine learning problem. Like most things in life, the ultimate in quality will be a concerted effort of experts working over time. PostgresML is intended to establish successful patterns for those experts to collaborate around while leveraging the expertise of open source and research communities.
 

pgml-dashboard/content/docs/guides/dashboard/overview.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Dashboard
 
-PostgresML comes with a web app to provide visibility into models and datasets in your database. If you're running [our Docker container](/docs/guides/developer-docs/quick-start-with-docker), you can view it running on [http://localhost:8000/](http://localhost:8000/).
+PostgresML comes with a web app to provide visibility into models and datasets in your database. If you're running [our Docker container](/docs/developer-docs/quick-start-with-docker), you can view it running on [http://localhost:8000/](http://localhost:8000/).
 
 
 ## Generate example data

pgml-dashboard/content/docs/guides/predictions/overview.md

Lines changed: 2 additions & 2 deletions
@@ -51,7 +51,7 @@ LIMIT 25;
 
 ### Example
 
-If you've already been through the [Training Overview](/docs/guides/training/overview/), you can see the results of those efforts:
+If you've already been through the [Training Overview](/docs/training/overview/), you can see the results of those efforts:
 
 === "SQL"
 
@@ -106,7 +106,7 @@ SELECT * FROM pgml.deployed_models;
 
 PostgresML will automatically deploy a model only if it has better metrics than existing ones, so it's safe to experiment with different algorithms and hyperparameters.
 
-Take a look at [Deploying Models](/docs/guides/predictions/deployments/) documentation for more details.
+Take a look at [Deploying Models](/docs/predictions/deployments/) documentation for more details.
 
 ## Specific Models
 

pgml-dashboard/content/docs/guides/schema/deployments.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Deployments
 
-Deployments are an artifact of calls to `pgml.deploy()` and `pgml.train()`. See [Deployments](/docs/guides/predictions/deployments/) for ways to create new deployments manually.
+Deployments are an artifact of calls to `pgml.deploy()` and `pgml.train()`. See [Deployments](/docs/predictions/deployments/) for ways to create new deployments manually.
 
 ![Deployment](/dashboard/static/images/dashboard/deployment.png)
 

pgml-dashboard/content/docs/guides/schema/models.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Models
 
-Models are an artifact of calls to `pgml.train()`. See [Training Overview](/docs/guides/training/overview/) for ways to create new models.
+Models are an artifact of calls to `pgml.train()`. See [Training Overview](/docs/training/overview/) for ways to create new models.
 
 ![Models](/dashboard/static/images/dashboard/model.png)
 

pgml-dashboard/content/docs/guides/schema/projects.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Projects
 
-Projects are an artifact of calls to `pgml.train()`. See [Training Overview](/docs/guides/training/overview/) for ways to create new projects.
+Projects are an artifact of calls to `pgml.train()`. See [Training Overview](/docs/training/overview/) for ways to create new projects.
 
 ![Projects](/dashboard/static/images/dashboard/project.png)
 

pgml-dashboard/content/docs/guides/schema/snapshots.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Snapshots
 
-Snapshots are an artifact of calls to `pgml.train()` that specify the `relation_name` and `y_column_name` parameters. See [Training Overview](/docs/guides/training/overview/) for ways to create new snapshots.
+Snapshots are an artifact of calls to `pgml.train()` that specify the `relation_name` and `y_column_name` parameters. See [Training Overview](/docs/training/overview/) for ways to create new snapshots.
 
 ![Snapshots](/dashboard/static/images/dashboard/snapshot.png)
 

pgml-dashboard/content/docs/guides/setup/distributed_training.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ psql \
 -f dump.sql
 ```
 
-If you're using our <a href="/docs/guides/developer-docs/quick-start-with-docker">Docker</a> stack, you can import the data there:</p>
+If you're using our <a href="/docs/developer-docs/quick-start-with-docker">Docker</a> stack, you can import the data there:</p>
 
 ```
 psql \

pgml-dashboard/content/docs/guides/setup/gpu_support.md

Lines changed: 3 additions & 3 deletions
@@ -9,13 +9,13 @@ Models trained on GPU may also require GPU support to make predictions. Consult
 !!!
 
 ## Tensorflow
-GPU setup for Tensorflow is covered in the [documentation](https://www.tensorflow.org/install/pip). You may acquire pre-trained GPU enabled models for fine tuning from [Hugging Face](/docs/guides/transformers/fine_tuning/).
+GPU setup for Tensorflow is covered in the [documentation](https://www.tensorflow.org/install/pip). You may acquire pre-trained GPU enabled models for fine tuning from [Hugging Face](/docs/transformers/fine_tuning/).
 
 ## Torch
-GPU setup for Torch is covered in the [documentation](https://pytorch.org/get-started/locally/). You may acquire pre-trained GPU enabled models for fine tuning from [Hugging Face](/docs/guides/transformers/fine_tuning/).
+GPU setup for Torch is covered in the [documentation](https://pytorch.org/get-started/locally/). You may acquire pre-trained GPU enabled models for fine tuning from [Hugging Face](/docs/transformers/fine_tuning/).
 
 ## Flax
-GPU setup for Flax is covered in the [documentation](https://github.com/google/jax#pip-installation-gpu-cuda). You may acquire pre-trained GPU enabled models for fine tuning from [Hugging Face](/docs/guides/transformers/fine_tuning/).
+GPU setup for Flax is covered in the [documentation](https://github.com/google/jax#pip-installation-gpu-cuda). You may acquire pre-trained GPU enabled models for fine tuning from [Hugging Face](/docs/transformers/fine_tuning/).
 
 ## XGBoost
 GPU setup for XGBoost is covered in the [documentation](https://xgboost.readthedocs.io/en/stable/gpu/index.html).

pgml-dashboard/content/docs/guides/setup/installation.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 !!! note
 
-With the release of PostgresML 2.0, this documentation has been deprecated. New installation instructions are <a href="/docs/guides/setup/v2/installation/">available</a>.
+With the release of PostgresML 2.0, this documentation has been deprecated. New installation instructions are <a href="/docs/setup/v2/installation/">available</a>.
 
 !!!
 

pgml-dashboard/content/docs/guides/setup/quick_start_with_docker.md

Lines changed: 1 addition & 1 deletion
@@ -278,7 +278,7 @@ The following common machine learning tasks are performed automatically by Postg
 4. Save it into the model store (a Postgres table)
 5. Load it and cache it during inference
 
-Check out our [Training](/docs/guides/training/overview/) and [Predictions](/docs/guides/predictions/overview/) documentation for more details. Some more advanced topics like [hyperparameter search](/docs/guides/training/hyperparameter_search/) and [GPU acceleration](/docs/guides/setup/gpu_support/) are available as well.
+Check out our [Training](/docs/training/overview/) and [Predictions](/docs/predictions/overview/) documentation for more details. Some more advanced topics like [hyperparameter search](/docs/training/hyperparameter_search/) and [GPU acceleration](/docs/setup/gpu_support/) are available as well.
 
 ## Dashboard
 

pgml-dashboard/content/docs/guides/setup/v2/installation.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ The extension can be installed by compiling it from source, or if you're using U
 
 !!! tip
 
-If you're just looking to try PostgresML without installing it on your system, take a look at our [Quick Start with Docker](/docs/guides/developer-docs/quick-start-with-docker) guide.
+If you're just looking to try PostgresML without installing it on your system, take a look at our [Quick Start with Docker](/docs/developer-docs/quick-start-with-docker) guide.
 
 !!!
 

pgml-dashboard/content/docs/guides/setup/v2/upgrade-from-v1.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ The API is identical between v1.0 and v2.0, and models trained with v1.0 can be
 
 !!! note
 
-Make sure you've set up the system requirements in [v2.0 installation](/docs/guides/setup/v2/installation/), so that the v2.0 extension may be installed.
+Make sure you've set up the system requirements in [v2.0 installation](/docs/setup/v2/installation/), so that the v2.0 extension may be installed.
 
 !!!
 

pgml-dashboard/content/docs/guides/training/hyperparameter_search.md

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ SELECT * FROM pgml.train(
 
 !!!
 
-You may pass any of the arguments listed in the algorithms documentation as hyperparameters. See [Algorithms](/docs/guides/training/algorithm_selection/) for the complete list of algorithms and their associated hyperparameters.
+You may pass any of the arguments listed in the algorithms documentation as hyperparameters. See [Algorithms](/docs/training/algorithm_selection/) for the complete list of algorithms and their associated hyperparameters.
 
 ### Search Algorithms
 

pgml-dashboard/content/docs/guides/training/overview.md

Lines changed: 4 additions & 4 deletions
@@ -30,9 +30,9 @@ pgml.train(
 | `task` | The objective of the experiment: `regression` or `classification`. | `classification` |
 | `relation_name` | The Postgres table or view where the training data is stored or defined. | `public.users` |
 | `y_column_name` | The name of the label (aka "target" or "unknown") column in the training table. | `is_bot` |
-| `algorithm` | The algorithm to train on the dataset, see [Algorithm Selection](/docs/guides/training/algorithm_selection/) for details. | `xgboost` |
+| `algorithm` | The algorithm to train on the dataset, see [Algorithm Selection](/docs/training/algorithm_selection/) for details. | `xgboost` |
 | `hyperparams ` | The hyperparameters to pass to the algorithm for training, JSON formatted. | `{ "n_estimators": 25 }` |
-| `search` | If set, PostgresML will perform a hyperparameter search to find the best hyperparameters for the algorithm. See [Hyperparameter Search](/docs/guides/training/hyperparameter_search/) for details. | `grid` |
+| `search` | If set, PostgresML will perform a hyperparameter search to find the best hyperparameters for the algorithm. See [Hyperparameter Search](/docs/training/hyperparameter_search/) for details. | `grid` |
 | `search_params` | Search parameters used in the hyperparameter search, using the scikit-learn notation, JSON formatted. | ```{ "n_estimators": [5, 10, 25, 100] }``` |
 | `search_args` | Configuration parameters for the search, JSON formatted. Currently only `n_iter` is supported for `random` search. | `{ "n_iter": 10 }` |
 | `test_size ` | Fraction of the dataset to use for the test set and algorithm validation. | `0.25` |
@@ -136,7 +136,7 @@ target |
 
 ## Training a Model
 
-Now that we've got data, we're ready to train a model using an algorithm. We'll start with the default `linear` algorithm to demonstrate the basics. See the [Algorithms](/docs/guides/training/algorithm_selection/) for a complete list of available algorithms.
+Now that we've got data, we're ready to train a model using an algorithm. We'll start with the default `linear` algorithm to demonstrate the basics. See the [Algorithms](/docs/training/algorithm_selection/) for a complete list of available algorithms.
 
 
 === "SQL"
@@ -177,7 +177,7 @@ INFO: Metrics: {
 ===
 
 
-The output gives us information about the training run, including the `deployed` status. This is great news indicating training has successfully reached a new high score for the project's key metric and our new model was automatically deployed as the one that will be used to make new predictions for the project. See [Deployments](/docs/guides/predictions/deployments/) for a guide to managing the active model.
+The output gives us information about the training run, including the `deployed` status. This is great news indicating training has successfully reached a new high score for the project's key metric and our new model was automatically deployed as the one that will be used to make new predictions for the project. See [Deployments](/docs/predictions/deployments/) for a guide to managing the active model.
 
 ## Inspecting the results
 Now we can inspect some of the artifacts a training run creates.

pgml-dashboard/content/docs/guides/training/preprocessing.md

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ There are 3 steps to preprocessing data:
 - [Imputing](#imputing-missing-values) NULL values to some quantitative value
 - [Scaling](#scaling-values) quantitative values across all variables to similar ranges
 
-These preprocessing steps may be specified on a per-column basis to the [train()](/docs/guides/training/overview/) function. By default, PostgresML does minimal preprocessing on training data, and will raise an error during analysis if NULL values are encountered without a preprocessor. All types other than `TEXT` are treated as quantitative variables and cast to floating point representations before passing them to the underlying algorithm implementations.
+These preprocessing steps may be specified on a per-column basis to the [train()](/docs/training/overview/) function. By default, PostgresML does minimal preprocessing on training data, and will raise an error during analysis if NULL values are encountered without a preprocessor. All types other than `TEXT` are treated as quantitative variables and cast to floating point representations before passing them to the underlying algorithm implementations.
 
 ```postgresql title="pgml.train()"
 SELECT pgml.train(

pgml-dashboard/content/docs/guides/transformers/pre_trained_models.md

Lines changed: 2 additions & 2 deletions
@@ -94,7 +94,7 @@ SELECT pgml.transform(
 
 ===
 
-See [text classification documentation](https://huggingface.co/tasks/text-classification) for more options and potential use cases beyond sentiment analysis. You'll notice the outputs are not great in this example. RoBERTa is a breakthrough model, that demonstrated just how important each particular hyperparameter is for the task and particular dataset regardless of how large your model is. We'll show how to [fine tune](/docs/guides/transformers/fine_tuning/) models on your data in the next step.
+See [text classification documentation](https://huggingface.co/tasks/text-classification) for more options and potential use cases beyond sentiment analysis. You'll notice the outputs are not great in this example. RoBERTa is a breakthrough model, that demonstrated just how important each particular hyperparameter is for the task and particular dataset regardless of how large your model is. We'll show how to [fine tune](/docs/transformers/fine_tuning/) models on your data in the next step.
 
 ### Summarization
 Sometimes we need all the nuanced detail, but sometimes it's nice to get to the point. Summarization can reduce a very long and complex document to a few sentences. One studied application is reducing legal bills passed by Congress into a plain english summary. Hollywood may also need some intelligence to reduce a full synopsis down to a pithy blurb for movies like Inception.
@@ -225,4 +225,4 @@ SELECT pgml.transform(
 ===
 
 ### More
-There are many different [tasks](https://huggingface.co/tasks) and tens of thousands of state-of-the-art [models](https://huggingface.co/models) available for you to explore. The possibilities are expanding every day. There can be amazing performance improvements in domain specific versions of these general tasks by fine tuning published models on your dataset. See the next section for [fine tuning](/docs/guides/transformers/fine_tuning/) demonstrations.
+There are many different [tasks](https://huggingface.co/tasks) and tens of thousands of state-of-the-art [models](https://huggingface.co/models) available for you to explore. The possibilities are expanding every day. There can be amazing performance improvements in domain specific versions of these general tasks by fine tuning published models on your dataset. See the next section for [fine tuning](/docs/transformers/fine_tuning/) demonstrations.
