add careers #1176

Merged 9 commits on Nov 27, 2023
26 changes: 13 additions & 13 deletions README.md
@@ -108,7 +108,7 @@ SELECT pgml.transform(
```

## Tabular data
-- [47+ classification and regression algorithms](https://postgresml.org/docs/guides/training/algorithm_selection)
+- [47+ classification and regression algorithms](https://postgresml.org/docs/training/algorithm_selection)
- [8 - 40X faster inference than HTTP based model serving](https://postgresml.org/blog/postgresml-is-8x-faster-than-python-http-microservices)
- [Millions of transactions per second](https://postgresml.org/blog/scaling-postgresml-to-one-million-requests-per-second)
- [Horizontal scalability](https://github.com/postgresml/pgcat)
@@ -154,7 +154,7 @@ docker run \
sudo -u postgresml psql -d postgresml
```

-For more details, take a look at our [Quick Start with Docker](https://postgresml.org/docs/guides/developer-docs/quick-start-with-docker) documentation.
+For more details, take a look at our [Quick Start with Docker](https://postgresml.org/docs/developer-docs/quick-start-with-docker) documentation.

# Getting Started

@@ -214,7 +214,7 @@ SELECT pgml.transform(

Text classification involves assigning a label or category to a given text. Common use cases include sentiment analysis, natural language inference, and the assessment of grammatical correctness.

-![text classification](pgml-docs/docs/images/text-classification.png)
+![text classification](pgml-cms/docs/images/text-classification.png)

### Sentiment Analysis
Sentiment analysis is a type of natural language processing technique that involves analyzing a piece of text to determine the sentiment or emotion expressed within it. It can be used to classify a text as positive, negative, or neutral, and has a wide range of applications in fields such as marketing, customer service, and political analysis.
@@ -383,7 +383,7 @@ SELECT pgml.transform(
## Zero-Shot Classification
Zero Shot Classification is a task where the model predicts a class that it hasn't seen during the training phase. This task leverages a pre-trained language model and is a type of transfer learning. Transfer learning involves using a model that was initially trained for one task in a different application. Zero Shot Classification is especially helpful when there is a scarcity of labeled data available for the specific task at hand.

-![zero-shot classification](pgml-docs/docs/images/zero-shot-classification.png)
+![zero-shot classification](pgml-cms/docs/images/zero-shot-classification.png)

In the example provided below, we will demonstrate how to classify a given sentence into a class that the model has not encountered before. To achieve this, we make use of `args` in the SQL query, which allows us to provide `candidate_labels`. You can customize these labels to suit the context of your task. We will use `facebook/bart-large-mnli` model.

@@ -417,7 +417,7 @@ SELECT pgml.transform(
## Token Classification
Token classification is a task in natural language understanding, where labels are assigned to certain tokens in a text. Some popular subtasks of token classification include Named Entity Recognition (NER) and Part-of-Speech (PoS) tagging. NER models can be trained to identify specific entities in a text, such as individuals, places, and dates. PoS tagging, on the other hand, is used to identify the different parts of speech in a text, such as nouns, verbs, and punctuation marks.

-![token classification](pgml-docs/docs/images/token-classification.png)
+![token classification](pgml-cms/docs/images/token-classification.png)

### Named Entity Recognition
Named Entity Recognition (NER) is a task that involves identifying named entities in a text. These entities can include the names of people, locations, or organizations. The task is completed by labeling each token with a class for each named entity and a class named "0" for tokens that don't contain any entities. In this task, the input is text, and the output is the annotated text with named entities.
@@ -467,7 +467,7 @@ select pgml.transform(
## Translation
Translation is the task of converting text written in one language into another language.

-![translation](pgml-docs/docs/images/translation.png)
+![translation](pgml-cms/docs/images/translation.png)

You have the option to select from over 2000 models available on the Hugging Face <a href="https://huggingface.co/models?pipeline_tag=translation" target="_blank">hub</a> for translation.

@@ -490,7 +490,7 @@ select pgml.transform(
## Summarization
Summarization involves creating a condensed version of a document that includes the important information while reducing its length. Different models can be used for this task, with some models extracting the most relevant text from the original document, while other models generate completely new text that captures the essence of the original content.

-![summarization](pgml-docs/docs/images/summarization.png)
+![summarization](pgml-cms/docs/images/summarization.png)

```sql
select pgml.transform(
@@ -534,7 +534,7 @@ select pgml.transform(
## Question Answering
Question Answering models are designed to retrieve the answer to a question from a given text, which can be particularly useful for searching for information within a document. It's worth noting that some question answering models are capable of generating answers even without any contextual information.

-![question answering](pgml-docs/docs/images/question-answering.png)
+![question answering](pgml-cms/docs/images/question-answering.png)

```sql
SELECT pgml.transform(
@@ -558,12 +558,12 @@ SELECT pgml.transform(
}
```
<!-- ## Table Question Answering
-![table question answering](pgml-docs/docs/images/table-question-answering.png) -->
+![table question answering](pgml-cms/docs/images/table-question-answering.png) -->

## Text Generation
Text generation is the task of producing new text, such as filling in incomplete sentences or paraphrasing existing text. It has various use cases, including code generation and story generation. Completion generation models can predict the next word in a text sequence, while text-to-text generation models are trained to learn the mapping between pairs of texts, such as translating between languages. Popular models for text generation include GPT-based models, T5, T0, and BART. These models can be trained to accomplish a wide range of tasks, including text classification, summarization, and translation.

-![text generation](pgml-docs/docs/images/text-generation.png)
+![text generation](pgml-cms/docs/images/text-generation.png)

```sql
SELECT pgml.transform(
@@ -725,7 +725,7 @@ SELECT pgml.transform(
```
## Text-to-Text Generation
Text-to-text generation methods, such as T5, are neural network architectures designed to perform various natural language processing tasks, including summarization, translation, and question answering. T5 is a transformer-based architecture pre-trained on a large corpus of text data using denoising autoencoding. This pre-training process enables the model to learn general language patterns and relationships between different tasks, which can be fine-tuned for specific downstream tasks. During fine-tuning, the T5 model is trained on a task-specific dataset to learn how to perform the specific task.
-![text-to-text](pgml-docs/docs/images/text-to-text-generation.png)
+![text-to-text](pgml-cms/docs/images/text-to-text-generation.png)

*Translation*
```sql
@@ -762,7 +762,7 @@ SELECT pgml.transform(
```
## Fill-Mask
Fill-mask refers to a task where certain words in a sentence are hidden or "masked", and the objective is to predict what words should fill in those masked positions. Such models are valuable when we want to gain statistical insights about the language used to train the model.
-![fill mask](pgml-docs/docs/images/fill-mask.png)
+![fill mask](pgml-cms/docs/images/fill-mask.png)

```sql
SELECT pgml.transform(
@@ -859,7 +859,7 @@ SELECT * FROM items, query ORDER BY items.embedding <-> query.embedding LIMIT 5;

<!-- ## Sentence Similarity
Sentence Similarity involves determining the degree of similarity between two texts. To accomplish this, Sentence similarity models convert the input texts into vectors (embeddings) that encapsulate semantic information, and then measure the proximity (or similarity) between the vectors. This task is especially beneficial for tasks such as information retrieval and clustering/grouping.
-![sentence similarity](pgml-docs/docs/images/sentence-similarity.png)
+![sentence similarity](pgml-cms/docs/images/sentence-similarity.png)

<!-- ## Conversational -->
<!-- # Regression
3 changes: 1 addition & 2 deletions docker/dashboard.sh
@@ -3,8 +3,7 @@ set -e

export DATABASE_URL=postgres://postgresml:postgresml@127.0.0.1:5432/postgresml
export DASHBOARD_STATIC_DIRECTORY=/usr/share/pgml-dashboard/dashboard-static
-export DASHBOARD_CONTENT_DIRECTORY=/usr/share/pgml-dashboard/dashboard-content
-export DASHBOARD_DOCS_DIRECTORY=/usr/share/pgml-docs
+export DASHBOARD_CMS_DIRECTORY=/usr/share/pgml-cms
export SEARCH_INDEX_DIRECTORY=/var/lib/pgml-dashboard/search-index
export ROCKET_SECRET_KEY=$(openssl rand -hex 32)
export ROCKET_ADDRESS=0.0.0.0
2 changes: 1 addition & 1 deletion packages/README.md
@@ -53,7 +53,7 @@ The version of PostgresML is set in many places, and all of them need to be upda

#### Documentation

-Additionally, we mention the version of the extension in our documentation. It would be very helpful to update it there as well, so our users are always instructed to install the latest and greatest version. Our documentation is located in `pgml-docs`. If you search it for the current version number, you should find all the places where we mention it.
+Additionally, we mention the version of the extension in our documentation. It would be very helpful to update it there as well, so our users are always instructed to install the latest and greatest version. Our documentation is located in `pgml-cms`. If you search it for the current version number, you should find all the places where we mention it.
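
The search this paragraph describes could be done with `grep`. The following is only a minimal sketch: the helper name `find_version_mentions` is hypothetical (it does not exist in the repository), `2.7.12` in the usage comment is a placeholder version, and the command is assumed to be run from the repository root.

```shell
# Hypothetical helper, not part of the PostgresML repository.
# Prints each file and line number under a directory that mentions a version.
# $1: version string to look for
# $2: directory to search (for example, pgml-cms)
find_version_mentions() {
  # -r recurses into subdirectories, -n prints line numbers, and -F treats
  # the version as a literal string so its dots are not regex wildcards.
  grep -rnF -- "$1" "$2"
}

# Example usage with a placeholder version:
# find_version_mentions "2.7.12" pgml-cms
```

Matching literally with `-F` keeps a pattern such as `2.7.12` from also matching unrelated text like `2x7y12`.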

#### Github Actions

2 changes: 1 addition & 1 deletion packages/cargo-pgml-components/src/local_dev.rs
@@ -82,7 +82,7 @@ static PG_PGVECTOR: &str = "
static PG_PGML: &str = "To install PostgresML into your PostgreSQL database,
follow the instructions on:

-\thttps://postgresml.org/docs/guides/setup/v2/installation
+\thttps://postgresml.org/docs/setup/v2/installation
";

#[cfg(target_os = "linux")]
3 changes: 1 addition & 2 deletions packages/postgresml-dashboard/build.sh
@@ -24,9 +24,8 @@ rm "$deb_dir/release.sh"
( cd ${SCRIPT_DIR}/../../pgml-dashboard && \
cargo build --release && \
cp target/release/pgml-dashboard "$deb_dir/usr/bin/pgml-dashboard" && \
-cp -R content "$deb_dir/usr/share/pgml-dashboard/dashboard-content" && \
cp -R static "$deb_dir/usr/share/pgml-dashboard/dashboard-static" && \
-cp -R ../pgml-docs "$deb_dir/usr/share/pgml-docs" )
+cp -R ../pgml-cms "$deb_dir/usr/share/pgml-cms" )

(cat ${SCRIPT_DIR}/DEBIAN/control | envsubst) > "$deb_dir/DEBIAN/control"
(cat ${SCRIPT_DIR}/etc/systemd/system/pgml-dashboard.service | envsubst) > "$deb_dir/etc/systemd/system/pgml-dashboard.service"
@@ -6,8 +6,7 @@ StartLimitIntervalSec=0
[Service]
Environment=RUST_LOG=info
Environment=DASHBOARD_STATIC_DIRECTORY=/usr/share/pgml-dashboard/dashboard-static
-Environment=DASHBOARD_CONTENT_DIRECTORY=/usr/share/pgml-dashboard/dashboard-content
-Environment=DASHBOARD_DOCS_DIRECTORY=/usr/share/pgml-docs
+Environment=DASHBOARD_CMS_DIRECTORY=/usr/share/pgml-cms
Environment=ROCKET_ADDRESS=0.0.0.0
Environment=GITHUB_STARS=${GITHUB_STARS}
Environment=SEARCH_INDEX_DIRECTORY=/var/lib/pgml-dashboard/search-index