Docs pass #1419

Merged
merged 2 commits into from
Apr 25, 2024
57 changes: 57 additions & 0 deletions pgml-cms/docs/.gitbook/assets/pgcat_1.svg
Binary file added pgml-cms/docs/.gitbook/assets/pgcat_2.png
Binary file added pgml-cms/docs/.gitbook/assets/pgcat_3.png
Binary file added pgml-cms/docs/.gitbook/assets/pgcat_4.png
Binary file added pgml-cms/docs/.gitbook/assets/pgcat_5.png
Binary file added pgml-cms/docs/.gitbook/assets/pgcat_6.png
Binary file added pgml-cms/docs/.gitbook/assets/pgcat_7.png
12 changes: 6 additions & 6 deletions pgml-cms/docs/README.md
@@ -6,17 +6,17 @@ description: The key concepts that make up PostgresML.

PostgresML is a complete MLOps platform built on PostgreSQL. Our operating principle is:

> _Move the models to the database, rather than constantly moving the data to the models._
> _Move models to the database, rather than constantly moving data to the models._

The data for ML & AI systems is inherently larger and more dynamic than the models. It's more efficient, manageable and reliable to move the models to the database, rather than continuously moving data to the models.
Data for ML & AI systems is inherently larger and more dynamic than the models. It's more efficient, manageable and reliable to move models to the database, rather than continuously moving data to the models.

## AI engine

PostgresML allows you to take advantage of the fundamental relationship between data and models, by extending the database with the following capabilities:

* **Model Serving** - GPU accelerated inference engine for interactive applications, with no additional networking latency or reliability costs
* **Model Store** - Access to open-source models including state of the art LLMs from HuggingFace, and track changes in performance between versions
* **Model Training** - Train models with your application data using more than 50 algorithms for regression, classification or clustering tasks; fine tune pre-trained models like LLaMA and BERT to improve performance
* **Model Store** - Access to open-source models including state of the art LLMs from Hugging Face, and track changes in performance between versions
* **Model Training** - Train models with your application data using more than 50 algorithms for regression, classification or clustering tasks; fine tune pre-trained models like Llama and BERT to improve performance
* **Feature Store** - Scalable access to model inputs, including vector, text, categorical, and numeric data: vector database, text search, knowledge graph and application data all in one low-latency system

<figure><img src=".gitbook/assets/ml_system.svg" alt="Machine Learning Infrastructure (2.0) by a16z"><figcaption class="mt-2"><p>PostgresML handles all of the functions <a href="https://a16z.com/emerging-architectures-for-modern-data-infrastructure/">described by a16z</a></p></figcaption></figure>
@@ -34,14 +34,14 @@ The PostgresML team also provides [native language SDKs](https://github.com/post

While using the SDK is completely optional, SDK clients can perform advanced machine learning tasks in a single SQL request, without having to transfer additional data, models, hardware or dependencies to the client application.

Use cases include:
Some of the use cases include:

* Chat with streaming responses from state-of-the-art open source LLMs
* Semantic search with keywords and embeddings
* RAG in a single request without using any third-party services
* Text translation between hundreds of languages
* Text summarization to distill complex documents
* Forecasting timeseries data for key metrics with and metadata
* Forecasting time series data for key metrics with and metadata
* Anomaly detection using application data

## Our mission
8 changes: 4 additions & 4 deletions pgml-cms/docs/SUMMARY.md
@@ -3,7 +3,7 @@
## Introduction

* [Overview](README.md)
* [Getting Started](introduction/getting-started/README.md)
* [Getting started](introduction/getting-started/README.md)
* [Create your database](introduction/getting-started/create-your-database.md)
* [Connect your app](introduction/getting-started/connect-your-app.md)
* [Import your data](introduction/getting-started/import-your-data/README.md)
@@ -52,12 +52,12 @@

## Product

* [Cloud Database](product/cloud-database/README.md)
* [Cloud database](product/cloud-database/README.md)
* [Serverless](product/cloud-database/serverless.md)
* [Dedicated](product/cloud-database/dedicated.md)
* [Enterprise](product/cloud-database/plans.md)
* [Vector Database](product/vector-database.md)
* [PgCat Proxy](product/pgcat/README.md)
* [Vector database](product/vector-database.md)
* [PgCat pooler](product/pgcat/README.md)
* [Features](product/pgcat/features.md)
* [Installation](product/pgcat/installation.md)
* [Configuration](product/pgcat/configuration.md)
16 changes: 9 additions & 7 deletions pgml-cms/docs/introduction/getting-started/README.md
@@ -2,16 +2,18 @@
description: Set up a database and connect your application to PostgresML
---

# Getting Started
# Getting started

A PostgresML deployment consists of multiple components working in concert to provide a complete Machine Learning platform. We provide a fully managed solution in [our cloud](create-your-database), and document a self-hosted installation in [Developer Docs](/docs/resources/developer-docs/quick-start-with-docker).
A PostgresML deployment consists of multiple components working in concert to provide a complete Machine Learning platform:

* PostgreSQL database, with `pgml`, `pgvector` and many other extensions installed, including backups, metrics, logs, replicas and high availability
* PgCat pooler to provide secure access and model load balancing across thousands of clients
* A web application to manage deployed models and share experiments and analysis in SQL notebooks
* PostgreSQL database, with `pgml`, `pgvector` and many other extensions that add features useful in day-to-day and machine learning use cases
* [PgCat pooler](/docs/product/pgcat/) to load balance thousands of concurrent client requests across several database instances
* A web application to manage deployed models and share experiments and analysis with SQL notebooks

<figure class="m-3"><img src="../../.gitbook/assets/architecture.png" alt="PostgresML architecture"><figcaption></figcaption></figure>
We provide a fully managed solution in [our cloud](create-your-database), and document a self-hosted installation in the [Developer Docs](/docs/resources/developer-docs/quick-start-with-docker).

<figure class="my-4"><img src="../../.gitbook/assets/architecture.png" alt="PostgresML architecture"><figcaption></figcaption></figure>

By building PostgresML on top of a mature database, we get reliable backups for model inputs and proven scalability without reinventing the wheel, so that we can focus on providing access to the latest developments in open source machine learning and artificial intelligence.

This guide will help you get started with a generous free account, that includes access to GPU accelerated models and 5 GB of storage, or you can skip to our [Developer Docs](/docs/resources/developer-docs/quick-start-with-docker) to see how to run PostgresML locally with our Docker image.
This guide will help you get started with a generous [free account](create-your-database) that includes access to GPU-accelerated models and 5 GB of storage. Alternatively, you can skip to our [Developer Docs](/docs/resources/developer-docs/quick-start-with-docker) to see how to run PostgresML locally with our Docker image.
46 changes: 42 additions & 4 deletions pgml-cms/docs/product/pgcat/README.md
@@ -2,10 +2,48 @@
description: Nextgen PostgreSQL Pooler
---

# PgCat
# PgCat pooler

PgCat is PostgreSQL connection pooler and proxy which scales PostgresML deployments. It supports read/write query separation, multiple replicas, automatic traffic distribution and load balancing, sharding, and many more features expected out of high availability enterprise grade Postgres databases.
<div class="row">
<div class="col-12 col-md-4">
<figure class="my-4">
<img class="mb-3" src="../../.gitbook/assets/pgcat_1.svg" height="auto" width="185" alt="PgCat logo">
<figcaption></figcaption>
</figure>
</div>
<div class="col-12 col-md-8">
<p>PgCat is a PostgreSQL connection pooler and proxy that scales PostgreSQL (and PostgresML) databases beyond a single instance.</p>
<p>
It supports replicas, load balancing, sharding, failover, and many more features expected of a high-availability, enterprise-grade PostgreSQL deployment.
</p>
<p>
Written in Rust using Tokio, it takes advantage of multiple CPUs and the safety and performance guarantees of the Rust language.
</p>
</div>
</div>

Written in Rust and powered by Tokio, it takes advantage of multiple CPUs, and the safety and performance guarantees of the Rust language.

PgCat, like PostgresML, is free and open source, distributed under the MIT license. It's currently running in our Cloud, powering both Serverless and Dedicated databases.
PgCat, like PostgresML, is free and open source, distributed under the MIT license. It's currently running in our [cloud](https://postgresml.org/signup), powering both Serverless and Dedicated databases.

## [Features](features)

PgCat implements the PostgreSQL wire protocol and can understand and optimally route queries & transactions based on their characteristics. For example, if your database deployment consists of a primary and replica, PgCat can send all `SELECT` queries to the replica, and all other queries to the primary, creating a read/write traffic separation.
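
The read/write separation described above can be illustrated with a small Python sketch. This is not PgCat's implementation (PgCat is written in Rust and parses queries properly); the `ReadWriteRouter` class, its method names, and the server labels are hypothetical, and the "first keyword is `SELECT`" check is a deliberate simplification that ignores cases such as `SELECT ... FOR UPDATE` or statements inside an explicit transaction.

```python
import itertools


class ReadWriteRouter:
    """Toy router: SELECT statements go to replicas, everything else to the primary.

    Hypothetical illustration only; not PgCat's actual API or logic.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin across replicas; with no replicas, all traffic uses the primary.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, query):
        stripped = query.strip()
        # Classify the statement by its first keyword (a simplification).
        first_word = stripped.split(None, 1)[0].upper() if stripped else ""
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
read_target = router.route("SELECT * FROM users")
write_target = router.route("INSERT INTO users (name) VALUES ('a')")
```

Here `read_target` is a replica and `write_target` is the primary. A real pooler also has to handle transactions, prepared statements and functions with side effects, which this sketch leaves out.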

<figure>
<img class="mb-3" src="../../.gitbook/assets/pgcat_4.png" alt="PgCat architecture" width="95%" height="auto">
<figcaption><i>PgCat deployment at scale</i></figcaption>
</figure>

<br>

If you have more than one primary, sharded with either the Postgres hashing algorithm or a custom sharding function, PgCat can parse queries, extract the sharding key, and route the query to the correct shard without requiring any modifications on the client side.
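
As a rough sketch of the "extract the sharding key, then route" idea, the Python below pulls an integer key out of a query and maps it to a shard. Everything here is an assumption made for illustration: the function names, the `id` key column, the regular expression (a real pooler parses SQL rather than pattern-matching it), and the modulo function, which stands in for a custom sharding function and is not the Postgres hashing algorithm.

```python
import re


def extract_sharding_key(query, key_column="id"):
    """Return the integer compared against key_column, or None if absent.

    Hypothetical helper: matches text like "id = 42" with a regex.
    """
    pattern = rf"\b{re.escape(key_column)}\s*=\s*(\d+)"
    match = re.search(pattern, query, re.IGNORECASE)
    return int(match.group(1)) if match else None


def modulo_shard(key, shard_count):
    """Stand-in custom sharding function (NOT the Postgres hash)."""
    return key % shard_count


def route_to_shard(query, shards, key_column="id", sharding_function=modulo_shard):
    """Pick the shard for a query, or None when no sharding key is found."""
    key = extract_sharding_key(query, key_column)
    if key is None:
        return None
    return shards[sharding_function(key, len(shards))]


shards = ["shard-0", "shard-1", "shard-2"]
target = route_to_shard("SELECT * FROM users WHERE id = 7", shards)
```

With three shards and the modulo function, key `7` maps to index `1`. Because the client sends ordinary SQL and the router derives the shard from the query text, no client-side changes are needed, which is the property the paragraph above describes.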

PgCat has many more features which are more thoroughly described in the [PgCat features](features) section.

## [Installation](installation)

PgCat is open source and available from our [GitHub repository](https://github.com/postgresml/pgcat) and, if you're running Ubuntu 22.04, from our APT repository. You can read more about how to install PgCat in the [installation](installation) section.

## [Configuration](configuration)

PgCat, like many other PostgreSQL poolers, has its own configuration file format (it's written in Rust, so of course we use TOML). The settings and their meaning are documented in the [configuration](configuration) section.