Integration catalog widgets: Refactor into CrateDB Guide, part 1 #154

Merged
merged 4 commits into from
Nov 27, 2024
90 changes: 0 additions & 90 deletions docs/integrate/etl.md
@@ -76,32 +76,6 @@ journey. Spend time where it counts.
:::


## Apache Flink

```{div}
:style: "float: right"
[![](https://flink.apache.org/flink-header-logo.svg){w=180px}](https://flink.apache.org/)
```

[Apache Flink] is a framework and distributed processing engine for stateful
computations over unbounded and bounded data streams, written in Java.

Flink has been designed to run in all common cluster environments, perform
computations at in-memory speed and at any scale. It received the [2023 SIGMOD
Systems Award].

> Apache Flink greatly expanded the use of stream data-processing.

![](https://flink.apache.org/img/flink-home-graphic.png){h=200px}

:::{dropdown} **Managed Flink**
A few companies are specializing in offering managed Flink services.

- [Aiven] offers their managed [Aiven for Apache Flink] solution.
- [Immerok Cloud]'s offering is being converged into [Flink managed by Confluent].
:::


## Apache Hop

```{div}
@@ -184,62 +158,6 @@ worldwide across every industry.
![](https://github.com/crate/crate-clients-tools/assets/453543/ccfa4ac7-0d60-432f-b952-2b50789cd325){h=120px}


## dbt

```{div}
:style: "float: right"
[![](https://www.getdbt.com/ui/img/logos/dbt-logo.svg){w=180px}](https://www.getdbt.com/)
```

[dbt] is an open source tool for transforming data in data warehouses using Python and
SQL. It is an SQL-first transformation workflow platform that lets teams quickly and
collaboratively deploy analytics code following software engineering best practices
like modularity, portability, CI/CD, and documentation.

> dbt enables data analysts and engineers to transform their data using the same
> practices that software engineers use to build applications.

With dbt, anyone on your data team can safely contribute to production-grade data
pipelines.

The idea is that data engineers make source data available to an environment where
dbt projects run, for example with [Debezium](#debezium) or with [Airflow](#apache-airflow).
Afterwards, data analysts can run their dbt projects against this data to produce models
(tables and views) that can be used with a number of [BI tools](#bi-tools).

![](https://www.getdbt.com/ui/img/products/what-is-dbt-main-image.png){h=120px}
![](https://www.getdbt.com/ui/img/products/what-is-dbt-deploy.svg){h=120px}
![](https://www.getdbt.com/ui/img/products/what-is-dbt-eliminate-silos.svg){h=120px}

:::{dropdown} **Managed dbt**
```{div}
:style: "float: right"
[![](https://www.getdbt.com/ui/img/hero-dbt-cloud-features-2x5.png){w=180px}](https://www.getdbt.com/product/dbt-cloud/)
```

With [dbt Cloud], you can ditch time-consuming setup, and the struggles
of scaling your data production. dbt Cloud is a full-suite service that is built for
scale.

- Start building data products quickly using the dbt Cloud IDE with integrated security
and governance controls.
- Schedule, deploy, and monitor your data products using the scalable and reliable dbt
Cloud Scheduler.
- Help your data teams discover and reuse data products using hosted docs or integrations
with the powerful Discovery API.
- Extend your workflow beyond dbt Cloud with 30+ seamless integrations covering a range
of use cases across the Modern Data Stack, from observability and data quality to
visualization, reverse ETL, and much more.
- Ship more high-quality data and scale your development like the 1000s of companies that
use dbt Cloud. They’ve used its convenient and collaboration-friendly interface to
eliminate the bottlenecks that keep growth limited.

```{div}
:style: "clear: both"
```
:::


## Debezium

```{div}
@@ -386,13 +304,9 @@ an SSIS Catalog database to store, run, and manage packages.
```


[2023 SIGMOD Systems Award]: https://sigmod.org/2023-sigmod-systems-award/
[Aiven]: https://aiven.io/
[Aiven for Apache Flink]: https://aiven.io/flink
[Aiven for Apache Kafka]: https://aiven.io/kafka
[Amazon Managed Streaming for Apache Kafka (MSK)]: https://aws.amazon.com/msk/
[Apache Airflow]: https://airflow.apache.org/
[Apache Flink]: https://flink.apache.org/
[Apache Hop]: https://hop.apache.org/
[Apache Kafka]: https://kafka.apache.org/
[Apache Kafka on Azure]: https://azuremarketplace.microsoft.com/marketplace/consulting-services/canonical.0001-com-ubuntu-managed-kafka
@@ -404,14 +318,10 @@ an SSIS Catalog database to store, run, and manage packages.
[CrateDB and Apache Kafka]: https://cratedb.com/integrations/cratedb-and-kafka
[CrateDB and Kestra]: https://cratedb.com/integrations/cratedb-and-kestra
[CrateDB and Node-RED]: https://cratedb.com/integrations/cratedb-and-node-red
[dbt]: https://www.getdbt.com/
[dbt Cloud]: https://www.getdbt.com/product/dbt-cloud/
[Debezium]: https://debezium.io/
[DoubleCloud Managed Service for Apache Kafka]: https://double.cloud/services/managed-kafka/
[Flink managed by Confluent]: https://www.datanami.com/2023/05/17/confluents-new-cloud-capabilities-address-data-streaming-hurdles/
[FlowFuse]: https://flowfuse.com/
[FlowFuse Cloud]: https://app.flowforge.com/
[Immerok Cloud]: https://web.archive.org/web/20230602085618/https://www.immerok.io/product
[Introduction to FlowFuse]: https://flowfuse.com/webinars/2023/introduction-to-flowforge/
[Kestra]: https://kestra.io/
[Meltano]: https://meltano.com/
30 changes: 0 additions & 30 deletions docs/integrate/ml.md
@@ -19,34 +19,6 @@ for MLOps and Vector database operations.
::::


## LangChain

```{div}
:style: "float: right; font-size: 4em; margin-left: 0.3em"
🦜️🔗
```

[LangChain] is a framework for developing applications powered by language models,
written in Python, and with a strong focus on composability. As a language model
integration framework, LangChain's use-cases largely overlap with those of language
models in general, including document analysis and summarization, chatbots, and
code analysis.

LangChain supports retrieval-augmented generation (RAG), which is a technique for
augmenting LLM knowledge with additional, often private or real-time, data, and mixing
in "prompt engineering" as the process of structuring text that can be interpreted and
understood by a generative AI model. A prompt is natural language text describing the
task that an AI should perform.

The [LangChain adapter for CrateDB] provides support to use CrateDB as a vector store
database, to load documents using LangChain's DocumentLoader, and also supports
LangChain's conversational memory subsystem.

```{div}
:style: "clear: both"
```


## MLflow

```{div}
@@ -129,8 +101,6 @@ A modular design invites extensions to expand and enrich functionality.
```


[LangChain]: https://python.langchain.com/
[LangChain adapter for CrateDB]: https://github.com/crate-workbench/langchain
[MLflow]: https://mlflow.org/
[mlflow-cratedb]: https://pypi.org/project/mlflow-cratedb/
[MLflow adapter for CrateDB]: https://github.com/crate/mlflow-cratedb
51 changes: 0 additions & 51 deletions docs/integrate/visualize.md
@@ -17,55 +17,6 @@ Guidelines about data analysis and visualization with CrateDB.
::::


(apache-superset)=
(preset)=
(superset)=
## Apache Superset / Preset

```{div}
:style: "float: right"
[![](https://cratedb.com/hs-fs/hubfs/[email protected]?width=604&height=216&[email protected]){w=180px}](https://superset.apache.org/)

[![](https://github.com/crate/crate-clients-tools/assets/453543/9d07da87-8aff-4569-bf2a-0a16bf89f4bc){w=180px}](https://preset.io/)
```

[Apache Superset] is an open-source modern data exploration and visualization
platform, written in Python.

[Preset] offers a managed, elevated, and enterprise-grade SaaS for open-source
Apache Superset.

![](https://superset.apache.org/img/hero-screenshot.jpg){h=200px}
![](https://github.com/crate/crate-clients-tools/assets/453543/0f8f7bd8-2e30-4aca-bcf3-61fbc81da855){h=200px}

```{seealso}
[CrateDB and Superset]
```

:::{dropdown} **Managed Superset**
```{div}
:style: "float: right"
[![](https://github.com/crate/crate-clients-tools/assets/453543/9d07da87-8aff-4569-bf2a-0a16bf89f4bc){w=180px}](https://preset.io/)
```

[Preset Cloud] is a fully-managed, open-source BI for the modern data stack,
based on Apache Superset.

- **Hassle-free setup:** There is no need to install or maintain software with Preset.
Get the latest version of Superset in a secure, reliable, and scalable SaaS experience.
- **Up-to-date Superset, always:** Access all the latest features of Superset
released and thoroughly tested every two weeks.
- **One-click to deploy multiple workspaces:** Give each team in your organization
a separate Superset workspace to protect sensitive data.
- **Control user roles and access:** Easily assign roles and fine-tune data access
using RBAC and row-level security (RLS).

```{div}
:style: "clear: both"
```
:::


## Cluvio

```{div}
@@ -200,10 +151,8 @@ with none of the work or hidden costs that come with self-hosting.
:::


[Apache Superset]: https://superset.apache.org/
[Cluvio]: https://www.cluvio.com/
[CrateDB and Grafana]: https://cratedb.com/integrations/cratedb-and-grafana
[CrateDB and Superset]: https://cratedb.com/integrations/cratedb-and-apache-superset
[CrateDB and Metabase]: https://cratedb.com/integrations/cratedb-and-metabase
[Explo]: https://www.explo.co/
[Explo Explore]: https://www.explo.co/products/explore