Scala standardization #100

Merged
merged 45 commits into from
Nov 22, 2024
a3d0ec4
DOCSP-42308: Deployment and connection string (#52)
norareidy Oct 4, 2024
3e39306
DOCSP-42306: Get Started and Install (#51)
norareidy Oct 11, 2024
bd8fdc5
DOCSP-42310: Connect to MongoDB (#53)
norareidy Oct 14, 2024
47ce528
DOCSP-42305: Index page + repo cleanup (#56)
mcmorisi Oct 14, 2024
4cc5e1c
DOCSP-42321: Insert landing page (#55)
mcmorisi Oct 17, 2024
7850b1f
DOCSP-42329: Retrieve data (#54)
norareidy Oct 17, 2024
bee1e78
DOCSP-42321: Insert Documents (#61)
mcmorisi Oct 18, 2024
58fdbbf
DOCSP-42328: Specify a query (#57)
norareidy Oct 18, 2024
9276032
DOCSP-42330: Projection guide (#58)
norareidy Oct 18, 2024
db684b2
DOCSP-42323: Replace (#62)
mcmorisi Oct 21, 2024
15cc939
DOCSP-42331: Specify docs to return (#64)
norareidy Oct 21, 2024
e130df4
DOCSP-42322: Update (#63)
mcmorisi Oct 22, 2024
6f8885c
DOCSP-42332: Count documents (#66)
norareidy Oct 22, 2024
0ac1358
DOCSP-42312 Create a MongoDB Client (#59)
lindseymoore Nov 1, 2024
3237882
DOCSP-42324: Delete (#65)
mcmorisi Nov 4, 2024
f1594a3
DOCSP-42325: Bulk Write (#68)
mcmorisi Nov 4, 2024
0836eee
DOCSP-42317: Databases and Collections (#77)
mcmorisi Nov 5, 2024
51f96b8
DOCSP-42333: Distinct values (#67)
norareidy Nov 5, 2024
08826ba
DOCSP-42335: Change streams (#69)
norareidy Nov 5, 2024
32d3ffa
DOCSP-42311: Run a Command (#75)
norareidy Nov 5, 2024
31d9884
DOCSP-42318: Time Series (#78)
mcmorisi Nov 5, 2024
db65164
DOCSP-42326: GridFS (#72)
mcmorisi Nov 5, 2024
9d5e341
DOCSP-42342: Aggregation (#71)
norareidy Nov 5, 2024
7fe74bf
DOCSP-42316: Stable API (#81)
mcmorisi Nov 6, 2024
060c5c2
DOCSP-42336: Cluster monitoring (#73)
norareidy Nov 6, 2024
e8d8ee0
DOCSP-42341: Atlas Search Indexes (#82)
mcmorisi Nov 8, 2024
734c3e0
Uncomment link to compat tables on landing page (#84)
mcmorisi Nov 11, 2024
052dbf5
DOCSP-40287: Read landing page (#85)
mcmorisi Nov 11, 2024
3dc6ae6
DOCSP-42343: Secure Your Data + Authentication Mechanisms (#83)
mcmorisi Nov 12, 2024
2a10a11
DOCSP-42319: Read and write settings (#79)
norareidy Nov 14, 2024
c388e76
DOCSP-42338 Single Field Indexes (#70)
lindseymoore Nov 14, 2024
6ce37a8
DOCSP-42339 Compound Indexes (#74)
lindseymoore Nov 14, 2024
230f68e
DOCSP-42314: Enable TLS (#88)
norareidy Nov 15, 2024
5d491db
DOCSP-42337: Optimize Queries with Indexes (#92)
mayaraman19 Nov 19, 2024
60014ea
DOCSP-42349: Transactions (#89)
norareidy Nov 19, 2024
f78020c
DOCSP-42340 Multikey Index (#76)
lindseymoore Nov 19, 2024
cd0af93
DOCSP-42313 Connection Targets (#80)
lindseymoore Nov 19, 2024
bd4ee2e
DOCSP-42334: Observables (#86)
norareidy Nov 20, 2024
5acbda0
DOCSP-24494: Atlas search section (#90)
norareidy Nov 20, 2024
d47e9dc
DOCSP-42346: Upgrade driver (#91)
norareidy Nov 20, 2024
a636436
DOCSP-45354: TOC labels (#94)
norareidy Nov 21, 2024
f21fa5d
DOCSP-45348: Cleanup (#93)
norareidy Nov 22, 2024
70dad44
Merge remote-tracking branch 'upstream/master' into scala-standardiza…
norareidy Nov 22, 2024
539bfef
staging link
norareidy Nov 22, 2024
fb0f04e
fixes
norareidy Nov 22, 2024
15 changes: 14 additions & 1 deletion snooty.toml
Expand Up @@ -10,19 +10,32 @@ intersphinx = [
sharedinclude_root = "https://raw.githubusercontent.com/10gen/docs-shared/main/"

toc_landing_pages = [
"/get-started",
"/bson",
"/tutorials/connect",
"/tutorials/write-ops",
"/builders",
]
"/get-started",
"/databases-collections",
"/read",
"/write",
"/indexes",
"/security"
]

[constants]
driver-short = "Scala driver"
driver-long = "MongoDB Scala Driver"
version = "5.2"
full-version = "{+version+}.1"
language = "Scala"
language-version = "2.13.15"
mdb-server = "MongoDB Server"
api = "https://mongodb.github.io/mongo-java-driver/{+version+}/apidocs/mongo-scala-driver"
driver-source-gh = "https://github.com/mongodb/mongo-java-driver"
rs-docs = "https://www.reactive-streams.org/reactive-streams-1.0.4-javadoc/org/reactivestreams"
core-api = "https://mongodb.github.io/mongo-java-driver/{+version+}/apidocs/mongodb-driver-core"
mongocrypt-version = "{+full-version+}"
java-version = "23"
mongodb-server = "MongoDB Server"
stable-api = "Stable API"
285 changes: 285 additions & 0 deletions source/aggregation.txt
@@ -0,0 +1,285 @@
.. _scala-aggregation:

====================================
Transform Your Data with Aggregation
====================================

.. facet::
   :name: genre
   :values: reference

.. meta::
   :keywords: code example, transform, computed, pipeline
   :description: Learn how to use the Scala driver to perform aggregation operations.

.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 2
   :class: singlecol

.. TODO:
.. toctree::
:titlesonly:
:maxdepth: 1

/aggregation/aggregation-tutorials

Overview
--------

In this guide, you can learn how to use the {+driver-short+} to perform
**aggregation operations**.

Aggregation operations process data in your MongoDB collections and
return computed results. The MongoDB Aggregation framework, which is
part of the Query API, is modeled on the concept of data processing
pipelines. Documents enter a pipeline that contains one or more stages,
and this pipeline transforms the documents into an aggregated result.

An aggregation operation is similar to a car factory. A car factory has
an assembly line, which contains assembly stations with specialized
tools to do specific jobs, like drills and welders. Raw parts enter the
factory, and then the assembly line transforms and assembles them into a
finished product.

The **aggregation pipeline** is the assembly line, **aggregation stages** are the
assembly stations, and **operator expressions** are the
specialized tools.
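
As a rough illustration of this analogy, the following sketch models a
pipeline in plain Scala, without the driver. The ``Doc`` and ``Stage``
types, the helper names, and the sample data are all hypothetical and exist
only for this illustration. The server does not implement aggregation this way.

.. code-block:: scala

   object PipelineAnalogy {
     // Stand-in for a document; this is NOT the driver's Document type.
     type Doc = Map[String, Any]
     // An "assembly station": takes documents in, passes documents on.
     type Stage = Seq[Doc] => Seq[Doc]

     // Plays the role of a $match stage: keep documents whose field equals value.
     def matchStage(field: String, value: Any): Stage =
       docs => docs.filter(doc => doc.get(field).contains(value))

     // Plays the role of a $group stage that counts documents per distinct value.
     def groupCountStage(field: String): Stage =
       docs =>
         docs
           .groupBy(doc => doc.getOrElse(field, "Missing"))
           .map { case (key, group) => Map[String, Any]("_id" -> key, "count" -> group.size) }
           .toSeq

     // The "assembly line": feed the output of each stage into the next one.
     def runPipeline(stages: Seq[Stage], input: Seq[Doc]): Seq[Doc] =
       stages.foldLeft(input)((docs, stage) => stage(docs))

     // Made-up in-memory data, used only for this sketch.
     val restaurants: Seq[Doc] = Seq(
       Map("name" -> "A", "cuisine" -> "Bakery", "borough" -> "Queens"),
       Map("name" -> "B", "cuisine" -> "Bakery", "borough" -> "Queens"),
       Map("name" -> "C", "cuisine" -> "Bakery", "borough" -> "Bronx"),
       Map("name" -> "D", "cuisine" -> "Pizza", "borough" -> "Bronx")
     )

     val result: Seq[Doc] =
       runPipeline(Seq(matchStage("cuisine", "Bakery"), groupCountStage("borough")), restaurants)

     def main(args: Array[String]): Unit = result.foreach(println)
   }

Each stage sees only the output of the stage before it, which is why the
order of stages in a pipeline matters.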

Compare Aggregation and Find Operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following table compares the tasks that find operations and
aggregation operations can perform. Aggregation operations can perform
every task in the find column, and they also provide expanded
functionality that allows you to transform and manipulate your data.

.. list-table::
   :header-rows: 1
   :widths: 50 50

   * - Find Operations
     - Aggregation Operations

   * - | Select *certain* documents to return
       | Select *which* fields to return
       | Sort the results
       | Limit the results
       | Count the results
     - | Select *certain* documents to return
       | Select *which* fields to return
       | Sort the results
       | Limit the results
       | Count the results
       | Rename fields
       | Compute new fields
       | Summarize data
       | Connect and merge data sets
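
To make the comparison concrete, the following sketch expresses the same
selection as a find operation and as an aggregation pipeline, and then adds
a summarizing pipeline that a find operation cannot express. The object and
member names are hypothetical, and the field names assume the
``restaurants`` sample collection.

.. code-block:: scala

   import org.mongodb.scala._
   import org.mongodb.scala.bson.conversions.Bson
   import org.mongodb.scala.model.{Accumulators, Aggregates, Filters, Projections, Sorts}

   object FindVersusAggregate {
     // Find: select documents, choose fields, sort, and limit.
     def findBakeries(collection: MongoCollection[Document]): FindObservable[Document] =
       collection
         .find(Filters.equal("cuisine", "Bakery"))
         .projection(Projections.include("name", "borough"))
         .sort(Sorts.ascending("name"))
         .limit(5)

     // Aggregation: the same four tasks, each written as a pipeline stage.
     val findLikePipeline: Seq[Bson] = Seq(
       Aggregates.filter(Filters.equal("cuisine", "Bakery")),
       Aggregates.project(Projections.include("name", "borough")),
       Aggregates.sort(Sorts.ascending("name")),
       Aggregates.limit(5)
     )

     // Aggregation only: summarize the data, then rename the grouping key.
     val summaryPipeline: Seq[Bson] = Seq(
       Aggregates.filter(Filters.equal("cuisine", "Bakery")),
       Aggregates.group("$borough", Accumulators.sum("count", 1)),
       Aggregates.project(
         Projections.fields(
           Projections.computed("borough", "$_id"), // expose _id under the name "borough"
           Projections.include("count"),
           Projections.excludeId()
         )
       )
     )

     def summarize(collection: MongoCollection[Document]): AggregateObservable[Document] =
       collection.aggregate(summaryPipeline)
   }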

Limitations
~~~~~~~~~~~

Consider the following limitations when performing aggregation operations:

- Returned documents cannot violate the
  :manual:`BSON document size limit </reference/limits/#mongodb-limit-BSON-Document-Size>`
  of 16 megabytes.
- Pipeline stages have a memory limit of 100 megabytes by default. You can exceed this
  limit by passing a value of ``true`` to the ``allowDiskUse()`` method and chaining the
  method to ``aggregate()``.
- The :manual:`$graphLookup </reference/operator/aggregation/graphLookup/>`
  stage has a strict memory limit of 100 megabytes and ignores the
  value passed to the ``allowDiskUse()`` method.
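
The following sketch shows one way to chain ``allowDiskUse()`` to
``aggregate()``. The object name, the method name, and the choice of a sort
stage are assumptions made for illustration. A sort over a large collection
is simply a common case in which a stage can reach the memory limit.

.. code-block:: scala

   import org.mongodb.scala._
   import org.mongodb.scala.bson.conversions.Bson
   import org.mongodb.scala.model.{Aggregates, Sorts}

   object DiskUseExample {
     // A single sort stage; sorting many documents can exceed the stage memory limit.
     val pipeline: Seq[Bson] = Seq(Aggregates.sort(Sorts.ascending("name")))

     // Lets stages that need more memory write temporary data to disk.
     def sortedWithDiskUse(collection: MongoCollection[Document]): AggregateObservable[Document] =
       collection.aggregate(pipeline).allowDiskUse(true)
   }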

.. _scala-run-aggregation:

Run Aggregation Operations
--------------------------

.. note:: Sample Data

   The examples in this guide use the ``restaurants`` collection in the ``sample_restaurants``
   database from the :atlas:`Atlas sample datasets </sample-data>`. To learn how to create a
   free MongoDB Atlas cluster and load the sample datasets, see the :atlas:`Get Started with Atlas
   </getting-started>` guide.

To perform an aggregation, pass a list containing the pipeline stages to
the ``aggregate()`` method. The {+driver-short+} provides the ``Aggregates`` class,
which includes helper methods for building pipeline stages.

To learn more about pipeline stages and their corresponding ``Aggregates`` helper
methods, see the following resources:

- :manual:`Aggregation Stages </reference/operator/aggregation-pipeline/>` in the
  {+mdb-server+} manual
- `Aggregates <{+api+}/org/mongodb/scala/model/Aggregates$.html>`__ in the API documentation
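
A minimal end-to-end sketch of this call might look like the following. The
``"<connection string>"`` placeholder, the object name, the single
``limit`` stage, and the 30-second timeout are assumptions for illustration.
Blocking with ``Await`` is used here only to keep the sketch short.

.. code-block:: scala

   import org.mongodb.scala._
   import org.mongodb.scala.bson.conversions.Bson
   import org.mongodb.scala.model.Aggregates

   import scala.concurrent.Await
   import scala.concurrent.duration._

   object RunAggregation {
     // The pipeline is an ordinary Seq of stages built by Aggregates helpers.
     val pipeline: Seq[Bson] = Seq(Aggregates.limit(3))

     def main(args: Array[String]): Unit = {
       // Replace the placeholder with the connection string for your deployment.
       val client = MongoClient("<connection string>")
       try {
         val collection: MongoCollection[Document] =
           client.getDatabase("sample_restaurants").getCollection("restaurants")

         // aggregate() returns an Observable; toFuture() collects the results.
         val docs = Await.result(collection.aggregate(pipeline).toFuture(), 30.seconds)
         docs.foreach(doc => println(doc.toJson()))
       } finally {
         client.close()
       }
     }
   }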

.. _scala-aggregation-example:

Filter, Group, and Count Documents
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This code example counts the bakeries in each borough of New York City.
To do so, it calls the ``aggregate()`` method and passes an aggregation
pipeline as a list of stages. The code builds these stages by using the following
``Aggregates`` helper methods:

- ``filter()``: Builds the :manual:`$match </reference/operator/aggregation/match/>` stage
  to filter for documents that have a ``cuisine`` value of ``"Bakery"``

- ``group()``: Builds the :manual:`$group </reference/operator/aggregation/group/>` stage to
  group the matching documents by the ``borough`` field, accumulating a count of documents for each
  distinct value

.. io-code-block::
   :copyable:

   .. input:: /includes/aggregation.scala
      :start-after: start-match-group
      :end-before: end-match-group
      :language: scala
      :dedent:

   .. output::
      :visible: false

      {"_id": "Brooklyn", "count": 173}
      {"_id": "Queens", "count": 204}
      {"_id": "Bronx", "count": 71}
      {"_id": "Staten Island", "count": 20}
      {"_id": "Missing", "count": 2}
      {"_id": "Manhattan", "count": 221}
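
The contents of the included ``aggregation.scala`` file are not shown in
this diff. As a sketch only, code that builds and runs the pipeline
described above might look like the following. The object and method names
are hypothetical, and the ``collection`` argument is assumed to reference
the ``restaurants`` collection.

.. code-block:: scala

   import org.mongodb.scala._
   import org.mongodb.scala.bson.conversions.Bson
   import org.mongodb.scala.model.{Accumulators, Aggregates, Filters}

   object BakeriesByBorough {
     val pipeline: Seq[Bson] = Seq(
       // $match: keep only documents whose cuisine is "Bakery"
       Aggregates.filter(Filters.equal("cuisine", "Bakery")),
       // $group: one result per borough, adding 1 to "count" for each document
       Aggregates.group("$borough", Accumulators.sum("count", 1))
     )

     // Prints each result as JSON when it arrives; the call itself does not block.
     def run(collection: MongoCollection[Document]): Unit =
       collection
         .aggregate(pipeline)
         .subscribe(
           (doc: Document) => println(doc.toJson()),
           (e: Throwable) => println(s"There was an error: $e")
         )
   }

Because ``subscribe()`` returns immediately, a short-lived program must
keep running until the results arrive.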

Explain an Aggregation
~~~~~~~~~~~~~~~~~~~~~~

To view information about how MongoDB executes your operation, you can
instruct the MongoDB query planner to **explain** it. When MongoDB explains
an operation, it returns **execution plans** and performance statistics.
An execution plan is a potential way in which MongoDB can complete an operation.
When you instruct MongoDB to explain an operation, it returns both the
plan MongoDB executed and any rejected execution plans by default.

To explain an aggregation operation, chain the ``explain()`` method to the
``aggregate()`` method. You can pass a verbosity level to ``explain()``,
which modifies the type and amount of information that the method returns. For more
information about verbosity, see :manual:`Verbosity Modes </reference/command/explain/#verbosity-modes>`
in the {+mdb-server+} manual.

The following example instructs MongoDB to explain the aggregation operation
from the preceding :ref:`scala-aggregation-example` example. The code passes a verbosity
value of ``ExplainVerbosity.EXECUTION_STATS`` to the ``explain()`` method, which
configures the method to return statistics describing the execution of the winning
plan:

.. io-code-block::
   :copyable:

   .. input:: /includes/aggregation.scala
      :start-after: start-explain
      :end-before: end-explain
      :language: scala
      :dedent:

   .. output::
      :visible: false

      {"explainVersion": "2", "queryPlanner": {"namespace": "sample_restaurants.restaurants",
      "indexFilterSet": false, "parsedQuery": {"cuisine": {"$eq": "Bakery"}}, "queryHash": "865F14C3",
      "planCacheKey": "0FC225DA", "optimizedPipeline": true, "maxIndexedOrSolutionsReached": false,
      "maxIndexedAndSolutionsReached": false, "maxScansToExplodeReached": false, "winningPlan":
      {"queryPlan": {"stage": "GROUP", "planNodeId": 3, "inputStage": {"stage": "COLLSCAN",
      "planNodeId": 1, "filter": {"cuisine": {"$eq": "Bakery"}}, "direction": "forward"}},
      ...}
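
As a sketch of the call described above, and not necessarily the code in
the included file, the following chains ``explain()`` to ``aggregate()``
and returns the explain document as a ``Future``. The object and method
names are hypothetical.

.. code-block:: scala

   import com.mongodb.ExplainVerbosity
   import org.mongodb.scala._
   import org.mongodb.scala.bson.conversions.Bson
   import org.mongodb.scala.model.{Accumulators, Aggregates, Filters}

   import scala.concurrent.Future

   object ExplainBakeries {
     // Same filter-and-group pipeline as in the bakery count example.
     val pipeline: Seq[Bson] = Seq(
       Aggregates.filter(Filters.equal("cuisine", "Bakery")),
       Aggregates.group("$borough", Accumulators.sum("count", 1))
     )

     // explain() yields a single document, so toFuture() returns Future[Document].
     def explain(collection: MongoCollection[Document]): Future[Document] =
       collection
         .aggregate(pipeline)
         .explain(ExplainVerbosity.EXECUTION_STATS)
         .toFuture()
   }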

Run an Atlas Full-Text Search
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. tip:: Only Available on Atlas for MongoDB v4.2 and later

   The ``$search`` aggregation pipeline stage is available only for collections hosted
   on :atlas:`MongoDB Atlas </>` clusters running v4.2 or later that are
   covered by an :atlas:`Atlas Search index </reference/atlas-search/index-definitions/>`.

To specify a full-text search of one or more fields, you can create
a ``$search`` pipeline stage. The {+driver-short+} provides the
``Aggregates.search()`` helper method to create this stage. The ``search()``
method requires the following arguments:

- ``SearchOperator`` instance: Specifies the field and text to search for.
- ``SearchOptions`` instance: Specifies options to customize the full-text
  search. You must set the ``index`` option to the name of the Atlas Search
  index to use.

This example creates pipeline stages to perform the following actions:

- Search the ``name`` field for text that contains the word ``"Salt"``
- Project only the ``_id`` and ``name`` values of matching documents

.. io-code-block::
   :copyable:

   .. input:: /includes/aggregation.scala
      :start-after: start-atlas-search
      :end-before: end-atlas-search
      :language: scala
      :dedent:

   .. output::
      :visible: false

      {"_id": {"$oid": "..."}, "name": "Fresh Salt"}
      {"_id": {"$oid": "..."}, "name": "Salt & Pepper"}
      {"_id": {"$oid": "..."}, "name": "Salt + Charcoal"}
      {"_id": {"$oid": "..."}, "name": "A Salt & Battery"}
      {"_id": {"$oid": "..."}, "name": "Salt And Fat"}
      {"_id": {"$oid": "..."}, "name": "Salt And Pepper Diner"}

.. important::

   To run the preceding example, you must create an Atlas Search index on the ``restaurants``
   collection that covers the ``name`` field. Then, replace the ``"<search index name>"``
   placeholder with the name of the index. To learn more about Atlas Search indexes, see
   the :ref:`scala-atlas-search-index` guide.
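
The included file is not shown in this diff. As a sketch only, a pipeline
matching the description above could be built as follows. It assumes that
the ``SearchOperator.text()``, ``SearchPath.fieldPath()``, and
``SearchOptions.searchOptions()`` helpers are available in the
``org.mongodb.scala.model.search`` package, so check these names against
the API documentation for your driver version. The object and method names
are hypothetical.

.. code-block:: scala

   import org.mongodb.scala._
   import org.mongodb.scala.bson.conversions.Bson
   import org.mongodb.scala.model.{Aggregates, Projections}
   import org.mongodb.scala.model.search.{SearchOperator, SearchPath}
   import org.mongodb.scala.model.search.SearchOptions.searchOptions

   object AtlasSearchExample {
     val pipeline: Seq[Bson] = Seq(
       // $search: full-text search of the "name" field for the word "Salt"
       Aggregates.search(
         SearchOperator.text(SearchPath.fieldPath("name"), "Salt"),
         // Replace the placeholder with the name of your Atlas Search index.
         searchOptions().index("<search index name>")
       ),
       // $project: return the name field; _id is included by default
       Aggregates.project(Projections.include("name"))
     )

     def search(collection: MongoCollection[Document]): AggregateObservable[Document] =
       collection.aggregate(pipeline)
   }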

Additional Information
----------------------

MongoDB Server Manual
~~~~~~~~~~~~~~~~~~~~~

To learn more about the topics discussed in this guide, see the following
pages in the {+mdb-server+} manual:

- To view a full list of expression operators, see :manual:`Aggregation
  Operators </reference/operator/aggregation/>`.

- To learn about assembling an aggregation pipeline and to view examples, see
  :manual:`Aggregation Pipeline </core/aggregation-pipeline/>`.

- To learn more about creating pipeline stages, see :manual:`Aggregation
  Stages </reference/operator/aggregation-pipeline/>`.

- To learn more about explaining MongoDB operations, see
  :manual:`Explain Output </reference/explain-results/>` and
  :manual:`Query Plans </core/query-plans/>`.

.. TODO:
Aggregation Tutorials
~~~~~~~~~~~~~~~~~~~~~

.. To view step-by-step explanations of common aggregation tasks, see
.. :ref:`scala-aggregation-tutorials-landing`.

API Documentation
~~~~~~~~~~~~~~~~~

To learn more about the methods and types discussed in this guide, see the
following API documentation:

- `aggregate() <{+api+}/org/mongodb/scala/MongoCollection.html#aggregate[C](pipeline:Seq[org.mongodb.scala.bson.conversions.Bson])(implicite:org.mongodb.scala.bson.DefaultHelper.DefaultsTo[C,TResult],implicitct:scala.reflect.ClassTag[C]):org.mongodb.scala.AggregateObservable[C]>`__
- `Aggregates <{+api+}/org/mongodb/scala/model/Aggregates$.html>`__
- `explain() <{+api+}/org/mongodb/scala/AggregateObservable.html#explain[ExplainResult](verbosity:com.mongodb.ExplainVerbosity)(implicite:org.mongodb.scala.bson.DefaultHelper.DefaultsTo[ExplainResult,org.mongodb.scala.Document],implicitct:scala.reflect.ClassTag[ExplainResult]):org.mongodb.scala.SingleObservable[ExplainResult]>`__

49 changes: 0 additions & 49 deletions source/bson.txt

This file was deleted.
