Update README.rst (#287)

* Update README.rst
* Update README.rst (#288)
* Update README.rst
* Update description

---------

Co-authored-by: Yerdos Ordabayev
cellarium-ai · Jan 17, 2025 · 090161e
1 parent a854767
commit 090161e
Showing 1 changed file with 99 additions and 43 deletions.
diff --git a/README.rst b/README.rst
@@ -1,67 +1,123 @@
-*Cellarium ML: distributed single-cell data analysis.*
+.. image:: https://cellarium.ai/wp-content/uploads/2024/07/cellarium-logo-medium.png
+   :alt: Cellarium Logo
+   :width: 180
+   :align: center
 
----------
+**Cellarium ML: a machine learning framework for single-cell biology**
+======================================================================
 
 Cellarium ML is a PyTorch Lightning-based library for distributed single-cell data analysis.
-It provides a set of tools for training deep learning models on large-scale single-cell datasets,
-including distributed data loading, model training, and evaluation. Cellarium ML is designed to be
-modular and extensible, allowing users to easily define custom models, data transformations,
+It provides tools for training deep learning models on large-scale single-cell datasets,
+including distributed data loading, model training, and evaluation. Designed to be modular
+and extensible, Cellarium ML allows users to easily define custom models, data transformations,
 and training pipelines.
 
-Code organization
------------------
+-------------------------------------------------------------------------------
+
+**Code Organization**
+----------------------
 
 The code is organized as follows:
 
-- ``cellarium/ml/callbacks``: Contains custom PyTorch Lightning callbacks.
-- ``cellarium/ml/core``: Includes essential Cellarium ML components:
-  - ``CellariumModule``: A PyTorch Lightning Module tasked with defining and configuring the model, training step, and optimizer.
-  - ``CellariumAnnDataDataModule``: A PyTorch Lightning DataModule designed for setting up a multi-GPU DataLoader for a collection of AnnData objects.
-  - ``CellariumPipeline``: A Module List that pipes the input data through a series of transforms and a model.
-- ``cellarium/ml/data``: Contains Distributed AnnData Collection and multi-GPU Iterable Dataset implementations.
-- ``cellarium/ml/lr_schedulers``: Contains custom learning rate schedulers.
-- ``cellarium/ml/models``: Features Cellarium ML models:
-  - Models must subclass ``CellariumModel`` and implement the ``.reset_parameters`` method.
-  - The ``.forward`` method should return a dictionary containing the computed loss under the ``loss`` key.
-  - Optionally, hooks such as ``.on_train_start``, ``.on_train_epoch_end``, and ``.on_train_batch_end`` can be implemented to be triggered by the ``CellariumModule`` during training phases.
-- ``cellarium/ml/preprocessing``: Provides pre-processing functions.
-- ``cellarium/ml/transforms``: Contains data transformation modules:
-  - Each transform is a subclass of ``torch.nn.Module``.
-  - The ``.forward`` method should output a dictionary where the keys correspond to the input arguments of subsequent transforms and the model.
-- ``cellarium/ml/utilities``: Contains utility functions for various submodules.
-- ``cellarium/ml/cli.py``: Implements the ``cellarium-ml`` CLI. Models must be registered here to be accessible via the CLI.
+.. code-block:: text
+
+   cellarium/
+   └── ml/
+       ├── "callbacks"        # Custom PyTorch Lightning callbacks
+       ├── "core"             # Essential components
+       │   ├── "CellariumModule"              # PyTorch Lightning Module for model, training step, and optimizer
+       │   ├── "CellariumAnnDataDataModule"   # DataModule for multi-GPU DataLoader for AnnData objects
+       │   └── "CellariumPipeline"            # Pipeline for data transformations and model inference
+       ├── "data"             # Distributed AnnData Collection and multi-GPU Iterable Datasets
+       ├── "lr_schedulers"    # Custom learning rate schedulers
+       ├── "models"           # Cellarium ML models
+       ├── "preprocessing"    # Pre-processing functions
+       ├── "transforms"       # Data transformation modules
+       ├── "utilities"        # Utility functions for various submodules
+       └── "cli.py"           # Implements the "cellarium-ml" CLI. Models must be registered here
+
+Important Notes
+~~~~~~~~~~~~~~~
+
+``cellarium/ml/models/*``  
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- Models must subclass ``CellariumModel``.
+- ``reset_parameters`` must be implemented; it initializes the model parameters.
+- ``forward`` must return a dictionary containing the computed loss under the ``loss`` key.
+
+Optional hooks for training include:  
+
+- ``on_train_start``: Called at the start of training.  
+- ``on_train_epoch_end``: Triggered at the end of each epoch.  
+- ``on_train_batch_end``: Triggered at the end of each batch.  
+
+``cellarium/ml/transforms/*``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- All transforms must subclass ``torch.nn.Module``.
+- The ``forward`` method must output a dictionary where keys correspond to the input arguments for subsequent transforms or the model.  
+
+``cellarium/ml/cli.py``
+~~~~~~~~~~~~~~~~~~~~~~~
+- Models must be registered here to be accessible via the command-line interface (``cellarium-ml`` CLI).
+
+
+
+-------------------------------------------------------------------------------
+
+**Installation**
+-----------------
+
+To install via pip:
+
+.. code-block:: bash
+
+   pip install cellarium-ml
+
+To install the developer version from source:
+
+.. code-block:: bash
+
+   git clone https://github.com/cellarium-ai/cellarium-ml.git
+   cd cellarium-ml
+   make install  # runs pip install -e .[dev]
 
-Installation
-------------
+**API Documentation and Tutorials**
+-----------------------------------
 
-To install from the pip::
+For detailed API documentation and tutorials, visit:  
+`Cellarium ML Documentation <https://cellarium-ai.github.io/cellarium-ml/>`_
 
-   $ pip install cellarium-ml
+-------------------------------------------------------------------------------
 
-To install the developer version from the source::
+**For Developers**
+-------------------
 
-   $ git clone https://github.com/cellarium-ai/cellarium-ml.git
-   $ cd cellarium-ml
-   $ make install               # runs pip install -e .[dev]
+To run the tests:
 
-For developers
---------------
+.. code-block:: bash
 
-To run the tests::
+   make test-examples                   # runs single-device CLI example tests
+   make test-dataloader                 # runs single-device dataloader-related tests
+   TEST_DEVICES=2 make test-dataloader  # runs multi-device dataloader-related tests
+   make test                            # runs all other single-device tests
+   TEST_DEVICES=2 make test             # runs all other multi-device tests
 
-   $ make test                  # runs single-device tests
-   $ TEST_DEVICES=2 make test   # runs multi-device tests
+To format the code automatically:
 
-To automatically format the code::
+.. code-block:: bash
 
-   $ make format               # runs ruff formatter and fixes linter errors
+   make format                # runs ruff formatter and fixes linter errors
 
-To run the linters::
+To run the linters:
 
-   $ make lint                  # runs ruff linter and checks for formatter errors
+.. code-block:: bash
 
-To build the documentation::
+   make lint                  # runs ruff linter and checks for formatter errors
 
-   $ make docs                  # builds the documentation at docs/build/html
+To build the documentation:
 
+.. code-block:: bash
 
+   make docs                  # builds the documentation at docs/build/html
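
Note on the model and transform contracts described in the new README text: a transform's ``forward`` returns a dictionary whose keys match the argument names of later stages, and a model's ``forward`` returns a dictionary with a ``loss`` key. The sketch below illustrates that dictionary-passing idea only. It is dependency-free on purpose: the plain classes stand in for ``torch.nn.Module`` and ``CellariumModel``, and the names ``Log1p``, ``MeanModel``, ``run_pipeline``, ``x_ng``, and ``obs_names`` are hypothetical choices for illustration. It is not the library's actual ``CellariumPipeline`` implementation.

```python
import inspect
import math


class Log1p:
    """Stand-in transform; a real one would subclass torch.nn.Module."""

    def forward(self, x_ng):
        # Return a dict keyed by the argument name the next stage expects.
        return {"x_ng": [[math.log1p(v) for v in row] for row in x_ng]}


class MeanModel:
    """Stand-in model; a real one would subclass CellariumModel."""

    def __init__(self):
        self.reset_parameters()

    def reset_parameters(self):
        # Initialize the single (hypothetical) parameter.
        self.mean = 0.0

    def forward(self, x_ng):
        values = [v for row in x_ng for v in row]
        loss = sum((v - self.mean) ** 2 for v in values) / len(values)
        # The computed loss is reported under the "loss" key.
        return {"loss": loss}


def run_pipeline(stages, batch):
    """Pass each stage the batch entries named in its forward() signature,
    then merge the returned dictionary back into the batch."""
    batch = dict(batch)
    for stage in stages:
        params = inspect.signature(stage.forward).parameters
        kwargs = {name: batch[name] for name in params}
        batch.update(stage.forward(**kwargs))
    return batch


batch = {"x_ng": [[0.0, 1.0], [2.0, 3.0]], "obs_names": ["cell_0", "cell_1"]}
output = run_pipeline([Log1p(), MeanModel()], batch)
print(sorted(output))
```

Matching stages by argument name means a transform can be added or removed without changing the model, as long as the keys it emits still line up with what later stages accept.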