Updated index.rst in docs from README
npalacioescat committed Feb 27, 2020
1 parent 5eae913 commit 0b1b6e1
Showing 2 changed files with 102 additions and 61 deletions.
8 changes: 4 additions & 4 deletions README.rst
@@ -607,7 +607,7 @@ segment in **UniProt** protein sequences while being aware of isoforms.
Tissue expression
-----------------

For 3 protein expression databases there are functions and modules for
For three protein expression databases there are functions and modules for
downloading and combining the expression data with the network. These are the
Human Protein Atlas, the ProteomicsDB and GIANT. The ``giant`` and
``proteomicsdb`` modules can be used also as stand alone Python clients for
@@ -640,8 +640,8 @@ Technical

The module ``pypath.curl`` provides a very flexible **download manager**
built on top of ``pycurl``. The classes ``pypath.curl.Curl()`` and
``pypath.curl.FileOpener`` accept numerous arguments, try to deal in a smart
way with local **cache,** authentication, redirects, uncompression, character
``pypath.curl.FileOpener`` accept numerous arguments to deal in a smart
way with local **cache**, authentication, redirects, uncompression, character
encodings, FTP and HTTP transactions, and many other things. The cache can grow
to several GBs, and is stored in ``~/.pypath/cache`` by default. If you
experience issues using ``pypath`` these are most often related to failed
@@ -657,7 +657,7 @@ The ``pypath.session`` and ``pypath.log`` modules take care of setting up
session level parameters and logging. Each session has a random 5 character
identifier e.g. ``y5jzx``. The default log file in this case is
``pypath_log/pypath-y5jzx.log``. The log messages are flushed every 2 seconds
by default. You can always change these things by the ``settings`` module.
by default. You can always change these things using the ``settings`` module.
In this module you can get and set the values of various parameters using
the ``pypath.settings.setup()`` and the ``pypath.settings.get()`` methods.

155 changes: 98 additions & 57 deletions docs/source/index.rst
@@ -2,28 +2,58 @@
*pypath:* A Python module for molecular signaling prior knowledge processing
############################################################################


:note: ``pypath`` supports both Python 2.7 and Python 3.6+. In the beginning,
pypath has been developed only for Python 2.7. Then the code have been
adjusted to Py3 and for a few years we develop and test ``pypath`` in
Python 3. Therefore this is the better supported Python variant.
**Important:** New module structure and new network API

Around the end of December we added a new network API to ``pypath`` which
is not based on ``igraph`` any more and provides a modular and versatile
access interface to the network data (since version ``0.9``). In January
we reorganized the submodules in ``pypath`` in order to create a clear
structure (since version ``0.10``). These are important milestones
towards version ``1.0`` and we hope they will make ``pypath`` more
convenient to use for everyone. By 18 February we merged these changes
to the master branch; however, the *pypath guide* is still to be updated.
Apologies for this inconvenience, and please don't hesitate to ask
questions by opening an issue on GitHub. The old ``igraph`` based network
class is still available in the ``pypath.legacy`` module.

:Py2/3: Although we still keep the compatibility with Python 2, we don't
test ``pypath`` in this environment and very few people use it
anymore. We highly recommend using ``pypath`` with Python 3.6+.

:documentation: http://saezlab.github.io/pypath
:issues: https://github.com/saezlab/pypath/issues

.. toctree::
:maxdepth: 5
:caption: Contents:

installation
reference
webservice
changelog

**pypath** consists of a number of submodules to build various databases.
Most of these are provided as **pandas** data frames. The network database
is built around igraph to work with molecular network representations e.g.
protein, miRNA and drug compound interaction networks.
:contact: [email protected]
:developers: ``pypath`` is developed in the Saez Lab (http://saezlab.org) by
Olga Ivanova, Nicolàs Palacio and Dénes Türei; the R package and the
Cytoscape app are developed and maintained by Francesco Ceccarelli, Attila
Gábor, Alberto Valdeolivas and Nicolàs Palacio.

**pypath** is a Python module for processing molecular biology data resources,
combining them into databases and providing a versatile interface in Python
as well as exporting the data for access through other platforms such as
R (the OmnipathR R/Bioconductor package), the web service (at
http://omnipathdb.org), Cytoscape (the OmniPath Cytoscape app) and BEL
(Biological Expression Language).

**pypath** provides access to more than 100 resources! It builds 5 major
combined databases and within these we can distinguish different datasets.
The 5 major databases are interactions (molecular interaction network or
pathways), enzyme-substrate relationships, protein complexes, molecular
annotations (functional roles, localizations, and more) and inter-cellular
communication roles.

**pypath** consists of a number of submodules and each of them again contains
a number of submodules. Overall **pypath** consists of around 100 modules.
The most important higher level submodules:

* *pypath.core:* contains the database classes e.g. network, complex,
annotations, etc
* *pypath.inputs:* contains the resource specific methods which directly
download and preprocess data from the original sources
* *pypath.omnipath:* higher level applications, e.g. a database manager, a
web server
* *pypath.utils:* stand alone useful utilities, e.g. identifier translator,
Gene Ontology processor, BioPax processor, etc


Webservice
@@ -41,7 +71,7 @@ Query types
-----------

The webservice currently recognizes 7 types of queries: ``interactions``,
``ptms``, ``annotations``, ``complexes``, ``intercell``, ``queries`` and
``enz_sub``, ``annotations``, ``complexes``, ``intercell``, ``queries`` and
``info``.
The query types ``resources``, ``network`` and ``about`` have not been
implemented yet in the new webservice.
@@ -64,7 +94,7 @@ datasets. Each of them has a short name what you can use in the queries

TF-target interactions from TF Regulons, a large collection of additional
enzyme-substrate interactions, and literature curated miRNA-mRNA interactions
combined from 4 databases.
combined from 4 databases.

Mouse and rat
-------------
@@ -148,11 +178,11 @@ Enzyme-substrate interactions
Another query type available is ``enz_sub`` which provides enzyme-substrate
interactions. It is very similar to the ``interactions``:

http://omnipathdb.org/ptms?genesymbols=1&fields=sources,references,isoforms&enzymes=FYN
http://omnipathdb.org/enz_sub?genesymbols=1&fields=sources,references,isoforms&enzymes=FYN

Is there any ubiquitination reaction?

http://omnipathdb.org/ptms?genesymbols=1&fields=sources,references&types=ubiquitination
http://omnipathdb.org/enz_sub?genesymbols=1&fields=sources,references&types=ubiquitination

And acetylation in mouse?

@@ -161,7 +191,7 @@ And acetylation in mouse?
Rat interactions, both directly from rat and homology translated from human,
from the PhosphoSite database:

http://omnipathdb.org/ptms?genesymbols=1&fields=sources,references&organisms=10116&databases=PhosphoSite,PhosphoSite_noref
http://omnipathdb.org/enz_sub?genesymbols=1&fields=sources,references&organisms=10116&databases=PhosphoSite,PhosphoSite_noref
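
The query URLs above all follow one pattern: the query type is the path and
the options are ordinary query string parameters. A minimal sketch of
assembling such a URL with the Python standard library follows. The helper
name ``build_query`` is made up for this illustration and is not part of
``pypath``; only the URL layout is taken from the examples above.

```python
from urllib.parse import urlencode

BASE_URL = 'http://omnipathdb.org'


def build_query(query_type, **params):
    """Assemble a web service URL; `build_query` is a hypothetical helper."""
    # Lists become comma-separated values, as in `fields=sources,references`.
    flat = {
        key: ','.join(map(str, value))
        if isinstance(value, (list, tuple))
        else value
        for key, value in params.items()
    }
    # Keep commas unescaped so the URL looks like the examples above.
    return '%s/%s?%s' % (BASE_URL, query_type, urlencode(flat, safe=','))


url = build_query(
    'enz_sub',
    genesymbols=1,
    fields=['sources', 'references', 'isoforms'],
    enzymes='FYN',
)
print(url)
```

The resulting string can be passed to any HTTP client or pasted into a
browser.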


Molecular complexes
@@ -192,7 +222,7 @@ annotations from SignaLink:

Or the tissue expression of BMP7 from Human Protein Atlas:

http://omnipathdb.org/annotations?databases=HPA&proteins=BMP7
http://omnipathdb.org/annotations?databases=HPA_tissue&proteins=BMP7


Roles in inter-cellular communication
@@ -227,15 +257,13 @@ Exploring possible parameters
Sometimes the names and values of the query parameters are not intuitive,
even though in many cases the server accepts multiple alternatives. To see
the possible parameters with all possible values you can use the ``queries``
query type. The server checks the paremeter names and values exactly against
query type. The server checks the parameter names and values exactly against
these rules and if any of them doesn't match you will get an error message
instead of a reply. To see the parameters for the ``interactions`` query:

http://omnipathdb.org/queries/interactions
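
The kind of exact check described above can be mimicked on the client side
before sending a query. The sketch below is only an illustration: the rule
table ``EXAMPLE_RULES`` and the function ``check_params`` are hypothetical,
and the allowed values in it are invented. The real rules are whatever the
``queries`` query type reports.

```python
# Hypothetical rule table: parameter name -> allowed values (None = any value).
# The real rules are whatever the `queries` endpoint reports.
EXAMPLE_RULES = {
    'genesymbols': {'1', '0'},
    'fields': {'sources', 'references'},
    'enzymes': None,
}


def check_params(params, rules):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name, value in params.items():
        if name not in rules:
            problems.append('unknown parameter: %s' % name)
            continue
        allowed = rules[name]
        if allowed is None:
            continue
        # Comma-separated values are checked one by one.
        bad = [v for v in str(value).split(',') if v not in allowed]
        if bad:
            problems.append('invalid value for %s: %s' % (name, ','.join(bad)))
    return problems


print(check_params({'genesymbols': '1', 'fields': 'sources,isoform'}, EXAMPLE_RULES))
```

Matching is exact on purpose, so a misspelled value such as ``isoform`` is
reported rather than silently ignored.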




Can I use OmniPath in R?
========================

@@ -244,9 +272,6 @@ our colleague Attila Gabor we have a dedicated package for this:

https://github.com/saezlab/OmnipathR

Alternatively here is a very simple example:

https://github.com/saezlab/pypath/tree/master/r_import

Installation
============
@@ -255,15 +280,16 @@ Linux
-----

In almost any up-to-date Linux distribution the dependencies of **pypath** are
built-in, or provided by the distributors. You only need to install a couple
of things in your package manager (cairo, py(2)cairo, igraph,
python(2)-igraph, graphviz, pygraphviz), and after install **pypath** by *pip*
(see below). If any module still missing, you can install them the usual way
by *pip* or your package manager.
built-in, or provided by the distributors. You can simply install **pypath**
with **pip** (see below).
If any non-mandatory dependency is still missing, you can install it the
usual way with *pip* or your package manager.

igraph C library, cairo and pycairo
-----------------------------------

For the legacy network class or the ``igraph`` conversion from the current
network class *python-igraph* must be installed.
*python(2)-igraph* is a Python interface to use the igraph C library. The
C library must be installed. The same goes for *cairo*, *py(2)cairo* and
*graphviz*.
@@ -296,12 +322,16 @@ Clone the git repo, and run setup.py:
Mac OS X
--------

On OS X installation is not straightforward primarily because cairo needs to
be compiled from source. We provide 2 scripts here: the
**mac-install-brew.sh** installs everything with HomeBrew, and
Nowadays installation on Mac should be no more complicated than on Linux:
you can simply install with **pip** (see above).

When ``igraph`` was a mandatory dependency and it didn't provide wheels,
the OS X installation was not straightforward, primarily because cairo needed
to be compiled from source. If you want igraph and cairo we provide two scripts
`here <src/scripts>`_: the **mac-install-brew.sh** installs everything with Homebrew and
**mac-install-conda.sh** installs from the Anaconda distribution. With these
scripts installation of igraph, cairo and graphviz goes smoothly most of the
time, and options are available for omitting the 2 latter. To know more see
scripts, installation of igraph, cairo and graphviz goes smoothly most of the
time and options are available to omit the last two. To know more, see
the description in the script header. There is a third script
**mac-install-source.sh** which compiles everything from source and presumes
only Python 2.7 and Xcode installed. We do not recommend this as it is time
@@ -311,7 +341,7 @@ Troubleshooting
^^^^^^^^^^^^^^^

* ``no module named ...`` when you try to load a module in Python. Did
theinstallation of the module run without error? Try to run again the specific
the installation of the module run without error? Try to run again the specific
part from the mac install shell script to see if any error comes up. Is the
path where the module has been installed in your ``$PYTHONPATH``? Try ``echo
$PYTHONPATH`` to see the current paths. Add your local install directories if
@@ -383,10 +413,9 @@ external dependencies, after *pip* install should work. On Windows certain
packages cannot be compiled from source by *pip*; instead, the
easiest way is to install them precompiled. These are in our case *fisher, lxml,
numpy (mkl version), pycairo, igraph, pygraphviz, scipy and statsmodels*. The
precompiled packages are available here:
http://www.lfd.uci.edu/~gohlke/pythonlibs/. We tested the setup with Python
3.4.3 and Python 2.7.11. The former should just work fine, while with the
latter we have issues to be resolved.
precompiled packages are available `here <http://www.lfd.uci.edu/~gohlke/pythonlibs/>`_.
We tested the setup with Python 3.4.3 and Python 2.7.11. The former should just
work fine, while with the latter we have issues to be resolved.

Known issues
^^^^^^^^^^^^
@@ -398,7 +427,11 @@ Known issues
* Encoding related exceptions in Python2: these might occur at some points in
the module, please send the traceback if you encounter one, and we will fix
as soon as possible.
* For Mac OS X (v >= 10.11 El Capitan) import of pypath fails with error: "libcurl link-time ssl backend (openssl) is different from compile-time ssl backend (none/other)". To fix it, you may need to reinstall pycurl library using special flags. More information and steps can be found e.g. [here](https://cscheng.info/2018/01/26/installing-pycurl-on-macos-high-sierra.html)
* For Mac OS X (v >= 10.11 El Capitan) import of pypath fails with error:
"libcurl link-time ssl backend (openssl) is different from compile-time ssl
backend (none/other)". To fix it, you may need to reinstall pycurl library
using special flags. More information and steps can be found
`here <https://cscheng.info/2018/01/26/installing-pycurl-on-macos-high-sierra.html>`_.

*Special thanks to Jorge Ferreira for testing pypath on Windows!*

@@ -490,7 +523,7 @@ Main improvements in the past releases:
delete data to free memory
* New interaction category in `data_formats`: `ligand_receptor`
* Improved logging and control over verbosity
* Better control over paremeters by the `settings` module
* Better control over parameters by the `settings` module
* Many methods in `dataio` have been improved or fixed, docs and code style largely improved
* Started to add tests especially for methods in `dataio`

@@ -500,6 +533,11 @@ Main improvements in the past releases:
has been removed from the mandatory dependencies
* New API for the network, interactions, evidences, molecular entities

0.10.0
------
* New module structure: modules grouped into `core`, `inputs`, `internals`,
`legacy`, `omnipath`, `resources`, `share` and `utils` submodules.

Upcoming
--------

@@ -512,6 +550,9 @@ Upcoming
Features
========

*Warning:*
The sections below are outdated and will be updated soon.

In the beginning the primary aim of **pypath** was to build networks from
multiple sources using an igraph object as the foundation of the integrated
data structure. From version 0.7 and 0.8 this design principle started to
@@ -528,8 +569,8 @@ rug compound data, searching drug targets and compounds in **ChEMBL**.
ID conversion
-------------

The ID conversion module ``mapping`` can be used independently. It has the
feature to translate secondary UniProt IDs to primaries, and Trembl IDs to
The ID conversion module ``utils.mapping`` can be used independently. It can
translate secondary UniProt IDs to primary ones, and TrEMBL IDs to
SwissProt, using primary Gene Symbols to find the connections. This module
automatically loads and stores the necessary conversion tables. Many tables
are predefined, such as all the IDs in **UniProt mapping service,** while
@@ -540,7 +581,7 @@ Pathways
--------

**pypath** includes data and predefined format descriptions for more than 25
high quality, literature curated databases. The inut formats are defined in
high quality, literature curated databases. The input formats are defined in
the ``data_formats`` module. For some resources data is downloaded on the fly;
where this is not possible, data is redistributed with the module. Descriptions
and comprehensive information about the resources are available in the
@@ -566,7 +607,7 @@ segment in **UniProt** protein sequences while being aware of isoforms.
Tissue expression
-----------------

For 3 protein expression databases there are functions and modules for
For three protein expression databases there are functions and modules for
downloading and combining the expression data with the network. These are the
Human Protein Atlas, the ProteomicsDB and GIANT. The ``giant`` and
``proteomicsdb`` modules can be used also as stand alone Python clients for
@@ -599,8 +640,8 @@ Technical

The module ``pypath.curl`` provides a very flexible **download manager**
built on top of ``pycurl``. The classes ``pypath.curl.Curl()`` and
``pypath.curl.FileOpener`` accept numerous arguments, try to deal in a smart
way with local **cache,** authentication, redirects, uncompression, character
``pypath.curl.FileOpener`` accept numerous arguments to deal in a smart
way with local **cache**, authentication, redirects, uncompression, character
encodings, FTP and HTTP transactions, and many other things. The cache can grow
to several GBs, and is stored in ``~/.pypath/cache`` by default. If you
experience issues using ``pypath`` these are most often related to failed
@@ -610,13 +651,13 @@ the context managers in ``pypath.curl`` to show, delete or bypass the cache
for some particular method calls (``pypath.curl.cache_print_on()``,
``pypath.curl.cache_delete_on()`` and ``pypath.curl.cache_off()``).
You can always set up an alternative cache directory for the entire session
using the ``pypath.settings`` module.
using the ``pypath.settings`` module.
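
Since the cache can reach several GBs, it may be worth checking how much disk
space it occupies. The following is a minimal sketch using only the standard
library. It assumes the default location ``~/.pypath/cache`` mentioned above,
and ``dir_size`` is a helper written for this illustration, not a ``pypath``
function.

```python
import os


def dir_size(path):
    """Total size in bytes of the regular files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Broken symlinks are not regular files and are skipped.
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


# Default cache location named above; adjust if you configured another one.
cache_dir = os.path.expanduser('~/.pypath/cache')
print('%.2f GB' % (dir_size(cache_dir) / 1024 ** 3))
```

If the directory does not exist yet, the walk yields nothing and the reported
size is zero.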

The ``pypath.session`` and ``pypath.log`` modules take care of setting up
session level parameters and logging. Each session has a random 5 character
identifier e.g. ``y5jzx``. The default log file in this case is
``pypath_log/pypath-y5jzx.log``. The log messages flushed in every 2 seconds
by default. You can always change these things by the ``settings`` module.
``pypath_log/pypath-y5jzx.log``. The log messages are flushed every 2 seconds
by default. You can always change these things using the ``settings`` module.
In this module you can get and set the values of various parameters using
the ``pypath.settings.setup()`` and the ``pypath.settings.get()`` methods.
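
To make the naming scheme concrete, here is a small standard library sketch of
how a 5 character session identifier maps to a log file path. It is not the
``pypath.session`` implementation: the function names are invented, and the
alphabet (lowercase letters and digits) is an assumption based on the example
``y5jzx``.

```python
import random
import string

# Assumed alphabet, guessed from the example identifier `y5jzx`.
ALPHABET = string.ascii_lowercase + string.digits


def new_session_id(length=5):
    """Random identifier built from lowercase letters and digits."""
    return ''.join(random.choice(ALPHABET) for _ in range(length))


def log_path(session_id, log_dir='pypath_log'):
    """Log file path following the `pypath_log/pypath-<id>.log` pattern."""
    return '%s/pypath-%s.log' % (log_dir, session_id)


print(log_path('y5jzx'))
print(log_path(new_session_id()))
```

Knowing the pattern makes it easy to find the log file that belongs to a given
session.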

