PaperParser is a Python package for extracting synthesis and performance metrics from academic articles on perovskite solar cells. The long-term goal of this package is to provide a means to (1) scrape, (2) summarize, and (3) compare the relationships between synthesis procedure and device performance across the perovskite literature.
The result is a relational graph like the example below, implemented in Python as nested dictionaries and lists.
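As a rough illustration of what "nested dictionaries and lists" can look like for a single paper, here is a minimal sketch. The keys, the placeholder DOI, and all numeric values are hypothetical and chosen only for illustration; they are not PaperParser's actual schema or real extracted data. The `flatten` helper is likewise a hypothetical utility, not part of the package.

```python
# Hypothetical record for one paper: synthesis steps are a list of dicts,
# performance metrics are a dict. All values are made-up placeholders.
paper_record = {
    "doi": "10.0000/placeholder",
    "synthesis": [
        {"step": "spin-coating", "parameters": {"speed_rpm": 4000, "time_s": 30}},
        {"step": "annealing", "parameters": {"temperature_C": 100, "time_min": 10}},
    ],
    "performance": {"PCE_percent": 15.0, "Voc_V": 1.0},
}


def flatten(node, prefix=()):
    """Yield (path, value) pairs for every leaf in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, prefix + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, prefix + (index,))
    else:
        yield prefix, node


if __name__ == "__main__":
    # Print each leaf with its path, e.g. ('performance', 'Voc_V') 1.0
    for path, value in flatten(paper_record):
        print(path, value)
```

Flattening to (path, value) pairs is one simple way to compare synthesis parameters against performance metrics across many such records.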
The simplest way to run the example notebooks is to clone the Git repo to your local machine. To install paperparser and its dependencies, we recommend the following procedure:
- Clone the git repository to your local machine.
- Create a new conda environment by running the following command in your terminal:
  conda create -n your_new_env python=3.6
  (Note: PaperParser was designed in Python 3.6, but also works with 3.5.)
- Activate your new, clean conda environment:
  conda activate your_new_env
- (Optional) For users of Git for Windows/Git Bash, run the following command:
  conda install -c conda-forge dawg
  Linux, Mac, and WSL (Windows Subsystem for Linux) users can skip this step.
- Navigate to the top-level directory containing setup.py and install with pip by running:
  pip install .
  This will automatically install the dependencies required to run the package and the provided example notebooks. Make sure you are in the correct environment before running pip install!
- Download ChemDataExtractor's data files. This step is required; PaperParser will not run without it.
  cde data download
Now you're ready to use PaperParser! If you're lost, be sure to check out the example notebook in ./examples/example_notebook.ipynb. Happy parsing!
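If something does not work after installation, a quick sanity check of the active environment may help. The sketch below is only a suggestion: the helper names are hypothetical, and it assumes the import names `paperparser` and `chemdataextractor` match the installed packages. It checks whether the running interpreter is one of the Python versions mentioned above and whether the packages can be found, without importing them.

```python
import importlib.util
import sys

# Versions mentioned in the installation notes (designed in 3.6, also works with 3.5).
DOCUMENTED_VERSIONS = {(3, 5), (3, 6)}


def is_documented_python(version_info=None):
    """Return True if the (major, minor) version is one mentioned above."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in DOCUMENTED_VERSIONS


def missing_packages(names):
    """Return the top-level package names not found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Documented Python version:", is_documented_python())
    # Assumed import names; adjust if your installation differs.
    print("Missing packages:", missing_packages(["paperparser", "chemdataextractor"]))
```

An empty "Missing packages" list suggests the pip install step ran in the environment you are currently using; it does not confirm that the ChemDataExtractor data files were downloaded.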
PaperParser uses the following open-source packages in its implementation:
An example of each tool that makes up paperparser is contained within the Jupyter notebook examples/example_notebook.ipynb. This notebook should not require installation of paperparser if run in the original directory structure.