Add plotnine to terra-jupyter-python #126

Closed
mbookman opened this issue May 27, 2020 · 1 comment

Comments

@mbookman

plotnine is an implementation of a grammar of graphics in Python; it is based on ggplot2.

Having this available in terra-jupyter-python would let the code for R and Python notebooks in Terra look similar, decreasing the cognitive load on researchers who go back and forth between the two languages.

Note that the AoU Dockerfile adds plotnine.

deflaux added a commit to deflaux/terra-docker that referenced this issue Feb 3, 2021
Changes include:
* removed all version pins
* rearranged packages alphabetically into one pip install command
* removed explicit installation of redundant packages that will be installed as a side effect by higher level packages
  * google-api-core
  * matplotlib
  * numpy
  * pandas
  * protobuf
* added package plotnine per DataBiosphere#126
* added package tensorflow_cpu to get rid of the warnings about GPUs being unavailable for Terra Cloud Runtimes
* the `--use_rest_api` flag is now needed for the `%%bigquery` magic
  * As of release [google-cloud-bigquery 1.26.0 (2020-07-20)](https://github.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1260-2020-07-20) the BigQuery Python client uses the BigQuery Storage client by default.
  * This currently causes an error on Terra Cloud Runtimes: `the user does not have 'bigquery.readsessions.create' permission for '<Terra billing project id>'`.
  * To work around this, we uninstall the dependency `google-cloud-bigquery-storage` so that the `--use_rest_api` flag can be used with `%%bigquery` to fall back to the older, slower mechanism for data transfer.
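
As a rough sketch of how the workaround could be checked from inside a notebook: the snippet below tests whether the `google-cloud-bigquery-storage` distribution (the dependency named in the later commit messages) is still installed. The helper name `is_installed` is hypothetical and not part of terra-docker; it relies only on the standard library.

```python
from importlib import metadata


def is_installed(dist_name: str) -> bool:
    """Return True if a distribution with this name is installed."""
    try:
        metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False
    return True


# Assumption: the image is expected to ship without this package,
# so that `%%bigquery --use_rest_api` takes the REST download path.
if is_installed("google-cloud-bigquery-storage"):
    print("google-cloud-bigquery-storage is still installed")
else:
    print("google-cloud-bigquery-storage is not installed")
```

Outside of the cell magic, passing `create_bqstorage_client=False` to `to_dataframe()` on a query result is the analogous way to ask the client for the REST-based download.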
deflaux added a commit to deflaux/terra-docker that referenced this issue Mar 1, 2021
Also:
* added package `plotnine` per DataBiosphere#126
* replaced package `tensorflow` with `tensorflow_cpu` to get rid of the warnings about GPUs being unavailable for Terra Cloud Runtimes
* added package `google-resumable-media` as an explicit dependency to ensure a more recent version of it is used; pandas-gbq depends on it for table uploads
* the `--use_rest_api` flag is now needed for the `%%bigquery` magic
  * As of release [google-cloud-bigquery 1.26.0 (2020-07-20)](https://github.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1260-2020-07-20) the BigQuery Python client uses the BigQuery Storage client by default.
  * This currently causes an error on Terra Cloud Runtimes: `the user does not have 'bigquery.readsessions.create' permission for '<Terra billing project id>'`.
  * To work around this, we must uninstall the dependency `google-cloud-bigquery-storage` so that the `--use_rest_api` flag can be used with `%%bigquery` to fall back to the older, slower mechanism for data transfer.
deflaux added a commit to deflaux/terra-docker that referenced this issue Mar 5, 2021
Also:
* added package `plotnine` per DataBiosphere#126
* replaced package `tensorflow` with `tensorflow_cpu` to get rid of the warnings about GPUs being unavailable for Terra Cloud Runtimes
* added package `google-resumable-media` as an explicit dependency to ensure a more recent version of it is used; pandas-gbq depends on it for table uploads
* the `--use_rest_api` flag is now needed for the `%%bigquery` magic
  * As of release [google-cloud-bigquery 1.26.0 (2020-07-20)](https://github.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1260-2020-07-20) the BigQuery Python client uses the BigQuery Storage client by default.
  * This currently causes an error on Terra Cloud Runtimes: `the user does not have 'bigquery.readsessions.create' permission for '<Terra billing project id>'`.
  * To work around this, we must uninstall the dependency `google-cloud-bigquery-storage` so that the `--use_rest_api` flag can be used with `%%bigquery` to fall back to the older, slower mechanism for data transfer.
deflaux added a commit to deflaux/terra-docker that referenced this issue Mar 5, 2021
Also:
* added package `plotnine` per DataBiosphere#126
* replaced package `tensorflow` with `tensorflow_cpu` to get rid of the warnings about GPUs being unavailable for Terra Cloud Runtimes
* added package `google-resumable-media` as an explicit dependency to ensure a more recent version of it is used; pandas-gbq depends on it for table uploads
* the `--use_rest_api` flag is now needed for the `%%bigquery` magic
  * As of release [google-cloud-bigquery 1.26.0 (2020-07-20)](https://github.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1260-2020-07-20) the BigQuery Python client uses the BigQuery Storage client by default.
  * This currently causes an error on Terra Cloud Runtimes: `the user does not have 'bigquery.readsessions.create' permission for '<Terra billing project id>'`.
  * To work around this, we uninstall the dependency `google-cloud-bigquery-storage` so that the `--use_rest_api` flag can be used with `%%bigquery` to fall back to the older, slower mechanism for data transfer.
* add nbstripout to terra-jupyter-aou and enable it globally
* improve test coverage by enabling tests that were intentionally commented out for the prior image
deflaux added a commit that referenced this issue Mar 23, 2021
Update all Python package versions by unpinning them.

Also:
* added package `plotnine` per #126
* replaced package `tensorflow` with `tensorflow_cpu` to get rid of the warnings about GPUs being unavailable for Terra Cloud Runtimes
* added package `google-resumable-media` as an explicit dependency to ensure a more recent version of it is used; pandas-gbq depends on it for table uploads
* the `--use_rest_api` flag is now needed for the `%%bigquery` magic
  * As of release [google-cloud-bigquery 1.26.0 (2020-07-20)](https://github.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1260-2020-07-20) the BigQuery Python client uses the BigQuery Storage client by default.
  * This currently causes an error on Terra Cloud Runtimes: `the user does not have 'bigquery.readsessions.create' permission for '<Terra billing project id>'`.
  * To work around this, we uninstall the dependency `google-cloud-bigquery-storage` so that the `--use_rest_api` flag can be used with `%%bigquery` to fall back to the older, slower mechanism for data transfer.
* add `nbstripout` to `terra-jupyter-aou` and enable it globally
* improve test coverage by
  * enabling tests that were intentionally commented out for the prior image
  * adding tests for `gcloud`, `gsutil`, and `bq` command line tools
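
A smoke test for the command line tools mentioned above might look something like the following. This is only a sketch, not the test that was actually added to the image; the helper name `missing_tools` is hypothetical, and it only checks that each executable can be found on `PATH`, not that it runs correctly.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Tool names taken from the commit message above.
missing = missing_tools(["gcloud", "gsutil", "bq"])
print("missing tools:", ", ".join(missing) if missing else "none")
```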
@Qi77Qi
Collaborator

Qi77Qi commented Apr 22, 2021

think @deflaux's PR has already addressed this...

Qi77Qi closed this as completed Apr 22, 2021