Upload the documentation as release assets #3731

Open
2 of 5 tasks
seisman opened this issue Dec 30, 2024 · 6 comments

seisman (Member) commented Dec 30, 2024

The PyGMT documentation is hosted in the gh-pages branch, with docs of each PyGMT release stored in subdirectories, and docs of the main branch in the dev subdirectory.

[Image: Screenshot from 2024-12-30 15-40-06]

Currently, the documentation has no backups and we may lose all the documentation in some rare cases, e.g.,

  • Someone accidentally deletes the gh-pages branch while cleaning up stale branches (this happened once in the GMT repository; see "Web documentation missing", gmt#7338)
  • Changes to the "Push the built HTML to gh-pages" step in the "Docs" workflow have uncaught flaws [That's why I'm hesitant to change that step, since it has been working well for a long time]
  • The "Push the built HTML to gh-pages" step in the "Docs" workflow fails and leads to partial uploads
  • maybe more

Technically speaking, we can restore the PyGMT documentation, since:

  • we may have the gh-pages branch in our local copy (maybe not)
  • we can create local environments and rebuild the docs for each PyGMT version [It takes a lot of effort]

I think the easiest solution is to upload the documentation of each release as release assets. Then if we lose the documentation, we just need to:

  1. create a new gh-pages branch
  2. download the documentation for each release from the release page
  3. organize the documentation into subdirectories and add them to the gh-pages branch
  4. push the docs to the remote gh-pages branch
  5. manually trigger the "Docs" workflow once
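
The five steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: each release is assumed to carry a docs archive named `pygmt-docs.zip` (a hypothetical asset name, since nothing has been uploaded yet), and the archive is assumed to unpack to a single top-level `pygmt-docs/` directory. The function `print_restore_plan` is illustrative; it only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Print (not run) the commands that would rebuild gh-pages from release assets.
# ASSET is a hypothetical asset name; adjust it to whatever is actually uploaded.
ASSET="pygmt-docs.zip"

print_restore_plan() {
    # Step 1: start a new, history-free gh-pages branch.
    echo "git switch --orphan gh-pages"
    local tag
    for tag in "$@"; do
        # Step 2: fetch the docs archive attached to this release.
        echo "gh release download ${tag} --pattern ${ASSET} --dir downloads/${tag}"
        # Step 3: unpack it and move the docs into the version subdirectory.
        # Assumes the archive contains a top-level pygmt-docs/ directory.
        echo "unzip -q downloads/${tag}/${ASSET} -d downloads/${tag}"
        echo "mv downloads/${tag}/pygmt-docs ${tag}"
        echo "git add ${tag}"
    done
    echo "git commit -m 'Restore documentation from release assets'"
    # Step 4: publish the rebuilt branch.
    echo "git push origin gh-pages"
    # Step 5: trigger the "Docs" workflow so the dev docs are regenerated.
    echo "gh workflow run Docs"
}

# Example: show the plan for two releases.
print_restore_plan v0.13.0 v0.14.0
```

Printing the plan instead of executing it keeps the sketch safe to try in any checkout; the output can be piped to a file, edited, and run once it looks right.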

Looking at the current gh-pages branch, there are more files/directories in addition to the version subdirectories, so the above steps are not enough to restore the gh-pages branch. Here are the additional files/directories:

  • .buildinfo: This file was added 6 years ago. At that time, files were pushed to the root directory instead of subdirectories in the gh-pages branch. In other words, this file should be deleted.
  • .nojekyll: Tell GitHub to not build this branch using Jekyll [Automatically created by the workflow]
  • CNAME: Set up the docs domain (xref: https://docs.github.com/en/pages/configuring-a-custom-domain-for-your-github-pages-site/troubleshooting-custom-domains-and-github-pages#cname-errors) [It was manually added and maintained. I think we should create it automatically in the workflow]
  • index.html: The index page, which redirects to /latest/ automatically [Again, this file should be automatically created in the workflow]
  • latest: Symlink to the latest release [Automatically linked to the latest version only when making a release. So it's likely we have to create the symlink manually, but it's technically possible to determine the latest version using the gh release list command]
  • dev: Contains the dev docs [Automatically created by the workflow]
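
Creating the root-level files automatically, as suggested in the list above, might look like the sketch below. The function names, the redirect markup, and the example domain are assumptions for illustration, not the project's actual workflow code; the domain is passed in as an argument rather than hard-coded. `sort -V` (version sort) is assumed to be available, as it is in GNU coreutils.

```shell
#!/usr/bin/env bash
# Pick the highest version from a list of tags read on stdin, for example from
#   gh release list --json tagName --jq '.[].tagName'
# (the --json flag needs a reasonably recent gh; treat that call as an assumption).
latest_tag() {
    sort -V | tail -n 1
}

# Recreate the files at the root of gh-pages that belong to no single version.
# Usage: write_root_files <site_dir> <domain> <latest_tag>
write_root_files() {
    local site_dir="$1" domain="$2" latest="$3"
    # Tell GitHub Pages not to build this branch with Jekyll.
    touch "${site_dir}/.nojekyll"
    # Custom domain for the docs site.
    printf '%s\n' "${domain}" > "${site_dir}/CNAME"
    # Index page that redirects to /latest/ (illustrative markup).
    cat > "${site_dir}/index.html" <<'EOF'
<!DOCTYPE html>
<meta http-equiv="refresh" content="0; url=latest/">
<link rel="canonical" href="latest/">
EOF
    # Point "latest" at the newest release directory via a relative symlink.
    ln -sfn "${latest}" "${site_dir}/latest"
}

# Example with made-up inputs: choose the newest tag, then write the files.
site="$(mktemp -d)"
newest="$(printf '%s\n' v0.9.0 v0.14.0 v0.10.0 | latest_tag)"
mkdir "${site}/${newest}"
write_root_files "${site}" "docs.example.org" "${newest}"
```

Version sort matters here because a plain lexical sort would place v0.9.0 after v0.14.0.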

If agreed, here are the things we need to do:

seisman added the "discussions" label (Need more discussion before taking further actions) on Dec 30, 2024
seisman (Member, Author) commented Jan 6, 2025

Ping @GenericMappingTools/pygmt-maintainers for comments.

weiji14 (Member) commented Jan 6, 2025

> • Accidentally delete the gh-pages branch when someone tries to clean up stale branches (happened once in the GMT repository)

I just set a branch protection rule for gh-pages to prevent accidental deletion, and it should also disable force pushes.

> I think the easiest solution is to upload the documentation of each release as release assets. Then if we lose the documentation, we just need to:
>
>   1. create a new gh-pages branch
>   2. download the documentation for each release from the release page
>   3. organize the documentation into subdirectories and add them to the gh-pages branch
>   4. push the docs to the remote gh-pages branch
>   5. manually trigger the "Docs" workflow once

Steps 1-3 are still a lot of work 😅 I do see the value of saving the HTML documentation (or a PDF copy, xref #1606) on GitHub or Zenodo for long-term archival. But for restoring the gh-pages branch, it might be easier to just mirror it on another git hosting site, as we already do on https://www.dagshub.com/GenericMappingTools/pygmt/src/gh-pages, though we should probably do it in a more systematic way.

seisman (Member, Author) commented Jan 7, 2025

> • Accidentally delete the gh-pages branch when someone tries to clean up stale branches (happened once in the GMT repository)
>
> I just set a branch protection rule for gh-pages to prevent accidental deletion, and it should also disable force pushes.

We can't disable force pushes, since the doc deploy script always does force pushes:

git push -fq origin gh-pages 2>&1 >/dev/null

> I think the easiest solution is to upload the documentation of each release as release assets. Then if we lose the documentation, we just need to:
>
>   1. create a new gh-pages branch
>   2. download the documentation for each release from the release page
>   3. organize the documentation into subdirectories and add them to the gh-pages branch
>   4. push the docs to the remote gh-pages branch
>   5. manually trigger the "Docs" workflow once
>
> Step 1-3 is still a lot of work 😅

Not that much work since we can use gh release download to download the pre-uploaded docs.

> I do see the value of saving the HTML documentation (or a PDF copy, xref #1606) on GitHub or Zenodo for long term archival. But for restoring the gh-pages branch, it might be easier to just mirror it on another git hosting site as we already have on https://www.dagshub.com/GenericMappingTools/pygmt/src/gh-pages, probably do it in a more systematic way.

Good to know that at least we have a copy on DagsHub.

Anyway, it does no harm to upload the HTML documentation as a release asset, with a name like pygmt-v0.14.0-docs.zip?

seisman (Member, Author) commented Jan 7, 2025

> • Accidentally delete the gh-pages branch when someone tries to clean up stale branches (happened once in the GMT repository)
>
> I just set a branch protection rule for gh-pages to prevent accidental deletion, and it should also disable force pushes.
>
> We can't disable force pushes, since the doc deploy script always does force pushes:

I've enabled force pushes so that the doc deploy script can update the gh-pages branch. The branch protection rule is still useful for preventing deletion.

seisman added the "maintenance" label (Boring but important stuff for the core devs) and removed the "discussions" label (Need more discussion before taking further actions) on Jan 7, 2025
seisman (Member, Author) commented Jan 10, 2025

  • Upload the documentation of old versions to the release page [manually]

This can be done by the following commands:

# Clone the gh-pages branch into a separate directory.
git clone -b gh-pages git@github.com:GenericMappingTools/pygmt.git pygmt-gh-pages

cd pygmt-gh-pages

# Create a directory for storing the zip files
mkdir -p zipfiles

# Loop over versions and create zip files named pygmt-<version>-docs.zip.
# A plain glob avoids parsing the output of ls; variables are quoted defensively.
for version in v0.*; do
    echo "${version}"
    cp -r "${version}/" "pygmt-${version}-docs/"
    zip -r "zipfiles/pygmt-${version}-docs.zip" "pygmt-${version}-docs/"
    rm -r "pygmt-${version}-docs"
done

The above commands create zip files named pygmt-<version>-docs.zip, and the top-level directory inside each zip file has the same name (without the .zip extension).

Then we can upload them as release assets using the command below (not tested yet):

for version in v0.*; do
    gh release upload "${version}" "zipfiles/pygmt-${version}-docs.zip"
done
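
Since the upload loop has not been tested, a dry run may be useful before touching real releases. The sketch below wraps the same `gh release upload` call in a hypothetical `upload_docs` function: it skips versions whose zip file is missing and, unless `DRY_RUN=0` is set, only prints what it would run. Both `upload_docs` and `DRY_RUN` are illustrative names, not part of `gh`.

```shell
#!/usr/bin/env bash
# Upload zipfiles/pygmt-<version>-docs.zip to the release tagged <version>.
# DRY_RUN defaults to 1, so nothing is uploaded unless it is set to 0.
upload_docs() {
    local version zipfile
    for version in "$@"; do
        zipfile="zipfiles/pygmt-${version}-docs.zip"
        # Skip versions for which no zip file was created.
        if [ ! -f "${zipfile}" ]; then
            echo "skip ${version}: ${zipfile} not found" >&2
            continue
        fi
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: show the command instead of running it.
            echo "gh release upload ${version} ${zipfile}"
        else
            gh release upload "${version}" "${zipfile}"
        fi
    done
}

# Dry run over every version directory in the gh-pages checkout:
#   upload_docs v0.*
# Real upload:
#   DRY_RUN=0 upload_docs v0.*
```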

Please let me know if you have any comments before I take action.

seisman self-assigned this on Jan 10, 2025
seisman (Member, Author) commented Jan 11, 2025

On second thought, maybe we should just name the zip files pygmt-docs.zip, without the version string, similar to what we do for baseline-images.zip.

Then the commands will be:

# Clone the gh-pages branch into a separate directory.
git clone -b gh-pages git@github.com:GenericMappingTools/pygmt.git pygmt-gh-pages

cd pygmt-gh-pages

# Create a directory for storing the zip files
mkdir -p zipfiles

# Loop over versions and create one zip file per version at zipfiles/<version>/pygmt-docs.zip.
# A plain glob avoids parsing the output of ls; variables are quoted defensively.
for version in v0.*; do
    echo "${version}"
    cp -r "${version}/" pygmt-docs/
    mkdir -p "zipfiles/${version}/"
    zip -r "zipfiles/${version}/pygmt-docs.zip" pygmt-docs/
    rm -r pygmt-docs
done

and

for version in v0.*; do
    gh release upload "${version}" "zipfiles/${version}/pygmt-docs.zip"
done
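
After uploading, it may be worth confirming that every release actually carries the asset. One possible check is sketched below, assuming the installed `gh` supports `gh release view --json assets`. `has_asset` and `check_releases` are hypothetical helpers; `has_asset` reads asset names on stdin so the matching logic can be tried without network access.

```shell
#!/usr/bin/env bash
# Succeed if the asset name given as $1 appears as a whole line on stdin.
has_asset() {
    grep -Fxq -- "$1"
}

# Report releases that lack pygmt-docs.zip. Needs network access and gh auth.
check_releases() {
    local version
    for version in "$@"; do
        if ! gh release view "${version}" --json assets --jq '.assets[].name' \
                | has_asset "pygmt-docs.zip"; then
            echo "missing pygmt-docs.zip: ${version}"
        fi
    done
}

# Example, run from the gh-pages checkout:
#   check_releases v0.*
```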
