Migrate ArchivedProject content to ActiveProject #2170

Merged
bemoody merged 2 commits into dev from tp/migrate_archived_project
Jan 17, 2024

Conversation

tompollard
Member

As discussed in #2166, we would like to migrate content from the (now deprecated) ArchivedProject model to the ActiveProject model.

This pull request:

  1. Adds an Author to the "Failed demo software for parsing clinical notes" (slug="t2ASGLbIBoWaTJvPrM2A") fixture.
  2. Creates a migration that migrates ArchivedProjects (and associated objects, like authors, references, logs, etc) to ActiveProjects with SubmissionStatus=ARCHIVED.

I have tested this fairly comprehensively and have not found any issues.

Note that files are not migrated. Files for archived projects will remain in the pn-media/archived-projects subfolder. We should migrate them to pn-media/active-projects later.
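
A later file move could be sketched roughly as below. This is only an illustration, not part of this pull request: it assumes each project's files live in a subfolder named after its slug under `archived-projects`, which is not confirmed here, and `move_archived_files` is a hypothetical helper name. It defaults to a dry run so the planned moves can be reviewed first.

```python
import shutil
from pathlib import Path


def move_archived_files(media_root, slugs, dry_run=True):
    """Move per-project folders from archived-projects to active-projects.

    Assumes (unverified) a layout of <media_root>/archived-projects/<slug>/.
    Returns the slugs that were moved, or would be moved in a dry run.
    """
    src_root = Path(media_root) / "archived-projects"
    dst_root = Path(media_root) / "active-projects"
    moved = []
    for slug in slugs:
        src = src_root / slug
        dst = dst_root / slug
        # Skip projects with no archived folder, and never overwrite a destination.
        if not src.is_dir() or dst.exists():
            continue
        if not dry_run:
            dst_root.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
        moved.append(slug)
    return moved
```

Calling it once with `dry_run=True` and once with `dry_run=False` gives a chance to compare the planned list against the migrated slugs before anything changes on disk.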

One way of testing is to:

1. Reset to c49b2a1c640a0a6c07ee135f3ff84d45e2baa4e3 (e.g. with git reset c49b2a1c640a0a6c07ee135f3ff84d45e2baa4e3 --hard), which is the pull request prior to the archived status being added to the ActiveProject object.
2. Create a bunch of interesting ArchivedProjects using the console. Add references, co-authors, copyediting history, etc.
3. Pull down the latest version of this branch.
4. Run the migrations to migrate ArchivedProjects to ActiveProjects.
5. Try viewing archived projects in the console, author project page, etc.
6. Change the submission status of the archived projects to unsubmitted (p.submission_status = 0) and save the change.
7. Check whether the now-active projects are correctly populated in the author submission system. You should find that all pages work correctly, except http://localhost:8000/projects/SLUG/files/, which will raise a FileNotFoundError as expected.

@tompollard tompollard force-pushed the tp/migrate_archived_project branch 2 times, most recently from c6e3870 to 0e5a9ab Compare December 22, 2023 17:09
@tompollard tompollard requested a review from bemoody December 22, 2023 17:15
@bemoody
Collaborator

bemoody commented Dec 22, 2023

Makes sense. Thanks.

I think that we want to delete the ArchivedProject after migrating it.

@tompollard
Member Author

I think that we want to delete the ArchivedProject after migrating it.

Thanks Benjamin. Deleted!

@bemoody
Collaborator

bemoody commented Dec 22, 2023

Interesting error.

call_command('purgeaccounts')
-> user.delete()
-> django.db.utils.OperationalError: no such table: project_archivedproject

Perhaps because of the editor field on ArchivedProject?

But I didn't mean deleting the whole table necessarily.

I meant, within your migrate_archived_to_active, after you've saved the new objects, then delete the old object. So it all happens as a single transaction.
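
The copy-then-delete idea can be illustrated outside Django with a minimal `sqlite3` sketch. The table names, columns, and the `ARCHIVED` value below are placeholders rather than the project's real schema, and `make_demo_db` is a hypothetical helper. The point is only that the insert and the delete share one transaction, so a failure part-way through leaves the archived rows untouched.

```python
import sqlite3

# Placeholder only; the real SubmissionStatus.ARCHIVED value is not shown in this thread.
ARCHIVED = 99


def make_demo_db():
    """Build an in-memory database with two archived rows (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE archived_project (slug TEXT PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE active_project "
        "(slug TEXT PRIMARY KEY, title TEXT, submission_status INTEGER)"
    )
    conn.executemany(
        "INSERT INTO archived_project VALUES (?, ?)",
        [("a", "First"), ("b", "Second")],
    )
    conn.commit()
    return conn


def migrate_archived_to_active(conn):
    """Copy each archived row to the active table, then delete the original."""
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so every copy and delete in the loop succeeds or fails together.
    with conn:
        rows = conn.execute(
            "SELECT slug, title FROM archived_project ORDER BY slug"
        ).fetchall()
        for slug, title in rows:
            conn.execute(
                "INSERT INTO active_project (slug, title, submission_status) "
                "VALUES (?, ?, ?)",
                (slug, title, ARCHIVED),
            )
            conn.execute("DELETE FROM archived_project WHERE slug = ?", (slug,))
    return len(rows)
```

If an insert fails (for example, a slug already present in the active table), the rollback restores the rows deleted earlier in the same run, so no project ends up in neither table.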

@tompollard tompollard force-pushed the tp/migrate_archived_project branch 2 times, most recently from 8b335ff to b72308d Compare December 22, 2023 23:46
@tompollard
Member Author

I meant, within your migrate_archived_to_active, after you've saved the new objects, then delete the old object. So it all happens as a single transaction.

Sorry, misunderstood! Now done. I'll remove the table in a later pull request in this case.

@bemoody
Collaborator

bemoody commented Jan 8, 2024

I'm not sure if there is a reason for the particular set of fields being copied from ArchivedProject to ActiveProject (in line 28 onwards.) As it is, this is losing some information from the old project (e.g. installation, acknowledgements, release_notes.)

It's instructive to look at the old archive() function which converted ActiveProjects into ArchivedProjects. It did this:

        archived_project = ArchivedProject(archive_reason=archive_reason,
            slug=self.slug)

        modified_datetime = self.modified_datetime

        # Direct copy over fields
        for attr in [f.name for f in Metadata._meta.fields] + [f.name for f in SubmissionInfo._meta.fields]:
            setattr(archived_project, attr, getattr(self, attr))

In other words, it would copy all of the fields in Metadata or SubmissionInfo, plus slug and modified_datetime.

The only fields that weren't copied were is_new_version, latest_reminder, doi, and submission_status. The first two don't seem important. Missing doi seems like a bug but probably never came up. submission_status doesn't exist for ArchivedProject.

Unfortunately we can't use the same approach in migrate_archived_to_active because the migration system doesn't know about abstract classes like Metadata and SubmissionInfo. But I think the following would be good enough:

    for archived_project in ArchivedProject.objects.all():
        active_project = ActiveProject(
            submission_status=SubmissionStatus.ARCHIVED.value,
        )
        for attr in [f.name for f in ArchivedProject._meta.fields]:
            if attr != 'id':
                setattr(active_project, attr, getattr(archived_project, attr))
        active_project.save()
        ...
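
The same copy-everything-except-`id` pattern can be tried without a database using plain dataclasses. The classes and field lists here are stand-ins invented for illustration, not the real models. The sketch also reports source fields that the target lacks, since `setattr` on a Django model instance accepts such names without complaint but `save()` would not persist them.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ArchivedRecord:
    # Stand-in for ArchivedProject; the field list is illustrative only.
    id: int
    slug: str
    title: str
    installation: str
    archive_reason: int


@dataclass
class ActiveRecord:
    # Stand-in for ActiveProject; the field list is illustrative only.
    id: Optional[int] = None
    slug: str = ""
    title: str = ""
    installation: str = ""
    submission_status: int = 0


def copy_shared_fields(source, target, skip=("id",)):
    """Copy every source field that the target also defines, except those in skip.

    Returns (copied, skipped): names that were copied, and names present on
    the source but missing from the target.
    """
    target_names = {f.name for f in fields(target)}
    copied, skipped = [], []
    for f in fields(source):
        if f.name in skip:
            continue
        if f.name not in target_names:
            skipped.append(f.name)
            continue
        setattr(target, f.name, getattr(source, f.name))
        copied.append(f.name)
    return copied, skipped
```

Checking the `skipped` list before running a real migration is one way to confirm that nothing from the old model is being dropped silently.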

@tompollard tompollard force-pushed the tp/migrate_archived_project branch from b72308d to 4d9494a Compare January 9, 2024 12:42
@tompollard
Member Author

I'm not sure if there is a reason for the particular set of fields being copied from ArchivedProject to ActiveProject (in line 28 onwards.)

I'd walked through the fields, picking ones that I thought were important to keep. I have now implemented your more-complete approach. Good suggestion, thanks.

@bemoody
Collaborator

bemoody commented Jan 17, 2024

thanks, looks great!

@bemoody bemoody merged commit 2d97768 into dev Jan 17, 2024
11 checks passed
@bemoody bemoody deleted the tp/migrate_archived_project branch January 17, 2024 17:05
tompollard added a commit that referenced this pull request Sep 25, 2024
This pull request removes the unused `ArchivedProject` model.

We now track the "Archived" status of projects on the ActiveProject
model (see: #2149).

In #2170 we migrated all
ArchivedProject content to ActiveProjects.