Data Migration Script Draft #19

Open
wants to merge 11 commits into
base: main

Conversation

jersey1dev

Currently, this Python script migrates data from the test environment's entity, planter, and trees tables to the corresponding tables in the dev environment, and resolves the column differences between the two environments' tables.
However, I have not yet implemented a way to migrate only the rows with specified ids in the entity table, so the code currently migrates all rows. I'd really appreciate it if you could review the code and tell me how I should expose an interface that lets users specify the desired entity.id values. (Currently, the user has to edit the entity_ids list by hand, which I doubt is good enough.)
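
One possible way to expose the ids, shown only as a minimal sketch: accept them as a command-line option instead of editing the list in the source. The function name `parse_entity_ids` and the `--entity-ids` flag are hypothetical and not part of this PR; an empty list is assumed to mean "migrate all rows", matching the current behavior.

```python
import argparse


def parse_entity_ids(argv=None):
    """Return the entity ids given on the command line (hypothetical interface)."""
    parser = argparse.ArgumentParser(
        description="Migrate entity, planter, and trees rows between environments."
    )
    parser.add_argument(
        "--entity-ids",
        type=int,
        nargs="*",
        default=[],
        help="entity.id values to migrate; omit to migrate all rows",
    )
    args = parser.parse_args(argv)
    return args.entity_ids


if __name__ == "__main__":
    # The result would take the place of the hand-edited entity_ids list.
    entity_ids = parse_entity_ids()
    print(entity_ids)
```

With this in place the script could be run with `--entity-ids 12 34` appended to the usual command, and with no flag to keep migrating everything.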

@jersey1dev marked this pull request as ready for review July 9, 2024 16:57
@jersey1dev
Author

I have added the functionality to filter by entity_id and have successfully run the code on the test environment. Please let me know whether it works in the production environment and whether any more features are needed!

Comment on lines +15 to +17
organization_query=SELECT [columns] FROM entity WHERE entity.id > %s and (type = 'o' or type = 'O') and entity.id in (%s) ORDER BY entity.id LIMIT %s;
planter_query=SELECT [columns] FROM planter, entity WHERE planter.id > %s and entity.id = planter.organization_id and (entity.type = 'o' or entity.type = 'O') and planter.organization_id in (%s) ORDER BY planter.id LIMIT %s;
tree_query=SELECT [columns] FROM trees, planter, entity WHERE trees.id > %s and planter_id = planter.id and entity.id = planter.organization_id and (entity.type = 'o' or entity.type = 'O') and entity.id in (%s) ORDER BY trees.id LIMIT %s;
Contributor

I feel you don't need these settings anymore, right?

Comment on lines +140 to +143
if entity_ids:  # use the specified entity id as filter condition
    query = query[:-1] + " WHERE "
    filter_column = "id" if table_name == "entity" else "organization_id" if table_name == "planter" else "planter_id"
    entity_ids_str = ", ".join(map(str, entity_ids))
Contributor

@1nzexi I can't follow the logic here: how do we get the ids used for the filter? I couldn't find where entity_ids gets updated. Where does the data come from?

Author

The entity_ids variable is a Python list defined on line 221. Currently, users have to specify the ids manually in that list. While this is obviously not a robust way of specifying input, I figured it would be a good place to start testing the script's functionality. Please let me know if you think there should be a better way to specify the ids to filter on, one that integrates better with the existing workflows. Open to all ideas!
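
For illustration, here is a minimal sketch of how the filter clause could be built from the list with `%s` placeholders (the style the configured queries already use) rather than by joining the ids into the SQL string. The names `FILTER_COLUMNS` and `build_filter_clause` are hypothetical; the column mapping simply mirrors the hunk under review, and the `IN (...)` form is an assumption about how the clause is completed.

```python
# Column used to filter each table, mirroring the conditional in the hunk above.
FILTER_COLUMNS = {"entity": "id", "planter": "organization_id"}


def build_filter_clause(table_name, entity_ids):
    """Return (sql_fragment, params) for filtering table_name by entity_ids.

    An empty list yields no filter, so all rows would be migrated.
    """
    if not entity_ids:
        return "", []
    # Any other table falls back to planter_id, as in the original code.
    column = FILTER_COLUMNS.get(table_name, "planter_id")
    placeholders = ", ".join(["%s"] * len(entity_ids))
    return f" WHERE {column} IN ({placeholders})", list(entity_ids)
```

The returned fragment would be appended to the query text and the params passed separately to the database driver's execute call, so the ids never need to be formatted into the SQL by hand.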

Author

Also, sorry for taking so long to reply. Somehow GitHub did not notify me of your comment.

@jersey1dev requested a review from dadiorchen July 28, 2024 15:33

3 participants