Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: suspension type validation refactor #1139

Closed
wants to merge 3 commits into from

Conversation

nayib-jose-gloria
Copy link
Contributor

@nayib-jose-gloria nayib-jose-gloria commented Dec 2, 2024

Reason for Change

  • Suspension type implementation introduced needless complication into schema definition yaml and was not performant, in exchange for more detailed error messages
  • I don't think this trade-off has been worthwhile, so I propose simplifying the suspension type validation to follow patterns other fields' validation use and improve performance by reducing the total number of pandas df queries the validator runs.
  • TODO: Refactor loses ability to warn but still accept unrecognized assays. Must account for that

Changes

  • remove "complex_rule" pattern in schema_definition.yaml, use 'dependencies -> rule' pattern
  • remove "complex_rule" implementation in valdiate.py (designed to be extensible, but no new field using it has come up in the last two years)
  • group queries by suspension type (3) rather than by assay type (25 and counting)
  • use slightly more generic error messages to accommodate this pattern (instead of specifying what suspension_types are valid for a failing assay x suspension_type combo, just flags that it is an invalid combo and to check the schema for a valid suspension type for the given assay type)
  • small refactor to schema_definition.yaml to list all dependent validation rules for a field under "dependencies" rather than supporting both a list under 'dependencies' and a "default" validation defined outside of "dependencies". This affected self_reported_ethnicity_ontology_term_id and development_stage_ontology_term_id.
  • small refactor to "_validate_column" command to account for removing "default" dependency behavior.

Testing

  • unit tests for regression testing
  • TBD notebook testing for performance improvement

Notes for Reviewer

@nayib-jose-gloria nayib-jose-gloria changed the title Nayib/suspension type refactor chore: suspension type validation refactor Dec 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant