Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Append filterSql to source variant_to_person in cohort count #1130

Merged
merged 4 commits into from
Jan 16, 2025

Conversation

dexamundsen
Copy link
Contributor

@dexamundsen dexamundsen commented Jan 15, 2025

issue: current generated SQL fetches all of the varinat_to_person table and then applies a filter. This makes the query too slow and expensive

Fix: apply the filter to the varinat_to_person directly

after change:

SELECT
  COUNT(id) AS T_CTDT
FROM
  `vwb-dev-blissful-eggplant-4805.aou_test_data_SC2023Q3R2_index_121624`.T_ENT_person
WHERE
  (id IN (
    SELECT
      flattened_person_id
    FROM (
      SELECT
        DISTINCT vid,
        flattened_person_id
      FROM
        `vwb-dev-blissful-eggplant-4805.aou_test_data_SC2023Q3R2.variant_to_person`
      CROSS JOIN
        UNNEST(person_ids) AS flattened_person_id
      WHERE
        vid IN (
        SELECT
          id
        FROM
          `vwb-dev-blissful-eggplant-4805.aou_test_data_SC2023Q3R2_index_121624`.T_ENT_variant
        WHERE
          (id IN (
            SELECT
              id
            FROM
              `vwb-dev-blissful-eggplant-4805.aou_test_data_SC2023Q3R2_index_121624`.T_ESA_variant_gene
            WHERE
              gene = 'GENE'))
          AND (EXISTS (
            SELECT
              *
            FROM
              UNNEST(consequence) AS flattened
            WHERE
              flattened IN ('missense_variant')))))))
ORDER BY
  T_CTDT DESC
LIMIT
  1000000

@tjennison-work
Copy link
Collaborator

Can we just always append if it's more efficient or does that not work for some cases?

Copy link
Collaborator

@freemabd freemabd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change. LGTM

@dexamundsen
Copy link
Contributor Author

Can we just always append if it's more efficient or does that not work for some cases?

I will discuss this in the standup with Tim. I have not looked at all the use cases where this is used. At the very least there may be different field names that need to be updated. Hence made this optional.
Also note that this change uses an existing config that reads from source table, which is also used by only variant_person entity group.
so theoretically any query that reads from source table (like variantPerson) will get this appended.

@dexamundsen dexamundsen merged commit 2556ab4 into main Jan 16, 2025
8 checks passed
@dexamundsen dexamundsen deleted the dexamundsen/0114 branch January 16, 2025 18:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants