Feature/spark expectation enhancements#123 #125

sudeep7978
Contributor

Add Column for Column-Level Visibility in Data Quality Framework Result Table

Description

Schema Evolution with AutoMerge:
Enabled Delta Lake's spark.databricks.delta.schema.autoMerge.enabled configuration to allow schema evolution during write operations.
Modified the data quality framework to include the affected_column_name field dynamically if not already present.
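
For reviewers, the two changes above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code in this PR: the helper names (`missing_columns`, `with_affected_column`, `append_results`) and the nullable string type for the new column are hypothetical, and only the configuration key and the `affected_column_name` field come from the description.

```python
from typing import Iterable, List

# Both names below are taken from the PR description.
AFFECTED_COLUMN = "affected_column_name"
AUTO_MERGE_CONF = "spark.databricks.delta.schema.autoMerge.enabled"


def missing_columns(existing: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required columns that are absent (case-insensitive comparison)."""
    present = {name.lower() for name in existing}
    return [name for name in required if name.lower() not in present]


def with_affected_column(df):
    """Hypothetical helper: add a null string column when the DataFrame lacks it."""
    # Imported lazily so missing_columns() can be used without a Spark install.
    from pyspark.sql import functions as F

    for name in missing_columns(df.columns, [AFFECTED_COLUMN]):
        df = df.withColumn(name, F.lit(None).cast("string"))
    return df


def append_results(spark, df, table_name: str) -> None:
    """Hypothetical helper: append to a Delta table with schema auto-merge enabled."""
    # With this session setting, columns present in the DataFrame but missing
    # from the target Delta table are added to the table schema on write.
    spark.conf.set(AUTO_MERGE_CONF, "true")
    with_affected_column(df).write.format("delta").mode("append").saveAsTable(table_name)
```

Because the column is only added when it is absent, a result DataFrame produced by an older code path would pass through `with_affected_column` with a null value rather than failing on write.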

ENHANCEMENT

  • Enhanced granularity in data quality reporting.
  • Improved ease of debugging and resolving data quality issues.
  • Better alignment with industry practices for data governance and observability.

Motivation and Context

Increased Transparency: Builds trust by providing clear visibility into how data quality rules are applied and which columns are impacted.
Operational Efficiency: Reduces manual intervention and effort required to diagnose data issues, optimizing resource utilization.

How Has This Been Tested?

Ensured that backward compatibility is maintained for legacy workflows, so that existing pipelines do not break.
Screenshot 2024-12-19 at 10 49 21 PM
Screenshot 2024-12-19 at 10 58 58 PM

Screenshots (if appropriate):

Screenshot 2024-12-19 at 11 07 27 PM

Types of changes

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@sudeep7978
Contributor Author

@asingamaneni Could you please take a look?
Let me know if any changes are required.
Thank you.

@sudeep7978
Contributor Author

@asingamaneni
Closing this PR. I will raise a new PR that picks up the SMTP authentication changes after the other PR is merged.
#54

@sudeep7978 closed this Jan 20, 2025