Bug/fix replicate table primary keys #88
Recommended adding tests for the new logic.
Can't add suggestions for files outside of the scope of the PR, however I recommend adding the following test here:

# Assumes the target test module already imports unittest, singer,
# test_utils and connect_with_backoff.
class TestPrimaryKeyUniqueKey(unittest.TestCase):
    def setUp(self):
        self.conn = test_utils.get_test_connection()
        with connect_with_backoff(self.conn) as open_conn:
            with open_conn.cursor() as cursor:
                # Drop leftovers from earlier runs; ignore the error
                # raised when a table does not exist yet.
                for table in ("uc_only_table", "pk_only_table", "pk_uc_table"):
                    try:
                        cursor.execute(f"drop table {table}")
                    except Exception:
                        pass

                # Unique constraint only, no primary key.
                cursor.execute(
                    """
                    CREATE TABLE uc_only_table (
                        pk int,
                        uc_1 int,
                        uc_2 int,
                        CONSTRAINT constraint_uc_only_table UNIQUE(uc_1,uc_2) )
                    """
                )
                # Primary key only (no trailing comma after the last column).
                cursor.execute(
                    """
                    CREATE TABLE pk_only_table (
                        pk int PRIMARY KEY,
                        uc_1 int,
                        uc_2 int
                    )
                    """
                )
                # Both a primary key and a unique constraint.
                cursor.execute(
                    """
                    CREATE TABLE pk_uc_table (
                        pk int PRIMARY KEY,
                        uc_1 int,
                        uc_2 int,
                        CONSTRAINT constraint_pk_uc_table UNIQUE(uc_1,uc_2) )
                    """
                )

    def test_only_primary_key(self):
        catalog = test_utils.discover_catalog(self.conn, {})
        primary_keys = {}
        for c in catalog.streams:
            # Table-level metadata lives under the empty breadcrumb ().
            primary_keys[c.table] = (
                singer.metadata.to_map(c.metadata).get((), {}).get("table-key-properties")
            )
        self.assertEqual(primary_keys["uc_only_table"], ["uc_1", "uc_2"])
        self.assertEqual(primary_keys["pk_only_table"], ["pk"])
        self.assertEqual(primary_keys["pk_uc_table"], ["pk"])

The test creates three tables for the scenarios you are addressing: uc_only_table (a unique constraint only), pk_only_table (a primary key only), and pk_uc_table (both a primary key and a unique constraint).
Looking at the last three lines of the test, the assertions check the primary key expected in each scenario: the unique constraint columns uc_1 and uc_2 for uc_only_table, and pk for both pk_only_table and pk_uc_table.
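For anyone unfamiliar with how the test reads the key out of the catalog, here is a minimal sketch of the lookup. The to_map function below is a plain-Python stand-in for singer.metadata.to_map, written out so the snippet runs without singer installed, and the sample metadata entries are assumptions made up for illustration, not output from this tap.

```python
def to_map(raw_metadata):
    """Stand-in for singer.metadata.to_map: key each entry by its breadcrumb tuple."""
    return {tuple(entry["breadcrumb"]): entry["metadata"] for entry in raw_metadata}


def table_key_properties(raw_metadata):
    """Return the table-level key columns, or None if they are not set."""
    # Table-level metadata is stored under the empty breadcrumb ().
    return to_map(raw_metadata).get((), {}).get("table-key-properties")


# Hypothetical metadata for pk_uc_table after the fix.
example_metadata = [
    {"breadcrumb": [], "metadata": {"table-key-properties": ["pk"]}},
    {"breadcrumb": ["properties", "uc_1"], "metadata": {"inclusion": "available"}},
]

print(table_key_properties(example_metadata))
```

The chained .get calls mean a stream with no table-level entry yields None instead of raising, so a missing key shows up as a failed assertion and not as an error in the loop.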
To run this on your branch, invoke pytest with -k TestPrimaryKey (the -k switch selects tests whose names match the provided expression, so TestPrimaryKey matches just TestPrimaryKeyUniqueKey, the name of the added test).
The latest pipeline run shows 26 tests, where previously there were 25 in total (unfortunately I can't see the test names). It is safe to assume the new test is passing and this change is ready. Thank you @s7clarke10!
Resolves an issue that occurs when a table has both a primary key and a unique key: the columns of both were being identified as the primary key for the target table. The fix prioritises the primary key and falls back to the unique key only if there is no primary key.
Before the fix, the code overstated the primary key by including the unique key columns as well.
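The prioritisation described above can be sketched roughly as follows. This is an illustration only, not the code changed in this PR: choose_key_properties is a hypothetical helper, and it assumes the constraint columns arrive as (column_name, constraint_type) pairs where "P" marks a primary key and "U" a unique constraint.

```python
def choose_key_properties(constraint_columns):
    """Pick key columns: primary key first, unique key only as a fallback.

    constraint_columns: iterable of (column_name, constraint_type) pairs,
    with "P" for primary key and "U" for unique constraint (assumed shape).
    """
    pairs = list(constraint_columns)
    pk_columns = [col for col, ctype in pairs if ctype == "P"]
    if pk_columns:
        # A primary key exists, so unique key columns are ignored.
        return pk_columns
    # No primary key: fall back to the unique key columns.
    return [col for col, ctype in pairs if ctype == "U"]


# The three scenarios from the suggested test.
uc_only = [("uc_1", "U"), ("uc_2", "U")]
pk_only = [("pk", "P")]
pk_and_uc = [("pk", "P"), ("uc_1", "U"), ("uc_2", "U")]

print(choose_key_properties(uc_only))
print(choose_key_properties(pk_only))
print(choose_key_properties(pk_and_uc))
```

Under the old behaviour the third case would have reported all three columns as the key, which is the overstatement this change removes.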
This change resolves #87