Resume from exact position and truncate the rest #536
5367d6c to 1688ce5
I'm still seeing the same issue as #471 even after this PR.
What happens is that, while switching from prefetch to replay, stream_transform_stream calls stream_transform_resume, which streams the existing 000000050000019E00000023.json (not yet truncated at that point), and that file can contain a partial transaction. Only when we receive our first streamWrite callback do we truncate 000000050000019E00000023.json and then continue streaming. I think we should trigger the partial-transaction truncation from somewhere in streamCheckResumePosition.
That sounds like what we need to do, yes. Is it possible that's the whole fix for the situation? I mean, if you agree, could we make a separate PR that does only that? Then the plan would be to get back to this PR, because I like the “cache invalidation” mechanism that we would obtain when re-connecting to the streaming protocol / replication slot. I think that's the best approach here.
1688ce5 to ac29cbd
@dimitri I attempted this and hit the same problem again (transform reads the latest file even before we truncate it). I'm going to add synchronisation between streamCheckResumePosition and the code at pgcopydb/src/bin/pgcopydb/ld_transform.c, lines 238 to 253 in 63c2d85.
@dimitri I think this approach is not going to work unless we make drastic changes to the existing flow. The current implementation assumes that WAL segment files are immutable, and several other assumptions follow from that. For example, when a new message arrives, we decide which WAL file it belongs to based on maxWrittenLSN and metadata->lsn; that logic breaks when we truncate the WAL file. Additionally, we need to add synchronisation between receive and transform so that the transform process does not read partial files.
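For readers following along, here is a minimal sketch of how an LSN maps to a segment file name, which is the decision that breaks once files stop being immutable. It is not pgcopydb's actual code: `wal_json_file_name` is a hypothetical helper, the 16 MB segment size is the PostgreSQL default and is assumed here, and the `.json` suffix simply mirrors the file names quoted in this PR.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: default PostgreSQL wal_segment_size of 16 MB. */
#define WAL_SEGMENT_SIZE UINT64_C(0x1000000)

/* Number of segments per 4 GB "xlog id" (256 with 16 MB segments). */
#define SEGMENTS_PER_XLOGID (UINT64_C(0x100000000) / WAL_SEGMENT_SIZE)

/*
 * Hypothetical helper (not part of pgcopydb): write into buf the name of
 * the JSON file for the WAL segment that contains the given LSN.
 * The name is timeline, xlog id and segment number, 8 hex digits each.
 */
static void
wal_json_file_name(char *buf, size_t size, uint32_t timeline, uint64_t lsn)
{
	uint64_t segno = lsn / WAL_SEGMENT_SIZE;

	snprintf(buf, size, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32 ".json",
			 timeline,
			 (uint32_t) (segno / SEGMENTS_PER_XLOGID),
			 (uint32_t) (segno % SEGMENTS_PER_XLOGID));
}
```

With timeline 1 and LSN 0/144BEE0 (that is, 0x144BEE0), the segment number is 1, so the helper produces 000000010000000000000001.json. If that file is later truncated and later files are set aside, a name computed this way no longer tells us where the message actually lives.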
ac29cbd to 8cc2a31
The new logic uses the metadata to find the WAL segment file that contains the given message and truncates that file up to the message position.
According to the implementation of streamRotateFile, the message could be anywhere between pg_walfile_name(metadata->lsn) ... latest.
Consider the following scenario where we have 4 WAL segment files:
000000010000000000000000.json
000000010000000000000001.json
000000010000000000000002.json
000000010000000000000003.json <-- latest
When we receive a message with LSN 0/144BEE0, the search starts from the corresponding WAL segment 000000010000000000000001.json and goes up to 000000010000000000000003.json (latest). Assuming we find the message in 000000010000000000000002.json, we need to truncate that file up to the message position, remove all the files after it (renaming them instead, for debugging), and make that file the latest file.
If we can't find the message in any of the files, it means that the message has not been streamed yet and we can keep writing to the latest file.
Signed-off-by: Arunprasad Rajkumar <[email protected]>
8cc2a31 to ebaac31
I wonder now if, in replay mode, the transform process could wait until it receives its first message in the PIPE to do its initialization, and then use the LSN of that first message as input when/where needed.
Also, the whole idea of the transform process needing to read pre-existing files at startup is because of transactions that span multiple files. We could also force-store these in their own file (…).
What do you think?
@dimitri This won't work when we resume after reaching the ENDPOS. The receive process would never write any message into the PIPE, as it has already reached the ENDPOS, and the transform process would wait forever. Currently, we handle the ENDPOS and exit early, before reading from the stream (PIPE). See pgcopydb/src/bin/pgcopydb/ld_transform.c, lines 65 to 81 in dfa87c2.
@dimitri I'm trying out an alternate approach using undo logs. It records the last commit position in an undo file and resumes from it in case pgcopydb exits midway.
This is no longer needed, as #544 has already been merged into main.
I'm wondering if your approach with undo logs is needed to handle the crash conditions?
The new logic uses the metadata to find the WAL segment file that contains the given message and truncates that file up to the message position.
According to the implementation of streamRotateFile, the message could be anywhere between WalJsonFile(metadata->lsn) ... latest.
Consider the following scenario where we have 4 WAL segment files:
000000010000000000000000.json
000000010000000000000001.json
000000010000000000000002.json
000000010000000000000003.json <-- latest
When we receive a message with LSN 0/144BEE0, the search starts from the corresponding WAL segment 000000010000000000000001.json and goes up to 000000010000000000000003.json (latest). Assuming we find the message in 000000010000000000000002.json, we need to truncate that file up to the message position, remove all the files after it (renaming them instead, for debugging), and make that file the latest file.
If we can't find the message in any of those files, it means that the message has not been streamed yet and we can keep writing to the latest file.
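The search-and-truncate steps described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `find_message_offset` and `truncate_at_message` are hypothetical names, each file is assumed to hold one JSON message per line, a message is located by a plain substring match, the file is cut at the start of the matching line (so the incoming message gets written again), and the `.bak` suffix for set-aside files is made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the byte offset of the first line in path
 * that contains needle, or -1 when no line matches or the file cannot
 * be opened.
 */
static long
find_message_offset(const char *path, const char *needle)
{
	FILE *f = fopen(path, "r");
	char *line = NULL;
	size_t cap = 0;
	ssize_t len;
	long offset = 0;
	long found = -1;

	if (f == NULL)
	{
		return -1;
	}

	while ((len = getline(&line, &cap, f)) != -1)
	{
		if (strstr(line, needle) != NULL)
		{
			found = offset;
			break;
		}
		offset += (long) len;
	}

	free(line);
	fclose(f);
	return found;
}

/*
 * files[0 .. count-1] are ordered from the segment matching the message
 * LSN up to the latest file. Returns the index of the file to keep
 * writing to, or -1 on error.
 */
static int
truncate_at_message(const char **files, int count, const char *needle)
{
	for (int i = 0; i < count; i++)
	{
		long offset = find_message_offset(files[i], needle);

		if (offset < 0)
		{
			continue;
		}

		/* Assumption: cut at the start of the matching message. */
		if (truncate(files[i], (off_t) offset) != 0)
		{
			return -1;
		}

		/* Set later files aside (renamed, not deleted, for debugging). */
		for (int j = i + 1; j < count; j++)
		{
			char backup[1024];

			snprintf(backup, sizeof(backup), "%s.bak", files[j]);

			if (rename(files[j], backup) != 0)
			{
				return -1;
			}
		}
		return i;
	}

	/* Not found: message not streamed yet, keep the latest file. */
	return count - 1;
}
```

In the scenario above, the caller would pass the three files from 000000010000000000000001.json to 000000010000000000000003.json; a match in the second one returns index 1, leaves that file truncated, and renames the third. Note that this sketch does nothing about the transform process reading a file while it is being truncated, which is the synchronisation problem raised earlier in the thread.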