[Work In Progress] VReplicating JSON Columns: fix data loss with large NUMERIC and DECIMAL values #12731
Description
Motivation
Large fixed-point `NUMERIC` and `DECIMAL` values in JSON objects are not correctly vreplicated today. Such numbers are converted to a `float64` by the binlog JSON parser. So a JSON value like
`{"a": 12345678901234567890123456789012345678901234567890123456789012345678901234567890, "b": 987654321.012345678901234567890}`
on the source gets replicated as
`{"a": 1.2345678901234573e79, "b": 987654321.0123456}`
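To make the failure mode concrete, here is a small Go sketch that uses only the standard library. It is not the binlog parser or ajson, and the helper names `lossyRoundTrip` and `exactRoundTrip` are made up for illustration. It round-trips a numeric literal through `float64`, and then through `json.Number`, which keeps the original digits as text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// lossyRoundTrip decodes a JSON number into a float64 and formats it back.
// Any digits beyond float64 precision are lost along the way.
func lossyRoundTrip(literal string) (string, error) {
	var f float64
	if err := json.Unmarshal([]byte(literal), &f); err != nil {
		return "", err
	}
	return strconv.FormatFloat(f, 'g', -1, 64), nil
}

// exactRoundTrip decodes with UseNumber, so the value is kept as the
// literal text (json.Number) instead of being converted to float64.
func exactRoundTrip(literal string) (string, error) {
	dec := json.NewDecoder(strings.NewReader(literal))
	dec.UseNumber()
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	n, ok := v.(json.Number)
	if !ok {
		return "", fmt.Errorf("not a JSON number: %s", literal)
	}
	return n.String(), nil
}

func main() {
	literals := []string{
		"12345678901234567890123456789012345678901234567890123456789012345678901234567890",
		"987654321.012345678901234567890",
	}
	for _, lit := range literals {
		lossy, err := lossyRoundTrip(lit)
		if err != nil {
			panic(err)
		}
		exact, err := exactRoundTrip(lit)
		if err != nil {
			panic(err)
		}
		fmt.Printf("source:  %s\nfloat64: %s\nexact:   %s\n\n", lit, lossy, exact)
	}
}
```

The exact formatting of the lossy value depends on the formatter used, so it may not match the replicated output quoted above character for character. The point is only that the `float64` path cannot reproduce the source digits.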
Approach
VStreamer
The VStreamer needs to deserialize the binlog values exactly. This is done by adding support for fixed type integers and decimals in `ajson`, the library we use today to define a JSON value in an AST and stringify it while creating the corresponding VEvent.
VPlayer
The VPlayer needs to generate MySQL queries so that MySQL retains precision. MySQL will only keep the true value of fixed-point values if they are specified in a `json_object()` while inserting or updating. If they are just specified as a string representation, the MySQL parser will convert them to floats and lose precision. The same holds for JSON arrays. The snippets below illustrate the issue:
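As a rough sketch of the two query shapes being contrasted, the Go program below renders the same JSON document once as a quoted string literal and once as nested `json_object()` / `json_array()` calls with bare numeric literals. Everything here is hypothetical: the table `t1`, the column `doc`, and the helpers `quoteSQL`, `toSQLExpr` and `jsonToSQLExpr` are illustrative names, not VPlayer code. The quoting is deliberately simplistic, object keys are sorted only to make the output deterministic, and the mapping of JSON `true`/`false`/`null` to SQL `true`/`false`/`NULL` is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// quoteSQL wraps s in single quotes with naive escaping (illustration only).
func quoteSQL(s string) string {
	s = strings.ReplaceAll(s, `\`, `\\`)
	s = strings.ReplaceAll(s, `'`, `''`)
	return "'" + s + "'"
}

// toSQLExpr renders a decoded JSON value as a SQL expression. Numbers are
// emitted as bare literals so their digits are passed through unchanged.
func toSQLExpr(v interface{}) (string, error) {
	switch t := v.(type) {
	case map[string]interface{}:
		keys := make([]string, 0, len(t))
		for k := range t {
			keys = append(keys, k)
		}
		sort.Strings(keys) // deterministic output; Go maps are unordered
		args := make([]string, 0, 2*len(keys))
		for _, k := range keys {
			val, err := toSQLExpr(t[k])
			if err != nil {
				return "", err
			}
			args = append(args, quoteSQL(k), val)
		}
		return "json_object(" + strings.Join(args, ", ") + ")", nil
	case []interface{}:
		args := make([]string, 0, len(t))
		for _, elem := range t {
			val, err := toSQLExpr(elem)
			if err != nil {
				return "", err
			}
			args = append(args, val)
		}
		return "json_array(" + strings.Join(args, ", ") + ")", nil
	case json.Number:
		return t.String(), nil
	case string:
		return quoteSQL(t), nil
	case bool:
		if t {
			return "true", nil
		}
		return "false", nil
	case nil:
		return "NULL", nil
	default:
		return "", fmt.Errorf("unsupported JSON value of type %T", v)
	}
}

// jsonToSQLExpr decodes doc with UseNumber (no float64 conversion) and
// renders it as nested json_object()/json_array() calls.
func jsonToSQLExpr(doc string) (string, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.UseNumber()
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	return toSQLExpr(v)
}

func main() {
	doc := `{"b": 987654321.012345678901234567890, "c": [1, "x"]}`
	expr, err := jsonToSQLExpr(doc)
	if err != nil {
		panic(err)
	}
	// String form: per the description, MySQL parses the number as a float.
	fmt.Printf("insert into t1(doc) values (%s);\n", quoteSQL(doc))
	// Function form: the number is passed as a bare SQL literal.
	fmt.Printf("insert into t1(doc) values (%s);\n", expr)
}
```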
So we also need to generate `json_object()`s and `json_array()`s while inserting JSON (dictionary) objects and arrays so as to replicate the exact value.
Status
VStreamer now generates data correctly for DECIMALs with a local change to ajson which has not yet been pushed to our ajson fork. VStreamer: improve representation of integers in json data types #12630 already fixed issues with large integers like `930701976723823`, which would get vreplicated as `9.30701976723823e+14`, and `1234567890`, which would get vreplicated as `1234567890.0`.
TODO:
Related Issue(s)
#8686
Checklist
Deployment Notes