VStreamer: improve representation of integers in json data types #12630
…ssue of integers being parsed as float64 by the source binlog parser. This results in larger integers being stored as floats on the target and sent with scientific notation in vstream events. Signed-off-by: Rohit Nayak <[email protected]>
data: []byte{9, 255, 255, 255, 255, 255, 255, 255, 127},
expected: `9223372036854775807`,
}, {
name: "uint16/1",
Any reason why we use uint16/1 here and int64 -1 above?
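For context on the test case under review, here is a minimal sketch of how the nine test bytes map to the expected value. It assumes the first byte is a type tag and the remaining eight bytes are a little-endian signed 64-bit integer; `decodeInt64` is a hypothetical helper for illustration, not the parser's real entry point.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeInt64 treats data as a one-byte type tag followed by an 8-byte
// little-endian signed integer. This layout is an assumption taken from
// the test case above, not the actual vstreamer code.
func decodeInt64(data []byte) (int64, error) {
	if len(data) != 9 {
		return 0, fmt.Errorf("want 9 bytes, got %d", len(data))
	}
	// Skip the tag byte and reinterpret the unsigned value as signed.
	return int64(binary.LittleEndian.Uint64(data[1:])), nil
}

func main() {
	v, err := decodeInt64([]byte{9, 255, 255, 255, 255, 255, 255, 255, 127})
	if err != nil {
		panic(err)
	}
	fmt.Println(v) // prints 9223372036854775807
}
```

Keeping the value as an int64 all the way to the output string is what lets the expected string match digit for digit.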
@@ -65,7 +65,6 @@ require (
github.com/spf13/cobra v1.6.1
github.com/spf13/pflag v1.0.5
github.com/spf13/viper v1.15.0
github.com/spyzhov/ajson v0.7.2
Given spyzhov/ajson#63 seems positive about merging this upstream, would it be preferable here to use a go mod replacement instead of a hard fork? We can point the replacement at the fork for now until it's part of upstream.
That also makes the actual upstream PR easier, since you then don't need to rename packages etc. in our fork as well.
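A go mod replacement of the kind suggested here might look like the fragment below. This is only a sketch: the fork path is the one named in the PR description, and the version on the right-hand side is a placeholder, not a real tag of the fork.

```
// go.mod (fragment)
require github.com/spyzhov/ajson v0.7.2

// Placeholder version: substitute a real tag or pseudo-version of the fork.
replace github.com/spyzhov/ajson => github.com/rohit-nayak-ps/ajson vX.Y.Z
```

With a replacement, imports in the codebase keep using github.com/spyzhov/ajson, so switching back to upstream later should only require dropping the replace line.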
Good idea, will make the change now.
@dbussink's comment looks reasonable.
…to the upstream module once changes are made there Signed-off-by: Rohit Nayak <[email protected]>
While this does somewhat improve on the current situation, JSON vreplication is still broken with this change. It only solves the range of numbers from 2^53 to 2^64 for integers, and it doesn't handle decimals at all (decimals are already broken today). Take the following table definition:
When inserting data using the MySQL JSON object parser, it shows that MySQL parses into doubles itself (according to the JSON standard):
But, it's also possible to build a JSON object using
What can be seen here is that MySQL maintains the types and stores decimals (and the very large integer is also stored as a decimal). But if we now ran a vreplication workflow on this, such as with online DDL, it breaks the actual data stored (
What can be seen here is that the decimal values and the integer outside of the int64 range are now also converted into doubles.
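To make the range argument concrete, here is a rough sketch that sorts numeric literals into the cases discussed in this comment. The helpers `numericKind` and `survivesFloat64` are hypothetical names for illustration and are not part of Vitess or ajson; the sketch only uses Go's standard `strconv` package.

```go
package main

import (
	"fmt"
	"strconv"
)

// numericKind reports the narrowest of three illustrative categories
// that can hold the numeric literal lit without loss.
func numericKind(lit string) string {
	if _, err := strconv.ParseInt(lit, 10, 64); err == nil {
		return "int64"
	}
	if _, err := strconv.ParseUint(lit, 10, 64); err == nil {
		return "uint64"
	}
	// Decimals and integers beyond 2^64-1 end up here.
	return "other"
}

// survivesFloat64 reports whether lit is unchanged after being parsed
// as a float64 and printed back in plain decimal form.
func survivesFloat64(lit string) bool {
	f, err := strconv.ParseFloat(lit, 64)
	if err != nil {
		return false
	}
	return strconv.FormatFloat(f, 'f', -1, 64) == lit
}

func main() {
	for _, lit := range []string{
		"9007199254740993",             // 2^53 + 1
		"18446744073709551615",         // 2^64 - 1
		"18446744073709551616",         // 2^64
		"123456789.123456789123456789", // long decimal
	} {
		fmt.Println(lit, numericKind(lit), survivesFloat64(lit))
	}
}
```

Integer node types cover the first two literals; the last two fall into the "other" bucket and still change when forced through a float64, which is the remaining gap described above.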
Description
The binlog parser in vstreamer currently uses the github.com/spyzhov/ajson module to decode the value from the binlog image to its JSON value. However, the library only supports a single type, Numeric (float64), as a catch-all for all numeric types, including signed and unsigned integers. As a consequence, the generated JSON represents integers as floats, and the string representation in a VEvent can contain decimals or values in scientific notation. Integers can therefore be stored as floats on the target, and larger ints are sent with scientific notation in VStream events. This results in VDiff failures, since the JSON strings stored are different. Parsing the VEvents sent using the VStream API can also result in errors if, for example, the JSON is being parsed in Go. See #8686.
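The effect can be reproduced outside Vitess. The sketch below uses Go's standard `encoding/json` rather than ajson, purely as an illustration of what happens when every number passes through float64; `asFloat` and `asNumber` are hypothetical helper names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// asFloat decodes a bare JSON number the default way (as float64) and
// formats it with shortest %g-style formatting.
func asFloat(doc string) (string, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return "", err
	}
	f, ok := v.(float64)
	if !ok {
		return "", fmt.Errorf("not a number: %T", v)
	}
	return strconv.FormatFloat(f, 'g', -1, 64), nil
}

// asNumber decodes with UseNumber, which keeps the literal text intact.
func asNumber(doc string) (string, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.UseNumber()
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	n, ok := v.(json.Number)
	if !ok {
		return "", fmt.Errorf("not a number: %T", v)
	}
	return n.String(), nil
}

func main() {
	const doc = "9007199254740993" // 2^53 + 1
	f, _ := asFloat(doc)
	n, _ := asNumber(doc)
	fmt.Println(f) // prints 9.007199254740992e+15
	fmt.Println(n) // prints 9007199254740993
}
```

The float64 path both changes the last digit and switches to scientific notation, which is the kind of string mismatch that a byte-for-byte comparison such as VDiff would flag.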
This PR uses a forked version of the library, https://github.com/rohit-nayak-ps/ajson, that adds Integer and UnsignedInteger data type JSON Nodes. Once we submit the related changes upstream to github.com/spyzhov/ajson and they get merged, we will switch back to using upstream again.
Minor refactoring is also done as part of this PR.
Related Issue(s)
Checklist