-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PrimitiveGenericRecord interface #45
base: master
Are you sure you want to change the base?
Conversation
* @return true if the field has a value defined, or false if it is null | ||
* @throws AvroRuntimeException | ||
*/ | ||
boolean isDefined(String key) throws AvroRuntimeException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - "isDefined" sounds (to me?) more like if the schema contains a field. maybe hasValue()? of nonNull() ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wasn't sure how to name this one. Also thought of isSet(name)
... what's your favorite wording?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if its set to null? :-)
@@ -0,0 +1,103 @@ | |||
package com.linkedin.avro.primitive; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - if we're gonna go extending avro ... "com.linkedin.avro.extensions" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, or maybe com.linkedin.avro.api
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
works for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think a benchmark is in order before we embark on this?
Nice! I think it definitely can improve deser performance further like PrimitiveFloatList.
|
N.B.: This is still a work in progress. There are some build issues that I need to debug. And the code overall is untested.
24a71b9
to
3396a05
Compare
@volauvent @FelixGV |
@gaojieliu since we only have primitiveFloatList now, we can vary float array size and compare the performance diff between Primitive float and default box one. The above metric is for 800 FloatArray, are you saying reduce the array size to a small number to understand the real performance improvement brought by primitive float type, such as to 5? |
The PrimitiveFloatList is an API which users should expect to rely on, so it is wrong to degrade to the more constrained List<Float> API while Fast-Avro is still cold. This commit introduces several changes to make the extended API reliably present whenever using Fast-Avro, regardless of being cold or warm. - Changed PrimitiveFloatList to an interface, in a new package called: com.linkedin.avro.api; since the package name migration makes this an incompatible change, it would be desirable for the next release to not increment only the patch version. Having a proper package name for API extension should make things cleaner in the future as we add other optimized APIs (e.g. PR linkedin#45). - Renamed the old class to ByteBufferBackedPrimitiveFloatList, and made it implement the new interface. - Added new several new classes to ensure that the PrimitiveFloatList is always returned even when Fast-Avro falls back to vanilla Avro: - ColdPrimitiveFloatList which is a naive implementation that simply implements the new API by delegating to the regular Avro functions. This does not provide any GC benefits, but at least maintains the API. - ColdGenericDatumReader and ColdSpecificDatumReader which extend the GenericDatumReader and SpecificDatumReader classes, respectively, from vanilla Avro. - ColdDatumReaderMixIn which provides a utility function to minimize repeated code between the two DatumReader functions. - Significantly refactored the FastGenericDeserializerGeneratorTest so that it tests three permutations: vanilla, cold fast and warm fast. As part of doing this, several test short-comings were discovered and fixed. In particular, the decodeRecordSlow function had some flipped parameters which led to test failures on vanilla Avro, and those test failures were hidden by the fact that some tests ignored he provided permutation param and systematically tested only Fast-Avro.
The PrimitiveFloatList is an API which users should expect to rely on, so it is wrong to degrade to the more constrained List<Float> API while Fast-Avro is still cold. This commit introduces several changes to make the extended API reliably present whenever using Fast-Avro, regardless of being cold or warm. - Changed PrimitiveFloatList to an interface, in a new package called: com.linkedin.avro.api; since the package name migration makes this an incompatible change, it would be desirable for the next release to not increment only the patch version. Having a proper package name for API extension should make things cleaner in the future as we add other optimized APIs (e.g. PR linkedin#45). - Renamed the old class to ByteBufferBackedPrimitiveFloatList, and made it implement the new interface. - Added new several new classes to ensure that the PrimitiveFloatList is always returned even when Fast-Avro falls back to vanilla Avro: - ColdPrimitiveFloatList which is a naive implementation that simply implements the new API by delegating to the regular Avro functions. This does not provide any GC benefits, but at least maintains the API. - ColdGenericDatumReader and ColdSpecificDatumReader which extend the GenericDatumReader and SpecificDatumReader classes, respectively, from vanilla Avro. - ColdDatumReaderMixIn which provides a utility function to minimize repeated code between the two DatumReader functions. - Significantly refactored the FastGenericDeserializerGeneratorTest so that it tests three permutations: vanilla, cold fast and warm fast. As part of doing this, several test short-comings were discovered and fixed. In particular, the decodeRecordSlow function had some flipped parameters which led to test failures on vanilla Avro, and those test failures were hidden by the fact that some tests ignored he provided permutation param and systematically tested only Fast-Avro.
The PrimitiveFloatList is an API which users should expect to rely on, so it is wrong to degrade to the more constrained List<Float> API while Fast-Avro is still cold. This commit introduces several changes to make the extended API reliably present whenever using Fast-Avro, regardless of being cold or warm. - Changed PrimitiveFloatList to an interface, in a new package called: com.linkedin.avro.api; since the package name migration makes this an incompatible change, it would be desirable for the next release to not increment only the patch version. Having a proper package name for API extension should make things cleaner in the future as we add other optimized APIs (e.g. PR linkedin#45). - Renamed the old class to ByteBufferBackedPrimitiveFloatList, and made it implement the new interface. - Added new several new classes to ensure that the PrimitiveFloatList is always returned even when Fast-Avro falls back to vanilla Avro: - ColdPrimitiveFloatList which is a naive implementation that simply implements the new API by delegating to the regular Avro functions. This does not provide any GC benefits, but at least maintains the API. - ColdGenericDatumReader and ColdSpecificDatumReader which extend the GenericDatumReader and SpecificDatumReader classes, respectively, from vanilla Avro. - ColdDatumReaderMixIn which provides a utility function to minimize repeated code between the two DatumReader functions. - Significantly refactored the FastGenericDeserializerGeneratorTest so that it tests three permutations: vanilla, cold fast and warm fast. As part of doing this, several test short-comings were discovered and fixed. In particular, the decodeRecordSlow function had some flipped parameters which led to test failures on vanilla Avro, and those test failures were hidden by the fact that some tests ignored the provided permutation param and systematically tested only Fast-Avro.
The PrimitiveFloatList is an API which users should expect to rely on, so it is wrong to degrade to the more constrained List<Float> API while Fast-Avro is still cold. This commit introduces several changes to make the extended API reliably present whenever using Fast-Avro, regardless of being cold or warm. - Changed PrimitiveFloatList to an interface, in a new package called: com.linkedin.avro.api; since the package name migration makes this an incompatible change, it would be desirable for the next release to not increment only the patch version. Having a proper package name for API extension should make things cleaner in the future as we add other optimized APIs (e.g. PR linkedin#45). - Renamed the old class to ByteBufferBackedPrimitiveFloatList, and made it implement the new interface. - Added new several new classes to ensure that the PrimitiveFloatList is always returned even when Fast-Avro falls back to vanilla Avro: - ColdPrimitiveFloatList which is a naive implementation that simply implements the new API by delegating to the regular Avro functions. This does not provide any GC benefits, but at least maintains the API. - ColdGenericDatumReader and ColdSpecificDatumReader which extend the GenericDatumReader and SpecificDatumReader classes, respectively, from vanilla Avro. - ColdDatumReaderMixIn which provides a utility function to minimize repeated code between the two DatumReader functions. - Significantly refactored the FastGenericDeserializerGeneratorTest so that it tests three permutations: vanilla, cold fast and warm fast. As part of doing this, several test short-comings were discovered and fixed. In particular, the decodeRecordSlow function had some flipped parameters which led to test failures on vanilla Avro, and those test failures were hidden by the fact that some tests ignored the provided permutation param and systematically tested only Fast-Avro.
The PrimitiveFloatList is an API which users should expect to rely on, so it is wrong to degrade to the more constrained List<Float> API while Fast-Avro is still cold. This commit introduces several changes to make the extended API reliably present whenever using Fast-Avro, regardless of being cold or warm. - Changed PrimitiveFloatList to an interface, in a new package called: com.linkedin.avro.api; since the package name migration makes this an incompatible change, it would be desirable for the next release to not increment only the patch version. Having a proper package name for API extension should make things cleaner in the future as we add other optimized APIs (e.g. PR #45). - Renamed the old class to ByteBufferBackedPrimitiveFloatList, and made it implement the new interface. - Added new several new classes to ensure that the PrimitiveFloatList is always returned even when Fast-Avro falls back to vanilla Avro: - ColdPrimitiveFloatList which is a naive implementation that simply implements the new API by delegating to the regular Avro functions. This does not provide any GC benefits, but at least maintains the API. - ColdGenericDatumReader and ColdSpecificDatumReader which extend the GenericDatumReader and SpecificDatumReader classes, respectively, from vanilla Avro. - ColdDatumReaderMixIn which provides a utility function to minimize repeated code between the two DatumReader functions. - Significantly refactored the FastGenericDeserializerGeneratorTest so that it tests three permutations: vanilla, cold fast and warm fast. As part of doing this, several test short-comings were discovered and fixed. In particular, the decodeRecordSlow function had some flipped parameters which led to test failures on vanilla Avro, and those test failures were hidden by the fact that some tests ignored the provided permutation param and systematically tested only Fast-Avro.
No description provided.