There are a few compilation flags that should be documented. I could see the effect of them but didn't understand the reason they were introduced / tradeoffs of enabling them.
rc-alloc: Changes the dynamic allocations from a Box to an Rc
custom-vecdeque: A custom VecDeque implementation
simd: Faster parsing, but only available on x86
One of our main goals with Kawa is to be memory efficient and to limit the number of dynamic allocations and copies. On that subject, Store::Alloc is necessary for Kawa to be useful, but I'm not happy with the current implementation. rc-alloc was just a test of an alternative, but it proved to have the same shortcomings.
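For readers unfamiliar with the Box/Rc distinction that rc-alloc toggles, here is a minimal sketch of the general difference in plain Rust. It does not use Kawa's actual Store type; the helper names (same_allocation, box_clone_shares, rc_clone_shares) are hypothetical and exist only for illustration.

```rust
use std::rc::Rc;

/// Hypothetical helper: true when both slices start at the same address,
/// i.e. they are views of one and the same heap allocation.
fn same_allocation(a: &[u8], b: &[u8]) -> bool {
    std::ptr::eq(a.as_ptr(), b.as_ptr())
}

/// Cloning a `Box<[u8]>` allocates a new buffer and copies every byte.
pub fn box_clone_shares() -> bool {
    let boxed: Box<[u8]> = b"Host: example.com".to_vec().into_boxed_slice();
    let copy = boxed.clone(); // new allocation + byte copy
    same_allocation(&boxed, &copy)
}

/// Cloning an `Rc<[u8]>` only increments a reference count;
/// both handles point at the same bytes.
pub fn rc_clone_shares() -> bool {
    let shared: Rc<[u8]> = Rc::from(&b"Host: example.com"[..]);
    let handle = Rc::clone(&shared); // no byte copy
    same_allocation(&shared, &handle)
}
```

Whether that cheaper clone helps in practice depends on how often the buffers are actually duplicated, which this sketch does not measure.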
Kawa is just the generic HTTP representation, but another goal for this crate is to provide default HTTP1/HTTP2 parsers and converters that are "fast". Our benchmarks indicated that std::collections::VecDeque::push_back was slowing down the parsers. We thought this was because the compiler wasn't able to do "in place initialization", so we reimplemented a VecDeque that would be easier for the compiler to optimize. We can't measure a significant difference, though; the original diagnosis might have been biased by the tracing we used. custom-vecdeque will probably be removed from the crate completely in a future release.
Finally, the simd flag uses x86-specific SSE4.2 instructions to greatly speed up our parsers. This optimization brings their speed on par with the widely used httparse crate:
kawa without simd: ~1.5Mb/s
kawa with simd: ~3Mb/s
kawa with simd without cookie parsing (not an available flag yet): ~4.5Mb/s
httparse: ~3.5Mb/s
picohttpparser (fastest parser I could find): ~7Mb/s
These numbers were obtained on my machine with one specific request; they might not be representative of the general case.
Ideally, the parsers should not rely on a feature flag but on the target architecture directly.
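As a rough sketch of what selecting on the target architecture could look like, the following uses `cfg(target_arch)` at compile time plus `is_x86_feature_detected!` at run time, with a scalar fallback everywhere else. This is not Kawa's code: the function names (find_byte, find_byte_scalar, find_byte_sse42) are hypothetical, and the byte search is only a stand-in for a real parser hot loop.

```rust
/// Portable fallback: index of the first byte equal to `needle`.
fn find_byte_scalar(haystack: &[u8], needle: u8) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

/// x86-only fast path, compiled with SSE4.2 enabled for this function alone.
/// The intrinsics used here are SSE2-level, which SSE4.2 implies; a real
/// parser would likely use richer instructions.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.2")]
unsafe fn find_byte_sse42(haystack: &[u8], needle: u8) -> Option<usize> {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let pattern = _mm_set1_epi8(needle as i8);
    let mut offset = 0;
    // Compare 16 bytes at a time.
    while offset + 16 <= haystack.len() {
        let chunk = _mm_loadu_si128(haystack.as_ptr().add(offset) as *const __m128i);
        let mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, pattern));
        if mask != 0 {
            // The lowest set bit marks the first matching byte in the chunk.
            return Some(offset + mask.trailing_zeros() as usize);
        }
        offset += 16;
    }
    // Fewer than 16 bytes left: finish with the scalar loop.
    find_byte_scalar(&haystack[offset..], needle).map(|i| i + offset)
}

/// Public entry point: no Cargo feature needed, the choice is made from the
/// target architecture and the CPU the program is actually running on.
pub fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse4.2") {
            // Safe to call: SSE4.2 support was just verified at run time.
            return unsafe { find_byte_sse42(haystack, needle) };
        }
    }
    find_byte_scalar(haystack, needle)
}
```

With this layout, callers use find_byte on every platform, and non-x86 builds simply never compile the SIMD path. The run-time check adds a small cost per call, so a real implementation might cache the detection result or dispatch once per request instead.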