Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Branch-less alphanumeric underscore's replacement #1294

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

franz1981
Copy link
Contributor

@franz1981 franz1981 commented Jan 21, 2025

This is following the same idea of quarkusio/quarkus#45546 re using lookup tables.
If the src Strings are rarely latin only beyond US_ASCII i.e. [128, 255] this can be further simplified by having a 128-sized lookup table.

I didn't yet looked at its performance - and, as suggested at mkouba/qute-benchmarks#1 we should have a proper micro-benchmark which account to stress the branch-predictor too (or not - should be configurable) - unless the non-alphanumeric case is supposed to be very predictable in reality as well

@franz1981 franz1981 force-pushed the branchless_replacement branch from 29b76da to 4d44a72 Compare January 21, 2025 07:06
@franz1981
Copy link
Contributor Author

franz1981 commented Jan 21, 2025

I probably have a further suggestion here

@franz1981 franz1981 force-pushed the branchless_replacement branch from 0c78604 to 96c14bd Compare January 21, 2025 10:36
@franz1981
Copy link
Contributor Author

franz1981 commented Jan 21, 2025

I've created this benchmark at which gives these numbers franz1981/java-puzzles@8fb833c

with highly predictable sameples:

Benchmark                                                (finalDoubleQuoteProbability)  (nonAlphanumericProbability)  (samples)  (size)  Mode  Cnt   Score   Error  Units
ReplaceWithUnderscores.replaceWithUnderscoresSwitch                                 10                            10        100      33  avgt   10  94.525 ± 3.802  ns/op
ReplaceWithUnderscores.replaceWithUnderscoresTableArray                             10                            10        100      33  avgt   10  20.951 ± 0.339  ns/op
ReplaceWithUnderscores.replaceWithUnderscoresTableSb                                10                            10        100      33  avgt   10  77.077 ± 1.501  ns/op

and with not that much predictable (10K samples):

Benchmark                                                     (finalDoubleQuoteProbability)  (nonAlphanumericProbability)  (samples)  (size)  Mode  Cnt    Score   Error  Units
ReplaceWithUnderscores.replaceWithUnderscoresSwitch                                      10                            10      10000      33  avgt   10  231.371 ± 9.276  ns/op
ReplaceWithUnderscores.replaceWithUnderscoresTableArray                                  10                            10      10000      33  avgt   10   32.910 ± 2.189  ns/op
ReplaceWithUnderscores.replaceWithUnderscoresTableSb                                     10                            10      10000      33  avgt   10  102.949 ± 4.079  ns/op

In both case the Table version (same of this PR) is faster with the TableSb (using the builder) to be slower .

I didn't accounted yet for the code duplication, but since native image is not wow to handle StringBuilders and Strings - moving to a "ligher" abstractions should pay off (@galderz can verify it :P) - for JIT I would like to test when only C1 is running or with "cold" code as well.

@franz1981 franz1981 force-pushed the branchless_replacement branch from 7fc70d3 to 9a0a872 Compare January 21, 2025 20:01
@franz1981
Copy link
Contributor Author

Sadly if I got right what C1MaxInlineSize does - C1 doesn't have inlining superpowers, hence reusing code could be relatively costly if the callee is not trivially inlinable, which seems this case

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant