More carry calculation fixing #835

reunanen · 2024-05-12T07:26:34Z

Add test cases provided by @azrafe7, and update carry calculation to a method that seems to actually work.

I can't say that I fully understand why, but I also compared this against the naive method (using a native uint128 type): this approach seems to match it, whereas the previous one does not always give exactly the same results.
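For readers who want to reproduce that comparison, a reference along these lines is one option. This is only a sketch: it assumes a compiler that provides the `unsigned __int128` extension (GCC and Clang do), and the names `RefProduct` and `reference_mul` are made up for illustration, not taken from the PR.

```cpp
#include <cstdint>

// Hypothetical container for the two halves of a 128-bit product.
struct RefProduct {
    uint64_t hi;  // upper 64 bits (the "carry")
    uint64_t lo;  // lower 64 bits
};

// Naive reference: let the compiler do the 64x64 -> 128 multiplication.
// Assumes the GCC/Clang `unsigned __int128` extension is available.
static RefProduct reference_mul(uint64_t a, uint64_t b) {
    const unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    return { static_cast<uint64_t>(p >> 64), static_cast<uint64_t>(p) };
}
```

Any hand-written carry calculation can then be checked by comparing its upper 64 bits against `reference_mul(a, b).hi` over the test inputs.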

tomwiel · 2024-05-12T10:02:10Z

A small observation: This change leads to five mult-operations for the product,
while other implementations (like clipper1 or Emulate64x64to128) just need four.

reunanen · 2024-05-12T12:37:24Z

@tomwiel Right – and that's because carry is calculated separately, which is leftover from a previous version that bailed out after a straightforward (carry-ignoring) multiplication, in case the result was not equal. Good catch, let me try to improve on this front...
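The structure described here (a plain wrapping multiplication first, then a separately computed carry) could look roughly like the sketch below. It is a guess at the shape, not the PR's code: `high_bits` and `unsigned_products_equal` are hypothetical names, and only unsigned inputs are handled.

```cpp
#include <cstdint>

// Hypothetical helper: upper 64 bits of the full 128-bit product a * b.
// Uses four 32x32 -> 64 multiplications.
static uint64_t high_bits(uint64_t a, uint64_t b) {
    const uint64_t aLo = a & 0xFFFFFFFF, aHi = a >> 32;
    const uint64_t bLo = b & 0xFFFFFFFF, bHi = b >> 32;
    const uint64_t x1 = aLo * bLo;
    const uint64_t x2 = aHi * bLo + (x1 >> 32);         // cannot overflow 64 bits
    const uint64_t x3 = aLo * bHi + (x2 & 0xFFFFFFFF);  // cannot overflow 64 bits
    return aHi * bHi + (x2 >> 32) + (x3 >> 32);
}

// Hypothetical: true if a * b == c * d as full 128-bit unsigned products.
static bool unsigned_products_equal(uint64_t a, uint64_t b,
                                    uint64_t c, uint64_t d) {
    // Carry-ignoring check first: one extra multiplication per product.
    if (a * b != c * d) return false;
    // Low halves match, so compare the separately computed carries.
    return high_bits(a, b) == high_bits(c, d);
}
```

Counted this way, each product costs one wrapping multiplication plus four partial products, which is where a total of five would come from.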

AngusJohnson · 2024-05-12T13:03:28Z

> I can't say that I fully understand why exactly

I think I've finally gotten my head around this ...

  a * b ==>
  split a and b into upper and lower 32bits
  (aHi + aLo) * (bHi + bLo) ==>
  (aHi * bHi) + (aHi * bLo) + (aLo * bHi) + (aLo * bLo) [i.e. 4 multiplications]
  1. aHi * bHi: XXXXXXXX00000000 * XXXXXXXX00000000
  2. aHi * bLo: XXXXXXXX00000000 * 00000000XXXXXXXX
  3. aLo * bHi: 00000000XXXXXXXX * XXXXXXXX00000000
  4. aLo * bLo: 00000000XXXXXXXX * 00000000XXXXXXXX
     { overflow bits }
  1. XXXXXXXX XXXXXXXX 00000000 00000000 +
  2. 00000000 XXXXXXXX XXXXXXXX 00000000 +
  3. 00000000 XXXXXXXX XXXXXXXX 00000000 +
  4. 00000000 00000000 XXXXXXXX XXXXXXXX
  given bit shifting to keep each multiplication within 64 bits
  then overflow bits equals ...
  a. all of (1) PLUS
  b. upper 32bits of both (2) and (3) PLUS
  c. overflow of addition of (4) and lower 32bits of both (2) and (3)
  note: overflow of addition in c. will be between 0 and 2.

What I hadn't appreciated until now is the addition c. above can potentially overflow by 2.
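The breakdown into a., b. and c. can be turned into code fairly directly. The following is a minimal sketch rather than the merged implementation: `Product128` and `mul_64x64` are illustrative names, the halves are shifted down before multiplying so each partial product fits in 64 bits, and the overflow in c. is counted by detecting unsigned wraparound.

```cpp
#include <cstdint>

// Hypothetical result type: the two halves of the 128-bit product.
struct Product128 {
    uint64_t hi;  // overflow bits
    uint64_t lo;  // lower 64 bits
};

static Product128 mul_64x64(uint64_t a, uint64_t b) {
    const uint64_t aLo = a & 0xFFFFFFFF, aHi = a >> 32;
    const uint64_t bLo = b & 0xFFFFFFFF, bHi = b >> 32;

    const uint64_t p1 = aHi * bHi;  // (1) weight 2^64
    const uint64_t p2 = aHi * bLo;  // (2) weight 2^32
    const uint64_t p3 = aLo * bHi;  // (3) weight 2^32
    const uint64_t p4 = aLo * bLo;  // (4) weight 2^0

    // c. add (4) and the lower 32 bits of (2) and (3), counting wraparounds.
    uint64_t lo = p4;
    uint64_t carry = 0;             // ends up between 0 and 2
    const uint64_t t2 = p2 << 32;   // lower 32 bits of (2), moved into place
    lo += t2;
    carry += (lo < t2);             // unsigned sum wrapped if it got smaller
    const uint64_t t3 = p3 << 32;   // lower 32 bits of (3), moved into place
    lo += t3;
    carry += (lo < t3);

    // a. all of (1), b. upper 32 bits of (2) and (3), plus the carry from c.
    const uint64_t hi = p1 + (p2 >> 32) + (p3 >> 32) + carry;
    return { hi, lo };
}
```

With `a = b = 0x1FFFFFFFF`, both additions in c. wrap, so the carry is 2 and the result is `hi = 3`, `lo = 0xFFFFFFFC00000001`.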


reunanen · 2024-05-12T13:35:18Z

> This change leads to five mult-operations for the product

This should be fixed now.

azrafe7 · 2024-05-12T20:51:26Z

> > I can't say that I fully understand why exactly
>
> I think I've finally gotten my head around this ...
>
>   // aLo = a & 0xFFFFFFFF;
>   // aHi = a & 0xFFFFFFFF00000000;
>   // bLo = b & 0xFFFFFFFF;
>   // bHi = b & 0xFFFFFFFF00000000;
>
>   // a * b == (aHi + aLo) * (bHi + bLo)
>   // a * b == (aHi * bHi) + (aHi * bLo) + (aLo * bHi) + (aLo * bLo)
>   // (aHi * bHi) => up to 128bits where bottom 64bits must be 0
>   // (aHi * bLo) and (bHi * aLo)  => up to 96bits where bottom 32bits must be 0
>   // (aLo * bLo) => up to 64bits
>
>   // 64bit overflow carry of a * b consists of
>   // 1. all of (aHi * bHi) PLUS
>   // 2. the upper 32bits of both (aHi * bLo) and (bHi * aLo) PLUS
>   // 3. 0 - 2: the overflow of adding the lower 64bits of (aHi * bLo) and (bHi * aLo) to (aLo * bLo)
>
> What I hadn't appreciated until now is that adding the lower 64bits of (aHi * bLo) and (bHi * aLo) to (aLo * bLo) can potentially overflow by 2.

Yes, just realized that much later than you. 😅

And found out that that extra_carry can be computed as:

uint64_t extra_carry = (((aHiShr * bLo) & 0xFFFFFFFF) +
                        ((bHiShr * aLo) & 0xFFFFFFFF) +
                        ((aLo * bLo) >> 32)) >> 32;
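To make the expression above easy to try out, here is a small self-contained sketch built around it. The function names `extra_carry` and `high_64` are illustrative, and it assumes that `aHiShr` and `bHiShr` mean the upper halves shifted down by 32 bits, which the snippet itself does not spell out.

```cpp
#include <cstdint>

// Carry out of the lower 64 bits, using the expression quoted above.
// Assumption: aHiShr / bHiShr are the upper 32 bits shifted down.
static uint64_t extra_carry(uint64_t a, uint64_t b) {
    const uint64_t aLo = a & 0xFFFFFFFF, aHiShr = a >> 32;
    const uint64_t bLo = b & 0xFFFFFFFF, bHiShr = b >> 32;
    // Each of the three terms is below 2^32, so the sum is below 3 * 2^32
    // and the result after the final shift is 0, 1 or 2.
    return (((aHiShr * bLo) & 0xFFFFFFFF) +
            ((bHiShr * aLo) & 0xFFFFFFFF) +
            ((aLo * bLo) >> 32)) >> 32;
}

// Hypothetical helper: upper 64 bits of a * b, combining the extra carry
// with the upper parts of the other partial products.
static uint64_t high_64(uint64_t a, uint64_t b) {
    const uint64_t aLo = a & 0xFFFFFFFF, aHiShr = a >> 32;
    const uint64_t bLo = b & 0xFFFFFFFF, bHiShr = b >> 32;
    return aHiShr * bHiShr +
           ((aHiShr * bLo) >> 32) +
           ((bHiShr * aLo) >> 32) +
           extra_carry(a, b);
}
```

A case that actually reaches the maximum is `a = b = 0x1FFFFFFFF`: the three terms are `0xFFFFFFFF`, `0xFFFFFFFF` and `0xFFFFFFFE`, so `extra_carry` returns 2 and `high_64` returns 3. As written, `high_64` repeats multiplications that `extra_carry` already performed; a real implementation would presumably compute each partial product once.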

AngusJohnson · 2024-05-12T21:29:23Z

> And found out that that extra_carry can be computed as:

Neat. Thanks.
And I've just rewritten my overflow explanation above so I hope it's less confusing about when aHi etc are shifted.

reunanen added 5 commits May 12, 2024 09:51

Add carry calculation tests

de261cf

Fix carry calculation

6030698

Minor simplification

4245650

Test both ways, now that we are here

be517cd

Add helpful remark

69f5d55

Avoid a single multiplication instruction per product (=one for a*b, and another for `c*d`)

926203e

AngusJohnson merged commit 82cd887 into AngusJohnson:main May 12, 2024
7 checks passed

reunanen deleted the more-carry-calculation-fixing branch May 12, 2024 13:44

AngusJohnson added a commit that referenced this pull request May 13, 2024

Updated C# and Delphi code with bugfixed IsCollinear function (#835)

07cabad
