Precompilation throws: `InexactError: check_top_bit(UInt64, -3141633)` #177

nathanaelbosch · 2023-05-10T09:31:27Z

I wanted to use a package that relies on Octavian, but the precompilation step fails as follows:

julia> using Octavian
[ Info: Precompiling Octavian [6fd5a793-0b7e-452c-907f-f8bfe9c57db4]
ERROR: LoadError: InexactError: check_top_bit(UInt64, -3141633)
Stacktrace:
  [1] throw_inexacterror(f::Symbol, #unused#::Type{UInt64}, val::Int64)
    @ Core ./boot.jl:634
  [2] check_top_bit
    @ ./boot.jl:648 [inlined]
  [3] toUInt64
    @ ./boot.jl:759 [inlined]
  [4] UInt64
    @ ./boot.jl:789 [inlined]
  [5] convert
    @ ./number.jl:7 [inlined]
  [6] cconvert
    @ ./essentials.jl:492 [inlined]
  [7] malloc
    @ ./libc.jl:355 [inlined]
  [8] valloc
    @ ~/.julia/packages/VectorizationBase/0dXyA/src/alignment.jl:36 [inlined]
  [9] init_bcache
    @ ~/.julia/packages/Octavian/XhL0C/src/init.jl:19 [inlined]
 [10] __init__()
    @ Octavian ~/.julia/packages/Octavian/XhL0C/src/init.jl:3
 [11] macro expansion
    @ ~/.julia/packages/Octavian/XhL0C/src/Octavian.jl:80 [inlined]
 [12] macro expansion
    @ ~/.julia/packages/SnoopPrecompile/1XXT1/src/SnoopPrecompile.jl:119 [inlined]
 [13] top-level scope
    @ ~/.julia/packages/Octavian/XhL0C/src/Octavian.jl:77
 [14] include
    @ ./Base.jl:457 [inlined]
 [15] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::Nothing)
    @ Base ./loading.jl:2010
 [16] top-level scope
    @ stdin:2
in expression starting at /home/me/.julia/packages/Octavian/XhL0C/src/Octavian.jl:1
in expression starting at stdin:2
ERROR: Failed to precompile Octavian [6fd5a793-0b7e-452c-907f-f8bfe9c57db4] to "/home/me/julia/compiled/v1.9/Octavian/jl_1m6rbd".

This is on a compute cluster, so the error might be linked to the setup I suppose, but I don't know enough about such things to figure out how to solve this. Any pointers?

EDIT: This happens on both the new Julia 1.9.0 and on 1.8.5.


chriselrod · 2023-05-10T11:12:14Z

> This is on a compute cluster, so the error might be linked to the setup I suppose

It's probably not reading the cache sizes directly.

nathanaelbosch · 2023-05-10T11:47:26Z

Any idea on how I can fix this?

chriselrod · 2023-05-10T11:50:40Z

What do you get for

julia> using CPUSummary

julia> CPUSummary.cache_size(Val(1))
static(32768)

julia> CPUSummary.cache_size(Val(2))
static(1048576)

julia> CPUSummary.cache_size(Val(3))
static(1441792)

julia> using Hwloc

julia> Hwloc.cachesize()
(L1 = 32768, L2 = 1048576, L3 = 20185088)

nathanaelbosch · 2023-05-10T12:09:45Z

julia> CPUSummary.cache_size(Val(1))
static(32768)

julia> CPUSummary.cache_size(Val(2))
static(4194304)

julia> CPUSummary.cache_size(Val(3))
static(1048576)

julia> Hwloc.cachesize()
(L1 = 32768, L2 = 4194304, L3 = 16777216)

chriselrod · 2023-05-10T21:12:03Z

That all looks correct, except -- you have 4 MiB of L2 cache? What CPU are you on?

julia> versioninfo()
Julia Version 1.10.0-DEV.1254
Commit b9b8b38ec0 (2023-05-09 20:47 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: 28 × Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake-avx512)
  Threads: 41 on 28 virtual cores
Environment:
  JULIA_PATH = @.
  LD_LIBRARY_PATH = /usr/local/lib/
  JULIA_NUM_THREADS = 28

julia> ccall(:jl_getpagesize, Int, ())
4096

and let's confirm jl_getpagesize is returning correctly.

chriselrod · 2023-05-10T21:20:51Z

It'd be easier for you to debug these yourself and tell me what is wrong.
Why do we have an invalid call to malloc?

  [7] malloc
    @ ./libc.jl:355 [inlined]
  [8] valloc
    @ ~/.julia/packages/VectorizationBase/0dXyA/src/alignment.jl:36 [inlined]
  [9] init_bcache
    @ ~/.julia/packages/Octavian/XhL0C/src/init.jl:19 [inlined]

https://github.com/JuliaLinearAlgebra/Octavian.jl/blob/00d50b3fb270f23d7f94dedd261cd95a3fb25af3/src/init.jl#LL16C1-L27C4

function init_bcache()
  if bcache_count() ≢ Zero()
    if BCACHEPTR[] == C_NULL
      BCACHEPTR[] = VectorizationBase.valloc(
        Threads.nthreads() * second_cache_size() * bcache_count(),
        Cvoid,
        ccall(:jl_getpagesize, Int, ())
      )
    end
  end
  nothing
end

calls

function valloc(
  N::Union{Integer,StaticInt},
  ::Type{T} = Float64,
  a = max(register_size(), cache_linesize())
) where {T}
  # We want alignment to both vector and cacheline-sized boundaries
  size_T = max(1, sizeof(T))
  reinterpret(
    Ptr{T},
    align(reinterpret(UInt, Libc.malloc(size_T * N + a - 1)), a)
  )
end

https://github.com/JuliaSIMD/VectorizationBase.jl/blob/9174dcca731144935e438d44ba07f4e4ec3a66c6/src/alignment.jl#L29-L40

So

N = Threads.nthreads() * Octavian.second_cache_size() * Octavian.bcache_count()
a = ccall(:jl_getpagesize, Int, ())
N + a - 1

Seems to be negative.

You can copy paste the definitions of Octavian.bcache_count and Octavian.second_cache_size.
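The failure in the stack trace can be reproduced in isolation. `Libc.malloc` takes its size as an unsigned integer, so a negative `Int` already fails during the conversion, before any allocation happens. A minimal sketch using the value from the issue title; `try_size` is a hypothetical helper, not part of Octavian or VectorizationBase:

```julia
# Hypothetical helper: attempt the Int -> UInt conversion that a malloc size
# argument goes through, and report whether it succeeded.
function try_size(nbytes::Int)
    try
        return UInt(nbytes)      # UInt is UInt64 on a 64-bit system
    catch err
        err isa InexactError || rethrow()
        return nothing           # a negative size has no unsigned representation
    end
end

try_size(4096)        # a valid size converts without error
try_size(-3141633)    # the value from the error message: returns nothing
```

So the question is not why `malloc` fails, but why the requested byte count is negative in the first place.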

nathanaelbosch · 2023-05-11T13:42:41Z

Thanks a lot for your help, I really appreciate it.

> That all looks correct, except -- you have 4 MiB of L2 cache? What CPU are you on?

versioninfo()

julia> versioninfo()
Julia Version 1.9.0
Commit 8e630552924 (2023-05-07 11:25 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 64 × Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, cascadelake)
  Threads: 2 on 64 virtual cores
Environment:
  JULIA_NUM_THREADS = auto
  JULIA_STACKTRACE_MINIMAL = true

> and let's confirm jl_getpagesize is returning correctly.

jl_getpagesize

julia> ccall(:jl_getpagesize, Int, ())
4096

> It'd be easier for you to debug these yourself and tell me what is wrong.
> So [...] Seems to be negative.

It is negative indeed! I get

julia> N + a - 1
-6287361

This line seems to be the issue:

Octavian.jl/src/global_constants.jl

Line 70 in 00d50b3

_second_cache_size(scs::StaticInt, ::True) = scs - cache_size(first_cache())

According to CPUSummary I have

julia> (CPUSummary.cache_size(second_cache()), CPUSummary.cache_size(first_cache()))
(static(1048576), static(4194304))

so the former minus the latter gives a negative number.

You mentioned that the results I wrote earlier (#177 (comment)) looked correct, but were they? The L3 numbers reported by CPUSummary and Hwloc differ, and in particular, if CPUSummary reported the Hwloc number, the result would not be negative (but again, I really don't know anything about hardware, so this might make no sense).
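The two negative values seen in this thread can be recomputed from the reported cache sizes. This is only a worked check under stated assumptions: `bcache_count()` is taken to be 1 and the precompilation process is taken to run with a single thread, neither of which is shown in the thread; they are inferred because they reproduce the reported numbers. `malloc_bytes` is a hypothetical name.

```julia
# Values reported in this thread for the affected machine (bytes).
first_cache_bytes  = 4194304   # CPUSummary.cache_size(first_cache())
second_cache_bytes = 1048576   # CPUSummary.cache_size(second_cache())
page_size          = 4096      # ccall(:jl_getpagesize, Int, ())

# Assumption: bcache_count() == 1 (not shown in the thread).
bcache_count = 1

# valloc requests size_T * N + a - 1 bytes, where size_T = max(1, sizeof(Cvoid)) = 1,
# N = nthreads * second_cache_size * bcache_count, and a is the page size.
malloc_bytes(nthreads) =
    nthreads * (second_cache_bytes - first_cache_bytes) * bcache_count + page_size - 1

malloc_bytes(1)   # -3141633, the value in the precompilation error (assuming 1 thread)
malloc_bytes(2)   # -6287361, the value computed in the REPL session with 2 threads
```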

chriselrod · 2023-05-11T16:27:11Z

Cascadelake has 1 MiB of L2 cache/core. So the 4 MiB reported is wrong.
Furthermore

julia> sc = Octavian.second_cache()
static(3)

julia> Octavian.cache_inclusive(sc)
static(false)

the cache is not inclusive either, so it shouldn't be subtracting.
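One way to picture the intended behaviour is a guard that subtracts only for inclusive caches and never returns a negative size. This is a sketch, not Octavian's code: `usable_second_cache` is a hypothetical function, and plain `Int` and `Bool` values stand in for the `StaticInt`, `True` and `False` types used in the package.

```julia
# Hypothetical stand-in for the subtraction logic, using plain Ints and Bools.
function usable_second_cache(second_bytes::Int, first_bytes::Int, inclusive::Bool)
    # Only an inclusive cache duplicates the contents of the lower level.
    inclusive || return second_bytes
    remaining = second_bytes - first_bytes
    # Misreported sizes can make the difference non-positive; fall back to the
    # unadjusted size instead of returning a negative byte count.
    return remaining > 0 ? remaining : second_bytes
end

usable_second_cache(1048576, 4194304, false)  # non-inclusive: no subtraction
usable_second_cache(1048576, 4194304, true)   # inclusive but inconsistent sizes: fallback
usable_second_cache(4194304, 1048576, true)   # consistent sizes: 3145728
```

The fallback branch is a design choice made for this illustration; throwing a descriptive error would be an equally reasonable reaction to inconsistent cache sizes.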

CPUSummary is supposed to report per-core sizes, hence the discrepancy for the L3 cache vs Hwloc.
Octavian also shouldn't be trying to use a greater allotment of cache than the number of threads it has, as it can't assume the other threads aren't busy working on something else (if they weren't, Octavian itself could/should've been multithreaded).
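The per-core versus total distinction can be checked against the numbers posted earlier for the i9-9940X machine. The only value not taken from the thread is the physical core count, which is assumed to be 14 (28 virtual cores with two hardware threads per core).

```julia
hwloc_l3_total = 20185088   # Hwloc.cachesize().L3 from the earlier output
per_core_l3    = 1441792    # CPUSummary.cache_size(Val(3)) on the same machine

# Assumption: 14 physical cores (28 virtual cores, 2 hardware threads per core).
physical_cores = 14

hwloc_l3_total == per_core_l3 * physical_cores   # true
per_core_l3 / 2^20                               # 1.375 (MiB per core)
```

Under that assumption the two packages agree: Hwloc reports the whole shared L3, CPUSummary reports one core's share of it.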

sloede · 2023-05-29T05:55:22Z

I am getting the same errors on my machine. It is not a compute cluster but a virtual machine from a large German cloud provider (Hetzner). I get similar numbers:

julia> (CPUSummary.cache_size(second_cache()), CPUSummary.cache_size(first_cache()))
(static(2097152), static(4194304))

That is, the first cache is reported as much larger than the second one, so N is already negative.

@nathanaelbosch did you find a way to fix this issue for you?
@chriselrod If I reach out to their support regarding this, what exactly should I tell them (preferably without having to rely on Julia terminology)? That their setup reports the wrong cache sizes for the L2 and L3 caches?

nathanaelbosch · 2023-05-30T09:19:54Z

@sloede Unfortunately I did not find a way to fix this. But I would be very interested in a solution to this issue.

chriselrod · 2023-05-30T14:45:02Z

As a workaround, we could hardcode values for certain architectures, e.g. check for

julia> Sys.CPU_NAME
"cascadelake"

sloede · 2023-06-02T06:56:36Z

Is the amount of cache fixed for certain architectures? In my case, they identify as Skylake, as far as I can tell.

chriselrod · 2023-06-02T15:05:21Z

Yes. CPUSummary reports cache per core, so that Octavian can assume the total L3 cache it can use is proportional to the number of cores it is using. (If other cores are doing something else, they're likely to use/want some chunk of the L3 themselves.)

Skylake-avx512 and cascadelake are almost the same thing. They both have the same amount of L1 and L2 cache per core: 32 KiB L1d, 32 KiB L1i, 1024 KiB L2.

They also have something like a 1.375 MiB L3 slice per core, shared among all cores.
I'm assuming your server is skylake-avx512 rather than skylake?
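The hardcoding workaround could look roughly like the following. This is a hedged sketch only: `FALLBACK_CACHE_BYTES` and `fallback_cache_bytes` are hypothetical names that do not exist in CPUSummary or Octavian, and the table contains just the two architectures and the approximate per-core figures quoted in this thread.

```julia
# Hypothetical fallback table of per-core cache sizes in bytes, filled in from
# the figures quoted in this thread. Not part of CPUSummary or Octavian.
const KIB = 1024
const FALLBACK_CACHE_BYTES = Dict(
    "cascadelake"    => (L1d = 32 * KIB, L2 = 1024 * KIB, L3 = 1408 * KIB),  # 1408 KiB = 1.375 MiB
    "skylake-avx512" => (L1d = 32 * KIB, L2 = 1024 * KIB, L3 = 1408 * KIB),
)

# Return the hardcoded sizes for a known CPU name, or `nothing` otherwise.
fallback_cache_bytes(cpu_name::AbstractString = Sys.CPU_NAME) =
    get(FALLBACK_CACHE_BYTES, cpu_name, nothing)

fallback_cache_bytes("cascadelake").L2   # 1048576
fallback_cache_bytes("znver3")           # nothing: no entry in this sketch
```

A lookup like this only helps when `Sys.CPU_NAME` identifies the real microarchitecture, which is worth confirming on virtual machines.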

sloede · 2023-06-02T15:10:36Z

> I'm assuming your server is skylake-avx512 rather than skylake?

Yes, it looks like it. `julia -e 'using InteractiveUtils; versioninfo(verbose=true)'` gives me the following:

Julia Version 1.9.0
Commit 8e630552924 (2023-05-07 11:25 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 20.04.6 LTS
  uname: Linux 5.15.0-72-generic #79-Ubuntu SMP Wed Apr 19 08:22:18 UTC 2023 x86_64 x86_64
  CPU: Intel Xeon Processor (Skylake, IBRS): 
              speed         user         nice          sys         idle          irq
       #1  2099 MHz      10207 s          0 s        968 s      67491 s          0 s
       #2  2099 MHz      10011 s          0 s       1001 s      67651 s          0 s
       #3  2099 MHz       9902 s          0 s        940 s      67830 s          0 s
       #4  2099 MHz      10310 s          0 s        969 s      67400 s          0 s
       #5  2099 MHz        242 s          0 s        210 s      78168 s          0 s
       #6  2099 MHz        250 s          0 s        200 s      78101 s          0 s
       #7  2099 MHz        235 s          0 s        212 s      78164 s          0 s
       #8  2099 MHz        265 s          8 s        210 s      78115 s          0 s
  Memory: 8.0 GB (7221.390625 MB free)
  Uptime: 7880.56 sec
  Load Avg:  0.54  0.12  0.04
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake-avx512)
  Threads: 1 on 8 virtual cores
Environment:
  GITHUB_PATH = /_work/github-runner-1-3/_temp/_runner_file_commands/add_path_cec3c7ec-60a7-4b35-842d-3fdcb4ffc5f2
  HOME = /root
  GITHUB_EVENT_PATH = /_work/github-runner-1-3/_temp/_github_workflow/event.json
  PATH = /opt/hostedtoolcache/julia/1.9.0/x64/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/actions-runner

sloede mentioned this issue May 29, 2023

Add CI job on self-hosted runner trixi-framework/Trixi.jl#1495

Closed
