Precompilation throws: InexactError: check_top_bit(UInt64, -3141633)
#177
Comments
It's probably not reading the cache sizes directly. |
Any idea on how I can fix this? |
What do you get for
julia> using CPUSummary
julia> CPUSummary.cache_size(Val(1))
static(32768)
julia> CPUSummary.cache_size(Val(2))
static(1048576)
julia> CPUSummary.cache_size(Val(3))
static(1441792)
julia> using Hwloc
julia> Hwloc.cachesize()
(L1 = 32768, L2 = 1048576, L3 = 20185088) |
|
That all looks correct, except -- you have 4 MiB of L2 cache? What CPU are you on?
julia> versioninfo()
Julia Version 1.10.0-DEV.1254
Commit b9b8b38ec0 (2023-05-09 20:47 UTC)
Platform Info:
OS: Linux (x86_64-generic-linux)
CPU: 28 × Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, skylake-avx512)
Threads: 41 on 28 virtual cores
Environment:
JULIA_PATH = @.
LD_LIBRARY_PATH = /usr/local/lib/
JULIA_NUM_THREADS = 28
julia> ccall(:jl_getpagesize, Int, ())
4096
and let's confirm |
It'd be easier for you to debug these yourself and tell me what is wrong.
[7] malloc
@ ./libc.jl:355 [inlined]
[8] valloc
@ ~/.julia/packages/VectorizationBase/0dXyA/src/alignment.jl:36 [inlined]
[9] init_bcache
@ ~/.julia/packages/Octavian/XhL0C/src/init.jl:19 [inlined]
function init_bcache()
    if bcache_count() ≢ Zero()
        if BCACHEPTR[] == C_NULL
            BCACHEPTR[] = VectorizationBase.valloc(
                Threads.nthreads() * second_cache_size() * bcache_count(),
                Cvoid,
                ccall(:jl_getpagesize, Int, ())
            )
        end
    end
    nothing
end
calls
function valloc(
    N::Union{Integer,StaticInt},
    ::Type{T} = Float64,
    a = max(register_size(), cache_linesize())
) where {T}
    # We want alignment to both vector and cacheline-sized boundaries
    size_T = max(1, sizeof(T))
    reinterpret(
        Ptr{T},
        align(reinterpret(UInt, Libc.malloc(size_T * N + a - 1)), a)
    )
end
So
N = Threads.nthreads() * Octavian.second_cache_size() * Octavian.bcache_count()
a = ccall(:jl_getpagesize, Int, ())
N + a - 1
seems to be negative. You can copy paste the definitions of |
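For anyone following along, here is a rough sketch of the arithmetic behind the error in the title. It assumes `size_T == 1` (because `sizeof(Cvoid)` is 0) and a page size of 4096 as shown above; the value of `N` is back-computed from the error message, not taken from any output posted in this thread.

```julia
# Value from the issue title: the argument that failed to convert to UInt64.
failing_arg = -3141633

# Assumptions: page-size alignment and size_T = max(1, sizeof(Cvoid)) = 1.
a = 4096
size_T = max(1, sizeof(Cvoid))

# valloc passes size_T * N + a - 1 to Libc.malloc, so solve for N.
N = (failing_arg - a + 1) ÷ size_T

# N comes out as exactly -3 MiB, i.e. a negative "cache size" product.
println("N = ", N, " bytes = ", N ÷ 2^20, " MiB")

# malloc takes an unsigned size, and converting a negative Int throws.
err = try
    UInt64(failing_arg)
catch e
    e
end
println(typeof(err))
```

So a negative `second_cache_size()` would be enough to explain the failure, under these assumptions.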
Thanks a lot for your help, I really appreciate it.
versioninfo()
jl_getpagesize
It is negative indeed! I get
This line seems to be the issue: Octavian.jl/src/global_constants.jl Line 70 in 00d50b3
According to CPUSummary I have
so the former minus the latter gives a negative number. You mentioned that the results I wrote earlier (#177 (comment)) looked correct, but were they? The L3 numbers reported by CPUSummary and Hwloc differ, and in particular, if CPUSummary reported the Hwloc number, this would not be negative (but again, I really don't know anything about hardware, so this might make no sense). |
Cascadelake has 1 MiB of L2 cache/core. So the 4 MiB reported is wrong.
julia> sc = Octavian.second_cache()
static(3)
julia> Octavian.cache_inclusive(sc)
static(false)
The cache is not inclusive either, so it shouldn't be subtracting. CPUSummary is supposed to report per-core sizes, hence the discrepancy for the L3 cache vs hwloc. |
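As a quick sanity check of the per-core explanation, the two L3 numbers posted earlier in this thread can be compared directly. This is only a sketch: it assumes both numbers come from the same machine, and the core count is inferred from the ratio, not read from the hardware.

```julia
# L3 sizes as posted earlier in this thread (bytes).
hwloc_l3_total = 20185088      # Hwloc.cachesize().L3
cpusummary_l3 = 1441792        # CPUSummary.cache_size(Val(3))

# If CPUSummary reports a per-core slice, the ratio should be a whole
# number of cores.
cores, remainder = divrem(hwloc_l3_total, cpusummary_l3)
println("implied cores: ", cores, ", remainder: ", remainder)

# The per-core slice in MiB.
println("L3 slice per core: ", cpusummary_l3 / 2^20, " MiB")
```

The ratio comes out as a whole number, which is consistent with a 28-thread CPU running two threads per core.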
I am getting the same errors on my machine. It is not a compute cluster but a virtual machine from a large German cloud provider (Hetzner). I also get the reported numbers
julia> (CPUSummary.cache_size(second_cache()), CPUSummary.cache_size(first_cache()))
(static(2097152), static(4194304))
That is, the first cache is reported as much larger than the second one and thus
@nathanaelbosch did you find a way to fix this issue for you? |
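To illustrate why these two numbers are a problem, here is a minimal sketch of the subtraction discussed above. The helper `usable_cache_size` is hypothetical and is not Octavian's actual code; it only shows one way to skip the subtraction for non-inclusive caches and to clamp the result at zero.

```julia
# Hypothetical helper (not part of Octavian): size of the outer cache that
# is usable once the inner cache's contents are accounted for.
function usable_cache_size(outer::Int, inner::Int, inclusive::Bool)
    # A non-inclusive outer cache does not duplicate the inner cache,
    # so there is nothing to subtract.
    inclusive || return outer
    # Clamp so that bogus reported sizes can never produce a negative size.
    return max(outer - inner, 0)
end

# Sizes reported on the virtual machine above (bytes).
second_size = 2097152
first_size = 4194304

# Unconditional subtraction goes negative with these reported sizes.
naive = second_size - first_size
println("naive: ", naive)
println("non-inclusive: ", usable_cache_size(second_size, first_size, false))
println("inclusive, clamped: ", usable_cache_size(second_size, first_size, true))
```

Clamping only avoids the crash. It does not make the reported sizes correct, so the resulting block sizes could still be poorly tuned.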
@sloede Unfortunately I did not find a way to fix this. But I would be very interested in a solution to this issue. |
As a workaround, we could hardcode values for certain architectures, e.g. check for
julia> Sys.CPU_NAME
"cascadelake" |
Is the amount of cache fixed for certain architectures? In my case, they identify as Skylake, as far as I can tell |
Yes. CPUSummary reports cache per core, so that Octavian can assume the total L3 cache it can use is proportional to the number of cores it is using. (If other cores are doing something else, they're likely to use/want some chunk of the L3 themselves.) Skylake-avx512 and cascadelake are almost the same thing. They both have the same amount of L1 and L2 cache per core: 32 KiB L1d, 32 KiB L1i, 1024 KiB L2. They also have something like a 1.375 MiB L3 slice per core, shared among all cores. |
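The hardcoding workaround could look roughly like the sketch below. The table `HARDCODED_CACHE_SIZES` and the function `known_cache_sizes` are hypothetical names for illustration, not an existing CPUSummary or Octavian API; the per-core values are the ones quoted above, and the L3 slice is approximate.

```julia
const KiB = 1024

# Hypothetical lookup table: per-core cache sizes in bytes, keyed by the
# LLVM CPU name that Julia exposes as Sys.CPU_NAME.
const HARDCODED_CACHE_SIZES = Dict(
    "skylake-avx512" => (L1d = 32 * KiB, L2 = 1024 * KiB, L3 = 1408 * KiB),
    "cascadelake"    => (L1d = 32 * KiB, L2 = 1024 * KiB, L3 = 1408 * KiB),
)

# Return the hardcoded sizes for a CPU name, or `nothing` if the name is
# unknown and the detected values should be used instead.
known_cache_sizes(name::AbstractString = Sys.CPU_NAME) =
    get(HARDCODED_CACHE_SIZES, name, nothing)

sizes = known_cache_sizes("cascadelake")
println(sizes)
```

Falling back to `nothing` keeps the detected values in use on any architecture that is not in the table.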
Yes, it looks like it - `julia -e 'using InteractiveUtils; versioninfo(verbose=true)'` gives me the following:
|
I wanted to use a package that relies on Octavian, but the precompilation step fails in the following way:
This is on a compute cluster, so the error might be linked to the setup I suppose, but I don't know enough about such things to figure out how to solve this. Any pointers?
EDIT: This happens both on the new Julia 1.9.0, as well as on 1.8.5