Felix (latest) build error on Linux #175

Open
razetime opened this issue Oct 26, 2022 · 26 comments

@razetime
Collaborator

Felix (as of the latest commit on 26 October) fails with a build error when installed.

OS: Ubuntu 20.04.5 LTS x86_64

Python version: 3.8.10

OCaml version: 4.14.0

g++ version: 9.4.0

Full command output: gist

Previous discussion took place in the Felix mail group and in #174.

@skaller
Member

skaller commented Oct 27, 2022

So: the bootstrap build has worked successfully. However I cannot see the exact commands, which are in the file

build/fbuild.log

I think. I also cannot see the commands for the second build; these can be exposed by setting

export FLX_SHELL_ECHO=1

Both build processes just provide a summary of each step, rather than the exact shell command.

So in the second build we get two different errors when trying to link:

/usr/local/bin/ld: error: build/release/host/lib/rtl/libflx_static.a(flx_world_config_static.o): requires unsupported dynamic reloc 11; recompile with -fPIC
/usr/local/bin/ld: error: build/release/host/lib/rtl/libflx_static.a(flx_svc_static.o): requires unsupported dynamic reloc 11; recompile with -fPIC
/usr/local/bin/ld: error: build/release/host/lib/rtl/libflx_static.a(flx_ioutil_static.o): requires dynamic R_X86_64_32 reloc which may overflow at runtime; recompile with -fPIC

As far as I can tell this is a bug in the linker. The linker thinks it should be linking a shared library OR an executable that depends on a shared library. But it's SUPPOSED to be doing a full static link and all the object files SHOULD have been compiled for that. Every one of them ends in the suffix

_static.o

which means precisely that it was compiled WITHOUT -fPIC for static linkage.

Now the thing is, the fbuild (bootstrap) build successfully linked bootflx (which is renamed flx) which is almost exactly the same code as the full flx except it doesn't support plugins.

Now, plugin support just loads shared libraries. This uses dlopen. However dlopen and dlsym should NOT require the executable to be -fPIC. That would be required if we did LOAD TIME linkage to a shared library but plugins are loaded at RUN TIME. It's possible the Linux library object file containing dlopen is incorrectly compiled.

So at the moment it looks like gcc, ld, or something in Linux is broken. It can be easily fixed by simply building even the static objects with -fPIC, but the whole point of having static objects at all is to avoid precisely that.

It's POSSIBLE that the error is in the Felix build code but very unlikely, because it has worked before and the compiles are done with a wildcarded Felix script that builds everything the same way. The difference in error messages is almost certainly because of this: some modules have no dependencies, others do. So the reloc 11 error is when there are no dependencies, and the reloc X86 error is when there are.

The fact that EVERY object file gives an error suggests it is the linker command that is wrong: either the wrong switches are given by the Felix build script, OR the linker itself is broken.

Linkage by default should be static.

However .. there is one issue here: the C library on most platforms is a shared library END OF STORY. There is no static link C library any more. Linux is one of the archaic holdouts on this one. It's possible modern Linux has removed static link C libraries (because they prevent upgrading the system!)

MacOS doesn't support static link system libraries: not just the C library either, almost everything in a Framework on a Mac is dynamic link, and dynamic link ONLY. Similarly Windows. Everything's a DLL.

So the bottom line is this:

Step 1: set

export FLX_SHELL_ECHO=1

Step 2: delete the build artefacts

rm -rf build

Step 3: rebuild it (sorry!)

make

You need to somehow save the output. Both stderr and stdout.

Step 4: Find the linker commands used to build bootflx and flx.
The first one is in the fbuild.log file.
The second in console output from the second stage build.
This has all the errors in it.

The two commands may be different. The first one works, the second doesn't.
However the fbuild system uses Python for compiling and linking, whereas
the second stage does compiling and linking all over again using Felix script
instead of Python.

I need to see both linker commands to see if I got the switches right.
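
Steps 1 to 4 could be scripted roughly like this. It is only a sketch: the log name build-console.log and the helper show_link_commands are made up for illustration, and the filter assumes each echoed command sits on a single [system] line, with link steps being the ones that have -o but no "-c" switch.

```shell
#!/bin/sh
# Hypothetical helper: print the echoed [system] commands in a log file
# that look like link steps, i.e. they name an output with -o but do not
# compile with "-c".
show_link_commands() {
    grep -F '[system]' "$1" | grep -e ' -o ' | grep -v -e '"-c"'
}

# Intended use, from the top of the Felix checkout (not run here):
#   export FLX_SHELL_ECHO=1
#   rm -rf build
#   make 2>&1 | tee build-console.log    # keeps both stdout and stderr
#   show_link_commands build-console.log
```

The bootstrap link command would still have to be looked up by hand in the fbuild log, since its format is not shown in this thread.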

Based on the advice of the error message it is possible to patch it so that the compiles are done with -fPIC even when they should not need to be. The actual compiler toolchains are in this file:

https://github.com/felix-lang/felix/blob/master/src/packages/toolchain.fdoc

Around line 935 you will see the Felix code that is used to launch g++ for compilation and, a bit later, for linkage.

@skaller
Member

skaller commented Oct 27, 2022

So I'm now building Felix on Ubuntu using GitHub Workflows! Hopefully this will help identify the problem.

@skaller
Member

skaller commented Oct 27, 2022

Well, you can look in the Actions tab in GitHub for the repository. The bootstrap, for some arcane reason, is using clang++. I can fix that. But the Felix build is using g++ (so it says anyhow) and it's building without any problem.

I need to add some stuff to be sure it's actually using g++ and what version.

@skaller
Member

skaller commented Oct 27, 2022

OK, everything builds with clang++. I've finally fixed the script to use g++ instead.

@skaller
Member

skaller commented Oct 27, 2022

https://github.com/felix-lang/felix/actions/runs/3337443837/jobs/5523792114

This is using g++ version 9.4 and it builds just fine. I wonder why that works and your build does not?

@razetime
Collaborator Author

razetime commented Oct 28, 2022

I am not sure why, but I do have the result of a fresh build from scratch.

@skaller
Member

skaller commented Oct 28, 2022

maybe build/release/fbuild.log

This is always produced. It is unrelated to FLX_SHELL_ECHO.

@skaller
Member

skaller commented Oct 28, 2022

It seems to have worked!

@razetime
Collaborator Author

Oh, that's great. It must have been the artefacts that hurt the build. Thanks for the support.

@skaller
Member

skaller commented Oct 28, 2022

Just to explain the build process:

1a. A program written in Python, called fbuild, is used to build a bootstrap version of Felix.
The log for that part of the build process is here:

~/felix>ls build/release/fbuild.log
build/release/fbuild.log

This contains more detailed information than shown on the console.

Looking at the file you posted, fbuild decided to use clang++, not g++, as the compiler for this phase.
This shouldn't matter .. but it might if any object files are left lying around because whilst clang and gcc use compatible C libraries, they use quite different, incompatible C++ libraries.

The main result of this process is to build flxg the Felix compiler (written in Ocaml).

1b. Now, the flxg compiler is used to compile bootflx, which is written in Felix.

1c. Now bootflx is renamed to flx, and is used to build flx_build_boot, flx_build_prep, flx_build_flxg
and flx_build_rtl, the Felix programs used to build Felix.

2. We now have a cut down Felix system, with enough things built to rebuild Felix from scratch.
So we use these tools to rebuild the compiler flxg and the run time library, and finally the
driver flx. Then we scrap the bootstrap Felix altogether and use flx to rebuild the build
tools, and then use these tools to rebuild the rest of the system.

3. Finally we run the tests.

Now if the repository is changed you can try

make rebuild

and it rebuilds the system starting at phase 2. If the ONLY changes are to test code, or most
of the library, or the grammar, you don't need to rebuild ANYTHING. It will be rebuilt automatically.
The rebuild is needed if the compiler is changed, or the code for flx itself is changed.
And sometimes other things, so if you're not sure, just do a phase 2 rebuild.

Occasionally a clean rebuild from phase 1 is required.

There are other targets in the GNUmakefile:

make test

is one that can be run every now and then. There will be a

make rosetta

at some stage to rebuild and check all the Rosetta tests.

Finally if you say

flx --help

you will get a list of switches and environment variables. This one:

FLX_SHELL_ECHO=1

causes ANY Felix program that calls the shell to display the text of the shell call. Of course flx does this a lot:

~/felix>FLX_SHELL_ECHO=1 flx hello
[system] "/Users/skaller/felix/build/release/host/bin/flxg" "-q" "--inline=25" "--output_dir=/Users/skaller/.felix/cache/text" "--cache_dir=/Users/skaller/.felix/cache/binary" "-I/Users/skaller/felix/build/release/share/lib" "-I/Users/skaller/felix/build/release/host/lib" "--syntax=@/Users/skaller/felix/build/release/share/lib/grammar/grammar.files" "--automaton=/Users/skaller/.felix/cache/binary/Users/skaller/felix/build/release/share/lib/grammar/grammar.files/syntax.automaton" "--import=plat/flx.flxh" "--import=concordance/concordance.flxh" "std" "/Users/skaller/felix/hello.flx"
[get_stdout] "clang++" "-MM" "-std=c++14" "-O1" "-I/Users/skaller/felix/build/release/share/lib/rtl" "-I/Users/skaller/felix/build/release/host/lib/rtl" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello.cpp"
[system] "clang++" "-fPIC" "-fvisibility=hidden" "-g" "-c" "-O1" "-fno-common" "-fno-strict-aliasing" "-std=c++14" "-w" "-Wfatal-errors" "-Wno-return-type-c-linkage" "-Wno-invalid-offsetof" "-O1" "-I/Users/skaller/felix/build/release/share/lib/rtl" "-I/Users/skaller/felix/build/release/host/lib/rtl" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello.cpp" -o "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello_dynamic.o"
[system] "clang++" "-dynamiclib" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello_dynamic.o" -o "/Users/skaller/.felix/cache/binary/Users/skaller/felix/hello.dylib" "-L/Users/skaller/felix/build/release/host/lib/rtl" "-lflx_dynamic" "-lflx_pthread_dynamic" "-lflx_dynlink_dynamic" "-lflx_strutil_dynamic" "-lflx_gc_dynamic" "-ljudy_dynamic" "-lflx_exceptions_dynamic"
[system] env DYLD_LIBRARY_PATH=/Users/skaller/felix/build/release/host/lib/rtl:$DYLD_LIBRARY_PATH "/Users/skaller/felix/build/release/host/bin/flx_run" "/Users/skaller/.felix/cache/binary/Users/skaller/felix/hello.dylib"
[load_library] /Users/skaller/.felix/cache/binary/Users/skaller/felix/hello.dylib
Hello World!

You can see flx calling the compiler flxg and then the c++ compiler clang++ to compile the file, and then clang++ again to link the file .. SURPRISE! It is making a shared library! It then runs the shared library hello.dylib from the executable flx_run.

By default, Felix generates shared libraries. If you want an executable:

~/felix>FLX_SHELL_ECHO=1 flx --static hello
[get_stdout] "clang++" "-MM" "-std=c++14" "-O1" "-I/Users/skaller/felix/build/release/share/lib/rtl" "-I/Users/skaller/felix/build/release/host/lib/rtl" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello_static_link_thunk.cpp"
[system] "clang++" "-g" "-c" "-O1" "-fno-common" "-fno-strict-aliasing" "-std=c++14" "-w" "-Wfatal-errors" "-Wno-return-type-c-linkage" "-Wno-invalid-offsetof" "-O1" "-I/Users/skaller/felix/build/release/share/lib/rtl" "-I/Users/skaller/felix/build/release/host/lib/rtl" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello_static_link_thunk.cpp" -o "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello_static_link_thunk_static.o"
[get_stdout] "clang++" "-MM" "-std=c++14" "-O1" "-I/Users/skaller/felix/build/release/share/lib/rtl" "-I/Users/skaller/felix/build/release/host/lib/rtl" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello.cpp"
[system] "clang++" "-g" "-c" "-O1" "-fno-common" "-fno-strict-aliasing" "-std=c++14" "-w" "-Wfatal-errors" "-Wno-return-type-c-linkage" "-Wno-invalid-offsetof" "-O1" "-I/Users/skaller/felix/build/release/share/lib/rtl" "-I/Users/skaller/felix/build/release/host/lib/rtl" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello.cpp" -o "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello_static.o"
[system] "clang++"  -o "/Users/skaller/.felix/cache/binary/Users/skaller/felix/hello" "/Users/skaller/felix/build/release/host/lib/rtl/flx_run_lib_static_static.o" "/Users/skaller/felix/build/release/host/lib/rtl/flx_run_main_static.o" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello_static_link_thunk_static.o" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello_static.o" "-L/Users/skaller/felix/build/release/host/lib/rtl" "-lflx_static" "-lflx_pthread_static" "-lflx_dynlink_static" "-lflx_strutil_static" "-lflx_gc_static" "-ljudy_static" "-lflx_exceptions_static"
[system] "/Users/skaller/.felix/cache/binary/Users/skaller/felix/hello"
Hello World!

You may notice above flxg isn't called this time! That's because the output is cached and the input hasn't changed. You can also see clang++ -MM there. That's calculating the C++ dependencies. Watch this:

~/felix>FLX_SHELL_ECHO=1 flx --static hello
[get_stdout] "clang++" "-MM" "-std=c++14" "-O1" "-I/Users/skaller/felix/build/release/share/lib/rtl" "-I/Users/skaller/felix/build/release/host/lib/rtl" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello_static_link_thunk.cpp"
[get_stdout] "clang++" "-MM" "-std=c++14" "-O1" "-I/Users/skaller/felix/build/release/share/lib/rtl" "-I/Users/skaller/felix/build/release/host/lib/rtl" "/Users/skaller/.felix/cache/text/Users/skaller/felix/hello.cpp"
[system] "/Users/skaller/.felix/cache/binary/Users/skaller/felix/hello"
Hello World!

See? The -MM steps calculate dependencies but there is no Felix compile, there is no C++ compile, and there is no link! Felix just runs the executable. Everything is cached. In fact notice WHERE the executable is .. it's in the cache too.

@razetime
Collaborator Author

When I attempt to run programs with flx, however, I get this error:

$ flx factorial.flx
/home/razetime/Software/felix/build/release/host/bin/flx_run: symbol lookup error: /home/razetime/Software/felix/build/release/host/lib/rtl/libflx_gc_dynamic.so: undefined symbol: pthread_create
Error 127 in flx: [strerror_r] Failed to find text for error number 127

@skaller
Member

skaller commented Oct 28, 2022

AH. I think I know what that is. Actually all of the test cases you had failed. I didn't notice. Hmm.

I think the problem is that the linker needs -lpthread.

@razetime
Collaborator Author

flx --static factorial.flx runs it perfectly.

@skaller
Member

skaller commented Oct 28, 2022

Hmmm:

~/felix>cat src/config/linux/pthread.fpc
Generated_from: 543 "/Users/skaller/felix/src/packages/rtl-threads.fdoc"
Description: Linux pthread support
requires_dlibs: -lpthread
requires_slibs: -lpthread

dlibs are for shared lib builds and slibs for static link. But -lpthread is there in both.
But the dependency may be missing. This is a job for FLX_SHELL_ECHO :-)
To see if the dynamic (default) link includes that switch.

It doesn't on MacOS because it's not required on MacOS.

@skaller skaller reopened this Oct 28, 2022
@skaller
Member

skaller commented Oct 28, 2022

yeah that error is on EVERY test case in your build log. Same error.

Here's the build of the Felix pthread library:

Dynamic Linking library build/release/trial/lib/rtl/libflx_pthread_dynamic.so
[system] "g++" "-shared" 
"build/rtl-tmp/pthread_monitor_dynamic.o" "build/rtl-tmp/pthread_posix_thread_dynamic.o" 
"build/rtl-tmp/pthread_lf_bag_dynamic.o" "build/rtl-tmp/pthread_thread_control_dynamic.o" 
"build/rtl-tmp/flx_ts_collector_dynamic.o" 
"build/rtl-tmp/pthread_win_thread_dynamic.o" "build/rtl-tmp/pthread_bound_queue_dynamic.o" 
"build/rtl-tmp/pthread_condv_dynamic.o" 
"build/rtl-tmp/pthread_waitable_bool_dynamic.o" 
-o "build/release/trial/lib/rtl/libflx_pthread_dynamic.so" 
"-Lbuild/release/trial/lib/rtl" "-lflx_gc_dynamic" "-lflx_exceptions_dynamic"

Note there is no -lpthread!

I mean, if anything needed the C pthread library it would be the Felix pthread library!

@skaller
Member

skaller commented Oct 28, 2022

Just by the by .. Linux linker is TOTALLY SCREWED. The above is the proof. When linking shared libraries it silently ignores unresolved symbols. You don't get an error until you actually load the library and it finds there is a missing dependency. That's so utterly stupid it's unbelievable. If you're linking an executable, it tells you if there's a missing symbol. The argument is, when you link against a shared library, at load time it could be a different library so why bother reporting an error which hasn't happened yet? ARRRGGGG.
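
For reference, GNU ld can be told to reject unresolved symbols when linking a shared library: the option is -z defs (also spelled --no-undefined), passed through g++ as -Wl,-z,defs. A minimal sketch, assuming g++ with GNU ld or a compatible linker on Linux; demo.cpp, libdemo.so and missing_helper are made-up names, and nothing here claims the Felix toolchain passes this switch:

```shell
#!/bin/sh
# Scratch directory so nothing is written into the source tree.
workdir=$(mktemp -d)

# A library whose only function calls a symbol that is declared but
# never defined anywhere (hypothetical name).
cat > "$workdir/demo.cpp" <<'EOF'
extern "C" int missing_helper(void);
extern "C" int demo(void) { return missing_helper(); }
EOF

# Default behaviour: the unresolved symbol is accepted at link time.
if g++ -shared -fPIC "$workdir/demo.cpp" -o "$workdir/libdemo.so" 2>/dev/null
then default_link=ok
else default_link=failed
fi

# With -z defs the same link is rejected as an undefined reference.
if g++ -shared -fPIC -Wl,-z,defs "$workdir/demo.cpp" -o "$workdir/libdemo_strict.so" 2>/dev/null
then strict_link=ok
else strict_link=failed
fi

echo "default: $default_link, with -z defs: $strict_link"
rm -rf "$workdir"
```

Adding such a switch to the shared library link steps would turn a missing -lpthread into a link time error instead of a run time one, at the cost of requiring every dependency to be listed explicitly.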

@skaller
Member

skaller commented Oct 28, 2022

Yep .. missing switch:

Dynamic Linking library build/release/trial/lib/rtl/libflx_gc_dynamic.so
[system] "g++" "-shared" "build/rtl-tmp/flx_collector_dynamic.o" "build/rtl-tmp/flx_serialisers_dynamic.o" "build/rtl-tmp/flx_judy_scanner_dynamic.o" 
"build/rtl-tmp/flx_gc_dynamic.o" -o "build/release/trial/lib/rtl/libflx_gc_dynamic.so" "-Lbuild/release/trial/lib/rtl" "-ljudy_dynamic" "-lflx_exceptions_dynamic"

Now the question is how did the CI build work .. perhaps it didn't lol ...

@skaller
Member

skaller commented Oct 28, 2022

But here is a test case as an example:

[system] "g++" "-shared" "/home/runner/.felix/cache/text/home/runner/work/felix/felix/build/release/test/tut/tut_12_dynamic.o" 
-o "/home/runner/.felix/cache/binary/home/runner/work/felix/felix/build/release/test/tut/tut_12.so" 
"-Lbuild/release/host/lib/rtl" "-lflx_dynamic" "-lflx_pthread_dynamic" 
"-lpthread" "-lflx_dynlink_dynamic" 
"-ldl" "-lflx_strutil_dynamic" 
"-lflx_gc_dynamic" "-ljudy_dynamic" "-lflx_exceptions_dynamic"

See? That one has -lpthread.
Maybe your system doesn't know it's running on Linux ..

@skaller
Member

skaller commented Oct 28, 2022

Can you try this:

~/felix>cat build/release/host/config/pthread.fpc
Generated_from: 539 "/Users/skaller/felix/src/packages/rtl-threads.fdoc"
Description: pthread support defaults to no requirements

That's on MacOS. On Linux it should be this file:

Generated_from: 543 "/Users/skaller/felix/src/packages/rtl-threads.fdoc"
Description: Linux pthread support
requires_dlibs: -lpthread
requires_slibs: -lpthread

If it's NOT, try this:

cp src/config/linux/* build/release/host/config

Also you should be set up to use the Felix in build/release so DELETE THE INSTALLED FELIX

rm -rf /usr/local/lib/felix

and check

~/felix>which flx
/Users/skaller/felix/build/release/host/bin/flx

I'm not sure if flx gets installed in /usr/local/bin; if so, get rid of it. You'll need your $PATH and $LD_LIBRARY_PATH to include $PWD/build/release/host/bin and $PWD/build/release/host/lib/rtl, respectively.

AND you may need this too:

~/felix>cat /Users/skaller/.felix/config/felix.fpc
FLX_INSTALL_DIR: /Users/skaller/felix/build/release

with obvious changes ...

@razetime
Collaborator Author

razetime commented Oct 28, 2022

I modified LD_LIBRARY_PATH to add those directories, but there was no change. Should I rebuild again after that change?

Here are the command outputs, they all seem to be in order:

(I have no Felix installed in /usr)

$ cat build/release/host/config/pthread.fpc
Generated_from: 543 "/home/razetime/Software/felix/src/packages/rtl-threads.fdoc"
Description: Linux pthread support
requires_dlibs: -lpthread
requires_slibs: -lpthread
$ which flx
/home/razetime/Software/felix/build/release/host/bin/flx
$ cat ~/.felix/config/felix.fpc
FLX_INSTALL_DIR: /home/razetime/Software/felix/build/release

@skaller
Member

skaller commented Oct 29, 2022

You should probably build again from scratch. I just don't understand what's happening considering it builds on Ubuntu with g++ version 9 on the GitHub CI server. You could try:

make rebuild

which avoids the bootstrap. Note PATH needs to include build/release/host/bin, and LD_LIBRARY_PATH has to include build/release/host/lib/rtl. The Felix binaries are in the first directory, and the shared libraries in the second. When you run flx it automatically sets the LD_LIBRARY_PATH for the program it's running: however if you make an executable and run it stand-alone AND it needs a shared library, typically as a plugin, then the LD_LIBRARY_PATH has to be set in the environment.
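
In shell terms, that environment setup might look like the following. It is a sketch that assumes it is run from the top of the Felix checkout; FELIX_ROOT is a made-up variable name used only here.

```shell
#!/bin/sh
# Run from the top of the Felix checkout (assumed to be the current directory).
FELIX_ROOT=$PWD    # hypothetical variable name, used only in this sketch

# The Felix binaries (flx, flxg, flx_run) live here.
PATH="$FELIX_ROOT/build/release/host/bin:$PATH"

# The shared run time libraries live here. The :+ form avoids a trailing
# colon when LD_LIBRARY_PATH was empty or unset.
LD_LIBRARY_PATH="$FELIX_ROOT/build/release/host/lib/rtl${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

export PATH LD_LIBRARY_PATH
```

The lines have to be run in the current shell (for example by sourcing a file that contains them), otherwise the exports are lost when the script exits.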

Now the thing is, when you run say

flx hello.flx

it uses dynamic linkage (generates a shared library) which is loaded and run by

build/release/host/bin/flx_run

and the linkage machinery SHOULD be setting -lpthread because the meta-data in build/release/host/config is telling it to. There's something seriously wrong if it isn't set.

The way dynamic linkage works in Felix is using two level namespaces. What this means is that if a shared library A depends on B, then A is linked to B. Now if a shared library X depends on A, it is linked to A but it is NOT linked to B. In other words, -lpthread should ONLY be specified when making a shared library that actually makes calls to POSIX pthread functions. A library that uses THAT library SHOULD NOT need the -lpthread switch.

So what's happening is some library .. and I don't know which one .. which depends on the pthread library has not been linked against it. The idiot Linux linker doesn't have an error "symbol not found", which it should, so you only find out some library that needs pthreads isn't linked against it at run time .. and the diagnostic is NOT telling us which one. The problem is it works on the CI server. So something has gone wrong in your build but I cannot debug it because I'm not sitting in front of your screen. My best attempt to find an error has resulted in not finding one.
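
One way to narrow down which library is the culprit is to ask binutils directly: list each library's undefined dynamic symbols with nm, and its recorded dependencies (the NEEDED entries) with readelf. A rough sketch, assuming binutils is installed; the function name is made up. On glibc 2.34 or later, where the pthread functions live in libc itself, a missing libpthread entry is not an error, so a report there is harmless.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the shared library "$1"
# references pthread_create as an undefined dynamic symbol but records
# no libpthread dependency in its dynamic section.
uses_pthread_without_linking() {
    nm -D --undefined-only "$1" 2>/dev/null | grep -q 'pthread_create' || return 1
    if readelf -d "$1" 2>/dev/null | grep -F 'NEEDED' | grep -q -F 'libpthread'
    then return 1
    fi
    return 0
}

# Intended use, from the top of the Felix checkout (not run here):
#   for lib in build/release/host/lib/rtl/*.so; do
#       uses_pthread_without_linking "$lib" && echo "suspect: $lib"
#   done
```

Any library flagged this way on an older glibc is a candidate for a missing -lpthread on its own link command.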

@razetime
Collaborator Author

razetime commented Nov 1, 2022

well, both rebuild and a fresh build go into errors.
I tried the repl and got this:

$ flx --repl
> 1,2,3
/home/razetime/Software/felix/doc/cmd.flx: line 2, cols 1 to 0
1: 1,2,3
<eof>

Fatal error: exception Dyp.Syntax_error
Felix compilation "/home/razetime/Software/felix/build/release/host/bin/flxg" "-q" "--inline=25" "--output_dir=/home/razetime/.felix/cache/text" "--cache_dir=/home/razetime/.felix/cache/binary" "-I/home/razetime/Software/felix/build/release/share/lib" "-I/home/razetime/Software/felix/build/release/host/lib" "--syntax=@/home/razetime/Software/felix/build/release/share/lib/grammar/grammar.files" "--automaton=/home/razetime/.felix/cache/binary/home/razetime/Software/felix/build/release/share/lib/grammar/grammar.files/syntax.automaton" "--import=plat/flx.flxh" "--import=concordance/concordance.flxh" "std" "/home/razetime/Software/felix/doc/cmd.flx" failed
> (1,2,3)

@skaller
Member

skaller commented Nov 1, 2022

What do you mean "go into errors"?

The repl requires a Felix program. For example

~/felix>flx --repl
> println$ 1,2,3;
(1, 2, 3)

It reloads definitions every time. It does not work very well. This is the complete source code of the repl:


fun startlib (x:string) =
{
   return x in RE2(" *(fun|proc|var|val|gen|union|struct|typedef).*\n");
}

// MOVE LATER!
proc repl()
{

nextline:>
  print "> "; fflush stdout;
  var text = readln stdin;
  if feof(stdin) return;

  if startlib(text) goto morelibrary;
  goto executable;

morelibrary:>
  print ".. "; fflush stdout;
  var more = readln stdin;
  if feof(stdin) return;

  if more == "\n" goto saveit;
  text += more;
  goto morelibrary;

saveit:>
  var dlibrary = load("library.flx");
  dlibrary += text;
  save("library.flx",dlibrary);
  goto nextline;

executable:>
   var session = load("session.flx");
   session += text;
   save ("session.flx", session);
   dlibrary = load("library.flx");
   var torun = dlibrary + text;
   save ("cmd.flx", torun);
}
...
      end
    elif control*.REPL_MODE do
      begin
        again:>
        repl();
        if not feof (stdin) do
          var dvars = FlxDepvars::cal_depvars(toolchain_maker,c_compiler_executable, cxx_compiler_executable, *config,control, *loopctl);
          var pe = processing_env(toolchain_maker,c_compiler_executable, cxx_compiler_executable, *config,*control,dvars);
          result = pe.runit(ehandler);
          goto again;
        else
          println$ "Bye!";
          // TOP LEVEL REPL, OK
          System::exit 0;
        done
      end

@razetime
Collaborator Author

razetime commented Nov 1, 2022 via email

@skaller
Member

skaller commented Nov 2, 2022

GOT IT I think! It's a bug in C++ thread library construction.

void flx_collector_t::mark_multi(pthread::memory_ranges_t *px,int reclimit, int nthreads)
{
//fprintf(stderr, "starting %d mark threads\n", nthreads);
  j_tmp_waiting = 0;
  mark_thread_context_t mtc {this,px, reclimit};
  ::std::vector< ::std::thread> mark_threads;
  for (int i=0; i<gcthreads; ++i)
    mark_threads.push_back (::std::thread (run_mark_thread, &mtc));
  for (int i=0; i<gcthreads; ++i)
    mark_threads[i].join();
//fprintf(stderr, "multithread mark finished\n");
}

This is part of the GC. It is creating a thread using the C++ std::thread class.
The C++ header file is exposing pthread_create.

In any case the fix is trivial, just link the GC with -lpthread to work around the bug. I'll have a go at a patch.

@skaller skaller closed this as completed in 440ad1a Nov 2, 2022
@skaller skaller reopened this Nov 2, 2022
@skaller
Member

skaller commented Nov 2, 2022

Reopening. Stupid github decided a reference to this in a message should close it. Assigning to razetime.
