Test GALAHAD with PASTIX #319

Open
wants to merge 15 commits into
base: master

amontoison
Member

No description provided.

@amontoison
Member Author

@nimgould
I added CI tests with MUMPS v5.7.2 in #318, and all tests passed with both Int32 and Int64. 👍
However, the CI tests I added in this PR with PASTIX v6.3.0 (Int32 only) failed. 👎

@nimgould
Contributor

OK, I'll have a look. I think my pastix is out of date, so I'll install the latest version first.

@nimgould
Contributor

The makefile double precision tests (both standalone via pastixt.F90 and as part of sls/sbls) all work fine with pastix 6.3.2. However, the single precision ones don't get started: the spmGetArray function doesn't allocate the values array to the correct length. I will investigate further, but I have no idea why the meson double tests are failing.

@amontoison
Member Author

I will upgrade the version installed in CI (6.3.0 -> 6.3.2) to check whether the tests pass with this version.

@amontoison
Member Author

@nimgould I updated PaStiX to version 6.3.2, but both the single and double precision PaStiX tests are still failing:

=================================== 74/460 ===================================
test:         GALAHAD:pastix+single+fortran / pastixt_single
start time:   17:16:18
duration:     0.15s
result:       killed by signal 11 SIGSEGV
command:      ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 MALLOC_PERTURB_=88 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 LD_LIBRARY_PATH=/home/runner/work/GALAHAD/GALAHAD/builddir/:/opt/hostedtoolcache/Python/3.12.4/x64/lib:/home/runner/work/GALAHAD/GALAHAD/galahad/lib:/home/runner/work/GALAHAD/GALAHAD/../deps/lib:/home/runner/work/GALAHAD/GALAHAD/../CUTEst/lib /home/runner/work/GALAHAD/GALAHAD/builddir/pastixt_single
----------------------------------- stdout -----------------------------------
  eps    1.0000000474974513E-003
single
  size of row, col, val =            7           7           7
----------------------------------- stderr -----------------------------------
ischedInit: The thread number has been automatically set to 2

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7fca35823970 in ???
#1  0x7fca35822ad5 in ???
#2  0x7fca3544251f in ???
#3  0x7fca354a53fe in ???
#4  0x7fca352f8d5f in ???
#5  0x7fca353e2fba in ???
#6  0x7fca35384b0d in ???
#7  0x7fca35394c6b in ???
#8  0x7fca35b68018 in ???
#9  0x55d624ceec3c in test_pastix
	at ../src/external/pastix/pastixt.F90:226
#10  0x55d624cef47b in main
	at ../src/external/pastix/pastixt.F90:7
==============================================================================

=================================== 75/460 ===================================
test:         GALAHAD:pastix+double+fortran / pastixt_double
start time:   17:16:18
duration:     0.15s
result:       killed by signal 11 SIGSEGV
command:      ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 MALLOC_PERTURB_=104 LD_LIBRARY_PATH=/home/runner/work/GALAHAD/GALAHAD/builddir/:/opt/hostedtoolcache/Python/3.12.4/x64/lib:/home/runner/work/GALAHAD/GALAHAD/galahad/lib:/home/runner/work/GALAHAD/GALAHAD/../deps/lib:/home/runner/work/GALAHAD/GALAHAD/../CUTEst/lib /home/runner/work/GALAHAD/GALAHAD/builddir/pastixt_double
----------------------------------- stdout -----------------------------------
  eps    9.9999999999999998E-013
double
  size of row, col, val =            7           7           7
----------------------------------- stderr -----------------------------------
ischedInit: The thread number has been automatically set to 2

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7f4f19423970 in ???
#1  0x7f4f19422ad5 in ???
#2  0x7f4f1904251f in ???
#3  0x7f4f190a53fe in ???
#4  0x7f4f18ef8d5f in ???
#5  0x7f4f18fe2fba in ???
#6  0x7f4f18f84b0d in ???
#7  0x7f4f18f94c6b in ???
#8  0x7f4f1976a018 in ???
#9  0x561ca02f7c36 in test_pastix
	at ../src/external/pastix/pastixt.F90:226
#10  0x561ca02f8475 in main
	at ../src/external/pastix/pastixt.F90:7
==============================================================================
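Both failures above stop at the same resolved frame, and a short script can make that visible when a log holds many failing tests. The following is a minimal sketch, assuming the log layout shown above (a `==== N/M ====` banner per test, `test:` and `result:` fields, and gfortran backtrace frames). The function name `summarize_failures` and the shortened `sample` log are hypothetical and not part of GALAHAD or Meson.

```python
import re

def summarize_failures(log_text):
    """Return (test name, result, first resolved frame) for each test block."""
    summaries = []
    # Per-test blocks are separated by a banner such as "==== 74/460 ====".
    blocks = re.split(r"^=+ \d+/\d+ =+$", log_text, flags=re.MULTILINE)
    for block in blocks:
        name = re.search(r"^test:\s+(.+)$", block, flags=re.MULTILINE)
        result = re.search(r"^result:\s+(.+)$", block, flags=re.MULTILINE)
        if not name or not result:
            continue
        # Resolved frames look like "#9  0xADDR in routine" followed by
        # an indented "at file:line"; frames shown as "???" are skipped.
        frame = re.search(
            r"^#\d+\s+0x[0-9a-f]+ in (\w+)\n\s+at (\S+):(\d+)",
            block,
            flags=re.MULTILINE,
        )
        if frame:
            site = f"{frame.group(1)} at {frame.group(2)}:{frame.group(3)}"
        else:
            site = "unresolved"
        summaries.append((name.group(1).strip(), result.group(1).strip(), site))
    return summaries

# Shortened, illustrative excerpt of the log format above.
sample = """\
=== 74/460 ===
test:         GALAHAD:pastix+single+fortran / pastixt_single
result:       killed by signal 11 SIGSEGV
#0  0x7fca35823970 in ???
#9  0x55d624ceec3c in test_pastix
\tat ../src/external/pastix/pastixt.F90:226
"""

for row in summarize_failures(sample):
    print(row)
```

Frames printed as `???` come from libraries without debug symbols, so only the first frame inside the test program itself is reported.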

@amontoison
Member Author

amontoison commented Jul 11, 2024

If I compile PASTIX without SCOTCH, the pastixt test in double precision works. 🤔

@nimgould
Contributor

Oh, I should say that I followed the advice on the Pastix page to set scotch on and metis off. This works for me. I suppose it might well be a metis version clash (again!).

@amontoison force-pushed the master branch 4 times, most recently from ff47118 to f58979d (January 12, 2025 07:31)
@amontoison
Member Author

amontoison commented Jan 27, 2025

@nimgould
I get this compilation error. I don't understand how a .mod file can be broken?!

 [2842/4669] Compiling Fortran object pastixt_quadruple.p/src_external_pastix_pastixt.F90.o
FAILED: pastixt_quadruple.p/src_external_pastix_pastixt.F90.o 
gfortran -Ipastixt_quadruple.p -I. -I.. -Iinclude -I../include -I../src/dum/include -I../src/metis/include -Isrc/ampl -I../src/ampl -I../../deps/modules -I../../CUTEst/modules -I../../deps/include/spm -I../../deps/include/pastix -Ilibgalahad_quadruple.so.p -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -O0 -g -fopenmp -DLANCELOT_USE_MA57 -cpp -DREAL_128 -DGALAHAD_BLAS -DGALAHAD_LAPACK -DDUMMY_QMUMPS -DDUMMY_MKL_PARDISO -DDUMMY_PARDISO -DDUMMY_WSMP -DDUMMY_MPI -Jpastixt_quadruple.p -o pastixt_quadruple.p/src_external_pastix_pastixt.F90.o -c ../src/external/pastix/pastixt.F90
f951: Fatal Error: Reading module ‘libgalahad_quadruple.so.p/spmf_interfaces_quadruple.mod’ at line 304 column 51: Expected left parenthesis

I would like to test GALAHAD with PaStiX v6.4.0.
I was able to cross-compile this version, which means that all Julia users can use it in the next release of GALAHAD_jll.jl / GALAHAD.jl if we want.

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 43.33%. Comparing base (1f13ff9) to head (1f67c4c).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #319   +/-   ##
=======================================
  Coverage   43.33%   43.33%           
=======================================
  Files         313      313           
  Lines      161912   161912           
  Branches    56182    56182           
=======================================
  Hits        70165    70165           
  Misses      74180    74180           
  Partials    17567    17567           


@nimgould
Contributor

I have no idea. As you say, a bug in a mod file is unusual; I have never seen one before. Certainly there is no issue here with the quad test using the dummy pastix (but then I don't know whether the real one supports 128-bit reals).
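One way to look into the module-read error reported earlier in this thread is to open the .mod file at the position gfortran names. Recent gfortran versions write .mod files as gzip-compressed text, so they can be decompressed and read directly. The following is a minimal sketch: the helper name `show_module_position` is hypothetical, and it assumes that the line and column in the error message are 1-based and refer to the decompressed text, which has not been verified here.

```python
import gzip

def show_module_position(mod_path, line_no, column, context=1):
    """Return the module header plus the lines around (line_no, column)."""
    # gfortran .mod files are gzip-compressed text (assumed for this compiler).
    with gzip.open(mod_path, "rt", encoding="utf-8", errors="replace") as handle:
        lines = handle.read().splitlines()
    # The first line names the module format version and the source file.
    out = [lines[0]] if lines else []
    first = max(1, line_no - context)
    last = min(len(lines), line_no + context)
    for number in range(first, last + 1):
        out.append(f"{number:6d}: {lines[number - 1]}")
        if number == line_no:
            # The "NNNNNN: " prefix is 8 characters wide; column is 1-based.
            out.append(" " * (8 + column - 1) + "^")
    return out
```

For the error above, a call such as `show_module_position("libgalahad_quadruple.so.p/spmf_interfaces_quadruple.mod", 304, 51)`, run from the build directory, would list the header and the lines around the reported position. What stands at that position may help tell a malformed module apart from a declaration the compiler cannot read back in the quadruple build.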

3 participants