Unhandled exception Type=Segmentation error #595
Comments
@Jenson3210 The error message and the method match the referenced bug/APAR, but the stacktrace is slightly different. So I don't know for sure that the fix for the APAR would resolve your error. Semeru team would likely know. Repository for Semeru is at https://github.com/ibmruntimes/Semeru-Runtimes/issues As per the release roadmap for Semeru, 17.0.14.0 is planned for February 2025. Semeru containers are typically available 1-2 weeks after the binary (non-container) release. Liberty images can be expected to be updated within 1-2 weeks of Semeru container release. |
@leochr, Thanks a lot for the quick answer. Raised our question over there also. At this moment testing an e2e pipeline every 30 minutes with |
I checked |
@leochr Our pipeline has already failed several times, but we are not able to retrieve the dump file. I tried it on a healthy container and I get a dump file. But on a crashed container the command does not finish (even after one hour); I only get an empty directory. |
@JorisNens Heap and system dumps are heavyweights. The thread dump is lightweight and commonly useful. Can you try with |
@leochr I tried with --include=thread only, but I see the same issue: the command does not finish. In the OpenShift metrics view I see a small increase in memory, but no increase in CPU (<10 millicores usage). |
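Since the dump command can hang for an hour or more on a crashed container, one option is to bound it with a deadline so the pipeline step fails fast instead of blocking. The following is only a minimal sketch: `run_with_deadline` is a hypothetical helper name, and the server name and 300-second deadline in the usage comment are assumptions, not values from this thread.

```shell
# Run a command, but give up after a deadline so a hung dump cannot block the pipeline.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
  seconds="$1"
  shift
  status=0
  # coreutils `timeout` sends SIGTERM at the deadline and then exits with status 124.
  timeout "${seconds}" "$@" || status=$?
  if [ "${status}" -eq 124 ]; then
    echo "timed out after ${seconds}s: $*" >&2
  fi
  return "${status}"
}

# Hypothetical usage inside the container (server name and deadline are assumptions):
#   run_with_deadline 300 server dump defaultServer --include=thread
```

A status of 124 tells the caller that the deadline was hit, which is different from the dump command itself reporting a failure.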
Kevin (@kgibm), adding you, in case you have any suggestions or insights to debug/resolve the issue with gathering dump. Thank you. |
Most commonly with containers, crash core dumps go to the worker node: https://eclipse.dev/openj9/docs/xdump/#piped-system-dumps Check As far as the hanging |
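What exactly was to be checked is not shown above, but the linked OpenJ9 page describes piped system dumps in terms of the kernel setting `/proc/sys/kernel/core_pattern`: when its value starts with `|`, the kernel hands the core to a helper program instead of writing a file in the process's working directory. A small sketch for inspecting that setting (the function name `classify_core_pattern` is hypothetical):

```shell
# Classify a kernel core_pattern value: "piped" if cores are handed to a helper
# program (value starts with '|'), otherwise "file".
classify_core_pattern() {
  case "$1" in
    '|'*) echo "piped" ;;
    *)    echo "file" ;;
  esac
}

# Read the live setting when it is readable (Linux only). Containers share the
# worker node's kernel, so this is normally the node's setting.
if [ -r /proc/sys/kernel/core_pattern ]; then
  pattern="$(cat /proc/sys/kernel/core_pattern)"
  echo "core_pattern: ${pattern} ($(classify_core_pattern "${pattern}"))"
fi
```

If the result is "piped", the crash dump is most likely handled on the worker node rather than inside the container's file system.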
Thank you Kevin! @JorisNens, please try what Kevin suggested and provide the artifacts so we can investigate further. Thank you. |
Hi, sorry for the late response. I see some company details in the file, so I first have to check internally whether I can share the txt file. |
A javacore is a thread dump which is just a small text file that shows basic information about a Java process such as stack traces of all threads: https://eclipse.dev/openj9/docs/dump_javadump/ You can either review it in a text editor or in a free tool such as the IBM Thread and Monitor Dump Analyzer.
Then execute
Yes, please scrub any confidential information. For the issue of |
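Because a javacore is plain text, a quick first look does not need a dedicated tool. The sketch below lists and counts the threads in a javacore, assuming the usual OpenJ9 layout in which each thread entry begins with a `3XMTHREADINFO` tag; the function names are hypothetical.

```shell
# Print the header line of every thread recorded in a javacore file.
# Assumes each thread entry starts with the 3XMTHREADINFO tag followed by whitespace,
# which excludes the 3XMTHREADINFO1/2/3 detail lines.
list_javacore_threads() {
  grep -E '^3XMTHREADINFO[[:space:]]' "$1"
}

# Count the threads recorded in a javacore file.
count_javacore_threads() {
  grep -cE '^3XMTHREADINFO[[:space:]]' "$1"
}

# Example usage (file name is a placeholder):
#   count_javacore_threads javacore.txt
#   list_javacore_threads javacore.txt | less
```

The same approach works for a quick check that a scrubbed copy still contains the thread entries before sharing it.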
javacore.20250106.154207.1.0003_clean2.txt I changed the setting to collect the system dump with the Kill command. I will send a new javacore file when the crash happens again. |
@jorisnenscolruyt Was this taken during the
A javacore and system dump will be produced automatically on a crash, so you do not need to execute the |
I was able to trigger a new javacore dump: javacore.20250107.200605.1.0004 copy.txt When the server crashes, a dump is created automatically, but the dump creation fails:
That is why I try to generate a dump with the kill command or server dump command. The following error is printed in the server logs:
|
These are from the same PID 1 which is Liberty. The
This means the system dump went to |
I should add that it won't be named
|
Hi, thanks for the patience here. We were finally able to get the dump from our worker nodes. We volume-mounted this dump into a Liberty server container to have the same runtime, and processed the decompressed file with jpackcore ./mounted/file ./mounted/dump.zip This generates a dump.zip on my host machine, which can be analysed locally after installing 17.0.13-sem (sdk install java 17.0.13-sem), using ~/.sdkman/candidates/java/17.0.13-sem/bin/jdmpview ./dump.zip As this file appears to contain loads of data (some of it sensitive), any advice on how best to continue here? |
analysis dump.zip Can you find anything in these files you need? |
The core dump analysis for this issue is best handled in ibmruntimes/Semeru-Runtimes#101 or in a support case. Based on Peter's last comment, it looks like By the way, if you want to test if an early release of 17.0.14 fixes the issue, here's a procedure to dynamically use a different JDK in containers: https://publib.boulder.ibm.com/httpserv/cookbook/Troubleshooting_Recipes-Troubleshooting_OpenShift_Recipes-Replace_Container_Directory_in_OpenShift.html |
I created an image with Java 17.0.14. It took longer before the server crashed, but in the end I received the same error: During startup of the pod, I see these additional messages: JVMSHRC226E Error opening shared class cache file Can this cause some errors? |
My guess is that the shared class cache would need to be rebuilt for the new JDK but those shared class cache errors aren't fatal; they just impact startup performance as the JVM falls back to no shared class cache, which is okay. I do find those module warnings a bit strange so I don't know if they're related but if you don't see them on the previous JDK, then I guess they're not the main issue. I think at this point deeper analysis of the core dump would be required by the Java team. A support case is the best way to share a core dump. |
I noticed
This enables the Java debugger. Is this a business requirement? Besides being potentially related to the crash, the debugger may have a significant performance impact even if there's no active debugger. |
Hi,
We're using 24.0.0.11-full-java17-openj9-ubi to run. However, I think 24.0.0.12-full-java17-openj9-ubi would not work either.
We're seeing crashes with this error log:
We think it might be related to this bug. According to the details there, it should be resolved by Semeru 17.0.14.0.
However, it seems that there is no version available other than the one used in the Dockerfile that contains Semeru 17.0.14.
Our startup logs contain
showing it is 17.0.13 only.
I could not find a way to raise this question on the IBM UBI image that is used in the Dockerfile, so I raised it here instead, hoping for some insights.
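The version question in this report (17.0.13 in the image versus the 17.0.14.0 release expected to carry the fix) can be turned into a simple gate, for example in a pipeline step. This is a minimal sketch, assuming the version string has already been extracted from the startup logs or from `java -version`; `version_at_least` is a hypothetical helper, and it relies on `sort -V` from GNU coreutils.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Uses `sort -V` (version sort): if $2 sorts first, then $1 is at least $2.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example with the two versions mentioned in this issue.
if version_at_least "17.0.13" "17.0.14"; then
  echo "runtime is at 17.0.14 or newer"
else
  echo "runtime is older than 17.0.14"
fi
```

A check like this only confirms which JDK level is in the image; it says nothing about whether that level actually resolves the crash.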