MAVSDK 2.1.0 Seg Fault Crashes When Tearing Down Server #180
Comments
Hm, any chance you can get a backtrace with more info? It looks like it happens in some lock but I'm not sure which one and when.
Sure, here's the Tombstone for it
Slightly different tombstone for it
So I nailed down the commit that introduced this particular crash. I went through the different versions on the v2.12 branch of MAVSDK and found that it is this one: mavlink/MAVSDK@71126ac. Reverting this change while on the 2.12.6 tag made this particular crash go away. But I do see that the intent of the change was to improve connecting to systems, so we probably don't want to remove it but instead fix it somehow...
Great find, sorry for being slow here. Given you seem to be able to build the server and try out different versions, any chance you can build debug mode and get more output where it crashes?
Or if you have hints how I can reproduce it with the example apps that would work too. Problem is I'm not familiar with the Java stuff but I do want to help fix this bug.
Hi @julianoes - I attached a custom example app (https://github.com/user-attachments/files/17066260/ReproMavsdkCrash.zip) that can easily repro the issue. It can be done by,
Thanks for that. I can reproduce it, but I'm not sure what to make of it yet. It looks like it crashes here: But that doesn't look right if you ask me, as the tombstone suggests pthread_mutex_lock.
Ok, so now I realized that you're starting 3 servers, one on each UDP port. Are you actually connecting 3 vehicles? Or is this just to connect three ports? If it is to do 3 ports but one vehicle, you might want to add some mavlink-router or MAVSDK-based forwarder in between. In any case, the crash happens for the one where nothing is connected! I'll have a closer look into this.
I've finally managed to copy in my debug build mavsdk_server.so and added some printfs. It turns out the https://github.com/mavlink/MAVSDK/blob/v2.12/src/mavsdk_server/src/connection_initiator.h#L35
After plenty of head scratching I figured out what was going on, here is the fix: mavlink/MAVSDK#2417.
Fix coming with https://github.com/mavlink/MAVSDK/releases/tag/v2.12.9. @JonasVautherin do you mind making a Java release once the artifacts are ready? Thanks!
I guess I should wait for 2.12.10 with mavlink/MAVSDK#2421?
@JonasVautherin no, mavlink/MAVSDK#2421 is only on main, not v2.12.
That's done in mavsdk_server:2.1.3 👍
I see some crashes with "ReproMavsdkCrash" app even after bumping mavsdk-server to 2.1.3.
@daniel-sales ok I will have to try to reproduce that again.
I'm still curious about this @rayw-dronesense and @daniel-sales.
@julianoes On my end I'm starting only one server, on port 14551. Regarding the connection issue, it is reproducible with @rayw-dronesense's example app if you:
Here's the stack trace:
Ok, so this is a SIGABRT and happens when something is connected, not disconnected like before.
I can't reproduce this just yet. When you say "close the app", how do you do that? I'm using the emulator and hit stop, then I can just relaunch. My guess is that you run into a bind issue with port already bound by the previous run.
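To illustrate the kind of bind failure guessed at above, here is a minimal sketch using plain UDP sockets. It does not involve MAVSDK or mavsdk_server at all; the helper name `second_bind_error` is made up for this example, and it only shows what the OS typically reports when a second socket tries to bind a UDP port that is still held.

```python
import errno
import socket
from typing import Optional


def second_bind_error(host: str = "127.0.0.1") -> Optional[int]:
    """Bind a UDP socket, then try to bind a second one to the same port.

    Returns the errno of the second bind's failure, or None if it succeeded.
    (Hypothetical helper for illustration only, not part of MAVSDK.)
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind((host, 0))  # port 0: let the OS pick a free port
        port = first.getsockname()[1]
        try:
            # Same address and port, no SO_REUSEADDR/SO_REUSEPORT set.
            second.bind((host, port))
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    err = second_bind_error()
    print("second bind hit EADDRINUSE:", err == errno.EADDRINUSE)
```

If a previous run still held the port, a bind error like this would be expected to surface at server start rather than at teardown. Whether that is what happens in this issue is not established here.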
Hitting the Android Studio "stop" button or using the Android device UI to close the app. The app crashes on the next "start server" attempt.
That makes sense. The crashes go away when I revert PR#2388, so maybe the new autopilot discovery method is interfering with the automatic port release by the system when the app is destroyed?
I honestly don't see how that would be related. And I could not reproduce it that way I'm afraid.
I just ran more tests here and indeed, the issue is not reproducible with the px4-gazebo-headless simulator.
The crash happens on the "start server" command after a few start/stop loops, or on the next "start server" after closing the app without manually stopping the server. Stack trace:
Ok, any chance you can debug this yourself by building the .so file in debug mode and copying it in as per these instructions: That's essentially what I did to debug this. It should hopefully tell us which abort call is happening.
Environment
(Does not occur with 1.3.1)
Repro
Stack trace