[YSQL] Segmentation fault in odyssey`od_cron_stat_cb #25718

Open

shishir2001-yb opened this issue Jan 22, 2025 · 0 comments
Labels: area/ysql (Yugabyte SQL (YSQL)), kind/bug (This issue is a bug), priority/medium (Medium priority issue), qa_stress (Bugs identified via Stress automation), QA (QA filed bugs)

Comments


shishir2001-yb commented Jan 22, 2025

Jira Link: DB-14991

Description

Version: 2024.2.1.0-b162

With some CONNECTION MANAGER g-flags set, we encountered the following core dump in the Wait-on-Conflict toggle on/off stress test.

(lldb) target create "/home/yugabyte/yb-software/yugabyte-2024.2.1.0-b162-centos-x86_64/bin/odyssey" --core "/home/yugabyte/cores/core_3232589_1736754992_!home!yugabyte!yb-software!yugabyte-2024.2.1.0-b162-centos-x86_64!bin!odyssey"
Core file '/home/yugabyte/cores/core_3232589_1736754992_!home!yugabyte!yb-software!yugabyte-2024.2.1.0-b162-centos-x86_64!bin!odyssey' (x86_64) was loaded.
(lldb) bt all
* thread #1, name = 'odyssey', stop reason = signal SIGSEGV
  * frame #0: 0x0000560270da3b64 odyssey`od_cron_stat_cb(route=0x0000378a3fda0480, current=<unavailable>, avg=0x00007fde9463ae30, argv=0x00007fde9463aec8) at cron.c:51:4
    frame #1: 0x0000560270da3751 odyssey`od_cron at route_pool.h:186:4
    frame #2: 0x0000560270da3147 odyssey`od_cron [inlined] od_router_stat(router=0x00007ffd48d216c8, prev_time_us=9341342902, callback=(odyssey`od_cron_stat_cb at cron.c:41), argv=0x00007fde9463aec8) at router.c:370:2
    frame #3: 0x0000560270da3147 odyssey`od_cron [inlined] od_cron_stat(cron=0x00007ffd48d21610) at cron.c:248:2
    frame #4: 0x0000560270da3147 odyssey`od_cron(arg=0x00007ffd48d21610) at cron.c:344:5
    frame #5: 0x0000560270dd0062 odyssey`mm_scheduler_main(arg=0x0000378a3fd69200) at scheduler.c:17:2
    frame #6: 0x0000560270dd02c7 odyssey`mm_context_runner at context.c:28:2
  thread #2, stop reason = signal 0
    frame #0: 0x00007fde953896cd libpthread.so.0`__GI___pthread_timedjoin_ex + 397
    frame #1: 0x0000560270dd128d odyssey`machine_wait [inlined] mm_thread_join(thread=0x0000378a3fd742a8) at thread.c:36:7
    frame #2: 0x0000560270dd127f odyssey`machine_wait(machine_id=<unavailable>) at machine.c:146:7
    frame #3: 0x0000560270dc35c6 odyssey`main [inlined] od_instance_main(instance=0x00007ffd48d21838, argc=<unavailable>, argv=<unavailable>) at instance.c:327:9
    frame #4: 0x0000560270dc326a odyssey`main [inlined] odyssey_main(argc=<unavailable>, argv=<unavailable>) at main.c:18:11
    frame #5: 0x0000560270dc326a odyssey`main(argc=<unavailable>, argv=<unavailable>) at odyssey_main.cc:6:12
    frame #6: 0x00007fde94ff5d85 libc.so.6`__libc_start_main + 229
    frame #7: 0x0000560270d94ace odyssey`_start + 46
  thread #3, stop reason = signal 0
    frame #0: 0x00007fde9539182d libpthread.so.0`__lll_lock_wait + 29
    frame #1: 0x00007fde9538aad9 libpthread.so.0`__pthread_mutex_lock + 89
    frame #2: 0x0000560270d9bcca odyssey`od_router_route(router=<unavailable>, client=<unavailable>) at router.c:399:2
    frame #3: 0x0000560270da49fe odyssey`od_auth_frontend [inlined] yb_auth_via_auth_backend(client=0x0000378a3f2b1400) at frontend.c:2800:11
    frame #4: 0x0000560270da4963 odyssey`od_auth_frontend(client=0x0000378a3f2b1400) at auth.c:669:8
    frame #5: 0x0000560270db3896 odyssey`od_frontend(arg=0x0000378a3f2b1400) at frontend.c:2357:8
    frame #6: 0x0000560270dd0062 odyssey`mm_scheduler_main(arg=0x0000378a3fd68240) at scheduler.c:17:2
    frame #7: 0x0000560270dd02c7 odyssey`mm_context_runner at context.c:28:2
  thread #4, stop reason = signal 0
    frame #0: 0x00007fde950ea247 libc.so.6`epoll_wait + 87
    frame #1: 0x0000560270dd1659 odyssey`mm_epoll_step(poll=0x0000378a3fd73940, timeout=<unavailable>) at epoll.c:70:10
    frame #2: 0x0000560270dd1036 odyssey`machine_main [inlined] mm_loop_step(loop=0x0000378a3fd74700) at loop.c:64:7
    frame #3: 0x0000560270dd1026 odyssey`machine_main(arg=0x0000378a3fd74500) at machine.c:59:3
    frame #4: 0x00007fde953881ca libpthread.so.0`start_thread + 234
    frame #5: 0x00007fde94ff4e73 libc.so.6`__clone + 67
  thread #5, stop reason = signal 0
    frame #0: 0x00007fde9539182d libpthread.so.0`__lll_lock_wait + 29
    frame #1: 0x00007fde9538aad9 libpthread.so.0`__pthread_mutex_lock + 89
    frame #2: 0x0000560270dc465a odyssey`od_system_shutdown(system=0x00007ffd48d219d0, instance=0x00007ffd48d21838) at sighandler.c:96:2
    frame #3: 0x0000560270dc50da odyssey`od_system_signal_handler(arg=0x00007ffd48d219d0) at sighandler.c:143:4
    frame #4: 0x0000560270dd0062 odyssey`mm_scheduler_main(arg=0x0000378a3fd693b0) at scheduler.c:17:2
    frame #5: 0x0000560270dd02c7 odyssey`mm_context_runner at context.c:28:2
  thread #6, stop reason = signal 0
    frame #0: 0x00007fde950ea247 libc.so.6`epoll_wait + 87
    frame #1: 0x0000560270dd1659 odyssey`mm_epoll_step(poll=0x0000378a3fd72480, timeout=<unavailable>) at epoll.c:70:10
    frame #2: 0x0000560270dd1036 odyssey`machine_main [inlined] mm_loop_step(loop=0x0000378a3fd74e80) at loop.c:64:7
    frame #3: 0x0000560270dd1026 odyssey`machine_main(arg=0x0000378a3fd74c80) at machine.c:59:3
    frame #4: 0x00007fde953881ca libpthread.so.0`start_thread + 234
    frame #5: 0x00007fde94ff4e73 libc.so.6`__clone + 67
  thread #7, stop reason = signal 0
    frame #0: 0x00007fde950ea247 libc.so.6`epoll_wait + 87
    frame #1: 0x0000560270dd1659 odyssey`mm_epoll_step(poll=0x0000378a3fd73960, timeout=<unavailable>) at epoll.c:70:10
    frame #2: 0x0000560270dd1036 odyssey`machine_main [inlined] mm_loop_step(loop=0x0000378a3fd74200) at loop.c:64:7
    frame #3: 0x0000560270dd1026 odyssey`machine_main(arg=0x0000378a3fd74000) at machine.c:59:3
    frame #4: 0x00007fde953881ca libpthread.so.0`start_thread + 234
    frame #5: 0x00007fde94ff4e73 libc.so.6`__clone + 67
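For triage, a few follow-up debugger commands might help pin down what od_cron_stat_cb dereferences at cron.c:51. This is a hedged sketch rather than output from the actual session; it assumes the same binary and core are still loaded in lldb and that debug info for the route type is available:

(lldb) frame select 0
(lldb) frame variable
(lldb) p *route
(lldb) disassemble --pc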

Test details:


        1. Create a cluster with the required g-flags
        2. Create 2 databases (1 colocated and 1 non-colocated)
        3. Start the SqlBankWaitOnConflict workload on both databases with RC and RR
            isolation levels
        4. Start the SQL_READ_COMMITTED workload on both databases
        5. Start a loop and run it for 4 hours (a sketch of this loop follows the list)
            a. Disable the wait_queues g-flag if enabled, or vice versa
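A minimal sketch of the step-5 toggle loop, assuming the g-flag is flipped at runtime with yb-ts-cli set_flag on a single tserver; the server address, toggle interval, and exact flag name (enable_wait_queues here) are placeholders and not taken from the original test:

import subprocess
import time

TSERVER = "127.0.0.1:9100"       # placeholder tserver address
FLAG = "enable_wait_queues"      # placeholder name for the wait-queue g-flag
DURATION_S = 4 * 60 * 60         # run the loop for 4 hours, as in step 5
INTERVAL_S = 300                 # placeholder toggle interval (5 minutes)

def set_flag(value: str) -> None:
    # yb-ts-cli set_flag changes a g-flag at runtime; --force allows flags
    # that are not marked runtime-settable to be changed anyway.
    subprocess.run(
        ["yb-ts-cli", "--server_address=" + TSERVER,
         "set_flag", FLAG, value, "--force"],
        check=True,
    )

deadline = time.monotonic() + DURATION_S
enabled = True
while time.monotonic() < deadline:
    enabled = not enabled        # disable if enabled, or vice versa (step 5a)
    set_flag("true" if enabled else "false")
    time.sleep(INTERVAL_S)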

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@shishir2001-yb shishir2001-yb added area/ysql Yugabyte SQL (YSQL) QA QA filed bugs qa_stress Bugs identified via Stress automation status/awaiting-triage Issue awaiting triage labels Jan 22, 2025
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jan 22, 2025
@yugabyte-ci yugabyte-ci assigned rahulb-yb and unassigned suranjan Jan 22, 2025
@yugabyte-ci yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Jan 22, 2025