PANIC at dsl_crypt.c:1450:spa_keystore_change_key_sync_impl() #15028

Closed
numinit opened this issue Jul 1, 2023 · 3 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

numinit commented Jul 1, 2023

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | NixOS |
| Distribution Version | 23.05 |
| Kernel Version | 6.1.31-xanmod1 #1-NixOS |
| Architecture | x86_64 |
| OpenZFS Version | zfs-2.1.11-1 / zfs-kmod-2.1.11-1 |

Describe the problem you're observing

Changing the key triggers a panic in the txg_sync thread, after which the ZFS command-line tools hang.

Describe how to reproduce the problem

Just ran: zfs change-key -o keyformat=hex -o keylocation=file:///keystore/foo.key foo/enc

Include any warning/errors/backtraces from the system logs

[426807.517977] VERIFY3(0 == spa_keystore_dsl_key_hold_dd(dp->dp_spa, dd, FTAG, &dck)) failed (0 == 13)
[426807.517981] PANIC at dsl_crypt.c:1450:spa_keystore_change_key_sync_impl()
[426807.517982] Showing stack for process 5733
[426807.517984] CPU: 45 PID: 5733 Comm: txg_sync Tainted: P        W  O       6.1.31-xanmod1 #1-NixOS
[426807.517986] Call Trace:
[426807.517988]  <TASK>
[426807.517992]  dump_stack_lvl+0x44/0x5c
[426807.517999]  spl_panic+0xf0/0x108 [spl]
[426807.518007]  ? spa_keystore_dsl_key_hold_dd.isra.0+0x101/0x260 [zfs]
[426807.518040]  spa_keystore_change_key_sync_impl+0x451/0x460 [zfs]
[426807.518068]  spa_keystore_change_key_sync_impl+0x227/0x460 [zfs]
[426807.518093]  spa_keystore_change_key_sync+0x1ae/0x480 [zfs]
[426807.518119]  dsl_sync_task_sync+0xa8/0xf0 [zfs]
[426807.518152]  dsl_pool_sync+0x404/0x520 [zfs]
[426807.518181]  spa_sync+0x565/0xf90 [zfs]
[426807.518215]  ? _raw_spin_lock+0x13/0x40
[426807.518217]  ? spa_txg_history_init_io+0x113/0x120 [zfs]
[426807.518251]  txg_sync_thread+0x227/0x3e0 [zfs]
[426807.518281]  ? txg_fini+0x260/0x260 [zfs]
[426807.518308]  ? __thread_exit+0x20/0x20 [spl]
[426807.518313]  thread_generic_wrapper+0x5a/0x70 [spl]
[426807.518317]  kthread+0xe9/0x110
[426807.518320]  ? kthread_complete_and_exit+0x20/0x20
[426807.518321]  ret_from_fork+0x22/0x30
[426807.518324]  </TASK>
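
For readers triaging similar reports: the `(0 == 13)` at the end of the VERIFY3 line is the value returned by `spa_keystore_dsl_key_hold_dd()`, and on Linux errno 13 is `EACCES`. A minimal sketch for pulling that code out of a log line and naming it follows. The regular expression and the `decode_verify3` helper are illustrative assumptions, not part of OpenZFS, and reading `EACCES` here as "the key for this dataset could not be held" is likewise an assumption rather than something the log states.

```python
import errno
import os
import re

# Hypothetical pattern for the tail of a failed VERIFY3 message, e.g.
# "VERIFY3(0 == f(...)) failed (0 == 13)". Written for this log only.
VERIFY3_RE = re.compile(
    r"VERIFY3\((?P<expr>.*)\) failed \((?P<lhs>-?\d+) == (?P<rhs>-?\d+)\)"
)


def decode_verify3(line):
    """Return (code, errno name, errno message) for a VERIFY3 line, or None."""
    m = VERIFY3_RE.search(line)
    if m is None:
        return None
    code = int(m.group("rhs"))  # the value the call actually returned
    return code, errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


# The line reported in this issue.
line = (
    "[426807.517977] VERIFY3(0 == spa_keystore_dsl_key_hold_dd("
    "dp->dp_spa, dd, FTAG, &dck)) failed (0 == 13)"
)
print(decode_verify3(line))
```

The errno name comes from the Python standard library's table for the machine running the script, so it should be run on the same kind of system that produced the log.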

Shortly followed by:

[427015.492769] INFO: task txg_sync:5733 blocked for more than 122 seconds.
[427015.492774]       Tainted: P        W  O       6.1.31-xanmod1 #1-NixOS
[427015.492775] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[427015.492776] task:txg_sync        state:D stack:0     pid:5733  ppid:2      flags:0x00004000
[427015.492779] Call Trace:
[427015.492780]  <TASK>
[427015.492787]  __schedule+0x31b/0x1270
[427015.492793]  ? ret_from_fork+0x22/0x30
[427015.492796]  schedule+0x5d/0xe0
[427015.492799]  spl_panic+0x106/0x108 [spl]
[427015.492807]  ? spa_keystore_dsl_key_hold_dd.isra.0+0x101/0x260 [zfs]
[427015.492841]  spa_keystore_change_key_sync_impl+0x451/0x460 [zfs]
[427015.492868]  spa_keystore_change_key_sync_impl+0x227/0x460 [zfs]
[427015.492893]  spa_keystore_change_key_sync+0x1ae/0x480 [zfs]
[427015.492919]  dsl_sync_task_sync+0xa8/0xf0 [zfs]
[427015.492951]  dsl_pool_sync+0x404/0x520 [zfs]
[427015.492981]  spa_sync+0x565/0xf90 [zfs]
[427015.493014]  ? _raw_spin_lock+0x13/0x40
[427015.493016]  ? spa_txg_history_init_io+0x113/0x120 [zfs]
[427015.493049]  txg_sync_thread+0x227/0x3e0 [zfs]
[427015.493079]  ? txg_fini+0x260/0x260 [zfs]
[427015.493106]  ? __thread_exit+0x20/0x20 [spl]
[427015.493111]  thread_generic_wrapper+0x5a/0x70 [spl]
[427015.493115]  kthread+0xe9/0x110
[427015.493117]  ? kthread_complete_and_exit+0x20/0x20
[427015.493119]  ret_from_fork+0x22/0x30
[427015.493121]  </TASK>
[427015.493255] INFO: task zfs:1338138 blocked for more than 122 seconds.
[427015.493256]       Tainted: P        W  O       6.1.31-xanmod1 #1-NixOS
[427015.493257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[427015.493258] task:zfs             state:D stack:0     pid:1338138 ppid:1337352 flags:0x00000006
[427015.493259] Call Trace:
[427015.493260]  <TASK>
[427015.493261]  __schedule+0x31b/0x1270
[427015.493262]  ? autoremove_wake_function+0x2e/0x60
[427015.493265]  schedule+0x5d/0xe0
[427015.493266]  io_schedule+0x42/0x70
[427015.493268]  cv_wait_common+0xaa/0x130 [spl]
[427015.493272]  ? sugov_start+0x140/0x140
[427015.493273]  txg_wait_synced_impl+0xcb/0x110 [zfs]
[427015.493301]  txg_wait_synced+0xc/0x40 [zfs]
[427015.493328]  dsl_sync_task_common+0x1c9/0x2a0 [zfs]
[427015.493357]  ? spa_keystore_change_key_sync_impl+0x460/0x460 [zfs]
[427015.493384]  ? dmu_objset_check_wkey_loaded+0x90/0x90 [zfs]
[427015.493409]  ? dmu_objset_check_wkey_loaded+0x90/0x90 [zfs]
[427015.493434]  ? spa_keystore_change_key_sync_impl+0x460/0x460 [zfs]
[427015.493459]  dsl_sync_task+0x16/0x20 [zfs]
[427015.493486]  spa_keystore_change_key+0x44/0x70 [zfs]
[427015.493511]  zfs_ioc_change_key+0x11f/0x140 [zfs]
[427015.493550]  zfsdev_ioctl_common+0x59f/0xa00 [zfs]
[427015.493582]  zfsdev_ioctl+0x4f/0xd0 [zfs]
[427015.493618]  __x64_sys_ioctl+0x90/0xd0
[427015.493621]  do_syscall_64+0x3a/0x90
[427015.493623]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[427015.493625] RIP: 0033:0x7efc51bb729f
[427015.493652] RSP: 002b:00007fffcc90d1a0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[427015.493654] RAX: ffffffffffffffda RBX: 0000000000005a4b RCX: 00007efc51bb729f
[427015.493654] RDX: 00007fffcc90d220 RSI: 0000000000005a4b RDI: 0000000000000006
[427015.493655] RBP: 00007fffcc910800 R08: 0000000000000000 R09: 00000000016b9000
[427015.493656] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffcc90d220
[427015.493656] R13: 0000000000005a4b R14: 00000000016e8200 R15: 0000000000000000
[427015.493657]  </TASK>
@numinit numinit added the Type: Defect Incorrect behavior (e.g. crash, hang) label Jul 1, 2023

numinit commented Jul 1, 2023

Oh dear, the entire pool is hung. :-(


numinit commented Jul 1, 2023

This is definitely reproducible. I will try 2.1.12 and see if the problem persists.


numinit commented Jul 4, 2023

I am happy to report that this was solved by destroying and re-receiving the affected datasets. I had performed a change-key and then an incremental receive on top of that, which made the data inaccessible.
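
One possible pre-flight check before running `zfs change-key` on a dataset that has been the target of incremental receives is to list the `keystatus` of the dataset and its descendants and look for any that report `unavailable`. The sketch below is an assumption-laden illustration: the helper names and the sample output are hypothetical, and nothing in this thread establishes that the check would have caught this particular panic.

```python
import subprocess


def keystatus_argv(dataset):
    """Build a `zfs get` command line that prints name<TAB>keystatus rows."""
    return [
        "zfs", "get", "-H", "-r",
        "-t", "filesystem,volume",
        "-o", "name,value",
        "keystatus", dataset,
    ]


def unavailable_keys(zfs_get_output):
    """Return dataset names whose keystatus column reads 'unavailable'."""
    missing = []
    for row in zfs_get_output.splitlines():
        if not row.strip():
            continue
        name, value = row.split("\t", 1)
        if value.strip() == "unavailable":
            missing.append(name)
    return missing


def check(dataset):
    """Run the command on a live system (needs the zfs CLI on PATH)."""
    out = subprocess.run(
        keystatus_argv(dataset), check=True, capture_output=True, text=True
    ).stdout
    return unavailable_keys(out)


# Hypothetical sample output, not taken from the affected system.
sample = "foo/enc\tavailable\nfoo/enc/child\tunavailable\nfoo/enc/plain\t-\n"
print(unavailable_keys(sample))
```

The parsing is kept separate from the `subprocess` call so it can be exercised without a pool; only `check("foo/enc")` touches the system.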

Related issues:

#12000

#12614

@numinit numinit closed this as completed Apr 1, 2024