Hello openEuler Kernel SIG team,

One of our servers (openEuler 22.03 LTS, kernel 5.10.0-60.18.0.50.oe2203.x86_64, uptime 342 days) suffered a kernel crash, and we have collected the vmcore. The diagnostic report from the openEuler Witty intelligent diagnosis Agent suggests a possible kernel bug, and we would appreciate your help investigating.

Current status

We have filed an Issue in the openEuler community (link: https://gitcode.com/openeuler/kernel/issues/9047). The root cause and a fix are still to be determined.

Thank you!

Attachments

1. Kernel log excerpt

[29613041.914241] ------------[ cut here ]------------
[29613041.914241] kernel BUG at kernel/entry/common.c:427!
...
[29613041.914259] BUG: stack guard page was hit at 00000000089c7912
[29613041.914260] NMI watchdog: Watchdog detected hard LOCKUP on cpu 8
[29613041.914272] CPU: 8 PID: 1249797 Comm: lurker Kdump: loaded Not tainted 5.10.0-60.18.0.50.oe2203.x86_64 #1
[29613041.914273] RIP: 0010:native_queued_spin_lock_slowpath+0x5c/0x1b0
...
[29613041.914325] traps: PANIC: double fault, error_code: 0x0

2. Stack backtrace of PID 1249797 (lurker)

PID: 1249797 TASK: ffff9c26fc228000 CPU: 8 COMMAND: "lurker"
#0 [fffffe00001e2ee8] die at ffffffffa5e25b4a
#1 [fffffe00001e2f10] handle_stack_overflow at ffffffffa67dca7b
#2 [fffffe00001e2f28] exc_double_fault at ffffffffa6827a08
#3 [fffffe00001e2f50] asm_exc_double_fault at ffffffffa6a00b7e
    [exception RIP: bsearch+2]
...
#160 [ffffa93bcf3b7958] no_context at ffffffffa5e71c37
#161 [ffffa93bcf3b7990] __bad_area_nosemaphore at ffffffffa5e71e62
#162 [ffffa93bcf3b79d8] exc_page_fault at ffffffffa682a565
#163 [ffffa93bcf3b7a30] asm_exc_page_fault at ffffffffa6a00ace
    [exception RIP: vma_dump_size+39]
    RIP: ffffffffa61d4fa7 RSP: ffffa93bcf3b7ae8 RFLAGS: 00010286
    RAX: ffffffffb750aba0 RBX: ffff9c264e92f5c0 ...
#164 [ffffa93bcf3b7af8] dump_vma_snapshot at ffffffffa61d6461
#165 [ffffa93bcf3b7b60] elf_core_dump at ffffffffa61cefc9
#166 [ffffa93bcf3b7d40] do_coredump at ffffffffa61d5e81
#167 [ffffa93bcf3b7e38] get_signal at ffffffffa5eefd6f
#168 [ffffa93bcf3b7e88] arch_do_signal at ffffffffa5e21cda
#169 [ffffa93bcf3b7f10] exit_to_user_mode_loop at ffffffffa5f6f309
#170 [ffffa93bcf3b7f30] exit_to_user_mode_prepare at ffffffffa5f6f3a2
#171 [ffffa93bcf3b7f48] irqentry_exit_to_user_mode at ffffffffa682ac65
#172 [ffffa93bcf3b7f50] asm_exc_page_fault at ffffffffa6a00ace

3. Partial stacks from other CPUs (CPUs 1, 2, 7, 10, and 14 all show anomalies)

CPU 1 (postgres):
#3 [fffffe0000045f50] asm_exc_double_fault
    [exception RIP: do_error_trap+53]
...
#8 [ffffa93bcd274170] do_error_trap
#9 [ffffa93bcd2741b0] exc_invalid_op
#10 [ffffa93bcd2741d0] asm_exc_invalid_op
    [exception RIP: fixup_exception+35]

CPU 2 (log_server):
#3 [ffffa93bcef97ce0] native_queued_spin_lock_slowpath
...
#9 [ffffa93bcef97dc0] asm_exc_general_protection
    [exception RIP: do_renameat2+50]
    RIP: ffffffffa615e252

CPU 10 (df):
#2 [fffffe0000258f50] asm_exc_double_fault
    [exception RIP: mmap_base+155]
    RIP: ffffffffa5e732eb RSP: 00000000029de8c0

4. Analysis results from the intelligent Agent tool

ZDNS 张宝