From: Li Lingfeng <lilingfeng3@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6QMTU
--------------------------------
Recently, a null-ptr-deref problem occurred when submitting IO to an nvme disk:
[34432.226539] ==========================================================
[34432.226579] BUG: KASAN: null-ptr-deref in trace_event_raw_event_nvme_complete_rq+0x13c/0x270 [nvme_core]
[34432.226584] Read of size 2 at addr 0000000000000002 by task loop0/32
[34432.226586]
[34432.226594] CPU: 0 PID: 3242729 Comm: loop0 Kdump: loaded Tainted:
[34432.226598] Hardware name: Huawei TaiShan 2280 V2/BC82AMDD
[34432.226602] Call trace:
[34432.226610]  dump_backtrace+0x0/0x2fc
[34432.226615]  show_stack+0x20/0x30
[34432.226623]  dump_stack+0x104/0x17c
[34432.226630]  __kasan_report+0x138/0x140
[34432.226634]  kasan_report+0x44/0xdc
[34432.226639]  __asan_load2+0x90/0xd0
[34432.226662]  trace_event_raw_event_nvme_complete_rq+0x13c/0x270
[34432.226684]  nvme_complete_rq+0x228/0x480 [nvme_core]
[34432.226698]  nvme_pci_complete_rq+0x184/0x1b4 [nvme]
[34432.226706]  nvme_irq+0x270/0x500 [nvme]
[34432.226714]  __handle_irq_event_percpu+0x8c/0x324
[34432.226719]  handle_irq_event_percpu+0x88/0x11c
[34432.226724]  handle_irq_event+0x110/0x2b0
[34432.226729]  handle_fasteoi_irq+0x1e4/0x3f4
[34432.226734]  __handle_domain_irq+0xbc/0x130
[34432.226739]  gic_handle_irq+0x78/0x460
[34432.226743]  el1_irq+0xb8/0x140
[34432.226750]  __slab_alloc+0x38/0x70
[34432.226756]  kmem_cache_alloc+0x6b8/0x904
[34432.226762]  mempool_alloc_slab+0x3c/0x60
[34432.226766]  mempool_alloc+0xf0/0x440
[34432.226772]  bio_alloc_bioset+0x208/0x2f0
[34432.226899]  io_submit_init_bio+0x3c/0x190 [ext4]
[34432.226991]  ext4_bio_write_page+0x540/0xbd0 [ext4]
[34432.227082]  mpage_submit_page+0xb0/0x120 [ext4]
[34432.227173]  mpage_process_page_bufs+0x25c/0x2b4 [ext4]
[34432.227265]  mpage_prepare_extent_to_map+0x3b8/0x75c [ext4]
[34432.227356]  ext4_writepages+0x454/0xcb4 [ext4]
[34432.227361]  do_writepages+0xc4/0x1c0
...
This can be reproduced by the following steps:
1) modprobe nvme
2) echo nvme:* > /sys/kernel/debug/tracing/set_event
3) dd if=/dev/random of=/dev/nvmexxx bs=1M count=1024
Fix it by generating command_id with nvme_cid() in the trace event instead of
reading nvme_req(req)->cmd->common.command_id, since nvme_req(req)->cmd can be
NULL in some cases.
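As an illustration only, the user-space sketch below models why the faulting
address in the report is 0x2 and why a tag-based ID avoids the dereference. The
struct and function names (model_common_command, model_request, model_cid) are
hypothetical stand-ins, not kernel definitions, and the bit layout (a 4-bit
generation counter above a 12-bit tag) is an assumption about how nvme_cid()
composes the ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the first fields of an NVMe common command. */
struct model_common_command {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t command_id;
};

/* Hypothetical, cut-down request holding only what the sketch needs. */
struct model_request {
	int tag;                           /* block-layer tag */
	uint8_t genctr;                    /* generation counter */
	struct model_common_command *cmd;  /* may be NULL at completion time */
};

/*
 * With cmd == NULL, reading cmd->common.command_id is a 2-byte load at
 * offset 2, which matches "Read of size 2 at addr 0000000000000002".
 */
static_assert(offsetof(struct model_common_command, command_id) == 2,
	      "command_id is expected at offset 2");

/*
 * Assumed model of the idea behind nvme_cid(): build the ID from the
 * generation counter and the tag, without touching cmd at all.
 */
uint16_t model_cid(const struct model_request *rq)
{
	return (uint16_t)(((rq->genctr & 0xf) << 12) | (rq->tag & 0xfff));
}
```

In this model, a request with tag 0x2a, genctr 3 and a NULL cmd still yields a
valid ID (0x302a), because cmd is never read.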
Fixes: eae0bc99108a ("nvme: use command_id instead of req->tag in trace_nvme_complete_rq()")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
---
 drivers/nvme/host/trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/trace.h b/drivers/nvme/host/trace.h
index aa8b0f86b2be..d4eacb0b43ce 100644
--- a/drivers/nvme/host/trace.h
+++ b/drivers/nvme/host/trace.h
@@ -98,7 +98,7 @@ TRACE_EVENT(nvme_complete_rq,
 	    TP_fast_assign(
 		__entry->ctrl_id = nvme_req(req)->ctrl->instance;
 		__entry->qid = nvme_req_qid(req);
-		__entry->cid = nvme_req(req)->cmd->common.command_id;
+		__entry->cid = nvme_cid(req);
 		__entry->result = le64_to_cpu(nvme_req(req)->result.u64);
 		__entry->retries = nvme_req(req)->retries;
 		__entry->flags = nvme_req(req)->flags;