[PATCH OLK-6.6 0/7] merge HULK-6.6 patches into OLK-6.6

Kaixiong Yu

11 Jan 2025 11 Jan '25

7:59 p.m.

merge HULK-6.6 CVE patches into OLK-6.6 CVE-No as listed: CVE-2024-56715 CVE-2024-56610 CVE-2024-56617 CVE-2024-53105 CVE-2024-53109 CVE-2024-53056 Brett Creeley (1): ionic: Fix netdev notifier unregister on failure Dan Carpenter (1): drm/mediatek: Fix potential NULL dereference in mtk_crtc_destroy() Hajime Tazaki (1): nommu: pass NULL argument to vma_iter_prealloc() Lorenzo Stoakes (1): mm: refactor map_deny_write_exec() Marco Elver (1): kcsan: Turn report_filterlist_lock into a raw_spinlock Ricardo Neri (1): cacheinfo: Allocate memory during CPU hotplug if not done from the primary CPU Roman Gushchin (1): mm: page_alloc: move mlocked flag clearance into free_pages_prepare() drivers/base/cacheinfo.c | 14 ++-- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 3 +- .../net/ethernet/pensando/ionic/ionic_lif.c | 4 +- include/linux/mman.h | 21 +++++- kernel/kcsan/debugfs.c | 74 +++++++++---------- mm/mmap.c | 2 +- mm/mprotect.c | 2 +- mm/nommu.c | 2 +- mm/page_alloc.c | 15 ++++ mm/swap.c | 14 ---- 10 files changed, 83 insertions(+), 68 deletions(-) -- 2.34.1

Show replies by date

Kaixiong Yu

11 Jan 11 Jan

7:59 p.m.

New subject: [PATCH OLK-6.6 1/7] ionic: Fix netdev notifier unregister on failure

From: Brett Creeley <brett.creeley@amd.com> stable inclusion from stable-v6.6.68 commit da5736f516a664a9e1ff74902663c64c423045d2 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBEG3Z CVE: CVE-2024-56715 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 9590d32e090ea2751e131ae5273859ca22f5ac14 ] If register_netdev() fails, then the driver leaks the netdev notifier. Fix this by calling ionic_lif_unregister() on register_netdev() failure. This will also call ionic_lif_unregister_phc() if it has already been registered. Fixes: 30b87ab4c0b3 ("ionic: remove lif list concept") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20241212213157.12212-2-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Liu Shixin <liushixin2@huawei.com> --- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c index 9d724d228b83..bc7c5cd38596 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c @@ -3736,8 +3736,8 @@ int ionic_lif_register(struct ionic_lif *lif) /* only register LIF0 for now */ err = register_netdev(lif->netdev); if (err) { - dev_err(lif->ionic->dev, "Cannot register net device, aborting\n"); - ionic_lif_unregister_phc(lif); + dev_err(lif->ionic->dev, "Cannot register net device: %d, aborting\n", err); + ionic_lif_unregister(lif); return err; } -- 2.34.1

Kaixiong Yu

7:59 p.m.

New subject: [PATCH OLK-6.6 2/7] kcsan: Turn report_filterlist_lock into a raw_spinlock

From: Marco Elver <elver@google.com> mainline inclusion from mainline-v6.13-rc1 commit 59458fa4ddb47e7891c61b4a928d13d5f5b00aa0 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBEAOU CVE: CVE-2024-56610 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- Ran Xiaokai reports that with a KCSAN-enabled PREEMPT_RT kernel, we can see splats like: | BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1 | preempt_count: 10002, expected: 0 | RCU nest depth: 0, expected: 0 | no locks held by swapper/1/0. | irq event stamp: 156674 | hardirqs last enabled at (156673): [<ffffffff81130bd9>] do_idle+0x1f9/0x240 | hardirqs last disabled at (156674): [<ffffffff82254f84>] sysvec_apic_timer_interrupt+0x14/0xc0 | softirqs last enabled at (0): [<ffffffff81099f47>] copy_process+0xfc7/0x4b60 | softirqs last disabled at (0): [<0000000000000000>] 0x0 | Preemption disabled at: | [<ffffffff814a3e2a>] paint_ptr+0x2a/0x90 | CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.11.0+ #3 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 | Call Trace: | <IRQ> | dump_stack_lvl+0x7e/0xc0 | dump_stack+0x1d/0x30 | __might_resched+0x1a2/0x270 | rt_spin_lock+0x68/0x170 | kcsan_skip_report_debugfs+0x43/0xe0 | print_report+0xb5/0x590 | kcsan_report_known_origin+0x1b1/0x1d0 | kcsan_setup_watchpoint+0x348/0x650 | __tsan_unaligned_write1+0x16d/0x1d0 | hrtimer_interrupt+0x3d6/0x430 | __sysvec_apic_timer_interrupt+0xe8/0x3a0 | sysvec_apic_timer_interrupt+0x97/0xc0 | </IRQ> On a detected data race, KCSAN's reporting logic checks if it should filter the report. That list is protected by the report_filterlist_lock *non-raw* spinlock which may sleep on RT kernels. Since KCSAN may report data races in any context, convert it to a raw_spinlock. This requires being careful about when to allocate memory for the filter list itself which can be done via KCSAN's debugfs interface. Concurrent modification of the filter list via debugfs should be rare: the chosen strategy is to optimistically pre-allocate memory before the critical section and discard if unused. Link: https://lore.kernel.org/all/20240925143154.2322926-1-ranxiaokai627@163.com/ Reported-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Tested-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> --- kernel/kcsan/debugfs.c | 74 ++++++++++++++++++++---------------------- 1 file changed, 36 insertions(+), 38 deletions(-) diff --git a/kernel/kcsan/debugfs.c b/kernel/kcsan/debugfs.c index 1d1d1b0e4248..f4623910fb1f 100644 --- a/kernel/kcsan/debugfs.c +++ b/kernel/kcsan/debugfs.c @@ -46,14 +46,8 @@ static struct { int used; /* number of elements used */ bool sorted; /* if elements are sorted */ bool whitelist; /* if list is a blacklist or whitelist */ -} report_filterlist = { - .addrs = NULL, - .size = 8, /* small initial size */ - .used = 0, - .sorted = false, - .whitelist = false, /* default is blacklist */ -}; -static DEFINE_SPINLOCK(report_filterlist_lock); +} report_filterlist; +static DEFINE_RAW_SPINLOCK(report_filterlist_lock); /* * The microbenchmark allows benchmarking KCSAN core runtime only. To run @@ -110,7 +104,7 @@ bool kcsan_skip_report_debugfs(unsigned long func_addr) return false; func_addr -= offset; /* Get function start */ - spin_lock_irqsave(&report_filterlist_lock, flags); + raw_spin_lock_irqsave(&report_filterlist_lock, flags); if (report_filterlist.used == 0) goto out; @@ -127,7 +121,7 @@ bool kcsan_skip_report_debugfs(unsigned long func_addr) ret = !ret; out: - spin_unlock_irqrestore(&report_filterlist_lock, flags); + raw_spin_unlock_irqrestore(&report_filterlist_lock, flags); return ret; } @@ -135,9 +129,9 @@ static void set_report_filterlist_whitelist(bool whitelist) { unsigned long flags; - spin_lock_irqsave(&report_filterlist_lock, flags); + raw_spin_lock_irqsave(&report_filterlist_lock, flags); report_filterlist.whitelist = whitelist; - spin_unlock_irqrestore(&report_filterlist_lock, flags); + raw_spin_unlock_irqrestore(&report_filterlist_lock, flags); } /* Returns 0 on success, error-code otherwise. */ @@ -145,6 +139,9 @@ static ssize_t insert_report_filterlist(const char *func) { unsigned long flags; unsigned long addr = kallsyms_lookup_name(func); + unsigned long *delay_free = NULL; + unsigned long *new_addrs = NULL; + size_t new_size = 0; ssize_t ret = 0; if (!addr) { @@ -152,32 +149,33 @@ static ssize_t insert_report_filterlist(const char *func) return -ENOENT; } - spin_lock_irqsave(&report_filterlist_lock, flags); +retry_alloc: + /* + * Check if we need an allocation, and re-validate under the lock. Since + * the report_filterlist_lock is a raw, cannot allocate under the lock. + */ + if (data_race(report_filterlist.used == report_filterlist.size)) { + new_size = (report_filterlist.size ?: 4) * 2; + delay_free = new_addrs = kmalloc_array(new_size, sizeof(unsigned long), GFP_KERNEL); + if (!new_addrs) + return -ENOMEM; + } - if (report_filterlist.addrs == NULL) { - /* initial allocation */ - report_filterlist.addrs = - kmalloc_array(report_filterlist.size, - sizeof(unsigned long), GFP_ATOMIC); - if (report_filterlist.addrs == NULL) { - ret = -ENOMEM; - goto out; - } - } else if (report_filterlist.used == report_filterlist.size) { - /* resize filterlist */ - size_t new_size = report_filterlist.size * 2; - unsigned long *new_addrs = - krealloc(report_filterlist.addrs, - new_size * sizeof(unsigned long), GFP_ATOMIC); - - if (new_addrs == NULL) { - /* leave filterlist itself untouched */ - ret = -ENOMEM; - goto out; + raw_spin_lock_irqsave(&report_filterlist_lock, flags); + if (report_filterlist.used == report_filterlist.size) { + /* Check we pre-allocated enough, and retry if not. */ + if (report_filterlist.used >= new_size) { + raw_spin_unlock_irqrestore(&report_filterlist_lock, flags); + kfree(new_addrs); /* kfree(NULL) is safe */ + delay_free = new_addrs = NULL; + goto retry_alloc; } + if (report_filterlist.used) + memcpy(new_addrs, report_filterlist.addrs, report_filterlist.used * sizeof(unsigned long)); + delay_free = report_filterlist.addrs; /* free the old list */ + report_filterlist.addrs = new_addrs; /* switch to the new list */ report_filterlist.size = new_size; - report_filterlist.addrs = new_addrs; } /* Note: deduplicating should be done in userspace. */ @@ -185,9 +183,9 @@ static ssize_t insert_report_filterlist(const char *func) kallsyms_lookup_name(func); report_filterlist.sorted = false; -out: - spin_unlock_irqrestore(&report_filterlist_lock, flags); + raw_spin_unlock_irqrestore(&report_filterlist_lock, flags); + kfree(delay_free); return ret; } @@ -204,13 +202,13 @@ static int show_info(struct seq_file *file, void *v) } /* show filter functions, and filter type */ - spin_lock_irqsave(&report_filterlist_lock, flags); + raw_spin_lock_irqsave(&report_filterlist_lock, flags); seq_printf(file, "\n%s functions: %s\n", report_filterlist.whitelist ? "whitelisted" : "blacklisted", report_filterlist.used == 0 ? "none" : ""); for (i = 0; i < report_filterlist.used; ++i) seq_printf(file, " %ps\n", (void *)report_filterlist.addrs[i]); - spin_unlock_irqrestore(&report_filterlist_lock, flags); + raw_spin_unlock_irqrestore(&report_filterlist_lock, flags); return 0; } -- 2.34.1

Kaixiong Yu

7:59 p.m.

New subject: [PATCH OLK-6.6 3/7] cacheinfo: Allocate memory during CPU hotplug if not done from the primary CPU

From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> stable inclusion from stable-v6.6.66 commit 23b5908b11b77ff8d7b8f7b8f11cbab2e1f4bfc2 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBEANA CVE: CVE-2024-56617 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- commit b3fce429a1e030b50c1c91351d69b8667eef627b upstream. Commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU") adds functionality that architectures can use to optionally allocate and build cacheinfo early during boot. Commit 6539cffa9495 ("cacheinfo: Add arch specific early level initializer") lets secondary CPUs correct (and reallocate memory) cacheinfo data if needed. If the early build functionality is not used and cacheinfo does not need correction, memory for cacheinfo is never allocated. x86 does not use the early build functionality. Consequently, during the cacheinfo CPU hotplug callback, last_level_cache_is_valid() attempts to dereference a NULL pointer: BUG: kernel NULL pointer dereference, address: 0000000000000100 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not present page PGD 0 P4D 0 Oops: 0000 [#1] PREEPMT SMP NOPTI CPU: 0 PID 19 Comm: cpuhp/0 Not tainted 6.4.0-rc2 #1 RIP: 0010: last_level_cache_is_valid+0x95/0xe0a Allocate memory for cacheinfo during the cacheinfo CPU hotplug callback if not done earlier. Moreover, before determining the validity of the last-level cache info, ensure that it has been allocated. Simply checking for non-zero cache_leaves() is not sufficient, as some architectures (e.g., Intel processors) have non-zero cache_leaves() before allocation. Dereferencing NULL cacheinfo can occur in update_per_cpu_data_slice_size(). This function iterates over all online CPUs. However, a CPU may have come online recently, but its cacheinfo may not have been allocated yet. While here, remove an unnecessary indentation in allocate_cache_info(). [ bp: Massage. ] Fixes: 6539cffa9495 ("cacheinfo: Add arch specific early level initializer") Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Radu Rendec <rrendec@redhat.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: stable@vger.kernel.org # 6.3+ Link: https://lore.kernel.org/r/20241128002247.26726-2-ricardo.neri-calderon@linux... Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ze Zuo <zuoze1@huawei.com> --- drivers/base/cacheinfo.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c index 0618827f73fe..c5c61ccbd8ce 100644 --- a/drivers/base/cacheinfo.c +++ b/drivers/base/cacheinfo.c @@ -58,7 +58,7 @@ bool last_level_cache_is_valid(unsigned int cpu) { struct cacheinfo *llc; - if (!cache_leaves(cpu)) + if (!cache_leaves(cpu) || !per_cpu_cacheinfo(cpu)) return false; llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1); @@ -511,11 +511,9 @@ int __weak populate_cache_leaves(unsigned int cpu) return -ENOENT; } -static inline -int allocate_cache_info(int cpu) +static inline int allocate_cache_info(int cpu) { - per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu), - sizeof(struct cacheinfo), GFP_ATOMIC); + per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu), sizeof(struct cacheinfo), GFP_ATOMIC); if (!per_cpu_cacheinfo(cpu)) { cache_leaves(cpu) = 0; return -ENOMEM; @@ -587,7 +585,11 @@ static inline int init_level_allocate_ci(unsigned int cpu) */ ci_cacheinfo(cpu)->early_ci_levels = false; - if (cache_leaves(cpu) <= early_leaves) + /* + * Some architectures (e.g., x86) do not use early initialization. + * Allocate memory now in such case. + */ + if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu)) return 0; kfree(per_cpu_cacheinfo(cpu)); -- 2.34.1

Kaixiong Yu

7:59 p.m.

New subject: [PATCH OLK-6.6 4/7] mm: page_alloc: move mlocked flag clearance into free_pages_prepare()

From: Roman Gushchin <roman.gushchin@linux.dev> stable inclusion from stable-v6.11.10 commit 7873d11911cd1d21e25c354eb130d8c3b5cb3ca5 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB8IUV CVE: CVE-2024-53105 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- commit 66edc3a5894c74f8887c8af23b97593a0dd0df4d upstream. Syzbot reported a bad page state problem caused by a page being freed using free_page() still having a mlocked flag at free_pages_prepare() stage: BUG: Bad page state in process syz.5.504 pfn:61f45 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x61f45 flags: 0xfff00000080204(referenced|workingset|mlocked|node=0|zone=1|lastcpupid=0x7ff) raw: 00fff00000080204 0000000000000000 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), pid 8443, tgid 8442 (syz.5.504), ts 201884660643, free_ts 201499827394 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537 prep_new_page mm/page_alloc.c:1545 [inline] get_page_from_freelist+0x303f/0x3190 mm/page_alloc.c:3457 __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4733 alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 kvm_coalesced_mmio_init+0x1f/0xf0 virt/kvm/coalesced_mmio.c:99 kvm_create_vm virt/kvm/kvm_main.c:1235 [inline] kvm_dev_ioctl_create_vm virt/kvm/kvm_main.c:5488 [inline] kvm_dev_ioctl+0x12dc/0x2240 virt/kvm/kvm_main.c:5530 __do_compat_sys_ioctl fs/ioctl.c:1007 [inline] __se_compat_sys_ioctl+0x510/0xc90 fs/ioctl.c:950 do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline] __do_fast_syscall_32+0xb4/0x110 arch/x86/entry/common.c:386 do_fast_syscall_32+0x34/0x80 arch/x86/entry/common.c:411 entry_SYSENTER_compat_after_hwframe+0x84/0x8e page last free pid 8399 tgid 8399 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1108 [inline] free_unref_folios+0xf12/0x18d0 mm/page_alloc.c:2686 folios_put_refs+0x76c/0x860 mm/swap.c:1007 free_pages_and_swap_cache+0x5c8/0x690 mm/swap_state.c:335 __tlb_batch_free_encoded_pages mm/mmu_gather.c:136 [inline] tlb_batch_pages_flush mm/mmu_gather.c:149 [inline] tlb_flush_mmu_free mm/mmu_gather.c:366 [inline] tlb_flush_mmu+0x3a3/0x680 mm/mmu_gather.c:373 tlb_finish_mmu+0xd4/0x200 mm/mmu_gather.c:465 exit_mmap+0x496/0xc40 mm/mmap.c:1926 __mmput+0x115/0x390 kernel/fork.c:1348 exit_mm+0x220/0x310 kernel/exit.c:571 do_exit+0x9b2/0x28e0 kernel/exit.c:926 do_group_exit+0x207/0x2c0 kernel/exit.c:1088 __do_sys_exit_group kernel/exit.c:1099 [inline] __se_sys_exit_group kernel/exit.c:1097 [inline] __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1097 x64_sys_call+0x2634/0x2640 arch/x86/include/generated/asm/syscalls_64.h:232 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Modules linked in: CPU: 0 UID: 0 PID: 8442 Comm: syz.5.504 Not tainted 6.12.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 bad_page+0x176/0x1d0 mm/page_alloc.c:501 free_page_is_bad mm/page_alloc.c:918 [inline] free_pages_prepare mm/page_alloc.c:1100 [inline] free_unref_page+0xed0/0xf20 mm/page_alloc.c:2638 kvm_destroy_vm virt/kvm/kvm_main.c:1327 [inline] kvm_put_kvm+0xc75/0x1350 virt/kvm/kvm_main.c:1386 kvm_vcpu_release+0x54/0x60 virt/kvm/kvm_main.c:4143 __fput+0x23f/0x880 fs/file_table.c:431 task_work_run+0x24f/0x310 kernel/task_work.c:239 exit_task_work include/linux/task_work.h:43 [inline] do_exit+0xa2f/0x28e0 kernel/exit.c:939 do_group_exit+0x207/0x2c0 kernel/exit.c:1088 __do_sys_exit_group kernel/exit.c:1099 [inline] __se_sys_exit_group kernel/exit.c:1097 [inline] __ia32_sys_exit_group+0x3f/0x40 kernel/exit.c:1097 ia32_sys_call+0x2624/0x2630 arch/x86/include/generated/asm/syscalls_32.h:253 do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline] __do_fast_syscall_32+0xb4/0x110 arch/x86/entry/common.c:386 do_fast_syscall_32+0x34/0x80 arch/x86/entry/common.c:411 entry_SYSENTER_compat_after_hwframe+0x84/0x8e RIP: 0023:0xf745d579 Code: Unable to access opcode bytes at 0xf745d54f. RSP: 002b:00000000f75afd6c EFLAGS: 00000206 ORIG_RAX: 00000000000000fc RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 00000000ffffff9c RDI: 00000000f744cff4 RBP: 00000000f717ae61 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> The problem was originally introduced by commit b109b87050df ("mm/munlock: replace clear_page_mlock() by final clearance"): it was focused on handling pagecache and anonymous memory and wasn't suitable for lower level get_page()/free_page() API's used for example by KVM, as with this reproducer. Fix it by moving the mlocked flag clearance down to free_page_prepare(). The bug itself if fairly old and harmless (aside from generating these warnings), aside from a small memory leak - "bad" pages are stopped from being allocated again. Link: https://lkml.kernel.org/r/20241106195354.270757-1-roman.gushchin@linux.dev Fixes: b109b87050df ("mm/munlock: replace clear_page_mlock() by final clearance") Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Reported-by: syzbot+e985d3026c4fd041578e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6729f475.050a0220.701a.0019.GAE@google.com Acked-by: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: mm/page_alloc.c [Context differences.] Signed-off-by: Lulu Yao <yaolulu5@huawei.com> --- mm/page_alloc.c | 15 +++++++++++++++ mm/swap.c | 14 -------------- 2 files changed, 15 insertions(+), 14 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 36cd38df0614..a58453ed565c 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1076,12 +1076,27 @@ __always_inline bool free_pages_prepare(struct page *page, bool skip_kasan_poison = should_skip_kasan_poison(page); bool init = want_init_on_free(); bool compound = PageCompound(page); + struct folio *folio = page_folio(page); VM_BUG_ON_PAGE(PageTail(page), page); trace_mm_page_free(page, order); kmsan_free_page(page, order); + /* + * In rare cases, when truncation or holepunching raced with + * munlock after VM_LOCKED was cleared, Mlocked may still be + * found set here. This does not indicate a problem, unless + * "unevictable_pgs_cleared" appears worryingly large. + */ + if (unlikely(folio_test_mlocked(folio))) { + long nr_pages = folio_nr_pages(folio); + + __folio_clear_mlocked(folio); + zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages); + count_vm_events(UNEVICTABLE_PGCLEARED, nr_pages); + } + if (unlikely(PageHWPoison(page)) && !order) { /* * Do not let hwpoison pages hit pcplists/buddy diff --git a/mm/swap.c b/mm/swap.c index 9bc530395949..3e5f88baf61e 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -82,20 +82,6 @@ static void __page_cache_release(struct folio *folio, struct lruvec **lruvecp, lruvec_del_folio(*lruvecp, folio); __folio_clear_lru_flags(folio); } - - /* - * In rare cases, when truncation or holepunching raced with - * munlock after VM_LOCKED was cleared, Mlocked may still be - * found set here. This does not indicate a problem, unless - * "unevictable_pgs_cleared" appears worryingly large. - */ - if (unlikely(folio_test_mlocked(folio))) { - long nr_pages = folio_nr_pages(folio); - - __folio_clear_mlocked(folio); - zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages); - count_vm_events(UNEVICTABLE_PGCLEARED, nr_pages); - } } /* -- 2.34.1

Kaixiong Yu

7:59 p.m.

New subject: [PATCH OLK-6.6 5/7] nommu: pass NULL argument to vma_iter_prealloc()

From: Hajime Tazaki <thehajime@gmail.com> stable inclusion from stable-v6.6.63 commit 8bbf0ab631cdf1dade6745f137cff98751e6ced7 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB8IUS CVE: CVE-2024-53109 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- commit 247d720b2c5d22f7281437fd6054a138256986ba upstream. When deleting a vma entry from a maple tree, it has to pass NULL to vma_iter_prealloc() in order to calculate internal state of the tree, but it passed a wrong argument. As a result, nommu kernels crashed upon accessing a vma iterator, such as acct_collect() reading the size of vma entries after do_munmap(). This commit fixes this issue by passing a right argument to the preallocation call. Link: https://lkml.kernel.org/r/20241108222834.3625217-1-thehajime@gmail.com Fixes: b5df09226450 ("mm: set up vma iterator for vma_iter_prealloc() calls") Signed-off-by: Hajime Tazaki <thehajime@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> --- mm/nommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/nommu.c b/mm/nommu.c index 8c6686176ebd..10b42c531e0c 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -582,7 +582,7 @@ static int delete_vma_from_mm(struct vm_area_struct *vma) VMA_ITERATOR(vmi, vma->vm_mm, vma->vm_start); vma_iter_config(&vmi, vma->vm_start, vma->vm_end); - if (vma_iter_prealloc(&vmi, vma)) { + if (vma_iter_prealloc(&vmi, NULL)) { pr_warn("Allocation of vma tree for process %d failed\n", current->pid); return -ENOMEM; -- 2.34.1

Kaixiong Yu

7:59 p.m.

New subject: [PATCH OLK-6.6 6/7] drm/mediatek: Fix potential NULL dereference in mtk_crtc_destroy()

From: Dan Carpenter <dan.carpenter@linaro.org> mainline inclusion from mainline-v6.12-rc6 commit 4018651ba5c409034149f297d3dd3328b91561fd category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB5KR0 CVE: CVE-2024-53056 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- In mtk_crtc_create(), if the call to mbox_request_channel() fails then we set the "mtk_crtc->cmdq_client.chan" pointer to NULL. In that situation, we do not call cmdq_pkt_create(). During the cleanup, we need to check if the "mtk_crtc->cmdq_client.chan" is NULL first before calling cmdq_pkt_destroy(). Calling cmdq_pkt_destroy() is unnecessary if we didn't call cmdq_pkt_create() and it will result in a NULL pointer dereference. Fixes: 7627122fd1c0 ("drm/mediatek: Add cmdq_handle in mtk_crtc") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/cc537bd6-837f-4c85-a37b... Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Conflicts: drivers/gpu/drm/mediatek/mtk_crtc.c drivers/gpu/drm/mediatek/mtk_drm_crtc.c [Ma Wupeng: file and function has been renamed, no real changes] Signed-off-by: Ma Wupeng <mawupeng1@huawei.com> --- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c index 659112da47b6..a0dbdfd8a68f 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c @@ -163,10 +163,9 @@ static void mtk_drm_crtc_destroy(struct drm_crtc *crtc) mtk_mutex_put(mtk_crtc->mutex); #if IS_REACHABLE(CONFIG_MTK_CMDQ) - mtk_drm_cmdq_pkt_destroy(&mtk_crtc->cmdq_handle); - if (mtk_crtc->cmdq_client.chan) { mbox_free_channel(mtk_crtc->cmdq_client.chan); + mtk_drm_cmdq_pkt_destroy(&mtk_crtc->cmdq_handle); mtk_crtc->cmdq_client.chan = NULL; } #endif -- 2.34.1

Kaixiong Yu

7:59 p.m.

New subject: [PATCH OLK-6.6 7/7] mm: refactor map_deny_write_exec()

From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> mainline inclusion from mainline-v6.12-rc7 commit 0fb4a7ad270b3b209e510eb9dc5b07bf02b7edaf category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB7051 CVE: CVE-2024-53056 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- Refactor the map_deny_write_exec() to not unnecessarily require a VMA parameter but rather to accept VMA flags parameters, which allows us to use this function early in mmap_region() in a subsequent commit. While we're here, we refactor the function to be more readable and add some additional documentation. Link: https://lkml.kernel.org/r/6be8bb59cd7c68006ebb006eb9d8dc27104b1f70.173022466... Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Jann Horn <jannh@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Jann Horn <jannh@google.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Brown <broonie@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Conflicts: mm/vma.h [Ma Wupeng: comment conflict, no real function involved] Signed-off-by: Ma Wupeng <mawupeng1@huawei.com> --- include/linux/mman.h | 21 ++++++++++++++++++--- mm/mmap.c | 2 +- mm/mprotect.c | 2 +- 3 files changed, 20 insertions(+), 5 deletions(-) diff --git a/include/linux/mman.h b/include/linux/mman.h index bcb201ab7a41..8ddca62d6460 100644 --- a/include/linux/mman.h +++ b/include/linux/mman.h @@ -188,16 +188,31 @@ static inline bool arch_memory_deny_write_exec_supported(void) * * d) mmap(PROT_READ | PROT_EXEC) * mmap(PROT_READ | PROT_EXEC | PROT_BTI) + * + * This is only applicable if the user has set the Memory-Deny-Write-Execute + * (MDWE) protection mask for the current process. + * + * @old specifies the VMA flags the VMA originally possessed, and @new the ones + * we propose to set. + * + * Return: false if proposed change is OK, true if not ok and should be denied. */ -static inline bool map_deny_write_exec(struct vm_area_struct *vma, unsigned long vm_flags) +static inline bool map_deny_write_exec(unsigned long old, unsigned long new) { + /* If MDWE is disabled, we have nothing to deny. */ if (!test_bit(MMF_HAS_MDWE, ¤t->mm->flags)) return false; - if ((vm_flags & VM_EXEC) && (vm_flags & VM_WRITE)) + /* If the new VMA is not executable, we have nothing to deny. */ + if (!(new & VM_EXEC)) + return false; + + /* Under MDWE we do not accept newly writably executable VMAs... */ + if (new & VM_WRITE) return true; - if (!(vma->vm_flags & VM_EXEC) && (vm_flags & VM_EXEC)) + /* ...nor previously non-executable VMAs becoming executable. */ + if (!(old & VM_EXEC)) return true; return false; diff --git a/mm/mmap.c b/mm/mmap.c index dfa3d2bfe289..e68ace34716f 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2900,7 +2900,7 @@ static unsigned long __mmap_region(struct mm_struct *mm, vma_set_anonymous(vma); } - if (map_deny_write_exec(vma, vma->vm_flags)) { + if (map_deny_write_exec(vma->vm_flags, vma->vm_flags)) { error = -EACCES; goto close_and_free_vma; } diff --git a/mm/mprotect.c b/mm/mprotect.c index ed0e21a05339..ed08f87e39c4 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -793,7 +793,7 @@ static int do_mprotect_pkey(unsigned long start, size_t len, break; } - if (map_deny_write_exec(vma, newflags)) { + if (map_deny_write_exec(vma->vm_flags, newflags)) { error = -EACCES; break; } -- 2.34.1

patchwork bot

8:05 p.m.

反馈：您发送到kernel@openeuler.org的补丁/补丁集，已成功转换为PR！ PR链接地址： https://gitee.com/openeuler/kernel/pulls/14845 邮件列表地址：https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/W... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/14845 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/W...

185

Age (days ago)

185

Last active (days ago)

List overview

8 comments

2 participants

participants (2)

Kaixiong Yu
patchwork bot

[PATCH OLK-6.6 0/7] merge HULK-6.6 patches into OLK-6.6

tags

participants (2)