- Kernel - mailweb.openeuler.org

[PATCH openEuler-1.0-LTS] PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free
by Ziming Du 25 Oct '25

25 Oct '25

From: Ding Hui <dinghui(a)sangfor.com.cn> mainline inclusion from mainline-v6.5 commit 456d8aa37d0f56fc9e985e812496e861dcd6f2f2 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICYQS7 CVE: CVE-2023-53446 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- Struct pcie_link_state->downstream is a pointer to the pci_dev of function 0. Previously we retained that pointer when removing function 0, and subsequent ASPM policy changes dereferenced it, resulting in a use-after-free warning from KASAN, e.g.: # echo 1 > /sys/bus/pci/devices/0000:03:00.0/remove # echo powersave > /sys/module/pcie_aspm/parameters/policy BUG: KASAN: slab-use-after-free in pcie_config_aspm_link+0x42d/0x500 Call Trace: kasan_report+0xae/0xe0 pcie_config_aspm_link+0x42d/0x500 pcie_aspm_set_policy+0x8e/0x1a0 param_attr_store+0x162/0x2c0 module_attr_store+0x3e/0x80 PCIe spec r6.0, sec 7.5.3.7, recommends that software program the same ASPM Control value in all functions of multi-function devices. Disable ASPM and free the pcie_link_state when any child function is removed so we can discard the dangling pcie_link_state->downstream pointer and maintain the same ASPM Control configuration for all functions. [bhelgaas: commit log and comment] Debugged-by: Zongquan Qin <qinzongquan(a)sangfor.com.cn> Suggested-by: Bjorn Helgaas <bhelgaas(a)google.com> Fixes: b5a0a9b59c81 ("PCI/ASPM: Read and set up L1 substate capabilities") Link: https://lore.kernel.org/r/20230507034057.20970-1-dinghui@sangfor.com.cn Signed-off-by: Ding Hui <dinghui(a)sangfor.com.cn> Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com> Conflicts: drivers/pci/pcie/aspm.c [Ziming Du: context conflict] Signed-off-by: Ziming Du <duziming2(a)huawei.com> --- drivers/pci/pcie/aspm.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 54fc2856995d5..0a21070a7557b 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -1004,22 +1004,25 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) down_read(&pci_bus_sem); mutex_lock(&aspm_lock); - /* - * All PCIe functions are in one slot, remove one function will remove - * the whole slot, so just wait until we are the last function left. - */ - if (!list_empty(&parent->subordinate->devices)) - goto out; link = parent->link_state; root = link->root; parent_link = link->parent; - /* All functions are removed, so just disable ASPM for the link */ + /* + * link->downstream is a pointer to the pci_dev of function 0. If + * we remove that function, the pci_dev is about to be deallocated, + * so we can't use link->downstream again. Free the link state to + * avoid this. + * + * If we're removing a non-0 function, it's possible we could + * retain the link state, but PCIe r6.0, sec 7.5.3.7, recommends + * programming the same ASPM Control value for all functions of + * multi-function devices, so disable ASPM for all of them. + */ pcie_config_aspm_link(link, 0); list_del(&link->sibling); list_del(&link->link); - /* Clock PM is for endpoint device */ free_link_state(link); /* Recheck latencies and configure upstream links */ @@ -1027,7 +1030,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) pcie_update_aspm_capable(root); pcie_config_aspm_path(parent_link); } -out: + mutex_unlock(&aspm_lock); up_read(&pci_bus_sem); } -- 2.43.0

2 1

[PATCH OLK-6.6 0/2] arm64/mpam: Fix MBWU monitor overflow handling
by Zeng Heng 25 Oct '25

25 Oct '25

Zeng Heng (2): arm64/mpam: Fix MBWU monitor overflow handling arm64/mpam: Print MPAM register operation for debug drivers/platform/mpam/mpam_devices.c | 33 ++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) -- 2.25.1

2 3

[PATCH OLK-6.6] cnic: Fix use-after-free bugs in cnic_delete_task
by Yao Kai 25 Oct '25

25 Oct '25

From: Duoming Zhou <duoming(a)zju.edu.cn> stable inclusion from stable-v6.6.108 commit 8eeb2091e72d75df8ceaa2172638d61b4cf8929a category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ID0U8X CVE: CVE-2025-39945 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- [ Upstream commit cfa7d9b1e3a8604afc84e9e51d789c29574fb216 ] The original code uses cancel_delayed_work() in cnic_cm_stop_bnx2x_hw(), which does not guarantee that the delayed work item 'delete_task' has fully completed if it was already running. Additionally, the delayed work item is cyclic, the flush_workqueue() in cnic_cm_stop_bnx2x_hw() only blocks and waits for work items that were already queued to the workqueue prior to its invocation. Any work items submitted after flush_workqueue() is called are not included in the set of tasks that the flush operation awaits. This means that after the cyclic work items have finished executing, a delayed work item may still exist in the workqueue. This leads to use-after-free scenarios where the cnic_dev is deallocated by cnic_free_dev(), while delete_task remains active and attempt to dereference cnic_dev in cnic_delete_task(). A typical race condition is illustrated below: CPU 0 (cleanup) | CPU 1 (delayed work callback) cnic_netdev_event() | cnic_stop_hw() | cnic_delete_task() cnic_cm_stop_bnx2x_hw() | ... cancel_delayed_work() | /* the queue_delayed_work() flush_workqueue() | executes after flush_workqueue()*/ | queue_delayed_work() cnic_free_dev(dev)//free | cnic_delete_task() //new instance | dev = cp->dev; //use Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure that the cyclic delayed work item is properly canceled and that any ongoing execution of the work item completes before the cnic_dev is deallocated. Furthermore, since cancel_delayed_work_sync() uses __flush_work(work, true) to synchronously wait for any currently executing instance of the work item to finish, the flush_workqueue() becomes redundant and should be removed. This bug was identified through static analysis. To reproduce the issue and validate the fix, I simulated the cnic PCI device in QEMU and introduced intentional delays — such as inserting calls to ssleep() within the cnic_delete_task() function — to increase the likelihood of triggering the bug. Fixes: fdf24086f475 ("cnic: Defer iscsi connection cleanup") Signed-off-by: Duoming Zhou <duoming(a)zju.edu.cn> Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> Signed-off-by: Sasha Levin <sashal(a)kernel.org> Signed-off-by: Yao Kai <yaokai34(a)huawei.com> --- drivers/net/ethernet/broadcom/cnic.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c index 7926aaef8f0c..ad2745c07c1a 100644 --- a/drivers/net/ethernet/broadcom/cnic.c +++ b/drivers/net/ethernet/broadcom/cnic.c @@ -4220,8 +4220,7 @@ static void cnic_cm_stop_bnx2x_hw(struct cnic_dev *dev) cnic_bnx2x_delete_wait(dev, 0); - cancel_delayed_work(&cp->delete_task); - flush_workqueue(cnic_wq); + cancel_delayed_work_sync(&cp->delete_task); if (atomic_read(&cp->iscsi_conn) != 0) netdev_warn(dev->netdev, "%d iSCSI connections not destroyed\n", -- 2.43.0

2 9

[openeuler:OLK-6.6 3039/3039] mm/memblock.c:1409:20: sparse: sparse: symbol 'memblock_alloc_range_nid_flags' was not declared. Should it be static?
by kernel test robot 25 Oct '25

25 Oct '25

tree: https://gitee.com/openeuler/kernel.git OLK-6.6 head: 4c00b18c122c4cdb2747e44b469af328b8cecb28 commit: 64018b291c1f49622c4b23b303364d760306d662 [3039/3039] mm/memblock: Introduce ability to alloc memory from specify memory region config: x86_64-randconfig-122-20251025 (https://download.01.org/0day-ci/archive/20251025/202510251138.4pyXr8W0-lkp@…) compiler: gcc-12 (Debian 12.4.0-5) 12.4.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251025/202510251138.4pyXr8W0-lkp@…) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp(a)intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202510251138.4pyXr8W0-lkp@intel.com/ sparse warnings: (new ones prefixed by >>) >> mm/memblock.c:1409:20: sparse: sparse: symbol 'memblock_alloc_range_nid_flags' was not declared. Should it be static? vim +/memblock_alloc_range_nid_flags +1409 mm/memblock.c 1386 1387 /** 1388 * memblock_alloc_range_nid_flags - allocate boot memory block with specify flag 1389 * @size: size of memory block to be allocated in bytes 1390 * @align: alignment of the region and block's size 1391 * @start: the lower bound of the memory region to allocate (phys address) 1392 * @end: the upper bound of the memory region to allocate (phys address) 1393 * @nid: nid of the free area to find, %NUMA_NO_NODE for any node 1394 * @exact_nid: control the allocation fall back to other nodes 1395 * @flags: alloc memory from specify memblock flag 1396 * 1397 * The allocation is performed from memory region limited by 1398 * memblock.current_limit if @end == %MEMBLOCK_ALLOC_ACCESSIBLE. 1399 * 1400 * If the specified node can not hold the requested memory and @exact_nid 1401 * is false, the allocation falls back to any node in the system. 1402 * 1403 * In addition, function sets the min_count to 0 using kmemleak_alloc_phys for 1404 * allocated boot memory block, so that it is never reported as leaks. 1405 * 1406 * Return: 1407 * Physical address of allocated memory block on success, %0 on failure. 1408 */ > 1409 phys_addr_t __init memblock_alloc_range_nid_flags(phys_addr_t size, 1410 phys_addr_t align, phys_addr_t start, 1411 phys_addr_t end, int nid, 1412 bool exact_nid, 1413 enum memblock_flags flags) 1414 { 1415 phys_addr_t found; 1416 1417 if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) 1418 nid = NUMA_NO_NODE; 1419 1420 if (!align) { 1421 /* Can't use WARNs this early in boot on powerpc */ 1422 dump_stack(); 1423 align = SMP_CACHE_BYTES; 1424 } 1425 1426 again: 1427 found = memblock_find_in_range_node(size, align, start, end, nid, 1428 flags); 1429 if (found && !memblock_reserve(found, size)) 1430 goto done; 1431 1432 if (nid != NUMA_NO_NODE && !exact_nid) { 1433 found = memblock_find_in_range_node(size, align, start, 1434 end, NUMA_NO_NODE, 1435 flags); 1436 if (found && !memblock_reserve(found, size)) 1437 goto done; 1438 } 1439 1440 if (flags & MEMBLOCK_MIRROR) { 1441 flags &= ~MEMBLOCK_MIRROR; 1442 pr_warn_ratelimited("Could not allocate %pap bytes of mirrored memory\n", 1443 &size); 1444 goto again; 1445 } 1446 1447 return 0; 1448 1449 done: 1450 /* 1451 * Skip kmemleak for those places like kasan_init() and 1452 * early_pgtable_alloc() due to high volume. 1453 */ 1454 if (end != MEMBLOCK_ALLOC_NOLEAKTRACE) 1455 /* 1456 * Memblock allocated blocks are never reported as 1457 * leaks. This is because many of these blocks are 1458 * only referred via the physical address which is 1459 * not looked up by kmemleak. 1460 */ 1461 kmemleak_alloc_phys(found, size, 0); 1462 1463 /* 1464 * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, 1465 * require memory to be accepted before it can be used by the 1466 * guest. 1467 * 1468 * Accept the memory of the allocated buffer. 1469 */ 1470 accept_memory(found, found + size); 1471 1472 return found; 1473 } 1474 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki

1 0

[PATCH OLK-6.6] arm64/mpam: Fix L2 MBWU monitor multiplexing issue
by Zeng Heng 25 Oct '25

25 Oct '25

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ID29QH -------------------------------- The MBWU monitor performs long-term cumulative statistics on bandwidth for a specified PARTID. However, the number of L2 physical monitors is very limited. To support the expanded quantities of PARTIDs and PMGs, the MBWU monitors need to multiplex. Therefore, when a monitor tracks a new PARTID or PMG, which means a change in monitor configuration is detected, the accumulated correction should set to zero. Fixes: 21ac9fc8f275 ("arm_mpam: Track bandwidth counter state for overflow and power management") Signed-off-by: Zeng Heng <zengheng4(a)huawei.com> --- drivers/platform/mpam/mpam_devices.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/platform/mpam/mpam_devices.c b/drivers/platform/mpam/mpam_devices.c index c2e58fc4c0e8..ae6791b05e8a 100644 --- a/drivers/platform/mpam/mpam_devices.c +++ b/drivers/platform/mpam/mpam_devices.c @@ -1011,7 +1011,7 @@ static void __ris_msmon_read(void *arg) bool reset_on_next_read = false; struct mpam_msc_ris *ris = m->ris; struct mpam_msc *msc = m->ris->msc; - struct msmon_mbwu_state *mbwu_state; + struct msmon_mbwu_state *mbwu_state = NULL; u32 mon_sel, ctl_val, flt_val, cur_ctl, cur_flt; lockdep_assert_held(&msc->lock); @@ -1043,8 +1043,14 @@ static void __ris_msmon_read(void *arg) config_mismatch = cur_flt != flt_val || cur_ctl != (ctl_val | MSMON_CFG_x_CTL_EN); - if (config_mismatch || reset_on_next_read) + if (config_mismatch || reset_on_next_read) { write_msmon_ctl_flt_vals(m, ctl_val, flt_val); + if (mbwu_state) { + mbwu_state->prev_val = 0; + mbwu_state->correction = 0; + mbwu_overflow = false; + } + } /* * Selects the monitor instance associated to the specified PARTID -- 2.25.1

2 1

[PATCH OLK-6.6 0/2] arm64/mpam: Fix MBWU monitor overflow handling
by Zeng Heng 25 Oct '25

25 Oct '25

Zeng Heng (2): arm64/mpam: Fix MBWU monitor overflow handling arm64/mpam: Print MPAM register operation for debug drivers/platform/mpam/mpam_devices.c | 29 ++++++++++++++++++++++++++-- 1 file changed, 27 insertions(+), 2 deletions(-) -- 2.25.1

2 3

[PATCH openEuler-1.0-LTS] posix-timers: Ensure timer ID search-loop limit is valid
by Jinjie Ruan 25 Oct '25

25 Oct '25

From: Thomas Gleixner <tglx(a)linutronix.de> stable inclusion from stable-v4.19.291 commit 9ea26a8494a0a9337e7415eafd6f3ed940327dc5 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ID33F2 CVE: CVE-2023-53728 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- [ Upstream commit 8ce8849dd1e78dadcee0ec9acbd259d239b7069f ] posix_timer_add() tries to allocate a posix timer ID by starting from the cached ID which was stored by the last successful allocation. This is done in a loop searching the ID space for a free slot one by one. The loop has to terminate when the search wrapped around to the starting point. But that's racy vs. establishing the starting point. That is read out lockless, which leads to the following problem: CPU0 CPU1 posix_timer_add() start = sig->posix_timer_id; lock(hash_lock); ... posix_timer_add() if (++sig->posix_timer_id < 0) start = sig->posix_timer_id; sig->posix_timer_id = 0; So CPU1 can observe a negative start value, i.e. -1, and the loop break never happens because the condition can never be true: if (sig->posix_timer_id == start) break; While this is unlikely to ever turn into an endless loop as the ID space is huge (INT_MAX), the racy read of the start value caught the attention of KCSAN and Dmitry unearthed that incorrectness. Rewrite it so that all id operations are under the hash lock. Reported-by: syzbot+5c54bd3eb218bb595aa9(a)syzkaller.appspotmail.com Reported-by: Dmitry Vyukov <dvyukov(a)google.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Reviewed-by: Frederic Weisbecker <frederic(a)kernel.org> Link: https://lore.kernel.org/r/87bkhzdn6g.ffs@tglx Signed-off-by: Sasha Levin <sashal(a)kernel.org> Conflicts: include/linux/sched/signal.h [ conflict because of kabi change. ] Signed-off-by: Jinjie Ruan <ruanjinjie(a)huawei.com> --- include/linux/sched/signal.h | 4 ++++ kernel/time/posix-timers.c | 31 ++++++++++++++++++------------- 2 files changed, 22 insertions(+), 13 deletions(-) diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h index 8bde85ac0e77..66afd36b7de9 100644 --- a/include/linux/sched/signal.h +++ b/include/linux/sched/signal.h @@ -128,7 +128,11 @@ struct signal_struct { #ifdef CONFIG_POSIX_TIMERS /* POSIX.1b Interval Timers */ +#ifdef __GENKSYMS__ int posix_timer_id; +#else + unsigned int next_posix_timer_id; +#endif struct list_head posix_timers; /* ITIMER_REAL timer for the process */ diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c index d86fdabc5748..9b22171f7339 100644 --- a/kernel/time/posix-timers.c +++ b/kernel/time/posix-timers.c @@ -159,25 +159,30 @@ static struct k_itimer *posix_timer_by_id(timer_t id) static int posix_timer_add(struct k_itimer *timer) { struct signal_struct *sig = current->signal; - int first_free_id = sig->posix_timer_id; struct hlist_head *head; - int ret = -ENOENT; + unsigned int cnt, id; - do { + /* + * FIXME: Replace this by a per signal struct xarray once there is + * a plan to handle the resulting CRIU regression gracefully. + */ + for (cnt = 0; cnt <= INT_MAX; cnt++) { spin_lock(&hash_lock); - head = &posix_timers_hashtable[hash(sig, sig->posix_timer_id)]; - if (!__posix_timers_find(head, sig, sig->posix_timer_id)) { + id = sig->next_posix_timer_id; + + /* Write the next ID back. Clamp it to the positive space */ + sig->next_posix_timer_id = (id + 1) & INT_MAX; + + head = &posix_timers_hashtable[hash(sig, id)]; + if (!__posix_timers_find(head, sig, id)) { hlist_add_head_rcu(&timer->t_hash, head); - ret = sig->posix_timer_id; + spin_unlock(&hash_lock); + return id; } - if (++sig->posix_timer_id < 0) - sig->posix_timer_id = 0; - if ((sig->posix_timer_id == first_free_id) && (ret == -ENOENT)) - /* Loop over all possible ids completed */ - ret = -EAGAIN; spin_unlock(&hash_lock); - } while (ret == -ENOENT); - return ret; + } + /* POSIX return code when no timer ID could be allocated */ + return -EAGAIN; } static inline void unlock_timer(struct k_itimer *timr, unsigned long flags) -- 2.34.1

2 1

[PATCH OLK-6.6] cpu/SMT: recover global num_threads after disable smt switch fail
by Bowen You 25 Oct '25

25 Oct '25

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ID3F7S CVE: NA ------------------------------------------------- When executing `echo off > /sys/devices/system/cpu/smt/control`, if `cpuhp_smt_disable` fails, a "resource busy" error is reported. However, the global variable `cpu_smt_num_threads` is incorrectly updated to the target value despite the failure. If SMT is subsequently disabled again, the command will not take effect due to a state machine inconsistency. To recover functionality, `echo on > /sys/devices/system/cpu/smt/control` must be executed manually. This commit adds exception handling to restore `cpu_smt_num_threads` to its original value when `cpuhp_smt_disable` fails, ensuring proper state management. Fixes: 7f48405c3c34 ("cpu/SMT: Allow enabling partial SMT states via sysfs") Signed-off-by: Bowen You <youbowen2(a)huawei.com> --- kernel/cpu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/cpu.c b/kernel/cpu.c index 88e81b8b1700..e5ebc498e328 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -2934,6 +2934,9 @@ __store_smt_control(struct device *dev, struct device_attribute *attr, else if (num_threads < orig_threads || force_off) ret = cpuhp_smt_disable(ctrlval); + if (ret) + cpu_smt_num_threads = orig_threads; + unlock_device_hotplug(); return ret ? ret : count; } -- 2.34.1

2 1

[PATCH OLK-5.10] i40e: fix idx validation in config queues msg
by Fanhua Li 25 Oct '25

25 Oct '25

From: Lukasz Czapnik <lukasz.czapnik(a)intel.com> stable inclusion from stable-v5.10.245 commit f5f91d164af22e7147130ef8bebbdb28d8ecc6e2 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ID2290 CVE: CVE-2025-39971 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- [ Upstream commit f1ad24c5abe1eaef69158bac1405a74b3c365115 ] Ensure idx is within range of active/initialized TCs when iterating over vf->ch[idx] in i40e_vc_config_queues_msg(). Fixes: c27eac48160d ("i40e: Enable ADq and create queue channel/s on VF") Cc: stable(a)vger.kernel.org Signed-off-by: Lukasz Czapnik <lukasz.czapnik(a)intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel(a)intel.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Tested-by: Kamakshi Nellore <nellorex.kamakshi(a)intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com> [ Adjust context ] Signed-off-by: Sasha Levin <sashal(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Fanhua Li <lifanhua5(a)huawei.com> --- drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c index 4f23243bbfbb6..3ce6d18cc2e5e 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -2326,7 +2326,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg) } if (vf->adq_enabled) { - if (idx >= ARRAY_SIZE(vf->ch)) { + if (idx >= vf->num_tc) { aq_ret = I40E_ERR_NO_AVAILABLE_VSI; goto error_param; } @@ -2347,7 +2347,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg) * to its appropriate VSIs based on TC mapping */ if (vf->adq_enabled) { - if (idx >= ARRAY_SIZE(vf->ch)) { + if (idx >= vf->num_tc) { aq_ret = I40E_ERR_NO_AVAILABLE_VSI; goto error_param; } -- 2.43.0

2 1

[PATCH OLK-6.6] fs/notify: call exportfs_encode_fid with s_umount
by Wang Zhaolong 25 Oct '25

25 Oct '25

From: Jakub Acs <acsjakub(a)amazon.de> mainline inclusion from mainline-v6.18 commit a7c4bb43bfdc2b9f06ee9d036028ed13a83df42a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ID2VH3 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… -------------------------------- Calling intotify_show_fdinfo() on fd watching an overlayfs inode, while the overlayfs is being unmounted, can lead to dereferencing NULL ptr. This issue was found by syzkaller. Race Condition Diagram: Thread 1 Thread 2 -------- -------- generic_shutdown_super() shrink_dcache_for_umount sb->s_root = NULL | | vfs_read() | inotify_fdinfo() | * inode get from mark * | show_mark_fhandle(m, inode) | exportfs_encode_fid(inode, ..) | ovl_encode_fh(inode, ..) | ovl_check_encode_origin(inode) | * deref i_sb->s_root * | | v fsnotify_sb_delete(sb) Which then leads to: [ 32.133461] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI [ 32.134438] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] [ 32.135032] CPU: 1 UID: 0 PID: 4468 Comm: systemd-coredum Not tainted 6.17.0-rc6 #22 PREEMPT(none) <snip registers, unreliable trace> [ 32.143353] Call Trace: [ 32.143732] ovl_encode_fh+0xd5/0x170 [ 32.144031] exportfs_encode_inode_fh+0x12f/0x300 [ 32.144425] show_mark_fhandle+0xbe/0x1f0 [ 32.145805] inotify_fdinfo+0x226/0x2d0 [ 32.146442] inotify_show_fdinfo+0x1c5/0x350 [ 32.147168] seq_show+0x530/0x6f0 [ 32.147449] seq_read_iter+0x503/0x12a0 [ 32.148419] seq_read+0x31f/0x410 [ 32.150714] vfs_read+0x1f0/0x9e0 [ 32.152297] ksys_read+0x125/0x240 IOW ovl_check_encode_origin derefs inode->i_sb->s_root, after it was set to NULL in the unmount path. Fix it by protecting calling exportfs_encode_fid() from show_mark_fhandle() with s_umount lock. This form of fix was suggested by Amir in [1]. [1]: https://lore.kernel.org/all/CAOQ4uxhbDwhb+2Brs1UdkoF0a3NSdBAOQPNfEHjahrgoKJ… Fixes: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Jakub Acs <acsjakub(a)amazon.de> Cc: Jan Kara <jack(a)suse.cz> Cc: Amir Goldstein <amir73il(a)gmail.com> Cc: Miklos Szeredi <miklos(a)szeredi.hu> Cc: Christian Brauner <brauner(a)kernel.org> Cc: linux-unionfs(a)vger.kernel.org Cc: linux-fsdevel(a)vger.kernel.org Cc: linux-kernel(a)vger.kernel.org Cc: stable(a)vger.kernel.org Signed-off-by: Jan Kara <jack(a)suse.cz> Conflicts: fs/notify/fdinfo.c [Conflicts with mainline commit 4d69c58ef2e4 ("fsnotify: Avoid -Wflex-array-member-not-at-end warning").] Signed-off-by: Wang Zhaolong <wangzhaolong(a)huaweicloud.com> --- fs/notify/fdinfo.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c index 26655572975d..1aa7de55094c 100644 --- a/fs/notify/fdinfo.c +++ b/fs/notify/fdinfo.c @@ -15,10 +15,11 @@ #include "inotify/inotify.h" #include "fanotify/fanotify.h" #include "fdinfo.h" #include "fsnotify.h" +#include "../internal.h" #if defined(CONFIG_PROC_FS) #if defined(CONFIG_INOTIFY_USER) || defined(CONFIG_FANOTIFY) @@ -48,11 +49,16 @@ static void show_mark_fhandle(struct seq_file *m, struct inode *inode) int size, ret, i; f.handle.handle_bytes = sizeof(f.pad); size = f.handle.handle_bytes >> 2; + if (!super_trylock_shared(inode->i_sb)) + return; + ret = exportfs_encode_fid(inode, (struct fid *)f.handle.f_handle, &size); + up_read(&inode->i_sb->s_umount); + if ((ret == FILEID_INVALID) || (ret < 0)) return; f.handle.handle_type = ret; f.handle.handle_bytes = size * sizeof(u32); -- 2.34.3

2 1