mailweb.openeuler.org
Kernel

kernel@openeuler.org

  • 11 participants
  • 23152 discussions
[PATCH v2 OLK-6.6 0/5] Reuse SUBSYS for xcu and freezer to preserve KABI
by Liu Kai 02 Apr '26

Reuse SUBSYS for xcu and freezer to preserve KABI

Liu Kai (5):
  xSched/cgroup: reuse SUBSYS for xcu and freezer to preserve KABI
  xSched/cgroup: make xcu.stat invisible at root cgroup
  cgroup: sync CGROUP_SUBSYS_COUNT limit with upstream to 16
  xSched: enable CONFIG_CGROUP_XCU and CONFIG_XCU_SCHED_CFS in arm64/x86 defconfig
  xSched: update xSched manual for xcu cmdline enable option

 Documentation/scheduler/xsched.md      |   6 +-
 arch/arm64/configs/openeuler_defconfig |   3 +-
 arch/x86/configs/openeuler_defconfig   |   3 +-
 include/linux/cgroup_subsys.h          |   8 +-
 include/linux/freezer.h                |  24 ++++
 kernel/cgroup/cgroup.c                 |   2 +-
 kernel/cgroup/legacy_freezer.c         |  25 ++--
 kernel/xsched/cgroup.c                 | 166 +++++++++++++++++++++++--
 8 files changed, 209 insertions(+), 28 deletions(-)

--
2.34.1
[PATCH OLK-6.6] mm: thp: deny THP for files on anonymous inodes
by Ze Zuo 02 Apr '26

From: Deepanshu Kartikey <kartikey406(a)gmail.com>

stable inclusion
from stable-v6.12.78
commit 08de46a75f91a6661bc1ce0a93614f4bc313c581
category: bugfix
bugzilla: https://atomgit.com/src-openeuler/kernel/issues/14006
CVE: CVE-2026-23375

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…

--------------------------------

commit dd085fe9a8ebfc5d10314c60452db38d2b75e609 upstream.

file_thp_enabled() incorrectly allows THP for files on anonymous inodes
(e.g. guest_memfd and secretmem). These files are created via
alloc_file_pseudo(), which does not call get_write_access() and leaves
inode->i_writecount at 0. Combined with S_ISREG(inode->i_mode) being
true, they appear as read-only regular files when
CONFIG_READ_ONLY_THP_FOR_FS is enabled, making them eligible for THP
collapse.

Anonymous inodes can never pass the inode_is_open_for_write() check
since their i_writecount is never incremented through the normal VFS
open path. The right thing to do is to exclude them from THP eligibility
altogether, since CONFIG_READ_ONLY_THP_FOR_FS was designed for real
filesystem files (e.g. shared libraries), not for pseudo-filesystem
inodes.

For guest_memfd, this allows khugepaged and MADV_COLLAPSE to create
large folios in the page cache via the collapse path, but the
guest_memfd fault handler does not support large folios. This triggers
WARN_ON_ONCE(folio_test_large(folio)) in kvm_gmem_fault_user_mapping().

For secretmem, collapse_file() tries to copy page contents through the
direct map, but secretmem pages are removed from the direct map. This
can result in a kernel crash:

    BUG: unable to handle page fault for address: ffff88810284d000
    RIP: 0010:memcpy_orig+0x16/0x130
    Call Trace:
     collapse_file
     hpage_collapse_scan_file
     madvise_collapse

Secretmem is not affected by the crash on upstream as the memory failure
recovery handles the failed copy gracefully, but it still triggers
confusing false memory failure reports:

    Memory failure: 0x106d96f: recovery action for clean unevictable LRU page: Recovered

Check IS_ANON_FILE(inode) in file_thp_enabled() to deny THP for all
anonymous inode files.

Link: https://syzkaller.appspot.com/bug?extid=33a04338019ac7e43a44
Link: https://lore.kernel.org/linux-mm/CAEvNRgHegcz3ro35ixkDw39ES8=U6rs6S7iP0gkR9…
Link: https://lkml.kernel.org/r/20260214001535.435626-1-kartikey406@gmail.com
Fixes: 7fbb5e188248 ("mm: remove VM_EXEC requirement for THP eligibility")
Signed-off-by: Deepanshu Kartikey <Kartikey406(a)gmail.com>
Reported-by: syzbot+33a04338019ac7e43a44(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=33a04338019ac7e43a44
Tested-by: syzbot+33a04338019ac7e43a44(a)syzkaller.appspotmail.com
Tested-by: Lance Yang <lance.yang(a)linux.dev>
Acked-by: David Hildenbrand (Arm) <david(a)kernel.org>
Reviewed-by: Barry Song <baohua(a)kernel.org>
Reviewed-by: Ackerley Tng <ackerleytng(a)google.com>
Tested-by: Ackerley Tng <ackerleytng(a)google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Dev Jain <dev.jain(a)arm.com>
Cc: Fangrui Song <i(a)maskray.me>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Nico Pache <npache(a)redhat.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[ Ackerley: we don't have IS_ANON_FILE() yet. As guest_memfd does not
  apply yet, simply check for secretmem explicitly. ]
Signed-off-by: Ackerley Tng <ackerleytng(a)google.com>
Reviewed-by: David Hildenbrand (Arm) <david(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Ze Zuo <zuoze1(a)huawei.com>
---
 include/linux/huge_mm.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 328b1fbb134c..86565da790e0 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -7,6 +7,7 @@
 
 #include <linux/fs.h> /* only for vma_is_dax() */
 #include <linux/kobject.h>
+#include <linux/secretmem.h>
 
 vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf);
 int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
@@ -270,6 +271,9 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
 
 	inode = vma->vm_file->f_inode;
 
+	if (secretmem_mapping(inode->i_mapping))
+		return false;
+
 	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) &&
 	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
 }
--
2.25.1
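The eligibility change in this patch can be modelled in user space. The sketch below is only an illustration, not kernel code: `struct mock_inode` and both helper functions are hypothetical names, and each field stands in for the kernel predicate named in its comment.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the inode properties file_thp_enabled() inspects. */
struct mock_inode {
	bool is_regular;   /* models S_ISREG(inode->i_mode) */
	int  writecount;   /* models inode->i_writecount */
	bool is_secretmem; /* models secretmem_mapping(inode->i_mapping) */
};

/* Pre-patch rule: any regular file that nobody has open for write qualifies. */
static bool thp_allowed_before(const struct mock_inode *inode, bool ro_thp_for_fs)
{
	return ro_thp_for_fs && inode->writecount <= 0 && inode->is_regular;
}

/* Post-patch rule: secretmem is rejected before the generic checks run. */
static bool thp_allowed_after(const struct mock_inode *inode, bool ro_thp_for_fs)
{
	if (inode->is_secretmem)
		return false;
	return thp_allowed_before(inode, ro_thp_for_fs);
}
```

A secretmem inode has `is_regular` set and `writecount` at 0, so the first helper accepts it and the second rejects it, while an ordinary read-only file such as a shared library is accepted by both.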
[PATCH OLK-6.6] net: usb: pegasus: validate USB endpoints
by Ze Zuo 02 Apr '26

From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>

stable inclusion
from stable-v6.6.130
commit 7f8505c7ce3f186ef9d2495f3c0bd6ad6fce999f
category: bugfix
bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13993
CVE: CVE-2026-23290

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…

--------------------------------

commit 11de1d3ae5565ed22ef1f89d73d8f2d00322c699 upstream.

The pegasus driver should validate that the device it is probing has the
proper number and types of USB endpoints it is expecting before it binds
to it. If a malicious device were to not have the same urbs the driver
will crash later on when it blindly accesses these endpoints.

Cc: Petko Manolov <petkan(a)nucleusys.com>
Cc: stable <stable(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Link: https://patch.msgid.link/2026022347-legibly-attest-cc5c@gregkh
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Ze Zuo <zuoze1(a)huawei.com>
---
 drivers/net/usb/pegasus.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
index c514483134f0..7cc949460edc 100644
--- a/drivers/net/usb/pegasus.c
+++ b/drivers/net/usb/pegasus.c
@@ -804,8 +804,19 @@ static void unlink_all_urbs(pegasus_t *pegasus)
 
 static int alloc_urbs(pegasus_t *pegasus)
 {
+	static const u8 bulk_ep_addr[] = {
+		1 | USB_DIR_IN,
+		2 | USB_DIR_OUT,
+		0};
+	static const u8 int_ep_addr[] = {
+		3 | USB_DIR_IN,
+		0};
 	int res = -ENOMEM;
 
+	if (!usb_check_bulk_endpoints(pegasus->intf, bulk_ep_addr) ||
+	    !usb_check_int_endpoints(pegasus->intf, int_ep_addr))
+		return -ENODEV;
+
 	pegasus->rx_urb = usb_alloc_urb(0, GFP_KERNEL);
 	if (!pegasus->rx_urb) {
 		return res;
@@ -1146,6 +1157,7 @@ static int pegasus_probe(struct usb_interface *intf,
 
 	pegasus = netdev_priv(net);
 	pegasus->dev_index = dev_index;
+	pegasus->intf = intf;
 
 	res = alloc_urbs(pegasus);
 	if (res < 0) {
@@ -1157,7 +1169,6 @@ static int pegasus_probe(struct usb_interface *intf,
 
 	INIT_DELAYED_WORK(&pegasus->carrier_check, check_carrier);
 
-	pegasus->intf = intf;
 	pegasus->usb = dev;
 	pegasus->net = net;
 
--
2.25.1
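The idea behind the zero-terminated address lists in this patch can be sketched in user space as follows. This is only a model under stated assumptions: `struct mock_endpoint`, `enum ep_type`, `check_endpoints()` and the `DIR_*` constants are hypothetical names, and the function is not the kernel's `usb_check_bulk_endpoints()` implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DIR_IN  0x80 /* direction bit of an endpoint address, as in USB_DIR_IN */
#define DIR_OUT 0x00 /* as in USB_DIR_OUT */

enum ep_type { EP_BULK, EP_INT };

/* Hypothetical, simplified view of one endpoint descriptor. */
struct mock_endpoint {
	uint8_t addr;      /* endpoint number ORed with the direction bit */
	enum ep_type type; /* transfer type the device advertises */
};

/*
 * Return true only if every address in the zero-terminated list `wanted`
 * is present among the device's endpoints with the expected transfer type.
 */
static bool check_endpoints(const struct mock_endpoint *eps, size_t n,
			    const uint8_t *wanted, enum ep_type type)
{
	for (; *wanted; wanted++) {
		bool found = false;

		for (size_t i = 0; i < n; i++) {
			if (eps[i].addr == *wanted && eps[i].type == type) {
				found = true;
				break;
			}
		}
		if (!found)
			return false; /* the driver would refuse to bind (-ENODEV) */
	}
	return true;
}
```

A device that omits the bulk OUT endpoint fails the check up front, so the model never reaches the point where the driver would have submitted a URB to an endpoint that does not exist.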
[PATCH OLK-5.10] net: usb: pegasus: validate USB endpoints
by Ze Zuo 02 Apr '26

From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>

mainline inclusion
from mainline-v7.0-rc2
commit 11de1d3ae5565ed22ef1f89d73d8f2d00322c699
category: bugfix
bugzilla: CVE-2026-23290
CVE: CVE-2026-23290

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

The pegasus driver should validate that the device it is probing has the
proper number and types of USB endpoints it is expecting before it binds
to it. If a malicious device were to not have the same urbs the driver
will crash later on when it blindly accesses these endpoints.

Cc: Petko Manolov <petkan(a)nucleusys.com>
Cc: stable <stable(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Link: https://patch.msgid.link/2026022347-legibly-attest-cc5c@gregkh
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Ze Zuo <zuoze1(a)huawei.com>
---
 drivers/net/usb/pegasus.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
index 138279bbb544..99a8702c1df7 100644
--- a/drivers/net/usb/pegasus.c
+++ b/drivers/net/usb/pegasus.c
@@ -828,8 +828,19 @@ static void unlink_all_urbs(pegasus_t *pegasus)
 
 static int alloc_urbs(pegasus_t *pegasus)
 {
+	static const u8 bulk_ep_addr[] = {
+		1 | USB_DIR_IN,
+		2 | USB_DIR_OUT,
+		0};
+	static const u8 int_ep_addr[] = {
+		3 | USB_DIR_IN,
+		0};
 	int res = -ENOMEM;
 
+	if (!usb_check_bulk_endpoints(pegasus->intf, bulk_ep_addr) ||
+	    !usb_check_int_endpoints(pegasus->intf, int_ep_addr))
+		return -ENODEV;
+
 	pegasus->rx_urb = usb_alloc_urb(0, GFP_KERNEL);
 	if (!pegasus->rx_urb) {
 		return res;
@@ -1170,6 +1181,7 @@ static int pegasus_probe(struct usb_interface *intf,
 
 	pegasus = netdev_priv(net);
 	pegasus->dev_index = dev_index;
+	pegasus->intf = intf;
 
 	res = alloc_urbs(pegasus);
 	if (res < 0) {
@@ -1181,7 +1193,6 @@ static int pegasus_probe(struct usb_interface *intf,
 
 	INIT_DELAYED_WORK(&pegasus->carrier_check, check_carrier);
 
-	pegasus->intf = intf;
 	pegasus->usb = dev;
 	pegasus->net = net;
 
--
2.25.1
[PATCH OLK-5.10 0/1] cpufreq: CPPC: Use `ktime` to replace `jiffies` to get the system time in cppc_get_perf_ctrs_sample()
by Lifeng Zheng 02 Apr '26

From: Hongye Lin <linhongye(a)h-partners.com>

driver inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8860

----------------------------------------------------------------------

Lifeng Zheng (1):
  cpufreq: CPPC: Use `ktime` to replace `jiffies` to get the system time in cppc_get_perf_ctrs_sample()

 drivers/cpufreq/cppc_cpufreq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--
2.33.0
[PATCH OLK-6.6 0/1] cpufreq: CPPC: Use `ktime` to replace `jiffies` to get the system time in cppc_get_perf_ctrs_pair()
by Lifeng Zheng 01 Apr '26

From: Hongye Lin <linhongye(a)h-partners.com>

driver inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8860

----------------------------------------------------------------------

Lifeng Zheng (1):
  cpufreq: CPPC: Use `ktime` to replace `jiffies` to get the system time in cppc_get_perf_ctrs_pair()

 drivers/cpufreq/cppc_cpufreq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--
2.33.0
[PATCH OLK-6.6] xSched/cgroup: utilize xcu cmdline to dynamically switch between xcu and freezer subsys
by Liu Kai 01 Apr '26

hulk inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8424

----------------------------------------

To support both cgroup v1 and v2 while adhering to the
CGROUP_SUBSYS_COUNT limit (16), this patch introduces a mechanism to
share the same SUBSYS(xcu) slot between the 'xcu' and 'freezer'
subsystems.

Since 'xcu' is a cgroup v2-only controller and 'freezer' is a cgroup
v1-only controller, they are mutually exclusive at runtime. We introduce
a new kernel command line parameter, "xcu", to control this behavior
dynamically.

This approach allows us to enable both CONFIG_CGROUP_XCU and
CONFIG_CGROUP_FREEZER simultaneously without exceeding the subsystem
limit.

The behavior based on the "xcu" cmdline parameter is as follows:

1. xcu=disable, cgroup v1:
   - The legacy 'freezer' subsystem is active and functional.
   - The 'xcu' subsystem remains dormant.

2. xcu=enable, cgroup v1:
   - The 'freezer' subsystem is effectively disabled/blocked.
   - (Note: 'xcu' is not usable in v1 mode as it is v2-only).

3. xcu=disable, cgroup v2:
   - The 'xcu' subsystem is not enabled in the hierarchy.

4. xcu=enable, cgroup v2:
   - The 'xcu' subsystem is active and usable.
   - The 'freezer' logic is bypassed.

This ensures backward compatibility for v1 users while enabling the new
functionality for v2, all within the constraints of the kernel subsystem
limit.

Fixes: 43bbefc53356 ("xsched: Add XCU control group implementation and its backend in xsched CFS") Signed-off-by: Liu Kai <liukai284(a)huawei.com> --- Documentation/scheduler/xsched.md | 6 +- arch/arm64/configs/openeuler_defconfig | 3 +- arch/x86/configs/openeuler_defconfig | 3 +- include/linux/cgroup_subsys.h | 8 +- include/linux/freezer.h | 24 ++++ kernel/cgroup/cgroup.c | 2 +- kernel/cgroup/legacy_freezer.c | 25 ++-- kernel/xsched/cgroup.c | 166 +++++++++++++++++++++++-- 8 files changed, 209 insertions(+), 28 deletions(-) diff --git a/Documentation/scheduler/xsched.md b/Documentation/scheduler/xsched.md index 11dc0c964e0a..c5e643ab35f0 100644 --- a/Documentation/scheduler/xsched.md +++ b/Documentation/scheduler/xsched.md @@ -64,11 +64,11 @@ CONFIG_CGROUP_XCU=y # 修改内核引导文件,根据实际情况编辑 vim /etc/grub2-efi.cfg -# 在XSched内核新增 cmdline 配置,关闭驱动签名校验、开启cgroup-v2 -module.sig_enforce=0 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all +# 在XSched内核新增 cmdline 配置,关闭驱动签名校验、开启cgroup-v2,使能 xcu cgroup 子系统 +module.sig_enforce=0 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all xcu=enable ``` -保存引导文件后,重启切换内核 +保存引导文件后,重启切换内核,**注意!!!,xcu 子系统仅支持 cgroup-v2** ### 1.3 重编驱动 diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig index fc581adb563b..622d44e6d9ff 100644 --- a/arch/arm64/configs/openeuler_defconfig +++ b/arch/arm64/configs/openeuler_defconfig @@ -101,7 +101,8 @@ CONFIG_XCU_SCHEDULER=y CONFIG_XCU_VSTREAM=y CONFIG_XSCHED_NR_CUS=128 CONFIG_XCU_SCHED_RT=y -# CONFIG_XCU_SCHED_CFS is not set +CONFIG_XCU_SCHED_CFS=y +CONFIG_CGROUP_XCU=y # # CPU/Task time and stats accounting diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig index d493dbf6b8a1..e66724b15bb4 100644 --- a/arch/x86/configs/openeuler_defconfig +++ b/arch/x86/configs/openeuler_defconfig @@ -121,7 +121,8 @@ CONFIG_XCU_SCHEDULER=y CONFIG_XCU_VSTREAM=y CONFIG_XSCHED_NR_CUS=128 CONFIG_XCU_SCHED_RT=y -# CONFIG_XCU_SCHED_CFS is not set 
+CONFIG_XCU_SCHED_CFS=y +CONFIG_CGROUP_XCU=y # # CPU/Task time and stats accounting diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h index e65ae90946c2..9ee14c9cab33 100644 --- a/include/linux/cgroup_subsys.h +++ b/include/linux/cgroup_subsys.h @@ -33,7 +33,9 @@ SUBSYS(memory) SUBSYS(devices) #endif -#if IS_ENABLED(CONFIG_CGROUP_FREEZER) +#if IS_ENABLED(CONFIG_CGROUP_XCU) +SUBSYS(xcu) +#elif IS_ENABLED(CONFIG_CGROUP_FREEZER) SUBSYS(freezer) #endif @@ -61,10 +63,6 @@ SUBSYS(pids) SUBSYS(rdma) #endif -#if IS_ENABLED(CONFIG_CGROUP_XCU) -SUBSYS(xcu) -#endif - #if IS_ENABLED(CONFIG_CGROUP_MISC) SUBSYS(misc) #endif diff --git a/include/linux/freezer.h b/include/linux/freezer.h index b303472255be..0c7a6da03d43 100644 --- a/include/linux/freezer.h +++ b/include/linux/freezer.h @@ -10,6 +10,10 @@ #include <linux/atomic.h> #include <linux/jump_label.h> +#ifdef CONFIG_CGROUP_XCU +#include <linux/cgroup-defs.h> +#endif + #ifdef CONFIG_FREEZER DECLARE_STATIC_KEY_FALSE(freezer_active); @@ -87,4 +91,24 @@ static inline void set_freezable(void) {} #endif /* !CONFIG_FREEZER */ +/* + * When CONFIG_CGROUP_XCU is enabled, freezer_cgrp_subsys and xcu_cgrp_subsys + * share the same set of cgroup_subsys hook functions. Consequently, the hooks for + * freezer_cgrp_subsys must be exposed externally to allow linkage with the XCU + * cgroup_subsys. 
+ * + */ +#ifdef CONFIG_CGROUP_XCU +#define freezer_cgrp_id xcu_cgrp_id + +extern struct cftype files[]; +struct cgroup_subsys_state * +freezer_css_alloc(struct cgroup_subsys_state *parent_css); +int freezer_css_online(struct cgroup_subsys_state *css); +void freezer_css_offline(struct cgroup_subsys_state *css); +void freezer_css_free(struct cgroup_subsys_state *css); +void freezer_attach(struct cgroup_taskset *tset); +void freezer_fork(struct task_struct *task); +#endif /* CONFIG_CGROUP_XCU */ + #endif /* FREEZER_H_INCLUDED */ diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 17521bc192ee..04301432e84a 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -6256,7 +6256,7 @@ int __init cgroup_init(void) struct cgroup_subsys *ss; int ssid; - BUILD_BUG_ON(CGROUP_SUBSYS_COUNT > 17); + BUILD_BUG_ON(CGROUP_SUBSYS_COUNT > 16); BUG_ON(cgroup_init_cftypes(NULL, cgroup_base_files)); BUG_ON(cgroup_init_cftypes(NULL, cgroup_psi_files)); BUG_ON(cgroup_init_cftypes(NULL, cgroup1_base_files)); diff --git a/kernel/cgroup/legacy_freezer.c b/kernel/cgroup/legacy_freezer.c index bee2f9ea5e4a..9ef242b73947 100644 --- a/kernel/cgroup/legacy_freezer.c +++ b/kernel/cgroup/legacy_freezer.c @@ -24,6 +24,17 @@ #include <linux/mutex.h> #include <linux/cpu.h> +/* + * The STATIC macro is used to handle this conditional visibility: + * - Enabled: interfaces are defined as non-static (exported). + * - Disabled: interfaces remain static (file-local). + */ +#ifdef CONFIG_CGROUP_XCU +#define STATIC +#else +#define STATIC static +#endif + /* * A cgroup is freezing if any FREEZING flags are set. 
FREEZING_SELF is * set if "FROZEN" is written to freezer.state cgroupfs file, and cleared @@ -83,7 +94,7 @@ static const char *freezer_state_strs(unsigned int state) return "THAWED"; }; -static struct cgroup_subsys_state * +STATIC struct cgroup_subsys_state * freezer_css_alloc(struct cgroup_subsys_state *parent_css) { struct freezer *freezer; @@ -103,7 +114,7 @@ freezer_css_alloc(struct cgroup_subsys_state *parent_css) * parent's freezing state while holding both parent's and our * freezer->lock. */ -static int freezer_css_online(struct cgroup_subsys_state *css) +STATIC int freezer_css_online(struct cgroup_subsys_state *css) { struct freezer *freezer = css_freezer(css); struct freezer *parent = parent_freezer(freezer); @@ -130,7 +141,7 @@ static int freezer_css_online(struct cgroup_subsys_state *css) * @css is going away. Mark it dead and decrement system_freezing_count if * it was holding one. */ -static void freezer_css_offline(struct cgroup_subsys_state *css) +STATIC void freezer_css_offline(struct cgroup_subsys_state *css) { struct freezer *freezer = css_freezer(css); @@ -146,7 +157,7 @@ static void freezer_css_offline(struct cgroup_subsys_state *css) cpus_read_unlock(); } -static void freezer_css_free(struct cgroup_subsys_state *css) +STATIC void freezer_css_free(struct cgroup_subsys_state *css) { kfree(css_freezer(css)); } @@ -160,7 +171,7 @@ static void freezer_css_free(struct cgroup_subsys_state *css) * @freezer->lock. freezer_attach() makes the new tasks conform to the * current state and all following state changes can see the new tasks. */ -static void freezer_attach(struct cgroup_taskset *tset) +STATIC void freezer_attach(struct cgroup_taskset *tset) { struct task_struct *task; struct cgroup_subsys_state *new_css; @@ -205,7 +216,7 @@ static void freezer_attach(struct cgroup_taskset *tset) * to do anything as freezer_attach() will put @task into the appropriate * state. 
*/ -static void freezer_fork(struct task_struct *task) +STATIC void freezer_fork(struct task_struct *task) { struct freezer *freezer; @@ -449,7 +460,7 @@ static u64 freezer_parent_freezing_read(struct cgroup_subsys_state *css, return (bool)(freezer->state & CGROUP_FREEZING_PARENT); } -static struct cftype files[] = { +STATIC struct cftype files[] = { { .name = "state", .flags = CFTYPE_NOT_ON_ROOT, diff --git a/kernel/xsched/cgroup.c b/kernel/xsched/cgroup.c index 73f044475939..8a85faaa8dc4 100644 --- a/kernel/xsched/cgroup.c +++ b/kernel/xsched/cgroup.c @@ -21,6 +21,10 @@ #include <linux/xsched.h> #include <linux/delay.h> +#ifdef CONFIG_CGROUP_FREEZER +#include <linux/freezer.h> +#endif + static struct xsched_group root_xsched_group; struct xsched_group *root_xcg = &root_xsched_group; @@ -39,6 +43,61 @@ static const char xcu_sched_name[XSCHED_TYPE_NUM][SCHED_CLASS_MAX_LENGTH] = { [XSCHED_TYPE_CFS] = "cfs" }; +/* + * xcu_mode: + * 0 = disable (freezer cgroup) + * 1 = enable (xcu cgroup) + */ +static int xcu_mode; + +/** + * Parse the "xcu=" kernel command line parameter. + * + * Usage: + * xcu=enable → enable xcu_cgrp_subsys + * Otherwise → enable freezer_cgrp_subsys + * + * Returns: + * 1 (handled), 0 (not handled) + */ +static int __init xcu_setup(char *str) +{ + if (!str) + return 0; + + if (strcmp(str, "enable") == 0) + xcu_mode = 1; + + return 1; +} +__setup("xcu=", xcu_setup); + +static bool xcu_cgroup_enabled(void) +{ + return xcu_mode; +} + +/** + * xcu_cgroup_check_compat - Verify XCU mode matches the cgroup hierarchy version. + * + * Checks if the current xcu_mode aligns with the cgroup subsystem's default + * hierarchy status. + * + * IMPORTANT: cgroup_subsys_on_dfl() only returns a valid version indicator + * after the cgroup filesystem has been mounted at the root node. Calling + * this function prior to mount may yield incorrect results. + * + * Return: true if compatible, false otherwise (with a warning logged). 
+ */ +static bool xcu_cgroup_check_compat(void) +{ + if (xcu_mode != cgroup_subsys_on_dfl(xcu_cgrp_subsys)) { + XSCHED_WARN("XCU cgrp is incompatible with the cgroup version\n"); + return false; + } + return true; +} + static int xcu_cg_set_file_show(struct xsched_group *xg, int sched_class) { if (!xg) { @@ -742,6 +801,7 @@ static struct cftype xcu_cg_files[] = { }, { .name = "stat", + .flags = CFTYPE_NOT_ON_ROOT, .seq_show = xcu_stat, }, { @@ -753,17 +813,103 @@ static struct cftype xcu_cg_files[] = { {} /* terminate */ }; +static struct cgroup_subsys_state * +xcu_freezer_compat_css_alloc(struct cgroup_subsys_state *parent_css) +{ + /* Skip allocation if XCU cmdline mismatches the cgroup version. */ + if (parent_css && !xcu_cgroup_check_compat()) + return ERR_PTR(-EPERM); + + if (xcu_cgroup_enabled()) + return xcu_css_alloc(parent_css); + +#ifdef CONFIG_CGROUP_FREEZER + return freezer_css_alloc(parent_css); +#else /* CONFIG_CGROUP_FREEZER=n xcu=disable cgroup=v1 */ + if (!parent_css) + return &root_xsched_group.css; + else + return ERR_PTR(-EPERM); +#endif +} + +static int xcu_freezer_compat_css_online(struct cgroup_subsys_state *css) +{ + if (xcu_cgroup_enabled()) + return xcu_css_online(css); + +#ifdef CONFIG_CGROUP_FREEZER + return freezer_css_online(css); +#else + return 0; +#endif +} + +static void xcu_freezer_compat_css_offline(struct cgroup_subsys_state *css) +{ + if (xcu_cgroup_enabled()) + return xcu_css_offline(css); + +#ifdef CONFIG_CGROUP_FREEZER + return freezer_css_offline(css); +#endif +} + +static void xcu_freezer_compat_css_released(struct cgroup_subsys_state *css) +{ + if (xcu_cgroup_enabled()) + return xcu_css_released(css); +} + +static void xcu_freezer_compat_css_free(struct cgroup_subsys_state *css) +{ + if (xcu_cgroup_enabled()) + return xcu_css_free(css); + +#ifdef CONFIG_CGROUP_FREEZER + return freezer_css_free(css); +#endif +} + +static int xcu_freezer_compat_can_attach(struct cgroup_taskset *tset) +{ + if (xcu_cgroup_enabled()) + return 
xcu_can_attach(tset); + + return 0; +} + +static void xcu_freezer_compat_cancel_attach(struct cgroup_taskset *tset) +{ + if (xcu_cgroup_enabled()) + return xcu_cancel_attach(tset); +} + +static void xcu_freezer_compat_attach(struct cgroup_taskset *tset) +{ + if (xcu_cgroup_enabled()) + return xcu_attach(tset); + +#ifdef CONFIG_CGROUP_FREEZER + return freezer_attach(tset); +#endif +} + struct cgroup_subsys xcu_cgrp_subsys = { - .css_alloc = xcu_css_alloc, - .css_online = xcu_css_online, - .css_offline = xcu_css_offline, - .css_released = xcu_css_released, - .css_free = xcu_css_free, - .can_attach = xcu_can_attach, - .cancel_attach = xcu_cancel_attach, - .attach = xcu_attach, + .css_alloc = xcu_freezer_compat_css_alloc, + .css_online = xcu_freezer_compat_css_online, + .css_offline = xcu_freezer_compat_css_offline, + .css_released = xcu_freezer_compat_css_released, + .css_free = xcu_freezer_compat_css_free, + .can_attach = xcu_freezer_compat_can_attach, + .cancel_attach = xcu_freezer_compat_cancel_attach, + .attach = xcu_freezer_compat_attach, .dfl_cftypes = xcu_cg_files, +#ifdef CONFIG_CGROUP_FREEZER + .fork = freezer_fork, + .legacy_cftypes = files, + .legacy_name = "freezer", +#else .legacy_cftypes = xcu_cg_files, - .early_init = false, - .threaded = true +#endif }; -- 2.34.1
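The four cmdline/hierarchy combinations described in this patch reduce to one comparison, mirroring the `xcu_mode != cgroup_subsys_on_dfl(...)` test in `xcu_cgroup_check_compat()`. The sketch below is a user-space model only: `enum active_subsys`, `parse_xcu_param()` and `resolve_subsys()` are hypothetical names introduced for illustration and do not exist in the kernel.

```c
#include <stdbool.h>
#include <string.h>

/* Which controller ends up usable in the shared SUBSYS slot. */
enum active_subsys { SUBSYS_NONE, SUBSYS_FREEZER, SUBSYS_XCU };

/* Models xcu_setup(): only the exact string "enable" switches xcu on. */
static bool parse_xcu_param(const char *value)
{
	return value && strcmp(value, "enable") == 0;
}

/*
 * Models the compatibility rule: xcu needs the default (v2) hierarchy,
 * freezer needs a legacy (v1) hierarchy, and a mismatch leaves the slot
 * unusable.
 */
static enum active_subsys resolve_subsys(bool xcu_enabled, bool on_default_hierarchy)
{
	if (xcu_enabled != on_default_hierarchy)
		return SUBSYS_NONE;
	return xcu_enabled ? SUBSYS_XCU : SUBSYS_FREEZER;
}
```

For instance, `resolve_subsys(parse_xcu_param("enable"), true)` models case 4 of the commit message and selects xcu, while the same parameter on a v1 hierarchy models case 2 and selects nothing.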
[PATCH openEuler-1.0-LTS] mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
by Jinjiang Tu 01 Apr '26

mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8836

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

On arm64 server, we found folio that get from migration entry isn't
locked in softleaf_to_folio(). This issue triggers when mTHP splitting
and zap_nonpresent_ptes() races, and the root cause is lack of memory
barrier in softleaf_to_folio(). The race is as follows:

	CPU0                                             CPU1

deferred_split_scan()                              zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()                         softleaf_to_folio()
      set flags(including PG_locked) for tail pages    folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()                                        VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees page flags of tail
pages are visible before the tail page becomes non-compound. smp_wmb()
should be paired with smp_rmb() in softleaf_to_folio(), which is missed.
As a result, if zap_nonpresent_ptes() accesses migration entry that
stores tail pfn, softleaf_to_folio() may see the updated compound_head
of tail page before page->flags.

This issue will trigger VM_WARN_ON_ONCE() in pfn_swap_entry_folio()
because of the race between folio split and zap_nonpresent_ptes()
leading to a folio incorrectly undergoing modification without a folio
lock being held.

This is a BUG_ON() before commit 93976a20345b ("mm: eliminate further
swapops predicates"), which is merged in v6.19-rc1.

To fix it, add missing smp_rmb() if the softleaf entry is migration
entry in softleaf_to_folio() and softleaf_to_page().

[tujinjiang(a)huawei.com: update function name and comments]
Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Acked-by: David Hildenbrand (Arm) <david(a)kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs(a)kernel.org>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>

Conflicts:
	include/linux/leafops.h
	include/linux/swapops.h
[migration entry hasn't been renamed to softleaf entry.]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
 include/linux/swapops.h | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 22af9d8a84ae..c742e778e024 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -205,14 +205,28 @@ static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
 	return swp_offset(entry);
 }
 
-static inline struct page *migration_entry_to_page(swp_entry_t entry)
+static inline void migration_entry_sync_page(struct page *head)
 {
-	struct page *p = pfn_to_page(swp_offset(entry));
+	/*
+	 * Ensure we do not race with split, which might alter tail pages into new
+	 * head pages and thus result in observing an unlocked page.
+	 * This matches the write barrier in __split_huge_page_tail().
+	 */
+	smp_rmb();
+
 	/*
 	 * Any use of migration entries may only occur while the
 	 * corresponding page is locked
 	 */
-	BUG_ON(!PageLocked(compound_head(p)));
+	BUG_ON(!PageLocked(head));
+}
+
+static inline struct page *migration_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+
+	migration_entry_sync_page(compound_head(p));
+
 	return p;
 }
 
--
2.43.0
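The write-barrier/read-barrier pairing this fix relies on can be illustrated in user space with C11 fences. This is an analogy under stated assumptions, not the kernel code: `struct mock_page`, `PAGE_LOCKED`, `writer_split()` and `reader_sees_locked()` are hypothetical names, and the release and acquire fences are stand-ins for (and somewhat stronger than) `smp_wmb()` and `smp_rmb()`.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define PAGE_LOCKED 0x1u /* models PG_locked */

/* Hypothetical model of a tail page that a split turns into a new head. */
struct mock_page {
	atomic_uint flags;       /* models page->flags */
	atomic_bool is_new_head; /* models the updated compound_head state */
};

/* Split side: publish the flags first, then the new head state. */
static void writer_split(struct mock_page *tail)
{
	atomic_store_explicit(&tail->flags, PAGE_LOCKED, memory_order_relaxed);
	atomic_thread_fence(memory_order_release); /* plays the role of smp_wmb() */
	atomic_store_explicit(&tail->is_new_head, true, memory_order_relaxed);
}

/* Zap side: once the new head state is observed, the flags must be visible. */
static bool reader_sees_locked(struct mock_page *page)
{
	if (!atomic_load_explicit(&page->is_new_head, memory_order_relaxed))
		return true; /* split not observed yet; the old head's lock applies */

	atomic_thread_fence(memory_order_acquire); /* plays the role of smp_rmb() */
	return (atomic_load_explicit(&page->flags, memory_order_relaxed) &
		PAGE_LOCKED) != 0;
}
```

Without the acquire fence on the reader side, a weakly ordered CPU such as arm64 may satisfy the flags load before the head-state load, which corresponds to the unlocked-folio observation the commit message describes.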
[PATCH OLK-5.10] mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
by Jinjiang Tu 01 Apr '26

mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8836

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

On an arm64 server, we found that the folio obtained from a migration entry
isn't locked in softleaf_to_folio(). This issue triggers when mTHP splitting
races with zap_nonpresent_ptes(), and the root cause is a missing memory
barrier in softleaf_to_folio(). The race is as follows:

CPU0                                                CPU1

deferred_split_scan()                               zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()                          softleaf_to_folio()
      set flags(including PG_locked) for tail pages     folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()                                         VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees that the page flags of tail
pages are visible before the tail page becomes non-compound. This smp_wmb()
should be paired with an smp_rmb() in softleaf_to_folio(), which is missing.
As a result, if zap_nonpresent_ptes() accesses a migration entry that stores
a tail pfn, softleaf_to_folio() may see the updated compound_head of the tail
page before page->flags.

To fix it, add the missing smp_rmb() when the softleaf entry is a migration
entry in softleaf_to_folio() and softleaf_to_page().

[tujinjiang(a)huawei.com: update function name and comments]
  Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Acked-by: David Hildenbrand (Arm) <david(a)kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs(a)kernel.org>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>

Conflicts:
	include/linux/leafops.h
	include/linux/swapops.h
	mm/filemap.c
[migration entry hasn't been renamed to softleaf entry. Add new helper
migration_entry_to_compound_page().]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
 include/linux/swapops.h | 31 ++++++++++++++++++++++++++++---
 mm/filemap.c            |  2 +-
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index e749c4c86b26..ed33367fb6a6 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -194,17 +194,42 @@ static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
 	return swp_offset(entry);
 }
 
-static inline struct page *migration_entry_to_page(swp_entry_t entry)
+static inline void migration_entry_sync_page(struct page *head)
 {
-	struct page *p = pfn_to_page(swp_offset(entry));
+	/*
+	 * Ensure we do not race with split, which might alter tail pages into new
+	 * head pages and thus result in observing an unlocked page.
+	 * This matches the write barrier in __split_huge_page_tail().
+	 */
+	smp_rmb();
+
 	/*
 	 * Any use of migration entries may only occur while the
 	 * corresponding page is locked
 	 */
-	BUG_ON(!PageLocked(compound_head(p)));
+	BUG_ON(!PageLocked(head));
+}
+
+static inline struct page *migration_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+
+	migration_entry_sync_page(compound_head(p));
+
 	return p;
 }
 
+static inline struct page *migration_entry_to_compound_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+	struct page *head;
+
+	head = compound_head(p);
+	migration_entry_sync_page(head);
+
+	return head;
+}
+
 static inline void make_migration_entry_read(swp_entry_t *entry)
 {
 	*entry = swp_entry(SWP_MIGRATION_READ, swp_offset(*entry));
diff --git a/mm/filemap.c b/mm/filemap.c
index 18e304ce6229..c2932db70212 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1334,7 +1334,7 @@ void migration_entry_wait_on_locked(swp_entry_t entry, pte_t *ptep,
 	bool delayacct = false;
 	unsigned long pflags = 0;
 	wait_queue_head_t *q;
-	struct page *page = compound_head(migration_entry_to_page(entry));
+	struct page *page = migration_entry_to_compound_page(entry);
 
 	q = page_waitqueue(page);
 	if (!PageUptodate(page) && PageWorkingset(page)) {
-- 
2.43.0
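
For readers unfamiliar with how a tail page resolves to its head, here is a rough userspace sketch of the convention behind compound_head(): a tail page stores its head's address with bit 0 set, and a page whose field has bit 0 clear is its own head. Everything named toy_* is hypothetical and only models the idea. It is single-threaded and leaves out READ_ONCE() and the barriers the patch is about.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; only the fields this sketch needs. */
struct toy_page {
	unsigned long flags;      /* bit 0 models PG_locked */
	uintptr_t compound_head;  /* bit 0 set: tail page; other bits: head address */
};

#define TOY_PG_LOCKED 1UL

/* Link @tail to @head by storing the head address with bit 0 set. */
static void toy_set_compound_head(struct toy_page *tail, struct toy_page *head)
{
	assert(((uintptr_t)head & 1) == 0);  /* alignment keeps bit 0 free */
	tail->compound_head = (uintptr_t)head | 1;
}

/* Detach @page so that it becomes its own head, as a split does. */
static void toy_clear_compound_head(struct toy_page *page)
{
	page->compound_head = 0;
}

/* Resolve @page to the page whose lock bit is authoritative. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct toy_page *)(head - 1);
	return page;
}

/* Report whether the lock bit is set on exactly this page. */
static int toy_page_locked(const struct toy_page *page)
{
	return (page->flags & TOY_PG_LOCKED) != 0;
}
```

In this model, once a tail is detached the lock check lands on the tail's own flags instead of the old head's. This is why the split has to make those flags visible before the detach, and why the reader has to order its two loads.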
[PATCH OLK-6.6] mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
by Jinjiang Tu 01 Apr '26

mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8836

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

On an arm64 server, we found that the folio obtained from a migration entry
isn't locked in softleaf_to_folio(). This issue triggers when mTHP splitting
races with zap_nonpresent_ptes(), and the root cause is a missing memory
barrier in softleaf_to_folio(). The race is as follows:

CPU0                                                CPU1

deferred_split_scan()                               zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()                          softleaf_to_folio()
      set flags(including PG_locked) for tail pages     folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()                                         VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees that the page flags of tail
pages are visible before the tail page becomes non-compound. This smp_wmb()
should be paired with an smp_rmb() in softleaf_to_folio(), which is missing.
As a result, if zap_nonpresent_ptes() accesses a migration entry that stores
a tail pfn, softleaf_to_folio() may see the updated compound_head of the tail
page before page->flags.

This issue will trigger VM_WARN_ON_ONCE() in pfn_swap_entry_folio() because
of the race between folio split and zap_nonpresent_ptes() leading to a folio
incorrectly undergoing modification without a folio lock being held.

This was a BUG_ON() before commit 93976a20345b ("mm: eliminate further
swapops predicates"), which was merged in v6.19-rc1.

To fix it, add the missing smp_rmb() when the softleaf entry is a migration
entry in softleaf_to_folio() and softleaf_to_page().

[tujinjiang(a)huawei.com: update function name and comments]
  Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Acked-by: David Hildenbrand (Arm) <david(a)kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs(a)kernel.org>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>

Conflicts:
	include/linux/leafops.h
	include/linux/swapops.h
[migration entry hasn't been renamed to softleaf entry.]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
 include/linux/swapops.h | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index b32d696242b6..7bb5937a3f3c 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -500,15 +500,28 @@ static inline int is_userswap_entry(swp_entry_t entry)
 }
 #endif
 
-static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
+static inline void migration_entry_sync(struct folio *folio)
 {
-	struct page *p = pfn_to_page(swp_offset_pfn(entry));
+	/*
+	 * Ensure we do not race with split, which might alter tail pages into new
+	 * folios and thus result in observing an unlocked folio.
+	 * This matches the write barrier in __split_folio_to_order().
+	 */
+	smp_rmb();
 
 	/*
 	 * Any use of migration entries may only occur while the
 	 * corresponding page is locked
 	 */
-	BUG_ON(is_migration_entry(entry) && !PageLocked(p));
+	BUG_ON(!folio_test_locked(folio));
+}
+
+static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset_pfn(entry));
+
+	if (is_migration_entry(entry))
+		migration_entry_sync(page_folio(p));
 
 	return p;
 }
@@ -517,11 +530,8 @@ static inline struct folio *pfn_swap_entry_folio(swp_entry_t entry)
 {
 	struct folio *folio = pfn_folio(swp_offset_pfn(entry));
 
-	/*
-	 * Any use of migration entries may only occur while the
-	 * corresponding folio is locked
-	 */
-	BUG_ON(is_migration_entry(entry) && !folio_test_locked(folio));
+	if (is_migration_entry(entry))
+		migration_entry_sync(folio);
 
 	return folio;
 }
-- 
2.43.0
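
The smp_wmb()/smp_rmb() pairing described in the commit message can be approximated in portable userspace C. The sketch below is only an analogy under stated assumptions: struct mp_page and the mp_* functions are hypothetical, and C11 release/acquire fences stand in for smp_wmb()/smp_rmb(). The C11 fences are somewhat stronger, but they give the store-store and load-load ordering that matters here. The writer publishes the flags before detaching the tail. The reader resolves the head first, then fences, then reads the flags.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MP_PG_LOCKED 1UL

/* Hypothetical stand-in for struct page: just the two fields in the race. */
struct mp_page {
	atomic_ulong flags;              /* bit 0 models PG_locked */
	_Atomic(struct mp_page *) head;  /* non-NULL while this page is a tail */
};

static void mp_page_init(struct mp_page *page, unsigned long flags,
			 struct mp_page *head)
{
	atomic_init(&page->flags, flags);
	atomic_init(&page->head, head);
}

/* Writer side (the split): set the tail's flags, fence, then detach it. */
static void mp_split_tail(struct mp_page *tail)
{
	atomic_store_explicit(&tail->flags, MP_PG_LOCKED, memory_order_relaxed);
	/* Plays the role of smp_wmb(): flags become visible before the detach. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&tail->head, NULL, memory_order_relaxed);
}

/* Reader side (the zap path): resolve the head, fence, then test the lock. */
static bool mp_entry_is_locked(struct mp_page *page)
{
	struct mp_page *head =
		atomic_load_explicit(&page->head, memory_order_relaxed);
	struct mp_page *folio = head ? head : page;

	/* Plays the role of smp_rmb(): the flags load cannot pass the head load. */
	atomic_thread_fence(memory_order_acquire);
	return (atomic_load_explicit(&folio->flags, memory_order_relaxed) &
		MP_PG_LOCKED) != 0;
}
```

With both fences in place, a reader that observes the detached tail is guaranteed to also observe its lock bit. Without the reader-side fence, a weakly ordered CPU such as arm64 may satisfy the flags load early and report an unlocked page.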