mailweb.openeuler.org
Kernel

kernel@openeuler.org

  • 3 participants
  • 23126 discussions
[PATCH OLK-6.6] xSched/cgroup: utilize xcu cmdline to dynamically switch between xcu and freezer subsys
by Liu Kai 01 Apr '26

hulk inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8424

----------------------------------------

To support both cgroup v1 and v2 while adhering to the CGROUP_SUBSYS_COUNT limit (16), this patch introduces a mechanism to share the same SUBSYS(xcu) slot between the 'xcu' and 'freezer' subsystems. Since 'xcu' is a cgroup v2-only controller and 'freezer' is a cgroup v1-only controller, they are mutually exclusive at runtime.

We introduce a new kernel command line parameter, "xcu", to control this behavior dynamically. This approach allows us to enable both CONFIG_CGROUP_XCU and CONFIG_CGROUP_FREEZER simultaneously without exceeding the subsystem limit.

The behavior based on the "xcu" cmdline parameter is as follows:

1. xcu=disable, cgroup v1:
   - The legacy 'freezer' subsystem is active and functional.
   - The 'xcu' subsystem remains dormant.
2. xcu=enable, cgroup v1:
   - The 'freezer' subsystem is effectively disabled/blocked.
   - (Note: 'xcu' is not usable in v1 mode as it is v2-only.)
3. xcu=disable, cgroup v2:
   - The 'xcu' subsystem is not enabled in the hierarchy.
4. xcu=enable, cgroup v2:
   - The 'xcu' subsystem is active and usable.
   - The 'freezer' logic is bypassed.

This ensures backward compatibility for v1 users while enabling the new functionality for v2, all within the constraints of the kernel subsystem limit.
Fixes: 43bbefc53356 ("xsched: Add XCU control group implementation and its backend in xsched CFS")
Signed-off-by: Liu Kai <liukai284(a)huawei.com>
---
 Documentation/scheduler/xsched.md      |   6 +-
 arch/arm64/configs/openeuler_defconfig |   3 +-
 arch/x86/configs/openeuler_defconfig   |   3 +-
 include/linux/cgroup_subsys.h          |   8 +-
 include/linux/freezer.h                |  24 ++++
 kernel/cgroup/cgroup.c                 |   2 +-
 kernel/cgroup/legacy_freezer.c         |  25 ++--
 kernel/xsched/cgroup.c                 | 166 +++++++++++++++++++++++--
 8 files changed, 209 insertions(+), 28 deletions(-)

diff --git a/Documentation/scheduler/xsched.md b/Documentation/scheduler/xsched.md
index 11dc0c964e0a..c5e643ab35f0 100644
--- a/Documentation/scheduler/xsched.md
+++ b/Documentation/scheduler/xsched.md
@@ -64,11 +64,11 @@ CONFIG_CGROUP_XCU=y
 # 修改内核引导文件,根据实际情况编辑
 vim /etc/grub2-efi.cfg
 
-# 在XSched内核新增 cmdline 配置,关闭驱动签名校验、开启cgroup-v2
-module.sig_enforce=0 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all
+# 在XSched内核新增 cmdline 配置,关闭驱动签名校验、开启cgroup-v2,使能 xcu cgroup 子系统
+module.sig_enforce=0 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all xcu=enable
 ```
 
-保存引导文件后,重启切换内核
+保存引导文件后,重启切换内核,**注意!!!,xcu 子系统仅支持 cgroup-v2**
 
 ### 1.3 重编驱动
 
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index fc581adb563b..622d44e6d9ff 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -101,7 +101,8 @@ CONFIG_XCU_SCHEDULER=y
 CONFIG_XCU_VSTREAM=y
 CONFIG_XSCHED_NR_CUS=128
 CONFIG_XCU_SCHED_RT=y
-# CONFIG_XCU_SCHED_CFS is not set
+CONFIG_XCU_SCHED_CFS=y
+CONFIG_CGROUP_XCU=y
 
 #
 # CPU/Task time and stats accounting
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index d493dbf6b8a1..e66724b15bb4 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -121,7 +121,8 @@ CONFIG_XCU_SCHEDULER=y
 CONFIG_XCU_VSTREAM=y
 CONFIG_XSCHED_NR_CUS=128
 CONFIG_XCU_SCHED_RT=y
-# CONFIG_XCU_SCHED_CFS is not set
+CONFIG_XCU_SCHED_CFS=y
+CONFIG_CGROUP_XCU=y
 
 #
 # CPU/Task time and stats accounting
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index e65ae90946c2..9ee14c9cab33 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -33,7 +33,9 @@ SUBSYS(memory)
 SUBSYS(devices)
 #endif
 
-#if IS_ENABLED(CONFIG_CGROUP_FREEZER)
+#if IS_ENABLED(CONFIG_CGROUP_XCU)
+SUBSYS(xcu)
+#elif IS_ENABLED(CONFIG_CGROUP_FREEZER)
 SUBSYS(freezer)
 #endif
 
@@ -61,10 +63,6 @@ SUBSYS(pids)
 SUBSYS(rdma)
 #endif
 
-#if IS_ENABLED(CONFIG_CGROUP_XCU)
-SUBSYS(xcu)
-#endif
-
 #if IS_ENABLED(CONFIG_CGROUP_MISC)
 SUBSYS(misc)
 #endif
diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index b303472255be..0c7a6da03d43 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -10,6 +10,10 @@
 #include <linux/atomic.h>
 #include <linux/jump_label.h>
 
+#ifdef CONFIG_CGROUP_XCU
+#include <linux/cgroup-defs.h>
+#endif
+
 #ifdef CONFIG_FREEZER
 
 DECLARE_STATIC_KEY_FALSE(freezer_active);
@@ -87,4 +91,24 @@ static inline void set_freezable(void) {}
 
 #endif /* !CONFIG_FREEZER */
 
+/*
+ * When CONFIG_CGROUP_XCU is enabled, freezer_cgrp_subsys and xcu_cgrp_subsys
+ * share the same set of cgroup_subsys hook functions. Consequently, the hooks for
+ * freezer_cgrp_subsys must be exposed externally to allow linkage with the XCU
+ * cgroup_subsys.
+ *
+ */
+#ifdef CONFIG_CGROUP_XCU
+#define freezer_cgrp_id xcu_cgrp_id
+
+extern struct cftype files[];
+struct cgroup_subsys_state *
+freezer_css_alloc(struct cgroup_subsys_state *parent_css);
+int freezer_css_online(struct cgroup_subsys_state *css);
+void freezer_css_offline(struct cgroup_subsys_state *css);
+void freezer_css_free(struct cgroup_subsys_state *css);
+void freezer_attach(struct cgroup_taskset *tset);
+void freezer_fork(struct task_struct *task);
+#endif /* CONFIG_CGROUP_XCU */
+
 #endif /* FREEZER_H_INCLUDED */
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 17521bc192ee..04301432e84a 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6256,7 +6256,7 @@ int __init cgroup_init(void)
 	struct cgroup_subsys *ss;
 	int ssid;
 
-	BUILD_BUG_ON(CGROUP_SUBSYS_COUNT > 17);
+	BUILD_BUG_ON(CGROUP_SUBSYS_COUNT > 16);
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_base_files));
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_psi_files));
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup1_base_files));
diff --git a/kernel/cgroup/legacy_freezer.c b/kernel/cgroup/legacy_freezer.c
index bee2f9ea5e4a..9ef242b73947 100644
--- a/kernel/cgroup/legacy_freezer.c
+++ b/kernel/cgroup/legacy_freezer.c
@@ -24,6 +24,17 @@
 #include <linux/mutex.h>
 #include <linux/cpu.h>
 
+/*
+ * The STATIC macro is used to handle this conditional visibility:
+ * - Enabled: interfaces are defined as non-static (exported).
+ * - Disabled: interfaces remain static (file-local).
+ */
+#ifdef CONFIG_CGROUP_XCU
+#define STATIC
+#else
+#define STATIC static
+#endif
+
 /*
  * A cgroup is freezing if any FREEZING flags are set.  FREEZING_SELF is
  * set if "FROZEN" is written to freezer.state cgroupfs file, and cleared
@@ -83,7 +94,7 @@ static const char *freezer_state_strs(unsigned int state)
 	return "THAWED";
 };
 
-static struct cgroup_subsys_state *
+STATIC struct cgroup_subsys_state *
 freezer_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct freezer *freezer;
@@ -103,7 +114,7 @@ freezer_css_alloc(struct cgroup_subsys_state *parent_css)
  * parent's freezing state while holding both parent's and our
  * freezer->lock.
  */
-static int freezer_css_online(struct cgroup_subsys_state *css)
+STATIC int freezer_css_online(struct cgroup_subsys_state *css)
 {
 	struct freezer *freezer = css_freezer(css);
 	struct freezer *parent = parent_freezer(freezer);
@@ -130,7 +141,7 @@ static int freezer_css_online(struct cgroup_subsys_state *css)
  * @css is going away.  Mark it dead and decrement system_freezing_count if
  * it was holding one.
  */
-static void freezer_css_offline(struct cgroup_subsys_state *css)
+STATIC void freezer_css_offline(struct cgroup_subsys_state *css)
 {
 	struct freezer *freezer = css_freezer(css);
@@ -146,7 +157,7 @@ static void freezer_css_offline(struct cgroup_subsys_state *css)
 	cpus_read_unlock();
 }
 
-static void freezer_css_free(struct cgroup_subsys_state *css)
+STATIC void freezer_css_free(struct cgroup_subsys_state *css)
 {
 	kfree(css_freezer(css));
 }
@@ -160,7 +171,7 @@ static void freezer_css_free(struct cgroup_subsys_state *css)
 * @freezer->lock.  freezer_attach() makes the new tasks conform to the
 * current state and all following state changes can see the new tasks.
 */
-static void freezer_attach(struct cgroup_taskset *tset)
+STATIC void freezer_attach(struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 	struct cgroup_subsys_state *new_css;
@@ -205,7 +216,7 @@ static void freezer_attach(struct cgroup_taskset *tset)
 * to do anything as freezer_attach() will put @task into the appropriate
 * state.
 */
-static void freezer_fork(struct task_struct *task)
+STATIC void freezer_fork(struct task_struct *task)
 {
 	struct freezer *freezer;
@@ -449,7 +460,7 @@ static u64 freezer_parent_freezing_read(struct cgroup_subsys_state *css,
 	return (bool)(freezer->state & CGROUP_FREEZING_PARENT);
 }
 
-static struct cftype files[] = {
+STATIC struct cftype files[] = {
 	{
 		.name = "state",
 		.flags = CFTYPE_NOT_ON_ROOT,
diff --git a/kernel/xsched/cgroup.c b/kernel/xsched/cgroup.c
index 73f044475939..8a85faaa8dc4 100644
--- a/kernel/xsched/cgroup.c
+++ b/kernel/xsched/cgroup.c
@@ -21,6 +21,10 @@
 #include <linux/xsched.h>
 #include <linux/delay.h>
 
+#ifdef CONFIG_CGROUP_FREEZER
+#include <linux/freezer.h>
+#endif
+
 static struct xsched_group root_xsched_group;
 struct xsched_group *root_xcg = &root_xsched_group;
@@ -39,6 +43,61 @@ static const char xcu_sched_name[XSCHED_TYPE_NUM][SCHED_CLASS_MAX_LENGTH] = {
 	[XSCHED_TYPE_CFS] = "cfs"
 };
 
+/*
+ * xcu_mode:
+ *   0 = disable (freezer cgroup)
+ *   1 = enable  (xcu cgroup)
+ */
+static int xcu_mode;
+
+/**
+ * Parse the "xcu=" kernel command line parameter.
+ *
+ * Usage:
+ *   xcu=enable → enable xcu_cgrp_subsys
+ *   Otherwise  → enable freezer_cgrp_subsys
+ *
+ * Returns:
+ *   1 (handled), 0 (not handled)
+ */
+static int __init xcu_setup(char *str)
+{
+	if (!str)
+		return 0;
+
+	if (strcmp(str, "enable") == 0)
+		xcu_mode = 1;
+
+	return 1;
+}
+__setup("xcu=", xcu_setup);
+
+static bool xcu_cgroup_enabled(void)
+{
+	return xcu_mode;
+}
+
+/**
+ * xcu_cgroup_check_compat - Verify XCU mode matches the cgroup hierarchy version.
+ *
+ * Checks if the current xcu_mode aligns with the cgroup subsystem's default
+ * hierarchy status.
+ *
+ * IMPORTANT: cgroup_subsys_on_dfl() only returns a valid version indicator
+ * after the cgroup filesystem has been mounted at the root node. Calling
+ * this function prior to mount may yield incorrect results.
+ *
+ * Return: true if compatible, false otherwise (with a warning logged).
+ */
+static bool xcu_cgroup_check_compat(void)
+{
+	if (xcu_mode != cgroup_subsys_on_dfl(xcu_cgrp_subsys)) {
+		XSCHED_WARN("XCU cgrp is incompatible with the cgroup version\n");
+		return false;
+	}
+
+	return true;
+}
+
 static int xcu_cg_set_file_show(struct xsched_group *xg, int sched_class)
 {
 	if (!xg) {
@@ -742,6 +801,7 @@ static struct cftype xcu_cg_files[] = {
 	},
 	{
 		.name = "stat",
+		.flags = CFTYPE_NOT_ON_ROOT,
 		.seq_show = xcu_stat,
 	},
 	{
@@ -753,17 +813,103 @@ static struct cftype xcu_cg_files[] = {
 	{}	/* terminate */
 };
 
+static struct cgroup_subsys_state *
+xcu_freezer_compat_css_alloc(struct cgroup_subsys_state *parent_css)
+{
+	/* Skip allocation if XCU cmdline mismatches the cgroup version. */
+	if (parent_css && !xcu_cgroup_check_compat())
+		return ERR_PTR(-EPERM);
+
+	if (xcu_cgroup_enabled())
+		return xcu_css_alloc(parent_css);
+
+#ifdef CONFIG_CGROUP_FREEZER
+	return freezer_css_alloc(parent_css);
+#else /* CONFIG_CGROUP_FREEZER=n xcu=disable cgroup=v1 */
+	if (!parent_css)
+		return &root_xsched_group.css;
+	else
+		return ERR_PTR(-EPERM);
+#endif
+}
+
+static int xcu_freezer_compat_css_online(struct cgroup_subsys_state *css)
+{
+	if (xcu_cgroup_enabled())
+		return xcu_css_online(css);
+
+#ifdef CONFIG_CGROUP_FREEZER
+	return freezer_css_online(css);
+#else
+	return 0;
+#endif
+}
+
+static void xcu_freezer_compat_css_offline(struct cgroup_subsys_state *css)
+{
+	if (xcu_cgroup_enabled())
+		return xcu_css_offline(css);
+
+#ifdef CONFIG_CGROUP_FREEZER
+	return freezer_css_offline(css);
+#endif
+}
+
+static void xcu_freezer_compat_css_released(struct cgroup_subsys_state *css)
+{
+	if (xcu_cgroup_enabled())
+		return xcu_css_released(css);
+}
+
+static void xcu_freezer_compat_css_free(struct cgroup_subsys_state *css)
+{
+	if (xcu_cgroup_enabled())
+		return xcu_css_free(css);
+
+#ifdef CONFIG_CGROUP_FREEZER
+	return freezer_css_free(css);
+#endif
+}
+
+static int xcu_freezer_compat_can_attach(struct cgroup_taskset *tset)
+{
+	if (xcu_cgroup_enabled())
+		return xcu_can_attach(tset);
+
+	return 0;
+}
+
+static void xcu_freezer_compat_cancel_attach(struct cgroup_taskset *tset)
+{
+	if (xcu_cgroup_enabled())
+		return xcu_cancel_attach(tset);
+}
+
+static void xcu_freezer_compat_attach(struct cgroup_taskset *tset)
+{
+	if (xcu_cgroup_enabled())
+		return xcu_attach(tset);
+
+#ifdef CONFIG_CGROUP_FREEZER
+	return freezer_attach(tset);
+#endif
+}
+
 struct cgroup_subsys xcu_cgrp_subsys = {
-	.css_alloc = xcu_css_alloc,
-	.css_online = xcu_css_online,
-	.css_offline = xcu_css_offline,
-	.css_released = xcu_css_released,
-	.css_free = xcu_css_free,
-	.can_attach = xcu_can_attach,
-	.cancel_attach = xcu_cancel_attach,
-	.attach = xcu_attach,
+	.css_alloc = xcu_freezer_compat_css_alloc,
+	.css_online = xcu_freezer_compat_css_online,
+	.css_offline = xcu_freezer_compat_css_offline,
+	.css_released = xcu_freezer_compat_css_released,
+	.css_free = xcu_freezer_compat_css_free,
+	.can_attach = xcu_freezer_compat_can_attach,
+	.cancel_attach = xcu_freezer_compat_cancel_attach,
+	.attach = xcu_freezer_compat_attach,
 	.dfl_cftypes = xcu_cg_files,
+#ifdef CONFIG_CGROUP_FREEZER
+	.fork = freezer_fork,
+	.legacy_cftypes = files,
+	.legacy_name = "freezer",
+#else
 	.legacy_cftypes = xcu_cg_files,
-	.early_init = false,
-	.threaded = true
+#endif
 };
-- 
2.34.1
[PATCH openEuler-1.0-LTS] mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
by Jinjiang Tu 01 Apr '26

mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8836
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

On an arm64 server, we found that the folio obtained from a migration entry isn't locked in softleaf_to_folio(). This issue triggers when mTHP splitting and zap_nonpresent_ptes() race, and the root cause is a missing memory barrier in softleaf_to_folio(). The race is as follows:

CPU0                                     CPU1
deferred_split_scan()                    zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()
                                           softleaf_to_folio()
      set flags(including PG_locked)
      for tail pages
                                             folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()
                                             VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees that the page flags of tail pages are visible before the tail page becomes non-compound. That smp_wmb() should be paired with an smp_rmb() in softleaf_to_folio(), which is missing. As a result, if zap_nonpresent_ptes() accesses a migration entry that stores a tail pfn, softleaf_to_folio() may observe the updated compound_head of the tail page before page->flags.

This issue triggers the VM_WARN_ON_ONCE() in pfn_swap_entry_folio(), because the race between folio split and zap_nonpresent_ptes() makes a folio appear to undergo modification without the folio lock being held. This was a BUG_ON() before commit 93976a20345b ("mm: eliminate further swapops predicates"), which was merged in v6.19-rc1.

To fix it, add the missing smp_rmb() when the softleaf entry is a migration entry in softleaf_to_folio() and softleaf_to_page().
[tujinjiang(a)huawei.com: update function name and comments]
Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Acked-by: David Hildenbrand (Arm) <david(a)kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs(a)kernel.org>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Conflicts:
	include/linux/leafops.h
	include/linux/swapops.h
[migration entry hasn't been renamed to softleaf entry.]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
 include/linux/swapops.h | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 22af9d8a84ae..c742e778e024 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -205,14 +205,28 @@ static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
 	return swp_offset(entry);
 }
 
-static inline struct page *migration_entry_to_page(swp_entry_t entry)
+static inline void migration_entry_sync_page(struct page *head)
 {
-	struct page *p = pfn_to_page(swp_offset(entry));
+	/*
+	 * Ensure we do not race with split, which might alter tail pages into new
+	 * head pages and thus result in observing an unlocked page.
+	 * This matches the write barrier in __split_huge_page_tail().
+	 */
+	smp_rmb();
+
 	/*
 	 * Any use of migration entries may only occur while the
 	 * corresponding page is locked
 	 */
-	BUG_ON(!PageLocked(compound_head(p)));
+	BUG_ON(!PageLocked(head));
+}
+
+static inline struct page *migration_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+
+	migration_entry_sync_page(compound_head(p));
+
 	return p;
 }
-- 
2.43.0
[PATCH OLK-5.10] mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
by Jinjiang Tu 01 Apr '26

mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8836
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

On an arm64 server, we found that the folio obtained from a migration entry isn't locked in softleaf_to_folio(). This issue triggers when mTHP splitting and zap_nonpresent_ptes() race, and the root cause is a missing memory barrier in softleaf_to_folio(). The race is as follows:

CPU0                                     CPU1
deferred_split_scan()                    zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()
                                           softleaf_to_folio()
      set flags(including PG_locked)
      for tail pages
                                             folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()
                                             VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees that the page flags of tail pages are visible before the tail page becomes non-compound. That smp_wmb() should be paired with an smp_rmb() in softleaf_to_folio(), which is missing. As a result, if zap_nonpresent_ptes() accesses a migration entry that stores a tail pfn, softleaf_to_folio() may observe the updated compound_head of the tail page before page->flags.

To fix it, add the missing smp_rmb() when the softleaf entry is a migration entry in softleaf_to_folio() and softleaf_to_page().
[tujinjiang(a)huawei.com: update function name and comments]
Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Acked-by: David Hildenbrand (Arm) <david(a)kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs(a)kernel.org>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Conflicts:
	include/linux/leafops.h
	include/linux/swapops.h
	mm/filemap.c
[migration entry hasn't been renamed to softleaf entry. Add new helper migration_entry_to_compound_page().]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
 include/linux/swapops.h | 31 ++++++++++++++++++++++++++++---
 mm/filemap.c            |  2 +-
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index e749c4c86b26..ed33367fb6a6 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -194,17 +194,42 @@ static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
 	return swp_offset(entry);
 }
 
-static inline struct page *migration_entry_to_page(swp_entry_t entry)
+static inline void migration_entry_sync_page(struct page *head)
 {
-	struct page *p = pfn_to_page(swp_offset(entry));
+	/*
+	 * Ensure we do not race with split, which might alter tail pages into new
+	 * head pages and thus result in observing an unlocked page.
+	 * This matches the write barrier in __split_huge_page_tail().
+	 */
+	smp_rmb();
+
 	/*
 	 * Any use of migration entries may only occur while the
 	 * corresponding page is locked
 	 */
-	BUG_ON(!PageLocked(compound_head(p)));
+	BUG_ON(!PageLocked(head));
+}
+
+static inline struct page *migration_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+
+	migration_entry_sync_page(compound_head(p));
+
 	return p;
 }
 
+static inline struct page *migration_entry_to_compound_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+	struct page *head;
+
+	head = compound_head(p);
+	migration_entry_sync_page(head);
+
+	return head;
+}
+
 static inline void make_migration_entry_read(swp_entry_t *entry)
 {
 	*entry = swp_entry(SWP_MIGRATION_READ, swp_offset(*entry));
diff --git a/mm/filemap.c b/mm/filemap.c
index 18e304ce6229..c2932db70212 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1334,7 +1334,7 @@ void migration_entry_wait_on_locked(swp_entry_t entry, pte_t *ptep,
 	bool delayacct = false;
 	unsigned long pflags = 0;
 	wait_queue_head_t *q;
-	struct page *page = compound_head(migration_entry_to_page(entry));
+	struct page *page = migration_entry_to_compound_page(entry);
 
 	q = page_waitqueue(page);
 	if (!PageUptodate(page) && PageWorkingset(page)) {
-- 
2.43.0
[PATCH OLK-6.6] mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
by Jinjiang Tu 01 Apr '26

mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8836
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

On an arm64 server, we found that the folio obtained from a migration entry isn't locked in softleaf_to_folio(). This issue triggers when mTHP splitting and zap_nonpresent_ptes() race, and the root cause is a missing memory barrier in softleaf_to_folio(). The race is as follows:

CPU0                                     CPU1
deferred_split_scan()                    zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()
                                           softleaf_to_folio()
      set flags(including PG_locked)
      for tail pages
                                             folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()
                                             VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees that the page flags of tail pages are visible before the tail page becomes non-compound. That smp_wmb() should be paired with an smp_rmb() in softleaf_to_folio(), which is missing. As a result, if zap_nonpresent_ptes() accesses a migration entry that stores a tail pfn, softleaf_to_folio() may observe the updated compound_head of the tail page before page->flags.

This issue triggers the VM_WARN_ON_ONCE() in pfn_swap_entry_folio(), because the race between folio split and zap_nonpresent_ptes() makes a folio appear to undergo modification without the folio lock being held. This was a BUG_ON() before commit 93976a20345b ("mm: eliminate further swapops predicates"), which was merged in v6.19-rc1.

To fix it, add the missing smp_rmb() when the softleaf entry is a migration entry in softleaf_to_folio() and softleaf_to_page().
[tujinjiang(a)huawei.com: update function name and comments]
Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Acked-by: David Hildenbrand (Arm) <david(a)kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs(a)kernel.org>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Conflicts:
	include/linux/leafops.h
	include/linux/swapops.h
[migration entry hasn't been renamed to softleaf entry.]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
 include/linux/swapops.h | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index b32d696242b6..7bb5937a3f3c 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -500,15 +500,28 @@ static inline int is_userswap_entry(swp_entry_t entry)
 }
 #endif
 
-static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
+static inline void migration_entry_sync(struct folio *folio)
 {
-	struct page *p = pfn_to_page(swp_offset_pfn(entry));
+	/*
+	 * Ensure we do not race with split, which might alter tail pages into new
+	 * folios and thus result in observing an unlocked folio.
+	 * This matches the write barrier in __split_folio_to_order().
+	 */
+	smp_rmb();
 
 	/*
 	 * Any use of migration entries may only occur while the
 	 * corresponding page is locked
 	 */
-	BUG_ON(is_migration_entry(entry) && !PageLocked(p));
+	BUG_ON(!folio_test_locked(folio));
+}
+
+static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset_pfn(entry));
+
+	if (is_migration_entry(entry))
+		migration_entry_sync(page_folio(p));
 
 	return p;
 }
@@ -517,11 +530,8 @@ static inline struct folio *pfn_swap_entry_folio(swp_entry_t entry)
 {
 	struct folio *folio = pfn_folio(swp_offset_pfn(entry));
 
-	/*
-	 * Any use of migration entries may only occur while the
-	 * corresponding folio is locked
-	 */
-	BUG_ON(is_migration_entry(entry) && !folio_test_locked(folio));
+	if (is_migration_entry(entry))
+		migration_entry_sync(folio);
 
 	return folio;
 }
-- 
2.43.0
[PATCH openEuler-1.0-LTS] mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
by Jinjiang Tu 31 Mar '26

mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://gitcode.com/openeuler/kernel/issues/8836
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

On an arm64 server, we found that the folio obtained from a migration entry isn't locked in softleaf_to_folio(). This issue triggers when mTHP splitting and zap_nonpresent_ptes() race, and the root cause is a missing memory barrier in softleaf_to_folio(). The race is as follows:

CPU0                                     CPU1
deferred_split_scan()                    zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()
                                           softleaf_to_folio()
      set flags(including PG_locked)
      for tail pages
                                             folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()
                                             VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees that the page flags of tail pages are visible before the tail page becomes non-compound. That smp_wmb() should be paired with an smp_rmb() in softleaf_to_folio(), which is missing. As a result, if zap_nonpresent_ptes() accesses a migration entry that stores a tail pfn, softleaf_to_folio() may observe the updated compound_head of the tail page before page->flags.

This issue triggers the VM_WARN_ON_ONCE() in pfn_swap_entry_folio(), because the race between folio split and zap_nonpresent_ptes() makes a folio appear to undergo modification without the folio lock being held. This was a BUG_ON() before commit 93976a20345b ("mm: eliminate further swapops predicates"), which was merged in v6.19-rc1.

To fix it, add the missing smp_rmb() when the softleaf entry is a migration entry in softleaf_to_folio() and softleaf_to_page().
[tujinjiang(a)huawei.com: update function name and comments] Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()") Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com> Acked-by: David Hildenbrand (Arm) <david(a)kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs(a)kernel.org> Cc: Barry Song <baohua(a)kernel.org> Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com> Cc: Liam Howlett <liam.howlett(a)oracle.com> Cc: Michal Hocko <mhocko(a)suse.com> Cc: Mike Rapoport <rppt(a)kernel.org> Cc: Nanyong Sun <sunnanyong(a)huawei.com> Cc: Ryan Roberts <ryan.roberts(a)arm.com> Cc: Suren Baghdasaryan <surenb(a)google.com> Cc: Vlastimil Babka <vbabka(a)kernel.org> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> Conflicts: include/linux/leafops.h include/linux/swapops.h [miragtion entry hasn't been renamed to softleaf entry.] Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com> --- include/linux/swapops.h | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 22af9d8a84ae..c742e778e024 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -205,14 +205,28 @@ static inline unsigned long migration_entry_to_pfn(swp_entry_t entry) return swp_offset(entry); } -static inline struct page *migration_entry_to_page(swp_entry_t entry) +static inline void migration_entry_sync_page(struct page *head) { - struct page *p = pfn_to_page(swp_offset(entry)); + /* + * Ensure we do not race with split, which might alter tail pages into new + * head pages and thus result in observing an unlocked page. + * This matches the write barrier in __split_huge_page_tail(). 
+ */ + smp_rmb(); + /* * Any use of migration entries may only occur while the * corresponding page is locked */ - BUG_ON(!PageLocked(compound_head(p))); + BUG_ON(!PageLocked(head)); +} + +static inline struct page *migration_entry_to_page(swp_entry_t entry) +{ + struct page *p = pfn_to_page(swp_offset(entry)); + + migration_entry_sync_page(compound_head(p)); + return p; } -- 2.43.0
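The pairing the patch restores can be illustrated in userspace. This is not kernel code: C11 release/acquire atomics stand in for the kernel's smp_wmb()/smp_rmb(), and all names are illustrative. The writer plays the split side (set the tail page's flags, then barrier, then publish); the reader plays the migration-entry side (observe the published value, then barrier, then check the flags). Without the acquire ordering, the reader could see the publish yet still read stale flags — exactly the race described above.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <assert.h>
#include <stddef.h>

static int page_flags;          /* stands in for page->flags (PG_locked) */
static atomic_int published;    /* stands in for the compound_head update */

static void *writer(void *arg)
{
	(void)arg;
	page_flags = 1;         /* "set PG_locked on the tail page" */
	/* release ordering = smp_wmb(): flags visible before the publish */
	atomic_store_explicit(&published, 1, memory_order_release);
	return NULL;
}

static void *reader(void *arg)
{
	int *seen = arg;

	/* acquire ordering = smp_rmb(): the flags read cannot be
	 * satisfied from before the publish was observed */
	while (!atomic_load_explicit(&published, memory_order_acquire))
		;
	*seen = page_flags;     /* must observe 1, never the stale 0 */
	return NULL;
}

/* Run one writer/reader round and report what the reader saw. */
int observed_flags(void)
{
	pthread_t w, r;
	int seen = -1;

	page_flags = 0;
	atomic_store(&published, 0);
	pthread_create(&r, NULL, reader, &seen);
	pthread_create(&w, NULL, writer, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return seen;
}
```

With the release/acquire pair in place, the reader is guaranteed to observe the flags write once it observes the publish; dropping either side reintroduces the window the BUG_ON() is guarding against.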
[PATCH OLK-5.10] mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
by Jinjiang Tu 31 Mar '26

mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://gitcode.com/openeuler/kernel/issues/8836
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

On an arm64 server, we found that the folio we get from a migration
entry isn't locked in softleaf_to_folio(). The issue triggers when an
mTHP split races with zap_nonpresent_ptes(), and the root cause is a
missing memory barrier in softleaf_to_folio(). The race is as follows:

CPU0                                    CPU1
deferred_split_scan()                   zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()            softleaf_to_folio()
      set flags (including PG_locked)
        for tail pages
                                          folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()
                                          VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees that the page flags
of the tail pages are visible before a tail page becomes non-compound.
That smp_wmb() should be paired with an smp_rmb() in
softleaf_to_folio(), which is missing. As a result, if
zap_nonpresent_ptes() accesses a migration entry that stores a tail
pfn, softleaf_to_folio() may see the updated compound_head of the tail
page before its page->flags.

To fix it, add the missing smp_rmb() when the softleaf entry is a
migration entry in softleaf_to_folio() and softleaf_to_page().

[tujinjiang(a)huawei.com: update function name and comments]
Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Acked-by: David Hildenbrand (Arm) <david(a)kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs(a)kernel.org>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>

Conflicts:
	include/linux/leafops.h
	include/linux/swapops.h
	mm/filemap.c
[migration entry hasn't been renamed to softleaf entry. Add new helper
migration_entry_to_compound_page().]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
 include/linux/swapops.h | 31 ++++++++++++++++++++++++++++---
 mm/filemap.c            |  2 +-
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index e749c4c86b26..ed33367fb6a6 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -194,17 +194,42 @@ static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
 	return swp_offset(entry);
 }
 
-static inline struct page *migration_entry_to_page(swp_entry_t entry)
+static inline void migration_entry_sync_page(struct page *head)
 {
-	struct page *p = pfn_to_page(swp_offset(entry));
+	/*
+	 * Ensure we do not race with split, which might alter tail pages into new
+	 * head pages and thus result in observing an unlocked page.
+	 * This matches the write barrier in __split_huge_page_tail().
+	 */
+	smp_rmb();
+
 	/*
	 * Any use of migration entries may only occur while the
	 * corresponding page is locked
	 */
-	BUG_ON(!PageLocked(compound_head(p)));
+	BUG_ON(!PageLocked(head));
+}
+
+static inline struct page *migration_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+
+	migration_entry_sync_page(compound_head(p));
+
 	return p;
 }
 
+static inline struct page *migration_entry_to_compound_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+	struct page *head;
+
+	head = compound_head(p);
+	migration_entry_sync_page(head);
+
+	return head;
+}
+
 static inline void make_migration_entry_read(swp_entry_t *entry)
 {
 	*entry = swp_entry(SWP_MIGRATION_READ, swp_offset(*entry));
diff --git a/mm/filemap.c b/mm/filemap.c
index 18e304ce6229..c2932db70212 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1334,7 +1334,7 @@ void migration_entry_wait_on_locked(swp_entry_t entry, pte_t *ptep,
 	bool delayacct = false;
 	unsigned long pflags = 0;
 	wait_queue_head_t *q;
-	struct page *page = compound_head(migration_entry_to_page(entry));
+	struct page *page = migration_entry_to_compound_page(entry);
 
 	q = page_waitqueue(page);
 	if (!PageUptodate(page) && PageWorkingset(page)) {
-- 
2.43.0
[PATCH OLK-6.6] mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
by Jinjiang Tu 31 Mar '26

mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://gitcode.com/openeuler/kernel/issues/8836
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

On an arm64 server, we found that the folio we get from a migration
entry isn't locked in softleaf_to_folio(). The issue triggers when an
mTHP split races with zap_nonpresent_ptes(), and the root cause is a
missing memory barrier in softleaf_to_folio(). The race is as follows:

CPU0                                    CPU1
deferred_split_scan()                   zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()            softleaf_to_folio()
      set flags (including PG_locked)
        for tail pages
                                          folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()
                                          VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees that the page flags
of the tail pages are visible before a tail page becomes non-compound.
That smp_wmb() should be paired with an smp_rmb() in
softleaf_to_folio(), which is missing. As a result, if
zap_nonpresent_ptes() accesses a migration entry that stores a tail
pfn, softleaf_to_folio() may see the updated compound_head of the tail
page before its page->flags.

This issue triggers the VM_WARN_ON_ONCE() in pfn_swap_entry_folio()
because the race between the folio split and zap_nonpresent_ptes()
leads to a folio being modified without the folio lock held. This was
a BUG_ON() before commit 93976a20345b ("mm: eliminate further swapops
predicates"), which was merged in v6.19-rc1.

To fix it, add the missing smp_rmb() when the softleaf entry is a
migration entry in softleaf_to_folio() and softleaf_to_page().

[tujinjiang(a)huawei.com: update function name and comments]
Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Acked-by: David Hildenbrand (Arm) <david(a)kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs(a)kernel.org>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>

Conflicts:
	include/linux/leafops.h
	include/linux/swapops.h
[migration entry hasn't been renamed to softleaf entry.]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
 include/linux/swapops.h | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index b32d696242b6..7bb5937a3f3c 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -500,15 +500,28 @@ static inline int is_userswap_entry(swp_entry_t entry)
 }
 #endif
 
-static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
+static inline void migration_entry_sync(struct folio *folio)
 {
-	struct page *p = pfn_to_page(swp_offset_pfn(entry));
+	/*
+	 * Ensure we do not race with split, which might alter tail pages into new
+	 * folios and thus result in observing an unlocked folio.
+	 * This matches the write barrier in __split_folio_to_order().
+	 */
+	smp_rmb();
 
 	/*
	 * Any use of migration entries may only occur while the
	 * corresponding page is locked
	 */
-	BUG_ON(is_migration_entry(entry) && !PageLocked(p));
+	BUG_ON(!folio_test_locked(folio));
+}
+
+static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset_pfn(entry));
+
+	if (is_migration_entry(entry))
+		migration_entry_sync(page_folio(p));
 
 	return p;
 }
@@ -517,11 +530,8 @@ static inline struct folio *pfn_swap_entry_folio(swp_entry_t entry)
 {
 	struct folio *folio = pfn_folio(swp_offset_pfn(entry));
 
-	/*
-	 * Any use of migration entries may only occur while the
-	 * corresponding folio is locked
-	 */
-	BUG_ON(is_migration_entry(entry) && !folio_test_locked(folio));
+	if (is_migration_entry(entry))
+		migration_entry_sync(folio);
 
 	return folio;
-- 
2.43.0
[PATCH OLK-6.6 v4 0/2] kvm: arm64: Transition from CPU Type to MIDR Register for Virtualization Feature Detection
by liqiqi 31 Mar '26

Currently, there are two methods for determining whether a chip supports
specific virtualization features:

1. Reading the chip's CPU type from BIOS
2. Reading the value of the MIDR register

The issue with the first method is that each time a new chip is
introduced, the new CPU type must be defined, which leads to poor code
portability and maintainability. Therefore, the second method has been
adopted to replace the first. This approach eliminates the dependency
on CPU type by using the MIDR register.

liqiqi (2):
  kvm: arm64: Add MIDR definitions and use MIDR to determine whether
    features are supported
  kvm: arm64: Remove cpu_type definition and it's related interfaces

 arch/arm64/include/asm/cache.h           |   2 +-
 arch/arm64/include/asm/cputype.h         |   8 +-
 arch/arm64/kernel/cpu_errata.c           |   4 +-
 arch/arm64/kernel/cpufeature.c           |   2 +-
 arch/arm64/kernel/proton-pack.c          |   4 +-
 arch/arm64/kvm/arm.c                     |   1 -
 arch/arm64/kvm/hisilicon/hisi_virt.c     | 111 +++--------------------
 arch/arm64/kvm/hisilicon/hisi_virt.h     |  12 ---
 drivers/perf/hisilicon/hisi_uncore_pmu.c |   2 +-
 tools/arch/arm64/include/asm/cputype.h   |   4 +-
 10 files changed, 26 insertions(+), 124 deletions(-)

-- 
2.43.0
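The reason matching on MIDR needs no per-chip BIOS table is that MIDR_EL1 encodes the implementer and part number directly. A userspace sketch of the field decoding (layout per the Arm Architecture Reference Manual; the struct and function names are illustrative, not the series' actual API):

```c
#include <stdint.h>

/* MIDR_EL1 field layout: implementer [31:24], variant [23:20],
 * architecture [19:16], part number [15:4], revision [3:0]. */
struct midr_fields {
	uint8_t  implementer;
	uint8_t  variant;
	uint8_t  architecture;
	uint16_t partnum;
	uint8_t  revision;
};

struct midr_fields midr_decode(uint32_t midr)
{
	struct midr_fields f = {
		.implementer  = (midr >> 24) & 0xff,
		.variant      = (midr >> 20) & 0xf,
		.architecture = (midr >> 16) & 0xf,
		.partnum      = (midr >> 4)  & 0xfff,
		.revision     = midr & 0xf,
	};
	return f;
}
```

For example, a HiSilicon TSV110 core reports implementer 0x48 (HiSilicon) and part number 0xd01, so a feature check can match on those fields without any externally provided CPU-type enum.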
[PATCH OLK-5.10] xfs: fix internal error from AGFL exhaustion
by Long Li 31 Mar '26

From: Omar Sandoval <osandov(a)fb.com>

mainline inclusion
from mainline-v6.7-rc1
commit f63a5b3769ad7659da4c0420751d78958ab97675
category: bugfix
bugzilla: https://atomgit.com/src-openeuler/kernel/issues/14036
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…

--------------------------------

We've been seeing XFS errors like the following:

XFS: Internal error i != 1 at line 3526 of file fs/xfs/libxfs/xfs_btree.c. Caller xfs_btree_insert+0x1ec/0x280
...
Call Trace:
 xfs_corruption_error+0x94/0xa0
 xfs_btree_insert+0x221/0x280
 xfs_alloc_fixup_trees+0x104/0x3e0
 xfs_alloc_ag_vextent_size+0x667/0x820
 xfs_alloc_fix_freelist+0x5d9/0x750
 xfs_free_extent_fix_freelist+0x65/0xa0
 __xfs_free_extent+0x57/0x180
 ...

This is the XFS_IS_CORRUPT() check in xfs_btree_insert() when
xfs_btree_insrec() fails.

After converting this into a panic and dissecting the core dump, I
found that xfs_btree_insrec() is failing because it's trying to split
a leaf node in the cntbt when the AG free list is empty. In
particular, it's failing to get a block from the AGFL _while trying to
refill the AGFL_.

If a single operation splits every level of the bnobt and the cntbt
(and the rmapbt if it is enabled) at once, the free list will be
empty. Then, when the next operation tries to refill the free list, it
allocates space. If the allocation does not use a full extent, it will
need to insert records for the remaining space in the bnobt and cntbt.
And if those new records go in full leaves, the leaves (and
potentially more nodes up to the old root) need to be split.

Fix it by accounting for the additional splits that may be required to
refill the free list in the calculation for the minimum free list
size.

P.S. As far as I can tell, this bug has existed for a long time --
maybe back to xfs-history commit afdf80ae7405 ("Add XFS_AG_MAXLEVELS
macros ...") in April 1994! It requires a very unlucky sequence of
events, and in fact we didn't hit it until a particular sparse mmap
workload updated from 5.12 to 5.19. But this bug existed in 5.12, so
it must've been exposed by some other change in allocation or
writeback patterns. It's also much less likely to be hit with the
rmapbt enabled, since that increases the minimum free list size and is
unlikely to split at the same time as the bnobt and cntbt.

Reviewed-by: "Darrick J. Wong" <djwong(a)kernel.org>
Reviewed-by: Dave Chinner <dchinner(a)redhat.com>
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: Chandan Babu R <chandanbabu(a)kernel.org>

Conflicts:
	fs/xfs/libxfs/xfs_alloc.c
[Context conflicts]
Signed-off-by: Long Li <leo.lilong(a)huawei.com>
---
 fs/xfs/libxfs/xfs_alloc.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 23c0e666d2f4..15dce9276d45 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2328,16 +2328,37 @@ xfs_alloc_min_freelist(
 	ASSERT(mp->m_ag_maxlevels > 0);
 
+	/*
+	 * For a btree shorter than the maximum height, the worst case is that
+	 * every level gets split and a new level is added, then while inserting
+	 * another entry to refill the AGFL, every level under the old root gets
+	 * split again. This is:
+	 *
+	 * (full height split reservation) + (AGFL refill split height)
+	 * = (current height + 1) + (current height - 1)
+	 * = (new height) + (new height - 2)
+	 * = 2 * new height - 2
+	 *
+	 * For a btree of maximum height, the worst case is that every level
+	 * under the root gets split, then while inserting another entry to
+	 * refill the AGFL, every level under the root gets split again. This is
+	 * also:
+	 *
+	 * 2 * (current height - 1)
+	 * = 2 * (new height - 1)
+	 * = 2 * new height - 2
+	 */
+
 	/* space needed by-bno freespace btree */
 	min_free = min_t(unsigned int, levels[XFS_BTNUM_BNOi] + 1,
-				       mp->m_ag_maxlevels);
+				       mp->m_ag_maxlevels) * 2 - 2;
 
 	/* space needed by-size freespace btree */
 	min_free += min_t(unsigned int, levels[XFS_BTNUM_CNTi] + 1,
-				       mp->m_ag_maxlevels);
+				       mp->m_ag_maxlevels) * 2 - 2;
 
 	/* space needed reverse mapping used space btree */
 	if (xfs_has_rmapbt(mp))
 		min_free += min_t(unsigned int, levels[XFS_BTNUM_RMAPi] + 1,
-				mp->m_rmap_maxlevels);
+				mp->m_rmap_maxlevels) * 2 - 2;
 
 	return min_free;
 }
-- 
2.39.2
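The "2 * new height - 2" reservation from the patch's comment can be sketched as a standalone helper. The names here are hypothetical (the real xfs_alloc_min_freelist() also handles the rmapbt and works on the mount's level arrays); this just shows the arithmetic for the two free-space btrees:

```c
/*
 * Worst-case block reservation for one btree: a full-height split
 * (which adds a level), then a second split of every level under the
 * new root while the AGFL is being refilled: 2 * new_height - 2.
 * Hypothetical helper, not the kernel function.
 */
static unsigned int worst_case_split_blocks(unsigned int cur_height,
					    unsigned int max_height)
{
	unsigned int new_height = cur_height + 1;

	if (new_height > max_height)	/* mirrors the min_t() clamp */
		new_height = max_height;
	return 2 * new_height - 2;
}

/* AGFL minimum for the by-bno and by-size btrees (rmapbt omitted). */
unsigned int min_freelist_blocks(unsigned int bno_levels,
				 unsigned int cnt_levels,
				 unsigned int max_levels)
{
	return worst_case_split_blocks(bno_levels, max_levels) +
	       worst_case_split_blocks(cnt_levels, max_levels);
}
```

For two single-level btrees this reserves 2 + 2 = 4 blocks, whereas the pre-patch formula (just the post-split height, clamped) would have reserved only 2 + 2 as well at height 1 but falls short as the trees grow — at maximum height 5 the fixed formula reserves 2 * 5 - 2 = 8 blocks per tree instead of 5.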
[PATCH OLK-6.6 v3 0/3] kvm: arm64: Transition from CPU Type to MIDR Register for Virtualization Feature Detection
by liqiqi 31 Mar '26

Currently, there are two methods for determining whether a chip supports
specific virtualization features:

1. Reading the chip's CPU type from BIOS
2. Reading the value of the MIDR register

The issue with the first method is that each time a new chip is
introduced, the new CPU type must be defined, which leads to poor code
portability and maintainability. Therefore, the second method has been
adopted to replace the first. This approach eliminates the dependency
on CPU type by using the MIDR register.

liqiqi (3):
  kvm: arm64: Add HIP08, HIP10, HIP10C MIDR definitions
  kvm: arm64: use MIDR to determine whether features are supported
  kvm: arm64: Remove cpu_type definition and it's related interfaces

 arch/arm64/include/asm/cache.h           |   2 +-
 arch/arm64/include/asm/cputype.h         |   8 +-
 arch/arm64/kernel/cpu_errata.c           |   4 +-
 arch/arm64/kernel/cpufeature.c           |   2 +-
 arch/arm64/kernel/proton-pack.c          |   4 +-
 arch/arm64/kvm/arm.c                     |   1 -
 arch/arm64/kvm/hisilicon/hisi_virt.c     | 111 +++--------------------
 arch/arm64/kvm/hisilicon/hisi_virt.h     |  12 ---
 drivers/perf/hisilicon/hisi_uncore_pmu.c |   2 +-
 tools/arch/arm64/include/asm/cputype.h   |   4 +-
 10 files changed, 26 insertions(+), 124 deletions(-)

-- 
2.43.0
