hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IAHJKC
CVE: NA
--------------------------------
Updating numa_scan_seq depends on the NUMA scanning work, which is skipped when sampling-based NUMA affinity is enabled, so numa_scan_seq always stays at 0 and the early-stage check in should_numa_migrate_memory() is always satisfied. Skip that check in this case to avoid false migrations.
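For context, an abridged sketch of the mainline kernel/sched/fair.c update path (illustration only, not part of this patch; declarations and surrounding logic omitted) shows why numa_scan_seq stays at 0 when the scanning work never runs:

    /* p->numa_scan_seq only advances after task_numa_work() has
     * bumped mm->numa_scan_seq.
     */
    static void task_numa_work(struct callback_head *work)
    {
            struct mm_struct *mm = current->mm;

            /* ... VMA scanning; this work is skipped entirely when
             * sampling-based NUMA affinity supplies the access data ...
             */
            WRITE_ONCE(mm->numa_scan_seq, READ_ONCE(mm->numa_scan_seq) + 1);
    }

    static void task_numa_placement(struct task_struct *p)
    {
            unsigned int seq = READ_ONCE(p->mm->numa_scan_seq);

            if (p->numa_scan_seq == seq)
                    return;         /* stays 0 forever under sampling */
            p->numa_scan_seq = seq;
            /* ... */
    }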
A Spark benchmark shows a 1%~2% performance improvement after applying this patch.
Fixes: bdc4701337d7 ("mm/mem_sampling.c: Drive NUMA balancing via mem_sampling access data")
Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
---
 kernel/sched/fair.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2139edac2cb1..d22936de5714 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1423,6 +1423,20 @@ static inline unsigned long group_weight(struct task_struct *p, int nid,
 	return 1000 * faults / total_faults;
 }
 
+static inline bool in_early_stage(struct task_struct *p, int early_seq)
+{
+	/*
+	 * For sampling based autonuma, numa_scan_seq is never updated.
+	 * Currently, just skip here to avoid false migration. In the future,
+	 * a real lifetime judgment can be implemented if the workloads are
+	 * very sensitive to the starting stage of the process.
+	 */
+	if (numa_affinity_sampling_enabled())
+		return false;
+
+	return p->numa_scan_seq <= early_seq;
+}
+
 bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
 				int src_nid, int dst_cpu)
 {
@@ -1439,7 +1453,7 @@ bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
 	 * two full passes of the "multi-stage node selection" test that is
 	 * executed below.
 	 */
-	if ((p->numa_preferred_nid == NUMA_NO_NODE || p->numa_scan_seq <= 4) &&
+	if ((p->numa_preferred_nid == NUMA_NO_NODE || in_early_stage(p, 4)) &&
 	    (cpupid_pid_unset(last_cpupid) || cpupid_match_pid(p, last_cpupid)))
 		return true;