On Sun, Mar 21, 2021 at 11:14:32AM +1300, Barry Song wrote:
update_idle_core() is only done for the case of sched_smt_present. but test_idle_cores() is done for all machines even those without smt. this could contribute to up 8%+ hackbench performance loss on a machine like kunpeng 920 which has no smt. this patch removes the redundant test_idle_cores() for non-smt machines.
we run the below hackbench with different -g parameter from 2 to 14, for each different g, we run the command 10 times and get the average time: $ numactl -N 0 hackbench -p -T -l 20000 -g $1
hackbench will report the time which is needed to complete a certain number of messages transmissions between a certain number of tasks, for example: $ numactl -N 0 hackbench -p -T -l 20000 -g 10 Running in threaded mode with 10 groups using 40 file descriptors each (== 400 tasks) Each sender will pass 20000 messages of 100 bytes
The below is the result of hackbench w/ and w/o this patch: g= 2 4 6 8 10 12 14 w/o: 1.8151 3.8499 5.5142 7.2491 9.0340 10.7345 12.0929 w/ : 1.8428 3.7436 5.4501 6.9522 8.2882 9.9535 11.3367 +4.1% +8.3% +7.3% +6.3%
The patch looks obvious, but the Changelog and Subject needed a lot of help.
I've changed it like so:
--- Subject: sched/fair: Optimize test_idle_cores() for !SMT From: Barry Song song.bao.hua@hisilicon.com Date: Sun, 21 Mar 2021 11:14:32 +1300
From: Barry Song song.bao.hua@hisilicon.com
update_idle_core() is only done for the case of sched_smt_present. but test_idle_cores() is done for all machines even those without SMT.
This can contribute to up 8%+ hackbench performance loss on a machine like kunpeng 920 which has no SMT. This patch removes the redundant test_idle_cores() for !SMT machines.
Hackbench is ran with -g {2..14}, for each g it is ran 10 times to get an average.
$ numactl -N 0 hackbench -p -T -l 20000 -g $1
The below is the result of hackbench w/ and w/o this patch:
g= 2 4 6 8 10 12 14 w/o: 1.8151 3.8499 5.5142 7.2491 9.0340 10.7345 12.0929 w/ : 1.8428 3.7436 5.4501 6.9522 8.2882 9.9535 11.3367 +4.1% +8.3% +7.3% +6.3%
Signed-off-by: Barry Song song.bao.hua@hisilicon.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Mel Gorman mgorman@suse.de Link: https://lkml.kernel.org/r/20210320221432.924-1-song.bao.hua@hisilicon.com --- kernel/sched/fair.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6038,9 +6038,11 @@ static inline bool test_idle_cores(int c { struct sched_domain_shared *sds;
- sds = rcu_dereference(per_cpu(sd_llc_shared, cpu)); - if (sds) - return READ_ONCE(sds->has_idle_cores); + if (static_branch_likely(&sched_smt_present)) { + sds = rcu_dereference(per_cpu(sd_llc_shared, cpu)); + if (sds) + return READ_ONCE(sds->has_idle_cores); + }
return def; }