On 01/02/21 16:38, Barry Song wrote:
@@ -964,6 +941,12 @@ static void init_overlap_sched_group(struct sched_domain *sd,
build_balance_mask(sd, sg, mask); cpu = cpumask_first_and(sched_group_span(sg), mask);
/*
* for the group generated by grandchild, use the sgc of 2nd cpu
* because the 1st cpu might be used by another sched_group
*/
if (from_grandchild && cpumask_weight(mask) > 1)
cpu = cpumask_next_and(cpu, sched_group_span(sg), mask);
sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
So you are getting a (hopefully) unique ID for this group span at this given topology level (i.e. sd->private) but as I had stated in that list of issues, this creates an sgc that isn't attached to the local group of any sched_domain, and thus won't get its capacity values updated.
This can actually be seen via the capacity values you're getting at build time:
[ 0.868907] CPU0 attaching sched-domain(s):
...
[ 0.869542] domain-2: span=0-5 level=NUMA [ 0.869559] groups: 0:{ span=0-3 cap=4002 }, 5:{ span=4-5 cap=2048 }
^^^^^^^^^^^^^^^^
[ 0.871177] CPU4 attaching sched-domain(s):
...
[ 0.871200] groups: 4:{ span=4 cap=977 }, 5:{ span=5 cap=1001 } [ 0.871243] domain-1: span=4-7 level=NUMA [ 0.871257] groups: 4:{ span=4-5 cap=1978 }, 6:{ span=6-7 cap=1968 }
^^^^^^^^^^^^^^^^
IMO what we want to do here is to hook this CPU0-domain-2-group5 to the sgc of CPU4-domain1-group4. I've done that in the below diff - this gives us groups with sgc's owned at lower topology levels, but this will only ever be true for non-local groups. This has the added benefit of working with single-CPU nodes. Briefly tested on your topology and the sunfire's (via QEMU), and I didn't get screamed at.
Before the fun police comes and impounds my keyboard, I'd like to point out that we could leverage this cross-level sgc referencing hack to further change the NUMA domains and pretty much get rid of overlapping groups (that's what I was fumbling with in [1]).
[1]: http://lore.kernel.org/r/jhjwnw11ak2.mognet@arm.com
That is, rather than building overlapping groups and fixing them whenever that breaks (distance > 2), we could have: - the local group being the child domain's span (as always) - all non-local NUMA groups spanning a single node each, with the right sgc cross-referencing.
Thoughts?
--->8--- diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c index b748999c9e11..ef43abb6b1fb 100644 --- a/kernel/sched/topology.c +++ b/kernel/sched/topology.c @@ -932,21 +932,15 @@ build_group_from_child_sched_domain(struct sched_domain *sd, int cpu)
static void init_overlap_sched_group(struct sched_domain *sd, struct sched_group *sg, - int from_grandchild) + struct sched_domain *grandchild) { struct cpumask *mask = sched_domains_tmpmask2; - struct sd_data *sdd = sd->private; + struct sd_data *sdd = grandchild ? grandchild->private : sd->private; struct cpumask *sg_span; int cpu;
build_balance_mask(sd, sg, mask); cpu = cpumask_first_and(sched_group_span(sg), mask); - /* - * for the group generated by grandchild, use the sgc of 2nd cpu - * because the 1st cpu might be used by another sched_group - */ - if (from_grandchild && cpumask_weight(mask) > 1) - cpu = cpumask_next_and(cpu, sched_group_span(sg), mask);
sg->sgc = *per_cpu_ptr(sdd->sgc, cpu); if (atomic_inc_return(&sg->sgc->ref) == 1) @@ -979,7 +973,7 @@ build_overlap_sched_groups(struct sched_domain *sd, int cpu)
for_each_cpu_wrap(i, span, cpu) { struct cpumask *sg_span; - int from_grandchild = 0; + bool from_grandchild = false;
if (cpumask_test_cpu(i, covered)) continue; @@ -1033,7 +1027,7 @@ build_overlap_sched_groups(struct sched_domain *sd, int cpu) !cpumask_subset(sched_domain_span(sibling->child), span)) { sibling = sibling->child; - from_grandchild = 1; + from_grandchild = true; }
sg = build_group_from_child_sched_domain(sibling, cpu); @@ -1043,7 +1037,7 @@ build_overlap_sched_groups(struct sched_domain *sd, int cpu) sg_span = sched_group_span(sg); cpumask_or(covered, covered, sg_span);
- init_overlap_sched_group(sd, sg, from_grandchild); + init_overlap_sched_group(sd, sg, from_grandchild ? sibling : NULL);
if (!first) first = sg;