On 25/01/2021 11:50, Song Bao Hua (Barry Song) wrote:
-----Original Message-----
From: Dietmar Eggemann [mailto:dietmar.eggemann@arm.com]
Sent: Wednesday, January 13, 2021 12:00 AM
To: Morten Rasmussen <morten.rasmussen@arm.com>; Tim Chen <tim.c.chen@linux.intel.com>
Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>; valentin.schneider@arm.com; catalin.marinas@arm.com; will@kernel.org; rjw@rjwysocki.net; vincent.guittot@linaro.org; lenb@kernel.org; gregkh@linuxfoundation.org; Jonathan Cameron <jonathan.cameron@huawei.com>; mingo@redhat.com; peterz@infradead.org; juri.lelli@redhat.com; rostedt@goodmis.org; bsegall@google.com; mgorman@suse.de; mark.rutland@arm.com; sudeep.holla@arm.com; aubrey.li@linux.intel.com; linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org; linux-acpi@vger.kernel.org; linuxarm@openeuler.org; xuwei (O) <xuwei5@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; tiantao (H) <tiantao6@hisilicon.com>
Subject: Re: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters and add cluster scheduler
On 11/01/2021 10:28, Morten Rasmussen wrote:
On Fri, Jan 08, 2021 at 12:22:41PM -0800, Tim Chen wrote:
On 1/8/21 7:12 AM, Morten Rasmussen wrote:
On Thu, Jan 07, 2021 at 03:16:47PM -0800, Tim Chen wrote:
On 1/6/21 12:30 AM, Barry Song wrote:
[...]
wake_wide() switches between packing (select_idle_sibling(), llc_size CPUs) and spreading (find_idlest_cpu(), all CPUs).
AFAICS, since none of the sched domains set SD_BALANCE_WAKE, currently all wakeups are (llc-)packed.
Sorry for the late response. I was struggling with some other topology issues recently.
For "all wakeups are (llc-)packed", it seems you mean that want_affine currently only affects new_cpu, and that on the wake-up path we always go to select_idle_sibling() rather than find_idlest_cpu(), since nobody sets SD_BALANCE_WAKE in any sched_domain?
select_task_rq_fair()

  for_each_domain(cpu, tmp)

    if (tmp->flags & sd_flag)
      sd = tmp;
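To make the consequence of that loop concrete, here is a minimal userspace sketch, not kernel code. It assumes a simplified domain hierarchy linked by parent pointers and leaves out the want_affine/SD_WAKE_AFFINE handling and the early break of the real loop. All `model_*` names and flag values are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for this model only; not the kernel's values. */
enum {
	MODEL_SD_BALANCE_WAKE = 1 << 0,
	MODEL_SD_BALANCE_FORK = 1 << 1,
};

struct model_domain {
	unsigned int flags;                /* balance flags set on this level */
	const struct model_domain *parent; /* next wider level, NULL at the top */
};

enum model_path {
	MODEL_PATH_SIS,         /* fast path: select_idle_sibling(), LLC span */
	MODEL_PATH_FIND_IDLEST, /* slow path: find_idlest_cpu(), span of sd */
};

/*
 * Simplified model of the domain walk: sd stays NULL unless some level
 * carries sd_flag, and a NULL sd means the wakeup takes the fast path.
 */
static enum model_path model_pick_path(const struct model_domain *lowest,
				       unsigned int sd_flag)
{
	const struct model_domain *sd = NULL;
	const struct model_domain *tmp;

	assert(sd_flag != 0); /* the model expects one balance flag */

	/* Walk bottom-up and remember the widest level carrying sd_flag. */
	for (tmp = lowest; tmp; tmp = tmp->parent)
		if (tmp->flags & sd_flag)
			sd = tmp;

	return sd ? MODEL_PATH_FIND_IDLEST : MODEL_PATH_SIS;
}
```

With a two-level hierarchy where no level sets the wake flag, model_pick_path(&mc, MODEL_SD_BALANCE_WAKE) yields MODEL_PATH_SIS, which is the "all wakeups are (llc-)packed" case discussed here.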
In case we would like to further distinguish between llc-packing and even narrower (cluster or MC-L2)-packing, we would introduce a 2. level packing vs. spreading heuristic further down in sis().
I didn't get your point on "2 level packing". Could you describe it in more detail? It seems you mean we need separate calculations of avg_scan_cost and sched_feat(SIS_) for cluster (or MC-L2), since cluster and llc are not at the same level physically?
By '1. level packing' I meant going through sis() (i.e. sd = per_cpu(sd_llc, target)) instead of routing WF_TTWU through find_idlest_cpu(), which uses a broader sd span (in case all sd's, or at least those up to an sd > llc, had SD_BALANCE_WAKE set). wake_wide() (the wakee/waker flip heuristic) is currently used to make this decision. But since no sd sets SD_BALANCE_WAKE, we always go through sis() for WF_TTWU.
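For reference, the wakee/waker flip heuristic can be sketched in userspace roughly as follows. This is modeled on the shape of wake_wide() in kernel/sched/fair.c as I understand it around this kernel version, so treat the exact comparison as an assumption and check the tree. The function name and its plain integer parameters are hypothetical; the kernel reads the flip counts from the task structs and the factor from sd_llc_size.

```c
#include <assert.h>

/*
 * Userspace model of the flip heuristic. Returns 1 when the waker/wakee
 * pair looks like a wide (1:N) relationship relative to the LLC size,
 * 0 when packing near the waker is preferred.
 */
static int model_wake_wide(unsigned int waker_flips, unsigned int wakee_flips,
			   unsigned int llc_size)
{
	unsigned int master = waker_flips;
	unsigned int slave = wakee_flips;

	assert(llc_size > 0);

	/* Make master the larger of the two flip counts. */
	if (master < slave) {
		unsigned int tmp = master;

		master = slave;
		slave = tmp;
	}

	/* Few flips, or a ratio below llc_size: stay affine (pack). */
	if (slave < llc_size || master < slave * llc_size)
		return 0;

	return 1;
}
```

Note that a return value of 1 only drops the affine preference. Without SD_BALANCE_WAKE on any sd, the wakeup still ends up in sis(), as described above.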
'2. level packing' would be the decision between cluster- and llc-packing. The question was which heuristic could be used here.
IMHO, Barry's current implementation doesn't do this right now. Instead, he's trying to pack on the cluster first and, if that is not successful, to look further among the remaining llc CPUs for an idle CPU.
Yes. That is exactly what the current patch is doing.
And this will be favoring cluster- over llc-packing for each task instead.
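The "cluster first, then the rest of the llc" behaviour described above can be illustrated with a small userspace sketch. It is not the patch itself: CPUs are modeled as bits in a 64-bit mask, the scan starts at the lowest CPU number rather than at the target CPU, and all `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return the lowest CPU number set in mask, or -1 if mask is empty. */
static int model_first_cpu(uint64_t mask)
{
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;

	return -1;
}

/*
 * Pick an idle CPU: try the target's cluster first, and only if no CPU
 * there is idle, fall back to the remaining CPUs of the LLC.
 */
static int model_select_idle_cpu(uint64_t idle, uint64_t cluster_span,
				 uint64_t llc_span)
{
	int cpu;

	/* In this model the cluster is a subset of the LLC. */
	assert((cluster_span & ~llc_span) == 0);

	cpu = model_first_cpu(idle & cluster_span);
	if (cpu >= 0)
		return cpu;

	return model_first_cpu(idle & llc_span & ~cluster_span);
}
```

With a hypothetical 4-CPU cluster (mask 0xF) inside a 24-CPU LLC (mask 0xFFFFFF), an idle CPU 3 is chosen even if CPU 5 is idle as well, which is the per-task preference for cluster- over llc-packing noted above.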