MPAM bugfix @ 20210224
James Morse (10):
  arm64/mpam: Add mpam driver discovery phase and kbuild boiler plate
  cacheinfo: Provide a helper to find a cacheinfo leaf
  arm64/mpam: Probe supported partid/pmg ranges from devices
  arm64/mpam: Supplement MPAM MSC register layout definitions
  arm64/mpam: Probe the features resctrl supports
  arm64/mpam: Reset controls when CPUs come online
  arm64/mpam: Summarize feature support during mpam_enable()
  arm64/mpam: resctrl: Re-synchronise resctrl's view of online CPUs
  drivers: base: cacheinfo: Add helper to search cacheinfo by of_node
  arm64/mpam: Enabling registering and logging error interrupts
Wang ShaoBo (55):
  arm64/mpam: Preparing for MPAM refactoring
  arm64/mpam: Add helper for getting mpam sysprops
  arm64/mpam: Allocate mpam component configuration arrays
  arm64/mpam: Pick MPAM resources and events for resctrl_res exported
  arm64/mpam: Init resctrl resources' info from resctrl_res selected
  arm64/mpam: resctrl: Handle cpuhp and resctrl_dom allocation
  arm64/mpam: Implement helpers for handling configuration and monitoring
  arm64/mpam: Migrate old MSCs' discovery process to new branch
  arm64/mpam: Add helper for getting MSCs' configuration
  arm64/mpam: Probe partid,pmg and feature capabilities' ranges from classes
  arm64/mpam: resctrl: Rebuild configuration and monitoring pipeline
  arm64/mpam: resctrl: Append schemata CDP definitions
  arm64/mpam: resctrl: Supplement cdpl2,cdpl3 for mount options
  arm64/mpam: resctrl: Add helpers for init and destroy schemata list
  arm64/mpam: resctrl: Use resctrl_group_init_alloc() to init schema list
  arm64/mpam: resctrl: Write and read schemata by schema_list
  arm64/mpam: Support cdp in mpam_sched_in()
  arm64/mpam: resctrl: Update resources reset process
  arm64/mpam: resctrl: Update closid alloc and free process with bitmap
  arm64/mpam: resctrl: Move ctrlmon sysfile write/read function to mpam_ctrlmon.c
  arm64/mpam: Support cdp on allocating monitors
  arm64/mpam: resctrl: Support cdp on monitoring data
  arm64/mpam: Clean up header files and rearrange declarations
  arm64/mpam: resctrl: Remove ctrlmon sysfile
  arm64/mpam: resctrl: Remove unnecessary CONFIG_ARM64
  arm64/mpam: Implement intpartid narrowing process
  arm64/mpam: Using software-defined id for rdtgroup instead of 32-bit integer
  arm64/mpam: resctrl: collect child mon group's monitor data
  arm64/mpam: resctrl: Support cpus' monitoring for mon group
  arm64/mpam: resctrl: Support priority and hardlimit(Memory bandwidth) configuration
  arm64/mpam: Store intpri and dspri for mpam device reset
  arm64/mpam: Squash default priority from mpam device to class
  arm64/mpam: Restore extend ctrls' max width for checking schemata input
  arm64/mpam: Re-plan intpartid narrowing process
  arm64/mpam: Add hook-events id for ctrl features
  arm64/mpam: Integrate monitor data for Memory Bandwidth if cdp enabled
  arm64/mpam: Fix MPAM_ESR intPARTID_range error
  arm64/mpam: Separate internal and downstream priority event
  arm64/mpam: Remap reqpartid,pmg to rmid and intpartid to closid
  arm64/mpam: Add wait queue for monitor alloc and free
  arm64/mpam: Add resctrl_ctrl_feature structure to manage ctrl features
  arm64/mpam: resctrl: Export resource's properties to info directory
  arm64/mpam: Split header files into suitable location
  arm64/mpam: resctrl: Add rmid file in resctrl sysfs
  arm64/mpam: Filter schema control type with ctrl features
  arm64/mpam: Simplify mpamid cdp mapping process
  arm64/mpam: Set per-cpu's closid to none zero for cdp
  ACPI/MPAM: Use acpi_map_pxm_to_node() to get node id for memory node
  arm64/mpam: Supplement additional useful ctrl features for mount options
  arm64/mpam: resctrl: Add proper error handling to resctrl_mount()
  arm64/mpam: resctrl: Use resctrl_group_init_alloc() for default group
  arm64/mpam: resctrl: Allow setting register MPAMCFG_MBW_MIN to 0
  arm64/mpam: resctrl: Refresh cpu mask for handling cpuhp
  arm64/mpam: Sort domains when cpu online
  arm64/mpam: Fix compile warning
 arch/arm64/include/asm/mpam.h          |  324 +---
 arch/arm64/include/asm/mpam_resource.h |  129 --
 arch/arm64/include/asm/mpam_sched.h    |    8 -
 arch/arm64/include/asm/resctrl.h       |  514 +++++-
 arch/arm64/kernel/Makefile             |    2 +-
 arch/arm64/kernel/mpam.c               | 1499 ----------------
 arch/arm64/kernel/mpam/Makefile        |    3 +
 arch/arm64/kernel/mpam/mpam_ctrlmon.c  |  961 ++++++++++
 arch/arm64/kernel/mpam/mpam_device.c   | 1706 ++++++++++++++++++
 arch/arm64/kernel/mpam/mpam_device.h   |  140 ++
 arch/arm64/kernel/mpam/mpam_internal.h |  345 ++++
 arch/arm64/kernel/mpam/mpam_mon.c      |  334 ++++
 arch/arm64/kernel/mpam/mpam_resctrl.c  | 2240 ++++++++++++++++++++++++
 arch/arm64/kernel/mpam/mpam_resource.h |  228 +++
 arch/arm64/kernel/mpam/mpam_setup.c    |  608 +++++++
 arch/arm64/kernel/mpam_ctrlmon.c       |  623 -------
 arch/arm64/kernel/mpam_mon.c           |  124 --
 drivers/acpi/arm64/mpam.c              |   87 +-
 drivers/base/cacheinfo.c               |   38 +
 fs/resctrlfs.c                         |  396 +++--
 include/linux/arm_mpam.h               |  118 ++
 include/linux/cacheinfo.h              |   36 +
 include/linux/resctrlfs.h              |   30 -
 23 files changed, 7521 insertions(+), 2972 deletions(-)
 delete mode 100644 arch/arm64/include/asm/mpam_resource.h
 delete mode 100644 arch/arm64/kernel/mpam.c
 create mode 100644 arch/arm64/kernel/mpam/Makefile
 create mode 100644 arch/arm64/kernel/mpam/mpam_ctrlmon.c
 create mode 100644 arch/arm64/kernel/mpam/mpam_device.c
 create mode 100644 arch/arm64/kernel/mpam/mpam_device.h
 create mode 100644 arch/arm64/kernel/mpam/mpam_internal.h
 create mode 100644 arch/arm64/kernel/mpam/mpam_mon.c
 create mode 100644 arch/arm64/kernel/mpam/mpam_resctrl.c
 create mode 100644 arch/arm64/kernel/mpam/mpam_resource.h
 create mode 100644 arch/arm64/kernel/mpam/mpam_setup.c
 delete mode 100644 arch/arm64/kernel/mpam_ctrlmon.c
 delete mode 100644 arch/arm64/kernel/mpam_mon.c
 create mode 100644 include/linux/arm_mpam.h
From: Wang ShaoBo <bobo.shaobowang@huawei.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
To make the MPAM feature fully compatible with the resctrl sysfs interface provided by Intel RDT, we are about to carry out a large-scale refactoring.
We move mpam.c, mpam_ctrlmon.c and mpam_mon.c into a new mpam/ subdirectory under arch/arm64/kernel, and rename mpam.c to mpam_resctrl.c, which is expected to take over all the internal resctrl work, including new features such as CDP (Code Data Prioritization) and a new interaction style.
Existing resctrl sysfs operations on resctrl resources will be remapped to operations on the new structures.
Until we formally declare the refactoring complete, this MPAM driver implementation remains incomplete.
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/kernel/Makefile                  |  2 +-
 arch/arm64/kernel/mpam/Makefile             |  3 ++
 arch/arm64/kernel/{ => mpam}/mpam_ctrlmon.c |  6 +---
 arch/arm64/kernel/{ => mpam}/mpam_mon.c     |  0
 .../kernel/{mpam.c => mpam/mpam_resctrl.c}  | 31 +++----------------
 5 files changed, 9 insertions(+), 33 deletions(-)
 create mode 100644 arch/arm64/kernel/mpam/Makefile
 rename arch/arm64/kernel/{ => mpam}/mpam_ctrlmon.c (98%)
 rename arch/arm64/kernel/{ => mpam}/mpam_mon.c (100%)
 rename arch/arm64/kernel/{mpam.c => mpam/mpam_resctrl.c} (97%)
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index d6a907297aac..ec0cf9543e21 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -63,7 +63,7 @@ arm64-obj-$(CONFIG_CRASH_CORE) += crash_core.o
 arm64-obj-$(CONFIG_ARM_SDE_INTERFACE) += sdei.o
 arm64-obj-$(CONFIG_ARM64_SSBD) += ssbd.o
 arm64-obj-$(CONFIG_SDEI_WATCHDOG) += watchdog_sdei.o
-arm64-obj-$(CONFIG_MPAM) += mpam.o mpam_ctrlmon.o mpam_mon.o
+arm64-obj-$(CONFIG_MPAM) += mpam/
obj-y += $(arm64-obj-y) vdso/ probes/ obj-$(CONFIG_ARM64_ILP32) += vdso-ilp32/ diff --git a/arch/arm64/kernel/mpam/Makefile b/arch/arm64/kernel/mpam/Makefile new file mode 100644 index 000000000000..3492ff0a9d0f --- /dev/null +++ b/arch/arm64/kernel/mpam/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_MPAM) += mpam_resctrl.o mpam_mon.o \ + mpam_ctrlmon.o diff --git a/arch/arm64/kernel/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c similarity index 98% rename from arch/arm64/kernel/mpam_ctrlmon.c rename to arch/arm64/kernel/mpam/mpam_ctrlmon.c index b62ab076bf30..22e701195b28 100644 --- a/arch/arm64/kernel/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -175,9 +175,6 @@ static int parse_line(char *line, struct resctrl_resource *r) goto next; } } - - rdt_last_cmd_printf("unknown domain (%lu)\n", dom_id); - return -EINVAL; }
@@ -231,7 +228,6 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, rdtgrp = resctrl_group_kn_lock_live(of->kn); if (!rdtgrp) { resctrl_group_kn_unlock(of->kn); - rdt_last_cmd_puts("directory was removed\n"); return -ENOENT; } rdt_last_cmd_clear(); @@ -440,7 +436,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn, return ret; }
- /* [FIXME] Could we remove the MATCH_* param ? */ + /* Could we remove the MATCH_* param ? */ rr->mon_write(d, prgrp, true);
return ret; diff --git a/arch/arm64/kernel/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c similarity index 100% rename from arch/arm64/kernel/mpam_mon.c rename to arch/arm64/kernel/mpam/mpam_mon.c diff --git a/arch/arm64/kernel/mpam.c b/arch/arm64/kernel/mpam/mpam_resctrl.c similarity index 97% rename from arch/arm64/kernel/mpam.c rename to arch/arm64/kernel/mpam/mpam_resctrl.c index 30900c08f0ef..38e5a551c9d5 100644 --- a/arch/arm64/kernel/mpam.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -91,9 +91,6 @@ static inline void mpam_node_assign_val(struct mpam_node *n, n->addr = hwpage_address; n->component_id = component_id; n->cpus_list = "0"; - - if (n->type == MPAM_RESOURCE_MC) - n->default_ctrl = MBA_MAX_WD; }
#define MPAM_NODE_NAME_SIZE (10) @@ -313,7 +310,7 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { [MPAM_RESOURCE_MC] = { .msr_update = bw_wrmsr, .msr_read = bw_rdmsr, - .parse_ctrlval = parse_bw, /* [FIXME] add parse_bw() helper */ + .parse_ctrlval = parse_bw, /* add parse_bw() helper */ .format_str = "%d=%0*d", .mon_read = mbwu_read, .mon_write = mbwu_write, @@ -373,7 +370,6 @@ u64 bw_rdmsr(struct rdt_domain *d, int partid) }
/* - * [FIXME] * use pmg as monitor id * just use match_pardid only. */ @@ -510,7 +506,7 @@ static int mpam_online_cpu(unsigned int cpu) return 0; }
-/* [FIXME] remove related resource when cpu offline */ +/* remove related resource when cpu offline */ static int mpam_offline_cpu(unsigned int cpu) { return 0; @@ -552,20 +548,6 @@ void post_resctrl_mount(void)
static int reset_all_ctrls(struct resctrl_resource *r) { - struct raw_resctrl_resource *rr; - struct rdt_domain *d; - int partid; - - rr = (struct raw_resctrl_resource *)r->res; - for (partid = 0; partid < rr->num_partid; partid++) { - list_for_each_entry(d, &r->domains, list) { - d->new_ctrl = rr->default_ctrl; - d->ctrl_val[partid] = rr->default_ctrl; - d->have_new_ctrl = true; - rr->msr_update(d, partid); - } - } - return 0; }
@@ -863,7 +845,7 @@ static int resctrl_num_mon_show(struct kernfs_open_file *of, int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask, cpumask_var_t tmpmask) { - rdt_last_cmd_puts("temporarily unsupported write cpus on mon_groups\n"); + pr_info("unsupported on mon_groups, please use ctrlmon groups\n"); return -EINVAL; }
@@ -1148,12 +1130,11 @@ static ssize_t resctrl_group_ctrlmon_write(struct kernfs_open_file *of,
if (!rdtgrp) { ret = -ENOENT; - rdt_last_cmd_puts("directory was removed\n"); goto unlock; }
if ((rdtgrp->flags & RDT_CTRLMON) && !ctrlmon) { - /* [FIXME] disable & remove mon_data dir */ + /* disable & remove mon_data dir */ rdtgrp->flags &= ~RDT_CTRLMON; resctrl_ctrlmon_disable(rdtgrp->mon.mon_data_kn, rdtgrp); } else if (!(rdtgrp->flags & RDT_CTRLMON) && ctrlmon) { @@ -1162,10 +1143,6 @@ static ssize_t resctrl_group_ctrlmon_write(struct kernfs_open_file *of, if (!ret) rdtgrp->flags |= RDT_CTRLMON; } else { - if (ctrlmon) - rdt_last_cmd_printf("ctrlmon has been enabled\n"); - else - rdt_last_cmd_printf("ctrlmon has been disabled\n"); ret = -ENOENT; }
From: James Morse <james.morse@arm.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
Components with MPAM controls (or monitors) could be placed anywhere in the system, each with its own MMIO configuration area.
Firmware tables tell us each component's position in the topology and the location of its MMIO configuration area, as well as which logical component it corresponds to.
For now, we are only interested in the well-known caches, e.g. L2.
To reduce the number of times we have to schedule work on another CPU, we collect all the information from the firmware tables between mpam_discovery_start() and mpam_discovery_complete().
If the code parsing the firmware tables concludes it can't continue, it can call mpam_discovery_failed() to free the allocated memory.
[Wang ShaoBo: a few version adaptation changes]
Signed-off-by: James Morse <james.morse@arm.com>
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=ba79c6b4021fb13f395ea0...
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/kernel/mpam/Makefile      |   2 +-
 arch/arm64/kernel/mpam/mpam_device.c | 272 +++++++++++++++++++++++++++
 arch/arm64/kernel/mpam/mpam_device.h |  92 +++++++++
 3 files changed, 365 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/mpam/mpam_device.c
 create mode 100644 arch/arm64/kernel/mpam/mpam_device.h
diff --git a/arch/arm64/kernel/mpam/Makefile b/arch/arm64/kernel/mpam/Makefile index 3492ff0a9d0f..f69a7018d42b 100644 --- a/arch/arm64/kernel/mpam/Makefile +++ b/arch/arm64/kernel/mpam/Makefile @@ -1,3 +1,3 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_MPAM) += mpam_resctrl.o mpam_mon.o \ - mpam_ctrlmon.o + mpam_ctrlmon.o mpam_device.o diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c new file mode 100644 index 000000000000..269a99695e6c --- /dev/null +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -0,0 +1,272 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Common code for ARM v8 MPAM + * + * Copyright (C) 2020-2021 Huawei Technologies Co., Ltd + * + * Author: Wang Shaobo bobo.shaobowang@huawei.com + * + * Code was partially borrowed from http://www.linux-arm.org/ + * git?p=linux-jm.git;a=shortlog;h=refs/heads/mpam/snapshot/may. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * More information about MPAM be found in the Arm Architecture Reference + * Manual. + * + * https://static.docs.arm.com/ddi0598/a/DDI0598_MPAM_supp_armv8a.pdf + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/io.h> +#include <linux/slab.h> +#include <linux/types.h> + +#include "mpam_device.h" + +/* + * During discovery this lock protects writers to class, components and devices. + * Once all devices are successfully probed, the system_supports_mpam() static + * key is enabled, and these lists become read only. 
+ */ +static DEFINE_MUTEX(mpam_devices_lock); + +/* Devices are MSCs */ +static LIST_HEAD(mpam_all_devices); + +/* Classes are the set of MSCs that make up components of the same type. */ +LIST_HEAD(mpam_classes); + +static struct mpam_device * __init +mpam_device_alloc(struct mpam_component *comp) +{ + struct mpam_device *dev; + + lockdep_assert_held(&mpam_devices_lock); + + dev = kzalloc(sizeof(*dev), GFP_KERNEL); + if (!dev) + return ERR_PTR(-ENOMEM); + + spin_lock_init(&dev->lock); + INIT_LIST_HEAD(&dev->comp_list); + INIT_LIST_HEAD(&dev->glbl_list); + + dev->comp = comp; + list_add(&dev->comp_list, &comp->devices); + list_add(&dev->glbl_list, &mpam_all_devices); + + return dev; +} + +static void mpam_devices_destroy(struct mpam_component *comp) +{ + struct mpam_device *dev, *tmp; + + lockdep_assert_held(&mpam_devices_lock); + + list_for_each_entry_safe(dev, tmp, &comp->devices, comp_list) { + list_del(&dev->comp_list); + list_del(&dev->glbl_list); + kfree(dev); + } +} + +static struct mpam_component * __init mpam_component_alloc(int id) +{ + struct mpam_component *comp; + + comp = kzalloc(sizeof(*comp), GFP_KERNEL); + if (!comp) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&comp->devices); + INIT_LIST_HEAD(&comp->class_list); + + comp->comp_id = id; + + return comp; +} + +struct mpam_component *mpam_component_get(struct mpam_class *class, int id, + bool alloc) +{ + struct mpam_component *comp; + + list_for_each_entry(comp, &class->components, class_list) { + if (comp->comp_id == id) + return comp; + } + + if (!alloc) + return ERR_PTR(-ENOENT); + + comp = mpam_component_alloc(id); + if (IS_ERR(comp)) + return comp; + + list_add(&comp->class_list, &class->components); + + return comp; +} + +static struct mpam_class * __init mpam_class_alloc(u8 level_idx, + enum mpam_class_types type) +{ + struct mpam_class *class; + + lockdep_assert_held(&mpam_devices_lock); + + class = kzalloc(sizeof(*class), GFP_KERNEL); + if (!class) + return ERR_PTR(-ENOMEM); + + 
INIT_LIST_HEAD(&class->components); + INIT_LIST_HEAD(&class->classes_list); + + mutex_init(&class->lock); + + class->level = level_idx; + class->type = type; + + list_add(&class->classes_list, &mpam_classes); + + return class; +} + +/* free all components and devices of this class */ +static void mpam_class_destroy(struct mpam_class *class) +{ + struct mpam_component *comp, *tmp; + + lockdep_assert_held(&mpam_devices_lock); + + list_for_each_entry_safe(comp, tmp, &class->components, class_list) { + mpam_devices_destroy(comp); + list_del(&comp->class_list); + kfree(comp); + } +} + +static struct mpam_class * __init mpam_class_get(u8 level_idx, + enum mpam_class_types type, + bool alloc) +{ + bool found = false; + struct mpam_class *class; + + lockdep_assert_held(&mpam_devices_lock); + + list_for_each_entry(class, &mpam_classes, classes_list) { + if (class->type == type && class->level == level_idx) { + found = true; + break; + } + } + + if (found) + return class; + + if (!alloc) + return ERR_PTR(-ENOENT); + + return mpam_class_alloc(level_idx, type); +} + +/* + * Create a a device with this @hwpage_address, of class type:level_idx. + * class/component structures may be allocated. + * Returns the new device, or an ERR_PTR(). + */ +struct mpam_device * __init +__mpam_device_create(u8 level_idx, enum mpam_class_types type, + int component_id, const struct cpumask *fw_affinity, + phys_addr_t hwpage_address) +{ + struct mpam_device *dev; + struct mpam_class *class; + struct mpam_component *comp; + + if (!fw_affinity) + fw_affinity = cpu_possible_mask; + + mutex_lock(&mpam_devices_lock); + do { + class = mpam_class_get(level_idx, type, true); + if (IS_ERR(class)) { + dev = (void *)class; + break; + } + + comp = mpam_component_get(class, component_id, true); + if (IS_ERR(comp)) { + dev = (void *)comp; + break; + } + + /* + * For caches we learn the affinity from the cache-id as CPUs + * come online. For everything else, we have to be told. 
+ */ + if (type != MPAM_CLASS_CACHE) + cpumask_or(&comp->fw_affinity, &comp->fw_affinity, + fw_affinity); + + dev = mpam_device_alloc(comp); + if (IS_ERR(dev)) + break; + + dev->fw_affinity = *fw_affinity; + dev->hwpage_address = hwpage_address; + dev->mapped_hwpage = ioremap(hwpage_address, SZ_MPAM_DEVICE); + if (!dev->mapped_hwpage) + dev = ERR_PTR(-ENOMEM); + } while (0); + mutex_unlock(&mpam_devices_lock); + + return dev; +} + +static int mpam_cpus_have_feature(void) +{ + if (!cpus_have_const_cap(ARM64_HAS_MPAM)) + return 0; + return 1; +} + +/* + * prepare for initializing devices. + */ +int __init mpam_discovery_start(void) +{ + if (!mpam_cpus_have_feature()) + return -EOPNOTSUPP; + + return 0; +} + +int __init mpam_discovery_complete(void) +{ + return 0; +} + +void __init mpam_discovery_failed(void) +{ + struct mpam_class *class, *tmp; + + mutex_lock(&mpam_devices_lock); + list_for_each_entry_safe(class, tmp, &mpam_classes, classes_list) { + mpam_class_destroy(class); + list_del(&class->classes_list); + kfree(class); + } + mutex_unlock(&mpam_devices_lock); +} diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h new file mode 100644 index 000000000000..ab986fb1911f --- /dev/null +++ b/arch/arm64/kernel/mpam/mpam_device.h @@ -0,0 +1,92 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_ARM64_MPAM_DEVICE_H +#define _ASM_ARM64_MPAM_DEVICE_H + +#include <linux/err.h> +#include <linux/cpumask.h> +#include <linux/types.h> + +/* + * Size of the memory mapped registers: 4K of feature page + * then 2x 4K bitmap registers + */ +#define SZ_MPAM_DEVICE (3 * SZ_4K) + +enum mpam_class_types { + MPAM_CLASS_SMMU, + MPAM_CLASS_CACHE, /* Well known caches, e.g. L2 */ + MPAM_CLASS_MEMORY, /* Main memory */ + MPAM_CLASS_UNKNOWN, /* Everything else, e.g. TLBs etc */ +}; + +/* + * An mpam_device corresponds to an MSC, an interface to a component's cache + * or bandwidth controls. It is associated with a set of CPUs, and a component. 
+ * For resctrl the component is expected to be a well-known cache (e.g. L2). + * We may have multiple interfaces per component, each for a set of CPUs that + * share the same component. + */ +struct mpam_device { + /* member of mpam_component:devices */ + struct list_head comp_list; + struct mpam_component *comp; + + /* member of mpam_all_devices */ + struct list_head glbl_list; + + /* The affinity learn't from firmware */ + struct cpumask fw_affinity; + /* of which these cpus are online */ + struct cpumask online_affinity; + + spinlock_t lock; + bool probed; + + phys_addr_t hwpage_address; + void __iomem *mapped_hwpage; +}; + +/* + * A set of devices that share the same component. e.g. the MSCs that + * make up the L2 cache. This may be 1:1. Exposed to user-space as a domain by + * resctrl when the component is a well-known cache. + */ +struct mpam_component { + u32 comp_id; + + /* mpam_devices in this domain */ + struct list_head devices; + + struct cpumask fw_affinity; + + /* member of mpam_class:components */ + struct list_head class_list; +}; + +/* + * All the components of the same type at a particular level, + * e.g. all the L2 cache components. Exposed to user-space as a resource + * by resctrl when the component is a well-known cache. We may have additional + * classes such as system-caches, or internal components that are not exposed. + */ +struct mpam_class { + /* + * resctrl expects to see an empty domain list if all 'those' CPUs are + * offline. As we can't discover the cpu affinity of 'unknown' MSCs, we + * need a second list. + * mpam_components in this class. + */ + struct list_head components; + + struct cpumask fw_affinity; + + u8 level; + enum mpam_class_types type; + + struct mutex lock; + + /* member of mpam_classes */ + struct list_head classes_list; +}; + +#endif /* _ASM_ARM64_MPAM_DEVICE_H */
From: James Morse <james.morse@arm.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
To provide a reasonable threshold for a measure of cache-usage, MPAM would like to keep track of the largest cache it has seen.
Provide a helper that takes a cpu and a cache-level, and returns the corresponding leaf. This only works for unified caches.
Callers must hold the cpus_read_lock() to prevent the leaf being free()d.
Conflicts:
	include/linux/cacheinfo.h
[Wang ShaoBo: fix some minor conflicts and validate info_list]
Signed-off-by: James Morse <james.morse@arm.com>
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=c9972f21aa19762dc6acf1...
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 include/linux/cacheinfo.h | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 46b92cd61d0c..6db3f7f8a7d6 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -119,4 +119,39 @@ int acpi_find_last_cache_level(unsigned int cpu);
const struct attribute_group *cache_get_priv_group(struct cacheinfo *this_leaf);
+/* Get the id of a particular cache on @cpu. cpuhp lock must be held. */
+static inline struct cacheinfo *get_cpu_cache_leaf(int cpu, int level)
+{
+	int i;
+	struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu);
+
+	for (i = 0; i < ci->num_leaves; i++) {
+		/*
+		 * info_list of this cacheinfo instance
+		 * may not be initialized because sometimes
+		 * free_cache_attributes() may free this
+		 * info_list but not set num_leaves to zero,
+		 * for example when PPTT is not supported.
+		 */
+		if (!ci->info_list)
+			continue;
+
+		if ((ci->info_list[i].type == CACHE_TYPE_UNIFIED) &&
+		    (ci->info_list[i].level == level)) {
+			return &ci->info_list[i];
+		}
+	}
+
+	return NULL;
+}
+
+static inline int get_cpu_cacheinfo_id(int cpu, int level)
+{
+	struct cacheinfo *leaf = get_cpu_cache_leaf(cpu, level);
+
+	if (leaf && leaf->attributes & CACHE_ID)
+		return leaf->id;
+	return -1;
+}
+
 #endif /* _LINUX_CACHEINFO_H */
From: James Morse <james.morse@arm.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
Once we know where all the devices are, we can register cpu hotplug callbacks to probe the devices each CPU can access. Once we've probed all the devices, we can enable MPAM.
As a first step, we learn whether the MSC supports MPAM v1.x, and update our system-wide view of the commonly supported partid/pmg range.
As noted in the ACPI code, we learn the cache affinities as CPUs come online. This ensures the data we export via resctrl matches the data cacheinfo exports via sysfs.
[Wang ShaoBo: version adaptation and a few changes in mpam_sysprops_prop]
Signed-off-by: James Morse <james.morse@arm.com>
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=b91f071ae923de34a0b0f7...
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/include/asm/mpam.h        |   3 +
 arch/arm64/kernel/mpam/mpam_device.c | 231 ++++++++++++++++++++++++++-
 arch/arm64/kernel/mpam/mpam_device.h |   7 +
 3 files changed, 240 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index de00c141065f..b83f940e0432 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -8,6 +8,7 @@
#include <linux/seq_buf.h> #include <linux/seq_file.h> +#include <linux/resctrlfs.h>
/* MPAM register */ #define SYS_MPAM0_EL1 sys_reg(3, 0, 10, 5, 1) @@ -97,9 +98,11 @@ */ #define VPMR_MAX_BITS (3) #define PARTID_MAX_SHIFT (0) +#define PARTID_MAX_MASK (MPAM_MASK(PARTID_BITS) << PARTID_MAX_SHIFT) #define HAS_HCR_SHIFT (PARTID_MAX_SHIFT + PARTID_BITS + 1) #define VPMR_MAX_SHIFT (HAS_HCR_SHIFT + 1) #define PMG_MAX_SHIFT (VPMR_MAX_SHIFT + VPMR_MAX_BITS + 11) +#define PMG_MAX_MASK (MPAM_MASK(PMG_BITS) << PMG_MAX_SHIFT) #define VPMR_MASK MPAM_MASK(VPMR_MAX_BITS)
/* diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 269a99695e6c..096ca71ff648 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -29,6 +29,9 @@ #include <linux/io.h> #include <linux/slab.h> #include <linux/types.h> +#include <linux/cpu.h> +#include <linux/cacheinfo.h> +#include <asm/mpam.h>
#include "mpam_device.h"
@@ -45,6 +48,75 @@ static LIST_HEAD(mpam_all_devices); /* Classes are the set of MSCs that make up components of the same type. */ LIST_HEAD(mpam_classes);
+static DEFINE_MUTEX(mpam_cpuhp_lock); +static int mpam_cpuhp_state; + + +static inline int mpam_cpu_online(unsigned int cpu); +static inline int mpam_cpu_offline(unsigned int cpu); + +static struct mpam_sysprops_prop mpam_sysprops; + +/* + * mpam is enabled once all devices have been probed from CPU online callbacks, + * scheduled via this work_struct. + */ +static struct work_struct mpam_enable_work; + +/* + * This gets set if something terrible happens, it prevents future attempts + * to configure devices. + */ +static int mpam_broken; +static struct work_struct mpam_failed_work; + +static int mpam_device_probe(struct mpam_device *dev) +{ + return 0; +} + +/* + * Enable mpam once all devices have been probed. + * Scheduled by mpam_discovery_complete() once all devices have been created. + * Also scheduled when new devices are probed when new CPUs come online. + */ +static void __init mpam_enable(struct work_struct *work) +{ + unsigned long flags; + struct mpam_device *dev; + bool all_devices_probed = true; + + /* Have we probed all the devices? */ + mutex_lock(&mpam_devices_lock); + list_for_each_entry(dev, &mpam_all_devices, glbl_list) { + spin_lock_irqsave(&dev->lock, flags); + if (!dev->probed) + all_devices_probed = false; + spin_unlock_irqrestore(&dev->lock, flags); + + if (!all_devices_probed) + break; + } + mutex_unlock(&mpam_devices_lock); + + if (!all_devices_probed) + return; +} + +static void mpam_failed(struct work_struct *work) +{ + /* + * Make it look like all CPUs are offline. This also resets the + * cpu default values and disables interrupts. + */ + mutex_lock(&mpam_cpuhp_lock); + if (mpam_cpuhp_state) { + cpuhp_remove_state(mpam_cpuhp_state); + mpam_cpuhp_state = 0; + } + mutex_unlock(&mpam_cpuhp_lock); +} + static struct mpam_device * __init mpam_device_alloc(struct mpam_component *comp) { @@ -242,6 +314,28 @@ static int mpam_cpus_have_feature(void) return 1; }
+/* + * get max partid from reading SYS_MPAMIDR_EL1. + */ +static inline u16 mpam_cpu_max_partid(void) +{ + u64 reg; + + reg = mpam_read_sysreg_s(SYS_MPAMIDR_EL1, "SYS_MPAMIDR_EL1"); + return reg & PARTID_MAX_MASK; +} + +/* + * get max pmg from reading SYS_MPAMIDR_EL1. + */ +static inline u16 mpam_cpu_max_pmg(void) +{ + u64 reg; + + reg = mpam_read_sysreg_s(SYS_MPAMIDR_EL1, "SYS_MPAMIDR_EL1"); + return (reg & PMG_MAX_MASK) >> PMG_MAX_SHIFT; +} + /* * prepare for initializing devices. */ @@ -250,14 +344,149 @@ int __init mpam_discovery_start(void) if (!mpam_cpus_have_feature()) return -EOPNOTSUPP;
+ mpam_sysprops.max_partid = mpam_cpu_max_partid(); + mpam_sysprops.max_pmg = mpam_cpu_max_pmg(); + + INIT_WORK(&mpam_enable_work, mpam_enable); + INIT_WORK(&mpam_failed_work, mpam_failed); + return 0; }
-int __init mpam_discovery_complete(void) +static int __online_devices(struct mpam_component *comp, int cpu) { + int err = 0; + unsigned long flags; + struct mpam_device *dev; + bool new_device_probed = false; + + list_for_each_entry(dev, &comp->devices, comp_list) { + if (!cpumask_test_cpu(cpu, &dev->fw_affinity)) + continue; + + spin_lock_irqsave(&dev->lock, flags); + if (!dev->probed) { + err = mpam_device_probe(dev); + if (!err) + new_device_probed = true; + } + + cpumask_set_cpu(cpu, &dev->online_affinity); + spin_unlock_irqrestore(&dev->lock, flags); + + if (err) + return err; + } + + if (new_device_probed) + return 1; + + return 0; +} + +/* + * Firmware didn't give us an affinity, but a cache-id, if this cpu has that + * cache-id, update the fw_affinity for this component. + */ +static void +mpam_sync_cpu_cache_component_fw_affinity(struct mpam_class *class, int cpu) +{ + int cpu_cache_id; + struct cacheinfo *leaf; + struct mpam_component *comp; + + lockdep_assert_held(&mpam_devices_lock); /* we modify mpam_sysprops */ + + if (class->type != MPAM_CLASS_CACHE) + return; + + cpu_cache_id = cpu_to_node(cpu); + comp = mpam_component_get(class, cpu_cache_id, false); + + /* This cpu does not have a component of this class */ + if (IS_ERR(comp)) + return; + + /* + * The resctrl rmid_threshold is based on cache size. Keep track of + * the biggest cache we've seen. 
+ */ + leaf = get_cpu_cache_leaf(cpu, class->level); + if (leaf) + mpam_sysprops.mpam_llc_size = max(mpam_sysprops.mpam_llc_size, + leaf->size); + + cpumask_set_cpu(cpu, &comp->fw_affinity); + cpumask_set_cpu(cpu, &class->fw_affinity); +} + +static int mpam_cpu_online(unsigned int cpu) +{ + int err = 0; + struct mpam_class *class; + struct mpam_component *comp; + bool new_device_probed = false; + + mutex_lock(&mpam_devices_lock); + + list_for_each_entry(class, &mpam_classes, classes_list) { + mpam_sync_cpu_cache_component_fw_affinity(class, cpu); + + list_for_each_entry(comp, &class->components, class_list) { + if (!cpumask_test_cpu(cpu, &comp->fw_affinity)) + continue; + + err = __online_devices(comp, cpu); + if (err > 0) + new_device_probed = true; + if (err < 0) + break; // mpam_broken + } + } + + if (new_device_probed && err >= 0) + schedule_work(&mpam_enable_work); + + mutex_unlock(&mpam_devices_lock); + if (err < 0) { + if (!cmpxchg(&mpam_broken, err, 0)) + schedule_work(&mpam_failed_work); + return err; + } + return 0; }
+static int mpam_cpu_offline(unsigned int cpu) +{ + struct mpam_device *dev; + + mutex_lock(&mpam_devices_lock); + list_for_each_entry(dev, &mpam_all_devices, glbl_list) + cpumask_clear_cpu(cpu, &dev->online_affinity); + + mutex_unlock(&mpam_devices_lock); + + return 0; +} + +int __init mpam_discovery_complete(void) +{ + int ret = 0; + + mutex_lock(&mpam_cpuhp_lock); + mpam_cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, + "mpam:online", mpam_cpu_online, + mpam_cpu_offline); + if (mpam_cpuhp_state <= 0) { + pr_err("Failed to register 'dyn' cpuhp callbacks"); + ret = -EINVAL; + } + mutex_unlock(&mpam_cpuhp_lock); + + return ret; +} + void __init mpam_discovery_failed(void) { struct mpam_class *class, *tmp; diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h index ab986fb1911f..7b8d9ae5a548 100644 --- a/arch/arm64/kernel/mpam/mpam_device.h +++ b/arch/arm64/kernel/mpam/mpam_device.h @@ -89,4 +89,11 @@ struct mpam_class { struct list_head classes_list; };
+/* System wide properties */
+struct mpam_sysprops_prop {
+	u32 mpam_llc_size;
+	u16 max_partid;
+	u16 max_pmg;
+};
+
 #endif /* _ASM_ARM64_MPAM_DEVICE_H */
From: James Morse <james.morse@arm.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
Memory Partitioning and Monitoring (MPAM) has memory mapped devices (MSCs) with an identity/configuration page.
Supplement the definitions for these registers as offsets within the page(s).
[Wang ShaoBo: replace tabs with spaces and add useful definitions]
Signed-off-by: James Morse <james.morse@arm.com>
http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=a9102d227c371ec3cf3913...
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/include/asm/mpam_resource.h | 254 +++++++++++++++++--------
 1 file changed, 177 insertions(+), 77 deletions(-)
diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index 1a6904c22b9c..caa0b822d8ff 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -4,91 +4,191 @@
#include <linux/bitops.h>
-#define MPAMF_IDR 0x0000 -#define MPAMF_SIDR 0x0008 -#define MPAMF_MSMON_IDR 0x0080 -#define MPAMF_IMPL_IDR 0x0028 -#define MPAMF_CPOR_IDR 0x0030 -#define MPAMF_CCAP_IDR 0x0038 -#define MPAMF_MBW_IDR 0x0040 -#define MPAMF_PRI_IDR 0x0048 -#define MPAMF_CSUMON_IDR 0x0088 -#define MPAMF_MBWUMON_IDR 0x0090 -#define MPAMF_PARTID_NRW_IDR 0x0050 -#define MPAMF_IIDR 0x0018 -#define MPAMF_AIDR 0x0020 -#define MPAMCFG_PART_SEL 0x0100 -#define MPAMCFG_CPBM 0x1000 -#define MPAMCFG_CMAX 0x0108 -#define MPAMCFG_MBW_MIN 0x0200 -#define MPAMCFG_MBW_MAX 0x0208 -#define MPAMCFG_MBW_WINWD 0x0220 -#define MPAMCFG_MBW_PBM 0x2000 -#define MPAMCFG_PRI 0x0400 -#define MPAMCFG_MBW_PROP 0x0500 -#define MPAMCFG_INTPARTID 0x0600 -#define MSMON_CFG_MON_SEL 0x0800 -#define MSMON_CFG_CSU_FLT 0x0810 -#define MSMON_CFG_CSU_CTL 0x0818 -#define MSMON_CFG_MBWU_FLT 0x0820 -#define MSMON_CFG_MBWU_CTL 0x0828 -#define MSMON_CSU 0x0840 -#define MSMON_CSU_CAPTURE 0x0848 -#define MSMON_MBWU 0x0860 -#define MSMON_MBWU_CAPTURE 0x0868 -#define MSMON_CAPT_EVNT 0x0808 -#define MPAMF_ESR 0x00F8 -#define MPAMF_ECR 0x00F0 - -#define HAS_CCAP_PART BIT(24) -#define HAS_CPOR_PART BIT(25) -#define HAS_MBW_PART BIT(26) -#define HAS_PRI_PART BIT(27) -#define HAS_IMPL_IDR BIT(29) -#define HAS_MSMON BIT(30) +#define MPAMF_IDR 0x0000 +#define MPAMF_SIDR 0x0008 +#define MPAMF_MSMON_IDR 0x0080 +#define MPAMF_IMPL_IDR 0x0028 +#define MPAMF_CPOR_IDR 0x0030 +#define MPAMF_CCAP_IDR 0x0038 +#define MPAMF_MBW_IDR 0x0040 +#define MPAMF_PRI_IDR 0x0048 +#define MPAMF_CSUMON_IDR 0x0088 +#define MPAMF_MBWUMON_IDR 0x0090 +#define MPAMF_PARTID_NRW_IDR 0x0050 +#define MPAMF_IIDR 0x0018 +#define MPAMF_AIDR 0x0020 +#define MPAMCFG_PART_SEL 0x0100 +#define MPAMCFG_CPBM 0x1000 +#define MPAMCFG_CMAX 0x0108 +#define MPAMCFG_MBW_MIN 0x0200 +#define MPAMCFG_MBW_MAX 0x0208 +#define MPAMCFG_MBW_WINWD 0x0220 +#define MPAMCFG_MBW_PBM 0x2000 +#define MPAMCFG_PRI 0x0400 +#define MPAMCFG_MBW_PROP 0x0500 +#define MPAMCFG_INTPARTID 0x0600 +#define 
MSMON_CFG_MON_SEL 0x0800 +#define MSMON_CFG_CSU_FLT 0x0810 +#define MSMON_CFG_CSU_CTL 0x0818 +#define MSMON_CFG_MBWU_FLT 0x0820 +#define MSMON_CFG_MBWU_CTL 0x0828 +#define MSMON_CSU 0x0840 +#define MSMON_CSU_CAPTURE 0x0848 +#define MSMON_MBWU 0x0860 +#define MSMON_MBWU_CAPTURE 0x0868 +#define MSMON_CAPT_EVNT 0x0808 +#define MPAMF_ESR 0x00F8 +#define MPAMF_ECR 0x00F0 + +#define HAS_CCAP_PART BIT(24) +#define HAS_CPOR_PART BIT(25) +#define HAS_MBW_PART BIT(26) +#define HAS_PRI_PART BIT(27) +#define HAS_IMPL_IDR BIT(29) +#define HAS_MSMON BIT(30) +#define HAS_PARTID_NRW BIT(31)
/* MPAMF_IDR */ -#define MPAMF_IDR_PMG_MAX_MASK ((BIT(8) - 1) << 16) -#define MPAMF_IDR_PARTID_MAX_MASK (BIT(16) - 1) -#define MPAMF_IDR_PMG_MAX_GET(v) ((v & MPAMF_IDR_PMG_MAX_MASK) >> 16) -#define MPAMF_IDR_PARTID_MAX_GET(v) (v & MPAMF_IDR_PARTID_MAX_MASK) +#define MPAMF_IDR_PMG_MAX_MASK ((BIT(8) - 1) << 16) +#define MPAMF_IDR_PMG_MAX_SHIFT 16 +#define MPAMF_IDR_PARTID_MAX_MASK (BIT(16) - 1) +#define MPAMF_IDR_PMG_MAX_GET(v) ((v & MPAMF_IDR_PMG_MAX_MASK) >> 16) +#define MPAMF_IDR_PARTID_MAX_GET(v) (v & MPAMF_IDR_PARTID_MAX_MASK) + +#define MPAMF_IDR_HAS_CCAP_PART(v) ((v) & HAS_CCAP_PART) +#define MPAMF_IDR_HAS_CPOR_PART(v) ((v) & HAS_CPOR_PART) +#define MPAMF_IDR_HAS_MBW_PART(v) ((v) & HAS_MBW_PART) +#define MPAMF_IDR_HAS_MSMON(v) ((v) & HAS_MSMON) +#define MPAMF_IDR_PARTID_MASK GENMASK(15, 0) +#define MPAMF_IDR_PMG_MASK GENMASK(23, 16) +#define MPAMF_IDR_PMG_SHIFT 16 +#define MPAMF_IDR_HAS_PARTID_NRW(v) ((v) & HAS_PARTID_NRW) +#define NUM_MON_MASK (BIT(16) - 1) +#define MPAMF_IDR_NUM_MON(v) ((v) & NUM_MON_MASK) + +#define CPBM_WD_MASK 0xFFFF +#define CPBM_MASK 0x7FFF + +#define BWA_WD 6 /* hard code for P680 */ +#define MBW_MAX_MASK 0xFC00 +#define MBW_MAX_HARDLIM BIT(31) +#define MBW_MAX_SET(v) (MBW_MAX_HARDLIM|((v) << (16 - BWA_WD))) +#define MBW_MAX_GET(v) (((v) & MBW_MAX_MASK) >> (16 - BWA_WD)) + +#define MSMON_MATCH_PMG BIT(17) +#define MSMON_MATCH_PARTID BIT(16) +#define MSMON_CFG_CTL_EN BIT(31) +#define MSMON_CFG_FLT_SET(r, p) ((r) << 16|(p)) +#define MBWU_SUBTYPE_DEFAULT (3 << 20) +#define MSMON_CFG_MBWU_CTL_SET(m) (BIT(31)|MBWU_SUBTYPE_DEFAULT|(m)) +#define MSMON_CFG_CSU_CTL_SET(m) (BIT(31)|(m)) +#define MSMON_CFG_CSU_TYPE 0x43 +#define MSMON_CFG_MBWU_TYPE 0x42
-#define MPAMF_IDR_HAS_CCAP_PART(v) ((v) & HAS_CCAP_PART) -#define MPAMF_IDR_HAS_CPOR_PART(v) ((v) & HAS_CPOR_PART) -#define MPAMF_IDR_HAS_MBW_PART(v) ((v) & HAS_MBW_PART) -#define MPAMF_IDR_HAS_MSMON(v) ((v) & HAS_MSMON) - -/* MPAMF_x_IDR */ -#define NUM_MON_MASK (BIT(16) - 1) -#define MPAMF_IDR_NUM_MON(v) ((v) & NUM_MON_MASK) - -/* TODO */ - -#define CPBM_WD_MASK 0xFFFF -#define CPBM_MASK 0x7FFF - -#define BWA_WD 6 /* hard code for P680 */ -#define MBW_MAX_MASK 0xFC00 -#define MBW_MAX_HARDLIM BIT(31) - -#define MSMON_MATCH_PMG BIT(17) -#define MSMON_MATCH_PARTID BIT(16) - -#define MSMON_CFG_CTL_EN BIT(31) - -#define MSMON_CFG_FLT_SET(r, p) ((r) << 16|(p)) +/* + * Size of the memory mapped registers: 4K of feature page then 2 x 4K + * bitmap registers + */ +#define SZ_MPAM_DEVICE (3 * SZ_4K)
-#define MBWU_SUBTYPE_DEFAULT (3 << 20) -#define MSMON_CFG_MBWU_CTL_SET(m) (BIT(31)|MBWU_SUBTYPE_DEFAULT|(m)) +/* + * MSMON_CSU - Memory system performance monitor cache storage usage monitor + * register + * MSMON_CSU_CAPTURE - Memory system performance monitor cache storage usage + * capture register + * MSMON_MBWU - Memory system performance monitor memory bandwidth usage + * monitor register + * MSMON_MBWU_CAPTURE - Memory system performance monitor memory bandwidth usage + * capture register + */ +#define MSMON___VALUE GENMASK(30, 0) +#define MSMON___NRDY BIT(31)
-#define MSMON_CFG_CSU_CTL_SET(m) (BIT(31)|(m)) +/* + * MSMON_CAPT_EVNT - Memory system performance monitoring capture event + * generation register + */ +#define MSMON_CAPT_EVNT_NOW BIT(0) +/* + * MPAMCFG_MBW_MAX SET - temp Hard code + */ +#define MPAMCFG_PRI_DSPRI_SHIFT 16 + +/* MPAMF_PRI_IDR - MPAM features priority partitioning ID register */ +#define MPAMF_PRI_IDR_HAS_INTPRI BIT(0) +#define MPAMF_PRI_IDR_INTPRI_0_IS_LOW BIT(1) +#define MPAMF_PRI_IDR_INTPRI_WD_SHIFT 4 +#define MPAMF_PRI_IDR_INTPRI_WD GENMASK(9, 4) +#define MPAMF_PRI_IDR_HAS_DSPRI BIT(16) +#define MPAMF_PRI_IDR_DSPRI_0_IS_LOW BIT(17) +#define MPAMF_PRI_IDR_DSPRI_WD_SHIFT 20 +#define MPAMF_PRI_IDR_DSPRI_WD GENMASK(25, 20) + +/* MPAMF_CSUMON_IDR - MPAM cache storage usage monitor ID register */ +#define MPAMF_CSUMON_IDR_NUM_MON GENMASK(15, 0) +#define MPAMF_CSUMON_IDR_HAS_CAPTURE BIT(31) + +/* MPAMF_MBWUMON_IDR - MPAM memory bandwidth usage monitor ID register */ +#define MPAMF_MBWUMON_IDR_NUM_MON GENMASK(15, 0) +#define MPAMF_MBWUMON_IDR_HAS_CAPTURE BIT(31) + +/* MPAMF_CPOR_IDR - MPAM features cache portion partitioning ID register */ +#define MPAMF_CPOR_IDR_CPBM_WD GENMASK(15, 0) + +/* MPAMF_CCAP_IDR - MPAM features cache capacity partitioning ID register */ +#define MPAMF_CCAP_IDR_CMAX_WD GENMASK(5, 0) + +/* MPAMF_MBW_IDR - MPAM features memory bandwidth partitioning ID register */ +#define MPAMF_MBW_IDR_BWA_WD GENMASK(5, 0) +#define MPAMF_MBW_IDR_HAS_MIN BIT(10) +#define MPAMF_MBW_IDR_HAS_MAX BIT(11) +#define MPAMF_MBW_IDR_HAS_PBM BIT(12) + +#define MPAMF_MBW_IDR_HAS_PROP BIT(13) +#define MPAMF_MBW_IDR_WINDWR BIT(14) +#define MPAMF_MBW_IDR_BWPBM_WD GENMASK(28, 16) +#define MPAMF_MBW_IDR_BWPBM_WD_SHIFT 16 + +/* MPAMF_PARTID_NRW_IDR - MPAM features partid narrow ID register */ +#define MPAMF_PARTID_NRW_IDR_MASK (BIT(16) - 1) + +#define MSMON_CFG_CTL_TYPE GENMASK(7, 0) +#define MSMON_CFG_CTL_MATCH_PARTID BIT(16) +#define MSMON_CFG_CTL_MATCH_PMG BIT(17) +#define MSMON_CFG_CTL_SUBTYPE GENMASK(23, 
20) +#define MSMON_CFG_CTL_SUBTYPE_SHIFT 20 +#define MSMON_CFG_CTL_OFLOW_FRZ BIT(24) +#define MSMON_CFG_CTL_OFLOW_INTR BIT(25) +#define MSMON_CFG_CTL_OFLOW_STATUS BIT(26) +#define MSMON_CFG_CTL_CAPT_RESET BIT(27) +#define MSMON_CFG_CTL_CAPT_EVNT GENMASK(30, 28) +#define MSMON_CFG_CTL_CAPT_EVNT_SHIFT 28 +#define MSMON_CFG_CTL_EN BIT(31) + +#define MPAMF_IDR_HAS_PRI_PART(v) (v & BIT(27)) + +/* MPAMF_MSMON_IDR - MPAM performance monitoring ID register */ +#define MPAMF_MSMON_IDR_MSMON_CSU BIT(16) +#define MPAMF_MSMON_IDR_MSMON_MBWU BIT(17) +#define MPAMF_MSMON_IDR_HAS_LOCAL_CAPT_EVNT BIT(31)
-#define MSMON_CFG_CSU_TYPE 0x43 +/* + * MSMON_CFG_MBWU_FLT - Memory system performance monitor configure memory + * bandwidth usage monitor filter register + */ +#define MSMON_CFG_MBWU_FLT_PARTID GENMASK(15, 0) +#define MSMON_CFG_MBWU_FLT_PMG_SHIFT 16 +#define MSMON_CFG_MBWU_FLT_PMG GENMASK(23, 16) #define MSMON_CFG_MBWU_TYPE 0x42
-/* [FIXME] hard code for hardlim */ -#define MBW_MAX_SET(v) (MBW_MAX_HARDLIM|((v) << (16 - BWA_WD))) -#define MBW_MAX_GET(v) (((v) & MBW_MAX_MASK) >> (16 - BWA_WD)) +/* + * MSMON_CFG_CSU_FLT - Memory system performance monitor configure cache storage + * usage monitor filter register + */ +#define MSMON_CFG_CSU_FLT_PARTID GENMASK(15, 0) +#define MSMON_CFG_CSU_FLT_PMG GENMASK(23, 16) +#define MSMON_CFG_CSU_FLT_PMG_SHIFT 16 +#define MSMON_CFG_CSU_TYPE 0x43
/* hard code for mbw_max max-percentage's corresponding masks */
#define MBA_MAX_WD 63u
From: James Morse <james.morse@arm.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
Expand our probing support with the control and monitor types we can use with resctrl.
[Wang ShaoBo: version adaptation changes, additional MSC narrowing support]
Signed-off-by: James Morse <james.morse@arm.com>
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=12ec685952ba85b3ce6d52...
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/kernel/mpam/mpam_device.c   | 177 +++++++++++++++++++++++++
 arch/arm64/kernel/mpam/mpam_device.h   |  15 +++
 arch/arm64/kernel/mpam/mpam_internal.h |  51 +++++++
 3 files changed, 243 insertions(+)
 create mode 100644 arch/arm64/kernel/mpam/mpam_internal.h
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index 096ca71ff648..e14d8edbafd6 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -32,6 +32,7 @@
 #include <linux/cpu.h>
 #include <linux/cacheinfo.h>
 #include <asm/mpam.h>
+#include <asm/mpam_resource.h>
#include "mpam_device.h"
@@ -70,8 +71,184 @@ static struct work_struct mpam_enable_work;
 static int mpam_broken;
 static struct work_struct mpam_failed_work;
+static inline u32 mpam_read_reg(struct mpam_device *dev, u16 reg) +{ + WARN_ON_ONCE(reg > SZ_MPAM_DEVICE); + assert_spin_locked(&dev->lock); + + /* + * If we touch a device that isn't accessible from this CPU we may get + * an external-abort. + */ + WARN_ON_ONCE(preemptible()); + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity)); + + return readl_relaxed(dev->mapped_hwpage + reg); +} + +static inline void mpam_write_reg(struct mpam_device *dev, u16 reg, u32 val) +{ + WARN_ON_ONCE(reg > SZ_MPAM_DEVICE); + assert_spin_locked(&dev->lock); + + /* + * If we touch a device that isn't accessible from this CPU we may get + * an external-abort. If we're lucky, we corrupt another mpam:component. + */ + WARN_ON_ONCE(preemptible()); + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity)); + + writel_relaxed(val, dev->mapped_hwpage + reg); +} + +static void +mpam_probe_update_sysprops(u16 max_partid, u16 max_pmg) +{ + lockdep_assert_held(&mpam_devices_lock); + + mpam_sysprops.max_partid = + (mpam_sysprops.max_partid < max_partid) ? + mpam_sysprops.max_partid : max_partid; + mpam_sysprops.max_pmg = + (mpam_sysprops.max_pmg < max_pmg) ? 
+ mpam_sysprops.max_pmg : max_pmg; +} + static int mpam_device_probe(struct mpam_device *dev) { + u32 hwfeatures; + u16 max_intpartid = 0; + u16 max_partid, max_pmg; + + if (mpam_read_reg(dev, MPAMF_AIDR) != MPAM_ARCHITECTURE_V1) { + pr_err_once("device at 0x%llx does not match MPAM architecture v1.0\n", + dev->hwpage_address); + return -EIO; + } + + hwfeatures = mpam_read_reg(dev, MPAMF_IDR); + max_partid = hwfeatures & MPAMF_IDR_PARTID_MAX_MASK; + max_pmg = (hwfeatures & MPAMF_IDR_PMG_MAX_MASK) >> MPAMF_IDR_PMG_MAX_SHIFT; + + dev->num_partid = max_partid + 1; + dev->num_pmg = max_pmg + 1; + + /* Partid Narrowing*/ + if (MPAMF_IDR_HAS_PARTID_NRW(hwfeatures)) { + u32 partid_nrw_features = mpam_read_reg(dev, MPAMF_PARTID_NRW_IDR); + + max_intpartid = partid_nrw_features & MPAMF_PARTID_NRW_IDR_MASK; + dev->num_intpartid = max_intpartid + 1; + mpam_set_feature(mpam_feat_part_nrw, &dev->features); + } + + mpam_probe_update_sysprops(max_partid, max_pmg); + + /* Cache Capacity Partitioning */ + if (MPAMF_IDR_HAS_CCAP_PART(hwfeatures)) { + u32 ccap_features = mpam_read_reg(dev, MPAMF_CCAP_IDR); + + pr_debug("probe: probed CCAP_PART\n"); + + dev->cmax_wd = ccap_features & MPAMF_CCAP_IDR_CMAX_WD; + if (dev->cmax_wd) + mpam_set_feature(mpam_feat_ccap_part, &dev->features); + } + + /* Cache Portion partitioning */ + if (MPAMF_IDR_HAS_CPOR_PART(hwfeatures)) { + u32 cpor_features = mpam_read_reg(dev, MPAMF_CPOR_IDR); + + pr_debug("probe: probed CPOR_PART\n"); + + dev->cpbm_wd = cpor_features & MPAMF_CPOR_IDR_CPBM_WD; + if (dev->cpbm_wd) + mpam_set_feature(mpam_feat_cpor_part, &dev->features); + } + + /* Memory bandwidth partitioning */ + if (MPAMF_IDR_HAS_MBW_PART(hwfeatures)) { + u32 mbw_features = mpam_read_reg(dev, MPAMF_MBW_IDR); + + pr_debug("probe: probed MBW_PART\n"); + + /* portion bitmap resolution */ + dev->mbw_pbm_bits = (mbw_features & MPAMF_MBW_IDR_BWPBM_WD) >> + MPAMF_MBW_IDR_BWPBM_WD_SHIFT; + if (dev->mbw_pbm_bits && (mbw_features & + MPAMF_MBW_IDR_HAS_PBM)) + 
mpam_set_feature(mpam_feat_mbw_part, &dev->features); + + dev->bwa_wd = (mbw_features & MPAMF_MBW_IDR_BWA_WD); + if (dev->bwa_wd && (mbw_features & MPAMF_MBW_IDR_HAS_MAX)) { + mpam_set_feature(mpam_feat_mbw_max, &dev->features); + /* we want to export MBW hardlimit support */ + mpam_set_feature(mpam_feat_part_hdl, &dev->features); + } + + if (dev->bwa_wd && (mbw_features & MPAMF_MBW_IDR_HAS_MIN)) + mpam_set_feature(mpam_feat_mbw_min, &dev->features); + + if (dev->bwa_wd && (mbw_features & MPAMF_MBW_IDR_HAS_PROP)) { + mpam_set_feature(mpam_feat_mbw_prop, &dev->features); + /* we want to export MBW hardlimit support */ + mpam_set_feature(mpam_feat_part_hdl, &dev->features); + } + } + + /* Priority partitioning */ + if (MPAMF_IDR_HAS_PRI_PART(hwfeatures)) { + u32 pri_features = mpam_read_reg(dev, MPAMF_PRI_IDR); + + pr_debug("probe: probed PRI_PART\n"); + + dev->intpri_wd = (pri_features & MPAMF_PRI_IDR_INTPRI_WD) >> + MPAMF_PRI_IDR_INTPRI_WD_SHIFT; + if (dev->intpri_wd && (pri_features & + MPAMF_PRI_IDR_HAS_INTPRI)) { + mpam_set_feature(mpam_feat_intpri_part, &dev->features); + if (pri_features & MPAMF_PRI_IDR_INTPRI_0_IS_LOW) + mpam_set_feature(mpam_feat_intpri_part_0_low, + &dev->features); + } + + dev->dspri_wd = (pri_features & MPAMF_PRI_IDR_DSPRI_WD) >> + MPAMF_PRI_IDR_DSPRI_WD_SHIFT; + if (dev->dspri_wd && (pri_features & MPAMF_PRI_IDR_HAS_DSPRI)) { + mpam_set_feature(mpam_feat_dspri_part, &dev->features); + if (pri_features & MPAMF_PRI_IDR_DSPRI_0_IS_LOW) + mpam_set_feature(mpam_feat_dspri_part_0_low, + &dev->features); + } + } + + /* Performance Monitoring */ + if (MPAMF_IDR_HAS_MSMON(hwfeatures)) { + u32 msmon_features = mpam_read_reg(dev, MPAMF_MSMON_IDR); + + pr_debug("probe: probed MSMON\n"); + + if (msmon_features & MPAMF_MSMON_IDR_MSMON_CSU) { + u32 csumonidr; + + csumonidr = mpam_read_reg(dev, MPAMF_CSUMON_IDR); + dev->num_csu_mon = csumonidr & MPAMF_CSUMON_IDR_NUM_MON; + if (dev->num_csu_mon) + mpam_set_feature(mpam_feat_msmon_csu, + &dev->features); 
+ } + if (msmon_features & MPAMF_MSMON_IDR_MSMON_MBWU) { + u32 mbwumonidr = mpam_read_reg(dev, MPAMF_MBWUMON_IDR); + + dev->num_mbwu_mon = mbwumonidr & + MPAMF_MBWUMON_IDR_NUM_MON; + if (dev->num_mbwu_mon) + mpam_set_feature(mpam_feat_msmon_mbwu, + &dev->features); + } + } + dev->probed = true; + return 0; }
diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h
index 7b8d9ae5a548..d49f5be41443 100644
--- a/arch/arm64/kernel/mpam/mpam_device.h
+++ b/arch/arm64/kernel/mpam/mpam_device.h
@@ -5,6 +5,7 @@
 #include <linux/err.h>
 #include <linux/cpumask.h>
 #include <linux/types.h>
+#include "mpam_internal.h"
 /*
  * Size of the memory mapped registers: 4K of feature page
@@ -44,6 +45,20 @@ struct mpam_device {
 	phys_addr_t hwpage_address;
 	void __iomem *mapped_hwpage;
+
+	u32 features;
+
+	u16 cmax_wd;
+	u16 cpbm_wd;
+	u16 mbw_pbm_bits;
+	u16 bwa_wd;
+	u16 intpri_wd;
+	u16 dspri_wd;
+	u16 num_partid;
+	u16 num_intpartid;
+	u16 num_pmg;
+	u16 num_csu_mon;
+	u16 num_mbwu_mon;
 };
 /*
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h
new file mode 100644
index 000000000000..24b26dc0e3d0
--- /dev/null
+++ b/arch/arm64/kernel/mpam/mpam_internal.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ARM64_MPAM_INTERNAL_H
+#define _ASM_ARM64_MPAM_INTERNAL_H
+
+typedef u32 mpam_features_t;
+
+/* Bits for mpam_features_t */
+enum mpam_device_features {
+	mpam_feat_ccap_part = 0,
+	mpam_feat_cpor_part,
+	mpam_feat_mbw_part,
+	mpam_feat_mbw_min,
+	mpam_feat_mbw_max,
+	mpam_feat_mbw_prop,
+	mpam_feat_intpri_part,
+	mpam_feat_intpri_part_0_low,
+	mpam_feat_dspri_part,
+	mpam_feat_dspri_part_0_low,
+	mpam_feat_msmon,
+	mpam_feat_msmon_csu,
+	mpam_feat_msmon_csu_capture,
+	mpam_feat_msmon_mbwu,
+	mpam_feat_msmon_mbwu_capture,
+	mpam_feat_msmon_capt,
+	mpam_feat_part_nrw,
+	/* this feature always enabled */
+	mpam_feat_part_hdl,
+	MPAM_FEATURE_LAST,
+};
+
+static inline bool mpam_has_feature(enum mpam_device_features feat,
+				mpam_features_t supported)
+{
+	return (1<<feat) & supported;
+}
+
+static inline void mpam_set_feature(enum mpam_device_features feat,
+				mpam_features_t *supported)
+{
+	*supported |= (1<<feat);
+}
+
+static inline void mpam_clear_feature(enum mpam_device_features feat,
+				mpam_features_t *supported)
+{
+	*supported &= ~(1<<feat);
+}
+
+#define MPAM_ARCHITECTURE_V1 0x10
+
+#endif
From: Wang ShaoBo <bobo.shaobowang@huawei.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
Now that the mpam sysprops have been probed, the maximum supported partid and pmg should be exported to resctrl.
For MPAM, processing elements (PEs) issue memory-system requests; PEs must implement the MPAMn_ELx registers and their behaviors to generate the PARTID and PMG fields of those requests.
For resctrl, partid and pmg are combined into a unique rmid for labeling each group, and partid determines the maximum number of ctrl groups.
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/kernel/mpam/mpam_device.c   | 12 ++++++++++++
 arch/arm64/kernel/mpam/mpam_internal.h |  3 +++
 2 files changed, 15 insertions(+)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index e14d8edbafd6..a72d5f0251a9 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -676,3 +676,15 @@ void __init mpam_discovery_failed(void)
 	}
 	mutex_unlock(&mpam_devices_lock);
 }
+
+u16 mpam_sysprops_num_partid(void)
+{
+	/* At least one partid for system width */
+	return mpam_sysprops.max_partid + 1;
+}
+
+u16 mpam_sysprops_num_pmg(void)
+{
+	/* At least one pmg for system width */
+	return mpam_sysprops.max_pmg + 1;
+}
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h
index 24b26dc0e3d0..2579d111d7df 100644
--- a/arch/arm64/kernel/mpam/mpam_internal.h
+++ b/arch/arm64/kernel/mpam/mpam_internal.h
@@ -48,4 +48,7 @@ static inline void mpam_clear_feature(enum mpam_device_features feat,
#define MPAM_ARCHITECTURE_V1 0x10
+u16 mpam_sysprops_num_partid(void);
+u16 mpam_sysprops_num_pmg(void);
+
 #endif
From: James Morse <james.morse@arm.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
As only the hardware's default partid has its configuration reset in hardware, we have to do all the others in software.
If the cpu coming online has made a new device accessible, reset it, as over cpuhp we assume its configuration has been lost.
Write the maximum values for all discovered controls.
[Wang ShaoBo: a few version adaptation changes]
Signed-off-by: James Morse <james.morse@arm.com>
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=a6160f572b09dceb6bd65f...
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/include/asm/mpam_resource.h |  5 ++
 arch/arm64/kernel/mpam/mpam_device.c   | 88 ++++++++++++++++++++++++++
 2 files changed, 93 insertions(+)
diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h
index caa0b822d8ff..57ec024c2c50 100644
--- a/arch/arm64/include/asm/mpam_resource.h
+++ b/arch/arm64/include/asm/mpam_resource.h
@@ -72,6 +72,7 @@
 #define BWA_WD 6 /* hard code for P680 */
 #define MBW_MAX_MASK 0xFC00
 #define MBW_MAX_HARDLIM BIT(31)
+#define MBW_MAX_BWA_FRACT(w) GENMASK(w - 1, 0)
 #define MBW_MAX_SET(v) (MBW_MAX_HARDLIM|((v) << (16 - BWA_WD)))
 #define MBW_MAX_GET(v) (((v) & MBW_MAX_MASK) >> (16 - BWA_WD))
@@ -85,6 +86,10 @@
 #define MSMON_CFG_CSU_TYPE 0x43
 #define MSMON_CFG_MBWU_TYPE 0x42
+/* + * Set MPAMCFG_PART_SEL internal bit + */ +#define PART_SEL_SET_INTERNAL(r) (r | BIT(16)) /* * Size of the memory mapped registers: 4K of feature page then 2 x 4K * bitmap registers diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index a72d5f0251a9..47b1c0b25d23 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -530,6 +530,91 @@ int __init mpam_discovery_start(void) return 0; }
+static void mpam_reset_device_bitmap(struct mpam_device *dev, u16 reg, u16 wd) +{ + u32 bm = ~0; + int i; + + lockdep_assert_held(&dev->lock); + + /* write all but the last full-32bit-word */ + for (i = 0; i < wd / 32; i++, reg += sizeof(bm)) + mpam_write_reg(dev, reg, bm); + + /* and the last partial 32bit word */ + bm = GENMASK(wd % 32, 0); + if (bm) + mpam_write_reg(dev, reg, bm); +} + +static void mpam_reset_device_config(struct mpam_component *comp, + struct mpam_device *dev, u32 partid) +{ + u16 intpri = GENMASK(dev->intpri_wd, 0); + u16 dspri = GENMASK(dev->dspri_wd, 0); + u32 pri_val = 0; + u32 mbw_max; + + lockdep_assert_held(&dev->lock); + + if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) + partid = PART_SEL_SET_INTERNAL(partid); + mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); + wmb(); /* subsequent writes must be applied to our new partid */ + + if (mpam_has_feature(mpam_feat_cpor_part, dev->features)) + mpam_reset_device_bitmap(dev, MPAMCFG_CPBM, dev->cpbm_wd); + if (mpam_has_feature(mpam_feat_mbw_part, dev->features)) + mpam_reset_device_bitmap(dev, MPAMCFG_MBW_PBM, + dev->mbw_pbm_bits); + if (mpam_has_feature(mpam_feat_mbw_max, dev->features)) { + mbw_max = MBW_MAX_SET(MBW_MAX_BWA_FRACT(dev->bwa_wd)); + mpam_write_reg(dev, MPAMCFG_MBW_MAX, mbw_max); + } + if (mpam_has_feature(mpam_feat_mbw_min, dev->features)) { + mpam_write_reg(dev, MPAMCFG_MBW_MIN, 0); + } + + if (mpam_has_feature(mpam_feat_intpri_part, dev->features) || + mpam_has_feature(mpam_feat_dspri_part, dev->features)) { + /* aces high? 
*/ + if (!mpam_has_feature(mpam_feat_intpri_part_0_low, + dev->features)) + intpri = 0; + if (!mpam_has_feature(mpam_feat_dspri_part_0_low, + dev->features)) + dspri = 0; + + if (mpam_has_feature(mpam_feat_intpri_part, dev->features)) + pri_val |= intpri; + if (mpam_has_feature(mpam_feat_dspri_part, dev->features)) + pri_val |= (dspri << MPAMCFG_PRI_DSPRI_SHIFT); + + mpam_write_reg(dev, MPAMCFG_PRI, pri_val); + } + mb(); /* complete the configuration before the cpu can use this partid */ +} + +/* + * Called from cpuhp callbacks and with the cpus_read_lock() held from + * mpam_reset_devices(). + */ +static void mpam_reset_device(struct mpam_component *comp, + struct mpam_device *dev) +{ + u32 partid; + + lockdep_assert_held(&dev->lock); + + if (!mpam_has_feature(mpam_feat_part_nrw, dev->features)) { + for (partid = 0; partid < dev->num_partid; partid++) + mpam_reset_device_config(comp, dev, partid); + } else { + for (partid = 0; partid < dev->num_intpartid; partid++) + mpam_reset_device_config(comp, dev, partid); + } +} + static int __online_devices(struct mpam_component *comp, int cpu) { int err = 0; @@ -548,6 +633,9 @@ static int __online_devices(struct mpam_component *comp, int cpu) new_device_probed = true; }
+ if (!err && cpumask_empty(&dev->online_affinity)) + mpam_reset_device(comp, dev); + cpumask_set_cpu(cpu, &dev->online_affinity); spin_unlock_irqrestore(&dev->lock, flags);
From: James Morse <james.morse@arm.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
To make a decision about whether to expose an mpam class as a resctrl resource we need to know its overall supported features and properties.
Once we've probed all the devices, we can walk the tree and produce overall values. If bitmap properties are mismatched within a component, we cannot support that bitmap.
[Wang ShaoBo: a few version adaptation changes]
Signed-off-by: James Morse <james.morse@arm.com>
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=96d7c0b933c0334abf9c45...
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/kernel/mpam/mpam_device.c | 110 +++++++++++++++++++++++++++
 arch/arm64/kernel/mpam/mpam_device.h |  15 ++++
 2 files changed, 125 insertions(+)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index 47b1c0b25d23..bb0864120c71 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -252,6 +252,112 @@ static int mpam_device_probe(struct mpam_device *dev)
 	return 0;
 }
+/* + * If device doesn't match class feature/configuration, do the right thing. + * For 'num' properties we can just take the minimum. + * For properties where the mismatched unused bits would make a difference, we + * nobble the class feature, as we can't configure all the devices. + * e.g. The L3 cache is composed of two devices with 13 and 17 portion + * bitmaps respectively. + */ +static void __device_class_feature_mismatch(struct mpam_device *dev, + struct mpam_class *class) +{ + lockdep_assert_held(&mpam_devices_lock); /* we modify class */ + + if (class->cpbm_wd != dev->cpbm_wd) + mpam_clear_feature(mpam_feat_cpor_part, &class->features); + if (class->mbw_pbm_bits != dev->mbw_pbm_bits) + mpam_clear_feature(mpam_feat_mbw_part, &class->features); + + /* For num properties, take the minimum */ + if (class->num_partid != dev->num_partid) + class->num_partid = min(class->num_partid, dev->num_partid); + if (class->num_intpartid != dev->num_intpartid) + class->num_intpartid = min(class->num_intpartid, dev->num_intpartid); + if (class->num_pmg != dev->num_pmg) + class->num_pmg = min(class->num_pmg, dev->num_pmg); + if (class->num_csu_mon != dev->num_csu_mon) + class->num_csu_mon = min(class->num_csu_mon, dev->num_csu_mon); + if (class->num_mbwu_mon != dev->num_mbwu_mon) + class->num_mbwu_mon = min(class->num_mbwu_mon, + dev->num_mbwu_mon); + + /* bwa_wd is a count of bits, fewer bits means less precision */ + if (class->bwa_wd != dev->bwa_wd) + class->bwa_wd = min(class->bwa_wd, dev->bwa_wd); + + if (class->intpri_wd != dev->intpri_wd) + class->intpri_wd = min(class->intpri_wd, dev->intpri_wd); + if (class->dspri_wd != dev->dspri_wd) + class->dspri_wd = min(class->dspri_wd, dev->dspri_wd); + + /* {int,ds}pri may not have differing 0-low behaviour */ + if (mpam_has_feature(mpam_feat_intpri_part_0_low, class->features) != + mpam_has_feature(mpam_feat_intpri_part_0_low, dev->features)) + mpam_clear_feature(mpam_feat_intpri_part, &class->features); + if 
(mpam_has_feature(mpam_feat_dspri_part_0_low, class->features) != + mpam_has_feature(mpam_feat_dspri_part_0_low, dev->features)) + mpam_clear_feature(mpam_feat_dspri_part, &class->features); +} + +/* + * Squash common class=>component=>device->features down to the + * class->features + */ +static void mpam_enable_squash_features(void) +{ + unsigned long flags; + struct mpam_device *dev; + struct mpam_class *class; + struct mpam_component *comp; + + lockdep_assert_held(&mpam_devices_lock); + + list_for_each_entry(class, &mpam_classes, classes_list) { + /* + * Copy the first component's first device's properties and + * features to the class. __device_class_feature_mismatch() + * will fix them as appropriate. + * It is not possible to have a component with no devices. + */ + if (!list_empty(&class->components)) { + comp = list_first_entry_or_null(&class->components, + struct mpam_component, class_list); + if (WARN_ON(!comp)) + break; + + dev = list_first_entry_or_null(&comp->devices, + struct mpam_device, comp_list); + if (WARN_ON(!dev)) + break; + + spin_lock_irqsave(&dev->lock, flags); + class->features = dev->features; + class->cpbm_wd = dev->cpbm_wd; + class->mbw_pbm_bits = dev->mbw_pbm_bits; + class->bwa_wd = dev->bwa_wd; + class->intpri_wd = dev->intpri_wd; + class->dspri_wd = dev->dspri_wd; + class->num_partid = dev->num_partid; + class->num_intpartid = dev->num_intpartid; + class->num_pmg = dev->num_pmg; + class->num_csu_mon = dev->num_csu_mon; + class->num_mbwu_mon = dev->num_mbwu_mon; + spin_unlock_irqrestore(&dev->lock, flags); + } + + list_for_each_entry(comp, &class->components, class_list) { + list_for_each_entry(dev, &comp->devices, comp_list) { + spin_lock_irqsave(&dev->lock, flags); + __device_class_feature_mismatch(dev, class); + class->features &= dev->features; + spin_unlock_irqrestore(&dev->lock, flags); + } + } + } +} + /* * Enable mpam once all devices have been probed. 
* Scheduled by mpam_discovery_complete() once all devices have been created. @@ -278,6 +384,10 @@ static void __init mpam_enable(struct work_struct *work)
if (!all_devices_probed) return; + + mutex_lock(&mpam_devices_lock); + mpam_enable_squash_features(); + mutex_unlock(&mpam_devices_lock); }
static void mpam_failed(struct work_struct *work) diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h index d49f5be41443..05f8431c71fc 100644 --- a/arch/arm64/kernel/mpam/mpam_device.h +++ b/arch/arm64/kernel/mpam/mpam_device.h @@ -98,10 +98,25 @@ struct mpam_class { u8 level; enum mpam_class_types type;
+ /* Once enabled, the common features */ + u32 features; + struct mutex lock;
/* member of mpam_classes */ struct list_head classes_list; + + u16 cmax_wd; + u16 cpbm_wd; + u16 mbw_pbm_bits; + u16 bwa_wd; + u16 intpri_wd; + u16 dspri_wd; + u16 num_partid; + u16 num_intpartid; + u16 num_pmg; + u16 num_csu_mon; + u16 num_mbwu_mon; };
/* System wide properties */
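The squash rules above (take the minimum of numeric properties; clear any feature whose widths differ between devices, since one class-wide setting could not fit them all) can be sketched as a small userspace program. `struct props` and `squash()` are illustrative stand-ins for `struct mpam_device`/`struct mpam_class` and `__device_class_feature_mismatch()`, not kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the per-device properties; field names follow
 * the patch's struct mpam_device, but this is a standalone sketch. */
struct props {
	uint32_t features;	/* feature bitmask, e.g. cpor_part */
	uint16_t cpbm_wd;	/* cache portion bitmap width */
	uint16_t num_partid;
	uint16_t bwa_wd;	/* bandwidth bits: fewer bits = less precision */
};

#define FEAT_CPOR	(1u << 0)

/* Squash one device's properties into the class view: numeric properties
 * take the minimum; a feature whose widths mismatch is cleared, as we
 * can't configure all the devices with one class-wide value. */
void squash(struct props *class, const struct props *dev)
{
	if (class->cpbm_wd != dev->cpbm_wd)
		class->features &= ~FEAT_CPOR;
	if (class->num_partid > dev->num_partid)
		class->num_partid = dev->num_partid;
	if (class->bwa_wd > dev->bwa_wd)
		class->bwa_wd = dev->bwa_wd;
	class->features &= dev->features;	/* only common features survive */
}
```

This reproduces the L3 example from the comment: two cache devices with 13- and 17-bit portion bitmaps cannot share a portion-bitmap configuration, so the class loses `cpor_part`.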
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
To bridge resctrl resources and mpam devices, we need somewhere to store the configuration information.
We allocate a configuration array for each mpam component; it receives configuration from resctrl's intermediate structures and is written out to the mpam devices (MSCs).
Each config element falls into one of two categories: cache control features (CPBM and CMAX) or memory control features (MAX and PBM). Some extended features are also supported, including priority and the hardlimit choice.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_device.c | 24 ++++++++++++++++++++ arch/arm64/kernel/mpam/mpam_device.h | 4 ++++ arch/arm64/kernel/mpam/mpam_internal.h | 31 ++++++++++++++++++++++++++ 3 files changed, 59 insertions(+)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index bb0864120c71..a4d0a92b9e46 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -358,6 +358,25 @@ static void mpam_enable_squash_features(void) } }
+static int mpam_allocate_config(void) +{ + struct mpam_class *class; + struct mpam_component *comp; + + lockdep_assert_held(&mpam_devices_lock); + + list_for_each_entry(class, &mpam_classes, classes_list) { + list_for_each_entry(comp, &class->components, class_list) { + comp->cfg = kcalloc(mpam_sysprops_num_partid(), sizeof(*comp->cfg), + GFP_KERNEL); + if (!comp->cfg) + return -ENOMEM; + } + } + + return 0; +} + /* * Enable mpam once all devices have been probed. * Scheduled by mpam_discovery_complete() once all devices have been created. @@ -365,6 +384,7 @@ static void mpam_enable_squash_features(void) */ static void __init mpam_enable(struct work_struct *work) { + int err; unsigned long flags; struct mpam_device *dev; bool all_devices_probed = true; @@ -387,6 +407,9 @@ static void __init mpam_enable(struct work_struct *work)
mutex_lock(&mpam_devices_lock); mpam_enable_squash_features(); + err = mpam_allocate_config(); + if (err) + return; mutex_unlock(&mpam_devices_lock); }
@@ -511,6 +534,7 @@ static void mpam_class_destroy(struct mpam_class *class) list_for_each_entry_safe(comp, tmp, &class->components, class_list) { mpam_devices_destroy(comp); list_del(&comp->class_list); + kfree(comp->cfg); kfree(comp); } } diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h index 05f8431c71fc..a98c34742374 100644 --- a/arch/arm64/kernel/mpam/mpam_device.h +++ b/arch/arm64/kernel/mpam/mpam_device.h @@ -7,6 +7,8 @@ #include <linux/types.h> #include "mpam_internal.h"
+struct mpam_config; + /* * Size of the memory mapped registers: 4K of feature page * then 2x 4K bitmap registers @@ -74,6 +76,8 @@ struct mpam_component {
struct cpumask fw_affinity;
+ struct mpam_config *cfg; + /* member of mpam_class:components */ struct list_head class_list; }; diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 2579d111d7df..53df10e84554 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -4,6 +4,37 @@
typedef u32 mpam_features_t;
+/* + * MPAM component config Structure + */ +struct mpam_config { + + /* + * The biggest config we could pass around is 4K, but resctrl's max + * cbm is u32, so we only need the full-size config during reset. + * Just in case a cache with a >u32 bitmap is exported for another + * reason, we need to track which bits of the configuration are valid. + */ + mpam_features_t valid; + + u32 cpbm; + u32 mbw_pbm; + u16 mbw_max; + + /* + * dspri is downstream priority, intpri is internal priority. + */ + u16 dspri; + u16 intpri; + + /* + * hardlimit or not + */ + bool hdl; + + u32 intpartid; +}; + /* Bits for mpam_features_t */ enum mpam_device_features { mpam_feat_ccap_part = 0,
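The per-component allocation can be sketched in userspace. `struct cfg` and `struct component` are cut-down, hypothetical stand-ins for the patch's `struct mpam_config` and `struct mpam_component`; the point is the shape of the data: one zeroed config slot per partid, where an all-zero `valid` mask means "nothing configured yet":

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative stand-ins for struct mpam_config / struct mpam_component. */
struct cfg {
	uint32_t valid;		/* which fields below carry a real value */
	uint32_t cpbm;
	uint16_t mbw_max;
};

struct component {
	struct cfg *cfg;	/* one slot per partid, like comp->cfg */
};

/* Mirrors mpam_allocate_config(): calloc() gives zeroed slots, so every
 * partid starts with an empty 'valid' mask. */
int allocate_config(struct component *comp, unsigned int num_partid)
{
	comp->cfg = calloc(num_partid, sizeof(*comp->cfg));
	return comp->cfg ? 0 : -1;
}
```

In the kernel the array length comes from `mpam_sysprops_num_partid()`, and the array is freed in `mpam_class_destroy()` alongside the component.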
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Pick the available classes and export them as the well-known caches and as MBA:
1) Systems with MPAM support may have a variety of control types at any point of their system layout. We can only expose certain types of control, and only if they exist at particular locations.
Start with the well-known caches. These have to be at depth 2 or 3 and support MPAM's cache portion bitmap controls, with a number of portions no greater than resctrl's limit.
2) Picking which MPAM component we can expose via resctrl as MBA (Memory Bandwidth Allocation) is tricky. The ABI is a percentage of available bandwidth.
We can either do this with the memory bandwidth portion bitmaps, or the memory bandwidth maximum control. If both are implemented we prefer the bitmap.
We require any candidate for this resource type to support bandwidth monitoring too.
For MBA's position in the topology, we want it to be at, or after, the last level cache that is being exposed via resctrl. If there are multiple candidates, we prefer the one closest to the outermost exposed cache.
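The selection rule can be sketched as a pure function. `struct klass` and the level value `255` for a memory controller are illustrative assumptions, not names from the patch; the filter matches `mpam_resctrl_pick_mba()`: require bandwidth controls, reject anything nearer the CPU than the exposed last-level cache, then keep the closest survivor:

```c
#include <stddef.h>

/* Illustrative class descriptor: level counts outwards from the CPU. */
struct klass {
	int level;		/* 2 = L2, 3 = L3, larger = further out */
	int has_bw_ctrl;	/* supports mbw portion bitmap or mbw max */
};

/* Return the class to export as MBA: at or beyond resctrl_llc, with
 * bandwidth controls, preferring the lowest (closest) level. */
const struct klass *pick_mba(const struct klass *classes, int n,
			     int resctrl_llc)
{
	const struct klass *candidate = NULL;

	for (int i = 0; i < n; i++) {
		if (!classes[i].has_bw_ctrl)
			continue;
		if (classes[i].level < resctrl_llc)
			continue;
		if (!candidate || classes[i].level < candidate->level)
			candidate = &classes[i];
	}
	return candidate;
}
```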
Signed-off-by: James Morse james.morse@arm.com Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=b6870246e25f8f6f9c7b27... Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=676d9aee8c2b27a17dd9cb... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/resctrl.h | 19 +++ arch/arm64/kernel/mpam/Makefile | 2 +- arch/arm64/kernel/mpam/mpam_device.c | 25 +++ arch/arm64/kernel/mpam/mpam_internal.h | 38 +++++ arch/arm64/kernel/mpam/mpam_mon.c | 3 +- arch/arm64/kernel/mpam/mpam_resctrl.c | 2 + arch/arm64/kernel/mpam/mpam_setup.c | 223 +++++++++++++++++++++++++ include/linux/resctrlfs.h | 2 + 8 files changed, 312 insertions(+), 2 deletions(-) create mode 100644 arch/arm64/kernel/mpam/mpam_setup.c
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index fb5fa6c13843..258baefc2360 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -8,6 +8,25 @@ #define resctrl_alloc_capable rdt_alloc_capable #define resctrl_mon_capable rdt_mon_capable
+enum resctrl_resource_level { + RDT_RESOURCE_SMMU, + RDT_RESOURCE_L3, + RDT_RESOURCE_L2, + RDT_RESOURCE_MC, + + /* Must be the last */ + RDT_NUM_RESOURCES, +}; + +enum rdt_event_id { + QOS_L3_OCCUP_EVENT_ID = 0x01, + QOS_L3_MBM_TOTAL_EVENT_ID = 0x02, + QOS_L3_MBM_LOCAL_EVENT_ID = 0x03, + + /* Must be the last */ + RESCTRL_NUM_EVENT_IDS, +}; + static inline int alloc_mon_id(void) {
diff --git a/arch/arm64/kernel/mpam/Makefile b/arch/arm64/kernel/mpam/Makefile index f69a7018d42b..23fe2d5095fb 100644 --- a/arch/arm64/kernel/mpam/Makefile +++ b/arch/arm64/kernel/mpam/Makefile @@ -1,3 +1,3 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_MPAM) += mpam_resctrl.o mpam_mon.o \ - mpam_ctrlmon.o mpam_device.o + mpam_ctrlmon.o mpam_device.o mpam_setup.o diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index a4d0a92b9e46..b5362c08e2e3 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -33,6 +33,7 @@ #include <linux/cacheinfo.h> #include <asm/mpam.h> #include <asm/mpam_resource.h> +#include <asm/mpam.h>
#include "mpam_device.h"
@@ -71,6 +72,11 @@ static struct work_struct mpam_enable_work; static int mpam_broken; static struct work_struct mpam_failed_work;
+void mpam_class_list_lock_held(void) +{ + lockdep_assert_held(&mpam_devices_lock); +} + static inline u32 mpam_read_reg(struct mpam_device *dev, u16 reg) { WARN_ON_ONCE(reg > SZ_MPAM_DEVICE); @@ -411,6 +417,25 @@ static void __init mpam_enable(struct work_struct *work) if (err) return; mutex_unlock(&mpam_devices_lock); + + /* + * mpam_enable() runs in parallel with cpuhp callbacks bringing other + * CPUs online, as we eagerly schedule the work. To give resctrl a + * clean start, we make all cpus look offline, set resctrl_registered, + * and then bring them back. + */ + mutex_lock(&mpam_cpuhp_lock); + if (!mpam_cpuhp_state) { + /* We raced with mpam_failed(). */ + mutex_unlock(&mpam_cpuhp_lock); + return; + } + cpuhp_remove_state(mpam_cpuhp_state); + mutex_unlock(&mpam_cpuhp_lock); + + mutex_lock(&mpam_devices_lock); + err = mpam_resctrl_setup(); + mutex_unlock(&mpam_devices_lock); }
static void mpam_failed(struct work_struct *work) diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 53df10e84554..3115f934917d 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -2,8 +2,42 @@ #ifndef _ASM_ARM64_MPAM_INTERNAL_H #define _ASM_ARM64_MPAM_INTERNAL_H
+#include <linux/resctrlfs.h> + typedef u32 mpam_features_t;
+struct mpam_component; +struct rdt_domain; +struct mpam_class; + +extern bool rdt_alloc_capable; +extern bool rdt_mon_capable; + +extern struct list_head mpam_classes; + +struct mpam_resctrl_dom { + struct mpam_component *comp; + + struct rdt_domain resctrl_dom; +}; + +struct mpam_resctrl_res { + struct mpam_class *class; + + bool resctrl_mba_uses_mbw_part; + + struct resctrl_resource resctrl_res; +}; + +#define for_each_resctrl_exports(r) \ + for (r = &mpam_resctrl_exports[0]; \ + r < &mpam_resctrl_exports[0] + \ + ARRAY_SIZE(mpam_resctrl_exports); r++) + +#define for_each_supported_resctrl_exports(r) \ + for_each_resctrl_exports(r) \ + if (r->class) + /* * MPAM component config Structure */ @@ -82,4 +116,8 @@ static inline void mpam_clear_feature(enum mpam_device_features feat, u16 mpam_sysprops_num_partid(void); u16 mpam_sysprops_num_pmg(void);
+void mpam_class_list_lock_held(void); + +int mpam_resctrl_setup(void); + #endif diff --git a/arch/arm64/kernel/mpam/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c index 4ff0f7e1f9d2..497ca7f4aa30 100644 --- a/arch/arm64/kernel/mpam/mpam_mon.c +++ b/arch/arm64/kernel/mpam/mpam_mon.c @@ -29,9 +29,10 @@ #include <linux/module.h> #include <linux/slab.h> #include <linux/resctrlfs.h> - #include <asm/resctrl.h>
+#include "mpam_internal.h" + /* * Global boolean for rdt_monitor which is true if any * resource monitoring is enabled. diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 38e5a551c9d5..c825142d1200 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -44,6 +44,8 @@ #include <asm/resctrl.h> #include <asm/io.h>
+#include "mpam_internal.h" + /* Mutex to protect rdtgroup access. */ DEFINE_MUTEX(resctrl_group_mutex);
diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c new file mode 100644 index 000000000000..3bd660ebf35e --- /dev/null +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -0,0 +1,223 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Common code for ARM v8 MPAM + * + * Copyright (C) 2020-2021 Huawei Technologies Co., Ltd + * + * Author: Wang Shaobo bobo.shaobowang@huawei.com + * + * Code was partially borrowed from http://www.linux-arm.org/ + * git?p=linux-jm.git;a=shortlog;h=refs/heads/mpam/snapshot/may. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * More information about MPAM be found in the Arm Architecture Reference + * Manual. + * + * https://static.docs.arm.com/ddi0598/a/DDI0598_MPAM_supp_armv8a.pdf + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/slab.h> +#include <linux/err.h> +#include <linux/resctrlfs.h> +#include <asm/resctrl.h> + +#include "mpam_device.h" +#include "mpam_internal.h" + +/* + * The classes we've picked to map to resctrl resources. + * Class pointer may be NULL. + */ +struct mpam_resctrl_res mpam_resctrl_exports[RDT_NUM_RESOURCES]; +struct mpam_resctrl_res mpam_resctrl_events[RESCTRL_NUM_EVENT_IDS]; + +/* Test whether we can export MPAM_CLASS_CACHE:{2,3}? 
*/ +static void mpam_resctrl_pick_caches(void) +{ + struct mpam_class *class; + struct mpam_resctrl_res *res; + + mpam_class_list_lock_held(); + + list_for_each_entry(class, &mpam_classes, classes_list) { + if (class->type != MPAM_CLASS_CACHE) + continue; + + if (class->level != 2 && class->level != 3) + continue; + + if (!mpam_has_feature(mpam_feat_cpor_part, class->features) && + !mpam_has_feature(mpam_feat_msmon_csu, class->features)) + continue; + + if (!mpam_has_feature(mpam_feat_msmon_csu, class->features) && + mpam_sysprops_num_partid() <= 1) + continue; + + if (class->cpbm_wd > RESCTRL_MAX_CBM) + continue; + + if (class->level == 2) { + res = &mpam_resctrl_exports[RDT_RESOURCE_L2]; + res->resctrl_res.name = "L2"; + } else { + res = &mpam_resctrl_exports[RDT_RESOURCE_L3]; + res->resctrl_res.name = "L3"; + } + res->class = class; + } +} + +/* Find what we can can export as MBA */ +static void mpam_resctrl_pick_mba(void) +{ + u8 resctrl_llc; + struct mpam_class *class; + struct mpam_class *candidate = NULL; + + mpam_class_list_lock_held(); + + /* At least two partitions ... */ + if (mpam_sysprops_num_partid() <= 1) + return; + + if (mpam_resctrl_exports[RDT_RESOURCE_L3].class) + resctrl_llc = 3; + else if (mpam_resctrl_exports[RDT_RESOURCE_L2].class) + resctrl_llc = 2; + else + resctrl_llc = 0; + + list_for_each_entry(class, &mpam_classes, classes_list) { + if (class->type == MPAM_CLASS_UNKNOWN) + continue; + + if (class->level < resctrl_llc) + continue; + + /* + * Once we support MBM counters, we should require the MBA + * class to be at the same point in the hierarchy. Practically, + * this means the MBA class must support MBWU. Until then + * having something is better than nothing, but this may cause + * the MBA resource to disappear over a kernel update on a + * system that could support both, but not at the same time. + */ + + /* + * There are two ways we can generate delays for MBA, either + * with the mbw portion bitmap, or the mbw max control. 
+ */ + if (!mpam_has_feature(mpam_feat_mbw_part, class->features) && + !mpam_has_feature(mpam_feat_mbw_max, class->features)) { + continue; + } + + /* pick the class 'closest' to resctrl_llc */ + if (!candidate || (class->level < candidate->level)) + candidate = class; + } + + if (candidate) + mpam_resctrl_exports[RDT_RESOURCE_MC].class = candidate; +} + +static void mpam_resctrl_pick_event_l3_occup(void) +{ + /* + * as the name suggests, resctrl can only use this if your cache is + * called 'l3'. + */ + struct mpam_resctrl_res *res = &mpam_resctrl_exports[RDT_RESOURCE_L3]; + + if (!res->class) + return; + + if (!mpam_has_feature(mpam_feat_msmon_csu, res->class->features)) + return; + + mpam_resctrl_events[QOS_L3_OCCUP_EVENT_ID] = *res; + + rdt_mon_capable = true; + res->resctrl_res.mon_capable = true; + res->resctrl_res.mon_capable = true; +} + +static void mpam_resctrl_pick_event_mbm_total(void) +{ + u64 num_counters; + struct mpam_resctrl_res *res; + + /* We prefer to measure mbm_total on whatever we used as MBA... */ + res = &mpam_resctrl_exports[RDT_RESOURCE_MC]; + if (!res->class) { + /* ... but if there isn't one, the L3 cache works */ + res = &mpam_resctrl_exports[RDT_RESOURCE_L3]; + if (!res->class) + return; + } + + /* + * to measure bandwidth in a resctrl like way, we need to leave a + * counter running all the time. As these are PMU-like, it is really + * unlikely we have enough... To be useful, we'd need at least one per + * closid. + */ + num_counters = mpam_sysprops_num_partid(); + + if (mpam_has_feature(mpam_feat_msmon_mbwu, res->class->features)) { + if (res->class->num_mbwu_mon >= num_counters) { + /* + * We don't support this use of monitors, let the + * world know this platform could make use of them + * if we did! 
+ */ + } + } +} + +static void mpam_resctrl_pick_event_mbm_local(void) +{ + struct mpam_resctrl_res *res; + + res = &mpam_resctrl_exports[RDT_RESOURCE_MC]; + if (!res->class) + return; + + if (mpam_has_feature(mpam_feat_msmon_mbwu, res->class->features)) { + res->resctrl_res.mon_capable = true; + mpam_resctrl_events[QOS_L3_MBM_LOCAL_EVENT_ID] = *res; + } +} + +/* Called with the mpam classes lock held */ +int mpam_resctrl_setup(void) +{ + struct mpam_resctrl_res *res; + enum resctrl_resource_level level = 0; + + for_each_resctrl_exports(res) { + INIT_LIST_HEAD(&res->resctrl_res.domains); + res->resctrl_res.rid = level; + level++; + } + + mpam_resctrl_pick_caches(); + mpam_resctrl_pick_mba(); + + mpam_resctrl_pick_event_l3_occup(); + mpam_resctrl_pick_event_mbm_total(); + mpam_resctrl_pick_event_mbm_local(); + + return 0; +} diff --git a/include/linux/resctrlfs.h b/include/linux/resctrlfs.h index e192fd55c316..7271703431e2 100644 --- a/include/linux/resctrlfs.h +++ b/include/linux/resctrlfs.h @@ -94,4 +94,6 @@ void resctrl_group_kn_unlock(struct kernfs_node *kn);
void post_resctrl_mount (void);
+#define RESCTRL_MAX_CBM 32 + #endif /* _RESCTRLFS_H */
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Initialize resctrl resources from the exported resctrl_res, which contains the class distinguished by mpam type and level (the level only matters for caches).
The resctrl resource structure and its initialization process need to be modified, because they currently don't distinguish between L2 and L3.
Part of the code is based on James Morse's (see the links); the rest follows Intel RDT's, with appropriate extensions.
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=b6870246e25f8f6f9c7b27... Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=676d9aee8c2b27a17dd9cb... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 9 -- arch/arm64/include/asm/resctrl.h | 2 +- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 2 +- arch/arm64/kernel/mpam/mpam_internal.h | 6 ++ arch/arm64/kernel/mpam/mpam_mon.c | 2 +- arch/arm64/kernel/mpam/mpam_resctrl.c | 73 +++++++++----- arch/arm64/kernel/mpam/mpam_setup.c | 128 +++++++++++++++++++++++++ include/linux/resctrlfs.h | 37 +++++++ 8 files changed, 223 insertions(+), 36 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index b83f940e0432..97e259703933 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -212,15 +212,6 @@ extern struct resctrl_resource resctrl_resources_all[];
int __init resctrl_group_init(void);
-enum { - MPAM_RESOURCE_SMMU, - MPAM_RESOURCE_CACHE, - MPAM_RESOURCE_MC, - - /* Must be the last */ - MPAM_NUM_RESOURCES, -}; - void rdt_last_cmd_clear(void); void rdt_last_cmd_puts(const char *s); void rdt_last_cmd_printf(const char *fmt, ...); diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 258baefc2360..d0d30a0fdc1d 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -71,7 +71,7 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of,
#define for_each_resctrl_resource(r) \ for (r = resctrl_resources_all; \ - r < resctrl_resources_all + MPAM_NUM_RESOURCES; \ + r < resctrl_resources_all + RDT_NUM_RESOURCES; \ r++) \
int mpam_get_mon_config(struct resctrl_resource *r); diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 22e701195b28..522ed65bb810 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -538,7 +538,7 @@ int mkdir_mondata_all(struct kernfs_node *parent_kn, if (r->mon_enabled) { /* HHA does not support monitor by pmg */ if ((prgrp->type == RDTMON_GROUP) && - (r->rid == MPAM_RESOURCE_MC)) + (r->rid == RDT_RESOURCE_MC)) continue;
ret = mkdir_mondata_subdir_alldom(kn, r, prgrp); diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 3115f934917d..be4109c19de9 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -9,12 +9,15 @@ typedef u32 mpam_features_t; struct mpam_component; struct rdt_domain; struct mpam_class; +struct raw_resctrl_resource;
extern bool rdt_alloc_capable; extern bool rdt_mon_capable;
extern struct list_head mpam_classes;
+#define MAX_MBA_BW 100u + struct mpam_resctrl_dom { struct mpam_component *comp;
@@ -120,4 +123,7 @@ void mpam_class_list_lock_held(void);
int mpam_resctrl_setup(void);
+struct raw_resctrl_resource * +mpam_get_raw_resctrl_resource(u32 level); + #endif diff --git a/arch/arm64/kernel/mpam/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c index 497ca7f4aa30..81dddf5432b5 100644 --- a/arch/arm64/kernel/mpam/mpam_mon.c +++ b/arch/arm64/kernel/mpam/mpam_mon.c @@ -45,7 +45,7 @@ void pmg_init(void) { /* use L3's num_pmg as system num_pmg */ struct raw_resctrl_resource *rr = - resctrl_resources_all[MPAM_RESOURCE_CACHE].res; + resctrl_resources_all[RDT_RESOURCE_L3].res; int num_pmg = rr->num_pmg;
mon_init(); diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index c825142d1200..d37dbde9f89c 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -195,7 +195,7 @@ int mpam_create_cache_node(u32 component_id, struct mpam_node *new; char *name;
- if (validate_mpam_node(MPAM_RESOURCE_CACHE, component_id)) + if (validate_mpam_node(RDT_RESOURCE_L3, component_id)) goto skip;
new = kzalloc(sizeof(struct mpam_node), GFP_KERNEL); @@ -211,7 +211,7 @@ int mpam_create_cache_node(u32 component_id,
mpam_node_assign_val(new, name, - MPAM_RESOURCE_CACHE, + RDT_RESOURCE_L3, hwpage_address, component_id); list_add_tail(&new->list, &mpam_nodes_ptr->list); @@ -226,7 +226,7 @@ int mpam_create_memory_node(u32 component_id, struct mpam_node *new; char *name;
- if (validate_mpam_node(MPAM_RESOURCE_MC, component_id)) + if (validate_mpam_node(RDT_RESOURCE_MC, component_id)) goto skip;
new = kzalloc(sizeof(struct mpam_node), GFP_KERNEL); @@ -242,7 +242,7 @@ int mpam_create_memory_node(u32 component_id,
mpam_node_assign_val(new, name, - MPAM_RESOURCE_MC, + RDT_RESOURCE_MC, hwpage_address, component_id); list_add_tail(&new->list, &mpam_nodes_ptr->list); @@ -301,7 +301,7 @@ static int csu_write(struct rdt_domain *d, struct rdtgroup *g, bool enable); #define domain_init(id) LIST_HEAD_INIT(resctrl_resources_all[id].domains)
struct raw_resctrl_resource raw_resctrl_resources_all[] = { - [MPAM_RESOURCE_CACHE] = { + [RDT_RESOURCE_L3] = { .msr_update = cat_wrmsr, .msr_read = cat_rdmsr, .parse_ctrlval = parse_cbm, @@ -309,7 +309,15 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .mon_read = csu_read, .mon_write = csu_write, }, - [MPAM_RESOURCE_MC] = { + [RDT_RESOURCE_L2] = { + .msr_update = cat_wrmsr, + .msr_read = cat_rdmsr, + .parse_ctrlval = parse_cbm, + .format_str = "%d=%0*x", + .mon_read = csu_read, + .mon_write = csu_write, + }, + [RDT_RESOURCE_MC] = { .msr_update = bw_wrmsr, .msr_read = bw_rdmsr, .parse_ctrlval = parse_bw, /* add parse_bw() helper */ @@ -320,24 +328,41 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { };
struct resctrl_resource resctrl_resources_all[] = { - [MPAM_RESOURCE_CACHE] = { - .rid = MPAM_RESOURCE_CACHE, - .name = "L3", - .domains = domain_init(MPAM_RESOURCE_CACHE), - .res = &raw_resctrl_resources_all[MPAM_RESOURCE_CACHE], - .fflags = RFTYPE_RES_CACHE, - .alloc_enabled = 1, + [RDT_RESOURCE_L3] = { + .rid = RDT_RESOURCE_L3, + .name = "L3", + .domains = domain_init(RDT_RESOURCE_L3), + .res = &raw_resctrl_resources_all[RDT_RESOURCE_L3], + .fflags = RFTYPE_RES_CACHE, + .alloc_enabled = 1, + }, + [RDT_RESOURCE_L2] = { + .rid = RDT_RESOURCE_L2, + .name = "L2", + .domains = domain_init(RDT_RESOURCE_L2), + .res = &raw_resctrl_resources_all[RDT_RESOURCE_L2], + .fflags = RFTYPE_RES_CACHE, + .alloc_enabled = 1, }, - [MPAM_RESOURCE_MC] = { - .rid = MPAM_RESOURCE_MC, - .name = "MB", - .domains = domain_init(MPAM_RESOURCE_MC), - .res = &raw_resctrl_resources_all[MPAM_RESOURCE_MC], - .fflags = RFTYPE_RES_MC, - .alloc_enabled = 1, + [RDT_RESOURCE_MC] = { + .rid = RDT_RESOURCE_MC, + .name = "MB", + .domains = domain_init(RDT_RESOURCE_MC), + .res = &raw_resctrl_resources_all[RDT_RESOURCE_MC], + .fflags = RFTYPE_RES_MC, + .alloc_enabled = 1, }, };
+struct raw_resctrl_resource * +mpam_get_raw_resctrl_resource(enum resctrl_resource_level level) +{ + if (level >= RDT_NUM_RESOURCES) + return NULL; + + return &raw_resctrl_resources_all[level]; +} + static void cat_wrmsr(struct rdt_domain *d, int partid) { @@ -1324,13 +1349,13 @@ static void mpam_domains_init(struct resctrl_resource *r) r->mon_capable = MPAMF_IDR_HAS_MSMON(val); r->mon_enabled = MPAMF_IDR_HAS_MSMON(val);
- if (r->rid == MPAM_RESOURCE_CACHE) { + if (r->rid == RDT_RESOURCE_L3) { r->alloc_capable = MPAMF_IDR_HAS_CPOR_PART(val); r->alloc_enabled = MPAMF_IDR_HAS_CPOR_PART(val);
val = mpam_readl(d->base + MPAMF_CSUMON_IDR); rr->num_mon = MPAMF_IDR_NUM_MON(val); - } else if (r->rid == MPAM_RESOURCE_MC) { + } else if (r->rid == RDT_RESOURCE_MC) { r->alloc_capable = MPAMF_IDR_HAS_MBW_PART(val); r->alloc_enabled = MPAMF_IDR_HAS_MBW_PART(val);
@@ -1387,8 +1412,8 @@ static int __init mpam_init(void) goto out; }
- mpam_domains_init(&resctrl_resources_all[MPAM_RESOURCE_CACHE]); - mpam_domains_init(&resctrl_resources_all[MPAM_RESOURCE_MC]); + mpam_domains_init(&resctrl_resources_all[RDT_RESOURCE_L3]); + mpam_domains_init(&resctrl_resources_all[RDT_RESOURCE_MC]);
state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/mpam:online:", diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index 3bd660ebf35e..d84fd7f3aabd 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -200,9 +200,128 @@ static void mpam_resctrl_pick_event_mbm_local(void) } }
+static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) +{ + struct mpam_class *class = res->class; + struct resctrl_resource *r = &res->resctrl_res; + + if (class == mpam_resctrl_exports[RDT_RESOURCE_SMMU].class) { + return 0; + } else if (class == mpam_resctrl_exports[RDT_RESOURCE_MC].class) { + r->rid = RDT_RESOURCE_MC; + r->name = "MB"; + r->fflags = RFTYPE_RES_MC; + r->mbw.delay_linear = true; + r->res = mpam_get_raw_resctrl_resource(RDT_RESOURCE_MC); + + if (mpam_has_feature(mpam_feat_mbw_part, class->features)) { + res->resctrl_mba_uses_mbw_part = true; + + /* + * The maximum throttling is the number of bits we can + * unset in the bitmap. We never clear all of them, + * so the minimum is one bit, as a percentage. + */ + r->mbw.min_bw = MAX_MBA_BW / class->mbw_pbm_bits; + } else { + /* we're using mpam_feat_mbw_max's */ + res->resctrl_mba_uses_mbw_part = false; + + /* + * The maximum throttling is the number of fractions we + * can represent with the implemented bits. We never + * set 0. The minimum is the LSB, as a percentage. + */ + r->mbw.min_bw = MAX_MBA_BW / + ((1ULL << class->bwa_wd) - 1); + /* the largest mbw_max is 100 */ + r->default_ctrl = 100; + } + /* Just in case we have an excessive number of bits */ + if (!r->mbw.min_bw) + r->mbw.min_bw = 1; + + /* + * because its linear with no offset, the granule is the same + * as the smallest value + */ + r->mbw.bw_gran = r->mbw.min_bw; + + /* We will only pick a class that can monitor and control */ + r->alloc_capable = true; + r->alloc_enabled = true; + rdt_alloc_capable = true; + r->mon_capable = true; + r->mon_enabled = true; + } else if (class == mpam_resctrl_exports[RDT_RESOURCE_L3].class) { + r->rid = RDT_RESOURCE_L3; + r->res = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L3); + r->fflags = RFTYPE_RES_CACHE; + r->name = "L3"; + + r->cache.cbm_len = class->cpbm_wd; + r->default_ctrl = GENMASK(class->cpbm_wd - 1, 0); + /* + * Which bits are shared with other ...things... 
+ * Unknown devices use partid-0 which uses all the bitmap + * fields. Until we configured the SMMU and GIC not to do this + * 'all the bits' is the correct answer here. + */ + r->cache.shareable_bits = r->default_ctrl; + r->cache.min_cbm_bits = 1; + + if (mpam_has_feature(mpam_feat_cpor_part, class->features)) { + r->alloc_capable = true; + r->alloc_enabled = true; + rdt_alloc_capable = true; + } + /* + * While this is a CPU-interface feature of MPAM, we only tell + * resctrl about it for caches, as that seems to be how x86 + * works, and thus what resctrl expects. + */ + r->cdp_capable = true; + r->mon_capable = true; + r->mon_enabled = true; + + } else if (class == mpam_resctrl_exports[RDT_RESOURCE_L2].class) { + r->rid = RDT_RESOURCE_L2; + r->res = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L2); + r->fflags = RFTYPE_RES_CACHE; + r->name = "L2"; + + r->cache.cbm_len = class->cpbm_wd; + r->default_ctrl = GENMASK(class->cpbm_wd - 1, 0); + /* + * Which bits are shared with other ...things... + * Unknown devices use partid-0 which uses all the bitmap + * fields. Until we configured the SMMU and GIC not to do this + * 'all the bits' is the correct answer here. + */ + r->cache.shareable_bits = r->default_ctrl; + + if (mpam_has_feature(mpam_feat_cpor_part, class->features)) { + r->alloc_capable = true; + r->alloc_enabled = true; + rdt_alloc_capable = true; + } + + /* + * While this is a CPU-interface feature of MPAM, we only tell + * resctrl about it for caches, as that seems to be how x86 + * works, and thus what resctrl expects. + */ + r->cdp_capable = true; + r->mon_capable = false; + } + + return 0; +} + /* Called with the mpam classes lock held */ int mpam_resctrl_setup(void) { + int rc; struct mpam_resctrl_res *res; enum resctrl_resource_level level = 0;
@@ -219,5 +338,14 @@ int mpam_resctrl_setup(void) mpam_resctrl_pick_event_mbm_total(); mpam_resctrl_pick_event_mbm_local();
+ for_each_supported_resctrl_exports(res) { + rc = mpam_resctrl_resource_init(res); + if (rc) + return rc; + } + + if (!rdt_alloc_capable && !rdt_mon_capable) + return -EOPNOTSUPP; + return 0; } diff --git a/include/linux/resctrlfs.h b/include/linux/resctrlfs.h index 7271703431e2..a97cbf310def 100644 --- a/include/linux/resctrlfs.h +++ b/include/linux/resctrlfs.h @@ -9,6 +9,36 @@ #include <linux/seq_buf.h> #include <linux/seq_file.h>
+/** + * struct resctrl_cache - Cache allocation related data + * @cbm_len: Length of the cache bit mask + * @min_cbm_bits: Minimum number of consecutive bits to be set + * @cbm_idx_mult: Multiplier of CBM index + * @cbm_idx_offset: Offset of CBM index. CBM index is computed by: + * closid * cbm_idx_multi + cbm_idx_offset + * in a cache bit mask + * @shareable_bits: Bitmask of shareable resource with other + * executing entities + * @arch_has_sparse_bitmaps: True if a bitmap like f00f is valid. + */ +struct resctrl_cache { + u32 cbm_len; + u32 shareable_bits; + u32 min_cbm_bits; +}; + +/** + * struct resctrl_membw - Memory bandwidth allocation related data + * @min_bw: Minimum memory bandwidth percentage user can request + * @bw_gran: Granularity at which the memory bandwidth is allocated + * @delay_linear: True if memory B/W delay is in linear scale + */ +struct resctrl_membw { + u32 min_bw; + u32 bw_gran; + u32 delay_linear; +}; + struct resctrl_resource { int rid; bool alloc_enabled; @@ -20,6 +50,13 @@ struct resctrl_resource { struct list_head evt_list; unsigned long fflags;
+ struct resctrl_cache cache; + struct resctrl_membw mbw; + + bool cdp_capable; + bool cdp_enable; + u32 default_ctrl; + void *res; };
From: James Morse james.morse@arm.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
So far we probe devices as they become accessible. Each time we probe a new device, we eagerly schedule mpam_enable(). Once all the devices have been probed, mpam_enable() pokes resctrl.
At this point, resctrl has an inconsistent view of which CPUs are online, as we only update the classes that we picked, and we only did that after there were enough CPUs online to have probed all the devices.
Instead of adding complicated re-sync logic, unregister the cpuhp callbacks, register resctrl, then re-register them. As all the devices have been probed by this point, no new device can be found that would cause mpam_enable() to be re-scheduled.
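The gating this patch introduces can be sketched in plain C as a userspace model (all names here are illustrative stand-ins, not the kernel API): cpuhp notifications only reach resctrl once mpam_enable() has flipped a flag, and the flag is only flipped while the callbacks are unregistered, so no event can observe a half-registered resctrl.

```c
#include <assert.h>
#include <stdbool.h>

static bool resctrl_registered;
static int resctrl_online_calls;

/* Stand-in for mpam_resctrl_cpu_online() */
static void mpam_resctrl_cpu_online_model(unsigned int cpu)
{
	(void)cpu;
	resctrl_online_calls++;
}

/* Stand-in for the mpam_cpu_online() cpuhp callback */
static int mpam_cpu_online_model(unsigned int cpu)
{
	/* device probing would happen here first */
	if (resctrl_registered)
		mpam_resctrl_cpu_online_model(cpu);
	return 0;
}

/* Stand-in for mpam_enable(): runs while the cpuhp callbacks are
 * unregistered, so no online/offline event races with it. */
static void mpam_enable_model(void)
{
	resctrl_registered = true;
}
```

Before mpam_enable_model() runs, online events are absorbed without touching resctrl; afterwards they are forwarded, matching the resctrl_registered checks added to mpam_cpu_online()/mpam_cpu_offline().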
[Wang ShaoBo: many version adaptation changes]
Signed-off-by: James Morse james.morse@arm.com Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=e81ea2f3ca64d8e46a0519... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_device.c | 22 +++++++++++++++++++++- arch/arm64/kernel/mpam/mpam_internal.h | 6 ++++++ arch/arm64/kernel/mpam/mpam_resctrl.c | 22 ++++++++++++++-------- arch/arm64/kernel/mpam/mpam_setup.c | 10 ++++++++++ 4 files changed, 51 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index b5362c08e2e3..285f69244da3 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -53,6 +53,7 @@ LIST_HEAD(mpam_classes); static DEFINE_MUTEX(mpam_cpuhp_lock); static int mpam_cpuhp_state;
+static bool resctrl_registered;
static inline int mpam_cpu_online(unsigned int cpu); static inline int mpam_cpu_offline(unsigned int cpu); @@ -431,11 +432,24 @@ static void __init mpam_enable(struct work_struct *work) return; } cpuhp_remove_state(mpam_cpuhp_state); - mutex_unlock(&mpam_cpuhp_lock);
mutex_lock(&mpam_devices_lock); err = mpam_resctrl_setup(); + if (!err) { + err = mpam_resctrl_init(); + if (!err) + resctrl_registered = true; + } + if (err) + pr_err("Failed to setup/init resctrl\n"); mutex_unlock(&mpam_devices_lock); + + mpam_cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, + "mpam:online", mpam_cpu_online, + mpam_cpu_offline); + if (mpam_cpuhp_state <= 0) + pr_err("Failed to re-register 'dyn' cpuhp callbacks"); + mutex_unlock(&mpam_cpuhp_lock); }
static void mpam_failed(struct work_struct *work) @@ -878,6 +892,9 @@ static int mpam_cpu_online(unsigned int cpu) return err; }
+ if (resctrl_registered) + mpam_resctrl_cpu_online(cpu); + return 0; }
@@ -891,6 +908,9 @@ static int mpam_cpu_offline(unsigned int cpu)
mutex_unlock(&mpam_devices_lock);
+ if (resctrl_registered) + mpam_resctrl_cpu_offline(cpu); + return 0; }
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index be4109c19de9..106a67ef687a 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -121,9 +121,15 @@ u16 mpam_sysprops_num_pmg(void);
void mpam_class_list_lock_held(void);
+int mpam_resctrl_cpu_online(unsigned int cpu); + +int mpam_resctrl_cpu_offline(unsigned int cpu); + int mpam_resctrl_setup(void);
struct raw_resctrl_resource * mpam_get_raw_resctrl_resource(u32 level);
+int __init mpam_resctrl_init(void); + #endif diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index d37dbde9f89c..3d9b0a32ea02 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -1404,8 +1404,6 @@ static int __init mpam_init(void) rdt_alloc_capable = 1; rdt_mon_capable = 1;
- mpam_init_padding(); - ret = mpam_nodes_init(); if (ret) { pr_err("internal error: bad cpu list\n"); @@ -1423,12 +1421,7 @@ static int __init mpam_init(void) goto out; }
- register_resctrl_specific_files(res_specific_files, ARRAY_SIZE(res_specific_files)); - - seq_buf_init(&last_cmd_status, last_cmd_status_buf, - sizeof(last_cmd_status_buf)); - - ret = resctrl_group_init(); + ret = mpam_resctrl_init(); if (ret) { cpuhp_remove_state(state); goto out; @@ -1449,6 +1442,19 @@ static int __init mpam_init(void) return ret; }
+int __init mpam_resctrl_init(void) +{ + mpam_init_padding(); + + register_resctrl_specific_files(res_specific_files, + ARRAY_SIZE(res_specific_files)); + + seq_buf_init(&last_cmd_status, last_cmd_status_buf, + sizeof(last_cmd_status_buf)); + + return resctrl_group_init(); +} + /* * __intel_rdt_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR * diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index d84fd7f3aabd..94abcf46806f 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -41,6 +41,16 @@ struct mpam_resctrl_res mpam_resctrl_exports[RDT_NUM_RESOURCES]; struct mpam_resctrl_res mpam_resctrl_events[RESCTRL_NUM_EVENT_IDS];
+int mpam_resctrl_cpu_online(unsigned int cpu) +{ + return 0; +} + +int mpam_resctrl_cpu_offline(unsigned int cpu) +{ + return 0; +} + /* Test whether we can export MPAM_CLASS_CACHE:{2,3}? */ static void mpam_resctrl_pick_caches(void) {
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Implement the cpuhp hooks that allocate and free the resctrl domain structures. The ctrl_val array in the resctrl_resource structure is created and destroyed at the same time, so these operations stay in sync as CPUs come online and go offline; for mpam resctrl, only the CPU mask needs to be tracked.
Most of this code is borrowed from James's commit 76814660 ("arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation").
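The domain lifecycle described above can be sketched as a userspace model (a uint64_t stands in for struct cpumask, and the cpu/4 component mapping is invented for illustration): the first CPU of a component allocates its domain, and the last CPU to go offline frees it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for mpam_resctrl_dom + rdt_domain */
struct model_dom {
	unsigned int id;	/* component id */
	uint64_t cpu_mask;	/* stand-in for struct cpumask */
	struct model_dom *next;
};

static struct model_dom *domains;

static struct model_dom *find_dom(unsigned int id)
{
	struct model_dom *d;

	for (d = domains; d; d = d->next)
		if (d->id == id)
			return d;
	return NULL;
}

/* Online: allocate the domain on first CPU, otherwise just set the bit */
static int cpu_online_model(unsigned int cpu)
{
	unsigned int id = cpu / 4;	/* illustrative cpu->component map */
	struct model_dom *d = find_dom(id);

	if (!d) {
		d = calloc(1, sizeof(*d));
		if (!d)
			return -1;
		d->id = id;
		d->next = domains;
		domains = d;
	}
	d->cpu_mask |= 1ULL << cpu;
	return 0;
}

/* Offline: clear the bit, free the domain when its mask becomes empty */
static void cpu_offline_model(unsigned int cpu)
{
	struct model_dom *d = find_dom(cpu / 4), **p;

	if (!d)
		return;
	d->cpu_mask &= ~(1ULL << cpu);
	if (d->cpu_mask)
		return;
	for (p = &domains; *p; p = &(*p)->next)
		if (*p == d) {
			*p = d->next;
			free(d);
			return;
		}
}
```

mpam_resctrl_cpu_online()/mpam_resctrl_cpu_offline() in the patch follow the same shape, with the ctrl_val array allocated and freed alongside the domain.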
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=768146605a808b379ae386... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_internal.h | 3 + arch/arm64/kernel/mpam/mpam_resctrl.c | 18 +++- arch/arm64/kernel/mpam/mpam_setup.c | 119 ++++++++++++++++++++++++- include/linux/resctrlfs.h | 1 + 4 files changed, 138 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 106a67ef687a..ed411d7b0031 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -132,4 +132,7 @@ mpam_get_raw_resctrl_resource(u32 level);
int __init mpam_resctrl_init(void);
+int mpam_resctrl_set_default_cpu(unsigned int cpu); +void mpam_resctrl_clear_default_cpu(unsigned int cpu); + #endif diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 3d9b0a32ea02..8fce1b7a32db 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -82,6 +82,19 @@ bool rdt_alloc_capable; * AFF2: MPIDR.AFF2 */
+int mpam_resctrl_set_default_cpu(unsigned int cpu) +{ + /* The cpu is set in default rdtgroup after online. */ + cpumask_set_cpu(cpu, &resctrl_group_default.cpu_mask); + return 0; +} + +void mpam_resctrl_clear_default_cpu(unsigned int cpu) +{ + /* The cpu is set in default rdtgroup after online. */ + cpumask_clear_cpu(cpu, &resctrl_group_default.cpu_mask); +} + static inline void mpam_node_assign_val(struct mpam_node *n, char *name, u8 type, @@ -529,13 +542,14 @@ void closid_free(int closid)
static int mpam_online_cpu(unsigned int cpu) { - cpumask_set_cpu(cpu, &resctrl_group_default.cpu_mask); - return 0; + return mpam_resctrl_set_default_cpu(cpu); }
/* remove related resource when cpu offline */ static int mpam_offline_cpu(unsigned int cpu) { + mpam_resctrl_clear_default_cpu(cpu); + return 0; }
diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index 94abcf46806f..45639a1fecb9 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -41,16 +41,133 @@ struct mpam_resctrl_res mpam_resctrl_exports[RDT_NUM_RESOURCES]; struct mpam_resctrl_res mpam_resctrl_events[RESCTRL_NUM_EVENT_IDS];
-int mpam_resctrl_cpu_online(unsigned int cpu) +/* Like resctrl_get_domain_from_cpu(), but for offline CPUs */ +static struct mpam_resctrl_dom * +mpam_get_domain_from_cpu(int cpu, struct mpam_resctrl_res *res) { + struct rdt_domain *d; + struct mpam_resctrl_dom *dom; + + list_for_each_entry(d, &res->resctrl_res.domains, list) { + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + + if (cpumask_test_cpu(cpu, &dom->comp->fw_affinity)) + return dom; + } + + return NULL; +} + +static int mpam_resctrl_setup_domain(unsigned int cpu, + struct mpam_resctrl_res *res) +{ + struct mpam_resctrl_dom *dom; + struct mpam_class *class = res->class; + struct mpam_component *comp_iter, *comp; + u32 num_partid; + u32 **ctrlval_ptr; + + num_partid = mpam_sysprops_num_partid(); + + comp = NULL; + list_for_each_entry(comp_iter, &class->components, class_list) { + if (cpumask_test_cpu(cpu, &comp_iter->fw_affinity)) { + comp = comp_iter; + break; + } + } + + /* cpu with unknown exported component? */ + if (WARN_ON_ONCE(!comp)) + return 0; + + dom = kzalloc_node(sizeof(*dom), GFP_KERNEL, cpu_to_node(cpu)); + if (!dom) + return -ENOMEM; + + dom->comp = comp; + INIT_LIST_HEAD(&dom->resctrl_dom.list); + dom->resctrl_dom.id = comp->comp_id; + cpumask_set_cpu(cpu, &dom->resctrl_dom.cpu_mask); + + ctrlval_ptr = &dom->resctrl_dom.ctrl_val; + *ctrlval_ptr = kmalloc_array(num_partid, + sizeof(**ctrlval_ptr), GFP_KERNEL); + if (!*ctrlval_ptr) { + kfree(dom); + return -ENOMEM; + } + + /* TODO: this list should be sorted */ + list_add_tail(&dom->resctrl_dom.list, &res->resctrl_res.domains); + res->resctrl_res.dom_num++; + return 0; }
+int mpam_resctrl_cpu_online(unsigned int cpu) +{ + int ret; + struct mpam_resctrl_dom *dom; + struct mpam_resctrl_res *res; + + for_each_supported_resctrl_exports(res) { + dom = mpam_get_domain_from_cpu(cpu, res); + if (dom) { + cpumask_set_cpu(cpu, &dom->resctrl_dom.cpu_mask); + } else { + ret = mpam_resctrl_setup_domain(cpu, res); + if (ret) + return ret; + } + } + + return mpam_resctrl_set_default_cpu(cpu); +} + +static inline struct rdt_domain * +resctrl_get_domain_from_cpu(int cpu, struct resctrl_resource *r) +{ + struct rdt_domain *d; + + list_for_each_entry(d, &r->domains, list) { + /* Find the domain that contains this CPU */ + if (cpumask_test_cpu(cpu, &d->cpu_mask)) + return d; + } + + return NULL; +} + int mpam_resctrl_cpu_offline(unsigned int cpu) { + struct rdt_domain *d; + struct mpam_resctrl_res *res; + struct mpam_resctrl_dom *dom; + + for_each_supported_resctrl_exports(res) { + d = resctrl_get_domain_from_cpu(cpu, &res->resctrl_res); + + /* cpu with unknown exported component? */ + if (WARN_ON_ONCE(!d)) + continue; + + cpumask_clear_cpu(cpu, &d->cpu_mask); + + if (!cpumask_empty(&d->cpu_mask)) + continue; + + list_del(&d->list); + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + kfree(dom); + } + + mpam_resctrl_clear_default_cpu(cpu); + return 0; }
+ /* Test whether we can export MPAM_CLASS_CACHE:{2,3}? */ static void mpam_resctrl_pick_caches(void) { diff --git a/include/linux/resctrlfs.h b/include/linux/resctrlfs.h index a97cbf310def..684bcdba51de 100644 --- a/include/linux/resctrlfs.h +++ b/include/linux/resctrlfs.h @@ -47,6 +47,7 @@ struct resctrl_resource { bool mon_capable; char *name; struct list_head domains; + u32 dom_num; struct list_head evt_list; unsigned long fflags;
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
So far we have completed the resctrl resource initialization; now we need a way to read resource (cache/memory) monitor data and to apply configurations from resctrl to the MSCs.
In some cases, preparatory operations such as intpartid narrowing must happen before a configuration can be applied; their implementation is left for follow-up patches.
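The narrowing idea can be sketched as follows (a hedged userspace model, not the driver's code: the sizes and the allocate-on-first-use policy are illustrative assumptions): a wide external PARTID space is funnelled through a small table of internal partids before any control is programmed.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes: many request partids, few internal ones */
#define NUM_REQPARTID	64
#define NUM_INTPARTID	8

static int8_t narrow_map[NUM_REQPARTID];	/* reqpartid -> intpartid, -1 = unmapped */
static uint8_t next_intpartid;

static void narrow_init(void)
{
	for (int i = 0; i < NUM_REQPARTID; i++)
		narrow_map[i] = -1;
	next_intpartid = 0;
}

/* Map a requested partid to an internal one, allocating on first use.
 * Returns -1 when either space is exhausted or the reqpartid is bogus. */
static int narrow_partid(unsigned int reqpartid)
{
	if (reqpartid >= NUM_REQPARTID)
		return -1;
	if (narrow_map[reqpartid] < 0) {
		if (next_intpartid >= NUM_INTPARTID)
			return -1;
		narrow_map[reqpartid] = next_intpartid++;
	}
	return narrow_map[reqpartid];
}
```

In the patch this corresponds to the mpam_feat_part_nrw path in mpam_device_config(), where the mapping itself (mpam_device_narrow_map()) is still a stub.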
For monitoring, this adds support for reading the MSMON_MBWU register (only QOS_L3_MBM_LOCAL_EVENT_ID is supported) and the MSMON_CSU register.
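Decoding a raw MSMON_CSU/MSMON_MBWU read boils down to one check, which can be sketched like this (bit positions follow the MPAM spec's NRDY/VALUE layout; treat it as a model rather than the driver's exact code):

```c
#include <assert.h>
#include <stdint.h>

/* MSMON_* layout: bit 31 is NRDY, bits [30:0] carry the count */
#define MON_NRDY	(1u << 31)
#define MON_VALUE	(MON_NRDY - 1)	/* GENMASK(30, 0) */

/* Returns -1 (standing in for -EBUSY) while the monitor is not ready,
 * otherwise the counter value. */
static int64_t decode_mon(uint32_t raw)
{
	if (raw & MON_NRDY)
		return -1;
	return raw & MON_VALUE;
}
```

mpam_device_frob_mon() in the patch does the same: it propagates -EBUSY on MSMON___NRDY, and otherwise accumulates the MSMON___VALUE bits into the sync context.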
Code related to applying configuration is borrowed from http://www.linux-arm.org/git?p=linux-jm.git;a=shortlog;h=refs/heads/mpam/snapshot/jun; the monitoring code is borrowed from Shameer's commit 5cba077c ("arm/mpam: Add MBWU monitor support"), see the link below.
Link: https://github.com/hisilicon/kernel-dev/commit/5cba077c9c75efecff37017019a5d... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam_resource.h | 9 + arch/arm64/kernel/mpam/mpam_device.c | 373 +++++++++++++++++++++++++ arch/arm64/kernel/mpam/mpam_internal.h | 46 +++ 3 files changed, 428 insertions(+)
diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index 57ec024c2c50..10d727512d61 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -75,6 +75,11 @@ #define MBW_MAX_BWA_FRACT(w) GENMASK(w - 1, 0) #define MBW_MAX_SET(v) (MBW_MAX_HARDLIM|((v) << (16 - BWA_WD))) #define MBW_MAX_GET(v) (((v) & MBW_MAX_MASK) >> (16 - BWA_WD)) +#define MBW_MAX_SET_HDL(r) (r | MBW_MAX_HARDLIM) +/* MPAMCFG_MBW_PROP */ +#define MBW_PROP_HARDLIM BIT(31) +#define MBW_PROP_SET_HDL(r) (r | MBW_PROP_HARDLIM) +/* MPAMCFG_MBW_MAX */
#define MSMON_MATCH_PMG BIT(17) #define MSMON_MATCH_PARTID BIT(16) @@ -90,6 +95,10 @@ * Set MPAMCFG_PART_SEL internal bit */ #define PART_SEL_SET_INTERNAL(r) (r | BIT(16)) + +/* MPAM_ESR */ +#define MPAMF_ESR_ERRCODE_MASK ((BIT(4) - 1) << 24) + /* * Size of the memory mapped registers: 4K of feature page then 2 x 4K * bitmap registers diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 285f69244da3..6162b919d1f9 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -742,6 +742,7 @@ static void mpam_reset_device_config(struct mpam_component *comp, dev->mbw_pbm_bits); if (mpam_has_feature(mpam_feat_mbw_max, dev->features)) { mbw_max = MBW_MAX_SET(MBW_MAX_BWA_FRACT(dev->bwa_wd)); + mbw_max = MBW_MAX_SET_HDL(mbw_max); mpam_write_reg(dev, MPAMCFG_MBW_MAX, mbw_max); } if (mpam_has_feature(mpam_feat_mbw_min, dev->features)) { @@ -955,3 +956,375 @@ u16 mpam_sysprops_num_pmg(void) /* At least one pmg for system width */ return mpam_sysprops.max_pmg + 1; } + +static u32 mpam_device_read_csu_mon(struct mpam_device *dev, + struct sync_args *args) +{ + u16 mon; + u32 clt, flt, cur_clt, cur_flt; + + mon = args->mon; + + mpam_write_reg(dev, MSMON_CFG_MON_SEL, mon); + wmb(); /* subsequent writes must be applied to this mon */ + + /* + * We don't bother with capture as we don't expose a way of measuring + * multiple partid:pmg with a single capture. + */ + clt = MSMON_CFG_CTL_MATCH_PARTID | MSMON_CFG_CSU_TYPE; + if (args->match_pmg) + clt |= MSMON_CFG_CTL_MATCH_PMG; + flt = args->partid | + (args->pmg << MSMON_CFG_CSU_FLT_PMG_SHIFT); + + /* + * We read the existing configuration to avoid re-writing the same + * values. 
+ */ + cur_flt = mpam_read_reg(dev, MSMON_CFG_CSU_FLT); + cur_clt = mpam_read_reg(dev, MSMON_CFG_CSU_CTL); + + if (cur_flt != flt || cur_clt != (clt | MSMON_CFG_CTL_EN)) { + mpam_write_reg(dev, MSMON_CFG_CSU_FLT, flt); + + /* + * Write the ctl with the enable bit cleared, reset the + * counter, then enable counter. + */ + mpam_write_reg(dev, MSMON_CFG_CSU_CTL, clt); + wmb(); + + mpam_write_reg(dev, MSMON_CSU, 0); + wmb(); + + clt |= MSMON_CFG_CTL_EN; + mpam_write_reg(dev, MSMON_CFG_CSU_CTL, clt); + wmb(); + } + + return mpam_read_reg(dev, MSMON_CSU); +} + +static u32 mpam_device_read_mbwu_mon(struct mpam_device *dev, + struct sync_args *args) +{ + u16 mon; + u32 clt, flt, cur_clt, cur_flt; + + mon = args->mon; + + mpam_write_reg(dev, MSMON_CFG_MON_SEL, mon); + wmb(); /* subsequent writes must be applied to this mon */ + + /* + * We don't bother with capture as we don't expose a way of measuring + * multiple partid:pmg with a single capture. + */ + clt = MSMON_CFG_CTL_MATCH_PARTID | MSMON_CFG_MBWU_TYPE; + if (args->match_pmg) + clt |= MSMON_CFG_CTL_MATCH_PMG; + flt = args->partid | + (args->pmg << MSMON_CFG_MBWU_FLT_PMG_SHIFT); + + /* + * We read the existing configuration to avoid re-writing the same + * values. + */ + cur_flt = mpam_read_reg(dev, MSMON_CFG_MBWU_FLT); + cur_clt = mpam_read_reg(dev, MSMON_CFG_MBWU_CTL); + + if (cur_flt != flt || cur_clt != (clt | MSMON_CFG_CTL_EN)) { + mpam_write_reg(dev, MSMON_CFG_MBWU_FLT, flt); + + /* + * Write the ctl with the enable bit cleared, reset the + * counter, then enable counter. 
+ */ + mpam_write_reg(dev, MSMON_CFG_MBWU_CTL, clt); + wmb(); + + mpam_write_reg(dev, MSMON_MBWU, 0); + wmb(); + + clt |= MSMON_CFG_CTL_EN; + mpam_write_reg(dev, MSMON_CFG_MBWU_CTL, clt); + wmb(); + } + + return mpam_read_reg(dev, MSMON_MBWU); +} + +static int mpam_device_frob_mon(struct mpam_device *dev, + struct mpam_device_sync *ctx) +{ + struct sync_args *args = ctx->args; + u32 val; + + lockdep_assert_held(&dev->lock); + + if (mpam_broken) + return -EIO; + + if (!args) + return -EINVAL; + + if (args->eventid == QOS_L3_OCCUP_EVENT_ID && + mpam_has_feature(mpam_feat_msmon_csu, dev->features)) + val = mpam_device_read_csu_mon(dev, args); + else if (args->eventid == QOS_L3_MBM_LOCAL_EVENT_ID && + mpam_has_feature(mpam_feat_msmon_mbwu, dev->features)) + val = mpam_device_read_mbwu_mon(dev, args); + else + return -EOPNOTSUPP; + + if (val & MSMON___NRDY) + return -EBUSY; + + val = val & MSMON___VALUE; + atomic64_add(val, &ctx->mon_value); + return 0; +} + +static int mpam_device_narrow_map(struct mpam_device *dev, u32 partid, + u32 intpartid) +{ + return 0; +} + +static int mpam_device_config(struct mpam_device *dev, u32 partid, + struct mpam_config *cfg) +{ + int ret; + u16 cmax = GENMASK(dev->cmax_wd, 0); + u32 pri_val = 0; + u16 intpri, dspri, max_intpri, max_dspri; + u32 mbw_pbm, mbw_max; + + lockdep_assert_held(&dev->lock); + + if (!mpam_has_part_sel(dev->features)) + return -EINVAL; + + /* + * intpartid should be narrowed the first time, + * upstream(resctrl) keep this order + */ + if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) { + if (cfg && mpam_has_feature(mpam_feat_part_nrw, cfg->valid)) { + ret = mpam_device_narrow_map(dev, partid, + cfg->intpartid); + if (ret) + goto out; + partid = PART_SEL_SET_INTERNAL(cfg->intpartid); + } else { + partid = PART_SEL_SET_INTERNAL(cfg->intpartid); + } + } + + mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); + wmb(); /* subsequent writes must be applied to our new partid */ + + if 
(mpam_has_feature(mpam_feat_ccap_part, dev->features)) + mpam_write_reg(dev, MPAMCFG_CMAX, cmax); + + if (mpam_has_feature(mpam_feat_cpor_part, dev->features)) { + if (cfg && mpam_has_feature(mpam_feat_cpor_part, cfg->valid)) { + /* + * cpor_part being valid implies the bitmap fits in a + * single write. + */ + mpam_write_reg(dev, MPAMCFG_CPBM, cfg->cpbm); + } + } + + if (mpam_has_feature(mpam_feat_mbw_part, dev->features)) { + mbw_pbm = cfg->mbw_pbm; + if (cfg && mpam_has_feature(mpam_feat_mbw_part, cfg->valid)) { + if (!mpam_has_feature(mpam_feat_part_hdl, cfg->valid) || + (mpam_has_feature(mpam_feat_part_hdl, cfg->valid) && cfg->hdl)) + mbw_pbm = MBW_PROP_SET_HDL(cfg->mbw_pbm); + mpam_write_reg(dev, MPAMCFG_MBW_PBM, mbw_pbm); + } + } + + if (mpam_has_feature(mpam_feat_mbw_max, dev->features)) { + if (cfg && mpam_has_feature(mpam_feat_mbw_max, cfg->valid)) { + mbw_max = MBW_MAX_SET(cfg->mbw_max); + if (!mpam_has_feature(mpam_feat_part_hdl, cfg->valid) || + (mpam_has_feature(mpam_feat_part_hdl, cfg->valid) && cfg->hdl)) + mbw_max = MBW_MAX_SET_HDL(mbw_max); + mpam_write_reg(dev, MPAMCFG_MBW_MAX, mbw_max); + } + } + + if (mpam_has_feature(mpam_feat_intpri_part, dev->features) || + mpam_has_feature(mpam_feat_dspri_part, dev->features)) { + if (mpam_has_feature(mpam_feat_intpri_part, cfg->valid) && + mpam_has_feature(mpam_feat_intpri_part, dev->features)) { + max_intpri = GENMASK(dev->intpri_wd - 1, 0); + /* + * Each priority portion only occupys a bit, not only that + * we leave lowest priority, which may be not suitable when + * owning large dspri_wd or intpri_wd. + * dspri and intpri are from same input, so if one + * exceeds it's max width, set it to max priority. + */ + intpri = (cfg->intpri > max_intpri) ? 
max_intpri : cfg->intpri; + if (!mpam_has_feature(mpam_feat_intpri_part_0_low, + dev->features)) + intpri = GENMASK(dev->intpri_wd - 1, 0) & ~intpri; + pri_val |= intpri; + } + if (mpam_has_feature(mpam_feat_dspri_part, cfg->valid) && + mpam_has_feature(mpam_feat_dspri_part, dev->features)) { + max_dspri = GENMASK(dev->dspri_wd - 1, 0); + dspri = (cfg->dspri > max_dspri) ? max_dspri : cfg->dspri; + if (!mpam_has_feature(mpam_feat_dspri_part_0_low, + dev->features)) + dspri = GENMASK(dev->dspri_wd - 1, 0) & ~dspri; + pri_val |= (dspri << MPAMCFG_PRI_DSPRI_SHIFT); + } + + mpam_write_reg(dev, MPAMCFG_PRI, pri_val); + } + + /* + * complete the configuration before the cpu can + * use this partid + */ + mb(); + +out: + return ret; +} + +static void mpam_component_device_sync(void *__ctx) +{ + int err = 0; + u32 partid; + unsigned long flags; + struct mpam_device *dev; + struct mpam_device_sync *ctx = (struct mpam_device_sync *)__ctx; + struct mpam_component *comp = ctx->comp; + struct sync_args *args = ctx->args; + + list_for_each_entry(dev, &comp->devices, comp_list) { + if (cpumask_intersects(&dev->online_affinity, + &ctx->updated_on)) + continue; + + /* This device needs updating, can I reach it? */ + if (!cpumask_test_cpu(smp_processor_id(), + &dev->online_affinity)) + continue; + + /* Apply new configuration to this device */ + err = 0; + spin_lock_irqsave(&dev->lock, flags); + if (args) { + partid = args->partid; + if (ctx->config_mon) + err = mpam_device_frob_mon(dev, ctx); + else + err = mpam_device_config(dev, partid, + &comp->cfg[partid]); + } else { + mpam_reset_device(comp, dev); + } + spin_unlock_irqrestore(&dev->lock, flags); + if (err) + cmpxchg(&ctx->error, 0, err); + } + + cpumask_set_cpu(smp_processor_id(), &ctx->updated_on); +} + +/** + * in some cases/platforms the MSC register access is only possible with + * the associated CPUs. And need to check if those CPUS are online before + * accessing it. 
So we use those CPUs dev->online_affinity to apply config. + */ +static int do_device_sync(struct mpam_component *comp, + struct mpam_device_sync *sync_ctx) +{ + int cpu; + struct mpam_device *dev; + + lockdep_assert_cpus_held(); + + cpu = get_cpu(); + if (cpumask_test_cpu(cpu, &comp->fw_affinity)) + mpam_component_device_sync(sync_ctx); + put_cpu(); + + /* + * Find the set of other CPUs we need to run on to update + * this component + */ + list_for_each_entry(dev, &comp->devices, comp_list) { + if (sync_ctx->error) + break; + + if (cpumask_intersects(&dev->online_affinity, + &sync_ctx->updated_on)) + continue; + + /* + * This device needs the config applying, and hasn't been + * reachable by any cpu so far. + */ + cpu = cpumask_any(&dev->online_affinity); + smp_call_function_single(cpu, mpam_component_device_sync, + sync_ctx, 1); + } + + return sync_ctx->error; +} + +static inline void +mpam_device_sync_config_prepare(struct mpam_component *comp, + struct mpam_device_sync *sync_ctx, struct sync_args *args) +{ + sync_ctx->comp = comp; + sync_ctx->args = args; + sync_ctx->config_mon = false; + sync_ctx->error = 0; + cpumask_clear(&sync_ctx->updated_on); +} + +int mpam_component_config(struct mpam_component *comp, struct sync_args *args) +{ + struct mpam_device_sync sync_ctx; + + mpam_device_sync_config_prepare(comp, &sync_ctx, args); + + return do_device_sync(comp, &sync_ctx); +} + +static inline void +mpam_device_sync_mon_prepare(struct mpam_component *comp, + struct mpam_device_sync *sync_ctx, struct sync_args *args) +{ + sync_ctx->comp = comp; + sync_ctx->args = args; + sync_ctx->error = 0; + sync_ctx->config_mon = true; + cpumask_clear(&sync_ctx->updated_on); + atomic64_set(&sync_ctx->mon_value, 0); +} + +int mpam_component_mon(struct mpam_component *comp, + struct sync_args *args, u64 *result) +{ + int ret; + struct mpam_device_sync sync_ctx; + + mpam_device_sync_mon_prepare(comp, &sync_ctx, args); + + ret = do_device_sync(comp, &sync_ctx); + if (!ret && 
result) + *result = atomic64_read(&sync_ctx.mon_value); + + return ret; +} diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index ed411d7b0031..9f6af1e11777 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -3,6 +3,7 @@ #define _ASM_ARM64_MPAM_INTERNAL_H
#include <linux/resctrlfs.h> +#include <asm/resctrl.h>
typedef u32 mpam_features_t;
@@ -32,6 +33,31 @@ struct mpam_resctrl_res { struct resctrl_resource resctrl_res; };
+struct sync_args { + u8 domid; + u8 pmg; + u32 partid; + u32 mon; + bool match_pmg; + enum rdt_event_id eventid; + /*for reading msr*/ + u16 reg; +}; + +struct mpam_device_sync { + struct mpam_component *comp; + + struct sync_args *args; + + bool config_mon; + atomic64_t mon_value; + + struct cpumask updated_on; + + atomic64_t cfg_value; + int error; +}; + #define for_each_resctrl_exports(r) \ for (r = &mpam_resctrl_exports[0]; \ r < &mpam_resctrl_exports[0] + \ @@ -116,6 +142,26 @@ static inline void mpam_clear_feature(enum mpam_device_features feat,
#define MPAM_ARCHITECTURE_V1 0x10
+static inline bool mpam_has_part_sel(mpam_features_t supported) +{ + mpam_features_t mask = (1<<mpam_feat_ccap_part) | + (1<<mpam_feat_cpor_part) | (1<<mpam_feat_mbw_part) | + (1<<mpam_feat_mbw_max) | (1<<mpam_feat_intpri_part) | + (1<<mpam_feat_dspri_part); + /* or HAS_PARTID_NRW or HAS_IMPL_IDR */ + + return supported & mask; +} + +/** + * Reset component devices if args is NULL + */ +int mpam_component_config(struct mpam_component *comp, + struct sync_args *args); + +int mpam_component_mon(struct mpam_component *comp, + struct sync_args *args, u64 *result); + u16 mpam_sysprops_num_partid(void); u16 mpam_sysprops_num_pmg(void);
From: James Morse james.morse@arm.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
The MPAM ACPI table holds a PPTT-offset that describes a cache. Add a helper cacheinfo_shared_cpu_map_search() to search the cacheinfo structures for a cache that represents this firmware description.
The cacheinfo structures are freed and allocated over CPU online/offline; the caller of this helper must hold the cpu-hotplug read lock while the helper runs, and for as long as it holds the return value.
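The helper's walk is simple enough to model in userspace (simplified stand-in structures, not the kernel's cacheinfo types): scan every CPU's leaf array, skipping CPUs whose list was never allocated, and return the first leaf whose firmware token matches.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for struct cacheinfo / struct cpu_cacheinfo */
struct model_leaf {
	void *fw_token;
	int level;
};

struct model_cpu_ci {
	struct model_leaf *leaves;	/* NULL if info_list never allocated */
	unsigned int num_leaves;
};

static struct model_leaf *
leaf_search(struct model_cpu_ci *cpus, unsigned int nr_cpus, void *fw_token)
{
	for (unsigned int cpu = 0; cpu < nr_cpus; cpu++) {
		/* mirrors the !cpu_ci->info_list check in the patch */
		if (!cpus[cpu].leaves)
			continue;
		for (unsigned int i = 0; i < cpus[cpu].num_leaves; i++)
			if (cpus[cpu].leaves[i].fw_token == fw_token)
				return &cpus[cpu].leaves[i];
	}
	return NULL;
}
```

The real helper additionally depends on the hotplug read lock because, unlike this model, the per-CPU leaf arrays can disappear under it.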
Signed-off-by: James Morse james.morse@arm.com Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=9e5b7ec7c145019f7160c5... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- drivers/base/cacheinfo.c | 38 ++++++++++++++++++++++++++++++++++++++ include/linux/cacheinfo.h | 1 + 2 files changed, 39 insertions(+)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c index 7ada06f5d961..aee69d78c2e7 100644 --- a/drivers/base/cacheinfo.c +++ b/drivers/base/cacheinfo.c @@ -213,6 +213,44 @@ int __weak cache_setup_acpi(unsigned int cpu) return -ENOTSUPP; }
+/** + * cacheinfo_shared_cpu_map_search() - find an instance of struct cacheinfo + * from the provided firmware description. + * Caller must hold cpus_read_lock() until it's finished with the cacheinfo. + * + * Return the CPU cache leaf described by @fw_token, or NULL. + */ +struct cacheinfo *cacheinfo_shared_cpu_map_search(void *fw_token) +{ + struct cacheinfo *iter; + unsigned int cpu, index; + struct cpu_cacheinfo *cpu_ci; + + for_each_online_cpu(cpu) { + cpu_ci = get_cpu_cacheinfo(cpu); + + /* + * info_list of this cacheinfo instance + * may not be initialized because sometimes + * free_cache_attributes() may free this + * info_list but not set num_leaves to zero, + * for example when PPTT is not supported. + */ + if (!cpu_ci->info_list) + continue; + + for (index = 0; index < cache_leaves(cpu); index++) { + iter = cpu_ci->info_list + index; + + if (iter->fw_token == fw_token) { + return iter; + } + } + } + + return NULL; +} + unsigned int coherency_max_size;
static int cache_shared_cpu_map_setup(unsigned int cpu) diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h index 6db3f7f8a7d6..2e5ab4cde3fe 100644 --- a/include/linux/cacheinfo.h +++ b/include/linux/cacheinfo.h @@ -97,6 +97,7 @@ int func(unsigned int cpu) \ }
struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu); +struct cacheinfo *cacheinfo_shared_cpu_map_search(void *fw_desc); int init_cache_level(unsigned int cpu); int populate_cache_leaves(unsigned int cpu); int cache_setup_acpi(unsigned int cpu);
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Previously we used the mpam_node structure to initialize MSCs and stored the MSCs' probing information directly in the resctrl_resource structure. That was good enough until we needed to support multiple MSC nodes per domain. Now that the new mpam_device->mpam_component->mpam_class framework has been constructed, the first step is to make the MPAM setup process compatible with it.
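The hierarchy the series moves to can be summarised with a toy model (field names here are illustrative, not the driver's): a class (e.g. "L3") groups components, one per cache or memory-controller instance, and each component groups the MSC devices that control it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of mpam_device -> mpam_component -> mpam_class */
struct model_device {
	struct model_device *next_in_comp;
};

struct model_component {
	struct model_device *devices;
	struct model_component *next_in_class;
};

struct model_class {
	struct model_component *components;
};

/* Count every MSC reachable from a class */
static unsigned int class_num_devices(const struct model_class *c)
{
	unsigned int n = 0;

	for (const struct model_component *comp = c->components;
	     comp; comp = comp->next_in_class)
		for (const struct model_device *d = comp->devices;
		     d; d = d->next_in_comp)
			n++;
	return n;
}
```

This mirrors the comp_list/class_list walks that mpam_component_device_sync() and the probe code perform over the real structures.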
At present we only parse the base address to create the mpam devices; interrupt registration is not handled yet and will be dealt with later.
We will continue to update the discovery process from the MPAM ACPI table according to the latest MPAM ACPI spec.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam_resource.h | 33 --- arch/arm64/include/asm/mpam_sched.h | 8 - arch/arm64/kernel/mpam/mpam_device.c | 2 +- arch/arm64/kernel/mpam/mpam_device.h | 8 +- arch/arm64/kernel/mpam/mpam_resctrl.c | 360 +------------------------ drivers/acpi/arm64/mpam.c | 48 +++- include/linux/arm_mpam.h | 67 +++++ 7 files changed, 106 insertions(+), 420 deletions(-) create mode 100644 include/linux/arm_mpam.h
diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index 10d727512d61..aa5bbe390c19 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -207,37 +207,4 @@ /* hard code for mbw_max max-percentage's cresponding masks */ #define MBA_MAX_WD 63u
-/* - * emulate the mpam nodes - * These should be reported by ACPI MPAM Table. - */ - -struct mpam_node { - /* for label mpam_node instance*/ - u32 component_id; - /* MPAM node header */ - u8 type; /* MPAM_SMMU, MPAM_CACHE, MPAM_MC */ - u64 addr; - void __iomem *base; - struct cpumask cpu_mask; - u64 default_ctrl; - - /* for debug */ - char *cpus_list; - char *name; - struct list_head list; -}; - -int __init mpam_force_init(void); - -int __init mpam_nodes_discovery_start(void); - -void __init mpam_nodes_discovery_failed(void); - -int __init mpam_nodes_discovery_complete(void); - -int mpam_create_cache_node(u32 component_id, phys_addr_t hwpage_address); - -int mpam_create_memory_node(u32 component_id, phys_addr_t hwpage_address); - #endif /* _ASM_ARM64_MPAM_RESOURCE_H */ diff --git a/arch/arm64/include/asm/mpam_sched.h b/arch/arm64/include/asm/mpam_sched.h index 350296157087..08ed349b6efa 100644 --- a/arch/arm64/include/asm/mpam_sched.h +++ b/arch/arm64/include/asm/mpam_sched.h @@ -40,14 +40,6 @@ static inline void mpam_sched_in(void) __mpam_sched_in(); }
-enum mpam_enable_type { - enable_denied = 0, - enable_default, - enable_acpi, -}; - -extern enum mpam_enable_type __read_mostly mpam_enabled; - #else
static inline void mpam_sched_in(void) {} diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 6162b919d1f9..cb29d28d42be 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -31,7 +31,7 @@ #include <linux/types.h> #include <linux/cpu.h> #include <linux/cacheinfo.h> -#include <asm/mpam.h> +#include <linux/arm_mpam.h> #include <asm/mpam_resource.h> #include <asm/mpam.h>
diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h index a98c34742374..3165d6b1a270 100644 --- a/arch/arm64/kernel/mpam/mpam_device.h +++ b/arch/arm64/kernel/mpam/mpam_device.h @@ -5,6 +5,7 @@ #include <linux/err.h> #include <linux/cpumask.h> #include <linux/types.h> +#include <linux/arm_mpam.h> #include "mpam_internal.h"
struct mpam_config; @@ -15,13 +16,6 @@ struct mpam_config; */ #define SZ_MPAM_DEVICE (3 * SZ_4K)
-enum mpam_class_types { - MPAM_CLASS_SMMU, - MPAM_CLASS_CACHE, /* Well known caches, e.g. L2 */ - MPAM_CLASS_MEMORY, /* Main memory */ - MPAM_CLASS_UNKNOWN, /* Everything else, e.g. TLBs etc */ -}; - /* * An mpam_device corresponds to an MSC, an interface to a component's cache * or bandwidth controls. It is associated with a set of CPUs, and a component. diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 8fce1b7a32db..58c7582b2eef 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -38,6 +38,7 @@ #include <linux/sched/signal.h> #include <linux/sched/task.h> #include <linux/resctrlfs.h> +#include <linux/arm_mpam.h>
#include <asm/mpam_sched.h> #include <asm/mpam_resource.h> @@ -95,208 +96,6 @@ void mpam_resctrl_clear_default_cpu(unsigned int cpu) cpumask_clear_cpu(cpu, &resctrl_group_default.cpu_mask); }
-static inline void mpam_node_assign_val(struct mpam_node *n, - char *name, - u8 type, - phys_addr_t hwpage_address, - u32 component_id) -{ - n->name = name; - n->type = type; - n->addr = hwpage_address; - n->component_id = component_id; - n->cpus_list = "0"; -} - -#define MPAM_NODE_NAME_SIZE (10) - -struct mpam_node *mpam_nodes_ptr; - -static int __init mpam_init(void); - -static void mpam_nodes_unmap(void) -{ - struct mpam_node *n; - - list_for_each_entry(n, &mpam_nodes_ptr->list, list) { - if (n->base) { - iounmap(n->base); - n->base = NULL; - } - } -} - -static int mpam_nodes_init(void) -{ - int ret = 0; - struct mpam_node *n; - - list_for_each_entry(n, &mpam_nodes_ptr->list, list) { - ret |= cpulist_parse(n->cpus_list, &n->cpu_mask); - n->base = ioremap(n->addr, 0x10000); - if (!n->base) { - mpam_nodes_unmap(); - return -ENOMEM; - } - } - - return ret; -} - -static void mpam_nodes_destroy(void) -{ - struct mpam_node *n, *tmp; - - if (!mpam_nodes_ptr) - return; - - list_for_each_entry_safe(n, tmp, &mpam_nodes_ptr->list, list) { - kfree(n->name); - list_del(&n->list); - kfree(n); - } - - list_del(&mpam_nodes_ptr->list); - kfree(mpam_nodes_ptr); - mpam_nodes_ptr = NULL; -} - -int __init mpam_nodes_discovery_start(void) -{ - if (!mpam_enabled) - return -EINVAL; - - mpam_nodes_ptr = kzalloc(sizeof(struct mpam_node), GFP_KERNEL); - if (!mpam_nodes_ptr) - return -ENOMEM; - - INIT_LIST_HEAD(&mpam_nodes_ptr->list); - - return 0; -} - -void __init mpam_nodes_discovery_failed(void) -{ - mpam_nodes_destroy(); -} - -int __init mpam_nodes_discovery_complete(void) -{ - return mpam_init(); -} - -static inline int validate_mpam_node(int type, - int component_id) -{ - int ret = 0; - struct mpam_node *n; - - list_for_each_entry(n, &mpam_nodes_ptr->list, list) { - if (n->component_id == component_id && - n->type == type) { - ret = -EINVAL; - break; - } - } - - return ret; -} - -int mpam_create_cache_node(u32 component_id, - phys_addr_t hwpage_address) -{ - struct mpam_node *new; 
- char *name; - - if (validate_mpam_node(RDT_RESOURCE_L3, component_id)) - goto skip; - - new = kzalloc(sizeof(struct mpam_node), GFP_KERNEL); - if (!new) - return -ENOMEM; - - name = kzalloc(MPAM_NODE_NAME_SIZE, GFP_KERNEL); - if (!name) { - kfree(new); - return -ENOMEM; - } - snprintf(name, MPAM_NODE_NAME_SIZE, "%s%d", "L3TALL", component_id); - - mpam_node_assign_val(new, - name, - RDT_RESOURCE_L3, - hwpage_address, - component_id); - list_add_tail(&new->list, &mpam_nodes_ptr->list); - -skip: - return 0; -} - -int mpam_create_memory_node(u32 component_id, - phys_addr_t hwpage_address) -{ - struct mpam_node *new; - char *name; - - if (validate_mpam_node(RDT_RESOURCE_MC, component_id)) - goto skip; - - new = kzalloc(sizeof(struct mpam_node), GFP_KERNEL); - if (!new) - return -ENOMEM; - - name = kzalloc(MPAM_NODE_NAME_SIZE, GFP_KERNEL); - if (!name) { - kfree(new); - return -ENOMEM; - } - snprintf(name, MPAM_NODE_NAME_SIZE, "%s%d", "HHAALL", component_id); - - mpam_node_assign_val(new, - name, - RDT_RESOURCE_MC, - hwpage_address, - component_id); - list_add_tail(&new->list, &mpam_nodes_ptr->list); - -skip: - return 0; - -} - -int __init mpam_force_init(void) -{ - int ret; - - if (mpam_enabled != enable_default) - return 0; - - ret = mpam_nodes_discovery_start(); - if (ret) - return ret; - - ret |= mpam_create_cache_node(0, 0x000098b90000ULL); - ret |= mpam_create_cache_node(1, 0x000090b90000ULL); - ret |= mpam_create_cache_node(2, 0x200098b90000ULL); - ret |= mpam_create_cache_node(3, 0x200090b90000ULL); - ret |= mpam_create_memory_node(0, 0x000098c10000ULL); - ret |= mpam_create_memory_node(1, 0x000090c10000ULL); - ret |= mpam_create_memory_node(2, 0x200098c10000ULL); - ret |= mpam_create_memory_node(3, 0x200090c10000ULL); - if (ret) { - mpam_nodes_discovery_failed(); - pr_err("Failed to force create mpam node\n"); - return -EINVAL; - } - - ret = mpam_nodes_discovery_complete(); - if (!ret) - pr_info("Successfully init mpam by hardcode.\n"); - - return 1; -} - 
static void cat_wrmsr(struct rdt_domain *d, int partid); static void @@ -540,19 +339,6 @@ void closid_free(int closid) closid_free_map |= 1 << closid; }
-static int mpam_online_cpu(unsigned int cpu) -{ - return mpam_resctrl_set_default_cpu(cpu); -} - -/* remove related resource when cpu offline */ -static int mpam_offline_cpu(unsigned int cpu) -{ - mpam_resctrl_clear_default_cpu(cpu); - - return 0; -} - /* * Choose a width for the resource name and resource data based on the * resource that has widest name and cbm. @@ -1306,156 +1092,16 @@ struct rdt_domain *mpam_find_domain(struct resctrl_resource *r, int id, return NULL; }
-static void mpam_domains_destroy(struct resctrl_resource *r) -{ - struct list_head *pos, *q; - struct rdt_domain *d; - - list_for_each_safe(pos, q, &r->domains) { - d = list_entry(pos, struct rdt_domain, list); - list_del(pos); - if (d) { - kfree(d->ctrl_val); - kfree(d); - } - } -} - -static void mpam_domains_init(struct resctrl_resource *r) -{ - int id = 0; - struct mpam_node *n; - struct list_head *add_pos = NULL; - struct rdt_domain *d; - struct raw_resctrl_resource *rr = (struct raw_resctrl_resource *)r->res; - u32 val; - - list_for_each_entry(n, &mpam_nodes_ptr->list, list) { - if (r->rid != n->type) - continue; - - d = mpam_find_domain(r, id, &add_pos); - if (IS_ERR(d)) { - mpam_domains_destroy(r); - pr_warn("Could't find cache id %d\n", id); - return; - } - - if (!d) - d = kzalloc(sizeof(*d), GFP_KERNEL); - else - continue; - - if (!d) { - mpam_domains_destroy(r); - return; - } - - d->id = id; - d->base = n->base; - cpumask_copy(&d->cpu_mask, &n->cpu_mask); - rr->default_ctrl = n->default_ctrl; - - val = mpam_readl(d->base + MPAMF_IDR); - rr->num_partid = MPAMF_IDR_PARTID_MAX_GET(val) + 1; - rr->num_pmg = MPAMF_IDR_PMG_MAX_GET(val) + 1; - - r->mon_capable = MPAMF_IDR_HAS_MSMON(val); - r->mon_enabled = MPAMF_IDR_HAS_MSMON(val); - - if (r->rid == RDT_RESOURCE_L3) { - r->alloc_capable = MPAMF_IDR_HAS_CPOR_PART(val); - r->alloc_enabled = MPAMF_IDR_HAS_CPOR_PART(val); - - val = mpam_readl(d->base + MPAMF_CSUMON_IDR); - rr->num_mon = MPAMF_IDR_NUM_MON(val); - } else if (r->rid == RDT_RESOURCE_MC) { - r->alloc_capable = MPAMF_IDR_HAS_MBW_PART(val); - r->alloc_enabled = MPAMF_IDR_HAS_MBW_PART(val); - - val = mpam_readl(d->base + MPAMF_MBWUMON_IDR); - rr->num_mon = MPAMF_IDR_NUM_MON(val); - } - - r->alloc_capable = 1; - r->alloc_enabled = 1; - r->mon_capable = 1; - r->mon_enabled = 1; - - d->cpus_list = n->cpus_list; - - d->ctrl_val = kmalloc_array(rr->num_partid, sizeof(*d->ctrl_val), GFP_KERNEL); - if (!d->ctrl_val) { - kfree(d); - mpam_domains_destroy(r); - - 
return; - } - - if (add_pos) - list_add_tail(&d->list, add_pos); - - id++; - } -} - enum mpam_enable_type __read_mostly mpam_enabled; static int __init mpam_setup(char *str) { if (!strcmp(str, "=acpi")) - mpam_enabled = enable_acpi; - else - mpam_enabled = enable_default; + mpam_enabled = MPAM_ENABLE_ACPI; + return 1; } __setup("mpam", mpam_setup);
-static int __init mpam_init(void) -{ - struct resctrl_resource *r; - int state, ret; - - rdt_alloc_capable = 1; - rdt_mon_capable = 1; - - ret = mpam_nodes_init(); - if (ret) { - pr_err("internal error: bad cpu list\n"); - goto out; - } - - mpam_domains_init(&resctrl_resources_all[RDT_RESOURCE_L3]); - mpam_domains_init(&resctrl_resources_all[RDT_RESOURCE_MC]); - - state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, - "arm64/mpam:online:", - mpam_online_cpu, mpam_offline_cpu); - if (state < 0) { - ret = state; - goto out; - } - - ret = mpam_resctrl_init(); - if (ret) { - cpuhp_remove_state(state); - goto out; - } - - for_each_resctrl_resource(r) { - if (r->alloc_capable) - pr_info("MPAM %s allocation detected\n", r->name); - } - - for_each_resctrl_resource(r) { - if (r->mon_capable) - pr_info("MPAM %s monitoring detected\n", r->name); - } - -out: - mpam_nodes_destroy(); - return ret; -} - int __init mpam_resctrl_init(void) { mpam_init_padding(); diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c index 1f82dce33e07..10e4769d5227 100644 --- a/drivers/acpi/arm64/mpam.c +++ b/drivers/acpi/arm64/mpam.c @@ -30,7 +30,7 @@ #include <linux/cacheinfo.h> #include <linux/string.h> #include <linux/nodemask.h> -#include <asm/mpam_resource.h> +#include <linux/arm_mpam.h>
/** * acpi_mpam_label_cache_component_id() - Recursivly find @min_physid @@ -95,6 +95,7 @@ static int __init acpi_mpam_parse_memory(struct acpi_mpam_header *h) { int ret = 0; u32 component_id; + struct mpam_device *dev; struct acpi_mpam_node_memory *node = (struct acpi_mpam_node_memory *)h;
ret = acpi_mpam_label_memory_component_id(node->proximity_domain, @@ -104,9 +105,9 @@ static int __init acpi_mpam_parse_memory(struct acpi_mpam_header *h) return -EINVAL; }
- ret = mpam_create_memory_node(component_id, + dev = mpam_device_create_memory(component_id, node->header.base_address); - if (ret) { + if (IS_ERR(dev)) { pr_err("Failed to create memory node\n"); return -EINVAL; } @@ -118,7 +119,10 @@ static int __init acpi_mpam_parse_cache(struct acpi_mpam_header *h, struct acpi_table_header *pptt) { int ret = 0; + int level; u32 component_id; + struct mpam_device *dev; + struct cacheinfo *ci; struct acpi_pptt_cache *pptt_cache; struct acpi_pptt_processor *pptt_cpu_node; struct acpi_mpam_node_cache *node = (struct acpi_mpam_node_cache *)h; @@ -148,9 +152,28 @@ static int __init acpi_mpam_parse_cache(struct acpi_mpam_header *h, return -EINVAL; }
- ret = mpam_create_cache_node(component_id, - node->header.base_address); - if (ret) { + cpus_read_lock(); + ci = cacheinfo_shared_cpu_map_search(pptt_cpu_node); + if (!ci) { + pr_err_once("No CPU has cache with PPTT reference 0x%x", + node->PPTT_ref); + pr_err_once("All CPUs must be online to probe mpam.\n"); + cpus_read_unlock(); + return -ENODEV; + } + + level = ci->level; + ci = NULL; + cpus_read_unlock(); + + /* + * Possible we can get cpu-affinity in next MPAM ACPI version, + * now we have to set it to NULL and use default possible_aff- + * inity. + */ + dev = mpam_device_create_cache(level, component_id, NULL, + node->header.base_address); + if (IS_ERR(dev)) { pr_err("Failed to create cache node\n"); return -EINVAL; } @@ -166,7 +189,8 @@ static int __init acpi_mpam_parse_table(struct acpi_table_header *table, struct acpi_mpam_header *node_hdr; int ret = 0;
- ret = mpam_nodes_discovery_start(); + ret = mpam_discovery_start(); + if (ret) return ret;
@@ -200,9 +224,9 @@ static int __init acpi_mpam_parse_table(struct acpi_table_header *table,
if (ret) { pr_err("discovery failed: %d\n", ret); - mpam_nodes_discovery_failed(); + mpam_discovery_failed(); } else { - ret = mpam_nodes_discovery_complete(); + ret = mpam_discovery_complete(); if (!ret) pr_info("Successfully init mpam by ACPI.\n"); } @@ -219,11 +243,7 @@ int __init acpi_mpam_parse(void) if (!cpus_have_const_cap(ARM64_HAS_MPAM)) return 0;
- ret = mpam_force_init(); - if (ret) - return 0; - - if (acpi_disabled) + if (acpi_disabled || mpam_enabled != MPAM_ENABLE_ACPI) return 0;
status = acpi_get_table(ACPI_SIG_MPAM, 0, &mpam); diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h new file mode 100644 index 000000000000..9a00c7984d91 --- /dev/null +++ b/include/linux/arm_mpam.h @@ -0,0 +1,67 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __LINUX_ARM_MPAM_H +#define __LINUX_ARM_MPAM_H + +#include <linux/err.h> +#include <linux/cpumask.h> +#include <linux/types.h> + +struct mpam_device; + +enum mpam_class_types { + MPAM_CLASS_SMMU, + MPAM_CLASS_CACHE, /* Well known caches, e.g. L2 */ + MPAM_CLASS_MEMORY, /* Main memory */ + MPAM_CLASS_UNKNOWN, /* Everything else, e.g. TLBs etc */ +}; + +struct mpam_device * __init +__mpam_device_create(u8 level_idx, enum mpam_class_types type, + int component_id, const struct cpumask *fw_affinity, + phys_addr_t hwpage_address); + +/* + * Create a device for a well known cache, e.g. L2. + * @level_idx and @cache_id will be used to match the cache via cacheinfo + * to learn the component affinity and export domain/resources via resctrl. + * If the device can only be accessed from a smaller set of CPUs, provide + * this as @device_affinity, which can otherwise be NULL. + * + * Returns the new device, or an ERR_PTR(). + */ +static inline struct mpam_device * +mpam_device_create_cache(u8 level_idx, int cache_id, + const struct cpumask *device_affinity, + phys_addr_t hwpage_address) +{ + return __mpam_device_create(level_idx, MPAM_CLASS_CACHE, cache_id, + device_affinity, hwpage_address); +} +/* + * Create a device for a main memory. + * For NUMA systems @nid allows multiple components to be created, + * which will be exported as resctrl domains. MSCs for memory must + * be accessible from any cpu. 
+ */ +static inline struct mpam_device * +mpam_device_create_memory(int nid, phys_addr_t hwpage_address) +{ + struct cpumask dev_affinity; + + cpumask_copy(&dev_affinity, cpumask_of_node(nid)); + + return __mpam_device_create(~0, MPAM_CLASS_MEMORY, nid, + &dev_affinity, hwpage_address); +} +int __init mpam_discovery_start(void); +int __init mpam_discovery_complete(void); +void __init mpam_discovery_failed(void); + +enum mpam_enable_type { + MPAM_ENABLE_DENIED = 0, + MPAM_ENABLE_ACPI, +}; + +extern enum mpam_enable_type mpam_enabled; + +#endif
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
resctrl sysfs needs to show the configuration applied to the MSCs. We could read this intermediate data straightforwardly from the mpam config structure that lives in each component, but for safety we read the exact register values in all cases, although this takes slightly longer.
We add an independent helper, separate from do_device_sync() and following James' implementation, that deliberately reads a single device per component. Because all devices in a component are configured uniformly in a single configuration pass, reading one device is sufficient to obtain each component's configuration.
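The read-one-device shortcut above can be sketched as a small userspace model. This is illustration only: `fake_device`, `fake_component` and `component_read_config` are hypothetical stand-ins for the kernel's `mpam_device`/`mpam_component` structures and the MMIO read, not the actual implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, simplified model of the read path. */
struct fake_device {
	uint32_t cfg_regs[4];	/* stands in for the MPAMCFG_* MMIO space */
};

struct fake_component {
	struct fake_device *devices;
	size_t ndev;
};

/*
 * All devices in a component receive identical configuration, so
 * reading register @reg from the first device is representative of
 * the whole component.
 */
static uint32_t component_read_config(const struct fake_component *comp,
				      unsigned int reg)
{
	if (comp->ndev == 0)
		return 0;	/* mirrors the WARN_ON(!dev) bail-out */
	return comp->devices[0].cfg_regs[reg];
}
```

In the kernel the read additionally happens via smp_call_function_single() on a CPU in the device's online affinity, and under the device lock; the model only captures the "first device is enough" invariant.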
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_device.c | 73 ++++++++++++++++++++++++++ arch/arm64/kernel/mpam/mpam_internal.h | 3 ++ 2 files changed, 76 insertions(+)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index cb29d28d42be..11cd7a5a6785 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -1328,3 +1328,76 @@ int mpam_component_mon(struct mpam_component *comp,
return ret; } + +static void mpam_component_read_mpamcfg(void *_ctx) +{ + unsigned long flags; + struct mpam_device *dev; + struct mpam_device_sync *ctx = (struct mpam_device_sync *)_ctx; + struct mpam_component *comp = ctx->comp; + struct sync_args *args = ctx->args; + u64 val; + u16 reg; + u32 partid; + + if (!args) + return; + + reg = args->reg; + /* + * args->partid is possible reqpartid or intpartid, + * if narrow enabled, it should be intpartid. + */ + partid = args->partid; + + list_for_each_entry(dev, &comp->devices, comp_list) { + if (!cpumask_test_cpu(smp_processor_id(), + &dev->online_affinity)) + continue; + + spin_lock_irqsave(&dev->lock, flags); + if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) + partid = PART_SEL_SET_INTERNAL(partid); + mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); + wmb(); + val = mpam_read_reg(dev, reg); + atomic64_add(val, &ctx->cfg_value); + spin_unlock_irqrestore(&dev->lock, flags); + + break; + } +} + +/* + * reading first device of the this component is enough + * for getting configuration. 
+ */ +static void +mpam_component_get_config_local(struct mpam_component *comp, + struct sync_args *args, u32 *result) +{ + int cpu; + struct mpam_device *dev; + struct mpam_device_sync sync_ctx; + + sync_ctx.args = args; + sync_ctx.comp = comp; + atomic64_set(&sync_ctx.cfg_value, 0); + + dev = list_first_entry_or_null(&comp->devices, + struct mpam_device, comp_list); + if (WARN_ON(!dev)) + return; + + cpu = cpumask_any(&dev->online_affinity); + smp_call_function_single(cpu, mpam_component_read_mpamcfg, &sync_ctx, 1); + + if (result) + *result = atomic64_read(&sync_ctx.cfg_value); +} + +void mpam_component_get_config(struct mpam_component *comp, + struct sync_args *args, u32 *result) +{ + mpam_component_get_config_local(comp, args, result); +} diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 9f6af1e11777..ea8be8c861c0 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -162,6 +162,9 @@ int mpam_component_config(struct mpam_component *comp, int mpam_component_mon(struct mpam_component *comp, struct sync_args *args, u64 *result);
+void mpam_component_get_config(struct mpam_component *comp, + struct sync_args *args, u32 *result); + u16 mpam_sysprops_num_partid(void); u16 mpam_sysprops_num_pmg(void);
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
So far the partid and pmg ranges have been probed into the mpam classes, as have the feature capabilities of each resource. For the resctrl intermediate processing layer, this information stored in the classes should be mirrored into the internal resctrl resource structure.
For simplicity, capability-related controls take a uniform integer input, whose width is likewise probed from the mpam classes. Currently only the priority width and hardlimit width fields are added to the resctrl resource structure; to support more features, part of this will be reworked later.
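The width handling described above can be sketched as follows. The helper names (`pick_pri_wd`, `field_max`) are hypothetical; only the relationships mirror the patch: `pri_wd` is the wider of the class's `intpri_wd` and `dspri_wd`, and a probed width `wd` bounds the control input at `(1 << wd) - 1` (e.g. the existing `MBA_MAX_WD 63u` corresponds to a 6-bit field).

```c
#include <assert.h>
#include <stdint.h>

/* Wider of internal and downstream priority widths, as in
 * mpam_resctrl_resource_init() in this patch. */
static uint16_t pick_pri_wd(uint16_t intpri_wd, uint16_t dspri_wd)
{
	return intpri_wd > dspri_wd ? intpri_wd : dspri_wd;
}

/* Largest value an integer control input of width @wd can hold. */
static uint16_t field_max(uint16_t wd)
{
	return (uint16_t)((1u << wd) - 1);
}
```

With the patch's fixed `hdl_wd = 2`, the hardlimit input is bounded at `field_max(2) == 3`.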
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 11 +++++++---- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 4 +--- arch/arm64/kernel/mpam/mpam_resctrl.c | 2 +- arch/arm64/kernel/mpam/mpam_setup.c | 19 ++++++++++++++++--- 4 files changed, 25 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 97e259703933..a70133fff450 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -263,7 +263,6 @@ struct msr_param { * @name: Name to use in "schemata" file * @num_closid: Number of CLOSIDs available * @cache_level: Which cache level defines scope of this resource - * @default_ctrl: Specifies default cache cbm or memory B/W percent. * @msr_base: Base MSR address for CBMs * @msr_update: Function pointer to update QOS MSRs * @data_width: Character width of data when displaying @@ -278,15 +277,19 @@ struct msr_param { */
struct raw_resctrl_resource { - int num_partid; - u32 default_ctrl; + u16 num_partid; + u16 num_intpartid; + u16 num_pmg; + + u16 pri_wd; + u16 hdl_wd; + void (*msr_update) (struct rdt_domain *d, int partid); u64 (*msr_read) (struct rdt_domain *d, int partid); int data_width; const char *format_str; int (*parse_ctrlval) (char *buf, struct raw_resctrl_resource *r, struct rdt_domain *d); - int num_pmg; int num_mon; u64 (*mon_read) (struct rdt_domain *d, struct rdtgroup *g); int (*mon_write) (struct rdt_domain *d, struct rdtgroup *g, bool enable); diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 522ed65bb810..0d4ba3afe419 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -594,7 +594,6 @@ int resctrl_mkdir_ctrlmon_mondata(struct kernfs_node *parent_kn, int rdtgroup_init_alloc(struct rdtgroup *rdtgrp) { struct resctrl_resource *r; - struct raw_resctrl_resource *rr; struct rdt_domain *d; int ret;
@@ -602,9 +601,8 @@ int rdtgroup_init_alloc(struct rdtgroup *rdtgrp) if (!r->alloc_enabled) continue;
- rr = (struct raw_resctrl_resource *)r->res; list_for_each_entry(d, &r->domains, list) { - d->new_ctrl = rr->default_ctrl; + d->new_ctrl = r->default_ctrl; d->have_new_ctrl = true; }
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 58c7582b2eef..d4870e33c3bd 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -313,7 +313,7 @@ void closid_init(void) for_each_resctrl_resource(r) { if (r->alloc_enabled) { rr = r->res; - num_closid = min(num_closid, rr->num_partid); + num_closid = min(num_closid, (int)rr->num_partid); } } closid_free_map = BIT_MASK(num_closid) - 1; diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index 45639a1fecb9..265e700cd7c0 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -331,6 +331,7 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) { struct mpam_class *class = res->class; struct resctrl_resource *r = &res->resctrl_res; + struct raw_resctrl_resource *rr = NULL;
if (class == mpam_resctrl_exports[RDT_RESOURCE_SMMU].class) { return 0; @@ -339,7 +340,8 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->name = "MB"; r->fflags = RFTYPE_RES_MC; r->mbw.delay_linear = true; - r->res = mpam_get_raw_resctrl_resource(RDT_RESOURCE_MC); + rr = mpam_get_raw_resctrl_resource(RDT_RESOURCE_MC); + r->res = rr;
if (mpam_has_feature(mpam_feat_mbw_part, class->features)) { res->resctrl_mba_uses_mbw_part = true; @@ -382,7 +384,8 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->mon_enabled = true; } else if (class == mpam_resctrl_exports[RDT_RESOURCE_L3].class) { r->rid = RDT_RESOURCE_L3; - r->res = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L3); + rr = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L3); + r->res = rr; r->fflags = RFTYPE_RES_CACHE; r->name = "L3";
@@ -413,7 +416,8 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res)
} else if (class == mpam_resctrl_exports[RDT_RESOURCE_L2].class) { r->rid = RDT_RESOURCE_L2; - r->res = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L2); + rr = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L2); + r->res = rr; r->fflags = RFTYPE_RES_CACHE; r->name = "L2";
@@ -442,6 +446,15 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->mon_capable = false; }
+ if (rr && class) { + rr->num_partid = class->num_partid; + rr->num_intpartid = class->num_intpartid; + rr->num_pmg = class->num_pmg; + + rr->pri_wd = max(class->intpri_wd, class->dspri_wd); + rr->hdl_wd = 2; + } + return 0; }
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
We now bridge the resctrl intermediate processing module and the mpam devices module; a large block of code related to the configuration and monitoring process needs to be modified.
We replace the previous method of writing MSC registers directly; this job is handed over to helpers offered by the mpam devices module instead. When a configuration or monitoring action happens, each domain's ctrlval array, as changed by resctrl sysfs input, is propagated into the mpam config structure that lives in each mpam component structure, and the relevant helpers provided by the mpam devices module then complete the remaining work.
Comparison: configuration or monitoring
   old                  new
    +                    +
    |                    |
    |         +----------+-----------+
    |         | intermediate helpers |
    |         +----------+-----------+
    |                    |
    |                    |
 +--+--------------------+--+
 | [reading writing MMIO]   |
 +--------------------------+
This nearly completes the path between resctrl sysfs and the mpam devices module, but it is still incomplete; some further work is needed afterwards.
This also moves relevant structures, such as struct mongroup, to a more suitable place.
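The new control-path shape can be sketched with a toy model of update_domains(): the resctrl side stages a value in the domain's ctrlval array and then calls the msr_update hook, rather than touching MMIO itself. The `toy_*` names are illustration only; in the patch the hook is `rr->msr_update(r, d, NULL, partid)` provided by the mpam devices module.

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

#define NPARTID 4

struct toy_domain {
	uint32_t ctrl_val[NPARTID];
	uint32_t new_ctrl;
	bool     have_new_ctrl;
};

static uint32_t last_pushed;	/* records what the helper was asked to apply */

/*
 * Stands in for rr->msr_update(): the resctrl side no longer writes
 * MMIO, it hands the staged value to the device-module helper.
 */
static void toy_msr_update(struct toy_domain *d, int partid)
{
	last_pushed = d->ctrl_val[partid];
}

/* Mirrors the update_domains() logic: only push a changed value. */
static void toy_update_domain(struct toy_domain *d, int partid)
{
	if (d->have_new_ctrl && d->new_ctrl != d->ctrl_val[partid]) {
		d->ctrl_val[partid] = d->new_ctrl;
		toy_msr_update(d, partid);
	}
}
```

The second call with an unchanged value shows why the `!=` check matters: the device-module helper is only invoked when the staged value actually differs from what is already applied.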
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 67 +--- arch/arm64/include/asm/resctrl.h | 56 ++- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 38 +- arch/arm64/kernel/mpam/mpam_internal.h | 3 + arch/arm64/kernel/mpam/mpam_mon.c | 26 +- arch/arm64/kernel/mpam/mpam_resctrl.c | 459 ++++++++++++++++--------- arch/arm64/kernel/mpam/mpam_setup.c | 13 + fs/resctrlfs.c | 17 + 8 files changed, 433 insertions(+), 246 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index a70133fff450..5aef534fb3df 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -118,54 +118,6 @@ DECLARE_STATIC_KEY_FALSE(resctrl_mon_enable_key); extern bool rdt_alloc_capable; extern bool rdt_mon_capable;
-enum rdt_group_type { - RDTCTRL_GROUP = 0, - RDTMON_GROUP, - RDT_NUM_GROUP, -}; - -/** - * struct mongroup - store mon group's data in resctrl fs. - * @mon_data_kn kernlfs node for the mon_data directory - * @parent: parent rdtgrp - * @crdtgrp_list: child rdtgroup node list - * @rmid: rmid for this rdtgroup - * @mon: monnitor id - */ -struct mongroup { - struct kernfs_node *mon_data_kn; - struct rdtgroup *parent; - struct list_head crdtgrp_list; - u32 rmid; - u32 mon; - int init; -}; - -/** - * struct rdtgroup - store rdtgroup's data in resctrl file system. - * @kn: kernfs node - * @resctrl_group_list: linked list for all rdtgroups - * @closid: closid for this rdtgroup - * #endif - * @cpu_mask: CPUs assigned to this rdtgroup - * @flags: status bits - * @waitcount: how many cpus expect to find this - * group when they acquire resctrl_group_mutex - * @type: indicates type of this rdtgroup - either - * monitor only or ctrl_mon group - * @mon: mongroup related data - */ -struct rdtgroup { - struct kernfs_node *kn; - struct list_head resctrl_group_list; - u32 closid; - struct cpumask cpu_mask; - int flags; - atomic_t waitcount; - enum rdt_group_type type; - struct mongroup mon; -}; - extern int max_name_width, max_data_width;
/* rdtgroup.flags */ @@ -284,15 +236,18 @@ struct raw_resctrl_resource { u16 pri_wd; u16 hdl_wd;
- void (*msr_update) (struct rdt_domain *d, int partid); - u64 (*msr_read) (struct rdt_domain *d, int partid); + void (*msr_update)(struct resctrl_resource *r, struct rdt_domain *d, + struct list_head *opt_list, int partid); + u64 (*msr_read)(struct rdt_domain *d, int partid); + int data_width; const char *format_str; - int (*parse_ctrlval) (char *buf, struct raw_resctrl_resource *r, - struct rdt_domain *d); - int num_mon; - u64 (*mon_read) (struct rdt_domain *d, struct rdtgroup *g); - int (*mon_write) (struct rdt_domain *d, struct rdtgroup *g, bool enable); + int (*parse_ctrlval)(char *buf, struct raw_resctrl_resource *r, + struct rdt_domain *d); + + u16 num_mon; + u64 (*mon_read)(struct rdt_domain *d, struct rdtgroup *g); + int (*mon_write)(struct rdt_domain *d, struct rdtgroup *g, bool enable); };
int parse_cbm(char *buf, struct raw_resctrl_resource *r, struct rdt_domain *d); @@ -321,4 +276,6 @@ int resctrl_mkdir_ctrlmon_mondata(struct kernfs_node *parent_kn, struct rdtgroup *prgrp, struct kernfs_node **dest_kn);
+u16 mpam_resctrl_max_mon_num(void); + #endif /* _ASM_ARM64_MPAM_H */ diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index d0d30a0fdc1d..2119204fa090 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -27,6 +27,54 @@ enum rdt_event_id { RESCTRL_NUM_EVENT_IDS, };
+enum rdt_group_type { + RDTCTRL_GROUP = 0, + RDTMON_GROUP, + RDT_NUM_GROUP, +}; + +/** + * struct mongroup - store mon group's data in resctrl fs. + * @mon_data_kn kernlfs node for the mon_data directory + * @parent: parent rdtgrp + * @crdtgrp_list: child rdtgroup node list + * @rmid: rmid for this rdtgroup + * @mon: monnitor id + */ +struct mongroup { + struct kernfs_node *mon_data_kn; + struct rdtgroup *parent; + struct list_head crdtgrp_list; + u32 rmid; + u32 mon; + int init; +}; + +/** + * struct rdtgroup - store rdtgroup's data in resctrl file system. + * @kn: kernfs node + * @resctrl_group_list: linked list for all rdtgroups + * @closid: closid for this rdtgroup + * #endif + * @cpu_mask: CPUs assigned to this rdtgroup + * @flags: status bits + * @waitcount: how many cpus expect to find this + * group when they acquire resctrl_group_mutex + * @type: indicates type of this rdtgroup - either + * monitor only or ctrl_mon group + * @mon: mongroup related data + */ +struct rdtgroup { + struct kernfs_node *kn; + struct list_head resctrl_group_list; + u32 closid; + struct cpumask cpu_mask; + int flags; + atomic_t waitcount; + enum rdt_group_type type; + struct mongroup mon; +}; + static inline int alloc_mon_id(void) {
@@ -69,11 +117,6 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, #define release_resctrl_group_fs_options release_rdtgroupfs_options #define parse_resctrl_group_fs_options parse_rdtgroupfs_options
-#define for_each_resctrl_resource(r) \ - for (r = resctrl_resources_all; \ - r < resctrl_resources_all + RDT_NUM_RESOURCES; \ - r++) \ - int mpam_get_mon_config(struct resctrl_resource *r);
int mkdir_mondata_all(struct kernfs_node *parent_kn, @@ -86,4 +129,7 @@ mongroup_create_dir(struct kernfs_node *parent_kn, struct resctrl_group *prgrp,
int rdtgroup_init_alloc(struct rdtgroup *rdtgrp);
+struct resctrl_resource * +mpam_resctrl_get_resource(enum resctrl_resource_level level); + #endif /* _ASM_ARM64_RESCTRL_H */ diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 0d4ba3afe419..39c5020cdfa6 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -38,6 +38,7 @@ #include <asm/mpam.h> #include <asm/mpam_resource.h> #include <asm/resctrl.h> +#include "mpam_internal.h"
/* * Check whether a cache bit mask is valid. The SDM says: @@ -188,7 +189,7 @@ static int update_domains(struct resctrl_resource *r, struct rdtgroup *g) list_for_each_entry(d, &r->domains, list) { if (d->have_new_ctrl && d->new_ctrl != d->ctrl_val[partid]) { d->ctrl_val[partid] = d->new_ctrl; - rr->msr_update(d, partid); + rr->msr_update(r, d, NULL, partid); } }
@@ -197,13 +198,17 @@ static int update_domains(struct resctrl_resource *r, struct rdtgroup *g)
static int resctrl_group_parse_resource(char *resname, char *tok, int closid) { + struct mpam_resctrl_res *res; struct resctrl_resource *r; struct raw_resctrl_resource *rr;
- for_each_resctrl_resource(r) { + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (r->alloc_enabled) { rr = (struct raw_resctrl_resource *)r->res; - if (!strcmp(resname, r->name) && closid < rr->num_partid) + if (!strcmp(resname, r->name) && closid < + mpam_sysprops_num_partid()) return parse_line(tok, r); } } @@ -216,6 +221,7 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, { struct rdtgroup *rdtgrp; struct rdt_domain *dom; + struct mpam_resctrl_res *res; struct resctrl_resource *r; char *tok, *resname; int closid, ret = 0; @@ -234,7 +240,9 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of,
closid = rdtgrp->closid;
- for_each_resctrl_resource(r) { + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (r->alloc_enabled) { list_for_each_entry(dom, &r->domains, list) dom->have_new_ctrl = false; @@ -258,7 +266,9 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, goto out; }
- for_each_resctrl_resource(r) { + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (r->alloc_enabled) { ret = update_domains(r, rdtgrp); if (ret) @@ -292,6 +302,7 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, struct seq_file *s, void *v) { struct rdtgroup *rdtgrp; + struct mpam_resctrl_res *res; struct resctrl_resource *r; struct raw_resctrl_resource *rr; int ret = 0; @@ -300,10 +311,11 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, rdtgrp = resctrl_group_kn_lock_live(of->kn); if (rdtgrp) { partid = rdtgrp->closid; - for_each_resctrl_resource(r) { + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; if (r->alloc_enabled) { rr = (struct raw_resctrl_resource *)r->res; - if (partid < rr->num_partid) + if (partid < mpam_sysprops_num_partid()) show_doms(s, r, partid); } } @@ -367,7 +379,7 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg)
md.priv = of->kn->priv;
- r = &resctrl_resources_all[md.u.rid]; + r = mpam_resctrl_get_resource(md.u.rid); rr = r->res;
/* show monitor data */ @@ -516,6 +528,7 @@ int mkdir_mondata_all(struct kernfs_node *parent_kn, struct resctrl_group *prgrp, struct kernfs_node **dest_kn) { + struct mpam_resctrl_res *res; struct resctrl_resource *r; struct kernfs_node *kn; int ret; @@ -534,7 +547,9 @@ int mkdir_mondata_all(struct kernfs_node *parent_kn, * Create the subdirectories for each domain. Note that all events * in a domain like L3 are grouped into a resource whose domain is L3 */ - for_each_resctrl_resource(r) { + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (r->mon_enabled) { /* HHA does not support monitor by pmg */ if ((prgrp->type == RDTMON_GROUP) && @@ -593,11 +608,14 @@ int resctrl_mkdir_ctrlmon_mondata(struct kernfs_node *parent_kn, /* Initialize the RDT group's allocations. */ int rdtgroup_init_alloc(struct rdtgroup *rdtgrp) { + struct mpam_resctrl_res *res; struct resctrl_resource *r; struct rdt_domain *d; int ret;
- for_each_resctrl_resource(r) { + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (!r->alloc_enabled) continue;
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index ea8be8c861c0..8ab019fd8938 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -170,6 +170,9 @@ u16 mpam_sysprops_num_pmg(void);
void mpam_class_list_lock_held(void);
+extern struct mpam_resctrl_res mpam_resctrl_exports[RDT_NUM_RESOURCES]; +extern struct mpam_resctrl_res mpam_resctrl_events[RESCTRL_NUM_EVENT_IDS]; + int mpam_resctrl_cpu_online(unsigned int cpu);
int mpam_resctrl_cpu_offline(unsigned int cpu); diff --git a/arch/arm64/kernel/mpam/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c index 81dddf5432b5..f952e9aa20c2 100644 --- a/arch/arm64/kernel/mpam/mpam_mon.c +++ b/arch/arm64/kernel/mpam/mpam_mon.c @@ -43,10 +43,17 @@ static int pmg_free_map; void mon_init(void); void pmg_init(void) { - /* use L3's num_pmg as system num_pmg */ - struct raw_resctrl_resource *rr = - resctrl_resources_all[RDT_RESOURCE_L3].res; - int num_pmg = rr->num_pmg; + u16 num_pmg = USHRT_MAX; + struct mpam_resctrl_res *res; + struct resctrl_resource *r; + struct raw_resctrl_resource *rr; + + /* Use the minimum num_pmg among all resources */ + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + rr = r->res; + num_pmg = min(num_pmg, rr->num_pmg); + }
mon_init();
@@ -77,16 +84,9 @@ void free_pmg(u32 pmg) static int mon_free_map; void mon_init(void) { - struct resctrl_resource *r; - struct raw_resctrl_resource *rr; - int num_mon = INT_MAX; + int num_mon;
- for_each_resctrl_resource(r) { - if (r->mon_enabled) { - rr = r->res; - num_mon = min(num_mon, rr->num_mon); - } - } + num_mon = mpam_resctrl_max_mon_num();
mon_free_map = BIT_MASK(num_mon) - 1; } diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index d4870e33c3bd..8fd5c84f28bd 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -45,6 +45,7 @@ #include <asm/resctrl.h> #include <asm/io.h>
+#include "mpam_device.h" #include "mpam_internal.h"
/* Mutex to protect rdtgroup access. */ @@ -70,6 +71,10 @@ int max_name_width, max_data_width; */ bool rdt_alloc_capable;
+/* + * Indicate the max number of monitors supported. + */ +static u32 max_mon_num; /* * Hi1620 2P Base Address Map * @@ -97,72 +102,55 @@ void mpam_resctrl_clear_default_cpu(unsigned int cpu) }
static void -cat_wrmsr(struct rdt_domain *d, int partid); +mpam_resctrl_update_component_cfg(struct resctrl_resource *r, + struct rdt_domain *d, struct list_head *opt_list, u32 partid); + static void -bw_wrmsr(struct rdt_domain *d, int partid); +common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, + struct list_head *opt_list, int partid);
-u64 cat_rdmsr(struct rdt_domain *d, int partid); -u64 bw_rdmsr(struct rdt_domain *d, int partid); +static u64 cache_rdmsr(struct rdt_domain *d, int partid); +static u64 mbw_rdmsr(struct rdt_domain *d, int partid);
-static u64 mbwu_read(struct rdt_domain *d, struct rdtgroup *g); -static u64 csu_read(struct rdt_domain *d, struct rdtgroup *g); +static u64 cache_rdmon(struct rdt_domain *d, struct rdtgroup *g); +static u64 mbw_rdmon(struct rdt_domain *d, struct rdtgroup *g);
-static int mbwu_write(struct rdt_domain *d, struct rdtgroup *g, bool enable); -static int csu_write(struct rdt_domain *d, struct rdtgroup *g, bool enable); +static int common_wrmon(struct rdt_domain *d, struct rdtgroup *g, + bool enable);
-#define domain_init(id) LIST_HEAD_INIT(resctrl_resources_all[id].domains) +static inline bool is_mon_dyn(u32 mon) +{ + /* + * If rdtgrp->mon.mon equals max_mon_num, the group has no + * statically assigned monitor; allocate one dynamically each + * time monitor data is read. + */ + return mon == mpam_resctrl_max_mon_num(); +}
struct raw_resctrl_resource raw_resctrl_resources_all[] = { [RDT_RESOURCE_L3] = { - .msr_update = cat_wrmsr, - .msr_read = cat_rdmsr, - .parse_ctrlval = parse_cbm, - .format_str = "%d=%0*x", - .mon_read = csu_read, - .mon_write = csu_write, + .msr_update = common_wrmsr, + .msr_read = cache_rdmsr, + .parse_ctrlval = parse_cbm, + .format_str = "%d=%0*x", + .mon_read = cache_rdmon, + .mon_write = common_wrmon, }, [RDT_RESOURCE_L2] = { - .msr_update = cat_wrmsr, - .msr_read = cat_rdmsr, - .parse_ctrlval = parse_cbm, - .format_str = "%d=%0*x", - .mon_read = csu_read, - .mon_write = csu_write, + .msr_update = common_wrmsr, + .msr_read = cache_rdmsr, + .parse_ctrlval = parse_cbm, + .format_str = "%d=%0*x", + .mon_read = cache_rdmon, + .mon_write = common_wrmon, }, [RDT_RESOURCE_MC] = { - .msr_update = bw_wrmsr, - .msr_read = bw_rdmsr, - .parse_ctrlval = parse_bw, /* add parse_bw() helper */ - .format_str = "%d=%0*d", - .mon_read = mbwu_read, - .mon_write = mbwu_write, - }, -}; - -struct resctrl_resource resctrl_resources_all[] = { - [RDT_RESOURCE_L3] = { - .rid = RDT_RESOURCE_L3, - .name = "L3", - .domains = domain_init(RDT_RESOURCE_L3), - .res = &raw_resctrl_resources_all[RDT_RESOURCE_L3], - .fflags = RFTYPE_RES_CACHE, - .alloc_enabled = 1, - }, - [RDT_RESOURCE_L2] = { - .rid = RDT_RESOURCE_L2, - .name = "L2", - .domains = domain_init(RDT_RESOURCE_L2), - .res = &raw_resctrl_resources_all[RDT_RESOURCE_L2], - .fflags = RFTYPE_RES_CACHE, - .alloc_enabled = 1, - }, - [RDT_RESOURCE_MC] = { - .rid = RDT_RESOURCE_MC, - .name = "MB", - .domains = domain_init(RDT_RESOURCE_MC), - .res = &raw_resctrl_resources_all[RDT_RESOURCE_MC], - .fflags = RFTYPE_RES_MC, - .alloc_enabled = 1, + .msr_update = common_wrmsr, + .msr_read = mbw_rdmsr, + .parse_ctrlval = parse_bw, + .format_str = "%d=%0*d", + .mon_read = mbw_rdmon, + .mon_write = common_wrmon, }, };
@@ -176,35 +164,51 @@ mpam_get_raw_resctrl_resource(enum resctrl_resource_level level) }
static void -cat_wrmsr(struct rdt_domain *d, int partid) +common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, + struct list_head *opt_list, int partid) { - mpam_writel(partid, d->base + MPAMCFG_PART_SEL); - mpam_writel(d->ctrl_val[partid], d->base + MPAMCFG_CPBM); -} + struct sync_args args; + struct mpam_resctrl_dom *dom;
-static void -bw_wrmsr(struct rdt_domain *d, int partid) -{ - u64 val = MBW_MAX_SET(d->ctrl_val[partid]); + args.partid = partid;
- mpam_writel(partid, d->base + MPAMCFG_PART_SEL); - mpam_writel(val, d->base + MPAMCFG_MBW_MAX); + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + + mpam_resctrl_update_component_cfg(r, d, opt_list, partid); + + mpam_component_config(dom->comp, &args); }
-u64 cat_rdmsr(struct rdt_domain *d, int partid) +static u64 cache_rdmsr(struct rdt_domain *d, int partid) { - mpam_writel(partid, d->base + MPAMCFG_PART_SEL); - return mpam_readl(d->base + MPAMCFG_CPBM); -} + u32 result; + struct sync_args args; + struct mpam_resctrl_dom *dom;
-u64 bw_rdmsr(struct rdt_domain *d, int partid) + args.partid = partid; + args.reg = MPAMCFG_CPBM; + + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + + mpam_component_get_config(dom->comp, &args, &result); + + return result; +} +static u64 mbw_rdmsr(struct rdt_domain *d, int partid) { u64 max; + u32 result; + struct sync_args args; + struct mpam_resctrl_dom *dom; + + args.partid = partid; + args.reg = MPAMCFG_MBW_MAX; + + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom);
- mpam_writel(partid, d->base + MPAMCFG_PART_SEL); - max = mpam_readl(d->base + MPAMCFG_MBW_MAX); + mpam_component_get_config(dom->comp, &args, &result);
- max = MBW_MAX_GET(max); + max = MBW_MAX_GET(result); return roundup((max * 100) / 64, 5); }
@@ -212,81 +216,116 @@ u64 bw_rdmsr(struct rdt_domain *d, int partid) * use pmg as monitor id * just use match_partid only. */ -static u64 mbwu_read(struct rdt_domain *d, struct rdtgroup *g) +static u64 cache_rdmon(struct rdt_domain *d, struct rdtgroup *g) { + int err; + u64 result = 0; + struct sync_args args; + struct mpam_resctrl_dom *dom; u32 mon = g->mon.mon; + unsigned long timeout; - mpam_writel(mon, d->base + MSMON_CFG_MON_SEL); - return mpam_readl(d->base + MSMON_MBWU); -} + /* Allocate a monitor dynamically if none is assigned */ + if (is_mon_dyn(mon)) + mon = alloc_mon(); + + args.partid = g->closid; + args.mon = mon; + args.pmg = g->mon.rmid; + args.match_pmg = true; + args.eventid = QOS_L3_OCCUP_EVENT_ID; + + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + + /* + * The read can fail transiently while the NRDY bit is set; + * retry until it succeeds or a one second timeout expires. + */ + timeout = READ_ONCE(jiffies) + (1 * SEC_CONVERSION); + do { + if (time_after(READ_ONCE(jiffies), timeout)) { + err = -ETIMEDOUT; + break; + } + err = mpam_component_mon(dom->comp, &args, &result); + /* Currently just report it */ + WARN_ON(err && (err != -EBUSY)); + } while (err == -EBUSY);
-static u64 csu_read(struct rdt_domain *d, struct rdtgroup *g) + if (is_mon_dyn(mon)) + free_mon(mon); + + return result; +} + +/* + * use pmg as monitor id + * just use match_partid only. + */ +static u64 mbw_rdmon(struct rdt_domain *d, struct rdtgroup *g) { + int err; + u64 result = 0; + struct sync_args args; + struct mpam_resctrl_dom *dom; u32 mon = g->mon.mon; + unsigned long timeout;
- mpam_writel(mon, d->base + MSMON_CFG_MON_SEL); - return mpam_readl(d->base + MSMON_CSU); -} + if (is_mon_dyn(mon)) + mon = alloc_mon();
-static int mbwu_write(struct rdt_domain *d, struct rdtgroup *g, bool enable) -{ - u32 mon, partid, pmg, ctl, flt, cur_ctl, cur_flt; - - mon = g->mon.mon; - mpam_writel(mon, d->base + MSMON_CFG_MON_SEL); - if (enable) { - partid = g->closid; - pmg = g->mon.rmid; - ctl = MSMON_MATCH_PARTID|MSMON_MATCH_PMG; - flt = MSMON_CFG_FLT_SET(pmg, partid); - cur_flt = mpam_readl(d->base + MSMON_CFG_MBWU_FLT); - cur_ctl = mpam_readl(d->base + MSMON_CFG_MBWU_CTL); - - if (cur_ctl != (ctl | MSMON_CFG_CTL_EN | MSMON_CFG_MBWU_TYPE) || - cur_flt != flt) { - mpam_writel(flt, d->base + MSMON_CFG_MBWU_FLT); - mpam_writel(ctl, d->base + MSMON_CFG_MBWU_CTL); - mpam_writel(0, d->base + MSMON_MBWU); - ctl |= MSMON_CFG_CTL_EN; - mpam_writel(ctl, d->base + MSMON_CFG_MBWU_CTL); - } - } else { - ctl = 0; - mpam_writel(ctl, d->base + MSMON_CFG_MBWU_CTL); - } + args.partid = g->closid; + args.mon = mon; + args.pmg = g->mon.rmid; + args.match_pmg = true; + args.eventid = QOS_L3_MBM_LOCAL_EVENT_ID;
- return 0; + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + + /* + * The read can fail transiently while the NRDY bit is set; + * retry until it succeeds or a one second timeout expires. + */ + timeout = READ_ONCE(jiffies) + (1 * SEC_CONVERSION); + do { + if (time_after(READ_ONCE(jiffies), timeout)) { + err = -ETIMEDOUT; + break; + } + err = mpam_component_mon(dom->comp, &args, &result); + /* Currently just report it */ + WARN_ON(err && (err != -EBUSY)); + } while (err == -EBUSY); + + if (is_mon_dyn(mon)) + free_mon(mon); + return result; }
-static int csu_write(struct rdt_domain *d, struct rdtgroup *g, bool enable) +static int common_wrmon(struct rdt_domain *d, struct rdtgroup *g, bool enable) { - u32 mon, partid, pmg, ctl, flt, cur_ctl, cur_flt; - - mon = g->mon.mon; - mpam_writel(mon, d->base + MSMON_CFG_MON_SEL); - if (enable) { - partid = g->closid; - pmg = g->mon.rmid; - ctl = MSMON_MATCH_PARTID|MSMON_MATCH_PMG; - flt = MSMON_CFG_FLT_SET(pmg, partid); - cur_flt = mpam_readl(d->base + MSMON_CFG_CSU_FLT); - cur_ctl = mpam_readl(d->base + MSMON_CFG_CSU_CTL); - - if (cur_ctl != (ctl | MSMON_CFG_CTL_EN | MSMON_CFG_CSU_TYPE) || - cur_flt != flt) { - mpam_writel(flt, d->base + MSMON_CFG_CSU_FLT); - mpam_writel(ctl, d->base + MSMON_CFG_CSU_CTL); - mpam_writel(0, d->base + MSMON_CSU); - ctl |= MSMON_CFG_CTL_EN; - mpam_writel(ctl, d->base + MSMON_CFG_CSU_CTL); - } - } else { - ctl = 0; - mpam_writel(ctl, d->base + MSMON_CFG_CSU_CTL); - } + u64 result; + struct sync_args args; + struct mpam_resctrl_dom *dom; + + if (!enable) + return -EINVAL; + + args.partid = g->closid; + args.mon = g->mon.mon; + args.pmg = g->mon.rmid; + args.match_pmg = true; + + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + + /* + * The return value can be ignored here; we only need to + * program the monitor configuration. + */ + mpam_component_mon(dom->comp, &args, &result);
return 0; } + /* * Trivial allocator for CLOSIDs. Since h/w only supports a small number, * we can keep a bitmap of free CLOSIDs in a single integer. @@ -306,16 +345,10 @@ static int closid_free_map;
void closid_init(void) { - struct resctrl_resource *r; - struct raw_resctrl_resource *rr; int num_closid = INT_MAX;
- for_each_resctrl_resource(r) { - if (r->alloc_enabled) { - rr = r->res; - num_closid = min(num_closid, (int)rr->num_partid); - } - } + num_closid = mpam_sysprops_num_partid(); + closid_free_map = BIT_MASK(num_closid) - 1;
/* CLOSID 0 is always reserved for the default group */ @@ -345,20 +378,24 @@ void closid_free(int closid) */ static __init void mpam_init_padding(void) { + int cl; + struct mpam_resctrl_res *res; struct resctrl_resource *r; struct raw_resctrl_resource *rr; - int cl;
- for_each_resctrl_resource(r) { - if (r->alloc_enabled) { - rr = (struct raw_resctrl_resource *)r->res; - cl = strlen(r->name); - if (cl > max_name_width) - max_name_width = cl; + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res;
- if (rr->data_width > max_data_width) - max_data_width = rr->data_width; - } + cl = strlen(r->name); + if (cl > max_name_width) + max_name_width = cl; + + rr = r->res; + if (!rr) + continue; + cl = rr->data_width; + if (cl > max_data_width) + max_data_width = cl; } }
@@ -380,10 +417,13 @@ static int reset_all_ctrls(struct resctrl_resource *r)
void resctrl_resource_reset(void) { + struct mpam_resctrl_res *res; struct resctrl_resource *r;
/*Put everything back to default values. */ - for_each_resctrl_resource(r) { + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (r->alloc_enabled) reset_all_ctrls(r); } @@ -640,9 +680,12 @@ static int resctrl_num_partid_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) { struct resctrl_resource *r = of->kn->parent->priv; - struct raw_resctrl_resource *rr = (struct raw_resctrl_resource *)r->res; + struct raw_resctrl_resource *rr = r->res; + u16 num_partid;
- seq_printf(seq, "%d\n", rr->num_partid); + num_partid = rr->num_partid; + + seq_printf(seq, "%d\n", num_partid);
return 0; } @@ -651,9 +694,12 @@ static int resctrl_num_pmg_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) { struct resctrl_resource *r = of->kn->parent->priv; - struct raw_resctrl_resource *rr = (struct raw_resctrl_resource *)r->res; + struct raw_resctrl_resource *rr = r->res; + u16 num_pmg; + + num_pmg = rr->num_pmg;
- seq_printf(seq, "%d\n", rr->num_pmg); + seq_printf(seq, "%d\n", num_pmg);
return 0; } @@ -662,9 +708,12 @@ static int resctrl_num_mon_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) { struct resctrl_resource *r = of->kn->parent->priv; - struct raw_resctrl_resource *rr = (struct raw_resctrl_resource *)r->res; + struct raw_resctrl_resource *rr = r->res; + u16 num_mon;
- seq_printf(seq, "%d\n", rr->num_mon); + num_mon = rr->num_mon; + + seq_printf(seq, "%d\n", num_mon);
return 0; } @@ -917,7 +966,8 @@ int resctrl_ctrlmon_enable(struct kernfs_node *parent_kn, void resctrl_ctrlmon_disable(struct kernfs_node *kn_mondata, struct resctrl_group *prgrp) { - struct resctrl_resource *r; + struct mpam_resctrl_res *r; + struct resctrl_resource *resctrl_res; struct raw_resctrl_resource *rr; struct rdt_domain *dom; int mon = prgrp->mon.mon; @@ -926,12 +976,13 @@ void resctrl_ctrlmon_disable(struct kernfs_node *kn_mondata, if (prgrp->type == RDTMON_GROUP) return;
- /* disable monitor before free mon */ - for_each_resctrl_resource(r) { - if (r->mon_enabled) { - rr = (struct raw_resctrl_resource *)r->res; + for_each_supported_resctrl_exports(r) { + resctrl_res = &r->resctrl_res; + + if (resctrl_res->mon_enabled) { + rr = (struct raw_resctrl_resource *)resctrl_res->res;
- list_for_each_entry(dom, &r->domains, list) { + list_for_each_entry(dom, &resctrl_res->domains, list) { rr->mon_write(dom, prgrp, false); } } @@ -1167,3 +1218,85 @@ void __mpam_sched_in(void) mpam_write_sysreg_s(reg, SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); } } + +static void +mpam_update_from_resctrl_cfg(struct mpam_resctrl_res *res, + u32 resctrl_cfg, struct mpam_config *mpam_cfg) +{ + if (res == &mpam_resctrl_exports[RDT_RESOURCE_MC]) { + u64 range; + + /* For MBA cfg is a percentage of .. */ + if (res->resctrl_mba_uses_mbw_part) { + /* .. the number of bits we can set */ + range = res->class->mbw_pbm_bits; + mpam_cfg->mbw_pbm = (resctrl_cfg * range) / MAX_MBA_BW; + mpam_set_feature(mpam_feat_mbw_part, &mpam_cfg->valid); + } else { + /* .. the number of fractions we can represent */ + mpam_cfg->mbw_max = resctrl_cfg; + + mpam_set_feature(mpam_feat_mbw_max, &mpam_cfg->valid); + } + } else { + /* + * Nothing clever here as mpam_resctrl_pick_caches() + * capped the size at RESCTRL_MAX_CBM. + */ + mpam_cfg->cpbm = resctrl_cfg; + mpam_set_feature(mpam_feat_cpor_part, &mpam_cfg->valid); + } +} + +static void +mpam_resctrl_update_component_cfg(struct resctrl_resource *r, + struct rdt_domain *d, struct list_head *opt_list, u32 partid) +{ + struct mpam_resctrl_dom *dom; + struct mpam_resctrl_res *res; + struct mpam_config *mpam_cfg; + u32 resctrl_cfg = d->ctrl_val[partid]; + + lockdep_assert_held(&resctrl_group_mutex); + + /* Out of range */ + if (partid >= mpam_sysprops_num_partid()) + return; + + res = container_of(r, struct mpam_resctrl_res, resctrl_res); + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + + mpam_cfg = &dom->comp->cfg[partid]; + if (WARN_ON_ONCE(!mpam_cfg)) + return; + + mpam_cfg->valid = 0; + if (partid != mpam_cfg->intpartid) { + mpam_cfg->intpartid = partid; + mpam_set_feature(mpam_feat_part_nrw, &mpam_cfg->valid); + } + + mpam_update_from_resctrl_cfg(res, resctrl_cfg, mpam_cfg); +} + +u16 mpam_resctrl_max_mon_num(void) +{ + struct 
mpam_resctrl_res *res; + u16 mon_num = USHRT_MAX; + struct raw_resctrl_resource *rr; + + if (max_mon_num) + return max_mon_num; + + for_each_supported_resctrl_exports(res) { + rr = res->resctrl_res.res; + mon_num = min(mon_num, rr->num_mon); + } + + if (mon_num == USHRT_MAX) + mon_num = 0; + + max_mon_num = mon_num; + + return mon_num; +} diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index 265e700cd7c0..4ad178c083ea 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -341,6 +341,7 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->fflags = RFTYPE_RES_MC; r->mbw.delay_linear = true; rr = mpam_get_raw_resctrl_resource(RDT_RESOURCE_MC); + rr->num_mon = class->num_mbwu_mon; r->res = rr;
if (mpam_has_feature(mpam_feat_mbw_part, class->features)) { @@ -385,6 +386,7 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) } else if (class == mpam_resctrl_exports[RDT_RESOURCE_L3].class) { r->rid = RDT_RESOURCE_L3; rr = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L3); + rr->num_mon = class->num_csu_mon; r->res = rr; r->fflags = RFTYPE_RES_CACHE; r->name = "L3"; @@ -417,6 +419,7 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) } else if (class == mpam_resctrl_exports[RDT_RESOURCE_L2].class) { r->rid = RDT_RESOURCE_L2; rr = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L2); + rr->num_mon = class->num_csu_mon; r->res = rr; r->fflags = RFTYPE_RES_CACHE; r->name = "L2"; @@ -489,3 +492,13 @@ int mpam_resctrl_setup(void)
return 0; } + +struct resctrl_resource * +mpam_resctrl_get_resource(enum resctrl_resource_level level) +{ + if (level >= RDT_NUM_RESOURCES || + !mpam_resctrl_exports[level].class) + return NULL; + + return &mpam_resctrl_exports[level].resctrl_res; +} diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index 9c48cd165ea4..7bd1f519d4b1 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -183,6 +183,9 @@ static int resctrl_group_create_info_dir(struct kernfs_node *parent_kn) unsigned long fflags; char name[32]; int ret; +#ifdef CONFIG_ARM64 + enum resctrl_resource_level level; +#endif
/* create the directory */ kn_info = kernfs_create_dir(parent_kn, "info", parent_kn->mode, NULL); @@ -194,7 +197,14 @@ static int resctrl_group_create_info_dir(struct kernfs_node *parent_kn) if (ret) goto out_destroy;
+#ifdef CONFIG_ARM64 + for (level = RDT_RESOURCE_SMMU; level < RDT_NUM_RESOURCES; level++) { + r = mpam_resctrl_get_resource(level); + if (!r) + continue; +#else for_each_resctrl_resource(r) { +#endif if (r->alloc_enabled) { fflags = r->fflags | RF_CTRL_INFO; ret = resctrl_group_mkdir_info_resdir(r, r->name, fflags); @@ -203,7 +213,14 @@ static int resctrl_group_create_info_dir(struct kernfs_node *parent_kn) } }
+#ifdef CONFIG_ARM64 + for (level = RDT_RESOURCE_SMMU; level < RDT_NUM_RESOURCES; level++) { + r = mpam_resctrl_get_resource(level); + if (!r) + continue; +#else for_each_resctrl_resource(r) { +#endif if (r->mon_enabled) { fflags = r->fflags | RF_MON_INFO; snprintf(name, sizeof(name), "%s_MON", r->name);
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
CDP (Code and Data Prioritization) should also be supported. Separate code and data caches are an illusion resctrl creates using CDP; as James said, the L2 cache is controlled from one place regardless. Arm doesn't specify a cache topology: platforms may have separate L2 code and data caches with independent controls, and on such a system the unified L2 cache would be the illusion. To support Arm's MPAM, we need CDP to not be implicit between the architecture code and the file-system code, so this adds a series of definitions independent of resctrl resources.

To do this we make the code/data/both 'type' a property of the configuration that comes from the schema. This lets us combine the illusionary caches. Eventually we separate the architecture code's and file-system code's idea of closid; the architecture code can then provide helpers to map one to the other.

Part of this code is borrowed from James's tree; see the links below.
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=commit;h=57a6f6204f72e2afc1167... Link: http://www.linux-arm.org/git?p=linux-jm.git;a=commit;h=1385052cce87a8aed5dc0... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 91 +++++++++++++++++++++++++++++++++++ 1 file changed, 91 insertions(+)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 5aef534fb3df..3082ee4f68d4 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -120,6 +120,97 @@ extern bool rdt_mon_capable;
extern int max_name_width, max_data_width;
+enum resctrl_conf_type { + CDP_BOTH = 0, + CDP_CODE, + CDP_DATA, + CDP_NUM_CONF_TYPE, +}; + +static inline int conf_name_to_conf_type(char *name) +{ + enum resctrl_conf_type t; + + if (!strcmp(name, "L3CODE") || !strcmp(name, "L2CODE")) + t = CDP_CODE; + else if (!strcmp(name, "L3DATA") || !strcmp(name, "L2DATA")) + t = CDP_DATA; + else + t = CDP_BOTH; + return t; +} + +#define for_each_conf_type(t) \ + for (t = CDP_BOTH; t < CDP_NUM_CONF_TYPE; t++) + +typedef struct { u16 val; } hw_mpamid_t; + +#define hw_closid_t hw_mpamid_t +#define hw_monid_t hw_mpamid_t +#define hw_closid_val(__x) (__x.val) +#define hw_monid_val(__x) (__x.val) + +#define as_hw_t(__name, __x) \ + ((hw_##__name##id_t){(__x)}) +#define hw_val(__name, __x) \ + hw_##__name##id_val(__x) + +/* + * When CDP is enabled, Cache LxDATA uses (closid + 1). + */ +#define resctrl_cdp_map(__name, __closid, __type, __result) \ +do { \ + if (__type == CDP_CODE) \ + __result = as_hw_t(__name, __closid); \ + else if (__type == CDP_DATA) \ + __result = as_hw_t(__name, __closid + 1); \ + else \ + __result = as_hw_t(__name, __closid); \ +} while (0) + +static inline bool is_resctrl_cdp_enabled(void) +{ + return false; +} + +#define hw_alloc_times_validate(__name, __times, __flag) \ +do { \ + __flag = is_resctrl_cdp_enabled(); \ + __times = __flag ? 2 : 1; \ +} while (0) + + +/** + * struct resctrl_staged_config - parsed configuration to be applied + * @hw_closid: raw closid for this configuration, regardless of CDP + * @new_ctrl: new ctrl value to be loaded + * @have_new_ctrl: did user provide new_ctrl for this domain + * @new_ctrl_type: CDP property of the new ctrl + */ +struct resctrl_staged_config { + hw_closid_t hw_closid; + u32 new_ctrl; + bool have_new_ctrl; + enum resctrl_conf_type new_ctrl_type; +}; + +/* later move to resctrl common directory */ +#define RESCTRL_NAME_LEN 7 + +/** + * struct resctrl_schema - entry in resctrl's schema list + * @list: Member of resctrl's schema list + * @name: Name visible in the schemata file + * @conf_type: Type of configuration, e.g.
code/data/both + * @res: The resctrl_resource for this entry + */ +struct resctrl_schema { + struct list_head list; + char name[RESCTRL_NAME_LEN]; + enum resctrl_conf_type conf_type; + struct resctrl_resource *res; +}; + + /* rdtgroup.flags */ #define RDT_DELETED BIT(0) #define RDT_CTRLMON BIT(1)
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
This adds support for the cdpl2 and cdpl3 mount options; other options covering capability features such as priority and hardlimit will be supported in the future. A simple example:

mount -t resctrl resctrl /sys/fs/resctrl -o cdpl3
cd /sys/fs/resctrl && cat schemata

L3CODE:0=7fff;1=7fff;2=7fff;3=7fff
L3DATA:0=7fff;1=7fff;2=7fff;3=7fff
MB:0=100;1=100;2=100;3=100

Note that this only completes the interface adaptation; it does not mean CDP is fully supported yet.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_resctrl.c | 73 +++++++++++++++++++++++++++ 1 file changed, 73 insertions(+)
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 8fd5c84f28bd..aa7a3d1402ec 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -75,6 +75,12 @@ bool rdt_alloc_capable; * Indicate the max number of monitors supported. */ static u32 max_mon_num; + +/* + * Indicate whether cdpl2/cdpl3 was given as a mount option. + */ +static bool resctrl_cdp_enabled; + /* * Hi1620 2P Base Address Map * @@ -433,9 +439,76 @@ void release_rdtgroupfs_options(void) { }
+static void disable_cdp(void) +{ + struct mpam_resctrl_res *res; + struct resctrl_resource *r; + + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + r->cdp_enable = false; + } + + resctrl_cdp_enabled = false; +} + +static int try_to_enable_cdp(enum resctrl_resource_level level) +{ + struct resctrl_resource *r = mpam_resctrl_get_resource(level); + + if (!r || !r->cdp_capable) + return -EINVAL; + + r->cdp_enable = true; + + resctrl_cdp_enabled = true; + return 0; +} + +static int cdpl3_enable(void) +{ + return try_to_enable_cdp(RDT_RESOURCE_L3); +} + +static int cdpl2_enable(void) +{ + return try_to_enable_cdp(RDT_RESOURCE_L2); +} + int parse_rdtgroupfs_options(char *data) { + char *token; + char *o = data; + int ret = 0; + + disable_cdp(); + + while ((token = strsep(&o, ",")) != NULL) { + if (!*token) { + ret = -EINVAL; + goto out; + } + + if (!strcmp(token, "cdpl3")) { + ret = cdpl3_enable(); + if (ret) + goto out; + } else if (!strcmp(token, "cdpl2")) { + ret = cdpl2_enable(); + if (ret) + goto out; + } else { + ret = -EINVAL; + goto out; + } + } + return 0; + +out: + pr_err("Invalid mount option \"%s\"\n", token); + + return ret; }
/*
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Initialize the schemata list when the resctrl sysfs is mounted and destroy it on umount; each list node holds the configuration value updated through the corresponding schemata row (in the resctrl sysfs).

Partial code is borrowed from commit 250656171d95 ("x86/resctrl: Stop using Lx CODE/DATA resources"), whose log explains:
Now that CDP enable/disable is global, and the closid offset correction is based on the configuration being applied, we are using different hw_closid slots in the ctrl array for CODE/DATA schema. This lets us merge them using the same Lx resource twice for CDP's CODE/DATA schema. This keeps the illusion of separate caches in the resctrl code.
When CDP is enabled for a cache, create two schemata, generating the names and setting the configuration type.
We can now remove the initialisation of the illusionary hw_resources: 'cdp_capable' just requires setting a flag, and resctrl knows what to do from there.
Link: http://www.linux-arm.org/git?p=linux-jm.git;a=commit;h=250656171d95dea079cc6... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/resctrl.h | 4 ++ arch/arm64/kernel/mpam/mpam_ctrlmon.c | 78 +++++++++++++++++++++++++++ fs/resctrlfs.c | 11 +++- 3 files changed, 92 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 2119204fa090..58cff955fbda 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -75,6 +75,10 @@ struct rdtgroup { struct mongroup mon; };
+int schemata_list_init(void); + +void schemata_list_destroy(void); + static inline int alloc_mon_id(void) {
diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 39c5020cdfa6..d5cbb5f16d92 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -40,6 +40,84 @@ #include <asm/resctrl.h> #include "mpam_internal.h"
+/* schemata content list */ +LIST_HEAD(resctrl_all_schema); + +/* Init schemata content */ +static int add_schema(enum resctrl_conf_type t, struct resctrl_resource *r) +{ + char *suffix = ""; + struct resctrl_schema *s; + + s = kzalloc(sizeof(*s), GFP_KERNEL); + if (!s) + return -ENOMEM; + + s->res = r; + s->conf_type = t; + + switch (t) { + case CDP_CODE: + suffix = "CODE"; + break; + case CDP_DATA: + suffix = "DATA"; + break; + case CDP_BOTH: + suffix = ""; + break; + default: + return -EINVAL; + } + + WARN_ON_ONCE(strlen(r->name) + strlen(suffix) + 1 > RESCTRL_NAME_LEN); + snprintf(s->name, sizeof(s->name), "%s%s", r->name, suffix); + + INIT_LIST_HEAD(&s->list); + list_add_tail(&s->list, &resctrl_all_schema); + + return 0; +} + +int schemata_list_init(void) +{ + int ret; + struct mpam_resctrl_res *res; + struct resctrl_resource *r; + + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (!r || !r->alloc_capable) + continue; + + if (r->cdp_enable) { + ret = add_schema(CDP_CODE, r); + ret |= add_schema(CDP_DATA, r); + } else { + ret = add_schema(CDP_BOTH, r); + } + if (ret) + break; + } + + return ret; +} + +/* + * During resctrl_kill_sb(), the mba_sc state is reset before + * schemata_list_destroy() is called: unconditionally try to free the + * array. + */ +void schemata_list_destroy(void) +{ + struct resctrl_schema *s, *tmp; + + list_for_each_entry_safe(s, tmp, &resctrl_all_schema, list) { + list_del(&s->list); + kfree(s); + } +} + /* * Check whether a cache bit mask is valid. 
The SDM says: * Please note that all (and only) contiguous '1' combinations diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index 7bd1f519d4b1..9fbabed56015 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -335,7 +335,13 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, dentry = ERR_PTR(ret); goto out_options; } - +#ifdef CONFIG_ARM64 + ret = schemata_list_init(); + if (ret) { + dentry = ERR_PTR(ret); + goto out_options; + } +#endif resctrl_id_init();
ret = resctrl_group_create_info_dir(resctrl_group_default.kn); @@ -505,6 +511,9 @@ static void resctrl_kill_sb(struct super_block *sb) mutex_lock(&resctrl_group_mutex);
resctrl_resource_reset(); +#ifdef CONFIG_ARM64 + schemata_list_destroy(); +#endif
rmdir_all_sub(); static_branch_disable_cpuslocked(&resctrl_alloc_enable_key);
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Add a schema list for each rdt domain: we use this list to store the changes from schemata rows instead of the previous ctrlval array living in the resctrl resource structure. When the resctrl sysfs is mounted, every resource's configuration is reset to its default by resctrl_group_update_domains().
Currently each row in the schemata sysfile occupies one list node; this may be extended later to support additional control types.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 2 + arch/arm64/include/asm/resctrl.h | 2 +- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 119 +++++++++++++++++++++++--- fs/resctrlfs.c | 2 +- 4 files changed, 110 insertions(+), 15 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 3082ee4f68d4..6a90cc9661a2 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -247,6 +247,8 @@ struct rdt_domain {
/* for debug */ char *cpus_list; + + struct resctrl_staged_config staged_cfg[CDP_NUM_CONF_TYPE]; };
extern struct mutex resctrl_group_mutex; diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 58cff955fbda..f44feeb6b496 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -131,7 +131,7 @@ int mongroup_create_dir(struct kernfs_node *parent_kn, struct resctrl_group *prgrp, char *name, struct kernfs_node **dest_kn);
-int rdtgroup_init_alloc(struct rdtgroup *rdtgrp); +int resctrl_group_init_alloc(struct rdtgroup *rdtgrp);
struct resctrl_resource * mpam_resctrl_get_resource(enum resctrl_resource_level level); diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index d5cbb5f16d92..acb4696d48b1 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -118,6 +118,37 @@ void schemata_list_destroy(void) } }
+static int resctrl_group_update_domains(struct rdtgroup *rdtgrp, + struct resctrl_resource *r) +{ + int i; + u32 partid; + struct rdt_domain *d; + struct raw_resctrl_resource *rr; + struct resctrl_staged_config *cfg; + + rr = r->res; + list_for_each_entry(d, &r->domains, list) { + cfg = d->staged_cfg; + for (i = 0; i < ARRAY_SIZE(d->staged_cfg); i++) { + if (!cfg[i].have_new_ctrl) + continue; + + partid = hw_closid_val(cfg[i].hw_closid); + /* apply cfg */ + if (d->ctrl_val[partid] == cfg[i].new_ctrl) + continue; + + d->ctrl_val[partid] = cfg[i].new_ctrl; + d->have_new_ctrl = true; + + rr->msr_update(r, d, NULL, partid); + } + } + + return 0; +} + /* * Check whether a cache bit mask is valid. The SDM says: * Please note that all (and only) contiguous '1' combinations @@ -683,26 +714,88 @@ int resctrl_mkdir_ctrlmon_mondata(struct kernfs_node *parent_kn, return ret; }
-/* Initialize the RDT group's allocations. */ -int rdtgroup_init_alloc(struct rdtgroup *rdtgrp) +/* Initialize MBA resource with default values. */ +static void rdtgroup_init_mba(struct resctrl_resource *r, u32 closid) { - struct mpam_resctrl_res *res; - struct resctrl_resource *r; + struct resctrl_staged_config *cfg; struct rdt_domain *d; - int ret;
- for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; + list_for_each_entry(d, &r->domains, list) { + cfg = &d->staged_cfg[CDP_BOTH]; + cfg->new_ctrl = r->default_ctrl; + resctrl_cdp_map(clos, closid, CDP_BOTH, cfg->hw_closid); + cfg->have_new_ctrl = true; + } +}
- if (!r->alloc_enabled) - continue; +/* + * Initialize cache resources with default values. + * + * A new resctrl group is being created on an allocation capable (CAT) + * supporting system. Set this group up to start off with all usable + * allocations. + * + * If there are no more shareable bits available on any domain then + * the entire allocation will fail. + */ +static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) +{ + struct resctrl_staged_config *cfg; + enum resctrl_conf_type t = s->conf_type; + struct rdt_domain *d; + struct resctrl_resource *r; + u32 used_b = 0; + u32 unused_b = 0; + unsigned long tmp_cbm;
- list_for_each_entry(d, &r->domains, list) { - d->new_ctrl = r->default_ctrl; - d->have_new_ctrl = true; + r = s->res; + if (WARN_ON(!r)) + return -EINVAL; + + list_for_each_entry(d, &s->res->domains, list) { + cfg = &d->staged_cfg[t]; + cfg->have_new_ctrl = false; + cfg->new_ctrl = r->cache.shareable_bits; + used_b = r->cache.shareable_bits; + + unused_b = used_b ^ (BIT_MASK(r->cache.cbm_len) - 1); + unused_b &= BIT_MASK(r->cache.cbm_len) - 1; + cfg->new_ctrl |= unused_b; + + /* Ensure cbm does not access out-of-bound */ + tmp_cbm = cfg->new_ctrl; + if (bitmap_weight(&tmp_cbm, r->cache.cbm_len) < + r->cache.min_cbm_bits) { + rdt_last_cmd_printf("No space on %s:%d\n", + r->name, d->id); + return -ENOSPC; + } + + resctrl_cdp_map(clos, closid, t, cfg->hw_closid); + cfg->have_new_ctrl = true; + } + + return 0; +} + +/* Initialize the resctrl group's allocations. */ +int resctrl_group_init_alloc(struct rdtgroup *rdtgrp) +{ + struct resctrl_schema *s; + struct resctrl_resource *r; + int ret; + + list_for_each_entry(s, &resctrl_all_schema, list) { + r = s->res; + if (r->rid == RDT_RESOURCE_MC) { + rdtgroup_init_mba(r, rdtgrp->closid); + } else { + ret = rdtgroup_init_cat(s, rdtgrp->closid); + if (ret < 0) + return ret; }
- ret = update_domains(r, rdtgrp); + ret = resctrl_group_update_domains(rdtgrp, r); if (ret < 0) { rdt_last_cmd_puts("Failed to initialize allocations\n"); return ret; diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index 9fbabed56015..acec4a1c9021 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -700,7 +700,7 @@ static int resctrl_group_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
rdtgrp->closid = closid;
- ret = rdtgroup_init_alloc(rdtgrp); + ret = resctrl_group_init_alloc(rdtgrp); if (ret < 0) goto out_id_free;
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
schemata labels each row with a hw_closid, which can be parsed into a closid according to fixed rules (LxCODE and MBA rows are given closid, LxDATA rows are given closid + 1), so the maximum number of rdtgroups that can be created is also halved when CDP is enabled.
The list of Lx cache domains displayed in schemata is compressed, because on a given hardware platform each resource may have too many domains to be conveniently handled in user interaction.
This patch also moves parse_cbm() and parse_bw() to mpam_resctrl.c for clarity.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 7 +- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 237 +++++++++----------------- arch/arm64/kernel/mpam/mpam_resctrl.c | 118 ++++++++++++- 3 files changed, 202 insertions(+), 160 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 6a90cc9661a2..b0bab6153db8 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -251,6 +251,8 @@ struct rdt_domain { struct resctrl_staged_config staged_cfg[CDP_NUM_CONF_TYPE]; };
+#define RESCTRL_SHOW_DOM_MAX_NUM 8 + extern struct mutex resctrl_group_mutex;
extern struct resctrl_resource resctrl_resources_all[]; @@ -336,16 +338,13 @@ struct raw_resctrl_resource { int data_width; const char *format_str; int (*parse_ctrlval)(char *buf, struct raw_resctrl_resource *r, - struct rdt_domain *d); + struct resctrl_staged_config *cfg, hw_closid_t closid);
u16 num_mon; u64 (*mon_read)(struct rdt_domain *d, struct rdtgroup *g); int (*mon_write)(struct rdt_domain *d, struct rdtgroup *g, bool enable); };
-int parse_cbm(char *buf, struct raw_resctrl_resource *r, struct rdt_domain *d); -int parse_bw(char *buf, struct raw_resctrl_resource *r, struct rdt_domain *d); - union mon_data_bits { void *priv; struct { diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index acb4696d48b1..f04b90667095 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -149,124 +149,21 @@ static int resctrl_group_update_domains(struct rdtgroup *rdtgrp, return 0; }
-/* - * Check whether a cache bit mask is valid. The SDM says: - * Please note that all (and only) contiguous '1' combinations - * are allowed (e.g. FFFFH, 0FF0H, 003CH, etc.). - * Additionally Haswell requires at least two bits set. - */ -static bool cbm_validate(char *buf, unsigned long *data, struct raw_resctrl_resource *r) -{ - u64 val; - int ret; - - ret = kstrtou64(buf, 16, &val); - if (ret) { - rdt_last_cmd_printf("non-hex character in mask %s\n", buf); - return false; - } - - *data = val; - return true; -} - -/* - * Read one cache bit mask (hex). Check that it is valid for the current - * resource type. - */ -int parse_cbm(char *buf, struct raw_resctrl_resource *r, struct rdt_domain *d) -{ - unsigned long data; - - if (d->have_new_ctrl) { - rdt_last_cmd_printf("duplicate domain %d\n", d->id); - return -EINVAL; - } - - if (!cbm_validate(buf, &data, r)) - return -EINVAL; - - d->new_ctrl = data; - d->have_new_ctrl = true; - - return 0; -} - -/* define bw_min as 5 percentage, that are 5% ~ 100% which cresponding masks: */ -static u32 bw_max_mask[20] = { - 3, /* 3/64: 5% */ - 6, /* 6/64: 10% */ - 10, /* 10/64: 15% */ - 13, /* 13/64: 20% */ - 16, /* 16/64: 25% */ - 19, /* ... */ - 22, - 26, - 29, - 32, - 35, - 38, - 42, - 45, - 48, - 51, - 54, - 58, - 61, - 63 /* 100% */ -}; - -static bool bw_validate(char *buf, unsigned long *data, struct raw_resctrl_resource *r) -{ - unsigned long bw; - int ret, idx; - - ret = kstrtoul(buf, 10, &bw); - if (ret) { - rdt_last_cmd_printf("non-hex character in mask %s\n", buf); - return false; - } - - bw = bw < 5 ? 5 : bw; - bw = bw > 100 ? 
100 : bw; - - idx = roundup(bw, 5) / 5 - 1; - - *data = bw_max_mask[idx]; - return true; -} - -int parse_bw(char *buf, struct raw_resctrl_resource *r, struct rdt_domain *d) -{ - unsigned long data; - - if (d->have_new_ctrl) { - rdt_last_cmd_printf("duplicate domain %d\n", d->id); - return -EINVAL; - } - - if (!bw_validate(buf, &data, r)) - return -EINVAL; - - d->new_ctrl = data; - d->have_new_ctrl = true; - - return 0; -} - /* * For each domain in this resource we expect to find a series of: * id=mask * separated by ";". The "id" is in decimal, and must match one of * the "id"s for this resource. */ -static int parse_line(char *line, struct resctrl_resource *r) +static int parse_line(char *line, struct resctrl_resource *r, + enum resctrl_conf_type t, u32 closid) { - struct raw_resctrl_resource *rr = (struct raw_resctrl_resource *)r->res; - char *dom = NULL, *id; + struct raw_resctrl_resource *rr = r->res; + char *dom = NULL; + char *id; struct rdt_domain *d; unsigned long dom_id; - + hw_closid_t hw_closid;
next: if (!line || line[0] == '\0') @@ -280,7 +177,8 @@ static int parse_line(char *line, struct resctrl_resource *r) dom = strim(dom); list_for_each_entry(d, &r->domains, list) { if (d->id == dom_id) { - if (rr->parse_ctrlval(dom, (struct raw_resctrl_resource *)&r->res, d)) + resctrl_cdp_map(clos, closid, t, hw_closid); + if (rr->parse_ctrlval(dom, rr, &d->staged_cfg[t], hw_closid)) return -EINVAL; goto next; } @@ -288,40 +186,29 @@ static int parse_line(char *line, struct resctrl_resource *r) return -EINVAL; }
-static int update_domains(struct resctrl_resource *r, struct rdtgroup *g) +static int +resctrl_group_parse_schema_resource(char *resname, char *tok, u32 closid) { - struct raw_resctrl_resource *rr; - struct rdt_domain *d; - int partid = g->closid; - - rr = (struct raw_resctrl_resource *)r->res; - list_for_each_entry(d, &r->domains, list) { - if (d->have_new_ctrl && d->new_ctrl != d->ctrl_val[partid]) { - d->ctrl_val[partid] = d->new_ctrl; - rr->msr_update(r, d, NULL, partid); - } - } - - return 0; -} - -static int resctrl_group_parse_resource(char *resname, char *tok, int closid) -{ - struct mpam_resctrl_res *res; struct resctrl_resource *r; - struct raw_resctrl_resource *rr; + struct resctrl_schema *s; + enum resctrl_conf_type t;
- for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; + list_for_each_entry(s, &resctrl_all_schema, list) { + r = s->res; + + if (!r) + continue;
if (r->alloc_enabled) { - rr = (struct raw_resctrl_resource *)r->res; - if (!strcmp(resname, r->name) && closid < - mpam_sysprops_num_partid()) - return parse_line(tok, r); + if (!strcmp(resname, s->name) && + closid < mpam_sysprops_num_partid()) { + t = conf_name_to_conf_type(s->name); + return parse_line(tok, r, t, closid); + } } } rdt_last_cmd_printf("unknown/unsupported resource name '%s'\n", resname); + return -EINVAL; }
@@ -330,10 +217,13 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, { struct rdtgroup *rdtgrp; struct rdt_domain *dom; - struct mpam_resctrl_res *res; struct resctrl_resource *r; + struct mpam_resctrl_res *res; + enum resctrl_conf_type conf_type; + struct resctrl_staged_config *cfg; char *tok, *resname; - int closid, ret = 0; + u32 closid; + int ret = 0;
/* Valid input requires a trailing newline */ if (nbytes == 0 || buf[nbytes - 1] != '\n') @@ -345,6 +235,7 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, resctrl_group_kn_unlock(of->kn); return -ENOENT; } + rdt_last_cmd_clear();
closid = rdtgrp->closid; @@ -353,8 +244,13 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, r = &res->resctrl_res;
if (r->alloc_enabled) { - list_for_each_entry(dom, &r->domains, list) + list_for_each_entry(dom, &r->domains, list) { dom->have_new_ctrl = false; + for_each_conf_type(conf_type) { + cfg = &dom->staged_cfg[conf_type]; + cfg->have_new_ctrl = false; + } + } } }
@@ -370,16 +266,15 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, ret = -EINVAL; goto out; } - ret = resctrl_group_parse_resource(resname, tok, closid); + ret = resctrl_group_parse_schema_resource(resname, tok, closid); if (ret) goto out; }
for_each_supported_resctrl_exports(res) { r = &res->resctrl_res; - if (r->alloc_enabled) { - ret = update_domains(r, rdtgrp); + ret = resctrl_group_update_domains(rdtgrp, r); if (ret) goto out; } @@ -390,42 +285,73 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, return ret ?: nbytes; }
-static void show_doms(struct seq_file *s, struct resctrl_resource *r, int partid) +/** + * MPAM resources such as L2 may have too many domains for arm64, + * at this time we should rearrange this display for brevity and + * harmonious interaction. + * + * Before rearrangement: L2:0=ff;1=ff;2=fc;3=ff;4=f;....;255=ff + * After rearrangement: L2:S;2=fc;S;4=f;S + * Those continuous fully sharable domains will be combined into + * a single "S" simply. + */ +static void show_doms(struct seq_file *s, struct resctrl_resource *r, + char *schema_name, int partid) { - struct raw_resctrl_resource *rr = (struct raw_resctrl_resource *)r->res; + struct raw_resctrl_resource *rr = r->res; struct rdt_domain *dom; bool sep = false; + bool rg = false; + bool prev_auto_fill = false; + u32 reg_val; + + if (r->dom_num > RESCTRL_SHOW_DOM_MAX_NUM) + rg = true;
- seq_printf(s, "%*s:", max_name_width, r->name); + seq_printf(s, "%*s:", max_name_width, schema_name); list_for_each_entry(dom, &r->domains, list) { + reg_val = rr->msr_read(dom, partid); + + if (rg && reg_val == r->default_ctrl && + prev_auto_fill == true) + continue; + if (sep) seq_puts(s, ";"); - seq_printf(s, rr->format_str, dom->id, max_data_width, - rr->msr_read(dom, partid)); + if (rg && reg_val == r->default_ctrl) { + prev_auto_fill = true; + seq_puts(s, "S"); + } else { + seq_printf(s, rr->format_str, dom->id, + max_data_width, reg_val); + } sep = true; } seq_puts(s, "\n"); }
int resctrl_group_schemata_show(struct kernfs_open_file *of, - struct seq_file *s, void *v) + struct seq_file *s, void *v) { struct rdtgroup *rdtgrp; - struct mpam_resctrl_res *res; struct resctrl_resource *r; - struct raw_resctrl_resource *rr; + struct resctrl_schema *rs; int ret = 0; + hw_closid_t hw_closid; u32 partid;
rdtgrp = resctrl_group_kn_lock_live(of->kn); if (rdtgrp) { - partid = rdtgrp->closid; - for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; + list_for_each_entry(rs, &resctrl_all_schema, list) { + r = rs->res; + if (!r) + continue; if (r->alloc_enabled) { - rr = (struct raw_resctrl_resource *)r->res; + resctrl_cdp_map(clos, rdtgrp->closid, + rs->conf_type, hw_closid); + partid = hw_closid_val(hw_closid); if (partid < mpam_sysprops_num_partid()) - show_doms(s, r, partid); + show_doms(s, r, rs->name, partid); } } } else { @@ -711,6 +637,7 @@ int resctrl_mkdir_ctrlmon_mondata(struct kernfs_node *parent_kn, rdt_last_cmd_puts("kernfs subdir error\n"); free_mon(ret); } + return ret; }
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index aa7a3d1402ec..2508bf666742 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -133,6 +133,11 @@ static inline bool is_mon_dyn(u32 mon) return (mon == mpam_resctrl_max_mon_num()) ? true : false; }
+static int parse_cbm(char *buf, struct raw_resctrl_resource *r, + struct resctrl_staged_config *cfg, hw_closid_t hw_closid); +static int parse_bw(char *buf, struct raw_resctrl_resource *r, + struct resctrl_staged_config *cfg, hw_closid_t hw_closid); + struct raw_resctrl_resource raw_resctrl_resources_all[] = { [RDT_RESOURCE_L3] = { .msr_update = common_wrmsr, @@ -169,6 +174,116 @@ mpam_get_raw_resctrl_resource(enum resctrl_resource_level level) return &raw_resctrl_resources_all[level]; }
+/* + * Check whether a cache bit mask is valid. for arm64 MPAM, + * it seems that there are no restrictions according to MPAM + * spec expect for requiring at least one bit. + */ +static bool cbm_validate(char *buf, unsigned long *data, + struct raw_resctrl_resource *r) +{ + u64 val; + int ret; + + ret = kstrtou64(buf, 16, &val); + if (ret) { + rdt_last_cmd_printf("non-hex character in mask %s\n", buf); + return false; + } + + *data = val; + return true; +} + +/* + * Read one cache bit mask (hex). Check that it is valid for the current + * resource type. + */ +static int +parse_cbm(char *buf, struct raw_resctrl_resource *r, + struct resctrl_staged_config *cfg, hw_closid_t hw_closid) +{ + unsigned long data; + + if (cfg->have_new_ctrl) { + rdt_last_cmd_printf("duplicate domain\n"); + return -EINVAL; + } + + if (!cbm_validate(buf, &data, r)) + return -EINVAL; + + cfg->new_ctrl = data; + cfg->have_new_ctrl = true; + cfg->hw_closid = hw_closid; + + return 0; +} + +/* define bw_min as 5 percentage, that are 5% ~ 100% which cresponding masks: */ +static u32 bw_max_mask[20] = { + 3, /* 3/64: 5% */ + 6, /* 6/64: 10% */ + 10, /* 10/64: 15% */ + 13, /* 13/64: 20% */ + 16, /* 16/64: 25% */ + 19, /* ... */ + 22, + 26, + 29, + 32, + 35, + 38, + 42, + 45, + 48, + 51, + 54, + 58, + 61, + 63 /* 100% */ +}; + +static bool bw_validate(char *buf, unsigned long *data, + struct raw_resctrl_resource *r) +{ + unsigned long bw; + int ret; + + ret = kstrtoul(buf, 10, &bw); + if (ret) { + rdt_last_cmd_printf("non-hex character in mask %s\n", buf); + return false; + } + + bw = bw < 5 ? 5 : bw; + bw = bw > 100 ? 
100 : bw; + *data = roundup(bw, 5); + + return true; +} + +static int +parse_bw(char *buf, struct raw_resctrl_resource *r, + struct resctrl_staged_config *cfg, hw_closid_t hw_closid) +{ + unsigned long data; + + if (cfg->have_new_ctrl) { + rdt_last_cmd_printf("duplicate domain\n"); + return -EINVAL; + } + + if (!bw_validate(buf, &data, r)) + return -EINVAL; + + cfg->new_ctrl = data; + cfg->have_new_ctrl = true; + cfg->hw_closid = hw_closid; + + return 0; +} + static void common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, struct list_head *opt_list, int partid) @@ -1307,7 +1422,8 @@ mpam_update_from_resctrl_cfg(struct mpam_resctrl_res *res, mpam_set_feature(mpam_feat_mbw_part, &mpam_cfg->valid); } else { /* .. the number of fractions we can represent */ - mpam_cfg->mbw_max = resctrl_cfg; + mpam_cfg->mbw_max = bw_max_mask[(resctrl_cfg / 5 - 1) % + ARRAY_SIZE(bw_max_mask)];
mpam_set_feature(mpam_feat_mbw_max, &mpam_cfg->valid); }
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
For MPAM, processing elements (PEs) issue memory-system requests; PEs must implement the MPAMn_ELx registers and their behaviours to generate the PARTID and PMG fields of those requests.
Now that schemata supports CDP writing and reading, to carry MPAM info from the CPU to downstream MSCs the SYS_MPAMx_ELx registers must be filled in: both partid_d and partid_i (mapped from closids) for LxDATA and LxCODE, and pmg_d and pmg_i (mapped from the rmid).
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_resctrl.c | 69 ++++++++++++++++++++------- 1 file changed, 52 insertions(+), 17 deletions(-)
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 2508bf666742..b3924fca81f2 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -1371,7 +1371,8 @@ int __init mpam_resctrl_init(void) void __mpam_sched_in(void) { struct intel_pqr_state *state = this_cpu_ptr(&pqr_state); - u64 partid = state->default_closid; + u64 closid = state->default_closid; + u64 partid_d, partid_i; u64 pmg = state->default_rmid;
/* @@ -1380,7 +1381,7 @@ void __mpam_sched_in(void) */ if (static_branch_likely(&resctrl_alloc_enable_key)) { if (current->closid) - partid = current->closid; + closid = current->closid; }
if (static_branch_likely(&resctrl_mon_enable_key)) { @@ -1388,22 +1389,56 @@ void __mpam_sched_in(void) pmg = current->rmid; }
- if (partid != state->cur_closid || pmg != state->cur_rmid) { + if (closid != state->cur_closid || pmg != state->cur_rmid) { u64 reg; - state->cur_closid = partid; - state->cur_rmid = pmg; - - /* set in EL0 */ - reg = mpam_read_sysreg_s(SYS_MPAM0_EL1, "SYS_MPAM0_EL1"); - reg = PARTID_SET(reg, partid); - reg = PMG_SET(reg, pmg); - mpam_write_sysreg_s(reg, SYS_MPAM0_EL1, "SYS_MPAM0_EL1"); - - /* set in EL1 */ - reg = mpam_read_sysreg_s(SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); - reg = PARTID_SET(reg, partid); - reg = PMG_SET(reg, pmg); - mpam_write_sysreg_s(reg, SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); + + if (resctrl_cdp_enabled) { + hw_closid_t hw_closid; + + resctrl_cdp_map(clos, closid, CDP_DATA, hw_closid); + partid_d = hw_closid_val(hw_closid); + + resctrl_cdp_map(clos, closid, CDP_CODE, hw_closid); + partid_i = hw_closid_val(hw_closid); + + /* + * when cdp enabled, we use partid_i to label cur_closid + * of cpu state instead of partid_d, because each task/ + * rdtgrp's closid is labeled by CDP_BOTH/CDP_CODE but not + * CDP_DATA. 
+ */ + state->cur_closid = partid_i; + state->cur_rmid = pmg; + + /* set in EL0 */ + reg = mpam_read_sysreg_s(SYS_MPAM0_EL1, "SYS_MPAM0_EL1"); + reg = PARTID_D_SET(reg, partid_d); + reg = PARTID_I_SET(reg, partid_i); + reg = PMG_SET(reg, pmg); + mpam_write_sysreg_s(reg, SYS_MPAM0_EL1, "SYS_MPAM0_EL1"); + + /* set in EL1 */ + reg = mpam_read_sysreg_s(SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); + reg = PARTID_D_SET(reg, partid_d); + reg = PARTID_I_SET(reg, partid_i); + reg = PMG_SET(reg, pmg); + mpam_write_sysreg_s(reg, SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); + } else { + state->cur_closid = closid; + state->cur_rmid = pmg; + + /* set in EL0 */ + reg = mpam_read_sysreg_s(SYS_MPAM0_EL1, "SYS_MPAM0_EL1"); + reg = PARTID_SET(reg, closid); + reg = PMG_SET(reg, pmg); + mpam_write_sysreg_s(reg, SYS_MPAM0_EL1, "SYS_MPAM0_EL1"); + + /* set in EL1 */ + reg = mpam_read_sysreg_s(SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); + reg = PARTID_SET(reg, closid); + reg = PMG_SET(reg, pmg); + mpam_write_sysreg_s(reg, SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); + } } }
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
resctrl_resource_reset() is executed when the resctrl sysfs is unmounted; this helps reset all settings stored in the related structures (such as mpam_cfg) and puts the MSCs back into their default state.
This is similar to 6ab0b81f2c18 ("arm64/mpam: Fix unreset resources when mkdir ctrl group or umount resctrl") but uses helpers from the mpam devices module.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_device.c | 16 +++++++ arch/arm64/kernel/mpam/mpam_internal.h | 2 + arch/arm64/kernel/mpam/mpam_resctrl.c | 58 +++++++++++++++++--------- 3 files changed, 57 insertions(+), 19 deletions(-)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 11cd7a5a6785..62ca0952cadc 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -1302,6 +1302,22 @@ int mpam_component_config(struct mpam_component *comp, struct sync_args *args) return do_device_sync(comp, &sync_ctx); }
+/* + * Reset every component, configuring every partid unrestricted. + */ +void mpam_reset_devices(void) +{ + struct mpam_class *class; + struct mpam_component *comp; + + mutex_lock(&mpam_devices_lock); + list_for_each_entry(class, &mpam_classes, classes_list) { + list_for_each_entry(comp, &class->components, class_list) + mpam_component_config(comp, NULL); + } + mutex_unlock(&mpam_devices_lock); +} + static inline void mpam_device_sync_mon_prepare(struct mpam_component *comp, struct mpam_device_sync *sync_ctx, struct sync_args *args) diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 8ab019fd8938..1a31d563bc41 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -159,6 +159,8 @@ static inline bool mpam_has_part_sel(mpam_features_t supported) int mpam_component_config(struct mpam_component *comp, struct sync_args *args);
+void mpam_reset_devices(void); + int mpam_component_mon(struct mpam_component *comp, struct sync_args *args, u64 *result);
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index b3924fca81f2..82f73c802c9f 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -531,25 +531,6 @@ void post_resctrl_mount(void) static_branch_enable_cpuslocked(&resctrl_enable_key); }
-static int reset_all_ctrls(struct resctrl_resource *r) -{ - return 0; -} - -void resctrl_resource_reset(void) -{ - struct mpam_resctrl_res *res; - struct resctrl_resource *r; - - /*Put everything back to default values. */ - for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; - - if (r->alloc_enabled) - reset_all_ctrls(r); - } -} - void release_rdtgroupfs_options(void) { } @@ -1503,6 +1484,45 @@ mpam_resctrl_update_component_cfg(struct resctrl_resource *r, mpam_update_from_resctrl_cfg(res, resctrl_cfg, mpam_cfg); }
+static void mpam_reset_cfg(struct mpam_resctrl_res *res, + struct mpam_resctrl_dom *dom, struct rdt_domain *d) + +{ + int i; + struct resctrl_resource *r = &res->resctrl_res; + + for (i = 0; i != mpam_sysprops_num_partid(); i++) { + mpam_update_from_resctrl_cfg(res, r->default_ctrl, + &dom->comp->cfg[i]); + d->ctrl_val[i] = r->default_ctrl; + } +} + +void resctrl_resource_reset(void) +{ + struct mpam_resctrl_res *res; + struct mpam_resctrl_dom *dom; + struct rdt_domain *d; + + for_each_supported_resctrl_exports(res) { + if (!res->resctrl_res.alloc_capable) + continue; + + list_for_each_entry(d, &res->resctrl_res.domains, list) { + dom = container_of(d, struct mpam_resctrl_dom, + resctrl_dom); + mpam_reset_cfg(res, dom, d); + } + } + + mpam_reset_devices(); + + /* + * reset CDP configuration used in recreating schema list nodes. + */ + resctrl_cdp_enabled = false; +} + u16 mpam_resctrl_max_mon_num(void) { struct mpam_resctrl_res *res;
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Replace the u32 bitmask with a bitmap for closid allocation, because the number of closids may be too large to track in 32 bits.
This also supports CDP: when CDP is enabled, two closids are allocated at a time, giving closid to the code side (LxCODE) and closid+1 to the data side (LxDATA); freeing follows the same pairing.
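The pairing logic can be sketched with a small userspace model (hypothetical code, not the kernel implementation: `map_set`/`map_clear`/`map_test` stand in for the kernel's bitmap helpers, and the CDP rounding and default-group reservation follow the description above):

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define MAX_CLOSID 32
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static unsigned long closid_free_map[(MAX_CLOSID + BITS_PER_LONG - 1) / BITS_PER_LONG];
static int num_closid;
static int cdp_enabled;

static void map_set(int pos)   { closid_free_map[pos / BITS_PER_LONG] |= 1UL << (pos % BITS_PER_LONG); }
static void map_clear(int pos) { closid_free_map[pos / BITS_PER_LONG] &= ~(1UL << (pos % BITS_PER_LONG)); }
static int  map_test(int pos)  { return !!(closid_free_map[pos / BITS_PER_LONG] & (1UL << (pos % BITS_PER_LONG))); }

static int closid_init(int hw_num_closid, int cdp)
{
	int i, times = cdp ? 2 : 1;

	cdp_enabled = cdp;
	num_closid = hw_num_closid < MAX_CLOSID ? hw_num_closid : MAX_CLOSID;
	if (cdp)
		num_closid &= ~1;	/* round down to an even count */

	memset(closid_free_map, 0, sizeof(closid_free_map));
	for (i = 0; i < num_closid; i++)
		map_set(i);
	/* closid 0 (and 1 under CDP) is reserved for the default group */
	for (i = 0; i < times; i++)
		map_clear(i);
	return 0;
}

static int closid_alloc(void)
{
	int pos, i, times = cdp_enabled ? 2 : 1;

	for (pos = 0; pos < num_closid; pos += times)
		if (map_test(pos))
			break;
	if (pos >= num_closid)
		return -1;		/* -ENOSPC in the kernel */
	for (i = 0; i < times; i++)
		map_clear(pos + i);
	return pos;			/* code side; pos + 1 is the data side */
}

static void closid_free(int closid)
{
	int i, times = cdp_enabled ? 2 : 1;

	for (i = 0; i < times; i++)
		map_set(closid + i);
}
```

Under CDP the allocator always hands out an even closid for the code side, with closid+1 implicitly reserved for the data side.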
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 7 +-- arch/arm64/include/asm/resctrl.h | 14 +++++- arch/arm64/kernel/mpam/mpam_resctrl.c | 62 +++++++++++++++++++++------ fs/resctrlfs.c | 6 ++- 4 files changed, 67 insertions(+), 22 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index b0bab6153db8..d4cb6672f7b9 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -168,10 +168,7 @@ do { \ __result = as_hw_t(__name, __closid); \ } while (0)
-static inline bool is_resctrl_cdp_enabled(void) -{ - return 0; -} +bool is_resctrl_cdp_enabled(void);
#define hw_alloc_times_validate(__name, __times, __flag) \ do { \ @@ -269,7 +266,7 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg); void rmdir_mondata_subdir_allrdtgrp(struct resctrl_resource *r, unsigned int dom_id);
-void closid_init(void); +int closid_init(void); int closid_alloc(void); void closid_free(int closid);
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index f44feeb6b496..408b4a02d7c7 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -91,10 +91,18 @@ static inline void free_mon_id(u32 id) }
void pmg_init(void); -static inline void resctrl_id_init(void) +static inline int resctrl_id_init(void) { - closid_init(); + int ret; + + ret = closid_init(); + if (ret) + goto out; + pmg_init(); + +out: + return ret; }
static inline int resctrl_id_alloc(void) @@ -136,4 +144,6 @@ int resctrl_group_init_alloc(struct rdtgroup *rdtgrp); struct resctrl_resource * mpam_resctrl_get_resource(enum resctrl_resource_level level);
+#define RESCTRL_MAX_CLOSID 32 + #endif /* _ASM_ARM64_RESCTRL_H */ diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 82f73c802c9f..bfee2845236a 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -107,6 +107,11 @@ void mpam_resctrl_clear_default_cpu(unsigned int cpu) cpumask_clear_cpu(cpu, &resctrl_group_default.cpu_mask); }
+bool is_resctrl_cdp_enabled(void) +{ + return !!resctrl_cdp_enabled; +} + static void mpam_resctrl_update_component_cfg(struct resctrl_resource *r, struct rdt_domain *d, struct list_head *opt_list, u32 partid); @@ -448,8 +453,8 @@ static int common_wrmon(struct rdt_domain *d, struct rdtgroup *g, bool enable) }
/* - * Trivial allocator for CLOSIDs. Since h/w only supports a small number, - * we can keep a bitmap of free CLOSIDs in a single integer. + * Note: resctrl_id_init() must be called after parse_resctrl_group_fs_options() + * so that is_resctrl_cdp_enabled() reflects the mount options. * * Using a global CLOSID across all resources has some advantages and * some drawbacks: @@ -462,35 +467,64 @@ static int common_wrmon(struct rdt_domain *d, struct rdtgroup *g, bool enable) * - Our choices on how to configure each resource become progressively more * limited as the number of resources grows. */ -static int closid_free_map;
-void closid_init(void) +static unsigned long *closid_free_map; +static int num_closid; + +int closid_init(void) { - int num_closid = INT_MAX; + int pos; + u32 times, flag; + + bitmap_free(closid_free_map);
num_closid = mpam_sysprops_num_partid(); + num_closid = min(num_closid, RESCTRL_MAX_CLOSID); + + hw_alloc_times_validate(clos, times, flag); + + if (flag) + num_closid = rounddown(num_closid, 2); + + closid_free_map = bitmap_zalloc(num_closid, GFP_KERNEL); + if (!closid_free_map) + return -ENOMEM;
- closid_free_map = BIT_MASK(num_closid) - 1; + bitmap_set(closid_free_map, 0, num_closid);
/* CLOSID 0 is always reserved for the default group */ - closid_free_map &= ~1; -} + pos = find_first_bit(closid_free_map, num_closid); + bitmap_clear(closid_free_map, pos, times);
+ return 0; +} + +/* + * If cdp is enabled, allocate two closids at a time and return the + * first allocated id. + */ int closid_alloc(void) { - u32 closid = ffs(closid_free_map); + int pos; + u32 times, flag; + + hw_alloc_times_validate(clos, times, flag);
- if (closid == 0) + pos = find_first_bit(closid_free_map, num_closid); + if (pos == num_closid) return -ENOSPC; - closid--; - closid_free_map &= ~(1 << closid);
- return closid; + bitmap_clear(closid_free_map, pos, times); + + return pos; }
void closid_free(int closid) { - closid_free_map |= 1 << closid; + u32 times, flag; + + hw_alloc_times_validate(clos, times, flag); + bitmap_set(closid_free_map, closid, times); }
/* diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index acec4a1c9021..e4f4f2b35079 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -342,7 +342,11 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, goto out_options; } #endif - resctrl_id_init(); + ret = resctrl_id_init(); + if (ret) { + dentry = ERR_PTR(ret); + goto out_options; + }
ret = resctrl_group_create_info_dir(resctrl_group_default.kn); if (ret) {
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Move the resctrl ctrlmon write/read functions to mpam_ctrlmon.c to make the code clearer.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 6 ++ arch/arm64/kernel/mpam/mpam_ctrlmon.c | 109 +++++++++++++++++++++++++ arch/arm64/kernel/mpam/mpam_resctrl.c | 110 -------------------------- 3 files changed, 115 insertions(+), 110 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index d4cb6672f7b9..b81aec481784 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -355,6 +355,12 @@ union mon_data_bits { struct rdt_domain *mpam_find_domain(struct resctrl_resource *r, int id, struct list_head **pos);
+ssize_t resctrl_group_ctrlmon_write(struct kernfs_open_file *of, + char *buf, size_t nbytes, loff_t off); + +int resctrl_group_ctrlmon_show(struct kernfs_open_file *of, + struct seq_file *s, void *v); + int resctrl_group_alloc_mon(struct rdtgroup *grp);
void mon_init(void); diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index f04b90667095..481b4d54f2ba 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -489,6 +489,115 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn, return ret; }
+int resctrl_ctrlmon_enable(struct kernfs_node *parent_kn, + struct resctrl_group *prgrp, + struct kernfs_node **dest_kn) +{ + int ret; + + /* only for RDTCTRL_GROUP */ + if (prgrp->type == RDTMON_GROUP) + return 0; + + ret = alloc_mon(); + if (ret < 0) { + rdt_last_cmd_puts("out of monitors\n"); + pr_info("out of monitors: ret %d\n", ret); + return ret; + } + prgrp->mon.mon = ret; + prgrp->mon.rmid = 0; + + ret = mkdir_mondata_all(parent_kn, prgrp, dest_kn); + if (ret) { + rdt_last_cmd_puts("kernfs subdir error\n"); + free_mon(ret); + } + + return ret; +} + +void resctrl_ctrlmon_disable(struct kernfs_node *kn_mondata, + struct resctrl_group *prgrp) +{ + struct mpam_resctrl_res *r; + struct resctrl_resource *resctrl_res; + struct raw_resctrl_resource *rr; + struct rdt_domain *dom; + int mon = prgrp->mon.mon; + + /* only for RDTCTRL_GROUP */ + if (prgrp->type == RDTMON_GROUP) + return; + + for_each_resctrl_exports(r) { + resctrl_res = &r->resctrl_res; + + if (resctrl_res->mon_enabled) { + rr = (struct raw_resctrl_resource *)resctrl_res->res; + + list_for_each_entry(dom, &resctrl_res->domains, list) { + rr->mon_write(dom, prgrp, false); + } + } + } + + free_mon(mon); + kernfs_remove(kn_mondata); +} + +ssize_t resctrl_group_ctrlmon_write(struct kernfs_open_file *of, + char *buf, size_t nbytes, loff_t off) +{ + struct rdtgroup *rdtgrp; + int ret = 0; + int ctrlmon; + + if (kstrtoint(strstrip(buf), 0, &ctrlmon) || ctrlmon < 0) + return -EINVAL; + rdtgrp = resctrl_group_kn_lock_live(of->kn); + rdt_last_cmd_clear(); + + if (!rdtgrp) { + ret = -ENOENT; + goto unlock; + } + + if ((rdtgrp->flags & RDT_CTRLMON) && !ctrlmon) { + /* disable & remove mon_data dir */ + rdtgrp->flags &= ~RDT_CTRLMON; + resctrl_ctrlmon_disable(rdtgrp->mon.mon_data_kn, rdtgrp); + } else if (!(rdtgrp->flags & RDT_CTRLMON) && ctrlmon) { + ret = resctrl_ctrlmon_enable(rdtgrp->kn, rdtgrp, + &rdtgrp->mon.mon_data_kn); + if (!ret) + rdtgrp->flags |= RDT_CTRLMON; + } else { + ret = -ENOENT; + } + +unlock: 
+ resctrl_group_kn_unlock(of->kn); + return ret ?: nbytes; +} + +int resctrl_group_ctrlmon_show(struct kernfs_open_file *of, + struct seq_file *s, void *v) +{ + struct rdtgroup *rdtgrp; + int ret = 0; + + rdtgrp = resctrl_group_kn_lock_live(of->kn); + if (rdtgrp) + seq_printf(s, "%d", !!(rdtgrp->flags & RDT_CTRLMON)); + else + ret = -ENOENT; + resctrl_group_kn_unlock(of->kn); + + return ret; +} + + static int mkdir_mondata_subdir_alldom(struct kernfs_node *parent_kn, struct resctrl_resource *r, struct resctrl_group *prgrp) diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index bfee2845236a..df327de7fe48 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -1138,116 +1138,6 @@ static int resctrl_group_tasks_show(struct kernfs_open_file *of, return ret; }
-int resctrl_ctrlmon_enable(struct kernfs_node *parent_kn, - struct resctrl_group *prgrp, - struct kernfs_node **dest_kn) -{ - int ret; - - /* only for RDTCTRL_GROUP */ - if (prgrp->type == RDTMON_GROUP) - return 0; - - ret = alloc_mon(); - if (ret < 0) { - rdt_last_cmd_puts("out of monitors\n"); - pr_info("out of monitors: ret %d\n", ret); - return ret; - } - prgrp->mon.mon = ret; - prgrp->mon.rmid = 0; - - ret = mkdir_mondata_all(parent_kn, prgrp, dest_kn); - if (ret) { - rdt_last_cmd_puts("kernfs subdir error\n"); - free_mon(ret); - } - - return ret; -} - -void resctrl_ctrlmon_disable(struct kernfs_node *kn_mondata, - struct resctrl_group *prgrp) -{ - struct mpam_resctrl_res *r; - struct resctrl_resource *resctrl_res; - struct raw_resctrl_resource *rr; - struct rdt_domain *dom; - int mon = prgrp->mon.mon; - - /* only for RDTCTRL_GROUP */ - if (prgrp->type == RDTMON_GROUP) - return; - - for_each_supported_resctrl_exports(r) { - resctrl_res = &r->resctrl_res; - - if (resctrl_res->mon_enabled) { - rr = (struct raw_resctrl_resource *)resctrl_res->res; - - list_for_each_entry(dom, &resctrl_res->domains, list) { - rr->mon_write(dom, prgrp, false); - } - } - } - - free_mon(mon); - kernfs_remove(kn_mondata); - - return; -} - -static ssize_t resctrl_group_ctrlmon_write(struct kernfs_open_file *of, - char *buf, size_t nbytes, loff_t off) -{ - struct rdtgroup *rdtgrp; - int ret = 0; - int ctrlmon; - - if (kstrtoint(strstrip(buf), 0, &ctrlmon) || ctrlmon < 0) - return -EINVAL; - rdtgrp = resctrl_group_kn_lock_live(of->kn); - rdt_last_cmd_clear(); - - if (!rdtgrp) { - ret = -ENOENT; - goto unlock; - } - - if ((rdtgrp->flags & RDT_CTRLMON) && !ctrlmon) { - /* disable & remove mon_data dir */ - rdtgrp->flags &= ~RDT_CTRLMON; - resctrl_ctrlmon_disable(rdtgrp->mon.mon_data_kn, rdtgrp); - } else if (!(rdtgrp->flags & RDT_CTRLMON) && ctrlmon) { - ret = resctrl_ctrlmon_enable(rdtgrp->kn, rdtgrp, - &rdtgrp->mon.mon_data_kn); - if (!ret) - rdtgrp->flags |= RDT_CTRLMON; - } else { - 
ret = -ENOENT; - } - -unlock: - resctrl_group_kn_unlock(of->kn); - return ret ?: nbytes; -} - -static int resctrl_group_ctrlmon_show(struct kernfs_open_file *of, - struct seq_file *s, void *v) -{ - struct rdtgroup *rdtgrp; - int ret = 0; - - rdtgrp = resctrl_group_kn_lock_live(of->kn); - if (rdtgrp) - seq_printf(s, "%d", !!(rdtgrp->flags & RDT_CTRLMON)); - else - ret = -ENOENT; - resctrl_group_kn_unlock(of->kn); - - return ret; -} - /* rdtgroup information files for one cache resource. */ static struct rftype res_specific_files[] = { {
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
This prepares for monitoring LxDATA and LxCODE simultaneously when CDP is enabled. In our implementation LxCODE and LxDATA are allocated closid and closid+1 respectively, so two monitors must be reserved at a time, one for each.
The reason each closid needs its own monitor when CDP is enabled, rather than switching a single monitor between LxCODE and LxDATA, is that a monitor attached to a closid may stay busy for a long time; forcibly switching it would make the results inaccurate.
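A minimal model of the paired monitor allocation, assuming a 32-bit free map and a local `GENMASK32` macro that follows the kernel's GENMASK(h, l) argument order (high bit first, low bit second):

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

/* GENMASK(h, l): bits l..h set, modelled on include/linux/bits.h */
#define GENMASK32(h, l) \
	(((~0u) - (1u << (l)) + 1) & (~0u >> (31 - (h))))

static unsigned int mon_free_map;

static void mon_init(unsigned int num_mon)
{
	mon_free_map = GENMASK32(num_mon - 1, 0);
}

/* Allocate `times` contiguous monitors (2 under CDP: code + data). */
static int alloc_mon(unsigned int times)
{
	int mon = ffs(mon_free_map);

	if (mon == 0)
		return -1;	/* -ENOSPC in the kernel */
	mon--;
	/* clear bits mon .. mon + times - 1 (high bit first) */
	mon_free_map &= ~GENMASK32(mon + times - 1, mon);
	return mon;
}

static void free_mon(unsigned int mon, unsigned int times)
{
	mon_free_map |= GENMASK32(mon + times - 1, mon);
}
```

With `times == 2` the code side gets `mon` and the data side `mon + 1`, mirroring the closid/closid+1 pairing.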
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_mon.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/mpam/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c index f952e9aa20c2..297169b41ea3 100644 --- a/arch/arm64/kernel/mpam/mpam_mon.c +++ b/arch/arm64/kernel/mpam/mpam_mon.c @@ -93,20 +93,28 @@ void mon_init(void)
int alloc_mon(void) { - u32 mon = ffs(mon_free_map); + u32 mon = 0; + u32 times, flag;
+ hw_alloc_times_validate(mon, times, flag); + + mon = ffs(mon_free_map); if (mon == 0) return -ENOSPC;
mon--; - mon_free_map &= ~(1 << mon); + mon_free_map &= ~(GENMASK(mon + times - 1, mon));
return mon; }
void free_mon(u32 mon) { - mon_free_map |= 1 << mon; + u32 times, flag; + + hw_alloc_times_validate(mon, times, flag); + + mon_free_map |= GENMASK(mon + times - 1, mon); }
/*
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
commit 43be0de7be8f ("arm64/mpam: Support cdp on allocating monitors") allows us to allocate two monitors at a time; we attach these two monitors to different monitor sysfiles under the mon_data directory according to their closids, as the following illustrates.
resctrl/
+-- schemata
|     L3CODE:0=xx        # closid
|     L3DATA:1=xx        # closid+1
|     MB:0=xx            # closid
+-- mon_data/
    +-- mon_L3CODE_00    # monitor
    +-- mon_L3DATA_00    # monitor+1
    +-- mon_MB_00        # monitor
When monitoring happens, we read each monitor sysfile's private data, which contains the closid, monitor and pmg used to obtain the monitoring data.
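The private-data packing can be modelled with a plain C union mirroring the patch's mon_data_bits (a sketch: field names follow the patch, and `encode`/`decode` are hypothetical helpers relying on a 64-bit pointer being wide enough to hold the five u8 fields):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of union mon_data_bits: identity fields packed into a
 * pointer-sized value stored as kernfs private data. */
union mon_data_bits {
	void *priv;
	struct {
		uint8_t rid;
		uint8_t domid;
		uint8_t partid;	/* hw closid: closid (code) or closid+1 (data) */
		uint8_t pmg;
		uint8_t mon;	/* hw monitor: mon (code) or mon+1 (data) */
	} u;
};

static void *encode(uint8_t rid, uint8_t domid, uint8_t partid,
		    uint8_t pmg, uint8_t mon)
{
	union mon_data_bits md;

	memset(&md, 0, sizeof(md));	/* zero unused pointer bytes */
	md.u.rid = rid;
	md.u.domid = domid;
	md.u.partid = partid;
	md.u.pmg = pmg;
	md.u.mon = mon;
	return md.priv;		/* five u8 fields fit in a 64-bit pointer */
}

static union mon_data_bits decode(void *priv)
{
	union mon_data_bits md;

	md.priv = priv;
	return md;
}
```

Reading the sysfile then only needs the kernfs private pointer: decode it and pass partid/mon/pmg straight into the monitor read path.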
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 6 ++-- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 47 ++++++++++++++++++--------- arch/arm64/kernel/mpam/mpam_resctrl.c | 45 ++++++++++++++++--------- 3 files changed, 65 insertions(+), 33 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index b81aec481784..82b9887270a1 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -338,10 +338,11 @@ struct raw_resctrl_resource { struct resctrl_staged_config *cfg, hw_closid_t closid);
u16 num_mon; - u64 (*mon_read)(struct rdt_domain *d, struct rdtgroup *g); - int (*mon_write)(struct rdt_domain *d, struct rdtgroup *g, bool enable); + u64 (*mon_read)(struct rdt_domain *d, void *md_priv); + int (*mon_write)(struct rdt_domain *d, void *md_priv, bool enable); };
+/* 64-bit arm64 specific */ union mon_data_bits { void *priv; struct { @@ -349,6 +350,7 @@ union mon_data_bits { u8 domid; u8 partid; u8 pmg; + u8 mon; } u; };
diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 481b4d54f2ba..d1f7ffd25b69 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -425,7 +425,7 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) goto out; }
- usage = rr->mon_read(d, rdtgrp); + usage = rr->mon_read(d, md.priv); seq_printf(m, "%llu\n", usage);
out: @@ -454,22 +454,31 @@ static int resctrl_group_kn_set_ugid(struct kernfs_node *kn) }
static int mkdir_mondata_subdir(struct kernfs_node *parent_kn, - struct rdt_domain *d, - struct resctrl_resource *r, struct resctrl_group *prgrp) + struct rdt_domain *d, struct resctrl_schema *s, + struct resctrl_group *prgrp) + { - struct raw_resctrl_resource *rr = (struct raw_resctrl_resource *)r->res; + struct resctrl_resource *r; + struct raw_resctrl_resource *rr; + hw_closid_t hw_closid; + hw_monid_t hw_monid; union mon_data_bits md; struct kernfs_node *kn; char name[32]; int ret = 0;
+ r = s->res; + rr = r->res;
md.u.rid = r->rid; md.u.domid = d->id; - md.u.partid = prgrp->closid; + resctrl_cdp_map(clos, prgrp->closid, s->conf_type, hw_closid); + md.u.partid = hw_closid_val(hw_closid); + resctrl_cdp_map(mon, prgrp->mon.mon, s->conf_type, hw_monid); + md.u.mon = hw_monid_val(hw_monid); md.u.pmg = prgrp->mon.rmid;
- snprintf(name, sizeof(name), "mon_%s_%02d", r->name, d->id); + snprintf(name, sizeof(name), "mon_%s_%02d", s->name, d->id); kn = __kernfs_create_file(parent_kn, name, 0444, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0, &kf_mondata_ops, md.priv, NULL, NULL); @@ -484,7 +493,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn, }
/* Could we remove the MATCH_* param ? */ - rr->mon_write(d, prgrp, true); + rr->mon_write(d, md.priv, true);
return ret; } @@ -599,14 +608,15 @@ int resctrl_group_ctrlmon_show(struct kernfs_open_file *of,
static int mkdir_mondata_subdir_alldom(struct kernfs_node *parent_kn, - struct resctrl_resource *r, - struct resctrl_group *prgrp) + struct resctrl_schema *s, struct resctrl_group *prgrp) { + struct resctrl_resource *r; struct rdt_domain *dom; int ret;
+ r = s->res; list_for_each_entry(dom, &r->domains, list) { - ret = mkdir_mondata_subdir(parent_kn, dom, r, prgrp); + ret = mkdir_mondata_subdir(parent_kn, dom, s, prgrp); if (ret) return ret; } @@ -672,7 +682,7 @@ int mkdir_mondata_all(struct kernfs_node *parent_kn, struct resctrl_group *prgrp, struct kernfs_node **dest_kn) { - struct mpam_resctrl_res *res; + struct resctrl_schema *s; struct resctrl_resource *r; struct kernfs_node *kn; int ret; @@ -691,16 +701,23 @@ int mkdir_mondata_all(struct kernfs_node *parent_kn, * Create the subdirectories for each domain. Note that all events * in a domain like L3 are grouped into a resource whose domain is L3 */ - for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; + list_for_each_entry(s, &resctrl_all_schema, list) { + r = s->res;
if (r->mon_enabled) { /* HHA does not support monitor by pmg */ + struct raw_resctrl_resource *rr; + + rr = r->res; + /* + * num pmg of different resources varies, we just + * skip creating those unqualified ones. + */ if ((prgrp->type == RDTMON_GROUP) && - (r->rid == RDT_RESOURCE_MC)) + (prgrp->mon.rmid >= rr->num_pmg)) continue;
- ret = mkdir_mondata_subdir_alldom(kn, r, prgrp); + ret = mkdir_mondata_subdir_alldom(kn, s, prgrp); if (ret) goto out_destroy; } diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index df327de7fe48..76881ba127c9 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -123,11 +123,10 @@ common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, static u64 cache_rdmsr(struct rdt_domain *d, int partid); static u64 mbw_rdmsr(struct rdt_domain *d, int partid);
-static u64 cache_rdmon(struct rdt_domain *d, struct rdtgroup *g); -static u64 mbw_rdmon(struct rdt_domain *d, struct rdtgroup *g); +static u64 cache_rdmon(struct rdt_domain *d, void *md_priv); +static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv);
-static int common_wrmon(struct rdt_domain *d, struct rdtgroup *g, - bool enable); +static int common_wrmon(struct rdt_domain *d, void *md_priv, bool enable);
static inline bool is_mon_dyn(u32 mon) { @@ -342,22 +341,27 @@ static u64 mbw_rdmsr(struct rdt_domain *d, int partid) * use pmg as monitor id * just use match_pardid only. */ -static u64 cache_rdmon(struct rdt_domain *d, struct rdtgroup *g) +static u64 cache_rdmon(struct rdt_domain *d, void *md_priv) { int err; u64 result; + union mon_data_bits md; struct sync_args args; struct mpam_resctrl_dom *dom; - u32 mon = g->mon.mon; + u32 mon; unsigned long timeout;
+ md.priv = md_priv; + + mon = md.u.mon; + /* Indicates whether allocating a monitor dynamically*/ if (is_mon_dyn(mon)) mon = alloc_mon();
- args.partid = g->closid; + args.partid = md.u.partid; args.mon = mon; - args.pmg = g->mon.rmid; + args.pmg = md.u.pmg; args.match_pmg = true; args.eventid = QOS_L3_OCCUP_EVENT_ID;
@@ -387,21 +391,26 @@ static u64 cache_rdmon(struct rdt_domain *d, struct rdtgroup *g) * use pmg as monitor id * just use match_pardid only. */ -static u64 mbw_rdmon(struct rdt_domain *d, struct rdtgroup *g) +static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv) { int err; u64 result; + union mon_data_bits md; struct sync_args args; struct mpam_resctrl_dom *dom; - u32 mon = g->mon.mon; + u32 mon; unsigned long timeout;
+ md.priv = md_priv; + + mon = md.u.mon; + if (is_mon_dyn(mon)) mon = alloc_mon();
- args.partid = g->closid; + args.partid = md.u.partid; args.mon = mon; - args.pmg = g->mon.rmid; + args.pmg = md.u.pmg; args.match_pmg = true; args.eventid = QOS_L3_MBM_LOCAL_EVENT_ID;
@@ -427,18 +436,22 @@ static u64 mbw_rdmon(struct rdt_domain *d, struct rdtgroup *g) return result; }
-static int common_wrmon(struct rdt_domain *d, struct rdtgroup *g, bool enable) +static int +common_wrmon(struct rdt_domain *d, void *md_priv, bool enable) { u64 result; + union mon_data_bits md; struct sync_args args; struct mpam_resctrl_dom *dom;
if (!enable) return -EINVAL;
- args.partid = g->closid; - args.mon = g->mon.mon; - args.pmg = g->mon.rmid; + md.priv = md_priv; + args.partid = md.u.partid; + args.mon = md.u.mon; + args.pmg = md.u.pmg; + args.match_pmg = true;
dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom);
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Rearrange where the helpers for resctrlfs are declared and clean up the included header files; this makes the code clearer.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 44 ++++---------------------- arch/arm64/include/asm/resctrl.h | 40 ++++++++++++++++++++--- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 4 --- arch/arm64/kernel/mpam/mpam_device.c | 2 +- arch/arm64/kernel/mpam/mpam_device.h | 1 - arch/arm64/kernel/mpam/mpam_internal.h | 1 + arch/arm64/kernel/mpam/mpam_mon.c | 2 -- arch/arm64/kernel/mpam/mpam_resctrl.c | 2 -- arch/arm64/kernel/mpam/mpam_setup.c | 2 -- 9 files changed, 43 insertions(+), 55 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 82b9887270a1..52a334cce91a 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -115,9 +115,6 @@ DECLARE_STATIC_KEY_FALSE(resctrl_enable_key); DECLARE_STATIC_KEY_FALSE(resctrl_mon_enable_key);
-extern bool rdt_alloc_capable; -extern bool rdt_mon_capable; - extern int max_name_width, max_data_width;
enum resctrl_conf_type { @@ -207,11 +204,6 @@ struct resctrl_schema { struct resctrl_resource *res; };
- -/* rdtgroup.flags */ -#define RDT_DELETED BIT(0) -#define RDT_CTRLMON BIT(1) - /** * struct rdt_domain - group of cpus sharing an RDT resource * @list: all instances of this resource @@ -250,35 +242,13 @@ struct rdt_domain {
#define RESCTRL_SHOW_DOM_MAX_NUM 8
-extern struct mutex resctrl_group_mutex; - -extern struct resctrl_resource resctrl_resources_all[]; - int __init resctrl_group_init(void);
-void rdt_last_cmd_clear(void); -void rdt_last_cmd_puts(const char *s); -void rdt_last_cmd_printf(const char *fmt, ...); - -int alloc_rmid(void); -void free_rmid(u32 rmid); int resctrl_group_mondata_show(struct seq_file *m, void *arg); void rmdir_mondata_subdir_allrdtgrp(struct resctrl_resource *r, unsigned int dom_id);
-int closid_init(void); -int closid_alloc(void); -void closid_free(int closid); - int cdp_enable(int level, int data_type, int code_type); -void resctrl_resource_reset(void); -void release_rdtgroupfs_options(void); -int parse_rdtgroupfs_options(char *data); - -static inline int __resctrl_group_show_options(struct seq_file *seq) -{ - return 0; -}
void post_resctrl_mount(void);
@@ -354,6 +324,12 @@ union mon_data_bits { } u; };
+ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, + char *buf, size_t nbytes, loff_t off); + +int resctrl_group_schemata_show(struct kernfs_open_file *of, + struct seq_file *s, void *v); + struct rdt_domain *mpam_find_domain(struct resctrl_resource *r, int id, struct list_head **pos);
@@ -365,14 +341,6 @@ int resctrl_group_ctrlmon_show(struct kernfs_open_file *of,
int resctrl_group_alloc_mon(struct rdtgroup *grp);
-void mon_init(void); -int alloc_mon(void); -void free_mon(u32 mon); - -int resctrl_mkdir_ctrlmon_mondata(struct kernfs_node *parent_kn, - struct rdtgroup *prgrp, - struct kernfs_node **dest_kn); - u16 mpam_resctrl_max_mon_num(void);
#endif /* _ASM_ARM64_MPAM_H */ diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 408b4a02d7c7..90b7683dd4dd 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -2,7 +2,6 @@ #define _ASM_ARM64_RESCTRL_H
#include <asm/mpam_sched.h> -#include <asm/mpam.h>
#define resctrl_group rdtgroup #define resctrl_alloc_capable rdt_alloc_capable @@ -79,6 +78,9 @@ int schemata_list_init(void);
void schemata_list_destroy(void);
+int alloc_rmid(void); +void free_rmid(u32 rmid); + static inline int alloc_mon_id(void) {
@@ -90,7 +92,11 @@ static inline void free_mon_id(u32 id) free_rmid(id); }
+int closid_init(void); +int closid_alloc(void); +void closid_free(int closid); void pmg_init(void); + static inline int resctrl_id_init(void) { int ret; @@ -120,11 +126,26 @@ void update_closid_rmid(const struct cpumask *cpu_mask, struct resctrl_group *r) int __resctrl_group_move_task(struct task_struct *tsk, struct resctrl_group *rdtgrp);
-ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, - char *buf, size_t nbytes, loff_t off); +extern bool rdt_alloc_capable; +extern bool rdt_mon_capable; + +/* rdtgroup.flags */ +#define RDT_DELETED BIT(0) +#define RDT_CTRLMON BIT(1) + +void rdt_last_cmd_clear(void); +void rdt_last_cmd_puts(const char *s); +void rdt_last_cmd_printf(const char *fmt, ...); + +extern struct mutex resctrl_group_mutex; + +void release_rdtgroupfs_options(void); +int parse_rdtgroupfs_options(char *data);
-int resctrl_group_schemata_show(struct kernfs_open_file *of, - struct seq_file *s, void *v); +int alloc_mon(void); +void free_mon(u32 mon); + +void resctrl_resource_reset(void);
#define release_resctrl_group_fs_options release_rdtgroupfs_options #define parse_resctrl_group_fs_options parse_rdtgroupfs_options @@ -141,6 +162,15 @@ mongroup_create_dir(struct kernfs_node *parent_kn, struct resctrl_group *prgrp,
int resctrl_group_init_alloc(struct rdtgroup *rdtgrp);
+static inline int __resctrl_group_show_options(struct seq_file *seq) +{ + return 0; +} + +int resctrl_mkdir_ctrlmon_mondata(struct kernfs_node *parent_kn, + struct rdtgroup *prgrp, + struct kernfs_node **dest_kn); + struct resctrl_resource * mpam_resctrl_get_resource(enum resctrl_resource_level level);
diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index d1f7ffd25b69..a94a1f2c5847 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -33,11 +33,8 @@ #include <linux/kernfs.h> #include <linux/seq_file.h> #include <linux/slab.h> -#include <linux/resctrlfs.h>
-#include <asm/mpam.h> #include <asm/mpam_resource.h> -#include <asm/resctrl.h> #include "mpam_internal.h"
/* schemata content list */ @@ -705,7 +702,6 @@ int mkdir_mondata_all(struct kernfs_node *parent_kn, r = s->res;
if (r->mon_enabled) { - /* HHA does not support monitor by pmg */ struct raw_resctrl_resource *rr;
rr = r->res; diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 62ca0952cadc..356362ecdc79 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -33,9 +33,9 @@ #include <linux/cacheinfo.h> #include <linux/arm_mpam.h> #include <asm/mpam_resource.h> -#include <asm/mpam.h>
#include "mpam_device.h" +#include "mpam_internal.h"
/* * During discovery this lock protects writers to class, components and devices. diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h index 3165d6b1a270..9930ca70e0ce 100644 --- a/arch/arm64/kernel/mpam/mpam_device.h +++ b/arch/arm64/kernel/mpam/mpam_device.h @@ -6,7 +6,6 @@ #include <linux/cpumask.h> #include <linux/types.h> #include <linux/arm_mpam.h> -#include "mpam_internal.h"
struct mpam_config;
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 1a31d563bc41..57a08a78bb6e 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -3,6 +3,7 @@ #define _ASM_ARM64_MPAM_INTERNAL_H
#include <linux/resctrlfs.h> +#include <asm/mpam.h> #include <asm/resctrl.h>
typedef u32 mpam_features_t; diff --git a/arch/arm64/kernel/mpam/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c index 297169b41ea3..bb681d1ab7ad 100644 --- a/arch/arm64/kernel/mpam/mpam_mon.c +++ b/arch/arm64/kernel/mpam/mpam_mon.c @@ -28,8 +28,6 @@
#include <linux/module.h> #include <linux/slab.h> -#include <linux/resctrlfs.h> -#include <asm/resctrl.h>
#include "mpam_internal.h"
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 76881ba127c9..aafe20473acf 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -37,12 +37,10 @@ #include <linux/task_work.h> #include <linux/sched/signal.h> #include <linux/sched/task.h> -#include <linux/resctrlfs.h> #include <linux/arm_mpam.h>
#include <asm/mpam_sched.h> #include <asm/mpam_resource.h> -#include <asm/resctrl.h> #include <asm/io.h>
#include "mpam_device.h" diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index 4ad178c083ea..b01716392a65 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -28,8 +28,6 @@
#include <linux/slab.h> #include <linux/err.h> -#include <linux/resctrlfs.h> -#include <asm/resctrl.h>
#include "mpam_device.h" #include "mpam_internal.h"
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
We redesign the monitoring process for the user, as the following illustrates:
e.g. before rewriting:
    mount /sys/fs/resctrl && cd /sys/fs/resctrl
    mkdir p1 && cd p1
    echo 1 > ctrlmon      # allocate a monitor resource for this group
    ...                   # associate tasks/cpus with this group
    grep . mon_data/*     # get monitoring data from the mon_data directory

e.g. after rewriting:
    mount /sys/fs/resctrl && cd /sys/fs/resctrl
    mkdir p1 && cd p1     # automatically allocates a monitor resource
    ...                   # associate tasks/cpus with this group
    grep . mon_data/*     # directly get monitoring data
ctrlmon was used to manually allocate a monitor resource for monitoring a specified group (labelled by partid and pmg). We delete ctrlmon because this manual step is redundant.

With ctrlmon, the user had to know which group held an available monitor resource, and a monitor could only be reallocated to a new group after it had been released. This is unnecessary: a monitor resource is only in use while monitoring is actually in progress, so an idle monitor can be assigned to multiple groups and take effect whenever monitoring happens.
One restriction should be noted: a monitor resource used for monitoring cache occupancy may need to be kept for a long time, until it is no longer needed or its count drops below a threshold (much like the Intel RDT limbo list works). Otherwise, forcibly switching a monitor resource from one group to another may yield monitoring results that are unexpectedly small.
We deliver a simple LRU monitor resource allocation mechanism. So far it just assigns monitors in the order in which groups were created; this is incomplete and needs subsequent improvement.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 13 +- arch/arm64/include/asm/resctrl.h | 63 +------ arch/arm64/kernel/mpam/mpam_ctrlmon.c | 240 +------------------------- arch/arm64/kernel/mpam/mpam_mon.c | 75 ++++---- arch/arm64/kernel/mpam/mpam_resctrl.c | 79 ++++----- fs/resctrlfs.c | 198 ++++++++++++++++----- 6 files changed, 250 insertions(+), 418 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 52a334cce91a..ec2fc0f2eadb 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -167,7 +167,7 @@ do { \
bool is_resctrl_cdp_enabled(void);
-#define hw_alloc_times_validate(__name, __times, __flag) \ +#define hw_alloc_times_validate(__times, __flag) \ do { \ __flag = is_resctrl_cdp_enabled(); \ __times = flag ? 2 : 1; \ @@ -309,7 +309,7 @@ struct raw_resctrl_resource {
u16 num_mon; u64 (*mon_read)(struct rdt_domain *d, void *md_priv); - int (*mon_write)(struct rdt_domain *d, void *md_priv, bool enable); + int (*mon_write)(struct rdt_domain *d, void *md_priv); };
/* 64bit arm64 specified */ @@ -333,14 +333,11 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, struct rdt_domain *mpam_find_domain(struct resctrl_resource *r, int id, struct list_head **pos);
-ssize_t resctrl_group_ctrlmon_write(struct kernfs_open_file *of, - char *buf, size_t nbytes, loff_t off); - -int resctrl_group_ctrlmon_show(struct kernfs_open_file *of, - struct seq_file *s, void *v); - int resctrl_group_alloc_mon(struct rdtgroup *grp);
u16 mpam_resctrl_max_mon_num(void);
+void pmg_init(void); +void mon_init(void); + #endif /* _ASM_ARM64_MPAM_H */ diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 90b7683dd4dd..68e515ea8779 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -78,48 +78,14 @@ int schemata_list_init(void);
void schemata_list_destroy(void);
-int alloc_rmid(void); -void free_rmid(u32 rmid); +int resctrl_lru_request_mon(void);
-static inline int alloc_mon_id(void) -{ - - return alloc_rmid(); -} - -static inline void free_mon_id(u32 id) -{ - free_rmid(id); -} +int alloc_mon_id(void); +void free_mon_id(u32 id);
-int closid_init(void); -int closid_alloc(void); -void closid_free(int closid); -void pmg_init(void); - -static inline int resctrl_id_init(void) -{ - int ret; - - ret = closid_init(); - if (ret) - goto out; - - pmg_init(); - -out: - return ret; -} - -static inline int resctrl_id_alloc(void) -{ - return closid_alloc(); -} - -static inline void resctrl_id_free(int id) -{ - closid_free(id); -} +int resctrl_id_init(void); +int resctrl_id_alloc(void); +void resctrl_id_free(int id);
void update_cpu_closid_rmid(void *info); void update_closid_rmid(const struct cpumask *cpu_mask, struct resctrl_group *r); @@ -131,7 +97,6 @@ extern bool rdt_mon_capable;
/* rdtgroup.flags */ #define RDT_DELETED BIT(0) -#define RDT_CTRLMON BIT(1)
void rdt_last_cmd_clear(void); void rdt_last_cmd_puts(const char *s); @@ -142,9 +107,6 @@ extern struct mutex resctrl_group_mutex; void release_rdtgroupfs_options(void); int parse_rdtgroupfs_options(char *data);
-int alloc_mon(void); -void free_mon(u32 mon); - void resctrl_resource_reset(void);
#define release_resctrl_group_fs_options release_rdtgroupfs_options @@ -152,14 +114,6 @@ void resctrl_resource_reset(void);
int mpam_get_mon_config(struct resctrl_resource *r);
-int mkdir_mondata_all(struct kernfs_node *parent_kn, - struct resctrl_group *prgrp, - struct kernfs_node **dest_kn); - -int -mongroup_create_dir(struct kernfs_node *parent_kn, struct resctrl_group *prgrp, - char *name, struct kernfs_node **dest_kn); - int resctrl_group_init_alloc(struct rdtgroup *rdtgrp);
static inline int __resctrl_group_show_options(struct seq_file *seq) @@ -167,9 +121,8 @@ static inline int __resctrl_group_show_options(struct seq_file *seq) return 0; }
-int resctrl_mkdir_ctrlmon_mondata(struct kernfs_node *parent_kn, - struct rdtgroup *prgrp, - struct kernfs_node **dest_kn); +int resctrl_mkdir_mondata_all_subdir(struct kernfs_node *parent_kn, + struct resctrl_group *prgrp);
struct resctrl_resource * mpam_resctrl_get_resource(enum resctrl_resource_level level); diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index a94a1f2c5847..0c324008d9ab 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -450,7 +450,7 @@ static int resctrl_group_kn_set_ugid(struct kernfs_node *kn) return kernfs_setattr(kn, &iattr); }
-static int mkdir_mondata_subdir(struct kernfs_node *parent_kn, +static int resctrl_mkdir_mondata_dom(struct kernfs_node *parent_kn, struct rdt_domain *d, struct resctrl_schema *s, struct resctrl_group *prgrp)
@@ -490,121 +490,12 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn, }
/* Could we remove the MATCH_* param ? */ - rr->mon_write(d, md.priv, true); + rr->mon_write(d, md.priv);
return ret; }
-int resctrl_ctrlmon_enable(struct kernfs_node *parent_kn, - struct resctrl_group *prgrp, - struct kernfs_node **dest_kn) -{ - int ret; - - /* only for RDTCTRL_GROUP */ - if (prgrp->type == RDTMON_GROUP) - return 0; - - ret = alloc_mon(); - if (ret < 0) { - rdt_last_cmd_puts("out of monitors\n"); - pr_info("out of monitors: ret %d\n", ret); - return ret; - } - prgrp->mon.mon = ret; - prgrp->mon.rmid = 0; - - ret = mkdir_mondata_all(parent_kn, prgrp, dest_kn); - if (ret) { - rdt_last_cmd_puts("kernfs subdir error\n"); - free_mon(ret); - } - - return ret; -} - -void resctrl_ctrlmon_disable(struct kernfs_node *kn_mondata, - struct resctrl_group *prgrp) -{ - struct mpam_resctrl_res *r; - struct resctrl_resource *resctrl_res; - struct raw_resctrl_resource *rr; - struct rdt_domain *dom; - int mon = prgrp->mon.mon; - - /* only for RDTCTRL_GROUP */ - if (prgrp->type == RDTMON_GROUP) - return; - - for_each_resctrl_exports(r) { - resctrl_res = &r->resctrl_res; - - if (resctrl_res->mon_enabled) { - rr = (struct raw_resctrl_resource *)resctrl_res->res; - - list_for_each_entry(dom, &resctrl_res->domains, list) { - rr->mon_write(dom, prgrp, false); - } - } - } - - free_mon(mon); - kernfs_remove(kn_mondata); -} - -ssize_t resctrl_group_ctrlmon_write(struct kernfs_open_file *of, - char *buf, size_t nbytes, loff_t off) -{ - struct rdtgroup *rdtgrp; - int ret = 0; - int ctrlmon; - - if (kstrtoint(strstrip(buf), 0, &ctrlmon) || ctrlmon < 0) - return -EINVAL; - rdtgrp = resctrl_group_kn_lock_live(of->kn); - rdt_last_cmd_clear(); - - if (!rdtgrp) { - ret = -ENOENT; - goto unlock; - } - - if ((rdtgrp->flags & RDT_CTRLMON) && !ctrlmon) { - /* disable & remove mon_data dir */ - rdtgrp->flags &= ~RDT_CTRLMON; - resctrl_ctrlmon_disable(rdtgrp->mon.mon_data_kn, rdtgrp); - } else if (!(rdtgrp->flags & RDT_CTRLMON) && ctrlmon) { - ret = resctrl_ctrlmon_enable(rdtgrp->kn, rdtgrp, - &rdtgrp->mon.mon_data_kn); - if (!ret) - rdtgrp->flags |= RDT_CTRLMON; - } else { - ret = -ENOENT; - } - -unlock: 
- resctrl_group_kn_unlock(of->kn); - return ret ?: nbytes; -} - -int resctrl_group_ctrlmon_show(struct kernfs_open_file *of, - struct seq_file *s, void *v) -{ - struct rdtgroup *rdtgrp; - int ret = 0; - - rdtgrp = resctrl_group_kn_lock_live(of->kn); - if (rdtgrp) - seq_printf(s, "%d", !!(rdtgrp->flags & RDT_CTRLMON)); - else - ret = -ENOENT; - resctrl_group_kn_unlock(of->kn); - - return ret; -} - - -static int mkdir_mondata_subdir_alldom(struct kernfs_node *parent_kn, +static int resctrl_mkdir_mondata_subdir_alldom(struct kernfs_node *parent_kn, struct resctrl_schema *s, struct resctrl_group *prgrp) { struct resctrl_resource *r; @@ -613,7 +504,7 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_node *parent_kn,
r = s->res; list_for_each_entry(dom, &r->domains, list) { - ret = mkdir_mondata_subdir(parent_kn, dom, s, prgrp); + ret = resctrl_mkdir_mondata_dom(parent_kn, dom, s, prgrp); if (ret) return ret; } @@ -621,79 +512,13 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_node *parent_kn, return 0; }
-int -mongroup_create_dir(struct kernfs_node *parent_kn, struct resctrl_group *prgrp, - char *name, struct kernfs_node **dest_kn) -{ - struct kernfs_node *kn; - int ret; - - /* create the directory */ - kn = kernfs_create_dir(parent_kn, name, parent_kn->mode, prgrp); - if (IS_ERR(kn)) { - pr_info("%s: create dir %s, error\n", __func__, name); - return PTR_ERR(kn); - } - - if (dest_kn) - *dest_kn = kn; - - /* - * This extra ref will be put in kernfs_remove() and guarantees - * that @rdtgrp->kn is always accessible. - */ - kernfs_get(kn); - - ret = resctrl_group_kn_set_ugid(kn); - if (ret) - goto out_destroy; - - kernfs_activate(kn); - - return 0; - -out_destroy: - kernfs_remove(kn); - return ret; -} - - -/* - * This creates a directory mon_data which contains the monitored data. - * - * mon_data has one directory for each domain whic are named - * in the format mon_<domain_name>_<domain_id>. For ex: A mon_data - * with L3 domain looks as below: - * ./mon_data: - * mon_L3_00 - * mon_L3_01 - * mon_L3_02 - * ... - * - * Each domain directory has one file per event: - * ./mon_L3_00/: - * llc_occupancy - * - */ -int mkdir_mondata_all(struct kernfs_node *parent_kn, - struct resctrl_group *prgrp, - struct kernfs_node **dest_kn) +int resctrl_mkdir_mondata_all_subdir(struct kernfs_node *parent_kn, + struct resctrl_group *prgrp) { struct resctrl_schema *s; struct resctrl_resource *r; - struct kernfs_node *kn; int ret;
- /* - * Create the mon_data directory first. - */ - ret = mongroup_create_dir(parent_kn, prgrp, "mon_data", &kn); - if (ret) - return ret; - - if (dest_kn) - *dest_kn = kn; - /* * Create the subdirectories for each domain. Note that all events * in a domain like L3 are grouped into a resource whose domain is L3 @@ -705,61 +530,14 @@ int mkdir_mondata_all(struct kernfs_node *parent_kn, struct raw_resctrl_resource *rr;
rr = r->res; - /* - * num pmg of different resources varies, we just - * skip creating those unqualified ones. - */ - if ((prgrp->type == RDTMON_GROUP) && - (prgrp->mon.rmid >= rr->num_pmg)) - continue;
- ret = mkdir_mondata_subdir_alldom(kn, s, prgrp); + ret = resctrl_mkdir_mondata_subdir_alldom(parent_kn, + s, prgrp); if (ret) - goto out_destroy; + break; } }
- kernfs_activate(kn); - - return 0; - -out_destroy: - kernfs_remove(kn); - return ret; -} - -int resctrl_mkdir_ctrlmon_mondata(struct kernfs_node *parent_kn, - struct resctrl_group *prgrp, - struct kernfs_node **dest_kn) -{ - int ret; - - /* disalbe monitor by default for mpam. */ - if (prgrp->type == RDTCTRL_GROUP) - return 0; - - ret = alloc_mon(); - if (ret < 0) { - rdt_last_cmd_puts("out of monitors\n"); - return ret; - } - prgrp->mon.mon = ret; - - ret = alloc_mon_id(); - if (ret < 0) { - rdt_last_cmd_puts("out of PMGs\n"); - free_mon(prgrp->mon.mon); - return ret; - } - - prgrp->mon.rmid = ret; - - ret = mkdir_mondata_all(parent_kn, prgrp, dest_kn); - if (ret) { - rdt_last_cmd_puts("kernfs subdir error\n"); - free_mon(ret); - } - return ret; }
diff --git a/arch/arm64/kernel/mpam/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c index bb681d1ab7ad..29f84e251b1e 100644 --- a/arch/arm64/kernel/mpam/mpam_mon.c +++ b/arch/arm64/kernel/mpam/mpam_mon.c @@ -38,7 +38,6 @@ bool rdt_mon_capable;
static int pmg_free_map; -void mon_init(void); void pmg_init(void) { u16 num_pmg = USHRT_MAX; @@ -53,15 +52,13 @@ void pmg_init(void) num_pmg = min(num_pmg, rr->num_pmg); }
- mon_init(); - pmg_free_map = BIT_MASK(num_pmg) - 1;
/* pmg 0 is always reserved for the default group */ pmg_free_map &= ~1; }
-int alloc_pmg(void) +static int alloc_pmg(void) { u32 pmg = ffs(pmg_free_map);
@@ -74,58 +71,66 @@ int alloc_pmg(void) return pmg; }
-void free_pmg(u32 pmg) +static void free_pmg(u32 pmg) { pmg_free_map |= 1 << pmg; }
-static int mon_free_map; +int alloc_mon_id(void) +{ + return alloc_pmg(); +} + +void free_mon_id(u32 id) +{ + free_pmg(id); +} + +/* + * A simple LRU monitor allocation machanism, each + * monitor free map occupies two section, one for + * allocation and another for recording. + */ +static int mon_free_map[2]; +static u8 alloc_idx, record_idx; + void mon_init(void) { int num_mon; + u32 times, flag;
num_mon = mpam_resctrl_max_mon_num();
- mon_free_map = BIT_MASK(num_mon) - 1; + hw_alloc_times_validate(times, flag); + /* for cdp on or off */ + num_mon = rounddown(num_mon, times); + + mon_free_map[0] = BIT_MASK(num_mon) - 1; + mon_free_map[1] = 0; + + alloc_idx = 0; + record_idx = 1; }
-int alloc_mon(void) +int resctrl_lru_request_mon(void) { u32 mon = 0; u32 times, flag;
- hw_alloc_times_validate(mon, times, flag); + hw_alloc_times_validate(times, flag);
- mon = ffs(mon_free_map); + mon = ffs(mon_free_map[alloc_idx]); if (mon == 0) return -ENOSPC;
mon--; - mon_free_map &= ~(GENMASK(mon, mon + times - 1)); + mon_free_map[alloc_idx] &= ~(GENMASK(mon + times - 1, mon)); + mon_free_map[record_idx] |= GENMASK(mon + times - 1, mon);
- return mon; -} - -void free_mon(u32 mon) -{ - u32 times, flag; - - hw_alloc_times_validate(mon, times, flag); - - mon_free_map |= GENMASK(mon, mon + times - 1); -} - -/* - * As of now the RMIDs allocation is global. - * However we keep track of which packages the RMIDs - * are used to optimize the limbo list management. - */ -int alloc_rmid(void) -{ - return alloc_pmg(); -} + if (!mon_free_map[alloc_idx]) { + alloc_idx = record_idx; + record_idx ^= 0x1; + }
-void free_rmid(u32 pmg) -{ - free_pmg(pmg); + return mon; } diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index aafe20473acf..ce05b8037a4d 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -124,16 +124,7 @@ static u64 mbw_rdmsr(struct rdt_domain *d, int partid); static u64 cache_rdmon(struct rdt_domain *d, void *md_priv); static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv);
-static int common_wrmon(struct rdt_domain *d, void *md_priv, bool enable); - -static inline bool is_mon_dyn(u32 mon) -{ - /* - * if rdtgrp->mon.mon has been tagged with value (max_mon_num), - * allocating a monitor in dynamic when getting monitor data. - */ - return (mon == mpam_resctrl_max_mon_num()) ? true : false; -} +static int common_wrmon(struct rdt_domain *d, void *md_priv);
static int parse_cbm(char *buf, struct raw_resctrl_resource *r, struct resctrl_staged_config *cfg, hw_closid_t hw_closid); @@ -346,19 +337,12 @@ static u64 cache_rdmon(struct rdt_domain *d, void *md_priv) union mon_data_bits md; struct sync_args args; struct mpam_resctrl_dom *dom; - u32 mon; unsigned long timeout;
md.priv = md_priv;
- mon = md.u.mon; - - /* Indicates whether allocating a monitor dynamically*/ - if (is_mon_dyn(mon)) - mon = alloc_mon(); - args.partid = md.u.partid; - args.mon = mon; + args.mon = md.u.mon; args.pmg = md.u.pmg; args.match_pmg = true; args.eventid = QOS_L3_OCCUP_EVENT_ID; @@ -380,9 +364,6 @@ static u64 cache_rdmon(struct rdt_domain *d, void *md_priv) WARN_ON(err && (err != -EBUSY)); } while (err == -EBUSY);
- if (is_mon_dyn(mon)) - free_mon(mon); - return result; } /* @@ -396,18 +377,12 @@ static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv) union mon_data_bits md; struct sync_args args; struct mpam_resctrl_dom *dom; - u32 mon; unsigned long timeout;
md.priv = md_priv;
- mon = md.u.mon; - - if (is_mon_dyn(mon)) - mon = alloc_mon(); - args.partid = md.u.partid; - args.mon = mon; + args.mon = md.u.mon; args.pmg = md.u.pmg; args.match_pmg = true; args.eventid = QOS_L3_MBM_LOCAL_EVENT_ID; @@ -429,22 +404,17 @@ static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv) WARN_ON(err && (err != -EBUSY)); } while (err == -EBUSY);
- if (is_mon_dyn(mon)) - free_mon(mon); return result; }
static int -common_wrmon(struct rdt_domain *d, void *md_priv, bool enable) +common_wrmon(struct rdt_domain *d, void *md_priv) { u64 result; union mon_data_bits md; struct sync_args args; struct mpam_resctrl_dom *dom;
- if (!enable) - return -EINVAL; - md.priv = md_priv; args.partid = md.u.partid; args.mon = md.u.mon; @@ -493,7 +463,7 @@ int closid_init(void) num_closid = mpam_sysprops_num_partid(); num_closid = min(num_closid, RESCTRL_MAX_CLOSID);
- hw_alloc_times_validate(clos, times, flag); + hw_alloc_times_validate(times, flag);
if (flag) num_closid = rounddown(num_closid, 2); @@ -519,7 +489,7 @@ int closid_alloc(void) int pos; u32 times, flag;
- hw_alloc_times_validate(clos, times, flag); + hw_alloc_times_validate(times, flag);
pos = find_first_bit(closid_free_map, num_closid); if (pos == num_closid) @@ -534,7 +504,7 @@ void closid_free(int closid) { u32 times, flag;
- hw_alloc_times_validate(clos, times, flag); + hw_alloc_times_validate(times, flag); bitmap_set(closid_free_map, closid, times); }
@@ -1211,15 +1181,7 @@ static struct rftype res_specific_files[] = { .write = resctrl_group_schemata_write, .seq_show = resctrl_group_schemata_show, .fflags = RF_CTRL_BASE, - }, - { - .name = "ctrlmon", - .mode = 0644, - .kf_ops = &resctrl_group_kf_single_ops, - .write = resctrl_group_ctrlmon_write, - .seq_show = resctrl_group_ctrlmon_show, - .fflags = RF_CTRL_BASE, - }, + } };
struct rdt_domain *mpam_find_domain(struct resctrl_resource *r, int id, @@ -1479,3 +1441,28 @@ u16 mpam_resctrl_max_mon_num(void)
return mon_num; } + +int resctrl_id_init(void) +{ + int ret; + + ret = closid_init(); + if (ret) + goto out; + + pmg_init(); + mon_init(); + +out: + return ret; +} + +int resctrl_id_alloc(void) +{ + return closid_alloc(); +} + +void resctrl_id_free(int id) +{ + closid_free(id); +} diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index e4f4f2b35079..ebca3fb03e9a 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -313,6 +313,138 @@ void resctrl_group_kn_unlock(struct kernfs_node *kn) } }
+static int +mongroup_create_dir(struct kernfs_node *parent_kn, struct resctrl_group *prgrp, + char *name, struct kernfs_node **dest_kn) +{ + struct kernfs_node *kn; + int ret; + + /* create the directory */ + kn = kernfs_create_dir(parent_kn, name, parent_kn->mode, prgrp); + if (IS_ERR(kn)) { + pr_info("%s: create dir %s, error\n", __func__, name); + return PTR_ERR(kn); + } + + if (dest_kn) + *dest_kn = kn; + + /* + * This extra ref will be put in kernfs_remove() and guarantees + * that @rdtgrp->kn is always accessible. + */ + kernfs_get(kn); + + ret = resctrl_group_kn_set_ugid(kn); + if (ret) + goto out_destroy; + + kernfs_activate(kn); + + return 0; + +out_destroy: + kernfs_remove(kn); + return ret; +} + +static void mkdir_mondata_all_prepare_clean(struct resctrl_group *prgrp) +{ + if (prgrp->type == RDTCTRL_GROUP) + return; + + if (prgrp->closid) + resctrl_id_free(prgrp->closid); + if (prgrp->mon.rmid) + free_mon_id(prgrp->mon.rmid); +} + +static int mkdir_mondata_all_prepare(struct resctrl_group *rdtgrp) +{ + int ret = 0; + int mon, mon_id, closid; + + mon = resctrl_lru_request_mon(); + if (mon < 0) { + rdt_last_cmd_puts("out of monitors\n"); + ret = -EINVAL; + goto out; + } + rdtgrp->mon.mon = mon; + + if (rdtgrp->type == RDTMON_GROUP) { + mon_id = alloc_mon_id(); + if (mon_id < 0) { + closid = resctrl_id_alloc(); + if (closid < 0) { + rdt_last_cmd_puts("out of closID\n"); + free_mon_id(mon_id); + ret = -EINVAL; + goto out; + } + rdtgrp->closid = closid; + rdtgrp->mon.rmid = 0; + } else { + struct resctrl_group *prgrp; + + prgrp = rdtgrp->mon.parent; + rdtgrp->closid = prgrp->closid; + rdtgrp->mon.rmid = mon_id; + } + } + +out: + return ret; +} + +/* + * This creates a directory mon_data which contains the monitored data. + * + * mon_data has one directory for each domain whic are named + * in the format mon_<domain_name>_<domain_id>. For ex: A mon_data + * with L3 domain looks as below: + * ./mon_data: + * mon_L3_00 + * mon_L3_01 + * mon_L3_02 + * ... 
+ * + * Each domain directory has one file per event: + * ./mon_L3_00/: + * llc_occupancy + * + */ +static int mkdir_mondata_all(struct kernfs_node *parent_kn, + struct resctrl_group *prgrp, + struct kernfs_node **dest_kn) +{ + struct kernfs_node *kn; + int ret; + + /* + * Create the mon_data directory first. + */ + ret = mongroup_create_dir(parent_kn, prgrp, "mon_data", &kn); + if (ret) + return ret; + + if (dest_kn) + *dest_kn = kn; + + ret = resctrl_mkdir_mondata_all_subdir(kn, prgrp); + if (ret) + goto out_destroy; + + kernfs_activate(kn); + + return 0; + +out_destroy: + kernfs_remove(kn); + return ret; +} + static struct dentry *resctrl_mount(struct file_system_type *fs_type, int flags, const char *unused_dev_name, void *data) @@ -365,6 +497,11 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, kernfs_get(kn_mongrp);
#ifndef CONFIG_ARM64 /* [FIXME] arch specific code */ + ret = mkdir_mondata_all_prepare(&resctrl_group_default); + if (ret < 0) { + dentry = ERR_PTR(ret); + goto out_mongrp; + } ret = mkdir_mondata_all(resctrl_group_default.kn, &resctrl_group_default, &kn_mondata); if (ret) { @@ -562,6 +699,17 @@ static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, *r = rdtgrp; rdtgrp->mon.parent = prdtgrp; rdtgrp->type = rtype; + + if (rdtgrp->type == RDTCTRL_GROUP) { + ret = resctrl_id_alloc(); + if (ret < 0) { + rdt_last_cmd_puts("out of CLOSIDs\n"); + goto out_unlock; + } + rdtgrp->closid = ret; + ret = 0; + } + INIT_LIST_HEAD(&rdtgrp->mon.crdtgrp_list);
/* kernfs creates the directory for rdtgrp */ @@ -595,27 +743,16 @@ static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, }
if (resctrl_mon_capable) { -#ifdef CONFIG_ARM64 - ret = resctrl_mkdir_ctrlmon_mondata(kn, rdtgrp, &rdtgrp->mon.mon_data_kn); - if (ret < 0) { - rdt_last_cmd_puts("out of monitors or PMGs\n"); - goto out_destroy; - } - -#else - ret = alloc_mon_id(); + ret = mkdir_mondata_all_prepare(rdtgrp); if (ret < 0) { - rdt_last_cmd_puts("out of RMIDs\n"); goto out_destroy; } - rdtgrp->mon.rmid = ret;
ret = mkdir_mondata_all(kn, rdtgrp, &rdtgrp->mon.mon_data_kn); if (ret) { rdt_last_cmd_puts("kernfs subdir error\n"); - goto out_idfree; + goto out_prepare_clean; } -#endif } kernfs_activate(kn);
@@ -624,10 +761,8 @@ static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, */ return 0;
-#ifndef CONFIG_ARM64 -out_idfree: - free_mon_id(rdtgrp->mon.rmid); -#endif +out_prepare_clean: + mkdir_mondata_all_prepare_clean(rdtgrp); out_destroy: kernfs_remove(rdtgrp->kn); out_free_rgrp: @@ -640,7 +775,6 @@ static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, static void mkdir_resctrl_prepare_clean(struct resctrl_group *rgrp) { kernfs_remove(rgrp->kn); - free_mon_id(rgrp->mon.rmid); kfree(rgrp); }
@@ -663,8 +797,6 @@ static int resctrl_group_mkdir_mon(struct kernfs_node *parent_kn, return ret;
prgrp = rdtgrp->mon.parent; - rdtgrp->closid = prgrp->closid; - /* * Add the rdtgrp to the list of rdtgrps the parent * ctrl_mon group has to track. @@ -685,7 +817,6 @@ static int resctrl_group_mkdir_ctrl_mon(struct kernfs_node *parent_kn, { struct resctrl_group *rdtgrp; struct kernfs_node *kn; - u32 closid; int ret;
ret = mkdir_resctrl_prepare(parent_kn, prgrp_kn, name, mode, RDTCTRL_GROUP, @@ -694,19 +825,10 @@ static int resctrl_group_mkdir_ctrl_mon(struct kernfs_node *parent_kn, return ret;
kn = rdtgrp->kn; - ret = resctrl_id_alloc(); - if (ret < 0) { - rdt_last_cmd_puts("out of CLOSIDs\n"); - goto out_common_fail; - } - closid = ret; - ret = 0; - - rdtgrp->closid = closid;
ret = resctrl_group_init_alloc(rdtgrp); if (ret < 0) - goto out_id_free; + goto out_common_fail;
list_add(&rdtgrp->resctrl_group_list, &resctrl_all_groups);
@@ -718,14 +840,13 @@ static int resctrl_group_mkdir_ctrl_mon(struct kernfs_node *parent_kn, ret = mongroup_create_dir(kn, NULL, "mon_groups", NULL); if (ret) { rdt_last_cmd_puts("kernfs subdir error\n"); - goto out_id_free; + goto out_list_del; } }
goto out_unlock;
-out_id_free: - resctrl_id_free(closid); +out_list_del: list_del(&rdtgrp->resctrl_group_list); out_common_fail: mkdir_resctrl_prepare_clean(rdtgrp); @@ -781,10 +902,6 @@ static void resctrl_group_rm_mon(struct resctrl_group *rdtgrp, struct resctrl_group *prdtgrp = rdtgrp->mon.parent; int cpu;
-#ifdef CONFIG_ARM64 /* [FIXME] arch specific code */ - free_mon(rdtgrp->mon.mon); -#endif - /* Give any tasks back to the parent group */ resctrl_move_group_tasks(rdtgrp, prdtgrp, tmpmask);
@@ -862,11 +979,6 @@ static void resctrl_group_rm_ctrl(struct resctrl_group *rdtgrp, cpumask_var_t tm static int resctrl_group_rmdir_ctrl(struct kernfs_node *kn, struct resctrl_group *rdtgrp, cpumask_var_t tmpmask) { -#ifdef CONFIG_ARM64 /* [FIXME] arch specific code */ - if (rdtgrp->flags & RDT_CTRLMON) - return -EPERM; -#endif - resctrl_group_rm_ctrl(rdtgrp, tmpmask);
/*
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Code in resctrlfs.c is not currently shared with x86 RDT, though it may be updated to support both in the future. Remove the unrelated CONFIG guards for now to make the code clearer.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/resctrlfs.c | 19 +------------------ 1 file changed, 1 insertion(+), 18 deletions(-)
diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index ebca3fb03e9a..eb7eec23096d 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -183,9 +183,7 @@ static int resctrl_group_create_info_dir(struct kernfs_node *parent_kn) unsigned long fflags; char name[32]; int ret; -#ifdef CONFIG_ARM64 enum resctrl_resource_level level; -#endif
/* create the directory */ kn_info = kernfs_create_dir(parent_kn, "info", parent_kn->mode, NULL); @@ -197,14 +195,10 @@ static int resctrl_group_create_info_dir(struct kernfs_node *parent_kn) if (ret) goto out_destroy;
-#ifdef CONFIG_ARM64 for (level = RDT_RESOURCE_SMMU; level < RDT_NUM_RESOURCES; level++) { r = mpam_resctrl_get_resource(level); if (!r) continue; -#else - for_each_resctrl_resource(r) { -#endif if (r->alloc_enabled) { fflags = r->fflags | RF_CTRL_INFO; ret = resctrl_group_mkdir_info_resdir(r, r->name, fflags); @@ -213,14 +207,10 @@ static int resctrl_group_create_info_dir(struct kernfs_node *parent_kn) } }
-#ifdef CONFIG_ARM64 for (level = RDT_RESOURCE_SMMU; level < RDT_NUM_RESOURCES; level++) { r = mpam_resctrl_get_resource(level); if (!r) continue; -#else - for_each_resctrl_resource(r) { -#endif if (r->mon_enabled) { fflags = r->fflags | RF_MON_INFO; snprintf(name, sizeof(name), "%s_MON", r->name); @@ -467,13 +457,11 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, dentry = ERR_PTR(ret); goto out_options; } -#ifdef CONFIG_ARM64 ret = schemata_list_init(); if (ret) { dentry = ERR_PTR(ret); goto out_options; } -#endif ret = resctrl_id_init(); if (ret) { dentry = ERR_PTR(ret); @@ -496,7 +484,6 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, } kernfs_get(kn_mongrp);
-#ifndef CONFIG_ARM64 /* [FIXME] arch specific code */ ret = mkdir_mondata_all_prepare(&resctrl_group_default); if (ret < 0) { dentry = ERR_PTR(ret); @@ -510,7 +497,6 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, } kernfs_get(kn_mondata); resctrl_group_default.mon.mon_data_kn = kn_mondata; -#endif }
dentry = kernfs_mount(fs_type, flags, resctrl_root, @@ -523,11 +509,9 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, goto out;
out_mondata: -#ifndef CONFIG_ARM64 /* [FIXME] arch specific code */ if (resctrl_mon_capable) kernfs_remove(kn_mondata); out_mongrp: -#endif if (resctrl_mon_capable) kernfs_remove(kn_mongrp); out_info: @@ -652,9 +636,8 @@ static void resctrl_kill_sb(struct super_block *sb) mutex_lock(&resctrl_group_mutex);
resctrl_resource_reset(); -#ifdef CONFIG_ARM64 + schemata_list_destroy(); -#endif
rmdir_all_sub(); static_branch_disable_cpuslocked(&resctrl_alloc_enable_key);
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
According to the Arm MPAM spec, bit 16 of MPAMCFG_PART_SEL should be set to 0 and bit 16 of MPAMCFG_INTPARTID should be set to 1 when establishing an intpartid association. Also, once intpartid narrowing is implemented, we should write the intpartid rather than the reqpartid into MPAMCFG_PART_SEL.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam_resource.h | 4 +++ arch/arm64/kernel/mpam/mpam_device.c | 35 +++++++++++++++----------- 2 files changed, 25 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index aa5bbe390c19..4c042eb2da20 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -91,6 +91,10 @@ #define MSMON_CFG_CSU_TYPE 0x43 #define MSMON_CFG_MBWU_TYPE 0x42
+/* + * Set MPAMCFG_INTPARTID internal bit + */ +#define MPAMCFG_INTPARTID_INTERNAL BIT(16) /* * Set MPAMCFG_PART_SEL internal bit */ diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 356362ecdc79..327540e1f2eb 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -1086,16 +1086,29 @@ static int mpam_device_frob_mon(struct mpam_device *dev, return 0; }
-static int mpam_device_narrow_map(struct mpam_device *dev, u32 partid, +static void mpam_device_narrow_map(struct mpam_device *dev, u32 partid, u32 intpartid) { - return 0; + int cur_intpartid; + + lockdep_assert_held(&dev->lock); + + mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); + wmb(); /* subsequent writes must be applied to our new partid */ + + cur_intpartid = mpam_read_reg(dev, MPAMCFG_INTPARTID); + /* write association, this need set 16 bit to 1 */ + intpartid = intpartid | MPAMCFG_INTPARTID_INTERNAL; + /* reqpartid has already been associated to this intpartid */ + if (cur_intpartid == intpartid) + return; + + mpam_write_reg(dev, MPAMCFG_INTPARTID, intpartid); }
static int mpam_device_config(struct mpam_device *dev, u32 partid, struct mpam_config *cfg) { - int ret; u16 cmax = GENMASK(dev->cmax_wd, 0); u32 pri_val = 0; u16 intpri, dspri, max_intpri, max_dspri; @@ -1111,15 +1124,10 @@ static int mpam_device_config(struct mpam_device *dev, u32 partid, * upstream(resctrl) keep this order */ if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) { - if (cfg && mpam_has_feature(mpam_feat_part_nrw, cfg->valid)) { - ret = mpam_device_narrow_map(dev, partid, - cfg->intpartid); - if (ret) - goto out; - partid = PART_SEL_SET_INTERNAL(cfg->intpartid); - } else { - partid = PART_SEL_SET_INTERNAL(cfg->intpartid); - } + if (cfg && mpam_has_feature(mpam_feat_part_nrw, cfg->valid)) + mpam_device_narrow_map(dev, partid, cfg->intpartid); + /* intpartid success, set 16 bit to 1*/ + partid = PART_SEL_SET_INTERNAL(cfg->intpartid); }
mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); @@ -1195,8 +1203,7 @@ static int mpam_device_config(struct mpam_device *dev, u32 partid, */ mb();
-out: - return ret; + return 0; }
static void mpam_component_device_sync(void *__ctx)
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
Currently we use partid and pmg (Performance Monitoring Group) to filter performance events so that a particular partid and pmg can be monitored. However, pmg is of little use beyond forming a filter together with partid. Worse, because the pmg range varies across MPAM resources, it is difficult to allocate a pmg when creating a new mon group in the resctrl sysfs, and doing so can waste a lot of the pmg space.
So we label each rdtgroup (including mon groups) with a software-defined sd_closid instead of a single 32-bit integer. An sd_closid contains an intpartid, used for allocation, and a reqpartid, used for synchronizing configuration and monitoring. Because MPAM has the narrowing feature, we also keep the corresponding hardware concepts (which we name hw_intpartid and hw_reqpartid): when narrowing is not supported, the number of intpartids and reqpartids equals the number of hw_reqpartids; otherwise both are bounded by the minimum of hw_reqpartid and hw_intpartid supported across the different resources. This not only solves the problem above but also lets us use the spare reqpartids to create new mon groups. Additionally, pmg is still preferred when it is available.
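The sizing rule above — intpartid bounded by the minimum hw_intpartid across narrowing-capable resources, by the system-wide reqpartid range, and by resctrl's own closid limit — can be sketched in plain C. The `res_caps` structure and the sample values are hypothetical; only the limit of 32 mirrors RESCTRL_MAX_CLOSID from the patch:

```c
#include <assert.h>

#define RESCTRL_MAX_CLOSID 32

struct res_caps {
	int num_hw_reqpartid;	/* partids the MSC can match on requests */
	int num_hw_intpartid;	/* 0 if the MSC does not support narrowing */
};

/*
 * Bound the number of usable intpartids (the value resctrl hands out
 * as a closid) by every resource that supports narrowing, and by the
 * system-wide reqpartid range and resctrl's closid limit.
 */
static int collect_num_intpartid(const struct res_caps *res, int nr,
				 int num_reqpartid)
{
	int num = num_reqpartid;

	if (num > RESCTRL_MAX_CLOSID)
		num = RESCTRL_MAX_CLOSID;

	for (int i = 0; i < nr; i++) {
		if (!res[i].num_hw_intpartid)
			continue;	/* no narrowing: no extra constraint */
		if (res[i].num_hw_intpartid < num)
			num = res[i].num_hw_intpartid;
	}
	return num;
}
```

With 8 hw_intpartids on one cache and 16 hw_reqpartids system-wide, only 8 ctrl groups can exist, while the remaining reqpartids stay available for mon groups.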
e.g.
    hw_intpartid:       0  1  2  3  4  5  6  7
    hw_reqpartid:       0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
                        |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
    resctrl ctrl group: p0 p1 p2 p3 p4 p5 p6 p7
    resctrl mon group:        |     +----------m4 m5 m6 m7
    resctrl mon group:        +-----------------------------m0 m1 m2 m3

In this case, spare reqpartids are used to create the m0, m1, m2, m3 mon groups for the p2 ctrl group, and m4, m5, m6, m7 for p4.
Since a reqpartid serves both as an allocation id and as a monitoring filter, under this design we must synchronize a ctrl group's configuration with its child mon groups: each mon group's configuration, indexed by a reqpartid (which we call the slave), closely follows that of its parent ctrl group (the master) whenever the configuration changes. In addition, task_struct keeps both the intpartid and the reqpartid, so we can tell through the intpartid whether tasks belong to the same ctrl group, and switch a CPU's partid by writing MPAMx_ELx with the reqpartid when tasks are switched in.
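The packed task closid described here is implemented later in this patch by the TASK_CLOSID_* macros (upper 16 bits: parent ctrl group's intpartid; lower 16 bits: the group's own reqpartid). Their behaviour can be checked with a stand-alone copy, using plain masks in place of the kernel's GENMASK():

```c
#include <assert.h>
#include <stdint.h>

/* Upper 16 bits: parent (master) closid; lower 16 bits: own closid. */
#define TASK_CLOSID_SET(prclosid, closid) \
	(((uint32_t)(prclosid) << 16) | (uint32_t)(closid))

#define TASK_CLOSID_CUR_GET(closid)	((closid) & 0xffffu)
#define TASK_CLOSID_PR_GET(closid)	(((closid) >> 16) & 0xffffu)
```

Two tasks belong to the same ctrl group iff TASK_CLOSID_PR_GET() agrees, even when their mon groups gave them different current closids.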
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 18 +- arch/arm64/include/asm/mpam_resource.h | 2 + arch/arm64/include/asm/resctrl.h | 46 ++++- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 111 ++++++++---- arch/arm64/kernel/mpam/mpam_device.c | 45 +++-- arch/arm64/kernel/mpam/mpam_internal.h | 4 +- arch/arm64/kernel/mpam/mpam_mon.c | 4 +- arch/arm64/kernel/mpam/mpam_resctrl.c | 224 +++++++++++++++++-------- fs/resctrlfs.c | 126 +++++++++----- 9 files changed, 396 insertions(+), 184 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index ec2fc0f2eadb..5a76fb5d0fc6 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -185,7 +185,7 @@ struct resctrl_staged_config { hw_closid_t hw_closid; u32 new_ctrl; bool have_new_ctrl; - enum resctrl_conf_type new_ctrl_type; + enum resctrl_conf_type conf_type; };
/* later move to resctrl common directory */ @@ -257,14 +257,10 @@ void post_resctrl_mount(void); #define mpam_readl(addr) readl(addr) #define mpam_writel(v, addr) writel(v, addr)
-/** - * struct msr_param - set a range of MSRs from a domain - * @res: The resource to use - * @value: value - */ +struct sd_closid; + struct msr_param { - struct resctrl_resource *res; - u64 value; + struct sd_closid *closid; };
/** @@ -299,13 +295,13 @@ struct raw_resctrl_resource { u16 hdl_wd;
void (*msr_update)(struct resctrl_resource *r, struct rdt_domain *d, - struct list_head *opt_list, int partid); - u64 (*msr_read)(struct rdt_domain *d, int partid); + struct list_head *opt_list, struct msr_param *para); + u64 (*msr_read)(struct rdt_domain *d, struct msr_param *para);
int data_width; const char *format_str; int (*parse_ctrlval)(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg, hw_closid_t closid); + struct resctrl_staged_config *cfg);
u16 num_mon; u64 (*mon_read)(struct rdt_domain *d, void *md_priv); diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index 4c042eb2da20..cc863183e1be 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -95,6 +95,8 @@ * Set MPAMCFG_INTPARTID internal bit */ #define MPAMCFG_INTPARTID_INTERNAL BIT(16) +#define INTPARTID_INTPARTID_MASK (BIT(15) - 1) +#define MPAMCFG_INTPARTID_INTPARTID_GET(r) (r & INTPARTID_INTPARTID_MASK) /* * Set MPAMCFG_PART_SEL internal bit */ diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 68e515ea8779..0c1f2cef0c36 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -49,12 +49,20 @@ struct mongroup { int init; };
+/** + * struct sd_closid - software defined closid + * @intpartid: closid for this rdtgroup, used only for allocation + * @reqpartid: closid for synchronizing configuration and monitoring + */ +struct sd_closid { + u32 intpartid; + u32 reqpartid; +}; + /** * struct rdtgroup - store rdtgroup's data in resctrl file system. * @kn: kernfs node * @resctrl_group_list: linked list for all rdtgroups - * @closid: closid for this rdtgroup - * #endif * @cpu_mask: CPUs assigned to this rdtgroup * @flags: status bits * @waitcount: how many cpus expect to find this @@ -66,7 +74,7 @@ struct mongroup { struct rdtgroup { struct kernfs_node *kn; struct list_head resctrl_group_list; - u32 closid; + struct sd_closid closid; struct cpumask cpu_mask; int flags; atomic_t waitcount; @@ -80,12 +88,17 @@ void schemata_list_destroy(void);
int resctrl_lru_request_mon(void);
-int alloc_mon_id(void); -void free_mon_id(u32 id); +int alloc_rmid(void); +void free_rmid(u32 id);
+enum closid_type { + CLOSID_INT = 0x1, + CLOSID_REQ = 0x2, + CLOSID_NUM_TYPES, +}; int resctrl_id_init(void); -int resctrl_id_alloc(void); -void resctrl_id_free(int id); +int resctrl_id_alloc(enum closid_type); +void resctrl_id_free(enum closid_type, int id);
void update_cpu_closid_rmid(void *info); void update_closid_rmid(const struct cpumask *cpu_mask, struct resctrl_group *r); @@ -127,6 +140,25 @@ int resctrl_mkdir_mondata_all_subdir(struct kernfs_node *parent_kn, struct resctrl_resource * mpam_resctrl_get_resource(enum resctrl_resource_level level);
+int resctrl_update_groups_config(struct rdtgroup *rdtgrp); + #define RESCTRL_MAX_CLOSID 32
+/* + * The upper 16 bits of a task's closid store the parent + * (master) group's closid; the lower 16 bits store the + * current group's closid. This is used to judge whether a + * task is allowed to move to another ctrlmon/mon group: + * since a mon group may be allocated a closid different + * from its parent's, the closid alone is not sufficient. + */ +#define TASK_CLOSID_SET(prclosid, closid) \ + ((prclosid << 16) | closid) + +#define TASK_CLOSID_CUR_GET(closid) \ + (closid & GENMASK(15, 0)) +#define TASK_CLOSID_PR_GET(closid) \ + ((closid & GENMASK(31, 16)) >> 16) + #endif /* _ASM_ARM64_RESCTRL_H */ diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 0c324008d9ab..b290fbf49c4c 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -119,10 +119,16 @@ static int resctrl_group_update_domains(struct rdtgroup *rdtgrp, struct resctrl_resource *r) { int i; - u32 partid; struct rdt_domain *d; struct raw_resctrl_resource *rr; struct resctrl_staged_config *cfg; + hw_closid_t hw_closid; + struct sd_closid closid; + struct list_head *head; + struct rdtgroup *entry; + struct msr_param para; + + para.closid = &closid;
rr = r->res; list_for_each_entry(d, &r->domains, list) { @@ -131,15 +137,38 @@ static int resctrl_group_update_domains(struct rdtgroup *rdtgrp, if (!cfg[i].have_new_ctrl) continue;
- partid = hw_closid_val(cfg[i].hw_closid); - /* apply cfg */ - if (d->ctrl_val[partid] == cfg[i].new_ctrl) - continue; - - d->ctrl_val[partid] = cfg[i].new_ctrl; - d->have_new_ctrl = true; - - rr->msr_update(r, d, NULL, partid); + /* + * for ctrl group configuration, hw_closid of cfg[i] + * equals rdtgrp->closid.intpartid. + */ + closid.intpartid = hw_closid_val(cfg[i].hw_closid); + + /* if ctrl group's config has changed, refresh it first. */ + if (d->ctrl_val[closid.intpartid] != cfg[i].new_ctrl) { + /* + * duplicate ctrl group's configuration indexed + * by intpartid from domain ctrl_val array. + */ + resctrl_cdp_map(clos, rdtgrp->closid.reqpartid, + cfg[i].conf_type, hw_closid); + closid.reqpartid = hw_closid_val(hw_closid); + + d->ctrl_val[closid.intpartid] = cfg[i].new_ctrl; + d->have_new_ctrl = true; + rr->msr_update(r, d, NULL, &para); + } + /* + * synchronize all child mon groups' + * configuration with this ctrl rdtgrp. + */ + head = &rdtgrp->mon.crdtgrp_list; + list_for_each_entry(entry, head, mon.crdtgrp_list) { + resctrl_cdp_map(clos, entry->closid.reqpartid, + cfg[i].conf_type, hw_closid); + closid.reqpartid = hw_closid_val(hw_closid); + + rr->msr_update(r, d, NULL, &para); + } } }
@@ -175,8 +204,10 @@ static int parse_line(char *line, struct resctrl_resource *r, list_for_each_entry(d, &r->domains, list) { if (d->id == dom_id) { resctrl_cdp_map(clos, closid, t, hw_closid); - if (rr->parse_ctrlval(dom, rr, &d->staged_cfg[t], hw_closid)) + if (rr->parse_ctrlval(dom, rr, &d->staged_cfg[t])) return -EINVAL; + d->staged_cfg[t].hw_closid = hw_closid; + d->staged_cfg[t].conf_type = t; goto next; } } @@ -235,7 +266,7 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of,
rdt_last_cmd_clear();
- closid = rdtgrp->closid; + closid = rdtgrp->closid.intpartid;
for_each_supported_resctrl_exports(res) { r = &res->resctrl_res; @@ -268,15 +299,7 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, goto out; }
- for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; - if (r->alloc_enabled) { - ret = resctrl_group_update_domains(rdtgrp, r); - if (ret) - goto out; - } - } - + ret = resctrl_update_groups_config(rdtgrp); out: resctrl_group_kn_unlock(of->kn); return ret ?: nbytes; @@ -293,21 +316,24 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, * a single "S" simply. */ static void show_doms(struct seq_file *s, struct resctrl_resource *r, - char *schema_name, int partid) + char *schema_name, struct sd_closid *closid) { struct raw_resctrl_resource *rr = r->res; struct rdt_domain *dom; + struct msr_param para; bool sep = false; bool rg = false; bool prev_auto_fill = false; u32 reg_val;
+ para.closid = closid; + if (r->dom_num > RESCTRL_SHOW_DOM_MAX_NUM) rg = true;
seq_printf(s, "%*s:", max_name_width, schema_name); list_for_each_entry(dom, &r->domains, list) { - reg_val = rr->msr_read(dom, partid); + reg_val = rr->msr_read(dom, &para);
if (rg && reg_val == r->default_ctrl && prev_auto_fill == true) @@ -335,7 +361,7 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, struct resctrl_schema *rs; int ret = 0; hw_closid_t hw_closid; - u32 partid; + struct sd_closid closid;
rdtgrp = resctrl_group_kn_lock_live(of->kn); if (rdtgrp) { @@ -344,11 +370,15 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, if (!r) continue; if (r->alloc_enabled) { - resctrl_cdp_map(clos, rdtgrp->closid, + resctrl_cdp_map(clos, rdtgrp->closid.intpartid, + rs->conf_type, hw_closid); + closid.intpartid = hw_closid_val(hw_closid); + + resctrl_cdp_map(clos, rdtgrp->closid.reqpartid, rs->conf_type, hw_closid); - partid = hw_closid_val(hw_closid); - if (partid < mpam_sysprops_num_partid()) - show_doms(s, r, rs->name, partid); + closid.reqpartid = hw_closid_val(hw_closid); + + show_doms(s, r, rs->name, &closid); } } } else { @@ -469,7 +499,8 @@ static int resctrl_mkdir_mondata_dom(struct kernfs_node *parent_kn,
md.u.rid = r->rid; md.u.domid = d->id; - resctrl_cdp_map(clos, prgrp->closid, s->conf_type, hw_closid); + /* monitoring uses the reqpartid */ + resctrl_cdp_map(clos, prgrp->closid.reqpartid, s->conf_type, hw_closid); md.u.partid = hw_closid_val(hw_closid); resctrl_cdp_map(mon, prgrp->mon.mon, s->conf_type, hw_monid); md.u.mon = hw_monid_val(hw_monid); @@ -615,9 +646,9 @@ int resctrl_group_init_alloc(struct rdtgroup *rdtgrp) list_for_each_entry(s, &resctrl_all_schema, list) { r = s->res; if (r->rid == RDT_RESOURCE_MC) { - rdtgroup_init_mba(r, rdtgrp->closid); + rdtgroup_init_mba(r, rdtgrp->closid.intpartid); } else { - ret = rdtgroup_init_cat(s, rdtgrp->closid); + ret = rdtgroup_init_cat(s, rdtgrp->closid.intpartid); if (ret < 0) return ret; } @@ -631,3 +662,21 @@ int resctrl_group_init_alloc(struct rdtgroup *rdtgrp)
return 0; } + +int resctrl_update_groups_config(struct rdtgroup *rdtgrp) +{ + int ret = 0; + struct resctrl_resource *r; + struct mpam_resctrl_res *res; + + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (r->alloc_enabled) { + ret = resctrl_group_update_domains(rdtgrp, r); + if (ret) + break; + } + } + + return ret; +} diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 327540e1f2eb..2e4cf61dc797 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -975,7 +975,7 @@ static u32 mpam_device_read_csu_mon(struct mpam_device *dev, clt = MSMON_CFG_CTL_MATCH_PARTID | MSMON_CFG_CSU_TYPE; if (args->match_pmg) clt |= MSMON_CFG_CTL_MATCH_PMG; - flt = args->partid | + flt = args->closid.reqpartid | (args->pmg << MSMON_CFG_CSU_FLT_PMG_SHIFT);
/* @@ -1024,7 +1024,7 @@ static u32 mpam_device_read_mbwu_mon(struct mpam_device *dev, clt = MSMON_CFG_CTL_MATCH_PARTID | MSMON_CFG_MBWU_TYPE; if (args->match_pmg) clt |= MSMON_CFG_CTL_MATCH_PMG; - flt = args->partid | + flt = args->closid.reqpartid | (args->pmg << MSMON_CFG_MBWU_FLT_PMG_SHIFT);
/* @@ -1106,13 +1106,20 @@ static void mpam_device_narrow_map(struct mpam_device *dev, u32 partid, mpam_write_reg(dev, MPAMCFG_INTPARTID, intpartid); }
-static int mpam_device_config(struct mpam_device *dev, u32 partid, +static int +mpam_device_config(struct mpam_device *dev, struct sd_closid *closid, struct mpam_config *cfg) { u16 cmax = GENMASK(dev->cmax_wd, 0); u32 pri_val = 0; u16 intpri, dspri, max_intpri, max_dspri; u32 mbw_pbm, mbw_max; + /* + * if the device supports narrowing, narrow first and then + * apply this slave's configuration. + */ + u32 intpartid = closid->intpartid; + u32 partid = closid->reqpartid;
lockdep_assert_held(&dev->lock);
@@ -1125,9 +1132,9 @@ static int mpam_device_config(struct mpam_device *dev, u32 partid, */ if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) { if (cfg && mpam_has_feature(mpam_feat_part_nrw, cfg->valid)) - mpam_device_narrow_map(dev, partid, cfg->intpartid); + mpam_device_narrow_map(dev, partid, intpartid); /* narrowing done, select the internal partid (bit 16 set) */ - partid = PART_SEL_SET_INTERNAL(cfg->intpartid); + partid = PART_SEL_SET_INTERNAL(intpartid); }
mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); @@ -1209,7 +1216,7 @@ static int mpam_device_config(struct mpam_device *dev, u32 partid, static void mpam_component_device_sync(void *__ctx) { int err = 0; - u32 partid; + u32 reqpartid; unsigned long flags; struct mpam_device *dev; struct mpam_device_sync *ctx = (struct mpam_device_sync *)__ctx; @@ -1230,12 +1237,16 @@ static void mpam_component_device_sync(void *__ctx) err = 0; spin_lock_irqsave(&dev->lock, flags); if (args) { - partid = args->partid; + /* + * at this time reqpartid shows where the + * configuration was stored. + */ + reqpartid = args->closid.reqpartid; if (ctx->config_mon) err = mpam_device_frob_mon(dev, ctx); else - err = mpam_device_config(dev, partid, - &comp->cfg[partid]); + err = mpam_device_config(dev, &args->closid, + &comp->cfg[reqpartid]); } else { mpam_reset_device(comp, dev); } @@ -1367,11 +1378,8 @@ static void mpam_component_read_mpamcfg(void *_ctx) return;
reg = args->reg; - /* - * args->partid is possible reqpartid or intpartid, - * if narrow enabled, it should be intpartid. - */ - partid = args->partid; + + partid = args->closid.reqpartid;
list_for_each_entry(dev, &comp->devices, comp_list) { if (!cpumask_test_cpu(smp_processor_id(), @@ -1379,8 +1387,13 @@ static void mpam_component_read_mpamcfg(void *_ctx) continue;
spin_lock_irqsave(&dev->lock, flags); - if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) - partid = PART_SEL_SET_INTERNAL(partid); + if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) { + /* + * partid may be a reqpartid or an intpartid; when + * narrowing is enabled it must be the intpartid. + */ + partid = PART_SEL_SET_INTERNAL(args->closid.intpartid); + } mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); wmb(); val = mpam_read_reg(dev, reg); diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 57a08a78bb6e..cc35dfc73449 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -37,7 +37,7 @@ struct mpam_resctrl_res { struct sync_args { u8 domid; u8 pmg; - u32 partid; + struct sd_closid closid; u32 mon; bool match_pmg; enum rdt_event_id eventid; @@ -95,8 +95,6 @@ struct mpam_config { * hardlimit or not */ bool hdl; - - u32 intpartid; };
/* Bits for mpam_features_t */ diff --git a/arch/arm64/kernel/mpam/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c index 29f84e251b1e..9875b44b83ac 100644 --- a/arch/arm64/kernel/mpam/mpam_mon.c +++ b/arch/arm64/kernel/mpam/mpam_mon.c @@ -76,12 +76,12 @@ static void free_pmg(u32 pmg) pmg_free_map |= 1 << pmg; }
-int alloc_mon_id(void) +int alloc_rmid(void) { return alloc_pmg(); }
-void free_mon_id(u32 id) +void free_rmid(u32 id) { free_pmg(id); } diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index ce05b8037a4d..b92aca531feb 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -112,14 +112,15 @@ bool is_resctrl_cdp_enabled(void)
static void mpam_resctrl_update_component_cfg(struct resctrl_resource *r, - struct rdt_domain *d, struct list_head *opt_list, u32 partid); + struct rdt_domain *d, struct list_head *opt_list, + struct sd_closid *closid);
static void common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, - struct list_head *opt_list, int partid); + struct list_head *opt_list, struct msr_param *para);
-static u64 cache_rdmsr(struct rdt_domain *d, int partid); -static u64 mbw_rdmsr(struct rdt_domain *d, int partid); +static u64 cache_rdmsr(struct rdt_domain *d, struct msr_param *para); +static u64 mbw_rdmsr(struct rdt_domain *d, struct msr_param *para);
static u64 cache_rdmon(struct rdt_domain *d, void *md_priv); static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv); @@ -127,9 +128,9 @@ static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv); static int common_wrmon(struct rdt_domain *d, void *md_priv);
static int parse_cbm(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg, hw_closid_t hw_closid); + struct resctrl_staged_config *cfg); static int parse_bw(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg, hw_closid_t hw_closid); + struct resctrl_staged_config *cfg);
struct raw_resctrl_resource raw_resctrl_resources_all[] = { [RDT_RESOURCE_L3] = { @@ -194,7 +195,7 @@ static bool cbm_validate(char *buf, unsigned long *data, */ static int parse_cbm(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg, hw_closid_t hw_closid) + struct resctrl_staged_config *cfg) { unsigned long data;
@@ -208,7 +209,6 @@ parse_cbm(char *buf, struct raw_resctrl_resource *r,
cfg->new_ctrl = data; cfg->have_new_ctrl = true; - cfg->hw_closid = hw_closid;
return 0; } @@ -258,7 +258,7 @@ static bool bw_validate(char *buf, unsigned long *data,
static int parse_bw(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg, hw_closid_t hw_closid) + struct resctrl_staged_config *cfg) { unsigned long data;
@@ -272,34 +272,36 @@ parse_bw(char *buf, struct raw_resctrl_resource *r,
cfg->new_ctrl = data; cfg->have_new_ctrl = true; - cfg->hw_closid = hw_closid;
return 0; }
static void common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, - struct list_head *opt_list, int partid) + struct list_head *opt_list, struct msr_param *para) { struct sync_args args; struct mpam_resctrl_dom *dom;
- args.partid = partid; - dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom);
- mpam_resctrl_update_component_cfg(r, d, opt_list, partid); + mpam_resctrl_update_component_cfg(r, d, opt_list, para->closid);
+ /* + * configuration replication is complete; + * now apply this configuration. + */ + args.closid = *para->closid; mpam_component_config(dom->comp, &args); }
-static u64 cache_rdmsr(struct rdt_domain *d, int partid) +static u64 cache_rdmsr(struct rdt_domain *d, struct msr_param *para) { u32 result; struct sync_args args; struct mpam_resctrl_dom *dom;
- args.partid = partid; + args.closid = *para->closid; args.reg = MPAMCFG_CPBM;
dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); @@ -308,14 +310,15 @@ static u64 cache_rdmsr(struct rdt_domain *d, int partid)
return result; } -static u64 mbw_rdmsr(struct rdt_domain *d, int partid) + +static u64 mbw_rdmsr(struct rdt_domain *d, struct msr_param *para) { u64 max; u32 result; struct sync_args args; struct mpam_resctrl_dom *dom;
- args.partid = partid; + args.closid = *para->closid; args.reg = MPAMCFG_MBW_MAX;
dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); @@ -341,7 +344,8 @@ static u64 cache_rdmon(struct rdt_domain *d, void *md_priv)
md.priv = md_priv;
- args.partid = md.u.partid; + /* monitoring only needs the reqpartid */ + args.closid.reqpartid = md.u.partid; args.mon = md.u.mon; args.pmg = md.u.pmg; args.match_pmg = true; @@ -381,7 +385,8 @@ static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv)
md.priv = md_priv;
- args.partid = md.u.partid; + /* monitoring only needs the reqpartid */ + args.closid.reqpartid = md.u.partid; args.mon = md.u.mon; args.pmg = md.u.pmg; args.match_pmg = true; @@ -416,7 +421,8 @@ common_wrmon(struct rdt_domain *d, void *md_priv) struct mpam_resctrl_dom *dom;
md.priv = md_priv; - args.partid = md.u.partid; + /* monitoring only needs the reqpartid */ + args.closid.reqpartid = md.u.partid; args.mon = md.u.mon; args.pmg = md.u.pmg;
@@ -449,63 +455,120 @@ common_wrmon(struct rdt_domain *d, void *md_priv) * limited as the number of resources grows. */
-static unsigned long *closid_free_map; -static int num_closid; +static unsigned long *intpartid_free_map, *reqpartid_free_map; +static int num_intpartid, num_reqpartid;
-int closid_init(void) +static void mpam_resctrl_closid_collect(void) { - int pos; - u32 times, flag; + struct mpam_resctrl_res *res; + struct raw_resctrl_resource *rr; + + /* + * num_reqpartid is the maximum number of partids the + * system provides, system-wide. + */ + num_reqpartid = mpam_sysprops_num_partid(); + /* + * we make intpartid the closid: when the platform supports + * intpartid narrowing, the number of intpartids bounds the + * number of resctrl groups we can create, so it must not + * exceed the maximum reqpartid nor the maximum closid that + * the (Intel RDT style) resctrl sysfs allows. + */ + num_intpartid = mpam_sysprops_num_partid(); + num_intpartid = min(num_reqpartid, RESCTRL_MAX_CLOSID);
- if (closid_free_map) - kfree(closid_free_map); + /* + * since intpartid is the closid given to resctrl, check + * whether any resource supports intpartid narrowing and + * tighten the count accordingly. + */ + for_each_supported_resctrl_exports(res) { + rr = res->resctrl_res.res; + if (!rr->num_intpartid) + continue; + num_intpartid = min(num_intpartid, (int)rr->num_intpartid); + } +}
- num_closid = mpam_sysprops_num_partid(); - num_closid = min(num_closid, RESCTRL_MAX_CLOSID); +static inline int local_closid_bitmap_init(int bits_num, unsigned long **ptr) +{ + int pos; + u32 times, flag;
hw_alloc_times_validate(times, flag);
if (flag) - num_closid = rounddown(num_closid, 2); + bits_num = rounddown(bits_num, 2);
- closid_free_map = bitmap_zalloc(num_closid, GFP_KERNEL); - if (!closid_free_map) - return -ENOMEM; + if (!*ptr) { + *ptr = bitmap_zalloc(bits_num, GFP_KERNEL); + if (!*ptr) + return -ENOMEM; + }
- bitmap_set(closid_free_map, 0, num_closid); + bitmap_set(*ptr, 0, bits_num);
/* CLOSID 0 is always reserved for the default group */ - pos = find_first_bit(closid_free_map, num_closid); - bitmap_clear(closid_free_map, pos, times); + pos = find_first_bit(*ptr, bits_num); + bitmap_clear(*ptr, pos, times); + + return 0; +} + +int closid_bitmap_init(void) +{ + int ret; + + mpam_resctrl_closid_collect(); + if (!num_intpartid || !num_reqpartid) + return -EINVAL; + + if (intpartid_free_map) + kfree(intpartid_free_map); + if (reqpartid_free_map) + kfree(reqpartid_free_map); + + ret = local_closid_bitmap_init(num_intpartid, &intpartid_free_map); + if (ret) + goto out; + + ret = local_closid_bitmap_init(num_reqpartid, &reqpartid_free_map); + if (ret) + goto out;
return 0; +out: + return ret; } + /* * If cdp enabled, allocate two closid once time, then return first * allocated id. */ -int closid_alloc(void) +static int closid_bitmap_alloc(int bits_num, unsigned long *ptr) { int pos; u32 times, flag;
hw_alloc_times_validate(times, flag);
- pos = find_first_bit(closid_free_map, num_closid); - if (pos == num_closid) + pos = find_first_bit(ptr, bits_num); + if (pos == bits_num) return -ENOSPC;
- bitmap_clear(closid_free_map, pos, times); + bitmap_clear(ptr, pos, times);
return pos; }
-void closid_free(int closid) +static void closid_bitmap_free(int pos, unsigned long *ptr) { u32 times, flag;
hw_alloc_times_validate(times, flag); - bitmap_set(closid_free_map, closid, times); + bitmap_set(ptr, pos, times); }
/* @@ -633,7 +696,7 @@ void update_cpu_closid_rmid(void *info) struct rdtgroup *r = info;
if (r) { - this_cpu_write(pqr_state.default_closid, r->closid); + this_cpu_write(pqr_state.default_closid, r->closid.reqpartid); this_cpu_write(pqr_state.default_rmid, r->mon.rmid); }
@@ -728,10 +791,14 @@ int __resctrl_group_move_task(struct task_struct *tsk, * their parent CTRL group. */ if (rdtgrp->type == RDTCTRL_GROUP) { - tsk->closid = rdtgrp->closid; + tsk->closid = TASK_CLOSID_SET(rdtgrp->closid.intpartid, + rdtgrp->closid.reqpartid); tsk->rmid = rdtgrp->mon.rmid; } else if (rdtgrp->type == RDTMON_GROUP) { - if (rdtgrp->mon.parent->closid == tsk->closid) { + if (rdtgrp->mon.parent->closid.intpartid == + TASK_CLOSID_PR_GET(tsk->closid)) { + tsk->closid = TASK_CLOSID_SET(rdtgrp->closid.intpartid, + rdtgrp->closid.reqpartid); tsk->rmid = rdtgrp->mon.rmid; } else { rdt_last_cmd_puts("Can't move task to different control group\n"); @@ -1093,12 +1160,14 @@ static void show_resctrl_tasks(struct rdtgroup *r, struct seq_file *s)
rcu_read_lock(); for_each_process_thread(p, t) { - if ((r->type == RDTCTRL_GROUP && t->closid == r->closid) || - (r->type == RDTMON_GROUP && t->closid == r->closid && - t->rmid == r->mon.rmid)) - seq_printf(s, "%d: partid = %d, pmg = %d, (group: partid %d, pmg %d, mon %d)\n", - t->pid, t->closid, t->rmid, - r->closid, r->mon.rmid, r->mon.mon); + if ((r->type == RDTMON_GROUP && + TASK_CLOSID_CUR_GET(t->closid) == r->closid.reqpartid && + t->rmid == r->mon.rmid) || + (r->type == RDTCTRL_GROUP && + TASK_CLOSID_PR_GET(t->closid) == r->closid.intpartid)) + seq_printf(s, "group:(gid:%d mon:%d) task:(pid:%d gid:%d rmid:%d)\n", + r->closid.reqpartid, r->mon.mon, t->pid, + (int)TASK_CLOSID_CUR_GET(t->closid), t->rmid); } rcu_read_unlock(); } @@ -1259,7 +1328,7 @@ void __mpam_sched_in(void) */ if (static_branch_likely(&resctrl_alloc_enable_key)) { if (current->closid) - closid = current->closid; + closid = TASK_CLOSID_CUR_GET(current->closid); }
if (static_branch_likely(&resctrl_mon_enable_key)) { @@ -1352,33 +1421,38 @@ mpam_update_from_resctrl_cfg(struct mpam_resctrl_res *res,
static void mpam_resctrl_update_component_cfg(struct resctrl_resource *r, - struct rdt_domain *d, struct list_head *opt_list, u32 partid) + struct rdt_domain *d, struct list_head *opt_list, + struct sd_closid *closid) { struct mpam_resctrl_dom *dom; struct mpam_resctrl_res *res; - struct mpam_config *mpam_cfg; - u32 resctrl_cfg = d->ctrl_val[partid]; + struct mpam_config *slave_mpam_cfg; + u32 intpartid = closid->intpartid; + u32 reqpartid = closid->reqpartid; + u32 resctrl_cfg = d->ctrl_val[intpartid];
lockdep_assert_held(&resctrl_group_mutex);
/* Out of range */ - if (partid >= mpam_sysprops_num_partid()) + if (intpartid >= mpam_sysprops_num_partid() || + reqpartid >= mpam_sysprops_num_partid()) return;
res = container_of(r, struct mpam_resctrl_res, resctrl_res); dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom);
- mpam_cfg = &dom->comp->cfg[partid]; - if (WARN_ON_ONCE(!mpam_cfg)) + /* + * reqpartid is used to duplicate the master's configuration: + * each rdtgroup's slot in the mpam_cfg array is indexed by + * its reqpartid, so there is no need to also fill + * mpam_cfg[intpartid]. + */ + slave_mpam_cfg = &dom->comp->cfg[reqpartid]; + if (WARN_ON_ONCE(!slave_mpam_cfg)) return;
- mpam_cfg->valid = 0; - if (partid != mpam_cfg->intpartid) { - mpam_cfg->intpartid = partid; - mpam_set_feature(mpam_feat_part_nrw, &mpam_cfg->valid); - } - - mpam_update_from_resctrl_cfg(res, resctrl_cfg, mpam_cfg); + slave_mpam_cfg->valid = 0; + mpam_update_from_resctrl_cfg(res, resctrl_cfg, slave_mpam_cfg); }
static void mpam_reset_cfg(struct mpam_resctrl_res *res, @@ -1446,7 +1520,7 @@ int resctrl_id_init(void) { int ret;
- ret = closid_init(); + ret = closid_bitmap_init(); if (ret) goto out;
@@ -1457,12 +1531,20 @@ int resctrl_id_init(void) return ret; }
-int resctrl_id_alloc(void) +int resctrl_id_alloc(enum closid_type type) { - return closid_alloc(); + if (type == CLOSID_INT) + return closid_bitmap_alloc(num_intpartid, intpartid_free_map); + else if (type == CLOSID_REQ) + return closid_bitmap_alloc(num_reqpartid, reqpartid_free_map); + + return -ENOSPC; }
-void resctrl_id_free(int id) +void resctrl_id_free(enum closid_type type, int id) { - closid_free(id); + if (type == CLOSID_INT) + return closid_bitmap_free(id, intpartid_free_map); + else if (type == CLOSID_REQ) + return closid_bitmap_free(id, reqpartid_free_map); } diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index eb7eec23096d..78185a4c2b41 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -339,21 +339,26 @@ mongroup_create_dir(struct kernfs_node *parent_kn, struct resctrl_group *prgrp, return ret; }
-static void mkdir_mondata_all_prepare_clean(struct resctrl_group *prgrp) +static inline void free_mon_id(struct resctrl_group *rdtgrp) { - if (prgrp->type == RDTCTRL_GROUP) - return; + if (rdtgrp->mon.rmid) + free_rmid(rdtgrp->mon.rmid); + else if (rdtgrp->closid.reqpartid) + resctrl_id_free(CLOSID_REQ, rdtgrp->closid.reqpartid); +}
- if (prgrp->closid) - resctrl_id_free(prgrp->closid); - if (prgrp->mon.rmid) - free_mon_id(prgrp->mon.rmid); +static void mkdir_mondata_all_prepare_clean(struct resctrl_group *prgrp) +{ + if (prgrp->type == RDTCTRL_GROUP && prgrp->closid.intpartid) + resctrl_id_free(CLOSID_INT, prgrp->closid.intpartid); + free_mon_id(prgrp); }
static int mkdir_mondata_all_prepare(struct resctrl_group *rdtgrp) { int ret = 0; - int mon, mon_id, closid; + int mon, rmid, reqpartid; + struct resctrl_group *prgrp;
mon = resctrl_lru_request_mon(); if (mon < 0) { @@ -363,25 +368,40 @@ static int mkdir_mondata_all_prepare(struct resctrl_group *rdtgrp) } rdtgrp->mon.mon = mon;
+ prgrp = rdtgrp->mon.parent; + if (rdtgrp->type == RDTMON_GROUP) { - mon_id = alloc_mon_id(); - if (mon_id < 0) { - closid = resctrl_id_alloc(); - if (closid < 0) { + /* + * mon id allocation: for MPAM, the rmid (pmg) is + * reserved for creating monitoring groups; it has + * the same effect as a reqpartid except for config + * allocation. We keep it in case the spec changes, + * and allocate an rmid first whenever one is + * available. + */ + rmid = alloc_rmid(); + if (rmid < 0) { + reqpartid = resctrl_id_alloc(CLOSID_REQ); + if (reqpartid < 0) { rdt_last_cmd_puts("out of closID\n"); ret = -EINVAL; goto out; } + rdtgrp->closid.reqpartid = reqpartid; rdtgrp->mon.rmid = 0; } else { - struct resctrl_group *prgrp; - - prgrp = rdtgrp->mon.parent; - rdtgrp->closid = prgrp->closid; - rdtgrp->mon.rmid = mon_id; + /* + * copy the reqpartid from the parent group; the + * rmid is sufficient for monitoring. + */ + rdtgrp->closid.reqpartid = prgrp->closid.reqpartid; + rdtgrp->mon.rmid = rmid; } + /* + * establish the relationship from ctrl group to mon group. + */ + rdtgrp->closid.intpartid = prgrp->closid.intpartid; }
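The mon-group id selection above — try an rmid (pmg) first, fall back to a spare reqpartid, and inherit the parent's ids accordingly — can be summarized in a small sketch. The helper and its parameters are illustrative stand-ins for the kernel allocators, which return a negative value on exhaustion:

```c
#include <assert.h>

struct sd_closid { int intpartid; int reqpartid; };

struct grp_ids {
	struct sd_closid closid;
	int rmid;
};

/*
 * A mon group always inherits the parent's intpartid; it inherits the
 * parent's reqpartid only when an rmid is available to distinguish
 * its monitoring data, otherwise it takes a spare reqpartid of its own.
 */
static int mon_group_assign(struct grp_ids *grp, const struct grp_ids *parent,
			    int rmid, int spare_reqpartid)
{
	if (rmid >= 0) {
		grp->closid.reqpartid = parent->closid.reqpartid;
		grp->rmid = rmid;
	} else if (spare_reqpartid >= 0) {
		grp->closid.reqpartid = spare_reqpartid;
		grp->rmid = 0;
	} else {
		return -1;	/* out of both id spaces */
	}
	grp->closid.intpartid = parent->closid.intpartid;
	return 0;
}
```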
out: @@ -526,16 +546,10 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, return dentry; }
-static bool is_closid_match(struct task_struct *t, struct resctrl_group *r) -{ - return (resctrl_alloc_capable && - (r->type == RDTCTRL_GROUP) && (t->closid == r->closid)); -} - -static bool is_rmid_match(struct task_struct *t, struct resctrl_group *r) +static inline bool +is_task_match_resctrl_group(struct task_struct *t, struct resctrl_group *r) { - return (resctrl_mon_capable && - (r->type == RDTMON_GROUP) && (t->rmid == r->mon.rmid)); + return (TASK_CLOSID_PR_GET(t->closid) == r->closid.intpartid); }
/* @@ -553,9 +567,9 @@ static void resctrl_move_group_tasks(struct resctrl_group *from, struct resctrl_
read_lock(&tasklist_lock); for_each_process_thread(p, t) { - if (!from || is_closid_match(t, from) || - is_rmid_match(t, from)) { - t->closid = to->closid; + if (!from || is_task_match_resctrl_group(t, from)) { + t->closid = TASK_CLOSID_SET(to->closid.intpartid, + to->closid.reqpartid); t->rmid = to->mon.rmid;
#ifdef CONFIG_SMP @@ -583,7 +597,8 @@ static void free_all_child_rdtgrp(struct resctrl_group *rdtgrp)
head = &rdtgrp->mon.crdtgrp_list; list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) { - free_mon_id(sentry->mon.rmid); + /* rmid may not be used */ + free_mon_id(sentry); list_del(&sentry->mon.crdtgrp_list); kfree(sentry); } @@ -615,7 +630,7 @@ static void rmdir_all_sub(void) cpumask_or(&resctrl_group_default.cpu_mask, &resctrl_group_default.cpu_mask, &rdtgrp->cpu_mask);
- free_mon_id(rdtgrp->mon.rmid); + free_mon_id(rdtgrp);
kernfs_remove(rdtgrp->kn); list_del(&rdtgrp->resctrl_group_list); @@ -683,13 +698,25 @@ static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, rdtgrp->mon.parent = prdtgrp; rdtgrp->type = rtype;
+ /* + * For a ctrlmon group, intpartid is used for + * applying configuration, while reqpartid is + * used for following that configuration and + * for monitoring of child mon groups. + */ if (rdtgrp->type == RDTCTRL_GROUP) { - ret = resctrl_id_alloc(); + ret = resctrl_id_alloc(CLOSID_INT); if (ret < 0) { rdt_last_cmd_puts("out of CLOSIDs\n"); goto out_unlock; } - rdtgrp->closid = ret; + rdtgrp->closid.intpartid = ret; + ret = resctrl_id_alloc(CLOSID_REQ); + if (ret < 0) { + rdt_last_cmd_puts("out of SLAVE CLOSIDs\n"); + goto out_unlock; + } + rdtgrp->closid.reqpartid = ret; ret = 0; }
@@ -737,6 +764,7 @@ static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, goto out_prepare_clean; } } + kernfs_activate(kn);
/* @@ -786,6 +814,12 @@ static int resctrl_group_mkdir_mon(struct kernfs_node *parent_kn, */ list_add_tail(&rdtgrp->mon.crdtgrp_list, &prgrp->mon.crdtgrp_list);
+ /* + * Update all mon groups' configuration under this parent group + * for the master-slave model. + */ + ret = resctrl_update_groups_config(prgrp); + resctrl_group_kn_unlock(prgrp_kn); return ret; } @@ -888,9 +922,11 @@ static void resctrl_group_rm_mon(struct resctrl_group *rdtgrp, /* Give any tasks back to the parent group */ resctrl_move_group_tasks(rdtgrp, prdtgrp, tmpmask);
- /* Update per cpu rmid of the moved CPUs first */ - for_each_cpu(cpu, &rdtgrp->cpu_mask) + /* Update per cpu closid and rmid of the moved CPUs first */ + for_each_cpu(cpu, &rdtgrp->cpu_mask) { + per_cpu(pqr_state.default_closid, cpu) = prdtgrp->closid.reqpartid; per_cpu(pqr_state.default_rmid, cpu) = prdtgrp->mon.rmid; + } /* * Update the MSR on moved CPUs and CPUs which have moved * task running on them. @@ -899,7 +935,8 @@ static void resctrl_group_rm_mon(struct resctrl_group *rdtgrp, update_closid_rmid(tmpmask, NULL);
rdtgrp->flags |= RDT_DELETED; - free_mon_id(rdtgrp->mon.rmid); + + free_mon_id(rdtgrp);
/* * Remove the rdtgrp from the parent ctrl_mon group's list @@ -936,8 +973,10 @@ static void resctrl_group_rm_ctrl(struct resctrl_group *rdtgrp, cpumask_var_t tm
/* Update per cpu closid and rmid of the moved CPUs first */ for_each_cpu(cpu, &rdtgrp->cpu_mask) { - per_cpu(pqr_state.default_closid, cpu) = resctrl_group_default.closid; - per_cpu(pqr_state.default_rmid, cpu) = resctrl_group_default.mon.rmid; + per_cpu(pqr_state.default_closid, cpu) = + resctrl_group_default.closid.reqpartid; + per_cpu(pqr_state.default_rmid, cpu) = + resctrl_group_default.mon.rmid; }
/* @@ -948,8 +987,8 @@ static void resctrl_group_rm_ctrl(struct resctrl_group *rdtgrp, cpumask_var_t tm update_closid_rmid(tmpmask, NULL);
rdtgrp->flags |= RDT_DELETED; - resctrl_id_free(rdtgrp->closid); - free_mon_id(rdtgrp->mon.rmid); + resctrl_id_free(CLOSID_INT, rdtgrp->closid.intpartid); + resctrl_id_free(CLOSID_REQ, rdtgrp->closid.reqpartid);
/* * Free all the child monitor group rmids. @@ -1024,7 +1063,8 @@ static struct kernfs_syscall_ops resctrl_group_kf_syscall_ops = {
static void resctrl_group_default_init(struct resctrl_group *r) { - r->closid = 0; + r->closid.intpartid = 0; + r->closid.reqpartid = 0; r->mon.rmid = 0; r->type = RDTCTRL_GROUP; }
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
For each ctrl group, its monitoring data should include all of its child mon groups' monitoring data. This code is borrowed from Intel-RDT to make it easier for users to configure different monitoring strategies.
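The aggregation described above can be sketched in miniature as follows. This is an illustrative userspace model, not the kernel code: `struct mon_grp`, `struct ctrl_grp` and `ctrl_grp_total_usage()` are hypothetical names standing in for the rdtgroup structures, and the child list is simplified to a singly linked list.

```c
#include <assert.h>
#include <stddef.h>

/* A ctrl group's reported usage is its own counter plus the counters of
 * every child mon group on its list, mirroring the summation loop added
 * to resctrl_group_mondata_show() in this patch. */
struct mon_grp {
	unsigned long long usage;
	struct mon_grp *next;	/* next sibling in the parent's child list */
};

struct ctrl_grp {
	unsigned long long usage;	/* the ctrl group's own counter */
	struct mon_grp *children;	/* head of the child mon group list */
};

static unsigned long long ctrl_grp_total_usage(const struct ctrl_grp *g)
{
	unsigned long long total = g->usage;
	const struct mon_grp *m;

	/* Walk the child list and fold each mon group's counter in. */
	for (m = g->children; m; m = m->next)
		total += m->usage;
	return total;
}
```

In the real patch the per-child read goes through `rr->mon_read()` with the child's reqpartid/pmg substituted into `md.priv`; the shape of the loop is the same.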
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index b290fbf49c4c..8052a8bc0893 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -453,6 +453,33 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) }
usage = rr->mon_read(d, md.priv); + /* + * If this rdtgroup is a ctrlmon group, also collect its + * mon groups' monitor data. + */ + if (rdtgrp->type == RDTCTRL_GROUP) { + struct list_head *head; + struct rdtgroup *entry; + hw_closid_t hw_closid; + enum resctrl_conf_type type = CDP_CODE; + + resctrl_cdp_map(clos, rdtgrp->closid.reqpartid, + CDP_CODE, hw_closid); + /* CDP_CODE shares the same closid with CDP_BOTH */ + if (md.u.partid != hw_closid_val(hw_closid)) + type = CDP_DATA; + + head = &rdtgrp->mon.crdtgrp_list; + list_for_each_entry(entry, head, mon.crdtgrp_list) { + resctrl_cdp_map(clos, entry->closid.reqpartid, + type, hw_closid); + md.u.partid = hw_closid_val(hw_closid); + md.u.pmg = entry->mon.rmid; + md.u.mon = entry->mon.mon; + usage += rr->mon_read(d, md.priv); + } + } + seq_printf(m, "%llu\n", usage);
out:
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Now that each mon group is tagged with an sd_closid, we can monitor data by switching CPUs' sd_closid.reqpartid and pmg, and keep each mon group's configuration consistent by following its parent ctrl group through sd_closid.intpartid.
Most of this code is borrowed from Intel-RDT.
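The two-level closid described above can be sketched as a simple pack/unpack pair. This is a hypothetical model of what `TASK_CLOSID_SET()`/`TASK_CLOSID_PR_GET()` do: the actual bit layout in the kernel macros may differ, and the 16-bit split and helper names here are assumptions for illustration only.

```c
#include <assert.h>

/* Assumed layout: the upper half of a task's closid carries the parent
 * ctrl group's intpartid (used for applying configuration), the lower
 * half carries the reqpartid (used for monitoring). */
enum { TASK_CLOSID_SHIFT = 16 };

static unsigned int task_closid_set(unsigned int intpartid,
				    unsigned int reqpartid)
{
	/* Pack both ids into one word stored in task_struct::closid. */
	return (intpartid << TASK_CLOSID_SHIFT) | (reqpartid & 0xffffu);
}

static unsigned int task_closid_pr_get(unsigned int task_closid)
{
	/* Recover the parent's intpartid, as group-match checks do. */
	return task_closid >> TASK_CLOSID_SHIFT;
}

static unsigned int task_closid_rq_get(unsigned int task_closid)
{
	/* Recover the reqpartid used for monitoring. */
	return task_closid & 0xffffu;
}
```

This mirrors how `is_task_match_resctrl_group()` in the patch compares only the intpartid half, so tasks in a child mon group still match their parent ctrl group's configuration.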
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_resctrl.c | 40 +++++++++++++++++++++++++-- 1 file changed, 38 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index b92aca531feb..15896f69ca6d 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -972,8 +972,44 @@ static int resctrl_num_mon_show(struct kernfs_open_file *of, int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask, cpumask_var_t tmpmask) { - pr_info("unsupported on mon_groups, please use ctrlmon groups\n"); - return -EINVAL; + struct rdtgroup *prgrp = rdtgrp->mon.parent, *crgrp; + struct list_head *head; + + /* Check whether cpus belong to parent ctrl group */ + cpumask_andnot(tmpmask, newmask, &prgrp->cpu_mask); + if (cpumask_weight(tmpmask)) { + rdt_last_cmd_puts("can only add CPUs to mongroup that belong to parent\n"); + return -EINVAL; + } + + /* Check whether cpus are dropped from this group */ + cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask); + if (cpumask_weight(tmpmask)) { + /* Give any dropped cpus to parent rdtgroup */ + cpumask_or(&prgrp->cpu_mask, &prgrp->cpu_mask, tmpmask); + update_closid_rmid(tmpmask, prgrp); + } + + /* + * If we added cpus, remove them from previous group that owned them + * and update per-cpu rmid + */ + cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask); + if (cpumask_weight(tmpmask)) { + head = &prgrp->mon.crdtgrp_list; + list_for_each_entry(crgrp, head, mon.crdtgrp_list) { + if (crgrp == rdtgrp) + continue; + cpumask_andnot(&crgrp->cpu_mask, &crgrp->cpu_mask, + tmpmask); + } + update_closid_rmid(tmpmask, rdtgrp); + } + + /* Done pushing/pulling - update this group with new mask */ + cpumask_copy(&rdtgrp->cpu_mask, newmask); + + return 0; }
static ssize_t resctrl_group_cpus_write(struct kernfs_open_file *of,
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
The configuration control type is now divided into three classes: COMMON, PRIORITY and HARDLIMIT. The capabilities selected through mount options are stored in the ctrl_extend_bits field of the resctrl resource structure, which records which configuration types are allowed to apply. When the schemata sysfile is written, all related contents of the corresponding configuration arrays are updated and applied at once.
Configuration can then be set like this, e.g.:
> mount -t resctrl resctrl -o hardlimit /sys/fs/resctrl && cd /sys/fs/resctrl
> cat schemata
L3:0=7fff;1=7fff;2=7fff;3=7fff
MB:0=100;1=100;2=100;3=100
MBHDL:0=1;1=1;2=1;3=1
> echo 'MB:0=10' > schemata && echo 'MBHDL:0=0' > schemata # no hardlimit
This also deletes the opt_list, which was used for organizing the different control types and is no longer needed; supported types can now be checked through ctrl_extend_bits, and extended control-type work is done through the schema list.
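The ctrl_extend_bits bookkeeping above reduces to a small bitmap keyed by control type. The sketch below mirrors the `resctrl_ctrl_extend_bits_set()`/`_match()` helpers added by this patch; the enum values follow the patch, while the standalone helper names here are illustrative.

```c
#include <assert.h>

/* Control-type classes from the patch: SCHEMA_COMM is the base control
 * and is always enabled; SCHEMA_PRI and SCHEMA_HDL are switched on by
 * the "priority" and "hardlimit" mount options respectively. */
enum resctrl_ctrl_type {
	SCHEMA_COMM = 0,
	SCHEMA_PRI,
	SCHEMA_HDL,
	SCHEMA_NUM_CTRL_TYPE
};

static void ctrl_extend_bits_set(unsigned int *bitmap,
				 enum resctrl_ctrl_type t)
{
	/* Record that control type t may be applied on this resource. */
	*bitmap |= 1u << t;
}

static int ctrl_extend_bits_match(unsigned int bitmap,
				  enum resctrl_ctrl_type t)
{
	/* Test whether control type t was enabled at mount time. */
	return !!(bitmap & (1u << t));
}
```

At mount time `basic_ctrl_enable()` sets SCHEMA_COMM unconditionally, and `add_schema()` consults this bitmap to decide whether to create the extra PRI/HDL schema rows.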
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 37 ++- arch/arm64/include/asm/mpam_resource.h | 2 + arch/arm64/kernel/mpam/mpam_ctrlmon.c | 206 +++++++++++----- arch/arm64/kernel/mpam/mpam_resctrl.c | 312 +++++++++++++++++++------ arch/arm64/kernel/mpam/mpam_setup.c | 37 ++- include/linux/resctrlfs.h | 6 +- 6 files changed, 455 insertions(+), 145 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 5a76fb5d0fc6..6641180b4c3a 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -117,6 +117,21 @@ DECLARE_STATIC_KEY_FALSE(resctrl_mon_enable_key);
extern int max_name_width, max_data_width;
+enum resctrl_ctrl_type { + SCHEMA_COMM = 0, + SCHEMA_PRI, + SCHEMA_HDL, + SCHEMA_NUM_CTRL_TYPE +}; + +#define for_each_ctrl_type(t) \ + for (t = SCHEMA_COMM; t != SCHEMA_NUM_CTRL_TYPE; t++) + +#define for_each_extend_ctrl_type(t) \ + for (t = SCHEMA_PRI; t != SCHEMA_NUM_CTRL_TYPE; t++) + +bool resctrl_ctrl_extend_bits_match(u32 bitmap, enum resctrl_ctrl_type type); + enum resctrl_conf_type { CDP_BOTH = 0, CDP_CODE, @@ -183,25 +198,34 @@ do { \ */ struct resctrl_staged_config { hw_closid_t hw_closid; - u32 new_ctrl; + u32 new_ctrl[SCHEMA_NUM_CTRL_TYPE]; bool have_new_ctrl; enum resctrl_conf_type conf_type; + enum resctrl_ctrl_type ctrl_type; };
/* later move to resctrl common directory */ -#define RESCTRL_NAME_LEN 7 +#define RESCTRL_NAME_LEN 15 + +struct resctrl_schema_ctrl { + struct list_head list; + char name[RESCTRL_NAME_LEN]; + enum resctrl_ctrl_type ctrl_type; +};
/** * @list: Member of resctrl's schema list * @name: Name visible in the schemata file * @conf_type: Type of configuration, e.g. code/data/both * @res: The rdt_resource for this entry + * @schemata_ctrl_list: Type of ctrl configuration. e.g. priority/hardlimit */ struct resctrl_schema { struct list_head list; char name[RESCTRL_NAME_LEN]; enum resctrl_conf_type conf_type; struct resctrl_resource *res; + struct list_head schema_ctrl_list; };
/** @@ -230,8 +254,7 @@ struct rdt_domain { void __iomem *base;
/* arch specific fields */ - u32 *ctrl_val; - u32 new_ctrl; + u32 *ctrl_val[SCHEMA_NUM_CTRL_TYPE]; bool have_new_ctrl;
/* for debug */ @@ -260,6 +283,7 @@ void post_resctrl_mount(void); struct sd_closid;
struct msr_param { + enum resctrl_ctrl_type type; struct sd_closid *closid; };
@@ -295,13 +319,14 @@ struct raw_resctrl_resource { u16 hdl_wd;
void (*msr_update)(struct resctrl_resource *r, struct rdt_domain *d, - struct list_head *opt_list, struct msr_param *para); + struct msr_param *para); u64 (*msr_read)(struct rdt_domain *d, struct msr_param *para);
int data_width; const char *format_str; int (*parse_ctrlval)(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg); + struct resctrl_staged_config *cfg, + enum resctrl_ctrl_type ctrl_type);
u16 num_mon; u64 (*mon_read)(struct rdt_domain *d, void *md_priv); diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index cc863183e1be..8868da13411b 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -76,6 +76,7 @@ #define MBW_MAX_SET(v) (MBW_MAX_HARDLIM|((v) << (16 - BWA_WD))) #define MBW_MAX_GET(v) (((v) & MBW_MAX_MASK) >> (16 - BWA_WD)) #define MBW_MAX_SET_HDL(r) (r | MBW_MAX_HARDLIM) +#define MBW_MAX_GET_HDL(r) (r & MBW_MAX_HARDLIM) /* MPAMCFG_MBW_PROP */ #define MBW_PROP_HARDLIM BIT(31) #define MBW_PROP_SET_HDL(r) (r | MBW_PROP_HARDLIM) @@ -133,6 +134,7 @@ * MPAMCFG_MBW_MAX SET - temp Hard code */ #define MPAMCFG_PRI_DSPRI_SHIFT 16 +#define MPAMCFG_PRI_GET(r) ((r & GENMASK(15, 0)) | (r & GENMASK(31, 16)) >> 16)
/* MPAMF_PRI_IDR - MPAM features priority partitioning ID register */ #define MPAMF_PRI_IDR_HAS_INTPRI BIT(0) diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 8052a8bc0893..e356dcbeb246 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -43,8 +43,15 @@ LIST_HEAD(resctrl_all_schema); /* Init schemata content */ static int add_schema(enum resctrl_conf_type t, struct resctrl_resource *r) { + int ret = 0; char *suffix = ""; + char *ctrl_suffix = ""; struct resctrl_schema *s; + struct raw_resctrl_resource *rr; + struct resctrl_schema_ctrl *sc, *sc_tmp; + struct resctrl_schema_ctrl *sc_pri = NULL; + struct resctrl_schema_ctrl *sc_hdl = NULL; + enum resctrl_ctrl_type type;
s = kzalloc(sizeof(*s), GFP_KERNEL); if (!s) @@ -73,7 +80,50 @@ static int add_schema(enum resctrl_conf_type t, struct resctrl_resource *r) INIT_LIST_HEAD(&s->list); list_add_tail(&s->list, &resctrl_all_schema);
+ /* + * Initialize extension ctrl type with MPAM capabilities, + * e.g. priority/hardlimit. + */ + rr = r->res; + INIT_LIST_HEAD(&s->schema_ctrl_list); + for_each_extend_ctrl_type(type) { + if ((type == SCHEMA_PRI && !rr->pri_wd) || + (type == SCHEMA_HDL && !rr->hdl_wd) || + !resctrl_ctrl_extend_bits_match(r->ctrl_extend_bits, + type)) + continue; + + sc = kzalloc(sizeof(*sc), GFP_KERNEL); + if (!sc) { + ret = -ENOMEM; + goto err; + } + sc->ctrl_type = type; + if (type == SCHEMA_PRI) { + sc_pri = sc; + ctrl_suffix = "PRI"; + } else if (type == SCHEMA_HDL) { + sc_hdl = sc; + ctrl_suffix = "HDL"; + } + + WARN_ON_ONCE(strlen(r->name) + strlen(suffix) + + strlen(ctrl_suffix) + 1 > RESCTRL_NAME_LEN); + snprintf(sc->name, sizeof(sc->name), "%s%s%s", + r->name, suffix, ctrl_suffix); + list_add_tail(&sc->list, &s->schema_ctrl_list); + } + return 0; + +err: + list_for_each_entry_safe(sc, sc_tmp, &s->schema_ctrl_list, list) { + list_del(&sc->list); + kfree(sc); + } + list_del(&s->list); + kfree(s); + return ret; }
int schemata_list_init(void) @@ -108,69 +158,88 @@ int schemata_list_init(void) void schemata_list_destroy(void) { struct resctrl_schema *s, *tmp; + struct resctrl_schema_ctrl *sc, *sc_tmp;
list_for_each_entry_safe(s, tmp, &resctrl_all_schema, list) { + list_for_each_entry_safe(sc, sc_tmp, &s->schema_ctrl_list, list) { + list_del(&sc->list); + kfree(sc); + } list_del(&s->list); kfree(s); } }
-static int resctrl_group_update_domains(struct rdtgroup *rdtgrp, - struct resctrl_resource *r) +static void resctrl_group_update_domain_ctrls(struct rdtgroup *rdtgrp, + struct resctrl_resource *r, struct rdt_domain *dom) { int i; - struct rdt_domain *d; - struct raw_resctrl_resource *rr; struct resctrl_staged_config *cfg; + enum resctrl_ctrl_type type; hw_closid_t hw_closid; + struct raw_resctrl_resource *rr; struct sd_closid closid; struct list_head *head; struct rdtgroup *entry; struct msr_param para;
- para.closid = &closid; + bool update_on;
rr = r->res; - list_for_each_entry(d, &r->domains, list) { - cfg = d->staged_cfg; - for (i = 0; i < ARRAY_SIZE(d->staged_cfg); i++) { - if (!cfg[i].have_new_ctrl) - continue;
- /* - * for ctrl group configuration, hw_closid of cfg[i] - * equals to rdtgrp->closid.intpartid. - */ - closid.intpartid = hw_closid_val(cfg[i].hw_closid); + cfg = dom->staged_cfg; + para.closid = &closid;
+ for (i = 0; i < ARRAY_SIZE(dom->staged_cfg); i++) { + if (!cfg[i].have_new_ctrl) + continue; + update_on = false; + /* + * for ctrl group configuration, hw_closid of cfg[i] equals + * to rdtgrp->closid.intpartid. + */ + closid.intpartid = hw_closid_val(cfg[i].hw_closid); + for_each_ctrl_type(type) { /* if ctrl group's config has changed, refresh it first. */ - if (d->ctrl_val[closid.intpartid] != cfg[i].new_ctrl) { + if (dom->ctrl_val[closid.intpartid] != cfg[i].new_ctrl) { /* * duplicate ctrl group's configuration indexed * by intpartid from domain ctrl_val array. */ resctrl_cdp_map(clos, rdtgrp->closid.reqpartid, - cfg[i].conf_type, hw_closid); - closid.reqpartid = hw_closid_val(hw_closid); - - d->ctrl_val[closid.intpartid] = cfg[i].new_ctrl; - d->have_new_ctrl = true; - rr->msr_update(r, d, NULL, ¶); - } - /* - * we should synchronize all child mon groups' - * configuration from this ctrl rdtgrp - */ - head = &rdtgrp->mon.crdtgrp_list; - list_for_each_entry(entry, head, mon.crdtgrp_list) { - resctrl_cdp_map(clos, entry->closid.reqpartid, cfg[i].conf_type, hw_closid); closid.reqpartid = hw_closid_val(hw_closid);
- rr->msr_update(r, d, NULL, ¶); + dom->ctrl_val[type][closid.intpartid] = + cfg[i].new_ctrl[type]; + dom->have_new_ctrl = true; + update_on = true; } } + if (update_on) + rr->msr_update(r, dom, ¶); + + /* + * we should synchronize all child mon groups' + * configuration from this ctrl rdtgrp + */ + head = &rdtgrp->mon.crdtgrp_list; + list_for_each_entry(entry, head, mon.crdtgrp_list) { + resctrl_cdp_map(clos, entry->closid.reqpartid, + cfg[i].conf_type, hw_closid); + closid.reqpartid = hw_closid_val(hw_closid); + rr->msr_update(r, dom, ¶); + } } +} + +static int resctrl_group_update_domains(struct rdtgroup *rdtgrp, + struct resctrl_resource *r) +{ + struct rdt_domain *d; + + list_for_each_entry(d, &r->domains, list) + resctrl_group_update_domain_ctrls(rdtgrp, r, d);
return 0; } @@ -181,8 +250,10 @@ static int resctrl_group_update_domains(struct rdtgroup *rdtgrp, * separated by ";". The "id" is in decimal, and must match one of * the "id"s for this resource. */ -static int parse_line(char *line, struct resctrl_resource *r, - enum resctrl_conf_type t, u32 closid) +static int +parse_line(char *line, struct resctrl_resource *r, + enum resctrl_conf_type conf_type, + enum resctrl_ctrl_type ctrl_type, u32 closid) { struct raw_resctrl_resource *rr = r->res; char *dom = NULL; @@ -203,11 +274,13 @@ static int parse_line(char *line, struct resctrl_resource *r, dom = strim(dom); list_for_each_entry(d, &r->domains, list) { if (d->id == dom_id) { - resctrl_cdp_map(clos, closid, t, hw_closid); - if (rr->parse_ctrlval(dom, rr, &d->staged_cfg[t])) + resctrl_cdp_map(clos, closid, conf_type, hw_closid); + if (rr->parse_ctrlval(dom, rr, + &d->staged_cfg[conf_type], ctrl_type)) return -EINVAL; - d->staged_cfg[t].hw_closid = hw_closid; - d->staged_cfg[t].conf_type = t; + d->staged_cfg[conf_type].hw_closid = hw_closid; + d->staged_cfg[conf_type].conf_type = conf_type; + d->staged_cfg[conf_type].ctrl_type = ctrl_type; goto next; } } @@ -220,6 +293,7 @@ resctrl_group_parse_schema_resource(char *resname, char *tok, u32 closid) struct resctrl_resource *r; struct resctrl_schema *s; enum resctrl_conf_type t; + struct resctrl_schema_ctrl *sc;
list_for_each_entry(s, &resctrl_all_schema, list) { r = s->res; @@ -228,10 +302,18 @@ resctrl_group_parse_schema_resource(char *resname, char *tok, u32 closid) continue;
if (r->alloc_enabled) { - if (!strcmp(resname, s->name) && - closid < mpam_sysprops_num_partid()) { - t = conf_name_to_conf_type(s->name); - return parse_line(tok, r, t, closid); + if (closid >= mpam_sysprops_num_partid()) + continue; + t = conf_name_to_conf_type(s->name); + if (!strcmp(resname, s->name)) + return parse_line(tok, r, t, + SCHEMA_COMM, closid); + + list_for_each_entry(sc, &s->schema_ctrl_list, list) { + if (!strcmp(resname, sc->name)) + return parse_line(tok, r, t, + sc->ctrl_type, + closid); } } } @@ -316,7 +398,8 @@ ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, * a single "S" simply. */ static void show_doms(struct seq_file *s, struct resctrl_resource *r, - char *schema_name, struct sd_closid *closid) + char *schema_name, enum resctrl_ctrl_type type, + struct sd_closid *closid) { struct raw_resctrl_resource *rr = r->res; struct rdt_domain *dom; @@ -327,6 +410,7 @@ static void show_doms(struct seq_file *s, struct resctrl_resource *r, u32 reg_val;
para.closid = closid; + para.type = type;
if (r->dom_num > RESCTRL_SHOW_DOM_MAX_NUM) rg = true; @@ -335,13 +419,13 @@ static void show_doms(struct seq_file *s, struct resctrl_resource *r, list_for_each_entry(dom, &r->domains, list) { reg_val = rr->msr_read(dom, ¶);
- if (rg && reg_val == r->default_ctrl && + if (rg && reg_val == r->default_ctrl[SCHEMA_COMM] && prev_auto_fill == true) continue;
if (sep) seq_puts(s, ";"); - if (rg && reg_val == r->default_ctrl) { + if (rg && reg_val == r->default_ctrl[SCHEMA_COMM]) { prev_auto_fill = true; seq_puts(s, "S"); } else { @@ -362,6 +446,7 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, int ret = 0; hw_closid_t hw_closid; struct sd_closid closid; + struct resctrl_schema_ctrl *sc;
rdtgrp = resctrl_group_kn_lock_live(of->kn); if (rdtgrp) { @@ -378,7 +463,10 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, rs->conf_type, hw_closid); closid.reqpartid = hw_closid_val(hw_closid);
- show_doms(s, r, rs->name, &closid); + show_doms(s, r, rs->name, SCHEMA_COMM, &closid); + list_for_each_entry(sc, &rs->schema_ctrl_list, list) { + show_doms(s, r, sc->name, sc->ctrl_type, &closid); + } } } } else { @@ -604,12 +692,17 @@ static void rdtgroup_init_mba(struct resctrl_resource *r, u32 closid) { struct resctrl_staged_config *cfg; struct rdt_domain *d; + enum resctrl_ctrl_type t;
list_for_each_entry(d, &r->domains, list) { cfg = &d->staged_cfg[CDP_BOTH]; - cfg->new_ctrl = r->default_ctrl; + cfg->new_ctrl[SCHEMA_COMM] = r->default_ctrl[SCHEMA_COMM]; resctrl_cdp_map(clos, closid, CDP_BOTH, cfg->hw_closid); cfg->have_new_ctrl = true; + /* Set extension ctrl default value, e.g. priority/hardlimit */ + for_each_extend_ctrl_type(t) { + cfg->new_ctrl[t] = r->default_ctrl[t]; + } } }
@@ -626,7 +719,8 @@ static void rdtgroup_init_mba(struct resctrl_resource *r, u32 closid) static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) { struct resctrl_staged_config *cfg; - enum resctrl_conf_type t = s->conf_type; + enum resctrl_conf_type conf_type = s->conf_type; + enum resctrl_ctrl_type ctrl_type; struct rdt_domain *d; struct resctrl_resource *r; u32 used_b = 0; @@ -638,17 +732,17 @@ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) return -EINVAL;
list_for_each_entry(d, &s->res->domains, list) { - cfg = &d->staged_cfg[t]; + cfg = &d->staged_cfg[conf_type]; cfg->have_new_ctrl = false; - cfg->new_ctrl = r->cache.shareable_bits; + cfg->new_ctrl[SCHEMA_COMM] = r->cache.shareable_bits; used_b = r->cache.shareable_bits;
unused_b = used_b ^ (BIT_MASK(r->cache.cbm_len) - 1); unused_b &= BIT_MASK(r->cache.cbm_len) - 1; - cfg->new_ctrl |= unused_b; + cfg->new_ctrl[SCHEMA_COMM] |= unused_b;
/* Ensure cbm does not access out-of-bound */ - tmp_cbm = cfg->new_ctrl; + tmp_cbm = cfg->new_ctrl[SCHEMA_COMM]; if (bitmap_weight(&tmp_cbm, r->cache.cbm_len) < r->cache.min_cbm_bits) { rdt_last_cmd_printf("No space on %s:%d\n", @@ -656,8 +750,16 @@ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) return -ENOSPC; }
- resctrl_cdp_map(clos, closid, t, cfg->hw_closid); + resctrl_cdp_map(clos, closid, conf_type, cfg->hw_closid); cfg->have_new_ctrl = true; + + /* + * Set extension ctrl default value, e.g. priority/hardlimit + * with MPAM capabilities. + */ + for_each_extend_ctrl_type(ctrl_type) { + cfg->new_ctrl[ctrl_type] = r->default_ctrl[ctrl_type]; + } }
return 0; diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 15896f69ca6d..3c056867aedf 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -110,14 +110,29 @@ bool is_resctrl_cdp_enabled(void) return !!resctrl_cdp_enabled; }
+static void +resctrl_ctrl_extend_bits_set(u32 *bitmap, enum resctrl_ctrl_type type) +{ + *bitmap |= BIT(type); +} + +static void resctrl_ctrl_extend_bits_clear(u32 *bitmap) +{ + *bitmap = 0; +} + +bool resctrl_ctrl_extend_bits_match(u32 bitmap, enum resctrl_ctrl_type type) +{ + return bitmap & BIT(type); +} + static void mpam_resctrl_update_component_cfg(struct resctrl_resource *r, - struct rdt_domain *d, struct list_head *opt_list, - struct sd_closid *closid); + struct rdt_domain *d, struct sd_closid *closid);
static void common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, - struct list_head *opt_list, struct msr_param *para); + struct msr_param *para);
static u64 cache_rdmsr(struct rdt_domain *d, struct msr_param *para); static u64 mbw_rdmsr(struct rdt_domain *d, struct msr_param *para); @@ -127,16 +142,16 @@ static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv);
static int common_wrmon(struct rdt_domain *d, void *md_priv);
-static int parse_cbm(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg); +static int parse_cache(char *buf, struct raw_resctrl_resource *r, + struct resctrl_staged_config *cfg, enum resctrl_ctrl_type ctrl_type); static int parse_bw(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg); + struct resctrl_staged_config *cfg, enum resctrl_ctrl_type ctrl_type);
struct raw_resctrl_resource raw_resctrl_resources_all[] = { [RDT_RESOURCE_L3] = { .msr_update = common_wrmsr, .msr_read = cache_rdmsr, - .parse_ctrlval = parse_cbm, + .parse_ctrlval = parse_cache, .format_str = "%d=%0*x", .mon_read = cache_rdmon, .mon_write = common_wrmon, @@ -144,7 +159,7 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { [RDT_RESOURCE_L2] = { .msr_update = common_wrmsr, .msr_read = cache_rdmsr, - .parse_ctrlval = parse_cbm, + .parse_ctrlval = parse_cache, .format_str = "%d=%0*x", .mon_read = cache_rdmon, .mon_write = common_wrmon, @@ -169,33 +184,13 @@ mpam_get_raw_resctrl_resource(enum resctrl_resource_level level) }
/* - * Check whether a cache bit mask is valid. for arm64 MPAM, - * it seems that there are no restrictions according to MPAM - * spec expect for requiring at least one bit. - */ -static bool cbm_validate(char *buf, unsigned long *data, - struct raw_resctrl_resource *r) -{ - u64 val; - int ret; - - ret = kstrtou64(buf, 16, &val); - if (ret) { - rdt_last_cmd_printf("non-hex character in mask %s\n", buf); - return false; - } - - *data = val; - return true; -} - -/* - * Read one cache bit mask (hex). Check that it is valid for the current + * Read one cache schema row. Check that it is valid for the current * resource type. */ static int -parse_cbm(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg) +parse_cache(char *buf, struct raw_resctrl_resource *r, + struct resctrl_staged_config *cfg, + enum resctrl_ctrl_type type) { unsigned long data;
@@ -204,10 +199,24 @@ parse_cbm(char *buf, struct raw_resctrl_resource *r, return -EINVAL; }
- if (!cbm_validate(buf, &data, r)) + switch (type) { + case SCHEMA_COMM: + if (kstrtoul(buf, 16, &data)) + return -EINVAL; + break; + case SCHEMA_PRI: + if (kstrtoul(buf, 10, &data)) + return -EINVAL; + break; + case SCHEMA_HDL: + if (kstrtoul(buf, 10, &data)) + return -EINVAL; + break; + default: return -EINVAL; + }
- cfg->new_ctrl = data; + cfg->new_ctrl[type] = data; cfg->have_new_ctrl = true;
return 0; @@ -258,7 +267,8 @@ static bool bw_validate(char *buf, unsigned long *data,
static int parse_bw(char *buf, struct raw_resctrl_resource *r, - struct resctrl_staged_config *cfg) + struct resctrl_staged_config *cfg, + enum resctrl_ctrl_type type) { unsigned long data;
@@ -267,10 +277,24 @@ parse_bw(char *buf, struct raw_resctrl_resource *r, return -EINVAL; }
- if (!bw_validate(buf, &data, r)) + switch (type) { + case SCHEMA_COMM: + if (!bw_validate(buf, &data, r)) + return -EINVAL; + break; + case SCHEMA_PRI: + if (kstrtoul(buf, 10, &data)) + return -EINVAL; + break; + case SCHEMA_HDL: + if (kstrtoul(buf, 10, &data)) + return -EINVAL; + break; + default: return -EINVAL; + }
- cfg->new_ctrl = data; + cfg->new_ctrl[type] = data; cfg->have_new_ctrl = true;
return 0; @@ -278,14 +302,14 @@ parse_bw(char *buf, struct raw_resctrl_resource *r,
static void common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, - struct list_head *opt_list, struct msr_param *para) + struct msr_param *para) { struct sync_args args; struct mpam_resctrl_dom *dom;
dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom);
- mpam_resctrl_update_component_cfg(r, d, opt_list, para->closid); + mpam_resctrl_update_component_cfg(r, d, para->closid);
/* * so far we have accomplished configuration replication, @@ -302,31 +326,75 @@ static u64 cache_rdmsr(struct rdt_domain *d, struct msr_param *para) struct mpam_resctrl_dom *dom;
args.closid = *para->closid; - args.reg = MPAMCFG_CPBM;
- dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + switch (para->type) { + case SCHEMA_COMM: + args.reg = MPAMCFG_CPBM; + break; + case SCHEMA_PRI: + args.reg = MPAMCFG_PRI; + break; + default: + return 0; + }
+ dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); mpam_component_get_config(dom->comp, &args, &result);
+ switch (para->type) { + case SCHEMA_PRI: + result = MPAMCFG_PRI_GET(result); + break; + default: + break; + } + return result; }
static u64 mbw_rdmsr(struct rdt_domain *d, struct msr_param *para) { - u64 max; u32 result; struct sync_args args; struct mpam_resctrl_dom *dom;
args.closid = *para->closid; - args.reg = MPAMCFG_MBW_MAX;
- dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); + /* + * By default, software sets memory bandwidth through + * MPAMCFG_MBW_MAX rather than MPAMCFG_MBW_PBM. + */ + switch (para->type) { + case SCHEMA_COMM: + args.reg = MPAMCFG_MBW_MAX; + break; + case SCHEMA_HDL: + args.reg = MPAMCFG_MBW_MAX; + break; + case SCHEMA_PRI: + args.reg = MPAMCFG_PRI; + break; + default: + return 0; + }
+ dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); mpam_component_get_config(dom->comp, &args, &result);
- max = MBW_MAX_GET(result); - return roundup((max * 100) / 64, 5); + switch (para->type) { + case SCHEMA_COMM: + result = roundup((MBW_MAX_GET(result) * 100) / 64, 5); + break; + case SCHEMA_PRI: + result = MPAMCFG_PRI_GET(result); + break; + case SCHEMA_HDL: + result = MBW_MAX_GET_HDL(result); + break; + default: + break; + } + + return result; }
/* @@ -649,6 +717,52 @@ static int cdpl2_enable(void) return try_to_enable_cdp(RDT_RESOURCE_L2); }
+static void basic_ctrl_enable(void) +{ + struct mpam_resctrl_res *res; + struct resctrl_resource *r; + + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + /* At least SCHEMA_COMM is supported */ + resctrl_ctrl_extend_bits_set(&r->ctrl_extend_bits, SCHEMA_COMM); + } +} + +static int extend_ctrl_enable(enum resctrl_ctrl_type type) +{ + bool match = false; + struct resctrl_resource *r; + struct raw_resctrl_resource *rr; + struct mpam_resctrl_res *res; + + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + rr = r->res; + if ((type == SCHEMA_PRI && rr->pri_wd) || + (type == SCHEMA_HDL && rr->hdl_wd)) { + resctrl_ctrl_extend_bits_set(&r->ctrl_extend_bits, type); + match = true; + } + } + + if (!match) + return -EINVAL; + + return 0; +} + +static void extend_ctrl_disable(void) +{ + struct resctrl_resource *r; + struct mpam_resctrl_res *res; + + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + resctrl_ctrl_extend_bits_clear(&r->ctrl_extend_bits); + } +} + int parse_rdtgroupfs_options(char *data) { char *token; @@ -656,6 +770,7 @@ int parse_rdtgroupfs_options(char *data) int ret = 0;
disable_cdp(); + extend_ctrl_disable();
while ((token = strsep(&o, ",")) != NULL) { if (!*token) { @@ -671,12 +786,22 @@ int parse_rdtgroupfs_options(char *data) ret = cdpl2_enable(); if (ret) goto out; + } else if (!strcmp(token, "priority")) { + ret = extend_ctrl_enable(SCHEMA_PRI); + if (ret) + goto out; + } else if (!strcmp(token, "hardlimit")) { + ret = extend_ctrl_enable(SCHEMA_HDL); + if (ret) + goto out; } else { ret = -EINVAL; goto out; } }
+ basic_ctrl_enable(); + return 0;
out: @@ -1427,45 +1552,70 @@ void __mpam_sched_in(void)
static void mpam_update_from_resctrl_cfg(struct mpam_resctrl_res *res, - u32 resctrl_cfg, struct mpam_config *mpam_cfg) + u32 resctrl_cfg, enum resctrl_ctrl_type ctrl_type, + struct mpam_config *mpam_cfg) { - if (res == &mpam_resctrl_exports[RDT_RESOURCE_MC]) { - u64 range; + switch (ctrl_type) { + case SCHEMA_COMM: + if (res == &mpam_resctrl_exports[RDT_RESOURCE_MC]) { + u64 range; + + /* For MBA cfg is a percentage of .. */ + if (res->resctrl_mba_uses_mbw_part) { + /* .. the number of bits we can set */ + range = res->class->mbw_pbm_bits; + mpam_cfg->mbw_pbm = + (resctrl_cfg * range) / MAX_MBA_BW; + mpam_set_feature(mpam_feat_mbw_part, &mpam_cfg->valid); + } else { + /* .. the number of fractions we can represent */ + mpam_cfg->mbw_max = + bw_max_mask[(resctrl_cfg / 5 - 1) % + ARRAY_SIZE(bw_max_mask)];
- /* For MBA cfg is a percentage of .. */ - if (res->resctrl_mba_uses_mbw_part) { - /* .. the number of bits we can set */ - range = res->class->mbw_pbm_bits; - mpam_cfg->mbw_pbm = (resctrl_cfg * range) / MAX_MBA_BW; - mpam_set_feature(mpam_feat_mbw_part, &mpam_cfg->valid); + mpam_set_feature(mpam_feat_mbw_max, &mpam_cfg->valid); + } } else { - /* .. the number of fractions we can represent */ - mpam_cfg->mbw_max = bw_max_mask[(resctrl_cfg / 5 - 1) % - ARRAY_SIZE(bw_max_mask)]; - - mpam_set_feature(mpam_feat_mbw_max, &mpam_cfg->valid); + /* + * Nothing clever here as mpam_resctrl_pick_caches() + * capped the size at RESCTRL_MAX_CBM. + */ + mpam_cfg->cpbm = resctrl_cfg; + mpam_set_feature(mpam_feat_cpor_part, &mpam_cfg->valid); } - } else { - /* - * Nothing clever here as mpam_resctrl_pick_caches() - * capped the size at RESCTRL_MAX_CBM. - */ - mpam_cfg->cpbm = resctrl_cfg; - mpam_set_feature(mpam_feat_cpor_part, &mpam_cfg->valid); + break; + case SCHEMA_PRI: + mpam_cfg->dspri = resctrl_cfg; + mpam_cfg->intpri = resctrl_cfg; + mpam_set_feature(mpam_feat_dspri_part, &mpam_cfg->valid); + mpam_set_feature(mpam_feat_intpri_part, &mpam_cfg->valid); + break; + case SCHEMA_HDL: + mpam_cfg->hdl = resctrl_cfg; + mpam_set_feature(mpam_feat_part_hdl, &mpam_cfg->valid); + break; + default: + break; } }
+/* + * Copying all ctrl types at once is more efficient, as it + * only needs to refresh the devices' state once through + * mpam_component_config; each feature will be checked + * again when applying the configuration. + */ static void mpam_resctrl_update_component_cfg(struct resctrl_resource *r, - struct rdt_domain *d, struct list_head *opt_list, - struct sd_closid *closid) + struct rdt_domain *d, struct sd_closid *closid) { struct mpam_resctrl_dom *dom; struct mpam_resctrl_res *res; struct mpam_config *slave_mpam_cfg; + enum resctrl_ctrl_type type; u32 intpartid = closid->intpartid; u32 reqpartid = closid->reqpartid; - u32 resctrl_cfg = d->ctrl_val[intpartid]; + u32 resctrl_cfg;
lockdep_assert_held(&resctrl_group_mutex);
@@ -1486,9 +1636,18 @@ mpam_resctrl_update_component_cfg, slave_mpam_cfg = &dom->comp->cfg[reqpartid]; if (WARN_ON_ONCE(!slave_mpam_cfg)) return; - slave_mpam_cfg->valid = 0; - mpam_update_from_resctrl_cfg(res, resctrl_cfg, slave_mpam_cfg); + + for_each_ctrl_type(type) { + /* + * No need to check whether this ctrl type is enabled: disabled + * ctrls still get a default configuration, and each feature + * type is rechecked when configuring the mpam devices. + */ + resctrl_cfg = d->ctrl_val[type][intpartid]; + mpam_update_from_resctrl_cfg(res, resctrl_cfg, + type, slave_mpam_cfg); + } }
static void mpam_reset_cfg(struct mpam_resctrl_res *res, @@ -1497,11 +1656,14 @@ static void mpam_reset_cfg(struct mpam_resctrl_res *res, { int i; struct resctrl_resource *r = &res->resctrl_res; + enum resctrl_ctrl_type type;
for (i = 0; i != mpam_sysprops_num_partid(); i++) { - mpam_update_from_resctrl_cfg(res, r->default_ctrl, - &dom->comp->cfg[i]); - d->ctrl_val[i] = r->default_ctrl; + for_each_ctrl_type(type) { + mpam_update_from_resctrl_cfg(res, r->default_ctrl[type], + type, &dom->comp->cfg[i]); + d->ctrl_val[type][i] = r->default_ctrl[type]; + } } }
diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index b01716392a65..a06bd19be485 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -64,6 +64,7 @@ static int mpam_resctrl_setup_domain(unsigned int cpu, struct mpam_component *comp_iter, *comp; u32 num_partid; u32 **ctrlval_ptr; + enum resctrl_ctrl_type type;
num_partid = mpam_sysprops_num_partid();
@@ -88,12 +89,14 @@ static int mpam_resctrl_setup_domain(unsigned int cpu, dom->resctrl_dom.id = comp->comp_id; cpumask_set_cpu(cpu, &dom->resctrl_dom.cpu_mask);
- ctrlval_ptr = &dom->resctrl_dom.ctrl_val; - *ctrlval_ptr = kmalloc_array(num_partid, + for_each_ctrl_type(type) { + ctrlval_ptr = &dom->resctrl_dom.ctrl_val[type]; + *ctrlval_ptr = kmalloc_array(num_partid, sizeof(**ctrlval_ptr), GFP_KERNEL); - if (!*ctrlval_ptr) { - kfree(dom); - return -ENOMEM; + if (!*ctrlval_ptr) { + kfree(dom); + return -ENOMEM; + } }
/* TODO: this list should be sorted */ @@ -331,6 +334,13 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) struct resctrl_resource *r = &res->resctrl_res; struct raw_resctrl_resource *rr = NULL;
+ if (class && !r->default_ctrl) { + r->default_ctrl = kcalloc(SCHEMA_NUM_CTRL_TYPE, + sizeof(*r->default_ctrl), GFP_KERNEL); + if (!r->default_ctrl) + return -ENOMEM; + } + if (class == mpam_resctrl_exports[RDT_RESOURCE_SMMU].class) { return 0; } else if (class == mpam_resctrl_exports[RDT_RESOURCE_MC].class) { @@ -363,7 +373,7 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->mbw.min_bw = MAX_MBA_BW / ((1ULL << class->bwa_wd) - 1); /* the largest mbw_max is 100 */ - r->default_ctrl = 100; + r->default_ctrl[SCHEMA_COMM] = 100; } /* Just in case we have an excessive number of bits */ if (!r->mbw.min_bw) @@ -381,6 +391,9 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) rdt_alloc_capable = true; r->mon_capable = true; r->mon_enabled = true; + /* Export memory bandwidth hardlimit, default active hardlimit */ + rr->hdl_wd = 2; + r->default_ctrl[SCHEMA_HDL] = rr->hdl_wd - 1; } else if (class == mpam_resctrl_exports[RDT_RESOURCE_L3].class) { r->rid = RDT_RESOURCE_L3; rr = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L3); @@ -390,14 +403,14 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->name = "L3";
r->cache.cbm_len = class->cpbm_wd; - r->default_ctrl = GENMASK(class->cpbm_wd - 1, 0); + r->default_ctrl[SCHEMA_COMM] = GENMASK(class->cpbm_wd - 1, 0); /* * Which bits are shared with other ...things... * Unknown devices use partid-0 which uses all the bitmap * fields. Until we configured the SMMU and GIC not to do this * 'all the bits' is the correct answer here. */ - r->cache.shareable_bits = r->default_ctrl; + r->cache.shareable_bits = r->default_ctrl[SCHEMA_COMM]; r->cache.min_cbm_bits = 1;
if (mpam_has_feature(mpam_feat_cpor_part, class->features)) { @@ -423,14 +436,14 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->name = "L2";
r->cache.cbm_len = class->cpbm_wd; - r->default_ctrl = GENMASK(class->cpbm_wd - 1, 0); + r->default_ctrl[SCHEMA_COMM] = GENMASK(class->cpbm_wd - 1, 0); /* * Which bits are shared with other ...things... * Unknown devices use partid-0 which uses all the bitmap * fields. Until we configured the SMMU and GIC not to do this * 'all the bits' is the correct answer here. */ - r->cache.shareable_bits = r->default_ctrl; + r->cache.shareable_bits = r->default_ctrl[SCHEMA_COMM];
if (mpam_has_feature(mpam_feat_cpor_part, class->features)) { r->alloc_capable = true; @@ -452,8 +465,10 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) rr->num_intpartid = class->num_intpartid; rr->num_pmg = class->num_pmg;
+ /* Export priority setting, default highest priority */ rr->pri_wd = max(class->intpri_wd, class->dspri_wd); - rr->hdl_wd = 2; + r->default_ctrl[SCHEMA_PRI] = (rr->pri_wd > 0) ? + rr->pri_wd - 1 : 0; }
return 0; diff --git a/include/linux/resctrlfs.h b/include/linux/resctrlfs.h index 684bcdba51de..7f1ff6e816f5 100644 --- a/include/linux/resctrlfs.h +++ b/include/linux/resctrlfs.h @@ -32,6 +32,8 @@ struct resctrl_cache { * @min_bw: Minimum memory bandwidth percentage user can request * @bw_gran: Granularity at which the memory bandwidth is allocated * @delay_linear: True if memory B/W delay is in linear scale + * @ctrl_extend_bits: Indicates if there are extra ctrl capabilities supported. + * e.g. priority/hardlimit. */ struct resctrl_membw { u32 min_bw; @@ -56,7 +58,9 @@ struct resctrl_resource {
bool cdp_capable; bool cdp_enable; - u32 default_ctrl; + u32 *default_ctrl; + + u32 ctrl_extend_bits;
void *res; };
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
The reset value of the MPAMCFG_PRI register is also used as the software default after probing resources. Two fields, hwdef_intpri and hwdef_dspri, are added to the mpam_device structure to store this default priority setting.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam_resource.h | 5 ++++- arch/arm64/kernel/mpam/mpam_device.c | 27 +++++++++++++++----------- arch/arm64/kernel/mpam/mpam_device.h | 4 ++++ 3 files changed, 24 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index 8868da13411b..270ae49f306e 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -134,7 +134,10 @@ * MPAMCFG_MBW_MAX SET - temp Hard code */ #define MPAMCFG_PRI_DSPRI_SHIFT 16 -#define MPAMCFG_PRI_GET(r) ((r & GENMASK(15, 0)) | (r & GENMASK(16, 31)) >> 16) +#define MPAMCFG_INTPRI_GET(r) (r & GENMASK(15, 0)) +#define MPAMCFG_DSPRI_GET(r) ((r & GENMASK(31, 16)) >> 16) +/* Always same if both supported */ +#define MPAMCFG_PRI_GET(r) (MPAMCFG_DSPRI_GET(r) | MPAMCFG_INTPRI_GET(r))
/* MPAMF_PRI_IDR - MPAM features priority partitioning ID register */ #define MPAMF_PRI_IDR_HAS_INTPRI BIT(0) diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 2e4cf61dc797..b8686e6c6669 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -206,26 +206,37 @@ static int mpam_device_probe(struct mpam_device *dev) /* Priority partitioning */ if (MPAMF_IDR_HAS_PRI_PART(hwfeatures)) { u32 pri_features = mpam_read_reg(dev, MPAMF_PRI_IDR); + u32 hwdef_pri = mpam_read_reg(dev, MPAMCFG_PRI);
pr_debug("probe: probed PRI_PART\n");
dev->intpri_wd = (pri_features & MPAMF_PRI_IDR_INTPRI_WD) >> MPAMF_PRI_IDR_INTPRI_WD_SHIFT; - if (dev->intpri_wd && (pri_features & - MPAMF_PRI_IDR_HAS_INTPRI)) { + if (dev->intpri_wd && (pri_features & MPAMF_PRI_IDR_HAS_INTPRI)) { mpam_set_feature(mpam_feat_intpri_part, &dev->features); + dev->hwdef_intpri = MPAMCFG_INTPRI_GET(hwdef_pri); if (pri_features & MPAMF_PRI_IDR_INTPRI_0_IS_LOW) mpam_set_feature(mpam_feat_intpri_part_0_low, &dev->features); + else + /* keep higher value higher priority */ + dev->hwdef_intpri = GENMASK(dev->intpri_wd - 1, 0) & + ~dev->hwdef_intpri; + }
dev->dspri_wd = (pri_features & MPAMF_PRI_IDR_DSPRI_WD) >> MPAMF_PRI_IDR_DSPRI_WD_SHIFT; if (dev->dspri_wd && (pri_features & MPAMF_PRI_IDR_HAS_DSPRI)) { mpam_set_feature(mpam_feat_dspri_part, &dev->features); + dev->hwdef_dspri = MPAMCFG_DSPRI_GET(hwdef_pri); if (pri_features & MPAMF_PRI_IDR_DSPRI_0_IS_LOW) mpam_set_feature(mpam_feat_dspri_part_0_low, &dev->features); + else + /* keep higher value higher priority */ + dev->hwdef_dspri = GENMASK(dev->dspri_wd - 1, 0) & + ~dev->hwdef_dspri; } }
@@ -723,8 +734,7 @@ static void mpam_reset_device_bitmap(struct mpam_device *dev, u16 reg, u16 wd) static void mpam_reset_device_config(struct mpam_component *comp, struct mpam_device *dev, u32 partid) { - u16 intpri = GENMASK(dev->intpri_wd, 0); - u16 dspri = GENMASK(dev->dspri_wd, 0); + u16 intpri, dspri; u32 pri_val = 0; u32 mbw_max;
@@ -751,13 +761,8 @@ static void mpam_reset_device_config(struct mpam_component *comp,
if (mpam_has_feature(mpam_feat_intpri_part, dev->features) || mpam_has_feature(mpam_feat_dspri_part, dev->features)) { - /* aces high? */ - if (!mpam_has_feature(mpam_feat_intpri_part_0_low, - dev->features)) - intpri = 0; - if (!mpam_has_feature(mpam_feat_dspri_part_0_low, - dev->features)) - dspri = 0; + intpri = dev->hwdef_intpri; + dspri = dev->hwdef_dspri;
if (mpam_has_feature(mpam_feat_intpri_part, dev->features)) pri_val |= intpri; diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h index 9930ca70e0ce..b1f852e65d83 100644 --- a/arch/arm64/kernel/mpam/mpam_device.h +++ b/arch/arm64/kernel/mpam/mpam_device.h @@ -54,6 +54,10 @@ struct mpam_device { u16 num_pmg; u16 num_csu_mon; u16 num_mbwu_mon; + + /* for reset device MPAMCFG_PRI */ + u16 hwdef_intpri; + u16 hwdef_dspri; };
/*
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Store the default priority in the mpam class structure, read from the devices' intpri_wd and dspri_wd.
intpri_wd and dspri_wd give the number of implemented bits in the internal/downstream priority fields of MPAMCFG_PRI. When INTPRI_0_IS_LOW /DSPRI_0_IS_LOW is not set, the input priority from user space (higher value means higher priority) must be rotated into the target encoding (higher value means lower priority), bounded by the number of implemented bits.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_device.c | 13 +++++++++++-- arch/arm64/kernel/mpam/mpam_device.h | 4 ++++ arch/arm64/kernel/mpam/mpam_setup.c | 10 +++++++--- 3 files changed, 22 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index b8686e6c6669..0413eac0ba5e 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -362,6 +362,8 @@ static void mpam_enable_squash_features(void) class->num_pmg = dev->num_pmg; class->num_csu_mon = dev->num_csu_mon; class->num_mbwu_mon = dev->num_mbwu_mon; + class->hwdef_intpri = dev->hwdef_intpri; + class->hwdef_dspri = dev->hwdef_dspri; spin_unlock_irqrestore(&dev->lock, flags); }
@@ -764,10 +766,17 @@ static void mpam_reset_device_config(struct mpam_component *comp, intpri = dev->hwdef_intpri; dspri = dev->hwdef_dspri;
- if (mpam_has_feature(mpam_feat_intpri_part, dev->features)) + if (mpam_has_feature(mpam_feat_intpri_part, dev->features)) { + if (!mpam_has_feature(mpam_feat_intpri_part_0_low, dev->features)) + intpri = GENMASK(dev->intpri_wd - 1, 0) & ~intpri; pri_val |= intpri; - if (mpam_has_feature(mpam_feat_dspri_part, dev->features)) + } + + if (mpam_has_feature(mpam_feat_dspri_part, dev->features)) { + if (!mpam_has_feature(mpam_feat_dspri_part_0_low, dev->features)) + dspri = GENMASK(dev->dspri_wd - 1, 0) & ~dspri; pri_val |= (dspri << MPAMCFG_PRI_DSPRI_SHIFT); + }
mpam_write_reg(dev, MPAMCFG_PRI, pri_val); } diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h index b1f852e65d83..fc5f7c292b6f 100644 --- a/arch/arm64/kernel/mpam/mpam_device.h +++ b/arch/arm64/kernel/mpam/mpam_device.h @@ -118,6 +118,10 @@ struct mpam_class { u16 num_pmg; u16 num_csu_mon; u16 num_mbwu_mon; + + /* for reset class MPAMCFG_PRI */ + u16 hwdef_intpri; + u16 hwdef_dspri; };
/* System wide properties */ diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index a06bd19be485..dc7490890349 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -465,10 +465,14 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) rr->num_intpartid = class->num_intpartid; rr->num_pmg = class->num_pmg;
- /* Export priority setting, default highest priority */ + /* + * Export priority setting, default priority from hardware, + * no clever here, we don't need to define another default + * value. + */ rr->pri_wd = max(class->intpri_wd, class->dspri_wd); - r->default_ctrl[SCHEMA_PRI] = (rr->pri_wd > 0) ? - rr->pri_wd - 1 : 0; + r->default_ctrl[SCHEMA_PRI] = max(class->hwdef_intpri, + class->hwdef_dspri); }
return 0;
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Use an array to store each extended ctrl's max width, so that every input value from the schemata can be checked against it.

Note that a useful max width is at least 1: 0 means the ctrl is meaningless, and a value greater than 1 means there are multiple selectable choices.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 3 +-- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 6 ++---- arch/arm64/kernel/mpam/mpam_resctrl.c | 3 +-- arch/arm64/kernel/mpam/mpam_setup.c | 12 +++++++----- 4 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 6641180b4c3a..2e327ee2f560 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -315,8 +315,7 @@ struct raw_resctrl_resource { u16 num_intpartid; u16 num_pmg;
- u16 pri_wd; - u16 hdl_wd; + u16 extend_ctrls_wd[SCHEMA_NUM_CTRL_TYPE];
void (*msr_update)(struct resctrl_resource *r, struct rdt_domain *d, struct msr_param *para); diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index e356dcbeb246..ce8cf3623ad5 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -87,10 +87,8 @@ static int add_schema(enum resctrl_conf_type t, struct resctrl_resource *r) rr = r->res; INIT_LIST_HEAD(&s->schema_ctrl_list); for_each_extend_ctrl_type(type) { - if ((type == SCHEMA_PRI && !rr->pri_wd) || - (type == SCHEMA_HDL && !rr->hdl_wd) || - !resctrl_ctrl_extend_bits_match(r->ctrl_extend_bits, - type)) + if (!resctrl_ctrl_extend_bits_match(r->ctrl_extend_bits, type) || + !rr->extend_ctrls_wd[type]) continue;
sc = kzalloc(sizeof(*sc), GFP_KERNEL); diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 3c056867aedf..75399887c1ff 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -739,8 +739,7 @@ static int extend_ctrl_enable(enum resctrl_ctrl_type type) for_each_supported_resctrl_exports(res) { r = &res->resctrl_res; rr = r->res; - if ((type == SCHEMA_PRI && rr->pri_wd) || - (type == SCHEMA_HDL && rr->hdl_wd)) { + if (rr->extend_ctrls_wd[type]) { resctrl_ctrl_extend_bits_set(&r->ctrl_extend_bits, type); match = true; } diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index dc7490890349..ef922a796ff8 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -392,8 +392,8 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->mon_capable = true; r->mon_enabled = true; /* Export memory bandwidth hardlimit, default active hardlimit */ - rr->hdl_wd = 2; - r->default_ctrl[SCHEMA_HDL] = rr->hdl_wd - 1; + rr->extend_ctrls_wd[SCHEMA_HDL] = 2; + r->default_ctrl[SCHEMA_HDL] = 1; } else if (class == mpam_resctrl_exports[RDT_RESOURCE_L3].class) { r->rid = RDT_RESOURCE_L3; rr = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L3); @@ -466,11 +466,13 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) rr->num_pmg = class->num_pmg;
/* - * Export priority setting, default priority from hardware, - * no clever here, we don't need to define another default + * Export priority setting, extend_ctrls_wd represents the + * max level of control we can export. this default priority + * is just from hardware, no need to define another default * value. */ - rr->pri_wd = max(class->intpri_wd, class->dspri_wd); + rr->extend_ctrls_wd[SCHEMA_PRI] = 1 << max(class->intpri_wd, + class->dspri_wd); r->default_ctrl[SCHEMA_PRI] = max(class->hwdef_intpri, class->hwdef_dspri); }
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
According to the MPAM spec, MPAMCFG_INTPARTID.INTERNAL must be set when narrowing reqpartid to intpartid, and this must be done before writing MPAMCFG_PART_SEL when narrowing is implemented. So consolidate the narrowing into a single helper that is used whenever narrowing is supported.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam_resource.h | 2 +- arch/arm64/kernel/mpam/mpam_device.c | 47 +++++++++++++++----------- 2 files changed, 28 insertions(+), 21 deletions(-)
diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index 270ae49f306e..d7d9549be668 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -101,7 +101,7 @@ /* * Set MPAMCFG_PART_SEL internal bit */ -#define PART_SEL_SET_INTERNAL(r) (r | BIT(16)) +#define MPAMCFG_PART_SEL_INTERNAL BIT(16)
/* MPAM_ESR */ #define MPAMF_ESR_ERRCODE_MASK ((BIT(4) - 1) << 24) diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 0413eac0ba5e..6cc939e173f9 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -743,7 +743,7 @@ static void mpam_reset_device_config(struct mpam_component *comp, lockdep_assert_held(&dev->lock);
if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) - partid = PART_SEL_SET_INTERNAL(partid); + partid = partid | MPAMCFG_PART_SEL_INTERNAL; mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); wmb(); /* subsequent writes must be applied to our new partid */
@@ -1120,6 +1120,25 @@ static void mpam_device_narrow_map(struct mpam_device *dev, u32 partid, mpam_write_reg(dev, MPAMCFG_INTPARTID, intpartid); }
+/* + * partid must be narrowed to intpartid if this feature is implemented; + * check this before writing to the MPAMCFG_PART_SEL register. + */ +static int try_to_narrow_device_intpartid(struct mpam_device *dev, + u32 *partid, u32 intpartid) +{ + if (!mpam_has_part_sel(dev->features)) + return -EINVAL; + + if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) { + mpam_device_narrow_map(dev, *partid, intpartid); + /* narrowing succeeded, so set bit 16 (INTERNAL) */ + *partid = intpartid | MPAMCFG_PART_SEL_INTERNAL; + } + + return 0; +} + static int mpam_device_config(struct mpam_device *dev, struct sd_closid *closid, struct mpam_config *cfg) @@ -1137,20 +1156,9 @@ mpam_device_config(struct mpam_device *dev, struct sd_closid *closid,
lockdep_assert_held(&dev->lock);
- if (!mpam_has_part_sel(dev->features)) + if (try_to_narrow_device_intpartid(dev, &partid, intpartid)) return -EINVAL;
- /* - * intpartid should be narrowed the first time, - * upstream(resctrl) keep this order - */ - if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) { - if (cfg && mpam_has_feature(mpam_feat_part_nrw, cfg->valid)) - mpam_device_narrow_map(dev, partid, intpartid); - /* intpartid success, set 16 bit to 1*/ - partid = PART_SEL_SET_INTERNAL(intpartid); - } - mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); wmb(); /* subsequent writes must be applied to our new partid */
@@ -1386,7 +1394,7 @@ static void mpam_component_read_mpamcfg(void *_ctx) struct sync_args *args = ctx->args; u64 val; u16 reg; - u32 partid; + u32 partid, intpartid;
if (!args) return; @@ -1394,6 +1402,7 @@ static void mpam_component_read_mpamcfg(void *_ctx) reg = args->reg;
partid = args->closid.reqpartid; + intpartid = args->closid.intpartid;
list_for_each_entry(dev, &comp->devices, comp_list) { if (!cpumask_test_cpu(smp_processor_id(), @@ -1401,13 +1410,11 @@ static void mpam_component_read_mpamcfg(void *_ctx) continue;
spin_lock_irqsave(&dev->lock, flags); - if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) { - /* - * partid is possible reqpartid or intpartid, - * if narrow enabled, it should be intpartid. - */ - partid = PART_SEL_SET_INTERNAL(args->closid.intpartid); + if (try_to_narrow_device_intpartid(dev, &partid, intpartid)) { + spin_unlock_irqrestore(&dev->lock, flags); + return; } + mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); wmb(); val = mpam_read_reg(dev, reg);
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Reading or writing registers directly to get and set configuration is unfriendly to extension and legibility. Multiple types of schemata ctrls are supported, and each value must be converted to and from the corresponding register field's encoding and range as defined by the MPAM spec. Using an event id to indicate which type of configuration to fetch is easier to follow.
Besides, different hook events have different bounds on their settings, such as bwa_wd for the adaptive range conversion when writing configuration; such a bound can be associated with the specific event used for the conversion.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 2 +- arch/arm64/include/asm/mpam_resource.h | 17 +++--- arch/arm64/include/asm/resctrl.h | 5 ++ arch/arm64/kernel/mpam/mpam_ctrlmon.c | 2 +- arch/arm64/kernel/mpam/mpam_device.c | 50 +++++++++++++++-- arch/arm64/kernel/mpam/mpam_internal.h | 2 - arch/arm64/kernel/mpam/mpam_resctrl.c | 78 ++++++-------------------- 7 files changed, 75 insertions(+), 81 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 2e327ee2f560..7dd34caa8a86 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -323,7 +323,7 @@ struct raw_resctrl_resource {
int data_width; const char *format_str; - int (*parse_ctrlval)(char *buf, struct raw_resctrl_resource *r, + int (*parse_ctrlval)(char *buf, struct resctrl_resource *r, struct resctrl_staged_config *cfg, enum resctrl_ctrl_type ctrl_type);
diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index d7d9549be668..72cc9029946f 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -69,18 +69,17 @@ #define CPBM_WD_MASK 0xFFFF #define CPBM_MASK 0x7FFF
-#define BWA_WD 6 /* hard code for P680 */ -#define MBW_MAX_MASK 0xFC00 -#define MBW_MAX_HARDLIM BIT(31) +#define MBW_MAX_HARDLIM BIT(31) +#define MBW_PROP_HARDLIM BIT(31) +#define MBW_MAX_MASK GENMASK(15, 0) #define MBW_MAX_BWA_FRACT(w) GENMASK(w - 1, 0) -#define MBW_MAX_SET(v) (MBW_MAX_HARDLIM|((v) << (16 - BWA_WD))) -#define MBW_MAX_GET(v) (((v) & MBW_MAX_MASK) >> (16 - BWA_WD)) -#define MBW_MAX_SET_HDL(r) (r | MBW_MAX_HARDLIM) -#define MBW_MAX_GET_HDL(r) (r & MBW_MAX_HARDLIM) +#define MBW_MAX_SET(v, w) (v << (16 - w)) /* MPAMCFG_MBW_PROP */ -#define MBW_PROP_HARDLIM BIT(31) -#define MBW_PROP_SET_HDL(r) (r | MBW_PROP_HARDLIM) +#define MBW_PROP_SET_HDL(r) (r | MBW_PROP_HARDLIM) /* MPAMCFG_MBW_MAX */ +#define MBW_MAX_SET_HDL(r) (r | MBW_MAX_HARDLIM) +#define MBW_MAX_GET_HDL(r) ((r & MBW_MAX_HARDLIM) >> 31) +#define MBW_MAX_GET(v, w) (((v) & MBW_MAX_MASK) >> (16 - w))
#define MSMON_MATCH_PMG BIT(17) #define MSMON_MATCH_PARTID BIT(16) diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 0c1f2cef0c36..37e750029fbc 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -22,6 +22,11 @@ enum rdt_event_id { QOS_L3_MBM_TOTAL_EVENT_ID = 0x02, QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
+ QOS_CAT_CPBM_EVENT_ID = 0x04, + QOS_CAT_PRI_EVENT_ID = 0x05, + QOS_MBA_MAX_EVENT_ID = 0x06, + QOS_MBA_PRI_EVENT_ID = 0x07, + QOS_MBA_HDL_EVENT_ID = 0x08, /* Must be the last */ RESCTRL_NUM_EVENT_IDS, }; diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index ce8cf3623ad5..019cb57e5c46 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -273,7 +273,7 @@ parse_line(char *line, struct resctrl_resource *r, list_for_each_entry(d, &r->domains, list) { if (d->id == dom_id) { resctrl_cdp_map(clos, closid, conf_type, hw_closid); - if (rr->parse_ctrlval(dom, rr, + if (rr->parse_ctrlval(dom, r, &d->staged_cfg[conf_type], ctrl_type)) return -EINVAL; d->staged_cfg[conf_type].hw_closid = hw_closid; diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 6cc939e173f9..1202a5795e97 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -753,7 +753,7 @@ static void mpam_reset_device_config(struct mpam_component *comp, mpam_reset_device_bitmap(dev, MPAMCFG_MBW_PBM, dev->mbw_pbm_bits); if (mpam_has_feature(mpam_feat_mbw_max, dev->features)) { - mbw_max = MBW_MAX_SET(MBW_MAX_BWA_FRACT(dev->bwa_wd)); + mbw_max = MBW_MAX_SET(MBW_MAX_BWA_FRACT(dev->bwa_wd), dev->bwa_wd); mbw_max = MBW_MAX_SET_HDL(mbw_max); mpam_write_reg(dev, MPAMCFG_MBW_MAX, mbw_max); } @@ -1187,7 +1187,7 @@ mpam_device_config(struct mpam_device *dev, struct sd_closid *closid,
if (mpam_has_feature(mpam_feat_mbw_max, dev->features)) { if (cfg && mpam_has_feature(mpam_feat_mbw_max, cfg->valid)) { - mbw_max = MBW_MAX_SET(cfg->mbw_max); + mbw_max = MBW_MAX_SET(cfg->mbw_max, dev->bwa_wd); if (!mpam_has_feature(mpam_feat_part_hdl, cfg->valid) || (mpam_has_feature(mpam_feat_part_hdl, cfg->valid) && cfg->hdl)) mbw_max = MBW_MAX_SET_HDL(mbw_max); @@ -1392,14 +1392,15 @@ static void mpam_component_read_mpamcfg(void *_ctx) struct mpam_device_sync *ctx = (struct mpam_device_sync *)_ctx; struct mpam_component *comp = ctx->comp; struct sync_args *args = ctx->args; - u64 val; - u16 reg; + u64 val = 0; u32 partid, intpartid; + u32 dspri = 0; + u32 intpri = 0; + u64 range;
if (!args) return;
- reg = args->reg;
partid = args->closid.reqpartid; intpartid = args->closid.intpartid; @@ -1417,7 +1418,44 @@ static void mpam_component_read_mpamcfg(void *_ctx)
mpam_write_reg(dev, MPAMCFG_PART_SEL, partid); wmb(); - val = mpam_read_reg(dev, reg); + + switch (args->eventid) { + case QOS_CAT_CPBM_EVENT_ID: + if (!mpam_has_feature(mpam_feat_cpor_part, dev->features)) + break; + val = mpam_read_reg(dev, MPAMCFG_CPBM); + break; + case QOS_MBA_MAX_EVENT_ID: + if (!mpam_has_feature(mpam_feat_mbw_max, dev->features)) + break; + val = mpam_read_reg(dev, MPAMCFG_MBW_MAX); + range = MBW_MAX_BWA_FRACT(dev->bwa_wd); + val = MBW_MAX_GET(val, dev->bwa_wd) * (MAX_MBA_BW - 1) / range; + break; + case QOS_MBA_HDL_EVENT_ID: + if (!mpam_has_feature(mpam_feat_mbw_max, dev->features)) + break; + val = mpam_read_reg(dev, MPAMCFG_MBW_MAX); + val = MBW_MAX_GET_HDL(val); + break; + case QOS_CAT_PRI_EVENT_ID: + case QOS_MBA_PRI_EVENT_ID: + if (mpam_has_feature(mpam_feat_intpri_part, dev->features)) + intpri = MPAMCFG_INTPRI_GET(val); + if (mpam_has_feature(mpam_feat_dspri_part, dev->features)) + dspri = MPAMCFG_DSPRI_GET(val); + if (!mpam_has_feature(mpam_feat_intpri_part_0_low, + dev->features)) + intpri = GENMASK(dev->intpri_wd - 1, 0) & ~intpri; + if (!mpam_has_feature(mpam_feat_dspri_part_0_low, + dev->features)) + dspri = GENMASK(dev->intpri_wd - 1, 0) & ~dspri; + val = (dspri > intpri) ? dspri : intpri; + break; + default: + break; + } + atomic64_add(val, &ctx->cfg_value); spin_unlock_irqrestore(&dev->lock, flags);
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index cc35dfc73449..0ca58712a8ca 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -41,8 +41,6 @@ struct sync_args { u32 mon; bool match_pmg; enum rdt_event_id eventid; - /*for reading msr*/ - u16 reg; };
struct mpam_device_sync { diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 75399887c1ff..7c519d93c1bb 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -142,9 +142,9 @@ static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv);
static int common_wrmon(struct rdt_domain *d, void *md_priv);
-static int parse_cache(char *buf, struct raw_resctrl_resource *r, +static int parse_cache(char *buf, struct resctrl_resource *r, struct resctrl_staged_config *cfg, enum resctrl_ctrl_type ctrl_type); -static int parse_bw(char *buf, struct raw_resctrl_resource *r, +static int parse_bw(char *buf, struct resctrl_resource *r, struct resctrl_staged_config *cfg, enum resctrl_ctrl_type ctrl_type);
struct raw_resctrl_resource raw_resctrl_resources_all[] = { @@ -188,7 +188,7 @@ mpam_get_raw_resctrl_resource(enum resctrl_resource_level level) * resource type. */ static int -parse_cache(char *buf, struct raw_resctrl_resource *r, +parse_cache(char *buf, struct resctrl_resource *r, struct resctrl_staged_config *cfg, enum resctrl_ctrl_type type) { @@ -222,32 +222,8 @@ parse_cache(char *buf, struct raw_resctrl_resource *r, return 0; }
-/* define bw_min as 5 percentage, that are 5% ~ 100% which cresponding masks: */ -static u32 bw_max_mask[20] = { - 3, /* 3/64: 5% */ - 6, /* 6/64: 10% */ - 10, /* 10/64: 15% */ - 13, /* 13/64: 20% */ - 16, /* 16/64: 25% */ - 19, /* ... */ - 22, - 26, - 29, - 32, - 35, - 38, - 42, - 45, - 48, - 51, - 54, - 58, - 61, - 63 /* 100% */ -}; - static bool bw_validate(char *buf, unsigned long *data, - struct raw_resctrl_resource *r) + struct resctrl_resource *r) { unsigned long bw; int ret; @@ -258,15 +234,15 @@ static bool bw_validate(char *buf, unsigned long *data, return false; }
- bw = bw < 5 ? 5 : bw; - bw = bw > 100 ? 100 : bw; - *data = roundup(bw, 5); + bw = bw > MAX_MBA_BW ? MAX_MBA_BW : bw; + bw = bw < r->mbw.min_bw ? r->mbw.min_bw : bw; + *data = roundup(bw, r->mbw.bw_gran);
return true; }
static int -parse_bw(char *buf, struct raw_resctrl_resource *r, +parse_bw(char *buf, struct resctrl_resource *r, struct resctrl_staged_config *cfg, enum resctrl_ctrl_type type) { @@ -329,10 +305,10 @@ static u64 cache_rdmsr(struct rdt_domain *d, struct msr_param *para)
switch (para->type) { case SCHEMA_COMM: - args.reg = MPAMCFG_CPBM; + args.eventid = QOS_CAT_CPBM_EVENT_ID; break; case SCHEMA_PRI: - args.reg = MPAMCFG_PRI; + args.eventid = QOS_CAT_PRI_EVENT_ID; default: return 0; } @@ -340,14 +316,6 @@ static u64 cache_rdmsr(struct rdt_domain *d, struct msr_param *para) dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); mpam_component_get_config(dom->comp, &args, &result);
- switch (para->type) { - case SCHEMA_PRI: - result = MPAMCFG_PRI_GET(result); - break; - default: - break; - } - return result; }
@@ -365,13 +333,13 @@ static u64 mbw_rdmsr(struct rdt_domain *d, struct msr_param *para) */ switch (para->type) { case SCHEMA_COMM: - args.reg = MPAMCFG_MBW_MAX; + args.eventid = QOS_MBA_MAX_EVENT_ID; break; case SCHEMA_HDL: - args.reg = MPAMCFG_MBW_MAX; + args.eventid = QOS_MBA_HDL_EVENT_ID; break; case SCHEMA_PRI: - args.reg = MPAMCFG_PRI; + args.eventid = QOS_MBA_PRI_EVENT_ID; break; default: return 0; @@ -380,20 +348,6 @@ static u64 mbw_rdmsr(struct rdt_domain *d, struct msr_param *para) dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); mpam_component_get_config(dom->comp, &args, &result);
- switch (para->type) { - case SCHEMA_COMM: - result = roundup((MBW_MAX_GET(result) * 100) / 64, 5); - break; - case SCHEMA_PRI: - result = MPAMCFG_PRI_GET(result); - break; - case SCHEMA_HDL: - result = MBW_MAX_GET_HDL(result); - break; - default: - break; - } - return result; }
@@ -1568,10 +1522,10 @@ mpam_update_from_resctrl_cfg(struct mpam_resctrl_res *res, mpam_set_feature(mpam_feat_mbw_part, &mpam_cfg->valid); } else { /* .. the number of fractions we can represent */ + range = MBW_MAX_BWA_FRACT(res->class->bwa_wd); + mpam_cfg->mbw_max = (resctrl_cfg * range) / (MAX_MBA_BW - 1); mpam_cfg->mbw_max = - bw_max_mask[(resctrl_cfg / 5 - 1) % - ARRAY_SIZE(bw_max_mask)]; - + (mpam_cfg->mbw_max > range) ? range : mpam_cfg->mbw_max; mpam_set_feature(mpam_feat_mbw_max, &mpam_cfg->valid); } } else {
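The hunks above replace the hard-coded `bw_max_mask[]` percentage table with arithmetic over the hardware's bandwidth-fraction width (`bwa_wd`). A toy sketch of that conversion, assuming `MBW_MAX_BWA_FRACT()` expands to the all-ones value of a `bwa_wd`-bit field (the helper names here are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MBA_BW 100u

/* Hypothetical stand-in for MBW_MAX_BWA_FRACT(): the largest fraction
 * representable in a bwa_wd-bit MBW_MAX field. */
static inline uint32_t bwa_fract(unsigned int bwa_wd)
{
	return (1u << bwa_wd) - 1;
}

/* resctrl percentage -> hardware fraction, clamped to the field range,
 * mirroring the arithmetic in mpam_update_from_resctrl_cfg(). */
static uint32_t pct_to_fract(uint32_t pct, unsigned int bwa_wd)
{
	uint32_t range = bwa_fract(bwa_wd);
	uint32_t v = (pct * range) / (MAX_MBA_BW - 1);

	return v > range ? range : v;
}

/* hardware fraction -> percentage, mirroring the QOS_MBA_MAX_EVENT_ID
 * read-back path in mpam_component_read_mpamcfg(). */
static uint32_t fract_to_pct(uint32_t fract, unsigned int bwa_wd)
{
	uint32_t range = bwa_fract(bwa_wd);

	return fract * (MAX_MBA_BW - 1) / range;
}
```

Note the integer division makes the round trip lossy: with a 6-bit field, 100% maps to fraction 63, which reads back as 99%.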
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
When CDP is enabled, LxCODE and LxDATA are assigned two different partids, each occupying a monitor. But because not all resources use CDP mode, for instance MB (Memory Bandwidth), we should make sure these two partids/monitors are operated on simultaneously when displaying monitor data.
e.g.
     +- code stream (partid = 0, monitor = 0) ----+---> L3CODE
cpu -+                                            +---> MB
     +- data stream (partid = 1, monitor = 1) ----+---> L3DATA
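The monitoring side of this can be sketched with a toy model: when a resource does not split by CDP, reading its monitor for a code partid must also fold in the paired data partid, as `resctrl_dom_mon_data()` does below. The pairing rule (data partid = code partid + 1) and the counter table are assumptions of this sketch, not the kernel's mapping:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for rr->mon_read(): per-partid bandwidth counts. */
static uint64_t fake_counts[4] = { 100, 40, 0, 0 };

static uint64_t mon_read(uint8_t partid)
{
	return fake_counts[partid];
}

/* Minimal model of resctrl_dom_mon_data(): with cdp_both_mon set,
 * also read the paired DATA partid (code partid + 1 in this toy
 * model) so both streams are summed for resources like MB. */
static uint64_t dom_mon_data(uint8_t code_partid, bool cdp_both_mon)
{
	uint64_t v = mon_read(code_partid);

	if (cdp_both_mon)
		v += mon_read(code_partid + 1); /* paired data partid */
	return v;
}
```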
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 5 ++ arch/arm64/kernel/mpam/mpam_ctrlmon.c | 81 ++++++++++++++++++++++----- 2 files changed, 72 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 7dd34caa8a86..5fac0fb3c807 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -195,6 +195,7 @@ do { \ * @new_ctrl: new ctrl value to be loaded * @have_new_ctrl: did user provide new_ctrl for this domain * @new_ctrl_type: CDP property of the new ctrl + * @cdp_both_ctrl: did cdp both control if cdp enabled */ struct resctrl_staged_config { hw_closid_t hw_closid; @@ -202,6 +203,7 @@ struct resctrl_staged_config { bool have_new_ctrl; enum resctrl_conf_type conf_type; enum resctrl_ctrl_type ctrl_type; + bool cdp_both_ctrl; };
/* later move to resctrl common directory */ @@ -219,6 +221,7 @@ struct resctrl_schema_ctrl { * @conf_type: Type of configuration, e.g. code/data/both * @res: The rdt_resource for this entry * @schemata_ctrl_list: Type of ctrl configuration. e.g. priority/hardlimit + * @cdp_mc_both: did cdp both mon/ctrl if cdp enabled */ struct resctrl_schema { struct list_head list; @@ -226,6 +229,7 @@ struct resctrl_schema { enum resctrl_conf_type conf_type; struct resctrl_resource *res; struct list_head schema_ctrl_list; + bool cdp_mc_both; };
/** @@ -341,6 +345,7 @@ union mon_data_bits { u8 partid; u8 pmg; u8 mon; + u8 cdp_both_mon; } u; };
diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 019cb57e5c46..808a16209129 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -60,6 +60,15 @@ static int add_schema(enum resctrl_conf_type t, struct resctrl_resource *r) s->res = r; s->conf_type = t;
+ /* + * code and data is separated for resources LxCache but + * not for MB(Memory Bandwidth), it's necessary to set + * cdp_mc_both to let resctrl know operating the two closid/ + * monitor simultaneously when configuring/monitoring. + */ + if (is_resctrl_cdp_enabled()) + s->cdp_mc_both = !r->cdp_enable; + switch (t) { case CDP_CODE: suffix = "CODE"; @@ -168,6 +177,24 @@ void schemata_list_destroy(void) } }
+static void +resctrl_dom_ctrl_config(bool cdp_both_ctrl, struct resctrl_resource *r, + struct rdt_domain *dom, struct msr_param *para) +{ + struct raw_resctrl_resource *rr; + + rr = r->res; + rr->msr_update(r, dom, para); + + if (cdp_both_ctrl) { + hw_closid_t hw_closid; + + resctrl_cdp_map(clos, para->closid->reqpartid, CDP_DATA, hw_closid); + para->closid->reqpartid = hw_closid_val(hw_closid); + rr->msr_update(r, dom, para); + } +} + static void resctrl_group_update_domain_ctrls(struct rdtgroup *rdtgrp, struct resctrl_resource *r, struct rdt_domain *dom) { @@ -175,15 +202,11 @@ static void resctrl_group_update_domain_ctrls(struct rdtgroup *rdtgrp, struct resctrl_staged_config *cfg; enum resctrl_ctrl_type type; hw_closid_t hw_closid; - struct raw_resctrl_resource *rr; struct sd_closid closid; struct list_head *head; struct rdtgroup *entry; struct msr_param para; - - bool update_on; - - rr = r->res; + bool update_on, cdp_both_ctrl;
cfg = dom->staged_cfg; para.closid = &closid; @@ -192,6 +215,7 @@ static void resctrl_group_update_domain_ctrls(struct rdtgroup *rdtgrp, if (!cfg[i].have_new_ctrl) continue; update_on = false; + cdp_both_ctrl = cfg[i].cdp_both_ctrl; /* * for ctrl group configuration, hw_closid of cfg[i] equals * to rdtgrp->closid.intpartid. @@ -215,7 +239,7 @@ static void resctrl_group_update_domain_ctrls(struct rdtgroup *rdtgrp, } } if (update_on) - rr->msr_update(r, dom, ¶); + resctrl_dom_ctrl_config(cdp_both_ctrl, r, dom, ¶);
/* * we should synchronize all child mon groups' @@ -225,8 +249,9 @@ static void resctrl_group_update_domain_ctrls(struct rdtgroup *rdtgrp, list_for_each_entry(entry, head, mon.crdtgrp_list) { resctrl_cdp_map(clos, entry->closid.reqpartid, cfg[i].conf_type, hw_closid); + closid.reqpartid = hw_closid_val(hw_closid); - rr->msr_update(r, dom, ¶); + resctrl_dom_ctrl_config(cdp_both_ctrl, r, dom, ¶); } } } @@ -504,13 +529,33 @@ static inline char *get_resource_name(char *name) return res; }
+static u64 resctrl_dom_mon_data(struct resctrl_resource *r, + struct rdt_domain *d, void *md_priv) +{ + u64 ret; + union mon_data_bits md; + struct raw_resctrl_resource *rr; + + md.priv = md_priv; + rr = r->res; + ret = rr->mon_read(d, md.priv); + if (md.u.cdp_both_mon) { + hw_closid_t hw_closid; + + resctrl_cdp_map(clos, md.u.partid, CDP_DATA, hw_closid); + md.u.partid = hw_closid_val(hw_closid); + ret += rr->mon_read(d, md.priv); + } + + return ret; +} + int resctrl_group_mondata_show(struct seq_file *m, void *arg) { struct kernfs_open_file *of = m->private; struct rdtgroup *rdtgrp; struct rdt_domain *d; struct resctrl_resource *r; - struct raw_resctrl_resource *rr; union mon_data_bits md; int ret = 0; char *resname = get_resource_name(kernfs_node_name(of)); @@ -528,7 +573,6 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) md.priv = of->kn->priv;
r = mpam_resctrl_get_resource(md.u.rid); - rr = r->res;
/* show monitor data */ d = mpam_find_domain(r, md.u.domid, NULL); @@ -538,7 +582,8 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) goto out; }
- usage = rr->mon_read(d, md.priv); + usage = resctrl_dom_mon_data(r, d, md.priv); + /* * if this rdtgroup is ctrlmon group, also collect it's * mon groups' monitor data. @@ -562,7 +607,7 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) md.u.partid = hw_closid_val(hw_closid); md.u.pmg = entry->mon.rmid; md.u.mon = entry->mon.mon; - usage += rr->mon_read(d, md.priv); + usage += resctrl_dom_mon_data(r, d, md.priv); } }
@@ -618,6 +663,7 @@ static int resctrl_mkdir_mondata_dom(struct kernfs_node *parent_kn, resctrl_cdp_map(mon, prgrp->mon.mon, s->conf_type, hw_monid); md.u.mon = hw_monid_val(hw_monid); md.u.pmg = prgrp->mon.rmid; + md.u.cdp_both_mon = s->cdp_mc_both;
snprintf(name, sizeof(name), "mon_%s_%02d", s->name, d->id); kn = __kernfs_create_file(parent_kn, name, 0444, @@ -686,14 +732,20 @@ int resctrl_mkdir_mondata_all_subdir(struct kernfs_node *parent_kn, }
/* Initialize MBA resource with default values. */ -static void rdtgroup_init_mba(struct resctrl_resource *r, u32 closid) +static void rdtgroup_init_mba(struct resctrl_schema *s, u32 closid) { struct resctrl_staged_config *cfg; + struct resctrl_resource *r; struct rdt_domain *d; enum resctrl_ctrl_type t;
- list_for_each_entry(d, &r->domains, list) { + r = s->res; + if (WARN_ON(!r)) + return; + + list_for_each_entry(d, &s->res->domains, list) { cfg = &d->staged_cfg[CDP_BOTH]; + cfg->cdp_both_ctrl = s->cdp_mc_both; cfg->new_ctrl[SCHEMA_COMM] = r->default_ctrl[SCHEMA_COMM]; resctrl_cdp_map(clos, closid, CDP_BOTH, cfg->hw_closid); cfg->have_new_ctrl = true; @@ -731,6 +783,7 @@ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid)
list_for_each_entry(d, &s->res->domains, list) { cfg = &d->staged_cfg[conf_type]; + cfg->cdp_both_ctrl = s->cdp_mc_both; cfg->have_new_ctrl = false; cfg->new_ctrl[SCHEMA_COMM] = r->cache.shareable_bits; used_b = r->cache.shareable_bits; @@ -773,7 +826,7 @@ int resctrl_group_init_alloc(struct rdtgroup *rdtgrp) list_for_each_entry(s, &resctrl_all_schema, list) { r = s->res; if (r->rid == RDT_RESOURCE_MC) { - rdtgroup_init_mba(r, rdtgrp->closid.intpartid); + rdtgroup_init_mba(s, rdtgrp->closid.intpartid); } else { ret = rdtgroup_init_cat(s, rdtgrp->closid.intpartid); if (ret < 0)
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
The MPAM spec says that when an MPAMCFG register other than MPAMCFG_INTPARTID is read or written while MPAMCFG_PART_SEL.INTERNAL is not 1, MPAMF_ESR is set to indicate an intPARTID_Range error. So we should set MPAMCFG_PART_SEL.INTERNAL to 1 before reading the MPAMCFG_PRI register.
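The constraint can be modelled with a toy MSC (the register behaviour below is a sketch of the spec rule, not real hardware access):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MPAMCFG_PART_SEL_INTERNAL (1u << 16)

/* Toy MSC model: reading an MPAMCFG register other than
 * MPAMCFG_INTPARTID with PART_SEL.INTERNAL clear records an
 * intPARTID_Range error in MPAMF_ESR, as the spec requires when
 * PARTID narrowing is implemented. */
static uint32_t part_sel;
static bool esr_error;

static uint32_t read_mpamcfg_pri(void)
{
	if (!(part_sel & MPAMCFG_PART_SEL_INTERNAL)) {
		esr_error = true;	/* MPAMF_ESR: intPARTID_Range */
		return 0;
	}
	return 0x2a;			/* arbitrary hwdef priority */
}
```

This is why the probe path now writes MPAMCFG_PART_SEL_INTERNAL before touching MPAMCFG_PRI on devices with mpam_feat_part_nrw.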
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_device.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 1202a5795e97..70f27e0e12c9 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -123,7 +123,7 @@ mpam_probe_update_sysprops(u16 max_partid, u16 max_pmg)
static int mpam_device_probe(struct mpam_device *dev) { - u32 hwfeatures; + u32 hwfeatures, part_sel; u16 max_intpartid = 0; u16 max_partid, max_pmg;
@@ -205,8 +205,17 @@ static int mpam_device_probe(struct mpam_device *dev)
/* Priority partitioning */ if (MPAMF_IDR_HAS_PRI_PART(hwfeatures)) { - u32 pri_features = mpam_read_reg(dev, MPAMF_PRI_IDR); - u32 hwdef_pri = mpam_read_reg(dev, MPAMCFG_PRI); + u32 pri_features, hwdef_pri; + /* + * if narrow support, MPAMCFG_PART_SEL.INTERNAL must be 1 when + * reading/writing MPAMCFG register other than MPAMCFG_INTPARTID. + */ + if (mpam_has_feature(mpam_feat_part_nrw, dev->features)) { + part_sel = MPAMCFG_PART_SEL_INTERNAL; + mpam_write_reg(dev, MPAMCFG_PART_SEL, part_sel); + } + pri_features = mpam_read_reg(dev, MPAMF_PRI_IDR); + hwdef_pri = mpam_read_reg(dev, MPAMCFG_PRI);
pr_debug("probe: probed PRI_PART\n");
From: James Morse james.morse@arm.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
The MPAM MSC error interrupt tells us how we misconfigured the MSC. We don't expect to do this. If the interrupt fires, print a summary, and mark MPAM as broken. Eventually we will try to tear down cleanly when we see this.
Add a helper, mpam_register_device_irq(), to register the overflow and error interrupts for an mpam device. When devices come and go we want to make sure the error irq is enabled. We disable the error irq when CPUs are taken offline, as the component may remain online even when all of its associated CPUs are offline.
The code in this patch is borrowed from James Morse james.morse@arm.com.
[Wang ShaoBo: few version adaptation changes]
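The handler's decode step amounts to pulling ERRCODE out of MPAMF_ESR and bailing out early when it reads as "No Error" (the irq line may be shared). A minimal sketch, using the field layout this patch adds to mpam_resource.h:

```c
#include <assert.h>
#include <stdint.h>

#define MPAMF_ESR_ERRCODE_SHIFT	24
#define MPAMF_ESR_ERRCODE	(0xfu << MPAMF_ESR_ERRCODE_SHIFT)
#define MPAM_ERRCODE_NONE	0

/* Decode MPAMF_ESR.ERRCODE as mpam_handle_error_irq() does; a value
 * of MPAM_ERRCODE_NONE means the (possibly shared) irq was not for
 * this MSC and the handler returns IRQ_NONE. */
static uint16_t esr_errcode(uint32_t esr)
{
	return (esr & MPAMF_ESR_ERRCODE) >> MPAMF_ESR_ERRCODE_SHIFT;
}
```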
Signed-off-by: James Morse james.morse@arm.com Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=6d1ceca3eb5953fc16a524... Link: http://www.linux-arm.org/git?p=linux-jm.git;a=patch;h=81d178c198165fd557431d... Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam_resource.h | 10 +- arch/arm64/kernel/mpam/mpam_device.c | 164 ++++++++++++++++++++++++- arch/arm64/kernel/mpam/mpam_device.h | 6 + arch/arm64/kernel/mpam/mpam_internal.h | 10 ++ drivers/acpi/arm64/mpam.c | 10 +- include/linux/arm_mpam.h | 51 ++++++++ 6 files changed, 246 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/include/asm/mpam_resource.h index 72cc9029946f..412faca90e0b 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/include/asm/mpam_resource.h @@ -102,9 +102,17 @@ */ #define MPAMCFG_PART_SEL_INTERNAL BIT(16)
-/* MPAM_ESR */ +/* MPAMF_ESR - MPAM Error Status Register */ +#define MPAMF_ESR_PARTID_OR_MON GENMASK(15, 0) +#define MPAMF_ESR_PMG GENMASK(23, 16) +#define MPAMF_ESR_ERRCODE GENMASK(27, 24) +#define MPAMF_ESR_ERRCODE_SHIFT 24 +#define MPAMF_ESR_OVRWR BIT(31) #define MPAMF_ESR_ERRCODE_MASK ((BIT(4) - 1) << 24)
+/* MPAMF_ECR - MPAM Error Control Register */ +#define MPAMF_ECR_INTEN BIT(0) + /* * Size of the memory mapped registers: 4K of feature page then 2 x 4K * bitmap registers diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 70f27e0e12c9..c1a41b39e328 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -406,6 +406,129 @@ static int mpam_allocate_config(void) return 0; }
+static const char *mpam_msc_err_str[_MPAM_NUM_ERRCODE] = { + [MPAM_ERRCODE_NONE] = "No Error", + [MPAM_ERRCODE_PARTID_SEL_RANGE] = "Out of range PARTID selected", + [MPAM_ERRCODE_REQ_PARTID_RANGE] = "Out of range PARTID requested", + [MPAM_ERRCODE_REQ_PMG_RANGE] = "Out of range PMG requested", + [MPAM_ERRCODE_MONITOR_RANGE] = "Out of range Monitor selected", + [MPAM_ERRCODE_MSMONCFG_ID_RANGE] = "Out of range Monitor:PARTID or PMG written", + + /* These two are about PARTID narrowing, which we don't support */ + [MPAM_ERRCODE_INTPARTID_RANGE] = "Out of range Internal-PARTID written", + [MPAM_ERRCODE_UNEXPECTED_INTERNAL] = "Internal-PARTID set but not expected", +}; + + +static irqreturn_t mpam_handle_error_irq(int irq, void *data) +{ + u32 device_esr; + u16 device_errcode; + struct mpam_device *dev = data; + + spin_lock(&dev->lock); + device_esr = mpam_read_reg(dev, MPAMF_ESR); + spin_unlock(&dev->lock); + + device_errcode = (device_esr & MPAMF_ESR_ERRCODE) >> MPAMF_ESR_ERRCODE_SHIFT; + if (device_errcode == MPAM_ERRCODE_NONE) + return IRQ_NONE; + + /* No-one expects MPAM errors!
*/ + if (device_errcode <= _MPAM_NUM_ERRCODE) + pr_err_ratelimited("unexpected error '%s' [esr:%x]\n", + mpam_msc_err_str[device_errcode], + device_esr); + else + pr_err_ratelimited("unexpected error %d [esr:%x]\n", + device_errcode, device_esr); + + if (!cmpxchg(&mpam_broken, -EINTR, 0)) + schedule_work(&mpam_failed_work); + + /* A write of 0 to MPAMF_ESR.ERRCODE clears level interrupts */ + spin_lock(&dev->lock); + mpam_write_reg(dev, MPAMF_ESR, 0); + spin_unlock(&dev->lock); + + return IRQ_HANDLED; +} +/* register and enable all device error interrupts */ +static void mpam_enable_irqs(void) +{ + struct mpam_device *dev; + int rc, irq, request_flags; + unsigned long irq_save_flags; + + list_for_each_entry(dev, &mpam_all_devices, glbl_list) { + spin_lock_irqsave(&dev->lock, irq_save_flags); + irq = dev->error_irq; + request_flags = dev->error_irq_flags; + spin_unlock_irqrestore(&dev->lock, irq_save_flags); + + if (request_flags & MPAM_IRQ_MODE_LEVEL) { + struct cpumask tmp; + bool inaccessible_cpus; + + request_flags = IRQF_TRIGGER_LOW | IRQF_SHARED; + + /* + * If the MSC is not accessible from any CPU the IRQ + * may be migrated to, we won't be able to clear it. + * ~dev->fw_affinity is all the CPUs that can't access + * the MSC. 'and' cpu_possible_mask tells us whether we + * care. + */ + spin_lock_irqsave(&dev->lock, irq_save_flags); + inaccessible_cpus = cpumask_andnot(&tmp, + cpu_possible_mask, + &dev->fw_affinity); + spin_unlock_irqrestore(&dev->lock, irq_save_flags); + + if (inaccessible_cpus) { + pr_err_once("NOT registering MPAM error level-irq that isn't globally reachable"); + continue; + } + } else { + request_flags = IRQF_TRIGGER_RISING | IRQF_SHARED; + } + + rc = request_irq(irq, mpam_handle_error_irq, request_flags, + "MPAM ERR IRQ", dev); + if (rc) { + pr_err_ratelimited("Failed to register irq %u\n", irq); + continue; + } + + /* + * temporary: the interrupt will only be enabled when cpus + * subsequently come online after mpam_enable(). 
+ */ + spin_lock_irqsave(&dev->lock, irq_save_flags); + dev->enable_error_irq = true; + spin_unlock_irqrestore(&dev->lock, irq_save_flags); + } +} + +static void mpam_disable_irqs(void) +{ + int irq; + bool do_unregister; + struct mpam_device *dev; + unsigned long irq_save_flags; + + list_for_each_entry(dev, &mpam_all_devices, glbl_list) { + spin_lock_irqsave(&dev->lock, irq_save_flags); + irq = dev->error_irq; + do_unregister = dev->enable_error_irq; + dev->enable_error_irq = false; + spin_unlock_irqrestore(&dev->lock, irq_save_flags); + + if (do_unregister) + free_irq(irq, dev); + } +} + /* * Enable mpam once all devices have been probed. * Scheduled by mpam_discovery_complete() once all devices have been created. @@ -441,6 +564,8 @@ static void __init mpam_enable(struct work_struct *work) return; mutex_unlock(&mpam_devices_lock);
+ mpam_enable_irqs(); + /* * mpam_enable() runs in parallel with cpuhp callbacks bringing other * CPUs online, as we eagerly schedule the work. To give resctrl a @@ -484,6 +609,8 @@ static void mpam_failed(struct work_struct *work) if (mpam_cpuhp_state) { cpuhp_remove_state(mpam_cpuhp_state); mpam_cpuhp_state = 0; + + mpam_disable_irqs(); } mutex_unlock(&mpam_cpuhp_lock); } @@ -679,6 +806,28 @@ __mpam_device_create(u8 level_idx, enum mpam_class_types type, return dev; }
+void __init mpam_device_set_error_irq(struct mpam_device *dev, u32 irq, + u32 flags) +{ + unsigned long irq_save_flags; + + spin_lock_irqsave(&dev->lock, irq_save_flags); + dev->error_irq = irq; + dev->error_irq_flags = flags & MPAM_IRQ_FLAGS_MASK; + spin_unlock_irqrestore(&dev->lock, irq_save_flags); +} + +void __init mpam_device_set_overflow_irq(struct mpam_device *dev, u32 irq, + u32 flags) +{ + unsigned long irq_save_flags; + + spin_lock_irqsave(&dev->lock, irq_save_flags); + dev->overflow_irq = irq; + dev->overflow_irq_flags = flags & MPAM_IRQ_FLAGS_MASK; + spin_unlock_irqrestore(&dev->lock, irq_save_flags); +} + static int mpam_cpus_have_feature(void) { if (!cpus_have_const_cap(ARM64_HAS_MPAM)) @@ -803,6 +952,9 @@ static void mpam_reset_device(struct mpam_component *comp,
lockdep_assert_held(&dev->lock);
+ if (dev->enable_error_irq) + mpam_write_reg(dev, MPAMF_ECR, MPAMF_ECR_INTEN); + if (!mpam_has_feature(mpam_feat_part_nrw, dev->features)) { for (partid = 0; partid < dev->num_partid; partid++) mpam_reset_device_config(comp, dev, partid); @@ -924,12 +1076,22 @@ static int mpam_cpu_online(unsigned int cpu)
static int mpam_cpu_offline(unsigned int cpu) { + unsigned long flags; struct mpam_device *dev;
mutex_lock(&mpam_devices_lock); - list_for_each_entry(dev, &mpam_all_devices, glbl_list) + list_for_each_entry(dev, &mpam_all_devices, glbl_list) { + if (!cpumask_test_cpu(cpu, &dev->online_affinity)) + continue; cpumask_clear_cpu(cpu, &dev->online_affinity);
+ if (cpumask_empty(&dev->online_affinity)) { + spin_lock_irqsave(&dev->lock, flags); + mpam_write_reg(dev, MPAMF_ECR, 0); + spin_unlock_irqrestore(&dev->lock, flags); + } + } + mutex_unlock(&mpam_devices_lock);
if (resctrl_registered) diff --git a/arch/arm64/kernel/mpam/mpam_device.h b/arch/arm64/kernel/mpam/mpam_device.h index fc5f7c292b6f..f3ebd3f8b23d 100644 --- a/arch/arm64/kernel/mpam/mpam_device.h +++ b/arch/arm64/kernel/mpam/mpam_device.h @@ -58,6 +58,12 @@ struct mpam_device { /* for reset device MPAMCFG_PRI */ u16 hwdef_intpri; u16 hwdef_dspri; + + bool enable_error_irq; + u32 error_irq; + u32 error_irq_flags; + u32 overflow_irq; + u32 overflow_irq_flags; };
/* diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 0ca58712a8ca..974a0b0784fa 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -20,6 +20,16 @@ extern struct list_head mpam_classes;
#define MAX_MBA_BW 100u
+#define MPAM_ERRCODE_NONE 0 +#define MPAM_ERRCODE_PARTID_SEL_RANGE 1 +#define MPAM_ERRCODE_REQ_PARTID_RANGE 2 +#define MPAM_ERRCODE_MSMONCFG_ID_RANGE 3 +#define MPAM_ERRCODE_REQ_PMG_RANGE 4 +#define MPAM_ERRCODE_MONITOR_RANGE 5 +#define MPAM_ERRCODE_INTPARTID_RANGE 6 +#define MPAM_ERRCODE_UNEXPECTED_INTERNAL 7 +#define _MPAM_NUM_ERRCODE 8 + struct mpam_resctrl_dom { struct mpam_component *comp;
diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c index 10e4769d5227..6c238f5a5c5a 100644 --- a/drivers/acpi/arm64/mpam.c +++ b/drivers/acpi/arm64/mpam.c @@ -93,7 +93,7 @@ static int acpi_mpam_label_memory_component_id(u8 proximity_domain,
static int __init acpi_mpam_parse_memory(struct acpi_mpam_header *h) { - int ret = 0; + int ret; u32 component_id; struct mpam_device *dev; struct acpi_mpam_node_memory *node = (struct acpi_mpam_node_memory *)h; @@ -112,7 +112,9 @@ static int __init acpi_mpam_parse_memory(struct acpi_mpam_header *h) return -EINVAL; }
- return ret; + return mpam_register_device_irq(dev, + node->header.overflow_interrupt, node->header.overflow_flags, + node->header.error_interrupt, node->header.error_interrupt_flags); }
static int __init acpi_mpam_parse_cache(struct acpi_mpam_header *h, @@ -178,7 +180,9 @@ static int __init acpi_mpam_parse_cache(struct acpi_mpam_header *h, return -EINVAL; }
- return ret; + return mpam_register_device_irq(dev, + node->header.overflow_interrupt, node->header.overflow_flags, + node->header.error_interrupt, node->header.error_interrupt_flags); }
static int __init acpi_mpam_parse_table(struct acpi_table_header *table, diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h index 9a00c7984d91..44d0690ae8c4 100644 --- a/include/linux/arm_mpam.h +++ b/include/linux/arm_mpam.h @@ -2,6 +2,7 @@ #ifndef __LINUX_ARM_MPAM_H #define __LINUX_ARM_MPAM_H
+#include <linux/acpi.h> #include <linux/err.h> #include <linux/cpumask.h> #include <linux/types.h> @@ -64,4 +65,54 @@ enum mpam_enable_type {
extern enum mpam_enable_type mpam_enabled;
+#define MPAM_IRQ_MODE_LEVEL 0x1 +#define MPAM_IRQ_FLAGS_MASK 0x7f + +#define mpam_irq_flags_to_acpi(x) ((x & MPAM_IRQ_MODE_LEVEL) ? \ + ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE) + +void __init mpam_device_set_error_irq(struct mpam_device *dev, u32 irq, + u32 flags); +void __init mpam_device_set_overflow_irq(struct mpam_device *dev, u32 irq, + u32 flags); + +static inline int __init mpam_register_device_irq(struct mpam_device *dev, + u32 overflow_interrupt, u32 overflow_flags, + u32 error_interrupt, u32 error_flags) +{ + int irq, trigger; + int ret = 0; + u8 irq_flags; + + if (overflow_interrupt) { + irq_flags = overflow_flags & MPAM_IRQ_FLAGS_MASK; + trigger = mpam_irq_flags_to_acpi(irq_flags); + + irq = acpi_register_gsi(NULL, overflow_interrupt, trigger, + ACPI_ACTIVE_HIGH); + if (irq < 0) { + pr_err_once("Failed to register overflow interrupt with ACPI\n"); + return ret; + } + + mpam_device_set_overflow_irq(dev, irq, irq_flags); + } + + if (error_interrupt) { + irq_flags = error_flags & MPAM_IRQ_FLAGS_MASK; + trigger = mpam_irq_flags_to_acpi(irq_flags); + + irq = acpi_register_gsi(NULL, error_interrupt, trigger, + ACPI_ACTIVE_HIGH); + if (irq < 0) { + pr_err_once("Failed to register error interrupt with ACPI\n"); + return ret; + } + + mpam_device_set_error_irq(dev, irq, irq_flags); + } + + return ret; +} + #endif
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
There are two aspects involved:
- Getting configuration
We split the event QOS_XX_PRI_EVENT_ID into QOS_XX_INTPRI_EVENT_ID and QOS_XX_DSPRI_EVENT_ID. Even though we attempt to fill dspri and intpri in the mpam_config structure with the same value, we need to read them separately to ensure their independence.
Besides, an event such as QOS_CAT_INTPRI_EVENT_ID need not be read from the MSC's register; its value is set to 0 directly if the corresponding feature is not supported.
- Applying configuration
When applying downstream or internal priority configuration, given the independence of the two, we should first check whether the feature mpam_feat_xxpri_part is supported, next check mpam_feat_xxpri_part_0_low, and convert dspri and intpri into proper values according to their max widths.
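The width-dependent conversion on the read-back path can be sketched as below. The helper name is hypothetical; the inversion matches what the patch does with GENMASK(wd - 1, 0) when the *_0_low feature bit is absent (i.e. hardware treats 0 as the highest priority while resctrl treats 0 as the lowest):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* GENMASK(h, l) as in the kernel, restricted to 32-bit values. */
#define GENMASK(h, l) (((~0u) << (l)) & (~0u >> (31 - (h))))

/* Convert a hardware priority field (intpri or dspri) of width wd
 * bits to resctrl's convention where 0 is the lowest priority.
 * If the hardware already uses 0-is-low (mpam_feat_xxpri_part_0_low
 * set), the value passes through unchanged; otherwise invert it
 * within its implemented width. */
static uint32_t pri_to_resctrl(uint32_t hwpri, unsigned int wd,
			       bool zero_is_low)
{
	if (!zero_is_low)
		return GENMASK(wd - 1, 0) & ~hwpri;
	return hwpri;
}
```

Note the hunk above also fixes the read path to mask dspri with dspri_wd rather than intpri_wd, since the two fields may have different widths.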
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/resctrl.h | 10 +++++--- arch/arm64/kernel/mpam/mpam_device.c | 30 +++++++++++++--------- arch/arm64/kernel/mpam/mpam_resctrl.c | 36 +++++++++++++++------------ 3 files changed, 44 insertions(+), 32 deletions(-)
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 37e750029fbc..1cd24441d2e6 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -23,10 +23,12 @@ enum rdt_event_id { QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
QOS_CAT_CPBM_EVENT_ID = 0x04, - QOS_CAT_PRI_EVENT_ID = 0x05, - QOS_MBA_MAX_EVENT_ID = 0x06, - QOS_MBA_PRI_EVENT_ID = 0x07, - QOS_MBA_HDL_EVENT_ID = 0x08, + QOS_CAT_INTPRI_EVENT_ID = 0x05, + QOS_CAT_DSPRI_EVENT_ID = 0x06, + QOS_MBA_MAX_EVENT_ID = 0x07, + QOS_MBA_INTPRI_EVENT_ID = 0x08, + QOS_MBA_DSPRI_EVENT_ID = 0x09, + QOS_MBA_HDL_EVENT_ID = 0x0a, /* Must be the last */ RESCTRL_NUM_EVENT_IDS, }; diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index c1a41b39e328..3e4509e289dd 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -1609,19 +1609,25 @@ static void mpam_component_read_mpamcfg(void *_ctx) val = mpam_read_reg(dev, MPAMCFG_MBW_MAX); val = MBW_MAX_GET_HDL(val); break; - case QOS_CAT_PRI_EVENT_ID: - case QOS_MBA_PRI_EVENT_ID: - if (mpam_has_feature(mpam_feat_intpri_part, dev->features)) - intpri = MPAMCFG_INTPRI_GET(val); - if (mpam_has_feature(mpam_feat_dspri_part, dev->features)) - dspri = MPAMCFG_DSPRI_GET(val); - if (!mpam_has_feature(mpam_feat_intpri_part_0_low, - dev->features)) + case QOS_CAT_INTPRI_EVENT_ID: + case QOS_MBA_INTPRI_EVENT_ID: + if (!mpam_has_feature(mpam_feat_intpri_part, dev->features)) + break; + val = mpam_read_reg(dev, MPAMCFG_PRI); + intpri = MPAMCFG_INTPRI_GET(val); + if (!mpam_has_feature(mpam_feat_intpri_part_0_low, dev->features)) intpri = GENMASK(dev->intpri_wd - 1, 0) & ~intpri; - if (!mpam_has_feature(mpam_feat_dspri_part_0_low, - dev->features)) - dspri = GENMASK(dev->intpri_wd - 1, 0) & ~dspri; - val = (dspri > intpri) ? 
dspri : intpri; + val = intpri; + break; + case QOS_CAT_DSPRI_EVENT_ID: + case QOS_MBA_DSPRI_EVENT_ID: + if (!mpam_has_feature(mpam_feat_dspri_part, dev->features)) + break; + val = mpam_read_reg(dev, MPAMCFG_PRI); + dspri = MPAMCFG_DSPRI_GET(val); + if (!mpam_has_feature(mpam_feat_dspri_part_0_low, dev->features)) + dspri = GENMASK(dev->dspri_wd - 1, 0) & ~dspri; + val = dspri; break; default: break; diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 7c519d93c1bb..d5cf67d18e7f 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -297,57 +297,61 @@ common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d,
static u64 cache_rdmsr(struct rdt_domain *d, struct msr_param *para) { - u32 result; + u32 result, intpri, dspri; struct sync_args args; struct mpam_resctrl_dom *dom;
args.closid = *para->closid; + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom);
switch (para->type) { case SCHEMA_COMM: args.eventid = QOS_CAT_CPBM_EVENT_ID; + mpam_component_get_config(dom->comp, &args, &result); break; case SCHEMA_PRI: - args.eventid = QOS_CAT_PRI_EVENT_ID; + args.eventid = QOS_CAT_INTPRI_EVENT_ID; + mpam_component_get_config(dom->comp, &args, &intpri); + args.eventid = QOS_MBA_DSPRI_EVENT_ID; + mpam_component_get_config(dom->comp, &args, &dspri); + result = (intpri > dspri) ? intpri : dspri; + break; default: return 0; }
- dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); - mpam_component_get_config(dom->comp, &args, &result); - return result; }
static u64 mbw_rdmsr(struct rdt_domain *d, struct msr_param *para) { - u32 result; + u32 result, intpri, dspri; struct sync_args args; struct mpam_resctrl_dom *dom;
args.closid = *para->closid; + dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom);
- /* - * software default set memory bandwidth by - * MPAMCFG_MBW_MAX but not MPAMCFG_MBW_PBM. - */ switch (para->type) { case SCHEMA_COMM: args.eventid = QOS_MBA_MAX_EVENT_ID; + mpam_component_get_config(dom->comp, &args, &result); + break; + case SCHEMA_PRI: + args.eventid = QOS_MBA_INTPRI_EVENT_ID; + mpam_component_get_config(dom->comp, &args, &intpri); + args.eventid = QOS_MBA_DSPRI_EVENT_ID; + mpam_component_get_config(dom->comp, &args, &dspri); + result = (intpri > dspri) ? intpri : dspri; break; case SCHEMA_HDL: args.eventid = QOS_MBA_HDL_EVENT_ID; - break; - case SCHEMA_PRI: - args.eventid = QOS_MBA_PRI_EVENT_ID; + mpam_component_get_config(dom->comp, &args, &result); break; default: return 0; }
- dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom); - mpam_component_get_config(dom->comp, &args, &result); - return result; }
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
So far we use sd_closid, comprising {intpartid, reqpartid}, to label each resctrl group (both ctrlgroup and mongroup). This handles the case where the number of reqpartids exceeds the number of intpartids, which always happens when intpartid narrowing is supported; otherwise the two are equal in number. The surplus reqpartids serve two purposes: (1) they indicate how configurations are synchronized from the configuration indexed by intpartid, and (2) they take on part of the monitoring role.
But the reqpartids in (2), together with pmg, are still scattered, and so far we have no clean way to use the two together. To make sure these resources can be fully utilized, we borrow the idea from Intel RDT's design, which uses an rmid for monitoring: an rmid remap matrix is introduced for transforming a (partid, pmg) pair into an rmid. This matrix is organized like this:
        [bitmap entry indexed by partid]

                      [0] [1] [2] [3] [4] [5]
     occ->bitmap[:0]   1   0   0   1   1   1
          bitmap[:1]   1   0   0   1   1   1
          bitmap[:2]   1   1   1   1   1   1
          bitmap[:3]   1   1   1   1   1   1

        [col pos is partid, row pos-1 is pmg]
Calculate rmid = partid + NR_partid * pmg
occ indicates whether a bitmap is already in use by a partid; it is needed because a given partid must not be paired with a duplicate pmg for monitoring. This design saves a lot of space, and also reduces the time complexity of allocating and freeing an rmid from O(NR_partid) * O(NR_pmg) to O(NR_partid) + O(log(NR_pmg)) compared with using a list.
In this way we get a continuous rmid set with upper bound (NR_pmg * NR_partid - 1); given an rmid, we can judge whether it is valid simply by checking that it falls within this range.
An rmid encodes the reqpartid info, so we can use the relevant helpers to recover the reqpartid for sd_closid@reqpartid and accomplish the configuration sync mission. This also simplifies closid, which can now consist of the intpartid index only, and each resctrl group gets a consecutive rmid range.
This also has some further benefits: for instance, MPAM also supports SMMU I/O using partid and pmg, so the SMMU driver can use the single helper mpam_rmid_to_partid_pmg() to perform this remap for an rmid passed in from user space.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 3 +- arch/arm64/include/asm/resctrl.h | 55 +-- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 17 +- arch/arm64/kernel/mpam/mpam_mon.c | 49 --- arch/arm64/kernel/mpam/mpam_resctrl.c | 506 +++++++++++++++++++++----- fs/resctrlfs.c | 138 +++---- 6 files changed, 529 insertions(+), 239 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 5fac0fb3c807..014d5728f607 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -362,7 +362,8 @@ int resctrl_group_alloc_mon(struct rdtgroup *grp);
u16 mpam_resctrl_max_mon_num(void);
-void pmg_init(void); void mon_init(void);
+extern int mpam_rmid_to_partid_pmg(int rmid, int *partid, int *pmg); + #endif /* _ASM_ARM64_MPAM_H */ diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 1cd24441d2e6..40f97b1ddb83 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -2,6 +2,7 @@ #define _ASM_ARM64_RESCTRL_H
#include <asm/mpam_sched.h> +#include <asm/mpam.h>
#define resctrl_group rdtgroup #define resctrl_alloc_capable rdt_alloc_capable @@ -95,17 +96,12 @@ void schemata_list_destroy(void);
int resctrl_lru_request_mon(void);
-int alloc_rmid(void); -void free_rmid(u32 id); +int rmid_alloc(int entry_idx); +void rmid_free(int rmid);
-enum closid_type { - CLOSID_INT = 0x1, - CLOSID_REQ = 0x2, - CLOSID_NUM_TYPES, -}; int resctrl_id_init(void); -int resctrl_id_alloc(enum closid_type); -void resctrl_id_free(enum closid_type, int id); +int closid_alloc(void); +void closid_free(int closid);
void update_cpu_closid_rmid(void *info); void update_closid_rmid(const struct cpumask *cpu_mask, struct resctrl_group *r); @@ -152,20 +148,35 @@ int resctrl_update_groups_config(struct rdtgroup *rdtgrp); #define RESCTRL_MAX_CLOSID 32
/* - * left 16 bits of closid store parent(master)'s - * closid, the reset store current group's closid, - * this used for judging if tasks are allowed to move - * another ctrlmon/mon group, it is because when - * a mon group is permited to allocated another - * closid different from it's parent, only closid - * is not sufficient to do that. + * This is only for avoiding unnecessary cost in mpam_sched_in() + * called by __switch_to() if using mpam_rmid_to_partid_pmg() + * to get partid and pmg, we just simply shift and get their + * two easily when we want. */ -#define TASK_CLOSID_SET(prclosid, closid) \ - ((prclosid << 16) | closid) +static inline void resctrl_navie_rmid_partid_pmg(u32 rmid, int *partid, int *pmg) +{ + *partid = rmid >> 16; + *pmg = (rmid << 16) >> 16; +} + +static inline u32 resctrl_navie_rmid(u32 rmid) +{ + int ret, partid, pmg; + + ret = mpam_rmid_to_partid_pmg(rmid, (int *)&partid, (int *)&pmg); + if (ret) + return 0;
-#define TASK_CLOSID_CUR_GET(closid) \ - (closid & GENMASK(15, 0)) -#define TASK_CLOSID_PR_GET(closid) \ - ((closid & GENMASK(31, 16)) >> 16) + return (partid << 16) | pmg; +} + +/* + * closid.reqpartid is used as part of mapping to rmid, now + * we only need to map intpartid to closid. + */ +static inline u32 resctrl_navie_closid(struct sd_closid closid) +{ + return closid.intpartid; +}
#endif /* _ASM_ARM64_RESCTRL_H */ diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 808a16209129..2ad19f255060 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -560,6 +560,7 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) int ret = 0; char *resname = get_resource_name(kernfs_node_name(of)); u64 usage; + int pmg;
if (!resname) return -ENOMEM; @@ -605,7 +606,13 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) resctrl_cdp_map(clos, entry->closid.reqpartid, type, hw_closid); md.u.partid = hw_closid_val(hw_closid); - md.u.pmg = entry->mon.rmid; + + ret = mpam_rmid_to_partid_pmg(entry->mon.rmid, + NULL, &pmg); + if (ret) + return ret; + + md.u.pmg = pmg; md.u.mon = entry->mon.mon; usage += resctrl_dom_mon_data(r, d, md.priv); } @@ -651,6 +658,7 @@ static int resctrl_mkdir_mondata_dom(struct kernfs_node *parent_kn, struct kernfs_node *kn; char name[32]; int ret = 0; + int pmg;
r = s->res; rr = r->res; @@ -662,7 +670,12 @@ static int resctrl_mkdir_mondata_dom(struct kernfs_node *parent_kn, md.u.partid = hw_closid_val(hw_closid); resctrl_cdp_map(mon, prgrp->mon.mon, s->conf_type, hw_monid); md.u.mon = hw_monid_val(hw_monid); - md.u.pmg = prgrp->mon.rmid; + + ret = mpam_rmid_to_partid_pmg(prgrp->mon.rmid, NULL, &pmg); + if (ret) + return ret; + md.u.pmg = pmg; + md.u.cdp_both_mon = s->cdp_mc_both;
snprintf(name, sizeof(name), "mon_%s_%02d", s->name, d->id); diff --git a/arch/arm64/kernel/mpam/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c index 9875b44b83ac..053f8501f7d2 100644 --- a/arch/arm64/kernel/mpam/mpam_mon.c +++ b/arch/arm64/kernel/mpam/mpam_mon.c @@ -37,55 +37,6 @@ */ bool rdt_mon_capable;
-static int pmg_free_map; -void pmg_init(void) -{ - u16 num_pmg = USHRT_MAX; - struct mpam_resctrl_res *res; - struct resctrl_resource *r; - struct raw_resctrl_resource *rr; - - /* Use the max num_pmg among all resources */ - for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; - rr = r->res; - num_pmg = min(num_pmg, rr->num_pmg); - } - - pmg_free_map = BIT_MASK(num_pmg) - 1; - - /* pmg 0 is always reserved for the default group */ - pmg_free_map &= ~1; -} - -static int alloc_pmg(void) -{ - u32 pmg = ffs(pmg_free_map); - - if (pmg == 0) - return -ENOSPC; - - pmg--; - pmg_free_map &= ~(1 << pmg); - - return pmg; -} - -static void free_pmg(u32 pmg) -{ - pmg_free_map |= 1 << pmg; -} - -int alloc_rmid(void) -{ - return alloc_pmg(); -} - -void free_rmid(u32 id) -{ - free_pmg(id); -} - /* * A simple LRU monitor allocation machanism, each * monitor free map occupies two section, one for diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index d5cf67d18e7f..418defc05e06 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -481,8 +481,8 @@ common_wrmon(struct rdt_domain *d, void *md_priv) * limited as the number of resources grows. */
-static unsigned long *intpartid_free_map, *reqpartid_free_map; static int num_intpartid, num_reqpartid; +static unsigned long *intpartid_free_map;
static void mpam_resctrl_closid_collect(void) { @@ -518,83 +518,435 @@ static void mpam_resctrl_closid_collect(void) } }
-static inline int local_closid_bitmap_init(int bits_num, unsigned long **ptr) +int closid_bitmap_init(void) { int pos; u32 times, flag; + u32 bits_num;
+ mpam_resctrl_closid_collect(); + bits_num = num_intpartid; hw_alloc_times_validate(times, flag); + bits_num = rounddown(bits_num, times); + if (!bits_num) + return -EINVAL;
- if (flag) - bits_num = rounddown(bits_num, 2); + if (intpartid_free_map) + kfree(intpartid_free_map);
- if (!*ptr) { - *ptr = bitmap_zalloc(bits_num, GFP_KERNEL); - if (!*ptr) - return -ENOMEM; - } + intpartid_free_map = bitmap_zalloc(bits_num, GFP_KERNEL); + if (!intpartid_free_map) + return -ENOMEM;
- bitmap_set(*ptr, 0, bits_num); + bitmap_set(intpartid_free_map, 0, bits_num);
/* CLOSID 0 is always reserved for the default group */ - pos = find_first_bit(*ptr, bits_num); - bitmap_clear(*ptr, pos, times); + pos = find_first_bit(intpartid_free_map, bits_num); + bitmap_clear(intpartid_free_map, pos, times);
return 0; }
-int closid_bitmap_init(void) +/** + * struct rmid_transform - Matrix for transforming rmid to partid and pmg + * @rows: Number of bits for remap_body[:] bitmap + * @clos: Number of bitmaps + * @nr_usage: Number rmid we have + * @stride: Step stride from transforming rmid to partid and pmg + * @remap_body: Storing bitmaps' entry and itself + * @remap_enabled: Does remap_body init done + */ +struct rmid_transform { + u32 rows; + u32 cols; + u32 nr_usage; + int stride; + unsigned long **remap_body; + bool remap_enabled; +}; +static struct rmid_transform rmid_remap_matrix; + +/* + * a rmid remap matrix is delivered for transforming partid pmg to rmid, + * this matrix is organized like this: + * + * [bitmap entry indexed by partid] + * + * [0] [1] [2] [3] [4] [5] + * occ 1 0 0 1 1 1 + * bitmap[:0] 1 0 0 1 1 1 + * bitmap[:1] 1 1 1 1 1 1 + * bitmap[:2] 1 1 1 1 1 1 + * [pos is pmg] + * + * Calculate rmid = partid + NR_partid * pmg + * + * occ represents if this bitmap has been used by a partid, it is because + * a certain partid should not be accompany with a duplicated pmg for + * monitoring, this design easily saves a lot of space, and can also decrease + * time complexity of allocating and free rmid process from O(NR_partid)* + * O(NR_pmg) to O(NR_partid) + O(log(NR_pmg)) compared with using list. + */ +static int set_rmid_remap_matrix(u32 rows, u32 cols) { - int ret; + u32 times, flag; + int ret, col;
- mpam_resctrl_closid_collect(); - if (!num_intpartid || !num_reqpartid) + /* + * cols stands for partid, so if cdp enabled we must + * keep at least two partid for LxCODE and LxDATA + * respectively once time. + */ + hw_alloc_times_validate(times, flag); + rmid_remap_matrix.cols = rounddown(cols, times); + rmid_remap_matrix.stride = times; + if (times > rmid_remap_matrix.cols) return -EINVAL;
- if (intpartid_free_map) - kfree(intpartid_free_map); - if (reqpartid_free_map) - kfree(reqpartid_free_map); + /* + * first row of rmid remap matrix is used for indicating + * if remap bitmap is occupied by a col index. + */ + rmid_remap_matrix.rows = rows + 1; + + if (rows == 0 || cols == 0) + return -EINVAL; + + rmid_remap_matrix.nr_usage = rows * cols; + + /* free history pointer for matrix recreation */ + if (rmid_remap_matrix.remap_body) { + for (col = 0; col < cols; col++) { + if (!rmid_remap_matrix.remap_body[col]) + continue; + kfree(rmid_remap_matrix.remap_body[col]); + } + kfree(rmid_remap_matrix.remap_body); + } + + rmid_remap_matrix.remap_body = kcalloc(rmid_remap_matrix.cols, + sizeof(*rmid_remap_matrix.remap_body), GFP_KERNEL); + if (!rmid_remap_matrix.remap_body) + return -ENOMEM; + + for (col = 0; col < cols; col++) { + if (rmid_remap_matrix.remap_body[col]) + kfree(rmid_remap_matrix.remap_body[col]); + + rmid_remap_matrix.remap_body[col] = + bitmap_zalloc(rmid_remap_matrix.rows, + GFP_KERNEL); + if (!rmid_remap_matrix.remap_body[col]) { + ret = -ENOMEM; + goto clean; + } + + bitmap_set(rmid_remap_matrix.remap_body[col], + 0, rmid_remap_matrix.rows); + } + + rmid_remap_matrix.remap_enabled = 1; + + return 0; +clean: + for (col = 0; col < cols; col++) { + if (!rmid_remap_matrix.remap_body[col]) + continue; + kfree(rmid_remap_matrix.remap_body[col]); + rmid_remap_matrix.remap_body[col] = NULL; + } + if (rmid_remap_matrix.remap_body) { + kfree(rmid_remap_matrix.remap_body); + rmid_remap_matrix.remap_body = NULL; + } + + return ret; +} + +static u32 probe_rmid_remap_matrix_cols(void) +{ + return (u32)num_reqpartid; +} + +static u32 probe_rmid_remap_matrix_rows(void) +{ + return (u32)mpam_sysprops_num_pmg(); +} + +static inline unsigned long **__rmid_remap_bmp(int col) +{ + if (!rmid_remap_matrix.remap_enabled) + return NULL; + + if ((u32)col >= rmid_remap_matrix.cols) + return NULL; + + return rmid_remap_matrix.remap_body + col; +} + +#define 
for_each_rmid_remap_bmp(bmp) \ + for (bmp = __rmid_remap_bmp(0); \ + bmp <= __rmid_remap_bmp(rmid_remap_matrix.cols - 1); \ + bmp++) + +#define for_each_valid_rmid_remap_bmp(bmp) \ + for_each_rmid_remap_bmp(bmp) \ + if (bmp && *bmp) + +#define STRIDE_CHK(stride) \ + (stride == rmid_remap_matrix.stride) + +#define STRIDE_INC_CHK(stride) \ + (++stride == rmid_remap_matrix.stride) + +#define STRIDE_CHK_AND_WARN(stride) \ +do { \ + if (!STRIDE_CHK(stride)) \ + WARN_ON_ONCE("Unexpected stride\n"); \ +} while (0) + +static void set_rmid_remap_bmp_occ(unsigned long *bmp) +{ + clear_bit(0, bmp); +} + +static void unset_rmid_remap_bmp_occ(unsigned long *bmp) +{ + set_bit(0, bmp); +} + +static void rmid_remap_bmp_bdr_set(unsigned long *bmp, int b) +{ + set_bit(b + 1, bmp); +}
- ret = local_closid_bitmap_init(num_intpartid, &intpartid_free_map); +static void rmid_remap_bmp_bdr_clear(unsigned long *bmp, int b) +{ + clear_bit(b + 1, bmp); +} + +static int is_rmid_remap_bmp_occ(unsigned long *bmp) +{ + return (find_first_bit(bmp, rmid_remap_matrix.rows) == 0) ? 0 : 1; +} + +static int is_rmid_remap_bmp_full(unsigned long *bmp) +{ + return ((is_rmid_remap_bmp_occ(bmp) && + bitmap_weight(bmp, rmid_remap_matrix.rows) == + (rmid_remap_matrix.rows-1)) || + bitmap_full(bmp, rmid_remap_matrix.rows)); +} + +static int rmid_remap_bmp_alloc_pmg(unsigned long *bmp) +{ + int pos; + + pos = find_first_bit(bmp, rmid_remap_matrix.rows); + if (pos == rmid_remap_matrix.rows) + return -ENOSPC; + + clear_bit(pos, bmp); + return pos - 1; +} + +static int rmid_remap_matrix_init(void) +{ + int stride = 0; + int ret; + u32 cols, rows; + unsigned long **bmp; + + cols = probe_rmid_remap_matrix_cols(); + rows = probe_rmid_remap_matrix_rows(); + + ret = set_rmid_remap_matrix(rows, cols); if (ret) goto out;
- ret = local_closid_bitmap_init(num_reqpartid, &reqpartid_free_map); + /* + * if CDP disabled, drop partid = 0, pmg = 0 + * from bitmap for root resctrl group reserving + * default rmid, otherwise drop partid = 0 and + * partid = 1 for LxCACHE, LxDATA reservation. + */ + for_each_valid_rmid_remap_bmp(bmp) { + set_rmid_remap_bmp_occ(*bmp); + rmid_remap_bmp_bdr_clear(*bmp, 0); + if (STRIDE_INC_CHK(stride)) + break; + } + + STRIDE_CHK_AND_WARN(stride); + + return 0; + +out: + return ret; +} + +int resctrl_id_init(void) +{ + int ret; + + ret = closid_bitmap_init(); if (ret) - goto out; + return ret; + + ret = rmid_remap_matrix_init(); + if (ret) + return ret; + + mon_init();
return 0; +} + +static int is_rmid_valid(int rmid) +{ + return ((u32)rmid >= rmid_remap_matrix.nr_usage) ? 0 : 1; +} + +static int to_rmid(int partid, int pmg) +{ + return (partid + (rmid_remap_matrix.cols * pmg)); +} + +static int rmid_to_partid_pmg(int rmid, int *partid, int *pmg) +{ + if (!is_rmid_valid(rmid)) + return -EINVAL; + + if (pmg) + *pmg = rmid / rmid_remap_matrix.cols; + if (partid) + *partid = rmid % rmid_remap_matrix.cols; + return 0; +} + +static int __rmid_alloc(int partid) +{ + int stride = 0; + int partid_sel = 0; + int ret, pmg; + int rmid[2] = {-1, -1}; + unsigned long **cmp, **bmp; + + if (partid >= 0) { + cmp = __rmid_remap_bmp(partid); + if (!cmp) { + ret = -EINVAL; + goto out; + } + for_each_valid_rmid_remap_bmp(bmp) { + if (bmp < cmp) + continue; + set_rmid_remap_bmp_occ(*bmp); + + ret = rmid_remap_bmp_alloc_pmg(*bmp); + if (ret < 0) + goto out; + pmg = ret; + rmid[stride] = to_rmid(partid + stride, pmg); + if (STRIDE_INC_CHK(stride)) + break; + } + } else { + for_each_valid_rmid_remap_bmp(bmp) { + partid_sel++; + + if (is_rmid_remap_bmp_occ(*bmp)) + continue; + set_rmid_remap_bmp_occ(*bmp); + + ret = rmid_remap_bmp_alloc_pmg(*bmp); + if (ret < 0) + goto out; + pmg = ret; + rmid[stride] = to_rmid(partid_sel - 1, pmg); + if (STRIDE_INC_CHK(stride)) + break; + } + } + + if (!STRIDE_CHK(stride)) { + ret = -ENOSPC; + goto out; + } + + return rmid[0]; + out: + rmid_free(rmid[0]); return ret; }
+int rmid_alloc(int partid) +{ + return __rmid_alloc(partid); +} + +void rmid_free(int rmid) +{ + int stride = 0; + int partid, pmg; + unsigned long **bmp, **cmp; + + if (rmid_to_partid_pmg(rmid, &partid, &pmg)) + return; + + cmp = __rmid_remap_bmp(partid); + if (!cmp) + return; + + for_each_valid_rmid_remap_bmp(bmp) { + if (bmp < cmp) + continue; + + rmid_remap_bmp_bdr_set(*bmp, pmg); + + if (is_rmid_remap_bmp_full(*bmp)) + unset_rmid_remap_bmp_occ(*bmp); + + if (STRIDE_INC_CHK(stride)) + break; + } + + STRIDE_CHK_AND_WARN(stride); +} + +int mpam_rmid_to_partid_pmg(int rmid, int *partid, int *pmg) +{ + return rmid_to_partid_pmg(rmid, partid, pmg); +} +EXPORT_SYMBOL(mpam_rmid_to_partid_pmg); + /* * If cdp enabled, allocate two closid once time, then return first * allocated id. */ -static int closid_bitmap_alloc(int bits_num, unsigned long *ptr) +int closid_alloc(void) { int pos; u32 times, flag;
hw_alloc_times_validate(times, flag);
- pos = find_first_bit(ptr, bits_num); - if (pos == bits_num) + pos = find_first_bit(intpartid_free_map, num_intpartid); + if (pos == num_intpartid) return -ENOSPC;
- bitmap_clear(ptr, pos, times); + bitmap_clear(intpartid_free_map, pos, times);
return pos; }
-static void closid_bitmap_free(int pos, unsigned long *ptr) +void closid_free(int closid) { u32 times, flag;
hw_alloc_times_validate(times, flag); - bitmap_set(ptr, pos, times); + bitmap_set(intpartid_free_map, closid, times); }
/* @@ -778,8 +1130,8 @@ void update_cpu_closid_rmid(void *info) struct rdtgroup *r = info;
if (r) { - this_cpu_write(pqr_state.default_closid, r->closid.reqpartid); - this_cpu_write(pqr_state.default_rmid, r->mon.rmid); + this_cpu_write(pqr_state.default_closid, resctrl_navie_closid(r->closid)); + this_cpu_write(pqr_state.default_rmid, resctrl_navie_rmid(r->mon.rmid)); }
/* @@ -873,15 +1225,12 @@ int __resctrl_group_move_task(struct task_struct *tsk, * their parent CTRL group. */ if (rdtgrp->type == RDTCTRL_GROUP) { - tsk->closid = TASK_CLOSID_SET(rdtgrp->closid.intpartid, - rdtgrp->closid.reqpartid); - tsk->rmid = rdtgrp->mon.rmid; + tsk->closid = resctrl_navie_closid(rdtgrp->closid); + tsk->rmid = resctrl_navie_rmid(rdtgrp->mon.rmid); } else if (rdtgrp->type == RDTMON_GROUP) { - if (rdtgrp->mon.parent->closid.intpartid == - TASK_CLOSID_PR_GET(tsk->closid)) { - tsk->closid = TASK_CLOSID_SET(rdtgrp->closid.intpartid, - rdtgrp->closid.reqpartid); - tsk->rmid = rdtgrp->mon.rmid; + if (rdtgrp->mon.parent->closid.intpartid == tsk->closid) { + tsk->closid = resctrl_navie_closid(rdtgrp->closid); + tsk->rmid = resctrl_navie_rmid(rdtgrp->mon.rmid); } else { rdt_last_cmd_puts("Can't move task to different control group\n"); ret = -EINVAL; @@ -1279,13 +1628,10 @@ static void show_resctrl_tasks(struct rdtgroup *r, struct seq_file *s) rcu_read_lock(); for_each_process_thread(p, t) { if ((r->type == RDTMON_GROUP && - TASK_CLOSID_CUR_GET(t->closid) == r->closid.reqpartid && - t->rmid == r->mon.rmid) || + t->rmid == resctrl_navie_rmid(r->mon.rmid)) || (r->type == RDTCTRL_GROUP && - TASK_CLOSID_PR_GET(t->closid) == r->closid.intpartid)) - seq_printf(s, "group:(gid:%d mon:%d) task:(pid:%d gid:%d rmid:%d)\n", - r->closid.reqpartid, r->mon.mon, t->pid, - (int)TASK_CLOSID_CUR_GET(t->closid), t->rmid); + t->closid == resctrl_navie_closid(r->closid))) + seq_printf(s, "%d\n", t->pid); } rcu_read_unlock(); } @@ -1436,9 +1782,11 @@ int __init mpam_resctrl_init(void) void __mpam_sched_in(void) { struct intel_pqr_state *state = this_cpu_ptr(&pqr_state); - u64 closid = state->default_closid; u64 partid_d, partid_i; - u64 pmg = state->default_rmid; + u64 rmid = state->default_rmid; + u64 closid = state->default_closid; + u64 reqpartid = 0; + u64 pmg = 0;
/* * If this task has a closid/rmid assigned, use it. @@ -1446,35 +1794,28 @@ void __mpam_sched_in(void) */ if (static_branch_likely(&resctrl_alloc_enable_key)) { if (current->closid) - closid = TASK_CLOSID_CUR_GET(current->closid); + closid = current->closid; }
if (static_branch_likely(&resctrl_mon_enable_key)) { if (current->rmid) - pmg = current->rmid; + rmid = current->rmid; }
- if (closid != state->cur_closid || pmg != state->cur_rmid) { + if (closid != state->cur_closid || rmid != state->cur_rmid) { u64 reg;
+ resctrl_navie_rmid_partid_pmg(rmid, (int *)&reqpartid, (int *)&pmg); + if (resctrl_cdp_enabled) { hw_closid_t hw_closid;
- resctrl_cdp_map(clos, closid, CDP_DATA, hw_closid); + resctrl_cdp_map(clos, reqpartid, CDP_DATA, hw_closid); partid_d = hw_closid_val(hw_closid);
- resctrl_cdp_map(clos, closid, CDP_CODE, hw_closid); + resctrl_cdp_map(clos, reqpartid, CDP_CODE, hw_closid); partid_i = hw_closid_val(hw_closid);
- /* - * when cdp enabled, we use partid_i to label cur_closid - * of cpu state instead of partid_d, because each task/ - * rdtgrp's closid is labeled by CDP_BOTH/CDP_CODE but not - * CDP_DATA. - */ - state->cur_closid = partid_i; - state->cur_rmid = pmg; - /* set in EL0 */ reg = mpam_read_sysreg_s(SYS_MPAM0_EL1, "SYS_MPAM0_EL1"); reg = PARTID_D_SET(reg, partid_d); @@ -1489,21 +1830,21 @@ void __mpam_sched_in(void) reg = PMG_SET(reg, pmg); mpam_write_sysreg_s(reg, SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); } else { - state->cur_closid = closid; - state->cur_rmid = pmg; - /* set in EL0 */ reg = mpam_read_sysreg_s(SYS_MPAM0_EL1, "SYS_MPAM0_EL1"); - reg = PARTID_SET(reg, closid); + reg = PARTID_SET(reg, reqpartid); reg = PMG_SET(reg, pmg); mpam_write_sysreg_s(reg, SYS_MPAM0_EL1, "SYS_MPAM0_EL1");
/* set in EL1 */ reg = mpam_read_sysreg_s(SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); - reg = PARTID_SET(reg, closid); + reg = PARTID_SET(reg, reqpartid); reg = PMG_SET(reg, pmg); mpam_write_sysreg_s(reg, SYS_MPAM1_EL1, "SYS_MPAM1_EL1"); } + + state->cur_rmid = rmid; + state->cur_closid = closid; } }
@@ -1670,36 +2011,3 @@ u16 mpam_resctrl_max_mon_num(void)
return mon_num; } - -int resctrl_id_init(void) -{ - int ret; - - ret = closid_bitmap_init(); - if (ret) - goto out; - - pmg_init(); - mon_init(); - -out: - return ret; -} - -int resctrl_id_alloc(enum closid_type type) -{ - if (type == CLOSID_INT) - return closid_bitmap_alloc(num_intpartid, intpartid_free_map); - else if (type == CLOSID_REQ) - return closid_bitmap_alloc(num_reqpartid, reqpartid_free_map); - - return -ENOSPC; -} - -void resctrl_id_free(enum closid_type type, int id) -{ - if (type == CLOSID_INT) - return closid_bitmap_free(id, intpartid_free_map); - else if (type == CLOSID_REQ) - return closid_bitmap_free(id, reqpartid_free_map); -} diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index 78185a4c2b41..3eb2d0d70703 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -41,6 +41,7 @@
#include <uapi/linux/magic.h>
+#include <asm/mpam.h> #include <asm/resctrl.h>
DEFINE_STATIC_KEY_FALSE(resctrl_enable_key); @@ -339,25 +340,17 @@ mongroup_create_dir(struct kernfs_node *parent_kn, struct resctrl_group *prgrp, return ret; }
-static inline void free_mon_id(struct resctrl_group *rdtgrp) -{ - if (rdtgrp->mon.rmid) - free_rmid(rdtgrp->mon.rmid); - else if (rdtgrp->closid.reqpartid) - resctrl_id_free(CLOSID_REQ, rdtgrp->closid.reqpartid); -} - static void mkdir_mondata_all_prepare_clean(struct resctrl_group *prgrp) { if (prgrp->type == RDTCTRL_GROUP && prgrp->closid.intpartid) - resctrl_id_free(CLOSID_INT, prgrp->closid.intpartid); - free_mon_id(prgrp); + closid_free(prgrp->closid.intpartid); + rmid_free(prgrp->mon.rmid); }
static int mkdir_mondata_all_prepare(struct resctrl_group *rdtgrp) { int ret = 0; - int mon, rmid, reqpartid; + int mon; struct resctrl_group *prgrp;
mon = resctrl_lru_request_mon(); @@ -368,39 +361,8 @@ static int mkdir_mondata_all_prepare(struct resctrl_group *rdtgrp) } rdtgrp->mon.mon = mon;
- prgrp = rdtgrp->mon.parent; - if (rdtgrp->type == RDTMON_GROUP) { - /* - * this for mon id allocation, for mpam, rmid - * (pmg) is just reserved for creating monitoring - * group, it has the same effect with reqpartid - * (reqpartid) except for config allocation, but - * for some fuzzy reasons, we keep it until spec - * changes. We also allocate rmid first if it's - * available. - */ - rmid = alloc_rmid(); - if (rmid < 0) { - reqpartid = resctrl_id_alloc(CLOSID_REQ); - if (reqpartid < 0) { - rdt_last_cmd_puts("out of closID\n"); - ret = -EINVAL; - goto out; - } - rdtgrp->closid.reqpartid = reqpartid; - rdtgrp->mon.rmid = 0; - } else { - /* - * this time copy reqpartid from father group, - * as rmid is sufficient to monitoring. - */ - rdtgrp->closid.reqpartid = prgrp->closid.reqpartid; - rdtgrp->mon.rmid = rmid; - } - /* - * establish relationship from ctrl to mon group. - */ + prgrp = rdtgrp->mon.parent; rdtgrp->closid.intpartid = prgrp->closid.intpartid; }
@@ -549,7 +511,7 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, static inline bool is_task_match_resctrl_group(struct task_struct *t, struct resctrl_group *r) { - return (TASK_CLOSID_PR_GET(t->closid) == r->closid.intpartid); + return (t->closid == r->closid.intpartid); }
/* @@ -568,9 +530,8 @@ static void resctrl_move_group_tasks(struct resctrl_group *from, struct resctrl_ read_lock(&tasklist_lock); for_each_process_thread(p, t) { if (!from || is_task_match_resctrl_group(t, from)) { - t->closid = TASK_CLOSID_SET(to->closid.intpartid, - to->closid.reqpartid); - t->rmid = to->mon.rmid; + t->closid = resctrl_navie_closid(to->closid); + t->rmid = resctrl_navie_rmid(to->mon.rmid);
#ifdef CONFIG_SMP /* @@ -598,7 +559,7 @@ static void free_all_child_rdtgrp(struct resctrl_group *rdtgrp) head = &rdtgrp->mon.crdtgrp_list; list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) { /* rmid may not be used */ - free_mon_id(sentry); + rmid_free(sentry->mon.rmid); list_del(&sentry->mon.crdtgrp_list); kfree(sentry); } @@ -630,7 +591,7 @@ static void rmdir_all_sub(void) cpumask_or(&resctrl_group_default.cpu_mask, &resctrl_group_default.cpu_mask, &rdtgrp->cpu_mask);
- free_mon_id(rdtgrp); + rmid_free(rdtgrp->mon.rmid);
kernfs_remove(rdtgrp->kn); list_del(&rdtgrp->resctrl_group_list); @@ -669,6 +630,46 @@ static struct file_system_type resctrl_fs_type = { .kill_sb = resctrl_kill_sb, };
+static int find_rdtgrp_allocable_rmid(struct resctrl_group *rdtgrp) +{ + int ret, rmid, reqpartid; + struct resctrl_group *prgrp, *entry; + struct list_head *head; + + prgrp = rdtgrp->mon.parent; + if (prgrp == &resctrl_group_default) { + rmid = rmid_alloc(-1); + if (rmid < 0) + return rmid; + } else { + do { + rmid = rmid_alloc(prgrp->closid.reqpartid); + if (rmid >= 0) + break; + + head = &prgrp->mon.crdtgrp_list; + list_for_each_entry(entry, head, mon.crdtgrp_list) { + if (entry == rdtgrp) + continue; + rmid = rmid_alloc(entry->closid.reqpartid); + if (rmid >= 0) + break; + } + } while (0); + } + + if (rmid < 0) + rmid = rmid_alloc(-1); + + ret = mpam_rmid_to_partid_pmg(rmid, &reqpartid, NULL); + if (ret) + return ret; + rdtgrp->mon.rmid = rmid; + rdtgrp->closid.reqpartid = reqpartid; + + return rmid; +} + static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, struct kernfs_node *prgrp_kn, const char *name, umode_t mode, @@ -705,21 +706,21 @@ static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, * getting monitoring for child mon groups. */ if (rdtgrp->type == RDTCTRL_GROUP) { - ret = resctrl_id_alloc(CLOSID_INT); + ret = closid_alloc(); if (ret < 0) { rdt_last_cmd_puts("out of CLOSIDs\n"); goto out_unlock; } rdtgrp->closid.intpartid = ret; - ret = resctrl_id_alloc(CLOSID_REQ); - if (ret < 0) { - rdt_last_cmd_puts("out of SLAVE CLOSIDs\n"); - goto out_unlock; - } - rdtgrp->closid.reqpartid = ret; - ret = 0; }
+ ret = find_rdtgrp_allocable_rmid(rdtgrp); + if (ret < 0) { + rdt_last_cmd_puts("out of RMIDs\n"); + goto out_free_closid; + } + rdtgrp->mon.rmid = ret; + INIT_LIST_HEAD(&rdtgrp->mon.crdtgrp_list);
/* kernfs creates the directory for rdtgrp */ @@ -727,7 +728,7 @@ static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, if (IS_ERR(kn)) { ret = PTR_ERR(kn); rdt_last_cmd_puts("kernfs create error\n"); - goto out_free_rgrp; + goto out_free_rmid; } rdtgrp->kn = kn;
@@ -776,8 +777,12 @@ static int mkdir_resctrl_prepare(struct kernfs_node *parent_kn, mkdir_mondata_all_prepare_clean(rdtgrp); out_destroy: kernfs_remove(rdtgrp->kn); -out_free_rgrp: +out_free_rmid: + rmid_free(rdtgrp->mon.rmid); kfree(rdtgrp); +out_free_closid: + if (rdtgrp->type == RDTCTRL_GROUP) + closid_free(rdtgrp->closid.intpartid); out_unlock: resctrl_group_kn_unlock(prgrp_kn); return ret; @@ -924,9 +929,10 @@ static void resctrl_group_rm_mon(struct resctrl_group *rdtgrp,
/* Update per cpu closid and rmid of the moved CPUs first */ for_each_cpu(cpu, &rdtgrp->cpu_mask) { - per_cpu(pqr_state.default_closid, cpu) = prdtgrp->closid.reqpartid; - per_cpu(pqr_state.default_rmid, cpu) = prdtgrp->mon.rmid; + per_cpu(pqr_state.default_closid, cpu) = resctrl_navie_closid(prdtgrp->closid); + per_cpu(pqr_state.default_rmid, cpu) = resctrl_navie_rmid(prdtgrp->mon.rmid); } + /* * Update the MSR on moved CPUs and CPUs which have moved * task running on them. @@ -936,7 +942,7 @@ static void resctrl_group_rm_mon(struct resctrl_group *rdtgrp,
rdtgrp->flags |= RDT_DELETED;
- free_mon_id(rdtgrp); + rmid_free(rdtgrp->mon.rmid);
/* * Remove the rdtgrp from the parent ctrl_mon group's list @@ -974,9 +980,9 @@ static void resctrl_group_rm_ctrl(struct resctrl_group *rdtgrp, cpumask_var_t tm /* Update per cpu closid and rmid of the moved CPUs first */ for_each_cpu(cpu, &rdtgrp->cpu_mask) { per_cpu(pqr_state.default_closid, cpu) = - resctrl_group_default.closid.reqpartid; + resctrl_navie_closid(resctrl_group_default.closid); per_cpu(pqr_state.default_rmid, cpu) = - resctrl_group_default.mon.rmid; + resctrl_navie_rmid(resctrl_group_default.mon.rmid); }
/* @@ -987,8 +993,8 @@ static void resctrl_group_rm_ctrl(struct resctrl_group *rdtgrp, cpumask_var_t tm update_closid_rmid(tmpmask, NULL);
rdtgrp->flags |= RDT_DELETED; - resctrl_id_free(CLOSID_INT, rdtgrp->closid.intpartid); - resctrl_id_free(CLOSID_REQ, rdtgrp->closid.reqpartid); + closid_free(rdtgrp->closid.intpartid); + rmid_free(rdtgrp->mon.rmid);
/* * Free all the child monitor group rmids.
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
For MPAM, an RMID can only do monitoring work once a monitor resource has been allocated to it, so we adopt a mechanism for dynamically allocating and recycling monitor resources. This differs from the Intel RDT approach, which creates a kworker thread to monitor cache usage and free an RMID once usage drops below an adjustable threshold; we have observed that this method hurts CPU utilization in many cases, and sometimes that cost is unacceptable.
Our method is simple. Because the number of monitors differs between resources, we maintain two lists: one for RMIDs that hold an exclusive monitor resource, and another for RMIDs that share a monitor resource (the shared monitor id is always 0). It works like this: if a new RMID requests a monitor resource that is already in use, we put the RMID at the tail of the latter list and temporarily give it the default monitor id 0 until someone releases an available monitor resource; if the new RMID obtains the monitor resource it needs from every resource, it is put on the exclusive list.
This implements LRU allocation of monitor resources and gives users partial control over allocation and release. If the number of resctrl groups can be bounded, or the user does not need to monitor many groups simultaneously, this is a more suitable deployment model; it also avoids the risk of inaccurate monitoring results when too many groups are monitored at the same time.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 6 - arch/arm64/include/asm/resctrl.h | 1 - arch/arm64/kernel/mpam/mpam_ctrlmon.c | 9 +- arch/arm64/kernel/mpam/mpam_internal.h | 4 + arch/arm64/kernel/mpam/mpam_mon.c | 297 ++++++++++++++++++++++--- arch/arm64/kernel/mpam/mpam_resctrl.c | 46 +--- fs/resctrlfs.c | 13 +- 7 files changed, 295 insertions(+), 81 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 014d5728f607..0414fdc5cb0e 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -358,12 +358,6 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, struct rdt_domain *mpam_find_domain(struct resctrl_resource *r, int id, struct list_head **pos);
-int resctrl_group_alloc_mon(struct rdtgroup *grp); - -u16 mpam_resctrl_max_mon_num(void); - -void mon_init(void); - extern int mpam_rmid_to_partid_pmg(int rmid, int *partid, int *pmg);
#endif /* _ASM_ARM64_MPAM_H */ diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 40f97b1ddb83..44a5bcfa5b92 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -53,7 +53,6 @@ struct mongroup { struct rdtgroup *parent; struct list_head crdtgrp_list; u32 rmid; - u32 mon; int init; };
diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 2ad19f255060..b52cbad5c50e 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -593,6 +593,7 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) struct list_head *head; struct rdtgroup *entry; hw_closid_t hw_closid; + hw_monid_t hw_monid; enum resctrl_conf_type type = CDP_CODE;
resctrl_cdp_map(clos, rdtgrp->closid.reqpartid, @@ -613,7 +614,10 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) return ret;
md.u.pmg = pmg; - md.u.mon = entry->mon.mon; + resctrl_cdp_map(mon, get_rmid_mon(entry->mon.rmid, + r->rid), type, hw_monid); + md.u.mon = hw_monid_val(hw_monid); + usage += resctrl_dom_mon_data(r, d, md.priv); } } @@ -668,7 +672,8 @@ static int resctrl_mkdir_mondata_dom(struct kernfs_node *parent_kn, /* monitoring use reqpartid (reqpartid) */ resctrl_cdp_map(clos, prgrp->closid.reqpartid, s->conf_type, hw_closid); md.u.partid = hw_closid_val(hw_closid); - resctrl_cdp_map(mon, prgrp->mon.mon, s->conf_type, hw_monid); + resctrl_cdp_map(mon, get_rmid_mon(prgrp->mon.rmid, r->rid), + s->conf_type, hw_monid); md.u.mon = hw_monid_val(hw_monid);
ret = mpam_rmid_to_partid_pmg(prgrp->mon.rmid, NULL, &pmg); diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 974a0b0784fa..690ed3f875e8 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -196,4 +196,8 @@ int __init mpam_resctrl_init(void); int mpam_resctrl_set_default_cpu(unsigned int cpu); void mpam_resctrl_clear_default_cpu(unsigned int cpu);
+int assoc_rmid_with_mon(u32 rmid); +void deassoc_rmid_with_mon(u32 rmid); +u32 get_rmid_mon(u32 rmid, enum resctrl_resource_level rid); +int rmid_mon_ptrs_init(u32 nr_rmids); #endif diff --git a/arch/arm64/kernel/mpam/mpam_mon.c b/arch/arm64/kernel/mpam/mpam_mon.c index 053f8501f7d2..154763d4d58b 100644 --- a/arch/arm64/kernel/mpam/mpam_mon.c +++ b/arch/arm64/kernel/mpam/mpam_mon.c @@ -37,51 +37,298 @@ */ bool rdt_mon_capable;
-/* - * A simple LRU monitor allocation machanism, each - * monitor free map occupies two section, one for - * allocation and another for recording. +struct rmid_entry { + u32 rmid; + u32 mon[RDT_NUM_RESOURCES]; + struct list_head mon_exclusive_q; + struct list_head mon_wait_q; +}; + +/** + * @rmid_mon_exclusive_all List of allocated RMIDs with + * exclusive available mon. + */ +static LIST_HEAD(rmid_mon_exclusive_all); + +/** + * @rmid_mon_wait_all List of allocated RMIDs with default + * 0 mon and wait for exclusive available mon. + */ +static LIST_HEAD(rmid_mon_wait_all); + +static u32 rmid_ptrs_len; + +/** + * @rmid_entry - The entry in the mon list. */ -static int mon_free_map[2]; -static u8 alloc_idx, record_idx; +static struct rmid_entry *rmid_ptrs;
-void mon_init(void) +static int mon_free_map[RDT_NUM_RESOURCES]; + +static void mon_init(void) { - int num_mon; + u16 mon_num; u32 times, flag; + struct mpam_resctrl_res *res; + struct resctrl_resource *r; + struct raw_resctrl_resource *rr;
- num_mon = mpam_resctrl_max_mon_num(); - - hw_alloc_times_validate(times, flag); - /* for cdp on or off */ - num_mon = rounddown(num_mon, times); + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + rr = r->res;
- mon_free_map[0] = BIT_MASK(num_mon) - 1; - mon_free_map[1] = 0; + hw_alloc_times_validate(times, flag); + /* for cdp*/ + mon_num = rounddown(rr->num_mon, times); + mon_free_map[r->rid] = BIT_MASK(mon_num) - 1;
- alloc_idx = 0; - record_idx = 1; + /* mon = 0 is reserved */ + mon_free_map[r->rid] &= ~(BIT_MASK(times) - 1); + } }
-int resctrl_lru_request_mon(void) +static u32 mon_alloc(enum resctrl_resource_level rid) { u32 mon = 0; u32 times, flag;
hw_alloc_times_validate(times, flag);
- mon = ffs(mon_free_map[alloc_idx]); + mon = ffs(mon_free_map[rid]); if (mon == 0) return -ENOSPC;
mon--; - mon_free_map[alloc_idx] &= ~(GENMASK(mon + times - 1, mon)); - mon_free_map[record_idx] |= GENMASK(mon + times - 1, mon); + mon_free_map[rid] &= ~(GENMASK(mon + times - 1, mon)); + + return mon; +} + +static void mon_free(u32 mon, enum resctrl_resource_level rid) +{ + u32 times, flag; + + hw_alloc_times_validate(times, flag); + mon_free_map[rid] |= GENMASK(mon + times - 1, mon); +} + +static inline struct rmid_entry *__rmid_entry(u32 rmid) +{ + struct rmid_entry *entry; + + if (rmid >= rmid_ptrs_len) + return NULL; + + entry = &rmid_ptrs[rmid]; + WARN_ON(entry->rmid != rmid); + + return entry; +}
- if (!mon_free_map[alloc_idx]) { - alloc_idx = record_idx; - record_idx ^= 0x1; +static void mon_wait_q_init(void) +{ + INIT_LIST_HEAD(&rmid_mon_wait_all); +} + +static void mon_exclusive_q_init(void) +{ + INIT_LIST_HEAD(&rmid_mon_exclusive_all); +} + +static void put_mon_wait_q(struct rmid_entry *entry) +{ + list_add_tail(&entry->mon_wait_q, &rmid_mon_wait_all); +} + +static void put_mon_exclusive_q(struct rmid_entry *entry) +{ + list_add_tail(&entry->mon_exclusive_q, &rmid_mon_exclusive_all); +} + +static void mon_wait_q_del(struct rmid_entry *entry) +{ + list_del(&entry->mon_wait_q); +} + +static void mon_exclusive_q_del(struct rmid_entry *entry) +{ + list_del(&entry->mon_exclusive_q); +} + +static int is_mon_wait_q_exist(u32 rmid) +{ + struct rmid_entry *entry; + + list_for_each_entry(entry, &rmid_mon_wait_all, mon_wait_q) { + if (entry->rmid == rmid) + return 1; }
- return mon; + return 0; +} + +static int is_mon_exclusive_q_exist(u32 rmid) +{ + struct rmid_entry *entry; + + list_for_each_entry(entry, &rmid_mon_exclusive_all, mon_exclusive_q) { + if (entry->rmid == rmid) + return 1; + } + + return 0; +} + +static int is_rmid_mon_wait_q_exist(u32 rmid) +{ + struct rmid_entry *entry; + + list_for_each_entry(entry, &rmid_mon_wait_all, mon_wait_q) { + if (entry->rmid == rmid) + return 1; + } + + return 0; +} + +int rmid_mon_ptrs_init(u32 nr_rmids) +{ + struct rmid_entry *entry = NULL; + int i; + + if (rmid_ptrs) + kfree(rmid_ptrs); + + rmid_ptrs = kcalloc(nr_rmids, sizeof(struct rmid_entry), GFP_KERNEL); + if (!rmid_ptrs) + return -ENOMEM; + + rmid_ptrs_len = nr_rmids; + + for (i = 0; i < nr_rmids; i++) { + entry = &rmid_ptrs[i]; + entry->rmid = i; + } + + mon_exclusive_q_init(); + mon_wait_q_init(); + + /* + * RMID 0 is special and is always allocated. It's used for all + * tasks monitoring. + */ + entry = __rmid_entry(0); + if (!entry) { + kfree(rmid_ptrs); + rmid_ptrs = NULL; + return -EINVAL; + } + + put_mon_exclusive_q(entry); + + mon_init(); + + return 0; +} + +int assoc_rmid_with_mon(u32 rmid) +{ + int mon; + bool has_mon_wait = false; + struct rmid_entry *entry; + struct mpam_resctrl_res *res; + struct resctrl_resource *r; + + if (is_mon_exclusive_q_exist(rmid) || + is_rmid_mon_wait_q_exist(rmid)) + return -EINVAL; + + entry = __rmid_entry(rmid); + if (!entry) + return -EINVAL; + + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (!r->mon_enabled) + continue; + + mon = mon_alloc(r->rid); + if (mon < 0) { + entry->mon[r->rid] = 0; + has_mon_wait = true; + } else { + entry->mon[r->rid] = mon; + } + } + + if (has_mon_wait) + put_mon_wait_q(entry); + else + put_mon_exclusive_q(entry); + + return 0; +} + +void deassoc_rmid_with_mon(u32 rmid) +{ + bool has_mon_wait; + struct mpam_resctrl_res *res; + struct resctrl_resource *r; + struct rmid_entry *entry = __rmid_entry(rmid); + struct rmid_entry *wait, 
*tmp; + + if (!entry) + return; + + if (!is_mon_wait_q_exist(rmid) && + !is_mon_exclusive_q_exist(rmid)) + return; + + if (is_mon_wait_q_exist(rmid)) + mon_wait_q_del(entry); + else + mon_exclusive_q_del(entry); + + list_for_each_entry_safe(wait, tmp, &rmid_mon_wait_all, mon_wait_q) { + has_mon_wait = false; + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (!r->mon_enabled) + continue; + + if (!wait->mon[r->rid]) { + wait->mon[r->rid] = entry->mon[r->rid]; + entry->mon[r->rid] = 0; + } + + if (!wait->mon[r->rid]) + has_mon_wait = true; + } + if (!has_mon_wait) { + mon_wait_q_del(wait); + put_mon_exclusive_q(wait); + } + } + + for_each_supported_resctrl_exports(res) { + r = &res->resctrl_res; + if (!r->mon_enabled) + continue; + + if (entry->mon[r->rid]) + mon_free(entry->mon[r->rid], r->rid); + } +} + +u32 get_rmid_mon(u32 rmid, enum resctrl_resource_level rid) +{ + struct rmid_entry *entry = __rmid_entry(rmid); + + if (!entry) + return 0; + + if (!is_mon_wait_q_exist(rmid) && !is_mon_exclusive_q_exist(rmid)) + return 0; + + return entry->mon[rid]; } diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 418defc05e06..5292fea4a398 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -69,11 +69,6 @@ int max_name_width, max_data_width; */ bool rdt_alloc_capable;
-/* - * Indicate the max number of monitor supported. - */ -static u32 max_mon_num; - /* * Indicate if had mount cdpl2/cdpl3 option. */ @@ -779,8 +774,11 @@ static int rmid_remap_matrix_init(void)
STRIDE_CHK_AND_WARN(stride);
- return 0; + ret = rmid_mon_ptrs_init(rmid_remap_matrix.nr_usage); + if (ret) + goto out;
+ return 0; out: return ret; } @@ -793,13 +791,7 @@ int resctrl_id_init(void) if (ret) return ret;
- ret = rmid_remap_matrix_init(); - if (ret) - return ret; - - mon_init(); - - return 0; + return rmid_remap_matrix_init(); }
static int is_rmid_valid(int rmid) @@ -874,6 +866,10 @@ static int __rmid_alloc(int partid) goto out; }
+ ret = assoc_rmid_with_mon(rmid[0]); + if (ret) + goto out; + return rmid[0];
out: @@ -913,6 +909,8 @@ void rmid_free(int rmid) }
STRIDE_CHK_AND_WARN(stride); + + deassoc_rmid_with_mon(rmid); }
int mpam_rmid_to_partid_pmg(int rmid, int *partid, int *pmg) @@ -1989,25 +1987,3 @@ void resctrl_resource_reset(void) */ resctrl_cdp_enabled = false; } - -u16 mpam_resctrl_max_mon_num(void) -{ - struct mpam_resctrl_res *res; - u16 mon_num = USHRT_MAX; - struct raw_resctrl_resource *rr; - - if (max_mon_num) - return max_mon_num; - - for_each_supported_resctrl_exports(res) { - rr = res->resctrl_res.res; - mon_num = min(mon_num, rr->num_mon); - } - - if (mon_num == USHRT_MAX) - mon_num = 0; - - max_mon_num = mon_num; - - return mon_num; -} diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index 3eb2d0d70703..2b6731c21532 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -349,25 +349,14 @@ static void mkdir_mondata_all_prepare_clean(struct resctrl_group *prgrp)
static int mkdir_mondata_all_prepare(struct resctrl_group *rdtgrp) { - int ret = 0; - int mon; struct resctrl_group *prgrp;
- mon = resctrl_lru_request_mon(); - if (mon < 0) { - rdt_last_cmd_puts("out of monitors\n"); - ret = -EINVAL; - goto out; - } - rdtgrp->mon.mon = mon; - if (rdtgrp->type == RDTMON_GROUP) { prgrp = rdtgrp->mon.parent; rdtgrp->closid.intpartid = prgrp->closid.intpartid; }
-out: - return ret; + return 0; }
/*
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Structure resctrl_ctrl_feature, held by each resource, is introduced to manage ctrl features; it records characteristics such as the maximum width accepted from user input and the numeric base we parse from.
It is now more practical to declare a new ctrl feature. For example, the SCHEMA_PRI feature, associated only with the internal priority setting exported by MPAM devices, has its information collected in mpam_resctrl_resource_init() and can then be enabled or disabled through user options.
The resctrl_ctrl_feature structure contains a flags field to avoid duplicate control types. For instance, the SCHEMA_COMM feature selects cpbm (cache portion bitmap) as the default control type for the Cache resource, so that feature must not be enabled again if the user manually selects the cpbm control type through mount options.
The evt field in the resctrl_ctrl_feature structure is an enum rdt_event_id variable that works as commit eee4ad2a36e6 ("arm64/mpam: Add hook-events id for ctrl features") illustrates.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 25 ++- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 29 ++- arch/arm64/kernel/mpam/mpam_internal.h | 1 + arch/arm64/kernel/mpam/mpam_resctrl.c | 287 ++++++++++++------------- arch/arm64/kernel/mpam/mpam_setup.c | 136 +++++++----- include/linux/resctrlfs.h | 3 - 6 files changed, 260 insertions(+), 221 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 0414fdc5cb0e..e6fd2b7c72b0 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -124,14 +124,28 @@ enum resctrl_ctrl_type { SCHEMA_NUM_CTRL_TYPE };
+struct resctrl_ctrl_feature { + enum resctrl_ctrl_type type; + int flags; + const char *name; + + u32 max_wd; + + int base; + int evt; + + int default_ctrl; + + bool capable; + bool enabled; +}; + #define for_each_ctrl_type(t) \ for (t = SCHEMA_COMM; t != SCHEMA_NUM_CTRL_TYPE; t++)
#define for_each_extend_ctrl_type(t) \ for (t = SCHEMA_PRI; t != SCHEMA_NUM_CTRL_TYPE; t++)
-bool resctrl_ctrl_extend_bits_match(u32 bitmap, enum resctrl_ctrl_type type); - enum resctrl_conf_type { CDP_BOTH = 0, CDP_CODE, @@ -319,11 +333,10 @@ struct raw_resctrl_resource { u16 num_intpartid; u16 num_pmg;
- u16 extend_ctrls_wd[SCHEMA_NUM_CTRL_TYPE]; - void (*msr_update)(struct resctrl_resource *r, struct rdt_domain *d, struct msr_param *para); - u64 (*msr_read)(struct rdt_domain *d, struct msr_param *para); + u64 (*msr_read)(struct resctrl_resource *r, struct rdt_domain *d, + struct msr_param *para);
int data_width; const char *format_str; @@ -334,6 +347,8 @@ struct raw_resctrl_resource { u16 num_mon; u64 (*mon_read)(struct rdt_domain *d, void *md_priv); int (*mon_write)(struct rdt_domain *d, void *md_priv); + + struct resctrl_ctrl_feature ctrl_features[SCHEMA_NUM_CTRL_TYPE]; };
/* 64bit arm64 specified */ diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index b52cbad5c50e..73e5d9c8c033 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -96,8 +96,8 @@ static int add_schema(enum resctrl_conf_type t, struct resctrl_resource *r) rr = r->res; INIT_LIST_HEAD(&s->schema_ctrl_list); for_each_extend_ctrl_type(type) { - if (!resctrl_ctrl_extend_bits_match(r->ctrl_extend_bits, type) || - !rr->extend_ctrls_wd[type]) + if (!rr->ctrl_features[type].enabled || + !rr->ctrl_features[type].max_wd) continue;
sc = kzalloc(sizeof(*sc), GFP_KERNEL); @@ -285,6 +285,9 @@ parse_line(char *line, struct resctrl_resource *r, unsigned long dom_id; hw_closid_t hw_closid;
+ if (!rr->ctrl_features[ctrl_type].enabled) + return -EINVAL; + next: if (!line || line[0] == '\0') return 0; @@ -432,6 +435,9 @@ static void show_doms(struct seq_file *s, struct resctrl_resource *r, bool prev_auto_fill = false; u32 reg_val;
+ if (!rr->ctrl_features[type].enabled) + return; + para.closid = closid; para.type = type;
@@ -440,15 +446,15 @@ static void show_doms(struct seq_file *s, struct resctrl_resource *r,
seq_printf(s, "%*s:", max_name_width, schema_name); list_for_each_entry(dom, &r->domains, list) { - reg_val = rr->msr_read(dom, ¶); + reg_val = rr->msr_read(r, dom, ¶);
- if (rg && reg_val == r->default_ctrl[SCHEMA_COMM] && - prev_auto_fill == true) + if (reg_val == rr->ctrl_features[SCHEMA_COMM].default_ctrl && + rg && prev_auto_fill == true) continue;
if (sep) seq_puts(s, ";"); - if (rg && reg_val == r->default_ctrl[SCHEMA_COMM]) { + if (rg && reg_val == rr->ctrl_features[SCHEMA_COMM].default_ctrl) { prev_auto_fill = true; seq_puts(s, "S"); } else { @@ -754,22 +760,24 @@ static void rdtgroup_init_mba(struct resctrl_schema *s, u32 closid) { struct resctrl_staged_config *cfg; struct resctrl_resource *r; + struct raw_resctrl_resource *rr; struct rdt_domain *d; enum resctrl_ctrl_type t;
r = s->res; if (WARN_ON(!r)) return; + rr = r->res;
list_for_each_entry(d, &s->res->domains, list) { cfg = &d->staged_cfg[CDP_BOTH]; cfg->cdp_both_ctrl = s->cdp_mc_both; - cfg->new_ctrl[SCHEMA_COMM] = r->default_ctrl[SCHEMA_COMM]; + cfg->new_ctrl[SCHEMA_COMM] = rr->ctrl_features[SCHEMA_COMM].default_ctrl; resctrl_cdp_map(clos, closid, CDP_BOTH, cfg->hw_closid); cfg->have_new_ctrl = true; /* Set extension ctrl default value, e.g. priority/hardlimit */ for_each_extend_ctrl_type(t) { - cfg->new_ctrl[t] = r->default_ctrl[t]; + cfg->new_ctrl[t] = rr->ctrl_features[t].default_ctrl; } } } @@ -791,6 +799,7 @@ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) enum resctrl_ctrl_type ctrl_type; struct rdt_domain *d; struct resctrl_resource *r; + struct raw_resctrl_resource *rr; u32 used_b = 0; u32 unused_b = 0; unsigned long tmp_cbm; @@ -798,6 +807,7 @@ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) r = s->res; if (WARN_ON(!r)) return -EINVAL; + rr = r->res;
list_for_each_entry(d, &s->res->domains, list) { cfg = &d->staged_cfg[conf_type]; @@ -827,7 +837,8 @@ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) * with MPAM capabilities. */ for_each_extend_ctrl_type(ctrl_type) { - cfg->new_ctrl[ctrl_type] = r->default_ctrl[ctrl_type]; + cfg->new_ctrl[ctrl_type] = + rr->ctrl_features[ctrl_type].default_ctrl; } }
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 690ed3f875e8..d74989e03993 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -19,6 +19,7 @@ extern bool rdt_mon_capable; extern struct list_head mpam_classes;
#define MAX_MBA_BW 100u +#define GRAN_MBA_BW 2u
#define MPAM_ERRCODE_NONE 0 #define MPAM_ERRCODE_PARTID_SEL_RANGE 1 diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 5292fea4a398..2b9f0f7dca93 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -105,22 +105,6 @@ bool is_resctrl_cdp_enabled(void) return !!resctrl_cdp_enabled; }
-static void -resctrl_ctrl_extend_bits_set(u32 *bitmap, enum resctrl_ctrl_type type) -{ - *bitmap |= BIT(type); -} - -static void resctrl_ctrl_extend_bits_clear(u32 *bitmap) -{ - *bitmap = 0; -} - -bool resctrl_ctrl_extend_bits_match(u32 bitmap, enum resctrl_ctrl_type type) -{ - return bitmap & BIT(type); -} - static void mpam_resctrl_update_component_cfg(struct resctrl_resource *r, struct rdt_domain *d, struct sd_closid *closid); @@ -129,8 +113,10 @@ static void common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, struct msr_param *para);
-static u64 cache_rdmsr(struct rdt_domain *d, struct msr_param *para); -static u64 mbw_rdmsr(struct rdt_domain *d, struct msr_param *para); +static u64 cache_rdmsr(struct resctrl_resource *r, struct rdt_domain *d, + struct msr_param *para); +static u64 mbw_rdmsr(struct resctrl_resource *r, struct rdt_domain *d, + struct msr_param *para);
static u64 cache_rdmon(struct rdt_domain *d, void *md_priv); static u64 mbw_rdmon(struct rdt_domain *d, void *md_priv); @@ -150,6 +136,23 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .format_str = "%d=%0*x", .mon_read = cache_rdmon, .mon_write = common_wrmon, + .ctrl_features = { + [SCHEMA_COMM] = { + .type = SCHEMA_COMM, + .flags = SCHEMA_COMM, + .name = "comm", + .base = 16, + .evt = QOS_CAT_CPBM_EVENT_ID, + .capable = 1, + }, + [SCHEMA_PRI] = { + .type = SCHEMA_PRI, + .flags = SCHEMA_PRI, + .name = "caPrio", + .base = 10, + .evt = QOS_CAT_INTPRI_EVENT_ID, + }, + }, }, [RDT_RESOURCE_L2] = { .msr_update = common_wrmsr, @@ -158,6 +161,23 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .format_str = "%d=%0*x", .mon_read = cache_rdmon, .mon_write = common_wrmon, + .ctrl_features = { + [SCHEMA_COMM] = { + .type = SCHEMA_COMM, + .flags = SCHEMA_COMM, + .name = "comm", + .base = 16, + .evt = QOS_CAT_CPBM_EVENT_ID, + .capable = 1, + }, + [SCHEMA_PRI] = { + .type = SCHEMA_PRI, + .flags = SCHEMA_PRI, + .name = "caPrio", + .base = 10, + .evt = QOS_CAT_INTPRI_EVENT_ID, + }, + }, }, [RDT_RESOURCE_MC] = { .msr_update = common_wrmsr, @@ -166,6 +186,30 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .format_str = "%d=%0*d", .mon_read = mbw_rdmon, .mon_write = common_wrmon, + .ctrl_features = { + [SCHEMA_COMM] = { + .type = SCHEMA_COMM, + .flags = SCHEMA_COMM, + .name = "comm", + .base = 10, + .evt = QOS_MBA_MAX_EVENT_ID, + .capable = 1, + }, + [SCHEMA_PRI] = { + .type = SCHEMA_PRI, + .flags = SCHEMA_PRI, + .name = "mbPrio", + .base = 10, + .evt = QOS_MBA_INTPRI_EVENT_ID, + }, + [SCHEMA_HDL] = { + .type = SCHEMA_HDL, + .flags = SCHEMA_HDL, + .name = "mbHdl", + .base = 10, + .evt = QOS_MBA_HDL_EVENT_ID, + }, + }, }, };
@@ -188,28 +232,18 @@ parse_cache(char *buf, struct resctrl_resource *r, enum resctrl_ctrl_type type) { unsigned long data; + struct raw_resctrl_resource *rr = r->res;
if (cfg->have_new_ctrl) { rdt_last_cmd_printf("duplicate domain\n"); return -EINVAL; }
- switch (type) { - case SCHEMA_COMM: - if (kstrtoul(buf, 16, &data)) - return -EINVAL; - break; - case SCHEMA_PRI: - if (kstrtoul(buf, 10, &data)) - return -EINVAL; - break; - case SCHEMA_HDL: - if (kstrtoul(buf, 10, &data)) - return -EINVAL; - break; - default: + if (kstrtoul(buf, rr->ctrl_features[type].base, &data)) + return -EINVAL; + + if (data >= rr->ctrl_features[type].max_wd) return -EINVAL; - }
cfg->new_ctrl[type] = data; cfg->have_new_ctrl = true; @@ -217,54 +251,35 @@ parse_cache(char *buf, struct resctrl_resource *r, return 0; }
-static bool bw_validate(char *buf, unsigned long *data, - struct resctrl_resource *r) -{ - unsigned long bw; - int ret; - - ret = kstrtoul(buf, 10, &bw); - if (ret) { - rdt_last_cmd_printf("non-hex character in mask %s\n", buf); - return false; - } - - bw = bw > MAX_MBA_BW ? MAX_MBA_BW : bw; - bw = bw < r->mbw.min_bw ? r->mbw.min_bw : bw; - *data = roundup(bw, r->mbw.bw_gran); - - return true; -} - static int parse_bw(char *buf, struct resctrl_resource *r, struct resctrl_staged_config *cfg, enum resctrl_ctrl_type type) { unsigned long data; + struct raw_resctrl_resource *rr = r->res;
if (cfg->have_new_ctrl) { rdt_last_cmd_printf("duplicate domain\n"); return -EINVAL; }
- switch (type) { - case SCHEMA_COMM: - if (!bw_validate(buf, &data, r)) + switch (rr->ctrl_features[type].evt) { + case QOS_MBA_MAX_EVENT_ID: + if (kstrtoul(buf, rr->ctrl_features[type].base, &data)) return -EINVAL; + data = (data < r->mbw.min_bw) ? r->mbw.min_bw : data; + data = roundup(data, r->mbw.bw_gran); break; - case SCHEMA_PRI: - if (kstrtoul(buf, 10, &data)) - return -EINVAL; - break; - case SCHEMA_HDL: - if (kstrtoul(buf, 10, &data)) + default: + if (kstrtoul(buf, rr->ctrl_features[type].base, &data)) return -EINVAL; break; - default: - return -EINVAL; }
+ if (data >= rr->ctrl_features[type].max_wd) + return -EINVAL; + cfg->new_ctrl[type] = data; cfg->have_new_ctrl = true;
@@ -290,61 +305,43 @@ common_wrmsr(struct resctrl_resource *r, struct rdt_domain *d, mpam_component_config(dom->comp, &args); }
-static u64 cache_rdmsr(struct rdt_domain *d, struct msr_param *para) +static u64 cache_rdmsr(struct resctrl_resource *r, struct rdt_domain *d, + struct msr_param *para) { - u32 result, intpri, dspri; + u32 result; struct sync_args args; struct mpam_resctrl_dom *dom; + struct raw_resctrl_resource *rr = r->res;
args.closid = *para->closid; dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom);
- switch (para->type) { - case SCHEMA_COMM: - args.eventid = QOS_CAT_CPBM_EVENT_ID; - mpam_component_get_config(dom->comp, &args, &result); - break; - case SCHEMA_PRI: - args.eventid = QOS_CAT_INTPRI_EVENT_ID; - mpam_component_get_config(dom->comp, &args, &intpri); - args.eventid = QOS_MBA_DSPRI_EVENT_ID; - mpam_component_get_config(dom->comp, &args, &dspri); - result = (intpri > dspri) ? intpri : dspri; - break; - default: - return 0; - } + args.eventid = rr->ctrl_features[para->type].evt; + mpam_component_get_config(dom->comp, &args, &result);
return result; }
-static u64 mbw_rdmsr(struct rdt_domain *d, struct msr_param *para) +static u64 mbw_rdmsr(struct resctrl_resource *r, struct rdt_domain *d, + struct msr_param *para) { - u32 result, intpri, dspri; + u32 result; struct sync_args args; struct mpam_resctrl_dom *dom; + struct raw_resctrl_resource *rr = r->res;
args.closid = *para->closid; dom = container_of(d, struct mpam_resctrl_dom, resctrl_dom);
- switch (para->type) { - case SCHEMA_COMM: - args.eventid = QOS_MBA_MAX_EVENT_ID; - mpam_component_get_config(dom->comp, &args, &result); - break; - case SCHEMA_PRI: - args.eventid = QOS_MBA_INTPRI_EVENT_ID; - mpam_component_get_config(dom->comp, &args, &intpri); - args.eventid = QOS_MBA_DSPRI_EVENT_ID; - mpam_component_get_config(dom->comp, &args, &dspri); - result = (intpri > dspri) ? intpri : dspri; - break; - case SCHEMA_HDL: - args.eventid = QOS_MBA_HDL_EVENT_ID; - mpam_component_get_config(dom->comp, &args, &result); + args.eventid = rr->ctrl_features[para->type].evt; + mpam_component_get_config(dom->comp, &args, &result); + + switch (rr->ctrl_features[para->type].evt) { + case QOS_MBA_MAX_EVENT_ID: + result = roundup(result, r->mbw.bw_gran); break; default: - return 0; + break; }
return result; @@ -1028,27 +1025,25 @@ static int cdpl2_enable(void) static void basic_ctrl_enable(void) { struct mpam_resctrl_res *res; - struct resctrl_resource *r; + struct raw_resctrl_resource *rr;
for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; + rr = res->resctrl_res.res; /* At least SCHEMA_COMM is supported */ - resctrl_ctrl_extend_bits_set(&r->ctrl_extend_bits, SCHEMA_COMM); + rr->ctrl_features[SCHEMA_COMM].enabled = true; } }
static int extend_ctrl_enable(enum resctrl_ctrl_type type) { bool match = false; - struct resctrl_resource *r; struct raw_resctrl_resource *rr; struct mpam_resctrl_res *res;
for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; - rr = r->res; - if (rr->extend_ctrls_wd[type]) { - resctrl_ctrl_extend_bits_set(&r->ctrl_extend_bits, type); + rr = res->resctrl_res.res; + if (rr->ctrl_features[type].capable) { + rr->ctrl_features[type].enabled = true; match = true; } } @@ -1061,12 +1056,13 @@ static int extend_ctrl_enable(enum resctrl_ctrl_type type)
static void extend_ctrl_disable(void) { - struct resctrl_resource *r; + struct raw_resctrl_resource *rr; struct mpam_resctrl_res *res;
for_each_supported_resctrl_exports(res) { - r = &res->resctrl_res; - resctrl_ctrl_extend_bits_clear(&r->ctrl_extend_bits); + rr = res->resctrl_res.res; + rr->ctrl_features[SCHEMA_PRI].enabled = false; + rr->ctrl_features[SCHEMA_HDL].enabled = false; } }
@@ -1848,48 +1844,32 @@ void __mpam_sched_in(void)
static void mpam_update_from_resctrl_cfg(struct mpam_resctrl_res *res, - u32 resctrl_cfg, enum resctrl_ctrl_type ctrl_type, + u32 resctrl_cfg, enum rdt_event_id evt, struct mpam_config *mpam_cfg) { - switch (ctrl_type) { - case SCHEMA_COMM: - if (res == &mpam_resctrl_exports[RDT_RESOURCE_MC]) { - u64 range; - - /* For MBA cfg is a percentage of .. */ - if (res->resctrl_mba_uses_mbw_part) { - /* .. the number of bits we can set */ - range = res->class->mbw_pbm_bits; - mpam_cfg->mbw_pbm = - (resctrl_cfg * range) / MAX_MBA_BW; - mpam_set_feature(mpam_feat_mbw_part, &mpam_cfg->valid); - } else { - /* .. the number of fractions we can represent */ - range = MBW_MAX_BWA_FRACT(res->class->bwa_wd); - mpam_cfg->mbw_max = (resctrl_cfg * range) / (MAX_MBA_BW - 1); - mpam_cfg->mbw_max = - (mpam_cfg->mbw_max > range) ? range : mpam_cfg->mbw_max; - mpam_set_feature(mpam_feat_mbw_max, &mpam_cfg->valid); - } - } else { - /* - * Nothing clever here as mpam_resctrl_pick_caches() - * capped the size at RESCTRL_MAX_CBM. - */ - mpam_cfg->cpbm = resctrl_cfg; - mpam_set_feature(mpam_feat_cpor_part, &mpam_cfg->valid); - } - break; - case SCHEMA_PRI: - mpam_cfg->dspri = resctrl_cfg; - mpam_cfg->intpri = resctrl_cfg; - mpam_set_feature(mpam_feat_dspri_part, &mpam_cfg->valid); - mpam_set_feature(mpam_feat_intpri_part, &mpam_cfg->valid); + u64 range; + + switch (evt) { + case QOS_MBA_MAX_EVENT_ID: + /* cfg is a percentage of the number of fractions we can represent */ + range = MBW_MAX_BWA_FRACT(res->class->bwa_wd); + mpam_cfg->mbw_max = (resctrl_cfg * range) / (MAX_MBA_BW - 1); + mpam_cfg->mbw_max = + (mpam_cfg->mbw_max > range) ?
range : mpam_cfg->mbw_max; + mpam_set_feature(mpam_feat_mbw_max, &mpam_cfg->valid); break; - case SCHEMA_HDL: + case QOS_MBA_HDL_EVENT_ID: mpam_cfg->hdl = resctrl_cfg; mpam_set_feature(mpam_feat_part_hdl, &mpam_cfg->valid); break; + case QOS_CAT_CPBM_EVENT_ID: + mpam_cfg->cpbm = resctrl_cfg; + mpam_set_feature(mpam_feat_cpor_part, &mpam_cfg->valid); + break; + case QOS_CAT_INTPRI_EVENT_ID: + mpam_cfg->intpri = resctrl_cfg; + mpam_set_feature(mpam_feat_intpri_part, &mpam_cfg->valid); + break; default: break; } @@ -1908,6 +1888,7 @@ mpam_resctrl_update_component_cfg(struct resctrl_resource *r, struct mpam_resctrl_dom *dom; struct mpam_resctrl_res *res; struct mpam_config *slave_mpam_cfg; + struct raw_resctrl_resource *rr = r->res; enum resctrl_ctrl_type type; u32 intpartid = closid->intpartid; u32 reqpartid = closid->reqpartid; @@ -1935,11 +1916,9 @@ mpam_resctrl_update_component_cfg(struct resctrl_resource *r, slave_mpam_cfg->valid = 0;
for_each_ctrl_type(type) { - /* - * we don't need check if we have enabled this ctrl type, because - * this ctrls also should be applied an default configuration and - * this feature type would be rechecked when configuring mpam devices. - */ + if (!rr->ctrl_features[type].enabled) + continue; + resctrl_cfg = d->ctrl_val[type][intpartid]; mpam_update_from_resctrl_cfg(res, resctrl_cfg, type, slave_mpam_cfg); @@ -1952,13 +1931,15 @@ static void mpam_reset_cfg(struct mpam_resctrl_res *res, { int i; struct resctrl_resource *r = &res->resctrl_res; + struct raw_resctrl_resource *rr = r->res; enum resctrl_ctrl_type type;
for (i = 0; i != mpam_sysprops_num_partid(); i++) { for_each_ctrl_type(type) { - mpam_update_from_resctrl_cfg(res, r->default_ctrl[type], - type, &dom->comp->cfg[i]); - d->ctrl_val[type][i] = r->default_ctrl[type]; + mpam_update_from_resctrl_cfg(res, + rr->ctrl_features[type].default_ctrl, + rr->ctrl_features[type].evt, &dom->comp->cfg[i]); + d->ctrl_val[type][i] = rr->ctrl_features[type].default_ctrl; } } } diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index ef922a796ff8..18b1e5db5c0a 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -334,13 +334,6 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) struct resctrl_resource *r = &res->resctrl_res; struct raw_resctrl_resource *rr = NULL;
- if (class && !r->default_ctrl) { - r->default_ctrl = kmalloc_array(SCHEMA_NUM_CTRL_TYPE, - sizeof(*r->default_ctrl), GFP_KERNEL); - if (!r->default_ctrl) - return -ENOMEM; - } - if (class == mpam_resctrl_exports[RDT_RESOURCE_SMMU].class) { return 0; } else if (class == mpam_resctrl_exports[RDT_RESOURCE_MC].class) { @@ -373,17 +366,32 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->mbw.min_bw = MAX_MBA_BW / ((1ULL << class->bwa_wd) - 1); /* the largest mbw_max is 100 */ - r->default_ctrl[SCHEMA_COMM] = 100; + rr->ctrl_features[SCHEMA_COMM].default_ctrl = MAX_MBA_BW; + rr->ctrl_features[SCHEMA_COMM].max_wd = MAX_MBA_BW + 1; + rr->ctrl_features[SCHEMA_COMM].capable = true; + } + + if (mpam_has_feature(mpam_feat_intpri_part, class->features)) { + /* + * Export the internal priority setting; max_wd represents the + * max level of control we can export to resctrl. The default + * priority comes from hardware, nothing clever here. + */ + rr->ctrl_features[SCHEMA_PRI].max_wd = 1 << class->intpri_wd; + rr->ctrl_features[SCHEMA_PRI].default_ctrl = class->hwdef_intpri; + rr->ctrl_features[SCHEMA_PRI].capable = true; } + /* Just in case we have an excessive number of bits */ if (!r->mbw.min_bw) r->mbw.min_bw = 1;
/* - * because its linear with no offset, the granule is the same - * as the smallest value + * Because it is linear with no offset, the granule would be the same + * as the smallest value; it is a little fuzzy here, as a granularity + * of 1 would be too fine for percentage conversions. */ - r->mbw.bw_gran = r->mbw.min_bw; + r->mbw.bw_gran = GRAN_MBA_BW;
/* We will only pick a class that can monitor and control */ r->alloc_capable = true; @@ -392,8 +400,9 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->mon_capable = true; r->mon_enabled = true; /* Export memory bandwidth hardlimit, default active hardlimit */ - rr->extend_ctrls_wd[SCHEMA_HDL] = 2; - r->default_ctrl[SCHEMA_HDL] = 1; + rr->ctrl_features[SCHEMA_HDL].default_ctrl = 1; + rr->ctrl_features[SCHEMA_HDL].max_wd = 2; + rr->ctrl_features[SCHEMA_HDL].capable = true; } else if (class == mpam_resctrl_exports[RDT_RESOURCE_L3].class) { r->rid = RDT_RESOURCE_L3; rr = mpam_get_raw_resctrl_resource(RDT_RESOURCE_L3); @@ -402,22 +411,40 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->fflags = RFTYPE_RES_CACHE; r->name = "L3";
- r->cache.cbm_len = class->cpbm_wd; - r->default_ctrl[SCHEMA_COMM] = GENMASK(class->cpbm_wd - 1, 0); - /* - * Which bits are shared with other ...things... - * Unknown devices use partid-0 which uses all the bitmap - * fields. Until we configured the SMMU and GIC not to do this - * 'all the bits' is the correct answer here. - */ - r->cache.shareable_bits = r->default_ctrl[SCHEMA_COMM]; - r->cache.min_cbm_bits = 1; - if (mpam_has_feature(mpam_feat_cpor_part, class->features)) { - r->alloc_capable = true; - r->alloc_enabled = true; - rdt_alloc_capable = true; + r->cache.cbm_len = class->cpbm_wd; + rr->ctrl_features[SCHEMA_COMM].default_ctrl = GENMASK(class->cpbm_wd - 1, 0); + rr->ctrl_features[SCHEMA_COMM].max_wd = + rr->ctrl_features[SCHEMA_COMM].default_ctrl + 1; + rr->ctrl_features[SCHEMA_COMM].capable = true; + /* + * Which bits are shared with other ...things... + * Unknown devices use partid-0 which uses all the bitmap + * fields. Until we configured the SMMU and GIC not to do this + * 'all the bits' is the correct answer here. + */ + r->cache.shareable_bits = rr->ctrl_features[SCHEMA_COMM].default_ctrl; + r->cache.min_cbm_bits = 1; + } + + if (mpam_has_feature(mpam_feat_intpri_part, class->features)) { + /* + * Export the internal priority setting; max_wd represents the + * max level of control we can export to resctrl. The default + * priority comes from hardware, nothing clever here. + */ + rr->ctrl_features[SCHEMA_PRI].max_wd = 1 << class->intpri_wd; + rr->ctrl_features[SCHEMA_PRI].default_ctrl = class->hwdef_intpri; + rr->ctrl_features[SCHEMA_PRI].capable = true; } + /* + * Only a resource that is allocatable can be picked by + * mpam_resctrl_pick_caches(), so set the following + * fields to true directly.
+ */ + r->alloc_capable = true; + r->alloc_enabled = true; + rdt_alloc_capable = true; /* * While this is a CPU-interface feature of MPAM, we only tell * resctrl about it for caches, as that seems to be how x86 @@ -435,22 +462,40 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->fflags = RFTYPE_RES_CACHE; r->name = "L2";
- r->cache.cbm_len = class->cpbm_wd; - r->default_ctrl[SCHEMA_COMM] = GENMASK(class->cpbm_wd - 1, 0); - /* - * Which bits are shared with other ...things... - * Unknown devices use partid-0 which uses all the bitmap - * fields. Until we configured the SMMU and GIC not to do this - * 'all the bits' is the correct answer here. - */ - r->cache.shareable_bits = r->default_ctrl[SCHEMA_COMM]; - if (mpam_has_feature(mpam_feat_cpor_part, class->features)) { - r->alloc_capable = true; - r->alloc_enabled = true; - rdt_alloc_capable = true; + r->cache.cbm_len = class->cpbm_wd; + rr->ctrl_features[SCHEMA_COMM].default_ctrl = GENMASK(class->cpbm_wd - 1, 0); + rr->ctrl_features[SCHEMA_COMM].max_wd = + rr->ctrl_features[SCHEMA_COMM].default_ctrl + 1; + rr->ctrl_features[SCHEMA_COMM].capable = true; + /* + * Which bits are shared with other ...things... + * Unknown devices use partid-0 which uses all the bitmap + * fields. Until we configured the SMMU and GIC not to do this + * 'all the bits' is the correct answer here. + */ + r->cache.shareable_bits = rr->ctrl_features[SCHEMA_COMM].default_ctrl; }
+ if (mpam_has_feature(mpam_feat_intpri_part, class->features)) { + /* + * Export the internal priority setting; max_wd represents the + * max level of control we can export to resctrl. The default + * priority comes from hardware, nothing clever here. + */ + rr->ctrl_features[SCHEMA_PRI].max_wd = 1 << class->intpri_wd; + rr->ctrl_features[SCHEMA_PRI].default_ctrl = class->hwdef_intpri; + rr->ctrl_features[SCHEMA_PRI].capable = true; + } + /* + * Only a resource that is allocatable can be picked by + * mpam_resctrl_pick_caches(), so set the following + * fields to true directly. + */ + r->alloc_capable = true; + r->alloc_enabled = true; + rdt_alloc_capable = true; + /* * While this is a CPU-interface feature of MPAM, we only tell * resctrl about it for caches, as that seems to be how x86 @@ -464,17 +509,6 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) rr->num_partid = class->num_partid; rr->num_intpartid = class->num_intpartid; rr->num_pmg = class->num_pmg; - - /* - * Export priority setting, extend_ctrls_wd represents the - * max level of control we can export. this default priority - * is just from hardware, no need to define another default - * value. - */ - rr->extend_ctrls_wd[SCHEMA_PRI] = 1 << max(class->intpri_wd, - class->dspri_wd); - r->default_ctrl[SCHEMA_PRI] = max(class->hwdef_intpri, - class->hwdef_dspri); }
return 0; diff --git a/include/linux/resctrlfs.h b/include/linux/resctrlfs.h index 7f1ff6e816f5..287be52e2385 100644 --- a/include/linux/resctrlfs.h +++ b/include/linux/resctrlfs.h @@ -58,9 +58,6 @@ struct resctrl_resource {
bool cdp_capable; bool cdp_enable; - u32 *default_ctrl; - - u32 ctrl_extend_bits;
void *res; };
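The move from the per-resource `default_ctrl` array and `ctrl_extend_bits` mask to a per-feature `ctrl_features[]` table can be illustrated with a small user-space sketch. Everything below is hypothetical mock code, not the kernel implementation; field names only loosely mirror the patch's `struct resctrl_ctrl_feature`. The point it demonstrates: a control type may only be enabled (as `extend_ctrl_enable()` does) when the hardware probe marked it capable.

```c
#include <stdbool.h>

/* Mock of the per-feature control table introduced by this patch.
 * Field names follow struct resctrl_ctrl_feature loosely. */
enum ctrl_type { SCHEMA_COMM, SCHEMA_PRI, SCHEMA_HDL, SCHEMA_NUM_CTRL_TYPE };

struct ctrl_feature {
	unsigned int max_wd;		/* number of representable values */
	unsigned int default_ctrl;	/* hardware reset value */
	bool capable;			/* probed from the device */
	bool enabled;			/* requested, e.g. via mount option */
};

/* Mirrors the shape of extend_ctrl_enable(): a control type is only
 * enabled when the resource is capable of it. */
static bool enable_ctrl(struct ctrl_feature feats[], enum ctrl_type t)
{
	if (!feats[t].capable)
		return false;
	feats[t].enabled = true;
	return true;
}
```

With such a table, `basic_ctrl_enable()` reduces to setting the `SCHEMA_COMM` entry's `enabled` flag, and `extend_ctrl_disable()` to clearing the `SCHEMA_PRI` and `SCHEMA_HDL` flags, as the hunks above show.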
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
Some resource properties such as closid and rmid are exported as in Intel RDT in our resctrl design, but there are two main differences. First, MB (Memory Bandwidth) is split into two directories, MB and MB_MON, to show the control-type and monitor-type properties respectively, just like LxCache. Second, we add a features sysfile under each resource's directory, which indicates the control-type properties of the corresponding resource, for instance the MB hardlimit.
e.g.
 > mount -t resctrl resctrl /sys/fs/resctrl -o mbHdl
 > cd /sys/fs/resctrl/ && cat info/MB/features
 mbHdl@1    # indicates the MBHDL setting's upper bound is 1
 > cat schemata
 L3:0=7fff;1=7fff;2=7fff;3=7fff
 MB:0=100;1=100;2=100;3=100
 MBHDL:0=1;1=1;2=1;3=1
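The `mbHdl@1` line above comes from `resctrl_features_show()`, which prints each enabled extended control as `<name>@<max_wd - 1>` in that control's base (10 or 16). A rough user-space sketch of the formatting; the `struct feature` type and `format_feature()` helper are hypothetical, only the output format follows the patch:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical mock of one ctrl feature's sysfile description. */
struct feature {
	const char *name;	/* e.g. "mbHdl" */
	unsigned int max_wd;	/* number of settings; upper bound is max_wd - 1 */
	int base;		/* 10 or 16, as in resctrl_features_show() */
};

/* Format "<name>@<max>" the way the features file shows it. */
static int format_feature(char *buf, size_t len, const struct feature *f)
{
	if (f->base == 16)
		return snprintf(buf, len, "%s@%x", f->name, f->max_wd - 1);
	return snprintf(buf, len, "%s@%u", f->name, f->max_wd - 1);
}
```

So a hardlimit control with `max_wd = 2` prints `mbHdl@1`: the largest value accepted in the MBHDL schemata line is 1.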
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 4 + arch/arm64/kernel/mpam/mpam_resctrl.c | 240 ++++++++++++++++++++------ fs/resctrlfs.c | 22 +-- 3 files changed, 200 insertions(+), 66 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index e6fd2b7c72b0..930658a775d6 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -246,6 +246,8 @@ struct resctrl_schema { bool cdp_mc_both; };
+extern struct list_head resctrl_all_schema; + /** * struct rdt_domain - group of cpus sharing an RDT resource * @list: all instances of this resource @@ -349,6 +351,8 @@ struct raw_resctrl_resource { int (*mon_write)(struct rdt_domain *d, void *md_priv);
struct resctrl_ctrl_feature ctrl_features[SCHEMA_NUM_CTRL_TYPE]; + + unsigned long fflags; };
/* 64bit arm64 specified */ diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 2b9f0f7dca93..503244cb7e97 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -136,6 +136,7 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .format_str = "%d=%0*x", .mon_read = cache_rdmon, .mon_write = common_wrmon, + .fflags = RFTYPE_RES_CACHE, .ctrl_features = { [SCHEMA_COMM] = { .type = SCHEMA_COMM, @@ -161,6 +162,7 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .format_str = "%d=%0*x", .mon_read = cache_rdmon, .mon_write = common_wrmon, + .fflags = RFTYPE_RES_CACHE, .ctrl_features = { [SCHEMA_COMM] = { .type = SCHEMA_COMM, @@ -186,6 +188,7 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .format_str = "%d=%0*d", .mon_read = mbw_rdmon, .mon_write = common_wrmon, + .fflags = RFTYPE_RES_MB, .ctrl_features = { [SCHEMA_COMM] = { .type = SCHEMA_COMM, @@ -510,6 +513,14 @@ static void mpam_resctrl_closid_collect(void) } }
+static u32 get_nr_closid(void) +{ + if (!intpartid_free_map) + return 0; + + return num_intpartid; +} + int closid_bitmap_init(void) { int pos; @@ -558,6 +569,14 @@ struct rmid_transform { }; static struct rmid_transform rmid_remap_matrix;
+static u32 get_nr_rmids(void) +{ + if (!rmid_remap_matrix.remap_enabled) + return 0; + + return rmid_remap_matrix.nr_usage; +} + /* * a rmid remap matrix is delivered for transforming partid pmg to rmid, * this matrix is organized like this: @@ -868,7 +887,6 @@ static int __rmid_alloc(int partid) goto out;
return rmid[0]; - out: rmid_free(rmid[0]); return ret; @@ -1352,48 +1370,6 @@ int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask, return 0; }
-static int resctrl_num_partid_show(struct kernfs_open_file *of, - struct seq_file *seq, void *v) -{ - struct resctrl_resource *r = of->kn->parent->priv; - struct raw_resctrl_resource *rr = r->res; - u16 num_partid; - - num_partid = rr->num_partid; - - seq_printf(seq, "%d\n", num_partid); - - return 0; -} - -static int resctrl_num_pmg_show(struct kernfs_open_file *of, - struct seq_file *seq, void *v) -{ - struct resctrl_resource *r = of->kn->parent->priv; - struct raw_resctrl_resource *rr = r->res; - u16 num_pmg; - - num_pmg = rr->num_pmg; - - seq_printf(seq, "%d\n", num_pmg); - - return 0; -} - -static int resctrl_num_mon_show(struct kernfs_open_file *of, - struct seq_file *seq, void *v) -{ - struct resctrl_resource *r = of->kn->parent->priv; - struct raw_resctrl_resource *rr = r->res; - u16 num_mon; - - num_mon = rr->num_mon; - - seq_printf(seq, "%d\n", num_mon); - - return 0; -} - int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask, cpumask_var_t tmpmask) { @@ -1578,7 +1554,7 @@ void rdt_last_cmd_printf(const char *fmt, ...) va_end(ap); }
-static int rdt_last_cmd_status_show(struct kernfs_open_file *of, +static int resctrl_last_cmd_status_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) { int len; @@ -1593,6 +1569,116 @@ static int rdt_last_cmd_status_show(struct kernfs_open_file *of, return 0; }
+static int resctrl_num_closids_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + u32 flag, times; + + hw_alloc_times_validate(times, flag); + + seq_printf(seq, "%u\n", get_nr_closid() / times); + return 0; +} + +static int resctrl_cbm_mask_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + struct resctrl_resource *r = of->kn->parent->priv; + struct raw_resctrl_resource *rr = r->res; + + seq_printf(seq, "%x\n", rr->ctrl_features[SCHEMA_COMM].default_ctrl); + return 0; +} + +static int resctrl_min_cbm_bits_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + struct resctrl_resource *r = of->kn->parent->priv; + + seq_printf(seq, "%u\n", r->cache.min_cbm_bits); + return 0; +} + +static int resctrl_shareable_bits_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + struct resctrl_resource *r = of->kn->parent->priv; + + seq_printf(seq, "%x\n", r->cache.shareable_bits); + return 0; +} + +static int resctrl_features_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + enum resctrl_ctrl_type type; + struct resctrl_resource *r = of->kn->parent->priv; + struct raw_resctrl_resource *rr = r->res; + + for_each_extend_ctrl_type(type) { + if (!rr->ctrl_features[type].enabled) + continue; + /* + * The range of a ctrl feature is defined as an integer; + * report its maximum upper bound to user space.
+ */ + switch (rr->ctrl_features[type].base) { + case 10: + seq_printf(seq, "%s@%u\n", rr->ctrl_features[type].name, + rr->ctrl_features[type].max_wd - 1); + break; + case 16: + seq_printf(seq, "%s@%x\n", rr->ctrl_features[type].name, + rr->ctrl_features[type].max_wd - 1); + break; + default: + break; + } + } + return 0; +} + +static int resctrl_min_bandwidth_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + struct resctrl_resource *r = of->kn->parent->priv; + + seq_printf(seq, "%u\n", r->mbw.min_bw); + return 0; +} + +static int resctrl_bandwidth_gran_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + struct resctrl_resource *r = of->kn->parent->priv; + + seq_printf(seq, "%u\n", r->mbw.bw_gran); + return 0; +} + +static int resctrl_num_rmids_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + u32 flag, times; + + hw_alloc_times_validate(times, flag); + seq_printf(seq, "%u\n", get_nr_rmids() / times); + return 0; +} + +static int resctrl_num_monitors_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + struct resctrl_resource *r = of->kn->parent->priv; + struct raw_resctrl_resource *rr = r->res; + u32 flag, times; + + hw_alloc_times_validate(times, flag); + seq_printf(seq, "%u\n", rr->num_mon / times); + return 0; +} + + static ssize_t resctrl_group_tasks_write(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off) { @@ -1649,32 +1735,74 @@ static int resctrl_group_tasks_show(struct kernfs_open_file *of, /* rdtgroup information files for one cache resource. 
*/ static struct rftype res_specific_files[] = { { - .name = "num_partids", + .name = "last_cmd_status", + .mode = 0444, + .kf_ops = &resctrl_group_kf_single_ops, + .seq_show = resctrl_last_cmd_status_show, + .fflags = RF_TOP_INFO, + }, + { + .name = "num_closids", .mode = 0444, .kf_ops = &resctrl_group_kf_single_ops, - .seq_show = resctrl_num_partid_show, + .seq_show = resctrl_num_closids_show, .fflags = RF_CTRL_INFO, }, { - .name = "num_pmgs", + .name = "cbm_mask", .mode = 0444, .kf_ops = &resctrl_group_kf_single_ops, - .seq_show = resctrl_num_pmg_show, - .fflags = RF_MON_INFO, + .seq_show = resctrl_cbm_mask_show, + .fflags = RF_CTRL_INFO | RFTYPE_RES_CACHE, }, { - .name = "num_monitors", + .name = "min_cbm_bits", .mode = 0444, .kf_ops = &resctrl_group_kf_single_ops, - .seq_show = resctrl_num_mon_show, + .seq_show = resctrl_min_cbm_bits_show, + .fflags = RF_CTRL_INFO | RFTYPE_RES_CACHE, + }, + { + .name = "shareable_bits", + .mode = 0444, + .kf_ops = &resctrl_group_kf_single_ops, + .seq_show = resctrl_shareable_bits_show, + .fflags = RF_CTRL_INFO | RFTYPE_RES_CACHE, + }, + { + .name = "features", + .mode = 0444, + .kf_ops = &resctrl_group_kf_single_ops, + .seq_show = resctrl_features_show, + .fflags = RF_CTRL_INFO, + }, + { + .name = "min_bandwidth", + .mode = 0444, + .kf_ops = &resctrl_group_kf_single_ops, + .seq_show = resctrl_min_bandwidth_show, + .fflags = RF_CTRL_INFO | RFTYPE_RES_MB, + }, + { + .name = "bandwidth_gran", + .mode = 0444, + .kf_ops = &resctrl_group_kf_single_ops, + .seq_show = resctrl_bandwidth_gran_show, + .fflags = RF_CTRL_INFO | RFTYPE_RES_MB, + }, + { + .name = "num_rmids", + .mode = 0444, + .kf_ops = &resctrl_group_kf_single_ops, + .seq_show = resctrl_num_rmids_show, .fflags = RF_MON_INFO, }, { - .name = "last_cmd_status", - .mode = 0444, - .kf_ops = &resctrl_group_kf_single_ops, - .seq_show = rdt_last_cmd_status_show, - .fflags = RF_TOP_INFO, + .name = "num_monitors", + .mode = 0444, + .kf_ops = &resctrl_group_kf_single_ops, + .seq_show 
= resctrl_num_monitors_show, + .fflags = RF_MON_INFO, }, { .name = "cpus", diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index 2b6731c21532..e2891529ec11 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -180,11 +180,12 @@ static int resctrl_group_mkdir_info_resdir(struct resctrl_resource *r, char *nam
static int resctrl_group_create_info_dir(struct kernfs_node *parent_kn) { + struct resctrl_schema *s; struct resctrl_resource *r; + struct raw_resctrl_resource *rr; unsigned long fflags; char name[32]; int ret; - enum resctrl_resource_level level;
/* create the directory */ kn_info = kernfs_create_dir(parent_kn, "info", parent_kn->mode, NULL); @@ -196,25 +197,27 @@ static int resctrl_group_create_info_dir(struct kernfs_node *parent_kn) if (ret) goto out_destroy;
- for (level = RDT_RESOURCE_SMMU; level < RDT_NUM_RESOURCES; level++) { - r = mpam_resctrl_get_resource(level); + list_for_each_entry(s, &resctrl_all_schema, list) { + r = s->res; if (!r) continue; + rr = r->res; if (r->alloc_enabled) { - fflags = r->fflags | RF_CTRL_INFO; - ret = resctrl_group_mkdir_info_resdir(r, r->name, fflags); + fflags = rr->fflags | RF_CTRL_INFO; + ret = resctrl_group_mkdir_info_resdir(r, s->name, fflags); if (ret) goto out_destroy; } }
- for (level = RDT_RESOURCE_SMMU; level < RDT_NUM_RESOURCES; level++) { - r = mpam_resctrl_get_resource(level); + list_for_each_entry(s, &resctrl_all_schema, list) { + r = s->res; if (!r) continue; + rr = r->res; if (r->mon_enabled) { - fflags = r->fflags | RF_MON_INFO; - snprintf(name, sizeof(name), "%s_MON", r->name); + fflags = rr->fflags | RF_MON_INFO; + snprintf(name, sizeof(name), "%s_MON", s->name); ret = resctrl_group_mkdir_info_resdir(r, name, fflags); if (ret) goto out_destroy; @@ -314,7 +317,6 @@ mongroup_create_dir(struct kernfs_node *parent_kn, struct resctrl_group *prgrp, /* create the directory */ kn = kernfs_create_dir(parent_kn, name, parent_kn->mode, prgrp); if (IS_ERR(kn)) { - pr_info("%s: create dir %s, error\n", __func__, name); return PTR_ERR(kn); }
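The resctrlfs.c hunks above replace the fixed `RDT_RESOURCE_*` loop with a walk of the schema list, so every schema (including CDP splits such as L3CODE/L3DATA) gets its own info directory, with monitor directories named `<name>_MON`. A minimal user-space mock of that walk; a plain singly linked list stands in for the kernel's `list_for_each_entry()` over `resctrl_all_schema`, and all names here are illustrative:

```c
#include <stdio.h>
#include <string.h>

/* Illustrative stand-in for one entry on resctrl_all_schema. */
struct schema {
	const char *name;	/* name visible under info/, e.g. "L3CODE" */
	int alloc_enabled;	/* does this schema get a ctrl info dir */
	struct schema *next;
};

/* Count how many ctrl info directories the walk would create. */
static int count_ctrl_info_dirs(const struct schema *head)
{
	int n = 0;
	const struct schema *s;

	for (s = head; s; s = s->next)
		if (s->alloc_enabled)
			n++;
	return n;
}

/* Monitor info dirs get a "_MON" suffix, as in the patch. */
static int mon_dir_name(char *buf, size_t len, const char *name)
{
	return snprintf(buf, len, "%s_MON", name);
}
```

With CDP enabled on L3 this walk would create info/L3CODE, info/L3DATA and info/MB, plus the corresponding `_MON` directories for monitor-enabled resources.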
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
So far, the declarations shared by resctrlfs.c and the mpam core module files under the kernel/mpam directory are scattered across mpam.h and resctrl.h, organized like this:
-- asm/ +-- resctrl.h + +-- mpam.h | + +-- mpam_resource.h | | + | | | -- fs/ | | +-> mpam/ +-- resctrlfs.c <----+----+------> +-- mpam_resctrl.c ...
We move the declarations shared by resctrlfs.c and mpam/ into resctrl.h and split the remaining declarations into mpam_internal.h, also moving mpam_resource.h into the mpam/ directory. It is now organized like this:
-- asm/ +-- mpam.h +----> export to other modules(e.g. SMMU master io) +-- resctrl.h + | -- mpam/ | +-- mpam_internal.h | + +-- mpam_resource.h | | + | | | -- fs/ | +----+-> mpam/ +-- resctrlfs.c <----+-----------> +-- mpam_resctrl.c ...
In this way we can build a clearer framework for MPAM usage.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/mpam.h | 377 +----------------- arch/arm64/include/asm/resctrl.h | 322 ++++++++++++++- arch/arm64/kernel/mpam/mpam_ctrlmon.c | 96 ++++- arch/arm64/kernel/mpam/mpam_device.c | 2 +- arch/arm64/kernel/mpam/mpam_internal.h | 142 ++++++- arch/arm64/kernel/mpam/mpam_resctrl.c | 3 +- .../asm => kernel/mpam}/mpam_resource.h | 4 +- fs/resctrlfs.c | 94 +---- include/linux/resctrlfs.h | 71 ---- 9 files changed, 551 insertions(+), 560 deletions(-) rename arch/arm64/{include/asm => kernel/mpam}/mpam_resource.h (98%)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h index 930658a775d6..6338eab817e7 100644 --- a/arch/arm64/include/asm/mpam.h +++ b/arch/arm64/include/asm/mpam.h @@ -2,381 +2,8 @@ #ifndef _ASM_ARM64_MPAM_H #define _ASM_ARM64_MPAM_H
-#include <linux/sched.h> -#include <linux/kernfs.h> -#include <linux/jump_label.h> - -#include <linux/seq_buf.h> -#include <linux/seq_file.h> -#include <linux/resctrlfs.h> - -/* MPAM register */ -#define SYS_MPAM0_EL1 sys_reg(3, 0, 10, 5, 1) -#define SYS_MPAM1_EL1 sys_reg(3, 0, 10, 5, 0) -#define SYS_MPAM2_EL2 sys_reg(3, 4, 10, 5, 0) -#define SYS_MPAM3_EL3 sys_reg(3, 6, 10, 5, 0) -#define SYS_MPAM1_EL12 sys_reg(3, 5, 10, 5, 0) -#define SYS_MPAMHCR_EL2 sys_reg(3, 4, 10, 4, 0) -#define SYS_MPAMVPMV_EL2 sys_reg(3, 4, 10, 4, 1) -#define SYS_MPAMVPMn_EL2(n) sys_reg(3, 4, 10, 6, n) -#define SYS_MPAMIDR_EL1 sys_reg(3, 0, 10, 4, 4) - -#define MPAM_MASK(n) ((1UL << n) - 1) -/* plan to use GENMASK(n, 0) instead */ - -/* - * MPAMx_ELn: - * 15:0 PARTID_I - * 31:16 PARTID_D - * 39:32 PMG_I - * 47:40 PMG_D - * 48 TRAPMPAM1EL1 - * 49 TRAPMPAM0EL1 - * 61:49 Reserved - * 62 TRAPLOWER - * 63 MPAMEN - */ -#define PARTID_BITS (16) -#define PMG_BITS (8) -#define PARTID_MASK MPAM_MASK(PARTID_BITS) -#define PMG_MASK MPAM_MASK(PMG_BITS) - -#define PARTID_I_SHIFT (0) -#define PARTID_D_SHIFT (PARTID_I_SHIFT + PARTID_BITS) -#define PMG_I_SHIFT (PARTID_D_SHIFT + PARTID_BITS) -#define PMG_D_SHIFT (PMG_I_SHIFT + PMG_BITS) - -#define PARTID_I_MASK (PARTID_MASK << PARTID_I_SHIFT) -#define PARTID_D_MASK (PARTID_MASK << PARTID_D_SHIFT) -#define PARTID_I_CLR(r) ((r) & ~PARTID_I_MASK) -#define PARTID_D_CLR(r) ((r) & ~PARTID_D_MASK) -#define PARTID_CLR(r) (PARTID_I_CLR(r) & PARTID_D_CLR(r)) - -#define PARTID_I_SET(r, id) (PARTID_I_CLR(r) | ((id) << PARTID_I_SHIFT)) -#define PARTID_D_SET(r, id) (PARTID_D_CLR(r) | ((id) << PARTID_D_SHIFT)) -#define PARTID_SET(r, id) (PARTID_CLR(r) | ((id) << PARTID_I_SHIFT) | ((id) << PARTID_D_SHIFT)) - -#define PMG_I_MASK (PMG_MASK << PMG_I_SHIFT) -#define PMG_D_MASK (PMG_MASK << PMG_D_SHIFT) -#define PMG_I_CLR(r) ((r) & ~PMG_I_MASK) -#define PMG_D_CLR(r) ((r) & ~PMG_D_MASK) -#define PMG_CLR(r) (PMG_I_CLR(r) & PMG_D_CLR(r)) - -#define PMG_I_SET(r, id) (PMG_I_CLR(r) | 
((id) << PMG_I_SHIFT)) -#define PMG_D_SET(r, id) (PMG_D_CLR(r) | ((id) << PMG_D_SHIFT)) -#define PMG_SET(r, id) (PMG_CLR(r) | ((id) << PMG_I_SHIFT) | ((id) << PMG_D_SHIFT)) - -#define TRAPMPAM1EL1_SHIFT (PMG_D_SHIFT + PMG_BITS) -#define TRAPMPAM0EL1_SHIFT (TRAPMPAM1EL1_SHIFT + 1) -#define TRAPLOWER_SHIFT (TRAPMPAM0EL1_SHIFT + 13) -#define MPAMEN_SHIFT (TRAPLOWER_SHIFT + 1) - -/* - * MPAMHCR_EL2: - * 0 EL0_VPMEN - * 1 EL1_VPMEN - * 7:2 Reserved - * 8 GSTAPP_PLK - * 30:9 Reserved - * 31 TRAP_MPAMIDR_EL1 - * 63:32 Reserved - */ -#define EL0_VPMEN_SHIFT (0) -#define EL1_VPMEN_SHIFT (EL0_VPMEN_SHIFT + 1) -#define GSTAPP_PLK_SHIFT (8) -#define TRAP_MPAMIDR_EL1_SHIFT (31) - -/* - * MPAMIDR_EL1: - * 15:0 PARTID_MAX - * 16 Reserved - * 17 HAS_HCR - * 20:18 VPMR_MAX - * 31:21 Reserved - * 39:32 PMG_MAX - * 63:40 Reserved - */ -#define VPMR_MAX_BITS (3) -#define PARTID_MAX_SHIFT (0) -#define PARTID_MAX_MASK (MPAM_MASK(PARTID_BITS) << PARTID_MAX_SHIFT) -#define HAS_HCR_SHIFT (PARTID_MAX_SHIFT + PARTID_BITS + 1) -#define VPMR_MAX_SHIFT (HAS_HCR_SHIFT + 1) -#define PMG_MAX_SHIFT (VPMR_MAX_SHIFT + VPMR_MAX_BITS + 11) -#define PMG_MAX_MASK (MPAM_MASK(PMG_BITS) << PMG_MAX_SHIFT) -#define VPMR_MASK MPAM_MASK(VPMR_MAX_BITS) - -/* - * MPAMVPMV_EL2: - * 31:0 VPM_V - * 63:32 Reserved - */ -#define VPM_V_BITS 32 - -DECLARE_STATIC_KEY_FALSE(resctrl_enable_key); -DECLARE_STATIC_KEY_FALSE(resctrl_mon_enable_key); - -extern int max_name_width, max_data_width; - -enum resctrl_ctrl_type { - SCHEMA_COMM = 0, - SCHEMA_PRI, - SCHEMA_HDL, - SCHEMA_NUM_CTRL_TYPE -}; - -struct resctrl_ctrl_feature { - enum resctrl_ctrl_type type; - int flags; - const char *name; - - u32 max_wd; - - int base; - int evt; - - int default_ctrl; - - bool capable; - bool enabled; -}; - -#define for_each_ctrl_type(t) \ - for (t = SCHEMA_COMM; t != SCHEMA_NUM_CTRL_TYPE; t++) - -#define for_each_extend_ctrl_type(t) \ - for (t = SCHEMA_PRI; t != SCHEMA_NUM_CTRL_TYPE; t++) - -enum resctrl_conf_type { - CDP_BOTH = 0, - 
CDP_CODE, - CDP_DATA, - CDP_NUM_CONF_TYPE, -}; - -static inline int conf_name_to_conf_type(char *name) -{ - enum resctrl_conf_type t; - - if (!strcmp(name, "L3CODE") || !strcmp(name, "L2CODE")) - t = CDP_CODE; - else if (!strcmp(name, "L3DATA") || !strcmp(name, "L2DATA")) - t = CDP_DATA; - else - t = CDP_BOTH; - return t; -} - -#define for_each_conf_type(t) \ - for (t = CDP_BOTH; t < CDP_NUM_CONF_TYPE; t++) - -typedef struct { u16 val; } hw_mpamid_t; - -#define hw_closid_t hw_mpamid_t -#define hw_monid_t hw_mpamid_t -#define hw_closid_val(__x) (__x.val) -#define hw_monid_val(__x) (__x.val) - -#define as_hw_t(__name, __x) \ - ((hw_##__name##id_t){(__x)}) -#define hw_val(__name, __x) \ - hw_##__name##id_val(__x) - -/** - * When cdp enabled, give (closid + 1) to Cache LxDATA. - */ -#define resctrl_cdp_map(__name, __closid, __type, __result) \ -do { \ - if (__type == CDP_CODE) \ - __result = as_hw_t(__name, __closid); \ - else if (__type == CDP_DATA) \ - __result = as_hw_t(__name, __closid + 1); \ - else \ - __result = as_hw_t(__name, __closid); \ -} while (0) - -bool is_resctrl_cdp_enabled(void); - -#define hw_alloc_times_validate(__times, __flag) \ -do { \ - __flag = is_resctrl_cdp_enabled(); \ - __times = flag ? 
2 : 1; \ -} while (0) - - -/** - * struct resctrl_staged_config - parsed configuration to be applied - * @hw_closid: raw closid for this configuration, regardless of CDP - * @new_ctrl: new ctrl value to be loaded - * @have_new_ctrl: did user provide new_ctrl for this domain - * @new_ctrl_type: CDP property of the new ctrl - * @cdp_both_ctrl: did cdp both control if cdp enabled - */ -struct resctrl_staged_config { - hw_closid_t hw_closid; - u32 new_ctrl[SCHEMA_NUM_CTRL_TYPE]; - bool have_new_ctrl; - enum resctrl_conf_type conf_type; - enum resctrl_ctrl_type ctrl_type; - bool cdp_both_ctrl; -}; - -/* later move to resctrl common directory */ -#define RESCTRL_NAME_LEN 15 - -struct resctrl_schema_ctrl { - struct list_head list; - char name[RESCTRL_NAME_LEN]; - enum resctrl_ctrl_type ctrl_type; -}; - -/** - * @list: Member of resctrl's schema list - * @name: Name visible in the schemata file - * @conf_type: Type of configuration, e.g. code/data/both - * @res: The rdt_resource for this entry - * @schemata_ctrl_list: Type of ctrl configuration. e.g. 
priority/hardlimit - * @cdp_mc_both: did cdp both mon/ctrl if cdp enabled - */ -struct resctrl_schema { - struct list_head list; - char name[RESCTRL_NAME_LEN]; - enum resctrl_conf_type conf_type; - struct resctrl_resource *res; - struct list_head schema_ctrl_list; - bool cdp_mc_both; -}; - -extern struct list_head resctrl_all_schema; - -/** - * struct rdt_domain - group of cpus sharing an RDT resource - * @list: all instances of this resource - * @id: unique id for this instance - * @cpu_mask: which cpus share this resource - * @rmid_busy_llc: - * bitmap of which limbo RMIDs are above threshold - * @mbm_total: saved state for MBM total bandwidth - * @mbm_local: saved state for MBM local bandwidth - * @mbm_over: worker to periodically read MBM h/w counters - * @cqm_limbo: worker to periodically read CQM h/w counters - * @mbm_work_cpu: - * worker cpu for MBM h/w counters - * @cqm_work_cpu: - * worker cpu for CQM h/w counters - * @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID) - * @new_ctrl: new ctrl value to be loaded - * @have_new_ctrl: did user provide new_ctrl for this domain - */ -struct rdt_domain { - struct list_head list; - int id; - struct cpumask cpu_mask; - void __iomem *base; - - /* arch specific fields */ - u32 *ctrl_val[SCHEMA_NUM_CTRL_TYPE]; - bool have_new_ctrl; - - /* for debug */ - char *cpus_list; - - struct resctrl_staged_config staged_cfg[CDP_NUM_CONF_TYPE]; -}; - -#define RESCTRL_SHOW_DOM_MAX_NUM 8 - -int __init resctrl_group_init(void); - -int resctrl_group_mondata_show(struct seq_file *m, void *arg); -void rmdir_mondata_subdir_allrdtgrp(struct resctrl_resource *r, - unsigned int dom_id); - -int cdp_enable(int level, int data_type, int code_type); - -void post_resctrl_mount(void); - -#define mpam_read_sysreg_s(reg, name) read_sysreg_s(reg) -#define mpam_write_sysreg_s(v, r, n) write_sysreg_s(v, r) -#define mpam_readl(addr) readl(addr) -#define mpam_writel(v, addr) writel(v, addr) - -struct sd_closid; - -struct msr_param { - enum 
resctrl_ctrl_type type; - struct sd_closid *closid; -}; - -/** - * struct resctrl_resource - attributes of an RDT resource - * @rid: The index of the resource - * @alloc_enabled: Is allocation enabled on this machine - * @mon_enabled: Is monitoring enabled for this feature - * @alloc_capable: Is allocation available on this machine - * @mon_capable: Is monitor feature available on this machine - * @name: Name to use in "schemata" file - * @num_closid: Number of CLOSIDs available - * @cache_level: Which cache level defines scope of this resource - * @msr_base: Base MSR address for CBMs - * @msr_update: Function pointer to update QOS MSRs - * @data_width: Character width of data when displaying - * @domains: All domains for this resource - * @cache: Cache allocation related data - * @format_str: Per resource format string to show domain value - * @parse_ctrlval: Per resource function pointer to parse control values - * @evt_list: List of monitoring events - * @num_rmid: Number of RMIDs available - * @mon_scale: cqm counter * mon_scale = occupancy in bytes - * @fflags: flags to choose base and info files - */ - -struct raw_resctrl_resource { - u16 num_partid; - u16 num_intpartid; - u16 num_pmg; - - void (*msr_update)(struct resctrl_resource *r, struct rdt_domain *d, - struct msr_param *para); - u64 (*msr_read)(struct resctrl_resource *r, struct rdt_domain *d, - struct msr_param *para); - - int data_width; - const char *format_str; - int (*parse_ctrlval)(char *buf, struct resctrl_resource *r, - struct resctrl_staged_config *cfg, - enum resctrl_ctrl_type ctrl_type); - - u16 num_mon; - u64 (*mon_read)(struct rdt_domain *d, void *md_priv); - int (*mon_write)(struct rdt_domain *d, void *md_priv); - - struct resctrl_ctrl_feature ctrl_features[SCHEMA_NUM_CTRL_TYPE]; - - unsigned long fflags; -}; - -/* 64bit arm64 specified */ -union mon_data_bits { - void *priv; - struct { - u8 rid; - u8 domid; - u8 partid; - u8 pmg; - u8 mon; - u8 cdp_both_mon; - } u; -}; - -ssize_t 
resctrl_group_schemata_write(struct kernfs_open_file *of, - char *buf, size_t nbytes, loff_t off); - -int resctrl_group_schemata_show(struct kernfs_open_file *of, - struct seq_file *s, void *v); - -struct rdt_domain *mpam_find_domain(struct resctrl_resource *r, int id, - struct list_head **pos); - +#ifdef CONFIG_MPAM extern int mpam_rmid_to_partid_pmg(int rmid, int *partid, int *pmg); +#endif
#endif /* _ASM_ARM64_MPAM_H */ diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 44a5bcfa5b92..af2388a43990 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -1,9 +1,12 @@ #ifndef _ASM_ARM64_RESCTRL_H #define _ASM_ARM64_RESCTRL_H
+#include <linux/resctrlfs.h> #include <asm/mpam_sched.h> #include <asm/mpam.h>
+#if defined(CONFIG_RESCTRL) && defined(CONFIG_MPAM) + #define resctrl_group rdtgroup #define resctrl_alloc_capable rdt_alloc_capable #define resctrl_mon_capable rdt_mon_capable @@ -40,13 +43,81 @@ enum rdt_group_type { RDT_NUM_GROUP, };
+/** + * struct resctrl_cache - Cache allocation related data + * @cbm_len: Length of the cache bit mask + * @min_cbm_bits: Minimum number of consecutive bits to be set + * @shareable_bits: Bitmask of shareable resource with other + * executing entities + */ +struct resctrl_cache { + u32 cbm_len; + u32 shareable_bits; + u32 min_cbm_bits; +}; + +/** + * struct resctrl_membw - Memory bandwidth allocation related data + * @min_bw: Minimum memory bandwidth percentage user can request + * @bw_gran: Granularity at which the memory bandwidth is allocated + * @delay_linear: True if memory B/W delay is in linear scale + * @ctrl_extend_bits: Indicates if there are extra ctrl capabilities supported. + * e.g. priority/hardlimit. + */ +struct resctrl_membw { + u32 min_bw; + u32 bw_gran; + u32 delay_linear; +}; + +/** + * struct resctrl_resource - attributes of an RDT resource + * @rid: The index of the resource + * @alloc_enabled: Is allocation enabled on this machine + * @mon_enabled: Is monitoring enabled for this feature + * @alloc_capable: Is allocation available on this machine + * @mon_capable: Is monitor feature available on this machine + * @name: Name to use in "schemata" file + * @domains: All domains for this resource + * @cache: Cache allocation related data + * @mbw: Memory Bandwidth allocation related data + * @evt_list: List of monitoring events + * @fflags: flags to choose base and info files + */ +struct resctrl_resource { + int rid; + bool alloc_enabled; + bool mon_enabled; + bool alloc_capable; + bool mon_capable; + char *name; + struct list_head domains; + u32 dom_num; + struct list_head evt_list; + unsigned long fflags; + + struct resctrl_cache cache; + struct resctrl_membw mbw; + + bool cdp_capable; + bool cdp_enable; + u32 *default_ctrl; + + u32 ctrl_extend_bits; + + void *res; +}; + +/* List of all resource groups */ +extern struct list_head resctrl_all_groups; + /** * struct mongroup - store mon group's data in resctrl fs. 
* @mon_data_kn kernlfs node for the mon_data directory * @parent: parent rdtgrp * @crdtgrp_list: child rdtgroup node list * @rmid: rmid for this rdtgroup - * @mon: monnitor id + * @init: init flag */ struct mongroup { struct kernfs_node *mon_data_kn; @@ -59,7 +130,7 @@ struct mongroup { /** * struct sd_closid - software defined closid * @intpartid: closid for this rdtgroup only for allocation - * @weak_closid: closid for synchronizing configuration and monitoring + * @reqpartid: closid for synchronizing configuration and monitoring */ struct sd_closid { u32 intpartid; @@ -70,6 +141,7 @@ struct sd_closid { * struct rdtgroup - store rdtgroup's data in resctrl file system. * @kn: kernfs node * @resctrl_group_list: linked list for all rdtgroups + * @closid: software defined closid * @cpu_mask: CPUs assigned to this rdtgroup * @flags: status bits * @waitcount: how many cpus expect to find this @@ -89,11 +161,214 @@ struct rdtgroup { struct mongroup mon; };
+enum resctrl_ctrl_type {
+	SCHEMA_COMM = 0,
+	SCHEMA_PRI,
+	SCHEMA_HDL,
+	SCHEMA_NUM_CTRL_TYPE
+};
+
+#define for_each_ctrl_type(t)	\
+	for (t = SCHEMA_COMM; t != SCHEMA_NUM_CTRL_TYPE; t++)
+
+#define for_each_extend_ctrl_type(t)	\
+	for (t = SCHEMA_PRI; t != SCHEMA_NUM_CTRL_TYPE; t++)
+
+/**
+ * struct resctrl_ctrl_feature - ctrl feature member living in the schema list
+ * @flags:          Which ctrl types this feature serves
+ * @name:           Name of this ctrl feature
+ * @max_wd:         Maximum value width that can be input from user space
+ * @base:           Numeric base of the integer input from user space
+ * @evt:            rdt_event_id event owned for applying configuration
+ * @capable:        Whether this feature is supported
+ * @enabled:        Whether this feature is enabled
+ * @default_ctrl:   Default ctrl value of this feature
+ */
+struct resctrl_ctrl_feature {
+	enum resctrl_ctrl_type type;
+	int flags;
+	const char *name;
+	u32 max_wd;
+	int base;
+	enum rdt_event_id evt;
+	int default_ctrl;
+	bool capable;
+	bool enabled;
+};
+
+struct msr_param {
+	enum resctrl_ctrl_type type;
+	struct sd_closid *closid;
+};
+
+enum resctrl_conf_type {
+	CDP_BOTH = 0,
+	CDP_CODE,
+	CDP_DATA,
+	CDP_NUM_CONF_TYPE,
+};
+
+static inline int conf_name_to_conf_type(char *name)
+{
+	enum resctrl_conf_type t;
+
+	if (!strcmp(name, "L3CODE") || !strcmp(name, "L2CODE"))
+		t = CDP_CODE;
+	else if (!strcmp(name, "L3DATA") || !strcmp(name, "L2DATA"))
+		t = CDP_DATA;
+	else
+		t = CDP_BOTH;
+	return t;
+}
+
+#define for_each_conf_type(t) \
+	for (t = CDP_BOTH; t < CDP_NUM_CONF_TYPE; t++)
+
+typedef struct { u16 val; } hw_def_t;
+
+#define hw_closid_t hw_def_t
+#define hw_monid_t hw_def_t
+#define hw_closid_val(__x) (__x.val)
+#define hw_monid_val(__x) (__x.val)
+
+#define as_hw_t(__name, __x) \
+		((hw_##__name##id_t){(__x)})
+#define hw_val(__name, __x) \
+		hw_##__name##id_val(__x)
+
+/**
+ * When cdp is enabled, give (closid + 1) to Cache LxDATA.
+ */ +#define resctrl_cdp_map(__name, __closid, __type, __result) \ +do { \ + if (__type == CDP_CODE) \ + __result = as_hw_t(__name, __closid); \ + else if (__type == CDP_DATA) \ + __result = as_hw_t(__name, __closid + 1); \ + else \ + __result = as_hw_t(__name, __closid); \ +} while (0) + +bool is_resctrl_cdp_enabled(void); + +#define hw_alloc_validate(__flag) \ +do { \ + if (is_resctrl_cdp_enabled()) \ + __flag = true; \ + else \ + __flag = false; \ +} while (0) + +#define hw_alloc_times_validate(__times, __flag) \ +do { \ + hw_alloc_validate(__flag); \ + if (__flag) \ + __times = 2; \ + else \ + __times = 1; \ +} while (0) + +/** + * struct resctrl_staged_config - parsed configuration to be applied + * @hw_closid: raw closid for this configuration, regardless of CDP + * @new_ctrl: new ctrl value to be loaded + * @have_new_ctrl: did user provide new_ctrl for this domain + * @new_ctrl_type: CDP property of the new ctrl + * @cdp_both_ctrl: did cdp both control if cdp enabled + */ +struct resctrl_staged_config { + hw_closid_t hw_closid; + u32 new_ctrl[SCHEMA_NUM_CTRL_TYPE]; + bool have_new_ctrl; + enum resctrl_conf_type conf_type; + enum resctrl_ctrl_type ctrl_type; + bool cdp_both_ctrl; +}; + +/* later move to resctrl common directory */ +#define RESCTRL_NAME_LEN 15 + +struct resctrl_schema_ctrl { + struct list_head list; + char name[RESCTRL_NAME_LEN]; + enum resctrl_ctrl_type ctrl_type; +}; + +/** + * @list: Member of resctrl's schema list + * @name: Name visible in the schemata file + * @conf_type: Type of configuration, e.g. code/data/both + * @res: The rdt_resource for this entry + * @schemata_ctrl_list: Type of ctrl configuration. e.g. priority/hardlimit + * @cdp_mc_both: did cdp both mon/ctrl if cdp enabled + */ +struct resctrl_schema { + struct list_head list; + char name[RESCTRL_NAME_LEN]; + enum resctrl_conf_type conf_type; + struct resctrl_resource *res; + struct list_head schema_ctrl_list; + bool cdp_mc_both; +}; + int schemata_list_init(void);
void schemata_list_destroy(void);
-int resctrl_lru_request_mon(void); +/** + * struct rdt_domain - group of cpus sharing an RDT resource + * @list: all instances of this resource + * @id: unique id for this instance + * @cpu_mask: which cpus share this resource + * @base MMIO base address + * @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID) + * @have_new_ctrl: did user provide new_ctrl for this domain + */ +struct rdt_domain { + struct list_head list; + int id; + struct cpumask cpu_mask; + void __iomem *base; + + /* arch specific fields */ + u32 *ctrl_val[SCHEMA_NUM_CTRL_TYPE]; + bool have_new_ctrl; + + /* for debug */ + char *cpus_list; + + struct resctrl_staged_config staged_cfg[CDP_NUM_CONF_TYPE]; +}; + +/* + * Internal struct of resctrl_resource structure, + * for static initialization. + */ +struct raw_resctrl_resource { + u16 num_partid; + u16 num_intpartid; + u16 num_pmg; + + u16 extend_ctrls_wd[SCHEMA_NUM_CTRL_TYPE]; + + void (*msr_update)(struct resctrl_resource *r, struct rdt_domain *d, + struct msr_param *para); + u64 (*msr_read)(struct resctrl_resource *r, struct rdt_domain *d, + struct msr_param *para); + + int data_width; + const char *format_str; + int (*parse_ctrlval)(char *buf, struct resctrl_resource *r, + struct resctrl_staged_config *cfg, enum resctrl_ctrl_type ctrl_type); + + u16 num_mon; + u64 (*mon_read)(struct rdt_domain *d, void *md_priv); + int (*mon_write)(struct rdt_domain *d, void *md_priv); + unsigned long fflags; + + struct resctrl_ctrl_feature ctrl_features[SCHEMA_NUM_CTRL_TYPE]; +};
int rmid_alloc(int entry_idx); void rmid_free(int rmid); @@ -103,7 +378,8 @@ int closid_alloc(void); void closid_free(int closid);
void update_cpu_closid_rmid(void *info); -void update_closid_rmid(const struct cpumask *cpu_mask, struct resctrl_group *r); +void update_closid_rmid(const struct cpumask *cpu_mask, + struct resctrl_group *r); int __resctrl_group_move_task(struct task_struct *tsk, struct resctrl_group *rdtgrp);
@@ -117,18 +393,11 @@ void rdt_last_cmd_clear(void); void rdt_last_cmd_puts(const char *s); void rdt_last_cmd_printf(const char *fmt, ...);
-extern struct mutex resctrl_group_mutex; - -void release_rdtgroupfs_options(void); -int parse_rdtgroupfs_options(char *data); - void resctrl_resource_reset(void);
#define release_resctrl_group_fs_options release_rdtgroupfs_options #define parse_resctrl_group_fs_options parse_rdtgroupfs_options
-int mpam_get_mon_config(struct resctrl_resource *r); - int resctrl_group_init_alloc(struct rdtgroup *rdtgrp);
static inline int __resctrl_group_show_options(struct seq_file *seq) @@ -136,15 +405,35 @@ static inline int __resctrl_group_show_options(struct seq_file *seq) return 0; }
+int resctrl_update_groups_config(struct rdtgroup *rdtgrp); + +#define RESCTRL_MAX_CLOSID 32 + +int __init resctrl_group_init(void); + +void post_resctrl_mount(void); + +extern struct mutex resctrl_group_mutex; +DECLARE_STATIC_KEY_FALSE(resctrl_alloc_enable_key); +extern struct rdtgroup resctrl_group_default; int resctrl_mkdir_mondata_all_subdir(struct kernfs_node *parent_kn, - struct resctrl_group *prgrp); + struct resctrl_group *prgrp);
-struct resctrl_resource * -mpam_resctrl_get_resource(enum resctrl_resource_level level); +int resctrl_group_create_info_dir(struct kernfs_node *parent_kn, + struct kernfs_node **kn_info);
-int resctrl_update_groups_config(struct rdtgroup *rdtgrp); +int register_resctrl_specific_files(struct rftype *files, size_t len); +extern struct kernfs_ops resctrl_group_kf_single_ops;
-#define RESCTRL_MAX_CLOSID 32 +extern struct rdtgroup *resctrl_group_kn_lock_live(struct kernfs_node *kn); +void resctrl_group_kn_unlock(struct kernfs_node *kn); + +void release_rdtgroupfs_options(void); +int parse_rdtgroupfs_options(char *data); + +int resctrl_group_add_files(struct kernfs_node *kn, unsigned long fflags); + +#define RESCTRL_MAX_CBM 32
/* * This is only for avoiding unnecessary cost in mpam_sched_in() @@ -178,4 +467,5 @@ static inline u32 resctrl_navie_closid(struct sd_closid closid) return closid.intpartid; }
+#endif #endif /* _ASM_ARM64_RESCTRL_H */ diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 73e5d9c8c033..368cedcded1f 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -33,8 +33,9 @@ #include <linux/kernfs.h> #include <linux/seq_file.h> #include <linux/slab.h> +#include <asm/mpam.h>
-#include <asm/mpam_resource.h> +#include "mpam_resource.h" #include "mpam_internal.h"
/* schemata content list */ @@ -755,6 +756,99 @@ int resctrl_mkdir_mondata_all_subdir(struct kernfs_node *parent_kn, return ret; }
+static int resctrl_group_mkdir_info_resdir(struct resctrl_resource *r,
+		char *name, unsigned long fflags, struct kernfs_node *kn_info)
+{
+	struct kernfs_node *kn_subdir;
+	int ret;
+
+	kn_subdir = kernfs_create_dir(kn_info, name,
+				      kn_info->mode, r);
+	if (IS_ERR(kn_subdir))
+		return PTR_ERR(kn_subdir);
+
+	kernfs_get(kn_subdir);
+	ret = resctrl_group_kn_set_ugid(kn_subdir);
+	if (ret)
+		return ret;
+
+	ret = resctrl_group_add_files(kn_subdir, fflags);
+	if (!ret)
+		kernfs_activate(kn_subdir);
+
+	return ret;
+}
+
+int resctrl_group_create_info_dir(struct kernfs_node *parent_kn,
+		struct kernfs_node **kn_info)
+{
+	struct resctrl_schema *s;
+	struct resctrl_resource *r;
+	struct raw_resctrl_resource *rr;
+	unsigned long fflags;
+	char name[32];
+	int ret;
+
+	/* create the directory */
+	*kn_info = kernfs_create_dir(parent_kn, "info", parent_kn->mode, NULL);
+	if (IS_ERR(*kn_info))
+		return PTR_ERR(*kn_info);
+	kernfs_get(*kn_info);
+
+	ret = resctrl_group_add_files(*kn_info, RF_TOP_INFO);
+	if (ret)
+		goto out_destroy;
+
+	list_for_each_entry(s, &resctrl_all_schema, list) {
+		r = s->res;
+		if (!r)
+			continue;
+		rr = r->res;
+		if (r->alloc_enabled) {
+			fflags = rr->fflags | RF_CTRL_INFO;
+			ret = resctrl_group_mkdir_info_resdir(r, s->name,
+					fflags, *kn_info);
+			if (ret)
+				goto out_destroy;
+		}
+	}
+
+	list_for_each_entry(s, &resctrl_all_schema, list) {
+		r = s->res;
+		if (!r)
+			continue;
+		rr = r->res;
+		if (r->mon_enabled) {
+			fflags = rr->fflags | RF_MON_INFO;
+			snprintf(name, sizeof(name), "%s_MON", s->name);
+			ret = resctrl_group_mkdir_info_resdir(r, name,
+					fflags, *kn_info);
+			if (ret)
+				goto out_destroy;
+		}
+	}
+
+	/*
+	 * This extra ref will be put in kernfs_remove() and guarantees
+	 * that @rdtgrp->kn is always accessible.
+ */ + kernfs_get(*kn_info); + + ret = resctrl_group_kn_set_ugid(*kn_info); + if (ret) + goto out_destroy; + + kernfs_activate(*kn_info); + + return 0; + +out_destroy: + kernfs_remove(*kn_info); + return ret; +} + + + /* Initialize MBA resource with default values. */ static void rdtgroup_init_mba(struct resctrl_schema *s, u32 closid) { diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 3e4509e289dd..3fd5fc66b204 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -32,8 +32,8 @@ #include <linux/cpu.h> #include <linux/cacheinfo.h> #include <linux/arm_mpam.h> -#include <asm/mpam_resource.h>
+#include "mpam_resource.h" #include "mpam_device.h" #include "mpam_internal.h"
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index d74989e03993..40dcc02f4e57 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -2,8 +2,6 @@ #ifndef _ASM_ARM64_MPAM_INTERNAL_H #define _ASM_ARM64_MPAM_INTERNAL_H
-#include <linux/resctrlfs.h> -#include <asm/mpam.h> #include <asm/resctrl.h>
typedef u32 mpam_features_t; @@ -12,6 +10,142 @@ struct mpam_component; struct rdt_domain; struct mpam_class; struct raw_resctrl_resource; +struct resctrl_resource; +/* MPAM register */ +#define SYS_MPAM0_EL1 sys_reg(3, 0, 10, 5, 1) +#define SYS_MPAM1_EL1 sys_reg(3, 0, 10, 5, 0) +#define SYS_MPAM2_EL2 sys_reg(3, 4, 10, 5, 0) +#define SYS_MPAM3_EL3 sys_reg(3, 6, 10, 5, 0) +#define SYS_MPAM1_EL12 sys_reg(3, 5, 10, 5, 0) +#define SYS_MPAMHCR_EL2 sys_reg(3, 4, 10, 4, 0) +#define SYS_MPAMVPMV_EL2 sys_reg(3, 4, 10, 4, 1) +#define SYS_MPAMVPMn_EL2(n) sys_reg(3, 4, 10, 6, n) +#define SYS_MPAMIDR_EL1 sys_reg(3, 0, 10, 4, 4) + +#define MPAM_MASK(n) ((1UL << n) - 1) +/* plan to use GENMASK(n, 0) instead */ + +/* + * MPAMx_ELn: + * 15:0 PARTID_I + * 31:16 PARTID_D + * 39:32 PMG_I + * 47:40 PMG_D + * 48 TRAPMPAM1EL1 + * 49 TRAPMPAM0EL1 + * 61:49 Reserved + * 62 TRAPLOWER + * 63 MPAMEN + */ +#define PARTID_BITS (16) +#define PMG_BITS (8) +#define PARTID_MASK MPAM_MASK(PARTID_BITS) +#define PMG_MASK MPAM_MASK(PMG_BITS) + +#define PARTID_I_SHIFT (0) +#define PARTID_D_SHIFT (PARTID_I_SHIFT + PARTID_BITS) +#define PMG_I_SHIFT (PARTID_D_SHIFT + PARTID_BITS) +#define PMG_D_SHIFT (PMG_I_SHIFT + PMG_BITS) + +#define PARTID_I_MASK (PARTID_MASK << PARTID_I_SHIFT) +#define PARTID_D_MASK (PARTID_MASK << PARTID_D_SHIFT) +#define PARTID_I_CLR(r) ((r) & ~PARTID_I_MASK) +#define PARTID_D_CLR(r) ((r) & ~PARTID_D_MASK) +#define PARTID_CLR(r) (PARTID_I_CLR(r) & PARTID_D_CLR(r)) + +#define PARTID_I_SET(r, id) (PARTID_I_CLR(r) | ((id) << PARTID_I_SHIFT)) +#define PARTID_D_SET(r, id) (PARTID_D_CLR(r) | ((id) << PARTID_D_SHIFT)) +#define PARTID_SET(r, id) (PARTID_CLR(r) | ((id) << PARTID_I_SHIFT) | ((id) << PARTID_D_SHIFT)) + +#define PMG_I_MASK (PMG_MASK << PMG_I_SHIFT) +#define PMG_D_MASK (PMG_MASK << PMG_D_SHIFT) +#define PMG_I_CLR(r) ((r) & ~PMG_I_MASK) +#define PMG_D_CLR(r) ((r) & ~PMG_D_MASK) +#define PMG_CLR(r) (PMG_I_CLR(r) & PMG_D_CLR(r)) + +#define PMG_I_SET(r, id) (PMG_I_CLR(r) | ((id) << 
PMG_I_SHIFT)) +#define PMG_D_SET(r, id) (PMG_D_CLR(r) | ((id) << PMG_D_SHIFT)) +#define PMG_SET(r, id) (PMG_CLR(r) | ((id) << PMG_I_SHIFT) | ((id) << PMG_D_SHIFT)) + +#define TRAPMPAM1EL1_SHIFT (PMG_D_SHIFT + PMG_BITS) +#define TRAPMPAM0EL1_SHIFT (TRAPMPAM1EL1_SHIFT + 1) +#define TRAPLOWER_SHIFT (TRAPMPAM0EL1_SHIFT + 13) +#define MPAMEN_SHIFT (TRAPLOWER_SHIFT + 1) + +/* + * MPAMHCR_EL2: + * 0 EL0_VPMEN + * 1 EL1_VPMEN + * 7:2 Reserved + * 8 GSTAPP_PLK + * 30:9 Reserved + * 31 TRAP_MPAMIDR_EL1 + * 63:32 Reserved + */ +#define EL0_VPMEN_SHIFT (0) +#define EL1_VPMEN_SHIFT (EL0_VPMEN_SHIFT + 1) +#define GSTAPP_PLK_SHIFT (8) +#define TRAP_MPAMIDR_EL1_SHIFT (31) + +/* + * MPAMIDR_EL1: + * 15:0 PARTID_MAX + * 16 Reserved + * 17 HAS_HCR + * 20:18 VPMR_MAX + * 31:21 Reserved + * 39:32 PMG_MAX + * 63:40 Reserved + */ +#define VPMR_MAX_BITS (3) +#define PARTID_MAX_SHIFT (0) +#define PARTID_MAX_MASK (MPAM_MASK(PARTID_BITS) << PARTID_MAX_SHIFT) +#define HAS_HCR_SHIFT (PARTID_MAX_SHIFT + PARTID_BITS + 1) +#define VPMR_MAX_SHIFT (HAS_HCR_SHIFT + 1) +#define PMG_MAX_SHIFT (VPMR_MAX_SHIFT + VPMR_MAX_BITS + 11) +#define PMG_MAX_MASK (MPAM_MASK(PMG_BITS) << PMG_MAX_SHIFT) +#define VPMR_MASK MPAM_MASK(VPMR_MAX_BITS) + +/* + * MPAMVPMV_EL2: + * 31:0 VPM_V + * 63:32 Reserved + */ +#define VPM_V_BITS 32 + +DECLARE_STATIC_KEY_FALSE(resctrl_enable_key); +DECLARE_STATIC_KEY_FALSE(resctrl_mon_enable_key); + +extern int max_name_width, max_data_width; + +#define RESCTRL_SHOW_DOM_MAX_NUM 8 + +#define mpam_read_sysreg_s(reg, name) read_sysreg_s(reg) +#define mpam_write_sysreg_s(v, r, n) write_sysreg_s(v, r) +#define mpam_readl(addr) readl(addr) +#define mpam_writel(v, addr) writel(v, addr) + +/* 64bit arm64 specified */ +union mon_data_bits { + void *priv; + struct { + u8 rid; + u8 domid; + u8 partid; + u8 pmg; + u8 mon; + u8 cdp_both_mon; + } u; +}; + +ssize_t resctrl_group_schemata_write(struct kernfs_open_file *of, + char *buf, size_t nbytes, loff_t off); + +int 
resctrl_group_schemata_show(struct kernfs_open_file *of, + struct seq_file *s, void *v); + +struct rdt_domain *mpam_find_domain(struct resctrl_resource *r, int id, + struct list_head **pos);
extern bool rdt_alloc_capable; extern bool rdt_mon_capable; @@ -201,4 +335,8 @@ int assoc_rmid_with_mon(u32 rmid); void deassoc_rmid_with_mon(u32 rmid); u32 get_rmid_mon(u32 rmid, enum resctrl_resource_level rid); int rmid_mon_ptrs_init(u32 nr_rmids); + +struct resctrl_resource * +mpam_resctrl_get_resource(enum resctrl_resource_level level); + #endif diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 503244cb7e97..ea99c711a296 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -40,10 +40,11 @@ #include <linux/arm_mpam.h>
#include <asm/mpam_sched.h> -#include <asm/mpam_resource.h> +#include <asm/mpam.h> #include <asm/io.h>
#include "mpam_device.h" +#include "mpam_resource.h" #include "mpam_internal.h"
/* Mutex to protect rdtgroup access. */ diff --git a/arch/arm64/include/asm/mpam_resource.h b/arch/arm64/kernel/mpam/mpam_resource.h similarity index 98% rename from arch/arm64/include/asm/mpam_resource.h rename to arch/arm64/kernel/mpam/mpam_resource.h index 412faca90e0b..a9e8334e879e 100644 --- a/arch/arm64/include/asm/mpam_resource.h +++ b/arch/arm64/kernel/mpam/mpam_resource.h @@ -165,10 +165,10 @@ #define MPAMF_MBWUMON_IDR_HAS_CAPTURE BIT(31)
/* MPAMF_CPOR_IDR - MPAM features cache portion partitioning ID register */ -#define MPAMF_CPOR_IDR_CPBM_WD GENMASK(15, 0) +#define MPAMF_CPOR_IDR_CPBM_WD GENMASK(15, 0)
/* MPAMF_CCAP_IDR - MPAM features cache capacity partitioning ID register */ -#define MPAMF_CCAP_IDR_CMAX_WD GENMASK(5, 0) +#define MPAMF_CCAP_IDR_CMAX_WD GENMASK(5, 0)
/* MPAMF_MBW_IDR - MPAM features memory bandwidth partitioning ID register */ #define MPAMF_MBW_IDR_BWA_WD GENMASK(5, 0) diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index e2891529ec11..239bd4e3d9ca 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -41,8 +41,8 @@
#include <uapi/linux/magic.h>
-#include <asm/mpam.h> #include <asm/resctrl.h> +#include <asm/mpam.h>
DEFINE_STATIC_KEY_FALSE(resctrl_enable_key); DEFINE_STATIC_KEY_FALSE(resctrl_mon_enable_key); @@ -144,7 +144,7 @@ static int __resctrl_group_add_files(struct kernfs_node *kn, unsigned long fflag return ret; }
-static int resctrl_group_add_files(struct kernfs_node *kn, unsigned long fflags) +int resctrl_group_add_files(struct kernfs_node *kn, unsigned long fflags) { int ret = 0;
@@ -155,94 +155,6 @@ static int resctrl_group_add_files(struct kernfs_node *kn, unsigned long fflags) return ret; }
-static int resctrl_group_mkdir_info_resdir(struct resctrl_resource *r, char *name,
-					   unsigned long fflags)
-{
-	struct kernfs_node *kn_subdir;
-	int ret;
-
-	kn_subdir = kernfs_create_dir(kn_info, name,
-				      kn_info->mode, r);
-	if (IS_ERR(kn_subdir))
-		return PTR_ERR(kn_subdir);
-
-	kernfs_get(kn_subdir);
-	ret = resctrl_group_kn_set_ugid(kn_subdir);
-	if (ret)
-		return ret;
-
-	ret = resctrl_group_add_files(kn_subdir, fflags);
-	if (!ret)
-		kernfs_activate(kn_subdir);
-
-	return ret;
-}
-
-static int resctrl_group_create_info_dir(struct kernfs_node *parent_kn)
-{
-	struct resctrl_schema *s;
-	struct resctrl_resource *r;
-	struct raw_resctrl_resource *rr;
-	unsigned long fflags;
-	char name[32];
-	int ret;
-
-	/* create the directory */
-	kn_info = kernfs_create_dir(parent_kn, "info", parent_kn->mode, NULL);
-	if (IS_ERR(kn_info))
-		return PTR_ERR(kn_info);
-	kernfs_get(kn_info);
-
-	ret = resctrl_group_add_files(kn_info, RF_TOP_INFO);
-	if (ret)
-		goto out_destroy;
-
-	list_for_each_entry(s, &resctrl_all_schema, list) {
-		r = s->res;
-		if (!r)
-			continue;
-		rr = r->res;
-		if (r->alloc_enabled) {
-			fflags = rr->fflags | RF_CTRL_INFO;
-			ret = resctrl_group_mkdir_info_resdir(r, s->name, fflags);
-			if (ret)
-				goto out_destroy;
-		}
-	}
-
-	list_for_each_entry(s, &resctrl_all_schema, list) {
-		r = s->res;
-		if (!r)
-			continue;
-		rr = r->res;
-		if (r->mon_enabled) {
-			fflags = rr->fflags | RF_MON_INFO;
-			snprintf(name, sizeof(name), "%s_MON", s->name);
-			ret = resctrl_group_mkdir_info_resdir(r, name, fflags);
-			if (ret)
-				goto out_destroy;
-		}
-	}
-
-	/*
-	 * This extra ref will be put in kernfs_remove() and guarantees
-	 * that @rdtgrp->kn is always accessible.
-	 */
-	kernfs_get(kn_info);
-
-	ret = resctrl_group_kn_set_ugid(kn_info);
-	if (ret)
-		goto out_destroy;
-
-	kernfs_activate(kn_info);
-
-	return 0;
-
-out_destroy:
-	kernfs_remove(kn_info);
-	return ret;
-}
-
 /*
  * We don't allow resctrl_group directories to be created anywhere
  * except the root directory.
Thus when looking for the resctrl_group @@ -441,7 +353,7 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, goto out_options; }
- ret = resctrl_group_create_info_dir(resctrl_group_default.kn); + ret = resctrl_group_create_info_dir(resctrl_group_default.kn, &kn_info); if (ret) { dentry = ERR_PTR(ret); goto out_options; diff --git a/include/linux/resctrlfs.h b/include/linux/resctrlfs.h index 287be52e2385..bd739bdc72cb 100644 --- a/include/linux/resctrlfs.h +++ b/include/linux/resctrlfs.h @@ -9,59 +9,6 @@ #include <linux/seq_buf.h> #include <linux/seq_file.h>
-/** - * struct resctrl_cache - Cache allocation related data - * @cbm_len: Length of the cache bit mask - * @min_cbm_bits: Minimum number of consecutive bits to be set - * @cbm_idx_mult: Multiplier of CBM index - * @cbm_idx_offset: Offset of CBM index. CBM index is computed by: - * closid * cbm_idx_multi + cbm_idx_offset - * in a cache bit mask - * @shareable_bits: Bitmask of shareable resource with other - * executing entities - * @arch_has_sparse_bitmaps: True if a bitmap like f00f is valid. - */ -struct resctrl_cache { - u32 cbm_len; - u32 shareable_bits; - u32 min_cbm_bits; -}; - -/** - * struct resctrl_membw - Memory bandwidth allocation related data - * @min_bw: Minimum memory bandwidth percentage user can request - * @bw_gran: Granularity at which the memory bandwidth is allocated - * @delay_linear: True if memory B/W delay is in linear scale - * @ctrl_extend_bits: Indicates if there are extra ctrl capabilities supported. - * e.g. priority/hardlimit. - */ -struct resctrl_membw { - u32 min_bw; - u32 bw_gran; - u32 delay_linear; -}; - -struct resctrl_resource { - int rid; - bool alloc_enabled; - bool mon_enabled; - bool alloc_capable; - bool mon_capable; - char *name; - struct list_head domains; - u32 dom_num; - struct list_head evt_list; - unsigned long fflags; - - struct resctrl_cache cache; - struct resctrl_membw mbw; - - bool cdp_capable; - bool cdp_enable; - - void *res; -}; - DECLARE_STATIC_KEY_FALSE(resctrl_enable_key);
/* rftype.flags */ @@ -87,9 +34,6 @@ DECLARE_STATIC_KEY_FALSE(resctrl_enable_key); #define RF_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP) #define RF_CTRL_BASE (RFTYPE_BASE | RFTYPE_CTRL)
-/* List of all resource groups */ -extern struct list_head resctrl_all_groups; - /** * struct rftype - describe each file in the resctrl file system * @name: File name @@ -120,19 +64,4 @@ struct rftype { char *buf, size_t nbytes, loff_t off); };
-DECLARE_STATIC_KEY_FALSE(resctrl_alloc_enable_key); -extern struct rdtgroup resctrl_group_default; - -int __init resctrl_group_init(void); - -int register_resctrl_specific_files(struct rftype *files, size_t len); -extern struct kernfs_ops resctrl_group_kf_single_ops; - -extern struct rdtgroup *resctrl_group_kn_lock_live(struct kernfs_node *kn); -void resctrl_group_kn_unlock(struct kernfs_node *kn); - -void post_resctrl_mount (void); - -#define RESCTRL_MAX_CBM 32 - #endif /* _RESCTRLFS_H */
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
An rmid is used to mark each resctrl group for monitoring, and it also follows the corresponding resctrl group's configuration. Export an rmid sysfile to the resctrl sysfs so the id can be used elsewhere, such as for SMMU I/O: a user can read the rmid of a resctrl group and, if the SMMU implements MPAM, attach that rmid to a target I/O stream through the SMMU driver. This lets the related I/O devices be monitored and take on the intended resource configuration.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_resctrl.c | 30 +++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index ea99c711a296..004be508459e 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -1733,6 +1733,29 @@ static int resctrl_group_tasks_show(struct kernfs_open_file *of, return ret; }
+static int resctrl_group_rmid_show(struct kernfs_open_file *of, + struct seq_file *s, void *v) +{ + int ret = 0; + struct rdtgroup *rdtgrp; + u32 flag, times; + + hw_alloc_times_validate(times, flag); + + rdtgrp = resctrl_group_kn_lock_live(of->kn); + if (rdtgrp) { + if (flag) + seq_printf(s, "%u-%u\n", rdtgrp->mon.rmid, + rdtgrp->mon.rmid + 1); + else + seq_printf(s, "%u\n", rdtgrp->mon.rmid); + } else + ret = -ENOENT; + resctrl_group_kn_unlock(of->kn); + + return ret; +} + /* rdtgroup information files for one cache resource. */ static struct rftype res_specific_files[] = { { @@ -1830,6 +1853,13 @@ static struct rftype res_specific_files[] = { .seq_show = resctrl_group_tasks_show, .fflags = RFTYPE_BASE, }, + { + .name = "rmid", + .mode = 0444, + .kf_ops = &resctrl_group_kf_single_ops, + .seq_show = resctrl_group_rmid_show, + .fflags = RFTYPE_BASE, + }, { .name = "schemata", .mode = 0644,
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: feature bugzilla: 34278 CVE: NA
-------------------------------------------------
The ctrl_features array, introduced by commit 61fa56e1dd8a ("arm64/mpam: Add resctrl_ctrl_feature structure to manage ctrl features"), lives in the raw_resctrl_resource structure and lists which ctrl feature types are supported in total for a resource. It filters illegal parameters out of the mount options and gives add_schema() the information it needs to register a new control type node in the schema list.
This makes it easier to add new ctrl features later.
Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Reviewed-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/resctrl.h | 2 ++ arch/arm64/kernel/mpam/mpam_ctrlmon.c | 24 ++++++++------------ arch/arm64/kernel/mpam/mpam_resctrl.c | 32 +++++++++++++++------------ 3 files changed, 29 insertions(+), 29 deletions(-)
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index af2388a43990..15310ad1b287 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -195,6 +195,8 @@ struct resctrl_ctrl_feature { int default_ctrl; bool capable; bool enabled; + + const char *ctrl_suffix; };
struct msr_param { diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 368cedcded1f..90416179bad2 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -46,12 +46,9 @@ static int add_schema(enum resctrl_conf_type t, struct resctrl_resource *r) { int ret = 0; char *suffix = ""; - char *ctrl_suffix = ""; struct resctrl_schema *s; struct raw_resctrl_resource *rr; - struct resctrl_schema_ctrl *sc, *sc_tmp; - struct resctrl_schema_ctrl *sc_pri = NULL; - struct resctrl_schema_ctrl *sc_hdl = NULL; + struct resctrl_schema_ctrl *sc, *tmp; enum resctrl_ctrl_type type;
s = kzalloc(sizeof(*s), GFP_KERNEL); @@ -97,6 +94,9 @@ static int add_schema(enum resctrl_conf_type t, struct resctrl_resource *r) rr = r->res; INIT_LIST_HEAD(&s->schema_ctrl_list); for_each_extend_ctrl_type(type) { + struct resctrl_ctrl_feature *feature = + &rr->ctrl_features[type]; + if (!rr->ctrl_features[type].enabled || !rr->ctrl_features[type].max_wd) continue; @@ -107,25 +107,19 @@ static int add_schema(enum resctrl_conf_type t, struct resctrl_resource *r) goto err; } sc->ctrl_type = type; - if (type == SCHEMA_PRI) { - sc_pri = sc; - ctrl_suffix = "PRI"; - } else if (type == SCHEMA_HDL) { - sc_hdl = sc; - ctrl_suffix = "HDL"; - }
WARN_ON_ONCE(strlen(r->name) + strlen(suffix) + - strlen(ctrl_suffix) + 1 > RESCTRL_NAME_LEN); - snprintf(sc->name, sizeof(sc->name), "%s%s%s", - r->name, suffix, ctrl_suffix); + strlen(feature->ctrl_suffix) + 1 > RESCTRL_NAME_LEN); + snprintf(sc->name, sizeof(sc->name), "%s%s%s", r->name, + suffix, feature->ctrl_suffix); + list_add_tail(&sc->list, &s->schema_ctrl_list); }
return 0;
err: - list_for_each_entry_safe(sc, sc_tmp, &s->schema_ctrl_list, list) { + list_for_each_entry_safe(sc, tmp, &s->schema_ctrl_list, list) { list_del(&sc->list); kfree(sc); } diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 004be508459e..6e251ef90c36 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -1053,17 +1053,28 @@ static void basic_ctrl_enable(void) } }
-static int extend_ctrl_enable(enum resctrl_ctrl_type type) +static int extend_ctrl_enable(char *tok) { bool match = false; + struct resctrl_resource *r; struct raw_resctrl_resource *rr; struct mpam_resctrl_res *res; + struct resctrl_ctrl_feature *feature; + enum resctrl_ctrl_type type;
for_each_supported_resctrl_exports(res) { - rr = res->resctrl_res.res; - if (rr->ctrl_features[type].capable) { - rr->ctrl_features[type].enabled = true; - match = true; + r = &res->resctrl_res; + if (!r->alloc_capable) + continue; + rr = r->res; + for_each_ctrl_type(type) { + feature = &rr->ctrl_features[type]; + if (strcmp(feature->name, tok)) + continue; + if (rr->ctrl_features[type].capable) { + rr->ctrl_features[type].enabled = true; + match = true; + } } }
@@ -1108,17 +1119,10 @@ int parse_rdtgroupfs_options(char *data) ret = cdpl2_enable(); if (ret) goto out; - } else if (!strcmp(token, "priority")) { - ret = extend_ctrl_enable(SCHEMA_PRI); - if (ret) - goto out; - } else if (!strcmp(token, "hardlimit")) { - ret = extend_ctrl_enable(SCHEMA_HDL); + } else { + ret = extend_ctrl_enable(token); if (ret) goto out; - } else { - ret = -EINVAL; - goto out; } }
From: Wang ShaoBo <bobo.shaobowang@huawei.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
MPAM includes partid, pmg and monitor; collectively we call these mpam ids. With cdp enabled, we allocate a new mpam id equal to mpamid + 1, and in some places the mpam id need not be wrapped in struct { u16 val; }. For simplicity, add the macro resctrl_cdp_mpamid_map_val() to perform this cdp mapping directly on a plain value.
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/include/asm/resctrl.h      | 30 ++++++-----
 arch/arm64/kernel/mpam/mpam_ctrlmon.c | 64 ++++++++++-----------------
 arch/arm64/kernel/mpam/mpam_resctrl.c |  9 +---
 3 files changed, 43 insertions(+), 60 deletions(-)
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 15310ad1b287..45dde62e8586 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -227,29 +227,35 @@ static inline int conf_name_to_conf_type(char *name) #define for_each_conf_type(t) \ for (t = CDP_BOTH; t < CDP_NUM_CONF_TYPE; t++)
-typedef struct { u16 val; } hw_def_t; +typedef struct { u16 val; } hw_mpamid_t; +typedef hw_mpamid_t hw_closid_t;
-#define hw_closid_t hw_def_t -#define hw_monid_t hw_def_t +#define hw_mpamid_val(__x) (__x.val) #define hw_closid_val(__x) (__x.val) -#define hw_monid_val(__x) (__x.val)
-#define as_hw_t(__name, __x) \ - ((hw_##__name##id_t){(__x)}) -#define hw_val(__name, __x) \ - hw_##__name##id_val(__x) +#define as_hw_mpamid_t(__x) ((hw_mpamid_t){(__x)})
/** * When cdp enabled, give (closid + 1) to Cache LxDATA. */ -#define resctrl_cdp_map(__name, __closid, __type, __result) \ +#define resctrl_cdp_mpamid_map(__id, __type, __hw_mpamid) \ do { \ if (__type == CDP_CODE) \ - __result = as_hw_t(__name, __closid); \ + __hw_mpamid = as_hw_mpamid_t(__id); \ else if (__type == CDP_DATA) \ - __result = as_hw_t(__name, __closid + 1); \ + __hw_mpamid = as_hw_mpamid_t(__id + 1); \ else \ - __result = as_hw_t(__name, __closid); \ + __hw_mpamid = as_hw_mpamid_t(__id); \ +} while (0) + +#define resctrl_cdp_mpamid_map_val(__id, __type, __hw_mpamid_val) \ +do { \ + if (__type == CDP_CODE) \ + __hw_mpamid_val = __id; \ + else if (__type == CDP_DATA) \ + __hw_mpamid_val = __id + 1; \ + else \ + __hw_mpamid_val = __id; \ } while (0)
bool is_resctrl_cdp_enabled(void); diff --git a/arch/arm64/kernel/mpam/mpam_ctrlmon.c b/arch/arm64/kernel/mpam/mpam_ctrlmon.c index 90416179bad2..aae585e7d7df 100644 --- a/arch/arm64/kernel/mpam/mpam_ctrlmon.c +++ b/arch/arm64/kernel/mpam/mpam_ctrlmon.c @@ -182,10 +182,8 @@ resctrl_dom_ctrl_config(bool cdp_both_ctrl, struct resctrl_resource *r, rr->msr_update(r, dom, para);
if (cdp_both_ctrl) { - hw_closid_t hw_closid; - - resctrl_cdp_map(clos, para->closid->reqpartid, CDP_DATA, hw_closid); - para->closid->reqpartid = hw_closid_val(hw_closid); + resctrl_cdp_mpamid_map_val(para->closid->reqpartid, CDP_DATA, + para->closid->reqpartid); rr->msr_update(r, dom, para); } } @@ -196,7 +194,6 @@ static void resctrl_group_update_domain_ctrls(struct rdtgroup *rdtgrp, int i; struct resctrl_staged_config *cfg; enum resctrl_ctrl_type type; - hw_closid_t hw_closid; struct sd_closid closid; struct list_head *head; struct rdtgroup *entry; @@ -223,9 +220,8 @@ static void resctrl_group_update_domain_ctrls(struct rdtgroup *rdtgrp, * duplicate ctrl group's configuration indexed * by intpartid from domain ctrl_val array. */ - resctrl_cdp_map(clos, rdtgrp->closid.reqpartid, - cfg[i].conf_type, hw_closid); - closid.reqpartid = hw_closid_val(hw_closid); + resctrl_cdp_mpamid_map_val(rdtgrp->closid.reqpartid, + cfg[i].conf_type, closid.reqpartid);
dom->ctrl_val[type][closid.intpartid] = cfg[i].new_ctrl[type]; @@ -242,10 +238,8 @@ static void resctrl_group_update_domain_ctrls(struct rdtgroup *rdtgrp, */ head = &rdtgrp->mon.crdtgrp_list; list_for_each_entry(entry, head, mon.crdtgrp_list) { - resctrl_cdp_map(clos, entry->closid.reqpartid, - cfg[i].conf_type, hw_closid); - - closid.reqpartid = hw_closid_val(hw_closid); + resctrl_cdp_mpamid_map_val(entry->closid.reqpartid, + cfg[i].conf_type, closid.reqpartid); resctrl_dom_ctrl_config(cdp_both_ctrl, r, dom, ¶); } } @@ -295,7 +289,7 @@ parse_line(char *line, struct resctrl_resource *r, dom = strim(dom); list_for_each_entry(d, &r->domains, list) { if (d->id == dom_id) { - resctrl_cdp_map(clos, closid, conf_type, hw_closid); + resctrl_cdp_mpamid_map(closid, conf_type, hw_closid); if (rr->parse_ctrlval(dom, r, &d->staged_cfg[conf_type], ctrl_type)) return -EINVAL; @@ -468,7 +462,6 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, struct resctrl_resource *r; struct resctrl_schema *rs; int ret = 0; - hw_closid_t hw_closid; struct sd_closid closid; struct resctrl_schema_ctrl *sc;
@@ -479,13 +472,11 @@ int resctrl_group_schemata_show(struct kernfs_open_file *of, if (!r) continue; if (r->alloc_enabled) { - resctrl_cdp_map(clos, rdtgrp->closid.intpartid, - rs->conf_type, hw_closid); - closid.intpartid = hw_closid_val(hw_closid); + resctrl_cdp_mpamid_map_val(rdtgrp->closid.intpartid, + rs->conf_type, closid.intpartid);
- resctrl_cdp_map(clos, rdtgrp->closid.reqpartid, - rs->conf_type, hw_closid); - closid.reqpartid = hw_closid_val(hw_closid); + resctrl_cdp_mpamid_map_val(rdtgrp->closid.reqpartid, + rs->conf_type, closid.reqpartid);
show_doms(s, r, rs->name, SCHEMA_COMM, &closid); list_for_each_entry(sc, &rs->schema_ctrl_list, list) { @@ -541,10 +532,7 @@ static u64 resctrl_dom_mon_data(struct resctrl_resource *r, rr = r->res; ret = rr->mon_read(d, md.priv); if (md.u.cdp_both_mon) { - hw_closid_t hw_closid; - - resctrl_cdp_map(clos, md.u.partid, CDP_DATA, hw_closid); - md.u.partid = hw_closid_val(hw_closid); + resctrl_cdp_mpamid_map_val(md.u.partid, CDP_DATA, md.u.partid); ret += rr->mon_read(d, md.priv); }
@@ -594,10 +582,9 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) struct list_head *head; struct rdtgroup *entry; hw_closid_t hw_closid; - hw_monid_t hw_monid; enum resctrl_conf_type type = CDP_CODE;
- resctrl_cdp_map(clos, rdtgrp->closid.reqpartid, + resctrl_cdp_mpamid_map(rdtgrp->closid.reqpartid, CDP_CODE, hw_closid); /* CDP_CODE share the same closid with CDP_BOTH */ if (md.u.partid != hw_closid_val(hw_closid)) @@ -605,9 +592,8 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg)
head = &rdtgrp->mon.crdtgrp_list; list_for_each_entry(entry, head, mon.crdtgrp_list) { - resctrl_cdp_map(clos, entry->closid.reqpartid, - type, hw_closid); - md.u.partid = hw_closid_val(hw_closid); + resctrl_cdp_mpamid_map_val(entry->closid.reqpartid, + type, md.u.partid);
ret = mpam_rmid_to_partid_pmg(entry->mon.rmid, NULL, &pmg); @@ -615,9 +601,8 @@ int resctrl_group_mondata_show(struct seq_file *m, void *arg) return ret;
md.u.pmg = pmg; - resctrl_cdp_map(mon, get_rmid_mon(entry->mon.rmid, - r->rid), type, hw_monid); - md.u.mon = hw_monid_val(hw_monid); + resctrl_cdp_mpamid_map_val(get_rmid_mon(entry->mon.rmid, + r->rid), type, md.u.mon);
usage += resctrl_dom_mon_data(r, d, md.priv); } @@ -657,8 +642,6 @@ static int resctrl_mkdir_mondata_dom(struct kernfs_node *parent_kn, { struct resctrl_resource *r; struct raw_resctrl_resource *rr; - hw_closid_t hw_closid; - hw_monid_t hw_monid; union mon_data_bits md; struct kernfs_node *kn; char name[32]; @@ -671,11 +654,10 @@ static int resctrl_mkdir_mondata_dom(struct kernfs_node *parent_kn, md.u.rid = r->rid; md.u.domid = d->id; /* monitoring use reqpartid (reqpartid) */ - resctrl_cdp_map(clos, prgrp->closid.reqpartid, s->conf_type, hw_closid); - md.u.partid = hw_closid_val(hw_closid); - resctrl_cdp_map(mon, get_rmid_mon(prgrp->mon.rmid, r->rid), - s->conf_type, hw_monid); - md.u.mon = hw_monid_val(hw_monid); + resctrl_cdp_mpamid_map_val(prgrp->closid.reqpartid, s->conf_type, + md.u.partid); + resctrl_cdp_mpamid_map_val(get_rmid_mon(prgrp->mon.rmid, r->rid), + s->conf_type, md.u.mon);
ret = mpam_rmid_to_partid_pmg(prgrp->mon.rmid, NULL, &pmg); if (ret) @@ -861,7 +843,7 @@ static void rdtgroup_init_mba(struct resctrl_schema *s, u32 closid) cfg = &d->staged_cfg[CDP_BOTH]; cfg->cdp_both_ctrl = s->cdp_mc_both; cfg->new_ctrl[SCHEMA_COMM] = rr->ctrl_features[SCHEMA_COMM].default_ctrl; - resctrl_cdp_map(clos, closid, CDP_BOTH, cfg->hw_closid); + resctrl_cdp_mpamid_map(closid, CDP_BOTH, cfg->hw_closid); cfg->have_new_ctrl = true; /* Set extension ctrl default value, e.g. priority/hardlimit */ for_each_extend_ctrl_type(t) { @@ -917,7 +899,7 @@ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) return -ENOSPC; }
- resctrl_cdp_map(clos, closid, conf_type, cfg->hw_closid); + resctrl_cdp_mpamid_map(closid, conf_type, cfg->hw_closid); cfg->have_new_ctrl = true;
/* diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 6e251ef90c36..aa57b7ae003f 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -1965,13 +1965,8 @@ void __mpam_sched_in(void) resctrl_navie_rmid_partid_pmg(rmid, (int *)&reqpartid, (int *)&pmg);
if (resctrl_cdp_enabled) { - hw_closid_t hw_closid; - - resctrl_cdp_map(clos, reqpartid, CDP_DATA, hw_closid); - partid_d = hw_closid_val(hw_closid); - - resctrl_cdp_map(clos, reqpartid, CDP_CODE, hw_closid); - partid_i = hw_closid_val(hw_closid); + resctrl_cdp_mpamid_map_val(reqpartid, CDP_DATA, partid_d); + resctrl_cdp_mpamid_map_val(reqpartid, CDP_CODE, partid_i);
/* set in EL0 */ reg = mpam_read_sysreg_s(SYS_MPAM0_EL1, "SYS_MPAM0_EL1");
From: Wang ShaoBo <bobo.shaobowang@huawei.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
Monitoring sometimes shows anomalies such as:
e.g.
  > cd /sys/fs/resctrl/ && grep . mon_data/*
  mon_data/mon_L3CODE_00:14336
  mon_data/mon_L3CODE_01:344064
  mon_data/mon_L3CODE_02:2048
  mon_data/mon_L3CODE_03:27648
  mon_data/mon_L3DATA_00:0    # L3DATA's monitoring data is always 0
  mon_data/mon_L3DATA_01:0
  mon_data/mon_L3DATA_02:0
  mon_data/mon_L3DATA_03:0
  mon_data/mon_MB_00:392
  mon_data/mon_MB_01:552
  mon_data/mon_MB_02:160
  mon_data/mon_MB_03:0
With cdp enabled, tasks in the resctrl default group (closid=0 and rmid=0) never fill proper partid_i/pmg_i and partid_d/pmg_d values into the MPAMx_ELx sysregs via mpam_sched_in(), called from __switch_to(). This is because the current cpu's default closid and rmid are also 0, so the comparison suggests the configuration is already applied and the sysreg update is skipped.
Fix this by updating each cpu's default closid to a non-zero value and calling update_closid_rmid() when resctrl sysfs is mounted, so that every cpu reprograms the proper partid and pmg into its MPAMx_ELx sysregs. This looks like a practical approach.
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 fs/resctrlfs.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index 239bd4e3d9ca..cfa09344ad5d 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -320,6 +320,28 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn, return ret; }
+static void resctrl_cdp_update_cpus_state(struct resctrl_group *r) +{ + int cpu; + + /* + * If cdp on, tasks in resctrl default group with closid=0 + * and rmid=0 don't know how to fill proper partid_i/pmg_i + * and partid_d/pmg_d into MPAMx_ELx sysregs by mpam_sched_in() + * called by __switch_to(), it's because current cpu's default + * closid and rmid are also equal to 0 and to make the operation + * modifying configuration passed. Update per cpu default closid + * of none-zero value, call update_closid_rmid() to update each + * cpu's mpam proper MPAMx_ELx sysregs for setting partid and + * pmg when mounting resctrl sysfs, it looks like a practical + * method. + */ + for_each_cpu(cpu, &r->cpu_mask) + per_cpu(pqr_state.default_closid, cpu) = ~0; + + update_closid_rmid(&r->cpu_mask, NULL); +} + static struct dentry *resctrl_mount(struct file_system_type *fs_type, int flags, const char *unused_dev_name, void *data) @@ -389,6 +411,8 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type, if (IS_ERR(dentry)) goto out_mondata;
+ resctrl_cdp_update_cpus_state(&resctrl_group_default); + post_resctrl_mount();
goto out;
From: Wang ShaoBo <bobo.shaobowang@huawei.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
The proximity domain of a Memory MSC node cannot be used directly as the node id when indexing components; we should always obtain the real node id with acpi_map_pxm_to_node(). For instance, after DIE interleaving the proximity domains become discontiguous, so only node ids can be used as indices.
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 drivers/acpi/arm64/mpam.c | 33 ++++-----------------------------
 1 file changed, 4 insertions(+), 29 deletions(-)
diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c index 6c238f5a5c5a..51419473f63b 100644 --- a/drivers/acpi/arm64/mpam.c +++ b/drivers/acpi/arm64/mpam.c @@ -71,42 +71,17 @@ acpi_mpam_label_cache_component_id(struct acpi_table_header *table_hdr, return 0; }
-/** - * acpi_mpam_label_memory_component_id() - Use proximity_domain id to - * label mpam memory node, which be signed by @component_id. - * @proximity_domain: proximity_domain of ACPI MPAM memory node - * @component_id: The id labels the structure mpam_node memory - */ -static int acpi_mpam_label_memory_component_id(u8 proximity_domain, - u32 *component_id) -{ - u32 nid = (u32)proximity_domain; - - if (nid >= nr_online_nodes) { - pr_err_once("Invalid proximity domain\n"); - return -EINVAL; - } - - *component_id = nid; - return 0; -} - static int __init acpi_mpam_parse_memory(struct acpi_mpam_header *h) { - int ret; u32 component_id; struct mpam_device *dev; struct acpi_mpam_node_memory *node = (struct acpi_mpam_node_memory *)h;
- ret = acpi_mpam_label_memory_component_id(node->proximity_domain, - &component_id); - if (ret) { - pr_err("Failed to label memory component id\n"); - return -EINVAL; - } + component_id = acpi_map_pxm_to_node(node->proximity_domain); + if (component_id == NUMA_NO_NODE) + component_id = 0;
- dev = mpam_device_create_memory(component_id, - node->header.base_address); + dev = mpam_device_create_memory(component_id, node->header.base_address); if (IS_ERR(dev)) { pr_err("Failed to create memory node\n"); return -EINVAL;
From: Wang ShaoBo <bobo.shaobowang@huawei.com>
hulk inclusion
category: feature
bugzilla: 34278
CVE: NA
-------------------------------------------------
Based on 61fa56e1dd8a ("arm64/mpam: Add resctrl_ctrl_feature structure to manage ctrl features"), add several ctrl features and their corresponding mount options: mbPbm, mbMax, mbMin, mbPrio, caMax, caPrio and caPbm. If the MPAM system supports the relevant features, resctrl can be mounted like this:
e.g.
mount -t resctrl resctrl /sys/fs/resctrl -o mbMax,mbMin,caPrio
cd /sys/fs/resctrl && cat schemata

L3:0=0x7fff;1=0x7fff;2=0x7fff;3=0x7fff    # cpbm selected as basic ctrl feature by default
L3PRI:0=3;1=3;2=3;3=3
MBMAX:0=100;1=100;2=100;3=100
MBMIN:0=0;1=0;2=0;3=0

mount -t resctrl resctrl /sys/fs/resctrl
cd /sys/fs/resctrl && cat schemata

L3:0=0x7fff;1=0x7fff;2=0x7fff;3=0x7fff    # cpbm selected as basic ctrl feature by default
MB:0=100;1=100;2=100;3=100                # mbw max selected as basic ctrl feature by default

mount -t resctrl resctrl /sys/fs/resctrl -o caMax
cd /sys/fs/resctrl && cat schemata

L3:0=33554432;1=33554432;2=33554432;3=33554432    # cmax ctrl feature in use
MB:0=100;1=100;2=100;3=100                        # mbw max selected as basic ctrl feature by default
For Cache MSCs, the basic ctrl features are cmax (Cache Maximum Capacity) and cpbm (Cache Portion Bitmap) partitioning; if no mount option is specified, cpbm is selected by default.
For Memory MSCs, the basic ctrl features are max (Memory Bandwidth Maximum) and pbm (Memory Bandwidth Portion Bitmap) partitioning; if no mount option is specified, max is selected by default.
These mount options can also be combined with the cdp options.
e.g.
mount -t resctrl resctrl /sys/fs/resctrl -o caMax,caPrio,cdpl3
cd /sys/fs/resctrl && cat schemata

L3CODE:0=33554432;1=33554432;2=33554432;3=33554432    # code uses cmax ctrl feature
L3DATA:0=33554432;1=33554432;2=33554432;3=33554432    # data uses cmax ctrl feature
L3CODEPRI:0=3;1=3;2=3;3=3    # code uses intpriority ctrl feature
L3DATAPRI:0=3;1=3;2=3;3=3    # data uses intpriority ctrl feature
MB:0=100;1=100;2=100;3=100   # mbw max selected as basic ctrl feature by default
Combining these mount parameters lets us use MPAM more flexibly.
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/arm64/include/asm/resctrl.h       |  18 ++--
 arch/arm64/kernel/mpam/mpam_device.c   |  33 ++++++-
 arch/arm64/kernel/mpam/mpam_internal.h |   3 +
 arch/arm64/kernel/mpam/mpam_resctrl.c  | 120 ++++++++++++++++++++++---
 arch/arm64/kernel/mpam/mpam_setup.c    |  84 ++++++++++++-----
 5 files changed, 218 insertions(+), 40 deletions(-)
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 45dde62e8586..95148e1bb232 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -27,12 +27,15 @@ enum rdt_event_id { QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
QOS_CAT_CPBM_EVENT_ID = 0x04, - QOS_CAT_INTPRI_EVENT_ID = 0x05, - QOS_CAT_DSPRI_EVENT_ID = 0x06, - QOS_MBA_MAX_EVENT_ID = 0x07, - QOS_MBA_INTPRI_EVENT_ID = 0x08, - QOS_MBA_DSPRI_EVENT_ID = 0x09, - QOS_MBA_HDL_EVENT_ID = 0x0a, + QOS_CAT_CMAX_EVENT_ID = 0x05, + QOS_CAT_INTPRI_EVENT_ID = 0x06, + QOS_CAT_DSPRI_EVENT_ID = 0x07, + QOS_MBA_MAX_EVENT_ID = 0x08, + QOS_MBA_MIN_EVENT_ID = 0x09, + QOS_MBA_PBM_EVENT_ID = 0x0a, + QOS_MBA_INTPRI_EVENT_ID = 0x0b, + QOS_MBA_DSPRI_EVENT_ID = 0x0c, + QOS_MBA_HDL_EVENT_ID = 0x0d, /* Must be the last */ RESCTRL_NUM_EVENT_IDS, }; @@ -165,6 +168,9 @@ enum resctrl_ctrl_type { SCHEMA_COMM = 0, SCHEMA_PRI, SCHEMA_HDL, + SCHEMA_PBM, + SCHEMA_MAX, + SCHEMA_MIN, SCHEMA_NUM_CTRL_TYPE };
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c index 3fd5fc66b204..fc7aa1ae0b82 100644 --- a/arch/arm64/kernel/mpam/mpam_device.c +++ b/arch/arm64/kernel/mpam/mpam_device.c @@ -1142,6 +1142,11 @@ u16 mpam_sysprops_num_pmg(void) return mpam_sysprops.max_pmg + 1; }
+u32 mpam_sysprops_llc_size(void) +{ + return mpam_sysprops.mpam_llc_size; +} + static u32 mpam_device_read_csu_mon(struct mpam_device *dev, struct sync_args *args) { @@ -1317,7 +1322,7 @@ mpam_device_config(struct mpam_device *dev, struct sd_closid *closid, u16 cmax = GENMASK(dev->cmax_wd, 0); u32 pri_val = 0; u16 intpri, dspri, max_intpri, max_dspri; - u32 mbw_pbm, mbw_max; + u32 mbw_pbm, mbw_max, mbw_min; /* * if dev supports narrowing, narrowing first and then apply this slave's * configuration. @@ -1366,6 +1371,13 @@ mpam_device_config(struct mpam_device *dev, struct sd_closid *closid, } }
+ if (mpam_has_feature(mpam_feat_mbw_min, dev->features)) { + if (cfg && mpam_has_feature(mpam_feat_mbw_min, cfg->valid)) { + mbw_min = MBW_MAX_SET(cfg->mbw_min, dev->bwa_wd); + mpam_write_reg(dev, MPAMCFG_MBW_MIN, mbw_min); + } + } + if (mpam_has_feature(mpam_feat_intpri_part, dev->features) || mpam_has_feature(mpam_feat_dspri_part, dev->features)) { if (mpam_has_feature(mpam_feat_intpri_part, cfg->valid) && @@ -1596,6 +1608,11 @@ static void mpam_component_read_mpamcfg(void *_ctx) break; val = mpam_read_reg(dev, MPAMCFG_CPBM); break; + case QOS_CAT_CMAX_EVENT_ID: + if (!mpam_has_feature(mpam_feat_ccap_part, dev->features)) + break; + val = mpam_read_reg(dev, MPAMCFG_CMAX); + break; case QOS_MBA_MAX_EVENT_ID: if (!mpam_has_feature(mpam_feat_mbw_max, dev->features)) break; @@ -1603,6 +1620,20 @@ static void mpam_component_read_mpamcfg(void *_ctx) range = MBW_MAX_BWA_FRACT(dev->bwa_wd); val = MBW_MAX_GET(val, dev->bwa_wd) * (MAX_MBA_BW - 1) / range; break; + case QOS_MBA_MIN_EVENT_ID: + if (!mpam_has_feature(mpam_feat_mbw_min, dev->features)) + break; + val = mpam_read_reg(dev, MPAMCFG_MBW_MIN); + range = MBW_MAX_BWA_FRACT(dev->bwa_wd); + val = MBW_MAX_GET(val, dev->bwa_wd) * (MAX_MBA_BW - 1) / range; + break; + case QOS_MBA_PBM_EVENT_ID: + if (!mpam_has_feature(mpam_feat_mbw_part, dev->features)) + break; + val = mpam_read_reg(dev, MPAMCFG_MBW_PBM); + range = dev->mbw_pbm_bits; + val = val * MAX_MBA_BW / range; + break; case QOS_MBA_HDL_EVENT_ID: if (!mpam_has_feature(mpam_feat_mbw_max, dev->features)) break; diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h index 40dcc02f4e57..cfaef82428aa 100644 --- a/arch/arm64/kernel/mpam/mpam_internal.h +++ b/arch/arm64/kernel/mpam/mpam_internal.h @@ -225,8 +225,10 @@ struct mpam_config { mpam_features_t valid;
u32 cpbm; + u32 cmax; u32 mbw_pbm; u16 mbw_max; + u16 mbw_min;
/* * dspri is downstream priority, intpri is internal priority. @@ -311,6 +313,7 @@ void mpam_component_get_config(struct mpam_component *comp,
u16 mpam_sysprops_num_partid(void); u16 mpam_sysprops_num_pmg(void); +u32 mpam_sysprops_llc_size(void);
void mpam_class_list_lock_held(void);
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index aa57b7ae003f..26e38e6954a2 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -146,6 +146,7 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .base = 16, .evt = QOS_CAT_CPBM_EVENT_ID, .capable = 1, + .ctrl_suffix = "", }, [SCHEMA_PRI] = { .type = SCHEMA_PRI, @@ -153,6 +154,23 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .name = "caPrio", .base = 10, .evt = QOS_CAT_INTPRI_EVENT_ID, + .ctrl_suffix = "PRI", + }, + [SCHEMA_PBM] = { + .type = SCHEMA_PBM, + .flags = SCHEMA_COMM, + .name = "caPbm", + .base = 16, + .evt = QOS_CAT_CPBM_EVENT_ID, + .ctrl_suffix = "PBM", + }, + [SCHEMA_MAX] = { + .type = SCHEMA_MAX, + .flags = SCHEMA_COMM, + .name = "caMax", + .base = 10, + .evt = QOS_CAT_CMAX_EVENT_ID, + .ctrl_suffix = "MAX", }, }, }, @@ -172,6 +190,7 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .base = 16, .evt = QOS_CAT_CPBM_EVENT_ID, .capable = 1, + .ctrl_suffix = "", }, [SCHEMA_PRI] = { .type = SCHEMA_PRI, @@ -179,6 +198,23 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .name = "caPrio", .base = 10, .evt = QOS_CAT_INTPRI_EVENT_ID, + .ctrl_suffix = "PRI", + }, + [SCHEMA_PBM] = { + .type = SCHEMA_PBM, + .flags = SCHEMA_COMM, + .name = "caPbm", + .base = 16, + .evt = QOS_CAT_CPBM_EVENT_ID, + .ctrl_suffix = "PBM", + }, + [SCHEMA_MAX] = { + .type = SCHEMA_MAX, + .flags = SCHEMA_COMM, + .name = "caMax", + .base = 10, + .evt = QOS_CAT_CMAX_EVENT_ID, + .ctrl_suffix = "MAX", }, }, }, @@ -198,6 +234,7 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .base = 10, .evt = QOS_MBA_MAX_EVENT_ID, .capable = 1, + .ctrl_suffix = "", }, [SCHEMA_PRI] = { .type = SCHEMA_PRI, @@ -205,6 +242,7 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .name = "mbPrio", .base = 10, .evt = QOS_MBA_INTPRI_EVENT_ID, + .ctrl_suffix = "PRI", }, [SCHEMA_HDL] = { .type = 
SCHEMA_HDL, @@ -212,6 +250,31 @@ struct raw_resctrl_resource raw_resctrl_resources_all[] = { .name = "mbHdl", .base = 10, .evt = QOS_MBA_HDL_EVENT_ID, + .ctrl_suffix = "HDL", + }, + [SCHEMA_PBM] = { + .type = SCHEMA_PBM, + .flags = SCHEMA_COMM, + .name = "mbPbm", + .base = 16, + .evt = QOS_MBA_PBM_EVENT_ID, + .ctrl_suffix = "PBM", + }, + [SCHEMA_MAX] = { + .type = SCHEMA_MAX, + .flags = SCHEMA_COMM, + .name = "mbMax", + .base = 10, + .evt = QOS_MBA_MAX_EVENT_ID, + .ctrl_suffix = "MAX", + }, + [SCHEMA_MIN] = { + .type = SCHEMA_MIN, + .flags = SCHEMA_COMM, + .name = "mbMin", + .base = 10, + .evt = QOS_MBA_MIN_EVENT_ID, + .ctrl_suffix = "MIN", }, }, }, @@ -270,6 +333,8 @@ parse_bw(char *buf, struct resctrl_resource *r,
switch (rr->ctrl_features[type].evt) { case QOS_MBA_MAX_EVENT_ID: + case QOS_MBA_MIN_EVENT_ID: + case QOS_MBA_PBM_EVENT_ID: if (kstrtoul(buf, rr->ctrl_features[type].base, &data)) return -EINVAL; data = (data < r->mbw.min_bw) ? r->mbw.min_bw : data; @@ -342,6 +407,8 @@ static u64 mbw_rdmsr(struct resctrl_resource *r, struct rdt_domain *d,
switch (rr->ctrl_features[para->type].evt) { case QOS_MBA_MAX_EVENT_ID: + case QOS_MBA_MIN_EVENT_ID: + case QOS_MBA_PBM_EVENT_ID: result = roundup(result, r->mbw.bw_gran); break; default: @@ -1067,14 +1134,21 @@ static int extend_ctrl_enable(char *tok) if (!r->alloc_capable) continue; rr = r->res; - for_each_ctrl_type(type) { + for_each_extend_ctrl_type(type) { feature = &rr->ctrl_features[type]; + if (!feature->capable || !feature->name) + continue; if (strcmp(feature->name, tok)) continue; - if (rr->ctrl_features[type].capable) { - rr->ctrl_features[type].enabled = true; - match = true; - } + + rr->ctrl_features[type].enabled = true; + /* + * If we chose to enable a feature also embraces + * SCHEMA_COMM, SCHEMA_COMM will not be selected. + */ + if (feature->flags == SCHEMA_COMM) + rr->ctrl_features[SCHEMA_COMM].enabled = false;; + match = true; } }
@@ -1088,11 +1162,15 @@ static void extend_ctrl_disable(void) { struct raw_resctrl_resource *rr; struct mpam_resctrl_res *res; + struct resctrl_ctrl_feature *feature; + enum resctrl_ctrl_type type;
for_each_supported_resctrl_exports(res) { rr = res->resctrl_res.res; - rr->ctrl_features[SCHEMA_PRI].enabled = false; - rr->ctrl_features[SCHEMA_HDL].enabled = false; + for_each_extend_ctrl_type(type) { + feature = &rr->ctrl_features[type]; + feature->enabled = false; + } } }
@@ -1104,6 +1182,7 @@ int parse_rdtgroupfs_options(char *data)
disable_cdp(); extend_ctrl_disable(); + basic_ctrl_enable();
while ((token = strsep(&o, ",")) != NULL) { if (!*token) { @@ -1126,8 +1205,6 @@ int parse_rdtgroupfs_options(char *data) } }
- basic_ctrl_enable(); - return 0;
out: @@ -2008,22 +2085,43 @@ mpam_update_from_resctrl_cfg(struct mpam_resctrl_res *res, u64 range;
switch (evt) { + case QOS_MBA_PBM_EVENT_ID: + /* .. the number of bits we can set */ + range = res->class->mbw_pbm_bits; + mpam_cfg->mbw_pbm = + (resctrl_cfg * range) / MAX_MBA_BW; + mpam_set_feature(mpam_feat_mbw_part, &mpam_cfg->valid); + break; case QOS_MBA_MAX_EVENT_ID: - /* .. the number of fractions we can represent */ range = MBW_MAX_BWA_FRACT(res->class->bwa_wd); mpam_cfg->mbw_max = (resctrl_cfg * range) / (MAX_MBA_BW - 1); mpam_cfg->mbw_max = (mpam_cfg->mbw_max > range) ? range : mpam_cfg->mbw_max; mpam_set_feature(mpam_feat_mbw_max, &mpam_cfg->valid); break; + case QOS_MBA_MIN_EVENT_ID: + range = MBW_MAX_BWA_FRACT(res->class->bwa_wd); + mpam_cfg->mbw_min = (resctrl_cfg * range) / (MAX_MBA_BW - 1); + mpam_cfg->mbw_min = + (mpam_cfg->mbw_min > range) ? range : mpam_cfg->mbw_min; + mpam_set_feature(mpam_feat_mbw_min, &mpam_cfg->valid); + break; case QOS_MBA_HDL_EVENT_ID: mpam_cfg->hdl = resctrl_cfg; mpam_set_feature(mpam_feat_part_hdl, &mpam_cfg->valid); break; + case QOS_MBA_INTPRI_EVENT_ID: + mpam_cfg->intpri = resctrl_cfg; + mpam_set_feature(mpam_feat_intpri_part, &mpam_cfg->valid); + break; case QOS_CAT_CPBM_EVENT_ID: mpam_cfg->cpbm = resctrl_cfg; mpam_set_feature(mpam_feat_cpor_part, &mpam_cfg->valid); break; + case QOS_CAT_CMAX_EVENT_ID: + mpam_cfg->cmax = resctrl_cfg; + mpam_set_feature(mpam_feat_ccap_part, &mpam_cfg->valid); + break; case QOS_CAT_INTPRI_EVENT_ID: mpam_cfg->intpri = resctrl_cfg; mpam_set_feature(mpam_feat_intpri_part, &mpam_cfg->valid); @@ -2079,7 +2177,7 @@ mpam_resctrl_update_component_cfg(struct resctrl_resource *r,
resctrl_cfg = d->ctrl_val[type][intpartid]; mpam_update_from_resctrl_cfg(res, resctrl_cfg, - type, slave_mpam_cfg); + rr->ctrl_features[type].evt, slave_mpam_cfg); } }
diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c index 18b1e5db5c0a..51817091f119 100644 --- a/arch/arm64/kernel/mpam/mpam_setup.c +++ b/arch/arm64/kernel/mpam/mpam_setup.c @@ -346,18 +346,17 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->res = rr;
if (mpam_has_feature(mpam_feat_mbw_part, class->features)) { - res->resctrl_mba_uses_mbw_part = true; - /* * The maximum throttling is the number of bits we can * unset in the bitmap. We never clear all of them, * so the minimum is one bit, as a percentage. */ r->mbw.min_bw = MAX_MBA_BW / class->mbw_pbm_bits; - } else { - /* we're using mpam_feat_mbw_max's */ - res->resctrl_mba_uses_mbw_part = false; + rr->ctrl_features[SCHEMA_PBM].max_wd = MAX_MBA_BW + 1; + rr->ctrl_features[SCHEMA_PBM].capable = true; + }
+ if (mpam_has_feature(mpam_feat_mbw_max, class->features)) { /* * The maximum throttling is the number of fractions we * can represent with the implemented bits. We never @@ -366,22 +365,36 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) r->mbw.min_bw = MAX_MBA_BW / ((1ULL << class->bwa_wd) - 1); /* the largest mbw_max is 100 */ - rr->ctrl_features[SCHEMA_COMM].default_ctrl = MAX_MBA_BW; - rr->ctrl_features[SCHEMA_COMM].max_wd = MAX_MBA_BW + 1; - rr->ctrl_features[SCHEMA_COMM].capable = true; + rr->ctrl_features[SCHEMA_MAX].default_ctrl = MAX_MBA_BW; + rr->ctrl_features[SCHEMA_MAX].max_wd = MAX_MBA_BW + 1; + rr->ctrl_features[SCHEMA_MAX].capable = true; + + /* default set max stride MAX as COMMON ctrl feature */ + rr->ctrl_features[SCHEMA_COMM].default_ctrl = + rr->ctrl_features[SCHEMA_MAX].default_ctrl; + rr->ctrl_features[SCHEMA_COMM].max_wd = + rr->ctrl_features[SCHEMA_MAX].max_wd; + rr->ctrl_features[SCHEMA_COMM].capable = + rr->ctrl_features[SCHEMA_MAX].capable; + } + + if (mpam_has_feature(mpam_feat_mbw_min, class->features)) { + rr->ctrl_features[SCHEMA_MIN].max_wd = MAX_MBA_BW + 1; + rr->ctrl_features[SCHEMA_MIN].capable = true; }
+ /* + * Export the priority setting, which represents the max level of + * control we can export. The default priority comes from hardware, + * so there is no need to define an additional default value. + */ if (mpam_has_feature(mpam_feat_intpri_part, class->features)) { - /* - * Export internal priority setting, which represents the - * max level of control we can export to resctrl. this default - * priority is from hardware, no clever here. - */ rr->ctrl_features[SCHEMA_PRI].max_wd = 1 << class->intpri_wd; rr->ctrl_features[SCHEMA_PRI].default_ctrl = class->hwdef_intpri; rr->ctrl_features[SCHEMA_PRI].capable = true; }
+ /* Just in case we have an excessive number of bits */ if (!r->mbw.min_bw) r->mbw.min_bw = 1; @@ -413,18 +426,26 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res)
if (mpam_has_feature(mpam_feat_cpor_part, class->features)) { r->cache.cbm_len = class->cpbm_wd; - rr->ctrl_features[SCHEMA_COMM].default_ctrl = GENMASK(class->cpbm_wd - 1, 0); - rr->ctrl_features[SCHEMA_COMM].max_wd = - rr->ctrl_features[SCHEMA_COMM].default_ctrl + 1; - rr->ctrl_features[SCHEMA_COMM].capable = true; + rr->ctrl_features[SCHEMA_PBM].default_ctrl = GENMASK(class->cpbm_wd - 1, 0); + rr->ctrl_features[SCHEMA_PBM].max_wd = + rr->ctrl_features[SCHEMA_PBM].default_ctrl + 1; + rr->ctrl_features[SCHEMA_PBM].capable = true; /* * Which bits are shared with other ...things... * Unknown devices use partid-0 which uses all the bitmap * fields. Until we configured the SMMU and GIC not to do this * 'all the bits' is the correct answer here. */ - r->cache.shareable_bits = rr->ctrl_features[SCHEMA_COMM].default_ctrl; + r->cache.shareable_bits = rr->ctrl_features[SCHEMA_PBM].default_ctrl; r->cache.min_cbm_bits = 1; + + /* default set CPBM as COMMON ctrl feature */ + rr->ctrl_features[SCHEMA_COMM].default_ctrl = + rr->ctrl_features[SCHEMA_PBM].default_ctrl; + rr->ctrl_features[SCHEMA_COMM].max_wd = + rr->ctrl_features[SCHEMA_PBM].max_wd; + rr->ctrl_features[SCHEMA_COMM].capable = + rr->ctrl_features[SCHEMA_PBM].capable; }
if (mpam_has_feature(mpam_feat_intpri_part, class->features)) { @@ -437,6 +458,12 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) rr->ctrl_features[SCHEMA_PRI].default_ctrl = class->hwdef_intpri; rr->ctrl_features[SCHEMA_PRI].capable = true; } + + if (mpam_has_feature(mpam_feat_ccap_part, class->features)) { + rr->ctrl_features[SCHEMA_MAX].max_wd = mpam_sysprops_llc_size() + 1; + rr->ctrl_features[SCHEMA_MAX].capable = true; + } + /* * Only this resource is allocable can it be picked from * mpam_resctrl_pick_caches(). So directly set following @@ -464,10 +491,11 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res)
if (mpam_has_feature(mpam_feat_cpor_part, class->features)) { r->cache.cbm_len = class->cpbm_wd; - rr->ctrl_features[SCHEMA_COMM].default_ctrl = GENMASK(class->cpbm_wd - 1, 0); - rr->ctrl_features[SCHEMA_COMM].max_wd = - rr->ctrl_features[SCHEMA_COMM].default_ctrl + 1; - rr->ctrl_features[SCHEMA_COMM].capable = true; + rr->ctrl_features[SCHEMA_PBM].default_ctrl = + GENMASK(class->cpbm_wd - 1, 0); + rr->ctrl_features[SCHEMA_PBM].max_wd = + rr->ctrl_features[SCHEMA_PBM].default_ctrl + 1; + rr->ctrl_features[SCHEMA_PBM].capable = true; /* * Which bits are shared with other ...things... * Unknown devices use partid-0 which uses all the bitmap @@ -475,6 +503,18 @@ static int mpam_resctrl_resource_init(struct mpam_resctrl_res *res) * 'all the bits' is the correct answer here. */ r->cache.shareable_bits = rr->ctrl_features[SCHEMA_COMM].default_ctrl; + /* default set CPBM as COMMON ctrl feature */ + rr->ctrl_features[SCHEMA_COMM].default_ctrl = + rr->ctrl_features[SCHEMA_PBM].default_ctrl; + rr->ctrl_features[SCHEMA_COMM].max_wd = + rr->ctrl_features[SCHEMA_PBM].max_wd; + rr->ctrl_features[SCHEMA_COMM].capable = + rr->ctrl_features[SCHEMA_PBM].capable; + } + + if (mpam_has_feature(mpam_feat_ccap_part, class->features)) { + rr->ctrl_features[SCHEMA_MAX].max_wd = ~0; + rr->ctrl_features[SCHEMA_MAX].capable = true; }
if (mpam_has_feature(mpam_feat_intpri_part, class->features)) {
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: bugfix bugzilla: 34278 CVE: NA
-------------------------------------------------
The schemata list is created only when the resctrl sysfs is mounted; for correct error handling we must destroy it when any of the subsequent mount steps fails after its creation.
Fixes: 7e9b5caeefff ("arm64/mpam: resctrl: Add helpers for init and destroy schemata list") Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Jian Cheng cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/resctrlfs.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c
index cfa09344ad5d..8b9803ffa6e6 100644
--- a/fs/resctrlfs.c
+++ b/fs/resctrlfs.c
@@ -372,13 +372,13 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type,
 	ret = resctrl_id_init();
 	if (ret) {
 		dentry = ERR_PTR(ret);
-		goto out_options;
+		goto out_schema;
 	}

 	ret = resctrl_group_create_info_dir(resctrl_group_default.kn, &kn_info);
 	if (ret) {
 		dentry = ERR_PTR(ret);
-		goto out_options;
+		goto out_schema;
 	}

 	if (resctrl_mon_capable) {
@@ -425,6 +425,8 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type,
 		kernfs_remove(kn_mongrp);
 out_info:
	kernfs_remove(kn_info);
+out_schema:
+	schemata_list_destroy();
 out_options:
	release_resctrl_group_fs_options();
 out:
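The error-unwinding order this patch establishes can be sketched outside the kernel. The helper names below are illustrative stand-ins, not the real resctrl functions; the point is only that each label undoes exactly the steps completed before the failure:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state flags standing in for the schemata list and info dir. */
static int schema_created, info_created;

static int create_schema(void)   { schema_created = 1; return 0; }
static void destroy_schema(void) { schema_created = 0; }
static int create_info(bool fail){ if (fail) return -1; info_created = 1; return 0; }

/* Mount-style sequence: a failure after the schemata list is created
 * must jump to a label that tears the list down again. */
static int do_mount(bool fail_info)
{
	int ret;

	ret = create_schema();
	if (ret)
		goto out;
	ret = create_info(fail_info);
	if (ret)
		goto out_schema;   /* unwind the schemata list, as the fix does */
	return 0;

out_schema:
	destroy_schema();
out:
	return ret;
}
```

Running `do_mount(true)` leaves no schemata state behind, which is exactly what the added `out_schema` label guarantees.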
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: bugfix bugzilla: 34278 CVE: NA
-------------------------------------------------
Now that different types of controls can be configured for a single resource, a stale value left over from the previous mount is applied to the default group after remounting.
e.g.
> mount -t resctrl resctrl /sys/fs/resctrl/ -o mbMax,mbMin && cd resctrl/
> echo 'MBMIN:0=2;1=2;2=2;3=2' > schemata
> cat schemata
L3:0=7fff;1=7fff;2=7fff;3=7fff
MBMAX:0=100;1=100;2=100;3=100
MBMIN:0=2;1=2;2=2;3=2
> cd .. && umount /sys/fs/resctrl/
> mount -t resctrl resctrl /sys/fs/resctrl/ -o mbMax,mbMin && cd resctrl/ && cat schemata
L3:0=7fff;1=7fff;2=7fff;3=7fff
MBMAX:0=100;1=100;2=100;3=100
MBMIN:0=0;1=0;2=0;3=0
> echo 'MBMAX:0=10;1=10;2=10;3=10' > schemata
> cat schemata
L3:0=7fff;1=7fff;2=7fff;3=7fff
MBMAX:0=10;1=10;2=10;3=10
MBMIN:0=2;1=2;2=2;3=2   # stale MBMIN value from the previous mount reappears
When writing the schemata sysfile, the call path is:

resctrl_group_schemata_write()
  -> resctrl_update_groups_config()
    -> resctrl_group_update_domains()
      -> resctrl_group_update_domain_ctrls() {
           ... /* refreshes the new_ctrl array of each supported conf type once per resource */
         }
We should therefore refresh the new_ctrl field of struct resctrl_staged_config via resctrl_group_init_alloc() before resctrl_group_update_domain_ctrls() is called.
Fixes: 6b2471f089be ("arm64/mpam: resctrl: Support priority and hardlimit(Memory bandwidth) configuration") Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Jian Cheng cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/resctrlfs.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c
index 8b9803ffa6e6..6a51212afdac 100644
--- a/fs/resctrlfs.c
+++ b/fs/resctrlfs.c
@@ -375,6 +375,10 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type,
 		goto out_schema;
 	}

+	ret = resctrl_group_init_alloc(&resctrl_group_default);
+	if (ret < 0)
+		goto out_schema;
+
 	ret = resctrl_group_create_info_dir(resctrl_group_default.kn, &kn_info);
 	if (ret) {
 		dentry = ERR_PTR(ret);
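The failure mode can be modelled with a toy staged-config: if the staged new_ctrl values are not re-initialized before each update pass, a control type that was not written in the current pass silently reuses whatever the previous pass left behind. All names and the two-type layout here are illustrative, not the real resctrl structures:

```c
#include <assert.h>

#define NTYPES 2                       /* e.g. MBMAX and MBMIN */

struct staged_cfg {
	int new_ctrl;                  /* value to program for this type */
};

static struct staged_cfg staged[NTYPES];
static int applied[NTYPES];
static const int dflt[NTYPES] = { 100, 0 };  /* assumed per-type defaults */

/* Mirrors the role of resctrl_group_init_alloc(): reset staging to the
 * per-type defaults so no stale value survives a remount. */
static void init_staged(void)
{
	int t;

	for (t = 0; t < NTYPES; t++)
		staged[t].new_ctrl = dflt[t];
}

/* Mirrors resctrl_group_update_domain_ctrls(): apply whatever is staged. */
static void update_domain_ctrls(void)
{
	int t;

	for (t = 0; t < NTYPES; t++)
		applied[t] = staged[t].new_ctrl;
}
```

Without the `init_staged()` call between mounts, a write to only type 0 would re-apply type 1's value from the previous mount, which is the "wrong history value" the commit message describes.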
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: bugfix bugzilla: 34278 CVE: NA
-------------------------------------------------
Unlike mbw max (memory bandwidth maximum), sometimes we do not want to use the mbw min feature (the minimum-bandwidth partition control programmed via MPAMCFG_MBW_MIN, the MBMIN row in the schemata) and instead want to set MPAMCFG_MBW_MIN to 0.
e.g.
> mount -t resctrl resctrl /sys/fs/resctrl/ -o mbMin
> cd resctrl/ && cat schemata
L3:0=7fff;1=7fff;2=7fff;3=7fff
MBMIN:0=0;1=0;2=0;3=0
# before the fix
> echo 'MBMIN:0=0;1=0;2=0;3=0' > schemata
> cat schemata
L3:0=7fff;1=7fff;2=7fff;3=7fff
MBMIN:0=2;1=2;2=2;3=2
# after the fix
> echo 'MBMIN:0=0;1=0;2=0;3=0' > schemata
> cat schemata
L3:0=7fff;1=7fff;2=7fff;3=7fff
MBMIN:0=0;1=0;2=0;3=0
Fixes: 5a49c4f1983d ("arm64/mpam: Supplement additional useful ctrl features for mount options") Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Jian Cheng cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_resctrl.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c
index 26e38e6954a2..05d34181df90 100644
--- a/arch/arm64/kernel/mpam/mpam_resctrl.c
+++ b/arch/arm64/kernel/mpam/mpam_resctrl.c
@@ -333,13 +333,18 @@ parse_bw(char *buf, struct resctrl_resource *r,
 	switch (rr->ctrl_features[type].evt) {
 	case QOS_MBA_MAX_EVENT_ID:
-	case QOS_MBA_MIN_EVENT_ID:
 	case QOS_MBA_PBM_EVENT_ID:
 		if (kstrtoul(buf, rr->ctrl_features[type].base, &data))
 			return -EINVAL;
 		data = (data < r->mbw.min_bw) ? r->mbw.min_bw : data;
 		data = roundup(data, r->mbw.bw_gran);
 		break;
+	case QOS_MBA_MIN_EVENT_ID:
+		if (kstrtoul(buf, rr->ctrl_features[type].base, &data))
+			return -EINVAL;
+		/* for the mbw min feature, a setting of 0 is allowed */
+		data = roundup(data, r->mbw.bw_gran);
+		break;
 	default:
 		if (kstrtoul(buf, rr->ctrl_features[type].base, &data))
 			return -EINVAL;
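The difference between the two parse paths can be modelled in isolation. BW_GRAN and MIN_BW below are assumed example values (the real ones live in r->mbw and are filled in during resource probing), and ROUNDUP mirrors the kernel's roundup() macro:

```c
#include <assert.h>

#define BW_GRAN  2   /* assumed bandwidth granularity (percent) */
#define MIN_BW   2   /* assumed hardware minimum bandwidth (percent) */

#define ROUNDUP(x, y) ((((x) + (y) - 1) / (y)) * (y))

/* MBMAX/MBPBM path: values below the hardware minimum are raised to it. */
static unsigned long parse_bw_max(unsigned long data)
{
	data = (data < MIN_BW) ? MIN_BW : data;
	return ROUNDUP(data, BW_GRAN);
}

/* MBMIN path after the fix: 0 is a legal setting, so no lower clamp. */
static unsigned long parse_bw_min(unsigned long data)
{
	return ROUNDUP(data, BW_GRAN);
}
```

With the old shared path, writing 0 to MBMIN went through the clamp and came back as MIN_BW, which is exactly the `MBMIN:0=2` symptom in the transcript above.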
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: bugfix bugzilla: 34278 CVE: NA
-------------------------------------------------
This fixes two problems:
1) when a cpu goes offline, we should clear it from the cpu mask of every associated resctrl group, not just the default group.

2) when a cpu comes online, we should set it in the default group's cpu mask and, if cdp is enabled, restore the default group's cpus to the default state; this fills the code and data fields of the mpam sysregs with appropriate values.
Fixes: 2e2c511ff49d ("arm64/mpam: resctrl: Handle cpuhp and resctrl_dom allocation") Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Jian Cheng cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/include/asm/resctrl.h | 25 +++++++++++++++++++++++++ arch/arm64/kernel/mpam/mpam_resctrl.c | 15 ++++++++++++--- fs/resctrlfs.c | 22 ---------------------- 3 files changed, 37 insertions(+), 25 deletions(-)
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h index 95148e1bb232..a31df7fdc477 100644 --- a/arch/arm64/include/asm/resctrl.h +++ b/arch/arm64/include/asm/resctrl.h @@ -447,6 +447,31 @@ int parse_rdtgroupfs_options(char *data);
int resctrl_group_add_files(struct kernfs_node *kn, unsigned long fflags);
+static inline void resctrl_cdp_update_cpus_state(struct resctrl_group *rdtgrp)
+{
+	int cpu;
+
+	/*
+	 * If cdp is on, tasks in the resctrl default group with closid=0
+	 * and rmid=0 cannot fill the proper partid_i/pmg_i and
+	 * partid_d/pmg_d into the MPAMx_ELx sysregs via mpam_sched_in()
+	 * called from __switch_to(), because the current cpu's default
+	 * closid and rmid are also 0, which makes the configuration
+	 * update look like a no-op. So set each cpu's default closid to
+	 * a non-zero value and call update_closid_rmid() to program the
+	 * proper partid and pmg into each cpu's MPAMx_ELx sysregs when
+	 * mounting the resctrl sysfs. Besides, to support cpu online
+	 * and offline we should set cur_closid to 0.
+	 */
+	for_each_cpu(cpu, &rdtgrp->cpu_mask) {
+		per_cpu(pqr_state.default_closid, cpu) = ~0;
+		per_cpu(pqr_state.cur_closid, cpu) = 0;
+	}
+
+	update_closid_rmid(&rdtgrp->cpu_mask, NULL);
+}
+
 #define RESCTRL_MAX_CBM 32
/* diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c index 05d34181df90..a7fef0ac36f2 100644 --- a/arch/arm64/kernel/mpam/mpam_resctrl.c +++ b/arch/arm64/kernel/mpam/mpam_resctrl.c @@ -90,15 +90,24 @@ static bool resctrl_cdp_enabled;
 int mpam_resctrl_set_default_cpu(unsigned int cpu)
 {
-	/* The cpu is set in default rdtgroup after online. */
+	/* The cpu is set in the default rdtgroup after it comes online. */
 	cpumask_set_cpu(cpu, &resctrl_group_default.cpu_mask);
+
+	/* Update the cpu's mpam sysregs' default setting when cdp is enabled */
+	if (resctrl_cdp_enabled)
+		resctrl_cdp_update_cpus_state(&resctrl_group_default);
+
 	return 0;
 }

 void mpam_resctrl_clear_default_cpu(unsigned int cpu)
 {
-	/* The cpu is set in default rdtgroup after online. */
-	cpumask_clear_cpu(cpu, &resctrl_group_default.cpu_mask);
+	struct resctrl_group *rdtgrp;
+
+	list_for_each_entry(rdtgrp, &resctrl_all_groups, resctrl_group_list) {
+		/* The cpu is cleared from each associated rdtgroup after offline. */
+		cpumask_clear_cpu(cpu, &rdtgrp->cpu_mask);
+	}
 }
bool is_resctrl_cdp_enabled(void) diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c index 6a51212afdac..7779d6ec3e27 100644 --- a/fs/resctrlfs.c +++ b/fs/resctrlfs.c @@ -320,28 +320,6 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn, return ret; }
-static void resctrl_cdp_update_cpus_state(struct resctrl_group *r)
-{
-	int cpu;
-
-	/*
-	 * If cdp on, tasks in resctrl default group with closid=0
-	 * and rmid=0 don't know how to fill proper partid_i/pmg_i
-	 * and partid_d/pmg_d into MPAMx_ELx sysregs by mpam_sched_in()
-	 * called by __switch_to(), it's because current cpu's default
-	 * closid and rmid are also equal to 0 and to make the operation
-	 * modifying configuration passed. Update per cpu default closid
-	 * of none-zero value, call update_closid_rmid() to update each
-	 * cpu's mpam proper MPAMx_ELx sysregs for setting partid and
-	 * pmg when mounting resctrl sysfs, it looks like a practical
-	 * method.
-	 */
-	for_each_cpu(cpu, &r->cpu_mask)
-		per_cpu(pqr_state.default_closid, cpu) = ~0;
-
-	update_closid_rmid(&r->cpu_mask, NULL);
-}
-
 static struct dentry *resctrl_mount(struct file_system_type *fs_type,
 				    int flags, const char *unused_dev_name,
 				    void *data)
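The offline-path half of the fix boils down to iterating every group rather than just the default one. A minimal model, using one bitmask per group in place of struct cpumask (group 0 standing in for the default group; the layout is illustrative):

```c
#include <assert.h>

#define NGROUPS 3

/* Hypothetical per-group cpu masks, one bit per cpu. */
static unsigned long cpu_mask[NGROUPS];

/* Before the fix only group 0 (the default group) was cleared. The fix
 * walks every group, since the cpu may have been moved into a
 * non-default group before going offline. */
static void clear_cpu_all_groups(int cpu)
{
	int g;

	for (g = 0; g < NGROUPS; g++)
		cpu_mask[g] &= ~(1UL << cpu);
}
```

Clearing only `cpu_mask[0]` would leave a stale bit behind in whichever group the cpu actually belonged to, which is the bug the list walk removes.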
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: bugfix bugzilla: 34278 CVE: NA
-------------------------------------------------
When cpus come online, domains may be inserted into the resctrl_resource structure's domains list out of order, so keep the list sorted by domain id.
Fixes: 2e2c511ff49d ("arm64/mpam: resctrl: Handle cpuhp and resctrl_dom allocation") Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Jian Cheng cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- arch/arm64/kernel/mpam/mpam_setup.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/mpam/mpam_setup.c b/arch/arm64/kernel/mpam/mpam_setup.c
index 51817091f119..8fd76dc38d89 100644
--- a/arch/arm64/kernel/mpam/mpam_setup.c
+++ b/arch/arm64/kernel/mpam/mpam_setup.c
@@ -59,12 +59,14 @@ mpam_get_domain_from_cpu(int cpu, struct mpam_resctrl_res *res)
 static int mpam_resctrl_setup_domain(unsigned int cpu,
 				     struct mpam_resctrl_res *res)
 {
+	struct rdt_domain *d;
 	struct mpam_resctrl_dom *dom;
 	struct mpam_class *class = res->class;
 	struct mpam_component *comp_iter, *comp;
 	u32 num_partid;
 	u32 **ctrlval_ptr;
 	enum resctrl_ctrl_type type;
+	struct list_head *tmp;

 	num_partid = mpam_sysprops_num_partid();

@@ -99,8 +101,17 @@ static int mpam_resctrl_setup_domain(unsigned int cpu,
 		}
 	}

-	/* TODO: this list should be sorted */
-	list_add_tail(&dom->resctrl_dom.list, &res->resctrl_res.domains);
+	tmp = &res->resctrl_res.domains;
+	/* insert domains in ascending id order */
+	list_for_each_entry(d, &res->resctrl_res.domains, list) {
+		/* remember the last domain whose id is smaller than the new one */
+		if (dom->resctrl_dom.id > d->id)
+			tmp = &d->list;
+		if (dom->resctrl_dom.id < d->id)
+			break;
+	}
+	list_add(&dom->resctrl_dom.list, tmp);
+	res->resctrl_res.dom_num++;

 	return 0;
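The insertion logic keeps the domains list sorted as cpus come online in arbitrary order. The same idea on a plain singly linked list, with struct dom as a hypothetical stand-in for the kernel's rdt_domain:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for an rdt_domain: only the id matters for ordering. */
struct dom {
	int id;
	struct dom *next;
};

/* Insert 'nd' so the list stays in ascending id order, mirroring the
 * list_for_each_entry() walk that finds the insertion point. */
static void dom_insert_sorted(struct dom **head, struct dom *nd)
{
	struct dom **pos = head;

	/* advance past every node whose id is smaller than the new one */
	while (*pos && (*pos)->id < nd->id)
		pos = &(*pos)->next;
	nd->next = *pos;
	*pos = nd;
}
```

Inserting ids 2, 0, 3, 1 in that order yields the list 0, 1, 2, 3, so resctrl's domain directories come out in id order regardless of cpu hotplug order.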
From: Wang ShaoBo bobo.shaobowang@huawei.com
hulk inclusion category: bugfix bugzilla: 34278 CVE: NA
-------------------------------------------------
Set dentry before jumping to the error handling branch; otherwise an uninitialized value may be returned.
fs/resctrlfs.c: In function ‘resctrl_mount’:
fs/resctrlfs.c:419:9: warning: ‘dentry’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  return dentry;
  ^~~~~~
Fixes: eb870a0d4e33 ("arm64/mpam: resctrl: Use resctrl_group_init_alloc() for default group") Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Jian Cheng cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- fs/resctrlfs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c
index 7779d6ec3e27..532b944e922c 100644
--- a/fs/resctrlfs.c
+++ b/fs/resctrlfs.c
@@ -354,8 +354,10 @@ static struct dentry *resctrl_mount(struct file_system_type *fs_type,
 	}

 	ret = resctrl_group_init_alloc(&resctrl_group_default);
-	if (ret < 0)
+	if (ret < 0) {
+		dentry = ERR_PTR(ret);
 		goto out_schema;
+	}

 	ret = resctrl_group_create_info_dir(resctrl_group_default.kn, &kn_info);
 	if (ret) {