[PATCH OLK-6.6 0/3] Updates of HiSilicon Uncore L3C PMU

Updates of HiSilicon Uncore L3C PMU
---

Support the new version of the L3C PMU, which provides an extended event
space that can be controlled in up to 2 extra address spaces with separate
overflow interrupts. The layout of the control/event registers is kept the
same. The extended events together with the original ones cover the
monitoring of all transactions on the L3C. In other words, the driver
supports finer-grained statistics of the L3 cache with separate, dedicated
PMUs, and a new operand `ext` hints which part a perf counting command
should be delivered to. An extended event is specified with the `ext=[1|2]`
option for the driver to distinguish:

  perf stat -e hisi_sccl0_l3c0_0/event=<event_id>,ext=<ext>/

Currently only the event option uses config bits [7:0] and there is still
plenty of unused space. Make ext use config bits [17:16] and reserve bits
[15:8] for the event option for future extension.

With the capability of extra counters, the number of counters for a
HiSilicon uncore PMU can reach up to 24, so the used_mask is extended
accordingly. hw_perf_event::event_base is initialized to the base MMIO
address of the event and is used later for control, overflow handling and
count readout.

We still make use of the uncore PMU framework for handling the events and
interrupt migration on CPU hotplug. The framework's cpuhp callback handles
the event migration and interrupt migration of the original events; if the
PMU supports extended events, the interrupts of the extended events are
migrated to the same CPU chosen by the framework.

A new HID of HISI0215 is used for this version of the L3C PMU.

Some necessary refactoring is included, allowing the framework to cope with
the new version of the driver.

Yicong Yang (1):
  drivers/perf: hisi: Add support for L3C PMU v3

Yushan Wang (2):
  Documentation: hisi-pmu: Fix of minor format error
  Documentation: hisi-pmu: Add introduction to HiSilicon

 Documentation/admin-guide/perf/hisi-pmu.rst  |  44 ++-
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 342 ++++++++++---------
 drivers/perf/hisilicon/hisi_uncore_pmu.h     |   2 +-
 3 files changed, 214 insertions(+), 174 deletions(-)

-- 
2.33.0
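As an illustration of the scheme (a rough user-space sketch, not driver
code: the helper names and the sample config value are made up, while the
bit positions and the eight-counters-per-address-space split follow the
cover letter and patch 1):

  #include <stdio.h>
  #include <stdint.h>

  /* Mirrors the layout described above: event in config[7:0], ext in config[17:16]. */
  #define EVENT_MASK   0xffULL
  #define EXT_SHIFT    16
  #define EXT_MASK     0x3ULL

  /* 8 hardware counters per address space, 1 normal + up to 2 ext spaces = 24. */
  #define NR_COUNTERS  8

  static unsigned int cfg_event(uint64_t config) { return config & EVENT_MASK; }
  static unsigned int cfg_ext(uint64_t config)   { return (config >> EXT_SHIFT) & EXT_MASK; }

  int main(void)
  {
          /* perf stat -e .../event=0x18,ext=1/ would arrive as config = 0x10018. */
          uint64_t config = (1ULL << EXT_SHIFT) | 0x18;
          unsigned int ext = cfg_ext(config);

          /* Logical counter index range reserved for this event's address space. */
          unsigned int first = ext * NR_COUNTERS;
          unsigned int last  = (ext + 1) * NR_COUNTERS - 1;

          printf("event=0x%x ext=%u -> logical counters [%u, %u], hw counter = idx %% %u\n",
                 cfg_event(config), ext, first, last, NR_COUNTERS);
          return 0;
  }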

From: Yicong Yang <yangyicong@hisilicon.com>

driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/ICTSJJ

----------------------------------------------------------------------

This patch adds support for L3C PMU v3. The v3 L3C PMU supports an
extended event space which can be controlled in up to 2 extra address
spaces with separate overflow interrupts. The layout of the
control/event registers is kept the same. The extended events together
with the original ones cover the monitoring of all transactions on the
L3C.

An extended event is specified with the `ext=[1|2]` option for the
driver to distinguish, like below:

  perf stat -e hisi_sccl0_l3c0_0/event=<event_id>,ext=1/

Currently only the event option uses config bits [7:0] and there is
still plenty of unused space. Make ext use config bits [17:16] and
reserve bits [15:8] for the event option for future extension.

With the capability of extra counters, the number of counters for a
HiSilicon uncore PMU can reach up to 24, so the used_mask is extended
accordingly. hw_perf_event::event_base is initialized to the base MMIO
address of the event and is used later for control, overflow handling
and count readout.

We still make use of the uncore PMU framework for handling the events
and interrupt migration on CPU hotplug. The framework's cpuhp callback
handles the event migration and interrupt migration of the original
events; if the PMU supports extended events, the interrupts of the
extended events are migrated to the same CPU chosen by the framework.

A new HID of HISI0215 is used for this version of the L3C PMU.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Co-developed-by: Yushan Wang <wangyushan12@huawei.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 342 ++++++++++---------
 drivers/perf/hisilicon/hisi_uncore_pmu.h     |   2 +-
 2 files changed, 175 insertions(+), 169 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
index 00ed571a3030..8ec1acc6846e 100644
--- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
@@ -56,7 +56,9 @@
 #define L3C_V1_NR_EVENTS	0x59
 #define L3C_V2_NR_EVENTS	0xFF
 
-HISI_PMU_EVENT_ATTR_EXTRACTOR(ext, config, 16, 16);
+#define L3C_MAX_EXT		2
+
+HISI_PMU_EVENT_ATTR_EXTRACTOR(ext, config, 17, 16);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_req, config1, 10, 8);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_cfg, config1, 15, 11);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_skt, config1, 16, 16);
@@ -65,12 +67,11 @@ HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config2, 15, 0);
 
 struct hisi_l3c_pmu {
 	struct hisi_pmu l3c_pmu;
-	unsigned long feature;
-#define L3C_PMU_FEAT_EXT	0x1
 
 	/* MMIO and IRQ resources for extension events */
-	void __iomem *ext_base;
-	int ext_irq;
+	void __iomem *ext_base[L3C_MAX_EXT];
+	int ext_irq[L3C_MAX_EXT];
+	int ext_num;
 };
 
 #define to_hisi_l3c_pmu(_l3c_pmu) \
@@ -82,47 +83,59 @@ struct hisi_l3c_pmu {
  */
 #define L3C_HW_IDX(_idx)	((_idx) % L3C_NR_COUNTERS)
 
+/* The ext resource number to which a hardware counter belongs. */
+#define L3C_CNTR_EXT(_idx)	((_idx) / L3C_NR_COUNTERS)
+
+struct hisi_l3c_pmu_ext {
+	bool support_ext;
+};
+
+static bool support_ext(struct hisi_l3c_pmu *pmu)
+{
+	struct hisi_l3c_pmu_ext *l3c_pmu_ext = pmu->l3c_pmu.dev_info->private;
+
+	return l3c_pmu_ext->support_ext;
+}
+
 static int hisi_l3c_pmu_get_event_idx(struct perf_event *event)
 {
 	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
 	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
-	u32 num_counters = l3c_pmu->num_counters;
-	struct hisi_l3c_pmu *hisi_l3c_pmu;
+	int ext = hisi_get_ext(event);
 	int idx;
 
-	hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
-
 	/*
 	 * For an L3C PMU that supports extension events, we can monitor
-	 * maximum 2 * num_counters events. Thus use bit [0, num_counters - 1]
-	 * for normal events and bit [num_counters, 2 * num_counters - 1] for
-	 * extension events. The idx allocation will keep unchanged for normal
-	 * events and we can also use the idx to distinguish whether it's an
-	 * extension event or not.
+	 * maximum 2 * num_counters to 3 * num_counters events, depending on
+	 * the number of ext regions supported by hardware. Thus use bit
+	 * [0, num_counters - 1] for normal events and bit
+	 * [ext * num_counters, (ext + 1) * num_counters - 1] for extension
+	 * events. The idx allocation will keep unchanged for normal events and
+	 * we can also use the idx to distinguish whether it's an extension
+	 * event or not.
 	 *
 	 * Since normal events and extension events locates on the different
 	 * address space, save the base address to the event->hw.event_base.
 	 */
-	if (hisi_get_ext(event)) {
-		if (!(hisi_l3c_pmu->feature & L3C_PMU_FEAT_EXT))
+	if (ext) {
+		if (!support_ext(hisi_l3c_pmu))
 			return -EOPNOTSUPP;
 
-		event->hw.event_base = (unsigned long)hisi_l3c_pmu->ext_base;
-		idx = find_next_zero_bit(used_mask, num_counters, L3C_NR_COUNTERS);
+		event->hw.event_base = (unsigned long)hisi_l3c_pmu->ext_base[ext - 1];
+		idx = find_next_zero_bit(used_mask, L3C_NR_COUNTERS * (ext + 1),
+					 L3C_NR_COUNTERS * ext);
 	} else {
 		event->hw.event_base = (unsigned long)l3c_pmu->base;
 		idx = find_next_zero_bit(used_mask, L3C_NR_COUNTERS, 0);
-		if (idx == L3C_NR_COUNTERS)
-			idx = num_counters;
 	}
 
-	if (idx == num_counters)
+	if (idx >= L3C_NR_COUNTERS * (ext + 1))
 		return -EAGAIN;
 
 	set_bit(idx, used_mask);
 
-	WARN_ON(hisi_get_ext(event) && idx < L3C_NR_COUNTERS);
-	WARN_ON(!hisi_get_ext(event) && idx >= L3C_NR_COUNTERS);
+	WARN_ON(idx < L3C_NR_COUNTERS * ext || idx >= L3C_NR_COUNTERS * (ext + 1));
 
 	return idx;
 }
@@ -322,6 +335,17 @@ static void hisi_l3c_pmu_disable_filter(struct perf_event *event)
 	}
 }
 
+static int hisi_l3c_pmu_check_filter(struct perf_event *event)
+{
+	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ext = hisi_get_ext(event);
+
+	if (ext < 0 || ext > hisi_l3c_pmu->ext_num)
+		return -EINVAL;
+	return 0;
+}
+
 /*
  * Select the counter register offset using the counter index
  */
@@ -372,22 +396,29 @@ static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu)
 {
 	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
 	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
+	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
 	u32 val;
+	int i;
 
 	/*
-	 * Set perf_enable bit in L3C_PERF_CTRL register to start counting
-	 * for all enabled counters.
+	 * Check if any counter belongs to the normal range (instead of ext
+	 * range). If so, enable it.
 	 */
-	if (find_first_bit(used_mask, l3c_pmu->num_counters) < L3C_NR_COUNTERS) {
+	if (bit < L3C_NR_COUNTERS) {
 		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
 		val |= L3C_PERF_CTRL_EN;
 		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
 	}
 
-	if (find_next_bit(used_mask, l3c_pmu->num_counters, L3C_NR_COUNTERS) != l3c_pmu->num_counters) {
-		val = readl(hisi_l3c_pmu->ext_base + L3C_PERF_CTRL);
-		val |= L3C_PERF_CTRL_EN;
-		writel(val, hisi_l3c_pmu->ext_base + L3C_PERF_CTRL);
+	/* If not, do enable it on ext ranges. */
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
+				    L3C_NR_COUNTERS * (i + 1));
+		if (L3C_CNTR_EXT(bit) == i + 1) {
+			val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+			val |= L3C_PERF_CTRL_EN;
+			writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+		}
 	}
 }
 
@@ -395,22 +426,30 @@ static void hisi_l3c_pmu_stop_counters(struct hisi_pmu *l3c_pmu)
 {
 	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
 	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
+	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
 	u32 val;
+	int i;
 
 	/*
-	 * Clear perf_enable bit in L3C_PERF_CTRL register to stop counting
-	 * for all enabled counters.
+	 * Check if any counter belongs to the normal range (instead of ext
+	 * range). If so, stop it.
 	 */
-	if (find_first_bit(used_mask, l3c_pmu->num_counters) < L3C_NR_COUNTERS) {
+	if (bit < L3C_NR_COUNTERS) {
 		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
 		val &= ~(L3C_PERF_CTRL_EN);
 		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
 	}
 
-	if (find_next_bit(used_mask, l3c_pmu->num_counters, L3C_NR_COUNTERS) != l3c_pmu->num_counters) {
-		val = readl(hisi_l3c_pmu->ext_base + L3C_PERF_CTRL);
-		val &= ~(L3C_PERF_CTRL_EN);
-		writel(val, hisi_l3c_pmu->ext_base + L3C_PERF_CTRL);
+	/* If not, do stop it on ext ranges. */
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
+				    L3C_NR_COUNTERS * (i + 1));
+		if (L3C_CNTR_EXT(bit) != i + 1)
+			continue;
+
+		val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+		val &= ~L3C_PERF_CTRL_EN;
+		writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
 	}
 }
 
@@ -461,11 +500,18 @@ static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu,
 static u32 hisi_l3c_pmu_get_int_status(struct hisi_pmu *l3c_pmu)
 {
 	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
-	u32 status, status_ext = 0;
+	u32 ext_int, status, status_ext = 0;
+	int i;
 
 	status = readl(l3c_pmu->base + L3C_INT_STATUS);
-	if (hisi_l3c_pmu->feature & L3C_PMU_FEAT_EXT)
-		status_ext = readl(hisi_l3c_pmu->ext_base + L3C_INT_STATUS);
+
+	if (!support_ext(hisi_l3c_pmu))
+		return status;
+
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		ext_int = readl(hisi_l3c_pmu->ext_base[i] + L3C_INT_STATUS);
+		status_ext |= ext_int << (L3C_NR_COUNTERS * i);
+	}
 
 	return status | (status_ext << L3C_NR_COUNTERS);
 }
@@ -496,10 +542,6 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
 		return -EINVAL;
 	}
 
-	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
-	if (!l3c_pmu->dev_info)
-		return -ENODEV;
-
 	l3c_pmu->base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(l3c_pmu->base)) {
 		dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n");
@@ -514,35 +556,40 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
 static int hisi_l3c_pmu_init_ext(struct hisi_pmu *l3c_pmu, struct platform_device *pdev)
 {
 	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ret, irq, ext_num, i;
 	char *irqname;
-	int ret, irq;
 
-	hisi_l3c_pmu->ext_base = devm_platform_ioremap_resource(pdev, 1);
-	if (IS_ERR(hisi_l3c_pmu->ext_base))
-		return PTR_ERR(hisi_l3c_pmu->ext_base);
+	/* HiSilicon L3C PMU ext should have more than 1 irq resources. */
+	ext_num = platform_irq_count(pdev);
+	if (ext_num < 2)
+		return -ENODEV;
 
-	irq = platform_get_irq(pdev, 1);
-	/*
-	 * We may don't need to handle -EPROBDEFER since we should have already
-	 * handle it when probling irq[0].
-	 */
-	if (irq < 0)
-		return irq;
+	hisi_l3c_pmu->ext_num = ext_num - 1;
 
-	irqname = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s ext", dev_name(&pdev->dev));
-	if (!irqname)
-		return -ENOMEM;
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		hisi_l3c_pmu->ext_base[i] = devm_platform_ioremap_resource(pdev, i + 1);
+		if (IS_ERR(hisi_l3c_pmu->ext_base[i]))
+			return PTR_ERR(hisi_l3c_pmu->ext_base[i]);
 
-	ret = devm_request_irq(&pdev->dev, irq, hisi_uncore_pmu_isr,
-			       IRQF_NOBALANCING | IRQF_NO_THREAD,
-			       irqname, l3c_pmu);
-	if (ret < 0) {
-		dev_err(&pdev->dev,
-			"Fail to request EXT IRQ: %d ret: %d.\n", irq, ret);
-		return ret;
+		irq = platform_get_irq(pdev, i + 1);
+		if (irq < 0)
+			return irq;
+
+		irqname = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s ext%d",
+					 dev_name(&pdev->dev), i + 1);
+		if (!irqname)
+			return -ENOMEM;
+
+		ret = devm_request_irq(&pdev->dev, irq, hisi_uncore_pmu_isr,
+				       IRQF_NOBALANCING | IRQF_NO_THREAD,
+				       irqname, l3c_pmu);
+		if (ret < 0)
+			return dev_err_probe(&pdev->dev, ret,
+					     "Fail to request EXT IRQ: %d.\n", irq);
+
+		hisi_l3c_pmu->ext_irq[i] = irq;
 	}
 
-	hisi_l3c_pmu->ext_irq = irq;
 	return 0;
 }
@@ -574,7 +621,6 @@ static struct attribute *hisi_l3c_pmu_v3_format_attr[] = {
 	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
 	HISI_PMU_FORMAT_ATTR(ext, "config:16"),
 	HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"),
-	HISI_PMU_FORMAT_ATTR(tt_cacheable, "config1:17"),
 	HISI_PMU_FORMAT_ATTR(tt_core, "config2:0-15"),
 	NULL
 };
@@ -639,78 +685,23 @@ static ssize_t hisi_l3c_pmu_event_show(struct device *dev,
 	return sysfs_emit(page, "event=0x%lx,ext=1\n", event->event_id);
 }
 
-#define HISI_L3C_PMU_EVENT_ATTR(_name, _event, _ext) \
-static struct hisi_l3c_pmu_v3_event hisi_l3c_##_name = { _event, _ext }; \
-static struct dev_ext_attribute hisi_l3c_##_name##_attr = \
-	{ __ATTR(_name, 0444, hisi_l3c_pmu_event_show, NULL), (void *) &hisi_l3c_##_name }
-
-HISI_L3C_PMU_EVENT_ATTR(rd_cpipe, 0x00, true);
-HISI_L3C_PMU_EVENT_ATTR(rd_hit_cpipe, 0x01, true);
-HISI_L3C_PMU_EVENT_ATTR(wr_cpipe, 0x02, true);
-HISI_L3C_PMU_EVENT_ATTR(wr_hit_cpipe, 0x03, true);
-HISI_L3C_PMU_EVENT_ATTR(io_rd_cpipe, 0x04, true);
-HISI_L3C_PMU_EVENT_ATTR(io_rd_hit_cpipe, 0x05, true);
-HISI_L3C_PMU_EVENT_ATTR(io_wr_cpipe, 0x06, true);
-HISI_L3C_PMU_EVENT_ATTR(io_wr_hit_cpipe, 0x07, true);
-HISI_L3C_PMU_EVENT_ATTR(victim_num, 0x0c, true);
-HISI_L3C_PMU_EVENT_ATTR(rd_spipe, 0x18, false);
-HISI_L3C_PMU_EVENT_ATTR(rd_hit_spipe, 0x19, false);
-HISI_L3C_PMU_EVENT_ATTR(wr_spipe, 0x1a, false);
-HISI_L3C_PMU_EVENT_ATTR(wr_hit_spipe, 0x1b, false);
-HISI_L3C_PMU_EVENT_ATTR(io_rd_spipe, 0x1c, false);
-HISI_L3C_PMU_EVENT_ATTR(io_rd_hit_spipe, 0x1d, false);
-HISI_L3C_PMU_EVENT_ATTR(io_wr_spipe, 0x1e, false);
-HISI_L3C_PMU_EVENT_ATTR(io_wr_hit_spipe, 0x1f, false);
-HISI_L3C_PMU_EVENT_ATTR(cycles, 0x7f, false);
-HISI_L3C_PMU_EVENT_ATTR(l3c_ref, 0xbc, false);
-HISI_L3C_PMU_EVENT_ATTR(l3c2ring, 0xbd, true);
-
 static struct attribute *hisi_l3c_pmu_v3_events_attr[] = {
-	&hisi_l3c_rd_cpipe_attr.attr.attr,
-	&hisi_l3c_rd_hit_cpipe_attr.attr.attr,
-	&hisi_l3c_wr_cpipe_attr.attr.attr,
-	&hisi_l3c_wr_hit_cpipe_attr.attr.attr,
-	&hisi_l3c_io_rd_cpipe_attr.attr.attr,
-	&hisi_l3c_io_rd_hit_cpipe_attr.attr.attr,
-	&hisi_l3c_io_wr_cpipe_attr.attr.attr,
-	&hisi_l3c_io_wr_hit_cpipe_attr.attr.attr,
-	&hisi_l3c_victim_num_attr.attr.attr,
-	&hisi_l3c_rd_spipe_attr.attr.attr,
-	&hisi_l3c_rd_hit_spipe_attr.attr.attr,
-	&hisi_l3c_wr_spipe_attr.attr.attr,
-	&hisi_l3c_wr_hit_spipe_attr.attr.attr,
-	&hisi_l3c_io_rd_spipe_attr.attr.attr,
-	&hisi_l3c_io_rd_hit_spipe_attr.attr.attr,
-	&hisi_l3c_io_wr_spipe_attr.attr.attr,
-	&hisi_l3c_io_wr_hit_spipe_attr.attr.attr,
-	&hisi_l3c_cycles_attr.attr.attr,
-	&hisi_l3c_l3c_ref_attr.attr.attr,
-	&hisi_l3c_l3c2ring_attr.attr.attr,
+	HISI_PMU_EVENT_ATTR(rd_spipe, 0x18),
+	HISI_PMU_EVENT_ATTR(rd_hit_spipe, 0x19),
+	HISI_PMU_EVENT_ATTR(wr_spipe, 0x1a),
+	HISI_PMU_EVENT_ATTR(wr_hit_spipe, 0x1b),
+	HISI_PMU_EVENT_ATTR(io_rd_spipe, 0x1c),
+	HISI_PMU_EVENT_ATTR(io_rd_hit_spipe, 0x1d),
+	HISI_PMU_EVENT_ATTR(io_wr_spipe, 0x1e),
+	HISI_PMU_EVENT_ATTR(io_wr_hit_spipe, 0x1f),
+	HISI_PMU_EVENT_ATTR(cycles, 0x7f),
+	HISI_PMU_EVENT_ATTR(l3c_ref, 0xbc),
+	HISI_PMU_EVENT_ATTR(l3c2ring, 0xbd),
 	NULL
 };
 
-static umode_t hisi_l3c_pmu_v3_events_visible(struct kobject *kobj,
-					      struct attribute *attr, int unused)
-{
-	struct device *dev = kobj_to_dev(kobj);
-	struct pmu *pmu = dev_get_drvdata(dev);
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(pmu);
-	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
-	struct hisi_l3c_pmu_v3_event *event;
-	struct dev_ext_attribute *ext_attr;
-
-	ext_attr = container_of(attr, struct dev_ext_attribute, attr.attr);
-	event = ext_attr->var;
-
-	if (!event->ext || (hisi_l3c_pmu->feature & L3C_PMU_FEAT_EXT))
-		return attr->mode;
-
-	return 0;
-}
-
 static const struct attribute_group hisi_l3c_pmu_v3_events_group = {
 	.name = "events",
-	.is_visible = hisi_l3c_pmu_v3_events_visible,
 	.attrs = hisi_l3c_pmu_v3_events_attr,
 };
@@ -738,23 +729,33 @@ static const struct attribute_group *hisi_l3c_pmu_v3_attr_groups[] = {
 	NULL
 };
 
+static struct hisi_l3c_pmu_ext hisi_l3c_pmu_support_ext = {
+	.support_ext = true,
+};
+
+static struct hisi_l3c_pmu_ext hisi_l3c_pmu_not_support_ext = {
+	.support_ext = false,
+};
+
 static const struct hisi_pmu_dev_info hisi_l3c_pmu_v1 = {
 	.attr_groups = hisi_l3c_pmu_v1_attr_groups,
 	.counter_bits = 48,
 	.check_event = L3C_V1_NR_EVENTS,
+	.private = &hisi_l3c_pmu_not_support_ext,
 };
 
 static const struct hisi_pmu_dev_info hisi_l3c_pmu_v2 = {
 	.attr_groups = hisi_l3c_pmu_v2_attr_groups,
 	.counter_bits = 64,
 	.check_event = L3C_V2_NR_EVENTS,
+	.private = &hisi_l3c_pmu_not_support_ext,
 };
 
 static const struct hisi_pmu_dev_info hisi_l3c_pmu_v3 = {
 	.attr_groups = hisi_l3c_pmu_v3_attr_groups,
 	.counter_bits = 64,
 	.check_event = L3C_V2_NR_EVENTS,
-	.private = (void *) L3C_PMU_FEAT_EXT,
+	.private = &hisi_l3c_pmu_support_ext,
 };
 
@@ -772,12 +773,14 @@ static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
 	.clear_int_status = hisi_l3c_pmu_clear_int_status,
 	.enable_filter = hisi_l3c_pmu_enable_filter,
 	.disable_filter = hisi_l3c_pmu_disable_filter,
+	.check_filter = hisi_l3c_pmu_check_filter,
 };
 
 static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
 				  struct hisi_pmu *l3c_pmu)
 {
 	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	struct hisi_l3c_pmu_ext *l3c_pmu_dev_ext = l3c_pmu->dev_info->private;
 	int ret;
 
 	ret = hisi_l3c_pmu_init_data(pdev, l3c_pmu);
@@ -796,20 +799,16 @@ static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
 	l3c_pmu->dev = &pdev->dev;
 	l3c_pmu->on_cpu = -1;
 
-	if ((unsigned long)l3c_pmu->dev_info->private & L3C_PMU_FEAT_EXT) {
+	if (l3c_pmu_dev_ext->support_ext) {
 		ret = hisi_l3c_pmu_init_ext(l3c_pmu, pdev);
-		if (ret) {
-			dev_warn(&pdev->dev, "ext event is unavailable, ret = %d\n", ret);
-		} else {
-			/*
-			 * The extension events have their own counters with the
-			 * same number of the normal events counters. So we can
-			 * have at maximum num_counters * 2 events monitored.
-			 */
-			l3c_pmu->num_counters <<= 1;
-
-			hisi_l3c_pmu->feature |= L3C_PMU_FEAT_EXT;
-		}
+		if (ret)
+			return ret;
+		/*
+		 * The extension events have their own counters with the
+		 * same number of the normal events counters. So we can
+		 * have at maximum num_counters * ext events monitored.
+		 */
+		l3c_pmu->num_counters += hisi_l3c_pmu->ext_num * L3C_NR_COUNTERS;
 	}
 
 	return 0;
@@ -829,6 +828,10 @@ static int hisi_l3c_pmu_probe(struct platform_device *pdev)
 	l3c_pmu = &hisi_l3c_pmu->l3c_pmu;
 	platform_set_drvdata(pdev, l3c_pmu);
 
+	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
+	if (!l3c_pmu->dev_info)
+		return -ENODEV;
+
 	ret = hisi_l3c_pmu_dev_probe(pdev, l3c_pmu);
 	if (ret)
 		return ret;
@@ -893,42 +896,45 @@ static struct platform_driver hisi_l3c_pmu_driver = {
 static int hisi_l3c_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
 {
 	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
-	struct hisi_l3c_pmu *hisi_l3c_pmu;
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
 	int ret;
 
-	hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
-
-	/*
-	 * Invoking the framework's online function for doing the core logic
-	 * of CPU, interrupt and perf context migrating. Then return directly
-	 * if we don't support L3C_PMU_FEAT_EXT. Otherwise migrate the ext_irq
-	 * using the migrated CPU.
-	 *
-	 * Same logic for CPU offline.
-	 */
 	ret = hisi_uncore_pmu_online_cpu(cpu, node);
-	if (!(hisi_l3c_pmu->feature & L3C_PMU_FEAT_EXT) ||
-	    l3c_pmu->on_cpu >= nr_cpu_ids)
+	if (ret)
 		return ret;
 
-	WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq, cpumask_of(l3c_pmu->on_cpu)));
-	return ret;
+	/* Avoid L3C pmu not supporting ext from ext irq migrating. */
+	if (!support_ext(hisi_l3c_pmu))
+		return 0;
+
+	for (int i = 0; i < hisi_l3c_pmu->ext_num; i++)
+		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
+					 cpumask_of(l3c_pmu->on_cpu)));
+	return 0;
 }
 
 static int hisi_l3c_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 {
 	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
-	struct hisi_l3c_pmu *hisi_l3c_pmu;
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
 	int ret;
 
-	hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
 	ret = hisi_uncore_pmu_offline_cpu(cpu, node);
-	if (!(hisi_l3c_pmu->feature & L3C_PMU_FEAT_EXT) ||
-	    l3c_pmu->on_cpu >= nr_cpu_ids)
+	if (ret)
 		return ret;
 
-	WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq, cpumask_of(l3c_pmu->on_cpu)));
-	return ret;
+	/* If failed to find any available CPU, skip irq migration. */
+	if (l3c_pmu->on_cpu <= 0)
+		return 0;
+
+	/* Avoid L3C pmu not supporting ext from ext irq migrating. */
+	if (!support_ext(hisi_l3c_pmu))
+		return 0;
+
+	for (int i = 0; i < hisi_l3c_pmu->ext_num; i++)
+		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
+					 cpumask_of(l3c_pmu->on_cpu)));
+	return 0;
 }
 
 static int __init hisi_l3c_pmu_module_init(void)
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.h b/drivers/perf/hisilicon/hisi_uncore_pmu.h
index 31225c2ccdce..bdf17d1a3099 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.h
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.h
@@ -24,7 +24,7 @@
 #define pr_fmt(fmt)	"hisi_pmu: " fmt
 
 #define HISI_PMU_V2		0x30
-#define HISI_MAX_COUNTERS	0x10
+#define HISI_MAX_COUNTERS	0x18
 #define to_hisi_pmu(p)	(container_of(p, struct hisi_pmu, pmu))
 
 #define HISI_PMU_ATTR(_name, _func, _config)		\
-- 
2.33.0
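For reference, the combined overflow-status word used above can be pictured
with a small stand-alone program (a rough approximation of
hisi_l3c_pmu_get_int_status() and the L3C_HW_IDX()/L3C_CNTR_EXT() helpers,
not kernel code; the sample overflow bits are invented):

  #include <stdio.h>
  #include <stdint.h>

  #define NR_COUNTERS 8   /* L3C_NR_COUNTERS: counters per address space */
  #define MAX_EXT     2   /* L3C_MAX_EXT: at most two extension spaces   */

  /*
   * The normal space occupies bits [0, 7] of the combined status word,
   * ext space i occupies bits [(i + 1) * 8, (i + 2) * 8 - 1].
   */
  static uint32_t combine_status(uint32_t normal, const uint32_t *ext, int ext_num)
  {
          uint32_t status_ext = 0;

          for (int i = 0; i < ext_num; i++)
                  status_ext |= ext[i] << (NR_COUNTERS * i);

          return normal | (status_ext << NR_COUNTERS);
  }

  int main(void)
  {
          /* Pretend counter 2 of the normal space and counter 5 of ext space 2 overflowed. */
          uint32_t ext_status[MAX_EXT] = { 0x00, 1u << 5 };
          uint32_t status = combine_status(1u << 2, ext_status, MAX_EXT);

          for (int idx = 0; idx < NR_COUNTERS * (MAX_EXT + 1); idx++) {
                  if (status & (1u << idx))
                          printf("overflow: logical idx %d -> space %d, hw counter %d\n",
                                 idx, idx / NR_COUNTERS, idx % NR_COUNTERS);
          }
          return 0;
  }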

driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/ICTSJJ

----------------------------------------------------------------------

The inline sysfs paths should be placed in literal blocks to make the
documentation look better.

Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 Documentation/admin-guide/perf/hisi-pmu.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst
index 102ea54fd64c..8df048c26498 100644
--- a/Documentation/admin-guide/perf/hisi-pmu.rst
+++ b/Documentation/admin-guide/perf/hisi-pmu.rst
@@ -18,10 +18,10 @@ HiSilicon SoC uncore PMU driver
 Each device PMU has separate registers for event counting, control and
 interrupt, and the PMU driver shall register perf PMU drivers like L3C,
 HHA and DDRC etc. The available events and configuration options shall
-be described in the sysfs, see:
+be described in the sysfs, see::
+
+/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>
 
-/sys/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>/, or
-/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>.
 The "perf list" command shall list the available events from sysfs.
 
 Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU
-- 
2.33.0

driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/ICTSJJ

----------------------------------------------------------------------

Some HiSilicon V3 PMU hardware is divided into parts, each monitoring a
specific part of a device. Add a description of that, as well as of the
newly added ext operand for the L3C PMU.

Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 Documentation/admin-guide/perf/hisi-pmu.rst | 38 +++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst
index 8df048c26498..78bec239293d 100644
--- a/Documentation/admin-guide/perf/hisi-pmu.rst
+++ b/Documentation/admin-guide/perf/hisi-pmu.rst
@@ -12,8 +12,8 @@ The HiSilicon SoC encapsulates multiple CPU and IO dies. Each CPU cluster
 called Super CPU cluster (SCCL) and is made up of 6 CCLs. Each SCCL has
 two HHAs (0 - 1) and four DDRCs (0 - 3), respectively.
 
-HiSilicon SoC uncore PMU driver
--------------------------------
+HiSilicon SoC uncore PMU v1
+---------------------------
 
 Each device PMU has separate registers for event counting, control and
 interrupt, and the PMU driver shall register perf PMU drivers like L3C,
@@ -56,6 +56,9 @@ Example usage of perf::
   $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
   $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5
 
+HiSilicon SoC uncore PMU v2
+----------------------------------
+
 For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the
 same as PMU v1, but some new functions are added to the hardware.
 
@@ -124,6 +127,37 @@ channel with this option. The current supported channels are as follows:
 7. tt_en: NoC PMU supports counting only transactions that have tracetag set
 if this option is set. See the 2nd list for more information about tracetag.
 
+HiSilicon SoC uncore PMU v3
+----------------------------------
+
+For HiSilicon uncore PMU v3 whose identifier is 0x40, some uncore PMUs are
+further divided into parts for finer granularity of tracing, each part has its
+own dedicated PMU, and all such PMUs together cover the monitoring job of events
+on particular uncore device. Such PMUs are described in sysfs with name format
+slightly changed::
+
+/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}_{Z}/ddrc{Y}_{Z}/noc{Y}_{Z}>
+
+Z is the sub-id, indicating different PMUs for part of hardware device.
+
+Usage of most PMUs with different sub-ids are identical. Specially, L3C PMU
+provides ``ext`` operand to allow exploration of even finer granual statistics
+of L3C PMU, L3C PMU driver use that as hint of termination when delivering perf
+command to hardware:
+
+- ext=0: Default, could be used with event names.
+- ext=1 and ext=2: Must be used with event codes, event names are not supported.
+
+An example of perf command could be::
+
+  $# perf stat -a -e hisi_sccl0_l3c1_0/event=0x1,ext=1/ sleep 5
+
+or::
+
+  $# perf stat -a -e hisi_sccl0_l3c1_0/rd_spipe/ sleep 5
+
+As above, ``hisi_sccl0_l3c1_0`` locates PMU on CPU cluster 0, L3 cache 1 pipe0.
+
 Users could configure IDs to count data come from specific CCL/ICL, by setting
 srcid_cmd & srcid_msk, and data desitined for specific CCL/ICL by setting
 tgtid_cmd & tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU will not
-- 
2.33.0

FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/17691
Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/2AS...