mailweb.openeuler.org
Manage this list

Keyboard Shortcuts

Thread View

  • j: Next unread message
  • k: Previous unread message
  • j a: Jump to all threads
  • j l: Jump to MailingList overview

Kernel

Threads by month
  • ----- 2025 -----
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2024 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2023 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2022 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2021 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2020 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2019 -----
  • December
kernel@openeuler.org

  • 70 participants
  • 19492 discussions
[PATCH kernel-4.19 1/3] ALSA: hda: Add support of Zhaoxin SB HDAC
by LeoLiu-oc 25 Mar '21

25 Mar '21
Add the new PCI ID 0x1d17 0x3288 Zhaoxin SB HDAC support. The patch is scheduled to be submitted to the kernel mainline in 2021. Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- sound/pci/hda/hda_intel.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 2cd8bfd5293b..67791114471c 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -250,7 +250,8 @@ MODULE_SUPPORTED_DEVICE("{{Intel, ICH6}," "{VIA, VT8251}," "{VIA, VT8237A}," "{SiS, SIS966}," - "{ULI, M5461}}"); + "{ULI, M5461}," + "{ZX, ZhaoxinHDA}}"); MODULE_DESCRIPTION("Intel HDA driver"); #if defined(CONFIG_PM) && defined(CONFIG_VGA_SWITCHEROO) @@ -281,6 +282,7 @@ enum { AZX_DRIVER_CTX, AZX_DRIVER_CTHDA, AZX_DRIVER_CMEDIA, + AZX_DRIVER_ZHAOXIN, AZX_DRIVER_GENERIC, AZX_NUM_DRIVERS, /* keep this as last entry */ }; @@ -401,6 +403,7 @@ static char *driver_short_names[] = { [AZX_DRIVER_CTX] = "HDA Creative", [AZX_DRIVER_CTHDA] = "HDA Creative", [AZX_DRIVER_CMEDIA] = "HDA C-Media", + [AZX_DRIVER_ZHAOXIN] = "HDA Zhaoxin", [AZX_DRIVER_GENERIC] = "HD-Audio Generic", }; @@ -1599,6 +1602,9 @@ static int check_position_fix(struct azx *chip, int fix) dev_dbg(chip->card->dev, "Using FIFO position fix\n"); return POS_FIX_FIFO; } + if (chip->driver_type == AZX_DRIVER_ZHAOXIN) { + return POS_FIX_VIACOMBO; + } if (chip->driver_caps & AZX_DCAPS_POSFIX_LPIB) { dev_dbg(chip->card->dev, "Using LPIB position fix\n"); return POS_FIX_LPIB; @@ -1755,6 +1761,15 @@ static void azx_check_snoop_available(struct azx *chip) snoop = false; } + if (azx_get_snoop_type(chip) == AZX_SNOOP_TYPE_NONE && + chip->driver_type == AZX_DRIVER_ZHAOXIN) { + u8 val1; + pci_read_config_byte(chip->pci, 0x42, &val1); + if (!(val1 & 0x80) && chip->pci->revision == 0x20) { + snoop = false; + } + } + if (chip->driver_caps & AZX_DCAPS_SNOOP_OFF) snoop = false; @@ -2811,6 +2826,8 @@ static const struct pci_device_id azx_ids[] = { .class = PCI_CLASS_MULTIMEDIA_HD_AUDIO << 8, .class_mask = 0xffffff, .driver_data = AZX_DRIVER_GENERIC | AZX_DCAPS_PRESET_ATI_HDMI }, + /* Zhaoxin */ + { PCI_DEVICE(0x1d17, 0x3288), .driver_data = AZX_DRIVER_ZHAOXIN }, { 0, } }; MODULE_DEVICE_TABLE(pci, azx_ids); -- 2.20.1
1 0
0 0
[PATCH kernel-4.19] x86/perf: Add hardware performance events support for Zhaoxin CPU.
by LeoLiu-oc 25 Mar '21

25 Mar '21
mainline inclusion from mainline-5.8 commit 3a4ac121c2cacbf97d493fa3bc42ead88657abe4 category: x86/perf Add the generic Zhaoxin uncore PMU support. -------------------------------- Zhaoxin CPU has provided facilities for monitoring performance via PMU (Performance Monitor Unit), but the functionality is unused so far. Therefore, add support for zhaoxin pmu to make performance related hardware events available. The PMU is mostly an Intel Architectural PerfMon-v2 with a novel errata for the ZXC line. It supports the following events: ----------------------------------------------------------------------------------------------------------------------------------- Event | Event | Umask | Description | Select | | ----------------------------------------------------------------------------------------------------------------------------------- cpu-cycles | 82h | 00h | unhalt core clock instructions | 00h | 00h | number of instructions at retirement. cache-references | 15h | 05h | number of fillq pushs at the current cycle. cache-misses | 1ah | 05h | number of l2 miss pushed by fillq. branch-instructions | 28h | 00h | counts the number of branch instructions retired. branch-misses | 29h | 00h | mispredicted branch instructions at retirement. bus-cycles | 83h | 00h | unhalt bus clock stalled-cycles-frontend | 01h | 01h | Increments each cycle the # of Uops issued by the RAT to RS. stalled-cycles-backend | 0fh | 04h | RS0/1/2/3/45 empty L1-dcache-loads | 68h | 05h | number of retire/commit load. L1-dcache-load-misses | 4bh | 05h | retired load uops whose data source followed an L1 miss. L1-dcache-stores | 69h | 06h | number of retire/commit Store,no LEA L1-dcache-store-misses | 62h | 05h | cache lines in M state evicted out of L1D due to Snoop HitM or dirty line replacement. L1-icache-loads | 00h | 03h | number of l1i cache access for valid normal fetch,including un-cacheable access. L1-icache-load-misses | 01h | 03h | number of l1i cache miss for valid normal fetch,including un-cacheable miss. L1-icache-prefetches | 0ah | 03h | number of prefetch. L1-icache-prefetch-misses | 0bh | 03h | number of prefetch miss. dTLB-loads | 68h | 05h | number of retire/commit load dTLB-load-misses | 2ch | 05h | number of load operations miss all level tlbs and cause a tablewalk. dTLB-stores | 69h | 06h | number of retire/commit Store,no LEA dTLB-store-misses | 30h | 05h | number of store operations miss all level tlbs and cause a tablewalk. dTLB-prefetches | 64h | 05h | number of hardware pte prefetch requests dispatched out of the prefetch FIFO. dTLB-prefetch-misses | 65h | 05h | number of hardware pte prefetch requests miss the l1d data cache. iTLB-load | 00h | 00h | actually counter instructions. iTLB-load-misses | 34h | 05h | number of code operations miss all level tlbs and cause a tablewalk. ----------------------------------------------------------------------------------------------------------------------------------- Reported-by: kbuild test robot <lkp(a)intel.com> Signed-off-by: CodyYao-oc <CodyYao-oc(a)zhaoxin.com> Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Link: https://lkml.kernel.org/r/1586747669-4827-1-git-send-email-CodyYao-oc@zhaox… Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- arch/x86/events/Makefile | 2 + arch/x86/events/core.c | 4 + arch/x86/events/perf_event.h | 14 +- arch/x86/events/zhaoxin/Makefile | 3 + arch/x86/events/zhaoxin/core.c | 612 +++++++++++++ arch/x86/events/zhaoxin/uncore.c | 1101 ++++++++++++++++++++++++ arch/x86/events/zhaoxin/uncore.h | 308 +++++++ arch/x86/kernel/cpu/perfctr-watchdog.c | 8 + 8 files changed, 2051 insertions(+), 1 deletion(-) create mode 100644 arch/x86/events/zhaoxin/Makefile create mode 100644 arch/x86/events/zhaoxin/core.c create mode 100644 arch/x86/events/zhaoxin/uncore.c create mode 100644 arch/x86/events/zhaoxin/uncore.h diff --git a/arch/x86/events/Makefile b/arch/x86/events/Makefile index b8ccdb5c9244..ad4a7c789637 100644 --- a/arch/x86/events/Makefile +++ b/arch/x86/events/Makefile @@ -2,3 +2,5 @@ obj-y += core.o obj-y += amd/ obj-$(CONFIG_X86_LOCAL_APIC) += msr.o obj-$(CONFIG_CPU_SUP_INTEL) += intel/ +obj-$(CONFIG_CPU_SUP_CENTAUR) += zhaoxin/ +obj-$(CONFIG_CPU_SUP_ZHAOXIN) += zhaoxin/ diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 8e8970dd1af1..640f85da2b34 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -1758,6 +1758,10 @@ static int __init init_hw_perf_events(void) err = amd_pmu_init(); x86_pmu.name = "HYGON"; break; + case X86_VENDOR_ZHAOXIN: + case X86_VENDOR_CENTAUR: + err = zhaoxin_pmu_init(); + break; default: err = -ENOTSUPP; } diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index 05659c7b43d4..dd24cac3d5e5 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -565,9 +565,12 @@ struct x86_pmu { struct event_constraint *event_constraints; struct x86_pmu_quirk *quirks; int perfctr_second_write; - bool late_ack; u64 (*limit_period)(struct perf_event *event, u64 l); + /* PMI handler bits */ + unsigned int late_ack :1, + enabled_ack :1, + counter_freezing :1; /* * sysfs attrs */ @@ -1044,3 +1047,12 @@ static inline int is_ht_workaround_enabled(void) return 0; } #endif /* CONFIG_CPU_SUP_INTEL */ + +#if ((defined CONFIG_CPU_SUP_CENTAUR) || (defined CONFIG_CPU_ZHAOXIN)) +int zhaoxin_pmu_init(void); +#else +static inline int zhaoxin_pmu_init(void) +{ + return 0; +} +#endif /*CONFIG_CPU_SUP_CENTAUR or CONFIG_CPU_SUP_ZHAOXIN*/ diff --git a/arch/x86/events/zhaoxin/Makefile b/arch/x86/events/zhaoxin/Makefile new file mode 100644 index 000000000000..767d6212bac1 --- /dev/null +++ b/arch/x86/events/zhaoxin/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-y += core.o +obj-y += uncore.o diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c new file mode 100644 index 000000000000..c2e5bdf3893d --- /dev/null +++ b/arch/x86/events/zhaoxin/core.c @@ -0,0 +1,612 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Zhaoxin PMU; + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/stddef.h> +#include <linux/types.h> +#include <linux/init.h> +#include <linux/slab.h> +#include <linux/export.h> +#include <linux/nmi.h> + +#include <asm/cpufeature.h> +#include <asm/hardirq.h> +#include <asm/apic.h> + +#include "../perf_event.h" + +/* + * Zhaoxin PerfMon, used on zxc and later. + */ +static u64 zx_pmon_event_map[PERF_COUNT_HW_MAX] __read_mostly = { + + [PERF_COUNT_HW_CPU_CYCLES] = 0x0082, + [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0, + [PERF_COUNT_HW_CACHE_REFERENCES] = 0x0515, + [PERF_COUNT_HW_CACHE_MISSES] = 0x051a, + [PERF_COUNT_HW_BUS_CYCLES] = 0x0083, +}; + +static struct event_constraint zxc_event_constraints[] __read_mostly = { + + FIXED_EVENT_CONSTRAINT(0x0082, 1), /* unhalted core clock cycles */ + EVENT_CONSTRAINT_END +}; + +static struct event_constraint zxd_event_constraints[] __read_mostly = { + + FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* retired instructions */ + FIXED_EVENT_CONSTRAINT(0x0082, 1), /* unhalted core clock cycles */ + FIXED_EVENT_CONSTRAINT(0x0083, 2), /* unhalted bus clock cycles */ + EVENT_CONSTRAINT_END +}; + +static __initconst const u64 zxd_hw_cache_event_ids + [PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { +[C(L1D)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x0042, + [C(RESULT_MISS)] = 0x0538, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = 0x0043, + [C(RESULT_MISS)] = 0x0562, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, +}, +[C(L1I)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x0300, + [C(RESULT_MISS)] = 0x0301, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = 0x030a, + [C(RESULT_MISS)] = 0x030b, + }, +}, +[C(LL)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, +}, +[C(DTLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x0042, + [C(RESULT_MISS)] = 0x052c, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = 0x0043, + [C(RESULT_MISS)] = 0x0530, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = 0x0564, + [C(RESULT_MISS)] = 0x0565, + }, +}, +[C(ITLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x00c0, + [C(RESULT_MISS)] = 0x0534, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, +}, +[C(BPU)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x0700, + [C(RESULT_MISS)] = 0x0709, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, +}, +[C(NODE)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, +}, +}; + +static __initconst const u64 zxe_hw_cache_event_ids + [PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { +[C(L1D)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x0568, + [C(RESULT_MISS)] = 0x054b, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = 0x0669, + [C(RESULT_MISS)] = 0x0562, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, +}, +[C(L1I)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x0300, + [C(RESULT_MISS)] = 0x0301, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = 0x030a, + [C(RESULT_MISS)] = 0x030b, + }, +}, +[C(LL)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x0, + [C(RESULT_MISS)] = 0x0, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = 0x0, + [C(RESULT_MISS)] = 0x0, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = 0x0, + [C(RESULT_MISS)] = 0x0, + }, +}, +[C(DTLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x0568, + [C(RESULT_MISS)] = 0x052c, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = 0x0669, + [C(RESULT_MISS)] = 0x0530, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = 0x0564, + [C(RESULT_MISS)] = 0x0565, + }, +}, +[C(ITLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x00c0, + [C(RESULT_MISS)] = 0x0534, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, +}, +[C(BPU)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x0028, + [C(RESULT_MISS)] = 0x0029, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, +}, +[C(NODE)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = -1, + [C(RESULT_MISS)] = -1, + }, +}, +}; + +static void zhaoxin_pmu_disable_all(void) +{ + wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0); +} + +static void zhaoxin_pmu_enable_all(int added) +{ + wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, x86_pmu.intel_ctrl); +} + +static inline u64 zhaoxin_pmu_get_status(void) +{ + u64 status; + + rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, status); + + return status; +} + +static inline void zhaoxin_pmu_ack_status(u64 ack) +{ + wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, ack); +} + +static inline void zxc_pmu_ack_status(u64 ack) +{ + /* + * ZXC needs global control enabled in order to clear status bits. + */ + zhaoxin_pmu_enable_all(0); + zhaoxin_pmu_ack_status(ack); + zhaoxin_pmu_disable_all(); +} + +static void zhaoxin_pmu_disable_fixed(struct hw_perf_event *hwc) +{ + int idx = hwc->idx - INTEL_PMC_IDX_FIXED; + u64 ctrl_val, mask; + + mask = 0xfULL << (idx * 4); + + rdmsrl(hwc->config_base, ctrl_val); + ctrl_val &= ~mask; + wrmsrl(hwc->config_base, ctrl_val); +} + +static void zhaoxin_pmu_disable_event(struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + + if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) { + zhaoxin_pmu_disable_fixed(hwc); + return; + } + + x86_pmu_disable_event(event); +} + +static void zhaoxin_pmu_enable_fixed(struct hw_perf_event *hwc) +{ + int idx = hwc->idx - INTEL_PMC_IDX_FIXED; + u64 ctrl_val, bits, mask; + + /* + * Enable IRQ generation (0x8), + * and enable ring-3 counting (0x2) and ring-0 counting (0x1) + * if requested: + */ + bits = 0x8ULL; + if (hwc->config & ARCH_PERFMON_EVENTSEL_USR) + bits |= 0x2; + if (hwc->config & ARCH_PERFMON_EVENTSEL_OS) + bits |= 0x1; + + bits <<= (idx * 4); + mask = 0xfULL << (idx * 4); + + rdmsrl(hwc->config_base, ctrl_val); + ctrl_val &= ~mask; + ctrl_val |= bits; + wrmsrl(hwc->config_base, ctrl_val); +} + +static void zhaoxin_pmu_enable_event(struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + + if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) { + zhaoxin_pmu_enable_fixed(hwc); + return; + } + + __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE); +} + +/* + * This handler is triggered by the local APIC, so the APIC IRQ handling + * rules apply: + */ +static int zhaoxin_pmu_handle_irq(struct pt_regs *regs) +{ + struct perf_sample_data data; + struct cpu_hw_events *cpuc; + int handled = 0; + u64 status; + int bit; + + cpuc = this_cpu_ptr(&cpu_hw_events); + apic_write(APIC_LVTPC, APIC_DM_NMI); + zhaoxin_pmu_disable_all(); + status = zhaoxin_pmu_get_status(); + if (!status) + goto done; + +again: + if (x86_pmu.enabled_ack) + zxc_pmu_ack_status(status); + else + zhaoxin_pmu_ack_status(status); + + inc_irq_stat(apic_perf_irqs); + + /* + * CondChgd bit 63 doesn't mean any overflow status. Ignore + * and clear the bit. + */ + if (__test_and_clear_bit(63, (unsigned long *)&status)) { + if (!status) + goto done; + } + + for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) { + struct perf_event *event = cpuc->events[bit]; + + handled++; + + if (!test_bit(bit, cpuc->active_mask)) + continue; + + x86_perf_event_update(event); + perf_sample_data_init(&data, 0, event->hw.last_period); + + if (!x86_perf_event_set_period(event)) + continue; + + if (perf_event_overflow(event, &data, regs)) + x86_pmu_stop(event, 0); + } + + /* + * Repeat if there is more work to be done: + */ + status = zhaoxin_pmu_get_status(); + if (status) + goto again; + +done: + zhaoxin_pmu_enable_all(0); + return handled; +} + +static u64 zhaoxin_pmu_event_map(int hw_event) +{ + return zx_pmon_event_map[hw_event]; +} + +static struct event_constraint * +zhaoxin_get_event_constraints(struct cpu_hw_events *cpuc, int idx, + struct perf_event *event) +{ + struct event_constraint *c; + + if (x86_pmu.event_constraints) { + for_each_event_constraint(c, x86_pmu.event_constraints) { + if ((event->hw.config & c->cmask) == c->code) + return c; + } + } + + return &unconstrained; +} + +PMU_FORMAT_ATTR(event, "config:0-7"); +PMU_FORMAT_ATTR(umask, "config:8-15"); +PMU_FORMAT_ATTR(edge, "config:18"); +PMU_FORMAT_ATTR(inv, "config:23"); +PMU_FORMAT_ATTR(cmask, "config:24-31"); + +static struct attribute *zx_arch_formats_attr[] = { + &format_attr_event.attr, + &format_attr_umask.attr, + &format_attr_edge.attr, + &format_attr_inv.attr, + &format_attr_cmask.attr, + NULL, +}; + +static ssize_t zhaoxin_event_sysfs_show(char *page, u64 config) +{ + u64 event = (config & ARCH_PERFMON_EVENTSEL_EVENT); + + return x86_event_sysfs_show(page, config, event); +} + +static const struct x86_pmu zhaoxin_pmu __initconst = { + .name = "zhaoxin_pmu", + .handle_irq = zhaoxin_pmu_handle_irq, + .disable_all = zhaoxin_pmu_disable_all, + .enable_all = zhaoxin_pmu_enable_all, + .enable = zhaoxin_pmu_enable_event, + .disable = zhaoxin_pmu_disable_event, + .hw_config = x86_pmu_hw_config, + .schedule_events = x86_schedule_events, + .eventsel = MSR_ARCH_PERFMON_EVENTSEL0, + .perfctr = MSR_ARCH_PERFMON_PERFCTR0, + .event_map = zhaoxin_pmu_event_map, + .max_events = ARRAY_SIZE(zx_pmon_event_map), + .apic = 1, + /* + * For zxd/zxe, read/write operation for PMCx MSR is 48 bits. + */ + .max_period = (1ULL << 47) - 1, + .get_event_constraints = zhaoxin_get_event_constraints, + + .format_attrs = zx_arch_formats_attr, + .events_sysfs_show = zhaoxin_event_sysfs_show, +}; + +static const struct { int id; char *name; } zx_arch_events_map[] __initconst = { + { PERF_COUNT_HW_CPU_CYCLES, "cpu cycles" }, + { PERF_COUNT_HW_INSTRUCTIONS, "instructions" }, + { PERF_COUNT_HW_BUS_CYCLES, "bus cycles" }, + { PERF_COUNT_HW_CACHE_REFERENCES, "cache references" }, + { PERF_COUNT_HW_CACHE_MISSES, "cache misses" }, + { PERF_COUNT_HW_BRANCH_INSTRUCTIONS, "branch instructions" }, + { PERF_COUNT_HW_BRANCH_MISSES, "branch misses" }, +}; + +static __init void zhaoxin_arch_events_quirk(void) +{ + int bit; + + /* disable event that reported as not presend by cpuid */ + for_each_set_bit(bit, x86_pmu.events_mask, ARRAY_SIZE(zx_arch_events_map)) { + zx_pmon_event_map[zx_arch_events_map[bit].id] = 0; + pr_warn("CPUID marked event: \'%s\' unavailable\n", + zx_arch_events_map[bit].name); + } +} + +__init int zhaoxin_pmu_init(void) +{ + union cpuid10_edx edx; + union cpuid10_eax eax; + union cpuid10_ebx ebx; + struct event_constraint *c; + unsigned int unused; + int version; + + pr_info("Welcome to pmu!\n"); + + /* + * Check whether the Architectural PerfMon supports + * hw_event or not. + */ + cpuid(10, &eax.full, &ebx.full, &unused, &edx.full); + + if (eax.split.mask_length < ARCH_PERFMON_EVENTS_COUNT - 1) + return -ENODEV; + + version = eax.split.version_id; + if (version != 2) + return -ENODEV; + + x86_pmu = zhaoxin_pmu; + pr_info("Version check pass!\n"); + + x86_pmu.version = version; + x86_pmu.num_counters = eax.split.num_counters; + x86_pmu.cntval_bits = eax.split.bit_width; + x86_pmu.cntval_mask = (1ULL << eax.split.bit_width) - 1; + x86_pmu.events_maskl = ebx.full; + x86_pmu.events_mask_len = eax.split.mask_length; + + x86_pmu.num_counters_fixed = edx.split.num_counters_fixed; + x86_add_quirk(zhaoxin_arch_events_quirk); + + switch (boot_cpu_data.x86) { + case 0x06: + if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) { + + x86_pmu.max_period = x86_pmu.cntval_mask >> 1; + + /* Clearing status works only if the global control is enable on zxc. */ + x86_pmu.enabled_ack = 1; + + x86_pmu.event_constraints = zxc_event_constraints; + zx_pmon_event_map[PERF_COUNT_HW_INSTRUCTIONS] = 0; + zx_pmon_event_map[PERF_COUNT_HW_CACHE_REFERENCES] = 0; + zx_pmon_event_map[PERF_COUNT_HW_CACHE_MISSES] = 0; + zx_pmon_event_map[PERF_COUNT_HW_BUS_CYCLES] = 0; + + pr_cont("C events, "); + break; + } + return -ENODEV; + + case 0x07: + zx_pmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = + X86_CONFIG(.event = 0x01, .umask = 0x01, .inv = 0x01, .cmask = 0x01); + + zx_pmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = + X86_CONFIG(.event = 0x0f, .umask = 0x04, .inv = 0, .cmask = 0); + + switch (boot_cpu_data.x86_model) { + case 0x1b: + memcpy(hw_cache_event_ids, zxd_hw_cache_event_ids, + sizeof(hw_cache_event_ids)); + + x86_pmu.event_constraints = zxd_event_constraints; + + zx_pmon_event_map[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x0700; + zx_pmon_event_map[PERF_COUNT_HW_BRANCH_MISSES] = 0x0709; + + pr_cont("D events, "); + break; + case 0x3b: + memcpy(hw_cache_event_ids, zxe_hw_cache_event_ids, + sizeof(hw_cache_event_ids)); + + x86_pmu.event_constraints = zxd_event_constraints; + + zx_pmon_event_map[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x0028; + zx_pmon_event_map[PERF_COUNT_HW_BRANCH_MISSES] = 0x0029; + + pr_cont("E events, "); + break; + default: + return -ENODEV; + } + break; + + default: + return -ENODEV; + } + + x86_pmu.intel_ctrl = (1 << (x86_pmu.num_counters)) - 1; + x86_pmu.intel_ctrl |= ((1LL << x86_pmu.num_counters_fixed)-1) << INTEL_PMC_IDX_FIXED; + + if (x86_pmu.event_constraints) { + for_each_event_constraint(c, x86_pmu.event_constraints) { + c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1; + c->weight += x86_pmu.num_counters; + } + } + + return 0; +} diff --git a/arch/x86/events/zhaoxin/uncore.c b/arch/x86/events/zhaoxin/uncore.c new file mode 100644 index 000000000000..4c4ea01d23c8 --- /dev/null +++ b/arch/x86/events/zhaoxin/uncore.c @@ -0,0 +1,1101 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <linux/module.h> + +#include <asm/cpu_device_id.h> +#include "uncore.h" + +static struct zhaoxin_uncore_type *empty_uncore[] = { NULL, }; +static struct zhaoxin_uncore_type **uncore_msr_uncores = empty_uncore; + +/* mask of cpus that collect uncore events */ +static cpumask_t uncore_cpu_mask; + +/* constraint for the fixed counter */ +static struct event_constraint uncore_constraint_fixed = + EVENT_CONSTRAINT(~0ULL, 1 << UNCORE_PMC_IDX_FIXED, ~0ULL); + +static int max_packages; + +/* CHX event control */ +#define CHX_UNC_CTL_EV_SEL_MASK 0x000000ff +#define CHX_UNC_CTL_UMASK_MASK 0x0000ff00 +#define CHX_UNC_CTL_EDGE_DET (1 << 18) +#define CHX_UNC_CTL_EN (1 << 22) +#define CHX_UNC_CTL_INVERT (1 << 23) +#define CHX_UNC_CTL_CMASK_MASK 0xff000000 +#define CHX_UNC_FIXED_CTR_CTL_EN (1 << 0) + +#define CHX_UNC_RAW_EVENT_MASK (CHX_UNC_CTL_EV_SEL_MASK | \ + CHX_UNC_CTL_UMASK_MASK | \ + CHX_UNC_CTL_EDGE_DET | \ + CHX_UNC_CTL_INVERT | \ + CHX_UNC_CTL_CMASK_MASK) + +/* CHX global control register */ +#define CHX_UNC_PERF_GLOBAL_CTL 0x391 +#define CHX_UNC_FIXED_CTR 0x394 +#define CHX_UNC_FIXED_CTR_CTRL 0x395 + +/* CHX uncore global control */ +#define CHX_UNC_GLOBAL_CTL_EN_PC_ALL ((1ULL << 4) - 1) +#define CHX_UNC_GLOBAL_CTL_EN_FC (1ULL << 32) + +/* CHX uncore register */ +#define CHX_UNC_PERFEVTSEL0 0x3c0 +#define CHX_UNC_UNCORE_PMC0 0x3b0 + +DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7"); +DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15"); +DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18"); +DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23"); +DEFINE_UNCORE_FORMAT_ATTR(cmask8, cmask, "config:24-31"); + +ssize_t zx_uncore_event_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) +{ + struct uncore_event_desc *event = + container_of(attr, struct uncore_event_desc, attr); + return sprintf(buf, "%s", event->config); +} + +/*chx uncore support */ +static void chx_uncore_msr_disable_event(struct zhaoxin_uncore_box *box, struct perf_event *event) +{ + wrmsrl(event->hw.config_base, 0); +} + +static u64 uncore_msr_read_counter(struct zhaoxin_uncore_box *box, struct perf_event *event) +{ + u64 count; + + rdmsrl(event->hw.event_base, count); + + return count; +} + +static void chx_uncore_msr_disable_box(struct zhaoxin_uncore_box *box) +{ + wrmsrl(CHX_UNC_PERF_GLOBAL_CTL, 0); +} + +static void chx_uncore_msr_enable_box(struct zhaoxin_uncore_box *box) +{ + wrmsrl(CHX_UNC_PERF_GLOBAL_CTL, CHX_UNC_GLOBAL_CTL_EN_PC_ALL | CHX_UNC_GLOBAL_CTL_EN_FC); +} + +static void chx_uncore_msr_enable_event(struct zhaoxin_uncore_box *box, struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + + if (hwc->idx < UNCORE_PMC_IDX_FIXED) + wrmsrl(hwc->config_base, hwc->config | CHX_UNC_CTL_EN); + else + wrmsrl(hwc->config_base, CHX_UNC_FIXED_CTR_CTL_EN); +} + +static struct attribute *chx_uncore_formats_attr[] = { + &format_attr_event.attr, + &format_attr_umask.attr, + &format_attr_edge.attr, + &format_attr_inv.attr, + &format_attr_cmask8.attr, + NULL, +}; + +static struct attribute_group chx_uncore_format_group = { + .name = "format", + .attrs = chx_uncore_formats_attr, +}; + +static struct uncore_event_desc chx_uncore_events[] = { + { /* end: all zeroes */ }, +}; + +static struct zhaoxin_uncore_ops chx_uncore_msr_ops = { + .disable_box = chx_uncore_msr_disable_box, + .enable_box = chx_uncore_msr_enable_box, + .disable_event = chx_uncore_msr_disable_event, + .enable_event = chx_uncore_msr_enable_event, + .read_counter = uncore_msr_read_counter, +}; + +static struct zhaoxin_uncore_type chx_uncore_box = { + .name = "", + .num_counters = 4, + .num_boxes = 1, + .perf_ctr_bits = 48, + .fixed_ctr_bits = 48, + .event_ctl = CHX_UNC_PERFEVTSEL0, + .perf_ctr = CHX_UNC_UNCORE_PMC0, + .fixed_ctr = CHX_UNC_FIXED_CTR, + .fixed_ctl = CHX_UNC_FIXED_CTR_CTRL, + .event_mask = CHX_UNC_RAW_EVENT_MASK, + .event_descs = chx_uncore_events, + .ops = &chx_uncore_msr_ops, + .format_group = &chx_uncore_format_group, +}; + +static struct zhaoxin_uncore_type *chx_msr_uncores[] = { + &chx_uncore_box, + NULL, +}; + +static struct zhaoxin_uncore_box *uncore_pmu_to_box(struct zhaoxin_uncore_pmu *pmu, int cpu) +{ + unsigned int package_id = topology_logical_package_id(cpu); + + /* + * The unsigned check also catches the '-1' return value for non + * existent mappings in the topology map. + */ + return package_id < max_packages ? pmu->boxes[package_id] : NULL; +} + +static void uncore_assign_hw_event(struct zhaoxin_uncore_box *box, + struct perf_event *event, int idx) +{ + struct hw_perf_event *hwc = &event->hw; + + hwc->idx = idx; + hwc->last_tag = ++box->tags[idx]; + + if (uncore_pmc_fixed(hwc->idx)) { + hwc->event_base = uncore_fixed_ctr(box); + hwc->config_base = uncore_fixed_ctl(box); + return; + } + + hwc->config_base = uncore_event_ctl(box, hwc->idx); + hwc->event_base = uncore_perf_ctr(box, hwc->idx); +} + +void uncore_perf_event_update(struct zhaoxin_uncore_box *box, struct perf_event *event) +{ + u64 prev_count, new_count, delta; + int shift; + + if (uncore_pmc_fixed(event->hw.idx)) + shift = 64 - uncore_fixed_ctr_bits(box); + else + shift = 64 - uncore_perf_ctr_bits(box); + + /* the hrtimer might modify the previous event value */ +again: + prev_count = local64_read(&event->hw.prev_count); + new_count = uncore_read_counter(box, event); + if (local64_xchg(&event->hw.prev_count, new_count) != prev_count) + goto again; + + delta = (new_count << shift) - (prev_count << shift); + delta >>= shift; + + local64_add(delta, &event->count); +} + +static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer) +{ + struct zhaoxin_uncore_box *box; + struct perf_event *event; + unsigned long flags; + int bit; + + box = container_of(hrtimer, struct zhaoxin_uncore_box, hrtimer); + if (!box->n_active || box->cpu != smp_processor_id()) + return HRTIMER_NORESTART; + /* + * disable local interrupt to prevent uncore_pmu_event_start/stop + * to interrupt the update process + */ + local_irq_save(flags); + + /* + * handle boxes with an active event list as opposed to active + * counters + */ + list_for_each_entry(event, &box->active_list, active_entry) { + uncore_perf_event_update(box, event); + } + + for_each_set_bit(bit, box->active_mask, UNCORE_PMC_IDX_MAX) + uncore_perf_event_update(box, box->events[bit]); + + local_irq_restore(flags); + + hrtimer_forward_now(hrtimer, ns_to_ktime(box->hrtimer_duration)); + return HRTIMER_RESTART; +} + +static void uncore_pmu_start_hrtimer(struct zhaoxin_uncore_box *box) +{ + hrtimer_start(&box->hrtimer, ns_to_ktime(box->hrtimer_duration), + HRTIMER_MODE_REL_PINNED); +} + +static void uncore_pmu_cancel_hrtimer(struct zhaoxin_uncore_box *box) +{ + hrtimer_cancel(&box->hrtimer); +} + +static void uncore_pmu_init_hrtimer(struct zhaoxin_uncore_box *box) +{ + hrtimer_init(&box->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + box->hrtimer.function = uncore_pmu_hrtimer; +} + +static struct zhaoxin_uncore_box *uncore_alloc_box(struct zhaoxin_uncore_type *type, + int node) +{ + int i, size, numshared = type->num_shared_regs; + struct zhaoxin_uncore_box *box; + + size = sizeof(*box) + numshared * sizeof(struct zhaoxin_uncore_extra_reg); + + box = kzalloc_node(size, GFP_KERNEL, node); + if (!box) + return NULL; + + for (i = 0; i < numshared; i++) + raw_spin_lock_init(&box->shared_regs[i].lock); + + uncore_pmu_init_hrtimer(box); + box->cpu = -1; + box->package_id = -1; + + /* set default hrtimer timeout */ + box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL; + + INIT_LIST_HEAD(&box->active_list); + + return box; +} + +static bool is_box_event(struct zhaoxin_uncore_box *box, struct perf_event *event) +{ + return &box->pmu->pmu == event->pmu; +} + +static struct event_constraint * +uncore_get_event_constraint(struct zhaoxin_uncore_box *box, struct perf_event *event) +{ + struct zhaoxin_uncore_type *type = box->pmu->type; + struct event_constraint *c; + + if (type->ops->get_constraint) { + c = type->ops->get_constraint(box, event); + if (c) + return c; + } + + if (event->attr.config == UNCORE_FIXED_EVENT) + return &uncore_constraint_fixed; + + if (type->constraints) { + for_each_event_constraint(c, type->constraints) { + if ((event->hw.config & c->cmask) == c->code) + return c; + } + } + + return &type->unconstrainted; +} + +static void uncore_put_event_constraint(struct zhaoxin_uncore_box *box, + struct perf_event *event) +{ + if (box->pmu->type->ops->put_constraint) + box->pmu->type->ops->put_constraint(box, event); +} + +static int uncore_assign_events(struct zhaoxin_uncore_box *box, int assign[], int n) +{ + unsigned long used_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)]; + struct event_constraint *c; + int i, wmin, wmax, ret = 0; + struct hw_perf_event *hwc; + + bitmap_zero(used_mask, UNCORE_PMC_IDX_MAX); + + for (i = 0, wmin = UNCORE_PMC_IDX_MAX, wmax = 0; i < n; i++) { + c = uncore_get_event_constraint(box, box->event_list[i]); + box->event_constraint[i] = c; + wmin = min(wmin, c->weight); + wmax = max(wmax, c->weight); + } + + /* fastpath, try to reuse previous register */ + for (i = 0; i < n; i++) { + hwc = &box->event_list[i]->hw; + c = box->event_constraint[i]; + + /* never assigned */ + if (hwc->idx == -1) + break; + + /* constraint still honored */ + if (!test_bit(hwc->idx, c->idxmsk)) + break; + + /* not already used */ + if (test_bit(hwc->idx, used_mask)) + break; + + __set_bit(hwc->idx, used_mask); + if (assign) + assign[i] = hwc->idx; + } + /* slow path */ + if (i != n) + ret = perf_assign_events(box->event_constraint, n, + wmin, wmax, n, assign); + + if (!assign || ret) { + for (i = 0; i < n; i++) + uncore_put_event_constraint(box, box->event_list[i]); + } + return ret ? -EINVAL : 0; +} + +static void uncore_pmu_event_start(struct perf_event *event, int flags) +{ + struct zhaoxin_uncore_box *box = uncore_event_to_box(event); + int idx = event->hw.idx; + + + if (WARN_ON_ONCE(idx == -1 || idx >= UNCORE_PMC_IDX_MAX)) + return; + + if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED))) + return; + + event->hw.state = 0; + box->events[idx] = event; + box->n_active++; + __set_bit(idx, box->active_mask); + + local64_set(&event->hw.prev_count, uncore_read_counter(box, event)); + uncore_enable_event(box, event); + + if (box->n_active == 1) { + uncore_enable_box(box); + uncore_pmu_start_hrtimer(box); + } +} + +static void uncore_pmu_event_stop(struct perf_event *event, int flags) +{ + struct zhaoxin_uncore_box *box = uncore_event_to_box(event); + struct hw_perf_event *hwc = &event->hw; + + if (__test_and_clear_bit(hwc->idx, box->active_mask)) { + uncore_disable_event(box, event); + box->n_active--; + box->events[hwc->idx] = NULL; + WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED); + hwc->state |= PERF_HES_STOPPED; + + if (box->n_active == 0) { + uncore_disable_box(box); + uncore_pmu_cancel_hrtimer(box); + } + } + + if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) { + /* + * Drain the remaining delta count out of a event + * that we are disabling: + */ + uncore_perf_event_update(box, event); + hwc->state |= PERF_HES_UPTODATE; + } +} + +static int +uncore_collect_events(struct zhaoxin_uncore_box *box, struct perf_event *leader, + bool dogrp) +{ + struct perf_event *event; + int n, max_count; + + max_count = box->pmu->type->num_counters; + if (box->pmu->type->fixed_ctl) + max_count++; + + if (box->n_events >= max_count) + return -EINVAL; + + n = box->n_events; + + if (is_box_event(box, leader)) { + box->event_list[n] = leader; + n++; + } + + if (!dogrp) + return n; + + for_each_sibling_event(event, leader) { + if (!is_box_event(box, event) || + event->state <= PERF_EVENT_STATE_OFF) + continue; + + if (n >= max_count) + return -EINVAL; + + box->event_list[n] = event; + n++; + } + return n; +} + +static int uncore_pmu_event_add(struct perf_event *event, int flags) +{ + struct zhaoxin_uncore_box *box = uncore_event_to_box(event); + struct hw_perf_event *hwc = &event->hw; + int assign[UNCORE_PMC_IDX_MAX]; + int i, n, ret; + + if (!box) + return -ENODEV; + + ret = n = uncore_collect_events(box, event, false); + if (ret < 0) + return ret; + + hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED; + if (!(flags & PERF_EF_START)) + hwc->state |= PERF_HES_ARCH; + + ret = uncore_assign_events(box, assign, n); + if (ret) + return ret; + + /* save events moving to new counters */ + for (i = 0; i < box->n_events; i++) { + event = box->event_list[i]; + hwc = &event->hw; + + if (hwc->idx == assign[i] && + hwc->last_tag == box->tags[assign[i]]) + continue; + /* + * Ensure we don't accidentally enable a stopped + * counter simply because we rescheduled. + */ + if (hwc->state & PERF_HES_STOPPED) + hwc->state |= PERF_HES_ARCH; + + uncore_pmu_event_stop(event, PERF_EF_UPDATE); + } + + /* reprogram moved events into new counters */ + for (i = 0; i < n; i++) { + event = box->event_list[i]; + hwc = &event->hw; + + if (hwc->idx != assign[i] || + hwc->last_tag != box->tags[assign[i]]) + uncore_assign_hw_event(box, event, assign[i]); + else if (i < box->n_events) + continue; + + if (hwc->state & PERF_HES_ARCH) + continue; + + uncore_pmu_event_start(event, 0); + } + box->n_events = n; + + return 0; +} + +static int uncore_validate_group(struct zhaoxin_uncore_pmu *pmu, + struct perf_event *event) +{ + struct perf_event *leader = event->group_leader; + struct zhaoxin_uncore_box *fake_box; + int ret = -EINVAL, n; + + fake_box = uncore_alloc_box(pmu->type, NUMA_NO_NODE); + if (!fake_box) + return -ENOMEM; + + fake_box->pmu = pmu; + /* + * the event is not yet connected with its + * siblings therefore we must first collect + * existing siblings, then add the new event + * before we can simulate the scheduling + */ + n = uncore_collect_events(fake_box, leader, true); + if (n < 0) + goto out; + + fake_box->n_events = n; + n = uncore_collect_events(fake_box, event, false); + if (n < 0) + goto out; + + fake_box->n_events = n; + + ret = uncore_assign_events(fake_box, NULL, n); +out: + kfree(fake_box); + return ret; +} + +static void uncore_pmu_event_del(struct perf_event *event, int flags) +{ + struct zhaoxin_uncore_box *box = uncore_event_to_box(event); + int i; + + uncore_pmu_event_stop(event, PERF_EF_UPDATE); + + for (i = 0; i < box->n_events; i++) { + if (event == box->event_list[i]) { + uncore_put_event_constraint(box, event); + + for (++i; i < box->n_events; i++) + box->event_list[i - 1] = box->event_list[i]; + + --box->n_events; + break; + } + } + + event->hw.idx = -1; + event->hw.last_tag = ~0ULL; +} + +static void uncore_pmu_event_read(struct perf_event *event) +{ + struct zhaoxin_uncore_box *box = uncore_event_to_box(event); + + uncore_perf_event_update(box, event); +} + +static int uncore_pmu_event_init(struct perf_event *event) +{ + struct zhaoxin_uncore_pmu *pmu; + struct zhaoxin_uncore_box *box; + struct hw_perf_event *hwc = &event->hw; + int ret; + + if (event->attr.type != event->pmu->type) + return -ENOENT; + + pmu = uncore_event_to_pmu(event); + /* no device found for this pmu */ + if (pmu->func_id < 0) + return -ENOENT; + + /* Sampling not supported yet */ + if (hwc->sample_period) + return -EINVAL; + + /* + * Place all uncore events for a particular physical package + * onto a single cpu + */ + if (event->cpu < 0) + return -EINVAL; + box = uncore_pmu_to_box(pmu, event->cpu); + if (!box || box->cpu < 0) + return -EINVAL; + event->cpu = box->cpu; + event->pmu_private = box; + + event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG; + + event->hw.idx = -1; + event->hw.last_tag = ~0ULL; + event->hw.extra_reg.idx = EXTRA_REG_NONE; + event->hw.branch_reg.idx = EXTRA_REG_NONE; + + if (event->attr.config == UNCORE_FIXED_EVENT) { + /* no fixed counter */ + if (!pmu->type->fixed_ctl) + return -EINVAL; + /* + * if there is only one fixed counter, only the first pmu + * can access the fixed counter + */ + if (pmu->type->single_fixed && pmu->pmu_idx > 0) + return -EINVAL; + + /* fixed counters have event field hardcoded to zero */ + hwc->config = 0ULL; + } else { + hwc->config = event->attr.config & + (pmu->type->event_mask | ((u64)pmu->type->event_mask_ext << 32)); + if (pmu->type->ops->hw_config) { + ret = pmu->type->ops->hw_config(box, event); + if (ret) + return ret; + } + } + + if (event->group_leader != event) + ret = uncore_validate_group(pmu, event); + else + ret = 0; + + return ret; +} + +static ssize_t uncore_get_attr_cpumask(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpumap_print_to_pagebuf(true, buf, &uncore_cpu_mask); +} + +static DEVICE_ATTR(cpumask, S_IRUGO, uncore_get_attr_cpumask, NULL); + +static struct attribute *uncore_pmu_attrs[] = { + &dev_attr_cpumask.attr, + NULL, +}; + +static const struct attribute_group uncore_pmu_attr_group = { + .attrs = uncore_pmu_attrs, +}; + +static void uncore_pmu_unregister(struct zhaoxin_uncore_pmu *pmu) +{ + if (!pmu->registered) + return; + perf_pmu_unregister(&pmu->pmu); + pmu->registered = false; +} + +static void uncore_free_boxes(struct zhaoxin_uncore_pmu *pmu) +{ + int package; + + for (package = 0; package < max_packages; package++) + kfree(pmu->boxes[package]); + kfree(pmu->boxes); +} + +static void uncore_type_exit(struct zhaoxin_uncore_type *type) +{ + struct zhaoxin_uncore_pmu *pmu = type->pmus; + int i; + + if (pmu) { + for (i = 0; i < type->num_boxes; i++, pmu++) { + uncore_pmu_unregister(pmu); + uncore_free_boxes(pmu); + } + kfree(type->pmus); + type->pmus = NULL; + } + kfree(type->events_group); + type->events_group = NULL; +} + +static void uncore_types_exit(struct zhaoxin_uncore_type **types) +{ + for (; *types; types++) + uncore_type_exit(*types); +} + +static int __init uncore_type_init(struct zhaoxin_uncore_type *type, bool setid) +{ + struct zhaoxin_uncore_pmu *pmus; + size_t size; + int i, j; + + pmus = kcalloc(type->num_boxes, sizeof(*pmus), GFP_KERNEL); + if (!pmus) + return -ENOMEM; + + size = max_packages*sizeof(struct zhaoxin_uncore_box *); + + for (i = 0; i < type->num_boxes; i++) { + pmus[i].func_id = setid ? i : -1; + pmus[i].pmu_idx = i; + pmus[i].type = type; + pmus[i].boxes = kzalloc(size, GFP_KERNEL); + if (!pmus[i].boxes) + goto err; + } + + type->pmus = pmus; + type->unconstrainted = (struct event_constraint) + __EVENT_CONSTRAINT(0, (1ULL << type->num_counters) - 1, + 0, type->num_counters, 0, 0); + + if (type->event_descs) { + struct { + struct attribute_group group; + struct attribute *attrs[]; + } *attr_group; + for (i = 0; type->event_descs[i].attr.attr.name; i++) + ; + + attr_group = kzalloc(struct_size(attr_group, attrs, i + 1), GFP_KERNEL); + if (!attr_group) + goto err; + + attr_group->group.name = "events"; + attr_group->group.attrs = attr_group->attrs; + + for (j = 0; j < i; j++) + attr_group->attrs[j] = &type->event_descs[j].attr.attr; + + type->events_group = &attr_group->group; + } + + type->pmu_group = &uncore_pmu_attr_group; + + return 0; + +err: + for (i = 0; i < type->num_boxes; i++) + kfree(pmus[i].boxes); + kfree(pmus); + + return -ENOMEM; +} + +static int __init +uncore_types_init(struct zhaoxin_uncore_type **types, bool setid) +{ + int ret; + + for (; *types; types++) { + ret = uncore_type_init(*types, setid); + if (ret) + return ret; + } + return 0; +} + +static void uncore_change_type_ctx(struct zhaoxin_uncore_type *type, int old_cpu, + int new_cpu) +{ + struct zhaoxin_uncore_pmu *pmu = type->pmus; + struct zhaoxin_uncore_box *box; + int i, package; + + package = topology_logical_package_id(old_cpu < 0 ? new_cpu : old_cpu); + for (i = 0; i < type->num_boxes; i++, pmu++) { + box = pmu->boxes[package]; + if (!box) + continue; + + if (old_cpu < 0) { + WARN_ON_ONCE(box->cpu != -1); + box->cpu = new_cpu; + continue; + } + + WARN_ON_ONCE(box->cpu != old_cpu); + box->cpu = -1; + if (new_cpu < 0) + continue; + + uncore_pmu_cancel_hrtimer(box); + perf_pmu_migrate_context(&pmu->pmu, old_cpu, new_cpu); + box->cpu = new_cpu; + } +} + +static void uncore_change_context(struct zhaoxin_uncore_type **uncores, + int old_cpu, int new_cpu) +{ + for (; *uncores; uncores++) + uncore_change_type_ctx(*uncores, old_cpu, new_cpu); +} + +static void uncore_box_unref(struct zhaoxin_uncore_type **types, int id) +{ + struct zhaoxin_uncore_type *type; + struct zhaoxin_uncore_pmu *pmu; + struct zhaoxin_uncore_box *box; + int i; + + for (; *types; types++) { + type = *types; + pmu = type->pmus; + for (i = 0; i < type->num_boxes; i++, pmu++) { + box = pmu->boxes[id]; + if (box && atomic_dec_return(&box->refcnt) == 0) + uncore_box_exit(box); + } + } +} + +static int uncore_event_cpu_offline(unsigned int cpu) +{ + int package, target; + + /* Check if exiting cpu is used for collecting uncore events */ + if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask)) + goto unref; + /* Find a new cpu to collect uncore events */ + target = cpumask_any_but(topology_core_cpumask(cpu), cpu); + + /* Migrate uncore events to the new target */ + if (target < nr_cpu_ids) + cpumask_set_cpu(target, &uncore_cpu_mask); + else + target = -1; + + uncore_change_context(uncore_msr_uncores, cpu, target); + +unref: + /* Clear the references */ + package = topology_logical_package_id(cpu); + uncore_box_unref(uncore_msr_uncores, package); + return 0; +} + +static int allocate_boxes(struct zhaoxin_uncore_type **types, + unsigned int package, unsigned int cpu) +{ + struct zhaoxin_uncore_box *box, *tmp; + struct zhaoxin_uncore_type *type; + struct zhaoxin_uncore_pmu *pmu; + LIST_HEAD(allocated); + int i; + + /* Try to allocate all required boxes */ + for (; *types; types++) { + type = *types; + pmu = type->pmus; + for (i = 0; i < type->num_boxes; i++, pmu++) { + if (pmu->boxes[package]) + continue; + box = uncore_alloc_box(type, cpu_to_node(cpu)); + if (!box) + goto cleanup; + box->pmu = pmu; + box->package_id = package; + list_add(&box->active_list, &allocated); + } + } + /* Install them in the pmus */ + list_for_each_entry_safe(box, tmp, &allocated, active_list) { + list_del_init(&box->active_list); + box->pmu->boxes[package] = box; + } + return 0; + +cleanup: + list_for_each_entry_safe(box, tmp, &allocated, active_list) { + list_del_init(&box->active_list); + kfree(box); + } + return -ENOMEM; +} + +static int uncore_box_ref(struct zhaoxin_uncore_type **types, + int id, unsigned int cpu) +{ + struct zhaoxin_uncore_type *type; + struct zhaoxin_uncore_pmu *pmu; + struct zhaoxin_uncore_box *box; + int i, ret; + + ret = allocate_boxes(types, id, cpu); + if (ret) + return ret; + + for (; *types; types++) { + type = *types; + pmu = type->pmus; + for (i = 0; i < type->num_boxes; i++, pmu++) { + box = pmu->boxes[id]; + if (box && atomic_inc_return(&box->refcnt) == 1) + uncore_box_init(box); + } + } + return 0; +} + +static int uncore_event_cpu_online(unsigned int cpu) +{ + int package, target, msr_ret; + + package = topology_logical_package_id(cpu); + msr_ret = uncore_box_ref(uncore_msr_uncores, package, cpu); + + if (msr_ret) + return -ENOMEM; + + /* + * Check if there is an online cpu in the package + * which collects uncore events already. + */ + target = cpumask_any_and(&uncore_cpu_mask, topology_core_cpumask(cpu)); + if (target < nr_cpu_ids) + return 0; + + cpumask_set_cpu(cpu, &uncore_cpu_mask); + + if (!msr_ret) + uncore_change_context(uncore_msr_uncores, -1, cpu); + + return 0; +} + +static int uncore_pmu_register(struct zhaoxin_uncore_pmu *pmu) +{ + int ret; + + if (!pmu->type->pmu) { + pmu->pmu = (struct pmu) { + .attr_groups = pmu->type->attr_groups, + .task_ctx_nr = perf_invalid_context, + .event_init = uncore_pmu_event_init, + .add = uncore_pmu_event_add, + .del = uncore_pmu_event_del, + .start = uncore_pmu_event_start, + .stop = uncore_pmu_event_stop, + .read = uncore_pmu_event_read, + .module = THIS_MODULE, + }; + } else { + pmu->pmu = *pmu->type->pmu; + pmu->pmu.attr_groups = pmu->type->attr_groups; + } + + if (pmu->type->num_boxes == 1) { + if (strlen(pmu->type->name) > 0) + sprintf(pmu->name, "uncore_%s", pmu->type->name); + else + sprintf(pmu->name, "uncore"); + } else { + sprintf(pmu->name, "uncore_%s_%d", pmu->type->name, + pmu->pmu_idx); + } + + ret = perf_pmu_register(&pmu->pmu, pmu->name, -1); + if (!ret) + pmu->registered = true; + return ret; +} + +static int __init type_pmu_register(struct zhaoxin_uncore_type *type) +{ + int i, ret; + + for (i = 0; i < type->num_boxes; i++) { + ret = uncore_pmu_register(&type->pmus[i]); + if (ret) + return ret; + } + return 0; +} + +static int __init uncore_msr_pmus_register(void) +{ + struct zhaoxin_uncore_type **types = uncore_msr_uncores; + int ret; + + for (; *types; types++) { + ret = type_pmu_register(*types); + if (ret) + return ret; + } + return 0; +} + +static int __init uncore_cpu_init(void) +{ + int ret; + + ret = uncore_types_init(uncore_msr_uncores, true); + if (ret) + goto err; + + ret = uncore_msr_pmus_register(); + if (ret) + goto err; + return 0; +err: + uncore_types_exit(uncore_msr_uncores); + uncore_msr_uncores = empty_uncore; + return ret; +} + + +#define CENTAUR_UNCORE_MODEL_MATCH(model, init) \ + { X86_VENDOR_CENTAUR, 7, model, X86_FEATURE_ANY, (unsigned long)&init } + +#define ZHAOXIN_UNCORE_MODEL_MATCH(model, init) \ + { X86_VENDOR_ZHAOXIN, 7, model, X86_FEATURE_ANY, (unsigned long)&init } + +struct zhaoxin_uncore_init_fun { + void (*cpu_init)(void); +}; + +void chx_uncore_cpu_init(void) +{ + uncore_msr_uncores = chx_msr_uncores; +} + +static const struct zhaoxin_uncore_init_fun chx_uncore_init __initconst = { + .cpu_init = chx_uncore_cpu_init, +}; + +static const struct x86_cpu_id zhaoxin_uncore_match[] __initconst = { + CENTAUR_UNCORE_MODEL_MATCH(ZHAOXIN_FAM7_CHX001, chx_uncore_init), + CENTAUR_UNCORE_MODEL_MATCH(ZHAOXIN_FAM7_CHX002, chx_uncore_init), + ZHAOXIN_UNCORE_MODEL_MATCH(ZHAOXIN_FAM7_CHX001, chx_uncore_init), + ZHAOXIN_UNCORE_MODEL_MATCH(ZHAOXIN_FAM7_CHX002, chx_uncore_init), + {}, +}; + +MODULE_DEVICE_TABLE(x86cpu, zhaoxin_uncore_match); + +static int __init zhaoxin_uncore_init(void) +{ + const struct x86_cpu_id *id; + struct zhaoxin_uncore_init_fun *uncore_init; + int cret = 0, ret; + + id = x86_match_cpu(zhaoxin_uncore_match); + + if (!id) + return -ENODEV; + + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) + return -ENODEV; + + max_packages = topology_max_packages(); + + pr_info("welcome to uncore!\n"); + + uncore_init = (struct zhaoxin_uncore_init_fun *)id->driver_data; + + if (uncore_init->cpu_init) { + uncore_init->cpu_init(); + cret = uncore_cpu_init(); + } + + if (cret) + return -ENODEV; + + ret = cpuhp_setup_state(CPUHP_AP_PERF_X86_UNCORE_ONLINE, + "perf/x86/zhaoxin/uncore:online", + uncore_event_cpu_online, + uncore_event_cpu_offline); + if (ret) + goto err; + + pr_info("uncore init success!\n"); + return 0; +err: + uncore_types_exit(uncore_msr_uncores); + return ret; +} +module_init(zhaoxin_uncore_init); + +static void __exit zhaoxin_uncore_exit(void) +{ + cpuhp_remove_state(CPUHP_AP_PERF_X86_UNCORE_ONLINE); + uncore_types_exit(uncore_msr_uncores); +} +module_exit(zhaoxin_uncore_exit); + +MODULE_LICENSE("GPL"); diff --git a/arch/x86/events/zhaoxin/uncore.h b/arch/x86/events/zhaoxin/uncore.h new file mode 100644 index 000000000000..3521123dc95d --- /dev/null +++ b/arch/x86/events/zhaoxin/uncore.h @@ -0,0 +1,308 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <linux/slab.h> +#include <linux/pci.h> +#include <asm/apicdef.h> +#include <linux/io-64-nonatomic-lo-hi.h> + +#include <linux/perf_event.h> +#include "../perf_event.h" + +#define ZHAOXIN_FAM7_CHX001 0x1b +#define ZHAOXIN_FAM7_CHX002 0x3b + +#define UNCORE_PMU_NAME_LEN 32 +#define UNCORE_PMU_HRTIMER_INTERVAL (60LL * NSEC_PER_SEC) +#define UNCORE_CHX_IMC_HRTIMER_INTERVAL (5ULL * NSEC_PER_SEC) + + +#define UNCORE_FIXED_EVENT 0xff +#define UNCORE_PMC_IDX_MAX_GENERIC 4 +#define UNCORE_PMC_IDX_MAX_FIXED 1 +#define UNCORE_PMC_IDX_FIXED UNCORE_PMC_IDX_MAX_GENERIC + +#define UNCORE_PMC_IDX_MAX (UNCORE_PMC_IDX_FIXED + 1) + +struct zhaoxin_uncore_ops; +struct zhaoxin_uncore_pmu; +struct zhaoxin_uncore_box; +struct uncore_event_desc; + +struct zhaoxin_uncore_type { + const char *name; + int num_counters; + int num_boxes; + int perf_ctr_bits; + int fixed_ctr_bits; + unsigned perf_ctr; + unsigned event_ctl; + unsigned event_mask; + unsigned event_mask_ext; + unsigned fixed_ctr; + unsigned fixed_ctl; + unsigned box_ctl; + unsigned msr_offset; + unsigned num_shared_regs:8; + unsigned single_fixed:1; + unsigned pair_ctr_ctl:1; + unsigned *msr_offsets; + struct event_constraint unconstrainted; + struct event_constraint *constraints; + struct zhaoxin_uncore_pmu *pmus; + struct zhaoxin_uncore_ops *ops; + struct uncore_event_desc *event_descs; + const struct attribute_group *attr_groups[4]; + struct pmu *pmu; /* for custom pmu ops */ +}; + +#define pmu_group attr_groups[0] +#define format_group attr_groups[1] +#define events_group attr_groups[2] + +struct zhaoxin_uncore_ops { + void (*init_box)(struct zhaoxin_uncore_box *); + void (*exit_box)(struct zhaoxin_uncore_box *); + void (*disable_box)(struct zhaoxin_uncore_box *); + void (*enable_box)(struct zhaoxin_uncore_box *); + void (*disable_event)(struct zhaoxin_uncore_box *, struct perf_event *); + void (*enable_event)(struct zhaoxin_uncore_box *, struct perf_event *); + u64 (*read_counter)(struct zhaoxin_uncore_box *, struct perf_event *); + int (*hw_config)(struct zhaoxin_uncore_box *, struct perf_event *); + struct event_constraint *(*get_constraint)(struct zhaoxin_uncore_box *, + struct perf_event *); + void (*put_constraint)(struct zhaoxin_uncore_box *, struct perf_event *); +}; + +struct zhaoxin_uncore_pmu { + struct pmu pmu; + char name[UNCORE_PMU_NAME_LEN]; + int pmu_idx; + int func_id; + bool registered; + atomic_t activeboxes; + struct zhaoxin_uncore_type *type; + struct zhaoxin_uncore_box **boxes; +}; + +struct zhaoxin_uncore_extra_reg { + raw_spinlock_t lock; + u64 config, config1, config2; + atomic_t ref; +}; + +struct zhaoxin_uncore_box { + int pci_phys_id; + int package_id; /*Package ID */ + int n_active; /* number of active events */ + int n_events; + int cpu; /* cpu to collect events */ + unsigned long flags; + atomic_t refcnt; + struct perf_event *events[UNCORE_PMC_IDX_MAX]; + struct perf_event *event_list[UNCORE_PMC_IDX_MAX]; + struct event_constraint *event_constraint[UNCORE_PMC_IDX_MAX]; + unsigned long active_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)]; + u64 tags[UNCORE_PMC_IDX_MAX]; + struct pci_dev *pci_dev; + struct zhaoxin_uncore_pmu *pmu; + u64 hrtimer_duration; /* hrtimer timeout for this box */ + struct hrtimer hrtimer; + struct list_head list; + struct list_head active_list; + void __iomem *io_addr; + struct zhaoxin_uncore_extra_reg shared_regs[0]; +}; + +#define UNCORE_BOX_FLAG_INITIATED 0 + +struct uncore_event_desc { + struct kobj_attribute attr; + const char *config; +}; + +ssize_t zx_uncore_event_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf); + +#define ZHAOXIN_UNCORE_EVENT_DESC(_name, _config) \ +{ \ + .attr = __ATTR(_name, 0444, zx_uncore_event_show, NULL), \ + .config = _config, \ +} + +#define DEFINE_UNCORE_FORMAT_ATTR(_var, _name, _format) \ +static ssize_t __uncore_##_var##_show(struct kobject *kobj, \ + struct kobj_attribute *attr, \ + char *page) \ +{ \ + BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE); \ + return sprintf(page, _format "\n"); \ +} \ +static struct kobj_attribute format_attr_##_var = \ + __ATTR(_name, 0444, __uncore_##_var##_show, NULL) + +static inline bool uncore_pmc_fixed(int idx) +{ + return idx == UNCORE_PMC_IDX_FIXED; +} + +static inline unsigned uncore_msr_box_offset(struct zhaoxin_uncore_box *box) +{ + struct zhaoxin_uncore_pmu *pmu = box->pmu; + + return pmu->type->msr_offsets ? + pmu->type->msr_offsets[pmu->pmu_idx] : + pmu->type->msr_offset * pmu->pmu_idx; +} + +static inline unsigned uncore_msr_box_ctl(struct zhaoxin_uncore_box *box) +{ + if (!box->pmu->type->box_ctl) + return 0; + return box->pmu->type->box_ctl + uncore_msr_box_offset(box); +} + +static inline unsigned uncore_msr_fixed_ctl(struct zhaoxin_uncore_box *box) +{ + if (!box->pmu->type->fixed_ctl) + return 0; + return box->pmu->type->fixed_ctl + uncore_msr_box_offset(box); +} + +static inline unsigned uncore_msr_fixed_ctr(struct zhaoxin_uncore_box *box) +{ + return box->pmu->type->fixed_ctr + uncore_msr_box_offset(box); +} + +static inline +unsigned uncore_msr_event_ctl(struct zhaoxin_uncore_box *box, int idx) +{ + return box->pmu->type->event_ctl + + (box->pmu->type->pair_ctr_ctl ? 2 * idx : idx) + + uncore_msr_box_offset(box); +} + +static inline +unsigned uncore_msr_perf_ctr(struct zhaoxin_uncore_box *box, int idx) +{ + return box->pmu->type->perf_ctr + + (box->pmu->type->pair_ctr_ctl ? 2 * idx : idx) + + uncore_msr_box_offset(box); +} + +static inline +unsigned uncore_fixed_ctl(struct zhaoxin_uncore_box *box) +{ + return uncore_msr_fixed_ctl(box); +} + +static inline +unsigned uncore_fixed_ctr(struct zhaoxin_uncore_box *box) +{ + return uncore_msr_fixed_ctr(box); +} + +static inline +unsigned uncore_event_ctl(struct zhaoxin_uncore_box *box, int idx) +{ + return uncore_msr_event_ctl(box, idx); +} + +static inline +unsigned uncore_perf_ctr(struct zhaoxin_uncore_box *box, int idx) +{ + return uncore_msr_perf_ctr(box, idx); +} + +static inline int uncore_perf_ctr_bits(struct zhaoxin_uncore_box *box) +{ + return box->pmu->type->perf_ctr_bits; +} + +static inline int uncore_fixed_ctr_bits(struct zhaoxin_uncore_box *box) +{ + return box->pmu->type->fixed_ctr_bits; +} + +static inline int uncore_num_counters(struct zhaoxin_uncore_box *box) +{ + return box->pmu->type->num_counters; +} + +static inline void uncore_disable_box(struct zhaoxin_uncore_box *box) +{ + if (box->pmu->type->ops->disable_box) + box->pmu->type->ops->disable_box(box); +} + +static inline void uncore_enable_box(struct zhaoxin_uncore_box *box) +{ + if (box->pmu->type->ops->enable_box) + box->pmu->type->ops->enable_box(box); +} + +static inline void uncore_disable_event(struct zhaoxin_uncore_box *box, + struct perf_event *event) +{ + box->pmu->type->ops->disable_event(box, event); +} + +static inline void uncore_enable_event(struct zhaoxin_uncore_box *box, + struct perf_event *event) +{ + box->pmu->type->ops->enable_event(box, event); +} + +static inline u64 uncore_read_counter(struct zhaoxin_uncore_box *box, + struct perf_event *event) +{ + return box->pmu->type->ops->read_counter(box, event); +} + +static inline void uncore_box_init(struct zhaoxin_uncore_box *box) +{ + if (!test_and_set_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) { + if (box->pmu->type->ops->init_box) + box->pmu->type->ops->init_box(box); + } +} + +static inline void uncore_box_exit(struct zhaoxin_uncore_box *box) +{ + if (test_and_clear_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) { + if (box->pmu->type->ops->exit_box) + box->pmu->type->ops->exit_box(box); + } +} + +static inline bool uncore_box_is_fake(struct zhaoxin_uncore_box *box) +{ + return (box->package_id < 0); +} + +static inline struct zhaoxin_uncore_pmu *uncore_event_to_pmu(struct perf_event *event) +{ + return container_of(event->pmu, struct zhaoxin_uncore_pmu, pmu); +} + +static inline struct zhaoxin_uncore_box *uncore_event_to_box(struct perf_event *event) +{ + return event->pmu_private; +} + + +static struct zhaoxin_uncore_box *uncore_pmu_to_box(struct zhaoxin_uncore_pmu *pmu, int cpu); +static u64 uncore_msr_read_counter(struct zhaoxin_uncore_box *box, struct perf_event *event); + +static void uncore_pmu_start_hrtimer(struct zhaoxin_uncore_box *box); +static void uncore_pmu_cancel_hrtimer(struct zhaoxin_uncore_box *box); +static void uncore_pmu_event_start(struct perf_event *event, int flags); +static void uncore_pmu_event_stop(struct perf_event *event, int flags); +static int uncore_pmu_event_add(struct perf_event *event, int flags); +static void uncore_pmu_event_del(struct perf_event *event, int flags); +static void uncore_pmu_event_read(struct perf_event *event); +static void uncore_perf_event_update(struct zhaoxin_uncore_box *box, struct perf_event *event); +struct event_constraint * +uncore_get_constraint(struct zhaoxin_uncore_box *box, struct perf_event *event); +void uncore_put_constraint(struct zhaoxin_uncore_box *box, struct perf_event *event); +u64 uncore_shared_reg_config(struct zhaoxin_uncore_box *box, int idx); + +void chx_uncore_cpu_init(void); diff --git a/arch/x86/kernel/cpu/perfctr-watchdog.c b/arch/x86/kernel/cpu/perfctr-watchdog.c index 9556930cd8c1..a548d9104604 100644 --- a/arch/x86/kernel/cpu/perfctr-watchdog.c +++ b/arch/x86/kernel/cpu/perfctr-watchdog.c @@ -63,6 +63,10 @@ static inline unsigned int nmi_perfctr_msr_to_bit(unsigned int msr) case 15: return msr - MSR_P4_BPU_PERFCTR0; } + break; + case X86_VENDOR_ZHAOXIN: + case X86_VENDOR_CENTAUR: + return msr - MSR_ARCH_PERFMON_PERFCTR0; } return 0; } @@ -92,6 +96,10 @@ static inline unsigned int nmi_evntsel_msr_to_bit(unsigned int msr) case 15: return msr - MSR_P4_BSU_ESCR0; } + break; + case X86_VENDOR_ZHAOXIN: + case X86_VENDOR_CENTAUR: + return msr - MSR_ARCH_PERFMON_EVENTSEL0; } return 0; -- 2.20.1
1 0
0 0
[PATCH kernel-4.19 2/2] x86/speculation/swapgs: Exclude Zhaoxin CPUs from SWAPGS vulnerability
by LeoLiu-oc 25 Mar '21

25 Mar '21
mainline inclusion from mainline-5.6 commit a84de2fa962c1b0551653fe245d6cb5f6129179c category: x86/speculation/swapgs -------------------------------- New Zhaoxin family 7 CPUs are not affected by the SWAPGS vulnerability. So mark these CPUs in the cpu vulnerability whitelist accordingly. Signed-off-by: Tony W Wang-oc <TonyWWang-oc(a)zhaoxin.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Link: https://lore.kernel.org/r/1579227872-26972-3-git-send-email-TonyWWang-oc@zh… Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- arch/x86/kernel/cpu/common.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 246f98153240..1d83e5f7c5a8 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1017,8 +1017,8 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), /* Zhaoxin Family 7 */ - VULNWL(CENTAUR, 7, X86_MODEL_ANY, NO_SPECTRE_V2), - VULNWL(ZHAOXIN, 7, X86_MODEL_ANY, NO_SPECTRE_V2), + VULNWL(CENTAUR, 7, X86_MODEL_ANY, NO_SPECTRE_V2 | NO_SWAPGS), + VULNWL(ZHAOXIN, 7, X86_MODEL_ANY, NO_SPECTRE_V2 | NO_SWAPGS), {} }; -- 2.20.1
1 0
0 0
[PATCH kernel-4.19 1/2] x86/speculation/spectre_v2: Exclude Zhaoxin CPUs from SPECTRE_V2
by LeoLiu-oc 25 Mar '21

25 Mar '21
mainline inclusion from mainline-5.6 commit 1e41a766c98b481400ab8c5a7aa8ea63a1bb03de category: x86/speculation/spectre_v2 -------------------------------- New Zhaoxin family 7 CPUs are not affected by SPECTRE_V2. So define a separate cpu_vuln_whitelist bit NO_SPECTRE_V2 and add these CPUs to the cpu vulnerability whitelist. Signed-off-by: Tony W Wang-oc <TonyWWang-oc(a)zhaoxin.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Link: https://lore.kernel.org/r/1579227872-26972-2-git-send-email-TonyWWang-oc@zh… Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- arch/x86/kernel/cpu/common.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index a5954a2f8591..246f98153240 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -954,6 +954,7 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c) #define MSBDS_ONLY BIT(5) #define NO_SWAPGS BIT(6) #define NO_ITLB_MULTIHIT BIT(7) +#define NO_SPECTRE_V2 BIT(8) #define VULNWL(_vendor, _family, _model, _whitelist) \ { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist } @@ -1014,6 +1015,10 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { /* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), + + /* Zhaoxin Family 7 */ + VULNWL(CENTAUR, 7, X86_MODEL_ANY, NO_SPECTRE_V2), + VULNWL(ZHAOXIN, 7, X86_MODEL_ANY, NO_SPECTRE_V2), {} }; @@ -1068,7 +1073,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) return; setup_force_cpu_bug(X86_BUG_SPECTRE_V1); - setup_force_cpu_bug(X86_BUG_SPECTRE_V2); + + if (!cpu_matches(cpu_vuln_whitelist, NO_SPECTRE_V2)) + setup_force_cpu_bug(X86_BUG_SPECTRE_V2); if (!cpu_matches(cpu_vuln_whitelist, NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) && -- 2.20.1
1 0
0 0
[PATCH kernel-4.19 3/3] x86/mce: Add Zhaoxin LMCE support
by LeoLiu-oc 25 Mar '21

25 Mar '21
mainline inclusion from mainline-5.5 commit 70f0c230031dfef3c9b3e37b2a8c18d3f7186fb2 category: x86/mce Add support for more Zhaoxin CPUs. -------------------------------- Newer Zhaoxin CPUs support LMCE compatible with Intel. Add support for that. [ bp: Export functions and massage. ] Signed-off-by: Tony W Wang-oc <TonyWWang-oc(a)zhaoxin.com> Signed-off-by: Borislav Petkov <bp(a)suse.de> Cc: CooperYan(a)zhaoxin.com Cc: DavidWang(a)zhaoxin.com Cc: HerryYang(a)zhaoxin.com Cc: "H. Peter Anvin" <hpa(a)zytor.com> Cc: Ingo Molnar <mingo(a)redhat.com> Cc: linux-edac <linux-edac(a)vger.kernel.org> Cc: QiyuanWang(a)zhaoxin.com Cc: Thomas Gleixner <tglx(a)linutronix.de> Cc: Tony Luck <tony.luck(a)intel.com> Cc: x86-ml <x86(a)kernel.org> Link: https://lkml.kernel.org/r/1568787573-1297-5-git-send-email-TonyWWang-oc@zha… Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- arch/x86/kernel/cpu/mce/core.c | 25 +++++++++++++++++++++++-- arch/x86/kernel/cpu/mce/intel.c | 4 ++-- arch/x86/kernel/cpu/mce/internal.h | 4 ++++ 3 files changed, 29 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 71ec7afbabdf..8534b952af76 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -1125,6 +1125,13 @@ static bool __mc_check_crashing_cpu(int cpu) u64 mcgstatus; mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS); + + if (boot_cpu_data.x86_vendor == X86_VENDOR_ZHAOXIN || + boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR) { + if (mcgstatus & MCG_STATUS_LMCES) + return false; + } + if (mcgstatus & MCG_STATUS_RIPV) { mce_wrmsrl(MSR_IA32_MCG_STATUS, 0); return true; @@ -1274,9 +1281,11 @@ void do_machine_check(struct pt_regs *regs, long error_code) /* * Check if this MCE is signaled to only this logical processor, - * on Intel only. + * on Intel, Zhaoxin only. */ - if (m.cpuvendor == X86_VENDOR_INTEL) + if (m.cpuvendor == X86_VENDOR_INTEL || + m.cpuvendor == X86_VENDOR_ZHAOXIN || + m.cpuvendor == X86_VENDOR_CENTAUR) lmce = m.mcgstatus & MCG_STATUS_LMCES; /* @@ -1745,9 +1754,15 @@ static void mce_zhaoxin_feature_init(struct cpuinfo_x86 *c) } intel_init_cmci(); + intel_init_lmce(); mce_adjust_timer = cmci_intel_adjust_timer; } +static void mce_zhaoxin_feature_clear(struct cpuinfo_x86 *c) +{ + intel_clear_lmce(); +} + static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c) { switch (c->x86_vendor) { @@ -1781,6 +1796,12 @@ static void __mcheck_cpu_clear_vendor(struct cpuinfo_x86 *c) case X86_VENDOR_INTEL: mce_intel_feature_clear(c); break; + + case X86_VENDOR_ZHAOXIN: + case X86_VENDOR_CENTAUR: + mce_zhaoxin_feature_clear(c); + break; + default: break; } diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c index 6a220c999a01..f6f3b2675164 100644 --- a/arch/x86/kernel/cpu/mce/intel.c +++ b/arch/x86/kernel/cpu/mce/intel.c @@ -445,7 +445,7 @@ void intel_init_cmci(void) cmci_recheck(); } -static void intel_init_lmce(void) +void intel_init_lmce(void) { u64 val; @@ -458,7 +458,7 @@ static void intel_init_lmce(void) wrmsrl(MSR_IA32_MCG_EXT_CTL, val | MCG_EXT_CTL_LMCE_EN); } -static void intel_clear_lmce(void) +void intel_clear_lmce(void) { u64 val; diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h index 99d73d18f2c4..22e8aa8c8fe7 100644 --- a/arch/x86/kernel/cpu/mce/internal.h +++ b/arch/x86/kernel/cpu/mce/internal.h @@ -53,12 +53,16 @@ bool mce_intel_cmci_poll(void); void mce_intel_hcpu_update(unsigned long cpu); void cmci_disable_bank(int bank); void intel_init_cmci(void); +void intel_init_lmce(void); +void intel_clear_lmce(void); #else # define cmci_intel_adjust_timer mce_adjust_timer_default static inline bool mce_intel_cmci_poll(void) { return false; } static inline void mce_intel_hcpu_update(unsigned long cpu) { } static inline void cmci_disable_bank(int bank) { } static inline void intel_init_cmci(void) { } +static inline void intel_init_lmce(void) { } +static inline void intel_clear_lmce(void) { } #endif void mce_timer_kick(unsigned long interval); -- 2.20.1
1 0
0 0
[PATCH kernel-4.19 2/3] x86/mce: Add Zhaoxin CMCI support
by LeoLiu-oc 25 Mar '21

25 Mar '21
mainline inclusion from mainline-5.5 commit 5a3d56a034be9e8e87a6cb9ed3f2928184db1417 category: x86/mce Add support for more Zhaoxin CPUs. -------------------------------- All newer Zhaoxin CPUs support CMCI and are compatible with Intel's Machine-Check Architecture. Add that support for Zhaoxin CPUs. [ bp: Massage comments and export intel_init_cmci(). ] Signed-off-by: Tony W Wang-oc <TonyWWang-oc(a)zhaoxin.com> Signed-off-by: Borislav Petkov <bp(a)suse.de> Cc: CooperYan(a)zhaoxin.com Cc: DavidWang(a)zhaoxin.com Cc: HerryYang(a)zhaoxin.com Cc: "H. Peter Anvin" <hpa(a)zytor.com> Cc: Ingo Molnar <mingo(a)redhat.com> Cc: linux-edac <linux-edac(a)vger.kernel.org> Cc: QiyuanWang(a)zhaoxin.com Cc: Thomas Gleixner <tglx(a)linutronix.de> Cc: Tony Luck <tony.luck(a)intel.com> Cc: x86-ml <x86(a)kernel.org> Link: https://lkml.kernel.org/r/1568787573-1297-4-git-send-email-TonyWWang-oc@zha… Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- arch/x86/kernel/cpu/mce/core.c | 30 +++++++++++++++++++----------- arch/x86/kernel/cpu/mce/intel.c | 7 +++++-- arch/x86/kernel/cpu/mce/internal.h | 2 ++ 3 files changed, 26 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index dce0fbd4cb0f..71ec7afbabdf 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -1726,19 +1726,26 @@ static void __mcheck_cpu_init_early(struct cpuinfo_x86 *c) } } -static void mce_centaur_feature_init(struct cpuinfo_x86 *c) +static void mce_zhaoxin_feature_init(struct cpuinfo_x86 *c) { struct mca_config *cfg = &mca_cfg; - - /* - * All newer Centaur CPUs support MCE broadcasting. Enable - * synchronization with a one second timeout. - */ - if ((c->x86 == 6 && c->x86_model == 0xf && c->x86_stepping >= 0xe) || - c->x86 > 6) { - if (cfg->monarch_timeout < 0) - cfg->monarch_timeout = USEC_PER_SEC; + /* + * These CPUs have MCA bank 8 which reports only one error type called + * SVAD (System View Address Decoder). The reporting of that error is + * controlled by IA32_MC8.CTL.0. + * + * If enabled, prefetching on these CPUs will cause SVAD MCE when + * virtual machines start and result in a system panic. Always disable + * bank 8 SVAD error by default. + */ + if ((c->x86 == 7 && c->x86_model == 0x1b) || + (c->x86_model == 0x19 || c->x86_model == 0x1f)) { + if (cfg->banks > 8) + mce_banks[8].ctl = 0; } + + intel_init_cmci(); + mce_adjust_timer = cmci_intel_adjust_timer; } static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c) @@ -1759,7 +1766,8 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c) break; case X86_VENDOR_CENTAUR: - mce_centaur_feature_init(c); + case X86_VENDOR_ZHAOXIN: + mce_zhaoxin_feature_init(c); break; default: diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c index 693c8cfac75d..6a220c999a01 100644 --- a/arch/x86/kernel/cpu/mce/intel.c +++ b/arch/x86/kernel/cpu/mce/intel.c @@ -85,8 +85,11 @@ static int cmci_supported(int *banks) * initialization is vendor keyed and this * makes sure none of the backdoors are entered otherwise. */ - if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL && + boot_cpu_data.x86_vendor != X86_VENDOR_ZHAOXIN && + boot_cpu_data.x86_vendor != X86_VENDOR_CENTAUR) return 0; + if (!boot_cpu_has(X86_FEATURE_APIC) || lapic_get_maxlvt() < 6) return 0; rdmsrl(MSR_IA32_MCG_CAP, cap); @@ -423,7 +426,7 @@ void cmci_disable_bank(int bank) raw_spin_unlock_irqrestore(&cmci_discover_lock, flags); } -static void intel_init_cmci(void) +void intel_init_cmci(void) { int banks; diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h index ceb67cd5918f..99d73d18f2c4 100644 --- a/arch/x86/kernel/cpu/mce/internal.h +++ b/arch/x86/kernel/cpu/mce/internal.h @@ -52,11 +52,13 @@ unsigned long cmci_intel_adjust_timer(unsigned long interval); bool mce_intel_cmci_poll(void); void mce_intel_hcpu_update(unsigned long cpu); void cmci_disable_bank(int bank); +void intel_init_cmci(void); #else # define cmci_intel_adjust_timer mce_adjust_timer_default static inline bool mce_intel_cmci_poll(void) { return false; } static inline void mce_intel_hcpu_update(unsigned long cpu) { } static inline void cmci_disable_bank(int bank) { } +static inline void intel_init_cmci(void) { } #endif void mce_timer_kick(unsigned long interval); -- 2.20.1
1 0
0 0
[PATCH kernel-4.19 2/2] x86/acpi/cstate: Add Zhaoxin processors support for cache flush policy in C3
by LeoLiu-oc 25 Mar '21

25 Mar '21
Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- arch/x86/kernel/acpi/cstate.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/arch/x86/kernel/acpi/cstate.c b/arch/x86/kernel/acpi/cstate.c index 45745ecaa624..5eebe05b00fb 100644 --- a/arch/x86/kernel/acpi/cstate.c +++ b/arch/x86/kernel/acpi/cstate.c @@ -63,6 +63,21 @@ void acpi_processor_power_init_bm_check(struct acpi_processor_flags *flags, c->x86_stepping >= 0x0e)) flags->bm_check = 1; } + + if (c->x86_vendor == X86_VENDOR_ZHAOXIN) { + /* + * All Zhaoxin CPUs that support C3 share cache. + * And caches should not be flushed by software while + * entering C3 type state. + */ + flags->bm_check = 1; + /* + * On all recent Zhaoxin platforms, ARB_DISABLE is a nop. + * So, set bm_control to zero to indicate that ARB_DISABLE + * is not required while entering C3 type state. + */ + flags->bm_control = 0; + } } EXPORT_SYMBOL(acpi_processor_power_init_bm_check); -- 2.20.1
1 0
0 0
[PATCH kernel-4.19 1/2] x86/power: Optimize C3 entry on Centaur CPUs
by LeoLiu-oc 25 Mar '21

25 Mar '21
mainline inclusion from mainline-5.2 commit 987ddbe4870b53623d76ac64044c55a13e368113 category: x86/power -------------------------------- For new Centaur CPUs the ucode will take care of the preservation of cache coherence between CPU cores in C-states regardless of how deep the C-states are. So, it is not necessary to flush the caches in software befor entering C3. This useless operation will cause performance drop for the cores which share some caches with the idling core. Signed-off-by: David Wang <davidwang(a)zhaoxin.com> Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de> Acked-by: Pavel Machek <pavel(a)ucw.cz> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: brucechang(a)via-alliance.com Cc: cooperyan(a)zhaoxin.com Cc: len.brown(a)intel.com Cc: linux-pm(a)kernel.org Cc: qiyuanwang(a)zhaoxin.com Cc: rjw(a)rjwysocki.net Cc: timguo(a)zhaoxin.com Link: http://lkml.kernel.org/r/1545900110-2757-1-git-send-email-davidwang@zhaoxin… [ Tidy up the comment. ] Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- arch/x86/kernel/acpi/cstate.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/x86/kernel/acpi/cstate.c b/arch/x86/kernel/acpi/cstate.c index 92539a1c3e31..45745ecaa624 100644 --- a/arch/x86/kernel/acpi/cstate.c +++ b/arch/x86/kernel/acpi/cstate.c @@ -51,6 +51,18 @@ void acpi_processor_power_init_bm_check(struct acpi_processor_flags *flags, if (c->x86_vendor == X86_VENDOR_INTEL && (c->x86 > 0xf || (c->x86 == 6 && c->x86_model >= 0x0f))) flags->bm_control = 0; + /* + * For all recent Centaur CPUs, the ucode will make sure that each + * core can keep cache coherence with each other while entering C3 + * type state. So, set bm_check to 1 to indicate that the kernel + * doesn't need to execute a cache flush operation (WBINVD) when + * entering C3 type state. + */ + if (c->x86_vendor == X86_VENDOR_CENTAUR) { + if (c->x86 > 6 || (c->x86 == 6 && c->x86_model == 0x0f && + c->x86_stepping >= 0x0e)) + flags->bm_check = 1; + } } EXPORT_SYMBOL(acpi_processor_power_init_bm_check); -- 2.20.1
1 0
0 0
[PATCH kernel-4.19 6/6] x86/cpu: Add detect extended topology for Zhaoxin CPUs
by LeoLiu-oc 25 Mar '21

25 Mar '21
Detect the extended topology information of Zhaoxin CPUs if available. The patch is scheduled to be submitted to the kernel mainline in 2021. Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- arch/x86/kernel/cpu/centaur.c | 20 +++++++++++++++++++- arch/x86/kernel/cpu/zhaoxin.c | 7 ++++++- 2 files changed, 25 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c index 8735be464bc1..49b33cc78751 100644 --- a/arch/x86/kernel/cpu/centaur.c +++ b/arch/x86/kernel/cpu/centaur.c @@ -115,6 +115,21 @@ static void early_init_centaur(struct cpuinfo_x86 *c) set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC); set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC); } + + if (c->cpuid_level >= 0x00000001) { + u32 eax, ebx, ecx, edx; + + cpuid(0x00000001, &eax, &ebx, &ecx, &edx); + /* + * If HTT (EDX[28]) is set EBX[16:23] contain the number of + * apicids which are reserved per package. Store the resulting + * shift value for the package management code. + */ + if (edx & (1U << 28)) + c->x86_coreid_bits = get_count_order((ebx >> 16) & 0xff); + } + if (detect_extended_topology_early(c) < 0) + detect_ht_early(c); } static void centaur_detect_vmx_virtcap(struct cpuinfo_x86 *c) @@ -158,8 +173,11 @@ static void init_centaur(struct cpuinfo_x86 *c) clear_cpu_cap(c, 0*32+31); #endif early_init_centaur(c); + detect_extended_topology(c); init_intel_cacheinfo(c); - detect_num_cpu_cores(c); + if (!cpu_has(c, X86_FEATURE_XTOPOLOGY)) + detect_num_cpu_cores(c); + #ifdef CONFIG_X86_32 detect_ht(c); #endif diff --git a/arch/x86/kernel/cpu/zhaoxin.c b/arch/x86/kernel/cpu/zhaoxin.c index 452fd0a6bc61..b6fc969b3e74 100644 --- a/arch/x86/kernel/cpu/zhaoxin.c +++ b/arch/x86/kernel/cpu/zhaoxin.c @@ -85,6 +85,8 @@ static void early_init_zhaoxin(struct cpuinfo_x86 *c) c->x86_coreid_bits = get_count_order((ebx >> 16) & 0xff); } + if (detect_extended_topology_early(c) < 0) + detect_ht_early(c); } static void zhaoxin_detect_vmx_virtcap(struct cpuinfo_x86 *c) @@ -115,8 +117,11 @@ static void zhaoxin_detect_vmx_virtcap(struct cpuinfo_x86 *c) static void init_zhaoxin(struct cpuinfo_x86 *c) { early_init_zhaoxin(c); + detect_extended_topology(c); init_intel_cacheinfo(c); - detect_num_cpu_cores(c); + if (!cpu_has(c, X86_FEATURE_XTOPOLOGY)) + detect_num_cpu_cores(c); + #ifdef CONFIG_X86_32 detect_ht(c); #endif -- 2.20.1
1 0
0 0
[PATCH kernel-4.19 5/6] x86/cpufeatures: Add Zhaoxin feature bits
by LeoLiu-oc 25 Mar '21

25 Mar '21
Add Zhaoxin feature bits on Zhaoxin CPUs. The patch is scheduled to be submitted to the kernel mainline in 2021. Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com> --- arch/x86/include/asm/cpufeatures.h | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index f7f9604b10cc..48535113efa6 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -145,8 +145,12 @@ #define X86_FEATURE_HYPERVISOR ( 4*32+31) /* Running on a hypervisor */ /* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */ +#define X86_FEATURE_SM2 (5*32+0) /* sm2 present*/ +#define X86_FEATURE_SM2_EN (5*32+1) /* sm2 enabled */ #define X86_FEATURE_XSTORE ( 5*32+ 2) /* "rng" RNG present (xstore) */ #define X86_FEATURE_XSTORE_EN ( 5*32+ 3) /* "rng_en" RNG enabled */ +#define X86_FEATURE_CCS (5*32+4) /* "sm3 sm4" present */ +#define X86_FEATURE_CCS_EN (5*32+5) /* "sm3_en sm4_en" enabled */ #define X86_FEATURE_XCRYPT ( 5*32+ 6) /* "ace" on-CPU crypto (xcrypt) */ #define X86_FEATURE_XCRYPT_EN ( 5*32+ 7) /* "ace_en" on-CPU crypto enabled */ #define X86_FEATURE_ACE2 ( 5*32+ 8) /* Advanced Cryptography Engine v2 */ @@ -155,6 +159,23 @@ #define X86_FEATURE_PHE_EN ( 5*32+11) /* PHE enabled */ #define X86_FEATURE_PMM ( 5*32+12) /* PadLock Montgomery Multiplier */ #define X86_FEATURE_PMM_EN ( 5*32+13) /* PMM enabled */ +#define X86_FEATURE_ZX_FMA (5*32+15) /* FMA supported */ +#define X86_FEATURE_PARALLAX (5*32+16) /* Adaptive P-state control present */ +#define X86_FEATURE_PARALLAX_EN (5*32+17) /* Adaptive P-state control enabled */ +#define X86_FEATURE_OVERSTRESS (5*32+18) /* Overstress Feature for auto overclock present */ +#define X86_FEATURE_OVERSTRESS_EN (5*32+19) /* Overstress Feature for auto overclock enabled */ +#define X86_FEATURE_TM3 (5*32+20) /* Thermal Monitor 3 present */ +#define X86_FEATURE_TM3_EN (5*32+21) /* Thermal Monitor 3 enabled */ +#define X86_FEATURE_RNG2 (5*32+22) /* 2nd generation of RNG present */ +#define X86_FEATURE_RNG2_EN (5*32+23) /* 2nd generation of RNG enabled */ +#define X86_FEATURE_SEM (5*32+24) /* SME feature present */ +#define X86_FEATURE_PHE2 (5*32+25) /* SHA384 and SHA 512 present */ +#define X86_FEATURE_PHE2_EN (5*32+26) /* SHA384 and SHA 512 enabled */ +#define X86_FEATURE_XMODX (5*32+27) /* "rsa" XMODEXP and MONTMUL2 instructions are present */ +#define X86_FEATURE_XMODX_EN (5*32+28) /* "rsa_en" XMODEXP and MONTMUL2instructions are enabled */ +#define X86_FEATURE_VEX (5*32+29) /* VEX instructions are present */ +#define X86_FEATURE_VEX_EN (5*32+30) /* VEX instructions are enabled */ +#define X86_FEATURE_STK (5*32+31) /* STK are present */ /* More extended AMD flags: CPUID level 0x80000001, ECX, word 6 */ #define X86_FEATURE_LAHF_LM ( 6*32+ 0) /* LAHF/SAHF in long mode */ -- 2.20.1
1 0
0 0
  • ← Newer
  • 1
  • ...
  • 1884
  • 1885
  • 1886
  • 1887
  • 1888
  • 1889
  • 1890
  • ...
  • 1950
  • Older →

HyperKitty Powered by HyperKitty