[PATCH OLK-6.6 00/36] perf arm-spe: Support SPE new features
From: Ying Jiang<jiangying44@h-partners.com> Anshuman Khandual (9): arm64: Handle BRBE booting requirements arm64/sysreg: Update register fields for ID_AA64MMFR0_EL1 arm64/sysreg: Add register fields for HDFGRTR2_EL2 arm64/sysreg: Add register fields for HDFGWTR2_EL2 arm64/sysreg: Add register fields for HFGITR2_EL2 arm64/sysreg: Add register fields for HFGRTR2_EL2 arm64/sysreg: Add register fields for HFGWTR2_EL2 arm64/sysreg: Update ID_DFR0_EL1 register fields arm64/boot: Enable EL2 requirements for FEAT_PMUv3p9 Dave Martin (1): arm64: el2_setup.h: Rename some labels to be more diff-friendly James Clark (12): perf: arm_spe: Add format option for discard mode arm64: sysreg: Add new PMSFCR_EL1 fields and PMSDSFR_EL1 register perf: arm_spe: Support FEAT_SPEv1p4 filters perf: arm_spe: Add support for FEAT_SPE_EFT extended filtering arm64/boot: Factor out a macro to check SPE version arm64/boot: Enable EL2 requirements for SPE_FEAT_FDS KVM: arm64: Add trap configs for PMSDSFR_EL1 perf: Add perf_event_attr::config4 perf: arm_spe: Add support for filtering on data source tools headers UAPI: Sync linux/perf_event.h with the kernel sources perf tools: Add support for perf_event_attr::config4 perf docs: arm-spe: Document new SPE filtering features Leo Yan (1): perf: arm_spe: Expose event filter Marc Zyngier (5): KVM: arm64: Shave a few bytes from the EL2 idmap code arm64/sysreg: Allow a 'Mapping' descriptor for system registers arm64: sysreg: Add registers trapped by HDFG{R,W}TR2_EL2 KVM: arm64: nv: Add include containing the VNCR_EL2 offsets arm64: sysreg: Update PMSIDR_EL1 description Rob Herring (Arm) (4): arm64: el2_setup.h: Make __init_el2_fgt labels consistent, again perf/arm_pmuv3: Add PMUv3.9 per counter EL0 access control perf: arm_pmu: Use of_property_present() perf: arm_pmu: Remove event index to counter remapping Yushan Wang (4): Revert "arm64/boot: Enable EL2 requirements for BRBE" perf: Fix hw_metric event idx config perf: Keep event consistency of hw_metric events arm_pmuv3: hw_metric: Remove unnecessary check of event type Documentation/arch/arm64/booting.rst | 75 +++- arch/arm64/include/asm/el2_setup.h | 130 +++--- arch/arm64/include/asm/kvm_asm.h | 1 + arch/arm64/include/asm/sysreg.h | 19 +- arch/arm64/include/asm/vncr_mapping.h | 105 +++++ arch/arm64/kernel/asm-offsets.c | 1 + arch/arm64/kvm/emulate-nested.c | 1 + arch/arm64/kvm/hyp/nvhe/hyp-init.S | 52 ++- arch/arm64/kvm/pmu-emul.c | 6 +- arch/arm64/kvm/sys_regs.c | 1 + arch/arm64/tools/gen-sysreg.awk | 2 +- arch/arm64/tools/sysreg | 502 +++++++++++++++++++++- drivers/perf/apple_m1_cpu_pmu.c | 4 +- drivers/perf/arm_pmu.c | 11 +- drivers/perf/arm_pmu_platform.c | 2 +- drivers/perf/arm_pmuv3.c | 68 +-- drivers/perf/arm_spe_pmu.c | 170 +++++++- include/linux/perf/arm_pmu.h | 2 +- include/linux/perf/arm_pmuv3.h | 1 + include/uapi/linux/perf_event.h | 4 + tools/include/uapi/linux/perf_event.h | 2 + tools/perf/Documentation/perf-arm-spe.txt | 103 ++++- tools/perf/tests/parse-events.c | 14 +- tools/perf/util/parse-events.c | 11 + tools/perf/util/parse-events.h | 1 + tools/perf/util/parse-events.l | 1 + tools/perf/util/pmu.c | 3 + tools/perf/util/pmu.h | 1 + 28 files changed, 1098 insertions(+), 195 deletions(-) create mode 100644 arch/arm64/include/asm/vncr_mapping.h -- 2.33.0
From: Marc Zyngier <maz@kernel.org> mainline inclusion from mainline-v6.12-rc5 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Our idmap is becoming too big, to the point where it doesn't fit in a 4kB page anymore. There are some low-hanging fruits though, such as the el2_init_state horror that is expanded 3 times in the kernel. Let's at least limit ourselves to two copies, which makes the kernel link again. At some point, we'll have to have a better way of doing this. Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241009204903.GA3353168@thelio-3990X Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/include/asm/kvm_asm.h | 1 + arch/arm64/kernel/asm-offsets.c | 1 + arch/arm64/kvm/hyp/nvhe/hyp-init.S | 52 +++++++++++++++++------------- 3 files changed, 31 insertions(+), 23 deletions(-) diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 24b5e6b23417..443e30284753 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -179,6 +179,7 @@ struct kvm_nvhe_init_params { unsigned long hcr_el2; unsigned long vttbr; unsigned long vtcr; + unsigned long tmp; }; /* diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index 9cafa7d43ae6..ea3ef8400567 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -158,6 +158,7 @@ int main(void) DEFINE(NVHE_INIT_HCR_EL2, offsetof(struct kvm_nvhe_init_params, hcr_el2)); DEFINE(NVHE_INIT_VTTBR, offsetof(struct kvm_nvhe_init_params, vttbr)); DEFINE(NVHE_INIT_VTCR, offsetof(struct kvm_nvhe_init_params, vtcr)); + DEFINE(NVHE_INIT_TMP, offsetof(struct kvm_nvhe_init_params, tmp)); #endif #ifdef CONFIG_CPU_PM DEFINE(CPU_CTX_SP, offsetof(struct cpu_suspend_ctx, sp)); diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S index 1cc06e6797bd..0f9d8422cf80 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S +++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S @@ -23,28 +23,25 @@ .align 11 SYM_CODE_START(__kvm_hyp_init) - ventry __invalid // Synchronous EL2t - ventry __invalid // IRQ EL2t - ventry __invalid // FIQ EL2t - ventry __invalid // Error EL2t + ventry . // Synchronous EL2t + ventry . // IRQ EL2t + ventry . // FIQ EL2t + ventry . // Error EL2t - ventry __invalid // Synchronous EL2h - ventry __invalid // IRQ EL2h - ventry __invalid // FIQ EL2h - ventry __invalid // Error EL2h + ventry . // Synchronous EL2h + ventry . // IRQ EL2h + ventry . // FIQ EL2h + ventry . // Error EL2h ventry __do_hyp_init // Synchronous 64-bit EL1 - ventry __invalid // IRQ 64-bit EL1 - ventry __invalid // FIQ 64-bit EL1 - ventry __invalid // Error 64-bit EL1 + ventry . // IRQ 64-bit EL1 + ventry . // FIQ 64-bit EL1 + ventry . // Error 64-bit EL1 - ventry __invalid // Synchronous 32-bit EL1 - ventry __invalid // IRQ 32-bit EL1 - ventry __invalid // FIQ 32-bit EL1 - ventry __invalid // Error 32-bit EL1 - -__invalid: - b . + ventry . // Synchronous 32-bit EL1 + ventry . // IRQ 32-bit EL1 + ventry . // FIQ 32-bit EL1 + ventry . // Error 32-bit EL1 /* * Only uses x0..x3 so as to not clobber callee-saved SMCCC registers. @@ -75,6 +72,13 @@ __do_hyp_init: eret SYM_CODE_END(__kvm_hyp_init) +SYM_CODE_START_LOCAL(__kvm_init_el2_state) + /* Initialize EL2 CPU state to sane values. */ + init_el2_state // Clobbers x0..x2 + finalise_el2_state + ret +SYM_CODE_END(__kvm_init_el2_state) + /* * Initialize the hypervisor in EL2. * @@ -101,9 +105,12 @@ SYM_CODE_START_LOCAL(___kvm_hyp_init) // TPIDR_EL2 is used to preserve x0 across the macro maze... isb msr tpidr_el2, x0 - init_el2_state - finalise_el2_state + str lr, [x0, #NVHE_INIT_TMP] + + bl __kvm_init_el2_state + mrs x0, tpidr_el2 + ldr lr, [x0, #NVHE_INIT_TMP] 1: ldr x1, [x0, #NVHE_INIT_TPIDR_EL2] @@ -202,9 +209,8 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu) 2: msr SPsel, #1 // We want to use SP_EL{1,2} - /* Initialize EL2 CPU state to sane values. */ - init_el2_state // Clobbers x0..x2 - finalise_el2_state + bl __kvm_init_el2_state + __init_el2_nvhe_prepare_eret /* Enable MMU, set vectors and stack. */ -- 2.33.0
driver inclusion category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA ----------------------------------------------- This reverts commit 9192278e57e60fe5c18f98d2b44f5829f7b6c854. Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- Documentation/arch/arm64/booting.rst | 21 ------- arch/arm64/include/asm/el2_setup.h | 87 +--------------------------- 2 files changed, 3 insertions(+), 105 deletions(-) diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst index 30d00b890b18..c03877e71727 100644 --- a/Documentation/arch/arm64/booting.rst +++ b/Documentation/arch/arm64/booting.rst @@ -352,27 +352,6 @@ Before jumping into the kernel, the following conditions must be met: - HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01. - For CPUs with feature Branch Record Buffer Extension (FEAT_BRBE): - - - If EL3 is present: - - - MDCR_EL3.SBRBE (bits 33:32) must be initialised to 0b11. - - - If the kernel is entered at EL1 and EL2 is present: - - - BRBCR_EL2.CC (bit 3) must be initialised to 0b1. - - BRBCR_EL2.MPRED (bit 4) must be initialised to 0b1. - - - HDFGRTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1. - - HDFGRTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1. - - HDFGRTR_EL2.nBRBIDR (bit 59) must be initialised to 0b1. - - - HDFGWTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1. - - HDFGWTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1. - - - HFGITR_EL2.nBRBIALL (bit 56) must be initialised to 0b1. - - HFGITR_EL2.nBRBINJ (bit 55) must be initialised to 0b1. - For CPUs with the Scalable Matrix Extension FA64 feature (FEAT_SME_FA64): - If EL3 is present: diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 6f65322e2fd0..923fee612f66 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -168,40 +168,6 @@ .Lskip_set_cptr_\@: .endm -/* - * Enable BRBE to record cycle counts and branch mispredicts. - * - * At any EL, to record cycle counts BRBE requires that both - * BRBCR_EL2.CC=1 and BRBCR_EL1.CC=1. - * - * At any EL, to record branch mispredicts BRBE requires that both - * BRBCR_EL2.MPRED=1 and BRBCR_EL1.MPRED=1. - * - * When HCR_EL2.E2H=1, the BRBCR_EL1 encoding is redirected to - * BRBCR_EL2, but the {CC,MPRED} bits in the real BRBCR_EL1 register - * still apply. - * - * Set {CC,MPRBED} in both BRBCR_EL2 and BRBCR_EL1 so that at runtime we - * only need to enable/disable thse in BRBCR_EL1 regardless of whether - * the kernel ends up executing in EL1 or EL2. - */ -.macro __init_el2_brbe - mrs x1, id_aa64dfr0_el1 - ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4 - cbz x1, .Lskip_brbe_\@ - - mov_q x0, BRBCR_ELx_CC | BRBCR_ELx_MPRED - msr_s SYS_BRBCR_EL2, x0 - - __check_hvhe .Lset_brbe_nvhe_\@, x1 - msr_s SYS_BRBCR_EL12, x0 // VHE - b .Lskip_brbe_\@ - -.Lset_brbe_nvhe_\@: - msr_s SYS_BRBCR_EL1, x0 // NVHE -.Lskip_brbe_\@: -.endm - /* Disable any fine grained traps */ .macro __init_el2_fgt mrs x1, id_aa64mmfr0_el1 @@ -209,48 +175,16 @@ cbz x1, .Lskip_fgt_\@ mov x0, xzr - mov x2, xzr mrs x1, id_aa64dfr0_el1 ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 cmp x1, #3 b.lt .Lset_debug_fgt_\@ - /* Disable PMSNEVFR_EL1 read and write traps */ - orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK - orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK + orr x0, x0, #(1 << 62) .Lset_debug_fgt_\@: -#ifdef CONFIG_ARM64_BRBE - mrs x1, id_aa64dfr0_el1 - ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4 - cbz x1, .Lskip_brbe_reg_fgt_\@ - - /* - * Disable read traps for the following registers - * - * [BRBSRC|BRBTGT|RBINF]_EL1 - * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1 - */ - orr x0, x0, #HDFGRTR_EL2_nBRBDATA_MASK - - /* - * Disable write traps for the following registers - * - * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1 - */ - orr x2, x2, #HDFGWTR_EL2_nBRBDATA_MASK - - /* Disable read and write traps for [BRBCR|BRBFCR]_EL1 */ - orr x0, x0, #HDFGRTR_EL2_nBRBCTL_MASK - orr x2, x2, #HDFGWTR_EL2_nBRBCTL_MASK - - /* Disable read traps for BRBIDR_EL1 */ - orr x0, x0, #HDFGRTR_EL2_nBRBIDR_MASK - -.Lskip_brbe_reg_fgt_\@: -#endif /* CONFIG_ARM64_BRBE */ msr_s SYS_HDFGRTR_EL2, x0 - msr_s SYS_HDFGWTR_EL2, x2 + msr_s SYS_HDFGWTR_EL2, x0 mov x0, xzr mrs x1, id_aa64pfr1_el1 @@ -273,21 +207,7 @@ .Lset_fgt_\@: msr_s SYS_HFGRTR_EL2, x0 msr_s SYS_HFGWTR_EL2, x0 - mov x0, xzr -#ifdef CONFIG_ARM64_BRBE - mrs x1, id_aa64dfr0_el1 - ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4 - cbz x1, .Lskip_brbe_insn_fgt_\@ - - /* Disable traps for BRBIALL instruction */ - orr x0, x0, #HFGITR_EL2_nBRBIALL_MASK - - /* Disable traps for BRBINJ instruction */ - orr x0, x0, #HFGITR_EL2_nBRBINJ_MASK - -.Lskip_brbe_insn_fgt_\@: -#endif /* CONFIG_ARM64_BRBE */ - msr_s SYS_HFGITR_EL2, x0 + msr_s SYS_HFGITR_EL2, xzr mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU ubfx x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4 @@ -322,7 +242,6 @@ __init_el2_nvhe_idregs __init_el2_cptr __init_el2_fgt - __init_el2_brbe .endm #ifndef __KVM_NVHE_HYPERVISOR__ -- 2.33.0
From: Dave Martin <Dave.Martin@arm.com> mainline inclusion from mainline-v6.12-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- A minor anti-pattern has established itself in __init_el2_fgt, where each block of instructions is skipped by jumping to a label named for the next (typically unrelated) block. This makes diffs more noisy than necessary, since appending each new block to deal with some new architecture feature now requires altering a branch destination in the existing code. Fix it by naming the affected labels based on the block that is skipping itself instead, as is done elsewhere in the el2_setup code. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Link: https://lore.kernel.org/r/20240729162542.367059-1-Dave.Martin@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/include/asm/el2_setup.h | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 923fee612f66..ffb6a3bea323 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -178,42 +178,45 @@ mrs x1, id_aa64dfr0_el1 ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 cmp x1, #3 - b.lt .Lset_debug_fgt_\@ + b.lt .Lskip_spe_fgt_\@ /* Disable PMSNEVFR_EL1 read and write traps */ orr x0, x0, #(1 << 62) -.Lset_debug_fgt_\@: +.Lskip_spe_fgt_\@: msr_s SYS_HDFGRTR_EL2, x0 msr_s SYS_HDFGWTR_EL2, x0 mov x0, xzr mrs x1, id_aa64pfr1_el1 ubfx x1, x1, #ID_AA64PFR1_EL1_SME_SHIFT, #4 - cbz x1, .Lset_pie_fgt_\@ + cbz x1, .Lskip_debug_fgt_\@ /* Disable nVHE traps of TPIDR2 and SMPRI */ orr x0, x0, #HFGxTR_EL2_nSMPRI_EL1_MASK orr x0, x0, #HFGxTR_EL2_nTPIDR2_EL0_MASK -.Lset_pie_fgt_\@: +.Lskip_debug_fgt_\@: mrs_s x1, SYS_ID_AA64MMFR3_EL1 ubfx x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4 - cbz x1, .Lset_fgt_\@ + cbz x1, .Lskip_pie_fgt_\@ /* Disable trapping of PIR_EL1 / PIRE0_EL1 */ orr x0, x0, #HFGxTR_EL2_nPIR_EL1 orr x0, x0, #HFGxTR_EL2_nPIRE0_EL1 -.Lset_fgt_\@: +.Lskip_pie_fgt_\@: msr_s SYS_HFGRTR_EL2, x0 msr_s SYS_HFGWTR_EL2, x0 msr_s SYS_HFGITR_EL2, xzr mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU ubfx x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4 - cbz x1, .Lskip_fgt_\@ + cbz x1, .Lskip_amu_fgt_\@ msr_s SYS_HAFGRTR_EL2, xzr + +.Lskip_amu_fgt_\@: + .Lskip_fgt_\@: .endm -- 2.33.0
From: "Rob Herring (Arm)" <robh@kernel.org> mainline inclusion from mainline-v6.16-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Commit 5b39db6037e7 ("arm64: el2_setup.h: Rename some labels to be more diff-friendly") reworked the labels in __init_el2_fgt to say what's skipped rather than what the target location is. The exception was "set_fgt_" which is where registers are written. In reviewing the BRBE additions, Will suggested "set_debug_fgt_" where HDFGxTR_EL2 are written. Doing that would partially revert commit 5b39db6037e7 undoing the goal of minimizing additions here, but it would follow the convention for labels where registers are written. So let's do both. Branches that skip something go to a "skip" label and places that set registers have a "set" label. This results in some double labels, but it makes things entirely consistent. While we're here, the SME skip label was incorrectly named, so fix it. Reported-by: Will Deacon <will@kernel.org> Cc: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20250520-arm-brbe-v19-v22-2-c1ddde38e7f8@kernel.or... Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/include/asm/el2_setup.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index ffb6a3bea323..5f82e8f2a924 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -183,19 +183,21 @@ orr x0, x0, #(1 << 62) .Lskip_spe_fgt_\@: + +.Lset_debug_fgt_\@: msr_s SYS_HDFGRTR_EL2, x0 msr_s SYS_HDFGWTR_EL2, x0 mov x0, xzr mrs x1, id_aa64pfr1_el1 ubfx x1, x1, #ID_AA64PFR1_EL1_SME_SHIFT, #4 - cbz x1, .Lskip_debug_fgt_\@ + cbz x1, .Lskip_sme_fgt_\@ /* Disable nVHE traps of TPIDR2 and SMPRI */ orr x0, x0, #HFGxTR_EL2_nSMPRI_EL1_MASK orr x0, x0, #HFGxTR_EL2_nTPIDR2_EL0_MASK -.Lskip_debug_fgt_\@: +.Lskip_sme_fgt_\@: mrs_s x1, SYS_ID_AA64MMFR3_EL1 ubfx x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4 cbz x1, .Lskip_pie_fgt_\@ -- 2.33.0
From: Anshuman Khandual <anshuman.khandual@arm.com> mainline inclusion from mainline-v6.17-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- To use the Branch Record Buffer Extension (BRBE), some configuration is necessary at EL3 and EL2. This patch documents the requirements and adds the initial EL2 setup code, which largely consists of configuring the fine-grained traps and initializing a couple of BRBE control registers. Before this patch, __init_el2_fgt() would initialize HDFGRTR_EL2 and HDFGWTR_EL2 with the same value, relying on the read/write trap controls for a register occupying the same bit position in either register. The 'nBRBIDR' trap control only exists in bit 59 of HDFGRTR_EL2, while bit 59 of HDFGWTR_EL2 is RES0, and so this assumption no longer holds. To handle HDFGRTR_EL2 and HDFGWTR_EL2 having (slightly) different bit layouts, __init_el2_fgt() is changed to accumulate the HDFGRTR_EL2 and HDFGWTR_EL2 control bits separately. While making this change the open-coded value (1 << 62) is replaced with HDFG{R,W}TR_EL2_nPMSNEVFR_EL1_MASK. The BRBCR_EL1 and BRBCR_EL2 registers are unusual and require special initialisation: even though they are subject to E2H renaming, both have an effect regardless of HCR_EL2.TGE, even when running at EL2. So we must initialize BRBCR_EL2 in case we run in nVHE mode. This is handled in __init_el2_brbe() with a comment to explain the situation. Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Leo Yan <leo.yan@arm.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> [Mark: rewrite commit message, fix typo in comment] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Co-developed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> tested-by: Adam Young <admiyo@os.amperecomputing.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-2-e7775563036e@kernel.or... Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- Documentation/arch/arm64/booting.rst | 21 ++++++++ arch/arm64/include/asm/el2_setup.h | 71 ++++++++++++++++++++++++++-- 2 files changed, 89 insertions(+), 3 deletions(-) diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst index c03877e71727..34d1fac1687f 100644 --- a/Documentation/arch/arm64/booting.rst +++ b/Documentation/arch/arm64/booting.rst @@ -382,6 +382,27 @@ Before jumping into the kernel, the following conditions must be met: - SMCR_EL2.EZT0 (bit 30) must be initialised to 0b1. + For CPUs with the Branch Record Buffer Extension (FEAT_BRBE): + + - If EL3 is present: + + - MDCR_EL3.SBRBE (bits 33:32) must be initialised to 0b01 or 0b11. + + - If the kernel is entered at EL1 and EL2 is present: + + - BRBCR_EL2.CC (bit 3) must be initialised to 0b1. + - BRBCR_EL2.MPRED (bit 4) must be initialised to 0b1. + + - HDFGRTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1. + - HDFGRTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1. + - HDFGRTR_EL2.nBRBIDR (bit 59) must be initialised to 0b1. + + - HDFGWTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1. + - HDFGWTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1. + + - HFGITR_EL2.nBRBIALL (bit 56) must be initialised to 0b1. + - HFGITR_EL2.nBRBINJ (bit 55) must be initialised to 0b1. + For CPUs with Memory Copy and Memory Set instructions (FEAT_MOPS): - If the kernel is entered at EL1 and EL2 is present: diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 5f82e8f2a924..31c00ba599fc 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -168,6 +168,28 @@ .Lskip_set_cptr_\@: .endm +/* + * Configure BRBE to permit recording cycle counts and branch mispredicts. + * + * At any EL, to record cycle counts BRBE requires that both BRBCR_EL2.CC=1 and + * BRBCR_EL1.CC=1. + * + * At any EL, to record branch mispredicts BRBE requires that both + * BRBCR_EL2.MPRED=1 and BRBCR_EL1.MPRED=1. + * + * Set {CC,MPRED} in BRBCR_EL2 in case nVHE mode is used and we are + * executing in EL1. + */ +.macro __init_el2_brbe + mrs x1, id_aa64dfr0_el1 + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4 + cbz x1, .Lskip_brbe_\@ + + mov_q x0, BRBCR_ELx_CC | BRBCR_ELx_MPRED + msr_s SYS_BRBCR_EL2, x0 +.Lskip_brbe_\@: +.endm + /* Disable any fine grained traps */ .macro __init_el2_fgt mrs x1, id_aa64mmfr0_el1 @@ -175,20 +197,62 @@ cbz x1, .Lskip_fgt_\@ mov x0, xzr + mov x2, xzr mrs x1, id_aa64dfr0_el1 ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 cmp x1, #3 b.lt .Lskip_spe_fgt_\@ /* Disable PMSNEVFR_EL1 read and write traps */ - orr x0, x0, #(1 << 62) + orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK + orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK .Lskip_spe_fgt_\@: + mrs x1, id_aa64dfr0_el1 + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4 + cbz x1, .Lskip_brbe_fgt_\@ + + /* + * Disable read traps for the following registers + * + * [BRBSRC|BRBTGT|RBINF]_EL1 + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1 + */ + orr x0, x0, #HDFGRTR_EL2_nBRBDATA_MASK + + /* + * Disable write traps for the following registers + * + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1 + */ + orr x2, x2, #HDFGWTR_EL2_nBRBDATA_MASK + + /* Disable read and write traps for [BRBCR|BRBFCR]_EL1 */ + orr x0, x0, #HDFGRTR_EL2_nBRBCTL_MASK + orr x2, x2, #HDFGWTR_EL2_nBRBCTL_MASK + + /* Disable read traps for BRBIDR_EL1 */ + orr x0, x0, #HDFGRTR_EL2_nBRBIDR_MASK + +.Lskip_brbe_fgt_\@: .Lset_debug_fgt_\@: msr_s SYS_HDFGRTR_EL2, x0 - msr_s SYS_HDFGWTR_EL2, x0 + msr_s SYS_HDFGWTR_EL2, x2 mov x0, xzr + mov x2, xzr + + mrs x1, id_aa64dfr0_el1 + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4 + cbz x1, .Lskip_brbe_insn_fgt_\@ + + /* Disable traps for BRBIALL instruction */ + orr x2, x2, #HFGITR_EL2_nBRBIALL_MASK + + /* Disable traps for BRBINJ instruction */ + orr x2, x2, #HFGITR_EL2_nBRBINJ_MASK + +.Lskip_brbe_insn_fgt_\@: mrs x1, id_aa64pfr1_el1 ubfx x1, x1, #ID_AA64PFR1_EL1_SME_SHIFT, #4 cbz x1, .Lskip_sme_fgt_\@ @@ -209,7 +273,7 @@ .Lskip_pie_fgt_\@: msr_s SYS_HFGRTR_EL2, x0 msr_s SYS_HFGWTR_EL2, x0 - msr_s SYS_HFGITR_EL2, xzr + msr_s SYS_HFGITR_EL2, x2 mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU ubfx x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4 @@ -240,6 +304,7 @@ __init_el2_hcrx __init_el2_timers __init_el2_debug + __init_el2_brbe __init_el2_lor __init_el2_stage2 __init_el2_gicv3 -- 2.33.0
From: Anshuman Khandual <anshuman.khandual@arm.com> mainline inclusion from mainline-v6.15-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- This updates ID_AA64MMFR0_EL1 register fields as per the definitions based on DDI0601 2024-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250203050828.1049370-2-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index 963177485fd8..12e3abb27045 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -1526,6 +1526,7 @@ EndEnum UnsignedEnum 59:56 FGT 0b0000 NI 0b0001 IMP + 0b0010 FGT2 EndEnum Res0 55:48 UnsignedEnum 47:44 EXS @@ -1587,6 +1588,7 @@ Enum 3:0 PARANGE 0b0100 44 0b0101 48 0b0110 52 + 0b0111 56 EndEnum EndSysreg -- 2.33.0
From: Anshuman Khandual <anshuman.khandual@arm.com> mainline inclusion from mainline-v6.15-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- This adds register fields for HDFGRTR2_EL2 as per the definitions based on DDI0601 2024-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250203050828.1049370-3-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index 12e3abb27045..4694e53734df 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -2297,6 +2297,35 @@ Field 1 ICIALLU Field 0 ICIALLUIS EndSysreg +Sysreg HDFGRTR2_EL2 3 4 3 1 0 +Res0 63:25 +Field 24 nPMBMAR_EL1 +Field 23 nMDSTEPOP_EL1 +Field 22 nTRBMPAM_EL1 +Res0 21 +Field 20 nTRCITECR_EL1 +Field 19 nPMSDSFR_EL1 +Field 18 nSPMDEVAFF_EL1 +Field 17 nSPMID +Field 16 nSPMSCR_EL1 +Field 15 nSPMACCESSR_EL1 +Field 14 nSPMCR_EL0 +Field 13 nSPMOVS +Field 12 nSPMINTEN +Field 11 nSPMCNTEN +Field 10 nSPMSELR_EL0 +Field 9 nSPMEVTYPERn_EL0 +Field 8 nSPMEVCNTRn_EL0 +Field 7 nPMSSCR_EL1 +Field 6 nPMSSDATA +Field 5 nMDSELR_EL1 +Field 4 nPMUACR_EL1 +Field 3 nPMICFILTR_EL0 +Field 2 nPMICNTR_EL0 +Field 1 nPMIAR_EL1 +Field 0 nPMECR_EL1 +EndSysreg + Sysreg HDFGRTR_EL2 3 4 3 1 4 Field 63 PMBIDR_EL1 Field 62 nPMSNEVFR_EL1 -- 2.33.0
From: Anshuman Khandual <anshuman.khandual@arm.com> mainline inclusion from mainline-v6.15-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- This adds register fields for HDFGWTR2_EL2 as per the definitions based on DDI0601 2024-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250203050828.1049370-4-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index 4694e53734df..a8900b668ba8 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -2326,6 +2326,34 @@ Field 1 nPMIAR_EL1 Field 0 nPMECR_EL1 EndSysreg +Sysreg HDFGWTR2_EL2 3 4 3 1 1 +Res0 63:25 +Field 24 nPMBMAR_EL1 +Field 23 nMDSTEPOP_EL1 +Field 22 nTRBMPAM_EL1 +Field 21 nPMZR_EL0 +Field 20 nTRCITECR_EL1 +Field 19 nPMSDSFR_EL1 +Res0 18:17 +Field 16 nSPMSCR_EL1 +Field 15 nSPMACCESSR_EL1 +Field 14 nSPMCR_EL0 +Field 13 nSPMOVS +Field 12 nSPMINTEN +Field 11 nSPMCNTEN +Field 10 nSPMSELR_EL0 +Field 9 nSPMEVTYPERn_EL0 +Field 8 nSPMEVCNTRn_EL0 +Field 7 nPMSSCR_EL1 +Res0 6 +Field 5 nMDSELR_EL1 +Field 4 nPMUACR_EL1 +Field 3 nPMICFILTR_EL0 +Field 2 nPMICNTR_EL0 +Field 1 nPMIAR_EL1 +Field 0 nPMECR_EL1 +EndSysreg + Sysreg HDFGRTR_EL2 3 4 3 1 4 Field 63 PMBIDR_EL1 Field 62 nPMSNEVFR_EL1 -- 2.33.0
From: Anshuman Khandual <anshuman.khandual@arm.com> mainline inclusion from mainline-v6.15-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- This adds register fields for HFGITR2_EL2 as per the definitions based on DDI0601 2024-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250203050828.1049370-5-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index a8900b668ba8..5eeb3d57b8e6 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -2483,6 +2483,12 @@ Field 1 DBGBVRn_EL1 Field 0 DBGBCRn_EL1 EndSysreg +Sysreg HFGITR2_EL2 3 4 3 1 7 +Res0 63:2 +Field 1 nDCCIVAPS +Field 0 TSBCSYNC +EndSysreg + Sysreg ZCR_EL2 3 4 1 2 0 Fields ZCR_ELx EndSysreg -- 2.33.0
From: Anshuman Khandual <anshuman.khandual@arm.com> mainline inclusion from mainline-v6.15-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- This adds register fields for HFGRTR2_EL2 as per the definitions based on DDI0601 2024-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250203050828.1049370-6-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index 5eeb3d57b8e6..5692ef26f73a 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -2354,6 +2354,25 @@ Field 1 nPMIAR_EL1 Field 0 nPMECR_EL1 EndSysreg +Sysreg HFGRTR2_EL2 3 4 3 1 2 +Res0 63:15 +Field 14 nACTLRALIAS_EL1 +Field 13 nACTLRMASK_EL1 +Field 12 nTCR2ALIAS_EL1 +Field 11 nTCRALIAS_EL1 +Field 10 nSCTLRALIAS2_EL1 +Field 9 nSCTLRALIAS_EL1 +Field 8 nCPACRALIAS_EL1 +Field 7 nTCR2MASK_EL1 +Field 6 nTCRMASK_EL1 +Field 5 nSCTLR2MASK_EL1 +Field 4 nSCTLRMASK_EL1 +Field 3 nCPACRMASK_EL1 +Field 2 nRCWSMASK_EL1 +Field 1 nERXGSR_EL1 +Field 0 nPFAR_EL1 +EndSysreg + Sysreg HDFGRTR_EL2 3 4 3 1 4 Field 63 PMBIDR_EL1 Field 62 nPMSNEVFR_EL1 -- 2.33.0
From: Anshuman Khandual <anshuman.khandual@arm.com> mainline inclusion from mainline-v6.15-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- This adds register fields for HFGWTR2_EL2 as per the definitions based on DDI0601 2024-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250203050828.1049370-7-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index 5692ef26f73a..b5b717cc6250 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -2373,6 +2373,25 @@ Field 1 nERXGSR_EL1 Field 0 nPFAR_EL1 EndSysreg +Sysreg HFGWTR2_EL2 3 4 3 1 3 +Res0 63:15 +Field 14 nACTLRALIAS_EL1 +Field 13 nACTLRMASK_EL1 +Field 12 nTCR2ALIAS_EL1 +Field 11 nTCRALIAS_EL1 +Field 10 nSCTLRALIAS2_EL1 +Field 9 nSCTLRALIAS_EL1 +Field 8 nCPACRALIAS_EL1 +Field 7 nTCR2MASK_EL1 +Field 6 nTCRMASK_EL1 +Field 5 nSCTLR2MASK_EL1 +Field 4 nSCTLRMASK_EL1 +Field 3 nCPACRMASK_EL1 +Field 2 nRCWSMASK_EL1 +Res0 1 +Field 0 nPFAR_EL1 +EndSysreg + Sysreg HDFGRTR_EL2 3 4 3 1 4 Field 63 PMBIDR_EL1 Field 62 nPMSNEVFR_EL1 -- 2.33.0
From: Anshuman Khandual <anshuman.khandual@arm.com> mainline inclusion from mainline-v6.9-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- This updates ID_DFR0_EL1.PerfMon and ID_DFR0_EL1.CopDbg register fields as per the definitions based on DDI0601 2023-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240220025343.3093955-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index b5b717cc6250..4ed1100ee583 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -200,6 +200,7 @@ UnsignedEnum 27:24 PerfMon 0b0110 PMUv3p5 0b0111 PMUv3p7 0b1000 PMUv3p8 + 0b1001 PMUv3p9 0b1111 IMPDEF EndEnum Enum 23:20 MProfDbg @@ -231,6 +232,7 @@ Enum 3:0 CopDbg 0b1000 Debugv8p2 0b1001 Debugv8p4 0b1010 Debugv8p8 + 0b1011 Debugv8p9 EndEnum EndSysreg -- 2.33.0
From: "Rob Herring (Arm)" <robh@kernel.org> mainline inclusion from mainline-v6.13-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Armv8.9/9.4 PMUv3.9 adds per counter EL0 access controls. Per counter access is enabled with the UEN bit in PMUSERENR_EL1 register. Individual counters are enabled/disabled in the PMUACR_EL1 register. When UEN is set, the CR/ER bits control EL0 write access and must be set to disable write access. With the access controls, the clearing of unused counters can be skipped. KVM also configures PMUSERENR_EL1 in order to trap to EL2. UEN does not need to be set for it since only PMUv3.5 is exposed to guests. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20241002184326.1105499-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index 4ed1100ee583..6574d3aedd62 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -1287,6 +1287,7 @@ UnsignedEnum 11:8 PMUVer 0b0110 V3P5 0b0111 V3P7 0b1000 V3P8 + 0b1001 V3P9 0b1111 IMP_DEF EndEnum UnsignedEnum 7:4 TraceVer -- 2.33.0
From: Anshuman Khandual <anshuman.khandual@arm.com> mainline inclusion from mainline-v6.15-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- FEAT_PMUv3p9 registers such as PMICNTR_EL0, PMICFILTR_EL0, and PMUACR_EL1 access from EL1 requires appropriate EL2 fine grained trap configuration via FEAT_FGT2 based trap control registers HDFGRTR2_EL2 and HDFGWTR2_EL2. Otherwise such register accesses will result in traps into EL2. Add a new helper __init_el2_fgt2() which initializes FEAT_FGT2 based fine grained trap control registers HDFGRTR2_EL2 and HDFGWTR2_EL2 (setting the bits nPMICNTR_EL0, nPMICFILTR_EL0 and nPMUACR_EL1) to enable access into PMICNTR_EL0, PMICFILTR_EL0, and PMUACR_EL1 registers. Also update booting.rst with SCR_EL3.FGTEn2 requirement for all FEAT_FGT2 based registers to be accessible in EL2. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Rob Herring <robh@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: kvmarm@lists.linux.dev Fixes: 0bbff9ed8165 ("perf/arm_pmuv3: Add PMUv3.9 per counter EL0 access control") Fixes: d8226d8cfbaf ("perf: arm_pmuv3: Add support for Armv9.4 PMU instruction counter") Tested-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250227035119.2025171-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- Documentation/arch/arm64/booting.rst | 22 ++++++++++++++++++++++ arch/arm64/include/asm/el2_setup.h | 25 +++++++++++++++++++++++++ 2 files changed, 47 insertions(+) diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst index 34d1fac1687f..f2e26af6f7da 100644 --- a/Documentation/arch/arm64/booting.rst +++ b/Documentation/arch/arm64/booting.rst @@ -288,6 +288,12 @@ Before jumping into the kernel, the following conditions must be met: - SCR_EL3.FGTEn (bit 27) must be initialised to 0b1. + For CPUs with the Fine Grained Traps 2 (FEAT_FGT2) extension present: + + - If EL3 is present and the kernel is entered at EL2: + + - SCR_EL3.FGTEn2 (bit 59) must be initialised to 0b1. + For CPUs with support for HCRX_EL2 (FEAT_HCX) present: - If EL3 is present and the kernel is entered at EL2: @@ -382,6 +388,22 @@ Before jumping into the kernel, the following conditions must be met: - SMCR_EL2.EZT0 (bit 30) must be initialised to 0b1. + For CPUs with the Performance Monitors Extension (FEAT_PMUv3p9): + + - If EL3 is present: + + - MDCR_EL3.EnPM2 (bit 7) must be initialised to 0b1. + + - If the kernel is entered at EL1 and EL2 is present: + + - HDFGRTR2_EL2.nPMICNTR_EL0 (bit 2) must be initialised to 0b1. + - HDFGRTR2_EL2.nPMICFILTR_EL0 (bit 3) must be initialised to 0b1. + - HDFGRTR2_EL2.nPMUACR_EL1 (bit 4) must be initialised to 0b1. + + - HDFGWTR2_EL2.nPMICNTR_EL0 (bit 2) must be initialised to 0b1. + - HDFGWTR2_EL2.nPMICFILTR_EL0 (bit 3) must be initialised to 0b1. + - HDFGWTR2_EL2.nPMUACR_EL1 (bit 4) must be initialised to 0b1. + For CPUs with the Branch Record Buffer Extension (FEAT_BRBE): - If EL3 is present: diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 31c00ba599fc..1b429c201edb 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -286,6 +286,30 @@ .Lskip_fgt_\@: .endm +.macro __init_el2_fgt2 + mrs x1, id_aa64mmfr0_el1 + ubfx x1, x1, #ID_AA64MMFR0_EL1_FGT_SHIFT, #4 + cmp x1, #ID_AA64MMFR0_EL1_FGT_FGT2 + b.lt .Lskip_fgt2_\@ + + mov x0, xzr + mrs x1, id_aa64dfr0_el1 + ubfx x1, x1, #ID_AA64DFR0_EL1_PMUVer_SHIFT, #4 + cmp x1, #ID_AA64DFR0_EL1_PMUVer_V3P9 + b.lt .Lskip_pmuv3p9_\@ + + orr x0, x0, #HDFGRTR2_EL2_nPMICNTR_EL0 + orr x0, x0, #HDFGRTR2_EL2_nPMICFILTR_EL0 + orr x0, x0, #HDFGRTR2_EL2_nPMUACR_EL1 +.Lskip_pmuv3p9_\@: + msr_s SYS_HDFGRTR2_EL2, x0 + msr_s SYS_HDFGWTR2_EL2, x0 + msr_s SYS_HFGRTR2_EL2, xzr + msr_s SYS_HFGWTR2_EL2, xzr + msr_s SYS_HFGITR2_EL2, xzr +.Lskip_fgt2_\@: +.endm + .macro __init_el2_nvhe_prepare_eret mov x0, #INIT_PSTATE_EL1 msr spsr_el2, x0 @@ -312,6 +336,7 @@ __init_el2_nvhe_idregs __init_el2_cptr __init_el2_fgt + __init_el2_fgt2 .endm #ifndef __KVM_NVHE_HYPERVISOR__ -- 2.33.0
From: Marc Zyngier <maz@kernel.org> mainline inclusion from mainline-v6.14-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- *EL02 and *_EL12 system registers are actually only accessors for EL0 and EL1 registers accessed from EL2 when HCR_EL2.E2H==1. They do not have fields of their own. To that effect, introduce a 'Mapping' entry, describing which system register an _EL12 register maps to. Implementation wise, this is handled the same was as Fields, which ls only a comment. Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241219173351.1123087-2-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/gen-sysreg.awk | 2 +- arch/arm64/tools/sysreg | 12 ++++++++++-- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk index d1254a056114..1a2afc9fdd42 100755 --- a/arch/arm64/tools/gen-sysreg.awk +++ b/arch/arm64/tools/gen-sysreg.awk @@ -206,7 +206,7 @@ END { # Currently this is effectivey a comment, in future we may want to emit # defines for the fields. -/^Fields/ && block_current() == "Sysreg" { +(/^Fields/ || /^Mapping/) && block_current() == "Sysreg" { expect_fields(2) if (next_bit != 63) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index 6574d3aedd62..d11d9ac422f1 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -24,8 +24,16 @@ # ... # EndEnum -# Alternatively if multiple registers share the same layout then -# a SysregFields block can be used to describe the shared layout +# For VHE aliases (*_EL12, *_EL02) of system registers, a Mapping +# entry describes the register the alias actually accesses: + +# Sysreg <name_EL12> <op0> <op1> <crn> <crm> <op2> +# Mapping <name_EL1> +# EndSysreg + +# Where multiple system regsiters are not VHE aliases but share a +# common layout, a SysregFields block can be used to describe the +# shared layout: # SysregFields <fieldsname> # <field> -- 2.33.0
From: Marc Zyngier <maz@kernel.org> mainline inclusion from mainline-v6.16-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Bulk addition of all the system registers trapped by HDFG{R,W}TR2_EL2. The descriptions are extracted from the BSD-licenced JSON file part of the 2025-03 drop from ARM. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/include/asm/sysreg.h | 10 + arch/arm64/tools/sysreg | 343 ++++++++++++++++++++++++++++++++ 2 files changed, 353 insertions(+) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 0d36f20ecd33..1cbbacb2482a 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -469,12 +469,22 @@ #define __PMEV_op2(n) ((n) & 0x7) #define __CNTR_CRm(n) (0x8 | (((n) >> 3) & 0x3)) +#define SYS_PMEVCNTSVRn_EL1(n) sys_reg(2, 0, 14, __CNTR_CRm(n), __PMEV_op2(n)) #define SYS_PMEVCNTRn_EL0(n) sys_reg(3, 3, 14, __CNTR_CRm(n), __PMEV_op2(n)) #define __TYPER_CRm(n) (0xc | (((n) >> 3) & 0x3)) #define SYS_PMEVTYPERn_EL0(n) sys_reg(3, 3, 14, __TYPER_CRm(n), __PMEV_op2(n)) #define SYS_PMCCFILTR_EL0 sys_reg(3, 3, 14, 15, 7) +#define SYS_SPMCGCRn_EL1(n) sys_reg(2, 0, 9, 13, ((n) & 1)) + +#define __SPMEV_op2(n) ((n) & 0x7) +#define __SPMEV_crm(p, n) ((((p) & 7) << 1) | (((n) >> 3) & 1)) +#define SYS_SPMEVCNTRn_EL0(n) sys_reg(2, 3, 14, __SPMEV_crm(0b000, n), __SPMEV_op2(n)) +#define SYS_SPMEVFILT2Rn_EL0(n) sys_reg(2, 3, 14, __SPMEV_crm(0b011, n), __SPMEV_op2(n)) +#define SYS_SPMEVFILTRn_EL0(n) sys_reg(2, 3, 14, __SPMEV_crm(0b010, n), __SPMEV_op2(n)) +#define SYS_SPMEVTYPERn_EL0(n) sys_reg(2, 3, 14, __SPMEV_crm(0b001, n), __SPMEV_op2(n)) + #define SYS_VPIDR_EL2 sys_reg(3, 4, 0, 0, 0) #define SYS_VMPIDR_EL2 sys_reg(3, 4, 0, 0, 5) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index d11d9ac422f1..506ef55e1f95 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -101,6 +101,17 @@ Res0 63:32 Field 31:0 DTRTX EndSysreg +Sysreg MDSELR_EL1 2 0 0 4 2 +Res0 63:6 +Field 5:4 BANK +Res0 3:0 +EndSysreg + +Sysreg MDSTEPOP_EL1 2 0 0 5 2 +Res0 63:32 +Field 31:0 OPCODE +EndSysreg + Sysreg OSECCR_EL1 2 0 0 6 2 Res0 63:32 Field 31:0 EDECCR @@ -111,6 +122,285 @@ Res0 63:1 Field 0 OSLK EndSysreg +Sysreg SPMACCESSR_EL1 2 0 9 13 3 +UnsignedEnum 63:62 P31 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 61:60 P30 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 59:58 P29 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 57:56 P28 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 55:54 P27 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 53:52 P26 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 51:50 P25 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 49:48 P24 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 47:46 P23 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 45:44 P22 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 43:42 P21 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 41:40 P20 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 39:38 P19 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 37:36 P18 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 35:34 P17 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 33:32 P16 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 31:30 P15 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 29:28 P14 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 27:26 P13 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 25:24 P12 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 23:22 P11 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 21:20 P10 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 19:18 P9 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 17:16 P8 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 15:14 P7 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 13:12 P6 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 11:10 P5 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 9:8 P4 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 7:6 P3 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 5:4 P2 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 3:2 P1 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +UnsignedEnum 1:0 P0 + 0b00 TRAP_RW + 0b01 TRAP_W + 0b11 NOTRAP +EndEnum +EndSysreg + +Sysreg SPMACCESSR_EL12 2 5 9 13 3 +Mapping SPMACCESSR_EL1 +EndSysreg + +Sysreg SPMIIDR_EL1 2 0 9 13 4 +Res0 63:32 +Field 31:20 ProductID +Field 19:16 Variant +Field 15:12 Revision +Field 11:0 Implementer +EndSysreg + +Sysreg SPMDEVARCH_EL1 2 0 9 13 5 +Res0 63:32 +Field 31:21 ARCHITECT +Field 20 PRESENT +Field 19:16 REVISION +Field 15:12 ARCHVER +Field 11:0 ARCHPART +EndSysreg + +Sysreg SPMDEVAFF_EL1 2 0 9 13 6 +Res0 63:40 +Field 39:32 Aff3 +Field 31 F0V +Field 30 U +Res0 29:25 +Field 24 MT +Field 23:16 Aff2 +Field 15:8 Aff1 +Field 7:0 Aff0 +EndSysreg + +Sysreg SPMCFGR_EL1 2 0 9 13 7 +Res0 63:32 +Field 31:28 NCG +Res0 27:25 +Field 24 HDBG +Field 23 TRO +Field 22 SS +Field 21 FZO +Field 20 MSI +Field 19 RAO +Res0 18 +Field 17 NA +Field 16 EX +Field 15:14 RAZ +Field 13:8 SIZE +Field 7:0 N +EndSysreg + +Sysreg SPMINTENSET_EL1 2 0 9 14 1 +Field 63:0 P +EndSysreg + +Sysreg SPMINTENCLR_EL1 2 0 9 14 2 +Field 63:0 P +EndSysreg + +Sysreg PMCCNTSVR_EL1 2 0 14 11 7 +Field 63:0 CCNT +EndSysreg + +Sysreg PMICNTSVR_EL1 2 0 14 12 0 +Field 63:0 ICNT +EndSysreg + +Sysreg SPMCR_EL0 2 3 9 12 0 +Res0 63:12 +Field 11 TRO +Field 10 HDBG +Field 9 FZO +Field 8 NA +Res0 7:5 +Field 4 EX +Res0 3:2 +Field 1 P +Field 0 E +EndSysreg + +Sysreg SPMCNTENSET_EL0 2 3 9 12 1 +Field 63:0 P +EndSysreg + +Sysreg SPMCNTENCLR_EL0 2 3 9 12 2 +Field 63:0 P +EndSysreg + +Sysreg SPMOVSCLR_EL0 2 3 9 12 3 +Field 63:0 P +EndSysreg + +Sysreg SPMZR_EL0 2 3 9 12 4 +Field 63:0 P +EndSysreg + +Sysreg SPMSELR_EL0 2 3 9 12 5 +Res0 63:10 +Field 9:4 SYSPMUSEL +Res0 3:2 +Field 1:0 BANK +EndSysreg + +Sysreg SPMOVSSET_EL0 2 3 9 14 3 +Field 63:0 P +EndSysreg + +Sysreg SPMSCR_EL1 2 7 9 14 7 +Field 63:32 IMPDEF +Field 31 RAO +Res0 30:5 +Field 4 NAO +Res0 3:1 +Field 0 SO +EndSysreg + Sysreg ID_PFR0_EL1 3 0 0 1 0 Res0 63:32 UnsignedEnum 31:28 RAS @@ -1904,6 +2194,16 @@ Sysreg CPACR_EL1 3 0 1 0 2 Fields CPACR_ELx EndSysreg +Sysreg TRCITECR_EL1 3 0 1 2 3 +Res0 63:2 +Field 1 E1E +Field 0 E0E +EndSysreg + +Sysreg TRCITECR_EL12 3 5 1 2 3 +Mapping TRCITECR_EL1 +EndSysreg + Sysreg SMPRI_EL1 3 0 1 2 4 Res0 63:4 Field 3:0 PRIORITY @@ -2053,6 +2353,16 @@ Field 16 COLL Field 15:0 MSS EndSysreg +Sysreg PMSDSFR_EL1 3 0 9 10 4 +Field 63:0 S +EndSysreg + +Sysreg PMBMAR_EL1 3 0 9 10 5 +Res0 63:10 +Field 9:8 SH +Field 7:0 Attr +EndSysreg + Sysreg PMBIDR_EL1 3 0 9 10 7 Res0 63:12 Enum 11:8 EA @@ -2066,6 +2376,39 @@ Field 4 P Field 3:0 ALIGN EndSysreg +Sysreg TRBMPAM_EL1 3 0 9 11 5 +Res0 63:27 +Field 26 EN +Field 25:24 MPAM_SP +Field 23:16 PMG +Field 15:0 PARTID +EndSysreg + +Sysreg PMSSCR_EL1 3 0 9 13 3 +Res0 63:33 +Field 32 NC +Res0 31:1 +Field 0 SS +EndSysreg + +Sysreg PMECR_EL1 3 0 9 14 5 +Res0 63:5 +Field 4:3 SSE +Field 2 KPME +Field 1:0 PMEE +EndSysreg + +Sysreg PMIAR_EL1 3 0 9 14 7 +Field 63:0 ADDRESS +EndSysreg + +Sysreg PMZR_EL0 3 3 9 13 4 +Res0 63:33 +Field 32 F0 +Field 31 C +Field 30:0 P +EndSysreg + SysregFields CONTEXTIDR_ELx Res0 63:32 Field 31:0 PROCID -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.14-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- FEAT_SPEv1p2 (optional from Armv8.6) adds a discard mode that allows all SPE data to be discarded rather than written to memory. Add a format bit for this mode. If the mode isn't supported, the format bit isn't published and attempts to use it will result in -EOPNOTSUPP. Allocating an aux buffer is still allowed even though it won't be written to so that old tools continue to work, but updated tools can choose to skip this step. Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Link: https://lore.kernel.org/r/20250108142904.401139-2-james.clark@linaro.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- drivers/perf/arm_spe_pmu.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c index db9dddbddba3..19a54371140c 100644 --- a/drivers/perf/arm_spe_pmu.c +++ b/drivers/perf/arm_spe_pmu.c @@ -96,6 +96,7 @@ struct arm_spe_pmu { #define SPE_PMU_FEAT_LDS (1UL << 4) #define SPE_PMU_FEAT_ERND (1UL << 5) #define SPE_PMU_FEAT_INV_FILT_EVT (1UL << 6) +#define SPE_PMU_FEAT_DISCARD (1UL << 7) #define SPE_PMU_FEAT_DEV_PROBED (1UL << 63) u64 features; @@ -205,6 +206,9 @@ static const struct attribute_group arm_spe_pmu_cap_group = { #define ATTR_CFG_FLD_store_filter_CFG config /* PMSFCR_EL1.ST */ #define ATTR_CFG_FLD_store_filter_LO 34 #define ATTR_CFG_FLD_store_filter_HI 34 +#define ATTR_CFG_FLD_discard_CFG config /* PMBLIMITR_EL1.FM = DISCARD */ +#define ATTR_CFG_FLD_discard_LO 35 +#define ATTR_CFG_FLD_discard_HI 35 #define ATTR_CFG_FLD_event_filter_CFG config1 /* PMSEVFR_EL1 */ #define ATTR_CFG_FLD_event_filter_LO 0 @@ -228,6 +232,7 @@ GEN_PMU_FORMAT_ATTR(store_filter); GEN_PMU_FORMAT_ATTR(event_filter); GEN_PMU_FORMAT_ATTR(inv_event_filter); GEN_PMU_FORMAT_ATTR(min_latency); +GEN_PMU_FORMAT_ATTR(discard); static struct attribute *arm_spe_pmu_formats_attr[] = { &format_attr_ts_enable.attr, @@ -240,6 +245,7 @@ static struct attribute *arm_spe_pmu_formats_attr[] = { &format_attr_event_filter.attr, &format_attr_inv_event_filter.attr, &format_attr_min_latency.attr, + &format_attr_discard.attr, NULL, }; @@ -250,6 +256,9 @@ static umode_t arm_spe_pmu_format_attr_is_visible(struct kobject *kobj, struct device *dev = kobj_to_dev(kobj); struct arm_spe_pmu *spe_pmu = dev_get_drvdata(dev); + if (attr == &format_attr_discard.attr && !(spe_pmu->features & SPE_PMU_FEAT_DISCARD)) + return 0; + if (attr == &format_attr_inv_event_filter.attr && !(spe_pmu->features & SPE_PMU_FEAT_INV_FILT_EVT)) return 0; @@ -514,6 +523,12 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, u64 base, limit; struct arm_spe_pmu_buf *buf; + if (ATTR_CFG_GET_FLD(&event->attr, discard)) { + limit = FIELD_PREP(PMBLIMITR_EL1_FM, PMBLIMITR_EL1_FM_DISCARD); + limit |= PMBLIMITR_EL1_E; + goto out_write_limit; + } + /* Start a new aux session */ buf = perf_aux_output_begin(handle, event); if (!buf) { @@ -783,6 +798,10 @@ static int arm_spe_pmu_event_init(struct perf_event *event) !(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT)) return -EOPNOTSUPP; + if (ATTR_CFG_GET_FLD(&event->attr, discard) && + !(spe_pmu->features & SPE_PMU_FEAT_DISCARD)) + return -EOPNOTSUPP; + set_spe_event_has_cx(event); reg = arm_spe_event_to_pmscr(event); if (reg & (PMSCR_EL1_PA | PMSCR_EL1_PCT)) @@ -1081,6 +1100,9 @@ static void __arm_spe_pmu_dev_probe(void *info) if (FIELD_GET(PMSIDR_EL1_ERND, reg)) spe_pmu->features |= SPE_PMU_FEAT_ERND; + if (spe_pmu->pmsver >= ID_AA64DFR0_EL1_PMSVer_V1P2) + spe_pmu->features |= SPE_PMU_FEAT_DISCARD; + /* This field has a spaced out encoding, so just use a look-up */ fld = FIELD_GET(PMSIDR_EL1_INTERVAL, reg); switch (fld) { -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.18-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Add new fields and register that are introduced for the features FEAT_SPE_EFT (extended filtering) and FEAT_SPE_FDS (data source filtering). Tested-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index 506ef55e1f95..bf48bb4859f6 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -2271,11 +2271,20 @@ Field 0 RND EndSysreg Sysreg PMSFCR_EL1 3 0 9 9 4 -Res0 63:19 +Res0 63:53 +Field 52 SIMDm +Field 51 FPm +Field 50 STm +Field 49 LDm +Field 48 Bm +Res0 47:21 +Field 20 SIMD +Field 19 FP Field 18 ST Field 17 LD Field 16 B -Res0 15:4 +Res0 15:5 +Field 4 FDS Field 3 FnE Field 2 FL Field 1 FT -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.18-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- FEAT_SPEv1p4 (optional from Armv8.8) adds some new filter bits and also makes some previously available bits unavailable again e.g: E[30], bit [30] When FEAT_SPEv1p4 is _not_ implemented ... Continuing to hard code the valid filter bits for each version isn't scalable, and it also doesn't work for filter bits that aren't related to SPE version. For example most bits have a further condition: E[15], bit [15] When ... and filtering on event 15 is supported: Whether "filtering on event 15" is implemented or not is only discoverable from the TRM of that specific CPU or by probing PMSEVFR_EL1. Instead of hard coding them, write all 1s to the PMSEVFR_EL1 register and read it back to discover the RES0 bits. Unsupported bits are RAZ/WI so should read as 0s. For any hardware that doesn't strictly follow RAZ/WI for unsupported filters: Any bits that should have been supported in a specific SPE version but now incorrectly appear to be RES0 wouldn't have worked anyway, so it's better to fail to open events that request them rather than behaving unexpectedly. Bits that aren't implemented but also aren't RAZ/WI will be incorrectly reported as supported, but allowing them to be used is harmless. Testing on N1SDP shows the probed RES0 bits to be the same as the hard coded ones. The FVP with SPEv1p4 shows only additional new RES0 bits, i.e. no previously hard coded RES0 bits are missing. Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/include/asm/sysreg.h | 9 --------- drivers/perf/arm_spe_pmu.c | 23 +++++++---------------- 2 files changed, 7 insertions(+), 25 deletions(-) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 1cbbacb2482a..13354cdaf200 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -321,15 +321,6 @@ #define SYS_PAR_EL1_F BIT(0) #define SYS_PAR_EL1_FST GENMASK(6, 1) -/*** Statistical Profiling Extension ***/ -#define PMSEVFR_EL1_RES0_IMP \ - (GENMASK_ULL(47, 32) | GENMASK_ULL(23, 16) | GENMASK_ULL(11, 8) |\ - BIT_ULL(6) | BIT_ULL(4) | BIT_ULL(2) | BIT_ULL(0)) -#define PMSEVFR_EL1_RES0_V1P1 \ - (PMSEVFR_EL1_RES0_IMP & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11))) -#define PMSEVFR_EL1_RES0_V1P2 \ - (PMSEVFR_EL1_RES0_V1P1 & ~BIT_ULL(6)) - /* Buffer error reporting */ #define PMBSR_EL1_FAULT_FSC_SHIFT PMBSR_EL1_MSS_SHIFT #define PMBSR_EL1_FAULT_FSC_MASK PMBSR_EL1_MSS_MASK diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c index 19a54371140c..d629c10629db 100644 --- a/drivers/perf/arm_spe_pmu.c +++ b/drivers/perf/arm_spe_pmu.c @@ -100,6 +100,7 @@ struct arm_spe_pmu { #define SPE_PMU_FEAT_DEV_PROBED (1UL << 63) u64 features; + u64 pmsevfr_res0; u16 max_record_sz; u16 align; struct perf_output_handle __percpu *handle; @@ -733,20 +734,6 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) return IRQ_HANDLED; } -static u64 arm_spe_pmsevfr_res0(u16 pmsver) -{ - switch (pmsver) { - case ID_AA64DFR0_EL1_PMSVer_IMP: - return PMSEVFR_EL1_RES0_IMP; - case ID_AA64DFR0_EL1_PMSVer_V1P1: - return PMSEVFR_EL1_RES0_V1P1; - case ID_AA64DFR0_EL1_PMSVer_V1P2: - /* Return the highest version we support in default */ - default: - return PMSEVFR_EL1_RES0_V1P2; - } -} - /* Perf callbacks */ static int arm_spe_pmu_event_init(struct perf_event *event) { @@ -762,10 +749,10 @@ static int arm_spe_pmu_event_init(struct perf_event *event) !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus)) return -ENOENT; - if (arm_spe_event_to_pmsevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver)) + if (arm_spe_event_to_pmsevfr(event) & spe_pmu->pmsevfr_res0) return -EOPNOTSUPP; - if (arm_spe_event_to_pmsnevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver)) + if (arm_spe_event_to_pmsnevfr(event) & spe_pmu->pmsevfr_res0) return -EOPNOTSUPP; if (attr->exclude_idle) @@ -1157,6 +1144,10 @@ static void __arm_spe_pmu_dev_probe(void *info) spe_pmu->counter_sz = 16; } + /* Write all 1s and then read back. Unsupported filter bits are RAZ/WI. */ + write_sysreg_s(U64_MAX, SYS_PMSEVFR_EL1); + spe_pmu->pmsevfr_res0 = ~read_sysreg_s(SYS_PMSEVFR_EL1); + dev_info(dev, "probed SPEv1.%d for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n", spe_pmu->pmsver - 1, cpumask_pr_args(&spe_pmu->supported_cpus), -- 2.33.0
From: Leo Yan <leo.yan@arm.com> mainline inclusion from mainline-v6.18-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Expose an "event_filter" entry in the caps folder to inform user space about which events can be filtered. Change the return type of arm_spe_pmu_cap_get() from u32 to u64 to accommodate the added event filter entry. Signed-off-by: Leo Yan <leo.yan@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- drivers/perf/arm_spe_pmu.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c index d629c10629db..9ce83a7a05eb 100644 --- a/drivers/perf/arm_spe_pmu.c +++ b/drivers/perf/arm_spe_pmu.c @@ -128,6 +128,7 @@ enum arm_spe_pmu_capabilities { SPE_PMU_CAP_FEAT_MAX, SPE_PMU_CAP_CNT_SZ = SPE_PMU_CAP_FEAT_MAX, SPE_PMU_CAP_MIN_IVAL, + SPE_PMU_CAP_EVENT_FILTER, }; static int arm_spe_pmu_feat_caps[SPE_PMU_CAP_FEAT_MAX] = { @@ -135,7 +136,7 @@ static int arm_spe_pmu_feat_caps[SPE_PMU_CAP_FEAT_MAX] = { [SPE_PMU_CAP_ERND] = SPE_PMU_FEAT_ERND, }; -static u32 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap) +static u64 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap) { if (cap < SPE_PMU_CAP_FEAT_MAX) return !!(spe_pmu->features & arm_spe_pmu_feat_caps[cap]); @@ -145,6 +146,8 @@ static u32 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap) return spe_pmu->counter_sz; case SPE_PMU_CAP_MIN_IVAL: return spe_pmu->min_period; + case SPE_PMU_CAP_EVENT_FILTER: + return ~spe_pmu->pmsevfr_res0; default: WARN(1, "unknown cap %d\n", cap); } @@ -161,7 +164,19 @@ static ssize_t arm_spe_pmu_cap_show(struct device *dev, container_of(attr, struct dev_ext_attribute, attr); int cap = (long)ea->var; - return sysfs_emit(buf, "%u\n", arm_spe_pmu_cap_get(spe_pmu, cap)); + return sysfs_emit(buf, "%llu\n", arm_spe_pmu_cap_get(spe_pmu, cap)); +} + +static ssize_t arm_spe_pmu_cap_show_hex(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct arm_spe_pmu *spe_pmu = dev_get_drvdata(dev); + struct dev_ext_attribute *ea = + container_of(attr, struct dev_ext_attribute, attr); + int cap = (long)ea->var; + + return sysfs_emit(buf, "0x%llx\n", arm_spe_pmu_cap_get(spe_pmu, cap)); } #define SPE_EXT_ATTR_ENTRY(_name, _func, _var) \ @@ -171,12 +186,15 @@ static ssize_t arm_spe_pmu_cap_show(struct device *dev, #define SPE_CAP_EXT_ATTR_ENTRY(_name, _var) \ SPE_EXT_ATTR_ENTRY(_name, arm_spe_pmu_cap_show, _var) +#define SPE_CAP_EXT_ATTR_ENTRY_HEX(_name, _var) \ + SPE_EXT_ATTR_ENTRY(_name, arm_spe_pmu_cap_show_hex, _var) static struct attribute *arm_spe_pmu_cap_attr[] = { SPE_CAP_EXT_ATTR_ENTRY(arch_inst, SPE_PMU_CAP_ARCH_INST), SPE_CAP_EXT_ATTR_ENTRY(ernd, SPE_PMU_CAP_ERND), SPE_CAP_EXT_ATTR_ENTRY(count_size, SPE_PMU_CAP_CNT_SZ), SPE_CAP_EXT_ATTR_ENTRY(min_interval, SPE_PMU_CAP_MIN_IVAL), + SPE_CAP_EXT_ATTR_ENTRY_HEX(event_filter, SPE_PMU_CAP_EVENT_FILTER), NULL, }; -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.18-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- FEAT_SPE_EFT (optional from Armv9.4) adds mask bits for the existing load, store and branch filters. It also adds two new filter bits for SIMD and floating point with their own associated mask bits. The current filters only allow OR filtering on samples that are load OR store etc, and the new mask bits allow setting part of the filter to an AND, for example filtering samples that are store AND SIMD. With mask bits set to 0, the OR behavior is preserved, so the unless any masks are explicitly set old filters will behave the same. Add them all and make them behave the same way as existing format bits, hidden and return EOPNOTSUPP if set when the feature doesn't exist. Reviewed-by: Leo Yan <leo.yan@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- drivers/perf/arm_spe_pmu.c | 66 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c index 9ce83a7a05eb..c49077730217 100644 --- a/drivers/perf/arm_spe_pmu.c +++ b/drivers/perf/arm_spe_pmu.c @@ -97,6 +97,7 @@ struct arm_spe_pmu { #define SPE_PMU_FEAT_ERND (1UL << 5) #define SPE_PMU_FEAT_INV_FILT_EVT (1UL << 6) #define SPE_PMU_FEAT_DISCARD (1UL << 7) +#define SPE_PMU_FEAT_EFT (1UL << 8) #define SPE_PMU_FEAT_DEV_PROBED (1UL << 63) u64 features; @@ -228,6 +229,27 @@ static const struct attribute_group arm_spe_pmu_cap_group = { #define ATTR_CFG_FLD_discard_CFG config /* PMBLIMITR_EL1.FM = DISCARD */ #define ATTR_CFG_FLD_discard_LO 35 #define ATTR_CFG_FLD_discard_HI 35 +#define ATTR_CFG_FLD_branch_filter_mask_CFG config /* PMSFCR_EL1.Bm */ +#define ATTR_CFG_FLD_branch_filter_mask_LO 36 +#define ATTR_CFG_FLD_branch_filter_mask_HI 36 +#define ATTR_CFG_FLD_load_filter_mask_CFG config /* PMSFCR_EL1.LDm */ +#define ATTR_CFG_FLD_load_filter_mask_LO 37 +#define ATTR_CFG_FLD_load_filter_mask_HI 37 +#define ATTR_CFG_FLD_store_filter_mask_CFG config /* PMSFCR_EL1.STm */ +#define ATTR_CFG_FLD_store_filter_mask_LO 38 +#define ATTR_CFG_FLD_store_filter_mask_HI 38 +#define ATTR_CFG_FLD_simd_filter_CFG config /* PMSFCR_EL1.SIMD */ +#define ATTR_CFG_FLD_simd_filter_LO 39 +#define ATTR_CFG_FLD_simd_filter_HI 39 +#define ATTR_CFG_FLD_simd_filter_mask_CFG config /* PMSFCR_EL1.SIMDm */ +#define ATTR_CFG_FLD_simd_filter_mask_LO 40 +#define ATTR_CFG_FLD_simd_filter_mask_HI 40 +#define ATTR_CFG_FLD_float_filter_CFG config /* PMSFCR_EL1.FP */ +#define ATTR_CFG_FLD_float_filter_LO 41 +#define ATTR_CFG_FLD_float_filter_HI 41 +#define ATTR_CFG_FLD_float_filter_mask_CFG config /* PMSFCR_EL1.FPm */ +#define ATTR_CFG_FLD_float_filter_mask_LO 42 +#define ATTR_CFG_FLD_float_filter_mask_HI 42 #define ATTR_CFG_FLD_event_filter_CFG config1 /* PMSEVFR_EL1 */ #define ATTR_CFG_FLD_event_filter_LO 0 @@ -246,8 +268,15 @@ GEN_PMU_FORMAT_ATTR(pa_enable); GEN_PMU_FORMAT_ATTR(pct_enable); GEN_PMU_FORMAT_ATTR(jitter); GEN_PMU_FORMAT_ATTR(branch_filter); +GEN_PMU_FORMAT_ATTR(branch_filter_mask); GEN_PMU_FORMAT_ATTR(load_filter); +GEN_PMU_FORMAT_ATTR(load_filter_mask); GEN_PMU_FORMAT_ATTR(store_filter); +GEN_PMU_FORMAT_ATTR(store_filter_mask); +GEN_PMU_FORMAT_ATTR(simd_filter); +GEN_PMU_FORMAT_ATTR(simd_filter_mask); +GEN_PMU_FORMAT_ATTR(float_filter); +GEN_PMU_FORMAT_ATTR(float_filter_mask); GEN_PMU_FORMAT_ATTR(event_filter); GEN_PMU_FORMAT_ATTR(inv_event_filter); GEN_PMU_FORMAT_ATTR(min_latency); @@ -259,8 +288,15 @@ static struct attribute *arm_spe_pmu_formats_attr[] = { &format_attr_pct_enable.attr, &format_attr_jitter.attr, &format_attr_branch_filter.attr, + &format_attr_branch_filter_mask.attr, &format_attr_load_filter.attr, + &format_attr_load_filter_mask.attr, &format_attr_store_filter.attr, + &format_attr_store_filter_mask.attr, + &format_attr_simd_filter.attr, + &format_attr_simd_filter_mask.attr, + &format_attr_float_filter.attr, + &format_attr_float_filter_mask.attr, &format_attr_event_filter.attr, &format_attr_inv_event_filter.attr, &format_attr_min_latency.attr, @@ -281,6 +317,16 @@ static umode_t arm_spe_pmu_format_attr_is_visible(struct kobject *kobj, if (attr == &format_attr_inv_event_filter.attr && !(spe_pmu->features & SPE_PMU_FEAT_INV_FILT_EVT)) return 0; + if ((attr == &format_attr_branch_filter_mask.attr || + attr == &format_attr_load_filter_mask.attr || + attr == &format_attr_store_filter_mask.attr || + attr == &format_attr_simd_filter.attr || + attr == &format_attr_simd_filter_mask.attr || + attr == &format_attr_float_filter.attr || + attr == &format_attr_float_filter_mask.attr) && + !(spe_pmu->features & SPE_PMU_FEAT_EFT)) + return 0; + return attr->mode; } @@ -372,8 +418,15 @@ static u64 arm_spe_event_to_pmsfcr(struct perf_event *event) u64 reg = 0; reg |= FIELD_PREP(PMSFCR_EL1_LD, ATTR_CFG_GET_FLD(attr, load_filter)); + reg |= FIELD_PREP(PMSFCR_EL1_LDm, ATTR_CFG_GET_FLD(attr, load_filter_mask)); reg |= FIELD_PREP(PMSFCR_EL1_ST, ATTR_CFG_GET_FLD(attr, store_filter)); + reg |= FIELD_PREP(PMSFCR_EL1_STm, ATTR_CFG_GET_FLD(attr, store_filter_mask)); reg |= FIELD_PREP(PMSFCR_EL1_B, ATTR_CFG_GET_FLD(attr, branch_filter)); + reg |= FIELD_PREP(PMSFCR_EL1_Bm, ATTR_CFG_GET_FLD(attr, branch_filter_mask)); + reg |= FIELD_PREP(PMSFCR_EL1_SIMD, ATTR_CFG_GET_FLD(attr, simd_filter)); + reg |= FIELD_PREP(PMSFCR_EL1_SIMDm, ATTR_CFG_GET_FLD(attr, simd_filter_mask)); + reg |= FIELD_PREP(PMSFCR_EL1_FP, ATTR_CFG_GET_FLD(attr, float_filter)); + reg |= FIELD_PREP(PMSFCR_EL1_FPm, ATTR_CFG_GET_FLD(attr, float_filter_mask)); if (reg) reg |= PMSFCR_EL1_FT; @@ -803,6 +856,16 @@ static int arm_spe_pmu_event_init(struct perf_event *event) !(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT)) return -EOPNOTSUPP; + if ((FIELD_GET(PMSFCR_EL1_LDm, reg) || + FIELD_GET(PMSFCR_EL1_STm, reg) || + FIELD_GET(PMSFCR_EL1_Bm, reg) || + FIELD_GET(PMSFCR_EL1_SIMD, reg) || + FIELD_GET(PMSFCR_EL1_SIMDm, reg) || + FIELD_GET(PMSFCR_EL1_FP, reg) || + FIELD_GET(PMSFCR_EL1_FPm, reg)) && + !(spe_pmu->features & SPE_PMU_FEAT_EFT)) + return -EOPNOTSUPP; + if (ATTR_CFG_GET_FLD(&event->attr, discard) && !(spe_pmu->features & SPE_PMU_FEAT_DISCARD)) return -EOPNOTSUPP; @@ -1108,6 +1171,9 @@ static void __arm_spe_pmu_dev_probe(void *info) if (spe_pmu->pmsver >= ID_AA64DFR0_EL1_PMSVer_V1P2) spe_pmu->features |= SPE_PMU_FEAT_DISCARD; + if (FIELD_GET(PMSIDR_EL1_EFT, reg)) + spe_pmu->features |= SPE_PMU_FEAT_EFT; + /* This field has a spaced out encoding, so just use a look-up */ fld = FIELD_GET(PMSIDR_EL1_INTERVAL, reg); switch (fld) { -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.18-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- We check the version of SPE twice, and we'll add one more check in the next commit so factor out a macro to do this. Change the #3 magic number to the actual SPE version define (V1p2) to make it more readable. No functional changes intended. Tested-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/include/asm/el2_setup.h | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 1b429c201edb..8850e1b85f4c 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -70,6 +70,14 @@ msr cntvoff_el2, xzr // Clear virtual offset .endm +/* Branch to skip_label if SPE version is less than given version */ +.macro __spe_vers_imp skip_label, version, tmp + mrs \tmp, id_aa64dfr0_el1 + ubfx \tmp, \tmp, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 + cmp \tmp, \version + b.lt \skip_label +.endm + .macro __init_el2_debug mrs x1, id_aa64dfr0_el1 ubfx x0, x1, #ID_AA64DFR0_EL1_PMUVer_SHIFT, #4 @@ -82,8 +90,7 @@ csel x2, xzr, x0, eq // all PMU counters from EL1 /* Statistical profiling */ - ubfx x0, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 - cbz x0, .Lskip_spe_\@ // Skip if SPE not present + __spe_vers_imp .Lskip_spe_\@, ID_AA64DFR0_EL1_PMSVer_IMP, x0 // Skip if SPE not present mrs_s x0, SYS_PMBIDR_EL1 // If SPE available at EL2, and x0, x0, #(1 << PMBIDR_EL1_P_SHIFT) @@ -198,10 +205,8 @@ mov x0, xzr mov x2, xzr - mrs x1, id_aa64dfr0_el1 - ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 - cmp x1, #3 - b.lt .Lskip_spe_fgt_\@ + /* If SPEv1p2 is implemented, */ + __spe_vers_imp .Lskip_spe_fgt_\@, #ID_AA64DFR0_EL1_PMSVer_V1P2, x1 /* Disable PMSNEVFR_EL1 read and write traps */ orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.18-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- SPE data source filtering (optional from Armv8.8) requires that traps to the filter register PMSDSFR be disabled. Document the requirements and disable the traps if the feature is present. Tested-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- Documentation/arch/arm64/booting.rst | 11 +++++++++++ arch/arm64/include/asm/el2_setup.h | 11 +++++++++++ 2 files changed, 22 insertions(+) diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst index f2e26af6f7da..d6bde493ef57 100644 --- a/Documentation/arch/arm64/booting.rst +++ b/Documentation/arch/arm64/booting.rst @@ -404,6 +404,17 @@ Before jumping into the kernel, the following conditions must be met: - HDFGWTR2_EL2.nPMICFILTR_EL0 (bit 3) must be initialised to 0b1. - HDFGWTR2_EL2.nPMUACR_EL1 (bit 4) must be initialised to 0b1. + For CPUs with SPE data source filtering (FEAT_SPE_FDS): + + - If EL3 is present: + + - MDCR_EL3.EnPMS3 (bit 42) must be initialised to 0b1. + + - If the kernel is entered at EL1 and EL2 is present: + + - HDFGRTR2_EL2.nPMSDSFR_EL1 (bit 19) must be initialised to 0b1. + - HDFGWTR2_EL2.nPMSDSFR_EL1 (bit 19) must be initialised to 0b1. + For CPUs with the Branch Record Buffer Extension (FEAT_BRBE): - If EL3 is present: diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 8850e1b85f4c..fdcbd7e1d439 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -307,6 +307,17 @@ orr x0, x0, #HDFGRTR2_EL2_nPMICFILTR_EL0 orr x0, x0, #HDFGRTR2_EL2_nPMUACR_EL1 .Lskip_pmuv3p9_\@: + /* If SPE is implemented, */ + __spe_vers_imp .Lskip_spefds_\@, ID_AA64DFR0_EL1_PMSVer_IMP, x1 + /* we can read PMSIDR and */ + mrs_s x1, SYS_PMSIDR_EL1 + and x1, x1, #PMSIDR_EL1_FDS + /* if FEAT_SPE_FDS is implemented, */ + cbz x1, .Lskip_spefds_\@ + /* disable traps of PMSDSFR to EL2. */ + orr x0, x0, #HDFGRTR2_EL2_nPMSDSFR_EL1 + +.Lskip_spefds_\@: msr_s SYS_HDFGRTR2_EL2, x0 msr_s SYS_HDFGWTR2_EL2, x0 msr_s SYS_HFGRTR2_EL2, xzr -- 2.33.0
From: Marc Zyngier <maz@kernel.org> mainline inclusion from mainline-v6.8-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- VNCR_EL2 points to a page containing a number of system registers accessed by a guest hypervisor when ARMv8.4-NV is enabled. Let's document the offsets in that page, as we are going to use this layout. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/include/asm/vncr_mapping.h | 103 ++++++++++++++++++++++++++ 1 file changed, 103 insertions(+) create mode 100644 arch/arm64/include/asm/vncr_mapping.h diff --git a/arch/arm64/include/asm/vncr_mapping.h b/arch/arm64/include/asm/vncr_mapping.h new file mode 100644 index 000000000000..df2c47c55972 --- /dev/null +++ b/arch/arm64/include/asm/vncr_mapping.h @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * System register offsets in the VNCR page + * All offsets are *byte* displacements! + */ + +#ifndef __ARM64_VNCR_MAPPING_H__ +#define __ARM64_VNCR_MAPPING_H__ + +#define VNCR_VTTBR_EL2 0x020 +#define VNCR_VTCR_EL2 0x040 +#define VNCR_VMPIDR_EL2 0x050 +#define VNCR_CNTVOFF_EL2 0x060 +#define VNCR_HCR_EL2 0x078 +#define VNCR_HSTR_EL2 0x080 +#define VNCR_VPIDR_EL2 0x088 +#define VNCR_TPIDR_EL2 0x090 +#define VNCR_HCRX_EL2 0x0A0 +#define VNCR_VNCR_EL2 0x0B0 +#define VNCR_CPACR_EL1 0x100 +#define VNCR_CONTEXTIDR_EL1 0x108 +#define VNCR_SCTLR_EL1 0x110 +#define VNCR_ACTLR_EL1 0x118 +#define VNCR_TCR_EL1 0x120 +#define VNCR_AFSR0_EL1 0x128 +#define VNCR_AFSR1_EL1 0x130 +#define VNCR_ESR_EL1 0x138 +#define VNCR_MAIR_EL1 0x140 +#define VNCR_AMAIR_EL1 0x148 +#define VNCR_MDSCR_EL1 0x158 +#define VNCR_SPSR_EL1 0x160 +#define VNCR_CNTV_CVAL_EL0 0x168 +#define VNCR_CNTV_CTL_EL0 0x170 +#define VNCR_CNTP_CVAL_EL0 0x178 +#define VNCR_CNTP_CTL_EL0 0x180 +#define VNCR_SCXTNUM_EL1 0x188 +#define VNCR_TFSR_EL1 0x190 +#define VNCR_HFGRTR_EL2 0x1B8 +#define VNCR_HFGWTR_EL2 0x1C0 +#define VNCR_HFGITR_EL2 0x1C8 +#define VNCR_HDFGRTR_EL2 0x1D0 +#define VNCR_HDFGWTR_EL2 0x1D8 +#define VNCR_ZCR_EL1 0x1E0 +#define VNCR_HAFGRTR_EL2 0x1E8 +#define VNCR_TTBR0_EL1 0x200 +#define VNCR_TTBR1_EL1 0x210 +#define VNCR_FAR_EL1 0x220 +#define VNCR_ELR_EL1 0x230 +#define VNCR_SP_EL1 0x240 +#define VNCR_VBAR_EL1 0x250 +#define VNCR_TCR2_EL1 0x270 +#define VNCR_PIRE0_EL1 0x290 +#define VNCR_PIRE0_EL2 0x298 +#define VNCR_PIR_EL1 0x2A0 +#define VNCR_ICH_LR0_EL2 0x400 +#define VNCR_ICH_LR1_EL2 0x408 +#define VNCR_ICH_LR2_EL2 0x410 +#define VNCR_ICH_LR3_EL2 0x418 +#define VNCR_ICH_LR4_EL2 0x420 +#define VNCR_ICH_LR5_EL2 0x428 +#define VNCR_ICH_LR6_EL2 0x430 +#define VNCR_ICH_LR7_EL2 0x438 +#define VNCR_ICH_LR8_EL2 0x440 +#define VNCR_ICH_LR9_EL2 0x448 +#define VNCR_ICH_LR10_EL2 0x450 +#define VNCR_ICH_LR11_EL2 0x458 +#define VNCR_ICH_LR12_EL2 0x460 +#define VNCR_ICH_LR13_EL2 0x468 +#define VNCR_ICH_LR14_EL2 0x470 +#define VNCR_ICH_LR15_EL2 0x478 +#define VNCR_ICH_AP0R0_EL2 0x480 +#define VNCR_ICH_AP0R1_EL2 0x488 +#define VNCR_ICH_AP0R2_EL2 0x490 +#define VNCR_ICH_AP0R3_EL2 0x498 +#define VNCR_ICH_AP1R0_EL2 0x4A0 +#define VNCR_ICH_AP1R1_EL2 0x4A8 +#define VNCR_ICH_AP1R2_EL2 0x4B0 +#define VNCR_ICH_AP1R3_EL2 0x4B8 +#define VNCR_ICH_HCR_EL2 0x4C0 +#define VNCR_ICH_VMCR_EL2 0x4C8 +#define VNCR_VDISR_EL2 0x500 +#define VNCR_PMBLIMITR_EL1 0x800 +#define VNCR_PMBPTR_EL1 0x810 +#define VNCR_PMBSR_EL1 0x820 +#define VNCR_PMSCR_EL1 0x828 +#define VNCR_PMSEVFR_EL1 0x830 +#define VNCR_PMSICR_EL1 0x838 +#define VNCR_PMSIRR_EL1 0x840 +#define VNCR_PMSLATFR_EL1 0x848 +#define VNCR_TRFCR_EL1 0x880 +#define VNCR_MPAM1_EL1 0x900 +#define VNCR_MPAMHCR_EL2 0x930 +#define VNCR_MPAMVPMV_EL2 0x938 +#define VNCR_MPAMVPM0_EL2 0x940 +#define VNCR_MPAMVPM1_EL2 0x948 +#define VNCR_MPAMVPM2_EL2 0x950 +#define VNCR_MPAMVPM3_EL2 0x958 +#define VNCR_MPAMVPM4_EL2 0x960 +#define VNCR_MPAMVPM5_EL2 0x968 +#define VNCR_MPAMVPM6_EL2 0x970 +#define VNCR_MPAMVPM7_EL2 0x978 + +#endif /* __ARM64_VNCR_MAPPING_H__ */ -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.18-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- SPE data source filtering (SPE_FEAT_FDS) adds a new register PMSDSFR_EL1, add the trap configs for it. PMSNEVFR_EL1 was also missing its VNCR offset so add it along with PMSDSFR_EL1. Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/include/asm/vncr_mapping.h | 2 ++ arch/arm64/kvm/emulate-nested.c | 1 + arch/arm64/kvm/sys_regs.c | 1 + 3 files changed, 4 insertions(+) diff --git a/arch/arm64/include/asm/vncr_mapping.h b/arch/arm64/include/asm/vncr_mapping.h index df2c47c55972..073e8154d095 100644 --- a/arch/arm64/include/asm/vncr_mapping.h +++ b/arch/arm64/include/asm/vncr_mapping.h @@ -87,6 +87,8 @@ #define VNCR_PMSICR_EL1 0x838 #define VNCR_PMSIRR_EL1 0x840 #define VNCR_PMSLATFR_EL1 0x848 +#define VNCR_PMSNEVFR_EL1 0x850 +#define VNCR_PMSDSFR_EL1 0x858 #define VNCR_TRFCR_EL1 0x880 #define VNCR_MPAM1_EL1 0x900 #define VNCR_MPAMHCR_EL2 0x930 diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c index ee902ff2a50f..4afc6840b3d4 100644 --- a/arch/arm64/kvm/emulate-nested.c +++ b/arch/arm64/kvm/emulate-nested.c @@ -925,6 +925,7 @@ static const struct encoding_to_trap_config encoding_to_cgt[] __initconst = { SR_TRAP(SYS_PMSIRR_EL1, CGT_MDCR_TPMS), SR_TRAP(SYS_PMSLATFR_EL1, CGT_MDCR_TPMS), SR_TRAP(SYS_PMSNEVFR_EL1, CGT_MDCR_TPMS), + SR_TRAP(SYS_PMSDSFR_EL1, CGT_MDCR_TPMS), SR_TRAP(SYS_TRFCR_EL1, CGT_MDCR_TTRF), SR_TRAP(SYS_TRBBASER_EL1, CGT_MDCR_E2TB), SR_TRAP(SYS_TRBLIMITR_EL1, CGT_MDCR_E2TB), diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index ded8d28d10a8..c94b24311525 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2597,6 +2597,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_PMBLIMITR_EL1), undef_access }, { SYS_DESC(SYS_PMBPTR_EL1), undef_access }, { SYS_DESC(SYS_PMBSR_EL1), undef_access }, + { SYS_DESC(SYS_PMSDSFR_EL1), undef_access }, /* PMBIDR_EL1 is not trapped */ { PMU_SYS_REG(PMINTENSET_EL1), -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.19-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Arm FEAT_SPE_FDS adds the ability to filter on the data source of a packet using another 64-bits of event filtering control. As the existing perf_event_attr::configN fields are all used up for SPE PMU, an additional field is needed. Add a new 'config4' field. Reviewed-by: Leo Yan <leo.yan@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- include/uapi/linux/perf_event.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 365184d931ad..e6d353d220a4 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -17,6 +17,8 @@ #include <linux/types.h> #include <linux/ioctl.h> +#include <linux/kabi.h> + #include <asm/byteorder.h> /* @@ -382,6 +384,7 @@ enum perf_event_read_format { #define PERF_ATTR_SIZE_VER6 120 /* Add: aux_sample_size */ #define PERF_ATTR_SIZE_VER7 128 /* Add: sig_data */ #define PERF_ATTR_SIZE_VER8 136 /* Add: config3 */ +#define PERF_ATTR_SIZE_VER9 144 /* add: config4 */ /* * 'struct perf_event_attr' contains various attributes that define @@ -534,6 +537,7 @@ struct perf_event_attr { __u64 sig_data; __u64 config3; /* extension of config2 */ + KABI_EXTEND(__u64 config4) /* extension of config3 */ }; /* -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.19-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- SPE_FEAT_FDS adds the ability to filter on the data source of packets. Like the other existing filters, enable filtering with PMSFCR_EL1.FDS when any of the filter bits are set. Each bit maps to data sources 0-63 described by bits[0:5] in the data source packet (although the full range of data source is 16 bits so higher value data sources can't be filtered on). The filter is an OR of all the bits, so for example clearing bits 0 and 3 only includes packets from data sources 0 OR 3. Invert the filter given by userspace so that the default value of 0 is equivalent to including all values (no filtering). This allows us to skip adding a new format bit to enable filtering and still support excluding all data sources which would have been a filter value of 0 if not for the inversion. Tested-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- drivers/perf/arm_spe_pmu.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c index c49077730217..50753cbebede 100644 --- a/drivers/perf/arm_spe_pmu.c +++ b/drivers/perf/arm_spe_pmu.c @@ -98,6 +98,7 @@ struct arm_spe_pmu { #define SPE_PMU_FEAT_INV_FILT_EVT (1UL << 6) #define SPE_PMU_FEAT_DISCARD (1UL << 7) #define SPE_PMU_FEAT_EFT (1UL << 8) +#define SPE_PMU_FEAT_FDS (1UL << 9) #define SPE_PMU_FEAT_DEV_PROBED (1UL << 63) u64 features; @@ -263,6 +264,10 @@ static const struct attribute_group arm_spe_pmu_cap_group = { #define ATTR_CFG_FLD_inv_event_filter_LO 0 #define ATTR_CFG_FLD_inv_event_filter_HI 63 +#define ATTR_CFG_FLD_inv_data_src_filter_CFG config4 /* inverse of PMSDSFR_EL1 */ +#define ATTR_CFG_FLD_inv_data_src_filter_LO 0 +#define ATTR_CFG_FLD_inv_data_src_filter_HI 63 + GEN_PMU_FORMAT_ATTR(ts_enable); GEN_PMU_FORMAT_ATTR(pa_enable); GEN_PMU_FORMAT_ATTR(pct_enable); @@ -279,6 +284,7 @@ GEN_PMU_FORMAT_ATTR(float_filter); GEN_PMU_FORMAT_ATTR(float_filter_mask); GEN_PMU_FORMAT_ATTR(event_filter); GEN_PMU_FORMAT_ATTR(inv_event_filter); +GEN_PMU_FORMAT_ATTR(inv_data_src_filter); GEN_PMU_FORMAT_ATTR(min_latency); GEN_PMU_FORMAT_ATTR(discard); @@ -299,6 +305,7 @@ static struct attribute *arm_spe_pmu_formats_attr[] = { &format_attr_float_filter_mask.attr, &format_attr_event_filter.attr, &format_attr_inv_event_filter.attr, + &format_attr_inv_data_src_filter.attr, &format_attr_min_latency.attr, &format_attr_discard.attr, NULL, @@ -317,6 +324,10 @@ static umode_t arm_spe_pmu_format_attr_is_visible(struct kobject *kobj, if (attr == &format_attr_inv_event_filter.attr && !(spe_pmu->features & SPE_PMU_FEAT_INV_FILT_EVT)) return 0; + if (attr == &format_attr_inv_data_src_filter.attr && + !(spe_pmu->features & SPE_PMU_FEAT_FDS)) + return 0; + if ((attr == &format_attr_branch_filter_mask.attr || attr == &format_attr_load_filter_mask.attr || attr == &format_attr_store_filter_mask.attr || @@ -437,6 +448,9 @@ static u64 arm_spe_event_to_pmsfcr(struct perf_event *event) if (ATTR_CFG_GET_FLD(attr, inv_event_filter)) reg |= PMSFCR_EL1_FnE; + if (ATTR_CFG_GET_FLD(attr, inv_data_src_filter)) + reg |= PMSFCR_EL1_FDS; + if (ATTR_CFG_GET_FLD(attr, min_latency)) reg |= PMSFCR_EL1_FL; @@ -461,6 +475,17 @@ static u64 arm_spe_event_to_pmslatfr(struct perf_event *event) return FIELD_PREP(PMSLATFR_EL1_MINLAT, ATTR_CFG_GET_FLD(attr, min_latency)); } +static u64 arm_spe_event_to_pmsdsfr(struct perf_event *event) +{ + struct perf_event_attr *attr = &event->attr; + + /* + * Data src filter is inverted so that the default value of 0 is + * equivalent to no filtering. + */ + return ~ATTR_CFG_GET_FLD(attr, inv_data_src_filter); +} + static void arm_spe_pmu_pad_buf(struct perf_output_handle *handle, int len) { struct arm_spe_pmu_buf *buf = perf_get_aux(handle); @@ -826,6 +851,10 @@ static int arm_spe_pmu_event_init(struct perf_event *event) if (arm_spe_event_to_pmsnevfr(event) & spe_pmu->pmsevfr_res0) return -EOPNOTSUPP; + if (arm_spe_event_to_pmsdsfr(event) != U64_MAX && + !(spe_pmu->features & SPE_PMU_FEAT_FDS)) + return -EOPNOTSUPP; + if (attr->exclude_idle) return -EOPNOTSUPP; @@ -905,6 +934,11 @@ static void arm_spe_pmu_start(struct perf_event *event, int flags) write_sysreg_s(reg, SYS_PMSNEVFR_EL1); } + if (spe_pmu->features & SPE_PMU_FEAT_FDS) { + reg = arm_spe_event_to_pmsdsfr(event); + write_sysreg_s(reg, SYS_PMSDSFR_EL1); + } + reg = arm_spe_event_to_pmslatfr(event); write_sysreg_s(reg, SYS_PMSLATFR_EL1); @@ -1174,6 +1208,9 @@ static void __arm_spe_pmu_dev_probe(void *info) if (FIELD_GET(PMSIDR_EL1_EFT, reg)) spe_pmu->features |= SPE_PMU_FEAT_EFT; + if (FIELD_GET(PMSIDR_EL1_FDS, reg)) + spe_pmu->features |= SPE_PMU_FEAT_FDS; + /* This field has a spaced out encoding, so just use a look-up */ fld = FIELD_GET(PMSIDR_EL1_INTERVAL, reg); switch (fld) { -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.19-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- To pickup config4 changes. Tested-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- tools/include/uapi/linux/perf_event.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 365184d931ad..9d835c2ecab8 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -382,6 +382,7 @@ enum perf_event_read_format { #define PERF_ATTR_SIZE_VER6 120 /* Add: aux_sample_size */ #define PERF_ATTR_SIZE_VER7 128 /* Add: sig_data */ #define PERF_ATTR_SIZE_VER8 136 /* Add: config3 */ +#define PERF_ATTR_SIZE_VER9 144 /* add: config4 */ /* * 'struct perf_event_attr' contains various attributes that define @@ -534,6 +535,7 @@ struct perf_event_attr { __u64 sig_data; __u64 config3; /* extension of config2 */ + __u64 config4; /* extension of config3 */ }; /* -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.19-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- perf_event_attr has gained a new field, config4, so add support for it extending the existing configN support. Reviewed-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- tools/perf/tests/parse-events.c | 14 +++++++++++++- tools/perf/util/parse-events.c | 11 +++++++++++ tools/perf/util/parse-events.h | 1 + tools/perf/util/parse-events.l | 1 + tools/perf/util/pmu.c | 3 +++ tools/perf/util/pmu.h | 1 + 6 files changed, 30 insertions(+), 1 deletion(-) diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c index bc7b335b05f7..da07018b5b75 100644 --- a/tools/perf/tests/parse-events.c +++ b/tools/perf/tests/parse-events.c @@ -630,6 +630,8 @@ static int test__checkevent_pmu(struct evlist *evlist) TEST_ASSERT_VAL("wrong config1", 1 == evsel->core.attr.config1); TEST_ASSERT_VAL("wrong config2", 3 == evsel->core.attr.config2); TEST_ASSERT_VAL("wrong config3", 0 == evsel->core.attr.config3); + TEST_ASSERT_VAL("wrong config4", 0 == evsel->core.attr.config4); + /* * The period value gets configured within evlist__config, * while this test executes only parse events method. @@ -653,6 +655,7 @@ static int test__checkevent_list(struct evlist *evlist) TEST_ASSERT_VAL("wrong config1", 0 == evsel->core.attr.config1); TEST_ASSERT_VAL("wrong config2", 0 == evsel->core.attr.config2); TEST_ASSERT_VAL("wrong config3", 0 == evsel->core.attr.config3); + TEST_ASSERT_VAL("wrong config4", 0 == evsel->core.attr.config4); TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -831,6 +834,15 @@ static int test__checkterms_simple(struct list_head *terms) TEST_ASSERT_VAL("wrong val", term->val.num == 4); TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "config3")); + /* config4=5 */ + term = list_entry(term->list.next, struct parse_events_term, list); + TEST_ASSERT_VAL("wrong type term", + term->type_term == PARSE_EVENTS__TERM_TYPE_CONFIG4); + TEST_ASSERT_VAL("wrong type val", + term->type_val == PARSE_EVENTS__TERM_TYPE_NUM); + TEST_ASSERT_VAL("wrong val", term->val.num == 5); + TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "config4")); + /* umask=1*/ term = list_entry(term->list.next, struct parse_events_term, list); TEST_ASSERT_VAL("wrong type term", @@ -2482,7 +2494,7 @@ struct terms_test { static const struct terms_test test__terms[] = { [0] = { - .str = "config=10,config1,config2=3,config3=4,umask=1,read,r0xead", + .str = "config=10,config1,config2=3,config3=4,config4=5,umask=1,read,r0xead", .check = test__checkterms_simple, }, }; diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index ac5fea49e6db..21bc4fe41427 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -245,6 +245,8 @@ __add_event(struct list_head *list, int *idx, PERF_PMU_FORMAT_VALUE_CONFIG2, "config2"); perf_pmu__warn_invalid_config(pmu, attr->config3, name, PERF_PMU_FORMAT_VALUE_CONFIG3, "config3"); + perf_pmu__warn_invalid_config(pmu, attr->config4, name, + PERF_PMU_FORMAT_VALUE_CONFIG4, "config4"); } if (init_attr) event_attr_init(attr); @@ -750,6 +752,7 @@ static const char *config_term_name(enum parse_events__term_type term_type) [PARSE_EVENTS__TERM_TYPE_CONFIG1] = "config1", [PARSE_EVENTS__TERM_TYPE_CONFIG2] = "config2", [PARSE_EVENTS__TERM_TYPE_CONFIG3] = "config3", + [PARSE_EVENTS__TERM_TYPE_CONFIG4] = "config4", [PARSE_EVENTS__TERM_TYPE_NAME] = "name", [PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD] = "period", [PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ] = "freq", @@ -796,6 +799,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er case PARSE_EVENTS__TERM_TYPE_CONFIG1: case PARSE_EVENTS__TERM_TYPE_CONFIG2: case PARSE_EVENTS__TERM_TYPE_CONFIG3: + case PARSE_EVENTS__TERM_TYPE_CONFIG4: case PARSE_EVENTS__TERM_TYPE_NAME: case PARSE_EVENTS__TERM_TYPE_METRIC_ID: case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: @@ -863,6 +867,10 @@ do { \ CHECK_TYPE_VAL(NUM); attr->config3 = term->val.num; break; + case PARSE_EVENTS__TERM_TYPE_CONFIG4: + CHECK_TYPE_VAL(NUM); + attr->config4 = term->val.num; + break; case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: CHECK_TYPE_VAL(NUM); break; @@ -1058,6 +1066,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr, case PARSE_EVENTS__TERM_TYPE_CONFIG1: case PARSE_EVENTS__TERM_TYPE_CONFIG2: case PARSE_EVENTS__TERM_TYPE_CONFIG3: + case PARSE_EVENTS__TERM_TYPE_CONFIG4: case PARSE_EVENTS__TERM_TYPE_NAME: case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: case PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ: @@ -1195,6 +1204,7 @@ do { \ case PARSE_EVENTS__TERM_TYPE_CONFIG1: case PARSE_EVENTS__TERM_TYPE_CONFIG2: case PARSE_EVENTS__TERM_TYPE_CONFIG3: + case PARSE_EVENTS__TERM_TYPE_CONFIG4: case PARSE_EVENTS__TERM_TYPE_NAME: case PARSE_EVENTS__TERM_TYPE_METRIC_ID: case PARSE_EVENTS__TERM_TYPE_RAW: @@ -1232,6 +1242,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct list_head *head_config, case PARSE_EVENTS__TERM_TYPE_CONFIG1: case PARSE_EVENTS__TERM_TYPE_CONFIG2: case PARSE_EVENTS__TERM_TYPE_CONFIG3: + case PARSE_EVENTS__TERM_TYPE_CONFIG4: case PARSE_EVENTS__TERM_TYPE_NAME: case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: case PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ: diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h index 594e5d2dc67f..8ebd1f930e9c 100644 --- a/tools/perf/util/parse-events.h +++ b/tools/perf/util/parse-events.h @@ -59,6 +59,7 @@ enum parse_events__term_type { PARSE_EVENTS__TERM_TYPE_CONFIG1, PARSE_EVENTS__TERM_TYPE_CONFIG2, PARSE_EVENTS__TERM_TYPE_CONFIG3, + PARSE_EVENTS__TERM_TYPE_CONFIG4, PARSE_EVENTS__TERM_TYPE_NAME, PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD, PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ, diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l index 4ef4b6f171a0..9ebd4f148fd0 100644 --- a/tools/perf/util/parse-events.l +++ b/tools/perf/util/parse-events.l @@ -229,6 +229,7 @@ config { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG); } config1 { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG1); } config2 { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG2); } config3 { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG3); } +config4 { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG4); } name { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NAME); } period { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD); } freq { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ); } diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index 309ccfbaffe5..493d8a5a3908 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -1342,6 +1342,9 @@ static int pmu_config_term(struct perf_pmu *pmu, case PERF_PMU_FORMAT_VALUE_CONFIG3: vp = &attr->config3; break; + case PERF_PMU_FORMAT_VALUE_CONFIG4: + vp = &attr->config4; + break; default: return -EINVAL; } diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h index b4de5bf86e18..134f98cf0a8c 100644 --- a/tools/perf/util/pmu.h +++ b/tools/perf/util/pmu.h @@ -20,6 +20,7 @@ enum { PERF_PMU_FORMAT_VALUE_CONFIG1, PERF_PMU_FORMAT_VALUE_CONFIG2, PERF_PMU_FORMAT_VALUE_CONFIG3, + PERF_PMU_FORMAT_VALUE_CONFIG4, PERF_PMU_FORMAT_VALUE_CONFIG_END, }; -- 2.33.0
From: James Clark <james.clark@linaro.org> mainline inclusion from mainline-v6.19-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- FEAT_SPE_EFT and FEAT_SPE_FDS etc have new user facing format attributes so document them. Also document existing 'event_filter' bits that were missing from the doc and the fact that latency values are stored in the weight field. Reviewed-by: Leo Yan <leo.yan@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- tools/perf/Documentation/perf-arm-spe.txt | 103 ++++++++++++++++++++-- 1 file changed, 94 insertions(+), 9 deletions(-) diff --git a/tools/perf/Documentation/perf-arm-spe.txt b/tools/perf/Documentation/perf-arm-spe.txt index bf03222e9a68..09e555de8b93 100644 --- a/tools/perf/Documentation/perf-arm-spe.txt +++ b/tools/perf/Documentation/perf-arm-spe.txt @@ -132,26 +132,64 @@ Config parameters These are placed between the // in the event and comma separated. For example '-e arm_spe/load_filter=1,min_latency=10/' - branch_filter=1 - collect branches only (PMSFCR.B) - event_filter=<mask> - filter on specific events (PMSEVFR) - see bitfield description below + event_filter=<mask> - logical AND filter on specific events (PMSEVFR) - see bitfield description below + inv_event_filter=<mask> - logical OR to filter out specific events (PMSNEVFR, FEAT_SPEv1p2) - see bitfield description below jitter=1 - use jitter to avoid resonance when sampling (PMSIRR.RND) - load_filter=1 - collect loads only (PMSFCR.LD) min_latency=<n> - collect only samples with this latency or higher* (PMSLATFR) pa_enable=1 - collect physical address (as well as VA) of loads/stores (PMSCR.PA) - requires privilege pct_enable=1 - collect physical timestamp instead of virtual timestamp (PMSCR.PCT) - requires privilege - store_filter=1 - collect stores only (PMSFCR.ST) ts_enable=1 - enable timestamping with value of generic timer (PMSCR.TS) + data_src_filter=<mask> - mask to filter from 0-63 possible data sources (PMSDSFR, FEAT_SPE_FDS) - See 'Data source filtering' +++*+++ Latency is the total latency from the point at which sampling started on that instruction, rather than only the execution latency. -Only some events can be filtered on; these include: - - bit 1 - instruction retired (i.e. omit speculative instructions) +Only some events can be filtered on using 'event_filter' bits. The overall +filter is the logical AND of these bits, for example if bits 3 and 5 are set +only samples that have both 'L1D cache refill' AND 'TLB walk' are recorded. When +FEAT_SPEv1p2 is implemented 'inv_event_filter' can also be used to exclude +events that have any (OR) of the filter's bits set. For example setting bits 3 +and 5 in 'inv_event_filter' will exclude any events that are either L1D cache +refill OR TLB walk. If the same bit is set in both filters it's UNPREDICTABLE +whether the sample is included or excluded. Filter bits for both event_filter +and inv_event_filter are: + + bit 1 - Instruction retired (i.e. omit speculative instructions) + bit 2 - L1D access (FEAT_SPEv1p4) bit 3 - L1D refill + bit 4 - TLB access (FEAT_SPEv1p4) bit 5 - TLB refill - bit 7 - mispredict - bit 11 - misaligned access + bit 6 - Not taken event (FEAT_SPEv1p2) + bit 7 - Mispredict + bit 8 - Last level cache access (FEAT_SPEv1p4) + bit 9 - Last level cache miss (FEAT_SPEv1p4) + bit 10 - Remote access (FEAT_SPEv1p4) + bit 11 - Misaligned access (FEAT_SPEv1p1) + bit 12-15 - IMPLEMENTATION DEFINED events (when implemented) + bit 16 - Transaction (FEAT_TME) + bit 17 - Partial or empty SME or SVE predicate (FEAT_SPEv1p1) + bit 18 - Empty SME or SVE predicate (FEAT_SPEv1p1) + bit 19 - L2D access (FEAT_SPEv1p4) + bit 20 - L2D miss (FEAT_SPEv1p4) + bit 21 - Cache data modified (FEAT_SPEv1p4) + bit 22 - Recently fetched (FEAT_SPEv1p4) + bit 23 - Data snooped (FEAT_SPEv1p4) + bit 24 - Streaming SVE mode event (when FEAT_SPE_SME is implemented), or + IMPLEMENTATION DEFINED event 24 (when implemented, only versions + less than FEAT_SPEv1p4) + bit 25 - SMCU or external coprocessor operation event when FEAT_SPE_SME is + implemented, or IMPLEMENTATION DEFINED event 25 (when implemented, + only versions less than FEAT_SPEv1p4) + bit 26-31 - IMPLEMENTATION DEFINED events (only versions less than FEAT_SPEv1p4) + bit 48-63 - IMPLEMENTATION DEFINED events (when implemented) + +For IMPLEMENTATION DEFINED bits, refer to the CPU TRM if these bits are +implemented. + +The driver will reject events if requested filter bits require unimplemented SPE +versions, but will not reject filter bits for unimplemented IMPDEF bits or when +their related feature is not present (e.g. SME). For example, if FEAT_SPEv1p2 is +not implemented, filtering on "Not taken event" (bit 6) will be rejected. So to sample just retired instructions: @@ -161,6 +199,31 @@ or just mispredicted branches: perf record -e arm_spe/event_filter=0x80/ -- ./mybench +When set, the following filters can be used to select samples that match any of +the operation types (OR filtering). If only one is set then only samples of that +type are collected: + + branch_filter=1 - Collect branches (PMSFCR.B) + load_filter=1 - Collect loads (PMSFCR.LD) + store_filter=1 - Collect stores (PMSFCR.ST) + +When extended filtering is supported (FEAT_SPE_EFT), SIMD and float +pointer operations can also be selected: + + simd_filter=1 - Collect SIMD loads, stores and operations (PMSFCR.SIMD) + float_filter=1 - Collect floating point loads, stores and operations (PMSFCR.FP) + +When extended filtering is supported (FEAT_SPE_EFT), operation type filters can +be changed to AND using _mask fields. For example samples could be selected if +they are store AND SIMD by setting 'store_filter=1,simd_filter=1, +store_filter_mask=1,simd_filter_mask=1'. The new masks are as follows: + + branch_filter_mask=1 - Change branch filter behavior from OR to AND (PMSFCR.Bm) + load_filter_mask=1 - Change load filter behavior from OR to AND (PMSFCR.LDm) + store_filter_mask=1 - Change store filter behavior from OR to AND (PMSFCR.STm) + simd_filter_mask=1 - Change SIMD filter behavior from OR to AND (PMSFCR.SIMDm) + float_filter_mask=1 - Change floating point filter behavior from OR to AND (PMSFCR.FPm) + Viewing the data ~~~~~~~~~~~~~~~~~ @@ -194,6 +257,10 @@ Memory access details are also stored on the samples and this can be viewed with perf report --mem-mode +The latency value from the SPE sample is stored in the 'weight' field of the +Perf samples and can be displayed in Perf script and report outputs by enabling +its display from the command line. + Common errors ~~~~~~~~~~~~~ @@ -210,6 +277,24 @@ Common errors Increase sampling interval (see above) +Data source filtering +~~~~~~~~~~~~~~~~~~~~~ + +When FEAT_SPE_FDS is present, 'inv_data_src_filter' can be used as a mask to +filter on a subset (0 - 63) of possible data source IDs. The full range of data +sources is 0 - 65535 although these are unlikely to be used in practice. Data +sources are IMPDEF so refer to the TRM for the mappings. Each bit N of the +filter maps to data source N. The filter is an OR of all the bits, and the value +provided inv_data_src_filter is inverted before writing to PMSDSFR_EL1 so that +set bits exclude that data source and cleared bits include that data source. +Therefore the default value of 0 is equivalent to no filtering (all data sources +included). + +For example, to include only data sources 0 and 3, clear bits 0 and 3 +(0xFFFFFFFFFFFFFFF6) + +When 'inv_data_src_filter' is set to 0xFFFFFFFFFFFFFFFF, any samples with any +data source set are excluded. SEE ALSO -------- -- 2.33.0
From: Marc Zyngier <maz@kernel.org> mainline inclusion from mainline-v6.16-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Add the missing SME, ALTCLK, FPF, EFT. CRR and FDS fields. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/tools/sysreg | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index bf48bb4859f6..cf18cd2b142c 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -2301,7 +2301,28 @@ Field 15:0 MINLAT EndSysreg Sysreg PMSIDR_EL1 3 0 9 9 7 -Res0 63:25 +Res0 63:33 +UnsignedEnum 32 SME + 0b0 NI + 0b1 IMP +EndEnum +UnsignedEnum 31:28 ALTCLK + 0b0000 NI + 0b0001 IMP + 0b1111 IMPDEF +EndEnum +UnsignedEnum 27 FPF + 0b0 NI + 0b1 IMP +EndEnum +UnsignedEnum 26 EFT + 0b0 NI + 0b1 IMP +EndEnum +UnsignedEnum 25 CRR + 0b0 NI + 0b1 IMP +EndEnum Field 24 PBT Field 23:20 FORMAT Enum 19:16 COUNTSIZE @@ -2319,7 +2340,10 @@ Enum 11:8 INTERVAL 0b0111 3072 0b1000 4096 EndEnum -Res0 7 +UnsignedEnum 7 FDS + 0b0 NI + 0b1 IMP +EndEnum Field 6 FnE Field 5 ERND Field 4 LDS -- 2.33.0
From: "Rob Herring (Arm)" <robh@kernel.org> mainline inclusion from mainline-v6.12-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Use of_property_present() to test for property presence rather than of_find_property(). This is part of a larger effort to remove callers of of_find_property() and similar functions. of_find_property() leaks the DT struct property and data pointers which is a problem for dynamically allocated nodes which may be freed. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240731191312.1710417-15-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- drivers/perf/arm_pmu_platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c index 3596db36cbff..9af933b5a851 100644 --- a/drivers/perf/arm_pmu_platform.c +++ b/drivers/perf/arm_pmu_platform.c @@ -59,7 +59,7 @@ static int pmu_parse_percpu_irq(struct arm_pmu *pmu, int irq) static bool pmu_has_irq_affinity(struct device_node *node) { - return !!of_find_property(node, "interrupt-affinity", NULL); + return of_property_present(node, "interrupt-affinity"); } static int pmu_parse_irq_affinity(struct device *dev, int i) -- 2.33.0
From: "Rob Herring (Arm)" <robh@kernel.org> mainline inclusion from mainline-v6.12-rc1 category: feature bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ----------------------------------------------- Xscale and Armv6 PMUs defined the cycle counter at 0 and event counters starting at 1 and had 1:1 event index to counter numbering. On Armv7 and later, this changed the cycle counter to 31 and event counters start at 0. The drivers for Armv7 and PMUv3 kept the old event index numbering and introduced an event index to counter conversion. The conversion uses masking to convert from event index to a counter number. This operation relies on having at most 32 counters so that the cycle counter index 0 can be transformed to counter number 31. Armv9.4 adds support for an additional fixed function counter (instructions) which increases possible counters to more than 32, and the conversion won't work anymore as a simple subtract and mask. The primary reason for the translation (other than history) seems to be to have a contiguous mask of counters 0-N. Keeping that would result in more complicated index to counter conversions. Instead, store a mask of available counters rather than just number of events. That provides more information in addition to the number of events. No (intended) functional changes. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-1-280a8d7ff465@kerne... Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- arch/arm64/kvm/pmu-emul.c | 6 ++-- drivers/perf/apple_m1_cpu_pmu.c | 4 +-- drivers/perf/arm_pmu.c | 11 +++--- drivers/perf/arm_pmuv3.c | 61 ++++++++++++--------------------- include/linux/perf/arm_pmu.h | 2 +- include/linux/perf/arm_pmuv3.h | 1 + 6 files changed, 34 insertions(+), 51 deletions(-) diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c index 2cf3c4462b64..d86c7a585f99 100644 --- a/arch/arm64/kvm/pmu-emul.c +++ b/arch/arm64/kvm/pmu-emul.c @@ -941,10 +941,10 @@ u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm) return kvm_realm_max_pmu_counters(); /* - * The arm_pmu->num_events considers the cycle counter as well. - * Ignore that and return only the general-purpose counters. + * The arm_pmu->cntr_mask considers the fixed counter(s) as well. + * Ignore those and return only the general-purpose counters. */ - return arm_pmu->num_events - 1; + return bitmap_weight(arm_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS); } static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id) diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c index f322e5ca1114..c8f607912567 100644 --- a/drivers/perf/apple_m1_cpu_pmu.c +++ b/drivers/perf/apple_m1_cpu_pmu.c @@ -400,7 +400,7 @@ static irqreturn_t m1_pmu_handle_irq(struct arm_pmu *cpu_pmu) regs = get_irq_regs(); - for (idx = 0; idx < cpu_pmu->num_events; idx++) { + for_each_set_bit(idx, cpu_pmu->cntr_mask, M1_PMU_NR_COUNTERS) { struct perf_event *event = cpuc->events[idx]; struct perf_sample_data data; @@ -560,7 +560,7 @@ static int m1_pmu_init(struct arm_pmu *cpu_pmu, u32 flags) cpu_pmu->reset = m1_pmu_reset; cpu_pmu->set_event_filter = m1_pmu_set_event_filter; - cpu_pmu->num_events = M1_PMU_NR_COUNTERS; + bitmap_set(cpu_pmu->cntr_mask, 0, M1_PMU_NR_COUNTERS); cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = &m1_pmu_events_attr_group; cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = &m1_pmu_format_attr_group; return 0; diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index ea9a8d0c0b21..801a93b16ac3 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -566,7 +566,7 @@ static void armpmu_enable(struct pmu *pmu) { struct arm_pmu *armpmu = to_arm_pmu(pmu); struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events); - bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events); + bool enabled = !bitmap_empty(hw_events->used_mask, ARMPMU_MAX_HWEVENTS); /* For task-bound events we may be called on other CPUs */ if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus)) @@ -801,7 +801,7 @@ static void cpu_pm_pmu_setup(struct arm_pmu *armpmu, unsigned long cmd) struct perf_event *event; int idx; - for (idx = 0; idx < armpmu->num_events; idx++) { + for_each_set_bit(idx, armpmu->cntr_mask, ARMPMU_MAX_HWEVENTS) { event = hw_events->events[idx]; if (!event) continue; @@ -831,7 +831,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd, { struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb); struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events); - bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events); + bool enabled = !bitmap_empty(hw_events->used_mask, ARMPMU_MAX_HWEVENTS); if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus)) return NOTIFY_DONE; @@ -991,8 +991,9 @@ int armpmu_register(struct arm_pmu *pmu) if (ret) goto out_destroy; - pr_info("enabled with %s PMU driver, %d counters available%s\n", - pmu->name, pmu->num_events, + pr_info("enabled with %s PMU driver, %d (%*pb) counters available%s\n", + pmu->name, bitmap_weight(pmu->cntr_mask, ARMPMU_MAX_HWEVENTS), + ARMPMU_MAX_HWEVENTS, &pmu->cntr_mask, has_nmi ? ", using NMIs" : ""); kvm_host_pmu_init(pmu); diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c index 5d750678a70e..ee56bde77ef5 100644 --- a/drivers/perf/arm_pmuv3.c +++ b/drivers/perf/arm_pmuv3.c @@ -508,9 +508,8 @@ static const struct attribute_group armv8_pmuv3_caps_attr_group = { /* * Perf Events' indices */ -#define ARMV8_IDX_CYCLE_COUNTER 0 +#define ARMV8_IDX_CYCLE_COUNTER 31 #define ARMV8_IDX_COUNTER0 1 -#define ARMV8_IDX_CYCLE_COUNTER_USER 32 /* * We unconditionally enable ARMv8.5-PMU long event counter support @@ -543,19 +542,12 @@ static bool armv8pmu_event_is_chained(struct perf_event *event) return !armv8pmu_event_has_user_read(event) && armv8pmu_event_is_64bit(event) && !armv8pmu_has_long_event(cpu_pmu) && - (idx != ARMV8_IDX_CYCLE_COUNTER); + (idx < ARMV8_PMU_MAX_GENERAL_COUNTERS); } /* * ARMv8 low level PMU access */ - -/* - * Perf Event to low level counters mapping - */ -#define ARMV8_IDX_TO_COUNTER(x) \ - (((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK) - static u64 armv8pmu_pmcr_read(void) { return read_pmcr(); @@ -575,14 +567,12 @@ static int armv8pmu_has_overflowed(u32 pmovsr) static int armv8pmu_counter_has_overflowed(u32 pmnc, int idx) { - return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx)); + return pmnc & BIT(idx); } static u64 armv8pmu_read_evcntr(int idx) { - u32 counter = ARMV8_IDX_TO_COUNTER(idx); - - return read_pmevcntrn(counter); + return read_pmevcntrn(idx); } static u64 armv8pmu_read_hw_counter(struct perf_event *event) @@ -611,7 +601,7 @@ static bool armv8pmu_event_needs_bias(struct perf_event *event) return false; if (armv8pmu_has_long_event(cpu_pmu) || - idx == ARMV8_IDX_CYCLE_COUNTER) + idx >= ARMV8_PMU_MAX_GENERAL_COUNTERS) return true; return false; @@ -649,9 +639,7 @@ static u64 armv8pmu_read_counter(struct perf_event *event) static void armv8pmu_write_evcntr(int idx, u64 value) { - u32 counter = ARMV8_IDX_TO_COUNTER(idx); - - write_pmevcntrn(counter, value); + write_pmevcntrn(idx, value); } #ifdef CONFIG_HISILICON_HW_METRIC @@ -722,7 +710,6 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value) static void armv8pmu_write_evtype(int idx, unsigned long val) { - u32 counter = ARMV8_IDX_TO_COUNTER(idx); unsigned long mask = ARMV8_PMU_EVTYPE_EVENT | ARMV8_PMU_INCLUDE_EL2 | ARMV8_PMU_EXCLUDE_EL0 | @@ -732,7 +719,7 @@ static void armv8pmu_write_evtype(int idx, unsigned long val) mask |= ARMV8_PMU_EVTYPE_TC | ARMV8_PMU_EVTYPE_TH; val &= mask; - write_pmevtypern(counter, val); + write_pmevtypern(idx, val); } static void armv8pmu_write_event_type(struct perf_event *event) @@ -761,7 +748,7 @@ static void armv8pmu_write_event_type(struct perf_event *event) static u32 armv8pmu_event_cnten_mask(struct perf_event *event) { - int counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); + int counter = event->hw.idx; u32 mask = BIT(counter); if (armv8pmu_event_is_chained(event)) @@ -864,8 +851,7 @@ static void armv8pmu_enable_intens(u32 mask) static void armv8pmu_enable_event_irq(struct perf_event *event) { - u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); - armv8pmu_enable_intens(BIT(counter)); + armv8pmu_enable_intens(BIT(event->hw.idx)); } static void armv8pmu_disable_intens(u32 mask) @@ -879,8 +865,7 @@ static void armv8pmu_disable_intens(u32 mask) static void armv8pmu_disable_event_irq(struct perf_event *event) { - u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); - armv8pmu_disable_intens(BIT(counter)); + armv8pmu_disable_intens(BIT(event->hw.idx)); } static u32 armv8pmu_getreset_flags(void) @@ -924,7 +909,8 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu) struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events); /* Clear any unused counters to avoid leaking their contents */ - for_each_clear_bit(i, cpuc->used_mask, cpu_pmu->num_events) { + for_each_andnot_bit(i, cpu_pmu->cntr_mask, cpuc->used_mask, + ARMPMU_MAX_HWEVENTS) { if (i == ARMV8_IDX_CYCLE_COUNTER) write_pmccntr(0); else @@ -1064,7 +1050,7 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu) * to prevent skews in group events. */ armv8pmu_stop(cpu_pmu); - for (idx = 0; idx < cpu_pmu->num_events; ++idx) { + for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMPMU_MAX_HWEVENTS) { struct perf_event *event = cpuc->events[idx]; struct hw_perf_event *hwc; @@ -1112,7 +1098,7 @@ static int armv8pmu_get_single_idx(struct pmu_hw_events *cpuc, { int idx; - for (idx = ARMV8_IDX_COUNTER0; idx < cpu_pmu->num_events; idx++) { + for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS) { if (!test_and_set_bit(idx, cpuc->used_mask)) return idx; } @@ -1128,7 +1114,9 @@ static int armv8pmu_get_chain_idx(struct pmu_hw_events *cpuc, * Chaining requires two consecutive event counters, where * the lower idx must be even. */ - for (idx = ARMV8_IDX_COUNTER0 + 1; idx < cpu_pmu->num_events; idx += 2) { + for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS) { + if (!(idx & 0x1)) + continue; if (!test_and_set_bit(idx, cpuc->used_mask)) { /* Check if the preceding even counter is available */ if (!test_and_set_bit(idx - 1, cpuc->used_mask)) @@ -1271,15 +1259,7 @@ static int armv8pmu_user_event_idx(struct perf_event *event) if (!sysctl_perf_user_access || !armv8pmu_event_has_user_read(event)) return 0; - /* - * We remap the cycle counter index to 32 to - * match the offset applied to the rest of - * the counter indices. - */ - if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER) - return ARMV8_IDX_CYCLE_COUNTER_USER; - - return event->hw.idx; + return event->hw.idx + 1; } static bool armv8pmu_branch_stack_init(struct perf_event *event) @@ -1546,10 +1526,11 @@ static void __armv8pmu_probe_pmu(void *info) probe->present = true; /* Read the nb of CNTx counters supported from PMNC */ - cpu_pmu->num_events = FIELD_GET(ARMV8_PMU_PMCR_N, armv8pmu_pmcr_read()); + bitmap_set(cpu_pmu->cntr_mask, + 0, FIELD_GET(ARMV8_PMU_PMCR_N, armv8pmu_pmcr_read())); /* Add the CPU cycles counter */ - cpu_pmu->num_events += 1; + set_bit(ARMV8_IDX_CYCLE_COUNTER, cpu_pmu->cntr_mask); pmceid[0] = pmceid_raw[0] = read_pmceid0(); pmceid[1] = pmceid_raw[1] = read_pmceid1(); diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 850f363cac87..84679844592f 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -131,7 +131,7 @@ struct arm_pmu { void (*branch_stack_add)(struct perf_event *event, struct pmu_hw_events *cpuc); void (*branch_stack_del)(struct perf_event *event, struct pmu_hw_events *cpuc); void (*branch_stack_reset)(void); - int num_events; + DECLARE_BITMAP(cntr_mask, ARMPMU_MAX_HWEVENTS); unsigned int secure_access : 1, /* 32-bit ARM only */ has_branch_stack: 1, /* 64-bit ARM only */ reserved : 30; diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h index 46377e134d67..96cbb85d19b7 100644 --- a/include/linux/perf/arm_pmuv3.h +++ b/include/linux/perf/arm_pmuv3.h @@ -6,6 +6,7 @@ #ifndef __PERF_ARM_PMUV3_H #define __PERF_ARM_PMUV3_H +#define ARMV8_PMU_MAX_GENERAL_COUNTERS 31 #define ARMV8_PMU_MAX_COUNTERS 32 #define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1) -- 2.33.0
driver inclusion category: bugfix bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 ---------------------------------------------------------------------- ARMv7 has changed the cycle counter from counter 0 to counter 31, event counters will be indexed starting from 0 instead of 1. With that change, Linux mainline support of Arm core PMU has changed the starting counter id of chained events from 2 to 0, which will conflict to the counter configuration logic of hw-metric feature since hw-metric will subtract that id by 1 to calculate the correct chicken bit for the very first event counter. If the minimum id comes to 0, subtract from it will generate an invalid index which will then triger a warning. Adapt that change of mainline by remove the subtraction, as well as the comment. Fixes: 608277cf9fc2e ("arm64: perf: Add support for HIP12 hw metric") Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- drivers/perf/arm_pmuv3.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c index ee56bde77ef5..bbf464942ffa 100644 --- a/drivers/perf/arm_pmuv3.c +++ b/drivers/perf/arm_pmuv3.c @@ -646,9 +646,6 @@ static void armv8pmu_write_evcntr(int idx, u64 value) static inline void armv8pmu_write_reload_counter(struct perf_event *event, u64 value) { - /* Need to be event->hw.idx - 1 since counter 0 is PMCCNTR_EL0 */ - int idx = event->hw.idx - 1; - #define HW_METRIC_RELOAD_CNTR(n) sys_reg(3, 3, 15, 3, (2 + n)) #define write_hw_metric_reload_cntr(_value, _n) \ do { \ @@ -671,7 +668,7 @@ static inline void armv8pmu_write_reload_counter(struct perf_event *event, event->hw.config_base, event->hw.idx); \ } \ } while (0) - write_hw_metric_reload_cntr(value, idx); + write_hw_metric_reload_cntr(value, event->hw.idx); #undef write_hw_metric_reload_cntr #undef HW_METRIC_RELOAD_CNTR } -- 2.33.0
driver inclusion category: bugfix bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 ---------------------------------------------------------------------- hw_metric events should have same pmus, otherwise events marked as hw_metric will not work properly. Check the event types and event pmus for event consistency. Fixes: 981566bcb1f5 "arm64: perf: Add support for HIP12 hw metric" Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- drivers/perf/arm_pmuv3.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c index bbf464942ffa..64e838cb2456 100644 --- a/drivers/perf/arm_pmuv3.c +++ b/drivers/perf/arm_pmuv3.c @@ -371,7 +371,10 @@ static u8 armv8pmu_event_threshold_control(struct perf_event_attr *attr) #ifdef CONFIG_HISILICON_HW_METRIC static inline bool armv8pmu_event_is_hw_metric(struct perf_event *event) { - return ATTR_CFG_GET_FLD(&event->attr, hw_metric); + bool hardware_event = event->attr.type & PERF_TYPE_HARDWARE; + bool hwmetric_event = ATTR_CFG_GET_FLD(&event->attr, hw_metric); + + return hardware_event && hwmetric_event; } static bool armpmu_support_hisi_hw_metric(void) @@ -1144,7 +1147,7 @@ static int armv8pmu_check_hw_metric_event(struct pmu_hw_events *cpuc, if (event == leader) return 0; - if (!armv8pmu_event_is_hw_metric(leader)) + if (leader->pmu != event->pmu || !armv8pmu_event_is_hw_metric(leader)) return -EINVAL; for_each_sibling_event(sibling, leader) { -- 2.33.0
driver inclusion category: bugfix bugzilla: https://atomgit.com/openeuler/kernel/issues/9066 ---------------------------------------------------------------------- Core PMU supports hardware, raw or hw_cache events at the same time, each of which is allowed to have hw_metric enabled. Restricting the types to only hardware events is not correct. Fixes: 711d99ed97c7 ("perf: Keep event consistency of hw_metric events") Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Ying Jiang <jiangying44@h-partners.com> --- drivers/perf/arm_pmuv3.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c index 64e838cb2456..d12e36b4388b 100644 --- a/drivers/perf/arm_pmuv3.c +++ b/drivers/perf/arm_pmuv3.c @@ -371,10 +371,7 @@ static u8 armv8pmu_event_threshold_control(struct perf_event_attr *attr) #ifdef CONFIG_HISILICON_HW_METRIC static inline bool armv8pmu_event_is_hw_metric(struct perf_event *event) { - bool hardware_event = event->attr.type & PERF_TYPE_HARDWARE; - bool hwmetric_event = ATTR_CFG_GET_FLD(&event->attr, hw_metric); - - return hardware_event && hwmetric_event; + return ATTR_CFG_GET_FLD(&event->attr, hw_metric); } static bool armpmu_support_hisi_hw_metric(void) -- 2.33.0
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://atomgit.com/openeuler/kernel/merge_requests/22258 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/E2K... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://atomgit.com/openeuler/kernel/merge_requests/22258 Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/E2K...
participants (2)
-
patchwork bot -
Yushan Wang