From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit f8a85334a57e7842320476ff27be3a5f151da364
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit 4419470191386456e0b8ed4eb06a70b0021798a6 upstream
Add the admin guide for Processor MMIO stale data vulnerabilities.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 Documentation/admin-guide/hw-vuln/index.rst    |   1 +
 .../hw-vuln/processor_mmio_stale_data.rst      | 246 ++++++++++++++++++
 2 files changed, 247 insertions(+)
 create mode 100644 Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst index ca4dbdd9016d..2adec1e6520a 100644 --- a/Documentation/admin-guide/hw-vuln/index.rst +++ b/Documentation/admin-guide/hw-vuln/index.rst @@ -15,3 +15,4 @@ are configurable at compile, boot or run time. tsx_async_abort multihit.rst special-register-buffer-data-sampling.rst + processor_mmio_stale_data.rst diff --git a/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst b/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst new file mode 100644 index 000000000000..9393c50b5afc --- /dev/null +++ b/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst @@ -0,0 +1,246 @@ +========================================= +Processor MMIO Stale Data Vulnerabilities +========================================= + +Processor MMIO Stale Data Vulnerabilities are a class of memory-mapped I/O +(MMIO) vulnerabilities that can expose data. The sequences of operations for +exposing data range from simple to very complex. Because most of the +vulnerabilities require the attacker to have access to MMIO, many environments +are not affected. System environments using virtualization where MMIO access is +provided to untrusted guests may need mitigation. These vulnerabilities are +not transient execution attacks. However, these vulnerabilities may propagate +stale data into core fill buffers where the data can subsequently be inferred +by an unmitigated transient execution attack. Mitigation for these +vulnerabilities includes a combination of microcode update and software +changes, depending on the platform and usage model. Some of these mitigations +are similar to those used to mitigate Microarchitectural Data Sampling (MDS) or +those used to mitigate Special Register Buffer Data Sampling (SRBDS). + +Data Propagators +================ +Propagators are operations that result in stale data being copied or moved from +one microarchitectural buffer or register to another. Processor MMIO Stale Data +Vulnerabilities are operations that may result in stale data being directly +read into an architectural, software-visible state or sampled from a buffer or +register. + +Fill Buffer Stale Data Propagator (FBSDP) +----------------------------------------- +Stale data may propagate from fill buffers (FB) into the non-coherent portion +of the uncore on some non-coherent writes. Fill buffer propagation by itself +does not make stale data architecturally visible. Stale data must be propagated +to a location where it is subject to reading or sampling. + +Sideband Stale Data Propagator (SSDP) +------------------------------------- +The sideband stale data propagator (SSDP) is limited to the client (including +Intel Xeon server E3) uncore implementation. The sideband response buffer is +shared by all client cores. For non-coherent reads that go to sideband +destinations, the uncore logic returns 64 bytes of data to the core, including +both requested data and unrequested stale data, from a transaction buffer and +the sideband response buffer. As a result, stale data from the sideband +response and transaction buffers may now reside in a core fill buffer. + +Primary Stale Data Propagator (PSDP) +------------------------------------ +The primary stale data propagator (PSDP) is limited to the client (including +Intel Xeon server E3) uncore implementation. Similar to the sideband response +buffer, the primary response buffer is shared by all client cores. 
For some +processors, MMIO primary reads will return 64 bytes of data to the core fill +buffer including both requested data and unrequested stale data. This is +similar to the sideband stale data propagator. + +Vulnerabilities +=============== +Device Register Partial Write (DRPW) (CVE-2022-21166) +----------------------------------------------------- +Some endpoint MMIO registers incorrectly handle writes that are smaller than +the register size. Instead of aborting the write or only copying the correct +subset of bytes (for example, 2 bytes for a 2-byte write), more bytes than +specified by the write transaction may be written to the register. On +processors affected by FBSDP, this may expose stale data from the fill buffers +of the core that created the write transaction. + +Shared Buffers Data Sampling (SBDS) (CVE-2022-21125) +---------------------------------------------------- +After propagators may have moved data around the uncore and copied stale data +into client core fill buffers, processors affected by MFBDS can leak data from +the fill buffer. It is limited to the client (including Intel Xeon server E3) +uncore implementation. + +Shared Buffers Data Read (SBDR) (CVE-2022-21123) +------------------------------------------------ +It is similar to Shared Buffer Data Sampling (SBDS) except that the data is +directly read into the architectural software-visible state. It is limited to +the client (including Intel Xeon server E3) uncore implementation. + +Affected Processors +=================== +Not all the CPUs are affected by all the variants. For instance, most +processors for the server market (excluding Intel Xeon E3 processors) are +impacted by only Device Register Partial Write (DRPW). + +Below is the list of affected Intel processors [#f1]_: + + =================== ============ ========= + Common name Family_Model Steppings + =================== ============ ========= + HASWELL_X 06_3FH 2,4 + SKYLAKE_L 06_4EH 3 + BROADWELL_X 06_4FH All + SKYLAKE_X 06_55H 3,4,6,7,11 + BROADWELL_D 06_56H 3,4,5 + SKYLAKE 06_5EH 3 + ICELAKE_X 06_6AH 4,5,6 + ICELAKE_D 06_6CH 1 + ICELAKE_L 06_7EH 5 + ATOM_TREMONT_D 06_86H All + LAKEFIELD 06_8AH 1 + KABYLAKE_L 06_8EH 9 to 12 + ATOM_TREMONT 06_96H 1 + ATOM_TREMONT_L 06_9CH 0 + KABYLAKE 06_9EH 9 to 13 + COMETLAKE 06_A5H 2,3,5 + COMETLAKE_L 06_A6H 0,1 + ROCKETLAKE 06_A7H 1 + =================== ============ ========= + +If a CPU is in the affected processor list, but not affected by a variant, it +is indicated by new bits in MSR IA32_ARCH_CAPABILITIES. As described in a later +section, mitigation largely remains the same for all the variants, i.e. to +clear the CPU fill buffers via VERW instruction. + +New bits in MSRs +================ +Newer processors and microcode update on existing affected processors added new +bits to IA32_ARCH_CAPABILITIES MSR. These bits can be used to enumerate +specific variants of Processor MMIO Stale Data vulnerabilities and mitigation +capability. + +MSR IA32_ARCH_CAPABILITIES +-------------------------- +Bit 13 - SBDR_SSDP_NO - When set, processor is not affected by either the + Shared Buffers Data Read (SBDR) vulnerability or the sideband stale + data propagator (SSDP). +Bit 14 - FBSDP_NO - When set, processor is not affected by the Fill Buffer + Stale Data Propagator (FBSDP). +Bit 15 - PSDP_NO - When set, processor is not affected by Primary Stale Data + Propagator (PSDP). +Bit 17 - FB_CLEAR - When set, VERW instruction will overwrite CPU fill buffer + values as part of MD_CLEAR operations. 
Processors that do not + enumerate MDS_NO (meaning they are affected by MDS) but that do + enumerate support for both L1D_FLUSH and MD_CLEAR implicitly enumerate + FB_CLEAR as part of their MD_CLEAR support. +Bit 18 - FB_CLEAR_CTRL - Processor supports read and write to MSR + IA32_MCU_OPT_CTRL[FB_CLEAR_DIS]. On such processors, the FB_CLEAR_DIS + bit can be set to cause the VERW instruction to not perform the + FB_CLEAR action. Not all processors that support FB_CLEAR will support + FB_CLEAR_CTRL. + +MSR IA32_MCU_OPT_CTRL +--------------------- +Bit 3 - FB_CLEAR_DIS - When set, VERW instruction does not perform the FB_CLEAR +action. This may be useful to reduce the performance impact of FB_CLEAR in +cases where system software deems it warranted (for example, when performance +is more critical, or the untrusted software has no MMIO access). Note that +FB_CLEAR_DIS has no impact on enumeration (for example, it does not change +FB_CLEAR or MD_CLEAR enumeration) and it may not be supported on all processors +that enumerate FB_CLEAR. + +Mitigation +========== +Like MDS, all variants of Processor MMIO Stale Data vulnerabilities have the +same mitigation strategy to force the CPU to clear the affected buffers before +an attacker can extract the secrets. + +This is achieved by using the otherwise unused and obsolete VERW instruction in +combination with a microcode update. The microcode clears the affected CPU +buffers when the VERW instruction is executed. + +Kernel reuses the MDS function to invoke the buffer clearing: + + mds_clear_cpu_buffers() + +On MDS affected CPUs, the kernel already invokes CPU buffer clear on +kernel/userspace, hypervisor/guest and C-state (idle) transitions. No +additional mitigation is needed on such CPUs. + +For CPUs not affected by MDS or TAA, mitigation is needed only for the attacker +with MMIO capability. Therefore, VERW is not required for kernel/userspace. For +virtualization case, VERW is only needed at VMENTER for a guest with MMIO +capability. + +Mitigation points +----------------- +Return to user space +^^^^^^^^^^^^^^^^^^^^ +Same mitigation as MDS when affected by MDS/TAA, otherwise no mitigation +needed. + +C-State transition +^^^^^^^^^^^^^^^^^^ +Control register writes by CPU during C-state transition can propagate data +from fill buffer to uncore buffers. Execute VERW before C-state transition to +clear CPU fill buffers. + +Guest entry point +^^^^^^^^^^^^^^^^^ +Same mitigation as MDS when processor is also affected by MDS/TAA, otherwise +execute VERW at VMENTER only for MMIO capable guests. On CPUs not affected by +MDS/TAA, guest without MMIO access cannot extract secrets using Processor MMIO +Stale Data vulnerabilities, so there is no need to execute VERW for such guests. + +Mitigation control on the kernel command line +--------------------------------------------- +The kernel command line allows to control the Processor MMIO Stale Data +mitigations at boot time with the option "mmio_stale_data=". The valid +arguments for this option are: + + ========== ================================================================= + full If the CPU is vulnerable, enable mitigation; CPU buffer clearing + on exit to userspace and when entering a VM. Idle transitions are + protected as well. It does not automatically disable SMT. + full,nosmt Same as full, with SMT disabled on vulnerable CPUs. This is the + complete mitigation. + off Disables mitigation completely. 
+ ========== ================================================================= + +If the CPU is affected and mmio_stale_data=off is not supplied on the kernel +command line, then the kernel selects the appropriate mitigation. + +Mitigation status information +----------------------------- +The Linux kernel provides a sysfs interface to enumerate the current +vulnerability status of the system: whether the system is vulnerable, and +which mitigations are active. The relevant sysfs file is: + + /sys/devices/system/cpu/vulnerabilities/mmio_stale_data + +The possible values in this file are: + + .. list-table:: + + * - 'Not affected' + - The processor is not vulnerable + * - 'Vulnerable' + - The processor is vulnerable, but no mitigation enabled + * - 'Vulnerable: Clear CPU buffers attempted, no microcode' + - The processor is vulnerable, but microcode is not updated. The + mitigation is enabled on a best effort basis. + * - 'Mitigation: Clear CPU buffers' + - The processor is vulnerable and the CPU buffer clearing mitigation is + enabled. + +If the processor is vulnerable then the following information is appended to +the above information: + + ======================== =========================================== + 'SMT vulnerable' SMT is enabled + 'SMT disabled' SMT is disabled + 'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown + ======================== =========================================== + +References +---------- +.. [#f1] Affected Processors + https://www.intel.com/content/www/us/en/developer/topic-technology/software-...
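As background for the "Kernel reuses the MDS function" note in the guide above, the VERW helper looks roughly like the sketch below (modeled on mds_clear_cpu_buffers() in arch/x86/include/asm/nospec-branch.h; illustrative, not a verbatim copy):

  static __always_inline void mds_clear_cpu_buffers(void)
  {
          /*
           * VERW with a memory operand triggers the microcode-assisted
           * CPU buffer clearing; any valid, readable selector works.
           */
          static const u16 ds = __KERNEL_DS;

          asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
  }

The MMIO Stale Data mitigation reuses this helper unchanged; only the call sites (return to userspace, idle entry, VMENTER) differ.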
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit e66310bc96b74ed3df9993e5d835ef3084d62048
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit 51802186158c74a0304f51ab963e7c2b3a2b046f upstream
Processor MMIO Stale Data is a class of vulnerabilities that may expose data after an MMIO operation. For more details, please refer to Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst.

Add the Processor MMIO Stale Data bug enumeration. A microcode update adds new bits to the MSR IA32_ARCH_CAPABILITIES; define them.
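Condensed, the enumeration decision these bits feed into looks roughly like this (a sketch of the check added to cpu_set_bug_bits() below; no new logic):

  u64 ia32_cap = x86_read_arch_cap_msr();

  /*
   * Force the bug bit when the CPU is on the affected list and the
   * microcode/VMM does not set all three *_NO immunity bits.
   */
  if (cpu_matches(cpu_vuln_blacklist, MMIO) &&
      !arch_cap_mmio_immune(ia32_cap))
          setup_force_cpu_bug(X86_BUG_MMIO_STALE_DATA);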
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/include/asm/cpufeatures.h       |  1 +
 arch/x86/include/asm/msr-index.h         | 19 +++++++++++
 arch/x86/kernel/cpu/common.c             | 43 ++++++++++++++++++++++--
 tools/arch/x86/include/asm/cpufeatures.h |  1 +
 tools/arch/x86/include/asm/msr-index.h   | 19 +++++++++++
 5 files changed, 81 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 1e9b13636f17..4da419226377 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -419,5 +419,6 @@ #define X86_BUG_TAA X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */ #define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */ #define X86_BUG_SRBDS X86_BUG(24) /* CPU may leak RNG bits if not mitigated */ +#define X86_BUG_MMIO_STALE_DATA X86_BUG(25) /* CPU is affected by Processor MMIO Stale Data vulnerabilities */
#endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 5de2040b73a7..de883d3520d8 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -114,6 +114,25 @@ * Not susceptible to * TSX Async Abort (TAA) vulnerabilities. */ +#define ARCH_CAP_SBDR_SSDP_NO BIT(13) /* + * Not susceptible to SBDR and SSDP + * variants of Processor MMIO stale data + * vulnerabilities. + */ +#define ARCH_CAP_FBSDP_NO BIT(14) /* + * Not susceptible to FBSDP variant of + * Processor MMIO stale data + * vulnerabilities. + */ +#define ARCH_CAP_PSDP_NO BIT(15) /* + * Not susceptible to PSDP variant of + * Processor MMIO stale data + * vulnerabilities. + */ +#define ARCH_CAP_FB_CLEAR BIT(17) /* + * VERW clears CPU fill buffer + * even on MDS_NO CPUs. + */
#define MSR_IA32_FLUSH_CMD 0x0000010b #define L1D_FLUSH BIT(0) /* diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 9c8fc6f513ed..d2e11cc5bd01 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1098,18 +1098,39 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { X86_FEATURE_ANY, issues)
#define SRBDS BIT(0) +/* CPU is affected by X86_BUG_MMIO_STALE_DATA */ +#define MMIO BIT(1)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS), VULNBL_INTEL_STEPPINGS(HASWELL, X86_STEPPING_ANY, SRBDS), VULNBL_INTEL_STEPPINGS(HASWELL_L, X86_STEPPING_ANY, SRBDS), VULNBL_INTEL_STEPPINGS(HASWELL_G, X86_STEPPING_ANY, SRBDS), + VULNBL_INTEL_STEPPINGS(HASWELL_X, BIT(2) | BIT(4), MMIO), + VULNBL_INTEL_STEPPINGS(BROADWELL_D, X86_STEPPINGS(0x3, 0x5), MMIO), VULNBL_INTEL_STEPPINGS(BROADWELL_G, X86_STEPPING_ANY, SRBDS), + VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO), VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS), + VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPINGS(0x3, 0x3), SRBDS | MMIO), VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, SRBDS), + VULNBL_INTEL_STEPPINGS(SKYLAKE_X, BIT(3) | BIT(4) | BIT(6) | + BIT(7) | BIT(0xB), MMIO), + VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPINGS(0x3, 0x3), SRBDS | MMIO), VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, SRBDS), - VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x0, 0xC), SRBDS), - VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0xD), SRBDS), + VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x9, 0xC), SRBDS | MMIO), + VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x0, 0x8), SRBDS), + VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x9, 0xD), SRBDS | MMIO), + VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0x8), SRBDS), + VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPINGS(0x5, 0x5), MMIO), + VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPINGS(0x1, 0x1), MMIO), + VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPINGS(0x4, 0x6), MMIO), + VULNBL_INTEL_STEPPINGS(COMETLAKE, BIT(2) | BIT(3) | BIT(5), MMIO), + VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x1), MMIO), + VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPINGS(0x1, 0x1), MMIO), + VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPINGS(0x1, 0x1), MMIO), + VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPINGS(0x1, 0x1), MMIO), + VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO), + VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPINGS(0x0, 0x0), MMIO), {} };
@@ -1130,6 +1151,13 @@ u64 x86_read_arch_cap_msr(void) return ia32_cap; }
+static bool arch_cap_mmio_immune(u64 ia32_cap) +{ + return (ia32_cap & ARCH_CAP_FBSDP_NO && + ia32_cap & ARCH_CAP_PSDP_NO && + ia32_cap & ARCH_CAP_SBDR_SSDP_NO); +} + static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) { u64 ia32_cap = x86_read_arch_cap_msr(); @@ -1189,6 +1217,17 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) cpu_matches(cpu_vuln_blacklist, SRBDS)) setup_force_cpu_bug(X86_BUG_SRBDS);
+ /* + * Processor MMIO Stale Data bug enumeration + * + * Affected CPU list is generally enough to enumerate the vulnerability, + * but for virtualization case check for ARCH_CAP MSR bits also, VMM may + * not want the guest to enumerate the bug. + */ + if (cpu_matches(cpu_vuln_blacklist, MMIO) && + !arch_cap_mmio_immune(ia32_cap)) + setup_force_cpu_bug(X86_BUG_MMIO_STALE_DATA); + if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return;
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index b58730cc12e8..a7b5c5efcf3b 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -417,5 +417,6 @@ #define X86_BUG_TAA X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */ #define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */ #define X86_BUG_SRBDS X86_BUG(24) /* CPU may leak RNG bits if not mitigated */ +#define X86_BUG_MMIO_STALE_DATA X86_BUG(25) /* CPU is affected by Processor MMIO Stale Data vulnerabilities */
#endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index c36a083c8ec0..8e343fc95ae6 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -114,6 +114,25 @@ * Not susceptible to * TSX Async Abort (TAA) vulnerabilities. */ +#define ARCH_CAP_SBDR_SSDP_NO BIT(13) /* + * Not susceptible to SBDR and SSDP + * variants of Processor MMIO stale data + * vulnerabilities. + */ +#define ARCH_CAP_FBSDP_NO BIT(14) /* + * Not susceptible to FBSDP variant of + * Processor MMIO stale data + * vulnerabilities. + */ +#define ARCH_CAP_PSDP_NO BIT(15) /* + * Not susceptible to PSDP variant of + * Processor MMIO stale data + * vulnerabilities. + */ +#define ARCH_CAP_FB_CLEAR BIT(17) /* + * VERW clears CPU fill buffer + * even on MDS_NO CPUs. + */
#define MSR_IA32_FLUSH_CMD 0x0000010b #define L1D_FLUSH BIT(0) /*
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit f83d4e5be4a3955a6c8af61ecec0934d0ece40c0
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit f52ea6c26953fed339aa4eae717ee5c2133c7ff2 upstream
The Processor MMIO Stale Data mitigation works like the MDS and TAA mitigations. In preparation for adding it, add a common function to update all mitigations that depend on MD_CLEAR.
[ bp: Add a newline in md_clear_update_mitigation() to separate statements better. ]
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/kernel/cpu/bugs.c | 59 +++++++++++++++++++++-----------------
 1 file changed, 33 insertions(+), 26 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 78b9514a3844..37fabd29a8a7 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -41,7 +41,7 @@ static void __init spectre_v2_select_mitigation(void); static void __init ssb_select_mitigation(void); static void __init l1tf_select_mitigation(void); static void __init mds_select_mitigation(void); -static void __init mds_print_mitigation(void); +static void __init md_clear_update_mitigation(void); static void __init taa_select_mitigation(void); static void __init srbds_select_mitigation(void);
@@ -114,10 +114,10 @@ void __init check_bugs(void) srbds_select_mitigation();
/* - * As MDS and TAA mitigations are inter-related, print MDS - * mitigation until after TAA mitigation selection is done. + * As MDS and TAA mitigations are inter-related, update and print their + * mitigation after TAA mitigation selection is done. */ - mds_print_mitigation(); + md_clear_update_mitigation();
arch_smt_update();
@@ -258,14 +258,6 @@ static void __init mds_select_mitigation(void) } }
-static void __init mds_print_mitigation(void) -{ - if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off()) - return; - - pr_info("%s\n", mds_strings[mds_mitigation]); -} - static int __init mds_cmdline(char *str) { if (!boot_cpu_has_bug(X86_BUG_MDS)) @@ -320,7 +312,7 @@ static void __init taa_select_mitigation(void) /* TSX previously disabled by tsx=off */ if (!boot_cpu_has(X86_FEATURE_RTM)) { taa_mitigation = TAA_MITIGATION_TSX_DISABLED; - goto out; + return; }
if (cpu_mitigations_off()) { @@ -334,7 +326,7 @@ static void __init taa_select_mitigation(void) */ if (taa_mitigation == TAA_MITIGATION_OFF && mds_mitigation == MDS_MITIGATION_OFF) - goto out; + return;
if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) taa_mitigation = TAA_MITIGATION_VERW; @@ -366,18 +358,6 @@ static void __init taa_select_mitigation(void)
if (taa_nosmt || cpu_mitigations_auto_nosmt()) cpu_smt_disable(false); - - /* - * Update MDS mitigation, if necessary, as the mds_user_clear is - * now enabled for TAA mitigation. - */ - if (mds_mitigation == MDS_MITIGATION_OFF && - boot_cpu_has_bug(X86_BUG_MDS)) { - mds_mitigation = MDS_MITIGATION_FULL; - mds_select_mitigation(); - } -out: - pr_info("%s\n", taa_strings[taa_mitigation]); }
static int __init tsx_async_abort_parse_cmdline(char *str) @@ -401,6 +381,33 @@ static int __init tsx_async_abort_parse_cmdline(char *str) } early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
+#undef pr_fmt +#define pr_fmt(fmt) "" fmt + +static void __init md_clear_update_mitigation(void) +{ + if (cpu_mitigations_off()) + return; + + if (!static_key_enabled(&mds_user_clear)) + goto out; + + /* + * mds_user_clear is now enabled. Update MDS mitigation, if + * necessary. + */ + if (mds_mitigation == MDS_MITIGATION_OFF && + boot_cpu_has_bug(X86_BUG_MDS)) { + mds_mitigation = MDS_MITIGATION_FULL; + mds_select_mitigation(); + } +out: + if (boot_cpu_has_bug(X86_BUG_MDS)) + pr_info("MDS: %s\n", mds_strings[mds_mitigation]); + if (boot_cpu_has_bug(X86_BUG_TAA)) + pr_info("TAA: %s\n", taa_strings[taa_mitigation]); +} + #undef pr_fmt #define pr_fmt(fmt) "SRBDS: " fmt
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit 26f6f231f6a5a79ccc274967939b22602dec76e8
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit 8cb861e9e3c9a55099ad3d08e1a3b653d29c33ca upstream
Processor MMIO Stale Data is a class of vulnerabilities that may expose data after an MMIO operation. For details, please refer to Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst.
These vulnerabilities are broadly categorized as:
Device Register Partial Write (DRPW): Some endpoint MMIO registers incorrectly handle writes that are smaller than the register size. Instead of aborting the write or only copying the correct subset of bytes (for example, 2 bytes for a 2-byte write), more bytes than specified by the write transaction may be written to the register. On some processors, this may expose stale data from the fill buffers of the core that created the write transaction.
Shared Buffers Data Sampling (SBDS): After propagators may have moved data around the uncore and copied stale data into client core fill buffers, processors affected by MFBDS can leak data from the fill buffer.
Shared Buffers Data Read (SBDR): It is similar to Shared Buffer Data Sampling (SBDS) except that the data is directly read into the architectural software-visible state.
An attacker can use these vulnerabilities to extract data from CPU fill buffers using MDS and TAA methods. Mitigate it by clearing the CPU fill buffers using the VERW instruction before returning to a user or a guest.
On CPUs not affected by MDS and TAA, a user application cannot sample data from CPU fill buffers using MDS or TAA. A guest with MMIO access can still use DRPW or SBDR to extract data architecturally. Mitigate it with the VERW instruction to clear the fill buffers before VMENTER for MMIO-capable guests.
Add a kernel parameter mmio_stale_data={off|full|full,nosmt} to control the mitigation.
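As a usage illustration (the GRUB_CMDLINE_LINUX variable belongs to a GRUB-based setup and is only an assumed example, not part of this patch):

  # /etc/default/grub
  GRUB_CMDLINE_LINUX="... mmio_stale_data=full,nosmt"

followed by regenerating the grub config and rebooting. Not specifying the option at all is equivalent to mmio_stale_data=full.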
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 .../admin-guide/kernel-parameters.txt |  36 ++++++
 arch/x86/include/asm/nospec-branch.h  |   2 +
 arch/x86/kernel/cpu/bugs.c            | 111 +++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c                |   3 +
 4 files changed, 148 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 5ac05b4ed804..fa355ee14164 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2978,6 +2978,7 @@ kvm.nx_huge_pages=off [X86] no_entry_flush [PPC] no_uaccess_flush [PPC] + mmio_stale_data=off [X86]
Exceptions: This does not have any effect on @@ -2999,6 +3000,7 @@ Equivalent to: l1tf=flush,nosmt [X86] mds=full,nosmt [X86] tsx_async_abort=full,nosmt [X86] + mmio_stale_data=full,nosmt [X86]
mminit_loglevel= [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this @@ -3008,6 +3010,40 @@ log everything. Information is printed at KERN_DEBUG so loglevel=8 may also need to be specified.
+ mmio_stale_data= + [X86,INTEL] Control mitigation for the Processor + MMIO Stale Data vulnerabilities. + + Processor MMIO Stale Data is a class of + vulnerabilities that may expose data after an MMIO + operation. Exposed data could originate or end in + the same CPU buffers as affected by MDS and TAA. + Therefore, similar to MDS and TAA, the mitigation + is to clear the affected CPU buffers. + + This parameter controls the mitigation. The + options are: + + full - Enable mitigation on vulnerable CPUs + + full,nosmt - Enable mitigation and disable SMT on + vulnerable CPUs. + + off - Unconditionally disable mitigation + + On MDS or TAA affected machines, + mmio_stale_data=off can be prevented by an active + MDS or TAA mitigation as these vulnerabilities are + mitigated with the same mechanism so in order to + disable this mitigation, you need to specify + mds=off and tsx_async_abort=off too. + + Not specifying this option is equivalent to + mmio_stale_data=full. + + For details see: + Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst + module.sig_enforce [KNL] When CONFIG_MODULE_SIG is set, this means that modules without (valid) signatures will fail to load. diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 4d0f5386e637..e247151c3dcf 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -255,6 +255,8 @@ DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb); DECLARE_STATIC_KEY_FALSE(mds_user_clear); DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
+DECLARE_STATIC_KEY_FALSE(mmio_stale_data_clear); + #include <asm/segment.h>
/** diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 37fabd29a8a7..284a80f48627 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -43,6 +43,7 @@ static void __init l1tf_select_mitigation(void); static void __init mds_select_mitigation(void); static void __init md_clear_update_mitigation(void); static void __init taa_select_mitigation(void); +static void __init mmio_select_mitigation(void); static void __init srbds_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR that always has to be preserved. */ @@ -77,6 +78,10 @@ EXPORT_SYMBOL_GPL(mds_user_clear); DEFINE_STATIC_KEY_FALSE(mds_idle_clear); EXPORT_SYMBOL_GPL(mds_idle_clear);
+/* Controls CPU Fill buffer clear before KVM guest MMIO accesses */ +DEFINE_STATIC_KEY_FALSE(mmio_stale_data_clear); +EXPORT_SYMBOL_GPL(mmio_stale_data_clear); + void __init check_bugs(void) { identify_boot_cpu(); @@ -111,11 +116,13 @@ void __init check_bugs(void) l1tf_select_mitigation(); mds_select_mitigation(); taa_select_mitigation(); + mmio_select_mitigation(); srbds_select_mitigation();
/* - * As MDS and TAA mitigations are inter-related, update and print their - * mitigation after TAA mitigation selection is done. + * As MDS, TAA and MMIO Stale Data mitigations are inter-related, update + * and print their mitigation after MDS, TAA and MMIO Stale Data + * mitigation selection is done. */ md_clear_update_mitigation();
@@ -381,6 +388,90 @@ static int __init tsx_async_abort_parse_cmdline(char *str) } early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
+#undef pr_fmt +#define pr_fmt(fmt) "MMIO Stale Data: " fmt + +enum mmio_mitigations { + MMIO_MITIGATION_OFF, + MMIO_MITIGATION_UCODE_NEEDED, + MMIO_MITIGATION_VERW, +}; + +/* Default mitigation for Processor MMIO Stale Data vulnerabilities */ +static enum mmio_mitigations mmio_mitigation __ro_after_init = MMIO_MITIGATION_VERW; +static bool mmio_nosmt __ro_after_init = false; + +static const char * const mmio_strings[] = { + [MMIO_MITIGATION_OFF] = "Vulnerable", + [MMIO_MITIGATION_UCODE_NEEDED] = "Vulnerable: Clear CPU buffers attempted, no microcode", + [MMIO_MITIGATION_VERW] = "Mitigation: Clear CPU buffers", +}; + +static void __init mmio_select_mitigation(void) +{ + u64 ia32_cap; + + if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA) || + cpu_mitigations_off()) { + mmio_mitigation = MMIO_MITIGATION_OFF; + return; + } + + if (mmio_mitigation == MMIO_MITIGATION_OFF) + return; + + ia32_cap = x86_read_arch_cap_msr(); + + /* + * Enable CPU buffer clear mitigation for host and VMM, if also affected + * by MDS or TAA. Otherwise, enable mitigation for VMM only. + */ + if (boot_cpu_has_bug(X86_BUG_MDS) || (boot_cpu_has_bug(X86_BUG_TAA) && + boot_cpu_has(X86_FEATURE_RTM))) + static_branch_enable(&mds_user_clear); + else + static_branch_enable(&mmio_stale_data_clear); + + /* + * Check if the system has the right microcode. + * + * CPU Fill buffer clear mitigation is enumerated by either an explicit + * FB_CLEAR or by the presence of both MD_CLEAR and L1D_FLUSH on MDS + * affected systems. + */ + if ((ia32_cap & ARCH_CAP_FB_CLEAR) || + (boot_cpu_has(X86_FEATURE_MD_CLEAR) && + boot_cpu_has(X86_FEATURE_FLUSH_L1D) && + !(ia32_cap & ARCH_CAP_MDS_NO))) + mmio_mitigation = MMIO_MITIGATION_VERW; + else + mmio_mitigation = MMIO_MITIGATION_UCODE_NEEDED; + + if (mmio_nosmt || cpu_mitigations_auto_nosmt()) + cpu_smt_disable(false); +} + +static int __init mmio_stale_data_parse_cmdline(char *str) +{ + if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) + return 0; + + if (!str) + return -EINVAL; + + if (!strcmp(str, "off")) { + mmio_mitigation = MMIO_MITIGATION_OFF; + } else if (!strcmp(str, "full")) { + mmio_mitigation = MMIO_MITIGATION_VERW; + } else if (!strcmp(str, "full,nosmt")) { + mmio_mitigation = MMIO_MITIGATION_VERW; + mmio_nosmt = true; + } + + return 0; +} +early_param("mmio_stale_data", mmio_stale_data_parse_cmdline); + #undef pr_fmt #define pr_fmt(fmt) "" fmt
@@ -393,19 +484,31 @@ static void __init md_clear_update_mitigation(void) goto out;
/* - * mds_user_clear is now enabled. Update MDS mitigation, if - * necessary. + * mds_user_clear is now enabled. Update MDS, TAA and MMIO Stale Data + * mitigation, if necessary. */ if (mds_mitigation == MDS_MITIGATION_OFF && boot_cpu_has_bug(X86_BUG_MDS)) { mds_mitigation = MDS_MITIGATION_FULL; mds_select_mitigation(); } + if (taa_mitigation == TAA_MITIGATION_OFF && + boot_cpu_has_bug(X86_BUG_TAA)) { + taa_mitigation = TAA_MITIGATION_VERW; + taa_select_mitigation(); + } + if (mmio_mitigation == MMIO_MITIGATION_OFF && + boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) { + mmio_mitigation = MMIO_MITIGATION_VERW; + mmio_select_mitigation(); + } out: if (boot_cpu_has_bug(X86_BUG_MDS)) pr_info("MDS: %s\n", mds_strings[mds_mitigation]); if (boot_cpu_has_bug(X86_BUG_TAA)) pr_info("TAA: %s\n", taa_strings[taa_mitigation]); + if (boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) + pr_info("MMIO Stale Data: %s\n", mmio_strings[mmio_mitigation]); }
#undef pr_fmt diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index e208e54f5cad..ba6a34ab9247 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6710,6 +6710,9 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu, vmx_l1d_flush(vcpu); else if (static_branch_unlikely(&mds_user_clear)) mds_clear_cpu_buffers(); + else if (static_branch_unlikely(&mmio_stale_data_clear) && + kvm_arch_has_assigned_device(vcpu->kvm)) + mds_clear_cpu_buffers();
if (vcpu->arch.cr2 != native_read_cr2()) native_write_cr2(vcpu->arch.cr2);
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit 56f0bca5e9c8456b7bb7089cbb6de866a9ba6da9
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit e5925fb867290ee924fcf2fe3ca887b792714366 upstream
MDS, TAA and Processor MMIO Stale Data mitigations rely on clearing CPU buffers. Moreover, the status of these mitigations affects the others. During boot, it is important to maintain the order in which these mitigations are selected. This is especially true for md_clear_update_mitigation(), which needs to be called after MDS, TAA and Processor MMIO Stale Data mitigation selection is done.

Introduce md_clear_select_mitigation(), and select all these mitigations from there. This reflects the relationships between these mitigations and ensures proper ordering.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/kernel/cpu/bugs.c | 26 ++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 284a80f48627..dc3b8b434fdc 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -42,6 +42,7 @@ static void __init ssb_select_mitigation(void); static void __init l1tf_select_mitigation(void); static void __init mds_select_mitigation(void); static void __init md_clear_update_mitigation(void); +static void __init md_clear_select_mitigation(void); static void __init taa_select_mitigation(void); static void __init mmio_select_mitigation(void); static void __init srbds_select_mitigation(void); @@ -114,18 +115,9 @@ void __init check_bugs(void) spectre_v2_select_mitigation(); ssb_select_mitigation(); l1tf_select_mitigation(); - mds_select_mitigation(); - taa_select_mitigation(); - mmio_select_mitigation(); + md_clear_select_mitigation(); srbds_select_mitigation();
- /* - * As MDS, TAA and MMIO Stale Data mitigations are inter-related, update - * and print their mitigation after MDS, TAA and MMIO Stale Data - * mitigation selection is done. - */ - md_clear_update_mitigation(); - arch_smt_update();
#ifdef CONFIG_X86_32 @@ -511,6 +503,20 @@ static void __init md_clear_update_mitigation(void) pr_info("MMIO Stale Data: %s\n", mmio_strings[mmio_mitigation]); }
+static void __init md_clear_select_mitigation(void) +{ + mds_select_mitigation(); + taa_select_mitigation(); + mmio_select_mitigation(); + + /* + * As MDS, TAA and MMIO Stale Data mitigations are inter-related, update + * and print their mitigation after MDS, TAA and MMIO Stale Data + * mitigation selection is done. + */ + md_clear_update_mitigation(); +} + #undef pr_fmt #define pr_fmt(fmt) "SRBDS: " fmt
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit 3eb1180564fa0ecedc33b44029da7687c0a9fbf5
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit 99a83db5a605137424e1efe29dc0573d6a5b6316 upstream
When the CPU is affected by Processor MMIO Stale Data vulnerabilities, the Fill Buffer Stale Data Propagator (FBSDP) can propagate stale data out of the fill buffer to an uncore buffer when the CPU goes idle. Stale data can then be exploited with other variants using MMIO operations.

Mitigate it by clearing the fill buffer before entering an idle state.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/kernel/cpu/bugs.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index dc3b8b434fdc..a39019760d9e 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -424,6 +424,14 @@ static void __init mmio_select_mitigation(void) else static_branch_enable(&mmio_stale_data_clear);
+ /* + * If Processor-MMIO-Stale-Data bug is present and Fill Buffer data can + * be propagated to uncore buffers, clearing the Fill buffers on idle + * is required irrespective of SMT state. + */ + if (!(ia32_cap & ARCH_CAP_FBSDP_NO)) + static_branch_enable(&mds_idle_clear); + /* * Check if the system has the right microcode. * @@ -1188,6 +1196,8 @@ static void update_indir_branch_cond(void) /* Update the static key controlling the MDS CPU buffer clear in idle */ static void update_mds_branch_idle(void) { + u64 ia32_cap = x86_read_arch_cap_msr(); + /* * Enable the idle clearing if SMT is active on CPUs which are * affected only by MSBDS and not any other MDS variant. @@ -1199,10 +1209,12 @@ static void update_mds_branch_idle(void) if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY)) return;
- if (sched_smt_active()) + if (sched_smt_active()) { static_branch_enable(&mds_idle_clear); - else + } else if (mmio_mitigation == MMIO_MITIGATION_OFF || + (ia32_cap & ARCH_CAP_FBSDP_NO)) { static_branch_disable(&mds_idle_clear); + } }
#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit 001415e4e626403c9ff35f2498feb0021d0c8328
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit 8d50cdf8b8341770bc6367bce40c0c1bb0e1d5b3 upstream
Add the sysfs reporting file for Processor MMIO Stale Data vulnerability. It exposes the vulnerability and mitigation state similar to the existing files for the other hardware vulnerabilities.
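For example, on an affected host with the VERW mitigation active and SMT on, reading the file would show (illustrative output assembled from the strings added below):

  $ cat /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
  Mitigation: Clear CPU buffers; SMT vulnerable

An unaffected machine reports "Not affected" via the generic weak handler.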
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 .../ABI/testing/sysfs-devices-system-cpu |  1 +
 arch/x86/kernel/cpu/bugs.c               | 22 +++++++++++++++++++
 drivers/base/cpu.c                       |  8 +++++++
 include/linux/cpu.h                      |  3 +++
 4 files changed, 34 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 1a04ca8162ad..44c6e5730398 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -510,6 +510,7 @@ What: /sys/devices/system/cpu/vulnerabilities /sys/devices/system/cpu/vulnerabilities/srbds /sys/devices/system/cpu/vulnerabilities/tsx_async_abort /sys/devices/system/cpu/vulnerabilities/itlb_multihit + /sys/devices/system/cpu/vulnerabilities/mmio_stale_data Date: January 2018 Contact: Linux kernel mailing list linux-kernel@vger.kernel.org Description: Information about CPU vulnerabilities diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index a39019760d9e..6108e5a294ea 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1832,6 +1832,20 @@ static ssize_t tsx_async_abort_show_state(char *buf) sched_smt_active() ? "vulnerable" : "disabled"); }
+static ssize_t mmio_stale_data_show_state(char *buf) +{ + if (mmio_mitigation == MMIO_MITIGATION_OFF) + return sysfs_emit(buf, "%s\n", mmio_strings[mmio_mitigation]); + + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { + return sysfs_emit(buf, "%s; SMT Host state unknown\n", + mmio_strings[mmio_mitigation]); + } + + return sysfs_emit(buf, "%s; SMT %s\n", mmio_strings[mmio_mitigation], + sched_smt_active() ? "vulnerable" : "disabled"); +} + static char *stibp_state(void) { if (spectre_v2_in_eibrs_mode(spectre_v2_enabled)) @@ -1932,6 +1946,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr case X86_BUG_SRBDS: return srbds_show_state(buf);
+ case X86_BUG_MMIO_STALE_DATA: + return mmio_stale_data_show_state(buf); + default: break; } @@ -1983,4 +2000,9 @@ ssize_t cpu_show_srbds(struct device *dev, struct device_attribute *attr, char * { return cpu_show_common(dev, attr, buf, X86_BUG_SRBDS); } + +ssize_t cpu_show_mmio_stale_data(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_MMIO_STALE_DATA); +} #endif diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index 8f1d6569564c..8ecb9f90f467 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -566,6 +566,12 @@ ssize_t __weak cpu_show_srbds(struct device *dev, return sysfs_emit(buf, "Not affected\n"); }
+ssize_t __weak cpu_show_mmio_stale_data(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "Not affected\n"); +} + static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); @@ -575,6 +581,7 @@ static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL); static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL); static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL); static DEVICE_ATTR(srbds, 0444, cpu_show_srbds, NULL); +static DEVICE_ATTR(mmio_stale_data, 0444, cpu_show_mmio_stale_data, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, @@ -586,6 +593,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_tsx_async_abort.attr, &dev_attr_itlb_multihit.attr, &dev_attr_srbds.attr, + &dev_attr_mmio_stale_data.attr, NULL };
diff --git a/include/linux/cpu.h b/include/linux/cpu.h index d6428aaf67e7..d63b8f70d123 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -65,6 +65,9 @@ extern ssize_t cpu_show_tsx_async_abort(struct device *dev, extern ssize_t cpu_show_itlb_multihit(struct device *dev, struct device_attribute *attr, char *buf); extern ssize_t cpu_show_srbds(struct device *dev, struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_mmio_stale_data(struct device *dev, + struct device_attribute *attr, + char *buf);
extern __printf(4, 5) struct device *cpu_device_create(struct device *parent, void *drvdata,
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit cf1c01a5e4c3e269b9211ae2ef0a57f8c9474bfc
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit 22cac9c677c95f3ac5c9244f8ca0afdc7c8afb19 upstream
Currently, Linux disables the SRBDS mitigation on CPUs that are not affected by MDS and have the TSX feature disabled. On such CPUs, secrets cannot be extracted from CPU fill buffers using MDS or TAA. Without the SRBDS mitigation, however, Processor MMIO Stale Data vulnerabilities can be used to extract RDRAND, RDSEED, and EGETKEY data.

Do not disable the SRBDS mitigation by default when the CPU is also affected by Processor MMIO Stale Data vulnerabilities.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/kernel/cpu/bugs.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 6108e5a294ea..3c3e4a466136 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -586,11 +586,13 @@ static void __init srbds_select_mitigation(void) return;
/* - * Check to see if this is one of the MDS_NO systems supporting - * TSX that are only exposed to SRBDS when TSX is enabled. + * Check to see if this is one of the MDS_NO systems supporting TSX that + * are only exposed to SRBDS when TSX is enabled or when CPU is affected + * by Processor MMIO Stale Data vulnerability. */ ia32_cap = x86_read_arch_cap_msr(); - if ((ia32_cap & ARCH_CAP_MDS_NO) && !boot_cpu_has(X86_FEATURE_RTM)) + if ((ia32_cap & ARCH_CAP_MDS_NO) && !boot_cpu_has(X86_FEATURE_RTM) && + !boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) srbds_mitigation = SRBDS_MITIGATION_TSX_OFF; else if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) srbds_mitigation = SRBDS_MITIGATION_HYPERVISOR;
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit 6df693dca31218f76c63b6fd4aa7b7db3bd6e049
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit a992b8a4682f119ae035a01b40d4d0665c4a2875 upstream
The Shared Buffers Data Sampling (SBDS) variant of Processor MMIO Stale Data vulnerabilities may expose RDRAND, RDSEED and SGX EGETKEY data. Mitigation for this is added by a microcode update.
As some of the implications of SBDS are similar to SRBDS, the SRBDS mitigation infrastructure can be leveraged by SBDS. Set X86_BUG_SRBDS and use the SRBDS mitigation.

Mitigation is enabled by default; use srbds=off to opt out. Mitigation status can be checked in the following file:
/sys/devices/system/cpu/vulnerabilities/srbds
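For example (illustrative output when the microcode-based SRBDS mitigation is active):

  $ cat /sys/devices/system/cpu/vulnerabilities/srbds
  Mitigation: Microcode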
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/kernel/cpu/common.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index d2e11cc5bd01..4917c2698ac1 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1100,6 +1100,8 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { #define SRBDS BIT(0) /* CPU is affected by X86_BUG_MMIO_STALE_DATA */ #define MMIO BIT(1) +/* CPU is affected by Shared Buffers Data Sampling (SBDS), a variant of X86_BUG_MMIO_STALE_DATA */ +#define MMIO_SBDS BIT(2)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS), @@ -1121,16 +1123,17 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x0, 0x8), SRBDS), VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x9, 0xD), SRBDS | MMIO), VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0x8), SRBDS), - VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPINGS(0x5, 0x5), MMIO), + VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPINGS(0x5, 0x5), MMIO | MMIO_SBDS), VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPINGS(0x1, 0x1), MMIO), VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPINGS(0x4, 0x6), MMIO), - VULNBL_INTEL_STEPPINGS(COMETLAKE, BIT(2) | BIT(3) | BIT(5), MMIO), - VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x1), MMIO), - VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPINGS(0x1, 0x1), MMIO), + VULNBL_INTEL_STEPPINGS(COMETLAKE, BIT(2) | BIT(3) | BIT(5), MMIO | MMIO_SBDS), + VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS), + VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO), + VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS), VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPINGS(0x1, 0x1), MMIO), - VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPINGS(0x1, 0x1), MMIO), + VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS), VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO), - VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPINGS(0x0, 0x0), MMIO), + VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPINGS(0x0, 0x0), MMIO | MMIO_SBDS), {} };
@@ -1211,10 +1214,14 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) /* * SRBDS affects CPUs which support RDRAND or RDSEED and are listed * in the vulnerability blacklist. + * + * Some of the implications and mitigation of Shared Buffers Data + * Sampling (SBDS) are similar to SRBDS. Give SBDS same treatment as + * SRBDS. */ if ((cpu_has(c, X86_FEATURE_RDRAND) || cpu_has(c, X86_FEATURE_RDSEED)) && - cpu_matches(cpu_vuln_blacklist, SRBDS)) + cpu_matches(cpu_vuln_blacklist, SRBDS | MMIO_SBDS)) setup_force_cpu_bug(X86_BUG_SRBDS);
/*
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

stable inclusion
from stable-v5.10.123
commit bde15fdcce44956278b4f50680b7363ca126ffb9
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS
CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit 027bbb884be006b05d9c577d6401686053aa789e upstream
The enumeration of MD_CLEAR in CPUID(EAX=7,ECX=0).EDX{bit 10} is not an accurate indicator on all CPUs of whether the VERW instruction will overwrite fill buffers. FB_CLEAR enumeration in IA32_ARCH_CAPABILITIES{bit 17} covers the case of CPUs that are not vulnerable to MDS/TAA, indicating that microcode does overwrite fill buffers.
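Condensed, the "does VERW actually clear fill buffers" test from mmio_select_mitigation() earlier in this series captures this (sketch only):

  /*
   * Fill buffer clearing is either enumerated explicitly (FB_CLEAR) or
   * implied by MD_CLEAR + L1D_FLUSH on MDS-affected (!MDS_NO) parts.
   */
  if ((ia32_cap & ARCH_CAP_FB_CLEAR) ||
      (boot_cpu_has(X86_FEATURE_MD_CLEAR) &&
       boot_cpu_has(X86_FEATURE_FLUSH_L1D) &&
       !(ia32_cap & ARCH_CAP_MDS_NO)))
          mmio_mitigation = MMIO_MITIGATION_VERW;
  else
          mmio_mitigation = MMIO_MITIGATION_UCODE_NEEDED;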
Guests running in VMM environments may not be aware of all the capabilities/vulnerabilities of the host CPU. Specifically, a guest may apply MDS/TAA mitigations when a virtual CPU is enumerated as vulnerable to MDS/TAA even when the physical CPU is not. On CPUs that enumerate FB_CLEAR_CTRL the VMM may set FB_CLEAR_DIS to skip overwriting of fill buffers by the VERW instruction. This is done by setting FB_CLEAR_DIS during VMENTER and resetting on VMEXIT. For guests that enumerate FB_CLEAR (explicitly asking for fill buffer clear capability) the VMM will not use FB_CLEAR_DIS.
Irrespective of guest state, the host overwrites CPU buffers before VMENTER to protect itself from an MMIO-capable guest, as part of the mitigation for MMIO Stale Data vulnerabilities.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conflicts:
	arch/x86/kvm/vmx/vmx.h
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 arch/x86/include/asm/msr-index.h       |  6 +++
 arch/x86/kvm/vmx/vmx.c                 | 69 ++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h                 |  2 +
 arch/x86/kvm/x86.c                     |  3 ++
 tools/arch/x86/include/asm/msr-index.h |  6 +++
 5 files changed, 86 insertions(+)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index de883d3520d8..2b0af5eb5131 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -133,6 +133,11 @@ * VERW clears CPU fill buffer * even on MDS_NO CPUs. */ +#define ARCH_CAP_FB_CLEAR_CTRL BIT(18) /* + * MSR_IA32_MCU_OPT_CTRL[FB_CLEAR_DIS] + * bit available to control VERW + * behavior. + */
#define MSR_IA32_FLUSH_CMD 0x0000010b #define L1D_FLUSH BIT(0) /* @@ -150,6 +155,7 @@ /* SRBDS support */ #define MSR_IA32_MCU_OPT_CTRL 0x00000123 #define RNGDS_MITG_DIS BIT(0) +#define FB_CLEAR_DIS BIT(3) /* CPU Fill buffer clear disable */
#define MSR_IA32_SYSENTER_CS 0x00000174 #define MSR_IA32_SYSENTER_ESP 0x00000175 diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index ba6a34ab9247..79889d27aa5b 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -226,6 +226,9 @@ static const struct { #define L1D_CACHE_ORDER 4 static void *vmx_l1d_flush_pages;
+/* Control for disabling CPU Fill buffer clear */ +static bool __read_mostly vmx_fb_clear_ctrl_available; + static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf) { struct page *page; @@ -357,6 +360,60 @@ static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp) return sprintf(s, "%s\n", vmentry_l1d_param[l1tf_vmx_mitigation].option); }
+static void vmx_setup_fb_clear_ctrl(void) +{ + u64 msr; + + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) && + !boot_cpu_has_bug(X86_BUG_MDS) && + !boot_cpu_has_bug(X86_BUG_TAA)) { + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr); + if (msr & ARCH_CAP_FB_CLEAR_CTRL) + vmx_fb_clear_ctrl_available = true; + } +} + +static __always_inline void vmx_disable_fb_clear(struct vcpu_vmx *vmx) +{ + u64 msr; + + if (!vmx->disable_fb_clear) + return; + + rdmsrl(MSR_IA32_MCU_OPT_CTRL, msr); + msr |= FB_CLEAR_DIS; + wrmsrl(MSR_IA32_MCU_OPT_CTRL, msr); + /* Cache the MSR value to avoid reading it later */ + vmx->msr_ia32_mcu_opt_ctrl = msr; +} + +static __always_inline void vmx_enable_fb_clear(struct vcpu_vmx *vmx) +{ + if (!vmx->disable_fb_clear) + return; + + vmx->msr_ia32_mcu_opt_ctrl &= ~FB_CLEAR_DIS; + wrmsrl(MSR_IA32_MCU_OPT_CTRL, vmx->msr_ia32_mcu_opt_ctrl); +} + +static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx) +{ + vmx->disable_fb_clear = vmx_fb_clear_ctrl_available; + + /* + * If guest will not execute VERW, there is no need to set FB_CLEAR_DIS + * at VMEntry. Skip the MSR read/write when a guest has no use case to + * execute VERW. + */ + if ((vcpu->arch.arch_capabilities & ARCH_CAP_FB_CLEAR) || + ((vcpu->arch.arch_capabilities & ARCH_CAP_MDS_NO) && + (vcpu->arch.arch_capabilities & ARCH_CAP_TAA_NO) && + (vcpu->arch.arch_capabilities & ARCH_CAP_PSDP_NO) && + (vcpu->arch.arch_capabilities & ARCH_CAP_FBSDP_NO) && + (vcpu->arch.arch_capabilities & ARCH_CAP_SBDR_SSDP_NO))) + vmx->disable_fb_clear = false; +} + static const struct kernel_param_ops vmentry_l1d_flush_ops = { .set = vmentry_l1d_flush_set, .get = vmentry_l1d_flush_get, @@ -2259,6 +2316,10 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) ret = kvm_set_msr_common(vcpu, msr_info); }
+ /* FB_CLEAR may have changed, also update the FB_CLEAR_DIS behavior */ + if (msr_index == MSR_IA32_ARCH_CAPABILITIES) + vmx_update_fb_clear_dis(vcpu, vmx); + return ret; }
@@ -4531,6 +4592,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vpid_sync_context(vmx->vpid); if (init_event) vmx_clear_hlt(vcpu); + + vmx_update_fb_clear_dis(vcpu, vmx); }
static void enable_irq_window(struct kvm_vcpu *vcpu) @@ -6714,6 +6777,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu, kvm_arch_has_assigned_device(vcpu->kvm)) mds_clear_cpu_buffers();
+ vmx_disable_fb_clear(vmx); + if (vcpu->arch.cr2 != native_read_cr2()) native_write_cr2(vcpu->arch.cr2);
@@ -6722,6 +6787,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
vcpu->arch.cr2 = native_read_cr2();
+ vmx_enable_fb_clear(vmx); + /* * VMEXIT disables interrupts (host state), but tracing and lockdep * have them in state 'on' as recorded before entering guest mode. @@ -8108,6 +8175,8 @@ static int __init vmx_init(void) return r; }
+ vmx_setup_fb_clear_ctrl(); + for_each_possible_cpu(cpu) { INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index c0b52498e4bb..05eca210a5ff 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -325,6 +325,8 @@ struct vcpu_vmx { u64 msr_ia32_feature_control; u64 msr_ia32_feature_control_valid_bits; u64 ept_pointer; + u64 msr_ia32_mcu_opt_ctrl; + bool disable_fb_clear;
struct pt_desc pt_desc; struct lbr_desc lbr_desc; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index eb6c42e40ec4..1f857bc5ac6e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1459,6 +1459,9 @@ static u64 kvm_get_arch_capabilities(void) */ }
+ /* Guests don't need to know "Fill buffer clear control" exists */ + data &= ~ARCH_CAP_FB_CLEAR_CTRL; + return data; }
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index 8e343fc95ae6..7b9259868243 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -133,6 +133,11 @@ * VERW clears CPU fill buffer * even on MDS_NO CPUs. */ +#define ARCH_CAP_FB_CLEAR_CTRL BIT(18) /* + * MSR_IA32_MCU_OPT_CTRL[FB_CLEAR_DIS] + * bit available to control VERW + * behavior. + */
#define MSR_IA32_FLUSH_CMD 0x0000010b #define L1D_FLUSH BIT(0) /* @@ -150,6 +155,7 @@ /* SRBDS support */ #define MSR_IA32_MCU_OPT_CTRL 0x00000123 #define RNGDS_MITG_DIS BIT(0) +#define FB_CLEAR_DIS BIT(3) /* CPU Fill buffer clear disable */
#define MSR_IA32_SYSENTER_CS 0x00000174 #define MSR_IA32_SYSENTER_ESP 0x00000175
From: Josh Poimboeuf jpoimboe@kernel.org
stable inclusion from stable-v5.10.123 commit aa238a92cc94a15812c0de4adade86ba8f22707a category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5D5RS CVE: CVE-2022-21123,CVE-2022-21125,CVE-2022-21166
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
--------------------------------
commit 1dc6ff02c8bf77d71b9b5d11cbc9df77cfb28626 upstream
Similar to MDS and TAA, print a warning if SMT is enabled for the MMIO Stale Data vulnerability.
Signed-off-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Yipeng Zou zouyipeng@huawei.com Reviewed-by: Zhang Jianhua chris.zjh@huawei.com Reviewed-by: Xiu Jianfeng xiujianfeng@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- arch/x86/kernel/cpu/bugs.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 3c3e4a466136..2a21046846b6 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1221,6 +1221,7 @@ static void update_mds_branch_idle(void)
#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n" #define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.h... for more details.\n" +#define MMIO_MSG_SMT "MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_st... for more details.\n"
void cpu_bugs_smt_update(void) { @@ -1265,6 +1266,16 @@ void cpu_bugs_smt_update(void) break; }
+ switch (mmio_mitigation) { + case MMIO_MITIGATION_VERW: + case MMIO_MITIGATION_UCODE_NEEDED: + if (sched_smt_active()) + pr_warn_once(MMIO_MSG_SMT); + break; + case MMIO_MITIGATION_OFF: + break; + } + mutex_unlock(&spec_ctrl_mutex); }
From: Baokun Li libaokun1@huawei.com
hulk inclusion category: bugfix bugzilla: 186866, https://gitee.com/openeuler/kernel/issues/I5DTBL CVE: NA
--------------------------------
When adding an xattr to an inode, we must ensure that the inode size is not less than EXT4_GOOD_OLD_INODE_SIZE + extra_isize + pad. Otherwise, the start position (s->base) may be greater than the end position (s->end), resulting in a UAF.
Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ext4/xattr.h | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h index 730b91fa0dd7..87e5863bb493 100644 --- a/fs/ext4/xattr.h +++ b/fs/ext4/xattr.h @@ -95,6 +95,19 @@ struct ext4_xattr_entry {
#define EXT4_ZERO_XATTR_VALUE ((void *)-1)
+/* + * If we want to add an xattr to the inode, we should make sure that + * i_extra_isize is not 0 and that the inode size is not less than + * EXT4_GOOD_OLD_INODE_SIZE + extra_isize + pad. + * EXT4_GOOD_OLD_INODE_SIZE extra_isize header entry pad data + * |--------------------------|------------|------|---------|---|-------| + */ +#define EXT4_INODE_HAS_XATTR_SPACE(inode) \ + ((EXT4_I(inode)->i_extra_isize != 0) && \ + (EXT4_GOOD_OLD_INODE_SIZE + EXT4_I(inode)->i_extra_isize + \ + sizeof(struct ext4_xattr_ibody_header) + EXT4_XATTR_PAD <= \ + EXT4_INODE_SIZE((inode)->i_sb))) + struct ext4_xattr_info { const char *name; const void *value;
From: Baokun Li libaokun1@huawei.com
hulk inclusion category: bugfix bugzilla: 186866, https://gitee.com/openeuler/kernel/issues/I5DTBL CVE: NA
--------------------------------
Hulk Robot reported an issue:
==================================================================
BUG: KASAN: use-after-free in ext4_xattr_set_entry+0x18ab/0x3500
Write of size 4105 at addr ffff8881675ef5f4 by task syz-executor.0/7092

CPU: 1 PID: 7092 Comm: syz-executor.0 Not tainted 4.19.90-dirty #17
Call Trace:
[...]
 memcpy+0x34/0x50 mm/kasan/kasan.c:303
 ext4_xattr_set_entry+0x18ab/0x3500 fs/ext4/xattr.c:1747
 ext4_xattr_ibody_inline_set+0x86/0x2a0 fs/ext4/xattr.c:2205
 ext4_xattr_set_handle+0x940/0x1300 fs/ext4/xattr.c:2386
 ext4_xattr_set+0x1da/0x300 fs/ext4/xattr.c:2498
 __vfs_setxattr+0x112/0x170 fs/xattr.c:149
 __vfs_setxattr_noperm+0x11b/0x2a0 fs/xattr.c:180
 __vfs_setxattr_locked+0x17b/0x250 fs/xattr.c:238
 vfs_setxattr+0xed/0x270 fs/xattr.c:255
 setxattr+0x235/0x330 fs/xattr.c:520
 path_setxattr+0x176/0x190 fs/xattr.c:539
 __do_sys_lsetxattr fs/xattr.c:561 [inline]
 __se_sys_lsetxattr fs/xattr.c:557 [inline]
 __x64_sys_lsetxattr+0xc2/0x160 fs/xattr.c:557
 do_syscall_64+0xdf/0x530 arch/x86/entry/common.c:298
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x459fe9
RSP: 002b:00007fa5e54b4c08 EFLAGS: 00000246 ORIG_RAX: 00000000000000bd
RAX: ffffffffffffffda RBX: 000000000051bf60 RCX: 0000000000459fe9
RDX: 00000000200003c0 RSI: 0000000020000180 RDI: 0000000020000140
RBP: 000000000051bf60 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000001009 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffc73c93fc0 R14: 000000000051bf60 R15: 00007fa5e54b4d80
[...]
==================================================================
The above issue may happen as follows:
-------------------------------------
ext4_xattr_set
  ext4_xattr_set_handle
    ext4_xattr_ibody_find
      >> s->end < s->base
      >> no EXT4_STATE_XATTR
      >> xattr_check_inode is not executed
    ext4_xattr_ibody_set
      ext4_xattr_set_entry
        >> size_t min_offs = s->end - s->base
        >> UAF in memcpy
We can easily reproduce this problem with the following commands:

    mkfs.ext4 -F /dev/sda
    mount -o debug_want_extra_isize=128 /dev/sda /mnt
    touch /mnt/file
    setfattr -n user.cat -v `seq -s z 4096|tr -d '[:digit:]'` /mnt/file
In ext4_xattr_ibody_find, we have the following assignment logic:

    header = IHDR(inode, raw_inode)
           = raw_inode + EXT4_GOOD_OLD_INODE_SIZE + i_extra_isize
    is->s.base = IFIRST(header)
               = header + sizeof(struct ext4_xattr_ibody_header)
    is->s.end = raw_inode + s_inode_size
In ext4_xattr_set_entry:

    min_offs = s->end - s->base
             = s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize
               - sizeof(struct ext4_xattr_ibody_header)
    last = s->first
    free = min_offs - ((void *)last - s->base) - sizeof(__u32)
         = s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize
           - sizeof(struct ext4_xattr_ibody_header) - sizeof(__u32)
In the calculation formula, all values except s_inode_size and i_extra_isize are fixed. When i_extra_isize takes its maximum value, s_inode_size - EXT4_GOOD_OLD_INODE_SIZE, min_offs is -4 and free is -8. The values underflow and wrap around, and the preceding issue is triggered when memcpy is executed.
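Plugging in the numbers from the reproducer above (assuming mkfs.ext4's default 256-byte inode size, so i_extra_isize can be driven to 256 - 128 = 128, and with sizeof(struct ext4_xattr_ibody_header) == sizeof(__u32) == 4):

    min_offs = 256 - 128 - 128 - 4 = -4
    free     = -4 - 0 - 4          = -8

Since min_offs is a size_t (see the call path above), -4 wraps to a huge unsigned value, the free-space check passes, and the subsequent memcpy writes past the in-inode xattr area, producing the KASAN report above.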
Therefore, when finding or setting an xattr, check whether there is space in the inode for storing the xattr, to resolve this issue.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ext4/xattr.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index 2f93e8b90492..b5016eb7b373 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -2170,8 +2170,9 @@ int ext4_xattr_ibody_find(struct inode *inode, struct ext4_xattr_info *i, struct ext4_inode *raw_inode; int error;
- if (EXT4_I(inode)->i_extra_isize == 0) + if (!EXT4_INODE_HAS_XATTR_SPACE(inode)) return 0; + raw_inode = ext4_raw_inode(&is->iloc); header = IHDR(inode, raw_inode); is->s.base = is->s.first = IFIRST(header); @@ -2199,8 +2200,9 @@ int ext4_xattr_ibody_inline_set(handle_t *handle, struct inode *inode, struct ext4_xattr_search *s = &is->s; int error;
- if (EXT4_I(inode)->i_extra_isize == 0) + if (!EXT4_INODE_HAS_XATTR_SPACE(inode)) return -ENOSPC; + error = ext4_xattr_set_entry(i, s, handle, inode, false /* is_block */); if (error) return error; @@ -2223,8 +2225,9 @@ static int ext4_xattr_ibody_set(handle_t *handle, struct inode *inode, struct ext4_xattr_search *s = &is->s; int error;
- if (EXT4_I(inode)->i_extra_isize == 0) + if (!EXT4_INODE_HAS_XATTR_SPACE(inode)) return -ENOSPC; + error = ext4_xattr_set_entry(i, s, handle, inode, false /* is_block */); if (error) return error;
From: Baokun Li libaokun1@huawei.com
hulk inclusion category: bugfix bugzilla: 186866, https://gitee.com/openeuler/kernel/issues/I5DTBL CVE: NA
--------------------------------
If the ext4 inode does not have xattr space, return 0 from the get_max_inline_xattr_value_size function. Otherwise, the function may return a negative value when the inode does not contain EXT4_STATE_XATTR.
Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ext4/inline.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index d3cfb0ffc02b..c2c688cb4500 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -34,6 +34,9 @@ static int get_max_inline_xattr_value_size(struct inode *inode, struct ext4_inode *raw_inode; int free, min_offs;
+ if (!EXT4_INODE_HAS_XATTR_SPACE(inode)) + return 0; + min_offs = EXT4_SB(inode->i_sb)->s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - EXT4_I(inode)->i_extra_isize -
From: Baokun Li libaokun1@huawei.com
hulk inclusion category: bugfix bugzilla: 186866, https://gitee.com/openeuler/kernel/issues/I5DTBL CVE: NA
--------------------------------
Use the EXT4_INODE_HAS_XATTR_SPACE macro to more accurately determine whether the inode has xattr space.
Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- fs/ext4/inode.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d01a1781b7ac..d474d7a7cb4b 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4672,8 +4672,7 @@ static inline int ext4_iget_extra_inode(struct inode *inode, __le32 *magic = (void *)raw_inode + EXT4_GOOD_OLD_INODE_SIZE + ei->i_extra_isize;
- if (EXT4_GOOD_OLD_INODE_SIZE + ei->i_extra_isize + sizeof(__le32) <= - EXT4_INODE_SIZE(inode->i_sb) && + if (EXT4_INODE_HAS_XATTR_SPACE(inode) && *magic == cpu_to_le32(EXT4_XATTR_MAGIC)) { ext4_set_inode_state(inode, EXT4_STATE_XATTR); return ext4_find_inline_data_nolock(inode);
From: Ming Lei ming.lei@redhat.com
mainline inclusion from mainline-v5.19 commit 6cfeadbff3f8905f2854735ebb88e581402c16c4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5EKM7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
commit 364b61818f65 ("blk-mq: clearing flush request reference in tags->rqs[]") was added to clear the about-to-be-freed flush request from tags->rqs[], to avoid a use-after-free on the flush rq.
Yu Kuai reported that blk_mq_clear_flush_rq_mapping() slows down boot time by ~8s because it runs during SCSI probing, which may create and remove lots of non-present LUNs on megaraid-sas, a driver that uses BLK_MQ_F_TAG_HCTX_SHARED and whose request queues each have lots of hw queues.
Improve the situation by not running blk_mq_clear_flush_rq_mapping() if the disk isn't added, in which case no flush request can have been issued.
Reviewed-by: Christoph Hellwig hch@lst.de Reported-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20220616014401.817001-4-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk
Conflicts: block/blk-mq.c Signed-off-by: Luo Meng luomeng12@huawei.com Reviewed-by: Jason Yan yanaijie@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- block/blk-mq.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c index 41885145e57b..83193e44aada 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2737,8 +2737,9 @@ static void blk_mq_exit_hctx(struct request_queue *q, blk_mq_dtag_idle(hctx, true); }
- blk_mq_clear_flush_rq_mapping(set->tags[hctx_idx], - set->queue_depth, flush_rq); + if (blk_queue_init_done(q)) + blk_mq_clear_flush_rq_mapping(set->tags[hctx_idx], + set->queue_depth, flush_rq); if (set->ops->exit_request) set->ops->exit_request(set, flush_rq, hctx_idx);
From: Wangming Shao shaowangming@h-partners.com
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5DKH2
-----------------------------------------------------------------------------
With the default hisi_sas_debugfs_dump_count of 50, a large amount of memory is occupied. To avoid the high memory usage, reduce the default hisi_sas_debugfs_dump_count from 50 to 1.
Signed-off-by: Wangming Shao shaowangming@h-partners.com Reviewed-by: Jason Yan yanaijie@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/scsi/hisi_sas/hisi_sas_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 4cbe8711b6d4..a1c6a67da132 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -2852,13 +2852,13 @@ EXPORT_SYMBOL_GPL(hisi_sas_remove); #if IS_ENABLED(CONFIG_SCSI_HISI_SAS_DEBUGFS_DEFAULT_ENABLE) #define DEBUGFS_ENABLE_DEFAULT "enabled" bool hisi_sas_debugfs_enable = true; -u32 hisi_sas_debugfs_dump_count = 50; #else #define DEBUGFS_ENABLE_DEFAULT "disabled" bool hisi_sas_debugfs_enable; -u32 hisi_sas_debugfs_dump_count = 1; #endif
+u32 hisi_sas_debugfs_dump_count = 1; + EXPORT_SYMBOL_GPL(hisi_sas_debugfs_enable); module_param_named(debugfs_enable, hisi_sas_debugfs_enable, bool, 0444); MODULE_PARM_DESC(hisi_sas_debugfs_enable,
From: Yejune Deng yejune.deng@gmail.com
mainline inclusion from v5.11-rc1 commit a01a89b1db1066a6af23ae08b9a0c345b7966f0b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5DVR9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
atomic_inc() and atomic_dec() look better than atomic_add(1, ...) and atomic_sub(1, ...).
Signed-off-by: Yejune Deng yejune.deng@gmail.com Message-Id: 1605511807-7135-1-git-send-email-yejune.deng@gmail.com Signed-off-by: Corey Minyard cminyard@mvista.com Signed-off-by: Miaohe Lin linmiaohe@huawei.com Reviewed-by: Li JinLin lijinlin3@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/ipmi/ipmi_watchdog.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/char/ipmi/ipmi_watchdog.c b/drivers/char/ipmi/ipmi_watchdog.c index 92eda5b2f134..7f71471c7a46 100644 --- a/drivers/char/ipmi/ipmi_watchdog.c +++ b/drivers/char/ipmi/ipmi_watchdog.c @@ -503,7 +503,7 @@ static void panic_halt_ipmi_heartbeat(void) msg.cmd = IPMI_WDOG_RESET_TIMER; msg.data = NULL; msg.data_len = 0; - atomic_add(1, &panic_done_count); + atomic_inc(&panic_done_count); rv = ipmi_request_supply_msgs(watchdog_user, (struct ipmi_addr *) &addr, 0, @@ -513,7 +513,7 @@ static void panic_halt_ipmi_heartbeat(void) &panic_halt_heartbeat_recv_msg, 1); if (rv) - atomic_sub(1, &panic_done_count); + atomic_dec(&panic_done_count); }
static struct ipmi_smi_msg panic_halt_smi_msg = { @@ -537,12 +537,12 @@ static void panic_halt_ipmi_set_timeout(void) /* Wait for the messages to be free. */ while (atomic_read(&panic_done_count) != 0) ipmi_poll_interface(watchdog_user); - atomic_add(1, &panic_done_count); + atomic_inc(&panic_done_count); rv = __ipmi_set_timeout(&panic_halt_smi_msg, &panic_halt_recv_msg, &send_heartbeat_now); if (rv) { - atomic_sub(1, &panic_done_count); + atomic_dec(&panic_done_count); pr_warn("Unable to extend the watchdog timeout\n"); } else { if (send_heartbeat_now)
From: Corey Minyard cminyard@mvista.com
mainline inclusion from v5.16-rc1 commit db05ddf7f321634c5659a0cf7ea56594e22365f7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5DVR9 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Since commit 2033f6858970 ("ipmi: Free receive messages when in an oops") was added, you get two decrements of panic_done_count when a panic message completes, not one: one when the SMI message is freed and one when the receive message is freed. The watchdog code, however, still only raised the count by one per message, leaving it unbalanced. Set the count to the proper value of two per message.
Reported-by: Anton Lundin glance@acc.umu.se Cc: Stable@vger.kernel.org # v5.4+ Fixes: 2033f6858970 ("ipmi: Free receive messages when in an oops") Signed-off-by: Corey Minyard cminyard@mvista.com Signed-off-by: Miaohe Lin linmiaohe@huawei.com Reviewed-by: Li JinLin lijinlin3@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/char/ipmi/ipmi_watchdog.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/char/ipmi/ipmi_watchdog.c b/drivers/char/ipmi/ipmi_watchdog.c index 7f71471c7a46..883b4a341012 100644 --- a/drivers/char/ipmi/ipmi_watchdog.c +++ b/drivers/char/ipmi/ipmi_watchdog.c @@ -503,7 +503,7 @@ static void panic_halt_ipmi_heartbeat(void) msg.cmd = IPMI_WDOG_RESET_TIMER; msg.data = NULL; msg.data_len = 0; - atomic_inc(&panic_done_count); + atomic_add(2, &panic_done_count); rv = ipmi_request_supply_msgs(watchdog_user, (struct ipmi_addr *) &addr, 0, @@ -513,7 +513,7 @@ static void panic_halt_ipmi_heartbeat(void) &panic_halt_heartbeat_recv_msg, 1); if (rv) - atomic_dec(&panic_done_count); + atomic_sub(2, &panic_done_count); }
static struct ipmi_smi_msg panic_halt_smi_msg = { @@ -537,12 +537,12 @@ static void panic_halt_ipmi_set_timeout(void) /* Wait for the messages to be free. */ while (atomic_read(&panic_done_count) != 0) ipmi_poll_interface(watchdog_user); - atomic_inc(&panic_done_count); + atomic_add(2, &panic_done_count); rv = __ipmi_set_timeout(&panic_halt_smi_msg, &panic_halt_recv_msg, &send_heartbeat_now); if (rv) { - atomic_dec(&panic_done_count); + atomic_sub(2, &panic_done_count); pr_warn("Unable to extend the watchdog timeout\n"); } else { if (send_heartbeat_now)
From: Chengchang Tang tangchengchang@huawei.com
mainline inclusion from mainline-for-linus commit c2fcafa78a33576b7fe47f5e4f85d413a62c2fe2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=c2f...
----------------------------------------------------------------------
The sgid_attr cannot be null in this scenario, so this check is redundant.
Fixes: 606bf89e98ef ("RDMA/hns: Refactor for hns_roce_v2_modify_qp function") Link: https://lore.kernel.org/r/20220409083254.9696-2-liangwenpeng@huawei.com Signed-off-by: Chengchang Tang tangchengchang@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 936176712758..59b1956671e5 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -4650,9 +4650,7 @@ static int hns_roce_v2_set_path(struct ib_qp *ibqp, if (ret) return ret;
- if (gid_attr) - is_udp = (gid_attr->gid_type == - IB_GID_TYPE_ROCE_UDP_ENCAP); + is_udp = (gid_attr->gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP); }
/* Only HIP08 needs to set the vlan_en bits in QPC */
From: Yixing Liu liuyixing1@huawei.com
mainline inclusion from mainline-for-linus commit 9216d05943833bdedefb8c88680a48f9e5e4aafc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=921...
----------------------------------------------------------------------
This function was only used by HIP06, support for which has been removed. So remove it as well.
Link: https://lore.kernel.org/r/20220409083254.9696-3-liangwenpeng@huawei.com Signed-off-by: Yixing Liu liuyixing1@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_device.h | 11 ----------- drivers/infiniband/hw/hns/hns_roce_qp.c | 20 -------------------- 2 files changed, 31 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 21fa93c86b04..9f8f198b9f3b 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -106,16 +106,6 @@ enum { SERV_TYPE_XRC = 5, };
-enum hns_roce_qp_state { - HNS_ROCE_QP_STATE_RST, - HNS_ROCE_QP_STATE_INIT, - HNS_ROCE_QP_STATE_RTR, - HNS_ROCE_QP_STATE_RTS, - HNS_ROCE_QP_STATE_SQD, - HNS_ROCE_QP_STATE_ERR, - HNS_ROCE_QP_NUM_STATE, -}; - enum hns_roce_event { HNS_ROCE_EVENT_TYPE_PATH_MIG = 0x01, HNS_ROCE_EVENT_TYPE_PATH_MIG_FAILED = 0x02, @@ -1184,7 +1174,6 @@ void *hns_roce_get_send_wqe(struct hns_roce_qp *hr_qp, unsigned int n); void *hns_roce_get_extend_sge(struct hns_roce_qp *hr_qp, unsigned int n); bool hns_roce_wq_overflow(struct hns_roce_wq *hr_wq, u32 nreq, struct ib_cq *ib_cq); -enum hns_roce_qp_state to_hns_roce_state(enum ib_qp_state state); void hns_roce_lock_cqs(struct hns_roce_cq *send_cq, struct hns_roce_cq *recv_cq); void hns_roce_unlock_cqs(struct hns_roce_cq *send_cq, diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c index 1099963db1b6..43530a7c8304 100644 --- a/drivers/infiniband/hw/hns/hns_roce_qp.c +++ b/drivers/infiniband/hw/hns/hns_roce_qp.c @@ -243,26 +243,6 @@ static int alloc_qpn(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp) return 0; }
-enum hns_roce_qp_state to_hns_roce_state(enum ib_qp_state state) -{ - switch (state) { - case IB_QPS_RESET: - return HNS_ROCE_QP_STATE_RST; - case IB_QPS_INIT: - return HNS_ROCE_QP_STATE_INIT; - case IB_QPS_RTR: - return HNS_ROCE_QP_STATE_RTR; - case IB_QPS_RTS: - return HNS_ROCE_QP_STATE_RTS; - case IB_QPS_SQD: - return HNS_ROCE_QP_STATE_SQD; - case IB_QPS_ERR: - return HNS_ROCE_QP_STATE_ERR; - default: - return HNS_ROCE_QP_NUM_STATE; - } -} - static void add_qp_to_list(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, struct ib_cq *send_cq, struct ib_cq *recv_cq)
From: Guofeng Yue yueguofeng@hisilicon.com
mainline inclusion from mainline-for-linus commit 601cdd861cf551e330c85c4dfa6d25bef3b8d554 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=601...
----------------------------------------------------------------------
It is completely redundant for this function to use "ret" to store the return value of the subfunction.
Link: https://lore.kernel.org/r/20220409083254.9696-4-liangwenpeng@huawei.com Signed-off-by: Guofeng Yue yueguofeng@hisilicon.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 59b1956671e5..242c6d0f8d2f 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -3045,7 +3045,6 @@ static int hns_roce_v2_write_mtpt(struct hns_roce_dev *hr_dev, void *mb_buf, struct hns_roce_mr *mr) { struct hns_roce_v2_mpt_entry *mpt_entry; - int ret;
mpt_entry = mb_buf; memset(mpt_entry, 0, sizeof(*mpt_entry)); @@ -3084,9 +3083,7 @@ static int hns_roce_v2_write_mtpt(struct hns_roce_dev *hr_dev, to_hr_hw_page_shift(mr->pbl_mtr.hem_cfg.ba_pg_shift)); hr_reg_enable(mpt_entry, MPT_INNER_PA_VLD);
- ret = set_mtpt_pbl(hr_dev, mpt_entry, mr); - - return ret; + return set_mtpt_pbl(hr_dev, mpt_entry, mr); }
static int hns_roce_v2_rereg_write_mtpt(struct hns_roce_dev *hr_dev,
From: Wenpeng Liang liangwenpeng@huawei.com
mainline inclusion from mainline-for-linus commit ac88da750f09c749e1c0ab0b8e5468c533704e52 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=ac8...
----------------------------------------------------------------------
CMDQ may fail to execute, so its return value should not be ignored.
Link: https://lore.kernel.org/r/20220409083254.9696-5-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 242c6d0f8d2f..ac8ce00069a4 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -1510,7 +1510,7 @@ static void __hns_roce_function_clear(struct hns_roce_dev *hr_dev, int vf_id) hns_roce_func_clr_rst_proc(hr_dev, ret, fclr_write_fail_flag); }
-static void hns_roce_free_vf_resource(struct hns_roce_dev *hr_dev, int vf_id) +static int hns_roce_free_vf_resource(struct hns_roce_dev *hr_dev, int vf_id) { enum hns_roce_opcode_type opcode = HNS_ROCE_OPC_ALLOC_VF_RES; struct hns_roce_cmq_desc desc[2]; @@ -1521,17 +1521,26 @@ static void hns_roce_free_vf_resource(struct hns_roce_dev *hr_dev, int vf_id) desc[0].flag |= cpu_to_le16(HNS_ROCE_CMD_FLAG_NEXT); hns_roce_cmq_setup_basic_desc(&desc[1], opcode, false); hr_reg_write(req_a, FUNC_RES_A_VF_ID, vf_id); - hns_roce_cmq_send(hr_dev, desc, 2); + + return hns_roce_cmq_send(hr_dev, desc, 2); }
static void hns_roce_function_clear(struct hns_roce_dev *hr_dev) { + int ret; int i;
for (i = hr_dev->func_num - 1; i >= 0; i--) { __hns_roce_function_clear(hr_dev, i); - if (i != 0) - hns_roce_free_vf_resource(hr_dev, i); + + if (i == 0) + continue; + + ret = hns_roce_free_vf_resource(hr_dev, i); + if (ret) + ibdev_err(&hr_dev->ib_dev, + "failed to free vf resource, vf_id = %d, ret = %d.\n", + i, ret); } }
From: Haoyue Xu xuhaoyue1@hisilicon.com
mainline inclusion from mainline-for-linus commit 6f4f5cf9823387acc4f52e3d30f96b879acdff37 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=6f4...
----------------------------------------------------------------------
Assigning a value to ret in the init statement of a for-loop makes the code less readable.
Link: https://lore.kernel.org/r/20220409083254.9696-6-liangwenpeng@huawei.com Signed-off-by: Haoyue Xu xuhaoyue1@hisilicon.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index ac8ce00069a4..4ea8e0e90ba8 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -1296,7 +1296,8 @@ static int __hns_roce_cmq_send(struct hns_roce_dev *hr_dev, } while (++timeout < priv->cmq.tx_timeout);
if (hns_roce_cmq_csq_done(hr_dev)) { - for (ret = 0, i = 0; i < num; i++) { + ret = 0; + for (i = 0; i < num; i++) { /* check the result of hardware write back */ desc[i] = csq->desc[tail++]; if (tail == csq->desc_num)
From: Guo Zhengkui guozhengkui@vivo.com
mainline inclusion from mainline-for-linus commit cc377b9b24c7839531c2c0b7a2165819b578393e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=cc3...
----------------------------------------------------------------------
Fix the following coccicheck warning:
drivers/infiniband/hw/hns/hns_roce_mr.c:343:5-8: Unneeded variable: "ret".
Return 0 directly instead.
Link: https://lore.kernel.org/r/20220426070858.9098-1-guozhengkui@vivo.com Signed-off-by: Guo Zhengkui guozhengkui@vivo.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_mr.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c index 214833a87542..1e36ac383ea3 100644 --- a/drivers/infiniband/hw/hns/hns_roce_mr.c +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c @@ -337,12 +337,11 @@ int hns_roce_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata) { struct hns_roce_dev *hr_dev = to_hr_dev(ibmr->device); struct hns_roce_mr *mr = to_hr_mr(ibmr); - int ret = 0;
hns_roce_mr_free(hr_dev, mr); kfree(mr);
- return ret; + return 0; }
struct ib_mr *hns_roce_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
From: Yangyang Li liyangyang20@huawei.com
mainline inclusion from mainline-for-linus commit e8ea058edc2b225a68b307057a65599625daaebf category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=e8e...
----------------------------------------------------------------------
CMDQ may fail during HNS ROCEE initialization. The following is the log when the execution fails:
hns3 0000:bd:00.2: In reset process RoCE client reinit.
hns3 0000:bd:00.2: CMDQ move tail from 840 to 839
hns3 0000:bd:00.2 hns_2: failed to set gid, ret = -11!
hns3 0000:bd:00.2: CMDQ move tail from 840 to 839
<...>
hns3 0000:bd:00.2: CMDQ move tail from 840 to 839
hns3 0000:bd:00.2: CMDQ move tail from 840 to 0
hns3 0000:bd:00.2: [cmd]token 14e mailbox 20 timeout.
hns3 0000:bd:00.2 hns_2: set HEM step 0 failed!
hns3 0000:bd:00.2 hns_2: set HEM address to HW failed!
hns3 0000:bd:00.2 hns_2: failed to alloc mtpt, ret = -16.
infiniband hns_2: Couldn't create ib_mad PD
infiniband hns_2: Couldn't open port 1
hns3 0000:bd:00.2: Reset done, RoCE client reinit finished.
However, even if ib_mad client registration fails, ib_register_device() still returns success to the driver.
In the device initialization process, CMDQ execution can fail because the HW/FW is abnormal. Therefore, if CMDQ fails, the initialization function should set CMDQ to a fatal error state and return a failure to the caller.
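Condensed, the change latches a fatal state when a CMDQ timeout happens during reset/init and short-circuits later commands (a sketch using identifiers from the diff below, not the literal patch):

	/* on a CMDQ timeout, in update_cmdq_status(): */
	if (handle->rinfo.reset_state == HNS_ROCE_STATE_RST_INIT ||
	    handle->rinfo.instance_state == HNS_ROCE_STATE_INIT)
		hr_dev->cmd.state = HNS_ROCE_CMDQ_STATE_FATAL_ERR;

	/* at the top of hns_roce_cmq_send() and in the mailbox wait loop: */
	if (hr_dev->cmd.state == HNS_ROCE_CMDQ_STATE_FATAL_ERR)
		return -EIO;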
Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Link: https://lore.kernel.org/r/20220429093104.26687-1-liangwenpeng@huawei.com Signed-off-by: Yangyang Li liyangyang20@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_device.h | 6 ++++++ drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 21 +++++++++++++++++++++ 2 files changed, 27 insertions(+)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 9f8f198b9f3b..8463416321e1 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -525,6 +525,11 @@ struct hns_roce_cmd_context { u16 busy; };
+enum hns_roce_cmdq_state { + HNS_ROCE_CMDQ_STATE_NORMAL, + HNS_ROCE_CMDQ_STATE_FATAL_ERR, +}; + struct hns_roce_cmdq { struct dma_pool *pool; struct semaphore poll_sem; @@ -544,6 +549,7 @@ struct hns_roce_cmdq { * close device, switch into poll mode(non event mode) */ u8 use_events; + enum hns_roce_cmdq_state state; };
struct hns_roce_cmd_mailbox { diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 4ea8e0e90ba8..413cdc527714 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -1265,6 +1265,16 @@ static int hns_roce_cmq_csq_done(struct hns_roce_dev *hr_dev) return tail == priv->cmq.csq.head; }
+static void update_cmdq_status(struct hns_roce_dev *hr_dev) +{ + struct hns_roce_v2_priv *priv = hr_dev->priv; + struct hnae3_handle *handle = priv->handle; + + if (handle->rinfo.reset_state == HNS_ROCE_STATE_RST_INIT || + handle->rinfo.instance_state == HNS_ROCE_STATE_INIT) + hr_dev->cmd.state = HNS_ROCE_CMDQ_STATE_FATAL_ERR; +} + static int __hns_roce_cmq_send(struct hns_roce_dev *hr_dev, struct hns_roce_cmq_desc *desc, int num) { @@ -1319,6 +1329,8 @@ static int __hns_roce_cmq_send(struct hns_roce_dev *hr_dev, csq->head, tail); csq->head = tail;
+ update_cmdq_status(hr_dev); + ret = -EAGAIN; }
@@ -1333,6 +1345,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev, bool busy; int ret;
+ if (hr_dev->cmd.state == HNS_ROCE_CMDQ_STATE_FATAL_ERR) + return -EIO; + if (!v2_chk_mbox_is_avail(hr_dev, &busy)) return busy ? -EBUSY : 0;
@@ -1531,6 +1546,9 @@ static void hns_roce_function_clear(struct hns_roce_dev *hr_dev) int ret; int i;
+ if (hr_dev->cmd.state == HNS_ROCE_CMDQ_STATE_FATAL_ERR) + return; + for (i = hr_dev->func_num - 1; i >= 0; i--) { __hns_roce_function_clear(hr_dev, i);
@@ -2798,6 +2816,9 @@ static int v2_wait_mbox_complete(struct hns_roce_dev *hr_dev, u32 timeout, mb_st = (struct hns_roce_mbox_status *)desc.data; end = msecs_to_jiffies(timeout) + jiffies; while (v2_chk_mbox_is_avail(hr_dev, &busy)) { + if (hr_dev->cmd.state == HNS_ROCE_CMDQ_STATE_FATAL_ERR) + return -EIO; + status = 0; hns_roce_cmq_setup_basic_desc(&desc, HNS_ROCE_OPC_QUERY_MB_ST, true);
From: Yixing Liu liuyixing1@huawei.com
mainline inclusion from mainline-for-linus commit db5dfbf5b201df65c1f5332c4d9d5e7c2f42396b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=db5...
----------------------------------------------------------------------
The BT number of cqc_timer of HIP09 increases compared with that of HIP08. Therefore, cqc_timer_bt_num and num_cqc_timer do not match, and the driver may fail to allocate cqc_timer. So the driver needs to uniformly use cqc_timer_bt_num to represent the BT number of cqc_timer.
Fixes: 0e40dc2f70cd ("RDMA/hns: Add timer allocation support for hip08") Link: https://lore.kernel.org/r/20220429093545.58070-1-liangwenpeng@huawei.com Signed-off-by: Yixing Liu liuyixing1@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_device.h | 1 - drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 3 +-- drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 +- drivers/infiniband/hw/hns/hns_roce_main.c | 2 +- 4 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 8463416321e1..a7049095d050 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -720,7 +720,6 @@ struct hns_roce_caps { u32 num_pi_qps; u32 reserved_qps; int num_qpc_timer; - int num_cqc_timer; u32 num_srqs; u32 max_wqes; u32 max_srq_wrs; diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 413cdc527714..e906dcfc56fa 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -1969,7 +1969,7 @@ static void set_default_caps(struct hns_roce_dev *hr_dev) caps->num_mtpts = HNS_ROCE_V2_MAX_MTPT_NUM; caps->num_pds = HNS_ROCE_V2_MAX_PD_NUM; caps->num_qpc_timer = HNS_ROCE_V2_MAX_QPC_TIMER_NUM; - caps->num_cqc_timer = HNS_ROCE_V2_MAX_CQC_TIMER_NUM; + caps->cqc_timer_bt_num = HNS_ROCE_V2_MAX_CQC_TIMER_BT_NUM;
caps->max_qp_init_rdma = HNS_ROCE_V2_MAX_QP_INIT_RDMA; caps->max_qp_dest_rdma = HNS_ROCE_V2_MAX_QP_DEST_RDMA; @@ -2247,7 +2247,6 @@ static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev) caps->max_rq_sg = roundup_pow_of_two(caps->max_rq_sg); caps->max_extend_sg = le32_to_cpu(resp_a->max_extend_sg); caps->num_qpc_timer = le16_to_cpu(resp_a->num_qpc_timer); - caps->num_cqc_timer = le16_to_cpu(resp_a->num_cqc_timer); caps->max_srq_sges = le16_to_cpu(resp_a->max_srq_sges); caps->max_srq_sges = roundup_pow_of_two(caps->max_srq_sges); caps->num_aeq_vectors = resp_a->num_aeq_vectors; diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h index e9a73c34389b..9ec69ae7e58e 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h @@ -41,7 +41,7 @@ #define HNS_ROCE_V2_MAX_SRQ_WR 0x8000 #define HNS_ROCE_V2_MAX_SRQ_SGE 64 #define HNS_ROCE_V2_MAX_CQ_NUM 0x100000 -#define HNS_ROCE_V2_MAX_CQC_TIMER_NUM 0x100 +#define HNS_ROCE_V2_MAX_CQC_TIMER_BT_NUM 0x100 #define HNS_ROCE_V2_MAX_SRQ_NUM 0x100000 #define HNS_ROCE_V2_MAX_CQE_NUM 0x400000 #define HNS_ROCE_V2_MAX_RQ_SGE_NUM 64 diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c index 8aa0af069042..11f42cebfa40 100644 --- a/drivers/infiniband/hw/hns/hns_roce_main.c +++ b/drivers/infiniband/hw/hns/hns_roce_main.c @@ -774,7 +774,7 @@ static int hns_roce_init_hem(struct hns_roce_dev *hr_dev) ret = hns_roce_init_hem_table(hr_dev, &hr_dev->cqc_timer_table, HEM_TYPE_CQC_TIMER, hr_dev->caps.cqc_timer_entry_sz, - hr_dev->caps.num_cqc_timer, 1); + hr_dev->caps.cqc_timer_bt_num, 1); if (ret) { dev_err(dev, "Failed to init CQC timer memory, aborting.\n");
From: Wenpeng Liang liangwenpeng@huawei.com
mainline inclusion from mainline-for-linus commit 82600b2d3cd57428bdb03c66ae67708d3c8f7281 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=826...
----------------------------------------------------------------------
To reduce the code size and make the code clearer, replace all roce_set_xxx() with hr_reg_xxx() to write the data fields.
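The shape of the conversion, taken from the first hunk below: a masked read-modify-write through explicit _M/_S macros becomes a single field write keyed by the field name, with hr_reg_enable()/hr_reg_clear() covering the 1-bit roce_set_bit() cases.

	/* before */
	roce_set_field(rc_sq_wqe->byte_16, V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M,
		       V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, valid_num_sge);
	/* after */
	hr_reg_write(rc_sq_wqe, RC_SEND_WQE_SGE_NUM, valid_num_sge);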
Link: https://lore.kernel.org/r/20220512080012.38728-2-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 261 ++++++++------------- drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 166 +++++-------- 2 files changed, 157 insertions(+), 270 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index e906dcfc56fa..f182c155f264 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -149,8 +149,7 @@ static void set_atomic_seg(const struct ib_send_wr *wr, aseg->cmp_data = 0; }
- roce_set_field(rc_sq_wqe->byte_16, V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M, - V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, valid_num_sge); + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_SGE_NUM, valid_num_sge); }
static int fill_ext_sge_inl_data(struct hns_roce_qp *qp, @@ -271,8 +270,7 @@ static int set_rc_inl(struct hns_roce_qp *qp, const struct ib_send_wr *wr, dseg += sizeof(struct hns_roce_v2_rc_send_wqe);
if (msg_len <= HNS_ROCE_V2_MAX_RC_INL_INN_SZ) { - roce_set_bit(rc_sq_wqe->byte_20, - V2_RC_SEND_WQE_BYTE_20_INL_TYPE_S, 0); + hr_reg_clear(rc_sq_wqe, RC_SEND_WQE_INL_TYPE);
for (i = 0; i < wr->num_sge; i++) { memcpy(dseg, ((void *)wr->sg_list[i].addr), @@ -280,17 +278,13 @@ static int set_rc_inl(struct hns_roce_qp *qp, const struct ib_send_wr *wr, dseg += wr->sg_list[i].length; } } else { - roce_set_bit(rc_sq_wqe->byte_20, - V2_RC_SEND_WQE_BYTE_20_INL_TYPE_S, 1); + hr_reg_enable(rc_sq_wqe, RC_SEND_WQE_INL_TYPE);
ret = fill_ext_sge_inl_data(qp, wr, &curr_idx, msg_len); if (ret) return ret;
- roce_set_field(rc_sq_wqe->byte_16, - V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M, - V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, - curr_idx - *sge_idx); + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_SGE_NUM, curr_idx - *sge_idx); }
*sge_idx = curr_idx; @@ -309,12 +303,10 @@ static int set_rwqe_data_seg(struct ib_qp *ibqp, const struct ib_send_wr *wr, int j = 0; int i;
- roce_set_field(rc_sq_wqe->byte_20, - V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_M, - V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S, - (*sge_ind) & (qp->sge.sge_cnt - 1)); + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_MSG_START_SGE_IDX, + (*sge_ind) & (qp->sge.sge_cnt - 1));
- roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_INLINE_S, + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_INLINE, !!(wr->send_flags & IB_SEND_INLINE)); if (wr->send_flags & IB_SEND_INLINE) return set_rc_inl(qp, wr, rc_sq_wqe, sge_ind); @@ -339,9 +331,7 @@ static int set_rwqe_data_seg(struct ib_qp *ibqp, const struct ib_send_wr *wr, valid_num_sge - HNS_ROCE_SGE_IN_WQE); }
- roce_set_field(rc_sq_wqe->byte_16, - V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M, - V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, valid_num_sge); + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_SGE_NUM, valid_num_sge);
return 0; } @@ -412,8 +402,7 @@ static int set_ud_opcode(struct hns_roce_v2_ud_send_wqe *ud_sq_wqe,
ud_sq_wqe->immtdata = get_immtdata(wr);
- roce_set_field(ud_sq_wqe->byte_4, V2_UD_SEND_WQE_BYTE_4_OPCODE_M, - V2_UD_SEND_WQE_BYTE_4_OPCODE_S, to_hr_opcode(ib_op)); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_OPCODE, to_hr_opcode(ib_op));
return 0; } @@ -424,21 +413,15 @@ static int fill_ud_av(struct hns_roce_v2_ud_send_wqe *ud_sq_wqe, struct ib_device *ib_dev = ah->ibah.device; struct hns_roce_dev *hr_dev = to_hr_dev(ib_dev);
- roce_set_field(ud_sq_wqe->byte_24, V2_UD_SEND_WQE_BYTE_24_UDPSPN_M, - V2_UD_SEND_WQE_BYTE_24_UDPSPN_S, ah->av.udp_sport); - - roce_set_field(ud_sq_wqe->byte_36, V2_UD_SEND_WQE_BYTE_36_HOPLIMIT_M, - V2_UD_SEND_WQE_BYTE_36_HOPLIMIT_S, ah->av.hop_limit); - roce_set_field(ud_sq_wqe->byte_36, V2_UD_SEND_WQE_BYTE_36_TCLASS_M, - V2_UD_SEND_WQE_BYTE_36_TCLASS_S, ah->av.tclass); - roce_set_field(ud_sq_wqe->byte_40, V2_UD_SEND_WQE_BYTE_40_FLOW_LABEL_M, - V2_UD_SEND_WQE_BYTE_40_FLOW_LABEL_S, ah->av.flowlabel); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_UDPSPN, ah->av.udp_sport); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_HOPLIMIT, ah->av.hop_limit); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_TCLASS, ah->av.tclass); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_FLOW_LABEL, ah->av.flowlabel);
if (WARN_ON(ah->av.sl > MAX_SERVICE_LEVEL)) return -EINVAL;
- roce_set_field(ud_sq_wqe->byte_40, V2_UD_SEND_WQE_BYTE_40_SL_M, - V2_UD_SEND_WQE_BYTE_40_SL_S, ah->av.sl); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_SL, ah->av.sl);
ud_sq_wqe->sgid_index = ah->av.gid_index;
@@ -448,10 +431,8 @@ static int fill_ud_av(struct hns_roce_v2_ud_send_wqe *ud_sq_wqe, if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09) return 0;
- roce_set_bit(ud_sq_wqe->byte_40, V2_UD_SEND_WQE_BYTE_40_UD_VLAN_EN_S, - ah->av.vlan_en); - roce_set_field(ud_sq_wqe->byte_36, V2_UD_SEND_WQE_BYTE_36_VLAN_M, - V2_UD_SEND_WQE_BYTE_36_VLAN_S, ah->av.vlan_id); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_VLAN_EN, ah->av.vlan_en); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_VLAN, ah->av.vlan_id);
return 0; } @@ -476,27 +457,19 @@ static inline int set_ud_wqe(struct hns_roce_qp *qp,
ud_sq_wqe->msg_len = cpu_to_le32(msg_len);
- roce_set_bit(ud_sq_wqe->byte_4, V2_UD_SEND_WQE_BYTE_4_CQE_S, + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_CQE, !!(wr->send_flags & IB_SEND_SIGNALED)); - - roce_set_bit(ud_sq_wqe->byte_4, V2_UD_SEND_WQE_BYTE_4_SE_S, + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_SE, !!(wr->send_flags & IB_SEND_SOLICITED));
- roce_set_field(ud_sq_wqe->byte_16, V2_UD_SEND_WQE_BYTE_16_PD_M, - V2_UD_SEND_WQE_BYTE_16_PD_S, to_hr_pd(qp->ibqp.pd)->pdn); - - roce_set_field(ud_sq_wqe->byte_16, V2_UD_SEND_WQE_BYTE_16_SGE_NUM_M, - V2_UD_SEND_WQE_BYTE_16_SGE_NUM_S, valid_num_sge); - - roce_set_field(ud_sq_wqe->byte_20, - V2_UD_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_M, - V2_UD_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S, - curr_idx & (qp->sge.sge_cnt - 1)); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_PD, to_hr_pd(qp->ibqp.pd)->pdn); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_SGE_NUM, valid_num_sge); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_MSG_START_SGE_IDX, + curr_idx & (qp->sge.sge_cnt - 1));
ud_sq_wqe->qkey = cpu_to_le32(ud_wr(wr)->remote_qkey & 0x80000000 ? qp->qkey : ud_wr(wr)->remote_qkey); - roce_set_field(ud_sq_wqe->byte_32, V2_UD_SEND_WQE_BYTE_32_DQPN_M, - V2_UD_SEND_WQE_BYTE_32_DQPN_S, ud_wr(wr)->remote_qpn); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_DQPN, ud_wr(wr)->remote_qpn);
ret = fill_ud_av(ud_sq_wqe, ah); if (ret) @@ -516,8 +489,7 @@ static inline int set_ud_wqe(struct hns_roce_qp *qp, dma_wmb();
*sge_idx = curr_idx; - roce_set_bit(ud_sq_wqe->byte_4, V2_UD_SEND_WQE_BYTE_4_OWNER_S, - owner_bit); + hr_reg_write(ud_sq_wqe, UD_SEND_WQE_OWNER, owner_bit);
return 0; } @@ -553,7 +525,7 @@ static int set_rc_opcode(struct hns_roce_dev *hr_dev, ret = -EOPNOTSUPP; break; case IB_WR_LOCAL_INV: - roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_SO_S, 1); + hr_reg_enable(rc_sq_wqe, RC_SEND_WQE_SO); fallthrough; case IB_WR_SEND_WITH_INV: rc_sq_wqe->inv_key = cpu_to_le32(wr->ex.invalidate_rkey); @@ -565,11 +537,11 @@ static int set_rc_opcode(struct hns_roce_dev *hr_dev, if (unlikely(ret)) return ret;
- roce_set_field(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_OPCODE_M, - V2_RC_SEND_WQE_BYTE_4_OPCODE_S, to_hr_opcode(ib_op)); + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_OPCODE, to_hr_opcode(ib_op));
return ret; } + static inline int set_rc_wqe(struct hns_roce_qp *qp, const struct ib_send_wr *wr, void *wqe, unsigned int *sge_idx, @@ -590,13 +562,13 @@ static inline int set_rc_wqe(struct hns_roce_qp *qp, if (WARN_ON(ret)) return ret;
- roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_FENCE_S, + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_FENCE, (wr->send_flags & IB_SEND_FENCE) ? 1 : 0);
- roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_SE_S, + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_SE, (wr->send_flags & IB_SEND_SOLICITED) ? 1 : 0);
- roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_CQE_S, + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_CQE, (wr->send_flags & IB_SEND_SIGNALED) ? 1 : 0);
if (wr->opcode == IB_WR_ATOMIC_CMP_AND_SWP || @@ -616,8 +588,7 @@ static inline int set_rc_wqe(struct hns_roce_qp *qp, dma_wmb();
*sge_idx = curr_idx; - roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_OWNER_S, - owner_bit); + hr_reg_write(rc_sq_wqe, RC_SEND_WQE_OWNER, owner_bit);
return ret; } @@ -682,14 +653,11 @@ static void write_dwqe(struct hns_roce_dev *hr_dev, struct hns_roce_qp *qp, struct hns_roce_v2_rc_send_wqe *rc_sq_wqe = wqe;
 	/* All kinds of DirectWQE have the same header field layout */
-	roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_FLAG_S, 1);
-	roce_set_field(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_DB_SL_L_M,
-		       V2_RC_SEND_WQE_BYTE_4_DB_SL_L_S, qp->sl);
-	roce_set_field(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_DB_SL_H_M,
-		       V2_RC_SEND_WQE_BYTE_4_DB_SL_H_S,
-		       qp->sl >> HNS_ROCE_SL_SHIFT);
-	roce_set_field(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_WQE_INDEX_M,
-		       V2_RC_SEND_WQE_BYTE_4_WQE_INDEX_S, qp->sq.head);
+	hr_reg_enable(rc_sq_wqe, RC_SEND_WQE_FLAG);
+	hr_reg_write(rc_sq_wqe, RC_SEND_WQE_DB_SL_L, qp->sl);
+	hr_reg_write(rc_sq_wqe, RC_SEND_WQE_DB_SL_H,
+		     qp->sl >> HNS_ROCE_SL_SHIFT);
+	hr_reg_write(rc_sq_wqe, RC_SEND_WQE_WQE_INDEX, qp->sq.head);

 	hns_roce_write512(hr_dev, wqe, qp->sq.db_reg);
 }

@@ -1779,17 +1747,16 @@ static int __hns_roce_set_vf_switch_param(struct hns_roce_dev *hr_dev,
 	swt = (struct hns_roce_vf_switch *)desc.data;
 	hns_roce_cmq_setup_basic_desc(&desc, HNS_SWITCH_PARAMETER_CFG, true);
 	swt->rocee_sel |= cpu_to_le32(HNS_ICL_SWITCH_CMD_ROCEE_SEL);
-	roce_set_field(swt->fun_id, VF_SWITCH_DATA_FUN_ID_VF_ID_M,
-		       VF_SWITCH_DATA_FUN_ID_VF_ID_S, vf_id);
+	hr_reg_write(swt, VF_SWITCH_VF_ID, vf_id);
 	ret = hns_roce_cmq_send(hr_dev, &desc, 1);
 	if (ret)
 		return ret;

 	desc.flag = cpu_to_le16(HNS_ROCE_CMD_FLAG_IN);
 	desc.flag &= cpu_to_le16(~HNS_ROCE_CMD_FLAG_WR);
-	roce_set_bit(swt->cfg, VF_SWITCH_DATA_CFG_ALW_LPBK_S, 1);
-	roce_set_bit(swt->cfg, VF_SWITCH_DATA_CFG_ALW_LCL_LPBK_S, 0);
-	roce_set_bit(swt->cfg, VF_SWITCH_DATA_CFG_ALW_DST_OVRD_S, 1);
+	hr_reg_enable(swt, VF_SWITCH_ALW_LPBK);
+	hr_reg_clear(swt, VF_SWITCH_ALW_LCL_LPBK);
+	hr_reg_enable(swt, VF_SWITCH_ALW_DST_OVRD);

 	return hns_roce_cmq_send(hr_dev, &desc, 1);
 }

@@ -2921,10 +2888,8 @@ static int config_sgid_table(struct hns_roce_dev *hr_dev,

 	hns_roce_cmq_setup_basic_desc(&desc, HNS_ROCE_OPC_CFG_SGID_TB, false);

-	roce_set_field(sgid_tb->table_idx_rsv, CFG_SGID_TB_TABLE_IDX_M,
-		       CFG_SGID_TB_TABLE_IDX_S, gid_index);
-	roce_set_field(sgid_tb->vf_sgid_type_rsv, CFG_SGID_TB_VF_SGID_TYPE_M,
-		       CFG_SGID_TB_VF_SGID_TYPE_S, sgid_type);
+	hr_reg_write(sgid_tb, CFG_SGID_TB_TABLE_IDX, gid_index);
+	hr_reg_write(sgid_tb, CFG_SGID_TB_VF_SGID_TYPE, sgid_type);

 	copy_gid(&sgid_tb->vf_sgid_l, gid);

@@ -2959,19 +2924,14 @@ static int config_gmv_table(struct hns_roce_dev *hr_dev,

 	copy_gid(&tb_a->vf_sgid_l, gid);

-	roce_set_field(tb_a->vf_sgid_type_vlan, CFG_GMV_TB_VF_SGID_TYPE_M,
-		       CFG_GMV_TB_VF_SGID_TYPE_S, sgid_type);
-	roce_set_bit(tb_a->vf_sgid_type_vlan, CFG_GMV_TB_VF_VLAN_EN_S,
-		     vlan_id < VLAN_CFI_MASK);
-	roce_set_field(tb_a->vf_sgid_type_vlan, CFG_GMV_TB_VF_VLAN_ID_M,
-		       CFG_GMV_TB_VF_VLAN_ID_S, vlan_id);
+	hr_reg_write(tb_a, GMV_TB_A_VF_SGID_TYPE, sgid_type);
+	hr_reg_write(tb_a, GMV_TB_A_VF_VLAN_EN, vlan_id < VLAN_CFI_MASK);
+	hr_reg_write(tb_a, GMV_TB_A_VF_VLAN_ID, vlan_id);

 	tb_b->vf_smac_l = cpu_to_le32(*(u32 *)mac);
-	roce_set_field(tb_b->vf_smac_h, CFG_GMV_TB_SMAC_H_M,
-		       CFG_GMV_TB_SMAC_H_S, *(u16 *)&mac[4]);
-
-	roce_set_field(tb_b->table_idx_rsv, CFG_GMV_TB_SGID_IDX_M,
-		       CFG_GMV_TB_SGID_IDX_S, gid_index);
+	hr_reg_write(tb_b, GMV_TB_B_SMAC_H, *(u16 *)&mac[4]);
+	hr_reg_write(tb_b, GMV_TB_B_SGID_IDX, gid_index);

 	return hns_roce_cmq_send(hr_dev, desc, 2);
 }

@@ -3020,10 +2980,8 @@ static int hns_roce_v2_set_mac(struct hns_roce_dev *hr_dev, u8 phy_port,
 	reg_smac_l = *(u32 *)(&addr[0]);
 	reg_smac_h = *(u16 *)(&addr[4]);

-	roce_set_field(smac_tb->tb_idx_rsv, CFG_SMAC_TB_IDX_M,
-		       CFG_SMAC_TB_IDX_S, phy_port);
-	roce_set_field(smac_tb->vf_smac_h_rsv, CFG_SMAC_TB_VF_SMAC_H_M,
-		       CFG_SMAC_TB_VF_SMAC_H_S, reg_smac_h);
+	hr_reg_write(smac_tb, CFG_SMAC_TB_IDX, phy_port);
+	hr_reg_write(smac_tb, CFG_SMAC_TB_VF_SMAC_H, reg_smac_h);
 	smac_tb->vf_smac_l = cpu_to_le32(reg_smac_l);

 	return hns_roce_cmq_send(hr_dev, &desc, 1);

@@ -3052,21 +3010,15 @@ static int set_mtpt_pbl(struct hns_roce_dev *hr_dev,

 	mpt_entry->pbl_size = cpu_to_le32(mr->npages);
 	mpt_entry->pbl_ba_l = cpu_to_le32(pbl_ba >> 3);
-	roce_set_field(mpt_entry->byte_48_mode_ba,
-		       V2_MPT_BYTE_48_PBL_BA_H_M, V2_MPT_BYTE_48_PBL_BA_H_S,
-		       upper_32_bits(pbl_ba >> 3));
+	hr_reg_write(mpt_entry, MPT_PBL_BA_H, upper_32_bits(pbl_ba >> 3));

 	mpt_entry->pa0_l = cpu_to_le32(lower_32_bits(pages[0]));
-	roce_set_field(mpt_entry->byte_56_pa0_h, V2_MPT_BYTE_56_PA0_H_M,
-		       V2_MPT_BYTE_56_PA0_H_S, upper_32_bits(pages[0]));
+	hr_reg_write(mpt_entry, MPT_PA0_H, upper_32_bits(pages[0]));

 	mpt_entry->pa1_l = cpu_to_le32(lower_32_bits(pages[1]));
-	roce_set_field(mpt_entry->byte_64_buf_pa1, V2_MPT_BYTE_64_PA1_H_M,
-		       V2_MPT_BYTE_64_PA1_H_S, upper_32_bits(pages[1]));
-	roce_set_field(mpt_entry->byte_64_buf_pa1,
-		       V2_MPT_BYTE_64_PBL_BUF_PG_SZ_M,
-		       V2_MPT_BYTE_64_PBL_BUF_PG_SZ_S,
-		       to_hr_hw_page_shift(mr->pbl_mtr.hem_cfg.buf_pg_shift));
+	hr_reg_write(mpt_entry, MPT_PA1_H, upper_32_bits(pages[1]));
+	hr_reg_write(mpt_entry, MPT_PBL_BUF_PG_SZ,
+		     to_hr_hw_page_shift(mr->pbl_mtr.hem_cfg.buf_pg_shift));

 	return 0;
 }

@@ -3124,24 +3076,19 @@ static int hns_roce_v2_rereg_write_mtpt(struct hns_roce_dev *hr_dev,
 	u32 mr_access_flags = mr->access;
 	int ret = 0;

-	roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_MPT_ST_M,
-		       V2_MPT_BYTE_4_MPT_ST_S, V2_MPT_ST_VALID);
-
-	roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_PD_M,
-		       V2_MPT_BYTE_4_PD_S, mr->pd);
+	hr_reg_write(mpt_entry, MPT_ST, V2_MPT_ST_VALID);
+	hr_reg_write(mpt_entry, MPT_PD, mr->pd);

 	if (flags & IB_MR_REREG_ACCESS) {
-		roce_set_bit(mpt_entry->byte_8_mw_cnt_en,
-			     V2_MPT_BYTE_8_BIND_EN_S,
+		hr_reg_write(mpt_entry, MPT_BIND_EN,
 			     (mr_access_flags & IB_ACCESS_MW_BIND ? 1 : 0));
-		roce_set_bit(mpt_entry->byte_8_mw_cnt_en,
-			     V2_MPT_BYTE_8_ATOMIC_EN_S,
+		hr_reg_write(mpt_entry, MPT_ATOMIC_EN,
 			     mr_access_flags & IB_ACCESS_REMOTE_ATOMIC ? 1 : 0);
-		roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_RR_EN_S,
+		hr_reg_write(mpt_entry, MPT_RR_EN,
 			     mr_access_flags & IB_ACCESS_REMOTE_READ ? 1 : 0);
-		roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_RW_EN_S,
+		hr_reg_write(mpt_entry, MPT_RW_EN,
 			     mr_access_flags & IB_ACCESS_REMOTE_WRITE ? 1 : 0);
-		roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_LW_EN_S,
+		hr_reg_write(mpt_entry, MPT_LW_EN,
 			     mr_access_flags & IB_ACCESS_LOCAL_WRITE ? 1 : 0);
 	}

@@ -3172,37 +3119,28 @@ static int hns_roce_v2_frmr_write_mtpt(struct hns_roce_dev *hr_dev,
 		return -ENOBUFS;
 	}

-	roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_MPT_ST_M,
-		       V2_MPT_BYTE_4_MPT_ST_S, V2_MPT_ST_FREE);
-	roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_PBL_HOP_NUM_M,
-		       V2_MPT_BYTE_4_PBL_HOP_NUM_S, 1);
-	roce_set_field(mpt_entry->byte_4_pd_hop_st,
-		       V2_MPT_BYTE_4_PBL_BA_PG_SZ_M,
-		       V2_MPT_BYTE_4_PBL_BA_PG_SZ_S,
-		       to_hr_hw_page_shift(mr->pbl_mtr.hem_cfg.ba_pg_shift));
-	roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_PD_M,
-		       V2_MPT_BYTE_4_PD_S, mr->pd);
+	hr_reg_write(mpt_entry, MPT_ST, V2_MPT_ST_FREE);
+	hr_reg_write(mpt_entry, MPT_PD, mr->pd);
+
+	hr_reg_enable(mpt_entry, MPT_RA_EN);
+	hr_reg_enable(mpt_entry, MPT_R_INV_EN);
+	hr_reg_enable(mpt_entry, MPT_L_INV_EN);

-	roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_RA_EN_S, 1);
-	roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_R_INV_EN_S, 1);
-	roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_L_INV_EN_S, 1);
+	hr_reg_enable(mpt_entry, MPT_FRE);
+	hr_reg_clear(mpt_entry, MPT_MR_MW);
+	hr_reg_enable(mpt_entry, MPT_BPD);
+	hr_reg_clear(mpt_entry, MPT_PA);

-	roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_FRE_S, 1);
-	roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_PA_S, 0);
-	roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_MR_MW_S, 0);
-	roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_BPD_S, 1);
+	hr_reg_write(mpt_entry, MPT_PBL_HOP_NUM, 1);
+	hr_reg_write(mpt_entry, MPT_PBL_BA_PG_SZ,
+		     to_hr_hw_page_shift(mr->pbl_mtr.hem_cfg.ba_pg_shift));
+	hr_reg_write(mpt_entry, MPT_PBL_BUF_PG_SZ,
+		     to_hr_hw_page_shift(mr->pbl_mtr.hem_cfg.buf_pg_shift));

 	mpt_entry->pbl_size = cpu_to_le32(mr->npages);

 	mpt_entry->pbl_ba_l = cpu_to_le32(lower_32_bits(pbl_ba >> 3));
-	roce_set_field(mpt_entry->byte_48_mode_ba, V2_MPT_BYTE_48_PBL_BA_H_M,
-		       V2_MPT_BYTE_48_PBL_BA_H_S,
-		       upper_32_bits(pbl_ba >> 3));
-
-	roce_set_field(mpt_entry->byte_64_buf_pa1,
-		       V2_MPT_BYTE_64_PBL_BUF_PG_SZ_M,
-		       V2_MPT_BYTE_64_PBL_BUF_PG_SZ_S,
-		       to_hr_hw_page_shift(mr->pbl_mtr.hem_cfg.buf_pg_shift));
+	hr_reg_write(mpt_entry, MPT_PBL_BA_H, upper_32_bits(pbl_ba >> 3));

 	return 0;
 }

@@ -3214,36 +3152,29 @@ static int hns_roce_v2_mw_write_mtpt(void *mb_buf, struct hns_roce_mw *mw)
 	mpt_entry = mb_buf;
 	memset(mpt_entry, 0, sizeof(*mpt_entry));

-	roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_MPT_ST_M,
-		       V2_MPT_BYTE_4_MPT_ST_S, V2_MPT_ST_FREE);
-	roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_PD_M,
-		       V2_MPT_BYTE_4_PD_S, mw->pdn);
-	roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_PBL_HOP_NUM_M,
-		       V2_MPT_BYTE_4_PBL_HOP_NUM_S,
-		       mw->pbl_hop_num == HNS_ROCE_HOP_NUM_0 ? 0 :
-		       mw->pbl_hop_num);
-	roce_set_field(mpt_entry->byte_4_pd_hop_st,
-		       V2_MPT_BYTE_4_PBL_BA_PG_SZ_M,
-		       V2_MPT_BYTE_4_PBL_BA_PG_SZ_S,
-		       mw->pbl_ba_pg_sz + PG_SHIFT_OFFSET);
-
-	roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_R_INV_EN_S, 1);
-	roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_L_INV_EN_S, 1);
-	roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_LW_EN_S, 1);
-
-	roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_PA_S, 0);
-	roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_MR_MW_S, 1);
-	roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_BPD_S, 1);
-	roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_BQP_S,
-		     mw->ibmw.type == IB_MW_TYPE_1 ? 0 : 1);
+	hr_reg_write(mpt_entry, MPT_ST, V2_MPT_ST_FREE);
+	hr_reg_write(mpt_entry, MPT_PD, mw->pdn);

-	roce_set_field(mpt_entry->byte_64_buf_pa1,
-		       V2_MPT_BYTE_64_PBL_BUF_PG_SZ_M,
-		       V2_MPT_BYTE_64_PBL_BUF_PG_SZ_S,
-		       mw->pbl_buf_pg_sz + PG_SHIFT_OFFSET);
+	hr_reg_enable(mpt_entry, MPT_R_INV_EN);
+	hr_reg_enable(mpt_entry, MPT_L_INV_EN);
+	hr_reg_enable(mpt_entry, MPT_LW_EN);
+
+	hr_reg_enable(mpt_entry, MPT_MR_MW);
+	hr_reg_enable(mpt_entry, MPT_BPD);
+	hr_reg_clear(mpt_entry, MPT_PA);
+	hr_reg_write(mpt_entry, MPT_BQP,
+		     mw->ibmw.type == IB_MW_TYPE_1 ? 0 : 1);

 	mpt_entry->lkey = cpu_to_le32(mw->rkey);

+	hr_reg_write(mpt_entry, MPT_PBL_HOP_NUM,
+		     mw->pbl_hop_num == HNS_ROCE_HOP_NUM_0 ? 0 :
+		     mw->pbl_hop_num);
+	hr_reg_write(mpt_entry, MPT_PBL_BA_PG_SZ,
+		     mw->pbl_ba_pg_sz + PG_SHIFT_OFFSET);
+	hr_reg_write(mpt_entry, MPT_PBL_BUF_PG_SZ,
+		     mw->pbl_buf_pg_sz + PG_SHIFT_OFFSET);
+
 	return 0;
 }
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index 9ec69ae7e58e..cd58be34dfed 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -776,12 +776,15 @@ struct hns_roce_v2_mpt_entry {
 #define MPT_LKEY MPT_FIELD_LOC(223, 192)
 #define MPT_VA MPT_FIELD_LOC(287, 224)
 #define MPT_PBL_SIZE MPT_FIELD_LOC(319, 288)
-#define MPT_PBL_BA MPT_FIELD_LOC(380, 320)
+#define MPT_PBL_BA_L MPT_FIELD_LOC(351, 320)
+#define MPT_PBL_BA_H MPT_FIELD_LOC(380, 352)
 #define MPT_BLK_MODE MPT_FIELD_LOC(381, 381)
 #define MPT_RSV0 MPT_FIELD_LOC(383, 382)
-#define MPT_PA0 MPT_FIELD_LOC(441, 384)
+#define MPT_PA0_L MPT_FIELD_LOC(415, 384)
+#define MPT_PA0_H MPT_FIELD_LOC(441, 416)
 #define MPT_BOUND_VA MPT_FIELD_LOC(447, 442)
-#define MPT_PA1 MPT_FIELD_LOC(505, 448)
+#define MPT_PA1_L MPT_FIELD_LOC(479, 448)
+#define MPT_PA1_H MPT_FIELD_LOC(505, 480)
 #define MPT_PERSIST_EN MPT_FIELD_LOC(506, 506)
 #define MPT_RSV2 MPT_FIELD_LOC(507, 507)
 #define MPT_PBL_BUF_PG_SZ MPT_FIELD_LOC(511, 508)
@@ -887,48 +890,24 @@ struct hns_roce_v2_ud_send_wqe {
 	u8 dgid[GID_LEN_V2];
 };

-#define V2_UD_SEND_WQE_BYTE_4_OPCODE_S 0
-#define V2_UD_SEND_WQE_BYTE_4_OPCODE_M GENMASK(4, 0)
-
-#define V2_UD_SEND_WQE_BYTE_4_OWNER_S 7
-
-#define V2_UD_SEND_WQE_BYTE_4_CQE_S 8
-
-#define V2_UD_SEND_WQE_BYTE_4_SE_S 11
-
-#define V2_UD_SEND_WQE_BYTE_16_PD_S 0
-#define V2_UD_SEND_WQE_BYTE_16_PD_M GENMASK(23, 0)
-
-#define V2_UD_SEND_WQE_BYTE_16_SGE_NUM_S 24
-#define V2_UD_SEND_WQE_BYTE_16_SGE_NUM_M GENMASK(31, 24)
-
-#define V2_UD_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S 0
-#define V2_UD_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_M GENMASK(23, 0)
-
-#define V2_UD_SEND_WQE_BYTE_24_UDPSPN_S 16
-#define V2_UD_SEND_WQE_BYTE_24_UDPSPN_M GENMASK(31, 16)
-
-#define V2_UD_SEND_WQE_BYTE_32_DQPN_S 0
-#define V2_UD_SEND_WQE_BYTE_32_DQPN_M GENMASK(23, 0)
-
-#define V2_UD_SEND_WQE_BYTE_36_VLAN_S 0
-#define V2_UD_SEND_WQE_BYTE_36_VLAN_M GENMASK(15, 0)
-
-#define V2_UD_SEND_WQE_BYTE_36_HOPLIMIT_S 16
-#define V2_UD_SEND_WQE_BYTE_36_HOPLIMIT_M GENMASK(23, 16)
-
-#define V2_UD_SEND_WQE_BYTE_36_TCLASS_S 24
-#define V2_UD_SEND_WQE_BYTE_36_TCLASS_M GENMASK(31, 24)
-
-#define V2_UD_SEND_WQE_BYTE_40_FLOW_LABEL_S 0
-#define V2_UD_SEND_WQE_BYTE_40_FLOW_LABEL_M GENMASK(19, 0)
-
-#define V2_UD_SEND_WQE_BYTE_40_SL_S 20
-#define V2_UD_SEND_WQE_BYTE_40_SL_M GENMASK(23, 20)
-
-#define V2_UD_SEND_WQE_BYTE_40_UD_VLAN_EN_S 30
-
-#define V2_UD_SEND_WQE_BYTE_40_LBI_S 31
+#define UD_SEND_WQE_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_v2_ud_send_wqe, h, l)
+
+#define UD_SEND_WQE_OPCODE UD_SEND_WQE_FIELD_LOC(4, 0)
+#define UD_SEND_WQE_OWNER UD_SEND_WQE_FIELD_LOC(7, 7)
+#define UD_SEND_WQE_CQE UD_SEND_WQE_FIELD_LOC(8, 8)
+#define UD_SEND_WQE_SE UD_SEND_WQE_FIELD_LOC(11, 11)
+#define UD_SEND_WQE_PD UD_SEND_WQE_FIELD_LOC(119, 96)
+#define UD_SEND_WQE_SGE_NUM UD_SEND_WQE_FIELD_LOC(127, 120)
+#define UD_SEND_WQE_MSG_START_SGE_IDX UD_SEND_WQE_FIELD_LOC(151, 128)
+#define UD_SEND_WQE_UDPSPN UD_SEND_WQE_FIELD_LOC(191, 176)
+#define UD_SEND_WQE_DQPN UD_SEND_WQE_FIELD_LOC(247, 224)
+#define UD_SEND_WQE_VLAN UD_SEND_WQE_FIELD_LOC(271, 256)
+#define UD_SEND_WQE_HOPLIMIT UD_SEND_WQE_FIELD_LOC(279, 272)
+#define UD_SEND_WQE_TCLASS UD_SEND_WQE_FIELD_LOC(287, 280)
+#define UD_SEND_WQE_FLOW_LABEL UD_SEND_WQE_FIELD_LOC(307, 288)
+#define UD_SEND_WQE_SL UD_SEND_WQE_FIELD_LOC(311, 308)
+#define UD_SEND_WQE_VLAN_EN UD_SEND_WQE_FIELD_LOC(318, 318)
+#define UD_SEND_WQE_LBI UD_SEND_WQE_FIELD_LOC(319, 319)

 struct hns_roce_v2_rc_send_wqe {
 	__le32 byte_4;
@@ -943,42 +922,23 @@ struct hns_roce_v2_rc_send_wqe {
 	__le64 va;
 };

-#define V2_RC_SEND_WQE_BYTE_4_OPCODE_S 0
-#define V2_RC_SEND_WQE_BYTE_4_OPCODE_M GENMASK(4, 0)
-
-#define V2_RC_SEND_WQE_BYTE_4_DB_SL_L_S 5
-#define V2_RC_SEND_WQE_BYTE_4_DB_SL_L_M GENMASK(6, 5)
-
-#define V2_RC_SEND_WQE_BYTE_4_DB_SL_H_S 13
-#define V2_RC_SEND_WQE_BYTE_4_DB_SL_H_M GENMASK(14, 13)
-
-#define V2_RC_SEND_WQE_BYTE_4_WQE_INDEX_S 15
-#define V2_RC_SEND_WQE_BYTE_4_WQE_INDEX_M GENMASK(30, 15)
-
-#define V2_RC_SEND_WQE_BYTE_4_OWNER_S 7
-
-#define V2_RC_SEND_WQE_BYTE_4_CQE_S 8
-
-#define V2_RC_SEND_WQE_BYTE_4_FENCE_S 9
-
-#define V2_RC_SEND_WQE_BYTE_4_SO_S 10
-
-#define V2_RC_SEND_WQE_BYTE_4_SE_S 11
-
-#define V2_RC_SEND_WQE_BYTE_4_INLINE_S 12
-
-#define V2_RC_SEND_WQE_BYTE_4_FLAG_S 31
-
-#define V2_RC_SEND_WQE_BYTE_16_XRC_SRQN_S 0
-#define V2_RC_SEND_WQE_BYTE_16_XRC_SRQN_M GENMASK(23, 0)
-
-#define V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S 24
-#define V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M GENMASK(31, 24)
-
-#define V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S 0
-#define V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_M GENMASK(23, 0)
-
-#define V2_RC_SEND_WQE_BYTE_20_INL_TYPE_S 31
+#define RC_SEND_WQE_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_v2_rc_send_wqe, h, l)
+
+#define RC_SEND_WQE_OPCODE RC_SEND_WQE_FIELD_LOC(4, 0)
+#define RC_SEND_WQE_DB_SL_L RC_SEND_WQE_FIELD_LOC(6, 5)
+#define RC_SEND_WQE_DB_SL_H RC_SEND_WQE_FIELD_LOC(14, 13)
+#define RC_SEND_WQE_OWNER RC_SEND_WQE_FIELD_LOC(7, 7)
+#define RC_SEND_WQE_CQE RC_SEND_WQE_FIELD_LOC(8, 8)
+#define RC_SEND_WQE_FENCE RC_SEND_WQE_FIELD_LOC(9, 9)
+#define RC_SEND_WQE_SO RC_SEND_WQE_FIELD_LOC(10, 10)
+#define RC_SEND_WQE_SE RC_SEND_WQE_FIELD_LOC(11, 11)
+#define RC_SEND_WQE_INLINE RC_SEND_WQE_FIELD_LOC(12, 12)
+#define RC_SEND_WQE_WQE_INDEX RC_SEND_WQE_FIELD_LOC(30, 15)
+#define RC_SEND_WQE_FLAG RC_SEND_WQE_FIELD_LOC(31, 31)
+#define RC_SEND_WQE_XRC_SRQN RC_SEND_WQE_FIELD_LOC(119, 96)
+#define RC_SEND_WQE_SGE_NUM RC_SEND_WQE_FIELD_LOC(127, 120)
+#define RC_SEND_WQE_MSG_START_SGE_IDX RC_SEND_WQE_FIELD_LOC(151, 128)
+#define RC_SEND_WQE_INL_TYPE RC_SEND_WQE_FIELD_LOC(159, 159)

 struct hns_roce_wqe_frmr_seg {
 	__le32 pbl_size;
@@ -1100,12 +1060,12 @@ struct hns_roce_vf_switch {
 	__le32 resv3;
 };

-#define VF_SWITCH_DATA_FUN_ID_VF_ID_S 3
-#define VF_SWITCH_DATA_FUN_ID_VF_ID_M GENMASK(10, 3)
+#define VF_SWITCH_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_vf_switch, h, l)

-#define VF_SWITCH_DATA_CFG_ALW_LPBK_S 1
-#define VF_SWITCH_DATA_CFG_ALW_LCL_LPBK_S 2
-#define VF_SWITCH_DATA_CFG_ALW_DST_OVRD_S 3
+#define VF_SWITCH_VF_ID VF_SWITCH_FIELD_LOC(42, 35)
+#define VF_SWITCH_ALW_LPBK VF_SWITCH_FIELD_LOC(65, 65)
+#define VF_SWITCH_ALW_LCL_LPBK VF_SWITCH_FIELD_LOC(66, 66)
+#define VF_SWITCH_ALW_DST_OVRD VF_SWITCH_FIELD_LOC(67, 67)

 struct hns_roce_post_mbox {
 	__le32 in_param_l;
@@ -1168,11 +1128,10 @@ struct hns_roce_cfg_sgid_tb {
 	__le32 vf_sgid_type_rsv;
 };

-#define CFG_SGID_TB_TABLE_IDX_S 0
-#define CFG_SGID_TB_TABLE_IDX_M GENMASK(7, 0)
+#define SGID_TB_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_cfg_sgid_tb, h, l)

-#define CFG_SGID_TB_VF_SGID_TYPE_S 0
-#define CFG_SGID_TB_VF_SGID_TYPE_M GENMASK(1, 0)
+#define CFG_SGID_TB_TABLE_IDX SGID_TB_FIELD_LOC(7, 0)
+#define CFG_SGID_TB_VF_SGID_TYPE SGID_TB_FIELD_LOC(161, 160)

 struct hns_roce_cfg_smac_tb {
 	__le32 tb_idx_rsv;
@@ -1180,11 +1139,11 @@ struct hns_roce_cfg_smac_tb {
 	__le32 vf_smac_h_rsv;
 	__le32 rsv[3];
 };
-#define CFG_SMAC_TB_IDX_S 0
-#define CFG_SMAC_TB_IDX_M GENMASK(7, 0)

-#define CFG_SMAC_TB_VF_SMAC_H_S 0
-#define CFG_SMAC_TB_VF_SMAC_H_M GENMASK(15, 0)
+#define SMAC_TB_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_cfg_smac_tb, h, l)
+
+#define CFG_SMAC_TB_IDX SMAC_TB_FIELD_LOC(7, 0)
+#define CFG_SMAC_TB_VF_SMAC_H SMAC_TB_FIELD_LOC(79, 64)

 struct hns_roce_cfg_gmv_tb_a {
 	__le32 vf_sgid_l;
@@ -1195,16 +1154,11 @@ struct hns_roce_cfg_gmv_tb_a {
 	__le32 resv;
 };

-#define CFG_GMV_TB_SGID_IDX_S 0
-#define CFG_GMV_TB_SGID_IDX_M GENMASK(7, 0)
-
-#define CFG_GMV_TB_VF_SGID_TYPE_S 0
-#define CFG_GMV_TB_VF_SGID_TYPE_M GENMASK(1, 0)
+#define GMV_TB_A_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_cfg_gmv_tb_a, h, l)

-#define CFG_GMV_TB_VF_VLAN_EN_S 2
-
-#define CFG_GMV_TB_VF_VLAN_ID_S 16
-#define CFG_GMV_TB_VF_VLAN_ID_M GENMASK(27, 16)
+#define GMV_TB_A_VF_SGID_TYPE GMV_TB_A_FIELD_LOC(129, 128)
+#define GMV_TB_A_VF_VLAN_EN GMV_TB_A_FIELD_LOC(130, 130)
+#define GMV_TB_A_VF_VLAN_ID GMV_TB_A_FIELD_LOC(155, 144)

 struct hns_roce_cfg_gmv_tb_b {
 	__le32 vf_smac_l;
@@ -1213,8 +1167,10 @@ struct hns_roce_cfg_gmv_tb_b {
 	__le32 resv[3];
 };

-#define CFG_GMV_TB_SMAC_H_S 0
-#define CFG_GMV_TB_SMAC_H_M GENMASK(15, 0)
+#define GMV_TB_B_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_cfg_gmv_tb_b, h, l)
+
+#define GMV_TB_B_SMAC_H GMV_TB_B_FIELD_LOC(47, 32)
+#define GMV_TB_B_SGID_IDX GMV_TB_B_FIELD_LOC(71, 64)

 #define HNS_ROCE_QUERY_PF_CAPS_CMD_NUM 5
 struct hns_roce_query_pf_caps_a {
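For readers unfamiliar with the FIELD_LOC(h, l) convention used throughout this patch: instead of a per-word shift (_S) and mask (_M) pair, each field is named by its absolute high and low bit positions within the whole structure, so GMV_TB_B_SMAC_H at (47, 32) means bits 15:0 of the second 32-bit word. The snippet below is a minimal, self-contained sketch of that idea; it is not the driver's actual FIELD_LOC/hr_reg_write implementation (which also handles __le32 endian conversion), and it assumes a field never crosses a 32-bit word boundary.

#include <stdint.h>
#include <stdio.h>

/* Encode a field as its absolute (high, low) bit positions. */
#define FIELD(h, l)	(((uint64_t)(h) << 32) | (l))

/* Write 'val' into the field inside an array of 32-bit words. */
static void reg_write(uint32_t *base, uint64_t field, uint32_t val)
{
	uint32_t hi = (uint32_t)(field >> 32);
	uint32_t lo = (uint32_t)field;
	uint32_t *word = base + lo / 32;	/* word holding the field */
	uint32_t shift = lo % 32;
	uint32_t width = hi - lo + 1;
	uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);

	*word = (*word & ~(mask << shift)) | ((val & mask) << shift);
}

int main(void)
{
	uint32_t tb[3] = { 0 };

	/* Toy equivalent of GMV_TB_B_SMAC_H at bits 47:32. */
	reg_write(tb, FIELD(47, 32), 0xBEEF);
	printf("word1 = 0x%08x\n", tb[1]);	/* prints 0x0000beef */
	return 0;
}

One define per field thus replaces the _S/_M pair, which is why the header diffs above shrink so much.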
From: Wenpeng Liang liangwenpeng@huawei.com
mainline inclusion from mainline-for-linus commit 813c980294d48362ead5422b056072ed214ca2bf category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=813...
----------------------------------------------------------------------
To reduce the code size and make the code clearer, replace all roce_get_xxx() with hr_reg_read() to read the data fields.
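As a rough illustration of why this shrinks the call sites: roce_get_field() needs the containing word plus a mask and a shift per field, while hr_reg_read() takes only the structure pointer and the field name. Below is a read-side sketch in the same toy model as the earlier example; again, this is not the driver's real helper (which also byte-swaps __le32 words), and the CEQE layout shown is just the two fields defined in this patch.

#include <stdint.h>
#include <stdio.h>

#define FIELD(h, l)	(((uint64_t)(h) << 32) | (l))

/* Extract the field from an array of 32-bit words. */
static uint32_t reg_read(const uint32_t *base, uint64_t field)
{
	uint32_t hi = (uint32_t)(field >> 32);
	uint32_t lo = (uint32_t)field;
	uint32_t shift = lo % 32;
	uint32_t width = hi - lo + 1;
	uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);

	return (base[lo / 32] >> shift) & mask;
}

int main(void)
{
	/* Toy "CEQE": CQN in bits 23:0, owner bit in bit 31. */
	uint32_t ceqe[1] = { 0x80000123 };

	printf("cqn = 0x%x, owner = %u\n",
	       reg_read(ceqe, FIELD(23, 0)),	/* 0x123 */
	       reg_read(ceqe, FIELD(31, 31)));	/* 1 */
	return 0;
}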
Link: https://lore.kernel.org/r/20220512080012.38728-3-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Zhengfeng Luo luozhengfeng@h-partners.com Reviewed-by: Yangyang Li liyangyang20@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/infiniband/hw/hns/hns_roce_device.h | 14 +- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 137 +++++---------- drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 158 +++++------------- drivers/infiniband/hw/hns/hns_roce_restrack.c | 49 ++---- 4 files changed, 104 insertions(+), 254 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index a7049095d050..0d160432fa65 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -129,8 +129,6 @@ enum hns_roce_event {
 	HNS_ROCE_EVENT_TYPE_INVALID_XRCETH = 0x17,
 };

-#define HNS_ROCE_CAP_FLAGS_EX_SHIFT 12
-
 enum {
 	HNS_ROCE_CAP_FLAG_REREG_MR = BIT(0),
 	HNS_ROCE_CAP_FLAG_ROCE_V1_V2 = BIT(1),
@@ -652,6 +650,11 @@ struct hns_roce_ceqe {
 	__le32 rsv[15];
 };

+#define CEQE_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_ceqe, h, l)
+
+#define CEQE_CQN CEQE_FIELD_LOC(23, 0)
+#define CEQE_OWNER CEQE_FIELD_LOC(31, 31)
+
 struct hns_roce_aeqe {
 	__le32 asyn;
 	union {
@@ -671,6 +674,13 @@ struct hns_roce_aeqe {
 	__le32 rsv[12];
 };

+#define AEQE_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_aeqe, h, l)
+
+#define AEQE_EVENT_TYPE AEQE_FIELD_LOC(7, 0)
+#define AEQE_SUB_TYPE AEQE_FIELD_LOC(15, 8)
+#define AEQE_OWNER AEQE_FIELD_LOC(31, 31)
+#define AEQE_EVENT_QUEUE_NUM AEQE_FIELD_LOC(55, 32)
+
 struct hns_roce_eq {
 	struct hns_roce_dev *hr_dev;
 	void __iomem *db_reg;
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index f182c155f264..b5ed2aee578b 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -1483,7 +1483,7 @@ static void __hns_roce_function_clear(struct hns_roce_dev *hr_dev, int vf_id)
 		if (ret)
 			continue;

-		if (roce_get_bit(resp->func_done, FUNC_CLEAR_RST_FUN_DONE_S)) {
+		if (hr_reg_read(resp, FUNC_CLEAR_RST_FUN_DONE)) {
 			if (vf_id == 0)
 				hr_dev->is_reset = true;
 			return;
@@ -2240,87 +2240,39 @@ static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev)
 	ctx_hop_num = resp_b->ctx_hop_num;
 	pbl_hop_num = resp_b->pbl_hop_num;

-	caps->num_pds = 1 << roce_get_field(resp_c->cap_flags_num_pds,
-					    V2_QUERY_PF_CAPS_C_NUM_PDS_M,
-					    V2_QUERY_PF_CAPS_C_NUM_PDS_S);
-	caps->flags = roce_get_field(resp_c->cap_flags_num_pds,
-				     V2_QUERY_PF_CAPS_C_CAP_FLAGS_M,
-				     V2_QUERY_PF_CAPS_C_CAP_FLAGS_S);
+	caps->num_pds = 1 << hr_reg_read(resp_c, PF_CAPS_C_NUM_PDS);
+
+	caps->flags = hr_reg_read(resp_c, PF_CAPS_C_CAP_FLAGS);
 	caps->flags |= le16_to_cpu(resp_d->cap_flags_ex) <<
 		       HNS_ROCE_CAP_FLAGS_EX_SHIFT;

-	caps->num_cqs = 1 << roce_get_field(resp_c->max_gid_num_cqs,
-					    V2_QUERY_PF_CAPS_C_NUM_CQS_M,
-					    V2_QUERY_PF_CAPS_C_NUM_CQS_S);
-	caps->gid_table_len[0] = roce_get_field(resp_c->max_gid_num_cqs,
-						V2_QUERY_PF_CAPS_C_MAX_GID_M,
-						V2_QUERY_PF_CAPS_C_MAX_GID_S);
-
-	caps->max_cqes = 1 << roce_get_field(resp_c->cq_depth,
-					     V2_QUERY_PF_CAPS_C_CQ_DEPTH_M,
-					     V2_QUERY_PF_CAPS_C_CQ_DEPTH_S);
-	caps->num_mtpts = 1 << roce_get_field(resp_c->num_mrws,
-					      V2_QUERY_PF_CAPS_C_NUM_MRWS_M,
-					      V2_QUERY_PF_CAPS_C_NUM_MRWS_S);
-	caps->num_qps = 1 << roce_get_field(resp_c->ord_num_qps,
-					    V2_QUERY_PF_CAPS_C_NUM_QPS_M,
-					    V2_QUERY_PF_CAPS_C_NUM_QPS_S);
-	caps->max_qp_init_rdma = roce_get_field(resp_c->ord_num_qps,
-						V2_QUERY_PF_CAPS_C_MAX_ORD_M,
-						V2_QUERY_PF_CAPS_C_MAX_ORD_S);
+	caps->num_cqs = 1 << hr_reg_read(resp_c, PF_CAPS_C_NUM_CQS);
+	caps->gid_table_len[0] = hr_reg_read(resp_c, PF_CAPS_C_MAX_GID);
+	caps->max_cqes = 1 << hr_reg_read(resp_c, PF_CAPS_C_CQ_DEPTH);
+	caps->num_mtpts = 1 << hr_reg_read(resp_c, PF_CAPS_C_NUM_MRWS);
+	caps->num_qps = 1 << hr_reg_read(resp_c, PF_CAPS_C_NUM_QPS);
+	caps->max_qp_init_rdma = hr_reg_read(resp_c, PF_CAPS_C_MAX_ORD);
 	caps->max_qp_dest_rdma = caps->max_qp_init_rdma;
 	caps->max_wqes = 1 << le16_to_cpu(resp_c->sq_depth);
-	caps->num_srqs = 1 << roce_get_field(resp_d->wq_hop_num_max_srqs,
-					     V2_QUERY_PF_CAPS_D_NUM_SRQS_M,
-					     V2_QUERY_PF_CAPS_D_NUM_SRQS_S);
-	caps->cong_type = roce_get_field(resp_d->wq_hop_num_max_srqs,
-					 V2_QUERY_PF_CAPS_D_CONG_TYPE_M,
-					 V2_QUERY_PF_CAPS_D_CONG_TYPE_S);
-	caps->max_srq_wrs = 1 << le16_to_cpu(resp_d->srq_depth);
-	caps->ceqe_depth = 1 << roce_get_field(resp_d->num_ceqs_ceq_depth,
-					       V2_QUERY_PF_CAPS_D_CEQ_DEPTH_M,
-					       V2_QUERY_PF_CAPS_D_CEQ_DEPTH_S);
-	caps->num_comp_vectors = roce_get_field(resp_d->num_ceqs_ceq_depth,
-						V2_QUERY_PF_CAPS_D_NUM_CEQS_M,
-						V2_QUERY_PF_CAPS_D_NUM_CEQS_S);
-
-	caps->aeqe_depth = 1 << roce_get_field(resp_d->arm_st_aeq_depth,
-					       V2_QUERY_PF_CAPS_D_AEQ_DEPTH_M,
-					       V2_QUERY_PF_CAPS_D_AEQ_DEPTH_S);
-	caps->default_aeq_arm_st = roce_get_field(resp_d->arm_st_aeq_depth,
-						  V2_QUERY_PF_CAPS_D_AEQ_ARM_ST_M,
-						  V2_QUERY_PF_CAPS_D_AEQ_ARM_ST_S);
-	caps->default_ceq_arm_st = roce_get_field(resp_d->arm_st_aeq_depth,
-						  V2_QUERY_PF_CAPS_D_CEQ_ARM_ST_M,
-						  V2_QUERY_PF_CAPS_D_CEQ_ARM_ST_S);
-	caps->reserved_pds = roce_get_field(resp_d->num_uars_rsv_pds,
-					    V2_QUERY_PF_CAPS_D_RSV_PDS_M,
-					    V2_QUERY_PF_CAPS_D_RSV_PDS_S);
-	caps->num_uars = 1 << roce_get_field(resp_d->num_uars_rsv_pds,
-					     V2_QUERY_PF_CAPS_D_NUM_UARS_M,
-					     V2_QUERY_PF_CAPS_D_NUM_UARS_S);
-	caps->reserved_qps = roce_get_field(resp_d->rsv_uars_rsv_qps,
-					    V2_QUERY_PF_CAPS_D_RSV_QPS_M,
-					    V2_QUERY_PF_CAPS_D_RSV_QPS_S);
-	caps->reserved_uars = roce_get_field(resp_d->rsv_uars_rsv_qps,
-					    V2_QUERY_PF_CAPS_D_RSV_UARS_M,
-					    V2_QUERY_PF_CAPS_D_RSV_UARS_S);
-	caps->reserved_mrws = roce_get_field(resp_e->chunk_size_shift_rsv_mrws,
-					     V2_QUERY_PF_CAPS_E_RSV_MRWS_M,
-					     V2_QUERY_PF_CAPS_E_RSV_MRWS_S);
-	caps->chunk_sz = 1 << roce_get_field(resp_e->chunk_size_shift_rsv_mrws,
-					     V2_QUERY_PF_CAPS_E_CHUNK_SIZE_SHIFT_M,
-					     V2_QUERY_PF_CAPS_E_CHUNK_SIZE_SHIFT_S);
-	caps->reserved_cqs = roce_get_field(resp_e->rsv_cqs,
-					    V2_QUERY_PF_CAPS_E_RSV_CQS_M,
-					    V2_QUERY_PF_CAPS_E_RSV_CQS_S);
-	caps->reserved_srqs = roce_get_field(resp_e->rsv_srqs,
-					     V2_QUERY_PF_CAPS_E_RSV_SRQS_M,
-					     V2_QUERY_PF_CAPS_E_RSV_SRQS_S);
-	caps->reserved_lkey = roce_get_field(resp_e->rsv_lkey,
-					     V2_QUERY_PF_CAPS_E_RSV_LKEYS_M,
-					     V2_QUERY_PF_CAPS_E_RSV_LKEYS_S);
+	caps->num_srqs = 1 << hr_reg_read(resp_d, PF_CAPS_D_NUM_SRQS);
+	caps->cong_type = hr_reg_read(resp_d, PF_CAPS_D_CONG_TYPE);
+	caps->max_srq_wrs = 1 << le16_to_cpu(resp_d->srq_depth);
+	caps->ceqe_depth = 1 << hr_reg_read(resp_d, PF_CAPS_D_CEQ_DEPTH);
+	caps->num_comp_vectors = hr_reg_read(resp_d, PF_CAPS_D_NUM_CEQS);
+	caps->aeqe_depth = 1 << hr_reg_read(resp_d, PF_CAPS_D_AEQ_DEPTH);
+	caps->default_aeq_arm_st = hr_reg_read(resp_d, PF_CAPS_D_AEQ_ARM_ST);
+	caps->default_ceq_arm_st = hr_reg_read(resp_d, PF_CAPS_D_CEQ_ARM_ST);
+	caps->reserved_pds = hr_reg_read(resp_d, PF_CAPS_D_RSV_PDS);
+	caps->num_uars = 1 << hr_reg_read(resp_d, PF_CAPS_D_NUM_UARS);
+	caps->reserved_qps = hr_reg_read(resp_d, PF_CAPS_D_RSV_QPS);
+	caps->reserved_uars = hr_reg_read(resp_d, PF_CAPS_D_RSV_UARS);
+
+	caps->reserved_mrws = hr_reg_read(resp_e, PF_CAPS_E_RSV_MRWS);
+	caps->chunk_sz = 1 << hr_reg_read(resp_e, PF_CAPS_E_CHUNK_SIZE_SHIFT);
+	caps->reserved_cqs = hr_reg_read(resp_e, PF_CAPS_E_RSV_CQS);
+	caps->reserved_srqs = hr_reg_read(resp_e, PF_CAPS_E_RSV_SRQS);
+	caps->reserved_lkey = hr_reg_read(resp_e, PF_CAPS_E_RSV_LKEYS);
 	caps->default_ceq_max_cnt = le16_to_cpu(resp_e->ceq_max_cnt);
 	caps->default_ceq_period = le16_to_cpu(resp_e->ceq_period);
 	caps->default_aeq_max_cnt = le16_to_cpu(resp_e->aeq_max_cnt);
@@ -2335,15 +2287,9 @@ static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev)
 	caps->cqe_hop_num = pbl_hop_num;
 	caps->srqwqe_hop_num = pbl_hop_num;
 	caps->idx_hop_num = pbl_hop_num;
-	caps->wqe_sq_hop_num = roce_get_field(resp_d->wq_hop_num_max_srqs,
-					      V2_QUERY_PF_CAPS_D_SQWQE_HOP_NUM_M,
-					      V2_QUERY_PF_CAPS_D_SQWQE_HOP_NUM_S);
-	caps->wqe_sge_hop_num = roce_get_field(resp_d->wq_hop_num_max_srqs,
-					       V2_QUERY_PF_CAPS_D_EX_SGE_HOP_NUM_M,
-					       V2_QUERY_PF_CAPS_D_EX_SGE_HOP_NUM_S);
-	caps->wqe_rq_hop_num = roce_get_field(resp_d->wq_hop_num_max_srqs,
-					      V2_QUERY_PF_CAPS_D_RQWQE_HOP_NUM_M,
-					      V2_QUERY_PF_CAPS_D_RQWQE_HOP_NUM_S);
+	caps->wqe_sq_hop_num = hr_reg_read(resp_d, PF_CAPS_D_SQWQE_HOP_NUM);
+	caps->wqe_sge_hop_num = hr_reg_read(resp_d, PF_CAPS_D_EX_SGE_HOP_NUM);
+	caps->wqe_rq_hop_num = hr_reg_read(resp_d, PF_CAPS_D_RQWQE_HOP_NUM);

 	return 0;
 }
@@ -5551,7 +5497,7 @@ static struct hns_roce_aeqe *next_aeqe_sw_v2(struct hns_roce_eq *eq)
 		   (eq->cons_index & (eq->entries - 1)) * eq->eqe_size);

-	return (roce_get_bit(aeqe->asyn, HNS_ROCE_V2_AEQ_AEQE_OWNER_S) ^
+	return (hr_reg_read(aeqe, AEQE_OWNER) ^
 		!!(eq->cons_index & eq->entries)) ? aeqe : NULL;
 }

@@ -5571,15 +5517,9 @@ static int hns_roce_v2_aeq_int(struct hns_roce_dev *hr_dev,
 		 */
 		dma_rmb();

-		event_type = roce_get_field(aeqe->asyn,
-					    HNS_ROCE_V2_AEQE_EVENT_TYPE_M,
-					    HNS_ROCE_V2_AEQE_EVENT_TYPE_S);
-		sub_type = roce_get_field(aeqe->asyn,
-					  HNS_ROCE_V2_AEQE_SUB_TYPE_M,
-					  HNS_ROCE_V2_AEQE_SUB_TYPE_S);
-		queue_num = roce_get_field(aeqe->event.queue_event.num,
-					   HNS_ROCE_V2_AEQE_EVENT_QUEUE_NUM_M,
-					   HNS_ROCE_V2_AEQE_EVENT_QUEUE_NUM_S);
+		event_type = hr_reg_read(aeqe, AEQE_EVENT_TYPE);
+		sub_type = hr_reg_read(aeqe, AEQE_SUB_TYPE);
+		queue_num = hr_reg_read(aeqe, AEQE_EVENT_QUEUE_NUM);

 		switch (event_type) {
 		case HNS_ROCE_EVENT_TYPE_PATH_MIG:
@@ -5639,8 +5579,8 @@ static struct hns_roce_ceqe *next_ceqe_sw_v2(struct hns_roce_eq *eq)
 		   (eq->cons_index & (eq->entries - 1)) * eq->eqe_size);

-	return (!!(roce_get_bit(ceqe->comp, HNS_ROCE_V2_CEQ_CEQE_OWNER_S))) ^
-		(!!(eq->cons_index & eq->entries)) ? ceqe : NULL;
+	return (hr_reg_read(ceqe, CEQE_OWNER) ^
+		!!(eq->cons_index & eq->entries)) ? ceqe : NULL;
 }

 static int hns_roce_v2_ceq_int(struct hns_roce_dev *hr_dev,
@@ -5656,8 +5596,7 @@ static int hns_roce_v2_ceq_int(struct hns_roce_dev *hr_dev,
 		 */
 		dma_rmb();

-		cqn = roce_get_field(ceqe->comp, HNS_ROCE_V2_CEQE_COMP_CQN_M,
-				     HNS_ROCE_V2_CEQE_COMP_CQN_S);
+		cqn = hr_reg_read(ceqe, CEQE_CQN);

 		hns_roce_cq_completion(hr_dev, cqn);
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index cd58be34dfed..a3a2524a5e25 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -291,33 +291,6 @@ struct hns_roce_v2_cq_context {
 #define HNS_ROCE_V2_CQ_DEFAULT_BURST_NUM 0x0
 #define HNS_ROCE_V2_CQ_DEFAULT_INTERVAL 0x0

-#define V2_CQC_BYTE_4_ARM_ST_S 6
-#define V2_CQC_BYTE_4_ARM_ST_M GENMASK(7, 6)
-
-#define V2_CQC_BYTE_4_CEQN_S 15
-#define V2_CQC_BYTE_4_CEQN_M GENMASK(23, 15)
-
-#define V2_CQC_BYTE_8_CQN_S 0
-#define V2_CQC_BYTE_8_CQN_M GENMASK(23, 0)
-
-#define V2_CQC_BYTE_16_CQE_HOP_NUM_S 30
-#define V2_CQC_BYTE_16_CQE_HOP_NUM_M GENMASK(31, 30)
-
-#define V2_CQC_BYTE_28_CQ_PRODUCER_IDX_S 0
-#define V2_CQC_BYTE_28_CQ_PRODUCER_IDX_M GENMASK(23, 0)
-
-#define V2_CQC_BYTE_32_CQ_CONSUMER_IDX_S 0
-#define V2_CQC_BYTE_32_CQ_CONSUMER_IDX_M GENMASK(23, 0)
-
-#define V2_CQC_BYTE_52_CQE_CNT_S 0
-#define V2_CQC_BYTE_52_CQE_CNT_M GENMASK(23, 0)
-
-#define V2_CQC_BYTE_56_CQ_MAX_CNT_S 0
-#define V2_CQC_BYTE_56_CQ_MAX_CNT_M GENMASK(15, 0)
-
-#define V2_CQC_BYTE_56_CQ_PERIOD_S 16
-#define V2_CQC_BYTE_56_CQ_PERIOD_M GENMASK(31, 16)
-
 #define CQC_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_v2_cq_context, h, l)

 #define CQC_CQ_ST CQC_FIELD_LOC(1, 0)
@@ -981,7 +954,10 @@ struct hns_roce_func_clear {
 	__le32 rsv[4];
 };

-#define FUNC_CLEAR_RST_FUN_DONE_S 0
+#define FUNC_CLEAR_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_func_clear, h, l)
+
+#define FUNC_CLEAR_RST_FUN_DONE FUNC_CLEAR_FIELD_LOC(32, 32)
+
 /* Each physical function manages up to 248 virtual functions, it takes up to
  * 100ms for each function to execute clear. If an abnormal reset occurs, it is
  * executed twice at most, so it takes up to 249 * 2 * 100ms.
@@ -1222,29 +1198,17 @@ struct hns_roce_query_pf_caps_c {
 	__le16 rq_depth;
 };

-#define V2_QUERY_PF_CAPS_C_NUM_PDS_S 0
-#define V2_QUERY_PF_CAPS_C_NUM_PDS_M GENMASK(19, 0)
-
-#define V2_QUERY_PF_CAPS_C_CAP_FLAGS_S 20
-#define V2_QUERY_PF_CAPS_C_CAP_FLAGS_M GENMASK(31, 20)
-
-#define V2_QUERY_PF_CAPS_C_NUM_CQS_S 0
-#define V2_QUERY_PF_CAPS_C_NUM_CQS_M GENMASK(19, 0)
-
-#define V2_QUERY_PF_CAPS_C_MAX_GID_S 20
-#define V2_QUERY_PF_CAPS_C_MAX_GID_M GENMASK(28, 20)
-
-#define V2_QUERY_PF_CAPS_C_CQ_DEPTH_S 0
-#define V2_QUERY_PF_CAPS_C_CQ_DEPTH_M GENMASK(22, 0)
-
-#define V2_QUERY_PF_CAPS_C_NUM_MRWS_S 0
-#define V2_QUERY_PF_CAPS_C_NUM_MRWS_M GENMASK(19, 0)
-
-#define V2_QUERY_PF_CAPS_C_NUM_QPS_S 0
-#define V2_QUERY_PF_CAPS_C_NUM_QPS_M GENMASK(19, 0)
-
-#define V2_QUERY_PF_CAPS_C_MAX_ORD_S 20
-#define V2_QUERY_PF_CAPS_C_MAX_ORD_M GENMASK(27, 20)
+#define PF_CAPS_C_FIELD_LOC(h, l) \
+	FIELD_LOC(struct hns_roce_query_pf_caps_c, h, l)
+
+#define PF_CAPS_C_NUM_PDS PF_CAPS_C_FIELD_LOC(19, 0)
+#define PF_CAPS_C_CAP_FLAGS PF_CAPS_C_FIELD_LOC(31, 20)
+#define PF_CAPS_C_NUM_CQS PF_CAPS_C_FIELD_LOC(51, 32)
+#define PF_CAPS_C_MAX_GID PF_CAPS_C_FIELD_LOC(60, 52)
+#define PF_CAPS_C_CQ_DEPTH PF_CAPS_C_FIELD_LOC(86, 64)
+#define PF_CAPS_C_NUM_MRWS PF_CAPS_C_FIELD_LOC(115, 96)
+#define PF_CAPS_C_NUM_QPS PF_CAPS_C_FIELD_LOC(147, 128)
+#define PF_CAPS_C_MAX_ORD PF_CAPS_C_FIELD_LOC(155, 148)

 struct hns_roce_query_pf_caps_d {
 	__le32 wq_hop_num_max_srqs;
@@ -1255,20 +1219,26 @@ struct hns_roce_query_pf_caps_d {
 	__le32 num_uars_rsv_pds;
 	__le32 rsv_uars_rsv_qps;
 };
-#define V2_QUERY_PF_CAPS_D_NUM_SRQS_S 0
-#define V2_QUERY_PF_CAPS_D_NUM_SRQS_M GENMASK(19, 0)
-
-#define V2_QUERY_PF_CAPS_D_RQWQE_HOP_NUM_S 20
-#define V2_QUERY_PF_CAPS_D_RQWQE_HOP_NUM_M GENMASK(21, 20)
-
-#define V2_QUERY_PF_CAPS_D_EX_SGE_HOP_NUM_S 22
-#define V2_QUERY_PF_CAPS_D_EX_SGE_HOP_NUM_M GENMASK(23, 22)
-
-#define V2_QUERY_PF_CAPS_D_SQWQE_HOP_NUM_S 24
-#define V2_QUERY_PF_CAPS_D_SQWQE_HOP_NUM_M GENMASK(25, 24)
-
-#define V2_QUERY_PF_CAPS_D_CONG_TYPE_S 26
-#define V2_QUERY_PF_CAPS_D_CONG_TYPE_M GENMASK(29, 26)
+#define PF_CAPS_D_FIELD_LOC(h, l) \
+	FIELD_LOC(struct hns_roce_query_pf_caps_d, h, l)
+
+#define PF_CAPS_D_NUM_SRQS PF_CAPS_D_FIELD_LOC(19, 0)
+#define PF_CAPS_D_RQWQE_HOP_NUM PF_CAPS_D_FIELD_LOC(21, 20)
+#define PF_CAPS_D_EX_SGE_HOP_NUM PF_CAPS_D_FIELD_LOC(23, 22)
+#define PF_CAPS_D_SQWQE_HOP_NUM PF_CAPS_D_FIELD_LOC(25, 24)
+#define PF_CAPS_D_CONG_TYPE PF_CAPS_D_FIELD_LOC(29, 26)
+#define PF_CAPS_D_CEQ_DEPTH PF_CAPS_D_FIELD_LOC(85, 64)
+#define PF_CAPS_D_NUM_CEQS PF_CAPS_D_FIELD_LOC(95, 86)
+#define PF_CAPS_D_AEQ_DEPTH PF_CAPS_D_FIELD_LOC(117, 96)
+#define PF_CAPS_D_AEQ_ARM_ST PF_CAPS_D_FIELD_LOC(119, 118)
+#define PF_CAPS_D_CEQ_ARM_ST PF_CAPS_D_FIELD_LOC(121, 120)
+#define PF_CAPS_D_RSV_PDS PF_CAPS_D_FIELD_LOC(147, 128)
+#define PF_CAPS_D_NUM_UARS PF_CAPS_D_FIELD_LOC(155, 148)
+#define PF_CAPS_D_RSV_QPS PF_CAPS_D_FIELD_LOC(179, 160)
+#define PF_CAPS_D_RSV_UARS PF_CAPS_D_FIELD_LOC(187, 180)
+
+#define HNS_ROCE_CAP_FLAGS_EX_SHIFT 12

 struct hns_roce_congestion_algorithm {
 	u8 alg_sel;
@@ -1277,33 +1247,6 @@ struct hns_roce_congestion_algorithm {
 	u8 wnd_mode_sel;
 };

-#define V2_QUERY_PF_CAPS_D_CEQ_DEPTH_S 0
-#define V2_QUERY_PF_CAPS_D_CEQ_DEPTH_M GENMASK(21, 0)
-
-#define V2_QUERY_PF_CAPS_D_NUM_CEQS_S 22
-#define V2_QUERY_PF_CAPS_D_NUM_CEQS_M GENMASK(31, 22)
-
-#define V2_QUERY_PF_CAPS_D_AEQ_DEPTH_S 0
-#define V2_QUERY_PF_CAPS_D_AEQ_DEPTH_M GENMASK(21, 0)
-
-#define V2_QUERY_PF_CAPS_D_AEQ_ARM_ST_S 22
-#define V2_QUERY_PF_CAPS_D_AEQ_ARM_ST_M GENMASK(23, 22)
-
-#define V2_QUERY_PF_CAPS_D_CEQ_ARM_ST_S 24
-#define V2_QUERY_PF_CAPS_D_CEQ_ARM_ST_M GENMASK(25, 24)
-
-#define V2_QUERY_PF_CAPS_D_RSV_PDS_S 0
-#define V2_QUERY_PF_CAPS_D_RSV_PDS_M GENMASK(19, 0)
-
-#define V2_QUERY_PF_CAPS_D_NUM_UARS_S 20
-#define V2_QUERY_PF_CAPS_D_NUM_UARS_M GENMASK(27, 20)
-
-#define V2_QUERY_PF_CAPS_D_RSV_QPS_S 0
-#define V2_QUERY_PF_CAPS_D_RSV_QPS_M GENMASK(19, 0)
-
-#define V2_QUERY_PF_CAPS_D_RSV_UARS_S 20
-#define V2_QUERY_PF_CAPS_D_RSV_UARS_M GENMASK(27, 20)
-
 struct hns_roce_query_pf_caps_e {
 	__le32 chunk_size_shift_rsv_mrws;
 	__le32 rsv_cqs;
@@ -1315,20 +1258,14 @@ struct hns_roce_query_pf_caps_e {
 	__le16 aeq_period;
 };

-#define V2_QUERY_PF_CAPS_E_RSV_MRWS_S 0
-#define V2_QUERY_PF_CAPS_E_RSV_MRWS_M GENMASK(19, 0)
-
-#define V2_QUERY_PF_CAPS_E_CHUNK_SIZE_SHIFT_S 20
-#define V2_QUERY_PF_CAPS_E_CHUNK_SIZE_SHIFT_M GENMASK(31, 20)
-
-#define V2_QUERY_PF_CAPS_E_RSV_CQS_S 0
-#define V2_QUERY_PF_CAPS_E_RSV_CQS_M GENMASK(19, 0)
+#define PF_CAPS_E_FIELD_LOC(h, l) \
+	FIELD_LOC(struct hns_roce_query_pf_caps_e, h, l)

-#define V2_QUERY_PF_CAPS_E_RSV_SRQS_S 0
-#define V2_QUERY_PF_CAPS_E_RSV_SRQS_M GENMASK(19, 0)
-
-#define V2_QUERY_PF_CAPS_E_RSV_LKEYS_S 0
-#define V2_QUERY_PF_CAPS_E_RSV_LKEYS_M GENMASK(19, 0)
+#define PF_CAPS_E_RSV_MRWS PF_CAPS_E_FIELD_LOC(19, 0)
+#define PF_CAPS_E_CHUNK_SIZE_SHIFT PF_CAPS_E_FIELD_LOC(31, 20)
+#define PF_CAPS_E_RSV_CQS PF_CAPS_E_FIELD_LOC(51, 32)
+#define PF_CAPS_E_RSV_SRQS PF_CAPS_E_FIELD_LOC(83, 64)
+#define PF_CAPS_E_RSV_LKEYS PF_CAPS_E_FIELD_LOC(115, 96)

 struct hns_roce_cmq_req {
 	__le32 data[6];
@@ -1413,9 +1350,6 @@ struct hns_roce_dip {
 #define HNS_ROCE_EQ_INIT_CONS_IDX 0
 #define HNS_ROCE_EQ_INIT_NXT_EQE_BA 0

-#define HNS_ROCE_V2_CEQ_CEQE_OWNER_S 31
-#define HNS_ROCE_V2_AEQ_AEQE_OWNER_S 31
-
 #define HNS_ROCE_V2_COMP_EQE_NUM 0x1000
 #define HNS_ROCE_V2_ASYNC_EQE_NUM 0x1000

@@ -1472,18 +1406,6 @@ struct hns_roce_eq_context {
 #define EQC_NEX_EQE_BA_H EQC_FIELD_LOC(339, 320)
 #define EQC_EQE_SIZE EQC_FIELD_LOC(341, 340)

-#define HNS_ROCE_V2_CEQE_COMP_CQN_S 0
-#define HNS_ROCE_V2_CEQE_COMP_CQN_M GENMASK(23, 0)
-
-#define HNS_ROCE_V2_AEQE_EVENT_TYPE_S 0
-#define HNS_ROCE_V2_AEQE_EVENT_TYPE_M GENMASK(7, 0)
-
-#define HNS_ROCE_V2_AEQE_SUB_TYPE_S 8
-#define HNS_ROCE_V2_AEQE_SUB_TYPE_M GENMASK(15, 8)
-
-#define HNS_ROCE_V2_AEQE_EVENT_QUEUE_NUM_S 0
-#define HNS_ROCE_V2_AEQE_EVENT_QUEUE_NUM_M GENMASK(23, 0)
-
 #define MAX_SERVICE_LEVEL 0x7
 struct hns_roce_wqe_atomic_seg {
diff --git a/drivers/infiniband/hw/hns/hns_roce_restrack.c b/drivers/infiniband/hw/hns/hns_roce_restrack.c
index 259444c0a630..24a154d64630 100644
--- a/drivers/infiniband/hw/hns/hns_roce_restrack.c
+++ b/drivers/infiniband/hw/hns/hns_roce_restrack.c
@@ -13,61 +13,40 @@ static int hns_roce_fill_cq(struct sk_buff *msg,
 			    struct hns_roce_v2_cq_context *context)
 {
 	if (rdma_nl_put_driver_u32(msg, "state",
-				   roce_get_field(context->byte_4_pg_ceqn,
-						  V2_CQC_BYTE_4_ARM_ST_M,
-						  V2_CQC_BYTE_4_ARM_ST_S)))
+				   hr_reg_read(context, CQC_ARM_ST)))
+		goto err;

 	if (rdma_nl_put_driver_u32(msg, "ceqn",
-				   roce_get_field(context->byte_4_pg_ceqn,
-						  V2_CQC_BYTE_4_CEQN_M,
-						  V2_CQC_BYTE_4_CEQN_S)))
+				   hr_reg_read(context, CQC_CEQN)))
 		goto err;

 	if (rdma_nl_put_driver_u32(msg, "cqn",
-				   roce_get_field(context->byte_8_cqn,
-						  V2_CQC_BYTE_8_CQN_M,
-						  V2_CQC_BYTE_8_CQN_S)))
+				   hr_reg_read(context, CQC_CQN)))
 		goto err;

 	if (rdma_nl_put_driver_u32(msg, "hopnum",
-				   roce_get_field(context->byte_16_hop_addr,
-						  V2_CQC_BYTE_16_CQE_HOP_NUM_M,
-						  V2_CQC_BYTE_16_CQE_HOP_NUM_S)))
+				   hr_reg_read(context, CQC_CQE_HOP_NUM)))
 		goto err;

-	if (rdma_nl_put_driver_u32(
-		    msg, "pi",
-		    roce_get_field(context->byte_28_cq_pi,
-				   V2_CQC_BYTE_28_CQ_PRODUCER_IDX_M,
-				   V2_CQC_BYTE_28_CQ_PRODUCER_IDX_S)))
+	if (rdma_nl_put_driver_u32(msg, "pi",
+				   hr_reg_read(context, CQC_CQ_PRODUCER_IDX)))
 		goto err;

-	if (rdma_nl_put_driver_u32(
-		    msg, "ci",
-		    roce_get_field(context->byte_32_cq_ci,
-				   V2_CQC_BYTE_32_CQ_CONSUMER_IDX_M,
-				   V2_CQC_BYTE_32_CQ_CONSUMER_IDX_S)))
+	if (rdma_nl_put_driver_u32(msg, "ci",
+				   hr_reg_read(context, CQC_CQ_CONSUMER_IDX)))
 		goto err;

-	if (rdma_nl_put_driver_u32(
-		    msg, "coalesce",
-		    roce_get_field(context->byte_56_cqe_period_maxcnt,
-				   V2_CQC_BYTE_56_CQ_MAX_CNT_M,
-				   V2_CQC_BYTE_56_CQ_MAX_CNT_S)))
+	if (rdma_nl_put_driver_u32(msg, "coalesce",
+				   hr_reg_read(context, CQC_CQ_MAX_CNT)))
 		goto err;

-	if (rdma_nl_put_driver_u32(
-		    msg, "period",
-		    roce_get_field(context->byte_56_cqe_period_maxcnt,
-				   V2_CQC_BYTE_56_CQ_PERIOD_M,
-				   V2_CQC_BYTE_56_CQ_PERIOD_S)))
+	if (rdma_nl_put_driver_u32(msg, "period",
+				   hr_reg_read(context, CQC_CQ_PERIOD)))
 		goto err;

 	if (rdma_nl_put_driver_u32(msg, "cnt",
-				   roce_get_field(context->byte_52_cqe_cnt,
-						  V2_CQC_BYTE_52_CQE_CNT_M,
-						  V2_CQC_BYTE_52_CQE_CNT_S)))
+				   hr_reg_read(context, CQC_CQE_CNT)))
 		goto err;

 	return 0;
From: Chen Zhongjin chenzhongjin@huawei.com
maillist inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5EU5D CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?...
--------------------------------
csdlock_debug is an early_param used to enable the csd_lock_wait feature.

It calls static_branch_enable(), which triggers a bug at boot time: in the early_param stage, static_branch_enable() calls __page_to_pfn() before sparse_init() has run.

This causes a panic on arm64 when CONFIG_SPARSEMEM_VMEMMAP=n, so change the early_param to a __setup parameter to avoid the problem.
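For context, the two registration macros differ both in when the handler runs and in what it must return: an early_param handler is parsed very early in boot (before sparse_init() on this path) and returns 0 on success, while a __setup handler runs later, once the memory model is initialized, and returns 1 to mark the option as consumed (returning 0 would pass the option on to init). This is why the diff below changes both the return value and the macro. A minimal sketch using a hypothetical "my_debug" option (illustration only, not the patch itself):

#include <linux/init.h>
#include <linux/kernel.h>

static bool my_debug_enabled;

/* Runs late enough in boot that static keys are safe to flip. */
static int __init my_debug_setup(char *str)
{
	bool val;

	if (!kstrtobool(str, &val) && val)
		my_debug_enabled = true;

	return 1;	/* __setup convention: 1 means "option handled" */
}
__setup("my_debug=", my_debug_setup);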
Reported-by: Chen jingwen chenjingwen6@huawei.com Signed-off-by: Chen Zhongjin chenzhongjin@huawei.com Reviewed-by: Xu Kuohai xukuohai@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- kernel/smp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index b04ab01eb9e0..7cb03edf1735 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -113,9 +113,9 @@ static int __init csdlock_debug(char *str)
 	if (val)
 		static_branch_enable(&csdlock_debug_enabled);

-	return 0;
+	return 1;
 }
-early_param("csdlock_debug", csdlock_debug);
+__setup("csdlock_debug=", csdlock_debug);

 static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
 static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);