From: Fenghua Yu fenghua.yu@intel.com
mainline inclusion from mainline-v5.3-rc1 commit b302e4b176d00e1cbc80148c5d0aee36751f7480 category: feature bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=44 CVE: NA
-----------------------------------------------
AVX512 BFLOAT16 instructions support 16-bit BFLOAT16 floating-point format (BF16) for deep learning optimization.
BF16 is a short version of 32-bit single-precision floating-point format (FP32) and has several advantages over 16-bit half-precision floating-point format (FP16). BF16 keeps FP32 accumulation after multiplication without loss of precision, offers more than enough range for deep learning training tasks, and doesn't need to handle hardware exception.
AVX512 BFLOAT16 instructions are enumerated in CPUID.7.1:EAX[bit 5] AVX512_BF16.
CPUID.7.1:EAX contains only feature bits. Reuse the currently empty word 12 as a pure features word to hold the feature bits including AVX512_BF16.
Detailed information of the CPUID bit and AVX512 BFLOAT16 instructions can be found in the latest Intel Architecture Instruction Set Extensions and Future Features Programming Reference.
[ bp: Check CPUID(7) subleaf validity before accessing subleaf 1. ]
Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Borislav Petkov bp@suse.de Cc: "Chang S. Bae" chang.seok.bae@intel.com Cc: Frederic Weisbecker frederic@kernel.org Cc: "H. Peter Anvin" hpa@zytor.com Cc: Ingo Molnar mingo@redhat.com Cc: Jann Horn jannh@google.com Cc: Masahiro Yamada yamada.masahiro@socionext.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Nadav Amit namit@vmware.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Pavel Tatashin pasha.tatashin@oracle.com Cc: Peter Feiner pfeiner@google.com Cc: Radim Krcmar rkrcmar@redhat.com Cc: "Rafael J. Wysocki" rafael.j.wysocki@intel.com Cc: "Ravi V Shankar" ravi.v.shankar@intel.com Cc: Robert Hoo robert.hu@linux.intel.com Cc: "Sean J Christopherson" sean.j.christopherson@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Thomas Lendacky Thomas.Lendacky@amd.com Cc: x86 x86@kernel.org Link: https://lkml.kernel.org/r/1560794416-217638-3-git-send-email-fenghua.yu@inte... Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Xiongfeng Wang wangxiongfeng2@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- arch/x86/include/asm/cpufeature.h | 2 +- arch/x86/include/asm/cpufeatures.h | 3 +++ arch/x86/kernel/cpu/common.c | 6 ++++++ arch/x86/kernel/cpu/cpuid-deps.c | 1 + 4 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 68889ace9c4c6..4ce54074eea57 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -23,7 +23,7 @@ enum cpuid_leafs CPUID_7_0_EBX, CPUID_D_1_EAX, CPUID_LNX_4, - CPUID_DUMMY, + CPUID_7_1_EAX, CPUID_8000_0008_EBX, CPUID_6_EAX, CPUID_8000_000A_EDX, diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 48535113efa68..b40125de2770e 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -305,6 +305,9 @@ #define X86_FEATURE_FENCE_SWAPGS_USER (11*32+ 4) /* "" LFENCE in user entry SWAPGS path */ #define X86_FEATURE_FENCE_SWAPGS_KERNEL (11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */
+/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */ +#define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */ + /* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */ #define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */ #define X86_FEATURE_IRPERF (13*32+ 1) /* Instructions Retired Count */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 1d83e5f7c5a89..6e52e5a706e0b 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -854,6 +854,12 @@ void get_cpu_cap(struct cpuinfo_x86 *c) c->x86_capability[CPUID_7_0_EBX] = ebx; c->x86_capability[CPUID_7_ECX] = ecx; c->x86_capability[CPUID_7_EDX] = edx; + + /* Check valid sub-leaf index before accessing it */ + if (eax >= 1) { + cpuid_count(0x00000007, 1, &eax, &ebx, &ecx, &edx); + c->x86_capability[CPUID_7_1_EAX] = eax; + } }
/* Extended state features: level 0x0000000d */ diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c index fa07a224e7b98..a444028d81453 100644 --- a/arch/x86/kernel/cpu/cpuid-deps.c +++ b/arch/x86/kernel/cpu/cpuid-deps.c @@ -62,6 +62,7 @@ static const struct cpuid_dep cpuid_deps[] = { { X86_FEATURE_CQM_OCCUP_LLC, X86_FEATURE_CQM_LLC }, { X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC }, { X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC }, + { X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL }, {} };