mainline inclusion
from mainline-5.5
commit 6e898d2bf67a82df0aa0c955adc9278faba9a635
category: x86/mce
Add support for more Zhaoxin CPUs.
--------------------------------
All newer Zhaoxin CPUs are compatible with Intel's Machine-Check Architecture, so add support for them.
[ bp: Reflow comment in vendor_disable_error_reporting() and massage commit message. ]
Signed-off-by: Tony W Wang-oc TonyWWang-oc@zhaoxin.com
Signed-off-by: Borislav Petkov bp@suse.de
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" hpa@zytor.com
Cc: Ingo Molnar mingo@redhat.com
Cc: linux-edac linux-edac@vger.kernel.org
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Tony Luck tony.luck@intel.com
Cc: x86-ml x86@kernel.org
Link: https://lkml.kernel.org/r/1568787573-1297-2-git-send-email-TonyWWang-oc@zhao...
Signed-off-by: LeoLiu-oc LeoLiu-oc@zhaoxin.com
---
 arch/x86/kernel/cpu/mce/core.c | 42 ++++++++++++++++++++++++++--------
 1 file changed, 32 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 5221c49d335e..dce0fbd4cb0f 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -473,8 +473,10 @@ int mce_usable_address(struct mce *m)
 	if (!(m->status & MCI_STATUS_ADDRV))
 		return 0;
 
-	/* Checks after this one are Intel-specific: */
-	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+	/* Checks after this one are Intel/Zhaoxin-specific: */
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL &&
+	    boot_cpu_data.x86_vendor != X86_VENDOR_ZHAOXIN &&
+	    boot_cpu_data.x86_vendor != X86_VENDOR_CENTAUR)
 		return 1;
 
 	if (!(m->status & MCI_STATUS_MISCV))
@@ -492,10 +494,14 @@ EXPORT_SYMBOL_GPL(mce_usable_address);
 
 bool mce_is_memory_error(struct mce *m)
 {
-	if (m->cpuvendor == X86_VENDOR_AMD ||
-	    m->cpuvendor == X86_VENDOR_HYGON) {
+	switch (m->cpuvendor) {
+	case X86_VENDOR_AMD:
+	case X86_VENDOR_HYGON:
 		return amd_mce_is_memory_error(m);
-	} else if (m->cpuvendor == X86_VENDOR_INTEL) {
+
+	case X86_VENDOR_INTEL:
+	case X86_VENDOR_ZHAOXIN:
+	case X86_VENDOR_CENTAUR:
 		/*
 		 * Intel SDM Volume 3B - 15.9.2 Compound Error Codes
 		 *
@@ -512,9 +518,10 @@ bool mce_is_memory_error(struct mce *m)
 		return (m->status & 0xef80) == BIT(7) ||
 		       (m->status & 0xef00) == BIT(8) ||
 		       (m->status & 0xeffc) == 0xc;
-	}
 
-	return false;
+	default:
+		return false;
+	}
 }
 EXPORT_SYMBOL_GPL(mce_is_memory_error);
 
@@ -1658,6 +1665,19 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 		if (c->x86 == 6 && c->x86_model == 45)
 			quirk_no_way_out = quirk_sandybridge_ifu;
 	}
+
+	if (c->x86_vendor == X86_VENDOR_ZHAOXIN ||
+	    c->x86_vendor == X86_VENDOR_CENTAUR) {
+		/*
+		 * All newer Zhaoxin CPUs support MCE broadcasting. Enable
+		 * synchronization with a one second timeout.
+		 */
+		if (c->x86 > 6 || (c->x86_model == 0x19 || c->x86_model == 0x1f)) {
+			if (cfg->monarch_timeout < 0)
+				cfg->monarch_timeout = USEC_PER_SEC;
+		}
+	}
+
 	if (cfg->monarch_timeout < 0)
 		cfg->monarch_timeout = 0;
 	if (cfg->bootlog != 0)
@@ -1963,15 +1983,17 @@ static void mce_disable_error_reporting(void)
 static void vendor_disable_error_reporting(void)
 {
 	/*
-	 * Don't clear on Intel, AMD or Hygon CPUs. Some of these MSRs are
-	 * socket-wide.
+	 * Don't clear on Intel, AMD, Hygon or Zhaoxin CPUs. Some of these
+	 * MSRs are socket-wide.
 	 * Disabling them for just a single offlined CPU is bad, since it will
 	 * inhibit reporting for all shared resources on the socket like the
 	 * last level cache (LLC), the integrated memory controller (iMC), etc.
 	 */
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ||
 	    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON ||
-	    boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+	    boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
+	    boot_cpu_data.x86_vendor == X86_VENDOR_ZHAOXIN ||
+	    boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR)
 		return;
 
 	mce_disable_error_reporting();
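The three mask tests that the patch now also applies to Zhaoxin and Centaur CPUs can be tried outside the kernel. What follows is a minimal user-space sketch, not the kernel function itself: BIT() is redefined locally, and status_is_memory_error(), self_check() and the sample status values are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (UINT64_C(1) << (n))

/*
 * Same three mask tests as the Intel/Zhaoxin/Centaur case in the patch.
 * Every mask includes bit 11 and leaves out bit 12, so a status with
 * bit 11 set never matches, while bit 12 is ignored.
 */
static bool status_is_memory_error(uint64_t status)
{
	return (status & 0xef80) == BIT(7) ||
	       (status & 0xef00) == BIT(8) ||
	       (status & 0xeffc) == 0xc;
}

/* Illustrative values only; they are not taken from real hardware logs. */
static void self_check(void)
{
	assert(status_is_memory_error(0x0090));   /* bit 7 set */
	assert(status_is_memory_error(0x0135));   /* bit 8 set */
	assert(status_is_memory_error(0x000c));   /* bits 2 and 3 set */
	assert(status_is_memory_error(0x1090));   /* bit 12 is ignored */
	assert(!status_is_memory_error(0x0890));  /* bit 11 set as well */
	assert(!status_is_memory_error(0x0005));  /* none of the patterns */
}
```

Calling self_check() from a small test program exercises one value per mask plus the bit 11 and bit 12 cases.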
On 2021/3/25 18:07, LeoLiu-oc wrote:
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 5221c49d335e..dce0fbd4cb0f 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -473,8 +473,10 @@ int mce_usable_address(struct mce *m)
 	if (!(m->status & MCI_STATUS_ADDRV))
 		return 0;
 
-	/* Checks after this one are Intel-specific: */
-	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+	/* Checks after this one are Intel/Zhaoxin-specific: */
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL &&
+	    boot_cpu_data.x86_vendor != X86_VENDOR_ZHAOXIN &&
+	    boot_cpu_data.x86_vendor != X86_VENDOR_CENTAUR)
This looks good to me,
[...]
+	if (c->x86_vendor == X86_VENDOR_ZHAOXIN ||
+	    c->x86_vendor == X86_VENDOR_CENTAUR) {
+		/*
+		 * All newer Zhaoxin CPUs support MCE broadcasting. Enable
+		 * synchronization with a one second timeout.
+		 */
+		if (c->x86 > 6 || (c->x86_model == 0x19 || c->x86_model == 0x1f)) {

But do we need the x86_model constraints for both vendors, Zhaoxin and Centaur?

I'm not familiar with those two CPUs and you are the experts, but I can see the C-state patches. For Centaur:

+	/*
+	 * For all recent Centaur CPUs, the ucode will make sure that each
+	 * core can keep cache coherence with each other while entering C3
+	 * type state. So, set bm_check to 1 to indicate that the kernel
+	 * doesn't need to execute a cache flush operation (WBINVD) when
+	 * entering C3 type state.
+	 */
+	if (c->x86_vendor == X86_VENDOR_CENTAUR) {
+		if (c->x86 > 6 || (c->x86 == 6 && c->x86_model == 0x0f &&
+		    c->x86_stepping >= 0x0e))
+			flags->bm_check = 1;
+	}
But for Zhaoxin,
+	if (c->x86_vendor == X86_VENDOR_ZHAOXIN) {
+		/*
+		 * All Zhaoxin CPUs that support C3 share cache.
+		 * And caches should not be flushed by software while
+		 * entering C3 type state.
+		 */
+		flags->bm_check = 1;
I'm just curious, correct me if I'm wrong.
Thanks
Hanjun
On 26/03/2021 09:56, Hanjun Guo wrote:
On 2021/3/25 18:07, LeoLiu-oc wrote:
+	if (c->x86_vendor == X86_VENDOR_ZHAOXIN ||
+	    c->x86_vendor == X86_VENDOR_CENTAUR) {
+		/*
+		 * All newer Zhaoxin CPUs support MCE broadcasting. Enable
+		 * synchronization with a one second timeout.
+		 */
+		if (c->x86 > 6 || (c->x86_model == 0x19 || c->x86_model == 0x1f)) {

But do we need the x86_model constraints for both vendors, Zhaoxin and Centaur?

Yes. Zhaoxin currently uses two CPU vendor IDs, "CentaurHauls" and "  Shanghai  ". Zhaoxin CPUs with Family > 6, or with Family == 6 and Model 0x19 or 0x1f, support MCE broadcasting.
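The rule stated in this reply can be written down as a small predicate. This is a user-space sketch under stated assumptions, not kernel code: enum vendor, struct cpu_id and the function names are hypothetical stand-ins for the kernel's vendor constants and cpuinfo_x86 fields. It follows the wording of the reply, which names Family == 6 explicitly, whereas the quoted hunk tests x86_model without a family check. Other vendor-specific timeouts are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's X86_VENDOR_* constants. */
enum vendor { VENDOR_INTEL, VENDOR_CENTAUR, VENDOR_ZHAOXIN, VENDOR_OTHER };

/* Hypothetical subset of struct cpuinfo_x86. */
struct cpu_id {
	enum vendor vendor;
	unsigned int family;	/* c->x86 in the patch */
	unsigned int model;	/* c->x86_model in the patch */
};

#define USEC_PER_SEC 1000000L

/* Family > 6, or Family == 6 with Model 0x19 or 0x1f, as the reply states. */
static bool zx_supports_mce_broadcast(const struct cpu_id *c)
{
	if (c->vendor != VENDOR_ZHAOXIN && c->vendor != VENDOR_CENTAUR)
		return false;

	return c->family > 6 ||
	       (c->family == 6 && (c->model == 0x19 || c->model == 0x1f));
}

/*
 * A negative value means "not configured", as in the patch. Only the
 * Zhaoxin/Centaur case is modelled; everything else falls back to 0.
 */
static long pick_monarch_timeout(const struct cpu_id *c, long configured)
{
	if (configured >= 0)
		return configured;

	return zx_supports_mce_broadcast(c) ? USEC_PER_SEC : 0;
}

static void broadcast_self_check(void)
{
	struct cpu_id zx7 = { VENDOR_ZHAOXIN, 7, 0x1b };
	struct cpu_id cen6 = { VENDOR_CENTAUR, 6, 0x19 };
	struct cpu_id old_via = { VENDOR_CENTAUR, 6, 0x0f };
	/* Rejected here; the quoted hunk, lacking a family test, would accept it. */
	struct cpu_id fam5 = { VENDOR_CENTAUR, 5, 0x19 };

	assert(pick_monarch_timeout(&zx7, -1) == USEC_PER_SEC);
	assert(pick_monarch_timeout(&cen6, -1) == USEC_PER_SEC);
	assert(pick_monarch_timeout(&old_via, -1) == 0);
	assert(pick_monarch_timeout(&cen6, 500) == 500);
	assert(!zx_supports_mce_broadcast(&fam5));
}
```

A value that was already configured is left alone, which matches the monarch_timeout < 0 guard in the patch.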
I'm not familiar with those two CPUs and you are the experts, but I can see the C-state patches. For Centaur:

+	/*
+	 * For all recent Centaur CPUs, the ucode will make sure that each
+	 * core can keep cache coherence with each other while entering C3
+	 * type state. So, set bm_check to 1 to indicate that the kernel
+	 * doesn't need to execute a cache flush operation (WBINVD) when
+	 * entering C3 type state.
+	 */
+	if (c->x86_vendor == X86_VENDOR_CENTAUR) {
+		if (c->x86 > 6 || (c->x86 == 6 && c->x86_model == 0x0f &&
+		    c->x86_stepping >= 0x0e))
+			flags->bm_check = 1;
+	}

This is a separate matter from MCE. These CPUs belong to Zhaoxin, and the if condition distinguishes them from the older VIA-made CPUs that carry the "CentaurHauls" vendor ID.
But for Zhaoxin,
+	if (c->x86_vendor == X86_VENDOR_ZHAOXIN) {
+		/*
+		 * All Zhaoxin CPUs that support C3 share cache.
+		 * And caches should not be flushed by software while
+		 * entering C3 type state.
+		 */
+		flags->bm_check = 1;
I'm just curious, correct me if I'm wrong.
Zhaoxin CPUs with the "  Shanghai  " vendor ID do not need to be distinguished from CPUs with other vendor IDs, so no family/model check is required there.
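The difference between the two C-state hunks discussed above can be summarised in one function. This is only a sketch of the two quoted conditions: enum cstate_vendor and sets_bm_check() are hypothetical names, and the default branch says nothing about how the kernel treats other vendors.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's X86_VENDOR_* constants. */
enum cstate_vendor { CS_CENTAUR, CS_ZHAOXIN, CS_OTHER };

/*
 * Returns true when one of the two quoted hunks would set bm_check.
 * Centaur needs a family/model/stepping test to exclude older VIA parts;
 * the Zhaoxin vendor ID needs no further test.
 */
static bool sets_bm_check(enum cstate_vendor vendor, unsigned int family,
			  unsigned int model, unsigned int stepping)
{
	switch (vendor) {
	case CS_ZHAOXIN:
		return true;
	case CS_CENTAUR:
		return family > 6 ||
		       (family == 6 && model == 0x0f && stepping >= 0x0e);
	default:
		/* Vendors outside this thread are not modelled. */
		return false;
	}
}

static void bm_check_self_check(void)
{
	assert(sets_bm_check(CS_ZHAOXIN, 6, 0x19, 0));
	assert(sets_bm_check(CS_CENTAUR, 7, 0x1b, 0));
	assert(sets_bm_check(CS_CENTAUR, 6, 0x0f, 0x0e));
	assert(!sets_bm_check(CS_CENTAUR, 6, 0x0f, 0x0d));
	assert(!sets_bm_check(CS_CENTAUR, 6, 0x0d, 0x0f));
}
```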
Sincerely
TonyWWangoc

Thanks
Hanjun
On 2021/3/26 11:37, Tony W Wang-oc wrote:
Zhaoxin CPUs with the "  Shanghai  " vendor ID do not need to be distinguished from CPUs with other vendor IDs, so no family/model check is required there.
Thanks for the reply. I will add my review for this patch set.

Thanks
Hanjun