From: Zhang Rui rui.zhang@intel.com
stable inclusion from stable-v4.19.323 commit e75562346cac53c7e933373a004b1829e861123a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBBN6V
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
-------------------------------------------------------------
commit ffd95846c6ec6cf1f93da411ea10d504036cab42 upstream.
New processors have become pickier about the local APIC timer state before entering low power modes. These low power modes are used (for example) when you close your laptop lid and suspend. If you put your laptop in a bag and it is not in this low power mode, it is likely to get quite toasty while it quickly sucks the battery dry.
The problem boils down to some CPUs' inability to power down until the CPU recognizes that the local APIC timer is shut down. The current kernel code works in one-shot and periodic modes but does not work for deadline mode. Deadline mode has been the supported and preferred mode on Intel CPUs for over a decade and uses an MSR to drive the timer instead of an APIC register.
Disable the TSC Deadline timer in lapic_timer_shutdown() by writing to MSR_IA32_TSC_DEADLINE when in TSC-deadline mode. Also avoid writing to the initial-count register (APIC_TMICT) which is ignored in TSC-deadline mode.
Note: The APIC_LVTT|=APIC_LVT_MASKED operation should theoretically be enough to tell the hardware that the timer will not fire in any of the timer modes. But mitigating AMD erratum 411[1] also requires clearing out APIC_TMICT. Solely setting APIC_LVT_MASKED is also ineffective in practice on Intel Lunar Lake systems, which is the motivation for this change.
1. 411 Processor May Exit Message-Triggered C1E State Without an Interrupt if Local APIC Timer Reaches Zero - https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/revision...
Fixes: 279f1461432c ("x86: apic: Use tsc deadline for oneshot when available") Suggested-by: Dave Hansen dave.hansen@intel.com Signed-off-by: Zhang Rui rui.zhang@intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Tested-by: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Tested-by: Todd Brandt todd.e.brandt@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241015061522.25288-1-rui.zhang%40intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Heyuan Wang wangheyuan2@h-partners.com --- arch/x86/kernel/apic/apic.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 9318fe7d850e..e316f5a2aab4 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -484,7 +484,19 @@ static int lapic_timer_shutdown(struct clock_event_device *evt) v = apic_read(APIC_LVTT); v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR); apic_write(APIC_LVTT, v); - apic_write(APIC_TMICT, 0); + + /* + * Setting APIC_LVT_MASKED (above) should be enough to tell + * the hardware that this timer will never fire. But AMD + * erratum 411 and some Intel CPU behavior circa 2024 say + * otherwise. Time for belt and suspenders programming: mask + * the timer _and_ zero the counter registers: + */ + if (v & APIC_LVT_TIMER_TSCDEADLINE) + wrmsrl(MSR_IA32_TSC_DEADLINE, 0); + else + apic_write(APIC_TMICT, 0); + return 0; }