public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
@ 2024-10-09  7:20 Zhang Rui
  2024-10-09 11:24 ` Rafael J. Wysocki
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Zhang Rui @ 2024-10-09  7:20 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, rafael.j.wysocki, x86, linux-pm
  Cc: hpa, peterz, thorsten.blum, yuntao.wang, tony.luck, len.brown,
	srinivas.pandruvada, linux-kernel, stable

This 12-year-old bug prevents some modern processors from achieving
maximum power savings during suspend. For example, Lunar Lake systems
gets 0% package C-states during suspend to idle and this causes energy
star compliance tests to fail.

According to Intel SDM, for the local APIC timer,
1. "The initial-count register is a read-write register. A write of 0 to
   the initial-count register effectively stops the local APIC timer, in
   both one-shot and periodic mode."
2. "In TSC deadline mode, writes to the initial-count register are
   ignored; and current-count register always reads 0. Instead, timer
   behavior is controlled using the IA32_TSC_DEADLINE MSR."
   "In TSC-deadline mode, writing 0 to the IA32_TSC_DEADLINE MSR disarms
   the local-APIC timer."

Stop the TSC Deadline timer in lapic_timer_shutdown() by writing 0 to
MSR_IA32_TSC_DEADLINE.

Cc: stable@vger.kernel.org
Fixes: 279f1461432c ("x86: apic: Use tsc deadline for oneshot when available")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
Changes since V1
- improve changelog
---
 arch/x86/kernel/apic/apic.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 6513c53c9459..d1006531729a 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -441,6 +441,10 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)
 	v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);
 	apic_write(APIC_LVTT, v);
 	apic_write(APIC_TMICT, 0);
+
+	if (boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER))
+		wrmsrl(MSR_IA32_TSC_DEADLINE, 0);
+
 	return 0;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09  7:20 [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown Zhang Rui
@ 2024-10-09 11:24 ` Rafael J. Wysocki
  2024-10-09 16:41   ` Dave Hansen
  2024-10-09 16:33 ` Ricardo Neri
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2024-10-09 11:24 UTC (permalink / raw)
  To: Zhang Rui, x86
  Cc: tglx, mingo, bp, dave.hansen, rafael.j.wysocki, linux-pm, hpa,
	peterz, thorsten.blum, yuntao.wang, tony.luck, len.brown,
	srinivas.pandruvada, linux-kernel, stable

On Wed, Oct 9, 2024 at 9:20 AM Zhang Rui <rui.zhang@intel.com> wrote:
>
> This 12-year-old bug prevents some modern processors from achieving
> maximum power savings during suspend. For example, Lunar Lake systems
> gets 0% package C-states during suspend to idle and this causes energy
> star compliance tests to fail.
>
> According to Intel SDM, for the local APIC timer,
> 1. "The initial-count register is a read-write register. A write of 0 to
>    the initial-count register effectively stops the local APIC timer, in
>    both one-shot and periodic mode."
> 2. "In TSC deadline mode, writes to the initial-count register are
>    ignored; and current-count register always reads 0. Instead, timer
>    behavior is controlled using the IA32_TSC_DEADLINE MSR."
>    "In TSC-deadline mode, writing 0 to the IA32_TSC_DEADLINE MSR disarms
>    the local-APIC timer."
>
> Stop the TSC Deadline timer in lapic_timer_shutdown() by writing 0 to
> MSR_IA32_TSC_DEADLINE.
>
> Cc: stable@vger.kernel.org
> Fixes: 279f1461432c ("x86: apic: Use tsc deadline for oneshot when available")
> Signed-off-by: Zhang Rui <rui.zhang@intel.com>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

x86 folks, this is quite nasty, so please make it high-prio.

Alternatively, I can take it through the PM tree.

> ---
> Changes since V1
> - improve changelog
> ---
>  arch/x86/kernel/apic/apic.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 6513c53c9459..d1006531729a 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -441,6 +441,10 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)
>         v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);
>         apic_write(APIC_LVTT, v);
>         apic_write(APIC_TMICT, 0);
> +
> +       if (boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER))
> +               wrmsrl(MSR_IA32_TSC_DEADLINE, 0);
> +
>         return 0;
>  }
>
> --
> 2.34.1
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09  7:20 [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown Zhang Rui
  2024-10-09 11:24 ` Rafael J. Wysocki
@ 2024-10-09 16:33 ` Ricardo Neri
  2024-10-09 17:20   ` Rafael J. Wysocki
  2024-10-09 16:49 ` Dave Hansen
  2024-10-09 17:47 ` Dave Hansen
  3 siblings, 1 reply; 14+ messages in thread
From: Ricardo Neri @ 2024-10-09 16:33 UTC (permalink / raw)
  To: Zhang Rui
  Cc: tglx, mingo, bp, dave.hansen, rafael.j.wysocki, x86, linux-pm,
	hpa, peterz, thorsten.blum, yuntao.wang, tony.luck, len.brown,
	srinivas.pandruvada, linux-kernel, stable

On Wed, Oct 09, 2024 at 03:20:01PM +0800, Zhang Rui wrote:
> This 12-year-old bug prevents some modern processors from achieving
> maximum power savings during suspend. For example, Lunar Lake systems

Two nits:

> gets 0% package C-states during suspend to idle and this causes energy
> star compliance tests to fail.

s/gets/get/
s/energy start/Energy Star/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09 11:24 ` Rafael J. Wysocki
@ 2024-10-09 16:41   ` Dave Hansen
  2024-10-09 17:03     ` Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Hansen @ 2024-10-09 16:41 UTC (permalink / raw)
  To: Rafael J. Wysocki, Zhang Rui, x86
  Cc: tglx, mingo, bp, dave.hansen, rafael.j.wysocki, linux-pm, hpa,
	peterz, thorsten.blum, yuntao.wang, tony.luck, len.brown,
	srinivas.pandruvada, linux-kernel, stable

On 10/9/24 04:24, Rafael J. Wysocki wrote:
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> x86 folks, this is quite nasty, so please make it high-prio.

How much linux-next soak time do you think this needs?  We'd ideally
like to give it a week in x86/urgent.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09  7:20 [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown Zhang Rui
  2024-10-09 11:24 ` Rafael J. Wysocki
  2024-10-09 16:33 ` Ricardo Neri
@ 2024-10-09 16:49 ` Dave Hansen
  2024-10-09 17:19   ` Rafael J. Wysocki
  2024-10-09 17:47 ` Dave Hansen
  3 siblings, 1 reply; 14+ messages in thread
From: Dave Hansen @ 2024-10-09 16:49 UTC (permalink / raw)
  To: Zhang Rui, tglx, mingo, bp, dave.hansen, rafael.j.wysocki, x86,
	linux-pm
  Cc: hpa, peterz, thorsten.blum, yuntao.wang, tony.luck, len.brown,
	srinivas.pandruvada, linux-kernel, stable

On 10/9/24 00:20, Zhang Rui wrote:
> This 12-year-old bug prevents some modern processors from achieving
> maximum power savings during suspend. For example, Lunar Lake systems
> gets 0% package C-states during suspend to idle and this causes energy
> star compliance tests to fail.

Why haven't we noticed or cared for the last 12 years?

Also, plain language really matters.  Is this as simple as: "you close
the lid on the laptop and the CPU doesn't power down at all"?

> According to Intel SDM, for the local APIC timer,
> 1. "The initial-count register is a read-write register. A write of 0 to
>    the initial-count register effectively stops the local APIC timer, in
>    both one-shot and periodic mode."
> 2. "In TSC deadline mode, writes to the initial-count register are
>    ignored; and current-count register always reads 0. Instead, timer
>    behavior is controlled using the IA32_TSC_DEADLINE MSR."
>    "In TSC-deadline mode, writing 0 to the IA32_TSC_DEADLINE MSR disarms
>    the local-APIC timer."

Is "stopping" and "disarming" the same thing?

Second, while quoting the SDM is great, it would be even better to
including the Linux naming for these things.  The Linux naming for the
APIC registers is completely missing from this changelog.  You could say:

	"In TSC deadline mode, writes to the initial-count register
	(APIC_TMICT) are ignored"

which makes it much easier to relate this code:

        apic_write(APIC_TMICT, 0);

back to the SDM language.  This is especially true because:

#define APIC_TMICT      0x380

doesn't make it obvious that "ICT" is the "Initial-Count Register".  I
had to go back to the SDM table to make 100% sure.

This also doesn't ever say which mode the kernel is running in.

> Stop the TSC Deadline timer in lapic_timer_shutdown() by writing 0 to
> MSR_IA32_TSC_DEADLINE.

This dances around the problem but never comes out and says it:

	The CPU package does not go into lower power modes (higher
	package C-states) unless all local-APIC timers are disabled.

Plus something to connect the old to the new:

	On older CPUs, setting APIC_TMICT=0 was sufficient for disabling
	the local-APIC timer, no matter the timer mode (deadline, one-
	shot or periodic).  But newer CPUs adhere to the strict letter
	of the law in the SDM and more fully ignore APIC_TMICT when in
	deadline mode.  Those CPUs also don't fully "disable" the timer
	when IA32_TSC_DEADLINE has passed.  They _require_ writing a 0.

Or am I missing something?

> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 6513c53c9459..d1006531729a 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -441,6 +441,10 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)
>  	v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);
>  	apic_write(APIC_LVTT, v);
>  	apic_write(APIC_TMICT, 0);
> +
> +	if (boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER))
> +		wrmsrl(MSR_IA32_TSC_DEADLINE, 0);
> +
>  	return 0;
>  }
>  


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09 16:41   ` Dave Hansen
@ 2024-10-09 17:03     ` Rafael J. Wysocki
  0 siblings, 0 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2024-10-09 17:03 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Rafael J. Wysocki, Zhang Rui, x86, tglx, mingo, bp, dave.hansen,
	rafael.j.wysocki, linux-pm, hpa, peterz, thorsten.blum,
	yuntao.wang, tony.luck, len.brown, srinivas.pandruvada,
	linux-kernel, stable

On Wed, Oct 9, 2024 at 6:41 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 10/9/24 04:24, Rafael J. Wysocki wrote:
> > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > x86 folks, this is quite nasty, so please make it high-prio.
>
> How much linux-next soak time do you think this needs?  We'd ideally
> like to give it a week in x86/urgent.

That works, a week in x86/urgent should be fine.

Thank you!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09 16:49 ` Dave Hansen
@ 2024-10-09 17:19   ` Rafael J. Wysocki
  0 siblings, 0 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2024-10-09 17:19 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Zhang Rui, tglx, mingo, bp, dave.hansen, rafael.j.wysocki, x86,
	linux-pm, hpa, peterz, thorsten.blum, yuntao.wang, tony.luck,
	len.brown, srinivas.pandruvada, linux-kernel, stable

On Wed, Oct 9, 2024 at 6:49 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 10/9/24 00:20, Zhang Rui wrote:
> > This 12-year-old bug prevents some modern processors from achieving
> > maximum power savings during suspend. For example, Lunar Lake systems
> > gets 0% package C-states during suspend to idle and this causes energy
> > star compliance tests to fail.
>
> Why haven't we noticed or cared for the last 12 years?

Because on the previous platforms the apic_write(APIC_TMICT, 0) in
lapic_timer_shutdown() was sufficient even in the TSC deadline mode,
or at least no problems with it have ever been reported.

> Also, plain language really matters.  Is this as simple as: "you close
> the lid on the laptop and the CPU doesn't power down at all"?

It will power down somewhat, but not as much as to justify suspending
the system.  It will just drain the battery at a relatively high rate
which will also cause the machine to heat up.

Not a good idea to put it into a bag in this state, for instance.

> > According to Intel SDM, for the local APIC timer,
> > 1. "The initial-count register is a read-write register. A write of 0 to
> >    the initial-count register effectively stops the local APIC timer, in
> >    both one-shot and periodic mode."
> > 2. "In TSC deadline mode, writes to the initial-count register are
> >    ignored; and current-count register always reads 0. Instead, timer
> >    behavior is controlled using the IA32_TSC_DEADLINE MSR."
> >    "In TSC-deadline mode, writing 0 to the IA32_TSC_DEADLINE MSR disarms
> >    the local-APIC timer."
>
> Is "stopping" and "disarming" the same thing?

That is my understanding.  If you disarm it and it is not armed again,
it will be stopped effectively.

> Second, while quoting the SDM is great, it would be even better to
> including the Linux naming for these things.  The Linux naming for the
> APIC registers is completely missing from this changelog.  You could say:
>
>         "In TSC deadline mode, writes to the initial-count register
>         (APIC_TMICT) are ignored"
>
> which makes it much easier to relate this code:
>
>         apic_write(APIC_TMICT, 0);
>
> back to the SDM language.  This is especially true because:
>
> #define APIC_TMICT      0x380
>
> doesn't make it obvious that "ICT" is the "Initial-Count Register".  I
> had to go back to the SDM table to make 100% sure.
>
> This also doesn't ever say which mode the kernel is running in.
>
> > Stop the TSC Deadline timer in lapic_timer_shutdown() by writing 0 to
> > MSR_IA32_TSC_DEADLINE.
>
> This dances around the problem but never comes out and says it:
>
>         The CPU package does not go into lower power modes (higher
>         package C-states) unless all local-APIC timers are disabled.
>
> Plus something to connect the old to the new:
>
>         On older CPUs, setting APIC_TMICT=0 was sufficient for disabling
>         the local-APIC timer, no matter the timer mode (deadline, one-
>         shot or periodic).  But newer CPUs adhere to the strict letter
>         of the law in the SDM and more fully ignore APIC_TMICT when in
>         deadline mode.  Those CPUs also don't fully "disable" the timer
>         when IA32_TSC_DEADLINE has passed.  They _require_ writing a 0.
>
> Or am I missing something?

No, you are right.

We need a new version of the patch with a better changelog.

> > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> > index 6513c53c9459..d1006531729a 100644
> > --- a/arch/x86/kernel/apic/apic.c
> > +++ b/arch/x86/kernel/apic/apic.c
> > @@ -441,6 +441,10 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)
> >       v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);
> >       apic_write(APIC_LVTT, v);
> >       apic_write(APIC_TMICT, 0);
> > +
> > +     if (boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER))
> > +             wrmsrl(MSR_IA32_TSC_DEADLINE, 0);
> > +
> >       return 0;
> >  }
> >
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09 16:33 ` Ricardo Neri
@ 2024-10-09 17:20   ` Rafael J. Wysocki
  0 siblings, 0 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2024-10-09 17:20 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Zhang Rui, tglx, mingo, bp, dave.hansen, rafael.j.wysocki, x86,
	linux-pm, hpa, peterz, thorsten.blum, yuntao.wang, tony.luck,
	len.brown, srinivas.pandruvada, linux-kernel, stable

On Wed, Oct 9, 2024 at 6:28 PM Ricardo Neri
<ricardo.neri-calderon@linux.intel.com> wrote:
>
> On Wed, Oct 09, 2024 at 03:20:01PM +0800, Zhang Rui wrote:
> > This 12-year-old bug prevents some modern processors from achieving
> > maximum power savings during suspend. For example, Lunar Lake systems
>
> Two nits:
>
> > gets 0% package C-states during suspend to idle and this causes energy
> > star compliance tests to fail.
>
> s/gets/get/
> s/energy start/Energy Star/

Thanks for pointing these out!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09  7:20 [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown Zhang Rui
                   ` (2 preceding siblings ...)
  2024-10-09 16:49 ` Dave Hansen
@ 2024-10-09 17:47 ` Dave Hansen
  2024-10-09 18:43   ` Rafael J. Wysocki
  3 siblings, 1 reply; 14+ messages in thread
From: Dave Hansen @ 2024-10-09 17:47 UTC (permalink / raw)
  To: Zhang Rui, tglx, mingo, bp, dave.hansen, rafael.j.wysocki, x86,
	linux-pm
  Cc: hpa, peterz, thorsten.blum, yuntao.wang, tony.luck, len.brown,
	srinivas.pandruvada, linux-kernel, stable

On 10/9/24 00:20, Zhang Rui wrote:
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 6513c53c9459..d1006531729a 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -441,6 +441,10 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)
>  	v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);
>  	apic_write(APIC_LVTT, v);
>  	apic_write(APIC_TMICT, 0);
> +
> +	if (boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER))
> +		wrmsrl(MSR_IA32_TSC_DEADLINE, 0);

One last thing, and this is a super nit.  We presumably have the actual
APIC_LVTT value (v) sitting in a register already.  Is there any
difference logically between a X86_FEATURE_TSC_DEADLINE_TIMER check and
an APIC_LVTT check for APIC_LVT_TIMER_TSCDEADLINE?

I suspect this will generate more compact code:

	if (v & APIC_LVT_TIMER_TSCDEADLINE)
		wrmsrl(MSR_IA32_TSC_DEADLINE, 0);

Does it have any downsides?

Oh, and how hot is this path?  Is this wrmsr() going to matter?  I
presume it's pretty cheap because it's one of the special
architecturally non-serializing WRMSRs.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09 17:47 ` Dave Hansen
@ 2024-10-09 18:43   ` Rafael J. Wysocki
  2024-10-11  0:43     ` Dave Hansen
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2024-10-09 18:43 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Zhang Rui, tglx, mingo, bp, dave.hansen, rafael.j.wysocki, x86,
	linux-pm, hpa, peterz, thorsten.blum, yuntao.wang, tony.luck,
	len.brown, srinivas.pandruvada, linux-kernel, stable

On Wed, Oct 9, 2024 at 7:49 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 10/9/24 00:20, Zhang Rui wrote:
> > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> > index 6513c53c9459..d1006531729a 100644
> > --- a/arch/x86/kernel/apic/apic.c
> > +++ b/arch/x86/kernel/apic/apic.c
> > @@ -441,6 +441,10 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)
> >       v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);
> >       apic_write(APIC_LVTT, v);
> >       apic_write(APIC_TMICT, 0);
> > +
> > +     if (boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER))
> > +             wrmsrl(MSR_IA32_TSC_DEADLINE, 0);
>
> One last thing, and this is a super nit.  We presumably have the actual
> APIC_LVTT value (v) sitting in a register already.  Is there any
> difference logically between a X86_FEATURE_TSC_DEADLINE_TIMER check and
> an APIC_LVTT check for APIC_LVT_TIMER_TSCDEADLINE?
>
> I suspect this will generate more compact code:
>
>         if (v & APIC_LVT_TIMER_TSCDEADLINE)
>                 wrmsrl(MSR_IA32_TSC_DEADLINE, 0);
>
> Does it have any downsides?

I don't see any.

> Oh, and how hot is this path?  Is this wrmsr() going to matter?  I
> presume it's pretty cheap because it's one of the special
> architecturally non-serializing WRMSRs.

lapic_timer_shutdown() is called under a raw spin lock in
___tick_broadcast_oneshot_control(), so it better not take too much
time or PREEMPT_RT might be unhappy.  I'm not sure how often that
happens, though.

Also tick_program_event() calls it to stop the tick, but it is assumed
that this may take time AFAICS.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-09 18:43   ` Rafael J. Wysocki
@ 2024-10-11  0:43     ` Dave Hansen
  2024-10-11  2:04       ` Zhang, Rui
  2024-10-11 10:25       ` Rafael J. Wysocki
  0 siblings, 2 replies; 14+ messages in thread
From: Dave Hansen @ 2024-10-11  0:43 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Zhang Rui, tglx, mingo, bp, dave.hansen, rafael.j.wysocki, x86,
	linux-pm, hpa, peterz, thorsten.blum, yuntao.wang, tony.luck,
	len.brown, srinivas.pandruvada, linux-kernel, stable

[-- Attachment #1: Type: text/plain, Size: 224 bytes --]

How about something like the completely untested attached patch?

IMNHO, it improves on what was posted here because it draws a parallel
with an AMD erratum and also avoids writes to APIC_TMICT that would get
ignored anyway.

[-- Attachment #2: deadline1.patch --]
[-- Type: text/x-patch, Size: 2412 bytes --]

Subject: x86/apic: Always explicitly disarm TSC-deadline timer

New processors have become pickier about the local APIC timer state
before entering low power modes. These low power modes are used (for
example) when you close your laptop lid and suspend. If you put your
laptop in a bag in this unnecessarily-high-power state, it is likely
to get quite toasty while it quickly sucks the battery dry.

The problem boils down to some CPUs' inability to power down until the
kernel fully disables the local APIC timer. The current kernel code
works in one-shot and periodic modes but does not work for deadline
mode. Deadline mode has been the supported and preferred mode on
Intel CPUs for over a decade and uses an MSR to drive the timer
instead of an APIC register.

Disable the TSC Deadline timer in lapic_timer_shutdown() by writing to
MSR_IA32_TSC_DEADLINE when in TSC-deadline mode. Also avoid writing
to the initial-count register (APIC_TMICT) which is ignored in
TSC-deadline mode.

Note: The APIC_LVTT|=APIC_LVT_MASKED operation should theoretically be
enough to tell the hardware that the timer will not fire in any of the
timer modes. But mitigating AMD erratum 411[1] also requires clearing
out APIC_TMICT. Solely setting APIC_LVT_MASKED is also ineffective in
practice on Intel Lunar Lake systems, which is the motivation for this
change.

1. 411 Processor May Exit Message-Triggered C1E State Without an Interrupt if Local APIC Timer Reaches Zero - https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/revision-guides/41322_10h_Rev_Gd.pdf 
														    //
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 6513c53c9459e..5436a4083065d 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -440,7 +440,19 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)
 	v = apic_read(APIC_LVTT);
 	v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);
 	apic_write(APIC_LVTT, v);
-	apic_write(APIC_TMICT, 0);
+
+	/*
+	 * Setting APIC_LVT_MASKED should be enough to tell the
+	 * hardware that this timer will never fire. But AMD
+	 * erratum 411 and some Intel CPU behavior circa 2024
+	 * say otherwise. Time for belt and suspenders programming,
+	 * mask the timer and zero the counter registers:
+	 */
+	if (v & APIC_LVT_TIMER_TSCDEADLINE)
+		wrmsrl(MSR_IA32_TSC_DEADLINE, 0);
+	else
+		apic_write(APIC_TMICT, 0);
+
 	return 0;
 }
 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-11  0:43     ` Dave Hansen
@ 2024-10-11  2:04       ` Zhang, Rui
  2024-10-11 10:25       ` Rafael J. Wysocki
  1 sibling, 0 replies; 14+ messages in thread
From: Zhang, Rui @ 2024-10-11  2:04 UTC (permalink / raw)
  To: Hansen, Dave, rafael@kernel.org
  Cc: Luck, Tony, dave.hansen@linux.intel.com, stable@vger.kernel.org,
	Wysocki, Rafael J, linux-kernel@vger.kernel.org, mingo@redhat.com,
	thorsten.blum@toblux.com, tglx@linutronix.de,
	yuntao.wang@linux.dev, hpa@zytor.com, peterz@infradead.org,
	bp@alien8.de, linux-pm@vger.kernel.org, Brown, Len,
	Pandruvada, Srinivas, x86@kernel.org

Hi, Dave,

On Thu, 2024-10-10 at 17:43 -0700, Dave Hansen wrote:
> How about something like the completely untested attached patch?
> 
> IMNHO, it improves on what was posted here because it draws a
> parallel
> with an AMD erratum and also avoids writes to APIC_TMICT that would
> get
> ignored anyway.

Thanks a lot for the rewrite. The patch looks great to me.

I will test it on our test boxes to make sure skipping the APIC_TMICT
write in TSC deadline mode does not bring unexpected impact to the old
machines.

thanks,
rui


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-11  0:43     ` Dave Hansen
  2024-10-11  2:04       ` Zhang, Rui
@ 2024-10-11 10:25       ` Rafael J. Wysocki
  2024-10-11 14:51         ` Pandruvada, Srinivas
  1 sibling, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2024-10-11 10:25 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Rafael J. Wysocki, Zhang Rui, tglx, mingo, bp, dave.hansen,
	rafael.j.wysocki, x86, linux-pm, hpa, peterz, thorsten.blum,
	yuntao.wang, tony.luck, len.brown, srinivas.pandruvada,
	linux-kernel, stable

On Fri, Oct 11, 2024 at 2:43 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> How about something like the completely untested attached patch?
>
> IMNHO, it improves on what was posted here because it draws a parallel
> with an AMD erratum and also avoids writes to APIC_TMICT that would get
> ignored anyway.

Please feel free to add

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

to this one when it's ready.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown
  2024-10-11 10:25       ` Rafael J. Wysocki
@ 2024-10-11 14:51         ` Pandruvada, Srinivas
  0 siblings, 0 replies; 14+ messages in thread
From: Pandruvada, Srinivas @ 2024-10-11 14:51 UTC (permalink / raw)
  To: rafael@kernel.org, Hansen, Dave
  Cc: Luck, Tony, dave.hansen@linux.intel.com, stable@vger.kernel.org,
	Wysocki, Rafael J, linux-kernel@vger.kernel.org, mingo@redhat.com,
	Zhang, Rui, thorsten.blum@toblux.com, tglx@linutronix.de,
	yuntao.wang@linux.dev, hpa@zytor.com, peterz@infradead.org,
	bp@alien8.de, linux-pm@vger.kernel.org, Brown, Len,
	x86@kernel.org

On Fri, 2024-10-11 at 12:25 +0200, Rafael J. Wysocki wrote:
> On Fri, Oct 11, 2024 at 2:43 AM Dave Hansen <dave.hansen@intel.com>
> wrote:
> > 
> > How about something like the completely untested attached patch?
> > 
> > IMNHO, it improves on what was posted here because it draws a
> > parallel
> > with an AMD erratum and also avoids writes to APIC_TMICT that would
> > get
> > ignored anyway.
> 
> Please feel free to add
> 
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> to this one when it's ready.
Tested on Lunar Lake for the new patch

Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-10-11 14:51 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-09  7:20 [PATCH V2] x86/apic: Stop the TSC Deadline timer during lapic timer shutdown Zhang Rui
2024-10-09 11:24 ` Rafael J. Wysocki
2024-10-09 16:41   ` Dave Hansen
2024-10-09 17:03     ` Rafael J. Wysocki
2024-10-09 16:33 ` Ricardo Neri
2024-10-09 17:20   ` Rafael J. Wysocki
2024-10-09 16:49 ` Dave Hansen
2024-10-09 17:19   ` Rafael J. Wysocki
2024-10-09 17:47 ` Dave Hansen
2024-10-09 18:43   ` Rafael J. Wysocki
2024-10-11  0:43     ` Dave Hansen
2024-10-11  2:04       ` Zhang, Rui
2024-10-11 10:25       ` Rafael J. Wysocki
2024-10-11 14:51         ` Pandruvada, Srinivas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox