All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Anna-Maria Behnsen <anna-maria@linutronix.de>,
	Jacob Pan <jacob.jun.pan@linux.intel.com>,
	Len Brown <lenb@kernel.org>
Subject: [PATCH 02/10] cpuidle: Fix CPUIDLE_FLAG_IRQ_ENABLE leaking timer reprogram
Date: Fri, 11 Aug 2023 19:00:41 +0200	[thread overview]
Message-ID: <20230811170049.308866-3-frederic@kernel.org> (raw)
In-Reply-To: <20230811170049.308866-1-frederic@kernel.org>

From: Peter Zijlstra <peterz@infradead.org>

intel_idle_irq() re-enables IRQs very early. As a result, an interrupt
may fire before mwait() is eventually called. If such an interrupt queues
a timer, it may go unnoticed until mwait returns and the idle loop
handles the tick re-evaluation. And monitoring TIF_NEED_RESCHED doesn't
help because a local timer enqueue doesn't set that flag.

The issue is mitigated by the fact that this idle handler is only invoked
for shallow C-states when, presumably, the next tick is supposed to be
close enough. There may still be rare cases though when the next tick
is far away and the selected C-state is shallow, resulting in a timer
getting ignored for a while.

Fix this with using sti_mwait() whose IRQ-reenablement only triggers
upon calling mwait(), dealing with the race while keeping the interrupt
latency within acceptable bounds.

Fixes: c227233ad64c (intel_idle: enable interrupts before C1 on Xeons)
Not-yet-signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 arch/x86/include/asm/mwait.h | 11 +++++++++--
 drivers/idle/intel_idle.c    | 19 +++++++------------
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 341ee4f1d91e..920426d691ce 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -124,8 +124,15 @@ static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned lo
 		}
 
 		__monitor((void *)&current_thread_info()->flags, 0, 0);
-		if (!need_resched())
-			__mwait(eax, ecx);
+
+		if (!need_resched()) {
+			if (ecx & 1) {
+				__mwait(eax, ecx);
+			} else {
+				__sti_mwait(eax, ecx);
+				raw_local_irq_disable();
+			}
+		}
 	}
 	current_clr_polling();
 }
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 256c2d42e350..d676d32741da 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -131,11 +131,12 @@ static unsigned int mwait_substates __initdata;
 #define MWAIT2flg(eax) ((eax & 0xFF) << 24)
 
 static __always_inline int __intel_idle(struct cpuidle_device *dev,
-					struct cpuidle_driver *drv, int index)
+					struct cpuidle_driver *drv,
+					int index, bool irqoff)
 {
 	struct cpuidle_state *state = &drv->states[index];
 	unsigned long eax = flg2MWAIT(state->flags);
-	unsigned long ecx = 1; /* break on interrupt flag */
+	unsigned long ecx = 1*irqoff; /* break on interrupt flag */
 
 	mwait_idle_with_hints(eax, ecx);
 
@@ -159,19 +160,13 @@ static __always_inline int __intel_idle(struct cpuidle_device *dev,
 static __cpuidle int intel_idle(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv, int index)
 {
-	return __intel_idle(dev, drv, index);
+	return __intel_idle(dev, drv, index, true);
 }
 
 static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
 				    struct cpuidle_driver *drv, int index)
 {
-	int ret;
-
-	raw_local_irq_enable();
-	ret = __intel_idle(dev, drv, index);
-	raw_local_irq_disable();
-
-	return ret;
+	return __intel_idle(dev, drv, index, false);
 }
 
 static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
@@ -184,7 +179,7 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 	if (smt_active)
 		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
 
-	ret = __intel_idle(dev, drv, index);
+	ret = __intel_idle(dev, drv, index, true);
 
 	if (smt_active)
 		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
@@ -196,7 +191,7 @@ static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
 	fpu_idle_fpregs();
-	return __intel_idle(dev, drv, index);
+	return __intel_idle(dev, drv, index, true);
 }
 
 /**
-- 
2.34.1


  parent reply	other threads:[~2023-08-11 17:01 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-11 17:00 [PATCH 00/10] timers/cpuidle: Fixes and cleanups Frederic Weisbecker
2023-08-11 17:00 ` [PATCH 01/10] x86: Add a comment about the "magic" behind shadow sti before mwait Frederic Weisbecker
2023-08-11 17:44   ` Rafael J. Wysocki
2023-08-11 17:00 ` Frederic Weisbecker [this message]
2023-08-11 17:35   ` [PATCH 02/10] cpuidle: Fix CPUIDLE_FLAG_IRQ_ENABLE leaking timer reprogram Rafael J. Wysocki
2023-08-11 17:00 ` [PATCH 03/10] cpuidle: Report illegal tick stopped while polling Frederic Weisbecker
2023-08-11 17:37   ` Rafael J. Wysocki
2023-08-29 13:04     ` Frederic Weisbecker
2023-08-11 17:00 ` [PATCH 04/10] loongson: Fix idle VS timer enqueue Frederic Weisbecker
2023-08-14  2:58   ` bibo mao
2023-08-11 17:00 ` [PATCH 05/10] cpuidle: Comment about timers requirements VS idle handler Frederic Weisbecker
2023-08-11 17:38   ` Rafael J. Wysocki
2023-08-11 17:00 ` [PATCH 06/10] cpuidle: Remove unnecessary current_clr_polling_and_test() from haltpoll Frederic Weisbecker
2023-08-11 17:39   ` Rafael J. Wysocki
2023-08-11 17:00 ` [PATCH 07/10] cpuidle: Remove unnecessary current_clr_polling() on poll_idle() Frederic Weisbecker
2023-08-11 17:40   ` Rafael J. Wysocki
2023-08-11 17:00 ` [PATCH 08/10] x86: Remove __current_clr_polling() from mwait_idle() Frederic Weisbecker
2023-08-11 17:40   ` Rafael J. Wysocki
2023-08-11 17:00 ` [PATCH 09/10] x86: Remove the current_clr_polling() call upon mwait exit Frederic Weisbecker
2023-08-11 17:41   ` Rafael J. Wysocki
2023-08-11 17:00 ` [PATCH 10/10] sched/timers: Explain why idle task schedules out on remote timer enqueue Frederic Weisbecker
2023-08-11 17:43   ` Rafael J. Wysocki
2023-08-15 16:10 ` [PATCH 00/10] timers/cpuidle: Fixes and cleanups Peter Zijlstra
2023-08-29 11:28   ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230811170049.308866-3-frederic@kernel.org \
    --to=frederic@kernel.org \
    --cc=anna-maria@linutronix.de \
    --cc=daniel.lezcano@linaro.org \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.