All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] watchdog: revert cleanup handling of false positives
@ 2021-05-17 14:06 Sergey Senozhatsky
  2021-05-18 15:36 ` Petr Mladek
  0 siblings, 1 reply; 7+ messages in thread
From: Sergey Senozhatsky @ 2021-05-17 14:06 UTC (permalink / raw)
  To: Andrew Morton, Petr Mladek
  Cc: Peter Zijlstra, linux-kernel, Sergey Senozhatsky

This reverts commit 9bf3bc949f8aeefeacea4b1198db833b722a8e27.

I can reproduce the case when resumed VCPU starts to execute
is_softlockup() with PVCLOCK_GUEST_STOPPED set on this VCPU:

	watchdog_timer_fn()
	{
		...

		kvm_check_and_clear_guest_paused();

		...

		duration = is_softlockup(touch_ts, period_ts);
		if (unlikely(duration)) {
			....
		}
	}

Which means that guest VCPU has been suspended between
kvm_check_and_clear_guest_paused() and is_softlockup(),
and jiffies (clock) thus shifted forward.

VCPU can be preempted anywhere, so we want to do
kvm_check_and_clear_guest_paused() at the very last
moment: when we are in the soft-lockup branch but
then we figure out that it's false positive and we
need to bail out.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 kernel/watchdog.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7c397907d0e9..8cf0678378d2 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -376,14 +376,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 	/* .. and repeat */
 	hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period));
 
-	/*
-	 * If a virtual machine is stopped by the host it can look to
-	 * the watchdog like a soft lockup. Check to see if the host
-	 * stopped the vm before we process the timestamps.
-	 */
-	kvm_check_and_clear_guest_paused();
-
-	/* Reset the interval when touched by known problematic code. */
+	/* Reset the interval when touched externally by a known slow code. */
 	if (period_ts == SOFTLOCKUP_DELAY_REPORT) {
 		if (unlikely(__this_cpu_read(softlockup_touch_sync))) {
 			/*
@@ -394,7 +387,10 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 			sched_clock_tick();
 		}
 
+		/* Clear the guest paused flag on watchdog reset */
+		kvm_check_and_clear_guest_paused();
 		update_report_ts();
+
 		return HRTIMER_RESTART;
 	}
 
@@ -406,6 +402,14 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 	 */
 	duration = is_softlockup(touch_ts, period_ts);
 	if (unlikely(duration)) {
+		/*
+		 * If a virtual machine is stopped by the host it can look to
+		 * the watchdog like a soft lockup, check to see if the host
+		 * stopped the vm before we issue the warning
+		 */
+		if (kvm_check_and_clear_guest_paused())
+			return HRTIMER_RESTART;
+
 		/*
 		 * Prevent multiple soft-lockup reports if one cpu is already
 		 * engaged in dumping all cpu back traces.
-- 
2.31.1.751.gd2f1c929bd-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-05-20  5:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-05-17 14:06 [PATCH] watchdog: revert cleanup handling of false positives Sergey Senozhatsky
2021-05-18 15:36 ` Petr Mladek
2021-05-19 11:43   ` [PATCH] watchdog: Reliable handling of timestamps Petr Mladek
2021-05-19 12:01     ` Petr Mladek
2021-05-20  5:52       ` Sergey Senozhatsky
2021-05-20  5:29     ` Sergey Senozhatsky
2021-05-20  5:13   ` [PATCH] watchdog: revert cleanup handling of false positives Sergey Senozhatsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.