public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Laurence Oberman <loberman@redhat.com>,
	Vincent Whitchurch <vincent.whitchurch@axis.com>,
	Michal Hocko <mhocko@suse.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 6/7] watchdog: Cleanup handling of false positives
Date: Sun, 16 May 2021 19:46:21 +0900	[thread overview]
Message-ID: <YKD3/RuL/2qUcRhL@google.com> (raw)

Hi,

// This was never in my inbox, so sorry if I mess up the "Reply-to"
// Original message:  https://lore.kernel.org/lkml/20210311122130.6788-7-pmladek@suse.com/


>@@ -375,7 +375,14 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> 	/* .. and repeat */
> 	hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period));
>
> -	/* Reset the interval when touched externally by a known slow code. */
> +	/*
> +	 * If a virtual machine is stopped by the host it can look to
> +	 * the watchdog like a soft lockup. Check to see if the host
> +	 * stopped the vm before we process the timestamps.
> +	 */
> +	kvm_check_and_clear_guest_paused();
> +
[..]
>@@ -401,14 +405,6 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> 	 */
> 	duration = is_softlockup(touch_ts, period_ts);
> 	if (unlikely(duration)) {
> -		/*
> -		 * If a virtual machine is stopped by the host it can look to
> -		 * the watchdog like a soft lockup, check to see if the host
> -		 * stopped the vm before we issue the warning
> -		 */
> -		if (kvm_check_and_clear_guest_paused())
> -			return HRTIMER_RESTART;

This looks racy to me. I believe kvm_check_and_clear_guest_paused()
was in the right place.

VCPU can be scheduled out/preepmpted any time at any point; and then
guest VM (or even the entire system) can be suspended. When we resume
the VM we continue from where we were preempted (from VCPU POW).

So what the old code did

watchdog_timer_fn()
{
	...
	<<!!>>

	// Suppose we are suspended here. When we are getting resumed
	// jiffies jump forward, which may look like a soft lockup.
	duration = is_softlockup(touch_ts, period_ts);
	if (unlikely(duration)) {
		// And this is where kvm_check_and_clear_guest_paused()
		// jumps in. We know already that jiffies have jumped,
		// we don't know if jiffies jumped because the VM was
		// suspended. And this is what we figure out here and
		// bail out
		if (kvm_check_and_clear_guest_paused())
			return HRTIMER_RESTART;
	}
}

The new code does the following

watchdog_timer_fn()
{
	...
	kvm_check_and_clear_guest_paused(); // PVCLOCK_GUEST_STOPPED is not set

	<<!!>>

	// Suppose the VM got suspended at this point. PVCLOCK_GUEST_STOPPED
	// is set, but we don't check it. jiffies will jump and this will look
	// like a lockup, but we don't check if jiffies jumped because the VM
	// was suspended
	duration = is_softlockup(touch_ts, period_ts);
	if (unlikely(duration)) {
		// report the lockup and perhaps panic the system,
		// depending on the configuration
	}
}

What am I missing?

             reply	other threads:[~2021-05-16 10:46 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-16 10:46 Sergey Senozhatsky [this message]
2021-05-17 15:01 ` [PATCH v2 6/7] watchdog: Cleanup handling of false positives Petr Mladek
  -- strict thread matches above, loose matches on Subject: below --
2021-03-11 12:21 [PATCH v2 0/7] watchdog/softlockup: Report overall time and some cleanup Petr Mladek
2021-03-11 12:21 ` [PATCH v2 6/7] watchdog: Cleanup handling of false positives Petr Mladek
2020-12-10 16:00 [PATCH v2 0/7] watchdog/softlockup: Report overall time and some cleanup Petr Mladek
2020-12-10 16:00 ` [PATCH v2 6/7] watchdog: Cleanup handling of false positives Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YKD3/RuL/2qUcRhL@google.com \
    --to=senozhatsky@chromium.org \
    --cc=20210311122130.6788-7-pmladek@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=loberman@redhat.com \
    --cc=mhocko@suse.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=tglx@linutronix.de \
    --cc=vincent.whitchurch@axis.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox