public inbox for linux-kernel@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: Juergen Gross <jgross@suse.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Jan Beulich <jbeulich@suse.com>,
	xen-devel@lists.xenproject.org,
	"Paul E. McKenney" <paulmck@kernel.org>,
	x86@kernel.org
Subject: [BUG] XEN/PV dom0 time management
Date: Mon, 07 Aug 2023 11:19:38 +0200
Message-ID: <87a5v3us45.ffs@tglx>

Hi!

Something in XEN/PV time management seems to be seriously broken:

timekeeping watchdog on CPU9: Marking clocksource 'tsc' as unstable because the skew is too large:
[  152.557154] clocksource:                       'xen' wd_nsec: 511979417 wd_now: 24e4d7625e wd_last: 24c65332c5 mask: ffffffffffffffff
[  152.566197] clocksource:                       'tsc' cs_nsec: 512468734 cs_now: 9a306c9b808c cs_last: 9a302c9e30ba mask: ffffffffffffffff
[  152.572319] clocksource:                       Clocksource 'tsc' skewed 489317 ns (0 ms) over watchdog 'xen' interval of 511979417 ns (511 ms)
[  152.578067] clocksource:                       'tsc' is current clocksource.
[  152.581023] tsc: Marking TSC unstable due to clocksource watchdog
[  152.583751] clocksource: Checking clocksource tsc synchronization from CPU 5 to CPUs 0,3,8,10,12,15.
[  152.590860] clocksource:         CPUs 8 ahead of CPU 5 for clocksource tsc.
[  152.597196] clocksource:         CPU 5 check durations 14197ns - 124761ns for clocksource tsc.
[  152.602675] clocksource: Switched to clocksource xen
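
The numbers in the report above can be cross-checked by hand. The
following is a minimal sketch, not the kernel's implementation: it only
redoes the arithmetic on the values printed in the log. All variable and
function names are illustrative. The raw 'xen' delta comes out equal to
wd_nsec, which is consistent with a watchdog that counts nanoseconds
directly, and the TSC rate is merely derived from cs_now - cs_last over
cs_nsec.

```python
# Values copied verbatim from the watchdog report; names are illustrative.
WD_NOW, WD_LAST = 0x24E4D7625E, 0x24C65332C5      # 'xen' wd_now / wd_last
CS_NOW, CS_LAST = 0x9A306C9B808C, 0x9A302C9E30BA  # 'tsc' cs_now / cs_last
WD_NSEC, CS_NSEC = 511979417, 512468734           # wd_nsec / cs_nsec
MASK = 0xFFFFFFFFFFFFFFFF                         # mask printed for both


def masked_delta(now, last, mask=MASK):
    """Difference of two counter readings, wrapped to the counter width."""
    return (now - last) & mask


wd_delta = masked_delta(WD_NOW, WD_LAST)   # raw 'xen' units in the interval
cs_cycles = masked_delta(CS_NOW, CS_LAST)  # TSC cycles in the interval

skew_ns = CS_NSEC - WD_NSEC                # what the log calls "skewed"
skew_ppm = skew_ns / WD_NSEC * 1e6         # skew relative to the interval
tsc_ghz = cs_cycles / CS_NSEC              # cycles per ns, derived only

print(f"xen delta : {wd_delta} (wd_nsec reported: {WD_NSEC})")
print(f"tsc cycles: {cs_cycles}")
print(f"skew      : {skew_ns} ns, about {skew_ppm:.0f} ppm")
print(f"tsc rate  : about {tsc_ghz:.3f} GHz (derived from the log)")
```

Note that the reported skew of 489317 ns is simply cs_nsec minus
wd_nsec, i.e. the TSC measured the same interval as slightly longer than
the 'xen' watchdog did.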

This is fully reproducible with variations of the failure report in the
following setup:

  - VM running on KVM on a SKLX machine

  - Debian bookworm install with XEN 4.17

  - Happens with the off-the-shelf Debian 6.1 kernel and with current
    upstream (6.5-rc4)

Why am I convinced that this is a XENPV issue?

Simply because the same kernels booted w/o XEN on the same VM and the
same hardware do not have any issue with using TSC as clocksource. The
TSC on that machine is stable and fully synchronized. The clocksource
watchdog uses kvm-clock to monitor TSC and it never had any complaints.
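
To compare the two boot configurations, the active and available
clocksources can be read from sysfs. This is a small sketch under the
assumption that the standard clocksource0 sysfs directory is present;
the helper name read_clocksources is made up for illustration.

```python
from pathlib import Path

# Standard sysfs location of the clocksource attributes on Linux.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_DIR):
    """Return (current, [available...]) as exposed by the kernel."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


if __name__ == "__main__":
    try:
        current, available = read_clocksources()
        print(f"current:   {current}")
        print(f"available: {' '.join(available)}")
    except OSError as exc:
        # Not on Linux, or sysfs is not mounted where expected.
        print(f"cannot read clocksource info: {exc}")
```

Running it once on the plain KVM guest and once under XEN shows which
clocksource is in use in each case and which other clocksources the
kernel registered.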

But with XEN underneath, it is only a matter of minutes after boot until
this happens. I tried to make sense of it, but ran out of steam and
patience, so I decided to report this to the XEN wizards.

Thanks,

        tglx
