From: David Woodhouse <dwmw2@infradead.org>
To: "Sohil Mehta" <sohil.mehta@intel.com>,
x86@kernel.org, "Dave Hansen" <dave.hansen@linux.intel.com>,
"Tony Luck" <tony.luck@intel.com>,
"Jürgen Gross" <jgross@suse.com>,
"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
xen-devel <xen-devel@lists.xenproject.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Kan Liang <kan.liang@linux.intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@alien8.de>, "H . Peter Anvin" <hpa@zytor.com>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Len Brown <lenb@kernel.org>, Andy Lutomirski <luto@kernel.org>,
Viresh Kumar <viresh.kumar@linaro.org>,
Jean Delvare <jdelvare@suse.com>,
Guenter Roeck <linux@roeck-us.net>,
Zhang Rui <rui.zhang@intel.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
David Laight <david.laight.linux@gmail.com>,
Dapeng Mi <dapeng1.mi@linux.intel.com>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-pm@vger.kernel.org,
kvm@vger.kernel.org, xiaoyao.li@intel.com,
Xin Li <xin@zytor.com>
Subject: Re: [PATCH v3 13/15] x86/cpu/intel: Bound the non-architectural constant_tsc model checks
Date: Thu, 21 Aug 2025 21:09:16 +0100 [thread overview]
Message-ID: <5b905902c99e13d65ea0810b0885fca97cffc74d.camel@infradead.org> (raw)
In-Reply-To: <5f5f1230-f373-469c-b0d9-abc80199886e@intel.com>
[-- Attachment #1: Type: text/plain, Size: 3233 bytes --]
On Thu, 2025-08-21 at 12:43 -0700, Sohil Mehta wrote:
> On 8/21/2025 12:34 PM, Sohil Mehta wrote:
> > On 8/21/2025 6:15 AM, David Woodhouse wrote:
> >
> > > Hm. My test host is INTEL_HASWELL_X (0x63f). For reasons which are
> > > unclear to me, QEMU doesn't set bit 8 of 0x80000007 EDX unless I
> > > explicitly append ',+invtsc' to the existing '-cpu host' on its command
> > > line. So now my guest doesn't think it has X86_FEATURE_CONSTANT_TSC.
> > >
> >
> > Haswell should have X86_FEATURE_CONSTANT_TSC, so I would have expected
> > the guest bit to be set. Until now, X86_FEATURE_CONSTANT_TSC was set
> > based on the Family-model instead of the CPUID enumeration which may
> > have hid the issue.
> >
>
> Correction:
> s/instead/as well as
>
> > From my initial look at the QEMU implementation, this seems intentional.
> >
> > QEMU considers Invariant TSC as un-migratable which prevents it from
> > being exposed to migratable guests (default).
> > target/i386/cpu.c:
> > [FEAT_8000_0007_EDX]
> > .unmigratable_flags = CPUID_APM_INVTSC,
> >
> > Can you please try '-cpu host,migratable=off'?
>
> This is mainly to verify. If confirmed, I am not sure what the long term
> solution should be.
Yes, explicitly turning it on with -cpu host,+invtsc does work.
I've been looking into why it takes a Xen guest four seconds per vCPU
in this case, but not a KVM guest.
When running as a KVM guest, Linux will infer the TSC frequency from
the KVM clock — or better still, from CPUID; see
https://lore.kernel.org/all/20250816101308.2594298-1-dwmw2@infradead.org
and/or
https://lore.kernel.org/all/20250227021855.3257188-36-seanjc@google.com
As a Xen guest though, Linux doesn't do that. This patch in the guest
should make it work without recalibrating the TSC for each vCPU...
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -489,7 +489,15 @@ static void xen_setup_vsyscall_time_info(void)
*/
static int __init xen_tsc_safe_clocksource(void)
{
- u32 eax, ebx, ecx, edx;
+ u32 eax, ebx, ecx, edx;
+ u64 lpj;
+
+ /* Leaf 4, sub-leaf 0 (0x40000x03) */
+ cpuid_count(xen_cpuid_base() + 3, 0, &eax, &ebx, &ecx, &edx);
+
+ lpj = ((u64)ecx * 1000);
+ do_div(lpj, HZ);
+ preset_lpj = lpj;
if (!(boot_cpu_has(X86_FEATURE_CONSTANT_TSC)))
return 0;
@@ -500,9 +508,6 @@ static int __init xen_tsc_safe_clocksource(void)
if (check_tsc_unstable())
return 0;
- /* Leaf 4, sub-leaf 0 (0x40000x03) */
- cpuid_count(xen_cpuid_base() + 3, 0, &eax, &ebx, &ecx, &edx);
-
return ebx == XEN_CPUID_TSC_MODE_NEVER_EMULATE;
}
... but then I got slightly distracted by the question of why I was
getting *nonsense* in those values, and why KVM is 'correcting' EAX in
subleaf 2 which is supposed to be the *host* TSC, not ECX in subleaf
zero...
Under the Fedora 6.13.8-200 kernel I'm fairly sure the guest was seeing
values in subleaf 0 ECX/EDX that *should* have been in subleaf 1
ECX/EDX, and that problem went away when I rebooted the host into a
mainline kernel. Will have to go back and retest that part...
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
next prev parent reply other threads:[~2025-08-21 20:09 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-19 18:41 [PATCH v3 00/15] Prepare for new Intel Family numbers Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 01/15] x86/apic: Fix 32-bit APIC initialization for extended Intel Families Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 02/15] x86/cpu/intel: Fix the movsl alignment preference for extended Families Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 03/15] x86/microcode: Update the Intel processor flag scan check Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 04/15] x86/mtrr: Modify a x86_model check to an Intel VFM check Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 05/15] x86/cpu/intel: Replace early Family 6 checks with VFM ones Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 06/15] x86/cpu/intel: Replace Family 15 " Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 07/15] x86/cpu/intel: Replace Family 5 model " Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 08/15] x86/acpi/cstate: Improve Intel Family model checks Sohil Mehta
2025-02-20 19:20 ` Rafael J. Wysocki
2025-02-19 18:41 ` [PATCH v3 09/15] x86/smpboot: Remove confusing quirk usage in INIT delay Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 10/15] x86/smpboot: Fix INIT delay assignment for extended Intel Families Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 11/15] x86/cpu/intel: Fix fast string initialization for extended Families Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 12/15] x86/pat: Replace Intel x86_model checks with VFM ones Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 13/15] x86/cpu/intel: Bound the non-architectural constant_tsc model checks Sohil Mehta
2025-08-21 13:15 ` David Woodhouse
2025-08-21 19:34 ` Sohil Mehta
2025-08-21 19:43 ` Sohil Mehta
2025-08-21 20:09 ` David Woodhouse [this message]
2025-08-22 1:46 ` Xiaoyao Li
2025-08-24 22:39 ` Demi Marie Obenour
2025-02-19 18:41 ` [PATCH v3 14/15] perf/x86: Simplify Intel PMU initialization Sohil Mehta
2025-02-19 20:10 ` Liang, Kan
2025-02-19 20:31 ` Sohil Mehta
2025-02-19 20:45 ` Liang, Kan
2025-02-27 0:16 ` [PATCH v3.1 " Sohil Mehta
2025-02-19 18:41 ` [PATCH v3 15/15] perf/x86/p4: Replace Pentium 4 model checks with VFM ones Sohil Mehta
2025-02-19 20:11 ` Liang, Kan
2025-03-17 17:09 ` [PATCH v3 00/15] Prepare for new Intel Family numbers Sohil Mehta
2025-03-18 18:35 ` Ingo Molnar
2025-03-18 19:10 ` Sohil Mehta
2025-03-18 20:13 ` Ingo Molnar
2025-03-19 15:53 ` Sohil Mehta
2025-03-19 19:46 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5b905902c99e13d65ea0810b0885fca97cffc74d.camel@infradead.org \
--to=dwmw2@infradead.org \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=andrew.cooper3@citrix.com \
--cc=boris.ostrovsky@oracle.com \
--cc=bp@alien8.de \
--cc=dapeng1.mi@linux.intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=david.laight.linux@gmail.com \
--cc=hpa@zytor.com \
--cc=irogers@google.com \
--cc=jdelvare@suse.com \
--cc=jgross@suse.com \
--cc=jolsa@kernel.org \
--cc=kan.liang@linux.intel.com \
--cc=kvm@vger.kernel.org \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=linux@roeck-us.net \
--cc=luto@kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=rafael@kernel.org \
--cc=rui.zhang@intel.com \
--cc=sohil.mehta@intel.com \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=viresh.kumar@linaro.org \
--cc=x86@kernel.org \
--cc=xen-devel@lists.xenproject.org \
--cc=xiaoyao.li@intel.com \
--cc=xin@zytor.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).