From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 20 May 2026 10:59:55 -0700
In-Reply-To: <7260682b21c28d1299e58400b9a2f4b8d23bd434.camel@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260515191942.1892718-1-seanjc@google.com>
 <7260682b21c28d1299e58400b9a2f4b8d23bd434.camel@infradead.org>
Subject: Re: [PATCH v3 00/41] x86: Try to wrangle PV clocks vs. TSC
From: Sean Christopherson
To: David Woodhouse
Cc: Kiryl Shutsemau, Paolo Bonzini, "K. Y. Srinivasan", Haiyang Zhang,
 Wei Liu, Dexuan Cui, Long Li, Ajay Kaher, Alexey Makhalov, Jan Kiszka,
 Dave Hansen, Andy Lutomirski, Peter Zijlstra, Juergen Gross,
 Daniel Lezcano, Thomas Gleixner, John Stultz, Rick Edgecombe,
 Vitaly Kuznetsov, Broadcom internal kernel review list,
 Boris Ostrovsky, Stephen Boyd, x86@kernel.org,
 linux-coco@lists.linux.dev, kvm@vger.kernel.org,
 linux-hyperv@vger.kernel.org, virtualization@lists.linux.dev,
 linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org,
 Michael Kelley, Tom Lendacky, Nikunj A Dadhania, Thomas Gleixner
Content-Type: text/plain; charset="utf-8"

On Tue, May 19, 2026, David Woodhouse wrote:
> On Fri, 2026-05-15 at 12:19 -0700, Sean Christopherson wrote:
> > Dave/Thomas/Peter/Boris, what's the going rate for bribes to take something
> > like this through the tip tree?
> >
> > The bulk of the changes are in kvmclock and TSC, but pretty much every
> > hypervisor's guest-side code gets touched at some point.  I am reasonably
> > confident in the correctness of the KVM changes.  Michael tested Hyper-V in
> > v2, and while there were conflicts when rebasing, they were largely
> > superficial (and I've just jinxed myself).  For all other hypervisors, assume
> > the code is compile-tested only, but those changes are all quite small and
> > straightforward.
> >
> > The only changes that are questionable/contentious are the last two patches,
> > which have KVM-as-a-guest use CPUID 0x16 to get the CPU frequency, even on
> > AMD (that's the dubious part).  I very deliberately put them last, so that
> > they can be dropped at will (I don't care terribly if those patches land).
> > To merge them, I would want explicit Acks from Paolo and David W.
> >
> > So, except for the last two patches, to get the stuff I really care about
> > landed, I think/hope it's just the TSC and guest-side CoCo changes that need
> > reviews/acks?
> >
> > The primary goal of this series is (or at least was, when I started) to
> > fix flaws with SNP and TDX guests where a PV clock provided by the untrusted
> > hypervisor is used instead of the secure/trusted TSC that is controlled by
> > trusted firmware.
> >
> > The secondary goal is to draft off of the SNP and TDX changes to slightly
> > modernize running under KVM.  Currently, KVM guests will use TSC for
> > clocksource, but not sched_clock.  And they ignore Intel's CPUID-based TSC
> > and CPU frequency enumeration, even when using the TSC instead of kvmclock.
> > And if the host provides the core crystal frequency in CPUID.0x15, then KVM
> > guests can use that for the APIC timer period instead of manually calibrating
> > the frequency.
> >
> > The tertiary goal is to clean up all of the PV clock code to deduplicate logic
> > across hypervisors, and to hopefully make it all easier to maintain going
> > forward.
>
> I booted this in qemu with -cpu host,+invtsc,+vmware-cpuid-freq
>
> I was expecting to see it eschew the kvmclock and use *only* the TSC.
> Is there even any need for 'tsc-early' given that it's *told* the TSC
> frequency in CPUID? Shouldn't it have detected that the TSC is known
> before init_tsc_clocksource() runs?
>
> And then it even spent some time at boot actually using the kvmclock as
> clocksource... when ideally I don't think it would even have *enabled*
> it at all?

Yeah, that's definitely the ideal state.  And I had all the same expectations
and observations as you when digging in and testing this.  But unless this series
makes things worse, I want to punt on achieving the ideal state for the moment, as
it's proving to be a big lift just to get to a not-awful state.

> [    0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [    0.000000] tsc: Detected 2400.000 MHz processor
> [    0.008205] TSC deadline timer available
> [    0.008270] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
> [    0.159085] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
> [    0.164074] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x22983777dd9, max_idle_ns: 440795300422 ns
> [    0.229087] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
> [    0.337095] clocksource: Switched to clocksource kvm-clock
> [    0.345246] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
> [    0.356201] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x22983777dd9, max_idle_ns: 440795300422 ns
> [    0.360560] clocksource: Switched to clocksource tsc
>