public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: fuqiang wang <fuqiang.wng@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	 Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	 Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org,  "H . Peter Anvin" <hpa@zytor.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	kvm@vger.kernel.org,  linux-kernel@vger.kernel.org,
	yu chen <chen.yu@easystack.com>,
	 dongxu zhang <dongxu.zhang@easystack.com>
Subject: Re: [PATCH RESEND] avoid hv timer fallback to sw timer if delay exceeds period
Date: Fri, 17 Oct 2025 08:59:00 -0700	[thread overview]
Message-ID: <aPJnxDj4mFSJc0tV@google.com> (raw)
In-Reply-To: <c87d11a7-b4dd-463e-b40a-188fd2219b3b@gmail.com>

On Fri, Oct 17, 2025, fuqiang wang wrote:
> On 10/14/25 7:29 AM, Sean Christopherson wrote:
> > On Wed, Oct 01, 2025, fuqiang wang wrote:
> > The only code that cares is __kvm_wait_lapic_expire(), and the only downside to
> > setting tscdeadline=L1.TSC is that adjust_lapic_timer_advance() won't adjust as
> > aggressively as it probably should.
> 
> I am not sure which type of timers should use the "advanced tscdeadline hrtimer
> expiration feature".
> 
> I list the history of this feature.
> 
> 1. Marcelo first introduce this feature, only support the tscdeadline sw timer.
> 2. Yunhong introduce vmx preemption timer(hv), only support for tscdeadline.
> 3. Liwanpeng extend the hv timer to oneshot and period timers.
> 4. Liwanpeng extend this feature to hv timer.
> 5. Sean and liwanpeng fix some BUG extend this feature to hv period/oneshot timer.
> 
> [1] d0659d946be0("KVM: x86: add option to advance tscdeadline hrtimer expiration")
>     Marcelo Tosatti     Dec 16 2014
> [2] ce7a058a2117("KVM: x86: support using the vmx preemption timer for tsc deadline timer")
>     Yunhong Jiang       Jun 13 2016
> [3] 8003c9ae204e("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support")
>     liwanpeng           Oct 24 2016
> [4] c5ce8235cffa("KVM: VMX: Optimize tscdeadline timer latency")
>     liwanpeng           May 29 2018
> [5] ee66e453db13("KVM: lapic: Busy wait for timer to expire when using hv_timer")
>     Sean Christopherson Apr 16 2019
> 
>     d981dd15498b("KVM: LAPIC: Accurately guarantee busy wait for timer to expire when using hv_timer")
>     liwanpeng           Apr 28 2021
> 
> Now, timers supported for this feature includes:
> - sw: tscdeadline
> - hv: all (tscdeadline, oneshot, period)
> 
> ====
> IMHO
> ====
> 
> 1. for period timer
> ===================
> 
> I think for periodic timers emulation, the expiration time is already adjusted
> to compensate for the delays introduced by timer emulation, so don't need this
> feature to adjust again. But after use the feature, the first timer expiration
> may be relatively accurate.
> 
> E.g., At time 0, start a periodic task (period: 10,000 ns) with a simulated
> delay of 100 ns.
> 
> With this feature enabled and reasonably accurate prediction, the expiration
> time set seen by the guest are: 10000, 20000, 30000...
> 
> With this feature not enabled, the expiration times set: 10100, 20100, 30100...
> 
> But IMHO, for periodic timers, accuracy of the period seems to be the main
> concern, because it does not frequently start and stop. The incorrect period
> caused by the first timer expiration can be ignored.

I agree it's superfluous, but applying the advancement also does no harm, and
avoiding it would be moreeffort than simply letting KVM predict the first expiration.

> 2. for oneshot timer
> ====================
> 
> In [1], Marcelo treated oneshot and tscdeadline differently. Shouldn’t the
> behavior of these two timers be similar?

Yes, but they aren't identical, and so supporting both would require additional
code, complexity, and testing.

> Unlike periodic timers, both oneshot and tscdeadline timers set a specific
> expiration time, and what the guest cares about is whether the expiration
> time is accurate. Moreover, this feature is mainly intended to mitigate the
> latency introduced by timer virtualization.  Since software timers have
> higher latency compared to hardware virtual timers, the need for this feature
> is actually more urgent for software timers.

Yep.

> However, in the current implementation, the feature is applied to hv
> oneshot/period timers, but not to sw oneshot/period timers.
>
> ===============
> Summary of IMHO
> ===============
> 
> The feature should be applied to the following timer types:
> sw/hv tscdeadline and sw/hv oneshot

In a perfect world, probably?  But I don't know that it's worth changing at this
time.  Much of this is balancing complexity with benefit, though it's also most
definitely a reflection of the initial implementation.

KVM unconditionally emulates TSC-deadline mode, and AFAIK every real-world kernel
prefers TSC-deadline over one-shot, and so in practice the benefits of applying
the advancement to one-shot hrtimers.  That was also the way the world was headed
back when Marcelo first implemented the support.  I don't know for sure why the
initial implementation targeted only TSC-deadline mode, but I think it's safe to
assume that the use case Marcelo was targeting exclusively used TSC-deadline.

I'm not entirely opposed to playing the advancement games with one-shot hrtimers,
but it's also not clear to me that it's worth doing.  E.g. supporting one-shot
hrtimers would likely require a bit of extra complexity to juggle the different
time domains.  And if the only use cases that are truly sensitive to timer
programming latency exclusively use TSC-deadline mode (because one-shot mode is
inherently "fuzzy"), then any amount of extra complexity is effectively dead weight.

> should not be applied to:
> sw/hv period

I wouldn't say "should not be applied to", I think it's more "doesn't provide much
benefit to".

  reply	other threads:[~2025-10-17 15:59 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-13 12:51 [PATCH RESEND] avoid hv timer fallback to sw timer if delay exceeds period fuqiang wang
2025-10-13 23:29 ` Sean Christopherson
2025-10-17 12:21   ` fuqiang wang
2025-10-17 15:59     ` Sean Christopherson [this message]
2025-10-21 15:37       ` fuqiang wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aPJnxDj4mFSJc0tV@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=chen.yu@easystack.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dongxu.zhang@easystack.com \
    --cc=fuqiang.wng@gmail.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox