All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: fuqiang wang <fuqiang.wng@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	 Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	 Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org,  "H . Peter Anvin" <hpa@zytor.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	kvm@vger.kernel.org,  linux-kernel@vger.kernel.org,
	yu chen <chen.yu@easystack.com>,
	 dongxu zhang <dongxu.zhang@easystack.com>
Subject: Re: [PATCH RESEND] avoid hv timer fallback to sw timer if delay exceeds period
Date: Fri, 17 Oct 2025 08:59:00 -0700	[thread overview]
Message-ID: <aPJnxDj4mFSJc0tV@google.com> (raw)
In-Reply-To: <c87d11a7-b4dd-463e-b40a-188fd2219b3b@gmail.com>

On Fri, Oct 17, 2025, fuqiang wang wrote:
> On 10/14/25 7:29 AM, Sean Christopherson wrote:
> > On Wed, Oct 01, 2025, fuqiang wang wrote:
> > The only code that cares is __kvm_wait_lapic_expire(), and the only downside to
> > setting tscdeadline=L1.TSC is that adjust_lapic_timer_advance() won't adjust as
> > aggressively as it probably should.
> 
> I am not sure which type of timers should use the "advanced tscdeadline hrtimer
> expiration feature".
> 
> I list the history of this feature.
> 
> 1. Marcelo first introduce this feature, only support the tscdeadline sw timer.
> 2. Yunhong introduce vmx preemption timer(hv), only support for tscdeadline.
> 3. Liwanpeng extend the hv timer to oneshot and period timers.
> 4. Liwanpeng extend this feature to hv timer.
> 5. Sean and liwanpeng fix some BUG extend this feature to hv period/oneshot timer.
> 
> [1] d0659d946be0("KVM: x86: add option to advance tscdeadline hrtimer expiration")
>     Marcelo Tosatti     Dec 16 2014
> [2] ce7a058a2117("KVM: x86: support using the vmx preemption timer for tsc deadline timer")
>     Yunhong Jiang       Jun 13 2016
> [3] 8003c9ae204e("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support")
>     liwanpeng           Oct 24 2016
> [4] c5ce8235cffa("KVM: VMX: Optimize tscdeadline timer latency")
>     liwanpeng           May 29 2018
> [5] ee66e453db13("KVM: lapic: Busy wait for timer to expire when using hv_timer")
>     Sean Christopherson Apr 16 2019
> 
>     d981dd15498b("KVM: LAPIC: Accurately guarantee busy wait for timer to expire when using hv_timer")
>     liwanpeng           Apr 28 2021
> 
> Now, timers supported for this feature includes:
> - sw: tscdeadline
> - hv: all (tscdeadline, oneshot, period)
> 
> ====
> IMHO
> ====
> 
> 1. for period timer
> ===================
> 
> I think for periodic timers emulation, the expiration time is already adjusted
> to compensate for the delays introduced by timer emulation, so don't need this
> feature to adjust again. But after use the feature, the first timer expiration
> may be relatively accurate.
> 
> E.g., At time 0, start a periodic task (period: 10,000 ns) with a simulated
> delay of 100 ns.
> 
> With this feature enabled and reasonably accurate prediction, the expiration
> time set seen by the guest are: 10000, 20000, 30000...
> 
> With this feature not enabled, the expiration times set: 10100, 20100, 30100...
> 
> But IMHO, for periodic timers, accuracy of the period seems to be the main
> concern, because it does not frequently start and stop. The incorrect period
> caused by the first timer expiration can be ignored.

I agree it's superfluous, but applying the advancement also does no harm, and
avoiding it would be moreeffort than simply letting KVM predict the first expiration.

> 2. for oneshot timer
> ====================
> 
> In [1], Marcelo treated oneshot and tscdeadline differently. Shouldn’t the
> behavior of these two timers be similar?

Yes, but they aren't identical, and so supporting both would require additional
code, complexity, and testing.

> Unlike periodic timers, both oneshot and tscdeadline timers set a specific
> expiration time, and what the guest cares about is whether the expiration
> time is accurate. Moreover, this feature is mainly intended to mitigate the
> latency introduced by timer virtualization.  Since software timers have
> higher latency compared to hardware virtual timers, the need for this feature
> is actually more urgent for software timers.

Yep.

> However, in the current implementation, the feature is applied to hv
> oneshot/period timers, but not to sw oneshot/period timers.
>
> ===============
> Summary of IMHO
> ===============
> 
> The feature should be applied to the following timer types:
> sw/hv tscdeadline and sw/hv oneshot

In a perfect world, probably?  But I don't know that it's worth changing at this
time.  Much of this is balancing complexity with benefit, though it's also most
definitely a reflection of the initial implementation.

KVM unconditionally emulates TSC-deadline mode, and AFAIK every real-world kernel
prefers TSC-deadline over one-shot, and so in practice the benefits of applying
the advancement to one-shot hrtimers.  That was also the way the world was headed
back when Marcelo first implemented the support.  I don't know for sure why the
initial implementation targeted only TSC-deadline mode, but I think it's safe to
assume that the use case Marcelo was targeting exclusively used TSC-deadline.

I'm not entirely opposed to playing the advancement games with one-shot hrtimers,
but it's also not clear to me that it's worth doing.  E.g. supporting one-shot
hrtimers would likely require a bit of extra complexity to juggle the different
time domains.  And if the only use cases that are truly sensitive to timer
programming latency exclusively use TSC-deadline mode (because one-shot mode is
inherently "fuzzy"), then any amount of extra complexity is effectively dead weight.

> should not be applied to:
> sw/hv period

I wouldn't say "should not be applied to", I think it's more "doesn't provide much
benefit to".

  reply	other threads:[~2025-10-17 15:59 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-13 12:51 [PATCH RESEND] avoid hv timer fallback to sw timer if delay exceeds period fuqiang wang
2025-10-13 23:29 ` Sean Christopherson
2025-10-17 12:21   ` fuqiang wang
2025-10-17 15:59     ` Sean Christopherson [this message]
2025-10-21 15:37       ` fuqiang wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aPJnxDj4mFSJc0tV@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=chen.yu@easystack.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dongxu.zhang@easystack.com \
    --cc=fuqiang.wng@gmail.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.