From: Sean Christopherson <seanjc@google.com>
To: fuqiang wang <fuqiang.wng@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>,
Maxim Levitsky <mlevitsk@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
yu chen <chen.yu@easystack.com>,
dongxu zhang <dongxu.zhang@easystack.com>
Subject: Re: [PATCH RESEND] avoid hv timer fallback to sw timer if delay exceeds period
Date: Fri, 17 Oct 2025 08:59:00 -0700
Message-ID: <aPJnxDj4mFSJc0tV@google.com>
In-Reply-To: <c87d11a7-b4dd-463e-b40a-188fd2219b3b@gmail.com>
On Fri, Oct 17, 2025, fuqiang wang wrote:
> On 10/14/25 7:29 AM, Sean Christopherson wrote:
> > On Wed, Oct 01, 2025, fuqiang wang wrote:
> > The only code that cares is __kvm_wait_lapic_expire(), and the only downside to
> > setting tscdeadline=L1.TSC is that adjust_lapic_timer_advance() won't adjust as
> > aggressively as it probably should.
>
> I am not sure which type of timers should use the "advanced tscdeadline hrtimer
> expiration feature".
>
> I list the history of this feature.
>
> 1. Marcelo first introduce this feature, only support the tscdeadline sw timer.
> 2. Yunhong introduce vmx preemption timer(hv), only support for tscdeadline.
> 3. Liwanpeng extend the hv timer to oneshot and period timers.
> 4. Liwanpeng extend this feature to hv timer.
> 5. Sean and liwanpeng fix some BUG extend this feature to hv period/oneshot timer.
>
> [1] d0659d946be0("KVM: x86: add option to advance tscdeadline hrtimer expiration")
> Marcelo Tosatti Dec 16 2014
> [2] ce7a058a2117("KVM: x86: support using the vmx preemption timer for tsc deadline timer")
> Yunhong Jiang Jun 13 2016
> [3] 8003c9ae204e("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support")
> liwanpeng Oct 24 2016
> [4] c5ce8235cffa("KVM: VMX: Optimize tscdeadline timer latency")
> liwanpeng May 29 2018
> [5] ee66e453db13("KVM: lapic: Busy wait for timer to expire when using hv_timer")
> Sean Christopherson Apr 16 2019
>
> d981dd15498b("KVM: LAPIC: Accurately guarantee busy wait for timer to expire when using hv_timer")
> liwanpeng Apr 28 2021
>
> Now, timers supported for this feature includes:
> - sw: tscdeadline
> - hv: all (tscdeadline, oneshot, period)
>
> ====
> IMHO
> ====
>
> 1. for period timer
> ===================
>
> I think for periodic timers emulation, the expiration time is already adjusted
> to compensate for the delays introduced by timer emulation, so don't need this
> feature to adjust again. But after use the feature, the first timer expiration
> may be relatively accurate.
>
> E.g., At time 0, start a periodic task (period: 10,000 ns) with a simulated
> delay of 100 ns.
>
> With this feature enabled and reasonably accurate prediction, the expiration
> time set seen by the guest are: 10000, 20000, 30000...
>
> With this feature not enabled, the expiration times set: 10100, 20100, 30100...
>
> But IMHO, for periodic timers, accuracy of the period seems to be the main
> concern, because it does not frequently start and stop. The incorrect period
> caused by the first timer expiration can be ignored.
I agree it's superfluous, but applying the advancement also does no harm, and
avoiding it would be more effort than simply letting KVM predict the first
expiration.
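To make the arithmetic in the 10,000 ns example concrete, here is a minimal
Python sketch. It is a toy model, not KVM code: observed_expirations() and
deltas() are hypothetical helpers, and it assumes a constant 100 ns emulation
delay and a perfectly accurate prediction.

```python
def observed_expirations(period_ns, delay_ns, advance_ns, count):
    """Times at which the guest observes each periodic expiration.

    Toy model (assumption): expiration n targets n * period_ns, the host
    timer is armed advance_ns early, and delivery costs delay_ns.
    """
    return [n * period_ns - advance_ns + delay_ns for n in range(1, count + 1)]


def deltas(times):
    """Spacing between consecutive observed expirations."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Advancement enabled and accurate: advance_ns equals the 100 ns delay.
with_adv = observed_expirations(10_000, 100, 100, 3)
# Advancement disabled: every expiration is observed 100 ns late.
without_adv = observed_expirations(10_000, 100, 0, 3)

print(with_adv, deltas(with_adv))
print(without_adv, deltas(without_adv))
```

Both runs show the same 10,000 ns spacing between expirations; only the
constant 100 ns offset differs, which is why the advancement buys little for
periodic timers.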
> 2. for oneshot timer
> ====================
>
> In [1], Marcelo treated oneshot and tscdeadline differently. Shouldn’t the
> behavior of these two timers be similar?
Yes, but they aren't identical, and so supporting both would require additional
code, complexity, and testing.
> Unlike periodic timers, both oneshot and tscdeadline timers set a specific
> expiration time, and what the guest cares about is whether the expiration
> time is accurate. Moreover, this feature is mainly intended to mitigate the
> latency introduced by timer virtualization. Since software timers have
> higher latency compared to hardware virtual timers, the need for this feature
> is actually more urgent for software timers.
Yep.
> However, in the current implementation, the feature is applied to hv
> oneshot/period timers, but not to sw oneshot/period timers.
>
> ===============
> Summary of IMHO
> ===============
>
> The feature should be applied to the following timer types:
> sw/hv tscdeadline and sw/hv oneshot
In a perfect world, probably? But I don't know that it's worth changing at this
time. Much of this is balancing complexity with benefit, though it's also most
definitely a reflection of the initial implementation.

KVM unconditionally emulates TSC-deadline mode, and AFAIK every real-world kernel
prefers TSC-deadline over one-shot, and so in practice the benefits of applying
the advancement to one-shot hrtimers would be marginal. That was also the way
the world was headed back when Marcelo first implemented the support. I don't
know for sure why the initial implementation targeted only TSC-deadline mode,
but I think it's safe to assume that the use case Marcelo was targeting
exclusively used TSC-deadline.

I'm not entirely opposed to playing the advancement games with one-shot hrtimers,
but it's also not clear to me that it's worth doing. E.g. supporting one-shot
hrtimers would likely require a bit of extra complexity to juggle the different
time domains. And if the only use cases that are truly sensitive to timer
programming latency exclusively use TSC-deadline mode (because one-shot mode is
inherently "fuzzy"), then any amount of extra complexity is effectively dead weight.
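As a rough illustration of the different time domains: one-shot mode is
programmed as a count of APIC bus cycles scaled by the divide configuration,
while TSC-deadline mode is programmed as an absolute TSC value. The Python
sketch below is a toy model, not KVM code; all function names are hypothetical,
and the 1 ns APIC bus cycle and 2 GHz TSC are assumed values chosen purely for
illustration.

```python
APIC_BUS_CYCLE_NS = 1  # assumption: one APIC bus cycle lasts 1 ns


def oneshot_delay_ns(initial_count, divide_count):
    """One-shot: delay is a bus-cycle count scaled by the divider."""
    return initial_count * divide_count * APIC_BUS_CYCLE_NS


def tscdeadline_delay_ns(deadline_tsc, now_tsc, tsc_khz):
    """TSC-deadline: delay is a TSC delta converted to nanoseconds."""
    if deadline_tsc <= now_tsc:
        return 0  # deadline already passed, fire immediately
    return (deadline_tsc - now_tsc) * 1_000_000 // tsc_khz


def advance_ns_to_tsc(advance_ns, tsc_khz):
    """Express an advancement given in ns as TSC cycles."""
    return advance_ns * tsc_khz // 1_000_000


TSC_KHZ = 2_000_000  # assumption: 2 GHz guest TSC

print(oneshot_delay_ns(625, 16))
print(tscdeadline_delay_ns(120_000, 100_000, TSC_KHZ))
print(advance_ns_to_tsc(1_000, TSC_KHZ))
```

Under these assumptions both timers describe the same 10,000 ns delay, but a
1,000 ns advancement is 2,000 cycles in the TSC domain and would need a
separate conversion before it could be applied to a bus-cycle count.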
> should not be applied to:
> sw/hv period
I wouldn't say "should not be applied to", I think it's more "doesn't provide much
benefit to".