From: Sean Christopherson <seanjc@google.com>
To: Kai Huang <kai.huang@intel.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"binbin.wu@linux.intel.com" <binbin.wu@linux.intel.com>,
Chao Gao <chao.gao@intel.com>,
Rick P Edgecombe <rick.p.edgecombe@intel.com>,
Xiaoyao Li <xiaoyao.li@intel.com>,
Reinette Chatre <reinette.chatre@intel.com>,
Yan Y Zhao <yan.y.zhao@intel.com>,
Adrian Hunter <adrian.hunter@intel.com>,
"tony.lindgren@linux.intel.com" <tony.lindgren@linux.intel.com>,
Isaku Yamahata <isaku.yamahata@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 12/16] KVM: TDX: Inhibit APICv for TDX guest
Date: Thu, 16 Jan 2025 06:50:31 -0800
Message-ID: <Z4kcjygm19Qv1dNN@google.com>
In-Reply-To: <61e66ef579a86deb453bb25febd30f5aec7472fc.camel@intel.com>
On Thu, Jan 16, 2025, Kai Huang wrote:
> On Mon, 2025-01-13 at 10:09 +0800, Binbin Wu wrote:
> > Lazy check for pending APIC EOI when In-kernel IOAPIC
> > -----------------------------------------------------
> > In-kernel IOAPIC does not receive EOI with AMD SVM AVIC since the processor
> > accelerates write to APIC EOI register and does not trap if the interrupt
> > is edge-triggered. So there is a workaround by lazy check for pending APIC
> > EOI at the time when setting new IOAPIC irq, and update IOAPIC EOI if no
> > pending APIC EOI.
> > KVM is also not able to intercept EOI for TDX guests.
> > - When APICv is enabled
> > The code of lazy check for pending APIC EOI doesn't work for TDX because
> > KVM can't get the status of real IRR and ISR, and the values are 0s in
> > vIRR and vISR in apic->regs[], kvm_apic_pending_eoi() will always return
> > false. So the RTC pending EOI will always be cleared when ioapic_set_irq()
> > is called for RTC. Then userspace may miss the coalesced RTC interrupts.
> > - When APICv is disabled
> > ioapic_lazy_update_eoi() will not be called, so the pending EOI status for
> > RTC will not be cleared after setting and this will mislead userspace to
> > see coalesced RTC interrupts.
> > Options:
> > - Force irqchip split for TDX guests to eliminate the use of in-kernel IOAPIC.
> > - Leave it as it is, but the use of RTC may not be accurate.
>
> Looking at the code, it seems KVM only traps EOI for level-triggered interrupt
> for in-kernel IOAPIC chip, but IIUC IOAPIC in userspace also needs to be told
> upon EOI for level-triggered interrupt. I don't know how KVM works with
> userspace IOAPIC w/o trapping EOI for level-triggered interrupt, but "force
> irqchip split for TDX guest" seems not right.
Forcing a "split" IRQ chip is correct, in the sense that TDX doesn't support an
I/O APIC and the "split" model is the way to concoct such a setup. With a "full"
IRQ chip, KVM is responsible for emulating the I/O APIC, which is more or less
nonsensical on TDX because it's a fully virtual world, i.e. there's no reason to
emulate legacy devices that only know how to talk to the I/O APIC (or PIC, etc.).

Disallowing an in-kernel I/O APIC is ideal from KVM's perspective, because
level-triggered interrupts and thus the I/O APIC as a whole can't be faithfully
emulated (see below).
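
For reference, a minimal userspace sketch of how a VMM opts into the "split"
model, assuming vm_fd is an already-created KVM VM file descriptor. The helper
names and the pin count passed by the caller are illustrative, not taken from
any particular VMM; per the KVM API documentation, args[0] is the number of
routes reserved for userspace I/O APICs.

```c
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: build the KVM_ENABLE_CAP payload for a split irqchip. */
static void fill_split_irqchip_cap(struct kvm_enable_cap *cap,
				   unsigned int nr_ioapic_pins)
{
	memset(cap, 0, sizeof(*cap));
	cap->cap = KVM_CAP_SPLIT_IRQCHIP;
	/* args[0]: number of routes reserved for userspace I/O APIC pins. */
	cap->args[0] = nr_ioapic_pins;
}

/*
 * Hypothetical helper: local APIC stays in-kernel, I/O APIC and PIC are left
 * to userspace.  Must be called before any vCPU is created.  Returns 0 on
 * success, -1 with errno set on failure.
 */
static int enable_split_irqchip(int vm_fd, unsigned int nr_ioapic_pins)
{
	struct kvm_enable_cap cap;

	fill_split_irqchip_cap(&cap, nr_ioapic_pins);
	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

A VMM would call something like enable_split_irqchip(vm_fd, 24) instead of
issuing KVM_CREATE_IRQCHIP, which is what selects the "full" in-kernel model.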
> I think the problem is level-triggered interrupt,
Yes, because the TDX Module doesn't allow the hypervisor to modify the EOI-bitmap,
i.e. all EOIs are accelerated and never trigger exits.
> so I think another option is to reject level-triggered interrupt for TDX guest.
This is a "don't do that, it will hurt" situation. With a sane VMM, the level-ness
of GSIs is controlled by the guest. For GSIs that are routed through the I/O APIC,
the level-ness is determined by the corresponding Redirection Table entry. For
"GSIs" that are actually MSIs (KVM piggybacks legacy GSI routing to let userspace
wire up MSIs), and for direct MSI injection (KVM_SIGNAL_MSI), the level-ness is
dictated by the MSI itself, which again is guest controlled.
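
To make "level-ness" concrete, here is a rough sketch of where the bit lives.
In both an I/O APIC Redirection Table entry and the MSI data register, Trigger
Mode is bit 15 (0 = edge, 1 = level). The macro and helper names below are made
up for illustration and are not KVM's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Trigger Mode bit, same position in an I/O APIC RTE and in MSI data. */
#define TRIGGER_MODE_LEVEL	(1u << 15)

/* Illustrative helper: true if a Redirection Table entry is level-triggered. */
static inline bool ioapic_rte_is_level(uint64_t rte)
{
	return (rte & TRIGGER_MODE_LEVEL) != 0;
}

/* Illustrative helper: true if an MSI data value requests level-triggered. */
static inline bool msi_data_is_level(uint32_t msi_data)
{
	return (msi_data & TRIGGER_MODE_LEVEL) != 0;
}
```

Either value is written by the guest (or derived from guest-programmed state),
which is why the VMM can't simply decide that everything is edge-triggered.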
If the guest induces generation of a level-triggered interrupt, the VMM is left
with the choice of dropping the interrupt, sending it as-is, or converting it to
an edge-triggered interrupt. Ditto for KVM. All of those options will make the
guest unhappy.
So while it _might_ make debugging broken guests easier, I don't think it's worth
the complexity to try and prevent the VMM/guest from sending level-triggered
GSI-routed interrupts. It'd be a bit of a whack-a-mole and there's no architectural
behavior KVM can provide that's better than sending the interrupt and hoping for
the best.