public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Auger <eric.auger@redhat.com>
To: "Liu, Rong L" <rong.l.liu@intel.com>,
	Dmytro Maluka <dmy@semihalf.com>, "Christopherson,,
	Sean" <seanjc@google.com>
Cc: Micah Morton <mortonm@chromium.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Tomasz Nowicki <tn@semihalf.com>,
	Grzegorz Jaszczyk <jaz@semihalf.com>,
	Dmitry Torokhov <dtor@google.com>
Subject: Re: Add vfio-platform support for ONESHOT irq forwarding?
Date: Thu, 4 Aug 2022 16:44:47 +0200	[thread overview]
Message-ID: <00a09605-75d2-2a95-29dc-b5613a52a168@redhat.com> (raw)
In-Reply-To: <MW3PR11MB45544BF267FD3926F90A2158C7869@MW3PR11MB4554.namprd11.prod.outlook.com>

Hi,
On 7/12/22 18:02, Liu, Rong L wrote:
> Hi Sean and Dmytro,
>
>> -----Original Message-----
>> From: Dmytro Maluka <dmy@semihalf.com>
>> Sent: Thursday, July 7, 2022 10:39 AM
>> To: Christopherson,, Sean <seanjc@google.com>
>> Cc: Auger Eric <eric.auger@redhat.com>; Micah Morton
>> <mortonm@chromium.org>; Alex Williamson
>> <alex.williamson@redhat.com>; kvm@vger.kernel.org; Paolo Bonzini
>> <pbonzini@redhat.com>; Liu, Rong L <rong.l.liu@intel.com>; Tomasz
>> Nowicki <tn@semihalf.com>; Grzegorz Jaszczyk <jaz@semihalf.com>;
>> Dmitry Torokhov <dtor@google.com>
>> Subject: Re: Add vfio-platform support for ONESHOT irq forwarding?
>>
>> On 7/7/22 17:00, Sean Christopherson wrote:
>>> On Thu, Jul 07, 2022, Dmytro Maluka wrote:
>>>> Hi Sean,
>>>>
>>>> On 7/6/22 10:39 PM, Sean Christopherson wrote:
>>>>> On Wed, Jul 06, 2022, Dmytro Maluka wrote:
>>>>>> This is not a problem on native, since for oneshot irq we keep the
>>>>>> interrupt masked until the thread exits, so that the EOI at the end
>>>>>> of hardirq doesn't result in immediate re-assert. In vfio + KVM
>>>>>> case, however, the host doesn't check that the interrupt is still
>>>>>> masked in the guest, so
>>>>>> vfio_platform_unmask() is called regardless.
>>>>> Isn't not checking that an interrupt is unmasked the real bug?
>>>>> Fudging around vfio (or whatever is doing the premature unmasking)
>>>>> bugs by delaying an ack notification in KVM is a hack, no?
>>>> Yes, not checking that an interrupt is unmasked is IMO a bug, and my
>>>> patch actually adds this missing checking, only that it adds it in
>>>> KVM, not in VFIO. :)
>>>>
>>>> Arguably it's not a bug that VFIO is not checking the guest interrupt
>>>> state on its own, provided that the resample notification it receives
>>>> is always a notification that the interrupt has been actually acked.
>>>> That is the motivation behind postponing ack notification in KVM in
>>>> my patch: it is to ensure that KVM "ack notifications" are always
>>>> actual ack notifications (as the name suggests), not just "eoi
>> notifications".
>>> But EOI is an ACK.  It's software saying "this interrupt has been
>> consumed".
>>
>> Ok, I see we've had some mutual misunderstanding of the term "ack"
>> here.
>> EOI is an ACK in the interrupt controller sense, while I was talking about
>> an ACK in the device sense, i.e. a device-specific action, done in a device
>> driver's IRQ handler, which makes the device de-assert the IRQ line (so
>> that the IRQ will not get re-asserted after an EOI or unmask).
>>
>> So the problem I'm trying to fix stems from the peculiarity of "oneshot"
>> interrupts: an ACK in the device sense is done in a threaded handler, i.e.
>> after an ACK in the interrupt controller sense, not before it.
>>
>> In the meantime I've realized one more reason why my patch is wrong.
>> kvm_notify_acked_irq() is an internal KVM thing which is supposed to
>> notify interested parts of KVM about an ACK rather in the interrupt
>> controller sense, i.e. about an EOI as it is doing already.
>>
>> VFIO, on the other hand, rather expects a notification about an ACK in the
>> device sense. So it still seems a good idea to me to postpone sending
>> notifications to VFIO until an IRQ gets unmasked, but this postponing
>> should be done not for the entire kvm_notify_acked_irq() but only for
>> eventfd_signal() on resamplefd in irqfd_resampler_ack().
>>
>> Thanks for making me think about that.
>>
>>>> That said, your idea of checking the guest interrupt status in VFIO
>>>> (or whatever is listening on the resample eventfd) makes sense to me
>>>> too. The problem, though, is that it's KVM that knows the guest
>>>> interrupt status, so KVM would need to let VFIO/whatever know it
>>>> somehow. (I'm assuming we are focusing on the case of KVM kernel
>>>> irqchip, not userspace or split irqchip.) So do you have in mind
>>>> adding something like "maskfd" and "unmaskfd" to KVM IRQFD
>> interface,
>>>> in addition to resamplefd? If so, I'm actually in favor of such an
>>>> idea, as I think it would be also useful for other purposes, regardless
>> of oneshot interrupts.
>>> Unless I'm misreading things, KVM already provides a mask notifier,
>>> irqfd just needs to be wired up to use
>> kvm_(un)register_irq_mask_notifier().
>>
> Interesting...  I initially thought that kvm doesn't "trap" on ioapic's mmio
> write.  However, I just traced kvm/ioapic.c and it turns out
> ioapic_write_indirect() was called many times.   Does trapping on ioapic's mmio
> write cause vmexit on edge-triggered interrupt exit?  It seems the case because
> IOREGSEL and IOWIN of IOAPIC are memory mapped but not the redirection entry
> register for each IRQ (that is why the name indirect_write), in order to unmask
> redirection entry register on the exit of each interrupt (edge-triggered or
> level-triggered), kernel needs to write to IORESEL, which means vmexit if kvm
> traps on ioapic's mmio write.  However, for pass-thru device which uses
> edge-triggered interrupt (handled by vfio or something similar),  interrupt
> (pIRQ) is enabled by vfio and it seems unnecessary to cause a vmexit when guest
> updates virtual ioapic.  I think the situation is similar for level-triggered
> interrupt.  So 2 vmexits for each level-triggered interrupt completion, one for
> EOI on lapic and another for unmask of IOAPIC register.  Does this sound right? 
> I thought with vfio (or similar architecture), there is no vmexit necessary on
> edge-triggered interrupt completion and only one vmexit for level triggered
> interrupt completion, except the caveats of oneshot interrupt.  Maybe I
> misunderstand something?
Currently, no vmexit for edge-sensitive and 1 vmexit for level-sensitive
is what happens on ARM shared peripheral interrupts at least.
Note there is one setup that could remove the need for the vmexit on
vEOI: irq_bypass mode
(https://www.linux-kvm.org/images/a/a8/01x04-ARMdevice.pdf slide 12-14):
on GIC you have a mode that allows automatic completion of the physical
IRQ when the corresponding vIRQ is completed. This mode would not be
compatible with oneshort_irq.
At some point we worked on this enablement but given the lack of
vfio-platform customers this work was paused so we still have the
mask/unmask vfio dance.

Thanks

Eric

>
>> Thanks for the tip. I'll take a look into implementing this idea.
>>
>> It seems you agree that fixing this issue via a change in KVM (in irqfd, not
>> in ioapic) seems to be the way to go.
>>
>> An immediate problem I see with kvm_(un)register_irq_mask_notifier()
>> is that it is currently available for x86 only. Also, mask notifiers are called
>> under ioapic->lock (I'm not sure yet if that is good or bad news for us).
>>
>> Thanks,
>> Dmytro


  reply	other threads:[~2022-08-04 14:45 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25 15:46 Add vfio-platform support for ONESHOT irq forwarding? Micah Morton
2021-01-25 17:31 ` Auger Eric
2021-01-25 18:32   ` Micah Morton
2021-01-26  8:48     ` Auger Eric
2021-01-26 15:31       ` Micah Morton
2021-01-26 16:05         ` Auger Eric
2021-01-25 20:36 ` Alex Williamson
2021-01-26  8:53   ` Auger Eric
2021-01-26 15:15     ` Micah Morton
2021-01-26 16:19       ` Auger Eric
2021-01-27 15:58         ` Micah Morton
2021-01-29 19:57           ` Micah Morton
2021-01-30 16:38             ` Auger Eric
2022-07-06 15:25               ` Dmytro Maluka
2022-07-06 20:39                 ` Sean Christopherson
2022-07-07  7:38                   ` Dmytro Maluka
2022-07-07 15:00                     ` Sean Christopherson
2022-07-07 17:38                       ` Dmytro Maluka
2022-07-12 16:02                         ` Liu, Rong L
2022-08-04 14:44                           ` Eric Auger [this message]
2022-08-08 21:15                             ` Liu, Rong L
2022-07-07  8:25                 ` Eric Auger
2022-07-07  9:15                   ` Dmytro Maluka
2022-07-25 22:03                     ` Liu, Rong L
2022-08-04 14:39                       ` Eric Auger
2022-08-08 17:34                         ` Liu, Rong L

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=00a09605-75d2-2a95-29dc-b5613a52a168@redhat.com \
    --to=eric.auger@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=dmy@semihalf.com \
    --cc=dtor@google.com \
    --cc=jaz@semihalf.com \
    --cc=kvm@vger.kernel.org \
    --cc=mortonm@chromium.org \
    --cc=pbonzini@redhat.com \
    --cc=rong.l.liu@intel.com \
    --cc=seanjc@google.com \
    --cc=tn@semihalf.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox