From: Jan Kiszka <jan.kiszka@web.de>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>, kvm <kvm@vger.kernel.org>,
"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v3] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
Date: Tue, 28 Feb 2012 02:15:23 +0100 [thread overview]
Message-ID: <4F4C2AAB.1060001@web.de> (raw)
In-Reply-To: <1330384513.29701.52.camel@bling.home>
[-- Attachment #1: Type: text/plain, Size: 7548 bytes --]
On 2012-02-28 00:15, Alex Williamson wrote:
> On Mon, 2012-02-27 at 23:07 +0100, Jan Kiszka wrote:
>> On 2012-02-27 22:05, Alex Williamson wrote:
>>> On Fri, 2012-02-10 at 19:17 +0100, Jan Kiszka wrote:
>>>> PCI 2.3 allows to generically disable IRQ sources at device level. This
>>>> enables us to share legacy IRQs of such devices with other host devices
>>>> when passing them to a guest.
>>>>
>>>> The new IRQ sharing feature introduced here is optional, user space has
>>>> to request it explicitly. Moreover, user space can inform us about its
>>>> view of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the
>>>> interrupt and signaling it if the guest masked it via the virtualized
>>>> PCI config space.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>> ---
>>>>
>>>> Changes in v3:
>>>> - rebased over current kvm.git (no code conflict, just api.txt)
>>>>
>>>> Documentation/virtual/kvm/api.txt | 31 ++++++
>>>> arch/x86/kvm/x86.c | 1 +
>>>> include/linux/kvm.h | 6 +
>>>> include/linux/kvm_host.h | 2 +
>>>> virt/kvm/assigned-dev.c | 208 +++++++++++++++++++++++++++++++-----
>>>> 5 files changed, 219 insertions(+), 29 deletions(-)
>>>>
>>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>>>> index 59a3826..5ce0e29 100644
>>>> --- a/Documentation/virtual/kvm/api.txt
>>>> +++ b/Documentation/virtual/kvm/api.txt
>>>> @@ -1169,6 +1169,14 @@ following flags are specified:
>>>>
>>>> /* Depends on KVM_CAP_IOMMU */
>>>> #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
>>>> +/* The following two depend on KVM_CAP_PCI_2_3 */
>>>> +#define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
>>>> +#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2)
>>>> +
>>>> +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx interrupts
>>>> +via the PCI-2.3-compliant device-level mask, thus enable IRQ sharing with other
>>>> +assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
>>>> +guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.
>>>>
>>>> The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
>>>> isolation of the device. Usages not specifying this flag are deprecated.
>>>> @@ -1441,6 +1449,29 @@ The "num_dirty" field is a performance hint for KVM to determine whether it
>>>> should skip processing the bitmap and just invalidate everything. It must
>>>> be set to the number of set bits in the bitmap.
>>>>
>>>> +4.60 KVM_ASSIGN_SET_INTX_MASK
>>>> +
>>>> +Capability: KVM_CAP_PCI_2_3
>>>> +Architectures: x86
>>>> +Type: vm ioctl
>>>> +Parameters: struct kvm_assigned_pci_dev (in)
>>>> +Returns: 0 on success, -1 on error
>>>> +
>>>> +Informs the kernel about the guest's view on the INTx mask. As long as the
>>>> +guest masks the legacy INTx, the kernel will refrain from unmasking it at
>>>> +hardware level and will not assert the guest's IRQ line. User space is still
>>>> +responsible for applying this state to the assigned device's real config space
>>>> +by setting or clearing the Interrupt Disable bit 10 in the Command register.
>>>> +
>>>> +To avoid that the kernel overwrites the state user space wants to set,
>>>> +KVM_ASSIGN_SET_INTX_MASK has to be called prior to updating the config space.
>>>> +Moreover, user space has to write back its own view on the Interrupt Disable
>>>> +bit whenever modifying the Command word.
>>>
>>> This is very confusing. I think we need to work on the wording, but
>>> perhaps it's not worth hold up the patch. It seems the simplest,
>>
>> As I need another round anyway (see below), I'm open for better wording
>> suggestions.
>
> Now that I know what it does, I'll probably write something just as
> confusing, but here's a shot:
>
> Allows userspace to mask PCI INTx interrupts from the assigned
> device. The kernel will not deliver INTx interrupts to the
> guest between setting and clearing of KVM_ASSIGN_SET_INTX_MASK
> via this interface. This enables use of and emulation of PCI
> 2.3 INTx disable command register behavior.
>
> This may be used for both PCI 2.3 devices supporting INTx
> disable natively and older devices lacking this support.
> Userspace is responsible for emulating the read value of the
> INTx disable bit in the guest visible PCI command register.
> When modifying the INTx disable state, userspace should precede
> updating the physical device command register by calling this
> ioctl to inform the kernel of the new intended INTx mask state.
>
> Note that the kernel uses the device INTx disable bit to
> internally manage the device interrupt state for PCI 2.3
> devices. Reads of this register may therefore not match the
> expected value. Writes should always use the guest intended
> INTx disable value rather than attempting to read-copy-update
> the current physical device state. Races between user and
> kernel updates to the INTx disable bit are handled lazily in the
> kernel. It's possible the device may generate unintended
> interrupts, but they will not be injected into the guest.
Sounds good, will pick it up.
...
>>>> @@ -759,6 +851,56 @@ msix_entry_out:
>>>> }
>>>> #endif
>>>>
>>>> +static int kvm_vm_ioctl_set_pci_irq_mask(struct kvm *kvm,
>>>> + struct kvm_assigned_pci_dev *assigned_dev)
>>>> +{
>>>> + int r = 0;
>>>> + struct kvm_assigned_dev_kernel *match;
>>>> +
>>>> + mutex_lock(&kvm->lock);
>>>> +
>>>> + match = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
>>>> + assigned_dev->assigned_dev_id);
>>>> + if (!match) {
>>>> + r = -ENODEV;
>>>> + goto out;
>>>> + }
>>>> +
>>>> + mutex_lock(&match->intx_mask_lock);
>>>> +
>>>> + match->flags &= ~KVM_DEV_ASSIGN_MASK_INTX;
>>>> + match->flags |= assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX;
>>>> +
>>>> + if (match->irq_requested_type & KVM_DEV_IRQ_GUEST_INTX) {
>>>> + if (assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX) {
>>>> + kvm_set_irq(match->kvm, match->irq_source_id,
>>>> + match->guest_irq, 0);
>>>> + /*
>>>> + * Masking at hardware-level is performed on demand,
>>>> + * i.e. when an IRQ actually arrives at the host.
>>>> + */
>>>> + } else {
>>>> + /*
>>>> + * Unmask the IRQ line. It may have been masked
>>>> + * meanwhile if we aren't using PCI 2.3 INTx masking
>>>> + * on the host side.
>>>> + */
>>>> + spin_lock_irq(&match->intx_lock);
>>>> + if (match->host_irq_disabled) {
>>>> + enable_irq(match->host_irq);
>>>
>>> How do we not get an unbalanced enable here for PCI 2.3 devices?
>>
>> By performing both the disable and the host_irq_disabled update under
>> intx_lock? Or which scenario do you see?
>
> Do we ever do disable_irq() on a PCI 2.3 device? Seems like we only use
> the intx API for them, and disable/enable_irq exclusively on non-2.3, so
> if we can get here on a 2.3 device we have unbalanced enables. Am I
> missing some nuance of host_irq_disabled that prevents 2.3 from getting
> here? Thanks,
OK, now I got it. Yes, that's indeed a bug. I dug in an older version of
this patch, and there I had multiple state values to encode if the line
or the device was disabled. Will limit to pre-2.3 devices.
Thanks,
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
prev parent reply other threads:[~2012-02-28 1:24 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-10 18:17 [PATCH v3] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices Jan Kiszka
2012-02-27 15:44 ` Jan Kiszka
2012-02-27 21:05 ` Alex Williamson
2012-02-27 22:07 ` Jan Kiszka
2012-02-27 23:15 ` Alex Williamson
2012-02-28 1:15 ` Jan Kiszka [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F4C2AAB.1060001@web.de \
--to=jan.kiszka@web.de \
--cc=alex.williamson@redhat.com \
--cc=avi@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=mst@redhat.com \
--cc=mtosatti@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox