public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Jan Kiszka <jan.kiszka@web.de>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>, kvm <kvm@vger.kernel.org>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v3] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
Date: Tue, 28 Feb 2012 02:15:23 +0100	[thread overview]
Message-ID: <4F4C2AAB.1060001@web.de> (raw)
In-Reply-To: <1330384513.29701.52.camel@bling.home>

[-- Attachment #1: Type: text/plain, Size: 7548 bytes --]

On 2012-02-28 00:15, Alex Williamson wrote:
> On Mon, 2012-02-27 at 23:07 +0100, Jan Kiszka wrote:
>> On 2012-02-27 22:05, Alex Williamson wrote:
>>> On Fri, 2012-02-10 at 19:17 +0100, Jan Kiszka wrote:
>>>> PCI 2.3 allows to generically disable IRQ sources at device level. This
>>>> enables us to share legacy IRQs of such devices with other host devices
>>>> when passing them to a guest.
>>>>
>>>> The new IRQ sharing feature introduced here is optional, user space has
>>>> to request it explicitly. Moreover, user space can inform us about its
>>>> view of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the
>>>> interrupt and signaling it if the guest masked it via the virtualized
>>>> PCI config space.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>> ---
>>>>
>>>> Changes in v3:
>>>>  - rebased over current kvm.git (no code conflict, just api.txt)
>>>>
>>>>  Documentation/virtual/kvm/api.txt |   31 ++++++
>>>>  arch/x86/kvm/x86.c                |    1 +
>>>>  include/linux/kvm.h               |    6 +
>>>>  include/linux/kvm_host.h          |    2 +
>>>>  virt/kvm/assigned-dev.c           |  208 +++++++++++++++++++++++++++++++-----
>>>>  5 files changed, 219 insertions(+), 29 deletions(-)
>>>>
>>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>>>> index 59a3826..5ce0e29 100644
>>>> --- a/Documentation/virtual/kvm/api.txt
>>>> +++ b/Documentation/virtual/kvm/api.txt
>>>> @@ -1169,6 +1169,14 @@ following flags are specified:
>>>>  
>>>>  /* Depends on KVM_CAP_IOMMU */
>>>>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
>>>> +/* The following two depend on KVM_CAP_PCI_2_3 */
>>>> +#define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
>>>> +#define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
>>>> +
>>>> +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx interrupts
>>>> +via the PCI-2.3-compliant device-level mask, thus enable IRQ sharing with other
>>>> +assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
>>>> +guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.
>>>>  
>>>>  The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
>>>>  isolation of the device.  Usages not specifying this flag are deprecated.
>>>> @@ -1441,6 +1449,29 @@ The "num_dirty" field is a performance hint for KVM to determine whether it
>>>>  should skip processing the bitmap and just invalidate everything.  It must
>>>>  be set to the number of set bits in the bitmap.
>>>>  
>>>> +4.60 KVM_ASSIGN_SET_INTX_MASK
>>>> +
>>>> +Capability: KVM_CAP_PCI_2_3
>>>> +Architectures: x86
>>>> +Type: vm ioctl
>>>> +Parameters: struct kvm_assigned_pci_dev (in)
>>>> +Returns: 0 on success, -1 on error
>>>> +
>>>> +Informs the kernel about the guest's view on the INTx mask. As long as the
>>>> +guest masks the legacy INTx, the kernel will refrain from unmasking it at
>>>> +hardware level and will not assert the guest's IRQ line. User space is still
>>>> +responsible for applying this state to the assigned device's real config space
>>>> +by setting or clearing the Interrupt Disable bit 10 in the Command register.
>>>> +
>>>> +To avoid that the kernel overwrites the state user space wants to set,
>>>> +KVM_ASSIGN_SET_INTX_MASK has to be called prior to updating the config space.
>>>> +Moreover, user space has to write back its own view on the Interrupt Disable
>>>> +bit whenever modifying the Command word.
>>>
>>> This is very confusing.  I think we need to work on the wording, but
>>> perhaps it's not worth hold up the patch.  It seems the simplest,
>>
>> As I need another round anyway (see below), I'm open for better wording
>> suggestions.
> 
> Now that I know what it does, I'll probably write something just as
> confusing, but here's a shot:
> 
>         Allows userspace to mask PCI INTx interrupts from the assigned
>         device.  The kernel will not deliver INTx interrupts to the
>         guest between setting and clearing of KVM_ASSIGN_SET_INTX_MASK
>         via this interface.  This enables use of and emulation of PCI
>         2.3 INTx disable command register behavior.
>         
>         This may be used for both PCI 2.3 devices supporting INTx
>         disable natively and older devices lacking this support.
>         Userspace is responsible for emulating the read value of the
>         INTx disable bit in the guest visible PCI command register.
>         When modifying the INTx disable state, userspace should precede
>         updating the physical device command register by calling this
>         ioctl to inform the kernel of the new intended INTx mask state.
>         
>         Note that the kernel uses the device INTx disable bit to
>         internally manage the device interrupt state for PCI 2.3
>         devices.  Reads of this register may therefore not match the
>         expected value.  Writes should always use the guest intended
>         INTx disable value rather than attempting to read-copy-update
>         the current physical device state.  Races between user and
>         kernel updates to the INTx disable bit are handled lazily in the
>         kernel.  It's possible the device may generate unintended
>         interrupts, but they will not be injected into the guest.

Sounds good, will pick it up.

...
>>>> @@ -759,6 +851,56 @@ msix_entry_out:
>>>>  }
>>>>  #endif
>>>>  
>>>> +static int kvm_vm_ioctl_set_pci_irq_mask(struct kvm *kvm,
>>>> +		struct kvm_assigned_pci_dev *assigned_dev)
>>>> +{
>>>> +	int r = 0;
>>>> +	struct kvm_assigned_dev_kernel *match;
>>>> +
>>>> +	mutex_lock(&kvm->lock);
>>>> +
>>>> +	match = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
>>>> +				      assigned_dev->assigned_dev_id);
>>>> +	if (!match) {
>>>> +		r = -ENODEV;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	mutex_lock(&match->intx_mask_lock);
>>>> +
>>>> +	match->flags &= ~KVM_DEV_ASSIGN_MASK_INTX;
>>>> +	match->flags |= assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX;
>>>> +
>>>> +	if (match->irq_requested_type & KVM_DEV_IRQ_GUEST_INTX) {
>>>> +		if (assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX) {
>>>> +			kvm_set_irq(match->kvm, match->irq_source_id,
>>>> +				    match->guest_irq, 0);
>>>> +			/*
>>>> +			 * Masking at hardware-level is performed on demand,
>>>> +			 * i.e. when an IRQ actually arrives at the host.
>>>> +			 */
>>>> +		} else {
>>>> +			/*
>>>> +			 * Unmask the IRQ line. It may have been masked
>>>> +			 * meanwhile if we aren't using PCI 2.3 INTx masking
>>>> +			 * on the host side.
>>>> +			 */
>>>> +			spin_lock_irq(&match->intx_lock);
>>>> +			if (match->host_irq_disabled) {
>>>> +				enable_irq(match->host_irq);
>>>
>>> How do we not get an unbalanced enable here for PCI 2.3 devices?
>>
>> By performing both the disable and the host_irq_disabled update under
>> intx_lock? Or which scenario do you see?
> 
> Do we ever do disable_irq() on a PCI 2.3 device?  Seems like we only use
> the intx API for them, and disable/enable_irq exclusively on non-2.3, so
> if we can get here on a 2.3 device we have unbalanced enables.  Am I
> missing some nuance of host_irq_disabled that prevents 2.3 from getting
> here?  Thanks,

OK, now I got it. Yes, that's indeed a bug. I dug in an older version of
this patch, and there I had multiple state values to encode if the line
or the device was disabled. Will limit to pre-2.3 devices.

Thanks,
Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

      reply	other threads:[~2012-02-28  1:24 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-10 18:17 [PATCH v3] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices Jan Kiszka
2012-02-27 15:44 ` Jan Kiszka
2012-02-27 21:05 ` Alex Williamson
2012-02-27 22:07   ` Jan Kiszka
2012-02-27 23:15     ` Alex Williamson
2012-02-28  1:15       ` Jan Kiszka [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F4C2AAB.1060001@web.de \
    --to=jan.kiszka@web.de \
    --cc=alex.williamson@redhat.com \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=mtosatti@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox