public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Auger <eric.auger@linaro.org>
To: Alex Williamson <alex.williamson@redhat.com>,
	"Wu, Feng" <feng.wu@intel.com>
Cc: "tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"gleb@kernel.org" <gleb@kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"dwmw2@infradead.org" <dwmw2@infradead.org>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"jiang.liu@linux.intel.com" <jiang.liu@linux.intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [v2 18/25] KVM: kvm-vfio: implement the VFIO skeleton for VT-d Posted-Interrupts
Date: Mon, 08 Dec 2014 11:15:38 +0100	[thread overview]
Message-ID: <54857A4A.5030003@linaro.org> (raw)
In-Reply-To: <1418015529.1095.26.camel@bling.home>

On 12/08/2014 06:12 AM, Alex Williamson wrote:
> On Mon, 2014-12-08 at 04:58 +0000, Wu, Feng wrote:
>>
>>> -----Original Message-----
>>> From: Eric Auger [mailto:eric.auger@linaro.org]
>>> Sent: Thursday, December 04, 2014 11:36 PM
>>> To: Wu, Feng; tglx@linutronix.de; mingo@redhat.com; hpa@zytor.com;
>>> x86@kernel.org; gleb@kernel.org; pbonzini@redhat.com;
>>> dwmw2@infradead.org; joro@8bytes.org; alex.williamson@redhat.com;
>>> jiang.liu@linux.intel.com
>>> Cc: linux-kernel@vger.kernel.org; iommu@lists.linux-foundation.org;
>>> kvm@vger.kernel.org
>>> Subject: Re: [v2 18/25] KVM: kvm-vfio: implement the VFIO skeleton for VT-d
>>> Posted-Interrupts
>>>
>>> Hi Feng,
>>>
>>> On 12/03/2014 08:39 AM, Feng Wu wrote:
>>>> This patch adds the kvm-vfio interface for VT-d Posted-Interrrupts.
>>>> When guests updates MSI/MSI-x information for an assigned-device,
>>> update
>>>> QEMU will use KVM_DEV_VFIO_DEVICE_POSTING_IRQ attribute to setup
>>>> IRTE for VT-d PI. This patch implement this IRQ attribute.
>>> s/implement/implements
>>>>
>>>> Signed-off-by: Feng Wu <feng.wu@intel.com>
>>>> ---
>>>>  include/linux/kvm_host.h |   19 ++++++++
>>>>  virt/kvm/vfio.c          |  103
>>> ++++++++++++++++++++++++++++++++++++++++++++++
>>>>  2 files changed, 122 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>>>> index 5cd4420..8d06678 100644
>>>> --- a/include/linux/kvm_host.h
>>>> +++ b/include/linux/kvm_host.h
>>>> @@ -1134,6 +1134,25 @@ static inline int
>>> kvm_arch_vfio_set_forward(struct kvm_fwd_irq *fwd_irq,
>>>>  }
>>>>  #endif
>>>>
>>>> +#ifdef __KVM_HAVE_ARCH_KVM_VFIO_POSTING
>>>> +/*
>>>> + * kvm_arch_vfio_update_pi_irte - set IRTE for Posted-Interrupts
>>>> + *
>>>> + * @kvm: kvm
>>>> + * @host_irq: host irq of the interrupt
>>>> + * @guest_irq: gsi of the interrupt
>>>> + * returns 0 on success, < 0 on failure
>>>> + */
>>>> +int kvm_arch_vfio_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
>>>> +				 uint32_t guest_irq);
>>>> +#else
>>>> +static int kvm_arch_vfio_update_pi_irte(struct kvm *kvm, unsigned int
>>> host_irq,
>>>> +					uint32_t guest_irq)
>>>> +{
>>>> +	return 0;
>>>> +}
>>>> +#endif
>>>> +
>>>>  #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
>>>>
>>>>  static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool
>>> val)
>>>> diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
>>>> index 6bc7001..5e5515f 100644
>>>> --- a/virt/kvm/vfio.c
>>>> +++ b/virt/kvm/vfio.c
>>>> @@ -446,6 +446,99 @@ out:
>>>>  	return ret;
>>>>  }
>>>>
>>>> +static int kvm_vfio_pci_get_irq_count(struct pci_dev *pdev, int irq_type)
>>>> +{
>>>> +	if (irq_type == VFIO_PCI_INTX_IRQ_INDEX) {
>>>> +		u8 pin;
>>>> +
>>>> +		pci_read_config_byte(pdev, PCI_INTERRUPT_PIN, &pin);
>>>> +		if (pin)
>>>> +			return 1;
>>>> +	} else if (irq_type == VFIO_PCI_MSI_IRQ_INDEX)
>>>> +		return pci_msi_vec_count(pdev);
>>>> +	else if (irq_type == VFIO_PCI_MSIX_IRQ_INDEX)
>>>> +		return pci_msix_vec_count(pdev);
>>>> +
>>>> +	return 0;
>>>> +}
>>> for platform case I was asked to move the retrieval of absolute irq
>>> number to the architecture specific part. I don't know if it should
>>> apply to PCI stuff as well? This explains why I need to pass the VFIO
>>> device (or struct device handle) to the arch specific part. Actually we
>>> do the same job, we provide a phys/virt IRQ mapping to KVM, right? So to
>>> me our architecture specific API should look quite similar?
>>
>> In my patch, QEMU passes IRQ type(MSI/MSIx in my case), VFIO device index,
>> and sub-index via "struct kvm_vfio_dev_irq" to KVM, then KVM will find the
>> real host irq from the VFIO device index and the IRQ type. Is this something
>> similar with your patch?
>>
>>>
>>>> +
>>>> +static int kvm_vfio_set_pi(struct kvm_device *kdev, int32_t __user *argp)
>>>> +{
>>>> +	struct kvm_vfio_dev_irq pi_info;
>>>> +	uint32_t *gsi;
>>>> +	unsigned long minsz;
>>>> +	struct vfio_device *vdev;
>>>> +	struct msi_desc *entry;
>>>> +	struct device *dev;
>>>> +	struct pci_dev *pdev;
>>>> +	int i, max, ret;
>>>> +
>>>> +	minsz = offsetofend(struct kvm_vfio_dev_irq, count);
>>>> +
>>>> +	if (copy_from_user(&pi_info, (void __user *)argp, minsz))
>>>> +		return -EFAULT;
>>>> +
>>>> +	if (pi_info.argsz < minsz || pi_info.index >= VFIO_PCI_NUM_IRQS)
>>> PCI specific check, same remark as above but I will let Alex further
>>> comment on this and possibly invalidate this commeny ;-)
>>>> +		return -EINVAL;
>>>> +
>>>> +	vdev = kvm_vfio_get_vfio_device(pi_info.fd);
>>>> +	if (IS_ERR(vdev))
>>>> +		return PTR_ERR(vdev);
>>>> +
>>>> +	dev = kvm_vfio_external_base_device(vdev);
>>>> +	if (!dev || !dev_is_pci(dev)) {
>>>> +		ret = -EFAULT;
>>>> +		goto put_vfio_device;
>>>> +	}
>>>> +
>>>> +	pdev = to_pci_dev(dev);
>>>> +
>>>> +	max = kvm_vfio_pci_get_irq_count(pdev, pi_info.index);
>>>> +	if (max <= 0) {
>>>> +		ret = -EFAULT;
>>>> +		goto put_vfio_device;
>>>> +	}
>>>> +
>>>> +	if (pi_info.argsz - minsz < pi_info.count * sizeof(int) ||
>>> shouldn' we use the actual datatype?
>>
>> I am afraid I don't get this, could you please be more specific? Thanks a lot!
> 
> We could have a platform that supports 64bit INTs.
yes this is what I meant (struct datatype is __u32).

Thanks

Eric
> 
>>>> +	    pi_info.start >= max || pi_info.start + pi_info.count > max) {
>>>> +		ret = -EINVAL;
>>>> +		goto put_vfio_device;
>>>> +	}
>>>> +
>>>> +	gsi = memdup_user((void __user *)((unsigned long)argp + minsz),
>>>> +			   pi_info.count * sizeof(int));
>>> same question as above
>>>> +	if (IS_ERR(gsi)) {
>>>> +		ret = PTR_ERR(gsi);
>>>> +		goto put_vfio_device;
>>>> +	}
>>>> +
>>>> +#ifdef CONFIG_PCI_MSI
>>>> +	for (i = 0; i < pi_info.count; i++) {
>>>> +		list_for_each_entry(entry, &pdev->msi_list, list) {
>>>> +			if (entry->msi_attrib.entry_nr != pi_info.start+i)
>>>> +				continue;
>>>> +
>>>> +			ret = kvm_arch_vfio_update_pi_irte(kdev->kvm,
>>>> +							   entry->irq,
>>>> +							   gsi[i]);
>>>> +			if (ret) {
>>>> +				ret = -EFAULT;
>>> why -EFAULT? and not propagation of original error code?
>> Yes, you are right. Thanks for the comments!
>>
>>> you may have posting set for part of the subindexes and unset for rest.
>>> Isn't it an issue?
>>
>> QEMU will always set the posting for all the sub-indexes for MSI/MSIx,
>> once the guest updates the configuration of some sub-indexes, KVM will
>> update it accordingly. So in which case will what you mentioned above
>> happen?

Was pointing out you handle the case where kvm_arch_vfio_update_pi_irte
could fail and you still continue looping thru the other indexes. So
theoretically you could have a mixed of non posted IRQs and posted IRQs.

Best Regards

Eric
> 
> QEMU is just one userspace, not necessarily the only userspace.  The
> kernel shouldn't expect a specific userspace behavior.
> 
>>>> +				goto free_gsi;
>>>> +			}
>>>> +		}
>>>> +	}
>>>> +#endif
>>>> +
>>>> +	ret = 0;
>>>> +
>>>> +free_gsi:
>>>> +	kfree(gsi);
>>>> +
>>>> +put_vfio_device:
>>>> +	kvm_vfio_put_vfio_device(vdev);
>>>> +	return ret;
>>>> +}
>>>> +
>>>>  static int kvm_vfio_set_device(struct kvm_device *kdev, long attr, u64 arg)
>>>>  {
>>>>  	int32_t __user *argp = (int32_t __user *)(unsigned long)arg;
>>>> @@ -456,6 +549,11 @@ static int kvm_vfio_set_device(struct kvm_device
>>> *kdev, long attr, u64 arg)
>>>>  	case KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ:
>>>>  		ret = kvm_vfio_control_irq_forward(kdev, attr, argp);
>>>>  		break;
>>>> +#ifdef __KVM_HAVE_ARCH_KVM_VFIO_POSTING
>>>> +	case KVM_DEV_VFIO_DEVICE_POSTING_IRQ:
>>>> +		ret = kvm_vfio_set_pi(kdev, argp);
>>>> +		break;
>>>> +#endif
>>>>  	default:
>>>>  		ret = -ENXIO;
>>>>  	}
>>>> @@ -511,6 +609,11 @@ static int kvm_vfio_has_attr(struct kvm_device
>>> *dev,
>>>>  		case KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ:
>>>>  			return 0;
>>>>  #endif
>>>> +#ifdef __KVM_HAVE_ARCH_KVM_VFIO_POSTING
>>>> +		case KVM_DEV_VFIO_DEVICE_POSTING_IRQ:
>>>> +			return 0;
>>>> +#endif
>>>> +
>>>>  		}
>>>>  		break;
>>>>  	}
>>>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 


  reply	other threads:[~2014-12-08 10:17 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-03  7:39 [v2 00/25] Add VT-d Posted-Interrupts support Feng Wu
2014-12-03  7:39 ` [v2 01/25] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
2014-12-03  7:39 ` [v2 02/25] iommu: Add new member capability to struct irq_remap_ops Feng Wu
2014-12-03  7:39 ` [v2 03/25] iommu, x86: Define new irte structure for VT-d Posted-Interrupts Feng Wu
2014-12-03  7:39 ` [v2 04/25] iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip Feng Wu
2014-12-03  7:39 ` [v2 05/25] x86, irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller Feng Wu
2014-12-03  7:39 ` [v2 06/25] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts Feng Wu
2014-12-03  7:39 ` [v2 07/25] iommu, x86: Add cap_pi_support() to detect VT-d PI capability Feng Wu
2014-12-03  7:39 ` [v2 08/25] iommu, x86: Add intel_irq_remapping_capability() for Intel Feng Wu
2014-12-03  7:39 ` [v2 09/25] iommu, x86: define irq_remapping_cap() Feng Wu
2014-12-03  7:39 ` [v2 10/25] KVM: change struct pi_desc for VT-d Posted-Interrupts Feng Wu
2014-12-03  7:39 ` [v2 11/25] KVM: Add some helper functions for Posted-Interrupts Feng Wu
2014-12-03  7:39 ` [v2 12/25] KVM: Initialize VT-d Posted-Interrupts Descriptor Feng Wu
2014-12-03  7:39 ` [v2 13/25] KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI Feng Wu
2014-12-03  7:39 ` [v2 14/25] KVM: Get Posted-Interrupts descriptor address from struct kvm_vcpu Feng Wu
2014-12-03  7:39 ` [v2 15/25] KVM: Make struct kvm_irq_routing_table accessible Feng Wu
2014-12-03  7:39 ` [v2 16/25] KVM: make kvm_set_msi_irq() public Feng Wu
2014-12-03  7:39 ` [v2 17/25] KVM: kvm-vfio: User API for VT-d Posted-Interrupts Feng Wu
2014-12-04 14:04   ` Eric Auger
2014-12-08  4:58     ` Wu, Feng
2014-12-08  5:21       ` Alex Williamson
2014-12-09 11:38         ` Wu, Feng
2014-12-11  5:55         ` Wu, Feng
2014-12-11 15:45           ` Alex Williamson
2014-12-03  7:39 ` [v2 18/25] KVM: kvm-vfio: implement the VFIO skeleton " Feng Wu
2014-12-04 15:35   ` Eric Auger
2014-12-08  4:58     ` Wu, Feng
2014-12-08  5:12       ` Alex Williamson
2014-12-08 10:15         ` Eric Auger [this message]
2014-12-09 11:51           ` Wu, Feng
2014-12-03  7:39 ` [v2 19/25] KVM: x86: kvm-vfio: VT-d posted-interrupts setup Feng Wu
2014-12-03  7:39 ` [v2 20/25] x86, irq: Define a global vector for VT-d Posted-Interrupts Feng Wu
2014-12-03  7:39 ` [v2 21/25] KVM: Update Posted-Interrupts descriptor during vCPU scheduling Feng Wu
2014-12-03  7:39 ` [v2 22/25] KVM: Change NDST field after " Feng Wu
2014-12-03  7:39 ` [v2 23/25] KVM: Add the handler for Wake-up Vector Feng Wu
2014-12-03  7:39 ` [v2 24/25] KVM: Suppress posted-interrupt when 'SN' is set Feng Wu
2014-12-03  7:39 ` [v2 25/25] iommu/vt-d: Add a command line parameter for VT-d posted-interrupts Feng Wu
2014-12-08 13:36 ` [v2 00/25] Add VT-d Posted-Interrupts support Wu, Feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54857A4A.5030003@linaro.org \
    --to=eric.auger@linaro.org \
    --cc=alex.williamson@redhat.com \
    --cc=dwmw2@infradead.org \
    --cc=feng.wu@intel.com \
    --cc=gleb@kernel.org \
    --cc=hpa@zytor.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jiang.liu@linux.intel.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox