From: Kirti Wankhede <kwankhede@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Michal Privoznik <mprivozn@redhat.com>,
"Song, Jike" <jike.song@intel.com>,
"cjia@nvidia.com" <cjia@nvidia.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"libvir-list@redhat.com" <libvir-list@redhat.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"kraxel@redhat.com" <kraxel@redhat.com>,
Laine Stump <laine@redhat.com>,
"bjsdjshi@linux.vnet.ibm.com" <bjsdjshi@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH v7 0/4] Add Mediated device support
Date: Wed, 7 Sep 2016 23:36:28 +0530 [thread overview]
Message-ID: <4acb3eac-1e75-6b42-a70d-bb9a14fc7654@nvidia.com> (raw)
In-Reply-To: <20160907104456.0e8d0477@t450s.home>
On 9/7/2016 10:14 PM, Alex Williamson wrote:
> On Wed, 7 Sep 2016 21:45:31 +0530
> Kirti Wankhede <kwankhede@nvidia.com> wrote:
>
>> On 9/7/2016 2:58 AM, Alex Williamson wrote:
>>> On Wed, 7 Sep 2016 01:05:11 +0530
>>> Kirti Wankhede <kwankhede@nvidia.com> wrote:
>>>
>>>> On 9/6/2016 11:10 PM, Alex Williamson wrote:
>>>>> On Sat, 3 Sep 2016 22:04:56 +0530
>>>>> Kirti Wankhede <kwankhede@nvidia.com> wrote:
>>>>>
>>>>>> On 9/3/2016 3:18 AM, Paolo Bonzini wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 02/09/2016 20:33, Kirti Wankhede wrote:
>>>>>>>> <Alex> We could even do:
>>>>>>>>>>
>>>>>>>>>> echo $UUID1:$GROUPA > create
>>>>>>>>>>
>>>>>>>>>> where $GROUPA is the group ID of a previously created mdev device into
>>>>>>>>>> which $UUID1 is to be created and added to the same group.
>>>>>>>> </Alex>
>>>>>>>
>>>>>>> From the point of view of libvirt, I think I prefer Alex's idea.
>>>>>>> <group> could be an additional element in the nodedev-create XML:
>>>>>>>
>>>>>>> <device>
>>>>>>> <name>my-vgpu</name>
>>>>>>> <parent>pci_0000_86_00_0</parent>
>>>>>>> <capability type='mdev'>
>>>>>>> <type id='11'/>
>>>>>>> <uuid>0695d332-7831-493f-9e71-1c85c8911a08</uuid>
>>>>>>> <group>group1</group>
>>>>>>> </capability>
>>>>>>> </device>
>>>>>>>
>>>>>>> (should group also be a UUID?)
>>>>>>>
>>>>>>
>>>>>> No, this should be a unique number in a system, similar to iommu_group.
>>>>>
>>>>> Sorry, just trying to catch up on this thread after a long weekend.
>>>>>
>>>>> We're talking about iommu groups here, we're not creating any sort of
>>>>> parallel grouping specific to mdev devices.
>>>>
>>>> I thought we were talking about group of mdev devices and not iommu
>>>> group. IIRC, there were concerns about it (this would be similar to
>>>> UUID+instance) and that would (ab)use iommu groups.
>>>
>>> What constraints does a group, which is not an iommu group, place on the
>>> usage of the mdev devices? What happens if we put two mdev devices in
>>> the same "mdev group" and then assign them to separate VMs/users? I
>>> believe that the answer is that this theoretical "mdev group" doesn't
>>> actually impose any constraints on the devices within the group or how
>>> they're used.
>>>
>>
>> We feel its not a good idea to try to associate device's iommu groups
>> with mdev device groups. That adds more complications.
>>
>> As in above nodedev-create xml, 'group1' could be a unique number that
>> can be generated by libvirt. Then to create mdev device:
>>
>> echo $UUID1:group1 > create
>>
>> If user want to add more mdev devices to same group, he/she should use
>> same group number in next nodedev-create devices. So create commands
>> would be:
>> echo $UUID2:group1 > create
>> echo $UUID3:group1 > create
>
> So groups return to being static, libvirt would need to destroy and
> create mdev devices specifically for use within the predefined group?
Yes.
> This imposes limitations on how mdev devices can be used (ie. the mdev
> pool option is once again removed). We're also back to imposing
> grouping semantics on mdev devices that may not need them. Do all mdev
> devices for a given user need to be put into the same group?
Yes.
> Do groups
> span parent devices? Do they span different vendor drivers?
>
Yes and yes. Group number would be associated with mdev device
irrespective of its parent.
>> Each mdev device would store this group number in its mdev_device
>> structure.
>>
>> With this, we would add open() and close() callbacks from vfio_mdev
>> module for vendor driver to commit resources. Then we don't need
>> 'start'/'stop' or online/offline interface.
>>
>> To commit resources for all devices associated to that domain/user space
>> application, vendor driver can use 'first open()' and 'last close()' to
>> free those. Or if vendor driver want to commit resources for each device
>> separately, they can do in each device's open() call. It will depend on
>> vendor driver how they want to implement.
>>
>> Libvirt don't have to do anything about assigned group numbers while
>> managing mdev devices.
>>
>> QEMU commandline parameter would be same as earlier (don't have to
>> mention group number here):
>>
>> -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$UUID1 \
>> -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$UUID2
>>
>> In case if two mdev devices from same groups are assigned to different
>> domains, we can fail open() call of second device. How would driver know
>> that those are being used by different domain? By checking <group1, pid>
>> of first device of 'group1'. The two devices in same group should have
>> same pid in their open() call.
>
> Are you assuming that the two devices are owned by the same vendor
> driver?
No. See my reply to next questions below.
> What if I put NVIDIA and Intel vGPUs both into the same group
> and give each of them to a separate VM?
It depends on where we put the logic to verify pid in open() call of
each devices in group.
If we place the logic of checking <group, pid> for devices in a group in
vendor driver, then in above case both VMs would boot.
But If we impose this logic in mdev core or vfio_mdev module, then
open() on second device should fail.
> How would the NVIDIA host
> driver know which <group, pid> the Intel device got?
How to make use of group number to commit resources for devices owned by
a vendor would be vendor driver's responsibility. NVIDIA driver doesn't
need to know about Intel's vGPU nor Intel driver need to know about
NVIDIA's vGPU.
> This is what the
> iommu groups do that a different layer of grouping cannot do. Maybe
> you're suggesting a group per vendor driver, but how does libvirt know
> the vendor driver? Do they need to go research the parent device in
> sysfs and compare driver links?
>
No, group is not associated with vendor driver. Group number is
associated iwth mdev device.
Thanks,
Kirti
next prev parent reply other threads:[~2016-09-07 18:07 UTC|newest]
Thread overview: 95+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-25 3:53 [Qemu-devel] [PATCH v7 0/4] Add Mediated device support Kirti Wankhede
2016-08-25 3:53 ` [Qemu-devel] [PATCH v7 1/4] vfio: Mediated device Core driver Kirti Wankhede
2016-09-08 8:09 ` Jike Song
2016-09-08 9:38 ` Neo Jia
2016-09-09 6:26 ` Jike Song
2016-09-09 17:48 ` Kirti Wankhede
2016-09-09 18:42 ` Alex Williamson
2016-09-09 19:55 ` Kirti Wankhede
2016-09-12 5:10 ` Jike Song
2016-09-12 7:49 ` Kirti Wankhede
2016-09-12 15:53 ` Alex Williamson
2016-09-19 7:08 ` Jike Song
2016-09-19 17:29 ` Kirti Wankhede
2016-09-19 18:11 ` Alex Williamson
2016-09-19 20:09 ` Kirti Wankhede
2016-09-19 20:59 ` Alex Williamson
2016-09-20 12:48 ` Jike Song
2016-08-25 3:53 ` [Qemu-devel] [PATCH v7 2/4] vfio: VFIO driver for mediated devices Kirti Wankhede
2016-08-25 9:22 ` Dong Jia
2016-08-26 14:13 ` Kirti Wankhede
2016-09-08 2:38 ` Jike Song
2016-09-19 18:22 ` Kirti Wankhede
2016-09-19 18:36 ` Alex Williamson
2016-09-19 19:13 ` Kirti Wankhede
2016-09-19 20:03 ` Alex Williamson
2016-09-20 2:50 ` Jike Song
2016-09-20 16:24 ` Alex Williamson
2016-09-21 3:19 ` Jike Song
2016-09-21 4:51 ` Alex Williamson
2016-09-21 5:02 ` Jike Song
2016-09-08 2:45 ` Jike Song
2016-09-13 2:35 ` Jike Song
2016-09-20 5:48 ` Dong Jia Shi
2016-09-20 6:37 ` Jike Song
2016-09-20 12:53 ` Jike Song
2016-08-25 3:53 ` [Qemu-devel] [PATCH v7 3/4] vfio iommu: Add support " Kirti Wankhede
2016-08-25 7:29 ` Dong Jia
2016-08-26 13:50 ` Kirti Wankhede
2016-09-29 2:17 ` Jike Song
2016-09-29 15:06 ` Kirti Wankhede
2016-09-30 2:58 ` Jike Song
2016-09-30 3:10 ` Jike Song
2016-09-30 11:44 ` Kirti Wankhede
2016-10-08 7:09 ` Jike Song
2016-08-25 3:53 ` [Qemu-devel] [PATCH v7 4/4] docs: Add Documentation for Mediated devices Kirti Wankhede
2016-09-03 16:40 ` Kirti Wankhede
2016-08-30 16:16 ` [Qemu-devel] [PATCH v7 0/4] Add Mediated device support Alex Williamson
2016-08-31 6:12 ` Tian, Kevin
2016-08-31 7:04 ` Jike Song
2016-08-31 15:48 ` Alex Williamson
2016-09-01 4:09 ` Tian, Kevin
2016-09-01 4:10 ` Tian, Kevin
2016-09-01 18:22 ` Kirti Wankhede
2016-09-01 20:01 ` Alex Williamson
2016-09-02 6:17 ` Kirti Wankhede
2016-09-01 16:47 ` Michal Privoznik
2016-09-01 16:59 ` Alex Williamson
2016-09-02 4:48 ` Michal Privoznik
2016-09-02 5:21 ` Kirti Wankhede
2016-09-02 10:05 ` Paolo Bonzini
2016-09-02 17:15 ` Kirti Wankhede
2016-09-02 17:25 ` Paolo Bonzini
2016-09-02 18:33 ` Kirti Wankhede
2016-09-02 20:29 ` [Qemu-devel] [libvirt] " John Ferlan
2016-09-03 16:31 ` Kirti Wankhede
2016-09-06 17:54 ` Alex Williamson
2016-09-02 21:48 ` [Qemu-devel] " Paolo Bonzini
2016-09-03 11:56 ` [Qemu-devel] [libvirt] " John Ferlan
2016-09-03 13:07 ` Paolo Bonzini
2016-09-03 17:47 ` Kirti Wankhede
2016-09-03 16:34 ` [Qemu-devel] " Kirti Wankhede
2016-09-06 17:40 ` Alex Williamson
2016-09-06 19:35 ` Kirti Wankhede
2016-09-06 21:28 ` Alex Williamson
2016-09-07 8:22 ` Tian, Kevin
2016-09-07 16:00 ` Alex Williamson
2016-09-07 16:15 ` Kirti Wankhede
2016-09-07 16:44 ` Alex Williamson
2016-09-07 18:06 ` Kirti Wankhede [this message]
2016-09-07 22:13 ` Alex Williamson
2016-09-08 18:48 ` Kirti Wankhede
2016-09-08 20:51 ` Alex Williamson
2016-09-07 18:17 ` Neo Jia
2016-09-07 18:27 ` Daniel P. Berrange
2016-09-07 18:32 ` Neo Jia
2016-09-07 6:48 ` Tian, Kevin
2016-09-02 20:19 ` [Qemu-devel] [libvirt] " John Ferlan
2016-09-02 21:44 ` Paolo Bonzini
2016-09-02 23:57 ` Laine Stump
2016-09-03 16:49 ` Kirti Wankhede
2016-09-05 7:52 ` Paolo Bonzini
2016-09-03 11:57 ` John Ferlan
2016-09-05 7:54 ` Paolo Bonzini
2016-09-02 17:55 ` Laine Stump
2016-09-02 19:15 ` Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4acb3eac-1e75-6b42-a70d-bb9a14fc7654@nvidia.com \
--to=kwankhede@nvidia.com \
--cc=alex.williamson@redhat.com \
--cc=bjsdjshi@linux.vnet.ibm.com \
--cc=cjia@nvidia.com \
--cc=jike.song@intel.com \
--cc=kevin.tian@intel.com \
--cc=kraxel@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=laine@redhat.com \
--cc=libvir-list@redhat.com \
--cc=mprivozn@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).