From: Alex Williamson <alex.williamson@redhat.com>
To: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Jike Song <jike.song@intel.com>,
cjia@nvidia.com, kvm@vger.kernel.org, qemu-devel@nongnu.org,
kevin.tian@intel.com, kraxel@redhat.com, pbonzini@redhat.com,
bjsdjshi@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH v7 1/4] vfio: Mediated device Core driver
Date: Mon, 19 Sep 2016 14:59:53 -0600 [thread overview]
Message-ID: <20160919145953.16a718e2@t450s.home> (raw)
In-Reply-To: <cdfb9445-e0b3-c66f-eb3d-024415e8df7f@nvidia.com>
On Tue, 20 Sep 2016 01:39:19 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:
> On 9/19/2016 11:41 PM, Alex Williamson wrote:
> > On Mon, 19 Sep 2016 22:59:34 +0530
> > Kirti Wankhede <kwankhede@nvidia.com> wrote:
> >
> >> On 9/12/2016 9:23 PM, Alex Williamson wrote:
> >>> On Mon, 12 Sep 2016 13:19:11 +0530
> >>> Kirti Wankhede <kwankhede@nvidia.com> wrote:
> >>>
> >>>> On 9/12/2016 10:40 AM, Jike Song wrote:
> >>>>> On 09/10/2016 03:55 AM, Kirti Wankhede wrote:
> >>>>>> On 9/10/2016 12:12 AM, Alex Williamson wrote:
> >>>>>>> On Fri, 9 Sep 2016 23:18:45 +0530
> >>>>>>> Kirti Wankhede <kwankhede@nvidia.com> wrote:
> >>>>>>>
> >>>>>>>>>> +
> >>>>>>>>>> +static struct parent_device *mdev_get_parent_from_dev(struct device *dev)
> >>>>>>>>>> +{
> >>>>>>>>>> + struct parent_device *parent;
> >>>>>>>>>> +
> >>>>>>>>>> + mutex_lock(&parent_list_lock);
> >>>>>>>>>> + parent = mdev_get_parent(__find_parent_device(dev));
> >>>>>>>>>> + mutex_unlock(&parent_list_lock);
> >>>>>>>>>> +
> >>>>>>>>>> + return parent;
> >>>>>>>>>> +}
> >>>>>>>>>
> >>>>>>>>> As we have demonstrated, all these refs and locks and release workqueue are not necessary,
> >>>>>>>>> as long as you have an independent device associated with the mdev host device
> >>>>>>>>> ("parent" device here).
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I don't think every lock will go away with that. This also changes how
> >>>>>>>> mdev devices entries are created in sysfs. It adds an extra directory.
> >>>>>>>
> >>>>>>> Exposing the parent-child relationship through sysfs is a desirable
> >>>>>>> feature, so I'm not sure how this is a negative. This part of Jike's
> >>>>>>> conversion was a big improvement, I thought. Thanks,
> >>>>>>>
> >>>>>>
> >>>>>> Jike's suggestion is to introduced a fake device over parent device i.e.
> >>>>>> mdev-host, and then all mdev devices are children of 'mdev-host' not
> >>>>>> children of real parent.
> >>>>>>
> >>>>>
> >>>>> It really depends on how you define 'real parent' :)
> >>>>>
> >>>>> With a physical-host-mdev hierarchy, the parent of mdev devices is the host
> >>>>> device, the parent of host device is the physical device. e.g.
> >>>>>
> >>>>> pdev mdev_host mdev_device
> >>>>> dev<------------dev<------------dev
> >>>>> parent parent
> >>>>>
> >>>>> Figure 1: device hierarchy
> >>>>>
> >>>>
> >>>> Right, mdev-host device doesn't represent physical device nor any mdev
> >>>> device. Then what is the need of such device?
> >>>
> >>> Is there anything implicitly wrong with using a device node to host the
> >>> mdev child devices? Is the argument against it only that it's
> >>> unnecessary? Can we make use of the device-core parent/child
> >>> dependencies as Jike has done w/o that extra node?
> >>>
> >>
> >> I do feel that mdev core module would get simplified with the new sysfs
> >> interface without having extra node.
> >
> > Can you provide an example of why that is?
> >
>
> 'online' or earlier 'start'/'stop' interface is removed and would add
> open() and close() callbacks. ref count from struct mdev_device and
> mdev_get_device()/mdev_put_device() were added for this interface, these
> would go away.
> Using device-core parent/child dependencies between physical device and
> mdev device, other functions would get simplified.
Yes, we've refined the interface over time and I'm glad that you're
incorporating the device-core simplifications, but this doesn't really
argue for or against the extra device node.
> >>>>>> For example, directory structure we have now is:
> >>>>>> /sys/bus/pci/devices/0000\:85\:00.0/<mdev_device>
> >>>>>>
> >>>>>> mdev devices are in real parents directory.
> >>>>>>
> >>>>>> By introducing fake device it would be:
> >>>>>> /sys/bus/pci/devices/0000\:85\:00.0/mdev-host/<mdev_device>
> >>>>>>
> >>>>>> mdev devices are in fake device's directory.
> >>>>>>
> >>>>>
> >>>>> Yes, this is the wanted directory.
> >>>>>
> >>>>
> >>>> I don't think so.
> >>>
> >>> Why?
> >>>
> >>
> >> This directory is not mandatory. right?
> >
> > Clearly you've done an implementation without it, so it's not
> > functionally mandatory, but Jike has made significant code reduction
> > and simplification with it. Those are desirable things.
> >
> >>>>>> Lock would be still required, to handle the race conditions like
> >>>>>> 'mdev_create' is still in process and parent device is unregistered by
> >>>>>> vendor driver/ parent device is unbind from vendor driver.
> >>>>>>
> >>>>>
> >>>>> locks are provided to protect resources, would you elaborate more on
> >>>>> what is the exact resource you want to protect by a lock in mdev_create?
> >>>>>
> >>>>
> >>>> Simple, in your suggestion mdev-host device. Fake device will go away if
> >>>> vendor driver unregisters the device from mdev module, right.
> >>>
> >>> I don't follow the reply here, but aiui there's ordering implicit in
> >>> the device core that Jike is trying to take advantage of that
> >>> simplifies the mdev layer significantly. In the case of an
> >>> mdev_create, the device core needs to take a reference to the parent
> >>> object, the mdev-host I'd guess in Jike's version, the created mdev
> >>> device would also have a reference to that object, so the physical host
> >>> device could not be removed so long as there are outstanding
> >>> references. It's just a matter of managing references and acquiring
> >>> and releasing objects. Thanks,
> >>>
> >>
> >> I do think this could be simplified without having extra node.
> >
> > The simplification is really what I'm after, whether or not it includes
> > an extra device node is not something I'm sure I have an opinion on
> > yet. Aren't we really just talking about an extra sysfs directory
> > node?
>
> No, not just extra sysfs directory. I'm trying to keep parent/child
> relationship between physical device and mdev device direct without
> having extra device node.
But it's just a difference of whether the mdev devices are rooted at
the parent itself or a child of the parent. It's just a slight
organizational change, isn't it?
> > Doesn't it make it easier for userspace to efficiently identify
> > all the mdev children when they're segregated from the other attributes
> > and sub-nodes of a parent device?
> >
>
> Physical devices are generally leaf nodes. I think its easier to find
> all mdev children in its own directory.
So we're just going to drop a bunch of UUID links in the parent device
sysfs directory, userspace needs to follow the link to figure out what
they are? Doesn't that seem messy? SR-IOV VFs are prefixed with
"virtfn" in the parent directory, putting them under an mdev-host
device is just another way to implement that organization.
> >>> the created mdev
> >>> device would also have a reference to that object, so the physical host
> >>> device could not be removed so long as there are outstanding
> >>> references.
> >>
> >> Yes, this is also true when physical device is direct parent of mdev
> >> device. mdev device keeps reference of parent, so physical host device
> >> could not be removed as long as mdev devices are present. That is why
> >> from mdev_unregister_device() a chance is given to free all child mdev
> >> devices.
> >
> > But why aren't we using the device core do do that? It seems like
> > we're getting hung up on this device node, which is more of a stylistic
> > and sysfs layout issue when the primary comment is to simplify the mdev
> > infrastructure by taking more advantage of the parent/child
> > dependencies of the device core. Thanks,
> >
>
> Definitely we would use device core parent/child dependency and simplify
> mdev framework without having extra device node.
That's great, I wasn't sure if you were rejecting the whole
parent/child aspect or just the extra node. Still, I don't see the
mdev-host node as an outright bad idea, it seems like just a way to
classify a set of child devices. In the SR-IOV case the VFs are actual
PCI devices spawned from the parent device, so making them direct
children of the PF might make sense. Here we're going through this
mdev core interface to create mdev devices and an mdev-host device also
seems like a realistic representation of that. In any case, it seems
like a fairly minor tweak to add or remove it, we can argue one way or
the other on the next draft. Thanks,
Alex
next prev parent reply other threads:[~2016-09-19 20:59 UTC|newest]
Thread overview: 95+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-25 3:53 [PATCH v7 0/4] Add Mediated device support Kirti Wankhede
2016-08-25 3:53 ` [PATCH v7 1/4] vfio: Mediated device Core driver Kirti Wankhede
2016-09-08 8:09 ` Jike Song
2016-09-08 9:38 ` Neo Jia
2016-09-09 6:26 ` Jike Song
2016-09-09 17:48 ` Kirti Wankhede
2016-09-09 18:42 ` Alex Williamson
2016-09-09 19:55 ` Kirti Wankhede
2016-09-12 5:10 ` Jike Song
2016-09-12 7:49 ` Kirti Wankhede
2016-09-12 15:53 ` Alex Williamson
2016-09-19 7:08 ` Jike Song
2016-09-19 17:29 ` Kirti Wankhede
2016-09-19 18:11 ` Alex Williamson
2016-09-19 20:09 ` Kirti Wankhede
2016-09-19 20:59 ` Alex Williamson [this message]
2016-09-20 12:48 ` Jike Song
2016-08-25 3:53 ` [PATCH v7 2/4] vfio: VFIO driver for mediated devices Kirti Wankhede
2016-08-25 9:22 ` Dong Jia
2016-08-26 14:13 ` Kirti Wankhede
2016-09-08 2:38 ` Jike Song
2016-09-19 18:22 ` [Qemu-devel] " Kirti Wankhede
2016-09-19 18:36 ` Alex Williamson
2016-09-19 19:13 ` Kirti Wankhede
2016-09-19 20:03 ` Alex Williamson
2016-09-20 2:50 ` Jike Song
2016-09-20 16:24 ` Alex Williamson
2016-09-21 3:19 ` Jike Song
2016-09-21 4:51 ` Alex Williamson
2016-09-21 5:02 ` Jike Song
2016-09-08 2:45 ` Jike Song
2016-09-13 2:35 ` Jike Song
2016-09-20 5:48 ` Dong Jia Shi
[not found] ` <20160920054851.GA2186@bjsdjshi@linux.vnet.ibm.com>
2016-09-20 6:37 ` Jike Song
2016-09-20 12:53 ` Jike Song
2016-08-25 3:53 ` [PATCH v7 3/4] vfio iommu: Add support " Kirti Wankhede
2016-08-25 7:29 ` Dong Jia
2016-08-26 13:50 ` Kirti Wankhede
2016-09-29 2:17 ` Jike Song
2016-09-29 15:06 ` Kirti Wankhede
2016-09-30 2:58 ` Jike Song
2016-09-30 3:10 ` Jike Song
2016-09-30 11:44 ` Kirti Wankhede
2016-10-08 7:09 ` Jike Song
2016-08-25 3:53 ` [PATCH v7 4/4] docs: Add Documentation for Mediated devices Kirti Wankhede
2016-09-03 16:40 ` Kirti Wankhede
2016-08-30 16:16 ` [PATCH v7 0/4] Add Mediated device support Alex Williamson
2016-08-31 6:12 ` Tian, Kevin
2016-08-31 7:04 ` Jike Song
2016-08-31 15:48 ` Alex Williamson
2016-09-01 4:09 ` Tian, Kevin
2016-09-01 4:10 ` Tian, Kevin
2016-09-01 18:22 ` Kirti Wankhede
2016-09-01 20:01 ` Alex Williamson
2016-09-02 6:17 ` Kirti Wankhede
2016-09-01 16:47 ` [Qemu-devel] " Michal Privoznik
2016-09-01 16:59 ` Alex Williamson
2016-09-02 4:48 ` [Qemu-devel] " Michal Privoznik
2016-09-02 5:21 ` Kirti Wankhede
2016-09-02 10:05 ` Paolo Bonzini
2016-09-02 17:15 ` Kirti Wankhede
2016-09-02 17:25 ` Paolo Bonzini
2016-09-02 18:33 ` Kirti Wankhede
2016-09-02 20:29 ` [libvirt] " John Ferlan
2016-09-03 16:31 ` [libvirt] " Kirti Wankhede
2016-09-06 17:54 ` [libvirt] [Qemu-devel] " Alex Williamson
2016-09-02 21:48 ` Paolo Bonzini
2016-09-03 11:56 ` [libvirt] " John Ferlan
2016-09-03 13:07 ` Paolo Bonzini
2016-09-03 17:47 ` [libvirt] " Kirti Wankhede
2016-09-03 16:34 ` [Qemu-devel] " Kirti Wankhede
2016-09-06 17:40 ` Alex Williamson
2016-09-06 19:35 ` Kirti Wankhede
2016-09-06 21:28 ` Alex Williamson
2016-09-07 8:22 ` Tian, Kevin
2016-09-07 16:00 ` Alex Williamson
2016-09-07 16:15 ` Kirti Wankhede
2016-09-07 16:44 ` Alex Williamson
2016-09-07 18:06 ` Kirti Wankhede
2016-09-07 22:13 ` Alex Williamson
2016-09-08 18:48 ` Kirti Wankhede
2016-09-08 20:51 ` Alex Williamson
2016-09-07 18:17 ` Neo Jia
2016-09-07 18:27 ` Daniel P. Berrange
2016-09-07 18:32 ` Neo Jia
2016-09-07 6:48 ` Tian, Kevin
2016-09-02 20:19 ` [libvirt] " John Ferlan
2016-09-02 21:44 ` Paolo Bonzini
2016-09-02 23:57 ` Laine Stump
2016-09-03 16:49 ` Kirti Wankhede
2016-09-05 7:52 ` Paolo Bonzini
2016-09-03 11:57 ` John Ferlan
2016-09-05 7:54 ` Paolo Bonzini
2016-09-02 17:55 ` Laine Stump
2016-09-02 19:15 ` [libvirt] " Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160919145953.16a718e2@t450s.home \
--to=alex.williamson@redhat.com \
--cc=bjsdjshi@linux.vnet.ibm.com \
--cc=cjia@nvidia.com \
--cc=jike.song@intel.com \
--cc=kevin.tian@intel.com \
--cc=kraxel@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=kwankhede@nvidia.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).