From: Yi Liu <yi.l.liu@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "jgg@nvidia.com" <jgg@nvidia.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
"joro@8bytes.org" <joro@8bytes.org>,
"robin.murphy@arm.com" <robin.murphy@arm.com>,
"cohuck@redhat.com" <cohuck@redhat.com>,
"eric.auger@redhat.com" <eric.auger@redhat.com>,
"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
"yi.y.sun@linux.intel.com" <yi.y.sun@linux.intel.com>,
"peterx@redhat.com" <peterx@redhat.com>,
"jasowang@redhat.com" <jasowang@redhat.com>,
"shameerali.kolothum.thodi@huawei.com"
<shameerali.kolothum.thodi@huawei.com>,
"lulu@redhat.com" <lulu@redhat.com>,
"suravee.suthikulpanit@amd.com" <suravee.suthikulpanit@amd.com>,
"intel-gvt-dev@lists.freedesktop.org"
<intel-gvt-dev@lists.freedesktop.org>,
"intel-gfx@lists.freedesktop.org"
<intel-gfx@lists.freedesktop.org>,
"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
"Hao, Xudong" <xudong.hao@intel.com>,
"Zhao, Yan Y" <yan.y.zhao@intel.com>,
"Xu, Terrence" <terrence.xu@intel.com>,
"Jiang, Yanting" <yanting.jiang@intel.com>
Subject: Re: [PATCH v3 12/12] vfio/pci: Report dev_id in VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
Date: Sun, 9 Apr 2023 19:58:47 +0800 [thread overview]
Message-ID: <81a3e148-89de-e399-fefa-0785dac75f85@intel.com> (raw)
In-Reply-To: <20230408082018.04dcd1e3.alex.williamson@redhat.com>
On 2023/4/8 22:20, Alex Williamson wrote:
> On Sat, 8 Apr 2023 05:07:16 +0000
> "Liu, Yi L" <yi.l.liu@intel.com> wrote:
>
>>> From: Alex Williamson <alex.williamson@redhat.com>
>>> Sent: Saturday, April 8, 2023 5:07 AM
>>>
>>> On Fri, 7 Apr 2023 15:47:10 +0000
>>> "Liu, Yi L" <yi.l.liu@intel.com> wrote:
>>>
>>>>> From: Alex Williamson <alex.williamson@redhat.com>
>>>>> Sent: Friday, April 7, 2023 11:14 PM
>>>>>
>>>>> On Fri, 7 Apr 2023 14:04:02 +0000
>>>>> "Liu, Yi L" <yi.l.liu@intel.com> wrote:
>>>>>
>>>>>>> From: Alex Williamson <alex.williamson@redhat.com>
>>>>>>> Sent: Friday, April 7, 2023 9:52 PM
>>>>>>>
>>>>>>> On Fri, 7 Apr 2023 13:24:25 +0000
>>>>>>> "Liu, Yi L" <yi.l.liu@intel.com> wrote:
>>>>>>>
>>>>>>>>> From: Alex Williamson <alex.williamson@redhat.com>
>>>>>>>>> Sent: Friday, April 7, 2023 8:04 PM
>>>>>>>>>
>>>>>>>>>>>>> @@ -791,7 +813,21 @@ static int vfio_pci_fill_devs(struct pci_dev *pdev, void *data)
>>>>>>>>>>>>> 	if (!iommu_group)
>>>>>>>>>>>>> 		return -EPERM; /* Cannot reset non-isolated devices */
>>>>>>>>
>>>>>>>> [1]
>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Alex,
>>>>>>>>>>>>
>>>>>>>>>>>> Is disabling iommu a sane way to test vfio noiommu mode?
>>>>>>>>>>>
>>>>>>>>>>> Yes
>>>>>>>>>>>
>>>>>>>>>>>> I added intel_iommu=off to disable the Intel IOMMU and bound a device to
>>>>>>>>>>>> vfio-pci. I can see /dev/vfio/noiommu-0 and /dev/vfio/devices/noiommu-vfio0.
>>>>>>>>>>>> Binding with iommufd == -1 succeeds, but getting the hot reset info fails
>>>>>>>>>>>> due to the above group check. The reason is that the reset set happens to
>>>>>>>>>>>> include some affected devices, and these devices have no valid iommu_group
>>>>>>>>>>>> (because they are not bound to vfio-pci, so nobody allocates a noiommu
>>>>>>>>>>>> group for them). So when the hot reset info loops over such devices, it
>>>>>>>>>>>> fails with -EPERM. Is this expected?
>>>>>>>>>>>
>>>>>>>>>>> Hmm, I didn't recall that we put in such a limitation, but given the
>>>>>>>>>>> minimally intrusive approach to no-iommu and the fact that we never
>>>>>>>>>>> defined an invalid group ID to return to the user, it makes sense that
>>>>>>>>>>> we just blocked the ioctl for no-iommu use. I guess we can do the same
>>>>>>>>>>> for no-iommu cdev.
>>>>>>>>>>
>>>>>>>>>> I just realized a further issue related to this limitation. Remember that
>>>>>>>>>> we may eventually compile out the vfio group infrastructure. Say I want to
>>>>>>>>>> test noiommu; I may boot such a kernel with the IOMMU disabled. I think the
>>>>>>>>>> _INFO ioctl would fail as there is no iommu_group. Does it mean we will not
>>>>>>>>>> support hot reset for noiommu in the future if the vfio group
>>>>>>>>>> infrastructure is compiled out?
>>>>>>>>>
>>>>>>>>> We're talking about IOMMU groups, IOMMU groups are always present
>>>>>>>>> regardless of whether we expose a vfio group interface to userspace.
>>>>>>>>> Remember, we create IOMMU groups even in the no-iommu case. Even with
>>>>>>>>> pure cdev, there are underlying IOMMU groups that maintain the DMA
>>>>>>>>> ownership.
>>>>>>>>
>>>>>>>> Hmm. As noted at [1], when the IOMMU is disabled there will be no
>>>>>>>> iommu_group for a given device unless it is registered to VFIO, in which
>>>>>>>> case a fake group is created. That's why I hit the limitation [1]. When
>>>>>>>> the vfio group code is compiled out, even the fake group goes away.
>>>>>>>
>>>>>>> In the vfio group case, [1] can be hit with no-iommu only when there
>>>>>>> are affected devices which are not bound to vfio.
>>>>>>
>>>>>> Yes, because vfio would allocate a fake group when a device is registered
>>>>>> to it.
>>>>>>
>>>>>>> Why are we not
>>>>>>> allocating an IOMMU group to no-iommu devices when vfio group is
>>>>>>> disabled? Thanks,
>>>>>>
>>>>>> Hmm, when the vfio group code is configured out, vfio_device_set_group()
>>>>>> just returns 0 once the patch below is applied with CONFIG_VFIO_GROUP=n.
>>>>>> So when there is no vfio group, the fake group also goes away.
>>>>>>
>>>>>> https://lore.kernel.org/kvm/20230401151833.124749-25-yi.l.liu@intel.com/
>>>>>
>>>>> Is this a fundamental issue or just a problem with the current
>>>>> implementation proposal? It seems like the latter. FWIW, I also don't
>>>>> see a taint happening in the cdev path for no-iommu use. Thanks,
>>>>
>>>> Yes, the latter case. The reason I raised it here is to confirm the
>>>> policy on the new group/bdf capability in DEVICE_GET_INFO. If there is
>>>> no iommu group, perhaps I only need to exclude the new group/bdf
>>>> capability from the cap chain of DEVICE_GET_INFO. Is that right?
>>>
>>> I think we need to revisit the question of why allocating an IOMMU
>>> group for a no-iommu device is exclusive to the vfio group support.
>>
>> For a no-iommu device, the iommu group is a fake group allocated by vfio,
>> isn't it? And the fake group allocation is part of the vfio group code:
>> it is vfio_device_set_group() in group.c. If the vfio group code is not
>> compiled in, vfio does not allocate fake groups. Details of this
>> configuration can be found in link [1].
>>
>>> We've already been down the path of trying to report a field that only
>>> exists for devices with certain properties with dev-id. It doesn't
>>> work well. I think we've said all along that while the cdev interface
>>> is device based, there are still going to be underlying IOMMU groups
>>> for the user to be aware of, they're just not as much a fundamental
>>> part of the interface. There should not be a case where a device
>>> doesn't have a group to report. Thanks,
>>
>> The patch in link [1] makes the vfio group optional, so if I compile a
>> kernel with CONFIG_VFIO_GROUP=n and boot it with the IOMMU disabled, there
>> is no group to report. Perhaps this is not a typical usage, but it is
>> still a sane usage of noiommu mode, as I confirmed with you in this
>> thread. So when it happens, we need to consider what to report in the
>> group field.
>>
>> Perhaps I muddled the discussion by referring to a patch that is part of
>> another series. But I think it should be considered when talking about
>> the group to be reported.
>
> The question is whether the current split, in which the group.c code
> handles both the vfio group AND the creation of the IOMMU group in such
> cases, is the correct one. I'm not disputing that, the way the code is
> currently laid out, the fake IOMMU group for no-iommu devices is created
> in vfio group specific code, but we have a common interface that makes
> use of IOMMU group information for which we don't have an equivalent
> alternative data field to report.
Yes, it is needed to ensure _HOT_RESET_INFO works for noiommu devices.
> We've shown that dev-id doesn't work here because dev-ids only exist
> for devices within the user's IOMMU context. Also reporting an invalid
> ID of any sort fails to indicate the potential implied ownership.
> Therefore I recognize that if this interface is to report an IOMMU
> group, then the creation of fake IOMMU groups existing only in vfio
> group code would need to be refactored. Thanks,
Yeah, we need to move the iommu group creation back to vfio_main.c. This
would be a prerequisite for [1].

[1] https://lore.kernel.org/kvm/20230401151833.124749-25-yi.l.liu@intel.com/

I'll also try out your suggestion to add a capability like the one below
and link it in the vfio_device_info cap chain.
#define VFIO_DEVICE_INFO_CAP_PCI_BDF	5

struct vfio_device_info_cap_pci_bdf {
	struct vfio_info_cap_header header;
	__u32 group_id;
	__u16 segment;
	__u8  bus;
	__u8  devfn;	/* Use PCI_SLOT/PCI_FUNC */
};
--
Regards,
Yi Liu