From: Nathan Chen <nathanc@nvidia.com>
To: "Cédric Le Goater" <clg@redhat.com>,
qemu-arm@nongnu.org, qemu-devel@nongnu.org
Cc: "Yi Liu" <yi.l.liu@intel.com>,
"Eric Auger" <eric.auger@redhat.com>,
"Zhenzhong Duan" <zhenzhong.duan@intel.com>,
"Peter Maydell" <peter.maydell@linaro.org>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Yanan Wang" <wangyanan55@huawei.com>,
"Zhao Liu" <zhao1.liu@intel.com>,
"Alex Williamson" <alex@shazbot.org>,
"Shameer Kolothum" <skolothumtho@nvidia.com>,
"Matt Ochs" <mochs@nvidia.com>,
"Nicolin Chen" <nicolinc@nvidia.com>
Subject: Re: [PATCH 04/11] vfio/pci: Add ats property and mask ATS cap when not exposed
Date: Mon, 13 Apr 2026 12:11:21 -0700 [thread overview]
Message-ID: <98b61e85-2c6c-4de2-ba2b-e120229d230c@nvidia.com> (raw)
In-Reply-To: <9481a1c3-0dec-482f-97ac-ad513af99f9b@redhat.com>
On 4/11/2026 5:45 AM, Cédric Le Goater wrote:
> On 4/11/26 03:52, Nathan Chen wrote:
>>
>> Hi Cédric,
>> On 4/10/2026 2:09 PM, Cédric Le Goater wrote:
>>> Hello Nathan,
>>>
>>> On 4/1/26 03:02, Nathan Chen wrote:
>>>> From: Nathan Chen <nathanc@nvidia.com>
>>>>
>>>> Add an "ats" OnOffAuto property to vfio-pci. When the device has an ATS
>>>> extended capability in config space but we should not expose it
>>>> (ats=off,
>>>> or ats=auto and kernel reports IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED),
>>>> mask
>>>> the capability so the guest does not see it.
>>>>
>>>> This aligns with the kernel's per-device effective ATS reporting and
>>>> allows omitting ATS capability when the vIOMMU has ats=off.
>>>>
>>>> Suggested-by: Shameer Kolothum <skolothumtho@nvidia.com>
>>>> Signed-off-by: Nathan Chen <nathanc@nvidia.com>
>>>
>>> This looks like a standalone VFIO change that could be sent
>>> independently.
>>> Could you please rebase on vfio-next [*] ? There are a few conflicting
>>> changes in this branch.
>>>
>>> [*] https://github.com/legoater/qemu/commits/vfio-next
>>>
>>> You will need to add a compat machine property too. This is a guest-
>>> visible
>>> ABI change. A separate patch would be preferred.
>>>
>> Yes, I will separate this out and rebase on vfio-next, with the compat
>> machine property added.
>>
>>>> ---
>>>> backends/iommufd.c | 15 +++++++
>>>> hw/vfio/pci.c | 63 ++++++++++++++++++++++++++
>>>> ++++
>>>> hw/vfio/pci.h | 1 +
>>>> include/system/host_iommu_device.h | 10 +++++
>>>> 4 files changed, 89 insertions(+)
>>>>
>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>> index e1fee16acf..52cb060454 100644
>>>> --- a/backends/iommufd.c
>>>> +++ b/backends/iommufd.c
>>>> @@ -22,6 +22,13 @@
>>>> #include "hw/vfio/vfio-device.h"
>>>> #include <sys/ioctl.h>
>>>> #include <linux/iommufd.h>
>>>> +/*
>>>> + * Until kernel UAPI is synced via scripts;
>>>> + * matches include/uapi/linux/iommufd.h
>>>> + */
>>>> +#ifndef IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED
>>>> +#define IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED (1 << 3)
>>>> +#endif
>>>
>>> This definition is introduced by :
>>>
>>> https://lore.kernel.org/all/20260303150348.233997-1-
>>> skolothumtho@nvidia.com/
>>>
>>> It should reach Linux v7.1 since it is in linux-next. Anyhow, until
>>> then,
>>> please isolate this change in a temporary patch updating linux-headers/
>>>
>> Ok, I will separate it out.
>>>> static const char *iommufd_fd_name(IOMMUFDBackend *be)
>>>> {
>>>> @@ -573,6 +580,13 @@ static int hiod_iommufd_get_cap(HostIOMMUDevice
>>>> *hiod, int cap, Error **errp)
>>>> }
>>>> }
>>>> +static bool hiod_iommufd_support_ats(HostIOMMUDevice *hiod)
>>>> +{
>>>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>>>> +
>>>> + return !(caps->hw_caps & IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED);
>>>> +}
>>>> +
>>>> static bool hiod_iommufd_get_pasid_info(HostIOMMUDevice *hiod,
>>>> PasidInfo *pasid_info)
>>>> {
>>>> @@ -595,6 +609,7 @@ static void hiod_iommufd_class_init(ObjectClass
>>>> *oc, const void *data)
>>>> hioc->get_cap = hiod_iommufd_get_cap;
>>>> hioc->get_pasid_info = hiod_iommufd_get_pasid_info;
>>>> + hioc->support_ats = hiod_iommufd_support_ats;
>>>> };
>>>> static const TypeInfo types[] = {
>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>> index 1945751ffd..2d408e1d9a 100644
>>>> --- a/hw/vfio/pci.c
>>>> +++ b/hw/vfio/pci.c
>>>> @@ -49,6 +49,10 @@
>>>> #include "system/iommufd.h"
>>>> #include "vfio-migration-internal.h"
>>>> #include "vfio-helpers.h"
>>>> +#ifdef CONFIG_IOMMUFD
>>>> +#include "system/host_iommu_device.h"
>>>> +#include "linux/iommufd.h"
>>>> +#endif
>>> These includes are not necessary.
>> Got it, I will remove these. I see that system/host_iommu_device.h is
>> included in system/iommufd.h already. And linux/iommufd.h was leftover
>> from a previous revision.
>>
>>>
>>>> /* Protected by BQL */
>>>> static KVMRouteChange vfio_route_change;
>>>> @@ -2550,10 +2554,53 @@ static bool
>>>> vfio_pci_synthesize_pasid_cap(VFIOPCIDevice *vdev, Error **errp)
>>>> return true;
>>>> }
>>>> +/*
>>>> + * Determine whether ATS capability should be advertised for @vdev,
>>>> based on
>>>> + * whether it was enabled on the command line and whether it is
>>>> supported
>>>> + * according to the kernel's IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED bit.
>>>> + *
>>>> + * Store whether ATS capability should be advertised in @ats_need.
>>>> + *
>>>> + * Return false if kernel enables IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED
>>>> + * and ATS is effectively unsupported.
>>>> + */
>>>> +static bool vfio_pci_ats_requested_and_supported(VFIOPCIDevice *vdev,
>>>> + bool *ats_need,
>>>> Error **errp)
>>>
>>> May be rename 'ats_need' to 'ats' or 'ats_needed'
>>>
>> Ok, I will rename this to ats_needed.
>>>> +{
>>>> + HostIOMMUDevice *hiod = vdev->vbasedev.hiod;
>>>> + HostIOMMUDeviceClass *hiodc;
>>>> + bool ats_supported;
>>>> +
>>>> + if (vdev->ats == ON_OFF_AUTO_OFF) {
>>>> + *ats_need = false;
>>>> + return true;
>>>> + }
>>>> +
>>>> + *ats_need = true;
>>>> + if (!hiod) {
>>>> + return true;
>>>> + }
>>>> + hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
>>>> + if (!hiodc || !hiodc->support_ats) {
>>>> + return true;
>>>> + }
>>>> +
>>>> + ats_supported = hiodc->support_ats(hiod);
>>>> + if (vdev->ats == ON_OFF_AUTO_ON && !ats_supported) {
>>>> + error_setg(errp, "vfio: ATS requested but not supported by
>>>> kernel");
>>>
>>> hmm, I wonder if we shouldn't fail to realize the device and stop the VM
>>> in this case (ats requested + no support).
>> Agreed. I can move the vfio_pci_ats_requested_and_supported() call to
>> vfio_pci_add_capabilities() so we can propagate the Error **, and then
>> pass ats_needed as a new input arg to vfio_add_ext_cap. Would that be
>> alright?
>>>
>>>> + *ats_need = false;
>>>> + return false;
>>>> + }
>>>> +
>>>> + *ats_need = ats_supported;
>>>> + return true;
>>>> +}
>>>> +
>>>> static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>>>> {
>>>> PCIDevice *pdev = PCI_DEVICE(vdev);
>>>> bool pasid_cap_added = false;
>>>> + bool ats_needed = false;
>>>> Error *err = NULL;
>>>> uint32_t header;
>>>> uint16_t cap_id, next, size;
>>>> @@ -2603,6 +2650,11 @@ static void vfio_add_ext_cap(VFIOPCIDevice
>>>> *vdev)
>>>> pci_set_long(pdev->wmask + PCI_CONFIG_SPACE_SIZE, 0);
>>>> pci_set_long(vdev->emulated_config_bits +
>>>> PCI_CONFIG_SPACE_SIZE, ~0);
>>>> + if (!vfio_pci_ats_requested_and_supported(vdev, &ats_needed,
>>>> &err)) {
>>>> + error_report_err(err);
>>>> + err = NULL;
>>>
>>> indent error
>>>
>> Thanks, will fix this.
>>>> + }
>>>> +
>>>> for (next = PCI_CONFIG_SPACE_SIZE; next;
>>>> next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
>>>> header = pci_get_long(config + next);
>>>> @@ -2640,6 +2692,16 @@ static void vfio_add_ext_cap(VFIOPCIDevice
>>>> *vdev)
>>>> case PCI_EXT_CAP_ID_PASID:
>>>> pasid_cap_added = true;
>>>> /* fallthrough */
>>>
>>> This change is coupling the PASID and ATS capabilities. Is this
>>> intended ?
>>>
>> This was not intentional, I will add a pcie_add_capability() call and
>> break for the PASID case.
>>>
>>>> + case PCI_EXT_CAP_ID_ATS:
>>>> + /*
>>>> + * If ATS is requested and supported according to the
>>>> kernel, add
>>>> + * the ATS capability. If not supported according to
>>>> the kernel or
>>>> + * disabled on the qemu command line, omit the ATS cap.
>>>> + */
>>>> + if (ats_needed) {
>>>> + pcie_add_capability(pdev, cap_id, cap_ver, next,
>>>> size);
>>>> + }
>>>> + break;
>>>> default:
>>>> pcie_add_capability(pdev, cap_id, cap_ver, next, size);
>>>> }
>>>> @@ -3819,6 +3881,7 @@ static const Property vfio_pci_properties[] = {
>>>> #ifdef CONFIG_IOMMUFD
>>>> DEFINE_PROP_LINK("iommufd", VFIOPCIDevice, vbasedev.iommufd,
>>>> TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
>>>> + DEFINE_PROP_ON_OFF_AUTO("ats", VFIOPCIDevice, ats,
>>>> ON_OFF_AUTO_AUTO),
>>>
>>> conflicts with vfio-next
>>>
>> Ok, will take a look and rebase on vfio-next.
>
> when you rebase, please put the "ats" property at the end of the list
> and update accordingly vfio_pci_class_init().
Ok, I will keep this in mind, thanks!
Nathan
next prev parent reply other threads:[~2026-04-13 19:11 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-01 1:02 [PATCH 00/11] hw/arm/smmuv3-accel: Resolve AUTO properties Nathan Chen
2026-04-01 1:02 ` [PATCH 01/11] hw/arm/smmuv3-accel: Add helper for resolving auto parameters Nathan Chen
2026-04-10 7:26 ` Eric Auger
2026-04-10 18:50 ` Nathan Chen
2026-04-10 7:37 ` Eric Auger
2026-04-10 18:53 ` Nathan Chen
2026-04-01 1:02 ` [PATCH 02/11] hw/arm/smmuv3-accel: Implement "auto" value for "ats" Nathan Chen
2026-04-10 7:36 ` Eric Auger
2026-04-10 18:53 ` Nathan Chen
2026-04-01 1:02 ` [PATCH 03/11] hw/arm/smmuv3: Change the default ats support to match the host Nathan Chen
2026-04-10 7:42 ` Eric Auger
2026-04-10 18:54 ` Nathan Chen
2026-04-01 1:02 ` [PATCH 04/11] vfio/pci: Add ats property and mask ATS cap when not exposed Nathan Chen
2026-04-10 21:09 ` Cédric Le Goater
2026-04-11 1:52 ` Nathan Chen
2026-04-11 12:45 ` Cédric Le Goater
2026-04-13 19:11 ` Nathan Chen [this message]
2026-04-16 13:58 ` Cédric Le Goater
2026-04-16 20:04 ` Nathan Chen
2026-04-16 13:43 ` Shameer Kolothum Thodi
2026-04-16 21:39 ` Nathan Chen
2026-04-01 1:02 ` [PATCH 05/11] hw/arm/smmuv3-accel: Implement "auto" value for "ril" Nathan Chen
2026-04-10 7:47 ` Eric Auger
2026-04-10 18:58 ` Nathan Chen
2026-04-01 1:02 ` [PATCH 06/11] hw/arm/smmuv3: Change the default ril support to match the host Nathan Chen
2026-04-10 7:41 ` Eric Auger
2026-04-10 18:59 ` Nathan Chen
2026-04-01 1:02 ` [PATCH 07/11] hw/arm/smmuv3-accel: Implement "auto" value for "ssidsize" Nathan Chen
2026-04-10 7:56 ` Eric Auger
2026-04-10 19:01 ` Nathan Chen
2026-04-01 1:02 ` [PATCH 08/11] hw/arm/smmuv3: Change the default ssidsize to match the host Nathan Chen
2026-04-01 1:02 ` [PATCH 09/11] hw/arm/smmuv3-accel: Implement "auto" value for "oas" Nathan Chen
2026-04-10 7:48 ` Eric Auger
2026-04-16 14:08 ` Shameer Kolothum Thodi
2026-04-16 20:02 ` Nathan Chen
2026-04-20 22:31 ` Nathan Chen
2026-04-21 9:30 ` Shameer Kolothum Thodi
2026-04-01 1:02 ` [PATCH 10/11] hw/arm/smmuv3: Change the default oas to match the host Nathan Chen
2026-04-01 1:02 ` [PATCH 11/11] qemu-options.hx: Support "auto" for accel SMMUv3 properties Nathan Chen
2026-04-16 14:19 ` Shameer Kolothum Thodi
2026-04-16 19:54 ` Nathan Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=98b61e85-2c6c-4de2-ba2b-e120229d230c@nvidia.com \
--to=nathanc@nvidia.com \
--cc=alex@shazbot.org \
--cc=clg@redhat.com \
--cc=eric.auger@redhat.com \
--cc=marcel.apfelbaum@gmail.com \
--cc=mochs@nvidia.com \
--cc=nicolinc@nvidia.com \
--cc=peter.maydell@linaro.org \
--cc=philmd@linaro.org \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=skolothumtho@nvidia.com \
--cc=wangyanan55@huawei.com \
--cc=yi.l.liu@intel.com \
--cc=zhao1.liu@intel.com \
--cc=zhenzhong.duan@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.