All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nathan Chen <nathanc@nvidia.com>
To: "Cédric Le Goater" <clg@redhat.com>,
	qemu-arm@nongnu.org, qemu-devel@nongnu.org
Cc: "Yi Liu" <yi.l.liu@intel.com>,
	"Eric Auger" <eric.auger@redhat.com>,
	"Zhenzhong Duan" <zhenzhong.duan@intel.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Yanan Wang" <wangyanan55@huawei.com>,
	"Zhao Liu" <zhao1.liu@intel.com>,
	"Alex Williamson" <alex@shazbot.org>,
	"Shameer Kolothum" <skolothumtho@nvidia.com>,
	"Matt Ochs" <mochs@nvidia.com>,
	"Nicolin Chen" <nicolinc@nvidia.com>
Subject: Re: [PATCH 04/11] vfio/pci: Add ats property and mask ATS cap when not exposed
Date: Fri, 10 Apr 2026 18:52:15 -0700	[thread overview]
Message-ID: <e2f357c8-c124-4a20-ae7c-acbee4be8e38@nvidia.com> (raw)
In-Reply-To: <084d5550-c471-4d5b-8e2b-d2529a77cde8@redhat.com>


Hi Cédric,
On 4/10/2026 2:09 PM, Cédric Le Goater wrote:
> Hello Nathan,
> 
> On 4/1/26 03:02, Nathan Chen wrote:
>> From: Nathan Chen <nathanc@nvidia.com>
>>
>> Add an "ats" OnOffAuto property to vfio-pci. When the device has an ATS
>> extended capability in config space but we should not expose it (ats=off,
>> or ats=auto and kernel reports IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED), mask
>> the capability so the guest does not see it.
>>
>> This aligns with the kernel's per-device effective ATS reporting and
>> allows omitting ATS capability when the vIOMMU has ats=off.
>>
>> Suggested-by: Shameer Kolothum <skolothumtho@nvidia.com>
>> Signed-off-by: Nathan Chen <nathanc@nvidia.com>
> 
> This looks like a standalone VFIO change that could be sent independently.
> Could you please rebase on vfio-next [*] ? There are a few conflicting
> changes in this branch.
> 
> [*] https://github.com/legoater/qemu/commits/vfio-next
> 
> You will need to add a compat machine property too. This is a guest-visible
> ABI change. A separate patch would be preferred.
> 
Yes, I will separate this out and rebase on vfio-next, with the compat 
machine property added.

>> ---
>>   backends/iommufd.c                 | 15 +++++++
>>   hw/vfio/pci.c                      | 63 ++++++++++++++++++++++++++++++
>>   hw/vfio/pci.h                      |  1 +
>>   include/system/host_iommu_device.h | 10 +++++
>>   4 files changed, 89 insertions(+)
>>
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index e1fee16acf..52cb060454 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -22,6 +22,13 @@
>>   #include "hw/vfio/vfio-device.h"
>>   #include <sys/ioctl.h>
>>   #include <linux/iommufd.h>
>> +/*
>> + * Until kernel UAPI is synced via scripts;
>> + * matches include/uapi/linux/iommufd.h
>> + */
>> +#ifndef IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED
>> +#define IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED (1 << 3)
>> +#endif
> 
> This definition is introduced by :
> 
> https://lore.kernel.org/all/20260303150348.233997-1- 
> skolothumtho@nvidia.com/
> 
> It should reach Linux v7.1 since it is in linux-next. Anyhow, until then,
> please isolate this change in a temporary patch updating linux-headers/
> 
Ok, I will separate it out.
>>   static const char *iommufd_fd_name(IOMMUFDBackend *be)
>>   {
>> @@ -573,6 +580,13 @@ static int hiod_iommufd_get_cap(HostIOMMUDevice 
>> *hiod, int cap, Error **errp)
>>       }
>>   }
>> +static bool hiod_iommufd_support_ats(HostIOMMUDevice *hiod)
>> +{
>> +    HostIOMMUDeviceCaps *caps = &hiod->caps;
>> +
>> +    return !(caps->hw_caps & IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED);
>> +}
>> +
>>   static bool hiod_iommufd_get_pasid_info(HostIOMMUDevice *hiod,
>>                                           PasidInfo *pasid_info)
>>   {
>> @@ -595,6 +609,7 @@ static void hiod_iommufd_class_init(ObjectClass 
>> *oc, const void *data)
>>       hioc->get_cap = hiod_iommufd_get_cap;
>>       hioc->get_pasid_info = hiod_iommufd_get_pasid_info;
>> +    hioc->support_ats = hiod_iommufd_support_ats;
>>   };
>>   static const TypeInfo types[] = {
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 1945751ffd..2d408e1d9a 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -49,6 +49,10 @@
>>   #include "system/iommufd.h"
>>   #include "vfio-migration-internal.h"
>>   #include "vfio-helpers.h"
>> +#ifdef CONFIG_IOMMUFD
>> +#include "system/host_iommu_device.h"
>> +#include "linux/iommufd.h"
>> +#endif
> These includes are not necessary.
Got it, I will remove these. I see that system/host_iommu_device.h is 
included in system/iommufd.h already. And linux/iommufd.h was leftover 
from a previous revision.

> 
>>   /* Protected by BQL */
>>   static KVMRouteChange vfio_route_change;
>> @@ -2550,10 +2554,53 @@ static bool 
>> vfio_pci_synthesize_pasid_cap(VFIOPCIDevice *vdev, Error **errp)
>>       return true;
>>   }
>> +/*
>> + * Determine whether ATS capability should be advertised for @vdev, 
>> based on
>> + * whether it was enabled on the command line and whether it is 
>> supported
>> + * according to the kernel's IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED bit.
>> + *
>> + * Store whether ATS capability should be advertised in @ats_need.
>> + *
>> + * Return false if kernel enables IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED
>> + * and ATS is effectively unsupported.
>> + */
>> +static bool vfio_pci_ats_requested_and_supported(VFIOPCIDevice *vdev,
>> +                                                 bool *ats_need, 
>> Error **errp)
> 
> May be rename 'ats_need' to 'ats' or 'ats_needed'
> 
Ok, I will rename this to ats_needed.
>> +{
>> +    HostIOMMUDevice *hiod = vdev->vbasedev.hiod;
>> +    HostIOMMUDeviceClass *hiodc;
>> +    bool ats_supported;
>> +
>> +    if (vdev->ats == ON_OFF_AUTO_OFF) {
>> +        *ats_need = false;
>> +        return true;
>> +    }
>> +
>> +    *ats_need = true;
>> +    if (!hiod) {
>> +        return true;
>> +    }
>> +    hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
>> +    if (!hiodc || !hiodc->support_ats) {
>> +        return true;
>> +    }
>> +
>> +    ats_supported = hiodc->support_ats(hiod);
>> +    if (vdev->ats == ON_OFF_AUTO_ON && !ats_supported) {
>> +        error_setg(errp, "vfio: ATS requested but not supported by 
>> kernel");
> 
> hmm, I wonder if we shouldn't fail to realize the device and stop the VM
> in this case (ats requested + no support).
Agreed. I can move the vfio_pci_ats_requested_and_supported() call to 
vfio_pci_add_capabilities() so we can propagate the Error **, and then 
pass ats_needed as a new input arg to vfio_add_ext_cap. Would that be 
alright?
> 
>> +        *ats_need = false;
>> +        return false;
>> +    }
>> +
>> +    *ats_need = ats_supported;
>> +    return true;
>> +}
>> +
>>   static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>>   {
>>       PCIDevice *pdev = PCI_DEVICE(vdev);
>>       bool pasid_cap_added = false;
>> +    bool ats_needed = false;
>>       Error *err = NULL;
>>       uint32_t header;
>>       uint16_t cap_id, next, size;
>> @@ -2603,6 +2650,11 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>>       pci_set_long(pdev->wmask + PCI_CONFIG_SPACE_SIZE, 0);
>>       pci_set_long(vdev->emulated_config_bits + PCI_CONFIG_SPACE_SIZE, 
>> ~0);
>> +    if (!vfio_pci_ats_requested_and_supported(vdev, &ats_needed, 
>> &err)) {
>> +        error_report_err(err);
>> +       err = NULL;
> 
> indent error
> 
Thanks, will fix this.
>> +    }
>> +
>>       for (next = PCI_CONFIG_SPACE_SIZE; next;
>>            next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
>>           header = pci_get_long(config + next);
>> @@ -2640,6 +2692,16 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>>           case PCI_EXT_CAP_ID_PASID:
>>               pasid_cap_added = true;
>>               /* fallthrough */
> 
> This change is coupling the PASID and ATS capabilities. Is this intended ?
> 
This was not intentional, I will add a pcie_add_capability() call and 
break for the PASID case.
> 
>> +        case PCI_EXT_CAP_ID_ATS:
>> +            /*
>> +             * If ATS is requested and supported according to the 
>> kernel, add
>> +             * the ATS capability. If not supported according to the 
>> kernel or
>> +             * disabled on the qemu command line, omit the ATS cap.
>> +             */
>> +            if (ats_needed) {
>> +                pcie_add_capability(pdev, cap_id, cap_ver, next, size);
>> +            }
>> +            break;
>>           default:
>>               pcie_add_capability(pdev, cap_id, cap_ver, next, size);
>>           }
>> @@ -3819,6 +3881,7 @@ static const Property vfio_pci_properties[] = {
>>   #ifdef CONFIG_IOMMUFD
>>       DEFINE_PROP_LINK("iommufd", VFIOPCIDevice, vbasedev.iommufd,
>>                        TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
>> +    DEFINE_PROP_ON_OFF_AUTO("ats", VFIOPCIDevice, ats, 
>> ON_OFF_AUTO_AUTO),
> 
> conflicts with vfio-next
> 
Ok, will take a look and rebase on vfio-next.
>>   #endif
>>       DEFINE_PROP_BOOL("skip-vsc-check", VFIOPCIDevice, 
>> skip_vsc_check, true),
>>       DEFINE_PROP_UINT16("x-vpasid-cap-offset", VFIOPCIDevice,
>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>> index d6495d7f29..514a9197ce 100644
>> --- a/hw/vfio/pci.h
>> +++ b/hw/vfio/pci.h
>> @@ -191,6 +191,7 @@ struct VFIOPCIDevice {
>>       VFIODisplay *dpy;
>>       Notifier irqchip_change_notifier;
>>       VFIOPCICPR cpr;
>> +    OnOffAuto ats;
>>   };
>>   /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot 
>> match hw */
>> diff --git a/include/system/host_iommu_device.h b/include/system/ 
>> host_iommu_device.h
>> index f000301583..44c56e87bb 100644
>> --- a/include/system/host_iommu_device.h
>> +++ b/include/system/host_iommu_device.h
> 
> Please add to your .git/config :
> 
> [diff]
>      orderFile = /path/to/qemu/scripts/git.orderfile
> 
> 
Ok, I will add this for the next refresh.
>> @@ -133,6 +133,16 @@ struct HostIOMMUDeviceClass {
>>        * Returns: true on success, false on failure.
>>        */
>>       bool (*get_pasid_info)(HostIOMMUDevice *hiod, PasidInfo 
>> *pasid_info);
>> +    /**
>> +     *  @support_ats: Return whether ATS is supported for the device
>> +     *  associated with @hiod host IOMMU device, checking if the
>> +     *  IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED capability bit is set.
>> +     *
>> +     *  @hiod: handle to the host IOMMU device
>> +     *
>> +     *  Returns: true on success, false on failure
> 
>   "Returns: true if ATS is supported, false otherwise"
> 
Ok, I will make this more clear.
>> +     */
>> +    bool (*support_ats)(HostIOMMUDevice *hiod);
>>   };
>>   /*
> 
> 
> I would split this patch in 4 parts :
> 
> - linux-headers/ v7.1 update
> - support_ats() handler introduction and iommufd implementation
> - "ats" property addition
> - compat property fixup
Got it, thanks for the guidance on organizing the split. I will send 
separately from this series.

Thanks,
Nathan



  reply	other threads:[~2026-04-11  1:52 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-01  1:02 [PATCH 00/11] hw/arm/smmuv3-accel: Resolve AUTO properties Nathan Chen
2026-04-01  1:02 ` [PATCH 01/11] hw/arm/smmuv3-accel: Add helper for resolving auto parameters Nathan Chen
2026-04-10  7:26   ` Eric Auger
2026-04-10 18:50     ` Nathan Chen
2026-04-10  7:37   ` Eric Auger
2026-04-10 18:53     ` Nathan Chen
2026-04-01  1:02 ` [PATCH 02/11] hw/arm/smmuv3-accel: Implement "auto" value for "ats" Nathan Chen
2026-04-10  7:36   ` Eric Auger
2026-04-10 18:53     ` Nathan Chen
2026-04-01  1:02 ` [PATCH 03/11] hw/arm/smmuv3: Change the default ats support to match the host Nathan Chen
2026-04-10  7:42   ` Eric Auger
2026-04-10 18:54     ` Nathan Chen
2026-04-01  1:02 ` [PATCH 04/11] vfio/pci: Add ats property and mask ATS cap when not exposed Nathan Chen
2026-04-10 21:09   ` Cédric Le Goater
2026-04-11  1:52     ` Nathan Chen [this message]
2026-04-11 12:45       ` Cédric Le Goater
2026-04-13 19:11         ` Nathan Chen
2026-04-16 13:58     ` Cédric Le Goater
2026-04-16 20:04       ` Nathan Chen
2026-04-16 13:43   ` Shameer Kolothum Thodi
2026-04-16 21:39     ` Nathan Chen
2026-04-01  1:02 ` [PATCH 05/11] hw/arm/smmuv3-accel: Implement "auto" value for "ril" Nathan Chen
2026-04-10  7:47   ` Eric Auger
2026-04-10 18:58     ` Nathan Chen
2026-04-01  1:02 ` [PATCH 06/11] hw/arm/smmuv3: Change the default ril support to match the host Nathan Chen
2026-04-10  7:41   ` Eric Auger
2026-04-10 18:59     ` Nathan Chen
2026-04-01  1:02 ` [PATCH 07/11] hw/arm/smmuv3-accel: Implement "auto" value for "ssidsize" Nathan Chen
2026-04-10  7:56   ` Eric Auger
2026-04-10 19:01     ` Nathan Chen
2026-04-01  1:02 ` [PATCH 08/11] hw/arm/smmuv3: Change the default ssidsize to match the host Nathan Chen
2026-04-01  1:02 ` [PATCH 09/11] hw/arm/smmuv3-accel: Implement "auto" value for "oas" Nathan Chen
2026-04-10  7:48   ` Eric Auger
2026-04-16 14:08   ` Shameer Kolothum Thodi
2026-04-16 20:02     ` Nathan Chen
2026-04-20 22:31     ` Nathan Chen
2026-04-21  9:30       ` Shameer Kolothum Thodi
2026-04-01  1:02 ` [PATCH 10/11] hw/arm/smmuv3: Change the default oas to match the host Nathan Chen
2026-04-01  1:02 ` [PATCH 11/11] qemu-options.hx: Support "auto" for accel SMMUv3 properties Nathan Chen
2026-04-16 14:19   ` Shameer Kolothum Thodi
2026-04-16 19:54     ` Nathan Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e2f357c8-c124-4a20-ae7c-acbee4be8e38@nvidia.com \
    --to=nathanc@nvidia.com \
    --cc=alex@shazbot.org \
    --cc=clg@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=mochs@nvidia.com \
    --cc=nicolinc@nvidia.com \
    --cc=peter.maydell@linaro.org \
    --cc=philmd@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=skolothumtho@nvidia.com \
    --cc=wangyanan55@huawei.com \
    --cc=yi.l.liu@intel.com \
    --cc=zhao1.liu@intel.com \
    --cc=zhenzhong.duan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.