* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-04-27 5:54 ` [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices Nicolin Chen
@ 2026-04-27 16:31 ` Dave Jiang
2026-04-30 21:41 ` Dan Williams (nvidia)
` (2 subsequent siblings)
3 siblings, 0 replies; 26+ messages in thread
From: Dave Jiang @ 2026-04-27 16:31 UTC (permalink / raw)
To: Nicolin Chen, jgg, will, robin.murphy, bhelgaas
Cc: joro, praan, baolu.lu, kevin.tian, miko.lenczewski,
linux-arm-kernel, iommu, linux-kernel, linux-pci, dan.j.williams,
jonathan.cameron, vsethi, linux-cxl, nirmoyd
On 4/26/26 10:54 PM, Nicolin Chen wrote:
> Controlled by the IOMMU driver, ATS is usually enabled "on demand" when a
> given PASID on a device is attached to an I/O page table. This is working
> even when a device has no translation on its RID (i.e., the RID is IOMMU
> bypassed).
>
> However, certain PCIe devices require non-PASID ATS on their RID even when
> the RID is IOMMU bypassed. Call this "always on".
>
> For example, CXL spec r4.0 notes in sec 3.2.5.13 Memory Type on CXL.cache:
> "To source requests on CXL.cache, devices need to get the Host Physical
> Address (HPA) from the Host by means of an ATS request on CXL.io."
>
> In other words, the CXL.cache capability requires ATS; otherwise, it can't
> access host physical memory.
>
> Introduce a new pci_ats_always_on() helper for the IOMMU driver to scan a
> PCI device and shift ATS policies between "on demand" and "always on".
>
> Add the support for CXL.cache devices first. Pre-CXL devices will be added
> in quirks.c file.
>
> Note that pci_ats_always_on() validates against pci_ats_supported(), so we
> ensure that untrusted devices (e.g. external ports) will not be always on.
> This maintains the existing ATS security policy regarding potential side-
> channel attacks via ATS.
>
> Cc: linux-cxl@vger.kernel.org
> Suggested-by: Vikram Sethi <vsethi@nvidia.com>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Tested-by: Nirmoy Das <nirmoyd@nvidia.com>
> Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> include/linux/pci-ats.h | 3 +++
> include/uapi/linux/pci_regs.h | 1 +
> drivers/pci/ats.c | 43 +++++++++++++++++++++++++++++++++++
> 3 files changed, 47 insertions(+)
>
> diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
> index 75c6c86cf09dc..d14ba727d38b3 100644
> --- a/include/linux/pci-ats.h
> +++ b/include/linux/pci-ats.h
> @@ -12,6 +12,7 @@ int pci_prepare_ats(struct pci_dev *dev, int ps);
> void pci_disable_ats(struct pci_dev *dev);
> int pci_ats_queue_depth(struct pci_dev *dev);
> int pci_ats_page_aligned(struct pci_dev *dev);
> +bool pci_ats_always_on(struct pci_dev *dev);
> #else /* CONFIG_PCI_ATS */
> static inline bool pci_ats_supported(struct pci_dev *d)
> { return false; }
> @@ -24,6 +25,8 @@ static inline int pci_ats_queue_depth(struct pci_dev *d)
> { return -ENODEV; }
> static inline int pci_ats_page_aligned(struct pci_dev *dev)
> { return 0; }
> +static inline bool pci_ats_always_on(struct pci_dev *dev)
> +{ return false; }
> #endif /* CONFIG_PCI_ATS */
>
> #ifdef CONFIG_PCI_PRI
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index 14f634ab9350d..6ac45be1008b8 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -1349,6 +1349,7 @@
> /* CXL r4.0, 8.1.3: PCIe DVSEC for CXL Device */
> #define PCI_DVSEC_CXL_DEVICE 0
> #define PCI_DVSEC_CXL_CAP 0xA
> +#define PCI_DVSEC_CXL_CACHE_CAPABLE _BITUL(0)
> #define PCI_DVSEC_CXL_MEM_CAPABLE _BITUL(2)
> #define PCI_DVSEC_CXL_HDM_COUNT __GENMASK(5, 4)
> #define PCI_DVSEC_CXL_CTRL 0xC
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index ec6c8dbdc5e9c..fc871858b65bc 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -205,6 +205,49 @@ int pci_ats_page_aligned(struct pci_dev *pdev)
> return 0;
> }
>
> +/*
> + * CXL r4.0, sec 3.2.5.13 Memory Type on CXL.cache notes: to source requests on
> + * CXL.cache, devices need to get the Host Physical Address (HPA) from the Host
> + * by means of an ATS request on CXL.io.
> + *
> + * In other words, CXL.cache devices cannot access host physical memory without
> + * ATS.
> + */
> +static bool pci_cxl_ats_always_on(struct pci_dev *pdev)
> +{
> + int offset;
> + u16 cap;
> +
> + offset = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
> + PCI_DVSEC_CXL_DEVICE);
> + if (!offset)
> + return false;
> +
> + if (pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap))
> + return false;
> +
> + return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
> +}
> +
> +/**
> + * pci_ats_always_on - Whether the PCI device requires ATS to be always enabled
> + * @pdev: the PCI device
> + *
> + * Returns true, if the PCI device requires ATS for basic functional operation.
> + */
> +bool pci_ats_always_on(struct pci_dev *pdev)
> +{
> + if (pci_ats_disabled() || !pci_ats_supported(pdev))
> + return false;
> +
> + /* A VF inherits its PF's requirement for ATS function */
> + if (pdev->is_virtfn)
> + pdev = pci_physfn(pdev);
> +
> + return pci_cxl_ats_always_on(pdev);
> +}
> +EXPORT_SYMBOL_GPL(pci_ats_always_on);
> +
> #ifdef CONFIG_PCI_PRI
> void pci_pri_init(struct pci_dev *pdev)
> {
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-04-27 5:54 ` [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices Nicolin Chen
2026-04-27 16:31 ` Dave Jiang
@ 2026-04-30 21:41 ` Dan Williams (nvidia)
2026-04-30 23:28 ` Nicolin Chen
2026-05-19 19:36 ` Bjorn Helgaas
2026-05-20 13:12 ` Yi Liu
3 siblings, 1 reply; 26+ messages in thread
From: Dan Williams (nvidia) @ 2026-04-30 21:41 UTC (permalink / raw)
To: Nicolin Chen, jgg, will, robin.murphy, bhelgaas
Cc: joro, praan, baolu.lu, kevin.tian, miko.lenczewski,
linux-arm-kernel, iommu, linux-kernel, linux-pci, dan.j.williams,
jonathan.cameron, vsethi, linux-cxl, nirmoyd
Nicolin Chen wrote:
> Controlled by the IOMMU driver, ATS is usually enabled "on demand" when a
> given PASID on a device is attached to an I/O page table. This is working
> even when a device has no translation on its RID (i.e., the RID is IOMMU
> bypassed).
>
> However, certain PCIe devices require non-PASID ATS on their RID even when
> the RID is IOMMU bypassed. Call this "always on".
>
> For example, CXL spec r4.0 notes in sec 3.2.5.13 Memory Type on CXL.cache:
> "To source requests on CXL.cache, devices need to get the Host Physical
> Address (HPA) from the Host by means of an ATS request on CXL.io."
>
> In other words, the CXL.cache capability requires ATS; otherwise, it can't
> access host physical memory.
>
> Introduce a new pci_ats_always_on() helper for the IOMMU driver to scan a
> PCI device and shift ATS policies between "on demand" and "always on".
>
> Add the support for CXL.cache devices first. Pre-CXL devices will be added
> in quirks.c file.
>
> Note that pci_ats_always_on() validates against pci_ats_supported(), so we
> ensure that untrusted devices (e.g. external ports) will not be always on.
> This maintains the existing ATS security policy regarding potential side-
> channel attacks via ATS.
>
> Cc: linux-cxl@vger.kernel.org
> Suggested-by: Vikram Sethi <vsethi@nvidia.com>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Tested-by: Nirmoy Das <nirmoyd@nvidia.com>
> Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> include/linux/pci-ats.h | 3 +++
> include/uapi/linux/pci_regs.h | 1 +
> drivers/pci/ats.c | 43 +++++++++++++++++++++++++++++++++++
> 3 files changed, 47 insertions(+)
>
> diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
> index 75c6c86cf09dc..d14ba727d38b3 100644
> --- a/include/linux/pci-ats.h
> +++ b/include/linux/pci-ats.h
> @@ -12,6 +12,7 @@ int pci_prepare_ats(struct pci_dev *dev, int ps);
> void pci_disable_ats(struct pci_dev *dev);
> int pci_ats_queue_depth(struct pci_dev *dev);
> int pci_ats_page_aligned(struct pci_dev *dev);
> +bool pci_ats_always_on(struct pci_dev *dev);
> #else /* CONFIG_PCI_ATS */
> static inline bool pci_ats_supported(struct pci_dev *d)
> { return false; }
> @@ -24,6 +25,8 @@ static inline int pci_ats_queue_depth(struct pci_dev *d)
> { return -ENODEV; }
> static inline int pci_ats_page_aligned(struct pci_dev *dev)
> { return 0; }
> +static inline bool pci_ats_always_on(struct pci_dev *dev)
> +{ return false; }
> #endif /* CONFIG_PCI_ATS */
>
> #ifdef CONFIG_PCI_PRI
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index 14f634ab9350d..6ac45be1008b8 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -1349,6 +1349,7 @@
> /* CXL r4.0, 8.1.3: PCIe DVSEC for CXL Device */
> #define PCI_DVSEC_CXL_DEVICE 0
> #define PCI_DVSEC_CXL_CAP 0xA
> +#define PCI_DVSEC_CXL_CACHE_CAPABLE _BITUL(0)
> #define PCI_DVSEC_CXL_MEM_CAPABLE _BITUL(2)
> #define PCI_DVSEC_CXL_HDM_COUNT __GENMASK(5, 4)
> #define PCI_DVSEC_CXL_CTRL 0xC
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index ec6c8dbdc5e9c..fc871858b65bc 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -205,6 +205,49 @@ int pci_ats_page_aligned(struct pci_dev *pdev)
> return 0;
> }
>
> +/*
> + * CXL r4.0, sec 3.2.5.13 Memory Type on CXL.cache notes: to source requests on
> + * CXL.cache, devices need to get the Host Physical Address (HPA) from the Host
> + * by means of an ATS request on CXL.io.
> + *
> + * In other words, CXL.cache devices cannot access host physical memory without
> + * ATS.
> + */
> +static bool pci_cxl_ats_always_on(struct pci_dev *pdev)
> +{
> + int offset;
> + u16 cap;
> +
> + offset = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
> + PCI_DVSEC_CXL_DEVICE);
> + if (!offset)
> + return false;
> +
> + if (pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap))
> + return false;
> +
> + return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
Apologies for coming to this late and forgive me if the following has
already been asked and answered. Why not check for actual CXL.cache
protocol on the wire being present?
I.e. replace pci_cxl_ats_always_on() with a pdev->is_cxl_cache and this
incremental change (compile tested only):
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c4454583c11..45d87af4de63 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -483,7 +483,8 @@ struct pci_dev {
unsigned int is_pciehp:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
unsigned int is_thunderbolt:1; /* Thunderbolt controller */
- unsigned int is_cxl:1; /* Compute Express Link (CXL) */
+ unsigned int is_cxl_mem:1; /* Compute Express Link (CXL.mem) */
+ unsigned int is_cxl_cache:1; /* Compute Express Link (CXL.cache) */
/*
* Devices marked being untrusted are the ones that can potentially
* execute DMA attacks and similar. They are typically connected
@@ -809,7 +810,7 @@ static inline bool pci_is_display(struct pci_dev *pdev)
static inline bool pcie_is_cxl(struct pci_dev *pci_dev)
{
- return pci_dev->is_cxl;
+ return pci_dev->is_cxl_mem || pci_dev->is_cxl_cache;
}
#define for_each_pci_bridge(dev, bus) \
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index b63cd0c310bc..c01f0e8362f1 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1733,9 +1733,8 @@ static void set_pcie_cxl(struct pci_dev *dev)
pci_read_config_word(dev, dvsec + PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS,
&cap);
- dev->is_cxl = FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_CACHE, cap) ||
- FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_MEM, cap);
-
+ dev->is_cxl_cache = FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_CACHE, cap);
+ dev->is_cxl_mem = FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_MEM, cap);
}
static void set_pcie_untrusted(struct pci_dev *dev)
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-04-30 21:41 ` Dan Williams (nvidia)
@ 2026-04-30 23:28 ` Nicolin Chen
2026-05-01 23:27 ` Dan Williams (nvidia)
0 siblings, 1 reply; 26+ messages in thread
From: Nicolin Chen @ 2026-04-30 23:28 UTC (permalink / raw)
To: Dan Williams (nvidia)
Cc: jgg, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Thu, Apr 30, 2026 at 02:41:22PM -0700, Dan Williams (nvidia) wrote:
> > +static bool pci_cxl_ats_always_on(struct pci_dev *pdev)
> > +{
> > + int offset;
> > + u16 cap;
> > +
> > + offset = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
> > + PCI_DVSEC_CXL_DEVICE);
> > + if (!offset)
> > + return false;
> > +
> > + if (pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap))
> > + return false;
> > +
> > + return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
[...]
> Apologies for coming to this late and forgive me if the following has
> already been asked and answered. Why not check for actual CXL.cache
> protocol on the wire being present?
Actually it would make the patch smaller. The thing is that this
is_cxl property wasn't added when I started the series. So, it's
not using it. :)
> @@ -1733,9 +1733,8 @@ static void set_pcie_cxl(struct pci_dev *dev)
> pci_read_config_word(dev, dvsec + PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS,
> &cap);
>
> - dev->is_cxl = FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_CACHE, cap) ||
> - FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_MEM, cap);
> -
> + dev->is_cxl_cache = FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_CACHE, cap);
> + dev->is_cxl_mem = FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_MEM, cap);
One caveat is that:
Here it checks the cap from:
PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS (0xE) via
PCI_DVSEC_CXL_FLEXBUS_PORT (0x7)
On the other hand, mine checks from:
PCI_DVSEC_CXL_CAP (0xA) via
PCI_DVSEC_CXL_DEVICE (0x0)
The spec mentions in 8.2.1.3.1 DVSEC Flex Bus Port Capability: "
Note: The Mem_Capable, IO_Capable, and Cache_Capable fields are
also present in the DVSEC Flex Bus for the device [which is the
legacy name for DVSEC 0x0]. This allows for future scalability
where multiple devices, each with potentially different
capabilities, may be populated behind a single Port.
"
Not arguing that set_pcie_cxl() is wrong, but I am not sure if there
would be any side effect to rely on the "legacy name" over DVSEC 0x0.
Is there any CXL expert who can help confirm?
Thanks!
Nicolin
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-04-30 23:28 ` Nicolin Chen
@ 2026-05-01 23:27 ` Dan Williams (nvidia)
2026-05-01 23:46 ` Jason Gunthorpe
0 siblings, 1 reply; 26+ messages in thread
From: Dan Williams (nvidia) @ 2026-05-01 23:27 UTC (permalink / raw)
To: Nicolin Chen, Dan Williams (nvidia)
Cc: jgg, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
Nicolin Chen wrote:
> On Thu, Apr 30, 2026 at 02:41:22PM -0700, Dan Williams (nvidia) wrote:
> > > +static bool pci_cxl_ats_always_on(struct pci_dev *pdev)
> > > +{
> > > + int offset;
> > > + u16 cap;
> > > +
> > > + offset = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
> > > + PCI_DVSEC_CXL_DEVICE);
> > > + if (!offset)
> > > + return false;
> > > +
> > > + if (pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap))
> > > + return false;
> > > +
> > > + return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
> [...]
> > Apologies for coming to this late and forgive me if the following has
> > already been asked and answered. Why not check for actual CXL.cache
> > protocol on the wire being present?
>
> Actually it would make the patch smaller. The thing is that this
> is_cxl property wasn't added when I started the series. So, it's
> not using it. :)
>
> > @@ -1733,9 +1733,8 @@ static void set_pcie_cxl(struct pci_dev *dev)
> > pci_read_config_word(dev, dvsec + PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS,
> > &cap);
> >
> > - dev->is_cxl = FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_CACHE, cap) ||
> > - FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_MEM, cap);
> > -
> > + dev->is_cxl_cache = FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_CACHE, cap);
> > + dev->is_cxl_mem = FIELD_GET(PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_MEM, cap);
>
> One caveat is that:
>
> Here it checks the cap from:
> PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS (0xE) via
> PCI_DVSEC_CXL_FLEXBUS_PORT (0x7)
>
> On the other hand, mine checks from:
> PCI_DVSEC_CXL_CAP (0xA) via
> PCI_DVSEC_CXL_DEVICE (0x0)
>
> The spec mentions in 8.2.1.3.1 DVSEC Flex Bus Port Capability: "
> Note: The Mem_Capable, IO_Capable, and Cache_Capable fields are
> also present in the DVSEC Flex Bus for the device [which is the
> legacy name for DVSEC 0x0]. This allows for future scalability
> where multiple devices, each with potentially different
> capabilities, may be populated behind a single Port.
> "
>
> Not arguing that set_pcie_cxl() is wrong, but I am not sure if there
> would be any side effect to rely on the "legacy name" over DVSEC 0x0.
>
> Is there any CXL expert who can help confirm?
You appear to be confusing Cache_Capable and Cache_Enabled.
"8.2.1.3.1 DVSEC Flex Bus Port Capability" != "8.2.1.3.3 DVSEC Flex Bus Port Status"
Cache_Capable is only a capability. To check that the device has
actually trained the CXL.cache alternate protocol you need to look at
the status register.
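For readers following along, the distinction can be sketched in a small
userspace model (not kernel code and not part of the posted patch). The
struct mock_dev, its fields, and the mock_* helpers are hypothetical. The
register offsets are the ones quoted in this thread, and the bit positions
are assumed to be bit 0 in both registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets as quoted in this thread; bit positions are assumed (bit 0). */
#define PCI_DVSEC_CXL_CAP                        0xA
#define PCI_DVSEC_CXL_CACHE_CAPABLE              (1u << 0)
#define PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS        0xE
#define PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_CACHE  (1u << 0)

/* Hypothetical stand-in for a device: two DVSEC offsets plus raw config bytes. */
struct mock_dev {
	uint16_t cxl_device_dvsec;   /* offset of DVSEC ID 0x0, 0 if absent */
	uint16_t flexbus_port_dvsec; /* offset of DVSEC ID 0x7, 0 if absent */
	uint8_t cfg[4096];
};

/* Little-endian 16-bit read from the mock config space. */
uint16_t mock_read_word(const struct mock_dev *d, unsigned int off)
{
	assert(off + 1 < sizeof(d->cfg));
	return (uint16_t)(d->cfg[off] | (d->cfg[off + 1] << 8));
}

/* Little-endian 16-bit write to the mock config space. */
void mock_write_word(struct mock_dev *d, unsigned int off, uint16_t val)
{
	assert(off + 1 < sizeof(d->cfg));
	d->cfg[off] = (uint8_t)(val & 0xff);
	d->cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Capability: what the device could do. This is what the posted patch checks. */
bool mock_cache_capable(const struct mock_dev *d)
{
	if (!d->cxl_device_dvsec)
		return false;
	return mock_read_word(d, d->cxl_device_dvsec + PCI_DVSEC_CXL_CAP) &
	       PCI_DVSEC_CXL_CACHE_CAPABLE;
}

/* Status: whether CXL.cache was actually negotiated on the link. */
bool mock_cache_enabled(const struct mock_dev *d)
{
	if (!d->flexbus_port_dvsec)
		return false;
	return mock_read_word(d, d->flexbus_port_dvsec +
				 PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS) &
	       PCI_DVSEC_CXL_FLEXBUS_PORT_STATUS_CACHE;
}
```

In this model a device can report mock_cache_capable() as true while
mock_cache_enabled() is still false, which is exactly the gap between the
two registers being discussed.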
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-01 23:27 ` Dan Williams (nvidia)
@ 2026-05-01 23:46 ` Jason Gunthorpe
2026-05-02 0:19 ` Dan Williams (nvidia)
0 siblings, 1 reply; 26+ messages in thread
From: Jason Gunthorpe @ 2026-05-01 23:46 UTC (permalink / raw)
To: Dan Williams (nvidia)
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Fri, May 01, 2026 at 04:27:41PM -0700, Dan Williams (nvidia) wrote:
> You appear to be confusing Cache_Capable and Cache_Enabled.
>
> "8.2.1.3.1 DVSEC Flex Bus Port Capability" != "8.2.1.3.3 DVSEC Flex Bus Port Status"
>
> Cache_Capable is only a capability. To check that the device has
> actually trained the CXL.cache alternate protocol you need to look at
> the status register.
Checking Cache_Capable is probably a reasonable choice here, unless you
are confident the status will never change after the device is first
discovered. ATS is being set early in the boot sequence.
It is pretty safe to be over eager with the ATS enablement, less safe
to get it off when it needs to be on.
Jason
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-01 23:46 ` Jason Gunthorpe
@ 2026-05-02 0:19 ` Dan Williams (nvidia)
0 siblings, 0 replies; 26+ messages in thread
From: Dan Williams (nvidia) @ 2026-05-02 0:19 UTC (permalink / raw)
To: Jason Gunthorpe, Dan Williams (nvidia)
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
Jason Gunthorpe wrote:
> On Fri, May 01, 2026 at 04:27:41PM -0700, Dan Williams (nvidia) wrote:
>
> > You appear to be confusing Cache_Capable and Cache_Enabled.
> >
> > "8.2.1.3.1 DVSEC Flex Bus Port Capability" != "8.2.1.3.3 DVSEC Flex Bus Port Status"
> >
> > Cache_Capable is only a capability. To check that the device has
> > actually trained the CXL.cache alternate protocol you need to look at
> > the status register.
>
> Checking Cache_Capable is probably a reasonable choice here, unless you
> are confident the status will never change after the device is first
> discovered. ATS is being set early in the boot sequence.
>
> It is pretty safe to be over eager with the ATS enablement, less safe
> to get it off when it needs to be on.
True, a reset could turn on CXL.cache. Ok, stick with what you have.
The present state of alternate protocol negotiation is still relevant
though for distinguishing CXL protocol errors from other PCIe AER
"internal" errors.
Need a bit of fixup work for that to refresh the status bit after reset.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-04-27 5:54 ` [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices Nicolin Chen
2026-04-27 16:31 ` Dave Jiang
2026-04-30 21:41 ` Dan Williams (nvidia)
@ 2026-05-19 19:36 ` Bjorn Helgaas
2026-05-19 22:23 ` Jason Gunthorpe
2026-05-20 13:12 ` Yi Liu
3 siblings, 1 reply; 26+ messages in thread
From: Bjorn Helgaas @ 2026-05-19 19:36 UTC (permalink / raw)
To: Nicolin Chen
Cc: jgg, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Sun, Apr 26, 2026 at 10:54:00PM -0700, Nicolin Chen wrote:
> Controlled by the IOMMU driver, ATS is usually enabled "on demand" when a
> given PASID on a device is attached to an I/O page table. This is working
> even when a device has no translation on its RID (i.e., the RID is IOMMU
> bypassed).
>
> However, certain PCIe devices require non-PASID ATS on their RID even when
> the RID is IOMMU bypassed. Call this "always on".
>
> For example, CXL spec r4.0 notes in sec 3.2.5.13 Memory Type on CXL.cache:
> "To source requests on CXL.cache, devices need to get the Host Physical
> Address (HPA) from the Host by means of an ATS request on CXL.io."
>
> In other words, the CXL.cache capability requires ATS; otherwise, it can't
> access host physical memory.
>
> Introduce a new pci_ats_always_on() helper for the IOMMU driver to scan a
> PCI device and shift ATS policies between "on demand" and "always on".
>
> Add the support for CXL.cache devices first. Pre-CXL devices will be added
> in quirks.c file.
>
> Note that pci_ats_always_on() validates against pci_ats_supported(), so we
> ensure that untrusted devices (e.g. external ports) will not be always on.
> This maintains the existing ATS security policy regarding potential side-
> channel attacks via ATS.
IMO this doesn't really fit in the PCI core. ats.c encapsulates
discovery and provides interfaces to access the ATS Capability, but
the users of those interfaces are all outside the PCI core.
The decision to enable ATS for CXL.cache devices is fine but
it's really an IOMMU usage policy, and I think it should be
implemented in the IOMMU core. All the pieces needed
(pci_ats_disabled(), pci_ats_supported(), pci_find_dvsec_capability())
are already exported.
One motivation for putting this in the PCI core was to use the quirk
infrastructure, but this series doesn't use any of that. It doesn't
declare any fixups, e.g., DECLARE_PCI_FIXUP_FINAL, and it doesn't
update any state cached by the PCI core.
> +++ b/include/uapi/linux/pci_regs.h
> @@ -1349,6 +1349,7 @@
> /* CXL r4.0, 8.1.3: PCIe DVSEC for CXL Device */
> #define PCI_DVSEC_CXL_DEVICE 0
> #define PCI_DVSEC_CXL_CAP 0xA
> +#define PCI_DVSEC_CXL_CACHE_CAPABLE _BITUL(0)
This makes good sense, I'm fine with adding
PCI_DVSEC_CXL_CACHE_CAPABLE.
> +bool pci_ats_always_on(struct pci_dev *pdev)
> +{
> + if (pci_ats_disabled() || !pci_ats_supported(pdev))
> + return false;
Isn't this the same as:
if (!pci_ats_supported(pdev))
return false;
If pci_ats_disabled(), dev->ats_cap should be zero, so
pci_ats_supported() should always return false.
> +
> + /* A VF inherits its PF's requirement for ATS function */
> + if (pdev->is_virtfn)
> + pdev = pci_physfn(pdev);
> +
> + return pci_cxl_ats_always_on(pdev);
> +}
> +EXPORT_SYMBOL_GPL(pci_ats_always_on);
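The redundancy pointed out above can be modelled in a few lines of
userspace C. This is only a sketch under the assumption stated in the
review comment, namely that ATS capability discovery is skipped when ATS
is globally disabled, so the cached capability offset stays zero. The
struct and all mock_* names are hypothetical stand-ins, not the kernel's
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the fields that matter for this argument. */
struct mock_pci_dev {
	uint16_t hw_ats_cap; /* where hardware exposes the ATS capability, 0 if none */
	uint16_t ats_cap;    /* offset cached at init time, 0 if never discovered */
	bool untrusted;      /* e.g. a device behind an external-facing port */
};

bool mock_ats_disabled_flag; /* models a global "ATS off" switch */

void mock_set_ats_disabled(bool off)
{
	mock_ats_disabled_flag = off;
}

bool mock_ats_disabled(void)
{
	return mock_ats_disabled_flag;
}

/* Assumed behaviour: discovery is skipped entirely when ATS is disabled. */
void mock_ats_init(struct mock_pci_dev *dev)
{
	assert(dev);
	if (mock_ats_disabled())
		return;
	dev->ats_cap = dev->hw_ats_cap;
}

/* Supported means: capability was discovered and the device is trusted. */
bool mock_ats_supported(const struct mock_pci_dev *dev)
{
	assert(dev);
	if (!dev->ats_cap)
		return false;
	return !dev->untrusted;
}

/* Guard as written in the posted patch. */
bool mock_guard_long(const struct mock_pci_dev *dev)
{
	return !(mock_ats_disabled() || !mock_ats_supported(dev));
}

/* Guard as suggested in the review. */
bool mock_guard_short(const struct mock_pci_dev *dev)
{
	return mock_ats_supported(dev);
}
```

As long as the global switch is fixed before mock_ats_init() runs, the two
guards agree for every device in this model, which is the basis for
dropping the extra check.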
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-19 19:36 ` Bjorn Helgaas
@ 2026-05-19 22:23 ` Jason Gunthorpe
2026-05-19 23:48 ` Bjorn Helgaas
0 siblings, 1 reply; 26+ messages in thread
From: Jason Gunthorpe @ 2026-05-19 22:23 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Tue, May 19, 2026 at 02:36:49PM -0500, Bjorn Helgaas wrote:
> One motivation for putting this in the PCI core was to use the quirk
> infrastructure, but this series doesn't use any of that. It doesn't
> declare any fixups, e.g., DECLARE_PCI_FIXUP_FINAL, and it doesn't
> update any state cached by the PCI core.
It works like the acs quirks that are in the quirks file, which are
also arguably only used by iommu too :)
I'm not keen on spreading lists of device ids for PCI quirks to iommu
files, but it would be OK to move pci_ats_always_on() to
iommu_ats_always_on() that calls the PCI quirk function.
Thanks,
Jason
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-19 22:23 ` Jason Gunthorpe
@ 2026-05-19 23:48 ` Bjorn Helgaas
2026-05-20 0:05 ` Jason Gunthorpe
0 siblings, 1 reply; 26+ messages in thread
From: Bjorn Helgaas @ 2026-05-19 23:48 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Tue, May 19, 2026 at 07:23:35PM -0300, Jason Gunthorpe wrote:
> On Tue, May 19, 2026 at 02:36:49PM -0500, Bjorn Helgaas wrote:
> > One motivation for putting this in the PCI core was to use the quirk
> > infrastructure, but this series doesn't use any of that. It doesn't
> > declare any fixups, e.g., DECLARE_PCI_FIXUP_FINAL, and it doesn't
> > update any state cached by the PCI core.
>
> It works like the acs quirks that are in the quirks file, which are
> also arguably only used by iommu too :)
True, although ACS has a lot more PCI-specific grunge in it, including
all the "pci=config_acs" and "pci=disable_acs_redir" stuff.
> I'm not keen on spreading lists of device ids for PCI quirks to iommu
> files, but it would be OK to move pci_ats_always_on() to
> iommu_ats_always_on() that calls the PCI quirk function.
Yeah, I guess it's fair to collect the device IDs in PCI since this is
about characteristics of the device.
If we leave stuff in drivers/pci/, I would prefer that part of it be
named to be purely informational, i.e., "CXL.cache_enabled" or
something similar that would also cover the NVIDIA devices.
"pci_ats_always_on()" doesn't sound quite right to me because it
presupposes the policy choice that IOMMU is going to make; that PCI
function doesn't actually turn ATS on, and it looks like the question
of enabling ATS depends on how the device is actually *used*. E.g.,
if Cache_Enable is not set, is ATS required?
That raises the question of whether this is the right test:
+ if (pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap))
+ return false;
+
+ return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
That just says the device is *capable* of CXL.cache; should it check
whether CXL.cache is *enabled* instead?
Bjorn
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-19 23:48 ` Bjorn Helgaas
@ 2026-05-20 0:05 ` Jason Gunthorpe
2026-05-20 1:04 ` Nicolin Chen
0 siblings, 1 reply; 26+ messages in thread
From: Jason Gunthorpe @ 2026-05-20 0:05 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Tue, May 19, 2026 at 06:48:01PM -0500, Bjorn Helgaas wrote:
> On Tue, May 19, 2026 at 07:23:35PM -0300, Jason Gunthorpe wrote:
> > On Tue, May 19, 2026 at 02:36:49PM -0500, Bjorn Helgaas wrote:
> > > One motivation for putting this in the PCI core was to use the quirk
> > > infrastructure, but this series doesn't use any of that. It doesn't
> > > declare any fixups, e.g., DECLARE_PCI_FIXUP_FINAL, and it doesn't
> > > update any state cached by the PCI core.
> >
> > It works like the acs quirks that are in the quirks file, which are
> > also arguably only used by iommu too :)
>
> True, although ACS has a lot more PCI-specific grunge in it, including
> all the "pci=config_acs" and "pci=disable_acs_redir" stuff.
>
> > I'm not keen on spreading lists of device ids for PCI quirks to iommu
> > files, but it would be OK to move pci_ats_always_on() to
> > iommu_ats_always_on() that calls the PCI quirk function.
>
> Yeah, I guess it's fair to collect the device IDs in PCI since this is
> about characteristics of the device.
>
> If we leave stuff in drivers/pci/, I would prefer that part of it be
> named to be purely informational, i.e., "CXL.cache_enabled" or
> something similar that would also cover the NVIDIA devices.
Yeah, that's fair, so let's rename it to
pci_translated_required()
ie the device requires translated requests to function. This is what
CXL.cache implies (IIRC I was told the spec specifically says this)
Requiring translated requests implies you have to enable ATS in the
system.
> function doesn't actually turn ATS on, and it looks like the question
> of enabling ATS depends on how the device is actually *used*. E.g.,
> if Cache_Enable is not set, is ATS required?
We have no way to know..
> That raises the question of whether this is the right test:
>
> + if (pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap))
> + return false;
> +
> + return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
>
> That just says the device is *capable* of CXL.cache; should it check
> whether CXL.cache is *enabled* instead?
No, we talked about this with Dan in one of the versions... it is
better to over-enable ATS than under-enable. over-enable at best is a
NOP, or maybe a tiny performance loss, under-enable is a functional
failure.
If the CXL.cache is not enabled right now it could become enabled
later, after the iommu has already called this and made its
choice..
Thus let's not try to be too narrow here..
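To make the asymmetry concrete, here is a toy userspace model (purely
illustrative, not proposed code). It assumes the IOMMU driver picks the
ATS policy once at probe time and never revisits it. The struct, the enum,
and every function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state: what it can do vs. what is negotiated now. */
struct mock_cxl_dev {
	bool cache_capable;
	bool cache_enabled;
};

enum mock_ats_policy {
	MOCK_ATS_ON_DEMAND,
	MOCK_ATS_ALWAYS_ON,
};

/* One-shot decision keyed on the capability bit. */
enum mock_ats_policy mock_decide_by_capable(const struct mock_cxl_dev *d)
{
	assert(d);
	return d->cache_capable ? MOCK_ATS_ALWAYS_ON : MOCK_ATS_ON_DEMAND;
}

/* One-shot decision keyed on the current status bit. */
enum mock_ats_policy mock_decide_by_enabled(const struct mock_cxl_dev *d)
{
	assert(d);
	return d->cache_enabled ? MOCK_ATS_ALWAYS_ON : MOCK_ATS_ON_DEMAND;
}

/*
 * Functional failure in this model: CXL.cache is active, but the policy
 * chosen earlier does not keep ATS on for the RID.
 */
bool mock_functional_failure(const struct mock_cxl_dev *d,
			     enum mock_ats_policy chosen)
{
	assert(d);
	return d->cache_enabled && chosen != MOCK_ATS_ALWAYS_ON;
}
```

If a capable device has CXL.cache disabled at probe and enabled later, the
policy chosen by mock_decide_by_capable() still works, while the one chosen
by mock_decide_by_enabled() ends up in mock_functional_failure().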
Thanks,
Jason
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-20 0:05 ` Jason Gunthorpe
@ 2026-05-20 1:04 ` Nicolin Chen
2026-05-20 14:20 ` Jason Gunthorpe
0 siblings, 1 reply; 26+ messages in thread
From: Nicolin Chen @ 2026-05-20 1:04 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bjorn Helgaas, will, robin.murphy, bhelgaas, joro, praan,
baolu.lu, kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Tue, May 19, 2026 at 09:05:04PM -0300, Jason Gunthorpe wrote:
> On Tue, May 19, 2026 at 06:48:01PM -0500, Bjorn Helgaas wrote:
> > On Tue, May 19, 2026 at 07:23:35PM -0300, Jason Gunthorpe wrote:
> > > On Tue, May 19, 2026 at 02:36:49PM -0500, Bjorn Helgaas wrote:
> > > > One motivation for putting this in the PCI core was to use the quirk
> > > > infrastructure, but this series doesn't use any of that. It doesn't
> > > > declare any fixups, e.g., DECLARE_PCI_FIXUP_FINAL, and it doesn't
> > > > update any state cached by the PCI core.
> > >
> > > It works like the acs quirks that are in the quirks file, which are
> > > also arguably only used by iommu too :)
> >
> > True, although ACS has a lot more PCI-specific grunge in it, including
> > all the "pci=config_acs" and "pci=disable_acs_redir" stuff.
> >
> > > I'm not keen on spreading lists of device ids for PCI quirks to iommu
> > > files, but it would be OK to move pci_ats_always_on() to
> > > iommu_ats_always_on() that calls the PCI quirk function.
> >
> > Yeah, I guess it's fair to collect the device IDs in PCI since this is
> > about characteristics of the device.
> >
> > If we leave stuff in drivers/pci/, I would prefer that part of it be
> > named to be purely informational, i.e., "CXL.cache_enabled" or
> > something similar that would also cover the NVIDIA devices.
>
> Yeah, that's fair, so let's rename it to
>
> pci_translated_required()
>
> ie the device requires translated requests to function. This is what
> CXL.cache implies (IIRC I was told the spec specifically says this)
>
> Requiring translated requests implies you have to enable ATS in the
> system.
Perhaps we could let IOMMU drivers check:
pci_cxl_is_cache_capable() || pci_dev_specific_is_pre_cxl()
directly?
Thanks
Nicolin
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-20 1:04 ` Nicolin Chen
@ 2026-05-20 14:20 ` Jason Gunthorpe
2026-05-20 17:29 ` Nicolin Chen
0 siblings, 1 reply; 26+ messages in thread
From: Jason Gunthorpe @ 2026-05-20 14:20 UTC (permalink / raw)
To: Nicolin Chen
Cc: Bjorn Helgaas, will, robin.murphy, bhelgaas, joro, praan,
baolu.lu, kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Tue, May 19, 2026 at 06:04:18PM -0700, Nicolin Chen wrote:
> > Yeah, that's fair, so let's rename it to
> >
> > pci_translated_required()
> >
> > ie the device requires translated requests to function. This is what
> > CXL.cache implies (IIRC I was told the spec specifically says this)
> >
> > Requiring translated requests implies you have to enable ATS in the
> > system.
>
> Perhaps we could let IOMMU drivers check:
> pci_cxl_is_cache_capable() || pci_dev_specific_is_pre_cxl()
> directly?
I'd rather have a single function.
Jason
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-20 14:20 ` Jason Gunthorpe
@ 2026-05-20 17:29 ` Nicolin Chen
2026-05-20 17:47 ` Bjorn Helgaas
0 siblings, 1 reply; 26+ messages in thread
From: Nicolin Chen @ 2026-05-20 17:29 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bjorn Helgaas, will, robin.murphy, bhelgaas, joro, praan,
baolu.lu, kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Wed, May 20, 2026 at 11:20:43AM -0300, Jason Gunthorpe wrote:
> On Tue, May 19, 2026 at 06:04:18PM -0700, Nicolin Chen wrote:
>
> > > Yeah, that's fair, so let's rename it to
> > >
> > > pci_translated_required()
> > >
> > > ie the device requires translated requests to function. This is what
> > > CXL.cache implies (IIRC I was told the spec specifically says this)
> > >
> > > Requiring translated requests implies you have to enable ATS in the
> > > system.
> >
> > Perhaps we could let IOMMU drivers check:
> > pci_cxl_is_cache_capable() || pci_dev_specific_is_pre_cxl()
> > directly?
>
> I'd rather have a single function.
OK. Can we use pci_ats_required()?
The CXL spec explicitly uses "ATS" when stating the requirement of
CXL.cache, and it'd fit in with the existing pci_ats_* functions.
Nicolin
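The "single function" shape discussed above can be sketched roughly as follows. This is a user-space illustration only, under stated assumptions: struct mock_dev, the pre_cxl_ids table (with an invented placeholder ID), and every mock_* name are hypothetical and do not exist in the kernel. The VF-to-PF step and the ATS-supported gate mirror what the posted pci_ats_always_on() does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device model; not struct pci_dev. */
struct mock_dev {
	uint16_t vendor;
	uint16_t device;
	bool ats_supported;            /* stands in for pci_ats_supported() */
	bool cxl_cache_capable;        /* stands in for the DVSEC Cache_Capable bit */
	const struct mock_dev *physfn; /* non-NULL only for a VF */
};

struct mock_id {
	uint16_t vendor;
	uint16_t device;
};

/* Placeholder entry, invented for illustration; real IDs would live in quirks.c. */
static const struct mock_id pre_cxl_ids[] = {
	{ 0xffff, 0x0001 },
};

/* Device-specific check for pre-CXL devices that still need ATS. */
static bool mock_is_pre_cxl(const struct mock_dev *dev)
{
	for (size_t i = 0; i < sizeof(pre_cxl_ids) / sizeof(pre_cxl_ids[0]); i++) {
		if (dev->vendor == pre_cxl_ids[i].vendor &&
		    dev->device == pre_cxl_ids[i].device)
			return true;
	}
	return false;
}

/* One predicate for IOMMU drivers instead of two open-coded checks. */
static bool mock_ats_required(const struct mock_dev *dev)
{
	if (!dev->ats_supported)
		return false;

	/* A VF inherits its PF's requirement. */
	if (dev->physfn)
		dev = dev->physfn;

	return dev->cxl_cache_capable || mock_is_pre_cxl(dev);
}
```

Keeping both checks behind one predicate means a new class of device can be added later without touching any IOMMU driver.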
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-20 17:29 ` Nicolin Chen
@ 2026-05-20 17:47 ` Bjorn Helgaas
2026-05-20 17:56 ` Jason Gunthorpe
0 siblings, 1 reply; 26+ messages in thread
From: Bjorn Helgaas @ 2026-05-20 17:47 UTC (permalink / raw)
To: Nicolin Chen
Cc: Jason Gunthorpe, will, robin.murphy, bhelgaas, joro, praan,
baolu.lu, kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Wed, May 20, 2026 at 10:29:19AM -0700, Nicolin Chen wrote:
> On Wed, May 20, 2026 at 11:20:43AM -0300, Jason Gunthorpe wrote:
> > On Tue, May 19, 2026 at 06:04:18PM -0700, Nicolin Chen wrote:
> >
> > > > Yeah, that's fair, so let's rename it to
> > > >
> > > > pci_translated_required()
> > > >
> > > > ie the device requires translated requests to function. This is what
> > > > CXL.cache implies (IIRC I was told the spec specifically says this)
> > > >
> > > > Requiring translated requests implies you have to enable ATS in the
> > > > system.
> > >
> > > Perhaps we could let IOMMU drivers check:
> > > pci_cxl_is_cache_capable() || pci_dev_specific_is_pre_cxl()
> > > directly?
> >
> > I'd rather have a single function.
>
> OK. Can we use pci_ats_required()?
>
> The CXL spec explicitly uses "ATS" when stating the requirement of
> CXL.cache, and it'd fit in with the existing pci_ats_* functions.
OK by me.
You already have a comment in the code about the CXL.cache
requirement; thanks for that.
I don't know enough about CXL to know what's behind the ATS
requirement. It sounds like it's more than a simple performance
optimization. If you happen to know the reason, it might be worth
a short comment about that too.
Please add a one-line comment in the code about why we check
Cache_Capable instead of Cache_Enable, i.e., even if CXL.cache is not
enabled now, it may be enabled later.
Bjorn
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-20 17:47 ` Bjorn Helgaas
@ 2026-05-20 17:56 ` Jason Gunthorpe
0 siblings, 0 replies; 26+ messages in thread
From: Jason Gunthorpe @ 2026-05-20 17:56 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Wed, May 20, 2026 at 12:47:58PM -0500, Bjorn Helgaas wrote:
> I don't know enough about CXL to know what's behind the ATS
> requirement. It sounds like it's more than a simple performance
> optimization. If you happen to know the reason, it might be worth
> a short comment about that too.
At the core of this are underlying physical interconnect protocols that
only work with translated addresses.
I.e., the CXL.cache protocol spec only has a definition for translated
physical addresses. The use of true physical addresses only is due to
the cache coherence shootdown protocol.
That is why I suggested 'pci_translated_required()' earlier: there are a
few more reasons than CXL.cache why a device might need translated
physical addresses only.
ATS is the only way for a device to get those addresses, so
ats_required is fine too, but it sort of glosses over what is driving
it.
Jason
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-04-27 5:54 ` [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices Nicolin Chen
` (2 preceding siblings ...)
2026-05-19 19:36 ` Bjorn Helgaas
@ 2026-05-20 13:12 ` Yi Liu
2026-05-20 14:34 ` Jason Gunthorpe
3 siblings, 1 reply; 26+ messages in thread
From: Yi Liu @ 2026-05-20 13:12 UTC (permalink / raw)
To: Nicolin Chen, jgg, will, robin.murphy, bhelgaas
Cc: joro, praan, baolu.lu, kevin.tian, miko.lenczewski,
linux-arm-kernel, iommu, linux-kernel, linux-pci, dan.j.williams,
jonathan.cameron, vsethi, linux-cxl, nirmoyd
On 4/27/26 13:54, Nicolin Chen wrote:
> Controlled by the IOMMU driver, ATS is usually enabled "on demand" when a
> given PASID on a device is attached to an I/O page table. This is working
> even when a device has no translation on its RID (i.e., the RID is IOMMU
> bypassed).
nit: this description does not seem accurate. The Intel IOMMU driver
enables ATS in the probe_device() phase. Mind tweaking it a bit to avoid
a misleading message? :)
> However, certain PCIe devices require non-PASID ATS on their RID even when
> the RID is IOMMU bypassed. Call this "always on".
>
> For example, CXL spec r4.0 notes in sec 3.2.5.13 Memory Type on CXL.cache:
> "To source requests on CXL.cache, devices need to get the Host Physical
> Address (HPA) from the Host by means of an ATS request on CXL.io."
>
> In other words, the CXL.cache capability requires ATS; otherwise, it can't
> access host physical memory.
>
> Introduce a new pci_ats_always_on() helper for the IOMMU driver to scan a
> PCI device and shift ATS policies between "on demand" and "always on".
>
> Add the support for CXL.cache devices first. Pre-CXL devices will be added
> in quirks.c file.
>
> Note that pci_ats_always_on() validates against pci_ats_supported(), so we
> ensure that untrusted devices (e.g. external ports) will not be always on.
> This maintains the existing ATS security policy regarding potential side-
> channel attacks via ATS.
>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
> Cc: linux-cxl@vger.kernel.org
> Suggested-by: Vikram Sethi <vsethi@nvidia.com>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Tested-by: Nirmoy Das <nirmoyd@nvidia.com>
> Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> include/linux/pci-ats.h | 3 +++
> include/uapi/linux/pci_regs.h | 1 +
> drivers/pci/ats.c | 43 +++++++++++++++++++++++++++++++++++
> 3 files changed, 47 insertions(+)
>
> diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
> index 75c6c86cf09dc..d14ba727d38b3 100644
> --- a/include/linux/pci-ats.h
> +++ b/include/linux/pci-ats.h
> @@ -12,6 +12,7 @@ int pci_prepare_ats(struct pci_dev *dev, int ps);
> void pci_disable_ats(struct pci_dev *dev);
> int pci_ats_queue_depth(struct pci_dev *dev);
> int pci_ats_page_aligned(struct pci_dev *dev);
> +bool pci_ats_always_on(struct pci_dev *dev);
> #else /* CONFIG_PCI_ATS */
> static inline bool pci_ats_supported(struct pci_dev *d)
> { return false; }
> @@ -24,6 +25,8 @@ static inline int pci_ats_queue_depth(struct pci_dev *d)
> { return -ENODEV; }
> static inline int pci_ats_page_aligned(struct pci_dev *dev)
> { return 0; }
> +static inline bool pci_ats_always_on(struct pci_dev *dev)
> +{ return false; }
> #endif /* CONFIG_PCI_ATS */
>
> #ifdef CONFIG_PCI_PRI
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index 14f634ab9350d..6ac45be1008b8 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -1349,6 +1349,7 @@
> /* CXL r4.0, 8.1.3: PCIe DVSEC for CXL Device */
> #define PCI_DVSEC_CXL_DEVICE 0
> #define PCI_DVSEC_CXL_CAP 0xA
> +#define PCI_DVSEC_CXL_CACHE_CAPABLE _BITUL(0)
> #define PCI_DVSEC_CXL_MEM_CAPABLE _BITUL(2)
> #define PCI_DVSEC_CXL_HDM_COUNT __GENMASK(5, 4)
> #define PCI_DVSEC_CXL_CTRL 0xC
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index ec6c8dbdc5e9c..fc871858b65bc 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -205,6 +205,49 @@ int pci_ats_page_aligned(struct pci_dev *pdev)
> return 0;
> }
>
> +/*
> + * CXL r4.0, sec 3.2.5.13 Memory Type on CXL.cache notes: to source requests on
> + * CXL.cache, devices need to get the Host Physical Address (HPA) from the Host
> + * by means of an ATS request on CXL.io.
> + *
> + * In other words, CXL.cache devices cannot access host physical memory without
> + * ATS.
> + */
> +static bool pci_cxl_ats_always_on(struct pci_dev *pdev)
> +{
> + int offset;
> + u16 cap;
> +
> + offset = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
> + PCI_DVSEC_CXL_DEVICE);
> + if (!offset)
> + return false;
> +
> + if (pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap))
> + return false;
> +
> + return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
> +}
> +
> +/**
> + * pci_ats_always_on - Whether the PCI device requires ATS to be always enabled
> + * @pdev: the PCI device
> + *
> + * Returns true, if the PCI device requires ATS for basic functional operation.
> + */
> +bool pci_ats_always_on(struct pci_dev *pdev)
> +{
> + if (pci_ats_disabled() || !pci_ats_supported(pdev))
> + return false;
> +
> + /* A VF inherits its PF's requirement for ATS function */
> + if (pdev->is_virtfn)
> + pdev = pci_physfn(pdev);
> +
> + return pci_cxl_ats_always_on(pdev);
> +}
> +EXPORT_SYMBOL_GPL(pci_ats_always_on);
> +
> #ifdef CONFIG_PCI_PRI
> void pci_pri_init(struct pci_dev *pdev)
> {
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
2026-05-20 13:12 ` Yi Liu
@ 2026-05-20 14:34 ` Jason Gunthorpe
0 siblings, 0 replies; 26+ messages in thread
From: Jason Gunthorpe @ 2026-05-20 14:34 UTC (permalink / raw)
To: Yi Liu
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
On Wed, May 20, 2026 at 09:12:31PM +0800, Yi Liu wrote:
> On 4/27/26 13:54, Nicolin Chen wrote:
> > Controlled by the IOMMU driver, ATS is usually enabled "on demand" when a
> > given PASID on a device is attached to an I/O page table. This is working
> > even when a device has no translation on its RID (i.e., the RID is IOMMU
> > bypassed).
>
> nit: this description seems not accurate. Intel iommu driver enables ATS
> in the probe_device() phase. mind tweak a bit to avoid misleading
> message. :)
It probably shouldn't do this; it should follow ARM and make it
dynamic during domain attach.
For security, we need ATS disabled for blocking domains at a minimum.
Jason
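One way to picture an attach-time ATS decision is the sketch below. It is an assumption-laden illustration, not the behavior of any existing IOMMU driver: the enum, the function name, and the choice to turn ATS on for every paging domain are hypothetical, and PASID attachments are ignored. What it takes from the thread is that a blocking domain never gets ATS and that a bypassed (identity) RID gets ATS only when the device requires it.

```c
#include <stdbool.h>

/* Hypothetical domain kinds; real drivers use their own domain types. */
enum mock_domain_type {
	MOCK_DOMAIN_BLOCKED,  /* all DMA rejected */
	MOCK_DOMAIN_IDENTITY, /* RID bypasses translation */
	MOCK_DOMAIN_PAGING,   /* RID translated by an I/O page table */
};

/* Decide at attach time whether ATS should be on for the RID. */
static bool mock_ats_enabled_for(enum mock_domain_type type,
				 bool ats_supported, bool ats_required)
{
	if (!ats_supported)
		return false;

	switch (type) {
	case MOCK_DOMAIN_BLOCKED:
		/* Never hand out translations to a blocked device. */
		return false;
	case MOCK_DOMAIN_IDENTITY:
		/* Bypassed RID: only the "always on" devices need ATS. */
		return ats_required;
	case MOCK_DOMAIN_PAGING:
		/* Assumption: ATS is simply on for translated domains. */
		return true;
	}
	return false;
}
```

Because the decision is recomputed on each attach, moving a device to a blocking domain turns ATS off even if the device otherwise requires it.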
^ permalink raw reply [flat|nested] 26+ messages in thread