* [PATCH v2 1/5] x86/iommu: check for CMPXCHG16B when enabling IOMMU
2025-01-24 12:01 [PATCH v2 0/5] x86/iommu: make CX16 mandatory for IOMMU Roger Pau Monne
@ 2025-01-24 12:01 ` Roger Pau Monne
2025-01-27 9:51 ` Jan Beulich
2025-01-24 12:01 ` [PATCH v2 2/5] iommu/vtd: remove non-CX16 logic from interrupt remapping Roger Pau Monne
` (4 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: Roger Pau Monne @ 2025-01-24 12:01 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
From: Teddy Astie <teddy.astie@vates.tech>
All hardware with VT-d/AMD-Vi has CMPXCHG16B support. Check this at
initialisation time, and remove the effectively-dead logic for the
non-cx16 case.
If the local APICs support x2APIC mode the IOMMU support for interrupt
remapping will be checked earlier using a specific helper. If no support
for CX16 is detected by that earlier hook disable the IOMMU at that point
and prevent further poking for CX16 later in the boot process, which would
also fail.
There's a possible corner case when running virtualized, and the underlying
hypervisor exposing an IOMMU but no CMPXCHG16B support. In which case
ignoring the IOMMU is fine, albeit the most natural would be for the
underlying hypervisor to also expose CMPXCHG16B support if an IOMMU is
available to the VM.
Note this change only introduces the checks, but doesn't remove the now
stale checks for CX16 support sprinkled in the IOMMU code. Further changes
will take care of that.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since pickup:
- Unify all CX16 checks into a single patch.
---
xen/drivers/passthrough/amd/iommu_intr.c | 13 +++++++++++++
xen/drivers/passthrough/amd/pci_amd_iommu.c | 6 ++++++
xen/drivers/passthrough/vtd/intremap.c | 13 +++++++++++++
xen/drivers/passthrough/vtd/iommu.c | 7 +++++++
4 files changed, 39 insertions(+)
diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c
index 7fc796dec25b..f07fd9e3d970 100644
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -649,6 +649,19 @@ bool __init cf_check iov_supports_xt(void)
if ( !iommu_enable || !iommu_intremap )
return false;
+ if ( unlikely(!cpu_has_cx16) )
+ {
+ AMD_IOMMU_ERROR("no CMPXCHG16B support, disabling IOMMU\n");
+ /*
+ * Disable IOMMU support at once: there's no reason to check for CX16
+ * yet again when attempting to initialize IOMMU DMA remapping
+ * functionality or interrupt remapping without x2APIC support.
+ */
+ iommu_enable = false;
+ iommu_intremap = iommu_intremap_off;
+ return false;
+ }
+
if ( amd_iommu_prepare(true) )
return false;
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 73dcc4a2dd0c..f96f59440bcc 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -309,6 +309,12 @@ static int __init cf_check iov_detect(void)
if ( !iommu_enable && !iommu_intremap )
return 0;
+ if ( unlikely(!cpu_has_cx16) )
+ {
+ AMD_IOMMU_ERROR("no CMPXCHG16B support, disabling IOMMU\n");
+ return -ENODEV;
+ }
+
if ( (init_done ? amd_iommu_init_late()
: amd_iommu_init(false)) != 0 )
{
diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index c504852eb818..233db5cb64de 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -150,6 +150,19 @@ bool __init cf_check intel_iommu_supports_eim(void)
if ( !iommu_qinval || !iommu_intremap || list_empty(&acpi_drhd_units) )
return false;
+ if ( unlikely(!cpu_has_cx16) )
+ {
+ printk(XENLOG_ERR VTDPREFIX "no CMPXCHG16B support, disabling IOMMU\n");
+ /*
+ * Disable IOMMU support at once: there's no reason to check for CX16
+ * yet again when attempting to initialize IOMMU DMA remapping
+ * functionality or interrupt remapping without x2APIC support.
+ */
+ iommu_enable = false;
+ iommu_intremap = iommu_intremap_off;
+ return false;
+ }
+
/* We MUST have a DRHD unit for each IOAPIC. */
for ( apic = 0; apic < nr_ioapics; apic++ )
if ( !ioapic_to_drhd(IO_APIC_ID(apic)) )
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 27a4d1640189..3daedc4f5593 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2633,6 +2633,13 @@ static int __init cf_check vtd_setup(void)
int ret;
bool reg_inval_supported = true;
+ if ( unlikely(!cpu_has_cx16) )
+ {
+ printk(XENLOG_ERR VTDPREFIX "no CMPXCHG16B support, disabling IOMMU\n");
+ ret = -ENODEV;
+ goto error;
+ }
+
if ( list_empty(&acpi_drhd_units) )
{
ret = -ENODEV;
--
2.46.0
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH v2 1/5] x86/iommu: check for CMPXCHG16B when enabling IOMMU
2025-01-24 12:01 ` [PATCH v2 1/5] x86/iommu: check for CMPXCHG16B when enabling IOMMU Roger Pau Monne
@ 2025-01-27 9:51 ` Jan Beulich
2025-01-27 10:03 ` Roger Pau Monné
0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2025-01-27 9:51 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Teddy Astie, Andrew Cooper, xen-devel
On 24.01.2025 13:01, Roger Pau Monne wrote:
> From: Teddy Astie <teddy.astie@vates.tech>
>
> All hardware with VT-d/AMD-Vi has CMPXCHG16B support. Check this at
> initialisation time, and remove the effectively-dead logic for the
> non-cx16 case.
Nit: There's nothing being removed here, so I expect the 2nd half of the
sentence wants dropping.
Jan
> If the local APICs support x2APIC mode the IOMMU support for interrupt
> remapping will be checked earlier using a specific helper. If no support
> for CX16 is detected by that earlier hook disable the IOMMU at that point
> and prevent further poking for CX16 later in the boot process, which would
> also fail.
>
> There's a possible corner case when running virtualized, and the underlying
> hypervisor exposing an IOMMU but no CMPXCHG16B support. In which case
> ignoring the IOMMU is fine, albeit the most natural would be for the
> underlying hypervisor to also expose CMPXCHG16B support if an IOMMU is
> available to the VM.
>
> Note this change only introduces the checks, but doesn't remove the now
> stale checks for CX16 support sprinkled in the IOMMU code. Further changes
> will take care of that.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since pickup:
> - Unify all CX16 checks into a single patch.
> ---
> xen/drivers/passthrough/amd/iommu_intr.c | 13 +++++++++++++
> xen/drivers/passthrough/amd/pci_amd_iommu.c | 6 ++++++
> xen/drivers/passthrough/vtd/intremap.c | 13 +++++++++++++
> xen/drivers/passthrough/vtd/iommu.c | 7 +++++++
> 4 files changed, 39 insertions(+)
>
> diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c
> index 7fc796dec25b..f07fd9e3d970 100644
> --- a/xen/drivers/passthrough/amd/iommu_intr.c
> +++ b/xen/drivers/passthrough/amd/iommu_intr.c
> @@ -649,6 +649,19 @@ bool __init cf_check iov_supports_xt(void)
> if ( !iommu_enable || !iommu_intremap )
> return false;
>
> + if ( unlikely(!cpu_has_cx16) )
> + {
> + AMD_IOMMU_ERROR("no CMPXCHG16B support, disabling IOMMU\n");
> + /*
> + * Disable IOMMU support at once: there's no reason to check for CX16
> + * yet again when attempting to initialize IOMMU DMA remapping
> + * functionality or interrupt remapping without x2APIC support.
> + */
> + iommu_enable = false;
> + iommu_intremap = iommu_intremap_off;
> + return false;
> + }
> +
> if ( amd_iommu_prepare(true) )
> return false;
>
> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> index 73dcc4a2dd0c..f96f59440bcc 100644
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -309,6 +309,12 @@ static int __init cf_check iov_detect(void)
> if ( !iommu_enable && !iommu_intremap )
> return 0;
>
> + if ( unlikely(!cpu_has_cx16) )
> + {
> + AMD_IOMMU_ERROR("no CMPXCHG16B support, disabling IOMMU\n");
> + return -ENODEV;
> + }
> +
> if ( (init_done ? amd_iommu_init_late()
> : amd_iommu_init(false)) != 0 )
> {
> diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
> index c504852eb818..233db5cb64de 100644
> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -150,6 +150,19 @@ bool __init cf_check intel_iommu_supports_eim(void)
> if ( !iommu_qinval || !iommu_intremap || list_empty(&acpi_drhd_units) )
> return false;
>
> + if ( unlikely(!cpu_has_cx16) )
> + {
> + printk(XENLOG_ERR VTDPREFIX "no CMPXCHG16B support, disabling IOMMU\n");
> + /*
> + * Disable IOMMU support at once: there's no reason to check for CX16
> + * yet again when attempting to initialize IOMMU DMA remapping
> + * functionality or interrupt remapping without x2APIC support.
> + */
> + iommu_enable = false;
> + iommu_intremap = iommu_intremap_off;
> + return false;
> + }
> +
> /* We MUST have a DRHD unit for each IOAPIC. */
> for ( apic = 0; apic < nr_ioapics; apic++ )
> if ( !ioapic_to_drhd(IO_APIC_ID(apic)) )
> diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
> index 27a4d1640189..3daedc4f5593 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2633,6 +2633,13 @@ static int __init cf_check vtd_setup(void)
> int ret;
> bool reg_inval_supported = true;
>
> + if ( unlikely(!cpu_has_cx16) )
> + {
> + printk(XENLOG_ERR VTDPREFIX "no CMPXCHG16B support, disabling IOMMU\n");
> + ret = -ENODEV;
> + goto error;
> + }
> +
> if ( list_empty(&acpi_drhd_units) )
> {
> ret = -ENODEV;
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v2 1/5] x86/iommu: check for CMPXCHG16B when enabling IOMMU
2025-01-27 9:51 ` Jan Beulich
@ 2025-01-27 10:03 ` Roger Pau Monné
0 siblings, 0 replies; 13+ messages in thread
From: Roger Pau Monné @ 2025-01-27 10:03 UTC (permalink / raw)
To: Jan Beulich; +Cc: Teddy Astie, Andrew Cooper, xen-devel
On Mon, Jan 27, 2025 at 10:51:29AM +0100, Jan Beulich wrote:
> On 24.01.2025 13:01, Roger Pau Monne wrote:
> > From: Teddy Astie <teddy.astie@vates.tech>
> >
> > All hardware with VT-d/AMD-Vi has CMPXCHG16B support. Check this at
> > initialisation time, and remove the effectively-dead logic for the
> > non-cx16 case.
>
> Nit: There's nothing being removed here, so I expect the 2nd half of the
> sentence wants dropping.
Yes, that was a copy & paste from the following patch. Will drop the
last part of the sentence.
Thanks, Roger.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 2/5] iommu/vtd: remove non-CX16 logic from interrupt remapping
2025-01-24 12:01 [PATCH v2 0/5] x86/iommu: make CX16 mandatory for IOMMU Roger Pau Monne
2025-01-24 12:01 ` [PATCH v2 1/5] x86/iommu: check for CMPXCHG16B when enabling IOMMU Roger Pau Monne
@ 2025-01-24 12:01 ` Roger Pau Monne
2025-01-27 9:52 ` Jan Beulich
2025-01-24 12:01 ` [PATCH v2 3/5] x86/iommu: remove non-CX16 logic from DMA remapping Roger Pau Monne
` (3 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: Roger Pau Monne @ 2025-01-24 12:01 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
From: Teddy Astie <teddy.astie@vates.tech>
As CX16 support is mandatory for IOMMU usage, the checks for CX16 in the
interrupt remapping code are stale. Remove them together with the
associated code introduced in case CX16 was not available.
Note that AMD-Vi support for atomically updating a 128bit IRTE entry is
still not implemented, it will be done by further changes.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/drivers/passthrough/vtd/intremap.c | 75 +++++---------------------
1 file changed, 14 insertions(+), 61 deletions(-)
diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index 233db5cb64de..820616a8de93 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -184,49 +184,26 @@ bool __init cf_check intel_iommu_supports_eim(void)
/*
* Assume iremap_lock has been acquired. It is to make sure software will not
- * change the same IRTE behind us. With this assumption, if only high qword or
- * low qword in IRTE is to be updated, this function's atomic variant can
- * present an atomic update to VT-d hardware even when cmpxchg16b
- * instruction is not supported.
+ * change the same IRTE behind us.
*/
static void update_irte(struct vtd_iommu *iommu, struct iremap_entry *entry,
const struct iremap_entry *new_ire, bool atomic)
{
- ASSERT(spin_is_locked(&iommu->intremap.lock));
+ __uint128_t ret;
+ struct iremap_entry old_ire;
- if ( cpu_has_cx16 )
- {
- __uint128_t ret;
- struct iremap_entry old_ire;
+ ASSERT(spin_is_locked(&iommu->intremap.lock));
- old_ire = *entry;
- ret = cmpxchg16b(entry, &old_ire, new_ire);
+ old_ire = *entry;
+ ret = cmpxchg16b(entry, &old_ire, new_ire);
- /*
- * In the above, we use cmpxchg16 to atomically update the 128-bit
- * IRTE, and the hardware cannot update the IRTE behind us, so
- * the return value of cmpxchg16 should be the same as old_ire.
- * This ASSERT validate it.
- */
- ASSERT(ret == old_ire.val);
- }
- else
- {
- /*
- * VT-d hardware doesn't update IRTEs behind us, nor the software
- * since we hold iremap_lock. If the caller wants VT-d hardware to
- * always see a consistent entry, but we can't meet it, a bug will
- * be raised.
- */
- if ( entry->lo == new_ire->lo )
- write_atomic(&entry->hi, new_ire->hi);
- else if ( entry->hi == new_ire->hi )
- write_atomic(&entry->lo, new_ire->lo);
- else if ( !atomic )
- *entry = *new_ire;
- else
- BUG();
- }
+ /*
+ * In the above, we use cmpxchg16 to atomically update the 128-bit
+ * IRTE, and the hardware cannot update the IRTE behind us, so
+ * the return value of cmpxchg16 should be the same as old_ire.
+ * This ASSERT validate it.
+ */
+ ASSERT(ret == old_ire.val);
}
/* Mark specified intr remap entry as free */
@@ -408,7 +385,6 @@ static int ioapic_rte_to_remap_entry(struct vtd_iommu *iommu,
/* Indicate remap format. */
remap_rte->format = 1;
- /* If cmpxchg16b is not available the caller must mask the IO-APIC pin. */
update_irte(iommu, iremap_entry, &new_ire, !init && !masked);
iommu_sync_cache(iremap_entry, sizeof(*iremap_entry));
iommu_flush_iec_index(iommu, 0, index);
@@ -447,38 +423,15 @@ void cf_check io_apic_write_remap_rte(
{
struct IO_xAPIC_route_entry old_rte = {}, new_rte;
struct vtd_iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
- bool masked = true;
int rc;
- if ( !cpu_has_cx16 )
- {
- /*
- * Cannot atomically update the IRTE entry: mask the IO-APIC pin to
- * avoid interrupts seeing an inconsistent IRTE entry.
- */
- old_rte = __ioapic_read_entry(apic, pin, true);
- if ( !old_rte.mask )
- {
- masked = false;
- old_rte.mask = 1;
- __ioapic_write_entry(apic, pin, true, old_rte);
- }
- }
-
/* Not the initializer, for old gcc to cope. */
new_rte.raw = rte;
rc = ioapic_rte_to_remap_entry(iommu, apic, pin, &old_rte, new_rte);
if ( rc )
- {
- if ( !masked )
- {
- /* Recover the original value of 'mask' bit */
- old_rte.mask = 0;
- __ioapic_write_entry(apic, pin, true, old_rte);
- }
return;
- }
+
/* old_rte will contain the updated IO-APIC RTE on success. */
__ioapic_write_entry(apic, pin, true, old_rte);
}
--
2.46.0
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH v2 2/5] iommu/vtd: remove non-CX16 logic from interrupt remapping
2025-01-24 12:01 ` [PATCH v2 2/5] iommu/vtd: remove non-CX16 logic from interrupt remapping Roger Pau Monne
@ 2025-01-27 9:52 ` Jan Beulich
2025-01-27 10:04 ` Roger Pau Monné
0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2025-01-27 9:52 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Teddy Astie, Andrew Cooper, xen-devel
On 24.01.2025 13:01, Roger Pau Monne wrote:
> From: Teddy Astie <teddy.astie@vates.tech>
>
> As CX16 support is mandatory for IOMMU usage, the checks for CX16 in the
> interrupt remapping code are stale. Remove them together with the
> associated code introduced in case CX16 was not available.
Perhaps insert "now" before "mandatory" (then also in patch 3)?
Jan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/5] iommu/vtd: remove non-CX16 logic from interrupt remapping
2025-01-27 9:52 ` Jan Beulich
@ 2025-01-27 10:04 ` Roger Pau Monné
0 siblings, 0 replies; 13+ messages in thread
From: Roger Pau Monné @ 2025-01-27 10:04 UTC (permalink / raw)
To: Jan Beulich; +Cc: Teddy Astie, Andrew Cooper, xen-devel
On Mon, Jan 27, 2025 at 10:52:47AM +0100, Jan Beulich wrote:
> On 24.01.2025 13:01, Roger Pau Monne wrote:
> > From: Teddy Astie <teddy.astie@vates.tech>
> >
> > As CX16 support is mandatory for IOMMU usage, the checks for CX16 in the
> > interrupt remapping code are stale. Remove them together with the
> > associated code introduced in case CX16 was not available.
>
> Perhaps insert "now" before "mandatory" (then also in patch 3)?
Sure.
Thanks, Roger.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 3/5] x86/iommu: remove non-CX16 logic from DMA remapping
2025-01-24 12:01 [PATCH v2 0/5] x86/iommu: make CX16 mandatory for IOMMU Roger Pau Monne
2025-01-24 12:01 ` [PATCH v2 1/5] x86/iommu: check for CMPXCHG16B when enabling IOMMU Roger Pau Monne
2025-01-24 12:01 ` [PATCH v2 2/5] iommu/vtd: remove non-CX16 logic from interrupt remapping Roger Pau Monne
@ 2025-01-24 12:01 ` Roger Pau Monne
2025-01-24 12:01 ` [PATCH v2 4/5] iommu/vtd: cleanup MAP_SINGLE_DEVICE and related code Roger Pau Monne
` (2 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Roger Pau Monne @ 2025-01-24 12:01 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
From: Teddy Astie <teddy.astie@vates.tech>
As CX16 support is mandatory for IOMMU usage, the checks for CX16 in the
DMA remapping code are stale. Remove them together with the associated
code introduced in case CX16 was not available.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/drivers/passthrough/amd/iommu_map.c | 42 +++++----------
xen/drivers/passthrough/vtd/iommu.c | 71 ++++++-------------------
2 files changed, 31 insertions(+), 82 deletions(-)
diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 1f0e4167566b..dde393645a62 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -167,15 +167,14 @@ int amd_iommu_set_root_page_table(struct amd_iommu_dte *dte,
{
bool valid = flags & SET_ROOT_VALID;
- if ( dte->v && dte->tv &&
- (cpu_has_cx16 || (flags & SET_ROOT_WITH_UNITY_MAP)) )
+ if ( dte->v && dte->tv )
{
union {
struct amd_iommu_dte dte;
uint64_t raw64[4];
__uint128_t raw128[2];
} ldte = { .dte = *dte };
- __uint128_t old = ldte.raw128[0];
+ __uint128_t res, old = ldte.raw128[0];
int ret = 0;
ldte.dte.domain_id = domain_id;
@@ -185,33 +184,20 @@ int amd_iommu_set_root_page_table(struct amd_iommu_dte *dte,
ldte.dte.paging_mode = paging_mode;
ldte.dte.v = valid;
- if ( cpu_has_cx16 )
- {
- __uint128_t res = cmpxchg16b(dte, &old, &ldte.raw128[0]);
+ res = cmpxchg16b(dte, &old, &ldte.raw128[0]);
- /*
- * Hardware does not update the DTE behind our backs, so the
- * return value should match "old".
- */
- if ( res != old )
- {
- printk(XENLOG_ERR
- "Dom%d: unexpected DTE %016lx_%016lx (expected %016lx_%016lx)\n",
- domain_id,
- (uint64_t)(res >> 64), (uint64_t)res,
- (uint64_t)(old >> 64), (uint64_t)old);
- ret = -EILSEQ;
- }
- }
- else /* Best effort, updating domain_id last. */
+ /*
+ * Hardware does not update the DTE behind our backs, so the
+ * return value should match "old".
+ */
+ if ( res != old )
{
- uint64_t *ptr = (void *)dte;
-
- write_atomic(ptr + 0, ldte.raw64[0]);
- /* No barrier should be needed between these two. */
- write_atomic(ptr + 1, ldte.raw64[1]);
-
- ret = 1;
+ printk(XENLOG_ERR
+ "Dom%d: unexpected DTE %016lx_%016lx (expected %016lx_%016lx)\n",
+ domain_id,
+ (uint64_t)(res >> 64), (uint64_t)res,
+ (uint64_t)(old >> 64), (uint64_t)old);
+ ret = -EILSEQ;
}
return ret;
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 3daedc4f5593..b0963bfcf74e 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1485,7 +1485,7 @@ int domain_context_mapping_one(
{
struct domain_iommu *hd = dom_iommu(domain);
struct context_entry *context, *context_entries, lctxt;
- __uint128_t old;
+ __uint128_t res, old;
uint64_t maddr;
uint16_t seg = iommu->drhd->segment, prev_did = 0;
struct domain *prev_dom = NULL;
@@ -1583,55 +1583,23 @@ int domain_context_mapping_one(
ASSERT(!context_fault_disable(lctxt));
}
- if ( cpu_has_cx16 )
- {
- __uint128_t res = cmpxchg16b(context, &old, &lctxt.full);
-
- /*
- * Hardware does not update the context entry behind our backs,
- * so the return value should match "old".
- */
- if ( res != old )
- {
- if ( pdev )
- check_cleanup_domid_map(domain, pdev, iommu);
- printk(XENLOG_ERR
- "%pp: unexpected context entry %016lx_%016lx (expected %016lx_%016lx)\n",
- &PCI_SBDF(seg, bus, devfn),
- (uint64_t)(res >> 64), (uint64_t)res,
- (uint64_t)(old >> 64), (uint64_t)old);
- rc = -EILSEQ;
- goto unlock;
- }
- }
- else if ( !prev_dom || !(mode & MAP_WITH_RMRR) )
- {
- context_clear_present(*context);
- iommu_sync_cache(context, sizeof(*context));
+ res = cmpxchg16b(context, &old, &lctxt.full);
- write_atomic(&context->hi, lctxt.hi);
- /* No barrier should be needed between these two. */
- write_atomic(&context->lo, lctxt.lo);
- }
- else /* Best effort, updating DID last. */
+ /*
+ * Hardware does not update the context entry behind our backs,
+ * so the return value should match "old".
+ */
+ if ( res != old )
{
- /*
- * By non-atomically updating the context entry's DID field last,
- * during a short window in time TLB entries with the old domain ID
- * but the new page tables may be inserted. This could affect I/O
- * of other devices using this same (old) domain ID. Such updating
- * therefore is not a problem if this was the only device associated
- * with the old domain ID. Diverting I/O of any of a dying domain's
- * devices to the quarantine page tables is intended anyway.
- */
- if ( !(mode & (MAP_OWNER_DYING | MAP_SINGLE_DEVICE)) )
- printk(XENLOG_WARNING VTDPREFIX
- " %pp: reassignment may cause %pd data corruption\n",
- &PCI_SBDF(seg, bus, devfn), prev_dom);
-
- write_atomic(&context->lo, lctxt.lo);
- /* No barrier should be needed between these two. */
- write_atomic(&context->hi, lctxt.hi);
+ if ( pdev )
+ check_cleanup_domid_map(domain, pdev, iommu);
+ printk(XENLOG_ERR
+ "%pp: unexpected context entry %016lx_%016lx (expected %016lx_%016lx)\n",
+ &PCI_SBDF(seg, bus, devfn),
+ (uint64_t)(res >> 64), (uint64_t)res,
+ (uint64_t)(old >> 64), (uint64_t)old);
+ rc = -EILSEQ;
+ goto unlock;
}
iommu_sync_cache(context, sizeof(struct context_entry));
@@ -2702,12 +2670,7 @@ static int __init cf_check vtd_setup(void)
iommu_intremap = iommu_intremap_off;
#ifndef iommu_intpost
- /*
- * We cannot use posted interrupt if X86_FEATURE_CX16 is
- * not supported, since we count on this feature to
- * atomically update 16-byte IRTE in posted format.
- */
- if ( !cap_intr_post(iommu->cap) || !iommu_intremap || !cpu_has_cx16 )
+ if ( !cap_intr_post(iommu->cap) || !iommu_intremap )
iommu_intpost = false;
#endif
--
2.46.0
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH v2 4/5] iommu/vtd: cleanup MAP_SINGLE_DEVICE and related code
2025-01-24 12:01 [PATCH v2 0/5] x86/iommu: make CX16 mandatory for IOMMU Roger Pau Monne
` (2 preceding siblings ...)
2025-01-24 12:01 ` [PATCH v2 3/5] x86/iommu: remove non-CX16 logic from DMA remapping Roger Pau Monne
@ 2025-01-24 12:01 ` Roger Pau Monne
2025-01-24 12:01 ` [PATCH v2 5/5] iommu/amd: atomically update IRTE Roger Pau Monne
2025-01-24 14:24 ` [PATCH for-4.20 v2 0/5] x86/iommu: make CX16 mandatory for IOMMU Andrew Cooper
5 siblings, 0 replies; 13+ messages in thread
From: Roger Pau Monne @ 2025-01-24 12:01 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
From: Teddy Astie <teddy.astie@vates.tech>
This flag was only used in case cx16 is not available, as those code paths no
longer exist, this flag now does basically nothing.
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/drivers/passthrough/vtd/iommu.c | 12 +++---------
xen/drivers/passthrough/vtd/vtd.h | 5 ++---
2 files changed, 5 insertions(+), 12 deletions(-)
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index b0963bfcf74e..9d7a9977a6a6 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1695,15 +1695,9 @@ static int domain_context_mapping(struct domain *domain, u8 devfn,
break;
}
- if ( domain != pdev->domain && pdev->domain != dom_io )
- {
- if ( pdev->domain->is_dying )
- mode |= MAP_OWNER_DYING;
- else if ( drhd &&
- !any_pdev_behind_iommu(pdev->domain, pdev, drhd->iommu) &&
- !pdev->phantom_stride )
- mode |= MAP_SINGLE_DEVICE;
- }
+ if ( domain != pdev->domain && pdev->domain != dom_io &&
+ pdev->domain->is_dying )
+ mode |= MAP_OWNER_DYING;
switch ( pdev->type )
{
diff --git a/xen/drivers/passthrough/vtd/vtd.h b/xen/drivers/passthrough/vtd/vtd.h
index 8aeff8c1f287..b95124517bd3 100644
--- a/xen/drivers/passthrough/vtd/vtd.h
+++ b/xen/drivers/passthrough/vtd/vtd.h
@@ -28,9 +28,8 @@
*/
#define MAP_WITH_RMRR (1u << 0)
#define MAP_OWNER_DYING (1u << 1)
-#define MAP_SINGLE_DEVICE (1u << 2)
-#define MAP_ERROR_RECOVERY (1u << 3)
-#define UNMAP_ME_PHANTOM_FUNC (1u << 4)
+#define MAP_ERROR_RECOVERY (1u << 2)
+#define UNMAP_ME_PHANTOM_FUNC (1u << 3)
/* Allow for both IOAPIC and IOSAPIC. */
#define IO_xAPIC_route_entry IO_APIC_route_entry
--
2.46.0
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH v2 5/5] iommu/amd: atomically update IRTE
2025-01-24 12:01 [PATCH v2 0/5] x86/iommu: make CX16 mandatory for IOMMU Roger Pau Monne
` (3 preceding siblings ...)
2025-01-24 12:01 ` [PATCH v2 4/5] iommu/vtd: cleanup MAP_SINGLE_DEVICE and related code Roger Pau Monne
@ 2025-01-24 12:01 ` Roger Pau Monne
2025-01-24 14:24 ` [PATCH for-4.20 v2 0/5] x86/iommu: make CX16 mandatory for IOMMU Andrew Cooper
5 siblings, 0 replies; 13+ messages in thread
From: Roger Pau Monne @ 2025-01-24 12:01 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Either when using a 32bit Interrupt Remapping Entry or a 128bit one update
the entry atomically, by using cmpxchg unconditionally as IOMMU depends on
it. No longer disable the entry by setting RemapEn = 0 ahead of updating
it. As a consequence of not toggling RemapEn ahead of the update the
Interrupt Remapping Table needs to be flushed after the entry update.
This avoids a window where the IRTE has RemapEn = 0, which can lead to
IO_PAGE_FAULT if the underlying interrupt source is not masked.
There's no guidance in AMD-Vi specification about how IRTE update should be
performed as opposed to DTE updating which has specific guidance. However
DTE updating claims that reads will always be at least 128bits in size, and
hence for the purposes here assume that reads and caching of the IRTE
entries in either 32 or 128 bit format will be done atomically from
the IOMMU.
Note that as part of introducing a new raw128 field in the IRTE struct, the
current raw field is renamed to raw64 to explicitly contain the size in the
field name.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
- cmpxchg is now always available if the IOMMU is enabled.
---
xen/drivers/passthrough/amd/iommu_intr.c | 75 ++++++++++--------------
1 file changed, 32 insertions(+), 43 deletions(-)
diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c
index f07fd9e3d970..c0273059cb1d 100644
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -39,7 +39,8 @@ union irte32 {
};
union irte128 {
- uint64_t raw[2];
+ uint64_t raw64[2];
+ __uint128_t raw128;
struct {
bool remap_en:1;
bool sup_io_pf:1;
@@ -187,7 +188,7 @@ static void free_intremap_entry(const struct amd_iommu *iommu,
if ( iommu->ctrl.ga_en )
{
- ACCESS_ONCE(entry.ptr128->raw[0]) = 0;
+ ACCESS_ONCE(entry.ptr128->raw64[0]) = 0;
/*
* Low half (containing RemapEn) needs to be cleared first. Note that
* strictly speaking smp_wmb() isn't enough, as conceptually it expands
@@ -197,7 +198,7 @@ static void free_intremap_entry(const struct amd_iommu *iommu,
* variant will do.
*/
smp_wmb();
- entry.ptr128->raw[1] = 0;
+ entry.ptr128->raw64[1] = 0;
}
else
ACCESS_ONCE(entry.ptr32->raw) = 0;
@@ -212,7 +213,7 @@ static void update_intremap_entry(const struct amd_iommu *iommu,
{
if ( iommu->ctrl.ga_en )
{
- union irte128 irte = {
+ const union irte128 irte = {
.full = {
.remap_en = true,
.int_type = int_type,
@@ -222,19 +223,26 @@ static void update_intremap_entry(const struct amd_iommu *iommu,
.vector = vector,
},
};
+ __uint128_t old = entry.ptr128->raw128;
+ __uint128_t res = cmpxchg16b(&entry.ptr128->raw128, &old,
+ &irte.raw128);
- ASSERT(!entry.ptr128->full.remap_en);
- entry.ptr128->raw[1] = irte.raw[1];
/*
- * High half needs to be set before low one (containing RemapEn). See
- * comment in free_intremap_entry() regarding the choice of barrier.
+ * Hardware does not update the IRTE behind our backs, so the return
+ * value should match "old".
*/
- smp_wmb();
- ACCESS_ONCE(entry.ptr128->raw[0]) = irte.raw[0];
+ if ( res != old )
+ {
+ printk(XENLOG_ERR
+ "unexpected IRTE %016lx_%016lx (expected %016lx_%016lx)\n",
+ (uint64_t)(res >> 64), (uint64_t)res,
+ (uint64_t)(old >> 64), (uint64_t)old);
+ ASSERT_UNREACHABLE();
+ }
}
else
{
- union irte32 irte = {
+ const union irte32 irte = {
.flds = {
.remap_en = true,
.int_type = int_type,
@@ -299,21 +307,13 @@ static int update_intremap_entry_from_ioapic(
entry = get_intremap_entry(iommu, req_id, offset);
- /* The RemapEn fields match for all formats. */
- while ( iommu->enabled && entry.ptr32->flds.remap_en )
- {
- entry.ptr32->flds.remap_en = false;
- spin_unlock(lock);
-
- amd_iommu_flush_intremap(iommu, req_id);
-
- spin_lock(lock);
- }
-
update_intremap_entry(iommu, entry, vector, delivery_mode, dest_mode, dest);
spin_unlock_irqrestore(lock, flags);
+ if ( !fresh )
+ amd_iommu_flush_intremap(iommu, req_id);
+
set_rte_index(rte, offset);
return 0;
@@ -322,7 +322,7 @@ static int update_intremap_entry_from_ioapic(
void cf_check amd_iommu_ioapic_update_ire(
unsigned int apic, unsigned int pin, uint64_t rte)
{
- struct IO_APIC_route_entry old_rte, new_rte;
+ struct IO_APIC_route_entry new_rte;
int seg, bdf, rc;
struct amd_iommu *iommu;
unsigned int idx;
@@ -346,14 +346,6 @@ void cf_check amd_iommu_ioapic_update_ire(
return;
}
- old_rte = __ioapic_read_entry(apic, pin, true);
- /* mask the interrupt while we change the intremap table */
- if ( !old_rte.mask )
- {
- old_rte.mask = 1;
- __ioapic_write_entry(apic, pin, true, old_rte);
- }
-
/* Update interrupt remapping entry */
rc = update_intremap_entry_from_ioapic(
bdf, iommu, &new_rte,
@@ -425,6 +417,7 @@ static int update_intremap_entry_from_msi_msg(
uint8_t delivery_mode, vector, dest_mode;
spinlock_t *lock;
unsigned int dest, offset, i;
+ bool fresh = false;
req_id = get_dma_requestor_id(iommu->seg, bdf);
alias_id = get_intremap_requestor_id(iommu->seg, bdf);
@@ -468,26 +461,21 @@ static int update_intremap_entry_from_msi_msg(
return -ENOSPC;
}
*remap_index = offset;
+ fresh = true;
}
entry = get_intremap_entry(iommu, req_id, offset);
- /* The RemapEn fields match for all formats. */
- while ( iommu->enabled && entry.ptr32->flds.remap_en )
- {
- entry.ptr32->flds.remap_en = false;
- spin_unlock(lock);
+ update_intremap_entry(iommu, entry, vector, delivery_mode, dest_mode, dest);
+ spin_unlock_irqrestore(lock, flags);
+ if ( !fresh )
+ {
amd_iommu_flush_intremap(iommu, req_id);
if ( alias_id != req_id )
amd_iommu_flush_intremap(iommu, alias_id);
-
- spin_lock(lock);
}
- update_intremap_entry(iommu, entry, vector, delivery_mode, dest_mode, dest);
- spin_unlock_irqrestore(lock, flags);
-
*data = (msg->data & ~(INTREMAP_MAX_ENTRIES - 1)) | offset;
/*
@@ -735,7 +723,7 @@ static void dump_intremap_table(const struct amd_iommu *iommu,
for ( count = 0; count < nr; count++ )
{
if ( iommu->ctrl.ga_en
- ? !tbl.ptr128[count].raw[0] && !tbl.ptr128[count].raw[1]
+ ? !tbl.ptr128[count].raw64[0] && !tbl.ptr128[count].raw64[1]
: !tbl.ptr32[count].raw )
continue;
@@ -748,7 +736,8 @@ static void dump_intremap_table(const struct amd_iommu *iommu,
if ( iommu->ctrl.ga_en )
printk(" IRTE[%03x] %016lx_%016lx\n",
- count, tbl.ptr128[count].raw[1], tbl.ptr128[count].raw[0]);
+ count, tbl.ptr128[count].raw64[1],
+ tbl.ptr128[count].raw64[0]);
else
printk(" IRTE[%03x] %08x\n", count, tbl.ptr32[count].raw);
}
--
2.46.0
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH for-4.20 v2 0/5] x86/iommu: make CX16 mandatory for IOMMU
2025-01-24 12:01 [PATCH v2 0/5] x86/iommu: make CX16 mandatory for IOMMU Roger Pau Monne
` (4 preceding siblings ...)
2025-01-24 12:01 ` [PATCH v2 5/5] iommu/amd: atomically update IRTE Roger Pau Monne
@ 2025-01-24 14:24 ` Andrew Cooper
2025-01-24 14:27 ` Roger Pau Monné
5 siblings, 1 reply; 13+ messages in thread
From: Andrew Cooper @ 2025-01-24 14:24 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel; +Cc: Jan Beulich, Oleksii Kurochko
On 24/01/2025 12:01 pm, Roger Pau Monne wrote:
> Hello,
>
> The following series is the original CX16 series sent by Teddy, with the
> CX16 checks split into a separate patch, plus one extra patch to switch
> AMD-Vi to use CMPXCHG16B when updating Interrupt Remapping Entries.
>
> Note that last patch to use CMPXCHG16B fixes a real bug with AMD
> hardware.
>
> Thanks, Roger.
>
> Roger Pau Monne (1):
> iommu/amd: atomically update IRTE
>
> Teddy Astie (4):
> x86/iommu: check for CMPXCHG16B when enabling IOMMU
> iommu/vtd: remove non-CX16 logic from interrupt remapping
> x86/iommu: remove non-CX16 logic from DMA remapping
> iommu/vtd: cleanup MAP_SINGLE_DEVICE and related code
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC Oleksii. Patch 5 is a real bugfix that needs backporting, and the
prior patches have been in an almost-ready state for more than a release
now.
~Andrew
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH for-4.20 v2 0/5] x86/iommu: make CX16 mandatory for IOMMU
2025-01-24 14:24 ` [PATCH for-4.20 v2 0/5] x86/iommu: make CX16 mandatory for IOMMU Andrew Cooper
@ 2025-01-24 14:27 ` Roger Pau Monné
2025-01-27 10:38 ` Oleksii Kurochko
0 siblings, 1 reply; 13+ messages in thread
From: Roger Pau Monné @ 2025-01-24 14:27 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Jan Beulich, Oleksii Kurochko
On Fri, Jan 24, 2025 at 02:24:34PM +0000, Andrew Cooper wrote:
> On 24/01/2025 12:01 pm, Roger Pau Monne wrote:
> > Hello,
> >
> > The following series is the original CX16 series sent by Teddy, with the
> > CX16 checks split into a separate patch, plus one extra patch to switch
> > AMD-Vi to use CMPXCHG16B when updating Interrupt Remapping Entries.
> >
> > Note that last patch to use CMPXCHG16B fixes a real bug with AMD
> > hardware.
> >
> > Thanks, Roger.
> >
> > Roger Pau Monne (1):
> > iommu/amd: atomically update IRTE
> >
> > Teddy Astie (4):
> > x86/iommu: check for CMPXCHG16B when enabling IOMMU
> > iommu/vtd: remove non-CX16 logic from interrupt remapping
> > x86/iommu: remove non-CX16 logic from DMA remapping
> > iommu/vtd: cleanup MAP_SINGLE_DEVICE and related code
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Thanks.
> CC Oleksii. Patch 5 is a real bugfix that needs backporting, and the
> prior patches have been in an almost-ready state for more than a release
> now.
I've split the checks into a pre-patch, and did a bit more cleanup of
code that was no longer needed (pre/post interrupt mask before IRTE
update), but overall the code is the same plus the extra fix.
Thanks, Roger.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-4.20 v2 0/5] x86/iommu: make CX16 mandatory for IOMMU
2025-01-24 14:27 ` Roger Pau Monné
@ 2025-01-27 10:38 ` Oleksii Kurochko
0 siblings, 0 replies; 13+ messages in thread
From: Oleksii Kurochko @ 2025-01-27 10:38 UTC (permalink / raw)
To: Roger Pau Monné, Andrew Cooper; +Cc: xen-devel, Jan Beulich
On 1/24/25 3:27 PM, Roger Pau Monné wrote:
> On Fri, Jan 24, 2025 at 02:24:34PM +0000, Andrew Cooper wrote:
>> On 24/01/2025 12:01 pm, Roger Pau Monne wrote:
>>> Hello,
>>>
>>> The following series is the original CX16 series sent by Teddy, with the
>>> CX16 checks split into a separate patch, plus one extra patch to switch
>>> AMD-Vi to use CMPXCHG16B when updating Interrupt Remapping Entries.
>>>
>>> Note that last patch to use CMPXCHG16B fixes a real bug with AMD
>>> hardware.
>>>
>>> Thanks, Roger.
>>>
>>> Roger Pau Monne (1):
>>> iommu/amd: atomically update IRTE
>>>
>>> Teddy Astie (4):
>>> x86/iommu: check for CMPXCHG16B when enabling IOMMU
>>> iommu/vtd: remove non-CX16 logic from interrupt remapping
>>> x86/iommu: remove non-CX16 logic from DMA remapping
>>> iommu/vtd: cleanup MAP_SINGLE_DEVICE and related code
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
As we disscussed in matrix, with proper review R-Acked-by: Oleksii
Kurochko <oleksii.kurochko@gmail.com>
~ Oleksii
> Thanks.
>
>> CC Oleksii. Patch 5 is a real bugfix that needs backporting, and the
>> prior patches have been in an almost-ready state for more than a release
>> now.
> I've split the checks into a pre-patch, and did a bit more cleanup of
> code that was no longer needed (pre/post interrupt mask before IRTE
> update), but overall the code is the same plus the extra fix.
>
> Thanks, Roger.
^ permalink raw reply [flat|nested] 13+ messages in thread