* [PATCH v2 01/12] VT-d: correct ATS checking for root complex integrated devices
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
@ 2024-02-15 10:13 ` Jan Beulich
2024-05-03 14:01 ` Roger Pau Monné
2024-02-15 10:14 ` [PATCH v2 02/12] VT-d: tidy error handling of RMRR parsing Jan Beulich
` (10 subsequent siblings)
11 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:13 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
Spec version 4.1 says
"The ATSR structures identifies PCI Express Root-Ports supporting
Address Translation Services (ATS) transactions. Software must enable
ATS on endpoint devices behind a Root Port only if the Root Port is
reported as supporting ATS transactions."
Clearly root complex integrated devices aren't "behind root ports",
matching my observation on a SapphireRapids system having an ATS-
capable root complex integrated device. Hence for such devices we
shouldn't try to locate a corresponding ATSR.
Since both pci_find_ext_capability() and pci_find_cap_offset() return
"unsigned int", change "pos" to that type at the same time.
Fixes: 903b93211f56 ("[VTD] laying the ground work for ATS")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -44,7 +44,7 @@ struct acpi_drhd_unit *find_ats_dev_drhd
int ats_device(const struct pci_dev *pdev, const struct acpi_drhd_unit *drhd)
{
struct acpi_drhd_unit *ats_drhd;
- int pos;
+ unsigned int pos, expfl = 0;
if ( !ats_enabled || !iommu_qinval )
return 0;
@@ -53,7 +53,12 @@ int ats_device(const struct pci_dev *pde
!ecap_dev_iotlb(drhd->iommu->ecap) )
return 0;
- if ( !acpi_find_matched_atsr_unit(pdev) )
+ pos = pci_find_cap_offset(pdev->sbdf, PCI_CAP_ID_EXP);
+ if ( pos )
+ expfl = pci_conf_read16(pdev->sbdf, pos + PCI_EXP_FLAGS);
+
+ if ( MASK_EXTR(expfl, PCI_EXP_FLAGS_TYPE) != PCI_EXP_TYPE_RC_END &&
+ !acpi_find_matched_atsr_unit(pdev) )
return 0;
ats_drhd = find_ats_dev_drhd(drhd->iommu);
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 01/12] VT-d: correct ATS checking for root complex integrated devices
2024-02-15 10:13 ` [PATCH v2 01/12] VT-d: correct ATS checking for root complex integrated devices Jan Beulich
@ 2024-05-03 14:01 ` Roger Pau Monné
0 siblings, 0 replies; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-03 14:01 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On Thu, Feb 15, 2024 at 11:13:24AM +0100, Jan Beulich wrote:
> Spec version 4.1 says
>
> "The ATSR structures identifies PCI Express Root-Ports supporting
> Address Translation Services (ATS) transactions. Software must enable
> ATS on endpoint devices behind a Root Port only if the Root Port is
> reported as supporting ATS transactions."
>
> Clearly root complex integrated devices aren't "behind root ports",
> matching my observation on a SapphireRapids system having an ATS-
> capable root complex integrated device. Hence for such devices we
> shouldn't try to locate a corresponding ATSR.
>
> Since both pci_find_ext_capability() and pci_find_cap_offset() return
> "unsigned int", change "pos" to that type at the same time.
>
> Fixes: 903b93211f56 ("[VTD] laying the ground work for ATS")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> v2: New.
>
> --- a/xen/drivers/passthrough/vtd/x86/ats.c
> +++ b/xen/drivers/passthrough/vtd/x86/ats.c
> @@ -44,7 +44,7 @@ struct acpi_drhd_unit *find_ats_dev_drhd
> int ats_device(const struct pci_dev *pdev, const struct acpi_drhd_unit *drhd)
> {
> struct acpi_drhd_unit *ats_drhd;
> - int pos;
> + unsigned int pos, expfl = 0;
>
> if ( !ats_enabled || !iommu_qinval )
> return 0;
> @@ -53,7 +53,12 @@ int ats_device(const struct pci_dev *pde
> !ecap_dev_iotlb(drhd->iommu->ecap) )
> return 0;
>
> - if ( !acpi_find_matched_atsr_unit(pdev) )
> + pos = pci_find_cap_offset(pdev->sbdf, PCI_CAP_ID_EXP);
> + if ( pos )
> + expfl = pci_conf_read16(pdev->sbdf, pos + PCI_EXP_FLAGS);
> +
> + if ( MASK_EXTR(expfl, PCI_EXP_FLAGS_TYPE) != PCI_EXP_TYPE_RC_END &&
> + !acpi_find_matched_atsr_unit(pdev) )
Given the spec quote above, it might also be helpful to check that the
type is PCI_EXP_TYPE_ENDPOINT before attempting the ATSR check?
I would assume a well formed ATSR won't list non-endpoint devices.
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v2 02/12] VT-d: tidy error handling of RMRR parsing
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
2024-02-15 10:13 ` [PATCH v2 01/12] VT-d: correct ATS checking for root complex integrated devices Jan Beulich
@ 2024-02-15 10:14 ` Jan Beulich
2024-05-06 9:12 ` Roger Pau Monné
2024-02-15 10:14 ` [PATCH v2 03/12] VT-d: parse ACPI "SoC Integrated Address Translation Cache Reporting Structure"s Jan Beulich
` (9 subsequent siblings)
11 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:14 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
It's acpi_parse_one_rmrr() where the allocation is coming from (by way
of invoking acpi_parse_dev_scope()), or in add_one_user_rmrr()'s case
allocation is even open-coded there, so freeing would better also happen
there. Care needs to be taken to preserve acpi_parse_one_rmrr()'s
ultimate return value.
While fiddling with callers also move scope_devices_free() to .init and
have it use XFREE() instead of open-coding it.
In register_one_rmrr() also have the "ignore" path take the main
function return path.
Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -82,14 +82,13 @@ static int __init acpi_register_rmrr_uni
return 0;
}
-static void scope_devices_free(struct dmar_scope *scope)
+static void __init scope_devices_free(struct dmar_scope *scope)
{
if ( !scope )
return;
scope->devices_cnt = 0;
- xfree(scope->devices);
- scope->devices = NULL;
+ XFREE(scope->devices);
}
static void __init disable_all_dmar_units(void)
@@ -595,17 +594,13 @@ static int register_one_rmrr(struct acpi
" Ignore RMRR [%"PRIx64",%"PRIx64"] as no device"
" under its scope is PCI discoverable!\n",
rmrru->base_address, rmrru->end_address);
- scope_devices_free(&rmrru->scope);
- xfree(rmrru);
- return 1;
+ ret = 1;
}
else if ( rmrru->base_address > rmrru->end_address )
{
dprintk(XENLOG_WARNING VTDPREFIX,
" RMRR [%"PRIx64",%"PRIx64"] is incorrect!\n",
rmrru->base_address, rmrru->end_address);
- scope_devices_free(&rmrru->scope);
- xfree(rmrru);
ret = -EFAULT;
}
else
@@ -660,21 +655,20 @@ acpi_parse_one_rmrr(struct acpi_dmar_hea
&rmrru->scope, RMRR_TYPE, rmrr->segment);
if ( !ret && (rmrru->scope.devices_cnt != 0) )
- {
ret = register_one_rmrr(rmrru);
- /*
- * register_one_rmrr() returns greater than 0 when a specified
- * PCIe device cannot be detected. To prevent VT-d from being
- * disabled in such cases, reset the return value to 0 here.
- */
- if ( ret > 0 )
- ret = 0;
- }
- else
+ if ( ret )
+ {
+ scope_devices_free(&rmrru->scope);
xfree(rmrru);
+ }
- return ret;
+ /*
+ * register_one_rmrr() returns greater than 0 when a specified PCIe
+ * device cannot be detected. To prevent VT-d from being disabled in
+ * such cases, make the return value 0 here.
+ */
+ return ret > 0 ? 0 : ret;
}
static int __init
@@ -945,9 +939,13 @@ static int __init add_one_user_rmrr(unsi
rmrr->scope.devices_cnt = dev_count;
if ( register_one_rmrr(rmrr) )
+ {
printk(XENLOG_ERR VTDPREFIX
"Could not register RMMR range "ERMRRU_FMT"\n",
ERMRRU_ARG);
+ scope_devices_free(&rmrr->scope);
+ xfree(rmrr);
+ }
return 1;
}
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 02/12] VT-d: tidy error handling of RMRR parsing
2024-02-15 10:14 ` [PATCH v2 02/12] VT-d: tidy error handling of RMRR parsing Jan Beulich
@ 2024-05-06 9:12 ` Roger Pau Monné
2024-05-06 9:21 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-06 9:12 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On Thu, Feb 15, 2024 at 11:14:02AM +0100, Jan Beulich wrote:
> It's acpi_parse_one_rmrr() where the allocation is coming from (by way
> of invoking acpi_parse_dev_scope()), or in add_one_user_rmrr()'s case
> allocation is even open-coded there, so freeing would better also happen
> there. Care needs to be taken to preserve acpi_parse_one_rmrr()'s
> ultimate return value.
>
> While fiddling with callers also move scope_devices_free() to .init and
> have it use XFREE() instead of open-coding it.
>
> In register_one_rmrr() also have the "ignore" path take the main
> function return path.
>
> Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> v2: New.
>
> --- a/xen/drivers/passthrough/vtd/dmar.c
> +++ b/xen/drivers/passthrough/vtd/dmar.c
> @@ -82,14 +82,13 @@ static int __init acpi_register_rmrr_uni
> return 0;
> }
>
> -static void scope_devices_free(struct dmar_scope *scope)
> +static void __init scope_devices_free(struct dmar_scope *scope)
> {
> if ( !scope )
> return;
>
> scope->devices_cnt = 0;
> - xfree(scope->devices);
> - scope->devices = NULL;
> + XFREE(scope->devices);
> }
>
> static void __init disable_all_dmar_units(void)
> @@ -595,17 +594,13 @@ static int register_one_rmrr(struct acpi
register_one_rmrr() could also be made __init AFAICT? (even before
this patch)
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 02/12] VT-d: tidy error handling of RMRR parsing
2024-05-06 9:12 ` Roger Pau Monné
@ 2024-05-06 9:21 ` Jan Beulich
2024-05-06 9:26 ` Roger Pau Monné
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-05-06 9:21 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On 06.05.2024 11:12, Roger Pau Monné wrote:
> On Thu, Feb 15, 2024 at 11:14:02AM +0100, Jan Beulich wrote:
>> It's acpi_parse_one_rmrr() where the allocation is coming from (by way
>> of invoking acpi_parse_dev_scope()), or in add_one_user_rmrr()'s case
>> allocation is even open-coded there, so freeing would better also happen
>> there. Care needs to be taken to preserve acpi_parse_one_rmrr()'s
>> ultimate return value.
>>
>> While fiddling with callers also move scope_devices_free() to .init and
>> have it use XFREE() instead of open-coding it.
>>
>> In register_one_rmrr() also have the "ignore" path take the main
>> function return path.
>>
>> Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks.
>> --- a/xen/drivers/passthrough/vtd/dmar.c
>> +++ b/xen/drivers/passthrough/vtd/dmar.c
>> @@ -82,14 +82,13 @@ static int __init acpi_register_rmrr_uni
>> return 0;
>> }
>>
>> -static void scope_devices_free(struct dmar_scope *scope)
>> +static void __init scope_devices_free(struct dmar_scope *scope)
>> {
>> if ( !scope )
>> return;
>>
>> scope->devices_cnt = 0;
>> - xfree(scope->devices);
>> - scope->devices = NULL;
>> + XFREE(scope->devices);
>> }
>>
>> static void __init disable_all_dmar_units(void)
>> @@ -595,17 +594,13 @@ static int register_one_rmrr(struct acpi
>
> register_one_rmrr() could also be made __init AFAICT? (even before
> this patch)
Indeed, all the more when it calls acpi_register_rmrr_unit(), which is
__init. With scope_devices_free() becoming __init here, it would seem
quite logical to fold that adjustment right into here. I'll do so,
unless you'd indicate that this would then invalidate your R-b.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 02/12] VT-d: tidy error handling of RMRR parsing
2024-05-06 9:21 ` Jan Beulich
@ 2024-05-06 9:26 ` Roger Pau Monné
0 siblings, 0 replies; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-06 9:26 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On Mon, May 06, 2024 at 11:21:06AM +0200, Jan Beulich wrote:
> On 06.05.2024 11:12, Roger Pau Monné wrote:
> > On Thu, Feb 15, 2024 at 11:14:02AM +0100, Jan Beulich wrote:
> >> It's acpi_parse_one_rmrr() where the allocation is coming from (by way
> >> of invoking acpi_parse_dev_scope()), or in add_one_user_rmrr()'s case
> >> allocation is even open-coded there, so freeing would better also happen
> >> there. Care needs to be taken to preserve acpi_parse_one_rmrr()'s
> >> ultimate return value.
> >>
> >> While fiddling with callers also move scope_devices_free() to .init and
> >> have it use XFREE() instead of open-coding it.
> >>
> >> In register_one_rmrr() also have the "ignore" path take the main
> >> function return path.
> >>
> >> Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Thanks.
>
> >> --- a/xen/drivers/passthrough/vtd/dmar.c
> >> +++ b/xen/drivers/passthrough/vtd/dmar.c
> >> @@ -82,14 +82,13 @@ static int __init acpi_register_rmrr_uni
> >> return 0;
> >> }
> >>
> >> -static void scope_devices_free(struct dmar_scope *scope)
> >> +static void __init scope_devices_free(struct dmar_scope *scope)
> >> {
> >> if ( !scope )
> >> return;
> >>
> >> scope->devices_cnt = 0;
> >> - xfree(scope->devices);
> >> - scope->devices = NULL;
> >> + XFREE(scope->devices);
> >> }
> >>
> >> static void __init disable_all_dmar_units(void)
> >> @@ -595,17 +594,13 @@ static int register_one_rmrr(struct acpi
> >
> > register_one_rmrr() could also be made __init AFAICT? (even before
> > this patch)
>
> Indeed, all the more when it calls acpi_register_rmrr_unit(), which is
> __init. With scope_devices_free() becoming __init here, it would seem
> quite logical to fold that adjustment right into here. I'll do so,
> unless you'd indicate that this would then invalidate your R-b.
Sure, feel free to fold here.
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v2 03/12] VT-d: parse ACPI "SoC Integrated Address Translation Cache Reporting Structure"s
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
2024-02-15 10:13 ` [PATCH v2 01/12] VT-d: correct ATS checking for root complex integrated devices Jan Beulich
2024-02-15 10:14 ` [PATCH v2 02/12] VT-d: tidy error handling of RMRR parsing Jan Beulich
@ 2024-02-15 10:14 ` Jan Beulich
2024-05-06 10:29 ` Roger Pau Monné
2024-02-15 10:15 ` [PATCH v2 04/12] AMD/IOMMU: add helper to check whether ATS is to be used for a device Jan Beulich
` (8 subsequent siblings)
11 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:14 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
This is a prereq to us, in particular, respecting the "ATC required"
flag.
Note that ACPI_SATC_ATC_REQUIRED has its #define put in dmar.h, as we
try to keep actbl*.h in sync what Linux (who in turn inherit from ACPI
CA) has.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Lovely: On the SPR system with the SATC I tried passing "ats" (the
"required" flag is clear there), just to then hit "IOMMU#4: QI dev wait
descriptor taking too long" while setting up Dom0. The 2nd message there
doesn't ever appear, so the request never completes. Not sure whether
that's us doing something wrong or the hardware acting up. In the former
case I'd generally expect an IOMMU fault to be raised, though. FTR same
on 4.18 with just "VT-d: correct ATS checking for root complex
integrated devices" backported there.
Should we check scope entries for appropriate types? (If so, then also
for e.g. ATSR.)
---
v2: Move error case freeing to acpi_parse_one_satc(). Introduce #define
for the flag bit. Style.
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -47,6 +47,7 @@ LIST_HEAD_READ_MOSTLY(acpi_drhd_units);
LIST_HEAD_READ_MOSTLY(acpi_rmrr_units);
static LIST_HEAD_READ_MOSTLY(acpi_atsr_units);
static LIST_HEAD_READ_MOSTLY(acpi_rhsa_units);
+static LIST_HEAD_READ_MOSTLY(acpi_satc_units);
static struct acpi_table_header *__read_mostly dmar_table;
static int __read_mostly dmar_flags;
@@ -750,6 +751,93 @@ acpi_parse_one_rhsa(struct acpi_dmar_hea
return ret;
}
+static int __init register_one_satc(struct acpi_satc_unit *satcu)
+{
+ bool ignore = false;
+ unsigned int i = 0;
+ int ret = 0;
+
+ /* Skip checking if segment is not accessible yet. */
+ if ( !pci_known_segment(satcu->segment) )
+ i = UINT_MAX;
+
+ for ( ; i < satcu->scope.devices_cnt; i++ )
+ {
+ uint8_t b = PCI_BUS(satcu->scope.devices[i]);
+ uint8_t d = PCI_SLOT(satcu->scope.devices[i]);
+ uint8_t f = PCI_FUNC(satcu->scope.devices[i]);
+
+ if ( !pci_device_detect(satcu->segment, b, d, f) )
+ {
+ dprintk(XENLOG_WARNING VTDPREFIX,
+ " Non-existent device (%pp) is reported in SATC scope!\n",
+ &PCI_SBDF(satcu->segment, b, d, f));
+ ignore = true;
+ }
+ else
+ {
+ ignore = false;
+ break;
+ }
+ }
+
+ if ( ignore )
+ {
+ dprintk(XENLOG_WARNING VTDPREFIX,
+ " Ignore SATC for seg %04x as no device under its scope is PCI discoverable\n",
+ satcu->segment);
+ return 1;
+ }
+
+ if ( iommu_verbose )
+ printk(VTDPREFIX " ATC required: %d\n", satcu->atc_required);
+
+ list_add(&satcu->list, &acpi_satc_units);
+
+ return ret;
+}
+
+static int __init
+acpi_parse_one_satc(const struct acpi_dmar_header *header)
+{
+ const struct acpi_dmar_satc *satc =
+ container_of(header, const struct acpi_dmar_satc, header);
+ struct acpi_satc_unit *satcu;
+ const void *dev_scope_start, *dev_scope_end;
+ int ret = acpi_dmar_check_length(header, sizeof(*satc));
+
+ if ( ret )
+ return ret;
+
+ satcu = xzalloc(struct acpi_satc_unit);
+ if ( !satcu )
+ return -ENOMEM;
+
+ satcu->segment = satc->segment;
+ satcu->atc_required = satc->flags & ACPI_SATC_ATC_REQUIRED;
+
+ dev_scope_start = (const void *)(satc + 1);
+ dev_scope_end = (const void *)satc + header->length;
+ ret = acpi_parse_dev_scope(dev_scope_start, dev_scope_end,
+ &satcu->scope, SATC_TYPE, satc->segment);
+
+ if ( !ret && satcu->scope.devices_cnt )
+ ret = register_one_satc(satcu);
+
+ if ( ret )
+ {
+ scope_devices_free(&satcu->scope);
+ xfree(satcu);
+ }
+
+ /*
+ * register_one_satc() returns greater than 0 when a specified PCIe
+ * device cannot be detected. To prevent VT-d from being disabled in
+ * such cases, make the return value 0 here.
+ */
+ return ret > 0 ? 0 : ret;
+}
+
static int __init cf_check acpi_parse_dmar(struct acpi_table_header *table)
{
struct acpi_table_dmar *dmar;
@@ -803,6 +891,13 @@ static int __init cf_check acpi_parse_dm
printk(VTDPREFIX "found ACPI_DMAR_RHSA:\n");
ret = acpi_parse_one_rhsa(entry_header);
break;
+
+ case ACPI_DMAR_TYPE_SATC:
+ if ( iommu_verbose )
+ printk(VTDPREFIX "found ACPI_DMAR_SATC:\n");
+ ret = acpi_parse_one_satc(entry_header);
+ break;
+
default:
dprintk(XENLOG_WARNING VTDPREFIX,
"Ignore unknown DMAR structure type (%#x)\n",
--- a/xen/drivers/passthrough/vtd/dmar.h
+++ b/xen/drivers/passthrough/vtd/dmar.h
@@ -91,6 +91,16 @@ struct acpi_rhsa_unit {
u32 proximity_domain;
};
+struct acpi_satc_unit {
+ struct dmar_scope scope;
+ struct list_head list;
+ uint16_t segment;
+ bool atc_required:1;
+};
+
+/* In lieu of a definition in actbl2.h. */
+#define ACPI_SATC_ATC_REQUIRED (1U << 0)
+
#define for_each_drhd_unit(drhd) \
list_for_each_entry(drhd, &acpi_drhd_units, list)
@@ -106,6 +116,7 @@ struct acpi_atsr_unit *acpi_find_matched
#define DMAR_TYPE 1
#define RMRR_TYPE 2
#define ATSR_TYPE 3
+#define SATC_TYPE 4
#define DMAR_OPERATION_TIMEOUT MILLISECS(1000)
--- a/xen/include/acpi/actbl2.h
+++ b/xen/include/acpi/actbl2.h
@@ -345,7 +345,8 @@ enum acpi_dmar_type {
ACPI_DMAR_TYPE_RESERVED_MEMORY = 1,
ACPI_DMAR_TYPE_ATSR = 2,
ACPI_DMAR_HARDWARE_AFFINITY = 3,
- ACPI_DMAR_TYPE_RESERVED = 4 /* 4 and greater are reserved */
+ ACPI_DMAR_TYPE_SATC = 5,
+ ACPI_DMAR_TYPE_RESERVED = 7 /* 7 and greater are reserved */
};
/* DMAR Device Scope structure */
@@ -427,6 +428,15 @@ struct acpi_dmar_rhsa {
u32 proximity_domain;
};
+/* 5: SOC Integrated Address Translation Cache Reporting Structure */
+
+struct acpi_dmar_satc {
+ struct acpi_dmar_header header;
+ uint8_t flags;
+ uint8_t reserved;
+ uint16_t segment;
+};
+
/*******************************************************************************
*
* HPET - High Precision Event Timer table
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 03/12] VT-d: parse ACPI "SoC Integrated Address Translation Cache Reporting Structure"s
2024-02-15 10:14 ` [PATCH v2 03/12] VT-d: parse ACPI "SoC Integrated Address Translation Cache Reporting Structure"s Jan Beulich
@ 2024-05-06 10:29 ` Roger Pau Monné
2024-05-06 11:01 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-06 10:29 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On Thu, Feb 15, 2024 at 11:14:31AM +0100, Jan Beulich wrote:
> This is a prereq to us, in particular, respecting the "ATC required"
> flag.
>
> Note that ACPI_SATC_ATC_REQUIRED has its #define put in dmar.h, as we
> try to keep actbl*.h in sync what Linux (who in turn inherit from ACPI
> CA) has.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Lovely: On the SPR system with the SATC I tried passing "ats" (the
> "required" flag is clear there), just to then hit "IOMMU#4: QI dev wait
> descriptor taking too long" while setting up Dom0. The 2nd message there
> doesn't ever appear, so the request never completes. Not sure whether
> that's us doing something wrong or the hardware acting up. In the former
> case I'd generally expect an IOMMU fault to be raised, though. FTR same
> on 4.18 with just "VT-d: correct ATS checking for root complex
> integrated devices" backported there.
Great, so we likely have a bug in our ATS implementation?
>
> Should we check scope entries for appropriate types? (If so, then also
> for e.g. ATSR.)
> ---
> v2: Move error case freeing to acpi_parse_one_satc(). Introduce #define
> for the flag bit. Style.
>
> --- a/xen/drivers/passthrough/vtd/dmar.c
> +++ b/xen/drivers/passthrough/vtd/dmar.c
> @@ -47,6 +47,7 @@ LIST_HEAD_READ_MOSTLY(acpi_drhd_units);
> LIST_HEAD_READ_MOSTLY(acpi_rmrr_units);
> static LIST_HEAD_READ_MOSTLY(acpi_atsr_units);
> static LIST_HEAD_READ_MOSTLY(acpi_rhsa_units);
> +static LIST_HEAD_READ_MOSTLY(acpi_satc_units);
We could even make this one RO after init.
>
> static struct acpi_table_header *__read_mostly dmar_table;
> static int __read_mostly dmar_flags;
> @@ -750,6 +751,93 @@ acpi_parse_one_rhsa(struct acpi_dmar_hea
> return ret;
> }
>
> +static int __init register_one_satc(struct acpi_satc_unit *satcu)
> +{
> + bool ignore = false;
> + unsigned int i = 0;
> + int ret = 0;
> +
> + /* Skip checking if segment is not accessible yet. */
> + if ( !pci_known_segment(satcu->segment) )
> + i = UINT_MAX;
> +
> + for ( ; i < satcu->scope.devices_cnt; i++ )
> + {
> + uint8_t b = PCI_BUS(satcu->scope.devices[i]);
> + uint8_t d = PCI_SLOT(satcu->scope.devices[i]);
> + uint8_t f = PCI_FUNC(satcu->scope.devices[i]);
> +
> + if ( !pci_device_detect(satcu->segment, b, d, f) )
> + {
> + dprintk(XENLOG_WARNING VTDPREFIX,
> + " Non-existent device (%pp) is reported in SATC scope!\n",
> + &PCI_SBDF(satcu->segment, b, d, f));
> + ignore = true;
> + }
> + else
> + {
> + ignore = false;
> + break;
> + }
> + }
> +
> + if ( ignore )
> + {
> + dprintk(XENLOG_WARNING VTDPREFIX,
> + " Ignore SATC for seg %04x as no device under its scope is PCI discoverable\n",
> + satcu->segment);
Re the error messages: won't it be better to print them using plain
printk and gate on iommu_verbose being enabled if anything?
It does seem a bit odd that such messages won't be printed when
iommu={debug,verbose} is enabled on the command line.
> + return 1;
> + }
> +
> + if ( iommu_verbose )
> + printk(VTDPREFIX " ATC required: %d\n", satcu->atc_required);
> +
> + list_add(&satcu->list, &acpi_satc_units);
> +
> + return ret;
> +}
> +
> +static int __init
> +acpi_parse_one_satc(const struct acpi_dmar_header *header)
> +{
> + const struct acpi_dmar_satc *satc =
> + container_of(header, const struct acpi_dmar_satc, header);
> + struct acpi_satc_unit *satcu;
> + const void *dev_scope_start, *dev_scope_end;
> + int ret = acpi_dmar_check_length(header, sizeof(*satc));
> +
> + if ( ret )
> + return ret;
> +
> + satcu = xzalloc(struct acpi_satc_unit);
> + if ( !satcu )
> + return -ENOMEM;
> +
> + satcu->segment = satc->segment;
> + satcu->atc_required = satc->flags & ACPI_SATC_ATC_REQUIRED;
> +
> + dev_scope_start = (const void *)(satc + 1);
> + dev_scope_end = (const void *)satc + header->length;
Isn't it enough to just cast to void * and inherit the const from the
left side variable declaration?
You could even initialize dev_scope_{start,end} at definition.
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 03/12] VT-d: parse ACPI "SoC Integrated Address Translation Cache Reporting Structure"s
2024-05-06 10:29 ` Roger Pau Monné
@ 2024-05-06 11:01 ` Jan Beulich
2024-05-06 11:09 ` Roger Pau Monné
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-05-06 11:01 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: xen-devel@lists.xenproject.org, Paul Durrant
On 06.05.2024 12:29, Roger Pau Monné wrote:
> On Thu, Feb 15, 2024 at 11:14:31AM +0100, Jan Beulich wrote:
>> This is a prereq to us, in particular, respecting the "ATC required"
>> flag.
>>
>> Note that ACPI_SATC_ATC_REQUIRED has its #define put in dmar.h, as we
>> try to keep actbl*.h in sync what Linux (who in turn inherit from ACPI
>> CA) has.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> Lovely: On the SPR system with the SATC I tried passing "ats" (the
>> "required" flag is clear there), just to then hit "IOMMU#4: QI dev wait
>> descriptor taking too long" while setting up Dom0. The 2nd message there
>> doesn't ever appear, so the request never completes. Not sure whether
>> that's us doing something wrong or the hardware acting up. In the former
>> case I'd generally expect an IOMMU fault to be raised, though. FTR same
>> on 4.18 with just "VT-d: correct ATS checking for root complex
>> integrated devices" backported there.
>
> Great, so we likely have a bug in our ATS implementation?
Or there's a hardware / firmware issue. As said in the remark, while I'm
not really sure which one it is, I'd kind of expect some form of error
indication rather than just a hang if we did something wrong.
>> --- a/xen/drivers/passthrough/vtd/dmar.c
>> +++ b/xen/drivers/passthrough/vtd/dmar.c
>> @@ -47,6 +47,7 @@ LIST_HEAD_READ_MOSTLY(acpi_drhd_units);
>> LIST_HEAD_READ_MOSTLY(acpi_rmrr_units);
>> static LIST_HEAD_READ_MOSTLY(acpi_atsr_units);
>> static LIST_HEAD_READ_MOSTLY(acpi_rhsa_units);
>> +static LIST_HEAD_READ_MOSTLY(acpi_satc_units);
>
> We could even make this one RO after init.
Maybe, after first introducing LIST_HEAD_RO_AFTER_INIT() and then
perhaps switching the others up front. IOW I'd prefer to keep those
consistent and then (if so desired) update them all in one go.
>> @@ -750,6 +751,93 @@ acpi_parse_one_rhsa(struct acpi_dmar_hea
>> return ret;
>> }
>>
>> +static int __init register_one_satc(struct acpi_satc_unit *satcu)
>> +{
>> + bool ignore = false;
>> + unsigned int i = 0;
>> + int ret = 0;
>> +
>> + /* Skip checking if segment is not accessible yet. */
>> + if ( !pci_known_segment(satcu->segment) )
>> + i = UINT_MAX;
>> +
>> + for ( ; i < satcu->scope.devices_cnt; i++ )
>> + {
>> + uint8_t b = PCI_BUS(satcu->scope.devices[i]);
>> + uint8_t d = PCI_SLOT(satcu->scope.devices[i]);
>> + uint8_t f = PCI_FUNC(satcu->scope.devices[i]);
>> +
>> + if ( !pci_device_detect(satcu->segment, b, d, f) )
>> + {
>> + dprintk(XENLOG_WARNING VTDPREFIX,
>> + " Non-existent device (%pp) is reported in SATC scope!\n",
>> + &PCI_SBDF(satcu->segment, b, d, f));
>> + ignore = true;
>> + }
>> + else
>> + {
>> + ignore = false;
>> + break;
>> + }
>> + }
>> +
>> + if ( ignore )
>> + {
>> + dprintk(XENLOG_WARNING VTDPREFIX,
>> + " Ignore SATC for seg %04x as no device under its scope is PCI discoverable\n",
>> + satcu->segment);
>
> Re the error messages: won't it be better to print them using plain
> printk and gate on iommu_verbose being enabled if anything?
>
> It does seem a bit odd that such messages won't be printed when
> iommu={debug,verbose} is enabled on the command line.
Well, perhaps yes. Yet I'm trying here to stay (largely) in sync with how
in particular register_one_rmrr() behaves. Do you strictly think I should
diverge here?
>> +static int __init
>> +acpi_parse_one_satc(const struct acpi_dmar_header *header)
>> +{
>> + const struct acpi_dmar_satc *satc =
>> + container_of(header, const struct acpi_dmar_satc, header);
>> + struct acpi_satc_unit *satcu;
>> + const void *dev_scope_start, *dev_scope_end;
>> + int ret = acpi_dmar_check_length(header, sizeof(*satc));
>> +
>> + if ( ret )
>> + return ret;
>> +
>> + satcu = xzalloc(struct acpi_satc_unit);
>> + if ( !satcu )
>> + return -ENOMEM;
>> +
>> + satcu->segment = satc->segment;
>> + satcu->atc_required = satc->flags & ACPI_SATC_ATC_REQUIRED;
>> +
>> + dev_scope_start = (const void *)(satc + 1);
>> + dev_scope_end = (const void *)satc + header->length;
>
> Isn't it enough to just cast to void * and inherit the const from the
> left side variable declaration?
Misra won't like the (transient) removal of const, afaict. Personally I
also consider it bad practice to omit such const.
> You could even initialize dev_scope_{start,end} at definition.
Right. This is again the way it is to be in sync with other
acpi_parse_one_...() functions. It's always hard to judge where to diverge
and where consistency is weighed higher. Whichever way you do it, you may
get comment asking for the opposite ...
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 03/12] VT-d: parse ACPI "SoC Integrated Address Translation Cache Reporting Structure"s
2024-05-06 11:01 ` Jan Beulich
@ 2024-05-06 11:09 ` Roger Pau Monné
0 siblings, 0 replies; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-06 11:09 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Paul Durrant
On Mon, May 06, 2024 at 01:01:48PM +0200, Jan Beulich wrote:
> On 06.05.2024 12:29, Roger Pau Monné wrote:
> > On Thu, Feb 15, 2024 at 11:14:31AM +0100, Jan Beulich wrote:
> >> This is a prereq to us, in particular, respecting the "ATC required"
> >> flag.
> >>
> >> Note that ACPI_SATC_ATC_REQUIRED has its #define put in dmar.h, as we
> >> try to keep actbl*.h in sync what Linux (who in turn inherit from ACPI
> >> CA) has.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
I think however it should be mentioned in the description that
introduced code attempts to stay in sync with the existing
register_one_satc and acpi_parse_one_*() functions.
> >> ---
> >> Lovely: On the SPR system with the SATC I tried passing "ats" (the
> >> "required" flag is clear there), just to then hit "IOMMU#4: QI dev wait
> >> descriptor taking too long" while setting up Dom0. The 2nd message there
> >> doesn't ever appear, so the request never completes. Not sure whether
> >> that's us doing something wrong or the hardware acting up. In the former
> >> case I'd generally expect an IOMMU fault to be raised, though. FTR same
> >> on 4.18 with just "VT-d: correct ATS checking for root complex
> >> integrated devices" backported there.
> >
> > Great, so we likely have a bug in our ATS implementation?
>
> Or there's a hardware / firmware issue. As said in the remark, while I'm
> not really sure which one it is, I'd kind of expect some form of error
> indication rather than just a hang if we did something wrong.
>
> >> --- a/xen/drivers/passthrough/vtd/dmar.c
> >> +++ b/xen/drivers/passthrough/vtd/dmar.c
> >> @@ -47,6 +47,7 @@ LIST_HEAD_READ_MOSTLY(acpi_drhd_units);
> >> LIST_HEAD_READ_MOSTLY(acpi_rmrr_units);
> >> static LIST_HEAD_READ_MOSTLY(acpi_atsr_units);
> >> static LIST_HEAD_READ_MOSTLY(acpi_rhsa_units);
> >> +static LIST_HEAD_READ_MOSTLY(acpi_satc_units);
> >
> > We could even make this one RO after init.
>
> Maybe, after first introducing LIST_HEAD_RO_AFTER_INIT() and then
> perhaps switching the others up front. IOW I'd prefer to keep those
> consistent and then (if so desired) update them all in one go.
>
> >> @@ -750,6 +751,93 @@ acpi_parse_one_rhsa(struct acpi_dmar_hea
> >> return ret;
> >> }
> >>
> >> +static int __init register_one_satc(struct acpi_satc_unit *satcu)
> >> +{
> >> + bool ignore = false;
> >> + unsigned int i = 0;
> >> + int ret = 0;
> >> +
> >> + /* Skip checking if segment is not accessible yet. */
> >> + if ( !pci_known_segment(satcu->segment) )
> >> + i = UINT_MAX;
> >> +
> >> + for ( ; i < satcu->scope.devices_cnt; i++ )
> >> + {
> >> + uint8_t b = PCI_BUS(satcu->scope.devices[i]);
> >> + uint8_t d = PCI_SLOT(satcu->scope.devices[i]);
> >> + uint8_t f = PCI_FUNC(satcu->scope.devices[i]);
> >> +
> >> + if ( !pci_device_detect(satcu->segment, b, d, f) )
> >> + {
> >> + dprintk(XENLOG_WARNING VTDPREFIX,
> >> + " Non-existent device (%pp) is reported in SATC scope!\n",
> >> + &PCI_SBDF(satcu->segment, b, d, f));
> >> + ignore = true;
> >> + }
> >> + else
> >> + {
> >> + ignore = false;
> >> + break;
> >> + }
> >> + }
> >> +
> >> + if ( ignore )
> >> + {
> >> + dprintk(XENLOG_WARNING VTDPREFIX,
> >> + " Ignore SATC for seg %04x as no device under its scope is PCI discoverable\n",
> >> + satcu->segment);
> >
> > Re the error messages: won't it be better to print them using plain
> > printk and gate on iommu_verbose being enabled if anything?
> >
> > It does seem a bit odd that such messages won't be printed when
> > iommu={debug,verbose} is enabled on the command line.
>
> Well, perhaps yes. Yet I'm trying here to stay (largely) in sync with how
> in particular register_one_rmrr() behaves. Do you strictly think I should
> diverge here?
>
> >> +static int __init
> >> +acpi_parse_one_satc(const struct acpi_dmar_header *header)
> >> +{
> >> + const struct acpi_dmar_satc *satc =
> >> + container_of(header, const struct acpi_dmar_satc, header);
> >> + struct acpi_satc_unit *satcu;
> >> + const void *dev_scope_start, *dev_scope_end;
> >> + int ret = acpi_dmar_check_length(header, sizeof(*satc));
> >> +
> >> + if ( ret )
> >> + return ret;
> >> +
> >> + satcu = xzalloc(struct acpi_satc_unit);
> >> + if ( !satcu )
> >> + return -ENOMEM;
> >> +
> >> + satcu->segment = satc->segment;
> >> + satcu->atc_required = satc->flags & ACPI_SATC_ATC_REQUIRED;
> >> +
> >> + dev_scope_start = (const void *)(satc + 1);
> >> + dev_scope_end = (const void *)satc + header->length;
> >
> > Isn't it enough to just cast to void * and inherit the const from the
> > left side variable declaration?
>
> Misra won't like the (transient) removal of const, afaict. Personally I
> also consider it bad practice to omit such const.
>
> > You could even initialize dev_scope_{start,end} at definition.
>
> Right. This is again the way it is to be in sync with other
> acpi_parse_one_...() functions. It's always hard to judge where to diverge
> and where consistency is weighed higher. Whichever way you do it, you may
> get comment asking for the opposite ...
Oh, yes, IIRC you already mentioned this in v1, yet I've forgot when
reviewing this one.
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v2 04/12] AMD/IOMMU: add helper to check whether ATS is to be used for a device
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
` (2 preceding siblings ...)
2024-02-15 10:14 ` [PATCH v2 03/12] VT-d: parse ACPI "SoC Integrated Address Translation Cache Reporting Structure"s Jan Beulich
@ 2024-02-15 10:15 ` Jan Beulich
2024-05-06 11:27 ` Roger Pau Monné
2024-02-15 10:15 ` [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled Jan Beulich
` (7 subsequent siblings)
11 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:15 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monné, Paul Durrant, Andrew Cooper
The same set of conditions is used in three places, requiring to be kept
in sync. Introduce a helper to centralize these checks.
To allow all parameters of the new helper be pointer-to-const,
iommu_has_cap() also needs its 1st parameter to be constified. Beyond
that further "modernize" that function.
Requested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
--- a/xen/drivers/passthrough/amd/iommu.h
+++ b/xen/drivers/passthrough/amd/iommu.h
@@ -404,9 +404,9 @@ static inline void __free_amd_iommu_tabl
free_xenheap_pages(table, order);
}
-static inline int iommu_has_cap(struct amd_iommu *iommu, uint32_t bit)
+static inline bool iommu_has_cap(const struct amd_iommu *iommu, unsigned int bit)
{
- return !!(iommu->cap.header & (1u << bit));
+ return iommu->cap.header & (1u << bit);
}
/* access device id field from iommu cmd */
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -114,6 +114,16 @@ static bool any_pdev_behind_iommu(const
return false;
}
+static bool use_ats(
+ const struct pci_dev *pdev,
+ const struct amd_iommu *iommu,
+ const struct ivrs_mappings *ivrs_dev)
+{
+ return !ivrs_dev->block_ats &&
+ iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
+ pci_ats_device(iommu->seg, pdev->bus, pdev->devfn);
+}
+
static int __must_check amd_iommu_setup_domain_device(
struct domain *domain, struct amd_iommu *iommu,
uint8_t devfn, struct pci_dev *pdev)
@@ -185,9 +195,7 @@ static int __must_check amd_iommu_setup_
dte->ex = ivrs_dev->dte_allow_exclusion;
dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
- if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
- !ivrs_dev->block_ats &&
- iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
+ if ( use_ats(pdev, iommu, ivrs_dev) )
dte->i = ats_enabled;
spin_unlock_irqrestore(&iommu->lock, flags);
@@ -248,9 +256,7 @@ static int __must_check amd_iommu_setup_
ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags,
ACPI_IVHD_SYSTEM_MGMT));
- if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
- !ivrs_dev->block_ats &&
- iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
+ if ( use_ats(pdev, iommu, ivrs_dev) )
ASSERT(dte->i == ats_enabled);
spin_unlock_irqrestore(&iommu->lock, flags);
@@ -268,9 +274,7 @@ static int __must_check amd_iommu_setup_
ASSERT(pcidevs_locked());
- if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
- !ivrs_dev->block_ats &&
- iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
+ if ( use_ats(pdev, iommu, ivrs_dev) &&
!pci_ats_enabled(iommu->seg, bus, pdev->devfn) )
{
if ( devfn == pdev->devfn )
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 04/12] AMD/IOMMU: add helper to check whether ATS is to be used for a device
2024-02-15 10:15 ` [PATCH v2 04/12] AMD/IOMMU: add helper to check whether ATS is to be used for a device Jan Beulich
@ 2024-05-06 11:27 ` Roger Pau Monné
0 siblings, 0 replies; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-06 11:27 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Paul Durrant, Andrew Cooper
On Thu, Feb 15, 2024 at 11:15:12AM +0100, Jan Beulich wrote:
> The same set of conditions is used in three places, requiring to be kept
> in sync. Introduce a helper to centralize these checks.
>
> To allow all parameters of the new helper be pointer-to-const,
> iommu_has_cap() also needs its 1st parameter to be constified. Beyond
> that further "modernize" that function.
>
> Requested-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
` (3 preceding siblings ...)
2024-02-15 10:15 ` [PATCH v2 04/12] AMD/IOMMU: add helper to check whether ATS is to be used for a device Jan Beulich
@ 2024-02-15 10:15 ` Jan Beulich
2024-02-15 10:21 ` Jan Beulich
2024-05-06 12:42 ` Roger Pau Monné
2024-02-15 10:16 ` [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag Jan Beulich
` (6 subsequent siblings)
11 siblings, 2 replies; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:15 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
Make the variable a tristate, with (as done elsewhere) a negative value
meaning "default". Since all use sites need looking at, also rename it
to match our usual "opt_*" pattern. While touching it, also move it to
.data.ro_after_init.
The only place it retains boolean nature is pci_ats_device(), for now.
In AMD code re-order conditionals to have the config space accesses
after (cheaper) flag checks.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
In domain_context_mapping_one() I'm a little puzzled that translation
type is selected based on only IOMMU and global properties, i.e. not
taking the device itself into account.
---
v2: Re-base over new earlier patches.
--- a/xen/drivers/passthrough/amd/iommu_cmd.c
+++ b/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -282,7 +282,7 @@ void amd_iommu_flush_iotlb(u8 devfn, con
struct amd_iommu *iommu;
unsigned int req_id, queueid, maxpend;
- if ( !ats_enabled )
+ if ( opt_ats <= 0 )
return;
if ( !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) )
@@ -340,7 +340,7 @@ static void _amd_iommu_flush_pages(struc
flush_command_buffer(iommu, 0);
}
- if ( ats_enabled )
+ if ( opt_ats > 0 )
{
amd_iommu_flush_all_iotlbs(d, daddr, order);
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -119,7 +119,7 @@ static bool use_ats(
const struct amd_iommu *iommu,
const struct ivrs_mappings *ivrs_dev)
{
- return !ivrs_dev->block_ats &&
+ return opt_ats > 0 && !ivrs_dev->block_ats &&
iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
pci_ats_device(iommu->seg, pdev->bus, pdev->devfn);
}
@@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
if ( use_ats(pdev, iommu, ivrs_dev) )
- dte->i = ats_enabled;
+ dte->i = true;
spin_unlock_irqrestore(&iommu->lock, flags);
@@ -257,7 +257,7 @@ static int __must_check amd_iommu_setup_
ACPI_IVHD_SYSTEM_MGMT));
if ( use_ats(pdev, iommu, ivrs_dev) )
- ASSERT(dte->i == ats_enabled);
+ ASSERT(dte->i);
spin_unlock_irqrestore(&iommu->lock, flags);
--- a/xen/drivers/passthrough/ats.c
+++ b/xen/drivers/passthrough/ats.c
@@ -18,8 +18,8 @@
#include <xen/pci_regs.h>
#include "ats.h"
-bool __read_mostly ats_enabled;
-boolean_param("ats", ats_enabled);
+int8_t __ro_after_init opt_ats = -1;
+boolean_param("ats", opt_ats);
int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list)
{
--- a/xen/drivers/passthrough/ats.h
+++ b/xen/drivers/passthrough/ats.h
@@ -22,7 +22,7 @@
#define ATS_QUEUE_DEPTH_MASK 0x1f
#define ATS_ENABLE (1<<15)
-extern bool ats_enabled;
+extern int8_t opt_ats;
int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list);
void disable_ats_device(struct pci_dev *pdev);
@@ -43,7 +43,7 @@ static inline int pci_ats_enabled(int se
static inline int pci_ats_device(int seg, int bus, int devfn)
{
- if ( !ats_enabled )
+ if ( !opt_ats )
return 0;
return pci_find_ext_capability(PCI_SBDF(seg, bus, devfn),
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1543,7 +1543,7 @@ int domain_context_mapping_one(
}
context_set_address_root(lctxt, root);
- if ( ats_enabled && ecap_dev_iotlb(iommu->ecap) )
+ if ( opt_ats > 0 && ecap_dev_iotlb(iommu->ecap) )
context_set_translation_type(lctxt, CONTEXT_TT_DEV_IOTLB);
else
context_set_translation_type(lctxt, CONTEXT_TT_MULTI_LEVEL);
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -46,7 +46,7 @@ int ats_device(const struct pci_dev *pde
struct acpi_drhd_unit *ats_drhd;
unsigned int pos, expfl = 0;
- if ( !ats_enabled || !iommu_qinval )
+ if ( opt_ats <= 0 || !iommu_qinval )
return 0;
if ( !ecap_queued_inval(drhd->iommu->ecap) ||
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-02-15 10:15 ` [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled Jan Beulich
@ 2024-02-15 10:21 ` Jan Beulich
2024-05-06 12:42 ` Roger Pau Monné
1 sibling, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:21 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant, Andrew Cooper
On 15.02.2024 11:15, Jan Beulich wrote:
> Make the variable a tristate, with (as done elsewhere) a negative value
> meaning "default". Since all use sites need looking at, also rename it
> to match our usual "opt_*" pattern. While touching it, also move it to
> .data.ro_after_init.
>
> The only place it retains boolean nature is pci_ats_device(), for now.
>
> In AMD code re-order conditionals to have the config space accesses
> after (cheaper) flag checks.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
I'm sorry, forgot to Cc Andrew for the AMD IOMMU part of the changes.
Jan
> ---
> In domain_context_mapping_one() I'm a little puzzled that translation
> type is selected based on only IOMMU and global properties, i.e. not
> taking the device itself into account.
> ---
> v2: Re-base over new earlier patches.
>
> --- a/xen/drivers/passthrough/amd/iommu_cmd.c
> +++ b/xen/drivers/passthrough/amd/iommu_cmd.c
> @@ -282,7 +282,7 @@ void amd_iommu_flush_iotlb(u8 devfn, con
> struct amd_iommu *iommu;
> unsigned int req_id, queueid, maxpend;
>
> - if ( !ats_enabled )
> + if ( opt_ats <= 0 )
> return;
>
> if ( !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) )
> @@ -340,7 +340,7 @@ static void _amd_iommu_flush_pages(struc
> flush_command_buffer(iommu, 0);
> }
>
> - if ( ats_enabled )
> + if ( opt_ats > 0 )
> {
> amd_iommu_flush_all_iotlbs(d, daddr, order);
>
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -119,7 +119,7 @@ static bool use_ats(
> const struct amd_iommu *iommu,
> const struct ivrs_mappings *ivrs_dev)
> {
> - return !ivrs_dev->block_ats &&
> + return opt_ats > 0 && !ivrs_dev->block_ats &&
> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> pci_ats_device(iommu->seg, pdev->bus, pdev->devfn);
> }
> @@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
>
> if ( use_ats(pdev, iommu, ivrs_dev) )
> - dte->i = ats_enabled;
> + dte->i = true;
>
> spin_unlock_irqrestore(&iommu->lock, flags);
>
> @@ -257,7 +257,7 @@ static int __must_check amd_iommu_setup_
> ACPI_IVHD_SYSTEM_MGMT));
>
> if ( use_ats(pdev, iommu, ivrs_dev) )
> - ASSERT(dte->i == ats_enabled);
> + ASSERT(dte->i);
>
> spin_unlock_irqrestore(&iommu->lock, flags);
>
> --- a/xen/drivers/passthrough/ats.c
> +++ b/xen/drivers/passthrough/ats.c
> @@ -18,8 +18,8 @@
> #include <xen/pci_regs.h>
> #include "ats.h"
>
> -bool __read_mostly ats_enabled;
> -boolean_param("ats", ats_enabled);
> +int8_t __ro_after_init opt_ats = -1;
> +boolean_param("ats", opt_ats);
>
> int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list)
> {
> --- a/xen/drivers/passthrough/ats.h
> +++ b/xen/drivers/passthrough/ats.h
> @@ -22,7 +22,7 @@
> #define ATS_QUEUE_DEPTH_MASK 0x1f
> #define ATS_ENABLE (1<<15)
>
> -extern bool ats_enabled;
> +extern int8_t opt_ats;
>
> int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list);
> void disable_ats_device(struct pci_dev *pdev);
> @@ -43,7 +43,7 @@ static inline int pci_ats_enabled(int se
>
> static inline int pci_ats_device(int seg, int bus, int devfn)
> {
> - if ( !ats_enabled )
> + if ( !opt_ats )
> return 0;
>
> return pci_find_ext_capability(PCI_SBDF(seg, bus, devfn),
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1543,7 +1543,7 @@ int domain_context_mapping_one(
> }
>
> context_set_address_root(lctxt, root);
> - if ( ats_enabled && ecap_dev_iotlb(iommu->ecap) )
> + if ( opt_ats > 0 && ecap_dev_iotlb(iommu->ecap) )
> context_set_translation_type(lctxt, CONTEXT_TT_DEV_IOTLB);
> else
> context_set_translation_type(lctxt, CONTEXT_TT_MULTI_LEVEL);
> --- a/xen/drivers/passthrough/vtd/x86/ats.c
> +++ b/xen/drivers/passthrough/vtd/x86/ats.c
> @@ -46,7 +46,7 @@ int ats_device(const struct pci_dev *pde
> struct acpi_drhd_unit *ats_drhd;
> unsigned int pos, expfl = 0;
>
> - if ( !ats_enabled || !iommu_qinval )
> + if ( opt_ats <= 0 || !iommu_qinval )
> return 0;
>
> if ( !ecap_queued_inval(drhd->iommu->ecap) ||
>
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-02-15 10:15 ` [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled Jan Beulich
2024-02-15 10:21 ` Jan Beulich
@ 2024-05-06 12:42 ` Roger Pau Monné
2024-05-06 13:20 ` Jan Beulich
1 sibling, 1 reply; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-06 12:42 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On Thu, Feb 15, 2024 at 11:15:39AM +0100, Jan Beulich wrote:
> Make the variable a tristate, with (as done elsewhere) a negative value
> meaning "default". Since all use sites need looking at, also rename it
> to match our usual "opt_*" pattern. While touching it, also move it to
> .data.ro_after_init.
I guess I need to look at further patches, as given the feedback on
the past version I think we agreed we want to set ATS unconditionally
disabled by default, and hence I'm not sure I see the point of the
tri-state if enabling ATS will require an explicit opt-in on the
command line (ats=1).
> The only place it retains boolean nature is pci_ats_device(), for now.
>
> In AMD code re-order conditionals to have the config space accesses
> after (cheaper) flag checks.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> In domain_context_mapping_one() I'm a little puzzled that translation
> type is selected based on only IOMMU and global properties, i.e. not
> taking the device itself into account.
Is it maybe fine do set DEV_IOTLB unconditionally as long as the IOMMU
supports it, and then let the device decide whether it wants to issue
translated or non-translated requests depending on whether it supports
(and enables) ATS?
I think it would be best to limit this strictly to devices that have
ATS enabled.
> ---
> v2: Re-base over new earlier patches.
>
> --- a/xen/drivers/passthrough/amd/iommu_cmd.c
> +++ b/xen/drivers/passthrough/amd/iommu_cmd.c
> @@ -282,7 +282,7 @@ void amd_iommu_flush_iotlb(u8 devfn, con
> struct amd_iommu *iommu;
> unsigned int req_id, queueid, maxpend;
>
> - if ( !ats_enabled )
> + if ( opt_ats <= 0 )
If having a tri-state is required, won't it be best to decide on a
binary value at init time, so that runtime functions can handle
opt_ats as a boolean?
Otherwise we are open coding the default expectation of what -1
implies (disabled) in all use sites.
> return;
>
> if ( !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) )
> @@ -340,7 +340,7 @@ static void _amd_iommu_flush_pages(struc
> flush_command_buffer(iommu, 0);
> }
>
> - if ( ats_enabled )
> + if ( opt_ats > 0 )
> {
> amd_iommu_flush_all_iotlbs(d, daddr, order);
>
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -119,7 +119,7 @@ static bool use_ats(
> const struct amd_iommu *iommu,
> const struct ivrs_mappings *ivrs_dev)
> {
> - return !ivrs_dev->block_ats &&
> + return opt_ats > 0 && !ivrs_dev->block_ats &&
> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> pci_ats_device(iommu->seg, pdev->bus, pdev->devfn);
> }
> @@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
>
> if ( use_ats(pdev, iommu, ivrs_dev) )
> - dte->i = ats_enabled;
> + dte->i = true;
Might be easier to just use:
dte->i = use_ats(pdev, iommu, ivrs_dev);
>
> spin_unlock_irqrestore(&iommu->lock, flags);
>
> @@ -257,7 +257,7 @@ static int __must_check amd_iommu_setup_
> ACPI_IVHD_SYSTEM_MGMT));
>
> if ( use_ats(pdev, iommu, ivrs_dev) )
> - ASSERT(dte->i == ats_enabled);
> + ASSERT(dte->i);
ASSERT(dte->i == use_ats(pdev, iommu, ivrs_dev));
>
> spin_unlock_irqrestore(&iommu->lock, flags);
>
> --- a/xen/drivers/passthrough/ats.c
> +++ b/xen/drivers/passthrough/ats.c
> @@ -18,8 +18,8 @@
> #include <xen/pci_regs.h>
> #include "ats.h"
>
> -bool __read_mostly ats_enabled;
> -boolean_param("ats", ats_enabled);
> +int8_t __ro_after_init opt_ats = -1;
> +boolean_param("ats", opt_ats);
>
> int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list)
> {
> --- a/xen/drivers/passthrough/ats.h
> +++ b/xen/drivers/passthrough/ats.h
> @@ -22,7 +22,7 @@
> #define ATS_QUEUE_DEPTH_MASK 0x1f
> #define ATS_ENABLE (1<<15)
>
> -extern bool ats_enabled;
> +extern int8_t opt_ats;
>
> int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list);
> void disable_ats_device(struct pci_dev *pdev);
> @@ -43,7 +43,7 @@ static inline int pci_ats_enabled(int se
>
> static inline int pci_ats_device(int seg, int bus, int devfn)
> {
> - if ( !ats_enabled )
> + if ( !opt_ats )
> return 0;
Can't you remove that check altogether now, since you are adding an
opt_ats check to use_ats()?
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-05-06 12:42 ` Roger Pau Monné
@ 2024-05-06 13:20 ` Jan Beulich
2024-05-06 13:53 ` Roger Pau Monné
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-05-06 13:20 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On 06.05.2024 14:42, Roger Pau Monné wrote:
> On Thu, Feb 15, 2024 at 11:15:39AM +0100, Jan Beulich wrote:
>> Make the variable a tristate, with (as done elsewhere) a negative value
>> meaning "default". Since all use sites need looking at, also rename it
>> to match our usual "opt_*" pattern. While touching it, also move it to
>> .data.ro_after_init.
>
> I guess I need to look at further patches, as given the feedback on
> the past version I think we agreed we want to set ATS unconditionally
> disabled by default, and hence I'm not sure I see the point of the
> tri-state if enabling ATS will require an explicit opt-in on the
> command line (ats=1).
With the present wording in the VT-d spec (which we've now had vague
indication that it may not be meant that way) there needs to be
tristate behavior:
- With "ats=0" ATS won't be used.
- With "ats=1" ATS will be used for all ATS-capable devices.
- Without either option ATS will be used for devices where firmware
mandates its use.
If the alternative reading was confirmed (and preferably the text also
adjusted), we should be able to get away with a simple boolean again.
At which point this patch may end up being purely renaming, and hence
could then as well be left out.
>> The only place it retains boolean nature is pci_ats_device(), for now.
>>
>> In AMD code re-order conditionals to have the config space accesses
>> after (cheaper) flag checks.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> In domain_context_mapping_one() I'm a little puzzled that translation
>> type is selected based on only IOMMU and global properties, i.e. not
>> taking the device itself into account.
>
> Is it maybe fine do set DEV_IOTLB unconditionally as long as the IOMMU
> supports it, and then let the device decide whether it wants to issue
> translated or non-translated requests depending on whether it supports
> (and enables) ATS?
This might be the reason why it is what it is now.
> I think it would be best to limit this strictly to devices that have
> ATS enabled.
And this was the reason for me putting the remark here.
>> --- a/xen/drivers/passthrough/amd/iommu_cmd.c
>> +++ b/xen/drivers/passthrough/amd/iommu_cmd.c
>> @@ -282,7 +282,7 @@ void amd_iommu_flush_iotlb(u8 devfn, con
>> struct amd_iommu *iommu;
>> unsigned int req_id, queueid, maxpend;
>>
>> - if ( !ats_enabled )
>> + if ( opt_ats <= 0 )
>
> If having a tri-state is required, won't it be best to decide on a
> binary value at init time, so that runtime functions can handle
> opt_ats as a boolean?
As per above, unlike for other options we can't consolidate into a
boolean, as runtime behavior wants to be different with all three
possible settings.
> Otherwise we are open coding the default expectation of what -1
> implies (disabled) in all use sites.
That's pretty much unavoidable with the different meaning of 1 and -1.
>> @@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
>> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
>>
>> if ( use_ats(pdev, iommu, ivrs_dev) )
>> - dte->i = ats_enabled;
>> + dte->i = true;
>
> Might be easier to just use:
>
> dte->i = use_ats(pdev, iommu, ivrs_dev);
I'm hesitant here, as in principle we might be overwriting a "true" by
"false" then.
>> @@ -257,7 +257,7 @@ static int __must_check amd_iommu_setup_
>> ACPI_IVHD_SYSTEM_MGMT));
>>
>> if ( use_ats(pdev, iommu, ivrs_dev) )
>> - ASSERT(dte->i == ats_enabled);
>> + ASSERT(dte->i);
>
> ASSERT(dte->i == use_ats(pdev, iommu, ivrs_dev));
I'm okay switching here, but better to the precise logical equivalent of
the earlier code:
ASSERT(dte->i || !use_ats(pdev, iommu, ivrs_dev));
>> @@ -43,7 +43,7 @@ static inline int pci_ats_enabled(int se
>>
>> static inline int pci_ats_device(int seg, int bus, int devfn)
>> {
>> - if ( !ats_enabled )
>> + if ( !opt_ats )
>> return 0;
>
> Can't you remove that check altogether now, since you are adding an
> opt_ats check to use_ats()?
Two reasons why not: For one this isn't AMD-specific code, and hence
shouldn't be tied to the AMD-specific use_ats(). In principle VT-d
code should be okay to call here, too. And then
amd_iommu_disable_domain_device() doesn't use use_ats(), but does call
here.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-05-06 13:20 ` Jan Beulich
@ 2024-05-06 13:53 ` Roger Pau Monné
2024-05-15 10:07 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-06 13:53 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On Mon, May 06, 2024 at 03:20:38PM +0200, Jan Beulich wrote:
> On 06.05.2024 14:42, Roger Pau Monné wrote:
> > On Thu, Feb 15, 2024 at 11:15:39AM +0100, Jan Beulich wrote:
> >> Make the variable a tristate, with (as done elsewhere) a negative value
> >> meaning "default". Since all use sites need looking at, also rename it
> >> to match our usual "opt_*" pattern. While touching it, also move it to
> >> .data.ro_after_init.
> >
> > I guess I need to look at further patches, as given the feedback on
> > the past version I think we agreed we want to set ATS unconditionally
> > disabled by default, and hence I'm not sure I see the point of the
> > tri-state if enabling ATS will require an explicit opt-in on the
> > command line (ats=1).
>
> With the present wording in the VT-d spec (which we've now had vague
> indication that it may not be meant that way) there needs to be
> tristate behavior:
> - With "ats=0" ATS won't be used.
> - With "ats=1" ATS will be used for all ATS-capable devices.
> - Without either option ATS will be used for devices where firmware
> mandates its use.
I'm afraid I don't agree to this behavior. Regardless of what the
firmware requests ATS must only be enabled on user-request (iow: when
the ats=1 command line option is passed). Otherwise ATS must remain
disabled for all devices. It's not fine for firmware to trigger the
enabling of a feature that's not supported on Xen.
> >> @@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
> >> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
> >>
> >> if ( use_ats(pdev, iommu, ivrs_dev) )
> >> - dte->i = ats_enabled;
> >> + dte->i = true;
> >
> > Might be easier to just use:
> >
> > dte->i = use_ats(pdev, iommu, ivrs_dev);
>
> I'm hesitant here, as in principle we might be overwriting a "true" by
> "false" then.
Hm, but that would be fine, what's the point in enabling the IOMMU to
reply to ATS requests if ATS is not enabled on the device?
IOW: overwriting a "true" with a "false" seem like the correct
behavior if it's based on the output of use_ats().
> >> @@ -257,7 +257,7 @@ static int __must_check amd_iommu_setup_
> >> ACPI_IVHD_SYSTEM_MGMT));
> >>
> >> if ( use_ats(pdev, iommu, ivrs_dev) )
> >> - ASSERT(dte->i == ats_enabled);
> >> + ASSERT(dte->i);
> >
> > ASSERT(dte->i == use_ats(pdev, iommu, ivrs_dev));
>
> I'm okay switching here, but better to the precise logical equivalent of
> the earlier code:
>
> ASSERT(dte->i || !use_ats(pdev, iommu, ivrs_dev));
Hm, I see. I think we should be more strict with this (see my
previous comment), but we could defer to a later change.
>
> >> @@ -43,7 +43,7 @@ static inline int pci_ats_enabled(int se
> >>
> >> static inline int pci_ats_device(int seg, int bus, int devfn)
> >> {
> >> - if ( !ats_enabled )
> >> + if ( !opt_ats )
> >> return 0;
> >
> > Can't you remove that check altogether now, since you are adding an
> > opt_ats check to use_ats()?
>
> Two reasons why not: For one this isn't AMD-specific code, and hence
> shouldn't be tied to the AMD-specific use_ats(). In principle VT-d
> code should be okay to call here, too. And then
> amd_iommu_disable_domain_device() doesn't use use_ats(), but does call
> here.
Oh, that's confusing, I didn't realize use_ats was AMD specific code.
It should have some kind of prefix to avoid this kind of confusion.
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-05-06 13:53 ` Roger Pau Monné
@ 2024-05-15 10:07 ` Jan Beulich
2024-05-20 10:29 ` Roger Pau Monné
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-05-15 10:07 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: xen-devel@lists.xenproject.org, Paul Durrant
On 06.05.2024 15:53, Roger Pau Monné wrote:
> On Mon, May 06, 2024 at 03:20:38PM +0200, Jan Beulich wrote:
>> On 06.05.2024 14:42, Roger Pau Monné wrote:
>>> On Thu, Feb 15, 2024 at 11:15:39AM +0100, Jan Beulich wrote:
>>>> Make the variable a tristate, with (as done elsewhere) a negative value
>>>> meaning "default". Since all use sites need looking at, also rename it
>>>> to match our usual "opt_*" pattern. While touching it, also move it to
>>>> .data.ro_after_init.
>>>
>>> I guess I need to look at further patches, as given the feedback on
>>> the past version I think we agreed we want to set ATS unconditionally
>>> disabled by default, and hence I'm not sure I see the point of the
>>> tri-state if enabling ATS will require an explicit opt-in on the
>>> command line (ats=1).
>>
>> With the present wording in the VT-d spec (which we've now had vague
>> indication that it may not be meant that way) there needs to be
>> tristate behavior:
>> - With "ats=0" ATS won't be used.
>> - With "ats=1" ATS will be used for all ATS-capable devices.
>> - Without either option ATS will be used for devices where firmware
>> mandates its use.
>
> I'm afraid I don't agree to this behavior. Regardless of what the
> firmware requests ATS must only be enabled on user-request (iow: when
> the ats=1 command line option is passed). Otherwise ATS must remain
> disabled for all devices. It's not fine for firmware to trigger the
> enabling of a feature that's not supported on Xen.
Well. On one hand I can see your point. Otoh with the spec still being the
way it is, on systems mandating ATS use for at least one device we'd then
simply need to deem Xen unsupported there altogether. The goal of the
series, though, is to make things work as mandated by the spec on such
systems, which to me implies we need to consider use of ATS supported in
such cases (and only for those specific devices, i.e. still without
considering use of "ats" on the command line supported).
If and when the spec was changed to clarify the flag is a performance hint,
not a functional requirement, then we could do as you suggest. At which
point, as mentioned before, opt_ats may be possible to become a plain
boolean variable.
>>>> @@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
>>>> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
>>>>
>>>> if ( use_ats(pdev, iommu, ivrs_dev) )
>>>> - dte->i = ats_enabled;
>>>> + dte->i = true;
>>>
>>> Might be easier to just use:
>>>
>>> dte->i = use_ats(pdev, iommu, ivrs_dev);
>>
>> I'm hesitant here, as in principle we might be overwriting a "true" by
>> "false" then.
>
> Hm, but that would be fine, what's the point in enabling the IOMMU to
> reply to ATS requests if ATS is not enabled on the device?
>
> IOW: overwriting a "true" with a "false" seem like the correct
> behavior if it's based on the output of use_ats().
I don't think so, unless there were flow guarantees excluding the possibility
of taking this path twice without intermediately disabling the device again.
Down from here the enabling of ATS is gated on use_ats(). Hence if, in an
earlier invocation, we enabled ATS (and set dte->i), we wouldn't turn off ATS
below (there's only code to turn it on), yet with what you suggest we'd clear
dte->i.
Thinking about it: Maybe your comment roots in you meaning to leverage here
that use_ats() is not supposed to return different values for the same device,
when invoked multiple times. If so, I'm afraid I'm hesitant to make use of
such a property when I can easily avoid it.
>>>> @@ -43,7 +43,7 @@ static inline int pci_ats_enabled(int se
>>>>
>>>> static inline int pci_ats_device(int seg, int bus, int devfn)
>>>> {
>>>> - if ( !ats_enabled )
>>>> + if ( !opt_ats )
>>>> return 0;
>>>
>>> Can't you remove that check altogether now, since you are adding an
>>> opt_ats check to use_ats()?
>>
>> Two reasons why not: For one this isn't AMD-specific code, and hence
>> shouldn't be tied to the AMD-specific use_ats(). In principle VT-d
>> code should be okay to call here, too. And then
>> amd_iommu_disable_domain_device() doesn't use use_ats(), but does call
>> here.
>
> Oh, that's confusing, I didn't realize use_ats was AMD specific code.
> It should have some kind of prefix to avoid this kind of confusion.
Hmm, the function being static in an AMD-only file, I would have thought that
makes it clear enough that it's AMD-specific.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-05-15 10:07 ` Jan Beulich
@ 2024-05-20 10:29 ` Roger Pau Monné
2024-05-21 6:21 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-20 10:29 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Paul Durrant
On Wed, May 15, 2024 at 12:07:50PM +0200, Jan Beulich wrote:
> On 06.05.2024 15:53, Roger Pau Monné wrote:
> > On Mon, May 06, 2024 at 03:20:38PM +0200, Jan Beulich wrote:
> >> On 06.05.2024 14:42, Roger Pau Monné wrote:
> >>> On Thu, Feb 15, 2024 at 11:15:39AM +0100, Jan Beulich wrote:
> >>>> Make the variable a tristate, with (as done elsewhere) a negative value
> >>>> meaning "default". Since all use sites need looking at, also rename it
> >>>> to match our usual "opt_*" pattern. While touching it, also move it to
> >>>> .data.ro_after_init.
> >>>
> >>> I guess I need to look at further patches, as given the feedback on
> >>> the past version I think we agreed we want to set ATS unconditionally
> >>> disabled by default, and hence I'm not sure I see the point of the
> >>> tri-state if enabling ATS will require an explicit opt-in on the
> >>> command line (ats=1).
> >>
> >> With the present wording in the VT-d spec (which we've now had vague
> >> indication that it may not be meant that way) there needs to be
> >> tristate behavior:
> >> - With "ats=0" ATS won't be used.
> >> - With "ats=1" ATS will be used for all ATS-capable devices.
> >> - Without either option ATS will be used for devices where firmware
> >> mandates its use.
> >
> > I'm afraid I don't agree to this behavior. Regardless of what the
> > firmware requests ATS must only be enabled on user-request (iow: when
> > the ats=1 command line option is passed). Otherwise ATS must remain
> > disabled for all devices. It's not fine for firmware to trigger the
> > enabling of a feature that's not supported on Xen.
>
> Well. On one hand I can see your point. Otoh with the spec still being the
> way it is, on systems mandating ATS use for at least one device we'd then
> simply need to deem Xen unsupported there altogether. The goal of the
> series, though, is to make things work as mandated by the spec on such
> systems, which to me implies we need to consider use of ATS supported in
> such cases (and only for those specific devices, i.e. still without
> considering use of "ats" on the command line supported).
I'm in general hesitant of ATS because I think it undermines the
security of PCI passthrough. However this would still be acceptable
for dom0 because it's (usually?) part of the trusted base of a Xen
host.
If we want to make use of ATS for devices assigned to dom0 we should
clarify the warning in xen-command-line.pandoc.
We should also consider that dom0 usually does a lot of p2m
manipulations (by having to map grants and foreign pages). Those will
result in p2m flushes that will lead to IOMMU flushes, and when using
ATS that will require device TLB flushes. I wonder how much of an
overhead this will add to normal dom0 operations (plus the added risk
of those device TLB flushes stalling the IOMMU queue).
I would be much more comfortable with making the ats= command line
option a tri-state:
ats={0,1,mandatory}
Where the 'mandatory' option or equivalent enables ATS only for
devices that mandate it. However I still think the default option
should be disabled for all devices. If devices that require ATS are
found on the system I would use `warning_add()` to notify the user
of the need to consider adding ats=mandatory to the command line.
> If and when the spec was changed to clarify the flag is a performance hint,
> not a functional requirement, then we could do as you suggest. At which
> point, as mentioned before, opt_ats may be possible to become a plain
> boolean variable.
It's a complex situation, and I'm kind of surprised by the
introduction of this mandatory ATS requirement by Intel in a
non-backwards compatible way (as the specification claims the device
won't be functional without ATS enabled if required).
> >>>> @@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
> >>>> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
> >>>>
> >>>> if ( use_ats(pdev, iommu, ivrs_dev) )
> >>>> - dte->i = ats_enabled;
> >>>> + dte->i = true;
> >>>
> >>> Might be easier to just use:
> >>>
> >>> dte->i = use_ats(pdev, iommu, ivrs_dev);
> >>
> >> I'm hesitant here, as in principle we might be overwriting a "true" by
> >> "false" then.
> >
> > Hm, but that would be fine, what's the point in enabling the IOMMU to
> > reply to ATS requests if ATS is not enabled on the device?
> >
> > IOW: overwriting a "true" with a "false" seem like the correct
> > behavior if it's based on the output of use_ats().
>
> I don't think so, unless there were flow guarantees excluding the possibility
> of taking this path twice without intermediately disabling the device again.
> Down from here the enabling of ATS is gated on use_ats(). Hence if, in an
> earlier invocation, we enabled ATS (and set dte->i), we wouldn't turn off ATS
> below (there's only code to turn it on), yet with what you suggest we'd clear
> dte->i.
Please bear with me, I think I'm confused, why would use_ats(), and if
that's the case, don't we want to update dte->i so that it matches the
ATS state?
Otherwise we would fail to disable IOMMU device address translation
support if ATS was disabled?
> Thinking about it: Maybe your comment roots in you meaning to leverage here
> that use_ats() is not supposed to return different values for the same device,
> when invoked multiple times. If so, I'm afraid I'm hesitant to make use of
> such a property when I can easily avoid it.
>
> >>>> @@ -43,7 +43,7 @@ static inline int pci_ats_enabled(int se
> >>>>
> >>>> static inline int pci_ats_device(int seg, int bus, int devfn)
> >>>> {
> >>>> - if ( !ats_enabled )
> >>>> + if ( !opt_ats )
> >>>> return 0;
> >>>
> >>> Can't you remove that check altogether now, since you are adding an
> >>> opt_ats check to use_ats()?
> >>
> >> Two reasons why not: For one this isn't AMD-specific code, and hence
> >> shouldn't be tied to the AMD-specific use_ats(). In principle VT-d
> >> code should be okay to call here, too. And then
> >> amd_iommu_disable_domain_device() doesn't use use_ats(), but does call
> >> here.
> >
> > Oh, that's confusing, I didn't realize use_ats was AMD specific code.
> > It should have some kind of prefix to avoid this kind of confusion.
>
> Hmm, the function being static in an AMD-only file, I would have thought that
> makes it clear enough that it's AMD-specific.
Yes, sure, I guess the name looked generic enough to be something that
could be shared across vendor implementations.
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-05-20 10:29 ` Roger Pau Monné
@ 2024-05-21 6:21 ` Jan Beulich
2024-05-21 10:03 ` Roger Pau Monné
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-05-21 6:21 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: xen-devel@lists.xenproject.org, Paul Durrant
On 20.05.2024 12:29, Roger Pau Monné wrote:
> On Wed, May 15, 2024 at 12:07:50PM +0200, Jan Beulich wrote:
>> On 06.05.2024 15:53, Roger Pau Monné wrote:
>>> On Mon, May 06, 2024 at 03:20:38PM +0200, Jan Beulich wrote:
>>>> On 06.05.2024 14:42, Roger Pau Monné wrote:
>>>>> On Thu, Feb 15, 2024 at 11:15:39AM +0100, Jan Beulich wrote:
>>>>>> @@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
>>>>>> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
>>>>>>
>>>>>> if ( use_ats(pdev, iommu, ivrs_dev) )
>>>>>> - dte->i = ats_enabled;
>>>>>> + dte->i = true;
>>>>>
>>>>> Might be easier to just use:
>>>>>
>>>>> dte->i = use_ats(pdev, iommu, ivrs_dev);
>>>>
>>>> I'm hesitant here, as in principle we might be overwriting a "true" by
>>>> "false" then.
>>>
>>> Hm, but that would be fine, what's the point in enabling the IOMMU to
>>> reply to ATS requests if ATS is not enabled on the device?
>>>
>>> IOW: overwriting a "true" with a "false" seem like the correct
>>> behavior if it's based on the output of use_ats().
>>
>> I don't think so, unless there were flow guarantees excluding the possibility
>> of taking this path twice without intermediately disabling the device again.
>> Down from here the enabling of ATS is gated on use_ats(). Hence if, in an
>> earlier invocation, we enabled ATS (and set dte->i), we wouldn't turn off ATS
>> below (there's only code to turn it on), yet with what you suggest we'd clear
>> dte->i.
>
> Please bear with me, I think I'm confused, why would use_ats(), and if
> that's the case, don't we want to update dte->i so that it matches the
> ATS state?
I'm afraid I can't parse this. Maybe a result of incomplete editing? The
topic is complex enough that I don't want to even try to guess what you
may have meant to ask ...
> Otherwise we would fail to disable IOMMU device address translation
> support if ATS was disabled?
I think the answer here is "no", but with the above I'm not really sure
here, either.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-05-21 6:21 ` Jan Beulich
@ 2024-05-21 10:03 ` Roger Pau Monné
2024-05-21 10:23 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-21 10:03 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Paul Durrant
On Tue, May 21, 2024 at 08:21:35AM +0200, Jan Beulich wrote:
> On 20.05.2024 12:29, Roger Pau Monné wrote:
> > On Wed, May 15, 2024 at 12:07:50PM +0200, Jan Beulich wrote:
> >> On 06.05.2024 15:53, Roger Pau Monné wrote:
> >>> On Mon, May 06, 2024 at 03:20:38PM +0200, Jan Beulich wrote:
> >>>> On 06.05.2024 14:42, Roger Pau Monné wrote:
> >>>>> On Thu, Feb 15, 2024 at 11:15:39AM +0100, Jan Beulich wrote:
> >>>>>> @@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
> >>>>>> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
> >>>>>>
> >>>>>> if ( use_ats(pdev, iommu, ivrs_dev) )
> >>>>>> - dte->i = ats_enabled;
> >>>>>> + dte->i = true;
> >>>>>
> >>>>> Might be easier to just use:
> >>>>>
> >>>>> dte->i = use_ats(pdev, iommu, ivrs_dev);
> >>>>
> >>>> I'm hesitant here, as in principle we might be overwriting a "true" by
> >>>> "false" then.
> >>>
> >>> Hm, but that would be fine, what's the point in enabling the IOMMU to
> >>> reply to ATS requests if ATS is not enabled on the device?
> >>>
> >>> IOW: overwriting a "true" with a "false" seem like the correct
> >>> behavior if it's based on the output of use_ats().
> >>
> >> I don't think so, unless there were flow guarantees excluding the possibility
> >> of taking this path twice without intermediately disabling the device again.
> >> Down from here the enabling of ATS is gated on use_ats(). Hence if, in an
> >> earlier invocation, we enabled ATS (and set dte->i), we wouldn't turn off ATS
> >> below (there's only code to turn it on), yet with what you suggest we'd clear
> >> dte->i.
> >
> > Please bear with me, I think I'm confused, why would use_ats(), and if
> > that's the case, don't we want to update dte->i so that it matches the
> > ATS state?
>
> I'm afraid I can't parse this. Maybe a result of incomplete editing? The
> topic is complex enough that I don't want to even try to guess what you
> may have meant to ask ...
Oh, indeed, sorry, the full sentences should have been:
Please bear with me, I think I'm confused, why would use_ats() return
different values for the same device?
And if that's the case, don't we want to update dte->i so that it
matches the ATS state signaled by use_ats()?
> > Otherwise we would fail to disable IOMMU device address translation
> > support if ATS was disabled?
>
> I think the answer here is "no", but with the above I'm not really sure
> here, either.
Given the current logic in use_ats() AFAICT the return value of that
function should not change for a given device?
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled
2024-05-21 10:03 ` Roger Pau Monné
@ 2024-05-21 10:23 ` Jan Beulich
0 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2024-05-21 10:23 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: xen-devel@lists.xenproject.org, Paul Durrant
On 21.05.2024 12:03, Roger Pau Monné wrote:
> On Tue, May 21, 2024 at 08:21:35AM +0200, Jan Beulich wrote:
>> On 20.05.2024 12:29, Roger Pau Monné wrote:
>>> On Wed, May 15, 2024 at 12:07:50PM +0200, Jan Beulich wrote:
>>>> On 06.05.2024 15:53, Roger Pau Monné wrote:
>>>>> On Mon, May 06, 2024 at 03:20:38PM +0200, Jan Beulich wrote:
>>>>>> On 06.05.2024 14:42, Roger Pau Monné wrote:
>>>>>>> On Thu, Feb 15, 2024 at 11:15:39AM +0100, Jan Beulich wrote:
>>>>>>>> @@ -196,7 +196,7 @@ static int __must_check amd_iommu_setup_
>>>>>>>> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
>>>>>>>>
>>>>>>>> if ( use_ats(pdev, iommu, ivrs_dev) )
>>>>>>>> - dte->i = ats_enabled;
>>>>>>>> + dte->i = true;
>>>>>>>
>>>>>>> Might be easier to just use:
>>>>>>>
>>>>>>> dte->i = use_ats(pdev, iommu, ivrs_dev);
>>>>>>
>>>>>> I'm hesitant here, as in principle we might be overwriting a "true" by
>>>>>> "false" then.
>>>>>
>>>>> Hm, but that would be fine, what's the point in enabling the IOMMU to
>>>>> reply to ATS requests if ATS is not enabled on the device?
>>>>>
>>>>> IOW: overwriting a "true" with a "false" seem like the correct
>>>>> behavior if it's based on the output of use_ats().
>>>>
>>>> I don't think so, unless there were flow guarantees excluding the possibility
>>>> of taking this path twice without intermediately disabling the device again.
>>>> Down from here the enabling of ATS is gated on use_ats(). Hence if, in an
>>>> earlier invocation, we enabled ATS (and set dte->i), we wouldn't turn off ATS
>>>> below (there's only code to turn it on), yet with what you suggest we'd clear
>>>> dte->i.
>>>
>>> Please bear with me, I think I'm confused, why would use_ats(), and if
>>> that's the case, don't we want to update dte->i so that it matches the
>>> ATS state?
>>
>> I'm afraid I can't parse this. Maybe a result of incomplete editing? The
>> topic is complex enough that I don't want to even try to guess what you
>> may have meant to ask ...
>
> Oh, indeed, sorry, the full sentences should have been:
>
> Please bear with me, I think I'm confused, why would use_ats() return
> different values for the same device?
Right now it can't, yet in principle opt_ats could change value when
wired up accordingly via hypfs.
> And if that's the case, don't we want to update dte->i so that it
> matches the ATS state signaled by use_ats()?
That depends on what adjustment would be done besides changing the
variable value, if that was to become runtime changeable.
>>> Otherwise we would fail to disable IOMMU device address translation
>>> support if ATS was disabled?
>>
>> I think the answer here is "no", but with the above I'm not really sure
>> here, either.
>
> Given the current logic in use_ats() AFAICT the return value of that
> function should not change for a given device?
Right now it shouldn't, yes. I'm still a little hesitant to have callers
make implications from that.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
` (4 preceding siblings ...)
2024-02-15 10:15 ` [PATCH v2 05/12] IOMMU: rename and re-type ats_enabled Jan Beulich
@ 2024-02-15 10:16 ` Jan Beulich
2024-05-06 13:38 ` Roger Pau Monné
2025-10-23 13:30 ` Teddy Astie
2024-02-15 10:16 ` [PATCH v2 07/12] VT-d: replace find_ats_dev_drhd() Jan Beulich
` (5 subsequent siblings)
11 siblings, 2 replies; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:16 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
When the flag is set, permit Dom0 to control the device (no worse than
what we had before and in line with other "best effort" behavior we use
when it comes to Dom0), but suppress passing through to DomU-s unless
ATS can actually be enabled for such devices (and was explicitly enabled
on the command line).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Re-base over new earlier patches.
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -225,7 +225,11 @@ exceptions (watchdog NMIs and unexpected
> Default: `false`
Permits Xen to set up and use PCI Address Translation Services. This is a
-performance optimisation for PCI Passthrough.
+performance optimisation for PCI Passthrough. Note that firmware may indicate
+that certain devices need to have ATS enabled for proper operation. For such
+devices ATS will be enabled by default, unless the option is used in its
+negative form. Such devices will still not be eligible for passing through to
+guests, unless the option is used in its positive form.
**WARNING: Xen cannot currently safely use ATS because of its synchronous wait
loops for Queued Invalidation completions.**
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -253,6 +253,24 @@ struct acpi_atsr_unit *acpi_find_matched
return all_ports;
}
+const struct acpi_satc_unit *acpi_find_matched_satc_unit(
+ const struct pci_dev *pdev)
+{
+ const struct acpi_satc_unit *satc;
+
+ list_for_each_entry ( satc, &acpi_satc_units, list )
+ {
+ if ( satc->segment != pdev->seg )
+ continue;
+
+ for ( unsigned int i = 0; i < satc->scope.devices_cnt; ++i )
+ if ( satc->scope.devices[i] == pdev->sbdf.bdf )
+ return satc;
+ }
+
+ return NULL;
+}
+
struct acpi_rhsa_unit *drhd_to_rhsa(const struct acpi_drhd_unit *drhd)
{
struct acpi_rhsa_unit *rhsa;
--- a/xen/drivers/passthrough/vtd/dmar.h
+++ b/xen/drivers/passthrough/vtd/dmar.h
@@ -112,6 +112,8 @@ struct acpi_satc_unit {
struct acpi_drhd_unit *acpi_find_matched_drhd_unit(const struct pci_dev *);
struct acpi_atsr_unit *acpi_find_matched_atsr_unit(const struct pci_dev *);
+const struct acpi_satc_unit *acpi_find_matched_satc_unit(
+ const struct pci_dev *pdev);
#define DMAR_TYPE 1
#define RMRR_TYPE 2
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2364,6 +2364,26 @@ static int cf_check intel_iommu_add_devi
if ( ret )
dprintk(XENLOG_ERR VTDPREFIX, "%pd: context mapping failed\n",
pdev->domain);
+ else if ( !pdev->broken )
+ {
+ const struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
+ const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
+
+ /*
+ * Prevent the device from getting assigned to an unprivileged domain
+ * when firmware indicates ATS is required, but ATS could not be enabled
+ * or was not explicitly enabled via command line option.
+ */
+ if ( satc && satc->atc_required &&
+ (!drhd || ats_device(pdev, drhd) <= 0 ||
+ !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ||
+ opt_ats < 0) )
+ {
+ printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
+ &pdev->sbdf);
+ pdev->broken = true;
+ }
+ }
return ret;
}
@@ -2375,12 +2395,26 @@ static int cf_check intel_iommu_enable_d
pci_vtd_quirk(pdev);
- if ( ret <= 0 )
- return ret;
+ if ( ret <= 0 ||
+ (ret = enable_ats_device(pdev, &drhd->iommu->ats_devices)) < 0 ||
+ opt_ats < 0 )
+ {
+ const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
+
+ /*
+ * Besides in error cases also prevent the device from getting assigned
+ * to an unprivileged domain when firmware indicates ATS is required,
+ * but ATS use was not explicitly enabled via command line option.
+ */
+ if ( satc && satc->atc_required && !pdev->broken )
+ {
+ printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
+ &pdev->sbdf);
+ pdev->broken = true;
+ }
+ }
- ret = enable_ats_device(pdev, &drhd->iommu->ats_devices);
-
- return ret >= 0 ? 0 : ret;
+ return ret <= 0 ? ret : 0;
}
static int cf_check intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -45,8 +45,9 @@ int ats_device(const struct pci_dev *pde
{
struct acpi_drhd_unit *ats_drhd;
unsigned int pos, expfl = 0;
+ const struct acpi_satc_unit *satc;
- if ( opt_ats <= 0 || !iommu_qinval )
+ if ( !opt_ats || !iommu_qinval )
return 0;
if ( !ecap_queued_inval(drhd->iommu->ecap) ||
@@ -61,6 +62,10 @@ int ats_device(const struct pci_dev *pde
!acpi_find_matched_atsr_unit(pdev) )
return 0;
+ satc = acpi_find_matched_satc_unit(pdev);
+ if ( opt_ats < 0 && (!satc || !satc->atc_required) )
+ return 0;
+
ats_drhd = find_ats_dev_drhd(drhd->iommu);
pos = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_ATS);
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag
2024-02-15 10:16 ` [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag Jan Beulich
@ 2024-05-06 13:38 ` Roger Pau Monné
2024-05-15 10:42 ` Jan Beulich
2025-10-23 13:30 ` Teddy Astie
1 sibling, 1 reply; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-06 13:38 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On Thu, Feb 15, 2024 at 11:16:11AM +0100, Jan Beulich wrote:
> When the flag is set, permit Dom0 to control the device (no worse than
> what we had before and in line with other "best effort" behavior we use
> when it comes to Dom0),
I think we should somehow be able to signal dom0 that this device
might not operate as expected, otherwise dom0 might use it and the
device could silently malfunction due to ATS not being enabled.
Otherwise we should just hide the device from dom0.
I assume setting the IOMMU context entry to passthrough mode would
also be fine for such devices that require ATS?
> but suppress passing through to DomU-s unless
> ATS can actually be enabled for such devices (and was explicitly enabled
> on the command line).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Re-base over new earlier patches.
>
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -225,7 +225,11 @@ exceptions (watchdog NMIs and unexpected
> > Default: `false`
>
> Permits Xen to set up and use PCI Address Translation Services. This is a
> -performance optimisation for PCI Passthrough.
> +performance optimisation for PCI Passthrough. Note that firmware may indicate
> +that certain devices need to have ATS enabled for proper operation. For such
> +devices ATS will be enabled by default, unless the option is used in its
> +negative form. Such devices will still not be eligible for passing through to
> +guests, unless the option is used in its positive form.
>
> **WARNING: Xen cannot currently safely use ATS because of its synchronous wait
> loops for Queued Invalidation completions.**
> --- a/xen/drivers/passthrough/vtd/dmar.c
> +++ b/xen/drivers/passthrough/vtd/dmar.c
> @@ -253,6 +253,24 @@ struct acpi_atsr_unit *acpi_find_matched
> return all_ports;
> }
>
> +const struct acpi_satc_unit *acpi_find_matched_satc_unit(
> + const struct pci_dev *pdev)
> +{
> + const struct acpi_satc_unit *satc;
> +
> + list_for_each_entry ( satc, &acpi_satc_units, list )
> + {
> + if ( satc->segment != pdev->seg )
> + continue;
> +
> + for ( unsigned int i = 0; i < satc->scope.devices_cnt; ++i )
> + if ( satc->scope.devices[i] == pdev->sbdf.bdf )
> + return satc;
> + }
> +
> + return NULL;
> +}
> +
> struct acpi_rhsa_unit *drhd_to_rhsa(const struct acpi_drhd_unit *drhd)
> {
> struct acpi_rhsa_unit *rhsa;
> --- a/xen/drivers/passthrough/vtd/dmar.h
> +++ b/xen/drivers/passthrough/vtd/dmar.h
> @@ -112,6 +112,8 @@ struct acpi_satc_unit {
>
> struct acpi_drhd_unit *acpi_find_matched_drhd_unit(const struct pci_dev *);
> struct acpi_atsr_unit *acpi_find_matched_atsr_unit(const struct pci_dev *);
> +const struct acpi_satc_unit *acpi_find_matched_satc_unit(
> + const struct pci_dev *pdev);
>
> #define DMAR_TYPE 1
> #define RMRR_TYPE 2
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2364,6 +2364,26 @@ static int cf_check intel_iommu_add_devi
> if ( ret )
> dprintk(XENLOG_ERR VTDPREFIX, "%pd: context mapping failed\n",
> pdev->domain);
> + else if ( !pdev->broken )
> + {
> + const struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
> + const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
> +
> + /*
> + * Prevent the device from getting assigned to an unprivileged domain
> + * when firmware indicates ATS is required, but ATS could not be enabled
> + * or was not explicitly enabled via command line option.
> + */
> + if ( satc && satc->atc_required &&
> + (!drhd || ats_device(pdev, drhd) <= 0 ||
> + !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ||
> + opt_ats < 0) )
Do you need the opt_ats check here?
I don't think it's possible for pci_ats_enabled() to return true if
opt_ats is <= 0, and hence the opt_ats < 0 check can be dropped from
the conditional?
> + {
> + printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
> + &pdev->sbdf);
> + pdev->broken = true;
> + }
> + }
>
> return ret;
> }
> @@ -2375,12 +2395,26 @@ static int cf_check intel_iommu_enable_d
>
> pci_vtd_quirk(pdev);
>
> - if ( ret <= 0 )
> - return ret;
> + if ( ret <= 0 ||
> + (ret = enable_ats_device(pdev, &drhd->iommu->ats_devices)) < 0 ||
> + opt_ats < 0 )
Shouldn't this be opt_ats <= 0?
> + {
> + const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
> +
> + /*
> + * Besides in error cases also prevent the device from getting assigned
> + * to an unprivileged domain when firmware indicates ATS is required,
> + * but ATS use was not explicitly enabled via command line option.
> + */
> + if ( satc && satc->atc_required && !pdev->broken )
> + {
> + printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
> + &pdev->sbdf);
> + pdev->broken = true;
> + }
I think the code here could be easier to read if this was put in an
label at the end, and the early return above that you remove becomes a
goto. But that's a question of taste.
> + }
>
> - ret = enable_ats_device(pdev, &drhd->iommu->ats_devices);
> -
> - return ret >= 0 ? 0 : ret;
> + return ret <= 0 ? ret : 0;
> }
>
> static int cf_check intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
> --- a/xen/drivers/passthrough/vtd/x86/ats.c
> +++ b/xen/drivers/passthrough/vtd/x86/ats.c
> @@ -45,8 +45,9 @@ int ats_device(const struct pci_dev *pde
> {
> struct acpi_drhd_unit *ats_drhd;
> unsigned int pos, expfl = 0;
> + const struct acpi_satc_unit *satc;
>
> - if ( opt_ats <= 0 || !iommu_qinval )
> + if ( !opt_ats || !iommu_qinval )
> return 0;
FWIW, I find this change confusing, hence my request earlier that
opt_ats must be set to 0 or 1 by the point it gets used.
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag
2024-05-06 13:38 ` Roger Pau Monné
@ 2024-05-15 10:42 ` Jan Beulich
2024-05-20 11:36 ` Roger Pau Monné
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-05-15 10:42 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On 06.05.2024 15:38, Roger Pau Monné wrote:
> On Thu, Feb 15, 2024 at 11:16:11AM +0100, Jan Beulich wrote:
>> When the flag is set, permit Dom0 to control the device (no worse than
>> what we had before and in line with other "best effort" behavior we use
>> when it comes to Dom0),
>
> I think we should somehow be able to signal dom0 that this device
> might not operate as expected, otherwise dom0 might use it and the
> device could silently malfunction due to ATS not being enabled.
Whatever signaling we invented, no Dom0 would be required to respect it,
and for (perhaps quite) some time no Dom0 kernel would even exist to query
that property.
> Otherwise we should just hide the device from dom0.
This would feel wrong to me, almost like a regression from what we had
before.
> I assume setting the IOMMU context entry to passthrough mode would
> also be fine for such devices that require ATS?
I'm afraid I'm lacking the connection of the question to what is being
done here. Can you perhaps provide some more context? To provide some
context from my side: Using pass-through mode would be excluded when Dom0
is PVH. Hence why I'm not getting why we would want to even just consider
doing so.
Yet, looking at the spec, in pass-through mode translation requests are
treated as UR. So maybe your question was towards there needing to be
handling (whichever way) for the case where pass-through mode was
requested for PV Dom0? The only half-way sensible thing to do in that case
that I can think of right now would be to ignore that command line option,
just like we do when Dom0 is PVH. Yet that would equally apply to use of
"ats" on the command line, i.e. would likely first require yet another
separate patch. Except that in the "ats" case it may be reasonable to
instead panic(), for there being conflicting requests on the command line
(and it being unclear which one would be better to ignore).
>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -2364,6 +2364,26 @@ static int cf_check intel_iommu_add_devi
>> if ( ret )
>> dprintk(XENLOG_ERR VTDPREFIX, "%pd: context mapping failed\n",
>> pdev->domain);
>> + else if ( !pdev->broken )
>> + {
>> + const struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
>> + const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
>> +
>> + /*
>> + * Prevent the device from getting assigned to an unprivileged domain
>> + * when firmware indicates ATS is required, but ATS could not be enabled
>> + * or was not explicitly enabled via command line option.
>> + */
>> + if ( satc && satc->atc_required &&
>> + (!drhd || ats_device(pdev, drhd) <= 0 ||
>> + !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ||
>> + opt_ats < 0) )
>
> Do you need the opt_ats check here?
>
> I don't think it's possible for pci_ats_enabled() to return true if
> opt_ats is <= 0, and hence the opt_ats < 0 check can be dropped from
> the conditional?
In the present tristate mode of opt_ats a device can have ATS enabled when
opt_ats is -1 (i.e. no command line override): For devices with ATC_REQUIRED
set.
>> @@ -2375,12 +2395,26 @@ static int cf_check intel_iommu_enable_d
>>
>> pci_vtd_quirk(pdev);
>>
>> - if ( ret <= 0 )
>> - return ret;
>> + if ( ret <= 0 ||
>> + (ret = enable_ats_device(pdev, &drhd->iommu->ats_devices)) < 0 ||
>> + opt_ats < 0 )
>
> Shouldn't this be opt_ats <= 0?
No, again not as long as this variable is a tristate one.
>> --- a/xen/drivers/passthrough/vtd/x86/ats.c
>> +++ b/xen/drivers/passthrough/vtd/x86/ats.c
>> @@ -45,8 +45,9 @@ int ats_device(const struct pci_dev *pde
>> {
>> struct acpi_drhd_unit *ats_drhd;
>> unsigned int pos, expfl = 0;
>> + const struct acpi_satc_unit *satc;
>>
>> - if ( opt_ats <= 0 || !iommu_qinval )
>> + if ( !opt_ats || !iommu_qinval )
>> return 0;
>
> FWIW, I find this change confusing, hence my request earlier that
> opt_ats must be set to 0 or 1 by the point it gets used.
Right, but as said in particular on the subthread of patch 5, for now it has
to remain a full tristate. Whereas if the spec was changed, I expect the
variable could be switched to bool, and hence no overriding from -1 to 0/1
would be needed anymore at all.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag
2024-05-15 10:42 ` Jan Beulich
@ 2024-05-20 11:36 ` Roger Pau Monné
2024-05-21 6:25 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Roger Pau Monné @ 2024-05-20 11:36 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On Wed, May 15, 2024 at 12:42:40PM +0200, Jan Beulich wrote:
> On 06.05.2024 15:38, Roger Pau Monné wrote:
> > On Thu, Feb 15, 2024 at 11:16:11AM +0100, Jan Beulich wrote:
> >> When the flag is set, permit Dom0 to control the device (no worse than
> >> what we had before and in line with other "best effort" behavior we use
> >> when it comes to Dom0),
> >
> > I think we should somehow be able to signal dom0 that this device
> > might not operate as expected, otherwise dom0 might use it and the
> > device could silently malfunction due to ATS not being enabled.
>
> Whatever signaling we invented, no Dom0 would be required to respect it,
> and for (perhaps quite) some time no Dom0 kernel would even exist to query
> that property.
>
> > Otherwise we should just hide the device from dom0.
>
> This would feel wrong to me, almost like a regression from what we had
> before.
Exposing a device to dom0 that won't be functional doesn't seem like a
very wise choice from Xen TBH.
> > I assume setting the IOMMU context entry to passthrough mode would
> > also be fine for such devices that require ATS?
>
> I'm afraid I'm lacking the connection of the question to what is being
> done here. Can you perhaps provide some more context? To provide some
> context from my side: Using pass-through mode would be excluded when Dom0
> is PVH. Hence why I'm not getting why we would want to even just consider
> doing so.
>
> Yet, looking at the spec, in pass-through mode translation requests are
> treated as UR. So maybe your question was towards there needing to be
> handling (whichever way) for the case where pass-through mode was
> requested for PV Dom0? The only half-way sensible thing to do in that case
> that I can think of right now would be to ignore that command line option,
Hm, maybe I'm confused, but if the IOMMU device context entry is set
in pass-through mode ATS won't be enabled and hence no translation
requests would be send from the device?
IOW, devices listed in the SATC can only mandate ATS enabled when the
IOMMU is enforcing translation. IF the IOMMU is not enabled or if
the device is in passthrough mode then the requirement for having ATS
enabled no longer applies.
Thanks, Roger.
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag
2024-05-20 11:36 ` Roger Pau Monné
@ 2024-05-21 6:25 ` Jan Beulich
0 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2024-05-21 6:25 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Kevin Tian, Paul Durrant
On 20.05.2024 13:36, Roger Pau Monné wrote:
> On Wed, May 15, 2024 at 12:42:40PM +0200, Jan Beulich wrote:
>> On 06.05.2024 15:38, Roger Pau Monné wrote:
>>> On Thu, Feb 15, 2024 at 11:16:11AM +0100, Jan Beulich wrote:
>>>> When the flag is set, permit Dom0 to control the device (no worse than
>>>> what we had before and in line with other "best effort" behavior we use
>>>> when it comes to Dom0),
>>>
>>> I think we should somehow be able to signal dom0 that this device
>>> might not operate as expected, otherwise dom0 might use it and the
>>> device could silently malfunction due to ATS not being enabled.
>>
>> Whatever signaling we invented, no Dom0 would be required to respect it,
>> and for (perhaps quite) some time no Dom0 kernel would even exist to query
>> that property.
>>
>>> Otherwise we should just hide the device from dom0.
>>
>> This would feel wrong to me, almost like a regression from what we had
>> before.
>
> Exposing a device to dom0 that won't be functional doesn't seem like a
> very wise choice from Xen TBH.
Yes but. That's what we're doing right now, after all.
>>> I assume setting the IOMMU context entry to passthrough mode would
>>> also be fine for such devices that require ATS?
>>
>> I'm afraid I'm lacking the connection of the question to what is being
>> done here. Can you perhaps provide some more context? To provide some
>> context from my side: Using pass-through mode would be excluded when Dom0
>> is PVH. Hence why I'm not getting why we would want to even just consider
>> doing so.
>>
>> Yet, looking at the spec, in pass-through mode translation requests are
>> treated as UR. So maybe your question was towards there needing to be
>> handling (whichever way) for the case where pass-through mode was
>> requested for PV Dom0? The only half-way sensible thing to do in that case
>> that I can think of right now would be to ignore that command line option,
>
> Hm, maybe I'm confused, but if the IOMMU device context entry is set
> in pass-through mode ATS won't be enabled and hence no translation
> requests would be send from the device?
>
> IOW, devices listed in the SATC can only mandate ATS enabled when the
> IOMMU is enforcing translation. IF the IOMMU is not enabled or if
> the device is in passthrough mode then the requirement for having ATS
> enabled no longer applies.
Oh, I think I now get what your original question was about: Instead of
enabling ATS on such devices, we might run them in pass-through mode.
For PV that would appear to be an option, yes. But with PVH (presumably)
being the future I'd be rather hesitant to go that route.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag
2024-02-15 10:16 ` [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag Jan Beulich
2024-05-06 13:38 ` Roger Pau Monné
@ 2025-10-23 13:30 ` Teddy Astie
2025-10-27 11:23 ` Jan Beulich
1 sibling, 1 reply; 38+ messages in thread
From: Teddy Astie @ 2025-10-23 13:30 UTC (permalink / raw)
To: Jan Beulich, xen-devel; +Cc: Roger Pau Monné, Andrew Cooper
Le 23/10/2025 à 15:14, Jan Beulich a écrit :
> When the flag is set, permit Dom0 to control the device (no worse than
> what we had before and in line with other "best effort" behavior we use
> when it comes to Dom0), but suppress passing through to DomU-s unless
> ATS can actually be enabled for such devices (and was explicitly enabled
> on the command line).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Re-base over new earlier patches.
>
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -225,7 +225,11 @@ exceptions (watchdog NMIs and unexpected
> > Default: `false`
>
> Permits Xen to set up and use PCI Address Translation Services. This is a
> -performance optimisation for PCI Passthrough.
> +performance optimisation for PCI Passthrough. Note that firmware may indicate
> +that certain devices need to have ATS enabled for proper operation. For such
> +devices ATS will be enabled by default, unless the option is used in its
> +negative form. Such devices will still not be eligible for passing through to
> +guests, unless the option is used in its positive form.
>
> **WARNING: Xen cannot currently safely use ATS because of its synchronous wait
> loops for Queued Invalidation completions.**
Do we want to address the warning before attempting to unconditionnaly
enable ATS in these scenarios ? A unstable hypervisor is likely worse
than a non-functionning device to me.
Or at least log a warning that ATS is enabled due to a device requiring it.
> --- a/xen/drivers/passthrough/vtd/dmar.c
> +++ b/xen/drivers/passthrough/vtd/dmar.c
> @@ -253,6 +253,24 @@ struct acpi_atsr_unit *acpi_find_matched
> return all_ports;
> }
>
> +const struct acpi_satc_unit *acpi_find_matched_satc_unit(
> + const struct pci_dev *pdev)
> +{
> + const struct acpi_satc_unit *satc;
> +
> + list_for_each_entry ( satc, &acpi_satc_units, list )
> + {
> + if ( satc->segment != pdev->seg )
> + continue;
> +
> + for ( unsigned int i = 0; i < satc->scope.devices_cnt; ++i )
> + if ( satc->scope.devices[i] == pdev->sbdf.bdf )
> + return satc;
> + }
> +
> + return NULL;
> +}
> +
> struct acpi_rhsa_unit *drhd_to_rhsa(const struct acpi_drhd_unit *drhd)
> {
> struct acpi_rhsa_unit *rhsa;
> --- a/xen/drivers/passthrough/vtd/dmar.h
> +++ b/xen/drivers/passthrough/vtd/dmar.h
> @@ -112,6 +112,8 @@ struct acpi_satc_unit {
>
> struct acpi_drhd_unit *acpi_find_matched_drhd_unit(const struct pci_dev *);
> struct acpi_atsr_unit *acpi_find_matched_atsr_unit(const struct pci_dev *);
> +const struct acpi_satc_unit *acpi_find_matched_satc_unit(
> + const struct pci_dev *pdev);
>
> #define DMAR_TYPE 1
> #define RMRR_TYPE 2
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2364,6 +2364,26 @@ static int cf_check intel_iommu_add_devi
> if ( ret )
> dprintk(XENLOG_ERR VTDPREFIX, "%pd: context mapping failed\n",
> pdev->domain);
> + else if ( !pdev->broken )
> + {
> + const struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
> + const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
> +
> + /*
> + * Prevent the device from getting assigned to an unprivileged domain
> + * when firmware indicates ATS is required, but ATS could not be enabled
> + * or was not explicitly enabled via command line option.
> + */
> + if ( satc && satc->atc_required &&
> + (!drhd || ats_device(pdev, drhd) <= 0 ||
> + !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ||
> + opt_ats < 0) )
> + {
> + printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
> + &pdev->sbdf);
> + pdev->broken = true;
> + }
> + }
I don't feel pdev->broken is the right way for signaling ineligibility
for passthrough due to policy (ATS required).
Especially if we eventually consider in the future allowing on a
per-domain basis the ability to use ATS (starting with Dom0).
>
> return ret;
> }
> @@ -2375,12 +2395,26 @@ static int cf_check intel_iommu_enable_d
>
> pci_vtd_quirk(pdev);
>
> - if ( ret <= 0 )
> - return ret;
> + if ( ret <= 0 ||
> + (ret = enable_ats_device(pdev, &drhd->iommu->ats_devices)) < 0 ||
> + opt_ats < 0 )
> + {
> + const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
> +
> + /*
> + * Besides in error cases also prevent the device from getting assigned
> + * to an unprivileged domain when firmware indicates ATS is required,
> + * but ATS use was not explicitly enabled via command line option.
> + */
> + if ( satc && satc->atc_required && !pdev->broken )
> + {
> + printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
> + &pdev->sbdf);
> + pdev->broken = true;
> + }
> + }
>
> - ret = enable_ats_device(pdev, &drhd->iommu->ats_devices);
> -
> - return ret >= 0 ? 0 : ret;
> + return ret <= 0 ? ret : 0;
> }
>
> static int cf_check intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
> --- a/xen/drivers/passthrough/vtd/x86/ats.c
> +++ b/xen/drivers/passthrough/vtd/x86/ats.c
> @@ -45,8 +45,9 @@ int ats_device(const struct pci_dev *pde
> {
> struct acpi_drhd_unit *ats_drhd;
> unsigned int pos, expfl = 0;
> + const struct acpi_satc_unit *satc;
>
> - if ( opt_ats <= 0 || !iommu_qinval )
> + if ( !opt_ats || !iommu_qinval )
> return 0;
>
> if ( !ecap_queued_inval(drhd->iommu->ecap) ||
> @@ -61,6 +62,10 @@ int ats_device(const struct pci_dev *pde
> !acpi_find_matched_atsr_unit(pdev) )
> return 0;
>
> + satc = acpi_find_matched_satc_unit(pdev);
> + if ( opt_ats < 0 && (!satc || !satc->atc_required) )
> + return 0;
> +
> ats_drhd = find_ats_dev_drhd(drhd->iommu);
> pos = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_ATS);
>
>
>
>
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag
2025-10-23 13:30 ` Teddy Astie
@ 2025-10-27 11:23 ` Jan Beulich
0 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2025-10-27 11:23 UTC (permalink / raw)
To: Teddy Astie; +Cc: Roger Pau Monné, Andrew Cooper, xen-devel
On 23.10.2025 15:30, Teddy Astie wrote:
> Le 23/10/2025 à 15:14, Jan Beulich a écrit :
>> When the flag is set, permit Dom0 to control the device (no worse than
>> what we had before and in line with other "best effort" behavior we use
>> when it comes to Dom0), but suppress passing through to DomU-s unless
>> ATS can actually be enabled for such devices (and was explicitly enabled
>> on the command line).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> v2: Re-base over new earlier patches.
>>
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -225,7 +225,11 @@ exceptions (watchdog NMIs and unexpected
>> > Default: `false`
>>
>> Permits Xen to set up and use PCI Address Translation Services. This is a
>> -performance optimisation for PCI Passthrough.
>> +performance optimisation for PCI Passthrough. Note that firmware may indicate
>> +that certain devices need to have ATS enabled for proper operation. For such
>> +devices ATS will be enabled by default, unless the option is used in its
>> +negative form. Such devices will still not be eligible for passing through to
>> +guests, unless the option is used in its positive form.
>>
>> **WARNING: Xen cannot currently safely use ATS because of its synchronous wait
>> loops for Queued Invalidation completions.**
>
> Do we want to address the warning before attempting to unconditionnaly
> enable ATS in these scenarios ? A unstable hypervisor is likely worse
> than a non-functionning device to me.
Addressing this requires, afaict, lots of changes. Such devices also can still
only be used by Dom0 unless ATS is explicitly enabled from the command line.
Whether a non-functioning device is worse than a (only possibly) "unstable"
hypervisor is also hard to tell. Dom0 may fail to boot if the "right" device is
affected. ("Possibly" because the synchronous wait loops of course are of
concern only if they end up taking long.)
> Or at least log a warning that ATS is enabled due to a device requiring it.
This would need to be a per-device message, which may not scale very well.
>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -2364,6 +2364,26 @@ static int cf_check intel_iommu_add_devi
>> if ( ret )
>> dprintk(XENLOG_ERR VTDPREFIX, "%pd: context mapping failed\n",
>> pdev->domain);
>> + else if ( !pdev->broken )
>> + {
>> + const struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
>> + const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
>> +
>> + /*
>> + * Prevent the device from getting assigned to an unprivileged domain
>> + * when firmware indicates ATS is required, but ATS could not be enabled
>> + * or was not explicitly enabled via command line option.
>> + */
>> + if ( satc && satc->atc_required &&
>> + (!drhd || ats_device(pdev, drhd) <= 0 ||
>> + !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ||
>> + opt_ats < 0) )
>> + {
>> + printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
>> + &pdev->sbdf);
>> + pdev->broken = true;
>> + }
>> + }
>
> I don't feel pdev->broken is the right way for signaling ineligibility
> for passthrough due to policy (ATS required).
> Especially if we eventually consider in the future allowing on a
> per-domain basis the ability to use ATS (starting with Dom0).
Well, pdev->broken is what we have available. Anything better can come later,
imo. For now the goal has been to at least get in line with the spec. That said,
while - afaik - not written down anywhere, back at the time I got indications
that the "required" in ATC_REQUIRED may not be as strict an indication as the
word may suggest.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v2 07/12] VT-d: replace find_ats_dev_drhd()
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
` (5 preceding siblings ...)
2024-02-15 10:16 ` [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag Jan Beulich
@ 2024-02-15 10:16 ` Jan Beulich
2024-02-15 10:17 ` [PATCH v2 08/12] VT-d: move ats_device() to the sole file it's used from Jan Beulich
` (4 subsequent siblings)
11 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:16 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
All callers only care about boolean outcome. For this there's no point
in allocating a duplicate of the respective DRHD structure; a simple
boolean suffices (which eventually may want to become a count, such that
the "any ATS devices assigned state" can also clear again). With that
boolean, remove respective parameters from internal helper functions
right away, as those have access to the flag through another parameter.
(While the corresponding hook can have the parameter dropped too then,
leave this for a separate change, to limit scope here.)
No functional effect intended.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
v2: Re-base over new earlier patches.
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -65,8 +65,6 @@ struct acpi_drhd_unit *ioapic_to_drhd(un
struct acpi_drhd_unit *hpet_to_drhd(unsigned int hpet_id);
struct acpi_rhsa_unit *drhd_to_rhsa(const struct acpi_drhd_unit *drhd);
-struct acpi_drhd_unit *find_ats_dev_drhd(struct vtd_iommu *iommu);
-
int ats_device(const struct pci_dev *, const struct acpi_drhd_unit *);
int dev_invalidate_iotlb(struct vtd_iommu *iommu, u16 did,
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -624,8 +624,7 @@ int cf_check vtd_flush_iotlb_reg(
}
static int __must_check iommu_flush_iotlb_global(struct vtd_iommu *iommu,
- bool flush_non_present_entry,
- bool flush_dev_iotlb)
+ bool flush_non_present_entry)
{
int status;
@@ -633,7 +632,7 @@ static int __must_check iommu_flush_iotl
vtd_ops_preamble_quirk(iommu);
status = iommu->flush.iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH,
- flush_non_present_entry, flush_dev_iotlb);
+ flush_non_present_entry, iommu->flush_dev_iotlb);
/* undo platform specific errata workarounds */
vtd_ops_postamble_quirk(iommu);
@@ -642,8 +641,7 @@ static int __must_check iommu_flush_iotl
}
static int __must_check iommu_flush_iotlb_dsi(struct vtd_iommu *iommu, u16 did,
- bool flush_non_present_entry,
- bool flush_dev_iotlb)
+ bool flush_non_present_entry)
{
int status;
@@ -651,7 +649,7 @@ static int __must_check iommu_flush_iotl
vtd_ops_preamble_quirk(iommu);
status = iommu->flush.iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH,
- flush_non_present_entry, flush_dev_iotlb);
+ flush_non_present_entry, iommu->flush_dev_iotlb);
/* undo platform specific errata workarounds */
vtd_ops_postamble_quirk(iommu);
@@ -661,26 +659,23 @@ static int __must_check iommu_flush_iotl
static int __must_check iommu_flush_iotlb_psi(struct vtd_iommu *iommu, u16 did,
u64 addr, unsigned int order,
- bool flush_non_present_entry,
- bool flush_dev_iotlb)
+ bool flush_non_present_entry)
{
int status;
/* Fallback to domain selective flush if no PSI support */
if ( !cap_pgsel_inv(iommu->cap) )
- return iommu_flush_iotlb_dsi(iommu, did, flush_non_present_entry,
- flush_dev_iotlb);
+ return iommu_flush_iotlb_dsi(iommu, did, flush_non_present_entry);
/* Fallback to domain selective flush if size is too big */
if ( order > cap_max_amask_val(iommu->cap) )
- return iommu_flush_iotlb_dsi(iommu, did, flush_non_present_entry,
- flush_dev_iotlb);
+ return iommu_flush_iotlb_dsi(iommu, did, flush_non_present_entry);
/* apply platform specific errata workarounds */
vtd_ops_preamble_quirk(iommu);
status = iommu->flush.iotlb(iommu, did, addr, order, DMA_TLB_PSI_FLUSH,
- flush_non_present_entry, flush_dev_iotlb);
+ flush_non_present_entry, iommu->flush_dev_iotlb);
/* undo platform specific errata workarounds */
vtd_ops_postamble_quirk(iommu);
@@ -692,7 +687,6 @@ static int __must_check iommu_flush_all(
{
struct acpi_drhd_unit *drhd;
struct vtd_iommu *iommu;
- bool flush_dev_iotlb;
int rc = 0;
flush_local(FLUSH_CACHE);
@@ -703,8 +697,7 @@ static int __must_check iommu_flush_all(
iommu = drhd->iommu;
context_rc = iommu_flush_context_global(iommu, 0);
- flush_dev_iotlb = !!find_ats_dev_drhd(iommu);
- iotlb_rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
+ iotlb_rc = iommu_flush_iotlb_global(iommu, 0);
/*
* The current logic for returns:
@@ -734,7 +727,6 @@ static int __must_check cf_check iommu_f
struct domain_iommu *hd = dom_iommu(d);
struct acpi_drhd_unit *drhd;
struct vtd_iommu *iommu;
- bool flush_dev_iotlb;
int iommu_domid;
int ret = 0;
@@ -762,21 +754,18 @@ static int __must_check cf_check iommu_f
if ( !test_bit(iommu->index, hd->arch.vtd.iommu_bitmap) )
continue;
- flush_dev_iotlb = !!find_ats_dev_drhd(iommu);
iommu_domid = get_iommu_did(d->domain_id, iommu, !d->is_dying);
if ( iommu_domid == -1 )
continue;
if ( !page_count || (page_count & (page_count - 1)) ||
dfn_eq(dfn, INVALID_DFN) || !IS_ALIGNED(dfn_x(dfn), page_count) )
- rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
- 0, flush_dev_iotlb);
+ rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0);
else
rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
dfn_to_daddr(dfn),
get_order_from_pages(page_count),
- !(flush_flags & IOMMU_FLUSHF_modified),
- flush_dev_iotlb);
+ !(flush_flags & IOMMU_FLUSHF_modified));
if ( rc > 0 )
iommu_flush_write_buffer(iommu);
@@ -1488,7 +1477,6 @@ int domain_context_mapping_one(
uint16_t seg = iommu->drhd->segment, prev_did = 0;
struct domain *prev_dom = NULL;
int rc, ret;
- bool flush_dev_iotlb;
if ( QUARANTINE_SKIP(domain, pgd_maddr) )
return 0;
@@ -1637,8 +1625,7 @@ int domain_context_mapping_one(
rc = iommu_flush_context_device(iommu, prev_did, PCI_BDF(bus, devfn),
DMA_CCMD_MASK_NOBIT, !prev_dom);
- flush_dev_iotlb = !!find_ats_dev_drhd(iommu);
- ret = iommu_flush_iotlb_dsi(iommu, prev_did, !prev_dom, flush_dev_iotlb);
+ ret = iommu_flush_iotlb_dsi(iommu, prev_did, !prev_dom);
/*
* The current logic for returns:
@@ -1882,7 +1869,6 @@ int domain_context_unmap_one(
struct context_entry *context, *context_entries;
u64 maddr;
int iommu_domid, rc, ret;
- bool flush_dev_iotlb;
ASSERT(pcidevs_locked());
spin_lock(&iommu->lock);
@@ -1908,8 +1894,7 @@ int domain_context_unmap_one(
PCI_BDF(bus, devfn),
DMA_CCMD_MASK_NOBIT, 0);
- flush_dev_iotlb = !!find_ats_dev_drhd(iommu);
- ret = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb);
+ ret = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0);
/*
* The current logic for returns:
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -484,6 +484,7 @@ struct vtd_iommu {
spinlock_t register_lock; /* protect iommu register handling */
u64 root_maddr; /* root entry machine address */
nodeid_t node;
+ bool flush_dev_iotlb;
struct msi_desc msi;
struct acpi_drhd_unit *drhd;
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -28,22 +28,8 @@
#include "../extern.h"
#include "../../ats.h"
-static LIST_HEAD(ats_dev_drhd_units);
-
-struct acpi_drhd_unit *find_ats_dev_drhd(struct vtd_iommu *iommu)
-{
- struct acpi_drhd_unit *drhd;
- list_for_each_entry ( drhd, &ats_dev_drhd_units, list )
- {
- if ( drhd->iommu == iommu )
- return drhd;
- }
- return NULL;
-}
-
int ats_device(const struct pci_dev *pdev, const struct acpi_drhd_unit *drhd)
{
- struct acpi_drhd_unit *ats_drhd;
unsigned int pos, expfl = 0;
const struct acpi_satc_unit *satc;
@@ -66,17 +52,10 @@ int ats_device(const struct pci_dev *pde
if ( opt_ats < 0 && (!satc || !satc->atc_required) )
return 0;
- ats_drhd = find_ats_dev_drhd(drhd->iommu);
pos = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_ATS);
+ if ( pos )
+ drhd->iommu->flush_dev_iotlb = true;
- if ( pos && (ats_drhd == NULL) )
- {
- ats_drhd = xmalloc(struct acpi_drhd_unit);
- if ( !ats_drhd )
- return -ENOMEM;
- *ats_drhd = *drhd;
- list_add_tail(&ats_drhd->list, &ats_dev_drhd_units);
- }
return pos;
}
^ permalink raw reply [flat|nested] 38+ messages in thread* [PATCH v2 08/12] VT-d: move ats_device() to the sole file it's used from
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
` (6 preceding siblings ...)
2024-02-15 10:16 ` [PATCH v2 07/12] VT-d: replace find_ats_dev_drhd() Jan Beulich
@ 2024-02-15 10:17 ` Jan Beulich
2024-02-15 10:18 ` [PATCH v2 09/12] VT-d: move dev_invalidate_iotlb() " Jan Beulich
` (3 subsequent siblings)
11 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:17 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
..., thus allowing it to become static, and thus reducing scope overlap
between it and pci_ats_device(). There's nothing x86-specific about this
function anyway.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -65,8 +65,6 @@ struct acpi_drhd_unit *ioapic_to_drhd(un
struct acpi_drhd_unit *hpet_to_drhd(unsigned int hpet_id);
struct acpi_rhsa_unit *drhd_to_rhsa(const struct acpi_drhd_unit *drhd);
-int ats_device(const struct pci_dev *, const struct acpi_drhd_unit *);
-
int dev_invalidate_iotlb(struct vtd_iommu *iommu, u16 did,
u64 addr, unsigned int size_order, u64 type);
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1455,6 +1455,38 @@ static void __hwdom_init cf_check intel_
}
}
+static int ats_device(const struct pci_dev *pdev,
+ const struct acpi_drhd_unit *drhd)
+{
+ unsigned int pos, expfl = 0;
+ const struct acpi_satc_unit *satc;
+
+ if ( !opt_ats || !iommu_qinval )
+ return 0;
+
+ if ( !ecap_queued_inval(drhd->iommu->ecap) ||
+ !ecap_dev_iotlb(drhd->iommu->ecap) )
+ return 0;
+
+ pos = pci_find_cap_offset(pdev->sbdf, PCI_CAP_ID_EXP);
+ if ( pos )
+ expfl = pci_conf_read16(pdev->sbdf, pos + PCI_EXP_FLAGS);
+
+ if ( MASK_EXTR(expfl, PCI_EXP_FLAGS_TYPE) != PCI_EXP_TYPE_RC_END &&
+ !acpi_find_matched_atsr_unit(pdev) )
+ return 0;
+
+ satc = acpi_find_matched_satc_unit(pdev);
+ if ( opt_ats < 0 && (!satc || !satc->atc_required) )
+ return 0;
+
+ pos = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_ATS);
+ if ( pos )
+ drhd->iommu->flush_dev_iotlb = true;
+
+ return pos;
+}
+
/*
* This function returns
* - a negative errno value upon error,
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -28,37 +28,6 @@
#include "../extern.h"
#include "../../ats.h"
-int ats_device(const struct pci_dev *pdev, const struct acpi_drhd_unit *drhd)
-{
- unsigned int pos, expfl = 0;
- const struct acpi_satc_unit *satc;
-
- if ( !opt_ats || !iommu_qinval )
- return 0;
-
- if ( !ecap_queued_inval(drhd->iommu->ecap) ||
- !ecap_dev_iotlb(drhd->iommu->ecap) )
- return 0;
-
- pos = pci_find_cap_offset(pdev->sbdf, PCI_CAP_ID_EXP);
- if ( pos )
- expfl = pci_conf_read16(pdev->sbdf, pos + PCI_EXP_FLAGS);
-
- if ( MASK_EXTR(expfl, PCI_EXP_FLAGS_TYPE) != PCI_EXP_TYPE_RC_END &&
- !acpi_find_matched_atsr_unit(pdev) )
- return 0;
-
- satc = acpi_find_matched_satc_unit(pdev);
- if ( opt_ats < 0 && (!satc || !satc->atc_required) )
- return 0;
-
- pos = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_ATS);
- if ( pos )
- drhd->iommu->flush_dev_iotlb = true;
-
- return pos;
-}
-
static bool device_in_domain(const struct vtd_iommu *iommu,
const struct pci_dev *pdev, uint16_t did)
{
^ permalink raw reply [flat|nested] 38+ messages in thread* [PATCH v2 09/12] VT-d: move dev_invalidate_iotlb() to the sole file it's used from
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
` (7 preceding siblings ...)
2024-02-15 10:17 ` [PATCH v2 08/12] VT-d: move ats_device() to the sole file it's used from Jan Beulich
@ 2024-02-15 10:18 ` Jan Beulich
2024-02-15 10:18 ` [PATCH v2 10/12] VT-d: move {,un}map_vtd_domain_page() Jan Beulich
` (2 subsequent siblings)
11 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:18 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
..., thus allowing it and qinval_device_iotlb_sync() to become static.
There's nothing x86-specific about the function anyway. While moving,
adjust types to better match ./CODING_STYLE (albeit use of fixed-width
types for parameters is retained to limit the effective change).
Append a UL to a constant while moving, to please Misra. Also insert
blank lines in the switch(), between non-fall-through case blocks.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
v2: Cosmetics.
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -65,13 +65,6 @@ struct acpi_drhd_unit *ioapic_to_drhd(un
struct acpi_drhd_unit *hpet_to_drhd(unsigned int hpet_id);
struct acpi_rhsa_unit *drhd_to_rhsa(const struct acpi_drhd_unit *drhd);
-int dev_invalidate_iotlb(struct vtd_iommu *iommu, u16 did,
- u64 addr, unsigned int size_order, u64 type);
-
-int __must_check qinval_device_iotlb_sync(struct vtd_iommu *iommu,
- struct pci_dev *pdev,
- u16 did, u16 size, u64 addr);
-
uint64_t alloc_pgtable_maddr(unsigned long npages, nodeid_t node);
void free_pgtable_maddr(u64 maddr);
void *map_vtd_domain_page(u64 maddr);
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -251,8 +251,9 @@ static int __must_check dev_invalidate_s
return rc;
}
-int qinval_device_iotlb_sync(struct vtd_iommu *iommu, struct pci_dev *pdev,
- u16 did, u16 size, u64 addr)
+static int qinval_device_iotlb_sync(struct vtd_iommu *iommu,
+ struct pci_dev *pdev, uint16_t did,
+ uint16_t size, paddr_t addr)
{
unsigned long flags;
unsigned int index;
@@ -282,6 +283,103 @@ int qinval_device_iotlb_sync(struct vtd_
return dev_invalidate_sync(iommu, pdev, did);
}
+static bool device_in_domain(const struct vtd_iommu *iommu,
+ const struct pci_dev *pdev, uint16_t did)
+{
+ struct root_entry *root_entry;
+ struct context_entry *ctxt_entry = NULL;
+ unsigned int tt;
+ bool found = false;
+
+ if ( unlikely(!iommu->root_maddr) )
+ {
+ ASSERT_UNREACHABLE();
+ return false;
+ }
+
+ root_entry = map_vtd_domain_page(iommu->root_maddr);
+ if ( !root_present(root_entry[pdev->bus]) )
+ goto out;
+
+ ctxt_entry = map_vtd_domain_page(root_entry[pdev->bus].val);
+ if ( context_domain_id(ctxt_entry[pdev->devfn]) != did )
+ goto out;
+
+ tt = context_translation_type(ctxt_entry[pdev->devfn]);
+ if ( tt != CONTEXT_TT_DEV_IOTLB )
+ goto out;
+
+ found = true;
+ out:
+ if ( root_entry )
+ unmap_vtd_domain_page(root_entry);
+
+ if ( ctxt_entry )
+ unmap_vtd_domain_page(ctxt_entry);
+
+ return found;
+}
+
+static int dev_invalidate_iotlb(struct vtd_iommu *iommu, uint16_t did,
+ paddr_t addr, unsigned int size_order,
+ uint64_t type)
+{
+ struct pci_dev *pdev, *temp;
+ int ret = 0;
+
+ if ( !ecap_dev_iotlb(iommu->ecap) )
+ return ret;
+
+ list_for_each_entry_safe ( pdev, temp, &iommu->ats_devices, ats.list )
+ {
+ bool sbit;
+ int rc = 0;
+
+ switch ( type )
+ {
+ case DMA_TLB_DSI_FLUSH:
+ if ( !device_in_domain(iommu, pdev, did) )
+ break;
+ /* fall through if DSI condition met */
+ case DMA_TLB_GLOBAL_FLUSH:
+ /* invalidate all translations: sbit=1,bit_63=0,bit[62:12]=1 */
+ sbit = 1;
+ addr = (~0UL << PAGE_SHIFT_4K) & 0x7FFFFFFFFFFFFFFFUL;
+ rc = qinval_device_iotlb_sync(iommu, pdev, did, sbit, addr);
+ break;
+
+ case DMA_TLB_PSI_FLUSH:
+ if ( !device_in_domain(iommu, pdev, did) )
+ break;
+
+ /* if size <= 4K, set sbit = 0, else set sbit = 1 */
+ sbit = size_order ? 1 : 0;
+
+ /* clear lower bits */
+ addr &= ~0UL << PAGE_SHIFT_4K;
+
+ /* if sbit == 1, zero out size_order bit and set lower bits to 1 */
+ if ( sbit )
+ {
+ addr &= ~((u64)PAGE_SIZE_4K << (size_order - 1));
+ addr |= (((u64)1 << (size_order - 1)) - 1) << PAGE_SHIFT_4K;
+ }
+
+ rc = qinval_device_iotlb_sync(iommu, pdev, did, sbit, addr);
+ break;
+
+ default:
+ dprintk(XENLOG_WARNING VTDPREFIX, "invalid vt-d flush type\n");
+ return -EOPNOTSUPP;
+ }
+
+ if ( !ret )
+ ret = rc;
+ }
+
+ return ret;
+}
+
static int __must_check queue_invalidate_iec_sync(struct vtd_iommu *iommu,
u8 granu, u8 im, u16 iidx)
{
--- a/xen/drivers/passthrough/vtd/x86/Makefile
+++ b/xen/drivers/passthrough/vtd/x86/Makefile
@@ -1,2 +1 @@
-obj-y += ats.o
obj-y += vtd.o
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ /dev/null
@@ -1,123 +0,0 @@
-/*
- * Copyright (c) 2006, Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of the GNU General Public License,
- * version 2, as published by the Free Software Foundation.
- *
- * This program is distributed in the hope it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program; If not, see <http://www.gnu.org/licenses/>.
- *
- * Author: Allen Kay <allen.m.kay@intel.com>
- */
-
-#include <xen/sched.h>
-#include <xen/iommu.h>
-#include <xen/time.h>
-#include <xen/pci.h>
-#include <xen/pci_regs.h>
-#include <asm/msi.h>
-#include "../iommu.h"
-#include "../dmar.h"
-#include "../vtd.h"
-#include "../extern.h"
-#include "../../ats.h"
-
-static bool device_in_domain(const struct vtd_iommu *iommu,
- const struct pci_dev *pdev, uint16_t did)
-{
- struct root_entry *root_entry;
- struct context_entry *ctxt_entry = NULL;
- unsigned int tt;
- bool found = false;
-
- if ( unlikely(!iommu->root_maddr) )
- {
- ASSERT_UNREACHABLE();
- return false;
- }
-
- root_entry = map_vtd_domain_page(iommu->root_maddr);
- if ( !root_present(root_entry[pdev->bus]) )
- goto out;
-
- ctxt_entry = map_vtd_domain_page(root_entry[pdev->bus].val);
- if ( context_domain_id(ctxt_entry[pdev->devfn]) != did )
- goto out;
-
- tt = context_translation_type(ctxt_entry[pdev->devfn]);
- if ( tt != CONTEXT_TT_DEV_IOTLB )
- goto out;
-
- found = true;
-out:
- if ( root_entry )
- unmap_vtd_domain_page(root_entry);
-
- if ( ctxt_entry )
- unmap_vtd_domain_page(ctxt_entry);
-
- return found;
-}
-
-int dev_invalidate_iotlb(struct vtd_iommu *iommu, u16 did,
- u64 addr, unsigned int size_order, u64 type)
-{
- struct pci_dev *pdev, *temp;
- int ret = 0;
-
- if ( !ecap_dev_iotlb(iommu->ecap) )
- return ret;
-
- list_for_each_entry_safe( pdev, temp, &iommu->ats_devices, ats.list )
- {
- bool sbit;
- int rc = 0;
-
- switch ( type )
- {
- case DMA_TLB_DSI_FLUSH:
- if ( !device_in_domain(iommu, pdev, did) )
- break;
- /* fall through if DSI condition met */
- case DMA_TLB_GLOBAL_FLUSH:
- /* invalidate all translations: sbit=1,bit_63=0,bit[62:12]=1 */
- sbit = 1;
- addr = (~0UL << PAGE_SHIFT_4K) & 0x7FFFFFFFFFFFFFFF;
- rc = qinval_device_iotlb_sync(iommu, pdev, did, sbit, addr);
- break;
- case DMA_TLB_PSI_FLUSH:
- if ( !device_in_domain(iommu, pdev, did) )
- break;
-
- /* if size <= 4K, set sbit = 0, else set sbit = 1 */
- sbit = size_order ? 1 : 0;
-
- /* clear lower bits */
- addr &= ~0UL << PAGE_SHIFT_4K;
-
- /* if sbit == 1, zero out size_order bit and set lower bits to 1 */
- if ( sbit )
- {
- addr &= ~((u64)PAGE_SIZE_4K << (size_order - 1));
- addr |= (((u64)1 << (size_order - 1)) - 1) << PAGE_SHIFT_4K;
- }
-
- rc = qinval_device_iotlb_sync(iommu, pdev, did, sbit, addr);
- break;
- default:
- dprintk(XENLOG_WARNING VTDPREFIX, "invalid vt-d flush type\n");
- return -EOPNOTSUPP;
- }
-
- if ( !ret )
- ret = rc;
- }
-
- return ret;
-}
^ permalink raw reply [flat|nested] 38+ messages in thread* [PATCH v2 10/12] VT-d: move {,un}map_vtd_domain_page()
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
` (8 preceding siblings ...)
2024-02-15 10:18 ` [PATCH v2 09/12] VT-d: move dev_invalidate_iotlb() " Jan Beulich
@ 2024-02-15 10:18 ` Jan Beulich
2024-02-15 10:18 ` [PATCH v2 11/12] VT-d: drop flush_dev_iotlb parameter from IOTLB flush hook Jan Beulich
2024-02-15 10:19 ` [PATCH v2 12/12] PCI/ATS: tidy {en,dis}able_ats_device() a little Jan Beulich
11 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:18 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
..., thus allowing them to become static. There's nothing x86-specific
about these functions anyway.
Since only the "iommu_inclusive_mapping" parameter declaration would be
left in the file, move that as well. There's nothing VT-d specific about
it (anymore?): "dom0-iommu=map-inclusive" is similarly generic, and
documentation also doesn't say anything.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
v2: doc adjustment.
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1633,7 +1633,7 @@ Specify the timeout of the device IOTLB
By default, the timeout is 1000 ms. When you see error 'Queue invalidate
wait descriptor timed out', try increasing this value.
-### iommu_inclusive_mapping
+### iommu_inclusive_mapping (x86)
> `= <boolean>`
**WARNING: This command line option is deprecated, and superseded by
--- a/xen/drivers/passthrough/vtd/Makefile
+++ b/xen/drivers/passthrough/vtd/Makefile
@@ -1,5 +1,3 @@
-obj-$(CONFIG_X86) += x86/
-
obj-y += iommu.o
obj-y += dmar.o
obj-y += utils.o
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -21,6 +21,7 @@
#define _VTD_EXTERN_H_
#include "dmar.h"
+#include <xen/domain_page.h>
#include <xen/keyhandler.h>
#define VTDPREFIX "[VT-D]"
@@ -67,8 +68,6 @@ struct acpi_rhsa_unit *drhd_to_rhsa(cons
uint64_t alloc_pgtable_maddr(unsigned long npages, nodeid_t node);
void free_pgtable_maddr(u64 maddr);
-void *map_vtd_domain_page(u64 maddr);
-void unmap_vtd_domain_page(const void *va);
int domain_context_mapping_one(struct domain *domain, struct vtd_iommu *iommu,
uint8_t bus, uint8_t devfn,
const struct pci_dev *pdev, domid_t domid,
@@ -78,6 +77,16 @@ int domain_context_unmap_one(struct doma
int cf_check intel_iommu_get_reserved_device_memory(
iommu_grdm_t *func, void *ctxt);
+static inline void *map_vtd_domain_page(paddr_t maddr)
+{
+ return map_domain_page(_mfn(paddr_to_pfn(maddr)));
+}
+
+static inline void unmap_vtd_domain_page(const void *va)
+{
+ unmap_domain_page(va);
+}
+
unsigned int cf_check io_apic_read_remap_rte(
unsigned int apic, unsigned int reg);
void cf_check io_apic_write_remap_rte(
--- a/xen/drivers/passthrough/vtd/x86/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-obj-y += vtd.o
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ /dev/null
@@ -1,48 +0,0 @@
-/*
- * Copyright (c) 2008, Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of the GNU General Public License,
- * version 2, as published by the Free Software Foundation.
- *
- * This program is distributed in the hope it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program; If not, see <http://www.gnu.org/licenses/>.
- *
- * Copyright (C) Allen Kay <allen.m.kay@intel.com>
- * Copyright (C) Weidong Han <weidong.han@intel.com>
- */
-
-#include <xen/param.h>
-#include <xen/sched.h>
-#include <xen/softirq.h>
-#include <xen/domain_page.h>
-#include <asm/paging.h>
-#include <xen/iommu.h>
-#include <xen/irq.h>
-#include <xen/numa.h>
-#include <asm/fixmap.h>
-#include "../iommu.h"
-#include "../dmar.h"
-#include "../vtd.h"
-#include "../extern.h"
-
-/*
- * iommu_inclusive_mapping: when set, all memory below 4GB is included in dom0
- * 1:1 iommu mappings except xen and unusable regions.
- */
-boolean_param("iommu_inclusive_mapping", iommu_hwdom_inclusive);
-
-void *map_vtd_domain_page(u64 maddr)
-{
- return map_domain_page(_mfn(paddr_to_pfn(maddr)));
-}
-
-void unmap_vtd_domain_page(const void *va)
-{
- unmap_domain_page(va);
-}
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -19,6 +19,7 @@
#include <xen/paging.h>
#include <xen/guest_access.h>
#include <xen/event.h>
+#include <xen/param.h>
#include <xen/softirq.h>
#include <xen/vm_event.h>
#include <xsm/xsm.h>
@@ -36,6 +37,12 @@ bool __initdata iommu_superpages = true;
enum iommu_intremap __read_mostly iommu_intremap = iommu_intremap_full;
+/*
+ * iommu_inclusive_mapping: When set, all memory below 4GB is included in dom0
+ * 1:1 iommu mappings except xen and unusable regions.
+ */
+boolean_param("iommu_inclusive_mapping", iommu_hwdom_inclusive);
+
#ifdef CONFIG_PV
/* Possible unfiltered LAPIC/MSI messages from untrusted sources? */
bool __read_mostly untrusted_msi;
^ permalink raw reply [flat|nested] 38+ messages in thread* [PATCH v2 11/12] VT-d: drop flush_dev_iotlb parameter from IOTLB flush hook
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
` (9 preceding siblings ...)
2024-02-15 10:18 ` [PATCH v2 10/12] VT-d: move {,un}map_vtd_domain_page() Jan Beulich
@ 2024-02-15 10:18 ` Jan Beulich
2024-05-06 14:06 ` Roger Pau Monné
2024-02-15 10:19 ` [PATCH v2 12/12] PCI/ATS: tidy {en,dis}able_ats_device() a little Jan Beulich
11 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:18 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
All call sites pass it using the flag from the IOMMU which they also
pass.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -57,8 +57,7 @@ int __must_check cf_check vtd_flush_cont
uint8_t function_mask, uint64_t type, bool flush_non_present_entry);
int __must_check cf_check vtd_flush_iotlb_reg(
struct vtd_iommu *iommu, uint16_t did, uint64_t addr,
- unsigned int size_order, uint64_t type, bool flush_non_present_entry,
- bool flush_dev_iotlb);
+ unsigned int size_order, uint64_t type, bool flush_non_present_entry);
struct vtd_iommu *ioapic_to_iommu(unsigned int apic_id);
struct vtd_iommu *hpet_to_iommu(unsigned int hpet_id);
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -566,8 +566,7 @@ static int __must_check iommu_flush_cont
/* return value determine if we need a write buffer flush */
int cf_check vtd_flush_iotlb_reg(
struct vtd_iommu *iommu, uint16_t did, uint64_t addr,
- unsigned int size_order, uint64_t type, bool flush_non_present_entry,
- bool flush_dev_iotlb)
+ unsigned int size_order, uint64_t type, bool flush_non_present_entry)
{
int tlb_offset = ecap_iotlb_offset(iommu->ecap);
uint64_t val = type | DMA_TLB_IVT;
@@ -632,7 +631,7 @@ static int __must_check iommu_flush_iotl
vtd_ops_preamble_quirk(iommu);
status = iommu->flush.iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH,
- flush_non_present_entry, iommu->flush_dev_iotlb);
+ flush_non_present_entry);
/* undo platform specific errata workarounds */
vtd_ops_postamble_quirk(iommu);
@@ -649,7 +648,7 @@ static int __must_check iommu_flush_iotl
vtd_ops_preamble_quirk(iommu);
status = iommu->flush.iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH,
- flush_non_present_entry, iommu->flush_dev_iotlb);
+ flush_non_present_entry);
/* undo platform specific errata workarounds */
vtd_ops_postamble_quirk(iommu);
@@ -675,7 +674,7 @@ static int __must_check iommu_flush_iotl
vtd_ops_preamble_quirk(iommu);
status = iommu->flush.iotlb(iommu, did, addr, order, DMA_TLB_PSI_FLUSH,
- flush_non_present_entry, iommu->flush_dev_iotlb);
+ flush_non_present_entry);
/* undo platform specific errata workarounds */
vtd_ops_postamble_quirk(iommu);
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -502,8 +502,7 @@ struct vtd_iommu {
bool non_present_entry_flush);
int __must_check (*iotlb)(struct vtd_iommu *iommu, u16 did, u64 addr,
unsigned int size_order, u64 type,
- bool flush_non_present_entry,
- bool flush_dev_iotlb);
+ bool flush_non_present_entry);
} flush;
struct list_head ats_devices;
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -452,7 +452,7 @@ static int __must_check cf_check flush_c
static int __must_check cf_check flush_iotlb_qi(
struct vtd_iommu *iommu, u16 did, u64 addr, unsigned int size_order,
- u64 type, bool flush_non_present_entry, bool flush_dev_iotlb)
+ u64 type, bool flush_non_present_entry)
{
u8 dr = 0, dw = 0;
int ret = 0, rc;
@@ -478,7 +478,7 @@ static int __must_check cf_check flush_i
if ( !ret )
ret = rc;
- if ( flush_dev_iotlb )
+ if ( iommu->flush_dev_iotlb )
{
rc = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
if ( !ret )
@@ -573,8 +573,7 @@ static int cf_check vtd_flush_context_no
static int cf_check vtd_flush_iotlb_noop(
struct vtd_iommu *iommu, uint16_t did, uint64_t addr,
- unsigned int size_order, uint64_t type, bool flush_non_present_entry,
- bool flush_dev_iotlb)
+ unsigned int size_order, uint64_t type, bool flush_non_present_entry)
{
WARN();
return -EIO;
^ permalink raw reply [flat|nested] 38+ messages in thread* [PATCH v2 12/12] PCI/ATS: tidy {en,dis}able_ats_device() a little
2024-02-15 10:11 [PATCH v2 00/12] VT-d: SATC handling; ATS: tidying Jan Beulich
` (10 preceding siblings ...)
2024-02-15 10:18 ` [PATCH v2 11/12] VT-d: drop flush_dev_iotlb parameter from IOTLB flush hook Jan Beulich
@ 2024-02-15 10:19 ` Jan Beulich
2024-05-06 14:10 ` Roger Pau Monné
11 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2024-02-15 10:19 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Roger Pau Monné, Paul Durrant
Use appropriate types for the control register value as well as the
capability position. Constify a pointer. Use "else" in favor of encoding
the opposite condition of the earlier if().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
--- a/xen/drivers/passthrough/ats.c
+++ b/xen/drivers/passthrough/ats.c
@@ -23,10 +23,9 @@ boolean_param("ats", opt_ats);
int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list)
{
- u32 value;
- int pos;
+ uint16_t value;
+ unsigned int pos = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_ATS);
- pos = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_ATS);
BUG_ON(!pos);
if ( iommu_verbose )
@@ -35,7 +34,7 @@ int enable_ats_device(struct pci_dev *pd
value = pci_conf_read16(pdev->sbdf, pos + ATS_REG_CTL);
if ( value & ATS_ENABLE )
{
- struct pci_dev *other;
+ const struct pci_dev *other;
list_for_each_entry ( other, ats_list, ats.list )
if ( other == pdev )
@@ -44,8 +43,7 @@ int enable_ats_device(struct pci_dev *pd
break;
}
}
-
- if ( !(value & ATS_ENABLE) )
+ else
{
value |= ATS_ENABLE;
pci_conf_write16(pdev->sbdf, pos + ATS_REG_CTL, value);
@@ -69,7 +67,7 @@ int enable_ats_device(struct pci_dev *pd
void disable_ats_device(struct pci_dev *pdev)
{
- u32 value;
+ uint16_t value;
BUG_ON(!pdev->ats.cap_pos);
^ permalink raw reply [flat|nested] 38+ messages in thread