* [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range
@ 2025-02-19 16:48 Roger Pau Monne
2025-02-19 16:48 ` [PATCH v3 1/3] x86/dom0: correctly set the maximum ->iomem_caps bound for PVH Roger Pau Monne
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: Roger Pau Monne @ 2025-02-19 16:48 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Hello,
First two patches are preparatory changes to reduce the changes required
in patch 3. I would have wanted those to go in 4.20 to fix the issues
on Lenovo Thinkpads, but it's too late now.
Thanks, Roger.
Roger Pau Monne (3):
x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
x86/iommu: account for IOMEM caps when populating dom0 IOMMU
page-tables
x86/dom0: be less restrictive with the Interrupt Address Range
xen/arch/x86/dom0_build.c | 24 ++++++++---
xen/arch/x86/hvm/dom0_build.c | 14 +++---
xen/arch/x86/hvm/io.c | 6 +--
xen/arch/x86/include/asm/hvm/io.h | 4 +-
xen/drivers/passthrough/x86/iommu.c | 67 ++++++++++++-----------------
5 files changed, 57 insertions(+), 58 deletions(-)
--
2.46.0
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 1/3] x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
2025-02-19 16:48 [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
@ 2025-02-19 16:48 ` Roger Pau Monne
2025-02-20 8:22 ` Jan Beulich
2025-02-19 16:48 ` [PATCH v3 2/3] x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables Roger Pau Monne
` (2 subsequent siblings)
3 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monne @ 2025-02-19 16:48 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
The logic in dom0_setup_permissions() sets the maximum bound in
->iomem_caps unconditionally using paddr_bits, which is not correct for HVM
based domains. Instead use domain_max_paddr_bits() to get the correct
maximum paddr bits for each possible domain type.
Switch to using PFN_DOWN() instead of PAGE_SHIFT, as that's shorter.
Fixes: 53de839fb409 ('x86: constrain MFN range Dom0 may access')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
The fixes tag might be dubious, IIRC at that time we had PVHv1 dom0, which
would likely also need such adjustment, but not the current PVHv2.
---
Changes since v2:
- New in this version.
---
xen/arch/x86/dom0_build.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 3b9681dc9134..aec596997d5d 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -481,7 +481,8 @@ int __init dom0_setup_permissions(struct domain *d)
/* The hardware domain is initially permitted full I/O capabilities. */
rc = ioports_permit_access(d, 0, 0xFFFF);
- rc |= iomem_permit_access(d, 0UL, (1UL << (paddr_bits - PAGE_SHIFT)) - 1);
+ rc |= iomem_permit_access(d, 0UL,
+ PFN_DOWN(1UL << domain_max_paddr_bits(d)) - 1);
rc |= irqs_permit_access(d, 1, nr_irqs_gsi - 1);
/* Modify I/O port access permissions. */
--
2.46.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 2/3] x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables
2025-02-19 16:48 [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
2025-02-19 16:48 ` [PATCH v3 1/3] x86/dom0: correctly set the maximum ->iomem_caps bound for PVH Roger Pau Monne
@ 2025-02-19 16:48 ` Roger Pau Monne
2025-02-20 8:29 ` Jan Beulich
2025-02-19 16:48 ` [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
2025-03-05 14:27 ` [PATCH v3 0/3] " Jan Beulich
3 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monne @ 2025-02-19 16:48 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
The current code in arch_iommu_hwdom_init() kind of open-codes the same
MMIO permission ranges that are added to the hardware domain ->iomem_caps.
Avoid this duplication and use ->iomem_caps in arch_iommu_hwdom_init() to
filter which memory regions should be added to the dom0 IOMMU page-tables.
Note the IO-APIC and MCFG page(s) must be set as not accessible for a PVH
dom0, otherwise the internal Xen emulation for those ranges won't work.
This requires adjustments in dom0_setup_permissions().
The call to pvh_setup_mmcfg() in dom0_construct_pvh() must now strictly be
done ahead of setting up dom0 permissions, so take the opportunity to also
put it inside the existing is_hardware_domain() region.
Also the special casing of E820_UNUSABLE regions no longer needs to be done
in arch_iommu_hwdom_init(), as those regions are already blocked in
->iomem_caps and thus would be removed from the rangeset as part of
->iomem_caps processing in arch_iommu_hwdom_init(). The E820_UNUSABLE
regions below 1Mb are not removed from ->iomem_caps, that's a slight
difference for the IOMMU created page-tables, but the aim is to allow
access to the same memory either from the CPU or the IOMMU page-tables.
Since ->iomem_caps already takes into account the domain max paddr, there's
no need to remove any regions past the last address addressable by the
domain, as applying ->iomem_caps would have already taken care of that.
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
- Expand commit message re E820_UNUSABLE.
- Use vpci_mmcfg_deny_access().
- Remove max() from map_subtract_iomemcap().
- Use has_vioapic() instead of is_hvm_domain().
Changes since v1:
- New in this version.
---
xen/arch/x86/dom0_build.c | 11 ++++-
xen/arch/x86/hvm/dom0_build.c | 14 +++---
xen/arch/x86/hvm/io.c | 6 +--
xen/arch/x86/include/asm/hvm/io.h | 4 +-
xen/drivers/passthrough/x86/iommu.c | 67 ++++++++++++-----------------
5 files changed, 49 insertions(+), 53 deletions(-)
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index aec596997d5d..a735e3650c28 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -558,7 +558,9 @@ int __init dom0_setup_permissions(struct domain *d)
for ( i = 0; i < nr_ioapics; i++ )
{
mfn = paddr_to_pfn(mp_ioapics[i].mpc_apicaddr);
- if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
+ /* If emulating IO-APIC(s) make sure the base address is unmapped. */
+ if ( has_vioapic(d) ||
+ !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
rc |= iomem_deny_access(d, mfn, mfn);
}
/* MSI range. */
@@ -599,6 +601,13 @@ int __init dom0_setup_permissions(struct domain *d)
rc |= rangeset_add_singleton(mmio_ro_ranges, mfn);
}
+ if ( has_vpci(d) )
+ /*
+ * TODO: runtime added MMCFG regions are not checked to make sure they
+ * don't overlap with already mapped regions, thus preventing trapping.
+ */
+ rc |= vpci_mmcfg_deny_access(d);
+
return rc;
}
diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index ce5b5c31b318..6a4453103a9a 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -1323,6 +1323,13 @@ int __init dom0_construct_pvh(struct boot_info *bi, struct domain *d)
if ( is_hardware_domain(d) )
{
+ /*
+ * MMCFG initialization must be performed before setting domain
+ * permissions, as the MCFG areas must not be part of the domain IOMEM
+ * accessible regions.
+ */
+ pvh_setup_mmcfg(d);
+
/*
* Setup permissions early so that calls to add MMIO regions to the
* p2m as part of vPCI setup don't fail due to permission checks.
@@ -1335,13 +1342,6 @@ int __init dom0_construct_pvh(struct boot_info *bi, struct domain *d)
}
}
- /*
- * NB: MMCFG initialization needs to be performed before iommu
- * initialization so the iommu code can fetch the MMCFG regions used by the
- * domain.
- */
- pvh_setup_mmcfg(d);
-
/*
* Craft dom0 physical memory map and set the paging allocation. This must
* be done before the iommu initializion, since iommu initialization code
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index db726b38177b..de6ee6c4dd4d 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -363,14 +363,14 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const struct domain *d,
return NULL;
}
-int __hwdom_init vpci_subtract_mmcfg(const struct domain *d, struct rangeset *r)
+int __hwdom_init vpci_mmcfg_deny_access(struct domain *d)
{
const struct hvm_mmcfg *mmcfg;
list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
{
- int rc = rangeset_remove_range(r, PFN_DOWN(mmcfg->addr),
- PFN_DOWN(mmcfg->addr + mmcfg->size - 1));
+ int rc = iomem_deny_access(d, PFN_DOWN(mmcfg->addr),
+ PFN_DOWN(mmcfg->addr + mmcfg->size - 1));
if ( rc )
return rc;
diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h
index f2b8431facb0..565bad300d20 100644
--- a/xen/arch/x86/include/asm/hvm/io.h
+++ b/xen/arch/x86/include/asm/hvm/io.h
@@ -132,8 +132,8 @@ int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
/* Destroy tracked MMCFG areas. */
void destroy_vpci_mmcfg(struct domain *d);
-/* Remove MMCFG regions from a given rangeset. */
-int vpci_subtract_mmcfg(const struct domain *d, struct rangeset *r);
+/* Remove MMCFG regions from a domain ->iomem_caps. */
+int vpci_mmcfg_deny_access(struct domain *d);
#endif /* __ASM_X86_HVM_IO_H__ */
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 8b1e0596b84a..67f025c1ec6a 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -320,6 +320,26 @@ static int __hwdom_init cf_check map_subtract(unsigned long s, unsigned long e,
return rangeset_remove_range(map, s, e);
}
+struct handle_iomemcap {
+ struct rangeset *r;
+ unsigned long last;
+};
+static int __hwdom_init cf_check map_subtract_iomemcap(unsigned long s,
+ unsigned long e,
+ void *data)
+{
+ struct handle_iomemcap *h = data;
+ int rc = 0;
+
+ if ( h->last != s )
+ rc = rangeset_remove_range(h->r, h->last, s - 1);
+
+ ASSERT(e < ~0UL);
+ h->last = e + 1;
+
+ return rc;
+}
+
struct map_data {
struct domain *d;
unsigned int flush_flags;
@@ -400,6 +420,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
unsigned int i;
struct rangeset *map;
struct map_data map_data = { .d = d };
+ struct handle_iomemcap iomem = {};
int rc;
BUG_ON(!is_hardware_domain(d));
@@ -442,14 +463,6 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
switch ( entry.type )
{
- case E820_UNUSABLE:
- /* Only relevant for inclusive mode, otherwise this is a no-op. */
- rc = rangeset_remove_range(map, PFN_DOWN(entry.addr),
- PFN_DOWN(entry.addr + entry.size - 1));
- if ( rc )
- panic("IOMMU failed to remove unusable memory: %d\n", rc);
- continue;
-
case E820_RESERVED:
if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved )
continue;
@@ -475,22 +488,13 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
if ( rc )
panic("IOMMU failed to remove Xen ranges: %d\n", rc);
- /* Remove any overlap with the Interrupt Address Range. */
- rc = rangeset_remove_range(map, 0xfee00, 0xfeeff);
+ iomem.r = map;
+ rc = rangeset_report_ranges(d->iomem_caps, 0, ~0UL, map_subtract_iomemcap,
+ &iomem);
+ if ( !rc && iomem.last < ~0UL )
+ rc = rangeset_remove_range(map, iomem.last, ~0UL);
if ( rc )
- panic("IOMMU failed to remove Interrupt Address Range: %d\n", rc);
-
- /* If emulating IO-APIC(s) make sure the base address is unmapped. */
- if ( has_vioapic(d) )
- {
- for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
- {
- rc = rangeset_remove_singleton(map,
- PFN_DOWN(domain_vioapic(d, i)->base_address));
- if ( rc )
- panic("IOMMU failed to remove IO-APIC: %d\n", rc);
- }
- }
+ panic("IOMMU failed to remove forbidden regions: %d\n", rc);
if ( is_pv_domain(d) )
{
@@ -506,23 +510,6 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
panic("IOMMU failed to remove read-only regions: %d\n", rc);
}
- if ( has_vpci(d) )
- {
- /*
- * TODO: runtime added MMCFG regions are not checked to make sure they
- * don't overlap with already mapped regions, thus preventing trapping.
- */
- rc = vpci_subtract_mmcfg(d, map);
- if ( rc )
- panic("IOMMU unable to remove MMCFG areas: %d\n", rc);
- }
-
- /* Remove any regions past the last address addressable by the domain. */
- rc = rangeset_remove_range(map, PFN_DOWN(1UL << domain_max_paddr_bits(d)),
- ~0UL);
- if ( rc )
- panic("IOMMU unable to remove unaddressable ranges: %d\n", rc);
-
if ( iommu_verbose )
printk(XENLOG_INFO "%pd: identity mappings for IOMMU:\n", d);
--
2.46.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-02-19 16:48 [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
2025-02-19 16:48 ` [PATCH v3 1/3] x86/dom0: correctly set the maximum ->iomem_caps bound for PVH Roger Pau Monne
2025-02-19 16:48 ` [PATCH v3 2/3] x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables Roger Pau Monne
@ 2025-02-19 16:48 ` Roger Pau Monne
2025-02-20 8:33 ` Jan Beulich
2025-03-05 14:27 ` [PATCH v3 0/3] " Jan Beulich
3 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monne @ 2025-02-19 16:48 UTC (permalink / raw)
To: xen-devel
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper,
Jürgen Groß
Xen currently prevents dom0 from creating CPU or IOMMU page-table mappings
into the interrupt address range [0xfee00000, 0xfeefffff]. This range has
two different purposes. For accesses from the CPU is contains the default
position of local APIC page at 0xfee00000. For accesses from devices
it's the MSI address range, so the address field in the MSI entries
(usually) point to an address on that range to trigger an interrupt.
There are reports of Lenovo Thinkpad devices placing what seems to be the
UCSI shared mailbox at address 0xfeec2000 in the interrupt address range.
Attempting to use that device with a Linux PV dom0 leads to an error when
Linux kernel maps 0xfeec2000:
RIP: e030:xen_mc_flush+0x1e8/0x2b0
xen_leave_lazy_mmu+0x15/0x60
vmap_range_noflush+0x408/0x6f0
__ioremap_caller+0x20d/0x350
acpi_os_map_iomem+0x1a3/0x1c0
acpi_ex_system_memory_space_handler+0x229/0x3f0
acpi_ev_address_space_dispatch+0x17e/0x4c0
acpi_ex_access_region+0x28a/0x510
acpi_ex_field_datum_io+0x95/0x5c0
acpi_ex_extract_from_field+0x36b/0x4e0
acpi_ex_read_data_from_field+0xcb/0x430
acpi_ex_resolve_node_to_value+0x2e0/0x530
acpi_ex_resolve_to_value+0x1e7/0x550
acpi_ds_evaluate_name_path+0x107/0x170
acpi_ds_exec_end_op+0x392/0x860
acpi_ps_parse_loop+0x268/0xa30
acpi_ps_parse_aml+0x221/0x5e0
acpi_ps_execute_method+0x171/0x3e0
acpi_ns_evaluate+0x174/0x5d0
acpi_evaluate_object+0x167/0x440
acpi_evaluate_dsm+0xb6/0x130
ucsi_acpi_dsm+0x53/0x80
ucsi_acpi_read+0x2e/0x60
ucsi_register+0x24/0xa0
ucsi_acpi_probe+0x162/0x1e3
platform_probe+0x48/0x90
really_probe+0xde/0x340
__driver_probe_device+0x78/0x110
driver_probe_device+0x1f/0x90
__driver_attach+0xd2/0x1c0
bus_for_each_dev+0x77/0xc0
bus_add_driver+0x112/0x1f0
driver_register+0x72/0xd0
do_one_initcall+0x48/0x300
do_init_module+0x60/0x220
__do_sys_init_module+0x17f/0x1b0
do_syscall_64+0x82/0x170
Remove the restrictions to create mappings the interrupt address range for
dom0. Note that the restriction to map the local APIC page is enforced
separately, and that continues to be present. Additionally make sure the
emulated local APIC page is also not mapped, in case dom0 is using it.
Note that even if the interrupt address range entries are populated in the
IOMMU page-tables no device access will reach those pages. Device accesses
to the Interrupt Address Range will always be converted into Interrupt
Messages and are not subject to DMA remapping.
There's also the following restriction noted in Intel VT-d:
> Software must not program paging-structure entries to remap any address to
> the interrupt address range. Untranslated requests and translation requests
> that result in an address in the interrupt range will be blocked with
> condition code LGN.4 or SGN.8. Translated requests with an address in the
> interrupt address range are treated as Unsupported Request (UR).
Similarly for AMD-Vi:
> Accesses to the interrupt address range (Table 3) are defined to go through
> the interrupt remapping portion of the IOMMU and not through address
> translation processing. Therefore, when a transaction is being processed as
> an interrupt remapping operation, the transaction attribute of
> pretranslated or untranslated is ignored.
>
> Software Note: The IOMMU should
> not be configured such that an address translation results in a special
> address such as the interrupt address range.
However those restrictions don't apply to the identity mappings possibly
created for dom0, since the interrupt address range is never subject to DMA
remapping, and hence there's no output address after translation that
belongs to the interrupt address range.
Reported-by: Jürgen Groß <jgross@suse.com>
Link: https://lore.kernel.org/xen-devel/baade0a7-e204-4743-bda1-282df74e5f89@suse.com/
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
- Make sure vlapic page is not mapped.
Changes since v1:
- No longer needs to modify arch_iommu_hwdom_init().
---
xen/arch/x86/dom0_build.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index a735e3650c28..3cda0c2c8765 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -554,6 +554,12 @@ int __init dom0_setup_permissions(struct domain *d)
mfn = paddr_to_pfn(mp_lapic_addr);
rc |= iomem_deny_access(d, mfn, mfn);
}
+ /* If using an emulated local APIC make sure its MMIO is unpopulated. */
+ if ( has_vlapic(d) )
+ {
+ mfn = paddr_to_pfn(APIC_DEFAULT_PHYS_BASE);
+ rc |= iomem_deny_access(d, mfn, mfn);
+ }
/* I/O APICs. */
for ( i = 0; i < nr_ioapics; i++ )
{
@@ -563,10 +569,6 @@ int __init dom0_setup_permissions(struct domain *d)
!rangeset_contains_singleton(mmio_ro_ranges, mfn) )
rc |= iomem_deny_access(d, mfn, mfn);
}
- /* MSI range. */
- rc |= iomem_deny_access(d, paddr_to_pfn(MSI_ADDR_BASE_LO),
- paddr_to_pfn(MSI_ADDR_BASE_LO +
- MSI_ADDR_DEST_ID_MASK));
/* HyperTransport range. */
if ( boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON) )
{
--
2.46.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/3] x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
2025-02-19 16:48 ` [PATCH v3 1/3] x86/dom0: correctly set the maximum ->iomem_caps bound for PVH Roger Pau Monne
@ 2025-02-20 8:22 ` Jan Beulich
2025-02-20 8:49 ` Roger Pau Monné
0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2025-02-20 8:22 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 19.02.2025 17:48, Roger Pau Monne wrote:
> The logic in dom0_setup_permissions() sets the maximum bound in
> ->iomem_caps unconditionally using paddr_bits, which is not correct for HVM
> based domains. Instead use domain_max_paddr_bits() to get the correct
> maximum paddr bits for each possible domain type.
>
> Switch to using PFN_DOWN() instead of PAGE_SHIFT, as that's shorter.
>
> Fixes: 53de839fb409 ('x86: constrain MFN range Dom0 may access')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> The fixes tag might be dubious, IIRC at that time we had PVHv1 dom0, which
> would likely also need such adjustment, but not the current PVHv2.
Probably better to omit it then. It would be one of the changes moving to
PVHv2 that missed making the adjustment.
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -481,7 +481,8 @@ int __init dom0_setup_permissions(struct domain *d)
>
> /* The hardware domain is initially permitted full I/O capabilities. */
> rc = ioports_permit_access(d, 0, 0xFFFF);
> - rc |= iomem_permit_access(d, 0UL, (1UL << (paddr_bits - PAGE_SHIFT)) - 1);
> + rc |= iomem_permit_access(d, 0UL,
> + PFN_DOWN(1UL << domain_max_paddr_bits(d)) - 1);
Why PFN_DOWN() rather than subtracting PAGE_SHIFT? That's two shifts rather
than just one. Personally I'd prefer if we continued using the subtraction,
but either way:
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/3] x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables
2025-02-19 16:48 ` [PATCH v3 2/3] x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables Roger Pau Monne
@ 2025-02-20 8:29 ` Jan Beulich
0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2025-02-20 8:29 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 19.02.2025 17:48, Roger Pau Monne wrote:
> The current code in arch_iommu_hwdom_init() kind of open-codes the same
> MMIO permission ranges that are added to the hardware domain ->iomem_caps.
> Avoid this duplication and use ->iomem_caps in arch_iommu_hwdom_init() to
> filter which memory regions should be added to the dom0 IOMMU page-tables.
>
> Note the IO-APIC and MCFG page(s) must be set as not accessible for a PVH
> dom0, otherwise the internal Xen emulation for those ranges won't work.
> This requires adjustments in dom0_setup_permissions().
>
> The call to pvh_setup_mmcfg() in dom0_construct_pvh() must now strictly be
> done ahead of setting up dom0 permissions, so take the opportunity to also
> put it inside the existing is_hardware_domain() region.
>
> Also the special casing of E820_UNUSABLE regions no longer needs to be done
> in arch_iommu_hwdom_init(), as those regions are already blocked in
> ->iomem_caps and thus would be removed from the rangeset as part of
> ->iomem_caps processing in arch_iommu_hwdom_init(). The E820_UNUSABLE
> regions below 1Mb are not removed from ->iomem_caps, that's a slight
> difference for the IOMMU created page-tables, but the aim is to allow
> access to the same memory either from the CPU or the IOMMU page-tables.
>
> Since ->iomem_caps already takes into account the domain max paddr, there's
> no need to remove any regions past the last address addressable by the
> domain, as applying ->iomem_caps would have already taken care of that.
>
> Suggested-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-02-19 16:48 ` [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
@ 2025-02-20 8:33 ` Jan Beulich
2025-02-20 8:55 ` Roger Pau Monné
0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2025-02-20 8:33 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, Jürgen Groß, xen-devel
On 19.02.2025 17:48, Roger Pau Monne wrote:
> Xen currently prevents dom0 from creating CPU or IOMMU page-table mappings
> into the interrupt address range [0xfee00000, 0xfeefffff]. This range has
> two different purposes. For accesses from the CPU is contains the default
> position of local APIC page at 0xfee00000. For accesses from devices
> it's the MSI address range, so the address field in the MSI entries
> (usually) point to an address on that range to trigger an interrupt.
>
> There are reports of Lenovo Thinkpad devices placing what seems to be the
> UCSI shared mailbox at address 0xfeec2000 in the interrupt address range.
> Attempting to use that device with a Linux PV dom0 leads to an error when
> Linux kernel maps 0xfeec2000:
>
> RIP: e030:xen_mc_flush+0x1e8/0x2b0
> xen_leave_lazy_mmu+0x15/0x60
> vmap_range_noflush+0x408/0x6f0
> __ioremap_caller+0x20d/0x350
> acpi_os_map_iomem+0x1a3/0x1c0
> acpi_ex_system_memory_space_handler+0x229/0x3f0
> acpi_ev_address_space_dispatch+0x17e/0x4c0
> acpi_ex_access_region+0x28a/0x510
> acpi_ex_field_datum_io+0x95/0x5c0
> acpi_ex_extract_from_field+0x36b/0x4e0
> acpi_ex_read_data_from_field+0xcb/0x430
> acpi_ex_resolve_node_to_value+0x2e0/0x530
> acpi_ex_resolve_to_value+0x1e7/0x550
> acpi_ds_evaluate_name_path+0x107/0x170
> acpi_ds_exec_end_op+0x392/0x860
> acpi_ps_parse_loop+0x268/0xa30
> acpi_ps_parse_aml+0x221/0x5e0
> acpi_ps_execute_method+0x171/0x3e0
> acpi_ns_evaluate+0x174/0x5d0
> acpi_evaluate_object+0x167/0x440
> acpi_evaluate_dsm+0xb6/0x130
> ucsi_acpi_dsm+0x53/0x80
> ucsi_acpi_read+0x2e/0x60
> ucsi_register+0x24/0xa0
> ucsi_acpi_probe+0x162/0x1e3
> platform_probe+0x48/0x90
> really_probe+0xde/0x340
> __driver_probe_device+0x78/0x110
> driver_probe_device+0x1f/0x90
> __driver_attach+0xd2/0x1c0
> bus_for_each_dev+0x77/0xc0
> bus_add_driver+0x112/0x1f0
> driver_register+0x72/0xd0
> do_one_initcall+0x48/0x300
> do_init_module+0x60/0x220
> __do_sys_init_module+0x17f/0x1b0
> do_syscall_64+0x82/0x170
>
> Remove the restrictions to create mappings the interrupt address range for
Nit: Missing "in"?
> dom0. Note that the restriction to map the local APIC page is enforced
> separately, and that continues to be present. Additionally make sure the
> emulated local APIC page is also not mapped, in case dom0 is using it.
But that's in GFN space, not in MFN one. Why would that matter for iomem_caps?
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/3] x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
2025-02-20 8:22 ` Jan Beulich
@ 2025-02-20 8:49 ` Roger Pau Monné
2025-02-20 13:02 ` Jan Beulich
0 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2025-02-20 8:49 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel
On Thu, Feb 20, 2025 at 09:22:40AM +0100, Jan Beulich wrote:
> On 19.02.2025 17:48, Roger Pau Monne wrote:
> > The logic in dom0_setup_permissions() sets the maximum bound in
> > ->iomem_caps unconditionally using paddr_bits, which is not correct for HVM
> > based domains. Instead use domain_max_paddr_bits() to get the correct
> > maximum paddr bits for each possible domain type.
> >
> > Switch to using PFN_DOWN() instead of PAGE_SHIFT, as that's shorter.
> >
> > Fixes: 53de839fb409 ('x86: constrain MFN range Dom0 may access')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > The fixes tag might be dubious, IIRC at that time we had PVHv1 dom0, which
> > would likely also need such adjustment, but not the current PVHv2.
>
> Probably better to omit it then. It would be one of the changes moving to
> PVHv2 that missed making the adjustment.
Well, PVHv1 would have needed such adjustment, as it was also limited
to hap_paddr_bits instead of paddr_bits.
> > --- a/xen/arch/x86/dom0_build.c
> > +++ b/xen/arch/x86/dom0_build.c
> > @@ -481,7 +481,8 @@ int __init dom0_setup_permissions(struct domain *d)
> >
> > /* The hardware domain is initially permitted full I/O capabilities. */
> > rc = ioports_permit_access(d, 0, 0xFFFF);
> > - rc |= iomem_permit_access(d, 0UL, (1UL << (paddr_bits - PAGE_SHIFT)) - 1);
> > + rc |= iomem_permit_access(d, 0UL,
> > + PFN_DOWN(1UL << domain_max_paddr_bits(d)) - 1);
>
> Why PFN_DOWN() rather than subtracting PAGE_SHIFT? That's two shifts rather
> than just one.
cosmetic: line length (it's mentioned in the commit message). I can
switch back to PAGE_SHIFT, didn't think it was a big deal since it's
a one time only calculation.
> Personally I'd prefer if we continued using the subtraction,
> but either way:
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Thanks, will switch back to PAGE_SHIFT if it doesn't turn out to be
too ugly.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-02-20 8:33 ` Jan Beulich
@ 2025-02-20 8:55 ` Roger Pau Monné
2025-02-20 13:30 ` Jan Beulich
0 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2025-02-20 8:55 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Jürgen Groß, xen-devel
On Thu, Feb 20, 2025 at 09:33:46AM +0100, Jan Beulich wrote:
> On 19.02.2025 17:48, Roger Pau Monne wrote:
> > Xen currently prevents dom0 from creating CPU or IOMMU page-table mappings
> > into the interrupt address range [0xfee00000, 0xfeefffff]. This range has
> > two different purposes. For accesses from the CPU is contains the default
> > position of local APIC page at 0xfee00000. For accesses from devices
> > it's the MSI address range, so the address field in the MSI entries
> > (usually) point to an address on that range to trigger an interrupt.
> >
> > There are reports of Lenovo Thinkpad devices placing what seems to be the
> > UCSI shared mailbox at address 0xfeec2000 in the interrupt address range.
> > Attempting to use that device with a Linux PV dom0 leads to an error when
> > Linux kernel maps 0xfeec2000:
> >
> > RIP: e030:xen_mc_flush+0x1e8/0x2b0
> > xen_leave_lazy_mmu+0x15/0x60
> > vmap_range_noflush+0x408/0x6f0
> > __ioremap_caller+0x20d/0x350
> > acpi_os_map_iomem+0x1a3/0x1c0
> > acpi_ex_system_memory_space_handler+0x229/0x3f0
> > acpi_ev_address_space_dispatch+0x17e/0x4c0
> > acpi_ex_access_region+0x28a/0x510
> > acpi_ex_field_datum_io+0x95/0x5c0
> > acpi_ex_extract_from_field+0x36b/0x4e0
> > acpi_ex_read_data_from_field+0xcb/0x430
> > acpi_ex_resolve_node_to_value+0x2e0/0x530
> > acpi_ex_resolve_to_value+0x1e7/0x550
> > acpi_ds_evaluate_name_path+0x107/0x170
> > acpi_ds_exec_end_op+0x392/0x860
> > acpi_ps_parse_loop+0x268/0xa30
> > acpi_ps_parse_aml+0x221/0x5e0
> > acpi_ps_execute_method+0x171/0x3e0
> > acpi_ns_evaluate+0x174/0x5d0
> > acpi_evaluate_object+0x167/0x440
> > acpi_evaluate_dsm+0xb6/0x130
> > ucsi_acpi_dsm+0x53/0x80
> > ucsi_acpi_read+0x2e/0x60
> > ucsi_register+0x24/0xa0
> > ucsi_acpi_probe+0x162/0x1e3
> > platform_probe+0x48/0x90
> > really_probe+0xde/0x340
> > __driver_probe_device+0x78/0x110
> > driver_probe_device+0x1f/0x90
> > __driver_attach+0xd2/0x1c0
> > bus_for_each_dev+0x77/0xc0
> > bus_add_driver+0x112/0x1f0
> > driver_register+0x72/0xd0
> > do_one_initcall+0x48/0x300
> > do_init_module+0x60/0x220
> > __do_sys_init_module+0x17f/0x1b0
> > do_syscall_64+0x82/0x170
> >
> > Remove the restrictions to create mappings the interrupt address range for
>
> Nit: Missing "in"?
Indeed, thanks for spotting.
> > dom0. Note that the restriction to map the local APIC page is enforced
> > separately, and that continues to be present. Additionally make sure the
> > emulated local APIC page is also not mapped, in case dom0 is using it.
>
> But that's in GFN space, not in MFN one. Why would that matter for iomem_caps?
It's required to avoid arch_iommu_hwdom_init() creating an identity
mapping for APIC_DEFAULT_PHYS_BASE, which would prevent the local APIC
emulation from being used.
Note that mp_lapic_addr can be zeor if the host local APICs are
started in x2APIC mode, or it could (in theory) contain an address
different than APIC_DEFAULT_PHYS_BASE.
Thanks, Roger.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/3] x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
2025-02-20 8:49 ` Roger Pau Monné
@ 2025-02-20 13:02 ` Jan Beulich
0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2025-02-20 13:02 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Andrew Cooper, xen-devel
On 20.02.2025 09:49, Roger Pau Monné wrote:
> On Thu, Feb 20, 2025 at 09:22:40AM +0100, Jan Beulich wrote:
>> On 19.02.2025 17:48, Roger Pau Monne wrote:
>>> The logic in dom0_setup_permissions() sets the maximum bound in
>>> ->iomem_caps unconditionally using paddr_bits, which is not correct for HVM
>>> based domains. Instead use domain_max_paddr_bits() to get the correct
>>> maximum paddr bits for each possible domain type.
>>>
>>> Switch to using PFN_DOWN() instead of PAGE_SHIFT, as that's shorter.
>>>
>>> Fixes: 53de839fb409 ('x86: constrain MFN range Dom0 may access')
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>> ---
>>> The fixes tag might be dubious, IIRC at that time we had PVHv1 dom0, which
>>> would likely also need such adjustment, but not the current PVHv2.
>>
>> Probably better to omit it then. It would be one of the changes moving to
>> PVHv2 that missed making the adjustment.
>
> Well, PVHv1 would have needed such adjustment, as it was also limited
> to hap_paddr_bits instead of paddr_bits.
Looks like I mis-interpreted your sentence then.
>>> --- a/xen/arch/x86/dom0_build.c
>>> +++ b/xen/arch/x86/dom0_build.c
>>> @@ -481,7 +481,8 @@ int __init dom0_setup_permissions(struct domain *d)
>>>
>>> /* The hardware domain is initially permitted full I/O capabilities. */
>>> rc = ioports_permit_access(d, 0, 0xFFFF);
>>> - rc |= iomem_permit_access(d, 0UL, (1UL << (paddr_bits - PAGE_SHIFT)) - 1);
>>> + rc |= iomem_permit_access(d, 0UL,
>>> + PFN_DOWN(1UL << domain_max_paddr_bits(d)) - 1);
>>
>> Why PFN_DOWN() rather than subtracting PAGE_SHIFT? That's two shifts rather
>> than just one.
>
> cosmetic: line length (it's mentioned in the commit message).
Oh, I had overlooked that sentence there.
> I can
> switch back to PAGE_SHIFT, didn't think it was a big deal since it's
> a one time only calculation.
Feel free to keep as is then. I agree it's not a big deal here; my worry with such
usually is that people seeing something in one place may then copy/clone the same
to use elsewhere.
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-02-20 8:55 ` Roger Pau Monné
@ 2025-02-20 13:30 ` Jan Beulich
2025-02-20 15:40 ` Roger Pau Monné
0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2025-02-20 13:30 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Andrew Cooper, Jürgen Groß, xen-devel
On 20.02.2025 09:55, Roger Pau Monné wrote:
> On Thu, Feb 20, 2025 at 09:33:46AM +0100, Jan Beulich wrote:
>> On 19.02.2025 17:48, Roger Pau Monne wrote:
>>> Note that the restriction to map the local APIC page is enforced
>>> separately, and that continues to be present. Additionally make sure the
>>> emulated local APIC page is also not mapped, in case dom0 is using it.
>>
>> But that's in GFN space, not in MFN one. Why would that matter for iomem_caps?
>
> It's required to avoid arch_iommu_hwdom_init() creating an identity
> mapping for APIC_DEFAULT_PHYS_BASE, which would prevent the local APIC
> emulation from being used.
Hmm, yes, on one hand such a mapping would be created by default, as we
default to "dom0-iommu=map-reserved". Otoh that mapping would be replaced
before Dom0 is actually started, via the domain_creation_finished() hook.
On (modern) VMX that is. So yes, on old VMX and on SVM the slot would need
to remain unpopulated. Otoh, when the physical LAPICs are elsewhere and
when the domain is in x2APIC mode, there would be no reason to disallow
Dom0 access to that page. That would apparently mean fiddling with
iomem_caps once all vCPU-s have entered x2APIC mode. With LAPICs not
normally being elsewhere, question is whether this corner case actually
needs dealing with. Yet even if not, commentary may want extending, just
to make clear the case was considered?
> Note that mp_lapic_addr can be zeor if the host local APICs are
> started in x2APIC mode, or it could (in theory) contain an address
> different than APIC_DEFAULT_PHYS_BASE.
Of course; I didn't mean to suggest what you do is simply redundant.
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-02-20 13:30 ` Jan Beulich
@ 2025-02-20 15:40 ` Roger Pau Monné
2025-02-20 16:05 ` Jan Beulich
0 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2025-02-20 15:40 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Jürgen Groß, xen-devel
On Thu, Feb 20, 2025 at 02:30:38PM +0100, Jan Beulich wrote:
> On 20.02.2025 09:55, Roger Pau Monné wrote:
> > On Thu, Feb 20, 2025 at 09:33:46AM +0100, Jan Beulich wrote:
> >> On 19.02.2025 17:48, Roger Pau Monne wrote:
> >>> Note that the restriction to map the local APIC page is enforced
> >>> separately, and that continues to be present. Additionally make sure the
> >>> emulated local APIC page is also not mapped, in case dom0 is using it.
> >>
> >> But that's in GFN space, not in MFN one. Why would that matter for iomem_caps?
> >
> > It's required to avoid arch_iommu_hwdom_init() creating an identity
> > mapping for APIC_DEFAULT_PHYS_BASE, which would prevent the local APIC
> > emulation from being used.
>
> Hmm, yes, on one hand such a mapping would be created by default, as we
> default to "dom0-iommu=map-reserved". Otoh that mapping would be replaced
> before Dom0 is actually started, via the domain_creation_finished() hook.
> On (modern) VMX that is. So yes, on old VMX and on SVM the slot would need
> to remain unpopulated. Otoh, when the physical LAPICs are elsewhere and
> when the domain is in x2APIC mode, there would be no reason to disallow
> Dom0 access to that page.
Right, but that's now how dom0 is started ATM, as the local APIC is
unconditionally started in xAPIC mode and at APIC_DEFAULT_PHYS_BASE.
I could use vlapic_base_address() against vCPU#0 vlapic, but even in
guest_wrmsr_apic_base() we don't allow moving the local APIC MMIO
region, and hence I assumed it was fine to just use
APIC_DEFAULT_PHYS_BASE here. Note in pvh_setup_acpi_madt() Xen also
hardcodes the local APIC address to APIC_DEFAULT_PHYS_BASE.
Would you be fine if I expand the comment so it's:
/* If using an emulated local APIC make sure its MMIO is unpopulated. */
if ( has_vlapic(d) )
{
/* Xen doesn't allow changing the local APIC MMIO window position. */
mfn = paddr_to_pfn(APIC_DEFAULT_PHYS_BASE);
rc |= iomem_deny_access(d, mfn, mfn);
}
> That would apparently mean fiddling with
> iomem_caps once all vCPU-s have entered x2APIC mode.
Urg, that seems ugly. It would also need undoing if the APICs are
reverted to xAPIC mode?
> With LAPICs not
> normally being elsewhere, question is whether this corner case actually
> needs dealing with. Yet even if not, commentary may want extending, just
> to make clear the case was considered?
As said above, for both HVM and PVH Xen doesn't allow moving the APIC
MMIO window to anything different than APIC_DEFAULT_PHYS_BASE.
Thanks, Roger.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-02-20 15:40 ` Roger Pau Monné
@ 2025-02-20 16:05 ` Jan Beulich
2025-02-20 16:16 ` Roger Pau Monné
0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2025-02-20 16:05 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Andrew Cooper, Jürgen Groß, xen-devel
On 20.02.2025 16:40, Roger Pau Monné wrote:
> On Thu, Feb 20, 2025 at 02:30:38PM +0100, Jan Beulich wrote:
>> On 20.02.2025 09:55, Roger Pau Monné wrote:
>>> On Thu, Feb 20, 2025 at 09:33:46AM +0100, Jan Beulich wrote:
>>>> On 19.02.2025 17:48, Roger Pau Monne wrote:
>>>>> Note that the restriction to map the local APIC page is enforced
>>>>> separately, and that continues to be present. Additionally make sure the
>>>>> emulated local APIC page is also not mapped, in case dom0 is using it.
>>>>
>>>> But that's in GFN space, not in MFN one. Why would that matter for iomem_caps?
>>>
>>> It's required to avoid arch_iommu_hwdom_init() creating an identity
>>> mapping for APIC_DEFAULT_PHYS_BASE, which would prevent the local APIC
>>> emulation from being used.
>>
>> Hmm, yes, on one hand such a mapping would be created by default, as we
>> default to "dom0-iommu=map-reserved". Otoh that mapping would be replaced
>> before Dom0 is actually started, via the domain_creation_finished() hook.
>> On (modern) VMX that is. So yes, on old VMX and on SVM the slot would need
>> to remain unpopulated. Otoh, when the physical LAPICs are elsewhere and
>> when the domain is in x2APIC mode, there would be no reason to disallow
>> Dom0 access to that page.
>
> Right, but that's now how dom0 is started ATM, as the local APIC is
> unconditionally started in xAPIC mode and at APIC_DEFAULT_PHYS_BASE.
>
> I could use vlapic_base_address() against vCPU#0 vlapic, but even in
> guest_wrmsr_apic_base() we don't allow moving the local APIC MMIO
> region, and hence I assumed it was fine to just use
> APIC_DEFAULT_PHYS_BASE here. Note in pvh_setup_acpi_madt() Xen also
> hardcodes the local APIC address to APIC_DEFAULT_PHYS_BASE.
>
> Would you be fine if I expand the comment so it's:
>
> /* If using an emulated local APIC make sure its MMIO is unpopulated. */
> if ( has_vlapic(d) )
> {
> /* Xen doesn't allow changing the local APIC MMIO window position. */
> mfn = paddr_to_pfn(APIC_DEFAULT_PHYS_BASE);
> rc |= iomem_deny_access(d, mfn, mfn);
> }
That will do, I think. Then:
Acked-by: Jan Beulich <jbeulich@suse.com>
>> That would apparently mean fiddling with
>> iomem_caps once all vCPU-s have entered x2APIC mode.
>
> Urg, that seems ugly. It would also need undoing if the APICs are
> reverted to xAPIC mode?
Right.
>> With LAPICs not
>> normally being elsewhere, question is whether this corner case actually
>> needs dealing with. Yet even if not, commentary may want extending, just
>> to make clear the case was considered?
>
> As said above, for both HVM and PVH Xen doesn't allow moving the APIC
> MMIO window to anything different than APIC_DEFAULT_PHYS_BASE.
I was talking about the real one Xen uses.
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-02-20 16:05 ` Jan Beulich
@ 2025-02-20 16:16 ` Roger Pau Monné
0 siblings, 0 replies; 18+ messages in thread
From: Roger Pau Monné @ 2025-02-20 16:16 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Jürgen Groß, xen-devel
On Thu, Feb 20, 2025 at 05:05:44PM +0100, Jan Beulich wrote:
> On 20.02.2025 16:40, Roger Pau Monné wrote:
> > On Thu, Feb 20, 2025 at 02:30:38PM +0100, Jan Beulich wrote:
> >> On 20.02.2025 09:55, Roger Pau Monné wrote:
> >>> On Thu, Feb 20, 2025 at 09:33:46AM +0100, Jan Beulich wrote:
> >>>> On 19.02.2025 17:48, Roger Pau Monne wrote:
> >>>>> Note that the restriction to map the local APIC page is enforced
> >>>>> separately, and that continues to be present. Additionally make sure the
> >>>>> emulated local APIC page is also not mapped, in case dom0 is using it.
> >>>>
> >>>> But that's in GFN space, not in MFN one. Why would that matter for iomem_caps?
> >>>
> >>> It's required to avoid arch_iommu_hwdom_init() creating an identity
> >>> mapping for APIC_DEFAULT_PHYS_BASE, which would prevent the local APIC
> >>> emulation from being used.
> >>
> >> Hmm, yes, on one hand such a mapping would be created by default, as we
> >> default to "dom0-iommu=map-reserved". Otoh that mapping would be replaced
> >> before Dom0 is actually started, via the domain_creation_finished() hook.
> >> On (modern) VMX that is. So yes, on old VMX and on SVM the slot would need
> >> to remain unpopulated. Otoh, when the physical LAPICs are elsewhere and
> >> when the domain is in x2APIC mode, there would be no reason to disallow
> >> Dom0 access to that page.
> >
> > Right, but that's now how dom0 is started ATM, as the local APIC is
> > unconditionally started in xAPIC mode and at APIC_DEFAULT_PHYS_BASE.
> >
> > I could use vlapic_base_address() against vCPU#0 vlapic, but even in
> > guest_wrmsr_apic_base() we don't allow moving the local APIC MMIO
> > region, and hence I assumed it was fine to just use
> > APIC_DEFAULT_PHYS_BASE here. Note in pvh_setup_acpi_madt() Xen also
> > hardcodes the local APIC address to APIC_DEFAULT_PHYS_BASE.
> >
> > Would you be fine if I expand the comment so it's:
> >
> > /* If using an emulated local APIC make sure its MMIO is unpopulated. */
> > if ( has_vlapic(d) )
> > {
> > /* Xen doesn't allow changing the local APIC MMIO window position. */
> > mfn = paddr_to_pfn(APIC_DEFAULT_PHYS_BASE);
> > rc |= iomem_deny_access(d, mfn, mfn);
> > }
>
> That will do, I think. Then:
> Acked-by: Jan Beulich <jbeulich@suse.com>
Thanks.
> >> That would apparently mean fiddling with
> >> iomem_caps once all vCPU-s have entered x2APIC mode.
> >
> > Urg, that seems ugly. It would also need undoing if the APICs are
> > reverted to xAPIC mode?
>
> Right.
>
> >> With LAPICs not
> >> normally being elsewhere, question is whether this corner case actually
> >> needs dealing with. Yet even if not, commentary may want extending, just
> >> to make clear the case was considered?
> >
> > As said above, for both HVM and PVH Xen doesn't allow moving the APIC
> > MMIO window to anything different than APIC_DEFAULT_PHYS_BASE.
>
> I was talking about the real one Xen uses.
Oh, OK, I was confused by the context I think, sorry then.
Roger.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-02-19 16:48 [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
` (2 preceding siblings ...)
2025-02-19 16:48 ` [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
@ 2025-03-05 14:27 ` Jan Beulich
2025-03-05 14:35 ` Roger Pau Monné
3 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2025-03-05 14:27 UTC (permalink / raw)
To: Roger Pau Monne, Andrew Cooper; +Cc: xen-devel
On 19.02.2025 17:48, Roger Pau Monne wrote:
> Hello,
>
> First two patches are preparatory changes to reduce the changes required
> in patch 3. I would have wanted those to go in 4.20 to fix the issues
> on Lenovo Thinkpads, but it's too late now.
>
> Thanks, Roger.
>
> Roger Pau Monne (3):
> x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
> x86/iommu: account for IOMEM caps when populating dom0 IOMMU
> page-tables
> x86/dom0: be less restrictive with the Interrupt Address Range
I'm uncertain whether to take this and "x86/pvh: workaround missing MMIO
regions in dom0 p2m" for backport. The sole Fixes: tag is in patch 1 here.
Thoughts?
Thanks, Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-03-05 14:27 ` [PATCH v3 0/3] " Jan Beulich
@ 2025-03-05 14:35 ` Roger Pau Monné
2025-03-05 14:54 ` Jan Beulich
0 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2025-03-05 14:35 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel
On Wed, Mar 05, 2025 at 03:27:18PM +0100, Jan Beulich wrote:
> On 19.02.2025 17:48, Roger Pau Monne wrote:
> > Hello,
> >
> > First two patches are preparatory changes to reduce the changes required
> > in patch 3. I would have wanted those to go in 4.20 to fix the issues
> > on Lenovo Thinkpads, but it's too late now.
> >
> > Thanks, Roger.
> >
> > Roger Pau Monne (3):
> > x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
> > x86/iommu: account for IOMEM caps when populating dom0 IOMMU
> > page-tables
> > x86/dom0: be less restrictive with the Interrupt Address Range
>
> I'm uncertain whether to take this and "x86/pvh: workaround missing MMIO
> regions in dom0 p2m" for backport. The sole Fixes: tag is in patch 1 here.
> Thoughts?
At least the ones here would be helpful for the reported Lenovo
Thinkpad issue. The PVH p2m addition would be nice IMO.
Thanks, Roger.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-03-05 14:35 ` Roger Pau Monné
@ 2025-03-05 14:54 ` Jan Beulich
2025-03-05 17:42 ` Roger Pau Monné
0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2025-03-05 14:54 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Andrew Cooper, xen-devel
On 05.03.2025 15:35, Roger Pau Monné wrote:
> On Wed, Mar 05, 2025 at 03:27:18PM +0100, Jan Beulich wrote:
>> On 19.02.2025 17:48, Roger Pau Monne wrote:
>>> Hello,
>>>
>>> First two patches are preparatory changes to reduce the changes required
>>> in patch 3. I would have wanted those to go in 4.20 to fix the issues
>>> on Lenovo Thinkpads, but it's too late now.
>>>
>>> Thanks, Roger.
>>>
>>> Roger Pau Monne (3):
>>> x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
>>> x86/iommu: account for IOMEM caps when populating dom0 IOMMU
>>> page-tables
>>> x86/dom0: be less restrictive with the Interrupt Address Range
>>
>> I'm uncertain whether to take this and "x86/pvh: workaround missing MMIO
>> regions in dom0 p2m" for backport. The sole Fixes: tag is in patch 1 here.
>> Thoughts?
>
> At least the ones here would be helpful for the reported Lenovo
> Thinkpad issue. The PVH p2m addition would be nice IMO.
Are the ones here sufficient to deal with that issue? IOW iasn't the other
2-patch series also necessary?
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range
2025-03-05 14:54 ` Jan Beulich
@ 2025-03-05 17:42 ` Roger Pau Monné
0 siblings, 0 replies; 18+ messages in thread
From: Roger Pau Monné @ 2025-03-05 17:42 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel
On Wed, Mar 05, 2025 at 03:54:56PM +0100, Jan Beulich wrote:
> On 05.03.2025 15:35, Roger Pau Monné wrote:
> > On Wed, Mar 05, 2025 at 03:27:18PM +0100, Jan Beulich wrote:
> >> On 19.02.2025 17:48, Roger Pau Monne wrote:
> >>> Hello,
> >>>
> >>> First two patches are preparatory changes to reduce the changes required
> >>> in patch 3. I would have wanted those to go in 4.20 to fix the issues
> >>> on Lenovo Thinkpads, but it's too late now.
> >>>
> >>> Thanks, Roger.
> >>>
> >>> Roger Pau Monne (3):
> >>> x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
> >>> x86/iommu: account for IOMEM caps when populating dom0 IOMMU
> >>> page-tables
> >>> x86/dom0: be less restrictive with the Interrupt Address Range
> >>
> >> I'm uncertain whether to take this and "x86/pvh: workaround missing MMIO
> >> regions in dom0 p2m" for backport. The sole Fixes: tag is in patch 1 here.
> >> Thoughts?
> >
> > At least the ones here would be helpful for the reported Lenovo
> > Thinkpad issue. The PVH p2m addition would be nice IMO.
>
> Are the ones here sufficient to deal with that issue? IOW iasn't the other
> 2-patch series also necessary?
For a PV dom0, yes, the patches here are enough. For a PVH dom0 you
also need "x86/pvh: workaround missing MMIO regions in dom0 p2m".
Given that we now officially support PVH I think we would need to
backport the latter, to have parity between PV and PVH dom0.
Thanks, Roger.
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-03-05 17:42 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-19 16:48 [PATCH v3 0/3] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
2025-02-19 16:48 ` [PATCH v3 1/3] x86/dom0: correctly set the maximum ->iomem_caps bound for PVH Roger Pau Monne
2025-02-20 8:22 ` Jan Beulich
2025-02-20 8:49 ` Roger Pau Monné
2025-02-20 13:02 ` Jan Beulich
2025-02-19 16:48 ` [PATCH v3 2/3] x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables Roger Pau Monne
2025-02-20 8:29 ` Jan Beulich
2025-02-19 16:48 ` [PATCH v3 3/3] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
2025-02-20 8:33 ` Jan Beulich
2025-02-20 8:55 ` Roger Pau Monné
2025-02-20 13:30 ` Jan Beulich
2025-02-20 15:40 ` Roger Pau Monné
2025-02-20 16:05 ` Jan Beulich
2025-02-20 16:16 ` Roger Pau Monné
2025-03-05 14:27 ` [PATCH v3 0/3] " Jan Beulich
2025-03-05 14:35 ` Roger Pau Monné
2025-03-05 14:54 ` Jan Beulich
2025-03-05 17:42 ` Roger Pau Monné
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.