From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
xen-devel@lists.xenproject.org
Subject: Re: [PATCH v2 1/2] x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables
Date: Tue, 18 Feb 2025 17:38:47 +0100 [thread overview]
Message-ID: <Z7S3l_AWEGrW57VH@macbook.local> (raw)
In-Reply-To: <19888001-ceac-44a8-8d14-cb0dd6d19b2c@suse.com>
On Tue, Feb 18, 2025 at 04:32:33PM +0100, Jan Beulich wrote:
> On 17.02.2025 15:16, Roger Pau Monne wrote:
> > The current code in arch_iommu_hwdom_init() kind of open-codes the same
> > MMIO permission ranges that are added to the hardware domain ->iomem_caps.
> > Avoid this duplication and use ->iomem_caps in arch_iommu_hwdom_init() to
> > filter which memory regions should be added to the dom0 IOMMU page-tables.
> >
> > This should result in no change in the memory regions that end up identity
> > mapped in the domain IOMMU page tables.
> >
> > Note the IO-APIC and MCFG page(s) must be set as not accessible for a PVH
> > dom0, otherwise the internal Xen emulation for those ranges won't work.
> > This requires an adjustment in dom0_setup_permissions().
> >
> > Also the special casing of E820_UNUSABLE regions no longer needs to be done
> > in arch_iommu_hwdom_init(), as those regions are already blocked in
> > ->iomem_caps and thus would be removed from the rangeset as part of
> > ->iomem_caps processing in arch_iommu_hwdom_init().
>
> Only almost: ->iomem_caps has them removed only for addresses 1Mb and up.
Right, but we would like to have parity between IOMMU and CPU
page-tables, and hence should also allow IOMMU to map unusable regions
below 1Mb if the CPU is also allowed to map them. I can tweak the
commit message to note this is done on purpose.
> > --- a/xen/arch/x86/dom0_build.c
> > +++ b/xen/arch/x86/dom0_build.c
> > @@ -552,7 +552,8 @@ int __init dom0_setup_permissions(struct domain *d)
> > for ( i = 0; i < nr_ioapics; i++ )
> > {
> > mfn = paddr_to_pfn(mp_ioapics[i].mpc_apicaddr);
> > - if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
> > + if ( is_hvm_domain(d) ||
> > + !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
> > rc |= iomem_deny_access(d, mfn, mfn);
> > }
>
> The original code used has_vioapic() and had a comment. At least one of#
> the two would better be transplanted here, I think.
I can indeed switch to has_vioapic() and keep the comment.
> > @@ -593,6 +594,22 @@ int __init dom0_setup_permissions(struct domain *d)
> > rc |= rangeset_add_singleton(mmio_ro_ranges, mfn);
> > }
> >
> > + /* For PVH dom0 prevent access to MCFG, it's emulated by Xen. */
> > + if ( is_hvm_domain(d) )
> > + {
> > + for ( i = 0; i < pci_mmcfg_config_num; i++ )
> > + {
> > + const unsigned long s =
> > + PFN_DOWN(pci_mmcfg_config[i].address) +
> > + PCI_BDF(pci_mmcfg_config[i].start_bus_number, 0, 0);
> > + const unsigned long e =
> > + PFN_DOWN(pci_mmcfg_config[i].address) +
> > + PCI_BDF(pci_mmcfg_config[i].end_bus_number, ~0, ~0);
> > +
> > + rc |= iomem_deny_access(d, s, e);
> > + }
> > + }
>
> Similarly here, the original code used has_vpci() and also had a comment.
> Is there actually a strong reason to replace the original MCFG code?
Hm, I see, I could tweak vpci_subtract_mmcfg() to remove the regions
from ->iomem_cap instead of open-coding here.
> > --- a/xen/drivers/passthrough/x86/iommu.c
> > +++ b/xen/drivers/passthrough/x86/iommu.c
> > @@ -320,6 +320,26 @@ static int __hwdom_init cf_check map_subtract(unsigned long s, unsigned long e,
> > return rangeset_remove_range(map, s, e);
> > }
> >
> > +struct handle_iomemcap {
> > + struct rangeset *r;
> > + unsigned long last;
> > +};
> > +static int __hwdom_init cf_check map_subtract_iomemcap(unsigned long s,
> > + unsigned long e,
> > + void *data)
> > +{
> > + struct handle_iomemcap *h = data;
> > + int rc = 0;
> > +
> > + if ( h->last != s )
> > + rc = rangeset_remove_range(h->r, h->last, s - 1);
> > +
> > + /* Be careful with overflows. */
> > + h->last = max(e + 1, e);
>
> What overflow could occur here? Addresses are limited to 52 bits; frame
> numbers are limited to 40 bits. And ->iomem_caps starts out with a range
> ending at the last address permitted by paddr_bits.
I was misremembering ->iomem_caps being initialized as [0, ~0UL] for
dom0, but that's not the case. I can indeed drop the max() here.
> > @@ -475,22 +488,13 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> > if ( rc )
> > panic("IOMMU failed to remove Xen ranges: %d\n", rc);
> >
> > - /* Remove any overlap with the Interrupt Address Range. */
> > - rc = rangeset_remove_range(map, 0xfee00, 0xfeeff);
> > + iomem.r = map;
> > + rc = rangeset_report_ranges(d->iomem_caps, 0, ~0UL, map_subtract_iomemcap,
> > + &iomem);
> > + if ( !rc && iomem.last < ~0UL )
> > + rc = rangeset_remove_range(map, iomem.last, ~0UL);
>
> Similarly here I'm wondering about the upper bound of ~0UL. Is this just
> to be on the safe side? And/or just because it's simpler than calculating
> the actual upper bound?
I could use the actual upper bound, as we already use it a bit below
in:
/* Remove any regions past the last address addressable by the domain. */
rc = rangeset_remove_range(map, PFN_DOWN(1UL << domain_max_paddr_bits(d)),
~0UL);
However using ~0UL is also correct for the purposes here.
> No ranges above the system physical address width
> should ever be entered into the rangeset. Kind of as an implication
> iomem.last also is guaranteed to be below ~0UL when making it here.
I could add an assert:
ASSERT(iomem.last < PFN_DOWN(1UL << domain_max_paddr_bits(d)));
But then I would need to adjust dom0_setup_permissions() to also use
domain_max_paddr_bits() instead of using paddr_bits for the
->iomem_caps initial max address. That should likely be a separate
patch on it's own.
Thanks, Roger.
next prev parent reply other threads:[~2025-02-18 16:39 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-17 14:16 [PATCH v2 0/2] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
2025-02-17 14:16 ` [PATCH v2 1/2] x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables Roger Pau Monne
2025-02-18 15:32 ` Jan Beulich
2025-02-18 16:38 ` Roger Pau Monné [this message]
2025-02-17 14:16 ` [PATCH v2 2/2] x86/dom0: be less restrictive with the Interrupt Address Range Roger Pau Monne
2025-02-18 15:35 ` Jan Beulich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z7S3l_AWEGrW57VH@macbook.local \
--to=roger.pau@citrix.com \
--cc=andrew.cooper3@citrix.com \
--cc=jbeulich@suse.com \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.