Re: [PATCH 1/3] amd-vi: use the same IOMMU page table levels for PV and HVM

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH 1/3] amd-vi: use the same IOMMU page table levels for PV and HVM
Date: Mon, 20 Nov 2023 11:27:10 +0100	[thread overview]
Message-ID: <ZVs0fvUQI9C7NRc5@macbook.local> (raw)
In-Reply-To: <0392fe23-7589-4dc2-b664-1e7e01c3914d@suse.com>

On Mon, Nov 20, 2023 at 10:45:29AM +0100, Jan Beulich wrote:
> On 17.11.2023 12:55, Andrew Cooper wrote:
> > On 17/11/2023 9:47 am, Roger Pau Monne wrote:
> >> Using different page table levels for HVM or PV guest is not helpful, and is
> >> not inline with the IOMMU implementation used by the other architecture vendor
> >> (VT-d).
> >>
> >> Switch to uniformly use DEFAULT_DOMAIN_ADDRESS_WIDTH in order to set the AMD-Vi
> >> page table levels.
> >>
> >> Note using the max RAM address for PV was bogus anyway, as there's no guarantee
> >> there can't be device MMIO or reserved regions past the maximum RAM region.
> > 
> > Indeed - and the MMIO regions do matter for P2P DMA.
> 
> So what about any such living above the 48-bit boundary (i.e. not covered
> by DEFAULT_DOMAIN_ADDRESS_WIDTH)?

That would only work for PV guests AFAICT, as HVM guests will already
refuse to create such mappings even before getting into the IOMMU
code: p2m_pt_set_entry() will return an error as the p2m code only
deals with 4 level page tables.

> 
> >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > 
> > Variable-height IOMMU pagetables are not worth the security
> > vulnerabilities they're made of.  I regret not fighting hard enough to
> > kill them entirely several years ago...
> > 
> > Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>, although...
> > 
> >> ---
> >>  xen/drivers/passthrough/amd/pci_amd_iommu.c | 20 ++++++++------------
> >>  1 file changed, 8 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >> index 6bc73dc21052..f9e749d74da2 100644
> >> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >> @@ -359,21 +359,17 @@ int __read_mostly amd_iommu_min_paging_mode = 1;
> >>  static int cf_check amd_iommu_domain_init(struct domain *d)
> >>  {
> >>      struct domain_iommu *hd = dom_iommu(d);
> >> +    int pgmode = amd_iommu_get_paging_mode(
> >> +        1UL << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT));
> > 
> > "paging mode" comes from the spec, but it's a very backwards way of
> > spelling height.
> > 
> > Can we at least start to improve the comprehensibility by renaming this
> > variable.
> > 
> >> +
> >> +    if ( pgmode < 0 )
> >> +        return pgmode;
> >>  
> >>      /*
> >> -     * Choose the number of levels for the IOMMU page tables.
> >> -     * - PV needs 3 or 4, depending on whether there is RAM (including hotplug
> >> -     *   RAM) above the 512G boundary.
> >> -     * - HVM could in principle use 3 or 4 depending on how much guest
> >> -     *   physical address space we give it, but this isn't known yet so use 4
> >> -     *   unilaterally.
> >> -     * - Unity maps may require an even higher number.
> >> +     * Choose the number of levels for the IOMMU page tables, taking into
> >> +     * account unity maps.
> >>       */
> >> -    hd->arch.amd.paging_mode = max(amd_iommu_get_paging_mode(
> >> -            is_hvm_domain(d)
> >> -            ? 1UL << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT)
> >> -            : get_upper_mfn_bound() + 1),
> >> -        amd_iommu_min_paging_mode);
> >> +    hd->arch.amd.paging_mode = max(pgmode, amd_iommu_min_paging_mode);
> > 
> > I think these min/max variables can be dropped now we're not doing
> > variable height IOMMU pagetables, which further simplifies this expression.
> 
> Did you take unity maps into account? At least $subject and comment looks
> to not be consistent in this regard: Either unity maps need considering
> specially (and then we don't uniformly use the same depth), or they don't
> need mentioning in the comment (anymore).

Unity maps that require an address width > DEFAULT_DOMAIN_ADDRESS_WIDTH
will currently only work on PV at best, as HVM p2m code is limited to
4 level page tables, so even if the IOMMU page tables support a
greater address width the call to map such regions will trigger an
error in the p2m code way before attempting to create any IOMMU
mappings.

We could do:

hd->arch.amd.paging_mode =
    is_hvm_domain(d) ? pgmode : max(pgmode, amd_iommu_min_paging_mode);

Putting IVMD/RMRR regions that require the usage of 5 level page
tables would be a very short sighted move by vendors IMO.

And will put us back in a situation where PV vs HVM can get different
IOMMU page table levels, which is undesirable.  It might be better to
just assume all domains use DEFAULT_DOMAIN_ADDRESS_WIDTH and hide
devices that have IVMD/RMRR regions above that limit.

Thanks, Roger.

next prev parent reply	other threads:[~2023-11-20 10:27 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-17  9:47 [PATCH 0/3] x86/iommu: improve setup time of hwdom IOMMU Roger Pau Monne
2023-11-17  9:47 ` [PATCH 1/3] amd-vi: use the same IOMMU page table levels for PV and HVM Roger Pau Monne
2023-11-17 11:55   ` Andrew Cooper
2023-11-20  9:45     ` Jan Beulich
2023-11-20 10:27       ` Roger Pau Monné [this message]
2023-11-20 10:37         ` Jan Beulich
2023-11-20 10:50           ` Roger Pau Monné
2023-11-20 11:34             ` Jan Beulich
2023-11-20 12:01               ` Roger Pau Monné
2023-11-20 16:27                 ` Jan Beulich
2023-11-17  9:47 ` [PATCH 2/3] x86/iommu: move xen_in_range() so it can be made static Roger Pau Monne
2023-11-17 12:03   ` Andrew Cooper
2023-11-17 12:50     ` Roger Pau Monné
2023-11-17  9:47 ` [PATCH 3/3] x86/iommu: use a rangeset for hwdom setup Roger Pau Monne
2023-11-17 12:04   ` Roger Pau Monné
2023-11-17 14:34   ` Andrew Cooper
2023-11-20 10:30   ` Jan Beulich
2023-11-20 10:43     ` Roger Pau Monné

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZVs0fvUQI9C7NRc5@macbook.local \
    --to=roger.pau@citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.