Re: [PATCH 1/3] amd-vi: use the same IOMMU page table levels for PV and HVM

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH 1/3] amd-vi: use the same IOMMU page table levels for PV and HVM
Date: Mon, 20 Nov 2023 11:50:01 +0100	[thread overview]
Message-ID: <ZVs52elOtWbTaD0i@macbook.local> (raw)
In-Reply-To: <f414730f-307d-4ef6-9aaa-a861925fdab4@suse.com>

On Mon, Nov 20, 2023 at 11:37:43AM +0100, Jan Beulich wrote:
> On 20.11.2023 11:27, Roger Pau Monné wrote:
> > On Mon, Nov 20, 2023 at 10:45:29AM +0100, Jan Beulich wrote:
> >> On 17.11.2023 12:55, Andrew Cooper wrote:
> >>> On 17/11/2023 9:47 am, Roger Pau Monne wrote:
> >>>>      /*
> >>>> -     * Choose the number of levels for the IOMMU page tables.
> >>>> -     * - PV needs 3 or 4, depending on whether there is RAM (including hotplug
> >>>> -     *   RAM) above the 512G boundary.
> >>>> -     * - HVM could in principle use 3 or 4 depending on how much guest
> >>>> -     *   physical address space we give it, but this isn't known yet so use 4
> >>>> -     *   unilaterally.
> >>>> -     * - Unity maps may require an even higher number.
> >>>> +     * Choose the number of levels for the IOMMU page tables, taking into
> >>>> +     * account unity maps.
> >>>>       */
> >>>> -    hd->arch.amd.paging_mode = max(amd_iommu_get_paging_mode(
> >>>> -            is_hvm_domain(d)
> >>>> -            ? 1UL << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT)
> >>>> -            : get_upper_mfn_bound() + 1),
> >>>> -        amd_iommu_min_paging_mode);
> >>>> +    hd->arch.amd.paging_mode = max(pgmode, amd_iommu_min_paging_mode);
> >>>
> >>> I think these min/max variables can be dropped now we're not doing
> >>> variable height IOMMU pagetables, which further simplifies this expression.
> >>
> >> Did you take unity maps into account? At least $subject and comment looks
> >> to not be consistent in this regard: Either unity maps need considering
> >> specially (and then we don't uniformly use the same depth), or they don't
> >> need mentioning in the comment (anymore).
> > 
> > Unity maps that require an address width > DEFAULT_DOMAIN_ADDRESS_WIDTH
> > will currently only work on PV at best, as HVM p2m code is limited to
> > 4 level page tables, so even if the IOMMU page tables support a
> > greater address width the call to map such regions will trigger an
> > error in the p2m code way before attempting to create any IOMMU
> > mappings.
> > 
> > We could do:
> > 
> > hd->arch.amd.paging_mode =
> >     is_hvm_domain(d) ? pgmode : max(pgmode, amd_iommu_min_paging_mode);
> > 
> > Putting IVMD/RMRR regions that require the usage of 5 level page
> > tables would be a very short sighted move by vendors IMO.
> > 
> > And will put us back in a situation where PV vs HVM can get different
> > IOMMU page table levels, which is undesirable.  It might be better to
> > just assume all domains use DEFAULT_DOMAIN_ADDRESS_WIDTH and hide
> > devices that have IVMD/RMRR regions above that limit.
> 
> That's a possible approach, yes. To be honest, I was actually hoping we'd
> move in a different direction: Do away with the entirely arbitrary
> DEFAULT_DOMAIN_ADDRESS_WIDTH, and use actual system properties instead.

Hm, yes, that might be a sensible approach, but right now I don't want
to block this series on such (likely big) piece of work.  I think we
should aim for HVM and PV to have the same IOMMU page table levels,
and that's currently limited by the p2m code only supporting 4 levels.

> Whether having PV and HVM have uniform depth is indeed desirable is also
> not entirely obvious to me. Having looked over patch 3 now, it also
> hasn't become clear to me why the change here is actually a (necessary)
> prereq.

Oh, it's a prereq because I've found AMD systems that have reserved
regions > 512GB, but no RAM past that region.  arch_iommu_hwdom_init()
would fail on those systems when patch 3/3 was applied, as then
reserved regions past the last RAM address are also mapped in
arch_iommu_hwdom_init().

Thanks, Roger.

next prev parent reply	other threads:[~2023-11-20 10:50 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-17  9:47 [PATCH 0/3] x86/iommu: improve setup time of hwdom IOMMU Roger Pau Monne
2023-11-17  9:47 ` [PATCH 1/3] amd-vi: use the same IOMMU page table levels for PV and HVM Roger Pau Monne
2023-11-17 11:55   ` Andrew Cooper
2023-11-20  9:45     ` Jan Beulich
2023-11-20 10:27       ` Roger Pau Monné
2023-11-20 10:37         ` Jan Beulich
2023-11-20 10:50           ` Roger Pau Monné [this message]
2023-11-20 11:34             ` Jan Beulich
2023-11-20 12:01               ` Roger Pau Monné
2023-11-20 16:27                 ` Jan Beulich
2023-11-17  9:47 ` [PATCH 2/3] x86/iommu: move xen_in_range() so it can be made static Roger Pau Monne
2023-11-17 12:03   ` Andrew Cooper
2023-11-17 12:50     ` Roger Pau Monné
2023-11-17  9:47 ` [PATCH 3/3] x86/iommu: use a rangeset for hwdom setup Roger Pau Monne
2023-11-17 12:04   ` Roger Pau Monné
2023-11-17 14:34   ` Andrew Cooper
2023-11-20 10:30   ` Jan Beulich
2023-11-20 10:43     ` Roger Pau Monné

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZVs52elOtWbTaD0i@macbook.local \
    --to=roger.pau@citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.