From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Paul Durrant <paul@xen.org>, Wei Liu <wl@xen.org>
Subject: Re: [PATCH v5 02/15] IOMMU/x86: perform PV Dom0 mappings in batches
Date: Tue, 31 May 2022 18:01:24 +0200
Message-ID: <YpY71HuPOP59Do+Y@Air-de-Roger>
In-Reply-To: <67fd1ed1-4a62-c014-51c0-f547e33fb427@suse.com>

On Fri, May 27, 2022 at 01:12:48PM +0200, Jan Beulich wrote:
> For large page mappings to be easily usable (i.e. in particular without
> un-shattering of smaller page mappings) and for mapping operations to
> then also be more efficient, pass batches of Dom0 memory to iommu_map().
> In dom0_construct_pv() and its helpers (covering strict mode) this
> additionally requires establishing the type of those pages (albeit with
> zero type references).
> 
> The earlier establishing of PGT_writable_page | PGT_validated requires
> the existing places where this gets done (through get_page_and_type())
> to be updated: For pages which actually have a mapping, the type
> refcount needs to be 1.
> 
> There is actually a related bug that gets fixed here as a side effect:
> Typically the last L1 table would get marked as such only after
> get_page_and_type(..., PGT_writable_page). While this is fine as far as
> refcounting goes, the page did remain mapped in the IOMMU in this case
> (when "iommu=dom0-strict").
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> Subsequently p2m_add_identity_entry() may want to also gain an order
> parameter, for arch_iommu_hwdom_init() to use. While this only affects
> non-RAM regions, systems typically have 2-16Mb of reserved space
> immediately below 4Gb, which hence could be mapped more efficiently.
> 
> Eventually we may want to overhaul this logic to use a rangeset based
> approach instead, punching holes into originally uniformly large-page-
> mapped regions. Doing so right here would first and foremost be yet more
> of a change.
> 
> The installing of zero-ref writable types has in fact shown (observed
> while putting together the change) that despite the intention by the
> XSA-288 changes (affecting DomU-s only) for Dom0 a number of
> sufficiently ordinary pages (at the very least initrd and P2M ones as
> well as pages that are part of the initial allocation but not part of
> the initial mapping) still have been starting out as PGT_none, meaning
> that they would have gained IOMMU mappings only the first time these
> pages would get mapped writably. Consequently an open question is
> whether iommu_memory_setup() should set the pages to PGT_writable_page
> independent of need_iommu_pt_sync().

Hm, I see, non-strict PV dom0s won't get the pages set to
PGT_writable_page even when accessible by devices by virtue of such
domains having all RAM mapped in the IOMMU page-tables.

I guess it does make sense to also have the pages set as
PGT_writable_page by default in that case, as the pages _are_
writable by the IOMMU.  Do pages added during runtime (i.e. ballooned
in) also get PGT_writable_page set?

> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -363,8 +363,8 @@ static unsigned int __hwdom_init hwdom_i
>  
>  void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>  {
> -    unsigned long i, top, max_pfn;
> -    unsigned int flush_flags = 0;
> +    unsigned long i, top, max_pfn, start, count;
> +    unsigned int flush_flags = 0, start_perms = 0;
>  
>      BUG_ON(!is_hardware_domain(d));
>  
> @@ -395,9 +395,9 @@ void __hwdom_init arch_iommu_hwdom_init(
>       * First Mb will get mapped in one go by pvh_populate_p2m(). Avoid
>       * setting up potentially conflicting mappings here.
>       */
> -    i = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
> +    start = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
>  
> -    for ( ; i < top; i++ )
> +    for ( i = start, count = 0; i < top; )
>      {
>          unsigned long pfn = pdx_to_pfn(i);
>          unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
> @@ -406,20 +406,41 @@ void __hwdom_init arch_iommu_hwdom_init(
>          if ( !perms )
>              rc = 0;
>          else if ( paging_mode_translate(d) )
> +        {
>              rc = p2m_add_identity_entry(d, pfn,
>                                          perms & IOMMUF_writable ? p2m_access_rw
>                                                                  : p2m_access_r,
>                                          0);
> +            if ( rc )
> +                printk(XENLOG_WARNING
> +                       "%pd: identity mapping of %lx failed: %d\n",
> +                       d, pfn, rc);
> +        }
> +        else if ( pfn != start + count || perms != start_perms )
> +        {
> +        commit:
> +            rc = iommu_map(d, _dfn(start), _mfn(start), count, start_perms,
> +                           &flush_flags);
> +            if ( rc )
> +                printk(XENLOG_WARNING
> +                       "%pd: IOMMU identity mapping of [%lx,%lx) failed: %d\n",
> +                       d, pfn, pfn + count, rc);
> +            SWAP(start, pfn);
> +            start_perms = perms;
> +            count = 1;
> +        }
>          else
> -            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), 1ul << PAGE_ORDER_4K,
> -                           perms, &flush_flags);
> +        {
> +            ++count;
> +            rc = 0;
> +        }
>  
> -        if ( rc )
> -            printk(XENLOG_WARNING "%pd: identity %smapping of %lx failed: %d\n",
> -                   d, !paging_mode_translate(d) ? "IOMMU " : "", pfn, rc);
>  
> -        if (!(i & 0xfffff))
> +        if ( !(++i & 0xfffff) )
>              process_pending_softirqs();
> +
> +        if ( i == top && count )

Nit: do you really need to check for count != 0? AFAICT count can only
be zero in the first iteration.
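
For illustration only, here is a minimal standalone sketch of the
batching idea in the hunk above: consecutive frames with identical
non-zero permissions are accumulated into a run, and the run is
committed when the next frame does not extend it or when the loop
ends.  This is not the Xen code: struct batch, emit() and
coalesce_runs() are hypothetical names, the output array stands in
for the iommu_map() calls, and unlike the patch the sketch skips
empty runs instead of issuing a zero-length map.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one committed run: frames [start, start + count). */
struct batch {
    unsigned long start;
    unsigned long count;
    unsigned int perms;
};

/* Store one run; stands in for a single batched mapping call. */
static void emit(struct batch *out, size_t max_out, size_t *nr,
                 unsigned long start, unsigned long count, unsigned int perms)
{
    assert(*nr < max_out);
    out[*nr].start = start;
    out[*nr].count = count;
    out[*nr].perms = perms;
    ++*nr;
}

/*
 * Coalesce per-frame permissions into runs.  perms[i] describes frame
 * first_pfn + i; a value of 0 means "do not map this frame".
 * Returns the number of runs written to out.
 */
size_t coalesce_runs(const unsigned int *perms, size_t n,
                     unsigned long first_pfn,
                     struct batch *out, size_t max_out)
{
    unsigned long start = 0, count = 0;
    unsigned int start_perms = 0;
    size_t nr = 0;

    for ( size_t i = 0; i < n; i++ )
    {
        unsigned long pfn = first_pfn + i;

        if ( !perms[i] )
            continue;               /* unmapped frame: leave the run pending */

        if ( count && pfn == start + count && perms[i] == start_perms )
        {
            ++count;                /* frame extends the current run */
            continue;
        }

        if ( count )                /* commit the previous run, if any */
            emit(out, max_out, &nr, start, count, start_perms);

        start = pfn;                /* begin a new run at this frame */
        start_perms = perms[i];
        count = 1;
    }

    if ( count )                    /* commit the final pending run */
        emit(out, max_out, &nr, start, count, start_perms);

    return nr;
}
```

In this sketch count stays zero at the end only when no frame had
non-zero permissions, which is the case the final check guards
against.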

Thanks, Roger.

