From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Paul Durrant <paul@xen.org>, Kevin Tian <kevin.tian@intel.com>
Subject: Re: [PATCH v2 15/18] IOMMU/x86: prefill newly allocate page tables
Date: Tue, 14 Dec 2021 15:50:56 +0100
Message-ID: <YbivUH/Er0o2PwsG@Air-de-Roger>
In-Reply-To: <2656844d-47cc-70c3-d7ce-7d83967d576e@suse.com>

On Fri, Sep 24, 2021 at 11:54:58AM +0200, Jan Beulich wrote:
> Page table are used for two purposes after allocation: They either start
> out all empty, or they get filled to replace a superpage. Subsequently,
> to replace all empty or fully contiguous page tables, contiguous sub-
> regions will be recorded within individual page tables. Install the
> initial set of markers immediately after allocation. Make sure to retain
> these markers when further populating a page table in preparation for it
> to replace a superpage.
> 
> The markers are simply 4-bit fields holding the order value of
> contiguous entries. To demonstrate this, if a page table had just 16
> entries, this would be the initial (fully contiguous) set of markers:
> 
> index  0 1 2 3 4 5 6 7 8 9 A B C D E F
> marker 4 0 1 0 2 0 1 0 3 0 1 0 2 0 1 0
> 
> "Contiguous" here means not only present entries with successively
> increasing MFNs, each one suitably aligned for its slot, but also a
> respective number of all non-present entries.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Obviously this marker only works for newly created page tables right
now: the moment we start poking holes or replacing entries, the marker
is no longer updated. I expect further patches will expand on this.
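
For reference, the initial marker layout from the commit message can be
reproduced with a few lines of standalone C. This is only a sketch under
my reading of the description: lowest_set_bit() is a hypothetical
stand-in for find_first_set_bit(), and fill_initial_markers() is not a
function from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for find_first_set_bit(): index of lowest set bit. */
unsigned int lowest_set_bit(unsigned int v)
{
    unsigned int n = 0;

    assert(v != 0);
    while ( !(v & 1) )
    {
        v >>= 1;
        n++;
    }

    return n;
}

/*
 * Fill markers[0 .. (1 << order) - 1] with the initial (fully contiguous)
 * values: slot 0 covers the whole table, every other slot covers the
 * largest naturally aligned block that starts at its index.
 */
void fill_initial_markers(uint8_t *markers, unsigned int order)
{
    unsigned int i;

    markers[0] = (uint8_t)order;
    for ( i = 1; i < (1u << order); i++ )
        markers[i] = (uint8_t)lowest_set_bit(i);
}
```

Calling fill_initial_markers(m, 4) on a 16-entry array should give the
"4 0 1 0 2 0 1 0 3 0 1 0 2 0 1 0" row quoted above.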

> ---
> An alternative to the ASSERT()s added to set_iommu_ptes_present() would
> be to make the function less general-purpose; it's used in a single
> place only after all (i.e. it might as well be folded into its only
> caller).
> ---
> v2: New.
> 
> --- a/xen/drivers/passthrough/amd/iommu-defs.h
> +++ b/xen/drivers/passthrough/amd/iommu-defs.h
> @@ -445,6 +445,8 @@ union amd_iommu_x2apic_control {
>  #define IOMMU_PAGE_TABLE_U32_PER_ENTRY	(IOMMU_PAGE_TABLE_ENTRY_SIZE / 4)
>  #define IOMMU_PAGE_TABLE_ALIGNMENT	4096
>  
> +#define IOMMU_PTE_CONTIG_MASK           0x1e /* The ign0 field below. */

Should you rename ign0 to contig_mask or some such now?

The same would apply to the comment next to dma_pte for VT-d, where bits
52:62 are ignored (the comment seems to be missing this already) and
we will be using bits 52:55 to store the contiguous mask for the
entry.
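
To illustrate how a 4-bit marker sits in either PTE format, here is a
minimal sketch of mask-driven accessors. The AMD mask value is the one
from the hunk above; VTD_CONTIG_MASK, mask_shift(), pte_get_marker() and
pte_set_marker() are hypothetical names I made up for the example, with
the VT-d mask assumed to cover bits 52:55.

```c
#include <assert.h>
#include <stdint.h>

/* Mask from the patch: bits 1:4 of an AMD IOMMU PTE (the ign0 field). */
#define AMD_CONTIG_MASK  0x1eull
/* Assumed VT-d mask covering bits 52:55; the name is hypothetical. */
#define VTD_CONTIG_MASK  (0xfull << 52)

/* Position of the lowest set bit of a non-zero mask. */
unsigned int mask_shift(uint64_t mask)
{
    unsigned int shift = 0;

    assert(mask != 0);
    while ( !(mask & 1) )
    {
        mask >>= 1;
        shift++;
    }

    return shift;
}

/* Read the marker stored in pte under the given mask. */
unsigned int pte_get_marker(uint64_t pte, uint64_t mask)
{
    return (unsigned int)((pte & mask) >> mask_shift(mask));
}

/* Return pte with its marker field replaced; all other bits are kept. */
uint64_t pte_set_marker(uint64_t pte, uint64_t mask, unsigned int marker)
{
    uint64_t field = (uint64_t)marker << mask_shift(mask);

    /* The field must be wide enough to hold the value. */
    assert((field & mask) == field);

    return (pte & ~mask) | field;
}
```

With these, storing order 9 (PAGE_SHIFT - 3) yields 0x12 for the AMD
layout and 9ull << 52 for the assumed VT-d layout.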

> +
>  union amd_iommu_pte {
>      uint64_t raw;
>      struct {
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -116,7 +116,19 @@ static void set_iommu_ptes_present(unsig
>  
>      while ( nr_ptes-- )
>      {
> -        set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
> +        ASSERT(!pde->next_level);
> +        ASSERT(!pde->u);
> +
> +        if ( pde > table )
> +            ASSERT(pde->ign0 == find_first_set_bit(pde - table));
> +        else
> +            ASSERT(pde->ign0 == PAGE_SHIFT - 3);

You could even special-case (pde - table) % 2 != 0, but this is
debug-only code, and it's possible a mod is more costly than
find_first_set_bit.
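
A rough sketch of what I mean, as standalone C rather than a change to
the patch. expected_marker() and ffs_index() are hypothetical names,
PAGE_SHIFT is assumed to be 12, and the odd check uses "& 1", which for
a non-negative index is equivalent to the "% 2" test without needing a
division.

```c
#include <assert.h>

#define PAGE_SHIFT 12 /* assumed: 4k pages, 512 8-byte entries per table */

/* Hypothetical stand-in for find_first_set_bit(). */
unsigned int ffs_index(unsigned int v)
{
    unsigned int n = 0;

    assert(v != 0);
    while ( !(v & 1) )
    {
        v >>= 1;
        n++;
    }

    return n;
}

/* Marker expected in slot idx of a freshly allocated table. */
unsigned int expected_marker(unsigned int idx)
{
    if ( idx == 0 )
        return PAGE_SHIFT - 3;  /* slot 0 covers the whole table */

    if ( idx & 1 )
        return 0;               /* odd slots always have order 0 */

    return ffs_index(idx);
}
```

Half of the slots take the early return, so the bit scan only runs for
the even, non-zero indexes.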

> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -433,12 +433,12 @@ int iommu_free_pgtables(struct domain *d
>      return 0;
>  }
>  
> -struct page_info *iommu_alloc_pgtable(struct domain *d)
> +struct page_info *iommu_alloc_pgtable(struct domain *d, uint64_t contig_mask)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
>      unsigned int memflags = 0;
>      struct page_info *pg;
> -    void *p;
> +    uint64_t *p;
>  
>  #ifdef CONFIG_NUMA
>      if ( hd->node != NUMA_NO_NODE )
> @@ -450,7 +450,28 @@ struct page_info *iommu_alloc_pgtable(st
>          return NULL;
>  
>      p = __map_domain_page(pg);
> -    clear_page(p);
> +
> +    if ( contig_mask )
> +    {
> +        unsigned int i, shift = find_first_set_bit(contig_mask);
> +
> +        ASSERT(((PAGE_SHIFT - 3) & (contig_mask >> shift)) == PAGE_SHIFT - 3);
> +
> +        p[0] = (PAGE_SHIFT - 3ull) << shift;
> +        p[1] = 0;
> +        p[2] = 1ull << shift;
> +        p[3] = 0;
> +
> +        for ( i = 4; i < PAGE_SIZE / 8; i += 4 )
> +        {
> +            p[i + 0] = (find_first_set_bit(i) + 0ull) << shift;
> +            p[i + 1] = 0;
> +            p[i + 2] = 1ull << shift;
> +            p[i + 3] = 0;
> +        }

You could likely do:

for ( i = 0; i < PAGE_SIZE / 8; i += 4 )
{
    p[i + 0] = i ? ((find_first_set_bit(i) + 0ull) << shift)
                 : ((PAGE_SHIFT - 3ull) << shift);
    p[i + 1] = 0;
    p[i + 2] = 1ull << shift;
    p[i + 3] = 0;
}

This avoids having to open-code the first loop iteration. The ternary
operator could also be nested before the shift, but I find that
harder to read.
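
In case it helps, a standalone sketch that compares the open-coded
version from the patch against a single loop with the ternary nested
before the shift. None of this is meant for the tree: first_set_bit()
is a hypothetical stand-in for find_first_set_bit(), fills_match() is
made up for the comparison, and PAGE_SHIFT is assumed to be 12.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT 12                    /* assumed: 4k pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NR_ENTRIES (PAGE_SIZE / 8)

/* Hypothetical stand-in for find_first_set_bit(). */
unsigned int first_set_bit(unsigned int v)
{
    unsigned int n = 0;

    assert(v != 0);
    while ( !(v & 1) )
    {
        v >>= 1;
        n++;
    }

    return n;
}

/* First iteration open-coded, as in the patch. */
void fill_open_coded(uint64_t *p, unsigned int shift)
{
    unsigned int i;

    p[0] = (PAGE_SHIFT - 3ull) << shift;
    p[1] = 0;
    p[2] = 1ull << shift;
    p[3] = 0;

    for ( i = 4; i < NR_ENTRIES; i += 4 )
    {
        p[i + 0] = (first_set_bit(i) + 0ull) << shift;
        p[i + 1] = 0;
        p[i + 2] = 1ull << shift;
        p[i + 3] = 0;
    }
}

/* Single loop, ternary nested before the shift. */
void fill_single_loop(uint64_t *p, unsigned int shift)
{
    unsigned int i;

    for ( i = 0; i < NR_ENTRIES; i += 4 )
    {
        p[i + 0] = (i ? first_set_bit(i) + 0ull : PAGE_SHIFT - 3ull) << shift;
        p[i + 1] = 0;
        p[i + 2] = 1ull << shift;
        p[i + 3] = 0;
    }
}

/* Returns 1 if both variants produce identical tables for this shift. */
int fills_match(unsigned int shift)
{
    static uint64_t a[NR_ENTRIES], b[NR_ENTRIES];

    fill_open_coded(a, shift);
    fill_single_loop(b, shift);

    return memcmp(a, b, sizeof(a)) == 0;
}
```

Both variants should produce the same table for any shift the callers
pass, e.g. fills_match(1) for the AMD mask.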

> +    }
> +    else
> +        clear_page(p);
>  
>      if ( hd->platform_ops->sync_cache )
>          iommu_vcall(hd->platform_ops, sync_cache, p, PAGE_SIZE);
> --- a/xen/include/asm-x86/iommu.h
> +++ b/xen/include/asm-x86/iommu.h
> @@ -142,7 +142,8 @@ int pi_update_irte(const struct pi_desc
>  })
>  
>  int __must_check iommu_free_pgtables(struct domain *d);
> -struct page_info *__must_check iommu_alloc_pgtable(struct domain *d);
> +struct page_info *__must_check iommu_alloc_pgtable(struct domain *d,
> +                                                   uint64_t contig_mask);
>  void iommu_queue_free_pgtable(struct domain *d, struct page_info *pg);
>  
>  #endif /* !__ARCH_X86_IOMMU_H__ */
> 


