From: Julien Grall <julien.grall@arm.com>
To: xen-devel@lists.xen.org
Cc: proskurin@sec.in.tum.de, Julien Grall <julien.grall@arm.com>,
	sstabellini@kernel.org, steve.capper@arm.com,
	wei.chen@linaro.org
Subject: [RFC 18/22] xen/arm: p2m: Introduce p2m_set_entry and __p2m_set_entry
Date: Thu, 28 Jul 2016 15:51:41 +0100	[thread overview]
Message-ID: <1469717505-8026-19-git-send-email-julien.grall@arm.com> (raw)
In-Reply-To: <1469717505-8026-1-git-send-email-julien.grall@arm.com>
The ARM architecture mandates to use of a break-before-make sequence
when changing translation entries if the page table is shared between
multiple CPUs whenever a valid entry is replaced by another valid entry
(see D4.7.1 in ARM DDI 0487A.j for more details).
The break-before-make sequence can be divided in the following steps:
    1) Invalidate the old entry in the page table
    2) Issue a TLB invalidation instruction for the address associated
    to this entry
    3) Write the new entry
The current P2M code implemented in apply_one_level does not respect
this sequence and may result to break coherency on some processors.
Adapting the current implementation to use the break-before-make
sequence would imply some code duplication and more TLBs invalidation
than necessary. For instance, if we are replacing a 4KB page and the
current mapping in the P2M is using a 1GB superpage, the following steps
will happen:
    1) Shatter the 1GB superpage into a series of 2MB superpages
    2) Shatter the 2MB superpage into a series of 4KB pages
    3) Replace the 4KB page
As the current implementation is shattering while descending and install
the mapping, Xen would need to issue 3 TLB invalidation instructions
which is clearly inefficient.
Furthermore, all the operations which modify the page table are using
the same skeleton. It is more complicated to maintain different code paths
than having a generic function that set an entry and take care of the
break-before-make sequence.
The new implementation is based on the x86 EPT one which, I think,
fits quite well for the break-before-make sequence whilst keeping
the code simple.
The main function of the new implementation is __p2m_get_entry. It will
only work on mapping that are aligned to a block entry in the page table
(i.e 1GB, 2MB, 4KB when using a 4KB granularity).
Another function, p2m_get_entry, is provided to break down is region
into mapping that is aligned to a block entry.
Note that to keep this patch "small", there are no caller of those
functions added in this patch (they will be added in follow-up patches).
Signed-off-by: Julien Grall <julien.grall@arm.com>
---
    I need to find the impact of this new implementation on ARM32 because
    the domheap is not always mapped. This means that Xen needs to map/unmap
    everytime the page associated to the page table. It might be possible
    to re-use some caching structure as the current implementation does
    or rework the way a domheap is mapped/unmapped.
    Also, this code still contain few TODOs mostly to add sanity
    checks and few optimization. The IOMMU is not yet supported.
---
 xen/arch/arm/p2m.c | 335 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 335 insertions(+)
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index c93e554..297b176 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -750,6 +750,341 @@ static void p2m_put_l3_page(mfn_t mfn, p2m_type_t type)
     }
 }
 
+#if 0
+/* Free lpae sub-tree behind an entry */
+static void p2m_free_entry(struct p2m_domain *p2m,
+                           lpae_t entry, unsigned int level)
+{
+    unsigned int i;
+    lpae_t *table;
+    mfn_t mfn;
+
+    /* Nothing to do if the entry is invalid or a super-page */
+    if ( !p2m_valid(entry) || p2m_is_superpage(entry, level) )
+        return;
+
+    if ( level == 3 )
+    {
+        p2m_put_l3_page(_mfn(entry.p2m.base), entry.p2m.type);
+        return;
+    }
+
+    table = map_domain_page(_mfn(entry.p2m.base));
+    for ( i = 0; i < LPAE_ENTRIES; i++ )
+        p2m_free_entry(p2m, *(table + i), level + 1);
+
+    unmap_domain_page(table);
+
+    /*
+     * Make sure all the references in the TLB have been removed before
+     * freing the intermediate page table.
+     * XXX: Should we defer the free of the page table to avoid the
+     * flush?
+     */
+    if ( p2m->need_flush )
+        p2m_flush_tlb_sync(p2m);
+
+    mfn = _mfn(entry.p2m.base);
+    ASSERT(mfn_valid(mfn_x(mfn)));
+
+    free_domheap_page(mfn_to_page(mfn_x(mfn)));
+}
+
+static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
+                                unsigned int level, unsigned int target,
+                                const unsigned int *offsets)
+{
+    struct page_info *page;
+    unsigned int i;
+    lpae_t pte, *table;
+    bool rv = true;
+
+    /* Convenience aliases */
+    p2m_type_t t = entry->p2m.type;
+    mfn_t mfn = _mfn(entry->p2m.base);
+
+    /* Convenience aliases */
+    unsigned int next_level = level + 1;
+    unsigned int level_order = level_orders[next_level];
+
+    /*
+     * This should only be called with target != level and the entry is
+     * a superpage.
+     */
+    ASSERT(level < target);
+    ASSERT(p2m_is_superpage(*entry, level));
+
+    page = alloc_domheap_page(NULL, 0);
+    if ( !page )
+        return false;
+
+    page_list_add(page, &p2m->pages);
+    table = __map_domain_page(page);
+
+    /*
+     * We are either splitting a first level 1G page into 512 second level
+     * 2M pages, or a second level 2M page into 512 third level 4K pages.
+     */
+    for ( i = 0; i < LPAE_ENTRIES; i++ )
+    {
+        lpae_t *new_entry = table + i;
+
+        pte = mfn_to_p2m_entry(mfn, t, p2m->default_access);
+
+        mfn = mfn_add(mfn, (1UL << level_order));
+
+        /*
+         * First and second level pages set p2m.table = 0, but third
+         * level entries set p2m.table = 1.
+         */
+        if ( next_level < 3 )
+            pte.p2m.table = 0;
+
+        write_pte(new_entry, pte);
+    }
+
+    /*
+     * Shatter superpage in the page to the level we want to make the
+     * changes.
+     * This is done outside the loop to avoid checking the offset to
+     * know whether the entry should be shattered for every entry.
+     */
+    if ( next_level != target )
+        rv = p2m_split_superpage(p2m, table + offsets[next_level],
+                                 level + 1, target, offsets);
+
+    if ( p2m->clean_pte )
+        clean_dcache_va_range(table, PAGE_SIZE);
+
+    unmap_domain_page(table);
+
+    pte = mfn_to_p2m_entry(_mfn(page_to_mfn(page)), p2m_invalid,
+                           p2m->default_access);
+
+    p2m_write_pte(entry, pte, p2m->clean_pte);
+
+    /*
+     * Even if we failed, we should install the newly allocated LPAE
+     * entry. The caller will be in charge to free the sub-tree.
+     * XXX: See if we can free entry here.
+     */
+    *entry = pte;
+
+    return rv;
+}
+
+/*
+ * Insert an entry in the p2m. This should be called with a mapping
+ * equal to a page/superpage (4K, 2M, 1G).
+ */
+static int __p2m_set_entry(struct p2m_domain *p2m,
+                           gfn_t sgfn,
+                           unsigned int page_order,
+                           mfn_t smfn,
+                           p2m_type_t t,
+                           p2m_access_t a)
+{
+    paddr_t addr = pfn_to_paddr(gfn_x(sgfn));
+    unsigned int level = 0;
+    unsigned int target = 3 - (page_order / LPAE_SHIFT);
+    lpae_t *entry, *table, orig_pte;
+    int rc;
+
+    /* Convenience aliases */
+    const unsigned int offsets[4] = {
+        zeroeth_table_offset(addr),
+        first_table_offset(addr),
+        second_table_offset(addr),
+        third_table_offset(addr)
+    };
+
+    /* TODO: Check the validity for the address */
+
+    ASSERT(p2m_is_write_locked(p2m));
+
+    /*
+     * Check if the level target is valid: we only support
+     * 4K - 2M - 1G mapping.
+     */
+    ASSERT(target > 0 && target <= 3);
+
+    table = p2m_get_root_pointer(p2m, sgfn);
+    if ( !table )
+        return -EINVAL;
+
+    for ( level = P2M_ROOT_LEVEL; level < target; level++ )
+    {
+        rc = p2m_next_level(p2m, false, &table, offsets[level]);
+        if ( rc == GUEST_TABLE_MAP_FAILED )
+        {
+            rc = -ENOENT;
+            goto out;
+        }
+        else if ( rc != GUEST_TABLE_NORMAL_PAGE )
+            break;
+    }
+
+    entry = table + offsets[level];
+
+    /*
+     * If we are here with level < target, we must be at a leaf node,
+     * and we need to break up the superpage.
+     */
+    if ( level < target )
+    {
+        /* We need to split the original page. */
+        lpae_t split_pte = *entry;
+
+        ASSERT(p2m_is_superpage(*entry, level));
+
+        if ( !p2m_split_superpage(p2m, &split_pte, level, target, offsets) )
+        {
+            p2m_free_entry(p2m, split_pte, level);
+            rc = -ENOMEM;
+            goto out;
+        }
+
+        /*
+         * Follow the break-before-sequence to update the entry.
+         * For more details see (D4.7.1 in ARM DDI 0487A.j).
+         * XXX: Can we flush by address?
+         */
+        p2m_remove_pte(entry, p2m->clean_pte);
+        p2m_flush_tlb_sync(p2m);
+
+        p2m_write_pte(entry, split_pte, p2m->clean_pte);
+
+        /* then move to the level we want to make real changes */
+        for ( ; level < target; level++ )
+        {
+            rc = p2m_next_level(p2m, true, &table, offsets[level]);
+
+            /*
+             * The entry should be found and either be a table
+             * or a superpage if level 3 is not targeted
+             */
+            ASSERT(rc == GUEST_TABLE_NORMAL_PAGE ||
+                   (rc == GUEST_TABLE_SUPER_PAGE && target < 3));
+        }
+
+        entry = table + offsets[level];
+    }
+
+    /*
+     * We should always be there with the correct level because
+     * all the intermediate tables have been installed if necessary.
+     */
+    ASSERT(level == target);
+
+    orig_pte = *entry;
+
+    /*
+     * The radix-tree can only work on 4KB. This is only used when
+     * memaccess is enabled.
+     */
+    ASSERT(!p2m->mem_access_enabled || page_order == 0);
+    /*
+     * The access type should always be p2m_access_rwx when the mapping
+     * is removed.
+     */
+    ASSERT(!mfn_eq(INVALID_MFN, smfn) || (a == p2m_access_rwx));
+    /*
+     * Update the mem access permission before update the P2M. So we
+     * don't have to revert the mapping if it has failed.
+     */
+    rc = p2m_mem_access_radix_set(p2m, sgfn, a);
+    if ( rc )
+        goto out;
+
+    /*
+     * Always remove the entry in order to follow the break-before-make
+     * sequence when updating the translation table (D4.7.1 in ARM DDI
+     * 0487A.j).
+     */
+    if ( p2m_valid(orig_pte) )
+        p2m_remove_pte(entry, p2m->clean_pte);
+
+    if ( mfn_eq(smfn, INVALID_MFN) )
+        /* Flush can be deferred if the entry is removed */
+        p2m->need_flush |= !!p2m_valid(orig_pte);
+    else
+    {
+        lpae_t pte;
+
+        /*
+         * Flush the TLB before write the new one to keep coherency.
+         * XXX: Can we flush by address?
+         */
+        if ( p2m_valid(orig_pte) )
+            p2m_flush_tlb_sync(p2m);
+
+        pte = mfn_to_p2m_entry(smfn, t, a);
+        if ( level < 3 )
+            pte.p2m.table = 0; /* Superpage entry */
+
+        p2m_write_pte(entry, pte, p2m->clean_pte);
+
+        p2m->max_mapped_gfn = gfn_max(p2m->max_mapped_gfn,
+                                      gfn_add(sgfn, 1 << page_order));
+        p2m->lowest_mapped_gfn = gfn_min(p2m->lowest_mapped_gfn, sgfn);
+    }
+
+    /*
+     * Free the entry only if the original pte was valid and the base
+     * is different (to avoid freeing when permission is changed).
+     */
+    if ( p2m_valid(orig_pte) && entry->p2m.base != orig_pte.p2m.base )
+        p2m_free_entry(p2m, orig_pte, level);
+
+    /* XXX: Flush iommu */
+
+    rc = 0;
+
+out:
+    unmap_domain_page(table);
+
+    return rc;
+}
+
+static int p2m_set_entry(struct p2m_domain *p2m,
+                         gfn_t sgfn,
+                         unsigned long todo,
+                         mfn_t smfn,
+                         p2m_type_t t,
+                         p2m_access_t a)
+{
+    int rc = 0;
+
+    while ( todo )
+    {
+        unsigned long mask = gfn_x(sgfn) | mfn_x(smfn) | todo;
+        unsigned long order;
+
+        /* Always map 4k by 4k when memaccess is enabled */
+        if ( unlikely(p2m->mem_access_enabled) )
+            order = THIRD_ORDER;
+        if ( !(mask & ((1UL << FIRST_ORDER) - 1)) )
+            order = FIRST_ORDER;
+        else if ( !(mask & ((1UL << SECOND_ORDER) - 1)) )
+            order = SECOND_ORDER;
+        else
+            order = THIRD_ORDER;
+
+        rc = __p2m_set_entry(p2m, sgfn, order, smfn, t, a);
+        if ( rc )
+            break;
+
+        sgfn = gfn_add(sgfn, (1 << order));
+        if ( !mfn_eq(smfn, INVALID_MFN) )
+           smfn = mfn_add(smfn, (1 << order));
+
+        todo -= (1 << order);
+    }
+
+    return rc;
+}
+#endif
+
 /*
  * Returns true if start_gpaddr..end_gpaddr contains at least one
  * suitably aligned level_size mappping of maddr.
-- 
1.9.1
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
next prev parent reply	other threads:[~2016-07-28 14:51 UTC|newest]
Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-28 14:51 [RFC 00/22] xen/arm: Rework the P2M code to follow break-before-make sequence Julien Grall
2016-07-28 14:51 ` [RFC 01/22] xen/arm: do_trap_instr_abort_guest: Move the IPA computation out of the switch Julien Grall
2016-08-16  0:21   ` Stefano Stabellini
2016-08-16 16:20     ` Julien Grall
2016-08-31 10:01       ` Julien Grall
2016-08-31 19:43       ` Stefano Stabellini
2016-09-06 14:54         ` Julien Grall
2016-07-28 14:51 ` [RFC 02/22] xen/arm: p2m: Store in p2m_domain whether we need to clean the entry Julien Grall
2016-08-16  0:35   ` Stefano Stabellini
2016-08-31 10:29     ` Julien Grall
2016-07-28 14:51 ` [RFC 03/22] xen/arm: p2m: Rename parameter in p2m_{remove, write}_pte Julien Grall
2016-08-16  0:36   ` Stefano Stabellini
2016-07-28 14:51 ` [RFC 04/22] xen/arm: p2m: Use typesafe gfn in p2m_mem_access_radix_set Julien Grall
2016-08-16  0:39   ` Stefano Stabellini
2016-07-28 14:51 ` [RFC 05/22] xen/arm: traps: Move MMIO emulation code in a separate helper Julien Grall
2016-08-16  0:49   ` Stefano Stabellini
2016-08-31 10:36     ` Julien Grall
2016-07-28 14:51 ` [RFC 06/22] xen/arm: traps: Check the P2M before injecting a data/instruction abort Julien Grall
2016-08-23  1:05   ` Stefano Stabellini
2016-08-31 10:58     ` Julien Grall
2016-07-28 14:51 ` [RFC 07/22] xen/arm: p2m: Rework p2m_put_l3_page Julien Grall
2016-08-23  1:10   ` Stefano Stabellini
2016-07-28 14:51 ` [RFC 08/22] xen/arm: p2m: Invalidate the TLBs when write unlocking the p2m Julien Grall
2016-08-23  1:18   ` Stefano Stabellini
2016-07-28 14:51 ` [RFC 09/22] xen/arm: p2m: Change the type of level_shifts from paddr_t to unsigned int Julien Grall
2016-08-23  1:20   ` Stefano Stabellini
2016-08-31 11:04     ` Julien Grall
2016-07-28 14:51 ` [RFC 10/22] xen/arm: p2m: Move the lookup helpers at the top of the file Julien Grall
2016-07-28 14:51 ` [RFC 11/22] xen/arm: p2m: Introduce p2m_get_root_pointer and use it in __p2m_lookup Julien Grall
2016-08-23  1:34   ` Stefano Stabellini
2016-07-28 14:51 ` [RFC 12/22] xen/arm: p2m: Introduce p2m_get_entry and use it to implement __p2m_lookup Julien Grall
2016-07-30 18:37   ` Tamas K Lengyel
2016-08-31  0:30   ` [RFC 12/22] xen/arm: p2m: Introduce p2m_get_entry and use it to implement __p2m_lookupo Stefano Stabellini
2016-08-31 12:25     ` Julien Grall
2016-08-31 19:33       ` Stefano Stabellini
2016-09-01 11:37         ` Julien Grall
2016-07-28 14:51 ` [RFC 13/22] xen/arm: p2m: Replace all usage of __p2m_lookup with p2m_get_entry Julien Grall
2016-07-28 17:29   ` Tamas K Lengyel
2016-07-28 17:36     ` Tamas K Lengyel
2016-07-29 15:06       ` Julien Grall
2016-07-29 22:36         ` Tamas K Lengyel
2016-07-28 17:51     ` Julien Grall
2016-09-05 20:45   ` Stefano Stabellini
2016-07-28 14:51 ` [RFC 14/22] xen/arm: p2m: Re-implement p2m_cache_flush using p2m_get_entry Julien Grall
2016-09-05 21:13   ` Stefano Stabellini
2016-09-06 14:56     ` Julien Grall
2016-07-28 14:51 ` [RFC 15/22] xen/arm: p2m: Re-implement relinquish_p2m_mapping " Julien Grall
2016-09-05 21:58   ` Stefano Stabellini
2016-09-06 15:05     ` Julien Grall
2016-09-06 18:21       ` Stefano Stabellini
2016-09-07  7:37         ` Julien Grall
2016-07-28 14:51 ` [RFC 16/22] xen/arm: p2m: Make p2m_{valid, table, mapping} helpers inline Julien Grall
2016-09-05 22:00   ` Stefano Stabellini
2016-07-28 14:51 ` [RFC 17/22] xen/arm: p2m: Introduce a helper to check if an entry is a superpage Julien Grall
2016-09-05 22:03   ` Stefano Stabellini
2016-07-28 14:51 ` Julien Grall [this message]
2016-07-30 18:40   ` [RFC 18/22] xen/arm: p2m: Introduce p2m_set_entry and __p2m_set_entry Tamas K Lengyel
2016-08-15 10:22   ` Sergej Proskurin
2016-09-06  1:08   ` Stefano Stabellini
2016-09-06 17:12     ` Julien Grall
2016-09-06 18:51       ` Stefano Stabellini
2016-09-07  8:18         ` Julien Grall
2016-09-09 23:14           ` Stefano Stabellini
2016-07-28 14:51 ` [RFC 19/22] xen/arm: p2m: Re-implement p2m_remove_using using p2m_set_entry Julien Grall
2016-09-06 18:53   ` Stefano Stabellini
2016-07-28 14:51 ` [RFC 20/22] xen/arm: p2m: Re-implement p2m_insert_mapping " Julien Grall
2016-09-06 18:57   ` Stefano Stabellini
2016-09-15 10:38     ` Julien Grall
2016-07-28 14:51 ` [RFC 21/22] xen/arm: p2m: Re-implement p2m_set_mem_access using p2m_{set, get}_entry Julien Grall
2016-07-28 15:04   ` Razvan Cojocaru
2016-07-28 15:16     ` Julien Grall
2016-08-01 15:40     ` Julien Grall
2016-08-01 15:59       ` Tamas K Lengyel
2016-08-01 16:15         ` Julien Grall
2016-08-01 16:27           ` Tamas K Lengyel
2016-08-01 16:33             ` Julien Grall
2016-08-01 16:41               ` Tamas K Lengyel
2016-08-02  6:07             ` Razvan Cojocaru
2016-08-01 16:34       ` Mark Rutland
2016-08-01 16:57         ` Julien Grall
2016-08-01 17:26           ` Mark Rutland
2016-08-01 18:22             ` Mark Rutland
2016-08-02  9:58               ` Julien Grall
2016-08-02 10:26                 ` Mark Rutland
2016-07-28 17:21   ` Tamas K Lengyel
2016-09-06 19:06   ` Stefano Stabellini
2016-09-06 19:16     ` Razvan Cojocaru
2016-09-06 19:17       ` Stefano Stabellini
2016-09-07  6:56       ` Julien Grall
2016-09-07  7:03         ` Razvan Cojocaru
2016-07-28 14:51 ` [RFC 22/22] xen/arm: p2m: Do not handle shattering in p2m_create_table Julien Grall
2016-09-06 18:59   ` Stefano Stabellini
2016-07-28 17:46 ` [RFC 00/22] xen/arm: Rework the P2M code to follow break-before-make sequence Tamas K Lengyel
2016-07-29 16:23   ` Julien Grall
2016-07-29 19:05     ` Julien Grall
2016-08-15 10:24 ` Julien Grall
2016-08-15 15:06 ` Edgar E. Iglesias
2016-08-17  2:28   ` Shanker Donthineni
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox
  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):
  git send-email \
    --in-reply-to=1469717505-8026-19-git-send-email-julien.grall@arm.com \
    --to=julien.grall@arm.com \
    --cc=proskurin@sec.in.tum.de \
    --cc=sstabellini@kernel.org \
    --cc=steve.capper@arm.com \
    --cc=wei.chen@linaro.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY
  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
  Be sure your reply has a Subject: header at the top and a blank line
  before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).