xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Oleksii Kurochko <oleksii.kurochko@gmail.com>
To: xen-devel@lists.xenproject.org
Cc: "Oleksii Kurochko" <oleksii.kurochko@gmail.com>,
	"Alistair Francis" <alistair.francis@wdc.com>,
	"Bob Eshleman" <bobbyeshleman@gmail.com>,
	"Connor Davis" <connojdavis@gmail.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Anthony PERARD" <anthony.perard@vates.tech>,
	"Michal Orzel" <michal.orzel@amd.com>,
	"Jan Beulich" <jbeulich@suse.com>,
	"Julien Grall" <julien@xen.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>
Subject: [PATCH v4 14/18] xen/riscv: Implement superpage splitting for p2m mappings
Date: Wed, 17 Sep 2025 23:55:34 +0200	[thread overview]
Message-ID: <ff4df3d98d5acdffee3b1c1b0c345c25ea44264c.1758145428.git.oleksii.kurochko@gmail.com> (raw)
In-Reply-To: <cover.1758145428.git.oleksii.kurochko@gmail.com>

Add support for down large memory mappings ("superpages") in the RISC-V
p2m mapping so that smaller, more precise mappings ("finer-grained entries")
can be inserted into lower levels of the page table hierarchy.

To implement that the following is done:
- Introduce p2m_split_superpage(): Recursively shatters a superpage into
  smaller page table entries down to the target level, preserving original
  permissions and attributes.
- p2m_set_entry() updated to invoke superpage splitting when inserting
  entries at lower levels within a superpage-mapped region.

This implementation is based on the ARM code, with modifications to the part
that follows the BBM (break-before-make) approach, some parts are simplified
as according to RISC-V spec:
  It is permitted for multiple address-translation cache entries to co-exist
  for the same address. This represents the fact that in a conventional
  TLB hierarchy, it is possible for multiple entries to match a single
  address if, for example, a page is upgraded to a superpage without first
  clearing the original non-leaf PTE’s valid bit and executing an SFENCE.VMA
  with rs1=x0, or if multiple TLBs exist in parallel at a given level of the
  hierarchy. In this case, just as if an SFENCE.VMA is not executed between
  a write to the memory-management tables and subsequent implicit read of the
  same address: it is unpredictable whether the old non-leaf PTE or the new
  leaf PTE is used, but the behavior is otherwise well defined.
In contrast to the Arm architecture, where BBM is mandatory and failing to
use it in some cases can lead to CPU instability, RISC-V guarantees
stability, and the behavior remains safe — though unpredictable in terms of
which translation will be used.

Additionally, the page table walk logic has been adjusted, as ARM uses the
opposite level numbering compared to RISC-V.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - s/number of levels/level numbering in the commit message.
 - s/permissions/attributes.
 - Remove redundant comment in p2m_split_superpage() about page
   splitting.
 - Use P2M_PAGETABLE_ENTRIES as XEN_PT_ENTRIES
   doesn't takeinto  into acount that G stage root page table is
   extended by 2 bits.
 - Use earlier introduced P2M_LEVEL_ORDER().
---
Changes in V3:
 - Move     page_list_add(page, &p2m->pages) inside p2m_alloc_page().
 - Use 'unsigned long' for local vairiable 'i' in p2m_split_superpage().
 - Update the comment above if ( next_level != target ) in p2m_split_superpage().
 - Reverse cycle to iterate through page table levels in p2m_set_entry().
 - Update p2m_split_superpage() with the same changes which are done in the
   patch "P2M: Don't try to free the existing PTE if we can't allocate a new table".
---
Changes in V2:
 - New patch. It was a part of a big patch "xen/riscv: implement p2m mapping
   functionality" which was splitted to smaller.
 - Update the commit above the cycle which creates new page table as
   RISC-V travserse page tables in an opposite to ARM order.
 - RISC-V doesn't require BBM so there is no needed for invalidating
   and TLB flushing before updating PTE.
---
 xen/arch/riscv/p2m.c | 114 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 113 insertions(+), 1 deletion(-)

diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
index bf4945e99f..1577b09b15 100644
--- a/xen/arch/riscv/p2m.c
+++ b/xen/arch/riscv/p2m.c
@@ -661,6 +661,87 @@ static void p2m_free_subtree(struct p2m_domain *p2m,
     p2m_free_page(p2m, pg);
 }
 
+static bool p2m_split_superpage(struct p2m_domain *p2m, pte_t *entry,
+                                unsigned int level, unsigned int target,
+                                const unsigned int *offsets)
+{
+    struct page_info *page;
+    unsigned long i;
+    pte_t pte, *table;
+    bool rv = true;
+
+    /* Convenience aliases */
+    mfn_t mfn = pte_get_mfn(*entry);
+    unsigned int next_level = level - 1;
+    unsigned int level_order = P2M_LEVEL_ORDER(next_level);
+
+    /*
+     * This should only be called with target != level and the entry is
+     * a superpage.
+     */
+    ASSERT(level > target);
+    ASSERT(pte_is_superpage(*entry, level));
+
+    page = p2m_alloc_page(p2m);
+    if ( !page )
+    {
+        /*
+         * The caller is in charge to free the sub-tree.
+         * As we didn't manage to allocate anything, just tell the
+         * caller there is nothing to free by invalidating the PTE.
+         */
+        memset(entry, 0, sizeof(*entry));
+        return false;
+    }
+
+    table = __map_domain_page(page);
+
+    for ( i = 0; i < P2M_PAGETABLE_ENTRIES(next_level); i++ )
+    {
+        pte_t *new_entry = table + i;
+
+        /*
+         * Use the content of the superpage entry and override
+         * the necessary fields. So the correct attributes are kept.
+         */
+        pte = *entry;
+        pte_set_mfn(&pte, mfn_add(mfn, i << level_order));
+
+        write_pte(new_entry, pte);
+    }
+
+    /*
+     * Shatter superpage in the page to the level we want to make the
+     * changes.
+     * This is done outside the loop to avoid checking the offset
+     * for every entry to know whether the entry should be shattered.
+     */
+    if ( next_level != target )
+        rv = p2m_split_superpage(p2m, table + offsets[next_level],
+                                 level - 1, target, offsets);
+
+    if ( p2m->clean_dcache )
+        clean_dcache_va_range(table, PAGE_SIZE);
+
+    /*
+     * TODO: an inefficiency here: the caller almost certainly wants to map
+     *       the same page again, to update the one entry that caused the
+     *       request to shatter the page.
+     */
+    unmap_domain_page(table);
+
+    /*
+     * Even if we failed, we should (according to the current implemetation
+     * of a way how sub-tree is freed if p2m_split_superpage hasn't been
+     * finished fully) install the newly allocated PTE
+     * entry.
+     * The caller will be in charge to free the sub-tree.
+     */
+    p2m_write_pte(entry, page_to_p2m_table(page), p2m->clean_dcache);
+
+    return rv;
+}
+
 /*
  * Insert an entry in the p2m. This should be called with a mapping
  * equal to a page/superpage.
@@ -729,7 +810,38 @@ static int p2m_set_entry(struct p2m_domain *p2m,
      */
     if ( level > target )
     {
-        panic("Shattering isn't implemented\n");
+        /* We need to split the original page. */
+        pte_t split_pte = *entry;
+
+        ASSERT(pte_is_superpage(*entry, level));
+
+        if ( !p2m_split_superpage(p2m, &split_pte, level, target, offsets) )
+        {
+            /* Free the allocated sub-tree */
+            p2m_free_subtree(p2m, split_pte, level);
+
+            rc = -ENOMEM;
+            goto out;
+        }
+
+        p2m_write_pte(entry, split_pte, p2m->clean_dcache);
+
+        p2m->need_flush = true;
+
+        /* Then move to the level we want to make real changes */
+        for ( ; level > target; level-- )
+        {
+            rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
+
+            /*
+             * The entry should be found and either be a table
+             * or a superpage if level 0 is not targeted
+             */
+            ASSERT(rc == P2M_TABLE_NORMAL ||
+                   (rc == P2M_TABLE_SUPER_PAGE && target > 0));
+        }
+
+        entry = table + offsets[level];
     }
 
     /*
-- 
2.51.0



  parent reply	other threads:[~2025-09-17 22:07 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-17 21:55 [PATCH v4 00/18 for 4.22] xen/riscv: introduce p2m functionality Oleksii Kurochko
2025-09-17 21:55 ` [PATCH v4 01/18] xen/riscv: detect and initialize G-stage mode Oleksii Kurochko
2025-09-18 15:54   ` Jan Beulich
2025-09-24 11:31     ` Oleksii Kurochko
2025-09-24 15:00       ` Oleksii Kurochko
2025-09-25 13:46         ` Jan Beulich
2025-09-26  7:30           ` Oleksii Kurochko
2025-09-17 21:55 ` [PATCH v4 02/18] xen/riscv: introduce VMID allocation and manegement Oleksii Kurochko
2025-09-19 21:26   ` Jan Beulich
2025-09-24 14:25     ` Oleksii Kurochko
2025-09-25 13:53       ` Jan Beulich
2025-09-17 21:55 ` [PATCH v4 03/18] xen/riscv: introduce things necessary for p2m initialization Oleksii Kurochko
2025-09-19 21:43   ` Jan Beulich
2025-09-17 21:55 ` [PATCH v4 04/18] xen/riscv: construct the P2M pages pool for guests Oleksii Kurochko
2025-09-17 21:55 ` [PATCH v4 05/18] xen/riscv: add root page table allocation Oleksii Kurochko
2025-09-19 22:14   ` Jan Beulich
2025-09-24 15:40     ` Oleksii Kurochko
2025-09-25 13:56       ` Jan Beulich
2025-09-17 21:55 ` [PATCH v4 06/18] xen/riscv: introduce pte_{set,get}_mfn() Oleksii Kurochko
2025-09-17 21:55 ` [PATCH v4 07/18] xen/riscv: add new p2m types and helper macros for type classification Oleksii Kurochko
2025-09-19 22:18   ` Jan Beulich
2025-09-17 21:55 ` [PATCH v4 08/18] xen/dom0less: abstract Arm-specific p2m type name for device MMIO mappings Oleksii Kurochko
2025-09-17 21:55 ` [PATCH v4 09/18] xen/riscv: implement function to map memory in guest p2m Oleksii Kurochko
2025-09-19 23:12   ` Jan Beulich
2025-09-17 21:55 ` [PATCH v4 10/18] xen/riscv: implement p2m_set_range() Oleksii Kurochko
2025-09-19 23:36   ` Jan Beulich
2025-09-25 20:08     ` Oleksii Kurochko
2025-09-26  7:07       ` Jan Beulich
2025-09-26  8:58         ` Oleksii Kurochko
2025-10-13 11:59           ` Oleksii Kurochko
2025-09-17 21:55 ` [PATCH v4 11/18] xen/riscv: Implement p2m_free_subtree() and related helpers Oleksii Kurochko
2025-09-19 23:57   ` Jan Beulich
2025-09-26 15:33     ` Oleksii Kurochko
2025-09-28 14:30       ` Jan Beulich
2025-09-17 21:55 ` [PATCH v4 12/18] xen/riscv: Implement p2m_pte_from_mfn() and support PBMT configuration Oleksii Kurochko
2025-09-22 16:28   ` Jan Beulich
2025-09-29 13:30     ` Oleksii Kurochko
2025-10-07 13:09       ` Jan Beulich
2025-10-09  9:21         ` Oleksii Kurochko
2025-10-09 12:06           ` Jan Beulich
2025-10-10  8:29             ` Oleksii Kurochko
2025-09-17 21:55 ` [PATCH v4 13/18] xen/riscv: implement p2m_next_level() Oleksii Kurochko
2025-09-22 17:35   ` Jan Beulich
2025-09-29 14:23     ` Oleksii Kurochko
2025-09-17 21:55 ` Oleksii Kurochko [this message]
2025-09-22 17:55   ` [PATCH v4 14/18] xen/riscv: Implement superpage splitting for p2m mappings Jan Beulich
2025-09-17 21:55 ` [PATCH v4 15/18] xen/riscv: implement put_page() Oleksii Kurochko
2025-09-22 19:54   ` Jan Beulich
2025-09-17 21:55 ` [PATCH v4 16/18] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers Oleksii Kurochko
2025-09-22 20:02   ` Jan Beulich
2025-09-17 21:55 ` [PATCH v4 17/18] xen/riscv: add support of page lookup by GFN Oleksii Kurochko
2025-09-22 20:46   ` Jan Beulich
2025-09-30 15:37     ` Oleksii Kurochko
2025-10-07 13:14       ` Jan Beulich
2025-09-17 21:55 ` [PATCH v4 18/18] xen/riscv: introduce metadata table to store P2M type Oleksii Kurochko
2025-09-22 22:41   ` Jan Beulich
2025-10-01 16:00     ` Oleksii Kurochko
2025-10-07 13:25       ` Jan Beulich
2025-10-09 11:34         ` Oleksii Kurochko
2025-10-09 12:10           ` Jan Beulich
2025-10-10  8:42             ` Oleksii Kurochko
2025-10-10 11:00               ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ff4df3d98d5acdffee3b1c1b0c345c25ea44264c.1758145428.git.oleksii.kurochko@gmail.com \
    --to=oleksii.kurochko@gmail.com \
    --cc=alistair.francis@wdc.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=anthony.perard@vates.tech \
    --cc=bobbyeshleman@gmail.com \
    --cc=connojdavis@gmail.com \
    --cc=jbeulich@suse.com \
    --cc=julien@xen.org \
    --cc=michal.orzel@amd.com \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).