From: Oleksii Kurochko <oleksii.kurochko@gmail.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Alistair Francis" <alistair.francis@wdc.com>,
	"Bob Eshleman" <bobbyeshleman@gmail.com>,
	"Connor Davis" <connojdavis@gmail.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Anthony PERARD" <anthony.perard@vates.tech>,
	"Michal Orzel" <michal.orzel@amd.com>,
	"Julien Grall" <julien@xen.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH v4 18/18] xen/riscv: introduce metadata table to store P2M type
Date: Wed, 1 Oct 2025 18:00:07 +0200	[thread overview]
Message-ID: <5142b7c4-ab2e-4f73-a60d-3d23fe255ca7@gmail.com> (raw)
In-Reply-To: <4c2eb99b-3e88-4364-8c3f-7c70d4064ef4@suse.com>

On 9/23/25 12:41 AM, Jan Beulich wrote:
> On 17.09.2025 23:55, Oleksii Kurochko wrote:
>
>> +/*
>> + * `pte` – PTE entry for which the type `t` will be stored.
>> + *
>> + * If `t` is `p2m_ext_storage`, both `ctx` and `p2m` must be provided;
>> + * otherwise, they may be NULL.
>> + */
>> +static void p2m_set_type(pte_t *pte, const p2m_type_t t,
>> +                         struct p2m_pte_ctx *ctx,
>> +                         struct p2m_domain *p2m)
>>   {
>> -    int rc = 0;
>> +    /*
>> +    * For the root page table (16 KB in size), we need to select the correct
>> +    * metadata table, since allocations are 4 KB each. In total, there are
>> +    * 4 tables of 4 KB each.
>> +    * For none-root page table index of ->pt_page[] will be always 0 as
>> +    * index won't be higher then 511. ASSERT() below verifies that.
>> +    */
>> +    struct page_info **md_pg =
>> +        &ctx->pt_page[ctx->index / PAGETABLE_ENTRIES].v.md.metadata;
>> +    pte_t *metadata = NULL;
>> +
>> +     /* Be sure that an index correspondent to page level is passed. */
>> +    ASSERT(ctx->index <= P2M_PAGETABLE_ENTRIES(ctx->level));
> Doesn't this need to be < ?

Yeah, it should be <.
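I.e. the check becomes:

    ASSERT(ctx->index < P2M_PAGETABLE_ENTRIES(ctx->level));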

>
>> +    if ( !*md_pg && (t >= p2m_first_external) )
>> +    {
>> +        /*
>> +         * Ensure that when `t` is stored outside the PTE bits
>> +         * (i.e. `t == p2m_ext_storage` or higher),
>> +         * both `ctx` and `p2m` are provided.
>> +         */
>> +        ASSERT(p2m && ctx);
> Imo this would want to be checked whenever t > p2m_first_external, no
> matter whether a metadata page was already allocated.

I think that ctx should be checked before this if() condition, since it is
used to obtain the proper metadata page.

The check for p2m can remain inside the if() condition, as it is essentially
only needed for allocating a metadata page.
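
Roughly what I have in mind (only a sketch, exact placement may still change):

    /* ctx is needed to locate the metadata page, so check it up front. */
    ASSERT(ctx);

    struct page_info **md_pg =
        &ctx->pt_page[ctx->index / PAGETABLE_ENTRIES].v.md.metadata;
    ...
    if ( !*md_pg && (t >= p2m_first_external) )
    {
        /* p2m is only needed when a metadata page has to be allocated. */
        ASSERT(p2m);
        ...
    }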

>
>> -    if ( t > p2m_first_external )
>> -        panic("unimplemeted\n");
>> -    else
>> +        if ( ctx->level <= P2M_SUPPORTED_LEVEL_MAPPING )
>> +        {
>> +            struct domain *d = p2m->domain;
>> +
>> +            *md_pg = p2m_alloc_table(p2m);
>> +            if ( !*md_pg )
>> +            {
>> +                printk("%s: can't allocate extra memory for dom%d\n",
>> +                        __func__, d->domain_id);
>> +                domain_crash(d);
>> +            }
>> +        }
>> +        else
>> +            /*
>> +             * It is not legal to set a type for an entry which shouldn't
>> +             * be mapped.
>> +             */
>> +            ASSERT_UNREACHABLE();
> Something not being legal doesn't mean it can't happen. Imo in this case
> BUG_ON() (in place of the if() above) would be better.
>
>> +    }
>> +
>> +    if ( *md_pg )
>> +        metadata = __map_domain_page(*md_pg);
>> +
>> +    if ( t < p2m_first_external )
>> +    {
>>           pte->pte |= MASK_INSR(t, P2M_TYPE_PTE_BITS_MASK);
>>   
>> -    return rc;
>> +        if ( metadata )
>> +            metadata[ctx->index].pte = p2m_invalid;
>> +    }
>> +    else
>> +    {
>> +        pte->pte |= MASK_INSR(p2m_ext_storage, P2M_TYPE_PTE_BITS_MASK);
>> +
>> +        metadata[ctx->index].pte = t;
> Afaict metadata can still be NULL when you get here.

It shouldn't be, because when this line is executed, the metadata page already
exists or was allocated at the start of p2m_set_type().

>
>> +    }
>> +
>> +    if ( metadata )
>> +        unmap_domain_page(metadata);
> According to the x86 implementation, passing NULL here ought to be fine,
> so no if() needed.

With the current RISC-V implementation (CONFIG_ARCH_MAP_DOMAIN_PAGE=n, so
unmap_domain_page() is a no-op), it is fine too.
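
So the if() can simply be dropped here:

    unmap_domain_page(metadata);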

>
>>   }
>>   
>> -static p2m_type_t p2m_get_type(const pte_t pte)
>> +/*
>> + * `pte` -> PTE entry that stores the PTE's type.
>> + *
>> + * If the PTE's type is `p2m_ext_storage`, `ctx` should be provided;
>> + * otherwise it could be NULL.
>> + */
>> +static p2m_type_t p2m_get_type(const pte_t pte, const struct p2m_pte_ctx *ctx)
>>   {
>>       p2m_type_t type = MASK_EXTR(pte.pte, P2M_TYPE_PTE_BITS_MASK);
>>   
>>       if ( type == p2m_ext_storage )
>> -        panic("unimplemented\n");
>> +    {
>> +        pte_t *md = __map_domain_page(ctx->pt_page->v.md.metadata);
> Pointer-to-const?
>
>> +        type = md[ctx->index].pte;
>> +        unmap_domain_page(ctx->pt_page->v.md.metadata);
> I'm pretty sure you want to pass md here, not the pointer you passed
> into __map_domain_page().

Oh, right. It should be `md`.

>
>> @@ -381,7 +465,10 @@ static void p2m_set_permission(pte_t *e, p2m_type_t t)
>>       }
>>   }
>>   
>> -static pte_t p2m_pte_from_mfn(mfn_t mfn, p2m_type_t t, bool is_table)
>> +static pte_t p2m_pte_from_mfn(const mfn_t mfn, const p2m_type_t t,
>> +                              struct p2m_pte_ctx *p2m_pte_ctx,
>> +                              const bool is_table,
> Do you really need both "is_table" and the context pointer? Couldn't
> the "is intermediate page table" case be identified by a NULL context
> and/or p2m pointer?

Good point. I will drop is_table.
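
Something along these lines then (only a sketch; I'm assuming the trailing p2m
parameter stays, and the "table" case would be derived from the context):

    static pte_t p2m_pte_from_mfn(mfn_t mfn, p2m_type_t t,
                                  struct p2m_pte_ctx *p2m_pte_ctx,
                                  struct p2m_domain *p2m)
    {
        /* A NULL context identifies an intermediate page-table entry. */
        bool is_table = !p2m_pte_ctx;
        ...
    }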

>
> Also why "const" all of the sudden?

Because none of these are modified in p2m_pte_from_mfn(). To keep the diff
clearer, I can revert these changes.

>
>> @@ -435,22 +527,47 @@ static struct page_info *p2m_alloc_page(struct p2m_domain *p2m)
>>       return pg;
>>   }
>>   
>> + * Link this metadata page to page table page's list field.
>> + */
>> +static struct page_info *p2m_alloc_table(struct p2m_domain *p2m)
>> +{
>> +    struct page_info *page_tbl = p2m_alloc_page(p2m);
>> +
>> +    if ( !page_tbl )
>> +        return NULL;
>> +
>> +    clear_and_clean_page(page_tbl, p2m->clean_dcache);
>> +
>> +    return page_tbl;
>> +}
> ... the function is needed in the first place.

On one hand, it may not seem strictly necessary, but on the
other hand, without it we would need to repeat the pattern of
allocating, clearing, and cleaning a page each time a page table
is allocated. At the moment, I prefer to keep it.
But considering your other comment below ...

>
>> +/*
>> + * Free page table's page and metadata page linked to page table's page.
>> + */
>> +static void p2m_free_table(struct p2m_domain *p2m, struct page_info *tbl_pg)
>> +{
>> +    ASSERT(tbl_pg->v.md.metadata);
> Why, when you no longer unconditionally alloc that page?

Agreed, there is no need for this ASSERT(), as "lazy allocation" is used for
metadata.

>>   static int p2m_create_table(struct p2m_domain *p2m, pte_t *entry)
>>   {
>> -    struct page_info *page;
>> +    struct page_info *page = p2m_alloc_table(p2m);
>>   
>>       ASSERT(!pte_is_valid(*entry));
>>   
>> -    page = p2m_alloc_page(p2m);
>> -    if ( page == NULL )
>> -        return -ENOMEM;
>> -
>> -    clear_and_clean_page(page, p2m->clean_dcache);
>> -
>>       p2m_write_pte(entry, page_to_p2m_table(page), p2m->clean_dcache);
>>   
>>       return 0;
> As per above I don't think any change is needed here.

There are some places in the code where it isn’t necessary to immediately
write the address of a newly allocated page table page into a PTE:
- During superpage splitting: a new page is first allocated for the new
   page table, then it is filled, and only afterwards is the PTE updated
   with the new page table address.
- In p2m_set_type(): when a table is allocated for storing metadata
   (although I think p2m_alloc_page() would work fine here as well),
   there is no need to update any PTE correspondingly.

...
So, I think I can agree that p2m_alloc_table() isn’t really needed.
It should be sufficient to move the clear_and_clean_page(page_tbl, p2m->clean_dcache)
call from p2m_alloc_table() into p2m_alloc_page(), and then just use
p2m_alloc_page() everywhere.

Does the last paragraph make sense?
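
If it helps, roughly what I mean (a sketch only; the existing allocation logic
in p2m_alloc_page() stays untouched):

    static struct page_info *p2m_alloc_page(struct p2m_domain *p2m)
    {
        struct page_info *pg;

        /* ... existing allocation from the P2M pool, unchanged ... */

        if ( pg )
            clear_and_clean_page(pg, p2m->clean_dcache);

        return pg;
    }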

>> @@ -707,6 +834,22 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, pte_t *entry,
>>           pte = *entry;
>>           pte_set_mfn(&pte, mfn_add(mfn, i << level_order));
>>   
>> +        if ( MASK_EXTR(pte.pte, P2M_TYPE_PTE_BITS_MASK) == p2m_ext_storage )
>> +        {
>> +            struct p2m_pte_ctx p2m_pte_ctx = {
>> +                .pt_page = tbl_pg,
>> +                .index = offsets[level],
>> +            };
> Assuming using "level" is correct here (which it looks like it is), ...
>
>> +            p2m_type_t old_type = p2m_get_type(pte, &p2m_pte_ctx);
> ... can't this move ahead of the loop?

Considering that old_type is expected to be the same for all new PTEs, I think
we can move that ahead of the loop. I'll do that.
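
I.e. read the type just once before the loop (sketch):

    /* The old type is the same for every PTE the superpage is split into. */
    struct p2m_pte_ctx p2m_pte_ctx = {
        .pt_page = tbl_pg,
        .index = offsets[level],
    };
    p2m_type_t old_type = p2m_get_type(*entry, &p2m_pte_ctx);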

>
>> +            p2m_pte_ctx.pt_page = page;
>> +            p2m_pte_ctx.index = i;
>> +            p2m_pte_ctx.level = level;
> Whereas - doesn't this need to be "next_level"?

Yes, it should be next_level.

>
>> @@ -718,7 +861,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, pte_t *entry,
>>        */
>>       if ( next_level != target )
>>           rv = p2m_split_superpage(p2m, table + offsets[next_level],
>> -                                 level - 1, target, offsets);
>> +                                 level - 1, target, offsets, page);
> And btw (alredy in the earlier patch introducing this code) - why isn't
> it "next_level" here, instead of "level - 1" (if already you have that
> variable)?

I missed updating that part. It should be next_level instead of level - 1.

>
>> @@ -812,13 +955,21 @@ static int p2m_set_entry(struct p2m_domain *p2m,
>>       {
>>           /* We need to split the original page. */
>>           pte_t split_pte = *entry;
>> +        struct page_info *tbl_pg = virt_to_page(table);
> This isn't valid on a pointer obtained from map_domain_page().

Oh, sure — virt_to_page() and page_to_virt() should be used only for Xen
heap addresses.

By the way, do we have any documentation, comments, or notes describing
what should be allocated and from where?

Since map_domain_page() returns an address from the direct map region,
should we instead use maddr_to_page(virt_to_maddr(table))?

Thanks for the review.

~ Oleksii
