From: Oleksii Kurochko <oleksii.kurochko@gmail.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Alistair Francis" <alistair.francis@wdc.com>,
	"Bob Eshleman" <bobbyeshleman@gmail.com>,
	"Connor Davis" <connojdavis@gmail.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Anthony PERARD" <anthony.perard@vates.tech>,
	"Michal Orzel" <michal.orzel@amd.com>,
	"Julien Grall" <julien@xen.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH v2 11/17] xen/riscv: implement p2m_set_entry() and __p2m_set_entry()
Date: Tue, 8 Jul 2025 17:42:59 +0200
Message-ID: <dd73f1ed-e2cb-40c5-ad06-bd390467e128@gmail.com>
In-Reply-To: <61773bca-32f7-42ee-867f-80dc8a389bef@suse.com>

On 7/8/25 2:45 PM, Jan Beulich wrote:
> On 08.07.2025 12:37, Oleksii Kurochko wrote:
>> On 7/8/25 11:01 AM, Oleksii Kurochko wrote:
>>>
>>> On 7/8/25 9:10 AM, Jan Beulich wrote:
>>>> On 07.07.2025 18:10, Oleksii Kurochko wrote:
>>>>> On 7/7/25 5:15 PM, Jan Beulich wrote:
>>>>>> On 07.07.2025 17:00, Oleksii Kurochko wrote:
>>>>>>> On 7/7/25 2:53 PM, Jan Beulich wrote:
>>>>>>>> On 07.07.2025 13:46, Oleksii Kurochko wrote:
>>>>>>>>> On 7/7/25 9:20 AM, Jan Beulich wrote:
>>>>>>>>>> On 04.07.2025 17:01, Oleksii Kurochko wrote:
>>>>>>>>>>> On 7/1/25 3:49 PM, Jan Beulich wrote:
>>>>>>>>>>>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +    panic("%s: isn't implemented for now\n", __func__);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +    return false;
>>>>>>>>>>>>> +}
>>>>>>>>>>>> For this function in particular, though: Besides the "p2me" in the name
>>>>>>>>>>>> being somewhat odd (supposedly page table entries here are simply pte_t),
>>>>>>>>>>>> how is this going to be different from pte_is_valid()?
>>>>>>>>>>> pte_is_valid() is checking a real bit of PTE, but p2me_is_valid() is checking
>>>>>>>>>>> what is a type stored in the radix tree (p2m->p2m_types):
>>>>>>>>>>>         /*
>>>>>>>>>>>          * In the case of the P2M, the valid bit is used for other purpose. Use
>>>>>>>>>>>          * the type to check whether an entry is valid.
>>>>>>>>>>>          */
>>>>>>>>>>>         static inline bool p2me_is_valid(struct p2m_domain *p2m, pte_t pte)
>>>>>>>>>>>         {
>>>>>>>>>>>             return p2m_type_radix_get(p2m, pte) != p2m_invalid;
>>>>>>>>>>>         }
>>>>>>>>>>>
>>>>>>>>>>> It is done to track which page was modified by a guest.
>>>>>>>>>> But then (again) the name doesn't convey what the function does.
>>>>>>>>> Then probably p2me_type_is_valid(struct p2m_domain *p2m, pte_t pte) would better.
>>>>>>>> For P2M type checks please don't invent new naming, but use what both x86
>>>>>>>> and Arm are already using. Note how we already have p2m_is_valid() in that
>>>>>>>> set. Just that it's not doing what you want here.
>>>>>>> Hm, why not doing what I want? p2m_is_valid() verifies if P2M entry is valid.
>>>>>>> And in here it is checked if P2M pte is valid from P2M point of view by checking
>>>>>>> the type in radix tree and/or in reserved PTEs bits (just to remind we have only 2
>>>>>>> free bits for type).
>>>>>> Because this is how it's defined on x86:
>>>>>>
>>>>>> #define p2m_is_valid(_t)    (p2m_to_mask(_t) & \
>>>>>>                                 (P2M_RAM_TYPES | p2m_to_mask(p2m_mmio_direct)))
>>>>>>
>>>>>> I.e. more strict that simply "!= p2m_invalid". And I think such predicates
>>>>>> would better be uniform across architectures, such that in principle they
>>>>>> might also be usable in common code (as we already do with p2m_is_foreign()).
>>>>> Yeah, Arm isn't so strict in definition of p2m_is_valid() and it seems like
>>>>> x86 and Arm have different understanding what is valid.
>>>>>
>>>>> Except what mentioned in the comment that grant types aren't considered valid
>>>>> for x86 (and shouldn't be the same then for Arm?), it isn't clear why x86's
>>>>> p2m_is_valid() is stricter then Arm's one and if other arches should be also
>>>>> so strict.
>>>> Arm's p2m_is_valid() is entirely different (and imo misnamed, but arguably one
>>>> could also consider x86'es to require a better name). It's a local helper, not
>>>> a P2M type checking predicate. With that in mind, you may of course follow
>>>> Arm's model, but in the longer run we may need to do something about the name
>>>> collision then.
>>>>
>>>>>>> The only use case I can think of is that the caller
>>>>>>> might try to map the remaining GFNs again. But that doesn’t seem very useful,
>>>>>>> if p2m_set_entry() wasn’t able to map the full range, it likely indicates a serious
>>>>>>> issue, and retrying would probably result in the same error.
>>>>>>>
>>>>>>> The same applies to rolling back the state. It wouldn’t be difficult to add a local
>>>>>>> array to track all modified PTEs and then use it to revert the state if needed.
>>>>>>> But again, what would the caller do after the rollback? At this point, it still seems
>>>>>>> like the best option is simply to panic().
>>>>>>>
>>>>>>> Basically, I don’t see or understand the cases where knowing how many GFNs were
>>>>>>> successfully mapped, or whether a rollback was performed, would really help — because
>>>>>>> in most cases, I don’t have a better option than just calling panic() at the end.
>>>>>> panic()-ing is of course only a last resort. Anything related to domain handling
>>>>>> would better crash only the domain in question. And even that only if suitable
>>>>>> error handling isn't possible.
>>>>> And if there is no still any runnable domain available, for example, we are creating
>>>>> domain and some p2m mapping is called? Will it be enough just ignore to boot this domain?
>>>>> If yes, then it is enough to return only error code without returning how many GFNs were
>>>>> mapped or rollbacking as domain won't be ran anyway.
>>>> During domain creation all you need to do is return an error. But when you write a
>>>> generic function that's also (going to be) used at domain runtime, you need to
>>>> consider what to do there in case of partial success.
>>>>
>>>>>>> For example, if I call map_regions_p2mt() for an MMIO region described in a device
>>>>>>> tree node, and the mapping fails partway through, I’m left with two options: either
>>>>>>> ignore the device (if it's not essential for Xen or guest functionality) and continue
>>>>>>> booting; in which case I’d need to perform a rollback, and simply knowing the number
>>>>>>> of successfully mapped GFNs may not be enough or, more likely, just panic.
>>>>>> Well, no. For example, before even trying to map you could check that the range
>>>>>> of P2M entries covered is all empty.
>>>>> Could it be that they aren't all empty? Then it seems like we have overlapping and we can't
>>>>> just do a mapping, right?
>>>> Possibly that would simply mean to return an error, yes.
>>>>
>>>>> Won't be this procedure consume a lot of time as it is needed to go through each page
>>>>> tables for each entry.
>>>> Well, you're free to suggest a clean alternative without doing so.
>>> I thought about dynamically allocating an array in p2m_set_entry(), where to save all changed PTEs,
>>> and then use it to roll back if __p2m_set_entry() returns rc != 0 ...
> That's another possible source for failure, and such an allocation may end
> up being a rather big one.
>
>>>>>>     _Then_ you know how to correctly roll back.
>>>>>> And yes, doing so may not even require passing back information on how much of
>>>>>> a region was successfully mapped.
>>>>> If P2M entries were empty before start of the mapping then it is enough to just go
>>>>> through the same range (sgfn,nr,smfn) and just clean them, right?
>>>> Yes, what else would "roll back" mean in that case?
>>> ... If we know that the P2M entries were empty, then there's nothing else to be done, just
>>> clean PTE is needed to be done.
>>> However, if the P2M entries weren’t empty (and I’m still not sure whether that’s a legal
>>> case), then rolling back would mean restoring their original state, the state they
>>> had before the P2M mapping procedure started.
>> Possible roll back is harder to implement as expected because there is a case where subtree
>> could be freed:
>>       /*
>>        * Free the entry only if the original pte was valid and the base
>>        * is different (to avoid freeing when permission is changed).
>>        */
>>       if ( p2me_is_valid(p2m, orig_pte) &&
>>            !mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte)) )
>>           p2m_free_subtree(p2m, orig_pte, level);
>> In this case then it will be needed to store the full subtree.
> Right, which is why it may be desirable to limit the ability to update multiple
> entries in one go. Or work from certain assumptions, violation of which would
> cause the domain to be crashed.

It seems to me that the main issue with updating multiple entries in one go is the rollback
mechanism in case of a partial mapping failure. (Are there other issues? For example, the
mapping could take a long time, so something might have to wait until it finishes.) In my
opinion, the rollback mechanism is quite complex to implement and could become a source of
further failures. For example, most of the cases where p2m_set_entry() could fail are due to
a failure to map a page table (to allow Xen to walk through it) or a failure to create a new
page table because of memory exhaustion. Then, during rollback, which might also require
memory allocation, we could hit the same memory shortage.
And what should be done in that case?
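
For comparison, the check-first idea discussed earlier in the thread avoids allocating
during rollback: if the whole range is known to be empty before mapping starts, rolling
back only means clearing the entries written so far. The sketch below is a toy model under
stated assumptions, not the Xen code: the flat array demo_p2m, the DEMO_INVALID marker, the
demo_fail_at fault-injection knob and all demo_* functions are hypothetical names. A real
page-table P2M would still have to walk (and possibly free) intermediate tables, which
this model ignores.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_P2M_SIZE 16
#define DEMO_INVALID  0UL   /* assumption: 0 marks an empty entry */

/* Toy model: a flat array stands in for the P2M page tables. */
unsigned long demo_p2m[DEMO_P2M_SIZE];

/* Hypothetical fault injection: writes at or beyond this index fail. */
size_t demo_fail_at = DEMO_P2M_SIZE;

/* True only if [sgfn, sgfn + nr) is in bounds and every entry is empty. */
bool demo_range_is_empty(size_t sgfn, size_t nr)
{
    size_t i;

    if ( sgfn >= DEMO_P2M_SIZE || nr > DEMO_P2M_SIZE - sgfn )
        return false;

    for ( i = 0; i < nr; i++ )
        if ( demo_p2m[sgfn + i] != DEMO_INVALID )
            return false;

    return true;
}

/* Map nr entries; on failure leave the range exactly as it was (empty). */
int demo_map_range(size_t sgfn, size_t nr, unsigned long smfn)
{
    size_t i;

    /* Refuse overlapping (or out-of-bounds) requests up front. */
    if ( !demo_range_is_empty(sgfn, nr) )
        return -EBUSY;

    for ( i = 0; i < nr; i++ )
    {
        if ( sgfn + i >= demo_fail_at )
        {
            /* Everything was empty before, so clearing restores the state. */
            while ( i-- > 0 )
                demo_p2m[sgfn + i] = DEMO_INVALID;
            return -ENOMEM;
        }
        demo_p2m[sgfn + i] = smfn + i;
    }

    return 0;
}
```

The cost is the extra pass over the range before mapping, which is the overhead questioned
above.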

In my opinion, the best option is to simply have p2m_set_entry() return the number of
successfully mapped GFNs (as its rc) and let the caller decide how to handle the partial
mapping:
1. If a partial mapping occurs during domain creation, we could just report that this
    domain can't be created and continue without it if there are other domains to start;
    otherwise, panic.
2. If a partial mapping occurs during the lifetime of a domain, for example, if the domain
    requests to map some memory, we return the number of successfully mapped GFNs and let the
    domain decide what to do: either remove the mappings or retry mapping the remaining part.
    However, I think there's not much value in retrying, since p2m_set_entry() is likely to
    fail again. So, perhaps the best course of action is to stop the domain altogether.
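
The return convention and the two caller policies above could look roughly like the sketch
below. It is a toy model, not the real implementation: toy_p2m, toy_budget, toy_set_one(),
toy_set_entry(), toy_handle_result() and the enum are hypothetical names, and the exact
encoding (0 for full success, a negative errno value when nothing was mapped, a positive
count for a partial mapping) is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: a flat array stands in for the P2M; 0 means "not mapped". */
#define TOY_P2M_SIZE 16
unsigned long toy_p2m[TOY_P2M_SIZE];

/* Hypothetical fault injection: how many entries can still be written. */
size_t toy_budget = TOY_P2M_SIZE;

/* Hypothetical single-entry helper; -ENOMEM models page-table exhaustion. */
int toy_set_one(size_t gfn, unsigned long mfn)
{
    if ( gfn >= TOY_P2M_SIZE )
        return -EINVAL;
    if ( toy_budget == 0 )
        return -ENOMEM;

    toy_budget--;
    toy_p2m[gfn] = mfn;

    return 0;
}

/*
 * Assumed convention: 0 if all nr entries were mapped, a negative errno
 * value if nothing was mapped, otherwise the number of entries (< nr)
 * mapped before the failure.
 */
long toy_set_entry(size_t sgfn, size_t nr, unsigned long smfn)
{
    size_t i;

    assert(nr > 0);

    for ( i = 0; i < nr; i++ )
    {
        int rc = toy_set_one(sgfn + i, smfn + i);

        if ( rc )
            return i ? (long)i : rc;
    }

    return 0;
}

enum toy_action { TOY_OK, TOY_FAIL_CREATION, TOY_CRASH_DOMAIN };

/* Caller policy from the two cases above: no retry, no rollback. */
enum toy_action toy_handle_result(long rc, bool creating)
{
    if ( rc == 0 )
        return TOY_OK;

    /* Partial mapping (rc > 0) and plain failure (rc < 0) are treated alike. */
    return creating ? TOY_FAIL_CREATION : TOY_CRASH_DOMAIN;
}
```

With this shape the caller never needs the list of modified PTEs; it only needs to know
whether the request completed.
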
Does that make sense?

~ Oleksii
