* [PATCH for 4.20? v4 0/3] Fixes for vmap_to_mfn() and pt_mapping_level
@ 2025-02-12 16:50 Oleksii Kurochko
2025-02-12 16:50 ` [PATCH for 4.20? v4 1/3] xen/riscv: implement software page table walking Oleksii Kurochko
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Oleksii Kurochko @ 2025-02-12 16:50 UTC (permalink / raw)
To: xen-devel
Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
Julien Grall, Roger Pau Monné, Stefano Stabellini
Introduce pt_walk(), which does software page table walking to resolve the
following issues:
1. vmap_to_mfn() uses virt_to_maddr(), which is designed to work with VA
from either the direct map region or Xen's linkage region (XEN_VIRT_START),
thereby an assertion will occur if it is used with other regions, in
particular for the VMAP region. The solution is usage of pt_walk() for
vmap_to_mfn().
2. pt_mapping_level() returns incorrect page table level in the case when
mfn==INVALID_MFN when, for example, VA was mapped to PA using 4k mapping,
but during destroying/modification pt_mapping_level() could return incorrect
page table level as when mfn==INVALID_MFN then only VA is taking into account
for page table level calculation and so if VA is page table level 1 aligned
then page_mapping_level() will return level 1 ( instead of level 0 as VA was
mapped to PA using 4k mapping so there is incostinency here ).
The solution is to set level=CONFIG_PAGING_TABLE to tell pt_update() algo
that it should use pt_walk() to find proper page table entry instead of
using for searching of page table entry based on precalculated by
pt_mapping_level() `level` and `order` values.
It would be nice to have these fixes in Xen 4.20 but isn't really critical as
there is no any users for RISC-V port at this moment.
---
Changes in v4:
- please look at a specific patch.
---
Changes in v2-v3:
- update the commit message.
- other changes look in specific patch.
---
Oleksii Kurochko (3):
xen/riscv: implement software page table walking
xen/riscv: update defintion of vmap_to_mfn()
xen/riscv: update mfn calculation in pt_mapping_level()
xen/arch/riscv/include/asm/mm.h | 2 +-
xen/arch/riscv/include/asm/page.h | 9 ++
xen/arch/riscv/pt.c | 159 +++++++++++++++++++++++-------
3 files changed, 135 insertions(+), 35 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH for 4.20? v4 1/3] xen/riscv: implement software page table walking 2025-02-12 16:50 [PATCH for 4.20? v4 0/3] Fixes for vmap_to_mfn() and pt_mapping_level Oleksii Kurochko @ 2025-02-12 16:50 ` Oleksii Kurochko 2025-02-19 11:14 ` Jan Beulich 2025-02-12 16:50 ` [PATCH for 4.20? v4 2/3] xen/riscv: update defintion of vmap_to_mfn() Oleksii Kurochko 2025-02-12 16:50 ` [PATCH for 4.20? v4 3/3] xen/riscv: update mfn calculation in pt_mapping_level() Oleksii Kurochko 2 siblings, 1 reply; 11+ messages in thread From: Oleksii Kurochko @ 2025-02-12 16:50 UTC (permalink / raw) To: xen-devel Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné, Stefano Stabellini RISC-V doesn't have hardware feature to ask MMU to translate virtual address to physical address ( like Arm has, for example ), so software page table walking is implemented. Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in v4: - Update the comment message above _pt_walk(): add information that `pte_level` is optional and add a note that `table` should be unmapped by a caller. - Unmap `table` in pt_walk(). --- Changes in v3: - Remove circular dependency. - Move declaration of pt_walk() to asm/page.h. - Revert other not connected to pt_walk() changes. - Update the commit message. - Drop unnessary anymore local variables of pt_walk(). - Refactor pt_walk() to use for() loop instead of while() loop as it was suggested by Jan B. - Introduce _pt_walk() which returns pte_t * and update prototype of pt_walk() to return pte_t by value. --- Changes in v2: - Update the code of pt_walk() to return pte_t instead of paddr_t. - Fix typos and drop blankets inside parentheses in the comment. - Update the `ret` handling; there is no need for `mfn` calculation anymore as pt_walk() returns or pte_t of a leaf node or non-present pte_t. - Drop the comment before unmap_table(). - Drop local variable `pa` as pt_walk() is going to return pte_t instead of paddr_t. - Add the comment above pt_walk() to explain what it does and returns. - Add additional explanation about possible return values of pt_next_level() used inside while loop in pt_walk(). - Change `pa` to `table` in the comment before while loop in pt_walk() as actually this loop finds a page table where paga table entry for `va` is located. - After including <asm/page.h> in <asm/mm.h>, the following compilation error occurs: ./arch/riscv/include/asm/cmpxchg.h:56:9: error: implicit declaration of function 'GENMASK' To resolve this, <xen/bitops.h> needs to be included at the top of <asm/cmpxchg.h>. - To avoid an issue with the implicit declaration of map_domain_page() and unmap_domain_page() after including <asm/page.h> in <asm/mm.h>, the implementation of flush_page_to_ram() has been moved to mm.c. (look for more detailed explanation in the commit message) As a result drop inclusion of <xen/domain.h> in <asm/page.h>. - Update the commit message. --- xen/arch/riscv/include/asm/page.h | 2 + xen/arch/riscv/pt.c | 62 +++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+) diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h index 7a6174a109..0439a1a9ee 100644 --- a/xen/arch/riscv/include/asm/page.h +++ b/xen/arch/riscv/include/asm/page.h @@ -208,6 +208,8 @@ static inline pte_t pte_from_mfn(mfn_t mfn, unsigned int flags) return (pte_t){ .pte = pte }; } +pte_t pt_walk(vaddr_t va, unsigned int *pte_level); + #endif /* __ASSEMBLY__ */ #endif /* ASM__RISCV__PAGE_H */ diff --git a/xen/arch/riscv/pt.c b/xen/arch/riscv/pt.c index a703e0f1bd..260a3a9699 100644 --- a/xen/arch/riscv/pt.c +++ b/xen/arch/riscv/pt.c @@ -185,6 +185,68 @@ static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset) return XEN_TABLE_NORMAL; } +/* + * _pt_walk() performs software page table walking and returns the pte_t of + * a leaf node or the leaf-most not-present pte_t if no leaf node is found + * for further analysis. + * + * Additionally, _pt_walk() returns the level of the found pte by using + * `pte_level` argument. + * `pte_level` is optional, set `pte_level`=NULL if a caller doesn't need + * the level of the found pte. + * + * Note: unmapping of final `table` should be done by a caller. + */ +static pte_t *_pt_walk(vaddr_t va, unsigned int *pte_level) +{ + const mfn_t root = get_root_page(); + unsigned int level; + pte_t *table; + + DECLARE_OFFSETS(offsets, va); + + table = map_table(root); + + /* + * Find `table` of an entry which corresponds to `va` by iterating for each + * page level and checking if the entry points to a next page table or + * to a page. + * + * Two cases are possible: + * - ret == XEN_TABLE_SUPER_PAGE means that the entry was found; + * (Despite the name) XEN_TABLE_SUPER_PAGE also covers 4K mappings. If + * pt_next_level() is called for page table level 0, it results in the + * entry being a pointer to a leaf node, thereby returning + * XEN_TABLE_SUPER_PAGE, despite of the fact this leaf covers 4k mapping. + * - ret == XEN_TABLE_MAP_NONE means that requested `va` wasn't actually + * mapped. + */ + for ( level = HYP_PT_ROOT_LEVEL; ; --level ) + { + int ret = pt_next_level(false, &table, offsets[level]); + + if ( ret == XEN_TABLE_MAP_NONE || ret == XEN_TABLE_SUPER_PAGE ) + break; + + ASSERT(level); + } + + if ( pte_level ) + *pte_level = level; + + return table + offsets[level]; +} + +pte_t pt_walk(vaddr_t va, unsigned int *pte_level) +{ + pte_t *entry = _pt_walk(va, pte_level); + pte_t pte = *entry; + + unmap_table(entry); + + return pte; +} + /* Update an entry at the level @target. */ static int pt_update_entry(mfn_t root, vaddr_t virt, mfn_t mfn, unsigned int target, -- 2.48.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH for 4.20? v4 1/3] xen/riscv: implement software page table walking 2025-02-12 16:50 ` [PATCH for 4.20? v4 1/3] xen/riscv: implement software page table walking Oleksii Kurochko @ 2025-02-19 11:14 ` Jan Beulich 2025-02-19 14:33 ` Oleksii Kurochko 0 siblings, 1 reply; 11+ messages in thread From: Jan Beulich @ 2025-02-19 11:14 UTC (permalink / raw) To: Oleksii Kurochko Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel On 12.02.2025 17:50, Oleksii Kurochko wrote: > --- a/xen/arch/riscv/pt.c > +++ b/xen/arch/riscv/pt.c > @@ -185,6 +185,68 @@ static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset) > return XEN_TABLE_NORMAL; > } > > +/* > + * _pt_walk() performs software page table walking and returns the pte_t of > + * a leaf node or the leaf-most not-present pte_t if no leaf node is found > + * for further analysis. > + * > + * Additionally, _pt_walk() returns the level of the found pte by using > + * `pte_level` argument. > + * `pte_level` is optional, set `pte_level`=NULL if a caller doesn't need > + * the level of the found pte. How about this, reducing redundancy a little? * _pt_walk() can optionally return the level of the found pte. Pass NULL * for `pte_level` if this information isn't needed. > +pte_t pt_walk(vaddr_t va, unsigned int *pte_level) > +{ > + pte_t *entry = _pt_walk(va, pte_level); > + pte_t pte = *entry; > + > + unmap_table(entry); > + > + return pte; > +} "entry" especially in this context is ambiguous. I would expect a variable of this name to be of type pte_t, not pte_t *. How about "ptep"? Preferably with these adjustments, which I'd be fine making while committing, Reviewed-by: Jan Beulich <jbeulich@suse.com> Considering the 4.20? tag you'll need to decide whether you still want this in before the release. Jan ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH for 4.20? v4 1/3] xen/riscv: implement software page table walking 2025-02-19 11:14 ` Jan Beulich @ 2025-02-19 14:33 ` Oleksii Kurochko 0 siblings, 0 replies; 11+ messages in thread From: Oleksii Kurochko @ 2025-02-19 14:33 UTC (permalink / raw) To: Jan Beulich Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel [-- Attachment #1: Type: text/plain, Size: 1793 bytes --] On 2/19/25 12:14 PM, Jan Beulich wrote: > On 12.02.2025 17:50, Oleksii Kurochko wrote: >> --- a/xen/arch/riscv/pt.c >> +++ b/xen/arch/riscv/pt.c >> @@ -185,6 +185,68 @@ static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset) >> return XEN_TABLE_NORMAL; >> } >> >> +/* >> + * _pt_walk() performs software page table walking and returns the pte_t of >> + * a leaf node or the leaf-most not-present pte_t if no leaf node is found >> + * for further analysis. >> + * >> + * Additionally, _pt_walk() returns the level of the found pte by using >> + * `pte_level` argument. >> + * `pte_level` is optional, set `pte_level`=NULL if a caller doesn't need >> + * the level of the found pte. > How about this, reducing redundancy a little? > > * _pt_walk() can optionally return the level of the found pte. Pass NULL > * for `pte_level` if this information isn't needed. > >> +pte_t pt_walk(vaddr_t va, unsigned int *pte_level) >> +{ >> + pte_t *entry = _pt_walk(va, pte_level); >> + pte_t pte = *entry; >> + >> + unmap_table(entry); >> + >> + return pte; >> +} > "entry" especially in this context is ambiguous. I would expect a variable of > this name to be of type pte_t, not pte_t *. How about "ptep"? Agree with both your suggestions, it would be better to use `ptep instead of `entry` and rephrase the comment. > > Preferably with these adjustments, which I'd be fine making while committing, > Reviewed-by: Jan Beulich<jbeulich@suse.com> > > Considering the 4.20? tag you'll need to decide whether you still want this > in before the release. Considering that it is still needed a new version for patch3 of this patch series and that the mentioned issues aren't affected no one, lets consider the full patch series for 4.21. Thanks. ~ Oleksii [-- Attachment #2: Type: text/html, Size: 2622 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH for 4.20? v4 2/3] xen/riscv: update defintion of vmap_to_mfn() 2025-02-12 16:50 [PATCH for 4.20? v4 0/3] Fixes for vmap_to_mfn() and pt_mapping_level Oleksii Kurochko 2025-02-12 16:50 ` [PATCH for 4.20? v4 1/3] xen/riscv: implement software page table walking Oleksii Kurochko @ 2025-02-12 16:50 ` Oleksii Kurochko 2025-02-19 11:17 ` Jan Beulich 2025-02-12 16:50 ` [PATCH for 4.20? v4 3/3] xen/riscv: update mfn calculation in pt_mapping_level() Oleksii Kurochko 2 siblings, 1 reply; 11+ messages in thread From: Oleksii Kurochko @ 2025-02-12 16:50 UTC (permalink / raw) To: xen-devel Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné, Stefano Stabellini vmap_to_mfn() uses virt_to_maddr(), which is designed to work with VA from either the direct map region or Xen's linkage region (XEN_VIRT_START). An assertion will occur if it is used with other regions, in particular for the VMAP region. Since RISC-V lacks a hardware feature to request the MMU to translate a VA to a PA (as Arm does, for example), software page table walking (pt_walk()) is used for the VMAP region to obtain the mfn from pte_t. To avoid introduce a circular dependency between asm/mm.h and asm/page.h by including each other, the static inline function _vmap_to_mfn() is introduced in asm/page.h, as it uses struct pte_t and pte_is_mapping() from asm/page.h. _vmap_to_mfn() is then reused in the definition of vmap_to_mfn() macro in asm/mm.h. Fixes: 7db8d2bd9b ("xen/riscv: add minimal stuff to mm.h to build full Xen") Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in v4: - Convert _vmap_to_mfn() macro to static inline function. - Update the commit message: change macro to static inline function for _vmap_to_mfn(). --- Changes in v3: - Move vmap_to_mfn_ to asm/page.h to deal with circular dependency. - Convert vmap_to_mfn_() to macros. - Rename vmap_to_mfn_ to _vmap_to_mfn. - Update _vmap_to_mfn() to work with pte_t instead of pte_t*. - Add parentheses around va argument for macros vmap_to_mfn(). - Updated the commit message. --- Changes in v2: - Update defintion of vmap_to_mfn() as pt_walk() now returns pte_t instead of paddr_t. - Update the commit message. --- xen/arch/riscv/include/asm/mm.h | 2 +- xen/arch/riscv/include/asm/page.h | 7 +++++++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h index 292aa48fc1..4035cd400a 100644 --- a/xen/arch/riscv/include/asm/mm.h +++ b/xen/arch/riscv/include/asm/mm.h @@ -23,7 +23,7 @@ extern vaddr_t directmap_virt_start; #define gaddr_to_gfn(ga) _gfn(paddr_to_pfn(ga)) #define mfn_to_maddr(mfn) pfn_to_paddr(mfn_x(mfn)) #define maddr_to_mfn(ma) _mfn(paddr_to_pfn(ma)) -#define vmap_to_mfn(va) maddr_to_mfn(virt_to_maddr((vaddr_t)(va))) +#define vmap_to_mfn(va) _vmap_to_mfn((vaddr_t)(va)) #define vmap_to_page(va) mfn_to_page(vmap_to_mfn(va)) static inline void *maddr_to_virt(paddr_t ma) diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h index 0439a1a9ee..6ed570b478 100644 --- a/xen/arch/riscv/include/asm/page.h +++ b/xen/arch/riscv/include/asm/page.h @@ -210,6 +210,13 @@ static inline pte_t pte_from_mfn(mfn_t mfn, unsigned int flags) pte_t pt_walk(vaddr_t va, unsigned int *pte_level); +static inline mfn_t _vmap_to_mfn(vaddr_t va) +{ + pte_t entry = pt_walk(va, NULL); + BUG_ON(!pte_is_mapping(entry)); + return mfn_from_pte(entry); +} + #endif /* __ASSEMBLY__ */ #endif /* ASM__RISCV__PAGE_H */ -- 2.48.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH for 4.20? v4 2/3] xen/riscv: update defintion of vmap_to_mfn() 2025-02-12 16:50 ` [PATCH for 4.20? v4 2/3] xen/riscv: update defintion of vmap_to_mfn() Oleksii Kurochko @ 2025-02-19 11:17 ` Jan Beulich 0 siblings, 0 replies; 11+ messages in thread From: Jan Beulich @ 2025-02-19 11:17 UTC (permalink / raw) To: Oleksii Kurochko Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel On 12.02.2025 17:50, Oleksii Kurochko wrote: > --- a/xen/arch/riscv/include/asm/page.h > +++ b/xen/arch/riscv/include/asm/page.h > @@ -210,6 +210,13 @@ static inline pte_t pte_from_mfn(mfn_t mfn, unsigned int flags) > > pte_t pt_walk(vaddr_t va, unsigned int *pte_level); > > +static inline mfn_t _vmap_to_mfn(vaddr_t va) > +{ > + pte_t entry = pt_walk(va, NULL); > + BUG_ON(!pte_is_mapping(entry)); > + return mfn_from_pte(entry); > +} Nit: Blank line between declaration(s) and statement(s) please. Blank line ahead of the main return statement of a function please. With that (happy to adjust while committing): Reviewed-by: Jan Beulich <jbeulich@suse.com> Jan ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH for 4.20? v4 3/3] xen/riscv: update mfn calculation in pt_mapping_level() 2025-02-12 16:50 [PATCH for 4.20? v4 0/3] Fixes for vmap_to_mfn() and pt_mapping_level Oleksii Kurochko 2025-02-12 16:50 ` [PATCH for 4.20? v4 1/3] xen/riscv: implement software page table walking Oleksii Kurochko 2025-02-12 16:50 ` [PATCH for 4.20? v4 2/3] xen/riscv: update defintion of vmap_to_mfn() Oleksii Kurochko @ 2025-02-12 16:50 ` Oleksii Kurochko 2025-02-19 11:28 ` Jan Beulich 2 siblings, 1 reply; 11+ messages in thread From: Oleksii Kurochko @ 2025-02-12 16:50 UTC (permalink / raw) To: xen-devel Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné, Stefano Stabellini When pt_update() is called with arguments (..., INVALID_MFN, ..., 0 or 1), it indicates that a mapping is being destroyed/modifyed. In the case when modifying or destroying a mapping, it is necessary to search until a leaf node is found, instead of searching for a page table entry based on the precalculated `level` and `order`(look at pt_update()). This is because when `mfn` == INVALID_MFN, the `mask` (in pt_mapping_level()) will take into account only `vfn`, which could accidentally return an incorrect level, leading to the discovery of an incorrect page table entry. For example, if `vfn` is page table level 1 aligned, but it was mapped as page table level 0, then pt_mapping_level() will return `level` = 1, since only `vfn` (which is page table level 1 aligned) is taken into account when `mfn` == INVALID_MFN (look at pt_mapping_level()). Fixes: c2f1ded524 ("xen/riscv: page table handling") Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in v4: - Move defintion of local variable table inside `else` case as it is used only there. - Change unmap_table(table) to unmap_table(entry) for unifying both cases when _pt_walk() is used and when pte is seached on the specified level. - Initialize local variable `entry` to avoid compilation error caused by uninitialized variable. --- Changes in v3: - Drop ASSERT() for order as it isn't needed anymore. - Drop PTE_LEAF_SEARCH and use instead level=CONFIG_PAGING_LEVELS; refactor connected code correspondingly. - Calculate order once. - Drop initializer for local variable order. - Drop BUG_ON(!pte_is_mapping(*entry)) for the case when leaf searching happens as there is a similar check in pt_check_entry(). Look at pt.c:41 and pt.c:75. --- Changes in v2: - Introduce PTE_LEAF_SEARCH to tell page table update operation to walk down to wherever the leaf entry is. - Use introduced PTE_LEAF_SEARCH to not searching pte_t entry twice. - Update the commit message. --- xen/arch/riscv/pt.c | 97 +++++++++++++++++++++++++++++---------------- 1 file changed, 63 insertions(+), 34 deletions(-) diff --git a/xen/arch/riscv/pt.c b/xen/arch/riscv/pt.c index 260a3a9699..ed0587d58b 100644 --- a/xen/arch/riscv/pt.c +++ b/xen/arch/riscv/pt.c @@ -249,12 +249,10 @@ pte_t pt_walk(vaddr_t va, unsigned int *pte_level) /* Update an entry at the level @target. */ static int pt_update_entry(mfn_t root, vaddr_t virt, - mfn_t mfn, unsigned int target, + mfn_t mfn, unsigned int *target, unsigned int flags) { int rc; - unsigned int level = HYP_PT_ROOT_LEVEL; - pte_t *table; /* * The intermediate page table shouldn't be allocated when MFN isn't * valid and we are not populating page table. @@ -265,41 +263,48 @@ static int pt_update_entry(mfn_t root, vaddr_t virt, * combinations of (mfn, flags). */ bool alloc_tbl = !mfn_eq(mfn, INVALID_MFN) || (flags & PTE_POPULATE); - pte_t pte, *entry; - - /* convenience aliases */ - DECLARE_OFFSETS(offsets, virt); + pte_t pte, *entry = NULL; - table = map_table(root); - for ( ; level > target; level-- ) + if ( *target == CONFIG_PAGING_LEVELS ) + entry = _pt_walk(virt, target); + else { - rc = pt_next_level(alloc_tbl, &table, offsets[level]); - if ( rc == XEN_TABLE_MAP_NOMEM ) + pte_t *table; + unsigned int level = HYP_PT_ROOT_LEVEL; + /* convenience aliases */ + DECLARE_OFFSETS(offsets, virt); + + table = map_table(root); + for ( ; level > *target; level-- ) { - rc = -ENOMEM; - goto out; + rc = pt_next_level(alloc_tbl, &table, offsets[level]); + if ( rc == XEN_TABLE_MAP_NOMEM ) + { + rc = -ENOMEM; + goto out; + } + + if ( rc == XEN_TABLE_MAP_NONE ) + { + rc = 0; + goto out; + } + + if ( rc != XEN_TABLE_NORMAL ) + break; } - if ( rc == XEN_TABLE_MAP_NONE ) + if ( level != *target ) { - rc = 0; + dprintk(XENLOG_ERR, + "%s: Shattering superpage is not supported\n", __func__); + rc = -EOPNOTSUPP; goto out; } - if ( rc != XEN_TABLE_NORMAL ) - break; - } - - if ( level != target ) - { - dprintk(XENLOG_ERR, - "%s: Shattering superpage is not supported\n", __func__); - rc = -EOPNOTSUPP; - goto out; + entry = table + offsets[level]; } - entry = table + offsets[level]; - rc = -EINVAL; if ( !pt_check_entry(*entry, mfn, flags) ) goto out; @@ -331,7 +336,8 @@ static int pt_update_entry(mfn_t root, vaddr_t virt, rc = 0; out: - unmap_table(table); + if ( entry ) + unmap_table(entry); return rc; } @@ -424,17 +430,40 @@ static int pt_update(vaddr_t virt, mfn_t mfn, while ( left ) { - unsigned int order, level; - - level = pt_mapping_level(vfn, mfn, left, flags); - order = XEN_PT_LEVEL_ORDER(level); + unsigned int order, level = CONFIG_PAGING_LEVELS; - ASSERT(left >= BIT(order, UL)); + /* + * In the case when modifying or destroying a mapping, it is necessary + * to search until a leaf node is found, instead of searching for + * a page table entry based on the precalculated `level` and `order` + * (look at pt_update()). + * This is because when `mfn` == INVALID_MFN, the `mask`(in + * pt_mapping_level()) will take into account only `vfn`, which could + * accidentally return an incorrect level, leading to the discovery of + * an incorrect page table entry. + * + * For example, if `vfn` is page table level 1 aligned, but it was + * mapped as page table level 0, then pt_mapping_level() will return + * `level` = 1, since only `vfn` (which is page table level 1 aligned) + * is taken into account when `mfn` == INVALID_MFN + * (look at pt_mapping_level()). + * + * To force searching until a leaf node is found is necessary to have + * `level` == CONFIG_PAGING_LEVELS which is a default value for + * `level`. + * + * For other cases (when a mapping is not being modified or destroyed), + * pt_mapping_level() should be used. + */ + if ( !mfn_eq(mfn, INVALID_MFN) || (flags & PTE_POPULATE) ) + level = pt_mapping_level(vfn, mfn, left, flags); - rc = pt_update_entry(root, vfn << PAGE_SHIFT, mfn, level, flags); + rc = pt_update_entry(root, vfn << PAGE_SHIFT, mfn, &level, flags); if ( rc ) break; + order = XEN_PT_LEVEL_ORDER(level); + vfn += 1UL << order; if ( !mfn_eq(mfn, INVALID_MFN) ) mfn = mfn_add(mfn, 1UL << order); -- 2.48.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH for 4.20? v4 3/3] xen/riscv: update mfn calculation in pt_mapping_level() 2025-02-12 16:50 ` [PATCH for 4.20? v4 3/3] xen/riscv: update mfn calculation in pt_mapping_level() Oleksii Kurochko @ 2025-02-19 11:28 ` Jan Beulich 2025-02-19 14:46 ` Oleksii Kurochko 0 siblings, 1 reply; 11+ messages in thread From: Jan Beulich @ 2025-02-19 11:28 UTC (permalink / raw) To: Oleksii Kurochko Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel On 12.02.2025 17:50, Oleksii Kurochko wrote: > --- a/xen/arch/riscv/pt.c > +++ b/xen/arch/riscv/pt.c > @@ -249,12 +249,10 @@ pte_t pt_walk(vaddr_t va, unsigned int *pte_level) > > /* Update an entry at the level @target. */ > static int pt_update_entry(mfn_t root, vaddr_t virt, > - mfn_t mfn, unsigned int target, > + mfn_t mfn, unsigned int *target, > unsigned int flags) > { > int rc; > - unsigned int level = HYP_PT_ROOT_LEVEL; > - pte_t *table; > /* > * The intermediate page table shouldn't be allocated when MFN isn't > * valid and we are not populating page table. > @@ -265,41 +263,48 @@ static int pt_update_entry(mfn_t root, vaddr_t virt, > * combinations of (mfn, flags). > */ > bool alloc_tbl = !mfn_eq(mfn, INVALID_MFN) || (flags & PTE_POPULATE); > - pte_t pte, *entry; > - > - /* convenience aliases */ > - DECLARE_OFFSETS(offsets, virt); > + pte_t pte, *entry = NULL; With there also being "table" below, "entry" isn't quite as bad as in the other patch. Yet I'd still like to ask that you consider renaming. > - table = map_table(root); > - for ( ; level > target; level-- ) > + if ( *target == CONFIG_PAGING_LEVELS ) > + entry = _pt_walk(virt, target); Imo it's quite important for the comment ahead of the function to be updated to mention this special case. > + else > { > - rc = pt_next_level(alloc_tbl, &table, offsets[level]); > - if ( rc == XEN_TABLE_MAP_NOMEM ) > + pte_t *table; > + unsigned int level = HYP_PT_ROOT_LEVEL; > + /* convenience aliases */ Nit: Style. > @@ -331,7 +336,8 @@ static int pt_update_entry(mfn_t root, vaddr_t virt, > rc = 0; > > out: > - unmap_table(table); > + if ( entry ) > + unmap_table(entry); Would it perhaps be worth for unmap_table() to gracefully handle being passed NULL, to avoid such conditionals (there may be more in the future)? Jan ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH for 4.20? v4 3/3] xen/riscv: update mfn calculation in pt_mapping_level() 2025-02-19 11:28 ` Jan Beulich @ 2025-02-19 14:46 ` Oleksii Kurochko 2025-02-19 15:05 ` Jan Beulich 0 siblings, 1 reply; 11+ messages in thread From: Oleksii Kurochko @ 2025-02-19 14:46 UTC (permalink / raw) To: Jan Beulich Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel [-- Attachment #1: Type: text/plain, Size: 2374 bytes --] On 2/19/25 12:28 PM, Jan Beulich wrote: > On 12.02.2025 17:50, Oleksii Kurochko wrote: >> --- a/xen/arch/riscv/pt.c >> +++ b/xen/arch/riscv/pt.c >> @@ -249,12 +249,10 @@ pte_t pt_walk(vaddr_t va, unsigned int *pte_level) >> >> /* Update an entry at the level @target. */ >> static int pt_update_entry(mfn_t root, vaddr_t virt, >> - mfn_t mfn, unsigned int target, >> + mfn_t mfn, unsigned int *target, >> unsigned int flags) >> { >> int rc; >> - unsigned int level = HYP_PT_ROOT_LEVEL; >> - pte_t *table; >> /* >> * The intermediate page table shouldn't be allocated when MFN isn't >> * valid and we are not populating page table. >> @@ -265,41 +263,48 @@ static int pt_update_entry(mfn_t root, vaddr_t virt, >> * combinations of (mfn, flags). >> */ >> bool alloc_tbl = !mfn_eq(mfn, INVALID_MFN) || (flags & PTE_POPULATE); >> - pte_t pte, *entry; >> - >> - /* convenience aliases */ >> - DECLARE_OFFSETS(offsets, virt); >> + pte_t pte, *entry = NULL; > With there also being "table" below, "entry" isn't quite as bad as in the > other patch. Yet I'd still like to ask that you consider renaming. > >> - table = map_table(root); >> - for ( ; level > target; level-- ) >> + if ( *target == CONFIG_PAGING_LEVELS ) >> + entry = _pt_walk(virt, target); > Imo it's quite important for the comment ahead of the function to be updated > to mention this special case. > >> + else >> { >> - rc = pt_next_level(alloc_tbl, &table, offsets[level]); >> - if ( rc == XEN_TABLE_MAP_NOMEM ) >> + pte_t *table; >> + unsigned int level = HYP_PT_ROOT_LEVEL; >> + /* convenience aliases */ > Nit: Style. From the 'Comments' section of CODING_STYLE, I see that the comment should start with capital letter. Do you mean that? > >> @@ -331,7 +336,8 @@ static int pt_update_entry(mfn_t root, vaddr_t virt, >> rc = 0; >> >> out: >> - unmap_table(table); >> + if ( entry ) >> + unmap_table(entry); > Would it perhaps be worth for unmap_table() to gracefully handle being passed > NULL, to avoid such conditionals (there may be more in the future)? Agree, it would be more safe to move this check inside unmap_table(). I will update that. Thanks. ~ Oleksii [-- Attachment #2: Type: text/html, Size: 3387 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH for 4.20? v4 3/3] xen/riscv: update mfn calculation in pt_mapping_level() 2025-02-19 14:46 ` Oleksii Kurochko @ 2025-02-19 15:05 ` Jan Beulich 2025-02-19 15:09 ` Oleksii Kurochko 0 siblings, 1 reply; 11+ messages in thread From: Jan Beulich @ 2025-02-19 15:05 UTC (permalink / raw) To: Oleksii Kurochko Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel On 19.02.2025 15:46, Oleksii Kurochko wrote: > On 2/19/25 12:28 PM, Jan Beulich wrote: >> On 12.02.2025 17:50, Oleksii Kurochko wrote: >>> + else >>> { >>> - rc = pt_next_level(alloc_tbl, &table, offsets[level]); >>> - if ( rc == XEN_TABLE_MAP_NOMEM ) >>> + pte_t *table; >>> + unsigned int level = HYP_PT_ROOT_LEVEL; >>> + /* convenience aliases */ >> Nit: Style. > > From the 'Comments' section of CODING_STYLE, I see that the comment should start > with capital letter. Do you mean that? In the (earlier) reply to "xen/riscv: identify specific ISA supported by cpu" I said precisely that. I didn't think I'd need to repeat it here. So yes. Jan ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH for 4.20? v4 3/3] xen/riscv: update mfn calculation in pt_mapping_level() 2025-02-19 15:05 ` Jan Beulich @ 2025-02-19 15:09 ` Oleksii Kurochko 0 siblings, 0 replies; 11+ messages in thread From: Oleksii Kurochko @ 2025-02-19 15:09 UTC (permalink / raw) To: Jan Beulich Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel [-- Attachment #1: Type: text/plain, Size: 949 bytes --] On 2/19/25 4:05 PM, Jan Beulich wrote: > On 19.02.2025 15:46, Oleksii Kurochko wrote: >> On 2/19/25 12:28 PM, Jan Beulich wrote: >>> On 12.02.2025 17:50, Oleksii Kurochko wrote: >>>> + else >>>> { >>>> - rc = pt_next_level(alloc_tbl, &table, offsets[level]); >>>> - if ( rc == XEN_TABLE_MAP_NOMEM ) >>>> + pte_t *table; >>>> + unsigned int level = HYP_PT_ROOT_LEVEL; >>>> + /* convenience aliases */ >>> Nit: Style. >> From the 'Comments' section of CODING_STYLE, I see that the comment should start >> with capital letter. Do you mean that? > In the (earlier) reply to "xen/riscv: identify specific ISA supported by cpu" > I said precisely that. I didn't think I'd need to repeat it here. So yes. Of course, it was enough. The problem was that I started to read and answer to this patch series first and went to another (where you wrote that) one after. Anyway thank you for clarifying. ~ Oleksii [-- Attachment #2: Type: text/html, Size: 1906 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-02-19 15:10 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-02-12 16:50 [PATCH for 4.20? v4 0/3] Fixes for vmap_to_mfn() and pt_mapping_level Oleksii Kurochko 2025-02-12 16:50 ` [PATCH for 4.20? v4 1/3] xen/riscv: implement software page table walking Oleksii Kurochko 2025-02-19 11:14 ` Jan Beulich 2025-02-19 14:33 ` Oleksii Kurochko 2025-02-12 16:50 ` [PATCH for 4.20? v4 2/3] xen/riscv: update defintion of vmap_to_mfn() Oleksii Kurochko 2025-02-19 11:17 ` Jan Beulich 2025-02-12 16:50 ` [PATCH for 4.20? v4 3/3] xen/riscv: update mfn calculation in pt_mapping_level() Oleksii Kurochko 2025-02-19 11:28 ` Jan Beulich 2025-02-19 14:46 ` Oleksii Kurochko 2025-02-19 15:05 ` Jan Beulich 2025-02-19 15:09 ` Oleksii Kurochko
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.