* [PATCH 0/2] xen/mm: fix fallout from populate_physmap() deferred scrub change
@ 2026-03-25 10:08 Roger Pau Monne
2026-03-25 10:08 ` [PATCH 1/2] xen/mm: don't unconditionally clear PGC_need_scrub in alloc_heap_pages() Roger Pau Monne
2026-03-25 10:08 ` [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed Roger Pau Monne
0 siblings, 2 replies; 12+ messages in thread
From: Roger Pau Monne @ 2026-03-25 10:08 UTC (permalink / raw)
To: xen-devel
Cc: Roger Pau Monne, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Julien Grall, Stefano Stabellini
Hello,
Two fixes for the populate_physmap() deferred scrubbing changes.
Thanks, Roger.
Roger Pau Monne (2):
xen/mm: don't unconditionally clear PGC_need_scrub in
alloc_heap_pages()
xen/mm: do not assign pages to a domain until they are scrubbed
xen/common/memory.c | 9 ++++++++-
xen/common/page_alloc.c | 39 +++++++++++++++++++++++++++++----------
xen/include/xen/mm.h | 2 ++
3 files changed, 39 insertions(+), 11 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH 1/2] xen/mm: don't unconditionally clear PGC_need_scrub in alloc_heap_pages() 2026-03-25 10:08 [PATCH 0/2] xen/mm: fix fallout from populate_physmap() deferred scrub change Roger Pau Monne @ 2026-03-25 10:08 ` Roger Pau Monne 2026-03-25 13:37 ` Jan Beulich 2026-03-25 10:08 ` [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed Roger Pau Monne 1 sibling, 1 reply; 12+ messages in thread From: Roger Pau Monne @ 2026-03-25 10:08 UTC (permalink / raw) To: xen-devel Cc: Roger Pau Monne, Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall, Stefano Stabellini, Ayden Bottos alloc_heap_pages() will unconditionally clear PGC_need_scrub, even when MEMF_no_scrub is requested. This is kind of expected as otherwise some callers will assert on seeing non-expected flags set on the count_info field. Introduce a new MEMF bit to signal to alloc_heap_pages() that non-scrubbed pages should keep the PGC_need_scrub bit set. This fixes returning dirty pages from alloc_domheap_pages() without the PGC_need_scrub bit set for populate_physmap() to consume. With the above change alloc_domheap_pages() needs an adjustment to cope with allocated pages possibly having the PGC_need_scrub set. Fixes: 83a784a15b47 ("xen/mm: allow deferred scrub of physmap populate allocated pages") Reported-by: Ayden Bottos <aydenbottos12@gmail.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- This issue was initially reported to the Xen Security Team, and it did turn out to not require an XSA only because the code hasn't been part of any release, otherwise an XSA would have been issued. The Security Team would like to thanks Ayden for the prompt report. In the scrubbing loop in alloc_heap_pages() i should better be unsigned long. --- xen/common/memory.c | 3 ++- xen/common/page_alloc.c | 31 ++++++++++++++++++++++--------- xen/include/xen/mm.h | 2 ++ 3 files changed, 26 insertions(+), 10 deletions(-) diff --git a/xen/common/memory.c b/xen/common/memory.c index 918510f287a0..f0ff1311881c 100644 --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -345,7 +345,8 @@ static void populate_physmap(struct memop_args *a) unsigned int scrub_start = 0; unsigned int memflags = a->memflags | (d->creation_finished ? 0 - : MEMF_no_scrub); + : (MEMF_no_scrub | + MEMF_keep_scrub)); nodeid_t node = (a->memflags & MEMF_exact_node) ? MEMF_get_node(a->memflags) : NUMA_NO_NODE; diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 588b5b99cbc7..1316dfbd15ee 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -989,6 +989,8 @@ static struct page_info *alloc_heap_pages( ASSERT(zone_lo <= zone_hi); ASSERT(zone_hi < NR_ZONES); + ASSERT(!(memflags & MEMF_keep_scrub) || (memflags & MEMF_no_scrub)); + if ( unlikely(order > MAX_ORDER) ) return NULL; @@ -1110,17 +1112,26 @@ static struct page_info *alloc_heap_pages( { bool cold = d && d != current->domain; - for ( i = 0; i < (1U << order); i++ ) + if ( !(memflags & MEMF_no_scrub) ) { - if ( test_and_clear_bit(_PGC_need_scrub, &pg[i].count_info) ) + for ( i = 0; i < (1U << order); i++ ) { - if ( !(memflags & MEMF_no_scrub) ) + if ( test_and_clear_bit(_PGC_need_scrub, &pg[i].count_info) ) + { scrub_one_page(&pg[i], cold); - - dirty_cnt++; + dirty_cnt++; + } + else + check_one_page(&pg[i]); } - else if ( !(memflags & MEMF_no_scrub) ) - check_one_page(&pg[i]); + } + else + { + for ( i = 0; i < (1U << order); i++ ) + if ( (memflags & MEMF_keep_scrub) + ? test_bit(_PGC_need_scrub, &pg[i].count_info) + : test_and_clear_bit(_PGC_need_scrub, &pg[i].count_info) ) + dirty_cnt++; } if ( dirty_cnt ) @@ -2696,8 +2707,10 @@ struct page_info *alloc_domheap_pages( for ( i = 0; i < (1UL << order); i++ ) { - ASSERT(!pg[i].count_info); - pg[i].count_info = PGC_extra; + ASSERT(!(pg[i].count_info & + ~((memflags & MEMF_keep_scrub) ? PGC_need_scrub + : 0UL))); + pg[i].count_info |= PGC_extra; } } if ( assign_page(pg, order, d, memflags) ) diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h index d80bfba6d393..0639fc0d21fb 100644 --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -208,6 +208,8 @@ struct npfec { #define MEMF_no_refcount (1U<<_MEMF_no_refcount) #define _MEMF_populate_on_demand 1 #define MEMF_populate_on_demand (1U<<_MEMF_populate_on_demand) +#define _MEMF_keep_scrub 2 +#define MEMF_keep_scrub (1U<<_MEMF_keep_scrub) #define _MEMF_no_dma 3 #define MEMF_no_dma (1U<<_MEMF_no_dma) #define _MEMF_exact_node 4 -- 2.51.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] xen/mm: don't unconditionally clear PGC_need_scrub in alloc_heap_pages() 2026-03-25 10:08 ` [PATCH 1/2] xen/mm: don't unconditionally clear PGC_need_scrub in alloc_heap_pages() Roger Pau Monne @ 2026-03-25 13:37 ` Jan Beulich 2026-03-25 14:15 ` Ayden Bottos 0 siblings, 1 reply; 12+ messages in thread From: Jan Beulich @ 2026-03-25 13:37 UTC (permalink / raw) To: Roger Pau Monne Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Ayden Bottos, xen-devel On 25.03.2026 11:08, Roger Pau Monne wrote: > alloc_heap_pages() will unconditionally clear PGC_need_scrub, even when > MEMF_no_scrub is requested. This is kind of expected as otherwise some > callers will assert on seeing non-expected flags set on the count_info > field. > > Introduce a new MEMF bit to signal to alloc_heap_pages() that non-scrubbed > pages should keep the PGC_need_scrub bit set. This fixes returning dirty > pages from alloc_domheap_pages() without the PGC_need_scrub bit set for > populate_physmap() to consume. > > With the above change alloc_domheap_pages() needs an adjustment to cope > with allocated pages possibly having the PGC_need_scrub set. > > Fixes: 83a784a15b47 ("xen/mm: allow deferred scrub of physmap populate allocated pages") > Reported-by: Ayden Bottos <aydenbottos12@gmail.com> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> with one nit (minor request) at the bottom. > --- > This issue was initially reported to the Xen Security Team, and it did turn > out to not require an XSA only because the code hasn't been part of any > release, otherwise an XSA would have been issued. > > The Security Team would like to thanks Ayden for the prompt report. > > In the scrubbing loop in alloc_heap_pages() i should better be unsigned > long. This issue is wider than just that function. As long as MAX_ORDER <= BITS_PER_INT, I think we could have all such loops consistently use unsigned int induction variables. But of course switching to unsigned long would be okay as well, just perhaps a little less efficient on (at least) x86. My main wish would be for all of those variables to be consistent in type (and hence all involved literal number suffixes also being consistently U or UL). > --- a/xen/include/xen/mm.h > +++ b/xen/include/xen/mm.h > @@ -208,6 +208,8 @@ struct npfec { > #define MEMF_no_refcount (1U<<_MEMF_no_refcount) > #define _MEMF_populate_on_demand 1 > #define MEMF_populate_on_demand (1U<<_MEMF_populate_on_demand) > +#define _MEMF_keep_scrub 2 > +#define MEMF_keep_scrub (1U<<_MEMF_keep_scrub) > #define _MEMF_no_dma 3 > #define MEMF_no_dma (1U<<_MEMF_no_dma) > #define _MEMF_exact_node 4 Irrespective of all the similar issues in surrounding code, may I ask that << be surrounded by blanks in the new addition, to conform to ./CODING_STYLE? As an aside, I wonder whether we really need the separate _MEMF_keep_scrub, but the same likely applies to most other _MEMF_*. Jan ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] xen/mm: don't unconditionally clear PGC_need_scrub in alloc_heap_pages() 2026-03-25 13:37 ` Jan Beulich @ 2026-03-25 14:15 ` Ayden Bottos 0 siblings, 0 replies; 12+ messages in thread From: Ayden Bottos @ 2026-03-25 14:15 UTC (permalink / raw) To: Jan Beulich Cc: Roger Pau Monne, Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel This looks good to me. I would also add a brief comment in mm.h to make the contract clearer for future callers: MEMF_keep_scrub is an internal allocator flag and only valid together with MEMF_no_scrub. On Thu, Mar 26, 2026 at 12:37 AM Jan Beulich <jbeulich@suse.com> wrote: > > On 25.03.2026 11:08, Roger Pau Monne wrote: > > alloc_heap_pages() will unconditionally clear PGC_need_scrub, even when > > MEMF_no_scrub is requested. This is kind of expected as otherwise some > > callers will assert on seeing non-expected flags set on the count_info > > field. > > > > Introduce a new MEMF bit to signal to alloc_heap_pages() that non-scrubbed > > pages should keep the PGC_need_scrub bit set. This fixes returning dirty > > pages from alloc_domheap_pages() without the PGC_need_scrub bit set for > > populate_physmap() to consume. > > > > With the above change alloc_domheap_pages() needs an adjustment to cope > > with allocated pages possibly having the PGC_need_scrub set. > > > > Fixes: 83a784a15b47 ("xen/mm: allow deferred scrub of physmap populate allocated pages") > > Reported-by: Ayden Bottos <aydenbottos12@gmail.com> > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > Reviewed-by: Jan Beulich <jbeulich@suse.com> > with one nit (minor request) at the bottom. > > > --- > > This issue was initially reported to the Xen Security Team, and it did turn > > out to not require an XSA only because the code hasn't been part of any > > release, otherwise an XSA would have been issued. > > > > The Security Team would like to thanks Ayden for the prompt report. > > > > In the scrubbing loop in alloc_heap_pages() i should better be unsigned > > long. > > This issue is wider than just that function. As long as MAX_ORDER <= BITS_PER_INT, > I think we could have all such loops consistently use unsigned int induction > variables. But of course switching to unsigned long would be okay as well, just > perhaps a little less efficient on (at least) x86. My main wish would be for all > of those variables to be consistent in type (and hence all involved literal > number suffixes also being consistently U or UL). > > > --- a/xen/include/xen/mm.h > > +++ b/xen/include/xen/mm.h > > @@ -208,6 +208,8 @@ struct npfec { > > #define MEMF_no_refcount (1U<<_MEMF_no_refcount) > > #define _MEMF_populate_on_demand 1 > > #define MEMF_populate_on_demand (1U<<_MEMF_populate_on_demand) > > +#define _MEMF_keep_scrub 2 > > +#define MEMF_keep_scrub (1U<<_MEMF_keep_scrub) > > #define _MEMF_no_dma 3 > > #define MEMF_no_dma (1U<<_MEMF_no_dma) > > #define _MEMF_exact_node 4 > > Irrespective of all the similar issues in surrounding code, may I ask that << be > surrounded by blanks in the new addition, to conform to ./CODING_STYLE? > > As an aside, I wonder whether we really need the separate _MEMF_keep_scrub, but > the same likely applies to most other _MEMF_*. > > Jan ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed 2026-03-25 10:08 [PATCH 0/2] xen/mm: fix fallout from populate_physmap() deferred scrub change Roger Pau Monne 2026-03-25 10:08 ` [PATCH 1/2] xen/mm: don't unconditionally clear PGC_need_scrub in alloc_heap_pages() Roger Pau Monne @ 2026-03-25 10:08 ` Roger Pau Monne 2026-03-25 14:56 ` Jan Beulich 1 sibling, 1 reply; 12+ messages in thread From: Roger Pau Monne @ 2026-03-25 10:08 UTC (permalink / raw) To: xen-devel Cc: Roger Pau Monne, Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall, Stefano Stabellini Assigning pages to a domain make them the possible target of hypercalls like XENMEM_decrease_reservation ahead of such pages being scrubbed in populate_physmap() when the guest is running in PV mode. This might allow pages to be freed ahead of being scrubbed for example, as a stubdomain already running could target them by guessing their MFNs. It's also possible other action could set the page type ahead of scrubbing, which would be problematic. Prevent the pages pending scrub from being assigned to the domain, and only do the assign once the scrubbing has finished. This has the disadvantage that the allocated pages will be removed from the free pool, but not yet accounted towards the domain consumed page quota. However there can only be one stashed page in that state, and it's maximum size is bounded by the memop-max-order option. This is not too different from the current logic, where assigning pages to a domain (and thus checking whether such domain doesn't overflow it's quota) is also done after the memory has been allocated and removed from the pool of free pages. Fixes: 83a784a15b47 ("xen/mm: allow deferred scrub of physmap populate allocated pages") Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- I've attempted various different ways to solve this, but they all ended up being impossible. * Prevent non-scrubbed pages from getting extra refcounts (iow: make get_page() fail for them). This seemed nice, but the cleanup using put_page_alloc_ref() was impossible as non-scrubbed pages would return failure in get_page(), and so I couldn't take the extra reference ahead of calling put_page_alloc_ref(). * Disallow XENMEM_decrease_reservation until the domain has finished creation would fix the issue of pages being freed while pending scrub, but it's not clear there might be other usages that would be problematic, as get_page() on non-scrubbed pages would still return success. --- xen/common/memory.c | 6 ++++++ xen/common/page_alloc.c | 8 +++++++- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/xen/common/memory.c b/xen/common/memory.c index f0ff1311881c..1ad4b51c5b02 100644 --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -388,6 +388,12 @@ static void populate_physmap(struct memop_args *a) goto out; } } + + if ( assign_page(page, a->extent_order, d, memflags) ) + { + free_domheap_pages(page, a->extent_order); + goto out; + } } if ( unlikely(a->memflags & MEMF_no_tlbflush) ) diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 1316dfbd15ee..b72a74c705ba 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -2699,7 +2699,13 @@ struct page_info *alloc_domheap_pages( memflags, d)) == NULL)) ) return NULL; - if ( d && !(memflags & MEMF_no_owner) ) + /* + * Don't add pages with the PGC_need_scrub bit set to the domain, the + * caller must clean the bit and then manually call assign_pages(). + * Otherwise pages with the PGC_need_scrub would be reachable using + * get_page(). + */ + if ( d && !(memflags & MEMF_no_owner) && !(memflags & MEMF_keep_scrub) ) { if ( memflags & MEMF_no_refcount ) { -- 2.51.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed 2026-03-25 10:08 ` [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed Roger Pau Monne @ 2026-03-25 14:56 ` Jan Beulich 2026-03-25 16:26 ` Roger Pau Monné 0 siblings, 1 reply; 12+ messages in thread From: Jan Beulich @ 2026-03-25 14:56 UTC (permalink / raw) To: Roger Pau Monne Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel On 25.03.2026 11:08, Roger Pau Monne wrote: > --- > I've attempted various different ways to solve this, but they all ended up > being impossible. > > * Prevent non-scrubbed pages from getting extra refcounts (iow: make > get_page() fail for them). This seemed nice, but the cleanup using > put_page_alloc_ref() was impossible as non-scrubbed pages would return > failure in get_page(), and so I couldn't take the extra reference ahead > of calling put_page_alloc_ref(). A special-case variant of get_page() could be introduced, but maybe that would still be overly fragile. When we discussed this, what I had proposed didn't require use of get_page() though. assign_pages() would install two general references (plus one type ref for PGT_writable) in this special case. To free, you'd call put_page_alloc_ref() followed by put_page_and_type(). That said, the patch here is still less intrusive than I feared, so I'm not asking to re-work this again. > * Disallow XENMEM_decrease_reservation until the domain has finished > creation would fix the issue of pages being freed while pending scrub, > but it's not clear there might be other usages that would be problematic, > as get_page() on non-scrubbed pages would still return success. I agree this is of concern. > --- a/xen/common/memory.c > +++ b/xen/common/memory.c > @@ -388,6 +388,12 @@ static void populate_physmap(struct memop_args *a) > goto out; > } > } > + > + if ( assign_page(page, a->extent_order, d, memflags) ) > + { > + free_domheap_pages(page, a->extent_order); The pages don't have an owner set yet, so that function will go straight to free_heap_pages(), needlessly passing "true" as last argument. Correct, but (for large pages, which the stashing is about) highly inefficient. > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -2699,7 +2699,13 @@ struct page_info *alloc_domheap_pages( > memflags, d)) == NULL)) ) > return NULL; > > - if ( d && !(memflags & MEMF_no_owner) ) > + /* > + * Don't add pages with the PGC_need_scrub bit set to the domain, the > + * caller must clean the bit and then manually call assign_pages(). > + * Otherwise pages with the PGC_need_scrub would be reachable using > + * get_page(). > + */ How about replacing the latter "with the PGC_need_scrub" by "still subject to scrubbing"? > + if ( d && !(memflags & MEMF_no_owner) && !(memflags & MEMF_keep_scrub) ) > { > if ( memflags & MEMF_no_refcount ) > { This no-refcount code isn't repeated at the new call site of assign_page(). It's not needed there, yes, but wouldn't we better allow this to be taken care of right here, moving the MEMF_keep_scrub check immediately ahead of the call to assign_page()? Otherwise should we reject (much earlier) MEMF_no_refcount used together with MEMF_keep_scrub? Jan ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed 2026-03-25 14:56 ` Jan Beulich @ 2026-03-25 16:26 ` Roger Pau Monné 2026-03-25 16:54 ` Roger Pau Monné ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Roger Pau Monné @ 2026-03-25 16:26 UTC (permalink / raw) To: Jan Beulich Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel On Wed, Mar 25, 2026 at 03:56:05PM +0100, Jan Beulich wrote: > On 25.03.2026 11:08, Roger Pau Monne wrote: > > --- > > I've attempted various different ways to solve this, but they all ended up > > being impossible. > > > > * Prevent non-scrubbed pages from getting extra refcounts (iow: make > > get_page() fail for them). This seemed nice, but the cleanup using > > put_page_alloc_ref() was impossible as non-scrubbed pages would return > > failure in get_page(), and so I couldn't take the extra reference ahead > > of calling put_page_alloc_ref(). > > A special-case variant of get_page() could be introduced, but maybe that > would still be overly fragile. It seemed too much complexity (and risk), just to deal with this scenario. > When we discussed this, what I had proposed didn't require use of get_page() > though. assign_pages() would install two general references (plus one type > ref for PGT_writable) in this special case. To free, you'd call > put_page_alloc_ref() followed by put_page_and_type(). Doesn't that risk under flowing the page counter if there's a parallel call to decrease_reservation() against this MFN before? How would the freeing done in populate_physmap() (in case of concurrent calls) know whether already scrubbed pages have had it's PGC_allocated bit dropped? > That said, the patch here is still less intrusive than I feared, so I'm not > asking to re-work this again. Thanks. > > * Disallow XENMEM_decrease_reservation until the domain has finished > > creation would fix the issue of pages being freed while pending scrub, > > but it's not clear there might be other usages that would be problematic, > > as get_page() on non-scrubbed pages would still return success. > > I agree this is of concern. > > > --- a/xen/common/memory.c > > +++ b/xen/common/memory.c > > @@ -388,6 +388,12 @@ static void populate_physmap(struct memop_args *a) > > goto out; > > } > > } > > + > > + if ( assign_page(page, a->extent_order, d, memflags) ) > > + { > > + free_domheap_pages(page, a->extent_order); > > The pages don't have an owner set yet, so that function will go straight > to free_heap_pages(), needlessly passing "true" as last argument. Correct, > but (for large pages, which the stashing is about) highly inefficient. My bad, I was sure I was using the same freeing function as alloc_domheap_pages() on failure to assign, but I clearly wasn't. I will switch to using free_heap_pages(). > > --- a/xen/common/page_alloc.c > > +++ b/xen/common/page_alloc.c > > @@ -2699,7 +2699,13 @@ struct page_info *alloc_domheap_pages( > > memflags, d)) == NULL)) ) > > return NULL; > > > > - if ( d && !(memflags & MEMF_no_owner) ) > > + /* > > + * Don't add pages with the PGC_need_scrub bit set to the domain, the > > + * caller must clean the bit and then manually call assign_pages(). > > + * Otherwise pages with the PGC_need_scrub would be reachable using > > + * get_page(). > > + */ > > How about replacing the latter "with the PGC_need_scrub" by "still subject > to scrubbing"? Sure. > > + if ( d && !(memflags & MEMF_no_owner) && !(memflags & MEMF_keep_scrub) ) > > { > > if ( memflags & MEMF_no_refcount ) > > { > > This no-refcount code isn't repeated at the new call site of assign_page(). > It's not needed there, yes, but wouldn't we better allow this to be taken > care of right here, moving the MEMF_keep_scrub check immediately ahead of > the call to assign_page()? > > Otherwise should we reject (much earlier) MEMF_no_refcount used together > with MEMF_keep_scrub? hm, I was likely too focus on the specific use-case of populate_physmap(), which is the only user of MEMF_keep_scrub. I've noticed the MEMF_no_refcount bits here, but since populate_physmap() never uses that flag I just left it as-is. Will see about adjusting it. Thanks, Roger. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed 2026-03-25 16:26 ` Roger Pau Monné @ 2026-03-25 16:54 ` Roger Pau Monné 2026-03-26 8:18 ` Jan Beulich 2026-03-25 16:55 ` Jan Beulich 2026-03-26 8:35 ` Jan Beulich 2 siblings, 1 reply; 12+ messages in thread From: Roger Pau Monné @ 2026-03-25 16:54 UTC (permalink / raw) To: Jan Beulich Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel On Wed, Mar 25, 2026 at 05:26:01PM +0100, Roger Pau Monné wrote: > On Wed, Mar 25, 2026 at 03:56:05PM +0100, Jan Beulich wrote: > > On 25.03.2026 11:08, Roger Pau Monne wrote: > > > * Disallow XENMEM_decrease_reservation until the domain has finished > > > creation would fix the issue of pages being freed while pending scrub, > > > but it's not clear there might be other usages that would be problematic, > > > as get_page() on non-scrubbed pages would still return success. > > > > I agree this is of concern. > > > > > --- a/xen/common/memory.c > > > +++ b/xen/common/memory.c > > > @@ -388,6 +388,12 @@ static void populate_physmap(struct memop_args *a) > > > goto out; > > > } > > > } > > > + > > > + if ( assign_page(page, a->extent_order, d, memflags) ) > > > + { > > > + free_domheap_pages(page, a->extent_order); > > > > The pages don't have an owner set yet, so that function will go straight > > to free_heap_pages(), needlessly passing "true" as last argument. Correct, > > but (for large pages, which the stashing is about) highly inefficient. > > My bad, I was sure I was using the same freeing function as > alloc_domheap_pages() on failure to assign, but I clearly wasn't. I > will switch to using free_heap_pages(). Coming back to this, I can export free_heap_pages(), but then the call would also unconditionally have need_scrub == true, as the pages have been allocated without scrubbing. So I don't see much benefit of using free_heap_pages(), the more that it requires making such function global then. Thanks, Roger. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed 2026-03-25 16:54 ` Roger Pau Monné @ 2026-03-26 8:18 ` Jan Beulich 2026-03-26 8:55 ` Roger Pau Monné 0 siblings, 1 reply; 12+ messages in thread From: Jan Beulich @ 2026-03-26 8:18 UTC (permalink / raw) To: Roger Pau Monné Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel On 25.03.2026 17:54, Roger Pau Monné wrote: > On Wed, Mar 25, 2026 at 05:26:01PM +0100, Roger Pau Monné wrote: >> On Wed, Mar 25, 2026 at 03:56:05PM +0100, Jan Beulich wrote: >>> On 25.03.2026 11:08, Roger Pau Monne wrote: >>>> * Disallow XENMEM_decrease_reservation until the domain has finished >>>> creation would fix the issue of pages being freed while pending scrub, >>>> but it's not clear there might be other usages that would be problematic, >>>> as get_page() on non-scrubbed pages would still return success. >>> >>> I agree this is of concern. >>> >>>> --- a/xen/common/memory.c >>>> +++ b/xen/common/memory.c >>>> @@ -388,6 +388,12 @@ static void populate_physmap(struct memop_args *a) >>>> goto out; >>>> } >>>> } >>>> + >>>> + if ( assign_page(page, a->extent_order, d, memflags) ) >>>> + { >>>> + free_domheap_pages(page, a->extent_order); >>> >>> The pages don't have an owner set yet, so that function will go straight >>> to free_heap_pages(), needlessly passing "true" as last argument. Correct, >>> but (for large pages, which the stashing is about) highly inefficient. >> >> My bad, I was sure I was using the same freeing function as >> alloc_domheap_pages() on failure to assign, but I clearly wasn't. I >> will switch to using free_heap_pages(). > > Coming back to this, I can export free_heap_pages(), but then the call > would also unconditionally have need_scrub == true, as the pages have > been allocated without scrubbing. But the assign_page() call is here to have the scrubbing done ahead of it, so re-scrubbing after freeing shouldn't be necessary? Jan ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed 2026-03-26 8:18 ` Jan Beulich @ 2026-03-26 8:55 ` Roger Pau Monné 0 siblings, 0 replies; 12+ messages in thread From: Roger Pau Monné @ 2026-03-26 8:55 UTC (permalink / raw) To: Jan Beulich Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel On Thu, Mar 26, 2026 at 09:18:14AM +0100, Jan Beulich wrote: > On 25.03.2026 17:54, Roger Pau Monné wrote: > > On Wed, Mar 25, 2026 at 05:26:01PM +0100, Roger Pau Monné wrote: > >> On Wed, Mar 25, 2026 at 03:56:05PM +0100, Jan Beulich wrote: > >>> On 25.03.2026 11:08, Roger Pau Monne wrote: > >>>> * Disallow XENMEM_decrease_reservation until the domain has finished > >>>> creation would fix the issue of pages being freed while pending scrub, > >>>> but it's not clear there might be other usages that would be problematic, > >>>> as get_page() on non-scrubbed pages would still return success. > >>> > >>> I agree this is of concern. > >>> > >>>> --- a/xen/common/memory.c > >>>> +++ b/xen/common/memory.c > >>>> @@ -388,6 +388,12 @@ static void populate_physmap(struct memop_args *a) > >>>> goto out; > >>>> } > >>>> } > >>>> + > >>>> + if ( assign_page(page, a->extent_order, d, memflags) ) > >>>> + { > >>>> + free_domheap_pages(page, a->extent_order); > >>> > >>> The pages don't have an owner set yet, so that function will go straight > >>> to free_heap_pages(), needlessly passing "true" as last argument. Correct, > >>> but (for large pages, which the stashing is about) highly inefficient. > >> > >> My bad, I was sure I was using the same freeing function as > >> alloc_domheap_pages() on failure to assign, but I clearly wasn't. I > >> will switch to using free_heap_pages(). > > > > Coming back to this, I can export free_heap_pages(), but then the call > > would also unconditionally have need_scrub == true, as the pages have > > been allocated without scrubbing. > > But the assign_page() call is here to have the scrubbing done ahead of > it, so re-scrubbing after freeing shouldn't be necessary? I think I've done what you suggested in patch 3 of the v2. For the call here, yes, we could entirely avoid the scrubbing. For the other free_domheap_pages() calls in stash_allocation() and get_stashed_allocation() respectively we need to be more careful as some pages will still be pending scrub. Thanks, Roger. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed 2026-03-25 16:26 ` Roger Pau Monné 2026-03-25 16:54 ` Roger Pau Monné @ 2026-03-25 16:55 ` Jan Beulich 2026-03-26 8:35 ` Jan Beulich 2 siblings, 0 replies; 12+ messages in thread From: Jan Beulich @ 2026-03-25 16:55 UTC (permalink / raw) To: Roger Pau Monné Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel On 25.03.2026 17:26, Roger Pau Monné wrote: > On Wed, Mar 25, 2026 at 03:56:05PM +0100, Jan Beulich wrote: >> On 25.03.2026 11:08, Roger Pau Monne wrote: >>> --- a/xen/common/memory.c >>> +++ b/xen/common/memory.c >>> @@ -388,6 +388,12 @@ static void populate_physmap(struct memop_args *a) >>> goto out; >>> } >>> } >>> + >>> + if ( assign_page(page, a->extent_order, d, memflags) ) >>> + { >>> + free_domheap_pages(page, a->extent_order); >> >> The pages don't have an owner set yet, so that function will go straight >> to free_heap_pages(), needlessly passing "true" as last argument. Correct, >> but (for large pages, which the stashing is about) highly inefficient. > > My bad, I was sure I was using the same freeing function as > alloc_domheap_pages() on failure to assign, but I clearly wasn't. I > will switch to using free_heap_pages(). Well, not so much your bad, but likely a result of free_heap_pages() being static in page_alloc.c. Jan ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed 2026-03-25 16:26 ` Roger Pau Monné 2026-03-25 16:54 ` Roger Pau Monné 2026-03-25 16:55 ` Jan Beulich @ 2026-03-26 8:35 ` Jan Beulich 2 siblings, 0 replies; 12+ messages in thread From: Jan Beulich @ 2026-03-26 8:35 UTC (permalink / raw) To: Roger Pau Monné Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel On 25.03.2026 17:26, Roger Pau Monné wrote: > On Wed, Mar 25, 2026 at 03:56:05PM +0100, Jan Beulich wrote: >> On 25.03.2026 11:08, Roger Pau Monne wrote: >>> --- >>> I've attempted various different ways to solve this, but they all ended up >>> being impossible. >>> >>> * Prevent non-scrubbed pages from getting extra refcounts (iow: make >>> get_page() fail for them). This seemed nice, but the cleanup using >>> put_page_alloc_ref() was impossible as non-scrubbed pages would return >>> failure in get_page(), and so I couldn't take the extra reference ahead >>> of calling put_page_alloc_ref(). >> >> A special-case variant of get_page() could be introduced, but maybe that >> would still be overly fragile. > > It seemed too much complexity (and risk), just to deal with this > scenario. > >> When we discussed this, what I had proposed didn't require use of get_page() >> though. assign_pages() would install two general references (plus one type >> ref for PGT_writable) in this special case. To free, you'd call >> put_page_alloc_ref() followed by put_page_and_type(). > > Doesn't that risk under flowing the page counter if there's a parallel > call to decrease_reservation() against this MFN before? > > How would the freeing done in populate_physmap() (in case of > concurrent calls) know whether already scrubbed pages have had it's > PGC_allocated bit dropped? In that case put_page_alloc_ref() simply does nothing. That's why we have this wrapper: To avoid open-coding the same check in many places. Jan ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-03-26 8:55 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-03-25 10:08 [PATCH 0/2] xen/mm: fix fallout from populate_physmap() deferred scrub change Roger Pau Monne 2026-03-25 10:08 ` [PATCH 1/2] xen/mm: don't unconditionally clear PGC_need_scrub in alloc_heap_pages() Roger Pau Monne 2026-03-25 13:37 ` Jan Beulich 2026-03-25 14:15 ` Ayden Bottos 2026-03-25 10:08 ` [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed Roger Pau Monne 2026-03-25 14:56 ` Jan Beulich 2026-03-25 16:26 ` Roger Pau Monné 2026-03-25 16:54 ` Roger Pau Monné 2026-03-26 8:18 ` Jan Beulich 2026-03-26 8:55 ` Roger Pau Monné 2026-03-25 16:55 ` Jan Beulich 2026-03-26 8:35 ` Jan Beulich
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.