* [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration
@ 2026-06-09 15:15 Ross Lagerwall
2026-06-09 15:15 ` [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock Ross Lagerwall
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: Ross Lagerwall @ 2026-06-09 15:15 UTC (permalink / raw)
To: xen-devel
Cc: Ross Lagerwall, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini, Daniel P. Smith
When performing multiple migrations in parallel, the domctl lock may
become extremely contended:
* Operations like "xl vcpu-list" were observed to take in excess of 20s
to execute.
* The "clean" shadow op may pause the domain, restart with a
continuation and then become blocked on the domctl lock, causing VM
downtime in excess of 20 seconds.
These issues can be fixed by not holding the domctl for the frequently
called operations during migration.
Thanks
Ross Lagerwall (2):
domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock
domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock
xen/arch/x86/domctl.c | 4 ++++
xen/arch/x86/mm/paging.c | 8 ++++++--
xen/common/domctl.c | 13 +++++++++++++
3 files changed, 23 insertions(+), 2 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 18+ messages in thread* [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock 2026-06-09 15:15 [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration Ross Lagerwall @ 2026-06-09 15:15 ` Ross Lagerwall 2026-06-10 8:17 ` Roger Pau Monné 2026-06-11 13:11 ` Daniel P. Smith 2026-06-09 15:15 ` [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op " Ross Lagerwall ` (2 subsequent siblings) 3 siblings, 2 replies; 18+ messages in thread From: Ross Lagerwall @ 2026-06-09 15:15 UTC (permalink / raw) To: xen-devel Cc: Ross Lagerwall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith It does not have side effects and is protected from concurrent changes by the P2M read lock therefore skip taking the domctl lock. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> --- xen/arch/x86/domctl.c | 4 ++++ xen/common/domctl.c | 1 + 2 files changed, 5 insertions(+) diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c index 83bf51e498df..0e9a2532887e 100644 --- a/xen/arch/x86/domctl.c +++ b/xen/arch/x86/domctl.c @@ -301,6 +301,10 @@ long arch_do_domctl( /* Games to allow this code block to handle a compat guest. */ void __user *guest_handle = domctl->u.getpageframeinfo3.array.p; + ret = xsm_domctl(XSM_OTHER, d, domctl); + if ( ret ) + break; + if ( unlikely(num > 1024) || unlikely(num != domctl->u.getpageframeinfo3.num) ) { diff --git a/xen/common/domctl.c b/xen/common/domctl.c index 3efa5b9d55b9..35144d95b808 100644 --- a/xen/common/domctl.c +++ b/xen/common/domctl.c @@ -555,6 +555,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) case XEN_DOMCTL_gsi_permission: case XEN_DOMCTL_bind_pt_irq: case XEN_DOMCTL_unbind_pt_irq: + case XEN_DOMCTL_getpageframeinfo3: ret = arch_do_domctl(op, d, u_domctl); goto domctl_out_unlock_domonly; -- 2.53.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock 2026-06-09 15:15 ` [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock Ross Lagerwall @ 2026-06-10 8:17 ` Roger Pau Monné 2026-06-11 13:11 ` Daniel P. Smith 1 sibling, 0 replies; 18+ messages in thread From: Roger Pau Monné @ 2026-06-10 8:17 UTC (permalink / raw) To: Ross Lagerwall Cc: xen-devel, Jan Beulich, Andrew Cooper, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith On Tue, Jun 09, 2026 at 04:15:27PM +0100, Ross Lagerwall wrote: > It does not have side effects and is protected from concurrent changes > by the P2M read lock therefore skip taking the domctl lock. > > Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Thanks, Roger. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock 2026-06-09 15:15 ` [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock Ross Lagerwall 2026-06-10 8:17 ` Roger Pau Monné @ 2026-06-11 13:11 ` Daniel P. Smith 2026-06-11 14:23 ` Roger Pau Monné 1 sibling, 1 reply; 18+ messages in thread From: Daniel P. Smith @ 2026-06-11 13:11 UTC (permalink / raw) To: Ross Lagerwall, xen-devel Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini On 6/9/26 11:15 AM, Ross Lagerwall wrote: > It does not have side effects and is protected from concurrent changes > by the P2M read lock therefore skip taking the domctl lock. > > Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> > --- > xen/arch/x86/domctl.c | 4 ++++ > xen/common/domctl.c | 1 + > 2 files changed, 5 insertions(+) > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c > index 83bf51e498df..0e9a2532887e 100644 > --- a/xen/arch/x86/domctl.c > +++ b/xen/arch/x86/domctl.c > @@ -301,6 +301,10 @@ long arch_do_domctl( > /* Games to allow this code block to handle a compat guest. */ > void __user *guest_handle = domctl->u.getpageframeinfo3.array.p; > > + ret = xsm_domctl(XSM_OTHER, d, domctl); > + if ( ret ) > + break; > + > if ( unlikely(num > 1024) || > unlikely(num != domctl->u.getpageframeinfo3.num) ) > { > diff --git a/xen/common/domctl.c b/xen/common/domctl.c > index 3efa5b9d55b9..35144d95b808 100644 > --- a/xen/common/domctl.c > +++ b/xen/common/domctl.c > @@ -555,6 +555,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > case XEN_DOMCTL_gsi_permission: > case XEN_DOMCTL_bind_pt_irq: > case XEN_DOMCTL_unbind_pt_irq: > + case XEN_DOMCTL_getpageframeinfo3: > ret = arch_do_domctl(op, d, u_domctl); > goto domctl_out_unlock_domonly; > I would respectfully ask to be mindful when XSM hooks are being manipulated in a patch that a review from an XSM maintainer should be sought before committing a patch. In this case case the change itself is good, though I would have liked the opportunity to comment that the commit message should have had some explanation on the xsm change. V/r, Daniel P. Smith ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock 2026-06-11 13:11 ` Daniel P. Smith @ 2026-06-11 14:23 ` Roger Pau Monné 2026-06-11 14:25 ` Daniel P. Smith 0 siblings, 1 reply; 18+ messages in thread From: Roger Pau Monné @ 2026-06-11 14:23 UTC (permalink / raw) To: Daniel P. Smith Cc: Ross Lagerwall, xen-devel, Jan Beulich, Andrew Cooper, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini On Thu, Jun 11, 2026 at 09:11:15AM -0400, Daniel P. Smith wrote: > On 6/9/26 11:15 AM, Ross Lagerwall wrote: > > It does not have side effects and is protected from concurrent changes > > by the P2M read lock therefore skip taking the domctl lock. > > > > Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> > > --- > > xen/arch/x86/domctl.c | 4 ++++ > > xen/common/domctl.c | 1 + > > 2 files changed, 5 insertions(+) > > > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c > > index 83bf51e498df..0e9a2532887e 100644 > > --- a/xen/arch/x86/domctl.c > > +++ b/xen/arch/x86/domctl.c > > @@ -301,6 +301,10 @@ long arch_do_domctl( > > /* Games to allow this code block to handle a compat guest. */ > > void __user *guest_handle = domctl->u.getpageframeinfo3.array.p; > > + ret = xsm_domctl(XSM_OTHER, d, domctl); > > + if ( ret ) > > + break; > > + > > if ( unlikely(num > 1024) || > > unlikely(num != domctl->u.getpageframeinfo3.num) ) > > { > > diff --git a/xen/common/domctl.c b/xen/common/domctl.c > > index 3efa5b9d55b9..35144d95b808 100644 > > --- a/xen/common/domctl.c > > +++ b/xen/common/domctl.c > > @@ -555,6 +555,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > > case XEN_DOMCTL_gsi_permission: > > case XEN_DOMCTL_bind_pt_irq: > > case XEN_DOMCTL_unbind_pt_irq: > > + case XEN_DOMCTL_getpageframeinfo3: > > ret = arch_do_domctl(op, d, u_domctl); > > goto domctl_out_unlock_domonly; > I would respectfully ask to be mindful when XSM hooks are being manipulated > in a patch that a review from an XSM maintainer should be sought before > committing a patch. In this case case the change itself is good, though I > would have liked the opportunity to comment that the commit message should > have had some explanation on the xsm change. I've already replied to 2/2, but would like to re-instate my apology here so it doesn't seem like this went unnoticed: Sorry, this was already picked up in a rush to get it into 4.22 and I didn't realize it was missing an XSM maintainer Ack. That's entirely my fault, there was no intention to bypass or overrule your opinion. Regards, Roger. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock 2026-06-11 14:23 ` Roger Pau Monné @ 2026-06-11 14:25 ` Daniel P. Smith 0 siblings, 0 replies; 18+ messages in thread From: Daniel P. Smith @ 2026-06-11 14:25 UTC (permalink / raw) To: Roger Pau Monné Cc: Ross Lagerwall, xen-devel, Jan Beulich, Andrew Cooper, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini On 6/11/26 10:23 AM, Roger Pau Monné wrote: > On Thu, Jun 11, 2026 at 09:11:15AM -0400, Daniel P. Smith wrote: >> On 6/9/26 11:15 AM, Ross Lagerwall wrote: >>> It does not have side effects and is protected from concurrent changes >>> by the P2M read lock therefore skip taking the domctl lock. >>> >>> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> >>> --- >>> xen/arch/x86/domctl.c | 4 ++++ >>> xen/common/domctl.c | 1 + >>> 2 files changed, 5 insertions(+) >>> >>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c >>> index 83bf51e498df..0e9a2532887e 100644 >>> --- a/xen/arch/x86/domctl.c >>> +++ b/xen/arch/x86/domctl.c >>> @@ -301,6 +301,10 @@ long arch_do_domctl( >>> /* Games to allow this code block to handle a compat guest. */ >>> void __user *guest_handle = domctl->u.getpageframeinfo3.array.p; >>> + ret = xsm_domctl(XSM_OTHER, d, domctl); >>> + if ( ret ) >>> + break; >>> + >>> if ( unlikely(num > 1024) || >>> unlikely(num != domctl->u.getpageframeinfo3.num) ) >>> { >>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c >>> index 3efa5b9d55b9..35144d95b808 100644 >>> --- a/xen/common/domctl.c >>> +++ b/xen/common/domctl.c >>> @@ -555,6 +555,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) >>> case XEN_DOMCTL_gsi_permission: >>> case XEN_DOMCTL_bind_pt_irq: >>> case XEN_DOMCTL_unbind_pt_irq: >>> + case XEN_DOMCTL_getpageframeinfo3: >>> ret = arch_do_domctl(op, d, u_domctl); >>> goto domctl_out_unlock_domonly; >> I would respectfully ask to be mindful when XSM hooks are being manipulated >> in a patch that a review from an XSM maintainer should be sought before >> committing a patch. In this case case the change itself is good, though I >> would have liked the opportunity to comment that the commit message should >> have had some explanation on the xsm change. > > I've already replied to 2/2, but would like to re-instate my apology > here so it doesn't seem like this went unnoticed: > > Sorry, this was already picked up in a rush to get it into 4.22 and I > didn't realize it was missing an XSM maintainer Ack. That's entirely > my fault, there was no intention to bypass or overrule your opinion. No worries, thank you for the apologies. . ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock 2026-06-09 15:15 [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration Ross Lagerwall 2026-06-09 15:15 ` [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock Ross Lagerwall @ 2026-06-09 15:15 ` Ross Lagerwall 2026-06-10 8:35 ` Roger Pau Monné ` (2 more replies) 2026-06-10 9:57 ` [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration Ross Lagerwall 2026-06-11 14:55 ` Jan Beulich 3 siblings, 3 replies; 18+ messages in thread From: Ross Lagerwall @ 2026-06-09 15:15 UTC (permalink / raw) To: xen-devel Cc: Ross Lagerwall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith Handle XEN_DOMCTL_SHADOW_OP_{CLEAN,PEEK} without taking the domctl lock. This is safe because for these subops, the paging lock is mostly held which prevents it from operating concurrently on the same domain. There are some parts that are called without the paging lock held: * hvm_mapped_guest_frames_mark_dirty() - The function itself takes a spinlock so is protected from concurrent calls. In any case, it will mark all the pages dirty as required. * domain_pause() - The toolstack cannot unpause the domain while in paging_log_dirty_op() because the toolstack's pause/unpause ops have a separate ref count. * p2m_flush_hardware_cached_dirty() - This is called elsewhere without the domctl lock held so holding it wouldn't achieve anything. It should be fine as long as it is called at least once. * log_dirty.ops->clean() - If the callback is hap_clean_dirty_bitmap(), then it will hold the p2m lock while modifying the table. If the callback is sh_clean_dirty_bitmap(), it will hold the paging lock while modifying the table. In both cases, this is OK. * domain_unpause() - Same as the earlier domain_pause(). Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> --- xen/arch/x86/mm/paging.c | 8 ++++++-- xen/common/domctl.c | 12 ++++++++++++ 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c index 1a5822808620..bfb5b423a0dd 100644 --- a/xen/arch/x86/mm/paging.c +++ b/xen/arch/x86/mm/paging.c @@ -746,11 +746,15 @@ long do_paging_domctl_cont( ret = xsm_domctl(XSM_OTHER, d, &op); if ( !ret ) { - if ( domctl_lock_acquire() ) + bool lock = !(op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || + op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK); + + if ( !lock || domctl_lock_acquire() ) { ret = paging_domctl(d, &op.u.shadow_op, u_domctl, 1); - domctl_lock_release(); + if ( lock ) + domctl_lock_release(); } else ret = -ERESTART; diff --git a/xen/common/domctl.c b/xen/common/domctl.c index 35144d95b808..a3888c4e87d4 100644 --- a/xen/common/domctl.c +++ b/xen/common/domctl.c @@ -559,6 +559,18 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) ret = arch_do_domctl(op, d, u_domctl); goto domctl_out_unlock_domonly; + case XEN_DOMCTL_shadow_op: + if ( op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || + op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK ) + { + ret = xsm_domctl(XSM_OTHER, d, op); + if ( ret ) + goto domctl_out_unlock_domonly; + + ret = arch_do_domctl(op, d, u_domctl); + goto domctl_out_unlock_domonly; + } + fallthrough; default: /* Everything else handled further down. */ break; -- 2.53.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock 2026-06-09 15:15 ` [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op " Ross Lagerwall @ 2026-06-10 8:35 ` Roger Pau Monné 2026-06-10 9:50 ` Ross Lagerwall 2026-06-11 13:18 ` Daniel P. Smith 2026-06-11 15:02 ` Jan Beulich 2 siblings, 1 reply; 18+ messages in thread From: Roger Pau Monné @ 2026-06-10 8:35 UTC (permalink / raw) To: Ross Lagerwall Cc: xen-devel, Jan Beulich, Andrew Cooper, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith On Tue, Jun 09, 2026 at 04:15:28PM +0100, Ross Lagerwall wrote: > Handle XEN_DOMCTL_SHADOW_OP_{CLEAN,PEEK} without taking the domctl lock. > This is safe because for these subops, the paging lock is mostly held > which prevents it from operating concurrently on the same domain. There > are some parts that are called without the paging lock held: > > * hvm_mapped_guest_frames_mark_dirty() - The function itself takes a > spinlock so is protected from concurrent calls. In any case, it will > mark all the pages dirty as required. > > * domain_pause() - The toolstack cannot unpause the domain while in > paging_log_dirty_op() because the toolstack's pause/unpause ops have > a separate ref count. > > * p2m_flush_hardware_cached_dirty() - This is called elsewhere without > the domctl lock held so holding it wouldn't achieve anything. It > should be fine as long as it is called at least once. > > * log_dirty.ops->clean() - If the callback is hap_clean_dirty_bitmap(), > then it will hold the p2m lock while modifying the table. If the > callback is sh_clean_dirty_bitmap(), it will hold the paging lock > while modifying the table. In both cases, this is OK. > > * domain_unpause() - Same as the earlier domain_pause(). You could join both into a single domain_{,un}pause() bullet point. > > Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> > --- > xen/arch/x86/mm/paging.c | 8 ++++++-- > xen/common/domctl.c | 12 ++++++++++++ > 2 files changed, 18 insertions(+), 2 deletions(-) > > diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c > index 1a5822808620..bfb5b423a0dd 100644 > --- a/xen/arch/x86/mm/paging.c > +++ b/xen/arch/x86/mm/paging.c > @@ -746,11 +746,15 @@ long do_paging_domctl_cont( > ret = xsm_domctl(XSM_OTHER, d, &op); > if ( !ret ) > { > - if ( domctl_lock_acquire() ) > + bool lock = !(op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || > + op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK); > + > + if ( !lock || domctl_lock_acquire() ) > { > ret = paging_domctl(d, &op.u.shadow_op, u_domctl, 1); > > - domctl_lock_release(); > + if ( lock ) > + domctl_lock_release(); > } > else > ret = -ERESTART; > diff --git a/xen/common/domctl.c b/xen/common/domctl.c > index 35144d95b808..a3888c4e87d4 100644 > --- a/xen/common/domctl.c > +++ b/xen/common/domctl.c > @@ -559,6 +559,18 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > ret = arch_do_domctl(op, d, u_domctl); > goto domctl_out_unlock_domonly; > > + case XEN_DOMCTL_shadow_op: > + if ( op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || > + op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK ) > + { > + ret = xsm_domctl(XSM_OTHER, d, op); > + if ( ret ) > + goto domctl_out_unlock_domonly; > + > + ret = arch_do_domctl(op, d, u_domctl); > + goto domctl_out_unlock_domonly; > + } > + fallthrough; Newline, and I would use break rather than fallthrough, if further cases are added below you don't what to fallthrough, and there's nothing to do in the default case anyway. See for example how this is similar to XEN_DOMCTL_vm_event_op which also handles some sub-ops without a lock and uses a break instead of a fallthrough. FWIW, I would also put the XEN_DOMCTL_shadow_op case after XEN_DOMCTL_get_device_group and ahead of the XEN_DOMCTL_ioport_permission block, but that's just my taste. Thanks, Roger. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock 2026-06-10 8:35 ` Roger Pau Monné @ 2026-06-10 9:50 ` Ross Lagerwall 0 siblings, 0 replies; 18+ messages in thread From: Ross Lagerwall @ 2026-06-10 9:50 UTC (permalink / raw) To: Roger Pau Monné Cc: xen-devel, Jan Beulich, Andrew Cooper, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith On 6/10/26 9:35 AM, Roger Pau Monné wrote: > On Tue, Jun 09, 2026 at 04:15:28PM +0100, Ross Lagerwall wrote: >> Handle XEN_DOMCTL_SHADOW_OP_{CLEAN,PEEK} without taking the domctl lock. >> This is safe because for these subops, the paging lock is mostly held >> which prevents it from operating concurrently on the same domain. There >> are some parts that are called without the paging lock held: >> >> * hvm_mapped_guest_frames_mark_dirty() - The function itself takes a >> spinlock so is protected from concurrent calls. In any case, it will >> mark all the pages dirty as required. >> >> * domain_pause() - The toolstack cannot unpause the domain while in >> paging_log_dirty_op() because the toolstack's pause/unpause ops have >> a separate ref count. >> >> * p2m_flush_hardware_cached_dirty() - This is called elsewhere without >> the domctl lock held so holding it wouldn't achieve anything. It >> should be fine as long as it is called at least once. >> >> * log_dirty.ops->clean() - If the callback is hap_clean_dirty_bitmap(), >> then it will hold the p2m lock while modifying the table. If the >> callback is sh_clean_dirty_bitmap(), it will hold the paging lock >> while modifying the table. In both cases, this is OK. >> >> * domain_unpause() - Same as the earlier domain_pause(). > > You could join both into a single domain_{,un}pause() bullet point. > >> >> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> > > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> > >> --- >> xen/arch/x86/mm/paging.c | 8 ++++++-- >> xen/common/domctl.c | 12 ++++++++++++ >> 2 files changed, 18 insertions(+), 2 deletions(-) >> >> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c >> index 1a5822808620..bfb5b423a0dd 100644 >> --- a/xen/arch/x86/mm/paging.c >> +++ b/xen/arch/x86/mm/paging.c >> @@ -746,11 +746,15 @@ long do_paging_domctl_cont( >> ret = xsm_domctl(XSM_OTHER, d, &op); >> if ( !ret ) >> { >> - if ( domctl_lock_acquire() ) >> + bool lock = !(op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || >> + op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK); >> + >> + if ( !lock || domctl_lock_acquire() ) >> { >> ret = paging_domctl(d, &op.u.shadow_op, u_domctl, 1); >> >> - domctl_lock_release(); >> + if ( lock ) >> + domctl_lock_release(); >> } >> else >> ret = -ERESTART; >> diff --git a/xen/common/domctl.c b/xen/common/domctl.c >> index 35144d95b808..a3888c4e87d4 100644 >> --- a/xen/common/domctl.c >> +++ b/xen/common/domctl.c >> @@ -559,6 +559,18 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) >> ret = arch_do_domctl(op, d, u_domctl); >> goto domctl_out_unlock_domonly; >> >> + case XEN_DOMCTL_shadow_op: >> + if ( op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || >> + op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK ) >> + { >> + ret = xsm_domctl(XSM_OTHER, d, op); >> + if ( ret ) >> + goto domctl_out_unlock_domonly; >> + >> + ret = arch_do_domctl(op, d, u_domctl); >> + goto domctl_out_unlock_domonly; >> + } >> + fallthrough; > > Newline, and I would use break rather than fallthrough, if further > cases are added below you don't what to fallthrough, and there's > nothing to do in the default case anyway. Yes, not sure what I was thinking here. break makes far more sense. Can this adjustment be done when committing? > > See for example how this is similar to XEN_DOMCTL_vm_event_op which > also handles some sub-ops without a lock and uses a break instead of a > fallthrough. > > FWIW, I would also put the XEN_DOMCTL_shadow_op case after > XEN_DOMCTL_get_device_group and ahead of the > XEN_DOMCTL_ioport_permission block, but that's just my taste. I don't have a preference here, either is fine. Thanks, Ross ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock 2026-06-09 15:15 ` [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op " Ross Lagerwall 2026-06-10 8:35 ` Roger Pau Monné @ 2026-06-11 13:18 ` Daniel P. Smith 2026-06-11 14:20 ` Roger Pau Monné 2026-06-11 15:02 ` Jan Beulich 2 siblings, 1 reply; 18+ messages in thread From: Daniel P. Smith @ 2026-06-11 13:18 UTC (permalink / raw) To: Ross Lagerwall, xen-devel Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini On 6/9/26 11:15 AM, Ross Lagerwall wrote: > Handle XEN_DOMCTL_SHADOW_OP_{CLEAN,PEEK} without taking the domctl lock. > This is safe because for these subops, the paging lock is mostly held > which prevents it from operating concurrently on the same domain. There > are some parts that are called without the paging lock held: > > * hvm_mapped_guest_frames_mark_dirty() - The function itself takes a > spinlock so is protected from concurrent calls. In any case, it will > mark all the pages dirty as required. > > * domain_pause() - The toolstack cannot unpause the domain while in > paging_log_dirty_op() because the toolstack's pause/unpause ops have > a separate ref count. > > * p2m_flush_hardware_cached_dirty() - This is called elsewhere without > the domctl lock held so holding it wouldn't achieve anything. It > should be fine as long as it is called at least once. > > * log_dirty.ops->clean() - If the callback is hap_clean_dirty_bitmap(), > then it will hold the p2m lock while modifying the table. If the > callback is sh_clean_dirty_bitmap(), it will hold the paging lock > while modifying the table. In both cases, this is OK. > > * domain_unpause() - Same as the earlier domain_pause(). Please add a comment that that xsm check is to continue protecting the sub-ops with XS_PRIV. > Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> > --- > xen/arch/x86/mm/paging.c | 8 ++++++-- > xen/common/domctl.c | 12 ++++++++++++ > 2 files changed, 18 insertions(+), 2 deletions(-) > > diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c > index 1a5822808620..bfb5b423a0dd 100644 > --- a/xen/arch/x86/mm/paging.c > +++ b/xen/arch/x86/mm/paging.c > @@ -746,11 +746,15 @@ long do_paging_domctl_cont( > ret = xsm_domctl(XSM_OTHER, d, &op); > if ( !ret ) > { > - if ( domctl_lock_acquire() ) > + bool lock = !(op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || > + op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK); > + > + if ( !lock || domctl_lock_acquire() ) > { > ret = paging_domctl(d, &op.u.shadow_op, u_domctl, 1); > > - domctl_lock_release(); > + if ( lock ) > + domctl_lock_release(); > } > else > ret = -ERESTART; > diff --git a/xen/common/domctl.c b/xen/common/domctl.c > index 35144d95b808..a3888c4e87d4 100644 > --- a/xen/common/domctl.c > +++ b/xen/common/domctl.c > @@ -559,6 +559,18 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > ret = arch_do_domctl(op, d, u_domctl); > goto domctl_out_unlock_domonly; > > + case XEN_DOMCTL_shadow_op: > + if ( op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || > + op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK ) > + { > + ret = xsm_domctl(XSM_OTHER, d, op); > + if ( ret ) > + goto domctl_out_unlock_domonly; > + > + ret = arch_do_domctl(op, d, u_domctl); > + goto domctl_out_unlock_domonly; > + } > + fallthrough; > default: > /* Everything else handled further down. */ > break; After commit message change, Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock 2026-06-11 13:18 ` Daniel P. Smith @ 2026-06-11 14:20 ` Roger Pau Monné 2026-06-11 14:24 ` Daniel P. Smith 0 siblings, 1 reply; 18+ messages in thread From: Roger Pau Monné @ 2026-06-11 14:20 UTC (permalink / raw) To: Daniel P. Smith Cc: Ross Lagerwall, xen-devel, Jan Beulich, Andrew Cooper, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini On Thu, Jun 11, 2026 at 09:18:15AM -0400, Daniel P. Smith wrote: > On 6/9/26 11:15 AM, Ross Lagerwall wrote: > > Handle XEN_DOMCTL_SHADOW_OP_{CLEAN,PEEK} without taking the domctl lock. > > This is safe because for these subops, the paging lock is mostly held > > which prevents it from operating concurrently on the same domain. There > > are some parts that are called without the paging lock held: > > > > * hvm_mapped_guest_frames_mark_dirty() - The function itself takes a > > spinlock so is protected from concurrent calls. In any case, it will > > mark all the pages dirty as required. > > > > * domain_pause() - The toolstack cannot unpause the domain while in > > paging_log_dirty_op() because the toolstack's pause/unpause ops have > > a separate ref count. > > > > * p2m_flush_hardware_cached_dirty() - This is called elsewhere without > > the domctl lock held so holding it wouldn't achieve anything. It > > should be fine as long as it is called at least once. > > > > * log_dirty.ops->clean() - If the callback is hap_clean_dirty_bitmap(), > > then it will hold the p2m lock while modifying the table. If the > > callback is sh_clean_dirty_bitmap(), it will hold the paging lock > > while modifying the table. In both cases, this is OK. > > > > * domain_unpause() - Same as the earlier domain_pause(). > > Please add a comment that that xsm check is to continue protecting the > sub-ops with XS_PRIV. > > > > Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> > > --- > > xen/arch/x86/mm/paging.c | 8 ++++++-- > > xen/common/domctl.c | 12 ++++++++++++ > > 2 files changed, 18 insertions(+), 2 deletions(-) > > > > diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c > > index 1a5822808620..bfb5b423a0dd 100644 > > --- a/xen/arch/x86/mm/paging.c > > +++ b/xen/arch/x86/mm/paging.c > > @@ -746,11 +746,15 @@ long do_paging_domctl_cont( > > ret = xsm_domctl(XSM_OTHER, d, &op); > > if ( !ret ) > > { > > - if ( domctl_lock_acquire() ) > > + bool lock = !(op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || > > + op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK); > > + > > + if ( !lock || domctl_lock_acquire() ) > > { > > ret = paging_domctl(d, &op.u.shadow_op, u_domctl, 1); > > - domctl_lock_release(); > > + if ( lock ) > > + domctl_lock_release(); > > } > > else > > ret = -ERESTART; > > diff --git a/xen/common/domctl.c b/xen/common/domctl.c > > index 35144d95b808..a3888c4e87d4 100644 > > --- a/xen/common/domctl.c > > +++ b/xen/common/domctl.c > > @@ -559,6 +559,18 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > > ret = arch_do_domctl(op, d, u_domctl); > > goto domctl_out_unlock_domonly; > > + case XEN_DOMCTL_shadow_op: > > + if ( op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || > > + op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK ) > > + { > > + ret = xsm_domctl(XSM_OTHER, d, op); > > + if ( ret ) > > + goto domctl_out_unlock_domonly; > > + > > + ret = arch_do_domctl(op, d, u_domctl); > > + goto domctl_out_unlock_domonly; > > + } > > + fallthrough; > > default: > > /* Everything else handled further down. */ > > break; > > After commit message change, > > Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com> Sorry, this was already picked up in a rush to get it into 4.22 and I didn't realize it was missing an XSM maintainer Ack. That's entirely my fault, there was no intention to bypass or overrule your opinion. Given it's already committed, and there are no objections aside from the commit message adjustment my preference would be to leave it alone. Regards, Roger. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock 2026-06-11 14:20 ` Roger Pau Monné @ 2026-06-11 14:24 ` Daniel P. Smith 0 siblings, 0 replies; 18+ messages in thread From: Daniel P. Smith @ 2026-06-11 14:24 UTC (permalink / raw) To: Roger Pau Monné Cc: Ross Lagerwall, xen-devel, Jan Beulich, Andrew Cooper, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini On 6/11/26 10:20 AM, Roger Pau Monné wrote: > On Thu, Jun 11, 2026 at 09:18:15AM -0400, Daniel P. Smith wrote: >> On 6/9/26 11:15 AM, Ross Lagerwall wrote: >>> Handle XEN_DOMCTL_SHADOW_OP_{CLEAN,PEEK} without taking the domctl lock. >>> This is safe because for these subops, the paging lock is mostly held >>> which prevents it from operating concurrently on the same domain. There >>> are some parts that are called without the paging lock held: >>> >>> * hvm_mapped_guest_frames_mark_dirty() - The function itself takes a >>> spinlock so is protected from concurrent calls. In any case, it will >>> mark all the pages dirty as required. >>> >>> * domain_pause() - The toolstack cannot unpause the domain while in >>> paging_log_dirty_op() because the toolstack's pause/unpause ops have >>> a separate ref count. >>> >>> * p2m_flush_hardware_cached_dirty() - This is called elsewhere without >>> the domctl lock held so holding it wouldn't achieve anything. It >>> should be fine as long as it is called at least once. >>> >>> * log_dirty.ops->clean() - If the callback is hap_clean_dirty_bitmap(), >>> then it will hold the p2m lock while modifying the table. If the >>> callback is sh_clean_dirty_bitmap(), it will hold the paging lock >>> while modifying the table. In both cases, this is OK. >>> >>> * domain_unpause() - Same as the earlier domain_pause(). >> >> Please add a comment that that xsm check is to continue protecting the >> sub-ops with XS_PRIV. >> >> >>> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> >>> --- >>> xen/arch/x86/mm/paging.c | 8 ++++++-- >>> xen/common/domctl.c | 12 ++++++++++++ >>> 2 files changed, 18 insertions(+), 2 deletions(-) >>> >>> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c >>> index 1a5822808620..bfb5b423a0dd 100644 >>> --- a/xen/arch/x86/mm/paging.c >>> +++ b/xen/arch/x86/mm/paging.c >>> @@ -746,11 +746,15 @@ long do_paging_domctl_cont( >>> ret = xsm_domctl(XSM_OTHER, d, &op); >>> if ( !ret ) >>> { >>> - if ( domctl_lock_acquire() ) >>> + bool lock = !(op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || >>> + op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK); >>> + >>> + if ( !lock || domctl_lock_acquire() ) >>> { >>> ret = paging_domctl(d, &op.u.shadow_op, u_domctl, 1); >>> - domctl_lock_release(); >>> + if ( lock ) >>> + domctl_lock_release(); >>> } >>> else >>> ret = -ERESTART; >>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c >>> index 35144d95b808..a3888c4e87d4 100644 >>> --- a/xen/common/domctl.c >>> +++ b/xen/common/domctl.c >>> @@ -559,6 +559,18 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) >>> ret = arch_do_domctl(op, d, u_domctl); >>> goto domctl_out_unlock_domonly; >>> + case XEN_DOMCTL_shadow_op: >>> + if ( op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || >>> + op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK ) >>> + { >>> + ret = xsm_domctl(XSM_OTHER, d, op); >>> + if ( ret ) >>> + goto domctl_out_unlock_domonly; >>> + >>> + ret = arch_do_domctl(op, d, u_domctl); >>> + goto domctl_out_unlock_domonly; >>> + } >>> + fallthrough; >>> default: >>> /* Everything else handled further down. */ >>> break; >> >> After commit message change, >> >> Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com> > > Sorry, this was already picked up in a rush to get it into 4.22 and I > didn't realize it was missing an XSM maintainer Ack. That's entirely > my fault, there was no intention to bypass or overrule your opinion. I fully understand and take no offense. > Given it's already committed, and there are no objections aside from > the commit message adjustment my preference would be to leave it > alone. Hmm, I saw patch 1 but not patch 2. Sorry for the extra noise. v/r, dps ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock 2026-06-09 15:15 ` [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op " Ross Lagerwall 2026-06-10 8:35 ` Roger Pau Monné 2026-06-11 13:18 ` Daniel P. Smith @ 2026-06-11 15:02 ` Jan Beulich 2 siblings, 0 replies; 18+ messages in thread From: Jan Beulich @ 2026-06-11 15:02 UTC (permalink / raw) To: Ross Lagerwall Cc: Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith, xen-devel On 09.06.2026 17:15, Ross Lagerwall wrote: > --- a/xen/arch/x86/mm/paging.c > +++ b/xen/arch/x86/mm/paging.c > @@ -746,11 +746,15 @@ long do_paging_domctl_cont( > ret = xsm_domctl(XSM_OTHER, d, &op); > if ( !ret ) > { > - if ( domctl_lock_acquire() ) > + bool lock = !(op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || > + op.u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK); I realize this has been committed already, yet I'd still like to mention that I never really understood why expressions like this (with a negation which can easily be avoided: bool lock = op.u.shadow_op.op != XEN_DOMCTL_SHADOW_OP_CLEAN && op.u.shadow_op.op != XEN_DOMCTL_SHADOW_OP_PEEK; ) would be used. Personally I consider such hampering readability. And no, the other expression (in do_domctl()) being if ( op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_CLEAN || op->u.shadow_op.op == XEN_DOMCTL_SHADOW_OP_PEEK ) imo doesn't justify that either: While in the patch they're close together and hence the analogy can be easily spotted, in the ultimate source files the two expressions are far apart. Jan ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration 2026-06-09 15:15 [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration Ross Lagerwall 2026-06-09 15:15 ` [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock Ross Lagerwall 2026-06-09 15:15 ` [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op " Ross Lagerwall @ 2026-06-10 9:57 ` Ross Lagerwall 2026-06-10 11:48 ` Oleksii Kurochko 2026-06-11 14:55 ` Jan Beulich 3 siblings, 1 reply; 18+ messages in thread From: Ross Lagerwall @ 2026-06-10 9:57 UTC (permalink / raw) To: xen-devel Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith, Oleksii Kurochko On 6/9/26 4:15 PM, Ross Lagerwall wrote: > When performing multiple migrations in parallel, the domctl lock may > become extremely contended: > > * Operations like "xl vcpu-list" were observed to take in excess of 20s > to execute. > * The "clean" shadow op may pause the domain, restart with a > continuation and then become blocked on the domctl lock, causing VM > downtime in excess of 20 seconds. > > These issues can be fixed by not holding the domctl for the frequently > called operations during migration. > > Thanks > > Ross Lagerwall (2): > domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock > domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock > > xen/arch/x86/domctl.c | 4 ++++ > xen/arch/x86/mm/paging.c | 8 ++++++-- > xen/common/domctl.c | 13 +++++++++++++ > 3 files changed, 23 insertions(+), 2 deletions(-) > I'd like to request inclusion of this in 4.22 since it fixes a real customer issue we have observed and would have been posted some time ago but was delayed to avoid drawing attention to and colliding with XSA-492. Thanks, Ross ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration 2026-06-10 9:57 ` [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration Ross Lagerwall @ 2026-06-10 11:48 ` Oleksii Kurochko 0 siblings, 0 replies; 18+ messages in thread From: Oleksii Kurochko @ 2026-06-10 11:48 UTC (permalink / raw) To: Ross Lagerwall, xen-devel Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith On 6/10/26 11:57 AM, Ross Lagerwall wrote: > On 6/9/26 4:15 PM, Ross Lagerwall wrote: >> When performing multiple migrations in parallel, the domctl lock may >> become extremely contended: >> >> * Operations like "xl vcpu-list" were observed to take in excess of 20s >> to execute. >> * The "clean" shadow op may pause the domain, restart with a >> continuation and then become blocked on the domctl lock, causing VM >> downtime in excess of 20 seconds. >> >> These issues can be fixed by not holding the domctl for the frequently >> called operations during migration. >> >> Thanks >> >> Ross Lagerwall (2): >> domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock >> domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock >> >> xen/arch/x86/domctl.c | 4 ++++ >> xen/arch/x86/mm/paging.c | 8 ++++++-- >> xen/common/domctl.c | 13 +++++++++++++ >> 3 files changed, 23 insertions(+), 2 deletions(-) >> > > I'd like to request inclusion of this in 4.22 since it fixes a real > customer issue we have observed and would have been posted some time ago > but was delayed to avoid drawing attention to and colliding with > XSA-492. Considering this and performance improvements it would be really nice to have in in 4.22: Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Thanks. ~ Oleksii ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration 2026-06-09 15:15 [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration Ross Lagerwall ` (2 preceding siblings ...) 2026-06-10 9:57 ` [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration Ross Lagerwall @ 2026-06-11 14:55 ` Jan Beulich 2026-06-11 16:02 ` Ross Lagerwall 3 siblings, 1 reply; 18+ messages in thread From: Jan Beulich @ 2026-06-11 14:55 UTC (permalink / raw) To: Ross Lagerwall Cc: Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith, xen-devel On 09.06.2026 17:15, Ross Lagerwall wrote: > When performing multiple migrations in parallel, the domctl lock may > become extremely contended: > > * Operations like "xl vcpu-list" were observed to take in excess of 20s > to execute. Does "xl vcpu-list" involve ... > * The "clean" shadow op may pause the domain, restart with a > continuation and then become blocked on the domctl lock, causing VM > downtime in excess of 20 seconds. > > These issues can be fixed by not holding the domctl for the frequently > called operations during migration. > > Thanks > > Ross Lagerwall (2): > domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock ... XEN_DOMCTL_getpageframeinfo3? Jan > domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock > > xen/arch/x86/domctl.c | 4 ++++ > xen/arch/x86/mm/paging.c | 8 ++++++-- > xen/common/domctl.c | 13 +++++++++++++ > 3 files changed, 23 insertions(+), 2 deletions(-) > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration 2026-06-11 14:55 ` Jan Beulich @ 2026-06-11 16:02 ` Ross Lagerwall 2026-06-11 16:06 ` Jan Beulich 0 siblings, 1 reply; 18+ messages in thread From: Ross Lagerwall @ 2026-06-11 16:02 UTC (permalink / raw) To: Jan Beulich Cc: Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith, xen-devel On 6/11/26 3:55 PM, Jan Beulich wrote: > On 09.06.2026 17:15, Ross Lagerwall wrote: >> When performing multiple migrations in parallel, the domctl lock may >> become extremely contended: >> >> * Operations like "xl vcpu-list" were observed to take in excess of 20s >> to execute. > > Does "xl vcpu-list" involve ... > >> * The "clean" shadow op may pause the domain, restart with a >> continuation and then become blocked on the domctl lock, causing VM >> downtime in excess of 20 seconds. >> >> These issues can be fixed by not holding the domctl for the frequently >> called operations during migration. >> >> Thanks >> >> Ross Lagerwall (2): >> domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock > > ... XEN_DOMCTL_getpageframeinfo3? > No, but "xl vcpu-list" takes the domctl lock and this contends with XEN_DOMCTL_getpageframeinfo3 and XEN_DOMCTL_shadow_op taking the domctl lock which are called frequently by the migration process(es). Various other operations were slow due to the domctl lock contention but "xl vcpu-list" was the most obviously visible example. Ross ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration 2026-06-11 16:02 ` Ross Lagerwall @ 2026-06-11 16:06 ` Jan Beulich 0 siblings, 0 replies; 18+ messages in thread From: Jan Beulich @ 2026-06-11 16:06 UTC (permalink / raw) To: Ross Lagerwall Cc: Andrew Cooper, Roger Pau Monné, Teddy Astie, Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini, Daniel P. Smith, xen-devel On 11.06.2026 18:02, Ross Lagerwall wrote: > On 6/11/26 3:55 PM, Jan Beulich wrote: >> On 09.06.2026 17:15, Ross Lagerwall wrote: >>> When performing multiple migrations in parallel, the domctl lock may >>> become extremely contended: >>> >>> * Operations like "xl vcpu-list" were observed to take in excess of 20s >>> to execute. >> >> Does "xl vcpu-list" involve ... >> >>> * The "clean" shadow op may pause the domain, restart with a >>> continuation and then become blocked on the domctl lock, causing VM >>> downtime in excess of 20 seconds. >>> >>> These issues can be fixed by not holding the domctl for the frequently >>> called operations during migration. >>> >>> Thanks >>> >>> Ross Lagerwall (2): >>> domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock >> >> ... XEN_DOMCTL_getpageframeinfo3? >> > > No, but "xl vcpu-list" takes the domctl lock If this is still the case after XSA-492, then maybe the follow-ups I have pending to post will eliminate (or at least reduce) this. I don't think that's 4.22 material, though. > and this contends with > XEN_DOMCTL_getpageframeinfo3 and XEN_DOMCTL_shadow_op taking the domctl lock > which are called frequently by the migration process(es). > > Various other operations were slow due to the domctl lock contention but "xl > vcpu-list" was the most obviously visible example. I see. Jan ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2026-06-11 16:06 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-09 15:15 [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration Ross Lagerwall 2026-06-09 15:15 ` [PATCH v1 1/2] domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock Ross Lagerwall 2026-06-10 8:17 ` Roger Pau Monné 2026-06-11 13:11 ` Daniel P. Smith 2026-06-11 14:23 ` Roger Pau Monné 2026-06-11 14:25 ` Daniel P. Smith 2026-06-09 15:15 ` [PATCH v1 2/2] domctl: Handle some of XEN_DOMCTL_shadow_op " Ross Lagerwall 2026-06-10 8:35 ` Roger Pau Monné 2026-06-10 9:50 ` Ross Lagerwall 2026-06-11 13:18 ` Daniel P. Smith 2026-06-11 14:20 ` Roger Pau Monné 2026-06-11 14:24 ` Daniel P. Smith 2026-06-11 15:02 ` Jan Beulich 2026-06-10 9:57 ` [PATCH v1 0/2] domctl: Avoid taking domctl lock for certain ops used during migration Ross Lagerwall 2026-06-10 11:48 ` Oleksii Kurochko 2026-06-11 14:55 ` Jan Beulich 2026-06-11 16:02 ` Ross Lagerwall 2026-06-11 16:06 ` Jan Beulich
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.