All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] domain: use unsigned loop induction variable in complete_domain_destroy()
@ 2026-02-26  9:01 Jan Beulich
  2026-03-04 15:38 ` Roger Pau Monné
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2026-02-26  9:01 UTC (permalink / raw)
  To: xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
	Michal Orzel, Roger Pau Monné

Using plain (signed) int variables as array indexes can be unhelpful on at
least x86, where the compiler may see the need to insert sign-extension
insns (strictly speaking it should be able to avoid that when the loop
continuation condition says >= 0, but that's not generally the case even
with gcc15).

Observed effects with gcc15 (will of course vary with compiler version and
level of optimization):
- on x86, one less preserved register in use, yet due to sub-optimal
  choice of register variables still a small code size increase (%r12
  isn't a good choice when it's used for base-without-index addressing, as
  it requires a SIB byte which other registers wouldn't require),
- on Arm64 code size decreases, albeit that's eaten up by padding which is
  being inserted ahead of a few labels,
- on Arm32 code size increases for a reason I didn't fully understand (my
  ability to read Arm assembly is still somewhat limited).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1475,7 +1475,7 @@ static void cf_check complete_domain_des
 {
     struct domain *d = container_of(head, struct domain, rcu);
     struct vcpu *v;
-    int i;
+    unsigned int i;
 
     /*
      * Flush all state for the vCPU previously having run on the current CPU.
@@ -1485,7 +1485,7 @@ static void cf_check complete_domain_des
      */
     sync_local_execstate();
 
-    for ( i = d->max_vcpus - 1; i >= 0; i-- )
+    for ( i = d->max_vcpus; i-- > 0; )
     {
         if ( (v = d->vcpu[i]) == NULL )
             continue;
@@ -1511,7 +1511,7 @@ static void cf_check complete_domain_des
     xfree(d->vm_event_share);
 #endif
 
-    for ( i = d->max_vcpus - 1; i >= 0; i-- )
+    for ( i = d->max_vcpus; i-- > 0; )
         if ( (v = d->vcpu[i]) != NULL )
             vcpu_destroy(v);
 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] domain: use unsigned loop induction variable in complete_domain_destroy()
  2026-02-26  9:01 [PATCH] domain: use unsigned loop induction variable in complete_domain_destroy() Jan Beulich
@ 2026-03-04 15:38 ` Roger Pau Monné
  2026-03-04 15:48   ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Roger Pau Monné @ 2026-03-04 15:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Julien Grall,
	Stefano Stabellini, Anthony PERARD, Michal Orzel

On Thu, Feb 26, 2026 at 10:01:45AM +0100, Jan Beulich wrote:
> Using plain (signed) int variables as array indexes can be unhelpful on at
> least x86, where the compiler may see the need to insert sign-extension
> insns (strictly speaking it should be able to avoid that when the loop
> continuation condition says >= 0, but that's not generally the case even
> with gcc15).
> 
> Observed effects with gcc15 (will of course vary with compiler version and
> level of optimization):
> - on x86, one less preserved register in use, yet due to sub-optimal
>   choice of register variables still a small code size increase (%r12
>   isn't a good choice when it's used for base-without-index addressing, as
>   it requires a SIB byte which other registers wouldn't require),
> - on Arm64 code size decreases, albeit that's eaten up by padding which is
>   being inserted ahead of a few labels,
> - on Arm32 code size increases for a reason I didn't fully understand (my
>   ability to read Arm assembly is still somewhat limited).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1475,7 +1475,7 @@ static void cf_check complete_domain_des
>  {
>      struct domain *d = container_of(head, struct domain, rcu);
>      struct vcpu *v;
> -    int i;
> +    unsigned int i;
>  
>      /*
>       * Flush all state for the vCPU previously having run on the current CPU.
> @@ -1485,7 +1485,7 @@ static void cf_check complete_domain_des
>       */
>      sync_local_execstate();
>  
> -    for ( i = d->max_vcpus - 1; i >= 0; i-- )
> +    for ( i = d->max_vcpus; i-- > 0; )

Is there any reason we need to do those loops backwards?

I would rather do:

for ( i = 0; i < d->max_vcpus; i++ )

?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] domain: use unsigned loop induction variable in complete_domain_destroy()
  2026-03-04 15:38 ` Roger Pau Monné
@ 2026-03-04 15:48   ` Jan Beulich
  2026-03-04 17:36     ` Roger Pau Monné
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2026-03-04 15:48 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Julien Grall,
	Stefano Stabellini, Anthony PERARD, Michal Orzel

On 04.03.2026 16:38, Roger Pau Monné wrote:
> On Thu, Feb 26, 2026 at 10:01:45AM +0100, Jan Beulich wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -1475,7 +1475,7 @@ static void cf_check complete_domain_des
>>  {
>>      struct domain *d = container_of(head, struct domain, rcu);
>>      struct vcpu *v;
>> -    int i;
>> +    unsigned int i;
>>  
>>      /*
>>       * Flush all state for the vCPU previously having run on the current CPU.
>> @@ -1485,7 +1485,7 @@ static void cf_check complete_domain_des
>>       */
>>      sync_local_execstate();
>>  
>> -    for ( i = d->max_vcpus - 1; i >= 0; i-- )
>> +    for ( i = d->max_vcpus; i-- > 0; )
> 
> Is there any reason we need to do those loops backwards?
> 
> I would rather do:
> 
> for ( i = 0; i < d->max_vcpus; i++ )
> 
> ?

I think it's better to keep like this. The latter of the loops would better
clear d->vcpu[i] (to not leave a dangling pointer), and there may be code
which assumes that for ordinary domains d->vcpu[] is populated contiguously.
Hardly any code should touch the vCPU-s of a domain destructed this far, but
still better safe than sorry, I guess.

In no case would I like to mix both changes.

Jan


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] domain: use unsigned loop induction variable in complete_domain_destroy()
  2026-03-04 15:48   ` Jan Beulich
@ 2026-03-04 17:36     ` Roger Pau Monné
  2026-03-05  8:07       ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Roger Pau Monné @ 2026-03-04 17:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Julien Grall,
	Stefano Stabellini, Anthony PERARD, Michal Orzel

On Wed, Mar 04, 2026 at 04:48:07PM +0100, Jan Beulich wrote:
> On 04.03.2026 16:38, Roger Pau Monné wrote:
> > On Thu, Feb 26, 2026 at 10:01:45AM +0100, Jan Beulich wrote:
> >> --- a/xen/common/domain.c
> >> +++ b/xen/common/domain.c
> >> @@ -1475,7 +1475,7 @@ static void cf_check complete_domain_des
> >>  {
> >>      struct domain *d = container_of(head, struct domain, rcu);
> >>      struct vcpu *v;
> >> -    int i;
> >> +    unsigned int i;
> >>  
> >>      /*
> >>       * Flush all state for the vCPU previously having run on the current CPU.
> >> @@ -1485,7 +1485,7 @@ static void cf_check complete_domain_des
> >>       */
> >>      sync_local_execstate();
> >>  
> >> -    for ( i = d->max_vcpus - 1; i >= 0; i-- )
> >> +    for ( i = d->max_vcpus; i-- > 0; )
> > 
> > Is there any reason we need to do those loops backwards?
> > 
> > I would rather do:
> > 
> > for ( i = 0; i < d->max_vcpus; i++ )
> > 
> > ?
> 
> I think it's better to keep like this. The latter of the loops would better
> clear d->vcpu[i] (to not leave a dangling pointer), and there may be code
> which assumes that for ordinary domains d->vcpu[] is populated contiguously.
> Hardly any code should touch the vCPU-s of a domain destructed this far, but
> still better safe than sorry, I guess.

Yes, you are right.  sched_destroy_vcpu() relies on this specific
top-down calling.

Since you are adjusting the code anyway, it might be worth writing
down that some functions (like sched_destroy_vcpu()) expect to be
called with a top-down order of vCPUs.

For the change itself:

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] domain: use unsigned loop induction variable in complete_domain_destroy()
  2026-03-04 17:36     ` Roger Pau Monné
@ 2026-03-05  8:07       ` Jan Beulich
  2026-03-05  8:15         ` Roger Pau Monné
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2026-03-05  8:07 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Julien Grall,
	Stefano Stabellini, Anthony PERARD, Michal Orzel

On 04.03.2026 18:36, Roger Pau Monné wrote:
> On Wed, Mar 04, 2026 at 04:48:07PM +0100, Jan Beulich wrote:
>> On 04.03.2026 16:38, Roger Pau Monné wrote:
>>> On Thu, Feb 26, 2026 at 10:01:45AM +0100, Jan Beulich wrote:
>>>> --- a/xen/common/domain.c
>>>> +++ b/xen/common/domain.c
>>>> @@ -1475,7 +1475,7 @@ static void cf_check complete_domain_des
>>>>  {
>>>>      struct domain *d = container_of(head, struct domain, rcu);
>>>>      struct vcpu *v;
>>>> -    int i;
>>>> +    unsigned int i;
>>>>  
>>>>      /*
>>>>       * Flush all state for the vCPU previously having run on the current CPU.
>>>> @@ -1485,7 +1485,7 @@ static void cf_check complete_domain_des
>>>>       */
>>>>      sync_local_execstate();
>>>>  
>>>> -    for ( i = d->max_vcpus - 1; i >= 0; i-- )
>>>> +    for ( i = d->max_vcpus; i-- > 0; )
>>>
>>> Is there any reason we need to do those loops backwards?
>>>
>>> I would rather do:
>>>
>>> for ( i = 0; i < d->max_vcpus; i++ )
>>>
>>> ?
>>
>> I think it's better to keep like this. The latter of the loops would better
>> clear d->vcpu[i] (to not leave a dangling pointer), and there may be code
>> which assumes that for ordinary domains d->vcpu[] is populated contiguously.
>> Hardly any code should touch the vCPU-s of a domain destructed this far, but
>> still better safe than sorry, I guess.
> 
> Yes, you are right.  sched_destroy_vcpu() relies on this specific
> top-down calling.
> 
> Since you are adjusting the code anyway, it might be worth writing
> down that some functions (like sched_destroy_vcpu()) expect to be
> called with a top-down order of vCPUs.

I've added

    /*
     * Iterating downwards is a requirement here, as e.g. sched_destroy_vcpu()
     * relies on this.
     */

ahead of the first of the two loops.

> For the change itself:
> 
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

Jan


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] domain: use unsigned loop induction variable in complete_domain_destroy()
  2026-03-05  8:07       ` Jan Beulich
@ 2026-03-05  8:15         ` Roger Pau Monné
  0 siblings, 0 replies; 6+ messages in thread
From: Roger Pau Monné @ 2026-03-05  8:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Julien Grall,
	Stefano Stabellini, Anthony PERARD, Michal Orzel

On Thu, Mar 05, 2026 at 09:07:55AM +0100, Jan Beulich wrote:
> On 04.03.2026 18:36, Roger Pau Monné wrote:
> > On Wed, Mar 04, 2026 at 04:48:07PM +0100, Jan Beulich wrote:
> >> On 04.03.2026 16:38, Roger Pau Monné wrote:
> >>> On Thu, Feb 26, 2026 at 10:01:45AM +0100, Jan Beulich wrote:
> >>>> --- a/xen/common/domain.c
> >>>> +++ b/xen/common/domain.c
> >>>> @@ -1475,7 +1475,7 @@ static void cf_check complete_domain_des
> >>>>  {
> >>>>      struct domain *d = container_of(head, struct domain, rcu);
> >>>>      struct vcpu *v;
> >>>> -    int i;
> >>>> +    unsigned int i;
> >>>>  
> >>>>      /*
> >>>>       * Flush all state for the vCPU previously having run on the current CPU.
> >>>> @@ -1485,7 +1485,7 @@ static void cf_check complete_domain_des
> >>>>       */
> >>>>      sync_local_execstate();
> >>>>  
> >>>> -    for ( i = d->max_vcpus - 1; i >= 0; i-- )
> >>>> +    for ( i = d->max_vcpus; i-- > 0; )
> >>>
> >>> Is there any reason we need to do those loops backwards?
> >>>
> >>> I would rather do:
> >>>
> >>> for ( i = 0; i < d->max_vcpus; i++ )
> >>>
> >>> ?
> >>
> >> I think it's better to keep like this. The latter of the loops would better
> >> clear d->vcpu[i] (to not leave a dangling pointer), and there may be code
> >> which assumes that for ordinary domains d->vcpu[] is populated contiguously.
> >> Hardly any code should touch the vCPU-s of a domain destructed this far, but
> >> still better safe than sorry, I guess.
> > 
> > Yes, you are right.  sched_destroy_vcpu() relies on this specific
> > top-down calling.
> > 
> > Since you are adjusting the code anyway, it might be worth writing
> > down that some functions (like sched_destroy_vcpu()) expect to be
> > called with a top-down order of vCPUs.
> 
> I've added
> 
>     /*
>      * Iterating downwards is a requirement here, as e.g. sched_destroy_vcpu()
>      * relies on this.
>      */
> 
> ahead of the first of the two loops.

Thank you for adjusting that.

Roger.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-03-05  8:16 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-26  9:01 [PATCH] domain: use unsigned loop induction variable in complete_domain_destroy() Jan Beulich
2026-03-04 15:38 ` Roger Pau Monné
2026-03-04 15:48   ` Jan Beulich
2026-03-04 17:36     ` Roger Pau Monné
2026-03-05  8:07       ` Jan Beulich
2026-03-05  8:15         ` Roger Pau Monné

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.