All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yury Norov <yury.norov@gmail.com>
To: I Hsin Cheng <richard120310@gmail.com>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>,
	linux@rasmusvillemoes.dk, jserv@ccns.ncku.edu.tw,
	mark.rutland@arm.com, linux-kernel@vger.kernel.org,
	eleanor15x@gmail.com
Subject: Re: [PATCH] cpumask: Optimize cpumask_any_but()
Date: Fri, 17 Jan 2025 11:32:54 -0500	[thread overview]
Message-ID: <Z4qGNgu_HYp5LK6D@thinkpad> (raw)
In-Reply-To: <Z4pwUxE23JEG5flR@vaxr-BM6660-BM6360>

On Fri, Jan 17, 2025 at 10:59:31PM +0800, I Hsin Cheng wrote:
> On Fri, Jan 17, 2025 at 10:26:58PM +0800, Kuan-Wei Chiu wrote:
> > The cpumask_any_but() function can avoid using a loop to determine the
> > CPU index to return. If the first set bit in the cpumask is not equal
> > to the specified CPU, we can directly return the index of the first set
> > bit. Otherwise, we return the next set bit's index.
> > 
> > This optimization replaces the loop with a single if statement,
> > allowing the compiler to generate more concise and efficient code.

I thought compilers are smart enough to unroll loop in this case. Can
you show disassembled code before and after?

> > 
> > As a result, the size of the bzImage built with x86 defconfig is
> > reduced by 4096 bytes:
> > 
> > * Before:
> > $ size arch/x86/boot/bzImage
> >    text    data     bss     dec     hex filename
> > 13537280           1024       0 13538304         ce9400 arch/x86/boot/bzImage
> > 
> > * After:
> > $ size arch/x86/boot/bzImage
> >    text    data     bss     dec     hex filename
> > 13533184           1024       0 13534208         ce8400 arch/x86/boot/bzImage

Comparing zipped images tells little about code generation. Please use
scripts/bloat-o-meter.

> > 
> > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
> > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
> > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> > ---
> > Not sure how to measure the efficiency difference, but I guess this
> > patch might be slightly more efficient or nearly the same as before. If
> > you have any good ideas for measuring efficiency, please let me know!

Check lib/find_bit_benchmark.c

> > 
> >  include/linux/cpumask.h | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index 9278a50d514f..b769fcdbaa10 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -404,10 +404,10 @@ unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> >  	unsigned int i;
> >  
> >  	cpumask_check(cpu);
> > -	for_each_cpu(i, mask)
> > -		if (i != cpu)
> > -			break;
> > -	return i;
> > +	i = find_first_bit(cpumask_bits(mask), small_cpumask_bits);
> 
> Hi Kuan-Wei,
> 
> How about using cpumask_first(mask) here to keep better consistency?

I would do it the other way: introduce find_first_but_bit(), and then
make cpumask_any_but() a wrapper around it. Doing this you'll be able
to leverage find_bit_benchmark infrastructure to measure performance
difference, if any.
 
> > +	if (i != cpu)
> > +		return i;
> Wouldn't it benefit abit to check "i >= nr_cpu_ids" prior to
> find_next_bit() ?

Yes it would.

Thanks,
Yury

> if "i >= nr_cpu_ids" holds it would be a waste to
> perform find_next_bit().
> 
> > +	return find_next_bit(cpumask_bits(mask), small_cpumask_bits, i + 1);
> >  }
> >  
> 
> Regards,
> I Hsin
> 
> >  /**
> > -- 
> > 2.34.1
> > 

  parent reply	other threads:[~2025-01-17 16:32 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-17 14:26 [PATCH] cpumask: Optimize cpumask_any_but() Kuan-Wei Chiu
2025-01-17 14:59 ` I Hsin Cheng
2025-01-17 16:32   ` Kuan-Wei Chiu
2025-01-17 16:32   ` Yury Norov [this message]
2025-01-18  7:32     ` Kuan-Wei Chiu
2025-01-23 22:39       ` Yury Norov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z4qGNgu_HYp5LK6D@thinkpad \
    --to=yury.norov@gmail.com \
    --cc=eleanor15x@gmail.com \
    --cc=jserv@ccns.ncku.edu.tw \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=mark.rutland@arm.com \
    --cc=richard120310@gmail.com \
    --cc=visitorckw@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.