public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Yury Norov <yury.norov@gmail.com>
To: I Hsin Cheng <richard120310@gmail.com>
Cc: anshuman.khandual@arm.com, arnd@arndb.de,
	linux-kernel@vger.kernel.org, jserv@ccns.ncku.edu.tw,
	skhan@linuxfoundation.org, Matthias Kaehlcke <mka@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 1/2] uapi: Refactor __GENMASK() for speed-up
Date: Fri, 14 Feb 2025 10:55:02 -0500	[thread overview]
Message-ID: <Z69nTow0iaZIeV85@thinkpad> (raw)
In-Reply-To: <Z69YYeQB5E5Mi3Jf@vaxr-BM6660-BM6360>

> No problem ! I'll be happy to help with these tests, I'll send the next
> iteration when I complete the things you mentioned.
> 
> > Is this for_each_cpu() thing the only case - I don't know, but it's
> > enough to consider reverting back to the original version.
> 
> So basically the reason of sizing up is because for_each_cpu() put
> non-constant parameter in GENMASK(), which is supposed to be used by
> constant only?

The (l - h) is supposed to be a const_true. In most cases both l and h
are compile-time constants. In that case GENMASK() is evaluated at
compile time, and it allows to complicate it to workaround clang
warning.

But now we obviously have an example where workarounding sacrifices
code generation. This is not OK.
 
> > Can you also please attach one or two examples of code generation for
> > real functions, not an artificial one as you did before. And maybe a
> > link to goldbolt?
> 
> I have no problem of other tests but this one, I wrote a simple
> artificial use case because most functions I found according to the
> report generated by bloat-o-meter, doesn't use GENMASK() directly or
> they're super long and GENMASK() only accounts for very small part of
> them, it wasn't very trivial to sense the difference of disassembly at
> least for me. Should I just pick 1~2 random use cases? or do you have
> any suggestions?

In bloat-o-meter output, pick some small function. The function size
is listed in 'old' and correspondingly 'new' columns. To begin with,
pick some with small difference. The cpuusage_read is one good example:

        cpuusage_read                     111     109      -2

It boils down to __cpuusage_read(), which is:

  static u64 __cpuusage_read(struct cgroup_subsys_state *css,
                             enum cpuacct_stat_index index)
  {
          struct cpuacct *ca = css_ca(css);
          u64 totalcpuusage = 0;
          int i;
  
          for_each_possible_cpu(i)
                  totalcpuusage += cpuacct_cpuusage_read(ca, i, index);
  
          return totalcpuusage;
  }

Now disassemble it like:

objdump --disassemble=cpuusage_read ../build-linux-x86_64/vmlinux.orig > cpuusage_read.orig
objdump --disassemble=cpuusage_read ../build-linux-x86_64/vmlinux.new > cpuusage_read.new

And if you put the listings side-by-side, you will see:

Orig                                              New
ja     ffffffff812efdf0 <cpuusage_read+0x60>      ja     ffffffff812efd2e <cpuusage_read+0x5e>
mov    %r8,%rdx                                   mov    %r8,%rdx
shl    %cl,%rdx                                   shl    %cl,%rdx
neg    %rdx                                       and    %rax,%rdx
and    %rax,%rdx                                  je     ffffffff812efd2e <cpuusage_read+0x5e>
je     ffffffff812efdf0 <cpuusage_read+0x60>      tzcnt  %rdx,%rdx
tzcnt  %rdx,%rdx                                

OK, now we see that new GENMASK saves one negation. Great success!

Can you repeat this exercise for few other functions both with
positive and negative impact and make a conclusion about how exactly
code generation is impacted?

> Btw, are you talking about Online Compiler Explorer? I don't really know
> what goldbolt means here, sorry XD .

https://godbolt.org/

The idea is that you can cherry-pick all necessary ifdefery from
kernel headers and build a simple example by using all supported
compilers. The godbolt's list of compilers is really long. You
will not need to install any of them locally to just check how
they work.

I actually cherry-picked GENMASK() recently for the other discussion,
so you can start from that:

https://lore.kernel.org/all/20250206020227.GA322224@yaz-khff2.amd.com/T/#m27bed06e15809d7b632966a1e861767c72a8722a

Thanks,
Yury


  reply	other threads:[~2025-02-14 15:55 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-11 16:24 [PATCH 0/2] uapi: Refactor __GENMASK() and __GENMASK_ULL() for I Hsin Cheng
2025-02-11 16:24 ` [PATCH 1/2] uapi: Refactor __GENMASK() for speed-up I Hsin Cheng
2025-02-11 18:39   ` Yury Norov
2025-02-11 18:44     ` Yury Norov
2025-02-12 14:01       ` I Hsin Cheng
2025-02-13 19:54         ` Yury Norov
2025-02-14 14:51           ` I Hsin Cheng
2025-02-14 15:55             ` Yury Norov [this message]
2025-02-12 13:50     ` I Hsin Cheng
2025-02-11 16:24 ` [PATCH 2/2] uapi: Refactor __GENMASK_ULL() " I Hsin Cheng
2025-02-11 22:30   ` David Laight
2025-02-12 12:39     ` I Hsin Cheng
2025-02-12 13:52       ` David Laight
2025-02-12 14:12       ` Yury Norov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z69nTow0iaZIeV85@thinkpad \
    --to=yury.norov@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=arnd@arndb.de \
    --cc=jserv@ccns.ncku.edu.tw \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mka@chromium.org \
    --cc=richard120310@gmail.com \
    --cc=skhan@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox