From: David Laight <david.laight.linux@gmail.com>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-wireless@vger.kernel.org, Kees Cook <kees@kernel.org>,
"Gustavo A. R. Silva" <gustavoars@kernel.org>
Subject: Re: [PATCH v1 1/2] overflow: Allow to sum a few arguments at once
Date: Thu, 18 Jun 2026 22:36:53 +0100 [thread overview]
Message-ID: <20260618223653.01b42b38@pumpkin> (raw)
In-Reply-To: <37fcf7c0b1a330a40005fc5ddbe075267b93851e.camel@sipsolutions.net>
On Thu, 18 Jun 2026 20:53:37 +0200
Johannes Berg <johannes@sipsolutions.net> wrote:
> (hah, just found this window open from this morning ...)
>
> On Thu, 2026-06-18 at 09:39 +0300, Andy Shevchenko wrote:
> > On Wed, Jun 17, 2026 at 10:30:56PM +0100, David Laight wrote:
> > > On Wed, 17 Jun 2026 14:56:09 +0200
> > > Johannes Berg <johannes@sipsolutions.net> wrote:
> > > > On Wed, 2026-06-17 at 13:12 +0200, Andy Shevchenko wrote:
> > > > > Convert size_add() to take variadic argument, so we can simplify users
> > > > > with using a macro only once.
> > > >
> > > > > +#define __size_add3(addend1, addend2, addend3, addend4, ...) \
> > > > > + __size_add(__size_add2(addend1, addend2, addend3), addend4)
> > > > > +#define __size_add4(addend1, addend2, addend3, addend4, addend5, ...) \
> > > > > + __size_add(__size_add3(addend1, addend2, addend3, addend4), addend5)
> > > >
> > > > I guess it's not going to really matter, but it would generate fewer
> > > > calls to have something more like
> > > >
> > > > #define __size_add3(a1, a2, a3, a4) \
> > > > size_add(size_add(a1, a2), size_add(a3, a4))
> > > > #define __size_add4(a1, a2, a3, a4, a5) \
> > > > size_add(size_add(a1, a2), size_add(a3, a4, a5))
> > > >
> > > > as a binary tree, rather than only cutting one off every time. Not sure
> > > > that results in hugely different code though - maybe fewer overflow
> > > > checks?
> >
> > Good question. I'm also thinking that one-by-one may expand in too much of
> > preprocessor code (haven't checked myself).
>
> No. I was confused, and managed to confuse you too perhaps, sorry!
>
> We have to have the same number of operations (__size_add calls)
> regardless, since you have to add it all up: 1 + 2 + 3 + 4 + 5 has a
> fixed number of + signs regardless of how you parenthesise it.
>
> I guess actual CPU execution would have a better data dependency tree if
> we balance it,
Absolutely.
Intel Haswell onwards and zen1-4 can execute 4 independent add/sub/and/or
(etc) every clock.
zen5 wins with 6 arithmetic ops or 4 cmov (and 2 alu) per clock.
> but ... if our hotpath depends on size_add() we've lost already.
I've no idea what the compiler generates, but a cmovc to copy in ~0
when the add sets carry stands a good chance of being pretty near the best.
What you don't want is a conditional jump.
The add, cmov pair will take two clocks, but the pairs are independent of
each other (the carry flag isn't a limitation).
The cpu should be able to execute two add and two cmov every clock.
So with 4 values the 'tree' version is 4 clocks
The other problem with ((a + b) + c) + d is that execution can't start
until both a and b are available; with (a + b) + (c + d) it is much
more likely that one of the adds can be executed early.
Trying to guess the performance of modern cpu is non-trivial.
David
>
> johannes
next prev parent reply other threads:[~2026-06-18 21:36 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-17 11:12 [rfc, PATCH v1 0/2] overflow: Convert size_add() to take variadic arguments Andy Shevchenko
2026-06-17 11:12 ` [PATCH v1 1/2] overflow: Allow to sum a few arguments at once Andy Shevchenko
2026-06-17 12:56 ` Johannes Berg
2026-06-17 21:30 ` David Laight
2026-06-18 6:39 ` Andy Shevchenko
2026-06-18 18:53 ` Johannes Berg
2026-06-18 21:36 ` David Laight [this message]
2026-06-19 3:47 ` Kees Cook
2026-06-19 6:45 ` Andy Shevchenko
2026-06-17 11:12 ` [PATCH v1 2/2] wifi: nl80211: Call size_add() only once Andy Shevchenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260618223653.01b42b38@pumpkin \
--to=david.laight.linux@gmail.com \
--cc=andriy.shevchenko@linux.intel.com \
--cc=gustavoars@kernel.org \
--cc=johannes@sipsolutions.net \
--cc=kees@kernel.org \
--cc=linux-hardening@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-wireless@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.