* [rfc, PATCH v1 0/2] overflow: Convert size_add() to take variadic arguments
From: Andy Shevchenko @ 2026-06-17 11:12 UTC
To: Johannes Berg, linux-hardening, linux-kernel, linux-wireless
Cc: Kees Cook, Gustavo A. R. Silva, Johannes Berg, Andy Shevchenko

This is an RFC!

We already have users that want to add sizes of up to 5 arguments, and I know
of at least one that also wants 3 or 4. This is a brave move to make size_add()
take variadic arguments. The second patch is an example of use.

The implementation deliberately includes a case with a single argument. In the
future it might be extended to take an array as an argument, something like

	int sizes[21];

	size_add(sizes);

where the first element is the number of entries in the array (the same format
as used in the get_options() call), or other possible variants. This can be
distinguished by _Generic(). But it may be dropped, in which case we would
always require two arguments at minimum.

The RFC is just to collect opinions and perception.

Note that array3*(), min3()/max3() and the like can also use a similar approach.

Andy Shevchenko (2):
  overflow: Allow to sum a few arguments at once
  wifi: nl80211: Call size_add() only once

 include/linux/overflow.h | 37 ++++++++++++++++++++++++++-----------
 net/wireless/nl80211.c   | 11 ++++-------
 2 files changed, 30 insertions(+), 18 deletions(-)

-- 
2.50.1
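For illustration only, the array form mentioned in the cover letter could look roughly like the userspace sketch below. It is not part of the series: the names sat_add() and sat_sum_counted() are hypothetical, the array uses size_t rather than the int shown above, and the saturating add relies on the GCC/Clang __builtin_add_overflow() builtin instead of the kernel helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating add: returns SIZE_MAX if a + b does not fit in size_t. */
static inline size_t sat_add(size_t a, size_t b)
{
	size_t sum;

	if (__builtin_add_overflow(a, b, &sum))
		return SIZE_MAX;
	return sum;
}

/*
 * Hypothetical array form: sizes[0] holds the number of entries that
 * follow it (the layout the cover letter attributes to get_options()),
 * and sizes[1..sizes[0]] are the values to add up.
 */
static inline size_t sat_sum_counted(const size_t *sizes)
{
	size_t total = 0;

	for (size_t i = 1; i <= sizes[0]; i++)
		total = sat_add(total, sizes[i]);
	return total;
}

/* Small self-check of the sketch. */
void sat_sum_counted_selfcheck(void)
{
	const size_t three[] = { 3, 10, 20, 30 };

	assert(sat_sum_counted(three) == 60);
}
```

An empty list (sizes[0] == 0) sums to 0 in this sketch, and any overflow along the way pins the result at SIZE_MAX.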
* [PATCH v1 1/2] overflow: Allow to sum a few arguments at once
From: Andy Shevchenko @ 2026-06-17 11:12 UTC
To: Johannes Berg, linux-hardening, linux-kernel, linux-wireless
Cc: Kees Cook, Gustavo A. R. Silva, Johannes Berg, Andy Shevchenko

Convert size_add() to take variadic argument, so we can simplify users
with using a macro only once.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 include/linux/overflow.h | 37 ++++++++++++++++++++++++++-----------
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/include/linux/overflow.h b/include/linux/overflow.h
index a8cb6319b4fb..a8b0325e73f3 100644
--- a/include/linux/overflow.h
+++ b/include/linux/overflow.h
@@ -2,9 +2,10 @@
 #ifndef __LINUX_OVERFLOW_H
 #define __LINUX_OVERFLOW_H
 
+#include <linux/args.h>
 #include <linux/compiler.h>
-#include <linux/limits.h>
 #include <linux/const.h>
+#include <linux/limits.h>
 
 /*
  * We need to compute the minimum and maximum values representable in a given
@@ -337,16 +338,7 @@ static __always_inline size_t __must_check size_mul(size_t factor1, size_t facto
 	return bytes;
 }
 
-/**
- * size_add() - Calculate size_t addition with saturation at SIZE_MAX
- * @addend1: first addend
- * @addend2: second addend
- *
- * Returns: calculate @addend1 + @addend2, both promoted to size_t,
- * with any overflow causing the return value to be SIZE_MAX. The
- * lvalue must be size_t to avoid implicit type conversion.
- */
-static __always_inline size_t __must_check size_add(size_t addend1, size_t addend2)
+static __always_inline size_t __must_check __size_add(size_t addend1, size_t addend2)
 {
 	size_t bytes;
 
@@ -356,6 +348,29 @@ static __always_inline size_t __must_check size_add(size_t addend1, size_t adden
 	return bytes;
 }
 
+#define __size_add0(addend1, ...)						\
+	__size_add(addend1, 0)
+#define __size_add1(addend1, addend2, ...)					\
+	__size_add(addend1, addend2)
+#define __size_add2(addend1, addend2, addend3, ...)				\
+	__size_add(__size_add(addend1, addend2), addend3)
+#define __size_add3(addend1, addend2, addend3, addend4, ...)			\
+	__size_add(__size_add2(addend1, addend2, addend3), addend4)
+#define __size_add4(addend1, addend2, addend3, addend4, addend5, ...)		\
+	__size_add(__size_add3(addend1, addend2, addend3, addend4), addend5)
+
+/**
+ * size_add() - Calculate size_t addition with saturation at SIZE_MAX
+ * @addend1: first addend
+ * @...: more to add (optional)
+ *
+ * Returns: calculate @addend1 + @addend2, both promoted to size_t,
+ * with any overflow causing the return value to be SIZE_MAX. The
+ * lvalue must be size_t to avoid implicit type conversion.
+ */
+#define size_add(addend1, ...)							\
+	CONCATENATE(__size_add, COUNT_ARGS(__VA_ARGS__))(addend1, __VA_ARGS__)
+
 /**
  * size_sub() - Calculate size_t subtraction with saturation at SIZE_MAX
  * @minuend: value to subtract from
-- 
2.50.1
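The dispatch-by-argument-count technique used by the patch can be tried outside the kernel with a minimal sketch like the one below. It is an illustration under assumptions, not the patch itself: sat_add(), CAT(), NARGS() and the sat_sum_N() macros are hypothetical stand-ins for __size_add(), CONCATENATE(), COUNT_ARGS() and __size_addN(), and NARGS() counts all arguments (1 to 5) rather than only the trailing ones, which keeps the sketch in plain ISO C variadic macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating add: returns SIZE_MAX if a + b does not fit in size_t. */
static inline size_t sat_add(size_t a, size_t b)
{
	size_t sum;

	if (__builtin_add_overflow(a, b, &sum))
		return SIZE_MAX;
	return sum;
}

/* Two-level paste so that the count is expanded before ## is applied. */
#define CAT_(a, b)	a##b
#define CAT(a, b)	CAT_(a, b)

/* Number of arguments, 1..5: the arguments push the right digit into slot n. */
#define NTH_ARG(_1, _2, _3, _4, _5, n, ...)	n
#define NARGS(...)	NTH_ARG(__VA_ARGS__, 5, 4, 3, 2, 1, 0)

/* One helper per argument count, chained one addend at a time. */
#define sat_sum_1(a)			sat_add(a, 0)
#define sat_sum_2(a, b)			sat_add(a, b)
#define sat_sum_3(a, b, c)		sat_add(sat_sum_2(a, b), c)
#define sat_sum_4(a, b, c, d)		sat_add(sat_sum_3(a, b, c), d)
#define sat_sum_5(a, b, c, d, e)	sat_add(sat_sum_4(a, b, c, d), e)

/* Pick the helper whose suffix matches the number of arguments. */
#define sat_sum(...)	CAT(sat_sum_, NARGS(__VA_ARGS__))(__VA_ARGS__)

/* Small self-check of the sketch. */
void sat_sum_selfcheck(void)
{
	assert(sat_sum(7) == 7);
	assert(sat_sum(1, 2) == 3);
	assert(sat_sum(1, 2, 3, 4, 5) == 15);
	assert(sat_sum(SIZE_MAX, 1, 2) == SIZE_MAX);
}
```

Passing more than five arguments to this sketch fails at preprocessing or compile time, because no matching sat_sum_N() helper exists.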
* Re: [PATCH v1 1/2] overflow: Allow to sum a few arguments at once
From: Johannes Berg @ 2026-06-17 12:56 UTC
To: Andy Shevchenko, linux-hardening, linux-kernel, linux-wireless
Cc: Kees Cook, Gustavo A. R. Silva

On Wed, 2026-06-17 at 13:12 +0200, Andy Shevchenko wrote:
> Convert size_add() to take variadic argument, so we can simplify users
> with using a macro only once.

> +#define __size_add3(addend1, addend2, addend3, addend4, ...)			\
> +	__size_add(__size_add2(addend1, addend2, addend3), addend4)
> +#define __size_add4(addend1, addend2, addend3, addend4, addend5, ...)		\
> +	__size_add(__size_add3(addend1, addend2, addend3, addend4), addend5)

I guess it's not going to really matter, but it would generate fewer
calls to have something more like

#define __size_add3(a1, a2, a3, a4) \
	size_add(size_add(a1, a2), size_add(a3, a4))
#define __size_add4(a1, a2, a3, a4, a5) \
	size_add(size_add(a1, a2), size_add(a3, a4, a5))

as a binary tree, rather than only cutting one off every time. Not sure
that results in hugely different code though - maybe fewer overflow
checks?

Although your version make it really completely equivalent to the
nl80211.c code, clearly it doesn't matter if all the values are "good",
and I believe the overflow behaviour means it doesn't matter for the
overflow case either?

johannes
* Re: [PATCH v1 1/2] overflow: Allow to sum a few arguments at once
From: David Laight @ 2026-06-17 21:30 UTC
To: Johannes Berg
Cc: Andy Shevchenko, linux-hardening, linux-kernel, linux-wireless, Kees Cook, Gustavo A. R. Silva

On Wed, 17 Jun 2026 14:56:09 +0200
Johannes Berg <johannes@sipsolutions.net> wrote:

> On Wed, 2026-06-17 at 13:12 +0200, Andy Shevchenko wrote:
> > Convert size_add() to take variadic argument, so we can simplify users
> > with using a macro only once.
> 
> > +#define __size_add3(addend1, addend2, addend3, addend4, ...)			\
> > +	__size_add(__size_add2(addend1, addend2, addend3), addend4)
> > +#define __size_add4(addend1, addend2, addend3, addend4, addend5, ...)		\
> > +	__size_add(__size_add3(addend1, addend2, addend3, addend4), addend5)
> 
> I guess it's not going to really matter, but it would generate fewer
> calls to have something more like
> 
> #define __size_add3(a1, a2, a3, a4) \
> 	size_add(size_add(a1, a2), size_add(a3, a4))
> #define __size_add4(a1, a2, a3, a4, a5) \
> 	size_add(size_add(a1, a2), size_add(a3, a4, a5))
> 
> as a binary tree, rather than only cutting one off every time. Not sure
> that results in hugely different code though - maybe fewer overflow
> checks?

The binary tree stands a chance of executing less slowly because the leaf
adds can be executed in parallel.
Excluding the saturation checks (wtf is it called size_add() not
saturating_add() ?) (a + b) + (c + d) will usually execute faster than
((a + b) + c) + d because the (a + b) and (c + d) can execute at the
same time; unfortunately gcc will always generate the latter.

	David

> 
> Although your version make it really completely equivalent to the
> nl80211.c code, clearly it doesn't matter if all the values are "good",
> and I believe the overflow behaviour means it doesn't matter for the
> overflow case either?
> 
> johannes
> 
* Re: [PATCH v1 1/2] overflow: Allow to sum a few arguments at once
From: Andy Shevchenko @ 2026-06-18 6:39 UTC
To: David Laight
Cc: Johannes Berg, linux-hardening, linux-kernel, linux-wireless, Kees Cook, Gustavo A. R. Silva

On Wed, Jun 17, 2026 at 10:30:56PM +0100, David Laight wrote:
> On Wed, 17 Jun 2026 14:56:09 +0200
> Johannes Berg <johannes@sipsolutions.net> wrote:
> > On Wed, 2026-06-17 at 13:12 +0200, Andy Shevchenko wrote:
> > > Convert size_add() to take variadic argument, so we can simplify users
> > > with using a macro only once.
> > 
> > > +#define __size_add3(addend1, addend2, addend3, addend4, ...)			\
> > > +	__size_add(__size_add2(addend1, addend2, addend3), addend4)
> > > +#define __size_add4(addend1, addend2, addend3, addend4, addend5, ...)	\
> > > +	__size_add(__size_add3(addend1, addend2, addend3, addend4), addend5)
> > 
> > I guess it's not going to really matter, but it would generate fewer
> > calls to have something more like
> > 
> > #define __size_add3(a1, a2, a3, a4) \
> > 	size_add(size_add(a1, a2), size_add(a3, a4))
> > #define __size_add4(a1, a2, a3, a4, a5) \
> > 	size_add(size_add(a1, a2), size_add(a3, a4, a5))
> > 
> > as a binary tree, rather than only cutting one off every time. Not sure
> > that results in hugely different code though - maybe fewer overflow
> > checks?

Good question. I'm also thinking that one-by-one may expand in too much of
preprocessor code (haven't checked myself).

> The binary tree stands a chance of executing less slowly because the leaf
> adds can be executed in parallel.
> Excluding the saturation checks (wtf is it called size_add() not
> saturating_add() ?) (a + b) + (c + d) will usually execute faster than
> ((a + b) + c) + d because the (a + b) and (c + d) can execute at the
> same time; unfortunately gcc will always generate the latter.

I'm confused. "unfortunately... the latter"? You meant "the former"?

> > Although your version make it really completely equivalent to the
> > nl80211.c code, clearly it doesn't matter if all the values are "good",
> > and I believe the overflow behaviour means it doesn't matter for the
> > overflow case either?

Indeed. Whenever the value is saturated, the rest is just matter of sequential
unlikely branches taken.

-- 
With Best Regards,
Andy Shevchenko
* Re: [PATCH v1 1/2] overflow: Allow to sum a few arguments at once
From: Johannes Berg @ 2026-06-18 18:53 UTC
To: Andy Shevchenko, David Laight
Cc: linux-hardening, linux-kernel, linux-wireless, Kees Cook, Gustavo A. R. Silva

(hah, just found this window open from this morning ...)

On Thu, 2026-06-18 at 09:39 +0300, Andy Shevchenko wrote:
> On Wed, Jun 17, 2026 at 10:30:56PM +0100, David Laight wrote:
> > On Wed, 17 Jun 2026 14:56:09 +0200
> > Johannes Berg <johannes@sipsolutions.net> wrote:
> > > On Wed, 2026-06-17 at 13:12 +0200, Andy Shevchenko wrote:
> > > > Convert size_add() to take variadic argument, so we can simplify users
> > > > with using a macro only once.
> > > 
> > > > +#define __size_add3(addend1, addend2, addend3, addend4, ...)		\
> > > > +	__size_add(__size_add2(addend1, addend2, addend3), addend4)
> > > > +#define __size_add4(addend1, addend2, addend3, addend4, addend5, ...)	\
> > > > +	__size_add(__size_add3(addend1, addend2, addend3, addend4), addend5)
> > > 
> > > I guess it's not going to really matter, but it would generate fewer
> > > calls to have something more like
> > > 
> > > #define __size_add3(a1, a2, a3, a4) \
> > > 	size_add(size_add(a1, a2), size_add(a3, a4))
> > > #define __size_add4(a1, a2, a3, a4, a5) \
> > > 	size_add(size_add(a1, a2), size_add(a3, a4, a5))
> > > 
> > > as a binary tree, rather than only cutting one off every time. Not sure
> > > that results in hugely different code though - maybe fewer overflow
> > > checks?
> 
> Good question. I'm also thinking that one-by-one may expand in too much of
> preprocessor code (haven't checked myself).

No. I was confused, and managed to confuse you too perhaps, sorry!

We have to have the same number of operations (__size_add calls)
regardless, since you have to add it all up: 1 + 2 + 3 + 4 + 5 has a
fixed number of + signs regardless of how you parenthesise it.

I guess actual CPU execution would have a better data dependency tree if
we balance it, but ... if our hotpath depends on size_add() we've lost
already.

johannes
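The point above, that both groupings need the same number of additions and agree on the result even when something overflows, can be checked with a small userspace sketch. The helper names sat_add(), sum4_chain() and sum4_tree() are hypothetical and stand in for __size_add() and the two macro shapes discussed in the thread; the saturating add assumes the GCC/Clang __builtin_add_overflow() builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating add: returns SIZE_MAX if a + b does not fit in size_t. */
static inline size_t sat_add(size_t a, size_t b)
{
	size_t sum;

	if (__builtin_add_overflow(a, b, &sum))
		return SIZE_MAX;
	return sum;
}

/* Linear chain, the shape of the posted macros: ((a + b) + c) + d */
static inline size_t sum4_chain(size_t a, size_t b, size_t c, size_t d)
{
	return sat_add(sat_add(sat_add(a, b), c), d);
}

/* Balanced tree, the shape suggested in review: (a + b) + (c + d) */
static inline size_t sum4_tree(size_t a, size_t b, size_t c, size_t d)
{
	return sat_add(sat_add(a, b), sat_add(c, d));
}

/* Both shapes use exactly three sat_add() calls and agree on the result. */
void sum4_selfcheck(void)
{
	assert(sum4_chain(1, 2, 3, 4) == sum4_tree(1, 2, 3, 4));
	assert(sum4_chain(1, 2, SIZE_MAX, 3) == sum4_tree(1, 2, SIZE_MAX, 3));
}
```

The results match because a saturating add of unsigned values behaves like min(a + b, SIZE_MAX) over the unbounded integers, and that operation is associative; only the data dependencies between the three additions differ.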
* Re: [PATCH v1 1/2] overflow: Allow to sum a few arguments at once
From: David Laight @ 2026-06-18 21:36 UTC
To: Johannes Berg
Cc: Andy Shevchenko, linux-hardening, linux-kernel, linux-wireless, Kees Cook, Gustavo A. R. Silva

On Thu, 18 Jun 2026 20:53:37 +0200
Johannes Berg <johannes@sipsolutions.net> wrote:

> (hah, just found this window open from this morning ...)
> 
> On Thu, 2026-06-18 at 09:39 +0300, Andy Shevchenko wrote:
> > On Wed, Jun 17, 2026 at 10:30:56PM +0100, David Laight wrote:
> > > On Wed, 17 Jun 2026 14:56:09 +0200
> > > Johannes Berg <johannes@sipsolutions.net> wrote:
> > > > On Wed, 2026-06-17 at 13:12 +0200, Andy Shevchenko wrote:
> > > > > Convert size_add() to take variadic argument, so we can simplify users
> > > > > with using a macro only once.
> > > > 
> > > > > +#define __size_add3(addend1, addend2, addend3, addend4, ...)		\
> > > > > +	__size_add(__size_add2(addend1, addend2, addend3), addend4)
> > > > > +#define __size_add4(addend1, addend2, addend3, addend4, addend5, ...)	\
> > > > > +	__size_add(__size_add3(addend1, addend2, addend3, addend4), addend5)
> > > > 
> > > > I guess it's not going to really matter, but it would generate fewer
> > > > calls to have something more like
> > > > 
> > > > #define __size_add3(a1, a2, a3, a4) \
> > > > 	size_add(size_add(a1, a2), size_add(a3, a4))
> > > > #define __size_add4(a1, a2, a3, a4, a5) \
> > > > 	size_add(size_add(a1, a2), size_add(a3, a4, a5))
> > > > 
> > > > as a binary tree, rather than only cutting one off every time. Not sure
> > > > that results in hugely different code though - maybe fewer overflow
> > > > checks?
> > 
> > Good question. I'm also thinking that one-by-one may expand in too much of
> > preprocessor code (haven't checked myself).
> 
> No. I was confused, and managed to confuse you too perhaps, sorry!
> 
> We have to have the same number of operations (__size_add calls)
> regardless, since you have to add it all up: 1 + 2 + 3 + 4 + 5 has a
> fixed number of + signs regardless of how you parenthesise it.
> 
> I guess actual CPU execution would have a better data dependency tree if
> we balance it,

Absolutely.
Intel Haswell onwards and zen1-4 can execute 4 independent add/sub/and/or (etc)
every clock.
zen5 wins with 6 arithmetic ops or 4 cmov (and 2 alu) per clock.

> but ... if our hotpath depends on size_add() we've lost already.

I've no idea what the compiler generates, but a cmovc to copy in ~0 when
the add sets carry stands a good chance of being pretty near the best.
What you don't want is a conditional jump.
The add, cmov pair will take two clocks, but the pairs are independent of
each other (the carry flag isn't a limitation).
The cpu should be able to execute two add and two cmov every clock.
So with 4 values the 'tree' version is 4 clocks.

The other problem with ((a + b) + c) + d is that execution can't start
until both a and b are available; with (a + b) + (c + d) it is much more
likely that one of the adds can be executed early.

Trying to guess the performance of modern cpu is non-trivial.

	David

> 
> johannes
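As a rough illustration of the "no conditional jump" idea above, a saturating add can be written in C without an explicit branch. This is only a sketch under assumptions: sat_add_branchless() and sum4_tree_branchless() are hypothetical names, and whether a given compiler turns this into a carry-based cmov, an sbb/or pair, or a branch after all is up to the compiler and has not been checked here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Branch-free saturating add in plain C: unsigned addition wraps on
 * overflow, and a wrapped sum is always smaller than either operand.
 */
static inline size_t sat_add_branchless(size_t a, size_t b)
{
	size_t sum = a + b;		/* wraps modulo SIZE_MAX + 1 */
	size_t carry = sum < a;		/* 1 if the addition wrapped, else 0 */

	return sum | ((size_t)0 - carry);	/* OR in all-ones on overflow */
}

/* Tree-shaped sum of four values: the two leaf adds are independent. */
static inline size_t sum4_tree_branchless(size_t a, size_t b,
					  size_t c, size_t d)
{
	return sat_add_branchless(sat_add_branchless(a, b),
				  sat_add_branchless(c, d));
}

/* Small self-check of the sketch. */
void sat_add_branchless_selfcheck(void)
{
	assert(sat_add_branchless(1, 2) == 3);
	assert(sat_add_branchless(SIZE_MAX, 1) == SIZE_MAX);
	assert(sum4_tree_branchless(1, 2, 3, 4) == 10);
}
```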
* [PATCH v1 2/2] wifi: nl80211: Call size_add() only once
From: Andy Shevchenko @ 2026-06-17 11:12 UTC
To: Johannes Berg, linux-hardening, linux-kernel, linux-wireless
Cc: Kees Cook, Gustavo A. R. Silva, Johannes Berg, Andy Shevchenko

Since size_add() may take a few arguments at once, call it only once.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 net/wireless/nl80211.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 53b4b3f76697..98f92c268944 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -11560,13 +11560,10 @@ nl80211_parse_sched_scan(struct wiphy *wiphy, struct wireless_dev *wdev,
 			       attrs[NL80211_ATTR_SCHED_SCAN_RSSI_ADJUST]))
 		return ERR_PTR(-EINVAL);
 
-	size = struct_size(request, channels, n_channels);
-	size = size_add(size, array_size(sizeof(*request->ssids), n_ssids));
-	size = size_add(size, array_size(sizeof(*request->match_sets),
-					 n_match_sets));
-	size = size_add(size, array_size(sizeof(*request->scan_plans),
-					 n_plans));
-	size = size_add(size, ie_len);
+	size = size_add(struct_size(request, channels, n_channels), ie_len,
+			array_size(sizeof(*request->ssids), n_ssids),
+			array_size(sizeof(*request->match_sets), n_match_sets),
+			array_size(sizeof(*request->scan_plans), n_plans));
 	request = kzalloc(size, GFP_KERNEL);
 	if (!request)
 		return ERR_PTR(-ENOMEM);
-- 
2.50.1
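The sizing pattern in the hunk above (a header with a flexible array, plus further arrays and a blob carved out of the same allocation) can be sketched in userspace as follows. Everything here is hypothetical and simplified: struct demo_request, struct demo_ssid, sat_add(), sat_mul() and the demo_request_*() helpers are illustration-only stand-ins for the nl80211 structures and for size_add(), array_size() and struct_size(), and the explicit SIZE_MAX check replaces relying on the allocator to reject an impossible size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Saturating add and multiply: SIZE_MAX on overflow (GCC/Clang builtins). */
static inline size_t sat_add(size_t a, size_t b)
{
	size_t r;

	return __builtin_add_overflow(a, b, &r) ? SIZE_MAX : r;
}

static inline size_t sat_mul(size_t a, size_t b)
{
	size_t r;

	return __builtin_mul_overflow(a, b, &r) ? SIZE_MAX : r;
}

/* Hypothetical layout: header, flexible array, then trailing data. */
struct demo_ssid {
	unsigned char ssid[32];
	unsigned char len;
};

struct demo_request {
	size_t n_channels;
	size_t n_ssids;
	size_t ie_len;
	unsigned int channels[];
};

/* Total bytes needed; pinned at SIZE_MAX if any step overflows. */
static inline size_t demo_request_size(size_t n_channels, size_t n_ssids,
				       size_t ie_len)
{
	size_t size = sizeof(struct demo_request);

	size = sat_add(size, sat_mul(sizeof(unsigned int), n_channels));
	size = sat_add(size, sat_mul(sizeof(struct demo_ssid), n_ssids));
	return sat_add(size, ie_len);
}

/* Returns NULL when the size saturated or the allocation failed. */
static inline struct demo_request *demo_request_alloc(size_t n_channels,
						       size_t n_ssids,
						       size_t ie_len)
{
	size_t size = demo_request_size(n_channels, n_ssids, ie_len);

	if (size == SIZE_MAX)
		return NULL;
	return calloc(1, size);
}

/* Small self-check: an overflowing count must not yield an allocation. */
void demo_request_selfcheck(void)
{
	assert(demo_request_alloc(SIZE_MAX, 1, 0) == NULL);
}
```

Because every intermediate result saturates instead of wrapping, a too-large count cannot produce a small, undersized allocation; the worst case is a NULL return.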