* [PATCH next] crypto: qat - replace avg_array() with a better function
@ 2026-02-06 21:09 david.laight.linux
2026-02-07 10:51 ` David Laight
2026-02-24 18:11 ` Giovanni Cabiddu
0 siblings, 2 replies; 4+ messages in thread
From: david.laight.linux @ 2026-02-06 21:09 UTC (permalink / raw)
To: Giovanni Cabiddu, Herbert Xu, David S. Miller,
Suman Kumar Chakraborty, Vijay Sundar Selvamani, George Abraham P,
qat-linux, linux-crypto, linux-kernel
Cc: David Laight
From: David Laight <david.laight.linux@gmail.com>
avg_array() is defined as a 'type independant' #define.
However the algorithm is only valid for unsigned types and the
implementation is only valid for u64.
All the callers pass temporary kmalloc() allocated arrays of u64.
Replace with a function that takes a pointer to a u64 array.
Change the implementation to sum the low and high 32bits of each
value separately and then compute the average.
This will be massively faster as it does two divisions rather than
one for each element.
Also removes some very pointless __unqual_scalar_typeof().
They could be 'auto _x = 0 ? x + 0 : 0;' even if the types weren't fixed.
Only compile tested.
Signed-off-by: David Laight <david.laight.linux@gmail.com>
---
.../intel/qat/qat_common/adf_tl_debugfs.c | 38 ++++++++-----------
1 file changed, 15 insertions(+), 23 deletions(-)
diff --git a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
index b81f70576683..a084437a2631 100644
--- a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
+++ b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
@@ -77,32 +77,24 @@ static int tl_collect_values_u64(struct adf_telemetry *telemetry,
* @len: Number of elements.
*
* This algorithm computes average of an array without running into overflow.
+ * (Provided len is less than 2 << 31.)
*
* Return: average of values.
*/
-#define avg_array(array, len) ( \
-{ \
- typeof(&(array)[0]) _array = (array); \
- __unqual_scalar_typeof(_array[0]) _x = 0; \
- __unqual_scalar_typeof(_array[0]) _y = 0; \
- __unqual_scalar_typeof(_array[0]) _a, _b; \
- typeof(len) _len = (len); \
- size_t _i; \
- \
- for (_i = 0; _i < _len; _i++) { \
- _a = _array[_i]; \
- _b = do_div(_a, _len); \
- _x += _a; \
- if (_y >= _len - _b) { \
- _x++; \
- _y -= _len - _b; \
- } else { \
- _y += _b; \
- } \
- } \
- do_div(_y, _len); \
- (_x + _y); \
-})
+static u64 avg_array(const u64 *array, size_t len)
+{
+ u64 sum_hi = 0, sum_lo = 0;
+ size_t i;
+
+ for (i = 0; i < len; i++) {
+ sum_hi += array[i] >> 32;
+ sum_lo += (u32)array[i];
+ }
+
+ sum_lo += (u64)do_div(sum_hi, len) << 32;
+
+ return (sum_hi << 32) + div_u64(sum_lo, len);
+}
/* Calculation function for simple counter. */
static int tl_calc_count(struct adf_telemetry *telemetry,
--
2.39.5
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH next] crypto: qat - replace avg_array() with a better function
2026-02-06 21:09 [PATCH next] crypto: qat - replace avg_array() with a better function david.laight.linux
@ 2026-02-07 10:51 ` David Laight
2026-02-24 18:11 ` Giovanni Cabiddu
1 sibling, 0 replies; 4+ messages in thread
From: David Laight @ 2026-02-07 10:51 UTC (permalink / raw)
To: Giovanni Cabiddu, Herbert Xu, David S. Miller,
Suman Kumar Chakraborty, Vijay Sundar Selvamani, George Abraham P,
qat-linux, linux-crypto, linux-kernel
Cc: Marco Elver, Will Deacon, Peter Zijlstra
On Fri, 6 Feb 2026 21:09:40 +0000
david.laight.linux@gmail.com wrote:
Cc the people discussing unqual_scalar_typeof() for arm64 LTO READ_ONCE().
> From: David Laight <david.laight.linux@gmail.com>
>
> avg_array() is defined as a 'type independant' #define.
> However the algorithm is only valid for unsigned types and the
> implementation is only valid for u64.
> All the callers pass temporary kmalloc() allocated arrays of u64.
>
> Replace with a function that takes a pointer to a u64 array.
>
> Change the implementation to sum the low and high 32bits of each
> value separately and then compute the average.
> This will be massively faster as it does two divisions rather than
> one for each element.
>
> Also removes some very pointless __unqual_scalar_typeof().
> They could be 'auto _x = 0 ? x + 0 : 0;' even if the types weren't fixed.
>
> Only compile tested.
>
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> .../intel/qat/qat_common/adf_tl_debugfs.c | 38 ++++++++-----------
> 1 file changed, 15 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> index b81f70576683..a084437a2631 100644
> --- a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> +++ b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> @@ -77,32 +77,24 @@ static int tl_collect_values_u64(struct adf_telemetry *telemetry,
> * @len: Number of elements.
> *
> * This algorithm computes average of an array without running into overflow.
> + * (Provided len is less than 2 << 31.)
> *
> * Return: average of values.
> */
> -#define avg_array(array, len) ( \
> -{ \
> - typeof(&(array)[0]) _array = (array); \
> - __unqual_scalar_typeof(_array[0]) _x = 0; \
> - __unqual_scalar_typeof(_array[0]) _y = 0; \
> - __unqual_scalar_typeof(_array[0]) _a, _b; \
> - typeof(len) _len = (len); \
> - size_t _i; \
> - \
> - for (_i = 0; _i < _len; _i++) { \
> - _a = _array[_i]; \
> - _b = do_div(_a, _len); \
> - _x += _a; \
> - if (_y >= _len - _b) { \
> - _x++; \
> - _y -= _len - _b; \
> - } else { \
> - _y += _b; \
> - } \
> - } \
> - do_div(_y, _len); \
> - (_x + _y); \
> -})
> +static u64 avg_array(const u64 *array, size_t len)
> +{
> + u64 sum_hi = 0, sum_lo = 0;
> + size_t i;
> +
> + for (i = 0; i < len; i++) {
> + sum_hi += array[i] >> 32;
> + sum_lo += (u32)array[i];
> + }
> +
> + sum_lo += (u64)do_div(sum_hi, len) << 32;
> +
> + return (sum_hi << 32) + div_u64(sum_lo, len);
> +}
>
> /* Calculation function for simple counter. */
> static int tl_calc_count(struct adf_telemetry *telemetry,
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH next] crypto: qat - replace avg_array() with a better function
2026-02-06 21:09 [PATCH next] crypto: qat - replace avg_array() with a better function david.laight.linux
2026-02-07 10:51 ` David Laight
@ 2026-02-24 18:11 ` Giovanni Cabiddu
2026-02-24 22:01 ` David Laight
1 sibling, 1 reply; 4+ messages in thread
From: Giovanni Cabiddu @ 2026-02-24 18:11 UTC (permalink / raw)
To: david.laight.linux
Cc: Herbert Xu, David S. Miller, Suman Kumar Chakraborty,
Vijay Sundar Selvamani, George Abraham P, qat-linux, linux-crypto,
linux-kernel
On Fri, Feb 06, 2026 at 09:09:40PM +0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
>
> avg_array() is defined as a 'type independant' #define.
> However the algorithm is only valid for unsigned types and the
> implementation is only valid for u64.
> All the callers pass temporary kmalloc() allocated arrays of u64.
>
> Replace with a function that takes a pointer to a u64 array.
>
> Change the implementation to sum the low and high 32bits of each
> value separately and then compute the average.
Thanks David, this is a great optimization.
I also reviewed the algorithm and confirmed it is functionally equivalent
to the previous version. I tested it on a platform with QAT and it
behaves as expected.
Some minor comments below.
> This will be massively faster as it does two divisions rather than
> one for each element.
NIT: probably not `massively faster` as the maximum value for len in the
current implementation is 4.
> Also removes some very pointless __unqual_scalar_typeof().
> They could be 'auto _x = 0 ? x + 0 : 0;' even if the types weren't fixed.
>
> Only compile tested.
>
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> .../intel/qat/qat_common/adf_tl_debugfs.c | 38 ++++++++-----------
> 1 file changed, 15 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> index b81f70576683..a084437a2631 100644
> --- a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> +++ b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> @@ -77,32 +77,24 @@ static int tl_collect_values_u64(struct adf_telemetry *telemetry,
> * @len: Number of elements.
> *
> * This algorithm computes average of an array without running into overflow.
> + * (Provided len is less than 2 << 31.)
Should this be 2^31 or 1 << 31?
Alternatively: `Provided len fits in u32`?
> *
> * Return: average of values.
> */
> -#define avg_array(array, len) ( \
> -{ \
> - typeof(&(array)[0]) _array = (array); \
> - __unqual_scalar_typeof(_array[0]) _x = 0; \
> - __unqual_scalar_typeof(_array[0]) _y = 0; \
> - __unqual_scalar_typeof(_array[0]) _a, _b; \
> - typeof(len) _len = (len); \
> - size_t _i; \
> - \
> - for (_i = 0; _i < _len; _i++) { \
> - _a = _array[_i]; \
> - _b = do_div(_a, _len); \
> - _x += _a; \
> - if (_y >= _len - _b) { \
> - _x++; \
> - _y -= _len - _b; \
> - } else { \
> - _y += _b; \
> - } \
> - } \
> - do_div(_y, _len); \
> - (_x + _y); \
> -})
> +static u64 avg_array(const u64 *array, size_t len)
Shall size_t len be u32 len?
> +{
> + u64 sum_hi = 0, sum_lo = 0;
> + size_t i;
> +
> + for (i = 0; i < len; i++) {
> + sum_hi += array[i] >> 32;
> + sum_lo += (u32)array[i];
> + }
> +
> + sum_lo += (u64)do_div(sum_hi, len) << 32;
> +
> + return (sum_hi << 32) + div_u64(sum_lo, len);
> +}
>
> /* Calculation function for simple counter. */
> static int tl_calc_count(struct adf_telemetry *telemetry,
Thanks,
--
Giovanni
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH next] crypto: qat - replace avg_array() with a better function
2026-02-24 18:11 ` Giovanni Cabiddu
@ 2026-02-24 22:01 ` David Laight
0 siblings, 0 replies; 4+ messages in thread
From: David Laight @ 2026-02-24 22:01 UTC (permalink / raw)
To: Giovanni Cabiddu
Cc: Herbert Xu, David S. Miller, Suman Kumar Chakraborty,
Vijay Sundar Selvamani, George Abraham P, qat-linux, linux-crypto,
linux-kernel
On Tue, 24 Feb 2026 18:11:37 +0000
Giovanni Cabiddu <giovanni.cabiddu@intel.com> wrote:
> On Fri, Feb 06, 2026 at 09:09:40PM +0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> >
> > avg_array() is defined as a 'type independant' #define.
> > However the algorithm is only valid for unsigned types and the
> > implementation is only valid for u64.
> > All the callers pass temporary kmalloc() allocated arrays of u64.
> >
> > Replace with a function that takes a pointer to a u64 array.
> >
> > Change the implementation to sum the low and high 32bits of each
> > value separately and then compute the average.
> Thanks David, this is a great optimization.
>
> I also reviewed the algorithm and confirmed it is functionally equivalent
> to the previous version. I tested it on a platform with QAT and it
> behaves as expected.
>
> Some minor comments below.
>
> > This will be massively faster as it does two divisions rather than
> > one for each element.
> NIT: probably not `massively faster` as the maximum value for len in the
> current implementation is 4.
It is still a lot faster - but probably not significant to system performance.
Actually if the max for len is 65536 you can do better (esp. for 32bit).
Instead of splitting 32/32 split 48/16, then sum_lo needs only be u32.
giving:
{
u64 sum_hi = 0;
u32 sum_lo = 0;
size_t i;
for (i = 0; i < len; i++) {
sum_hi += array[i] >> 16;
sum_lo += array[i] & 0xffff;
}
sum_lo += do_div(sum_hi, len) << 16;
return (sum_hi << 16) + sum_lo / len;
}
OTOH aren't those values performance counts of some kind?
Adding four of them together isn't going to wrap.
>
> > Also removes some very pointless __unqual_scalar_typeof().
> > They could be 'auto _x = 0 ? x + 0 : 0;' even if the types weren't fixed.
> >
> > Only compile tested.
> >
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > .../intel/qat/qat_common/adf_tl_debugfs.c | 38 ++++++++-----------
> > 1 file changed, 15 insertions(+), 23 deletions(-)
> >
> > diff --git a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> > index b81f70576683..a084437a2631 100644
> > --- a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> > +++ b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> > @@ -77,32 +77,24 @@ static int tl_collect_values_u64(struct adf_telemetry *telemetry,
> > * @len: Number of elements.
> > *
> > * This algorithm computes average of an array without running into overflow.
> > + * (Provided len is less than 2 << 31.)
> Should this be 2^31 or 1 << 31?
> Alternatively: `Provided len fits in u32`?
Not sure why I wrote 2 << 31 :-)
The condition is that sum_lo must not overflow.
The worst case is all the low bits being 1.
If len is 2^32 then sum_lo is then (2^32 - 1) * 2^32.
The remainder from the sum_hi divide is shifted and added in,
giving (2^32 - 1) * (2^32 + 1) which is what my maths teacher called a
'cow and goat' - (cow + goat) * (cow - goat) = cow squared - goat squared,
so then maximum for sum_lo is 2^64 - 1 which fits.
Which means it should have been 'len <= 2^32'.
David
>
> > *
> > * Return: average of values.
> > */
> > -#define avg_array(array, len) ( \
> > -{ \
> > - typeof(&(array)[0]) _array = (array); \
> > - __unqual_scalar_typeof(_array[0]) _x = 0; \
> > - __unqual_scalar_typeof(_array[0]) _y = 0; \
> > - __unqual_scalar_typeof(_array[0]) _a, _b; \
> > - typeof(len) _len = (len); \
> > - size_t _i; \
> > - \
> > - for (_i = 0; _i < _len; _i++) { \
> > - _a = _array[_i]; \
> > - _b = do_div(_a, _len); \
> > - _x += _a; \
> > - if (_y >= _len - _b) { \
> > - _x++; \
> > - _y -= _len - _b; \
> > - } else { \
> > - _y += _b; \
> > - } \
> > - } \
> > - do_div(_y, _len); \
> > - (_x + _y); \
> > -})
> > +static u64 avg_array(const u64 *array, size_t len)
> Shall size_t len be u32 len?
>
> > +{
> > + u64 sum_hi = 0, sum_lo = 0;
> > + size_t i;
> > +
> > + for (i = 0; i < len; i++) {
> > + sum_hi += array[i] >> 32;
> > + sum_lo += (u32)array[i];
> > + }
> > +
> > + sum_lo += (u64)do_div(sum_hi, len) << 32;
> > +
> > + return (sum_hi << 32) + div_u64(sum_lo, len);
> > +}
> >
> > /* Calculation function for simple counter. */
> > static int tl_calc_count(struct adf_telemetry *telemetry,
>
> Thanks,
>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-02-24 22:01 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-06 21:09 [PATCH next] crypto: qat - replace avg_array() with a better function david.laight.linux
2026-02-07 10:51 ` David Laight
2026-02-24 18:11 ` Giovanni Cabiddu
2026-02-24 22:01 ` David Laight
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox