Re: [PATCH next] crypto: qat - replace avg_array() with a better function

public inbox for linux-crypto@vger.kernel.org
 help / color / mirror / Atom feed

From: David Laight <david.laight.linux@gmail.com>
To: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com>,
	Vijay Sundar Selvamani <vijay.sundar.selvamani@intel.com>,
	George Abraham P <george.abraham.p@intel.com>,
	<qat-linux@intel.com>, <linux-crypto@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH next] crypto: qat - replace avg_array() with a better function
Date: Tue, 24 Feb 2026 22:01:44 +0000	[thread overview]
Message-ID: <20260224220144.231b17a5@pumpkin> (raw)
In-Reply-To: <aZ3p2dQFDNOgyQVz@gcabiddu-mobl.ger.corp.intel.com>

On Tue, 24 Feb 2026 18:11:37 +0000
Giovanni Cabiddu <giovanni.cabiddu@intel.com> wrote:

> On Fri, Feb 06, 2026 at 09:09:40PM +0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > avg_array() is defined as a 'type independant' #define.
> > However the algorithm is only valid for unsigned types and the
> > implementation is only valid for u64.
> > All the callers pass temporary kmalloc() allocated arrays of u64.
> > 
> > Replace with a function that takes a pointer to a u64 array.
> > 
> > Change the implementation to sum the low and high 32bits of each
> > value separately and then compute the average.  
> Thanks David, this is a great optimization.
> 
> I also reviewed the algorithm and confirmed it is functionally equivalent
> to the previous version. I tested it on a platform with QAT and it
> behaves as expected.
> 
> Some minor comments below.
> 
> > This will be massively faster as it does two divisions rather than
> > one for each element.  
> NIT: probably not `massively faster` as the maximum value for len in the
> current implementation is 4.

It is still a lot faster - but probably not significant to system performance.

Actually if the max for len is 65536 you can do better (esp. for 32bit).
Instead of splitting 32/32 split 48/16, then sum_lo needs only be u32.

giving:

{
	u64 sum_hi = 0;
	u32 sum_lo = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		sum_hi += array[i] >> 16;
		sum_lo += array[i] & 0xffff;
	}

	sum_lo += do_div(sum_hi, len) << 16;

	return (sum_hi << 16) + sum_lo / len;
}

OTOH aren't those values performance counts of some kind?
Adding four of them together isn't going to wrap.

> 
> > Also removes some very pointless __unqual_scalar_typeof().
> > They could be 'auto _x = 0 ? x + 0 : 0;' even if the types weren't fixed.
> > 
> > Only compile tested.
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> >  .../intel/qat/qat_common/adf_tl_debugfs.c     | 38 ++++++++-----------
> >  1 file changed, 15 insertions(+), 23 deletions(-)
> > 
> > diff --git a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> > index b81f70576683..a084437a2631 100644
> > --- a/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> > +++ b/drivers/crypto/intel/qat/qat_common/adf_tl_debugfs.c
> > @@ -77,32 +77,24 @@ static int tl_collect_values_u64(struct adf_telemetry *telemetry,
> >   * @len: Number of elements.
> >   *
> >   * This algorithm computes average of an array without running into overflow.
> > + * (Provided len is less than 2 << 31.)  
> Should this be 2^31 or 1 << 31?
> Alternatively: `Provided len fits in u32`?

Not sure why I wrote 2 << 31 :-)

The condition is that sum_lo must not overflow.
The worst case is all the low bits being 1.
If len is 2^32 then sum_lo is then (2^32 - 1) * 2^32.
The remainder from the sum_hi divide is shifted and added in,
giving (2^32 - 1) * (2^32 + 1) which is what my maths teacher called a
'cow and goat' - (cow + goat) * (cow - goat) = cow squared - goat squared,
so then maximum for sum_lo is 2^64 - 1 which fits.
Which means it should have been 'len <= 2^32'.

	David
 

> 
> >   *
> >   * Return: average of values.
> >   */
> > -#define avg_array(array, len) (				\
> > -{							\
> > -	typeof(&(array)[0]) _array = (array);		\
> > -	__unqual_scalar_typeof(_array[0]) _x = 0;	\
> > -	__unqual_scalar_typeof(_array[0]) _y = 0;	\
> > -	__unqual_scalar_typeof(_array[0]) _a, _b;	\
> > -	typeof(len) _len = (len);			\
> > -	size_t _i;					\
> > -							\
> > -	for (_i = 0; _i < _len; _i++) {			\
> > -		_a = _array[_i];			\
> > -		_b = do_div(_a, _len);			\
> > -		_x += _a;				\
> > -		if (_y >= _len - _b) {			\
> > -			_x++;				\
> > -			_y -= _len - _b;		\
> > -		} else {				\
> > -			_y += _b;			\
> > -		}					\
> > -	}						\
> > -	do_div(_y, _len);				\
> > -	(_x + _y);					\
> > -})
> > +static u64 avg_array(const u64 *array, size_t len)  
> Shall size_t len be u32 len?
> 
> > +{
> > +	u64 sum_hi = 0, sum_lo = 0;
> > +	size_t i;
> > +
> > +	for (i = 0; i < len; i++) {
> > +		sum_hi += array[i] >> 32;
> > +		sum_lo += (u32)array[i];
> > +	}
> > +
> > +	sum_lo += (u64)do_div(sum_hi, len) << 32;
> > +
> > +	return (sum_hi << 32) + div_u64(sum_lo, len);
> > +}
> >  
> >  /* Calculation function for simple counter. */
> >  static int tl_calc_count(struct adf_telemetry *telemetry,  
> 
> Thanks,
>

     prev parent reply	other threads:[~2026-02-24 22:01 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-06 21:09 [PATCH next] crypto: qat - replace avg_array() with a better function david.laight.linux
2026-02-07 10:51 ` David Laight
2026-02-24 18:11 ` Giovanni Cabiddu
2026-02-24 22:01   ` David Laight [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260224220144.231b17a5@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=davem@davemloft.net \
    --cc=george.abraham.p@intel.com \
    --cc=giovanni.cabiddu@intel.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=qat-linux@intel.com \
    --cc=suman.kumar.chakraborty@intel.com \
    --cc=vijay.sundar.selvamani@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox