The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: David Laight <david.laight.linux@gmail.com>
To: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Cc: David Carlier <devnexen@gmail.com>,
	dan.scally@ideasonboard.com, mchehab@kernel.org,
	linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH] media: mali-c55: Fix unaligned access of AEC histogram zone weights
Date: Fri, 3 Jul 2026 22:16:51 +0100	[thread overview]
Message-ID: <20260703221651.41669d55@pumpkin> (raw)
In-Reply-To: <akd8E5jr722oTm49@zed>

On Fri, 3 Jul 2026 11:44:31 +0200
Jacopo Mondi <jacopo.mondi@ideasonboard.com> wrote:

> Hi David,
>    thanks for sending a patch to address this issue
> 
> On Thu, Jul 02, 2026 at 11:34:53AM +0100, David Carlier wrote:
> > mali_c55_params_aexp_hist_weights() packs the 225 per-zone u8 weights
> > into the ISP registers four at a time by casting the zone_weights array
> > to u32 and dereferencing it. The array sits at offset 10 within the
> > parameter block, so it is only 2-byte aligned: the u32 access is
> > unaligned, which is undefined behaviour and can fault on strict-align
> > configurations or once the loop is auto-vectorised.  
> 
> well, I don't there is a risk of undefined behaviour on ARMv8, it's
> just less efficient
> 
> >
> > The cast also reads the four weights in host byte order before they are
> > written to the little-endian register, so on big-endian hosts the four
> > weights packed into each register end up in the wrong byte lanes.  
> 
> Also we don't have any endianess issue as the IP is only found on
> little endian systems
> 
> >
> > Read the weights with get_unaligned_le32() instead, which is both
> > alignment-safe and fixes the byte order regardless of host endianness.
> >  
> 
> mmm, I read in Documentation/core-api/unaligned-memory-access.rst
> that:
> 
> ------------------------------------------------------------------------------
> 	u32 value = get_unaligned((u32 *) data);
> 
> These macros work for memory accesses of any length (not just 32 bits as
> in the examples above). Be aware that when compared to standard access of
> aligned memory, using these macros to access unaligned memory can be costly in
> terms of performance.
> 
> If use of such macros is not convenient, another option is to use memcpy(),
> where the source or destination (or both) are of type u8* or unsigned char*.
> Due to the byte-wise nature of this operation, unaligned accesses are avoided.
> ------------------------------------------------------------------------------
> 
> Which seems to suggest, if the issue here is performances, we should
> aim for something different ? (honest question here, any kind of
> guidance is appreciated)
> 
> > Fixes: d5f281f3dd29 ("media: mali-c55: Add Mali-C55 ISP driver")
> > Cc: stable@vger.kernel.org  
> 
> If it's only about performances, does this qualifies as a fix ?
> 
> > Signed-off-by: David Carlier <devnexen@gmail.com>
> > ---
> >  drivers/media/platform/arm/mali-c55/mali-c55-params.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/media/platform/arm/mali-c55/mali-c55-params.c b/drivers/media/platform/arm/mali-c55/mali-c55-params.c
> > index de0e9d898..1aaf64dde 100644
> > --- a/drivers/media/platform/arm/mali-c55/mali-c55-params.c
> > +++ b/drivers/media/platform/arm/mali-c55/mali-c55-params.c
> > @@ -6,6 +6,7 @@
> >   */
> >  #include <linux/media/arm/mali-c55-config.h>
> >  #include <linux/pm_runtime.h>
> > +#include <linux/unaligned.h>
> >
> >  #include <media/media-entity.h>
> >  #include <media/v4l2-dev.h>
> > @@ -203,7 +204,7 @@ mali_c55_params_aexp_hist_weights(struct mali_c55 *mali_c55,
> >  	 * of overwriting other registers.
> >  	 */
> >  	for (unsigned int i = 0; i < 56; i++) {
> > -		val = ((u32 *)params->zone_weights)[i]
> > +		val = get_unaligned_le32(&params->zone_weights[i * 4])
> >  			    & MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK;  

On LE with HAVE_EFFICIENT_UNALIGNED_ACCESS the latter generates what you
expect the former to generate.
But gcc can unroll loops and use (IIRC) 'rdp' to read two registers at once.
That will crash and burn.

The best thing would be to have a union of the two arrays with the
member marked __packed to remove the padding before it.

> 
> 
> We could do:
> 
>         memcpy(&val, &params->zone_weights[4 * i], 4);

Some of the KASAN (etc) builds might make a mess of that.
Without compiler optimisations of memcpy() it is horrid.

>         addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
> 
>         mali_c55_ctx_write(mali_c55, addr,
>                            val & MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK);
> 
> Or this could be an alternative:
> 
>         const u8 *w = &params->zone_weights[4 * i];
> 
>         val = w[0] | w[1] << 8 | w[2] << 16 | w[3] << 24;

That is a possible implementation of get_unaligned_le32() no point
doing it explicitly.

A late enough gcc will convert that to a 32bit memory read (with any
byteswap in the read or after) if unaligned accesses are supported.
Otherwise you get byte loads, shifts and ors.

	David

>         addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
> 
>         mali_c55_ctx_write(mali_c55, addr,
>                            val & MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK);
> 
> What do you think ?
> 
> >  		addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
> >
> > --
> > 2.53.0
> >
> >  
> 


      reply	other threads:[~2026-07-03 21:16 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-02 10:34 [PATCH] media: mali-c55: Fix unaligned access of AEC histogram zone weights David Carlier
2026-07-03  9:44 ` Jacopo Mondi
2026-07-03 21:16   ` David Laight [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260703221651.41669d55@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=dan.scally@ideasonboard.com \
    --cc=devnexen@gmail.com \
    --cc=jacopo.mondi@ideasonboard.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=mchehab@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox