* [PATCH] media: mali-c55: Fix unaligned access of AEC histogram zone weights
@ 2026-07-02 10:34 David Carlier
2026-07-03 9:44 ` Jacopo Mondi
0 siblings, 1 reply; 3+ messages in thread
From: David Carlier @ 2026-07-02 10:34 UTC (permalink / raw)
To: dan.scally, jacopo.mondi, mchehab
Cc: linux-media, linux-kernel, David Carlier, stable
mali_c55_params_aexp_hist_weights() packs the 225 per-zone u8 weights
into the ISP registers four at a time by casting the zone_weights array
to u32 and dereferencing it. The array sits at offset 10 within the
parameter block, so it is only 2-byte aligned: the u32 access is
unaligned, which is undefined behaviour and can fault on strict-align
configurations or once the loop is auto-vectorised.
The cast also reads the four weights in host byte order before they are
written to the little-endian register, so on big-endian hosts the four
weights packed into each register end up in the wrong byte lanes.
Read the weights with get_unaligned_le32() instead, which is both
alignment-safe and fixes the byte order regardless of host endianness.
Fixes: d5f281f3dd29 ("media: mali-c55: Add Mali-C55 ISP driver")
Cc: stable@vger.kernel.org
Signed-off-by: David Carlier <devnexen@gmail.com>
---
drivers/media/platform/arm/mali-c55/mali-c55-params.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/media/platform/arm/mali-c55/mali-c55-params.c b/drivers/media/platform/arm/mali-c55/mali-c55-params.c
index de0e9d898..1aaf64dde 100644
--- a/drivers/media/platform/arm/mali-c55/mali-c55-params.c
+++ b/drivers/media/platform/arm/mali-c55/mali-c55-params.c
@@ -6,6 +6,7 @@
*/
#include <linux/media/arm/mali-c55-config.h>
#include <linux/pm_runtime.h>
+#include <linux/unaligned.h>
#include <media/media-entity.h>
#include <media/v4l2-dev.h>
@@ -203,7 +204,7 @@ mali_c55_params_aexp_hist_weights(struct mali_c55 *mali_c55,
* of overwriting other registers.
*/
for (unsigned int i = 0; i < 56; i++) {
- val = ((u32 *)params->zone_weights)[i]
+ val = get_unaligned_le32(¶ms->zone_weights[i * 4])
& MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK;
addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
--
2.53.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] media: mali-c55: Fix unaligned access of AEC histogram zone weights
2026-07-02 10:34 [PATCH] media: mali-c55: Fix unaligned access of AEC histogram zone weights David Carlier
@ 2026-07-03 9:44 ` Jacopo Mondi
2026-07-03 21:16 ` David Laight
0 siblings, 1 reply; 3+ messages in thread
From: Jacopo Mondi @ 2026-07-03 9:44 UTC (permalink / raw)
To: David Carlier
Cc: dan.scally, jacopo.mondi, mchehab, linux-media, linux-kernel,
stable
Hi David,
thanks for sending a patch to address this issue
On Thu, Jul 02, 2026 at 11:34:53AM +0100, David Carlier wrote:
> mali_c55_params_aexp_hist_weights() packs the 225 per-zone u8 weights
> into the ISP registers four at a time by casting the zone_weights array
> to u32 and dereferencing it. The array sits at offset 10 within the
> parameter block, so it is only 2-byte aligned: the u32 access is
> unaligned, which is undefined behaviour and can fault on strict-align
> configurations or once the loop is auto-vectorised.
well, I don't there is a risk of undefined behaviour on ARMv8, it's
just less efficient
>
> The cast also reads the four weights in host byte order before they are
> written to the little-endian register, so on big-endian hosts the four
> weights packed into each register end up in the wrong byte lanes.
Also we don't have any endianess issue as the IP is only found on
little endian systems
>
> Read the weights with get_unaligned_le32() instead, which is both
> alignment-safe and fixes the byte order regardless of host endianness.
>
mmm, I read in Documentation/core-api/unaligned-memory-access.rst
that:
------------------------------------------------------------------------------
u32 value = get_unaligned((u32 *) data);
These macros work for memory accesses of any length (not just 32 bits as
in the examples above). Be aware that when compared to standard access of
aligned memory, using these macros to access unaligned memory can be costly in
terms of performance.
If use of such macros is not convenient, another option is to use memcpy(),
where the source or destination (or both) are of type u8* or unsigned char*.
Due to the byte-wise nature of this operation, unaligned accesses are avoided.
------------------------------------------------------------------------------
Which seems to suggest, if the issue here is performances, we should
aim for something different ? (honest question here, any kind of
guidance is appreciated)
> Fixes: d5f281f3dd29 ("media: mali-c55: Add Mali-C55 ISP driver")
> Cc: stable@vger.kernel.org
If it's only about performances, does this qualifies as a fix ?
> Signed-off-by: David Carlier <devnexen@gmail.com>
> ---
> drivers/media/platform/arm/mali-c55/mali-c55-params.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/media/platform/arm/mali-c55/mali-c55-params.c b/drivers/media/platform/arm/mali-c55/mali-c55-params.c
> index de0e9d898..1aaf64dde 100644
> --- a/drivers/media/platform/arm/mali-c55/mali-c55-params.c
> +++ b/drivers/media/platform/arm/mali-c55/mali-c55-params.c
> @@ -6,6 +6,7 @@
> */
> #include <linux/media/arm/mali-c55-config.h>
> #include <linux/pm_runtime.h>
> +#include <linux/unaligned.h>
>
> #include <media/media-entity.h>
> #include <media/v4l2-dev.h>
> @@ -203,7 +204,7 @@ mali_c55_params_aexp_hist_weights(struct mali_c55 *mali_c55,
> * of overwriting other registers.
> */
> for (unsigned int i = 0; i < 56; i++) {
> - val = ((u32 *)params->zone_weights)[i]
> + val = get_unaligned_le32(¶ms->zone_weights[i * 4])
> & MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK;
We could do:
memcpy(&val, ¶ms->zone_weights[4 * i], 4);
addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
mali_c55_ctx_write(mali_c55, addr,
val & MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK);
Or this could be an alternative:
const u8 *w = ¶ms->zone_weights[4 * i];
val = w[0] | w[1] << 8 | w[2] << 16 | w[3] << 24;
addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
mali_c55_ctx_write(mali_c55, addr,
val & MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK);
What do you think ?
> addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
>
> --
> 2.53.0
>
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] media: mali-c55: Fix unaligned access of AEC histogram zone weights
2026-07-03 9:44 ` Jacopo Mondi
@ 2026-07-03 21:16 ` David Laight
0 siblings, 0 replies; 3+ messages in thread
From: David Laight @ 2026-07-03 21:16 UTC (permalink / raw)
To: Jacopo Mondi
Cc: David Carlier, dan.scally, mchehab, linux-media, linux-kernel,
stable
On Fri, 3 Jul 2026 11:44:31 +0200
Jacopo Mondi <jacopo.mondi@ideasonboard.com> wrote:
> Hi David,
> thanks for sending a patch to address this issue
>
> On Thu, Jul 02, 2026 at 11:34:53AM +0100, David Carlier wrote:
> > mali_c55_params_aexp_hist_weights() packs the 225 per-zone u8 weights
> > into the ISP registers four at a time by casting the zone_weights array
> > to u32 and dereferencing it. The array sits at offset 10 within the
> > parameter block, so it is only 2-byte aligned: the u32 access is
> > unaligned, which is undefined behaviour and can fault on strict-align
> > configurations or once the loop is auto-vectorised.
>
> well, I don't there is a risk of undefined behaviour on ARMv8, it's
> just less efficient
>
> >
> > The cast also reads the four weights in host byte order before they are
> > written to the little-endian register, so on big-endian hosts the four
> > weights packed into each register end up in the wrong byte lanes.
>
> Also we don't have any endianess issue as the IP is only found on
> little endian systems
>
> >
> > Read the weights with get_unaligned_le32() instead, which is both
> > alignment-safe and fixes the byte order regardless of host endianness.
> >
>
> mmm, I read in Documentation/core-api/unaligned-memory-access.rst
> that:
>
> ------------------------------------------------------------------------------
> u32 value = get_unaligned((u32 *) data);
>
> These macros work for memory accesses of any length (not just 32 bits as
> in the examples above). Be aware that when compared to standard access of
> aligned memory, using these macros to access unaligned memory can be costly in
> terms of performance.
>
> If use of such macros is not convenient, another option is to use memcpy(),
> where the source or destination (or both) are of type u8* or unsigned char*.
> Due to the byte-wise nature of this operation, unaligned accesses are avoided.
> ------------------------------------------------------------------------------
>
> Which seems to suggest, if the issue here is performances, we should
> aim for something different ? (honest question here, any kind of
> guidance is appreciated)
>
> > Fixes: d5f281f3dd29 ("media: mali-c55: Add Mali-C55 ISP driver")
> > Cc: stable@vger.kernel.org
>
> If it's only about performances, does this qualifies as a fix ?
>
> > Signed-off-by: David Carlier <devnexen@gmail.com>
> > ---
> > drivers/media/platform/arm/mali-c55/mali-c55-params.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/media/platform/arm/mali-c55/mali-c55-params.c b/drivers/media/platform/arm/mali-c55/mali-c55-params.c
> > index de0e9d898..1aaf64dde 100644
> > --- a/drivers/media/platform/arm/mali-c55/mali-c55-params.c
> > +++ b/drivers/media/platform/arm/mali-c55/mali-c55-params.c
> > @@ -6,6 +6,7 @@
> > */
> > #include <linux/media/arm/mali-c55-config.h>
> > #include <linux/pm_runtime.h>
> > +#include <linux/unaligned.h>
> >
> > #include <media/media-entity.h>
> > #include <media/v4l2-dev.h>
> > @@ -203,7 +204,7 @@ mali_c55_params_aexp_hist_weights(struct mali_c55 *mali_c55,
> > * of overwriting other registers.
> > */
> > for (unsigned int i = 0; i < 56; i++) {
> > - val = ((u32 *)params->zone_weights)[i]
> > + val = get_unaligned_le32(¶ms->zone_weights[i * 4])
> > & MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK;
On LE with HAVE_EFFICIENT_UNALIGNED_ACCESS the latter generates what you
expect the former to generate.
But gcc can unroll loops and use (IIRC) 'rdp' to read two registers at once.
That will crash and burn.
The best thing would be to have a union of the two arrays with the
member marked __packed to remove the padding before it.
>
>
> We could do:
>
> memcpy(&val, ¶ms->zone_weights[4 * i], 4);
Some of the KASAN (etc) builds might make a mess of that.
Without compiler optimisations of memcpy() it is horrid.
> addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
>
> mali_c55_ctx_write(mali_c55, addr,
> val & MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK);
>
> Or this could be an alternative:
>
> const u8 *w = ¶ms->zone_weights[4 * i];
>
> val = w[0] | w[1] << 8 | w[2] << 16 | w[3] << 24;
That is a possible implementation of get_unaligned_le32() no point
doing it explicitly.
A late enough gcc will convert that to a 32bit memory read (with any
byteswap in the read or after) if unaligned accesses are supported.
Otherwise you get byte loads, shifts and ors.
David
> addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
>
> mali_c55_ctx_write(mali_c55, addr,
> val & MALI_C55_AEXP_HIST_ZONE_WEIGHT_MASK);
>
> What do you think ?
>
> > addr = base + MALI_C55_AEXP_HIST_ZONE_WEIGHTS_OFFSET + (4 * i);
> >
> > --
> > 2.53.0
> >
> >
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-07-03 21:16 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-07-02 10:34 [PATCH] media: mali-c55: Fix unaligned access of AEC histogram zone weights David Carlier
2026-07-03 9:44 ` Jacopo Mondi
2026-07-03 21:16 ` David Laight
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox