Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH v2 0/4] media: rkvdec: Switch to using a bitwriter
From: Detlev Casanova @ 2026-03-27 19:18 UTC
  To: Ezequiel Garcia, Mauro Carvalho Chehab, Heiko Stuebner,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	Jonas Karlman, Nicolas Dufresne
  Cc: linux-kernel, linux-media, linux-rockchip, linux-arm-kernel, llvm,
	kernel, Detlev Casanova

Using bitfields in large structures where fields are mostly unaligned can
be hard on the compiler.

Issues have been reported with clang ([1], [2]) and, even though the clang
developers are addressing them, some setups can't or won't update clang
just to compile a driver.

Even when fixed, the compiler might still have to allocate a bigger stack
frame to manage misalignment. Coupled with other features like KASAN, the
stack frame can grow beyond the kernel's maximum [3].

To avoid this, let's drop the bitfield implementation and switch to a
bitwriter. There is already one for the older variants, so make it global
and use it in other variants.

Note that only buffer structures are switched to the bitwriter. The
register representation structures are kept with bitfields, as their
fields are properly aligned every 32 bits and don't require heavy stack
overhead.

Also note that the VDPU381 SPS and PPS structs are kept with bitfields,
for the same reason: they are small and aligned enough not to require
heavy stack overhead.

[1]: https://lore.kernel.org/oe-kbuild-all/202601211924.rqKS2Ihm-lkp@intel.com/
[2]: https://github.com/llvm/llvm-project/issues/178535
[3]: https://yhbt.net/lore/llvm/20260121230406.GA2625738@ax162/T/#mad878ec24a8224e1387ef5e73cb77b9ada55e3f2

Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
---
Changes in v2:
- Don't use BW_FIELD to compute buffer size
- Use correct size for buffers
- Fix missed indentation issues
- Link to v1: https://patch.msgid.link/20260327-rkvdec-use-bitwriter-v1-0-982cf872b590@collabora.com

---
Detlev Casanova (4):
      media: rkvdec: Introduce a global bitwriter helper
      media: rkvdec: Use the global bitwriter instead of local one
      media: rkvdec: common: Drop bitfields for the bitwriter
      media: rkvdec: vdpu383: Drop bitfields for the bitwriter

 drivers/media/platform/rockchip/rkvdec/Makefile    |   1 +
 .../platform/rockchip/rkvdec/rkvdec-bitwriter.c    |  30 ++
 .../platform/rockchip/rkvdec/rkvdec-bitwriter.h    |  25 +
 .../platform/rockchip/rkvdec/rkvdec-h264-common.c  |  51 +--
 .../platform/rockchip/rkvdec/rkvdec-h264-common.h  |  40 +-
 .../media/platform/rockchip/rkvdec/rkvdec-h264.c   | 109 ++---
 .../platform/rockchip/rkvdec/rkvdec-hevc-common.c  |  93 +---
 .../platform/rockchip/rkvdec/rkvdec-hevc-common.h  |  57 +--
 .../media/platform/rockchip/rkvdec/rkvdec-hevc.c   | 171 +++----
 .../platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c | 351 ++++++--------
 .../platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c | 502 +++++++++------------
 11 files changed, 579 insertions(+), 851 deletions(-)
---
base-commit: bbeb83d3182abe0d245318e274e8531e5dd7a948
change-id: 20260327-rkvdec-use-bitwriter-f1d149b3cf7c

Best regards,
--  
Detlev Casanova <detlev.casanova@collabora.com>




* [PATCH v2 1/4] media: rkvdec: Introduce a global bitwriter helper
From: Detlev Casanova @ 2026-03-27 19:18 UTC
  To: Ezequiel Garcia, Mauro Carvalho Chehab, Heiko Stuebner,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	Jonas Karlman, Nicolas Dufresne
  Cc: linux-kernel, linux-media, linux-rockchip, linux-arm-kernel, llvm,
	kernel, Detlev Casanova
In-Reply-To: <20260327-rkvdec-use-bitwriter-v2-0-a5a4754b0518@collabora.com>

Structures with bitfields work well when the fields are reasonably
aligned.
The more misaligned the fields are, the more gymnastics compilers need
to do to update the field values.

Some cases have been reported with clang on specific architectures
like armhf and hexagon, where the compiler would allocate a bigger
stack frame than needed or even freeze completely during compilation.

Some fixes have been provided to ease the issues, but the real fix
here is to use a bitwriter instead of heavily unaligned bitfields.

This is a preparation commit to provide a global bitwriter interface
for the whole driver.

Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
---
 drivers/media/platform/rockchip/rkvdec/Makefile    |  1 +
 .../platform/rockchip/rkvdec/rkvdec-bitwriter.c    | 30 ++++++++++++++++++++++
 .../platform/rockchip/rkvdec/rkvdec-bitwriter.h    | 25 ++++++++++++++++++
 3 files changed, 56 insertions(+)

diff --git a/drivers/media/platform/rockchip/rkvdec/Makefile b/drivers/media/platform/rockchip/rkvdec/Makefile
index e629d571e4d8..11e2122bcbbf 100644
--- a/drivers/media/platform/rockchip/rkvdec/Makefile
+++ b/drivers/media/platform/rockchip/rkvdec/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_VIDEO_ROCKCHIP_VDEC) += rockchip-vdec.o
 
 rockchip-vdec-y += \
 		   rkvdec.o \
+		   rkvdec-bitwriter.o \
 		   rkvdec-cabac.o \
 		   rkvdec-h264.o \
 		   rkvdec-h264-common.o \
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.c
new file mode 100644
index 000000000000..673ebb89002b
--- /dev/null
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Rockchip Video Decoder bit writer
+ *
+ * Copyright (C) 2026 Collabora, Ltd.
+ *      Detlev Casanova <detlev.casanova@collabora.com>
+ * Copyright (C) 2019 Collabora, Ltd.
+ *	Boris Brezillon <boris.brezillon@collabora.com>
+ */
+
+#include <linux/types.h>
+#include <linux/bits.h>
+
+#include "rkvdec-bitwriter.h"
+
+void rkvdec_set_bw_field(u32 *buf, struct rkvdec_bw_field field, u32 value)
+{
+	u8 bit = field.offset % 32;
+	u16 word = field.offset / 32;
+	u64 mask = GENMASK_ULL(bit + field.len - 1, bit);
+	u64 val = ((u64)value << bit) & mask;
+
+	buf[word] &= ~mask;
+	buf[word] |= val;
+	if (bit + field.len > 32) {
+		buf[word + 1] &= ~(mask >> 32);
+		buf[word + 1] |= val >> 32;
+	}
+}
+
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.h
new file mode 100644
index 000000000000..44154f1ebc65
--- /dev/null
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Rockchip Video Decoder bit writer
+ *
+ * Copyright (C) 2026 Collabora, Ltd.
+ *      Detlev Casanova <detlev.casanova@collabora.com>
+ * Copyright (C) 2019 Collabora, Ltd.
+ *	Boris Brezillon <boris.brezillon@collabora.com>
+ */
+
+#ifndef RKVDEC_BIT_WRITER_H_
+#define RKVDEC_BIT_WRITER_H_
+
+#include <linux/types.h>
+
+struct rkvdec_bw_field {
+	u16 offset;
+	u8 len;
+};
+
+#define BW_FIELD(_offset, _len) ((struct rkvdec_bw_field){ _offset, _len })
+
+void rkvdec_set_bw_field(u32 *buf, struct rkvdec_bw_field field, u32 value);
+
+#endif /* RKVDEC_BIT_WRITER_H_ */

-- 
2.53.0
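
The following is a minimal userspace sketch of the bitwriter from the patch
above, intended only to illustrate how a field that straddles a 32-bit word
boundary is written. It is not part of the series. The names bw_field,
genmask_ull(), set_bw_field() and get_bw_field() are hypothetical stand-ins:
the first three mirror struct rkvdec_bw_field, GENMASK_ULL() and
rkvdec_set_bw_field(), while the read helper has no counterpart in the patch
and exists only so the result can be checked.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace mirror of struct rkvdec_bw_field. */
struct bw_field {
	uint16_t offset;	/* bit offset from the start of the buffer */
	uint8_t len;		/* field width in bits, 1 to 32 */
};

/* Userspace stand-in for the kernel's GENMASK_ULL(high, low). */
static uint64_t genmask_ull(unsigned int high, unsigned int low)
{
	return (~0ULL >> (63 - high)) & (~0ULL << low);
}

/* Same logic as rkvdec_set_bw_field(), with explicit truncating casts. */
static void set_bw_field(uint32_t *buf, struct bw_field field, uint32_t value)
{
	unsigned int bit = field.offset % 32;
	unsigned int word = field.offset / 32;
	uint64_t mask = genmask_ull(bit + field.len - 1, bit);
	uint64_t val = ((uint64_t)value << bit) & mask;

	/* Low 32 bits of the 64-bit window land in buf[word]. */
	buf[word] &= (uint32_t)~mask;
	buf[word] |= (uint32_t)val;

	/* Bits that spill past bit 31 land in the next word. */
	if (bit + field.len > 32) {
		buf[word + 1] &= (uint32_t)~(mask >> 32);
		buf[word + 1] |= (uint32_t)(val >> 32);
	}
}

/* Read helper for checking results; not present in the patch. */
static uint32_t get_bw_field(const uint32_t *buf, struct bw_field field)
{
	unsigned int bit = field.offset % 32;
	unsigned int word = field.offset / 32;
	uint64_t window = buf[word];

	if (bit + field.len > 32)
		window |= (uint64_t)buf[word + 1] << 32;

	return (uint32_t)((window >> bit) & genmask_ull(field.len - 1, 0));
}
```

With a field at offset 31 and length 2 (the shape of PIC_ORDER_CNT_TYPE in
rkvdec-h264.c), writing the value 2 into an all-ones buffer clears bit 31 of
the first word and leaves bit 0 of the second word set. Only the two bits of
the field are modified; neighbouring bits are preserved.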




* [PATCH v2 3/4] media: rkvdec: common: Drop bitfields for the bitwriter
From: Detlev Casanova @ 2026-03-27 19:18 UTC
  To: Ezequiel Garcia, Mauro Carvalho Chehab, Heiko Stuebner,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	Jonas Karlman, Nicolas Dufresne
  Cc: linux-kernel, linux-media, linux-rockchip, linux-arm-kernel, llvm,
	kernel, Detlev Casanova
In-Reply-To: <20260327-rkvdec-use-bitwriter-v2-0-a5a4754b0518@collabora.com>

Currently, the common code files for HEVC and H.264 use structs with
bitfields to represent the HW RPS buffer.

Because the bitfields are numerous and mostly unaligned, they cause
compiler issues, especially with clang.

To prevent that, switch to the global bitwriter introduced previously.

Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
---
 .../platform/rockchip/rkvdec/rkvdec-h264-common.c  | 51 +-----------
 .../platform/rockchip/rkvdec/rkvdec-h264-common.h  | 40 +++-------
 .../platform/rockchip/rkvdec/rkvdec-hevc-common.c  | 93 ++++------------------
 .../platform/rockchip/rkvdec/rkvdec-hevc-common.h  | 57 ++++---------
 4 files changed, 44 insertions(+), 197 deletions(-)

diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.c
index e28f06394470..54639512e456 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.c
@@ -21,51 +21,6 @@
 
 #define RKVDEC_NUM_REFLIST		3
 
-static void set_dpb_info(struct rkvdec_rps_entry *entries,
-			 u8 reflist,
-			 u8 refnum,
-			 u8 info,
-			 bool bottom)
-{
-	struct rkvdec_rps_entry *entry = &entries[(reflist * 4) + refnum / 8];
-	u8 idx = refnum % 8;
-
-	switch (idx) {
-	case 0:
-		entry->dpb_info0 = info;
-		entry->bottom_flag0 = bottom;
-		break;
-	case 1:
-		entry->dpb_info1 = info;
-		entry->bottom_flag1 = bottom;
-		break;
-	case 2:
-		entry->dpb_info2 = info;
-		entry->bottom_flag2 = bottom;
-		break;
-	case 3:
-		entry->dpb_info3 = info;
-		entry->bottom_flag3 = bottom;
-		break;
-	case 4:
-		entry->dpb_info4 = info;
-		entry->bottom_flag4 = bottom;
-		break;
-	case 5:
-		entry->dpb_info5 = info;
-		entry->bottom_flag5 = bottom;
-		break;
-	case 6:
-		entry->dpb_info6 = info;
-		entry->bottom_flag6 = bottom;
-		break;
-	case 7:
-		entry->dpb_info7 = info;
-		entry->bottom_flag7 = bottom;
-		break;
-	}
-}
-
 void lookup_ref_buf_idx(struct rkvdec_ctx *ctx,
 			struct rkvdec_h264_run *run)
 {
@@ -111,7 +66,7 @@ void assemble_hw_rps(struct v4l2_h264_reflist_builder *builder,
 		if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
 			continue;
 
-		hw_rps->frame_num[i] = builder->refs[i].frame_num;
+		rkvdec_set_bw_field(hw_rps->info, RPS_FRAME_NUM(i), builder->refs[i].frame_num);
 	}
 
 	for (j = 0; j < RKVDEC_NUM_REFLIST; j++) {
@@ -138,7 +93,9 @@ void assemble_hw_rps(struct v4l2_h264_reflist_builder *builder,
 			dpb_valid = !!(run->ref_buf[ref->index]);
 			bottom = ref->fields == V4L2_H264_BOTTOM_FIELD_REF;
 
-			set_dpb_info(hw_rps->entries, j, i, ref->index | (dpb_valid << 4), bottom);
+			rkvdec_set_bw_field(hw_rps->info, RPS_ENTRY_DPB_INFO(j, i),
+					    ref->index | (dpb_valid << 4));
+			rkvdec_set_bw_field(hw_rps->info, RPS_ENTRY_BOTTOM_FLAG(j, i), bottom);
 		}
 	}
 }
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.h
index 5336370507d6..f04b700b863c 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.h
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.h
@@ -16,6 +16,7 @@
 #include <media/v4l2-mem2mem.h>
 
 #include "rkvdec.h"
+#include "rkvdec-bitwriter.h"
 
 struct rkvdec_h264_scaling_list {
 	u8 scaling_list_4x4[6][16];
@@ -38,39 +39,16 @@ struct rkvdec_h264_run {
 	struct vb2_buffer *ref_buf[V4L2_H264_NUM_DPB_ENTRIES];
 };
 
-struct rkvdec_rps_entry {
-	u32 dpb_info0:          5;
-	u32 bottom_flag0:       1;
-	u32 view_index_off0:    1;
-	u32 dpb_info1:          5;
-	u32 bottom_flag1:       1;
-	u32 view_index_off1:    1;
-	u32 dpb_info2:          5;
-	u32 bottom_flag2:       1;
-	u32 view_index_off2:    1;
-	u32 dpb_info3:          5;
-	u32 bottom_flag3:       1;
-	u32 view_index_off3:    1;
-	u32 dpb_info4:          5;
-	u32 bottom_flag4:       1;
-	u32 view_index_off4:    1;
-	u32 dpb_info5:          5;
-	u32 bottom_flag5:       1;
-	u32 view_index_off5:    1;
-	u32 dpb_info6:          5;
-	u32 bottom_flag6:       1;
-	u32 view_index_off6:    1;
-	u32 dpb_info7:          5;
-	u32 bottom_flag7:       1;
-	u32 view_index_off7:    1;
-} __packed;
+#define RPS_FRAME_NUM(i)		BW_FIELD((i) * 16, 16)
+#define RPS_ENTRY_DPB_INFO(l, e)	BW_FIELD(288 + (l) * 7 * 32 + (e) * 7, 5) //l: 0-2, e: 0-31
+#define RPS_ENTRY_BOTTOM_FLAG(l, e)	BW_FIELD(293 + (l) * 7 * 32 + (e) * 7, 1) //l: 0-2, e: 0-31
+#define RPS_ENTRY_VIEW_INDEX_OFF(l, e)	BW_FIELD(294 + (l) * 7 * 32 + (e) * 7, 1) //l: 0-2, e: 0-31
+
+#define RKVDEC_H264_RPS_SIZE		ALIGN(288 + 3 * 7 * 32, 128)
 
 struct rkvdec_rps {
-	u16 frame_num[16];
-	u32 reserved0;
-	struct rkvdec_rps_entry entries[12];
-	u32 reserved1[66];
-} __packed;
+	u32 info[RKVDEC_H264_RPS_SIZE / 8 / 4];
+};
 
 void lookup_ref_buf_idx(struct rkvdec_ctx *ctx, struct rkvdec_h264_run *run);
 void assemble_hw_rps(struct v4l2_h264_reflist_builder *builder,
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c
index 3119f3bc9f98..f89602075121 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c
@@ -74,72 +74,6 @@ void compute_tiles_non_uniform(struct rkvdec_hevc_run *run, u16 log2_min_cb_size
 	row_height[i] = pic_in_cts_height - sum;
 }
 
-static void set_ref_poc(struct rkvdec_rps_short_term_ref_set *set, int poc, int value, int flag)
-{
-	switch (poc) {
-	case 0:
-		set->delta_poc0 = value;
-		set->used_flag0 = flag;
-		break;
-	case 1:
-		set->delta_poc1 = value;
-		set->used_flag1 = flag;
-		break;
-	case 2:
-		set->delta_poc2 = value;
-		set->used_flag2 = flag;
-		break;
-	case 3:
-		set->delta_poc3 = value;
-		set->used_flag3 = flag;
-		break;
-	case 4:
-		set->delta_poc4 = value;
-		set->used_flag4 = flag;
-		break;
-	case 5:
-		set->delta_poc5 = value;
-		set->used_flag5 = flag;
-		break;
-	case 6:
-		set->delta_poc6 = value;
-		set->used_flag6 = flag;
-		break;
-	case 7:
-		set->delta_poc7 = value;
-		set->used_flag7 = flag;
-		break;
-	case 8:
-		set->delta_poc8 = value;
-		set->used_flag8 = flag;
-		break;
-	case 9:
-		set->delta_poc9 = value;
-		set->used_flag9 = flag;
-		break;
-	case 10:
-		set->delta_poc10 = value;
-		set->used_flag10 = flag;
-		break;
-	case 11:
-		set->delta_poc11 = value;
-		set->used_flag11 = flag;
-		break;
-	case 12:
-		set->delta_poc12 = value;
-		set->used_flag12 = flag;
-		break;
-	case 13:
-		set->delta_poc13 = value;
-		set->used_flag13 = flag;
-		break;
-	case 14:
-		set->delta_poc14 = value;
-		set->used_flag14 = flag;
-		break;
-	}
-}
-
 static void assemble_scalingfactor0(struct rkvdec_ctx *ctx, u8 *output,
 				    const struct v4l2_ctrl_hevc_scaling_matrix *input)
 {
@@ -218,10 +152,11 @@ static void rkvdec_hevc_assemble_hw_lt_rps(struct rkvdec_hevc_run *run, struct r
 		return;
 
 	for (int i = 0; i < sps->num_long_term_ref_pics_sps; i++) {
-		rps->refs[i].lt_ref_pic_poc_lsb =
-			run->ext_sps_lt_rps[i].lt_ref_pic_poc_lsb_sps;
-		rps->refs[i].used_by_curr_pic_lt_flag =
-			!!(run->ext_sps_lt_rps[i].flags & V4L2_HEVC_EXT_SPS_LT_RPS_FLAG_USED_LT);
+		rkvdec_set_bw_field(rps->info, RPS_LT_REF_PIC_POC_LSB(i),
+				    run->ext_sps_lt_rps[i].lt_ref_pic_poc_lsb_sps);
+		rkvdec_set_bw_field(rps->info, RPS_LT_REF_USED_BY_CURR_PIC(i),
+				    !!(run->ext_sps_lt_rps[i].flags &
+				       V4L2_HEVC_EXT_SPS_LT_RPS_FLAG_USED_LT));
 	}
 }
 
@@ -235,18 +170,24 @@ static void rkvdec_hevc_assemble_hw_st_rps(struct rkvdec_hevc_run *run, struct r
 		int j = 0;
 		const struct calculated_rps_st_set *set = &calculated_rps_st_sets[i];
 
-		rps->short_term_ref_sets[i].num_negative = set->num_negative_pics;
-		rps->short_term_ref_sets[i].num_positive = set->num_positive_pics;
+		rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_NUM_NEGATIVE(i),
+				    set->num_negative_pics);
+		rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_NUM_POSITIVE(i),
+				    set->num_positive_pics);
 
 		for (; j < set->num_negative_pics; j++) {
-			set_ref_poc(&rps->short_term_ref_sets[i], j,
-				    set->delta_poc_s0[j], set->used_by_curr_pic_s0[j]);
+			rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_DELTA_POC(i, j),
+					    set->delta_poc_s0[j]);
+			rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_USED(i, j),
+					    set->used_by_curr_pic_s0[j]);
 		}
 		poc = j;
 
 		for (j = 0; j < set->num_positive_pics; j++) {
-			set_ref_poc(&rps->short_term_ref_sets[i], poc + j,
-				    set->delta_poc_s1[j], set->used_by_curr_pic_s1[j]);
+			rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_DELTA_POC(i, poc + j),
+					    set->delta_poc_s1[j]);
+			rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_USED(i, poc + j),
+					    set->used_by_curr_pic_s1[j]);
 		}
 	}
 }
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.h
index 6f4faca4c091..2a9b7719ab2d 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.h
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.h
@@ -19,53 +19,24 @@
 #include <linux/types.h>
 
 #include "rkvdec.h"
+#include "rkvdec-bitwriter.h"
 
-struct rkvdec_rps_refs {
-	u16 lt_ref_pic_poc_lsb;
-	u16 used_by_curr_pic_lt_flag	: 1;
-	u16 reserved			: 15;
-} __packed;
+#define RPS_LT_REF_PIC_POC_LSB(i)	BW_FIELD(0 + (i) * 32, 16) // i: 0-31
+#define RPS_LT_REF_USED_BY_CURR_PIC(i)	BW_FIELD(16 + (i) * 32, 1) // i: 0-31
 
-struct rkvdec_rps_short_term_ref_set {
-	u32 num_negative	: 4;
-	u32 num_positive	: 4;
-	u32 delta_poc0		: 16;
-	u32 used_flag0		: 1;
-	u32 delta_poc1		: 16;
-	u32 used_flag1		: 1;
-	u32 delta_poc2		: 16;
-	u32 used_flag2		: 1;
-	u32 delta_poc3		: 16;
-	u32 used_flag3		: 1;
-	u32 delta_poc4		: 16;
-	u32 used_flag4		: 1;
-	u32 delta_poc5		: 16;
-	u32 used_flag5		: 1;
-	u32 delta_poc6		: 16;
-	u32 used_flag6		: 1;
-	u32 delta_poc7		: 16;
-	u32 used_flag7		: 1;
-	u32 delta_poc8		: 16;
-	u32 used_flag8		: 1;
-	u32 delta_poc9		: 16;
-	u32 used_flag9		: 1;
-	u32 delta_poc10		: 16;
-	u32 used_flag10		: 1;
-	u32 delta_poc11		: 16;
-	u32 used_flag11		: 1;
-	u32 delta_poc12		: 16;
-	u32 used_flag12		: 1;
-	u32 delta_poc13		: 16;
-	u32 used_flag13		: 1;
-	u32 delta_poc14		: 16;
-	u32 used_flag14		: 1;
-	u32 reserved_bits	: 25;
-	u32 reserved[3];
-} __packed;
+#define RPS_ST_REF_SET_NUM_NEGATIVE(i)	BW_FIELD(1024 + ((i) * 384), 4) // i: 0-63
+#define RPS_ST_REF_SET_NUM_POSITIVE(i)	BW_FIELD(1028 + ((i) * 384), 4) // i: 0-63
+
+// i: 0-63, j: 0-14
+#define RPS_ST_REF_SET_DELTA_POC(i, j)	BW_FIELD(1032 + ((i) * 384) + ((j) * 17), 16)
+
+// i: 0-63, j: 0-14
+#define RPS_ST_REF_SET_USED(i, j)	BW_FIELD(1048 + ((i) * 384) + ((j) * 17), 1)
+
+#define RKVDEC_RPS_HEVC_SIZE		ALIGN(1032 + 64 * 384, 128)
 
 struct rkvdec_rps {
-	struct rkvdec_rps_refs refs[32];
-	struct rkvdec_rps_short_term_ref_set short_term_ref_sets[64];
+	u32 info[RKVDEC_RPS_HEVC_SIZE / 8 / 4];
 } __packed;
 
 struct rkvdec_hevc_run {

-- 
2.53.0
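
As a sanity check on the H.264 macros in the patch above, the sketch below
recomputes the bit offset of dpb_info in two ways: from the indexing used by
the removed set_dpb_info() and struct rkvdec_rps_entry, and from the
RPS_ENTRY_DPB_INFO(l, e) formula. It is not part of the series. It assumes
that the compiler allocates packed bitfields LSB-first with no padding (the
usual behaviour of GCC and clang on little-endian targets); all function and
macro names in the sketch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RPS_ENTRIES_BASE	288u	/* 16 x 16-bit frame_num + 32 reserved bits */
#define RPS_SLOT_BITS		7u	/* dpb_info:5 + bottom_flag:1 + view_index_off:1 */
#define RPS_SLOTS_PER_ENTRY	8u	/* slots 0..7 in the removed rkvdec_rps_entry */
#define RPS_ENTRIES_PER_LIST	4u	/* entries[(reflist * 4) + refnum / 8] */

/* Offset implied by the removed struct layout and set_dpb_info() indexing. */
static unsigned int old_dpb_info_offset(unsigned int reflist, unsigned int refnum)
{
	unsigned int entry = reflist * RPS_ENTRIES_PER_LIST +
			     refnum / RPS_SLOTS_PER_ENTRY;
	unsigned int idx = refnum % RPS_SLOTS_PER_ENTRY;

	return RPS_ENTRIES_BASE +
	       entry * RPS_SLOTS_PER_ENTRY * RPS_SLOT_BITS +
	       idx * RPS_SLOT_BITS;
}

/* Offset used by RPS_ENTRY_DPB_INFO(l, e) in the patch. */
static unsigned int new_dpb_info_offset(unsigned int l, unsigned int e)
{
	return 288 + l * 7 * 32 + e * 7;
}

/* Round x up to a multiple of a (a must be a power of two). */
static unsigned int align_up(unsigned int x, unsigned int a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Returns 1 if both formulas agree for l in 0..2 and e in 0..31. */
static int rps_offsets_match(void)
{
	for (unsigned int l = 0; l < 3; l++)
		for (unsigned int e = 0; e < 32; e++)
			if (old_dpb_info_offset(l, e) != new_dpb_info_offset(l, e))
				return 0;
	return 1;
}

/* Number of u32 words implied by RKVDEC_H264_RPS_SIZE / 8 / 4. */
static unsigned int rps_words(void)
{
	return align_up(288 + 3 * 7 * 32, 128) / 8 / 4;
}
```

Under those assumptions the two formulas agree for every (l, e) pair, since
(l * 4 + e / 8) * 56 + (e % 8) * 7 reduces to l * 224 + e * 7. The last field
ends at bit 960, which fits in the 1024 bits (32 words) that the aligned size
provides.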




* [PATCH v2 2/4] media: rkvdec: Use the global bitwriter instead of local one
From: Detlev Casanova @ 2026-03-27 19:18 UTC
  To: Ezequiel Garcia, Mauro Carvalho Chehab, Heiko Stuebner,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	Jonas Karlman, Nicolas Dufresne
  Cc: linux-kernel, linux-media, linux-rockchip, linux-arm-kernel, llvm,
	kernel, Detlev Casanova
In-Reply-To: <20260327-rkvdec-use-bitwriter-v2-0-a5a4754b0518@collabora.com>

Both rkvdec-h264.c and rkvdec-hevc.c use their own bitwriter
function and macros.

Switch them to the global one introduced in the previous patch.

Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
---
 .../media/platform/rockchip/rkvdec/rkvdec-h264.c   | 109 ++++++-------
 .../media/platform/rockchip/rkvdec/rkvdec-hevc.c   | 171 +++++++++------------
 2 files changed, 119 insertions(+), 161 deletions(-)

diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264.c
index d3202cecb988..ffa606038192 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264.c
@@ -16,6 +16,7 @@
 #include "rkvdec-regs.h"
 #include "rkvdec-cabac.h"
 #include "rkvdec-h264-common.h"
+#include "rkvdec-bitwriter.h"
 
 /* Size with u32 units. */
 #define RKV_CABAC_INIT_BUFFER_SIZE	(3680 + 128)
@@ -25,56 +26,48 @@ struct rkvdec_sps_pps_packet {
 	u32 info[8];
 };
 
-struct rkvdec_ps_field {
-	u16 offset;
-	u8 len;
-};
-
-#define PS_FIELD(_offset, _len) \
-	((struct rkvdec_ps_field){ _offset, _len })
-
-#define SEQ_PARAMETER_SET_ID				PS_FIELD(0, 4)
-#define PROFILE_IDC					PS_FIELD(4, 8)
-#define CONSTRAINT_SET3_FLAG				PS_FIELD(12, 1)
-#define CHROMA_FORMAT_IDC				PS_FIELD(13, 2)
-#define BIT_DEPTH_LUMA					PS_FIELD(15, 3)
-#define BIT_DEPTH_CHROMA				PS_FIELD(18, 3)
-#define QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG		PS_FIELD(21, 1)
-#define LOG2_MAX_FRAME_NUM_MINUS4			PS_FIELD(22, 4)
-#define MAX_NUM_REF_FRAMES				PS_FIELD(26, 5)
-#define PIC_ORDER_CNT_TYPE				PS_FIELD(31, 2)
-#define LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4		PS_FIELD(33, 4)
-#define DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG		PS_FIELD(37, 1)
-#define PIC_WIDTH_IN_MBS				PS_FIELD(38, 9)
-#define PIC_HEIGHT_IN_MBS				PS_FIELD(47, 9)
-#define FRAME_MBS_ONLY_FLAG				PS_FIELD(56, 1)
-#define MB_ADAPTIVE_FRAME_FIELD_FLAG			PS_FIELD(57, 1)
-#define DIRECT_8X8_INFERENCE_FLAG			PS_FIELD(58, 1)
-#define MVC_EXTENSION_ENABLE				PS_FIELD(59, 1)
-#define NUM_VIEWS					PS_FIELD(60, 2)
-#define VIEW_ID(i)					PS_FIELD(62 + ((i) * 10), 10)
-#define NUM_ANCHOR_REFS_L(i)				PS_FIELD(82 + ((i) * 11), 1)
-#define ANCHOR_REF_L(i)				PS_FIELD(83 + ((i) * 11), 10)
-#define NUM_NON_ANCHOR_REFS_L(i)			PS_FIELD(104 + ((i) * 11), 1)
-#define NON_ANCHOR_REFS_L(i)				PS_FIELD(105 + ((i) * 11), 10)
-#define PIC_PARAMETER_SET_ID				PS_FIELD(128, 8)
-#define PPS_SEQ_PARAMETER_SET_ID			PS_FIELD(136, 5)
-#define ENTROPY_CODING_MODE_FLAG			PS_FIELD(141, 1)
-#define BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT_FLAG	PS_FIELD(142, 1)
-#define NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(i)		PS_FIELD(143 + ((i) * 5), 5)
-#define WEIGHTED_PRED_FLAG				PS_FIELD(153, 1)
-#define WEIGHTED_BIPRED_IDC				PS_FIELD(154, 2)
-#define PIC_INIT_QP_MINUS26				PS_FIELD(156, 7)
-#define PIC_INIT_QS_MINUS26				PS_FIELD(163, 6)
-#define CHROMA_QP_INDEX_OFFSET				PS_FIELD(169, 5)
-#define DEBLOCKING_FILTER_CONTROL_PRESENT_FLAG		PS_FIELD(174, 1)
-#define CONSTRAINED_INTRA_PRED_FLAG			PS_FIELD(175, 1)
-#define REDUNDANT_PIC_CNT_PRESENT			PS_FIELD(176, 1)
-#define TRANSFORM_8X8_MODE_FLAG			PS_FIELD(177, 1)
-#define SECOND_CHROMA_QP_INDEX_OFFSET			PS_FIELD(178, 5)
-#define SCALING_LIST_ENABLE_FLAG			PS_FIELD(183, 1)
-#define SCALING_LIST_ADDRESS				PS_FIELD(184, 32)
-#define IS_LONG_TERM(i)				PS_FIELD(216 + (i), 1)
+#define SEQ_PARAMETER_SET_ID				BW_FIELD(0, 4)
+#define PROFILE_IDC					BW_FIELD(4, 8)
+#define CONSTRAINT_SET3_FLAG				BW_FIELD(12, 1)
+#define CHROMA_FORMAT_IDC				BW_FIELD(13, 2)
+#define BIT_DEPTH_LUMA					BW_FIELD(15, 3)
+#define BIT_DEPTH_CHROMA				BW_FIELD(18, 3)
+#define QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG		BW_FIELD(21, 1)
+#define LOG2_MAX_FRAME_NUM_MINUS4			BW_FIELD(22, 4)
+#define MAX_NUM_REF_FRAMES				BW_FIELD(26, 5)
+#define PIC_ORDER_CNT_TYPE				BW_FIELD(31, 2)
+#define LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4		BW_FIELD(33, 4)
+#define DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG		BW_FIELD(37, 1)
+#define PIC_WIDTH_IN_MBS				BW_FIELD(38, 9)
+#define PIC_HEIGHT_IN_MBS				BW_FIELD(47, 9)
+#define FRAME_MBS_ONLY_FLAG				BW_FIELD(56, 1)
+#define MB_ADAPTIVE_FRAME_FIELD_FLAG			BW_FIELD(57, 1)
+#define DIRECT_8X8_INFERENCE_FLAG			BW_FIELD(58, 1)
+#define MVC_EXTENSION_ENABLE				BW_FIELD(59, 1)
+#define NUM_VIEWS					BW_FIELD(60, 2)
+#define VIEW_ID(i)					BW_FIELD(62 + ((i) * 10), 10)
+#define NUM_ANCHOR_REFS_L(i)				BW_FIELD(82 + ((i) * 11), 1)
+#define ANCHOR_REF_L(i)				BW_FIELD(83 + ((i) * 11), 10)
+#define NUM_NON_ANCHOR_REFS_L(i)			BW_FIELD(104 + ((i) * 11), 1)
+#define NON_ANCHOR_REFS_L(i)				BW_FIELD(105 + ((i) * 11), 10)
+#define PIC_PARAMETER_SET_ID				BW_FIELD(128, 8)
+#define PPS_SEQ_PARAMETER_SET_ID			BW_FIELD(136, 5)
+#define ENTROPY_CODING_MODE_FLAG			BW_FIELD(141, 1)
+#define BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT_FLAG	BW_FIELD(142, 1)
+#define NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(i)		BW_FIELD(143 + ((i) * 5), 5)
+#define WEIGHTED_PRED_FLAG				BW_FIELD(153, 1)
+#define WEIGHTED_BIPRED_IDC				BW_FIELD(154, 2)
+#define PIC_INIT_QP_MINUS26				BW_FIELD(156, 7)
+#define PIC_INIT_QS_MINUS26				BW_FIELD(163, 6)
+#define CHROMA_QP_INDEX_OFFSET				BW_FIELD(169, 5)
+#define DEBLOCKING_FILTER_CONTROL_PRESENT_FLAG		BW_FIELD(174, 1)
+#define CONSTRAINED_INTRA_PRED_FLAG			BW_FIELD(175, 1)
+#define REDUNDANT_PIC_CNT_PRESENT			BW_FIELD(176, 1)
+#define TRANSFORM_8X8_MODE_FLAG			BW_FIELD(177, 1)
+#define SECOND_CHROMA_QP_INDEX_OFFSET			BW_FIELD(178, 5)
+#define SCALING_LIST_ENABLE_FLAG			BW_FIELD(183, 1)
+#define SCALING_LIST_ADDRESS				BW_FIELD(184, 32)
+#define IS_LONG_TERM(i)				BW_FIELD(216 + (i), 1)
 
 /* Data structure describing auxiliary buffer format. */
 struct rkvdec_h264_priv_tbl {
@@ -91,20 +84,6 @@ struct rkvdec_h264_ctx {
 	struct rkvdec_regs regs;
 };
 
-static void set_ps_field(u32 *buf, struct rkvdec_ps_field field, u32 value)
-{
-	u8 bit = field.offset % 32, word = field.offset / 32;
-	u64 mask = GENMASK_ULL(bit + field.len - 1, bit);
-	u64 val = ((u64)value << bit) & mask;
-
-	buf[word] &= ~mask;
-	buf[word] |= val;
-	if (bit + field.len > 32) {
-		buf[word + 1] &= ~(mask >> 32);
-		buf[word + 1] |= val >> 32;
-	}
-}
-
 static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 			    struct rkvdec_h264_run *run)
 {
@@ -128,7 +107,7 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 	hw_ps = &priv_tbl->param_set[pps->pic_parameter_set_id];
 	memset(hw_ps, 0, sizeof(*hw_ps));
 
-#define WRITE_PPS(value, field) set_ps_field(hw_ps->info, field, value)
+#define WRITE_PPS(value, field) rkvdec_set_bw_field(hw_ps->info, field, value)
 	/* write sps */
 	WRITE_PPS(sps->seq_parameter_set_id, SEQ_PARAMETER_SET_ID);
 	WRITE_PPS(sps->profile_idc, PROFILE_IDC);
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c
index ac8b825d080a..87abf93dfd5e 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c
@@ -18,6 +18,7 @@
 #include "rkvdec-regs.h"
 #include "rkvdec-cabac.h"
 #include "rkvdec-hevc-common.h"
+#include "rkvdec-bitwriter.h"
 
 /* Size in u8/u32 units. */
 #define RKV_SCALING_LIST_SIZE		1360
@@ -34,80 +35,72 @@ struct rkvdec_rps_packet {
 	u32 info[RKV_RPS_SIZE];
 };
 
-struct rkvdec_ps_field {
-	u16 offset;
-	u8 len;
-};
-
-#define PS_FIELD(_offset, _len) \
-	((struct rkvdec_ps_field){ _offset, _len })
-
 /* SPS */
-#define VIDEO_PARAMETER_SET_ID				PS_FIELD(0, 4)
-#define SEQ_PARAMETER_SET_ID				PS_FIELD(4, 4)
-#define CHROMA_FORMAT_IDC				PS_FIELD(8, 2)
-#define PIC_WIDTH_IN_LUMA_SAMPLES			PS_FIELD(10, 13)
-#define PIC_HEIGHT_IN_LUMA_SAMPLES			PS_FIELD(23, 13)
-#define BIT_DEPTH_LUMA					PS_FIELD(36, 4)
-#define BIT_DEPTH_CHROMA				PS_FIELD(40, 4)
-#define LOG2_MAX_PIC_ORDER_CNT_LSB			PS_FIELD(44, 5)
-#define LOG2_DIFF_MAX_MIN_LUMA_CODING_BLOCK_SIZE	PS_FIELD(49, 2)
-#define LOG2_MIN_LUMA_CODING_BLOCK_SIZE			PS_FIELD(51, 3)
-#define LOG2_MIN_TRANSFORM_BLOCK_SIZE			PS_FIELD(54, 3)
-#define LOG2_DIFF_MAX_MIN_LUMA_TRANSFORM_BLOCK_SIZE	PS_FIELD(57, 2)
-#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTER		PS_FIELD(59, 3)
-#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTRA		PS_FIELD(62, 3)
-#define SCALING_LIST_ENABLED_FLAG			PS_FIELD(65, 1)
-#define AMP_ENABLED_FLAG				PS_FIELD(66, 1)
-#define SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG		PS_FIELD(67, 1)
-#define PCM_ENABLED_FLAG				PS_FIELD(68, 1)
-#define PCM_SAMPLE_BIT_DEPTH_LUMA			PS_FIELD(69, 4)
-#define PCM_SAMPLE_BIT_DEPTH_CHROMA			PS_FIELD(73, 4)
-#define PCM_LOOP_FILTER_DISABLED_FLAG			PS_FIELD(77, 1)
-#define LOG2_DIFF_MAX_MIN_PCM_LUMA_CODING_BLOCK_SIZE	PS_FIELD(78, 3)
-#define LOG2_MIN_PCM_LUMA_CODING_BLOCK_SIZE		PS_FIELD(81, 3)
-#define NUM_SHORT_TERM_REF_PIC_SETS			PS_FIELD(84, 7)
-#define LONG_TERM_REF_PICS_PRESENT_FLAG			PS_FIELD(91, 1)
-#define NUM_LONG_TERM_REF_PICS_SPS			PS_FIELD(92, 6)
-#define SPS_TEMPORAL_MVP_ENABLED_FLAG			PS_FIELD(98, 1)
-#define STRONG_INTRA_SMOOTHING_ENABLED_FLAG		PS_FIELD(99, 1)
+#define VIDEO_PARAMETER_SET_ID				BW_FIELD(0, 4)
+#define SEQ_PARAMETER_SET_ID				BW_FIELD(4, 4)
+#define CHROMA_FORMAT_IDC				BW_FIELD(8, 2)
+#define PIC_WIDTH_IN_LUMA_SAMPLES			BW_FIELD(10, 13)
+#define PIC_HEIGHT_IN_LUMA_SAMPLES			BW_FIELD(23, 13)
+#define BIT_DEPTH_LUMA					BW_FIELD(36, 4)
+#define BIT_DEPTH_CHROMA				BW_FIELD(40, 4)
+#define LOG2_MAX_PIC_ORDER_CNT_LSB			BW_FIELD(44, 5)
+#define LOG2_DIFF_MAX_MIN_LUMA_CODING_BLOCK_SIZE	BW_FIELD(49, 2)
+#define LOG2_MIN_LUMA_CODING_BLOCK_SIZE			BW_FIELD(51, 3)
+#define LOG2_MIN_TRANSFORM_BLOCK_SIZE			BW_FIELD(54, 3)
+#define LOG2_DIFF_MAX_MIN_LUMA_TRANSFORM_BLOCK_SIZE	BW_FIELD(57, 2)
+#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTER		BW_FIELD(59, 3)
+#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTRA		BW_FIELD(62, 3)
+#define SCALING_LIST_ENABLED_FLAG			BW_FIELD(65, 1)
+#define AMP_ENABLED_FLAG				BW_FIELD(66, 1)
+#define SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG		BW_FIELD(67, 1)
+#define PCM_ENABLED_FLAG				BW_FIELD(68, 1)
+#define PCM_SAMPLE_BIT_DEPTH_LUMA			BW_FIELD(69, 4)
+#define PCM_SAMPLE_BIT_DEPTH_CHROMA			BW_FIELD(73, 4)
+#define PCM_LOOP_FILTER_DISABLED_FLAG			BW_FIELD(77, 1)
+#define LOG2_DIFF_MAX_MIN_PCM_LUMA_CODING_BLOCK_SIZE	BW_FIELD(78, 3)
+#define LOG2_MIN_PCM_LUMA_CODING_BLOCK_SIZE		BW_FIELD(81, 3)
+#define NUM_SHORT_TERM_REF_PIC_SETS			BW_FIELD(84, 7)
+#define LONG_TERM_REF_PICS_PRESENT_FLAG			BW_FIELD(91, 1)
+#define NUM_LONG_TERM_REF_PICS_SPS			BW_FIELD(92, 6)
+#define SPS_TEMPORAL_MVP_ENABLED_FLAG			BW_FIELD(98, 1)
+#define STRONG_INTRA_SMOOTHING_ENABLED_FLAG		BW_FIELD(99, 1)
 /* PPS */
-#define PIC_PARAMETER_SET_ID				PS_FIELD(128, 6)
-#define PPS_SEQ_PARAMETER_SET_ID			PS_FIELD(134, 4)
-#define DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG		PS_FIELD(138, 1)
-#define OUTPUT_FLAG_PRESENT_FLAG			PS_FIELD(139, 1)
-#define NUM_EXTRA_SLICE_HEADER_BITS			PS_FIELD(140, 13)
-#define SIGN_DATA_HIDING_ENABLED_FLAG			PS_FIELD(153, 1)
-#define CABAC_INIT_PRESENT_FLAG				PS_FIELD(154, 1)
-#define NUM_REF_IDX_L0_DEFAULT_ACTIVE			PS_FIELD(155, 4)
-#define NUM_REF_IDX_L1_DEFAULT_ACTIVE			PS_FIELD(159, 4)
-#define INIT_QP_MINUS26					PS_FIELD(163, 7)
-#define CONSTRAINED_INTRA_PRED_FLAG			PS_FIELD(170, 1)
-#define TRANSFORM_SKIP_ENABLED_FLAG			PS_FIELD(171, 1)
-#define CU_QP_DELTA_ENABLED_FLAG			PS_FIELD(172, 1)
-#define LOG2_MIN_CU_QP_DELTA_SIZE			PS_FIELD(173, 3)
-#define PPS_CB_QP_OFFSET				PS_FIELD(176, 5)
-#define PPS_CR_QP_OFFSET				PS_FIELD(181, 5)
-#define PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG	PS_FIELD(186, 1)
-#define WEIGHTED_PRED_FLAG				PS_FIELD(187, 1)
-#define WEIGHTED_BIPRED_FLAG				PS_FIELD(188, 1)
-#define TRANSQUANT_BYPASS_ENABLED_FLAG			PS_FIELD(189, 1)
-#define TILES_ENABLED_FLAG				PS_FIELD(190, 1)
-#define ENTROPY_CODING_SYNC_ENABLED_FLAG		PS_FIELD(191, 1)
-#define PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED_FLAG	PS_FIELD(192, 1)
-#define LOOP_FILTER_ACROSS_TILES_ENABLED_FLAG		PS_FIELD(193, 1)
-#define DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG		PS_FIELD(194, 1)
-#define PPS_DEBLOCKING_FILTER_DISABLED_FLAG		PS_FIELD(195, 1)
-#define PPS_BETA_OFFSET_DIV2				PS_FIELD(196, 4)
-#define PPS_TC_OFFSET_DIV2				PS_FIELD(200, 4)
-#define LISTS_MODIFICATION_PRESENT_FLAG			PS_FIELD(204, 1)
-#define LOG2_PARALLEL_MERGE_LEVEL			PS_FIELD(205, 3)
-#define SLICE_SEGMENT_HEADER_EXTENSION_PRESENT_FLAG	PS_FIELD(208, 1)
-#define NUM_TILE_COLUMNS				PS_FIELD(212, 5)
-#define NUM_TILE_ROWS					PS_FIELD(217, 5)
-#define COLUMN_WIDTH(i)					PS_FIELD(256 + ((i) * 8), 8)
-#define ROW_HEIGHT(i)					PS_FIELD(416 + ((i) * 8), 8)
-#define SCALING_LIST_ADDRESS				PS_FIELD(592, 32)
+#define PIC_PARAMETER_SET_ID				BW_FIELD(128, 6)
+#define PPS_SEQ_PARAMETER_SET_ID			BW_FIELD(134, 4)
+#define DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG		BW_FIELD(138, 1)
+#define OUTPUT_FLAG_PRESENT_FLAG			BW_FIELD(139, 1)
+#define NUM_EXTRA_SLICE_HEADER_BITS			BW_FIELD(140, 13)
+#define SIGN_DATA_HIDING_ENABLED_FLAG			BW_FIELD(153, 1)
+#define CABAC_INIT_PRESENT_FLAG				BW_FIELD(154, 1)
+#define NUM_REF_IDX_L0_DEFAULT_ACTIVE			BW_FIELD(155, 4)
+#define NUM_REF_IDX_L1_DEFAULT_ACTIVE			BW_FIELD(159, 4)
+#define INIT_QP_MINUS26					BW_FIELD(163, 7)
+#define CONSTRAINED_INTRA_PRED_FLAG			BW_FIELD(170, 1)
+#define TRANSFORM_SKIP_ENABLED_FLAG			BW_FIELD(171, 1)
+#define CU_QP_DELTA_ENABLED_FLAG			BW_FIELD(172, 1)
+#define LOG2_MIN_CU_QP_DELTA_SIZE			BW_FIELD(173, 3)
+#define PPS_CB_QP_OFFSET				BW_FIELD(176, 5)
+#define PPS_CR_QP_OFFSET				BW_FIELD(181, 5)
+#define PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG	BW_FIELD(186, 1)
+#define WEIGHTED_PRED_FLAG				BW_FIELD(187, 1)
+#define WEIGHTED_BIPRED_FLAG				BW_FIELD(188, 1)
+#define TRANSQUANT_BYPASS_ENABLED_FLAG			BW_FIELD(189, 1)
+#define TILES_ENABLED_FLAG				BW_FIELD(190, 1)
+#define ENTROPY_CODING_SYNC_ENABLED_FLAG		BW_FIELD(191, 1)
+#define PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED_FLAG	BW_FIELD(192, 1)
+#define LOOP_FILTER_ACROSS_TILES_ENABLED_FLAG		BW_FIELD(193, 1)
+#define DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG		BW_FIELD(194, 1)
+#define PPS_DEBLOCKING_FILTER_DISABLED_FLAG		BW_FIELD(195, 1)
+#define PPS_BETA_OFFSET_DIV2				BW_FIELD(196, 4)
+#define PPS_TC_OFFSET_DIV2				BW_FIELD(200, 4)
+#define LISTS_MODIFICATION_PRESENT_FLAG			BW_FIELD(204, 1)
+#define LOG2_PARALLEL_MERGE_LEVEL			BW_FIELD(205, 3)
+#define SLICE_SEGMENT_HEADER_EXTENSION_PRESENT_FLAG	BW_FIELD(208, 1)
+#define NUM_TILE_COLUMNS				BW_FIELD(212, 5)
+#define NUM_TILE_ROWS					BW_FIELD(217, 5)
+#define COLUMN_WIDTH(i)					BW_FIELD(256 + ((i) * 8), 8)
+#define ROW_HEIGHT(i)					BW_FIELD(416 + ((i) * 8), 8)
+#define SCALING_LIST_ADDRESS				BW_FIELD(592, 32)
 
 /* Data structure describing auxiliary buffer format. */
 struct rkvdec_hevc_priv_tbl {
@@ -123,20 +116,6 @@ struct rkvdec_hevc_ctx {
 	struct rkvdec_regs regs;
 };
 
-static void set_ps_field(u32 *buf, struct rkvdec_ps_field field, u32 value)
-{
-	u8 bit = field.offset % 32, word = field.offset / 32;
-	u64 mask = GENMASK_ULL(bit + field.len - 1, bit);
-	u64 val = ((u64)value << bit) & mask;
-
-	buf[word] &= ~mask;
-	buf[word] |= val;
-	if (bit + field.len > 32) {
-		buf[word + 1] &= ~(mask >> 32);
-		buf[word + 1] |= val >> 32;
-	}
-}
-
 static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 			    struct rkvdec_hevc_run *run)
 {
@@ -159,7 +138,7 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 	hw_ps = &priv_tbl->param_set[pps->pic_parameter_set_id];
 	memset(hw_ps, 0, sizeof(*hw_ps));
 
-#define WRITE_PPS(value, field) set_ps_field(hw_ps->info, field, value)
+#define WRITE_PPS(value, field) rkvdec_set_bw_field(hw_ps->info, field, value)
 	/* write sps */
 	WRITE_PPS(sps->video_parameter_set_id, VIDEO_PARAMETER_SET_ID);
 	WRITE_PPS(sps->seq_parameter_set_id, SEQ_PARAMETER_SET_ID);
@@ -321,17 +300,17 @@ static void assemble_sw_rps(struct rkvdec_ctx *ctx,
 	int i, j;
 	unsigned int lowdelay;
 
-#define WRITE_RPS(value, field) set_ps_field(hw_ps->info, field, value)
+#define WRITE_RPS(value, field) rkvdec_set_bw_field(hw_ps->info, field, value)
 
-#define REF_PIC_LONG_TERM_L0(i)			PS_FIELD((i) * 5, 1)
-#define REF_PIC_IDX_L0(i)			PS_FIELD(1 + ((i) * 5), 4)
-#define REF_PIC_LONG_TERM_L1(i)			PS_FIELD(((i) < 5 ? 75 : 132) + ((i) * 5), 1)
-#define REF_PIC_IDX_L1(i)			PS_FIELD(((i) < 4 ? 76 : 128) + ((i) * 5), 4)
+#define REF_PIC_LONG_TERM_L0(n)			BW_FIELD((n) * 5, 1)
+#define REF_PIC_IDX_L0(n)			BW_FIELD(1 + ((n) * 5), 4)
+#define REF_PIC_LONG_TERM_L1(n)			BW_FIELD(((n) < 5 ? 75 : 132) + ((n) * 5), 1)
+#define REF_PIC_IDX_L1(n)			BW_FIELD(((n) < 4 ? 76 : 128) + ((n) * 5), 4)
 
-#define LOWDELAY				PS_FIELD(182, 1)
-#define LONG_TERM_RPS_BIT_OFFSET		PS_FIELD(183, 10)
-#define SHORT_TERM_RPS_BIT_OFFSET		PS_FIELD(193, 9)
-#define NUM_RPS_POC				PS_FIELD(202, 4)
+#define LOWDELAY				BW_FIELD(182, 1)
+#define LONG_TERM_RPS_BIT_OFFSET		BW_FIELD(183, 10)
+#define SHORT_TERM_RPS_BIT_OFFSET		BW_FIELD(193, 9)
+#define NUM_RPS_POC				BW_FIELD(202, 4)
 
 	for (j = 0; j < run->num_slices; j++) {
 		uint st_bit_offset = 0;

-- 
2.53.0




* [PATCH v2 4/4] media: rkvdec: vdpu383: Drop bitfields for the bitwriter
From: Detlev Casanova @ 2026-03-27 19:18 UTC (permalink / raw)
  To: Ezequiel Garcia, Mauro Carvalho Chehab, Heiko Stuebner,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
	Jonas Karlman, Nicolas Dufresne
  Cc: linux-kernel, linux-media, linux-rockchip, linux-arm-kernel, llvm,
	kernel, Detlev Casanova
In-Reply-To: <20260327-rkvdec-use-bitwriter-v2-0-a5a4754b0518@collabora.com>

The VDPU383 support for HEVC and H.264 uses structs with bitfields to
represent the SPS and PPS.

Because the fields are numerous and mostly unaligned, this causes
compiler issues, especially with clang.

To prevent that, switch to the global bitwriter introduced previously.

Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
---
 .../platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c | 351 ++++++--------
 .../platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c | 502 +++++++++------------
 2 files changed, 360 insertions(+), 493 deletions(-)
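
Note for readers: a minimal userspace sketch of how an offset/length
bitwriter of this kind can work, modeled on the set_ps_field() helper
that this series removes from the older HEVC code. The struct bw_field
type, the BW_FIELD definition and the set_bw_field() name below are
assumptions made for illustration; the real definitions live in
rkvdec-bitwriter.h, which is not part of this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field descriptor: a bit offset and a width in bits. */
struct bw_field {
	uint16_t offset;	/* bit offset from the start of the buffer */
	uint8_t len;		/* field width in bits, 1..32 */
};

/* Assumed definition; the driver's own BW_FIELD may differ. */
#define BW_FIELD(_offset, _len) ((struct bw_field){ (_offset), (_len) })

/*
 * Write 'value' into 'buf' at the position described by 'field'.
 * The mask is built in 64 bits so that a field starting near the end
 * of one 32-bit word can spill into the next word.
 */
static void set_bw_field(uint32_t *buf, struct bw_field field, uint32_t value)
{
	unsigned int bit = field.offset % 32;
	unsigned int word = field.offset / 32;
	uint64_t mask, val;

	assert(field.len >= 1 && field.len <= 32);

	mask = ((1ULL << field.len) - 1) << bit;
	val = ((uint64_t)value << bit) & mask;

	/* Clear the old bits, then set the new ones, in the first word. */
	buf[word] &= ~(uint32_t)mask;
	buf[word] |= (uint32_t)val;

	/* Same for the part that crosses into the following word. */
	if (bit + field.len > 32) {
		buf[word + 1] &= ~(uint32_t)(mask >> 32);
		buf[word + 1] |= (uint32_t)(val >> 32);
	}
}
```

With this sketch, writing the value 2 to BW_FIELD(31, 2) (the offset
used for PIC_ORDER_CNT_TYPE in the H.264 layout below) leaves bit 31 of
word 0 clear and sets bit 0 of word 1. The caller only ever touches
plain u32 words, so the compiler does not have to reason about packed,
unaligned struct members.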

diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c
index fb4f849d7366..5ec755733916 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c
@@ -15,105 +15,64 @@
 #include "rkvdec-cabac.h"
 #include "rkvdec-vdpu383-regs.h"
 #include "rkvdec-h264-common.h"
-
-struct rkvdec_sps {
-	u16 seq_parameter_set_id:			4;
-	u16 profile_idc:				8;
-	u16 constraint_set3_flag:			1;
-	u16 chroma_format_idc:				2;
-	u16 bit_depth_luma:				3;
-	u16 bit_depth_chroma:				3;
-	u16 qpprime_y_zero_transform_bypass_flag:	1;
-	u16 log2_max_frame_num_minus4:			4;
-	u16 max_num_ref_frames:				5;
-	u16 pic_order_cnt_type:				2;
-	u16 log2_max_pic_order_cnt_lsb_minus4:		4;
-	u16 delta_pic_order_always_zero_flag:		1;
-
-	u16 pic_width_in_mbs:				16;
-	u16 pic_height_in_mbs:				16;
-
-	u16 frame_mbs_only_flag:			1;
-	u16 mb_adaptive_frame_field_flag:		1;
-	u16 direct_8x8_inference_flag:			1;
-	u16 mvc_extension_enable:			1;
-	u16 num_views:					2;
-	u16 view_id0:                                   10;
-	u16 view_id1:                                   10;
-} __packed;
-
-struct rkvdec_pps {
-	u32 pic_parameter_set_id:				8;
-	u32 pps_seq_parameter_set_id:				5;
-	u32 entropy_coding_mode_flag:				1;
-	u32 bottom_field_pic_order_in_frame_present_flag:	1;
-	u32 num_ref_idx_l0_default_active_minus1:		5;
-	u32 num_ref_idx_l1_default_active_minus1:		5;
-	u32 weighted_pred_flag:					1;
-	u32 weighted_bipred_idc:				2;
-	u32 pic_init_qp_minus26:				7;
-	u32 pic_init_qs_minus26:				6;
-	u32 chroma_qp_index_offset:				5;
-	u32 deblocking_filter_control_present_flag:		1;
-	u32 constrained_intra_pred_flag:			1;
-	u32 redundant_pic_cnt_present:				1;
-	u32 transform_8x8_mode_flag:				1;
-	u32 second_chroma_qp_index_offset:			5;
-	u32 scaling_list_enable_flag:				1;
-	u32 is_longterm:					16;
-	u32 voidx:						16;
-
-	// dpb
-	u32 pic_field_flag:                                     1;
-	u32 pic_associated_flag:                                1;
-	u32 cur_top_field:					32;
-	u32 cur_bot_field:					32;
-
-	u32 top_field_order_cnt0:				32;
-	u32 bot_field_order_cnt0:				32;
-	u32 top_field_order_cnt1:				32;
-	u32 bot_field_order_cnt1:				32;
-	u32 top_field_order_cnt2:				32;
-	u32 bot_field_order_cnt2:				32;
-	u32 top_field_order_cnt3:				32;
-	u32 bot_field_order_cnt3:				32;
-	u32 top_field_order_cnt4:				32;
-	u32 bot_field_order_cnt4:				32;
-	u32 top_field_order_cnt5:				32;
-	u32 bot_field_order_cnt5:				32;
-	u32 top_field_order_cnt6:				32;
-	u32 bot_field_order_cnt6:				32;
-	u32 top_field_order_cnt7:				32;
-	u32 bot_field_order_cnt7:				32;
-	u32 top_field_order_cnt8:				32;
-	u32 bot_field_order_cnt8:				32;
-	u32 top_field_order_cnt9:				32;
-	u32 bot_field_order_cnt9:				32;
-	u32 top_field_order_cnt10:				32;
-	u32 bot_field_order_cnt10:				32;
-	u32 top_field_order_cnt11:				32;
-	u32 bot_field_order_cnt11:				32;
-	u32 top_field_order_cnt12:				32;
-	u32 bot_field_order_cnt12:				32;
-	u32 top_field_order_cnt13:				32;
-	u32 bot_field_order_cnt13:				32;
-	u32 top_field_order_cnt14:				32;
-	u32 bot_field_order_cnt14:				32;
-	u32 top_field_order_cnt15:				32;
-	u32 bot_field_order_cnt15:				32;
-
-	u32 ref_field_flags:					16;
-	u32 ref_topfield_used:					16;
-	u32 ref_botfield_used:					16;
-	u32 ref_colmv_use_flag:					16;
-
-	u32 reserved0:						30;
-	u32 reserved[3];
-} __packed;
+#include "rkvdec-bitwriter.h"
+
+#define SEQ_PARAMETER_SET_ID				BW_FIELD(0, 4)
+#define PROFILE_IDC					BW_FIELD(4, 8)
+#define CONSTRAINT_SET3_FLAG				BW_FIELD(12, 1)
+#define CHROMA_FORMAT_IDC				BW_FIELD(13, 2)
+#define BIT_DEPTH_LUMA					BW_FIELD(15, 3)
+#define BIT_DEPTH_CHROMA				BW_FIELD(18, 3)
+#define QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG		BW_FIELD(21, 1)
+#define LOG2_MAX_FRAME_NUM_MINUS4			BW_FIELD(22, 4)
+#define MAX_NUM_REF_FRAMES				BW_FIELD(26, 5)
+#define PIC_ORDER_CNT_TYPE				BW_FIELD(31, 2)
+#define LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4		BW_FIELD(33, 4)
+#define DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG		BW_FIELD(37, 1)
+#define PIC_WIDTH_IN_MBS				BW_FIELD(38, 16)
+#define PIC_HEIGHT_IN_MBS				BW_FIELD(54, 16)
+#define FRAME_MBS_ONLY_FLAG				BW_FIELD(70, 1)
+#define MB_ADAPTIVE_FRAME_FIELD_FLAG			BW_FIELD(71, 1)
+#define DIRECT_8X8_INFERENCE_FLAG			BW_FIELD(72, 1)
+#define MVC_EXTENSION_ENABLE				BW_FIELD(73, 1)
+#define NUM_VIEWS					BW_FIELD(74, 2)
+#define VIEW_ID(i)					BW_FIELD(76 + ((i) * 10), 10) // i: 0-1
+
+#define PIC_PARAMETER_SET_ID				BW_FIELD(96, 8)
+#define PPS_SEQ_PARAMETER_SET_ID			BW_FIELD(104, 5)
+#define ENTROPY_CODING_MODE_FLAG			BW_FIELD(109, 1)
+#define BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT_FLAG	BW_FIELD(110, 1)
+#define NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(i)		BW_FIELD(111 + ((i) * 5), 5) // i: 0-1
+#define WEIGHTED_PRED_FLAG				BW_FIELD(121, 1)
+#define WEIGHTED_BIPRED_IDC				BW_FIELD(122, 2)
+#define PIC_INIT_QP_MINUS26				BW_FIELD(124, 7)
+#define PIC_INIT_QS_MINUS26				BW_FIELD(131, 6)
+#define CHROMA_QP_INDEX_OFFSET				BW_FIELD(137, 5)
+#define DEBLOCKING_FILTER_CONTROL_PRESENT_FLAG		BW_FIELD(142, 1)
+#define CONSTRAINED_INTRA_PRED_FLAG			BW_FIELD(143, 1)
+#define REDUNDANT_PIC_CNT_PRESENT			BW_FIELD(144, 1)
+#define TRANSFORM_8X8_MODE_FLAG				BW_FIELD(145, 1)
+#define SECOND_CHROMA_QP_INDEX_OFFSET			BW_FIELD(146, 5)
+#define SCALING_LIST_ENABLE_FLAG			BW_FIELD(151, 1)
+#define IS_LONG_TERM(i)					BW_FIELD(152 + (i), 1) // i: 0-15
+
+#define PIC_FIELD_FLAG					BW_FIELD(184, 1)
+#define PIC_ASSOCIATED_FLAG				BW_FIELD(185, 1)
+#define CUR_TOP_FIELD					BW_FIELD(186, 32)
+#define CUR_BOT_FIELD					BW_FIELD(218, 32)
+
+#define TOP_FIELD_ORDER_CNT(i)				BW_FIELD(250 + (i) * 64, 32) // i: 0-15
+#define BOT_FIELD_ORDER_CNT(i)				BW_FIELD(282 + (i) * 64, 32) // i: 0-15
+
+#define REF_FIELD_FLAGS(i)				BW_FIELD(1274 + (i), 1) // i: 0-15
+#define REF_TOPFIELD_USED(i)				BW_FIELD(1290 + (i), 1) // i: 0-15
+#define REF_BOTFIELD_USED(i)				BW_FIELD(1306 + (i), 1) // i: 0-15
+#define REF_COLMV_USE_FLAG(i)				BW_FIELD(1322 + (i), 1) // i: 0-15
+
+#define SPS_SIZE					ALIGN(1322 + 16, 128)
 
 struct rkvdec_sps_pps {
-	struct rkvdec_sps sps;
-	struct rkvdec_pps pps;
+	u32 info[SPS_SIZE / 8 / 4];
 } __packed;
 
 /* Data structure describing auxiliary buffer format. */
@@ -130,67 +89,6 @@ struct rkvdec_h264_ctx {
 	struct vdpu383_regs_h26x regs;
 };
 
-static noinline_for_stack void set_field_order_cnt(struct rkvdec_pps *pps, const struct v4l2_h264_dpb_entry *dpb)
-{
-	pps->top_field_order_cnt0 = dpb[0].top_field_order_cnt;
-	pps->bot_field_order_cnt0 = dpb[0].bottom_field_order_cnt;
-	pps->top_field_order_cnt1 = dpb[1].top_field_order_cnt;
-	pps->bot_field_order_cnt1 = dpb[1].bottom_field_order_cnt;
-	pps->top_field_order_cnt2 = dpb[2].top_field_order_cnt;
-	pps->bot_field_order_cnt2 = dpb[2].bottom_field_order_cnt;
-	pps->top_field_order_cnt3 = dpb[3].top_field_order_cnt;
-	pps->bot_field_order_cnt3 = dpb[3].bottom_field_order_cnt;
-	pps->top_field_order_cnt4 = dpb[4].top_field_order_cnt;
-	pps->bot_field_order_cnt4 = dpb[4].bottom_field_order_cnt;
-	pps->top_field_order_cnt5 = dpb[5].top_field_order_cnt;
-	pps->bot_field_order_cnt5 = dpb[5].bottom_field_order_cnt;
-	pps->top_field_order_cnt6 = dpb[6].top_field_order_cnt;
-	pps->bot_field_order_cnt6 = dpb[6].bottom_field_order_cnt;
-	pps->top_field_order_cnt7 = dpb[7].top_field_order_cnt;
-	pps->bot_field_order_cnt7 = dpb[7].bottom_field_order_cnt;
-	pps->top_field_order_cnt8 = dpb[8].top_field_order_cnt;
-	pps->bot_field_order_cnt8 = dpb[8].bottom_field_order_cnt;
-	pps->top_field_order_cnt9 = dpb[9].top_field_order_cnt;
-	pps->bot_field_order_cnt9 = dpb[9].bottom_field_order_cnt;
-	pps->top_field_order_cnt10 = dpb[10].top_field_order_cnt;
-	pps->bot_field_order_cnt10 = dpb[10].bottom_field_order_cnt;
-	pps->top_field_order_cnt11 = dpb[11].top_field_order_cnt;
-	pps->bot_field_order_cnt11 = dpb[11].bottom_field_order_cnt;
-	pps->top_field_order_cnt12 = dpb[12].top_field_order_cnt;
-	pps->bot_field_order_cnt12 = dpb[12].bottom_field_order_cnt;
-	pps->top_field_order_cnt13 = dpb[13].top_field_order_cnt;
-	pps->bot_field_order_cnt13 = dpb[13].bottom_field_order_cnt;
-	pps->top_field_order_cnt14 = dpb[14].top_field_order_cnt;
-	pps->bot_field_order_cnt14 = dpb[14].bottom_field_order_cnt;
-	pps->top_field_order_cnt15 = dpb[15].top_field_order_cnt;
-	pps->bot_field_order_cnt15 = dpb[15].bottom_field_order_cnt;
-}
-
-static noinline_for_stack void set_dec_params(struct rkvdec_pps *pps, const struct v4l2_ctrl_h264_decode_params *dec_params)
-{
-	const struct v4l2_h264_dpb_entry *dpb = dec_params->dpb;
-
-	for (int i = 0; i < ARRAY_SIZE(dec_params->dpb); i++) {
-		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
-			pps->is_longterm |= (1 << i);
-		pps->ref_field_flags |=
-		 (!!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD)) << i;
-		pps->ref_colmv_use_flag |=
-		 (!!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) << i;
-		pps->ref_topfield_used |=
-		 (!!(dpb[i].fields & V4L2_H264_TOP_FIELD_REF)) << i;
-		pps->ref_botfield_used |=
-			(!!(dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF)) << i;
-	}
-	pps->pic_field_flag =
-		!!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC);
-	pps->pic_associated_flag =
-		!!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD);
-
-	pps->cur_top_field = dec_params->top_field_order_cnt;
-	pps->cur_bot_field = dec_params->bottom_field_order_cnt;
-}
-
 static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 			    struct rkvdec_h264_run *run)
 {
@@ -202,6 +100,7 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 	struct rkvdec_h264_priv_tbl *priv_tbl = h264_ctx->priv_tbl.cpu;
 	struct rkvdec_sps_pps *hw_ps;
 	u32 pic_width, pic_height;
+	int i;
 
 	/*
 	 * HW read the SPS/PPS information from PPS packet index by PPS id.
@@ -213,23 +112,25 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 	memset(hw_ps, 0, sizeof(*hw_ps));
 
 	/* write sps */
-	hw_ps->sps.seq_parameter_set_id = sps->seq_parameter_set_id;
-	hw_ps->sps.profile_idc = sps->profile_idc;
-	hw_ps->sps.constraint_set3_flag = !!(sps->constraint_set_flags & (1 << 3));
-	hw_ps->sps.chroma_format_idc = sps->chroma_format_idc;
-	hw_ps->sps.bit_depth_luma = sps->bit_depth_luma_minus8;
-	hw_ps->sps.bit_depth_chroma = sps->bit_depth_chroma_minus8;
-	hw_ps->sps.qpprime_y_zero_transform_bypass_flag =
-		!!(sps->flags & V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
-	hw_ps->sps.log2_max_frame_num_minus4 = sps->log2_max_frame_num_minus4;
-	hw_ps->sps.max_num_ref_frames = sps->max_num_ref_frames;
-	hw_ps->sps.pic_order_cnt_type = sps->pic_order_cnt_type;
-	hw_ps->sps.log2_max_pic_order_cnt_lsb_minus4 =
-		sps->log2_max_pic_order_cnt_lsb_minus4;
-	hw_ps->sps.delta_pic_order_always_zero_flag =
-		!!(sps->flags & V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
-	hw_ps->sps.mvc_extension_enable = 0;
-	hw_ps->sps.num_views = 0;
+	rkvdec_set_bw_field(hw_ps->info, SEQ_PARAMETER_SET_ID, sps->seq_parameter_set_id);
+	rkvdec_set_bw_field(hw_ps->info, PROFILE_IDC, sps->profile_idc);
+	rkvdec_set_bw_field(hw_ps->info, CONSTRAINT_SET3_FLAG,
+			    !!(sps->constraint_set_flags & (1 << 3)));
+	rkvdec_set_bw_field(hw_ps->info, CHROMA_FORMAT_IDC, sps->chroma_format_idc);
+	rkvdec_set_bw_field(hw_ps->info, BIT_DEPTH_LUMA, sps->bit_depth_luma_minus8);
+	rkvdec_set_bw_field(hw_ps->info, BIT_DEPTH_CHROMA, sps->bit_depth_chroma_minus8);
+	rkvdec_set_bw_field(hw_ps->info, QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG,
+			    !!(sps->flags & V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS));
+	rkvdec_set_bw_field(hw_ps->info, LOG2_MAX_FRAME_NUM_MINUS4,
+			    sps->log2_max_frame_num_minus4);
+	rkvdec_set_bw_field(hw_ps->info, MAX_NUM_REF_FRAMES, sps->max_num_ref_frames);
+	rkvdec_set_bw_field(hw_ps->info, PIC_ORDER_CNT_TYPE, sps->pic_order_cnt_type);
+	rkvdec_set_bw_field(hw_ps->info, LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4,
+			    sps->log2_max_pic_order_cnt_lsb_minus4);
+	rkvdec_set_bw_field(hw_ps->info, DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG,
+			    !!(sps->flags & V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO));
+	rkvdec_set_bw_field(hw_ps->info, MVC_EXTENSION_ENABLE, 0);
+	rkvdec_set_bw_field(hw_ps->info, NUM_VIEWS, 0);
 
 	/*
 	 * Use the SPS values since they are already in macroblocks
@@ -245,48 +146,72 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 	if (!!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC))
 		pic_height /= 2;
 
-	hw_ps->sps.pic_width_in_mbs = pic_width;
-	hw_ps->sps.pic_height_in_mbs = pic_height;
+	rkvdec_set_bw_field(hw_ps->info, PIC_WIDTH_IN_MBS, pic_width);
+	rkvdec_set_bw_field(hw_ps->info, PIC_HEIGHT_IN_MBS, pic_height);
 
-	hw_ps->sps.frame_mbs_only_flag =
-		!!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
-	hw_ps->sps.mb_adaptive_frame_field_flag =
-		!!(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
-	hw_ps->sps.direct_8x8_inference_flag =
-		!!(sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
+	rkvdec_set_bw_field(hw_ps->info, FRAME_MBS_ONLY_FLAG,
+			    !!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY));
+	rkvdec_set_bw_field(hw_ps->info, MB_ADAPTIVE_FRAME_FIELD_FLAG,
+			    !!(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD));
+	rkvdec_set_bw_field(hw_ps->info, DIRECT_8X8_INFERENCE_FLAG,
+			    !!(sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE));
 
 	/* write pps */
-	hw_ps->pps.pic_parameter_set_id = pps->pic_parameter_set_id;
-	hw_ps->pps.pps_seq_parameter_set_id = pps->seq_parameter_set_id;
-	hw_ps->pps.entropy_coding_mode_flag =
-		!!(pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
-	hw_ps->pps.bottom_field_pic_order_in_frame_present_flag =
-		!!(pps->flags & V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
-	hw_ps->pps.num_ref_idx_l0_default_active_minus1 =
-		pps->num_ref_idx_l0_default_active_minus1;
-	hw_ps->pps.num_ref_idx_l1_default_active_minus1 =
-		pps->num_ref_idx_l1_default_active_minus1;
-	hw_ps->pps.weighted_pred_flag =
-		!!(pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
-	hw_ps->pps.weighted_bipred_idc = pps->weighted_bipred_idc;
-	hw_ps->pps.pic_init_qp_minus26 = pps->pic_init_qp_minus26;
-	hw_ps->pps.pic_init_qs_minus26 = pps->pic_init_qs_minus26;
-	hw_ps->pps.chroma_qp_index_offset = pps->chroma_qp_index_offset;
-	hw_ps->pps.deblocking_filter_control_present_flag =
-		!!(pps->flags & V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
-	hw_ps->pps.constrained_intra_pred_flag =
-		!!(pps->flags & V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
-	hw_ps->pps.redundant_pic_cnt_present =
-		!!(pps->flags & V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
-	hw_ps->pps.transform_8x8_mode_flag =
-		!!(pps->flags & V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
-	hw_ps->pps.second_chroma_qp_index_offset = pps->second_chroma_qp_index_offset;
-	hw_ps->pps.scaling_list_enable_flag =
-		!!(pps->flags & V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
-
-	set_field_order_cnt(&hw_ps->pps, dpb);
-	set_dec_params(&hw_ps->pps, dec_params);
+	rkvdec_set_bw_field(hw_ps->info, PIC_PARAMETER_SET_ID, pps->pic_parameter_set_id);
+	rkvdec_set_bw_field(hw_ps->info, PPS_SEQ_PARAMETER_SET_ID, pps->seq_parameter_set_id);
+	rkvdec_set_bw_field(hw_ps->info, ENTROPY_CODING_MODE_FLAG,
+			    !!(pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE));
+	rkvdec_set_bw_field(hw_ps->info, BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT_FLAG,
+			    !!(pps->flags &
+			       V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT));
+	rkvdec_set_bw_field(hw_ps->info, NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(0),
+			    pps->num_ref_idx_l0_default_active_minus1);
+	rkvdec_set_bw_field(hw_ps->info, NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(1),
+			    pps->num_ref_idx_l1_default_active_minus1);
+	rkvdec_set_bw_field(hw_ps->info, WEIGHTED_PRED_FLAG,
+			    !!(pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED));
+	rkvdec_set_bw_field(hw_ps->info, WEIGHTED_BIPRED_IDC, pps->weighted_bipred_idc);
+	rkvdec_set_bw_field(hw_ps->info, PIC_INIT_QP_MINUS26, pps->pic_init_qp_minus26);
+	rkvdec_set_bw_field(hw_ps->info, PIC_INIT_QS_MINUS26, pps->pic_init_qs_minus26);
+	rkvdec_set_bw_field(hw_ps->info, CHROMA_QP_INDEX_OFFSET, pps->chroma_qp_index_offset);
+	rkvdec_set_bw_field(hw_ps->info, DEBLOCKING_FILTER_CONTROL_PRESENT_FLAG,
+			    !!(pps->flags & V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
+	rkvdec_set_bw_field(hw_ps->info, CONSTRAINED_INTRA_PRED_FLAG,
+			    !!(pps->flags & V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED));
+	rkvdec_set_bw_field(hw_ps->info, REDUNDANT_PIC_CNT_PRESENT,
+			    !!(pps->flags & V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT));
+	rkvdec_set_bw_field(hw_ps->info, TRANSFORM_8X8_MODE_FLAG,
+			    !!(pps->flags & V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE));
+	rkvdec_set_bw_field(hw_ps->info, SECOND_CHROMA_QP_INDEX_OFFSET,
+			    pps->second_chroma_qp_index_offset);
+	rkvdec_set_bw_field(hw_ps->info, SCALING_LIST_ENABLE_FLAG,
+			    !!(pps->flags & V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT));
+
+	for (i = 0; i < ARRAY_SIZE(dec_params->dpb); i++) {
+		rkvdec_set_bw_field(hw_ps->info, TOP_FIELD_ORDER_CNT(i),
+				    dpb[i].top_field_order_cnt);
+		rkvdec_set_bw_field(hw_ps->info, BOT_FIELD_ORDER_CNT(i),
+				    dpb[i].bottom_field_order_cnt);
+
+		rkvdec_set_bw_field(hw_ps->info, IS_LONG_TERM(i),
+				    !!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM));
+		rkvdec_set_bw_field(hw_ps->info, REF_FIELD_FLAGS(i),
+				    !!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD));
+		rkvdec_set_bw_field(hw_ps->info, REF_COLMV_USE_FLAG(i),
+				    !!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE));
+		rkvdec_set_bw_field(hw_ps->info, REF_TOPFIELD_USED(i),
+				    !!(dpb[i].fields & V4L2_H264_TOP_FIELD_REF));
+		rkvdec_set_bw_field(hw_ps->info, REF_BOTFIELD_USED(i),
+				    !!(dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF));
+	}
+
+	rkvdec_set_bw_field(hw_ps->info, PIC_FIELD_FLAG,
+			    !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC));
+	rkvdec_set_bw_field(hw_ps->info, PIC_ASSOCIATED_FLAG,
+			    !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD));
 
+	rkvdec_set_bw_field(hw_ps->info, CUR_TOP_FIELD, dec_params->top_field_order_cnt);
+	rkvdec_set_bw_field(hw_ps->info, CUR_BOT_FIELD, dec_params->bottom_field_order_cnt);
 }
 
 static void rkvdec_write_regs(struct rkvdec_ctx *ctx)
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c
index 96d938ee70b0..3575338a531a 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c
@@ -13,149 +13,106 @@
 #include "rkvdec-rcb.h"
 #include "rkvdec-hevc-common.h"
 #include "rkvdec-vdpu383-regs.h"
+#include "rkvdec-bitwriter.h"
+
+#define VIDEO_PARAMETER_SET_ID				BW_FIELD(0, 4)
+#define SEQ_PARAMETER_SET_ID				BW_FIELD(4, 4)
+#define CHROMA_FORMAT_IDC				BW_FIELD(8, 2)
+#define PIC_WIDTH_IN_LUMA_SAMPLES			BW_FIELD(10, 16)
+#define PIC_HEIGHT_IN_LUMA_SAMPLES			BW_FIELD(26, 16)
+#define BIT_DEPTH_LUMA					BW_FIELD(42, 3)
+#define BIT_DEPTH_CHROMA				BW_FIELD(45, 3)
+#define LOG2_MAX_PIC_ORDER_CNT_LSB			BW_FIELD(48, 5)
+#define LOG2_DIFF_MAX_MIN_LUMA_CODING_BLOCK_SIZE	BW_FIELD(53, 2)
+#define LOG2_MIN_LUMA_CODING_BLOCK_SIZE			BW_FIELD(55, 3)
+#define LOG2_MIN_TRANSFORM_BLOCK_SIZE			BW_FIELD(58, 3)
+#define LOG2_DIFF_MAX_MIN_LUMA_TRANSFORM_BLOCK_SIZE	BW_FIELD(61, 2)
+#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTER		BW_FIELD(63, 3)
+#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTRA		BW_FIELD(66, 3)
+#define SCALING_LIST_ENABLED_FLAG			BW_FIELD(69, 1)
+#define AMP_ENABLED_FLAG				BW_FIELD(70, 1)
+#define SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG		BW_FIELD(71, 1)
+#define PCM_ENABLED_FLAG				BW_FIELD(72, 1)
+#define PCM_SAMPLE_BIT_DEPTH_LUMA			BW_FIELD(73, 4)
+#define PCM_SAMPLE_BIT_DEPTH_CHROMA			BW_FIELD(77, 4)
+#define PCM_LOOP_FILTER_DISABLED_FLAG			BW_FIELD(81, 1)
+#define LOG2_DIFF_MAX_MIN_PCM_LUMA_CODING_BLOCK_SIZE	BW_FIELD(82, 3)
+#define LOG2_MIN_PCM_LUMA_CODING_BLOCK_SIZE		BW_FIELD(85, 3)
+#define NUM_SHORT_TERM_REF_PIC_SETS			BW_FIELD(88, 7)
+#define LONG_TERM_REF_PICS_PRESENT_FLAG			BW_FIELD(95, 1)
+#define NUM_LONG_TERM_REF_PICS_SPS			BW_FIELD(96, 6)
+#define SPS_TEMPORAL_MVP_ENABLED_FLAG			BW_FIELD(102, 1)
+#define STRONG_INTRA_SMOOTHING_ENABLED_FLAG		BW_FIELD(103, 1)
+#define SPS_MAX_DEC_PIC_BUFFERING_MINUS1		BW_FIELD(111, 4)
+#define SEPARATE_COLOUR_PLANE_FLAG			BW_FIELD(115, 1)
+#define HIGH_PRECISION_OFFSETS_ENABLED_FLAG		BW_FIELD(116, 1)
+#define PERSISTENT_RICE_ADAPTATION_ENABLED_FLAG		BW_FIELD(117, 1)
+
+/* PPS */
+#define PIC_PARAMETER_SET_ID				BW_FIELD(118, 6)
+#define PPS_SEQ_PARAMETER_SET_ID			BW_FIELD(124, 4)
+#define DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG		BW_FIELD(128, 1)
+#define OUTPUT_FLAG_PRESENT_FLAG			BW_FIELD(129, 1)
+#define NUM_EXTRA_SLICE_HEADER_BITS			BW_FIELD(130, 13)
+#define SIGN_DATA_HIDING_ENABLED_FLAG			BW_FIELD(143, 1)
+#define CABAC_INIT_PRESENT_FLAG				BW_FIELD(144, 1)
+#define NUM_REF_IDX_L0_DEFAULT_ACTIVE			BW_FIELD(145, 4)
+#define NUM_REF_IDX_L1_DEFAULT_ACTIVE			BW_FIELD(149, 4)
+#define INIT_QP_MINUS26					BW_FIELD(153, 7)
+#define CONSTRAINED_INTRA_PRED_FLAG			BW_FIELD(160, 1)
+#define TRANSFORM_SKIP_ENABLED_FLAG			BW_FIELD(161, 1)
+#define CU_QP_DELTA_ENABLED_FLAG			BW_FIELD(162, 1)
+#define LOG2_MIN_CU_QP_DELTA_SIZE			BW_FIELD(163, 3)
+#define PPS_CB_QP_OFFSET				BW_FIELD(166, 5)
+#define PPS_CR_QP_OFFSET				BW_FIELD(171, 5)
+#define PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG	BW_FIELD(176, 1)
+#define WEIGHTED_PRED_FLAG				BW_FIELD(177, 1)
+#define WEIGHTED_BIPRED_FLAG				BW_FIELD(178, 1)
+#define TRANSQUANT_BYPASS_ENABLED_FLAG			BW_FIELD(179, 1)
+#define TILES_ENABLED_FLAG				BW_FIELD(180, 1)
+#define ENTROPY_CODING_SYNC_ENABLED_FLAG		BW_FIELD(181, 1)
+#define PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED_FLAG	BW_FIELD(182, 1)
+#define LOOP_FILTER_ACROSS_TILES_ENABLED_FLAG		BW_FIELD(183, 1)
+#define DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG		BW_FIELD(184, 1)
+#define PPS_DEBLOCKING_FILTER_DISABLED_FLAG		BW_FIELD(185, 1)
+#define PPS_BETA_OFFSET_DIV2				BW_FIELD(186, 4)
+#define PPS_TC_OFFSET_DIV2				BW_FIELD(190, 4)
+#define LISTS_MODIFICATION_PRESENT_FLAG			BW_FIELD(194, 1)
+#define LOG2_PARALLEL_MERGE_LEVEL			BW_FIELD(195, 3)
+#define SLICE_SEGMENT_HEADER_EXTENSION_PRESENT_FLAG	BW_FIELD(198, 1)
+
+/* pps extensions */
+#define LOG2_MAX_TRANSFORM_SKIP_BLOCK_SIZE		BW_FIELD(202, 2)
+#define CROSS_COMPONENT_PREDICTION_ENABLED_FLAG		BW_FIELD(204, 1)
+#define CHROMA_QP_OFFSET_LIST_ENABLED_FLAG		BW_FIELD(205, 1)
+#define LOG2_MIN_CU_CHROMA_QP_DELTA_SIZE		BW_FIELD(206, 3)
+#define CB_QP_OFFSET_LIST(i)				BW_FIELD(209 + (i) * 5, 5) // i: 0-5
+#define CB_CR_OFFSET_LIST(i)				BW_FIELD(239 + (i) * 5, 5) // i: 0-5
+#define CHROMA_QP_OFFSET_LIST_LEN_MINUS1		BW_FIELD(269, 3)
+
+/* mvc0 && mvc1 */
+#define MVC_FF						BW_FIELD(272, 16)
+#define MVC_00						BW_FIELD(288, 9)
+
+/* poc info */
+#define RESERVED2					BW_FIELD(297, 3)
+#define CURRENT_POC					BW_FIELD(300, 32)
+#define REF_PIC_POC(i)					BW_FIELD(332 + (i) * 32, 32) // i: 0-14
+#define RESERVED3					BW_FIELD(812, 32)
+#define REF_IS_VALID(i)					BW_FIELD(844 + (i), 1) // i: 0-14
+#define RESERVED4					BW_FIELD(859, 1)
+
+/* tile info*/
+#define NUM_TILE_COLUMNS				BW_FIELD(860, 5)
+#define NUM_TILE_ROWS					BW_FIELD(865, 5)
+#define COLUMN_WIDTH(i)					BW_FIELD(870 + (i) * 12, 12) // i: 0-19
+#define ROW_HEIGHT(i)					BW_FIELD(1110 + (i) * 12, 12) // i: 0-21
+
+#define HEVC_SPS_SIZE					ALIGN(1110 + 22 * 12, 256)
 
 struct rkvdec_hevc_sps_pps {
-	// SPS
-	u16 video_parameters_set_id			: 4;
-	u16 seq_parameters_set_id_sps			: 4;
-	u16 chroma_format_idc				: 2;
-	u16 width					: 16;
-	u16 height					: 16;
-	u16 bit_depth_luma				: 3;
-	u16 bit_depth_chroma				: 3;
-	u16 max_pic_order_count_lsb			: 5;
-	u16 diff_max_min_luma_coding_block_size		: 2;
-	u16 min_luma_coding_block_size			: 3;
-	u16 min_transform_block_size			: 3;
-	u16 diff_max_min_transform_block_size		: 2;
-	u16 max_transform_hierarchy_depth_inter		: 3;
-	u16 max_transform_hierarchy_depth_intra		: 3;
-	u16 scaling_list_enabled_flag			: 1;
-	u16 amp_enabled_flag				: 1;
-	u16 sample_adaptive_offset_enabled_flag		: 1;
-	u16 pcm_enabled_flag				: 1;
-	u16 pcm_sample_bit_depth_luma			: 4;
-	u16 pcm_sample_bit_depth_chroma			: 4;
-	u16 pcm_loop_filter_disabled_flag		: 1;
-	u16 diff_max_min_pcm_luma_coding_block_size	: 3;
-	u16 min_pcm_luma_coding_block_size		: 3;
-	u16 num_short_term_ref_pic_sets			: 7;
-	u16 long_term_ref_pics_present_flag		: 1;
-	u16 num_long_term_ref_pics_sps			: 6;
-	u16 sps_temporal_mvp_enabled_flag		: 1;
-	u16 strong_intra_smoothing_enabled_flag		: 1;
-	u16 reserved0					: 7;
-	u16 sps_max_dec_pic_buffering_minus1		: 4;
-	u16 separate_colour_plane_flag			: 1;
-	u16 high_precision_offsets_enabled_flag		: 1;
-	u16 persistent_rice_adaptation_enabled_flag	: 1;
-
-	// PPS
-	u16 picture_parameters_set_id			: 6;
-	u16 seq_parameters_set_id_pps			: 4;
-	u16 dependent_slice_segments_enabled_flag	: 1;
-	u16 output_flag_present_flag			: 1;
-	u16 num_extra_slice_header_bits			: 13;
-	u16 sign_data_hiding_enabled_flag		: 1;
-	u16 cabac_init_present_flag			: 1;
-	u16 num_ref_idx_l0_default_active		: 4;
-	u16 num_ref_idx_l1_default_active		: 4;
-	u16 init_qp_minus26				: 7;
-	u16 constrained_intra_pred_flag			: 1;
-	u16 transform_skip_enabled_flag			: 1;
-	u16 cu_qp_delta_enabled_flag			: 1;
-	u16 log2_min_cb_size				: 3;
-	u16 pps_cb_qp_offset				: 5;
-	u16 pps_cr_qp_offset				: 5;
-	u16 pps_slice_chroma_qp_offsets_present_flag	: 1;
-	u16 weighted_pred_flag				: 1;
-	u16 weighted_bipred_flag			: 1;
-	u16 transquant_bypass_enabled_flag		: 1;
-	u16 tiles_enabled_flag				: 1;
-	u16 entropy_coding_sync_enabled_flag		: 1;
-	u16 pps_loop_filter_across_slices_enabled_flag	: 1;
-	u16 loop_filter_across_tiles_enabled_flag	: 1;
-	u16 deblocking_filter_override_enabled_flag	: 1;
-	u16 pps_deblocking_filter_disabled_flag		: 1;
-	u16 pps_beta_offset_div2			: 4;
-	u16 pps_tc_offset_div2				: 4;
-	u16 lists_modification_present_flag		: 1;
-	u16 log2_parallel_merge_level			: 3;
-	u16 slice_segment_header_extension_present_flag	: 1;
-	u16 reserved1					: 3;
-
-	// pps extensions
-	u16 log2_max_transform_skip_block_size		: 2;
-	u16 cross_component_prediction_enabled_flag	: 1;
-	u16 chroma_qp_offset_list_enabled_flag		: 1;
-	u16 log2_min_cu_chroma_qp_delta_size		: 3;
-	u16 cb_qp_offset_list0				: 5;
-	u16 cb_qp_offset_list1				: 5;
-	u16 cb_qp_offset_list2				: 5;
-	u16 cb_qp_offset_list3				: 5;
-	u16 cb_qp_offset_list4				: 5;
-	u16 cb_qp_offset_list5				: 5;
-	u16 cb_cr_offset_list0				: 5;
-	u16 cb_cr_offset_list1				: 5;
-	u16 cb_cr_offset_list2				: 5;
-	u16 cb_cr_offset_list3				: 5;
-	u16 cb_cr_offset_list4				: 5;
-	u16 cb_cr_offset_list5				: 5;
-	u16 chroma_qp_offset_list_len_minus1		: 3;
-
-	/* mvc0 && mvc1 */
-	u16 mvc_ff					: 16;
-	u16 mvc_00					: 9;
-
-	/* poc info */
-	u16 reserved2					: 3;
-	u32 current_poc					: 32;
-	u32 ref_pic_poc0				: 32;
-	u32 ref_pic_poc1				: 32;
-	u32 ref_pic_poc2				: 32;
-	u32 ref_pic_poc3				: 32;
-	u32 ref_pic_poc4				: 32;
-	u32 ref_pic_poc5				: 32;
-	u32 ref_pic_poc6				: 32;
-	u32 ref_pic_poc7				: 32;
-	u32 ref_pic_poc8				: 32;
-	u32 ref_pic_poc9				: 32;
-	u32 ref_pic_poc10				: 32;
-	u32 ref_pic_poc11				: 32;
-	u32 ref_pic_poc12				: 32;
-	u32 ref_pic_poc13				: 32;
-	u32 ref_pic_poc14				: 32;
-	u32 reserved3					: 32;
-	u32 ref_is_valid				: 15;
-	u32 reserved4					: 1;
-
-	/* tile info*/
-	u16 num_tile_columns				: 5;
-	u16 num_tile_rows				: 5;
-	u32 column_width0				: 24;
-	u32 column_width1				: 24;
-	u32 column_width2				: 24;
-	u32 column_width3				: 24;
-	u32 column_width4				: 24;
-	u32 column_width5				: 24;
-	u32 column_width6				: 24;
-	u32 column_width7				: 24;
-	u32 column_width8				: 24;
-	u32 column_width9				: 24;
-	u32 row_height0					: 24;
-	u32 row_height1					: 24;
-	u32 row_height2					: 24;
-	u32 row_height3					: 24;
-	u32 row_height4					: 24;
-	u32 row_height5					: 24;
-	u32 row_height6					: 24;
-	u32 row_height7					: 24;
-	u32 row_height8					: 24;
-	u32 row_height9					: 24;
-	u32 row_height10				: 24;
-	u32 reserved5					: 2;
-	u32 padding;
-} __packed;
+	u32 info[HEVC_SPS_SIZE / 8 / 4];
+};
 
 struct rkvdec_hevc_priv_tbl {
 	struct rkvdec_hevc_sps_pps param_set;
@@ -171,51 +128,6 @@ struct rkvdec_hevc_ctx {
 	struct vdpu383_regs_h26x		regs;
 };
 
-static void set_column_row(struct rkvdec_hevc_sps_pps *hw_ps, u16 *column, u16 *row)
-{
-	hw_ps->column_width0 = column[0] | (column[1] << 12);
-	hw_ps->row_height0 = row[0] | (row[1] << 12);
-	hw_ps->column_width1 = column[2] | (column[3] << 12);
-	hw_ps->row_height1 = row[2] | (row[3] << 12);
-	hw_ps->column_width2 = column[4] | (column[5] << 12);
-	hw_ps->row_height2 = row[4] | (row[5] << 12);
-	hw_ps->column_width3 = column[6] | (column[7] << 12);
-	hw_ps->row_height3 = row[6] | (row[7] << 12);
-	hw_ps->column_width4 = column[8] | (column[9] << 12);
-	hw_ps->row_height4 = row[8] | (row[9] << 12);
-	hw_ps->column_width5 = column[10] | (column[11] << 12);
-	hw_ps->row_height5 = row[10] | (row[11] << 12);
-	hw_ps->column_width6 = column[12] | (column[13] << 12);
-	hw_ps->row_height6 = row[12] | (row[13] << 12);
-	hw_ps->column_width7 = column[14] | (column[15] << 12);
-	hw_ps->row_height7 = row[14] | (row[15] << 12);
-	hw_ps->column_width8 = column[16] | (column[17] << 12);
-	hw_ps->row_height8 = row[16] | (row[17] << 12);
-	hw_ps->column_width9 = column[18] | (column[19] << 12);
-	hw_ps->row_height9 = row[18] | (row[19] << 12);
-
-	hw_ps->row_height10 = row[20] | (row[21] << 12);
-}
-
-static void set_pps_ref_pic_poc(struct rkvdec_hevc_sps_pps *hw_ps, const struct v4l2_hevc_dpb_entry *dpb)
-{
-	hw_ps->ref_pic_poc0 = dpb[0].pic_order_cnt_val;
-	hw_ps->ref_pic_poc1 = dpb[1].pic_order_cnt_val;
-	hw_ps->ref_pic_poc2 = dpb[2].pic_order_cnt_val;
-	hw_ps->ref_pic_poc3 = dpb[3].pic_order_cnt_val;
-	hw_ps->ref_pic_poc4 = dpb[4].pic_order_cnt_val;
-	hw_ps->ref_pic_poc5 = dpb[5].pic_order_cnt_val;
-	hw_ps->ref_pic_poc6 = dpb[6].pic_order_cnt_val;
-	hw_ps->ref_pic_poc7 = dpb[7].pic_order_cnt_val;
-	hw_ps->ref_pic_poc8 = dpb[8].pic_order_cnt_val;
-	hw_ps->ref_pic_poc9 = dpb[9].pic_order_cnt_val;
-	hw_ps->ref_pic_poc10 = dpb[10].pic_order_cnt_val;
-	hw_ps->ref_pic_poc11 = dpb[11].pic_order_cnt_val;
-	hw_ps->ref_pic_poc12 = dpb[12].pic_order_cnt_val;
-	hw_ps->ref_pic_poc13 = dpb[13].pic_order_cnt_val;
-	hw_ps->ref_pic_poc14 = dpb[14].pic_order_cnt_val;
-}
-
 static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 			    struct rkvdec_hevc_run *run)
 {
@@ -245,104 +157,130 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 	memset(hw_ps, 0, sizeof(*hw_ps));
 
 	/* write sps */
-	hw_ps->video_parameters_set_id = sps->video_parameter_set_id;
-	hw_ps->seq_parameters_set_id_sps = sps->seq_parameter_set_id;
-	hw_ps->chroma_format_idc = sps->chroma_format_idc;
+	rkvdec_set_bw_field(hw_ps->info, VIDEO_PARAMETER_SET_ID, sps->video_parameter_set_id);
+	rkvdec_set_bw_field(hw_ps->info, SEQ_PARAMETER_SET_ID, sps->seq_parameter_set_id);
+	rkvdec_set_bw_field(hw_ps->info, CHROMA_FORMAT_IDC, sps->chroma_format_idc);
 
 	log2_min_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
 	width = sps->pic_width_in_luma_samples;
 	height = sps->pic_height_in_luma_samples;
-	hw_ps->width = width;
-	hw_ps->height = height;
-	hw_ps->bit_depth_luma = sps->bit_depth_luma_minus8 + 8;
-	hw_ps->bit_depth_chroma = sps->bit_depth_chroma_minus8 + 8;
-	hw_ps->max_pic_order_count_lsb = sps->log2_max_pic_order_cnt_lsb_minus4 + 4;
-	hw_ps->diff_max_min_luma_coding_block_size = sps->log2_diff_max_min_luma_coding_block_size;
-	hw_ps->min_luma_coding_block_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
-	hw_ps->min_transform_block_size = sps->log2_min_luma_transform_block_size_minus2 + 2;
-	hw_ps->diff_max_min_transform_block_size =
-		sps->log2_diff_max_min_luma_transform_block_size;
-	hw_ps->max_transform_hierarchy_depth_inter = sps->max_transform_hierarchy_depth_inter;
-	hw_ps->max_transform_hierarchy_depth_intra = sps->max_transform_hierarchy_depth_intra;
-	hw_ps->scaling_list_enabled_flag =
-		!!(sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED);
-	hw_ps->amp_enabled_flag = !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED);
-	hw_ps->sample_adaptive_offset_enabled_flag =
-		!!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET);
+
+	rkvdec_set_bw_field(hw_ps->info, PIC_WIDTH_IN_LUMA_SAMPLES, width);
+	rkvdec_set_bw_field(hw_ps->info, PIC_HEIGHT_IN_LUMA_SAMPLES, height);
+	rkvdec_set_bw_field(hw_ps->info, BIT_DEPTH_LUMA, sps->bit_depth_luma_minus8 + 8);
+	rkvdec_set_bw_field(hw_ps->info, BIT_DEPTH_CHROMA, sps->bit_depth_chroma_minus8 + 8);
+	rkvdec_set_bw_field(hw_ps->info, LOG2_MAX_PIC_ORDER_CNT_LSB,
+			    sps->log2_max_pic_order_cnt_lsb_minus4 + 4);
+	rkvdec_set_bw_field(hw_ps->info, LOG2_DIFF_MAX_MIN_LUMA_CODING_BLOCK_SIZE,
+			    sps->log2_diff_max_min_luma_coding_block_size);
+	rkvdec_set_bw_field(hw_ps->info, LOG2_MIN_LUMA_CODING_BLOCK_SIZE,
+			    sps->log2_min_luma_coding_block_size_minus3 + 3);
+	rkvdec_set_bw_field(hw_ps->info, LOG2_MIN_TRANSFORM_BLOCK_SIZE,
+			    sps->log2_min_luma_transform_block_size_minus2 + 2);
+	rkvdec_set_bw_field(hw_ps->info, LOG2_DIFF_MAX_MIN_LUMA_TRANSFORM_BLOCK_SIZE,
+			    sps->log2_diff_max_min_luma_transform_block_size);
+	rkvdec_set_bw_field(hw_ps->info, MAX_TRANSFORM_HIERARCHY_DEPTH_INTER,
+			    sps->max_transform_hierarchy_depth_inter);
+	rkvdec_set_bw_field(hw_ps->info, MAX_TRANSFORM_HIERARCHY_DEPTH_INTRA,
+			    sps->max_transform_hierarchy_depth_intra);
+	rkvdec_set_bw_field(hw_ps->info, SCALING_LIST_ENABLED_FLAG,
+			    !!(sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, AMP_ENABLED_FLAG,
+			    !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG,
+			    !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
 
 	pcm_enabled = !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED);
-	hw_ps->pcm_enabled_flag = pcm_enabled;
-	hw_ps->pcm_sample_bit_depth_luma =
-		pcm_enabled ? sps->pcm_sample_bit_depth_luma_minus1 + 1 : 0;
-	hw_ps->pcm_sample_bit_depth_chroma =
-		pcm_enabled ? sps->pcm_sample_bit_depth_chroma_minus1 + 1 : 0;
-	hw_ps->pcm_loop_filter_disabled_flag =
-		!!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED);
-	hw_ps->diff_max_min_pcm_luma_coding_block_size =
-		sps->log2_diff_max_min_pcm_luma_coding_block_size;
-	hw_ps->min_pcm_luma_coding_block_size =
-		pcm_enabled ? sps->log2_min_pcm_luma_coding_block_size_minus3 + 3 : 0;
-	hw_ps->num_short_term_ref_pic_sets = sps->num_short_term_ref_pic_sets;
-	hw_ps->long_term_ref_pics_present_flag =
-		!!(sps->flags & V4L2_HEVC_SPS_FLAG_LONG_TERM_REF_PICS_PRESENT);
-	hw_ps->num_long_term_ref_pics_sps = sps->num_long_term_ref_pics_sps;
-	hw_ps->sps_temporal_mvp_enabled_flag =
-		!!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED);
-	hw_ps->strong_intra_smoothing_enabled_flag =
-		!!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED);
-	hw_ps->sps_max_dec_pic_buffering_minus1 = sps->sps_max_dec_pic_buffering_minus1;
+	rkvdec_set_bw_field(hw_ps->info, PCM_ENABLED_FLAG, pcm_enabled);
+	rkvdec_set_bw_field(hw_ps->info, PCM_SAMPLE_BIT_DEPTH_LUMA,
+			    pcm_enabled ? sps->pcm_sample_bit_depth_luma_minus1 + 1 : 0);
+	rkvdec_set_bw_field(hw_ps->info, PCM_SAMPLE_BIT_DEPTH_CHROMA,
+			    pcm_enabled ? sps->pcm_sample_bit_depth_chroma_minus1 + 1 : 0);
+	rkvdec_set_bw_field(hw_ps->info, PCM_LOOP_FILTER_DISABLED_FLAG,
+			    !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
+	rkvdec_set_bw_field(hw_ps->info, LOG2_DIFF_MAX_MIN_PCM_LUMA_CODING_BLOCK_SIZE,
+			    sps->log2_diff_max_min_pcm_luma_coding_block_size);
+	rkvdec_set_bw_field(hw_ps->info, LOG2_MIN_PCM_LUMA_CODING_BLOCK_SIZE,
+			    pcm_enabled ? sps->log2_min_pcm_luma_coding_block_size_minus3 + 3 : 0);
+	rkvdec_set_bw_field(hw_ps->info, NUM_SHORT_TERM_REF_PIC_SETS,
+			    sps->num_short_term_ref_pic_sets);
+	rkvdec_set_bw_field(hw_ps->info, LONG_TERM_REF_PICS_PRESENT_FLAG,
+			    !!(sps->flags & V4L2_HEVC_SPS_FLAG_LONG_TERM_REF_PICS_PRESENT));
+	rkvdec_set_bw_field(hw_ps->info, NUM_LONG_TERM_REF_PICS_SPS,
+			    sps->num_long_term_ref_pics_sps);
+	rkvdec_set_bw_field(hw_ps->info, SPS_TEMPORAL_MVP_ENABLED_FLAG,
+			    !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, STRONG_INTRA_SMOOTHING_ENABLED_FLAG,
+			    !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, SPS_MAX_DEC_PIC_BUFFERING_MINUS1,
+			    sps->sps_max_dec_pic_buffering_minus1);
 
 	/* write pps */
-	hw_ps->picture_parameters_set_id = pps->pic_parameter_set_id;
-	hw_ps->seq_parameters_set_id_pps = sps->seq_parameter_set_id;
-	hw_ps->dependent_slice_segments_enabled_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT_ENABLED);
-	hw_ps->output_flag_present_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_OUTPUT_FLAG_PRESENT);
-	hw_ps->num_extra_slice_header_bits = pps->num_extra_slice_header_bits;
-	hw_ps->sign_data_hiding_enabled_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED);
-	hw_ps->cabac_init_present_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT);
-	hw_ps->num_ref_idx_l0_default_active = pps->num_ref_idx_l0_default_active_minus1 + 1;
-	hw_ps->num_ref_idx_l1_default_active = pps->num_ref_idx_l1_default_active_minus1 + 1;
-	hw_ps->init_qp_minus26 = pps->init_qp_minus26;
-	hw_ps->constrained_intra_pred_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED);
-	hw_ps->transform_skip_enabled_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED);
-	hw_ps->cu_qp_delta_enabled_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED);
-	hw_ps->log2_min_cb_size = log2_min_cb_size +
-				  sps->log2_diff_max_min_luma_coding_block_size -
-				  pps->diff_cu_qp_delta_depth;
-	hw_ps->pps_cb_qp_offset = pps->pps_cb_qp_offset;
-	hw_ps->pps_cr_qp_offset = pps->pps_cr_qp_offset;
-	hw_ps->pps_slice_chroma_qp_offsets_present_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT);
-	hw_ps->weighted_pred_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED);
-	hw_ps->weighted_bipred_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED);
-	hw_ps->transquant_bypass_enabled_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED);
+	rkvdec_set_bw_field(hw_ps->info, PIC_PARAMETER_SET_ID, pps->pic_parameter_set_id);
+	rkvdec_set_bw_field(hw_ps->info, SEQ_PARAMETER_SET_ID, sps->seq_parameter_set_id);
+	rkvdec_set_bw_field(hw_ps->info, DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, OUTPUT_FLAG_PRESENT_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_OUTPUT_FLAG_PRESENT));
+	rkvdec_set_bw_field(hw_ps->info, NUM_EXTRA_SLICE_HEADER_BITS,
+			    pps->num_extra_slice_header_bits);
+	rkvdec_set_bw_field(hw_ps->info, SIGN_DATA_HIDING_ENABLED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, CABAC_INIT_PRESENT_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
+	rkvdec_set_bw_field(hw_ps->info, NUM_REF_IDX_L0_DEFAULT_ACTIVE,
+			    pps->num_ref_idx_l0_default_active_minus1 + 1);
+	rkvdec_set_bw_field(hw_ps->info, NUM_REF_IDX_L1_DEFAULT_ACTIVE,
+			    pps->num_ref_idx_l1_default_active_minus1 + 1);
+	rkvdec_set_bw_field(hw_ps->info, INIT_QP_MINUS26, pps->init_qp_minus26);
+	rkvdec_set_bw_field(hw_ps->info, CONSTRAINED_INTRA_PRED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
+	rkvdec_set_bw_field(hw_ps->info, TRANSFORM_SKIP_ENABLED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, CU_QP_DELTA_ENABLED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, LOG2_MIN_CU_QP_DELTA_SIZE, log2_min_cb_size +
+			    sps->log2_diff_max_min_luma_coding_block_size -
+			    pps->diff_cu_qp_delta_depth);
+	rkvdec_set_bw_field(hw_ps->info, PPS_CB_QP_OFFSET, pps->pps_cb_qp_offset);
+	rkvdec_set_bw_field(hw_ps->info, PPS_CR_QP_OFFSET, pps->pps_cr_qp_offset);
+	rkvdec_set_bw_field(hw_ps->info, PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG,
+			    !!(pps->flags &
+			       V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
+	rkvdec_set_bw_field(hw_ps->info, WEIGHTED_PRED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
+	rkvdec_set_bw_field(hw_ps->info, WEIGHTED_BIPRED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
+	rkvdec_set_bw_field(hw_ps->info, TRANSQUANT_BYPASS_ENABLED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
 	tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
-	hw_ps->tiles_enabled_flag = tiles_enabled;
-	hw_ps->entropy_coding_sync_enabled_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED);
-	hw_ps->pps_loop_filter_across_slices_enabled_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED);
-	hw_ps->loop_filter_across_tiles_enabled_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED);
-	hw_ps->deblocking_filter_override_enabled_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED);
-	hw_ps->pps_deblocking_filter_disabled_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER);
-	hw_ps->pps_beta_offset_div2 = pps->pps_beta_offset_div2;
-	hw_ps->pps_tc_offset_div2 = pps->pps_tc_offset_div2;
-	hw_ps->lists_modification_present_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT);
-	hw_ps->log2_parallel_merge_level = pps->log2_parallel_merge_level_minus2 + 2;
-	hw_ps->slice_segment_header_extension_present_flag =
-		!!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT);
-	hw_ps->num_tile_columns = tiles_enabled ? pps->num_tile_columns_minus1 + 1 : 1;
-	hw_ps->num_tile_rows = tiles_enabled ? pps->num_tile_rows_minus1 + 1 : 1;
-	hw_ps->mvc_ff = 0xffff;
+	rkvdec_set_bw_field(hw_ps->info, TILES_ENABLED_FLAG, tiles_enabled);
+	rkvdec_set_bw_field(hw_ps->info, ENTROPY_CODING_SYNC_ENABLED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED_FLAG,
+			    !!(pps->flags &
+			       V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, LOOP_FILTER_ACROSS_TILES_ENABLED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG,
+			    !!(pps->flags &
+			       V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
+	rkvdec_set_bw_field(hw_ps->info, PPS_DEBLOCKING_FILTER_DISABLED_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
+	rkvdec_set_bw_field(hw_ps->info, PPS_BETA_OFFSET_DIV2, pps->pps_beta_offset_div2);
+	rkvdec_set_bw_field(hw_ps->info, PPS_TC_OFFSET_DIV2, pps->pps_tc_offset_div2);
+	rkvdec_set_bw_field(hw_ps->info, LISTS_MODIFICATION_PRESENT_FLAG,
+			    !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
+	rkvdec_set_bw_field(hw_ps->info, LOG2_PARALLEL_MERGE_LEVEL,
+			    pps->log2_parallel_merge_level_minus2 + 2);
+	rkvdec_set_bw_field(hw_ps->info, SLICE_SEGMENT_HEADER_EXTENSION_PRESENT_FLAG,
+			    !!(pps->flags &
+			       V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
+	rkvdec_set_bw_field(hw_ps->info, NUM_TILE_COLUMNS,
+			    tiles_enabled ? pps->num_tile_columns_minus1 + 1 : 1);
+	rkvdec_set_bw_field(hw_ps->info, NUM_TILE_ROWS,
+			    tiles_enabled ? pps->num_tile_rows_minus1 + 1 : 1);
+	rkvdec_set_bw_field(hw_ps->info, MVC_FF, 0xffff);
 
 	// Setup tiles information
 	memset(column_width, 0, sizeof(column_width));
@@ -367,15 +305,19 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
 		row_height[0] = (height + max_cu_width - 1) / max_cu_width;
 	}
 
-	set_column_row(hw_ps, column_width, row_height);
+	for (i = 0; i < 20; i++)
+		rkvdec_set_bw_field(hw_ps->info, COLUMN_WIDTH(i), column_width[i]);
+	for (i = 0; i < 22; i++)
+		rkvdec_set_bw_field(hw_ps->info, ROW_HEIGHT(i), row_height[i]);
 
 	// Setup POC information
-	hw_ps->current_poc = dec_params->pic_order_cnt_val;
+	rkvdec_set_bw_field(hw_ps->info, CURRENT_POC, dec_params->pic_order_cnt_val);
 
-	set_pps_ref_pic_poc(hw_ps, dec_params->dpb);
 	for (i = 0; i < ARRAY_SIZE(dec_params->dpb); i++) {
-		u32 valid = !!(dec_params->num_active_dpb_entries > i);
-		hw_ps->ref_is_valid |= valid << i;
+		rkvdec_set_bw_field(hw_ps->info, REF_IS_VALID(i),
+				    !!(dec_params->num_active_dpb_entries > i));
+		rkvdec_set_bw_field(hw_ps->info, REF_PIC_POC(i),
+				    dec_params->dpb[i].pic_order_cnt_val);
 	}
 }
 

-- 
2.53.0
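
For readers unfamiliar with the approach, the idea behind a bitwriter can be sketched as below. This is a minimal standalone illustration, not the driver's code: `struct bw_field` and `bw_set_field()` are hypothetical names, and the real `rkvdec_set_bw_field()` and its field macros are defined elsewhere in the series and may differ. The sketch assumes fields of at most 32 bits, stored in an array of 32-bit words with the least significant bit first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field descriptor: bit offset into the buffer, width in bits. */
struct bw_field {
	unsigned int offset;
	unsigned int len;
};

/*
 * Write the low `len` bits of `value` at bit position `offset` of an array
 * of 32-bit words. Previous contents of the field are cleared first, and a
 * field may straddle two consecutive words.
 */
static void bw_set_field(uint32_t *buf, struct bw_field f, uint32_t value)
{
	unsigned int word = f.offset / 32;
	unsigned int bit = f.offset % 32;
	uint64_t mask, v;

	assert(f.len >= 1 && f.len <= 32);

	/* Build the mask and value in 64 bits so a spill into the next word is kept. */
	mask = ((1ull << f.len) - 1) << bit;
	v = ((uint64_t)value << bit) & mask;

	buf[word] = (buf[word] & ~(uint32_t)mask) | (uint32_t)v;

	/* Only touch the following word when the field actually crosses into it. */
	if (bit + f.len > 32)
		buf[word + 1] = (buf[word + 1] & ~(uint32_t)(mask >> 32)) |
				(uint32_t)(v >> 32);
}
```

With this layout, an 8-bit field at bit offset 28 is split between the top nibble of word 0 and the bottom nibble of word 1. The compiler only ever sees aligned 32-bit accesses, rather than a packed struct with a long run of unaligned bitfields.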




* Re: [PATCH] arm64/mm: Describe TTBR1_BADDR_4852_OFFSET
From: Catalin Marinas @ 2026-03-27 19:20 UTC (permalink / raw)
  To: linux-arm-kernel, Anshuman Khandual
  Cc: Will Deacon, Mark Rutland, Ryan Roberts, linux-kernel
In-Reply-To: <20260225064028.1525192-1-anshuman.khandual@arm.com>

On Wed, 25 Feb 2026 06:40:28 +0000, Anshuman Khandual wrote:
> TTBR1_BADDR_4852_OFFSET is a constant offset which gets added to the kernel
> page table physical address for TTBR1_EL1 when the kernel is built for 52-bit
> VA but is found to be running on a 48-bit VA capable system. However, there
> is no explanation of how the macro is computed.
> 
> Describe the TTBR1_BADDR_4852_OFFSET computation in detail by deriving it
> from all the parameters involved, thus improving clarity and readability.
> 
> [...]

Applied to arm64 (for-next/ttbr-macros-cleanup), thanks!

[1/1] arm64/mm: Describe TTBR1_BADDR_4852_OFFSET
      (no commit info, script got confused for some reason)

-- 
Catalin



* Re: [PATCH V2 0/2] arm64/mm: Drop TTBR_CNP_BIT and TTBR_ASID_MASK
From: Catalin Marinas @ 2026-03-27 19:20 UTC (permalink / raw)
  To: linux-arm-kernel, Anshuman Khandual
  Cc: Will Deacon, Ryan Roberts, Mark Rutland, Marc Zyngier,
	Oliver Upton, linux-kernel, kvmarm
In-Reply-To: <20260302064437.2791034-1-anshuman.khandual@arm.com>

On Mon, 02 Mar 2026 06:44:35 +0000, Anshuman Khandual wrote:
> Directly use the existing tools sysreg format field macros TTBRx_EL1_CNP_BIT/
> TTBRx_EL1_ASID_MASK, and drop the now redundant custom macros
> TTBR_CNP_BIT and TTBR_ASID_MASK. With this change in place, there are no
> more TTBR_EL1 register based custom macros left in the tree.
> 
> This was discussed and suggested earlier.
> 
> [...]

Applied to arm64 (for-next/ttbr-macros-cleanup), thanks!

[1/2] arm64/mm: Directly use TTBRx_EL1_ASID_MASK
      (no commit info)
[2/2] arm64/mm: Directly use TTBRx_EL1_CnP
      (no commit info)

-- 
Catalin



* Re: (subset) [PATCH v17 0/8] support FEAT_LSUI
From: Catalin Marinas @ 2026-03-27 19:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, kvmarm, kvm, linux-kselftest,
	Yeoreum Yun
  Cc: will, maz, oupton, miko.lenczewski, kevin.brodsky, broonie, ardb,
	suzuki.poulose, lpieralisi, joey.gouly, yuzenghui
In-Reply-To: <20260314175133.1084528-1-yeoreum.yun@arm.com>

On Sat, 14 Mar 2026 17:51:25 +0000, Yeoreum Yun wrote:
> Since Armv9.6, FEAT_LSUI supplies load/store instructions that let
> privileged code access user memory without clearing the
> PSTATE.PAN bit.
> 
> This patchset adds support for FEAT_LSUI and applies it mainly in
> futex atomic operations and others.
> 
> [...]

Applied to arm64 (for-next/feat_lsui), thanks!

[6/8] arm64: armv8_deprecated: disable swp emulation when FEAT_LSUI present
      (commit e223258ed8a6)

-- 
Catalin



* Re: [GIT PULL] arm_mpam: Add KVM/arm64 and resctrl glue code
From: Catalin Marinas @ 2026-03-27 19:22 UTC (permalink / raw)
  To: James Morse
  Cc: will@kernel.org, Ben Horgan, Marc Zyngier, Oliver Upton,
	linux-arm-kernel@lists.infradead.org, Dave P Martin,
	Shanker Donthineni, Zeng Heng
In-Reply-To: <01f76011-f3c2-4dcb-b3bc-37c7d4de342e@arm.com>

On Fri, Mar 27, 2026 at 04:19:26PM +0000, James Morse wrote:
> The following changes since commit 1f318b96cc84d7c2ab792fcc0bfd42a7ca890681:
> 
>   Linux 7.0-rc3 (2026-03-08 16:56:54 -0700)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/glue/v7.0-rc3
> 
> for you to fetch changes up to 4ce0a2ccc0358f3f746fa50815a599f861fd5d68:
> 
>   arm64: mpam: Add initial MPAM documentation (2026-03-27 15:32:52 +0000)

Pulled into arm64 for-next/mpam. Thanks.

-- 
Catalin



* [PATCH 1/2] KVM: arm64: Don't leave mmu->pgt dangling on kvm_init_stage2_mmu() error
From: Will Deacon @ 2026-03-27 19:27 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-arm-kernel, Will Deacon, Marc Zyngier, Oliver Upton,
	Joey Gouly, Suzuki K Poulose, Zenghui Yu
In-Reply-To: <20260327192758.21739-1-will@kernel.org>

If kvm_init_stage2_mmu() fails to allocate 'mmu->last_vcpu_ran', it
destroys the newly allocated stage-2 page-table before returning ENOMEM.

Unfortunately, it also leaves a dangling pointer in 'mmu->pgt' which
points at the freed 'kvm_pgtable' structure. This is likely to confuse
the kvm_vcpu_init_nested() failure path which can double-free the
structure if it finds it via kvm_free_stage2_pgd().

Ensure that the dangling 'mmu->pgt' pointer is cleared when returning an
error from kvm_init_stage2_mmu().

Link: https://sashiko.dev/#/patchset/20260327140039.21228-1-will%40kernel.org?patch=12265
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kvm/mmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 17d64a1e11e5..34e9d897d08b 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1013,6 +1013,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
 
 out_destroy_pgtable:
 	kvm_stage2_destroy(pgt);
+	mmu->pgt = NULL;
 out_free_pgtable:
 	kfree(pgt);
 	return err;
-- 
2.53.0.1018.g2bb0e51243-goog
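
The failure mode described in the commit message can be illustrated outside the kernel with a minimal sketch. The types and functions below (`struct s2_mmu`, `mmu_init()`, `mmu_free()`) are hypothetical stand-ins, not the KVM code; the sketch only assumes that the init path publishes the page-table pointer before a later allocation that can fail, and that a separate teardown path frees whatever pointer it finds.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the page-table and MMU structures. */
struct pgtable {
	int dummy;
};

struct s2_mmu {
	struct pgtable *pgt;
	int *last_ran;
};

/*
 * Publish the page-table in mmu->pgt, then attempt a second allocation.
 * `fail_late` forces that second allocation to fail, for illustration.
 */
static int mmu_init(struct s2_mmu *mmu, int fail_late)
{
	struct pgtable *pgt = calloc(1, sizeof(*pgt));

	if (!pgt)
		return -ENOMEM;

	mmu->pgt = pgt;

	mmu->last_ran = fail_late ? NULL : calloc(1, sizeof(*mmu->last_ran));
	if (!mmu->last_ran) {
		/* Clear the published pointer before freeing what it points at. */
		mmu->pgt = NULL;
		free(pgt);
		return -ENOMEM;
	}

	return 0;
}

/* Teardown path: frees whatever mmu->pgt holds; free(NULL) is a no-op. */
static void mmu_free(struct s2_mmu *mmu)
{
	free(mmu->pgt);
	mmu->pgt = NULL;
	free(mmu->last_ran);
	mmu->last_ran = NULL;
}
```

Without the `mmu->pgt = NULL` line in the error path, a caller that reacts to the failure by running `mmu_free()` would pass the already-freed pointer to `free()` a second time. Clearing the pointer makes the teardown path safe to call after a failed init.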




* [PATCH 0/2] KVM: arm64: Tentative fixes for page-table lifetime issues
From: Will Deacon @ 2026-03-27 19:27 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-arm-kernel, Will Deacon, Marc Zyngier, Oliver Upton,
	Joey Gouly, Suzuki K Poulose, Zenghui Yu

Hi all,

Sashiko highlighted a couple of potential page-table lifetime issues
in the upstream code while it was reviewing the pKVM protected memory
series. They make sense to me so I've had a crack at fixing them and
writing a better description of the problem in the commit message.

For the second issue, I've tested it by forcing the notifier
registration to fail and then watching the SecPageTables line in
/proc/meminfo after attempting to create VMs.

Cheers,

Will

Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>

--->8

Will Deacon (2):
  KVM: arm64: Don't leave mmu->pgt dangling on kvm_init_stage2_mmu()
    error
  KVM: arm64: Destroy stage-2 page-table in kvm_arch_destroy_vm()

 arch/arm64/kvm/arm.c | 1 +
 arch/arm64/kvm/mmu.c | 1 +
 2 files changed, 2 insertions(+)

-- 
2.53.0.1018.g2bb0e51243-goog




* [PATCH 2/2] KVM: arm64: Destroy stage-2 page-table in kvm_arch_destroy_vm()
From: Will Deacon @ 2026-03-27 19:27 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-arm-kernel, Will Deacon, Marc Zyngier, Oliver Upton,
	Joey Gouly, Suzuki K Poulose, Zenghui Yu
In-Reply-To: <20260327192758.21739-1-will@kernel.org>

kvm_arch_destroy_vm() can be called on the kvm_create_vm() error path
after we have failed to register the MMU notifiers for the new VM. In
this case, we cannot rely on the MMU ->release() notifier to call
kvm_arch_flush_shadow_all() and so the stage-2 page-table allocated in
kvm_arch_init_vm() will be leaked.

Explicitly destroy the stage-2 page-table in kvm_arch_destroy_vm(), so
that we clean up after kvm_arch_init_vm() without relying on the MMU
notifiers.

Link: https://sashiko.dev/#/patchset/20260327140039.21228-1-will%40kernel.org?patch=12265
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kvm/arm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 410ffd41fd73..29bfa79555b2 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -301,6 +301,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	if (is_protected_kvm_enabled())
 		pkvm_destroy_hyp_vm(kvm);
 
+	kvm_uninit_stage2_mmu(kvm);
 	kvm_destroy_mpidr_data(kvm);
 
 	kfree(kvm->arch.sysreg_masks);
-- 
2.53.0.1018.g2bb0e51243-goog




* Re: [PATCH v2] lib/crc: arm64: add NEON accelerated CRC64-NVMe implementation
From: Eric Biggers @ 2026-03-27 19:38 UTC (permalink / raw)
  To: Demian Shulhan; +Cc: linux-crypto, linux-kernel, ardb, linux-arm-kernel
In-Reply-To: <20260327060211.902077-1-demyansh@gmail.com>

[+Cc linux-arm-kernel@lists.infradead.org]

Thanks!  This is almost ready.  Just a few more comments:

On Fri, Mar 27, 2026 at 06:02:11AM +0000, Demian Shulhan wrote:
> - Safely falls back to the generic implementation on Big-Endian systems.

Drop the above bullet point.  This patch doesn't explicitly exclude big
endian.  Which is correct: Linux arm64 is little-endian-only now.

> +	/*
> +	 * Reduce the 128-bit value to 64 bits.
> +	 * By multiplying the high 64 bits by x^127 mod G (fold_consts_val[1])
> +	 * and XORing the result with the low 64 bits.
> +	 */

That is not what this code does.  How about something like:

	/* Multiply the 128-bit value by x^64 and reduce it back to 128 bits. */

Granted, that doesn't do a good job explaining it either.  However, a
full explanation of this stuff, like the one in the comments in
lib/crc/x86/crc-pclmul-template.S, would be much longer.

I suggest we leave the full explanation for when a similar template is
written for arm64.  For now brief comments or even no comments are fine.

Just if any comments are included they really ought to be correct, as
otherwise they are worse than no comments.

> +			scoped_ksimd() crc = crc64_nvme_arm64_c(crc, p, chunk);

clang-format doesn't know about scoped_ksimd(), so I suggest overriding
the formatting in this particular case:

	scoped_ksimd()
		crc = crc64_nvme_arm64_c(crc, p, chunk);

- Eric



* Re: [PATCH v5 0/5] drm/msm: add RGB101010 pixel format
From: Dmitry Baryshkov @ 2026-03-27 19:47 UTC (permalink / raw)
  To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Rob Clark, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Sean Paul, Marijn Suijten, Jeffrey Hugo,
	Neil Armstrong, Martin Blumenstingl, Alexander Koskovich
  Cc: dri-devel, linux-kernel, linux-arm-msm, freedreno, linux-amlogic,
	linux-arm-kernel, Konrad Dybcio
In-Reply-To: <20260324-dsi-rgb101010-support-v5-0-ff6afc904115@pm.me>

On Tue, 24 Mar 2026 11:47:56 +0000, Alexander Koskovich wrote:
> This series adds support for the RGB101010 (30bpp) pixel format used by some
> newer panels.
> 
> Tested on the BOE BF068MWM-TD0 panel (10 bit DSC) on the Nothing Phone (3a).

Applied to msm-next, thanks!

[1/5] drm/mipi-dsi: add RGB101010 pixel format
      https://gitlab.freedesktop.org/lumag/msm/-/commit/b50dc1e54750
[2/5] drm/meson: use default case for unsupported DSI pixel formats
      https://gitlab.freedesktop.org/lumag/msm/-/commit/a780b7f6c8e5
[3/5] drm/msm/dsi: rename MSM8998 DSI version from V2_2_0 to V2_0_0
      https://gitlab.freedesktop.org/lumag/msm/-/commit/913a709dea0e
[4/5] drm/msm/dsi: add DSI version >= comparison helper
      https://gitlab.freedesktop.org/lumag/msm/-/commit/a65c4d30988e
[5/5] drm/msm/dsi: Add support for RGB101010 pixel format
      https://gitlab.freedesktop.org/lumag/msm/-/commit/cebf747abeeb

Best regards,
-- 
With best wishes
Dmitry





* Re: [PATCH v1 0/2] perf build: Remove libunwind support
From: Arnaldo Carvalho de Melo @ 2026-03-27 20:07 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ian Rogers, 9erthalion6, adrian.hunter, alex, alexander.shishkin,
	andrew.jones, aou, atrajeev, blakejones, ctshao, dapeng1.mi,
	howardchu95, james.clark, john.g.garry, jolsa, leo.yan,
	libunwind-devel, linux-arm-kernel, linux-kernel, linux-perf-users,
	linux-riscv, mingo, palmer, peterz, pjw, shimin.guo, tglozar,
	tmricht, will, amadio, yuzhuo
In-Reply-To: <acW4Z2KJbyPZM1SG@google.com>

On Thu, Mar 26, 2026 at 03:51:19PM -0700, Namhyung Kim wrote:
> On Sat, Mar 21, 2026 at 04:42:18PM -0700, Ian Rogers wrote:
> > libunwind support exists for "--call-graph dwarf", however, libunwind
> > support has been opt-in rather than opt-out since Linux v6.13 as libdw
> > is preferred - commit 13e17c9ff49119aa ("perf build: Make libunwind
> > opt-in rather than opt-out"). A problem with the libdw support was
> > that it was slow, an issue fixed in Linux v7.0 in commit 6b2658b3f36a
> > ("perf unwind-libdw: Don't discard loaded ELF/DWARF after every
> > unwind"). As such libunwind support is now unnecessary.
> > 
> > The patch series:
> > https://lore.kernel.org/lkml/20260305221927.3237145-1-irogers@google.com/
> > looked to make the libunwind support in perf similar to the libdw
> > support, allow cross-architecture unwinding, etc. This was motivated
> > by the perf regs conventions being altered by the addition of x86 APX
> > support:
> > https://lore.kernel.org/lkml/20260209072047.2180332-1-dapeng1.mi@linux.intel.com/
> > It is necessary to translate the library's notion of registers to the
> > perf register convention so that the stack unwinding state can be
> > initialized. On this series it was stated that removing libunwind
> > support from perf should be an option, rather than updating support:
> > https://lore.kernel.org/lkml/abxs-2rozL1tBEO1@google.com/
> > This was also what motivated making libunwind opt-in rather than
> > opt-out.

> > Given that 7 minor releases have happened with libunwind "deprecated"
> > by making it opt-in, let's remove the libunwind support. There doesn't
> > appear to be any disagreement to this on the mailing list.
 
> I'm not sure if we want to remove it now.  I think we need more time to
> verify libdw unwinding is stable and fast enough.  Also maybe we can
> add build- or run-time warning when people try to use libunwind.

We have:

acme@x1:~/git/perf-tools$ perf -vv | grep LIBU
             libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )
acme@x1:~/git/perf-tools$

acme@x1:~/git/perf-tools$ perf check feature libunwind && echo perf built with libunwind
             libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )
acme@x1:~/git/perf-tools$

Building with both, as Ian mentioned, ends up with:

  LD      /tmp/build/perf-tools/util/perf-util-in.o
ld: /tmp/build/perf-tools/util/unwind-libunwind.o: in function `unwind__get_entries':
unwind-libunwind.c:(.text+0x2a0): multiple definition of `unwind__get_entries'; /tmp/build/perf-tools/util/unwind-libdw.o:unwind-libdw.c:(.text+0x940): first defined here
make[4]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:164: /tmp/build/perf-tools/util/perf-util-in.o] Error 123
make[3]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:158: util] Error 2
make[2]: *** [Makefile.perf:797: /tmp/build/perf-tools/perf-util-in.o] Error 2
make[1]: *** [Makefile.perf:289: sub-make] Error 2
make: *** [Makefile:119: install-bin] Error 2
make: Leaving directory '/home/acme/git/perf-tools/tools/perf'
⬢ [acme@toolbx perf-tools]$

So what Namhyung is suggesting is to disable libdw when libunwind is
asked for?

I.e.

alias m='rm -rf ~/libexec/perf-core/ ; make LIBUNWIND=1 NO_LIBDW=1 -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin && perf test "import python" && cat /tmp/build/$(basename $PWD)/feature/test-all.make.output' ; export PYTHONPATH=/tmp/build/$(basename $PWD)/python

Which builds and ends up linking with both:

⬢ [acme@toolbx perf-tools]$ ldd ~/bin/perf | egrep unwind\|dw
	libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007fbf430b6000)
	libunwind.so.8 => /lib64/libunwind.so.8 (0x00007fbf4309b000)
	libdw.so.1 => /lib64/libdw.so.1 (0x00007fbf38570000)
⬢ [acme@toolbx perf-tools]$

I.e. that NO_LIBDW isn't really disabling linking with it, just some
features based on it, likely.

Hum, we also have NO_LIBDW_DWARF_UNWIND, which probably is what we want
here... nope:

⬢ [acme@toolbx perf-tools]$ alias m='rm -rf ~/libexec/perf-core/ ; make LIBUNWIND=1 NO_LIBDW_DWARF_UNWIND=1 -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin && perf test "import python" && cat /tmp/build/$(basename $PWD)/feature/test-all.make.output' ; export PYTHONPATH=/tmp/build/$(basename $PWD)/python
⬢ [acme@toolbx perf-tools]$ m
rm: cannot remove '/home/acme/libexec/perf-core/scripts/python/Perf-Trace-Util/lib/Perf/Trace/__pycache__/Core.cpython-314.pyc': Permission denied
make: Entering directory '/home/acme/git/perf-tools/tools/perf'
  BUILD:   Doing 'make -j12' parallel build
Warning: Kernel ABI header differences:
  diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h

Auto-detecting system features:
...                                   libdw: [ on  ]
...                                   glibc: [ on  ]
...                                  libelf: [ on  ]
...                                 libnuma: [ on  ]
...                  numa_num_possible_cpus: [ on  ]
...                               libpython: [ on  ]
...                             libcapstone: [ on  ]
...                               llvm-perf: [ on  ]
...                                    zlib: [ on  ]
...                                    lzma: [ on  ]
...                                     bpf: [ on  ]
...                                  libaio: [ on  ]
...                                 libzstd: [ on  ]
...                              libopenssl: [ on  ]
...                                    rust: [ on  ]

  INSTALL libsubcmd_headers
  INSTALL libperf_headers
  INSTALL libapi_headers
  INSTALL libsymbol_headers
  INSTALL libbpf_headers
  LD      /tmp/build/perf-tools/util/perf-util-in.o
ld: /tmp/build/perf-tools/util/unwind-libunwind.o: in function `unwind__get_entries':
unwind-libunwind.c:(.text+0x2a0): multiple definition of `unwind__get_entries'; /tmp/build/perf-tools/util/unwind-libdw.o:unwind-libdw.c:(.text+0x940): first defined here
make[4]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:164: /tmp/build/perf-tools/util/perf-util-in.o] Error 123
make[3]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:158: util] Error 2
make[2]: *** [Makefile.perf:797: /tmp/build/perf-tools/perf-util-in.o] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile.perf:289: sub-make] Error 2
make: *** [Makefile:119: install-bin] Error 2
make: Leaving directory '/home/acme/git/perf-tools/tools/perf'
⬢ [acme@toolbx perf-tools]$

I expected NO_LIBDW_DWARF_UNWIND=1 would resolve this case, and maybe it
should?

Or maybe it did in the past, as by now it is just a comment:

⬢ [acme@toolbx perf-tools]$ git grep NO_LIBDW_DWARF_UNWIND
tools/perf/Makefile.perf:# Define NO_LIBDW_DWARF_UNWIND if you do not want libdw support
⬢ [acme@toolbx perf-tools]$

It's from:

# Define NO_LIBDW_DWARF_UNWIND if you do not want libdw support
# for dwarf backtrace post unwind.

We need libdw for 'perf probe', etc., so being able to disable it just
for DWARF backtraces is what we need here to make them mutually
exclusive, i.e. the default is to build with libdw for backtraces,
but if the user asks for LIBUNWIND=1, then emit a warning that libdw
won't be used for DWARF backtraces and select NO_LIBDW_DWARF_UNWIND?

We could then have a regression test that builds perf with one, does
some backtraces, then with the other, then compare? This would be
followup work, if somebody has the cycles, but making them mutually
exclusive should be doable with not that much work?

This is a tricky area, and since we _already_ have two
implementations, the good thing for regression testing would be to
compare their results until libunwind becomes completely rotten and
unusable?

- Arnaldo


^ permalink raw reply

* Re: [PATCH v6 phy-next 09/28] scsi: ufs: exynos: stop poking into struct phy guts
From: Martin K. Petersen @ 2026-03-27 20:19 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: linux-phy, Vinod Koul, Neil Armstrong, dri-devel, freedreno,
	linux-arm-kernel, linux-arm-msm, linux-can, linux-gpio, linux-ide,
	linux-kernel, linux-media, linux-pci, linux-renesas-soc,
	linux-riscv, linux-rockchip, linux-samsung-soc, linux-scsi,
	linux-sunxi, linux-tegra, linux-usb, netdev, spacemit,
	UNGLinuxDriver, Bart Van Assche, Alim Akhtar, Peter Griffin,
	James E.J. Bottomley, Martin K. Petersen, Krzysztof Kozlowski,
	Chanho Park
In-Reply-To: <20260327184706.1600329-10-vladimir.oltean@nxp.com>


Vladimir,

> The Exynos host controller driver is clearly a PHY consumer (gets the
> ufs->phy using devm_phy_get()), but pokes into the guts of struct phy
> to get the generic_phy->power_count.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen


^ permalink raw reply

* Re: [PATCH v6 phy-next 11/28] scsi: ufs: qcom: include missing <linux/interrupt.h>
From: Martin K. Petersen @ 2026-03-27 20:20 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: linux-phy, Vinod Koul, Neil Armstrong, dri-devel, freedreno,
	linux-arm-kernel, linux-arm-msm, linux-can, linux-gpio, linux-ide,
	linux-kernel, linux-media, linux-pci, linux-renesas-soc,
	linux-riscv, linux-rockchip, linux-samsung-soc, linux-scsi,
	linux-sunxi, linux-tegra, linux-usb, netdev, spacemit,
	UNGLinuxDriver, Manivannan Sadhasivam, James E.J. Bottomley,
	Martin K. Petersen
In-Reply-To: <20260327184706.1600329-12-vladimir.oltean@nxp.com>


Vladimir,

> The point is that <linux/phy/phy.h> will stop providing
> <linux/regulator/consumer.h>, and this would break the transitive
> include chain on armv7.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen


^ permalink raw reply

* Re: [PATCH v6 phy-next 09/28] scsi: ufs: exynos: stop poking into struct phy guts
From: Peter Griffin @ 2026-03-27 20:23 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: linux-phy, Vinod Koul, Neil Armstrong, dri-devel, freedreno,
	linux-arm-kernel, linux-arm-msm, linux-can, linux-gpio, linux-ide,
	linux-kernel, linux-media, linux-pci, linux-renesas-soc,
	linux-riscv, linux-rockchip, linux-samsung-soc, linux-scsi,
	linux-sunxi, linux-tegra, linux-usb, netdev, spacemit,
	UNGLinuxDriver, Bart Van Assche, Alim Akhtar,
	James E.J. Bottomley, Martin K. Petersen, Krzysztof Kozlowski,
	Chanho Park
In-Reply-To: <20260327184706.1600329-10-vladimir.oltean@nxp.com>

On Fri, 27 Mar 2026 at 18:48, Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
>
> The Exynos host controller driver is clearly a PHY consumer (gets the
> ufs->phy using devm_phy_get()), but pokes into the guts of struct phy
> to get the generic_phy->power_count.
>
> The UFS core (specifically ufshcd_link_startup()) may call the variant
> operation exynos_ufs_pre_link() -> exynos_ufs_phy_init() multiple times
> if the link startup fails and needs to be retried.
>
> However ufs-exynos shouldn't be doing what it's doing, i.e. looking at
> the generic_phy->power_count, because in the general sense of the API, a
> single Generic PHY may have multiple consumers. If ufs-exynos looks at
> generic_phy->power_count, there's no guarantee that this ufs-exynos
> instance is the one who previously bumped that power count. So it may be
> powering down the PHY on behalf of another consumer.
>
> The correct way in which this should be handled is ufs-exynos should
> *remember* whether it has initialized and powered up the PHY before, and
> power it down during link retries. Not rely on the power_count (which,
> btw, on the writer side is modified under &phy->mutex, but on the reader
> side is accessed unlocked). This is a discouraged pattern even if here
> it doesn't cause functional problems.
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> Acked-by: Alim Akhtar <alim.akhtar@samsung.com>
> Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
> ---

Reviewed-by: Peter Griffin <peter.griffin@linaro.org>

> Cc: Alim Akhtar <alim.akhtar@samsung.com>
> Cc: Peter Griffin <peter.griffin@linaro.org>
> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
> Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
> Cc: Krzysztof Kozlowski <krzk@kernel.org>
> Cc: Chanho Park <chanho61.park@samsung.com>
>
> v5->v6: collect tags from Alim Akhtar
> v4->v5: collect tag, add "scsi: " prefix to commit title
> v3->v4: none
> v2->v3:
> - add Cc Chanho Park, author of commit 3d73b200f989 ("scsi: ufs:
>   ufs-exynos: Change ufs phy control sequence")
> v1->v2:
> - add better ufs->phy_powered_on handling in exynos_ufs_exit(),
>   exynos_ufs_suspend() and exynos_ufs_resume() which ensures we won't
>   enter a phy->power_count underrun condition
> ---
>  drivers/ufs/host/ufs-exynos.c | 24 ++++++++++++++++++++----
>  drivers/ufs/host/ufs-exynos.h |  1 +
>  2 files changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/ufs/host/ufs-exynos.c b/drivers/ufs/host/ufs-exynos.c
> index 76fee3a79c77..274e53833571 100644
> --- a/drivers/ufs/host/ufs-exynos.c
> +++ b/drivers/ufs/host/ufs-exynos.c
> @@ -963,9 +963,10 @@ static int exynos_ufs_phy_init(struct exynos_ufs *ufs)
>
>         phy_set_bus_width(generic_phy, ufs->avail_ln_rx);
>
> -       if (generic_phy->power_count) {
> +       if (ufs->phy_powered_on) {
>                 phy_power_off(generic_phy);
>                 phy_exit(generic_phy);
> +               ufs->phy_powered_on = false;
>         }
>
>         ret = phy_init(generic_phy);
> @@ -979,6 +980,8 @@ static int exynos_ufs_phy_init(struct exynos_ufs *ufs)
>         if (ret)
>                 goto out_exit_phy;
>
> +       ufs->phy_powered_on = true;
> +
>         return 0;
>
>  out_exit_phy:
> @@ -1527,6 +1530,9 @@ static void exynos_ufs_exit(struct ufs_hba *hba)
>  {
>         struct exynos_ufs *ufs = ufshcd_get_variant(hba);
>
> +       if (!ufs->phy_powered_on)
> +               return;
> +
>         phy_power_off(ufs->phy);
>         phy_exit(ufs->phy);
>  }
> @@ -1728,8 +1734,10 @@ static int exynos_ufs_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op,
>         if (ufs->drv_data->suspend)
>                 ufs->drv_data->suspend(ufs);
>
> -       if (!ufshcd_is_link_active(hba))
> +       if (!ufshcd_is_link_active(hba) && ufs->phy_powered_on) {
>                 phy_power_off(ufs->phy);
> +               ufs->phy_powered_on = false;
> +       }
>
>         return 0;
>  }
> @@ -1737,9 +1745,17 @@ static int exynos_ufs_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op,
>  static int exynos_ufs_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
>  {
>         struct exynos_ufs *ufs = ufshcd_get_variant(hba);
> +       int err;
>
> -       if (!ufshcd_is_link_active(hba))
> -               phy_power_on(ufs->phy);
> +       if (!ufshcd_is_link_active(hba) && !ufs->phy_powered_on) {
> +               err = phy_power_on(ufs->phy);
> +               if (err) {
> +                       dev_err(hba->dev, "Failed to power on PHY: %pe\n",
> +                               ERR_PTR(err));
> +               } else {
> +                       ufs->phy_powered_on = true;
> +               }
> +       }
>
>         exynos_ufs_config_smu(ufs);
>         exynos_ufs_fmp_resume(hba);
> diff --git a/drivers/ufs/host/ufs-exynos.h b/drivers/ufs/host/ufs-exynos.h
> index abe7e472759e..683b9150e2ba 100644
> --- a/drivers/ufs/host/ufs-exynos.h
> +++ b/drivers/ufs/host/ufs-exynos.h
> @@ -227,6 +227,7 @@ struct exynos_ufs {
>         int avail_ln_rx;
>         int avail_ln_tx;
>         int rx_sel_idx;
> +       bool phy_powered_on;
>         struct ufs_pa_layer_attr dev_req_params;
>         struct ufs_phy_time_cfg t_cfg;
>         ktime_t entry_hibern8_t;
> --
> 2.43.0
>
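
To illustrate the multiple-consumer argument in the commit message, here
is a minimal userspace sketch. It is not the kernel's struct phy or the
ufs-exynos code: toy_phy, toy_consumer and both toy_reinit_*() helpers
are hypothetical names, and the toy model leaves out phy_init() and
phy_exit(). It only contrasts peeking at a shared power_count with
remembering a per-consumer phy_powered_on flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared, reference-counted PHY (hypothetical type). */
struct toy_phy {
	int power_count;
};

static void toy_phy_power_on(struct toy_phy *phy)
{
	phy->power_count++;
}

static void toy_phy_power_off(struct toy_phy *phy)
{
	assert(phy->power_count > 0);	/* catch a power_count underrun */
	phy->power_count--;
}

/* One consumer of the PHY; the flag mirrors ufs->phy_powered_on. */
struct toy_consumer {
	struct toy_phy *phy;
	bool phy_powered_on;
};

/*
 * Old pattern: decide based on the shared counter. If another consumer
 * bumped power_count, this drops that consumer's vote, not our own.
 */
static void toy_reinit_by_count(struct toy_consumer *c)
{
	if (c->phy->power_count)
		toy_phy_power_off(c->phy);
	toy_phy_power_on(c->phy);
}

/* New pattern: only undo a power-on that this consumer remembers doing. */
static void toy_reinit_by_flag(struct toy_consumer *c)
{
	if (c->phy_powered_on) {
		toy_phy_power_off(c->phy);
		c->phy_powered_on = false;
	}
	toy_phy_power_on(c->phy);
	c->phy_powered_on = true;
}
```

With two consumers on one toy_phy, a first call to toy_reinit_by_count()
from the second consumer leaves power_count at 1 even though two users
now depend on the PHY, whereas toy_reinit_by_flag() leaves it at 2 and
keeps it there across retries.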


^ permalink raw reply

* Re: [PATCH] mailbox: Fix NULL message support in mbox_send_message()
From: Doug Anderson @ 2026-03-27 20:24 UTC (permalink / raw)
  To: jassisinghbrar
  Cc: linux-kernel, linux-arm-kernel, shawn.guo, maz, stable, andersson,
	tglx, joonwonkang
In-Reply-To: <20260322171752.608486-1-jassisinghbrar@gmail.com>

Jassi,

On Sun, Mar 22, 2026 at 10:18 AM <jassisinghbrar@gmail.com> wrote:
>
> From: Jassi Brar <jassisinghbrar@gmail.com>
>
> The active_req field serves double duty as both the "is a TX in
> flight" flag (NULL means idle) and the storage for the in-flight
> message pointer. When a client sends NULL via mbox_send_message(),
> active_req is set to NULL, which the framework misinterprets as
> "no active request". This breaks the TX state machine by:
>
>  - tx_tick() short-circuits on (!mssg), skipping the tx_done
>    callback and the tx_complete completion
>  - txdone_hrtimer() skips the channel entirely since active_req
>    is NULL, so poll-based TX-done detection never fires.
>
> Fix this by introducing a MBOX_NO_MSG sentinel value that means
> "no active request," freeing NULL to be valid message data. The
> sentinel is defined in the subsystem-internal mailbox.h so that
> controller drivers within drivers/mailbox/ can reference it, but
> it is not exposed to clients outside the subsystem.
>
> Fifteen in-tree callers send NULL (doorbell-style IPCs on Qualcomm,
> Tegra, TI, Xilinx, i.MX, SCMI, and PCC platforms). All were
> audited for regression:
>
>  - Most already work around the bug via knows_txdone=true with a
>    manual mbox_client_txdone() call, making the framework's
>    tracking irrelevant. These are unaffected.
>
>  - Poll-based callers (Xilinx zynqmp/r5) are strictly better off:
>    the poll timer now correctly detects NULL-active channels
>    instead of silently skipping them.
>
>  - irq-qcom-mpm.c was a pre-existing bug -- the only Qualcomm
>    caller that omitted the knows_txdone + mbox_client_txdone()
>    pattern. Fixed in a companion commit ("irqchip/qcom-mpm: Fix
>    missing mailbox TX done acknowledgment").
>
>  - No caller sets both a tx_done callback and sends NULL, nor
>    combines tx_block=true with NULL sends, so the newly reachable
>    callback/completion paths are never exercised.
>
> Also update tegra-hsp's flush callback, which directly inspects
> active_req to wait for the channel to drain: the old "!= NULL"
> check becomes "!= MBOX_NO_MSG", otherwise flush spins until
> timeout since the sentinel is non-NULL.
>
> The only tradeoff is that 'MBOX_NO_MSG' can not be used as a message
> by clients.
>
> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
> ---
>  drivers/mailbox/mailbox.c   | 13 +++++++------
>  drivers/mailbox/mailbox.h   |  3 +++
>  drivers/mailbox/tegra-hsp.c |  2 +-
>  3 files changed, 11 insertions(+), 7 deletions(-)

This looks reasonable to me. I have one nit, though. Can you please
add a snippet to the beginning of mbox_send_message() that looks like:

if (mssg == MBOX_NO_MSG)
  return -EINVAL;

I just want to ensure a client doesn't decide to simulate the
old/weird behavior by sending this sentinel value. ;-)
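
For illustration only, here is a userspace sketch of how the sentinel and
that check fit together. It is not the mailbox framework code: toy_chan
and the toy_*() helpers are hypothetical, the sentinel's value here (the
address of a private object) is an assumption since the real MBOX_NO_MSG
definition isn't quoted in this thread, and the real framework queues
messages rather than returning -EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical sentinel: the address of a private object, so it cannot
 * collide with any pointer a client legitimately owns.
 */
static char toy_no_msg_marker;
#define MBOX_NO_MSG ((void *)&toy_no_msg_marker)

struct toy_chan {
	void *active_req;	/* MBOX_NO_MSG when idle; NULL is a valid message */
	int tx_done_calls;	/* stands in for the tx_done callback */
};

static void toy_chan_init(struct toy_chan *chan)
{
	chan->active_req = MBOX_NO_MSG;
	chan->tx_done_calls = 0;
}

static int toy_send_message(struct toy_chan *chan, void *mssg)
{
	if (mssg == MBOX_NO_MSG)
		return -EINVAL;		/* clients may not send the sentinel */
	if (chan->active_req != MBOX_NO_MSG)
		return -EBUSY;		/* toy simplification: no queue */
	chan->active_req = mssg;
	return 0;
}

static void toy_tx_tick(struct toy_chan *chan)
{
	if (chan->active_req == MBOX_NO_MSG)
		return;			/* nothing in flight */
	chan->active_req = MBOX_NO_MSG;
	chan->tx_done_calls++;		/* completion now fires for NULL too */
}

/* Walks through a NULL (doorbell-style) send and its completion. */
static void toy_demo(void)
{
	struct toy_chan chan;

	toy_chan_init(&chan);
	assert(toy_send_message(&chan, MBOX_NO_MSG) == -EINVAL);
	assert(toy_send_message(&chan, NULL) == 0);
	assert(toy_send_message(&chan, NULL) == -EBUSY);
	toy_tx_tick(&chan);
	assert(chan.tx_done_calls == 1);
}
```

The point of the sketch is that once "idle" has its own value, a NULL
message is tracked like any other in-flight request, and rejecting the
sentinel at the entry point keeps clients from recreating the old
behavior.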

Other than that:

Reviewed-by: Douglas Anderson <dianders@chromium.org>


-Doug


^ permalink raw reply

* Re: [PATCH v1 1/1] scsi: ufs: rockchip: Drop unused include
From: Martin K. Petersen @ 2026-03-27 20:28 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Shawn Lin, linux-scsi, linux-arm-kernel, linux-rockchip,
	linux-kernel, James E.J. Bottomley, Martin K. Petersen,
	Heiko Stuebner
In-Reply-To: <20260320215606.3236516-1-andriy.shevchenko@linux.intel.com>


Andy,

> This driver includes the legacy header <linux/gpio.h> but does
> not use any symbols from it. Drop the inclusion.

Applied to 7.1/scsi-staging, thanks!

-- 
Martin K. Petersen


^ permalink raw reply

* Re: [PATCH v1 0/2] perf build: Remove libunwind support
From: Ian Rogers @ 2026-03-27 20:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, 9erthalion6, adrian.hunter, alex,
	alexander.shishkin, andrew.jones, aou, atrajeev, blakejones,
	ctshao, dapeng1.mi, howardchu95, james.clark, john.g.garry, jolsa,
	leo.yan, libunwind-devel, linux-arm-kernel, linux-kernel,
	linux-perf-users, linux-riscv, mingo, palmer, peterz, pjw,
	shimin.guo, tglozar, tmricht, will, amadio, yuzhuo
In-Reply-To: <acbjjwc-WCk1CwtF@x1>

On Fri, Mar 27, 2026 at 1:07 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Mar 26, 2026 at 03:51:19PM -0700, Namhyung Kim wrote:
> > On Sat, Mar 21, 2026 at 04:42:18PM -0700, Ian Rogers wrote:
> > > libunwind support exists for "--call-graph dwarf", however, libunwind
> > > support has been opt-in rather than opt-out since Linux v6.13 as libdw
> > > is preferred - commit 13e17c9ff49119aa ("perf build: Make libunwind
> > > opt-in rather than opt-out"). A problem with the libdw support was
> > > that it was slow, an issue fixed in Linux v7.0 in commit 6b2658b3f36a
> > > ("perf unwind-libdw: Don't discard loaded ELF/DWARF after every
> > > unwind"). As such libunwind support is now unnecessary.
> > >
> > > The patch series:
> > > https://lore.kernel.org/lkml/20260305221927.3237145-1-irogers@google.com/
> > > looked to make the libunwind support in perf similar to the libdw
> > > support, allow cross-architecture unwinding, etc. This was motivated
> > > by the perf regs conventions being altered by the addition of x86 APX
> > > support:
> > > https://lore.kernel.org/lkml/20260209072047.2180332-1-dapeng1.mi@linux.intel.com/
> > > It is necessary to translate the library's notion of registers to the
> > > perf register convention so that the stack unwinding state can be
> > > initialized. On this series it was stated that removing libunwind
> > > support from perf should be an option, rather than updating support:
> > > https://lore.kernel.org/lkml/abxs-2rozL1tBEO1@google.com/
> > > This was also what motivated making libunwind opt-in rather than
> > > opt-out.
>
> > > Given that 7 minor releases have happened with libunwind "deprecated"
> > > by making it opt-in, let's remove the libunwind support. There doesn't
> > > appear to be any disagreement to this on the mailing list.
>
> > I'm not sure if we want to remove it now.  I think we need more time to
> > verify libdw unwinding is stable and fast enough.  Also maybe we can
> > add build- or run-time warning when people try to use libunwind.
>
> We have:
>
> acme@x1:~/git/perf-tools$ perf -vv | grep LIBU
>              libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )
> acme@x1:~/git/perf-tools$
>
> acme@x1:~/git/perf-tools$ perf check feature libunwind && echo perf built with libunwind
>              libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )
> acme@x1:~/git/perf-tools$
>
> Building with both, as Ian mentioned, ends up with:
>
>   LD      /tmp/build/perf-tools/util/perf-util-in.o
> ld: /tmp/build/perf-tools/util/unwind-libunwind.o: in function `unwind__get_entries':
> unwind-libunwind.c:(.text+0x2a0): multiple definition of `unwind__get_entries'; /tmp/build/perf-tools/util/unwind-libdw.o:unwind-libdw.c:(.text+0x940): first defined here
> make[4]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:164: /tmp/build/perf-tools/util/perf-util-in.o] Error 123
> make[3]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:158: util] Error 2
> make[2]: *** [Makefile.perf:797: /tmp/build/perf-tools/perf-util-in.o] Error 2
> make[1]: *** [Makefile.perf:289: sub-make] Error 2
> make: *** [Makefile:119: install-bin] Error 2
> make: Leaving directory '/home/acme/git/perf-tools/tools/perf'
> ⬢ [acme@toolbx perf-tools]$
>
> So what Namhyung is suggesting is to disable libdw when libunwind is
> asked for?
>
> I.e.
>
> alias m='rm -rf ~/libexec/perf-core/ ; make LIBUNWIND=1 NO_LIBDW=1 -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin && perf test "import python" && cat /tmp/build/$(basename $PWD)/feature/test-all.make.output' ; export PYTHONPATH=/tmp/build/$(basename $PWD)/python
>
> Which builds and ends up linking with both:
>
> ⬢ [acme@toolbx perf-tools]$ ldd ~/bin/perf | egrep unwind\|dw
>         libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007fbf430b6000)
>         libunwind.so.8 => /lib64/libunwind.so.8 (0x00007fbf4309b000)
>         libdw.so.1 => /lib64/libdw.so.1 (0x00007fbf38570000)
> ⬢ [acme@toolbx perf-tools]$
>
> I.e. that NO_LIBDW isn't really disabling linking with it, just some
> features based on it, likely.
>
> Hum, we also have NO_LIBDW_DWARF_UNWIND, which probably is what we want
> here... nope:
>
> ⬢ [acme@toolbx perf-tools]$ alias m='rm -rf ~/libexec/perf-core/ ; make LIBUNWIND=1 NO_LIBDW_DWARF_UNWIND=1 -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin && perf test "import python" && cat /tmp/build/$(basename $PWD)/feature/test-all.make.output' ; export PYTHONPATH=/tmp/build/$(basename $PWD)/python
> ⬢ [acme@toolbx perf-tools]$ m
> rm: cannot remove '/home/acme/libexec/perf-core/scripts/python/Perf-Trace-Util/lib/Perf/Trace/__pycache__/Core.cpython-314.pyc': Permission denied
> make: Entering directory '/home/acme/git/perf-tools/tools/perf'
>   BUILD:   Doing 'make -j12' parallel build
> Warning: Kernel ABI header differences:
>   diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h
>
> Auto-detecting system features:
> ...                                   libdw: [ on  ]
> ...                                   glibc: [ on  ]
> ...                                  libelf: [ on  ]
> ...                                 libnuma: [ on  ]
> ...                  numa_num_possible_cpus: [ on  ]
> ...                               libpython: [ on  ]
> ...                             libcapstone: [ on  ]
> ...                               llvm-perf: [ on  ]
> ...                                    zlib: [ on  ]
> ...                                    lzma: [ on  ]
> ...                                     bpf: [ on  ]
> ...                                  libaio: [ on  ]
> ...                                 libzstd: [ on  ]
> ...                              libopenssl: [ on  ]
> ...                                    rust: [ on  ]
>
>   INSTALL libsubcmd_headers
>   INSTALL libperf_headers
>   INSTALL libapi_headers
>   INSTALL libsymbol_headers
>   INSTALL libbpf_headers
>   LD      /tmp/build/perf-tools/util/perf-util-in.o
> ld: /tmp/build/perf-tools/util/unwind-libunwind.o: in function `unwind__get_entries':
> unwind-libunwind.c:(.text+0x2a0): multiple definition of `unwind__get_entries'; /tmp/build/perf-tools/util/unwind-libdw.o:unwind-libdw.c:(.text+0x940): first defined here
> make[4]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:164: /tmp/build/perf-tools/util/perf-util-in.o] Error 123
> make[3]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:158: util] Error 2
> make[2]: *** [Makefile.perf:797: /tmp/build/perf-tools/perf-util-in.o] Error 2
> make[2]: *** Waiting for unfinished jobs....
> make[1]: *** [Makefile.perf:289: sub-make] Error 2
> make: *** [Makefile:119: install-bin] Error 2
> make: Leaving directory '/home/acme/git/perf-tools/tools/perf'
> ⬢ [acme@toolbx perf-tools]$
>
> I expected NO_LIBDW_DWARF_UNWIND=1 would resolve this case, and maybe it
> should?
>
> Or maybe it did in the past, as by now it is just a comment:
>
> ⬢ [acme@toolbx perf-tools]$ git grep NO_LIBDW_DWARF_UNWIND
> tools/perf/Makefile.perf:# Define NO_LIBDW_DWARF_UNWIND if you do not want libdw support
> ⬢ [acme@toolbx perf-tools]$
>
> It's from:
>
> # Define NO_LIBDW_DWARF_UNWIND if you do not want libdw support
> # for dwarf backtrace post unwind.
>
> We need libdw for 'perf probe', etc., so being able to disable it just
> for DWARF backtraces is what we need here to make them mutually
> exclusive, i.e. the default is to build with libdw for backtraces,
> but if the user asks for LIBUNWIND=1, then emit a warning that libdw
> won't be used for DWARF backtraces and select NO_LIBDW_DWARF_UNWIND?
>
> We could then have a regression test that builds perf with one, does
> some backtraces, then with the other, then compare? This would be
> followup work, if somebody has the cycles, but making them mutually
> exclusive should be doable with not that much work?
>
> This is a tricky area, and since we _already_ have two
> implementations, the good thing for regression testing would be to
> compare their results until libunwind becomes completely rotten and
> unusable?

My series:
https://lore.kernel.org/lkml/20260305221927.3237145-1-irogers@google.com/
makes libdw and libunwind supported together:
https://lore.kernel.org/lkml/20260305221927.3237145-2-irogers@google.com/
"""
This commit refactors the DWARF unwind post-processing to be
configurable at runtime via the .perfconfig file option
'unwind.style', or using the argument '--unwind-style' in the commands
'perf report', 'perf script' and 'perf inject', in a similar manner to
the addr2line or the disassembler style.
"""
That series cleans up several other issues, which is why I think it is
worth landing while we wait for libdw to become stable.
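
As a rough sketch of the idea, and not the code from that series: the
link failure quoted above comes from both backends defining
unwind__get_entries(), so giving each backend its own symbol and
selecting one at runtime avoids the clash. Everything below is
hypothetical, including the toy_*() names, the simplified signature
(perf's real unwind__get_entries() takes more arguments) and the style
strings "libdw" and "libunwind".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified backend signature. */
typedef int (*toy_unwind_fn)(int max_stack);

/* Each backend has its own symbol, so both can be linked together. */
static int toy_libdw_get_entries(int max_stack)
{
	(void)max_stack;
	return 1;	/* placeholder result identifying the libdw backend */
}

static int toy_libunwind_get_entries(int max_stack)
{
	(void)max_stack;
	return 2;	/* placeholder result identifying the libunwind backend */
}

struct toy_unwind_backend {
	const char *style;	/* value a config option might carry (assumed) */
	toy_unwind_fn get_entries;
};

static const struct toy_unwind_backend toy_backends[] = {
	{ "libdw",     toy_libdw_get_entries },
	{ "libunwind", toy_libunwind_get_entries },
};

/* Look up a backend by style name; returns NULL for an unknown style. */
static toy_unwind_fn toy_unwind_lookup(const char *style)
{
	size_t i;

	for (i = 0; i < sizeof(toy_backends) / sizeof(toy_backends[0]); i++) {
		if (!strcmp(toy_backends[i].style, style))
			return toy_backends[i].get_entries;
	}
	return NULL;
}

/* Single entry point: dispatch to the selected backend, or fail with -1. */
static int toy_unwind_get_entries(const char *style, int max_stack)
{
	toy_unwind_fn fn = toy_unwind_lookup(style);

	return fn ? fn(max_stack) : -1;
}
```

Having both backends reachable from one binary would also make the
comparison Arnaldo describes easier, since the same recording could be
post-processed twice with different styles.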

Thanks,
Ian

> - Arnaldo


^ permalink raw reply

* Re: [PATCH v2] ASoC: dt-bindings: mediatek,mt8173-rt5650-rt5514: convert to DT schema
From: Mark Brown @ 2026-03-27 17:21 UTC (permalink / raw)
  To: lgirdwood, Khushal Chitturi
  Cc: robh, krzk+dt, conor+dt, matthias.bgg, angelogioacchino.delregno,
	koro.chen, linux-sound, devicetree, linux-kernel,
	linux-arm-kernel, linux-mediatek
In-Reply-To: <20260327134649.31376-1-khushalchitturi@gmail.com>

On Fri, 27 Mar 2026 19:16:49 +0530, Khushal Chitturi wrote:
> ASoC: dt-bindings: mediatek,mt8173-rt5650-rt5514: convert to DT schema

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-7.1

Thanks!

[1/1] ASoC: dt-bindings: mediatek,mt8173-rt5650-rt5514: convert to DT schema
      https://git.kernel.org/broonie/sound/c/472d77bdc511

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark



^ permalink raw reply

* Re: [PATCH v1 0/2] perf build: Remove libunwind support
From: Ian Rogers @ 2026-03-27 20:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, 9erthalion6, adrian.hunter, alex,
	alexander.shishkin, andrew.jones, aou, atrajeev, blakejones,
	ctshao, dapeng1.mi, howardchu95, james.clark, john.g.garry, jolsa,
	leo.yan, libunwind-devel, linux-arm-kernel, linux-kernel,
	linux-perf-users, linux-riscv, mingo, palmer, peterz, pjw,
	shimin.guo, tglozar, tmricht, will, amadio, yuzhuo
In-Reply-To: <CAP-5=fXmBpqP86FT85sA_BZYRnzwbZ5joYLogCqON6fjXCkRHQ@mail.gmail.com>

On Fri, Mar 27, 2026 at 1:37 PM Ian Rogers <irogers@google.com> wrote:
>
> On Fri, Mar 27, 2026 at 1:07 PM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > On Thu, Mar 26, 2026 at 03:51:19PM -0700, Namhyung Kim wrote:
> > > On Sat, Mar 21, 2026 at 04:42:18PM -0700, Ian Rogers wrote:
> > > > libunwind support exists for "--call-graph dwarf", however, libunwind
> > > > support has been opt-in rather than opt-out since Linux v6.13 as libdw
> > > > is preferred - commit 13e17c9ff49119aa ("perf build: Make libunwind
> > > > opt-in rather than opt-out"). A problem with the libdw support was
> > > > that it was slow, an issue fixed in Linux v7.0 in commit 6b2658b3f36a
> > > > ("perf unwind-libdw: Don't discard loaded ELF/DWARF after every
> > > > unwind"). As such libunwind support is now unnecessary.
> > > >
> > > > The patch series:
> > > > https://lore.kernel.org/lkml/20260305221927.3237145-1-irogers@google.com/
> > > > looked to make the libunwind support in perf similar to the libdw
> > > > support, allow cross-architecture unwinding, etc. This was motivated
> > > > by the perf regs conventions being altered by the addition of x86 APX
> > > > support:
> > > > https://lore.kernel.org/lkml/20260209072047.2180332-1-dapeng1.mi@linux.intel.com/
> > > > It is necessary to translate the library's notion of registers to the
> > > > perf register convention so that the stack unwinding state can be
> > > > initialized. On this series it was stated that removing libunwind
> > > > support from perf should be an option, rather than updating support:
> > > > https://lore.kernel.org/lkml/abxs-2rozL1tBEO1@google.com/
> > > > This was also what motivated making libunwind opt-in rather than
> > > > opt-out.
> >
> > > > Given that 7 minor releases have happened with libunwind "deprecated"
> > > > by making it opt-in, let's remove the libunwind support. There doesn't
> > > > appear to be any disagreement to this on the mailing list.
> >
> > > I'm not sure if we want to remove it now.  I think we need more time to
> > > verify libdw unwinding is stable and fast enough.  Also maybe we can
> > > add build- or run-time warning when people try to use libunwind.
> >
> > We have:
> >
> > acme@x1:~/git/perf-tools$ perf -vv | grep LIBU
> >              libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )
> > acme@x1:~/git/perf-tools$
> >
> > acme@x1:~/git/perf-tools$ perf check feature libunwind && echo perf built with libunwind
> >              libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )
> > acme@x1:~/git/perf-tools$
> >
> > Building with both, as Ian mentioned, ends up with:
> >
> >   LD      /tmp/build/perf-tools/util/perf-util-in.o
> > ld: /tmp/build/perf-tools/util/unwind-libunwind.o: in function `unwind__get_entries':
> > unwind-libunwind.c:(.text+0x2a0): multiple definition of `unwind__get_entries'; /tmp/build/perf-tools/util/unwind-libdw.o:unwind-libdw.c:(.text+0x940): first defined here
> > make[4]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:164: /tmp/build/perf-tools/util/perf-util-in.o] Error 123
> > make[3]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:158: util] Error 2
> > make[2]: *** [Makefile.perf:797: /tmp/build/perf-tools/perf-util-in.o] Error 2
> > make[1]: *** [Makefile.perf:289: sub-make] Error 2
> > make: *** [Makefile:119: install-bin] Error 2
> > make: Leaving directory '/home/acme/git/perf-tools/tools/perf'
> > ⬢ [acme@toolbx perf-tools]$
> >
> > So what Namhyung is suggesting is to disable libdw when libunwind is
> > asked for?
> >
> > I.e.
> >
> > alias m='rm -rf ~/libexec/perf-core/ ; make LIBUNWIND=1 NO_LIBDW=1 -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin && perf test "import python" && cat /tmp/build/$(basename $PWD)/feature/test-all.make.output' ; export PYTHONPATH=/tmp/build/$(basename $PWD)/python
> >
> > Which builds and ends up linking with both:
> >
> > ⬢ [acme@toolbx perf-tools]$ ldd ~/bin/perf | egrep unwind\|dw
> >         libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007fbf430b6000)
> >         libunwind.so.8 => /lib64/libunwind.so.8 (0x00007fbf4309b000)
> >         libdw.so.1 => /lib64/libdw.so.1 (0x00007fbf38570000)
> > ⬢ [acme@toolbx perf-tools]$
> >
> > I.e. that NO_LIBDW isn't really disabling linking with it, just some
> > features based on it, likely.
> >
> > Hum, we also have NO_LIBDW_DWARF_UNWIND, which probably is what we want
> > here... nope:
> >
> > ⬢ [acme@toolbx perf-tools]$ alias m='rm -rf ~/libexec/perf-core/ ; make LIBUNWIND=1 NO_LIBDW_DWARF_UNWIND=1 -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin && perf test "import python" && cat /tmp/build/$(basename $PWD)/feature/test-all.make.output' ; export PYTHONPATH=/tmp/build/$(basename $PWD)/python
> > ⬢ [acme@toolbx perf-tools]$ m
> > rm: cannot remove '/home/acme/libexec/perf-core/scripts/python/Perf-Trace-Util/lib/Perf/Trace/__pycache__/Core.cpython-314.pyc': Permission denied
> > make: Entering directory '/home/acme/git/perf-tools/tools/perf'
> >   BUILD:   Doing 'make -j12' parallel build
> > Warning: Kernel ABI header differences:
> >   diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h
> >
> > Auto-detecting system features:
> > ...                                   libdw: [ on  ]
> > ...                                   glibc: [ on  ]
> > ...                                  libelf: [ on  ]
> > ...                                 libnuma: [ on  ]
> > ...                  numa_num_possible_cpus: [ on  ]
> > ...                               libpython: [ on  ]
> > ...                             libcapstone: [ on  ]
> > ...                               llvm-perf: [ on  ]
> > ...                                    zlib: [ on  ]
> > ...                                    lzma: [ on  ]
> > ...                                     bpf: [ on  ]
> > ...                                  libaio: [ on  ]
> > ...                                 libzstd: [ on  ]
> > ...                              libopenssl: [ on  ]
> > ...                                    rust: [ on  ]
> >
> >   INSTALL libsubcmd_headers
> >   INSTALL libperf_headers
> >   INSTALL libapi_headers
> >   INSTALL libsymbol_headers
> >   INSTALL libbpf_headers
> >   LD      /tmp/build/perf-tools/util/perf-util-in.o
> > ld: /tmp/build/perf-tools/util/unwind-libunwind.o: in function `unwind__get_entries':
> > unwind-libunwind.c:(.text+0x2a0): multiple definition of `unwind__get_entries'; /tmp/build/perf-tools/util/unwind-libdw.o:unwind-libdw.c:(.text+0x940): first defined here
> > make[4]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:164: /tmp/build/perf-tools/util/perf-util-in.o] Error 123
> > make[3]: *** [/home/acme/git/perf-tools/tools/build/Makefile.build:158: util] Error 2
> > make[2]: *** [Makefile.perf:797: /tmp/build/perf-tools/perf-util-in.o] Error 2
> > make[2]: *** Waiting for unfinished jobs....
> > make[1]: *** [Makefile.perf:289: sub-make] Error 2
> > make: *** [Makefile:119: install-bin] Error 2
> > make: Leaving directory '/home/acme/git/perf-tools/tools/perf'
> > ⬢ [acme@toolbx perf-tools]$
> >
> > I expected NO_LIBDW_DWARF_UNWIND=1 would resolve this case, and maybe it
> > should?
> >
> > Or maybe it did it in the past as by now it is just a comment:
> >
> > ⬢ [acme@toolbx perf-tools]$ git grep NO_LIBDW_DWARF_UNWIND
> > tools/perf/Makefile.perf:# Define NO_LIBDW_DWARF_UNWIND if you do not want libdw support
> > ⬢ [acme@toolbx perf-tools]$
> >
> > It's from:
> >
> > # Define NO_LIBDW_DWARF_UNWIND if you do not want libdw support
> > # for dwarf backtrace post unwind.
> >
> > As we need libdw for 'perf probe', etc, so being able to disable it just
> > for DWARF backtrace is what we need here to make them mutually
> > exclusive, i.e. the default is for building with libdw for backtraces,
> > but if the user asks for LIBUNWIND=1, then a warning that libdw won't be
> > used for DWARF backtraces and select NO_LIBDW_DWARF_UNWIND?
> >
> > We could then have a regression test that builds perf with one, does
> > some backtraces, then with the other, then compare? This would be
> > followup work, if somebody has the cycles, but making them mutually
> > exclusive should be doable with not that much work?
> >
> > This is an area that is tricky and since we _already_ have two
> > implementations, the good thing for regression testing would be to
> > compare their results until libunwind becomes completely rotten and
> > unusable?
>
> My series:
> https://lore.kernel.org/lkml/20260305221927.3237145-1-irogers@google.com/
> makes libdw and libunwind supported together:
> https://lore.kernel.org/lkml/20260305221927.3237145-2-irogers@google.com/
> """
> This commit refactors the DWARF unwind post-processing to be
> configurable at runtime via the .perfconfig file option
> 'unwind.style', or using the argument '--unwind-style' in the commands
> 'perf report', 'perf script' and 'perf inject', in a similar manner to
> the addr2line or the disassembler style.
> """
> That series cleans up several other issues, which is why I think it is
> worth landing while we wait for libdw to become stable.

Fwiw, fixing cross-platform MIPS libunwind support (my series does
this) goes some way towards addressing the problems with perf raised here:
https://github.com/koute/not-perf

Thanks,
Ian

> Thanks,
> Ian
>
> > - Arnaldo


^ permalink raw reply

* ✅ PASS: Test report for for-kernelci (7.0.0-rc5, upstream-arm-next, 8d9c78aa)
From: cki-project @ 2026-03-27 20:43 UTC (permalink / raw)
  To: will, catalin.marinas, linux-arm-kernel

Hi, we tested your kernel and here are the results:

    Overall result: PASSED
             Merge: OK
           Compile: OK
              Test: OK

Tested-by: CKI Project <cki-project@redhat.com>

Kernel information:
    Commit message: Merge remote-tracking branch 'will/for-next/perf' into for-kernelci

You can find all the details about the test run at
    https://datawarehouse.cki-project.org/kcidb/checkouts/redhat:2414225370


If you find a failure unrelated to your changes, please ask the test maintainer to review it.
This will prevent the failures from being incorrectly reported in the future.

Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.

        ,-.   ,-.
       ( C ) ( K )  Continuous
        `-',-.`-'   Kernel
          ( I )     Integration
           `-'
______________________________________________________________________________



^ permalink raw reply

* Re: [PATCH v1 2/3] arm64: mm: Handle invalid large leaf mappings correctly
From: Yang Shi @ 2026-03-27 20:46 UTC (permalink / raw)
  To: Ryan Roberts, Catalin Marinas, Will Deacon,
	David Hildenbrand (Arm), Dev Jain, Suzuki K Poulose, Jinjiang Tu,
	Kevin Brodsky
  Cc: linux-arm-kernel, linux-kernel, stable
In-Reply-To: <eea4f7f1-c929-453b-adca-606ba6e4ec69@arm.com>



On 3/25/26 10:37 AM, Ryan Roberts wrote:
> On 24/03/2026 18:20, Yang Shi wrote:
>>
>> On 3/23/26 6:03 AM, Ryan Roberts wrote:
>>> It has been possible for a long time to mark ptes in the linear map as
>>> invalid. This is done for secretmem, kfence, realm dma memory un/share,
>>> and others, by simply clearing the PTE_VALID bit. But until commit
>>> a166563e7ec37 ("arm64: mm: support large block mapping when
>>> rodata=full") large leaf mappings were never made invalid in this way.
>>>
>>> It turns out various parts of the code base are not equipped to handle
>>> invalid large leaf mappings (in the way they are currently encoded) and
>>> I've observed a kernel panic while booting a realm guest on a
>>> BBML2_NOABORT system as a result:
>>>
>>> [   15.432706] software IO TLB: Memory encryption is active and system is
>>> using DMA bounce buffers
>>> [   15.476896] Unable to handle kernel paging request at virtual address
>>> ffff000019600000
>>> [   15.513762] Mem abort info:
>>> [   15.527245]   ESR = 0x0000000096000046
>>> [   15.548553]   EC = 0x25: DABT (current EL), IL = 32 bits
>>> [   15.572146]   SET = 0, FnV = 0
>>> [   15.592141]   EA = 0, S1PTW = 0
>>> [   15.612694]   FSC = 0x06: level 2 translation fault
>>> [   15.640644] Data abort info:
>>> [   15.661983]   ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
>>> [   15.694875]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
>>> [   15.723740]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>> [   15.755776] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081f3f000
>>> [   15.800410] [ffff000019600000] pgd=0000000000000000, p4d=180000009ffff403,
>>> pud=180000009fffe403, pmd=00e8000199600704
>>> [   15.855046] Internal error: Oops: 0000000096000046 [#1]  SMP
>>> [   15.886394] Modules linked in:
>>> [   15.900029] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.0.0-rc4-
>>> dirty #4 PREEMPT
>>> [   15.935258] Hardware name: linux,dummy-virt (DT)
>>> [   15.955612] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
>>> [   15.986009] pc : __pi_memcpy_generic+0x128/0x22c
>>> [   16.006163] lr : swiotlb_bounce+0xf4/0x158
>>> [   16.024145] sp : ffff80008000b8f0
>>> [   16.038896] x29: ffff80008000b8f0 x28: 0000000000000000 x27: 0000000000000000
>>> [   16.069953] x26: ffffb3976d261ba8 x25: 0000000000000000 x24: ffff000019600000
>>> [   16.100876] x23: 0000000000000001 x22: ffff0000043430d0 x21: 0000000000007ff0
>>> [   16.131946] x20: 0000000084570010 x19: 0000000000000000 x18: ffff00001ffe3fcc
>>> [   16.163073] x17: 0000000000000000 x16: 00000000003fffff x15: 646e612065766974
>>> [   16.194131] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
>>> [   16.225059] x11: 0000000000000000 x10: 0000000000000010 x9 : 0000000000000018
>>> [   16.256113] x8 : 0000000000000018 x7 : 0000000000000000 x6 : 0000000000000000
>>> [   16.287203] x5 : ffff000019607ff0 x4 : ffff000004578000 x3 : ffff000019600000
>>> [   16.318145] x2 : 0000000000007ff0 x1 : ffff000004570010 x0 : ffff000019600000
>>> [   16.349071] Call trace:
>>> [   16.360143]  __pi_memcpy_generic+0x128/0x22c (P)
>>> [   16.380310]  swiotlb_tbl_map_single+0x154/0x2b4
>>> [   16.400282]  swiotlb_map+0x5c/0x228
>>> [   16.415984]  dma_map_phys+0x244/0x2b8
>>> [   16.432199]  dma_map_page_attrs+0x44/0x58
>>> [   16.449782]  virtqueue_map_page_attrs+0x38/0x44
>>> [   16.469596]  virtqueue_map_single_attrs+0xc0/0x130
>>> [   16.490509]  virtnet_rq_alloc.isra.0+0xa4/0x1fc
>>> [   16.510355]  try_fill_recv+0x2a4/0x584
>>> [   16.526989]  virtnet_open+0xd4/0x238
>>> [   16.542775]  __dev_open+0x110/0x24c
>>> [   16.558280]  __dev_change_flags+0x194/0x20c
>>> [   16.576879]  netif_change_flags+0x24/0x6c
>>> [   16.594489]  dev_change_flags+0x48/0x7c
>>> [   16.611462]  ip_auto_config+0x258/0x1114
>>> [   16.628727]  do_one_initcall+0x80/0x1c8
>>> [   16.645590]  kernel_init_freeable+0x208/0x2f0
>>> [   16.664917]  kernel_init+0x24/0x1e0
>>> [   16.680295]  ret_from_fork+0x10/0x20
>>> [   16.696369] Code: 927cec03 cb0e0021 8b0e0042 a9411c26 (a900340c)
>>> [   16.723106] ---[ end trace 0000000000000000 ]---
>>> [   16.752866] Kernel panic - not syncing: Attempted to kill init!
>>> exitcode=0x0000000b
>>> [   16.792556] Kernel Offset: 0x3396ea200000 from 0xffff800080000000
>>> [   16.818966] PHYS_OFFSET: 0xfff1000080000000
>>> [   16.837237] CPU features: 0x0000000,00060005,13e38581,957e772f
>>> [   16.862904] Memory Limit: none
>>> [   16.876526] ---[ end Kernel panic - not syncing: Attempted to kill init!
>>> exitcode=0x0000000b ]---
>>>
>>> This panic occurs because the swiotlb memory was previously shared to
>>> the host (__set_memory_enc_dec()), which involves transitioning the
>>> (large) leaf mappings to invalid, sharing to the host, then marking the
>>> mappings valid again. But pageattr_p[mu]d_entry() would only update the
>>> entry if it is a section mapping, since otherwise it concluded it must
>>> be a table entry so shouldn't be modified. But p[mu]d_sect() only
>>> returns true if the entry is valid. So the result was that the large
>>> leaf entry was made invalid in the first pass then ignored in the second
>>> pass. It remains invalid until the above code tries to access it and
>>> blows up.
>> Good catch. I recall hitting this problem when I worked on a very early PoC of
>> the large block mapping patch. It took a totally different approach from
>> BBML2_NOABORT. I didn't run into the problem when I implemented BBML2_NOABORT
>> because nobody actually changed the valid/invalid attribute at large block
>> mapping granule, so I forgot about it. But I definitely missed the realm use case.
>>
>>> The simple fix would be to update pageattr_pmd_entry() to use
>>> !pmd_table() instead of pmd_sect(). That would solve this problem.
>> Yes, I agree.
>>
>>> But the ptdump code also suffers from a similar issue. It checks
>>> pmd_leaf() and doesn't call into the arch-specific note_page() machinery
>>> if it returns false. As a result of this, ptdump wasn't even able to
>>> show the invalid large leaf mappings; it looked like they were valid
>>> which made this super fun to debug. The ptdump code is core-mm and
>>> pmd_table() is arm64-specific so we can't use the same trick to solve
>>> that.
>> I don't quite get why we need to show invalid mappings in ptdump? IIUC ptdump is
>> not supposed to show invalid mappings even though they are transient.
> Let's say we have 8M of PMD mappings, then we want to mark 2M in the middle of
> that as invalid. Prior to my fix, ptdump would show the full 8M as still being
> valid after making the middle 2M invalid. This happened because the note_page()
> call for the 2M invalid part was suppressed, but there was also no ptdump_hole()
> call since the PMD entry is not none. After my fix, we call note_page() for the
> non-none but invalid pmd and now the "F" is correctly displayed for that portion.

I see your point now. Yes, pmd_entry() will return 0 because pmd_leaf() 
returns false in this case. But the page table walk still continues 
since the later "pmd_leaf(*pmd) || !pmd_present(*pmd)" returns true 
because it is not present either. So the invalid entry will be 
mistakenly covered in a valid range.
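
To make the effect concrete, here is a minimal userspace sketch of the
range merging described above. It is only a toy model, not the real
mm/ptdump.c code: the toy_* names, the 2M-per-entry layout and the
"fixed" switch are all made up for illustration. An entry that is
neither none nor a leaf gets no callback when "fixed" is false, so the
open valid range simply runs across it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry states; each toy PMD entry covers 2 MiB. */
enum toy_pmd { TOY_NONE, TOY_VALID_LEAF, TOY_INVALID_LEAF };

/* Hypothetical stand-in for the state a note_page() callback keeps. */
struct toy_state {
	bool in_range;           /* a range is currently open */
	bool range_valid;        /* attribute of the open range */
	unsigned long start;     /* start of the open range, in MiB */
	unsigned long valid_mib; /* MiB reported as valid so far */
};

/* Extend the open range if the attribute matches, else close and reopen. */
static void toy_note_page(struct toy_state *st, unsigned long addr, bool valid)
{
	if (st->in_range && st->range_valid == valid)
		return;
	if (st->in_range && st->range_valid)
		st->valid_mib += addr - st->start;
	st->in_range = true;
	st->range_valid = valid;
	st->start = addr;
}

/*
 * Walk n toy PMD entries and return how many MiB end up reported as
 * valid. With fixed == false, an invalid leaf triggers neither the leaf
 * callback nor the hole callback, which models the old behaviour.
 */
static unsigned long toy_walk(const enum toy_pmd *pmd, size_t n, bool fixed)
{
	struct toy_state st = { 0 };
	size_t i;

	assert(pmd != NULL);

	for (i = 0; i < n; i++) {
		unsigned long addr = i * 2;

		switch (pmd[i]) {
		case TOY_NONE:
			toy_note_page(&st, addr, false); /* hole */
			break;
		case TOY_VALID_LEAF:
			toy_note_page(&st, addr, true);
			break;
		case TOY_INVALID_LEAF:
			if (fixed)
				toy_note_page(&st, addr, false);
			break;
		}
	}

	/* Flush the last open range at the end of the walk. */
	if (st.in_range && st.range_valid)
		st.valid_mib += n * 2 - st.start;

	return st.valid_mib;
}
```

For the 8M example (entries valid, valid, invalid, valid), toy_walk()
returns 8 with fixed == false and 6 with fixed == true, i.e. the middle
2M is only excluded from the valid range once the callback is made for
the invalid leaf.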

It may be better to show the user-visible ptdump change in the commit log.

Thanks,
Yang
>
> Thanks,
> Ryan
>
>
>
>> Thanks,
>> Yang
>>
>>
>>> But we already support the concept of "present-invalid" for user space
>>> entries. And even better, pmd_leaf() will return true for a leaf mapping
>>> that is marked present-invalid. So let's just use that encoding for
>>> present-invalid kernel mappings too. Then we can use pmd_leaf() where we
>>> previously used pmd_sect() and everything is magically fixed.
>>>
>>> Additionally, from inspection kernel_page_present() was broken in a
>>> similar way, so I'm also updating that to use pmd_leaf().
>>>
>>> I haven't spotted any other issues of this shape but plan to do a follow
>>> up patch to remove pmd_sect() and pud_sect() in favour of the more
>>> sophisticated pmd_leaf()/pud_leaf() which are core-mm APIs and will
>>> simplify arm64 code a bit.
>>>
>>> Fixes: a166563e7ec37 ("arm64: mm: support large block mapping when rodata=full")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>> ---
>>>    arch/arm64/mm/pageattr.c | 50 ++++++++++++++++++++++------------------
>>>    1 file changed, 28 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>> index 358d1dc9a576f..87dfe4c82fa92 100644
>>> --- a/arch/arm64/mm/pageattr.c
>>> +++ b/arch/arm64/mm/pageattr.c
>>> @@ -25,6 +25,11 @@ static ptdesc_t set_pageattr_masks(ptdesc_t val, struct
>>> mm_walk *walk)
>>>    {
>>>        struct page_change_data *masks = walk->private;
>>>    +    /*
>>> +     * Some users clear and set bits which alias eachother (e.g. PTE_NG and
>>> +     * PTE_PRESENT_INVALID). It is therefore important that we always clear
>>> +     * first then set.
>>> +     */
>>>        val &= ~(pgprot_val(masks->clear_mask));
>>>        val |= (pgprot_val(masks->set_mask));
>>>    @@ -36,7 +41,7 @@ static int pageattr_pud_entry(pud_t *pud, unsigned long addr,
>>>    {
>>>        pud_t val = pudp_get(pud);
>>>    -    if (pud_sect(val)) {
>>> +    if (pud_leaf(val)) {
>>>            if (WARN_ON_ONCE((next - addr) != PUD_SIZE))
>>>                return -EINVAL;
>>>            val = __pud(set_pageattr_masks(pud_val(val), walk));
>>> @@ -52,7 +57,7 @@ static int pageattr_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>    {
>>>        pmd_t val = pmdp_get(pmd);
>>>    -    if (pmd_sect(val)) {
>>> +    if (pmd_leaf(val)) {
>>>            if (WARN_ON_ONCE((next - addr) != PMD_SIZE))
>>>                return -EINVAL;
>>>            val = __pmd(set_pageattr_masks(pmd_val(val), walk));
>>> @@ -132,11 +137,12 @@ static int __change_memory_common(unsigned long start,
>>> unsigned long size,
>>>        ret = update_range_prot(start, size, set_mask, clear_mask);
>>>          /*
>>> -     * If the memory is being made valid without changing any other bits
>>> -     * then a TLBI isn't required as a non-valid entry cannot be cached in
>>> -     * the TLB.
>>> +     * If the memory is being switched from present-invalid to valid without
>>> +     * changing any other bits then a TLBI isn't required as a non-valid
>>> +     * entry cannot be cached in the TLB.
>>>         */
>>> -    if (pgprot_val(set_mask) != PTE_VALID || pgprot_val(clear_mask))
>>> +    if (pgprot_val(set_mask) != (PTE_MAYBE_NG | PTE_VALID) ||
>>> +        pgprot_val(clear_mask) != PTE_PRESENT_INVALID)
>>>            flush_tlb_kernel_range(start, start + size);
>>>        return ret;
>>>    }
>>> @@ -237,18 +243,18 @@ int set_memory_valid(unsigned long addr, int numpages,
>>> int enable)
>>>    {
>>>        if (enable)
>>>            return __change_memory_common(addr, PAGE_SIZE * numpages,
>>> -                    __pgprot(PTE_VALID),
>>> -                    __pgprot(0));
>>> +                    __pgprot(PTE_MAYBE_NG | PTE_VALID),
>>> +                    __pgprot(PTE_PRESENT_INVALID));
>>>        else
>>>            return __change_memory_common(addr, PAGE_SIZE * numpages,
>>> -                    __pgprot(0),
>>> -                    __pgprot(PTE_VALID));
>>> +                    __pgprot(PTE_PRESENT_INVALID),
>>> +                    __pgprot(PTE_MAYBE_NG | PTE_VALID));
>>>    }
>>>      int set_direct_map_invalid_noflush(struct page *page)
>>>    {
>>> -    pgprot_t clear_mask = __pgprot(PTE_VALID);
>>> -    pgprot_t set_mask = __pgprot(0);
>>> +    pgprot_t clear_mask = __pgprot(PTE_MAYBE_NG | PTE_VALID);
>>> +    pgprot_t set_mask = __pgprot(PTE_PRESENT_INVALID);
>>>          if (!can_set_direct_map())
>>>            return 0;
>>> @@ -259,8 +265,8 @@ int set_direct_map_invalid_noflush(struct page *page)
>>>      int set_direct_map_default_noflush(struct page *page)
>>>    {
>>> -    pgprot_t set_mask = __pgprot(PTE_VALID | PTE_WRITE);
>>> -    pgprot_t clear_mask = __pgprot(PTE_RDONLY);
>>> +    pgprot_t set_mask = __pgprot(PTE_MAYBE_NG | PTE_VALID | PTE_WRITE);
>>> +    pgprot_t clear_mask = __pgprot(PTE_PRESENT_INVALID | PTE_RDONLY);
>>>          if (!can_set_direct_map())
>>>            return 0;
>>> @@ -296,8 +302,8 @@ static int __set_memory_enc_dec(unsigned long addr,
>>>         * entries or Synchronous External Aborts caused by RIPAS_EMPTY
>>>         */
>>>        ret = __change_memory_common(addr, PAGE_SIZE * numpages,
>>> -                     __pgprot(set_prot),
>>> -                     __pgprot(clear_prot | PTE_VALID));
>>> +                     __pgprot(set_prot | PTE_PRESENT_INVALID),
>>> +                     __pgprot(clear_prot | PTE_MAYBE_NG | PTE_VALID));
>>>          if (ret)
>>>            return ret;
>>> @@ -311,8 +317,8 @@ static int __set_memory_enc_dec(unsigned long addr,
>>>            return ret;
>>>          return __change_memory_common(addr, PAGE_SIZE * numpages,
>>> -                      __pgprot(PTE_VALID),
>>> -                      __pgprot(0));
>>> +                      __pgprot(PTE_MAYBE_NG | PTE_VALID),
>>> +                      __pgprot(PTE_PRESENT_INVALID));
>>>    }
>>>      static int realm_set_memory_encrypted(unsigned long addr, int numpages)
>>> @@ -404,15 +410,15 @@ bool kernel_page_present(struct page *page)
>>>        pud = READ_ONCE(*pudp);
>>>        if (pud_none(pud))
>>>            return false;
>>> -    if (pud_sect(pud))
>>> -        return true;
>>> +    if (pud_leaf(pud))
>>> +        return pud_valid(pud);
>>>          pmdp = pmd_offset(pudp, addr);
>>>        pmd = READ_ONCE(*pmdp);
>>>        if (pmd_none(pmd))
>>>            return false;
>>> -    if (pmd_sect(pmd))
>>> -        return true;
>>> +    if (pmd_leaf(pmd))
>>> +        return pmd_valid(pmd);
>>>          ptep = pte_offset_kernel(pmdp, addr);
>>>        return pte_valid(__ptep_get(ptep));



^ permalink raw reply

