* [PATCH 1/4] media: rkvdec: Introduce a global bitwriter helper
From: Detlev Casanova @ 2026-03-27 15:16 UTC
To: Ezequiel Garcia, Mauro Carvalho Chehab, Heiko Stuebner,
Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
Jonas Karlman, Nicolas Dufresne
Cc: linux-kernel, linux-media, linux-rockchip, linux-arm-kernel, llvm,
kernel, Detlev Casanova
In-Reply-To: <20260327-rkvdec-use-bitwriter-v1-0-982cf872b590@collabora.com>
Structures with bitfields work well when the values are
reasonably aligned.
The more misaligned the fields are, the more gymnastics compilers need
to do to update the field values.
Some cases have been reported with Clang on specific architectures
like armhf and hexagon, where the compiler would allocate a bigger
local stack than needed or even completely freeze during compilation.
Some fixes have been provided to ease the issues, but the real fix
here is to use a bitwriter instead of heavily unaligned bitfields.
This is a preparation commit to provide a global bitwriter interface
for the whole driver.
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
---
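For reviewers who want to poke at the helper outside the kernel, here is
a minimal userspace sketch of the same write logic. The names
struct bw_field, genmask64(), set_bw_field() and bw_straddle_selftest()
are stand-ins invented for this note (the patch uses struct
rkvdec_bw_field, GENMASK_ULL() and rkvdec_set_bw_field()). It exercises a
2-bit field at bit offset 31, which straddles two u32 words.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for struct rkvdec_bw_field (assumption). */
struct bw_field {
	uint16_t offset;	/* bit offset from the start of the buffer */
	uint8_t len;		/* field width in bits, 1..32 */
};

/* Stand-in for GENMASK_ULL(h, l): bits l..h set, for 0 <= l <= h <= 63. */
static uint64_t genmask64(unsigned int h, unsigned int l)
{
	return (~0ULL >> (63 - h)) & (~0ULL << l);
}

/* Same steps as the helper in the patch, with explicit narrowing casts. */
static void set_bw_field(uint32_t *buf, struct bw_field field, uint32_t value)
{
	unsigned int bit = field.offset % 32;
	unsigned int word = field.offset / 32;
	uint64_t mask = genmask64(bit + field.len - 1, bit);
	uint64_t val = ((uint64_t)value << bit) & mask;

	/* Low half: clear the field bits in the first word, then set them. */
	buf[word] = (buf[word] & ~(uint32_t)mask) | (uint32_t)val;

	/* High half: only touched when the field spills into the next word. */
	if (bit + field.len > 32) {
		buf[word + 1] &= ~(uint32_t)(mask >> 32);
		buf[word + 1] |= (uint32_t)(val >> 32);
	}
}

/* Writes 0x3 into a 2-bit field at bit 31; returns 0 on success. */
static int bw_straddle_selftest(void)
{
	uint32_t buf[2] = { 0, 0 };
	struct bw_field straddling = { 31, 2 };

	set_bw_field(buf, straddling, 0x3);
	assert(buf[0] == 0x80000000u);	/* bit 31 of word 0 */
	assert(buf[1] == 0x00000001u);	/* bit 0 of word 1 */
	return 0;
}
```

Calling bw_straddle_selftest() from a small main() is enough to run the
check. The 64-bit mask is what lets one shift cover both words; the
second word is only written when the field actually crosses the boundary.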
drivers/media/platform/rockchip/rkvdec/Makefile | 1 +
.../platform/rockchip/rkvdec/rkvdec-bitwriter.c | 30 ++++++++++++++++++++++
.../platform/rockchip/rkvdec/rkvdec-bitwriter.h | 25 ++++++++++++++++++
3 files changed, 56 insertions(+)
diff --git a/drivers/media/platform/rockchip/rkvdec/Makefile b/drivers/media/platform/rockchip/rkvdec/Makefile
index e629d571e4d8..11e2122bcbbf 100644
--- a/drivers/media/platform/rockchip/rkvdec/Makefile
+++ b/drivers/media/platform/rockchip/rkvdec/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_VIDEO_ROCKCHIP_VDEC) += rockchip-vdec.o
rockchip-vdec-y += \
rkvdec.o \
+ rkvdec-bitwriter.o \
rkvdec-cabac.o \
rkvdec-h264.o \
rkvdec-h264-common.o \
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.c
new file mode 100644
index 000000000000..673ebb89002b
--- /dev/null
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Rockchip Video Decoder bit writer
+ *
+ * Copyright (C) 2026 Collabora, Ltd.
+ * Detlev Casanova <detlev.casanova@collabora.com>
+ * Copyright (C) 2019 Collabora, Ltd.
+ * Boris Brezillon <boris.brezillon@collabora.com>
+ */
+
+#include <linux/types.h>
+#include <linux/bits.h>
+
+#include "rkvdec-bitwriter.h"
+
+void rkvdec_set_bw_field(u32 *buf, struct rkvdec_bw_field field, u32 value)
+{
+ u8 bit = field.offset % 32;
+ u16 word = field.offset / 32;
+ u64 mask = GENMASK_ULL(bit + field.len - 1, bit);
+ u64 val = ((u64)value << bit) & mask;
+
+ buf[word] &= ~mask;
+ buf[word] |= val;
+ if (bit + field.len > 32) {
+ buf[word + 1] &= ~(mask >> 32);
+ buf[word + 1] |= val >> 32;
+ }
+}
+
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.h
new file mode 100644
index 000000000000..44154f1ebc65
--- /dev/null
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-bitwriter.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Rockchip Video Decoder bit writer
+ *
+ * Copyright (C) 2026 Collabora, Ltd.
+ * Detlev Casanova <detlev.casanova@collabora.com>
+ * Copyright (C) 2019 Collabora, Ltd.
+ * Boris Brezillon <boris.brezillon@collabora.com>
+ */
+
+#ifndef RKVDEC_BIT_WRITER_H_
+#define RKVDEC_BIT_WRITER_H_
+
+#include <linux/types.h>
+
+struct rkvdec_bw_field {
+ u16 offset;
+ u8 len;
+};
+
+#define BW_FIELD(_offset, _len) ((struct rkvdec_bw_field){ _offset, _len })
+
+void rkvdec_set_bw_field(u32 *buf, struct rkvdec_bw_field field, u32 value);
+
+#endif /* RKVDEC_BIT_WRITER_H_ */
--
2.53.0
* [PATCH 2/4] media: rkvdec: Use the global bitwriter instead of local one
From: Detlev Casanova @ 2026-03-27 15:16 UTC
To: Ezequiel Garcia, Mauro Carvalho Chehab, Heiko Stuebner,
Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
Jonas Karlman, Nicolas Dufresne
Cc: linux-kernel, linux-media, linux-rockchip, linux-arm-kernel, llvm,
kernel, Detlev Casanova
In-Reply-To: <20260327-rkvdec-use-bitwriter-v1-0-982cf872b590@collabora.com>
Both rkvdec-h264.c and rkvdec-hevc.c define their own bitwriter
function and macros.
Switch them to the global one introduced in the previous patch.
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
---
.../media/platform/rockchip/rkvdec/rkvdec-h264.c | 109 ++++++-------
.../media/platform/rockchip/rkvdec/rkvdec-hevc.c | 171 +++++++++------------
2 files changed, 119 insertions(+), 161 deletions(-)
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264.c
index d3202cecb988..ffa606038192 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264.c
@@ -16,6 +16,7 @@
#include "rkvdec-regs.h"
#include "rkvdec-cabac.h"
#include "rkvdec-h264-common.h"
+#include "rkvdec-bitwriter.h"
/* Size with u32 units. */
#define RKV_CABAC_INIT_BUFFER_SIZE (3680 + 128)
@@ -25,56 +26,48 @@ struct rkvdec_sps_pps_packet {
u32 info[8];
};
-struct rkvdec_ps_field {
- u16 offset;
- u8 len;
-};
-
-#define PS_FIELD(_offset, _len) \
- ((struct rkvdec_ps_field){ _offset, _len })
-
-#define SEQ_PARAMETER_SET_ID PS_FIELD(0, 4)
-#define PROFILE_IDC PS_FIELD(4, 8)
-#define CONSTRAINT_SET3_FLAG PS_FIELD(12, 1)
-#define CHROMA_FORMAT_IDC PS_FIELD(13, 2)
-#define BIT_DEPTH_LUMA PS_FIELD(15, 3)
-#define BIT_DEPTH_CHROMA PS_FIELD(18, 3)
-#define QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG PS_FIELD(21, 1)
-#define LOG2_MAX_FRAME_NUM_MINUS4 PS_FIELD(22, 4)
-#define MAX_NUM_REF_FRAMES PS_FIELD(26, 5)
-#define PIC_ORDER_CNT_TYPE PS_FIELD(31, 2)
-#define LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4 PS_FIELD(33, 4)
-#define DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG PS_FIELD(37, 1)
-#define PIC_WIDTH_IN_MBS PS_FIELD(38, 9)
-#define PIC_HEIGHT_IN_MBS PS_FIELD(47, 9)
-#define FRAME_MBS_ONLY_FLAG PS_FIELD(56, 1)
-#define MB_ADAPTIVE_FRAME_FIELD_FLAG PS_FIELD(57, 1)
-#define DIRECT_8X8_INFERENCE_FLAG PS_FIELD(58, 1)
-#define MVC_EXTENSION_ENABLE PS_FIELD(59, 1)
-#define NUM_VIEWS PS_FIELD(60, 2)
-#define VIEW_ID(i) PS_FIELD(62 + ((i) * 10), 10)
-#define NUM_ANCHOR_REFS_L(i) PS_FIELD(82 + ((i) * 11), 1)
-#define ANCHOR_REF_L(i) PS_FIELD(83 + ((i) * 11), 10)
-#define NUM_NON_ANCHOR_REFS_L(i) PS_FIELD(104 + ((i) * 11), 1)
-#define NON_ANCHOR_REFS_L(i) PS_FIELD(105 + ((i) * 11), 10)
-#define PIC_PARAMETER_SET_ID PS_FIELD(128, 8)
-#define PPS_SEQ_PARAMETER_SET_ID PS_FIELD(136, 5)
-#define ENTROPY_CODING_MODE_FLAG PS_FIELD(141, 1)
-#define BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT_FLAG PS_FIELD(142, 1)
-#define NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(i) PS_FIELD(143 + ((i) * 5), 5)
-#define WEIGHTED_PRED_FLAG PS_FIELD(153, 1)
-#define WEIGHTED_BIPRED_IDC PS_FIELD(154, 2)
-#define PIC_INIT_QP_MINUS26 PS_FIELD(156, 7)
-#define PIC_INIT_QS_MINUS26 PS_FIELD(163, 6)
-#define CHROMA_QP_INDEX_OFFSET PS_FIELD(169, 5)
-#define DEBLOCKING_FILTER_CONTROL_PRESENT_FLAG PS_FIELD(174, 1)
-#define CONSTRAINED_INTRA_PRED_FLAG PS_FIELD(175, 1)
-#define REDUNDANT_PIC_CNT_PRESENT PS_FIELD(176, 1)
-#define TRANSFORM_8X8_MODE_FLAG PS_FIELD(177, 1)
-#define SECOND_CHROMA_QP_INDEX_OFFSET PS_FIELD(178, 5)
-#define SCALING_LIST_ENABLE_FLAG PS_FIELD(183, 1)
-#define SCALING_LIST_ADDRESS PS_FIELD(184, 32)
-#define IS_LONG_TERM(i) PS_FIELD(216 + (i), 1)
+#define SEQ_PARAMETER_SET_ID BW_FIELD(0, 4)
+#define PROFILE_IDC BW_FIELD(4, 8)
+#define CONSTRAINT_SET3_FLAG BW_FIELD(12, 1)
+#define CHROMA_FORMAT_IDC BW_FIELD(13, 2)
+#define BIT_DEPTH_LUMA BW_FIELD(15, 3)
+#define BIT_DEPTH_CHROMA BW_FIELD(18, 3)
+#define QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG BW_FIELD(21, 1)
+#define LOG2_MAX_FRAME_NUM_MINUS4 BW_FIELD(22, 4)
+#define MAX_NUM_REF_FRAMES BW_FIELD(26, 5)
+#define PIC_ORDER_CNT_TYPE BW_FIELD(31, 2)
+#define LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4 BW_FIELD(33, 4)
+#define DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG BW_FIELD(37, 1)
+#define PIC_WIDTH_IN_MBS BW_FIELD(38, 9)
+#define PIC_HEIGHT_IN_MBS BW_FIELD(47, 9)
+#define FRAME_MBS_ONLY_FLAG BW_FIELD(56, 1)
+#define MB_ADAPTIVE_FRAME_FIELD_FLAG BW_FIELD(57, 1)
+#define DIRECT_8X8_INFERENCE_FLAG BW_FIELD(58, 1)
+#define MVC_EXTENSION_ENABLE BW_FIELD(59, 1)
+#define NUM_VIEWS BW_FIELD(60, 2)
+#define VIEW_ID(i) BW_FIELD(62 + ((i) * 10), 10)
+#define NUM_ANCHOR_REFS_L(i) BW_FIELD(82 + ((i) * 11), 1)
+#define ANCHOR_REF_L(i) BW_FIELD(83 + ((i) * 11), 10)
+#define NUM_NON_ANCHOR_REFS_L(i) BW_FIELD(104 + ((i) * 11), 1)
+#define NON_ANCHOR_REFS_L(i) BW_FIELD(105 + ((i) * 11), 10)
+#define PIC_PARAMETER_SET_ID BW_FIELD(128, 8)
+#define PPS_SEQ_PARAMETER_SET_ID BW_FIELD(136, 5)
+#define ENTROPY_CODING_MODE_FLAG BW_FIELD(141, 1)
+#define BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT_FLAG BW_FIELD(142, 1)
+#define NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(i) BW_FIELD(143 + ((i) * 5), 5)
+#define WEIGHTED_PRED_FLAG BW_FIELD(153, 1)
+#define WEIGHTED_BIPRED_IDC BW_FIELD(154, 2)
+#define PIC_INIT_QP_MINUS26 BW_FIELD(156, 7)
+#define PIC_INIT_QS_MINUS26 BW_FIELD(163, 6)
+#define CHROMA_QP_INDEX_OFFSET BW_FIELD(169, 5)
+#define DEBLOCKING_FILTER_CONTROL_PRESENT_FLAG BW_FIELD(174, 1)
+#define CONSTRAINED_INTRA_PRED_FLAG BW_FIELD(175, 1)
+#define REDUNDANT_PIC_CNT_PRESENT BW_FIELD(176, 1)
+#define TRANSFORM_8X8_MODE_FLAG BW_FIELD(177, 1)
+#define SECOND_CHROMA_QP_INDEX_OFFSET BW_FIELD(178, 5)
+#define SCALING_LIST_ENABLE_FLAG BW_FIELD(183, 1)
+#define SCALING_LIST_ADDRESS BW_FIELD(184, 32)
+#define IS_LONG_TERM(i) BW_FIELD(216 + (i), 1)
/* Data structure describing auxiliary buffer format. */
struct rkvdec_h264_priv_tbl {
@@ -91,20 +84,6 @@ struct rkvdec_h264_ctx {
struct rkvdec_regs regs;
};
-static void set_ps_field(u32 *buf, struct rkvdec_ps_field field, u32 value)
-{
- u8 bit = field.offset % 32, word = field.offset / 32;
- u64 mask = GENMASK_ULL(bit + field.len - 1, bit);
- u64 val = ((u64)value << bit) & mask;
-
- buf[word] &= ~mask;
- buf[word] |= val;
- if (bit + field.len > 32) {
- buf[word + 1] &= ~(mask >> 32);
- buf[word + 1] |= val >> 32;
- }
-}
-
static void assemble_hw_pps(struct rkvdec_ctx *ctx,
struct rkvdec_h264_run *run)
{
@@ -128,7 +107,7 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
hw_ps = &priv_tbl->param_set[pps->pic_parameter_set_id];
memset(hw_ps, 0, sizeof(*hw_ps));
-#define WRITE_PPS(value, field) set_ps_field(hw_ps->info, field, value)
+#define WRITE_PPS(value, field) rkvdec_set_bw_field(hw_ps->info, field, value)
/* write sps */
WRITE_PPS(sps->seq_parameter_set_id, SEQ_PARAMETER_SET_ID);
WRITE_PPS(sps->profile_idc, PROFILE_IDC);
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c
index ac8b825d080a..6d367bfcdd13 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c
@@ -18,6 +18,7 @@
#include "rkvdec-regs.h"
#include "rkvdec-cabac.h"
#include "rkvdec-hevc-common.h"
+#include "rkvdec-bitwriter.h"
/* Size in u8/u32 units. */
#define RKV_SCALING_LIST_SIZE 1360
@@ -34,80 +35,72 @@ struct rkvdec_rps_packet {
u32 info[RKV_RPS_SIZE];
};
-struct rkvdec_ps_field {
- u16 offset;
- u8 len;
-};
-
-#define PS_FIELD(_offset, _len) \
- ((struct rkvdec_ps_field){ _offset, _len })
-
/* SPS */
-#define VIDEO_PARAMETER_SET_ID PS_FIELD(0, 4)
-#define SEQ_PARAMETER_SET_ID PS_FIELD(4, 4)
-#define CHROMA_FORMAT_IDC PS_FIELD(8, 2)
-#define PIC_WIDTH_IN_LUMA_SAMPLES PS_FIELD(10, 13)
-#define PIC_HEIGHT_IN_LUMA_SAMPLES PS_FIELD(23, 13)
-#define BIT_DEPTH_LUMA PS_FIELD(36, 4)
-#define BIT_DEPTH_CHROMA PS_FIELD(40, 4)
-#define LOG2_MAX_PIC_ORDER_CNT_LSB PS_FIELD(44, 5)
-#define LOG2_DIFF_MAX_MIN_LUMA_CODING_BLOCK_SIZE PS_FIELD(49, 2)
-#define LOG2_MIN_LUMA_CODING_BLOCK_SIZE PS_FIELD(51, 3)
-#define LOG2_MIN_TRANSFORM_BLOCK_SIZE PS_FIELD(54, 3)
-#define LOG2_DIFF_MAX_MIN_LUMA_TRANSFORM_BLOCK_SIZE PS_FIELD(57, 2)
-#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTER PS_FIELD(59, 3)
-#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTRA PS_FIELD(62, 3)
-#define SCALING_LIST_ENABLED_FLAG PS_FIELD(65, 1)
-#define AMP_ENABLED_FLAG PS_FIELD(66, 1)
-#define SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG PS_FIELD(67, 1)
-#define PCM_ENABLED_FLAG PS_FIELD(68, 1)
-#define PCM_SAMPLE_BIT_DEPTH_LUMA PS_FIELD(69, 4)
-#define PCM_SAMPLE_BIT_DEPTH_CHROMA PS_FIELD(73, 4)
-#define PCM_LOOP_FILTER_DISABLED_FLAG PS_FIELD(77, 1)
-#define LOG2_DIFF_MAX_MIN_PCM_LUMA_CODING_BLOCK_SIZE PS_FIELD(78, 3)
-#define LOG2_MIN_PCM_LUMA_CODING_BLOCK_SIZE PS_FIELD(81, 3)
-#define NUM_SHORT_TERM_REF_PIC_SETS PS_FIELD(84, 7)
-#define LONG_TERM_REF_PICS_PRESENT_FLAG PS_FIELD(91, 1)
-#define NUM_LONG_TERM_REF_PICS_SPS PS_FIELD(92, 6)
-#define SPS_TEMPORAL_MVP_ENABLED_FLAG PS_FIELD(98, 1)
-#define STRONG_INTRA_SMOOTHING_ENABLED_FLAG PS_FIELD(99, 1)
+#define VIDEO_PARAMETER_SET_ID BW_FIELD(0, 4)
+#define SEQ_PARAMETER_SET_ID BW_FIELD(4, 4)
+#define CHROMA_FORMAT_IDC BW_FIELD(8, 2)
+#define PIC_WIDTH_IN_LUMA_SAMPLES BW_FIELD(10, 13)
+#define PIC_HEIGHT_IN_LUMA_SAMPLES BW_FIELD(23, 13)
+#define BIT_DEPTH_LUMA BW_FIELD(36, 4)
+#define BIT_DEPTH_CHROMA BW_FIELD(40, 4)
+#define LOG2_MAX_PIC_ORDER_CNT_LSB BW_FIELD(44, 5)
+#define LOG2_DIFF_MAX_MIN_LUMA_CODING_BLOCK_SIZE BW_FIELD(49, 2)
+#define LOG2_MIN_LUMA_CODING_BLOCK_SIZE BW_FIELD(51, 3)
+#define LOG2_MIN_TRANSFORM_BLOCK_SIZE BW_FIELD(54, 3)
+#define LOG2_DIFF_MAX_MIN_LUMA_TRANSFORM_BLOCK_SIZE BW_FIELD(57, 2)
+#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTER BW_FIELD(59, 3)
+#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTRA BW_FIELD(62, 3)
+#define SCALING_LIST_ENABLED_FLAG BW_FIELD(65, 1)
+#define AMP_ENABLED_FLAG BW_FIELD(66, 1)
+#define SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG BW_FIELD(67, 1)
+#define PCM_ENABLED_FLAG BW_FIELD(68, 1)
+#define PCM_SAMPLE_BIT_DEPTH_LUMA BW_FIELD(69, 4)
+#define PCM_SAMPLE_BIT_DEPTH_CHROMA BW_FIELD(73, 4)
+#define PCM_LOOP_FILTER_DISABLED_FLAG BW_FIELD(77, 1)
+#define LOG2_DIFF_MAX_MIN_PCM_LUMA_CODING_BLOCK_SIZE BW_FIELD(78, 3)
+#define LOG2_MIN_PCM_LUMA_CODING_BLOCK_SIZE BW_FIELD(81, 3)
+#define NUM_SHORT_TERM_REF_PIC_SETS BW_FIELD(84, 7)
+#define LONG_TERM_REF_PICS_PRESENT_FLAG BW_FIELD(91, 1)
+#define NUM_LONG_TERM_REF_PICS_SPS BW_FIELD(92, 6)
+#define SPS_TEMPORAL_MVP_ENABLED_FLAG BW_FIELD(98, 1)
+#define STRONG_INTRA_SMOOTHING_ENABLED_FLAG BW_FIELD(99, 1)
/* PPS */
-#define PIC_PARAMETER_SET_ID PS_FIELD(128, 6)
-#define PPS_SEQ_PARAMETER_SET_ID PS_FIELD(134, 4)
-#define DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG PS_FIELD(138, 1)
-#define OUTPUT_FLAG_PRESENT_FLAG PS_FIELD(139, 1)
-#define NUM_EXTRA_SLICE_HEADER_BITS PS_FIELD(140, 13)
-#define SIGN_DATA_HIDING_ENABLED_FLAG PS_FIELD(153, 1)
-#define CABAC_INIT_PRESENT_FLAG PS_FIELD(154, 1)
-#define NUM_REF_IDX_L0_DEFAULT_ACTIVE PS_FIELD(155, 4)
-#define NUM_REF_IDX_L1_DEFAULT_ACTIVE PS_FIELD(159, 4)
-#define INIT_QP_MINUS26 PS_FIELD(163, 7)
-#define CONSTRAINED_INTRA_PRED_FLAG PS_FIELD(170, 1)
-#define TRANSFORM_SKIP_ENABLED_FLAG PS_FIELD(171, 1)
-#define CU_QP_DELTA_ENABLED_FLAG PS_FIELD(172, 1)
-#define LOG2_MIN_CU_QP_DELTA_SIZE PS_FIELD(173, 3)
-#define PPS_CB_QP_OFFSET PS_FIELD(176, 5)
-#define PPS_CR_QP_OFFSET PS_FIELD(181, 5)
-#define PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG PS_FIELD(186, 1)
-#define WEIGHTED_PRED_FLAG PS_FIELD(187, 1)
-#define WEIGHTED_BIPRED_FLAG PS_FIELD(188, 1)
-#define TRANSQUANT_BYPASS_ENABLED_FLAG PS_FIELD(189, 1)
-#define TILES_ENABLED_FLAG PS_FIELD(190, 1)
-#define ENTROPY_CODING_SYNC_ENABLED_FLAG PS_FIELD(191, 1)
-#define PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED_FLAG PS_FIELD(192, 1)
-#define LOOP_FILTER_ACROSS_TILES_ENABLED_FLAG PS_FIELD(193, 1)
-#define DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG PS_FIELD(194, 1)
-#define PPS_DEBLOCKING_FILTER_DISABLED_FLAG PS_FIELD(195, 1)
-#define PPS_BETA_OFFSET_DIV2 PS_FIELD(196, 4)
-#define PPS_TC_OFFSET_DIV2 PS_FIELD(200, 4)
-#define LISTS_MODIFICATION_PRESENT_FLAG PS_FIELD(204, 1)
-#define LOG2_PARALLEL_MERGE_LEVEL PS_FIELD(205, 3)
-#define SLICE_SEGMENT_HEADER_EXTENSION_PRESENT_FLAG PS_FIELD(208, 1)
-#define NUM_TILE_COLUMNS PS_FIELD(212, 5)
-#define NUM_TILE_ROWS PS_FIELD(217, 5)
-#define COLUMN_WIDTH(i) PS_FIELD(256 + ((i) * 8), 8)
-#define ROW_HEIGHT(i) PS_FIELD(416 + ((i) * 8), 8)
-#define SCALING_LIST_ADDRESS PS_FIELD(592, 32)
+#define PIC_PARAMETER_SET_ID BW_FIELD(128, 6)
+#define PPS_SEQ_PARAMETER_SET_ID BW_FIELD(134, 4)
+#define DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG BW_FIELD(138, 1)
+#define OUTPUT_FLAG_PRESENT_FLAG BW_FIELD(139, 1)
+#define NUM_EXTRA_SLICE_HEADER_BITS BW_FIELD(140, 13)
+#define SIGN_DATA_HIDING_ENABLED_FLAG BW_FIELD(153, 1)
+#define CABAC_INIT_PRESENT_FLAG BW_FIELD(154, 1)
+#define NUM_REF_IDX_L0_DEFAULT_ACTIVE BW_FIELD(155, 4)
+#define NUM_REF_IDX_L1_DEFAULT_ACTIVE BW_FIELD(159, 4)
+#define INIT_QP_MINUS26 BW_FIELD(163, 7)
+#define CONSTRAINED_INTRA_PRED_FLAG BW_FIELD(170, 1)
+#define TRANSFORM_SKIP_ENABLED_FLAG BW_FIELD(171, 1)
+#define CU_QP_DELTA_ENABLED_FLAG BW_FIELD(172, 1)
+#define LOG2_MIN_CU_QP_DELTA_SIZE BW_FIELD(173, 3)
+#define PPS_CB_QP_OFFSET BW_FIELD(176, 5)
+#define PPS_CR_QP_OFFSET BW_FIELD(181, 5)
+#define PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG BW_FIELD(186, 1)
+#define WEIGHTED_PRED_FLAG BW_FIELD(187, 1)
+#define WEIGHTED_BIPRED_FLAG BW_FIELD(188, 1)
+#define TRANSQUANT_BYPASS_ENABLED_FLAG BW_FIELD(189, 1)
+#define TILES_ENABLED_FLAG BW_FIELD(190, 1)
+#define ENTROPY_CODING_SYNC_ENABLED_FLAG BW_FIELD(191, 1)
+#define PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED_FLAG BW_FIELD(192, 1)
+#define LOOP_FILTER_ACROSS_TILES_ENABLED_FLAG BW_FIELD(193, 1)
+#define DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG BW_FIELD(194, 1)
+#define PPS_DEBLOCKING_FILTER_DISABLED_FLAG BW_FIELD(195, 1)
+#define PPS_BETA_OFFSET_DIV2 BW_FIELD(196, 4)
+#define PPS_TC_OFFSET_DIV2 BW_FIELD(200, 4)
+#define LISTS_MODIFICATION_PRESENT_FLAG BW_FIELD(204, 1)
+#define LOG2_PARALLEL_MERGE_LEVEL BW_FIELD(205, 3)
+#define SLICE_SEGMENT_HEADER_EXTENSION_PRESENT_FLAG BW_FIELD(208, 1)
+#define NUM_TILE_COLUMNS BW_FIELD(212, 5)
+#define NUM_TILE_ROWS BW_FIELD(217, 5)
+#define COLUMN_WIDTH(i) BW_FIELD(256 + ((i) * 8), 8)
+#define ROW_HEIGHT(i) BW_FIELD(416 + ((i) * 8), 8)
+#define SCALING_LIST_ADDRESS BW_FIELD(592, 32)
/* Data structure describing auxiliary buffer format. */
struct rkvdec_hevc_priv_tbl {
@@ -123,20 +116,6 @@ struct rkvdec_hevc_ctx {
struct rkvdec_regs regs;
};
-static void set_ps_field(u32 *buf, struct rkvdec_ps_field field, u32 value)
-{
- u8 bit = field.offset % 32, word = field.offset / 32;
- u64 mask = GENMASK_ULL(bit + field.len - 1, bit);
- u64 val = ((u64)value << bit) & mask;
-
- buf[word] &= ~mask;
- buf[word] |= val;
- if (bit + field.len > 32) {
- buf[word + 1] &= ~(mask >> 32);
- buf[word + 1] |= val >> 32;
- }
-}
-
static void assemble_hw_pps(struct rkvdec_ctx *ctx,
struct rkvdec_hevc_run *run)
{
@@ -159,7 +138,7 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
hw_ps = &priv_tbl->param_set[pps->pic_parameter_set_id];
memset(hw_ps, 0, sizeof(*hw_ps));
-#define WRITE_PPS(value, field) set_ps_field(hw_ps->info, field, value)
+#define WRITE_PPS(value, field) rkvdec_set_bw_field(hw_ps->info, field, value)
/* write sps */
WRITE_PPS(sps->video_parameter_set_id, VIDEO_PARAMETER_SET_ID);
WRITE_PPS(sps->seq_parameter_set_id, SEQ_PARAMETER_SET_ID);
@@ -321,17 +300,17 @@ static void assemble_sw_rps(struct rkvdec_ctx *ctx,
int i, j;
unsigned int lowdelay;
-#define WRITE_RPS(value, field) set_ps_field(hw_ps->info, field, value)
+#define WRITE_RPS(value, field) rkvdec_set_bw_field(hw_ps->info, field, value)
-#define REF_PIC_LONG_TERM_L0(i) PS_FIELD((i) * 5, 1)
-#define REF_PIC_IDX_L0(i) PS_FIELD(1 + ((i) * 5), 4)
-#define REF_PIC_LONG_TERM_L1(i) PS_FIELD(((i) < 5 ? 75 : 132) + ((i) * 5), 1)
-#define REF_PIC_IDX_L1(i) PS_FIELD(((i) < 4 ? 76 : 128) + ((i) * 5), 4)
+#define REF_PIC_LONG_TERM_L0(i) BW_FIELD((i) * 5, 1)
+#define REF_PIC_IDX_L0(i) BW_FIELD(1 + ((i) * 5), 4)
+#define REF_PIC_LONG_TERM_L1(i) BW_FIELD(((i) < 5 ? 75 : 132) + ((i) * 5), 1)
+#define REF_PIC_IDX_L1(i) BW_FIELD(((i) < 4 ? 76 : 128) + ((i) * 5), 4)
-#define LOWDELAY PS_FIELD(182, 1)
-#define LONG_TERM_RPS_BIT_OFFSET PS_FIELD(183, 10)
-#define SHORT_TERM_RPS_BIT_OFFSET PS_FIELD(193, 9)
-#define NUM_RPS_POC PS_FIELD(202, 4)
+#define LOWDELAY BW_FIELD(182, 1)
+#define LONG_TERM_RPS_BIT_OFFSET BW_FIELD(183, 10)
+#define SHORT_TERM_RPS_BIT_OFFSET BW_FIELD(193, 9)
+#define NUM_RPS_POC BW_FIELD(202, 4)
for (j = 0; j < run->num_slices; j++) {
uint st_bit_offset = 0;
--
2.53.0
* [PATCH 3/4] media: rkvdec: common: Drop bitfields for the bitwriter
From: Detlev Casanova @ 2026-03-27 15:16 UTC
To: Ezequiel Garcia, Mauro Carvalho Chehab, Heiko Stuebner,
Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
Jonas Karlman, Nicolas Dufresne
Cc: linux-kernel, linux-media, linux-rockchip, linux-arm-kernel, llvm,
kernel, Detlev Casanova
In-Reply-To: <20260327-rkvdec-use-bitwriter-v1-0-982cf872b590@collabora.com>
Currently, the common code files for hevc and h264 use structs with
bitfields to represent the HW RPS buffer.
Because the bitfields are numerous and mostly unaligned, this causes
compiler issues, especially with Clang.
To prevent that, switch to the global bitwriter introduced earlier in
this series.
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
---
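The new offset macros can be cross-checked against the removed packed
structs with a little arithmetic. The sketch below is a userspace check,
not part of the patch: the function names are made up for this note, and
the "old" offsets are derived by hand, assuming packed bitfields are laid
out consecutively from bit 0 with no padding (as GCC and Clang do on
little-endian targets).

```c
#include <assert.h>

/* H.264: bit offset of dpb_info<N> in the removed struct layout. */
static unsigned int h264_old_dpb_info_offset(unsigned int reflist,
					     unsigned int refnum)
{
	unsigned int header_bits = 16 * 16 + 32;	/* frame_num[16] + reserved0 */
	unsigned int entry_bits = 8 * (5 + 1 + 1);	/* one struct rkvdec_rps_entry */
	unsigned int entry = reflist * 4 + refnum / 8;	/* as in set_dpb_info() */

	return header_bits + entry * entry_bits + (refnum % 8) * 7;
}

/* H.264: offset used by RPS_ENTRY_DPB_INFO(l, e) in this patch. */
static unsigned int h264_new_dpb_info_offset(unsigned int l, unsigned int e)
{
	return 288 + l * 7 * 32 + e * 7;
}

/* HEVC: bit offset of delta_poc<j> of short-term set i in the old structs. */
static unsigned int hevc_old_delta_poc_offset(unsigned int i, unsigned int j)
{
	unsigned int lt_refs_bits = 32 * 32;	/* refs[32], 32 bits each */
	/* num_negative + num_positive + 15 * (poc + flag) + reserved bits */
	unsigned int set_bits = 4 + 4 + 15 * 17 + 25 + 3 * 32;

	return lt_refs_bits + i * set_bits + 8 + j * 17;
}

/* HEVC: offset used by RPS_ST_REF_SET_DELTA_POC(i, j) in this patch. */
static unsigned int hevc_new_delta_poc_offset(unsigned int i, unsigned int j)
{
	return 1032 + i * 384 + j * 17;
}

/* Returns 1 when every old offset equals the corresponding new one. */
static int rps_offsets_match(void)
{
	unsigned int a, b;

	for (a = 0; a < 3; a++)
		for (b = 0; b < 32; b++)
			if (h264_old_dpb_info_offset(a, b) !=
			    h264_new_dpb_info_offset(a, b))
				return 0;

	for (a = 0; a < 64; a++)
		for (b = 0; b < 15; b++)
			if (hevc_old_delta_poc_offset(a, b) !=
			    hevc_new_delta_poc_offset(a, b))
				return 0;

	return 1;
}
```

Under those layout assumptions the two computations agree for every index
in the documented ranges (l: 0-2, e: 0-31 and i: 0-63, j: 0-14). This only
covers field offsets, not the total buffer sizes.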
.../platform/rockchip/rkvdec/rkvdec-h264-common.c | 51 +-----------
.../platform/rockchip/rkvdec/rkvdec-h264-common.h | 40 +++-------
.../platform/rockchip/rkvdec/rkvdec-hevc-common.c | 92 ++++------------------
.../platform/rockchip/rkvdec/rkvdec-hevc-common.h | 57 ++++----------
4 files changed, 43 insertions(+), 197 deletions(-)
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.c
index e28f06394470..54639512e456 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.c
@@ -21,51 +21,6 @@
#define RKVDEC_NUM_REFLIST 3
-static void set_dpb_info(struct rkvdec_rps_entry *entries,
- u8 reflist,
- u8 refnum,
- u8 info,
- bool bottom)
-{
- struct rkvdec_rps_entry *entry = &entries[(reflist * 4) + refnum / 8];
- u8 idx = refnum % 8;
-
- switch (idx) {
- case 0:
- entry->dpb_info0 = info;
- entry->bottom_flag0 = bottom;
- break;
- case 1:
- entry->dpb_info1 = info;
- entry->bottom_flag1 = bottom;
- break;
- case 2:
- entry->dpb_info2 = info;
- entry->bottom_flag2 = bottom;
- break;
- case 3:
- entry->dpb_info3 = info;
- entry->bottom_flag3 = bottom;
- break;
- case 4:
- entry->dpb_info4 = info;
- entry->bottom_flag4 = bottom;
- break;
- case 5:
- entry->dpb_info5 = info;
- entry->bottom_flag5 = bottom;
- break;
- case 6:
- entry->dpb_info6 = info;
- entry->bottom_flag6 = bottom;
- break;
- case 7:
- entry->dpb_info7 = info;
- entry->bottom_flag7 = bottom;
- break;
- }
-}
-
void lookup_ref_buf_idx(struct rkvdec_ctx *ctx,
struct rkvdec_h264_run *run)
{
@@ -111,7 +66,7 @@ void assemble_hw_rps(struct v4l2_h264_reflist_builder *builder,
if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
continue;
- hw_rps->frame_num[i] = builder->refs[i].frame_num;
+ rkvdec_set_bw_field(hw_rps->info, RPS_FRAME_NUM(i), builder->refs[i].frame_num);
}
for (j = 0; j < RKVDEC_NUM_REFLIST; j++) {
@@ -138,7 +93,9 @@ void assemble_hw_rps(struct v4l2_h264_reflist_builder *builder,
dpb_valid = !!(run->ref_buf[ref->index]);
bottom = ref->fields == V4L2_H264_BOTTOM_FIELD_REF;
- set_dpb_info(hw_rps->entries, j, i, ref->index | (dpb_valid << 4), bottom);
+ rkvdec_set_bw_field(hw_rps->info, RPS_ENTRY_DPB_INFO(j, i),
+ ref->index | (dpb_valid << 4));
+ rkvdec_set_bw_field(hw_rps->info, RPS_ENTRY_BOTTOM_FLAG(j, i), bottom);
}
}
}
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.h
index 5336370507d6..8d3255289135 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.h
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-h264-common.h
@@ -16,6 +16,7 @@
#include <media/v4l2-mem2mem.h>
#include "rkvdec.h"
+#include "rkvdec-bitwriter.h"
struct rkvdec_h264_scaling_list {
u8 scaling_list_4x4[6][16];
@@ -38,39 +39,16 @@ struct rkvdec_h264_run {
struct vb2_buffer *ref_buf[V4L2_H264_NUM_DPB_ENTRIES];
};
-struct rkvdec_rps_entry {
- u32 dpb_info0: 5;
- u32 bottom_flag0: 1;
- u32 view_index_off0: 1;
- u32 dpb_info1: 5;
- u32 bottom_flag1: 1;
- u32 view_index_off1: 1;
- u32 dpb_info2: 5;
- u32 bottom_flag2: 1;
- u32 view_index_off2: 1;
- u32 dpb_info3: 5;
- u32 bottom_flag3: 1;
- u32 view_index_off3: 1;
- u32 dpb_info4: 5;
- u32 bottom_flag4: 1;
- u32 view_index_off4: 1;
- u32 dpb_info5: 5;
- u32 bottom_flag5: 1;
- u32 view_index_off5: 1;
- u32 dpb_info6: 5;
- u32 bottom_flag6: 1;
- u32 view_index_off6: 1;
- u32 dpb_info7: 5;
- u32 bottom_flag7: 1;
- u32 view_index_off7: 1;
-} __packed;
+#define RPS_FRAME_NUM(i) BW_FIELD((i) * 16, 16)
+#define RPS_ENTRY_DPB_INFO(l, e) BW_FIELD(288 + (l) * 7 * 32 + (e) * 7, 5) //l: 0-2, e: 0-31
+#define RPS_ENTRY_BOTTOM_FLAG(l, e) BW_FIELD(293 + (l) * 7 * 32 + (e) * 7, 1) //l: 0-2, e: 0-31
+#define RPS_ENTRY_VIEW_INDEX_OFF(l, e) BW_FIELD(294 + (l) * 7 * 32 + (e) * 7, 1) //l: 0-2, e: 0-31
+
+#define RKVDEC_H264_RPS_SIZE ALIGN(RPS_ENTRY_VIEW_INDEX_OFF(3, 32).offset, 128)
struct rkvdec_rps {
- u16 frame_num[16];
- u32 reserved0;
- struct rkvdec_rps_entry entries[12];
- u32 reserved1[66];
-} __packed;
+ u32 info[RKVDEC_H264_RPS_SIZE / 8 / 4];
+};
void lookup_ref_buf_idx(struct rkvdec_ctx *ctx, struct rkvdec_h264_run *run);
void assemble_hw_rps(struct v4l2_h264_reflist_builder *builder,
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c
index 3119f3bc9f98..be7e86dd976b 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c
@@ -74,72 +74,6 @@ void compute_tiles_non_uniform(struct rkvdec_hevc_run *run, u16 log2_min_cb_size
row_height[i] = pic_in_cts_height - sum;
}
-static void set_ref_poc(struct rkvdec_rps_short_term_ref_set *set, int poc, int value, int flag)
-{
- switch (poc) {
- case 0:
- set->delta_poc0 = value;
- set->used_flag0 = flag;
- break;
- case 1:
- set->delta_poc1 = value;
- set->used_flag1 = flag;
- break;
- case 2:
- set->delta_poc2 = value;
- set->used_flag2 = flag;
- break;
- case 3:
- set->delta_poc3 = value;
- set->used_flag3 = flag;
- break;
- case 4:
- set->delta_poc4 = value;
- set->used_flag4 = flag;
- break;
- case 5:
- set->delta_poc5 = value;
- set->used_flag5 = flag;
- break;
- case 6:
- set->delta_poc6 = value;
- set->used_flag6 = flag;
- break;
- case 7:
- set->delta_poc7 = value;
- set->used_flag7 = flag;
- break;
- case 8:
- set->delta_poc8 = value;
- set->used_flag8 = flag;
- break;
- case 9:
- set->delta_poc9 = value;
- set->used_flag9 = flag;
- break;
- case 10:
- set->delta_poc10 = value;
- set->used_flag10 = flag;
- break;
- case 11:
- set->delta_poc11 = value;
- set->used_flag11 = flag;
- break;
- case 12:
- set->delta_poc12 = value;
- set->used_flag12 = flag;
- break;
- case 13:
- set->delta_poc13 = value;
- set->used_flag13 = flag;
- break;
- case 14:
- set->delta_poc14 = value;
- set->used_flag14 = flag;
- break;
- }
-}
-
static void assemble_scalingfactor0(struct rkvdec_ctx *ctx, u8 *output,
const struct v4l2_ctrl_hevc_scaling_matrix *input)
{
@@ -218,10 +152,10 @@ static void rkvdec_hevc_assemble_hw_lt_rps(struct rkvdec_hevc_run *run, struct r
return;
for (int i = 0; i < sps->num_long_term_ref_pics_sps; i++) {
- rps->refs[i].lt_ref_pic_poc_lsb =
- run->ext_sps_lt_rps[i].lt_ref_pic_poc_lsb_sps;
- rps->refs[i].used_by_curr_pic_lt_flag =
- !!(run->ext_sps_lt_rps[i].flags & V4L2_HEVC_EXT_SPS_LT_RPS_FLAG_USED_LT);
+ rkvdec_set_bw_field(rps->info, RPS_LT_REF_PIC_POC_LSB(i),
+ run->ext_sps_lt_rps[i].lt_ref_pic_poc_lsb_sps);
+ rkvdec_set_bw_field(rps->info, RPS_LT_REF_USED_BY_CURR_PIC(i),
+ !!(run->ext_sps_lt_rps[i].flags & V4L2_HEVC_EXT_SPS_LT_RPS_FLAG_USED_LT));
}
}
@@ -235,18 +169,24 @@ static void rkvdec_hevc_assemble_hw_st_rps(struct rkvdec_hevc_run *run, struct r
int j = 0;
const struct calculated_rps_st_set *set = &calculated_rps_st_sets[i];
- rps->short_term_ref_sets[i].num_negative = set->num_negative_pics;
- rps->short_term_ref_sets[i].num_positive = set->num_positive_pics;
+ rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_NUM_NEGATIVE(i),
+ set->num_negative_pics);
+ rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_NUM_POSITIVE(i),
+ set->num_positive_pics);
for (; j < set->num_negative_pics; j++) {
- set_ref_poc(&rps->short_term_ref_sets[i], j,
- set->delta_poc_s0[j], set->used_by_curr_pic_s0[j]);
+ rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_DELTA_POC(i, j),
+ set->delta_poc_s0[j]);
+ rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_USED(i, j),
+ set->used_by_curr_pic_s0[j]);
}
poc = j;
for (j = 0; j < set->num_positive_pics; j++) {
- set_ref_poc(&rps->short_term_ref_sets[i], poc + j,
- set->delta_poc_s1[j], set->used_by_curr_pic_s1[j]);
+ rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_DELTA_POC(i, poc + j),
+ set->delta_poc_s1[j]);
+ rkvdec_set_bw_field(rps->info, RPS_ST_REF_SET_USED(i, poc + j),
+ set->used_by_curr_pic_s1[j]);
}
}
}
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.h
index 6f4faca4c091..cd3a2eb36b58 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.h
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.h
@@ -19,53 +19,24 @@
#include <linux/types.h>
#include "rkvdec.h"
+#include "rkvdec-bitwriter.h"
-struct rkvdec_rps_refs {
- u16 lt_ref_pic_poc_lsb;
- u16 used_by_curr_pic_lt_flag : 1;
- u16 reserved : 15;
-} __packed;
+#define RPS_LT_REF_PIC_POC_LSB(i) BW_FIELD(0 + (i) * 32, 16) // i: 0-31
+#define RPS_LT_REF_USED_BY_CURR_PIC(i) BW_FIELD(16 + (i) * 32, 1) // i: 0-31
-struct rkvdec_rps_short_term_ref_set {
- u32 num_negative : 4;
- u32 num_positive : 4;
- u32 delta_poc0 : 16;
- u32 used_flag0 : 1;
- u32 delta_poc1 : 16;
- u32 used_flag1 : 1;
- u32 delta_poc2 : 16;
- u32 used_flag2 : 1;
- u32 delta_poc3 : 16;
- u32 used_flag3 : 1;
- u32 delta_poc4 : 16;
- u32 used_flag4 : 1;
- u32 delta_poc5 : 16;
- u32 used_flag5 : 1;
- u32 delta_poc6 : 16;
- u32 used_flag6 : 1;
- u32 delta_poc7 : 16;
- u32 used_flag7 : 1;
- u32 delta_poc8 : 16;
- u32 used_flag8 : 1;
- u32 delta_poc9 : 16;
- u32 used_flag9 : 1;
- u32 delta_poc10 : 16;
- u32 used_flag10 : 1;
- u32 delta_poc11 : 16;
- u32 used_flag11 : 1;
- u32 delta_poc12 : 16;
- u32 used_flag12 : 1;
- u32 delta_poc13 : 16;
- u32 used_flag13 : 1;
- u32 delta_poc14 : 16;
- u32 used_flag14 : 1;
- u32 reserved_bits : 25;
- u32 reserved[3];
-} __packed;
+#define RPS_ST_REF_SET_NUM_NEGATIVE(i) BW_FIELD(1024 + ((i) * 384), 4) // i: 0-63
+#define RPS_ST_REF_SET_NUM_POSITIVE(i) BW_FIELD(1028 + ((i) * 384), 4) // i: 0-63
+
+// i: 0-63, j: 0-14
+#define RPS_ST_REF_SET_DELTA_POC(i, j) BW_FIELD(1032 + ((i) * 384) + ((j) * 17), 16)
+
+// i: 0-63, j: 0-14
+#define RPS_ST_REF_SET_USED(i, j) BW_FIELD(1048 + ((i) * 384) + ((j) * 17), 1)
+
+#define RKVDEC_RPS_HEVC_SIZE ALIGN(RPS_ST_REF_SET_USED(64, 15).offset, 128)
struct rkvdec_rps {
- struct rkvdec_rps_refs refs[32];
- struct rkvdec_rps_short_term_ref_set short_term_ref_sets[64];
+ u32 info[RKVDEC_RPS_HEVC_SIZE / 8 / 4];
} __packed;
struct rkvdec_hevc_run {
--
2.53.0
* [PATCH 4/4] media: rkvdec: vdpu383: Drop bitfields for the bitwriter
From: Detlev Casanova @ 2026-03-27 15:16 UTC
To: Ezequiel Garcia, Mauro Carvalho Chehab, Heiko Stuebner,
Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt,
Jonas Karlman, Nicolas Dufresne
Cc: linux-kernel, linux-media, linux-rockchip, linux-arm-kernel, llvm,
kernel, Detlev Casanova
In-Reply-To: <20260327-rkvdec-use-bitwriter-v1-0-982cf872b590@collabora.com>
The VDPU383 support for hevc and h264 uses structs with bitfields to
represent the SPS and PPS.
Because the fields are numerous and mostly unaligned, this causes
compiler issues, especially with Clang.
To prevent that, switch to the global bitwriter introduced earlier in
this series.
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
---
.../platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c | 351 ++++++--------
.../platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c | 502 +++++++++------------
2 files changed, 360 insertions(+), 493 deletions(-)
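As a rough illustration of what replacing a packed bitfield with a bitwriter means, here is a minimal userspace sketch. It is not the driver's helper: struct bw_field, SKETCH_FIELD() and the sketch_* functions are hypothetical stand-ins for BW_FIELD() and rkvdec_set_bw_field(), and the LSB-first layout within little-endian 32-bit words is an assumption chosen to mirror how the removed bitfields would be laid out on little-endian ARM.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical field descriptor: bit offset plus width. */
struct bw_field {
	unsigned int offset;	/* position of the field's LSB, in bits */
	unsigned int len;	/* width in bits, 1..32 */
};

#define SKETCH_FIELD(off, width) \
	((struct bw_field){ .offset = (off), .len = (width) })

/* Write the low field.len bits of value at field.offset (assumed LSB first). */
static void sketch_set_field(uint32_t *buf, struct bw_field field, uint32_t value)
{
	unsigned int word = field.offset / 32;
	unsigned int shift = field.offset % 32;
	uint64_t mask, v;

	assert(field.len >= 1 && field.len <= 32);

	mask = (field.len == 32) ? 0xffffffffull : ((1ull << field.len) - 1);
	v = ((uint64_t)value & mask) << shift;
	mask <<= shift;

	buf[word] = (buf[word] & ~(uint32_t)mask) | (uint32_t)v;
	/* A field that straddles a 32-bit boundary spills into the next word. */
	if (mask >> 32)
		buf[word + 1] = (buf[word + 1] & ~(uint32_t)(mask >> 32)) |
				(uint32_t)(v >> 32);
}

/* Read a field back; useful for checking a layout in a test. */
static uint32_t sketch_get_field(const uint32_t *buf, struct bw_field field)
{
	unsigned int word = field.offset / 32;
	unsigned int shift = field.offset % 32;
	uint64_t mask = (field.len == 32) ? 0xffffffffull :
					    ((1ull << field.len) - 1);
	uint64_t raw = (uint64_t)buf[word] >> shift;

	if (shift + field.len > 32)
		raw |= (uint64_t)buf[word + 1] << (32 - shift);

	return (uint32_t)(raw & mask);
}
```

A field such as PIC_ORDER_CNT_TYPE at (31, 2) straddles the first word boundary, which is the kind of access the compiler previously had to synthesize for every unaligned bitfield member. With explicit offsets, each write is a plain mask-and-shift on one or two words.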
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c
index fb4f849d7366..a08038fbc6d5 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c
@@ -15,105 +15,64 @@
#include "rkvdec-cabac.h"
#include "rkvdec-vdpu383-regs.h"
#include "rkvdec-h264-common.h"
-
-struct rkvdec_sps {
- u16 seq_parameter_set_id: 4;
- u16 profile_idc: 8;
- u16 constraint_set3_flag: 1;
- u16 chroma_format_idc: 2;
- u16 bit_depth_luma: 3;
- u16 bit_depth_chroma: 3;
- u16 qpprime_y_zero_transform_bypass_flag: 1;
- u16 log2_max_frame_num_minus4: 4;
- u16 max_num_ref_frames: 5;
- u16 pic_order_cnt_type: 2;
- u16 log2_max_pic_order_cnt_lsb_minus4: 4;
- u16 delta_pic_order_always_zero_flag: 1;
-
- u16 pic_width_in_mbs: 16;
- u16 pic_height_in_mbs: 16;
-
- u16 frame_mbs_only_flag: 1;
- u16 mb_adaptive_frame_field_flag: 1;
- u16 direct_8x8_inference_flag: 1;
- u16 mvc_extension_enable: 1;
- u16 num_views: 2;
- u16 view_id0: 10;
- u16 view_id1: 10;
-} __packed;
-
-struct rkvdec_pps {
- u32 pic_parameter_set_id: 8;
- u32 pps_seq_parameter_set_id: 5;
- u32 entropy_coding_mode_flag: 1;
- u32 bottom_field_pic_order_in_frame_present_flag: 1;
- u32 num_ref_idx_l0_default_active_minus1: 5;
- u32 num_ref_idx_l1_default_active_minus1: 5;
- u32 weighted_pred_flag: 1;
- u32 weighted_bipred_idc: 2;
- u32 pic_init_qp_minus26: 7;
- u32 pic_init_qs_minus26: 6;
- u32 chroma_qp_index_offset: 5;
- u32 deblocking_filter_control_present_flag: 1;
- u32 constrained_intra_pred_flag: 1;
- u32 redundant_pic_cnt_present: 1;
- u32 transform_8x8_mode_flag: 1;
- u32 second_chroma_qp_index_offset: 5;
- u32 scaling_list_enable_flag: 1;
- u32 is_longterm: 16;
- u32 voidx: 16;
-
- // dpb
- u32 pic_field_flag: 1;
- u32 pic_associated_flag: 1;
- u32 cur_top_field: 32;
- u32 cur_bot_field: 32;
-
- u32 top_field_order_cnt0: 32;
- u32 bot_field_order_cnt0: 32;
- u32 top_field_order_cnt1: 32;
- u32 bot_field_order_cnt1: 32;
- u32 top_field_order_cnt2: 32;
- u32 bot_field_order_cnt2: 32;
- u32 top_field_order_cnt3: 32;
- u32 bot_field_order_cnt3: 32;
- u32 top_field_order_cnt4: 32;
- u32 bot_field_order_cnt4: 32;
- u32 top_field_order_cnt5: 32;
- u32 bot_field_order_cnt5: 32;
- u32 top_field_order_cnt6: 32;
- u32 bot_field_order_cnt6: 32;
- u32 top_field_order_cnt7: 32;
- u32 bot_field_order_cnt7: 32;
- u32 top_field_order_cnt8: 32;
- u32 bot_field_order_cnt8: 32;
- u32 top_field_order_cnt9: 32;
- u32 bot_field_order_cnt9: 32;
- u32 top_field_order_cnt10: 32;
- u32 bot_field_order_cnt10: 32;
- u32 top_field_order_cnt11: 32;
- u32 bot_field_order_cnt11: 32;
- u32 top_field_order_cnt12: 32;
- u32 bot_field_order_cnt12: 32;
- u32 top_field_order_cnt13: 32;
- u32 bot_field_order_cnt13: 32;
- u32 top_field_order_cnt14: 32;
- u32 bot_field_order_cnt14: 32;
- u32 top_field_order_cnt15: 32;
- u32 bot_field_order_cnt15: 32;
-
- u32 ref_field_flags: 16;
- u32 ref_topfield_used: 16;
- u32 ref_botfield_used: 16;
- u32 ref_colmv_use_flag: 16;
-
- u32 reserved0: 30;
- u32 reserved[3];
-} __packed;
+#include "rkvdec-bitwriter.h"
+
+#define SEQ_PARAMETER_SET_ID BW_FIELD(0, 4)
+#define PROFILE_IDC BW_FIELD(4, 8)
+#define CONSTRAINT_SET3_FLAG BW_FIELD(12, 1)
+#define CHROMA_FORMAT_IDC BW_FIELD(13, 2)
+#define BIT_DEPTH_LUMA BW_FIELD(15, 3)
+#define BIT_DEPTH_CHROMA BW_FIELD(18, 3)
+#define QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG BW_FIELD(21, 1)
+#define LOG2_MAX_FRAME_NUM_MINUS4 BW_FIELD(22, 4)
+#define MAX_NUM_REF_FRAMES BW_FIELD(26, 5)
+#define PIC_ORDER_CNT_TYPE BW_FIELD(31, 2)
+#define LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4 BW_FIELD(33, 4)
+#define DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG BW_FIELD(37, 1)
+#define PIC_WIDTH_IN_MBS BW_FIELD(38, 16)
+#define PIC_HEIGHT_IN_MBS BW_FIELD(54, 16)
+#define FRAME_MBS_ONLY_FLAG BW_FIELD(70, 1)
+#define MB_ADAPTIVE_FRAME_FIELD_FLAG BW_FIELD(71, 1)
+#define DIRECT_8X8_INFERENCE_FLAG BW_FIELD(72, 1)
+#define MVC_EXTENSION_ENABLE BW_FIELD(73, 1)
+#define NUM_VIEWS BW_FIELD(74, 2)
+#define VIEW_ID(i) BW_FIELD(76 + ((i) * 10), 10) // i: 0-1
+
+#define PIC_PARAMETER_SET_ID BW_FIELD(96, 8)
+#define PPS_SEQ_PARAMETER_SET_ID BW_FIELD(104, 5)
+#define ENTROPY_CODING_MODE_FLAG BW_FIELD(109, 1)
+#define BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT_FLAG BW_FIELD(110, 1)
+#define NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(i) BW_FIELD(111 + ((i) * 5), 5) // i: 0-1
+#define WEIGHTED_PRED_FLAG BW_FIELD(121, 1)
+#define WEIGHTED_BIPRED_IDC BW_FIELD(122, 2)
+#define PIC_INIT_QP_MINUS26 BW_FIELD(124, 7)
+#define PIC_INIT_QS_MINUS26 BW_FIELD(131, 6)
+#define CHROMA_QP_INDEX_OFFSET BW_FIELD(137, 5)
+#define DEBLOCKING_FILTER_CONTROL_PRESENT_FLAG BW_FIELD(142, 1)
+#define CONSTRAINED_INTRA_PRED_FLAG BW_FIELD(143, 1)
+#define REDUNDANT_PIC_CNT_PRESENT BW_FIELD(144, 1)
+#define TRANSFORM_8X8_MODE_FLAG BW_FIELD(145, 1)
+#define SECOND_CHROMA_QP_INDEX_OFFSET BW_FIELD(146, 5)
+#define SCALING_LIST_ENABLE_FLAG BW_FIELD(151, 1)
+#define IS_LONG_TERM(i) BW_FIELD(152 + (i), 1) // i: 0-15
+
+#define PIC_FIELD_FLAG BW_FIELD(184, 1)
+#define PIC_ASSOCIATED_FLAG BW_FIELD(185, 1)
+#define CUR_TOP_FIELD BW_FIELD(186, 32)
+#define CUR_BOT_FIELD BW_FIELD(218, 32)
+
+#define TOP_FIELD_ORDER_CNT(i) BW_FIELD(250 + (i) * 64, 32) // i: 0-15
+#define BOT_FIELD_ORDER_CNT(i) BW_FIELD(282 + (i) * 64, 32) // i: 0-15
+
+#define REF_FIELD_FLAGS(i) BW_FIELD(1274 + (i), 1) // i: 0-15
+#define REF_TOPFIELD_USED(i) BW_FIELD(1290 + (i), 1) // i: 0-15
+#define REF_BOTFIELD_USED(i) BW_FIELD(1306 + (i), 1) // i: 0-15
+#define REF_COLMV_USE_FLAG(i) BW_FIELD(1322 + (i), 1) // i: 0-15
+
+#define SPS_SIZE ALIGN(REF_COLMV_USE_FLAG(16).offset, 128)
struct rkvdec_sps_pps {
- struct rkvdec_sps sps;
- struct rkvdec_pps pps;
+ u32 info[SPS_SIZE / 8 / 4];
} __packed;
/* Data structure describing auxiliary buffer format. */
@@ -130,67 +89,6 @@ struct rkvdec_h264_ctx {
struct vdpu383_regs_h26x regs;
};
-static noinline_for_stack void set_field_order_cnt(struct rkvdec_pps *pps, const struct v4l2_h264_dpb_entry *dpb)
-{
- pps->top_field_order_cnt0 = dpb[0].top_field_order_cnt;
- pps->bot_field_order_cnt0 = dpb[0].bottom_field_order_cnt;
- pps->top_field_order_cnt1 = dpb[1].top_field_order_cnt;
- pps->bot_field_order_cnt1 = dpb[1].bottom_field_order_cnt;
- pps->top_field_order_cnt2 = dpb[2].top_field_order_cnt;
- pps->bot_field_order_cnt2 = dpb[2].bottom_field_order_cnt;
- pps->top_field_order_cnt3 = dpb[3].top_field_order_cnt;
- pps->bot_field_order_cnt3 = dpb[3].bottom_field_order_cnt;
- pps->top_field_order_cnt4 = dpb[4].top_field_order_cnt;
- pps->bot_field_order_cnt4 = dpb[4].bottom_field_order_cnt;
- pps->top_field_order_cnt5 = dpb[5].top_field_order_cnt;
- pps->bot_field_order_cnt5 = dpb[5].bottom_field_order_cnt;
- pps->top_field_order_cnt6 = dpb[6].top_field_order_cnt;
- pps->bot_field_order_cnt6 = dpb[6].bottom_field_order_cnt;
- pps->top_field_order_cnt7 = dpb[7].top_field_order_cnt;
- pps->bot_field_order_cnt7 = dpb[7].bottom_field_order_cnt;
- pps->top_field_order_cnt8 = dpb[8].top_field_order_cnt;
- pps->bot_field_order_cnt8 = dpb[8].bottom_field_order_cnt;
- pps->top_field_order_cnt9 = dpb[9].top_field_order_cnt;
- pps->bot_field_order_cnt9 = dpb[9].bottom_field_order_cnt;
- pps->top_field_order_cnt10 = dpb[10].top_field_order_cnt;
- pps->bot_field_order_cnt10 = dpb[10].bottom_field_order_cnt;
- pps->top_field_order_cnt11 = dpb[11].top_field_order_cnt;
- pps->bot_field_order_cnt11 = dpb[11].bottom_field_order_cnt;
- pps->top_field_order_cnt12 = dpb[12].top_field_order_cnt;
- pps->bot_field_order_cnt12 = dpb[12].bottom_field_order_cnt;
- pps->top_field_order_cnt13 = dpb[13].top_field_order_cnt;
- pps->bot_field_order_cnt13 = dpb[13].bottom_field_order_cnt;
- pps->top_field_order_cnt14 = dpb[14].top_field_order_cnt;
- pps->bot_field_order_cnt14 = dpb[14].bottom_field_order_cnt;
- pps->top_field_order_cnt15 = dpb[15].top_field_order_cnt;
- pps->bot_field_order_cnt15 = dpb[15].bottom_field_order_cnt;
-}
-
-static noinline_for_stack void set_dec_params(struct rkvdec_pps *pps, const struct v4l2_ctrl_h264_decode_params *dec_params)
-{
- const struct v4l2_h264_dpb_entry *dpb = dec_params->dpb;
-
- for (int i = 0; i < ARRAY_SIZE(dec_params->dpb); i++) {
- if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
- pps->is_longterm |= (1 << i);
- pps->ref_field_flags |=
- (!!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD)) << i;
- pps->ref_colmv_use_flag |=
- (!!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) << i;
- pps->ref_topfield_used |=
- (!!(dpb[i].fields & V4L2_H264_TOP_FIELD_REF)) << i;
- pps->ref_botfield_used |=
- (!!(dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF)) << i;
- }
- pps->pic_field_flag =
- !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC);
- pps->pic_associated_flag =
- !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD);
-
- pps->cur_top_field = dec_params->top_field_order_cnt;
- pps->cur_bot_field = dec_params->bottom_field_order_cnt;
-}
-
static void assemble_hw_pps(struct rkvdec_ctx *ctx,
struct rkvdec_h264_run *run)
{
@@ -202,6 +100,7 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
struct rkvdec_h264_priv_tbl *priv_tbl = h264_ctx->priv_tbl.cpu;
struct rkvdec_sps_pps *hw_ps;
u32 pic_width, pic_height;
+ int i;
/*
* HW read the SPS/PPS information from PPS packet index by PPS id.
@@ -213,23 +112,25 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
memset(hw_ps, 0, sizeof(*hw_ps));
/* write sps */
- hw_ps->sps.seq_parameter_set_id = sps->seq_parameter_set_id;
- hw_ps->sps.profile_idc = sps->profile_idc;
- hw_ps->sps.constraint_set3_flag = !!(sps->constraint_set_flags & (1 << 3));
- hw_ps->sps.chroma_format_idc = sps->chroma_format_idc;
- hw_ps->sps.bit_depth_luma = sps->bit_depth_luma_minus8;
- hw_ps->sps.bit_depth_chroma = sps->bit_depth_chroma_minus8;
- hw_ps->sps.qpprime_y_zero_transform_bypass_flag =
- !!(sps->flags & V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
- hw_ps->sps.log2_max_frame_num_minus4 = sps->log2_max_frame_num_minus4;
- hw_ps->sps.max_num_ref_frames = sps->max_num_ref_frames;
- hw_ps->sps.pic_order_cnt_type = sps->pic_order_cnt_type;
- hw_ps->sps.log2_max_pic_order_cnt_lsb_minus4 =
- sps->log2_max_pic_order_cnt_lsb_minus4;
- hw_ps->sps.delta_pic_order_always_zero_flag =
- !!(sps->flags & V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
- hw_ps->sps.mvc_extension_enable = 0;
- hw_ps->sps.num_views = 0;
+ rkvdec_set_bw_field(hw_ps->info, SEQ_PARAMETER_SET_ID, sps->seq_parameter_set_id);
+ rkvdec_set_bw_field(hw_ps->info, PROFILE_IDC, sps->profile_idc);
+ rkvdec_set_bw_field(hw_ps->info, CONSTRAINT_SET3_FLAG,
+ !!(sps->constraint_set_flags & (1 << 3)));
+ rkvdec_set_bw_field(hw_ps->info, CHROMA_FORMAT_IDC, sps->chroma_format_idc);
+ rkvdec_set_bw_field(hw_ps->info, BIT_DEPTH_LUMA, sps->bit_depth_luma_minus8);
+ rkvdec_set_bw_field(hw_ps->info, BIT_DEPTH_CHROMA, sps->bit_depth_chroma_minus8);
+ rkvdec_set_bw_field(hw_ps->info, QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG,
+ !!(sps->flags & V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS));
+ rkvdec_set_bw_field(hw_ps->info, LOG2_MAX_FRAME_NUM_MINUS4,
+ sps->log2_max_frame_num_minus4);
+ rkvdec_set_bw_field(hw_ps->info, MAX_NUM_REF_FRAMES, sps->max_num_ref_frames);
+ rkvdec_set_bw_field(hw_ps->info, PIC_ORDER_CNT_TYPE, sps->pic_order_cnt_type);
+ rkvdec_set_bw_field(hw_ps->info, LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4,
+ sps->log2_max_pic_order_cnt_lsb_minus4);
+ rkvdec_set_bw_field(hw_ps->info, DELTA_PIC_ORDER_ALWAYS_ZERO_FLAG,
+ !!(sps->flags & V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO));
+ rkvdec_set_bw_field(hw_ps->info, MVC_EXTENSION_ENABLE, 0);
+ rkvdec_set_bw_field(hw_ps->info, NUM_VIEWS, 0);
/*
* Use the SPS values since they are already in macroblocks
@@ -245,48 +146,72 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
if (!!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC))
pic_height /= 2;
- hw_ps->sps.pic_width_in_mbs = pic_width;
- hw_ps->sps.pic_height_in_mbs = pic_height;
+ rkvdec_set_bw_field(hw_ps->info, PIC_WIDTH_IN_MBS, pic_width);
+ rkvdec_set_bw_field(hw_ps->info, PIC_HEIGHT_IN_MBS, pic_height);
- hw_ps->sps.frame_mbs_only_flag =
- !!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
- hw_ps->sps.mb_adaptive_frame_field_flag =
- !!(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
- hw_ps->sps.direct_8x8_inference_flag =
- !!(sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
+ rkvdec_set_bw_field(hw_ps->info, FRAME_MBS_ONLY_FLAG,
+ !!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY));
+ rkvdec_set_bw_field(hw_ps->info, MB_ADAPTIVE_FRAME_FIELD_FLAG,
+ !!(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD));
+ rkvdec_set_bw_field(hw_ps->info, DIRECT_8X8_INFERENCE_FLAG,
+ !!(sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE));
/* write pps */
- hw_ps->pps.pic_parameter_set_id = pps->pic_parameter_set_id;
- hw_ps->pps.pps_seq_parameter_set_id = pps->seq_parameter_set_id;
- hw_ps->pps.entropy_coding_mode_flag =
- !!(pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
- hw_ps->pps.bottom_field_pic_order_in_frame_present_flag =
- !!(pps->flags & V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
- hw_ps->pps.num_ref_idx_l0_default_active_minus1 =
- pps->num_ref_idx_l0_default_active_minus1;
- hw_ps->pps.num_ref_idx_l1_default_active_minus1 =
- pps->num_ref_idx_l1_default_active_minus1;
- hw_ps->pps.weighted_pred_flag =
- !!(pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
- hw_ps->pps.weighted_bipred_idc = pps->weighted_bipred_idc;
- hw_ps->pps.pic_init_qp_minus26 = pps->pic_init_qp_minus26;
- hw_ps->pps.pic_init_qs_minus26 = pps->pic_init_qs_minus26;
- hw_ps->pps.chroma_qp_index_offset = pps->chroma_qp_index_offset;
- hw_ps->pps.deblocking_filter_control_present_flag =
- !!(pps->flags & V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
- hw_ps->pps.constrained_intra_pred_flag =
- !!(pps->flags & V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
- hw_ps->pps.redundant_pic_cnt_present =
- !!(pps->flags & V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
- hw_ps->pps.transform_8x8_mode_flag =
- !!(pps->flags & V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
- hw_ps->pps.second_chroma_qp_index_offset = pps->second_chroma_qp_index_offset;
- hw_ps->pps.scaling_list_enable_flag =
- !!(pps->flags & V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
-
- set_field_order_cnt(&hw_ps->pps, dpb);
- set_dec_params(&hw_ps->pps, dec_params);
+ rkvdec_set_bw_field(hw_ps->info, PIC_PARAMETER_SET_ID, pps->pic_parameter_set_id);
+ rkvdec_set_bw_field(hw_ps->info, PPS_SEQ_PARAMETER_SET_ID, pps->seq_parameter_set_id);
+ rkvdec_set_bw_field(hw_ps->info, ENTROPY_CODING_MODE_FLAG,
+ !!(pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE));
+ rkvdec_set_bw_field(hw_ps->info, BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT_FLAG,
+ !!(pps->flags &
+ V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT));
+ rkvdec_set_bw_field(hw_ps->info, NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(0),
+ pps->num_ref_idx_l0_default_active_minus1);
+ rkvdec_set_bw_field(hw_ps->info, NUM_REF_IDX_L_DEFAULT_ACTIVE_MINUS1(1),
+ pps->num_ref_idx_l1_default_active_minus1);
+ rkvdec_set_bw_field(hw_ps->info, WEIGHTED_PRED_FLAG,
+ !!(pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED));
+ rkvdec_set_bw_field(hw_ps->info, WEIGHTED_BIPRED_IDC, pps->weighted_bipred_idc);
+ rkvdec_set_bw_field(hw_ps->info, PIC_INIT_QP_MINUS26, pps->pic_init_qp_minus26);
+ rkvdec_set_bw_field(hw_ps->info, PIC_INIT_QS_MINUS26, pps->pic_init_qs_minus26);
+ rkvdec_set_bw_field(hw_ps->info, CHROMA_QP_INDEX_OFFSET, pps->chroma_qp_index_offset);
+ rkvdec_set_bw_field(hw_ps->info, DEBLOCKING_FILTER_CONTROL_PRESENT_FLAG,
+ !!(pps->flags & V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
+ rkvdec_set_bw_field(hw_ps->info, CONSTRAINED_INTRA_PRED_FLAG,
+ !!(pps->flags & V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED));
+ rkvdec_set_bw_field(hw_ps->info, REDUNDANT_PIC_CNT_PRESENT,
+ !!(pps->flags & V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT));
+ rkvdec_set_bw_field(hw_ps->info, TRANSFORM_8X8_MODE_FLAG,
+ !!(pps->flags & V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE));
+ rkvdec_set_bw_field(hw_ps->info, SECOND_CHROMA_QP_INDEX_OFFSET,
+ pps->second_chroma_qp_index_offset);
+ rkvdec_set_bw_field(hw_ps->info, SCALING_LIST_ENABLE_FLAG,
+ !!(pps->flags & V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT));
+
+ for (i = 0; i < ARRAY_SIZE(dec_params->dpb); i++) {
+ rkvdec_set_bw_field(hw_ps->info, TOP_FIELD_ORDER_CNT(i),
+ dpb[i].top_field_order_cnt);
+ rkvdec_set_bw_field(hw_ps->info, BOT_FIELD_ORDER_CNT(i),
+ dpb[i].bottom_field_order_cnt);
+
+ rkvdec_set_bw_field(hw_ps->info, IS_LONG_TERM(i),
+ !!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM));
+ rkvdec_set_bw_field(hw_ps->info, REF_FIELD_FLAGS(i),
+ !!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD));
+ rkvdec_set_bw_field(hw_ps->info, REF_COLMV_USE_FLAG(i),
+ !!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE));
+ rkvdec_set_bw_field(hw_ps->info, REF_TOPFIELD_USED(i),
+ !!(dpb[i].fields & V4L2_H264_TOP_FIELD_REF));
+ rkvdec_set_bw_field(hw_ps->info, REF_BOTFIELD_USED(i),
+ !!(dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF));
+ }
+
+ rkvdec_set_bw_field(hw_ps->info, PIC_FIELD_FLAG,
+ !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC));
+ rkvdec_set_bw_field(hw_ps->info, PIC_ASSOCIATED_FLAG,
+ !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD));
+ rkvdec_set_bw_field(hw_ps->info, CUR_TOP_FIELD, dec_params->top_field_order_cnt);
+ rkvdec_set_bw_field(hw_ps->info, CUR_BOT_FIELD, dec_params->bottom_field_order_cnt);
}
static void rkvdec_write_regs(struct rkvdec_ctx *ctx)
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c
index 96d938ee70b0..c818a92f1e63 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-hevc.c
@@ -13,149 +13,106 @@
#include "rkvdec-rcb.h"
#include "rkvdec-hevc-common.h"
#include "rkvdec-vdpu383-regs.h"
+#include "rkvdec-bitwriter.h"
+
+#define VIDEO_PARAMETER_SET_ID BW_FIELD(0, 4)
+#define SEQ_PARAMETER_SET_ID BW_FIELD(4, 4)
+#define CHROMA_FORMAT_IDC BW_FIELD(8, 2)
+#define PIC_WIDTH_IN_LUMA_SAMPLES BW_FIELD(10, 16)
+#define PIC_HEIGHT_IN_LUMA_SAMPLES BW_FIELD(26, 16)
+#define BIT_DEPTH_LUMA BW_FIELD(42, 3)
+#define BIT_DEPTH_CHROMA BW_FIELD(45, 3)
+#define LOG2_MAX_PIC_ORDER_CNT_LSB BW_FIELD(48, 5)
+#define LOG2_DIFF_MAX_MIN_LUMA_CODING_BLOCK_SIZE BW_FIELD(53, 2)
+#define LOG2_MIN_LUMA_CODING_BLOCK_SIZE BW_FIELD(55, 3)
+#define LOG2_MIN_TRANSFORM_BLOCK_SIZE BW_FIELD(58, 3)
+#define LOG2_DIFF_MAX_MIN_LUMA_TRANSFORM_BLOCK_SIZE BW_FIELD(61, 2)
+#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTER BW_FIELD(63, 3)
+#define MAX_TRANSFORM_HIERARCHY_DEPTH_INTRA BW_FIELD(66, 3)
+#define SCALING_LIST_ENABLED_FLAG BW_FIELD(69, 1)
+#define AMP_ENABLED_FLAG BW_FIELD(70, 1)
+#define SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG BW_FIELD(71, 1)
+#define PCM_ENABLED_FLAG BW_FIELD(72, 1)
+#define PCM_SAMPLE_BIT_DEPTH_LUMA BW_FIELD(73, 4)
+#define PCM_SAMPLE_BIT_DEPTH_CHROMA BW_FIELD(77, 4)
+#define PCM_LOOP_FILTER_DISABLED_FLAG BW_FIELD(81, 1)
+#define LOG2_DIFF_MAX_MIN_PCM_LUMA_CODING_BLOCK_SIZE BW_FIELD(82, 3)
+#define LOG2_MIN_PCM_LUMA_CODING_BLOCK_SIZE BW_FIELD(85, 3)
+#define NUM_SHORT_TERM_REF_PIC_SETS BW_FIELD(88, 7)
+#define LONG_TERM_REF_PICS_PRESENT_FLAG BW_FIELD(95, 1)
+#define NUM_LONG_TERM_REF_PICS_SPS BW_FIELD(96, 6)
+#define SPS_TEMPORAL_MVP_ENABLED_FLAG BW_FIELD(102, 1)
+#define STRONG_INTRA_SMOOTHING_ENABLED_FLAG BW_FIELD(103, 1)
+#define SPS_MAX_DEC_PIC_BUFFERING_MINUS1 BW_FIELD(111, 4)
+#define SEPARATE_COLOUR_PLANE_FLAG BW_FIELD(115, 1)
+#define HIGH_PRECISION_OFFSETS_ENABLED_FLAG BW_FIELD(116, 1)
+#define PERSISTENT_RICE_ADAPTATION_ENABLED_FLAG BW_FIELD(117, 1)
+
+/* PPS */
+#define PIC_PARAMETER_SET_ID BW_FIELD(118, 6)
+#define PPS_SEQ_PARAMETER_SET_ID BW_FIELD(124, 4)
+#define DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG BW_FIELD(128, 1)
+#define OUTPUT_FLAG_PRESENT_FLAG BW_FIELD(129, 1)
+#define NUM_EXTRA_SLICE_HEADER_BITS BW_FIELD(130, 13)
+#define SIGN_DATA_HIDING_ENABLED_FLAG BW_FIELD(143, 1)
+#define CABAC_INIT_PRESENT_FLAG BW_FIELD(144, 1)
+#define NUM_REF_IDX_L0_DEFAULT_ACTIVE BW_FIELD(145, 4)
+#define NUM_REF_IDX_L1_DEFAULT_ACTIVE BW_FIELD(149, 4)
+#define INIT_QP_MINUS26 BW_FIELD(153, 7)
+#define CONSTRAINED_INTRA_PRED_FLAG BW_FIELD(160, 1)
+#define TRANSFORM_SKIP_ENABLED_FLAG BW_FIELD(161, 1)
+#define CU_QP_DELTA_ENABLED_FLAG BW_FIELD(162, 1)
+#define LOG2_MIN_CU_QP_DELTA_SIZE BW_FIELD(163, 3)
+#define PPS_CB_QP_OFFSET BW_FIELD(166, 5)
+#define PPS_CR_QP_OFFSET BW_FIELD(171, 5)
+#define PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG BW_FIELD(176, 1)
+#define WEIGHTED_PRED_FLAG BW_FIELD(177, 1)
+#define WEIGHTED_BIPRED_FLAG BW_FIELD(178, 1)
+#define TRANSQUANT_BYPASS_ENABLED_FLAG BW_FIELD(179, 1)
+#define TILES_ENABLED_FLAG BW_FIELD(180, 1)
+#define ENTROPY_CODING_SYNC_ENABLED_FLAG BW_FIELD(181, 1)
+#define PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED_FLAG BW_FIELD(182, 1)
+#define LOOP_FILTER_ACROSS_TILES_ENABLED_FLAG BW_FIELD(183, 1)
+#define DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG BW_FIELD(184, 1)
+#define PPS_DEBLOCKING_FILTER_DISABLED_FLAG BW_FIELD(185, 1)
+#define PPS_BETA_OFFSET_DIV2 BW_FIELD(186, 4)
+#define PPS_TC_OFFSET_DIV2 BW_FIELD(190, 4)
+#define LISTS_MODIFICATION_PRESENT_FLAG BW_FIELD(194, 1)
+#define LOG2_PARALLEL_MERGE_LEVEL BW_FIELD(195, 3)
+#define SLICE_SEGMENT_HEADER_EXTENSION_PRESENT_FLAG BW_FIELD(198, 1)
+
+/* pps extensions */
+#define LOG2_MAX_TRANSFORM_SKIP_BLOCK_SIZE BW_FIELD(202, 2)
+#define CROSS_COMPONENT_PREDICTION_ENABLED_FLAG BW_FIELD(204, 1)
+#define CHROMA_QP_OFFSET_LIST_ENABLED_FLAG BW_FIELD(205, 1)
+#define LOG2_MIN_CU_CHROMA_QP_DELTA_SIZE BW_FIELD(206, 3)
+#define CB_QP_OFFSET_LIST(i) BW_FIELD(209 + (i) * 5, 5) // i: 0-5
+#define CB_CR_OFFSET_LIST(i) BW_FIELD(239 + (i) * 5, 5) // i: 0-5
+#define CHROMA_QP_OFFSET_LIST_LEN_MINUS1 BW_FIELD(269, 3)
+
+/* mvc0 && mvc1 */
+#define MVC_FF BW_FIELD(272, 16)
+#define MVC_00 BW_FIELD(288, 9)
+
+/* poc info */
+#define RESERVED2 BW_FIELD(297, 3)
+#define CURRENT_POC BW_FIELD(300, 32)
+#define REF_PIC_POC(i) BW_FIELD(332 + (i) * 32, 32) // i: 0-14
+#define RESERVED3 BW_FIELD(812, 32)
+#define REF_IS_VALID(i) BW_FIELD(844 + (i), 1) // i: 0-14
+#define RESERVED4 BW_FIELD(859, 1)
+
+/* tile info */
+#define NUM_TILE_COLUMNS BW_FIELD(860, 5)
+#define NUM_TILE_ROWS BW_FIELD(865, 5)
+#define COLUMN_WIDTH(i) BW_FIELD(870 + (i) * 12, 12) // i: 0-19
+#define ROW_HEIGHT(i) BW_FIELD(1110 + (i) * 12, 12) // i: 0-21
+
+#define HEVC_SPS_SIZE ALIGN(ROW_HEIGHT(22).offset, 256)
struct rkvdec_hevc_sps_pps {
- // SPS
- u16 video_parameters_set_id : 4;
- u16 seq_parameters_set_id_sps : 4;
- u16 chroma_format_idc : 2;
- u16 width : 16;
- u16 height : 16;
- u16 bit_depth_luma : 3;
- u16 bit_depth_chroma : 3;
- u16 max_pic_order_count_lsb : 5;
- u16 diff_max_min_luma_coding_block_size : 2;
- u16 min_luma_coding_block_size : 3;
- u16 min_transform_block_size : 3;
- u16 diff_max_min_transform_block_size : 2;
- u16 max_transform_hierarchy_depth_inter : 3;
- u16 max_transform_hierarchy_depth_intra : 3;
- u16 scaling_list_enabled_flag : 1;
- u16 amp_enabled_flag : 1;
- u16 sample_adaptive_offset_enabled_flag : 1;
- u16 pcm_enabled_flag : 1;
- u16 pcm_sample_bit_depth_luma : 4;
- u16 pcm_sample_bit_depth_chroma : 4;
- u16 pcm_loop_filter_disabled_flag : 1;
- u16 diff_max_min_pcm_luma_coding_block_size : 3;
- u16 min_pcm_luma_coding_block_size : 3;
- u16 num_short_term_ref_pic_sets : 7;
- u16 long_term_ref_pics_present_flag : 1;
- u16 num_long_term_ref_pics_sps : 6;
- u16 sps_temporal_mvp_enabled_flag : 1;
- u16 strong_intra_smoothing_enabled_flag : 1;
- u16 reserved0 : 7;
- u16 sps_max_dec_pic_buffering_minus1 : 4;
- u16 separate_colour_plane_flag : 1;
- u16 high_precision_offsets_enabled_flag : 1;
- u16 persistent_rice_adaptation_enabled_flag : 1;
-
- // PPS
- u16 picture_parameters_set_id : 6;
- u16 seq_parameters_set_id_pps : 4;
- u16 dependent_slice_segments_enabled_flag : 1;
- u16 output_flag_present_flag : 1;
- u16 num_extra_slice_header_bits : 13;
- u16 sign_data_hiding_enabled_flag : 1;
- u16 cabac_init_present_flag : 1;
- u16 num_ref_idx_l0_default_active : 4;
- u16 num_ref_idx_l1_default_active : 4;
- u16 init_qp_minus26 : 7;
- u16 constrained_intra_pred_flag : 1;
- u16 transform_skip_enabled_flag : 1;
- u16 cu_qp_delta_enabled_flag : 1;
- u16 log2_min_cb_size : 3;
- u16 pps_cb_qp_offset : 5;
- u16 pps_cr_qp_offset : 5;
- u16 pps_slice_chroma_qp_offsets_present_flag : 1;
- u16 weighted_pred_flag : 1;
- u16 weighted_bipred_flag : 1;
- u16 transquant_bypass_enabled_flag : 1;
- u16 tiles_enabled_flag : 1;
- u16 entropy_coding_sync_enabled_flag : 1;
- u16 pps_loop_filter_across_slices_enabled_flag : 1;
- u16 loop_filter_across_tiles_enabled_flag : 1;
- u16 deblocking_filter_override_enabled_flag : 1;
- u16 pps_deblocking_filter_disabled_flag : 1;
- u16 pps_beta_offset_div2 : 4;
- u16 pps_tc_offset_div2 : 4;
- u16 lists_modification_present_flag : 1;
- u16 log2_parallel_merge_level : 3;
- u16 slice_segment_header_extension_present_flag : 1;
- u16 reserved1 : 3;
-
- // pps extensions
- u16 log2_max_transform_skip_block_size : 2;
- u16 cross_component_prediction_enabled_flag : 1;
- u16 chroma_qp_offset_list_enabled_flag : 1;
- u16 log2_min_cu_chroma_qp_delta_size : 3;
- u16 cb_qp_offset_list0 : 5;
- u16 cb_qp_offset_list1 : 5;
- u16 cb_qp_offset_list2 : 5;
- u16 cb_qp_offset_list3 : 5;
- u16 cb_qp_offset_list4 : 5;
- u16 cb_qp_offset_list5 : 5;
- u16 cb_cr_offset_list0 : 5;
- u16 cb_cr_offset_list1 : 5;
- u16 cb_cr_offset_list2 : 5;
- u16 cb_cr_offset_list3 : 5;
- u16 cb_cr_offset_list4 : 5;
- u16 cb_cr_offset_list5 : 5;
- u16 chroma_qp_offset_list_len_minus1 : 3;
-
- /* mvc0 && mvc1 */
- u16 mvc_ff : 16;
- u16 mvc_00 : 9;
-
- /* poc info */
- u16 reserved2 : 3;
- u32 current_poc : 32;
- u32 ref_pic_poc0 : 32;
- u32 ref_pic_poc1 : 32;
- u32 ref_pic_poc2 : 32;
- u32 ref_pic_poc3 : 32;
- u32 ref_pic_poc4 : 32;
- u32 ref_pic_poc5 : 32;
- u32 ref_pic_poc6 : 32;
- u32 ref_pic_poc7 : 32;
- u32 ref_pic_poc8 : 32;
- u32 ref_pic_poc9 : 32;
- u32 ref_pic_poc10 : 32;
- u32 ref_pic_poc11 : 32;
- u32 ref_pic_poc12 : 32;
- u32 ref_pic_poc13 : 32;
- u32 ref_pic_poc14 : 32;
- u32 reserved3 : 32;
- u32 ref_is_valid : 15;
- u32 reserved4 : 1;
-
- /* tile info*/
- u16 num_tile_columns : 5;
- u16 num_tile_rows : 5;
- u32 column_width0 : 24;
- u32 column_width1 : 24;
- u32 column_width2 : 24;
- u32 column_width3 : 24;
- u32 column_width4 : 24;
- u32 column_width5 : 24;
- u32 column_width6 : 24;
- u32 column_width7 : 24;
- u32 column_width8 : 24;
- u32 column_width9 : 24;
- u32 row_height0 : 24;
- u32 row_height1 : 24;
- u32 row_height2 : 24;
- u32 row_height3 : 24;
- u32 row_height4 : 24;
- u32 row_height5 : 24;
- u32 row_height6 : 24;
- u32 row_height7 : 24;
- u32 row_height8 : 24;
- u32 row_height9 : 24;
- u32 row_height10 : 24;
- u32 reserved5 : 2;
- u32 padding;
-} __packed;
+ u32 info[HEVC_SPS_SIZE / 8 / 4];
+};
struct rkvdec_hevc_priv_tbl {
struct rkvdec_hevc_sps_pps param_set;
@@ -171,51 +128,6 @@ struct rkvdec_hevc_ctx {
struct vdpu383_regs_h26x regs;
};
-static void set_column_row(struct rkvdec_hevc_sps_pps *hw_ps, u16 *column, u16 *row)
-{
- hw_ps->column_width0 = column[0] | (column[1] << 12);
- hw_ps->row_height0 = row[0] | (row[1] << 12);
- hw_ps->column_width1 = column[2] | (column[3] << 12);
- hw_ps->row_height1 = row[2] | (row[3] << 12);
- hw_ps->column_width2 = column[4] | (column[5] << 12);
- hw_ps->row_height2 = row[4] | (row[5] << 12);
- hw_ps->column_width3 = column[6] | (column[7] << 12);
- hw_ps->row_height3 = row[6] | (row[7] << 12);
- hw_ps->column_width4 = column[8] | (column[9] << 12);
- hw_ps->row_height4 = row[8] | (row[9] << 12);
- hw_ps->column_width5 = column[10] | (column[11] << 12);
- hw_ps->row_height5 = row[10] | (row[11] << 12);
- hw_ps->column_width6 = column[12] | (column[13] << 12);
- hw_ps->row_height6 = row[12] | (row[13] << 12);
- hw_ps->column_width7 = column[14] | (column[15] << 12);
- hw_ps->row_height7 = row[14] | (row[15] << 12);
- hw_ps->column_width8 = column[16] | (column[17] << 12);
- hw_ps->row_height8 = row[16] | (row[17] << 12);
- hw_ps->column_width9 = column[18] | (column[19] << 12);
- hw_ps->row_height9 = row[18] | (row[19] << 12);
-
- hw_ps->row_height10 = row[20] | (row[21] << 12);
-}
-
-static void set_pps_ref_pic_poc(struct rkvdec_hevc_sps_pps *hw_ps, const struct v4l2_hevc_dpb_entry *dpb)
-{
- hw_ps->ref_pic_poc0 = dpb[0].pic_order_cnt_val;
- hw_ps->ref_pic_poc1 = dpb[1].pic_order_cnt_val;
- hw_ps->ref_pic_poc2 = dpb[2].pic_order_cnt_val;
- hw_ps->ref_pic_poc3 = dpb[3].pic_order_cnt_val;
- hw_ps->ref_pic_poc4 = dpb[4].pic_order_cnt_val;
- hw_ps->ref_pic_poc5 = dpb[5].pic_order_cnt_val;
- hw_ps->ref_pic_poc6 = dpb[6].pic_order_cnt_val;
- hw_ps->ref_pic_poc7 = dpb[7].pic_order_cnt_val;
- hw_ps->ref_pic_poc8 = dpb[8].pic_order_cnt_val;
- hw_ps->ref_pic_poc9 = dpb[9].pic_order_cnt_val;
- hw_ps->ref_pic_poc10 = dpb[10].pic_order_cnt_val;
- hw_ps->ref_pic_poc11 = dpb[11].pic_order_cnt_val;
- hw_ps->ref_pic_poc12 = dpb[12].pic_order_cnt_val;
- hw_ps->ref_pic_poc13 = dpb[13].pic_order_cnt_val;
- hw_ps->ref_pic_poc14 = dpb[14].pic_order_cnt_val;
-}
-
static void assemble_hw_pps(struct rkvdec_ctx *ctx,
struct rkvdec_hevc_run *run)
{
@@ -245,104 +157,130 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
memset(hw_ps, 0, sizeof(*hw_ps));
/* write sps */
- hw_ps->video_parameters_set_id = sps->video_parameter_set_id;
- hw_ps->seq_parameters_set_id_sps = sps->seq_parameter_set_id;
- hw_ps->chroma_format_idc = sps->chroma_format_idc;
+ rkvdec_set_bw_field(hw_ps->info, VIDEO_PARAMETER_SET_ID, sps->video_parameter_set_id);
+ rkvdec_set_bw_field(hw_ps->info, SEQ_PARAMETER_SET_ID, sps->seq_parameter_set_id);
+ rkvdec_set_bw_field(hw_ps->info, CHROMA_FORMAT_IDC, sps->chroma_format_idc);
log2_min_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
width = sps->pic_width_in_luma_samples;
height = sps->pic_height_in_luma_samples;
- hw_ps->width = width;
- hw_ps->height = height;
- hw_ps->bit_depth_luma = sps->bit_depth_luma_minus8 + 8;
- hw_ps->bit_depth_chroma = sps->bit_depth_chroma_minus8 + 8;
- hw_ps->max_pic_order_count_lsb = sps->log2_max_pic_order_cnt_lsb_minus4 + 4;
- hw_ps->diff_max_min_luma_coding_block_size = sps->log2_diff_max_min_luma_coding_block_size;
- hw_ps->min_luma_coding_block_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
- hw_ps->min_transform_block_size = sps->log2_min_luma_transform_block_size_minus2 + 2;
- hw_ps->diff_max_min_transform_block_size =
- sps->log2_diff_max_min_luma_transform_block_size;
- hw_ps->max_transform_hierarchy_depth_inter = sps->max_transform_hierarchy_depth_inter;
- hw_ps->max_transform_hierarchy_depth_intra = sps->max_transform_hierarchy_depth_intra;
- hw_ps->scaling_list_enabled_flag =
- !!(sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED);
- hw_ps->amp_enabled_flag = !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED);
- hw_ps->sample_adaptive_offset_enabled_flag =
- !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET);
+
+ rkvdec_set_bw_field(hw_ps->info, PIC_WIDTH_IN_LUMA_SAMPLES, width);
+ rkvdec_set_bw_field(hw_ps->info, PIC_HEIGHT_IN_LUMA_SAMPLES, height);
+ rkvdec_set_bw_field(hw_ps->info, BIT_DEPTH_LUMA, sps->bit_depth_luma_minus8 + 8);
+ rkvdec_set_bw_field(hw_ps->info, BIT_DEPTH_CHROMA, sps->bit_depth_chroma_minus8 + 8);
+ rkvdec_set_bw_field(hw_ps->info, LOG2_MAX_PIC_ORDER_CNT_LSB,
+ sps->log2_max_pic_order_cnt_lsb_minus4 + 4);
+ rkvdec_set_bw_field(hw_ps->info, LOG2_DIFF_MAX_MIN_LUMA_CODING_BLOCK_SIZE,
+ sps->log2_diff_max_min_luma_coding_block_size);
+ rkvdec_set_bw_field(hw_ps->info, LOG2_MIN_LUMA_CODING_BLOCK_SIZE,
+ sps->log2_min_luma_coding_block_size_minus3 + 3);
+ rkvdec_set_bw_field(hw_ps->info, LOG2_MIN_TRANSFORM_BLOCK_SIZE,
+ sps->log2_min_luma_transform_block_size_minus2 + 2);
+ rkvdec_set_bw_field(hw_ps->info, LOG2_DIFF_MAX_MIN_LUMA_TRANSFORM_BLOCK_SIZE,
+ sps->log2_diff_max_min_luma_transform_block_size);
+ rkvdec_set_bw_field(hw_ps->info, MAX_TRANSFORM_HIERARCHY_DEPTH_INTER,
+ sps->max_transform_hierarchy_depth_inter);
+ rkvdec_set_bw_field(hw_ps->info, MAX_TRANSFORM_HIERARCHY_DEPTH_INTRA,
+ sps->max_transform_hierarchy_depth_intra);
+ rkvdec_set_bw_field(hw_ps->info, SCALING_LIST_ENABLED_FLAG,
+ !!(sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, AMP_ENABLED_FLAG,
+ !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG,
+ !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
pcm_enabled = !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED);
- hw_ps->pcm_enabled_flag = pcm_enabled;
- hw_ps->pcm_sample_bit_depth_luma =
- pcm_enabled ? sps->pcm_sample_bit_depth_luma_minus1 + 1 : 0;
- hw_ps->pcm_sample_bit_depth_chroma =
- pcm_enabled ? sps->pcm_sample_bit_depth_chroma_minus1 + 1 : 0;
- hw_ps->pcm_loop_filter_disabled_flag =
- !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED);
- hw_ps->diff_max_min_pcm_luma_coding_block_size =
- sps->log2_diff_max_min_pcm_luma_coding_block_size;
- hw_ps->min_pcm_luma_coding_block_size =
- pcm_enabled ? sps->log2_min_pcm_luma_coding_block_size_minus3 + 3 : 0;
- hw_ps->num_short_term_ref_pic_sets = sps->num_short_term_ref_pic_sets;
- hw_ps->long_term_ref_pics_present_flag =
- !!(sps->flags & V4L2_HEVC_SPS_FLAG_LONG_TERM_REF_PICS_PRESENT);
- hw_ps->num_long_term_ref_pics_sps = sps->num_long_term_ref_pics_sps;
- hw_ps->sps_temporal_mvp_enabled_flag =
- !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED);
- hw_ps->strong_intra_smoothing_enabled_flag =
- !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED);
- hw_ps->sps_max_dec_pic_buffering_minus1 = sps->sps_max_dec_pic_buffering_minus1;
+ rkvdec_set_bw_field(hw_ps->info, PCM_ENABLED_FLAG, pcm_enabled);
+ rkvdec_set_bw_field(hw_ps->info, PCM_SAMPLE_BIT_DEPTH_LUMA,
+ pcm_enabled ? sps->pcm_sample_bit_depth_luma_minus1 + 1 : 0);
+ rkvdec_set_bw_field(hw_ps->info, PCM_SAMPLE_BIT_DEPTH_CHROMA,
+ pcm_enabled ? sps->pcm_sample_bit_depth_chroma_minus1 + 1 : 0);
+ rkvdec_set_bw_field(hw_ps->info, PCM_LOOP_FILTER_DISABLED_FLAG,
+ !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
+ rkvdec_set_bw_field(hw_ps->info, LOG2_DIFF_MAX_MIN_PCM_LUMA_CODING_BLOCK_SIZE,
+ sps->log2_diff_max_min_pcm_luma_coding_block_size);
+ rkvdec_set_bw_field(hw_ps->info, LOG2_MIN_PCM_LUMA_CODING_BLOCK_SIZE,
+ pcm_enabled ? sps->log2_min_pcm_luma_coding_block_size_minus3 + 3 : 0);
+ rkvdec_set_bw_field(hw_ps->info, NUM_SHORT_TERM_REF_PIC_SETS,
+ sps->num_short_term_ref_pic_sets);
+ rkvdec_set_bw_field(hw_ps->info, LONG_TERM_REF_PICS_PRESENT_FLAG,
+ !!(sps->flags & V4L2_HEVC_SPS_FLAG_LONG_TERM_REF_PICS_PRESENT));
+ rkvdec_set_bw_field(hw_ps->info, NUM_LONG_TERM_REF_PICS_SPS,
+ sps->num_long_term_ref_pics_sps);
+ rkvdec_set_bw_field(hw_ps->info, SPS_TEMPORAL_MVP_ENABLED_FLAG,
+ !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, STRONG_INTRA_SMOOTHING_ENABLED_FLAG,
+ !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, SPS_MAX_DEC_PIC_BUFFERING_MINUS1,
+ sps->sps_max_dec_pic_buffering_minus1);
/* write pps */
- hw_ps->picture_parameters_set_id = pps->pic_parameter_set_id;
- hw_ps->seq_parameters_set_id_pps = sps->seq_parameter_set_id;
- hw_ps->dependent_slice_segments_enabled_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT_ENABLED);
- hw_ps->output_flag_present_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_OUTPUT_FLAG_PRESENT);
- hw_ps->num_extra_slice_header_bits = pps->num_extra_slice_header_bits;
- hw_ps->sign_data_hiding_enabled_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED);
- hw_ps->cabac_init_present_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT);
- hw_ps->num_ref_idx_l0_default_active = pps->num_ref_idx_l0_default_active_minus1 + 1;
- hw_ps->num_ref_idx_l1_default_active = pps->num_ref_idx_l1_default_active_minus1 + 1;
- hw_ps->init_qp_minus26 = pps->init_qp_minus26;
- hw_ps->constrained_intra_pred_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED);
- hw_ps->transform_skip_enabled_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED);
- hw_ps->cu_qp_delta_enabled_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED);
- hw_ps->log2_min_cb_size = log2_min_cb_size +
- sps->log2_diff_max_min_luma_coding_block_size -
- pps->diff_cu_qp_delta_depth;
- hw_ps->pps_cb_qp_offset = pps->pps_cb_qp_offset;
- hw_ps->pps_cr_qp_offset = pps->pps_cr_qp_offset;
- hw_ps->pps_slice_chroma_qp_offsets_present_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT);
- hw_ps->weighted_pred_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED);
- hw_ps->weighted_bipred_flag = !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED);
- hw_ps->transquant_bypass_enabled_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED);
+ rkvdec_set_bw_field(hw_ps->info, PIC_PARAMETER_SET_ID, pps->pic_parameter_set_id);
+ rkvdec_set_bw_field(hw_ps->info, SEQ_PARAMETER_SET_ID, sps->seq_parameter_set_id);
+ rkvdec_set_bw_field(hw_ps->info, DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, OUTPUT_FLAG_PRESENT_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_OUTPUT_FLAG_PRESENT));
+ rkvdec_set_bw_field(hw_ps->info, NUM_EXTRA_SLICE_HEADER_BITS,
+ pps->num_extra_slice_header_bits);
+ rkvdec_set_bw_field(hw_ps->info, SIGN_DATA_HIDING_ENABLED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, CABAC_INIT_PRESENT_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
+ rkvdec_set_bw_field(hw_ps->info, NUM_REF_IDX_L0_DEFAULT_ACTIVE,
+ pps->num_ref_idx_l0_default_active_minus1 + 1);
+ rkvdec_set_bw_field(hw_ps->info, NUM_REF_IDX_L1_DEFAULT_ACTIVE,
+ pps->num_ref_idx_l1_default_active_minus1 + 1);
+ rkvdec_set_bw_field(hw_ps->info, INIT_QP_MINUS26, pps->init_qp_minus26);
+ rkvdec_set_bw_field(hw_ps->info, CONSTRAINED_INTRA_PRED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
+ rkvdec_set_bw_field(hw_ps->info, TRANSFORM_SKIP_ENABLED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, CU_QP_DELTA_ENABLED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, LOG2_MIN_CU_QP_DELTA_SIZE, log2_min_cb_size +
+ sps->log2_diff_max_min_luma_coding_block_size -
+ pps->diff_cu_qp_delta_depth);
+ rkvdec_set_bw_field(hw_ps->info, PPS_CB_QP_OFFSET, pps->pps_cb_qp_offset);
+ rkvdec_set_bw_field(hw_ps->info, PPS_CR_QP_OFFSET, pps->pps_cr_qp_offset);
+ rkvdec_set_bw_field(hw_ps->info, PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG,
+ !!(pps->flags &
+ V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
+ rkvdec_set_bw_field(hw_ps->info, WEIGHTED_PRED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
+ rkvdec_set_bw_field(hw_ps->info, WEIGHTED_BIPRED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
+ rkvdec_set_bw_field(hw_ps->info, TRANSQUANT_BYPASS_ENABLED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
- hw_ps->tiles_enabled_flag = tiles_enabled;
- hw_ps->entropy_coding_sync_enabled_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED);
- hw_ps->pps_loop_filter_across_slices_enabled_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED);
- hw_ps->loop_filter_across_tiles_enabled_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED);
- hw_ps->deblocking_filter_override_enabled_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED);
- hw_ps->pps_deblocking_filter_disabled_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER);
- hw_ps->pps_beta_offset_div2 = pps->pps_beta_offset_div2;
- hw_ps->pps_tc_offset_div2 = pps->pps_tc_offset_div2;
- hw_ps->lists_modification_present_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT);
- hw_ps->log2_parallel_merge_level = pps->log2_parallel_merge_level_minus2 + 2;
- hw_ps->slice_segment_header_extension_present_flag =
- !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT);
- hw_ps->num_tile_columns = tiles_enabled ? pps->num_tile_columns_minus1 + 1 : 1;
- hw_ps->num_tile_rows = tiles_enabled ? pps->num_tile_rows_minus1 + 1 : 1;
- hw_ps->mvc_ff = 0xffff;
+ rkvdec_set_bw_field(hw_ps->info, TILES_ENABLED_FLAG, tiles_enabled);
+ rkvdec_set_bw_field(hw_ps->info, ENTROPY_CODING_SYNC_ENABLED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED_FLAG,
+ !!(pps->flags &
+ V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, LOOP_FILTER_ACROSS_TILES_ENABLED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG,
+ !!(pps->flags &
+ V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
+ rkvdec_set_bw_field(hw_ps->info, PPS_DEBLOCKING_FILTER_DISABLED_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
+ rkvdec_set_bw_field(hw_ps->info, PPS_BETA_OFFSET_DIV2, pps->pps_beta_offset_div2);
+ rkvdec_set_bw_field(hw_ps->info, PPS_TC_OFFSET_DIV2, pps->pps_tc_offset_div2);
+ rkvdec_set_bw_field(hw_ps->info, LISTS_MODIFICATION_PRESENT_FLAG,
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
+ rkvdec_set_bw_field(hw_ps->info, LOG2_PARALLEL_MERGE_LEVEL,
+ pps->log2_parallel_merge_level_minus2 + 2);
+ rkvdec_set_bw_field(hw_ps->info, SLICE_SEGMENT_HEADER_EXTENSION_PRESENT_FLAG,
+ !!(pps->flags &
+ V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
+ rkvdec_set_bw_field(hw_ps->info, NUM_TILE_COLUMNS,
+ tiles_enabled ? pps->num_tile_columns_minus1 + 1 : 1);
+ rkvdec_set_bw_field(hw_ps->info, NUM_TILE_ROWS,
+ tiles_enabled ? pps->num_tile_rows_minus1 + 1 : 1);
+ rkvdec_set_bw_field(hw_ps->info, MVC_FF, 0xffff);
// Setup tiles information
memset(column_width, 0, sizeof(column_width));
@@ -367,15 +305,19 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
row_height[0] = (height + max_cu_width - 1) / max_cu_width;
}
- set_column_row(hw_ps, column_width, row_height);
+ for (i = 0; i < 20; i++)
+ rkvdec_set_bw_field(hw_ps->info, COLUMN_WIDTH(i), column_width[i]);
+ for (i = 0; i < 22; i++)
+ rkvdec_set_bw_field(hw_ps->info, ROW_HEIGHT(i), row_height[i]);
// Setup POC information
- hw_ps->current_poc = dec_params->pic_order_cnt_val;
+ rkvdec_set_bw_field(hw_ps->info, CURRENT_POC, dec_params->pic_order_cnt_val);
- set_pps_ref_pic_poc(hw_ps, dec_params->dpb);
for (i = 0; i < ARRAY_SIZE(dec_params->dpb); i++) {
- u32 valid = !!(dec_params->num_active_dpb_entries > i);
- hw_ps->ref_is_valid |= valid << i;
+ rkvdec_set_bw_field(hw_ps->info, REF_IS_VALID(i),
+ !!(dec_params->num_active_dpb_entries > i));
+ rkvdec_set_bw_field(hw_ps->info, REF_PIC_POC(i),
+ dec_params->dpb[i].pic_order_cnt_val);
}
}
--
2.53.0
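
For readers following the conversion above, here is a minimal sketch of what a bitwriter helper can look like. It is an illustration under assumptions, not the rkvdec-bitwriter.c code from this series: bw_set_field(), its argument order and its LSB-first bit numbering are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the driver's rkvdec_set_bw_field()): write the low
 * `len` bits of `value` into `buf`, starting at absolute bit `offset`.
 * Bits are numbered LSB-first within each 32-bit word (an assumption).
 */
static void bw_set_field(uint32_t *buf, size_t offset, unsigned int len,
			 uint32_t value)
{
	size_t word = offset / 32;
	unsigned int shift = offset % 32;
	uint64_t mask, val;

	assert(buf && len >= 1 && len <= 32);

	/* Build the mask and value in 64 bits so a shifted field cannot overflow. */
	mask = (1ull << len) - 1;
	val = ((uint64_t)value & mask) << shift;
	mask <<= shift;

	/* The low part lands in the first word. */
	buf[word] = (buf[word] & ~(uint32_t)mask) | (uint32_t)val;

	/* A field that straddles a word boundary spills into the next word. */
	if (shift + len > 32)
		buf[word + 1] = (buf[word + 1] & ~(uint32_t)(mask >> 32)) |
				(uint32_t)(val >> 32);
}
```

With a helper of this shape, a field that crosses a 32-bit boundary costs the same explicit shift-and-mask sequence as an aligned one, so the compiler never has to synthesise accesses to a heavily unaligned bitfield struct.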
* Re: [PATCH net-next v2 00/15] net: stmmac: qcom-ethqos: more cleanups
From: Mohd Ayaan Anwar @ 2026-03-27 15:20 UTC
To: Russell King (Oracle)
Cc: Andrew Lunn, Alexandre Torgue, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-arm-msm,
linux-stm32, netdev, Paolo Abeni
In-Reply-To: <acZDEg9wdjhBTHlL@shell.armlinux.org.uk>
Hi Russell,
On Fri, Mar 27, 2026 at 08:42:58AM +0000, Russell King (Oracle) wrote:
> Further cleanups to qcom-ethqos, mainly concentrating on the RGMII
> code, making it clearer what the differences are for each speed, thus
> making the code more readable.
>
> I'm still not really happy with this. The speed specific configuration
> remains split between ethqos_fix_mac_speed_rgmii() and
> ethqos_rgmii_macro_init(), where the latter is only ever called from
> the former. So, I think further work is needed here - maybe it needs
> restructuring into the various component parts of the RGMII block?
>
> v2:
> - patch 2: fix typo in commit message
> - patch 3: fix ethqos_fix_mac_speed() comment
>
> .../ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c | 220 ++++++++-------------
> 1 file changed, 87 insertions(+), 133 deletions(-)
>
No issues found at 100M and 1G on the QCS615 Ride board with the KSZ9031
RGMII PHY. As noted earlier, Ethernet support for this board is not yet
upstream, but I have some local changes to make it work.
10M could not be tested due to limitations of the link partner. But with
100M working fine, I am fairly certain that this series will not
introduce any new issues at 10M.
Please feel free to add my:
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Ayaan
* Re: [PATCH v1 1/3] arm64: dts: amlogic: meson-s4: add VRTC node
From: neil.armstrong @ 2026-03-27 15:34 UTC
To: Nick Xie, khilman, martin.blumenstingl, jbrunet
Cc: krzk+dt, robh, conor+dt, linux-amlogic, linux-arm-kernel,
devicetree, linux-kernel
In-Reply-To: <20260327093016.722095-2-nick@khadas.com>
On 3/27/26 10:30, Nick Xie wrote:
> Add the Virtual RTC (VRTC) controller node to the Meson S4 SoC dtsi.
>
> Signed-off-by: Nick Xie <nick@khadas.com>
> ---
> arch/arm64/boot/dts/amlogic/meson-s4.dtsi | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/amlogic/meson-s4.dtsi b/arch/arm64/boot/dts/amlogic/meson-s4.dtsi
> index 936a5c1353d15..2a6fbd5308362 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-s4.dtsi
> +++ b/arch/arm64/boot/dts/amlogic/meson-s4.dtsi
> @@ -59,6 +59,11 @@ psci {
> method = "smc";
> };
>
> + vrtc: rtc@fe010288 {
> + compatible = "amlogic,meson-vrtc";
> + reg = <0x0 0xfe010288 0x0 0x4>;
> + };
> +
> xtal: xtal-clk {
> compatible = "fixed-clock";
> clock-frequency = <24000000>;
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Thanks,
Neil
* Re: [PATCH v1 2/3] arm64: dts: amlogic: meson-s4-s905y4-khadas-vim1s: enable HYM8563 RTC
From: neil.armstrong @ 2026-03-27 15:34 UTC
To: Nick Xie, khilman, martin.blumenstingl, jbrunet
Cc: krzk+dt, robh, conor+dt, linux-amlogic, linux-arm-kernel,
devicetree, linux-kernel
In-Reply-To: <20260327093016.722095-3-nick@khadas.com>
On 3/27/26 10:30, Nick Xie wrote:
> The Khadas VIM1S board has an on-board Haoyu Micro HYM8563 Real Time
> Clock (RTC) connected to the I2C1 bus.
>
> Enable the I2C1 controller and add the RTC child node to support
> hardware clock persistence.
>
> Signed-off-by: Nick Xie <nick@khadas.com>
> ---
> .../dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts b/arch/arm64/boot/dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts
> index 792ab45c4c944..7314e0ab81da3 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts
> +++ b/arch/arm64/boot/dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts
> @@ -20,6 +20,8 @@ aliases {
> mmc0 = &emmc; /* eMMC */
> mmc1 = &sd; /* SD card */
> mmc2 = &sdio; /* SDIO */
> + rtc0 = &rtc;
> + rtc1 = &vrtc;
> serial0 = &uart_b;
> };
>
> @@ -223,6 +225,19 @@ &ethmac {
> phy-mode = "rmii";
> };
>
> +&i2c1 {
> + status = "okay";
> + pinctrl-names = "default";
> + pinctrl-0 = <&i2c1_pins2>;
> + clock-frequency = <100000>;
> +
> + rtc: rtc@51 {
> + compatible = "haoyu,hym8563";
> + reg = <0x51>;
> + #clock-cells = <0>;
> + };
> +};
> +
> &ir {
> status = "okay";
> pinctrl-0 = <&remote_pins>;
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Thanks,
Neil
* Re: [PATCH v1 3/3] arm64: dts: amlogic: meson-s4-s905y4-khadas-vim1s: use rc-khadas keymap
From: neil.armstrong @ 2026-03-27 15:34 UTC
To: Nick Xie, khilman, martin.blumenstingl, jbrunet
Cc: krzk+dt, robh, conor+dt, linux-amlogic, linux-arm-kernel,
devicetree, linux-kernel
In-Reply-To: <20260327093016.722095-4-nick@khadas.com>
On 3/27/26 10:30, Nick Xie wrote:
> The Khadas VIM1S board has an onboard IR receiver.
> Configure the default keymap to "rc-khadas" to support the official
> Khadas IR remote control.
>
> Signed-off-by: Nick Xie <nick@khadas.com>
> ---
> arch/arm64/boot/dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm64/boot/dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts b/arch/arm64/boot/dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts
> index 7314e0ab81da3..99d5df71b9cd4 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts
> +++ b/arch/arm64/boot/dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts
> @@ -242,6 +242,7 @@ &ir {
> status = "okay";
> pinctrl-0 = <&remote_pins>;
> pinctrl-names = "default";
> + linux,rc-map-name = "rc-khadas";
> };
>
> &pwm_ef {
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Thanks,
Neil
* Re: [PATCH v6 01/40] arm_mpam: Ensure in_reset_state is false after applying configuration
From: James Morse @ 2026-03-27 15:42 UTC
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
In-Reply-To: <20260313144617.3420416-2-ben.horgan@arm.com>
Hi Ben, Zeng,
On 13/03/2026 14:45, Ben Horgan wrote:
> From: Zeng Heng <zengheng4@huawei.com>
>
> The per-RIS flag, in_reset_state, indicates whether or not the MSC
> registers are in reset state, and allows avoiding resetting when they are
> already in reset state. However, when mpam_apply_config() updates the
> configuration it doesn't update the in_reset_state flag and so even after
> the configuration update in_reset_state can be true and mpam_reset_ris()
> will skip the actual register restoration on subsequent resets.
>
> Once resctrl has a MPAM backend it will use resctrl_arch_reset_all_ctrls()
> to reset the MSC configuration on unmount and, if the in_reset_state flag
> is bogusly true, fail to reset the MSC configuration. The resulting
> non-reset MSC configuration can lead to persistent performance restrictions
> even after resctrl is unmounted.
>
> Fix by clearing in_reset_state to false immediately after successful
> configuration application, ensuring that the next reset operation
> properly restores MSC register defaults.
Reviewed-by: James Morse <james.morse@arm.com>
Thanks!
James
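
The flag handling described in the commit message can be sketched as follows. This is a simplified model under assumptions, not the driver code: struct ris_state, ris_reset() and ris_apply_config() are hypothetical stand-ins for the per-RIS state, mpam_reset_ris() and mpam_apply_config().

```c
#include <assert.h>
#include <stdbool.h>

#define CTRL_RESET_VALUE 0u	/* hypothetical reset default */

/* Hypothetical stand-in for the per-RIS state; not the driver's struct. */
struct ris_state {
	bool in_reset_state;	/* true while the registers hold reset defaults */
	unsigned int ctrl;	/* stand-in for one MSC control register */
};

static void ris_reset(struct ris_state *ris)
{
	assert(ris);

	/* Skip the register writes when nothing changed since the last reset. */
	if (ris->in_reset_state)
		return;

	ris->ctrl = CTRL_RESET_VALUE;
	ris->in_reset_state = true;
}

static void ris_apply_config(struct ris_state *ris, unsigned int ctrl)
{
	assert(ris);

	ris->ctrl = ctrl;
	/* The fix: the registers no longer hold their reset values. */
	ris->in_reset_state = false;
}
```

Without the last assignment, a reset that follows a configuration update would return early and leave the non-default value in place, which is the failure the patch describes.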
* Re: [PATCH v6 08/40] arm64: mpam: Drop the CONFIG_EXPERT restriction
From: James Morse @ 2026-03-27 15:43 UTC
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
In-Reply-To: <20260313144617.3420416-9-ben.horgan@arm.com>
Hi Ben,
On 13/03/2026 14:45, Ben Horgan wrote:
> In anticipation of MPAM being useful remove the CONFIG_EXPERT restriction.
Useful - ha! I've added a second paragraph describing why this was done, just
so it doesn't look odd in 5 years' time.
| This was done to prevent the driver being enabled before the user-space
| interface was wired up.
Reviewed-by: James Morse <james.morse@arm.com>
Thanks,
James
* Re: [PATCH v6 11/40] arm64: mpam: Initialise and context switch the MPAMSM_EL1 register
From: James Morse @ 2026-03-27 15:44 UTC
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
In-Reply-To: <20260313144617.3420416-12-ben.horgan@arm.com>
Hi Ben,
On 13/03/2026 14:45, Ben Horgan wrote:
> The MPAMSM_EL1 sets the MPAM labels, PMG and PARTID, for loads and stores
> generated by a shared SMCU. Disable the traps so the kernel can use it and
> set it to the same configuration as the per-EL cpu MPAM configuration.
>
> If an SMCU is not shared with other cpus then it is implementation
> defined whether the configuration from MPAMSM_EL1 is used or that from
> the appropriate MPAMy_ELx. As we set the same, PMG_D and PARTID_D,
> configuration for MPAM0_EL1, MPAM1_EL1 and MPAMSM_EL1 the resulting
> configuration is the same regardless.
>
> The range of valid configurations for the PARTID and PMG in MPAMSM_EL1 is
> not currently specified in the Arm Architecture Reference Manual but the
> architect has confirmed that it is intended to be the same as that for the
> cpu configuration in the MPAMy_ELx registers.
Reviewed-by: James Morse <james.morse@arm.com>
> diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
> index 0747e0526927..6bccbfdccb87 100644
> --- a/arch/arm64/include/asm/mpam.h
> +++ b/arch/arm64/include/asm/mpam.h
> @@ -53,6 +53,8 @@ static inline void mpam_thread_switch(struct task_struct *tsk)
> return;
>
> write_sysreg_s(regval | MPAM1_EL1_MPAMEN, SYS_MPAM1_EL1);
> + if (system_supports_sme())
> + write_sysreg_s(regval & (MPAMSM_EL1_PARTID_D | MPAMSM_EL1_PMG_D), SYS_MPAMSM_EL1);
Doing it here saves a surprise later.
> isb();
> /* Synchronising the EL0 write is left until the ERET to EL0 */
(down here would have been the alternative)
Thanks,
James
* Re: [PATCH v6 21/40] arm_mpam: resctrl: Hide CDP emulation behind CONFIG_EXPERT
From: James Morse @ 2026-03-27 15:44 UTC
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
In-Reply-To: <20260313144617.3420416-22-ben.horgan@arm.com>
Hi Ben,
On 13/03/2026 14:45, Ben Horgan wrote:
> When CDP is not enabled, the 'rmid_entry's in the limbo list,
> rmid_busy_llc, map directly to a (PARTID,PMG) pair and when CDP is enabled
> the mapping is to two different pairs.
> As the limbo list is reused between
> mounts and CDP disabled on unmount this can lead to stale mapping and the
> limbo handler will then make monitor reads with potentially out of range
> PARTID.
Bother - I missed that!
> This may then cause an MPAM error interrupt and the driver will
> disable MPAM.
... and that's why it's not a problem on x86 because the RMID range is unaffected by CDP,
whereas MPAM works on a combined value.
> No problems are expected if you just mount the resctrl file system
> once with CDP enabled and never unmount it.
(guess how it was tested!)
> Hide CDP emulation behind CONFIG_EXPERT to protect the unwary.
>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
> Adding this ugliness in the hope of avoiding patch churn and extra
> reviewer work. I am looking into the resctrl changes needed to fix this.
Makes sense - people can still use this if they're aware of the limitation, and it sounds
like you've got a plan to fix it properly. We just don't want it enabled in distros until
then.
> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
> index 903d1a0f564f..cab3e9ccb5c7 100644
> --- a/drivers/resctrl/mpam_resctrl.c
> +++ b/drivers/resctrl/mpam_resctrl.c
> @@ -82,6 +82,18 @@ int resctrl_arch_set_cdp_enabled(enum resctrl_res_level rid, bool enable)
> u32 partid_i = RESCTRL_RESERVED_CLOSID, partid_d = RESCTRL_RESERVED_CLOSID;
> int cpu;
>
> + if (!IS_ENABLED(CONFIG_EXPERT) && enable) {
> + /*
> + * If the resctrl fs is mounted more than once, sequentially,
> + * then CDP can lead to the use of out of range PARTIDs.
> + */
> + pr_warn("CDP not supported\n");
> + return -EOPNOTSUPP;
> + }
> +
> + if (enable)
> + pr_warn("CDP is an expert feature and may cause MPAM to malfunction.\n");
> +
> /*
> * resctrl_arch_set_cdp_enabled() is only called with enable set to
> * false on error and unmount.
Reviewed-by: James Morse <james.morse@arm.com>
Thanks,
James
* Re: [PATCH 1/4] exec: inherit HWCAPs from the parent process
From: Andrei Vagin @ 2026-03-27 15:46 UTC
To: Will Deacon, Mark Rutland
Cc: Kees Cook, Andrew Morton, Marek Szyprowski, Cyrill Gorcunov,
Mike Rapoport, Alexander Mikhalitsyn, linux-kernel, linux-fsdevel,
linux-mm, criu, Catalin Marinas, linux-arm-kernel, Chen Ridong,
Christian Brauner, David Hildenbrand, Eric Biederman,
Lorenzo Stoakes, Michal Koutny, Alexander Mikhalitsyn
In-Reply-To: <CAEWA0a7iR8YHooqXJfhersV6YhAXGMZDUhib3QQH5XGn=KNowA@mail.gmail.com>
On Tue, Mar 24, 2026 at 3:19 PM Andrei Vagin <avagin@google.com> wrote:
>
> Hi Mark and Will,
>
> Thanks for the feedback. Please read the inline comments.
Mark, Will, just checking in to see if my explanation makes sense to you.
Let me know if you have any further feedback or questions.
Thanks,
Andrei
>
> On Tue, Mar 24, 2026 at 3:28 AM Will Deacon <will@kernel.org> wrote:
> >
> > On Mon, Mar 23, 2026 at 06:21:22PM +0000, Mark Rutland wrote:
> > > On Mon, Mar 23, 2026 at 05:53:37PM +0000, Andrei Vagin wrote:
> > > > Introduces a mechanism to inherit hardware capabilities (AT_HWCAP,
> > > > AT_HWCAP2, etc.) from a parent process when they have been modified via
> > > > prctl.
> > > >
> > > > To support C/R operations (snapshots, live migration) in heterogeneous
> > > > clusters, we must ensure that processes utilize CPU features available
> > > > on all potential target nodes. To solve this, we need to advertise a
> > > > common feature set across the cluster.
> > > >
> > > > This patch adds a new mm flag MMF_USER_HWCAP, which is set when the
> > > > auxiliary vector is modified via prctl(PR_SET_MM, PR_SET_MM_AUXV). When
> > > > execve() is called, if the current process has MMF_USER_HWCAP set, the
> > > > HWCAP values are extracted from the current auxiliary vector and stored
> > > > in the linux_binprm structure. These values are then used to populate
> > > > the auxiliary vector of the new process, effectively inheriting the
> > > > hardware capabilities.
> > > >
> > > > The inherited HWCAPs are masked with the hardware capabilities supported
> > > > by the current kernel to ensure that we don't report more features than
> > > > actually supported. This is important to avoid unexpected behavior,
> > > > especially for processes with additional privileges.
> > >
> > > At a high level, I don't think that's going to be sufficient:
> > >
> > > * On an architecture with other userspace accessible feature
> > > identification mechanism registers (e.g. ID registers), userspace
> > > might read those. So you might need to hide stuff there too, and
> > > that's going to require architecture-specific interfaces to manage.
> > >
> > > It's possible that some code checks HWCAPs and others check ID
> > > registers, and mismatch between the two could be problematic.
> > >
> > > * If the HWCAPs can be inherited by a more privileged task, then a
> > > malicious user could use this to hide security features (e.g. shadow
> > > stack or pointer authentication on arm64), and make it easier to
> > > attack that task. While not a direct attack, it would undermine those
> > > features.
>
> I agree with Mark that only a privileged process should be able to mask
> certain hardware features. Currently, PR_SET_MM_AUXV is guarded by
> CAP_SYS_RESOURCE, but PR_SET_MM_MAP allows changing the auxiliary vector
> without specific capabilities. This is definitely an issue. To address
> this, I think we can consider introducing a new prctl command to enable
> HWCAP inheritance explicitly.
>
> >
> > Yeah, this looks like a non-starter to me on arm64. Even if it was
> > extended to apply the same treatment to the idregs, many of the hwcap
> > features can't actually be disabled by the kernel and so you still run
> > the risk of a task that probes for the presence of a feature using
> > something like a SIGILL handler or, perhaps more likely, assumes that
> > the presence of one hwcap implies the presence of another. And then
> > there are the applications that just base everything off the MIDR...
>
> The goal of this mechanism is not to provide strict architectural
> enforcement or to trap the use of hardware features; rather, it is to
> provide a consistent discovery interface for applications. I chose the
> HWCAP vector because it mirrors the existing behavior of running an
> older kernel on newer hardware: while ID registers might report a
> feature as physically present, the HWCAPs will omit it if the kernel
> lacks support. Applications are generally expected to treat HWCAPs as
> the source of truth for which features are safe to use, even if the
> underlying hardware is technically capable of more.
>
> Another significant advantage of using HWCAPs is that many
> applications already rely on them for feature detection. This interface
> allows these applications to work correctly "out-of-the-box" in a
> migrated environment without requiring any userspace modifications. I
> understand that some apps may use other detection methods; however, there
> is no guarantee that these applications will work correctly after
> migration to another machine.
>
> >
> > There's also kvm, which provides a roundabout way to query some features
> > of the underlying hardware.
> >
> > You're probably better off using/extending the idreg overrides we have
> > in arch/arm64/kernel/pi/idreg-override.c so that you can make your
> > cluster of heterogeneous machines look alike.
>
> IIRC, idreg-override/cpuid-masking usually works for an entire machine.
> We actually need to have a mechanism that will work on a per-container
> basis. Workloads inside one cluster can have different
> migration/snapshot requirements. Some are pinned to a specific node,
> others are never migrated, while others need to be migratable across a
> cluster or even between clusters. We need a mechanism that can be
> tunable on a per-container/per-process basis.
>
> >
> > On the other hand, if munging the hwcaps happens to be sufficient for
> > this particular use-case, can't it be handled entirely in userspace (e.g.
> > by hacking libc?)
>
> CRIU often handles workloads with a mix of runtimes: some linked against
> glibc, some against musl, and others like Go that bypass libc entirely.
> CRIU is mostly used to handle containers that can run multiple processes,
> possibly based on different runtimes. This means the available CPU features
> cannot be specified for just one runtime; they have to be passed
> across different runtimes. I think a pure userspace solution is nearly
> infeasible in this case.
>
> Thanks,
> Andrei
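
The inheritance-and-mask step from the quoted commit message can be sketched as below. This is a userspace model under assumptions, not the patch itself: struct auxv_entry, auxv_find() and inherit_hwcap() are hypothetical names, and only the AT_NULL (0) and AT_HWCAP (16) tag values are taken from the ELF auxiliary vector ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_AT_NULL	0	/* end-of-vector tag */
#define SKETCH_AT_HWCAP	16	/* hardware capability bitmask tag */

/* One (type, value) pair of an auxiliary vector. */
struct auxv_entry {
	uint64_t type;
	uint64_t val;
};

/* Look up `type` in an AT_NULL-terminated vector; returns 0 on success. */
static int auxv_find(const struct auxv_entry *auxv, uint64_t type,
		     uint64_t *val)
{
	assert(auxv && val);

	for (; auxv->type != SKETCH_AT_NULL; auxv++) {
		if (auxv->type == type) {
			*val = auxv->val;
			return 0;
		}
	}
	return -1;
}

/*
 * Hypothetical model of the exec-time step: take the parent's (possibly
 * user-modified) HWCAP and never advertise more than the kernel supports.
 */
static uint64_t inherit_hwcap(const struct auxv_entry *parent_auxv,
			      uint64_t kernel_hwcap)
{
	uint64_t user;

	if (auxv_find(parent_auxv, SKETCH_AT_HWCAP, &user))
		return kernel_hwcap;	/* nothing to inherit */

	return user & kernel_hwcap;
}
```

The mask only narrows what is advertised. As the review comments point out, it does not stop a task from discovering or using a feature through ID registers or by probing with a SIGILL handler.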
* Re: [PATCH v6 22/40] arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats
From: James Morse @ 2026-03-27 15:47 UTC
To: Gavin Shan, Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
In-Reply-To: <bd450f1f-05a3-44d6-9bbc-1c48d967baa4@redhat.com>
Hi Gavin,
On 23/03/2026 22:49, Gavin Shan wrote:
> On 3/14/26 12:45 AM, Ben Horgan wrote:
>> From: Dave Martin <Dave.Martin@arm.com>
>>
>> MPAM uses fixed-point formats for some hardware controls. Resctrl
>> provides the bandwidth controls as a percentage. Add helpers to convert
>> between these.
>>
>> Ensure bwa_wd is at most 16 to make it clear higher values have no meaning.
> One nitpick below, but this looks good to me either way.
>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
Thanks!
>> diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
>> index 0e5e24ef60fe..0c97f7708722 100644
>> --- a/drivers/resctrl/mpam_devices.c
>> +++ b/drivers/resctrl/mpam_devices.c
>> @@ -713,6 +713,13 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
>> mpam_set_feature(mpam_feat_mbw_part, props);
>> props->bwa_wd = FIELD_GET(MPAMF_MBW_IDR_BWA_WD, mbw_features);
>> +
>> + /*
>> + * The BWA_WD field can represent 0-63, but the control fields it
>> + * describes have a maximum of 16 bits.
>> + */
>> + props->bwa_wd = min(props->bwa_wd, 16);
>> +
>
> 16 may deserve a named definition, since it's a constant value that is referred
> to multiple times in this patch, if we need to give this series another
> respin :-)
Hmmm... I've left this, I'm not sure what you'd call it. U16_BITS? That sort of thing
might be needed for long/int etc. Here there is either a comment, or it's
accepting/returning a u16. I think it's fairly obvious where the number 16 is coming from.
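
As a rough illustration of the clamp discussed above and of converting a percentage into a 16-bit fixed-point fraction, here is a minimal user-space sketch. It is not the driver's percent_to_mbw_max(): the demo_* names, the round-down behaviour and the truncation to the top 'wd' bits are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FIELD_BITS 16u	/* hypothetical name for the 16-bit control field */

/* Clamp a reported width (the field can encode 0-63) to the 16 usable bits. */
unsigned int demo_clamp_width(unsigned int wd)
{
	return wd > DEMO_FIELD_BITS ? DEMO_FIELD_BITS : wd;
}

/*
 * Convert a percentage (0-100) to a 16-bit fixed-point fraction and keep
 * only the 'wd' most significant bits. Rounds down; 100% saturates to the
 * largest value that fits in the implemented bits.
 */
uint16_t demo_percent_to_fixed(unsigned int pc, unsigned int wd)
{
	uint32_t val;

	assert(pc <= 100);
	wd = demo_clamp_width(wd);
	if (wd == 0)
		return 0;

	val = (pc * 65536u) / 100u;
	if (val > 0xffffu)
		val = 0xffffu;

	/* Clear the low bits that a 'wd'-bit implementation does not have. */
	val &= ~((1u << (DEMO_FIELD_BITS - wd)) - 1u) & 0xffffu;

	return (uint16_t)val;
}
```

With these assumptions, a width reported as 63 behaves exactly like a width of 16, which is the point of the clamp in the quoted hunk.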
Thanks,
James
^ permalink raw reply
* Re: [PATCH v6 25/40] arm_mpam: resctrl: Add support for 'MB' resource
From: James Morse @ 2026-03-27 15:47 UTC (permalink / raw)
To: Gavin Shan, Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
In-Reply-To: <3ae3356d-a901-4b71-90df-557d468e4785@redhat.com>
Hi Gavin,
On 23/03/2026 23:09, Gavin Shan wrote:
> On 3/14/26 12:46 AM, Ben Horgan wrote:
>> From: James Morse <james.morse@arm.com>
>>
>> resctrl supports 'MB', as a percentage throttling of traffic from the
>> L3. This is the control that mba_sc uses, so ideally the class chosen
>> should be as close as possible to the counters used for mbm_total. If there
>> is a single L3, it's the last cache, and the topology of the memory matches
>> then the traffic at the memory controller will be equivalent to that at
>> egress of the L3. If these conditions are met allow the memory class to
>> back MB.
>>
>> MB's percentage control should be backed either with the fixed point
>> fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
>> bitmaps are not used as it's tricky to pick which bits to use to avoid
>> contention, and it may be possible to expose this as something other than a
>> percentage in the future.
> One comment below and it deserves to be addressed if we have another respin:
>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
Thanks!
>> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
>> index 93c8a9608ed4..cad65cf7d12d 100644
>> --- a/drivers/resctrl/mpam_resctrl.c
>> +++ b/drivers/resctrl/mpam_resctrl.c
>> @@ -317,6 +344,166 @@ static u16 percent_to_mbw_max(u8 pc, struct mpam_props *cprops)
>> +/*
>> + * Test if the traffic for a class matches that at egress from the L3. For
>> + * MSC at memory controllers this is only possible if there is a single L3
>> + * as otherwise the counters at the memory can include bandwidth from the
>> + * non-local L3.
>> + */
>> +static bool traffic_matches_l3(struct mpam_class *class)
>> +{
>> + int err, cpu;
>> +
>> + lockdep_assert_cpus_held();
>> +
>> + if (class->type == MPAM_CLASS_CACHE && class->level == 3)
>> + return true;
>> +
>> + if (class->type == MPAM_CLASS_CACHE && class->level != 3) {
>> + pr_debug("class %u is a different cache from L3\n", class->level);
>> + return false;
>> + }
>> +
>> + if (class->type != MPAM_CLASS_MEMORY) {
>> + pr_debug("class %u is neither of type cache or memory\n", class->level);
>> + return false;
>> + }
>> +
>
> We bail if the class isn't MPAM_CLASS_MEMORY here ...
>
>> + cpumask_var_t __free(free_cpumask_var) tmp_cpumask = CPUMASK_VAR_NULL;
>> + if (!alloc_cpumask_var(&tmp_cpumask, GFP_KERNEL)) {
>> + pr_debug("cpumask allocation failed\n");
>> + return false;
>> + }
>> +
>> + if (class->type != MPAM_CLASS_MEMORY) {
>> + pr_debug("class %u is neither of type cache or memory\n",
>> + class->level);
>> + return false;
>> + }
>> +
>
> This check duplicates the previous one, so it can be dropped.
Heh, that looks like a rebase conflict! Thanks for spotting it.
Fixed locally.
James
^ permalink raw reply
* Re: [PATCH v6 36/40] arm_mpam: Add workaround for T241-MPAM-1
From: James Morse @ 2026-03-27 15:48 UTC (permalink / raw)
To: Gavin Shan, Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
In-Reply-To: <7b73d10e-4bfd-434f-b05f-25c4859a7abd@redhat.com>
Hi Gavin,
On 24/03/2026 04:16, Gavin Shan wrote:
> On 3/14/26 12:46 AM, Ben Horgan wrote:
>> From: Shanker Donthineni <sdonthineni@nvidia.com>
>>
>> The MPAM bandwidth partitioning controls will not be correctly configured,
>> and hardware will retain default configuration register values, meaning
>> generally that bandwidth will remain unprovisioned.
>>
>> To address the issue, follow the below steps after updating the MBW_MIN
>> and/or MBW_MAX registers.
>>
>> - Perform 64b reads from all 12 bridge MPAM shadow registers at offsets
>> (0x360048 + slice*0x10000 + partid*8). These registers are read-only.
>> - Continue iterating until all 12 shadow register values match in a loop.
>> pr_warn_once if the values fail to match within the loop count 1000.
>> - Perform 64b writes with the value 0x0 to the two spare registers at
>> offsets 0x1b0000 and 0x1c0000.
>>
>> In the hardware, writes to the MPAMCFG_MBW_MAX and MPAMCFG_MBW_MIN registers
>> are transformed into broadcast writes to the 12 shadow registers. The
>> final two writes to the spare registers cause a final rank of downstream
>> micro-architectural MPAM registers to be updated from the shadow copies.
>> The intervening loop to read the 12 shadow registers helps avoid a race
>> condition where writes to the spare registers occur before all shadow
>> registers have been updated.
> One question below.
>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
>> diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
>> index e66631f3f732..b1753498f07f 100644
>> --- a/drivers/resctrl/mpam_devices.c
>> +++ b/drivers/resctrl/mpam_devices.c
>> @@ -630,7 +640,45 @@ static struct mpam_msc_ris *mpam_get_or_create_ris(struct mpam_msc *msc,
>> return ERR_PTR(-ENOENT);
>> }
>> +static int mpam_enable_quirk_nvidia_t241_1(struct mpam_msc *msc,
>> + const struct mpam_quirk *quirk)
>> +{
>> + s32 soc_id = arm_smccc_get_soc_id_version();
>> + struct resource *r;
>> + phys_addr_t phys;
>> +
>> + /*
>> + * A mapping to a device other than the MSC is needed, check
>> + * SOC_ID is NVIDIA T241 chip (036b:0241)
>> + */
>> + if (soc_id < 0 || soc_id != SMCCC_SOC_ID_T241)
>> + return -EINVAL;
>> +
>> + r = platform_get_resource(msc->pdev, IORESOURCE_MEM, 0);
>> + if (!r)
>> + return -EINVAL;
>> +
>> + /* Find the internal registers base addr from the CHIP ID */
>> + msc->t241_id = T241_CHIP_ID(r->start);
>> + phys = FIELD_PREP(GENMASK_ULL(45, 44), msc->t241_id) | 0x19000000ULL;
>> +
>> + t241_scratch_regs[msc->t241_id] = ioremap(phys, SZ_8M);
>> + if (WARN_ON_ONCE(!t241_scratch_regs[msc->t241_id]))
>> + return -EINVAL;
>
> Those IO regions aren't unmapped when the MSCs are removed. I guess it would be
> something to be improved? :-)
It's just leaking some VA space in the unlikely event the error interrupt goes off.
That is never expected to happen - all the errors indicate a software bug, so it's
not a case of being unlucky. (This assumes T241 supports the error interrupt!).
Adding some teardown would just be for this erratum, I expect it to be the only one
that needs to map some other device to poke at. I'm not sure it's worth it.
I'm also very nervous about changing this quirk as it's difficult for me to test!
>> +
>> + pr_info_once("Enabled workaround for NVIDIA T241 erratum T241-MPAM-1\n");
>> +
>> + return 0;
>> +}
>> +
>> static const struct mpam_quirk mpam_quirks[] = {
>> + {
>> + /* NVIDIA t241 erratum T241-MPAM-1 */
>> + .init = mpam_enable_quirk_nvidia_t241_1,
>> + .iidr = MPAM_IIDR_NVIDIA_T241,
>> + .iidr_mask = MPAM_IIDR_MATCH_ONE,
>> + .workaround = T241_SCRUB_SHADOW_REGS,
>
> Perhaps we need one more leading space for every line in the above block.
Sure, done locally.
>> + },
>> { NULL } /* Sentinel */
>> };
Thanks,
James
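
For readers following the steps in the quoted commit message, the shadow-register polling can be sketched in user space as below. This is only an illustration: the demo_* names, the read callback and the simulated register file are hypothetical, and comparing against the value just written is an assumption, since the commit message only says the 12 values must "match". The offsets and the loop count of 1000 are taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_SLICES		12u
#define DEMO_MAX_POLLS		1000u
#define DEMO_SHADOW_BASE	0x360048u
#define DEMO_SLICE_STRIDE	0x10000u

/* Byte offset of the shadow register for a given slice and PARTID. */
uint32_t demo_shadow_offset(unsigned int slice, unsigned int partid)
{
	return DEMO_SHADOW_BASE + slice * DEMO_SLICE_STRIDE + partid * 8u;
}

/* Stand-in for a 64-bit MMIO read; real code would use the kernel accessors. */
typedef uint64_t (*demo_read64_fn)(uint32_t offset, void *ctx);

/*
 * Re-read all shadow registers until every one holds 'expected'.
 * Returns true on success, false if they still disagree after
 * DEMO_MAX_POLLS passes (where the driver would warn once).
 */
bool demo_wait_shadow_match(demo_read64_fn read64, void *ctx,
			    unsigned int partid, uint64_t expected)
{
	assert(read64 != NULL);

	for (unsigned int pass = 0; pass < DEMO_MAX_POLLS; pass++) {
		bool all_match = true;

		for (unsigned int s = 0; s < DEMO_NUM_SLICES; s++) {
			if (read64(demo_shadow_offset(s, partid), ctx) != expected) {
				all_match = false;
				break;
			}
		}
		if (all_match)
			return true;
	}
	return false;
}

/* Simulated register file so the sketch can be exercised without hardware. */
struct demo_regs {
	uint64_t shadow[DEMO_NUM_SLICES];
};

uint64_t demo_fake_read64(uint32_t offset, void *ctx)
{
	struct demo_regs *regs = ctx;
	unsigned int slice = (offset - DEMO_SHADOW_BASE) / DEMO_SLICE_STRIDE;

	return regs->shadow[slice];
}
```

Only once the poll succeeds would the two spare-register writes (offsets 0x1b0000 and 0x1c0000 in the commit message) follow.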
^ permalink raw reply
* Re: [PATCH 1/4] exec: inherit HWCAPs from the parent process
From: Mark Rutland @ 2026-03-27 16:06 UTC (permalink / raw)
To: Andrei Vagin
Cc: Will Deacon, Kees Cook, Andrew Morton, Marek Szyprowski,
Cyrill Gorcunov, Mike Rapoport, Alexander Mikhalitsyn,
linux-kernel, linux-fsdevel, linux-mm, criu, Catalin Marinas,
linux-arm-kernel, Chen Ridong, Christian Brauner,
David Hildenbrand, Eric Biederman, Lorenzo Stoakes, Michal Koutny,
Alexander Mikhalitsyn
In-Reply-To: <CAEWA0a7iR8YHooqXJfhersV6YhAXGMZDUhib3QQH5XGn=KNowA@mail.gmail.com>
On Tue, Mar 24, 2026 at 03:19:49PM -0700, Andrei Vagin wrote:
> Hi Mark and Will,
>
> Thanks for the feedback. Please read the inline comments.
>
> On Tue, Mar 24, 2026 at 3:28 AM Will Deacon <will@kernel.org> wrote:
> >
> > On Mon, Mar 23, 2026 at 06:21:22PM +0000, Mark Rutland wrote:
> > > On Mon, Mar 23, 2026 at 05:53:37PM +0000, Andrei Vagin wrote:
> > > > Introduces a mechanism to inherit hardware capabilities (AT_HWCAP,
> > > > AT_HWCAP2, etc.) from a parent process when they have been modified via
> > > > prctl.
> > > >
> > > > To support C/R operations (snapshots, live migration) in heterogeneous
> > > > clusters, we must ensure that processes utilize CPU features available
> > > > on all potential target nodes. To solve this, we need to advertise a
> > > > common feature set across the cluster.
> > > >
> > > > This patch adds a new mm flag MMF_USER_HWCAP, which is set when the
> > > > auxiliary vector is modified via prctl(PR_SET_MM, PR_SET_MM_AUXV). When
> > > > execve() is called, if the current process has MMF_USER_HWCAP set, the
> > > > HWCAP values are extracted from the current auxiliary vector and stored
> > > > in the linux_binprm structure. These values are then used to populate
> > > > the auxiliary vector of the new process, effectively inheriting the
> > > > hardware capabilities.
> > > >
> > > > The inherited HWCAPs are masked with the hardware capabilities supported
> > > > by the current kernel to ensure that we don't report more features than
> > > > actually supported. This is important to avoid unexpected behavior,
> > > > especially for processes with additional privileges.
> > >
> > > At a high level, I don't think that's going to be sufficient:
> > >
> > > * On an architecture with other userspace accessible feature
> > > identification mechanism registers (e.g. ID registers), userspace
> > > might read those. So you might need to hide stuff there too, and
> > > that's going to require architecture-specific interfaces to manage.
> > >
> > > It's possible that some code checks HWCAPs and others check ID
> > > registers, and mismatch between the two could be problematic.
> > >
> > > * If the HWCAPs can be inherited by a more privileged task, then a
> > > malicious user could use this to hide security features (e.g. shadow
> > > stack or pointer authentication on arm64), and make it easier to
> > > attack that task. While not a direct attack, it would undermine those
> > > features.
>
> I agree with Mark that only a privileged process should be able to mask
> certain hardware features. Currently, PR_SET_MM_AUXV is guarded by
> CAP_SYS_RESOURCE, but PR_SET_MM_MAP allows changing the auxiliary vector
> without specific capabilities. This is definitely an issue. To address
> this, I think we can consider introducing a new prctl command to enable
> HWCAP inheritance explicitly.
>
> > Yeah, this looks like a non-starter to me on arm64. Even if it was
> > extended to apply the same treatment to the idregs, many of the hwcap
> > features can't actually be disabled by the kernel and so you still run
> > the risk of a task that probes for the presence of a feature using
> > something like a SIGILL handler or, perhaps more likely, assumes that
> > the presence of one hwcap implies the presence of another. And then
> > there are the applications that just base everything off the MIDR...
>
> The goal of this mechanism is not to provide strict architectural
> enforcement or to trap the use of hardware features; rather, it is to
> provide a consistent discovery interface for applications. I chose the
> HWCAP vector because it mirrors the existing behavior of running an
> older kernel on newer hardware: while ID registers might report a
> feature as physically present, the HWCAPs will omit it if the kernel
> lacks support.
On arm64, the view of the ID registers that userspace gets *only*
exposes features that the kernel knows about, as userspace reads of
those registers are trapped+emulated by the kernel. On arm64 it's
not true to say that something appears in those but not the HWCAPs.
I understand that might be different on other architectures, and so
maybe this approach is sufficient on other architectures, but it is not
sufficient on arm64.
> Applications are generally expected to treat HWCAPs as
> the source of truth for which features are safe to use, even if the
> underlying hardware is technically capable of more.
I'm fairly certain that there are arm64 applications (and libraries)
which check only the ID register values, and not the HWCAPs.
Architecturally, there are features which are detected via other
mechanisms (e.g. CHKFEAT), for which HWCAPs are also irrelevant. Even if
that happens to be ok today, there are almost certainly future uses that
will not be compatible with the scheme you propose.
I don't think we can say "applications must check the HWCAPs", when we
know that applications and libraries legitimately don't always do that.
> Another significant advantage of using HWCAPs is that many
> applications already rely on them for feature detection. This interface
> allows these applications to work correctly "out-of-the-box" in a
> migrated environment without requiring any userspace modifications. I
> understand that some apps may use other detection methods; however, there
> is no guarantee that these applications will work correctly after
> migration to another machine.
I think the existence of applications that detect features by other
(legitimate!) means implies that there's no guarantee that this feature
is useful and will remain useful going forwards.
For example, what do you plan to do if an application or library starts
doing something legitimate that causes it to become incompatible with
this scheme?
I don't want to be in a position where userspace is asked to steer clear
of legitimate mechanisms, or where architecture code suddenly has to
pick up a lot of complexity to make this work.
> > There's also kvm, which provides a roundabout way to query some features
> > of the underlying hardware.
> >
> > You're probably better off using/extending the idreg overrides we have
> > in arch/arm64/kernel/pi/idreg-override.c so that you can make your
> > cluster of heterogeneous machines look alike.
>
> IIRC, idreg-override/cpuid-masking usually works for an entire machine.
> We actually need to have a mechanism that will work on a per-container
> basis. Workloads inside one cluster can have different
> migration/snapshot requirements. Some are pinned to a specific node,
> others are never migrated, while others need to be migratable across a
> cluster or even between clusters. We need a mechanism that can be
> tunable on a per-container/per-process basis.
I think that's theoretically possible, BUT it will require substantially
more complexity, to address the issues that Will and I have mentioned. I
don't think people are very happy to pick up that complexity.
There are many other aspects that are going to be problematic for
heterogeneous migration. Even if you hide the HWCAP for a stateful
feature (e.g. SME), it might appear in one machine's signal frames (and
be mandatory there), but might not appear in another's, and so migration
might not work either way. Likewise, that state can appear via ptrace.
Thanks,
Mark.
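
To make the disagreement concrete, the masking described in the quoted patch description and the mismatch raised in this reply can be sketched as two small bit operations. This is only an illustration under stated assumptions: the demo_* names are hypothetical, the bit values carry no architectural meaning, and nothing here reflects how the kernel actually builds the auxiliary vector.

```c
#include <stdint.h>

/*
 * Inherited HWCAPs are ANDed with what the running kernel supports, so
 * inheritance can only remove features, never advertise extra ones.
 */
uint64_t demo_effective_hwcap(uint64_t inherited, uint64_t supported)
{
	return inherited & supported;
}

/*
 * Features still visible through another discovery mechanism (for example
 * an emulated ID register view) but hidden from the HWCAPs. A non-zero
 * result is the kind of mismatch that code checking only the other
 * mechanism would trip over.
 */
uint64_t demo_hidden_only_in_hwcap(uint64_t hwcap_view, uint64_t other_view)
{
	return other_view & ~hwcap_view;
}
```

On Linux, a process would normally obtain its own view with getauxval(AT_HWCAP) from <sys/auxv.h>; any feature bit returned by the second helper is one that a HWCAP-only scheme cannot hide.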
^ permalink raw reply
* [GIT PULL] arm_mpam: Add KVM/arm64 and resctrl glue code
From: James Morse @ 2026-03-27 16:19 UTC (permalink / raw)
To: Catalin Marinas, will@kernel.org
Cc: Ben Horgan, Marc Zyngier, Oliver Upton,
linux-arm-kernel@lists.infradead.org, Dave P Martin,
Shanker Donthineni, Zeng Heng
Hi Catalin, Will,
Below is the MPAM series that plumbs all this stuff out to user-space via resctrl.
3 patches against KVM Acked by Marc Z, 9 patches against arm64 Acked or Reviewed by
Catalin. One patch "arm_mpam: resctrl: Add CDP emulation" adds an include to the
asm/mpam.h header. I'm assuming that doesn't need an ack.
Thanks to Ben for fixing all my bugs!
James
----------------------------------------------------------------
The following changes since commit 1f318b96cc84d7c2ab792fcc0bfd42a7ca890681:
Linux 7.0-rc3 (2026-03-08 16:56:54 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/glue/v7.0-rc3
for you to fetch changes up to 4ce0a2ccc0358f3f746fa50815a599f861fd5d68:
arm64: mpam: Add initial MPAM documentation (2026-03-27 15:32:52 +0000)
----------------------------------------------------------------
Expose MPAM to user-space via resctrl based on v7.0-rc3
- Add architecture context-switch and hiding of the feature from KVM.
- Add interface to allow MPAM to be exposed to user-space using resctrl.
- Add errata workaround for some existing platforms.
- Add documentation for using MPAM and what shape of platforms can use resctrl
----------------------------------------------------------------
Ben Horgan (11):
arm_mpam: Reset when feature configuration bit unset
arm64/sysreg: Add MPAMSM_EL1 register
KVM: arm64: Preserve host MPAM configuration when changing traps
KVM: arm64: Make MPAMSM_EL1 accesses UNDEF
arm64: mpam: Drop the CONFIG_EXPERT restriction
arm64: mpam: Initialise and context switch the MPAMSM_EL1 register
arm_mpam: resctrl: Hide CDP emulation behind CONFIG_EXPERT
arm_mpam: resctrl: Add rmid index helpers
arm_mpam: resctrl: Wait for cacheinfo to be ready
arm_mpam: resctrl: Add monitor initialisation and domain boilerplate
arm64: mpam: Add initial MPAM documentation
Dave Martin (2):
arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats
arm_mpam: resctrl: Add kunit test for control format conversions
James Morse (22):
arm64: mpam: Context switch the MPAM registers
arm64: mpam: Re-initialise MPAM regs when CPU comes online
arm64: mpam: Advertise the CPUs MPAM limits to the driver
arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs
arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG values
KVM: arm64: Force guest EL1 to use user-space's partid configuration
arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation
arm_mpam: resctrl: Pick the caches we will use as resctrl resources
arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls()
arm_mpam: resctrl: Add resctrl_arch_get_config()
arm_mpam: resctrl: Implement helpers to update configuration
arm_mpam: resctrl: Add plumbing against arm64 task and cpu hooks
arm_mpam: resctrl: Add CDP emulation
arm_mpam: resctrl: Add support for 'MB' resource
arm_mpam: resctrl: Add support for csu counters
arm_mpam: resctrl: Allow resctrl to allocate monitors
arm_mpam: resctrl: Add resctrl_arch_rmid_read()
arm_mpam: resctrl: Update the rmid reallocation limit
arm_mpam: resctrl: Add empty definitions for assorted resctrl functions
arm64: mpam: Select ARCH_HAS_CPU_RESCTRL
arm_mpam: resctrl: Call resctrl_init() on platforms that can support resctrl
arm_mpam: Quirk CMN-650's CSU NRDY behaviour
Shanker Donthineni (4):
arm_mpam: Add quirk framework
arm_mpam: Add workaround for T241-MPAM-1
arm_mpam: Add workaround for T241-MPAM-4
arm_mpam: Add workaround for T241-MPAM-6
Zeng Heng (1):
arm_mpam: Ensure in_reset_state is false after applying configuration
Documentation/arch/arm64/index.rst | 1 +
Documentation/arch/arm64/mpam.rst | 72 ++
Documentation/arch/arm64/silicon-errata.rst | 9 +
arch/arm64/Kconfig | 6 +-
arch/arm64/include/asm/el2_setup.h | 3 +-
arch/arm64/include/asm/mpam.h | 96 ++
arch/arm64/include/asm/resctrl.h | 2 +
arch/arm64/include/asm/thread_info.h | 3 +
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/cpufeature.c | 21 +-
arch/arm64/kernel/mpam.c | 62 +
arch/arm64/kernel/process.c | 7 +
arch/arm64/kvm/hyp/include/hyp/switch.h | 12 +-
arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 16 +
arch/arm64/kvm/sys_regs.c | 2 +
arch/arm64/tools/sysreg | 8 +
drivers/resctrl/Kconfig | 9 +-
drivers/resctrl/Makefile | 1 +
drivers/resctrl/mpam_devices.c | 305 ++++-
drivers/resctrl/mpam_internal.h | 108 +-
drivers/resctrl/mpam_resctrl.c | 1704 +++++++++++++++++++++++++++
drivers/resctrl/test_mpam_resctrl.c | 315 +++++
include/linux/arm_mpam.h | 32 +
23 files changed, 2735 insertions(+), 60 deletions(-)
create mode 100644 Documentation/arch/arm64/mpam.rst
create mode 100644 arch/arm64/include/asm/mpam.h
create mode 100644 arch/arm64/include/asm/resctrl.h
create mode 100644 arch/arm64/kernel/mpam.c
create mode 100644 drivers/resctrl/mpam_resctrl.c
create mode 100644 drivers/resctrl/test_mpam_resctrl.c
^ permalink raw reply
* Re: [PATCH v6 02/40] arm_mpam: Reset when feature configuration bit unset
From: James Morse @ 2026-03-27 16:21 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
In-Reply-To: <20260313144617.3420416-3-ben.horgan@arm.com>
Hi Ben,
On 13/03/2026 14:45, Ben Horgan wrote:
> To indicate that the configuration, of the controls used by resctrl, in a
> RIS need resetting to driver defaults the reset flags in mpam_config are
> set. However, these flags are only ever set temporarily at RIS scope in
> mpam_reset_ris() and hence mpam_cpu_online() will never reset these
> controls to default. As the hardware reset is unknown this leads to unknown
> configuration when the control values haven't been configured away from the
> defaults.
>
> Use the policy that an unset feature configuration bit means reset. In this
> way the mpam_config in the component can encode that it should be in reset
> state and mpam_reprogram_msc() will reset controls as needed.
> diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
> index 0fd6590a9b5c..ff861291bd4e 100644
> --- a/drivers/resctrl/mpam_devices.c
> +++ b/drivers/resctrl/mpam_devices.c
> @@ -1364,17 +1364,15 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
> __mpam_intpart_sel(ris->ris_idx, partid, msc);
> }
>
> - if (mpam_has_feature(mpam_feat_cpor_part, rprops) &&
> - mpam_has_feature(mpam_feat_cpor_part, cfg)) {
> - if (cfg->reset_cpbm)
After this, nothing reads/writes these explicit reset flags so they can be removed from
struct mpam_config.
(I'll do this locally)
> - mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM, rprops->cpbm_wd);
> - else
> + if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
> + if (mpam_has_feature(mpam_feat_cpor_part, cfg))
> mpam_write_partsel_reg(msc, CPBM, cfg->cpbm);
> + else
> + mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM, rprops->cpbm_wd);
> }
>
> - if (mpam_has_feature(mpam_feat_mbw_part, rprops) &&
> - mpam_has_feature(mpam_feat_mbw_part, cfg)) {
> - if (cfg->reset_mbw_pbm)
> + if (mpam_has_feature(mpam_feat_mbw_part, rprops)) {
> + if (mpam_has_feature(mpam_feat_mbw_part, cfg))
> mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM, rprops->mbw_pbm_bits);
> else
> mpam_write_partsel_reg(msc, MBW_PBM, cfg->mbw_pbm);
Reviewed-by: James Morse <james.morse@arm.com>
Thanks!
James
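
The policy in the quoted commit message ("an unset feature configuration bit means reset") can be summarised with a small decision function. This is a user-space sketch only: the enum, the bitmask representation and the demo_* names are assumptions for illustration and do not mirror the driver's mpam_has_feature() helpers.

```c
#include <assert.h>
#include <stdint.h>

enum demo_action {
	DEMO_SKIP,		/* hardware lacks the control: nothing to do */
	DEMO_WRITE_VALUE,	/* configuration bit set: program the stored value */
	DEMO_RESET,		/* configuration bit unset: restore the driver default */
};

/*
 * Decide what to do for one control. 'hw_features' describes what the
 * hardware implements, 'cfg_features' which controls have been configured.
 */
enum demo_action demo_program_control(uint32_t hw_features,
				      uint32_t cfg_features,
				      unsigned int feat)
{
	uint32_t bit;

	assert(feat < 32);
	bit = UINT32_C(1) << feat;

	if (!(hw_features & bit))
		return DEMO_SKIP;

	return (cfg_features & bit) ? DEMO_WRITE_VALUE : DEMO_RESET;
}
```

Under this policy no separate reset flag is needed, which is why the explicit flags can be dropped from the configuration structure as noted above.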
^ permalink raw reply
* Re: [PATCH v2] PCI: imx6: Don't remove MSI capability For i.MX7D/i.MX8M
From: Frank Li @ 2026-03-27 16:22 UTC (permalink / raw)
To: Hongxing Zhu
Cc: l.stach@pengutronix.de, lpieralisi@kernel.org,
kwilczynski@kernel.org, mani@kernel.org, robh@kernel.org,
bhelgaas@google.com, s.hauer@pengutronix.de,
kernel@pengutronix.de, festevam@gmail.com,
linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
imx@lists.linux.dev, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
In-Reply-To: <AS8PR04MB883306406390FCB4106C3A978C57A@AS8PR04MB8833.eurprd04.prod.outlook.com>
On Fri, Mar 27, 2026 at 08:12:29AM +0000, Hongxing Zhu wrote:
> > -----Original Message-----
> > From: Frank Li <frank.li@nxp.com>
> > Sent: March 19, 2026 22:17
> > To: Hongxing Zhu <hongxing.zhu@nxp.com>
> > Cc: l.stach@pengutronix.de; lpieralisi@kernel.org; kwilczynski@kernel.org;
> > mani@kernel.org; robh@kernel.org; bhelgaas@google.com;
> > s.hauer@pengutronix.de; kernel@pengutronix.de; festevam@gmail.com;
> > linux-pci@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> > imx@lists.linux.dev; linux-kernel@vger.kernel.org; stable@vger.kernel.org
> > Subject: Re: [PATCH v2] PCI: imx6: Don't remove MSI capability For
> > i.MX7D/i.MX8M
> >
> > On Thu, Mar 19, 2026 at 05:18:23PM +0800, Richard Zhu wrote:
> > > The MSI trigger mechanism for endpoint devices connected to i.MX7D,
> > > i.MX8MM, and i.MX8MQ PCIe root complex ports depends on the MSI
> > > capability register settings in the root complex. Removing the MSI
> > > capability breaks MSI functionality for these endpoints.
> > >
> > > Preserve the MSI capability for i.MX7D/i.MX8M PCIe root complex to
> > > maintain MSI functionality.
> > >
> > > Cc: stable@vger.kernel.org
> > > Fixes: f5cd8a929c825 ("PCI: dwc: Remove MSI/MSIX capability for Root
> > > Port if iMSI-RX is used as MSI controller")
> >
> > I think it'd be better to add another variable to check in f5cd8a929c825,
> > e.g. if (pp->has_msi_ctrl && !pp->xxx_broken), or to directly use the IP
> > version, which is already auto-detected.
> >
> > The previous patch did not consider this older controller version.
> Hi Frank:
> From what I've observed, this behavior seems tied to the specific controller
> design. For example, neither the i.MX6Q nor the i.MX6SX exhibit this issue.
Yes, we should rename has_msi_ctrl -> disable_msi_ctrl. Set it according to
different conditions, such as has_msi_ctrl, or skip it for problem platforms
such as i.MX8MM and i.MX8MQ.
Disabling it and overwriting it later will cause confusion.
>
> The intention of commit f5cd8a929c825 is to remove the MSI capability from the
> Root Complex (RC). From the author's perspective, this change should not
> affect the Endpoint's (EP) MSI functionality.
Yes. Does your patch fix RC mode?
Frank
>
> I'm not sure whether this check (pp->has_msi_ctrl && !pp->msi_broken) is proper or not.
> Best Regards
> Richard Zhu
> > >
^ permalink raw reply
* Re: [GIT PULL 1/2] Broadcom devicetree changes for 7.1
From: Florian Fainelli @ 2026-03-27 16:26 UTC (permalink / raw)
To: Krzysztof Kozlowski
Cc: soc, Rosen Penev, Rob Herring, Linus Walleij, Linus Walleij,
William Zhang, Miquel Raynal, Rafał Miłecki,
linux-arm-kernel, arnd, khilman, bcm-kernel-feedback-list
In-Reply-To: <20260327-lurking-amazing-pudu-114fe8@quoll>
On 3/27/26 04:53, Krzysztof Kozlowski wrote:
> On Mon, Mar 23, 2026 at 12:02:38PM -0700, Florian Fainelli wrote:
>> The following changes since commit 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f:
>>
>> Linux 7.0-rc1 (2026-02-22 13:18:59 -0800)
>>
>> are available in the Git repository at:
>>
>> https://github.com/Broadcom/stblinux.git tags/arm-soc/for-7.1/devicetree
>>
>> for you to fetch changes up to 220bdfcb4b4788f57faa2c28454d8b2dd3bcab6c:
>>
>> ARM: dts: BCM5301X: EA9200: specify partitions (2026-03-20 16:57:31 -0700)
>
> Four days after:
>
> Days in linux-next:
> ----------------------------------------
> 0 | ++++++++++++++++ (16)
>
> ...
>
> Commits with 0 days in linux-next (16 of 19: 84.2%)...
>
> Are you sure your tree is included in linux-next?
The branch that is included in linux-next is my "next" branch which is a
merge of all branches. In this particular case however it looks like the
branch was not updated.
--
Florian
^ permalink raw reply
* [PATCH v1 1/1] arm64: dts: imx91-var-dart-sonata: add RGB select supply for PCA6408
From: Stefano Radaelli @ 2026-03-27 16:32 UTC (permalink / raw)
To: linux-kernel, devicetree, imx, linux-arm-kernel
Cc: pierluigi.p, Stefano Radaelli, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Frank Li, Sascha Hauer, Pengutronix Kernel Team,
Fabio Estevam
From: Stefano Radaelli <stefano.r@variscite.com>
RGB_SEL controls the routing of some carrier board lines on the Sonata
board. The two PCA6408 GPIO expanders depend on that path being enabled,
so describe the selector as a fixed regulator and use it as their
vcc-supply.
Signed-off-by: Stefano Radaelli <stefano.r@variscite.com>
---
arch/arm64/boot/dts/freescale/imx91-var-dart-sonata.dts | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/arm64/boot/dts/freescale/imx91-var-dart-sonata.dts b/arch/arm64/boot/dts/freescale/imx91-var-dart-sonata.dts
index afa39dab240a..3b5816884f24 100644
--- a/arch/arm64/boot/dts/freescale/imx91-var-dart-sonata.dts
+++ b/arch/arm64/boot/dts/freescale/imx91-var-dart-sonata.dts
@@ -90,6 +90,13 @@ reg_vref_1v8: regulator-adc-vref {
regulator-max-microvolt = <1800000>;
};
+ reg_rgb_sel: regulator-rgb-sel {
+ compatible = "regulator-fixed";
+ regulator-name = "rgb-select";
+ gpio = <&pca9534 7 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+ };
+
reg_usdhc2_vmmc: regulator-vmmc-usdhc2 {
compatible = "regulator-fixed";
pinctrl-names = "default";
@@ -195,6 +202,7 @@ pca6408_1: gpio@20 {
#gpio-cells = <2>;
interrupt-parent = <&gpio1>;
interrupts = <10 IRQ_TYPE_LEVEL_LOW>;
+ vcc-supply = <&reg_rgb_sel>;
};
pca6408_2: gpio@21 {
@@ -204,6 +212,7 @@ pca6408_2: gpio@21 {
#gpio-cells = <2>;
interrupt-parent = <&gpio1>;
interrupts = <10 IRQ_TYPE_LEVEL_LOW>;
+ vcc-supply = <&reg_rgb_sel>;
};
pca9534: gpio@22 {
--
2.47.3
^ permalink raw reply related
* Re: [PATCH v9 0/5] PCI: of: Remove max-link-speed generation validation
From: Bjorn Helgaas @ 2026-03-27 16:42 UTC (permalink / raw)
To: Hans Zhang
Cc: lpieralisi, jingoohan1, mani, kwilczynski, bhelgaas,
florian.fainelli, jim2101024, robh, ilpo.jarvinen, linux-arm-msm,
linux-arm-kernel, linux-renesas-soc, claudiu.beznea.uj,
linux-mediatek, linux-tegra, linux-omap, bcm-kernel-feedback-list,
linux-pci, linux-kernel, shawn.lin
In-Reply-To: <20260313165522.123518-1-18255117159@163.com>
On Sat, Mar 14, 2026 at 12:55:17AM +0800, Hans Zhang wrote:
> Hi,
>
> This series moves the validation from the common OF function to the
> individual PCIe controller drivers. To protect against out-of-bounds
> accesses to the pcie_link_speed[] array, we first introduce a helper
> function pcie_get_link_speed() that safely returns the speed value
> (or PCI_SPEED_UNKNOWN) for a given generation number.
>
> Then all direct uses of pcie_link_speed[] as an array are converted to
> use the new helper, ensuring that even if an invalid generation number
> reaches those code paths, no out-of-bounds access occurs.
>
> For several drivers that read the "max-link-speed" property
> (pci-j721e, brcmstb, mediatek-gen3, rzg3s-host), we add an explicit
> validation step: if the value is missing, out of range, or unsupported
> by the hardware, a safe default is used (usually Gen2). Other drivers
> (mainly DesignWare glue drivers) rely on the helper to safely handle
> invalid values, but do not yet include fallback logic or warnings.
>
> Finally, the range check is removed from of_pci_get_max_link_speed(),
> so that future PCIe generations can be supported without modifying
> drivers/pci/of.c.
Thanks for this series.
We still have a couple references to pcie_link_speed[] that bypass
pcie_get_link_speed(). These are safe because PCI_EXP_LNKSTA_CLS is
0xf and pcie_link_speed[] is size 16, but I'm not sure the direct
reference is necessary.
The array itself is exported, which I suppose we needed for modular
PCI controller drivers, but we probably don't need it now that
pcie_get_link_speed() is exported?
$ git grep "\<pcie_link_speed\>"
drivers/pci/pci-sysfs.c: speed = pcie_link_speed[linkstat & PCI_EXP_LNKSTA_CLS];
drivers/pci/pci.c: return pcie_link_speed[FIELD_GET(PCI_EXP_LNKSTA_CLS, lnksta)];
drivers/pci/pci.h:extern const unsigned char pcie_link_speed[];
drivers/pci/pci.h: bus->cur_bus_speed = pcie_link_speed[linksta & PCI_EXP_LNKSTA_CLS];
drivers/pci/probe.c:const unsigned char pcie_link_speed[] = {
drivers/pci/probe.c:EXPORT_SYMBOL_GPL(pcie_link_speed);
drivers/pci/probe.c: if (speed >= ARRAY_SIZE(pcie_link_speed))
drivers/pci/probe.c: return pcie_link_speed[speed];
drivers/pci/probe.c: bus->max_bus_speed = pcie_link_speed[linkcap & PCI_EXP_LNKCAP_SLS];
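For illustration only, here is a minimal user-space sketch of the two lookup styles discussed above. Only the 0xf mask and the 16-entry table size come from this message; the table contents, `LNKSTA_CLS_MASK`, `SPEED_UNKNOWN`, `masked_lookup()` and `checked_lookup()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Assumption: stand-in for PCI_EXP_LNKSTA_CLS, stated above to be 0xf. */
#define LNKSTA_CLS_MASK 0xfu

/* Assumption: placeholder sentinel standing in for PCI_SPEED_UNKNOWN. */
#define SPEED_UNKNOWN 0xffu

/* Placeholder 16-entry table; the values are illustrative, not real speeds. */
static const unsigned char link_speed_table[16] = {
	SPEED_UNKNOWN, 1, 2, 3, 4, 5, 6, SPEED_UNKNOWN,
	SPEED_UNKNOWN, SPEED_UNKNOWN, SPEED_UNKNOWN, SPEED_UNKNOWN,
	SPEED_UNKNOWN, SPEED_UNKNOWN, SPEED_UNKNOWN, SPEED_UNKNOWN,
};

/*
 * Direct indexing with a masked register value: the index is at most
 * 0xf, so it can never run past a 16-entry table.
 */
static unsigned char masked_lookup(unsigned int lnksta)
{
	return link_speed_table[lnksta & LNKSTA_CLS_MASK];
}

/*
 * Bounds-checked lookup for an arbitrary value (for example one read
 * from a devicetree property): anything outside the table maps to the
 * sentinel instead of reading out of bounds.
 */
static unsigned char checked_lookup(unsigned int speed)
{
	if (speed >= ARRAY_SIZE(link_speed_table))
		return SPEED_UNKNOWN;
	return link_speed_table[speed];
}
```

The masked form is only safe while the mask width and the table size stay in agreement. The checked form does not depend on that coupling, which is the argument for routing every access through one helper.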
* [PATCH v4 0/5] dt-bindings: usb: atmel: convert Atmel USB controller bindings to YAML
From: Charan Pedumuru @ 2026-03-27 16:47 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Claudiu Beznea, Herve Codina, Nicolas Ferre,
Alexandre Belloni
Cc: linux-usb, devicetree, linux-arm-kernel, linux-kernel,
Charan Pedumuru
This patch series converts the legacy text-based Device Tree bindings for
Atmel/Microchip USB controllers to DT schema (YAML) format.
Note:
The patch "dt-bindings: usb: atmel,at91sam9rl-udc: convert to DT schema"
depends on the patch "arm: dts: at91: remove unused #address-cells/#size-cells
from sam9x60 UDC node". If the DT schema patch is applied before the DTS
cleanup patch, `dtbs_check` will fail due to the presence of the removed
properties in the existing DTS.
Signed-off-by: Charan Pedumuru <charan.pedumuru@gmail.com>
---
Changes in v4:
- generic-ohci: Modify the commit message and modify description for the
properties "atmel,vbus-gpio" and "atmel,oc-gpio".
- atmel,at91rm9200-udc: Remove minItems for clocks and rename
unevaluatedProperties to additionalProperties.
- atmel,at91sam9rl-udc: Remove minItems for clocks and modify commit
message.
- all: Remove the corresponding text binding node for each patch from
the text binding file.
- Link to v3: https://lore.kernel.org/r/20260307-atmel-usb-v3-0-3dc48fe772be@gmail.com
Changes in v3:
- sam9x60: Add a new patch removing the unnecessary #address-cells and
#size-cells properties from the sam9x60 UDC node.
- atmel,at91sam9rl-udc: Remove #address-cells and #size-cells from the
atmel,at91sam9rl-udc binding properties.
- generic-ohci: Add an else condition to the generic-ohci schema properties
for improved validation precision.
- Link to v2: https://lore.kernel.org/r/20260224-atmel-usb-v2-0-6d6a615c9c47@gmail.com
Changes in v2:
- Drop the separate YAML patches for OHCI and EHCI.
- Add the compatibles "atmel,at91rm9200-ohci" and "atmel,at91sam9g45-ehci"
to the existing generic OHCI and EHCI binding files.
- Link to v1: https://lore.kernel.org/r/20260201-atmel-usb-v1-0-d1a3e93003f1@gmail.com
---
Charan Pedumuru (5):
arm: dts: at91: remove unused #address-cells/#size-cells from sam9x60 udc node
dt-bindings: usb: generic-ohci: add AT91RM9200 OHCI binding support
dt-bindings: usb: generic-ehci: fix schema structure and add at91sam9g45 constraints
dt-bindings: usb: atmel,at91rm9200-udc: convert to DT schema
dt-bindings: usb: atmel,at91sam9rl-udc: convert to DT schema
.../bindings/usb/atmel,at91rm9200-udc.yaml | 76 +++++++++++++
.../bindings/usb/atmel,at91sam9rl-udc.yaml | 74 ++++++++++++
.../devicetree/bindings/usb/atmel-usb.txt | 125 ---------------------
.../devicetree/bindings/usb/generic-ehci.yaml | 46 +++++---
.../devicetree/bindings/usb/generic-ohci.yaml | 41 +++++++
arch/arm/boot/dts/microchip/sam9x60.dtsi | 2 -
6 files changed, 224 insertions(+), 140 deletions(-)
---
base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279
change-id: 20260129-atmel-usb-37f89a141e48
Best regards,
--
Charan Pedumuru <charan.pedumuru@gmail.com>
* [PATCH v4 1/5] arm: dts: at91: remove unused #address-cells/#size-cells from sam9x60 udc node
From: Charan Pedumuru @ 2026-03-27 16:47 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Claudiu Beznea, Herve Codina, Nicolas Ferre,
Alexandre Belloni
Cc: linux-usb, devicetree, linux-arm-kernel, linux-kernel,
Charan Pedumuru
In-Reply-To: <20260327-atmel-usb-v4-0-eb8b6e49b29d@gmail.com>
The UDC node does not define any child nodes, so the "#address-cells" and
"#size-cells" properties are unnecessary. Remove these unused properties
to simplify the devicetree node and keep it consistent with DT conventions.
Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Signed-off-by: Charan Pedumuru <charan.pedumuru@gmail.com>
---
arch/arm/boot/dts/microchip/sam9x60.dtsi | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/arm/boot/dts/microchip/sam9x60.dtsi b/arch/arm/boot/dts/microchip/sam9x60.dtsi
index b075865e6a76..e708b3df4ccd 100644
--- a/arch/arm/boot/dts/microchip/sam9x60.dtsi
+++ b/arch/arm/boot/dts/microchip/sam9x60.dtsi
@@ -75,8 +75,6 @@ ahb {
ranges;
usb0: gadget@500000 {
- #address-cells = <1>;
- #size-cells = <0>;
compatible = "microchip,sam9x60-udc";
reg = <0x00500000 0x100000
0xf803c000 0x400>;
--
2.53.0