* [PATCH v1 0/9] AV1 stateless decoder for RK3588
@ 2022-12-19 15:56 Benjamin Gaignard
2022-12-19 15:56 ` [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible Benjamin Gaignard
` (9 more replies)
0 siblings, 10 replies; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-19 15:56 UTC (permalink / raw)
To: ezequiel, p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt,
heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel, Benjamin Gaignard
This series implement AV1 stateless decoder for RK3588 SoC.
The harware support 8 and 10 bits bitstreams up to 7680x4320.
AV1 feature like film grain or scaling are done by the postprocessor.
The driver can produce NV12_4L4 and NV12 pixel formats.
A native 10bits NV12_4L4 format is possible but need more investigation
to be completly documented and enabled.
It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
Sebastian's device-tree patches for RK3588.
The full branch can be found here:
https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
The failing tests are:
- 10bits bitstream because 10bits output formats aren't yet implemented.
- the 2 tests with 2 spatial layers: few errors in luma/chroma values
- tests with resolution < hardware limit (64x64)
Benjamin
Benjamin Gaignard (9):
dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
media: verisilicon: Add AV1 decoder mode and controls
media: verisilicon: Save bit depth for AV1 decoder
media: verisilicon: Check AV1 bitstreams bit depth
media: verisilicon: Compute motion vectors size for AV1 frames
media: verisilicon: Add AV1 entropy helpers
media: verisilicon: Add Rockchip AV1 decoder
media: verisilicon: Add film grain feature to AV1 driver
media: verisilicon: Enable AV1 decoder on rk3588
.../bindings/media/rockchip-vpu.yaml | 1 +
drivers/media/platform/verisilicon/Makefile | 3 +
drivers/media/platform/verisilicon/hantro.h | 5 +
.../media/platform/verisilicon/hantro_drv.c | 54 +
.../media/platform/verisilicon/hantro_hw.h | 102 +
.../platform/verisilicon/hantro_postproc.c | 3 +
.../media/platform/verisilicon/hantro_v4l2.c | 5 +
.../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
.../verisilicon/rockchip_av1_entropymode.h | 272 +
.../verisilicon/rockchip_av1_filmgrain.c | 401 ++
.../verisilicon/rockchip_av1_filmgrain.h | 36 +
.../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
.../verisilicon/rockchip_vpu981_regs.h | 477 ++
.../platform/verisilicon/rockchip_vpu_hw.c | 116 +
14 files changed, 8291 insertions(+)
create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
@ 2022-12-19 15:56 ` Benjamin Gaignard
2022-12-19 16:06 ` Krzysztof Kozlowski
2022-12-20 9:55 ` Krzysztof Kozlowski
2022-12-19 15:56 ` [PATCH v1 2/9] media: verisilicon: Add AV1 decoder mode and controls Benjamin Gaignard
` (8 subsequent siblings)
9 siblings, 2 replies; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-19 15:56 UTC (permalink / raw)
To: ezequiel, p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt,
heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel, Benjamin Gaignard
Add compatible for rk3588 AV1 vpu decoder.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
Documentation/devicetree/bindings/media/rockchip-vpu.yaml | 1 +
1 file changed, 1 insertion(+)
diff --git a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
index 6cc4d3e5a61d..8454df53f5cb 100644
--- a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
+++ b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
@@ -24,6 +24,7 @@ properties:
- rockchip,rk3399-vpu
- rockchip,px30-vpu
- rockchip,rk3568-vpu
+ - rockchip,rk3588-av1-vpu
- items:
- const: rockchip,rk3188-vpu
- const: rockchip,rk3066-vpu
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v1 2/9] media: verisilicon: Add AV1 decoder mode and controls
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
2022-12-19 15:56 ` [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible Benjamin Gaignard
@ 2022-12-19 15:56 ` Benjamin Gaignard
[not found] ` <202212200257.l7xJlHCi-lkp@intel.com>
2022-12-19 20:28 ` Nicolas Dufresne
2022-12-19 15:56 ` [PATCH v1 3/9] media: verisilicon: Save bit depth for AV1 decoder Benjamin Gaignard
` (7 subsequent siblings)
9 siblings, 2 replies; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-19 15:56 UTC (permalink / raw)
To: ezequiel, p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt,
heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel, Benjamin Gaignard
Add AV1 decoder as new decoder mode to Hantro driver.
Register needed AV1 controls for the decoder.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
drivers/media/platform/verisilicon/hantro.h | 3 +++
.../media/platform/verisilicon/hantro_drv.c | 21 +++++++++++++++++++
2 files changed, 24 insertions(+)
diff --git a/drivers/media/platform/verisilicon/hantro.h b/drivers/media/platform/verisilicon/hantro.h
index 2989ebc631cc..61480825b856 100644
--- a/drivers/media/platform/verisilicon/hantro.h
+++ b/drivers/media/platform/verisilicon/hantro.h
@@ -38,6 +38,7 @@ struct hantro_postproc_ops;
#define HANTRO_H264_DECODER BIT(18)
#define HANTRO_HEVC_DECODER BIT(19)
#define HANTRO_VP9_DECODER BIT(20)
+#define HANTRO_AV1_DECODER BIT(21)
#define HANTRO_DECODERS 0xffff0000
/**
@@ -111,6 +112,7 @@ struct hantro_variant {
* @HANTRO_MODE_VP8_DEC: VP8 decoder.
* @HANTRO_MODE_HEVC_DEC: HEVC decoder.
* @HANTRO_MODE_VP9_DEC: VP9 decoder.
+ * @HANTRO_MODE_AV1_DEC: AV1 decoder
*/
enum hantro_codec_mode {
HANTRO_MODE_NONE = -1,
@@ -120,6 +122,7 @@ enum hantro_codec_mode {
HANTRO_MODE_VP8_DEC,
HANTRO_MODE_HEVC_DEC,
HANTRO_MODE_VP9_DEC,
+ HANTRO_MODE_AV1_DEC,
};
/*
diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
index 8cb4a68c9119..4500e1fc0f2c 100644
--- a/drivers/media/platform/verisilicon/hantro_drv.c
+++ b/drivers/media/platform/verisilicon/hantro_drv.c
@@ -498,6 +498,27 @@ static const struct hantro_ctrl controls[] = {
.cfg = {
.id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR,
},
+ }, {
+ .codec = HANTRO_AV1_DECODER,
+ .cfg = {
+ .id = V4L2_CID_STATELESS_AV1_FRAME,
+ },
+ }, {
+ .codec = HANTRO_AV1_DECODER,
+ .cfg = {
+ .id = V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY,
+ .dims = { V4L2_AV1_MAX_TILE_COUNT },
+ },
+ }, {
+ .codec = HANTRO_AV1_DECODER,
+ .cfg = {
+ .id = V4L2_CID_STATELESS_AV1_SEQUENCE,
+ },
+ }, {
+ .codec = HANTRO_AV1_DECODER,
+ .cfg = {
+ .id = V4L2_CID_STATELESS_AV1_FILM_GRAIN,
+ },
},
};
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v1 3/9] media: verisilicon: Save bit depth for AV1 decoder
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
2022-12-19 15:56 ` [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible Benjamin Gaignard
2022-12-19 15:56 ` [PATCH v1 2/9] media: verisilicon: Add AV1 decoder mode and controls Benjamin Gaignard
@ 2022-12-19 15:56 ` Benjamin Gaignard
2022-12-19 20:37 ` Nicolas Dufresne
2022-12-19 15:56 ` [PATCH v1 4/9] media: verisilicon: Check AV1 bitstreams bit depth Benjamin Gaignard
` (6 subsequent siblings)
9 siblings, 1 reply; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-19 15:56 UTC (permalink / raw)
To: ezequiel, p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt,
heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel, Benjamin Gaignard
Store bit depth information from AV1 sequence control.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
.../media/platform/verisilicon/hantro_drv.c | 26 +++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
index 4500e1fc0f2c..8e93710dcfed 100644
--- a/drivers/media/platform/verisilicon/hantro_drv.c
+++ b/drivers/media/platform/verisilicon/hantro_drv.c
@@ -324,6 +324,25 @@ static int hantro_vp9_s_ctrl(struct v4l2_ctrl *ctrl)
return 0;
}
+static int hantro_av1_s_ctrl(struct v4l2_ctrl *ctrl)
+{
+ struct hantro_ctx *ctx;
+
+ ctx = container_of(ctrl->handler,
+ struct hantro_ctx, ctrl_handler);
+
+ switch (ctrl->id) {
+ case V4L2_CID_STATELESS_AV1_SEQUENCE:
+ ctx->bit_depth = ctrl->p_new.p_av1_sequence->bit_depth;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+
static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
.try_ctrl = hantro_try_ctrl,
};
@@ -336,6 +355,12 @@ static const struct v4l2_ctrl_ops hantro_vp9_ctrl_ops = {
.s_ctrl = hantro_vp9_s_ctrl,
};
+static const struct v4l2_ctrl_ops hantro_av1_ctrl_ops = {
+ .try_ctrl = hantro_try_ctrl,
+ .s_ctrl = hantro_av1_s_ctrl,
+};
+
+
#define HANTRO_JPEG_ACTIVE_MARKERS (V4L2_JPEG_ACTIVE_MARKER_APP0 | \
V4L2_JPEG_ACTIVE_MARKER_COM | \
V4L2_JPEG_ACTIVE_MARKER_DQT | \
@@ -513,6 +538,7 @@ static const struct hantro_ctrl controls[] = {
.codec = HANTRO_AV1_DECODER,
.cfg = {
.id = V4L2_CID_STATELESS_AV1_SEQUENCE,
+ .ops = &hantro_av1_ctrl_ops,
},
}, {
.codec = HANTRO_AV1_DECODER,
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v1 4/9] media: verisilicon: Check AV1 bitstreams bit depth
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
` (2 preceding siblings ...)
2022-12-19 15:56 ` [PATCH v1 3/9] media: verisilicon: Save bit depth for AV1 decoder Benjamin Gaignard
@ 2022-12-19 15:56 ` Benjamin Gaignard
2022-12-19 20:38 ` Nicolas Dufresne
2022-12-19 15:56 ` [PATCH v1 5/9] media: verisilicon: Compute motion vectors size for AV1 frames Benjamin Gaignard
` (5 subsequent siblings)
9 siblings, 1 reply; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-19 15:56 UTC (permalink / raw)
To: ezequiel, p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt,
heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel, Benjamin Gaignard
The driver supports 8 and 10 bits bitstreams, make sure to discard
other cases.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
drivers/media/platform/verisilicon/hantro_drv.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
index 8e93710dcfed..e10fc59634dd 100644
--- a/drivers/media/platform/verisilicon/hantro_drv.c
+++ b/drivers/media/platform/verisilicon/hantro_drv.c
@@ -282,7 +282,13 @@ static int hantro_try_ctrl(struct v4l2_ctrl *ctrl)
/* We only support profile 0 */
if (dec_params->profile != 0)
return -EINVAL;
+ } else if (ctrl->id == V4L2_CID_STATELESS_AV1_SEQUENCE) {
+ const struct v4l2_ctrl_av1_sequence *sequence = ctrl->p_new.p_av1_sequence;
+
+ if (sequence->bit_depth != 8 && sequence->bit_depth != 10)
+ return -EINVAL;
}
+
return 0;
}
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v1 5/9] media: verisilicon: Compute motion vectors size for AV1 frames
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
` (3 preceding siblings ...)
2022-12-19 15:56 ` [PATCH v1 4/9] media: verisilicon: Check AV1 bitstreams bit depth Benjamin Gaignard
@ 2022-12-19 15:56 ` Benjamin Gaignard
2022-12-19 20:42 ` Nicolas Dufresne
2022-12-19 15:56 ` [PATCH v1 8/9] media: verisilicon: Add film grain feature to AV1 driver Benjamin Gaignard
` (4 subsequent siblings)
9 siblings, 1 reply; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-19 15:56 UTC (permalink / raw)
To: ezequiel, p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt,
heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel, Benjamin Gaignard
Compute the additional required to store motion vectors at
the end of the frames buffers.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
drivers/media/platform/verisilicon/hantro_hw.h | 13 +++++++++++++
.../media/platform/verisilicon/hantro_postproc.c | 3 +++
drivers/media/platform/verisilicon/hantro_v4l2.c | 5 +++++
3 files changed, 21 insertions(+)
diff --git a/drivers/media/platform/verisilicon/hantro_hw.h b/drivers/media/platform/verisilicon/hantro_hw.h
index e83f0c523a30..8b3bc7e31395 100644
--- a/drivers/media/platform/verisilicon/hantro_hw.h
+++ b/drivers/media/platform/verisilicon/hantro_hw.h
@@ -417,6 +417,19 @@ hantro_hevc_mv_size(unsigned int width, unsigned int height)
return width * height / 16;
}
+static inline unsigned short hantro_av1_num_sbs(unsigned short dimension)
+{
+ return DIV_ROUND_UP(dimension, 64) + 1;
+}
+
+static inline size_t
+hantro_av1_mv_size(unsigned int width, unsigned int height)
+{
+ size_t num_sbs = hantro_av1_num_sbs(width) * hantro_av1_num_sbs(height);
+
+ return ALIGN(num_sbs * 384, 16) + 512;
+}
+
int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
int rockchip_vpu2_mpeg2_dec_run(struct hantro_ctx *ctx);
void hantro_mpeg2_dec_copy_qtable(u8 *qtable,
diff --git a/drivers/media/platform/verisilicon/hantro_postproc.c b/drivers/media/platform/verisilicon/hantro_postproc.c
index 09d8cf942689..7dc39519a2ee 100644
--- a/drivers/media/platform/verisilicon/hantro_postproc.c
+++ b/drivers/media/platform/verisilicon/hantro_postproc.c
@@ -213,6 +213,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE)
buf_size += hantro_hevc_mv_size(pix_mp.width,
pix_mp.height);
+ else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_AV1_FRAME)
+ buf_size += hantro_av1_mv_size(pix_mp.width,
+ pix_mp.height);
for (i = 0; i < num_buffers; ++i) {
struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
index 2c7a805289e7..d41dcb108a6d 100644
--- a/drivers/media/platform/verisilicon/hantro_v4l2.c
+++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
@@ -334,6 +334,11 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx,
pix_mp->plane_fmt[0].sizeimage +=
hantro_hevc_mv_size(pix_mp->width,
pix_mp->height);
+ else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_AV1_FRAME &&
+ !hantro_needs_postproc(ctx, fmt))
+ pix_mp->plane_fmt[0].sizeimage +=
+ hantro_av1_mv_size(pix_mp->width,
+ pix_mp->height);
} else if (!pix_mp->plane_fmt[0].sizeimage) {
/*
* For coded formats the application can specify
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v1 8/9] media: verisilicon: Add film grain feature to AV1 driver
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
` (4 preceding siblings ...)
2022-12-19 15:56 ` [PATCH v1 5/9] media: verisilicon: Compute motion vectors size for AV1 frames Benjamin Gaignard
@ 2022-12-19 15:56 ` Benjamin Gaignard
2022-12-19 15:56 ` [PATCH v1 9/9] media: verisilicon: Enable AV1 decoder on rk3588 Benjamin Gaignard
` (3 subsequent siblings)
9 siblings, 0 replies; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-19 15:56 UTC (permalink / raw)
To: ezequiel, p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt,
heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel, Benjamin Gaignard
Film grain feature add "old style" grain noise on decoded streams.
Grain noise is applied after decoding by the postprocessor.
The level of grain is based on gaussian sequence.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
drivers/media/platform/verisilicon/Makefile | 1 +
.../media/platform/verisilicon/hantro_hw.h | 3 +
.../verisilicon/rockchip_av1_filmgrain.c | 401 ++++++++++++++++++
.../verisilicon/rockchip_av1_filmgrain.h | 36 ++
.../verisilicon/rockchip_vpu981_hw_av1_dec.c | 213 ++++++++++
5 files changed, 654 insertions(+)
create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
diff --git a/drivers/media/platform/verisilicon/Makefile b/drivers/media/platform/verisilicon/Makefile
index c9a9806ab8c5..6ad2ef885920 100644
--- a/drivers/media/platform/verisilicon/Makefile
+++ b/drivers/media/platform/verisilicon/Makefile
@@ -19,6 +19,7 @@ hantro-vpu-y += \
rockchip_vpu2_hw_mpeg2_dec.o \
rockchip_vpu2_hw_vp8_dec.o \
rockchip_vpu981_hw_av1_dec.o \
+ rockchip_av1_filmgrain.o \
rockchip_av1_entropymode.o \
hantro_jpeg.o \
hantro_h264.o \
diff --git a/drivers/media/platform/verisilicon/hantro_hw.h b/drivers/media/platform/verisilicon/hantro_hw.h
index 3c0a995998a5..ac23fc3be079 100644
--- a/drivers/media/platform/verisilicon/hantro_hw.h
+++ b/drivers/media/platform/verisilicon/hantro_hw.h
@@ -16,6 +16,7 @@
#include <media/videobuf2-core.h>
#include "rockchip_av1_entropymode.h"
+#include "rockchip_av1_filmgrain.h"
#define DEC_8190_ALIGN_MASK 0x07U
@@ -288,6 +289,7 @@ struct hantro_av1_frame_ref {
* @global_model: global model buffer
* @tile_info: tile info buffer
* @segment: segmentation info buffer
+ * @film_grain: film grain buffer
* @prob_tbl: probability table
* @prob_tbl_out: probability table output
* @tile_buf: tile buffer
@@ -312,6 +314,7 @@ struct hantro_av1_dec_hw_ctx {
struct hantro_aux_buf global_model;
struct hantro_aux_buf tile_info;
struct hantro_aux_buf segment;
+ struct hantro_aux_buf film_grain;
struct hantro_aux_buf prob_tbl;
struct hantro_aux_buf prob_tbl_out;
struct hantro_aux_buf tile_buf;
diff --git a/drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c b/drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
new file mode 100644
index 000000000000..008ba5782eb0
--- /dev/null
+++ b/drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
@@ -0,0 +1,401 @@
+// SPDX-License-Identifier: GPL-2.0-only or Apache-2.0
+
+#include "rockchip_av1_filmgrain.h"
+
+static const int32_t gaussian_sequence[2048] = {
+ 56, 568, -180, 172, 124, -84, 172, -64, -900, 24, 820,
+ 224, 1248, 996, 272, -8, -916, -388, -732, -104, -188, 800,
+ 112, -652, -320, -376, 140, -252, 492, -168, 44, -788, 588,
+ -584, 500, -228, 12, 680, 272, -476, 972, -100, 652, 368,
+ 432, -196, -720, -192, 1000, -332, 652, -136, -552, -604, -4,
+ 192, -220, -136, 1000, -52, 372, -96, -624, 124, -24, 396,
+ 540, -12, -104, 640, 464, 244, -208, -84, 368, -528, -740,
+ 248, -968, -848, 608, 376, -60, -292, -40, -156, 252, -292,
+ 248, 224, -280, 400, -244, 244, -60, 76, -80, 212, 532,
+ 340, 128, -36, 824, -352, -60, -264, -96, -612, 416, -704,
+ 220, -204, 640, -160, 1220, -408, 900, 336, 20, -336, -96,
+ -792, 304, 48, -28, -1232, -1172, -448, 104, -292, -520, 244,
+ 60, -948, 0, -708, 268, 108, 356, -548, 488, -344, -136,
+ 488, -196, -224, 656, -236, -1128, 60, 4, 140, 276, -676,
+ -376, 168, -108, 464, 8, 564, 64, 240, 308, -300, -400,
+ -456, -136, 56, 120, -408, -116, 436, 504, -232, 328, 844,
+ -164, -84, 784, -168, 232, -224, 348, -376, 128, 568, 96,
+ -1244, -288, 276, 848, 832, -360, 656, 464, -384, -332, -356,
+ 728, -388, 160, -192, 468, 296, 224, 140, -776, -100, 280,
+ 4, 196, 44, -36, -648, 932, 16, 1428, 28, 528, 808,
+ 772, 20, 268, 88, -332, -284, 124, -384, -448, 208, -228,
+ -1044, -328, 660, 380, -148, -300, 588, 240, 540, 28, 136,
+ -88, -436, 256, 296, -1000, 1400, 0, -48, 1056, -136, 264,
+ -528, -1108, 632, -484, -592, -344, 796, 124, -668, -768, 388,
+ 1296, -232, -188, -200, -288, -4, 308, 100, -168, 256, -500,
+ 204, -508, 648, -136, 372, -272, -120, -1004, -552, -548, -384,
+ 548, -296, 428, -108, -8, -912, -324, -224, -88, -112, -220,
+ -100, 996, -796, 548, 360, -216, 180, 428, -200, -212, 148,
+ 96, 148, 284, 216, -412, -320, 120, -300, -384, -604, -572,
+ -332, -8, -180, -176, 696, 116, -88, 628, 76, 44, -516,
+ 240, -208, -40, 100, -592, 344, -308, -452, -228, 20, 916,
+ -1752, -136, -340, -804, 140, 40, 512, 340, 248, 184, -492,
+ 896, -156, 932, -628, 328, -688, -448, -616, -752, -100, 560,
+ -1020, 180, -800, -64, 76, 576, 1068, 396, 660, 552, -108,
+ -28, 320, -628, 312, -92, -92, -472, 268, 16, 560, 516,
+ -672, -52, 492, -100, 260, 384, 284, 292, 304, -148, 88,
+ -152, 1012, 1064, -228, 164, -376, -684, 592, -392, 156, 196,
+ -524, -64, -884, 160, -176, 636, 648, 404, -396, -436, 864,
+ 424, -728, 988, -604, 904, -592, 296, -224, 536, -176, -920,
+ 436, -48, 1176, -884, 416, -776, -824, -884, 524, -548, -564,
+ -68, -164, -96, 692, 364, -692, -1012, -68, 260, -480, 876,
+ -1116, 452, -332, -352, 892, -1088, 1220, -676, 12, -292, 244,
+ 496, 372, -32, 280, 200, 112, -440, -96, 24, -644, -184,
+ 56, -432, 224, -980, 272, -260, 144, -436, 420, 356, 364,
+ -528, 76, 172, -744, -368, 404, -752, -416, 684, -688, 72,
+ 540, 416, 92, 444, 480, -72, -1416, 164, -1172, -68, 24,
+ 424, 264, 1040, 128, -912, -524, -356, 64, 876, -12, 4,
+ -88, 532, 272, -524, 320, 276, -508, 940, 24, -400, -120,
+ 756, 60, 236, -412, 100, 376, -484, 400, -100, -740, -108,
+ -260, 328, -268, 224, -200, -416, 184, -604, -564, -20, 296,
+ 60, 892, -888, 60, 164, 68, -760, 216, -296, 904, -336,
+ -28, 404, -356, -568, -208, -1480, -512, 296, 328, -360, -164,
+ -1560, -776, 1156, -428, 164, -504, -112, 120, -216, -148, -264,
+ 308, 32, 64, -72, 72, 116, 176, -64, -272, 460, -536,
+ -784, -280, 348, 108, -752, -132, 524, -540, -776, 116, -296,
+ -1196, -288, -560, 1040, -472, 116, -848, -1116, 116, 636, 696,
+ 284, -176, 1016, 204, -864, -648, -248, 356, 972, -584, -204,
+ 264, 880, 528, -24, -184, 116, 448, -144, 828, 524, 212,
+ -212, 52, 12, 200, 268, -488, -404, -880, 824, -672, -40,
+ 908, -248, 500, 716, -576, 492, -576, 16, 720, -108, 384,
+ 124, 344, 280, 576, -500, 252, 104, -308, 196, -188, -8,
+ 1268, 296, 1032, -1196, 436, 316, 372, -432, -200, -660, 704,
+ -224, 596, -132, 268, 32, -452, 884, 104, -1008, 424, -1348,
+ -280, 4, -1168, 368, 476, 696, 300, -8, 24, 180, -592,
+ -196, 388, 304, 500, 724, -160, 244, -84, 272, -256, -420,
+ 320, 208, -144, -156, 156, 364, 452, 28, 540, 316, 220,
+ -644, -248, 464, 72, 360, 32, -388, 496, -680, -48, 208,
+ -116, -408, 60, -604, -392, 548, -840, 784, -460, 656, -544,
+ -388, -264, 908, -800, -628, -612, -568, 572, -220, 164, 288,
+ -16, -308, 308, -112, -636, -760, 280, -668, 432, 364, 240,
+ -196, 604, 340, 384, 196, 592, -44, -500, 432, -580, -132,
+ 636, -76, 392, 4, -412, 540, 508, 328, -356, -36, 16,
+ -220, -64, -248, -60, 24, -192, 368, 1040, 92, -24, -1044,
+ -32, 40, 104, 148, 192, -136, -520, 56, -816, -224, 732,
+ 392, 356, 212, -80, -424, -1008, -324, 588, -1496, 576, 460,
+ -816, -848, 56, -580, -92, -1372, -112, -496, 200, 364, 52,
+ -140, 48, -48, -60, 84, 72, 40, 132, -356, -268, -104,
+ -284, -404, 732, -520, 164, -304, -540, 120, 328, -76, -460,
+ 756, 388, 588, 236, -436, -72, -176, -404, -316, -148, 716,
+ -604, 404, -72, -88, -888, -68, 944, 88, -220, -344, 960,
+ 472, 460, -232, 704, 120, 832, -228, 692, -508, 132, -476,
+ 844, -748, -364, -44, 1116, -1104, -1056, 76, 428, 552, -692,
+ 60, 356, 96, -384, -188, -612, -576, 736, 508, 892, 352,
+ -1132, 504, -24, -352, 324, 332, -600, -312, 292, 508, -144,
+ -8, 484, 48, 284, -260, -240, 256, -100, -292, -204, -44,
+ 472, -204, 908, -188, -1000, -256, 92, 1164, -392, 564, 356,
+ 652, -28, -884, 256, 484, -192, 760, -176, 376, -524, -452,
+ -436, 860, -736, 212, 124, 504, -476, 468, 76, -472, 552,
+ -692, -944, -620, 740, -240, 400, 132, 20, 192, -196, 264,
+ -668, -1012, -60, 296, -316, -828, 76, -156, 284, -768, -448,
+ -832, 148, 248, 652, 616, 1236, 288, -328, -400, -124, 588,
+ 220, 520, -696, 1032, 768, -740, -92, -272, 296, 448, -464,
+ 412, -200, 392, 440, -200, 264, -152, -260, 320, 1032, 216,
+ 320, -8, -64, 156, -1016, 1084, 1172, 536, 484, -432, 132,
+ 372, -52, -256, 84, 116, -352, 48, 116, 304, -384, 412,
+ 924, -300, 528, 628, 180, 648, 44, -980, -220, 1320, 48,
+ 332, 748, 524, -268, -720, 540, -276, 564, -344, -208, -196,
+ 436, 896, 88, -392, 132, 80, -964, -288, 568, 56, -48,
+ -456, 888, 8, 552, -156, -292, 948, 288, 128, -716, -292,
+ 1192, -152, 876, 352, -600, -260, -812, -468, -28, -120, -32,
+ -44, 1284, 496, 192, 464, 312, -76, -516, -380, -456, -1012,
+ -48, 308, -156, 36, 492, -156, -808, 188, 1652, 68, -120,
+ -116, 316, 160, -140, 352, 808, -416, 592, 316, -480, 56,
+ 528, -204, -568, 372, -232, 752, -344, 744, -4, 324, -416,
+ -600, 768, 268, -248, -88, -132, -420, -432, 80, -288, 404,
+ -316, -1216, -588, 520, -108, 92, -320, 368, -480, -216, -92,
+ 1688, -300, 180, 1020, -176, 820, -68, -228, -260, 436, -904,
+ 20, 40, -508, 440, -736, 312, 332, 204, 760, -372, 728,
+ 96, -20, -632, -520, -560, 336, 1076, -64, -532, 776, 584,
+ 192, 396, -728, -520, 276, -188, 80, -52, -612, -252, -48,
+ 648, 212, -688, 228, -52, -260, 428, -412, -272, -404, 180,
+ 816, -796, 48, 152, 484, -88, -216, 988, 696, 188, -528,
+ 648, -116, -180, 316, 476, 12, -564, 96, 476, -252, -364,
+ -376, -392, 556, -256, -576, 260, -352, 120, -16, -136, -260,
+ -492, 72, 556, 660, 580, 616, 772, 436, 424, -32, -324,
+ -1268, 416, -324, -80, 920, 160, 228, 724, 32, -516, 64,
+ 384, 68, -128, 136, 240, 248, -204, -68, 252, -932, -120,
+ -480, -628, -84, 192, 852, -404, -288, -132, 204, 100, 168,
+ -68, -196, -868, 460, 1080, 380, -80, 244, 0, 484, -888,
+ 64, 184, 352, 600, 460, 164, 604, -196, 320, -64, 588,
+ -184, 228, 12, 372, 48, -848, -344, 224, 208, -200, 484,
+ 128, -20, 272, -468, -840, 384, 256, -720, -520, -464, -580,
+ 112, -120, 644, -356, -208, -608, -528, 704, 560, -424, 392,
+ 828, 40, 84, 200, -152, 0, -144, 584, 280, -120, 80,
+ -556, -972, -196, -472, 724, 80, 168, -32, 88, 160, -688,
+ 0, 160, 356, 372, -776, 740, -128, 676, -248, -480, 4,
+ -364, 96, 544, 232, -1032, 956, 236, 356, 20, -40, 300,
+ 24, -676, -596, 132, 1120, -104, 532, -1096, 568, 648, 444,
+ 508, 380, 188, -376, -604, 1488, 424, 24, 756, -220, -192,
+ 716, 120, 920, 688, 168, 44, -460, 568, 284, 1144, 1160,
+ 600, 424, 888, 656, -356, -320, 220, 316, -176, -724, -188,
+ -816, -628, -348, -228, -380, 1012, -452, -660, 736, 928, 404,
+ -696, -72, -268, -892, 128, 184, -344, -780, 360, 336, 400,
+ 344, 428, 548, -112, 136, -228, -216, -820, -516, 340, 92,
+ -136, 116, -300, 376, -244, 100, -316, -520, -284, -12, 824,
+ 164, -548, -180, -128, 116, -924, -828, 268, -368, -580, 620,
+ 192, 160, 0, -1676, 1068, 424, -56, -360, 468, -156, 720,
+ 288, -528, 556, -364, 548, -148, 504, 316, 152, -648, -620,
+ -684, -24, -376, -384, -108, -920, -1032, 768, 180, -264, -508,
+ -1268, -260, -60, 300, -240, 988, 724, -376, -576, -212, -736,
+ 556, 192, 1092, -620, -880, 376, -56, -4, -216, -32, 836,
+ 268, 396, 1332, 864, -600, 100, 56, -412, -92, 356, 180,
+ 884, -468, -436, 292, -388, -804, -704, -840, 368, -348, 140,
+ -724, 1536, 940, 372, 112, -372, 436, -480, 1136, 296, -32,
+ -228, 132, -48, -220, 868, -1016, -60, -1044, -464, 328, 916,
+ 244, 12, -736, -296, 360, 468, -376, -108, -92, 788, 368,
+ -56, 544, 400, -672, -420, 728, 16, 320, 44, -284, -380,
+ -796, 488, 132, 204, -596, -372, 88, -152, -908, -636, -572,
+ -624, -116, -692, -200, -56, 276, -88, 484, -324, 948, 864,
+ 1000, -456, -184, -276, 292, -296, 156, 676, 320, 160, 908,
+ -84, -1236, -288, -116, 260, -372, -644, 732, -756, -96, 84,
+ 344, -520, 348, -688, 240, -84, 216, -1044, -136, -676, -396,
+ -1500, 960, -40, 176, 168, 1516, 420, -504, -344, -364, -360,
+ 1216, -940, -380, -212, 252, -660, -708, 484, -444, -152, 928,
+ -120, 1112, 476, -260, 560, -148, -344, 108, -196, 228, -288,
+ 504, 560, -328, -88, 288, -1008, 460, -228, 468, -836, -196,
+ 76, 388, 232, 412, -1168, -716, -644, 756, -172, -356, -504,
+ 116, 432, 528, 48, 476, -168, -608, 448, 160, -532, -272,
+ 28, -676, -12, 828, 980, 456, 520, 104, -104, 256, -344,
+ -4, -28, -368, -52, -524, -572, -556, -200, 768, 1124, -208,
+ -512, 176, 232, 248, -148, -888, 604, -600, -304, 804, -156,
+ -212, 488, -192, -804, -256, 368, -360, -916, -328, 228, -240,
+ -448, -472, 856, -556, -364, 572, -12, -156, -368, -340, 432,
+ 252, -752, -152, 288, 268, -580, -848, -592, 108, -76, 244,
+ 312, -716, 592, -80, 436, 360, 4, -248, 160, 516, 584,
+ 732, 44, -468, -280, -292, -156, -588, 28, 308, 912, 24,
+ 124, 156, 180, -252, 944, -924, -772, -520, -428, -624, 300,
+ -212, -1144, 32, -724, 800, -1128, -212, -1288, -848, 180, -416,
+ 440, 192, -576, -792, -76, -1080, 80, -532, -352, -132, 380,
+ -820, 148, 1112, 128, 164, 456, 700, -924, 144, -668, -384,
+ 648, -832, 508, 552, -52, -100, -656, 208, -568, 748, -88,
+ 680, 232, 300, 192, -408, -1012, -152, -252, -268, 272, -876,
+ -664, -648, -332, -136, 16, 12, 1152, -28, 332, -536, 320,
+ -672, -460, -316, 532, -260, 228, -40, 1052, -816, 180, 88,
+ -496, -556, -672, -368, 428, 92, 356, 404, -408, 252, 196,
+ -176, -556, 792, 268, 32, 372, 40, 96, -332, 328, 120,
+ 372, -900, -40, 472, -264, -592, 952, 128, 656, 112, 664,
+ -232, 420, 4, -344, -464, 556, 244, -416, -32, 252, 0,
+ -412, 188, -696, 508, -476, 324, -1096, 656, -312, 560, 264,
+ -136, 304, 160, -64, -580, 248, 336, -720, 560, -348, -288,
+ -276, -196, -500, 852, -544, -236, -1128, -992, -776, 116, 56,
+ 52, 860, 884, 212, -12, 168, 1020, 512, -552, 924, -148,
+ 716, 188, 164, -340, -520, -184, 880, -152, -680, -208, -1156,
+ -300, -528, -472, 364, 100, -744, -1056, -32, 540, 280, 144,
+ -676, -32, -232, -280, -224, 96, 568, -76, 172, 148, 148,
+ 104, 32, -296, -32, 788, -80, 32, -16, 280, 288, 944,
+ 428, -484
+};
+
+static inline int32_t clamp(int32_t value, int32_t low, int32_t high)
+{
+ return value < low ? low : (value > high ? high : value);
+}
+
+static inline int32_t round_power_of_two(const int32_t val, int32_t n)
+{
+ const int32_t a = (int32_t)1 << (n - 1);
+
+ return (val + a) >> n;
+}
+
+static void rockchip_av1_init_random_generator(uint8_t luma_num, uint16_t seed,
+ uint16_t *random_register)
+{
+ uint16_t random_reg = seed;
+
+ random_reg ^= ((luma_num * 37 + 178) & 255) << 8;
+ random_reg ^= ((luma_num * 173 + 105) & 255);
+ *random_register = random_reg;
+}
+
+static inline void rockchip_av1_update_random_register(uint16_t *random_register)
+{
+ uint16_t bit;
+ uint16_t random_reg = *random_register;
+
+ bit = ((random_reg >> 0) ^ (random_reg >> 1) ^ (random_reg >> 3) ^
+ (random_reg >> 12)) & 1;
+ *random_register = (random_reg >> 1) | (bit << 15);
+}
+
+static inline int32_t rockchip_av1_get_random_number(uint16_t random_register)
+{
+ return (random_register >> 5) & ((1 << 11) - 1);
+}
+
+void rockchip_av1_generate_luma_grain_block(int32_t (*luma_grain_block)[73][82],
+ int32_t bitdepth,
+ uint8_t num_y_points,
+ int32_t grain_scale_shift,
+ int32_t ar_coeff_lag,
+ int32_t (*ar_coeffs_y)[24],
+ int32_t ar_coeff_shift,
+ int32_t grain_min,
+ int32_t grain_max,
+ uint16_t random_seed)
+{
+ int32_t gauss_sec_shift = 12 - bitdepth + grain_scale_shift;
+ uint16_t grain_random_register = random_seed;
+ int32_t i, j;
+
+ for (i = 0; i < 73; i++) {
+ for (j = 0; j < 82; j++) {
+ if (num_y_points > 0) {
+ rockchip_av1_update_random_register
+ (&grain_random_register);
+ (*luma_grain_block)[i][j] =
+ round_power_of_two(gaussian_sequence
+ [rockchip_av1_get_random_number
+ (grain_random_register)],
+ gauss_sec_shift);
+ } else {
+ (*luma_grain_block)[i][j] = 0;
+ }
+ }
+ }
+
+ for (i = 3; i < 73; i++)
+ for (j = 3; j < 82 - 3; j++) {
+ int32_t pos = 0;
+ int32_t wsum = 0;
+ int32_t deltaRow, deltaCol;
+
+ for (deltaRow = -ar_coeff_lag; deltaRow <= 0;
+ deltaRow++) {
+ for (deltaCol = -ar_coeff_lag;
+ deltaCol <= ar_coeff_lag; deltaCol++) {
+ if (deltaRow == 0 && deltaCol == 0)
+ break;
+ wsum = wsum + (*ar_coeffs_y)[pos] *
+ (*luma_grain_block)[i + deltaRow][j + deltaCol];
+ ++pos;
+ }
+ }
+ (*luma_grain_block)[i][j] =
+ clamp((*luma_grain_block)[i][j] +
+ round_power_of_two(wsum, ar_coeff_shift),
+ grain_min, grain_max);
+ }
+}
+
+// Calculate chroma grain noise once per frame
+void rockchip_av1_generate_chroma_grain_block(int32_t (*luma_grain_block)[73][82],
+ int32_t (*cb_grain_block)[38][44],
+ int32_t (*cr_grain_block)[38][44],
+ int32_t bitdepth,
+ uint8_t num_y_points,
+ uint8_t num_cb_points,
+ uint8_t num_cr_points,
+ int32_t grain_scale_shift,
+ int32_t ar_coeff_lag,
+ int32_t (*ar_coeffs_cb)[25],
+ int32_t (*ar_coeffs_cr)[25],
+ int32_t ar_coeff_shift,
+ int32_t grain_min,
+ int32_t grain_max,
+ uint8_t chroma_scaling_from_luma,
+ uint16_t random_seed)
+{
+ int32_t gauss_sec_shift = 12 - bitdepth + grain_scale_shift;
+ uint16_t grain_random_register = 0;
+ int32_t i, j;
+
+ rockchip_av1_init_random_generator(7, random_seed,
+ &grain_random_register);
+ for (i = 0; i < 38; i++) {
+ for (j = 0; j < 44; j++) {
+ if (num_cb_points || chroma_scaling_from_luma) {
+ rockchip_av1_update_random_register
+ (&grain_random_register);
+ (*cb_grain_block)[i][j] =
+ round_power_of_two(gaussian_sequence
+ [rockchip_av1_get_random_number
+ (grain_random_register)],
+ gauss_sec_shift);
+ } else {
+ (*cb_grain_block)[i][j] = 0;
+ }
+ }
+ }
+
+ rockchip_av1_init_random_generator(11, random_seed,
+ &grain_random_register);
+ for (i = 0; i < 38; i++) {
+ for (j = 0; j < 44; j++) {
+ if (num_cr_points || chroma_scaling_from_luma) {
+ rockchip_av1_update_random_register
+ (&grain_random_register);
+ (*cr_grain_block)[i][j] =
+ round_power_of_two(gaussian_sequence
+ [rockchip_av1_get_random_number
+ (grain_random_register)],
+ gauss_sec_shift);
+ } else {
+ (*cr_grain_block)[i][j] = 0;
+ }
+ }
+ }
+
+ for (i = 3; i < 38; i++) {
+ for (j = 3; j < 44 - 3; j++) {
+ int32_t wsum_cb = 0;
+ int32_t wsum_cr = 0;
+ int32_t pos = 0;
+ int32_t deltaRow, deltaCol;
+
+ for (deltaRow = -ar_coeff_lag; deltaRow <= 0;
+ deltaRow++) {
+ for (deltaCol = -ar_coeff_lag;
+ deltaCol <= ar_coeff_lag; deltaCol++) {
+ if (deltaRow == 0 && deltaCol == 0)
+ break;
+ wsum_cb = wsum_cb + (*ar_coeffs_cb)[pos] *
+ (*cb_grain_block)[i + deltaRow][j + deltaCol];
+ wsum_cr =
+ wsum_cr +
+ (*ar_coeffs_cr)[pos] *
+ (*cr_grain_block)[i + deltaRow][j + deltaCol];
+ ++pos;
+ }
+ }
+
+ if (num_y_points > 0) {
+ int32_t av_luma = 0;
+ int32_t luma_coord_y = (i << 1) - 3;
+ int32_t luma_coord_x = (j << 1) - 3;
+
+ av_luma +=
+ (*luma_grain_block)[luma_coord_y][luma_coord_x];
+ av_luma +=
+ (*luma_grain_block)[luma_coord_y][luma_coord_x + 1];
+ av_luma +=
+ (*luma_grain_block)[luma_coord_y + 1][luma_coord_x];
+ av_luma +=
+ (*luma_grain_block)[(luma_coord_y + 1)][luma_coord_x + 1];
+ av_luma = round_power_of_two(av_luma, 2);
+
+ wsum_cb = wsum_cb + (*ar_coeffs_cb)[pos] * av_luma;
+ wsum_cr = wsum_cr + (*ar_coeffs_cr)[pos] * av_luma;
+ }
+
+ if (num_cb_points || chroma_scaling_from_luma) {
+ (*cb_grain_block)[i][j] =
+ clamp((*cb_grain_block)[i][j] +
+ round_power_of_two(wsum_cb, ar_coeff_shift),
+ grain_min, grain_max);
+ }
+ if (num_cr_points || chroma_scaling_from_luma) {
+ (*cr_grain_block)[i][j] =
+ clamp((*cr_grain_block)[i][j] +
+ round_power_of_two(wsum_cr, ar_coeff_shift),
+ grain_min, grain_max);
+ }
+ }
+ }
+}
diff --git a/drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h b/drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
new file mode 100644
index 000000000000..dbef112699b8
--- /dev/null
+++ b/drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _ROCKCHIP_AV1_FILMGRAIN_H_
+#define _ROCKCHIP_AV1_FILMGRAIN_H_
+
+#include <linux/types.h>
+
+void rockchip_av1_generate_luma_grain_block(int32_t (*luma_grain_block)[73][82],
+ int32_t bitdepth,
+ uint8_t num_y_points,
+ int32_t grain_scale_shift,
+ int32_t ar_coeff_lag,
+ int32_t (*ar_coeffs_y)[24],
+ int32_t ar_coeff_shift,
+ int32_t grain_min,
+ int32_t grain_max,
+ uint16_t random_seed);
+
+void rockchip_av1_generate_chroma_grain_block(int32_t (*luma_grain_block)[73][82],
+ int32_t (*cb_grain_block)[38][44],
+ int32_t (*cr_grain_block)[38][44],
+ int32_t bitdepth,
+ uint8_t num_y_points,
+ uint8_t num_cb_points,
+ uint8_t num_cr_points,
+ int32_t grain_scale_shift,
+ int32_t ar_coeff_lag,
+ int32_t (*ar_coeffs_cb)[25],
+ int32_t (*ar_coeffs_cr)[25],
+ int32_t ar_coeff_shift,
+ int32_t grain_min,
+ int32_t grain_max,
+ uint8_t chroma_scaling_from_luma,
+ uint16_t random_seed);
+
+#endif
diff --git a/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c b/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
index a183e4f35e00..0534f2ca9c2f 100644
--- a/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
+++ b/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
@@ -341,6 +341,12 @@ void rockchip_vpu981_av1_dec_exit(struct hantro_ctx *ctx)
av1_dec->tile_info.dma);
av1_dec->tile_info.cpu = NULL;
+ if (av1_dec->film_grain.cpu)
+ dma_free_coherent(vpu->dev, av1_dec->film_grain.size,
+ av1_dec->film_grain.cpu,
+ av1_dec->film_grain.dma);
+ av1_dec->film_grain.cpu = NULL;
+
if (av1_dec->prob_tbl.cpu)
dma_free_coherent(vpu->dev, av1_dec->prob_tbl.size,
av1_dec->prob_tbl.cpu, av1_dec->prob_tbl.dma);
@@ -381,6 +387,14 @@ int rockchip_vpu981_av1_dec_init(struct hantro_ctx *ctx)
return -ENOMEM;
av1_dec->tile_info.size = AV1_MAX_TILES;
+ av1_dec->film_grain.cpu = dma_alloc_coherent(vpu->dev,
+ ALIGN(sizeof(struct rockchip_av1_film_grain), 2048),
+ &av1_dec->film_grain.dma,
+ GFP_KERNEL);
+ if (!av1_dec->film_grain.cpu)
+ return -ENOMEM;
+ av1_dec->film_grain.size = ALIGN(sizeof(struct rockchip_av1_film_grain), 2048);
+
av1_dec->prob_tbl.cpu = dma_alloc_coherent(vpu->dev,
ALIGN(sizeof(struct av1cdfs), 2048),
&av1_dec->prob_tbl.dma,
@@ -1178,6 +1192,204 @@ static void rockchip_vpu981_av1_dec_set_prob(struct hantro_ctx *ctx)
hantro_write_addr(vpu, AV1_PROP_TABLE, av1_dec->prob_tbl.dma);
}
+static void
+rockchip_vpu981_av1_dec_init_scaling_function(const uint8_t *values,
+ const uint8_t *scaling,
+ uint8_t num_points,
+ uint8_t *scaling_lut)
+{
+ int i, point;
+
+ if (num_points == 0) {
+ memset(scaling_lut, 0, 256);
+ return;
+ }
+
+ for (point = 0; point < num_points - 1; point++) {
+ int x;
+ int32_t delta_y = scaling[point + 1] - scaling[point];
+ int32_t delta_x = values[point + 1] - values[point];
+ int64_t delta =
+ delta_x ? delta_y * ((65536 + (delta_x >> 1)) /
+ delta_x) : 0;
+
+ for (x = 0; x < delta_x; x++) {
+ scaling_lut[values[point] + x] =
+ scaling[point] +
+ (int32_t) ((x * delta + 32768) >> 16);
+ }
+ }
+
+ for (i = values[num_points - 1]; i < 256; i++)
+ scaling_lut[i] = scaling[num_points - 1];
+}
+
+static void rockchip_vpu981_av1_dec_set_fgs(struct hantro_ctx *ctx)
+{
+ struct hantro_av1_dec_hw_ctx *av1_dec = &ctx->av1_dec;
+ struct hantro_av1_dec_ctrls *ctrls = &av1_dec->ctrls;
+ const struct v4l2_ctrl_av1_film_grain *film_grain = ctrls->film_grain;
+ struct rockchip_av1_film_grain *fgmem = av1_dec->film_grain.cpu;
+ struct hantro_dev *vpu = ctx->dev;
+ int32_t (*ar_coeffs_y)[24];
+ int32_t (*ar_coeffs_cb)[25];
+ int32_t (*ar_coeffs_cr)[25];
+ int32_t (*luma_grain_block)[73][82];
+ int32_t (*cb_grain_block)[38][44];
+ int32_t (*cr_grain_block)[38][44];
+ int32_t ar_coeff_lag, ar_coeff_shift;
+ int32_t grain_scale_shift, bitdepth;
+ int32_t grain_center, grain_min, grain_max;
+ int i, j;
+
+ hantro_reg_write(vpu, &av1_apply_grain, 0);
+
+ if (!(film_grain->flags & V4L2_AV1_FILM_GRAIN_FLAG_APPLY_GRAIN)) {
+ hantro_reg_write(vpu, &av1_num_y_points_b, 0);
+ hantro_reg_write(vpu, &av1_num_cb_points_b, 0);
+ hantro_reg_write(vpu, &av1_num_cr_points_b, 0);
+ hantro_reg_write(vpu, &av1_scaling_shift, 0);
+ hantro_reg_write(vpu, &av1_cb_mult, 0);
+ hantro_reg_write(vpu, &av1_cb_luma_mult, 0);
+ hantro_reg_write(vpu, &av1_cb_offset, 0);
+ hantro_reg_write(vpu, &av1_cr_mult, 0);
+ hantro_reg_write(vpu, &av1_cr_luma_mult, 0);
+ hantro_reg_write(vpu, &av1_cr_offset, 0);
+ hantro_reg_write(vpu, &av1_overlap_flag, 0);
+ hantro_reg_write(vpu, &av1_clip_to_restricted_range, 0);
+ hantro_reg_write(vpu, &av1_chroma_scaling_from_luma, 0);
+ hantro_reg_write(vpu, &av1_random_seed, 0);
+ hantro_write_addr(vpu, AV1_FILM_GRAIN, 0);
+ return;
+ }
+
+ ar_coeffs_y = kzalloc(sizeof(int32_t) * 24, GFP_KERNEL);
+ ar_coeffs_cb = kzalloc(sizeof(int32_t) * 25, GFP_KERNEL);
+ ar_coeffs_cr = kzalloc(sizeof(int32_t) * 25, GFP_KERNEL);
+ luma_grain_block = kzalloc(sizeof(int32_t) * 73 * 82, GFP_KERNEL);
+ cb_grain_block = kzalloc(sizeof(int32_t) * 38 * 44, GFP_KERNEL);
+ cr_grain_block = kzalloc(sizeof(int32_t) * 38 * 44, GFP_KERNEL);
+
+ if (!ar_coeffs_y || !ar_coeffs_cb || !ar_coeffs_cr
+ || !luma_grain_block || !cb_grain_block || !cr_grain_block) {
+ pr_warn("Fail allocating memory for film grain parameters\n");
+ goto alloc_fail;
+ }
+
+ hantro_reg_write(vpu, &av1_apply_grain, 1);
+
+ hantro_reg_write(vpu, &av1_num_y_points_b,
+ film_grain->num_y_points > 0);
+ hantro_reg_write(vpu, &av1_num_cb_points_b,
+ film_grain->num_cb_points > 0);
+ hantro_reg_write(vpu, &av1_num_cr_points_b,
+ film_grain->num_cr_points > 0);
+ hantro_reg_write(vpu, &av1_scaling_shift,
+ film_grain->grain_scaling_minus_8 + 8);
+
+ if (!(film_grain->flags & V4L2_AV1_FILM_GRAIN_FLAG_CHROMA_SCALING_FROM_LUMA)) {
+ hantro_reg_write(vpu, &av1_cb_mult, film_grain->cb_mult - 128);
+ hantro_reg_write(vpu, &av1_cb_luma_mult, film_grain->cb_luma_mult - 128);
+ hantro_reg_write(vpu, &av1_cb_offset, film_grain->cb_offset - 256);
+ hantro_reg_write(vpu, &av1_cr_mult, film_grain->cr_mult - 128);
+ hantro_reg_write(vpu, &av1_cr_luma_mult, film_grain->cr_luma_mult - 128);
+ hantro_reg_write(vpu, &av1_cr_offset, film_grain->cr_offset - 256);
+ } else {
+ hantro_reg_write(vpu, &av1_cb_mult, 0);
+ hantro_reg_write(vpu, &av1_cb_luma_mult, 0);
+ hantro_reg_write(vpu, &av1_cb_offset, 0);
+ hantro_reg_write(vpu, &av1_cr_mult, 0);
+ hantro_reg_write(vpu, &av1_cr_luma_mult, 0);
+ hantro_reg_write(vpu, &av1_cr_offset, 0);
+ }
+
+ hantro_reg_write(vpu, &av1_overlap_flag,
+ !!(film_grain->flags & V4L2_AV1_FILM_GRAIN_FLAG_OVERLAP));
+ hantro_reg_write(vpu, &av1_clip_to_restricted_range,
+ !!(film_grain->flags & V4L2_AV1_FILM_GRAIN_FLAG_CLIP_TO_RESTRICTED_RANGE));
+ hantro_reg_write(vpu, &av1_chroma_scaling_from_luma,
+ !!(film_grain->flags & V4L2_AV1_FILM_GRAIN_FLAG_CHROMA_SCALING_FROM_LUMA));
+ hantro_reg_write(vpu, &av1_random_seed, film_grain->grain_seed);
+
+ rockchip_vpu981_av1_dec_init_scaling_function(film_grain->point_y_value,
+ film_grain->point_y_scaling,
+ film_grain->num_y_points,
+ fgmem->scaling_lut_y);
+
+ if (film_grain->flags &
+ V4L2_AV1_FILM_GRAIN_FLAG_CHROMA_SCALING_FROM_LUMA) {
+ memcpy(fgmem->scaling_lut_cb, fgmem->scaling_lut_y,
+ sizeof(*fgmem->scaling_lut_y) * 256);
+ memcpy(fgmem->scaling_lut_cr, fgmem->scaling_lut_y,
+ sizeof(*fgmem->scaling_lut_y) * 256);
+ } else {
+ rockchip_vpu981_av1_dec_init_scaling_function
+ (film_grain->point_cb_value, film_grain->point_cb_scaling,
+ film_grain->num_cb_points, fgmem->scaling_lut_cb);
+ rockchip_vpu981_av1_dec_init_scaling_function
+ (film_grain->point_cr_value, film_grain->point_cr_scaling,
+ film_grain->num_cr_points, fgmem->scaling_lut_cr);
+ }
+
+ for (i = 0; i < V4L2_AV1_MAX_NUM_POS_LUMA; i++) {
+ if (i < 24)
+ (*ar_coeffs_y)[i] = film_grain->ar_coeffs_y_plus_128[i] - 128;
+ (*ar_coeffs_cb)[i] = film_grain->ar_coeffs_cb_plus_128[i] - 128;
+ (*ar_coeffs_cr)[i] = film_grain->ar_coeffs_cr_plus_128[i] - 128;
+ }
+
+ ar_coeff_lag = film_grain->ar_coeff_lag;
+ ar_coeff_shift = film_grain->ar_coeff_shift_minus_6 + 6;
+ grain_scale_shift = film_grain->grain_scale_shift;
+ bitdepth = ctx->bit_depth;
+ grain_center = 128 << (bitdepth - 8);
+ grain_min = 0 - grain_center;
+ grain_max = (256 << (bitdepth - 8)) - 1 - grain_center;
+
+ rockchip_av1_generate_luma_grain_block(luma_grain_block, bitdepth,
+ film_grain->num_y_points, grain_scale_shift,
+ ar_coeff_lag, ar_coeffs_y, ar_coeff_shift,
+ grain_min, grain_max,
+ film_grain->grain_seed);
+
+ rockchip_av1_generate_chroma_grain_block(luma_grain_block, cb_grain_block,
+ cr_grain_block, bitdepth,
+ film_grain->num_y_points,
+ film_grain->num_cb_points,
+ film_grain->num_cr_points,
+ grain_scale_shift, ar_coeff_lag, ar_coeffs_cb,
+ ar_coeffs_cr, ar_coeff_shift, grain_min,
+ grain_max,
+ !!(film_grain->flags
+ & V4L2_AV1_FILM_GRAIN_FLAG_CHROMA_SCALING_FROM_LUMA),
+ film_grain->grain_seed);
+
+ for (i = 0; i < 64; i++) {
+ for (j = 0; j < 64; j++)
+ fgmem->cropped_luma_grain_block[i * 64 + j] =
+ (*luma_grain_block)[i + 9][j + 9];
+ }
+
+ for (i = 0; i < 32; i++) {
+ for (j = 0; j < 32; j++) {
+ fgmem->cropped_chroma_grain_block[i * 64 + 2 * j] =
+ (*cb_grain_block)[i + 6][j + 6];
+ fgmem->cropped_chroma_grain_block[i * 64 + 2 * j + 1] =
+ (*cr_grain_block)[i + 6][j + 6];
+ }
+ }
+
+ hantro_write_addr(vpu, AV1_FILM_GRAIN, av1_dec->film_grain.dma);
+
+alloc_fail:
+ kfree(ar_coeffs_y);
+ kfree(ar_coeffs_cb);
+ kfree(ar_coeffs_cr);
+ kfree(luma_grain_block);
+ kfree(cb_grain_block);
+ kfree(cr_grain_block);
+}
+
static void rockchip_vpu981_av1_dec_set_cdef(struct hantro_ctx *ctx)
{
struct hantro_av1_dec_hw_ctx *av1_dec = &ctx->av1_dec;
@@ -1960,6 +2172,7 @@ int rockchip_vpu981_av1_dec_run(struct hantro_ctx *ctx)
rockchip_vpu981_av1_dec_set_picture_dimensions(ctx);
rockchip_vpu981_av1_dec_set_cdef(ctx);
rockchip_vpu981_av1_dec_set_lr(ctx);
+ rockchip_vpu981_av1_dec_set_fgs(ctx);
rockchip_vpu981_av1_dec_set_prob(ctx);
hantro_reg_write(vpu, &av1_dec_mode, AV1_DEC_MODE);
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v1 9/9] media: verisilicon: Enable AV1 decoder on rk3588
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
` (5 preceding siblings ...)
2022-12-19 15:56 ` [PATCH v1 8/9] media: verisilicon: Add film grain feature to AV1 driver Benjamin Gaignard
@ 2022-12-19 15:56 ` Benjamin Gaignard
2022-12-19 20:22 ` [PATCH v1 0/9] AV1 stateless decoder for RK3588 Nicolas Dufresne
` (2 subsequent siblings)
9 siblings, 0 replies; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-19 15:56 UTC (permalink / raw)
To: ezequiel, p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt,
heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel, Benjamin Gaignard
Add rk3588 AV1 decoder to Hantro variant.
The hardware support image from 64x64 up to 7680x4320
by steps of 16 pixels.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
.../media/platform/verisilicon/hantro_drv.c | 1 +
.../media/platform/verisilicon/hantro_hw.h | 6 +
.../platform/verisilicon/rockchip_vpu_hw.c | 116 ++++++++++++++++++
3 files changed, 123 insertions(+)
diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
index e10fc59634dd..b4c711a25ef7 100644
--- a/drivers/media/platform/verisilicon/hantro_drv.c
+++ b/drivers/media/platform/verisilicon/hantro_drv.c
@@ -682,6 +682,7 @@ static const struct of_device_id of_hantro_match[] = {
{ .compatible = "rockchip,rk3399-vpu", .data = &rk3399_vpu_variant, },
{ .compatible = "rockchip,rk3568-vepu", .data = &rk3568_vepu_variant, },
{ .compatible = "rockchip,rk3568-vpu", .data = &rk3568_vpu_variant, },
+ { .compatible = "rockchip,rk3588-av1-vpu", .data = &rk3588_vpu981_variant, },
#endif
#ifdef CONFIG_VIDEO_HANTRO_IMX8M
{ .compatible = "nxp,imx8mm-vpu-g1", .data = &imx8mm_vpu_g1_variant, },
diff --git a/drivers/media/platform/verisilicon/hantro_hw.h b/drivers/media/platform/verisilicon/hantro_hw.h
index ac23fc3be079..96781f65421e 100644
--- a/drivers/media/platform/verisilicon/hantro_hw.h
+++ b/drivers/media/platform/verisilicon/hantro_hw.h
@@ -403,11 +403,13 @@ extern const struct hantro_variant rk3328_vpu_variant;
extern const struct hantro_variant rk3399_vpu_variant;
extern const struct hantro_variant rk3568_vepu_variant;
extern const struct hantro_variant rk3568_vpu_variant;
+extern const struct hantro_variant rk3588_vpu981_variant;
extern const struct hantro_variant sama5d4_vdec_variant;
extern const struct hantro_variant sunxi_vpu_variant;
extern const struct hantro_postproc_ops hantro_g1_postproc_ops;
extern const struct hantro_postproc_ops hantro_g2_postproc_ops;
+extern const struct hantro_postproc_ops rockchip_vpu981_postproc_ops;
extern const u32 hantro_vp8_dec_mc_filter[8][6];
@@ -444,6 +446,10 @@ void hantro_hevc_ref_init(struct hantro_ctx *ctx);
dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, s32 poc);
int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr);
+int rockchip_vpu981_av1_dec_init(struct hantro_ctx *ctx);
+void rockchip_vpu981_av1_dec_exit(struct hantro_ctx *ctx);
+int rockchip_vpu981_av1_dec_run(struct hantro_ctx *ctx);
+void rockchip_vpu981_av1_dec_done(struct hantro_ctx *ctx);
static inline unsigned short hantro_vp9_num_sbs(unsigned short dimension)
{
diff --git a/drivers/media/platform/verisilicon/rockchip_vpu_hw.c b/drivers/media/platform/verisilicon/rockchip_vpu_hw.c
index 8de6fd2e8eef..b600a74d1caa 100644
--- a/drivers/media/platform/verisilicon/rockchip_vpu_hw.c
+++ b/drivers/media/platform/verisilicon/rockchip_vpu_hw.c
@@ -13,9 +13,11 @@
#include "hantro_g1_regs.h"
#include "hantro_h1_regs.h"
#include "rockchip_vpu2_regs.h"
+#include "rockchip_vpu981_regs.h"
#define RK3066_ACLK_MAX_FREQ (300 * 1000 * 1000)
#define RK3288_ACLK_MAX_FREQ (400 * 1000 * 1000)
+#define RK3588_ACLK_MAX_FREQ (300 * 1000 * 1000)
/*
* Supported formats.
@@ -74,6 +76,35 @@ static const struct hantro_fmt rockchip_vpu1_postproc_fmts[] = {
},
};
+static const struct hantro_fmt rockchip_vpu981_postproc_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12,
+ .codec_mode = HANTRO_MODE_NONE,
+ .postprocessed = true,
+ .frmsize = {
+ .min_width = 64,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = 64,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_P010,
+ .codec_mode = HANTRO_MODE_NONE,
+ .postprocessed = true,
+ .frmsize = {
+ .min_width = 64,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = 64,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+};
+
static const struct hantro_fmt rk3066_vpu_dec_fmts[] = {
{
.fourcc = V4L2_PIX_FMT_NV12,
@@ -277,6 +308,34 @@ static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
},
};
+static const struct hantro_fmt rk3588_vpu981_dec_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12_4L4,
+ .codec_mode = HANTRO_MODE_NONE,
+ .frmsize = {
+ .min_width = 64,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = 64,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_AV1_FRAME,
+ .codec_mode = HANTRO_MODE_AV1_DEC,
+ .max_depth = 2,
+ .frmsize = {
+ .min_width = 64,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = 64,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+};
+
static irqreturn_t rockchip_vpu1_vepu_irq(int irq, void *dev_id)
{
struct hantro_dev *vpu = dev_id;
@@ -331,6 +390,24 @@ static irqreturn_t rockchip_vpu2_vepu_irq(int irq, void *dev_id)
return IRQ_HANDLED;
}
+static irqreturn_t rk3588_vpu981_irq(int irq, void *dev_id)
+{
+ struct hantro_dev *vpu = dev_id;
+ enum vb2_buffer_state state;
+ u32 status;
+
+ status = vdpu_read(vpu, AV1_REG_INTERRUPT);
+ state = (status & AV1_REG_INTERRUPT_DEC_RDY_INT) ?
+ VB2_BUF_STATE_DONE : VB2_BUF_STATE_ERROR;
+
+ vdpu_write(vpu, 0, AV1_REG_INTERRUPT);
+ vdpu_write(vpu, AV1_REG_CONFIG_DEC_CLK_GATE_E, AV1_REG_CONFIG);
+
+ hantro_irq_done(vpu, state);
+
+ return IRQ_HANDLED;
+}
+
static int rk3036_vpu_hw_init(struct hantro_dev *vpu)
{
/* Bump ACLK to max. possible freq. to improve performance. */
@@ -346,6 +423,13 @@ static int rk3066_vpu_hw_init(struct hantro_dev *vpu)
return 0;
}
+static int rk3588_vpu981_hw_init(struct hantro_dev *vpu)
+{
+ /* Bump ACLKs to max. possible freq. to improve performance. */
+ clk_set_rate(vpu->clocks[0].clk, RK3588_ACLK_MAX_FREQ);
+ return 0;
+}
+
static int rockchip_vpu_hw_init(struct hantro_dev *vpu)
{
/* Bump ACLK to max. possible freq. to improve performance. */
@@ -498,6 +582,14 @@ static const struct hantro_codec_ops rk3568_vepu_codec_ops[] = {
},
};
+static const struct hantro_codec_ops rk3588_vpu981_codec_ops[] = {
+ [HANTRO_MODE_AV1_DEC] = {
+ .run = rockchip_vpu981_av1_dec_run,
+ .init = rockchip_vpu981_av1_dec_init,
+ .exit = rockchip_vpu981_av1_dec_exit,
+ .done = rockchip_vpu981_av1_dec_done,
+ },
+};
/*
* VPU variant.
*/
@@ -529,10 +621,18 @@ static const char * const rk3066_vpu_clk_names[] = {
"aclk_vepu", "hclk_vepu"
};
+static const struct hantro_irq rk3588_vpu981_irqs[] = {
+ { "vdpu", rk3588_vpu981_irq },
+};
+
static const char * const rockchip_vpu_clk_names[] = {
"aclk", "hclk"
};
+static const char * const rk3588_vpu981_vpu_clk_names[] = {
+ "aclk", "hclk", "aclk_vdpu_root", "hclk_vdpu_root"
+};
+
/* VDPU1/VEPU1 */
const struct hantro_variant rk3036_vpu_variant = {
@@ -678,3 +778,19 @@ const struct hantro_variant px30_vpu_variant = {
.clk_names = rockchip_vpu_clk_names,
.num_clocks = ARRAY_SIZE(rockchip_vpu_clk_names)
};
+
+const struct hantro_variant rk3588_vpu981_variant = {
+ .dec_offset = 0x0,
+ .dec_fmts = rk3588_vpu981_dec_fmts,
+ .num_dec_fmts = ARRAY_SIZE(rk3588_vpu981_dec_fmts),
+ .postproc_fmts = rockchip_vpu981_postproc_fmts,
+ .num_postproc_fmts = ARRAY_SIZE(rockchip_vpu981_postproc_fmts),
+ .postproc_ops = &rockchip_vpu981_postproc_ops,
+ .codec = HANTRO_AV1_DECODER,
+ .codec_ops = rk3588_vpu981_codec_ops,
+ .irqs = rk3588_vpu981_irqs,
+ .num_irqs = ARRAY_SIZE(rk3588_vpu981_irqs),
+ .init = rk3588_vpu981_hw_init,
+ .clk_names = rk3588_vpu981_vpu_clk_names,
+ .num_clocks = ARRAY_SIZE(rk3588_vpu981_vpu_clk_names)
+};
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
2022-12-19 15:56 ` [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible Benjamin Gaignard
@ 2022-12-19 16:06 ` Krzysztof Kozlowski
2022-12-19 16:44 ` Benjamin Gaignard
2022-12-20 9:55 ` Krzysztof Kozlowski
1 sibling, 1 reply; 36+ messages in thread
From: Krzysztof Kozlowski @ 2022-12-19 16:06 UTC (permalink / raw)
To: Benjamin Gaignard, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
On 19/12/2022 16:56, Benjamin Gaignard wrote:
> Add compatible for rk3588 AV1 vpu decoder.
>
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> Documentation/devicetree/bindings/media/rockchip-vpu.yaml | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
> index 6cc4d3e5a61d..8454df53f5cb 100644
> --- a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
> +++ b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
> @@ -24,6 +24,7 @@ properties:
> - rockchip,rk3399-vpu
> - rockchip,px30-vpu
> - rockchip,rk3568-vpu
> + - rockchip,rk3588-av1-vpu
Why "av1" suffix? Is there another type (different device, different
programming model) expected on rk3588?
Best regards,
Krzysztof
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
2022-12-19 16:06 ` Krzysztof Kozlowski
@ 2022-12-19 16:44 ` Benjamin Gaignard
0 siblings, 0 replies; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-19 16:44 UTC (permalink / raw)
To: Krzysztof Kozlowski, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le 19/12/2022 à 17:06, Krzysztof Kozlowski a écrit :
> On 19/12/2022 16:56, Benjamin Gaignard wrote:
>> Add compatible for rk3588 AV1 vpu decoder.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> Documentation/devicetree/bindings/media/rockchip-vpu.yaml | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
>> index 6cc4d3e5a61d..8454df53f5cb 100644
>> --- a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
>> +++ b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
>> @@ -24,6 +24,7 @@ properties:
>> - rockchip,rk3399-vpu
>> - rockchip,px30-vpu
>> - rockchip,rk3568-vpu
>> + - rockchip,rk3588-av1-vpu
> Why "av1" suffix? Is there another type (different device, different
> programming model) expected on rk3588?
Yes there is 4 different vpu on rk3588.
This one only does av1.
>
> Best regards,
> Krzysztof
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
` (6 preceding siblings ...)
2022-12-19 15:56 ` [PATCH v1 9/9] media: verisilicon: Enable AV1 decoder on rk3588 Benjamin Gaignard
@ 2022-12-19 20:22 ` Nicolas Dufresne
2022-12-19 21:07 ` Ezequiel Garcia
[not found] ` <20221219155616.848690-8-benjamin.gaignard@collabora.com>
9 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-19 20:22 UTC (permalink / raw)
To: Benjamin Gaignard, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
> This series implement AV1 stateless decoder for RK3588 SoC.
> The harware support 8 and 10 bits bitstreams up to 7680x4320.
> AV1 feature like film grain or scaling are done by the postprocessor.
> The driver can produce NV12_4L4 and NV12 pixel formats.
> A native 10bits NV12_4L4 format is possible but need more investigation
> to be completly documented and enabled.
>
> It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
> Sebastian's device-tree patches for RK3588.
>
> The full branch can be found here:
> https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
>
> Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
> The failing tests are:
> - 10bits bitstream because 10bits output formats aren't yet implemented.
> - the 2 tests with 2 spatial layers: few errors in luma/chroma values
> - tests with resolution < hardware limit (64x64)
Its nice to note that we tested 10bit support by forcing P010 output from the
postprocessor, with all bitstream working except for filmgrain. Hopefully we'll
get 10bit properly sorted out, but we don't think the uAPI have any issues
specifically for 10bit (Mediatek driver does not support 10bit or filmgrain).
>
> Benjamin
>
> Benjamin Gaignard (9):
> dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
> media: verisilicon: Add AV1 decoder mode and controls
> media: verisilicon: Save bit depth for AV1 decoder
> media: verisilicon: Check AV1 bitstreams bit depth
> media: verisilicon: Compute motion vectors size for AV1 frames
> media: verisilicon: Add AV1 entropy helpers
> media: verisilicon: Add Rockchip AV1 decoder
> media: verisilicon: Add film grain feature to AV1 driver
> media: verisilicon: Enable AV1 decoder on rk3588
>
> .../bindings/media/rockchip-vpu.yaml | 1 +
> drivers/media/platform/verisilicon/Makefile | 3 +
> drivers/media/platform/verisilicon/hantro.h | 5 +
> .../media/platform/verisilicon/hantro_drv.c | 54 +
> .../media/platform/verisilicon/hantro_hw.h | 102 +
> .../platform/verisilicon/hantro_postproc.c | 3 +
> .../media/platform/verisilicon/hantro_v4l2.c | 5 +
> .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
> .../verisilicon/rockchip_av1_entropymode.h | 272 +
> .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
> .../verisilicon/rockchip_av1_filmgrain.h | 36 +
> .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
> .../verisilicon/rockchip_vpu981_regs.h | 477 ++
> .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
> 14 files changed, 8291 insertions(+)
> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 2/9] media: verisilicon: Add AV1 decoder mode and controls
[not found] ` <202212200257.l7xJlHCi-lkp@intel.com>
@ 2022-12-19 20:26 ` Nicolas Dufresne
0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-19 20:26 UTC (permalink / raw)
To: kernel test robot, Benjamin Gaignard, ezequiel, p.zabel, mchehab,
robh+dt, krzysztof.kozlowski+dt, heiko, daniel.almeida,
nicolas.dufresne
Cc: oe-kbuild-all, linux-media, linux-rockchip, devicetree,
linux-arm-kernel, linux-kernel, kernel
Le mardi 20 décembre 2022 à 02:25 +0800, kernel test robot a écrit :
> Hi Benjamin,
>
> I love your patch! Yet something to improve:
This is expected as kernel next does not pull the uAPI v3 for this CODEC. The
patchset is RFC I believe, we will have to send a non-rfc one before this test
can pass.
>
> [auto build test ERROR on media-tree/master]
> [also build test ERROR on rockchip/for-next linus/master v6.1 next-20221219]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Benjamin-Gaignard/AV1-stateless-decoder-for-RK3588/20221220-000013
> base: git://linuxtv.org/media_tree.git master
> patch link: https://lore.kernel.org/r/20221219155616.848690-3-benjamin.gaignard%40collabora.com
> patch subject: [PATCH v1 2/9] media: verisilicon: Add AV1 decoder mode and controls
> config: m68k-allmodconfig
> compiler: m68k-linux-gcc (GCC) 12.1.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/intel-lab-lkp/linux/commit/d60040964a6a110a1d3a9af3794c27e25a24182d
> git remote add linux-review https://github.com/intel-lab-lkp/linux
> git fetch --no-tags linux-review Benjamin-Gaignard/AV1-stateless-decoder-for-RK3588/20221220-000013
> git checkout d60040964a6a110a1d3a9af3794c27e25a24182d
> # save the config file
> mkdir build_dir && cp config build_dir/.config
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k olddefconfig
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash drivers/media/
>
> If you fix the issue, kindly add following tag where applicable
> > Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
> > > drivers/media/platform/verisilicon/hantro_drv.c:504:31: error: 'V4L2_CID_STATELESS_AV1_FRAME' undeclared here (not in a function); did you mean 'V4L2_CID_STATELESS_VP9_FRAME'?
> 504 | .id = V4L2_CID_STATELESS_AV1_FRAME,
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> | V4L2_CID_STATELESS_VP9_FRAME
> > > drivers/media/platform/verisilicon/hantro_drv.c:509:31: error: 'V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY' undeclared here (not in a function)
> 509 | .id = V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY,
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > drivers/media/platform/verisilicon/hantro_drv.c:510:35: error: 'V4L2_AV1_MAX_TILE_COUNT' undeclared here (not in a function)
> 510 | .dims = { V4L2_AV1_MAX_TILE_COUNT },
> | ^~~~~~~~~~~~~~~~~~~~~~~
> > > drivers/media/platform/verisilicon/hantro_drv.c:515:31: error: 'V4L2_CID_STATELESS_AV1_SEQUENCE' undeclared here (not in a function); did you mean 'V4L2_CID_STATELESS_MPEG2_SEQUENCE'?
> 515 | .id = V4L2_CID_STATELESS_AV1_SEQUENCE,
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> | V4L2_CID_STATELESS_MPEG2_SEQUENCE
> > > drivers/media/platform/verisilicon/hantro_drv.c:520:31: error: 'V4L2_CID_STATELESS_AV1_FILM_GRAIN' undeclared here (not in a function); did you mean 'V4L2_CID_STATELESS_VP9_FRAME'?
> 520 | .id = V4L2_CID_STATELESS_AV1_FILM_GRAIN,
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> | V4L2_CID_STATELESS_VP9_FRAME
>
>
> vim +504 drivers/media/platform/verisilicon/hantro_drv.c
>
> 338
> 339 #define HANTRO_JPEG_ACTIVE_MARKERS (V4L2_JPEG_ACTIVE_MARKER_APP0 | \
> 340 V4L2_JPEG_ACTIVE_MARKER_COM | \
> 341 V4L2_JPEG_ACTIVE_MARKER_DQT | \
> 342 V4L2_JPEG_ACTIVE_MARKER_DHT)
> 343
> 344 static const struct hantro_ctrl controls[] = {
> 345 {
> 346 .codec = HANTRO_JPEG_ENCODER,
> 347 .cfg = {
> 348 .id = V4L2_CID_JPEG_COMPRESSION_QUALITY,
> 349 .min = 5,
> 350 .max = 100,
> 351 .step = 1,
> 352 .def = 50,
> 353 .ops = &hantro_jpeg_ctrl_ops,
> 354 },
> 355 }, {
> 356 .codec = HANTRO_JPEG_ENCODER,
> 357 .cfg = {
> 358 .id = V4L2_CID_JPEG_ACTIVE_MARKER,
> 359 .max = HANTRO_JPEG_ACTIVE_MARKERS,
> 360 .def = HANTRO_JPEG_ACTIVE_MARKERS,
> 361 /*
> 362 * Changing the set of active markers/segments also
> 363 * messes up the alignment of the JPEG header, which
> 364 * is needed to allow the hardware to write directly
> 365 * to the output buffer. Implementing this introduces
> 366 * a lot of complexity for little gain, as the markers
> 367 * enabled is already the minimum required set.
> 368 */
> 369 .flags = V4L2_CTRL_FLAG_READ_ONLY,
> 370 },
> 371 }, {
> 372 .codec = HANTRO_MPEG2_DECODER,
> 373 .cfg = {
> 374 .id = V4L2_CID_STATELESS_MPEG2_SEQUENCE,
> 375 },
> 376 }, {
> 377 .codec = HANTRO_MPEG2_DECODER,
> 378 .cfg = {
> 379 .id = V4L2_CID_STATELESS_MPEG2_PICTURE,
> 380 },
> 381 }, {
> 382 .codec = HANTRO_MPEG2_DECODER,
> 383 .cfg = {
> 384 .id = V4L2_CID_STATELESS_MPEG2_QUANTISATION,
> 385 },
> 386 }, {
> 387 .codec = HANTRO_VP8_DECODER,
> 388 .cfg = {
> 389 .id = V4L2_CID_STATELESS_VP8_FRAME,
> 390 },
> 391 }, {
> 392 .codec = HANTRO_H264_DECODER,
> 393 .cfg = {
> 394 .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> 395 },
> 396 }, {
> 397 .codec = HANTRO_H264_DECODER,
> 398 .cfg = {
> 399 .id = V4L2_CID_STATELESS_H264_SPS,
> 400 .ops = &hantro_ctrl_ops,
> 401 },
> 402 }, {
> 403 .codec = HANTRO_H264_DECODER,
> 404 .cfg = {
> 405 .id = V4L2_CID_STATELESS_H264_PPS,
> 406 },
> 407 }, {
> 408 .codec = HANTRO_H264_DECODER,
> 409 .cfg = {
> 410 .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> 411 },
> 412 }, {
> 413 .codec = HANTRO_H264_DECODER,
> 414 .cfg = {
> 415 .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> 416 .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> 417 .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> 418 .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> 419 },
> 420 }, {
> 421 .codec = HANTRO_H264_DECODER,
> 422 .cfg = {
> 423 .id = V4L2_CID_STATELESS_H264_START_CODE,
> 424 .min = V4L2_STATELESS_H264_START_CODE_ANNEX_B,
> 425 .def = V4L2_STATELESS_H264_START_CODE_ANNEX_B,
> 426 .max = V4L2_STATELESS_H264_START_CODE_ANNEX_B,
> 427 },
> 428 }, {
> 429 .codec = HANTRO_H264_DECODER,
> 430 .cfg = {
> 431 .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> 432 .min = V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE,
> 433 .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> 434 .menu_skip_mask =
> 435 BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> 436 .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> 437 }
> 438 }, {
> 439 .codec = HANTRO_HEVC_DECODER,
> 440 .cfg = {
> 441 .id = V4L2_CID_STATELESS_HEVC_DECODE_MODE,
> 442 .min = V4L2_STATELESS_HEVC_DECODE_MODE_FRAME_BASED,
> 443 .max = V4L2_STATELESS_HEVC_DECODE_MODE_FRAME_BASED,
> 444 .def = V4L2_STATELESS_HEVC_DECODE_MODE_FRAME_BASED,
> 445 },
> 446 }, {
> 447 .codec = HANTRO_HEVC_DECODER,
> 448 .cfg = {
> 449 .id = V4L2_CID_STATELESS_HEVC_START_CODE,
> 450 .min = V4L2_STATELESS_HEVC_START_CODE_ANNEX_B,
> 451 .max = V4L2_STATELESS_HEVC_START_CODE_ANNEX_B,
> 452 .def = V4L2_STATELESS_HEVC_START_CODE_ANNEX_B,
> 453 },
> 454 }, {
> 455 .codec = HANTRO_HEVC_DECODER,
> 456 .cfg = {
> 457 .id = V4L2_CID_MPEG_VIDEO_HEVC_PROFILE,
> 458 .min = V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN,
> 459 .max = V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN_10,
> 460 .def = V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN,
> 461 },
> 462 }, {
> 463 .codec = HANTRO_HEVC_DECODER,
> 464 .cfg = {
> 465 .id = V4L2_CID_MPEG_VIDEO_HEVC_LEVEL,
> 466 .min = V4L2_MPEG_VIDEO_HEVC_LEVEL_1,
> 467 .max = V4L2_MPEG_VIDEO_HEVC_LEVEL_5_1,
> 468 },
> 469 }, {
> 470 .codec = HANTRO_HEVC_DECODER,
> 471 .cfg = {
> 472 .id = V4L2_CID_STATELESS_HEVC_SPS,
> 473 .ops = &hantro_ctrl_ops,
> 474 },
> 475 }, {
> 476 .codec = HANTRO_HEVC_DECODER,
> 477 .cfg = {
> 478 .id = V4L2_CID_STATELESS_HEVC_PPS,
> 479 },
> 480 }, {
> 481 .codec = HANTRO_HEVC_DECODER,
> 482 .cfg = {
> 483 .id = V4L2_CID_STATELESS_HEVC_DECODE_PARAMS,
> 484 },
> 485 }, {
> 486 .codec = HANTRO_HEVC_DECODER,
> 487 .cfg = {
> 488 .id = V4L2_CID_STATELESS_HEVC_SCALING_MATRIX,
> 489 },
> 490 }, {
> 491 .codec = HANTRO_VP9_DECODER,
> 492 .cfg = {
> 493 .id = V4L2_CID_STATELESS_VP9_FRAME,
> 494 .ops = &hantro_vp9_ctrl_ops,
> 495 },
> 496 }, {
> 497 .codec = HANTRO_VP9_DECODER,
> 498 .cfg = {
> 499 .id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR,
> 500 },
> 501 }, {
> 502 .codec = HANTRO_AV1_DECODER,
> 503 .cfg = {
> > 504 .id = V4L2_CID_STATELESS_AV1_FRAME,
> 505 },
> 506 }, {
> 507 .codec = HANTRO_AV1_DECODER,
> 508 .cfg = {
> > 509 .id = V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY,
> > 510 .dims = { V4L2_AV1_MAX_TILE_COUNT },
> 511 },
> 512 }, {
> 513 .codec = HANTRO_AV1_DECODER,
> 514 .cfg = {
> > 515 .id = V4L2_CID_STATELESS_AV1_SEQUENCE,
> 516 },
> 517 }, {
> 518 .codec = HANTRO_AV1_DECODER,
> 519 .cfg = {
> > 520 .id = V4L2_CID_STATELESS_AV1_FILM_GRAIN,
> 521 },
> 522 },
> 523 };
> 524
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 2/9] media: verisilicon: Add AV1 decoder mode and controls
2022-12-19 15:56 ` [PATCH v1 2/9] media: verisilicon: Add AV1 decoder mode and controls Benjamin Gaignard
[not found] ` <202212200257.l7xJlHCi-lkp@intel.com>
@ 2022-12-19 20:28 ` Nicolas Dufresne
1 sibling, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-19 20:28 UTC (permalink / raw)
To: Benjamin Gaignard, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
> Add AV1 decoder as new decoder mode to Hantro driver.
> Register needed AV1 controls for the decoder.
>
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> ---
> drivers/media/platform/verisilicon/hantro.h | 3 +++
> .../media/platform/verisilicon/hantro_drv.c | 21 +++++++++++++++++++
> 2 files changed, 24 insertions(+)
>
> diff --git a/drivers/media/platform/verisilicon/hantro.h b/drivers/media/platform/verisilicon/hantro.h
> index 2989ebc631cc..61480825b856 100644
> --- a/drivers/media/platform/verisilicon/hantro.h
> +++ b/drivers/media/platform/verisilicon/hantro.h
> @@ -38,6 +38,7 @@ struct hantro_postproc_ops;
> #define HANTRO_H264_DECODER BIT(18)
> #define HANTRO_HEVC_DECODER BIT(19)
> #define HANTRO_VP9_DECODER BIT(20)
> +#define HANTRO_AV1_DECODER BIT(21)
> #define HANTRO_DECODERS 0xffff0000
>
> /**
> @@ -111,6 +112,7 @@ struct hantro_variant {
> * @HANTRO_MODE_VP8_DEC: VP8 decoder.
> * @HANTRO_MODE_HEVC_DEC: HEVC decoder.
> * @HANTRO_MODE_VP9_DEC: VP9 decoder.
> + * @HANTRO_MODE_AV1_DEC: AV1 decoder
> */
> enum hantro_codec_mode {
> HANTRO_MODE_NONE = -1,
> @@ -120,6 +122,7 @@ enum hantro_codec_mode {
> HANTRO_MODE_VP8_DEC,
> HANTRO_MODE_HEVC_DEC,
> HANTRO_MODE_VP9_DEC,
> + HANTRO_MODE_AV1_DEC,
> };
>
> /*
> diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
> index 8cb4a68c9119..4500e1fc0f2c 100644
> --- a/drivers/media/platform/verisilicon/hantro_drv.c
> +++ b/drivers/media/platform/verisilicon/hantro_drv.c
> @@ -498,6 +498,27 @@ static const struct hantro_ctrl controls[] = {
> .cfg = {
> .id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR,
> },
> + }, {
> + .codec = HANTRO_AV1_DECODER,
> + .cfg = {
> + .id = V4L2_CID_STATELESS_AV1_FRAME,
> + },
> + }, {
> + .codec = HANTRO_AV1_DECODER,
> + .cfg = {
> + .id = V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY,
> + .dims = { V4L2_AV1_MAX_TILE_COUNT },
> + },
> + }, {
> + .codec = HANTRO_AV1_DECODER,
> + .cfg = {
> + .id = V4L2_CID_STATELESS_AV1_SEQUENCE,
> + },
> + }, {
> + .codec = HANTRO_AV1_DECODER,
> + .cfg = {
> + .id = V4L2_CID_STATELESS_AV1_FILM_GRAIN,
> + },
> },
> };
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 3/9] media: verisilicon: Save bit depth for AV1 decoder
2022-12-19 15:56 ` [PATCH v1 3/9] media: verisilicon: Save bit depth for AV1 decoder Benjamin Gaignard
@ 2022-12-19 20:37 ` Nicolas Dufresne
2022-12-19 21:29 ` Ezequiel Garcia
0 siblings, 1 reply; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-19 20:37 UTC (permalink / raw)
To: Benjamin Gaignard, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
> Store bit depth information from AV1 sequence control.
>
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> .../media/platform/verisilicon/hantro_drv.c | 26 +++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
> index 4500e1fc0f2c..8e93710dcfed 100644
> --- a/drivers/media/platform/verisilicon/hantro_drv.c
> +++ b/drivers/media/platform/verisilicon/hantro_drv.c
> @@ -324,6 +324,25 @@ static int hantro_vp9_s_ctrl(struct v4l2_ctrl *ctrl)
> return 0;
> }
>
> +static int hantro_av1_s_ctrl(struct v4l2_ctrl *ctrl)
> +{
> + struct hantro_ctx *ctx;
> +
> + ctx = container_of(ctrl->handler,
> + struct hantro_ctx, ctrl_handler);
> +
> + switch (ctrl->id) {
> + case V4L2_CID_STATELESS_AV1_SEQUENCE:
> + ctx->bit_depth = ctrl->p_new.p_av1_sequence->bit_depth;
That seems a little be weak, what happens if you change the bit_depth with a
non-request s_ctrl while its decoding ? To be this deserve a little bit of
protection, a something that validate and copy it at the start of the decoding.
p.s. I know, VP9 seems similar, though arguably that was copied from jpeg, for
which it seems totally save to change the quality at run-time.
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +
> static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
> .try_ctrl = hantro_try_ctrl,
> };
> @@ -336,6 +355,12 @@ static const struct v4l2_ctrl_ops hantro_vp9_ctrl_ops = {
> .s_ctrl = hantro_vp9_s_ctrl,
> };
>
> +static const struct v4l2_ctrl_ops hantro_av1_ctrl_ops = {
> + .try_ctrl = hantro_try_ctrl,
> + .s_ctrl = hantro_av1_s_ctrl,
> +};
> +
> +
> #define HANTRO_JPEG_ACTIVE_MARKERS (V4L2_JPEG_ACTIVE_MARKER_APP0 | \
> V4L2_JPEG_ACTIVE_MARKER_COM | \
> V4L2_JPEG_ACTIVE_MARKER_DQT | \
> @@ -513,6 +538,7 @@ static const struct hantro_ctrl controls[] = {
> .codec = HANTRO_AV1_DECODER,
> .cfg = {
> .id = V4L2_CID_STATELESS_AV1_SEQUENCE,
> + .ops = &hantro_av1_ctrl_ops,
> },
> }, {
> .codec = HANTRO_AV1_DECODER,
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 4/9] media: verisilicon: Check AV1 bitstreams bit depth
2022-12-19 15:56 ` [PATCH v1 4/9] media: verisilicon: Check AV1 bitstreams bit depth Benjamin Gaignard
@ 2022-12-19 20:38 ` Nicolas Dufresne
2022-12-20 13:02 ` Benjamin Gaignard
0 siblings, 1 reply; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-19 20:38 UTC (permalink / raw)
To: Benjamin Gaignard, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
> The driver supports 8 and 10 bits bitstreams, make sure to discard
> other cases.
>
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> drivers/media/platform/verisilicon/hantro_drv.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
> index 8e93710dcfed..e10fc59634dd 100644
> --- a/drivers/media/platform/verisilicon/hantro_drv.c
> +++ b/drivers/media/platform/verisilicon/hantro_drv.c
> @@ -282,7 +282,13 @@ static int hantro_try_ctrl(struct v4l2_ctrl *ctrl)
> /* We only support profile 0 */
> if (dec_params->profile != 0)
> return -EINVAL;
> + } else if (ctrl->id == V4L2_CID_STATELESS_AV1_SEQUENCE) {
> + const struct v4l2_ctrl_av1_sequence *sequence = ctrl->p_new.p_av1_sequence;
> +
> + if (sequence->bit_depth != 8 && sequence->bit_depth != 10)
> + return -EINVAL;
As you state in the cover letter, should this just be this for now ?
> + if (sequence->bit_depth != 8)
> + return -EINVAL;
> }
> +
> return 0;
> }
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 5/9] media: verisilicon: Compute motion vectors size for AV1 frames
2022-12-19 15:56 ` [PATCH v1 5/9] media: verisilicon: Compute motion vectors size for AV1 frames Benjamin Gaignard
@ 2022-12-19 20:42 ` Nicolas Dufresne
2022-12-20 13:13 ` Benjamin Gaignard
0 siblings, 1 reply; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-19 20:42 UTC (permalink / raw)
To: Benjamin Gaignard, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
> Compute the additional required to store motion vectors at
requires *space*, requires *buffer* ? I think this is missing a word.
> the end of the frames buffers.
>
>
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> drivers/media/platform/verisilicon/hantro_hw.h | 13 +++++++++++++
> .../media/platform/verisilicon/hantro_postproc.c | 3 +++
> drivers/media/platform/verisilicon/hantro_v4l2.c | 5 +++++
> 3 files changed, 21 insertions(+)
>
> diff --git a/drivers/media/platform/verisilicon/hantro_hw.h b/drivers/media/platform/verisilicon/hantro_hw.h
> index e83f0c523a30..8b3bc7e31395 100644
> --- a/drivers/media/platform/verisilicon/hantro_hw.h
> +++ b/drivers/media/platform/verisilicon/hantro_hw.h
> @@ -417,6 +417,19 @@ hantro_hevc_mv_size(unsigned int width, unsigned int height)
> return width * height / 16;
> }
>
> +static inline unsigned short hantro_av1_num_sbs(unsigned short dimension)
> +{
> + return DIV_ROUND_UP(dimension, 64) + 1;
Why plus one ? I've tested locally with the logical DIV_ROUND_UP(dimension, 64),
and didn't see any difference. It then match hantro_vp_num_sbs(), so can't this
be shared ?
> +}
> +
> +static inline size_t
> +hantro_av1_mv_size(unsigned int width, unsigned int height)
> +{
> + size_t num_sbs = hantro_av1_num_sbs(width) * hantro_av1_num_sbs(height);
> +
> + return ALIGN(num_sbs * 384, 16) + 512;
Shall the magic numbers be turned into defines ?
> +}
> +
> int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
> int rockchip_vpu2_mpeg2_dec_run(struct hantro_ctx *ctx);
> void hantro_mpeg2_dec_copy_qtable(u8 *qtable,
> diff --git a/drivers/media/platform/verisilicon/hantro_postproc.c b/drivers/media/platform/verisilicon/hantro_postproc.c
> index 09d8cf942689..7dc39519a2ee 100644
> --- a/drivers/media/platform/verisilicon/hantro_postproc.c
> +++ b/drivers/media/platform/verisilicon/hantro_postproc.c
> @@ -213,6 +213,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
> else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE)
> buf_size += hantro_hevc_mv_size(pix_mp.width,
> pix_mp.height);
> + else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_AV1_FRAME)
> + buf_size += hantro_av1_mv_size(pix_mp.width,
> + pix_mp.height);
nit: Time to turn into a switch or use an ops ?
>
> for (i = 0; i < num_buffers; ++i) {
> struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
> diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
> index 2c7a805289e7..d41dcb108a6d 100644
> --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
> +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
> @@ -334,6 +334,11 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx,
> pix_mp->plane_fmt[0].sizeimage +=
> hantro_hevc_mv_size(pix_mp->width,
> pix_mp->height);
> + else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_AV1_FRAME &&
> + !hantro_needs_postproc(ctx, fmt))
> + pix_mp->plane_fmt[0].sizeimage +=
> + hantro_av1_mv_size(pix_mp->width,
> + pix_mp->height);
> } else if (!pix_mp->plane_fmt[0].sizeimage) {
> /*
> * For coded formats the application can specify
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
` (7 preceding siblings ...)
2022-12-19 20:22 ` [PATCH v1 0/9] AV1 stateless decoder for RK3588 Nicolas Dufresne
@ 2022-12-19 21:07 ` Ezequiel Garcia
2022-12-19 21:54 ` Michael Grzeschik
2022-12-21 16:33 ` Nicolas Dufresne
[not found] ` <20221219155616.848690-8-benjamin.gaignard@collabora.com>
9 siblings, 2 replies; 36+ messages in thread
From: Ezequiel Garcia @ 2022-12-19 21:07 UTC (permalink / raw)
To: Benjamin Gaignard
Cc: p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt, heiko,
daniel.almeida, nicolas.dufresne, linux-media, linux-rockchip,
devicetree, linux-arm-kernel, linux-kernel, kernel
Hi Benjamin,
On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
<benjamin.gaignard@collabora.com> wrote:
>
> This series implement AV1 stateless decoder for RK3588 SoC.
> The harware support 8 and 10 bits bitstreams up to 7680x4320.
> AV1 feature like film grain or scaling are done by the postprocessor.
> The driver can produce NV12_4L4 and NV12 pixel formats.
> A native 10bits NV12_4L4 format is possible but need more investigation
> to be completly documented and enabled.
>
> It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
> Sebastian's device-tree patches for RK3588.
>
I thought the AV1 decoder in RK3588 was really a separate hardware
from the Hantro G1/G2.
Shouldn't this need a new driver for this new hardware?
Thanks!
Ezequiel
> The full branch can be found here:
> https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
>
> Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
> The failing tests are:
> - 10bits bitstream because 10bits output formats aren't yet implemented.
> - the 2 tests with 2 spatial layers: few errors in luma/chroma values
> - tests with resolution < hardware limit (64x64)
>
> Benjamin
>
> Benjamin Gaignard (9):
> dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
> media: verisilicon: Add AV1 decoder mode and controls
> media: verisilicon: Save bit depth for AV1 decoder
> media: verisilicon: Check AV1 bitstreams bit depth
> media: verisilicon: Compute motion vectors size for AV1 frames
> media: verisilicon: Add AV1 entropy helpers
> media: verisilicon: Add Rockchip AV1 decoder
> media: verisilicon: Add film grain feature to AV1 driver
> media: verisilicon: Enable AV1 decoder on rk3588
>
> .../bindings/media/rockchip-vpu.yaml | 1 +
> drivers/media/platform/verisilicon/Makefile | 3 +
> drivers/media/platform/verisilicon/hantro.h | 5 +
> .../media/platform/verisilicon/hantro_drv.c | 54 +
> .../media/platform/verisilicon/hantro_hw.h | 102 +
> .../platform/verisilicon/hantro_postproc.c | 3 +
> .../media/platform/verisilicon/hantro_v4l2.c | 5 +
> .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
> .../verisilicon/rockchip_av1_entropymode.h | 272 +
> .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
> .../verisilicon/rockchip_av1_filmgrain.h | 36 +
> .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
> .../verisilicon/rockchip_vpu981_regs.h | 477 ++
> .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
> 14 files changed, 8291 insertions(+)
> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
>
> --
> 2.34.1
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 3/9] media: verisilicon: Save bit depth for AV1 decoder
2022-12-19 20:37 ` Nicolas Dufresne
@ 2022-12-19 21:29 ` Ezequiel Garcia
2022-12-20 13:05 ` Benjamin Gaignard
0 siblings, 1 reply; 36+ messages in thread
From: Ezequiel Garcia @ 2022-12-19 21:29 UTC (permalink / raw)
To: Nicolas Dufresne
Cc: Benjamin Gaignard, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne,
linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Bonjour Nicolas,
On Mon, Dec 19, 2022 at 5:37 PM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
> > Store bit depth information from AV1 sequence control.
> >
> > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > ---
> > .../media/platform/verisilicon/hantro_drv.c | 26 +++++++++++++++++++
> > 1 file changed, 26 insertions(+)
> >
> > diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
> > index 4500e1fc0f2c..8e93710dcfed 100644
> > --- a/drivers/media/platform/verisilicon/hantro_drv.c
> > +++ b/drivers/media/platform/verisilicon/hantro_drv.c
> > @@ -324,6 +324,25 @@ static int hantro_vp9_s_ctrl(struct v4l2_ctrl *ctrl)
> > return 0;
> > }
> >
> > +static int hantro_av1_s_ctrl(struct v4l2_ctrl *ctrl)
> > +{
> > + struct hantro_ctx *ctx;
> > +
> > + ctx = container_of(ctrl->handler,
> > + struct hantro_ctx, ctrl_handler);
> > +
> > + switch (ctrl->id) {
> > + case V4L2_CID_STATELESS_AV1_SEQUENCE:
> > + ctx->bit_depth = ctrl->p_new.p_av1_sequence->bit_depth;
>
> That seems a little be weak, what happens if you change the bit_depth with a
> non-request s_ctrl while its decoding ? To be this deserve a little bit of
> protection, a something that validate and copy it at the start of the decoding.
>
Oh, nice catch. We need to return EBUSY, see
https://www.kernel.org/doc/html/v5.0/media/uapi/v4l/buffer.html#interactions-between-formats-controls-and-buffers.
There's already an API in the V4L2 control framework for drivers to use,
see v4l2_ctrl_grab in
https://www.kernel.org/doc/html/v5.0/media/kapi/v4l2-controls.html#active-and-grabbed-controls.
> p.s. I know, VP9 seems similar, though arguably that was copied from jpeg, for
> which it seems totally save to change the quality at run-time.
>
No, wasn't copied from JPEG :-) I just didn't realize this was an
issue, but it is
given the bit_depth affects the buffers so you are correct, it needs
to be fixed for VP9 too.
Thanks!
Ezequiel
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +
> > static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
> > .try_ctrl = hantro_try_ctrl,
> > };
> > @@ -336,6 +355,12 @@ static const struct v4l2_ctrl_ops hantro_vp9_ctrl_ops = {
> > .s_ctrl = hantro_vp9_s_ctrl,
> > };
> >
> > +static const struct v4l2_ctrl_ops hantro_av1_ctrl_ops = {
> > + .try_ctrl = hantro_try_ctrl,
> > + .s_ctrl = hantro_av1_s_ctrl,
> > +};
> > +
> > +
> > #define HANTRO_JPEG_ACTIVE_MARKERS (V4L2_JPEG_ACTIVE_MARKER_APP0 | \
> > V4L2_JPEG_ACTIVE_MARKER_COM | \
> > V4L2_JPEG_ACTIVE_MARKER_DQT | \
> > @@ -513,6 +538,7 @@ static const struct hantro_ctrl controls[] = {
> > .codec = HANTRO_AV1_DECODER,
> > .cfg = {
> > .id = V4L2_CID_STATELESS_AV1_SEQUENCE,
> > + .ops = &hantro_av1_ctrl_ops,
> > },
> > }, {
> > .codec = HANTRO_AV1_DECODER,
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-19 21:07 ` Ezequiel Garcia
@ 2022-12-19 21:54 ` Michael Grzeschik
2022-12-20 1:52 ` Ezequiel Garcia
2022-12-20 17:00 ` Nicolas Dufresne
2022-12-21 16:33 ` Nicolas Dufresne
1 sibling, 2 replies; 36+ messages in thread
From: Michael Grzeschik @ 2022-12-19 21:54 UTC (permalink / raw)
To: Ezequiel Garcia
Cc: Benjamin Gaignard, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne,
linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
[-- Attachment #1.1: Type: text/plain, Size: 5248 bytes --]
Hi Benjamin,
Hi Ezequiel,
On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
>On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
><benjamin.gaignard@collabora.com> wrote:
>>
>> This series implement AV1 stateless decoder for RK3588 SoC.
>> The harware support 8 and 10 bits bitstreams up to 7680x4320.
>> AV1 feature like film grain or scaling are done by the postprocessor.
>> The driver can produce NV12_4L4 and NV12 pixel formats.
>> A native 10bits NV12_4L4 format is possible but need more investigation
>> to be completly documented and enabled.
>>
>> It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
>> Sebastian's device-tree patches for RK3588.
>>
>
>I thought the AV1 decoder in RK3588 was really a separate hardware
>from the Hantro G1/G2.
>
>Shouldn't this need a new driver for this new hardware?
Just jumping into this discussion as I am currently working on the rkvenc driver.
In my case I am extending the rkvdec driver to become more generic for
other rockchip specific enc/decoders.
My first change looks like this:
---
drivers/staging/media/rkvdec/Makefile | 4 +-
drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
---
While working on other parts of the encoder I found many places in the
rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
functions but where limited to the decoder case.
I think there are two options for the av1 codec.
1) If the vpu981 is a driver that has nothing to do with verisilicon but
works with this driver framework, then we should integrate vepu981 into it
but consider rename the verisilicon unrelated parts to something generic.
2) Move the vepu981 av1 driver into the rkvdec instead.
If 1) is the way to go, we can even think of moving the staging code parts from
rkvdec to the verisilicon code. Likewise to the vepu981-av1.
I could also keep on integrating the rkvenc on that base instead.
Regards,
Michael
>> The full branch can be found here:
>> https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
>>
>> Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
>> The failing tests are:
>> - 10bits bitstream because 10bits output formats aren't yet implemented.
>> - the 2 tests with 2 spatial layers: few errors in luma/chroma values
>> - tests with resolution < hardware limit (64x64)
>>
>> Benjamin
>>
>> Benjamin Gaignard (9):
>> dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
>> media: verisilicon: Add AV1 decoder mode and controls
>> media: verisilicon: Save bit depth for AV1 decoder
>> media: verisilicon: Check AV1 bitstreams bit depth
>> media: verisilicon: Compute motion vectors size for AV1 frames
>> media: verisilicon: Add AV1 entropy helpers
>> media: verisilicon: Add Rockchip AV1 decoder
>> media: verisilicon: Add film grain feature to AV1 driver
>> media: verisilicon: Enable AV1 decoder on rk3588
>>
>> .../bindings/media/rockchip-vpu.yaml | 1 +
>> drivers/media/platform/verisilicon/Makefile | 3 +
>> drivers/media/platform/verisilicon/hantro.h | 5 +
>> .../media/platform/verisilicon/hantro_drv.c | 54 +
>> .../media/platform/verisilicon/hantro_hw.h | 102 +
>> .../platform/verisilicon/hantro_postproc.c | 3 +
>> .../media/platform/verisilicon/hantro_v4l2.c | 5 +
>> .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
>> .../verisilicon/rockchip_av1_entropymode.h | 272 +
>> .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
>> .../verisilicon/rockchip_av1_filmgrain.h | 36 +
>> .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
>> .../verisilicon/rockchip_vpu981_regs.h | 477 ++
>> .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
>> 14 files changed, 8291 insertions(+)
>> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
>> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
>> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
>> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
>> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
>> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
>>
>> --
>> 2.34.1
>>
>
>_______________________________________________
>linux-arm-kernel mailing list
>linux-arm-kernel@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
[-- Attachment #2: Type: text/plain, Size: 176 bytes --]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-19 21:54 ` Michael Grzeschik
@ 2022-12-20 1:52 ` Ezequiel Garcia
2022-12-20 12:26 ` Benjamin Gaignard
2022-12-20 13:40 ` Michael Grzeschik
2022-12-20 17:00 ` Nicolas Dufresne
1 sibling, 2 replies; 36+ messages in thread
From: Ezequiel Garcia @ 2022-12-20 1:52 UTC (permalink / raw)
To: Michael Grzeschik
Cc: Benjamin Gaignard, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne,
linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Hi Michael,
On Mon, Dec 19, 2022 at 6:54 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
>
>
> Hi Benjamin,
> Hi Ezequiel,
>
> On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
> >On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
> ><benjamin.gaignard@collabora.com> wrote:
> >>
> >> This series implement AV1 stateless decoder for RK3588 SoC.
> >> The harware support 8 and 10 bits bitstreams up to 7680x4320.
> >> AV1 feature like film grain or scaling are done by the postprocessor.
> >> The driver can produce NV12_4L4 and NV12 pixel formats.
> >> A native 10bits NV12_4L4 format is possible but need more investigation
> >> to be completly documented and enabled.
> >>
> >> It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
> >> Sebastian's device-tree patches for RK3588.
> >>
> >
> >I thought the AV1 decoder in RK3588 was really a separate hardware
> >from the Hantro G1/G2.
> >
> >Shouldn't this need a new driver for this new hardware?
>
> Just jumping into this discussion as I am currently working on the rkvenc driver.
>
The more the merrier, there's always room for developers :-)
> In my case I am extending the rkvdec driver to become more generic for
> other rockchip specific enc/decoders.
>
> My first change looks like this:
> ---
> drivers/staging/media/rkvdec/Makefile | 4 +-
> drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
> drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
> drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
> drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
> ---
>
> While working on other parts of the encoder I found many places in the
> rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
> functions but where limited to the decoder case.
>
Because stateless decoders devices are very similar in their general behavior,
their drivers could be very similar.
Hantro and Rkvdec could look similar because the same humans worked on them.
Most boilerplate code, as well as V4L2 format negotiation, VB2 buffer handling
could be shared among all stateless decoder drivers. I think even at one point
we experimented with having a shared/common code base for all stateless codecs.
In other words, it's entirely possible to support Hantro devices in
the Cedrus driver
and vice-versa, you would only have to write the hardware-specific bits.
However, there is consensus to have a separate driver for each
different hardware,
even when the hardware is a bit similar. This may lead to some code duplication,
but it's less fragile / more flexible. Maintaining drivers this way allows
developers to evolve, testing on a small family of devices, without
breaking support
for other devices.
This is important as sometimes it's hard to get the hardware,
but we still don't want to break the support!
> I think there are two options for the av1 codec.
>
> 1) If the vpu981 is a driver that has nothing to do with verisilicon but
> works with this driver framework, then we should integrate vepu981 into it
> but consider rename the verisilicon unrelated parts to something generic.
>
> 2) Move the vepu981 av1 driver into the rkvdec instead.
>
> If 1) is the way to go, we can even think of moving the staging code parts from
> rkvdec to the verisilicon code. Likewise to the vepu981-av1.
>
The Hantro driver should only support G1, G2, and VC8000D;
which can be said to belong to the same family.
The RKVDEC driver supports Rockchip vdpu34x core. I have to admit
I'm not exactly sure if we support anything else than vdpu34x.
I'm not familiar with the AV1 support provided by this patch,
but looking at the mpp code:
...
"rk3588",
ROCKCHIP_SOC_RK3588,
HAVE_VDPU2 | HAVE_VDPU2_PP | HAVE_VEPU2 | HAVE_RKVDEC | HAVE_RKVENC |
HAVE_JPEG_DEC | HAVE_AV1DEC | HAVE_AVSDEC | HAVE_VEPU2_JPEG,
{ &vdpu38x, &rkjpegd, &vdpu2, &vdpu2_jpeg_pp, &av1d, &avspd},
{ &vepu58x, &vepu2, &vepu2_jpeg, NULL, },
Seems RK3588 supports a Hantro core (VDPU2), a vdpu38x core and this AV1 core,
which according to this patchset is vdpu981 (?)
If the vdpu38x device interface, configuration, buffer handling and
registers are
similar enough with vdpu34x, adding vdpu38x to the Rkvdec driver
should be straightforward.
If the vdpu38x core differs, it may be reason enough to consider a new driver.
As for vdpu981 (AV1), I'm inclined to think it deserves its own driver.
Again, I'm far less worried for a little code duplication in the
boilerplate (which can be solved
with helpers, etc.) and more worried about making sure we can evolve
drivers easily,
while minimizing regressions.
Hope it helps!
Ezequiel
> I could also keep on integrating the rkvenc on that base instead.
>
> Regards,
> Michael
>
> >> The full branch can be found here:
> >> https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
> >>
> >> Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
> >> The failing tests are:
> >> - 10bits bitstream because 10bits output formats aren't yet implemented.
> >> - the 2 tests with 2 spatial layers: few errors in luma/chroma values
> >> - tests with resolution < hardware limit (64x64)
> >>
> >> Benjamin
> >>
> >> Benjamin Gaignard (9):
> >> dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
> >> media: verisilicon: Add AV1 decoder mode and controls
> >> media: verisilicon: Save bit depth for AV1 decoder
> >> media: verisilicon: Check AV1 bitstreams bit depth
> >> media: verisilicon: Compute motion vectors size for AV1 frames
> >> media: verisilicon: Add AV1 entropy helpers
> >> media: verisilicon: Add Rockchip AV1 decoder
> >> media: verisilicon: Add film grain feature to AV1 driver
> >> media: verisilicon: Enable AV1 decoder on rk3588
> >>
> >> .../bindings/media/rockchip-vpu.yaml | 1 +
> >> drivers/media/platform/verisilicon/Makefile | 3 +
> >> drivers/media/platform/verisilicon/hantro.h | 5 +
> >> .../media/platform/verisilicon/hantro_drv.c | 54 +
> >> .../media/platform/verisilicon/hantro_hw.h | 102 +
> >> .../platform/verisilicon/hantro_postproc.c | 3 +
> >> .../media/platform/verisilicon/hantro_v4l2.c | 5 +
> >> .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
> >> .../verisilicon/rockchip_av1_entropymode.h | 272 +
> >> .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
> >> .../verisilicon/rockchip_av1_filmgrain.h | 36 +
> >> .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
> >> .../verisilicon/rockchip_vpu981_regs.h | 477 ++
> >> .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
> >> 14 files changed, 8291 insertions(+)
> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
> >>
> >> --
> >> 2.34.1
> >>
> >
> >_______________________________________________
> >linux-arm-kernel mailing list
> >linux-arm-kernel@lists.infradead.org
> >http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >
>
> --
> Pengutronix e.K. | |
> Steuerwalder Str. 21 | http://www.pengutronix.de/ |
> 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
2022-12-19 15:56 ` [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible Benjamin Gaignard
2022-12-19 16:06 ` Krzysztof Kozlowski
@ 2022-12-20 9:55 ` Krzysztof Kozlowski
1 sibling, 0 replies; 36+ messages in thread
From: Krzysztof Kozlowski @ 2022-12-20 9:55 UTC (permalink / raw)
To: Benjamin Gaignard, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
On 19/12/2022 16:56, Benjamin Gaignard wrote:
> Add compatible for rk3588 AV1 vpu decoder.
>
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> Documentation/devicetree/bindings/media/rockchip-vpu.yaml | 1 +
> 1 file changed, 1 insertion(+)
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Best regards,
Krzysztof
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-20 1:52 ` Ezequiel Garcia
@ 2022-12-20 12:26 ` Benjamin Gaignard
2022-12-20 13:40 ` Michael Grzeschik
1 sibling, 0 replies; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-20 12:26 UTC (permalink / raw)
To: Ezequiel Garcia, Michael Grzeschik
Cc: p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt, heiko,
daniel.almeida, nicolas.dufresne, linux-media, linux-rockchip,
devicetree, linux-arm-kernel, linux-kernel, kernel
Le 20/12/2022 à 02:52, Ezequiel Garcia a écrit :
> Hi Michael,
>
> On Mon, Dec 19, 2022 at 6:54 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
>>
>> Hi Benjamin,
>> Hi Ezequiel,
>>
>> On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
>>> On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
>>> <benjamin.gaignard@collabora.com> wrote:
>>>> This series implement AV1 stateless decoder for RK3588 SoC.
>>>> The harware support 8 and 10 bits bitstreams up to 7680x4320.
>>>> AV1 feature like film grain or scaling are done by the postprocessor.
>>>> The driver can produce NV12_4L4 and NV12 pixel formats.
>>>> A native 10bits NV12_4L4 format is possible but need more investigation
>>>> to be completly documented and enabled.
>>>>
>>>> It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
>>>> Sebastian's device-tree patches for RK3588.
>>>>
>>> I thought the AV1 decoder in RK3588 was really a separate hardware
>> >from the Hantro G1/G2.
>>> Shouldn't this need a new driver for this new hardware?
>> Just jumping into this discussion as I am currently working on the rkvenc driver.
>>
> The more the merrier, there's always room for developers :-)
>
>> In my case I am extending the rkvdec driver to become more generic for
>> other rockchip specific enc/decoders.
>>
>> My first change looks like this:
>> ---
>> drivers/staging/media/rkvdec/Makefile | 4 +-
>> drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
>> drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
>> drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
>> drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
>> ---
>>
>> While working on other parts of the encoder I found many places in the
>> rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
>> functions but where limited to the decoder case.
>>
> Because stateless decoders devices are very similar in their general behavior,
> their drivers could be very similar.
>
> Hantro and Rkvdec could look similar because the same humans worked on them.
>
> Most boilerplate code, as well as V4L2 format negotiation, VB2 buffer handling
> could be shared among all stateless decoder drivers. I think even at one point
> we experimented with having a shared/common code base for all stateless codecs.
>
> In other words, it's entirely possible to support Hantro devices in
> the Cedrus driver
> and vice-versa, you would only have to write the hardware-specific bits.
>
> However, there is consensus to have a separate driver for each
> different hardware,
> even when the hardware is a bit similar. This may lead to some code duplication,
> but it's less fragile / more flexible. Maintaining drivers this way allows
> developers to evolve, testing on a small family of devices, without
> breaking support
> for other devices.
>
> This is important as sometimes it's hard to get the hardware,
> but we still don't want to break the support!
>
>> I think there are two options for the av1 codec.
>>
>> 1) If the vpu981 is a driver that has nothing to do with verisilicon but
>> works with this driver framework, then we should integrate vepu981 into it
>> but consider rename the verisilicon unrelated parts to something generic.
>>
>> 2) Move the vepu981 av1 driver into the rkvdec instead.
>>
>> If 1) is the way to go, we can even think of moving the staging code parts from
>> rkvdec to the verisilicon code. Likewise to the vepu981-av1.
>>
> The Hantro driver should only support G1, G2, and VC8000D;
> which can be said to belong to the same family.
Rockchip TRM names this hardware block vpu981 but it is a Verisilicon hardware block,
probably a VC9000D with a different register mapping.
>
> The RKVDEC driver supports Rockchip vdpu34x core. I have to admit
> I'm not exactly sure if we support anything else than vdpu34x.
>
> I'm not familiar with the AV1 support provided by this patch,
> but looking at the mpp code:
>
> ...
> "rk3588",
> ROCKCHIP_SOC_RK3588,
> HAVE_VDPU2 | HAVE_VDPU2_PP | HAVE_VEPU2 | HAVE_RKVDEC | HAVE_RKVENC |
> HAVE_JPEG_DEC | HAVE_AV1DEC | HAVE_AVSDEC | HAVE_VEPU2_JPEG,
> { &vdpu38x, &rkjpegd, &vdpu2, &vdpu2_jpeg_pp, &av1d, &avspd},
> { &vepu58x, &vepu2, &vepu2_jpeg, NULL, },
>
> Seems RK3588 supports a Hantro core (VDPU2), a vdpu38x core and this AV1 core,
> which according to this patchset is vdpu981 (?)
>
> If the vdpu38x device interface, configuration, buffer handling and
> registers are
> similar enough with vdpu34x, adding vdpu38x to the Rkvdec driver
> should be straightforward.
> If the vdpu38x core differs, it may be reason enough to consider a new driver.
>
> As for vdpu981 (AV1), I'm inclined to think it deserves its own driver.
>
> Again, I'm far less worried for a little code duplication in the
> boilerplate (which can be solved
> with helpers, etc.) and more worried about making sure we can evolve
> drivers easily,
> while minimizing regressions.
>
> Hope it helps!
> Ezequiel
>
>
>> I could also keep on integrating the rkvenc on that base instead.
>>
>> Regards,
>> Michael
>>
>>>> The full branch can be found here:
>>>> https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
>>>>
>>>> Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
>>>> The failing tests are:
>>>> - 10bits bitstream because 10bits output formats aren't yet implemented.
>>>> - the 2 tests with 2 spatial layers: few errors in luma/chroma values
>>>> - tests with resolution < hardware limit (64x64)
>>>>
>>>> Benjamin
>>>>
>>>> Benjamin Gaignard (9):
>>>> dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
>>>> media: verisilicon: Add AV1 decoder mode and controls
>>>> media: verisilicon: Save bit depth for AV1 decoder
>>>> media: verisilicon: Check AV1 bitstreams bit depth
>>>> media: verisilicon: Compute motion vectors size for AV1 frames
>>>> media: verisilicon: Add AV1 entropy helpers
>>>> media: verisilicon: Add Rockchip AV1 decoder
>>>> media: verisilicon: Add film grain feature to AV1 driver
>>>> media: verisilicon: Enable AV1 decoder on rk3588
>>>>
>>>> .../bindings/media/rockchip-vpu.yaml | 1 +
>>>> drivers/media/platform/verisilicon/Makefile | 3 +
>>>> drivers/media/platform/verisilicon/hantro.h | 5 +
>>>> .../media/platform/verisilicon/hantro_drv.c | 54 +
>>>> .../media/platform/verisilicon/hantro_hw.h | 102 +
>>>> .../platform/verisilicon/hantro_postproc.c | 3 +
>>>> .../media/platform/verisilicon/hantro_v4l2.c | 5 +
>>>> .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
>>>> .../verisilicon/rockchip_av1_entropymode.h | 272 +
>>>> .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
>>>> .../verisilicon/rockchip_av1_filmgrain.h | 36 +
>>>> .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
>>>> .../verisilicon/rockchip_vpu981_regs.h | 477 ++
>>>> .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
>>>> 14 files changed, 8291 insertions(+)
>>>> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
>>>> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
>>>> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
>>>> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
>>>> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
>>>> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
>>>>
>>>> --
>>>> 2.34.1
>>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
>> --
>> Pengutronix e.K. | |
>> Steuerwalder Str. 21 | http://www.pengutronix.de/ |
>> 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
>> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 4/9] media: verisilicon: Check AV1 bitstreams bit depth
2022-12-19 20:38 ` Nicolas Dufresne
@ 2022-12-20 13:02 ` Benjamin Gaignard
2022-12-21 16:16 ` Nicolas Dufresne
0 siblings, 1 reply; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-20 13:02 UTC (permalink / raw)
To: Nicolas Dufresne, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le 19/12/2022 à 21:38, Nicolas Dufresne a écrit :
> Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
>> The driver supports 8 and 10 bits bitstreams, make sure to discard
>> other cases.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> drivers/media/platform/verisilicon/hantro_drv.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
>> index 8e93710dcfed..e10fc59634dd 100644
>> --- a/drivers/media/platform/verisilicon/hantro_drv.c
>> +++ b/drivers/media/platform/verisilicon/hantro_drv.c
>> @@ -282,7 +282,13 @@ static int hantro_try_ctrl(struct v4l2_ctrl *ctrl)
>> /* We only support profile 0 */
>> if (dec_params->profile != 0)
>> return -EINVAL;
>> + } else if (ctrl->id == V4L2_CID_STATELESS_AV1_SEQUENCE) {
>> + const struct v4l2_ctrl_av1_sequence *sequence = ctrl->p_new.p_av1_sequence;
>> +
>> + if (sequence->bit_depth != 8 && sequence->bit_depth != 10)
>> + return -EINVAL;
> As you state in the cover letter, should this just be this for now ?
The driver can decode 8 or 10 bits bitstreams but will on produce 8bits (NV12_4L4 or NV12)
frames. The hardware is able to truncate 10bits bitstreams to 8 bits output.
>
>
>> + if (sequence->bit_depth != 8)
>> + return -EINVAL;
>
>> }
>> +
>> return 0;
>> }
>>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 3/9] media: verisilicon: Save bit depth for AV1 decoder
2022-12-19 21:29 ` Ezequiel Garcia
@ 2022-12-20 13:05 ` Benjamin Gaignard
0 siblings, 0 replies; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-20 13:05 UTC (permalink / raw)
To: Ezequiel Garcia, Nicolas Dufresne
Cc: p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt, heiko,
daniel.almeida, nicolas.dufresne, linux-media, linux-rockchip,
devicetree, linux-arm-kernel, linux-kernel, kernel
Le 19/12/2022 à 22:29, Ezequiel Garcia a écrit :
> Bonjour Nicolas,
>
> On Mon, Dec 19, 2022 at 5:37 PM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>> Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
>>> Store bit depth information from AV1 sequence control.
>>>
>>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>>> ---
>>> .../media/platform/verisilicon/hantro_drv.c | 26 +++++++++++++++++++
>>> 1 file changed, 26 insertions(+)
>>>
>>> diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
>>> index 4500e1fc0f2c..8e93710dcfed 100644
>>> --- a/drivers/media/platform/verisilicon/hantro_drv.c
>>> +++ b/drivers/media/platform/verisilicon/hantro_drv.c
>>> @@ -324,6 +324,25 @@ static int hantro_vp9_s_ctrl(struct v4l2_ctrl *ctrl)
>>> return 0;
>>> }
>>>
>>> +static int hantro_av1_s_ctrl(struct v4l2_ctrl *ctrl)
>>> +{
>>> + struct hantro_ctx *ctx;
>>> +
>>> + ctx = container_of(ctrl->handler,
>>> + struct hantro_ctx, ctrl_handler);
>>> +
>>> + switch (ctrl->id) {
>>> + case V4L2_CID_STATELESS_AV1_SEQUENCE:
>>> + ctx->bit_depth = ctrl->p_new.p_av1_sequence->bit_depth;
>> That seems a little be weak, what happens if you change the bit_depth with a
>> non-request s_ctrl while its decoding ? To be this deserve a little bit of
>> protection, a something that validate and copy it at the start of the decoding.
>>
> Oh, nice catch. We need to return EBUSY, see
> https://www.kernel.org/doc/html/v5.0/media/uapi/v4l/buffer.html#interactions-between-formats-controls-and-buffers.
>
> There's already an API in the V4L2 control framework for drivers to use,
> see v4l2_ctrl_grab in
> https://www.kernel.org/doc/html/v5.0/media/kapi/v4l2-controls.html#active-and-grabbed-controls.
>
>> p.s. I know, VP9 seems similar, though arguably that was copied from jpeg, for
>> which it seems totally save to change the quality at run-time.
>>
> No, wasn't copied from JPEG :-) I just didn't realize this was an
> issue, but it is
> given the bit_depth affects the buffers so you are correct, it needs
> to be fixed for VP9 too.
I will use v4l2_ctrl_grab() in codecs->ops init() and exit() functions
but it will be on patch 7 because it where they appear for this codec.
Benjamin
>
> Thanks!
> Ezequiel
>
>>> + break;
>>> + default:
>>> + return -EINVAL;
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +
>>> static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
>>> .try_ctrl = hantro_try_ctrl,
>>> };
>>> @@ -336,6 +355,12 @@ static const struct v4l2_ctrl_ops hantro_vp9_ctrl_ops = {
>>> .s_ctrl = hantro_vp9_s_ctrl,
>>> };
>>>
>>> +static const struct v4l2_ctrl_ops hantro_av1_ctrl_ops = {
>>> + .try_ctrl = hantro_try_ctrl,
>>> + .s_ctrl = hantro_av1_s_ctrl,
>>> +};
>>> +
>>> +
>>> #define HANTRO_JPEG_ACTIVE_MARKERS (V4L2_JPEG_ACTIVE_MARKER_APP0 | \
>>> V4L2_JPEG_ACTIVE_MARKER_COM | \
>>> V4L2_JPEG_ACTIVE_MARKER_DQT | \
>>> @@ -513,6 +538,7 @@ static const struct hantro_ctrl controls[] = {
>>> .codec = HANTRO_AV1_DECODER,
>>> .cfg = {
>>> .id = V4L2_CID_STATELESS_AV1_SEQUENCE,
>>> + .ops = &hantro_av1_ctrl_ops,
>>> },
>>> }, {
>>> .codec = HANTRO_AV1_DECODER,
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 5/9] media: verisilicon: Compute motion vectors size for AV1 frames
2022-12-19 20:42 ` Nicolas Dufresne
@ 2022-12-20 13:13 ` Benjamin Gaignard
0 siblings, 0 replies; 36+ messages in thread
From: Benjamin Gaignard @ 2022-12-20 13:13 UTC (permalink / raw)
To: Nicolas Dufresne, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le 19/12/2022 à 21:42, Nicolas Dufresne a écrit :
> Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
>> Compute the additional required to store motion vectors at
> requires *space*, requires *buffer* ? I think this is missing a word.
>
>> the end of the frames buffers.
>>
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> drivers/media/platform/verisilicon/hantro_hw.h | 13 +++++++++++++
>> .../media/platform/verisilicon/hantro_postproc.c | 3 +++
>> drivers/media/platform/verisilicon/hantro_v4l2.c | 5 +++++
>> 3 files changed, 21 insertions(+)
>>
>> diff --git a/drivers/media/platform/verisilicon/hantro_hw.h b/drivers/media/platform/verisilicon/hantro_hw.h
>> index e83f0c523a30..8b3bc7e31395 100644
>> --- a/drivers/media/platform/verisilicon/hantro_hw.h
>> +++ b/drivers/media/platform/verisilicon/hantro_hw.h
>> @@ -417,6 +417,19 @@ hantro_hevc_mv_size(unsigned int width, unsigned int height)
>> return width * height / 16;
>> }
>>
>> +static inline unsigned short hantro_av1_num_sbs(unsigned short dimension)
>> +{
>> + return DIV_ROUND_UP(dimension, 64) + 1;
> Why plus one ? I've tested locally with the logical DIV_ROUND_UP(dimension, 64),
> and didn't see any difference. It then match hantro_vp_num_sbs(), so can't this
> be shared ?
MPP code use plus one so I keep it like that.
>
>> +}
>> +
>> +static inline size_t
>> +hantro_av1_mv_size(unsigned int width, unsigned int height)
>> +{
>> + size_t num_sbs = hantro_av1_num_sbs(width) * hantro_av1_num_sbs(height);
>> +
>> + return ALIGN(num_sbs * 384, 16) + 512;
> Shall the magic numbers be turned into defines ?
MPP code is:
dir_mvs_size = MPP_ALIGN(num_sbs * 24 * 128 / 8, 16);
and 512 is added later by another piece of code.
I have no clue about the meaning of those values, sorry.
>
>> +}
>> +
>> int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
>> int rockchip_vpu2_mpeg2_dec_run(struct hantro_ctx *ctx);
>> void hantro_mpeg2_dec_copy_qtable(u8 *qtable,
>> diff --git a/drivers/media/platform/verisilicon/hantro_postproc.c b/drivers/media/platform/verisilicon/hantro_postproc.c
>> index 09d8cf942689..7dc39519a2ee 100644
>> --- a/drivers/media/platform/verisilicon/hantro_postproc.c
>> +++ b/drivers/media/platform/verisilicon/hantro_postproc.c
>> @@ -213,6 +213,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
>> else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE)
>> buf_size += hantro_hevc_mv_size(pix_mp.width,
>> pix_mp.height);
>> + else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_AV1_FRAME)
>> + buf_size += hantro_av1_mv_size(pix_mp.width,
>> + pix_mp.height);
> nit: Time to turn into a switch or use an ops ?
>
>>
>> for (i = 0; i < num_buffers; ++i) {
>> struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
>> diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
>> index 2c7a805289e7..d41dcb108a6d 100644
>> --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
>> +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
>> @@ -334,6 +334,11 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx,
>> pix_mp->plane_fmt[0].sizeimage +=
>> hantro_hevc_mv_size(pix_mp->width,
>> pix_mp->height);
>> + else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_AV1_FRAME &&
>> + !hantro_needs_postproc(ctx, fmt))
>> + pix_mp->plane_fmt[0].sizeimage +=
>> + hantro_av1_mv_size(pix_mp->width,
>> + pix_mp->height);
>> } else if (!pix_mp->plane_fmt[0].sizeimage) {
>> /*
>> * For coded formats the application can specify
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-20 1:52 ` Ezequiel Garcia
2022-12-20 12:26 ` Benjamin Gaignard
@ 2022-12-20 13:40 ` Michael Grzeschik
2022-12-20 17:15 ` Nicolas Dufresne
1 sibling, 1 reply; 36+ messages in thread
From: Michael Grzeschik @ 2022-12-20 13:40 UTC (permalink / raw)
To: Ezequiel Garcia
Cc: Benjamin Gaignard, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne,
linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
[-- Attachment #1.1: Type: text/plain, Size: 9204 bytes --]
Hi Ezequiel,
On Mon, Dec 19, 2022 at 10:52:02PM -0300, Ezequiel Garcia wrote:
>On Mon, Dec 19, 2022 at 6:54 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
>> On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
>> >On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
>> ><benjamin.gaignard@collabora.com> wrote:
>> >>
>> >> This series implement AV1 stateless decoder for RK3588 SoC.
>> >> The harware support 8 and 10 bits bitstreams up to 7680x4320.
>> >> AV1 feature like film grain or scaling are done by the postprocessor.
>> >> The driver can produce NV12_4L4 and NV12 pixel formats.
>> >> A native 10bits NV12_4L4 format is possible but need more investigation
>> >> to be completly documented and enabled.
>> >>
>> >> It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
>> >> Sebastian's device-tree patches for RK3588.
>> >>
>> >
>> >I thought the AV1 decoder in RK3588 was really a separate hardware
>> >from the Hantro G1/G2.
>> >
>> >Shouldn't this need a new driver for this new hardware?
>>
>> Just jumping into this discussion as I am currently working on the rkvenc driver.
>>
>
>The more the merrier, there's always room for developers :-)
>
>> In my case I am extending the rkvdec driver to become more generic for
>> other rockchip specific enc/decoders.
>>
>> My first change looks like this:
>> ---
>> drivers/staging/media/rkvdec/Makefile | 4 +-
>> drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
>> drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
>> drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
>> drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
>> ---
>>
>> While working on other parts of the encoder I found many places in the
>> rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
>> functions but where limited to the decoder case.
>>
>
>Because stateless decoders devices are very similar in their general behavior,
>their drivers could be very similar.
>
>Hantro and Rkvdec could look similar because the same humans worked on them.
>
>Most boilerplate code, as well as V4L2 format negotiation, VB2 buffer handling
>could be shared among all stateless decoder drivers. I think even at one point
>we experimented with having a shared/common code base for all stateless codecs.
>
>In other words, it's entirely possible to support Hantro devices in
>the Cedrus driver
>and vice-versa, you would only have to write the hardware-specific bits.
>
>However, there is consensus to have a separate driver for each
>different hardware,
>even when the hardware is a bit similar. This may lead to some code duplication,
>but it's less fragile / more flexible. Maintaining drivers this way allows
>developers to evolve, testing on a small family of devices, without
>breaking support
>for other devices.
>
>This is important as sometimes it's hard to get the hardware,
>but we still don't want to break the support!
>
>> I think there are two options for the av1 codec.
>>
>> 1) If the vpu981 is a driver that has nothing to do with verisilicon but
>> works with this driver framework, then we should integrate vepu981 into it
>> but consider rename the verisilicon unrelated parts to something generic.
>>
>> 2) Move the vepu981 av1 driver into the rkvdec instead.
>>
>> If 1) is the way to go, we can even think of moving the staging code parts from
>> rkvdec to the verisilicon code. Likewise to the vepu981-av1.
>>
>
>The Hantro driver should only support G1, G2, and VC8000D;
>which can be said to belong to the same family.
>
>The RKVDEC driver supports Rockchip vdpu34x core. I have to admit
>I'm not exactly sure if we support anything else than vdpu34x.
Currently the rkvdec is only supporting vdpu34x. My work would integrate
vepu54x into the rkvdec boilerplate and so it would support encode as decode.
>I'm not familiar with the AV1 support provided by this patch,
>but looking at the mpp code:
>
>...
> "rk3588",
> ROCKCHIP_SOC_RK3588,
> HAVE_VDPU2 | HAVE_VDPU2_PP | HAVE_VEPU2 | HAVE_RKVDEC | HAVE_RKVENC |
> HAVE_JPEG_DEC | HAVE_AV1DEC | HAVE_AVSDEC | HAVE_VEPU2_JPEG,
> { &vdpu38x, &rkjpegd, &vdpu2, &vdpu2_jpeg_pp, &av1d, &avspd},
> { &vepu58x, &vepu2, &vepu2_jpeg, NULL, },
>
>Seems RK3588 supports a Hantro core (VDPU2), a vdpu38x core and this AV1 core,
>which according to this patchset is vdpu981 (?)
>
>If the vdpu38x device interface, configuration, buffer handling and
>registers are
>similar enough with vdpu34x, adding vdpu38x to the Rkvdec driver
>should be straightforward.
>If the vdpu38x core differs, it may be reason enough to consider a new driver.
>
>As for vdpu981 (AV1), I'm inclined to think it deserves its own driver.
>
>Again, I'm far less worried for a little code duplication in the
>boilerplate (which can be solved
>with helpers, etc.) and more worried about making sure we can evolve
>drivers easily,
>while minimizing regressions.
Thanks for the explanation.
As I agree that not breaking current drivers is a strong argument. Also
rkvdec is still in staging, which makes it less harmful for the
integration of the encoder path.
Since we can not ensure that the rkvenc/rkvdec is not another unknown
verisilicon core, going the way of working on a common rkvpu driver is
probably the best for now.
Also, since I have already done some work into that direction, it sounds
good for me. :)
>> I could also keep on integrating the rkvenc on that base instead.
>>
>> Regards,
>> Michael
>>
>> >> The full branch can be found here:
>> >> https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
>> >>
>> >> Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
>> >> The failing tests are:
>> >> - 10bits bitstream because 10bits output formats aren't yet implemented.
>> >> - the 2 tests with 2 spatial layers: few errors in luma/chroma values
>> >> - tests with resolution < hardware limit (64x64)
>> >>
>> >> Benjamin
>> >>
>> >> Benjamin Gaignard (9):
>> >> dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
>> >> media: verisilicon: Add AV1 decoder mode and controls
>> >> media: verisilicon: Save bit depth for AV1 decoder
>> >> media: verisilicon: Check AV1 bitstreams bit depth
>> >> media: verisilicon: Compute motion vectors size for AV1 frames
>> >> media: verisilicon: Add AV1 entropy helpers
>> >> media: verisilicon: Add Rockchip AV1 decoder
>> >> media: verisilicon: Add film grain feature to AV1 driver
>> >> media: verisilicon: Enable AV1 decoder on rk3588
>> >>
>> >> .../bindings/media/rockchip-vpu.yaml | 1 +
>> >> drivers/media/platform/verisilicon/Makefile | 3 +
>> >> drivers/media/platform/verisilicon/hantro.h | 5 +
>> >> .../media/platform/verisilicon/hantro_drv.c | 54 +
>> >> .../media/platform/verisilicon/hantro_hw.h | 102 +
>> >> .../platform/verisilicon/hantro_postproc.c | 3 +
>> >> .../media/platform/verisilicon/hantro_v4l2.c | 5 +
>> >> .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
>> >> .../verisilicon/rockchip_av1_entropymode.h | 272 +
>> >> .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
>> >> .../verisilicon/rockchip_av1_filmgrain.h | 36 +
>> >> .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
>> >> .../verisilicon/rockchip_vpu981_regs.h | 477 ++
>> >> .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
>> >> 14 files changed, 8291 insertions(+)
>> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
>> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
>> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
>> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
>> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
>> >> create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
>> >>
>> >> --
>> >> 2.34.1
>> >>
>> >
>> >_______________________________________________
>> >linux-arm-kernel mailing list
>> >linux-arm-kernel@lists.infradead.org
>> >http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>> >
>>
>> --
>> Pengutronix e.K. | |
>> Steuerwalder Str. 21 | http://www.pengutronix.de/ |
>> 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
>> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
>
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
[-- Attachment #2: Type: text/plain, Size: 176 bytes --]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-19 21:54 ` Michael Grzeschik
2022-12-20 1:52 ` Ezequiel Garcia
@ 2022-12-20 17:00 ` Nicolas Dufresne
2022-12-21 22:01 ` Michael Grzeschik
1 sibling, 1 reply; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-20 17:00 UTC (permalink / raw)
To: Michael Grzeschik, Ezequiel Garcia
Cc: Benjamin Gaignard, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne,
linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le lundi 19 décembre 2022 à 22:54 +0100, Michael Grzeschik a écrit :
> Hi Benjamin,
> Hi Ezequiel,
>
> On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
> > On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
> > <benjamin.gaignard@collabora.com> wrote:
> > >
> > > This series implement AV1 stateless decoder for RK3588 SoC.
> > > The harware support 8 and 10 bits bitstreams up to 7680x4320.
> > > AV1 feature like film grain or scaling are done by the postprocessor.
> > > The driver can produce NV12_4L4 and NV12 pixel formats.
> > > A native 10bits NV12_4L4 format is possible but need more investigation
> > > to be completly documented and enabled.
> > >
> > > It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
> > > Sebastian's device-tree patches for RK3588.
> > >
> >
> > I thought the AV1 decoder in RK3588 was really a separate hardware
> > from the Hantro G1/G2.
> >
> > Shouldn't this need a new driver for this new hardware?
>
> Just jumping into this discussion as I am currently working on the rkvenc driver.
>
> In my case I am extending the rkvdec driver to become more generic for
> other rockchip specific enc/decoders.
>
> My first change looks like this:
> ---
> drivers/staging/media/rkvdec/Makefile | 4 +-
> drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
> drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
> drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
> drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
> ---
>
> While working on other parts of the encoder I found many places in the
> rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
> functions but where limited to the decoder case.
>
> I think there are two options for the av1 codec.
>
> 1) If the vpu981 is a driver that has nothing to do with verisilicon but
> works with this driver framework, then we should integrate vepu981 into it
> but consider rename the verisilicon unrelated parts to something generic.
I've raised in my review the the naming is sub-optimal. This is an unmodified
VC9000D AV1 decoder. No other codecs have been included in the package, even
though VC9000D cores can support more.
Stating this driver have no place here seems a bit strange to me, but with
proper arguments, maybe we can make a case and start a VC9000D dedicated driver
(that will be a lot of copy paste, VC9000D post processor notably is identical
to VC8000 post processor, but one could argue we should make a VCX000 driver ?
>
> 2) Move the vepu981 av1 driver into the rkvdec instead.
That make no sense, its not a Rockchip HW design, and will likely start
appearing on non-RK SoC in the future.
>
> If 1) is the way to go, we can even think of moving the staging code parts from
> rkvdec to the verisilicon code. Likewise to the vepu981-av1.
Again, I think using RK naming is unfortunate choice. This AV1 decoder is just
like the G1/H1 combo you will find on RK3288. And that same combo is found on
many older SoC (actually even newer SoC un the VC8000Nano brand).
Like all generation of Hantro chips, there is an optional dependency that can
exist between encoder and decoders. The question is if this requires a single
driver to maintain a valid state or not. So far, it seems devs have assume that
is it needed.
p.s. fun fact, on most HW, the decoder rate is cut in half with running
concurrently with the encoder
>
> I could also keep on integrating the rkvenc on that base instead.
Do you know if there is any interaction between the encoder and decoder ? Shared
registers, shared internal cache ? That's basically what differentiate Hantro
here. Also, be aware that some folks are considering starting on RKVDEC2 driver,
are you looking at RK32/33 series ? or more RK35 ?
>
> Regards,
> Michael
>
> > > The full branch can be found here:
> > > https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
> > >
> > > Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
> > > The failing tests are:
> > > - 10bits bitstream because 10bits output formats aren't yet implemented.
> > > - the 2 tests with 2 spatial layers: few errors in luma/chroma values
> > > - tests with resolution < hardware limit (64x64)
> > >
> > > Benjamin
> > >
> > > Benjamin Gaignard (9):
> > > dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
> > > media: verisilicon: Add AV1 decoder mode and controls
> > > media: verisilicon: Save bit depth for AV1 decoder
> > > media: verisilicon: Check AV1 bitstreams bit depth
> > > media: verisilicon: Compute motion vectors size for AV1 frames
> > > media: verisilicon: Add AV1 entropy helpers
> > > media: verisilicon: Add Rockchip AV1 decoder
> > > media: verisilicon: Add film grain feature to AV1 driver
> > > media: verisilicon: Enable AV1 decoder on rk3588
> > >
> > > .../bindings/media/rockchip-vpu.yaml | 1 +
> > > drivers/media/platform/verisilicon/Makefile | 3 +
> > > drivers/media/platform/verisilicon/hantro.h | 5 +
> > > .../media/platform/verisilicon/hantro_drv.c | 54 +
> > > .../media/platform/verisilicon/hantro_hw.h | 102 +
> > > .../platform/verisilicon/hantro_postproc.c | 3 +
> > > .../media/platform/verisilicon/hantro_v4l2.c | 5 +
> > > .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
> > > .../verisilicon/rockchip_av1_entropymode.h | 272 +
> > > .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
> > > .../verisilicon/rockchip_av1_filmgrain.h | 36 +
> > > .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
> > > .../verisilicon/rockchip_vpu981_regs.h | 477 ++
> > > .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
> > > 14 files changed, 8291 insertions(+)
> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
> > >
> > > --
> > > 2.34.1
> > >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-20 13:40 ` Michael Grzeschik
@ 2022-12-20 17:15 ` Nicolas Dufresne
2022-12-20 17:50 ` Ezequiel Garcia
2022-12-21 22:17 ` Michael Grzeschik
0 siblings, 2 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-20 17:15 UTC (permalink / raw)
To: Michael Grzeschik, Ezequiel Garcia
Cc: Benjamin Gaignard, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne,
linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le mardi 20 décembre 2022 à 14:40 +0100, Michael Grzeschik a écrit :
> Hi Ezequiel,
>
> On Mon, Dec 19, 2022 at 10:52:02PM -0300, Ezequiel Garcia wrote:
> > On Mon, Dec 19, 2022 at 6:54 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
> > > On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
> > > > On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
> > > > <benjamin.gaignard@collabora.com> wrote:
> > > > >
> > > > > This series implement AV1 stateless decoder for RK3588 SoC.
> > > > > The harware support 8 and 10 bits bitstreams up to 7680x4320.
> > > > > AV1 feature like film grain or scaling are done by the postprocessor.
> > > > > The driver can produce NV12_4L4 and NV12 pixel formats.
> > > > > A native 10bits NV12_4L4 format is possible but need more investigation
> > > > > to be completly documented and enabled.
> > > > >
> > > > > It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
> > > > > Sebastian's device-tree patches for RK3588.
> > > > >
> > > >
> > > > I thought the AV1 decoder in RK3588 was really a separate hardware
> > > > from the Hantro G1/G2.
> > > >
> > > > Shouldn't this need a new driver for this new hardware?
> > >
> > > Just jumping into this discussion as I am currently working on the rkvenc driver.
> > >
> >
> > The more the merrier, there's always room for developers :-)
> >
> > > In my case I am extending the rkvdec driver to become more generic for
> > > other rockchip specific enc/decoders.
> > >
> > > My first change looks like this:
> > > ---
> > > drivers/staging/media/rkvdec/Makefile | 4 +-
> > > drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
> > > drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
> > > drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
> > > drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
> > > ---
> > >
> > > While working on other parts of the encoder I found many places in the
> > > rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
> > > functions but where limited to the decoder case.
> > >
> >
> > Because stateless decoders devices are very similar in their general behavior,
> > their drivers could be very similar.
> >
> > Hantro and Rkvdec could look similar because the same humans worked on them.
> >
> > Most boilerplate code, as well as V4L2 format negotiation, VB2 buffer handling
> > could be shared among all stateless decoder drivers. I think even at one point
> > we experimented with having a shared/common code base for all stateless codecs.
> >
> > In other words, it's entirely possible to support Hantro devices in
> > the Cedrus driver
> > and vice-versa, you would only have to write the hardware-specific bits.
> >
> > However, there is consensus to have a separate driver for each
> > different hardware,
> > even when the hardware is a bit similar. This may lead to some code duplication,
> > but it's less fragile / more flexible. Maintaining drivers this way allows
> > developers to evolve, testing on a small family of devices, without
> > breaking support
> > for other devices.
> >
> > This is important as sometimes it's hard to get the hardware,
> > but we still don't want to break the support!
> >
> > > I think there are two options for the av1 codec.
> > >
> > > 1) If the vpu981 is a driver that has nothing to do with verisilicon but
> > > works with this driver framework, then we should integrate vepu981 into it
> > > but consider rename the verisilicon unrelated parts to something generic.
> > >
> > > 2) Move the vepu981 av1 driver into the rkvdec instead.
> > >
> > > If 1) is the way to go, we can even think of moving the staging code parts from
> > > rkvdec to the verisilicon code. Likewise to the vepu981-av1.
> > >
> >
> > The Hantro driver should only support G1, G2, and VC8000D;
> > which can be said to belong to the same family.
> >
> > The RKVDEC driver supports Rockchip vdpu34x core. I have to admit
> > I'm not exactly sure if we support anything else than vdpu34x.
>
> Currently the rkvdec is only supporting vdpu34x. My work would integrate
> vepu54x into the rkvdec boilerplate and so it would support encode as decode.
Which CODEC do you currently work on ? We are about to send a first RFC for a
VP8 stateless encoder API (with a rk3399 driver to test), but haven't written
the Stateless Encoder API spec yet, so still some work there. And was planning
to make an H.264 Sateless Encoder soon. Would be nice to avoid duplicating the
effort.
>
> > I'm not familiar with the AV1 support provided by this patch,
> > but looking at the mpp code:
> >
> > ...
> > "rk3588",
> > ROCKCHIP_SOC_RK3588,
> > HAVE_VDPU2 | HAVE_VDPU2_PP | HAVE_VEPU2 | HAVE_RKVDEC | HAVE_RKVENC |
> > HAVE_JPEG_DEC | HAVE_AV1DEC | HAVE_AVSDEC | HAVE_VEPU2_JPEG,
> > { &vdpu38x, &rkjpegd, &vdpu2, &vdpu2_jpeg_pp, &av1d, &avspd},
> > { &vepu58x, &vepu2, &vepu2_jpeg, NULL, },
> >
> > Seems RK3588 supports a Hantro core (VDPU2), a vdpu38x core and this AV1 core,
> > which according to this patchset is vdpu981 (?)
> >
> > If the vdpu38x device interface, configuration, buffer handling and
> > registers are
> > similar enough with vdpu34x, adding vdpu38x to the Rkvdec driver
> > should be straightforward.
> > If the vdpu38x core differs, it may be reason enough to consider a new driver.
> >
> > As for vdpu981 (AV1), I'm inclined to think it deserves its own driver.
Well, it has its own driver, Hantro (which is not rkvdec). But maybe you could
extend on why you think VC9000D decoder have no place in the hantro/verisilicon
family ?
> >
> > Again, I'm far less worried for a little code duplication in the
> > boilerplate (which can be solved
> > with helpers, etc.) and more worried about making sure we can evolve
> > drivers easily,
> > while minimizing regressions.
>
> Thanks for the explanation.
>
> As I agree that not breaking current drivers is a strong argument. Also
> rkvdec is still in staging, which makes it less harmful for the
> integration of the encoder path.
We are working on unstaging patches.
>
> Since we can not ensure that the rkvenc/rkvdec is not another unknown
> verisilicon core, going the way of working on a common rkvpu driver is
> probably the best for now.
We can collectively share our knowledge (to the limit of our legal rights to
share) make the right call. In the case of this VC9000D decoder, there is a
massive amount of registers that aren't AV1 specific, and existed in VC8000
cores as it, same offset, same size. Hantro designs have this very specific
style, which is to share register, giving it a meaning for multiple CODECs.
I've commented about that in my review, but until we have more codecs support on
VC9000 cores, generalizing the register definition is premature.
Though, an typical example of things that are Hantro specific and common to G1,
VC8000 and VC9000, is the handling of references for H.264 decoding. This
differs massively from how it works with rkvdec here.
>
> Also, since I have already done some work into that direction, it sounds
> good for me. :)
Great. For you interest, the modified Hantro H1 encoder is an information that
Rockchip disclosed to us directly. And that whys vepu121 (if my memory is right)
is implemented in Hantro driver. The register layout have been altered by RK but
that's all there is, it does share semantic (and a lot of code) with the "real"
H1 found on RK3288, IMX8M Mini and others.
>
>
> > > I could also keep on integrating the rkvenc on that base instead.
> > >
> > > Regards,
> > > Michael
> > >
> > > > > The full branch can be found here:
> > > > > https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
> > > > >
> > > > > Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
> > > > > The failing tests are:
> > > > > - 10bits bitstream because 10bits output formats aren't yet implemented.
> > > > > - the 2 tests with 2 spatial layers: few errors in luma/chroma values
> > > > > - tests with resolution < hardware limit (64x64)
> > > > >
> > > > > Benjamin
> > > > >
> > > > > Benjamin Gaignard (9):
> > > > > dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
> > > > > media: verisilicon: Add AV1 decoder mode and controls
> > > > > media: verisilicon: Save bit depth for AV1 decoder
> > > > > media: verisilicon: Check AV1 bitstreams bit depth
> > > > > media: verisilicon: Compute motion vectors size for AV1 frames
> > > > > media: verisilicon: Add AV1 entropy helpers
> > > > > media: verisilicon: Add Rockchip AV1 decoder
> > > > > media: verisilicon: Add film grain feature to AV1 driver
> > > > > media: verisilicon: Enable AV1 decoder on rk3588
> > > > >
> > > > > .../bindings/media/rockchip-vpu.yaml | 1 +
> > > > > drivers/media/platform/verisilicon/Makefile | 3 +
> > > > > drivers/media/platform/verisilicon/hantro.h | 5 +
> > > > > .../media/platform/verisilicon/hantro_drv.c | 54 +
> > > > > .../media/platform/verisilicon/hantro_hw.h | 102 +
> > > > > .../platform/verisilicon/hantro_postproc.c | 3 +
> > > > > .../media/platform/verisilicon/hantro_v4l2.c | 5 +
> > > > > .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
> > > > > .../verisilicon/rockchip_av1_entropymode.h | 272 +
> > > > > .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
> > > > > .../verisilicon/rockchip_av1_filmgrain.h | 36 +
> > > > > .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
> > > > > .../verisilicon/rockchip_vpu981_regs.h | 477 ++
> > > > > .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
> > > > > 14 files changed, 8291 insertions(+)
> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
> > > > >
> > > > > --
> > > > > 2.34.1
> > > > >
> > > >
> > > > _______________________________________________
> > > > linux-arm-kernel mailing list
> > > > linux-arm-kernel@lists.infradead.org
> > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> > > >
> > >
> > > --
> > > Pengutronix e.K. | |
> > > Steuerwalder Str. 21 | http://www.pengutronix.de/ |
> > > 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
> > > Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
> >
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-20 17:15 ` Nicolas Dufresne
@ 2022-12-20 17:50 ` Ezequiel Garcia
2022-12-21 22:17 ` Michael Grzeschik
1 sibling, 0 replies; 36+ messages in thread
From: Ezequiel Garcia @ 2022-12-20 17:50 UTC (permalink / raw)
To: Nicolas Dufresne
Cc: Michael Grzeschik, Benjamin Gaignard, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne,
linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel, Sebastian Fricke
Bonjour Nicolas,
On Tue, Dec 20, 2022 at 2:15 PM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le mardi 20 décembre 2022 à 14:40 +0100, Michael Grzeschik a écrit :
> > Hi Ezequiel,
> >
> > On Mon, Dec 19, 2022 at 10:52:02PM -0300, Ezequiel Garcia wrote:
> > > On Mon, Dec 19, 2022 at 6:54 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
> > > > On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
> > > > > On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
> > > > > <benjamin.gaignard@collabora.com> wrote:
> > > > > >
> > > > > > This series implement AV1 stateless decoder for RK3588 SoC.
> > > > > > The harware support 8 and 10 bits bitstreams up to 7680x4320.
> > > > > > AV1 feature like film grain or scaling are done by the postprocessor.
> > > > > > The driver can produce NV12_4L4 and NV12 pixel formats.
> > > > > > A native 10bits NV12_4L4 format is possible but need more investigation
> > > > > > to be completly documented and enabled.
> > > > > >
> > > > > > It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
> > > > > > Sebastian's device-tree patches for RK3588.
> > > > > >
> > > > >
> > > > > I thought the AV1 decoder in RK3588 was really a separate hardware
> > > > > from the Hantro G1/G2.
> > > > >
> > > > > Shouldn't this need a new driver for this new hardware?
> > > >
> > > > Just jumping into this discussion as I am currently working on the rkvenc driver.
> > > >
> > >
> > > The more the merrier, there's always room for developers :-)
> > >
> > > > In my case I am extending the rkvdec driver to become more generic for
> > > > other rockchip specific enc/decoders.
> > > >
> > > > My first change looks like this:
> > > > ---
> > > > drivers/staging/media/rkvdec/Makefile | 4 +-
> > > > drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
> > > > drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
> > > > drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
> > > > drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
> > > > ---
> > > >
> > > > While working on other parts of the encoder I found many places in the
> > > > rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
> > > > functions but where limited to the decoder case.
> > > >
> > >
> > > Because stateless decoders devices are very similar in their general behavior,
> > > their drivers could be very similar.
> > >
> > > Hantro and Rkvdec could look similar because the same humans worked on them.
> > >
> > > Most boilerplate code, as well as V4L2 format negotiation, VB2 buffer handling
> > > could be shared among all stateless decoder drivers. I think even at one point
> > > we experimented with having a shared/common code base for all stateless codecs.
> > >
> > > In other words, it's entirely possible to support Hantro devices in
> > > the Cedrus driver
> > > and vice-versa, you would only have to write the hardware-specific bits.
> > >
> > > However, there is consensus to have a separate driver for each
> > > different hardware,
> > > even when the hardware is a bit similar. This may lead to some code duplication,
> > > but it's less fragile / more flexible. Maintaining drivers this way allows
> > > developers to evolve, testing on a small family of devices, without
> > > breaking support
> > > for other devices.
> > >
> > > This is important as sometimes it's hard to get the hardware,
> > > but we still don't want to break the support!
> > >
> > > > I think there are two options for the av1 codec.
> > > >
> > > > 1) If the vpu981 is a driver that has nothing to do with verisilicon but
> > > > works with this driver framework, then we should integrate vepu981 into it
> > > > but consider rename the verisilicon unrelated parts to something generic.
> > > >
> > > > 2) Move the vepu981 av1 driver into the rkvdec instead.
> > > >
> > > > If 1) is the way to go, we can even think of moving the staging code parts from
> > > > rkvdec to the verisilicon code. Likewise to the vepu981-av1.
> > > >
> > >
> > > The Hantro driver should only support G1, G2, and VC8000D;
> > > which can be said to belong to the same family.
> > >
> > > The RKVDEC driver supports Rockchip vdpu34x core. I have to admit
> > > I'm not exactly sure if we support anything else than vdpu34x.
> >
> > Currently the rkvdec is only supporting vdpu34x. My work would integrate
> > vepu54x into the rkvdec boilerplate and so it would support encode as decode.
>
> Which CODEC do you currently work on ? We are about to send a first RFC for a
> VP8 stateless encoder API (with a rk3399 driver to test), but haven't written
> the Stateless Encoder API spec yet, so still some work there. And was planning
> to make an H.264 Sateless Encoder soon. Would be nice to avoid duplicating the
> effort.
>
> >
> > > I'm not familiar with the AV1 support provided by this patch,
> > > but looking at the mpp code:
> > >
> > > ...
> > > "rk3588",
> > > ROCKCHIP_SOC_RK3588,
> > > HAVE_VDPU2 | HAVE_VDPU2_PP | HAVE_VEPU2 | HAVE_RKVDEC | HAVE_RKVENC |
> > > HAVE_JPEG_DEC | HAVE_AV1DEC | HAVE_AVSDEC | HAVE_VEPU2_JPEG,
> > > { &vdpu38x, &rkjpegd, &vdpu2, &vdpu2_jpeg_pp, &av1d, &avspd},
> > > { &vepu58x, &vepu2, &vepu2_jpeg, NULL, },
> > >
> > > Seems RK3588 supports a Hantro core (VDPU2), a vdpu38x core and this AV1 core,
> > > which according to this patchset is vdpu981 (?)
> > >
> > > If the vdpu38x device interface, configuration, buffer handling and
> > > registers are
> > > similar enough with vdpu34x, adding vdpu38x to the Rkvdec driver
> > > should be straightforward.
> > > If the vdpu38x core differs, it may be reason enough to consider a new driver.
> > >
> > > As for vdpu981 (AV1), I'm inclined to think it deserves its own driver.
>
> Well, it has its own driver, Hantro (which is not rkvdec). But maybe you could
> extend on why you think VC9000D decoder have no place in the hantro/verisilicon
> family ?
>
Oh good, then the AV1 core in RK3588 is actually a Verisilicon VC9000D core,
I didn't know that.
Maybe the naming in the driver should be vc9000? Or maybe it should be noted
in comments and documentation. If that was already documented somewhere
in this patchset, then I definitely missed it.
> > >
> > > Again, I'm far less worried for a little code duplication in the
> > > boilerplate (which can be solved
> > > with helpers, etc.) and more worried about making sure we can evolve
> > > drivers easily,
> > > while minimizing regressions.
> >
> > Thanks for the explanation.
> >
> > As I agree that not breaking current drivers is a strong argument. Also
> > rkvdec is still in staging, which makes it less harmful for the
> > integration of the encoder path.
>
> We are working on unstaging patches.
>
Rkvdec can be unstaged.
Sebastian: I remember you were working on Rkvdec.
If you want to unstage this driver now, and work on features later,
I believe that would make a lot of sense!
Thanks!
Ezequiel
> >
> > Since we can not ensure that the rkvenc/rkvdec is not another unknown
> > verisilicon core, going the way of working on a common rkvpu driver is
> > probably the best for now.
>
> We can collectively share our knowledge (to the limit of our legal rights to
> share) make the right call. In the case of this VC9000D decoder, there is a
> massive amount of registers that aren't AV1 specific, and existed in VC8000
> cores as it, same offset, same size. Hantro designs have this very specific
> style, which is to share register, giving it a meaning for multiple CODECs.
>
> I've commented about that in my review, but until we have more codecs support on
> VC9000 cores, generalizing the register definition is premature.
>
> Though, an typical example of things that are Hantro specific and common to G1,
> VC8000 and VC9000, is the handling of references for H.264 decoding. This
> differs massively from how it works with rkvdec here.
>
> >
> > Also, since I have already done some work into that direction, it sounds
> > good for me. :)
>
> Great. For you interest, the modified Hantro H1 encoder is an information that
> Rockchip disclosed to us directly. And that whys vepu121 (if my memory is right)
> is implemented in Hantro driver. The register layout have been altered by RK but
> that's all there is, it does share semantic (and a lot of code) with the "real"
> H1 found on RK3288, IMX8M Mini and others.
>
> >
> >
> > > > I could also keep on integrating the rkvenc on that base instead.
> > > >
> > > > Regards,
> > > > Michael
> > > >
> > > > > > The full branch can be found here:
> > > > > > https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
> > > > > >
> > > > > > Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
> > > > > > The failing tests are:
> > > > > > - 10bits bitstream because 10bits output formats aren't yet implemented.
> > > > > > - the 2 tests with 2 spatial layers: few errors in luma/chroma values
> > > > > > - tests with resolution < hardware limit (64x64)
> > > > > >
> > > > > > Benjamin
> > > > > >
> > > > > > Benjamin Gaignard (9):
> > > > > > dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
> > > > > > media: verisilicon: Add AV1 decoder mode and controls
> > > > > > media: verisilicon: Save bit depth for AV1 decoder
> > > > > > media: verisilicon: Check AV1 bitstreams bit depth
> > > > > > media: verisilicon: Compute motion vectors size for AV1 frames
> > > > > > media: verisilicon: Add AV1 entropy helpers
> > > > > > media: verisilicon: Add Rockchip AV1 decoder
> > > > > > media: verisilicon: Add film grain feature to AV1 driver
> > > > > > media: verisilicon: Enable AV1 decoder on rk3588
> > > > > >
> > > > > > .../bindings/media/rockchip-vpu.yaml | 1 +
> > > > > > drivers/media/platform/verisilicon/Makefile | 3 +
> > > > > > drivers/media/platform/verisilicon/hantro.h | 5 +
> > > > > > .../media/platform/verisilicon/hantro_drv.c | 54 +
> > > > > > .../media/platform/verisilicon/hantro_hw.h | 102 +
> > > > > > .../platform/verisilicon/hantro_postproc.c | 3 +
> > > > > > .../media/platform/verisilicon/hantro_v4l2.c | 5 +
> > > > > > .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
> > > > > > .../verisilicon/rockchip_av1_entropymode.h | 272 +
> > > > > > .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
> > > > > > .../verisilicon/rockchip_av1_filmgrain.h | 36 +
> > > > > > .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
> > > > > > .../verisilicon/rockchip_vpu981_regs.h | 477 ++
> > > > > > .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
> > > > > > 14 files changed, 8291 insertions(+)
> > > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
> > > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
> > > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
> > > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
> > > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
> > > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
> > > > > >
> > > > > > --
> > > > > > 2.34.1
> > > > > >
> > > > >
> > > > > _______________________________________________
> > > > > linux-arm-kernel mailing list
> > > > > linux-arm-kernel@lists.infradead.org
> > > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> > > > >
> > > >
> > > > --
> > > > Pengutronix e.K. | |
> > > > Steuerwalder Str. 21 | http://www.pengutronix.de/ |
> > > > 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
> > > > Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
> > >
> >
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 4/9] media: verisilicon: Check AV1 bitstreams bit depth
2022-12-20 13:02 ` Benjamin Gaignard
@ 2022-12-21 16:16 ` Nicolas Dufresne
0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-21 16:16 UTC (permalink / raw)
To: Benjamin Gaignard, ezequiel, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Le mardi 20 décembre 2022 à 14:02 +0100, Benjamin Gaignard a écrit :
> Le 19/12/2022 à 21:38, Nicolas Dufresne a écrit :
> > Le lundi 19 décembre 2022 à 16:56 +0100, Benjamin Gaignard a écrit :
> > > The driver supports 8 and 10 bits bitstreams, make sure to discard
> > > other cases.
> > >
> > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > ---
> > > drivers/media/platform/verisilicon/hantro_drv.c | 6 ++++++
> > > 1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
> > > index 8e93710dcfed..e10fc59634dd 100644
> > > --- a/drivers/media/platform/verisilicon/hantro_drv.c
> > > +++ b/drivers/media/platform/verisilicon/hantro_drv.c
> > > @@ -282,7 +282,13 @@ static int hantro_try_ctrl(struct v4l2_ctrl *ctrl)
> > > /* We only support profile 0 */
> > > if (dec_params->profile != 0)
> > > return -EINVAL;
> > > + } else if (ctrl->id == V4L2_CID_STATELESS_AV1_SEQUENCE) {
> > > + const struct v4l2_ctrl_av1_sequence *sequence = ctrl->p_new.p_av1_sequence;
> > > +
> > > + if (sequence->bit_depth != 8 && sequence->bit_depth != 10)
> > > + return -EINVAL;
> > As you state in the cover letter, should this just be this for now ?
>
> The driver can decode 8 or 10 bits bitstreams but will on produce 8bits (NV12_4L4 or NV12)
> frames. The hardware is able to truncate 10bits bitstreams to 8 bits output.
I tested that, and NV12 works, picking NV12_4L4 though leads to corrupted
buffers. I think the PP is not being activated. G2/VC8000 and likely VC9000 can
only produce tile reference in the original depth chosen (hence why it goes not
have a format register like G1 does).
As you are aware, the driver didn't pick NV12_10LE40_4L4 automatically, and
that's what broke fluster 10bit test in tiled mode for 10bit. I suspect in v2,
we'll have all this fixed and 10bit will be activated, so this comment will be
ignored.
>
> >
> >
> > > + if (sequence->bit_depth != 8)
> > > + return -EINVAL;
> >
> > > }
> > > +
> > > return 0;
> > > }
> > >
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-19 21:07 ` Ezequiel Garcia
2022-12-19 21:54 ` Michael Grzeschik
@ 2022-12-21 16:33 ` Nicolas Dufresne
1 sibling, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-12-21 16:33 UTC (permalink / raw)
To: Ezequiel Garcia, Benjamin Gaignard
Cc: p.zabel, mchehab, robh+dt, krzysztof.kozlowski+dt, heiko,
daniel.almeida, nicolas.dufresne, linux-media, linux-rockchip,
devicetree, linux-arm-kernel, linux-kernel, kernel
Le lundi 19 décembre 2022 à 18:07 -0300, Ezequiel Garcia a écrit :
> Hi Benjamin,
>
> On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
> <benjamin.gaignard@collabora.com> wrote:
> >
> > This series implement AV1 stateless decoder for RK3588 SoC.
> > The harware support 8 and 10 bits bitstreams up to 7680x4320.
> > AV1 feature like film grain or scaling are done by the postprocessor.
> > The driver can produce NV12_4L4 and NV12 pixel formats.
> > A native 10bits NV12_4L4 format is possible but need more investigation
> > to be completly documented and enabled.
> >
> > It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
> > Sebastian's device-tree patches for RK3588.
> >
>
> I thought the AV1 decoder in RK3588 was really a separate hardware
> from the Hantro G1/G2.
>
> Shouldn't this need a new driver for this new hardware?
As discussed on IRC, whenever we enable H.264 and HEVC on VC9000 cores, we will
benefit from sharing some very specific code with the G1 and G2 (respectively).
Though for now, there is no overlap as the single core on RK3588 have all other
codecs disabled by fuse.
>
> Thanks!
> Ezequiel
>
> > The full branch can be found here:
> > https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
> >
> > Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
> > The failing tests are:
> > - 10bits bitstream because 10bits output formats aren't yet implemented.
> > - the 2 tests with 2 spatial layers: few errors in luma/chroma values
> > - tests with resolution < hardware limit (64x64)
> >
> > Benjamin
> >
> > Benjamin Gaignard (9):
> > dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
> > media: verisilicon: Add AV1 decoder mode and controls
> > media: verisilicon: Save bit depth for AV1 decoder
> > media: verisilicon: Check AV1 bitstreams bit depth
> > media: verisilicon: Compute motion vectors size for AV1 frames
> > media: verisilicon: Add AV1 entropy helpers
> > media: verisilicon: Add Rockchip AV1 decoder
> > media: verisilicon: Add film grain feature to AV1 driver
> > media: verisilicon: Enable AV1 decoder on rk3588
> >
> > .../bindings/media/rockchip-vpu.yaml | 1 +
> > drivers/media/platform/verisilicon/Makefile | 3 +
> > drivers/media/platform/verisilicon/hantro.h | 5 +
> > .../media/platform/verisilicon/hantro_drv.c | 54 +
> > .../media/platform/verisilicon/hantro_hw.h | 102 +
> > .../platform/verisilicon/hantro_postproc.c | 3 +
> > .../media/platform/verisilicon/hantro_v4l2.c | 5 +
> > .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
> > .../verisilicon/rockchip_av1_entropymode.h | 272 +
> > .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
> > .../verisilicon/rockchip_av1_filmgrain.h | 36 +
> > .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
> > .../verisilicon/rockchip_vpu981_regs.h | 477 ++
> > .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
> > 14 files changed, 8291 insertions(+)
> > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
> > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
> > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
> > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
> > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
> > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
> >
> > --
> > 2.34.1
> >
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-20 17:00 ` Nicolas Dufresne
@ 2022-12-21 22:01 ` Michael Grzeschik
2022-12-22 13:24 ` Ezequiel Garcia
0 siblings, 1 reply; 36+ messages in thread
From: Michael Grzeschik @ 2022-12-21 22:01 UTC (permalink / raw)
To: Nicolas Dufresne
Cc: Ezequiel Garcia, Benjamin Gaignard, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne,
linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
[-- Attachment #1.1: Type: text/plain, Size: 7913 bytes --]
On Tue, Dec 20, 2022 at 12:00:01PM -0500, Nicolas Dufresne wrote:
>Le lundi 19 décembre 2022 à 22:54 +0100, Michael Grzeschik a écrit :
>> Hi Benjamin,
>> Hi Ezequiel,
>>
>> On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
>> > On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
>> > <benjamin.gaignard@collabora.com> wrote:
>> > >
>> > > This series implement AV1 stateless decoder for RK3588 SoC.
>> > > The harware support 8 and 10 bits bitstreams up to 7680x4320.
>> > > AV1 feature like film grain or scaling are done by the postprocessor.
>> > > The driver can produce NV12_4L4 and NV12 pixel formats.
>> > > A native 10bits NV12_4L4 format is possible but need more investigation
>> > > to be completly documented and enabled.
>> > >
>> > > It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
>> > > Sebastian's device-tree patches for RK3588.
>> > >
>> >
>> > I thought the AV1 decoder in RK3588 was really a separate hardware
>> > from the Hantro G1/G2.
>> >
>> > Shouldn't this need a new driver for this new hardware?
>>
>> Just jumping into this discussion as I am currently working on the rkvenc driver.
>>
>> In my case I am extending the rkvdec driver to become more generic for
>> other rockchip specific enc/decoders.
>>
>> My first change looks like this:
>> ---
>> drivers/staging/media/rkvdec/Makefile | 4 +-
>> drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
>> drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
>> drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
>> drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
>> ---
>>
>> While working on other parts of the encoder I found many places in the
>> rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
>> functions but where limited to the decoder case.
>>
>> I think there are two options for the av1 codec.
>>
>> 1) If the vpu981 is a driver that has nothing to do with verisilicon but
>> works with this driver framework, then we should integrate vepu981 into it
>> but consider rename the verisilicon unrelated parts to something generic.
>
>I've raised in my review the the naming is sub-optimal. This is an unmodified
>VC9000D AV1 decoder. No other codecs have been included in the package, even
>though VC9000D cores can support more.
>
>Stating this driver have no place here seems a bit strange to me, but with
>proper arguments, maybe we can make a case and start a VC9000D dedicated driver
>(that will be a lot of copy paste, VC9000D post processor notably is identical
>to VC8000 post processor, but one could argue we should make a VCX000 driver ?
>
>>
>> 2) Move the vepu981 av1 driver into the rkvdec instead.
>
>That make no sense, its not a Rockchip HW design, and will likely start
>appearing on non-RK SoC in the future.
Sure. I did not know that it actually is an VC9000.
>> If 1) is the way to go, we can even think of moving the staging code parts from
>> rkvdec to the verisilicon code. Likewise to the vepu981-av1.
>
>Again, I think using RK naming is unfortunate choice. This AV1 decoder is just
>like the G1/H1 combo you will find on RK3288. And that same combo is found on
>many older SoC (actually even newer SoC un the VC8000Nano brand).
>
>Like all generation of Hantro chips, there is an optional dependency that can
>exist between encoder and decoders. The question is if this requires a single
>driver to maintain a valid state or not. So far, it seems devs have assume that
>is it needed.
>
>p.s. fun fact, on most HW, the decoder rate is cut in half with running
>concurrently with the encoder
>
>>
>> I could also keep on integrating the rkvenc on that base instead.
>
>Do you know if there is any interaction between the encoder and decoder ? Shared
>registers, shared internal cache ? That's basically what differentiate Hantro
>here. Also, be aware that some folks are considering starting on RKVDEC2 driver,
>are you looking at RK32/33 series ? or more RK35 ?
I don't know of any limitations or interactions between the encoder and
decoder. I know that the rkvdec is implementing the register space of
the mpp called vdpu34x. My work would adress the vepu54x encoder
registers. Which can be found on rk3588 (vepu541) and rk3568/r3566
(vepu540).
AFAIK the vepu541 and vepu540 are very similar. The vepu540 is limited
by 4k and vepu541 can handle 8k h264.
However how the vepu541 is interacting with the vdpu34x needs to be
found out. Also I did not find any common parts in the mpp
implementation yet.
>> > > The full branch can be found here:
>> > > https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
>> > >
>> > > Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
>> > > The failing tests are:
>> > > - 10bits bitstream because 10bits output formats aren't yet implemented.
>> > > - the 2 tests with 2 spatial layers: few errors in luma/chroma values
>> > > - tests with resolution < hardware limit (64x64)
>> > >
>> > > Benjamin
>> > >
>> > > Benjamin Gaignard (9):
>> > > dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
>> > > media: verisilicon: Add AV1 decoder mode and controls
>> > > media: verisilicon: Save bit depth for AV1 decoder
>> > > media: verisilicon: Check AV1 bitstreams bit depth
>> > > media: verisilicon: Compute motion vectors size for AV1 frames
>> > > media: verisilicon: Add AV1 entropy helpers
>> > > media: verisilicon: Add Rockchip AV1 decoder
>> > > media: verisilicon: Add film grain feature to AV1 driver
>> > > media: verisilicon: Enable AV1 decoder on rk3588
>> > >
>> > > .../bindings/media/rockchip-vpu.yaml | 1 +
>> > > drivers/media/platform/verisilicon/Makefile | 3 +
>> > > drivers/media/platform/verisilicon/hantro.h | 5 +
>> > > .../media/platform/verisilicon/hantro_drv.c | 54 +
>> > > .../media/platform/verisilicon/hantro_hw.h | 102 +
>> > > .../platform/verisilicon/hantro_postproc.c | 3 +
>> > > .../media/platform/verisilicon/hantro_v4l2.c | 5 +
>> > > .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
>> > > .../verisilicon/rockchip_av1_entropymode.h | 272 +
>> > > .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
>> > > .../verisilicon/rockchip_av1_filmgrain.h | 36 +
>> > > .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
>> > > .../verisilicon/rockchip_vpu981_regs.h | 477 ++
>> > > .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
>> > > 14 files changed, 8291 insertions(+)
>> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
>> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
>> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
>> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
>> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
>> > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
>> > >
>> > > --
>> > > 2.34.1
>> > >
>> >
>> > _______________________________________________
>> > linux-arm-kernel mailing list
>> > linux-arm-kernel@lists.infradead.org
>> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>> >
>>
>
>
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
[-- Attachment #2: Type: text/plain, Size: 176 bytes --]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-20 17:15 ` Nicolas Dufresne
2022-12-20 17:50 ` Ezequiel Garcia
@ 2022-12-21 22:17 ` Michael Grzeschik
1 sibling, 0 replies; 36+ messages in thread
From: Michael Grzeschik @ 2022-12-21 22:17 UTC (permalink / raw)
To: Nicolas Dufresne
Cc: Ezequiel Garcia, Benjamin Gaignard, p.zabel, mchehab, robh+dt,
krzysztof.kozlowski+dt, heiko, daniel.almeida, nicolas.dufresne,
linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
[-- Attachment #1.1: Type: text/plain, Size: 13098 bytes --]
On Tue, Dec 20, 2022 at 12:15:20PM -0500, Nicolas Dufresne wrote:
>Le mardi 20 décembre 2022 à 14:40 +0100, Michael Grzeschik a écrit :
>> Hi Ezequiel,
>>
>> On Mon, Dec 19, 2022 at 10:52:02PM -0300, Ezequiel Garcia wrote:
>> > On Mon, Dec 19, 2022 at 6:54 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
>> > > On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
>> > > > On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
>> > > > <benjamin.gaignard@collabora.com> wrote:
>> > > > >
>> > > > > This series implement AV1 stateless decoder for RK3588 SoC.
>> > > > > The harware support 8 and 10 bits bitstreams up to 7680x4320.
>> > > > > AV1 feature like film grain or scaling are done by the postprocessor.
>> > > > > The driver can produce NV12_4L4 and NV12 pixel formats.
>> > > > > A native 10bits NV12_4L4 format is possible but need more investigation
>> > > > > to be completly documented and enabled.
>> > > > >
>> > > > > It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
>> > > > > Sebastian's device-tree patches for RK3588.
>> > > > >
>> > > >
>> > > > I thought the AV1 decoder in RK3588 was really a separate hardware
>> > > > from the Hantro G1/G2.
>> > > >
>> > > > Shouldn't this need a new driver for this new hardware?
>> > >
>> > > Just jumping into this discussion as I am currently working on the rkvenc driver.
>> > >
>> >
>> > The more the merrier, there's always room for developers :-)
>> >
>> > > In my case I am extending the rkvdec driver to become more generic for
>> > > other rockchip specific enc/decoders.
>> > >
>> > > My first change looks like this:
>> > > ---
>> > > drivers/staging/media/rkvdec/Makefile | 4 +-
>> > > drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
>> > > drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
>> > > drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
>> > > drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
>> > > ---
>> > >
>> > > While working on other parts of the encoder I found many places in the
>> > > rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
>> > > functions but where limited to the decoder case.
>> > >
>> >
>> > Because stateless decoders devices are very similar in their general behavior,
>> > their drivers could be very similar.
>> >
>> > Hantro and Rkvdec could look similar because the same humans worked on them.
>> >
>> > Most boilerplate code, as well as V4L2 format negotiation, VB2 buffer handling
>> > could be shared among all stateless decoder drivers. I think even at one point
>> > we experimented with having a shared/common code base for all stateless codecs.
>> >
>> > In other words, it's entirely possible to support Hantro devices in
>> > the Cedrus driver
>> > and vice-versa, you would only have to write the hardware-specific bits.
>> >
>> > However, there is consensus to have a separate driver for each
>> > different hardware,
>> > even when the hardware is a bit similar. This may lead to some code duplication,
>> > but it's less fragile / more flexible. Maintaining drivers this way allows
>> > developers to evolve, testing on a small family of devices, without
>> > breaking support
>> > for other devices.
>> >
>> > This is important as sometimes it's hard to get the hardware,
>> > but we still don't want to break the support!
>> >
>> > > I think there are two options for the av1 codec.
>> > >
>> > > 1) If the vpu981 is a driver that has nothing to do with verisilicon but
>> > > works with this driver framework, then we should integrate vepu981 into it
>> > > but consider rename the verisilicon unrelated parts to something generic.
>> > >
>> > > 2) Move the vepu981 av1 driver into the rkvdec instead.
>> > >
>> > > If 1) is the way to go, we can even think of moving the staging code parts from
>> > > rkvdec to the verisilicon code. Likewise to the vepu981-av1.
>> > >
>> >
>> > The Hantro driver should only support G1, G2, and VC8000D;
>> > which can be said to belong to the same family.
>> >
>> > The RKVDEC driver supports Rockchip vdpu34x core. I have to admit
>> > I'm not exactly sure if we support anything else than vdpu34x.
>>
>> Currently the rkvdec is only supporting vdpu34x. My work would integrate
>> vepu54x into the rkvdec boilerplate and so it would support encode as decode.
>
>Which CODEC do you currently work on ? We are about to send a first RFC for a
>VP8 stateless encoder API (with a rk3399 driver to test), but haven't written
>the Stateless Encoder API spec yet, so still some work there. And was planning
>to make an H.264 Sateless Encoder soon. Would be nice to avoid duplicating the
>effort.
As mentioned in the other Mail. I am working on the vepu54x core, which
can be found on the rk35 chips. The H.264 Stateless API it is currently
based on is comming from the bootlin stack that is floating around since
some time.
https://git.pengutronix.de/cgit/mgr/linux/commit/?h=v5.19/topic/rk3568-vepu-h264-stateless-bootlin&id=ec2b92670100c6bd075ca859bc3392b5b913be27
The STATELESS controlls are not yet final. And with your future
implementation of a second h264 encoder we will have to find a common
definition.
For me now it is good enough to get started with the first steps of
encoding some early frames and get to know the hardware better.
My first goal is to get the driver perform good enough to run
the bootlin application against it and get a h264 stream.
https://github.com/bootlin/v4l2-hantro-h264-encoder
>> > I'm not familiar with the AV1 support provided by this patch,
>> > but looking at the mpp code:
>> >
>> > ...
>> > "rk3588",
>> > ROCKCHIP_SOC_RK3588,
>> > HAVE_VDPU2 | HAVE_VDPU2_PP | HAVE_VEPU2 | HAVE_RKVDEC | HAVE_RKVENC |
>> > HAVE_JPEG_DEC | HAVE_AV1DEC | HAVE_AVSDEC | HAVE_VEPU2_JPEG,
>> > { &vdpu38x, &rkjpegd, &vdpu2, &vdpu2_jpeg_pp, &av1d, &avspd},
>> > { &vepu58x, &vepu2, &vepu2_jpeg, NULL, },
>> >
>> > Seems RK3588 supports a Hantro core (VDPU2), a vdpu38x core and this AV1 core,
>> > which according to this patchset is vdpu981 (?)
>> >
>> > If the vdpu38x device interface, configuration, buffer handling and
>> > registers are
>> > similar enough with vdpu34x, adding vdpu38x to the Rkvdec driver
>> > should be straightforward.
>> > If the vdpu38x core differs, it may be reason enough to consider a new driver.
>> >
>> > As for vdpu981 (AV1), I'm inclined to think it deserves its own driver.
>
>Well, it has its own driver, Hantro (which is not rkvdec). But maybe you could
>extend on why you think VC9000D decoder have no place in the hantro/verisilicon
>family ?
>
>> >
>> > Again, I'm far less worried for a little code duplication in the
>> > boilerplate (which can be solved
>> > with helpers, etc.) and more worried about making sure we can evolve
>> > drivers easily,
>> > while minimizing regressions.
>>
>> Thanks for the explanation.
>>
>> As I agree that not breaking current drivers is a strong argument. Also
>> rkvdec is still in staging, which makes it less harmful for the
>> integration of the encoder path.
>
>We are working on unstaging patches.
>
>>
>> Since we can not ensure that the rkvenc/rkvdec is not another unknown
>> verisilicon core, going the way of working on a common rkvpu driver is
>> probably the best for now.
>
>We can collectively share our knowledge (to the limit of our legal rights to
>share) make the right call. In the case of this VC9000D decoder, there is a
>massive amount of registers that aren't AV1 specific, and existed in VC8000
>cores as it, same offset, same size. Hantro designs have this very specific
>style, which is to share register, giving it a meaning for multiple CODECs.
>
>I've commented about that in my review, but until we have more codecs support on
>VC9000 cores, generalizing the register definition is premature.
>
>Though, an typical example of things that are Hantro specific and common to G1,
>VC8000 and VC9000, is the handling of references for H.264 decoding. This
>differs massively from how it works with rkvdec here.
>
>>
>> Also, since I have already done some work into that direction, it sounds
>> good for me. :)
>
>Great. For you interest, the modified Hantro H1 encoder is an information that
>Rockchip disclosed to us directly. And that whys vepu121 (if my memory is right)
>is implemented in Hantro driver. The register layout have been altered by RK but
>that's all there is, it does share semantic (and a lot of code) with the "real"
>H1 found on RK3288, IMX8M Mini and others.
Good to know!
After asking at Rockchip some time ago. They ensured us that the vepu54x
core is something rockchip specific. We will live with that. I hope that
we can find everything necessary for implementing the rkvenc mainline
driver based on the mpp stack and the datasheets that are freely
available.
However it is possible that we could run into limitations here in the future.
>> > > I could also keep on integrating the rkvenc on that base instead.
>> > >
>> > > Regards,
>> > > Michael
>> > >
>> > > > > The full branch can be found here:
>> > > > > https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v1
>> > > > >
>> > > > > Fluster score is: 151/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
>> > > > > The failing tests are:
>> > > > > - 10bits bitstream because 10bits output formats aren't yet implemented.
>> > > > > - the 2 tests with 2 spatial layers: few errors in luma/chroma values
>> > > > > - tests with resolution < hardware limit (64x64)
>> > > > >
>> > > > > Benjamin
>> > > > >
>> > > > > Benjamin Gaignard (9):
>> > > > > dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
>> > > > > media: verisilicon: Add AV1 decoder mode and controls
>> > > > > media: verisilicon: Save bit depth for AV1 decoder
>> > > > > media: verisilicon: Check AV1 bitstreams bit depth
>> > > > > media: verisilicon: Compute motion vectors size for AV1 frames
>> > > > > media: verisilicon: Add AV1 entropy helpers
>> > > > > media: verisilicon: Add Rockchip AV1 decoder
>> > > > > media: verisilicon: Add film grain feature to AV1 driver
>> > > > > media: verisilicon: Enable AV1 decoder on rk3588
>> > > > >
>> > > > > .../bindings/media/rockchip-vpu.yaml | 1 +
>> > > > > drivers/media/platform/verisilicon/Makefile | 3 +
>> > > > > drivers/media/platform/verisilicon/hantro.h | 5 +
>> > > > > .../media/platform/verisilicon/hantro_drv.c | 54 +
>> > > > > .../media/platform/verisilicon/hantro_hw.h | 102 +
>> > > > > .../platform/verisilicon/hantro_postproc.c | 3 +
>> > > > > .../media/platform/verisilicon/hantro_v4l2.c | 5 +
>> > > > > .../verisilicon/rockchip_av1_entropymode.c | 4536 +++++++++++++++++
>> > > > > .../verisilicon/rockchip_av1_entropymode.h | 272 +
>> > > > > .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
>> > > > > .../verisilicon/rockchip_av1_filmgrain.h | 36 +
>> > > > > .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2280 +++++++++
>> > > > > .../verisilicon/rockchip_vpu981_regs.h | 477 ++
>> > > > > .../platform/verisilicon/rockchip_vpu_hw.c | 116 +
>> > > > > 14 files changed, 8291 insertions(+)
>> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
>> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
>> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
>> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
>> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
>> > > > > create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
>> > > > >
>> > > > > --
>> > > > > 2.34.1
>> > > > >
>> > > >
>> > > > _______________________________________________
>> > > > linux-arm-kernel mailing list
>> > > > linux-arm-kernel@lists.infradead.org
>> > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>> > > >
>> > >
>> > > --
>> > > Pengutronix e.K. | |
>> > > Steuerwalder Str. 21 | http://www.pengutronix.de/ |
>> > > 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
>> > > Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
>> >
>>
>
>
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
[-- Attachment #2: Type: text/plain, Size: 176 bytes --]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] AV1 stateless decoder for RK3588
2022-12-21 22:01 ` Michael Grzeschik
@ 2022-12-22 13:24 ` Ezequiel Garcia
0 siblings, 0 replies; 36+ messages in thread
From: Ezequiel Garcia @ 2022-12-22 13:24 UTC (permalink / raw)
To: Michael Grzeschik
Cc: Nicolas Dufresne, Benjamin Gaignard, Philipp Zabel,
Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
Heiko Stuebner, Daniel Almeida, nicolas.dufresne, linux-media,
open list:ARM/Rockchip SoC..., devicetree, linux-arm-kernel,
Linux Kernel Mailing List, Collabora Kernel ML,
Andrzej Pietrasiewicz
Hi Nicolas, Michael,
(+Andrzej)
El mié, 21 dic. 2022 19:01, Michael Grzeschik <mgr@pengutronix.de> escribió:
>
> On Tue, Dec 20, 2022 at 12:00:01PM -0500, Nicolas Dufresne wrote:
> >Le lundi 19 décembre 2022 à 22:54 +0100, Michael Grzeschik a écrit :
> >> Hi Benjamin,
> >> Hi Ezequiel,
> >>
> >> On Mon, Dec 19, 2022 at 06:07:38PM -0300, Ezequiel Garcia wrote:
> >> > On Mon, Dec 19, 2022 at 12:56 PM Benjamin Gaignard
> >> > <benjamin.gaignard@collabora.com> wrote:
> >> > >
> >> > > This series implement AV1 stateless decoder for RK3588 SoC.
> >> > > The harware support 8 and 10 bits bitstreams up to 7680x4320.
> >> > > AV1 feature like film grain or scaling are done by the postprocessor.
> >> > > The driver can produce NV12_4L4 and NV12 pixel formats.
> >> > > A native 10bits NV12_4L4 format is possible but need more investigation
> >> > > to be completly documented and enabled.
> >> > >
> >> > > It is based on Daniel's "[RFC,v3] media: Add AV1 uAPI" [1] patches and
> >> > > Sebastian's device-tree patches for RK3588.
> >> > >
> >> >
> >> > I thought the AV1 decoder in RK3588 was really a separate hardware
> >> > from the Hantro G1/G2.
> >> >
> >> > Shouldn't this need a new driver for this new hardware?
> >>
> >> Just jumping into this discussion as I am currently working on the rkvenc driver.
> >>
> >> In my case I am extending the rkvdec driver to become more generic for
> >> other rockchip specific enc/decoders.
> >>
> >> My first change looks like this:
> >> ---
> >> drivers/staging/media/rkvdec/Makefile | 4 +-
> >> drivers/staging/media/rkvdec/rkvdec-h264.c | 100 ++++-----
> >> drivers/staging/media/rkvdec/rkvdec-vp9.c | 142 ++++++-------
> >> drivers/staging/media/rkvdec/{rkvdec.c => rkvpu.c} | 510 +++++++++++++++++++++++-----------------------
> >> drivers/staging/media/rkvdec/{rkvdec.h => rkvpu.h} | 66 +++---
> >> ---
> >>
> >> While working on other parts of the encoder I found many places in the
> >> rkvdec driver (e.g. v4l2 and vb2 callbacks) that looked familiar to the hantro
> >> functions but where limited to the decoder case.
> >>
> >> I think there are two options for the av1 codec.
> >>
> >> 1) If the vpu981 is a driver that has nothing to do with verisilicon but
> >> works with this driver framework, then we should integrate vepu981 into it
> >> but consider rename the verisilicon unrelated parts to something generic.
> >
> >I've raised in my review the the naming is sub-optimal. This is an unmodified
> >VC9000D AV1 decoder. No other codecs have been included in the package, even
> >though VC9000D cores can support more.
> >
> >Stating this driver have no place here seems a bit strange to me, but with
> >proper arguments, maybe we can make a case and start a VC9000D dedicated driver
> >(that will be a lot of copy paste, VC9000D post processor notably is identical
> >to VC8000 post processor, but one could argue we should make a VCX000 driver ?
> >
> >>
> >> 2) Move the vepu981 av1 driver into the rkvdec instead.
> >
> >That make no sense, its not a Rockchip HW design, and will likely start
> >appearing on non-RK SoC in the future.
>
> Sure. I did not know that it actually is an VC9000.
>
> >> If 1) is the way to go, we can even think of moving the staging code parts from
> >> rkvdec to the verisilicon code. Likewise to the vepu981-av1.
> >
> >Again, I think using RK naming is unfortunate choice. This AV1 decoder is just
> >like the G1/H1 combo you will find on RK3288. And that same combo is found on
> >many older SoC (actually even newer SoC un the VC8000Nano brand).
> >
> >Like all generation of Hantro chips, there is an optional dependency that can
> >exist between encoder and decoders. The question is if this requires a single
> >driver to maintain a valid state or not. So far, it seems devs have assume that
> >is it needed.
> >
> >p.s. fun fact, on most HW, the decoder rate is cut in half with running
> >concurrently with the encoder
> >
> >>
> >> I could also keep on integrating the rkvenc on that base instead.
> >
> >Do you know if there is any interaction between the encoder and decoder ? Shared
> >registers, shared internal cache ? That's basically what differentiate Hantro
> >here. Also, be aware that some folks are considering starting on RKVDEC2 driver,
> >are you looking at RK32/33 series ? or more RK35 ?
>
> I don't know of any limitations or interactions between the encoder and
> decoder.
I believe we should explore separate drivers, and if there is any interaction,
try to model the shared piece through a shared block in the device tree.
In most cases, the decoder and encoder are separate blocks.
Also, the V4L2 stateless decoder interface covers only decoding.
Supporting both in the same driver has been painful, especially
the V4L2 negotiation, is hard to support for both encoders and decoders,
and has led to many bugs (and even worse, regressions) in the drivers.
Thanks,
Ezequiel
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 7/9] media: verisilicon: Add Rockchip AV1 decoder
[not found] ` <d7025e7e-246f-a7f4-f0b5-d1290a38a96c@collabora.com>
@ 2022-12-23 23:15 ` Daniel Almeida
2022-12-23 23:20 ` Daniel Almeida
0 siblings, 1 reply; 36+ messages in thread
From: Daniel Almeida @ 2022-12-23 23:15 UTC (permalink / raw)
To: Benjamin Gaignard, Nicolas Dufresne, ezequiel, p.zabel, mchehab,
robh+dt, krzysztof.kozlowski+dt, heiko, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
> > + int cur_offset[V4L2_AV1_NUM_REF_FRAMES - 1];
> > + int cur_roffset[V4L2_AV1_NUM_REF_FRAMES - 1];
>
> This looks like V4L2_AV1_REFS_PER_FRAME. Daniel, should be remove
> this
> V4L2_AV1_NUM_REF_FRAMES ? Its redundant with
> V4L2_AV1_TOTAL_REFS_PER_FRAME ...
Hi. These are different. NUM_REF_FRAMES is the size of the "DPB" while
TOTAL_REFS_PER_FRAME is the maximum number of references a frame can
use. It just so happens that in AV1 these two are close in absolute
value (i.e. 7 vs 8).
Using VP9 as a comparison, the DPB size is still 8, but REFS_PER_FRAME
is 3 (meaning a frame can specificy LAST, GOLDEN and ALTREF values).
As this is per spec and a mere convenience, I vote for keeping it.
-- Daniel
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 7/9] media: verisilicon: Add Rockchip AV1 decoder
2022-12-23 23:15 ` [PATCH v1 7/9] media: verisilicon: Add Rockchip AV1 decoder Daniel Almeida
@ 2022-12-23 23:20 ` Daniel Almeida
0 siblings, 0 replies; 36+ messages in thread
From: Daniel Almeida @ 2022-12-23 23:20 UTC (permalink / raw)
To: Benjamin Gaignard, Nicolas Dufresne, ezequiel, p.zabel, mchehab,
robh+dt, krzysztof.kozlowski+dt, heiko, nicolas.dufresne
Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
linux-kernel, kernel
Ah, I was too quick on that answer :/
I see that they have REFS_PER_FRAME (7), TOTAL_REFS_PER_FRAME (8) _and_
NUM_REF_FRAMES (8), in which case it is redundant indeed. I will remove
that on v4.
-- Daniel
On Fri, 2022-12-23 at 20:15 -0300, Daniel Almeida wrote:
> > > + int cur_offset[V4L2_AV1_NUM_REF_FRAMES - 1];
> > > + int cur_roffset[V4L2_AV1_NUM_REF_FRAMES - 1];
> >
> > This looks like V4L2_AV1_REFS_PER_FRAME. Daniel, should be remove
> > this
> > V4L2_AV1_NUM_REF_FRAMES ? Its redundant with
> > V4L2_AV1_TOTAL_REFS_PER_FRAME ...
>
> Hi. These are different. NUM_REF_FRAMES is the size of the "DPB"
> while
> TOTAL_REFS_PER_FRAME is the maximum number of references a frame can
> use. It just so happens that in AV1 these two are close in absolute
> value (i.e. 7 vs 8).
>
> Using VP9 as a comparison, the DPB size is still 8, but
> REFS_PER_FRAME
> is 3 (meaning a frame can specificy LAST, GOLDEN and ALTREF values).
>
> As this is per spec and a mere convenience, I vote for keeping it.
>
> -- Daniel
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2022-12-24 7:29 UTC | newest]
Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-12-19 15:56 [PATCH v1 0/9] AV1 stateless decoder for RK3588 Benjamin Gaignard
2022-12-19 15:56 ` [PATCH v1 1/9] dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible Benjamin Gaignard
2022-12-19 16:06 ` Krzysztof Kozlowski
2022-12-19 16:44 ` Benjamin Gaignard
2022-12-20 9:55 ` Krzysztof Kozlowski
2022-12-19 15:56 ` [PATCH v1 2/9] media: verisilicon: Add AV1 decoder mode and controls Benjamin Gaignard
[not found] ` <202212200257.l7xJlHCi-lkp@intel.com>
2022-12-19 20:26 ` Nicolas Dufresne
2022-12-19 20:28 ` Nicolas Dufresne
2022-12-19 15:56 ` [PATCH v1 3/9] media: verisilicon: Save bit depth for AV1 decoder Benjamin Gaignard
2022-12-19 20:37 ` Nicolas Dufresne
2022-12-19 21:29 ` Ezequiel Garcia
2022-12-20 13:05 ` Benjamin Gaignard
2022-12-19 15:56 ` [PATCH v1 4/9] media: verisilicon: Check AV1 bitstreams bit depth Benjamin Gaignard
2022-12-19 20:38 ` Nicolas Dufresne
2022-12-20 13:02 ` Benjamin Gaignard
2022-12-21 16:16 ` Nicolas Dufresne
2022-12-19 15:56 ` [PATCH v1 5/9] media: verisilicon: Compute motion vectors size for AV1 frames Benjamin Gaignard
2022-12-19 20:42 ` Nicolas Dufresne
2022-12-20 13:13 ` Benjamin Gaignard
2022-12-19 15:56 ` [PATCH v1 8/9] media: verisilicon: Add film grain feature to AV1 driver Benjamin Gaignard
2022-12-19 15:56 ` [PATCH v1 9/9] media: verisilicon: Enable AV1 decoder on rk3588 Benjamin Gaignard
2022-12-19 20:22 ` [PATCH v1 0/9] AV1 stateless decoder for RK3588 Nicolas Dufresne
2022-12-19 21:07 ` Ezequiel Garcia
2022-12-19 21:54 ` Michael Grzeschik
2022-12-20 1:52 ` Ezequiel Garcia
2022-12-20 12:26 ` Benjamin Gaignard
2022-12-20 13:40 ` Michael Grzeschik
2022-12-20 17:15 ` Nicolas Dufresne
2022-12-20 17:50 ` Ezequiel Garcia
2022-12-21 22:17 ` Michael Grzeschik
2022-12-20 17:00 ` Nicolas Dufresne
2022-12-21 22:01 ` Michael Grzeschik
2022-12-22 13:24 ` Ezequiel Garcia
2022-12-21 16:33 ` Nicolas Dufresne
[not found] ` <20221219155616.848690-8-benjamin.gaignard@collabora.com>
[not found] ` <092f76873a914c52c5157a21401be6cf78e3f188.camel@ndufresne.ca>
[not found] ` <be1ff517-765d-c97f-8cce-3f359cd0015c@collabora.com>
[not found] ` <4afdce546b2f10cf97e12f8325232483efeea1ce.camel@ndufresne.ca>
[not found] ` <d7025e7e-246f-a7f4-f0b5-d1290a38a96c@collabora.com>
2022-12-23 23:15 ` [PATCH v1 7/9] media: verisilicon: Add Rockchip AV1 decoder Daniel Almeida
2022-12-23 23:20 ` Daniel Almeida
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).