public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] media: iris: optimize COMV buffer allocation for VPU3x and VPU4x
@ 2026-04-21  6:41 ` Vishnu Reddy
  2026-04-21  9:01   ` Bryan O'Donoghue
  0 siblings, 1 reply; 2+ messages in thread
From: Vishnu Reddy @ 2026-04-21  6:41 UTC (permalink / raw)
  To: Vikash Garodia, Dikshita Agarwal, Abhinav Kumar,
	Bryan O'Donoghue, Mauro Carvalho Chehab
  Cc: linux-media, linux-arm-msm, linux-kernel, Vishnu Reddy

The existing iris_vpu_dec_comv_size() used VIDEO_MAX_FRAME (32) as
num_comv count unconditionally when calculating the COMV buffer size.
This resulted in an oversized COMV buffer allocation throughout decode
session, wasting memory regardless of actual number of buffers required.

For VPU3x and VPU4x platforms, introduce iris_vpu3x_4x_dec_comv_size() to
replace iris_vpu_dec_comv_size(). It derives num_comv dynamically, it
uses inst->fw_min_count once the firmware has reported its minimum buffer
requirements, and fallback to inst->buffers[BUF_OUTPUT].min_count during
initialization before firmware has communicated its requirements. This
aligns the COMV buffer size to the actual count needed rather than always
allocating with fixed VIDEO_MAX_FRAME value.

Additionally, during iris_vdec_inst_init(), fw_min_count was initialized
to MIN_BUFFERS instead of 0. This masked the fallback logic and caused the
COMV size calculation to use MIN_BUFFERS even before firmware had reported
its actual requirements. Fix this by initializing fw_min_count to 0.

During testing of 1080p AVC, it reduces the COMV buffer size from 32.89MB
to 6.16MB per decode session, significantly reducing memory consumption.

Signed-off-by: Vishnu Reddy <busanna.reddy@oss.qualcomm.com>
---
 drivers/media/platform/qcom/iris/iris_vdec.c       |  2 +-
 drivers/media/platform/qcom/iris/iris_vpu_buffer.c | 24 +++++++++++++++++++---
 drivers/media/platform/qcom/iris/iris_vpu_buffer.h |  1 +
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/qcom/iris/iris_vdec.c b/drivers/media/platform/qcom/iris/iris_vdec.c
index 719217399a30..f433065e08b2 100644
--- a/drivers/media/platform/qcom/iris/iris_vdec.c
+++ b/drivers/media/platform/qcom/iris/iris_vdec.c
@@ -24,7 +24,7 @@ int iris_vdec_inst_init(struct iris_inst *inst)
 	inst->fmt_src = kzalloc_obj(*inst->fmt_src);
 	inst->fmt_dst = kzalloc_obj(*inst->fmt_dst);
 
-	inst->fw_min_count = MIN_BUFFERS;
+	inst->fw_min_count = 0;
 
 	f = inst->fmt_src;
 	f->type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
index 9270422c1601..57237543b229 100644
--- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
+++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
@@ -731,6 +731,23 @@ static u32 iris_vpu_dec_comv_size(struct iris_inst *inst)
 	u32 height = f->fmt.pix_mp.height;
 	u32 width = f->fmt.pix_mp.width;
 
+	if (inst->codec == V4L2_PIX_FMT_H264)
+		return hfi_buffer_comv_h264d(width, height, num_comv);
+	else if (inst->codec == V4L2_PIX_FMT_HEVC)
+		return hfi_buffer_comv_h265d(width, height, num_comv);
+
+	return 0;
+}
+
+static u32 iris_vpu3x_4x_dec_comv_size(struct iris_inst *inst)
+{
+	struct v4l2_format *f = inst->fmt_src;
+	u32 height = f->fmt.pix_mp.height;
+	u32 width = f->fmt.pix_mp.width;
+	u32 num_comv;
+
+	num_comv = inst->fw_min_count ? inst->fw_min_count : inst->buffers[BUF_OUTPUT].min_count;
+
 	if (inst->codec == V4L2_PIX_FMT_H264)
 		return hfi_buffer_comv_h264d(width, height, num_comv);
 	else if (inst->codec == V4L2_PIX_FMT_HEVC)
@@ -739,7 +756,8 @@ static u32 iris_vpu_dec_comv_size(struct iris_inst *inst)
 		if (inst->fw_caps[DRAP].value)
 			return 0;
 		else
-			return hfi_buffer_comv_av1d(width, height, num_comv);
+			return hfi_buffer_comv_av1d(width, height,
+						    num_comv + AV1D_COMV_BUFFER_OVERHEAD);
 	}
 
 	return 0;
@@ -2025,7 +2043,7 @@ u32 iris_vpu_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_type)
 
 	static const struct iris_vpu_buf_type_handle dec_internal_buf_type_handle[] = {
 		{BUF_BIN,         iris_vpu_dec_bin_size             },
-		{BUF_COMV,        iris_vpu_dec_comv_size            },
+		{BUF_COMV,        iris_vpu3x_4x_dec_comv_size       },
 		{BUF_NON_COMV,    iris_vpu_dec_non_comv_size        },
 		{BUF_LINE,        iris_vpu_dec_line_size            },
 		{BUF_PERSIST,     iris_vpu_dec_persist_size         },
@@ -2098,7 +2116,7 @@ u32 iris_vpu4x_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_typ
 
 	static const struct iris_vpu_buf_type_handle dec_internal_buf_type_handle[] = {
 		{BUF_BIN,         iris_vpu_dec_bin_size         },
-		{BUF_COMV,        iris_vpu_dec_comv_size        },
+		{BUF_COMV,        iris_vpu3x_4x_dec_comv_size   },
 		{BUF_NON_COMV,    iris_vpu_dec_non_comv_size    },
 		{BUF_LINE,        iris_vpu4x_dec_line_size      },
 		{BUF_PERSIST,     iris_vpu4x_dec_persist_size   },
diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
index 12640eb5ed8c..7a9cc1c92da3 100644
--- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
+++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
@@ -110,6 +110,7 @@ struct iris_inst;
 #define MAX_PE_NBR_DATA_LCU16_LINE_BUFFER_SIZE 96
 #define AV1D_NUM_HW_PIC_BUF    16
 #define AV1D_NUM_FRAME_HEADERS 16
+#define AV1D_COMV_BUFFER_OVERHEAD 7
 #define SIZE_AV1D_SEQUENCE_HEADER 768
 #define SIZE_AV1D_METADATA        512
 #define SIZE_AV1D_FRAME_HEADER    1280

---
base-commit: 4fbeef21f5387234111b5d52924e77757626faa5
change-id: 20260421-optimize_comv_buffer-ae7107673609

Best regards,
-- 
Vishnu Reddy <busanna.reddy@oss.qualcomm.com>


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] media: iris: optimize COMV buffer allocation for VPU3x and VPU4x
  2026-04-21  6:41 ` [PATCH] media: iris: optimize COMV buffer allocation for VPU3x and VPU4x Vishnu Reddy
@ 2026-04-21  9:01   ` Bryan O'Donoghue
  0 siblings, 0 replies; 2+ messages in thread
From: Bryan O'Donoghue @ 2026-04-21  9:01 UTC (permalink / raw)
  To: Vishnu Reddy, Vikash Garodia, Dikshita Agarwal, Abhinav Kumar,
	Mauro Carvalho Chehab
  Cc: linux-media, linux-arm-msm, linux-kernel

On 21/04/2026 07:41, Vishnu Reddy wrote:
> The existing iris_vpu_dec_comv_size() used VIDEO_MAX_FRAME (32) as
> num_comv count unconditionally when calculating the COMV buffer size.
> This resulted in an oversized COMV buffer allocation throughout decode
> session, wasting memory regardless of actual number of buffers required.

You should define what a COMV buffer is before talking about how you are 
changing it, i.e. define the term Co-located Motion Vector (CMOV) and 
then use the abbreviation CMOV liberally as you wish.

> For VPU3x and VPU4x platforms, introduce iris_vpu3x_4x_dec_comv_size() to
> replace iris_vpu_dec_comv_size(). It derives num_comv dynamically, it

"These derive num_cmove dynamically"

> uses inst->fw_min_count once the firmware has reported its minimum buffer
> requirements, and fallback to inst->buffers[BUF_OUTPUT].min_count during
> initialization before firmware has communicated its requirements. This
> aligns the COMV buffer size to the actual count needed rather than always
> allocating with fixed VIDEO_MAX_FRAME value.
> 
> Additionally, during iris_vdec_inst_init(), fw_min_count was initialized
> to MIN_BUFFERS instead of 0. This masked the fallback logic and caused the
> COMV size calculation to use MIN_BUFFERS even before firmware had reported
> its actual requirements. Fix this by initializing fw_min_count to 0.
> 
> During testing of 1080p AVC, it reduces the COMV buffer size from 32.89MB
> to 6.16MB per decode session, significantly reducing memory consumption.

Cool nice fix.

> 
> Signed-off-by: Vishnu Reddy <busanna.reddy@oss.qualcomm.com>
> ---
>   drivers/media/platform/qcom/iris/iris_vdec.c       |  2 +-
>   drivers/media/platform/qcom/iris/iris_vpu_buffer.c | 24 +++++++++++++++++++---
>   drivers/media/platform/qcom/iris/iris_vpu_buffer.h |  1 +
>   3 files changed, 23 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/media/platform/qcom/iris/iris_vdec.c b/drivers/media/platform/qcom/iris/iris_vdec.c
> index 719217399a30..f433065e08b2 100644
> --- a/drivers/media/platform/qcom/iris/iris_vdec.c
> +++ b/drivers/media/platform/qcom/iris/iris_vdec.c
> @@ -24,7 +24,7 @@ int iris_vdec_inst_init(struct iris_inst *inst)
>   	inst->fmt_src = kzalloc_obj(*inst->fmt_src);
>   	inst->fmt_dst = kzalloc_obj(*inst->fmt_dst);
> 
> -	inst->fw_min_count = MIN_BUFFERS;
> +	inst->fw_min_count = 0;
> 
>   	f = inst->fmt_src;
>   	f->type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
> diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
> index 9270422c1601..57237543b229 100644
> --- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
> +++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
> @@ -731,6 +731,23 @@ static u32 iris_vpu_dec_comv_size(struct iris_inst *inst)
>   	u32 height = f->fmt.pix_mp.height;
>   	u32 width = f->fmt.pix_mp.width;
> 
> +	if (inst->codec == V4L2_PIX_FMT_H264)
> +		return hfi_buffer_comv_h264d(width, height, num_comv);
> +	else if (inst->codec == V4L2_PIX_FMT_HEVC)
> +		return hfi_buffer_comv_h265d(width, height, num_comv);
> +
> +	return 0;
> +}
> +
> +static u32 iris_vpu3x_4x_dec_comv_size(struct iris_inst *inst)
> +{
> +	struct v4l2_format *f = inst->fmt_src;
> +	u32 height = f->fmt.pix_mp.height;
> +	u32 width = f->fmt.pix_mp.width;
> +	u32 num_comv;
> +
> +	num_comv = inst->fw_min_count ? inst->fw_min_count : inst->buffers[BUF_OUTPUT].min_count;

Please just if/else this though its far easier to read/understand that way.

> +
>   	if (inst->codec == V4L2_PIX_FMT_H264)
>   		return hfi_buffer_comv_h264d(width, height, num_comv);
>   	else if (inst->codec == V4L2_PIX_FMT_HEVC)
> @@ -739,7 +756,8 @@ static u32 iris_vpu_dec_comv_size(struct iris_inst *inst)
>   		if (inst->fw_caps[DRAP].value)
>   			return 0;
>   		else
> -			return hfi_buffer_comv_av1d(width, height, num_comv);
> +			return hfi_buffer_comv_av1d(width, height,
> +						    num_comv + AV1D_COMV_BUFFER_OVERHEAD);
>   	}
> 
>   	return 0;
> @@ -2025,7 +2043,7 @@ u32 iris_vpu_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_type)
> 
>   	static const struct iris_vpu_buf_type_handle dec_internal_buf_type_handle[] = {
>   		{BUF_BIN,         iris_vpu_dec_bin_size             },
> -		{BUF_COMV,        iris_vpu_dec_comv_size            },
> +		{BUF_COMV,        iris_vpu3x_4x_dec_comv_size       },
>   		{BUF_NON_COMV,    iris_vpu_dec_non_comv_size        },
>   		{BUF_LINE,        iris_vpu_dec_line_size            },
>   		{BUF_PERSIST,     iris_vpu_dec_persist_size         },
> @@ -2098,7 +2116,7 @@ u32 iris_vpu4x_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_typ
> 
>   	static const struct iris_vpu_buf_type_handle dec_internal_buf_type_handle[] = {
>   		{BUF_BIN,         iris_vpu_dec_bin_size         },
> -		{BUF_COMV,        iris_vpu_dec_comv_size        },
> +		{BUF_COMV,        iris_vpu3x_4x_dec_comv_size   },
>   		{BUF_NON_COMV,    iris_vpu_dec_non_comv_size    },
>   		{BUF_LINE,        iris_vpu4x_dec_line_size      },
>   		{BUF_PERSIST,     iris_vpu4x_dec_persist_size   },
> diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
> index 12640eb5ed8c..7a9cc1c92da3 100644
> --- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
> +++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
> @@ -110,6 +110,7 @@ struct iris_inst;
>   #define MAX_PE_NBR_DATA_LCU16_LINE_BUFFER_SIZE 96
>   #define AV1D_NUM_HW_PIC_BUF    16
>   #define AV1D_NUM_FRAME_HEADERS 16
> +#define AV1D_COMV_BUFFER_OVERHEAD 7

Whats this ? Why is there a new seven byte overhead ? Does it represent 
a header, an alignment ?

An overhead can mean anything.

>   #define SIZE_AV1D_SEQUENCE_HEADER 768
>   #define SIZE_AV1D_METADATA        512
>   #define SIZE_AV1D_FRAME_HEADER    1280
> 
> ---
> base-commit: 4fbeef21f5387234111b5d52924e77757626faa5
> change-id: 20260421-optimize_comv_buffer-ae7107673609
> 
> Best regards,
> --
> Vishnu Reddy <busanna.reddy@oss.qualcomm.com>
> 



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-04-21  9:01 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <lWwJ9pbXoZXg350L9fA8Sx-qznLud6KXnJlBTFNBLZQXEwKZeI50KGzJPDq43FO2QtbisF9pyxxeVTXX-WvN0Q==@protonmail.internalid>
2026-04-21  6:41 ` [PATCH] media: iris: optimize COMV buffer allocation for VPU3x and VPU4x Vishnu Reddy
2026-04-21  9:01   ` Bryan O'Donoghue

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox