public inbox for linux-media@vger.kernel.org
 help / color / mirror / Atom feed
From: Vishnu Reddy <busanna.reddy@oss.qualcomm.com>
To: Vikash Garodia <vikash.garodia@oss.qualcomm.com>,
	Bryan O'Donoghue <bod@kernel.org>,
	Dikshita Agarwal <dikshita.agarwal@oss.qualcomm.com>,
	Abhinav Kumar <abhinav.kumar@linux.dev>,
	Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-media@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] media: iris: optimize COMV buffer allocation for VPU3x and VPU4x
Date: Fri, 24 Apr 2026 14:47:44 +0530	[thread overview]
Message-ID: <ec1a0557-18eb-4b16-7c10-95b69ea7e63a@oss.qualcomm.com> (raw)
In-Reply-To: <899e0575-6de3-4ab7-b817-7a51c6b45787@oss.qualcomm.com>


On 4/22/2026 1:12 PM, Vikash Garodia wrote:
>
>
> On 4/21/2026 2:31 PM, Bryan O'Donoghue wrote:
>> On 21/04/2026 07:41, Vishnu Reddy wrote:
>>> The existing iris_vpu_dec_comv_size() used VIDEO_MAX_FRAME (32) as
>>> num_comv count unconditionally when calculating the COMV buffer size.
>>> This resulted in an oversized COMV buffer allocation throughout decode
>>> session, wasting memory regardless of actual number of buffers required.
>>
>> You should define what a COMV buffer is before talking about how you are
>> changing it, i.e. define the term Co-located Motion Vector (CMOV) and then
>> use the abbreviation CMOV liberally as you wish.
>>
>>> For VPU3x and VPU4x platforms, introduce iris_vpu3x_4x_dec_comv_size() to
>>> replace iris_vpu_dec_comv_size(). It derives num_comv dynamically, it
>>
>> "These derive num_cmove dynamically"
>>
>>> uses inst->fw_min_count once the firmware has reported its minimum buffer
>>> requirements, and fallback to inst->buffers[BUF_OUTPUT].min_count during
>>> initialization before firmware has communicated its requirements. This
>>> aligns the COMV buffer size to the actual count needed rather than always
>>> allocating with fixed VIDEO_MAX_FRAME value.
>>>
>>> Additionally, during iris_vdec_inst_init(), fw_min_count was initialized
>>> to MIN_BUFFERS instead of 0. This masked the fallback logic and caused the
>>> COMV size calculation to use MIN_BUFFERS even before firmware had reported
>>> its actual requirements. Fix this by initializing fw_min_count to 0.
>>>
>>> During testing of 1080p AVC, it reduces the COMV buffer size from 32.89MB
>>> to 6.16MB per decode session, significantly reducing memory consumption.
>>
>> Cool nice fix.
>>
>
> Indeed, a good saving.
>
>>>
>>> Signed-off-by: Vishnu Reddy <busanna.reddy@oss.qualcomm.com>
>>> ---
>>>   drivers/media/platform/qcom/iris/iris_vdec.c       |  2 +-
>>>   drivers/media/platform/qcom/iris/iris_vpu_buffer.c | 24 ++++++++++++
>>> +++++++---
>>>   drivers/media/platform/qcom/iris/iris_vpu_buffer.h |  1 +
>>>   3 files changed, 23 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/media/platform/qcom/iris/iris_vdec.c b/drivers/
>>> media/platform/qcom/iris/iris_vdec.c
>>> index 719217399a30..f433065e08b2 100644
>>> --- a/drivers/media/platform/qcom/iris/iris_vdec.c
>>> +++ b/drivers/media/platform/qcom/iris/iris_vdec.c
>>> @@ -24,7 +24,7 @@ int iris_vdec_inst_init(struct iris_inst *inst)
>>>       inst->fmt_src = kzalloc_obj(*inst->fmt_src);
>>>       inst->fmt_dst = kzalloc_obj(*inst->fmt_dst);
>>>
>>> -    inst->fw_min_count = MIN_BUFFERS;
>>> +    inst->fw_min_count = 0;
>>>
>>>       f = inst->fmt_src;
>>>       f->type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
>>> diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c b/
>>> drivers/media/platform/qcom/iris/iris_vpu_buffer.c
>>> index 9270422c1601..57237543b229 100644
>>> --- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
>>> +++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
>>> @@ -731,6 +731,23 @@ static u32 iris_vpu_dec_comv_size(struct iris_inst *inst)
>>>       u32 height = f->fmt.pix_mp.height;
>>>       u32 width = f->fmt.pix_mp.width;
>>>
>>> +    if (inst->codec == V4L2_PIX_FMT_H264)
>>> +        return hfi_buffer_comv_h264d(width, height, num_comv);
>>> +    else if (inst->codec == V4L2_PIX_FMT_HEVC)
>>> +        return hfi_buffer_comv_h265d(width, height, num_comv);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static u32 iris_vpu3x_4x_dec_comv_size(struct iris_inst *inst)
>>> +{
>>> +    struct v4l2_format *f = inst->fmt_src;
>>> +    u32 height = f->fmt.pix_mp.height;
>>> +    u32 width = f->fmt.pix_mp.width;
>>> +    u32 num_comv;
>>> +
>>> +    num_comv = inst->fw_min_count ? inst->fw_min_count : inst-
>>> >buffers[BUF_OUTPUT].min_count;
>>
>> Please just if/else this though its far easier to read/understand that way.
>>
>>> +
>>>       if (inst->codec == V4L2_PIX_FMT_H264)
>>>           return hfi_buffer_comv_h264d(width, height, num_comv);
>>>       else if (inst->codec == V4L2_PIX_FMT_HEVC)
>>> @@ -739,7 +756,8 @@ static u32 iris_vpu_dec_comv_size(struct iris_inst *inst)
>>>           if (inst->fw_caps[DRAP].value)
>>>               return 0;
>>>           else
>>> -            return hfi_buffer_comv_av1d(width, height, num_comv);
>>> +            return hfi_buffer_comv_av1d(width, height,
>>> +                            num_comv + AV1D_COMV_BUFFER_OVERHEAD);
>>>       }
>>>
>>>       return 0;
>>> @@ -2025,7 +2043,7 @@ u32 iris_vpu_buf_size(struct iris_inst *inst, enum
>>> iris_buffer_type buffer_type)
>>>
>>>       static const struct iris_vpu_buf_type_handle
>>> dec_internal_buf_type_handle[] = {
>>>           {BUF_BIN,         iris_vpu_dec_bin_size             },
>>> -        {BUF_COMV,        iris_vpu_dec_comv_size            },
>>> +        {BUF_COMV,        iris_vpu3x_4x_dec_comv_size       },
>>>           {BUF_NON_COMV,    iris_vpu_dec_non_comv_size        },
>>>           {BUF_LINE,        iris_vpu_dec_line_size            },
>>>           {BUF_PERSIST,     iris_vpu_dec_persist_size         },
>>> @@ -2098,7 +2116,7 @@ u32 iris_vpu4x_buf_size(struct iris_inst *inst, enum
>>> iris_buffer_type buffer_typ
>>>
>>>       static const struct iris_vpu_buf_type_handle
>>> dec_internal_buf_type_handle[] = {
>>>           {BUF_BIN,         iris_vpu_dec_bin_size         },
>>> -        {BUF_COMV,        iris_vpu_dec_comv_size        },
>>> +        {BUF_COMV,        iris_vpu3x_4x_dec_comv_size   },
>>>           {BUF_NON_COMV,    iris_vpu_dec_non_comv_size    },
>>>           {BUF_LINE,        iris_vpu4x_dec_line_size      },
>>>           {BUF_PERSIST,     iris_vpu4x_dec_persist_size   },
>>> diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h b/
>>> drivers/media/platform/qcom/iris/iris_vpu_buffer.h
>>> index 12640eb5ed8c..7a9cc1c92da3 100644
>>> --- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
>>> +++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
>>> @@ -110,6 +110,7 @@ struct iris_inst;
>>>   #define MAX_PE_NBR_DATA_LCU16_LINE_BUFFER_SIZE 96
>>>   #define AV1D_NUM_HW_PIC_BUF    16
>>>   #define AV1D_NUM_FRAME_HEADERS 16
>>> +#define AV1D_COMV_BUFFER_OVERHEAD 7
>>
>> Whats this ? Why is there a new seven byte overhead ? Does it represent a
>> header, an alignment ?
>
> Vishnu, pls check if we need to add this as initial count was 18 [1] ? What if
> initial count was 11 [2], and post DRC, fw_min_count would be 11 too for AV1d,
> so the overhead can be avoided.
>
> [1]
> https://elixir.bootlin.com/linux/v7.0-rc7/source/drivers/media/platform/qcom/iris/iris_hfi_gen2_command.c#L1220
>
> [2]
> https://elixir.bootlin.com/linux/v7.0-rc7/source/drivers/media/platform/qcom/iris/iris_vpu_buffer.c#L2157
>

Thanks for the information, will check and update in next revision.

> Regards,
> Vikash
>>
>> An overhead can mean anything.
>>
>>>   #define SIZE_AV1D_SEQUENCE_HEADER 768
>>>   #define SIZE_AV1D_METADATA        512
>>>   #define SIZE_AV1D_FRAME_HEADER    1280
>>>
>>> ---
>>> base-commit: 4fbeef21f5387234111b5d52924e77757626faa5
>>> change-id: 20260421-optimize_comv_buffer-ae7107673609
>>>
>>> Best regards,
>>> -- 
>>> Vishnu Reddy <busanna.reddy@oss.qualcomm.com>
>>>
>>
>>
>

  reply	other threads:[~2026-04-24  9:17 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <lWwJ9pbXoZXg350L9fA8Sx-qznLud6KXnJlBTFNBLZQXEwKZeI50KGzJPDq43FO2QtbisF9pyxxeVTXX-WvN0Q==@protonmail.internalid>
2026-04-21  6:41 ` [PATCH] media: iris: optimize COMV buffer allocation for VPU3x and VPU4x Vishnu Reddy
2026-04-21  9:01   ` Bryan O'Donoghue
2026-04-22  7:42     ` Vikash Garodia
2026-04-24  9:17       ` Vishnu Reddy [this message]
2026-04-24  9:16     ` Vishnu Reddy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ec1a0557-18eb-4b16-7c10-95b69ea7e63a@oss.qualcomm.com \
    --to=busanna.reddy@oss.qualcomm.com \
    --cc=abhinav.kumar@linux.dev \
    --cc=bod@kernel.org \
    --cc=dikshita.agarwal@oss.qualcomm.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=mchehab@kernel.org \
    --cc=vikash.garodia@oss.qualcomm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox