Devicetree
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Kyrie Wu" <kyrie.wu@mediatek.com>
Cc: conor+dt@kernel.org, robh@kernel.org, devicetree@vger.kernel.org
Subject: Re: [PATCH v15 08/12] media: mediatek: jpeg: fix remove buffer removal timing for multi-core
Date: Thu, 02 Jul 2026 07:54:48 +0000	[thread overview]
Message-ID: <20260702075449.1A6691F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260702072614.10373-9-kyrie.wu@mediatek.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

New issues:
- [High] In `mtk_jpegenc_worker`, `v4l2_m2m_src_buf_remove()` and `v4l2_m2m_dst_buf_remove()` are incorrectly called after `mtk_jpeg_enc_start()`. This introduces a severe race condition where an early hardware completion interrupt can process the buffers before they are removed from the ready queue.

Pre-existing issues:
- [High] The job timeout delayed work is scheduled before acquiring the hardware lock, creating a window where it can execute prematurely and lead to DMA memory corruption or list corruption.
- [High] Concurrent execution of the hardware IRQ handler and the timeout work leads to list corruption and state desynchronization.
- [Medium] `mtk_jpegenc_worker` leaks a PM runtime usage counter reference on error paths.
--

commit 4ce8f75cccbcd44894ec25eb120040d2e022ccdb
Author: Kyrie Wu <kyrie.wu@mediatek.com>

media: mediatek: jpeg: fix remove buffer removal timing for multi-core
    
Move calls to v4l2_m2m_src/dst_buf_remove() inside of the spinlock
protected scope to ensure all necessary operations are performed
before buffers are removed from their queues and ensure proper
synchronization of buffer handling to avoid buffer lost.

> diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> index 02efaed198b85..0b9a3b66895a0 100644
> --- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> +++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> @@ -1653,9 +1653,6 @@ static void mtk_jpegenc_worker(struct work_struct *work)
>  		goto enc_end;
>  	}

[Severity: Medium]
This is a pre-existing issue, but does pm_runtime_get_sync() increment
the usage counter even on failure?

If it fails here in mtk_jpegenc_worker() and jumps to enc_end without
dropping the reference, couldn't this leak the usage counter and prevent
the device from suspending?

Would using pm_runtime_put_noidle() in the error path fix this?

> -	v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
> -	v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> -
>  	schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work,
>  			      msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));

[Severity: High]
This is a pre-existing issue, but by scheduling the timeout work before
acquiring the hw_lock, could the timeout expire while the worker is
preempted?

If the worker thread is preempted or suspended longer than the timeout
period, the lockless timeout handler could execute, reset the hardware,
and return the buffer to userspace.

When the worker thread resumes, it blindly acquires the lock and configures
the hardware with pointers to the now-completed buffers. Could this result
in the hardware forcing DMA into memory that has already been freed or
reused by userspace?

This same pattern appears in mtk_jpegdec_worker() as well.

> @@ -1673,6 +1670,8 @@ static void mtk_jpegenc_worker(struct work_struct *work)
>  			     &src_buf->vb2_buf);
>  	mtk_jpeg_set_enc_params(ctx, comp_jpeg[hw_id]->reg_base);
>  	mtk_jpeg_enc_start(comp_jpeg[hw_id]->reg_base);
> +	v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
> +	v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);

[Severity: High]
By moving the buffer removal after mtk_jpeg_enc_start(), does this create a
race condition where the hardware completion interrupt processes the buffers
before they are removed from the V4L2 M2M ready queue?

If the hardware finishes instantly and triggers mtk_jpegenc_hw_irq_handler()
on another CPU, the IRQ handler will complete the buffer and return it to
userspace.

If userspace quickly dequeues and requeues it via VIDIOC_QBUF before the
worker thread calls v4l2_m2m_src_buf_remove(), the framework might call
list_add_tail() on a node already linked in the ready queue. Could this
corrupt the list pointers and lead to a kernel panic?

Should the buffers be removed from the queues before starting the hardware
to avoid this window, similar to how it is correctly handled in the decoder
modifications below?

>  	v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
>  	spin_unlock_irqrestore(&comp_jpeg[hw_id]->hw_lock, flags);

[ ... ]


[Severity: High]
This is a pre-existing issue not touched by this patch, but could there be a
race condition between the hardware IRQ handler and the timeout work?

In drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c,
mtk_jpegenc_hw_irq_handler() calls:

    cancel_delayed_work(&jpeg->job_timeout_work);

If the timer expires at the exact same moment the hardware finishes,
cancel_delayed_work() will return false, meaning the timeout work could
execute concurrently on another CPU.

Since both threads process the same buffers and simultaneously call
mtk_jpegenc_put_buf() (which executes list_add_tail()), could this corrupt
the list structure by adding the same node concurrently? Would it also lead
to double-incrementing master_jpeg->hw_rdy and double-decrementing PM
runtime?

This same vulnerability appears to exist in mtk_jpegdec_hw_irq_handler()
for the decoder.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260702072614.10373-1-kyrie.wu@mediatek.com?part=8

  reply	other threads:[~2026-07-02  7:54 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-02  7:26 [PATCH v15 00/12] Enable jpeg enc & dec multi-hardwares for MT8196 Kyrie Wu
2026-07-02  7:26 ` [PATCH v15 01/12] media: mediatek: jpeg: fix jpeg cores' amounts setting Kyrie Wu
2026-07-02  7:41   ` sashiko-bot
2026-07-02  7:26 ` [PATCH v15 02/12] media: mediatek: jpeg: fix jpeg buffer payload size setting Kyrie Wu
2026-07-02  7:50   ` sashiko-bot
2026-07-02  7:26 ` [PATCH v15 03/12] media: mediatek: jpeg: fix buffer structure size and layout Kyrie Wu
2026-07-02  7:26 ` [PATCH v15 04/12] media: mediatek: jpeg: Fix buffer completion on multi-core streaming stop Kyrie Wu
2026-07-02  7:44   ` sashiko-bot
2026-07-02  7:26 ` [PATCH v15 05/12] media: mediatek: jpeg: Fix multi-core clk suspend and resume setting Kyrie Wu
2026-07-02  7:45   ` sashiko-bot
2026-07-02  7:26 ` [PATCH v15 06/12] media: mediatek: jpeg: fix decoding buffer number setting timing issue Kyrie Wu
2026-07-02  7:41   ` sashiko-bot
2026-07-02  7:26 ` [PATCH v15 07/12] media: mediatek: jpeg: fix resolution change event handling in decoder Kyrie Wu
2026-07-02  7:48   ` sashiko-bot
2026-07-02  7:26 ` [PATCH v15 08/12] media: mediatek: jpeg: fix remove buffer removal timing for multi-core Kyrie Wu
2026-07-02  7:54   ` sashiko-bot [this message]
2026-07-02  7:26 ` [PATCH v15 09/12] media: dt-bindings: mediatek,jpeg: Add mediatek, mt8196-jpgdec compatible Kyrie Wu
2026-07-02  7:26 ` [PATCH v15 10/12] media: dt-bindings: mediatek,jpeg: Add mediatek, mt8196-jpgenc compatible Kyrie Wu
2026-07-02  7:49   ` sashiko-bot
2026-07-02  7:26 ` [PATCH v15 11/12] media: mediatek: jpeg: add jpeg compatible Kyrie Wu
2026-07-02  7:51   ` sashiko-bot
2026-07-02  7:26 ` [PATCH v15 12/12] media: mediatek: jpeg: add jpeg smmu sid setting Kyrie Wu
2026-07-02  7:56   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260702075449.1A6691F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=kyrie.wu@mediatek.com \
    --cc=robh@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox