Devicetree
From: sashiko-bot@kernel.org
To: "Kyrie Wu" <kyrie.wu@mediatek.com>
Cc: robh@kernel.org, devicetree@vger.kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH v15 06/12] media: mediatek: jpeg: fix decoding buffer number setting timing issue
Date: Thu, 02 Jul 2026 07:41:22 +0000
Message-ID: <20260702074122.DC90D1F00A3A@smtp.kernel.org>
In-Reply-To: <20260702072614.10373-7-kyrie.wu@mediatek.com>

Thank you for your contribution! Sashiko AI review found 3 potential issues to consider:

New issues:
- [High] Delayed initialization of `jpeg_dst_buf` metadata introduces a race condition with the timeout worker.

Pre-existing issues:
- [High] PM usage counter leak in `mtk_jpegenc_worker` on `pm_runtime_get_sync()` failure.
- [High] Double completion and severe list corruption due to IRQ handler and timeout worker race condition.
--

commit 938df83523dee529816f6765923d0c66f07e6d22
Author: Kyrie Wu <kyrie.wu@mediatek.com>

media: mediatek: jpeg: fix decoding buffer number setting timing issue

The patch delays the initialization of curr_ctx and frame_num for jpeg_dst_buf
by moving it into a spinlock-protected section in mtk_jpegdec_worker,
aiming to ensure any later operations act on an accurate state.

> diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> index 89048aba8dca2..4dc574e03bd5a 100644
> --- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> +++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> @@ -1743,11 +1742,6 @@ static void mtk_jpegdec_worker(struct work_struct *work)
>  		goto getbuf_fail;
>  	}
>  
> -	jpeg_src_buf->curr_ctx = ctx;
> -	jpeg_src_buf->frame_num = ctx->total_frame_num;
> -	jpeg_dst_buf->curr_ctx = ctx;
> -	jpeg_dst_buf->frame_num = ctx->total_frame_num;
> -
>  	mtk_jpegdec_set_hw_param(ctx, hw_id, src_buf, dst_buf);
>  	ret = pm_runtime_resume_and_get(comp_jpeg[hw_id]->dev);
>  	if (ret < 0) {

[Severity: High]
This isn't a bug introduced by this patch, but while the decoder worker uses
pm_runtime_resume_and_get() here, the encoder worker still uses
pm_runtime_get_sync().

In drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:mtk_jpegenc_worker(),
if pm_runtime_get_sync() fails, it jumps to enc_end without dropping the usage
counter via pm_runtime_put_noidle(). Because pm_runtime_get_sync() increments
the usage counter even when it returns an error, the count is leaked and the
device can no longer be runtime-suspended.

Does it make sense to update mtk_jpegenc_worker() to use
pm_runtime_resume_and_get() as well?
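
To make the counter imbalance concrete, here is a minimal userspace model of
the two call patterns. It is only a sketch: struct fake_dev, the fake_*
helpers, and both worker functions are hypothetical stand-ins, not driver or
kernel code. The one kernel behaviour it relies on is that
pm_runtime_get_sync() leaves the usage counter incremented on failure, while
pm_runtime_resume_and_get() drops it again before returning the error.

```c
#include <assert.h>
#include <errno.h>

/* Userspace stand-in for the runtime-PM state of one device (hypothetical). */
struct fake_dev {
	int usage_count;  /* models the runtime-PM usage counter */
	int resume_error; /* 0, or the negative errno a resume should return */
};

/* Models pm_runtime_get_sync(): the counter is bumped even on failure. */
int fake_pm_runtime_get_sync(struct fake_dev *dev)
{
	dev->usage_count++;
	return dev->resume_error;
}

/* Models pm_runtime_put_noidle(): drop one reference, no idle callback. */
void fake_pm_runtime_put_noidle(struct fake_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Models pm_runtime_resume_and_get(): undo the increment on failure. */
int fake_pm_runtime_resume_and_get(struct fake_dev *dev)
{
	int ret = fake_pm_runtime_get_sync(dev);

	if (ret < 0) {
		fake_pm_runtime_put_noidle(dev);
		return ret;
	}
	return 0;
}

/* Pattern described for the encoder worker: bail out without a put. */
int leaky_worker(struct fake_dev *dev)
{
	int ret = fake_pm_runtime_get_sync(dev);

	if (ret < 0)
		return ret; /* usage_count stays elevated: the leak */
	return 0;           /* job in flight; reference dropped on completion */
}

/* Same worker using the resume_and_get pattern. */
int balanced_worker(struct fake_dev *dev)
{
	int ret = fake_pm_runtime_resume_and_get(dev);

	if (ret < 0)
		return ret; /* counter already restored */
	return 0;
}
```

With a device whose resume fails, leaky_worker() returns the error and leaves
usage_count at 1, whereas balanced_worker() returns the same error with
usage_count back at 0.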

> @@ -1772,6 +1766,9 @@ static void mtk_jpegdec_worker(struct work_struct *work)
>  			      msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));
>  
>  	spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags);
> +	jpeg_dst_buf = mtk_jpeg_vb2_to_srcbuf(&dst_buf->vb2_buf);
> +	jpeg_dst_buf->curr_ctx = ctx;
> +	jpeg_dst_buf->frame_num = ctx->total_frame_num;

[Severity: High]
Does this delayed initialization introduce a race condition with the timeout
worker?

By moving the initialization of jpeg_dst_buf->curr_ctx and frame_num here,
they are now set after schedule_delayed_work() has already been called above.
If the worker thread is preempted between that call and spin_lock_irqsave(),
the timeout worker could run and read hw_param.curr_ctx without taking the
lock, observing an uninitialized context or a stale frame number.
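
The ordering concern can be illustrated with a small deterministic userspace
model. This is a sketch under assumptions, not the driver's code: all names
below (fake_ctx, fake_buf, timeout_peek, and the two submit_* functions) are
hypothetical, and the "timeout handler" is simulated by a plain function call
placed at the point where the real delayed work could run in the worst case.

```c
#include <assert.h>
#include <stddef.h>

struct fake_ctx {
	int total_frame_num;
};

/* Per-buffer metadata that a timeout path would consult (hypothetical). */
struct fake_buf {
	struct fake_ctx *curr_ctx;
	int frame_num;
};

/* Snapshot of what a timeout handler would see at the moment it runs. */
struct timeout_view {
	struct fake_ctx *ctx;
	int frame_num;
};

struct timeout_view timeout_peek(const struct fake_buf *buf)
{
	struct timeout_view v = { buf->curr_ctx, buf->frame_num };

	return v;
}

void publish_metadata(struct fake_buf *buf, struct fake_ctx *ctx)
{
	assert(ctx != NULL);
	buf->curr_ctx = ctx;
	buf->frame_num = ctx->total_frame_num;
}

/*
 * Ordering questioned above: the timer is armed first and the metadata is
 * written afterwards. The peek models the handler running in that window.
 */
struct timeout_view submit_arm_then_publish(struct fake_buf *buf,
					    struct fake_ctx *ctx)
{
	struct timeout_view seen = timeout_peek(buf); /* handler runs here */

	publish_metadata(buf, ctx);
	return seen;
}

/* Alternative ordering: metadata is complete before the timer can fire. */
struct timeout_view submit_publish_then_arm(struct fake_buf *buf,
					    struct fake_ctx *ctx)
{
	publish_metadata(buf, ctx);
	return timeout_peek(buf); /* earliest point the handler could run */
}
```

In the first ordering the simulated handler sees whatever the buffer held
before submission; in the second it always sees the current context and frame
number. In the real driver the window is only hit if the worker stalls for the
whole timeout period, so the model deliberately forces the worst case.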

>  	ctx->total_frame_num++;
>  	mtk_jpeg_dec_reset(comp_jpeg[hw_id]->reg_base);
>  	mtk_jpeg_dec_set_config(comp_jpeg[hw_id]->reg_base,
> diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
> index 9a8dbca6af006..e4d2c5d4ec73d 100644
> --- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
> +++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
> @@ -513,6 +513,7 @@ static void mtk_jpegdec_put_buf(struct mtk_jpegdec_comp_dev *jpeg)
>  				v4l2_m2m_buf_done(&tmp_dst_done_buf->b,
>  						  VB2_BUF_STATE_DONE);
>  				ctx->last_done_frame_num++;
> +				break;
>  			}
>  		}
>  	}

[Severity: High]
This is a pre-existing issue, but can the IRQ handler and timeout worker
concurrently execute mtk_jpegdec_put_buf() and corrupt the dst_done_queue?

The hardware IRQ handler mtk_jpegdec_hw_irq_handler() calls
cancel_delayed_work(&jpeg->job_timeout_work). If the timeout worker is
already running on another CPU, it will not be stopped:
cancel_delayed_work() does not wait for a running callback to finish; only
cancel_delayed_work_sync() does.

If both concurrently call mtk_jpegdec_put_buf() (and similarly
mtk_jpegenc_put_buf() in the encoder), they will both execute
list_add_tail(&dst_done_buf->list, &ctx->dst_done_queue). Adding a node
that is already on the list overwrites its next and prev pointers,
creating a circular link that can hang the kernel during traversal.

Are there plans to synchronize these paths?
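
One possible shape for such synchronization is a "first caller wins" flag that
both paths test and clear under the same lock. The userspace sketch below is
an assumption-laden illustration: struct fake_job, its in_flight field, and
complete_job_once() are hypothetical, locking is not modelled, and the list
helpers are a minimal copy of the kernel's pointer updates without its debug
checks. It also shows what a double list_add_tail() does to a node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace copy of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Same pointer updates as the kernel's list_add_tail(), no debug checks. */
void list_add_tail(struct list_head *new_node, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = new_node;
	new_node->next = head;
	new_node->prev = prev;
	prev->next = new_node;
}

/* Hypothetical job state; in_flight is not a field of the real driver. */
struct fake_job {
	struct list_head done_entry;
	bool in_flight; /* set when hardware starts, cleared on completion */
};

/*
 * Both the IRQ path and the timeout path would call this while holding the
 * same lock (not modelled here). Only the first caller queues the buffer.
 */
bool complete_job_once(struct fake_job *job, struct list_head *done_queue)
{
	assert(job != NULL && done_queue != NULL);
	if (!job->in_flight)
		return false; /* the other path already completed this job */
	job->in_flight = false;
	list_add_tail(&job->done_entry, done_queue);
	return true;
}
```

Calling complete_job_once() twice on the same job queues it once and returns
false the second time. By contrast, calling list_add_tail() twice on a node
that is the list's tail leaves that node's next and prev pointing at itself,
so a forward traversal never gets back to the head.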

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260702072614.10373-1-kyrie.wu@mediatek.com?part=6

