From: sashiko-bot@kernel.org
To: "Kyrie Wu" <kyrie.wu@mediatek.com>
Cc: devicetree@vger.kernel.org, robh@kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH v15 05/12] media: mediatek: jpeg: Fix multi-core clk suspend and resume setting
Date: Thu, 02 Jul 2026 07:45:04 +0000
Message-ID: <20260702074505.03BCD1F000E9@smtp.kernel.org>
In-Reply-To: <20260702072614.10373-6-kyrie.wu@mediatek.com>

Thank you for your contribution! Sashiko AI review found 8 potential issues to consider:

New issues:
- [High] Component-level system resume callbacks unpause the shared V4L2 M2M queue prematurely, causing jobs to be dispatched to components that are still in system suspend.
- [High] Missing pm_runtime_disable() on the probe error path leaves runtime PM enabled.
- [Medium] Double acquisition of clocks via devm_clk_bulk_get() causes duplicate resources and leaks original references.
- [Low] Unconditional assignment of dev_pm_ops produces dead code when PM is disabled.

Pre-existing issues:
- [Critical] Race condition between timeout work and IRQ handler leads to double-completion and PM usage underflow.
- [High] Unfixed pm_runtime_get_sync() usage counter leak in the encoder worker error path.
- [High] The worker threads call v4l2_m2m_job_finish() immediately after starting the hardware, leading to a UAF upon device release.
- [High] Missed wakeup in IRQ handler due to incorrect ordering of wake_up() and atomic_inc().
--

commit 5cd8fa923e48c4448d2b551ff3153b7043979096
Author: Kyrie Wu <kyrie.wu@mediatek.com>
Subject: media: mediatek: jpeg: Fix multi-core clk suspend and resume setting

This commit adds component-level suspend and resume callbacks for the MediaTek
JPEG multi-core architecture. It registers power management operations for
each hardware driver to handle clock preparation and power management across
system sleep states.

> diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> --- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
> +++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c

[ ... ]

> @@ -1647,13 +1653,6 @@ static void mtk_jpegenc_worker(struct work_struct *work)
>  		goto enc_end;
>  	}

[Severity: High]
This is a pre-existing issue, but if pm_runtime_get_sync() fails earlier in
mtk_jpegenc_worker(), does the error path to enc_end skip calling
pm_runtime_put_noidle() and leak the usage counter?

The same fix applied to mtk_jpegdec_worker() using pm_runtime_resume_and_get()
does not appear to be applied here.
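
As a rough illustration of the counter semantics in question, here is a
small userspace model. It is not kernel code: every toy_* name is made up
for this sketch. toy_get_sync() stands in for pm_runtime_get_sync(), which
leaves the usage count raised even when the resume fails, and
toy_resume_and_get() stands in for pm_runtime_resume_and_get(), which drops
the count again on failure.

```c
/* Toy stand-in for a device's runtime PM usage counter (not kernel code). */
struct toy_dev {
	int usage_count;
	int resume_fails;	/* non-zero: simulate a failing resume */
};

/* Models pm_runtime_get_sync(): the count is raised even if resume fails. */
static int toy_get_sync(struct toy_dev *dev)
{
	dev->usage_count++;
	return dev->resume_fails ? -1 : 0;
}

/* Models pm_runtime_put_noidle(): drop the count, no idle check. */
static void toy_put_noidle(struct toy_dev *dev)
{
	dev->usage_count--;
}

/* Models pm_runtime_resume_and_get(): undo the increment on failure. */
static int toy_resume_and_get(struct toy_dev *dev)
{
	int ret = toy_get_sync(dev);

	if (ret < 0)
		toy_put_noidle(dev);
	return ret;
}

/* Leaky worker: the error path jumps out without balancing the get. */
static int toy_worker_leaky(struct toy_dev *dev)
{
	if (toy_get_sync(dev) < 0)
		goto end;		/* bug: count stays raised */

	/* ... hardware would be programmed here ... */
	toy_put_noidle(dev);		/* stands in for the put on completion */
end:
	return dev->usage_count;
}

/* Fixed worker: same flow, but the failed get is already balanced. */
static int toy_worker_fixed(struct toy_dev *dev)
{
	if (toy_resume_and_get(dev) < 0)
		goto end;

	/* ... hardware would be programmed here ... */
	toy_put_noidle(dev);
end:
	return dev->usage_count;
}
```

With resume_fails set, the leaky worker returns with the count at 1 while
the fixed one returns with it back at 0.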

[Severity: High]
This is a pre-existing issue, but does calling v4l2_m2m_job_finish()
immediately after starting the hardware cause a use-after-free upon device
release?

In mtk_jpegenc_worker() and mtk_jpegdec_worker(), the hardware starts and
then the worker calls v4l2_m2m_job_finish() prematurely:

    mtk_jpeg_enc_start(comp_jpeg[hw_id]->reg_base);
    v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
    v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
    v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);

Because the job has already been marked finished, v4l2_m2m_ctx_release()
and stop_streaming() do not wait for the hardware when a user closes the
file descriptor. The interrupt handler could then call v4l2_m2m_buf_done(),
immediately waking up the waiter in stop_streaming(). This allows the
context to be freed while the interrupt handler is still manipulating it,
causing a use-after-free write.

> -	ret = clk_prepare_enable(comp_jpeg[hw_id]->venc_clk.clks->clk);
> -	if (ret) {
> -		dev_err(jpeg->dev, "%s : %d, jpegenc clk_prepare_enable fail\n",
> -			__func__, __LINE__);
> -		goto enc_end;
> -	}
> -
>  	v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
>  	v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);

[ ... ]

> diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
> --- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
> +++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c

[ ... ]

> @@ -547,7 +546,6 @@ static irqreturn_t mtk_jpegdec_hw_irq_handler(int irq, void *priv)
>  	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>  	struct mtk_jpeg_src_buf *jpeg_src_buf;
>  	enum vb2_buffer_state buf_state;
> -	struct mtk_jpeg_ctx *ctx;
>  	u32 dec_irq_ret;
>  	u32 irq_status;
>  	int i;
> @@ -557,7 +555,6 @@ static irqreturn_t mtk_jpegdec_hw_irq_handler(int irq, void *priv)
>  
>  	cancel_delayed_work(&jpeg->job_timeout_work);

[Severity: Critical]
This is a pre-existing issue, but does calling cancel_delayed_work() instead
of cancel_delayed_work_sync() here cause a race condition if the timeout
handler is concurrently executing on another processor?

Since neither path takes a lock to serialize completion, both could proceed
to reset the hardware, call v4l2_m2m_buf_done() on the same buffer, increment
the ready counter twice, and invoke pm_runtime_put(). Could this corrupt the
buffer linked list or cause a power management usage counter underflow?
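
Note that cancel_delayed_work_sync() can sleep, so it is not an option if
this handler runs in hard interrupt context. One possible alternative is to
let both paths race for ownership of the job and have only the winner
complete it. The following userspace model sketches that idea with C11
atomics. All toy_* names are hypothetical; in the driver the same effect
could come from a spinlock or an atomic compare-and-exchange on the
hardware state.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum toy_hw_state { TOY_HW_IDLE, TOY_HW_RUNNING };

/* Toy model of one JPEG core (not kernel code). */
struct toy_hw {
	atomic_int state;
	int completions;	/* how many times the buffer was completed */
	int pm_usage;		/* stands in for the runtime PM usage count */
};

static void toy_hw_init(struct toy_hw *hw)
{
	atomic_init(&hw->state, TOY_HW_IDLE);
	hw->completions = 0;
	hw->pm_usage = 0;
}

/* Worker side: take a PM reference and mark the core busy. */
static void toy_start_job(struct toy_hw *hw)
{
	hw->pm_usage++;
	atomic_store(&hw->state, TOY_HW_RUNNING);
}

/*
 * Called by both the IRQ path and the timeout path. Only the caller
 * that flips RUNNING -> IDLE owns the completion; the loser backs off
 * without touching the buffer or the PM count.
 */
static bool toy_complete_job(struct toy_hw *hw)
{
	int expected = TOY_HW_RUNNING;

	if (!atomic_compare_exchange_strong(&hw->state, &expected,
					    TOY_HW_IDLE))
		return false;

	hw->completions++;	/* stands in for v4l2_m2m_buf_done() */
	hw->pm_usage--;		/* stands in for pm_runtime_put() */
	return true;
}
```

Whichever path calls toy_complete_job() second gets false back, so the
buffer is completed once and the usage count cannot go below zero.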

> -	ctx = jpeg->hw_param.curr_ctx;

[ ... ]

> @@ -580,12 +577,11 @@ static irqreturn_t mtk_jpegdec_hw_irq_handler(int irq, void *priv)
>  	buf_state = VB2_BUF_STATE_DONE;
>  	v4l2_m2m_buf_done(src_buf, buf_state);
>  	mtk_jpegdec_put_buf(jpeg);
> -	pm_runtime_put(ctx->jpeg->dev);
> -	clk_disable_unprepare(jpeg->jdec_clk.clks->clk);
>  
>  	jpeg->hw_state = MTK_JPEG_HW_IDLE;
>  	wake_up(&master_jpeg->hw_wq);
>  	atomic_inc(&master_jpeg->hw_rdy);

[Severity: High]
This is a pre-existing issue, but does calling wake_up() immediately before
atomic_inc() cause a missed wakeup?

If the waiting worker thread wakes up and checks atomic_read() before the
increment happens, it will go back to sleep. Since the subsequent atomic_inc()
is not followed by another wake_up(), could the worker thread hang
indefinitely?
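
The ordering problem can be modelled in a few lines of userspace C. This is
only a sketch of the worst-case interleaving, assuming the worker sleeps on
a condition like hw_rdy > 0 and re-checks it as soon as it is woken. The
toy_* names are invented for the example.

```c
#include <stdbool.h>

/* Toy model of the master device's wait queue state (not kernel code). */
struct toy_master {
	int hw_rdy;		/* stands in for atomic hw_rdy */
	bool worker_runnable;	/* set once the sleeper saw hw_rdy > 0 */
};

/*
 * Models wake_up() hitting a wait_event() sleeper: the sleeper
 * re-checks its condition and goes back to sleep if it is still false.
 */
static void toy_wake_up(struct toy_master *m)
{
	if (m->hw_rdy > 0)
		m->worker_runnable = true;
}

/* Ordering in the handler as posted: wake first, then increment. */
static void toy_notify_wake_first(struct toy_master *m)
{
	toy_wake_up(m);
	m->hw_rdy++;
}

/* Swapped ordering: publish the condition, then wake the sleeper. */
static void toy_notify_inc_first(struct toy_master *m)
{
	m->hw_rdy++;
	toy_wake_up(m);
}
```

In this model toy_notify_wake_first() leaves the worker asleep even though
hw_rdy is now 1, while toy_notify_inc_first() makes it runnable.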

> +	pm_runtime_put(jpeg->dev);
>  
>  	return IRQ_HANDLED;
>  }

[ ... ]

> @@ -668,15 +664,69 @@ static int mtk_jpegdec_hw_probe(struct platform_device *pdev)
>  
>  	platform_set_drvdata(pdev, dev);
>  	pm_runtime_enable(&pdev->dev);
> +	ret = devm_clk_bulk_get(dev->dev,
> +				jpegdec_clk->clk_num,
> +				jpegdec_clk->clks);

[Severity: Medium]
Does this second devm_clk_bulk_get() overwrite the clocks already acquired
by devm_clk_bulk_get_all() earlier in the probe function?

This appears to acquire the clocks a second time, overwriting the array
pointers and leaking the original references until device unbind.

> +	if (ret) {
> +		dev_err(&pdev->dev, "Failed to init clk\n");
> +		return ret;
> +	}

[Severity: High]
If devm_clk_bulk_get() fails here, does the probe function return without
disabling the runtime power management?

Returning directly leaves runtime PM enabled for a device that failed to
probe. Should pm_runtime_disable() be called in this error path?
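
A minimal userspace sketch of the usual goto-based unwinding follows. The
toy_* functions are stubs standing in for pm_runtime_enable(),
pm_runtime_disable() and devm_clk_bulk_get(); none of this is the driver's
actual code. In the kernel, devm_pm_runtime_enable() might be another way
to get the same cleanup without an explicit label.

```c
#include <stdbool.h>

/* Toy device state (not kernel code). */
struct toy_dev {
	bool rpm_enabled;	/* is runtime PM currently enabled? */
	bool clk_get_fails;	/* simulate a clock lookup failure */
};

/* Stub for pm_runtime_enable(). */
static void toy_pm_runtime_enable(struct toy_dev *d)
{
	d->rpm_enabled = true;
}

/* Stub for pm_runtime_disable(). */
static void toy_pm_runtime_disable(struct toy_dev *d)
{
	d->rpm_enabled = false;
}

/* Stub for devm_clk_bulk_get(): negative value on failure. */
static int toy_clk_bulk_get(struct toy_dev *d)
{
	return d->clk_get_fails ? -2 : 0;
}

/* Probe that undoes the runtime PM enable when a later step fails. */
static int toy_probe(struct toy_dev *d)
{
	int ret;

	toy_pm_runtime_enable(d);

	ret = toy_clk_bulk_get(d);
	if (ret)
		goto err_disable_rpm;

	return 0;

err_disable_rpm:
	toy_pm_runtime_disable(d);
	return ret;
}
```

After a failed toy_probe() the device ends up with rpm_enabled false again,
which is the balance the error path above is missing.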

> +
> +static int mtk_jpegdec_resume(struct device *dev)
> +{
> +	struct mtk_jpegdec_comp_dev *jpeg = dev_get_drvdata(dev);
> +	int ret;
> +
> +	ret = pm_runtime_force_resume(dev);
> +	if (ret < 0)
> +		return ret;
> +
> +	v4l2_m2m_resume(jpeg->master_dev->m2m_dev);

[Severity: High]
When a component resumes, does calling v4l2_m2m_resume() on the single shared
master queue unpause it prematurely?

If the shared queue is unpaused while sibling components are still suspended,
could a job be immediately dispatched to a suspended sibling, causing
pm_runtime_resume_and_get() to fail or block improperly?
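
One way to avoid this, sketched here as a userspace model, is to count
suspended components and only unpause the shared queue when the last one
has resumed. The toy_* names are hypothetical, and a real driver would need
a lock or atomic counter around this state. Handling the queue from the
master device's own PM callbacks may be a simpler alternative.

```c
#include <stdbool.h>

/* Toy model of the master device shared by all cores (not kernel code). */
struct toy_master {
	int suspended;		/* components currently suspended */
	bool queue_paused;	/* state of the shared M2M queue */
};

/* Component suspend: the first one to suspend pauses the queue. */
static void toy_comp_suspend(struct toy_master *m)
{
	if (m->suspended++ == 0)
		m->queue_paused = true;	/* stands in for v4l2_m2m_suspend() */
}

/* Component resume: only the last one to resume unpauses the queue. */
static void toy_comp_resume(struct toy_master *m)
{
	if (--m->suspended == 0)
		m->queue_paused = false; /* stands in for v4l2_m2m_resume() */
}
```

With two components suspended, the first toy_comp_resume() leaves the queue
paused, and the second one releases it.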

> +
> +	return 0;
> +}
> +
> +static const struct dev_pm_ops mtk_jpegdec_pm_ops = {
> +	SYSTEM_SLEEP_PM_OPS(mtk_jpegdec_suspend, mtk_jpegdec_resume)
> +	RUNTIME_PM_OPS(mtk_jpegdec_pm_suspend, mtk_jpegdec_pm_resume, NULL)
> +};
> +
>  static struct platform_driver mtk_jpegdec_hw_driver = {
>  	.probe = mtk_jpegdec_hw_probe,
>  	.driver = {
>  		.name = "mtk-jpegdec-hw",
>  		.of_match_table = mtk_jpegdec_hw_ids,
> +		.pm             = &mtk_jpegdec_pm_ops,

[Severity: Low]
Does unconditionally assigning mtk_jpegdec_pm_ops produce dead code when
power management is disabled?

The pm_ptr() helper from <linux/pm.h> exists for this case: wrapping the
assignment lets the compiler drop the ops structure and its callbacks when
power management support is disabled.
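
The effect of such a wrapper can be shown with a small userspace model.
toy_pm_ptr() and TOY_CONFIG_PM are stand-ins invented for this sketch; they
mimic how pm_ptr() yields the pointer when CONFIG_PM is enabled and NULL
otherwise, while still referencing the ops so nothing is reported unused.

```c
#include <stddef.h>

/* Stand-in for CONFIG_PM; defaults to "PM disabled" in this sketch. */
#ifndef TOY_CONFIG_PM
#define TOY_CONFIG_PM 0
#endif

/* Models pm_ptr(): the pointer if PM is built in, NULL otherwise. */
#define toy_pm_ptr(ptr) (TOY_CONFIG_PM ? (ptr) : NULL)

/* Toy versions of dev_pm_ops and the driver structure. */
struct toy_pm_ops {
	int (*suspend)(void);
	int (*resume)(void);
};

struct toy_driver {
	const struct toy_pm_ops *pm;
};

static int toy_suspend(void)
{
	return 0;
}

static int toy_resume(void)
{
	return 0;
}

static const struct toy_pm_ops toy_ops = {
	.suspend = toy_suspend,
	.resume = toy_resume,
};

/*
 * The ops are still referenced, so the compiler sees them, but with
 * TOY_CONFIG_PM == 0 the pointer is a constant NULL and the structure
 * and its callbacks can be discarded as unreachable.
 */
static const struct toy_driver toy_drv = {
	.pm = toy_pm_ptr(&toy_ops),
};
```

In the patch itself this would amount to writing
.pm = pm_ptr(&mtk_jpegdec_pm_ops) in the platform_driver initializer.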

>  	},
>  };

[ ... ]

> diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c
> --- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c
> +++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c

[ ... ]

> @@ -304,12 +303,11 @@ static irqreturn_t mtk_jpegenc_hw_irq_handler(int irq, void *priv)
>  	buf_state = VB2_BUF_STATE_DONE;
>  	v4l2_m2m_buf_done(src_buf, buf_state);
>  	mtk_jpegenc_put_buf(jpeg);
> -	pm_runtime_put(ctx->jpeg->dev);
> -	clk_disable_unprepare(jpeg->venc_clk.clks->clk);
>  
>  	jpeg->hw_state = MTK_JPEG_HW_IDLE;
>  	wake_up(&master_jpeg->hw_wq);
>  	atomic_inc(&master_jpeg->hw_rdy);

[Severity: Critical]
This is a pre-existing issue, but does calling cancel_delayed_work() earlier
in this interrupt handler cause a race condition if the timeout handler is
executing concurrently? Both paths could proceed to reset the hardware and call
v4l2_m2m_buf_done() on the same buffer locklessly.

[Severity: High]
This is also a pre-existing issue, but does calling wake_up() immediately
before atomic_inc() here cause a missed wakeup if the worker thread checks the
counter before the increment?

> +	pm_runtime_put(jpeg->dev);
>  
>  	return IRQ_HANDLED;
>  }

[ ... ]

> @@ -390,15 +388,69 @@ static int mtk_jpegenc_hw_probe(struct platform_device *pdev)
>  
>  	platform_set_drvdata(pdev, dev);
>  	pm_runtime_enable(&pdev->dev);
> +	ret = devm_clk_bulk_get(dev->dev,
> +				jpegenc_clk->clk_num,
> +				jpegenc_clk->clks);

[Severity: Medium]
Similarly for the encoder, does this devm_clk_bulk_get() overwrite the clocks
acquired by devm_clk_bulk_get_all() earlier in the probe function?

> +	if (ret) {
> +		dev_err(&pdev->dev, "Failed to init clk\n");
> +		return ret;
> +	}

[Severity: High]
If devm_clk_bulk_get() fails here, does this error path also skip calling
pm_runtime_disable()?

> +
> +static int mtk_jpegenc_resume(struct device *dev)
> +{
> +	struct mtk_jpegenc_comp_dev *jpeg = dev_get_drvdata(dev);
> +	int ret;
> +
> +	ret = pm_runtime_force_resume(dev);
> +	if (ret < 0)
> +		return ret;
> +
> +	v4l2_m2m_resume(jpeg->master_dev->m2m_dev);

[Severity: High]
Like the decoder, does calling v4l2_m2m_resume() here unpause the shared
master queue prematurely while other siblings might still be suspended?

> +
> +	return 0;
> +}
> +
> +static const struct dev_pm_ops mtk_jpegenc_pm_ops = {
> +	SYSTEM_SLEEP_PM_OPS(mtk_jpegenc_suspend, mtk_jpegenc_resume)
> +	RUNTIME_PM_OPS(mtk_jpegenc_pm_suspend, mtk_jpegenc_pm_resume, NULL)
> +};
> +
>  static struct platform_driver mtk_jpegenc_hw_driver = {
>  	.probe = mtk_jpegenc_hw_probe,
>  	.driver = {
>  		.name = "mtk-jpegenc-hw",
>  		.of_match_table = mtk_jpegenc_drv_ids,
> +		.pm = &mtk_jpegenc_pm_ops,

[Severity: Low]
Should this assignment be wrapped with pm_ptr() to avoid dead code when
power management is disabled?

>  	},
>  };

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260702072614.10373-1-kyrie.wu@mediatek.com?part=5


