From: "Niklas Söderlund" <niklas.soderlund@ragnatech.se>
To: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: linux-media@vger.kernel.org, linux-renesas-soc@vger.kernel.org,
Kieran Bingham <kieran.bingham@ideasonboard.com>,
Jacopo Mondi <jacopo.mondi@ideasonboard.com>,
stable@vger.kernel.org
Subject: Re: [PATCH] media: renesas: vsp1: Fix race condition when stopping display pipeline
Date: Tue, 12 May 2026 18:43:39 +0200
Message-ID: <20260512164339.GB332351@ragnatech.se>
In-Reply-To: <20260512153557.GA332351@ragnatech.se>

Hello again,

On 2026-05-12 17:36:01 +0200, Niklas Söderlund wrote:
> Hi Laurent,
>
> Thanks for your work.
>
> On 2026-05-12 01:38:32 +0300, Laurent Pinchart wrote:
> > When stopping a display pipeline, the vsp1_du_setup_lif() function first
> > stops the hardware by calling vsp1_pipeline_stop(), and then resets
> > drm_pipe->du_complete to NULL. Stopping the hardware ensures no new
> > interrupt is generated, but does not wait for completion of any running
> > interrupt handler. This creates a race with vsp1_du_pipeline_frame_end()
> > which tests drm_pipe->du_complete before calling the function pointer.
> > If the drm_pipe->du_complete pointer is reset to NULL between those two
> > operations, a NULL pointer dereference will occur.
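
(Restating the race as code for my own understanding; the skeletons below
are made up, only the two function names, vsp1_pipeline_stop() and the
du_complete pointer come from the commit message.)

void vsp1_du_setup_lif(...)                     /* stop path */
{
        ...
        vsp1_pipeline_stop(pipe);       /* stops the HW, but does not
                                         * wait for a running handler */
        drm_pipe->du_complete = NULL;
        ...
}

void vsp1_du_pipeline_frame_end(...)            /* IRQ path */
{
        ...
        if (drm_pipe->du_complete)      /* may see non-NULL here ... */
                drm_pipe->du_complete(...);     /* ... NULL by the time
                                                 * it is called: oops */
        ...
}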
> >
> > Fix this by setting pipe->state to STOPPING before stopping the
> > hardware, and avoid calling the frame end handler if the state is not
> > RUNNING. Condition the latter check on the display pipeline, as the
> > other pipelines use a different stop procedure that waits for the frame
> > end handler to set the state to STOPPED.
> >
> > The state check needs to be protected by the pipe->irqlock. The lock is
> > used by the video and vspx completion handlers already, so move it one
> > level up to vsp1_pipeline_frame_end().
>
> Running with this patch and the still out-of-tree ISP driver I hit a
> lockdep splat. It might be an issue in the ISP driver, I will dig some
> more, but for reference I'm logging the splat here. My stress test also
> seems to trigger the actual deadlock.
I did some more digging, and indeed the deadlock is caused by the
combination of the R-Car ISP driver and this change. The usage pattern in
the ISP driver is roughly:
void prepare_next_job_for_vspx(...)
{
        lockdep_assert_held(&core->lock);

        /*
         * Collect resources protected by core->lock into the next
         * VSPX job.
         */
        myjob = ...;

        if (vsp1_isp_job_run(myjob)) {
                /* Error path, clean up. */
                vsp1_isp_job_release(myjob);
                /* Clean up resources protected by core->lock. */
        }
}

void risp_vspx_frame_end_callback(...)
{
        /* I am called by the VSPX from vsp1_pipeline_frame_end(). */
        guard(spinlock_irqsave)(&core->lock);

        /*
         * Act on resources protected by core->lock that the VSPX is
         * now done processing.
         */
        prepare_next_job_for_vspx(...);
}

void start_work_by_queueing_first_job_to_vspx(...)
{
        /*
         * I'm called at stream-on time in this example to get the
         * ball rolling.
         */
        guard(spinlock_irqsave)(&core->lock);
        prepare_next_job_for_vspx(...);
}
To spell out the cycle lockdep reports below: with this patch applied the
frame end path takes pipe->irqlock and then core->lock (in the user
callback), while the stream-on path takes core->lock and then
pipe->irqlock (inside vsp1_isp_job_run()), a classic AB-BA inversion.

The R-Car ISP driver could possibly be reworked to release core->lock
before calling vsp1_isp_job_run(). But it would then need to retake the
lock in the error path, which seems messy.
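
Roughly (a sketch of the rework only; the flags plumbing and the
revalidation after retaking the lock are elided, and the callers could no
longer use guard() for core->lock, which is where the messiness comes
from):

void prepare_next_job_for_vspx(...)
{
        lockdep_assert_held(&core->lock);

        myjob = ...;    /* collect resources under core->lock */

        /* Drop the lock around the call that takes pipe->irqlock. */
        spin_unlock_irqrestore(&core->lock, flags);
        ret = vsp1_isp_job_run(myjob);
        spin_lock_irqsave(&core->lock, flags);

        if (ret) {
                /*
                 * Error path: the state may have changed while the
                 * lock was dropped, so everything collected above
                 * must be revalidated before it is cleaned up.
                 */
                vsp1_isp_job_release(myjob);
        }
}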
If at all possible, do you think this patch could be reworked to call the
user registered callback... (see below)
>
> [ 32.111727] ======================================================
> [ 32.112507] WARNING: possible circular locking dependency detected
> [ 32.113287] 7.1.0-rc1-arm64-renesas-02551-g0fc35b78411a #17 Not tainted
> [ 32.114122] ------------------------------------------------------
> [ 32.114900] swapper/0/0 is trying to acquire lock:
> [ 32.115506] ffff000442e1ef30 (&core->lock){-...}-{3:3}, at: risp_core_svspx_frame_end+0x24/0x60
> [ 32.116624]
> but task is already holding lock:
> [ 32.117358] ffff000442e16cc8 (&pipe->irqlock){-...}-{3:3}, at: vsp1_pipeline_frame_end+0x4c/0xc0
> [ 32.118478]
> which lock already depends on the new lock.
>
> [ 32.119511]
> the existing dependency chain (in reverse order) is:
> [ 32.120457]
> -> #1 (&pipe->irqlock){-...}-{3:3}:
> [ 32.121223] lock_acquire+0x27c/0x3fc
> [ 32.121764] _raw_spin_lock_irqsave+0x58/0x80
> [ 32.122392] vsp1_isp_job_run+0x84/0xf8
> [ 32.122952] risp_core_job_run+0x1e4/0x2a4
> [ 32.123540] risp_core_start_streaming+0x3d8/0x440
> [ 32.124215] risp_io_start_streaming+0x64/0xe0
> [ 32.124846] vb2_start_streaming+0x64/0x168
> [ 32.125449] vb2_core_streamon+0xd0/0x1b8
> [ 32.126028] vb2_ioctl_streamon+0x50/0x8c
> [ 32.126606] v4l_streamon+0x20/0x28
> [ 32.127120] __video_do_ioctl+0x344/0x3f0
> [ 32.127698] video_usercopy+0x2e8/0x9f0
> [ 32.128254] video_ioctl2+0x14/0x80
> [ 32.128766] v4l2_ioctl+0x3c/0x60
> [ 32.129254] __arm64_sys_ioctl+0x88/0xe0
> [ 32.129825] invoke_syscall.constprop.0+0x3c/0x100
> [ 32.130504] el0_svc_common.constprop.0+0x34/0xcc
> [ 32.131170] do_el0_svc+0x18/0x20
> [ 32.131661] el0_svc+0x3c/0x2b8
> [ 32.132131] el0t_64_sync_handler+0x98/0xe0
> [ 32.132731] el0t_64_sync+0x154/0x158
> [ 32.133265]
> -> #0 (&core->lock){-...}-{3:3}:
> [ 32.133999] check_prev_add+0x10c/0xda0
> [ 32.134554] __lock_acquire+0x129c/0x1584
> [ 32.135131] lock_acquire+0x27c/0x3fc
> [ 32.135665] _raw_spin_lock_irqsave+0x58/0x80
> [ 32.136286] risp_core_svspx_frame_end+0x24/0x60
> [ 32.136939] vsp1_vspx_pipeline_frame_end+0x1c/0x28
> [ 32.137626] vsp1_pipeline_frame_end+0xa8/0xc0
> [ 32.138260] vsp1_irq_handler+0xfc/0x12c
> [ 32.138828] __handle_irq_event_percpu+0xa8/0x4cc
> [ 32.139497] handle_irq_event+0x40/0x100
> [ 32.140065] handle_fasteoi_irq+0xe8/0x210
> [ 32.140655] handle_irq_desc+0x30/0x58
> [ 32.141202] generic_handle_domain_irq+0x14/0x1c
> [ 32.141857] gic_handle_irq+0x50/0xe0
> [ 32.142389] call_on_irq_stack+0x30/0x60
> [ 32.142955] do_interrupt_handler+0x78/0x7c
> [ 32.143555] el1_interrupt+0x34/0x50
> [ 32.144079] el1h_64_irq_handler+0x14/0x1c
> [ 32.144668] el1h_64_irq+0x6c/0x70
> [ 32.145168] cpuidle_enter_state+0xf4/0x440
> [ 32.145769] cpuidle_enter+0x30/0x40
> [ 32.146295] do_idle+0x16c/0x2d0
> [ 32.146776] cpu_startup_entry+0x30/0x40
> [ 32.147342] kernel_init+0x0/0x130
> [ 32.147842] do_one_initcall+0x0/0x248
> [ 32.148392] __primary_switched+0x88/0x90
> [ 32.148970]
> other info that might help us debug this:
>
> [ 32.149983] Possible unsafe locking scenario:
>
> [ 32.150741] CPU0 CPU1
> [ 32.151325] ---- ----
> [ 32.151908] lock(&pipe->irqlock);
> [ 32.152366] lock(&core->lock);
> [ 32.153110] lock(&pipe->irqlock);
> [ 32.153886] lock(&core->lock);
> [ 32.154311]
> *** DEADLOCK ***
>
> [ 32.155073] 1 lock held by swapper/0/0:
> [ 32.155572] #0: ffff000442e16cc8 (&pipe->irqlock){-...}-{3:3}, at: vsp1_pipeline_frame_end+0x4c/0xc0
> [ 32.156780]
> stack backtrace:
> [ 32.157347] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 7.1.0-rc1-arm64-renesas-02551-g0fc35b78411a #17 PREEMPT
> [ 32.157359] Hardware name: Retronix Sparrow Hawk board based on r8a779g3 (DT)
> [ 32.157365] Call trace:
> [ 32.157369] show_stack+0x14/0x1c (C)
> [ 32.157388] dump_stack_lvl+0x6c/0x90
> [ 32.157396] dump_stack+0x14/0x1c
> [ 32.157404] print_circular_bug+0x254/0x2a0
> [ 32.157413] check_noncircular+0x170/0x190
> [ 32.157422] check_prev_add+0x10c/0xda0
> [ 32.157431] __lock_acquire+0x129c/0x1584
> [ 32.157440] lock_acquire+0x27c/0x3fc
> [ 32.157449] _raw_spin_lock_irqsave+0x58/0x80
> [ 32.157460] risp_core_svspx_frame_end+0x24/0x60
> [ 32.157468] vsp1_vspx_pipeline_frame_end+0x1c/0x28
> [ 32.157480] vsp1_pipeline_frame_end+0xa8/0xc0
> [ 32.157494] vsp1_irq_handler+0xfc/0x12c
> [ 32.157507] __handle_irq_event_percpu+0xa8/0x4cc
> [ 32.157522] handle_irq_event+0x40/0x100
> [ 32.157537] handle_fasteoi_irq+0xe8/0x210
> [ 32.157547] handle_irq_desc+0x30/0x58
> [ 32.157560] generic_handle_domain_irq+0x14/0x1c
> [ 32.157574] gic_handle_irq+0x50/0xe0
> [ 32.157581] call_on_irq_stack+0x30/0x60
> [ 32.157590] do_interrupt_handler+0x78/0x7c
> [ 32.157600] el1_interrupt+0x34/0x50
> [ 32.157611] el1h_64_irq_handler+0x14/0x1c
> [ 32.157623] el1h_64_irq+0x6c/0x70
> [ 32.157630] cpuidle_enter_state+0xf4/0x440 (P)
> [ 32.157645] cpuidle_enter+0x30/0x40
> [ 32.157655] do_idle+0x16c/0x2d0
> [ 32.157665] cpu_startup_entry+0x30/0x40
> [ 32.157675] kernel_init+0x0/0x130
> [ 32.157682] do_one_initcall+0x0/0x248
> [ 32.157694] __primary_switched+0x88/0x90
>
> >
> > Fixes: d7ade201ae7f ("v4l: vsp1: Extend VSP1 module API to allow DRM callbacks")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> > ---
> >
> > I have noticed this race condition while debugging a separate issue and
> > adding printk() statements in the display pipeline frame end. I have
> > tested the fix with the DU test suite and VSP test suite, exercising
> > both the display and video pipelines. I'm fairly confident that the VSPX
> > pipeline won't be negatively affected, but it would be good to
> > double-check that. Jacopo, Niklas, would you be able to give it a test?
> >
> > ---
> > drivers/media/platform/renesas/vsp1/vsp1_pipe.c | 12 ++++++++++--
> > drivers/media/platform/renesas/vsp1/vsp1_video.c | 5 -----
> > drivers/media/platform/renesas/vsp1/vsp1_vspx.c | 13 +++++--------
> > 3 files changed, 15 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/media/platform/renesas/vsp1/vsp1_pipe.c b/drivers/media/platform/renesas/vsp1/vsp1_pipe.c
> > index 5d769cc42fe1..aaec1aa15091 100644
> > --- a/drivers/media/platform/renesas/vsp1/vsp1_pipe.c
> > +++ b/drivers/media/platform/renesas/vsp1/vsp1_pipe.c
> > @@ -509,6 +509,10 @@ int vsp1_pipeline_stop(struct vsp1_pipeline *pipe)
> > * When using display lists in continuous frame mode the only
> > * way to stop the pipeline is to reset the hardware.
> > */
> > + scoped_guard(spinlock_irqsave, &pipe->irqlock) {
> > + pipe->state = VSP1_PIPELINE_STOPPING;
> > + }
> > +
> > ret = vsp1_reset_wpf(vsp1, pipe->output->entity.index);
> > if (ret == 0) {
> > spin_lock_irqsave(&pipe->irqlock, flags);
> > @@ -583,8 +587,12 @@ void vsp1_pipeline_frame_end(struct vsp1_pipeline *pipe)
> > * Regardless of frame completion we still need to notify the pipe
> > * frame_end to account for vblank events.
> > */
> > - if (pipe->frame_end)
> > - pipe->frame_end(pipe, flags);
> > + scoped_guard(spinlock_irqsave, &pipe->irqlock) {
> > + if (pipe->state == VSP1_PIPELINE_RUNNING || !pipe->lif) {
> > + if (pipe->frame_end)
> > + pipe->frame_end(pipe, flags);
.. here without holding the pipe->irqlock? AFAIK this would be safe, as
the user is in control of when the next VSPX job is scheduled. It would
IMHO also make the API nicer, as the user would be able to hold its own
locks when queueing new jobs from the frame_end callback.
I'm willing to try to rework the R-Car ISP driver for this, but I will
await your feedback on whether you think it would be safe to avoid this
issue in the API design between the VSPX provider and its user.
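
Concretely, something along these lines in vsp1_pipeline_frame_end()
(just a sketch of the idea; whether the window between the check and the
call is acceptable for the DU path is exactly the question):

        bool notify;

        /* Sample the state under the lock ... */
        scoped_guard(spinlock_irqsave, &pipe->irqlock)
                notify = pipe->state == VSP1_PIPELINE_RUNNING || !pipe->lif;

        /* ... but invoke the handler without holding pipe->irqlock. */
        if (notify && pipe->frame_end)
                pipe->frame_end(pipe, flags);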
> > + }
> > + }
> >
> > pipe->sequence++;
> > }
> > diff --git a/drivers/media/platform/renesas/vsp1/vsp1_video.c b/drivers/media/platform/renesas/vsp1/vsp1_video.c
> > index fe1dac11d4ae..a8db94bdb670 100644
> > --- a/drivers/media/platform/renesas/vsp1/vsp1_video.c
> > +++ b/drivers/media/platform/renesas/vsp1/vsp1_video.c
> > @@ -325,14 +325,11 @@ static void vsp1_video_pipeline_frame_end(struct vsp1_pipeline *pipe,
> > {
> > struct vsp1_device *vsp1 = pipe->output->entity.vsp1;
> > enum vsp1_pipeline_state state;
> > - unsigned long flags;
> > unsigned int i;
> >
> > /* M2M Pipelines should never call here with an incomplete frame. */
> > WARN_ON_ONCE(!(completion & VSP1_DL_FRAME_END_COMPLETED));
> >
> > - spin_lock_irqsave(&pipe->irqlock, flags);
> > -
> > /* Complete buffers on all video nodes. */
> > for (i = 0; i < vsp1->info->rpf_count; ++i) {
> > if (!pipe->inputs[i])
> > @@ -354,8 +351,6 @@ static void vsp1_video_pipeline_frame_end(struct vsp1_pipeline *pipe,
> > wake_up(&pipe->wq);
> > else if (vsp1_pipeline_ready(pipe))
> > vsp1_video_pipeline_run(pipe);
> > -
> > - spin_unlock_irqrestore(&pipe->irqlock, flags);
> > }
> >
> > static int vsp1_video_pipeline_build_branch(struct vsp1_pipeline *pipe,
> > diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> > index 1673479be0ff..26c477708858 100644
> > --- a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> > +++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> > @@ -176,14 +176,11 @@ static void vsp1_vspx_pipeline_frame_end(struct vsp1_pipeline *pipe,
> > {
> > struct vsp1_vspx_pipeline *vspx_pipe = to_vsp1_vspx_pipeline(pipe);
> >
> > - scoped_guard(spinlock_irqsave, &pipe->irqlock) {
> > - /*
> > - * Operating the vsp1_pipe in singleshot mode requires to
> > - * manually set the pipeline state to stopped when a transfer
> > - * is completed.
> > - */
> > - pipe->state = VSP1_PIPELINE_STOPPED;
> > - }
> > + /*
> > + * Operating the vsp1_pipe in singleshot mode requires to manually set
> > + * the pipeline state to stopped when a transfer is completed.
> > + */
> > + pipe->state = VSP1_PIPELINE_STOPPED;
> >
> > if (vspx_pipe->vspx_frame_end)
> > vspx_pipe->vspx_frame_end(vspx_pipe->frame_end_data);
> >
> > base-commit: bc1ba628e37c93cf2abeb2c79716f49087f8a024
> > --
> > Regards,
> >
> > Laurent Pinchart
> >
>
> --
> Kind Regards,
> Niklas Söderlund
--
Kind Regards,
Niklas Söderlund