Linux Media Controller development
 help / color / mirror / Atom feed
From: "Niklas Söderlund" <niklas.soderlund@ragnatech.se>
To: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: linux-media@vger.kernel.org, linux-renesas-soc@vger.kernel.org,
	Kieran Bingham <kieran.bingham@ideasonboard.com>,
	Jacopo Mondi <jacopo.mondi@ideasonboard.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH] media: renesas: vsp1: Fix race condition when stopping display pipeline
Date: Tue, 12 May 2026 17:35:57 +0200	[thread overview]
Message-ID: <20260512153557.GA332351@ragnatech.se> (raw)
In-Reply-To: <20260511223832.3385049-1-laurent.pinchart+renesas@ideasonboard.com>

Hi Laurent,

Thanks for your work.

On 2026-05-12 01:38:32 +0300, Laurent Pinchart wrote:
> When stopping a display pipeline, the vsp1_du_setup_lif() function first
> stops the hardware by calling vsp1_pipeline_stop(), and then resets
> drm_pipe->du_complete to NULL. Stopping the hardware ensures no new
> interrupt is generated, but does not wait for completion of any running
> interrupt handler. This creates a race with vsp1_du_pipeline_frame_end()
> which tests drm_pipe->du_complete before calling the function pointer.
> If the drm_pipe->du_complete pointer is reset to NULL between those two
> operations, a NULL pointer derefence will occur.
> 
> Fix this by setting pipe->state to STOPPING before stopping the
> hardware, and avoid calling the frame end handler if the state is not
> RUNNING. Condition the latter to the display pipeline, as the other
> pipeline use a different stop procedure that waits for the frame end
> handler to set the state to STOPPED.
> 
> The state check needs to be protected by the pipe->irqlock. The lock is
> used by the video and vspx completion handlers already, so move it one
> level up to vsp1_pipeline_frame_end().

Running with this and the still out-of-tree ISP driver I hit a splat. It 
might be an issue in the ISP driver, I will dig some more. But for 
reference I log the splat here. My stress test also seems to trigger the 
deadlock.

[   32.111727] ======================================================
[   32.112507] WARNING: possible circular locking dependency detected
[   32.113287] 7.1.0-rc1-arm64-renesas-02551-g0fc35b78411a #17 Not tainted
[   32.114122] ------------------------------------------------------
[   32.114900] swapper/0/0 is trying to acquire lock:
[   32.115506] ffff000442e1ef30 (&core->lock){-...}-{3:3}, at: risp_core_svspx_frame_end+0x24/0x60
[   32.116624]
               but task is already holding lock:
[   32.117358] ffff000442e16cc8 (&pipe->irqlock){-...}-{3:3}, at: vsp1_pipeline_frame_end+0x4c/0xc0
[   32.118478]
               which lock already depends on the new lock.

[   32.119511]
               the existing dependency chain (in reverse order) is:
[   32.120457]
               -> #1 (&pipe->irqlock){-...}-{3:3}:
[   32.121223]        lock_acquire+0x27c/0x3fc
[   32.121764]        _raw_spin_lock_irqsave+0x58/0x80
[   32.122392]        vsp1_isp_job_run+0x84/0xf8
[   32.122952]        risp_core_job_run+0x1e4/0x2a4
[   32.123540]        risp_core_start_streaming+0x3d8/0x440
[   32.124215]        risp_io_start_streaming+0x64/0xe0
[   32.124846]        vb2_start_streaming+0x64/0x168
[   32.125449]        vb2_core_streamon+0xd0/0x1b8
[   32.126028]        vb2_ioctl_streamon+0x50/0x8c
[   32.126606]        v4l_streamon+0x20/0x28
[   32.127120]        __video_do_ioctl+0x344/0x3f0
[   32.127698]        video_usercopy+0x2e8/0x9f0
[   32.128254]        video_ioctl2+0x14/0x80
[   32.128766]        v4l2_ioctl+0x3c/0x60
[   32.129254]        __arm64_sys_ioctl+0x88/0xe0
[   32.129825]        invoke_syscall.constprop.0+0x3c/0x100
[   32.130504]        el0_svc_common.constprop.0+0x34/0xcc
[   32.131170]        do_el0_svc+0x18/0x20
[   32.131661]        el0_svc+0x3c/0x2b8
[   32.132131]        el0t_64_sync_handler+0x98/0xe0
[   32.132731]        el0t_64_sync+0x154/0x158
[   32.133265]
               -> #0 (&core->lock){-...}-{3:3}:
[   32.133999]        check_prev_add+0x10c/0xda0
[   32.134554]        __lock_acquire+0x129c/0x1584
[   32.135131]        lock_acquire+0x27c/0x3fc
[   32.135665]        _raw_spin_lock_irqsave+0x58/0x80
[   32.136286]        risp_core_svspx_frame_end+0x24/0x60
[   32.136939]        vsp1_vspx_pipeline_frame_end+0x1c/0x28
[   32.137626]        vsp1_pipeline_frame_end+0xa8/0xc0
[   32.138260]        vsp1_irq_handler+0xfc/0x12c
[   32.138828]        __handle_irq_event_percpu+0xa8/0x4cc
[   32.139497]        handle_irq_event+0x40/0x100
[   32.140065]        handle_fasteoi_irq+0xe8/0x210
[   32.140655]        handle_irq_desc+0x30/0x58
[   32.141202]        generic_handle_domain_irq+0x14/0x1c
[   32.141857]        gic_handle_irq+0x50/0xe0
[   32.142389]        call_on_irq_stack+0x30/0x60
[   32.142955]        do_interrupt_handler+0x78/0x7c
[   32.143555]        el1_interrupt+0x34/0x50
[   32.144079]        el1h_64_irq_handler+0x14/0x1c
[   32.144668]        el1h_64_irq+0x6c/0x70
[   32.145168]        cpuidle_enter_state+0xf4/0x440
[   32.145769]        cpuidle_enter+0x30/0x40
[   32.146295]        do_idle+0x16c/0x2d0
[   32.146776]        cpu_startup_entry+0x30/0x40
[   32.147342]        kernel_init+0x0/0x130
[   32.147842]        do_one_initcall+0x0/0x248
[   32.148392]        __primary_switched+0x88/0x90
[   32.148970]
               other info that might help us debug this:

[   32.149983]  Possible unsafe locking scenario:

[   32.150741]        CPU0                    CPU1
[   32.151325]        ----                    ----
[   32.151908]   lock(&pipe->irqlock);
[   32.152366]                                lock(&core->lock);
[   32.153110]                                lock(&pipe->irqlock);
[   32.153886]   lock(&core->lock);
[   32.154311]
                *** DEADLOCK ***

[   32.155073] 1 lock held by swapper/0/0:
[   32.155572]  #0: ffff000442e16cc8 (&pipe->irqlock){-...}-{3:3}, at: vsp1_pipeline_frame_end+0x4c/0xc0
[   32.156780]
               stack backtrace:
[   32.157347] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 7.1.0-rc1-arm64-renesas-02551-g0fc35b78411a #17 PREEMPT
[   32.157359] Hardware name: Retronix Sparrow Hawk board based on r8a779g3 (DT)
[   32.157365] Call trace:
[   32.157369]  show_stack+0x14/0x1c (C)
[   32.157388]  dump_stack_lvl+0x6c/0x90
[   32.157396]  dump_stack+0x14/0x1c
[   32.157404]  print_circular_bug+0x254/0x2a0
[   32.157413]  check_noncircular+0x170/0x190
[   32.157422]  check_prev_add+0x10c/0xda0
[   32.157431]  __lock_acquire+0x129c/0x1584
[   32.157440]  lock_acquire+0x27c/0x3fc
[   32.157449]  _raw_spin_lock_irqsave+0x58/0x80
[   32.157460]  risp_core_svspx_frame_end+0x24/0x60
[   32.157468]  vsp1_vspx_pipeline_frame_end+0x1c/0x28
[   32.157480]  vsp1_pipeline_frame_end+0xa8/0xc0
[   32.157494]  vsp1_irq_handler+0xfc/0x12c
[   32.157507]  __handle_irq_event_percpu+0xa8/0x4cc
[   32.157522]  handle_irq_event+0x40/0x100
[   32.157537]  handle_fasteoi_irq+0xe8/0x210
[   32.157547]  handle_irq_desc+0x30/0x58
[   32.157560]  generic_handle_domain_irq+0x14/0x1c
[   32.157574]  gic_handle_irq+0x50/0xe0
[   32.157581]  call_on_irq_stack+0x30/0x60
[   32.157590]  do_interrupt_handler+0x78/0x7c
[   32.157600]  el1_interrupt+0x34/0x50
[   32.157611]  el1h_64_irq_handler+0x14/0x1c
[   32.157623]  el1h_64_irq+0x6c/0x70
[   32.157630]  cpuidle_enter_state+0xf4/0x440 (P)
[   32.157645]  cpuidle_enter+0x30/0x40
[   32.157655]  do_idle+0x16c/0x2d0
[   32.157665]  cpu_startup_entry+0x30/0x40
[   32.157675]  kernel_init+0x0/0x130
[   32.157682]  do_one_initcall+0x0/0x248
[   32.157694]  __primary_switched+0x88/0x90

> 
> Fixes: d7ade201ae7f ("v4l: vsp1: Extend VSP1 module API to allow DRM callbacks")
> Cc: stable@vger.kernel.org
> Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> ---
> 
> I have noticed this race condition while debugging a separate issue and
> adding printk() statements in the display pipeline frame end. I have
> tested the fix with the DU test suite and VSP test suite, exercising
> both the display and video pipelines. I'm fairly confident that the VSPX
> pipeline won't be negatively affected, but it would be good to
> double-check that. Jacopo, Niklas, would you be able to give test it ?
> 
> ---
>  drivers/media/platform/renesas/vsp1/vsp1_pipe.c  | 12 ++++++++++--
>  drivers/media/platform/renesas/vsp1/vsp1_video.c |  5 -----
>  drivers/media/platform/renesas/vsp1/vsp1_vspx.c  | 13 +++++--------
>  3 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/media/platform/renesas/vsp1/vsp1_pipe.c b/drivers/media/platform/renesas/vsp1/vsp1_pipe.c
> index 5d769cc42fe1..aaec1aa15091 100644
> --- a/drivers/media/platform/renesas/vsp1/vsp1_pipe.c
> +++ b/drivers/media/platform/renesas/vsp1/vsp1_pipe.c
> @@ -509,6 +509,10 @@ int vsp1_pipeline_stop(struct vsp1_pipeline *pipe)
>  		 * When using display lists in continuous frame mode the only
>  		 * way to stop the pipeline is to reset the hardware.
>  		 */
> +		scoped_guard(spinlock_irqsave, &pipe->irqlock) {
> +			pipe->state = VSP1_PIPELINE_STOPPING;
> +		}
> +
>  		ret = vsp1_reset_wpf(vsp1, pipe->output->entity.index);
>  		if (ret == 0) {
>  			spin_lock_irqsave(&pipe->irqlock, flags);
> @@ -583,8 +587,12 @@ void vsp1_pipeline_frame_end(struct vsp1_pipeline *pipe)
>  	 * Regardless of frame completion we still need to notify the pipe
>  	 * frame_end to account for vblank events.
>  	 */
> -	if (pipe->frame_end)
> -		pipe->frame_end(pipe, flags);
> +	scoped_guard(spinlock_irqsave, &pipe->irqlock) {
> +		if (pipe->state == VSP1_PIPELINE_RUNNING || !pipe->lif) {
> +			if (pipe->frame_end)
> +				pipe->frame_end(pipe, flags);
> +		}
> +	}
>  
>  	pipe->sequence++;
>  }
> diff --git a/drivers/media/platform/renesas/vsp1/vsp1_video.c b/drivers/media/platform/renesas/vsp1/vsp1_video.c
> index fe1dac11d4ae..a8db94bdb670 100644
> --- a/drivers/media/platform/renesas/vsp1/vsp1_video.c
> +++ b/drivers/media/platform/renesas/vsp1/vsp1_video.c
> @@ -325,14 +325,11 @@ static void vsp1_video_pipeline_frame_end(struct vsp1_pipeline *pipe,
>  {
>  	struct vsp1_device *vsp1 = pipe->output->entity.vsp1;
>  	enum vsp1_pipeline_state state;
> -	unsigned long flags;
>  	unsigned int i;
>  
>  	/* M2M Pipelines should never call here with an incomplete frame. */
>  	WARN_ON_ONCE(!(completion & VSP1_DL_FRAME_END_COMPLETED));
>  
> -	spin_lock_irqsave(&pipe->irqlock, flags);
> -
>  	/* Complete buffers on all video nodes. */
>  	for (i = 0; i < vsp1->info->rpf_count; ++i) {
>  		if (!pipe->inputs[i])
> @@ -354,8 +351,6 @@ static void vsp1_video_pipeline_frame_end(struct vsp1_pipeline *pipe,
>  		wake_up(&pipe->wq);
>  	else if (vsp1_pipeline_ready(pipe))
>  		vsp1_video_pipeline_run(pipe);
> -
> -	spin_unlock_irqrestore(&pipe->irqlock, flags);
>  }
>  
>  static int vsp1_video_pipeline_build_branch(struct vsp1_pipeline *pipe,
> diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> index 1673479be0ff..26c477708858 100644
> --- a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> +++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> @@ -176,14 +176,11 @@ static void vsp1_vspx_pipeline_frame_end(struct vsp1_pipeline *pipe,
>  {
>  	struct vsp1_vspx_pipeline *vspx_pipe = to_vsp1_vspx_pipeline(pipe);
>  
> -	scoped_guard(spinlock_irqsave, &pipe->irqlock) {
> -		/*
> -		 * Operating the vsp1_pipe in singleshot mode requires to
> -		 * manually set the pipeline state to stopped when a transfer
> -		 * is completed.
> -		 */
> -		pipe->state = VSP1_PIPELINE_STOPPED;
> -	}
> +	/*
> +	 * Operating the vsp1_pipe in singleshot mode requires to manually set
> +	 * the pipeline state to stopped when a transfer is completed.
> +	 */
> +	pipe->state = VSP1_PIPELINE_STOPPED;
>  
>  	if (vspx_pipe->vspx_frame_end)
>  		vspx_pipe->vspx_frame_end(vspx_pipe->frame_end_data);
> 
> base-commit: bc1ba628e37c93cf2abeb2c79716f49087f8a024
> -- 
> Regards,
> 
> Laurent Pinchart
> 

-- 
Kind Regards,
Niklas Söderlund

  parent reply	other threads:[~2026-05-12 15:36 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 22:38 [PATCH] media: renesas: vsp1: Fix race condition when stopping display pipeline Laurent Pinchart
2026-05-12  7:42 ` Jacopo Mondi
2026-05-12  8:28   ` Laurent Pinchart
2026-05-12 15:35 ` Niklas Söderlund [this message]
2026-05-12 16:43   ` Niklas Söderlund

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260512153557.GA332351@ragnatech.se \
    --to=niklas.soderlund@ragnatech.se \
    --cc=jacopo.mondi@ideasonboard.com \
    --cc=kieran.bingham@ideasonboard.com \
    --cc=laurent.pinchart+renesas@ideasonboard.com \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-renesas-soc@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox