public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] drm/msm: Error path fixes
@ 2025-07-23 19:08 Rob Clark
  2025-07-23 19:08 ` [PATCH 1/2] drm/msm: Fix refcnt underflow in error path Rob Clark
  2025-07-23 19:08 ` [PATCH 2/2] drm/msm: Fix submit error path cleanup Rob Clark
  0 siblings, 2 replies; 8+ messages in thread
From: Rob Clark @ 2025-07-23 19:08 UTC
  To: dri-devel
  Cc: linux-arm-msm, freedreno, Rob Clark, Abhinav Kumar, David Airlie,
	Dmitry Baryshkov, Jessica Zhang, open list, Marijn Suijten,
	Sean Paul, Simona Vetter

For reasons unknown to me, systemd-udev recently started limiting
max-files to 64k (at least in f42), which exposed some problematic
allocation-related error paths.

Rob Clark (2):
  drm/msm: Fix refcnt underflow in error path
  drm/msm: Fix submit error path cleanup

 drivers/gpu/drm/msm/msm_gem.c        | 4 +++-
 drivers/gpu/drm/msm/msm_gem_submit.c | 9 +++++----
 2 files changed, 8 insertions(+), 5 deletions(-)

-- 
2.50.1



* [PATCH 1/2] drm/msm: Fix refcnt underflow in error path
  2025-07-23 19:08 [PATCH 0/2] drm/msm: Error path fixes Rob Clark
@ 2025-07-23 19:08 ` Rob Clark
  2025-09-22 16:32   ` Stephan Gerhold
  2025-07-23 19:08 ` [PATCH 2/2] drm/msm: Fix submit error path cleanup Rob Clark
  1 sibling, 1 reply; 8+ messages in thread
From: Rob Clark @ 2025-07-23 19:08 UTC
  To: dri-devel
  Cc: linux-arm-msm, freedreno, Rob Clark, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Sean Paul, Marijn Suijten,
	David Airlie, Simona Vetter, open list

If we hit an error path in GEM obj creation before msm_gem_new_handle()
updates obj->resv to point to the gpuvm resv object, then obj->resv
still points to &obj->_resv.  In this case we don't want to decrement
the refcount of the object being freed (since the refcnt is already
zero).  This fixes the following splat:

   ------------[ cut here ]------------
   refcount_t: underflow; use-after-free.
   WARNING: CPU: 9 PID: 7013 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
   Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_da>
    qcom_pil_info i2c_hid drm_kms_helper qcom_common qcom_q6v5 phy_snps_eusb2 qcom_geni_serial drm qcom_sysmon pinctrl_s>
   CPU: 9 UID: 1000 PID: 7013 Comm: deqp-vk Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
   Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
   pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
   pc : refcount_warn_saturate+0xf4/0x148
   lr : refcount_warn_saturate+0xf4/0x148
   sp : ffff8000a2073920
   x29: ffff8000a2073920 x28: 0000000000000010 x27: 0000000000000010
   x26: 0000000000000042 x25: ffff000810e09800 x24: 0000000000000010
   x23: ffff8000a2073b94 x22: ffff000ddb22de00 x21: ffff000ddb22dc00
   x20: ffff000ddb22ddf8 x19: ffff0008024934e0 x18: 000000000000000a
   x17: 0000000000000000 x16: ffff9f8c67d77340 x15: 0000000000000000
   x14: 00000000ffffffff x13: 2e656572662d7265 x12: 7466612d65737520
   x11: 3b776f6c66726564 x10: 00000000ffff7fff x9 : ffff9f8c67506c70
   x8 : ffff9f8c69fa26f0 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
   x5 : ffff000f53e14548 x4 : ffff6082ea2b2000 x3 : ffff0008b86ab080
   x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008b86ab080
   Call trace:
    refcount_warn_saturate+0xf4/0x148 (P)
    msm_gem_free_object+0x248/0x260 [msm]
    drm_gem_object_free+0x24/0x40 [drm]
    msm_gem_new+0x1c4/0x1e0 [msm]
    msm_gem_new_handle+0x3c/0x1a0 [msm]
    msm_ioctl_gem_new+0x38/0x70 [msm]
    drm_ioctl_kernel+0xc8/0x138 [drm]
    drm_ioctl+0x2c8/0x618 [drm]
    __arm64_sys_ioctl+0xac/0x108
    invoke_syscall.constprop.0+0x64/0xe8
    el0_svc_common.constprop.0+0x40/0xe8
    do_el0_svc+0x24/0x38
    el0_svc+0x54/0x1d8
    el0t_64_sync_handler+0x10c/0x138
    el0t_64_sync+0x19c/0x1a0
   irq event stamp: 3698694
   hardirqs last  enabled at (3698693): [<ffff9f8c675021dc>] __up_console_sem+0x74/0x90
   hardirqs last disabled at (3698694): [<ffff9f8c68ce8164>] el1_dbg+0x24/0x90
   softirqs last  enabled at (3697578): [<ffff9f8c6744ec5c>] handle_softirqs+0x454/0x4b0
   softirqs last disabled at (3697567): [<ffff9f8c67360244>] __do_softirq+0x1c/0x28
   ---[ end trace 0000000000000000 ]---

Fixes: b58e12a66e47 ("drm/msm: Add _NO_SHARE flag")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/msm_gem.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 33d3354c6102..958bac4e2768 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -1114,10 +1114,12 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
 		put_pages(obj);
 	}
 
-	if (msm_obj->flags & MSM_BO_NO_SHARE) {
+	if (obj->resv != &obj->_resv) {
 		struct drm_gem_object *r_obj =
 			container_of(obj->resv, struct drm_gem_object, _resv);
 
+		WARN_ON(!(msm_obj->flags & MSM_BO_NO_SHARE));
+
 		/* Drop reference we hold to shared resv obj: */
 		drm_gem_object_put(r_obj);
 	}
-- 
2.50.1
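
Editorial sketch (not part of the patch): a minimal userspace model of
the check the diff above switches to, assuming simplified stand-in
types. struct gem_obj, gem_obj_init(), gem_obj_share_resv(),
resv_owner() and gem_obj_free() are hypothetical names used only for
illustration; the obj->resv != &obj->_resv comparison and the
container_of() lookup are the only parts taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures carry far more state. */
struct resv { int unused; };

struct gem_obj {
	int refcount;
	struct resv *resv;	/* reservation object currently in use */
	struct resv _resv;	/* embedded default reservation object */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* A fresh object starts out using its own embedded _resv. */
static void gem_obj_init(struct gem_obj *obj)
{
	obj->refcount = 1;
	obj->resv = &obj->_resv;
}

/* Hypothetical helper: share owner's resv and hold a reference on owner. */
static void gem_obj_share_resv(struct gem_obj *obj, struct gem_obj *owner)
{
	owner->refcount++;
	obj->resv = &owner->_resv;
}

/* Object that embeds the reservation object obj currently points at. */
static struct gem_obj *resv_owner(struct gem_obj *obj)
{
	return container_of(obj->resv, struct gem_obj, _resv);
}

/*
 * Free-path check modelled on the diff: drop the extra reference only
 * when resv points at some other object's embedded _resv.
 * Returns true if a reference on another object was dropped.
 */
static bool gem_obj_free(struct gem_obj *obj)
{
	if (obj->resv != &obj->_resv) {
		resv_owner(obj)->refcount--;
		return true;
	}
	return false;
}
```

While resv still points at the embedded _resv, resv_owner() resolves to
the object itself, so a put that is keyed only on a flag would land on
the object that is already being freed.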



* [PATCH 2/2] drm/msm: Fix submit error path cleanup
  2025-07-23 19:08 [PATCH 0/2] drm/msm: Error path fixes Rob Clark
  2025-07-23 19:08 ` [PATCH 1/2] drm/msm: Fix refcnt underflow in error path Rob Clark
@ 2025-07-23 19:08 ` Rob Clark
  1 sibling, 0 replies; 8+ messages in thread
From: Rob Clark @ 2025-07-23 19:08 UTC
  To: dri-devel
  Cc: linux-arm-msm, freedreno, Rob Clark, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Sean Paul, Marijn Suijten,
	David Airlie, Simona Vetter, open list

submit_unpin_objects() should come before we unlock the objects.  This
fixes the splat:

   WARNING: CPU: 2 PID: 2171 at drivers/gpu/drm/msm/msm_gem.h:395 msm_gem_unpin_locked+0x8c/0xd8 [msm]
   Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_dais q6apm_dai snd_q6dsp_common q6prm snd_q6apm qcom_pd_mapper cdc_mbim cdc_wdm cdc_ncm r8153_ecm cdc_ether usbnet sunrpc nls_ascii nls_cp437 vfat fat snd_soc_x1e80100 snd_soc_lpass_rx_macro snd_soc_lpass_tx_macro snd_soc_lpass_va_macro snd_soc_lpass_wsa_macro snd_soc_qcom_common soundwire_qcom snd_soc_lpass_macro_common snd_soc_hdmi_codec snd_soc_qcom_sdw ext4 snd_soc_core snd_compress soundwire_bus snd_pcm_dmaengine snd_seq mbcache jbd2 snd_seq_device snd_pcm pm8941_pwrkey snd_timer r8152 qcom_spmi_temp_alarm industrialio snd lenovo_yoga_slim7x ath12k mii arm_smccc_trng soundcore rng_core evdev loop panel_samsung_atna33xc20 msm ubwc_config drm_client_lib drm_gpuvm drm_exec gpu_sched drm_display_helper pmic_glink_altmode aux_hpd_bridge ucsi_glink qcom_battmgr phy_qcom_qmp_combo ps883x cec aux_bridge drm_dp_aux_bus i2c_hid_of aes_ce_blk drm_kms_helper aes_ce_cipher i2c_hid qcom_q6v5_pas
    ghash_ce qcom_pil_info drm sha1_ce qcom_common phy_snps_eusb2 qcom_geni_serial qcom_q6v5 qcom_sysmon pinctrl_sm8550_lpass_lpi lpasscc_sc8280xp sbsa_gwdt mdt_loader gpio_keys pmic_glink i2c_dev efivarfs autofs4
   CPU: 2 UID: 1000 PID: 2171 Comm: gnome-shell Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
   Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
   pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
   pc : msm_gem_unpin_locked+0x8c/0xd8 [msm]
   lr : msm_gem_unpin_locked+0x88/0xd8 [msm]
   sp : ffff80009c963820
   x29: ffff80009c963820 x28: ffff80009c9639f8 x27: ffff00080552a830
   x26: 0000000000000000 x25: ffff0009d5655800 x24: 0000000000000000
   x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000000000
   x20: ffff000831db5480 x19: ffff000816e74400 x18: 0000000000000000
   x17: 0000000000000000 x16: ffffc1396afdd720 x15: 0000000000000000
   x14: 0000000000000000 x13: 0000000000000000 x12: ffff0008c065bc00
   x11: ffff0008c065c000 x10: 0000000000000000 x9 : ffffc13945b19074
   x8 : 0000000000000000 x7 : 0000000000000209 x6 : 0000000000000002
   x5 : 0000000000019d01 x4 : ffff0008ba8db080 x3 : 000000000004093f
   x2 : ffff3ed5e727f000 x1 : 0000000000000000 x0 : 0000000000000000
   Call trace:
    msm_gem_unpin_locked+0x8c/0xd8 [msm] (P)
    msm_ioctl_gem_submit+0x32c/0x1760 [msm]
    drm_ioctl_kernel+0xc8/0x138 [drm]
    drm_ioctl+0x2c8/0x618 [drm]
    __arm64_sys_ioctl+0xac/0x108
    invoke_syscall.constprop.0+0x64/0xe8
    el0_svc_common.constprop.0+0x40/0xe8
    do_el0_svc+0x24/0x38
    el0_svc+0x54/0x1d8
    el0t_64_sync_handler+0x10c/0x138
    el0t_64_sync+0x19c/0x1a0
   irq event stamp: 2185036
   hardirqs last  enabled at (2185035): [<ffffc1396afeef9c>] _raw_spin_unlock_irqrestore+0x74/0x80
   hardirqs last disabled at (2185036): [<ffffc1396afd8164>] el1_dbg+0x24/0x90
   softirqs last  enabled at (2184778): [<ffffc13969675e44>] fpsimd_restore_current_state+0x3c/0x328
   softirqs last disabled at (2184776): [<ffffc13969675e14>] fpsimd_restore_current_state+0xc/0x328
   ---[ end trace 0000000000000000 ]---

Fixes: 111fdd2198e6 ("drm/msm: drm_gpuvm conversion")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 5f8e939a5906..0ac4c199ec93 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -514,14 +514,15 @@ static int submit_reloc(struct msm_gem_submit *submit, struct drm_gem_object *ob
  */
 static void submit_cleanup(struct msm_gem_submit *submit, bool error)
 {
+	if (error)
+		submit_unpin_objects(submit);
+
 	if (submit->exec.objects)
 		drm_exec_fini(&submit->exec);
 
-	if (error) {
-		submit_unpin_objects(submit);
-		/* job wasn't enqueued to scheduler, so early retirement: */
+	/* if job wasn't enqueued to scheduler, early retirement: */
+	if (error)
 		msm_submit_retire(submit);
-	}
 }
 
 void msm_submit_retire(struct msm_gem_submit *submit)
-- 
2.50.1
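
Editorial sketch (not part of the patch): a small userspace model of
the ordering the diff above establishes, assuming that the unpin helper
expects the object lock to be held and that drm_exec_fini() is what
releases the locks. struct bo, bo_unpin_locked(), unpin_all(),
unlock_all() and cleanup_on_error() are hypothetical names; the counter
stands in for the kernel warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified buffer object: just a lock flag and a pin count. */
struct bo {
	bool locked;
	int pin_count;
};

/* Counts how often unpin ran without the lock held. */
static int lock_violations;

/* Stand-in for an unpin helper that requires the lock to be held. */
static void bo_unpin_locked(struct bo *bo)
{
	if (!bo->locked)
		lock_violations++;	/* the kernel would WARN here */
	bo->pin_count--;
}

static void unpin_all(struct bo *bos, size_t n)
{
	for (size_t i = 0; i < n; i++)
		bo_unpin_locked(&bos[i]);
}

/* Stand-in for the step that drops every object lock. */
static void unlock_all(struct bo *bos, size_t n)
{
	for (size_t i = 0; i < n; i++)
		bos[i].locked = false;
}

/*
 * Error-path cleanup in either order.
 * Returns the number of unlocked unpins that occurred.
 */
static int cleanup_on_error(struct bo *bos, size_t n, bool unpin_first)
{
	lock_violations = 0;
	if (unpin_first) {
		unpin_all(bos, n);	/* order after the patch */
		unlock_all(bos, n);
	} else {
		unlock_all(bos, n);	/* order before the patch */
		unpin_all(bos, n);
	}
	return lock_violations;
}
```

Both orders leave the pin counts balanced in this model; only the
unpin-first order does so without tripping the lock check.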



* Re: [PATCH 1/2] drm/msm: Fix refcnt underflow in error path
  2025-07-23 19:08 ` [PATCH 1/2] drm/msm: Fix refcnt underflow in error path Rob Clark
@ 2025-09-22 16:32   ` Stephan Gerhold
  2025-09-22 16:41     ` Rob Clark
  0 siblings, 1 reply; 8+ messages in thread
From: Stephan Gerhold @ 2025-09-22 16:32 UTC
  To: Rob Clark
  Cc: dri-devel, linux-arm-msm, freedreno, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Sean Paul, Marijn Suijten,
	David Airlie, Simona Vetter, open list

Hi Rob,

On Wed, Jul 23, 2025 at 12:08:49PM -0700, Rob Clark wrote:
> If we hit an error path in GEM obj creation before msm_gem_new_handle()
> updates obj->resv to point to the gpuvm resv object, then obj->resv
> still points to &obj->_resv.  In this case we don't want to decrement
> the refcount of the object being freed (since the refcnt is already
> zero).  This fixes the following splat:
> 
>    ------------[ cut here ]------------
>    refcount_t: underflow; use-after-free.
>    WARNING: CPU: 9 PID: 7013 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
>    Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_da>
>     qcom_pil_info i2c_hid drm_kms_helper qcom_common qcom_q6v5 phy_snps_eusb2 qcom_geni_serial drm qcom_sysmon pinctrl_s>
>    CPU: 9 UID: 1000 PID: 7013 Comm: deqp-vk Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
>    Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
>    pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
>    pc : refcount_warn_saturate+0xf4/0x148
>    lr : refcount_warn_saturate+0xf4/0x148
>    sp : ffff8000a2073920
>    x29: ffff8000a2073920 x28: 0000000000000010 x27: 0000000000000010
>    x26: 0000000000000042 x25: ffff000810e09800 x24: 0000000000000010
>    x23: ffff8000a2073b94 x22: ffff000ddb22de00 x21: ffff000ddb22dc00
>    x20: ffff000ddb22ddf8 x19: ffff0008024934e0 x18: 000000000000000a
>    x17: 0000000000000000 x16: ffff9f8c67d77340 x15: 0000000000000000
>    x14: 00000000ffffffff x13: 2e656572662d7265 x12: 7466612d65737520
>    x11: 3b776f6c66726564 x10: 00000000ffff7fff x9 : ffff9f8c67506c70
>    x8 : ffff9f8c69fa26f0 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
>    x5 : ffff000f53e14548 x4 : ffff6082ea2b2000 x3 : ffff0008b86ab080
>    x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008b86ab080
>    Call trace:
>     refcount_warn_saturate+0xf4/0x148 (P)
>     msm_gem_free_object+0x248/0x260 [msm]
>     drm_gem_object_free+0x24/0x40 [drm]
>     msm_gem_new+0x1c4/0x1e0 [msm]
>     msm_gem_new_handle+0x3c/0x1a0 [msm]
>     msm_ioctl_gem_new+0x38/0x70 [msm]
>     drm_ioctl_kernel+0xc8/0x138 [drm]
>     drm_ioctl+0x2c8/0x618 [drm]
>     __arm64_sys_ioctl+0xac/0x108
>     invoke_syscall.constprop.0+0x64/0xe8
>     el0_svc_common.constprop.0+0x40/0xe8
>     do_el0_svc+0x24/0x38
>     el0_svc+0x54/0x1d8
>     el0t_64_sync_handler+0x10c/0x138
>     el0t_64_sync+0x19c/0x1a0
>    irq event stamp: 3698694
>    hardirqs last  enabled at (3698693): [<ffff9f8c675021dc>] __up_console_sem+0x74/0x90
>    hardirqs last disabled at (3698694): [<ffff9f8c68ce8164>] el1_dbg+0x24/0x90
>    softirqs last  enabled at (3697578): [<ffff9f8c6744ec5c>] handle_softirqs+0x454/0x4b0
>    softirqs last disabled at (3697567): [<ffff9f8c67360244>] __do_softirq+0x1c/0x28
>    ---[ end trace 0000000000000000 ]---
> 
> Fixes: b58e12a66e47 ("drm/msm: Add _NO_SHARE flag")
> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
> ---
>  drivers/gpu/drm/msm/msm_gem.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index 33d3354c6102..958bac4e2768 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -1114,10 +1114,12 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
>  		put_pages(obj);
>  	}
>  
> -	if (msm_obj->flags & MSM_BO_NO_SHARE) {
> +	if (obj->resv != &obj->_resv) {
>  		struct drm_gem_object *r_obj =
>  			container_of(obj->resv, struct drm_gem_object, _resv);
>  
> +		WARN_ON(!(msm_obj->flags & MSM_BO_NO_SHARE));
> +
>  		/* Drop reference we hold to shared resv obj: */
>  		drm_gem_object_put(r_obj);
>  	}

This patch seems to break something for direct IRIS/video playback using
dmabuf. I use a simple GStreamer test pipeline for testing IRIS on X1E
(on GNOME, in case that matters):

 $ gst-launch-1.0 filesrc location=bbb_sunflower_2160p_60fps_normal.mp4 \
   ! qtdemux name=d d.video_0 ! h264parse \
   ! v4l2h264dec capture-io-mode=dmabuf ! waylandsink

The video plays fine, but if I try to exit (CTRL+C) the display hangs
for a few seconds and then the console is spammed with pretty much
exactly the messages that you tried to fix here. If I revert this patch,
everything is fine again. It feels like your patch does exactly the
opposite for this use case. :-)

It seems to run into the WARN_ON you added.

Any ideas?

linux-next should have IRIS support for the Slim 7x if you want to try
this for yourself. Or alternatively, there is a backport for 6.17-rc7 in
the Linaro arm64-laptops tree: https://gitlab.com/Linaro/arm64-laptops/linux

You can find the test video here:
https://download.blender.org/demo/movies/BBB/

Thanks,
Stephan

[  107.430721] ------------[ cut here ]------------
[  107.435513] WARNING: CPU: 3 PID: 2040 at drivers/gpu/drm/msm/msm_gem.c:1127 msm_gem_free_object+0x1f8/0x264 [msm]
[  107.630472] CPU: 3 UID: 1000 PID: 2040 Comm: .gnome-shell-wr Not tainted 6.17.0-rc7 #1 PREEMPT 
[  107.630482] pstate: 81400005 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[  107.630488] pc : msm_gem_free_object+0x1f8/0x264 [msm]
[  107.676630] lr : msm_gem_free_object+0x138/0x264 [msm]
[  107.676666] sp : ffff800092a1bb30
[  107.676668] x29: ffff800092a1bb80 x28: ffff800092a1bce8 x27: ffffbc702dbdbe08
[  107.676676] x26: 0000000000000008 x25: 0000000000000009 x24: 00000000000000a6
[  107.676682] x23: ffff00083c72f850 x22: ffff00083c72f868 x21: ffff00087e69f200
[  107.676689] x20: ffff00087e69f330 x19: ffff00084d157ae0 x18: 0000000000000000
[  107.676695] x17: 0000000000000000 x16: ffffbc704bd46b80 x15: 0000ffffd0959540
[  107.676701] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[  107.676706] x11: ffffbc702e6cdb48 x10: 0000000000000000 x9 : 000000000000003f
[  107.676712] x8 : ffff800092a1ba90 x7 : 0000000000000000 x6 : 0000000000000020
[  107.676718] x5 : ffffbc704bd46c40 x4 : fffffdffe102cf60 x3 : 0000000000400032
[  107.676724] x2 : 0000000000020000 x1 : ffff00087e6978e8 x0 : ffff00087e6977e8
[  107.676731] Call trace:
[  107.676733]  msm_gem_free_object+0x1f8/0x264 [msm] (P)
[  107.676771]  drm_gem_object_free+0x1c/0x30 [drm]
[  107.676816]  drm_gem_object_handle_put_unlocked+0x138/0x150 [drm]
[  107.676852]  drm_gem_object_release_handle+0x5c/0xcc [drm]
[  107.676886]  drm_gem_handle_delete+0x68/0xbc [drm]
[  107.788743]  drm_gem_close_ioctl+0x34/0x40 [drm]
[  107.793553]  drm_ioctl_kernel+0xc0/0x130 [drm]
[  107.798178]  drm_ioctl+0x360/0x4e0 [drm]
[  107.802277]  __arm64_sys_ioctl+0xac/0x104
[  107.806436]  invoke_syscall+0x48/0x104
[  107.810334]  el0_svc_common.constprop.0+0x40/0xe0
[  107.815209]  do_el0_svc+0x1c/0x28
[  107.818662]  el0_svc+0x34/0xec
[  107.821838]  el0t_64_sync_handler+0xa0/0xe4
[  107.826173]  el0t_64_sync+0x198/0x19c
[  107.829971] ---[ end trace 0000000000000000 ]---
[  107.834789] ------------[ cut here ]------------
[  107.839587] refcount_t: underflow; use-after-free.
[  107.844553] WARNING: CPU: 3 PID: 2040 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x144
[  108.052928] CPU: 3 UID: 1000 PID: 2040 Comm: .gnome-shell-wr Tainted: G        W           6.17.0-rc7 #1 PREEMPT 
[  108.063491] Tainted: [W]=WARN
[  108.075627] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[  108.082808] pc : refcount_warn_saturate+0xf4/0x144
[  108.087756] lr : refcount_warn_saturate+0xf4/0x144
[  108.092704] sp : ffff800092a1bb20
[  108.096141] x29: ffff800092a1bb20 x28: ffff800092a1bce8 x27: ffffbc702dbdbe08
[  108.103491] x26: 0000000000000008 x25: 0000000000000009 x24: 00000000000000a6
[  108.110852] x23: ffff00083c72f850 x22: ffff00083c72f868 x21: ffff00087e69f200
[  108.118222] x20: ffff00087e69f330 x19: ffff00084d157ae0 x18: 0000000000000006
[  108.125572] x17: 0000000000000000 x16: ffffbc704ba1eda0 x15: ffff800092a1b6ef
[  108.132925] x14: 000000000000003a x13: 000000000000003a x12: 0000000000000000
[  108.140280] x11: 00000000000000c0 x10: d2c95932de8ceaa3 x9 : 128386994077d608
[  108.147631] x8 : ffff000840c0c588 x7 : 0000000002ac3ea0 x6 : 0000000000000002
[  108.154990] x5 : 0000000435572e2f x4 : 0000000000000002 x3 : 0000000000000010
[  108.162339] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000840c0b480
[  108.169697] Call trace:
[  108.172243]  refcount_warn_saturate+0xf4/0x144 (P)
[  108.177199]  msm_gem_free_object+0x25c/0x264 [msm]
[  108.182167]  drm_gem_object_free+0x1c/0x30 [drm]
[  108.186943]  drm_gem_object_handle_put_unlocked+0x138/0x150 [drm]
[  108.193237]  drm_gem_object_release_handle+0x5c/0xcc [drm]
[  108.198906]  drm_gem_handle_delete+0x68/0xbc [drm]
[  108.203867]  drm_gem_close_ioctl+0x34/0x40 [drm]
[  108.208651]  drm_ioctl_kernel+0xc0/0x130 [drm]
[  108.213248]  drm_ioctl+0x360/0x4e0 [drm]
[  108.217319]  __arm64_sys_ioctl+0xac/0x104
[  108.221464]  invoke_syscall+0x48/0x104
[  108.225343]  el0_svc_common.constprop.0+0x40/0xe0
[  108.230207]  do_el0_svc+0x1c/0x28
[  108.233650]  el0_svc+0x34/0xec
[  108.236817]  el0t_64_sync_handler+0xa0/0xe4
[  108.241143]  el0t_64_sync+0x198/0x19c
[  108.244931] ---[ end trace 0000000000000000 ]---



* Re: [PATCH 1/2] drm/msm: Fix refcnt underflow in error path
  2025-09-22 16:32   ` Stephan Gerhold
@ 2025-09-22 16:41     ` Rob Clark
  2025-09-22 16:46       ` Stephan Gerhold
  0 siblings, 1 reply; 8+ messages in thread
From: Rob Clark @ 2025-09-22 16:41 UTC
  To: Stephan Gerhold
  Cc: dri-devel, linux-arm-msm, freedreno, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Sean Paul, Marijn Suijten,
	David Airlie, Simona Vetter, open list

On Mon, Sep 22, 2025 at 9:33 AM Stephan Gerhold
<stephan.gerhold@linaro.org> wrote:
>
> Hi Rob,
>
> On Wed, Jul 23, 2025 at 12:08:49PM -0700, Rob Clark wrote:
> > If we hit an error path in GEM obj creation before msm_gem_new_handle()
> > updates obj->resv to point to the gpuvm resv object, then obj->resv
> > still points to &obj->_resv.  In this case we don't want to decrement
> > the refcount of the object being freed (since the refcnt is already
> > zero).  This fixes the following splat:
> >
> >    ------------[ cut here ]------------
> >    refcount_t: underflow; use-after-free.
> >    WARNING: CPU: 9 PID: 7013 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
> >    Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_da>
> >     qcom_pil_info i2c_hid drm_kms_helper qcom_common qcom_q6v5 phy_snps_eusb2 qcom_geni_serial drm qcom_sysmon pinctrl_s>
> >    CPU: 9 UID: 1000 PID: 7013 Comm: deqp-vk Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
> >    Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
> >    pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> >    pc : refcount_warn_saturate+0xf4/0x148
> >    lr : refcount_warn_saturate+0xf4/0x148
> >    sp : ffff8000a2073920
> >    x29: ffff8000a2073920 x28: 0000000000000010 x27: 0000000000000010
> >    x26: 0000000000000042 x25: ffff000810e09800 x24: 0000000000000010
> >    x23: ffff8000a2073b94 x22: ffff000ddb22de00 x21: ffff000ddb22dc00
> >    x20: ffff000ddb22ddf8 x19: ffff0008024934e0 x18: 000000000000000a
> >    x17: 0000000000000000 x16: ffff9f8c67d77340 x15: 0000000000000000
> >    x14: 00000000ffffffff x13: 2e656572662d7265 x12: 7466612d65737520
> >    x11: 3b776f6c66726564 x10: 00000000ffff7fff x9 : ffff9f8c67506c70
> >    x8 : ffff9f8c69fa26f0 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
> >    x5 : ffff000f53e14548 x4 : ffff6082ea2b2000 x3 : ffff0008b86ab080
> >    x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008b86ab080
> >    Call trace:
> >     refcount_warn_saturate+0xf4/0x148 (P)
> >     msm_gem_free_object+0x248/0x260 [msm]
> >     drm_gem_object_free+0x24/0x40 [drm]
> >     msm_gem_new+0x1c4/0x1e0 [msm]
> >     msm_gem_new_handle+0x3c/0x1a0 [msm]
> >     msm_ioctl_gem_new+0x38/0x70 [msm]
> >     drm_ioctl_kernel+0xc8/0x138 [drm]
> >     drm_ioctl+0x2c8/0x618 [drm]
> >     __arm64_sys_ioctl+0xac/0x108
> >     invoke_syscall.constprop.0+0x64/0xe8
> >     el0_svc_common.constprop.0+0x40/0xe8
> >     do_el0_svc+0x24/0x38
> >     el0_svc+0x54/0x1d8
> >     el0t_64_sync_handler+0x10c/0x138
> >     el0t_64_sync+0x19c/0x1a0
> >    irq event stamp: 3698694
> >    hardirqs last  enabled at (3698693): [<ffff9f8c675021dc>] __up_console_sem+0x74/0x90
> >    hardirqs last disabled at (3698694): [<ffff9f8c68ce8164>] el1_dbg+0x24/0x90
> >    softirqs last  enabled at (3697578): [<ffff9f8c6744ec5c>] handle_softirqs+0x454/0x4b0
> >    softirqs last disabled at (3697567): [<ffff9f8c67360244>] __do_softirq+0x1c/0x28
> >    ---[ end trace 0000000000000000 ]---
> >
> > Fixes: b58e12a66e47 ("drm/msm: Add _NO_SHARE flag")
> > Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
> > ---
> >  drivers/gpu/drm/msm/msm_gem.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> > index 33d3354c6102..958bac4e2768 100644
> > --- a/drivers/gpu/drm/msm/msm_gem.c
> > +++ b/drivers/gpu/drm/msm/msm_gem.c
> > @@ -1114,10 +1114,12 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
> >               put_pages(obj);
> >       }
> >
> > -     if (msm_obj->flags & MSM_BO_NO_SHARE) {
> > +     if (obj->resv != &obj->_resv) {
> >               struct drm_gem_object *r_obj =
> >                       container_of(obj->resv, struct drm_gem_object, _resv);
> >
> > +             WARN_ON(!(msm_obj->flags & MSM_BO_NO_SHARE));
> > +
> >               /* Drop reference we hold to shared resv obj: */
> >               drm_gem_object_put(r_obj);
> >       }
>
> This patch seems to break something for direct IRIS/video playback using
> dmabuf. I use a simple GStreamer test pipeline for testing IRIS on X1E
> (on GNOME, in case that matters):
>
>  $ gst-launch-1.0 filesrc location=bbb_sunflower_2160p_60fps_normal.mp4 \
>    ! qtdemux name=d d.video_0 ! h264parse \
>    ! v4l2h264dec capture-io-mode=dmabuf ! waylandsink
>
> The video plays fine, but if I try to exit (CTRL+C) the display hangs
> for a few seconds and then the console is spammed with pretty much
> exactly the messages that you tried to fix here. If I revert this patch,
> everything is fine again. It feels like your patch does exactly the
> opposite for this use case. :-)
>
> It seems to run into the WARN_ON you added.

Hmm, are we allocating from drm and importing into v4l2, or the other direction?

BR,
-R

> Any ideas?
>
> linux-next should have IRIS support for the Slim 7x if you want to try
> this for yourself. Or alternatively, there is a backport for 6.17-rc7 in
> the Linaro arm64-laptops tree: https://gitlab.com/Linaro/arm64-laptops/linux
>
> You can find the test video here:
> https://download.blender.org/demo/movies/BBB/
>
> Thanks,
> Stephan
>
> [  107.430721] ------------[ cut here ]------------
> [  107.435513] WARNING: CPU: 3 PID: 2040 at drivers/gpu/drm/msm/msm_gem.c:1127 msm_gem_free_object+0x1f8/0x264 [msm]
> [  107.630472] CPU: 3 UID: 1000 PID: 2040 Comm: .gnome-shell-wr Not tainted 6.17.0-rc7 #1 PREEMPT
> [  107.630482] pstate: 81400005 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> [  107.630488] pc : msm_gem_free_object+0x1f8/0x264 [msm]
> [  107.676630] lr : msm_gem_free_object+0x138/0x264 [msm]
> [  107.676666] sp : ffff800092a1bb30
> [  107.676668] x29: ffff800092a1bb80 x28: ffff800092a1bce8 x27: ffffbc702dbdbe08
> [  107.676676] x26: 0000000000000008 x25: 0000000000000009 x24: 00000000000000a6
> [  107.676682] x23: ffff00083c72f850 x22: ffff00083c72f868 x21: ffff00087e69f200
> [  107.676689] x20: ffff00087e69f330 x19: ffff00084d157ae0 x18: 0000000000000000
> [  107.676695] x17: 0000000000000000 x16: ffffbc704bd46b80 x15: 0000ffffd0959540
> [  107.676701] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> [  107.676706] x11: ffffbc702e6cdb48 x10: 0000000000000000 x9 : 000000000000003f
> [  107.676712] x8 : ffff800092a1ba90 x7 : 0000000000000000 x6 : 0000000000000020
> [  107.676718] x5 : ffffbc704bd46c40 x4 : fffffdffe102cf60 x3 : 0000000000400032
> [  107.676724] x2 : 0000000000020000 x1 : ffff00087e6978e8 x0 : ffff00087e6977e8
> [  107.676731] Call trace:
> [  107.676733]  msm_gem_free_object+0x1f8/0x264 [msm] (P)
> [  107.676771]  drm_gem_object_free+0x1c/0x30 [drm]
> [  107.676816]  drm_gem_object_handle_put_unlocked+0x138/0x150 [drm]
> [  107.676852]  drm_gem_object_release_handle+0x5c/0xcc [drm]
> [  107.676886]  drm_gem_handle_delete+0x68/0xbc [drm]
> [  107.788743]  drm_gem_close_ioctl+0x34/0x40 [drm]
> [  107.793553]  drm_ioctl_kernel+0xc0/0x130 [drm]
> [  107.798178]  drm_ioctl+0x360/0x4e0 [drm]
> [  107.802277]  __arm64_sys_ioctl+0xac/0x104
> [  107.806436]  invoke_syscall+0x48/0x104
> [  107.810334]  el0_svc_common.constprop.0+0x40/0xe0
> [  107.815209]  do_el0_svc+0x1c/0x28
> [  107.818662]  el0_svc+0x34/0xec
> [  107.821838]  el0t_64_sync_handler+0xa0/0xe4
> [  107.826173]  el0t_64_sync+0x198/0x19c
> [  107.829971] ---[ end trace 0000000000000000 ]---
> [  107.834789] ------------[ cut here ]------------
> [  107.839587] refcount_t: underflow; use-after-free.
> [  107.844553] WARNING: CPU: 3 PID: 2040 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x144
> [  108.052928] CPU: 3 UID: 1000 PID: 2040 Comm: .gnome-shell-wr Tainted: G        W           6.17.0-rc7 #1 PREEMPT
> [  108.063491] Tainted: [W]=WARN
> [  108.075627] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> [  108.082808] pc : refcount_warn_saturate+0xf4/0x144
> [  108.087756] lr : refcount_warn_saturate+0xf4/0x144
> [  108.092704] sp : ffff800092a1bb20
> [  108.096141] x29: ffff800092a1bb20 x28: ffff800092a1bce8 x27: ffffbc702dbdbe08
> [  108.103491] x26: 0000000000000008 x25: 0000000000000009 x24: 00000000000000a6
> [  108.110852] x23: ffff00083c72f850 x22: ffff00083c72f868 x21: ffff00087e69f200
> [  108.118222] x20: ffff00087e69f330 x19: ffff00084d157ae0 x18: 0000000000000006
> [  108.125572] x17: 0000000000000000 x16: ffffbc704ba1eda0 x15: ffff800092a1b6ef
> [  108.132925] x14: 000000000000003a x13: 000000000000003a x12: 0000000000000000
> [  108.140280] x11: 00000000000000c0 x10: d2c95932de8ceaa3 x9 : 128386994077d608
> [  108.147631] x8 : ffff000840c0c588 x7 : 0000000002ac3ea0 x6 : 0000000000000002
> [  108.154990] x5 : 0000000435572e2f x4 : 0000000000000002 x3 : 0000000000000010
> [  108.162339] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000840c0b480
> [  108.169697] Call trace:
> [  108.172243]  refcount_warn_saturate+0xf4/0x144 (P)
> [  108.177199]  msm_gem_free_object+0x25c/0x264 [msm]
> [  108.182167]  drm_gem_object_free+0x1c/0x30 [drm]
> [  108.186943]  drm_gem_object_handle_put_unlocked+0x138/0x150 [drm]
> [  108.193237]  drm_gem_object_release_handle+0x5c/0xcc [drm]
> [  108.198906]  drm_gem_handle_delete+0x68/0xbc [drm]
> [  108.203867]  drm_gem_close_ioctl+0x34/0x40 [drm]
> [  108.208651]  drm_ioctl_kernel+0xc0/0x130 [drm]
> [  108.213248]  drm_ioctl+0x360/0x4e0 [drm]
> [  108.217319]  __arm64_sys_ioctl+0xac/0x104
> [  108.221464]  invoke_syscall+0x48/0x104
> [  108.225343]  el0_svc_common.constprop.0+0x40/0xe0
> [  108.230207]  do_el0_svc+0x1c/0x28
> [  108.233650]  el0_svc+0x34/0xec
> [  108.236817]  el0t_64_sync_handler+0xa0/0xe4
> [  108.241143]  el0t_64_sync+0x198/0x19c
> [  108.244931] ---[ end trace 0000000000000000 ]---
>


* Re: [PATCH 1/2] drm/msm: Fix refcnt underflow in error path
  2025-09-22 16:41     ` Rob Clark
@ 2025-09-22 16:46       ` Stephan Gerhold
  2025-09-22 17:42         ` Rob Clark
  0 siblings, 1 reply; 8+ messages in thread
From: Stephan Gerhold @ 2025-09-22 16:46 UTC
  To: Rob Clark
  Cc: dri-devel, linux-arm-msm, freedreno, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Sean Paul, Marijn Suijten,
	David Airlie, Simona Vetter, open list

On Mon, Sep 22, 2025 at 09:41:07AM -0700, Rob Clark wrote:
> On Mon, Sep 22, 2025 at 9:33 AM Stephan Gerhold
> <stephan.gerhold@linaro.org> wrote:
> > On Wed, Jul 23, 2025 at 12:08:49PM -0700, Rob Clark wrote:
> > > If we hit an error path in GEM obj creation before msm_gem_new_handle()
> > > updates obj->resv to point to the gpuvm resv object, then obj->resv
> > > still points to &obj->_resv.  In this case we don't want to decrement
> > > the refcount of the object being freed (since the refcnt is already
> > > zero).  This fixes the following splat:
> > >
> > >    ------------[ cut here ]------------
> > >    refcount_t: underflow; use-after-free.
> > >    WARNING: CPU: 9 PID: 7013 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
> > >    Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_da>
> > >     qcom_pil_info i2c_hid drm_kms_helper qcom_common qcom_q6v5 phy_snps_eusb2 qcom_geni_serial drm qcom_sysmon pinctrl_s>
> > >    CPU: 9 UID: 1000 PID: 7013 Comm: deqp-vk Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
> > >    Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
> > >    pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> > >    pc : refcount_warn_saturate+0xf4/0x148
> > >    lr : refcount_warn_saturate+0xf4/0x148
> > >    sp : ffff8000a2073920
> > >    x29: ffff8000a2073920 x28: 0000000000000010 x27: 0000000000000010
> > >    x26: 0000000000000042 x25: ffff000810e09800 x24: 0000000000000010
> > >    x23: ffff8000a2073b94 x22: ffff000ddb22de00 x21: ffff000ddb22dc00
> > >    x20: ffff000ddb22ddf8 x19: ffff0008024934e0 x18: 000000000000000a
> > >    x17: 0000000000000000 x16: ffff9f8c67d77340 x15: 0000000000000000
> > >    x14: 00000000ffffffff x13: 2e656572662d7265 x12: 7466612d65737520
> > >    x11: 3b776f6c66726564 x10: 00000000ffff7fff x9 : ffff9f8c67506c70
> > >    x8 : ffff9f8c69fa26f0 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
> > >    x5 : ffff000f53e14548 x4 : ffff6082ea2b2000 x3 : ffff0008b86ab080
> > >    x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008b86ab080
> > >    Call trace:
> > >     refcount_warn_saturate+0xf4/0x148 (P)
> > >     msm_gem_free_object+0x248/0x260 [msm]
> > >     drm_gem_object_free+0x24/0x40 [drm]
> > >     msm_gem_new+0x1c4/0x1e0 [msm]
> > >     msm_gem_new_handle+0x3c/0x1a0 [msm]
> > >     msm_ioctl_gem_new+0x38/0x70 [msm]
> > >     drm_ioctl_kernel+0xc8/0x138 [drm]
> > >     drm_ioctl+0x2c8/0x618 [drm]
> > >     __arm64_sys_ioctl+0xac/0x108
> > >     invoke_syscall.constprop.0+0x64/0xe8
> > >     el0_svc_common.constprop.0+0x40/0xe8
> > >     do_el0_svc+0x24/0x38
> > >     el0_svc+0x54/0x1d8
> > >     el0t_64_sync_handler+0x10c/0x138
> > >     el0t_64_sync+0x19c/0x1a0
> > >    irq event stamp: 3698694
> > >    hardirqs last  enabled at (3698693): [<ffff9f8c675021dc>] __up_console_sem+0x74/0x90
> > >    hardirqs last disabled at (3698694): [<ffff9f8c68ce8164>] el1_dbg+0x24/0x90
> > >    softirqs last  enabled at (3697578): [<ffff9f8c6744ec5c>] handle_softirqs+0x454/0x4b0
> > >    softirqs last disabled at (3697567): [<ffff9f8c67360244>] __do_softirq+0x1c/0x28
> > >    ---[ end trace 0000000000000000 ]---
> > >
> > > Fixes: b58e12a66e47 ("drm/msm: Add _NO_SHARE flag")
> > > Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
> > > ---
> > >  drivers/gpu/drm/msm/msm_gem.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> > > index 33d3354c6102..958bac4e2768 100644
> > > --- a/drivers/gpu/drm/msm/msm_gem.c
> > > +++ b/drivers/gpu/drm/msm/msm_gem.c
> > > @@ -1114,10 +1114,12 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
> > >               put_pages(obj);
> > >       }
> > >
> > > -     if (msm_obj->flags & MSM_BO_NO_SHARE) {
> > > +     if (obj->resv != &obj->_resv) {
> > >               struct drm_gem_object *r_obj =
> > >                       container_of(obj->resv, struct drm_gem_object, _resv);
> > >
> > > +             WARN_ON(!(msm_obj->flags & MSM_BO_NO_SHARE));
> > > +
> > >               /* Drop reference we hold to shared resv obj: */
> > >               drm_gem_object_put(r_obj);
> > >       }
> >
> > This patch seems to break something for direct IRIS/video playback using
> > dmabuf. I use a simple GStreamer test pipeline for testing IRIS on X1E
> > (on GNOME, in case that matters):
> >
> >  $ gst-launch-1.0 filesrc location=bbb_sunflower_2160p_60fps_normal.mp4 \
> >    ! qtdemux name=d d.video_0 ! h264parse ! v4l2h264dec \
> > >    capture-io-mode=dmabuf ! waylandsink
> >
> > The video plays fine, but if I try to exit (CTRL+C) the display hangs
> > for a few seconds and then the console is spammed with pretty much
> > exactly the messages that you tried to fix here. If I revert this patch,
> > everything is fine again. It feels like your patch does exactly the
> > opposite for this use case. :-)
> >
> > It seems to run into the WARN_ON you added.
> 
> Hmm, are we allocating from drm and importing into v4l2, or the other direction?
> 

Is there an easy way to check?

I would need to study the code to be sure; you probably know more about
this than I do. I just run this command and it has always worked so far
somehow. :-)

Thanks,
Stephan


* Re: [PATCH 1/2] drm/msm: Fix refcnt underflow in error path
  2025-09-22 16:46       ` Stephan Gerhold
@ 2025-09-22 17:42         ` Rob Clark
  2025-09-22 18:49           ` Stephan Gerhold
  0 siblings, 1 reply; 8+ messages in thread
From: Rob Clark @ 2025-09-22 17:42 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: dri-devel, linux-arm-msm, freedreno, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Sean Paul, Marijn Suijten,
	David Airlie, Simona Vetter, open list

On Mon, Sep 22, 2025 at 9:46 AM Stephan Gerhold
<stephan.gerhold@linaro.org> wrote:
>
> On Mon, Sep 22, 2025 at 09:41:07AM -0700, Rob Clark wrote:
> > On Mon, Sep 22, 2025 at 9:33 AM Stephan Gerhold
> > <stephan.gerhold@linaro.org> wrote:
> > > On Wed, Jul 23, 2025 at 12:08:49PM -0700, Rob Clark wrote:
> > > > If we hit an error path in GEM obj creation before msm_gem_new_handle()
> > > > updates obj->resv to point to the gpuvm resv object, then obj->resv
> > > > still points to &obj->_resv.  In this case we don't want to decrement
> > > > the refcount of the object being freed (since the refcnt is already
> > > > zero).  This fixes the following splat:
> > > >
> > > >    ------------[ cut here ]------------
> > > >    refcount_t: underflow; use-after-free.
> > > >    WARNING: CPU: 9 PID: 7013 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
> > > >    Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_da>
> > > >     qcom_pil_info i2c_hid drm_kms_helper qcom_common qcom_q6v5 phy_snps_eusb2 qcom_geni_serial drm qcom_sysmon pinctrl_s>
> > > >    CPU: 9 UID: 1000 PID: 7013 Comm: deqp-vk Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
> > > >    Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
> > > >    pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> > > >    pc : refcount_warn_saturate+0xf4/0x148
> > > >    lr : refcount_warn_saturate+0xf4/0x148
> > > >    sp : ffff8000a2073920
> > > >    x29: ffff8000a2073920 x28: 0000000000000010 x27: 0000000000000010
> > > >    x26: 0000000000000042 x25: ffff000810e09800 x24: 0000000000000010
> > > >    x23: ffff8000a2073b94 x22: ffff000ddb22de00 x21: ffff000ddb22dc00
> > > >    x20: ffff000ddb22ddf8 x19: ffff0008024934e0 x18: 000000000000000a
> > > >    x17: 0000000000000000 x16: ffff9f8c67d77340 x15: 0000000000000000
> > > >    x14: 00000000ffffffff x13: 2e656572662d7265 x12: 7466612d65737520
> > > >    x11: 3b776f6c66726564 x10: 00000000ffff7fff x9 : ffff9f8c67506c70
> > > >    x8 : ffff9f8c69fa26f0 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
> > > >    x5 : ffff000f53e14548 x4 : ffff6082ea2b2000 x3 : ffff0008b86ab080
> > > >    x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008b86ab080
> > > >    Call trace:
> > > >     refcount_warn_saturate+0xf4/0x148 (P)
> > > >     msm_gem_free_object+0x248/0x260 [msm]
> > > >     drm_gem_object_free+0x24/0x40 [drm]
> > > >     msm_gem_new+0x1c4/0x1e0 [msm]
> > > >     msm_gem_new_handle+0x3c/0x1a0 [msm]
> > > >     msm_ioctl_gem_new+0x38/0x70 [msm]
> > > >     drm_ioctl_kernel+0xc8/0x138 [drm]
> > > >     drm_ioctl+0x2c8/0x618 [drm]
> > > >     __arm64_sys_ioctl+0xac/0x108
> > > >     invoke_syscall.constprop.0+0x64/0xe8
> > > >     el0_svc_common.constprop.0+0x40/0xe8
> > > >     do_el0_svc+0x24/0x38
> > > >     el0_svc+0x54/0x1d8
> > > >     el0t_64_sync_handler+0x10c/0x138
> > > >     el0t_64_sync+0x19c/0x1a0
> > > >    irq event stamp: 3698694
> > > >    hardirqs last  enabled at (3698693): [<ffff9f8c675021dc>] __up_console_sem+0x74/0x90
> > > >    hardirqs last disabled at (3698694): [<ffff9f8c68ce8164>] el1_dbg+0x24/0x90
> > > >    softirqs last  enabled at (3697578): [<ffff9f8c6744ec5c>] handle_softirqs+0x454/0x4b0
> > > >    softirqs last disabled at (3697567): [<ffff9f8c67360244>] __do_softirq+0x1c/0x28
> > > >    ---[ end trace 0000000000000000 ]---
> > > >
> > > > Fixes: b58e12a66e47 ("drm/msm: Add _NO_SHARE flag")
> > > > Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
> > > > ---
> > > >  drivers/gpu/drm/msm/msm_gem.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> > > > index 33d3354c6102..958bac4e2768 100644
> > > > --- a/drivers/gpu/drm/msm/msm_gem.c
> > > > +++ b/drivers/gpu/drm/msm/msm_gem.c
> > > > @@ -1114,10 +1114,12 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
> > > >               put_pages(obj);
> > > >       }
> > > >
> > > > -     if (msm_obj->flags & MSM_BO_NO_SHARE) {
> > > > +     if (obj->resv != &obj->_resv) {
> > > >               struct drm_gem_object *r_obj =
> > > >                       container_of(obj->resv, struct drm_gem_object, _resv);
> > > >
> > > > +             WARN_ON(!(msm_obj->flags & MSM_BO_NO_SHARE));
> > > > +
> > > >               /* Drop reference we hold to shared resv obj: */
> > > >               drm_gem_object_put(r_obj);
> > > >       }
> > >
> > > This patch seems to break something for direct IRIS/video playback using
> > > dmabuf. I use a simple GStreamer test pipeline for testing IRIS on X1E
> > > (on GNOME, in case that matters):
> > >
> > >  $ gst-launch-1.0 filesrc location=bbb_sunflower_2160p_60fps_normal.mp4 \
> > >    ! qtdemux name=d d.video_0 ! h264parse ! v4l2h264dec \
> > > >    capture-io-mode=dmabuf ! waylandsink
> > >
> > > The video plays fine, but if I try to exit (CTRL+C) the display hangs
> > > for a few seconds and then the console is spammed with pretty much
> > > exactly the messages that you tried to fix here. If I revert this patch,
> > > everything is fine again. It feels like your patch does exactly the
> > > opposite for this use case. :-)
> > >
> > > It seems to run into the WARN_ON you added.
> >
> > Hmm, are we allocating from drm and importing into v4l2, or the other direction?
> >
>
> Is there an easy way to check?

Maybe strace?  But, I think this would help, at least if v4l2 is allocating:

- if (obj->resv != &obj->_resv) {
+ if ((msm_obj->flags & MSM_BO_NO_SHARE) && (obj->resv != &obj->_resv)) {

(sorry about gmail mangling the formatting)
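
A minimal userspace sketch of why the extra flag check might help. It is
only a model, not the driver code: struct model_obj, MODEL_BO_NO_SHARE and
the helper names are hypothetical stand-ins, and it assumes that an imported
dma-buf object has obj->resv pointing at the exporter's reservation object
rather than at another GEM object's _resv. Under that assumption the posted
check would run container_of() on a pointer that is not embedded in a GEM
object, while the combined check skips it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MSM_BO_NO_SHARE; the real value is not shown here. */
#define MODEL_BO_NO_SHARE 0x1u

struct model_resv {
	int unused;
};

/* Simplified stand-in for struct drm_gem_object plus the msm flags. */
struct model_obj {
	unsigned int flags;
	struct model_resv *resv;	/* reservation object currently in use */
	struct model_resv _resv;	/* embedded default reservation object */
};

/* Check from the patch as posted: any non-default resv drops a reference. */
static bool drops_ref_posted(const struct model_obj *obj)
{
	return obj->resv != &obj->_resv;
}

/* Proposed check: only _NO_SHARE objects that were switched to a shared resv. */
static bool drops_ref_proposed(const struct model_obj *obj)
{
	return (obj->flags & MODEL_BO_NO_SHARE) && obj->resv != &obj->_resv;
}

/* Point resv at a shared object if given, else at the embedded default. */
static void model_init(struct model_obj *obj, unsigned int flags,
		       struct model_resv *shared)
{
	obj->flags = flags;
	obj->resv = shared ? shared : &obj->_resv;
}

static void check_model(void)
{
	struct model_obj vm_obj, bo;
	struct model_resv exporter_resv = { 0 };

	model_init(&vm_obj, 0, NULL);

	/* _NO_SHARE BO that shares the VM's resv: both checks drop the ref. */
	model_init(&bo, MODEL_BO_NO_SHARE, &vm_obj._resv);
	assert(drops_ref_posted(&bo) && drops_ref_proposed(&bo));

	/* _NO_SHARE BO that failed before resv was switched: neither drops. */
	model_init(&bo, MODEL_BO_NO_SHARE, NULL);
	assert(!drops_ref_posted(&bo) && !drops_ref_proposed(&bo));

	/* Imported BO (assumed): resv belongs to the exporter, flag not set. */
	model_init(&bo, 0, &exporter_resv);
	assert(drops_ref_posted(&bo));		/* would put a bogus object */
	assert(!drops_ref_proposed(&bo));	/* left alone */
}
```

Calling check_model() from a small main() exercises the three cases. The
last one is the case the combined condition is meant to cover, if v4l2 is
indeed the allocator here.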

BR,
-R

>
> I would need to study the code to be sure; you probably know more about
> this than I do. I just run this command and it has always worked so far
> somehow. :-)
>
> Thanks,
> Stephan


* Re: [PATCH 1/2] drm/msm: Fix refcnt underflow in error path
  2025-09-22 17:42         ` Rob Clark
@ 2025-09-22 18:49           ` Stephan Gerhold
  0 siblings, 0 replies; 8+ messages in thread
From: Stephan Gerhold @ 2025-09-22 18:49 UTC (permalink / raw)
  To: Rob Clark
  Cc: dri-devel, linux-arm-msm, freedreno, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Sean Paul, Marijn Suijten,
	David Airlie, Simona Vetter, open list

On Mon, Sep 22, 2025 at 10:42:52AM -0700, Rob Clark wrote:
> On Mon, Sep 22, 2025 at 9:46 AM Stephan Gerhold
> <stephan.gerhold@linaro.org> wrote:
> >
> > On Mon, Sep 22, 2025 at 09:41:07AM -0700, Rob Clark wrote:
> > > On Mon, Sep 22, 2025 at 9:33 AM Stephan Gerhold
> > > <stephan.gerhold@linaro.org> wrote:
> > > > On Wed, Jul 23, 2025 at 12:08:49PM -0700, Rob Clark wrote:
> > > > > If we hit an error path in GEM obj creation before msm_gem_new_handle()
> > > > > updates obj->resv to point to the gpuvm resv object, then obj->resv
> > > > > still points to &obj->_resv.  In this case we don't want to decrement
> > > > > the refcount of the object being freed (since the refcnt is already
> > > > > zero).  This fixes the following splat:
> > > > >
> > > > >    ------------[ cut here ]------------
> > > > >    refcount_t: underflow; use-after-free.
> > > > >    WARNING: CPU: 9 PID: 7013 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
> > > > >    Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_da>
> > > > >     qcom_pil_info i2c_hid drm_kms_helper qcom_common qcom_q6v5 phy_snps_eusb2 qcom_geni_serial drm qcom_sysmon pinctrl_s>
> > > > >    CPU: 9 UID: 1000 PID: 7013 Comm: deqp-vk Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
> > > > >    Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
> > > > >    pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> > > > >    pc : refcount_warn_saturate+0xf4/0x148
> > > > >    lr : refcount_warn_saturate+0xf4/0x148
> > > > >    sp : ffff8000a2073920
> > > > >    x29: ffff8000a2073920 x28: 0000000000000010 x27: 0000000000000010
> > > > >    x26: 0000000000000042 x25: ffff000810e09800 x24: 0000000000000010
> > > > >    x23: ffff8000a2073b94 x22: ffff000ddb22de00 x21: ffff000ddb22dc00
> > > > >    x20: ffff000ddb22ddf8 x19: ffff0008024934e0 x18: 000000000000000a
> > > > >    x17: 0000000000000000 x16: ffff9f8c67d77340 x15: 0000000000000000
> > > > >    x14: 00000000ffffffff x13: 2e656572662d7265 x12: 7466612d65737520
> > > > >    x11: 3b776f6c66726564 x10: 00000000ffff7fff x9 : ffff9f8c67506c70
> > > > >    x8 : ffff9f8c69fa26f0 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
> > > > >    x5 : ffff000f53e14548 x4 : ffff6082ea2b2000 x3 : ffff0008b86ab080
> > > > >    x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008b86ab080
> > > > >    Call trace:
> > > > >     refcount_warn_saturate+0xf4/0x148 (P)
> > > > >     msm_gem_free_object+0x248/0x260 [msm]
> > > > >     drm_gem_object_free+0x24/0x40 [drm]
> > > > >     msm_gem_new+0x1c4/0x1e0 [msm]
> > > > >     msm_gem_new_handle+0x3c/0x1a0 [msm]
> > > > >     msm_ioctl_gem_new+0x38/0x70 [msm]
> > > > >     drm_ioctl_kernel+0xc8/0x138 [drm]
> > > > >     drm_ioctl+0x2c8/0x618 [drm]
> > > > >     __arm64_sys_ioctl+0xac/0x108
> > > > >     invoke_syscall.constprop.0+0x64/0xe8
> > > > >     el0_svc_common.constprop.0+0x40/0xe8
> > > > >     do_el0_svc+0x24/0x38
> > > > >     el0_svc+0x54/0x1d8
> > > > >     el0t_64_sync_handler+0x10c/0x138
> > > > >     el0t_64_sync+0x19c/0x1a0
> > > > >    irq event stamp: 3698694
> > > > >    hardirqs last  enabled at (3698693): [<ffff9f8c675021dc>] __up_console_sem+0x74/0x90
> > > > >    hardirqs last disabled at (3698694): [<ffff9f8c68ce8164>] el1_dbg+0x24/0x90
> > > > >    softirqs last  enabled at (3697578): [<ffff9f8c6744ec5c>] handle_softirqs+0x454/0x4b0
> > > > >    softirqs last disabled at (3697567): [<ffff9f8c67360244>] __do_softirq+0x1c/0x28
> > > > >    ---[ end trace 0000000000000000 ]---
> > > > >
> > > > > Fixes: b58e12a66e47 ("drm/msm: Add _NO_SHARE flag")
> > > > > Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
> > > > > ---
> > > > >  drivers/gpu/drm/msm/msm_gem.c | 4 +++-
> > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> > > > > index 33d3354c6102..958bac4e2768 100644
> > > > > --- a/drivers/gpu/drm/msm/msm_gem.c
> > > > > +++ b/drivers/gpu/drm/msm/msm_gem.c
> > > > > @@ -1114,10 +1114,12 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
> > > > >               put_pages(obj);
> > > > >       }
> > > > >
> > > > > -     if (msm_obj->flags & MSM_BO_NO_SHARE) {
> > > > > +     if (obj->resv != &obj->_resv) {
> > > > >               struct drm_gem_object *r_obj =
> > > > >                       container_of(obj->resv, struct drm_gem_object, _resv);
> > > > >
> > > > > +             WARN_ON(!(msm_obj->flags & MSM_BO_NO_SHARE));
> > > > > +
> > > > >               /* Drop reference we hold to shared resv obj: */
> > > > >               drm_gem_object_put(r_obj);
> > > > >       }
> > > >
> > > > This patch seems to break something for direct IRIS/video playback using
> > > > dmabuf. I use a simple GStreamer test pipeline for testing IRIS on X1E
> > > > (on GNOME, in case that matters):
> > > >
> > > >  $ gst-launch-1.0 filesrc location=bbb_sunflower_2160p_60fps_normal.mp4 \
> > > >    ! qtdemux name=d d.video_0 ! h264parse ! v4l2h264dec \
> > > > >    capture-io-mode=dmabuf ! waylandsink
> > > >
> > > > The video plays fine, but if I try to exit (CTRL+C) the display hangs
> > > > for a few seconds and then the console is spammed with pretty much
> > > > exactly the messages that you tried to fix here. If I revert this patch,
> > > > everything is fine again. It feels like your patch does exactly the
> > > > opposite for this use case. :-)
> > > >
> > > > It seems to run into the WARN_ON you added.
> > >
> > > Hmm, are we allocating from drm and importing into v4l2, or the other direction?
> > >
> >
> > Is there an easy way to check?
> 
> Maybe strace?  But, I think this would help, at least if v4l2 is allocating:
> 

I would indeed guess that v4l2 is allocating in this case. I didn't end
up checking with strace, because your proposed change

> - if (obj->resv != &obj->_resv) {
> + if ((msm_obj->flags & MSM_BO_NO_SHARE) && (obj->resv != &obj->_resv)) {
> 

fixes the issue. Thanks! If this makes sense to you, could you submit a
patch?

Thanks,
Stephan


end of thread, other threads:[~2025-09-22 18:49 UTC | newest]

Thread overview: 8+ messages
2025-07-23 19:08 [PATCH 0/2] drm/msm: Error path fixes Rob Clark
2025-07-23 19:08 ` [PATCH 1/2] drm/msm: Fix refcnt underflow in error path Rob Clark
2025-09-22 16:32   ` Stephan Gerhold
2025-09-22 16:41     ` Rob Clark
2025-09-22 16:46       ` Stephan Gerhold
2025-09-22 17:42         ` Rob Clark
2025-09-22 18:49           ` Stephan Gerhold
2025-07-23 19:08 ` [PATCH 2/2] drm/msm: Fix submit error path cleanup Rob Clark
