* [PATCH 0/2] drm/msm: Error path fixes
@ 2025-07-23 19:08 Rob Clark
2025-07-23 19:08 ` [PATCH 1/2] drm/msm: Fix refcnt underflow in error path Rob Clark
2025-07-23 19:08 ` [PATCH 2/2] drm/msm: Fix submit error path cleanup Rob Clark
0 siblings, 2 replies; 3+ messages in thread
From: Rob Clark @ 2025-07-23 19:08 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, Abhinav Kumar, David Airlie,
Dmitry Baryshkov, Jessica Zhang, open list, Marijn Suijten,
Sean Paul, Simona Vetter
For reasons unknown to me, systemd-udev recently started limiting
max-files to 64k (at least in f42), which exposed some problematic
allocation related error paths.
Rob Clark (2):
drm/msm: Fix refcnt underflow in error path
drm/msm: Fix submit error path cleanup
drivers/gpu/drm/msm/msm_gem.c | 4 +++-
drivers/gpu/drm/msm/msm_gem_submit.c | 9 +++++----
2 files changed, 8 insertions(+), 5 deletions(-)
--
2.50.1
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH 1/2] drm/msm: Fix refcnt underflow in error path
2025-07-23 19:08 [PATCH 0/2] drm/msm: Error path fixes Rob Clark
@ 2025-07-23 19:08 ` Rob Clark
2025-07-23 19:08 ` [PATCH 2/2] drm/msm: Fix submit error path cleanup Rob Clark
1 sibling, 0 replies; 3+ messages in thread
From: Rob Clark @ 2025-07-23 19:08 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, Dmitry Baryshkov,
Abhinav Kumar, Jessica Zhang, Sean Paul, Marijn Suijten,
David Airlie, Simona Vetter, open list
If we hit an error path in GEM obj creation before msm_gem_new_handle()
updates obj->resv to point to the gpuvm resv object, then obj->resv
still points to &obj->_resv. In this case we don't want to decrement
the refcount of the object being freed (since the refcnt is already
zero). This fixes the following splat:
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 9 PID: 7013 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_da>
qcom_pil_info i2c_hid drm_kms_helper qcom_common qcom_q6v5 phy_snps_eusb2 qcom_geni_serial drm qcom_sysmon pinctrl_s>
CPU: 9 UID: 1000 PID: 7013 Comm: deqp-vk Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate+0xf4/0x148
lr : refcount_warn_saturate+0xf4/0x148
sp : ffff8000a2073920
x29: ffff8000a2073920 x28: 0000000000000010 x27: 0000000000000010
x26: 0000000000000042 x25: ffff000810e09800 x24: 0000000000000010
x23: ffff8000a2073b94 x22: ffff000ddb22de00 x21: ffff000ddb22dc00
x20: ffff000ddb22ddf8 x19: ffff0008024934e0 x18: 000000000000000a
x17: 0000000000000000 x16: ffff9f8c67d77340 x15: 0000000000000000
x14: 00000000ffffffff x13: 2e656572662d7265 x12: 7466612d65737520
x11: 3b776f6c66726564 x10: 00000000ffff7fff x9 : ffff9f8c67506c70
x8 : ffff9f8c69fa26f0 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
x5 : ffff000f53e14548 x4 : ffff6082ea2b2000 x3 : ffff0008b86ab080
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008b86ab080
Call trace:
refcount_warn_saturate+0xf4/0x148 (P)
msm_gem_free_object+0x248/0x260 [msm]
drm_gem_object_free+0x24/0x40 [drm]
msm_gem_new+0x1c4/0x1e0 [msm]
msm_gem_new_handle+0x3c/0x1a0 [msm]
msm_ioctl_gem_new+0x38/0x70 [msm]
drm_ioctl_kernel+0xc8/0x138 [drm]
drm_ioctl+0x2c8/0x618 [drm]
__arm64_sys_ioctl+0xac/0x108
invoke_syscall.constprop.0+0x64/0xe8
el0_svc_common.constprop.0+0x40/0xe8
do_el0_svc+0x24/0x38
el0_svc+0x54/0x1d8
el0t_64_sync_handler+0x10c/0x138
el0t_64_sync+0x19c/0x1a0
irq event stamp: 3698694
hardirqs last enabled at (3698693): [<ffff9f8c675021dc>] __up_console_sem+0x74/0x90
hardirqs last disabled at (3698694): [<ffff9f8c68ce8164>] el1_dbg+0x24/0x90
softirqs last enabled at (3697578): [<ffff9f8c6744ec5c>] handle_softirqs+0x454/0x4b0
softirqs last disabled at (3697567): [<ffff9f8c67360244>] __do_softirq+0x1c/0x28
---[ end trace 0000000000000000 ]---
Fixes: b58e12a66e47 ("drm/msm: Add _NO_SHARE flag")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
---
drivers/gpu/drm/msm/msm_gem.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 33d3354c6102..958bac4e2768 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -1114,10 +1114,12 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
put_pages(obj);
}
- if (msm_obj->flags & MSM_BO_NO_SHARE) {
+ if (obj->resv != &obj->_resv) {
struct drm_gem_object *r_obj =
container_of(obj->resv, struct drm_gem_object, _resv);
+ WARN_ON(!(msm_obj->flags & MSM_BO_NO_SHARE));
+
/* Drop reference we hold to shared resv obj: */
drm_gem_object_put(r_obj);
}
--
2.50.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH 2/2] drm/msm: Fix submit error path cleanup
2025-07-23 19:08 [PATCH 0/2] drm/msm: Error path fixes Rob Clark
2025-07-23 19:08 ` [PATCH 1/2] drm/msm: Fix refcnt underflow in error path Rob Clark
@ 2025-07-23 19:08 ` Rob Clark
1 sibling, 0 replies; 3+ messages in thread
From: Rob Clark @ 2025-07-23 19:08 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, Dmitry Baryshkov,
Abhinav Kumar, Jessica Zhang, Sean Paul, Marijn Suijten,
David Airlie, Simona Vetter, open list
submit_unpin_objects() should come before we unlock the objects. This
fixes the splat:
WARNING: CPU: 2 PID: 2171 at drivers/gpu/drm/msm/msm_gem.h:395 msm_gem_unpin_locked+0x8c/0xd8 [msm]
Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_dais q6apm_dai snd_q6dsp_common q6prm snd_q6apm qcom_pd_mapper cdc_mbim cdc_wdm cdc_ncm r8153_ecm cdc_ether usbnet sunrpc nls_ascii nls_cp437 vfat fat snd_soc_x1e80100 snd_soc_lpass_rx_macro snd_soc_lpass_tx_macro snd_soc_lpass_va_macro snd_soc_lpass_wsa_macro snd_soc_qcom_common soundwire_qcom snd_soc_lpass_macro_common snd_soc_hdmi_codec snd_soc_qcom_sdw ext4 snd_soc_core snd_compress soundwire_bus snd_pcm_dmaengine snd_seq mbcache jbd2 snd_seq_device snd_pcm pm8941_pwrkey snd_timer r8152 qcom_spmi_temp_alarm industrialio snd lenovo_yoga_slim7x ath12k mii arm_smccc_trng soundcore rng_core evdev loop panel_samsung_atna33xc20 msm ubwc_config drm_client_lib drm_gpuvm drm_exec gpu_sched drm_display_helper pmic_glink_altmode aux_hpd_bridge ucsi_glink qcom_battmgr phy_qcom_qmp_combo ps883x cec aux_bridge drm_dp_aux_bus i2c_hid_of aes_ce_blk drm_kms_helper aes_ce_cipher i2c_hid qcom_q6v5_pas
ghash_ce qcom_pil_info drm sha1_ce qcom_common phy_snps_eusb2 qcom_geni_serial qcom_q6v5 qcom_sysmon pinctrl_sm8550_lpass_lpi lpasscc_sc8280xp sbsa_gwdt mdt_loader gpio_keys pmic_glink i2c_dev efivarfs autofs4
CPU: 2 UID: 1000 PID: 2171 Comm: gnome-shell Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : msm_gem_unpin_locked+0x8c/0xd8 [msm]
lr : msm_gem_unpin_locked+0x88/0xd8 [msm]
sp : ffff80009c963820
x29: ffff80009c963820 x28: ffff80009c9639f8 x27: ffff00080552a830
x26: 0000000000000000 x25: ffff0009d5655800 x24: 0000000000000000
x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000000000
x20: ffff000831db5480 x19: ffff000816e74400 x18: 0000000000000000
x17: 0000000000000000 x16: ffffc1396afdd720 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: ffff0008c065bc00
x11: ffff0008c065c000 x10: 0000000000000000 x9 : ffffc13945b19074
x8 : 0000000000000000 x7 : 0000000000000209 x6 : 0000000000000002
x5 : 0000000000019d01 x4 : ffff0008ba8db080 x3 : 000000000004093f
x2 : ffff3ed5e727f000 x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
msm_gem_unpin_locked+0x8c/0xd8 [msm] (P)
msm_ioctl_gem_submit+0x32c/0x1760 [msm]
drm_ioctl_kernel+0xc8/0x138 [drm]
drm_ioctl+0x2c8/0x618 [drm]
__arm64_sys_ioctl+0xac/0x108
invoke_syscall.constprop.0+0x64/0xe8
el0_svc_common.constprop.0+0x40/0xe8
do_el0_svc+0x24/0x38
el0_svc+0x54/0x1d8
el0t_64_sync_handler+0x10c/0x138
el0t_64_sync+0x19c/0x1a0
irq event stamp: 2185036
hardirqs last enabled at (2185035): [<ffffc1396afeef9c>] _raw_spin_unlock_irqrestore+0x74/0x80
hardirqs last disabled at (2185036): [<ffffc1396afd8164>] el1_dbg+0x24/0x90
softirqs last enabled at (2184778): [<ffffc13969675e44>] fpsimd_restore_current_state+0x3c/0x328
softirqs last disabled at (2184776): [<ffffc13969675e14>] fpsimd_restore_current_state+0xc/0x328
---[ end trace 0000000000000000 ]---
Fixes: 111fdd2198e6 ("drm/msm: drm_gpuvm conversion")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
---
drivers/gpu/drm/msm/msm_gem_submit.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 5f8e939a5906..0ac4c199ec93 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -514,14 +514,15 @@ static int submit_reloc(struct msm_gem_submit *submit, struct drm_gem_object *ob
*/
static void submit_cleanup(struct msm_gem_submit *submit, bool error)
{
+ if (error)
+ submit_unpin_objects(submit);
+
if (submit->exec.objects)
drm_exec_fini(&submit->exec);
- if (error) {
- submit_unpin_objects(submit);
- /* job wasn't enqueued to scheduler, so early retirement: */
+ /* if job wasn't enqueued to scheduler, early retirement: */
+ if (error)
msm_submit_retire(submit);
- }
}
void msm_submit_retire(struct msm_gem_submit *submit)
--
2.50.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-07-23 19:09 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-23 19:08 [PATCH 0/2] drm/msm: Error path fixes Rob Clark
2025-07-23 19:08 ` [PATCH 1/2] drm/msm: Fix refcnt underflow in error path Rob Clark
2025-07-23 19:08 ` [PATCH 2/2] drm/msm: Fix submit error path cleanup Rob Clark
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).