From: Boris Brezillon <boris.brezillon@collabora.com>
To: "Andy Yan" <andyshrk@163.com>, Danilo Krummrich <dakr@redhat.com>
Cc: dri-devel@lists.freedesktop.org,
"Tatsuyuki Ishi" <ishitatsuyuki@gmail.com>,
"Nicolas Boichat" <drinkcat@chromium.org>,
kernel@collabora.com, "Daniel Stone" <daniels@collabora.com>,
"Neil Armstrong" <neil.armstrong@linaro.org>,
"Ketil Johnsen" <ketil.johnsen@arm.com>,
"Liviu Dudau" <Liviu.Dudau@arm.com>,
"Steven Price" <steven.price@arm.com>,
"Clément Péron" <peron.clem@gmail.com>,
"Daniel Vetter" <daniel@ffwll.ch>,
"Chris Diamand" <chris.diamand@foss.arm.com>,
"Marty E . Plummer" <hanetzer@startmail.com>,
"Robin Murphy" <robin.murphy@arm.com>,
"Faith Ekstrand" <faith.ekstrand@collabora.com>
Subject: Re: [PATCH v4 00/14] drm: Add a driver for CSF-based Mali GPUs
Date: Mon, 5 Feb 2024 10:03:21 +0100
Message-ID: <20240205100321.0321a208@collabora.com>
In-Reply-To: <1554e55.29c.18d71ae9b6c.Coremail.andyshrk@163.com>
+Danilo for the panthor gpuvm-needs update.
On Sun, 4 Feb 2024 09:14:44 +0800 (CST)
"Andy Yan" <andyshrk@163.com> wrote:
> Hi Boris:
> I see this warning sometimes (running on an Armbian-based bookworm); not sure if it is a known issue or something else.
> [15368.293031] systemd-journald[715]: Received client request to relinquish /var/log/journal/1bc4a340506142af9bd31a6a3d2170ba access.
> [37743.040737] ------------[ cut here ]------------
> [37743.040764] panthor fb000000.gpu: drm_WARN_ON(shmem->pages_use_count)
> [37743.040890] WARNING: CPU: 2 PID: 5702 at drivers/gpu/drm/drm_gem_shmem_helper.c:158 drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper]
> [37743.040929] Modules linked in: joydev rfkill sunrpc lz4hc lz4 zram binfmt_misc hantro_vpu crct10dif_ce v4l2_vp9 v4l2_h264 snd_soc_simple_amplifier v4l2_mem2mem videobuf2_dma_contig snd_soc_es8328_i2c videobuf2_memops rk_crypto2 snd_soc_es8328 videobuf2_v4l2 sm3_generic videodev crypto_engine sm3 rockchip_rng videobuf2_common nvmem_rockchip_otp snd_soc_rockchip_i2s_tdm snd_soc_hdmi_codec snd_soc_simple_card mc snd_soc_simple_card_utils snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_pcm snd_timer snd soundcore dm_mod ip_tables x_tables autofs4 dw_hdmi_qp_i2s_audio dw_hdmi_qp_cec rk808_regulator rockchipdrm dw_mipi_dsi dw_hdmi_qp dw_hdmi analogix_dp drm_dma_helper fusb302 display_connector rk8xx_spi drm_display_helper phy_rockchip_snps_pcie3 phy_rockchip_samsung_hdptx_hdmi panthor tcpm rk8xx_core cec drm_gpuvm gpu_sched drm_kms_helper drm_shmem_helper drm_exec r8169 drm pwm_bl adc_keys
> [37743.041108] CPU: 2 PID: 5702 Comm: kworker/u16:8 Not tainted 6.8.0-rc1-edge-rockchip-rk3588 #2
> [37743.041115] Hardware name: Rockchip RK3588 EVB1 V10 Board (DT)
> [37743.041120] Workqueue: panthor-cleanup panthor_vm_bind_job_cleanup_op_ctx_work [panthor]
> [37743.041151] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [37743.041157] pc : drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper]
> [37743.041169] lr : drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper]
> [37743.041181] sp : ffff80008d37bcc0
> [37743.041184] x29: ffff80008d37bcc0 x28: ffff800081d379c0 x27: ffff800081d37000
> [37743.041196] x26: ffff00019909a280 x25: ffff00019909a2c0 x24: ffff0001017a4c05
> [37743.041206] x23: dead000000000100 x22: dead000000000122 x21: ffff0001627ac1a0
> [37743.041217] x20: 0000000000000000 x19: ffff0001627ac000 x18: 0000000000000000
> [37743.041227] x17: 000000040044ffff x16: 005000f2b5503510 x15: fffffffffff91b77
> [37743.041238] x14: 0000000000000001 x13: 00000000000003c5 x12: 00000000ffffffea
> [37743.041248] x11: 00000000ffffdfff x10: 00000000ffffdfff x9 : ffff800081e0e818
> [37743.041259] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
> [37743.041269] x5 : 0000000000001fff x4 : 0000000000000000 x3 : ffff8000819a6008
> [37743.041279] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff00018465e900
> [37743.041290] Call trace:
> [37743.041293] drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper]
> [37743.041306] panthor_gem_free_object+0x24/0xa0 [panthor]
> [37743.041321] drm_gem_object_free+0x1c/0x30 [drm]
> [37743.041452] panthor_vm_bo_put+0xc4/0x12c [panthor]
> [37743.041475] panthor_vm_cleanup_op_ctx.constprop.0+0xb0/0x104 [panthor]
> [37743.041491] panthor_vm_bind_job_cleanup_op_ctx_work+0x28/0xd0 [panthor]
Ok, I think I found the culprit: there's a race between
the drm_gpuvm_bo_put() call in panthor_vm_bo_put() and the list
iteration done by drm_gpuvm_prepare_objects(). Because we're not
setting DRM_GPUVM_RESV_PROTECTED, the code goes through the 'lockless'
iteration loop, and takes/releases a vm_bo ref at each iteration. This
means our 'were we the last vm_bo user?' test in panthor_vm_bo_put()
might return false even if we were actually the last user, and when
for_each_vm_bo_in_list() releases the ref it acquired, it not only leaks
the pin reference, thus leaving GEM pages pinned (which explains this
WARN_ON() splat), but it also calls drm_gpuvm_bo_destroy() in a path
where we don't hold the GPUVA list lock, which is bad.
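To make the interleaving concrete, here is a minimal single-threaded replay of it. This is a toy model, not the panthor or drm_gpuvm code: every name below (toy_vm_bo, toy_vm_bo_put(), toy_driver_put(), toy_replay_race()) is hypothetical, and the refcount/pin bookkeeping is only assumed to mirror the pattern described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are real drm_gpuvm or panthor symbols. */
struct toy_vm_bo {
	int refcount;   /* models the vm_bo reference count */
	int pin_count;  /* models the pin reference on the GEM pages */
	bool destroyed; /* set when the last reference is dropped */
};

/* Drop one reference; returns true if this call dropped the last one. */
static bool toy_vm_bo_put(struct toy_vm_bo *vm_bo)
{
	assert(vm_bo->refcount > 0);
	if (--vm_bo->refcount == 0) {
		vm_bo->destroyed = true;
		return true;
	}
	return false;
}

/* Driver-side put: unpin only if we believe we were the last user. */
static void toy_driver_put(struct toy_vm_bo *vm_bo)
{
	if (toy_vm_bo_put(vm_bo))
		vm_bo->pin_count--;
}

/*
 * Replay the problematic ordering: the lockless iterator holds a
 * temporary reference while the driver does its put.
 * Returns the pin count left over once the object is destroyed.
 */
int toy_replay_race(void)
{
	struct toy_vm_bo vm_bo = { .refcount = 1, .pin_count = 1 };

	vm_bo.refcount++;        /* iterator takes its per-iteration ref */
	toy_driver_put(&vm_bo);  /* "last user?" test returns false */
	toy_vm_bo_put(&vm_bo);   /* iterator drops its ref: the real last put */

	assert(vm_bo.destroyed);
	return vm_bo.pin_count;  /* pin was never released */
}
```

In this model toy_replay_race() returns 1: the object is destroyed with its pin still held, which is the situation the drm_WARN_ON(shmem->pages_use_count) check reports.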
Long story short, I'll have to use DRM_GPUVM_RESV_PROTECTED, which is
fine because I'm deferring vm_bo removal to a work where taking the VM
resv lock is allowed. Since I was the one asking for this lockless
iterator in the first place, I wonder if we should kill that and make
DRM_GPUVM_RESV_PROTECTED the default (this would greatly simplify
the code). AFAICT, the PowerVR driver shouldn't be impacted because it's
using drm_gpuvm in synchronous mode only, and Xe already uses the
resv-protected mode. That leaves Nouveau, but IIRC, it's also doing VM
updates in the ioctl path.
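For comparison, a toy sketch of the resv-protected scheme, under the assumption that both the list walk and the deferred cleanup run with the same lock held, so the walker never needs a temporary reference. Again, all names (toy_vm, toy_vm_bo, toy_walk_locked(), toy_cleanup_locked(), toy_replay_protected()) are hypothetical, and a plain flag stands in for the VM resv lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names do not exist in drm_gpuvm or panthor. */
struct toy_vm {
	bool resv_locked; /* models the VM resv lock being held */
};

struct toy_vm_bo {
	struct toy_vm *vm;
	int refcount;
	int pin_count;
};

/* Walk under the lock: no temporary reference is taken. */
static void toy_walk_locked(struct toy_vm_bo *vm_bo)
{
	assert(vm_bo->vm->resv_locked);
	(void)vm_bo->refcount; /* inspect only; the count is left untouched */
}

/* Deferred cleanup: put and unpin, also under the lock. */
static void toy_cleanup_locked(struct toy_vm_bo *vm_bo)
{
	assert(vm_bo->vm->resv_locked);
	if (--vm_bo->refcount == 0)
		vm_bo->pin_count--;
}

/* Returns the pin count left after the walk and the cleanup have run. */
int toy_replay_protected(void)
{
	struct toy_vm vm = { .resv_locked = false };
	struct toy_vm_bo vm_bo = { .vm = &vm, .refcount = 1, .pin_count = 1 };

	vm.resv_locked = true;  /* walker takes the lock */
	toy_walk_locked(&vm_bo);
	vm.resv_locked = false;

	vm.resv_locked = true;  /* cleanup work takes the same lock */
	toy_cleanup_locked(&vm_bo);
	vm.resv_locked = false;

	return vm_bo.pin_count; /* pin released by the cleanup path */
}
```

Here toy_replay_protected() returns 0: because the walker cannot hold a reference across the cleanup, the put in the cleanup path really is the last one and the pin is dropped.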
Danilo, any opinions?
Andy, I pushed a new version to the panthor-next [1] and
panthor-next+rk3588 [2] branches. The fix I'm talking about is [3], but
you probably want to consider taking all the fixups in your branch.
Regards,
Boris
[1] https://gitlab.freedesktop.org/panfrost/linux/-/commits/panthor-next
[2] https://gitlab.freedesktop.org/panfrost/linux/-/commits/panthor-next+rk3588
[3] https://gitlab.freedesktop.org/panfrost/linux/-/commit/df48c09662a403275e76e679ee004085badea7c1