All of lore.kernel.org
 help / color / mirror / Atom feed
* Use after free with GEM shadow-buffered planes
@ 2023-11-15 16:32 Alyssa Ross
  2023-11-17 14:08 ` Thomas Zimmermann
  0 siblings, 1 reply; 3+ messages in thread
From: Alyssa Ross @ 2023-11-15 16:32 UTC (permalink / raw)
  To: dri-devel, Thomas Zimmermann

[-- Attachment #1: Type: text/plain, Size: 4459 bytes --]

[Originally reported at https://gitlab.freedesktop.org/drm/misc/-/issues/33]

The following happens in a cycle:

 • An atomic state is allocated
 • A plane state is allocated (drm_gem_duplicate_shadow_plane_state())
 • Commit (drm_atomic_helper_commit(), possibly nonblocking / asynchronously)
 • The previous plane state is freed (drm_gem_destroy_shadow_plane_state())
 • The atomic state is put

But what happens if a nonblocking commit doesn't get scheduled until a
couple of iterations later in the cycle?  Plane states are not
refcounted, so by that point, the plane state has been freed, and so
commit_tail() will encounter a use after free when it accesses the plane
state.

I encountered this issue using simpledrm on the Asahi kernel based on
v6.5, but none of the files I examined to determine that this is a
use-after-free have been modified from mainline.  I've also reviewed the
diff between my kernel and tip of mainline (8f6f76a6a29f), and didn't
see anything that would affect this issue.

Here's an example of a use after free.  It's been a couple of weeks
since I thoroughly investigated this, but from memory, in this case, the
plane state has been overwritten by a struct drm_crtc_state.

Unable to handle kernel paging request at virtual address 0000000100000049
Mem abort info:
  ESR = 0x0000000096000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 16k pages, 48-bit VAs, pgdp=000000080e0e31b0
[0000000100000049] pgd=080000083d390003, p4d=080000083d390003, pud=080000083db9c003, pmd=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Modules linked in: overlay uas usb_storage usbhid rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device bnep des_generic libdes md4 brcmfmac_wcc joydev hci_bcm4377 bluetooth brcmfmac brcmutil cfg80211 hid_magicmouse ecdh_generic ecc rfkill snd_soc_macaudio macsmc_hid macsmc_power macsmc_reboot ofpart spi_nor apple_isp videobuf2_dma_sg snd_soc_cs42l84 snd_soc_tas2764 videobuf2_memops clk_apple_nco snd_soc_apple_mca apple_admac videobuf2_v4l2 videodev videobuf2_common mc hid_apple pwm_apple leds_pwm apple_soc_cpufreq xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_rpfilter ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog nft_compat nf_tables nfnetlink loop tun tap macvlan bridge stp llc fuse zstd zram dm_crypt xhci_plat_hcd xhci_hcd nvmem_spmi_mfd rtc_macsmc gpio_macsmc tps6598x dockchannel_hid simple_mfd_spmi regmap_spmi nvme_apple phy_apple_atc dwc3 pcie_apple typec pci_host_common udc_core apple_sart macsmc_rtkit apple_rtkit_helper apple_dockchannel macsmc apple_rtkit mfd_core
 spmi_apple_controller nvmem_apple_efuses pinctrl_apple_gpio spi_apple i2c_apple apple_dart apple_mailbox btrfs xor xor_neon raid6_pq
CPU: 0 PID: 1095074 Comm: kworker/u16:11 Tainted: G S                 6.5.0-asahi #1-NixOS
Hardware name: Apple MacBook Pro (13-inch, M2, 2022) (DT)
Workqueue: events_unbound commit_work
pstate: 21400009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : drm_gem_fb_vunmap+0x18/0x74
lr : drm_gem_end_shadow_fb_access+0x1c/0x2c
sp : ffff800087ea3d00
x29: ffff800087ea3d00 x28: 0000000000000000 x27: 0000000000000000
x26: ffff800081325000 x25: 00000000fffffef7 x24: ffff000046c5b560
x23: ffff000001fcaa05 x22: 0000000000000000 x21: 0000000100000001
x20: ffff000046c5b500 x19: 0000000000000001 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff2e2d5ab0
x14: 0000000000000195 x13: 0000000000000000 x12: ffff800081310a80
x11: 0000000000000001 x10: 1444e7e23f083897 x9 : 6e82f0b7605f292f
x8 : ffff0001249e0f48 x7 : 0000000000000004 x6 : 0000000000000190
x5 : 0000000000000001 x4 : ffff000093c54440 x3 : ffff00000e968000
x2 : ffff80008077883c x1 : ffff00009ce37498 x0 : 0000000100000001
Call trace:
 drm_gem_fb_vunmap+0x18/0x74
 drm_gem_end_shadow_fb_access+0x1c/0x2c
 drm_atomic_helper_cleanup_planes+0x58/0xd8
 drm_atomic_helper_commit_tail+0x90/0xa0
 commit_tail+0x15c/0x188
 commit_work+0x14/0x20
 process_one_work+0x1e0/0x344
 worker_thread+0x68/0x424
 kthread+0xf4/0x100
 ret_from_fork+0x10/0x20
Code: 910003fd a90153f3 f90013f5 aa0003f5 (f9402400) 
---[ end trace 0000000000000000 ]---

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Use after free with GEM shadow-buffered planes
  2023-11-15 16:32 Use after free with GEM shadow-buffered planes Alyssa Ross
@ 2023-11-17 14:08 ` Thomas Zimmermann
  2023-11-18 16:28   ` Daniel Vetter
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Zimmermann @ 2023-11-17 14:08 UTC (permalink / raw)
  To: Alyssa Ross, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 5763 bytes --]

Hi

Am 15.11.23 um 17:32 schrieb Alyssa Ross:
> [Originally reported at https://gitlab.freedesktop.org/drm/misc/-/issues/33]
> 
> The following happens in a cycle:
> 
>   • An atomic state is allocated
>   • A plane state is allocated (drm_gem_duplicate_shadow_plane_state())
>   • Commit (drm_atomic_helper_commit(), possibly nonblocking / asynchronously)
>   • The previous plane state is freed (drm_gem_destroy_shadow_plane_state())
>   • The atomic state is put

We acquire a reference on the state at

 
https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_atomic_helper.c#L2057

and release it at

 
https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_atomic_helper.c#L1833

All new sub-state, such as for planes, should be kept allocated during 
that time.

> 
> But what happens if a nonblocking commit doesn't get scheduled until a
> couple of iterations later in the cycle?  Plane states are not
> refcounted, so by that point, the plane state has been freed, and so
> commit_tail() will encounter a use after free when it accesses the plane
> state.

I think it's assumed that the old plane state can be gone by the time 
the commit happens. But why would the new plane state? Did you figure 
this out in your analysis?

> 
> I encountered this issue using simpledrm on the Asahi kernel based on
> v6.5, but none of the files I examined to determine that this is a
> use-after-free have been modified from mainline.  I've also reviewed the
> diff between my kernel and tip of mainline (8f6f76a6a29f), and didn't
> see anything that would affect this issue.

It could be that we're passing the wrong state at

 
https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_atomic_helper.c#L2919

in some corner case, but I really don't see how.

At least that's how it's supposed to work. I'm still trying to get an 
idea how you end up with an invalid plane state.

Best regards
Thomas

> 
> Here's an example of a use after free.  It's been a couple of weeks
> since I thoroughly investigated this, but from memory, in this case, the
> plane state has been overwritten by a struct drm_crtc_state.
> 
> Unable to handle kernel paging request at virtual address 0000000100000049
> Mem abort info:
>    ESR = 0x0000000096000005
>    EC = 0x25: DABT (current EL), IL = 32 bits
>    SET = 0, FnV = 0
>    EA = 0, S1PTW = 0
>    FSC = 0x05: level 1 translation fault
> Data abort info:
>    ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
>    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 16k pages, 48-bit VAs, pgdp=000000080e0e31b0
> [0000000100000049] pgd=080000083d390003, p4d=080000083d390003, pud=080000083db9c003, pmd=0000000000000000
> Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> Modules linked in: overlay uas usb_storage usbhid rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device bnep des_generic libdes md4 brcmfmac_wcc joydev hci_bcm4377 bluetooth brcmfmac brcmutil cfg80211 hid_magicmouse ecdh_generic ecc rfkill snd_soc_macaudio macsmc_hid macsmc_power macsmc_reboot ofpart spi_nor apple_isp videobuf2_dma_sg snd_soc_cs42l84 snd_soc_tas2764 videobuf2_memops clk_apple_nco snd_soc_apple_mca apple_admac videobuf2_v4l2 videodev videobuf2_common mc hid_apple pwm_apple leds_pwm apple_soc_cpufreq xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_rpfilter ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog nft_compat nf_tables nfnetlink loop tun tap macvlan bridge stp llc fuse zstd zram dm_crypt xhci_plat_hcd xhci_hcd nvmem_spmi_mfd rtc_macsmc gpio_macsmc tps6598x dockchannel_hid simple_mfd_spmi regmap_spmi nvme_apple phy_apple_atc dwc3 pcie_apple typec pci_host_common udc_core apple_sart macsmc_rtkit apple_rtkit_helper apple_dockchannel macsmc apple_rtkit mfd_core
>   spmi_apple_controller nvmem_apple_efuses pinctrl_apple_gpio spi_apple i2c_apple apple_dart apple_mailbox btrfs xor xor_neon raid6_pq
> CPU: 0 PID: 1095074 Comm: kworker/u16:11 Tainted: G S                 6.5.0-asahi #1-NixOS
> Hardware name: Apple MacBook Pro (13-inch, M2, 2022) (DT)
> Workqueue: events_unbound commit_work
> pstate: 21400009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> pc : drm_gem_fb_vunmap+0x18/0x74
> lr : drm_gem_end_shadow_fb_access+0x1c/0x2c
> sp : ffff800087ea3d00
> x29: ffff800087ea3d00 x28: 0000000000000000 x27: 0000000000000000
> x26: ffff800081325000 x25: 00000000fffffef7 x24: ffff000046c5b560
> x23: ffff000001fcaa05 x22: 0000000000000000 x21: 0000000100000001
> x20: ffff000046c5b500 x19: 0000000000000001 x18: 0000000000000000
> x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff2e2d5ab0
> x14: 0000000000000195 x13: 0000000000000000 x12: ffff800081310a80
> x11: 0000000000000001 x10: 1444e7e23f083897 x9 : 6e82f0b7605f292f
> x8 : ffff0001249e0f48 x7 : 0000000000000004 x6 : 0000000000000190
> x5 : 0000000000000001 x4 : ffff000093c54440 x3 : ffff00000e968000
> x2 : ffff80008077883c x1 : ffff00009ce37498 x0 : 0000000100000001
> Call trace:
>   drm_gem_fb_vunmap+0x18/0x74
>   drm_gem_end_shadow_fb_access+0x1c/0x2c
>   drm_atomic_helper_cleanup_planes+0x58/0xd8
>   drm_atomic_helper_commit_tail+0x90/0xa0
>   commit_tail+0x15c/0x188
>   commit_work+0x14/0x20
>   process_one_work+0x1e0/0x344
>   worker_thread+0x68/0x424
>   kthread+0xf4/0x100
>   ret_from_fork+0x10/0x20
> Code: 910003fd a90153f3 f90013f5 aa0003f5 (f9402400)
> ---[ end trace 0000000000000000 ]---

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Use after free with GEM shadow-buffered planes
  2023-11-17 14:08 ` Thomas Zimmermann
@ 2023-11-18 16:28   ` Daniel Vetter
  0 siblings, 0 replies; 3+ messages in thread
From: Daniel Vetter @ 2023-11-18 16:28 UTC (permalink / raw)
  To: Thomas Zimmermann; +Cc: Alyssa Ross, dri-devel

On Fri, 17 Nov 2023 at 15:08, Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
> Hi
>
> Am 15.11.23 um 17:32 schrieb Alyssa Ross:
> > [Originally reported at https://gitlab.freedesktop.org/drm/misc/-/issues/33]
> >
> > The following happens in a cycle:
> >
> >   • An atomic state is allocated
> >   • A plane state is allocated (drm_gem_duplicate_shadow_plane_state())
> >   • Commit (drm_atomic_helper_commit(), possibly nonblocking / asynchronously)
> >   • The previous plane state is freed (drm_gem_destroy_shadow_plane_state())
> >   • The atomic state is put
>
> We acquire a reference on the state at
>
>
> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_atomic_helper.c#L2057
>
> and release it at
>
>
> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_atomic_helper.c#L1833
>
> All new sub-state, such as for planes, should be kept allocated during
> that time.
>
> >
> > But what happens if a nonblocking commit doesn't get scheduled until a
> > couple of iterations later in the cycle?  Plane states are not
> > refcounted, so by that point, the plane state has been freed, and so
> > commit_tail() will encounter a use after free when it accesses the plane
> > state.
>
> I think it's assumed that the old plane state can be gone by the time
> the commit happens. But why would the new plane state? Did you figure
> this out in your analysis?
>
> >
> > I encountered this issue using simpledrm on the Asahi kernel based on
> > v6.5, but none of the files I examined to determine that this is a
> > use-after-free have been modified from mainline.  I've also reviewed the
> > diff between my kernel and tip of mainline (8f6f76a6a29f), and didn't
> > see anything that would affect this issue.
>
> It could be that we're passing the wrong state at
>
>
> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_atomic_helper.c#L2919

Yeah that's buggy, and I think we we discussed this, I missed it. So
the lifetime rules are somewhat tricky:

- An atomic commit is only allowed to look at the subsequent state
objects until it signals comletion of that commit with the hw_done()
helper. After that point ownership of those state structures is handed
to the next commit, which can release them.

- The next commit has to wait for hw_done completions of the previous
commits, so that it doesn't release state objects too early.

- This means that when you're after hw_done() in the commit code, you
can only look at the old state structures. This means that
->cleanup_fb is fine, but the ->end_fb_access case is not. It needs to
be moved much earlier, probably into the varios commit_planes()
functions (we have two of those iirc).

- But ->end_fb_access _is_ in the right place for the case where we've
aborted a commit before the point of no return.

> in some corner case, but I really don't see how.
>
> At least that's how it's supposed to work. I'm still trying to get an
> idea how you end up with an invalid plane state.

It's indeed a very tricky situation here, but I think I understand the
bug. Maybe we need a bunch more commits in relevant places to explain
this all ..

Cheers, Sima

>
> Best regards
> Thomas
>
> >
> > Here's an example of a use after free.  It's been a couple of weeks
> > since I thoroughly investigated this, but from memory, in this case, the
> > plane state has been overwritten by a struct drm_crtc_state.
> >
> > Unable to handle kernel paging request at virtual address 0000000100000049
> > Mem abort info:
> >    ESR = 0x0000000096000005
> >    EC = 0x25: DABT (current EL), IL = 32 bits
> >    SET = 0, FnV = 0
> >    EA = 0, S1PTW = 0
> >    FSC = 0x05: level 1 translation fault
> > Data abort info:
> >    ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> >    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> >    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > user pgtable: 16k pages, 48-bit VAs, pgdp=000000080e0e31b0
> > [0000000100000049] pgd=080000083d390003, p4d=080000083d390003, pud=080000083db9c003, pmd=0000000000000000
> > Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> > Modules linked in: overlay uas usb_storage usbhid rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device bnep des_generic libdes md4 brcmfmac_wcc joydev hci_bcm4377 bluetooth brcmfmac brcmutil cfg80211 hid_magicmouse ecdh_generic ecc rfkill snd_soc_macaudio macsmc_hid macsmc_power macsmc_reboot ofpart spi_nor apple_isp videobuf2_dma_sg snd_soc_cs42l84 snd_soc_tas2764 videobuf2_memops clk_apple_nco snd_soc_apple_mca apple_admac videobuf2_v4l2 videodev videobuf2_common mc hid_apple pwm_apple leds_pwm apple_soc_cpufreq xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_rpfilter ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog nft_compat nf_tables nfnetlink loop tun tap macvlan bridge stp llc fuse zstd zram dm_crypt xhci_plat_hcd xhci_hcd nvmem_spmi_mfd rtc_macsmc gpio_macsmc tps6598x dockchannel_hid simple_mfd_spmi regmap_spmi nvme_apple phy_apple_atc dwc3 pcie_apple typec pci_host_common udc_core apple_sart macsmc_rtkit apple_rtkit_helper apple_dockchannel macsmc apple_rtkit mfd_core
> >   spmi_apple_controller nvmem_apple_efuses pinctrl_apple_gpio spi_apple i2c_apple apple_dart apple_mailbox btrfs xor xor_neon raid6_pq
> > CPU: 0 PID: 1095074 Comm: kworker/u16:11 Tainted: G S                 6.5.0-asahi #1-NixOS
> > Hardware name: Apple MacBook Pro (13-inch, M2, 2022) (DT)
> > Workqueue: events_unbound commit_work
> > pstate: 21400009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> > pc : drm_gem_fb_vunmap+0x18/0x74
> > lr : drm_gem_end_shadow_fb_access+0x1c/0x2c
> > sp : ffff800087ea3d00
> > x29: ffff800087ea3d00 x28: 0000000000000000 x27: 0000000000000000
> > x26: ffff800081325000 x25: 00000000fffffef7 x24: ffff000046c5b560
> > x23: ffff000001fcaa05 x22: 0000000000000000 x21: 0000000100000001
> > x20: ffff000046c5b500 x19: 0000000000000001 x18: 0000000000000000
> > x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff2e2d5ab0
> > x14: 0000000000000195 x13: 0000000000000000 x12: ffff800081310a80
> > x11: 0000000000000001 x10: 1444e7e23f083897 x9 : 6e82f0b7605f292f
> > x8 : ffff0001249e0f48 x7 : 0000000000000004 x6 : 0000000000000190
> > x5 : 0000000000000001 x4 : ffff000093c54440 x3 : ffff00000e968000
> > x2 : ffff80008077883c x1 : ffff00009ce37498 x0 : 0000000100000001
> > Call trace:
> >   drm_gem_fb_vunmap+0x18/0x74
> >   drm_gem_end_shadow_fb_access+0x1c/0x2c
> >   drm_atomic_helper_cleanup_planes+0x58/0xd8
> >   drm_atomic_helper_commit_tail+0x90/0xa0
> >   commit_tail+0x15c/0x188
> >   commit_work+0x14/0x20
> >   process_one_work+0x1e0/0x344
> >   worker_thread+0x68/0x424
> >   kthread+0xf4/0x100
> >   ret_from_fork+0x10/0x20
> > Code: 910003fd a90153f3 f90013f5 aa0003f5 (f9402400)
> > ---[ end trace 0000000000000000 ]---
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstrasse 146, 90461 Nuernberg, Germany
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> HRB 36809 (AG Nuernberg)



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-11-18 16:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-15 16:32 Use after free with GEM shadow-buffered planes Alyssa Ross
2023-11-17 14:08 ` Thomas Zimmermann
2023-11-18 16:28   ` Daniel Vetter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.