public inbox for linux-kernel@vger.kernel.org
* [PATCH v3] drm/shmem-helper: Fix huge page mapping in fault handler
@ 2026-03-16  0:26 Pedro Demarchi Gomes
  2026-03-16  9:14 ` Thomas Zimmermann
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Pedro Demarchi Gomes @ 2026-03-16  0:26 UTC (permalink / raw)
  To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Boris Brezillon
  Cc: dri-devel, linux-kernel, Pedro Demarchi Gomes

When running ./tools/testing/selftests/mm/split_huge_page_test multiple
times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and
/sys/kernel/mm/transparent_hugepage/enabled both set to "always", the
following BUG occurs:

[  232.728858] ------------[ cut here ]------------
[  232.729458] kernel BUG at mm/memory.c:2276!
[  232.729726] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  232.730217] CPU: 19 UID: 60578 PID: 1497 Comm: llvmpipe-9 Not tainted 7.0.0-rc1mm-new+ #19 PREEMPT(lazy)
[  232.730855] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-9.fc43 06/10/2025
[  232.731360] RIP: 0010:walk_to_pmd+0x29e/0x3c0
[  232.731569] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8
[  232.732614] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282
[  232.732991] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000
[  232.733362] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7
[  232.733801] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46
[  232.734168] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd
[  232.734459] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000
[  232.734861] FS:  00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000
[  232.735265] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  232.735548] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0
[  232.736031] Call Trace:
[  232.736273]  <TASK>
[  232.736500]  get_locked_pte+0x1f/0xa0
[  232.736878]  insert_pfn+0x9f/0x350
[  232.737190]  ? __pfx_pat_pagerange_is_ram+0x10/0x10
[  232.737614]  ? __pfx_insert_pfn+0x10/0x10
[  232.737990]  ? __pfx_css_rstat_updated+0x10/0x10
[  232.738281]  ? __pfx_pfn_modify_allowed+0x10/0x10
[  232.738552]  ? lookup_memtype+0x62/0x180
[  232.738761]  vmf_insert_pfn_prot+0x14b/0x340
[  232.739012]  ? __pfx_vmf_insert_pfn_prot+0x10/0x10
[  232.739247]  ? __pfx___might_resched+0x10/0x10
[  232.739475]  drm_gem_shmem_fault.cold+0x18/0x39
[  232.739677]  ? rcu_read_unlock+0x20/0x70
[  232.739882]  __do_fault+0x251/0x7b0
[  232.740028]  do_fault+0x6e1/0xc00
[  232.740167]  ? __lock_acquire+0x590/0xc40
[  232.740335]  handle_pte_fault+0x439/0x760
[  232.740498]  ? mtree_range_walk+0x252/0xae0
[  232.740669]  ? __pfx_handle_pte_fault+0x10/0x10
[  232.740899]  __handle_mm_fault+0xa02/0xf30
[  232.741066]  ? __pfx___handle_mm_fault+0x10/0x10
[  232.741255]  ? find_vma+0xa1/0x120
[  232.741403]  handle_mm_fault+0x2bf/0x8f0
[  232.741564]  do_user_addr_fault+0x2d3/0xed0
[  232.741736]  ? trace_page_fault_user+0x1bf/0x240
[  232.741969]  exc_page_fault+0x87/0x120
[  232.742124]  asm_exc_page_fault+0x26/0x30
[  232.742288] RIP: 0033:0x7fb4d73ed546
[  232.742441] Code: 66 41 0f 6f fb 66 44 0f 6d dc 66 44 0f 6f c6 66 41 0f 6d f1 66 0f 6c fc 66 45 0f 6c c1 66 44 0f 6f c9 66 0f 6d ca 66 0f db f0 <66> 0f df 04 08 66 44 0f 6c ca 66 45 0f db c2 66 44 0f df 10 66 44
[  232.743193] RSP: 002b:00007fb4907f68a0 EFLAGS: 00010206
[  232.743565] RAX: 00007fb47840aa00 RBX: 00007fb4d73ec070 RCX: 0000000000001400
[  232.743871] RDX: 0000000000002800 RSI: 0000000000003c00 RDI: 0000000000000001
[  232.744150] RBP: 0000000000000004 R08: 0000000000001400 R09: 00007fb4d73ec060
[  232.744433] R10: 000055f0261a4288 R11: 00007fb4c013da40 R12: 0000000000000008
[  232.744712] R13: 0000000000000000 R14: 4332322132212110 R15: 0000000000000004
[  232.746616]  </TASK>
[  232.746711] Modules linked in: nft_nat nft_masq veth bridge stp llc snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore overlay rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr ppdev 9pnet_virtio 9pnet parport_pc i2c_piix4 netfs pcspkr parport i2c_smbus joydev sunrpc vfat fat loop dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport zram lz4hc_compress vmw_vmci lz4_compress vsock e1000 bochs serio_raw ata_generic pata_acpi scsi_dh_rdac scsi_dh_emc scsi_dh_alua i2c_dev fuse qemu_fw_cfg
[  232.749308] ---[ end trace 0000000000000000 ]---
[  232.749507] RIP: 0010:walk_to_pmd+0x29e/0x3c0
[  232.749692] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8
[  232.750428] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282
[  232.750645] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000
[  232.750954] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7
[  232.751232] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46
[  232.751514] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd
[  232.751837] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000
[  232.752124] FS:  00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000
[  232.752441] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  232.752674] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0
[  232.752983] Kernel panic - not syncing: Fatal exception
[  232.753510] Kernel Offset: disabled
[  232.754643] ---[ end Kernel panic - not syncing: Fatal exception ]---

This happens when two concurrent page faults occur within the same PMD
range: one fault installs a PMD mapping through vmf_insert_pfn_pmd(),
while the other attempts to install a PTE mapping via vmf_insert_pfn().
The BUG fires because a huge (pmd_trans_huge) PMD entry is not expected
when walking the page table inside vmf_insert_pfn().

Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
PMD-sized mappings are handled through the appropriate huge page fault path.

Fixes: 211b9a39f261 ("drm/shmem-helper: Map huge pages in fault handler")
Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
---

Changes in v3:
 - Pass a try_pmd boolean parameter to drm_gem_shmem_any_fault
 - Compile drm_gem_shmem_huge_fault only if CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
is defined to avoid a build warning

Changes in v2: https://lore.kernel.org/dri-devel/20260313141719.3949700-1-pedrodemargomes@gmail.com/
 - Keep the #ifdef unindented
 - Create drm_gem_shmem_any_fault to handle faults of any order and use
drm_gem_shmem_[huge_]fault() as wrappers

v1: https://lore.kernel.org/all/20260312155027.1682606-1-pedrodemargomes@gmail.com/

---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 63 ++++++++++++++------------
 1 file changed, 35 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 7b5a49935ae4..9428c3ab7283 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -550,36 +550,18 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev,
 }
 EXPORT_SYMBOL_GPL(drm_gem_shmem_dumb_create);

-static bool drm_gem_shmem_try_map_pmd(struct vm_fault *vmf, unsigned long addr,
-				      struct page *page)
-{
-#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
-	unsigned long pfn = page_to_pfn(page);
-	unsigned long paddr = pfn << PAGE_SHIFT;
-	bool aligned = (addr & ~PMD_MASK) == (paddr & ~PMD_MASK);
-
-	if (aligned &&
-	    pmd_none(*vmf->pmd) &&
-	    folio_test_pmd_mappable(page_folio(page))) {
-		pfn &= PMD_MASK >> PAGE_SHIFT;
-		if (vmf_insert_pfn_pmd(vmf, pfn, false) == VM_FAULT_NOPAGE)
-			return true;
-	}
-#endif
-
-	return false;
-}
-
-static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
+static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, bool try_pmd)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct drm_gem_object *obj = vma->vm_private_data;
 	struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
 	loff_t num_pages = obj->size >> PAGE_SHIFT;
-	vm_fault_t ret;
+	vm_fault_t ret = VM_FAULT_FALLBACK;
 	struct page **pages = shmem->pages;
 	pgoff_t page_offset;
 	unsigned long pfn;
+	unsigned long paddr;
+	bool aligned;

 	/* Offset to faulty address in the VMA. */
 	page_offset = vmf->pgoff - vma->vm_pgoff;
@@ -593,13 +575,19 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
 		goto out;
 	}

-	if (drm_gem_shmem_try_map_pmd(vmf, vmf->address, pages[page_offset])) {
-		ret = VM_FAULT_NOPAGE;
-		goto out;
-	}
-
 	pfn = page_to_pfn(pages[page_offset]);
-	ret = vmf_insert_pfn(vma, vmf->address, pfn);
+	if (try_pmd) {
+		paddr = pfn << PAGE_SHIFT;
+		aligned = (vmf->address & ~PMD_MASK) == (paddr & ~PMD_MASK);
+
+		if (aligned &&
+		    folio_test_pmd_mappable(page_folio(pages[page_offset]))) {
+			pfn &= PMD_MASK >> PAGE_SHIFT;
+			ret = vmf_insert_pfn_pmd(vmf, pfn, false);
+		}
+	} else {
+		ret = vmf_insert_pfn(vma, vmf->address, pfn);
+	}

  out:
 	dma_resv_unlock(shmem->base.resv);
@@ -607,6 +595,22 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
 	return ret;
 }

+#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
+static vm_fault_t drm_gem_shmem_huge_fault(struct vm_fault *vmf,
+					   unsigned int order)
+{
+	if (order != PMD_ORDER)
+		return VM_FAULT_FALLBACK;
+
+	return drm_gem_shmem_any_fault(vmf, true);
+}
+#endif
+
+static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
+{
+	return drm_gem_shmem_any_fault(vmf, false);
+}
+
 static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)
 {
 	struct drm_gem_object *obj = vma->vm_private_data;
@@ -643,6 +647,9 @@ static void drm_gem_shmem_vm_close(struct vm_area_struct *vma)

 const struct vm_operations_struct drm_gem_shmem_vm_ops = {
 	.fault = drm_gem_shmem_fault,
+#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
+	.huge_fault = drm_gem_shmem_huge_fault,
+#endif
 	.open = drm_gem_shmem_vm_open,
 	.close = drm_gem_shmem_vm_close,
 };
--
2.47.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix huge page mapping in fault handler
  2026-03-16  0:26 [PATCH v3] drm/shmem-helper: Fix huge page mapping in fault handler Pedro Demarchi Gomes
@ 2026-03-16  9:14 ` Thomas Zimmermann
  2026-03-16  9:28   ` Boris Brezillon
  2026-03-16  9:50 ` Boris Brezillon
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Thomas Zimmermann @ 2026-03-16  9:14 UTC (permalink / raw)
  To: Pedro Demarchi Gomes, Maarten Lankhorst, Maxime Ripard,
	David Airlie, Simona Vetter, Boris Brezillon
  Cc: dri-devel, linux-kernel

Hi

Am 16.03.26 um 01:26 schrieb Pedro Demarchi Gomes:
> When running ./tools/testing/selftests/mm/split_huge_page_test multiple
> times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and
> /sys/kernel/mm/transparent_hugepage/enabled both set to "always", the
> following BUG occurs:
>
> [... oops and call trace trimmed; see the original report above ...]
>
> This happens when two concurrent page faults occur within the same PMD
> range: one fault installs a PMD mapping through vmf_insert_pfn_pmd(),
> while the other attempts to install a PTE mapping via vmf_insert_pfn().
> The BUG fires because a huge (pmd_trans_huge) PMD entry is not expected
> when walking the page table inside vmf_insert_pfn().
>
> Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
> PMD-sized mappings are handled through the appropriate huge page fault path.
>
> Fixes: 211b9a39f261 ("drm/shmem-helper: Map huge pages in fault handler")
> Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
> ---
>
> Changes in v3:
>   - Pass a try_pmd boolean parameter to drm_gem_shmem_any_fault
>   - Compile drm_gem_shmem_huge_fault only if CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> is defined to avoid a build warning
>
> Changes in v2: https://lore.kernel.org/dri-devel/20260313141719.3949700-1-pedrodemargomes@gmail.com/
>   - Keep the #ifdef unindented
>   - Create drm_gem_shmem_any_fault to handle faults of any order and use
> drm_gem_shmem_[huge_]fault() as wrappers
>
> v1: https://lore.kernel.org/all/20260312155027.1682606-1-pedrodemargomes@gmail.com/
>
> ---
>   drivers/gpu/drm/drm_gem_shmem_helper.c | 63 ++++++++++++++------------
>   1 file changed, 35 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 7b5a49935ae4..9428c3ab7283 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -550,36 +550,18 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev,
>   }
>   EXPORT_SYMBOL_GPL(drm_gem_shmem_dumb_create);
>
> -static bool drm_gem_shmem_try_map_pmd(struct vm_fault *vmf, unsigned long addr,
> -				      struct page *page)
> -{
> -#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> -	unsigned long pfn = page_to_pfn(page);
> -	unsigned long paddr = pfn << PAGE_SHIFT;
> -	bool aligned = (addr & ~PMD_MASK) == (paddr & ~PMD_MASK);
> -
> -	if (aligned &&
> -	    pmd_none(*vmf->pmd) &&
> -	    folio_test_pmd_mappable(page_folio(page))) {
> -		pfn &= PMD_MASK >> PAGE_SHIFT;
> -		if (vmf_insert_pfn_pmd(vmf, pfn, false) == VM_FAULT_NOPAGE)
> -			return true;
> -	}
> -#endif
> -
> -	return false;
> -}
> -
> -static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
> +static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, bool try_pmd)

Please write two functions. One for huge pages and one for regular page 
faults.

Best regards
Thomas

> [... unchanged remainder of the quoted patch trimmed ...]

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix huge page mapping in fault handler
  2026-03-16  9:14 ` Thomas Zimmermann
@ 2026-03-16  9:28   ` Boris Brezillon
  2026-03-16 10:12     ` Thomas Zimmermann
  0 siblings, 1 reply; 16+ messages in thread
From: Boris Brezillon @ 2026-03-16  9:28 UTC (permalink / raw)
  To: Thomas Zimmermann, Maarten Lankhorst, Maxime Ripard
  Cc: Pedro Demarchi Gomes, David Airlie, Simona Vetter, dri-devel,
	linux-kernel

On Mon, 16 Mar 2026 10:14:23 +0100
Thomas Zimmermann <tzimmermann@suse.de> wrote:

> Hi
> 
> Am 16.03.26 um 01:26 schrieb Pedro Demarchi Gomes:
> > When running ./tools/testing/selftests/mm/split_huge_page_test multiple
> > times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and
> > /sys/kernel/mm/transparent_hugepage/enabled both set to "always", the
> > following BUG occurs:
> >
> > [... oops and call trace trimmed; see the original report above ...]
> >
> > This happens when two concurrent page faults occur within the same PMD
> > range: one fault installs a PMD mapping through vmf_insert_pfn_pmd(),
> > while the other attempts to install a PTE mapping via vmf_insert_pfn().
> > The BUG fires because a huge (pmd_trans_huge) PMD entry is not expected
> > when walking the page table inside vmf_insert_pfn().
> >
> > Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
> > PMD-sized mappings are handled through the appropriate huge page fault path.
> >
> > Fixes: 211b9a39f261 ("drm/shmem-helper: Map huge pages in fault handler")
> > Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
> > [... quoted changelog and patch context trimmed; only the hunk under discussion is kept ...]
> > -static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
> > +static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, bool try_pmd)  
> 
> Please write two functions. One for huge pages and one for regular page 
> faults.

That's actually what the first version of the patch was doing, and I
strongly disagree with that. In my opinion, it's more error-prone to
have two distinct implementations duplicating most of the logic
(locking, bookkeeping, ...) than having both handled in the same
function and have dummy wrappers to decide the kind of vmf_insert_pfn
to do (_pmd vs regular). Think about the folio_mark_accessed() you
recently added, and how easy it would have been to add it to
drm_gem_shmem_fault() and forget it in the huge_fault handler.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix huge page mapping in fault handler
  2026-03-16  0:26 [PATCH v3] drm/shmem-helper: Fix huge page mapping in fault handler Pedro Demarchi Gomes
  2026-03-16  9:14 ` Thomas Zimmermann
@ 2026-03-16  9:50 ` Boris Brezillon
  2026-03-16 10:14   ` Thomas Zimmermann
  2026-03-16 10:03 ` Boris Brezillon
  2026-03-16 14:21 ` Boris Brezillon
  3 siblings, 1 reply; 16+ messages in thread
From: Boris Brezillon @ 2026-03-16  9:50 UTC (permalink / raw)
  To: Pedro Demarchi Gomes
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel

On Sun, 15 Mar 2026 21:26:49 -0300
Pedro Demarchi Gomes <pedrodemargomes@gmail.com> wrote:

> When running ./tools/testing/selftests/mm/split_huge_page_test multiple
> times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and
> /sys/kernel/mm/transparent_hugepage/enabled set as always the following BUG
> occurs:
> 
> [  232.728858] ------------[ cut here ]------------
> [  232.729458] kernel BUG at mm/memory.c:2276!
> [  232.729726] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> [  232.730217] CPU: 19 UID: 60578 PID: 1497 Comm: llvmpipe-9 Not tainted 7.0.0-rc1mm-new+ #19 PREEMPT(lazy)
> [  232.730855] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-9.fc43 06/10/2025
> [  232.731360] RIP: 0010:walk_to_pmd+0x29e/0x3c0
> [  232.731569] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8
> [  232.732614] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282
> [  232.732991] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000
> [  232.733362] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7
> [  232.733801] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46
> [  232.734168] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd
> [  232.734459] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000
> [  232.734861] FS:  00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000
> [  232.735265] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  232.735548] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0
> [  232.736031] Call Trace:
> [  232.736273]  <TASK>
> [  232.736500]  get_locked_pte+0x1f/0xa0
> [  232.736878]  insert_pfn+0x9f/0x350
> [  232.737190]  ? __pfx_pat_pagerange_is_ram+0x10/0x10
> [  232.737614]  ? __pfx_insert_pfn+0x10/0x10
> [  232.737990]  ? __pfx_css_rstat_updated+0x10/0x10
> [  232.738281]  ? __pfx_pfn_modify_allowed+0x10/0x10
> [  232.738552]  ? lookup_memtype+0x62/0x180
> [  232.738761]  vmf_insert_pfn_prot+0x14b/0x340
> [  232.739012]  ? __pfx_vmf_insert_pfn_prot+0x10/0x10
> [  232.739247]  ? __pfx___might_resched+0x10/0x10
> [  232.739475]  drm_gem_shmem_fault.cold+0x18/0x39
> [  232.739677]  ? rcu_read_unlock+0x20/0x70
> [  232.739882]  __do_fault+0x251/0x7b0
> [  232.740028]  do_fault+0x6e1/0xc00
> [  232.740167]  ? __lock_acquire+0x590/0xc40
> [  232.740335]  handle_pte_fault+0x439/0x760
> [  232.740498]  ? mtree_range_walk+0x252/0xae0
> [  232.740669]  ? __pfx_handle_pte_fault+0x10/0x10
> [  232.740899]  __handle_mm_fault+0xa02/0xf30
> [  232.741066]  ? __pfx___handle_mm_fault+0x10/0x10
> [  232.741255]  ? find_vma+0xa1/0x120
> [  232.741403]  handle_mm_fault+0x2bf/0x8f0
> [  232.741564]  do_user_addr_fault+0x2d3/0xed0
> [  232.741736]  ? trace_page_fault_user+0x1bf/0x240
> [  232.741969]  exc_page_fault+0x87/0x120
> [  232.742124]  asm_exc_page_fault+0x26/0x30
> [  232.742288] RIP: 0033:0x7fb4d73ed546
> [  232.742441] Code: 66 41 0f 6f fb 66 44 0f 6d dc 66 44 0f 6f c6 66 41 0f 6d f1 66 0f 6c fc 66 45 0f 6c c1 66 44 0f 6f c9 66 0f 6d ca 66 0f db f0 <66> 0f df 04 08 66 44 0f 6c ca 66 45 0f db c2 66 44 0f df 10 66 44
> [  232.743193] RSP: 002b:00007fb4907f68a0 EFLAGS: 00010206
> [  232.743565] RAX: 00007fb47840aa00 RBX: 00007fb4d73ec070 RCX: 0000000000001400
> [  232.743871] RDX: 0000000000002800 RSI: 0000000000003c00 RDI: 0000000000000001
> [  232.744150] RBP: 0000000000000004 R08: 0000000000001400 R09: 00007fb4d73ec060
> [  232.744433] R10: 000055f0261a4288 R11: 00007fb4c013da40 R12: 0000000000000008
> [  232.744712] R13: 0000000000000000 R14: 4332322132212110 R15: 0000000000000004
> [  232.746616]  </TASK>
> [  232.746711] Modules linked in: nft_nat nft_masq veth bridge stp llc snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore overlay rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr ppdev 9pnet_virtio 9pnet parport_pc i2c_piix4 netfs pcspkr parport i2c_smbus joydev sunrpc vfat fat loop dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport zram lz4hc_compress vmw_vmci lz4_compress vsock e1000 bochs serio_raw ata_generic pata_acpi scsi_dh_rdac scsi_dh_emc scsi_dh_alua i2c_dev fuse qemu_fw_cfg
> [  232.749308] ---[ end trace 0000000000000000 ]---
> [  232.749507] RIP: 0010:walk_to_pmd+0x29e/0x3c0
> [  232.749692] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8
> [  232.750428] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282
> [  232.750645] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000
> [  232.750954] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7
> [  232.751232] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46
> [  232.751514] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd
> [  232.751837] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000
> [  232.752124] FS:  00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000
> [  232.752441] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  232.752674] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0
> [  232.752983] Kernel panic - not syncing: Fatal exception
> [  232.753510] Kernel Offset: disabled
> [  232.754643] ---[ end Kernel panic - not syncing: Fatal exception ]---
> 
> This happens when two concurrent page faults occur within the same PMD range.
> One fault installs a PMD mapping through vmf_insert_pfn_pmd(), while the other
> attempts to install a PTE mapping via vmf_insert_pfn(). The bug is
> triggered because a pmd_trans_huge is not expected when walking the page
> table inside vmf_insert_pfn.
> 
> Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
> PMD-sized mappings are handled through the appropriate huge page fault path.
> 
> Fixes: 211b9a39f261 ("drm/shmem-helper: Map huge pages in fault handler")
> Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
> ---
> 
> Changes in v3:
>  - Pass a try_pmd boolean parameter to drm_gem_shmem_any_fault
>  - Compile drm_gem_shmem_huge_fault only if CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> is defined to avoid a build warning
> 
> Changes in v2: https://lore.kernel.org/dri-devel/20260313141719.3949700-1-pedrodemargomes@gmail.com/
>  - Keep the #ifdef unindented
>  - Create drm_gem_shmem_any_fault to handle faults of any order and use
> drm_gem_shmem_[huge_]fault() as wrappers
> 
> v1: https://lore.kernel.org/all/20260312155027.1682606-1-pedrodemargomes@gmail.com/
> 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 63 ++++++++++++++------------
>  1 file changed, 35 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 7b5a49935ae4..9428c3ab7283 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -550,36 +550,18 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev,
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_dumb_create);
> 
> -static bool drm_gem_shmem_try_map_pmd(struct vm_fault *vmf, unsigned long addr,
> -				      struct page *page)
> -{
> -#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> -	unsigned long pfn = page_to_pfn(page);
> -	unsigned long paddr = pfn << PAGE_SHIFT;
> -	bool aligned = (addr & ~PMD_MASK) == (paddr & ~PMD_MASK);
> -
> -	if (aligned &&
> -	    pmd_none(*vmf->pmd) &&
> -	    folio_test_pmd_mappable(page_folio(page))) {
> -		pfn &= PMD_MASK >> PAGE_SHIFT;
> -		if (vmf_insert_pfn_pmd(vmf, pfn, false) == VM_FAULT_NOPAGE)
> -			return true;
> -	}
> -#endif
> -
> -	return false;
> -}
> -
> -static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
> +static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, bool try_pmd)
>  {
>  	struct vm_area_struct *vma = vmf->vma;
>  	struct drm_gem_object *obj = vma->vm_private_data;
>  	struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
>  	loff_t num_pages = obj->size >> PAGE_SHIFT;
> -	vm_fault_t ret;
> +	vm_fault_t ret = VM_FAULT_FALLBACK;
>  	struct page **pages = shmem->pages;
>  	pgoff_t page_offset;
>  	unsigned long pfn;
> +	unsigned long paddr;
> +	bool aligned;

Those two variables can be moved to the `if (try_pmd)` branch.

> 
>  	/* Offset to faulty address in the VMA. */
>  	page_offset = vmf->pgoff - vma->vm_pgoff;
> @@ -593,13 +575,19 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
>  		goto out;
>  	}
> 
> -	if (drm_gem_shmem_try_map_pmd(vmf, vmf->address, pages[page_offset])) {
> -		ret = VM_FAULT_NOPAGE;
> -		goto out;
> -	}
> -
>  	pfn = page_to_pfn(pages[page_offset]);
> -	ret = vmf_insert_pfn(vma, vmf->address, pfn);
> +	if (try_pmd) {
> +		paddr = pfn << PAGE_SHIFT;
> +		aligned = (vmf->address & ~PMD_MASK) == (paddr & ~PMD_MASK);
> +
> +		if (aligned &&
> +		    folio_test_pmd_mappable(page_folio(pages[page_offset]))) {
> +			pfn &= PMD_MASK >> PAGE_SHIFT;
> +			ret = vmf_insert_pfn_pmd(vmf, pfn, false);
> +		}
> +	} else {
> +		ret = vmf_insert_pfn(vma, vmf->address, pfn);
> +	}

Hm, if we get rid of drm_gem_shmem_try_map_pmd(), we probably
need something like

	if (!try_pmd) {
		ret = vmf_insert_pfn(vma, vmf->address, pfn);
	} else {
#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
		unsigned long paddr = pfn << PAGE_SHIFT;
		bool aligned = (vmf->address & ~PMD_MASK) == (paddr & ~PMD_MASK);

		if (aligned &&
		    folio_test_pmd_mappable(page_folio(pages[page_offset]))) {
			pfn &= PMD_MASK >> PAGE_SHIFT;
			ret = vmf_insert_pfn_pmd(vmf, pfn, false);
		}
#else
		ret = VM_FAULT_FALLBACK;
#endif
	}

to make sure we're not calling helpers that might not
exist in some configs (the compiler might already optimize out the
try_pmd==true branch, but better safe than sorry).

> 
>   out:
>  	dma_resv_unlock(shmem->base.resv);
> @@ -607,6 +595,22 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
>  	return ret;
>  }
> 
> +#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> +static vm_fault_t drm_gem_shmem_huge_fault(struct vm_fault *vmf,
> +					   unsigned int order)
> +{
> +	if (order != PMD_ORDER)
> +		return VM_FAULT_FALLBACK;
> +
> +	return drm_gem_shmem_any_fault(vmf, true);
> +}
> +#endif
> +
> +static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
> +{
> +	return drm_gem_shmem_any_fault(vmf, false);
> +}
> +
>  static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)
>  {
>  	struct drm_gem_object *obj = vma->vm_private_data;
> @@ -643,6 +647,9 @@ static void drm_gem_shmem_vm_close(struct vm_area_struct *vma)
> 
>  const struct vm_operations_struct drm_gem_shmem_vm_ops = {
>  	.fault = drm_gem_shmem_fault,
> +#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> +	.huge_fault = drm_gem_shmem_huge_fault,
> +#endif
>  	.open = drm_gem_shmem_vm_open,
>  	.close = drm_gem_shmem_vm_close,
>  };
> --
> 2.47.3
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-16  0:26 [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler Pedro Demarchi Gomes
  2026-03-16  9:14 ` Thomas Zimmermann
  2026-03-16  9:50 ` Boris Brezillon
@ 2026-03-16 10:03 ` Boris Brezillon
  2026-03-16 14:21 ` Boris Brezillon
  3 siblings, 0 replies; 16+ messages in thread
From: Boris Brezillon @ 2026-03-16 10:03 UTC (permalink / raw)
  To: Pedro Demarchi Gomes
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel

On Sun, 15 Mar 2026 21:26:49 -0300
Pedro Demarchi Gomes <pedrodemargomes@gmail.com> wrote:

> When running ./tools/testing/selftests/mm/split_huge_page_test multiple
> times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and
> /sys/kernel/mm/transparent_hugepage/enabled set as always the following BUG
> occurs:
> 
> [...]
> 
> This happens when two concurrent page faults occur within the same PMD range.
> One fault installs a PMD mapping through vmf_insert_pfn_pmd(), while the other
> attempts to install a PTE mapping via vmf_insert_pfn(). The bug is
> triggered because a pmd_trans_huge is not expected when walking the page
> table inside vmf_insert_pfn.
> 
> Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
> PMD-sized mappings are handled through the appropriate huge page fault path.
> 
> Fixes: 211b9a39f261 ("drm/shmem-helper: Map huge pages in fault handler")
> Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
> ---
> 
> Changes in v3:
>  - Pass a try_pmd boolean parameter to drm_gem_shmem_any_fault
>  - Compile drm_gem_shmem_huge_fault only if CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> is defined to avoid a build warning
> 
> Changes in v2: https://lore.kernel.org/dri-devel/20260313141719.3949700-1-pedrodemargomes@gmail.com/
>  - Keep the #ifdef unindented
>  - Create drm_gem_shmem_any_fault to handle faults of any order and use
> drm_gem_shmem_[huge_]fault() as wrappers
> 
> v1: https://lore.kernel.org/all/20260312155027.1682606-1-pedrodemargomes@gmail.com/
> 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 63 ++++++++++++++------------
>  1 file changed, 35 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 7b5a49935ae4..9428c3ab7283 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -550,36 +550,18 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev,
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_dumb_create);
> 
> -static bool drm_gem_shmem_try_map_pmd(struct vm_fault *vmf, unsigned long addr,
> -				      struct page *page)
> -{
> -#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> -	unsigned long pfn = page_to_pfn(page);
> -	unsigned long paddr = pfn << PAGE_SHIFT;
> -	bool aligned = (addr & ~PMD_MASK) == (paddr & ~PMD_MASK);
> -
> -	if (aligned &&
> -	    pmd_none(*vmf->pmd) &&
> -	    folio_test_pmd_mappable(page_folio(page))) {
> -		pfn &= PMD_MASK >> PAGE_SHIFT;
> -		if (vmf_insert_pfn_pmd(vmf, pfn, false) == VM_FAULT_NOPAGE)
> -			return true;
> -	}
> -#endif
> -
> -	return false;
> -}
> -
> -static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
> +static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, bool try_pmd)

Since we're at it, maybe we can take the order here, and use
drm_gem_shmem_any_fault directly for the huge_fault handler (see
below for the few places to modify to do that)

static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, unsigned int order)

>  {
>  	struct vm_area_struct *vma = vmf->vma;
>  	struct drm_gem_object *obj = vma->vm_private_data;
>  	struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
>  	loff_t num_pages = obj->size >> PAGE_SHIFT;
> -	vm_fault_t ret;
> +	vm_fault_t ret = VM_FAULT_FALLBACK;
>  	struct page **pages = shmem->pages;
>  	pgoff_t page_offset;
>  	unsigned long pfn;
> +	unsigned long paddr;
> +	bool aligned;

	if (order && order != PMD_ORDER)
		return VM_FAULT_FALLBACK;

> 
>  	/* Offset to faulty address in the VMA. */
>  	page_offset = vmf->pgoff - vma->vm_pgoff;
> @@ -593,13 +575,19 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
>  		goto out;
>  	}
> 
> -	if (drm_gem_shmem_try_map_pmd(vmf, vmf->address, pages[page_offset])) {
> -		ret = VM_FAULT_NOPAGE;
> -		goto out;
> -	}
> -
>  	pfn = page_to_pfn(pages[page_offset]);
> -	ret = vmf_insert_pfn(vma, vmf->address, pfn);
> +	if (try_pmd) {
> +		paddr = pfn << PAGE_SHIFT;
> +		aligned = (vmf->address & ~PMD_MASK) == (paddr & ~PMD_MASK);
> +
> +		if (aligned &&
> +		    folio_test_pmd_mappable(page_folio(pages[page_offset]))) {
> +			pfn &= PMD_MASK >> PAGE_SHIFT;
> +			ret = vmf_insert_pfn_pmd(vmf, pfn, false);
> +		}
> +	} else {
> +		ret = vmf_insert_pfn(vma, vmf->address, pfn);
> +	}

	if (!order) {
		ret = vmf_insert_pfn(vma, vmf->address, pfn);
	} else {
#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
		unsigned long paddr = pfn << PAGE_SHIFT;
		bool aligned = (vmf->address & ~PMD_MASK) == (paddr & ~PMD_MASK);

		if (aligned &&
		    folio_test_pmd_mappable(page_folio(pages[page_offset]))) {
			pfn &= PMD_MASK >> PAGE_SHIFT;
			ret = vmf_insert_pfn_pmd(vmf, pfn, false);
		} else {
			ret = VM_FAULT_FALLBACK;
		}
#else
		ret = VM_FAULT_FALLBACK;
#endif
	}

> 
>   out:
>  	dma_resv_unlock(shmem->base.resv);
> @@ -607,6 +595,22 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
>  	return ret;
>  }
> 
> +#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> +static vm_fault_t drm_gem_shmem_huge_fault(struct vm_fault *vmf,
> +					   unsigned int order)
> +{
> +	if (order != PMD_ORDER)
> +		return VM_FAULT_FALLBACK;
> +
> +	return drm_gem_shmem_any_fault(vmf, true);
> +}
> +#endif

You can get rid of drm_gem_shmem_huge_fault.

> +
> +static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
> +{
> +	return drm_gem_shmem_any_fault(vmf, false);

	return drm_gem_shmem_any_fault(vmf, 0);

> +}
> +
>  static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)
>  {
>  	struct drm_gem_object *obj = vma->vm_private_data;
> @@ -643,6 +647,9 @@ static void drm_gem_shmem_vm_close(struct vm_area_struct *vma)
> 
>  const struct vm_operations_struct drm_gem_shmem_vm_ops = {
>  	.fault = drm_gem_shmem_fault,
> +#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> +	.huge_fault = drm_gem_shmem_huge_fault,

	.huge_fault = drm_gem_shmem_any_fault,
> +#endif
>  	.open = drm_gem_shmem_vm_open,
>  	.close = drm_gem_shmem_vm_close,
>  };
> --
> 2.47.3
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-16  9:28   ` Boris Brezillon
@ 2026-03-16 10:12     ` Thomas Zimmermann
  2026-03-16 10:42       ` Boris Brezillon
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Zimmermann @ 2026-03-16 10:12 UTC (permalink / raw)
  To: Boris Brezillon, Maarten Lankhorst, Maxime Ripard
  Cc: Pedro Demarchi Gomes, David Airlie, Simona Vetter, dri-devel,
	linux-kernel

Hi

Am 16.03.26 um 10:28 schrieb Boris Brezillon:
[...]
>>> -static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
>>> +static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, bool try_pmd)
>> Please write two functions. One for huge pages and one for regular page
>> faults.
> That's actually what the first version of the patch was doing, and I
> strongly disagree with that. In my opinion, it's more error-prone to
> have two distinct implementations duplicating most of the logic
> (locking, bookkeeping, ...) than having both handled in the same
> function and have dummy wrappers to decide the kind of vmf_insert_pfn
> to do (_pmd vs regular). Think about the folio_mark_accessed() you
> recently added, and how easy it would have been to add it to
> drm_gem_shmem_fault() and forget it in the huge_fault handler.

But that would have been a _simple_ one-time fix. All this branching
and parameterization will stay for _all_ later changes to that code.

There's little to be gained from cramming things into a single
function. I'd say the fact that we already have two distinct fault
callbacks for huge pages and regular ones supports this point.

Best regards
Thomas


-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-16  9:50 ` Boris Brezillon
@ 2026-03-16 10:14   ` Thomas Zimmermann
  2026-03-16 10:36     ` Boris Brezillon
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Zimmermann @ 2026-03-16 10:14 UTC (permalink / raw)
  To: Boris Brezillon, Pedro Demarchi Gomes
  Cc: Maarten Lankhorst, Maxime Ripard, David Airlie, Simona Vetter,
	dri-devel, linux-kernel

Hi

Am 16.03.26 um 10:50 schrieb Boris Brezillon:

>> Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
>> PMD-sized mappings are handled through the appropriate huge page fault path.

This patch will conflict heavily with the folio-tracking patches that
have been merged into upstream DRM. How do we deal with that?

Best regards
Thomas

>> [...]

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-16 10:14   ` Thomas Zimmermann
@ 2026-03-16 10:36     ` Boris Brezillon
  2026-03-16 12:10       ` Thomas Zimmermann
  0 siblings, 1 reply; 16+ messages in thread
From: Boris Brezillon @ 2026-03-16 10:36 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: Pedro Demarchi Gomes, Maarten Lankhorst, Maxime Ripard,
	David Airlie, Simona Vetter, dri-devel, linux-kernel

On Mon, 16 Mar 2026 11:14:40 +0100
Thomas Zimmermann <tzimmermann@suse.de> wrote:

> Am 16.03.26 um 10:50 schrieb Boris Brezillon:
> 
> >> Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
> >> PMD-sized mappings are handled through the appropriate huge page fault path.  
> 
> This patch will conflict heavily with the folio-tracking patches that
> have been merged into upstream DRM. How do we deal with that?

We need a fix in drm-misc-fixes since it's a bug that exists in v7.0,
and a conflict resolution recorded in drm-tip to deal with the
drm-misc-fixes vs drm-misc-next conflicts (I don't remember the exact
process, but it's documented in dim).

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-16 10:12     ` Thomas Zimmermann
@ 2026-03-16 10:42       ` Boris Brezillon
  0 siblings, 0 replies; 16+ messages in thread
From: Boris Brezillon @ 2026-03-16 10:42 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: Maarten Lankhorst, Maxime Ripard, Pedro Demarchi Gomes,
	David Airlie, Simona Vetter, dri-devel, linux-kernel

On Mon, 16 Mar 2026 11:12:45 +0100
Thomas Zimmermann <tzimmermann@suse.de> wrote:

> Hi
> 
> Am 16.03.26 um 10:28 schrieb Boris Brezillon:
> [...]
> >>> -static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
> >>> +static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, bool try_pmd)  
> >> Please write two functions. One for huge pages and one for regular page
> >> faults.  
> > That's actually what the first version of the patch was doing, and I
> > strongly disagree with that. In my opinion, it's more error-prone to
> > have two distinct implementations duplicating most of the logic
> > (locking, bookkeeping, ...) than having both handled in the same
> > function and have dummy wrappers to decide the kind of vmf_insert_pfn
> > to do (_pmd vs regular). Think about the folio_mark_accessed() you
> > recently added, and how easy it would have been to add it to
> > drm_gem_shmem_fault() and forget it in the huge_fault handler.  
> 
> But that would be a _simple_ one-time fix.  All this branching and
> parameterization will stay for _all_ later changes to that code.

But you'll have to deal with these two distinct fault cases in the long
run anyway. Again, look at the access tracking you've added: it has to
apply to huge folios too, so why have two completely distinct functions
that do exactly the same thing except for the kind of page they insert
into the page table?

> 
> There's little to be gained from cramming things into single functions.
> I'd say that the fact that we already have two distinct fault callbacks
> for huge pages and regular ones supports this point.

The bulk of the code is the same; the only bits that differ are the
function we use to insert the pfn. Again, if we were to duplicate it,
it'd be more error-prone, because then you'd have to update two places
every time you fix/add something. And it's not like we'd be the only
ones to do this sort of multiplexing, see [1][2]. So I'm still very much
in favor of this common drm_gem_shmem_any_fault(), and then having the
fault/huge_fault implementations call that thing with the proper
parameters.

[1]https://elixir.bootlin.com/linux/v6.19.8/source/fs/xfs/xfs_file.c#L1889
[2]https://elixir.bootlin.com/linux/v6.19.8/source/fs/fuse/dax.c#L796

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-16 10:36     ` Boris Brezillon
@ 2026-03-16 12:10       ` Thomas Zimmermann
  2026-03-16 13:20         ` Boris Brezillon
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Zimmermann @ 2026-03-16 12:10 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Pedro Demarchi Gomes, Maarten Lankhorst, Maxime Ripard,
	David Airlie, Simona Vetter, dri-devel, linux-kernel

Hi

Am 16.03.26 um 11:36 schrieb Boris Brezillon:
> On Mon, 16 Mar 2026 11:14:40 +0100
> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
>> Am 16.03.26 um 10:50 schrieb Boris Brezillon:
>>
>>>> Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
>>>> PMD-sized mappings are handled through the appropriate huge page fault path.
>> This patch will conflict heavily with the folio-tracking patches that
>> have been merged into upstream DRM. How do we deal with that?
> We need a fix in drm-misc-fixes since it's a bug that exists in v7.0,
> and a conflict resolution recorded in drm-tip to deal with the
> drm-misc-fixes vs drm-misc-next conflicts (I don't remember the exact
> process, but it's documented in dim).

I can easily do that when I merge the fix into drm-misc-fixes.  I'd only 
need the fixed and tested patch that applies to the current 
drm-misc-next beforehand.

Best regards
Thomas





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-16 12:10       ` Thomas Zimmermann
@ 2026-03-16 13:20         ` Boris Brezillon
  0 siblings, 0 replies; 16+ messages in thread
From: Boris Brezillon @ 2026-03-16 13:20 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: Pedro Demarchi Gomes, Maarten Lankhorst, Maxime Ripard,
	David Airlie, Simona Vetter, dri-devel, linux-kernel

On Mon, 16 Mar 2026 13:10:43 +0100
Thomas Zimmermann <tzimmermann@suse.de> wrote:

> Hi
> 
> Am 16.03.26 um 11:36 schrieb Boris Brezillon:
> > On Mon, 16 Mar 2026 11:14:40 +0100
> > Thomas Zimmermann <tzimmermann@suse.de> wrote:
> >  
> >> Am 16.03.26 um 10:50 schrieb Boris Brezillon:
> >>  
> >>>> Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
> >>>> PMD-sized mappings are handled through the appropriate huge page fault path.  
> >> This patch will conflict heavily with the folio-tracking patches that
> >> have been merged into upstream DRM. How do we deal with that?
> > We need a fix in drm-misc-fixes since it's a bug that exists in v7.0,
> > and a conflict resolution recorded in drm-tip to deal with the
> > drm-misc-fixes vs drm-misc-next conflicts (I don't remember the exact
> > process, but it's documented in dim).  
> 
> I can easily do that when I merge the fix into drm-misc-fixes.  I'd only 
> need the fixed and tested patch that applies to the current 
> drm-misc-next beforehand.

Let's wait until we've settled on a version we're happy with for
drm-misc-fixes; then I can provide a conflict resolution for drm-misc-next
based on this final version.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-16  0:26 [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler Pedro Demarchi Gomes
                   ` (2 preceding siblings ...)
  2026-03-16 10:03 ` Boris Brezillon
@ 2026-03-16 14:21 ` Boris Brezillon
  2026-03-17 17:30   ` Boris Brezillon
  3 siblings, 1 reply; 16+ messages in thread
From: Boris Brezillon @ 2026-03-16 14:21 UTC (permalink / raw)
  To: Pedro Demarchi Gomes
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel

On Sun, 15 Mar 2026 21:26:49 -0300
Pedro Demarchi Gomes <pedrodemargomes@gmail.com> wrote:

> When running ./tools/testing/selftests/mm/split_huge_page_test multiple
> times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and
> /sys/kernel/mm/transparent_hugepage/enabled set as always the following BUG
> occurs:
> 
> [  232.728858] ------------[ cut here ]------------
> [  232.729458] kernel BUG at mm/memory.c:2276!
> [  232.729726] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> [  232.730217] CPU: 19 UID: 60578 PID: 1497 Comm: llvmpipe-9 Not tainted 7.0.0-rc1mm-new+ #19 PREEMPT(lazy)
> [  232.730855] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-9.fc43 06/10/2025
> [  232.731360] RIP: 0010:walk_to_pmd+0x29e/0x3c0
> [  232.731569] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8
> [  232.732614] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282
> [  232.732991] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000
> [  232.733362] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7
> [  232.733801] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46
> [  232.734168] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd
> [  232.734459] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000
> [  232.734861] FS:  00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000
> [  232.735265] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  232.735548] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0
> [  232.736031] Call Trace:
> [  232.736273]  <TASK>
> [  232.736500]  get_locked_pte+0x1f/0xa0
> [  232.736878]  insert_pfn+0x9f/0x350
> [  232.737190]  ? __pfx_pat_pagerange_is_ram+0x10/0x10
> [  232.737614]  ? __pfx_insert_pfn+0x10/0x10
> [  232.737990]  ? __pfx_css_rstat_updated+0x10/0x10
> [  232.738281]  ? __pfx_pfn_modify_allowed+0x10/0x10
> [  232.738552]  ? lookup_memtype+0x62/0x180
> [  232.738761]  vmf_insert_pfn_prot+0x14b/0x340
> [  232.739012]  ? __pfx_vmf_insert_pfn_prot+0x10/0x10
> [  232.739247]  ? __pfx___might_resched+0x10/0x10
> [  232.739475]  drm_gem_shmem_fault.cold+0x18/0x39
> [  232.739677]  ? rcu_read_unlock+0x20/0x70
> [  232.739882]  __do_fault+0x251/0x7b0
> [  232.740028]  do_fault+0x6e1/0xc00
> [  232.740167]  ? __lock_acquire+0x590/0xc40
> [  232.740335]  handle_pte_fault+0x439/0x760
> [  232.740498]  ? mtree_range_walk+0x252/0xae0
> [  232.740669]  ? __pfx_handle_pte_fault+0x10/0x10
> [  232.740899]  __handle_mm_fault+0xa02/0xf30
> [  232.741066]  ? __pfx___handle_mm_fault+0x10/0x10
> [  232.741255]  ? find_vma+0xa1/0x120
> [  232.741403]  handle_mm_fault+0x2bf/0x8f0
> [  232.741564]  do_user_addr_fault+0x2d3/0xed0
> [  232.741736]  ? trace_page_fault_user+0x1bf/0x240
> [  232.741969]  exc_page_fault+0x87/0x120
> [  232.742124]  asm_exc_page_fault+0x26/0x30
> [  232.742288] RIP: 0033:0x7fb4d73ed546
> [  232.742441] Code: 66 41 0f 6f fb 66 44 0f 6d dc 66 44 0f 6f c6 66 41 0f 6d f1 66 0f 6c fc 66 45 0f 6c c1 66 44 0f 6f c9 66 0f 6d ca 66 0f db f0 <66> 0f df 04 08 66 44 0f 6c ca 66 45 0f db c2 66 44 0f df 10 66 44
> [  232.743193] RSP: 002b:00007fb4907f68a0 EFLAGS: 00010206
> [  232.743565] RAX: 00007fb47840aa00 RBX: 00007fb4d73ec070 RCX: 0000000000001400
> [  232.743871] RDX: 0000000000002800 RSI: 0000000000003c00 RDI: 0000000000000001
> [  232.744150] RBP: 0000000000000004 R08: 0000000000001400 R09: 00007fb4d73ec060
> [  232.744433] R10: 000055f0261a4288 R11: 00007fb4c013da40 R12: 0000000000000008
> [  232.744712] R13: 0000000000000000 R14: 4332322132212110 R15: 0000000000000004
> [  232.746616]  </TASK>
> [  232.746711] Modules linked in: nft_nat nft_masq veth bridge stp llc snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore overlay rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr ppdev 9pnet_virtio 9pnet parport_pc i2c_piix4 netfs pcspkr parport i2c_smbus joydev sunrpc vfat fat loop dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport zram lz4hc_compress vmw_vmci lz4_compress vsock e1000 bochs serio_raw ata_generic pata_acpi scsi_dh_rdac scsi_dh_emc scsi_dh_alua i2c_dev fuse qemu_fw_cfg
> [  232.749308] ---[ end trace 0000000000000000 ]---
> [  232.749507] RIP: 0010:walk_to_pmd+0x29e/0x3c0
> [  232.749692] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8
> [  232.750428] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282
> [  232.750645] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000
> [  232.750954] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7
> [  232.751232] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46
> [  232.751514] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd
> [  232.751837] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000
> [  232.752124] FS:  00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000
> [  232.752441] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  232.752674] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0
> [  232.752983] Kernel panic - not syncing: Fatal exception
> [  232.753510] Kernel Offset: disabled
> [  232.754643] ---[ end Kernel panic - not syncing: Fatal exception ]---
> 
> This happens when two concurrent page faults occur within the same PMD range.
> One fault installs a PMD mapping through vmf_insert_pfn_pmd(), while the other
> attempts to install a PTE mapping via vmf_insert_pfn(). The bug is
> triggered because a pmd_trans_huge entry is not expected when walking
> the page table inside vmf_insert_pfn().
> 
> Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
> PMD-sized mappings are handled through the appropriate huge page fault path.
> 
> Fixes: 211b9a39f261 ("drm/shmem-helper: Map huge pages in fault handler")
> Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
> ---
> 
> Changes in v3:
>  - Pass a try_pmd boolean parameter to drm_gem_shmem_any_fault
>  - Compile drm_gem_shmem_huge_fault only if CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> is defined to avoid a build warning
> 
> Changes in v2: https://lore.kernel.org/dri-devel/20260313141719.3949700-1-pedrodemargomes@gmail.com/
>  - Keep the #ifdef unindented
>  - Create drm_gem_shmem_any_fault to handle faults of any order and use
> drm_gem_shmem_[huge_]fault() as wrappers
> 
> v1: https://lore.kernel.org/all/20260312155027.1682606-1-pedrodemargomes@gmail.com/

I've played a bit with the drm-misc-{next,fixes} versions (with the extra
pfn_mkwrite fixes on top for the -next variant), and below is my preferred
version (pfn insertion is all handled inside the try_insert_pfn() helper,
with order == PMD_ORDER specialized there).

--->8---

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 7b5a49935ae4..f549cc040c7a 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -550,27 +550,27 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev,
 }
 EXPORT_SYMBOL_GPL(drm_gem_shmem_dumb_create);
 
-static bool drm_gem_shmem_try_map_pmd(struct vm_fault *vmf, unsigned long addr,
-                                     struct page *page)
+static vm_fault_t try_insert_pfn(struct vm_fault *vmf, unsigned int order, unsigned long pfn)
 {
+       if (!order) {
+               return vmf_insert_pfn(vmf->vma, vmf->address, pfn);
 #ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
-       unsigned long pfn = page_to_pfn(page);
-       unsigned long paddr = pfn << PAGE_SHIFT;
-       bool aligned = (addr & ~PMD_MASK) == (paddr & ~PMD_MASK);
+       } else if (order == PMD_ORDER) {
+               unsigned long paddr = pfn << PAGE_SHIFT;
+               bool aligned = (vmf->address & ~PMD_MASK) == (paddr & ~PMD_MASK);
 
-       if (aligned &&
-           pmd_none(*vmf->pmd) &&
-           folio_test_pmd_mappable(page_folio(page))) {
-               pfn &= PMD_MASK >> PAGE_SHIFT;
-               if (vmf_insert_pfn_pmd(vmf, pfn, false) == VM_FAULT_NOPAGE)
-                       return true;
-       }
+               if (aligned &&
+                   folio_test_pmd_mappable(page_folio(pfn_to_page(pfn)))) {
+                       pfn &= PMD_MASK >> PAGE_SHIFT;
+                       return vmf_insert_pfn_pmd(vmf, pfn, false);
+               }
 #endif
+       }
 
-       return false;
+       return VM_FAULT_FALLBACK;
 }
 
-static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
+static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, unsigned int order)
 {
        struct vm_area_struct *vma = vmf->vma;
        struct drm_gem_object *obj = vma->vm_private_data;
@@ -581,6 +581,9 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
        pgoff_t page_offset;
        unsigned long pfn;
 
+       if (order && order != PMD_ORDER)
+               return VM_FAULT_FALLBACK;
+
        /* Offset to faulty address in the VMA. */
        page_offset = vmf->pgoff - vma->vm_pgoff;
 
@@ -593,13 +596,8 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
                goto out;
        }
 
-       if (drm_gem_shmem_try_map_pmd(vmf, vmf->address, pages[page_offset])) {
-               ret = VM_FAULT_NOPAGE;
-               goto out;
-       }
-
        pfn = page_to_pfn(pages[page_offset]);
-       ret = vmf_insert_pfn(vma, vmf->address, pfn);
+       ret = try_insert_pfn(vmf, order, pfn);
 
  out:
        dma_resv_unlock(shmem->base.resv);
@@ -607,6 +605,11 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
        return ret;
 }
 
+static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
+{
+       return drm_gem_shmem_any_fault(vmf, 0);
+}
+
 static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)
 {
        struct drm_gem_object *obj = vma->vm_private_data;
@@ -643,6 +646,9 @@ static void drm_gem_shmem_vm_close(struct vm_area_struct *vma)
 
 const struct vm_operations_struct drm_gem_shmem_vm_ops = {
        .fault = drm_gem_shmem_fault,
+#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
+       .huge_fault = drm_gem_shmem_any_fault,
+#endif
        .open = drm_gem_shmem_vm_open,
        .close = drm_gem_shmem_vm_close,
 };

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-16 14:21 ` Boris Brezillon
@ 2026-03-17 17:30   ` Boris Brezillon
  2026-03-18 10:16     ` Boris Brezillon
  0 siblings, 1 reply; 16+ messages in thread
From: Boris Brezillon @ 2026-03-17 17:30 UTC (permalink / raw)
  To: Pedro Demarchi Gomes
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel

On Mon, 16 Mar 2026 15:21:52 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> On Sun, 15 Mar 2026 21:26:49 -0300
> Pedro Demarchi Gomes <pedrodemargomes@gmail.com> wrote:
> 
> > When running ./tools/testing/selftests/mm/split_huge_page_test multiple
> > times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and
> > /sys/kernel/mm/transparent_hugepage/enabled set as always the following BUG
> > occurs:
> > 
> > [  232.728858] ------------[ cut here ]------------
> > [  232.729458] kernel BUG at mm/memory.c:2276!
> > [...]
> > 
> > This happens when two concurrent page faults occur within the same PMD range.
> > One fault installs a PMD mapping through vmf_insert_pfn_pmd(), while the other
> > attempts to install a PTE mapping via vmf_insert_pfn(). The bug is
> > triggered because a pmd_trans_huge is not expected when walking the page
> > table inside vmf_insert_pfn.
> > 
> > Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
> > PMD-sized mappings are handled through the appropriate huge page fault path.
> > 
> > Fixes: 211b9a39f261 ("drm/shmem-helper: Map huge pages in fault handler")
> > Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
> > ---
> > 
> > Changes in v3:
> >  - Pass a try_pmd boolean parameter to drm_gem_shmem_any_fault
> >  - Compile drm_gem_shmem_huge_fault only if CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> > is defined to avoid a build warning
> > 
> > Changes in v2: https://lore.kernel.org/dri-devel/20260313141719.3949700-1-pedrodemargomes@gmail.com/
> >  - Keep the #ifdef unindented
> >  - Create drm_gem_shmem_any_fault to handle faults of any order and use
> > drm_gem_shmem_[huge_]fault() as wrappers
> > 
> > v1: https://lore.kernel.org/all/20260312155027.1682606-1-pedrodemargomes@gmail.com/  
> 
> I've played a bit with the drm-misc-{next,fixes} versions (with the extra
> pfn_mkwrite fixes on top for the -next variant), and below is my preferred
> version (pfn insertion is all handled inside the try_insert_pfn() helper,
> with order == PMD_ORDER specialized there).

One last thing, the subject should probably be "drm/shmem-helper: Fix
huge page mapping in fault handler" (instead of "Fix Map huge page
..."). Pedro, do you think you'll be able to send a v4 before the end
of the week? I'd really like to have that fix in -rc5 or -rc6.

> 
> --->8---  
> 
> [...]


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-17 17:30   ` Boris Brezillon
@ 2026-03-18 10:16     ` Boris Brezillon
  2026-03-18 11:14       ` Pedro Demarchi Gomes
  0 siblings, 1 reply; 16+ messages in thread
From: Boris Brezillon @ 2026-03-18 10:16 UTC (permalink / raw)
  To: Pedro Demarchi Gomes
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel

On Tue, 17 Mar 2026 18:30:54 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> > 
> > I've played a bit with the drm-misc-{next,fixes} versions (with the extra
> > pfn_mkwrite fixes on top for the -next variant), and below is my preferred
> > version (pfn insertion is all handled inside the try_insert_pfn() helper,
> > with order == PMD_ORDER specialized there).  
> 
> One last thing, the subject should probably be "drm/shmem-helper: Fix
> huge page mapping in fault handler" (instead of "Fix Map huge page
> ..."). Pedro, do you think you'll be able to send a v4 before the end
> of the week? I'd really like to have that fix in -rc5 or -rc6.

If you can't, please let me know, I'll post a new version for you.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-18 10:16     ` Boris Brezillon
@ 2026-03-18 11:14       ` Pedro Demarchi Gomes
  2026-03-18 11:45         ` Boris Brezillon
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Demarchi Gomes @ 2026-03-18 11:14 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel

On Wed, Mar 18, 2026 at 11:16:49AM +0100, Boris Brezillon wrote:
> On Tue, 17 Mar 2026 18:30:54 +0100
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
> > > 
> > > I've played a bit with the drm-misc-{next,fixes} versions (with the extra
> > > pfn_mkwrite fixes on top for the -next variant), and below is my preferred
> > > version (pfn insertion is all handled inside the try_insert_pfn() helper,
> > > with order == PMD_ORDER specialized there).  
> > 
> > One last thing, the subject should probably be "drm/shmem-helper: Fix
> > huge page mapping in fault handler" (instead of "Fix Map huge page
> > ..."). Pedro, do you think you'll be able to send a v4 before the end
> > of the week? I'd really like to have that fix in -rc5 or -rc6.
> 
> If you can't, please let me know, I'll post a new version for you.

Should I send a patch for the drm-misc-{next,fixes}, or v7.0-rc4?
I am planning to send it today.


* Re: [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler
  2026-03-18 11:14       ` Pedro Demarchi Gomes
@ 2026-03-18 11:45         ` Boris Brezillon
  0 siblings, 0 replies; 16+ messages in thread
From: Boris Brezillon @ 2026-03-18 11:45 UTC (permalink / raw)
  To: Pedro Demarchi Gomes
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel

On Wed, 18 Mar 2026 08:14:31 -0300
Pedro Demarchi Gomes <pedrodemargomes@gmail.com> wrote:

> On Wed, Mar 18, 2026 at 11:16:49AM +0100, Boris Brezillon wrote:
> > On Tue, 17 Mar 2026 18:30:54 +0100
> > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> >   
> > > > 
> > > > I've played a bit with the drm-misc-{next,fixes} versions (with the extra
> > > > pfn_mkwrite fixes on top for the -next variant), and below is my preferred
> > > > version (pfn insertion is all handled inside the try_insert_pfn() helper,
> > > > with order == PMD_ORDER specialized there).    
> > > 
> > > One last thing, the subject should probably be "drm/shmem-helper: Fix
> > > huge page mapping in fault handler" (instead of "Fix Map huge page
> > > ..."). Pedro, do you think you'll be able to send a v4 before the end
> > > of the week? I'd really like to have that fix in -rc5 or -rc6.  
> > 
> > If you can't, please let me know, I'll post a new version for you.  
> 
> Should I send a patch for the drm-misc-{next,fixes}, or v7.0-rc4?

The patch should be against drm-misc-fixes, and if you can port it to
drm-misc-next and share the diff, that would be helpful for the merge
conflict resolution.

> I am planning to send it today.

Thanks!


end of thread, other threads:[~2026-03-18 11:45 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-16  0:26 [PATCH v3] drm/shmem-helper: Fix Map huge page mapping in fault handler Pedro Demarchi Gomes
2026-03-16  9:14 ` Thomas Zimmermann
2026-03-16  9:28   ` Boris Brezillon
2026-03-16 10:12     ` Thomas Zimmermann
2026-03-16 10:42       ` Boris Brezillon
2026-03-16  9:50 ` Boris Brezillon
2026-03-16 10:14   ` Thomas Zimmermann
2026-03-16 10:36     ` Boris Brezillon
2026-03-16 12:10       ` Thomas Zimmermann
2026-03-16 13:20         ` Boris Brezillon
2026-03-16 10:03 ` Boris Brezillon
2026-03-16 14:21 ` Boris Brezillon
2026-03-17 17:30   ` Boris Brezillon
2026-03-18 10:16     ` Boris Brezillon
2026-03-18 11:14       ` Pedro Demarchi Gomes
2026-03-18 11:45         ` Boris Brezillon
