Linux RDMA and InfiniBand development
* [PATCH 1/1] RDMA/rxe: Fix a use-after-free problem in rxe_mmap
@ 2026-05-15  0:25 Zhu Yanjun
From: Zhu Yanjun @ 2026-05-15  0:25 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma; +Cc: Zhu Yanjun, nasm

rxe_mmap() removes a rxe_mmap_info struct from the pending_mmaps list
and releases pending_lock while the struct's kref is still at 1:
"
   list_del_init(&ip->pending_mmaps);
   spin_unlock_bh(&rxe->pending_lock);   /* ref == 1, no lock held */
   ret = remap_vmalloc_range(vma, ip->obj, 0);  /* walks PTEs */
   [...]
   rxe_vma_open(vma);                    /* kref_get, ref → 2 */
"
remap_vmalloc_range_partial(), called from remap_vmalloc_range(), walks
the PTEs without holding any lock.

A concurrent DESTROY_CQ ioctl on another CPU calls:
"
   → kref_put(&q->ip->ref, rxe_mmap_release)   /* ref 1→0 */
   → vfree(ip->obj)   /* clears vmalloc PTEs mid-walk */
   → kfree(ip)        /* frees rxe_mmap_info */
"
This yields:

   1. Kernel crash: vmalloc_to_page() returns NULL when vfree() wins the
   per-PTE race → vm_insert_page(NULL) → GPF in
   validate_page_before_insert().

   2. Page UAF: vmalloc_to_page() reads a stale PTE before vfree() clears
   it → the user VMA holds a PTE to a freed page, which vmalloc may later
   reallocate, giving an attacker a clean page-level UAF.

Note that although a page-level UAF is a strong primitive, it is
statistically very hard to hit because the time window is very short
(after the last insert_page() and before the kref_get()).
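
The first outcome can be replayed deterministically in userspace. The
sketch below is only a single-threaded model under stated assumptions:
struct model_info, model_put(), model_lookup_page() and
model_buggy_mmap() are hypothetical names that stand in for
rxe_mmap_info, kref_put(), vmalloc_to_page() and the unfixed rxe_mmap()
ordering. The "concurrent" destroy is injected at a chosen page index
instead of running on another CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 4

/* Hypothetical stand-in for struct rxe_mmap_info; not the kernel type. */
struct model_info {
	int ref;              /* plays the role of the kref */
	bool pages_valid;     /* false once the modelled vfree() has run */
	int pages[NPAGES];    /* stands in for the vmalloc'ed pages */
};

/* Models kref_put(): drop one reference, release on the last one. */
static void model_put(struct model_info *ip)
{
	assert(ip->ref > 0);
	if (--ip->ref == 0)
		ip->pages_valid = false;   /* models vfree(ip->obj) */
}

/* Models vmalloc_to_page(): NULL once the backing pages are gone. */
static int *model_lookup_page(struct model_info *ip, size_t i)
{
	return ip->pages_valid ? &ip->pages[i] : NULL;
}

/*
 * Replays the unfixed ordering: no lock is held and no extra reference
 * was taken, so the destroy path may run between any two pages. Here it
 * is injected just before page number destroy_after is looked up.
 * Returns the number of pages mapped before a NULL page was seen.
 */
static size_t model_buggy_mmap(struct model_info *ip, size_t destroy_after)
{
	size_t mapped = 0;

	for (size_t i = 0; i < NPAGES; i++) {
		if (i == destroy_after)
			model_put(ip);   /* concurrent DESTROY_CQ: ref 1 -> 0 */
		if (!model_lookup_page(ip, i))
			break;           /* the kernel passes this NULL on and faults */
		mapped++;
	}
	return mapped;
}
```

Starting from ref == 1 and calling model_buggy_mmap(&ip, 2) maps two
pages, then sees a NULL page with ref == 0. This corresponds to the
vm_insert_page(NULL) case above. The model does not cover the second
outcome, which depends on real page reuse.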

The call trace is as follows:

  [   67.916353] Oops: general protection fault, probably for
   non-canonical address 0xdffffc0000000001: 0000 [#1] SMP KASAN NOPTI

   [   67.916865] KASAN: null-ptr-deref in range
   [0x0000000000000008-0x000000000000000f]

   [   67.917755] CPU: 0 UID: 1000 PID: 413 Comm: poc Not tainted
   7.0.0-rc5-dirty #28 PREEMPT(lazy)

   [   67.918164] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
   BIOS 1.15.0-1 04/01/2014

   [   67.918512] RIP: 0010:validate_page_before_insert+0x32/0x300

   [   67.919173] Code: e5 41 57 41 56 49 89 fe 41 55 41 54 53 48 89 f3 e8
   93 b5 a3 ff 48 8d 7b 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea
   03 <80> 3c 02 00 0f 85 7b 02 00 00 4c 8b 63 08 31 ff 4d 89 e5 41 83 e5

   [   67.919645] RSP: 0018:ffff88811b15f2f0 EFLAGS: 00000202

   [   67.919919] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
   0000000000000000

   [   67.920125] RDX: 0000000000000001 RSI: 0000000000000000 RDI:
   0000000000000008

   [   67.920356] RBP: ffff88811b15f318 R08: 0000000000000000 R09:
   0000000000000000

   [   67.920599] R10: 0000000000000000 R11: 0000000000000000 R12:
   ffff8881181eee00

   [   67.920822] R13: 0000000000000000 R14: ffff8881181eee00 R15:
   ffff8881181eee20

   [   67.921073] FS:  00007b1e000f76c0(0000) GS:ffff8884268e0000(0000)
   knlGS:0000000000000000

   [   67.921335] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

   [   67.921543] CR2: 00007b1e00a24ac0 CR3: 0000000116eb3000 CR4:
   00000000000006f0

   [   67.921881] Call Trace:
   [   67.922064]  <TASK>
   [   67.922339]  insert_page+0x8f/0x190
   [   67.922681]  ? __pfx_insert_page+0x10/0x10
   [   67.922916]  ? kasan_save_alloc_info+0x38/0x60
   [   67.923149]  vm_insert_page+0x2e7/0x400
   [   67.923405]  remap_vmalloc_range_partial+0x212/0x3e0
   [   67.923662]  remap_vmalloc_range+0x6e/0xb0
   [   67.923883]  ? __kasan_check_write+0x14/0x30
   [   67.924096]  rxe_mmap+0x2e9/0x5d0
   [   67.924289]  ib_uverbs_mmap+0x1ad/0x2c0
   [   67.924492]  __mmap_region+0x12c2/0x2ad0
   [   67.924736]  ? __pfx___mmap_region+0x10/0x10
   [   67.924956]  ? __sanitizer_cov_trace_switch+0x58/0xb0
   [   67.925188]  ? mas_prev_slot+0x360/0x39c0
   [   67.925419]  ? __sanitizer_cov_trace_switch+0x58/0xb0
   [   67.925660]  ? mas_next_slot+0x1e5b/0x2f40
   [   67.925904]  ? __sanitizer_cov_trace_cmp8+0x18/0x30
   [   67.926132]  ? unmapped_area_topdown+0x4dd/0x610
   [   67.926381]  ? kfree+0x1b1/0x440
   [   67.926634]  ? free_cpumask_var+0x16/0x30
   [   67.926844]  ? __kasan_slab_free+0x7d/0xa0
   [   67.927079]  ? __sanitizer_cov_trace_cmp8+0x18/0x30
   [   67.927353]  mmap_region+0x2e6/0x3c0
   [   67.927560]  do_mmap+0xa3e/0x12a0
   [   67.927776]  ? __pfx_do_mmap+0x10/0x10
   [   67.927991]  ? __kasan_check_write+0x14/0x30
   [   67.928224]  ? down_write_killable+0xba/0x160
   [   67.928451]  ? __pfx_down_write_killable+0x10/0x10
   [   67.928657]  ? __sanitizer_cov_trace_cmp4+0x16/0x30
   [   67.928908]  vm_mmap_pgoff+0x2d4/0x4a0
   [   67.929138]  ? __pfx_vm_mmap_pgoff+0x10/0x10
   [   67.929398]  ? fget+0x1bf/0x270
   [   67.929594]  ksys_mmap_pgoff+0x40c/0x690
   [   67.929814]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x30
   [   67.930058]  ? __pfx_ksys_mmap_pgoff+0x10/0x10
   [   67.930273]  ? __kasan_check_write+0x14/0x30
   [   67.930495]  ? _raw_spin_trylock+0xbb/0x130
   [   67.930706]  ? __pfx__raw_spin_trylock+0x10/0x10
   [   67.930932]  __x64_sys_mmap+0x135/0x1e0
   [   67.931142]  x64_sys_call+0x1c14/0x2790
   [   67.931385]  do_syscall_64+0xd2/0x1050
   [   67.931588]  ? rcu_core+0x352/0x7d0
   [   67.931819]  ? rcu_core_si+0xe/0x20
   [   67.932061]  ? handle_softirqs+0x1aa/0x650
   [   67.932313]  ? __sanitizer_cov_trace_cmp4+0x16/0x30
   [   67.932550]  ? fpregs_assert_state_consistent+0xe1/0x160
   [   67.932800]  ? irqentry_exit+0xb1/0x670
   [   67.933026]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
   [   67.933285] RIP: 0033:0x7b1e00a08e22
   [   67.933955] Code: 00 00 00 0f 1f 44 00 00 41 f7 c1 ff 0f 00 00 75 27
   55 89 cd 53 48 89 fb 48 85 ff 74 3b 41 89 ea 48 89 df b8 09 00 00 00 0f
   05 <48> 3d 00 f0 ff ff 77 76 5b 5d c3 0f 1f 00 48 8b 05 b9 7f 0d 00 64
   [   67.934388] RSP: 002b:00007b1e000f6dd8 EFLAGS: 00000246 ORIG_RAX:
   0000000000000009
   [   67.934683] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
   00007b1e00a08e22
   [   67.934896] RDX: 0000000000000003 RSI: 0000000000005000 RDI:
   0000000000000000
   [   67.935097] RBP: 0000000000000001 R08: 0000000000000003 R09:
   0000000000006000
   [   67.935296] R10: 0000000000000001 R11: 0000000000000246 R12:
   0000000000000021
   [   67.935496] R13: 0000000000000018 R14: 00007ffe0e245730 R15:
   00007b1dff8f7000
   [   67.935751]  </TASK>

   [   67.935915] Modules linked in:
   [   67.936701] ---[ end trace 0000000000000000 ]---
   [   67.937421] RIP: 0010:validate_page_before_insert+0x32/0x300
   [   67.937729] Code: e5 41 57 41 56 49 89 fe 41 55 41 54 53 48 89 f3 e8
   93 b5 a3 ff 48 8d 7b 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea
   03 <80> 3c 02 00 0f 85 7b 02 00 00 4c 8b 63 08 31 ff 4d 89 e5 41 83 e5
   [   67.938129] RSP: 0018:ffff88811b15f2f0 EFLAGS: 00000202
   [   67.938486] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
   0000000000000000
   [   67.938708] RDX: 0000000000000001 RSI: 0000000000000000 RDI:
   0000000000000008
   [   67.938954] RBP: ffff88811b15f318 R08: 0000000000000000 R09:
   0000000000000000
   [   67.939157] R10: 0000000000000000 R11: 0000000000000000 R12:
   ffff8881181eee00
   [   67.939446] R13: 0000000000000000 R14: ffff8881181eee00 R15:
   ffff8881181eee20
   [   67.939660] FS:  00007b1e000f76c0(0000) GS:ffff8884268e0000(0000)
   knlGS:0000000000000000
   [   67.939889] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   [   67.940079] CR2: 00007b1e00a24ac0 CR3: 0000000116eb3000 CR4:
   00000000000006f0

Reported-and-tested-by: nasm <n4sm@protonmail.com>
Suggested-by: nasm <n4sm@protonmail.com>
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe_mmap.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mmap.c b/drivers/infiniband/sw/rxe/rxe_mmap.c
index db380302149e..3407785c582c 100644
--- a/drivers/infiniband/sw/rxe/rxe_mmap.c
+++ b/drivers/infiniband/sw/rxe/rxe_mmap.c
@@ -93,18 +93,29 @@ int rxe_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
 	goto done;
 
 found_it:
+	/* Take a reference while still holding pending_lock; bail out if
+	 * the object is already being freed, to prevent a UAF. */
+	if (!kref_get_unless_zero(&ip->ref)) {
+		spin_unlock_bh(&rxe->pending_lock);
+		ret = -ENXIO;
+		goto done;
+	}
+
 	list_del_init(&ip->pending_mmaps);
 	spin_unlock_bh(&rxe->pending_lock);
 
+	vma->vm_ops = &rxe_vm_ops;
+	vma->vm_private_data = ip;
+
 	ret = remap_vmalloc_range(vma, ip->obj, 0);
 	if (ret) {
+		vma->vm_private_data = NULL;
+		vma->vm_ops = NULL;
+		kref_put(&ip->ref, rxe_mmap_release);
 		rxe_dbg_dev(rxe, "err %d from remap_vmalloc_range\n", ret);
 		goto done;
 	}
 
-	vma->vm_ops = &rxe_vm_ops;
-	vma->vm_private_data = ip;
-	rxe_vma_open(vma);
 done:
 	return ret;
 }
-- 
2.43.0
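
The ordering used by the diff above (take a reference before dropping
pending_lock, and fail if the count has already reached zero) can be
modelled in userspace with C11 atomics. This is a minimal sketch, not
the kernel implementation: struct model_obj, model_get_unless_zero(),
model_put() and model_fixed_mmap() are hypothetical names, and the lock
itself is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object guarded by a kref. */
struct model_obj {
	atomic_int ref;
	bool released;
};

/*
 * Models kref_get_unless_zero(): take a reference only if the count
 * is still non-zero. Returns true on success.
 */
static bool model_get_unless_zero(struct model_obj *o)
{
	int old = atomic_load(&o->ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Models kref_put(): whoever drops the last reference releases. */
static void model_put(struct model_obj *o)
{
	int old = atomic_fetch_sub(&o->ref, 1);

	assert(old > 0);
	if (old == 1)
		o->released = true;   /* models rxe_mmap_release() */
}

/*
 * Replays the fixed ordering against the same interleaving: the mapper
 * pins the object first, so the concurrent destroy only drops 2 -> 1.
 * Returns true if the object was still alive while "mapping".
 */
static bool model_fixed_mmap(struct model_obj *o)
{
	bool alive;

	if (!model_get_unless_zero(o))
		return false;         /* the patch returns -ENXIO here */

	model_put(o);             /* concurrent DESTROY_CQ: ref 2 -> 1 */
	alive = !o->released;     /* pages stay valid for the whole walk */

	/* On success the reference is kept on behalf of the mapping. */
	return alive;
}
```

With the count starting at 1, model_fixed_mmap() leaves it at 1 and the
object unreleased. A later model_put() releases it, after which
model_get_unless_zero() fails. In the diff the extra reference is
dropped only on the remap_vmalloc_range() error path. On success it is
assumed to be dropped when the mapping is closed.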

