public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [syzbot] [kernel?] kernel BUG in binder_inc_ref_for_node
@ 2024-07-13 10:25 syzbot
  2024-07-13 13:21 ` Hillf Danton
  2024-07-15 20:23 ` Carlos Llamas
  0 siblings, 2 replies; 14+ messages in thread
From: syzbot @ 2024-07-13 10:25 UTC (permalink / raw)
  To: arve, brauner, cmllamas, gregkh, joel, linux-kernel, maco, surenb,
	syzkaller-bugs, tkjos

Hello,

syzbot found the following issue on:

HEAD commit:    82d01fe6ee52 Add linux-next specific files for 20240709
git tree:       linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=106472a5980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=95a20e7acf357998
dashboard link: https://syzkaller.appspot.com/bug?extid=3dae065ca76952a67257
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15f2e87e980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16a4869e980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/12dcacb06142/disk-82d01fe6.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/6ef954821378/vmlinux-82d01fe6.xz
kernel image: https://storage.googleapis.com/syzbot-assets/9ebf01d42887/bzImage-82d01fe6.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3dae065ca76952a67257@syzkaller.appspotmail.com

------------[ cut here ]------------
kernel BUG at drivers/android/binder.c:1175!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 0 UID: 0 PID: 5100 Comm: syz-executor841 Not tainted 6.10.0-rc7-next-20240709-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
RIP: 0010:binder_get_ref_for_node_olocked drivers/android/binder.c:1175 [inline]
RIP: 0010:binder_inc_ref_for_node+0xdf7/0xe00 drivers/android/binder.c:1478
Code: e8 f8 e9 4e fd ff ff 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c 59 fd ff ff 48 89 df e8 24 e9 e8 f8 e9 4c fd ff ff e8 0a 15 82 f8 90 <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 0018:ffffc900038f75e8 EFLAGS: 00010293
RAX: ffffffff891176d6 RBX: 0000000000000000 RCX: ffff888077030000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888021c2e820 R08: ffffffff8911720b R09: 0000000000000000
R10: ffff88802f1b0330 R11: ffffed1005e36068 R12: 0000000000000000
R13: dffffc0000000000 R14: ffff88802f1b0428 R15: ffff88802f1b0410
FS:  000055557b30a380(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa59eb210e0 CR3: 000000001fee6000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 binder_thread_write drivers/android/binder.c:3946 [inline]
 binder_ioctl_write_read+0xc7b/0x8d60 drivers/android/binder.c:5163
 binder_ioctl+0x43d/0x1c70 drivers/android/binder.c:5449
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:907 [inline]
 __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fa59eaaa1e9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffea9c908d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa59eaaa1e9
RDX: 00000000200003c0 RSI: 00000000c0306201 RDI: 0000000000000003
RBP: 00000000000f4240 R08: 000055557b30b610 R09: 000055557b30b610
R10: 000055557b30b610 R11: 0000000000000246 R12: 00007fa59eaf80dc
R13: 00007fa59eaf30a3 R14: 00007ffea9c90900 R15: 00007ffea9c908f0
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:binder_get_ref_for_node_olocked drivers/android/binder.c:1175 [inline]
RIP: 0010:binder_inc_ref_for_node+0xdf7/0xe00 drivers/android/binder.c:1478
Code: e8 f8 e9 4e fd ff ff 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c 59 fd ff ff 48 89 df e8 24 e9 e8 f8 e9 4c fd ff ff e8 0a 15 82 f8 90 <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 0018:ffffc900038f75e8 EFLAGS: 00010293
RAX: ffffffff891176d6 RBX: 0000000000000000 RCX: ffff888077030000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888021c2e820 R08: ffffffff8911720b R09: 0000000000000000
R10: ffff88802f1b0330 R11: ffffed1005e36068 R12: 0000000000000000
R13: dffffc0000000000 R14: ffff88802f1b0428 R15: ffff88802f1b0410
FS:  000055557b30a380(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa59eb210e0 CR3: 000000001fee6000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [syzbot] [kernel?] kernel BUG in binder_inc_ref_for_node
  2024-07-13 10:25 [syzbot] [kernel?] kernel BUG in binder_inc_ref_for_node syzbot
@ 2024-07-13 13:21 ` Hillf Danton
  2024-07-13 13:56   ` syzbot
  2024-07-15 20:23 ` Carlos Llamas
  1 sibling, 1 reply; 14+ messages in thread
From: Hillf Danton @ 2024-07-13 13:21 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Sat, 13 Jul 2024 03:25:20 -0700
> syzbot found the following issue on:
> 
> HEAD commit:    82d01fe6ee52 Add linux-next specific files for 20240709
> git tree:       linux-next
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16a4869e980000

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git  82d01fe6ee52

--- x/drivers/android/binder.c
+++ y/drivers/android/binder.c
@@ -1131,6 +1131,7 @@ static struct binder_ref *binder_get_ref
 	struct binder_ref *ref;
 	struct rb_node *parent;
 	struct rb_node **p;
+	struct rb_node **p0, *pa0;
 	u32 desc;
 
 retry:
@@ -1147,6 +1148,8 @@ retry:
 		else
 			return ref;
 	}
+	p0 = p;
+	pa0 = parent;
 	if (!new_ref)
 		return NULL;
 
@@ -1158,11 +1161,10 @@ retry:
 	new_ref->data.debug_id = atomic_inc_return(&binder_last_id);
 	new_ref->proc = proc;
 	new_ref->node = node;
-	rb_link_node(&new_ref->rb_node_node, parent, p);
-	rb_insert_color(&new_ref->rb_node_node, &proc->refs_by_node);
 
 	new_ref->data.desc = desc;
 	p = &proc->refs_by_desc.rb_node;
+	parent = NULL;
 	while (*p) {
 		parent = *p;
 		ref = rb_entry(parent, struct binder_ref, rb_node_desc);
@@ -1172,11 +1174,14 @@ retry:
 		else if (new_ref->data.desc > ref->data.desc)
 			p = &(*p)->rb_right;
 		else
-			BUG();
+			return ref;
 	}
 	rb_link_node(&new_ref->rb_node_desc, parent, p);
 	rb_insert_color(&new_ref->rb_node_desc, &proc->refs_by_desc);
 
+	rb_link_node(&new_ref->rb_node_node, pa0, p0);
+	rb_insert_color(&new_ref->rb_node_node, &proc->refs_by_node);
+
 	binder_node_lock(node);
 	hlist_add_head(&new_ref->node_entry, &node->refs);
 
--

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [syzbot] [kernel?] kernel BUG in binder_inc_ref_for_node
  2024-07-13 13:21 ` Hillf Danton
@ 2024-07-13 13:56   ` syzbot
  0 siblings, 0 replies; 14+ messages in thread
From: syzbot @ 2024-07-13 13:56 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+3dae065ca76952a67257@syzkaller.appspotmail.com

Tested on:

commit:         82d01fe6 Add linux-next specific files for 20240709
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=12f2eb6e980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=95a20e7acf357998
dashboard link: https://syzkaller.appspot.com/bug?extid=3dae065ca76952a67257
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=10120b85980000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [syzbot] [kernel?] kernel BUG in binder_inc_ref_for_node
  2024-07-13 10:25 [syzbot] [kernel?] kernel BUG in binder_inc_ref_for_node syzbot
  2024-07-13 13:21 ` Hillf Danton
@ 2024-07-15 20:23 ` Carlos Llamas
  2024-07-15 23:52   ` syzbot
  1 sibling, 1 reply; 14+ messages in thread
From: Carlos Llamas @ 2024-07-15 20:23 UTC (permalink / raw)
  To: syzbot
  Cc: arve, brauner, gregkh, joel, linux-kernel, maco, surenb,
	syzkaller-bugs, tkjos

On Sat, Jul 13, 2024 at 03:25:20AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    82d01fe6ee52 Add linux-next specific files for 20240709
> git tree:       linux-next
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=106472a5980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=95a20e7acf357998
> dashboard link: https://syzkaller.appspot.com/bug?extid=3dae065ca76952a67257
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15f2e87e980000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16a4869e980000

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git 82d01fe6ee52

diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index 0c2161b1f057..f54e5c0c7d04 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -1046,13 +1046,13 @@ static struct binder_ref *binder_get_ref_olocked(struct binder_proc *proc,
 }
 
 /* Find the smallest unused descriptor the "slow way" */
-static u32 slow_desc_lookup_olocked(struct binder_proc *proc)
+static u32 slow_desc_lookup_olocked(struct binder_proc *proc, u32 offset)
 {
 	struct binder_ref *ref;
 	struct rb_node *n;
 	u32 desc;
 
-	desc = 1;
+	desc = offset;
 	for (n = rb_first(&proc->refs_by_desc); n; n = rb_next(n)) {
 		ref = rb_entry(n, struct binder_ref, rb_node_desc);
 		if (ref->data.desc > desc)
@@ -1073,21 +1073,18 @@ static int get_ref_desc_olocked(struct binder_proc *proc,
 				u32 *desc)
 {
 	struct dbitmap *dmap = &proc->dmap;
+	unsigned int nbits, offset;
 	unsigned long *new, bit;
-	unsigned int nbits;
 
 	/* 0 is reserved for the context manager */
-	if (node == proc->context->binder_context_mgr_node) {
-		*desc = 0;
-		return 0;
-	}
+	offset = (node == proc->context->binder_context_mgr_node) ? 0 : 1;
 
 	if (!dbitmap_enabled(dmap)) {
-		*desc = slow_desc_lookup_olocked(proc);
+		*desc = slow_desc_lookup_olocked(proc, offset);
 		return 0;
 	}
 
-	if (dbitmap_acquire_first_zero_bit(dmap, &bit) == 0) {
+	if (dbitmap_acquire_next_zero_bit(dmap, offset, &bit) == 0) {
 		*desc = bit;
 		return 0;
 	}
diff --git a/drivers/android/dbitmap.h b/drivers/android/dbitmap.h
index b8ac7b4764fd..1d58c2e7abd6 100644
--- a/drivers/android/dbitmap.h
+++ b/drivers/android/dbitmap.h
@@ -6,8 +6,7 @@
  *
  * Used by the binder driver to optimize the allocation of the smallest
  * available descriptor ID. Each bit in the bitmap represents the state
- * of an ID, with the exception of BIT(0) which is used exclusively to
- * reference binder's context manager.
+ * of an ID.
  *
  * A dbitmap can grow or shrink as needed. This part has been designed
  * considering that users might need to briefly release their locks in
@@ -132,16 +131,17 @@ dbitmap_grow(struct dbitmap *dmap, unsigned long *new, unsigned int nbits)
 }
 
 /*
- * Finds and sets the first zero bit in the bitmap. Upon success @bit
+ * Finds and sets the next zero bit in the bitmap. Upon success @bit
  * is populated with the index and 0 is returned. Otherwise, -ENOSPC
  * is returned to indicate that a dbitmap_grow() is needed.
  */
 static inline int
-dbitmap_acquire_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
+dbitmap_acquire_next_zero_bit(struct dbitmap *dmap, unsigned long offset,
+			      unsigned long *bit)
 {
 	unsigned long n;
 
-	n = find_first_zero_bit(dmap->map, dmap->nbits);
+	n = find_next_zero_bit(dmap->map, dmap->nbits, offset);
 	if (n == dmap->nbits)
 		return -ENOSPC;
 
@@ -154,9 +154,7 @@ dbitmap_acquire_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
 static inline void
 dbitmap_clear_bit(struct dbitmap *dmap, unsigned long bit)
 {
-	/* BIT(0) should always set for the context manager */
-	if (bit)
-		clear_bit(bit, dmap->map);
+	clear_bit(bit, dmap->map);
 }
 
 static inline int dbitmap_init(struct dbitmap *dmap)
@@ -168,8 +166,6 @@ static inline int dbitmap_init(struct dbitmap *dmap)
 	}
 
 	dmap->nbits = NBITS_MIN;
-	/* BIT(0) is reserved for the context manager */
-	set_bit(0, dmap->map);
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [syzbot] [kernel?] kernel BUG in binder_inc_ref_for_node
  2024-07-15 20:23 ` Carlos Llamas
@ 2024-07-15 23:52   ` syzbot
  2024-07-16  4:28     ` [PATCH] binder: fix descriptor lookup for context manager Carlos Llamas
  0 siblings, 1 reply; 14+ messages in thread
From: syzbot @ 2024-07-15 23:52 UTC (permalink / raw)
  To: arve, brauner, cmllamas, gregkh, joel, linux-kernel, maco, surenb,
	syzkaller-bugs, tkjos

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+3dae065ca76952a67257@syzkaller.appspotmail.com

Tested on:

commit:         82d01fe6 Add linux-next specific files for 20240709
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=1339eb6e980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=95a20e7acf357998
dashboard link: https://syzkaller.appspot.com/bug?extid=3dae065ca76952a67257
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1198f85e980000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] binder: fix descriptor lookup for context manager
  2024-07-15 23:52   ` syzbot
@ 2024-07-16  4:28     ` Carlos Llamas
  2024-07-16 17:40       ` Todd Kjos
  2024-07-22 11:30       ` Alice Ryhl
  0 siblings, 2 replies; 14+ messages in thread
From: Carlos Llamas @ 2024-07-16  4:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Alice Ryhl
  Cc: linux-kernel, kernel-team, syzkaller-bugs, stable,
	syzbot+3dae065ca76952a67257

In commit 15d9da3f818c ("binder: use bitmap for faster descriptor
lookup"), it was incorrectly assumed that references to the context
manager node should always get descriptor zero assigned to them.

However, if the context manager dies and a new process takes its place,
then assigning descriptor zero to the new context manager might lead to
collisions, as there could still be references to the older node. This
issue was reported by syzbot with the following trace:

  kernel BUG at drivers/android/binder.c:1173!
  Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 1 PID: 447 Comm: binder-util Not tainted 6.10.0-rc6-00348-g31643d84b8c3 #10
  Hardware name: linux,dummy-virt (DT)
  pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : binder_inc_ref_for_node+0x500/0x544
  lr : binder_inc_ref_for_node+0x1e4/0x544
  sp : ffff80008112b940
  x29: ffff80008112b940 x28: ffff0e0e40310780 x27: 0000000000000000
  x26: 0000000000000001 x25: ffff0e0e40310738 x24: ffff0e0e4089ba34
  x23: ffff0e0e40310b00 x22: ffff80008112bb50 x21: ffffaf7b8f246970
  x20: ffffaf7b8f773f08 x19: ffff0e0e4089b800 x18: 0000000000000000
  x17: 0000000000000000 x16: 0000000000000000 x15: 000000002de4aa60
  x14: 0000000000000000 x13: 2de4acf000000000 x12: 0000000000000020
  x11: 0000000000000018 x10: 0000000000000020 x9 : ffffaf7b90601000
  x8 : ffff0e0e48739140 x7 : 0000000000000000 x6 : 000000000000003f
  x5 : ffff0e0e40310b28 x4 : 0000000000000000 x3 : ffff0e0e40310720
  x2 : ffff0e0e40310728 x1 : 0000000000000000 x0 : ffff0e0e40310710
  Call trace:
   binder_inc_ref_for_node+0x500/0x544
   binder_transaction+0xf68/0x2620
   binder_thread_write+0x5bc/0x139c
   binder_ioctl+0xef4/0x10c8
  [...]

This patch adds back the previous behavior of assigning the next
non-zero descriptor if references to previous context managers still
exist. It amends both strategies, the newer dbitmap code and also the
legacy slow_desc_lookup_olocked(), by allowing them to start looking
for available descriptors at a given offset.

Fixes: 15d9da3f818c ("binder: use bitmap for faster descriptor lookup")
Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+3dae065ca76952a67257@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000c1c0a0061d1e6979@google.com/
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 drivers/android/binder.c  | 15 ++++++---------
 drivers/android/dbitmap.h | 16 ++++++----------
 2 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index f26286e3713e..905290c98c3c 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -1044,13 +1044,13 @@ static struct binder_ref *binder_get_ref_olocked(struct binder_proc *proc,
 }
 
 /* Find the smallest unused descriptor the "slow way" */
-static u32 slow_desc_lookup_olocked(struct binder_proc *proc)
+static u32 slow_desc_lookup_olocked(struct binder_proc *proc, u32 offset)
 {
 	struct binder_ref *ref;
 	struct rb_node *n;
 	u32 desc;
 
-	desc = 1;
+	desc = offset;
 	for (n = rb_first(&proc->refs_by_desc); n; n = rb_next(n)) {
 		ref = rb_entry(n, struct binder_ref, rb_node_desc);
 		if (ref->data.desc > desc)
@@ -1071,21 +1071,18 @@ static int get_ref_desc_olocked(struct binder_proc *proc,
 				u32 *desc)
 {
 	struct dbitmap *dmap = &proc->dmap;
+	unsigned int nbits, offset;
 	unsigned long *new, bit;
-	unsigned int nbits;
 
 	/* 0 is reserved for the context manager */
-	if (node == proc->context->binder_context_mgr_node) {
-		*desc = 0;
-		return 0;
-	}
+	offset = (node == proc->context->binder_context_mgr_node) ? 0 : 1;
 
 	if (!dbitmap_enabled(dmap)) {
-		*desc = slow_desc_lookup_olocked(proc);
+		*desc = slow_desc_lookup_olocked(proc, offset);
 		return 0;
 	}
 
-	if (dbitmap_acquire_first_zero_bit(dmap, &bit) == 0) {
+	if (dbitmap_acquire_next_zero_bit(dmap, offset, &bit) == 0) {
 		*desc = bit;
 		return 0;
 	}
diff --git a/drivers/android/dbitmap.h b/drivers/android/dbitmap.h
index b8ac7b4764fd..1d58c2e7abd6 100644
--- a/drivers/android/dbitmap.h
+++ b/drivers/android/dbitmap.h
@@ -6,8 +6,7 @@
  *
  * Used by the binder driver to optimize the allocation of the smallest
  * available descriptor ID. Each bit in the bitmap represents the state
- * of an ID, with the exception of BIT(0) which is used exclusively to
- * reference binder's context manager.
+ * of an ID.
  *
  * A dbitmap can grow or shrink as needed. This part has been designed
  * considering that users might need to briefly release their locks in
@@ -132,16 +131,17 @@ dbitmap_grow(struct dbitmap *dmap, unsigned long *new, unsigned int nbits)
 }
 
 /*
- * Finds and sets the first zero bit in the bitmap. Upon success @bit
+ * Finds and sets the next zero bit in the bitmap. Upon success @bit
  * is populated with the index and 0 is returned. Otherwise, -ENOSPC
  * is returned to indicate that a dbitmap_grow() is needed.
  */
 static inline int
-dbitmap_acquire_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
+dbitmap_acquire_next_zero_bit(struct dbitmap *dmap, unsigned long offset,
+			      unsigned long *bit)
 {
 	unsigned long n;
 
-	n = find_first_zero_bit(dmap->map, dmap->nbits);
+	n = find_next_zero_bit(dmap->map, dmap->nbits, offset);
 	if (n == dmap->nbits)
 		return -ENOSPC;
 
@@ -154,9 +154,7 @@ dbitmap_acquire_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
 static inline void
 dbitmap_clear_bit(struct dbitmap *dmap, unsigned long bit)
 {
-	/* BIT(0) should always set for the context manager */
-	if (bit)
-		clear_bit(bit, dmap->map);
+	clear_bit(bit, dmap->map);
 }
 
 static inline int dbitmap_init(struct dbitmap *dmap)
@@ -168,8 +166,6 @@ static inline int dbitmap_init(struct dbitmap *dmap)
 	}
 
 	dmap->nbits = NBITS_MIN;
-	/* BIT(0) is reserved for the context manager */
-	set_bit(0, dmap->map);
 
 	return 0;
 }
-- 
2.45.2.993.g49e7a77208-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] binder: fix descriptor lookup for context manager
  2024-07-16  4:28     ` [PATCH] binder: fix descriptor lookup for context manager Carlos Llamas
@ 2024-07-16 17:40       ` Todd Kjos
  2024-07-16 18:48         ` Carlos Llamas
  2024-07-22 10:57         ` Alice Ryhl
  2024-07-22 11:30       ` Alice Ryhl
  1 sibling, 2 replies; 14+ messages in thread
From: Todd Kjos @ 2024-07-16 17:40 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, Alice Ryhl, linux-kernel, kernel-team,
	syzkaller-bugs, stable, syzbot+3dae065ca76952a67257

On Mon, Jul 15, 2024 at 9:29 PM Carlos Llamas <cmllamas@google.com> wrote:
>
> In commit 15d9da3f818c ("binder: use bitmap for faster descriptor
> lookup"), it was incorrectly assumed that references to the context
> manager node should always get descriptor zero assigned to them.
>
> However, if the context manager dies and a new process takes its place,
> then assigning descriptor zero to the new context manager might lead to
> collisions, as there could still be references to the older node. This
> issue was reported by syzbot with the following trace:
>
>   kernel BUG at drivers/android/binder.c:1173!
>   Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>   Modules linked in:
>   CPU: 1 PID: 447 Comm: binder-util Not tainted 6.10.0-rc6-00348-g31643d84b8c3 #10
>   Hardware name: linux,dummy-virt (DT)
>   pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>   pc : binder_inc_ref_for_node+0x500/0x544
>   lr : binder_inc_ref_for_node+0x1e4/0x544
>   sp : ffff80008112b940
>   x29: ffff80008112b940 x28: ffff0e0e40310780 x27: 0000000000000000
>   x26: 0000000000000001 x25: ffff0e0e40310738 x24: ffff0e0e4089ba34
>   x23: ffff0e0e40310b00 x22: ffff80008112bb50 x21: ffffaf7b8f246970
>   x20: ffffaf7b8f773f08 x19: ffff0e0e4089b800 x18: 0000000000000000
>   x17: 0000000000000000 x16: 0000000000000000 x15: 000000002de4aa60
>   x14: 0000000000000000 x13: 2de4acf000000000 x12: 0000000000000020
>   x11: 0000000000000018 x10: 0000000000000020 x9 : ffffaf7b90601000
>   x8 : ffff0e0e48739140 x7 : 0000000000000000 x6 : 000000000000003f
>   x5 : ffff0e0e40310b28 x4 : 0000000000000000 x3 : ffff0e0e40310720
>   x2 : ffff0e0e40310728 x1 : 0000000000000000 x0 : ffff0e0e40310710
>   Call trace:
>    binder_inc_ref_for_node+0x500/0x544
>    binder_transaction+0xf68/0x2620
>    binder_thread_write+0x5bc/0x139c
>    binder_ioctl+0xef4/0x10c8
>   [...]
>
> This patch adds back the previous behavior of assigning the next
> non-zero descriptor if references to previous context managers still
> exist. It amends both strategies, the newer dbitmap code and also the
> legacy slow_desc_lookup_olocked(), by allowing them to start looking
> for available descriptors at a given offset.
>
> Fixes: 15d9da3f818c ("binder: use bitmap for faster descriptor lookup")
> Cc: stable@vger.kernel.org
> Reported-and-tested-by: syzbot+3dae065ca76952a67257@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000c1c0a0061d1e6979@google.com/
> Signed-off-by: Carlos Llamas <cmllamas@google.com>
> ---
>  drivers/android/binder.c  | 15 ++++++---------
>  drivers/android/dbitmap.h | 16 ++++++----------
>  2 files changed, 12 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> index f26286e3713e..905290c98c3c 100644
> --- a/drivers/android/binder.c
> +++ b/drivers/android/binder.c
> @@ -1044,13 +1044,13 @@ static struct binder_ref *binder_get_ref_olocked(struct binder_proc *proc,
>  }
>
>  /* Find the smallest unused descriptor the "slow way" */
> -static u32 slow_desc_lookup_olocked(struct binder_proc *proc)
> +static u32 slow_desc_lookup_olocked(struct binder_proc *proc, u32 offset)
>  {
>         struct binder_ref *ref;
>         struct rb_node *n;
>         u32 desc;
>
> -       desc = 1;
> +       desc = offset;
>         for (n = rb_first(&proc->refs_by_desc); n; n = rb_next(n)) {
>                 ref = rb_entry(n, struct binder_ref, rb_node_desc);
>                 if (ref->data.desc > desc)
> @@ -1071,21 +1071,18 @@ static int get_ref_desc_olocked(struct binder_proc *proc,
>                                 u32 *desc)
>  {
>         struct dbitmap *dmap = &proc->dmap;
> +       unsigned int nbits, offset;
>         unsigned long *new, bit;
> -       unsigned int nbits;
>
>         /* 0 is reserved for the context manager */
> -       if (node == proc->context->binder_context_mgr_node) {
> -               *desc = 0;
> -               return 0;
> -       }
> +       offset = (node == proc->context->binder_context_mgr_node) ? 0 : 1;

If context manager doesn't need to be bit 0 anymore, then why do we
bother to prefer bit 0? Does it matter?

It would simplify the code below if the offset is always 0 since you
wouldn't need an offset at all.

>
>         if (!dbitmap_enabled(dmap)) {
> -               *desc = slow_desc_lookup_olocked(proc);
> +               *desc = slow_desc_lookup_olocked(proc, offset);
>                 return 0;
>         }
>
> -       if (dbitmap_acquire_first_zero_bit(dmap, &bit) == 0) {
> +       if (dbitmap_acquire_next_zero_bit(dmap, offset, &bit) == 0) {
>                 *desc = bit;
>                 return 0;
>         }
> diff --git a/drivers/android/dbitmap.h b/drivers/android/dbitmap.h
> index b8ac7b4764fd..1d58c2e7abd6 100644
> --- a/drivers/android/dbitmap.h
> +++ b/drivers/android/dbitmap.h
> @@ -6,8 +6,7 @@
>   *
>   * Used by the binder driver to optimize the allocation of the smallest
>   * available descriptor ID. Each bit in the bitmap represents the state
> - * of an ID, with the exception of BIT(0) which is used exclusively to
> - * reference binder's context manager.
> + * of an ID.
>   *
>   * A dbitmap can grow or shrink as needed. This part has been designed
>   * considering that users might need to briefly release their locks in
> @@ -132,16 +131,17 @@ dbitmap_grow(struct dbitmap *dmap, unsigned long *new, unsigned int nbits)
>  }
>
>  /*
> - * Finds and sets the first zero bit in the bitmap. Upon success @bit
> + * Finds and sets the next zero bit in the bitmap. Upon success @bit
>   * is populated with the index and 0 is returned. Otherwise, -ENOSPC
>   * is returned to indicate that a dbitmap_grow() is needed.
>   */
>  static inline int
> -dbitmap_acquire_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
> +dbitmap_acquire_next_zero_bit(struct dbitmap *dmap, unsigned long offset,
> +                             unsigned long *bit)
>  {
>         unsigned long n;
>
> -       n = find_first_zero_bit(dmap->map, dmap->nbits);
> +       n = find_next_zero_bit(dmap->map, dmap->nbits, offset);
>         if (n == dmap->nbits)
>                 return -ENOSPC;
>
> @@ -154,9 +154,7 @@ dbitmap_acquire_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
>  static inline void
>  dbitmap_clear_bit(struct dbitmap *dmap, unsigned long bit)
>  {
> -       /* BIT(0) should always set for the context manager */
> -       if (bit)
> -               clear_bit(bit, dmap->map);
> +       clear_bit(bit, dmap->map);
>  }
>
>  static inline int dbitmap_init(struct dbitmap *dmap)
> @@ -168,8 +166,6 @@ static inline int dbitmap_init(struct dbitmap *dmap)
>         }
>
>         dmap->nbits = NBITS_MIN;
> -       /* BIT(0) is reserved for the context manager */
> -       set_bit(0, dmap->map);
>
>         return 0;
>  }
> --
> 2.45.2.993.g49e7a77208-goog
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] binder: fix descriptor lookup for context manager
  2024-07-16 17:40       ` Todd Kjos
@ 2024-07-16 18:48         ` Carlos Llamas
  2024-07-16 19:00           ` Carlos Llamas
  2024-07-22 10:57         ` Alice Ryhl
  1 sibling, 1 reply; 14+ messages in thread
From: Carlos Llamas @ 2024-07-16 18:48 UTC (permalink / raw)
  To: Todd Kjos
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, Alice Ryhl, linux-kernel, kernel-team,
	syzkaller-bugs, stable, syzbot+3dae065ca76952a67257

On Tue, Jul 16, 2024 at 10:40:20AM -0700, Todd Kjos wrote:
> If context manager doesn't need to be bit 0 anymore, then why do we
> bother to prefer bit 0? Does it matter?
> 
> It would simplify the code below if the offset is always 0 since you
> wouldn't need an offset at all.

Yes, it would make things simplier if references to the context manager
could get any descriptor id. However, there seems to be an expectation
from libbinder that this descriptor would be zero. At least according to
some folks more familiar with userspace binder than myself.

I think we can revisit this expectation though and also look closer at
the scenario of a context manager "swap". The procs can still reach the
new context manager using descriptor 0. However, this may cause some
issues with operations with refs such as BC_INCREFS/BC_DECREFS.

AFAICT, the context manager doesn't even need a reference. But while we
dig furhter into this I think the best option is to keep the behavior
the same for now: reserve descriptor zero for the context manager node
unless it's already taken. Changing this is non-trivial IMO.

--
Carlos Llamas

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] binder: fix descriptor lookup for context manager
  2024-07-16 18:48         ` Carlos Llamas
@ 2024-07-16 19:00           ` Carlos Llamas
  0 siblings, 0 replies; 14+ messages in thread
From: Carlos Llamas @ 2024-07-16 19:00 UTC (permalink / raw)
  To: Todd Kjos
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, Alice Ryhl, linux-kernel, kernel-team,
	syzkaller-bugs, stable, syzbot+3dae065ca76952a67257

On Tue, Jul 16, 2024 at 06:48:54PM +0000, Carlos Llamas wrote:
> On Tue, Jul 16, 2024 at 10:40:20AM -0700, Todd Kjos wrote:
> > If context manager doesn't need to be bit 0 anymore, then why do we
> > bother to prefer bit 0? Does it matter?
> > 
> > It would simplify the code below if the offset is always 0 since you
> > wouldn't need an offset at all.
> 
> Yes, it would make things simplier if references to the context manager
> could get any descriptor id. However, there seems to be an expectation
> from libbinder that this descriptor would be zero. At least according to
> some folks more familiar with userspace binder than myself.
> 
> I think we can revisit this expectation though and also look closer at
> the scenario of a context manager "swap". The procs can still reach the
> new context manager using descriptor 0. However, this may cause some
> issues with operations with refs such as BC_INCREFS/BC_DECREFS.
> 
> AFAICT, the context manager doesn't even need a reference. But while we
> dig furhter into this I think the best option is to keep the behavior
> the same for now: reserve descriptor zero for the context manager node
> unless it's already taken. Changing this is non-trivial IMO.
> 
> --
> Carlos Llamas

Also, we need to consider that references to regular nodes (not context
manager) cannot get descriptor zero assigned to them for obvious
reasons. So descriptor zero is always reserved for the context manager,
but there might be certain scenarios in which references to the context
manager get a non-zero descriptor.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] binder: fix descriptor lookup for context manager
  2024-07-16 17:40       ` Todd Kjos
  2024-07-16 18:48         ` Carlos Llamas
@ 2024-07-22 10:57         ` Alice Ryhl
  2024-07-22 15:39           ` Carlos Llamas
  1 sibling, 1 reply; 14+ messages in thread
From: Alice Ryhl @ 2024-07-22 10:57 UTC (permalink / raw)
  To: Todd Kjos
  Cc: Carlos Llamas, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, linux-kernel, kernel-team, syzkaller-bugs,
	stable, syzbot+3dae065ca76952a67257

On Tue, Jul 16, 2024 at 7:40 PM Todd Kjos <tkjos@google.com> wrote:
>
> On Mon, Jul 15, 2024 at 9:29 PM Carlos Llamas <cmllamas@google.com> wrote:
> >         /* 0 is reserved for the context manager */
> > -       if (node == proc->context->binder_context_mgr_node) {
> > -               *desc = 0;
> > -               return 0;
> > -       }
> > +       offset = (node == proc->context->binder_context_mgr_node) ? 0 : 1;
>
> If context manager doesn't need to be bit 0 anymore, then why do we
> bother to prefer bit 0? Does it matter?
>
> It would simplify the code below if the offset is always 0 since you
> wouldn't need an offset at all.

Userspace assumes that sending a message to handle 0 means that the
current context manager receives it. If we assign anything that is not
the context manager to bit 0, then libbinder will send ctxmgr messages
to random other processes. I don't think libbinder handles the case
where context manager is restarted well at all. Most likely, if we hit
this condition in real life, processes that had a non-zero refcount to
the context manager will lose the ability to interact with ctxmgr
until they are restarted.

I think this patch just needs to make sure that this scenario doesn't
lead to a UAF in the kernel. Ensuring that userspace handles it
gracefully is another matter.

Alice

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] binder: fix descriptor lookup for context manager
  2024-07-16  4:28     ` [PATCH] binder: fix descriptor lookup for context manager Carlos Llamas
  2024-07-16 17:40       ` Todd Kjos
@ 2024-07-22 11:30       ` Alice Ryhl
  2024-07-22 15:05         ` [PATCH v2] " Carlos Llamas
  1 sibling, 1 reply; 14+ messages in thread
From: Alice Ryhl @ 2024-07-22 11:30 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, linux-kernel, kernel-team, syzkaller-bugs,
	stable, syzbot+3dae065ca76952a67257

On Tue, Jul 16, 2024 at 6:29 AM Carlos Llamas <cmllamas@google.com> wrote:
> In commit 15d9da3f818c ("binder: use bitmap for faster descriptor
> lookup"), it was incorrectly assumed that references to the context
> manager node should always get descriptor zero assigned to them.
>
> However, if the context manager dies and a new process takes its place,
> then assigning descriptor zero to the new context manager might lead to
> collisions, as there could still be references to the older node. This
> issue was reported by syzbot with the following trace:
>
>   kernel BUG at drivers/android/binder.c:1173!
>   Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>   Modules linked in:
>   CPU: 1 PID: 447 Comm: binder-util Not tainted 6.10.0-rc6-00348-g31643d84b8c3 #10
>   Hardware name: linux,dummy-virt (DT)
>   pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>   pc : binder_inc_ref_for_node+0x500/0x544
>   lr : binder_inc_ref_for_node+0x1e4/0x544
>   sp : ffff80008112b940
>   x29: ffff80008112b940 x28: ffff0e0e40310780 x27: 0000000000000000
>   x26: 0000000000000001 x25: ffff0e0e40310738 x24: ffff0e0e4089ba34
>   x23: ffff0e0e40310b00 x22: ffff80008112bb50 x21: ffffaf7b8f246970
>   x20: ffffaf7b8f773f08 x19: ffff0e0e4089b800 x18: 0000000000000000
>   x17: 0000000000000000 x16: 0000000000000000 x15: 000000002de4aa60
>   x14: 0000000000000000 x13: 2de4acf000000000 x12: 0000000000000020
>   x11: 0000000000000018 x10: 0000000000000020 x9 : ffffaf7b90601000
>   x8 : ffff0e0e48739140 x7 : 0000000000000000 x6 : 000000000000003f
>   x5 : ffff0e0e40310b28 x4 : 0000000000000000 x3 : ffff0e0e40310720
>   x2 : ffff0e0e40310728 x1 : 0000000000000000 x0 : ffff0e0e40310710
>   Call trace:
>    binder_inc_ref_for_node+0x500/0x544
>    binder_transaction+0xf68/0x2620
>    binder_thread_write+0x5bc/0x139c
>    binder_ioctl+0xef4/0x10c8
>   [...]
>
> This patch adds back the previous behavior of assigning the next
> non-zero descriptor if references to previous context managers still
> exist. It amends both strategies, the newer dbitmap code and also the
> legacy slow_desc_lookup_olocked(), by allowing them to start looking
> for available descriptors at a given offset.
>
> Fixes: 15d9da3f818c ("binder: use bitmap for faster descriptor lookup")
> Cc: stable@vger.kernel.org
> Reported-and-tested-by: syzbot+3dae065ca76952a67257@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000c1c0a0061d1e6979@google.com/
> Signed-off-by: Carlos Llamas <cmllamas@google.com>

You are changing dbitmap so that BIT(0) is no longer guaranteed to be
set, so you should update this comment:

         /*
          * Note that find_last_bit() returns dmap->nbits when no bits
          * are set. While this is technically not possible here since
          * BIT(0) is always set, this check is left for extra safety.
          */
         if (bit == dmap->nbits)
                  return NBITS_MIN;

Otherwise LGTM. With the above comment fixed:

Reviewed-by: Alice Ryhl <aliceryhl@google.com>

Alice

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2] binder: fix descriptor lookup for context manager
  2024-07-22 11:30       ` Alice Ryhl
@ 2024-07-22 15:05         ` Carlos Llamas
  2024-07-22 15:50           ` Carlos Llamas
  0 siblings, 1 reply; 14+ messages in thread
From: Carlos Llamas @ 2024-07-22 15:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan, Alice Ryhl
  Cc: linux-kernel, kernel-team, syzkaller-bugs, stable,
	syzbot+3dae065ca76952a67257

In commit 15d9da3f818c ("binder: use bitmap for faster descriptor
lookup"), it was incorrectly assumed that references to the context
manager node should always get descriptor zero assigned to them.

However, if the context manager dies and a new process takes its place,
then assigning descriptor zero to the new context manager might lead to
collisions, as there could still be references to the older node. This
issue was reported by syzbot with the following trace:

  kernel BUG at drivers/android/binder.c:1173!
  Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 1 PID: 447 Comm: binder-util Not tainted 6.10.0-rc6-00348-g31643d84b8c3 #10
  Hardware name: linux,dummy-virt (DT)
  pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : binder_inc_ref_for_node+0x500/0x544
  lr : binder_inc_ref_for_node+0x1e4/0x544
  sp : ffff80008112b940
  x29: ffff80008112b940 x28: ffff0e0e40310780 x27: 0000000000000000
  x26: 0000000000000001 x25: ffff0e0e40310738 x24: ffff0e0e4089ba34
  x23: ffff0e0e40310b00 x22: ffff80008112bb50 x21: ffffaf7b8f246970
  x20: ffffaf7b8f773f08 x19: ffff0e0e4089b800 x18: 0000000000000000
  x17: 0000000000000000 x16: 0000000000000000 x15: 000000002de4aa60
  x14: 0000000000000000 x13: 2de4acf000000000 x12: 0000000000000020
  x11: 0000000000000018 x10: 0000000000000020 x9 : ffffaf7b90601000
  x8 : ffff0e0e48739140 x7 : 0000000000000000 x6 : 000000000000003f
  x5 : ffff0e0e40310b28 x4 : 0000000000000000 x3 : ffff0e0e40310720
  x2 : ffff0e0e40310728 x1 : 0000000000000000 x0 : ffff0e0e40310710
  Call trace:
   binder_inc_ref_for_node+0x500/0x544
   binder_transaction+0xf68/0x2620
   binder_thread_write+0x5bc/0x139c
   binder_ioctl+0xef4/0x10c8
  [...]

This patch adds back the previous behavior of assigning the next
non-zero descriptor if references to previous context managers still
exist. It amends both strategies, the newer dbitmap code and also the
legacy slow_desc_lookup_olocked(), by allowing them to start looking
for available descriptors at a given offset.

Fixes: 15d9da3f818c ("binder: use bitmap for faster descriptor lookup")
Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+3dae065ca76952a67257@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000c1c0a0061d1e6979@google.com/
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 drivers/android/binder.c  | 15 ++++++---------
 drivers/android/dbitmap.h | 22 +++++++---------------
 2 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index f26286e3713e..905290c98c3c 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -1044,13 +1044,13 @@ static struct binder_ref *binder_get_ref_olocked(struct binder_proc *proc,
 }
 
 /* Find the smallest unused descriptor the "slow way" */
-static u32 slow_desc_lookup_olocked(struct binder_proc *proc)
+static u32 slow_desc_lookup_olocked(struct binder_proc *proc, u32 offset)
 {
 	struct binder_ref *ref;
 	struct rb_node *n;
 	u32 desc;
 
-	desc = 1;
+	desc = offset;
 	for (n = rb_first(&proc->refs_by_desc); n; n = rb_next(n)) {
 		ref = rb_entry(n, struct binder_ref, rb_node_desc);
 		if (ref->data.desc > desc)
@@ -1071,21 +1071,18 @@ static int get_ref_desc_olocked(struct binder_proc *proc,
 				u32 *desc)
 {
 	struct dbitmap *dmap = &proc->dmap;
+	unsigned int nbits, offset;
 	unsigned long *new, bit;
-	unsigned int nbits;
 
 	/* 0 is reserved for the context manager */
-	if (node == proc->context->binder_context_mgr_node) {
-		*desc = 0;
-		return 0;
-	}
+	offset = (node == proc->context->binder_context_mgr_node) ? 0 : 1;
 
 	if (!dbitmap_enabled(dmap)) {
-		*desc = slow_desc_lookup_olocked(proc);
+		*desc = slow_desc_lookup_olocked(proc, offset);
 		return 0;
 	}
 
-	if (dbitmap_acquire_first_zero_bit(dmap, &bit) == 0) {
+	if (dbitmap_acquire_next_zero_bit(dmap, offset, &bit) == 0) {
 		*desc = bit;
 		return 0;
 	}
diff --git a/drivers/android/dbitmap.h b/drivers/android/dbitmap.h
index b8ac7b4764fd..956f1bd087d1 100644
--- a/drivers/android/dbitmap.h
+++ b/drivers/android/dbitmap.h
@@ -6,8 +6,7 @@
  *
  * Used by the binder driver to optimize the allocation of the smallest
  * available descriptor ID. Each bit in the bitmap represents the state
- * of an ID, with the exception of BIT(0) which is used exclusively to
- * reference binder's context manager.
+ * of an ID.
  *
  * A dbitmap can grow or shrink as needed. This part has been designed
  * considering that users might need to briefly release their locks in
@@ -58,11 +57,7 @@ static inline unsigned int dbitmap_shrink_nbits(struct dbitmap *dmap)
 	if (bit < (dmap->nbits >> 2))
 		return dmap->nbits >> 1;
 
-	/*
-	 * Note that find_last_bit() returns dmap->nbits when no bits
-	 * are set. While this is technically not possible here since
-	 * BIT(0) is always set, this check is left for extra safety.
-	 */
+	/* find_last_bit() returns dmap->nbits when no bits are set. */
 	if (bit == dmap->nbits)
 		return NBITS_MIN;
 
@@ -132,16 +127,17 @@ dbitmap_grow(struct dbitmap *dmap, unsigned long *new, unsigned int nbits)
 }
 
 /*
- * Finds and sets the first zero bit in the bitmap. Upon success @bit
+ * Finds and sets the next zero bit in the bitmap. Upon success @bit
  * is populated with the index and 0 is returned. Otherwise, -ENOSPC
  * is returned to indicate that a dbitmap_grow() is needed.
  */
 static inline int
-dbitmap_acquire_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
+dbitmap_acquire_next_zero_bit(struct dbitmap *dmap, unsigned long offset,
+			      unsigned long *bit)
 {
 	unsigned long n;
 
-	n = find_first_zero_bit(dmap->map, dmap->nbits);
+	n = find_next_zero_bit(dmap->map, dmap->nbits, offset);
 	if (n == dmap->nbits)
 		return -ENOSPC;
 
@@ -154,9 +150,7 @@ dbitmap_acquire_first_zero_bit(struct dbitmap *dmap, unsigned long *bit)
 static inline void
 dbitmap_clear_bit(struct dbitmap *dmap, unsigned long bit)
 {
-	/* BIT(0) should always set for the context manager */
-	if (bit)
-		clear_bit(bit, dmap->map);
+	clear_bit(bit, dmap->map);
 }
 
 static inline int dbitmap_init(struct dbitmap *dmap)
@@ -168,8 +162,6 @@ static inline int dbitmap_init(struct dbitmap *dmap)
 	}
 
 	dmap->nbits = NBITS_MIN;
-	/* BIT(0) is reserved for the context manager */
-	set_bit(0, dmap->map);
 
 	return 0;
 }
-- 
2.45.2.1089.g2a221341d9-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] binder: fix descriptor lookup for context manager
  2024-07-22 10:57         ` Alice Ryhl
@ 2024-07-22 15:39           ` Carlos Llamas
  0 siblings, 0 replies; 14+ messages in thread
From: Carlos Llamas @ 2024-07-22 15:39 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Todd Kjos, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, linux-kernel, kernel-team, syzkaller-bugs,
	stable, syzbot+3dae065ca76952a67257

On Mon, Jul 22, 2024 at 12:57:31PM +0200, Alice Ryhl wrote:
> On Tue, Jul 16, 2024 at 7:40 PM Todd Kjos <tkjos@google.com> wrote:
> >
> > On Mon, Jul 15, 2024 at 9:29 PM Carlos Llamas <cmllamas@google.com> wrote:
> > >         /* 0 is reserved for the context manager */
> > > -       if (node == proc->context->binder_context_mgr_node) {
> > > -               *desc = 0;
> > > -               return 0;
> > > -       }
> > > +       offset = (node == proc->context->binder_context_mgr_node) ? 0 : 1;
> >
> > If context manager doesn't need to be bit 0 anymore, then why do we
> > bother to prefer bit 0? Does it matter?
> >
> > It would simplify the code below if the offset is always 0 since you
> > wouldn't need an offset at all.
> 
> Userspace assumes that sending a message to handle 0 means that the
> current context manager receives it. If we assign anything that is not
> the context manager to bit 0, then libbinder will send ctxmgr messages
> to random other processes. I don't think libbinder handles the case
> where context manager is restarted well at all. Most likely, if we hit
> this condition in real life, processes that had a non-zero refcount to
> the context manager will lose the ability to interact with ctxmgr
> until they are restarted.

Using handle 0 for transaction will always reach the current context
manager. This is hardcoded so it works regardless of the descriptor
assigned to any references.

Things get complicated when doing refcount operations though. It seems
that some commands will reach the new context manager and others will
reach the old dead node. Odd.

This needs to the fixed. I'll look into it.

> 
> I think this patch just needs to make sure that this scenario doesn't
> lead to a UAF in the kernel. Ensuring that userspace handles it
> gracefully is another matter.
> 

Right, the main concern in this patch is handling the BUG() assert.

Thanks,
--
Carlos Llamas

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] binder: fix descriptor lookup for context manager
  2024-07-22 15:05         ` [PATCH v2] " Carlos Llamas
@ 2024-07-22 15:50           ` Carlos Llamas
  0 siblings, 0 replies; 14+ messages in thread
From: Carlos Llamas @ 2024-07-22 15:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, Alice Ryhl
  Cc: linux-kernel, kernel-team, syzkaller-bugs, stable,
	syzbot+3dae065ca76952a67257

On Mon, Jul 22, 2024 at 03:05:11PM +0000, Carlos Llamas wrote:
> In commit 15d9da3f818c ("binder: use bitmap for faster descriptor
> lookup"), it was incorrectly assumed that references to the context
> manager node should always get descriptor zero assigned to them.
> 
> However, if the context manager dies and a new process takes its place,
> then assigning descriptor zero to the new context manager might lead to
> collisions, as there could still be references to the older node. This
> issue was reported by syzbot with the following trace:
> 
>   kernel BUG at drivers/android/binder.c:1173!
>   Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>   Modules linked in:
>   CPU: 1 PID: 447 Comm: binder-util Not tainted 6.10.0-rc6-00348-g31643d84b8c3 #10
>   Hardware name: linux,dummy-virt (DT)
>   pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>   pc : binder_inc_ref_for_node+0x500/0x544
>   lr : binder_inc_ref_for_node+0x1e4/0x544
>   sp : ffff80008112b940
>   x29: ffff80008112b940 x28: ffff0e0e40310780 x27: 0000000000000000
>   x26: 0000000000000001 x25: ffff0e0e40310738 x24: ffff0e0e4089ba34
>   x23: ffff0e0e40310b00 x22: ffff80008112bb50 x21: ffffaf7b8f246970
>   x20: ffffaf7b8f773f08 x19: ffff0e0e4089b800 x18: 0000000000000000
>   x17: 0000000000000000 x16: 0000000000000000 x15: 000000002de4aa60
>   x14: 0000000000000000 x13: 2de4acf000000000 x12: 0000000000000020
>   x11: 0000000000000018 x10: 0000000000000020 x9 : ffffaf7b90601000
>   x8 : ffff0e0e48739140 x7 : 0000000000000000 x6 : 000000000000003f
>   x5 : ffff0e0e40310b28 x4 : 0000000000000000 x3 : ffff0e0e40310720
>   x2 : ffff0e0e40310728 x1 : 0000000000000000 x0 : ffff0e0e40310710
>   Call trace:
>    binder_inc_ref_for_node+0x500/0x544
>    binder_transaction+0xf68/0x2620
>    binder_thread_write+0x5bc/0x139c
>    binder_ioctl+0xef4/0x10c8
>   [...]
> 
> This patch adds back the previous behavior of assigning the next
> non-zero descriptor if references to previous context managers still
> exist. It amends both strategies, the newer dbitmap code and also the
> legacy slow_desc_lookup_olocked(), by allowing them to start looking
> for available descriptors at a given offset.
> 
> Fixes: 15d9da3f818c ("binder: use bitmap for faster descriptor lookup")
> Cc: stable@vger.kernel.org
> Reported-and-tested-by: syzbot+3dae065ca76952a67257@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000c1c0a0061d1e6979@google.com/
> Reviewed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Carlos Llamas <cmllamas@google.com>
> ---

Sorry, I forgot to feed --notes to git send-email and the list of
changes in this v2 patch was missed. Here it is:

Notes:
    v2: updated comment about BIT(0) per Alice's feedback
        collect tags

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-07-22 15:50 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-13 10:25 [syzbot] [kernel?] kernel BUG in binder_inc_ref_for_node syzbot
2024-07-13 13:21 ` Hillf Danton
2024-07-13 13:56   ` syzbot
2024-07-15 20:23 ` Carlos Llamas
2024-07-15 23:52   ` syzbot
2024-07-16  4:28     ` [PATCH] binder: fix descriptor lookup for context manager Carlos Llamas
2024-07-16 17:40       ` Todd Kjos
2024-07-16 18:48         ` Carlos Llamas
2024-07-16 19:00           ` Carlos Llamas
2024-07-22 10:57         ` Alice Ryhl
2024-07-22 15:39           ` Carlos Llamas
2024-07-22 11:30       ` Alice Ryhl
2024-07-22 15:05         ` [PATCH v2] " Carlos Llamas
2024-07-22 15:50           ` Carlos Llamas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox