Date: Wed, 26 Nov 2025 15:56:17 +0000
Subject: Re: [PATCH v6 6/6] rust_binder: use bitmap for allocation of handles
From: Alice Ryhl
To: Yury Norov
Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
 Joel Fernandes, Christian Brauner, Carlos Llamas, Suren Baghdasaryan,
 Burak Emir, Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
 Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
 rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org
X-Mailing-List: rust-for-linux@vger.kernel.org
References: <20251125-binder-bitmap-v6-0-dedaf1d05a98@google.com>
 <20251125-binder-bitmap-v6-6-dedaf1d05a98@google.com>

On Wed, Nov 26, 2025 at 10:31:03AM -0500, Yury Norov wrote:
> On Wed, Nov 26, 2025 at 08:35:55AM +0000, Alice Ryhl wrote:
> > On Tue, Nov 25, 2025 at 09:26:46PM -0500, Yury Norov wrote:
> > > On Tue, Nov 25, 2025 at 01:59:42PM +0000, Alice Ryhl wrote:
> > > > To find an unused Binder handle, Rust Binder currently iterates the
> > > > red/black tree from the beginning until it finds a gap in the keys.
> > > > This is extremely slow.
> > > > 
> > > > To improve performance, add a bitmap that keeps track of which
> > > > indices are actually in use. This allows us to quickly find an unused
> > > > key in the red/black tree.
> > > > 
> > > > For a benchmark, please see the below numbers that were obtained from
> > > > modifying binderThroughputTest to send a node with each transaction
> > > > and stashing it in the server. This results in the number of nodes
> > > > increasing by one for every transaction sent. I got the following
> > > > table of roundtrip latencies (in µs):
> > > > 
> > > > Transaction Range │ Baseline (Rust) │ Bitmap (Rust) │ Comparison (C)
> > > >      0 -  10,000  │          176.88 │         92.93 │          99.41
> > > >  10,000 -  20,000 │          437.37 │         87.74 │          98.55
> > > >  20,000 -  30,000 │          677.49 │         76.24 │          96.37
> > > >  30,000 -  40,000 │          901.76 │         83.39 │          96.73
> > > >  40,000 -  50,000 │         1126.62 │        100.44 │          94.57
> > > >  50,000 -  60,000 │         1288.98 │         94.38 │          96.64
> > > >  60,000 -  70,000 │         1588.74 │         88.27 │          96.36
> > > >  70,000 -  80,000 │         1812.97 │         93.97 │          91.24
> > > >  80,000 -  90,000 │         2062.95 │         92.22 │         102.01
> > > >  90,000 - 100,000 │         2330.03 │         97.18 │         100.31
> > > > 
> > > > It should be clear that the current Rust code becomes linearly slower
> > > > per insertion as the number of calls to rb_next() per transaction
> > > > increases. After this change, the time to find an ID number appears
> > > > constant. (Technically it is not constant-time, as both insertion and
> > > > removal scan the entire bitmap. However, quick napkin math shows that
> > > > scanning the entire bitmap with N=100k takes ~1.5µs, which is
> > > > negligible in a benchmark where the roundtrip latency is 100µs.)
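
As a quick illustration for readers of the archive, the "find an unused
key" scan described above can be sketched in self-contained userspace
Rust. This models the idea only; it is not the kernel's IdPool or bitmap
API:

    /// Find the lowest unused index at or above `start`, or None if
    /// every bit in the map is set. A set bit means "ID in use".
    fn find_unused_id(bitmap: &[u64], start: usize) -> Option<usize> {
        let nbits = bitmap.len() * 64;
        let mut idx = start;
        while idx < nbits {
            // Length of the run of set bits beginning at `idx`.
            let run = (bitmap[idx / 64] >> (idx % 64)).trailing_ones() as usize;
            if run < 64 - idx % 64 {
                return Some(idx + run); // a zero bit inside this word
            }
            idx += run; // run reaches the word boundary; try the next word
        }
        None
    }

    fn main() {
        let map = [0b1011u64, u64::MAX];
        assert_eq!(find_unused_id(&map, 0), Some(2)); // bit 2 is the first zero
        assert_eq!(find_unused_id(&map, 3), Some(4)); // bit 3 set, bit 4 free
    }

At one u64 word per step, a 100k-bit map is only ~1,600 word reads,
consistent with the napkin math above.
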
> > > > 
> > > > I've included a comparison to the C driver, which uses the same
> > > > bitmap algorithm as this patch since commit 15d9da3f818c ("binder:
> > > > use bitmap for faster descriptor lookup").
> > > 
> > > Thanks for the solid numbers!
> > > 
> > > > This currently checks if the bitmap should be shrunk after every
> > > > removal. One potential future change is introducing a shrinker to
> > > > make this operation O(1), but based on the benchmark above this does
> > > > not seem required at this time.
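
For illustration only, the shrink-after-removal check mentioned above
can be modeled with a toy heuristic; the names, thresholds, and
MIN_CAPACITY floor here are my own assumptions, not the real id_pool
logic:

    const MIN_CAPACITY: usize = 64; // assumed floor: one u64 word

    /// After releasing an ID, decide whether the bitmap is worth
    /// halving. `highest_set` is the index of the highest bit still
    /// set, or None if the map is now empty. Require the highest ID to
    /// fit in the lower half of the shrunken map, so that one
    /// allocation does not immediately force a regrow.
    fn shrink_target(highest_set: Option<usize>, capacity: usize) -> Option<usize> {
        let new_cap = capacity / 2;
        if new_cap < MIN_CAPACITY {
            return None;
        }
        match highest_set {
            Some(h) if h < new_cap / 2 => Some(new_cap),
            None => Some(new_cap),
            _ => None,
        }
    }

    fn main() {
        assert_eq!(shrink_target(Some(10), 512), Some(256)); // 10 < 128: shrink
        assert_eq!(shrink_target(Some(200), 512), None);     // still too full
        assert_eq!(shrink_target(Some(10), 64), None);       // at the floor
    }

The cost that motivates a shrinker is locating `highest_set`, which
takes a scan of the map on every removal.
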
> > > > 
> > > > Reviewed-by: Burak Emir
> > > > Acked-by: Carlos Llamas
> > > > Signed-off-by: Alice Ryhl
> > > > ---
> > > >  drivers/android/binder/process.rs | 64 +++++++++++++++++++++++++++++-----------
> > > >  1 file changed, 47 insertions(+), 17 deletions(-)
> > > > 
> > > > diff --git a/drivers/android/binder/process.rs b/drivers/android/binder/process.rs
> > > > index f13a747e784c84a0fb09cbf47442712106eba07c..9264961fd92b33c07fcd5353740cc0b1ec978afd 100644
> > > > --- a/drivers/android/binder/process.rs
> > > > +++ b/drivers/android/binder/process.rs
> > > > @@ -19,6 +19,7 @@
> > > >      cred::Credential,
> > > >      error::Error,
> > > >      fs::file::{self, File},
> > > > +    id_pool::IdPool,
> > > >      list::{List, ListArc, ListArcField, ListLinks},
> > > >      mm,
> > > >      prelude::*,
> > > > @@ -367,6 +368,8 @@ impl ListItem<{Self::LIST_NODE}> for NodeRefInfo {
> > > >  struct ProcessNodeRefs {
> > > >      /// Used to look up nodes using the 32-bit id that this process knows it by.
> > > >      by_handle: RBTree>,
> > > > +    /// Used to quickly find unused ids in `by_handle`.
> > > > +    handle_is_present: IdPool,
> > > >      /// Used to look up nodes without knowing their local 32-bit id. The usize is the address of
> > > >      /// the underlying `Node` struct as returned by `Node::global_id`.
> > > >      by_node: RBTree,
> > > > @@ -381,6 +384,7 @@ impl ProcessNodeRefs {
> > > >      fn new() -> Self {
> > > >          Self {
> > > >              by_handle: RBTree::new(),
> > > > +            handle_is_present: IdPool::new(),
> > > >              by_node: RBTree::new(),
> > > >              freeze_listeners: RBTree::new(),
> > > >          }
> > > > @@ -775,7 +779,7 @@ pub(crate) fn get_node(
> > > >      pub(crate) fn insert_or_update_handle(
> > > >          self: ArcBorrow<'_, Process>,
> > > >          node_ref: NodeRef,
> > > > -        is_mananger: bool,
> > > > +        is_manager: bool,
> > > >      ) -> Result {
> > > >          {
> > > >              let mut refs = self.node_refs.lock();
> > > > @@ -794,7 +798,33 @@ pub(crate) fn insert_or_update_handle(
> > > >          let reserve2 = RBTreeNodeReservation::new(GFP_KERNEL)?;
> > > >          let info = UniqueArc::new_uninit(GFP_KERNEL)?;
> > > >  
> > > > -        let mut refs = self.node_refs.lock();
> > > > +        let mut refs_lock = self.node_refs.lock();
> > > > +        let mut refs = &mut *refs_lock;
> > > > +
> > > > +        let (unused_id, by_handle_slot) = loop {
> > > > +            // ID 0 may only be used by the manager.
> > > > +            let start = if is_manager { 0 } else { 1 };
> > > > +
> > > > +            if let Some(res) = refs.handle_is_present.find_unused_id(start) {
> > > > +                match refs.by_handle.entry(res.as_u32()) {
> > > > +                    rbtree::Entry::Vacant(entry) => break (res, entry),
> > > > +                    rbtree::Entry::Occupied(_) => {
> > > > +                        pr_err!("Detected mismatch between handle_is_present and by_handle");
> > > > +                        res.acquire();
> > > > +                        kernel::warn_on!(true);
> > > > +                        return Err(EINVAL);
> > > 
> > > EINVAL means that the user provided a wrong parameter. Here, it's
> > > data corruption. Maybe EFAULT?
> > 
> > Hmm ... EFAULT is used when the user passes a dangling pointer that
> > copy_from_user() refuses to use. I would like to avoid confusion with
> > that as well. Is there a third option?
> > 
> > > > +                    }
> > > > +                }
> > > > +            }
> > > > +
> > > > +            let grow_request = refs.handle_is_present.grow_request().ok_or(ENOMEM)?;
> > > > +            drop(refs_lock);
> > > > +            let resizer = grow_request.realloc(GFP_KERNEL)?;
> > > > +            refs_lock = self.node_refs.lock();
> > > > +            refs = &mut *refs_lock;
> > > > +            refs.handle_is_present.grow(resizer);
> > > 
> > > This continues puzzling me. refs_lock protects refs, and the spec
> > > says:
> > > 
> > >     a reference's scope starts from where it is introduced and
> > >     continues through the last time that reference is used.
> > > 
> > > https://doc.rust-lang.org/book/ch04-02-references-and-borrowing.html
> > > 
> > > The last usage of refs is at the .grow_request() line, because later
> > > it's reused with a new value.
> > > 
> > > If my reading of the spec is correct, after dropping the refs_lock,
> > > you may get rescheduled, and another thread may follow the same path.
> > > Because refs_lock is dropped explicitly and refs implicitly, the
> > > concurrent thread can grab both and follow with resizing the id map.
> > > 
> > > When your first thread gets back, it'll end up resizing the
> > > already-resized map.
> > > 
> > > I asked your AI, and it says that this race is indeed possible for
> > > exactly that reason. But it doesn't break memory safety, so the
> > > compiler is happy about it...
> > 
> > You're right that since we release the mutex, someone else may have
> > resized the map in the meantime. That's why we implemented grow() to do
> > nothing if it was already resized:
> > 
> >     pub fn grow(&mut self, mut resizer: PoolResizer) {
> >         // Between the grow request that led to the allocation of
> >         // `resizer` and now, another thread may have already grown
> >         // the capacity. In this case, do nothing: drop `resizer`
> >         // and move on.
> >         if resizer.new.len() <= self.capacity() {
> >             return;
> >         }
> > 
> >         resizer.new.copy_and_extend(&self.map);
> >         self.map = resizer.new;
> >     }
> 
> OK, I see...
> 
> I expected that you'd switch the pool state to 'growing' before
> dropping the refs_lock. This way, a concurrent thread would be able
> to go straight to the next iteration.
> 
> Something like:
> 
>     if let Some(res) = refs.handle_is_present.find_unused_id(start) {
>         ...
>     }
> 
>     if refs.is_growing() {
>         drop(refs_lock);
>         continue;
>     }
>     let grow_request = refs.handle_is_present.grow_request().ok_or(ENOMEM)?;
>     refs.set_growing();
>     drop(refs_lock);
>     let resizer = grow_request.realloc(GFP_KERNEL)?;
>     ...
> 
> Your approach also works, assuming you're OK with a useless alloc/free
> of a bitmap in the case of a collision.
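
To make the tradeoff concrete, here is a compressed userspace model of
the pattern being discussed: drop the lock before the blocking
allocation, retake it, and let a double-check discard the buffer on
collision. It uses std::sync::Mutex and toy types; the driver's real
types, error handling, and GFP flags differ:

    use std::sync::Mutex;

    struct Pool {
        map: Vec<u64>, // bitmap words; a set bit means "ID in use"
    }

    impl Pool {
        fn capacity(&self) -> usize {
            self.map.len() * 64
        }

        // Double-checked grow: if another thread grew the pool while
        // the lock was dropped, the freshly allocated buffer is simply
        // discarded; that is the "useless alloc/free" case above.
        fn grow(&mut self, mut new: Vec<u64>) {
            if new.len() * 64 <= self.capacity() {
                return;
            }
            new[..self.map.len()].copy_from_slice(&self.map);
            self.map = new;
        }
    }

    fn alloc_id(pool: &Mutex<Pool>) -> usize {
        loop {
            let wanted;
            {
                let mut p = pool.lock().unwrap();
                // Claim the lowest zero bit while holding the lock.
                for (w, word) in p.map.iter_mut().enumerate() {
                    if *word != u64::MAX {
                        let bit = word.trailing_ones() as usize;
                        *word |= 1 << bit;
                        return w * 64 + bit;
                    }
                }
                wanted = (p.map.len() * 2).max(1); // request doubled capacity
            } // lock dropped here, so the allocation can block freely
            let new = vec![0u64; wanted];
            pool.lock().unwrap().grow(new); // relock; grow() double-checks
        }
    }

    fn main() {
        let pool = Mutex::new(Pool { map: Vec::new() });
        assert_eq!(alloc_id(&pool), 0);
        assert_eq!(alloc_id(&pool), 1);
    }

The correctness burden sits entirely in the re-check under the lock;
a collision costs only the wasted allocation, as noted above.
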
Hmm, having the extra ones just repeatedly lock/unlock also isn't
great. I guess one alternate option could be to have a separate mutex
that you take while allocating, but ehh, I think the current logic is
fine.

> So, I can take this series as is, or I can wait for v7 if you prefer.
> 
> Please advise.

Sure, go ahead! And thanks for the reviews!

Alice