From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-wr1-f73.google.com (mail-wr1-f73.google.com [209.85.221.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 71B6826CE02 for ; Wed, 12 Nov 2025 23:35:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.73 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762990511; cv=none; b=Kqj+fKoFFX6Q9jKHs6XtIzb5yM8T+p7F20SsluRGePyBb/cHBS1EOb07pzL5oy1PKBIdS334UI62wHgwltbkfdWueLThk34Vok1yUAi8Cy0abxLWFdM+BwhM6QaRPG//PKIFU7g4PkCG21fvliY+zTwq9XhTUY58qv1PFVAXvpo= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762990511; c=relaxed/simple; bh=4mflXfgXVT24C6CZ4udA49a2AIthJmldwzXgt3XntjI=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=amjAihYQ/UF2C4N4cojzhD5SG3pmNDqs5ZlPAg+4/KTcHiwhkhbmO+oWxem6VtT+s4g8CZTns5o6G08s1LOx5FGlH9SY2AJt2v/4Fts62S0s05oeJ6jOMRLxMApdAKbd+6/I94qRN4qsOpY06GAYDK1d01xnkUKXvaLXIbErb1Y= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--aliceryhl.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=aLJ44Mbi; arc=none smtp.client-ip=209.85.221.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--aliceryhl.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="aLJ44Mbi" Received: by mail-wr1-f73.google.com with SMTP id ffacd0b85a97d-42b2b9c7ab6so72586f8f.2 for ; Wed, 12 Nov 2025 15:35:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1762990508; x=1763595308; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=yHGIBzZPwCgBjyzoAu2uZEsfUL+sP5QWpRjxugQAeqE=; b=aLJ44MbijMXx9+vDrozQXP4mmnY8heEmWPJ12aj4tvdbKhI6zpI0QBPtAp9BmPqzDK BfoIR5anbDRl5Nevs9lAuiP7cJthAK2AA61WgNZA63RF2jo28/lkYMFYDAHkwCMCfPVV TNf7mcachZTGA4xdNY0D/pNt00UfDIaN6CKvmboMhMW9FC232szqvfgEe1U+fyywfWpw nLjrrxhE+rhTpgqiLdflj7bQV4MbEmQHMQvWX+0yMdwgoYrxumjjKXpbVu7vF8g0OiqR 8+vSWXZ82C6n5x9W65gAiegwiMYmWsD3vrV9Ld9OfD2F0yTsW44Iw5/ZMYDpYZbr4MTk KWaA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1762990508; x=1763595308; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=yHGIBzZPwCgBjyzoAu2uZEsfUL+sP5QWpRjxugQAeqE=; b=MM7Nahr9kyPIqXEcMD11LNORsJ9wKhuynRFa+npvYZHCIjsAk1UOMSTsWoxGuLK/f8 0fQKAFHOppmdspMiuHCMXwh6OeE8q/dkUR5eakSct98eEQGVeRHvQS9WSx3rmyhb1gut St+nigBsWy0QRj3KC5N6bahzWIsX88YjaBz4b/ZKUfdgC1IT8tfpvyzQjEdJJ9TwWGy8 W2uoWlymPcXhMU97SGPkunvkLoEYhZ2RIPSGOf5GGSB+Z0dUP7ET9UAKr0lxztoseOek F88ftRc0yU6ssX2lIOU1SWX8BfxmDX5/yRU8L2vVscgnN6kkUS4DsWbg1EQOQDOsWSNU AxDQ== X-Forwarded-Encrypted: i=1; AJvYcCXRQtP06Bs7vGF0WukysrZsvgKPYBpVOqJCdu/Nou+A9pO4UpD8cmmCmZlP5fqtmz9JcI9wenmQK2anYvY=@vger.kernel.org X-Gm-Message-State: AOJu0YzI1CItggZoJ+IZHGH6Vnu3nCevYHBxm40iSSkEPsuTd7iH//4N zfrBE/Riefx0xiP8kNFUxa5eXuqxStZJCyIBXp341UuOmN0SYQCOSDekAPX5QH7txi8rkEP1oEu LjPpSbaTbMcGu3X/PJg== X-Google-Smtp-Source: AGHT+IEId1v5duu3TRKOpet7fbCZD1zVCY8IAgfk+57gsEvGnUh+PLJNGxA0odr0Hgnj4bvYOByusBh9faYt/tA= X-Received: from wrnm1.prod.google.com ([2002:adf:e941:0:b0:42b:3021:cd6a]) (user=aliceryhl job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6000:440e:b0:42b:530c:d8b6 with SMTP id ffacd0b85a97d-42b530cdae6mr283844f8f.61.1762990507739; Wed, 12 Nov 2025 15:35:07 -0800 (PST) Date: Wed, 12 Nov 2025 23:35:07 +0000 In-Reply-To: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20251112-binder-bitmap-v5-0-8b9d7c7eca82@google.com> <20251112-binder-bitmap-v5-6-8b9d7c7eca82@google.com> Message-ID: Subject: Re: [PATCH v5 6/6] rust_binder: use bitmap for allocation of handles From: Alice Ryhl To: Yury Norov Cc: Greg Kroah-Hartman , "Arve =?utf-8?B?SGrDuG5uZXbDpWc=?=" , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Burak Emir , Miguel Ojeda , Boqun Feng , Gary Guo , "=?utf-8?B?QmrDtnJu?= Roy Baron" , Benno Lossin , Andreas Hindborg , Trevor Gross , Danilo Krummrich , rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" On Wed, Nov 12, 2025 at 02:09:19PM -0500, Yury Norov wrote: > On Wed, Nov 12, 2025 at 12:47:24PM +0000, Alice Ryhl wrote: > > To find an unused Binder handle, Rust Binder currently iterates the > > red/black tree from the beginning until it finds a gap in the keys. This > > is extremely slow. > > Can you share performance numbers? I have not benchmarked it in the Rust driver, but it replaces potentially thousands of calls to rb_next() with a single call to find_unused_id(), so I'm feeling good about the performance. And the equivalent change in the C driver was done because this particular piece of code was causing contention issues by holding the spinlock for a long time. The characteristics of this collection is that there is one per process using the driver. Most processes have only a few entries in this bitmap, so the inline representation will apply in most cases. However, there are a few processes that have a large number of entries in the 4 to maybe 5 figures range. Those processes are what caused the contention issue mentioned above. > > To improve the performance, add a bitmap that keeps track of which > > indices are actually in use. This allows us to quickly find an unused > > key in the red/black tree. > > > > This logic matches the approach used by C Binder. It was chosen > > partially because it's the most memory efficient solution. > > That inaccurate. You are adding a new data structure (bitmap), advocating > it with an improvement on search side, and that makes sense. > > But now you're saying it's also a more memory efficient approach, which > doesn't sound trivial because the most memory efficient solution is to > bring no new data structures at all. > > I guess you meant that bitmap is the most efficient data structure to > index used/unused nodes. If so, can you please rephrase the sentence? Yes I can rephrase. Adding more data is of course always less memory efficient. What I meant is that this is more memory efficient than the competing solution of using an augmented rbtree that Carlos mentioned here: https://lore.kernel.org/p/aC1PQ7tmcqMSmbHc@google.com > > + if let Some(res) = refs.handle_is_present.find_unused_id(start) { > > + match refs.by_handle.entry(res.as_u32()) { > > + rbtree::Entry::Vacant(entry) => break (res, entry), > > + rbtree::Entry::Occupied(_) => { > > + pr_err!("Detected mismatch between handle_is_present and by_handle"); > > + res.acquire(); > > + continue; > > At this point you've detected mismatch between two linked data > structures. It means that one of them or both are corrupted. To > me it looks fatal, and your pr_err() confirms it. How could you > continue then? Although we should never hit this codepath in real code, I don't think we need to kill the kernel. We can treat the r/b tree as source of truth and adjust the bitmap when mismathces are detected. I could add a kernel warning, though. That shouldn't kill an Android device. > > + } > > + } > > + } > > + > > + let grow_request = refs.handle_is_present.grow_request().ok_or(ENOMEM)?; > > + drop(refs_lock); > > + let resizer = grow_request.realloc(GFP_KERNEL)?; > > + refs_lock = self.node_refs.lock(); > > + refs = &mut *refs_lock; > > + refs.handle_is_present.grow(resizer); > > Is it possible to turn this block into a less wordy statement? Maybe a > wrapper function for it? Ideally, the grow request should be handled > transparently in .find_unused_id(). I can extract this block into a separate function, but I think it would be tricky to move the entire logic inside .find_unused_id() because of the mutex lock/unlock situation. > > @@ -905,6 +924,16 @@ pub(crate) fn update_ref( > > let id = info.node_ref().node.global_id(); > > refs.by_handle.remove(&handle); > > refs.by_node.remove(&id); > > + refs.handle_is_present.release_id(handle as usize); > > + > > + if let Some(shrink) = refs.handle_is_present.shrink_request() { > > This is questionable. Shrinking is usually the very slow path, and we > don't shrink unless we're really close or even inside the OOM condition. > > In this case, shrink_request() on average returns false, but it's > O(N), which makes _every_ release_id() O(N), while it should be O(1). The current implementation of shrink_request() will refuse to shrink the pool unless the largest bit is less than 1/4 of the capacity, so it should not perform the expensive operation very often. That said, it does call find_last_bit() each time, which I guess is O(N). But my assumption was that find_last_bit() is cheap enough that it wouldn't be a problem. > Consider a very realistic case: you're destroying every object, and thus > removing every ID in the associate ID pool, doing it in LIFO order. That > way you will need to call shrink_request() about O(log(N)) times, making > the whole complexity ~O(N*log(N)); and you'll have to make log(N) > realloc()s for nothing. If you release IDs in FIFO order, you don't > call realloc(), but your shrink_request() total complexity will be O(N^2). Even if we end up making log(N) reallocs, the total complexity of the reallocs is O(N) because the amount of data being reallocd halves each time. So the total number of bytes copied by reallocs ends up being: 1 + 2 + 4 + 8 + ... + 2^log(N) <= 2^(1+log(N)) = 2*N which is O(N). Of course, deleting the corresponding entry from the red/black tree is O(log N), so it's still O(N*log(N)) for the N deletions from the rb tree. > Can you compare performance numbers with and without shrinking under a > typical payload? Is there any mechanism to inspect the ID pools at runtime, > like expose via procfs? We expose the contents of the red/black tree via the binder_logs mechanism, but that doesn't show the *capacity* of the bitmap. Only the index of the largest set bit. > Can you make your shrinking logic conditional on some reasonable OOM > heuristics, maybe OOM event driven? > > And even without OOM, you can safely skip shrinking if the number of IDs in > use is greater than 1/4 of the capacity, or there's any used ID with > the index greater than the 1/2 capacity. I guess we could register a shrinker, but I don't think it's worth it. Thanks for the review! Alice