All of lore.kernel.org
 help / color / mirror / Atom feed
From: Carlos Llamas <cmllamas@google.com>
To: Shu Han <ebpqwerty472123@gmail.com>
Cc: gregkh@linuxfoundation.org, arve@android.com, tkjos@android.com,
	maco@android.com, joel@joelfernandes.org, brauner@kernel.org,
	surenb@google.com, linux-kernel@vger.kernel.org,
	aliceryhl@google.com
Subject: Re: [PATCH] binder: use augmented rb-tree for faster descriptor lookup
Date: Thu, 10 Oct 2024 04:22:19 +0000	[thread overview]
Message-ID: <ZwdWe_I2p3zD-v1O@google.com> (raw)
In-Reply-To: <CAHQche8v5CTW0L2cJGCDWOJ---KRv9wxWAk=eUaK1+k9UTcpHA@mail.gmail.com>

On Wed, Oct 09, 2024 at 11:08:42PM +0800, Shu Han wrote:
> On Wed, Oct 9, 2024 at 4:58 AM Carlos Llamas <cmllamas@google.com> wrote:
> 
> Thank you for your patient reply.
> 
> > I honestly don't remember. It might have been that we required to expand
> > the 'struct binder_ref' size. This issue starts to be a problem when we
> > have thousands of references e.g. 30,000. Adding 4 bytes to each one of
> > them might not be worth it. But let me check...
> 
> I didn't consider memory overhead before...
> That's indeed an important point.

Yeah, I actually started with a solution that kept a list of references
that had "gaps" behind them. This also required expanding struct
binder_ref and I eventually abandoned this idea.

> 
> Fortunately, after checking, I found that this patch does not occupy any
> additional memory for `struct binder_ref`.

This is good news. I hadn't actually checked this. I'll keep this in
mind.

> 
> In a modern 64-bit platform, the size of `struct binder_ref` is 104 bytes.
> If we add a 4-byte field into it, the size will be 112 bytes(aligned by
> long). And both of them occupy a 128-byte slab(aligned by kmalloc_index).
> So the memory cost won't be increased, if there isn't a pending change
> that needs exactly 24 bytes in `struct binder_ref`.
> 
> In Rust, a rbtree::Node costs 40 bytes, 4 of 40 are used for alignment,
> adding a 4-byte field won't change the size of the struct.
> 
> In a 32-bit platform, the number is from 60 bytes to 64 bytes, a 64-byte
> slab, exactly 4 bytes.(very close to the slab bound)
> 
> Perhaps it's caused by introducing augmented rbtree into Rust which
> requires considerable effort. But personally, I think it's not bad to just
> introduce augmented rbtree into C (Rust and C are already quite different
> here).
> 
> > Yeah, I think it would look cleaner if we do a revert of the previous
> > patches though. This way we can remove the noise and see the actual
> > changes. I'll do this locally for now, no need to send a v2 just yet.
> 
> Great point. thanks.
> 
> Best regards.

I ran a benchmark test that creates multiple references on a Pixel
device and the numbers between the augmented rbtree implementation and
the current dbitmap are pretty much the same:

augmented rbtree:
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_collectProxies/0/10        696251 ns       334494 ns         2363 kernel
BM_collectProxies/0/100      4047417 ns      1886942 ns          390 kernel
BM_collectProxies/0/1000    29510599 ns     14827312 ns           51 kernel
BM_collectProxies/0/5000   136774414 ns     70482303 ns            9 kernel
BM_collectProxies/0/10000  248333277 ns    125898564 ns            5 kernel
BM_collectProxies/0/20000  485156508 ns    245891699 ns            3 kernel

dbitmap:
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_collectProxies/0/10        646427 ns       291657 ns         1935 kernel
BM_collectProxies/0/100      4203230 ns      2005648 ns          484 kernel
BM_collectProxies/0/1000    30859725 ns     15369365 ns           42 kernel
BM_collectProxies/0/5000   147910844 ns     75179017 ns            9 kernel
BM_collectProxies/0/10000  239196875 ns    121801066 ns            5 kernel
BM_collectProxies/0/20000  481154229 ns    247742285 ns            3 kernel

You should have this test avialable as 'binderRpcBenchmark' I ran with
the following parameters if you want to play with it:
  $ binderRpcBenchmark --benchmark_filter="BM_collectProxies/0/*"

Even if the numbers are too close it seems the bump in memory might be a
problem. Particularly in 32bit as these devices are more sensitive to
increases in allocations.

Regards,
Carlos Llamas

      reply	other threads:[~2024-10-10  4:22 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-17  3:02 [PATCH] binder: use augmented rb-tree for faster descriptor lookup Shu Han
2024-09-19 21:00 ` Carlos Llamas
2024-10-02 12:24   ` Shu Han
2024-10-08 20:58     ` Carlos Llamas
2024-10-09 15:08       ` Shu Han
2024-10-10  4:22         ` Carlos Llamas [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZwdWe_I2p3zD-v1O@google.com \
    --to=cmllamas@google.com \
    --cc=aliceryhl@google.com \
    --cc=arve@android.com \
    --cc=brauner@kernel.org \
    --cc=ebpqwerty472123@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maco@android.com \
    --cc=surenb@google.com \
    --cc=tkjos@android.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.