From: John Fastabend <john.fastabend@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	John Fastabend <john.fastabend@gmail.com>
Cc: Martin KaFai Lau <kafai@fb.com>,
	Dave Marchevsky <davemarchevsky@fb.com>,
	bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v6 bpf-next] selftests/bpf: Add benchmark for local_storage get
Date: Wed, 22 Jun 2022 20:25:33 -0700
Message-ID: <62b3dd2d8f81c_6a3b22085b@john.notmuch>
In-Reply-To: <CAADnVQJh0BKP3+WHQ+EmEvw3h=QS9xgR=8uk_3Zb6321xhMjtg@mail.gmail.com>

Alexei Starovoitov wrote:
> On Wed, Jun 22, 2022 at 6:26 PM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Martin KaFai Lau wrote:
> > > On Tue, Jun 21, 2022 at 10:49:46PM -0700, John Fastabend wrote:
> > > > Martin KaFai Lau wrote:
> > > > > On Tue, Jun 21, 2022 at 12:17:54PM -0700, John Fastabend wrote:
> > > > > > > Hashmap Control
> > > > > > > ===============
> > > > > > >         num keys: 10
> > > > > > > hashmap (control) sequential    get:  hits throughput: 20.900 ± 0.334 M ops/s, hits latency: 47.847 ns/op, important_hits throughput: 20.900 ± 0.334 M ops/s
> > > > > > >
> > > > > > >         num keys: 1000
> > > > > > > hashmap (control) sequential    get:  hits throughput: 13.758 ± 0.219 M ops/s, hits latency: 72.683 ns/op, important_hits throughput: 13.758 ± 0.219 M ops/s
> > > > > > >
> > > > > > >         num keys: 10000
> > > > > > > hashmap (control) sequential    get:  hits throughput: 6.995 ± 0.034 M ops/s, hits latency: 142.959 ns/op, important_hits throughput: 6.995 ± 0.034 M ops/s
> > > > > > >
> > > > > > >         num keys: 100000
> > > > > > > hashmap (control) sequential    get:  hits throughput: 4.452 ± 0.371 M ops/s, hits latency: 224.635 ns/op, important_hits throughput: 4.452 ± 0.371 M ops/s
> > > > > > >
> > > > > > >         num keys: 4194304
> > > > > > > hashmap (control) sequential    get:  hits throughput: 3.043 ± 0.033 M ops/s, hits latency: 328.587 ns/op, important_hits throughput: 3.043 ± 0.033 M ops/s
> > > > > > >
> > > > > >
> > > > > > Why is the hashmap lookup not constant with the number of keys? It looks
> > > > > > like it's prepopulated without collisions, so I wouldn't expect any
> > > > > > extra ops on the lookup side after looking at the code quickly.
> > > > > It may be due to the cpu-cache misses as the map grows.
> > > >
> > > > Maybe, but values are just ints, so even 1k * 4B = 4kB should fit
> > > > in cache on an otherwise unused server-class system. Would be more
> > > > believable (to me at least) if the drop-off happened at 100k or
> > > > more.
> > > It is not only value (and key) size.  There is overhead.
> > > htab_elem alone is 48 bytes.  Key and value also need to be 8-byte aligned.
> > >
> >
> > Right, my late-night math didn't add up. Now I'm wondering if we can make
> > hashmap behave much better; that drop-off is looking really ugly.
> >
> > > From a random machine:
> > > lscpu -C
> > > NAME ONE-SIZE ALL-SIZE WAYS TYPE        LEVEL  SETS PHY-LINE COHERENCY-SIZE
> > > L1d       32K     576K    8 Data            1    64        1             64
> > > L1i       32K     576K    8 Instruction     1    64        1             64
> > > L2         1M      18M   16 Unified         2  1024        1             64
> > > L3      24.8M    24.8M   11 Unified         3 36864        1             64
> >
> > Could you do a couple more data points then, num keys=100,200,400? I would
> > expect those to fit in the cache and be the same as 10 by the cache theory. I
> > could try as well, but it's looking like Friday before I have a spare moment.
> 
> I think the benchmark achieved its goal :)
> It generated plenty of interesting data.
> Pulling random out of the hot loop and any other improvements
> can be done as follow-ups.
> Pushed it to bpf-next.

Yep, just realized I hadn't ACK'd it yet. Thanks for the patches; my
guess is we can improve the hashmap a bunch. FWIW we use the
hashmap as described here, so I'm wondering if we need to cut over to
task storage or just make hashmap better at large values.
