Re: [PATCH v6 bpf-next] selftests/bpf: Add benchmark for local_storage get

All of lore.kernel.org
 help / color / mirror / Atom feed

From: John Fastabend <john.fastabend@gmail.com>
To: Dave Marchevsky <davemarchevsky@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Martin KaFai Lau <kafai@fb.com>
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v6 bpf-next] selftests/bpf: Add benchmark for local_storage get
Date: Wed, 22 Jun 2022 20:27:23 -0700	[thread overview]
Message-ID: <62b3dd9b8249b_6a3b2208a4@john.notmuch> (raw)
In-Reply-To: <31bba044-6bb0-1a78-d706-f9fd9fbb1e2c@fb.com>

Dave Marchevsky wrote:
> On 6/22/22 9:26 PM, John Fastabend wrote:   
> > Martin KaFai Lau wrote:
> >> On Tue, Jun 21, 2022 at 10:49:46PM -0700, John Fastabend wrote:
> >>> Martin KaFai Lau wrote:
> >>>> On Tue, Jun 21, 2022 at 12:17:54PM -0700, John Fastabend wrote:
> >>>>>> Hashmap Control
> >>>>>> ===============
> >>>>>>         num keys: 10
> >>>>>> hashmap (control) sequential    get:  hits throughput: 20.900 ± 0.334 M ops/s, hits latency: 47.847 ns/op, important_hits throughput: 20.900 ± 0.334 M ops/s
> >>>>>>
> >>>>>>         num keys: 1000
> >>>>>> hashmap (control) sequential    get:  hits throughput: 13.758 ± 0.219 M ops/s, hits latency: 72.683 ns/op, important_hits throughput: 13.758 ± 0.219 M ops/s
> >>>>>>
> >>>>>>         num keys: 10000
> >>>>>> hashmap (control) sequential    get:  hits throughput: 6.995 ± 0.034 M ops/s, hits latency: 142.959 ns/op, important_hits throughput: 6.995 ± 0.034 M ops/s
> >>>>>>
> >>>>>>         num keys: 100000
> >>>>>> hashmap (control) sequential    get:  hits throughput: 4.452 ± 0.371 M ops/s, hits latency: 224.635 ns/op, important_hits throughput: 4.452 ± 0.371 M ops/s
> >>>>>>
> >>>>>>         num keys: 4194304
> >>>>>> hashmap (control) sequential    get:  hits throughput: 3.043 ± 0.033 M ops/s, hits latency: 328.587 ns/op, important_hits throughput: 3.043 ± 0.033 M ops/s
> >>>>>>
> >>>>>
> >>>>> Why is the hashmap lookup not constant with the number of keys? It looks
> >>>>> like its prepopulated without collisions so I wouldn't expect any
> >>>>> extra ops on the lookup side after looking at the code quickly.
> >>>> It may be due to the cpu-cache misses as the map grows.
> >>>
> >>> Maybe but, values are just ints so even 1k * 4B = 4kB should be
> >>> inside an otherwise unused server class system. Would be more
> >>> believable (to me at least) if the drop off happened at 100k or
> >>> more.
> >> It is not only value (and key) size.  There is overhead.
> >> htab_elem alone is 48bytes.  key and value need to 8bytes align also.
> >>
> > 
> > Right late night math didn't add up. Now I'm wondering if we can make
> > hashmap behave much better, that drop off is looking really ugly.
> > 
> >> From a random machine:
> >> lscpu -C
> >> NAME ONE-SIZE ALL-SIZE WAYS TYPE        LEVEL  SETS PHY-LINE COHERENCY-SIZE
> >> L1d       32K     576K    8 Data            1    64        1             64
> >> L1i       32K     576K    8 Instruction     1    64        1             64
> >> L2         1M      18M   16 Unified         2  1024        1             64
> >> L3      24.8M    24.8M   11 Unified         3 36864        1             64
> > 
> > Could you do a couple more data point then, num keys=100,200,400? I would
> > expect those to fit in the cache and be same as 10 by the cache theory. I
> > could try as well but looking like Friday before I have a spare moment.
> > 
> > Thanks,
> > John
> 
> Here's a benchmark run with those num_keys.
> 
> Hashmap Control
> ===============
>         num keys: 10
> hashmap (control) sequential    get:  hits throughput: 23.072 ± 0.208 M ops/s, hits latency: 43.343 ns/op, important_hits throughput: 23.072 ± 0.208 M ops/s
> 
>         num keys: 100
> hashmap (control) sequential    get:  hits throughput: 17.967 ± 0.236 M ops/s, hits latency: 55.659 ns/op, important_hits throughput: 17.967 ± 0.236 M ops/s
> 

Hmmm this is interesting. I expected a flat line and then a hard drop off
which to me would indicate the cache miss being the culprit. But this is
almost a linear perf dec over num_keys. Guess we need to debug more.

>         num keys: 200
> hashmap (control) sequential    get:  hits throughput: 17.812 ± 0.428 M ops/s, hits latency: 56.143 ns/op, important_hits throughput: 17.812 ± 0.428 M ops/s
> 
>         num keys: 300
> hashmap (control) sequential    get:  hits throughput: 17.070 ± 0.293 M ops/s, hits latency: 58.582 ns/op, important_hits throughput: 17.070 ± 0.293 M ops/s
> 
>         num keys: 400
> hashmap (control) sequential    get:  hits throughput: 17.667 ± 0.316 M ops/s, hits latency: 56.604 ns/op, important_hits throughput: 17.667 ± 0.316 M ops/s
> 
>         num keys: 500
> hashmap (control) sequential    get:  hits throughput: 17.010 ± 0.409 M ops/s, hits latency: 58.789 ns/op, important_hits throughput: 17.010 ± 0.409 M ops/s
> 
>         num keys: 1000
> hashmap (control) sequential    get:  hits throughput: 14.330 ± 0.172 M ops/s, hits latency: 69.784 ns/op, important_hits throughput: 14.330 ± 0.172 M ops/s
> 
>         num keys: 10000
> hashmap (control) sequential    get:  hits throughput: 6.047 ± 0.024 M ops/s, hits latency: 165.380 ns/op, important_hits throughput: 6.047 ± 0.024 M ops/s
> 
>         num keys: 100000
> hashmap (control) sequential    get:  hits throughput: 4.472 ± 0.163 M ops/s, hits latency: 223.630 ns/op, important_hits throughput: 4.472 ± 0.163 M ops/s
> 
>         num keys: 4194304
> hashmap (control) sequential    get:  hits throughput: 2.785 ± 0.024 M ops/s, hits latency: 359.066 ns/op, important_hits throughput: 2.785 ± 0.024 M ops/s

next prev parent reply	other threads:[~2022-06-23  4:41 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-20 22:25 [PATCH v6 bpf-next] selftests/bpf: Add benchmark for local_storage get Dave Marchevsky
2022-06-21 19:17 ` John Fastabend
2022-06-22  0:29   ` Martin KaFai Lau
2022-06-22  5:49     ` John Fastabend
2022-06-22 17:26       ` Martin KaFai Lau
2022-06-23  1:26         ` John Fastabend
2022-06-23  2:18           ` Alexei Starovoitov
2022-06-23  3:25             ` John Fastabend
2022-06-23  2:53           ` Dave Marchevsky
2022-06-23  3:27             ` John Fastabend [this message]
2022-06-23  2:31   ` Dave Marchevsky
2022-06-23  2:20 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=62b3dd9b8249b_6a3b2208a4@john.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davemarchevsky@fb.com \
    --cc=kafai@fb.com \
    --cc=kernel-team@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.