public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Anton Protopopov <aspsk@isovalent.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	John Fastabend <john.fastabend@gmail.com>
Subject: Re: [PATCH bpf-next 0/6] New benchmark for hashmap lookups
Date: Tue, 31 Jan 2023 20:18:34 +0100	[thread overview]
Message-ID: <Y9lpilqPsW1stF/a@lavr> (raw)
In-Reply-To: <CAEf4BzYxZpc8NZssBkpkT86A3ZFc9i08am9s76o6334HwBGMdA@mail.gmail.com>

On 23/01/31 10:48, Andrii Nakryiko wrote:
> On Tue, Jan 31, 2023 at 2:47 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > On 23/01/30 04:17, Andrii Nakryiko wrote:
> > > On Fri, Jan 27, 2023 at 10:14 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > >
> > > > Add a new benchmark for hashmap lookups and fix several typos. See individual
> > > > commits for descriptions.
> > > >
> > > > One thing to mention here is that in commit 3 I've patched bench so that now
> > > > command line options can be reused by different benchmarks.
> > > >
> > > > The benchmark itself is added in the last commit 6. I am using this benchmark
> > > > to test map lookup productivity when using a different hash function (see
> > > > https://fosdem.org/2023/schedule/event/bpf_hashing/). The results provided by
> > > > the benchmark look reasonable and match the results of my different benchmarks
> > > > (requiring to patch kernel to get actual statistics on map lookups).
> > >
> > > Could you share the results with us? Curious which hash functions did
> > > you try and which one are the most promising :)
> >
> > For the longer version with pictures see the talk I've referenced above (it's
> > at FOSDEM next Sunday Feb 5). A short version follows.
> 
> Yep, I'll try to watch it.
> 
> >
> > The xxh3 hash works fine for big keys, where "big" is different for different
> > architectures and for different maps sizes. On my Intel i7 machine this means
> > key size >= 8. On my AMD machine xxh3 works better for all keys for small maps,
> > but degrades for keys of size 12,16,20 for bigger maps (>=200K elements or so).
> > Example (map size 100K, 50% full, measuring M ops/second):
> 
> Nice, I was hoping you would look at xxh3, as I've been meaning to try
> it out as well (have dirty patches to introduce xxh3 into
> lib/xxhash.c, but didn't get to actual benchmarking).

My first attempt was with lib/xxhash.c, and it looked well on the first glance
(outperformed every other hash in my hash benchmark). However, when used inside
the hashmap, it behaved way worse than expected, so I had to inline it.

> Despite this AMD-specific degradation (which is interesting in its own
> right, could it be some fluke in testing?), I think it's a good idea
> to switch from jhash to xxh3, as it seems almost universally better.
> See also below.
> >
> >     hash_size    4    8   12   16   20   24   28   32   36   40   44   48   52   56   60   64
> >     orig_map  15.7 15.4 14.2 13.9 13.1 13.2 12.0 12.0 11.5 11.2 10.6 10.7 10.0 10.0  9.6  9.3
> >     new_map   15.5 15.9 15.2 15.3 14.3 14.6 14.0 14.2 13.3 13.6 13.1 13.4 12.7 13.1 12.3 12.8
> >
> > A smaller map (10K, 50% full):
> >
> >     hash_size    4    8   12   16   20   24   28   32   36   40   44   48   52   56   60   64
> >     orig_map  36.1 36.8 32.1 32.4 29.0 29.1 26.2 26.4 23.9 24.3 21.8 22.5 20.4 20.7 19.3 19.1
> >     new_map   37.7 39.6 34.0 36.5 31.5 34.1 30.7 32.7 28.1 29.5 27.4 28.8 27.1 28.1 26.4 27.8
> >
> > Other hash functions I've tried (older xxh32/64, spooky) work _way_ worse for
> > small keys than jhash/xxh3. (Answering a possible question about vector instructions:
> > xxh3 is scalar until key size <= 240, then the xxh64/xxh32 can be used which
> > also provides ~2x map lookup speed gain comparing to jhash for keys >240.)
> 
> Yeah, not suprising. xxh32/64 were optimized for long byte arrays, not
> for short keys. While xxh3 puts a lot of attention on short keys. Do
> you see xxh64 being faster than xxh3 for longs keys, as implemented in
> kernel? Kernel doesn't use SSE2/AVX versions, just purely scalars, so
> from reading benchmarks of xxh3/xxh64 author, xxh3 should win in all
> situations.

For keys longer than 240 the scalar xxh3 works way worse than xxhash. BTW, do
you know use cases when hashmap keys are > 240? (For cilium/tetragon the most
interesting use cases are keys of sizes ~4-40.)

> >
> > Bloom filters for big >= ~40 keys, predictably, work way faster with xxh3 than
> > with jhash. (Why not similar to hashmap? Because Bloom filters use jhash2 for
> > keys % 4 which works faster than jhash for small keys, see also a patch below.)
> >
> > The stacktrace map doesn't work much faster, because 95% of time it spends in
> > get_perf_callchain (the hash part, though, runs ~1.5-2.0 faster with xxh, as
> > hash size is typically about 60-90 bytes, so this makes sense to use xxh3 in
> > stacktrace by default).
> 
> For stacktrace very important aspect would be to pay attention (and
> minimize) hash collisions, though. This was a big problem with
> bpf_get_stackid() and STACK_TRACE map (and what motivated
> bpf_get_stack()). Even with a big sparsely populated map we'd get a
> lot of collisions between stack traces. xxh3 should have much better
> distribution, so in production it should result in less
> dropped/replaced stack traces. If you get a chance, it would be nice
> to collect these stats for jhash and xxh3-based implementations. Note
> that kernel's jhash2 seems to be what SMHasher ([0]) denotes as
> lookup3 (as does Jenkins himself). It's not a very good hash anymore
> in terms of distribution (and throughput as well), compared to xxh3
> (and lots of other more modern hashes).
> 
>   [0] https://github.com/rurban/smhasher

Ok, this makes sense. Based on the fact that for stacktrace xxh3 also works
about twice faster (for stack depths of 10 and more), I see no problem just
using it as is (corrected by the fact that for key sizes of 240 and more we
might prefer xxh64; this shouldn't break the stacktrace algorithms if we use
different hash algorithms, right?).

> >
> > One very simple change which brings 5-10% speed gain for all hashmaps is this:
> >
> > static inline u32 htab_map_hash(const void *key, u32 key_len, u32 hashrnd)
> >  {
> > +       if (likely((key_len & 0x3) == 0))
> > +               return jhash2(key, key_len >> 2, hashrnd);
> >         return jhash(key, key_len, hashrnd);
> >  }
> >
> > I will follow up with a patch as simple as this ^ or with a combination of
> > jhash, jhash2, and xxh3 once I will run benchmarks on more architectures to
> > check that there are no degradations.
> 
> 
> Sounds good, looking forward to it!

Benchmarks for "the better hash" are running already!

  reply	other threads:[~2023-01-31 19:18 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-27 18:14 [PATCH bpf-next 0/6] New benchmark for hashmap lookups Anton Protopopov
2023-01-27 18:14 ` [PATCH bpf-next 1/6] selftest/bpf/benchs: fix a typo in bpf_hashmap_full_update Anton Protopopov
2023-01-27 18:14 ` [PATCH bpf-next 2/6] selftest/bpf/benchs: make a function static " Anton Protopopov
2023-01-27 18:14 ` [PATCH bpf-next 3/6] selftest/bpf/benchs: enhance argp parsing Anton Protopopov
2023-01-31  0:07   ` Andrii Nakryiko
2023-01-31 13:35     ` Anton Protopopov
2023-01-27 18:14 ` [PATCH bpf-next 4/6] selftest/bpf/benchs: make quiet option common Anton Protopopov
2023-01-31  0:10   ` Andrii Nakryiko
2023-01-31 10:57     ` Anton Protopopov
2023-01-31 18:51       ` Andrii Nakryiko
2023-01-31 18:57       ` Anton Protopopov
2023-01-27 18:14 ` [PATCH bpf-next 5/6] selftest/bpf/benchs: print less if the quiet option is set Anton Protopopov
2023-01-27 18:14 ` [PATCH bpf-next 6/6] selftest/bpf/benchs: Add benchmark for hashmap lookups Anton Protopopov
2023-01-31  0:16   ` Andrii Nakryiko
2023-01-31 11:01     ` Anton Protopopov
2023-01-31  0:22   ` Martin KaFai Lau
2023-01-31 11:05     ` Anton Protopopov
2023-01-31 22:50       ` Martin KaFai Lau
2023-02-01  9:12         ` Anton Protopopov
2023-01-31  0:17 ` [PATCH bpf-next 0/6] New " Andrii Nakryiko
2023-01-31 10:47   ` Anton Protopopov
2023-01-31 18:48     ` Andrii Nakryiko
2023-01-31 19:18       ` Anton Protopopov [this message]
2023-02-01  0:02         ` Andrii Nakryiko
2023-02-01  9:41           ` Anton Protopopov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y9lpilqPsW1stF/a@lavr \
    --to=aspsk@isovalent.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=martin.lau@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox