All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stanislav Fomichev <sdf@fomichev.me>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andriin@fb.com>, Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Networking <netdev@vger.kernel.org>,
	bpf@vger.kernel.org, Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next 05/12] libbpf: add resizable non-thread safe internal hashmap
Date: Wed, 22 May 2019 15:30:13 -0700	[thread overview]
Message-ID: <20190522223013.GC3032@mini-arch> (raw)
In-Reply-To: <CAEf4BzZi6A1vcFUFdwSQrbag_ptoU9+imdWhHdVKCLB8yvPSTw@mail.gmail.com>

On 05/22, Andrii Nakryiko wrote:
> On Wed, May 22, 2019 at 1:30 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > On 05/22, Andrii Nakryiko wrote:
> > > There is a need for fast point lookups inside libbpf for multiple use
> > > cases (e.g., name resolution for BTF-to-C conversion, by-name lookups in
> > > BTF for upcoming BPF CO-RE relocation support, etc). This patch
> > > implements simple resizable non-thread safe hashmap using single linked
> > > list chains.
> > Didn't really look into the details, but any reason you're not using
> > linux/hashtable.h? It's exported in tools/include and I think perf
> > is using it. It's probably not resizable, but should be easy to
> > implement rebalancing on top of it.
> 
> There are multiple reasons.
> 1. linux/hashtable.h is pretty bare-bones, it's just hlist_node and a
> bunch of macro to manipulate array or chains of them. I wanted to have
> higher-level API with lookup by key, insertion w/ various strategies,
> etc. Preferrably one not requiring to manipulate hlist_node directly
> as part of its API, even if at some performance cost of hiding that
> low-level detail.
> 2. Resizing is a big chunk of resizable hashmap logic, so I'd need to
> write a bunch of additional code anyway.
> 3. Licensing. linux/hashtable.h is under GPL, while libbpf is
> dual-licensed under GPL and BSD. When syncing libbpf from kernel to
> github, we have to re-implement all the parts from kernel that are not
> under BSD license anyway.
> 4. hlist_node keeps two pointers per item, which is unnecessary for
> hashmap which does deletion by key (by searching for node first, then
> deleting), so we can also have lower memory overhead per entry.
> 
> So in general, I feel like there is little benefit to reusing
> linux/hashlist.h for use cases I'm targeting this hashmap for.
Makes sense. Licensing is probably the biggest issue here because
my original suggestion was to use linux/hashtable.h internally,
just wrap it in a nice api.
But agree on all points, thanks for clarification!

> > > Four different insert strategies are supported:
> > >  - HASHMAP_ADD - only add key/value if key doesn't exist yet;
> > >  - HASHMAP_SET - add key/value pair if key doesn't exist yet; otherwise,
> > >    update value;
> > >  - HASHMAP_UPDATE - update value, if key already exists; otherwise, do
> > >    nothing and return -ENOENT;
> > >  - HASHMAP_APPEND - always add key/value pair, even if key already exists.
> > >    This turns hashmap into a multimap by allowing multiple values to be
> > >    associated with the same key. Most useful read API for such hashmap is
> > >    hashmap__for_each_key_entry() iteration. If hashmap__find() is still
> > >    used, it will return last inserted key/value entry (first in a bucket
> > >    chain).
> > >
> > > For HASHMAP_SET and HASHMAP_UPDATE, old key/value pair is returned, so
> > > that calling code can handle proper memory management, if necessary.
> > >
> > > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > > ---
> > >  tools/lib/bpf/Build     |   2 +-
> > >  tools/lib/bpf/hashmap.c | 229 ++++++++++++++++++++++++++++++++++++++++
> > >  tools/lib/bpf/hashmap.h | 173 ++++++++++++++++++++++++++++++
> > >  3 files changed, 403 insertions(+), 1 deletion(-)
> > >  create mode 100644 tools/lib/bpf/hashmap.c
> > >  create mode 100644 tools/lib/bpf/hashmap.h
> > >
> > >

  reply	other threads:[~2019-05-22 22:30 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-22 19:50 [PATCH bpf-next 00/12] BTF-to-C converter Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 01/12] libbpf: ensure libbpf.h is included along libbpf_internal.h Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 02/12] libbpf: add btf__parse_elf API to load .BTF and .BTF.ext Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 03/12] bpftool: use libbpf's btf__parse_elf API Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 04/12] selftests/bpf: use btf__parse_elf to check presence of BTF/BTF.ext Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 05/12] libbpf: add resizable non-thread safe internal hashmap Andrii Nakryiko
2019-05-22 20:30   ` Stanislav Fomichev
2019-05-22 22:13     ` Andrii Nakryiko
2019-05-22 22:30       ` Stanislav Fomichev [this message]
2019-05-22 19:50 ` [PATCH bpf-next 06/12] selftests/bpf: add tests for libbpf's hashmap Andrii Nakryiko
2019-05-22 20:31   ` Stanislav Fomichev
2019-05-22 22:15     ` Andrii Nakryiko
2019-05-23  4:27       ` Andrii Nakryiko
2019-05-23 16:04         ` Stanislav Fomichev
2019-05-22 19:50 ` [PATCH bpf-next 07/12] libbpf: switch btf_dedup() to hashmap for dedup table Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 08/12] libbpf: add btf_dump API for BTF-to-C conversion Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 09/12] selftests/bpf: add btf_dump BTF-to-C conversion tests Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 10/12] bpftool: add C output format option to btf dump subcommand Andrii Nakryiko
2019-05-23  0:25   ` Jakub Kicinski
2019-05-23  0:58     ` Andrii Nakryiko
2019-05-23  1:23       ` Jakub Kicinski
2019-05-23  4:43         ` Andrii Nakryiko
2019-05-23 16:27           ` Jakub Kicinski
2019-05-23 17:26             ` Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 11/12] bpftool/docs: add description of btf dump C option Andrii Nakryiko
2019-05-22 19:50 ` [PATCH bpf-next 12/12] bpftool: update bash-completion w/ new c option for btf dump Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190522223013.GC3032@mini-arch \
    --to=sdf@fomichev.me \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=ast@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.