Re: [PATCH bpf-next v4 3/6] xdp: Add devmap_hash map type for looking up devices by hashed index


From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: "Daniel Borkmann" <daniel@iogearbox.net>,
	"Alexei Starovoitov" <ast@kernel.org>,
	netdev@vger.kernel.org, "David Miller" <davem@davemloft.net>,
	"Jakub Kicinski" <jakub.kicinski@netronome.com>,
	"Björn Töpel" <bjorn.topel@gmail.com>,
	"Yonghong Song" <yhs@fb.com>,
	brouer@redhat.com
Subject: Re: [PATCH bpf-next v4 3/6] xdp: Add devmap_hash map type for looking up devices by hashed index
Date: Thu, 25 Jul 2019 17:05:03 +0200
Message-ID: <8736iuyx28.fsf@toke.dk>
In-Reply-To: <20190725133730.3750c66c@carbon>

Jesper Dangaard Brouer <brouer@redhat.com> writes:

> On Thu, 25 Jul 2019 12:32:19 +0200
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
>> Jesper Dangaard Brouer <brouer@redhat.com> writes:
>> 
>> > On Mon, 22 Jul 2019 13:52:48 +0200
>> > Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >  
>> >> +static inline struct hlist_head *dev_map_index_hash(struct bpf_dtab *dtab,
>> >> +						    int idx)
>> >> +{
>> >> +	return &dtab->dev_index_head[idx & (NETDEV_HASHENTRIES - 1)];
>> >> +}  
>> >
>> > It is good for performance that our "hash" function is simply an AND
>> > operation on the idx.  We want to keep it this way.
>> >
>> > I don't like that you are using NETDEV_HASHENTRIES, because the BPF map
>> > infrastructure already has a way to specify the map size (struct
>> > bpf_map_def .max_entries).  BUT for performance reasons, to keep the
>> > AND operation, we would need to round up the hash-array size to nearest
>> > power of 2 (or reject if user didn't specify a power of 2, if we want
>> > to "expose" this limit to users).  
>> 
>> But do we really want the number of hash buckets to be equal to the max
>> number of entries? The values are not likely to be evenly distributed,
>> so we'll end up with big buckets if the number is small, meaning we'll
>> blow performance on walking long lists in each bucket.
>
> The requested change makes it user-configurable, instead of a fixed 256
> entries.  I've seen production use-cases with >5000 net_devices, thus
> they need a knob to increase this (to avoid the list walking you
> mention).

Ah, I see. That makes sense; I thought you wanted to make it smaller
(cf. the previous discussion about it being too big). Still, it seems
counter-intuitive to overload max_entries in this way.

I do see that this is what the existing hash map is also doing, though,
so I guess there is some precedent. I do wonder if we'll end up getting
bad performance from the hash being too simplistic, but I guess we can
always fix that later.
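
To illustrate the approach discussed above, here is a minimal userspace
sketch, not the actual patch. It rounds a requested max_entries up to the
next power of two so that bucket selection can remain a single AND. The
helper names roundup_pow2_u32 and bucket_index are hypothetical; in-kernel
code would more likely use the existing roundup_pow_of_two() helper.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two. Valid for v in 1..2^31. */
static inline uint32_t roundup_pow2_u32(uint32_t v)
{
	assert(v != 0 && v <= (UINT32_C(1) << 31));
	v--;
	/* Smear the highest set bit into every lower bit position. */
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

/*
 * Select a bucket with a single AND. This only covers all buckets
 * when n_buckets is a power of two, hence the precondition check.
 */
static inline uint32_t bucket_index(uint32_t idx, uint32_t n_buckets)
{
	assert(n_buckets != 0 && (n_buckets & (n_buckets - 1)) == 0);
	return idx & (n_buckets - 1);
}
```

With 5000 requested entries this yields 8192 buckets, and index 8193 maps
to bucket 1. The cost is that the mask is no longer a compile-time
constant, which is the concern raised in the next quoted paragraph.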

>> Also, if the size is dynamic the size needs to be loaded from memory
>> instead of being a compile-time constant, which will presumably hurt
>> performance (though not sure by how much)?
>
> To counter this, the mask value, which needs to be loaded from memory,
> should be placed next to some other struct member which is already in
> use (at least on the same cacheline; Intel has some 16-byte access
> micro-optimizations, which I've never been able to measure, as it's on
> a 0.5 nanosecond scale).

In the fast path (i.e., in __xdp_map_lookup_elem) we will have already
loaded map->max_entries since it's on the same cacheline as map_type
which we use to disambiguate which function to call. So it should be
fine to just use that directly.
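
The cacheline argument can be checked at compile time. The sketch below
uses a hypothetical struct demo_map, which is not the real struct bpf_map
layout, and assumes 64-byte cachelines and a cacheline-aligned struct. It
only shows how one might assert that two fields share a line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64	/* assumption: typical x86-64 line size */

/* Hypothetical stand-in for a map header; field order is illustrative. */
struct demo_map {
	const void *ops;
	uint32_t map_type;
	uint32_t key_size;
	uint32_t value_size;
	uint32_t max_entries;
};

/* True if both fields start in the same cacheline of an aligned struct. */
#define SAME_CACHELINE(type, a, b) \
	(offsetof(type, a) / CACHELINE_BYTES == \
	 offsetof(type, b) / CACHELINE_BYTES)

/* Fails the build if a later change moves the fields apart. */
static_assert(SAME_CACHELINE(struct demo_map, map_type, max_entries),
	      "map_type and max_entries should share a cacheline");
```

If the check holds, reading max_entries after dispatching on map_type
should not pull in an additional cacheline.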

I'll send a new version with this change :)

-Toke


Thread overview: 11+ messages
2019-07-22 11:52 [PATCH bpf-next v4 0/6] xdp: Add devmap_hash map type Toke Høiland-Jørgensen
2019-07-22 11:52 ` [PATCH bpf-next v4 3/6] xdp: Add devmap_hash map type for looking up devices by hashed index Toke Høiland-Jørgensen
2019-07-25  8:07   ` Jesper Dangaard Brouer
2019-07-25 10:32     ` Toke Høiland-Jørgensen
2019-07-25 11:37       ` Jesper Dangaard Brouer
2019-07-25 15:05         ` Toke Høiland-Jørgensen [this message]
2019-07-22 11:52 ` [PATCH bpf-next v4 4/6] tools/include/uapi: Add devmap_hash BPF map type Toke Høiland-Jørgensen
2019-07-22 11:52 ` [PATCH bpf-next v4 1/6] include/bpf.h: Remove map_insert_ctx() stubs Toke Høiland-Jørgensen
2019-07-22 11:52 ` [PATCH bpf-next v4 2/6] xdp: Refactor devmap allocation code for reuse Toke Høiland-Jørgensen
2019-07-22 11:52 ` [PATCH bpf-next v4 5/6] tools/libbpf_probes: Add new devmap_hash type Toke Høiland-Jørgensen
2019-07-22 11:52 ` [PATCH bpf-next v4 6/6] tools: Add definitions for devmap_hash map type Toke Høiland-Jørgensen
