From: Yonghong Song <yhs@fb.com>
To: Brian Vazquez <brianvv@google.com>
Cc: Brian Vazquez <brianvv.kernel@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
"David S . Miller" <davem@davemloft.net>,
Stanislav Fomichev <sdf@google.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
Petar Penkov <ppenkov@google.com>,
Willem de Bruijn <willemb@google.com>
Subject: Re: [RFC bpf-next 0/3] bpf: adding map batch processing support
Date: Wed, 13 Nov 2019 22:26:01 +0000 [thread overview]
Message-ID: <45d9880e-de9e-abb6-b883-fecb9781738e@fb.com> (raw)
In-Reply-To: <CAMzD94TekSSCCAZD4jZiUpdfMKDqWcwdNf42b_heTGvv1K-=fg@mail.gmail.com>
On 11/13/19 2:07 PM, Brian Vazquez wrote:
> Hi Yonghong,
>
> Thanks for reviewing it! I'm preparing the changes and will submit them
> sooner.
>
> As for the right way to manage author rights, does anyone know what the
> correct approach is? Should I use Yonghong's patch and apply the
> extended support in different patches (i.e. support per_cpu maps, change
> batch from u64 to __aligned_u64, etc) or it is fine to apply the changes
> in place and write both sign-offs?
The logic flow of the patch set is most important.
You can add me as co-signoff if you reuse significant portion of my code.
>
> Thanks,
> Brian
>
> On Tue, Nov 12, 2019 at 6:34 PM Yonghong Song <yhs@fb.com
> <mailto:yhs@fb.com>> wrote:
>
>
>
> On 11/7/19 1:20 PM, Brian Vazquez wrote:
> > This is a follow up in the effort to batch bpf map operations to
> reduce
> > the syscall overhead with the map_ops. I initially proposed the
> idea and
> > the discussion is here:
> >
> https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
> >
> > Yonghong talked at the LPC about this and also proposed and idea that
> > handles the special weird case of hashtables by doing traversing
> using
> > the buckets as a reference instead of a key. Discussion is here:
> > https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/
> >
> > This RFC proposes a way to extend batch operations for more data
> > structures by creating generic batch functions that can be used
> instead
> > of implementing the operations for each individual data structure,
> > reducing the code that needs to be maintained. The series
> contains the
> > patches used in Yonghong's RFC and the patch that adds the generic
> > implementation of the operations plus some testing with pcpu hashmaps
> > and arrays. Note that pcpu hashmap shouldn't use the generic
> > implementation and it either should have its own implementation
> or share
> > the one introduced by Yonghong, I added that just as an example
> to show
> > that the generic implementation can be easily added to a data
> structure.
> >
> > What I want to achieve with this RFC is to collect early feedback
> and see if
> > there's any major concern about this before I move forward. I do plan
> > to better separate this into different patches and explain them
> properly
> > in the commit messages.
>
> Thanks Brian for working on this. The general approach described here
> is good to me. Having a generic implementation for batch operations
> looks good for maps (not hash table, queue/stack, etc.)
>
> >
> > Current known issues where I would like to discuss are the
> followings:
> >
> > - Because Yonghong's UAPI definition was done specifically for
> > iterating buckets, the batch field is u64 and is treated as an u64
> > instead of an opaque pointer, this won't work for other data
> structures
> > that are going to use a key as a batch token with a size
> greater than
> > 64. Although I think at this point the only key that couldn't be
> > treated as a u64 is the key of a hashmap, and the hashmap
> won't use
> > the generic interface.
>
> The u64 can be changed with a __aligned_u64 opaque value. This way,
> it can represent a pointer or a 64bit value.
>
> > - Not all the data structures use delete (because it's not a valid
> > operation) i.e. arrays. So maybe lookup_and_delete_batch
> command is
> > not needed and we can handle that operation with a
> lookup_batch and a
> > flag.
>
> This make sense.
>
> > - For delete_batch (not just the lookup_and_delete_batch). Is this
> > operation really needed? If so, shouldn't it be better if the
> > behaviour is delete the keys provided? I did that with my generic
> > implementation but Yonghong's delete_batch for a hashmap deletes
> > buckets.
>
> We need batched delete in bcc. lookup_and_delete_batch is better as
> it can preserves more new map entries. Alternatively, deleting
> all entries after lookup is another option. But this may remove
> more new map entries. Statistically this may or may not matter though.
>
> bcc does have a clear_table (clear_map) API, but not clear who is
> using it.
>
> So, I did not have a concrete use case for delete_batch yet.
> I tend to think we should have delete_batch for API completeness,
> but maybe other people can comment on this as well.
>
> Maybe initial patch, we can skip it. But we should still ensure
> user interface data structure can handle batch delete if it is
> needed later. The current data structure should handle this
> as far as I know.
>
> >
> > Brian Vazquez (1):
> > bpf: add generic batch support
> >
> > Yonghong Song (2):
> > bpf: adding map batch processing support
> > tools/bpf: test bpf_map_lookup_and_delete_batch()
> >
> > include/linux/bpf.h | 21 +
> > include/uapi/linux/bpf.h | 22 +
> > kernel/bpf/arraymap.c | 4 +
> > kernel/bpf/hashtab.c | 331 ++++++++++
> > kernel/bpf/syscall.c | 573
> ++++++++++++++----
> > tools/include/uapi/linux/bpf.h | 22 +
> > tools/lib/bpf/bpf.c | 59 ++
> > tools/lib/bpf/bpf.h | 13 +
> > tools/lib/bpf/libbpf.map | 4 +
> > .../map_tests/map_lookup_and_delete_batch.c | 245 ++++++++
> > .../map_lookup_and_delete_batch_array.c | 118 ++++
> > 11 files changed, 1292 insertions(+), 120 deletions(-)
> > create mode 100644
> tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
> > create mode 100644
> tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch_array.c
> >
>
prev parent reply other threads:[~2019-11-13 22:26 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-07 21:20 [RFC bpf-next 0/3] bpf: adding map batch processing support Brian Vazquez
2019-11-07 21:20 ` [RFC bpf-next 1/3] " Brian Vazquez
2019-11-07 21:20 ` [RFC bpf-next 2/3] tools/bpf: test bpf_map_lookup_and_delete_batch() Brian Vazquez
2019-11-15 22:42 ` Andrii Nakryiko
2019-11-07 21:20 ` [RFC bpf-next 3/3] bpf: add generic batch support Brian Vazquez
2019-11-13 2:34 ` [RFC bpf-next 0/3] bpf: adding map batch processing support Yonghong Song
[not found] ` <CAMzD94TekSSCCAZD4jZiUpdfMKDqWcwdNf42b_heTGvv1K-=fg@mail.gmail.com>
2019-11-13 22:26 ` Yonghong Song [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=45d9880e-de9e-abb6-b883-fecb9781738e@fb.com \
--to=yhs@fb.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brianvv.kernel@gmail.com \
--cc=brianvv@google.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=ppenkov@google.com \
--cc=sdf@google.com \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).