Re: [RFC bpf-next 0/3] bpf: adding map batch processing support

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Yonghong Song <yhs@fb.com>
To: Brian Vazquez <brianvv@google.com>
Cc: Brian Vazquez <brianvv.kernel@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"David S . Miller" <davem@davemloft.net>,
	Stanislav Fomichev <sdf@google.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	Petar Penkov <ppenkov@google.com>,
	Willem de Bruijn <willemb@google.com>
Subject: Re: [RFC bpf-next 0/3] bpf: adding map batch processing support
Date: Wed, 13 Nov 2019 22:26:01 +0000	[thread overview]
Message-ID: <45d9880e-de9e-abb6-b883-fecb9781738e@fb.com> (raw)
In-Reply-To: <CAMzD94TekSSCCAZD4jZiUpdfMKDqWcwdNf42b_heTGvv1K-=fg@mail.gmail.com>



On 11/13/19 2:07 PM, Brian Vazquez wrote:
> Hi Yonghong,
> 
> Thanks for reviewing it! I'm preparing the changes and will submit them 
> sooner.
> 
> As for the right way to manage author rights, does anyone know what the 
> correct approach is? Should I use Yonghong's patch and apply the 
> extended support in different patches (i.e. support per_cpu maps, change 
> batch from u64 to __aligned_u64, etc) or it is fine to apply the changes 
> in place and write both sign-offs?

The logic flow of the patch set is most important.
You can add me as co-signoff if you reuse significant portion of my code.

> 
> Thanks,
> Brian
> 
> On Tue, Nov 12, 2019 at 6:34 PM Yonghong Song <yhs@fb.com 
> <mailto:yhs@fb.com>> wrote:
> 
> 
> 
>     On 11/7/19 1:20 PM, Brian Vazquez wrote:
>      > This is a follow up in the effort to batch bpf map operations to
>     reduce
>      > the syscall overhead with the map_ops. I initially proposed the
>     idea and
>      > the discussion is here:
>      >
>     https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
>      >
>      > Yonghong talked at the LPC about this and also proposed and idea that
>      > handles the special weird case of hashtables by doing traversing
>     using
>      > the buckets as a reference instead of a key. Discussion is here:
>      > https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/
>      >
>      > This RFC proposes a way to extend batch operations for more data
>      > structures by creating generic batch functions that can be used
>     instead
>      > of implementing the operations for each individual data structure,
>      > reducing the code that needs to be maintained. The series
>     contains the
>      > patches used in Yonghong's RFC and the patch that adds the generic
>      > implementation of the operations plus some testing with pcpu hashmaps
>      > and arrays. Note that pcpu hashmap shouldn't use the generic
>      > implementation and it either should have its own implementation
>     or share
>      > the one introduced by Yonghong, I added that just as an example
>     to show
>      > that the generic implementation can be easily added to a data
>     structure.
>      >
>      > What I want to achieve with this RFC is to collect early feedback
>     and see if
>      > there's any major concern about this before I move forward. I do plan
>      > to better separate this into different patches and explain them
>     properly
>      > in the commit messages.
> 
>     Thanks Brian for working on this. The general approach described here
>     is good to me. Having a generic implementation for batch operations
>     looks good for maps (not hash table, queue/stack, etc.)
> 
>      >
>      > Current known issues where I would like to discuss are the
>     followings:
>      >
>      > - Because Yonghong's UAPI definition was done specifically for
>      >    iterating buckets, the batch field is u64 and is treated as an u64
>      >    instead of an opaque pointer, this won't work for other data
>     structures
>      >    that are going to use a key as a batch token with a size
>     greater than
>      >    64. Although I think at this point the only key that couldn't be
>      >    treated as a u64 is the key of a hashmap, and the hashmap
>     won't use
>      >    the generic interface.
> 
>     The u64 can be changed with a __aligned_u64 opaque value. This way,
>     it can represent a pointer or a 64bit value.
> 
>      > - Not all the data structures use delete (because it's not a valid
>      >    operation) i.e. arrays. So maybe lookup_and_delete_batch
>     command is
>      >    not needed and we can handle that operation with a
>     lookup_batch and a
>      >    flag.
> 
>     This make sense.
> 
>      > - For delete_batch (not just the lookup_and_delete_batch). Is this
>      >    operation really needed? If so, shouldn't it be better if the
>      >    behaviour is delete the keys provided? I did that with my generic
>      >    implementation but Yonghong's delete_batch for a hashmap deletes
>      >    buckets.
> 
>     We need batched delete in bcc. lookup_and_delete_batch is better as
>     it can preserves more new map entries. Alternatively, deleting
>     all entries after lookup is another option. But this may remove
>     more new map entries. Statistically this may or may not matter though.
> 
>     bcc does have a clear_table (clear_map) API, but not clear who is
>     using it.
> 
>     So, I did not have a concrete use case for delete_batch yet.
>     I tend to think we should have delete_batch for API completeness,
>     but maybe other people can comment on this as well.
> 
>     Maybe initial patch, we can skip it. But we should still ensure
>     user interface data structure can handle batch delete if it is
>     needed later. The current data structure should handle this
>     as far as I know.
> 
>      >
>      > Brian Vazquez (1):
>      >    bpf: add generic batch support
>      >
>      > Yonghong Song (2):
>      >    bpf: adding map batch processing support
>      >    tools/bpf: test bpf_map_lookup_and_delete_batch()
>      >
>      >   include/linux/bpf.h                           |  21 +
>      >   include/uapi/linux/bpf.h                      |  22 +
>      >   kernel/bpf/arraymap.c                         |   4 +
>      >   kernel/bpf/hashtab.c                          | 331 ++++++++++
>      >   kernel/bpf/syscall.c                          | 573
>     ++++++++++++++----
>      >   tools/include/uapi/linux/bpf.h                |  22 +
>      >   tools/lib/bpf/bpf.c                           |  59 ++
>      >   tools/lib/bpf/bpf.h                           |  13 +
>      >   tools/lib/bpf/libbpf.map                      |   4 +
>      >   .../map_tests/map_lookup_and_delete_batch.c   | 245 ++++++++
>      >   .../map_lookup_and_delete_batch_array.c       | 118 ++++
>      >   11 files changed, 1292 insertions(+), 120 deletions(-)
>      >   create mode 100644
>     tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
>      >   create mode 100644
>     tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch_array.c
>      >
>

     prev parent reply	other threads:[~2019-11-13 22:26 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-07 21:20 [RFC bpf-next 0/3] bpf: adding map batch processing support Brian Vazquez
2019-11-07 21:20 ` [RFC bpf-next 1/3] " Brian Vazquez
2019-11-07 21:20 ` [RFC bpf-next 2/3] tools/bpf: test bpf_map_lookup_and_delete_batch() Brian Vazquez
2019-11-15 22:42   ` Andrii Nakryiko
2019-11-07 21:20 ` [RFC bpf-next 3/3] bpf: add generic batch support Brian Vazquez
2019-11-13  2:34 ` [RFC bpf-next 0/3] bpf: adding map batch processing support Yonghong Song
     [not found]   ` <CAMzD94TekSSCCAZD4jZiUpdfMKDqWcwdNf42b_heTGvv1K-=fg@mail.gmail.com>
2019-11-13 22:26     ` Yonghong Song [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45d9880e-de9e-abb6-b883-fecb9781738e@fb.com \
    --to=yhs@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brianvv.kernel@gmail.com \
    --cc=brianvv@google.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=ppenkov@google.com \
    --cc=sdf@google.com \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).