From: Anton Protopopov <aspsk@isovalent.com>
To: Nick Zavaritsky <mejedi@gmail.com>
Cc: Charalampos Stylianopoulos <charalampos.stylianopoulos@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
aspsk2@gmail.com
Subject: Re: [PATCH bpf-next 0/4] expose number of map entries to userspace
Date: Fri, 17 Jan 2025 10:35:47 +0000
Message-ID: <Z4oygzEgfLqGCCNA@eis>
In-Reply-To: <AC7968EC-73CA-415B-8FAD-70C805075479@gmail.com>

On 25/01/16 06:52PM, Nick Zavaritsky wrote:
>
> > On 16. Jan 2025, at 15:59, Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > On 25/01/14 12:38PM, Nick Zavaritsky wrote:
> >>
> >>> On 9. Jan 2025, at 18:37, Anton Protopopov <aspsk@isovalent.com> wrote:
> >>>
> >>> On 25/01/07 12:10PM, Charalampos Stylianopoulos wrote:
> >>>> (sorry for double posting, this time in plain text)
> >>>> Thanks a lot for the feedback!
> >>>>
> >>>> So, to double check, the suggestion is to only extend the libbpf API
> >>>> with a new helper that does pretty much what get_cur_elements() does
> >>>> in tools/testing/selftests/bpf/map_tests/map_percpu_stats.c ?
> >>>
> >>> What is your use case for getting the number of elements in a
> >>> particular map? Will it work for you to just use a variant of
> >>> get_cur_elements() from selftests vs. adding new API to libbpf?
> >>
> >> (On behalf of Charalampos Stylianopoulos) we would like to get the
> >> number of elements in some maps for monitoring purposes. The end goal is
> >> to get someone paged when a fixed-capacity map is about to start
> >> rejecting inserts.
> >>
> >> We aim to operate a large number of apps in containers (custom packet
> >> processing services, telecom). We find it most convenient for an app
> >> itself to expose metrics concerning the maps it has created.
> >>
> >> We currently use a map iterator and a bunch of bpf_probe_read_kernel. We
> >> foresee the number of maps in our systems getting significantly higher
> >> in the near future. Therefore enumerating every map in the system to get
> >> a number of elements in a particular map doesn't look sustainable.
> >>
> >> How do you feel about introducing bpf_map_sum_elem_count_by_fd kfunc,
> >> available in syscall programs?
> >
> > This should work already, something like
> >
> > 	__s64 bpf_map_sum_elem_count(const struct bpf_map *map) __ksym;
> > 	__s64 ret_user;
> >
> > 	struct {
> > 		__uint(type, BPF_MAP_TYPE_HASH);
> > 		__type(key, int);
> > 		__type(value, int);
> > 		__uint(max_entries, 4);
> > 	} your_map SEC(".maps");
> >
> > 	SEC("syscall")
> > 	int sum(void *ctx)
> > 	{
> > 		struct bpf_map *map = (struct bpf_map *)&your_map;
> >
> > 		ret_user = bpf_map_sum_elem_count(map);
> >
> > 		return 0;
> > 	}
> >
> > 	char _license[] SEC("license") = "GPL";
> >
> > Is this sufficient for your use case?
>
> Technically it works. One can add a program similar to the snippet below
> to their bpf code to expose the number of elements in every map of
> interest.
>
> 	struct stats { __s64 a, b, c, d; };
> 	SEC(".maps") struct { ... } a, b, c, d;
>
> 	SEC("syscall")
> 	int sum_element_count_bulk(void *ctx)
> 	{
> 		struct stats *stats = ctx;
> 		stats->a = bpf_map_sum_elem_count((void *)&a);
> 		stats->b = bpf_map_sum_elem_count((void *)&b);
> 		...
> 		return 0;
> 	}
>
> The downside is that it is boilerplate code that has to be written every
> single time. With the proposed bpf_map_sum_elem_count_by_fd, one can
> have a library in user space that offers a convenient
> sum_element_count(int fd).
>
> It could leverage the following bpf program behind the scenes:
>
> 	SEC("syscall")
> 	int sum_element_count(void *ctx)
> 	{
> 		*(__s64 *)ctx = bpf_map_sum_elem_count_by_fd(*(int *)ctx);
> 		return 0;
> 	}

Makes sense. And this can also be used for multiple maps in one call.
I've quickly tested that the following implementation works; please
send a patch + selftests. Note that, unlike bpf_map_sum_elem_count(),
bpf_map_sum_elem_count_by_fd() should only be allowed for SYSCALL
programs.

	__bpf_kfunc s64 bpf_map_sum_elem_count_by_fd(int fd)
	{
		struct bpf_map *map;
		s64 ret;

		/* Takes a reference on the map behind fd */
		map = bpf_map_get(fd);
		if (IS_ERR(map))
			return 0;	/* invalid fd: report zero elements */

		ret = bpf_map_sum_elem_count(map);
		bpf_map_put(map);	/* drop the reference taken above */

		return ret;
	}

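A note that may help when writing the selftest: the syscall program
quoted above reads an int fd from ctx and writes an __s64 result back
to the same address, so the context buffer passed from user space has
to be at least 8 bytes. Below is a minimal user-space model of that
in-place layout in plain C. It is only a sketch: fake_sum_by_fd(),
run_prog(), union sum_ctx and the fd-to-count table are hypothetical
stand-ins invented for illustration, not kernel or libbpf APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the proposed kfunc. Fake fds 3 and 4 map
 * to made-up element counts; any other fd yields 0, mirroring the
 * IS_ERR() branch of the kernel sketch. */
static int64_t fake_sum_by_fd(int fd)
{
	static const int64_t counts[] = { 0, 0, 0, 3, 1000 };
	const int n = (int)(sizeof(counts) / sizeof(counts[0]));

	if (fd < 0 || fd >= n)
		return 0;
	return counts[fd];
}

/* Context shared by input and output: the fd goes in, the count comes
 * back in the same bytes (assumed layout, chosen for this sketch). */
union sum_ctx {
	int fd;
	int64_t count;
};

/* Model of the syscall program body: read the fd from ctx, then
 * overwrite ctx with the 64-bit element count. */
static int run_prog(void *ctx)
{
	int fd;
	int64_t count;

	assert(ctx != NULL);
	memcpy(&fd, ctx, sizeof(fd));
	count = fake_sum_by_fd(fd);
	memcpy(ctx, &count, sizeof(count));
	return 0;
}

/* Model of the user-space library helper built on top of the program. */
static int64_t sum_element_count(int fd)
{
	union sum_ctx ctx;

	memset(&ctx, 0, sizeof(ctx));
	ctx.fd = fd;
	run_prog(&ctx);
	return ctx.count;
}
```

In this model sum_element_count(3) yields the made-up count for fd 3,
and an unknown fd yields 0. One consequence of returning 0 on a bad fd
is that the caller cannot tell an empty map from an invalid descriptor,
which may be worth covering in the selftests.
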
> >
> >>>
> >>> [Also, please try not to top-post, see https://www.idallen.com/topposting.html]
> >>>
> >>>>> On Tue, 7 Jan 2025 at 08:44, Anton Protopopov <aspsk@isovalent.com> wrote:
> >>>>>>
> >>>>>> On 25/01/06 05:19PM, Daniel Borkmann wrote:
> >>>>>>> On 1/6/25 3:53 PM, Charalampos Stylianopoulos wrote:
> >>>>>>>> This patch series provides an easy way for userspace applications to
> >>>>>>>> query the number of entries currently present in a map.
> >>>>>>>>
> >>>>>>>> Currently, the number of entries in a map is accessible only from kernel space
> >>>>>>>> and eBPF programs. A userspace program that wants to track map utilization has to
> >>>>>>>> create and attach an eBPF program solely for that purpose.
> >>>>>>>>
> >>>>>>>> This series makes the number of entries in a map easily accessible, by extending the
> >>>>>>>> main bpf syscall with a new command. The command supports only maps that already
> >>>>>>>> track utilization, namely hash maps, LPM maps and queue/stack maps.
> >>>>>>>
> >>>>>>> An earlier attempt to directly expose it to user space can be found here [0], which
> >>>>>>> eventually led to [1] to only expose it via kfunc for BPF programs in order to avoid
> >>>>>>> extending UAPI.
> >>>>>>>
> >>>>>>> Perhaps instead add a small libbpf helper (e.g. bpf_map__current_entries to complement
> >>>>>>> bpf_map__max_entries) which does all the work to extract that info via [1] underneath?
> >>>>>>
> >>>>>> One small thingy here is that bpf_map_sum_elem_count() is only
> >>>>>> available from the map iterator. Which means that to get the
> >>>>>> bpf_map_sum_elem_count() for one map only, one has to iterate
> >>>>>> through the whole set of maps (and filter out all but one).
> >>>>>>
> >>>>>> I wanted to follow up my series by either adding the result of
> >>>>>> calling bpf_map_sum_elem_count() to map_info as u32 or adding the
> >>>>>> possibility to provide a map_fd/map_id when creating an iterator
> >>>>>> (so that it is only called for one map). But so far I haven't had
> >>>>>> a real use case for getting the number of elements for one map only.
> >>>>>>
> >>>>>>> Thanks,
> >>>>>>> Daniel
> >>>>>>>
> >>>>>>> [0] https://lore.kernel.org/bpf/20230531110511.64612-1-aspsk@isovalent.com/
> >>>>>>> [1] https://lore.kernel.org/bpf/20230705160139.19967-1-aspsk@isovalent.com/
> >>>>>>> https://lore.kernel.org/bpf/20230719092952.41202-1-aspsk@isovalent.com/
> >>>>>>>
> >>>>>>>> Charalampos Stylianopoulos (4):
> >>>>>>>>   bpf: Add map_num_entries map op
> >>>>>>>>   bpf: Add bpf command to get number of map entries
> >>>>>>>>   libbpf: Add support for MAP_GET_NUM_ENTRIES command
> >>>>>>>>   selftests/bpf: Add tests for bpf_map_get_num_entries
> >>>>>>>>
> >>>>>>>>  include/linux/bpf.h                        |  3 ++
> >>>>>>>>  include/linux/bpf_local_storage.h          |  1 +
> >>>>>>>>  include/uapi/linux/bpf.h                   | 17 +++++++++
> >>>>>>>>  kernel/bpf/devmap.c                        | 14 ++++++++
> >>>>>>>>  kernel/bpf/hashtab.c                       | 10 ++++++
> >>>>>>>>  kernel/bpf/lpm_trie.c                      |  8 +++++
> >>>>>>>>  kernel/bpf/queue_stack_maps.c              | 11 +++++-
> >>>>>>>>  kernel/bpf/syscall.c                       | 32 +++++++++++++++++
> >>>>>>>>  tools/include/uapi/linux/bpf.h             | 17 +++++++++
> >>>>>>>>  tools/lib/bpf/bpf.c                        | 16 +++++++++
> >>>>>>>>  tools/lib/bpf/bpf.h                        |  2 ++
> >>>>>>>>  tools/lib/bpf/libbpf.map                   |  1 +
> >>>>>>>>  .../bpf/map_tests/lpm_trie_map_basic_ops.c |  5 +++
> >>>>>>>>  tools/testing/selftests/bpf/test_maps.c    | 35 +++++++++++++++++++
> >>>>>>>>  14 files changed, 171 insertions(+), 1 deletion(-)
>
>
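
For the monitoring use case described earlier in the thread (paging
someone before a fixed-capacity map starts rejecting inserts), the
user-space side still has to compare the element count against
max_entries. A minimal sketch of such a check in plain C follows.
map_nearly_full() and its threshold parameter are hypothetical names
chosen for illustration; nothing like this exists in libbpf. The count
is assumed to come from a sum over per-CPU counters, so it may be
slightly off while the map is being updated concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true once cur_elems has reached
 * threshold_pct percent of max_entries. A negative count or a zero
 * capacity is treated as "nothing to alert on". */
static bool map_nearly_full(int64_t cur_elems, uint32_t max_entries,
			    unsigned int threshold_pct)
{
	if (cur_elems < 0 || max_entries == 0)
		return false;

	/* A threshold above 100% can never be exceeded meaningfully */
	if (threshold_pct > 100)
		threshold_pct = 100;

	/* At or above capacity: always alert (also bounds cur_elems) */
	if ((uint64_t)cur_elems >= max_entries)
		return true;

	/* cur / max >= pct / 100, cross-multiplied to avoid division;
	 * both sides fit in 64 bits after the clamping above. */
	return (uint64_t)cur_elems * 100 >=
	       (uint64_t)max_entries * threshold_pct;
}
```

With a capacity of 100 and a 90% threshold, the check fires at 90
elements and stays quiet at 89.
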