All of lore.kernel.org
 help / color / mirror / Atom feed
From: sdf@google.com
To: Donald Hunter <donald.hunter@gmail.com>
Cc: bpf@vger.kernel.org, linux-doc@vger.kernel.org,
	Jonathan Corbet <corbet@lwn.net>
Subject: Re: [PATCH] bpf, docs: document BPF_MAP_TYPE_HASH and variants
Date: Wed, 13 Jul 2022 16:30:15 -0700	[thread overview]
Message-ID: <Ys9VhwanEB/T8/Ue@google.com> (raw)
In-Reply-To: <20220713211612.84782-1-donald.hunter@gmail.com>

On 07/13, Donald Hunter wrote:
> This commit adds documentation for BPF_MAP_TYPE_HASH including kernel
> version introduced, usage and examples. It also documents
> BPF_MAP_TYPE_PERCPU_HASH, BPF_MAP_TYPE_LRU_HASH and
> BPF_MAP_TYPE_LRU_PERCPU_HASH which are similar.

> Note that this file is included in the BPF documentation by the glob in
> Documentation/bpf/maps.rst

> Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
> ---
>   Documentation/bpf/map_hash.rst | 176 +++++++++++++++++++++++++++++++++
>   1 file changed, 176 insertions(+)
>   create mode 100644 Documentation/bpf/map_hash.rst

> diff --git a/Documentation/bpf/map_hash.rst  
> b/Documentation/bpf/map_hash.rst
> new file mode 100644
> index 000000000000..991452e70cc9
> --- /dev/null
> +++ b/Documentation/bpf/map_hash.rst
> @@ -0,0 +1,176 @@
> +.. SPDX-License-Identifier: GPL-2.0-only
> +.. Copyright (C) 2021 Red Hat, Inc.
> +
> +===============================================
> +BPF_MAP_TYPE_HASH, with PERCPU and LRU Variants
> +===============================================
> +
> +.. note::
> +   - ``BPF_MAP_TYPE_HASH`` was introduced in kernel version 3.19
> +   - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduced in version 4.6
> +   - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
> +     were introduced in version 4.10
> +
> +``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCPU_HASH`` provide general
> +purpose hash map storage. Both the key and the value can be structs,
> +allowing for composite keys and values. The maximum number of entries is
> +defined in max_entries and is limited to 2^32. The kernel is responsible

Do we really need to mention 2^32 limit here? It really depends on
the key/value sizes, right?

Instead, might be worth talking about how/when this memory is allocated and
mention BPF_F_NO_PREALLOC?

> +for allocating and freeing key/value pairs, up to the max_entries limit
> +that you specify. ``BPF_MAP_TYPE_PERCPU_HASH`` provides a separate hash
> +table per CPU.
> +
> +Values stored in ``BPF_MAP_TYPE_HASH`` can be accessed concurrently by
> +programs running on different CPUs.  Since Kernel version 5.1, the BPF
> +infrastructure provides ``struct bpf_spin_lock`` to synchronize access.
> +
> +The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
> +variants add LRU semantics to their respective hash tables. An LRU hash
> +will automatically evict the least recently used entries when the hash
> +table reaches capacity. An LRU hash maintains an internal LRU list that
> +is used to select elements for eviction. This internal LRU list is
> +shared across CPUs but it is possible to request a per CPU LRU list with
> +the ``BPF_F_NO_COMMON_LRU`` flag when calling ``bpf_map_create``.
> +
> +Usage
> +=====
> +
> +.. c:function::
> +   long bpf_map_update_elem(struct bpf_map *map, const void *key, const  
> void *value, u60 flags)

s/u60/u64/

> +
> +Hash entries can be added or updated using the ``bpf_map_update_elem()``
> +helper. This helper replaces existing elements atomically. The ``flags``
> +parameter can be used to control the update behaviour:
> +
> +- ``BPF_ANY`` will create a new element or update an existing element
> +- ``BPF_NOTEXIST`` will create a new element only if one did not already
> +  exist
> +- ``BPF_EXIST`` will update an existing element
> +
> +``bpf_map_update_elem()`` returns 0 on success, or negative error in
> +case of failure.
> +
> +.. c:function::
> +   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)
> +
> +Hash entries can be retrieved using the ``bpf_map_lookup_elem()``
> +helper. This helper returns a pointer to the value associated with
> +``key``, or ``NULL`` if no entry was found.
> +
> +.. c:function::
> +   long bpf_map_delete_elem(struct bpf_map *map, const void *key)
> +
> +Hash entries can be deleted using the ``bpf_map_delete_elem()``
> +helper. This helper will return 0 on success, or negative error in case
> +of failure.
> +
> +Per CPU Hashes
> +--------------
> +
> +For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
> +the ``bpf_map_update_elem()`` and ``bpf_map_lookup_elem()`` helpers
> +automatically access the hash slot for the current CPU.
> +
> +.. c:function::
> +   void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void  
> *key, u32 cpu)
> +
> +The ``bpf_map_lookup_percpu_elem()`` helper can be used to lookup the
> +value in the hash slot for a specific CPU. Returns value associated with
> +``key`` on ``cpu`` , or ``NULL`` if no entry was found or ``cpu`` is
> +invalid.
> +
> +Userspace
> +---------
> +
> +.. c:function::
> +   int bpf_map_get_next_key (int fd, const void *cur_key, void *next_key)
> +
> +In userspace, is possible to iterate through the keys of a hash using
> +the ``bpf_map_get_next_key()`` function. The first key can be fetched by
> +calling ``bpf_map_get_next_key()`` with ``cur_key`` set to
> +``NULL``. Subsequent calls will fetch the next key that follows the
> +current key. ``bpf_map_get_next_key()`` returns 0 on success, -ENOENT if
> +cur_key is the last key in the hash, or negative error in case of
> +failure.
> +
> +Examples
> +========
> +
> +Please see the ``tools/testing/selftests/bpf`` directory for functional
> +examples.  This sample code demonstrates API usage.

[..]

> +Kernel
> +------
> +
> +.. code-block:: c
> +
> +    #include <linux/bpf.h>
> +    #include <bpf/bpf_helpers.h>
> +
> +    struct key {
> +        __u32 srcip;
> +    };
> +
> +    struct value {
> +        __u64 packets;
> +        __u64 bytes;
> +    };
> +
> +    struct {
> +            __uint(type, BPF_MAP_TYPE_LRU_HASH);
> +            __uint(max_entries, 32);
> +            __type(key, struct key);
> +            __type(value, struct value);
> +    } packet_stats SEC(".maps");
> +
> +    static inline void count_by_srcip(__u32 srcip, int bytes)
> +    {
> +            struct key key = {
> +                    .srcip = srcip
> +            };
> +            struct value *value = bpf_map_lookup_elem(&packet_stats,  
> &key);
> +            if (value) {
> +                    __sync_fetch_and_add(&value->packets, 1);
> +                    __sync_fetch_and_add(&value->bytes, bytes);
> +            } else {
> +                    struct value newval = { 1, bytes };
> +                    bpf_map_update_elem(&packet_stats, &key, &newval,  
> BPF_NOEXIST);
> +            }
> +    }
> +
> +Userspace
> +---------
> +
> +.. code-block:: c
> +
> +    #include <bpf/libbpf.h>
> +    #include <bpf/bpf.h>
> +
> +    static void print_values(int map_fd)
> +    {
> +            struct key *cur_key = NULL;
> +            struct key next_key;
> +            int next;
> +            do {
> +                    next = bpf_map_get_next_key(stats_fd, cur_key,  
> &next_key);
> +                    if (next == -ENOENT)
> +                            break;
> +                    if (next < 0) {
> +                            fprintf(stderr, "bpf_map_get_next_key %d  
> returned %s\n", stats_fd, strerror(-next));
> +                            break;
> +                    }
> +
> +                    struct in_addr src_addr = {
> +                            .s_addr = next_key.srcip
> +                    };
> +                    char *src_ip = inet_ntoa(src_addr);
> +
> +                    struct value value;
> +                    int ret = bpf_map_lookup_elem(stats_fd, &next_key,  
> &value);
> +                    if (ret < 0) {
> +                            fprintf(stderr, "Failed to lookup elem with  
> key %s: %s\n", src_ip, strerror(-ret));
> +                            break;
> +                    }
> +                    printf("%s: %lld packets, %lld bytes\n", src_ip,  
> value.packets, value.bytes);
> +                    cur_key = &next_key;
> +            } while (next == 0);
> +    }

Instead of adding c code, maybe add pointers to specific file within
tools/testing/selftests/bpf/progs ? That's what we've done for
prog_cgroup_sockopt; the actual tests are a bit more maintained than
the doc :-)

  reply	other threads:[~2022-07-13 23:30 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-13 21:16 [PATCH] bpf, docs: document BPF_MAP_TYPE_HASH and variants Donald Hunter
2022-07-13 23:30 ` sdf [this message]
2022-07-14 22:20   ` Donald Hunter
2022-07-14  1:12 ` Bagas Sanjaya
2022-07-14  5:51   ` Daniel Müller
2022-07-14  8:10     ` Bagas Sanjaya
2022-07-14 21:53       ` Donald Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ys9VhwanEB/T8/Ue@google.com \
    --to=sdf@google.com \
    --cc=bpf@vger.kernel.org \
    --cc=corbet@lwn.net \
    --cc=donald.hunter@gmail.com \
    --cc=linux-doc@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.