From: Amery Hung <ameryhung@gmail.com>
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org,
alexei.starovoitov@gmail.com, martin.lau@kernel.org,
ameryhung@gmail.com, kernel-team@meta.com
Subject: [RFC PATCH 0/4] uptr KV store
Date: Thu, 20 Mar 2025 14:40:54 -0700
Message-ID: <20250320214058.2946857-1-ameryhung@gmail.com>
Hi all,
I'd like to discuss the uptr KV store at LSFMMBPF'25. This is just an RFC
and the code definitely needs more work, but I hope it conveys the
high-level idea.
* Overview *
The uptr KV store implements a key-value store based on existing bpf
features with one small change to the bpf verifier code. One motivation
for this work is to make it easier to roll out a new bpf program that
changes map value layouts. Currently, there is no simple way to do
this. One may create a new map that has the new map value layout,
copy the old values into it, and then start the new program.
However, this process is not trivial to automate. The uptr KV store
provides an alternative. By replacing a structure in a map with a KV
store holding multiple key-value pairs, changing the map layout becomes
just adding/deleting key-value pairs. In addition, by maintaining a
manifest of key-value pairs, the rollout process can be easily automated.
* Design & implementation *
- User space and bpf API
The uptr KV store provides a basic user space and bpf API (get/put/
delete). In addition, there are APIs for managing space that are only
provided to user space. It is assumed that all keys are known
before deploying a bpf program. To use it, the user space program first
needs to initialize the KV store by calling kv_store_init() and
initialize all key-value pairs using kv_store_put(). Then, the bpf
program and user space program can start using the KV store.
- Single global map lookup in KV store bpf API
Making the KV store performant enough for use in bpf programs on hot
paths is one of the design goals. The key to achieving this is to keep
the number of map lookups to a minimum. The current implementation only
requires one task local storage lookup per program invocation. After
that, get/put/delete only involve memory accesses in the uptr regions in
the local storage.
The uptr KV store mainly consists of two uptr regions, metadata and data.
The metadata region is an array of entries indexed by key, where each
entry contains the page index, page offset, and size of the key-value
pair. The data region consists of pages allocated on demand for storing
values; one page is allocated initially.
If string keys are desired, the metadata can be moved to a bpf
hashmap indexed by string keys. However, this will add one hashmap
lookup to every basic KV store operation.
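To make the metadata/data split concrete, here is a minimal user-space
sketch of how a get could resolve a key once the two regions are at hand.
It is not the code from the patches: kv_store_sketch, kv_get_sketch(), the
error codes, and the use of uint32_t in place of __u32 are all assumptions
made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KV_PAGE_SIZE 4096
#define KV_MAX_KEYS  1024
#define KV_MAX_PAGES 8

/* Mirrors the fields described above; uint32_t stands in for __u32. */
struct kv_meta {
	uint32_t page_idx:3;
	uint32_t page_off:12;
	uint32_t size:12;
	uint32_t init:1;
};

/* Hypothetical container for the two regions. */
struct kv_store_sketch {
	struct kv_meta meta[KV_MAX_KEYS];	/* metadata region, indexed by key */
	uint8_t *pages[KV_MAX_PAGES];		/* data region, NULL until allocated */
};

/* Copy the value for `key` into `out`; returns its size or a negative errno. */
static int kv_get_sketch(const struct kv_store_sketch *s, uint32_t key,
			 void *out, size_t out_len)
{
	const struct kv_meta *m;

	if (key >= KV_MAX_KEYS)
		return -EINVAL;
	m = &s->meta[key];
	if (!m->init)
		return -ENOENT;
	if (!s->pages[m->page_idx])
		return -EFAULT;
	if (m->page_off + m->size > KV_PAGE_SIZE)
		return -EFAULT;	/* assumed: values never straddle pages */
	if (m->size > out_len)
		return -E2BIG;
	/* No map lookup here: only array indexing and a memory copy. */
	memcpy(out, s->pages[m->page_idx] + m->page_off, m->size);
	return (int)m->size;
}
```

The point of the sketch is that, after the single local storage lookup
yields the two regions, a get is just an array index plus a copy.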
- 1K max int keys; 1B - 4KB value
The KV store is indexed by integer keys and the maximum number of keys
supported is 1K. This is limited by the number of metadata entries, as
shown below, that can be stored in a 4KB uptr metadata array: each entry
occupies 4 bytes, so 4KB / 4B = 1K entries. The largest size of a value
is also bounded at 4KB for the same reason.
struct kv_store_meta {
	__u32 page_idx:3;	/* index of the data page */
	__u32 page_off:12;	/* byte offset within the page */
	__u32 size:12;		/* size of the value */
	__u32 init:1;
};
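The limits follow directly from the bit widths. The sketch below spells
out the arithmetic in user-space C; kv_meta_sketch and the kv_max_*()
helpers are hypothetical names, uint32_t stands in for __u32, and a 4KB
page is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KV_PAGE_SIZE 4096u	/* assumed page size */

/* Same fields as kv_store_meta, with uint32_t standing in for __u32. */
struct kv_meta_sketch {
	uint32_t page_idx:3;
	uint32_t page_off:12;
	uint32_t size:12;
	uint32_t init:1;
};

/* 28 bits of fields pack into one 32-bit word with GCC and Clang. */
_Static_assert(sizeof(struct kv_meta_sketch) == 4, "entry should be 4 bytes");

/* Number of entries that fit in one 4KB metadata array. */
static inline size_t kv_max_keys(void)
{
	return KV_PAGE_SIZE / sizeof(struct kv_meta_sketch);
}

/* A 3-bit page index can address this many data pages. */
static inline unsigned int kv_max_pages(void)
{
	return 1u << 3;
}

/* Largest offset a 12-bit page_off can express. */
static inline unsigned int kv_max_page_off(void)
{
	return (1u << 12) - 1;
}
```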
- Growable storage backed by uptr
To accommodate future storage space needed for new key-value
pairs, or to temporarily store old and new key-value pairs at the same
time during a transition, the KV store is growable. This is possible
because the KV store leverages uptr, which can be allocated in user space
on demand. The current implementation supports 8 pages, but this may be
changed if it is too big or too small.
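One way to picture on-demand growth is a bump allocator that adds a page
only when the current one cannot hold the next value. The sketch below is
an assumption-laden user-space model, not the selftest code:
kv_data_sketch and kv_reserve_sketch() are hypothetical, values are
assumed never to straddle pages, and calloc() stands in for whatever
page-aligned allocation and uptr registration a real implementation needs.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define KV_PAGE_SIZE 4096
#define KV_MAX_PAGES 8

/* Hypothetical bookkeeping for the data region. */
struct kv_data_sketch {
	uint8_t *pages[KV_MAX_PAGES];
	unsigned int nr_pages;	/* pages allocated so far */
	unsigned int cur_off;	/* next free byte in the last page */
};

/*
 * Reserve `size` bytes for a value. On success, returns 0 and reports
 * where the value lives; otherwise returns a negative errno.
 */
static int kv_reserve_sketch(struct kv_data_sketch *d, size_t size,
			     unsigned int *page_idx, unsigned int *page_off)
{
	if (size == 0 || size > KV_PAGE_SIZE)
		return -EINVAL;

	/* Grow only when there is no page yet or the value does not fit. */
	if (d->nr_pages == 0 || d->cur_off + size > KV_PAGE_SIZE) {
		uint8_t *p;

		if (d->nr_pages == KV_MAX_PAGES)
			return -ENOSPC;
		p = calloc(1, KV_PAGE_SIZE);
		if (!p)
			return -ENOMEM;
		d->pages[d->nr_pages++] = p;
		d->cur_off = 0;
	}

	*page_idx = d->nr_pages - 1;
	*page_off = d->cur_off;
	d->cur_off += (unsigned int)size;
	return 0;
}
```

The returned page index and offset are what would be recorded in the
key's metadata entry. This simple scheme never reuses freed space, which
a real implementation would likely want to handle.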
- Variable-size data copy with dynptr
get/put involve copying variable-size data between the uptr region and
the stack. Since llvm does not support emitting bytecode for memcpy with
a variable size, a byte-by-byte copy in a for loop needs to be used.
This can be improved by using dynptr. Please refer to patch 1 for
details.
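For reference, the byte-by-byte fallback might look roughly like the
sketch below. It is written as plain C so it can be compiled anywhere;
kv_copy_sketch() and KV_MAX_VALUE_SIZE are hypothetical names, and the
constant loop bound is an assumption about what keeps such a loop easy
for the verifier to bound, not something taken from the patches.

```c
#include <stddef.h>
#include <stdint.h>

#define KV_MAX_VALUE_SIZE 4096	/* assumed upper bound on a value */

/*
 * Copy `len` bytes one at a time. The loop has a constant upper bound
 * and exits early once `len` bytes are copied.
 */
static inline void kv_copy_sketch(void *dst, const void *src, uint32_t len)
{
	uint8_t *d = dst;
	const uint8_t *s = src;
	uint32_t i;

	for (i = 0; i < KV_MAX_VALUE_SIZE; i++) {
		if (i >= len)
			break;
		d[i] = s[i];
	}
}
```

The cost of this loop grows linearly with the value size, which is the
overhead the dynptr approach aims to reduce.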
* Todo *
- Allocate smaller chunks of memory and grow on demand
Amery Hung (4):
bpf: Allow creating dynptr from uptr
selftests/bpf: Implement basic uptr KV store
selftests/bpf: Test basic uptr KV store operations from user space and
bpf
selftests/bpf: Test changing KV store value layout
include/uapi/linux/bpf.h | 4 +-
kernel/bpf/verifier.c | 3 +-
.../bpf/prog_tests/test_uptr_kv_store.c | 154 ++++++++++
.../selftests/bpf/prog_tests/uptr_kv_store.c | 282 ++++++++++++++++++
.../selftests/bpf/prog_tests/uptr_kv_store.h | 22 ++
.../selftests/bpf/progs/test_uptr_kv_store.c | 46 +++
.../bpf/progs/test_uptr_kv_store_v1.c | 46 +++
.../selftests/bpf/progs/uptr_kv_store.h | 120 ++++++++
.../selftests/bpf/test_uptr_kv_store_common.h | 22 ++
.../selftests/bpf/uptr_kv_store_common.h | 47 +++
10 files changed, 744 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_uptr_kv_store.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/uptr_kv_store.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/uptr_kv_store.h
create mode 100644 tools/testing/selftests/bpf/progs/test_uptr_kv_store.c
create mode 100644 tools/testing/selftests/bpf/progs/test_uptr_kv_store_v1.c
create mode 100644 tools/testing/selftests/bpf/progs/uptr_kv_store.h
create mode 100644 tools/testing/selftests/bpf/test_uptr_kv_store_common.h
create mode 100644 tools/testing/selftests/bpf/uptr_kv_store_common.h
--
2.47.1