netdev.vger.kernel.org archive mirror
* [PATCH bpf-next v5 0/3] Task local data
@ 2025-06-27 23:39 Amery Hung
  2025-06-27 23:39 ` [PATCH bpf-next v5 1/3] selftests/bpf: Introduce task " Amery Hung
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Amery Hung @ 2025-06-27 23:39 UTC (permalink / raw)
  To: bpf
  Cc: netdev, alexei.starovoitov, andrii, daniel, tj, memxor,
	martin.lau, ameryhung, kernel-team

* Motivation *

CPU schedulers can potentially make better decisions with hints from
user space processes. To support experimenting with user space hinting
in sched_ext, there needs to be a mechanism to pass a "per-task hint"
from user space to the bpf scheduler "efficiently".

The proposed mechanism is task local data. Similar to pthread keys or
__thread, it allows users to define thread-specific data. In addition,
the user pages that back task local data are pinned by the kernel and
shared with bpf programs directly. As a result, user space programs
can directly update per-thread hints, and bpf programs can then read
the hints with little overhead. The diagram in the design section gives
a sneak peek of how it works.


* Overview *

Task local data defines an abstract storage type for storing data specific
to each task and provides user space and bpf libraries to access it. The
result is a fast and easy way to share per-task data between user space
and bpf programs. The intended use case is sched_ext, where user space
programs will pass hints to sched_ext bpf programs to affect task
scheduling.

Task local data is built on top of task local storage map and UPTR[0]
to achieve fast per-task data sharing. UPTR is a type of special field
supported in task local storage map value. A user page assigned to a UPTR
will be pinned by the kernel when the map is updated. Therefore, user
space programs can update data seen by bpf programs without syscalls.

Additionally, unlike most bpf maps, task local data does not require a
static map value definition. This design is driven by sched_ext, which
would like to allow multiple developers to share a storage without the
need to explicitly agree on the layout of it. While a centralized layout
definition would have worked, the friction of synchronizing it across
different repos is not desirable. This simplifies code base management and
makes experimenting easier.

In the rest of the cover letter, "task local data" is used to refer to
the abstract storage and TLD is used to denote a single data entry in
the storage.


* Design *

The task local data library provides simple APIs for user space and bpf
through two header files, task_local_data.h and task_local_data.bpf.h,
respectively. The usage is illustrated in the following diagram.
An entry of data in the task local data, a TLD, first needs to be defined
with TLD_DEFINE_KEY(), given the size of the data and a name associated
with it. The macro defines and initializes an opaque key object of
tld_key_t type, which can be used to locate a TLD. The same key may be
passed to tld_get_data() in different threads, and a pointer to data
specific to the calling thread will be returned. The pointer remains
valid until the calling thread's data is freed (see tld_free() in
patch 1), so there is no need to call tld_get_data() again for
subsequent accesses.

TLDs defined with TLD_DEFINE_KEY() may use up to roughly a page in total.
In the case when a TLD can only be known and created on the fly,
tld_create_key() can be called. Since the total size of such TLDs cannot
be known beforehand, a memory area of size TLD_DYN_DATA_SIZE is allocated
for each thread to accommodate them.
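
As a rough illustration of the two budgets, here is a standalone sketch of
the size check that patch 1 performs before a TLD is added. The names
sketch_fits(), SKETCH_PAGE_SIZE and SKETCH_HDR_SIZE are made up for this
example and are not part of the library; a 4096-byte page is assumed, and
the 8-byte header stands in for u_tld_data::start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */
#define SKETCH_HDR_SIZE  8u	/* stands in for the u64 'start' field */

/* Round x up to the next multiple of 8, as the library does for every TLD. */
static size_t sketch_round_up8(size_t x)
{
	return (x + 7) & ~(size_t)7;
}

/*
 * Return true if a TLD of 'size' bytes fits at offset 'off'.
 *
 * 'budget' plays the role of the running total kept in the metadata:
 * TLD_DYN_DATA_SIZE plus the rounded sizes of statically defined TLDs.
 * Dynamic keys are checked against that budget, static keys against
 * the page itself.
 */
static bool sketch_fits(size_t off, size_t size, size_t budget, bool dynamic)
{
	size_t end;

	assert(off <= SKETCH_PAGE_SIZE);	/* offsets stay within one page */
	end = off + sketch_round_up8(size);

	if (dynamic)
		return end <= budget;
	return end <= SKETCH_PAGE_SIZE - SKETCH_HDR_SIZE;
}
```

With the default 64-byte dynamic budget and no static TLDs, one 64-byte
dynamic TLD fits at offset 0, but a second one at offset 8 would not.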

On the bpf side, programs also use tld_get_data() to locate TLDs. The
arguments contain a name and a key to a TLD. The name is used for the
first tld_get_data() call on a TLD, which looks up the TLD by name and
saves the corresponding key to a task local storage map, tld_key_map.
The map value type, struct tld_keys, __must__ be defined by developers.
It should contain the keys used in the compilation unit.


 ┌─ Application ───────────────────────────────────────────────────────┐
 │ TLD_DEFINE_KEY(kx, "X", 4);      ┌─ library A ─────────────────────┐│
 │                                  │ void func(...)                  ││
 │ int main(...)                    │ {                               ││
 │ {                                │     tld_key_t ky;               ││
 │      int *x;                     │     bool *y;                    ││
 │                                  │                                 ││
 │      x = tld_get_data(fd, kx);   │     ky = tld_create_key("Y", 1);││
 │      if (x) *x = 123;            │     y = tld_get_data(fd, ky);   ││
 │                         ┌────────┤     if (y) *y = true;           ││
 │                         │        └─────────────────────────────────┘│
 └───────┬─────────────────│───────────────────────────────────────────┘
         V                 V
 + ─ Task local data ─ ─ ─ ─ ─ +  ┌─ BPF program ──────────────────────┐
 | ┌─ tld_data_map ──────────┐ |  │ struct tld_object obj;             │
 | │ BPF Task local storage  │ |  │ bool *y;                           │
 | │                         │ |  │ int *x;                            │
 | │ __uptr *data            │ |  │                                    │
 | │ __uptr *metadata        │ |  │ if (tld_object_init(task, &obj))   │
 | └─────────────────────────┘ |  │     return 0;                      │
 | ┌─ tld_key_map ───────────┐ |  │                                    │
 | │ BPF Task local storage  │ |  │ x = tld_get_data(&obj, kx, "X", 4);│
 | │                         │ |<─┤ if (x) /* do something */          │
 | │ tld_key_t kx;           │ |  │                                    │
 | │ tld_key_t ky;           │ |  │ y = tld_get_data(&obj, ky, "Y", 1);│
 | └─────────────────────────┘ |  │ if (y) /* do something */          │
 + ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ +  └────────────────────────────────────┘
 


* Implementation *

Task local data defines the storage to be a task local storage map with
two UPTRs, data and metadata. Data points to a per-task blob of memory
that stores the TLDs, and records the offset of that blob within its
page. Metadata, which belongs to a process and is shared by its threads,
records the total number and size of TLDs and the metadata of each TLD.
The metadata of a TLD contains the key name and the size of the TLD.

  struct u_tld_data {
          u64 start;
          char data[PAGE_SIZE - 8];
  };

  struct u_tld_metadata {
          u8 cnt;
          u16 size;
          struct metadata data[TLD_DATA_CNT]; 
  };

Both user space and bpf APIs follow the same protocol when accessing
task local data. A pointer to a TLD is located by a key. tld_key_t
effectively is the offset of a TLD in data. To add a TLD, the user space
API loops through metadata->data until an empty slot is found and updates
it. It also adds up the sizes of prior TLDs along the way to derive the
offset. To locate a TLD in bpf the first time tld_get_data() is called,
__tld_fetch_key() also loops through metadata->data until the name is
found. The offset is again derived by adding up sizes. When the TLD is
not found, the current TLD count is cached instead so that name
comparisons that have already been done are skipped next time. The
details of task local data operations can be found in patch 1.
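
The lookup described above can be sketched in plain C as follows. This is
a simplified user space model of what __tld_fetch_key() does, not the bpf
code itself: struct sketch_meta and sketch_fetch_key() are hypothetical
names, and 'start' stands for the offset of the data area within its page.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_NAME_LEN 62
#define SKETCH_MAX_CNT  63

struct sketch_meta {
	char name[SKETCH_NAME_LEN];
	unsigned short size;
};

/*
 * Walk the first 'cnt' metadata entries. Sizes are summed (rounded up
 * to 8 bytes) for every entry, but names are only compared from index
 * 'next' onwards, since earlier entries were already ruled out.
 *
 * Returns start + offset of the TLD if found, or -cnt so the caller can
 * cache how many entries have been searched so far.
 */
static int sketch_fetch_key(const struct sketch_meta *meta, int cnt, int start,
			    const char *name, int next)
{
	int i, off = 0;

	assert(cnt >= 0 && cnt <= SKETCH_MAX_CNT);

	for (i = 0; i < cnt; i++) {
		if (i >= next && !strncmp(meta[i].name, name, SKETCH_NAME_LEN))
			return start + off;

		off += (meta[i].size + 7) & ~7;
	}

	return -cnt;
}
```

A positive return value is an offset that can be cached as the key. A
negative one records the count to resume from, which is why a miss does
not need to repeat name comparisons on the next call.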


* Misc *

The metadata can potentially use run-length encoding for names to reduce
memory wastage and support storing more TLDs. I have a version that works,
but the selftest takes a bit longer to finish. More investigation is
needed to find the root cause. I will save this for the future when there
is a need to store more than 63 TLDs.
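
For reference, the limit of 63 follows from the default TLD_NAME_LEN of 62
and an assumed 4096-byte page, as the small sketch below works out. The
names sketch_meta and sketch_max_data_cnt() are made up for illustration;
the formula mirrors TLD_MAX_DATA_CNT in patch 1.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NAME_LEN 62	/* default TLD_NAME_LEN */

/* Same layout as one per-TLD metadata entry: a name plus a 16-bit size. */
struct sketch_meta {
	char name[SKETCH_NAME_LEN];
	unsigned short size;
};

/*
 * Number of metadata entries that fit in one page. The "- 1" presumably
 * leaves room for the cnt and size fields that precede the array.
 */
static size_t sketch_max_data_cnt(size_t page_size)
{
	assert(page_size >= sizeof(struct sketch_meta));
	return page_size / sizeof(struct sketch_meta) - 1;
}
```

Each entry takes 64 bytes, so a 4096-byte page gives 4096 / 64 - 1 = 63
entries. A shorter TLD_NAME_LEN would raise the limit accordingly.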


[0] https://lore.kernel.org/bpf/20241023234759.860539-1-martin.lau@linux.dev/

---

v4 -> v5
  - Add an option to free memory on thread exit to prevent memory leak
  - Add an option to reduce memory waste if the allocator can
    use just enough memory to fulfill aligned_alloc() (e.g., glibc)
  - Tweak bpf API
      - Remove tld_fetch_key() as it does not work in init_task
      - tld_get_data() now tries to fetch key if it is not cached yet
  - Optimize bpf side tld_get_data()
      - Faster fast path
      - Less code
  - Use stdatomic.h in user space library with seq_cst order
  - Introduce TLD_DEFINE_KEY() as the default TLD creation API for
    easier memory management.
      - TLD_DEFINE_KEY() can consume memory up to a page and no memory
        is wasted since their size is known before per-thread data
        allocation.
      - tld_create_key() can only use up to TLD_DYN_DATA_SIZE. Since
        tld_create_key can run any time even after per-thread data
        allocation, it is impossible to predict the total size. A
        configurable size of memory is allocated on top of the total
        size of TLD_DEFINE_KEY() to accommodate dynamic key creation.
  - Add tld prefix to all macros
  - Replace map_update(NO_EXIST) in __tld_init_data() with cmpxchg()
  - No more +1,-1 dance on the bpf side
  - Reduce printf from ASSERT in race test
  - Tried implementing run-length encoding for names and decided to
    save it for the future
  V4: https://lore.kernel.org/bpf/20250515211606.2697271-1-ameryhung@gmail.com/

v3 -> v4
  - API improvements
      - Simplify API
      - Drop string obfuscation
      - Use opaque type for key
      - Better documentation
  - Implementation
      - Switch to dynamic allocation for per-task data
      - Now offer as header-only libraries
      - No TLS map pinning; leave it to users
  - Drop pthread dependency
  - Add more invalid tld_create_key() test
  - Add a race test for tld_create_key()
  v3: https://lore.kernel.org/bpf/20250425214039.2919818-1-ameryhung@gmail.com/

Amery Hung (3):
  selftests/bpf: Introduce task local data
  selftests/bpf: Test basic task local data operations
  selftests/bpf: Test concurrent task local data key creation

 .../bpf/prog_tests/task_local_data.h          | 397 ++++++++++++++++++
 .../bpf/prog_tests/test_task_local_data.c     | 294 +++++++++++++
 .../selftests/bpf/progs/task_local_data.bpf.h | 232 ++++++++++
 .../bpf/progs/test_task_local_data.c          |  65 +++
 4 files changed, 988 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/task_local_data.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
 create mode 100644 tools/testing/selftests/bpf/progs/task_local_data.bpf.h
 create mode 100644 tools/testing/selftests/bpf/progs/test_task_local_data.c

-- 
2.47.1



* [PATCH bpf-next v5 1/3] selftests/bpf: Introduce task local data
  2025-06-27 23:39 [PATCH bpf-next v5 0/3] Task local data Amery Hung
@ 2025-06-27 23:39 ` Amery Hung
  2025-07-01 22:02   ` Andrii Nakryiko
  2025-06-27 23:39 ` [PATCH bpf-next v5 2/3] selftests/bpf: Test basic task local data operations Amery Hung
  2025-06-27 23:39 ` [PATCH bpf-next v5 3/3] selftests/bpf: Test concurrent task local data key creation Amery Hung
  2 siblings, 1 reply; 8+ messages in thread
From: Amery Hung @ 2025-06-27 23:39 UTC (permalink / raw)
  To: bpf
  Cc: netdev, alexei.starovoitov, andrii, daniel, tj, memxor,
	martin.lau, ameryhung, kernel-team

Task local data defines an abstract storage type for storing task-
specific data (TLD). This patch provides user space and bpf
implementation as header-only libraries for accessing task local data.

Task local data is a bpf task local storage map with two UPTRs:
1) u_tld_metadata, shared by all tasks of the same process, consists of
   the total count of TLDs and an array of metadata of TLDs. The metadata
   of a TLD comprises the size and the name. The name is used to identify
   a specific TLD in bpf.
2) u_tld_data points to a task-specific memory region for storing TLDs.

Below are the core task local data APIs:

                     User space                           BPF
Define TLD    TLD_DEFINE_KEY(), tld_create_key()           -
Get data           tld_get_data()                    tld_get_data()

A TLD is first defined by the user space with TLD_DEFINE_KEY() or
tld_create_key(). TLD_DEFINE_KEY() defines a TLD statically and allocates
just enough memory during initialization. tld_create_key() allows
creating TLDs on the fly, but has a fixed memory budget, TLD_DYN_DATA_SIZE.
Internally, they both walk the metadata array to check if the TLD can
be added. The total TLD size needs to fit into a page (limited by UPTR),
and no two TLDs can have the same name. The walk also calculates the
offset, the next available space in u_tld_data, by summing the sizes of
existing TLDs. If the TLD can be added, the count is increased using
cmpxchg as there may be other concurrent tld_create_key() calls. After a
successful cmpxchg, the last metadata slot belongs to the calling thread
and will be updated. tld_create_key() returns the offset encapsulated as
an opaque key object to prevent user misuse.
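
The slot-claiming step can be modelled with C11 atomics as in the
following sketch. It is a trimmed-down illustration, not the code in this
patch: struct sketch_table and sketch_create_key() are hypothetical names,
the size budget check is left out for brevity, and the -1 and -2 return
codes are placeholders for -EEXIST and -ENOSPC.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <string.h>

#define SKETCH_NAME_LEN 62
#define SKETCH_MAX_CNT  63

struct sketch_meta {
	char name[SKETCH_NAME_LEN];
	_Atomic unsigned short size;	/* 0 until the entry is published */
};

struct sketch_table {
	_Atomic unsigned char cnt;
	struct sketch_meta meta[SKETCH_MAX_CNT];
};

/* Returns the offset of the new TLD, -1 if the name exists, -2 if full. */
static int sketch_create_key(struct sketch_table *t, const char *name,
			     unsigned short size)
{
	unsigned short sz;
	unsigned char cnt;
	int i, off = 0;

	assert(size > 0);	/* 0 is reserved for "not published yet" */

	for (i = 0; i < SKETCH_MAX_CNT; i++) {
retry:
		cnt = atomic_load(&t->cnt);
		if (i < cnt) {
			/* Slot i is taken: wait for its size, then check the name. */
			while (!(sz = atomic_load(&t->meta[i].size)))
				sched_yield();
			if (!strncmp(t->meta[i].name, name, SKETCH_NAME_LEN))
				return -1;
			off += (sz + 7) & ~7;
			continue;
		}

		/* Claim slot i by moving cnt from i to i + 1; losers re-examine it. */
		if (!atomic_compare_exchange_strong(&t->cnt, &cnt, cnt + 1))
			goto retry;

		strncpy(t->meta[i].name, name, SKETCH_NAME_LEN);
		atomic_store(&t->meta[i].size, size);
		return off;
	}

	return -2;
}
```

A thread that loses the cmpxchg reloads the count, finds that slot i now
belongs to someone else, and treats it like any other existing entry
before competing for the next slot.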

Then, user space can pass the key to tld_get_data() to get a pointer
to the TLD. The pointer will remain valid for the lifetime of the
thread.

BPF programs can also locate the TLD by tld_get_data(), but with both
name and key. The first time tld_get_data() is called, the name will
be used to look up the metadata. Then, the key will be saved to a
task local storage map, tld_key_map. Subsequent calls to tld_get_data()
will use the key to quickly locate the data.

The user space task local data library uses a lightweight approach to
ensure thread safety (i.e., atomic operations + compiler and memory
barriers). While a metadata entry is being updated, other threads may
also try to read it. To prevent them from seeing incomplete data,
metadata::size is used to signal the completion of the update, where 0
means the update is still ongoing. Threads wait until they see a
non-zero size before reading a metadata entry.
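
A minimal sketch of this publish protocol, assuming C11 atomics with the
default seq_cst ordering, is shown below. sketch_publish() and
sketch_try_read() are hypothetical helpers written for this example. The
library itself spins with sched_yield() rather than returning false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_NAME_LEN 62

struct sketch_meta {
	char name[SKETCH_NAME_LEN];
	_Atomic unsigned short size;	/* 0 means the entry is still being written */
};

/* Writer: fill in the name first, then publish with a non-zero size. */
static void sketch_publish(struct sketch_meta *m, const char *name,
			   unsigned short size)
{
	assert(size != 0);
	strncpy(m->name, name, SKETCH_NAME_LEN);
	/* seq_cst store: the name is written before the size becomes visible */
	atomic_store(&m->size, size);
}

/*
 * Reader: returns false if the entry is not ready yet. Otherwise copies
 * the name into name_out (SKETCH_NAME_LEN bytes) and the size into size_out.
 */
static bool sketch_try_read(struct sketch_meta *m, char *name_out,
			    unsigned short *size_out)
{
	unsigned short sz = atomic_load(&m->size);

	if (!sz)
		return false;

	memcpy(name_out, m->name, SKETCH_NAME_LEN);
	*size_out = sz;
	return true;
}
```

Because the size is written last and read first, a reader that observes a
non-zero size is guaranteed to see the complete name as well.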

Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
 .../bpf/prog_tests/task_local_data.h          | 397 ++++++++++++++++++
 .../selftests/bpf/progs/task_local_data.bpf.h | 232 ++++++++++
 2 files changed, 629 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/task_local_data.h
 create mode 100644 tools/testing/selftests/bpf/progs/task_local_data.bpf.h

diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_data.h b/tools/testing/selftests/bpf/prog_tests/task_local_data.h
new file mode 100644
index 000000000000..08b6a389ef6d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/task_local_data.h
@@ -0,0 +1,397 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __TASK_LOCAL_DATA_H
+#define __TASK_LOCAL_DATA_H
+
+#include <errno.h>
+#include <fcntl.h>
+#include <sched.h>
+#include <stdatomic.h>
+#include <stddef.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <sys/types.h>
+
+#ifdef TLD_FREE_DATA_ON_THREAD_EXIT
+#include <pthread.h>
+#endif
+
+#include <bpf/bpf.h>
+
+/*
+ * OPTIONS
+ *
+ *   Define the option before including the header
+ *
+ *   TLD_FREE_DATA_ON_THREAD_EXIT - Frees memory on thread exit automatically
+ *
+ *   Thread-specific memory for storing TLD is allocated lazily on the first call to
+ *   tld_get_data(). The thread that calls it must also call tld_free() on thread exit
+ *   to prevent memory leak. Pthread will be included if the option is defined. A pthread
+ *   key will be registered with a destructor that calls tld_free().
+ *
+ *
+ *   TLD_DYN_DATA_SIZE - The maximum size of memory allocated for TLDs created dynamically
+ *   (default: 64 bytes)
+ *
+ *   A TLD can be defined statically using TLD_DEFINE_KEY() or created on the fly using
+ *   tld_create_key(). As the total size of TLDs created with tld_create_key() cannot be
+ *   possibly known statically, a memory area of size TLD_DYN_DATA_SIZE will be allocated
+ *   for these TLDs. This additional memory is allocated for every thread that calls
+ *   tld_get_data() even if tld_create_key() is never called, so be mindful of
+ *   potential memory wastage. Use TLD_DEFINE_KEY() whenever possible as just enough memory
+ *   will be allocated for TLDs created with it.
+ *
+ *
+ *   TLD_NAME_LEN - The maximum length of the name of a TLD (default: 62)
+ *
+ *   Setting TLD_NAME_LEN will affect the maximum number of TLDs a process can store,
+ *   TLD_MAX_DATA_CNT.
+ *
+ *
+ *   TLD_DATA_USE_ALIGNED_ALLOC - Always use aligned_alloc() instead of malloc()
+ *
+ *   When allocating the memory for storing TLDs, we need to make sure there is a memory
+ *   region of the X bytes within a page. This is due to the limit posed by UPTR: memory
+ *   pinned to the kernel cannot exceed a page nor can it cross the page boundary. The
+ *   library normally calls malloc(2*X) given X bytes of total TLDs, and only uses
+ *   aligned_alloc(PAGE_SIZE, X) when X >= PAGE_SIZE / 2. This is to reduce memory wastage
+ *   as not all memory allocators can use the exact amount of memory requested to fulfill
+ *   aligned_alloc(). For example, some may round the size up to the alignment. Enable the
+ *   option to always use aligned_alloc() if the implementation has low memory overhead.
+ */
+
+#define TLD_PIDFD_THREAD O_EXCL
+
+#define TLD_PAGE_SIZE getpagesize()
+#define TLD_PAGE_MASK (~(TLD_PAGE_SIZE - 1))
+
+#define TLD_ROUND_MASK(x, y) ((__typeof__(x))((y) - 1))
+#define TLD_ROUND_UP(x, y) ((((x) - 1) | TLD_ROUND_MASK(x, y)) + 1)
+
+#define TLD_READ_ONCE(x) (*(volatile typeof(x) *)&(x))
+
+#ifndef TLD_DYN_DATA_SIZE
+#define TLD_DYN_DATA_SIZE 64
+#endif
+
+#define TLD_MAX_DATA_CNT (TLD_PAGE_SIZE / sizeof(struct tld_metadata) - 1)
+
+#ifndef TLD_NAME_LEN
+#define TLD_NAME_LEN 62
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef struct {
+	__s16 off;
+} tld_key_t;
+
+struct tld_metadata {
+	char name[TLD_NAME_LEN];
+	_Atomic __u16 size;
+};
+
+struct u_tld_metadata {
+	_Atomic __u8 cnt;
+	__u16 size;
+	struct tld_metadata metadata[];
+};
+
+struct u_tld_data {
+	__u64 start; /* offset of u_tld_data->data in a page */
+	char data[];
+};
+
+struct tld_map_value {
+	void *data;
+	struct u_tld_metadata *metadata;
+};
+
+struct u_tld_metadata * _Atomic tld_metadata_p __attribute__((weak));
+__thread struct u_tld_data *tld_data_p __attribute__((weak));
+__thread void *tld_data_alloc_p __attribute__((weak));
+
+#ifdef TLD_FREE_DATA_ON_THREAD_EXIT
+pthread_key_t tld_pthread_key __attribute__((weak));
+
+static void tld_free(void);
+
+static void __tld_thread_exit_handler(void *unused)
+{
+	tld_free();
+}
+#endif
+
+static int __tld_init_metadata(void)
+{
+	struct u_tld_metadata *meta, *uninit = NULL;
+	int err = 0;
+
+	meta = (struct u_tld_metadata *)aligned_alloc(TLD_PAGE_SIZE, TLD_PAGE_SIZE);
+	if (!meta) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	memset(meta, 0, TLD_PAGE_SIZE);
+	meta->size = TLD_DYN_DATA_SIZE;
+
+	if (!atomic_compare_exchange_strong(&tld_metadata_p, &uninit, meta)) {
+		free(meta);
+		goto out;
+	}
+
+#ifdef TLD_FREE_DATA_ON_THREAD_EXIT
+	pthread_key_create(&tld_pthread_key, __tld_thread_exit_handler);
+#endif
+out:
+	return err;
+}
+
+static int __tld_init_data(int map_fd)
+{
+	bool use_aligned_alloc = false;
+	struct tld_map_value map_val;
+	struct u_tld_data *data;
+	int err, tid_fd = -1;
+	void *d = NULL;
+
+	tid_fd = syscall(SYS_pidfd_open, gettid(), TLD_PIDFD_THREAD);
+	if (tid_fd < 0) {
+		err = -errno;
+		goto out;
+	}
+
+#ifdef TLD_DATA_USE_ALIGNED_ALLOC
+	use_aligned_alloc = true;
+#endif
+
+	/*
+	 * tld_metadata_p->size = TLD_DYN_DATA_SIZE +
+	 *          total size of TLDs defined via TLD_DEFINE_KEY()
+	 */
+	if (use_aligned_alloc || tld_metadata_p->size >= TLD_PAGE_SIZE / 2)
+		d = aligned_alloc(TLD_PAGE_SIZE, tld_metadata_p->size);
+	else
+		d = malloc(tld_metadata_p->size * 2);
+	if (!d) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	/*
+	 * The size of tld_map_value::data is a page in bpf. If d spans two pages,
+	 * find the page that contains large enough memory and pin the start of the page
+	 * via UPTR (i.e., map_val.data). If the usable memory lies in the second page,
+	 * point tld_data_p to the start of the second page.
+	 */
+	if ((uintptr_t)d % TLD_PAGE_SIZE == 0) {
+		map_val.data = d;
+		data = d;
+		data->start = offsetof(struct u_tld_data, data);
+	} else if (TLD_PAGE_SIZE - (~TLD_PAGE_MASK & (intptr_t)d) >= tld_metadata_p->size) {
+		map_val.data = (void *)(TLD_PAGE_MASK & (intptr_t)d);
+		data = d;
+		data->start = (~TLD_PAGE_MASK & (intptr_t)d) + offsetof(struct u_tld_data, data);
+	} else {
+		map_val.data = (void *)((TLD_PAGE_MASK & (intptr_t)d) + TLD_PAGE_SIZE);
+		data = (void *)((TLD_PAGE_MASK & (intptr_t)d) + TLD_PAGE_SIZE);
+		data->start = offsetof(struct u_tld_data, data);
+	}
+
+	map_val.metadata = TLD_READ_ONCE(tld_metadata_p);
+
+	err = bpf_map_update_elem(map_fd, &tid_fd, &map_val, 0);
+	if (err) {
+		free(d);
+		goto out;
+	}
+
+	tld_data_p = (struct u_tld_data *)data;
+	tld_data_alloc_p = d;
+#ifdef TLD_FREE_DATA_ON_THREAD_EXIT
+	pthread_setspecific(tld_pthread_key, (void *)1);
+#endif
+out:
+	if (tid_fd >= 0)
+		close(tid_fd);
+	return err;
+}
+
+static tld_key_t __tld_create_key(const char *name, size_t size, bool dyn_data)
+{
+	int err, i, sz, off = 0;
+	__u8 cnt;
+
+	if (!TLD_READ_ONCE(tld_metadata_p)) {
+		err = __tld_init_metadata();
+		if (err)
+			return (tld_key_t) {.off = (__s16)err};
+	}
+
+	for (i = 0; i < TLD_MAX_DATA_CNT; i++) {
+retry:
+		cnt = atomic_load(&tld_metadata_p->cnt);
+		if (i < cnt) {
+			/* A metadata is not ready until size is updated with a non-zero value */
+			while (!(sz = atomic_load(&tld_metadata_p->metadata[i].size)))
+				sched_yield();
+
+			if (!strncmp(tld_metadata_p->metadata[i].name, name, TLD_NAME_LEN))
+				return (tld_key_t) {.off = -EEXIST};
+
+			off += TLD_ROUND_UP(sz, 8);
+			continue;
+		}
+
+		/*
+		 * TLD_DEFINE_KEY() is given memory up to a page while at most
+		 * TLD_DYN_DATA_SIZE is allocated for tld_create_key()
+		 */
+		if (dyn_data) {
+			if (off + TLD_ROUND_UP(size, 8) > tld_metadata_p->size)
+				return (tld_key_t) {.off = -E2BIG};
+		} else {
+			if (off + TLD_ROUND_UP(size, 8) > TLD_PAGE_SIZE - sizeof(struct u_tld_data))
+				return (tld_key_t) {.off = -E2BIG};
+			tld_metadata_p->size += TLD_ROUND_UP(size, 8);
+		}
+
+		/*
+		 * Only one tld_create_key() can increase the current cnt by one and
+		 * take the latest available slot. Other threads will check again if a new
+		 * TLD can still be added, and then compete for the new slot after the
+		 * succeeding thread updates the size.
+		 */
+		if (!atomic_compare_exchange_strong(&tld_metadata_p->cnt, &cnt, cnt + 1))
+			goto retry;
+
+		strncpy(tld_metadata_p->metadata[i].name, name, TLD_NAME_LEN);
+		atomic_store(&tld_metadata_p->metadata[i].size, size);
+		return (tld_key_t) {.off = (__s16)off};
+	}
+
+	return (tld_key_t) {.off = -ENOSPC};
+}
+
+/**
+ * TLD_DEFINE_KEY() - Defines a TLD and a file-scope key associated with the TLD.
+ *
+ * @key: The variable name of the key
+ * @name: The name of the TLD. Cannot exceed TLD_NAME_LEN
+ * @size: The size of the TLD
+ *
+ * The macro can only be used in file scope.
+ *
+ * A file-scope key of opaque type, tld_key_t, will be declared and initialized before
+ * main() starts. Use tld_key_is_err() or tld_key_err_or_zero() later to check if the key
+ * creation succeeded. Pass the key to tld_get_data() to get a pointer to the TLD.
+ * bpf programs can also fetch the same key by name.
+ *
+ * The total size of TLDs created using TLD_DEFINE_KEY() cannot exceed a page. Just
+ * enough memory will be allocated for each thread on the first call to tld_get_data().
+ */
+#define TLD_DEFINE_KEY(key, name, size)			\
+tld_key_t key;						\
+							\
+__attribute__((constructor))				\
+void __tld_define_key_##key(void)			\
+{							\
+	key = __tld_create_key(name, size, false);	\
+}
+
+/**
+ * tld_create_key() - Creates a TLD and returns a key associated with the TLD.
+ *
+ * @name: The name of the TLD
+ * @size: The size of the TLD
+ *
+ * Returns an opaque object key. Use tld_key_is_err() or tld_key_err_or_zero() to check
+ * if the key creation succeeded. Pass the key to tld_get_data() to get a pointer to
+ * locate the TLD. bpf programs can also fetch the same key by name.
+ *
+ * Use tld_create_key() only when @name is not known statically or a TLD needs to
+ * be created conditionally.
+ *
+ * An additional TLD_DYN_DATA_SIZE bytes are allocated per-thread to accommodate TLDs
+ * created dynamically with tld_create_key(). Since only a user page is pinned to the
+ * kernel, when TLDs created with TLD_DEFINE_KEY() use more than TLD_PAGE_SIZE -
+ * TLD_DYN_DATA_SIZE, the buffer size will be limited to the rest of the page.
+ */
+__attribute__((unused))
+static tld_key_t tld_create_key(const char *name, size_t size)
+{
+	return __tld_create_key(name, size, true);
+}
+
+__attribute__((unused))
+static inline bool tld_key_is_err(tld_key_t key)
+{
+	return key.off < 0;
+}
+
+__attribute__((unused))
+static inline int tld_key_err_or_zero(tld_key_t key)
+{
+	return tld_key_is_err(key) ? key.off : 0;
+}
+
+/**
+ * tld_get_data() - Gets a pointer to the TLD associated with the given key of the
+ * calling thread.
+ *
+ * @map_fd: A file descriptor of tld_data_map, the underlying BPF task local storage map
+ * of task local data.
+ * @key: A key object created by TLD_DEFINE_KEY() or tld_create_key().
+ *
+ * Returns a pointer to the TLD if the key is valid; NULL if not enough memory for TLD
+ * for this thread, or the key is invalid. The returned pointer is guaranteed to be 8-byte
+ * aligned.
+ *
+ * Threads that call tld_get_data() must call tld_free() on exit to prevent
+ * memory leak if TLD_FREE_DATA_ON_THREAD_EXIT is not defined.
+ */
+__attribute__((unused))
+static void *tld_get_data(int map_fd, tld_key_t key)
+{
+	if (!TLD_READ_ONCE(tld_metadata_p))
+		return NULL;
+
+	/*
+	 * tld_data_p is allocated on the first invocation of tld_get_data()
+	 * for a thread that has not called tld_create_key()
+	 */
+	if (!tld_data_p && __tld_init_data(map_fd))
+		return NULL;
+
+	return tld_data_p->data + key.off;
+}
+
+/**
+ * tld_free() - Frees task local data memory of the calling thread
+ *
+ * For the calling thread, all pointers to TLDs acquired before will become invalid.
+ *
+ * Users must call tld_free() on thread exit to prevent memory leak. Or, define
+ * TLD_FREE_DATA_ON_THREAD_EXIT and let the library call tld_free() automatically
+ * when threads exit.
+ */
+__attribute__((unused))
+static void tld_free(void)
+{
+	if (tld_data_alloc_p) {
+		free(tld_data_alloc_p);
+		tld_data_alloc_p = NULL;
+		tld_data_p = NULL;
+	}
+}
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif /* __TASK_LOCAL_DATA_H */
diff --git a/tools/testing/selftests/bpf/progs/task_local_data.bpf.h b/tools/testing/selftests/bpf/progs/task_local_data.bpf.h
new file mode 100644
index 000000000000..ecfd6a86c6f5
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/task_local_data.bpf.h
@@ -0,0 +1,232 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __TASK_LOCAL_DATA_BPF_H
+#define __TASK_LOCAL_DATA_BPF_H
+
+/*
+ * Task local data is a library that facilitates sharing per-task data
+ * between user space and bpf programs.
+ *
+ *
+ * USAGE
+ *
+ * A TLD, an entry of data in task local data, first needs to be created by the
+ * user space. This is done by calling user space API, TLD_DEFINE_KEY() or
+ * tld_create_key(), with the name of the TLD and the size.
+ *
+ * TLD_DEFINE_KEY(prio, "priority", sizeof(int));
+ *
+ * or
+ *
+ * void func_call(...) {
+ *     tld_key_t prio, in_cs;
+ *
+ *     prio = tld_create_key("priority", sizeof(int));
+ *     in_cs = tld_create_key("in_critical_section", sizeof(bool));
+ *     ...
+ *
+ * A key associated with the TLD, which has an opaque type tld_key_t, will be
+ * returned. It can be used to get a pointer to the TLD in the user space by
+ * calling tld_get_data().
+ *
+ * In a bpf program, tld_object_init() first needs to be called to initialize a
+ * tld_object on the stack. Then, TLDs can be accessed by calling tld_get_data().
+ * The API will try to fetch the key by the name and use it to locate the data.
+ * A pointer to the TLD will be returned. It also caches the key in a task local
+ * storage map, tld_key_map, whose value type, struct tld_keys, must be defined
+ * by the developer.
+ *
+ * struct tld_keys {
+ *     tld_key_t prio;
+ *     tld_key_t in_cs;
+ * };
+ *
+ * SEC("struct_ops")
+ * void prog(struct task_struct *task, ...)
+ * {
+ *     struct tld_object tld_obj;
+ *     int err, *p;
+ *
+ *     err = tld_object_init(task, &tld_obj);
+ *     if (err)
+ *         return;
+ *
+ *     p = tld_get_data(&tld_obj, prio, "priority", sizeof(int));
+ *     if (p)
+ *         // do something depending on *p
+ */
+#include <errno.h>
+#include <bpf/bpf_helpers.h>
+
+#define TLD_ROUND_MASK(x, y) ((__typeof__(x))((y) - 1))
+#define TLD_ROUND_UP(x, y) ((((x) - 1) | TLD_ROUND_MASK(x, y)) + 1)
+
+#define TLD_MAX_DATA_CNT (__PAGE_SIZE / sizeof(struct tld_metadata) - 1)
+
+#ifndef TLD_NAME_LEN
+#define TLD_NAME_LEN 62
+#endif
+
+typedef struct {
+	__s16 off;
+} tld_key_t;
+
+struct tld_metadata {
+	char name[TLD_NAME_LEN];
+	__u16 size;
+};
+
+struct u_tld_metadata {
+	__u8 cnt;
+	__u16 size;
+	struct tld_metadata metadata[TLD_MAX_DATA_CNT];
+};
+
+struct u_tld_data {
+	__u64 start; /* offset of u_tld_data->data in a page */
+	char data[__PAGE_SIZE - sizeof(__u64)];
+};
+
+struct tld_map_value {
+	struct u_tld_data __uptr *data;
+	struct u_tld_metadata __uptr *metadata;
+};
+
+typedef struct tld_uptr_dummy {
+	struct u_tld_data data[0];
+	struct u_tld_metadata metadata[0];
+} *tld_uptr_dummy_t;
+
+struct tld_object {
+	struct tld_map_value *data_map;
+	struct tld_keys *key_map;
+	/*
+	 * Force the compiler to generate the actual definition of u_tld_metadata
+	 * and u_tld_data in BTF. Without it, u_tld_metadata and u_tld_data will
+	 * be BTF_KIND_FWD.
+	 */
+	tld_uptr_dummy_t dummy[0];
+};
+
+/*
+ * Map value of tld_key_map for caching keys. Must be defined by the developer.
+ * Members should be tld_key_t and passed as the 2nd argument of tld_get_data().
+ */
+struct tld_keys;
+
+struct {
+	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, struct tld_map_value);
+} tld_data_map SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, struct tld_keys);
+} tld_key_map SEC(".maps");
+
+/**
+ * tld_object_init() - Initializes a tld_object.
+ *
+ * @task: The task_struct of the target task
+ * @tld_obj: A pointer to a tld_object to be initialized
+ *
+ * Returns 0 on success; -ENODATA if the task has no TLD; -ENOMEM if the creation
+ * of tld_key_map fails
+ */
+__attribute__((unused))
+static int tld_object_init(struct task_struct *task, struct tld_object *tld_obj)
+{
+	tld_obj->data_map = bpf_task_storage_get(&tld_data_map, task, 0, 0);
+	if (!tld_obj->data_map)
+		return -ENODATA;
+
+	tld_obj->key_map = bpf_task_storage_get(&tld_key_map, task, 0,
+						BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (!tld_obj->key_map)
+		return -ENOMEM;
+
+	return 0;
+}
+
+__attribute__((unused))
+static int __tld_fetch_key(struct tld_object *tld_obj, const char *name, int next)
+{
+	int i, meta_off, cnt, start, off = 0;
+	void *metadata, *nm, *sz;
+
+	if (!tld_obj->data_map || !tld_obj->data_map->data || !tld_obj->data_map->metadata)
+		return 0;
+
+	cnt = tld_obj->data_map->metadata->cnt;
+	start = tld_obj->data_map->data->start;
+	metadata = tld_obj->data_map->metadata->metadata;
+
+	bpf_for(i, 0, cnt) {
+		meta_off = i * sizeof(struct tld_metadata);
+		if (meta_off > sizeof(struct u_tld_metadata) - offsetof(struct u_tld_metadata, metadata)
+							     - sizeof(struct tld_metadata))
+			break;
+
+		nm = metadata + meta_off + offsetof(struct tld_metadata, name);
+		sz = metadata + meta_off + offsetof(struct tld_metadata, size);
+
+		if (i >= next && !bpf_strncmp(nm, TLD_NAME_LEN, name))
+			return start + off;
+
+		off += TLD_ROUND_UP(*(u16 *)sz, 8);
+	}
+
+	return -cnt;
+}
+
+/**
+ * tld_get_data() - Retrieves a pointer to the TLD associated with the name.
+ *
+ * @tld_obj: A pointer to a valid tld_object initialized by tld_object_init()
+ * @key: The cached key of the TLD in tld_key_map
+ * @name: The name of the key associated with a TLD
+ * @size: The size of the TLD. Must be a known constant value
+ *
+ * Returns a pointer to the TLD associated with @name; NULL if not found or
+ * @size is too big. @key is used to cache the key if the TLD is found
+ * to speed up subsequent calls. It should be declared as a member of tld_keys
+ * of tld_key_t type by the developer.
+ *
+ * Internally, the first call to tld_get_data uses @name to fetch the key
+ * associated with the TLD. The key will be saved to @key in tld_key_map so that
+ * subsequent tld_get_data() can use it to directly locate the TLD. If not found,
+ * the current TLD count will be saved and the next __tld_fetch_key() will start
+ * searching @name from the count-th entry.
+ */
+#define tld_get_data(tld_obj, key, name, size)						\
+	({										\
+		void *data = NULL, *_data = (tld_obj)->data_map->data;			\
+		int cnt, off = (tld_obj)->key_map->key.off;				\
+											\
+		if (likely(_data)) {							\
+			if (likely(off > 0)) {						\
+				barrier_var(off);					\
+				if (likely(off < __PAGE_SIZE - size))			\
+					data = _data + off;				\
+			} else {							\
+				cnt = -(off);						\
+				if (likely((tld_obj)->data_map->metadata) &&		\
+				    cnt < (tld_obj)->data_map->metadata->cnt) {		\
+					off = __tld_fetch_key(tld_obj, name, cnt);	\
+					(tld_obj)->key_map->key.off = off;		\
+											\
+					if (likely(off < __PAGE_SIZE - size)) {		\
+						barrier_var(off);			\
+						if (off > 0)				\
+							data = _data + off;		\
+					}						\
+				}							\
+			}								\
+		}									\
+		data;									\
+	})
+
+#endif
-- 
2.47.1
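
The key caching described in the tld_get_data() comment can be sketched in
plain user-space C. This is only an illustration under stated assumptions,
not the code in task_local_data.bpf.h: every sketch_* name and
SKETCH_MAX_CNT are hypothetical, and the BPF maps and the page-size bounds
checks are left out. A cached value greater than zero is an offset into the
data area; a value of zero or less is the negated number of metadata
entries that were already searched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NAME_LEN 62	/* hypothetical, not the library's TLD_NAME_LEN */
#define SKETCH_MAX_CNT  8	/* hypothetical capacity */
#define SKETCH_ROUND_UP(x, y) ((((x) + (y) - 1) / (y)) * (y))

struct sketch_meta {
	char name[SKETCH_NAME_LEN];
	uint16_t size;
};

struct sketch_table {
	int cnt;	/* number of valid entries in meta[] */
	int start;	/* offset of the first TLD; assumed to be > 0 */
	struct sketch_meta meta[SKETCH_MAX_CNT];
};

/*
 * Walk the metadata, accumulating 8-byte-aligned sizes. Only entries at
 * index >= next are compared against name. Returns the offset if found,
 * otherwise -cnt so the caller knows where to resume next time.
 */
static int sketch_fetch_key(const struct sketch_table *t, const char *name, int next)
{
	int i, off = 0;

	assert(t->cnt <= SKETCH_MAX_CNT);

	for (i = 0; i < t->cnt; i++) {
		if (i >= next && !strncmp(t->meta[i].name, name, SKETCH_NAME_LEN))
			return t->start + off;
		off += SKETCH_ROUND_UP(t->meta[i].size, 8);
	}

	return -t->cnt;
}

/*
 * Fast path: a positive cached offset is returned directly.
 * Slow path: search only the entries added since the last miss and
 * store the result (offset or negated count) back into *cached.
 * Returns the offset, or 0 if the name is not present yet.
 */
static int sketch_lookup(const struct sketch_table *t, int *cached, const char *name)
{
	if (*cached > 0)
		return *cached;

	if (-*cached < t->cnt)
		*cached = sketch_fetch_key(t, name, -*cached);

	return *cached > 0 ? *cached : 0;
}
```

For example, with two 4-byte entries "value0" and "value1" and start = 8,
looking up "value2" with a zeroed cache returns 0 and leaves -2 in the
cache. Once a third entry named "value2" is appended, the same lookup
resumes at index 2 and returns offset 24.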



* [PATCH bpf-next v5 2/3] selftests/bpf: Test basic task local data operations
  2025-06-27 23:39 [PATCH bpf-next v5 0/3] Task local data Amery Hung
  2025-06-27 23:39 ` [PATCH bpf-next v5 1/3] selftests/bpf: Introduce task " Amery Hung
@ 2025-06-27 23:39 ` Amery Hung
  2025-06-30 11:24   ` Jiri Olsa
  2025-06-27 23:39 ` [PATCH bpf-next v5 3/3] selftests/bpf: Test concurrent task local data key creation Amery Hung
  2 siblings, 1 reply; 8+ messages in thread
From: Amery Hung @ 2025-06-27 23:39 UTC (permalink / raw)
  To: bpf
  Cc: netdev, alexei.starovoitov, andrii, daniel, tj, memxor,
	martin.lau, ameryhung, kernel-team

Test basic operations of task local data with valid and invalid
tld_create_key().

For invalid calls, make sure they return the right error code and check
that the TLDs are not inserted by running
tld_get_data("value_not_exist") on the bpf side. The call should return a
null pointer.

For valid calls, first make sure the TLDs are created by calling
tld_get_data() on the bpf side. The call should return a valid pointer.

Finally, verify that the TLDs are indeed task-specific (i.e., their
addresses do not overlap) with multiple user threads. This is done by
writing values unique to each thread, reading them from both user space
and bpf, and checking if the value read back matches the value written.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
 .../bpf/prog_tests/test_task_local_data.c     | 191 ++++++++++++++++++
 .../bpf/progs/test_task_local_data.c          |  65 ++++++
 2 files changed, 256 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_task_local_data.c

diff --git a/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c b/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
new file mode 100644
index 000000000000..53cdb8466f8e
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
@@ -0,0 +1,191 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <pthread.h>
+#include <bpf/btf.h>
+#include <test_progs.h>
+
+struct test_struct {
+	__u64 a;
+	__u64 b;
+	__u64 c;
+	__u64 d;
+};
+
+#define TLD_FREE_DATA_ON_THREAD_EXIT
+#define TLD_DYN_DATA_SIZE 4096
+#include "task_local_data.h"
+
+#include "test_task_local_data.skel.h"
+
+TLD_DEFINE_KEY(value0_key, "value0", sizeof(int));
+
+/*
+ * Reset task local data between subtests by clearing metadata. This is safe
+ * as subtests run sequentially. Users of task local data libraries
+ * should not do this.
+ */
+static void reset_tld(void)
+{
+	if (TLD_READ_ONCE(tld_metadata_p)) {
+		/* Remove TLDs created by tld_create_key() */
+		tld_metadata_p->cnt = 1;
+		tld_metadata_p->size = TLD_DYN_DATA_SIZE;
+		memset(&tld_metadata_p->metadata[1], 0,
+		       (TLD_MAX_DATA_CNT - 1) * sizeof(struct tld_metadata));
+	}
+}
+
+/* Serialize access to bpf program's global variables */
+static pthread_mutex_t global_mutex;
+
+static tld_key_t *tld_keys;
+
+#define TEST_BASIC_THREAD_NUM TLD_MAX_DATA_CNT
+
+void *test_task_local_data_basic_thread(void *arg)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts);
+	struct test_task_local_data *skel = (struct test_task_local_data *)arg;
+	int fd, err, tid, *value0, *value1;
+	struct test_struct *value2;
+
+	fd = bpf_map__fd(skel->maps.tld_data_map);
+
+	value0 = tld_get_data(fd, value0_key);
+	if (!ASSERT_OK_PTR(value0, "tld_get_data"))
+		goto out;
+
+	value1 = tld_get_data(fd, tld_keys[0]);
+	if (!ASSERT_OK_PTR(value1, "tld_get_data"))
+		goto out;
+
+	value2 = tld_get_data(fd, tld_keys[1]);
+	if (!ASSERT_OK_PTR(value2, "tld_get_data"))
+		goto out;
+
+	tid = gettid();
+
+	*value0 = tid + 0;
+	*value1 = tid + 1;
+	value2->a = tid + 2;
+	value2->b = tid + 3;
+	value2->c = tid + 4;
+	value2->d = tid + 5;
+
+	pthread_mutex_lock(&global_mutex);
+	/* Run task_main, which reads task local data and saves it to global variables */
+	err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.task_main), &opts);
+	ASSERT_OK(err, "run task_main");
+	ASSERT_OK(opts.retval, "task_main retval");
+
+	ASSERT_EQ(skel->bss->test_value0, tid + 0, "tld_get_data value0");
+	ASSERT_EQ(skel->bss->test_value1, tid + 1, "tld_get_data value1");
+	ASSERT_EQ(skel->bss->test_value2.a, tid + 2, "tld_get_data value2.a");
+	ASSERT_EQ(skel->bss->test_value2.b, tid + 3, "tld_get_data value2.b");
+	ASSERT_EQ(skel->bss->test_value2.c, tid + 4, "tld_get_data value2.c");
+	ASSERT_EQ(skel->bss->test_value2.d, tid + 5, "tld_get_data value2.d");
+	pthread_mutex_unlock(&global_mutex);
+
+	/* Make sure valueX are indeed local to threads */
+	ASSERT_EQ(*value0, tid + 0, "value0");
+	ASSERT_EQ(*value1, tid + 1, "value1");
+	ASSERT_EQ(value2->a, tid + 2, "value2.a");
+	ASSERT_EQ(value2->b, tid + 3, "value2.b");
+	ASSERT_EQ(value2->c, tid + 4, "value2.c");
+	ASSERT_EQ(value2->d, tid + 5, "value2.d");
+
+	*value0 = tid + 5;
+	*value1 = tid + 4;
+	value2->a = tid + 3;
+	value2->b = tid + 2;
+	value2->c = tid + 1;
+	value2->d = tid + 0;
+
+	/* Run task_main again */
+	pthread_mutex_lock(&global_mutex);
+	err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.task_main), &opts);
+	ASSERT_OK(err, "run task_main");
+	ASSERT_OK(opts.retval, "task_main retval");
+
+	ASSERT_EQ(skel->bss->test_value0, tid + 5, "tld_get_data value0");
+	ASSERT_EQ(skel->bss->test_value1, tid + 4, "tld_get_data value1");
+	ASSERT_EQ(skel->bss->test_value2.a, tid + 3, "tld_get_data value2.a");
+	ASSERT_EQ(skel->bss->test_value2.b, tid + 2, "tld_get_data value2.b");
+	ASSERT_EQ(skel->bss->test_value2.c, tid + 1, "tld_get_data value2.c");
+	ASSERT_EQ(skel->bss->test_value2.d, tid + 0, "tld_get_data value2.d");
+	pthread_mutex_unlock(&global_mutex);
+
+out:
+	pthread_exit(NULL);
+}
+
+static void test_task_local_data_basic(void)
+{
+	struct test_task_local_data *skel;
+	pthread_t thread[TEST_BASIC_THREAD_NUM];
+	char dummy_key_name[TLD_NAME_LEN];
+	tld_key_t key;
+	int i, err;
+
+	reset_tld();
+
+	ASSERT_OK(pthread_mutex_init(&global_mutex, NULL), "pthread_mutex_init");
+
+	skel = test_task_local_data__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel_open_and_load"))
+		return;
+
+	tld_keys = calloc(TEST_BASIC_THREAD_NUM, sizeof(tld_key_t));
+	if (!ASSERT_OK_PTR(tld_keys, "calloc tld_keys"))
+		goto out;
+
+	ASSERT_FALSE(tld_key_is_err(value0_key), "TLD_DEFINE_KEY");
+	tld_keys[0] = tld_create_key("value1", sizeof(int));
+	ASSERT_FALSE(tld_key_is_err(tld_keys[0]), "tld_create_key");
+	tld_keys[1] = tld_create_key("value2", sizeof(struct test_struct));
+	ASSERT_FALSE(tld_key_is_err(tld_keys[1]), "tld_create_key");
+
+	/*
+	 * Shouldn't be able to store data exceeding a page. Create a TLD just big
+	 * enough to exceed a page. TLDs already created are int value0, int
+	 * value1, and struct test_struct value2.
+	 */
+	key = tld_create_key("value_not_exist",
+			     TLD_PAGE_SIZE - 2 * sizeof(int) - sizeof(struct test_struct) + 1);
+	ASSERT_EQ(tld_key_err_or_zero(key), -E2BIG, "tld_create_key");
+
+	key = tld_create_key("value2", sizeof(struct test_struct));
+	ASSERT_EQ(tld_key_err_or_zero(key), -EEXIST, "tld_create_key");
+
+	/* Shouldn't be able to create the (TLD_MAX_DATA_CNT+1)-th TLD */
+	for (i = 3; i < TLD_MAX_DATA_CNT; i++) {
+		snprintf(dummy_key_name, TLD_NAME_LEN, "dummy_value%d", i);
+		tld_keys[i] = tld_create_key(dummy_key_name, sizeof(int));
+		ASSERT_FALSE(tld_key_is_err(tld_keys[i]), "tld_create_key");
+	}
+	key = tld_create_key("value_not_exist", sizeof(struct test_struct));
+	ASSERT_EQ(tld_key_err_or_zero(key), -ENOSPC, "tld_create_key");
+
+	/* Access TLDs from multiple threads and check if they are thread-specific */
+	for (i = 0; i < TEST_BASIC_THREAD_NUM; i++) {
+		err = pthread_create(&thread[i], NULL, test_task_local_data_basic_thread, skel);
+		if (!ASSERT_OK(err, "pthread_create"))
+			goto out;
+	}
+
+out:
+	for (i = 0; i < TEST_BASIC_THREAD_NUM; i++)
+		pthread_join(thread[i], NULL);
+
+	if (tld_keys) {
+		free(tld_keys);
+		tld_keys = NULL;
+	}
+	tld_free();
+	test_task_local_data__destroy(skel);
+}
+
+void test_task_local_data(void)
+{
+	if (test__start_subtest("task_local_data_basic"))
+		test_task_local_data_basic();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_task_local_data.c b/tools/testing/selftests/bpf/progs/test_task_local_data.c
new file mode 100644
index 000000000000..94d1745dd8d4
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_task_local_data.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <vmlinux.h>
+#include <errno.h>
+#include <bpf/bpf_helpers.h>
+
+#include "task_local_data.bpf.h"
+
+struct tld_keys {
+	tld_key_t value0;
+	tld_key_t value1;
+	tld_key_t value2;
+	tld_key_t value_not_exist;
+};
+
+struct test_struct {
+	unsigned long a;
+	unsigned long b;
+	unsigned long c;
+	unsigned long d;
+};
+
+int test_value0;
+int test_value1;
+struct test_struct test_value2;
+
+SEC("syscall")
+int task_main(void *ctx)
+{
+	struct tld_object tld_obj;
+	struct test_struct *struct_p;
+	struct task_struct *task;
+	int err, *int_p;
+
+	task = bpf_get_current_task_btf();
+	err = tld_object_init(task, &tld_obj);
+	if (err)
+		return 1;
+
+	int_p = tld_get_data(&tld_obj, value0, "value0", sizeof(int));
+	if (int_p)
+		test_value0 = *int_p;
+	else
+		return 2;
+
+	int_p = tld_get_data(&tld_obj, value1, "value1", sizeof(int));
+	if (int_p)
+		test_value1 = *int_p;
+	else
+		return 3;
+
+	struct_p = tld_get_data(&tld_obj, value2, "value2", sizeof(struct test_struct));
+	if (struct_p)
+		test_value2 = *struct_p;
+	else
+		return 4;
+
+	int_p = tld_get_data(&tld_obj, value_not_exist, "value_not_exist", sizeof(int));
+	if (int_p)
+		return 5;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.47.1



* [PATCH bpf-next v5 3/3] selftests/bpf: Test concurrent task local data key creation
  2025-06-27 23:39 [PATCH bpf-next v5 0/3] Task local data Amery Hung
  2025-06-27 23:39 ` [PATCH bpf-next v5 1/3] selftests/bpf: Introduce task " Amery Hung
  2025-06-27 23:39 ` [PATCH bpf-next v5 2/3] selftests/bpf: Test basic task local data operations Amery Hung
@ 2025-06-27 23:39 ` Amery Hung
  2 siblings, 0 replies; 8+ messages in thread
From: Amery Hung @ 2025-06-27 23:39 UTC (permalink / raw)
  To: bpf
  Cc: netdev, alexei.starovoitov, andrii, daniel, tj, memxor,
	martin.lau, ameryhung, kernel-team

Test thread-safety of tld_create_key(). Since tld_create_key() does
not rely on locks but on memory barriers and atomic operations to protect
the shared metadata, the thread-safety of the function is non-trivial.
Make sure concurrent tld_create_key() calls, both valid and invalid,
cannot race and corrupt metadata, which may lead to TLDs not being
thread-specific or to duplicate TLDs with the same name.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
 .../bpf/prog_tests/test_task_local_data.c     | 103 ++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c b/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
index 53cdb8466f8e..99a1ddaf3e67 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
@@ -184,8 +184,111 @@ static void test_task_local_data_basic(void)
 	test_task_local_data__destroy(skel);
 }
 
+#define TEST_RACE_THREAD_NUM (TLD_MAX_DATA_CNT - 3)
+
+void *test_task_local_data_race_thread(void *arg)
+{
+	int err = 0, id = (intptr_t)arg;
+	char key_name[32];
+	tld_key_t key;
+
+	key = tld_create_key("value_not_exist", TLD_PAGE_SIZE + 1);
+	if (tld_key_err_or_zero(key) != -E2BIG) {
+		err = 1;
+		goto out;
+	}
+
+	/*
+	 * If more than one thread succeeds in creating value1 or value2,
+	 * some threads will fail to create thread_<id> later.
+	 */
+	key = tld_create_key("value1", sizeof(int));
+	if (!tld_key_is_err(key))
+		tld_keys[TEST_RACE_THREAD_NUM] = key;
+	key = tld_create_key("value2", sizeof(struct test_struct));
+	if (!tld_key_is_err(key))
+		tld_keys[TEST_RACE_THREAD_NUM + 1] = key;
+
+	snprintf(key_name, 32, "thread_%d", id);
+	tld_keys[id] = tld_create_key(key_name, sizeof(int));
+	if (tld_key_is_err(tld_keys[id]))
+		err = 2;
+out:
+	return (void *)(intptr_t)err;
+}
+
+static void test_task_local_data_race(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts);
+	pthread_t thread[TEST_RACE_THREAD_NUM];
+	struct test_task_local_data *skel;
+	int fd, i, j, err, *data;
+	void *ret = NULL;
+
+	skel = test_task_local_data__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel_open_and_load"))
+		return;
+
+	tld_keys = calloc(TEST_RACE_THREAD_NUM + 2, sizeof(tld_key_t));
+	if (!ASSERT_OK_PTR(tld_keys, "calloc tld_keys"))
+		goto out;
+
+	fd = bpf_map__fd(skel->maps.tld_data_map);
+
+	for (j = 0; j < 100; j++) {
+		reset_tld();
+
+		for (i = 0; i < TEST_RACE_THREAD_NUM; i++) {
+			/*
+			 * Try to make tld_create_key() race with each other. Call
+			 * tld_create_key(), both valid and invalid, from different threads.
+			 */
+			err = pthread_create(&thread[i], NULL, test_task_local_data_race_thread,
+					     (void *)(intptr_t)i);
+			if (CHECK_FAIL(err))
+				break;
+		}
+
+		/* Wait for all tld_create_key() to return */
+		for (i = 0; i < TEST_RACE_THREAD_NUM; i++) {
+			pthread_join(thread[i], &ret);
+			if (CHECK_FAIL(ret))
+				break;
+		}
+
+		/* Write a unique number in the range of [0, TEST_RACE_THREAD_NUM) to each TLD */
+		for (i = 0; i < TEST_RACE_THREAD_NUM; i++) {
+			data = tld_get_data(fd, tld_keys[i]);
+			if (CHECK_FAIL(!data))
+				break;
+			*data = i;
+		}
+
+		/* Read TLDs and check the value to see if any address collides with another */
+		for (i = 0; i < TEST_RACE_THREAD_NUM; i++) {
+			data = tld_get_data(fd, tld_keys[i]);
+			if (CHECK_FAIL(*data != i))
+				break;
+		}
+
+		/* Run task_main to make sure no invalid TLDs are added */
+		err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.task_main), &opts);
+		ASSERT_OK(err, "run task_main");
+		ASSERT_OK(opts.retval, "task_main retval");
+	}
+out:
+	if (tld_keys) {
+		free(tld_keys);
+		tld_keys = NULL;
+	}
+	tld_free();
+	test_task_local_data__destroy(skel);
+}
+
 void test_task_local_data(void)
 {
 	if (test__start_subtest("task_local_data_basic"))
 		test_task_local_data_basic();
+	if (test__start_subtest("task_local_data_race"))
+		test_task_local_data_race();
 }
-- 
2.47.1
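
To make the property under test concrete, here is a minimal sketch of one
way to create named keys without locks, using C11 atomics. It is not the
algorithm used by tld_create_key(), which is not shown in this patch; the
demo_* names, DEMO_MAX_KEYS, and the per-slot "ready" flag are all
assumptions made for illustration. A creator scans the published slots for
a duplicate, then claims the next slot with a compare-and-swap on the
count. If the compare-and-swap fails, another thread added a key in the
meantime, so the scan is repeated.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <string.h>

#define DEMO_MAX_KEYS 4		/* hypothetical capacity */
#define DEMO_NAME_LEN 32	/* hypothetical name limit, including NUL */

struct demo_slot {
	atomic_int ready;	/* set to 1 once name is fully written */
	char name[DEMO_NAME_LEN];
};

struct demo_keys {
	atomic_int cnt;		/* number of claimed slots */
	struct demo_slot slot[DEMO_MAX_KEYS];
};

/* Returns the slot index on success or a negative errno value. */
static int demo_create_key(struct demo_keys *k, const char *name)
{
	int i, cnt;

	assert(k && name);
	if (strlen(name) >= DEMO_NAME_LEN)
		return -EINVAL;

	for (;;) {
		cnt = atomic_load_explicit(&k->cnt, memory_order_acquire);

		for (i = 0; i < cnt; i++) {
			/* A claimed slot may not have its name written yet; wait. */
			while (!atomic_load_explicit(&k->slot[i].ready,
						     memory_order_acquire))
				;
			if (!strncmp(k->slot[i].name, name, DEMO_NAME_LEN))
				return -EEXIST;
		}

		if (cnt >= DEMO_MAX_KEYS)
			return -ENOSPC;

		/* Claim slot cnt only if no other thread added a key meanwhile. */
		if (atomic_compare_exchange_strong(&k->cnt, &cnt, cnt + 1)) {
			strncpy(k->slot[cnt].name, name, DEMO_NAME_LEN - 1);
			k->slot[cnt].name[DEMO_NAME_LEN - 1] = '\0';
			/* Publish the name before readers may compare against it. */
			atomic_store_explicit(&k->slot[cnt].ready, 1,
					      memory_order_release);
			return cnt;
		}
	}
}
```

In this scheme two threads racing to create the same name cannot both
succeed: the loser of the compare-and-swap rescans, waits for the winner's
slot to be published, and returns -EEXIST. This is the kind of guarantee
the race test checks by creating "value1" and "value2" from every thread.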



* Re: [PATCH bpf-next v5 2/3] selftests/bpf: Test basic task local data operations
  2025-06-27 23:39 ` [PATCH bpf-next v5 2/3] selftests/bpf: Test basic task local data operations Amery Hung
@ 2025-06-30 11:24   ` Jiri Olsa
  2025-06-30 16:42     ` Amery Hung
  0 siblings, 1 reply; 8+ messages in thread
From: Jiri Olsa @ 2025-06-30 11:24 UTC (permalink / raw)
  To: Amery Hung
  Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, memxor,
	martin.lau, kernel-team

On Fri, Jun 27, 2025 at 04:39:56PM -0700, Amery Hung wrote:
> Test basic operations of task local data with valid and invalid
> tld_create_key().
> 
> For invalid calls, make sure they return the right error code and check
> that the TLDs are not inserted by running tld_get_data("
> value_not_exists") on the bpf side. The call should a null pointer.
> 
> For valid calls, first make sure the TLDs are created by calling
> tld_get_data() on the bpf side. The call should return a valid pointer.
> 
> Finally, verify that the TLDs are indeed task-specific (i.e., their
> addresses do not overlap) with multiple user threads. This done by
> writing values unique to each thread, reading them from both user space
> and bpf, and checking if the value read back matches the value written.
> 
> Signed-off-by: Amery Hung <ameryhung@gmail.com>
> ---
>  .../bpf/prog_tests/test_task_local_data.c     | 191 ++++++++++++++++++
>  .../bpf/progs/test_task_local_data.c          |  65 ++++++
>  2 files changed, 256 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_task_local_data.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c b/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
> new file mode 100644
> index 000000000000..53cdb8466f8e
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
> @@ -0,0 +1,191 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <pthread.h>
> +#include <bpf/btf.h>
> +#include <test_progs.h>
> +
> +struct test_struct {
> +	__u64 a;
> +	__u64 b;
> +	__u64 c;
> +	__u64 d;
> +};

hi,
I can't compile this on my config, because of the KGDB_TESTS config
that defines struct test_struct

progs/test_task_local_data.c:16:8: error: redefinition of 'test_struct'
   16 | struct test_struct {
      |        ^
/home/jolsa/kernel/linux-qemu-1/tools/testing/selftests/bpf/tools/include/vmlinux.h:141747:8: note: previous definition is here
 141747 | struct test_struct {


also I have these tests passing locally, but it's failing CI:
  https://github.com/kernel-patches/bpf/actions/runs/15939264078/job/44964987935

thanks,
jirka


> +
> +#define TLD_FREE_DATA_ON_THREAD_EXIT
> +#define TLD_DYN_DATA_SIZE 4096
> +#include "task_local_data.h"
> +
> +#include "test_task_local_data.skel.h"
> +
> +TLD_DEFINE_KEY(value0_key, "value0", sizeof(int));
> +
> +/*
> + * Reset task local data between subtests by clearing metadata. This is safe
> + * as subtests run sequentially. Users of task local data libraries
> + * should not do this.
> + */
> +static void reset_tld(void)
> +{
> +	if (TLD_READ_ONCE(tld_metadata_p)) {
> +		/* Remove TLDs created by tld_create_key() */
> +		tld_metadata_p->cnt = 1;
> +		tld_metadata_p->size = TLD_DYN_DATA_SIZE;
> +		memset(&tld_metadata_p->metadata[1], 0,
> +		       (TLD_MAX_DATA_CNT - 1) * sizeof(struct tld_metadata));
> +	}
> +}
> +
> +/* Serialize access to bpf program's global variables */
> +static pthread_mutex_t global_mutex;
> +
> +static tld_key_t *tld_keys;
> +
> +#define TEST_BASIC_THREAD_NUM TLD_MAX_DATA_CNT
> +
> +void *test_task_local_data_basic_thread(void *arg)
> +{
> +	LIBBPF_OPTS(bpf_test_run_opts, opts);
> +	struct test_task_local_data *skel = (struct test_task_local_data *)arg;
> +	int fd, err, tid, *value0, *value1;
> +	struct test_struct *value2;
> +
> +	fd = bpf_map__fd(skel->maps.tld_data_map);
> +
> +	value0 = tld_get_data(fd, value0_key);
> +	if (!ASSERT_OK_PTR(value0, "tld_get_data"))
> +		goto out;
> +
> +	value1 = tld_get_data(fd, tld_keys[0]);
> +	if (!ASSERT_OK_PTR(value1, "tld_get_data"))
> +		goto out;
> +
> +	value2 = tld_get_data(fd, tld_keys[1]);
> +	if (!ASSERT_OK_PTR(value2, "tld_get_data"))
> +		goto out;
> +
> +	tid = gettid();
> +
> +	*value0 = tid + 0;
> +	*value1 = tid + 1;
> +	value2->a = tid + 2;
> +	value2->b = tid + 3;
> +	value2->c = tid + 4;
> +	value2->d = tid + 5;
> +
> +	pthread_mutex_lock(&global_mutex);
> +	/* Run task_main that read task local data and save to global variables */
> +	err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.task_main), &opts);
> +	ASSERT_OK(err, "run task_main");
> +	ASSERT_OK(opts.retval, "task_main retval");
> +
> +	ASSERT_EQ(skel->bss->test_value0, tid + 0, "tld_get_data value0");
> +	ASSERT_EQ(skel->bss->test_value1, tid + 1, "tld_get_data value1");
> +	ASSERT_EQ(skel->bss->test_value2.a, tid + 2, "tld_get_data value2.a");
> +	ASSERT_EQ(skel->bss->test_value2.b, tid + 3, "tld_get_data value2.b");
> +	ASSERT_EQ(skel->bss->test_value2.c, tid + 4, "tld_get_data value2.c");
> +	ASSERT_EQ(skel->bss->test_value2.d, tid + 5, "tld_get_data value2.d");
> +	pthread_mutex_unlock(&global_mutex);
> +
> +	/* Make sure valueX are indeed local to threads */
> +	ASSERT_EQ(*value0, tid + 0, "value0");
> +	ASSERT_EQ(*value1, tid + 1, "value1");
> +	ASSERT_EQ(value2->a, tid + 2, "value2.a");
> +	ASSERT_EQ(value2->b, tid + 3, "value2.b");
> +	ASSERT_EQ(value2->c, tid + 4, "value2.c");
> +	ASSERT_EQ(value2->d, tid + 5, "value2.d");
> +
> +	*value0 = tid + 5;
> +	*value1 = tid + 4;
> +	value2->a = tid + 3;
> +	value2->b = tid + 2;
> +	value2->c = tid + 1;
> +	value2->d = tid + 0;
> +
> +	/* Run task_main again */
> +	pthread_mutex_lock(&global_mutex);
> +	err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.task_main), &opts);
> +	ASSERT_OK(err, "run task_main");
> +	ASSERT_OK(opts.retval, "task_main retval");
> +
> +	ASSERT_EQ(skel->bss->test_value0, tid + 5, "tld_get_data value0");
> +	ASSERT_EQ(skel->bss->test_value1, tid + 4, "tld_get_data value1");
> +	ASSERT_EQ(skel->bss->test_value2.a, tid + 3, "tld_get_data value2.a");
> +	ASSERT_EQ(skel->bss->test_value2.b, tid + 2, "tld_get_data value2.b");
> +	ASSERT_EQ(skel->bss->test_value2.c, tid + 1, "tld_get_data value2.c");
> +	ASSERT_EQ(skel->bss->test_value2.d, tid + 0, "tld_get_data value2.d");
> +	pthread_mutex_unlock(&global_mutex);
> +
> +out:
> +	pthread_exit(NULL);
> +}
> +
> +static void test_task_local_data_basic(void)
> +{
> +	struct test_task_local_data *skel;
> +	pthread_t thread[TEST_BASIC_THREAD_NUM];
> +	char dummy_key_name[TLD_NAME_LEN];
> +	tld_key_t key;
> +	int i, err;
> +
> +	reset_tld();
> +
> +	ASSERT_OK(pthread_mutex_init(&global_mutex, NULL), "pthread_mutex_init");
> +
> +	skel = test_task_local_data__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "skel_open_and_load"))
> +		return;
> +
> +	tld_keys = calloc(TEST_BASIC_THREAD_NUM, sizeof(tld_key_t));
> +	if (!ASSERT_OK_PTR(tld_keys, "calloc tld_keys"))
> +		goto out;
> +
> +	ASSERT_FALSE(tld_key_is_err(value0_key), "TLD_DEFINE_KEY");
> +	tld_keys[0] = tld_create_key("value1", sizeof(int));
> +	ASSERT_FALSE(tld_key_is_err(tld_keys[0]), "tld_create_key");
> +	tld_keys[1] = tld_create_key("value2", sizeof(struct test_struct));
> +	ASSERT_FALSE(tld_key_is_err(tld_keys[1]), "tld_create_key");
> +
> +	/*
> +	 * Shouldn't be able to store data exceed a page. Create a TLD just big
> +	 * enough to exceed a page. TLDs already created are int value0, int
> +	 * value1, and struct test_struct value2.
> +	 */
> +	key = tld_create_key("value_not_exist",
> +			     TLD_PAGE_SIZE - 2 * sizeof(int) - sizeof(struct test_struct) + 1);
> +	ASSERT_EQ(tld_key_err_or_zero(key), -E2BIG, "tld_create_key");
> +
> +	key = tld_create_key("value2", sizeof(struct test_struct));
> +	ASSERT_EQ(tld_key_err_or_zero(key), -EEXIST, "tld_create_key");
> +
> +	/* Shouldn't be able to create the (TLD_MAX_DATA_CNT+1)-th TLD */
> +	for (i = 3; i < TLD_MAX_DATA_CNT; i++) {
> +		snprintf(dummy_key_name, TLD_NAME_LEN, "dummy_value%d", i);
> +		tld_keys[i] = tld_create_key(dummy_key_name, sizeof(int));
> +		ASSERT_FALSE(tld_key_is_err(tld_keys[i]), "tld_create_key");
> +	}
> +	key = tld_create_key("value_not_exist", sizeof(struct test_struct));
> +	ASSERT_EQ(tld_key_err_or_zero(key), -ENOSPC, "tld_create_key");
> +
> +	/* Access TLDs from multiple threads and check if they are thread-specific */
> +	for (i = 0; i < TEST_BASIC_THREAD_NUM; i++) {
> +		err = pthread_create(&thread[i], NULL, test_task_local_data_basic_thread, skel);
> +		if (!ASSERT_OK(err, "pthread_create"))
> +			goto out;
> +	}
> +
> +out:
> +	for (i = 0; i < TEST_BASIC_THREAD_NUM; i++)
> +		pthread_join(thread[i], NULL);
> +
> +	if (tld_keys) {
> +		free(tld_keys);
> +		tld_keys = NULL;
> +	}
> +	tld_free();
> +	test_task_local_data__destroy(skel);
> +}
> +
> +void test_task_local_data(void)
> +{
> +	if (test__start_subtest("task_local_data_basic"))
> +		test_task_local_data_basic();
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_task_local_data.c b/tools/testing/selftests/bpf/progs/test_task_local_data.c
> new file mode 100644
> index 000000000000..94d1745dd8d4
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_task_local_data.c
> @@ -0,0 +1,65 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <vmlinux.h>
> +#include <errno.h>
> +#include <bpf/bpf_helpers.h>
> +
> +#include "task_local_data.bpf.h"
> +
> +struct tld_keys {
> +	tld_key_t value0;
> +	tld_key_t value1;
> +	tld_key_t value2;
> +	tld_key_t value_not_exist;
> +};
> +
> +struct test_struct {
> +	unsigned long a;
> +	unsigned long b;
> +	unsigned long c;
> +	unsigned long d;
> +};
> +
> +int test_value0;
> +int test_value1;
> +struct test_struct test_value2;
> +
> +SEC("syscall")
> +int task_main(void *ctx)
> +{
> +	struct tld_object tld_obj;
> +	struct test_struct *struct_p;
> +	struct task_struct *task;
> +	int err, *int_p;
> +
> +	task = bpf_get_current_task_btf();
> +	err = tld_object_init(task, &tld_obj);
> +	if (err)
> +		return 1;
> +
> +	int_p = tld_get_data(&tld_obj, value0, "value0", sizeof(int));
> +	if (int_p)
> +		test_value0 = *int_p;
> +	else
> +		return 2;
> +
> +	int_p = tld_get_data(&tld_obj, value1, "value1", sizeof(int));
> +	if (int_p)
> +		test_value1 = *int_p;
> +	else
> +		return 3;
> +
> +	struct_p = tld_get_data(&tld_obj, value2, "value2", sizeof(struct test_struct));
> +	if (struct_p)
> +		test_value2 = *struct_p;
> +	else
> +		return 4;
> +
> +	int_p = tld_get_data(&tld_obj, value_not_exist, "value_not_exist", sizeof(int));
> +	if (int_p)
> +		return 5;
> +
> +	return 0;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> -- 
> 2.47.1
> 
> 


* Re: [PATCH bpf-next v5 2/3] selftests/bpf: Test basic task local data operations
  2025-06-30 11:24   ` Jiri Olsa
@ 2025-06-30 16:42     ` Amery Hung
  0 siblings, 0 replies; 8+ messages in thread
From: Amery Hung @ 2025-06-30 16:42 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, memxor,
	martin.lau, kernel-team

On Mon, Jun 30, 2025 at 4:24 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Fri, Jun 27, 2025 at 04:39:56PM -0700, Amery Hung wrote:
> > Test basic operations of task local data with valid and invalid
> > tld_create_key().
> >
> > For invalid calls, make sure they return the right error code and check
> > that the TLDs are not inserted by running tld_get_data("
> > value_not_exists") on the bpf side. The call should a null pointer.
> >
> > For valid calls, first make sure the TLDs are created by calling
> > tld_get_data() on the bpf side. The call should return a valid pointer.
> >
> > Finally, verify that the TLDs are indeed task-specific (i.e., their
> > addresses do not overlap) with multiple user threads. This done by
> > writing values unique to each thread, reading them from both user space
> > and bpf, and checking if the value read back matches the value written.
> >
> > Signed-off-by: Amery Hung <ameryhung@gmail.com>
> > ---
> >  .../bpf/prog_tests/test_task_local_data.c     | 191 ++++++++++++++++++
> >  .../bpf/progs/test_task_local_data.c          |  65 ++++++
> >  2 files changed, 256 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_task_local_data.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c b/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
> > new file mode 100644
> > index 000000000000..53cdb8466f8e
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
> > @@ -0,0 +1,191 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <pthread.h>
> > +#include <bpf/btf.h>
> > +#include <test_progs.h>
> > +
> > +struct test_struct {
> > +     __u64 a;
> > +     __u64 b;
> > +     __u64 c;
> > +     __u64 d;
> > +};
>
> hi,
> I can't compile this on my config, bacause of the KGDB_TESTS config
> that defines struct test_struct
>
> progs/test_task_local_data.c:16:8: error: redefinition of 'test_struct'
>    16 | struct test_struct {
>       |        ^
> /home/jolsa/kernel/linux-qemu-1/tools/testing/selftests/bpf/tools/include/vmlinux.h:141747:8: note: previous definition is here
>  141747 | struct test_struct {
>
>
> also I have these tests passing localy, but it's failing CI:
>   https://github.com/kernel-patches/bpf/actions/runs/15939264078/job/44964987935
>

Thanks for reporting the error. I will change the test_struct name.

For the CI failure, I will fix it by changing the type of "off" in the
bpf tld_get_data() from int to s64.

Thanks,
Amery
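
A sketch of what the rename could look like. The name test_tld_struct is
hypothetical, since the reply does not say which name will be used. The
first struct below only stands in for the definition that vmlinux.h can
carry when KGDB_TESTS is enabled; its field is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's own struct test_struct seen via vmlinux.h. */
struct test_struct {
	const char *placeholder;	/* made-up field, not the kernel layout */
};

/* Test-private type with a name unlikely to collide with kernel types. */
struct test_tld_struct {
	uint64_t a;
	uint64_t b;
	uint64_t c;
	uint64_t d;
};

/* Both definitions now coexist in one translation unit. */
static inline uint64_t test_tld_struct_sum(const struct test_tld_struct *s)
{
	return s->a + s->b + s->c + s->d;
}
```

With a distinct name the test no longer depends on which structs the
kernel configuration happens to export into vmlinux.h.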

> thanks,
> jirka
>
>
> > +
> > +#define TLD_FREE_DATA_ON_THREAD_EXIT
> > +#define TLD_DYN_DATA_SIZE 4096
> > +#include "task_local_data.h"
> > +
> > +#include "test_task_local_data.skel.h"
> > +
> > +TLD_DEFINE_KEY(value0_key, "value0", sizeof(int));
> > +
> > +/*
> > + * Reset task local data between subtests by clearing metadata. This is safe
> > + * as subtests run sequentially. Users of task local data libraries
> > + * should not do this.
> > + */
> > +static void reset_tld(void)
> > +{
> > +     if (TLD_READ_ONCE(tld_metadata_p)) {
> > +             /* Remove TLDs created by tld_create_key() */
> > +             tld_metadata_p->cnt = 1;
> > +             tld_metadata_p->size = TLD_DYN_DATA_SIZE;
> > +             memset(&tld_metadata_p->metadata[1], 0,
> > +                    (TLD_MAX_DATA_CNT - 1) * sizeof(struct tld_metadata));
> > +     }
> > +}
> > +
> > +/* Serialize access to bpf program's global variables */
> > +static pthread_mutex_t global_mutex;
> > +
> > +static tld_key_t *tld_keys;
> > +
> > +#define TEST_BASIC_THREAD_NUM TLD_MAX_DATA_CNT
> > +
> > +void *test_task_local_data_basic_thread(void *arg)
> > +{
> > +     LIBBPF_OPTS(bpf_test_run_opts, opts);
> > +     struct test_task_local_data *skel = (struct test_task_local_data *)arg;
> > +     int fd, err, tid, *value0, *value1;
> > +     struct test_struct *value2;
> > +
> > +     fd = bpf_map__fd(skel->maps.tld_data_map);
> > +
> > +     value0 = tld_get_data(fd, value0_key);
> > +     if (!ASSERT_OK_PTR(value0, "tld_get_data"))
> > +             goto out;
> > +
> > +     value1 = tld_get_data(fd, tld_keys[0]);
> > +     if (!ASSERT_OK_PTR(value1, "tld_get_data"))
> > +             goto out;
> > +
> > +     value2 = tld_get_data(fd, tld_keys[1]);
> > +     if (!ASSERT_OK_PTR(value2, "tld_get_data"))
> > +             goto out;
> > +
> > +     tid = gettid();
> > +
> > +     *value0 = tid + 0;
> > +     *value1 = tid + 1;
> > +     value2->a = tid + 2;
> > +     value2->b = tid + 3;
> > +     value2->c = tid + 4;
> > +     value2->d = tid + 5;
> > +
> > +     pthread_mutex_lock(&global_mutex);
> > +     /* Run task_main that read task local data and save to global variables */
> > +     err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.task_main), &opts);
> > +     ASSERT_OK(err, "run task_main");
> > +     ASSERT_OK(opts.retval, "task_main retval");
> > +
> > +     ASSERT_EQ(skel->bss->test_value0, tid + 0, "tld_get_data value0");
> > +     ASSERT_EQ(skel->bss->test_value1, tid + 1, "tld_get_data value1");
> > +     ASSERT_EQ(skel->bss->test_value2.a, tid + 2, "tld_get_data value2.a");
> > +     ASSERT_EQ(skel->bss->test_value2.b, tid + 3, "tld_get_data value2.b");
> > +     ASSERT_EQ(skel->bss->test_value2.c, tid + 4, "tld_get_data value2.c");
> > +     ASSERT_EQ(skel->bss->test_value2.d, tid + 5, "tld_get_data value2.d");
> > +     pthread_mutex_unlock(&global_mutex);
> > +
> > +     /* Make sure valueX are indeed local to threads */
> > +     ASSERT_EQ(*value0, tid + 0, "value0");
> > +     ASSERT_EQ(*value1, tid + 1, "value1");
> > +     ASSERT_EQ(value2->a, tid + 2, "value2.a");
> > +     ASSERT_EQ(value2->b, tid + 3, "value2.b");
> > +     ASSERT_EQ(value2->c, tid + 4, "value2.c");
> > +     ASSERT_EQ(value2->d, tid + 5, "value2.d");
> > +
> > +     *value0 = tid + 5;
> > +     *value1 = tid + 4;
> > +     value2->a = tid + 3;
> > +     value2->b = tid + 2;
> > +     value2->c = tid + 1;
> > +     value2->d = tid + 0;
> > +
> > +     /* Run task_main again */
> > +     pthread_mutex_lock(&global_mutex);
> > +     err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.task_main), &opts);
> > +     ASSERT_OK(err, "run task_main");
> > +     ASSERT_OK(opts.retval, "task_main retval");
> > +
> > +     ASSERT_EQ(skel->bss->test_value0, tid + 5, "tld_get_data value0");
> > +     ASSERT_EQ(skel->bss->test_value1, tid + 4, "tld_get_data value1");
> > +     ASSERT_EQ(skel->bss->test_value2.a, tid + 3, "tld_get_data value2.a");
> > +     ASSERT_EQ(skel->bss->test_value2.b, tid + 2, "tld_get_data value2.b");
> > +     ASSERT_EQ(skel->bss->test_value2.c, tid + 1, "tld_get_data value2.c");
> > +     ASSERT_EQ(skel->bss->test_value2.d, tid + 0, "tld_get_data value2.d");
> > +     pthread_mutex_unlock(&global_mutex);
> > +
> > +out:
> > +     pthread_exit(NULL);
> > +}
> > +
> > +static void test_task_local_data_basic(void)
> > +{
> > +     struct test_task_local_data *skel;
> > +     pthread_t thread[TEST_BASIC_THREAD_NUM];
> > +     char dummy_key_name[TLD_NAME_LEN];
> > +     tld_key_t key;
> > +     int i, err;
> > +
> > +     reset_tld();
> > +
> > +     ASSERT_OK(pthread_mutex_init(&global_mutex, NULL), "pthread_mutex_init");
> > +
> > +     skel = test_task_local_data__open_and_load();
> > +     if (!ASSERT_OK_PTR(skel, "skel_open_and_load"))
> > +             return;
> > +
> > +     tld_keys = calloc(TEST_BASIC_THREAD_NUM, sizeof(tld_key_t));
> > +     if (!ASSERT_OK_PTR(tld_keys, "calloc tld_keys"))
> > +             goto out;
> > +
> > +     ASSERT_FALSE(tld_key_is_err(value0_key), "TLD_DEFINE_KEY");
> > +     tld_keys[0] = tld_create_key("value1", sizeof(int));
> > +     ASSERT_FALSE(tld_key_is_err(tld_keys[0]), "tld_create_key");
> > +     tld_keys[1] = tld_create_key("value2", sizeof(struct test_struct));
> > +     ASSERT_FALSE(tld_key_is_err(tld_keys[1]), "tld_create_key");
> > +
> > +     /*
> > +      * Shouldn't be able to store data exceed a page. Create a TLD just big
> > +      * enough to exceed a page. TLDs already created are int value0, int
> > +      * value1, and struct test_struct value2.
> > +      */
> > +     key = tld_create_key("value_not_exist",
> > +                          TLD_PAGE_SIZE - 2 * sizeof(int) - sizeof(struct test_struct) + 1);
> > +     ASSERT_EQ(tld_key_err_or_zero(key), -E2BIG, "tld_create_key");
> > +
> > +     key = tld_create_key("value2", sizeof(struct test_struct));
> > +     ASSERT_EQ(tld_key_err_or_zero(key), -EEXIST, "tld_create_key");
> > +
> > +     /* Shouldn't be able to create the (TLD_MAX_DATA_CNT+1)-th TLD */
> > +     for (i = 3; i < TLD_MAX_DATA_CNT; i++) {
> > +             snprintf(dummy_key_name, TLD_NAME_LEN, "dummy_value%d", i);
> > +             tld_keys[i] = tld_create_key(dummy_key_name, sizeof(int));
> > +             ASSERT_FALSE(tld_key_is_err(tld_keys[i]), "tld_create_key");
> > +     }
> > +     key = tld_create_key("value_not_exist", sizeof(struct test_struct));
> > +     ASSERT_EQ(tld_key_err_or_zero(key), -ENOSPC, "tld_create_key");
> > +
> > +     /* Access TLDs from multiple threads and check if they are thread-specific */
> > +     for (i = 0; i < TEST_BASIC_THREAD_NUM; i++) {
> > +             err = pthread_create(&thread[i], NULL, test_task_local_data_basic_thread, skel);
> > +             if (!ASSERT_OK(err, "pthread_create"))
> > +                     goto out;
> > +     }
> > +
> > +out:
> > +     for (i = 0; i < TEST_BASIC_THREAD_NUM; i++)
> > +             pthread_join(thread[i], NULL);
> > +
> > +     if (tld_keys) {
> > +             free(tld_keys);
> > +             tld_keys = NULL;
> > +     }
> > +     tld_free();
> > +     test_task_local_data__destroy(skel);
> > +}
> > +
> > +void test_task_local_data(void)
> > +{
> > +     if (test__start_subtest("task_local_data_basic"))
> > +             test_task_local_data_basic();
> > +}
> > diff --git a/tools/testing/selftests/bpf/progs/test_task_local_data.c b/tools/testing/selftests/bpf/progs/test_task_local_data.c
> > new file mode 100644
> > index 000000000000..94d1745dd8d4
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/test_task_local_data.c
> > @@ -0,0 +1,65 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <vmlinux.h>
> > +#include <errno.h>
> > +#include <bpf/bpf_helpers.h>
> > +
> > +#include "task_local_data.bpf.h"
> > +
> > +struct tld_keys {
> > +     tld_key_t value0;
> > +     tld_key_t value1;
> > +     tld_key_t value2;
> > +     tld_key_t value_not_exist;
> > +};
> > +
> > +struct test_struct {
> > +     unsigned long a;
> > +     unsigned long b;
> > +     unsigned long c;
> > +     unsigned long d;
> > +};
> > +
> > +int test_value0;
> > +int test_value1;
> > +struct test_struct test_value2;
> > +
> > +SEC("syscall")
> > +int task_main(void *ctx)
> > +{
> > +     struct tld_object tld_obj;
> > +     struct test_struct *struct_p;
> > +     struct task_struct *task;
> > +     int err, *int_p;
> > +
> > +     task = bpf_get_current_task_btf();
> > +     err = tld_object_init(task, &tld_obj);
> > +     if (err)
> > +             return 1;
> > +
> > +     int_p = tld_get_data(&tld_obj, value0, "value0", sizeof(int));
> > +     if (int_p)
> > +             test_value0 = *int_p;
> > +     else
> > +             return 2;
> > +
> > +     int_p = tld_get_data(&tld_obj, value1, "value1", sizeof(int));
> > +     if (int_p)
> > +             test_value1 = *int_p;
> > +     else
> > +             return 3;
> > +
> > +     struct_p = tld_get_data(&tld_obj, value2, "value2", sizeof(struct test_struct));
> > +     if (struct_p)
> > +             test_value2 = *struct_p;
> > +     else
> > +             return 4;
> > +
> > +     int_p = tld_get_data(&tld_obj, value_not_exist, "value_not_exist", sizeof(int));
> > +     if (int_p)
> > +             return 5;
> > +
> > +     return 0;
> > +}
> > +
> > +char _license[] SEC("license") = "GPL";
> > --
> > 2.47.1
> >
> >


* Re: [PATCH bpf-next v5 1/3] selftests/bpf: Introduce task local data
  2025-06-27 23:39 ` [PATCH bpf-next v5 1/3] selftests/bpf: Introduce task " Amery Hung
@ 2025-07-01 22:02   ` Andrii Nakryiko
  2025-07-01 22:47     ` Amery Hung
  0 siblings, 1 reply; 8+ messages in thread
From: Andrii Nakryiko @ 2025-07-01 22:02 UTC (permalink / raw)
  To: Amery Hung
  Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, memxor,
	martin.lau, kernel-team

On Fri, Jun 27, 2025 at 4:40 PM Amery Hung <ameryhung@gmail.com> wrote:
>
> Task local data defines an abstract storage type for storing task-
> specific data (TLD). This patch provides user space and bpf
> implementation as header-only libraries for accessing task local data.
>
> Task local data is a bpf task local storage map with two UPTRs:
> 1) u_tld_metadata, shared by all tasks of the same process, consists of
> the total count of TLDs and an array of metadata of TLDs. A metadata of
> a TLD comprises the size and the name. The name is used to identify a
> specific TLD in bpf 2) u_tld_data points to a task-specific memory region
> for storing TLDs.
>
> Below are the core task local data API:
>
>                      User space                           BPF
> Define TLD    TLD_DEFINE_KEY(), tld_create_key()           -
> Get data           tld_get_data()                    tld_get_data()
>
> A TLD is first defined by the user space with TLD_DEFINE_KEY() or
> tld_create_key(). TLD_DEFINE_KEY() defines a TLD statically and allocates
> just enough memory during initialization. tld_create_key() allows
> creating TLDs on the fly, but has a fix memory budget, TLD_DYN_DATA_SIZE.
> Internally, they all go through the metadata array to check if the TLD can
> be added. The total TLD size needs to fit into a page (limited by UPTR),
> and no two TLDs can have the same name. It also calculates the offset, the
> next available space in u_tld_data, by summing sizes of TLDs. If the TLD
> can be added, it increases the count using cmpxchg as there may be other
> concurrent tld_create_key(). After a successful cmpxchg, the last
> metadata slot now belongs to the calling thread and will be updated.
> tld_create_key() returns the offset encapsulated as a opaque object key
> to prevent user misuse.
>
> Then, user space can pass the key to tld_get_data() to get a pointer
> to the TLD. The pointer will remain valid for the lifetime of the
> thread.
>
> BPF programs can also locate the TLD by tld_get_data(), but with both
> name and key. The first time tld_get_data() is called, the name will
> be used to lookup the metadata. Then, the key will be saved to a
> task_local_data map, tld_keys_map. Subsequent call to tld_get_data()
> will use the key to quickly locate the data.
>
> User space task local data library uses a light way approach to ensure
> thread safety (i.e., atomic operation + compiler and memory barriers).
> While a metadata is being updated, other threads may also try to read it.
> To prevent them from seeing incomplete data, metadata::size is used to
> signal the completion of the update, where 0 means the update is still
> ongoing. Threads will wait until seeing a non-zero size to read a
> metadata.
>
> Signed-off-by: Amery Hung <ameryhung@gmail.com>
> ---
>  .../bpf/prog_tests/task_local_data.h          | 397 ++++++++++++++++++
>  .../selftests/bpf/progs/task_local_data.bpf.h | 232 ++++++++++
>  2 files changed, 629 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/task_local_data.h
>  create mode 100644 tools/testing/selftests/bpf/progs/task_local_data.bpf.h
>

[...]

> +               /*
> +                * Only one tld_create_key() can increase the current cnt by one and
> +                * takes the latest available slot. Other threads will check again if a new
> +                * TLD can still be added, and then compete for the new slot after the
> +                * succeeding thread update the size.
> +                */
> +               if (!atomic_compare_exchange_strong(&tld_metadata_p->cnt, &cnt, cnt + 1))
> +                       goto retry;
> +
> +               strncpy(tld_metadata_p->metadata[i].name, name, TLD_NAME_LEN);

from man page:

Warning: If there is no null byte among the first n bytes of src, the
string placed in dest will not be null-terminated.

is that a concern?

> +               atomic_store(&tld_metadata_p->metadata[i].size, size);
> +               return (tld_key_t) {.off = (__s16)off};
> +       }
> +
> +       return (tld_key_t) {.off = -ENOSPC};

I don't know if C++ compiler will like this, but in C just
`(tld_key_t){-ENOSPC}` should work fine

> +}
> +
> +/**
> + * TLD_DEFINE_KEY() - Defines a TLD and a file-scope key associated with the TLD.
> + *
> + * @name: The name of the TLD
> + * @size: The size of the TLD
> + * @key: The variable name of the key. Cannot exceed TLD_NAME_LEN
> + *
> + * The macro can only be used in file scope.
> + *
> + * A file-scope key of opaque type, tld_key_t, will be declared and initialized before

what's "file-scope"? it looks like a global (not even static)
variable, so you can even reference it from other files with extern,
no?

> + * main() starts. Use tld_key_is_err() or tld_key_err_or_zero() later to check if the key
> + * creation succeeded. Pass the key to tld_get_data() to get a pointer to the TLD.
> + * bpf programs can also fetch the same key by name.
> + *
> + * The total size of TLDs created using TLD_DEFINE_KEY() cannot exceed a page. Just
> + * enough memory will be allocated for each thread on the first call to tld_get_data().
> + */

[...]


* Re: [PATCH bpf-next v5 1/3] selftests/bpf: Introduce task local data
  2025-07-01 22:02   ` Andrii Nakryiko
@ 2025-07-01 22:47     ` Amery Hung
  0 siblings, 0 replies; 8+ messages in thread
From: Amery Hung @ 2025-07-01 22:47 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, memxor,
	martin.lau, kernel-team

On Tue, Jul 1, 2025 at 3:02 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Jun 27, 2025 at 4:40 PM Amery Hung <ameryhung@gmail.com> wrote:
> >
> > Task local data defines an abstract storage type for storing task-
> > specific data (TLD). This patch provides user space and bpf
> > implementation as header-only libraries for accessing task local data.
> >
> > Task local data is a bpf task local storage map with two UPTRs:
> > 1) u_tld_metadata, shared by all tasks of the same process, consists of
> > the total count of TLDs and an array of metadata of TLDs. A metadata of
> > a TLD comprises the size and the name. The name is used to identify a
> > specific TLD in bpf 2) u_tld_data points to a task-specific memory region
> > for storing TLDs.
> >
> > Below are the core task local data API:
> >
> >                      User space                           BPF
> > Define TLD    TLD_DEFINE_KEY(), tld_create_key()           -
> > Get data           tld_get_data()                    tld_get_data()
> >
> > A TLD is first defined by the user space with TLD_DEFINE_KEY() or
> > tld_create_key(). TLD_DEFINE_KEY() defines a TLD statically and allocates
> > just enough memory during initialization. tld_create_key() allows
> > creating TLDs on the fly, but has a fix memory budget, TLD_DYN_DATA_SIZE.
> > Internally, they all go through the metadata array to check if the TLD can
> > be added. The total TLD size needs to fit into a page (limited by UPTR),
> > and no two TLDs can have the same name. It also calculates the offset, the
> > next available space in u_tld_data, by summing sizes of TLDs. If the TLD
> > can be added, it increases the count using cmpxchg as there may be other
> > concurrent tld_create_key(). After a successful cmpxchg, the last
> > metadata slot now belongs to the calling thread and will be updated.
> > tld_create_key() returns the offset encapsulated as a opaque object key
> > to prevent user misuse.
> >
> > Then, user space can pass the key to tld_get_data() to get a pointer
> > to the TLD. The pointer will remain valid for the lifetime of the
> > thread.
> >
> > BPF programs can also locate the TLD by tld_get_data(), but with both
> > name and key. The first time tld_get_data() is called, the name will
> > be used to lookup the metadata. Then, the key will be saved to a
> > task_local_data map, tld_keys_map. Subsequent call to tld_get_data()
> > will use the key to quickly locate the data.
> >
> > User space task local data library uses a light way approach to ensure
> > thread safety (i.e., atomic operation + compiler and memory barriers).
> > While a metadata is being updated, other threads may also try to read it.
> > To prevent them from seeing incomplete data, metadata::size is used to
> > signal the completion of the update, where 0 means the update is still
> > ongoing. Threads will wait until seeing a non-zero size to read a
> > metadata.
> >
> > Signed-off-by: Amery Hung <ameryhung@gmail.com>
> > ---
> >  .../bpf/prog_tests/task_local_data.h          | 397 ++++++++++++++++++
> >  .../selftests/bpf/progs/task_local_data.bpf.h | 232 ++++++++++
> >  2 files changed, 629 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/task_local_data.h
> >  create mode 100644 tools/testing/selftests/bpf/progs/task_local_data.bpf.h
> >
>
> [...]
>
> > +               /*
> > +                * Only one tld_create_key() can increase the current cnt by one and
> > +                * takes the latest available slot. Other threads will check again if a new
> > +                * TLD can still be added, and then compete for the new slot after the
> > +                * succeeding thread update the size.
> > +                */
> > +               if (!atomic_compare_exchange_strong(&tld_metadata_p->cnt, &cnt, cnt + 1))
> > +                       goto retry;
> > +
> > +               strncpy(tld_metadata_p->metadata[i].name, name, TLD_NAME_LEN);
>
> from man page:
>
> Warning: If there is no null byte among the first n bytes of src, the
> string placed in dest will not be null-terminated.
>
> is that a concern?
>

It should be fine, since the BPF side compares names with strncmp().
A TLD name can therefore be up to TLD_NAME_LEN characters long, not
counting the null terminator.
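
To illustrate the point, here is a minimal sketch (not the library code)
of storing a name with strncpy() and matching it with a bounded
strncmp(). NAME_LEN, struct meta, meta_set_name() and
meta_name_matches() are hypothetical names made up for this example;
NAME_LEN only stands in for TLD_NAME_LEN and its value here is an
assumption.

```c
#include <string.h>

/* Hypothetical stand-in for TLD_NAME_LEN; the value 8 is an assumption. */
#define NAME_LEN 8

/* Hypothetical, simplified metadata entry holding only the name. */
struct meta {
	char name[NAME_LEN];
};

/*
 * Copy at most NAME_LEN bytes. If strlen(name) >= NAME_LEN, the stored
 * name is NOT null-terminated; shorter names are zero-padded.
 */
static void meta_set_name(struct meta *m, const char *name)
{
	strncpy(m->name, name, NAME_LEN);
}

/*
 * Bounded comparison: strncmp() stops after NAME_LEN bytes, so it never
 * reads past the end of m->name even when no terminator is stored.
 */
static int meta_name_matches(const struct meta *m, const char *name)
{
	return strncmp(m->name, name, NAME_LEN) == 0;
}
```

With NAME_LEN == 8, storing "12345678" leaves no terminator in the
buffer, yet meta_name_matches() still tells it apart from "1234567".
One caveat of this scheme: a lookup name longer than NAME_LEN that
shares its first NAME_LEN characters with a stored name also compares
equal, so over-long names may need to be rejected up front.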

> > +               atomic_store(&tld_metadata_p->metadata[i].size, size);
> > +               return (tld_key_t) {.off = (__s16)off};
> > +       }
> > +
> > +       return (tld_key_t) {.off = -ENOSPC};
>
> I don't know if C++ compiler will like this, but in C just
> `(tld_key_t){-ENOSPC}` should work fine
>

Designated initializers have been supported since C++20, but I can also
just use (tld_key_t){-ENOSPC} to make it less verbose.
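
For reference, a small sketch showing that the two spellings produce the
same value in C. demo_key_t is a hypothetical mirror of the key type and
assumes it has a single signed 16-bit member named off; the helper names
are made up for this example. Note that the compound literal syntax
itself is a C99 feature that GCC and Clang accept in C++ only as an
extension.

```c
#include <errno.h>

/* Hypothetical mirror of the opaque key type (single-member assumption). */
typedef struct {
	short off;
} demo_key_t;

/* Compound literal with a designated initializer. */
static demo_key_t key_designated(void)
{
	return (demo_key_t){ .off = -ENOSPC };
}

/* Compound literal with a positional initializer: shorter, same result. */
static demo_key_t key_positional(void)
{
	return (demo_key_t){ -ENOSPC };
}

/* Negative offsets encode an errno, mirroring the error-check helpers. */
static int key_is_err(demo_key_t key)
{
	return key.off < 0;
}
```

The positional form relies on off being the first member, so it would
silently change meaning if a member were ever added in front of it; the
designated form does not have that problem.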

> > +}
> > +
> > +/**
> > + * TLD_DEFINE_KEY() - Defines a TLD and a file-scope key associated with the TLD.
> > + *
> > + * @name: The name of the TLD
> > + * @size: The size of the TLD
> > + * @key: The variable name of the key. Cannot exceed TLD_NAME_LEN
> > + *
> > + * The macro can only be used in file scope.
> > + *
> > + * A file-scope key of opaque type, tld_key_t, will be declared and initialized before
>
> what's "file-scope"? it looks like a global (not even static)
> variable, so you can even reference it from other files with extern,
> no?
>

It is a global variable. "File scope" is just the term the C language
standard uses for an identifier declared outside of any block:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf
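
A short sketch of the distinction, with made-up names: scope describes
where an identifier is visible within a translation unit, while linkage
decides whether other files can refer to the same object.

```c
/*
 * File scope, external linkage: another translation unit can reach it
 * with `extern int shared_counter;`.
 */
int shared_counter = 1;

/* File scope, internal linkage: visible only in this translation unit. */
static int private_counter = 2;

int counters_sum(void)
{
	/* Block scope: visible only inside this function body. */
	int local = 3;

	return shared_counter + private_counter + local;
}
```

Both shared_counter and private_counter have file scope; only the
static one is hidden from other files.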

> > + * main() starts. Use tld_key_is_err() or tld_key_err_or_zero() later to check if the key
> > + * creation succeeded. Pass the key to tld_get_data() to get a pointer to the TLD.
> > + * bpf programs can also fetch the same key by name.
> > + *
> > + * The total size of TLDs created using TLD_DEFINE_KEY() cannot exceed a page. Just
> > + * enough memory will be allocated for each thread on the first call to tld_get_data().
> > + */
>
> [...]


end of thread, other threads:[~2025-07-01 22:47 UTC | newest]

Thread overview: 8+ messages
2025-06-27 23:39 [PATCH bpf-next v5 0/3] Task local data Amery Hung
2025-06-27 23:39 ` [PATCH bpf-next v5 1/3] selftests/bpf: Introduce task " Amery Hung
2025-07-01 22:02   ` Andrii Nakryiko
2025-07-01 22:47     ` Amery Hung
2025-06-27 23:39 ` [PATCH bpf-next v5 2/3] selftests/bpf: Test basic task local data operations Amery Hung
2025-06-30 11:24   ` Jiri Olsa
2025-06-30 16:42     ` Amery Hung
2025-06-27 23:39 ` [PATCH bpf-next v5 3/3] selftests/bpf: Test concurrent task local data key creation Amery Hung
