Re: [PATCH bpf-next v1 2/3] selftests/bpf: Simplify task_local_data memory allocation

public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed

From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: Amery Hung <ameryhung@gmail.com>, bpf@vger.kernel.org
Cc: alexei.starovoitov@gmail.com, andrii@kernel.org,
	daniel@iogearbox.net, eddyz87@gmail.com, memxor@gmail.com,
	ameryhung@gmail.com, kernel-team@meta.com
Subject: Re: [PATCH bpf-next v1 2/3] selftests/bpf: Simplify task_local_data memory allocation
Date: Fri, 27 Mar 2026 16:36:47 +0000	[thread overview]
Message-ID: <877bqxi600.fsf@gmail.com> (raw)
In-Reply-To: <20260326052437.590158-3-ameryhung@gmail.com>

Amery Hung <ameryhung@gmail.com> writes:

> Simplify data allocation by always using aligned_alloc() and passing
> size_pot, size rounded up to the closest power of two to alignment.
>
> Currently, aligned_alloc(page_size, size) is only intended to be used
> with memory allocators that can fulfill the request without rounding
> size up to page_size to conserve memory. This is enabled by defining
> TLD_DATA_USE_ALIGNED_ALLOC. The reason to align to page_size is due to
> the limitation of UPTR where only a page can be pinned to the kernel.
> Otherwise, malloc(size * 2) is used to allocate memory for data.
>
> However, we don't need to call aligned_alloc(page_size, size) to get
> a contiguous memory of size bytes within a page. aligned_alloc(size_pot,
> ...) will also do the trick. Therefore, just use aligned_alloc(size_pot,
> ...) universally.
>
> As for the size argument, create a new option,
> TLD_DONT_ROUND_UP_DATA_SIZE, to specify not rounding up the size.
> This preserves the current TLD_DATA_USE_ALIGNED_ALLOC behavior, allowing
> memory allocators with low overhead aligned_alloc() to not waste memory.
> To enable this, users need to make sure it is not an undefined behavior
> for the memory allocator to have size not being an integral multiple of
> alignment.
Why not simplify this further and just mmap() a page?

  data = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

That gives you a page-aligned, page-sized buffer with no ambiguity
about alignment, page boundary crossings, or aligned_alloc() POSIX
compliance. munmap() on cleanup instead of free().

The whole power-of-two rounding and TLD_DONT_ROUND_UP_DATA_SIZE
option adds complexity that's hard to justify for selftest
infrastructure. How does a user decide when to define/undefine
TLD_DONT_ROUND_UP? This is a selftest library -- the common case
should just work with the simplest possible approach
>
> Compared to the current implementation, !TLD_DATA_USE_ALIGNED_ALLOC
> used to always waste size-byte of memory due to malloc(size * 2).
> Now the worst case becomes size - 1 and the best case is 0 when the size
> is already a power of two.
>
> Signed-off-by: Amery Hung <ameryhung@gmail.com>
> ---
>  .../bpf/prog_tests/task_local_data.h          | 60 +++++++------------
>  1 file changed, 22 insertions(+), 38 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_data.h b/tools/testing/selftests/bpf/prog_tests/task_local_data.h
> index a52d8b549425..366a6739c086 100644
> --- a/tools/testing/selftests/bpf/prog_tests/task_local_data.h
> +++ b/tools/testing/selftests/bpf/prog_tests/task_local_data.h
> @@ -50,16 +50,13 @@
>   *   TLD_MAX_DATA_CNT.
>   *
>   *
> - *   TLD_DATA_USE_ALIGNED_ALLOC - Always use aligned_alloc() instead of malloc()
> + *   TLD_DONT_ROUND_UP_DATA_SIZE - Don't round up memory size allocated for data if
> + *   the memory allocator has low overhead aligned_alloc() implementation.
>   *
> - *   When allocating the memory for storing TLDs, we need to make sure there is a memory
> - *   region of the X bytes within a page. This is due to the limit posed by UPTR: memory
> - *   pinned to the kernel cannot exceed a page nor can it cross the page boundary. The
> - *   library normally calls malloc(2*X) given X bytes of total TLDs, and only uses
> - *   aligned_alloc(PAGE_SIZE, X) when X >= PAGE_SIZE / 2. This is to reduce memory wastage
> - *   as not all memory allocator can use the exact amount of memory requested to fulfill
> - *   aligned_alloc(). For example, some may round the size up to the alignment. Enable the
> - *   option to always use aligned_alloc() if the implementation has low memory overhead.
> + *   For some memory allocators, when calling aligned_alloc(alignment, size), size
> + *   does not need to be an integral multiple of alignment and it can be fulfilled
> + *   without using round_up(size, alignment) bytes of memory. Enable this option to
> + *   reduce memory usage.
>   */
>  
>  #define TLD_PAGE_SIZE getpagesize()
> @@ -68,6 +65,8 @@
>  #define TLD_ROUND_MASK(x, y) ((__typeof__(x))((y) - 1))
>  #define TLD_ROUND_UP(x, y) ((((x) - 1) | TLD_ROUND_MASK(x, y)) + 1)
>  
> +#define TLD_ROUND_UP_POWER_OF_TWO(x) (1UL << (sizeof(x) * 8 - __builtin_clzl(x - 1)))
> +
>  #define TLD_READ_ONCE(x) (*(volatile typeof(x) *)&(x))
>  
>  #ifndef TLD_DYN_DATA_SIZE
> @@ -111,7 +110,6 @@ struct tld_map_value {
>  
>  struct tld_meta_u * _Atomic tld_meta_p __attribute__((weak));
>  __thread struct tld_data_u *tld_data_p __attribute__((weak));
> -__thread void *tld_data_alloc_p __attribute__((weak));
>  
>  #ifdef TLD_FREE_DATA_ON_THREAD_EXIT
>  pthread_key_t tld_pthread_key __attribute__((weak));
> @@ -153,12 +151,10 @@ static int __tld_init_meta_p(void)
>  
>  static int __tld_init_data_p(int map_fd)
>  {
> -	bool use_aligned_alloc = false;
>  	struct tld_map_value map_val;
>  	struct tld_data_u *data;
> -	void *data_alloc = NULL;
>  	int err, tid_fd = -1;
> -	size_t size;
> +	size_t size, size_pot;
>  
>  	tid_fd = syscall(SYS_pidfd_open, sys_gettid(), O_EXCL);
>  	if (tid_fd < 0) {
> @@ -166,48 +162,37 @@ static int __tld_init_data_p(int map_fd)
>  		goto out;
>  	}
>  
> -#ifdef TLD_DATA_USE_ALIGNED_ALLOC
> -	use_aligned_alloc = true;
> -#endif
> -
>  	/*
>  	 * tld_meta_p->size = TLD_DYN_DATA_SIZE +
>  	 *          total size of TLDs defined via TLD_DEFINE_KEY()
>  	 */
>  	size = tld_meta_p->size + sizeof(struct tld_data_u);
> -	data_alloc = (use_aligned_alloc || size * 2 >= TLD_PAGE_SIZE) ?
> -		aligned_alloc(TLD_PAGE_SIZE, size) :
> -		malloc(size * 2);
> -	if (!data_alloc) {
> +	size_pot = TLD_ROUND_UP_POWER_OF_TWO(size);
> +#ifdef TLD_DONT_ROUND_UP_DATA_SIZE
> +	data = (struct tld_data_u *)aligned_alloc(size_pot, size);
> +#else
> +	data = (struct tld_data_u *)aligned_alloc(size_pot, size_pot);
> +#endif
> +	if (!data) {
>  		err = -ENOMEM;
>  		goto out;
>  	}
>  
>  	/*
>  	 * Always pass a page-aligned address to UPTR since the size of tld_map_value::data
> -	 * is a page in BTF. If data_alloc spans across two pages, use the page that contains large
> -	 * enough memory.
> +	 * is a page in BTF.
>  	 */
> -	if (TLD_PAGE_SIZE - (~TLD_PAGE_MASK & (intptr_t)data_alloc) >= tld_meta_p->size) {
> -		map_val.data = (void *)(TLD_PAGE_MASK & (intptr_t)data_alloc);
> -		data = data_alloc;
> -		data->start = (~TLD_PAGE_MASK & (intptr_t)data_alloc) +
> -			      offsetof(struct tld_data_u, data);
> -	} else {
> -		map_val.data = (void *)(TLD_ROUND_UP((intptr_t)data_alloc, TLD_PAGE_SIZE));
> -		data = (void *)(TLD_ROUND_UP((intptr_t)data_alloc, TLD_PAGE_SIZE));
> -		data->start = offsetof(struct tld_data_u, data);
> -	}
> +	map_val.data = (void *)(TLD_PAGE_MASK & (intptr_t)data);
> +	data->start = (~TLD_PAGE_MASK & (intptr_t)data) + sizeof(struct tld_data_u);
>  	map_val.meta = TLD_READ_ONCE(tld_meta_p);
>  
>  	err = bpf_map_update_elem(map_fd, &tid_fd, &map_val, 0);
>  	if (err) {
> -		free(data_alloc);
> +		free(data);
>  		goto out;
>  	}
>  
>  	tld_data_p = data;
> -	tld_data_alloc_p = data_alloc;
>  #ifdef TLD_FREE_DATA_ON_THREAD_EXIT
>  	pthread_setspecific(tld_pthread_key, (void *)1);
>  #endif
> @@ -375,9 +360,8 @@ static void *tld_get_data(int map_fd, tld_key_t key)
>  __attribute__((unused))
>  static void tld_free(void)
>  {
> -	if (tld_data_alloc_p) {
> -		free(tld_data_alloc_p);
> -		tld_data_alloc_p = NULL;
> +	if (tld_data_p) {
> +		free(tld_data_p);
>  		tld_data_p = NULL;
>  	}
>  }
> -- 
> 2.52.0

next prev parent reply	other threads:[~2026-03-27 16:36 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-26  5:24 [PATCH bpf-next v1 0/3] Task local data bug fixes and improvement Amery Hung
2026-03-26  5:24 ` [PATCH bpf-next v1 1/3] selftests/bpf: Fix task_local_data data allocation size Amery Hung
2026-03-26 10:27   ` sun jian
2026-03-26 15:56     ` Amery Hung
2026-03-26 16:17   ` Mykyta Yatsenko
2026-03-26 20:32     ` Amery Hung
2026-03-26  5:24 ` [PATCH bpf-next v1 2/3] selftests/bpf: Simplify task_local_data memory allocation Amery Hung
2026-03-27 16:36   ` Mykyta Yatsenko [this message]
2026-03-27 20:56     ` Amery Hung
2026-03-27 21:15       ` Mykyta Yatsenko
2026-03-26  5:24 ` [PATCH bpf-next v1 3/3] selftests/bpf: Make sure TLD_DEFINE_KEY runs first Amery Hung
2026-03-27 16:56   ` Mykyta Yatsenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877bqxi600.fsf@gmail.com \
    --to=mykyta.yatsenko5@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ameryhung@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=memxor@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox