public inbox for dev@dpdk.org
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>, <dev@dpdk.org>
Subject: RE: [PATCH v2] mempool: simplify get objects
Date: Tue, 3 Feb 2026 11:03:46 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35F656E8@smartserver.smartshare.dk>
In-Reply-To: <20260120101701.467039-1-mb@smartsharesystems.com>

PING for review.

Here's some elaboration for reviewers...

Clearly, when the request can be served from the cache (n <= cache->len), the patch is correct, regardless of whether n is constant or variable:

	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE * 2);
	if (likely(n <= cache->len)) {
		/*
		 * The entire request can be satisfied from the cache.
		 * If the request size is known at build time,
		 * the compiler unrolls the fixed length copy loop.
		 */
		cache->len -= n;
		for (index = 0; index < n; index++)
			*obj_table++ = *--cache_objs;

		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);

		return 0;
	}
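
To make the "clearly" above a bit more tangible, here is a small toy model of the cache-hit case. It is only a sketch, not DPDK code: struct toy_cache, TOY_CACHE_MAX and the toy_* functions are hypothetical names, objects are plain ints, and the assert() merely stands in for __rte_assume(). It compares the patched fast path with the path the original code takes for a variable n (RTE_MIN(), then the remaining == 0 check), for every n <= cache->len.

```c
#include <assert.h>
#include <string.h>

#define TOY_CACHE_MAX 8 /* hypothetical capacity, not the DPDK constant */

struct toy_cache {
	unsigned int len;
	int objs[TOY_CACHE_MAX];
};

/* Patched fast path: returns 0 on a cache hit, -1 otherwise. */
static int toy_get_new(struct toy_cache *c, int *obj_table, unsigned int n)
{
	int *cache_objs = &c->objs[c->len];
	unsigned int index;

	assert(c->len <= TOY_CACHE_MAX); /* stands in for __rte_assume() */
	if (n <= c->len) {
		c->len -= n;
		for (index = 0; index < n; index++)
			*obj_table++ = *--cache_objs;
		return 0;
	}
	return -1; /* miss handling is out of scope for this sketch */
}

/* Original path taken when n is not a build time constant. */
static int toy_get_old_variable_n(struct toy_cache *c, int *obj_table,
		unsigned int n)
{
	int *cache_objs = &c->objs[c->len];
	unsigned int index, len, remaining;

	len = n < c->len ? n : c->len; /* RTE_MIN(n, cache->len) */
	c->len -= len;
	remaining = n - len;
	for (index = 0; index < len; index++)
		*obj_table++ = *--cache_objs;
	if (remaining == 0)
		return 0;
	return -1; /* miss handling is out of scope for this sketch */
}

/* Returns 1 if both paths agree for every n <= cache length, else 0. */
static int toy_hit_paths_agree(void)
{
	for (unsigned int len = 0; len <= TOY_CACHE_MAX; len++) {
		for (unsigned int n = 0; n <= len; n++) {
			struct toy_cache a = { .len = len }, b;
			int out_a[TOY_CACHE_MAX] = {0};
			int out_b[TOY_CACHE_MAX] = {0};

			for (unsigned int i = 0; i < len; i++)
				a.objs[i] = 100 + (int)i;
			b = a;
			if (toy_get_new(&a, out_a, n) != 0 ||
			    toy_get_old_variable_n(&b, out_b, n) != 0)
				return 0;
			if (a.len != b.len ||
			    memcmp(out_a, out_b, sizeof(out_a)) != 0)
				return 0;
		}
	}
	return 1;
}
```

Calling toy_hit_paths_agree() from a test program should return 1 if both paths hand out the same objects in the same order and leave the same cache length behind.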


Now, let's see what the original code does when the request cannot be served from the cache,
i.e. when n > cache->len:

	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE * 2);
	if (__rte_constant(n) && n <= cache->len) {
// FALSE, because n > cache->len,
// regardless of whether n is constant or variable
//		/*
//		 * The request size is known at build time, and
//		 * the entire request can be satisfied from the cache,
//		 * so let the compiler unroll the fixed length copy loop.
//		 */
//		cache->len -= n;
//		for (index = 0; index < n; index++)
//			*obj_table++ = *--cache_objs;
//
//		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
//		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
//
//		return 0;
//	}

	/*
	 * Use the cache as much as we have to return hot objects first.
	 * If the request size 'n' is known at build time, the above comparison
	 * ensures that n > cache->len here, so omit RTE_MIN().
	 */
	len = __rte_constant(n) ? cache->len : RTE_MIN(n, cache->len);
// ALWAYS: len = cache->len
// When n is constant:
//	len = cache->len
// When n is variable:
//	len = RTE_MIN(n, cache->len)
//		= cache->len, because n > cache->len
	cache->len -= len;
// ALWAYS: cache->len = 0, because len == cache->len
	remaining = n - len;
	for (index = 0; index < len; index++)
		*obj_table++ = *--cache_objs;

	/*
	 * If the request size 'n' is known at build time, the case
	 * where the entire request can be satisfied from the cache
	 * has already been handled above, so omit handling it here.
	 */
	if (!__rte_constant(n) && likely(remaining == 0)) {
// FALSE, because remaining > 0
//		/* The entire request is satisfied from the cache. */
//
//		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
//		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
//
//		return 0;
//	}

	/* Dequeue below would overflow mem allocated for cache? */
	if (unlikely(remaining > RTE_MEMPOOL_CACHE_MAX_SIZE))
		goto driver_dequeue;
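
So when n > cache->len, the annotations above reduce the original code to len = cache->len, cache->len = 0 and remaining = n - len > 0, which is what the patched code writes directly. The following toy check of that arithmetic is only a sketch under assumptions: the toy_* names and TOY_CACHE_MAX are hypothetical, and the n_is_constant flag stands in for __rte_constant(n), which cannot be modelled at run time.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_CACHE_MAX 16 /* hypothetical bound for the exhaustive check */

struct toy_state {
	unsigned int len;       /* objects taken from the cache */
	unsigned int remaining; /* objects still to be fetched elsewhere */
	unsigned int cache_len; /* cache->len after the cache part */
};

/* Original arithmetic; n_is_constant stands in for __rte_constant(n). */
static struct toy_state toy_old_miss(unsigned int cache_len, unsigned int n,
		bool n_is_constant)
{
	struct toy_state s;
	unsigned int min = n < cache_len ? n : cache_len; /* RTE_MIN() */

	s.len = n_is_constant ? cache_len : min;
	s.cache_len = cache_len - s.len;
	s.remaining = n - s.len;
	return s;
}

/* Patched arithmetic. */
static struct toy_state toy_new_miss(unsigned int cache_len, unsigned int n)
{
	struct toy_state s;

	s.len = cache_len;
	s.remaining = n - s.len;
	s.cache_len = 0;
	return s;
}

static bool toy_same(struct toy_state a, struct toy_state b)
{
	return a.len == b.len && a.remaining == b.remaining &&
		a.cache_len == b.cache_len;
}

/* Returns true if old and new agree for every n > cache_len in range. */
static bool toy_miss_paths_agree(void)
{
	for (unsigned int cache_len = 0; cache_len <= TOY_CACHE_MAX; cache_len++) {
		for (unsigned int n = cache_len + 1; n <= 2 * TOY_CACHE_MAX; n++) {
			struct toy_state s = toy_new_miss(cache_len, n);

			assert(s.remaining > 0); /* because n > cache_len */
			if (!toy_same(toy_old_miss(cache_len, n, true), s) ||
			    !toy_same(toy_old_miss(cache_len, n, false), s))
				return false;
		}
	}
	return true;
}
```

toy_miss_paths_agree() should return true for both the constant and the variable flavour of n, matching the "ALWAYS" annotations above.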

Venlig hilsen / Kind regards,
-Morten Brørup


> -----Original Message-----
> From: Morten Brørup [mailto:mb@smartsharesystems.com]
> Sent: Tuesday, 20 January 2026 11.17
> To: Andrew Rybchenko; dev@dpdk.org
> Cc: Morten Brørup
> Subject: [PATCH v2] mempool: simplify get objects
> 
> Removed the explicit test for a build time constant request size,
> and added a comment that the compiler unrolls the loop when the request
> size is a build time constant, to improve source code readability.
> 
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
> v2:
> * Removed unrelated microoptimization from rte_mempool_do_generic_put(),
>   which was also described incorrectly.
> ---
>  lib/mempool/rte_mempool.h | 35 ++++++++---------------------------
>  1 file changed, 8 insertions(+), 27 deletions(-)
> 
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index aedc100964..4213784e14 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -1531,11 +1531,11 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
>  	cache_objs = &cache->objs[cache->len];
> 
>  	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE * 2);
> -	if (__rte_constant(n) && n <= cache->len) {
> +	if (likely(n <= cache->len)) {
>  		/*
> -		 * The request size is known at build time, and
> -		 * the entire request can be satisfied from the cache,
> -		 * so let the compiler unroll the fixed length copy loop.
> +		 * The entire request can be satisfied from the cache.
> +		 * If the request size is known at build time,
> +		 * the compiler unrolls the fixed length copy loop.
>  		 */
>  		cache->len -= n;
>  		for (index = 0; index < n; index++)
> @@ -1547,31 +1547,13 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
>  		return 0;
>  	}
> 
> -	/*
> -	 * Use the cache as much as we have to return hot objects first.
> -	 * If the request size 'n' is known at build time, the above comparison
> -	 * ensures that n > cache->len here, so omit RTE_MIN().
> -	 */
> -	len = __rte_constant(n) ? cache->len : RTE_MIN(n, cache->len);
> -	cache->len -= len;
> +	/* Use the cache as much as we have to return hot objects first. */
> +	len = cache->len;
>  	remaining = n - len;
> +	cache->len = 0;
>  	for (index = 0; index < len; index++)
>  		*obj_table++ = *--cache_objs;
> 
> -	/*
> -	 * If the request size 'n' is known at build time, the case
> -	 * where the entire request can be satisfied from the cache
> -	 * has already been handled above, so omit handling it here.
> -	 */
> -	if (!__rte_constant(n) && likely(remaining == 0)) {
> -		/* The entire request is satisfied from the cache. */
> -
> -		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
> -		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
> -
> -		return 0;
> -	}
> -
>  	/* Dequeue below would overflow mem allocated for cache? */
>  	if (unlikely(remaining > RTE_MEMPOOL_CACHE_MAX_SIZE))
>  		goto driver_dequeue;
> @@ -1592,11 +1574,10 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
>  	__rte_assume(cache->size <= RTE_MEMPOOL_CACHE_MAX_SIZE);
>  	__rte_assume(remaining <= RTE_MEMPOOL_CACHE_MAX_SIZE);
>  	cache_objs = &cache->objs[cache->size + remaining];
> +	cache->len = cache->size;
>  	for (index = 0; index < remaining; index++)
>  		*obj_table++ = *--cache_objs;
> 
> -	cache->len = cache->size;
> -
>  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
>  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
> 
> --
> 2.43.0



Thread overview: 9+ messages
2026-01-20  8:20 [PATCH] mempool: simplify get objects Morten Brørup
2026-01-20  8:57 ` Morten Brørup
2026-01-20 10:17 ` [PATCH v2] " Morten Brørup
2026-01-20 20:00   ` Stephen Hemminger
2026-01-21 11:17     ` Morten Brørup
2026-02-03 10:03   ` Morten Brørup [this message]
2026-02-16  9:27 ` [PATCH v3] " Morten Brørup
2026-02-17  6:53   ` Andrew Rybchenko
2026-03-17  8:51     ` Thomas Monjalon
