From: Bruce Richardson <bruce.richardson@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: <dev@dpdk.org>, Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	"Jingjing Wu" <jingjing.wu@intel.com>,
	Praveen Shetty <praveen.shetty@intel.com>,
	"Hemant Agrawal" <hemant.agrawal@nxp.com>,
	Sachin Saxena <sachin.saxena@oss.nxp.com>
Subject: Re: [PATCH v5] mempool: improve cache behaviour and performance
Date: Fri, 22 May 2026 17:12:50 +0100
Message-ID: <ahCAgitAAK3e5Kf3@bricha3-mobl1.ger.corp.intel.com>
In-Reply-To: <20260419095526.39526-1-mb@smartsharesystems.com>

On Sun, Apr 19, 2026 at 09:55:26AM +0000, Morten Brørup wrote:
> This patch refactors the mempool cache to eliminate some unexpected
> behaviour and reduce the mempool cache miss rate.
> 
> 1.
> The actual cache size was 1.5 times the cache size specified at run-time
> mempool creation.
> This was obviously not expected by application developers.
> 
> 2.
> In get operations, the check for when to use the cache as bounce buffer
> did not respect the run-time configured cache size,
> but compared to the build time maximum possible cache size
> (RTE_MEMPOOL_CACHE_MAX_SIZE, default 512).
> E.g. with a configured cache size of 32 objects, getting 256 objects
> would first fetch 32 + 256 = 288 objects into the cache,
> and then move the 256 objects from the cache to the destination memory,
> instead of fetching the 256 objects directly to the destination memory.
> This had a performance cost.
> However, this is unlikely to occur in real applications, so it is not
> important in itself.
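
To make the arithmetic in (2) concrete, here is a minimal sketch in C. It is
not the mempool library code: the toy_* names are hypothetical, the exact
comparison operators are assumptions, and the old fetch count assumes the
cache is empty when the request arrives.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for RTE_MEMPOOL_CACHE_MAX_SIZE (default 512, per the text above). */
#define TOY_CACHE_MAX_SIZE 512u

/* Old check (assumed form): stage the request through the cache unless it
 * exceeds the build-time maximum, regardless of the configured size. */
static inline bool toy_via_cache_old(unsigned int n)
{
	return n <= TOY_CACHE_MAX_SIZE;
}

/* New check (assumed form): stage through the cache only if the request
 * fits within half of the run-time configured cache size. */
static inline bool toy_via_cache_new(unsigned int n, unsigned int size)
{
	assert(size <= TOY_CACHE_MAX_SIZE);
	return n <= size / 2;
}

/* Objects pulled from the backend for a get of n objects on an empty cache
 * under the old behaviour: size + n via the cache, otherwise n directly. */
static inline unsigned int toy_backend_fetch_old(unsigned int n, unsigned int size)
{
	return toy_via_cache_old(n) ? size + n : n;
}
```

With size = 32 and n = 256, toy_via_cache_old(256) is true and
toy_backend_fetch_old(256, 32) gives 288, matching the example above, while
toy_via_cache_new(256, 32) is false, so the request would bypass the cache.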
> 
> 3.
> When putting objects into a mempool, and the mempool cache did not have
> free space for so many objects,
> the cache was flushed completely, and the new objects were then put into
> the cache.
> I.e. the cache drain level was zero.
> This (complete cache flush) meant that a subsequent get operation (with
> the same number of objects) completely emptied the cache,
> so another subsequent get operation required replenishing the cache.
> 
> Similarly, when getting objects from a mempool, and the mempool cache did
> not hold so many objects,
> the cache was replenished to cache->size + remaining objects,
> and then (the remaining part of) the requested objects were fetched via
> the cache,
> which left the cache filled (to cache->size) at completion.
> I.e. the cache refill level was cache->size (plus some, depending on
> request size).
> 
> (1) was improved by generally comparing to cache->size instead of
> cache->flushthresh, when considering the capacity of the cache.
> The cache->flushthresh field is kept for API/ABI compatibility purposes,
> and initialized to cache->size instead of cache->size * 1.5.
> 
> (2) was improved by generally comparing to cache->size / 2 instead of
> RTE_MEMPOOL_CACHE_MAX_SIZE, when checking the bounce buffer limit.
> 
> (3) was improved by flushing and replenishing the cache by half its size,
> so a flush/refill can be followed randomly by get or put requests.
> This also reduced the number of objects in each flush/refill operation.
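
The half-size flush/refill in (3) can be modelled with counters only. This is
a sketch under stated assumptions, not the patch itself: struct toy_cache and
the toy_* functions are hypothetical, requests are limited to size / 2
objects, and only the object count is tracked, not which objects are hot or
cold.

```c
#include <assert.h>

struct toy_cache {
	unsigned int size;        /* configured cache size */
	unsigned int len;         /* objects currently in the cache */
	unsigned int backend_ops; /* flushes and refills against the backend */
};

/* Put n objects. On overflow, flush half the cache size to the backend
 * first, so the cache ends up partly full rather than holding only n. */
static inline void toy_put(struct toy_cache *c, unsigned int n)
{
	assert(n <= c->size / 2);
	if (c->len + n > c->size) {
		c->len -= c->size / 2;
		c->backend_ops++;
	}
	c->len += n;
}

/* Get n objects. On underflow, refill by half the cache size from the
 * backend first, so the cache is not left full afterwards. */
static inline void toy_get(struct toy_cache *c, unsigned int n)
{
	assert(n <= c->size / 2);
	if (c->len < n) {
		c->len += c->size / 2;
		c->backend_ops++;
	}
	c->len -= n;
}
```

Starting from a full cache of 32, a put of 8 flushes 16 and leaves 24 objects
cached, so two following gets of 8 are served without touching the backend.
With a drain level of zero, the second of those gets would have needed a
refill.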
> 
> As a consequence of these changes, the size of the array holding the
> objects in the cache (cache->objs[]) no longer needs to be
> 2 * RTE_MEMPOOL_CACHE_MAX_SIZE, and can be reduced to
> RTE_MEMPOOL_CACHE_MAX_SIZE at an API/ABI breaking release.
> 
> Performance data:
> With a real WAN Optimization application, where the number of allocated
> packets varies (as they are held in e.g. shaper queues), the mempool
> cache miss rate dropped from ca. 1/20 objects to ca. 1/48 objects.
> This was deployed in production at an ISP, and using an effective cache
> size of 384 objects.
> 
> As a consequence of the improved mempool cache algorithm, some drivers
> were updated accordingly:
> - The Intel idpf PMD was updated regarding how much to backfill the
>   mempool cache in the AVX512 code.
> - The NXP dpaa and dpaa2 mempool drivers were updated to not set the
>   mempool cache flush threshold; doing this no longer has any effect, and
>   thus became superfluous.
> 
> Bugzilla ID: 1027
> Fixes: ea5dd2744b90 ("mempool: cache optimisations")
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
> Depends-on: patch-163181 ("net/intel: do not bypass mbuf lib for mbuf fast-free")
> ---
> v5:
> * Flush the cache from the bottom, where objects are colder, and move down
>   the remaining objects, which are hotter.
> * In the Intel idpf PMD, move up the hot objects in the cache and refill
>   with cold objects at the bottom.
> v4:
> * Added Bugzilla ID.
> * Added Fixes tag. For reference only.
> * Moved fast-free related update of Intel common driver out as a separate
>   patch, and depend on that patch.
> * Omitted unrelated changes to the Intel idpf AVX512 driver, specifically
>   fixing an indentation and adding mbuf instrumentation.
> * Omitted unrelated changes to the mempool library, specifically adding
>   __rte_restrict and changing a couple of comments to proper sentences.
> * Please checkpatches by swapping operators in a couple of comparisons.
> v3:
> * Fixed my copy-paste bug in idpf_splitq_rearm().
> v2:
> * Fixed issue found by abidiff:
>   Reverted cache objects array size reduction. Added a note instead.
> * Added missing mbuf instrumentation to the Intel idpf AVX512 driver.
> * Updated idpf_splitq_rearm() like idpf_singleq_rearm().
> * Added a few more __rte_assume(). (Inspired by AI review)
> * Updated NXP dpaa and dpaa2 mempool drivers to not set mempool cache
>   flush threshold.
> * Added release notes.
> * Added deprecation notes.
> ---
>  doc/guides/rel_notes/deprecation.rst          |  7 ++
>  doc/guides/rel_notes/release_26_07.rst        | 10 +++
>  drivers/mempool/dpaa/dpaa_mempool.c           | 14 ----
>  drivers/mempool/dpaa2/dpaa2_hw_mempool.c      | 14 ----
>  .../net/intel/idpf/idpf_common_rxtx_avx512.c  | 52 +++++++++++---
>  lib/mempool/rte_mempool.c                     | 14 +---
>  lib/mempool/rte_mempool.h                     | 70 ++++++++++++-------
>  7 files changed, 104 insertions(+), 77 deletions(-)
> 
Can the idpf and dpaa changes be made in separate patches, so we can review
the mempool changes alone in a single patch? Even if the commits can't work
logically on their own, perhaps they can be separated for review, and then
squashed on apply?

/Bruce
