DPDK-dev Archive on lore.kernel.org
From: Bruce Richardson <bruce.richardson@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: <dev@dpdk.org>, Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	"Jingjing Wu" <jingjing.wu@intel.com>,
	Praveen Shetty <praveen.shetty@intel.com>,
	"Hemant Agrawal" <hemant.agrawal@nxp.com>,
	Sachin Saxena <sachin.saxena@oss.nxp.com>
Subject: Re: [PATCH v5] mempool: improve cache behaviour and performance
Date: Fri, 22 May 2026 17:12:50 +0100
Message-ID: <ahCAgitAAK3e5Kf3@bricha3-mobl1.ger.corp.intel.com>
In-Reply-To: <20260419095526.39526-1-mb@smartsharesystems.com>

On Sun, Apr 19, 2026 at 09:55:26AM +0000, Morten Brørup wrote:
> This patch refactors the mempool cache to eliminate some unexpected
> behaviour and reduce the mempool cache miss rate.
> 
> 1.
> The actual cache size was 1.5 times the cache size specified at run-time
> mempool creation.
> This was obviously not expected by application developers.
> 
> 2.
> In get operations, the check for when to use the cache as bounce buffer
> did not respect the run-time configured cache size,
> but compared to the build time maximum possible cache size
> (RTE_MEMPOOL_CACHE_MAX_SIZE, default 512).
> E.g. with a configured cache size of 32 objects, getting 256 objects
> would first fetch 32 + 256 = 288 objects into the cache,
> and then move the 256 objects from the cache to the destination memory,
> instead of fetching the 256 objects directly to the destination memory.
> This had a performance cost.
> However, this is unlikely to occur in real applications, so it is not
> important in itself.
> 
> 3.
> When putting objects into a mempool, and the mempool cache did not have
> free space for so many objects,
> the cache was flushed completely, and the new objects were then put into
> the cache.
> I.e. the cache drain level was zero.
> This (complete cache flush) meant that a subsequent get operation (with
> the same number of objects) completely emptied the cache,
> so another subsequent get operation required replenishing the cache.
> 
> Similarly,
> when getting objects from a mempool, and the mempool cache did not hold so
> many objects,
> the cache was replenished to cache->size + remaining objects,
> and then (the remaining part of) the requested objects were fetched via
> the cache,
> which left the cache filled (to cache->size) at completion.
> I.e. the cache refill level was cache->size (plus some, depending on
> request size).
> 
> (1) was improved by generally comparing to cache->size instead of
> cache->flushthresh, when considering the capacity of the cache.
> The cache->flushthresh field is kept for API/ABI compatibility purposes,
> and initialized to cache->size instead of cache->size * 1.5.
> 
> (2) was improved by generally comparing to cache->size / 2 instead of
> RTE_MEMPOOL_CACHE_MAX_SIZE, when checking the bounce buffer limit.
> 
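To illustrate the bounce buffer check discussed in (2), here is a minimal sketch, not the code from the patch. The helper names and `TOY_CACHE_MAX_SIZE` are hypothetical, and the exact comparison operator (`<=`) is an assumption; only the two limits being compared against come from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the build-time RTE_MEMPOOL_CACHE_MAX_SIZE (default 512). */
#define TOY_CACHE_MAX_SIZE 512u

/*
 * Old check (as described in item 2): the request size is compared to the
 * build-time maximum, regardless of the run-time configured cache size.
 */
static bool old_get_goes_via_cache(unsigned int n)
{
	return n <= TOY_CACHE_MAX_SIZE;
}

/*
 * New check (as described in the fix for item 2): the request size is
 * compared to half of the run-time configured cache size.
 */
static bool new_get_goes_via_cache(unsigned int cache_size, unsigned int n)
{
	assert(cache_size <= TOY_CACHE_MAX_SIZE);
	return n <= cache_size / 2;
}
```

With a configured cache size of 32, a request for 256 objects passes the old check and is bounced through the cache, whereas the new check sends it directly to the destination memory.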
> (3) was improved by flushing and replenishing the cache by half its size,
> so a flush/refill can be followed randomly by get or put requests.
> This also reduced the number of objects in each flush/refill operation.
> 
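The half-size flush/refill idea in (3) can be sketched with a toy model that tracks only object counts. This is an illustration under stated assumptions, not the patch's implementation: `struct toy_cache`, `toy_put()` and `toy_get()` are hypothetical names, requests are assumed to be at most half the cache size (larger ones are assumed to bypass the cache), and the hot/cold ordering of objects is not modelled.

```c
#include <assert.h>

/* Toy model: counts only, no object pointers. */
struct toy_cache {
	unsigned int size;        /* configured cache size */
	unsigned int len;         /* objects currently held in the cache */
	unsigned int backend_ops; /* flushes to / refills from the backing pool */
};

/* Put n objects; on overflow, flush half the cache size first. */
static void toy_put(struct toy_cache *c, unsigned int n)
{
	assert(n <= c->size / 2); /* assumption: larger requests bypass the cache */
	if (c->len + n > c->size) {
		c->len -= c->size / 2; /* len > size / 2 here, so no underflow */
		c->backend_ops++;
	}
	c->len += n;
}

/* Get n objects; on a miss, refill by half the cache size first. */
static void toy_get(struct toy_cache *c, unsigned int n)
{
	assert(n <= c->size / 2); /* assumption: larger requests bypass the cache */
	if (c->len < n) {
		c->len += c->size / 2; /* len < size / 2 here, so no overflow */
		c->backend_ops++;
	}
	c->len -= n;
}
```

Starting from a full cache of 32 objects, a put of 8 triggers one flush and leaves 24 objects cached; the following get, get, put sequence of 8 objects each is then served from the cache without touching the backing pool.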
> As a consequence of these changes, the size of the array holding the
> objects in the cache (cache->objs[]) no longer needs to be
> 2 * RTE_MEMPOOL_CACHE_MAX_SIZE, and can be reduced to
> RTE_MEMPOOL_CACHE_MAX_SIZE at an API/ABI breaking release.
> 
> Performance data:
> With a real WAN Optimization application, where the number of allocated
> packets varies (as they are held in e.g. shaper queues), the mempool
> cache miss rate dropped from ca. 1/20 objects to ca. 1/48 objects.
> This was deployed in production at an ISP, using an effective cache
> size of 384 objects.
> 
> As a consequence of the improved mempool cache algorithm, some drivers
> were updated accordingly:
> - The Intel idpf PMD was updated regarding how much to backfill the
>   mempool cache in the AVX512 code.
> - The NXP dpaa and dpaa2 mempool drivers were updated to not set the
>   mempool cache flush threshold; doing this no longer has any effect, and
>   thus became superfluous.
> 
> Bugzilla ID: 1027
> Fixes: ea5dd2744b90 ("mempool: cache optimisations")
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
> Depends-on: patch-163181 ("net/intel: do not bypass mbuf lib for mbuf fast-free")
> ---
> v5:
> * Flush the cache from the bottom, where objects are colder, and move down
>   the remaining objects, which are hotter.
> * In the Intel idpf PMD, move up the hot objects in the cache and refill
>   with cold objects at the bottom.
> v4:
> * Added Bugzilla ID.
> * Added Fixes tag. For reference only.
> * Moved fast-free related update of Intel common driver out as a separate
>   patch, and depend on that patch.
> * Omitted unrelated changes to the Intel idpf AVX512 driver, specifically
>   fixing an indentation and adding mbuf instrumentation.
> * Omitted unrelated changes to the mempool library, specifically adding
>   __rte_restrict and changing a couple of comments to proper sentences.
> * Pleased checkpatches by swapping operators in a couple of comparisons.
> v3:
> * Fixed my copy-paste bug in idpf_splitq_rearm().
> v2:
> * Fixed issue found by abidiff:
>   Reverted cache objects array size reduction. Added a note instead.
> * Added missing mbuf instrumentation to the Intel idpf AVX512 driver.
> * Updated idpf_splitq_rearm() like idpf_singleq_rearm().
> * Added a few more __rte_assume(). (Inspired by AI review)
> * Updated NXP dpaa and dpaa2 mempool drivers to not set mempool cache
>   flush threshold.
> * Added release notes.
> * Added deprecation notes.
> ---
>  doc/guides/rel_notes/deprecation.rst          |  7 ++
>  doc/guides/rel_notes/release_26_07.rst        | 10 +++
>  drivers/mempool/dpaa/dpaa_mempool.c           | 14 ----
>  drivers/mempool/dpaa2/dpaa2_hw_mempool.c      | 14 ----
>  .../net/intel/idpf/idpf_common_rxtx_avx512.c  | 52 +++++++++++---
>  lib/mempool/rte_mempool.c                     | 14 +---
>  lib/mempool/rte_mempool.h                     | 70 ++++++++++++-------
>  7 files changed, 104 insertions(+), 77 deletions(-)
> 
Can the idpf and dpaa changes be made in separate patches, so we can review
the mempool changes alone in a single patch? Even if the commits can't work
independently, perhaps they can be separated for review, and then
squashed on apply?

/Bruce

Thread overview: 46+ messages
2026-04-08 14:13 [PATCH] mempool: improve cache behaviour and performance Morten Brørup
2026-04-08 15:41 ` Stephen Hemminger
2026-04-09 10:25 ` [PATCH v2] " Morten Brørup
2026-04-09 11:05 ` [PATCH v3] " Morten Brørup
2026-04-15 13:40   ` Morten Brørup
2026-04-18 11:15 ` [PATCH v4] " Morten Brørup
2026-04-19  9:55 ` [PATCH v5] " Morten Brørup
2026-04-22 12:27   ` Morten Brørup
2026-04-27 15:21   ` Morten Brørup
2026-04-28  7:44   ` Andrew Rybchenko
2026-05-22 16:11   ` Bruce Richardson
2026-05-26  8:41     ` Morten Brørup
2026-05-26  9:39       ` Bruce Richardson
2026-05-26 10:37         ` Morten Brørup
2026-05-26 17:45           ` Morten Brørup
2026-05-27  8:48             ` Bruce Richardson
2026-05-27  9:22               ` Morten Brørup
2026-05-22 16:12   ` Bruce Richardson [this message]
2026-05-26  8:57     ` Morten Brørup
2026-05-26 14:00 ` [PATCH v6] " Morten Brørup
2026-05-26 16:00   ` Morten Brørup
2026-06-01 13:36     ` Thomas Monjalon
2026-06-01 13:51       ` Morten Brørup
2026-06-01 14:19         ` Thomas Monjalon
2026-06-01 14:27           ` Morten Brørup
2026-05-29  8:53   ` fengchengwen
2026-05-29 11:43     ` Morten Brørup
2026-05-27 11:36 ` [PATCH v6] net/idpf: update for new mempool cache algorithm Morten Brørup
2026-05-27 11:36   ` [PATCH v6] mempool/dpaa: " Morten Brørup
2026-05-27 11:36   ` [PATCH v6] mempool/dpaa2: " Morten Brørup
2026-06-01 16:40 ` [PATCH v7] mempool: improve cache behaviour and performance Morten Brørup
2026-06-03 15:44   ` Thomas Monjalon
2026-06-01 18:36 ` [PATCH v7] net/idpf: update for new mempool cache algorithm Morten Brørup
2026-06-01 18:36   ` [PATCH v7] mempool/dpaa: " Morten Brørup
2026-06-02  6:51     ` Morten Brørup
2026-06-01 18:36   ` [PATCH v7] mempool/dpaa2: " Morten Brørup
2026-06-02  6:53     ` Morten Brørup
2026-06-02  6:45   ` [PATCH v7] net/idpf: " Morten Brørup
2026-06-10 11:21   ` Morten Brørup
2026-06-10 11:31     ` Bruce Richardson
2026-06-10 12:17       ` Thomas Monjalon
2026-06-10 12:34         ` Bruce Richardson
2026-06-10 11:31   ` Morten Brørup
2026-06-04 11:48 ` [PATCH v8] mempool: improve cache behaviour and performance Morten Brørup
2026-06-04 13:57   ` Morten Brørup
2026-06-10 11:06   ` Thomas Monjalon
