All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bruce Richardson <bruce.richardson@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>, <dev@dpdk.org>
Subject: Re: [PATCH] mempool: micro optimizations
Date: Thu, 27 Feb 2025 09:17:10 +0000	[thread overview]
Message-ID: <Z8Atll7rmVM1p4Ee@bricha3-mobl1.ger.corp.intel.com> (raw)
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9FA86@smartserver.smartshare.dk>

On Thu, Feb 27, 2025 at 10:14:27AM +0100, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Wednesday, 26 February 2025 17.53
> > 
> > On Wed, Feb 26, 2025 at 03:59:22PM +0000, Morten Brørup wrote:
> > > The comparisons lcore_id < RTE_MAX_LCORE and lcore_id != LCORE_ID_ANY
> > are
> > > equivalent, but the latter compiles to fewer bytes of code space.
> > > Similarly for lcore_id >= RTE_MAX_LCORE and lcore_id == LCORE_ID_ANY.
> > >
> > > The rte_mempool_get_ops() function is also used in the fast path, so
> > > RTE_VERIFY() was replaced by RTE_ASSERT().
> > >
> > > Compilers implicitly consider comparisons of variable == 0 likely, so
> > > unlikely() was added to the check for no mempool cache (mp-
> > >cache_size ==
> > > 0) in the rte_mempool_default_cache() function.
> > >
> > > The rte_mempool_do_generic_put() function for adding objects to a
> > mempool
> > > was refactored as follows:
> > > - The comparison for the request itself being too big, which is
> > considered
> > >   unlikely, was moved down and out of the code path where the cache
> > has
> > >   sufficient room for the added objects, which is considered the most
> > >   likely code path.
> > > - Added __rte_assume() about the cache length, size and threshold,
> > for
> > >   compiler optimization when "n" is compile time constant.
> > > - Added __rte_assume() about "ret" being zero, so other functions
> > using
> > >   the value returned by this function can be potentially optimized by
> > the
> > >   compiler; especially when it merges multiple sequential code paths
> > of
> > >   inlined code depending on the return value being either zero or
> > >   negative.
> > > - The refactored source code (with comments) made the separate
> > comment
> > >   describing the cache flush/add algorithm superfluous, so it was
> > removed.
> > >
> > > A few more likely()/unlikely() were added.
> > >
> > > A few comments were improved for readability.
> > >
> > > Some assertions, RTE_ASSERT(), were added. Most importantly to assert
> > that
> > > the return values of the mempool drivers' enqueue and dequeue
> > operations
> > > are API compliant, i.e. 0 (for success) or negative (for failure),
> > and
> > > never positive.
> > >
> > > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > > ---
> > >  lib/mempool/rte_mempool.h | 67 ++++++++++++++++++++++---------------
> > --
> > >  1 file changed, 38 insertions(+), 29 deletions(-)
> > >
> > Is there any measurable performance change with these modifications?
> 
> It varies.
> Here are some of the good ones, tested on a VM under VMware:
> 
> mempool_autotest cache=512 cores=1
> n_get_bulk=64 n_put_bulk=64 n_keep=128 constant_n=0
> rate_persec=1309408130 -> 1417067889 : +8.2 %
> 
> mempool_autotest cache=512 cores=1
> n_get_bulk=64 n_put_bulk=64 n_keep=128 constant_n=1
> rate_persec=1479812844 -> 1573307159 : +6.3 %
> 
> mempool_autotest cache=512 cores=1
> n_max_bulk=32 n_keep=128 constant_n=0
> rate_persec=825183959 -> 868013386 : +5.2 %
> 
> The last result is from a new type of test where the size of every get/put varies between 1 and n_max_bulk, so the CPU's dynamic branch predictor cannot predict the request size.
> I'll probably provide a separate patch for test_mempool_perf.c with this new test type, when I have finished it.
>

Thanks, those results look worthwhile so.

/Bruce 

  reply	other threads:[~2025-02-27  9:17 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-26 15:59 [PATCH] mempool: micro optimizations Morten Brørup
2025-02-26 16:53 ` Bruce Richardson
2025-02-27  9:14   ` Morten Brørup
2025-02-27  9:17     ` Bruce Richardson [this message]
2025-02-28 16:59       ` Morten Brørup
2025-03-25  7:13 ` Morten Brørup
2025-03-27 17:15 ` Bruce Richardson
2025-03-27 19:30   ` Morten Brørup
2025-03-30  8:09   ` Andrew Rybchenko
2025-06-08 17:53     ` Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z8Atll7rmVM1p4Ee@bricha3-mobl1.ger.corp.intel.com \
    --to=bruce.richardson@intel.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=dev@dpdk.org \
    --cc=mb@smartsharesystems.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.