All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Monjalon <thomas@monjalon.net>
To: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Cc: "Morten Brørup" <mb@smartsharesystems.com>,
	"Bruce Richardson" <bruce.richardson@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Stephen Hemminger" <stephen@networkplumber.org>,
	"Wathsala Vithanage" <wathsala.vithanage@arm.com>,
	Fengchengwen <fengchengwen@huawei.com>
Subject: Re: [PATCH v7] mbuf: optimize segment prefree
Date: Fri, 24 Oct 2025 12:14:54 +0200	[thread overview]
Message-ID: <9617057.OUTRe80PYV@thomas> (raw)
In-Reply-To: <1d47dfea2fcc4dfc8507dd0ba3cbe6ec@huawei.com>

24/10/2025 11:21, Konstantin Ananyev:
> 
> > > > Refactored rte_pktmbuf_prefree_seg() for both performance and
> > > readability.
> > > >
> > > > With the optimized RTE_MBUF_DIRECT() macro, the common likely code
> > > path
> > > > now fits within one instruction cache line on x86-64 when built with
> > > GCC.
> > > >
> > > > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > > > Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> > > > Acked-by: Chengwen Feng <fengchengwen@huawei.com>
> > > > Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > ---
> > > > v7:
> > > > * Go back to long names instead of numerical value in
> > > RTE_MBUF_DIRECT()
> > > >   macro.
> > > >   (Konstantin Ananyev)
> > > > * Updated static_assert() accordingly.
> > 
> > [...]
> > 
> > > >   *
> > > >   * If a mbuf embeds its own data after the rte_mbuf structure, this
> > > mbuf
> > > >   * can be defined as a direct mbuf.
> > > > - */
> > > > + *
> > > > + * Note: Macro optimized for code size.
> > > > + *
> > > > + * The plain macro would be:
> > > > + * \code{.c}
> > > > + *      #define RTE_MBUF_DIRECT(mb) \
> > > > + *          (!((mb)->ol_flags & (RTE_MBUF_F_INDIRECT |
> > > > RTE_MBUF_F_EXTERNAL)))
> > > > + * \endcode
> > > > + *
> > > > + * The flags RTE_MBUF_F_INDIRECT and RTE_MBUF_F_EXTERNAL are both in
> > > > the MSB (most significant
> > > > + * byte) of the 64-bit ol_flags field, so we only compare this one
> > > byte instead of
> > > > all 64 bits.
> > > > + *
> > > > + * E.g., GCC version 16.0.0 20251019 (experimental) generates the
> > > following
> > > > code for x86-64.
> > > > + *
> > > > + * With the plain macro, 17 bytes of instructions:
> > > > + * \code
> > > > + *      movabs rax,0x6000000000000000       // 10 bytes
> > > > + *      and    rax,QWORD PTR [rdi+0x18]     // 4 bytes
> > > > + *      sete   al                           // 3 bytes
> > > > + * \endcode
> > > > + * With this optimized macro, only 7 bytes of instructions:
> > > > + * \code
> > > > + *      test   BYTE PTR [rdi+0x1f],0x60     // 4 bytes
> > > > + *      sete   al                           // 3 bytes
> > > > + * \endcode
> > > > + */
> > > > +#ifdef __DOXYGEN__
> > > > +#define RTE_MBUF_DIRECT(mb) \
> > > > +	!(((const char *)(&(mb)->ol_flags))[MSB_OFFSET /* 7 or 0,
> > > depending on
> > > > endianness */] & \
> > > > +	(char)((RTE_MBUF_F_INDIRECT | RTE_MBUF_F_EXTERNAL) >> (7 *
> > > > CHAR_BIT)))
> > > > +#else /* !__DOXYGEN__ */
> > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > +/* On little endian architecture, the MSB of a 64-bit integer is at
> > > byte offset 7. */
> > > > +#define RTE_MBUF_DIRECT(mb) \
> > > > +	!(((const char *)(&(mb)->ol_flags))[7] & \
> > > > +	(char)((RTE_MBUF_F_INDIRECT | RTE_MBUF_F_EXTERNAL) >> (7 *
> > > > CHAR_BIT)))
> > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > +/* On big endian architecture, the MSB of a 64-bit integer is at
> > > byte offset 0. */
> > > >  #define RTE_MBUF_DIRECT(mb) \
> > > > -	(!((mb)->ol_flags & (RTE_MBUF_F_INDIRECT |
> > > > RTE_MBUF_F_EXTERNAL)))
> > > > +	!(((const char *)(&(mb)->ol_flags))[0] & \
> > > > +	(char)((RTE_MBUF_F_INDIRECT | RTE_MBUF_F_EXTERNAL) >> (7 *
> > > > CHAR_BIT)))
> > > > +#endif /* RTE_BYTE_ORDER */
> > > > +#endif /* !__DOXYGEN__ */
> > > > +/* Verify the optimization above. */
> > > > +static_assert(((RTE_MBUF_F_INDIRECT | RTE_MBUF_F_EXTERNAL) &
> > > > (UINT64_C(0xFF) << (7 * CHAR_BIT))) ==
> > > > +	(RTE_MBUF_F_INDIRECT | RTE_MBUF_F_EXTERNAL),
> > > > +	"(RTE_MBUF_F_INDIRECT | RTE_MBUF_F_EXTERNAL) is not at MSB");
> > > >
> > > >  /** Uninitialized or unspecified port. */
> > > >  #define RTE_MBUF_PORT_INVALID UINT16_MAX
> > > > --
> > >
> > > LGTM, thanks for refactoring.
> > 
> > Thank you for reviewing, Konstantin.
> > 
> > I had no preference for v7 or v6, but Bruce and Thomas preferred v6, so v6 was
> > applied.
> 
> Yes, I saw Thomas email, after I sent my reply already.
> Looks like I was late with my vote.
> My preference still would be to avoid hard-coded constants in the code,
> but seems that it is just me.

Me too I want to avoid hardcoded constants.
But in this case, it is very well documented,
and there is a trade-off with length and reading.

The comment starts with
* The plain macro would be:
* \code{.c}
*      #define RTE_MBUF_DIRECT(mb) \
*          (!((mb)->ol_flags & (RTE_MBUF_F_INDIRECT | RTE_MBUF_F_EXTERNAL)))
* \endcode

so I believe it is very clear already.



  reply	other threads:[~2025-10-24 10:15 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-27 21:35 [PATCH] mbuf: optimize segment prefree Morten Brørup
2025-08-27 23:17 ` Stephen Hemminger
2025-10-06 17:46   ` Wathsala Vithanage
2025-10-06 18:26     ` Morten Brørup
2025-10-06 14:49 ` Morten Brørup
2025-10-20 12:02 ` [PATCH v2] " Morten Brørup
2025-10-20 14:24   ` Konstantin Ananyev
2025-10-21  8:38   ` fengchengwen
2025-10-22  9:08   ` Bruce Richardson
2025-10-22 13:53     ` Morten Brørup
2025-10-22 14:12       ` Bruce Richardson
2025-10-22 14:14         ` Morten Brørup
2025-10-22 13:23 ` [PATCH v3] " Morten Brørup
2025-10-22 14:47 ` [PATCH v4] " Morten Brørup
2025-10-22 15:02   ` Bruce Richardson
2025-10-22 18:28     ` Morten Brørup
2025-10-23  7:04     ` Morten Brørup
2025-10-23  8:01 ` [PATCH v5] " Morten Brørup
2025-10-23  8:08   ` Bruce Richardson
2025-10-23  8:51   ` Konstantin Ananyev
2025-10-23 11:17     ` Morten Brørup
2025-10-23 14:04       ` Konstantin Ananyev
2025-10-23 14:48         ` Morten Brørup
2025-10-23 15:27           ` Konstantin Ananyev
2025-10-23 15:46             ` Morten Brørup
2025-10-23 16:03               ` Bruce Richardson
2025-10-23 16:24                 ` Morten Brørup
2025-10-23 12:48 ` [PATCH v6] " Morten Brørup
2025-10-23 17:13   ` Thomas Monjalon
2025-10-23 16:18 ` [PATCH v7] " Morten Brørup
2025-10-24  8:20   ` Konstantin Ananyev
2025-10-24  8:58     ` Morten Brørup
2025-10-24  9:21       ` Konstantin Ananyev
2025-10-24 10:14         ` Thomas Monjalon [this message]
2025-10-25 10:24           ` Konstantin Ananyev
2025-10-25 12:16             ` Morten Brørup
2025-10-27  8:54               ` Konstantin Ananyev
2025-10-27  9:00                 ` Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9617057.OUTRe80PYV@thomas \
    --to=thomas@monjalon.net \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=fengchengwen@huawei.com \
    --cc=konstantin.ananyev@huawei.com \
    --cc=mb@smartsharesystems.com \
    --cc=stephen@networkplumber.org \
    --cc=wathsala.vithanage@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.