public inbox for dev@dpdk.org
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: <scott.k.mitch1@gmail.com>, <dev@dpdk.org>
Cc: <stephen@networkplumber.org>, <bruce.richardson@intel.com>
Subject: RE: [PATCH v7] eal: RTE_PTR_ADD/SUB char* for compiler optimizations
Date: Fri, 16 Jan 2026 12:39:57 +0100	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35F6566B@smartserver.smartshare.dk> (raw)
In-Reply-To: <20260114215648.47820-1-scott.k.mitch1@gmail.com>

> From: Scott Mitchell <scott.k.mitch1@gmail.com>
> 
> Modify RTE_PTR_ADD and RTE_PTR_SUB to use char* pointer arithmetic
> on Clang instead of uintptr_t casts. Benefits of this approach:
> - API compatibility: works for both integer and pointer inputs
> - Retains simple macros: no pragmas, no _Generic
> - Enables Clang optimizations: Clang can now unroll and vectorize
>   pointer loops. GCC uses uintptr_t to avoid false positive warnings.
> 
> Example use case which benefits is __rte_raw_cksum. Performance
> results from cksum_perf_autotest on Intel Xeon (Cascade Lake,
> AVX-512) built with Clang 18.1 (TSC cycles/byte):
>   Block size    Before    After    Improvement
>          100      0.40     0.24        ~40%
>         1500      0.50     0.06        ~8x
>         9000      0.49     0.06        ~8x
> 
> Signed-off-by: Scott Mitchell <scott.k.mitch1@gmail.com>
> 
> ---
> v7:
> - Fix tests: use TEST_BUFFER_SIZE macro for buffer allocation
> - Fix tests: ADD then SUB same amount to avoid out-of-bounds pointer
> arithmetic
> - All RTE_PTR_SUB tests now verify round-trip (ADD+SUB returns to
> original)
> 
> v6:
> - Make char* optimization Clang-only to avoid GCC false positive
> warnings
> - Improve tests: use named constants instead of magic numbers
> - Improve tests: use void* casts on expected values instead of
> uintptr_t on results
> 
> v5:
> - Initial implementation with char* arithmetic for all compilers
> 
> v4:
> - Used _Generic for type-based dispatch with char* casts
> - Had warnings on both Clang and GCC due to _Generic type-checking all
> branches
> ---
>  app/test/meson.build         |   1 +
>  app/test/test_ptr_add_sub.c  | 197 +++++++++++++++++++++++++++++++++++
>  lib/eal/include/rte_common.h |  14 ++-
>  3 files changed, 208 insertions(+), 4 deletions(-)
>  create mode 100644 app/test/test_ptr_add_sub.c
> 
> diff --git a/app/test/meson.build b/app/test/meson.build
> index efec42a6bf..80a19d65ba 100644
> --- a/app/test/meson.build
> +++ b/app/test/meson.build
> @@ -152,6 +152,7 @@ source_file_deps = {
>      'test_power_intel_uncore.c': ['power', 'power_intel_uncore'],
>      'test_power_kvm_vm.c': ['power', 'power_kvm_vm'],
>      'test_prefetch.c': [],
> +    'test_ptr_add_sub.c': [],
>      'test_ptr_compress.c': ['ptr_compress'],
>      'test_rand_perf.c': [],
>      'test_rawdev.c': ['rawdev', 'bus_vdev', 'raw_skeleton'],
> diff --git a/app/test/test_ptr_add_sub.c b/app/test/test_ptr_add_sub.c
> new file mode 100644
> index 0000000000..2956eabeb3
> --- /dev/null
> +++ b/app/test/test_ptr_add_sub.c
> @@ -0,0 +1,197 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2026 Apple Inc.
> + */
> +
> +#include "test.h"
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#include <rte_common.h>
> +
> +/* Test constants for large integer types (16-bit and wider) */
> +#define TEST_INITVAL_LARGE 0x1000
> +#define TEST_INCREMENT_LARGE 100
> +#define TEST_RETVAL_LARGE ((void *)0x1064)
> +#define TEST_SUBVAL_LARGE ((void *)(TEST_INITVAL_LARGE -
> TEST_INCREMENT_LARGE))
> +
> +/* Test constants for 8-bit types */
> +#define TEST_INITVAL_SMALL 100
> +#define TEST_INCREMENT_SMALL 50
> +#define TEST_RETVAL_SMALL ((void *)150)
> +#define TEST_SUBVAL_SMALL ((void *)50)
> +
> +/* Test constants for bool (1-bit) */
> +#define TEST_INITVAL_BOOL 1
> +#define TEST_INCREMENT_BOOL 99
> +#define TEST_RETVAL_BOOL ((void *)100)

TEST_RETVAL_BOOL is incorrect if typeof(ptr) is bool.
(int)(bool)(true+99) == 1

Now that _Generic is not used anymore, I suggest you remove test cases for other types than pointers.
At your preference, you may also keep pointer compatible types, i.e. 32 bit integers when building for 32-bit architecture and 64 bit integers when building for 64-bit architecture.

> +
> +/* Buffer size for pointer tests */
> +#define TEST_BUFFER_SIZE (TEST_INCREMENT_LARGE + TEST_INCREMENT_SMALL
> + 256)
> +
> +/* Test all C11 standard integer types */
> +static int
> +test_ptr_add_sub_integer_types(void)
> +{
> +	unsigned long long ull = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(ull, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for unsigned long long");
> +	TEST_ASSERT_EQUAL(RTE_PTR_SUB(ull, TEST_INCREMENT_LARGE),
> TEST_SUBVAL_LARGE,
> +		"RTE_PTR_SUB failed for unsigned long long");
> +
> +	long long ll = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(ll, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for long long");
> +
> +	unsigned long ul = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(ul, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for unsigned long");
> +
> +	long l = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(l, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for long");
> +
> +	unsigned int ui = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(ui, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for unsigned int");
> +
> +	int i = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(i, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for int");
> +
> +	unsigned short us = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(us, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for unsigned short");
> +
> +	short s = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(s, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for short");
> +
> +	unsigned char uc = TEST_INITVAL_SMALL;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(uc, TEST_INCREMENT_SMALL),
> TEST_RETVAL_SMALL,
> +		"RTE_PTR_ADD failed for unsigned char");
> +
> +	signed char sc = TEST_INITVAL_SMALL;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(sc, TEST_INCREMENT_SMALL),
> TEST_RETVAL_SMALL,
> +		"RTE_PTR_ADD failed for signed char");
> +
> +	char c = TEST_INITVAL_SMALL;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(c, TEST_INCREMENT_SMALL),
> TEST_RETVAL_SMALL,
> +		"RTE_PTR_ADD failed for char");
> +
> +	_Bool b = TEST_INITVAL_BOOL;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(b, TEST_INCREMENT_BOOL),
> TEST_RETVAL_BOOL,
> +		"RTE_PTR_ADD failed for _Bool");
> +
> +	bool b2 = TEST_INITVAL_BOOL;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(b2, TEST_INCREMENT_BOOL),
> TEST_RETVAL_BOOL,
> +		"RTE_PTR_ADD failed for bool");
> +
> +	return 0;
> +}
> +
> +/* Test pointer types including const correctness */
> +static int
> +test_ptr_add_sub_pointer_types(void)
> +{
> +	char buffer[TEST_BUFFER_SIZE];
> +	void *result;
> +
> +	/* Test void* and const void* */
> +	void *vp = buffer;
> +	result = RTE_PTR_ADD(vp, TEST_INCREMENT_LARGE);
> +	result = RTE_PTR_SUB(result, TEST_INCREMENT_LARGE);
> +	TEST_ASSERT_EQUAL(result, vp, "RTE_PTR_ADD/SUB failed for
> void*");
> +
> +	const void *cvp = buffer;
> +	result = RTE_PTR_ADD(cvp, TEST_INCREMENT_SMALL);
> +	result = RTE_PTR_SUB(result, TEST_INCREMENT_SMALL);
> +	TEST_ASSERT_EQUAL(result, (const void *)cvp,
> +		"RTE_PTR_ADD/SUB failed for const void*");
> +
> +	/* Test char* and const char* */
> +	char *cp = buffer;
> +	result = RTE_PTR_ADD(cp, TEST_INCREMENT_LARGE);
> +	result = RTE_PTR_SUB(result, TEST_INCREMENT_LARGE);
> +	TEST_ASSERT_EQUAL(result, (void *)cp, "RTE_PTR_ADD/SUB failed for
> char*");
> +
> +	const char *ccp = buffer;
> +	result = RTE_PTR_ADD(ccp, TEST_INCREMENT_SMALL);
> +	result = RTE_PTR_SUB(result, TEST_INCREMENT_SMALL);
> +	TEST_ASSERT_EQUAL(result, (const void *)ccp,
> +		"RTE_PTR_ADD/SUB failed for const char*");
> +
> +	/* Test uint32_t* and const uint32_t* */
> +	uint32_t *u32p = (uint32_t *)buffer;
> +	result = RTE_PTR_ADD(u32p, TEST_INCREMENT_LARGE);
> +	result = RTE_PTR_SUB(result, TEST_INCREMENT_LARGE);
> +	TEST_ASSERT_EQUAL(result, (void *)u32p,
> +		"RTE_PTR_ADD/SUB failed for uint32_t*");
> +
> +	const uint32_t *cu32p = (const uint32_t *)buffer;
> +	result = RTE_PTR_ADD(cu32p, TEST_INCREMENT_SMALL);
> +	result = RTE_PTR_SUB(result, TEST_INCREMENT_SMALL);
> +	TEST_ASSERT_EQUAL(result, (const void *)cu32p,
> +		"RTE_PTR_ADD/SUB failed for const uint32_t*");
> +
> +	/* Verify assigning to const pointer works (adding const is safe)
> */
> +	const void *result_const = RTE_PTR_ADD(cvp,
> TEST_INCREMENT_LARGE);
> +	result_const = RTE_PTR_SUB(result_const, TEST_INCREMENT_LARGE);
> +	TEST_ASSERT_EQUAL(result_const, cvp,
> +		"RTE_PTR_ADD/SUB failed when assigning to const void*");
> +
> +	return 0;
> +}
> +
> +/* Test that typedefs resolve to native types correctly */
> +static int
> +test_ptr_add_sub_typedefs(void)
> +{
> +	uint64_t u64 = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(u64, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for uint64_t");
> +
> +	uint32_t u32 = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(u32, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for uint32_t");
> +
> +	uint16_t u16 = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(u16, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for uint16_t");
> +
> +	uint8_t u8 = TEST_INITVAL_SMALL;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(u8, TEST_INCREMENT_SMALL),
> TEST_RETVAL_SMALL,
> +		"RTE_PTR_ADD failed for uint8_t");
> +
> +	uintptr_t uptr = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(uptr, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for uintptr_t");
> +
> +	size_t sz = TEST_INITVAL_LARGE;
> +	TEST_ASSERT_EQUAL(RTE_PTR_ADD(sz, TEST_INCREMENT_LARGE),
> TEST_RETVAL_LARGE,
> +		"RTE_PTR_ADD failed for size_t");
> +
> +	return 0;
> +}
> +
> +/* Main test function that runs all subtests */
> +static int
> +test_ptr_add_sub(void)
> +{
> +	int ret;
> +
> +	ret = test_ptr_add_sub_integer_types();
> +	if (ret != 0)
> +		return ret;
> +
> +	ret = test_ptr_add_sub_pointer_types();
> +	if (ret != 0)
> +		return ret;
> +
> +	ret = test_ptr_add_sub_typedefs();
> +	if (ret != 0)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +REGISTER_FAST_TEST(ptr_add_sub_autotest, true, true,
> test_ptr_add_sub);
> diff --git a/lib/eal/include/rte_common.h
> b/lib/eal/include/rte_common.h
> index 9e7d84f929..b525110b06 100644
> --- a/lib/eal/include/rte_common.h
> +++ b/lib/eal/include/rte_common.h
> @@ -549,14 +549,20 @@ static void
> __attribute__((destructor(RTE_PRIO(prio)), used)) func(void)
>  /*********** Macros for pointer arithmetic ********/
> 
>  /**
> - * add a byte-value offset to a pointer
> + * add and subtract a byte-value offset from a pointer
>   */
> -#define RTE_PTR_ADD(ptr, x) ((void*)((uintptr_t)(ptr) + (x)))
> -
> +#ifdef RTE_CC_CLANG
>  /**
> - * subtract a byte-value offset from a pointer
> + * Clang doesn't optimize through uintptr_t, (char*) enables
> + * optimizations and doesn't generate warnings. GCC does optimize
> + * through uintptr_t but throws warnings (e.g. array-bounds) when cast
> to char*.
>   */
> +#define RTE_PTR_ADD(ptr, x) ((void *)((char *)(uintptr_t)(ptr) + (x)))
> +#define RTE_PTR_SUB(ptr, x) ((void *)((char *)(uintptr_t)(ptr) - (x)))
> +#else
> +#define RTE_PTR_ADD(ptr, x) ((void *)((uintptr_t)(ptr) + (x)))
>  #define RTE_PTR_SUB(ptr, x) ((void *)((uintptr_t)(ptr) - (x)))
> +#endif

Please keep descriptions for RTE_PTR_ADD and RTE_PTR_SUB separate, so they show up for both macros in the API documentation:
https://doc.dpdk.org/api/rte__common_8h.html#a96c6294ef6a307980f78899fe8a5813e

> 
>  /**
>   * get the difference between two pointer values, i.e. how far apart
> --
> 2.39.5 (Apple Git-154)


  reply	other threads:[~2026-01-16 11:40 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-14 16:07 [PATCH v6] eal: RTE_PTR_ADD/SUB char* for compiler optimizations scott.k.mitch1
2026-01-14 21:56 ` [PATCH v7] " scott.k.mitch1
2026-01-16 11:39   ` Morten Brørup [this message]
2026-01-16 22:38     ` Scott Mitchell
2026-01-16 23:19   ` [PATCH v8] " scott.k.mitch1
2026-01-17  2:44     ` Stephen Hemminger
2026-01-17 21:07       ` Scott Mitchell
2026-01-17 21:03     ` [PATCH v9] " scott.k.mitch1
2026-01-18  6:12       ` Stephen Hemminger
2026-01-23 16:20         ` Scott Mitchell
2026-01-20 12:59       ` Morten Brørup
2026-01-23 16:27       ` [PATCH v10] " scott.k.mitch1
2026-01-24  7:58         ` Morten Brørup
2026-01-24  8:59         ` Scott Mitchell
2026-01-24 22:59           ` Scott Mitchell
2026-01-24  9:11         ` [PATCH v11] eal: RTE_PTR_ADD/SUB API improvements scott.k.mitch1
2026-01-25 11:11           ` scott.k.mitch1
2026-01-25 11:12           ` [PATCH v12] " scott.k.mitch1
2026-01-25 19:36             ` [REVIEW] " Stephen Hemminger
2026-01-25 22:24               ` Scott Mitchell
2026-01-26  8:19               ` Morten Brørup
2026-01-25 22:30             ` [PATCH v13] " scott.k.mitch1
2026-01-26  8:03               ` [PATCH v14] " scott.k.mitch1
2026-01-26 14:35                 ` Morten Brørup
2026-01-26 21:29                   ` Scott Mitchell
2026-01-27  5:11                   ` Scott Mitchell
2026-01-27  2:02                 ` [PATCH v15 0/2] " scott.k.mitch1
2026-01-27  2:02                   ` [PATCH v15 1/2] eal: remove alloc_size from rte_lcore_var_alloc scott.k.mitch1
2026-01-27  2:02                   ` [PATCH v15 2/2] eal: RTE_PTR_ADD/SUB API improvements scott.k.mitch1
2026-01-27  5:28                   ` [PATCH v16 0/2] " scott.k.mitch1
2026-01-27  5:28                     ` [PATCH v16 1/2] eal: remove alloc_size from rte_lcore_var_alloc scott.k.mitch1
2026-01-27 11:16                       ` Mattias Rönnblom
2026-01-27 14:07                       ` Stephen Hemminger
2026-01-27  5:29                     ` [PATCH v16 2/2] eal: RTE_PTR_ADD/SUB API improvements scott.k.mitch1
2026-02-02  5:24                     ` [PATCH v17 0/2] " scott.k.mitch1
2026-02-02  5:24                       ` [PATCH v17 1/2] eal: remove alloc_size from rte_lcore_var_alloc scott.k.mitch1
2026-02-03  8:24                         ` Morten Brørup
2026-02-03  9:48                           ` David Marchand
2026-02-02  5:24                       ` [PATCH v17 2/2] eal: RTE_PTR_ADD/SUB API improvements scott.k.mitch1
2026-02-03  9:08                         ` Morten Brørup
2026-02-03 16:24                           ` Scott Mitchell
2026-02-03  9:51                         ` David Marchand
2026-02-03 16:25                           ` Scott Mitchell
2026-02-03 21:18                       ` [PATCH v18 0/2] " scott.k.mitch1
2026-02-03 21:18                         ` [PATCH v18 1/2] eal: remove alloc_size from rte_lcore_var_alloc scott.k.mitch1
2026-02-03 21:18                         ` [PATCH v18 2/2] eal: RTE_PTR_ADD/SUB API improvements scott.k.mitch1
2026-02-04  2:46                         ` [PATCH v19] " scott.k.mitch1
2026-02-04  5:20                           ` Scott Mitchell
2026-02-04  6:12                           ` [PATCH v20] " scott.k.mitch1
2026-02-04 22:47                             ` Morten Brørup
2026-02-05  6:53                               ` Scott Mitchell
2026-02-05  7:03                             ` [PATCH v21] " scott.k.mitch1
2026-02-05  7:50                               ` Morten Brørup
2026-02-06  1:04                                 ` Scott Mitchell
2026-02-06  1:01                               ` [PATCH v22] " scott.k.mitch1
2026-02-06  4:28                                 ` [PATCH v23] " scott.k.mitch1
2026-02-06 16:09                                   ` Stephen Hemminger
2026-02-07  1:45                                   ` [PATCH v24] " scott.k.mitch1
2026-01-27 20:28               ` [REVIEW] " Stephen Hemminger
2026-02-02  5:17                 ` Scott Mitchell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35F6566B@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=scott.k.mitch1@gmail.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox