* [PATCH 0/2] eal/riscv: implement prefetch using zicbop
@ 2024-05-30 17:19 Daniel Gregory
2024-05-30 17:19 ` [PATCH 1/2] eal: add flag to hide generic prefetch_write Daniel Gregory
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Daniel Gregory @ 2024-05-30 17:19 UTC (permalink / raw)
To: Stanislaw Kardach
Cc: Bruce Richardson, Tyler Retzlaff, dev, Liang Ma, Punit Agrawal,
Daniel Gregory
Instructions from RISC-V's Zicbop extension can be used to implement the
rte_prefetch* family of functions. On modern versions of GCC (13.1.0+)
and Clang (17.0.1+), these are emitted by __builtin_prefetch() when the
extension is present.
In order to support older compiler versions, this patchset manually
emits these instructions using inline assembly. To do this, I have added
a new flag, RTE_PREFETCH_WRITE_ARCH_DEFINED, that
(similarly to RTE_WAIT_UNTIL_EQUAL_ARCH_DEFINED) hides the generic
implementation of rte_prefetch*_write.
I am still in the process of acquiring hardware that supports this
extension, so I haven't tested how this affects performance yet.
Daniel Gregory (2):
eal: add flag to hide generic prefetch_write
eal/riscv: add support for zicbop extension
config/riscv/meson.build | 6 +++
lib/eal/include/generic/rte_prefetch.h | 47 +++++++++++++--------
lib/eal/riscv/include/rte_prefetch.h | 57 ++++++++++++++++++++++++--
3 files changed, 90 insertions(+), 20 deletions(-)
--
2.39.2
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] eal: add flag to hide generic prefetch_write
2024-05-30 17:19 [PATCH 0/2] eal/riscv: implement prefetch using zicbop Daniel Gregory
@ 2024-05-30 17:19 ` Daniel Gregory
2024-05-30 17:19 ` [PATCH 2/2] eal/riscv: add support for zicbop extension Daniel Gregory
2024-07-12 17:22 ` [PATCH 0/2] eal/riscv: implement prefetch using zicbop David Marchand
2 siblings, 0 replies; 7+ messages in thread
From: Daniel Gregory @ 2024-05-30 17:19 UTC (permalink / raw)
To: Stanislaw Kardach
Cc: Bruce Richardson, Tyler Retzlaff, dev, Liang Ma, Punit Agrawal,
Daniel Gregory
This allows for the definition of architecture-specific implementations
of the rte_prefetch*_write collection of functions by defining
RTE_PREFETCH_WRITE_ARCH_DEFINED.
Signed-off-by: Daniel Gregory <daniel.gregory@bytedance.com>
---
lib/eal/include/generic/rte_prefetch.h | 47 +++++++++++++++++---------
1 file changed, 31 insertions(+), 16 deletions(-)
diff --git a/lib/eal/include/generic/rte_prefetch.h b/lib/eal/include/generic/rte_prefetch.h
index f9fab5e359..5558376cba 100644
--- a/lib/eal/include/generic/rte_prefetch.h
+++ b/lib/eal/include/generic/rte_prefetch.h
@@ -65,14 +65,7 @@ static inline void rte_prefetch_non_temporal(const volatile void *p);
*/
__rte_experimental
static inline void
-rte_prefetch0_write(const void *p)
-{
- /* 1 indicates intention to write, 3 sets target cache level to L1. See
- * GCC docs where these integer constants are described in more detail:
- * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
- */
- __builtin_prefetch(p, 1, 3);
-}
+rte_prefetch0_write(const void *p);
/**
* @warning
@@ -86,14 +79,7 @@ rte_prefetch0_write(const void *p)
*/
__rte_experimental
static inline void
-rte_prefetch1_write(const void *p)
-{
- /* 1 indicates intention to write, 2 sets target cache level to L2. See
- * GCC docs where these integer constants are described in more detail:
- * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
- */
- __builtin_prefetch(p, 1, 2);
-}
+rte_prefetch1_write(const void *p);
/**
* @warning
@@ -105,6 +91,33 @@ rte_prefetch1_write(const void *p)
*
* @param p Address to prefetch
*/
+__rte_experimental
+static inline void
+rte_prefetch2_write(const void *p);
+
+#ifndef RTE_PREFETCH_WRITE_ARCH_DEFINED
+__rte_experimental
+static inline void
+rte_prefetch0_write(const void *p)
+{
+ /* 1 indicates intention to write, 3 sets target cache level to L1. See
+ * GCC docs where these integer constants are described in more detail:
+ * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
+ */
+ __builtin_prefetch(p, 1, 3);
+}
+
+__rte_experimental
+static inline void
+rte_prefetch1_write(const void *p)
+{
+ /* 1 indicates intention to write, 2 sets target cache level to L2. See
+ * GCC docs where these integer constants are described in more detail:
+ * https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
+ */
+ __builtin_prefetch(p, 1, 2);
+}
+
__rte_experimental
static inline void
rte_prefetch2_write(const void *p)
@@ -116,6 +129,8 @@ rte_prefetch2_write(const void *p)
__builtin_prefetch(p, 1, 1);
}
+#endif /* RTE_PREFETCH_WRITE_ARCH_DEFINED */
+
/**
* @warning
* @b EXPERIMENTAL: this API may change, or be removed, without prior notice
--
2.39.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] eal/riscv: add support for zicbop extension
2024-05-30 17:19 [PATCH 0/2] eal/riscv: implement prefetch using zicbop Daniel Gregory
2024-05-30 17:19 ` [PATCH 1/2] eal: add flag to hide generic prefetch_write Daniel Gregory
@ 2024-05-30 17:19 ` Daniel Gregory
2024-05-31 8:49 ` Daniel Gregory
` (2 more replies)
2024-07-12 17:22 ` [PATCH 0/2] eal/riscv: implement prefetch using zicbop David Marchand
2 siblings, 3 replies; 7+ messages in thread
From: Daniel Gregory @ 2024-05-30 17:19 UTC (permalink / raw)
To: Stanislaw Kardach
Cc: Bruce Richardson, Tyler Retzlaff, dev, Liang Ma, Punit Agrawal,
Daniel Gregory
The zicbop extension adds instructions for prefetching data into cache.
Use them to implement RISCV-specific versions of the rte_prefetch* and
rte_prefetch*_write functions.
- prefetch.r indicates to hardware that the cache block will be accessed
by a data read soon
- prefetch.w indicates to hardware that the cache block will be accessed
by a data write soon
These instructions are emitted by __builtin_prefetch on modern versions
of Clang (17.0.1+) and GCC (13.1.0+). For earlier versions, we may not
have support for assembling Zicbop instructions, so emit the word
that encodes a 'prefetch.[rw] 0(a0)' instruction.
This new functionality is controlled by a Meson flag that is disabled by
default. Whilst it's a hint, like rte_pause(), and so has no effect if
the target doesn't support the extension, it requires the address
prefetched to be loaded into a0, which may be costly.
Signed-off-by: Daniel Gregory <daniel.gregory@bytedance.com>
Suggested-by: Punit Agrawal <punit.agrawal@bytedance.com>
---
config/riscv/meson.build | 6 +++
lib/eal/riscv/include/rte_prefetch.h | 57 ++++++++++++++++++++++++++--
2 files changed, 59 insertions(+), 4 deletions(-)
diff --git a/config/riscv/meson.build b/config/riscv/meson.build
index 07d7d9da23..ecf9da1c39 100644
--- a/config/riscv/meson.build
+++ b/config/riscv/meson.build
@@ -26,6 +26,12 @@ flags_common = [
# read from /proc/device-tree/cpus/timebase-frequency. This property is
# guaranteed on Linux, as riscv time_init() requires it.
['RTE_RISCV_TIME_FREQ', 0],
+
+ # When true override the default implementation of the prefetching functions
+ # (rte_prefetch*) with a version that explicitly uses the Zicbop extension.
+ # Do not enable when using modern versions of GCC (13.1.0+) or Clang
+ # (17.0.1+). They will emit these instructions in the default implementation
+ ['RTE_RISCV_ZICBOP', false],
]
## SoC-specific options.
diff --git a/lib/eal/riscv/include/rte_prefetch.h b/lib/eal/riscv/include/rte_prefetch.h
index 748cf1b626..82cad526b3 100644
--- a/lib/eal/riscv/include/rte_prefetch.h
+++ b/lib/eal/riscv/include/rte_prefetch.h
@@ -14,21 +14,42 @@ extern "C" {
#include <rte_compat.h>
#include <rte_common.h>
+
+#ifdef RTE_RISCV_ZICBOP
+#define RTE_PREFETCH_WRITE_ARCH_DEFINED
+#endif
+
#include "generic/rte_prefetch.h"
+/*
+ * Modern versions of GCC & Clang will emit prefetch instructions for
+ * __builtin_prefetch when the Zicbop extension is present.
+ * The RTE_RISCV_ZICBOP option controls whether we emit them manually for older
+ * compilers that may not have the support to assemble them.
+ */
static inline void rte_prefetch0(const volatile void *p)
{
- RTE_SET_USED(p);
+#ifndef RTE_RISCV_ZICBOP
+ /* by default __builtin_prefetch prepares for a read */
+ __builtin_prefetch((const void *)p);
+#else
+ /* prefetch.r 0(a0) */
+ register const volatile void *a0 asm("a0") = p;
+ asm volatile (".int 0x00156013" : : "r" (a0));
+#endif
}
+/*
+ * The RISC-V Zicbop extension doesn't have instructions to prefetch to only a
+ * subset of cache levels, so fallback to rte_prefetch0
+ */
static inline void rte_prefetch1(const volatile void *p)
{
- RTE_SET_USED(p);
+ rte_prefetch0(p);
}
-
static inline void rte_prefetch2(const volatile void *p)
{
- RTE_SET_USED(p);
+ rte_prefetch0(p);
}
static inline void rte_prefetch_non_temporal(const volatile void *p)
@@ -44,6 +65,34 @@ rte_cldemote(const volatile void *p)
RTE_SET_USED(p);
}
+#ifdef RTE_RISCV_ZICBOP
+__rte_experimental
+static inline void
+rte_prefetch0_write(const void *p)
+{
+ /* prefetch.w 0(a0) */
+ register const void *a0 asm("a0") = p;
+ asm volatile (".int 0x00356013" : : "r" (a0));
+}
+
+/*
+ * The RISC-V Zicbop extension doesn't have instructions to prefetch to only a
+ * subset of cache levels, so fallback to rte_prefetch0_write
+ */
+__rte_experimental
+static inline void
+rte_prefetch1_write(const void *p)
+{
+ rte_prefetch0_write(p);
+}
+__rte_experimental
+static inline void
+rte_prefetch2_write(const void *p)
+{
+ rte_prefetch0_write(p);
+}
+#endif /* RTE_RISCV_ZICBOP */
+
#ifdef __cplusplus
}
#endif
--
2.39.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] eal/riscv: add support for zicbop extension
2024-05-30 17:19 ` [PATCH 2/2] eal/riscv: add support for zicbop extension Daniel Gregory
@ 2024-05-31 8:49 ` Daniel Gregory
2024-10-05 1:31 ` Stephen Hemminger
2024-10-07 7:48 ` Stanisław Kardach
2 siblings, 0 replies; 7+ messages in thread
From: Daniel Gregory @ 2024-05-31 8:49 UTC (permalink / raw)
To: Stanislaw Kardach
Cc: Bruce Richardson, Tyler Retzlaff, dev, Liang Ma, Punit Agrawal
On Thu, May 30, 2024 at 06:19:48PM +0100, Daniel Gregory wrote:
> + * The RTE_RISCV_ZICBOP option controls whether we emit them manually for older
> + * compilers that may not have the support to assemble them.
> + */
> static inline void rte_prefetch0(const volatile void *p)
> {
> - RTE_SET_USED(p);
> +#ifndef RTE_RISCV_ZICBOP
> + /* by default __builtin_prefetch prepares for a read */
> + __builtin_prefetch((const void *)p);
This cast causes warnings (which are treated as errors by the 0-day
Robot) due to it discarding the 'volatile' on p.
Removing the volatile from the definition of rte_prefetch0 causes build
failures in some drivers (txgbe_rxtx.c:1809, ixgbe_rxtx.c:2174,
enic_rxtx.c:127, ...).
rte_prefetch0_write takes its argument as 'const void *' and so can use
__builtin_prefetch().
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/2] eal/riscv: implement prefetch using zicbop
2024-05-30 17:19 [PATCH 0/2] eal/riscv: implement prefetch using zicbop Daniel Gregory
2024-05-30 17:19 ` [PATCH 1/2] eal: add flag to hide generic prefetch_write Daniel Gregory
2024-05-30 17:19 ` [PATCH 2/2] eal/riscv: add support for zicbop extension Daniel Gregory
@ 2024-07-12 17:22 ` David Marchand
2 siblings, 0 replies; 7+ messages in thread
From: David Marchand @ 2024-07-12 17:22 UTC (permalink / raw)
To: Daniel Gregory
Cc: Stanislaw Kardach, Bruce Richardson, Tyler Retzlaff, dev,
Liang Ma, Punit Agrawal, Sachin Saxena
Hello,
On Thu, May 30, 2024 at 7:21 PM Daniel Gregory
<daniel.gregory@bytedance.com> wrote:
>
> Instructions from RISC-V's Zicbop extension can be used to implement the
> rte_prefetch* family of functions. On modern versions of GCC (13.1.0+)
> and Clang (17.0.1+), these are emitted by __builtin_prefetch() when the
> extension is present.
>
> In order to support older compiler versions, this patchset manually
> emits these instructions using inline assembly. To do this, I have added
> a new flag, RTE_PREFETCH_WRITE_ARCH_DEFINED, that
> (similarly to RTE_WAIT_UNTIL_EQUAL_ARCH_DEFINED) hides the generic
> implementation of rte_prefetch*_write.
>
> I am still in the process of acquiring hardware that supports this
> extension, so I haven't tested how this affects performance yet.
Let's hope you get such hardware by the next release.
We will need reviews too.
CC: Sachin.
--
David Marchand
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] eal/riscv: add support for zicbop extension
2024-05-30 17:19 ` [PATCH 2/2] eal/riscv: add support for zicbop extension Daniel Gregory
2024-05-31 8:49 ` Daniel Gregory
@ 2024-10-05 1:31 ` Stephen Hemminger
2024-10-07 7:48 ` Stanisław Kardach
2 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2024-10-05 1:31 UTC (permalink / raw)
To: Daniel Gregory
Cc: Stanislaw Kardach, Bruce Richardson, Tyler Retzlaff, dev,
Liang Ma, Punit Agrawal
On Thu, 30 May 2024 18:19:48 +0100
Daniel Gregory <daniel.gregory@bytedance.com> wrote:
> This new functionality is controlled by a Meson flag that is disabled by
> default. Whilst it's a hint, like rte_pause(), and so has no effect if
> the target doesn't support the extension, it requires the address
> prefetched to be loaded into a0, which may be costly.
Please just use #ifdef's against compiler version string like
other code does. Introducing a meson flag means user has to do something.
Better to have "it works right the first time".
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] eal/riscv: add support for zicbop extension
2024-05-30 17:19 ` [PATCH 2/2] eal/riscv: add support for zicbop extension Daniel Gregory
2024-05-31 8:49 ` Daniel Gregory
2024-10-05 1:31 ` Stephen Hemminger
@ 2024-10-07 7:48 ` Stanisław Kardach
2 siblings, 0 replies; 7+ messages in thread
From: Stanisław Kardach @ 2024-10-07 7:48 UTC (permalink / raw)
To: Daniel Gregory
Cc: Bruce Richardson, Tyler Retzlaff, dev, Liang Ma, Punit Agrawal
On Thu, May 30, 2024 at 7:20 PM Daniel Gregory
<daniel.gregory@bytedance.com> wrote:
>
> The zicbop extension adds instructions for prefetching data into cache.
> Use them to implement RISCV-specific versions of the rte_prefetch* and
> rte_prefetch*_write functions.
>
> - prefetch.r indicates to hardware that the cache block will be accessed
> by a data read soon
> - prefetch.w indicates to hardware that the cache block will be accessed
> by a data write soon
>
> These instructions are emitted by __builtin_prefetch on modern versions
> of Clang (17.0.1+) and GCC (13.1.0+). For earlier versions, we may not
> have support for assembling Zicbop instructions, so emit the word
> that encodes a 'prefetch.[rw] 0(a0)' instruction.
Is there a benefit of adding this flag instead of relying on compiler
implementation of __builtin_prefetch()? As in do you have some
requirements for older compiler support? If just using
__builtin_prefetch could simplify this code a lot and push the
detection of zicbop to the compiler implementation. The runtime
detection is another issue and would require runtime patching to keep
the code fast.
>
> This new functionality is controlled by a Meson flag that is disabled by
> default. Whilst it's a hint, like rte_pause(), and so has no effect if
> the target doesn't support the extension, it requires the address
> prefetched to be loaded into a0, which may be costly.
>
> Signed-off-by: Daniel Gregory <daniel.gregory@bytedance.com>
> Suggested-by: Punit Agrawal <punit.agrawal@bytedance.com>
> ---
> config/riscv/meson.build | 6 +++
> lib/eal/riscv/include/rte_prefetch.h | 57 ++++++++++++++++++++++++++--
> 2 files changed, 59 insertions(+), 4 deletions(-)
>
> diff --git a/config/riscv/meson.build b/config/riscv/meson.build
> index 07d7d9da23..ecf9da1c39 100644
> --- a/config/riscv/meson.build
> +++ b/config/riscv/meson.build
> @@ -26,6 +26,12 @@ flags_common = [
> # read from /proc/device-tree/cpus/timebase-frequency. This property is
> # guaranteed on Linux, as riscv time_init() requires it.
> ['RTE_RISCV_TIME_FREQ', 0],
> +
> + # When true override the default implementation of the prefetching functions
> + # (rte_prefetch*) with a version that explicitly uses the Zicbop extension.
> + # Do not enable when using modern versions of GCC (13.1.0+) or Clang
> + # (17.0.1+). They will emit these instructions in the default implementation
> + ['RTE_RISCV_ZICBOP', false],
> ]
>
> ## SoC-specific options.
> diff --git a/lib/eal/riscv/include/rte_prefetch.h b/lib/eal/riscv/include/rte_prefetch.h
> index 748cf1b626..82cad526b3 100644
> --- a/lib/eal/riscv/include/rte_prefetch.h
> +++ b/lib/eal/riscv/include/rte_prefetch.h
> @@ -14,21 +14,42 @@ extern "C" {
>
> #include <rte_compat.h>
> #include <rte_common.h>
> +
> +#ifdef RTE_RISCV_ZICBOP
> +#define RTE_PREFETCH_WRITE_ARCH_DEFINED
> +#endif
> +
> #include "generic/rte_prefetch.h"
>
> +/*
> + * Modern versions of GCC & Clang will emit prefetch instructions for
> + * __builtin_prefetch when the Zicbop extension is present.
> + * The RTE_RISCV_ZICBOP option controls whether we emit them manually for older
> + * compilers that may not have the support to assemble them.
> + */
> static inline void rte_prefetch0(const volatile void *p)
> {
> - RTE_SET_USED(p);
> +#ifndef RTE_RISCV_ZICBOP
> + /* by default __builtin_prefetch prepares for a read */
> + __builtin_prefetch((const void *)p);
> +#else
> + /* prefetch.r 0(a0) */
> + register const volatile void *a0 asm("a0") = p;
> + asm volatile (".int 0x00156013" : : "r" (a0));
> +#endif
> }
>
> +/*
> + * The RISC-V Zicbop extension doesn't have instructions to prefetch to only a
> + * subset of cache levels, so fallback to rte_prefetch0
> + */
> static inline void rte_prefetch1(const volatile void *p)
> {
> - RTE_SET_USED(p);
> + rte_prefetch0(p);
> }
> -
> static inline void rte_prefetch2(const volatile void *p)
> {
> - RTE_SET_USED(p);
> + rte_prefetch0(p);
> }
>
> static inline void rte_prefetch_non_temporal(const volatile void *p)
> @@ -44,6 +65,34 @@ rte_cldemote(const volatile void *p)
> RTE_SET_USED(p);
> }
>
> +#ifdef RTE_RISCV_ZICBOP
> +__rte_experimental
> +static inline void
> +rte_prefetch0_write(const void *p)
> +{
> + /* prefetch.w 0(a0) */
> + register const void *a0 asm("a0") = p;
> + asm volatile (".int 0x00356013" : : "r" (a0));
> +}
> +
> +/*
> + * The RISC-V Zicbop extension doesn't have instructions to prefetch to only a
> + * subset of cache levels, so fallback to rte_prefetch0_write
> + */
> +__rte_experimental
> +static inline void
> +rte_prefetch1_write(const void *p)
> +{
> + rte_prefetch0_write(p);
> +}
> +__rte_experimental
> +static inline void
> +rte_prefetch2_write(const void *p)
> +{
> + rte_prefetch0_write(p);
> +}
> +#endif /* RTE_RISCV_ZICBOP */
> +
> #ifdef __cplusplus
> }
> #endif
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-10-07 7:48 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-30 17:19 [PATCH 0/2] eal/riscv: implement prefetch using zicbop Daniel Gregory
2024-05-30 17:19 ` [PATCH 1/2] eal: add flag to hide generic prefetch_write Daniel Gregory
2024-05-30 17:19 ` [PATCH 2/2] eal/riscv: add support for zicbop extension Daniel Gregory
2024-05-31 8:49 ` Daniel Gregory
2024-10-05 1:31 ` Stephen Hemminger
2024-10-07 7:48 ` Stanisław Kardach
2024-07-12 17:22 ` [PATCH 0/2] eal/riscv: implement prefetch using zicbop David Marchand
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.