From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: "Alexander Lobakin" <aleksander.lobakin@intel.com>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"John Fastabend" <john.fastabend@gmail.com>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Stanislav Fomichev" <sdf@fomichev.me>,
"Magnus Karlsson" <magnus.karlsson@intel.com>,
nex.sw.ncis.osdt.itp.upstreaming@intel.com, bpf@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
"Jose E. Marchesi" <jose.marchesi@oracle.com>,
"Przemek Kitszel" <przemyslaw.kitszel@intel.com>
Subject: [PATCH net-next v2 03/18] unroll: add generic loop unroll helpers
Date: Tue, 15 Oct 2024 16:53:35 +0200 [thread overview]
Message-ID: <20241015145350.4077765-4-aleksander.lobakin@intel.com> (raw)
In-Reply-To: <20241015145350.4077765-1-aleksander.lobakin@intel.com>
There are cases when we need to explicitly unroll loops. For example,
cache operations, filling DMA descriptors on very high speeds etc.
Add compiler-specific attribute macros to give the compiler a hint
that we'd like to unroll a loop.
Example usage:
#define UNROLL_BATCH 8
unrolled_count(UNROLL_BATCH)
for (u32 i = 0; i < UNROLL_BATCH; i++)
op(priv, i);
Note that sometimes the compilers won't unroll loops if they think this
would have worse optimization and perf than without unrolling, and that
unroll attributes are available only starting GCC 8. For older compiler
versions, no hints/attributes will be applied.
For better unrolling/parallelization, don't have any variables that
interfere between iterations except for the iterator itself.
Co-developed-by: Jose E. Marchesi <jose.marchesi@oracle.com> # pragmas
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
include/linux/unroll.h | 43 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/include/linux/unroll.h b/include/linux/unroll.h
index d42fd6366373..2870c89a93f4 100644
--- a/include/linux/unroll.h
+++ b/include/linux/unroll.h
@@ -9,6 +9,49 @@
#include <linux/args.h>
+#ifdef CONFIG_CC_IS_CLANG
+#define __pick_unrolled(x, y) _Pragma(#x)
+#elif CONFIG_GCC_VERSION >= 80000
+#define __pick_unrolled(x, y) _Pragma(#y)
+#else
+#define __pick_unrolled(x, y) /* not supported */
+#endif
+
+/**
+ * unrolled - loop attributes to ask the compiler to unroll it
+ *
+ * Usage:
+ *
+ * #define BATCH 4
+ * unrolled_count(BATCH)
+ * for (u32 i = 0; i < BATCH; i++)
+ * // loop body without cross-iteration dependencies
+ *
+ * This is only a hint and the compiler is free to disable unrolling if it
+ * thinks the count is suboptimal and may hurt performance and/or hugely
+ * increase object code size.
+ * Not having any cross-iteration dependencies (i.e. when iter x + 1 depends
+ * on what iter x will do with variables) is not a strict requirement, but
+ * provides best performance and object code size.
+ * Available only on Clang and GCC 8.x onwards.
+ */
+
+/* Ask the compiler to pick an optimal unroll count, Clang only */
+#define unrolled \
+ __pick_unrolled(clang loop unroll(enable), /* nothing */)
+
+/* Unroll each @n iterations of a loop */
+#define unrolled_count(n) \
+ __pick_unrolled(clang loop unroll_count(n), GCC unroll n)
+
+/* Unroll the whole loop */
+#define unrolled_full \
+ __pick_unrolled(clang loop unroll(full), GCC unroll 65534)
+
+/* Never unroll a loop */
+#define unrolled_none \
+ __pick_unrolled(clang loop unroll(disable), GCC unroll 1)
+
#define UNROLL(N, MACRO, args...) CONCATENATE(__UNROLL_, N)(MACRO, args)
#define __UNROLL_0(MACRO, args...)
--
2.46.2
next prev parent reply other threads:[~2024-10-15 14:54 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-15 14:53 [PATCH net-next v2 00/18] idpf: XDP chapter III: core XDP changes (+libeth_xdp) Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 01/18] jump_label: export static_key_slow_{inc,dec}_cpuslocked() Alexander Lobakin
2024-10-17 11:06 ` Maciej Fijalkowski
2024-10-21 13:53 ` Alexander Lobakin
2024-10-22 12:52 ` Maciej Fijalkowski
2024-10-15 14:53 ` [PATCH net-next v2 02/18] skbuff: allow 2-4-argument skb_frag_dma_map() Alexander Lobakin
2024-10-15 14:53 ` Alexander Lobakin [this message]
2024-10-15 14:53 ` [PATCH net-next v2 04/18] bpf, xdp: constify some bpf_prog * function arguments Alexander Lobakin
2024-10-17 11:12 ` Maciej Fijalkowski
2024-10-21 13:56 ` Alexander Lobakin
2024-10-22 12:55 ` Maciej Fijalkowski
2024-10-15 14:53 ` [PATCH net-next v2 05/18] xdp, xsk: constify read-only arguments of some static inline helpers Alexander Lobakin
2024-10-17 11:14 ` Maciej Fijalkowski
2024-10-21 13:57 ` Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 06/18] xdp: allow attaching already registered memory model to xdp_rxq_info Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 07/18] net: Register system page pool as an XDP memory model Alexander Lobakin
2024-10-17 11:32 ` Maciej Fijalkowski
2024-10-21 14:00 ` Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 08/18] page_pool: make page_pool_put_page_bulk() actually handle array of pages Alexander Lobakin
2024-10-17 11:33 ` Maciej Fijalkowski
2024-10-21 14:03 ` Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 09/18] page_pool: allow mixing PPs within one bulk Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 10/18] xdp: get rid of xdp_frame::mem.id Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 11/18] xdp: add generic xdp_buff_add_frag() Alexander Lobakin
2024-10-17 12:26 ` Maciej Fijalkowski
2024-10-21 14:10 ` Alexander Lobakin
2024-10-22 13:00 ` Maciej Fijalkowski
2024-10-15 14:53 ` [PATCH net-next v2 12/18] xdp: add generic xdp_build_skb_from_buff() Alexander Lobakin
2024-10-17 12:34 ` Maciej Fijalkowski
2024-10-21 14:20 ` Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 13/18] xsk: allow attaching XSk pool via xdp_rxq_info_reg_mem_model() Alexander Lobakin
2024-10-17 12:49 ` Maciej Fijalkowski
2024-10-21 14:23 ` Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 14/18] xsk: make xsk_buff_add_frag really add a frag via __xdp_buff_add_frag() Alexander Lobakin
2024-10-17 13:04 ` Maciej Fijalkowski
2024-10-15 14:53 ` [PATCH net-next v2 15/18] xsk: add generic XSk &xdp_buff -> skb conversion Alexander Lobakin
2024-10-18 12:48 ` Maciej Fijalkowski
2024-10-15 14:53 ` [PATCH net-next v2 16/18] xsk: add helper to get &xdp_desc's DMA and meta pointer in one go Alexander Lobakin
2024-10-22 15:42 ` Maciej Fijalkowski
2024-10-23 14:50 ` Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 17/18] libeth: support native XDP and register memory model Alexander Lobakin
2024-10-15 14:53 ` [PATCH net-next v2 18/18] libeth: add a couple of XDP helpers (libeth_xdp) Alexander Lobakin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241015145350.4077765-4-aleksander.lobakin@intel.com \
--to=aleksander.lobakin@intel.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=john.fastabend@gmail.com \
--cc=jose.marchesi@oracle.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=magnus.karlsson@intel.com \
--cc=netdev@vger.kernel.org \
--cc=nex.sw.ncis.osdt.itp.upstreaming@intel.com \
--cc=pabeni@redhat.com \
--cc=przemyslaw.kitszel@intel.com \
--cc=sdf@fomichev.me \
--cc=toke@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox