* [PATCH 0/2] rbree: inline rb_first() and rb_last()
@ 2025-11-14 14:06 Eric Dumazet
2025-11-14 14:06 ` [PATCH 1/2] rbtree: inline rb_first() Eric Dumazet
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Eric Dumazet @ 2025-11-14 14:06 UTC (permalink / raw)
To: Andrew Morton, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, netdev, Neal Cardwell, Kuniyuki Iwashima,
Eric Dumazet, Eric Dumazet
Inline these two small helpers, heavily used in TCP and FQ packet scheduler,
and in many other places.
This reduces kernel text size, and brings an 1.5 % improvement on network
TCP stress test.
Eric Dumazet (2):
rbtree: inline rb_first()
rbtree: inline rb_last()
include/linux/rbtree.h | 32 ++++++++++++++++++++++++++++++--
lib/rbtree.c | 29 -----------------------------
2 files changed, 30 insertions(+), 31 deletions(-)
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] rbtree: inline rb_first()
2025-11-14 14:06 [PATCH 0/2] rbree: inline rb_first() and rb_last() Eric Dumazet
@ 2025-11-14 14:06 ` Eric Dumazet
2025-11-14 14:06 ` [PATCH 2/2] rbtree: inline rb_last() Eric Dumazet
2025-11-16 18:00 ` [PATCH 0/2] rbree: inline rb_first() and rb_last() Kuan-Wei Chiu
2 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2025-11-14 14:06 UTC (permalink / raw)
To: Andrew Morton, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, netdev, Neal Cardwell, Kuniyuki Iwashima,
Eric Dumazet, Eric Dumazet
This is a very small function, inlining it save cpu cycles
by reducing register pressure and removing call/ret overhead.
It also reduces vmlinux text size by 744 bytes on a typical x86_64 build.
Before:
size vmlinux
text data bss dec hex filename
34812525 22177365 5685248 62675138 3bc58c2 vmlinux
After:
size vmlinux
text data bss dec hex filename
34811781 22177365 5685248 62674394 3bc55da vmlinux
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/linux/rbtree.h | 16 +++++++++++++++-
lib/rbtree.c | 16 ----------------
2 files changed, 15 insertions(+), 17 deletions(-)
diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 8d2ba3749866f500a492d267e5e13556f6aa3f55..484554900f7d3201d41fb29e04fb65fe331eee79 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -43,7 +43,21 @@ extern void rb_erase(struct rb_node *, struct rb_root *);
/* Find logical next and previous nodes in a tree */
extern struct rb_node *rb_next(const struct rb_node *);
extern struct rb_node *rb_prev(const struct rb_node *);
-extern struct rb_node *rb_first(const struct rb_root *);
+
+/*
+ * This function returns the first node (in sort order) of the tree.
+ */
+static inline struct rb_node *rb_first(const struct rb_root *root)
+{
+ struct rb_node *n;
+
+ n = root->rb_node;
+ if (!n)
+ return NULL;
+ while (n->rb_left)
+ n = n->rb_left;
+ return n;
+}
extern struct rb_node *rb_last(const struct rb_root *);
/* Postorder iteration - always visit the parent after its children */
diff --git a/lib/rbtree.c b/lib/rbtree.c
index 5114eda6309c9d867a3e1ed9358bf9b3b275eb71..b946eb4b759d3b65f5bc5d54d0377348962bdc56 100644
--- a/lib/rbtree.c
+++ b/lib/rbtree.c
@@ -460,22 +460,6 @@ void __rb_insert_augmented(struct rb_node *node, struct rb_root *root,
}
EXPORT_SYMBOL(__rb_insert_augmented);
-/*
- * This function returns the first node (in sort order) of the tree.
- */
-struct rb_node *rb_first(const struct rb_root *root)
-{
- struct rb_node *n;
-
- n = root->rb_node;
- if (!n)
- return NULL;
- while (n->rb_left)
- n = n->rb_left;
- return n;
-}
-EXPORT_SYMBOL(rb_first);
-
struct rb_node *rb_last(const struct rb_root *root)
{
struct rb_node *n;
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] rbtree: inline rb_last()
2025-11-14 14:06 [PATCH 0/2] rbree: inline rb_first() and rb_last() Eric Dumazet
2025-11-14 14:06 ` [PATCH 1/2] rbtree: inline rb_first() Eric Dumazet
@ 2025-11-14 14:06 ` Eric Dumazet
2025-11-16 18:00 ` [PATCH 0/2] rbree: inline rb_first() and rb_last() Kuan-Wei Chiu
2 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2025-11-14 14:06 UTC (permalink / raw)
To: Andrew Morton, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, netdev, Neal Cardwell, Kuniyuki Iwashima,
Eric Dumazet, Eric Dumazet
This is a very small function, inlining it save cpu cycles in TCP
by reducing register pressure and removing call/ret overhead.
It also reduces vmlinux text size by 122 bytes on a typical x86_64 build.
Before:
size vmlinux
text data bss dec hex filename
34811781 22177365 5685248 62674394 3bc55da vmlinux
After:
size vmlinux
text data bss dec hex filename
34811659 22177365 5685248 62674272 3bc5560 vmlinux
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/linux/rbtree.h | 16 +++++++++++++++-
lib/rbtree.c | 13 -------------
2 files changed, 15 insertions(+), 14 deletions(-)
diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 484554900f7d3201d41fb29e04fb65fe331eee79..4091e978aef2404b56d7643d9385727c69796678 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -58,7 +58,21 @@ static inline struct rb_node *rb_first(const struct rb_root *root)
n = n->rb_left;
return n;
}
-extern struct rb_node *rb_last(const struct rb_root *);
+
+/*
+ * This function returns the last node (in sort order) of the tree.
+ */
+static inline struct rb_node *rb_last(const struct rb_root *root)
+{
+ struct rb_node *n;
+
+ n = root->rb_node;
+ if (!n)
+ return NULL;
+ while (n->rb_right)
+ n = n->rb_right;
+ return n;
+}
/* Postorder iteration - always visit the parent after its children */
extern struct rb_node *rb_first_postorder(const struct rb_root *);
diff --git a/lib/rbtree.c b/lib/rbtree.c
index b946eb4b759d3b65f5bc5d54d0377348962bdc56..18d42bcf4ec9d581807179f34561f4561900206d 100644
--- a/lib/rbtree.c
+++ b/lib/rbtree.c
@@ -460,19 +460,6 @@ void __rb_insert_augmented(struct rb_node *node, struct rb_root *root,
}
EXPORT_SYMBOL(__rb_insert_augmented);
-struct rb_node *rb_last(const struct rb_root *root)
-{
- struct rb_node *n;
-
- n = root->rb_node;
- if (!n)
- return NULL;
- while (n->rb_right)
- n = n->rb_right;
- return n;
-}
-EXPORT_SYMBOL(rb_last);
-
struct rb_node *rb_next(const struct rb_node *node)
{
struct rb_node *parent;
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 0/2] rbree: inline rb_first() and rb_last()
2025-11-14 14:06 [PATCH 0/2] rbree: inline rb_first() and rb_last() Eric Dumazet
2025-11-14 14:06 ` [PATCH 1/2] rbtree: inline rb_first() Eric Dumazet
2025-11-14 14:06 ` [PATCH 2/2] rbtree: inline rb_last() Eric Dumazet
@ 2025-11-16 18:00 ` Kuan-Wei Chiu
2025-11-16 18:41 ` Eric Dumazet
2 siblings, 1 reply; 6+ messages in thread
From: Kuan-Wei Chiu @ 2025-11-16 18:00 UTC (permalink / raw)
To: Eric Dumazet
Cc: Andrew Morton, Jakub Kicinski, Paolo Abeni, linux-kernel, netdev,
Neal Cardwell, Kuniyuki Iwashima, Eric Dumazet
Hi Eric,
On Fri, Nov 14, 2025 at 02:06:44PM +0000, Eric Dumazet wrote:
> Inline these two small helpers, heavily used in TCP and FQ packet scheduler,
> and in many other places.
>
> This reduces kernel text size, and brings an 1.5 % improvement on network
> TCP stress test.
Thanks for the patch!
Just out of curiosity, do you think rb_first() and rb_last() would be
worth marking with __always_inline?
Regardless, for the series:
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Regards,
Kuan-Wei
>
> Eric Dumazet (2):
> rbtree: inline rb_first()
> rbtree: inline rb_last()
>
> include/linux/rbtree.h | 32 ++++++++++++++++++++++++++++++--
> lib/rbtree.c | 29 -----------------------------
> 2 files changed, 30 insertions(+), 31 deletions(-)
>
> --
> 2.52.0.rc1.455.g30608eb744-goog
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 0/2] rbree: inline rb_first() and rb_last()
2025-11-16 18:00 ` [PATCH 0/2] rbree: inline rb_first() and rb_last() Kuan-Wei Chiu
@ 2025-11-16 18:41 ` Eric Dumazet
2025-11-17 11:35 ` Kuan-Wei Chiu
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2025-11-16 18:41 UTC (permalink / raw)
To: Kuan-Wei Chiu
Cc: Andrew Morton, Jakub Kicinski, Paolo Abeni, linux-kernel, netdev,
Neal Cardwell, Kuniyuki Iwashima, Eric Dumazet
On Sun, Nov 16, 2025 at 10:00 AM Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
>
> Hi Eric,
>
> On Fri, Nov 14, 2025 at 02:06:44PM +0000, Eric Dumazet wrote:
> > Inline these two small helpers, heavily used in TCP and FQ packet scheduler,
> > and in many other places.
> >
> > This reduces kernel text size, and brings an 1.5 % improvement on network
> > TCP stress test.
>
> Thanks for the patch!
>
> Just out of curiosity, do you think rb_first() and rb_last() would be
> worth marking with __always_inline?
I have not seen any difference, what compilers are you using that
would not inline this ?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 0/2] rbree: inline rb_first() and rb_last()
2025-11-16 18:41 ` Eric Dumazet
@ 2025-11-17 11:35 ` Kuan-Wei Chiu
0 siblings, 0 replies; 6+ messages in thread
From: Kuan-Wei Chiu @ 2025-11-17 11:35 UTC (permalink / raw)
To: Eric Dumazet
Cc: Andrew Morton, Jakub Kicinski, Paolo Abeni, linux-kernel, netdev,
Neal Cardwell, Kuniyuki Iwashima, Eric Dumazet
On Sun, Nov 16, 2025 at 10:41:31AM -0800, Eric Dumazet wrote:
> On Sun, Nov 16, 2025 at 10:00 AM Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
> >
> > Hi Eric,
> >
> > On Fri, Nov 14, 2025 at 02:06:44PM +0000, Eric Dumazet wrote:
> > > Inline these two small helpers, heavily used in TCP and FQ packet scheduler,
> > > and in many other places.
> > >
> > > This reduces kernel text size, and brings an 1.5 % improvement on network
> > > TCP stress test.
> >
> > Thanks for the patch!
> >
> > Just out of curiosity, do you think rb_first() and rb_last() would be
> > worth marking with __always_inline?
>
> I have not seen any difference, what compilers are you using that
> would not inline this ?
I haven't specifically measured the difference. I was curious if using
__always_inline would be better to guarantee the compiler inlines the
function, given that the inline keyword is only a suggestion, and this
optimization is important for performance in a hot path.
Also, FWIW, I tried building an arm64 defconfig with LLVM both with and
without __always_inline. I got the same results from size vmlinux, but
scripts/bloat-o-meter output is as follows:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8)
Function old new delta
__aarch32_sigret_code_end 24 16 -8
Total: Before=129127208516005098430, After=129127208516005098422, chg -0.00%
Regards,
Kuan-Wei
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-11-17 11:35 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-14 14:06 [PATCH 0/2] rbree: inline rb_first() and rb_last() Eric Dumazet
2025-11-14 14:06 ` [PATCH 1/2] rbtree: inline rb_first() Eric Dumazet
2025-11-14 14:06 ` [PATCH 2/2] rbtree: inline rb_last() Eric Dumazet
2025-11-16 18:00 ` [PATCH 0/2] rbree: inline rb_first() and rb_last() Kuan-Wei Chiu
2025-11-16 18:41 ` Eric Dumazet
2025-11-17 11:35 ` Kuan-Wei Chiu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).