* [RFC] microoptimizing hlist_add_{before,behind}
From: Al Viro @ 2019-09-20 23:12 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
Neither hlist_add_before() nor hlist_add_behind() should ever
be called with both arguments pointing to the same hlist_node.
However, gcc doesn't know that, so it ends up with pointless reloads.
AFAICS, the following generates better code and is obviously equivalent
when the arguments are different; even when they are the same, the end
result is identical (provided the hlist hadn't been corrupted even
earlier than that).
Objections?
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
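The reload problem is not specific to hlists. A minimal illustration, assuming nothing about the kernel: `struct box`, `store_and_reread()` and `store_and_keep()` are hypothetical names invented for this sketch. Because `a` and `b` may point to the same object, the compiler has to re-read `a->val` after the store through `b`; keeping the value in a local removes the reload.

```c
struct box {
	long val;
};

/* Reads a->val back after a store through b.  If a == b the store
 * through b changes a->val, so the compiler cannot reuse the 1 it just
 * wrote and has to emit a reload. */
static long store_and_reread(struct box *a, struct box *b)
{
	a->val = 1;
	b->val = 2;
	return a->val;
}

/* Keeps the value in a local, so there is nothing to reload. */
static long store_and_keep(struct box *a, struct box *b)
{
	long v = 1;

	a->val = v;
	b->val = 2;
	return v;
}
```

In this toy pair the two variants disagree when `a == b`, which is exactly why the compiler cannot do the caching by itself. The argument in the mail is that for the hlist helpers the cached form gives the same end result even in the aliasing case.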
diff --git a/include/linux/list.h b/include/linux/list.h
index 85c92555e31f..aee8232e6827 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -793,21 +793,21 @@ static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
 static inline void hlist_add_before(struct hlist_node *n,
 					struct hlist_node *next)
 {
-	n->pprev = next->pprev;
+	struct hlist_node *p = n->pprev = next->pprev;
 	n->next = next;
 	next->pprev = &n->next;
-	WRITE_ONCE(*(n->pprev), n);
+	WRITE_ONCE(*p, n);
 }
 
 static inline void hlist_add_behind(struct hlist_node *n,
 				    struct hlist_node *prev)
 {
-	n->next = prev->next;
+	struct hlist_node *p = n->next = prev->next;
 	prev->next = n;
 	n->pprev = &prev->next;
 
-	if (n->next)
-		n->next->pprev = &n->next;
+	if (p)
+		p->pprev = &n->next;
 }
 
 /* after that we'll appear to be on some hlist and hlist_del will work */
* Re: [RFC] microoptimizing hlist_add_{before,behind}
From: Al Viro @ 2019-09-21 3:11 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
On Sat, Sep 21, 2019 at 12:12:33AM +0100, Al Viro wrote:
> Neither hlist_add_before() nor hlist_add_behind() should ever
> be called with both arguments pointing to the same hlist_node.
> However, gcc doesn't know that, so it ends up with pointless reloads.
> AFAICS, the following generates better code and is obviously equivalent
> when the arguments are different; even when they are the same, the end
> result is identical (provided the hlist hadn't been corrupted even
> earlier than that).
>
> Objections?
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*gyah*
git diff >/tmp/y1
<build>
<fix a braino>
<test>
scp-out /tmp/y1
<send mail with the original diff>
<several hours later: reread the sent mail>
My apologies ;-/ Correct diff follows:
diff --git a/include/linux/list.h b/include/linux/list.h
index 85c92555e31f..5c84383675bc 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -793,21 +793,21 @@ static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
 static inline void hlist_add_before(struct hlist_node *n,
 					struct hlist_node *next)
 {
-	n->pprev = next->pprev;
+	struct hlist_node **p = n->pprev = next->pprev;
 	n->next = next;
 	next->pprev = &n->next;
-	WRITE_ONCE(*(n->pprev), n);
+	WRITE_ONCE(*p, n);
 }
 
 static inline void hlist_add_behind(struct hlist_node *n,
 				    struct hlist_node *prev)
 {
-	n->next = prev->next;
+	struct hlist_node *p = n->next = prev->next;
 	prev->next = n;
 	n->pprev = &prev->next;
 
-	if (n->next)
-		n->next->pprev = &n->next;
+	if (p)
+		p->pprev = &n->next;
 }
 
 /* after that we'll appear to be on some hlist and hlist_del will work */
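The equivalence claim for distinct arguments can be checked in userspace. What follows is a minimal sketch, not kernel code: the struct layouts mirror include/linux/list.h, `WRITE_ONCE()` is replaced by a simple volatile-store stand-in (an assumption, relying on the GCC/Clang `__typeof__` extension), and the `add_*_old`/`add_*_new` names, `add_fn` and `build_and_check()` are hypothetical helpers introduced only for this comparison.

```c
#include <stddef.h>

struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Stand-in for the kernel macro: a single volatile store (assumption). */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	WRITE_ONCE(h->first, n);
	n->pprev = &h->first;
}

/* Pre-patch bodies: n->pprev and n->next are read back after the stores. */
static void add_before_old(struct hlist_node *n, struct hlist_node *next)
{
	n->pprev = next->pprev;
	n->next = next;
	next->pprev = &n->next;
	WRITE_ONCE(*(n->pprev), n);
}

static void add_behind_old(struct hlist_node *n, struct hlist_node *prev)
{
	n->next = prev->next;
	prev->next = n;
	n->pprev = &prev->next;
	if (n->next)
		n->next->pprev = &n->next;
}

/* Bodies from the corrected diff: each value is loaded once into p. */
static void add_before_new(struct hlist_node *n, struct hlist_node *next)
{
	struct hlist_node **p = n->pprev = next->pprev;

	n->next = next;
	next->pprev = &n->next;
	WRITE_ONCE(*p, n);
}

static void add_behind_new(struct hlist_node *n, struct hlist_node *prev)
{
	struct hlist_node *p = n->next = prev->next;

	prev->next = n;
	n->pprev = &prev->next;
	if (p)
		p->pprev = &n->next;
}

typedef void (*add_fn)(struct hlist_node *, struct hlist_node *);

/* Builds head -> a -> b -> c -> d with the given helpers and returns 1
 * if every next and pprev link is consistent, 0 otherwise. */
static int build_and_check(add_fn add_before, add_fn add_behind)
{
	struct hlist_head head = { NULL };
	struct hlist_node a, b, c, d;
	struct hlist_node *expect[] = { &a, &b, &c, &d };
	struct hlist_node **link = &head.first;
	size_t i;

	hlist_add_head(&c, &head);	/* head -> c */
	add_before(&a, &c);		/* head -> a -> c */
	add_behind(&b, &a);		/* successor exists */
	add_behind(&d, &c);		/* no successor */

	for (i = 0; i < 4; i++) {
		if (*link != expect[i] || expect[i]->pprev != link)
			return 0;
		link = &expect[i]->next;
	}
	return *link == NULL;
}
```

`build_and_check(add_before_old, add_behind_old)` and `build_and_check(add_before_new, add_behind_new)` should both return 1. This only exercises the non-aliasing case and says nothing about the quality of the generated code.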
* Re: [RFC] microoptimizing hlist_add_{before,behind}
From: Linus Torvalds @ 2019-09-21 17:03 UTC (permalink / raw)
To: Al Viro; +Cc: Linux Kernel Mailing List
On Fri, Sep 20, 2019 at 8:11 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> My apologies ;-/ Correct diff follows:
This is similar to what we do for the regular list_add(), so I have no
objections to the micro-optimization.
Of course, for list_add() we do it by using a helper function and
passing those prev/next pointers to it instead, so it _looks_ very
different. But the logic is the same: do the loads of next/prev early
and once, so that gcc doesn't think they might alias with the updates.
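For comparison, the list_add() pattern looks roughly like the sketch below. It is a simplified userspace rendering, assuming the shape of the helper in include/linux/list.h with the debug validation and `WRITE_ONCE()` left out; `list_add_between()` is a hypothetical name standing in for the kernel's `__list_add()`, and `init_list_head()` for `INIT_LIST_HEAD()`.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* prev and next arrive as plain pointer values, so nothing written
 * through new, prev or next can force them to be loaded again. */
static void list_add_between(struct list_head *new,
			     struct list_head *prev,
			     struct list_head *next)
{
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
}

/* head->next is loaded exactly once, at the call site. */
static void list_add(struct list_head *new, struct list_head *head)
{
	list_add_between(new, head, head->next);
}
```

The early load is hidden in the argument list of the helper call, which is why it looks so different from the explicit local variable in the hlist patch.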
However, I *really* don't like this syntax:
    struct hlist_node *p = n->next = prev->next;
What, what? That's illegible. Both for the double assignment within a
declaration and for the naming.
Yeah, I assume you mean 'p' just for pointer. Fine. But when we are
explicitly playing with multiple pointers, just give them a name.
In this case, 'next'.
So just do
  hlist_add_behind:
        struct hlist_node *next = prev->next;
        n->next = next;
        prev->next = n;
        n->pprev = &prev->next;
        if (next)
                next->pprev = &n->next;
And honestly, I'd rename 'n' to 'new' too while at it. We're not
using C++, so we can use sane names (and already do in other places).
That way each statement makes sense on its own, rather than being a
mess of "what do 'p' and 'n' mean?"
Linus
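Put together, the naming suggestion might look like the sketch below. The hlist_add_behind() body follows the snippet above with 'n' renamed to 'new'. The hlist_add_before() body is an extrapolation that is not spelled out in the thread: calling the local 'pprev' is an assumption. `WRITE_ONCE()` is again a userspace stand-in using the GCC/Clang `__typeof__` extension.

```c
#include <stddef.h>

struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Stand-in for the kernel macro: a single volatile store (assumption). */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Extrapolated: the slot that used to point at next is loaded once and
 * given a name ('pprev' is this sketch's choice, not from the thread). */
static void hlist_add_before(struct hlist_node *new,
			     struct hlist_node *next)
{
	struct hlist_node **pprev = next->pprev;

	new->pprev = pprev;
	new->next = next;
	next->pprev = &new->next;
	WRITE_ONCE(*pprev, new);
}

/* As suggested in the reply, with 'n' renamed to 'new'. */
static void hlist_add_behind(struct hlist_node *new,
			     struct hlist_node *prev)
{
	struct hlist_node *next = prev->next;

	new->next = next;
	prev->next = new;
	new->pprev = &prev->next;
	if (next)
		next->pprev = &new->next;
}
```

With one load per local and one name per role, each statement can be read without tracking what an earlier store might have changed.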