* [patch] epoll use unlocked wqueue operations ...
@ 2006-06-02 23:28 Davide Libenzi
2006-06-03 6:04 ` Willy Tarreau
0 siblings, 1 reply; 5+ messages in thread
From: Davide Libenzi @ 2006-06-02 23:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linux Kernel Mailing List, Arjan Van de Ven
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3992 bytes --]
A few days ago Arjan signaled a lockdep red flag on epoll locks, and
precisely between the epoll's device structure lock (->lock) and the wait
queue head lock (->lock). Like I explained in another email, and directly
to Arjan, this can't happen in reality because of the explicit check at
eventpoll.c:592, that does not allow to drop an epoll fd inside the same
epoll fd. Since lockdep is working on per-structure locks, it will never
be able to know of policies enforced in other parts of the code. It was
decided time ago of having the ability to drop epoll fds inside other
epoll fds, that triggers a very trick wakeup operations (due to possibly
reentrant callback-driven wakeups) handled by the ep_poll_safewake() function.
While looking again at the code though, I noticed that all the operations
done on the epoll's main structure wait queue head (->wq) are already
protected by the epoll lock (->lock), so that locked-style functions can
be used to manipulate the ->wq member. This makes both a lock-acquire
save, and lockdep happy.
Running totalmess on my dual opteron for a while did not reveal any
problem so far:
http://www.xmailserver.org/totalmess.c
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
- Davide
diff -Nru linux-2.6.17-rc5.vanilla/fs/eventpoll.c linux-2.6.17-rc5.eplock/fs/eventpoll.c
--- linux-2.6.17-rc5.vanilla/fs/eventpoll.c 2006-06-02 11:04:58.000000000 -0700
+++ linux-2.6.17-rc5.eplock/fs/eventpoll.c 2006-06-02 11:22:43.000000000 -0700
@@ -1,6 +1,6 @@
/*
* fs/eventpoll.c ( Efficent event polling implementation )
- * Copyright (C) 2001,...,2003 Davide Libenzi
+ * Copyright (C) 2001,...,2006 Davide Libenzi
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -1004,7 +1004,7 @@
/* Notify waiting tasks that events are available */
if (waitqueue_active(&ep->wq))
- wake_up(&ep->wq);
+ __wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE);
if (waitqueue_active(&ep->poll_wait))
pwake++;
}
@@ -1083,7 +1083,8 @@
/* Notify waiting tasks that events are available */
if (waitqueue_active(&ep->wq))
- wake_up(&ep->wq);
+ __wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+ TASK_INTERRUPTIBLE);
if (waitqueue_active(&ep->poll_wait))
pwake++;
}
@@ -1260,7 +1261,8 @@
* wait list.
*/
if (waitqueue_active(&ep->wq))
- wake_up(&ep->wq);
+ __wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+ TASK_INTERRUPTIBLE);
if (waitqueue_active(&ep->poll_wait))
pwake++;
@@ -1444,7 +1446,8 @@
* wait list.
*/
if (waitqueue_active(&ep->wq))
- wake_up(&ep->wq);
+ __wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+ TASK_INTERRUPTIBLE);
if (waitqueue_active(&ep->poll_wait))
pwake++;
}
@@ -1516,7 +1519,7 @@
* ep_poll_callback() when events will become available.
*/
init_waitqueue_entry(&wait, current);
- add_wait_queue(&ep->wq, &wait);
+ __add_wait_queue(&ep->wq, &wait);
for (;;) {
/*
@@ -1536,7 +1539,7 @@
jtimeout = schedule_timeout(jtimeout);
write_lock_irqsave(&ep->lock, flags);
}
- remove_wait_queue(&ep->wq, &wait);
+ __remove_wait_queue(&ep->wq, &wait);
set_current_state(TASK_RUNNING);
}
diff -Nru linux-2.6.17-rc5.vanilla/include/linux/eventpoll.h linux-2.6.17-rc5.eplock/include/linux/eventpoll.h
--- linux-2.6.17-rc5.vanilla/include/linux/eventpoll.h 2006-06-02 11:04:58.000000000 -0700
+++ linux-2.6.17-rc5.eplock/include/linux/eventpoll.h 2006-06-02 11:07:48.000000000 -0700
@@ -1,6 +1,6 @@
/*
* include/linux/eventpoll.h ( Efficent event polling implementation )
- * Copyright (C) 2001,...,2003 Davide Libenzi
+ * Copyright (C) 2001,...,2006 Davide Libenzi
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
[-- Attachment #2: Type: TEXT/plain, Size: 2782 bytes --]
diff -Nru linux-2.6.17-rc5.vanilla/fs/eventpoll.c linux-2.6.17-rc5.eplock/fs/eventpoll.c
--- linux-2.6.17-rc5.vanilla/fs/eventpoll.c 2006-06-02 11:04:58.000000000 -0700
+++ linux-2.6.17-rc5.eplock/fs/eventpoll.c 2006-06-02 11:22:43.000000000 -0700
@@ -1,6 +1,6 @@
/*
* fs/eventpoll.c ( Efficent event polling implementation )
- * Copyright (C) 2001,...,2003 Davide Libenzi
+ * Copyright (C) 2001,...,2006 Davide Libenzi
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -1004,7 +1004,7 @@
/* Notify waiting tasks that events are available */
if (waitqueue_active(&ep->wq))
- wake_up(&ep->wq);
+ __wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE);
if (waitqueue_active(&ep->poll_wait))
pwake++;
}
@@ -1083,7 +1083,8 @@
/* Notify waiting tasks that events are available */
if (waitqueue_active(&ep->wq))
- wake_up(&ep->wq);
+ __wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+ TASK_INTERRUPTIBLE);
if (waitqueue_active(&ep->poll_wait))
pwake++;
}
@@ -1260,7 +1261,8 @@
* wait list.
*/
if (waitqueue_active(&ep->wq))
- wake_up(&ep->wq);
+ __wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+ TASK_INTERRUPTIBLE);
if (waitqueue_active(&ep->poll_wait))
pwake++;
@@ -1444,7 +1446,8 @@
* wait list.
*/
if (waitqueue_active(&ep->wq))
- wake_up(&ep->wq);
+ __wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+ TASK_INTERRUPTIBLE);
if (waitqueue_active(&ep->poll_wait))
pwake++;
}
@@ -1516,7 +1519,7 @@
* ep_poll_callback() when events will become available.
*/
init_waitqueue_entry(&wait, current);
- add_wait_queue(&ep->wq, &wait);
+ __add_wait_queue(&ep->wq, &wait);
for (;;) {
/*
@@ -1536,7 +1539,7 @@
jtimeout = schedule_timeout(jtimeout);
write_lock_irqsave(&ep->lock, flags);
}
- remove_wait_queue(&ep->wq, &wait);
+ __remove_wait_queue(&ep->wq, &wait);
set_current_state(TASK_RUNNING);
}
diff -Nru linux-2.6.17-rc5.vanilla/include/linux/eventpoll.h linux-2.6.17-rc5.eplock/include/linux/eventpoll.h
--- linux-2.6.17-rc5.vanilla/include/linux/eventpoll.h 2006-06-02 11:04:58.000000000 -0700
+++ linux-2.6.17-rc5.eplock/include/linux/eventpoll.h 2006-06-02 11:07:48.000000000 -0700
@@ -1,6 +1,6 @@
/*
* include/linux/eventpoll.h ( Efficent event polling implementation )
- * Copyright (C) 2001,...,2003 Davide Libenzi
+ * Copyright (C) 2001,...,2006 Davide Libenzi
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch] epoll use unlocked wqueue operations ...
2006-06-02 23:28 [patch] epoll use unlocked wqueue operations Davide Libenzi
@ 2006-06-03 6:04 ` Willy Tarreau
2006-06-03 17:35 ` Davide Libenzi
0 siblings, 1 reply; 5+ messages in thread
From: Willy Tarreau @ 2006-06-03 6:04 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Andrew Morton, Linux Kernel Mailing List, Arjan Van de Ven
Hi Davide,
On Fri, Jun 02, 2006 at 04:28:25PM -0700, Davide Libenzi wrote:
>
> A few days ago Arjan signaled a lockdep red flag on epoll locks, and
> precisely between the epoll's device structure lock (->lock) and the wait
> queue head lock (->lock). Like I explained in another email, and directly
> to Arjan, this can't happen in reality because of the explicit check at
> eventpoll.c:592, that does not allow to drop an epoll fd inside the same
> epoll fd. Since lockdep is working on per-structure locks, it will never
> be able to know of policies enforced in other parts of the code. It was
> decided time ago of having the ability to drop epoll fds inside other
> epoll fds, that triggers a very trick wakeup operations (due to possibly
> reentrant callback-driven wakeups) handled by the ep_poll_safewake()
> function.
> While looking again at the code though, I noticed that all the operations
> done on the epoll's main structure wait queue head (->wq) are already
> protected by the epoll lock (->lock), so that locked-style functions can
> be used to manipulate the ->wq member. This makes both a lock-acquire
> save, and lockdep happy.
> Running totalmess on my dual opteron for a while did not reveal any
> problem so far:
>
> http://www.xmailserver.org/totalmess.c
Shouldn't we notice a tiny performance boost by avoiding those useless
locks, or do you consider they are not located in the fast path anyway ?
Regards,
Willy
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch] epoll use unlocked wqueue operations ...
2006-06-03 6:04 ` Willy Tarreau
@ 2006-06-03 17:35 ` Davide Libenzi
2006-06-04 20:52 ` Willy TARREAU
0 siblings, 1 reply; 5+ messages in thread
From: Davide Libenzi @ 2006-06-03 17:35 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Andrew Morton, Linux Kernel Mailing List, Arjan Van de Ven
On Sat, 3 Jun 2006, Willy Tarreau wrote:
> Hi Davide,
>
> On Fri, Jun 02, 2006 at 04:28:25PM -0700, Davide Libenzi wrote:
>>
>> A few days ago Arjan signaled a lockdep red flag on epoll locks, and
>> precisely between the epoll's device structure lock (->lock) and the wait
>> queue head lock (->lock). Like I explained in another email, and directly
>> to Arjan, this can't happen in reality because of the explicit check at
>> eventpoll.c:592, that does not allow to drop an epoll fd inside the same
>> epoll fd. Since lockdep is working on per-structure locks, it will never
>> be able to know of policies enforced in other parts of the code. It was
>> decided time ago of having the ability to drop epoll fds inside other
>> epoll fds, that triggers a very trick wakeup operations (due to possibly
>> reentrant callback-driven wakeups) handled by the ep_poll_safewake()
>> function.
>> While looking again at the code though, I noticed that all the operations
>> done on the epoll's main structure wait queue head (->wq) are already
>> protected by the epoll lock (->lock), so that locked-style functions can
>> be used to manipulate the ->wq member. This makes both a lock-acquire
>> save, and lockdep happy.
>> Running totalmess on my dual opteron for a while did not reveal any
>> problem so far:
>>
>> http://www.xmailserver.org/totalmess.c
>
> Shouldn't we notice a tiny performance boost by avoiding those useless
> locks, or do you consider they are not located in the fast path anyway ?
Well, we take a lock less but I can't say if it'll be measureable. The
test program above is not a performance thing though, just some code to
verify multiple threads doing waits on the same epoll fd.
- Davide
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch] epoll use unlocked wqueue operations ...
2006-06-03 17:35 ` Davide Libenzi
@ 2006-06-04 20:52 ` Willy TARREAU
2006-06-05 1:01 ` Davide Libenzi
0 siblings, 1 reply; 5+ messages in thread
From: Willy TARREAU @ 2006-06-04 20:52 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Andrew Morton, Linux Kernel Mailing List, Arjan Van de Ven
Hi Davide,
On Sat, Jun 03, 2006 at 10:35:52AM -0700, Davide Libenzi wrote:
> On Sat, 3 Jun 2006, Willy Tarreau wrote:
>
> >Hi Davide,
> >
> >On Fri, Jun 02, 2006 at 04:28:25PM -0700, Davide Libenzi wrote:
> >>
> >>A few days ago Arjan signaled a lockdep red flag on epoll locks, and
> >>precisely between the epoll's device structure lock (->lock) and the
> >>wait
> >>queue head lock (->lock). Like I explained in another email, and
> >>directly
> >>to Arjan, this can't happen in reality because of the explicit check at
> >>eventpoll.c:592, that does not allow to drop an epoll fd inside the same
> >>epoll fd. Since lockdep is working on per-structure locks, it will never
> >>be able to know of policies enforced in other parts of the code. It was
> >>decided time ago of having the ability to drop epoll fds inside other
> >>epoll fds, that triggers a very trick wakeup operations (due to possibly
> >>reentrant callback-driven wakeups) handled by the ep_poll_safewake()
> >>function.
> >>While looking again at the code though, I noticed that all the
> >>operations
> >>done on the epoll's main structure wait queue head (->wq) are already
> >>protected by the epoll lock (->lock), so that locked-style functions can
> >>be used to manipulate the ->wq member. This makes both a lock-acquire
> >>save, and lockdep happy.
> >>Running totalmess on my dual opteron for a while did not reveal any
> >>problem so far:
> >>
> >>http://www.xmailserver.org/totalmess.c
> >
> >Shouldn't we notice a tiny performance boost by avoiding those useless
> >locks, or do you consider they are not located in the fast path anyway ?
>
> Well, we take a lock less but I can't say if it'll be measureable. The
> test program above is not a performance thing though, just some code to
> verify multiple threads doing waits on the same epoll fd.
OK, so I ported your patch to 2.4 (+epoll-lt-0.22) because I have some
code using it right there. At first, I thought I was observing measuring
errors, but after about 6 reboots, I seem to observe a consistent 6.5%
increase in the number of sessions/s on my fake web server on my dual
athlon. It jumps from 14350 hits/s with epoll-lt-0.22 alone to 15300 with
your patch. It seems much to me, but I'm sure I'm not dreaming (yet).
I'll send you (offlist) an update to 2.4-epoll-lt-0.22 which incorporates
this patch.
> - Davide
Cheers,
Willy
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch] epoll use unlocked wqueue operations ...
2006-06-04 20:52 ` Willy TARREAU
@ 2006-06-05 1:01 ` Davide Libenzi
0 siblings, 0 replies; 5+ messages in thread
From: Davide Libenzi @ 2006-06-05 1:01 UTC (permalink / raw)
To: Willy TARREAU; +Cc: Andrew Morton, Linux Kernel Mailing List, Arjan Van de Ven
On Sun, 4 Jun 2006, Willy TARREAU wrote:
>> Well, we take a lock less but I can't say if it'll be measureable. The
>> test program above is not a performance thing though, just some code to
>> verify multiple threads doing waits on the same epoll fd.
>
> OK, so I ported your patch to 2.4 (+epoll-lt-0.22) because I have some
> code using it right there. At first, I thought I was observing measuring
> errors, but after about 6 reboots, I seem to observe a consistent 6.5%
> increase in the number of sessions/s on my fake web server on my dual
> athlon. It jumps from 14350 hits/s with epoll-lt-0.22 alone to 15300 with
> your patch. It seems much to me, but I'm sure I'm not dreaming (yet).
Ok, that's measureable :)
> I'll send you (offlist) an update to 2.4-epoll-lt-0.22 which incorporates
> this patch.
Thx for the patch. Got it and uploaded.
- Davide
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2006-06-05 1:01 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-02 23:28 [patch] epoll use unlocked wqueue operations Davide Libenzi
2006-06-03 6:04 ` Willy Tarreau
2006-06-03 17:35 ` Davide Libenzi
2006-06-04 20:52 ` Willy TARREAU
2006-06-05 1:01 ` Davide Libenzi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox