public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [patch] epoll use unlocked wqueue operations ...
@ 2006-06-02 23:28 Davide Libenzi
  2006-06-03  6:04 ` Willy Tarreau
  0 siblings, 1 reply; 5+ messages in thread
From: Davide Libenzi @ 2006-06-02 23:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel Mailing List, Arjan Van de Ven

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3992 bytes --]


A few days ago Arjan signaled a lockdep red flag on epoll locks, and 
precisely between the epoll's device structure lock (->lock) and the wait 
queue head lock (->lock). Like I explained in another email, and directly 
to Arjan, this can't happen in reality because of the explicit check at 
eventpoll.c:592, that does not allow to drop an epoll fd inside the same 
epoll fd. Since lockdep is working on per-structure locks, it will never 
be able to know of policies enforced in other parts of the code. It was 
decided time ago of having the ability to drop epoll fds inside other 
epoll fds, that triggers a very trick wakeup operations (due to possibly 
reentrant callback-driven wakeups) handled by the ep_poll_safewake() function.
While looking again at the code though, I noticed that all the operations 
done on the epoll's main structure wait queue head (->wq) are already 
protected by the epoll lock (->lock), so that locked-style functions can 
be used to manipulate the ->wq member. This makes both a lock-acquire 
save, and lockdep happy.
Running totalmess on my dual opteron for a while did not reveal any 
problem so far:

http://www.xmailserver.org/totalmess.c



Signed-off-by: Davide Libenzi <davidel@xmailserver.org>



- Davide



diff -Nru linux-2.6.17-rc5.vanilla/fs/eventpoll.c linux-2.6.17-rc5.eplock/fs/eventpoll.c
--- linux-2.6.17-rc5.vanilla/fs/eventpoll.c	2006-06-02 11:04:58.000000000 -0700
+++ linux-2.6.17-rc5.eplock/fs/eventpoll.c	2006-06-02 11:22:43.000000000 -0700
@@ -1,6 +1,6 @@
  /*
   *  fs/eventpoll.c ( Efficent event polling implementation )
- *  Copyright (C) 2001,...,2003	 Davide Libenzi
+ *  Copyright (C) 2001,...,2006	 Davide Libenzi
   *
   *  This program is free software; you can redistribute it and/or modify
   *  it under the terms of the GNU General Public License as published by
@@ -1004,7 +1004,7 @@

  		/* Notify waiting tasks that events are available */
  		if (waitqueue_active(&ep->wq))
-			wake_up(&ep->wq);
+			__wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE);
  		if (waitqueue_active(&ep->poll_wait))
  			pwake++;
  	}
@@ -1083,7 +1083,8 @@

  				/* Notify waiting tasks that events are available */
  				if (waitqueue_active(&ep->wq))
-					wake_up(&ep->wq);
+					__wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+							 TASK_INTERRUPTIBLE);
  				if (waitqueue_active(&ep->poll_wait))
  					pwake++;
  			}
@@ -1260,7 +1261,8 @@
  	 * wait list.
  	 */
  	if (waitqueue_active(&ep->wq))
-		wake_up(&ep->wq);
+		__wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+				 TASK_INTERRUPTIBLE);
  	if (waitqueue_active(&ep->poll_wait))
  		pwake++;

@@ -1444,7 +1446,8 @@
  		 * wait list.
  		 */
  		if (waitqueue_active(&ep->wq))
-			wake_up(&ep->wq);
+			__wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+					 TASK_INTERRUPTIBLE);
  		if (waitqueue_active(&ep->poll_wait))
  			pwake++;
  	}
@@ -1516,7 +1519,7 @@
  		 * ep_poll_callback() when events will become available.
  		 */
  		init_waitqueue_entry(&wait, current);
-		add_wait_queue(&ep->wq, &wait);
+		__add_wait_queue(&ep->wq, &wait);

  		for (;;) {
  			/*
@@ -1536,7 +1539,7 @@
  			jtimeout = schedule_timeout(jtimeout);
  			write_lock_irqsave(&ep->lock, flags);
  		}
-		remove_wait_queue(&ep->wq, &wait);
+		__remove_wait_queue(&ep->wq, &wait);

  		set_current_state(TASK_RUNNING);
  	}
diff -Nru linux-2.6.17-rc5.vanilla/include/linux/eventpoll.h linux-2.6.17-rc5.eplock/include/linux/eventpoll.h
--- linux-2.6.17-rc5.vanilla/include/linux/eventpoll.h	2006-06-02 11:04:58.000000000 -0700
+++ linux-2.6.17-rc5.eplock/include/linux/eventpoll.h	2006-06-02 11:07:48.000000000 -0700
@@ -1,6 +1,6 @@
  /*
   *  include/linux/eventpoll.h ( Efficent event polling implementation )
- *  Copyright (C) 2001,...,2003	 Davide Libenzi
+ *  Copyright (C) 2001,...,2006	 Davide Libenzi
   *
   *  This program is free software; you can redistribute it and/or modify
   *  it under the terms of the GNU General Public License as published by

[-- Attachment #2: Type: TEXT/plain, Size: 2782 bytes --]

diff -Nru linux-2.6.17-rc5.vanilla/fs/eventpoll.c linux-2.6.17-rc5.eplock/fs/eventpoll.c
--- linux-2.6.17-rc5.vanilla/fs/eventpoll.c	2006-06-02 11:04:58.000000000 -0700
+++ linux-2.6.17-rc5.eplock/fs/eventpoll.c	2006-06-02 11:22:43.000000000 -0700
@@ -1,6 +1,6 @@
 /*
  *  fs/eventpoll.c ( Efficent event polling implementation )
- *  Copyright (C) 2001,...,2003	 Davide Libenzi
+ *  Copyright (C) 2001,...,2006	 Davide Libenzi
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -1004,7 +1004,7 @@
 
 		/* Notify waiting tasks that events are available */
 		if (waitqueue_active(&ep->wq))
-			wake_up(&ep->wq);
+			__wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE);
 		if (waitqueue_active(&ep->poll_wait))
 			pwake++;
 	}
@@ -1083,7 +1083,8 @@
 
 				/* Notify waiting tasks that events are available */
 				if (waitqueue_active(&ep->wq))
-					wake_up(&ep->wq);
+					__wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+							 TASK_INTERRUPTIBLE);
 				if (waitqueue_active(&ep->poll_wait))
 					pwake++;
 			}
@@ -1260,7 +1261,8 @@
 	 * wait list.
 	 */
 	if (waitqueue_active(&ep->wq))
-		wake_up(&ep->wq);
+		__wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+				 TASK_INTERRUPTIBLE);
 	if (waitqueue_active(&ep->poll_wait))
 		pwake++;
 
@@ -1444,7 +1446,8 @@
 		 * wait list.
 		 */
 		if (waitqueue_active(&ep->wq))
-			wake_up(&ep->wq);
+			__wake_up_locked(&ep->wq, TASK_UNINTERRUPTIBLE |
+					 TASK_INTERRUPTIBLE);
 		if (waitqueue_active(&ep->poll_wait))
 			pwake++;
 	}
@@ -1516,7 +1519,7 @@
 		 * ep_poll_callback() when events will become available.
 		 */
 		init_waitqueue_entry(&wait, current);
-		add_wait_queue(&ep->wq, &wait);
+		__add_wait_queue(&ep->wq, &wait);
 
 		for (;;) {
 			/*
@@ -1536,7 +1539,7 @@
 			jtimeout = schedule_timeout(jtimeout);
 			write_lock_irqsave(&ep->lock, flags);
 		}
-		remove_wait_queue(&ep->wq, &wait);
+		__remove_wait_queue(&ep->wq, &wait);
 
 		set_current_state(TASK_RUNNING);
 	}
diff -Nru linux-2.6.17-rc5.vanilla/include/linux/eventpoll.h linux-2.6.17-rc5.eplock/include/linux/eventpoll.h
--- linux-2.6.17-rc5.vanilla/include/linux/eventpoll.h	2006-06-02 11:04:58.000000000 -0700
+++ linux-2.6.17-rc5.eplock/include/linux/eventpoll.h	2006-06-02 11:07:48.000000000 -0700
@@ -1,6 +1,6 @@
 /*
  *  include/linux/eventpoll.h ( Efficent event polling implementation )
- *  Copyright (C) 2001,...,2003	 Davide Libenzi
+ *  Copyright (C) 2001,...,2006	 Davide Libenzi
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-06-05  1:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-02 23:28 [patch] epoll use unlocked wqueue operations Davide Libenzi
2006-06-03  6:04 ` Willy Tarreau
2006-06-03 17:35   ` Davide Libenzi
2006-06-04 20:52     ` Willy TARREAU
2006-06-05  1:01       ` Davide Libenzi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox