linux-kernel.vger.kernel.org archive mirror
* Re: The thundering herd like problem when multi epolls on one fd
@ 2012-01-15 15:41 Li Yu
  0 siblings, 0 replies; 4+ messages in thread
From: Li Yu @ 2012-01-15 15:41 UTC (permalink / raw)
  To: eric.dumazet; +Cc: linux-kernel, davidel



2012/1/14 Eric Dumazet <eric.dumazet@gmail.com>:
> On Saturday, 14 January 2012 at 19:13 +0800, Li Yu wrote:
>> Hi,
>>
>>       My buddy reported a thundering herd problem with using epoll
>> on TCP listen sockets. He said their usage is like below:
>>
>>       1. sk = new tcp_listen_socket();
>>       2. create many child processes or threads.
>>       3. in the newly created processes (threads), use the epoll API
>> on the listen sk to provide HTTP service.
>>
>>       Such a usage pattern means we have multiple wait queues when
>> accepting on one socket, and the wakeup is not exclusive, so we get a
>> thundering-herd-like problem. And, as far as I heard, many popular
>> applications use such a pattern, including at least nginx, lighttpd
>> and haproxy.
>
> It is not very scalable. But we really lack a fanout mechanism to allow
> better parallelism on accept(); it's not a poll() vs select() vs epoll()
> problem per se, but a generic problem.
>

I am interested in this issue. My rough idea is that we may utilize XPS or
RPS/RSS information to detect which tasks on the target processor to
wake up.
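
As another possible direction for the fanout Eric mentions, below is a
minimal userspace sketch. It assumes a SO_REUSEPORT-style socket option
(an assumption for illustration, not an interface provided by the kernel
under discussion) that lets each worker bind its own listener to the same
port, so the kernel spreads incoming connections across the listeners and
only one waiter is woken per connection:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* One listener per worker, all bound to the same port; the kernel
 * picks a single socket (and thus a single epoll waiter) for each
 * incoming connection. Port and backlog are illustrative. */
static int worker_listener(unsigned short port)
{
	struct sockaddr_in addr;
	int one = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	/* the assumed per-socket opt-in for the fanout */
	setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(port);

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, 128) < 0) {
		close(fd);
		return -1;
	}
	return fd;	/* each worker epolls only its own fd */
}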

>> So should we change this wakeup behavior to exclusive too?
>>
>
> Certainly not.
>
>>       Below is a simple patch (tested and working) for epoll() to do it;
>> of course, we should also fix the select() and poll() syscalls if this
>> is the right direction.
>>
>>       Thanks.
>>
>> Yu
>>
>> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
>> index 828e750..a3d6ab4 100644
>> --- a/fs/eventpoll.c
>> +++ b/fs/eventpoll.c
>> @@ -898,7 +899,7 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
>>                 init_waitqueue_func_entry(&pwq->wait, ep_poll_callback);
>>                 pwq->whead = whead;
>>                 pwq->base = epi;
>> -               add_wait_queue(whead, &pwq->wait);
>> +               add_wait_queue_exclusive(whead, &pwq->wait);
>>                 list_add_tail(&pwq->llink, &epi->pwqlist);
>>                 epi->nwait++;
>>         } else {
>> --
>
>
> What happens if the awakened thread does not consume the event, and
> prefers to exit?

In my view, if so, it should be considered a bug in the application.

>
> If several threads are doing select()/poll()/epoll() on a shared fd,
> they _all_ must be notified that the fd is ready, as the manpages say.
>
> Doing otherwise would require the prior consent of the user, using a
> special flag for example, and documentation.
>

Indeed, thanks!
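
For what it's worth, a minimal sketch of what such a user opt-in could
look like from userspace, assuming a hypothetical EPOLLEXCLUSIVE-style
flag requested at EPOLL_CTL_ADD time (the flag, epfd and listen_fd are
assumptions for illustration, not something the kernel here provides):

#include <sys/epoll.h>

/* epfd and listen_fd are placeholders supplied by the caller */
static int add_listener_exclusive(int epfd, int listen_fd)
{
	struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE };

	ev.data.fd = listen_fd;
	/* only callers that explicitly set the flag opt in to exclusive
	 * wakeups; every other waiter keeps the documented wake-all
	 * behaviour */
	return epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);
}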

Yu

* The thundering herd like problem when multi epolls on one fd
@ 2012-01-14 11:13 Li Yu
  2012-01-14 13:20 ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: Li Yu @ 2012-01-14 11:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: davidel

Hi, 

	My buddy reported a thundering herd problem with using epoll
on TCP listen sockets. He said their usage is like below:

	1. sk = new tcp_listen_socket();
	2. create many child processes or threads.
	3. in the newly created processes (threads), use the epoll API on
the listen sk to provide HTTP service.

	Such a usage pattern means we have multiple wait queues when
accepting on one socket, and the wakeup is not exclusive, so we get a
thundering-herd-like problem. And, as far as I heard, many popular
applications use such a pattern, including at least nginx, lighttpd
and haproxy. So should we change this wakeup behavior to exclusive too?
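
For concreteness, a minimal sketch of the pattern above; the port, the
worker count, the nonblocking listen fd and the missing error handling
are illustrative assumptions:

#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	/* step 1: one shared TCP listen socket */
	struct sockaddr_in addr = { .sin_family = AF_INET };
	int lfd = socket(AF_INET, SOCK_STREAM, 0);

	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(8080);
	bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
	listen(lfd, 128);
	fcntl(lfd, F_SETFL, O_NONBLOCK);	/* losers of the accept()
						   race see EAGAIN */

	/* step 2: many worker processes */
	for (int i = 0; i < 8; i++) {
		if (fork() == 0) {
			/* step 3: each worker has its own epoll instance
			 * and adds the same listen fd to it */
			int epfd = epoll_create1(0);
			struct epoll_event ev = { .events = EPOLLIN };
			struct epoll_event out;

			ev.data.fd = lfd;
			epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev);
			for (;;) {
				/* one incoming connection wakes the wait
				 * queue entry of every worker here ... */
				if (epoll_wait(epfd, &out, 1, -1) > 0) {
					/* ... but only one accept() gets
					 * the connection; the rest were
					 * woken for nothing */
					int cfd = accept(lfd, NULL, NULL);
					if (cfd >= 0)
						close(cfd);
				}
			}
		}
	}
	for (;;)
		pause();	/* parent just keeps the workers alive */
}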

	Below is a simple patch (tested and working) for epoll() to do it;
of course, we should also fix the select() and poll() syscalls if this is
the right direction.

	Thanks.

Yu	

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 828e750..a3d6ab4 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -898,7 +899,7 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
                init_waitqueue_func_entry(&pwq->wait, ep_poll_callback);
                pwq->whead = whead;
                pwq->base = epi;
-               add_wait_queue(whead, &pwq->wait);
+               add_wait_queue_exclusive(whead, &pwq->wait);
                list_add_tail(&pwq->llink, &epi->pwqlist);
                epi->nwait++;
        } else {


Thread overview: 4+ messages
2012-01-15 15:41 The thundering herd like problem when multi epolls on one fd Li Yu
2012-01-14 11:13 Li Yu
2012-01-14 13:20 ` Eric Dumazet
2012-01-14 15:57   ` Hagen Paul Pfeifer
