From: Jason Baron <jbaron@akamai.com>
To: Andy Lutomirski <luto@amacapital.net>,
peterz@infradead.org, mingo@redhat.com, viro@zeniv.linux.org.uk
Cc: akpm@linux-foundation.org, normalperson@yhbt.net,
davidel@xmailserver.org, mtk.manpages@gmail.com,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 2/2] epoll: introduce EPOLLEXCLUSIVE and EPOLLROUNDROBIN
Date: Mon, 09 Feb 2015 16:32:48 -0500
Message-ID: <54D92780.4000303@akamai.com>
In-Reply-To: <54D915FC.7010003@amacapital.net>
On 02/09/2015 03:18 PM, Andy Lutomirski wrote:
> On 02/09/2015 12:06 PM, Jason Baron wrote:
>> Epoll file descriptors that are added to a shared wakeup source are always
>> added in a non-exclusive manner. That means that when we have multiple epoll
>> fds attached to a shared wakeup source they are all woken up. This can
>> lead to excessive cpu usage and uneven load distribution.
>>
>> This patch introduces two new 'events' flags that are intended to be used
>> with EPOLL_CTL_ADD operations. EPOLLEXCLUSIVE, adds the epoll fd to the event
>> source in an exclusive manner such that the minimum number of threads are
>> woken. EPOLLROUNDROBIN, which depends on EPOLLEXCLUSIVE also being set, can
>> also be added to the 'events' flag, such that we round robin around the set
>> of waiting threads.
>>
>> An implementation note is that in the epoll wakeup routine,
>> 'ep_poll_callback()', if EPOLLROUNDROBIN is set, we return 1, for a successful
>> wakeup, only when there are current waiters. The idea is to use this additional
>> heuristic in order to minimize wakeup latencies.
>
> I don't understand what this is intended to do.
>
> If an event has EPOLLONESHOT, then only one thread should be woken regardless, right? If not, isn't that just a bug that should be fixed?
>
Hmm... so with EPOLLONESHOT you basically get notified once about an event. If I have multiple epoll fds (say, one per thread) attached to a single source with EPOLLONESHOT, then all of the threads will potentially get woken up once per event, and I would then have to re-arm all of them. So I don't think EPOLLONESHOT addresses this particular use case. What I am trying to avoid is this mass wakeup, or thundering herd, for a shared event source.
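
The EPOLLONESHOT behavior described here can be reproduced with a minimal sketch like the one below. It is not the test program from 0/2. It assumes Linux and Python's select.epoll wrapper, uses a pipe as a stand-in for the shared event source, and the name N_WAITERS is hypothetical (one epoll fd per would-be thread).

```python
import os
import select

N_WAITERS = 4  # hypothetical: one epoll fd per worker thread

# Assumption: a pipe stands in for the shared event source.
read_fd, write_fd = os.pipe()

# Every epoll instance watches the same fd with EPOLLONESHOT.
epolls = [select.epoll() for _ in range(N_WAITERS)]
for ep in epolls:
    ep.register(read_fd, select.EPOLLIN | select.EPOLLONESHOT)

# One event on the shared source: count the events each epoll fd reports.
os.write(write_fd, b"x")
first_round = [len(ep.poll(0)) for ep in epolls]

# A second event arrives, but every epoll fd is now disarmed.
os.write(write_fd, b"y")
second_round = [len(ep.poll(0)) for ep in epolls]

# Re-arming has to be done on each epoll fd individually.
for ep in epolls:
    ep.modify(read_fd, select.EPOLLIN | select.EPOLLONESHOT)
third_round = [len(ep.poll(0)) for ep in epolls]

print(first_round, second_round, third_round)

for ep in epolls:
    ep.close()
os.close(read_fd)
os.close(write_fd)
```

The one-shot state is kept per epoll fd, so it limits how often each waiter is notified but not how many waiters are notified for a single event.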
> If an event has EPOLLET, then the considerations are similar to EPOLLONESHOT, right?
>
EPOLLET is still going to cause this thundering herd.
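
As a rough illustration of the EPOLLET case, the sketch below registers one source on several epoll fds in edge-triggered mode and counts how many of them report a single event. It assumes Linux and Python's select.epoll; the pipe and the name N_WAITERS are stand-ins chosen for the example, not part of the patch.

```python
import os
import select

N_WAITERS = 4  # hypothetical: one epoll fd per worker thread

# Assumption: a pipe stands in for the shared event source.
read_fd, write_fd = os.pipe()

# Every epoll instance watches the same fd in edge-triggered mode.
epolls = [select.epoll() for _ in range(N_WAITERS)]
for ep in epolls:
    ep.register(read_fd, select.EPOLLIN | select.EPOLLET)

# A single edge on the shared source.
os.write(write_fd, b"x")

# Count the epoll fds that report the edge.
woken = sum(1 for ep in epolls if ep.poll(0))

# Without a new edge, nothing further is reported, even though the
# data is still unread.
again = sum(1 for ep in epolls if ep.poll(0))

print(woken, again)

for ep in epolls:
    ep.close()
os.close(read_fd)
os.close(write_fd)
```

Edge triggering reduces repeated notifications on each epoll fd, but each edge is still delivered to every epoll fd attached to the source.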
> If an event is a normal level-triggered non-one-shot event, then I don't understand how a round-robin wakeup makes any sense. It's level-triggered, after all.
Yeah, so the current behavior is to wake up all of the threads. I'm trying to add a new mode that load-balances wakeups among the threads interested in the event. Perhaps the test program I attached to 0/2 will show the issue better?

Also, this originally came up in the context of a single listening socket that was attached to multiple epoll fds, each in a separate thread. With the attached patch, I can measure a large decrease in CPU usage and better balancing behavior among the accepting threads.
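
The current wake-everyone behavior with one epoll fd per thread can be sketched as follows. This is only an illustration under stated assumptions: Linux, Python's select.epoll, a pipe in place of the listening socket, and hypothetical names (N_THREADS, worker). It does not use the flags proposed in this patch.

```python
import os
import select
import threading

N_THREADS = 4  # hypothetical number of accepting threads

# Assumption: a pipe stands in for the single listening socket.
read_fd, write_fd = os.pipe()

woken = []
woken_lock = threading.Lock()
# Lets the main thread wait until every worker has registered the fd.
registered = threading.Barrier(N_THREADS + 1)

def worker(idx):
    # Each thread owns a private epoll fd attached to the shared source.
    ep = select.epoll()
    ep.register(read_fd, select.EPOLLIN)  # level-triggered, non-exclusive
    registered.wait()
    events = ep.poll(2.0)  # wait for the event, give up after 2 seconds
    if events:
        with woken_lock:
            woken.append(idx)
    ep.close()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_THREADS)]
for t in threads:
    t.start()

registered.wait()
os.write(write_fd, b"x")  # one event, enough work for a single thread

for t in threads:
    t.join()

print(sorted(woken))

os.close(read_fd)
os.close(write_fd)
```

Nobody consumes the byte here, which keeps the result deterministic. In a real server one thread would win the accept() and the others would have been woken for nothing.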
Thanks,
-Jason