From: Jason Baron <jbaron@akamai.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>,
peterz@infradead.org, mingo@redhat.com, viro@zeniv.linux.org.uk,
normalperson@yhbt.net, davidel@xmailserver.org,
mtk.manpages@gmail.com, luto@amacapital.net,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-api@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Alexander Viro <viro@ftp.linux.org.uk>
Subject: Re: [PATCH v3 0/3] epoll: introduce round robin wakeup mode
Date: Fri, 27 Feb 2015 17:01:32 -0500
Message-ID: <54F0E93C.3010306@akamai.com>
In-Reply-To: <20150227131034.2f2787dcabf285191a1f6ffa@linux-foundation.org>
On 02/27/2015 04:10 PM, Andrew Morton wrote:
> On Wed, 25 Feb 2015 11:27:04 -0500 Jason Baron <jbaron@akamai.com> wrote:
>
>>> With Libenzi inactive, eventpoll appears to be without a
>>> dedicated maintainer since 2011 or so. Is there anyone who
>>> knows the code and its usages in detail and does final ABI
>>> decisions on eventpoll - Andrew, Al or Linus?
>>>
>> Generally, Andrew and Al do more 'final' reviews here,
>> and a lot of others on lkml are always very helpful in
>> looking at this code. However, it's not always clear, at
>> least to me, who I should pester.
> Yes, it's a difficult situation.
>
> The 3/3 changelog refers to "EPOLLROUNDROBIN" which I assume is
> a leftover from some earlier revision?
Yes, that's a typo there. It should read 'EPOLL_ROTATE'.
>
> I don't really understand the need for rotation/round-robin. We can
> solve the thundering herd via exclusive wakeups, but what is the point
> in choosing to wake the task which has been sleeping for the longest
> time? Why is that better than waking the task which has been sleeping
> for the *least* time? That's probably faster as that task's data is
> more likely to still be in cache.
>
> The changelogs talk about "starvation" but they don't really say what
> this term means in this context, nor why it is a bad thing.
>
So the idea with the 'rotation' is to try and distribute the
workload more evenly across the worker threads. We currently
tend to wake up the 'head' of the queue over and over and
thus the workload for us is not evenly distributed. In fact, we
have a workload where we have to remove all the epoll sets
and then re-add them in a different order to improve the situation.
We are trying to avoid this workaround and in addition avoid
thundering wakeups when possible (using exclusive as you
mention).
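
To make the difference concrete, here is a small user-space model (not
kernel code, and not taken from the patches). It assumes the woken worker
is back on the queue before the next event arrives, which is roughly what
we see. The names waitq, wake_head, wake_rotate and simulate are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NWORKERS 4

/* Hypothetical model of a wait queue: order[0] is the head. */
struct waitq {
	int order[NWORKERS];        /* worker ids, head first */
	unsigned wakeups[NWORKERS]; /* wakeups delivered, indexed by worker id */
};

static void waitq_init(struct waitq *q)
{
	for (int i = 0; i < NWORKERS; i++) {
		q->order[i] = i;
		q->wakeups[i] = 0;
	}
}

/* Exclusive wakeup, no rotation: the head is woken and keeps its place. */
static int wake_head(struct waitq *q)
{
	int id = q->order[0];

	q->wakeups[id]++;
	return id;
}

/* Exclusive wakeup with rotation: the head is woken, then moved to the tail. */
static int wake_rotate(struct waitq *q)
{
	int id = q->order[0];

	for (int i = 0; i + 1 < NWORKERS; i++)
		q->order[i] = q->order[i + 1];
	q->order[NWORKERS - 1] = id;
	q->wakeups[id]++;
	return id;
}

/*
 * Deliver nevents events to a freshly initialized queue and return the
 * spread (max - min) of per-worker wakeup counts. A spread of 0 means
 * the events were distributed evenly.
 */
static unsigned simulate(int rotate, unsigned nevents, struct waitq *q)
{
	unsigned min, max;

	assert(q != NULL);
	waitq_init(q);
	for (unsigned e = 0; e < nevents; e++) {
		if (rotate)
			wake_rotate(q);
		else
			wake_head(q);
	}

	min = max = q->wakeups[0];
	for (int i = 1; i < NWORKERS; i++) {
		if (q->wakeups[i] < min)
			min = q->wakeups[i];
		if (q->wakeups[i] > max)
			max = q->wakeups[i];
	}
	return max - min;
}
```

In this model, simulate(0, 8, &q) gives all 8 wakeups to worker 0, while
simulate(1, 8, &q) gives each of the 4 workers 2 wakeups. Real queues are
messier, since workers re-queue at different times, but the bias toward
the head is the effect described above.
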
I agree that waking up the task that may have been sleeping longer
may not be the best for all workloads. So what I am proposing
here is an optional flag to meet a certain workload. It might not be
right for all workloads, but we have found it quite useful.
The 'starvation' mention was in regard to the fact that with this
new behavior of not waking up all threads (and rotating them),
an adversarial thread might insert itself into our wakeup queue
and 'starve' us out. This concern was raised by Andy Lutomirski,
and this current series is not subject to this issue, because it works
by creating a new epoll fd and then adding that epoll fd to the
wakeup queue. Thus, this 'new' epoll fd is local to the thread
and the wakeup queue continues to wake all threads. Only the
'new' epoll fd which we then attach ourselves to, implements the
exclusive/rotate behavior.
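
The nesting this relies on, one epoll fd registered inside another, can be
tried with existing calls only. The sketch below is an assumption-laden
illustration: it does not use the proposed flag, it does not claim to
match the exact topology of the series, and the function name
nested_epoll_demo and the use of a pipe as the event source are made up
for the example.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Returns 0 if an event on a pipe propagates through a nested epoll fd,
 * -1 on any failure. Only existing Linux calls are used; the proposed
 * EPOLL_ROTATE mode is deliberately left out.
 */
static int nested_epoll_demo(void)
{
	int pipefd[2] = { -1, -1 };
	int inner = -1, outer = -1, ret = -1;
	struct epoll_event ev = { 0 }, out = { 0 };

	if (pipe(pipefd) < 0)
		return -1;

	inner = epoll_create1(0);
	outer = epoll_create1(0);
	if (inner < 0 || outer < 0)
		goto done;

	/* The inner epoll fd watches the actual event source. */
	ev.events = EPOLLIN;
	ev.data.fd = pipefd[0];
	if (epoll_ctl(inner, EPOLL_CTL_ADD, pipefd[0], &ev) < 0)
		goto done;

	/* The outer epoll fd watches the inner epoll fd itself. */
	ev.events = EPOLLIN;
	ev.data.fd = inner;
	if (epoll_ctl(outer, EPOLL_CTL_ADD, inner, &ev) < 0)
		goto done;

	/* Generate one event on the source. */
	if (write(pipefd[1], "x", 1) != 1)
		goto done;

	/* The outer fd reports the inner fd as readable... */
	if (epoll_wait(outer, &out, 1, 1000) != 1 || out.data.fd != inner)
		goto done;
	/* ...and the inner fd then reports the source. */
	if (epoll_wait(inner, &out, 1, 1000) != 1 || out.data.fd != pipefd[0])
		goto done;

	ret = 0;
done:
	if (outer >= 0)
		close(outer);
	if (inner >= 0)
		close(inner);
	close(pipefd[0]);
	close(pipefd[1]);
	return ret;
}
```

The point of the indirection is that the event source's own wakeup queue
is left untouched; any special wakeup policy lives only on the epoll fd
that the cooperating threads chose to attach to.
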
Thanks,
-Jason
Thread overview: 16+ messages
2015-02-24 21:25 [PATCH v3 0/3] epoll: introduce round robin wakeup mode Jason Baron
2015-02-24 21:25 ` [PATCH v3 1/3] sched/wait: add __wake_up_rotate() Jason Baron
2015-02-24 21:25 ` [PATCH v3 2/3] epoll: restrict wakeups to the overflow list Jason Baron
2015-02-24 21:25 ` [PATCH v3 3/3] epoll: Add EPOLL_ROTATE mode Jason Baron
2015-02-25 7:38 ` [PATCH v3 0/3] epoll: introduce round robin wakeup mode Ingo Molnar
2015-02-25 16:27 ` Jason Baron
2015-02-27 21:10 ` Andrew Morton
2015-02-27 21:31 ` Jonathan Corbet
2015-03-02 5:04 ` Jason Baron
2015-02-27 22:01 ` Jason Baron [this message]
2015-02-27 22:31 ` Andrew Morton
2015-03-05 0:02 ` Ingo Molnar
2015-03-05 3:53 ` Jason Baron
2015-03-05 9:15 ` Ingo Molnar
2015-03-05 20:24 ` Jason Baron
2015-03-07 12:35 ` Jason Baron