linux-fsdevel.vger.kernel.org archive mirror
From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Jason Baron <jbaron@akamai.com>, akpm@linux-foundation.org
Cc: mtk.manpages@gmail.com, mingo@kernel.org, peterz@infradead.org,
	viro@ftp.linux.org.uk, normalperson@yhbt.net, m@silodev.com,
	corbet@lwn.net, luto@amacapital.net,
	torvalds@linux-foundation.org, hagen@jauu.net,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org
Subject: Re: [PATCH] epoll: add exclusive wakeups flag
Date: Thu, 28 Jan 2016 08:16:11 +0100
Message-ID: <56A9C03B.7020104@gmail.com>
In-Reply-To: <cover.1449523436.git.jbaron@akamai.com>

Hi Jason,

On 12/08/2015 04:23 AM, Jason Baron wrote:
> Hi,
> 
> Re-post of an old series addressing thundering herd issues when sharing
> an event source fd amongst multiple epoll fds. Last posting was here
> for reference: https://lkml.org/lkml/2015/2/25/56
>  
> The patch herein drops the core scheduler 'rotate' changes I had previously
> proposed as this patch seems performant without those.
> 
> I was prompted to re-post this because Madars Vitolins reported some good
> speedups with this patch using Enduro/X application. His writeup is here:
> https://mvitolin.wordpress.com/2015/12/05/endurox-testing-epollexclusive-flag/
> 
> Thanks,
> 
> -Jason
> 
> Sample epoll_ctl text:

Thanks for the proposed text. I have some questions about points
that are not quite clear to me.

> EPOLLEXCLUSIVE
>         Sets an exclusive wakeup mode for the epfd file descriptor that is
> 	being attached to the target file descriptor, fd. Thus, when an
> 	event occurs and multiple epfd file descriptors are attached to the
> 	same target file using EPOLLEXCLUSIVE, one or more epfds will receive
> 	an event with epoll_wait(2). The default in this scenario (when
> 	EPOLLEXCLUSIVE is not set) is for all epfds to receive an event.
> 	EPOLLEXCLUSIVE may only be specified with the op EPOLL_CTL_ADD.

So, assume an FD is present in the interest lists of multiple (say 6)
epoll FDs, and some (say 3) of those attachments were done using
EPOLLEXCLUSIVE. Which of the following statements are correct?

(a) It's guaranteed that *none* of the epoll FDs that did NOT specify
    EPOLLEXCLUSIVE will receive an event.

(b) It's guaranteed that *all* of the epoll FDs that did NOT specify
    EPOLLEXCLUSIVE will receive an event.

(c) From 1 to 3 of the epoll FDs that did specify EPOLLEXCLUSIVE
    will receive an event.

(d) Exactly one epoll FD that did specify EPOLLEXCLUSIVE will get
    an event, and it is indeterminate which one.

I suppose one point I'm trying to uncover in the above is: what is
the scope of EPOLLEXCLUSIVE? Is it just applicable for one process's
FD, or is it setting an attribute in the epoll "interest list" record
for that FD that affects notification behavior across all processes?
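
For concreteness, the 6-epoll-FD scenario above could be set up roughly
as follows. This is only a sketch: attach_scenario() and count_ready()
are hypothetical helper names, and the fallback definition of
EPOLLEXCLUSIVE assumes the value used in the kernel UAPI header. Note
that count_ready() polls with a zero timeout, so no thread is blocked
in epoll_wait(2); it shows only which epoll FDs see the target as ready
afterwards. Answering (a) to (d) properly would need a waiter blocked
on each epoll FD.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Fallback for older headers (assumed value from the kernel UAPI header). */
#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)
#endif

/*
 * Hypothetical helper: create n epoll instances and add target_fd to
 * each one; the first n_excl attachments use EPOLLEXCLUSIVE.
 * Returns 0 on success, -1 on failure with errno set.
 */
int attach_scenario(int target_fd, int epfds[], int n, int n_excl)
{
    assert(n_excl >= 0 && n_excl <= n);

    for (int i = 0; i < n; i++)
        epfds[i] = -1;

    for (int i = 0; i < n; i++) {
        struct epoll_event ev = { .events = EPOLLIN };

        if (i < n_excl)
            ev.events |= EPOLLEXCLUSIVE;
        ev.data.fd = target_fd;

        epfds[i] = epoll_create1(0);
        if (epfds[i] == -1 ||
            epoll_ctl(epfds[i], EPOLL_CTL_ADD, target_fd, &ev) == -1) {
            int saved = errno;

            /* Undo everything created so far. */
            for (int j = 0; j <= i; j++) {
                if (epfds[j] != -1) {
                    close(epfds[j]);
                    epfds[j] = -1;
                }
            }
            errno = saved;
            return -1;
        }
    }
    return 0;
}

/*
 * Hypothetical helper: count how many epoll instances report a pending
 * event, without blocking (timeout 0).
 */
int count_ready(const int epfds[], int n)
{
    int ready = 0;

    for (int i = 0; i < n; i++) {
        struct epoll_event ev;

        if (epoll_wait(epfds[i], &ev, 1, 0) == 1)
            ready++;
    }
    return ready;
}
```

A caller would create a pipe, pass its read end as target_fd with n = 6
and n_excl = 3, write one byte to the write end, and then compare
count_ready() against statements (a) to (d).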

And then:

(1) What are the semantics of EPOLLEXCLUSIVE if the added FD becomes
    disabled via EPOLLONESHOT (or explicitly via EPOLL_CTL_MOD with
    the 'events' field set to 0)?

(2) The source code contains a comment "we do not currently supported 
    nested exclusive wakeups". Could you elaborate on this point? It
    sounds like something that should be documented.
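
Questions (1) and (2) can at least be probed empirically. The sketch
below records what epoll_ctl(2) returns on the running kernel for three
cases: modifying an EPOLLEXCLUSIVE entry to an empty event mask,
combining EPOLLEXCLUSIVE with EPOLLONESHOT at add time, and adding an
epoll FD as an exclusive target of another epoll FD. The names
struct excl_probe, try_ctl() and probe_exclusive() are hypothetical,
and reading "nested" as "the target is itself an epoll FD" is an
assumption. The probe reports results only; it does not say what the
documented semantics should be.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Fallback for older headers (assumed value from the kernel UAPI header). */
#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)
#endif

/* Each field holds 0 on success or the errno value of the failed call. */
struct excl_probe {
    int add_excl;     /* ADD with EPOLLIN | EPOLLEXCLUSIVE */
    int mod_zero;     /* MOD of that entry to events = 0; -1 if skipped */
    int add_oneshot;  /* ADD with EPOLLEXCLUSIVE | EPOLLONESHOT */
    int add_nested;   /* ADD of an epoll FD with EPOLLEXCLUSIVE */
};

/* Hypothetical wrapper: run one epoll_ctl() call, return 0 or errno. */
static int try_ctl(int epfd, int op, int fd, uint32_t events)
{
    struct epoll_event ev = { .events = events };

    ev.data.fd = fd;
    return epoll_ctl(epfd, op, fd, &ev) == 0 ? 0 : errno;
}

/* Returns 0 if the probe ran, -1 if the setup itself failed. */
int probe_exclusive(struct excl_probe *out)
{
    int p[2] = { -1, -1 };
    int a = -1, b = -1, c = -1;
    int rc = -1;

    assert(out != NULL);

    if (pipe(p) == -1)
        goto done;
    a = epoll_create1(0);
    b = epoll_create1(0);
    c = epoll_create1(0);
    if (a == -1 || b == -1 || c == -1)
        goto done;

    out->add_excl = try_ctl(a, EPOLL_CTL_ADD, p[0],
                            EPOLLIN | EPOLLEXCLUSIVE);

    /* Question (1): explicit disable via EPOLL_CTL_MOD with events = 0. */
    out->mod_zero = (out->add_excl == 0)
        ? try_ctl(a, EPOLL_CTL_MOD, p[0], 0)
        : -1;

    /* Question (1): EPOLLONESHOT together with EPOLLEXCLUSIVE. */
    out->add_oneshot = try_ctl(b, EPOLL_CTL_ADD, p[0],
                               EPOLLIN | EPOLLEXCLUSIVE | EPOLLONESHOT);

    /* Question (2): the target FD is itself an epoll instance. */
    out->add_nested = try_ctl(c, EPOLL_CTL_ADD, a,
                              EPOLLIN | EPOLLEXCLUSIVE);
    rc = 0;

done:
    if (a != -1) close(a);
    if (b != -1) close(b);
    if (c != -1) close(c);
    if (p[0] != -1) close(p[0]);
    if (p[1] != -1) close(p[1]);
    return rc;
}
```

Printing the four fields with strerror(3) after a successful call gives
a quick picture of which combinations the kernel under test rejects.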

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
