All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: netdev <netdev@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: [GIT PULL] Add support for epoll min wait time
Date: Sat, 10 Dec 2022 08:36:11 -0700	[thread overview]
Message-ID: <b0901cba-3cb8-a309-701e-7b8cb13f0e8a@kernel.dk> (raw)

Hi Linus,

I've had this done for months and posted a few times, but little
attention has been received. Sending it out for inclusion now, as having
it caught up in upstream limbo is preventing further use cases of it at
Meta. We upstream every feature that we develop, and we don't put any
features into our kernel that aren't already upstream, or on the way
upstream. This is obviously especially important when an API is
involved.

This adds an epoll_ctl method for setting the minimum wait time for
retrieving events. In production, workloads don't just run mostly idle
or mostly full tilt. A common pattern is medium load. epoll_wait and
friends receive a cap of max events and max time we want to wait for
them, but there's no notion of min events or min time. This leads to
services only getting a single event, even if they are totally fine with
waiting eg 200 usec for more events. More events leads to greater
efficiency in handling them.

The main patch has some numbers, but tldr is that we see a nice
reduction in context switches / second, and a reduction in busy time on
such systems.

It has been suggested that a syscall should be available for this as
well, and there are two main reasons for why this wasn't pursued (but
was still investigated):

- This most likely should've been done as epoll_pwait3(), as we already
  have epoll_wait, epoll_pwait, and epoll_pwait2. The latter two are
  already at the max number of syscall arguments, so a new method would
  have to be done where a struct would define the API. With some
  arguments being optional, this could get inefficient or ugly (or both).

- Main reason is that Meta doesn't need it. By using epoll_ctl, the
  check-for-support-of-feature can be relegated to setup time rather
  than in the fast path, and the workloads we are looking at would not
  need different min wait settings within a single epoll context.

Please pull!


The following changes since commit ef4d3ea40565a781c25847e9cb96c1bd9f462bc6:

  afs: Fix server->active leak in afs_put_server (2022-11-30 10:02:37 -0800)

are available in the Git repository at:

  git://git.kernel.dk/linux.git tags/epoll-min_ts-2022-12-08

for you to fetch changes up to 73b9320234c0ad1b5e6f576abb796221eb088c64:

  eventpoll: ensure we pass back -EBADF for a bad file descriptor (2022-12-08 07:05:42 -0700)

----------------------------------------------------------------
epoll-min_ts-2022-12-08

----------------------------------------------------------------
Jens Axboe (8):
      eventpoll: cleanup branches around sleeping for events
      eventpoll: don't pass in 'timed_out' to ep_busy_loop()
      eventpoll: split out wait handling
      eventpoll: move expires to epoll_wq
      eventpoll: move file checking earlier for epoll_ctl()
      eventpoll: add support for min-wait
      eventpoll: add method for configuring minimum wait on epoll context
      eventpoll: ensure we pass back -EBADF for a bad file descriptor

 fs/eventpoll.c                 | 192 +++++++++++++++++++++++++++++++++--------
 include/linux/eventpoll.h      |   2 +-
 include/uapi/linux/eventpoll.h |   1 +
 3 files changed, 158 insertions(+), 37 deletions(-)

-- 
Jens Axboe


             reply	other threads:[~2022-12-10 15:36 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-10 15:36 Jens Axboe [this message]
2022-12-10 15:58 ` [GIT PULL] Add support for epoll min wait time Willy Tarreau
2022-12-10 16:05   ` Jens Axboe
2022-12-10 16:17     ` Willy Tarreau
2022-12-10 16:23       ` Jens Axboe
2022-12-10 18:51 ` Linus Torvalds
2022-12-10 19:26   ` Linus Torvalds
2022-12-11  2:03     ` Jens Axboe
2022-12-11  1:58   ` Jens Axboe
2022-12-11  2:20     ` Jens Axboe
2022-12-11  2:31       ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b0901cba-3cb8-a309-701e-7b8cb13f0e8a@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.