All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: netdev <netdev@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [GIT PULL] Add support for epoll min wait time
Date: Sat, 10 Dec 2022 19:31:06 -0700	[thread overview]
Message-ID: <581a6498-efcd-b57d-02b6-4237559c72e6@kernel.dk> (raw)
In-Reply-To: <4a649974-39db-83e9-8070-d1b7f3b2a03f@kernel.dk>

On 12/10/22 7:20?PM, Jens Axboe wrote:
> On 12/10/22 6:58?PM, Jens Axboe wrote:
>> On 12/10/22 11:51?AM, Linus Torvalds wrote:
>>> On Sat, Dec 10, 2022 at 7:36 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> This adds an epoll_ctl method for setting the minimum wait time for
>>>> retrieving events.
>>>
>>> So this is something very close to what the TTY layer has had forever,
>>> and is useful (well... *was* useful) for pretty much the same reason.
>>>
>>> However, let's learn from successful past interfaces: the tty layer
>>> doesn't have just VTIME, it has VMIN too.
>>>
>>> And I think they very much go hand in hand: you want for at least VMIN
>>> events or for at most VTIME after the last event.
>>
>> It has been suggested before too. A more modern example is how IRQ
>> coalescing works on eg nvme or nics. Those generally are of the nature
>> of "wait for X time, or until Y events are available". We can certainly
>> do something like that here too, it's just adding a minevents and
>> passing them in together.
>>
>> I'll add that, really should be trivial, and resend later in the merge
>> window once we're happy with that.
> 
> Took a quick look, and it's not that trivial. The problem is you have
> to wake the task to reap events anyway, this cannot be checked at
> wakeup time. And now you lose the nice benefit of reducing the
> context switch rate, which was a good chunk of the win here...

One approximation we could make is that once we've done that first reap
of events, let's say we get N events (where N could be zero), the number
of wakeups post that is a rough approximation of the number of events
that have arrived. We already use this to break out of min_wait if we
think we'll exceed maxevents. We could use that same metric to estimate
if we've hit minevents as well. It would not be guaranteed accurate, but
probably good enough. Even if we didn't quite hit minevents there, we'd
return rather than do another sleep and wakeup cycle.

-- 
Jens Axboe


      reply	other threads:[~2022-12-11  2:31 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-10 15:36 [GIT PULL] Add support for epoll min wait time Jens Axboe
2022-12-10 15:58 ` Willy Tarreau
2022-12-10 16:05   ` Jens Axboe
2022-12-10 16:17     ` Willy Tarreau
2022-12-10 16:23       ` Jens Axboe
2022-12-10 18:51 ` Linus Torvalds
2022-12-10 19:26   ` Linus Torvalds
2022-12-11  2:03     ` Jens Axboe
2022-12-11  1:58   ` Jens Axboe
2022-12-11  2:20     ` Jens Axboe
2022-12-11  2:31       ` Jens Axboe [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=581a6498-efcd-b57d-02b6-4237559c72e6@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.