From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marcin 'Qrczak' Kowalczyk <qrczak@knm.org.pl>
Subject: Re: Bug: epoll_wait timeout is shorter than requested
Date: Mon, 17 Jan 2005 14:41:42 +0100
Message-ID: <87r7kk41gp.fsf@qrnik.zagroda>
References: <87651wl32d.fsf@qrnik.zagroda>
	<20050117114821.GB20152@mail.shareable.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from paf87.warszawa.sdi.tpnet.pl ([217.96.225.87]:65294 "EHLO
	qrnik.knm.org.pl") by vger.kernel.org with ESMTP id S262798AbVAQNlo
	(ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
	Mon, 17 Jan 2005 08:41:44 -0500
Received: from qrczak by qrnik.knm.org.pl with local (Exim 3.36 #1)
	id 1CqX8Y-0006Bl-00
	for linux-fsdevel@vger.kernel.org; Mon, 17 Jan 2005 14:41:42 +0100
To: linux-fsdevel@vger.kernel.org
In-Reply-To: <20050117114821.GB20152@mail.shareable.org> (Jamie Lokier's
 message of "Mon, 17 Jan 2005 11:48:21 +0000")
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

Jamie Lokier <jamie@shareable.org> writes:

> The epoll argument rounds like select(), not like poll().
> It was done deliberately.

Is it documented?
ftp://ftp.win.tue.nl/pub/home/aeb/linux-local/manpages/man-pages-1.70.tar.gz
doesn't seem to say that the timeout is interpreted differently for
poll and epoll.

Will adding 1ms be enough? In other words, is epoll supposed to wait
for some period of time which, when rounded *up* to milliseconds, will
be >= the requested timeout? This contrasts with poll, which waits at
least the requested timeout; that behaviour is specified by SUSv3.

I can't observe the semantics of the timeout in select because it's in
microseconds, and a gettimeofday call takes about 2us here. SUSv3 says
that it should wait at least the requested time (except that if the
timeout is longer than the maximum supported timeout, which must be at
least 31 days, then it is allowed to wait for a shorter time). So if
select works like epoll (can wait up to 1us less than the requested
timeout), it's not conforming to SUSv3.

> This isn't just a problem for programs doing low jitter work.  Many
> programs call select/poll/epoll, and then call gettimeofday() after to
> decide whether the next "timer" application event is ready to be
> serviced, or whether to call select/poll/epoll again.

This is exactly my case. I noticed that it often finishes a little
before the requested time, and then my program epolls again for 1ms.
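
To make the pattern concrete, this is roughly how such a loop could
compute the timeout it passes to epoll_wait. It is only a sketch: the
helper timeout_ms_until and its slack_ms parameter are hypothetical
names, and whether a slack of 1 is actually sufficient is the question
I asked above, not something this code settles.

```c
#include <limits.h>
#include <time.h>

/* Hypothetical helper: milliseconds from now until deadline, rounded
 * up to a whole millisecond, plus slack_ms of padding. Returns 0 if
 * the deadline has already passed. */
int timeout_ms_until(const struct timespec *now,
                     const struct timespec *deadline, int slack_ms)
{
    long long ns = (long long)(deadline->tv_sec - now->tv_sec) * 1000000000LL
                 + (deadline->tv_nsec - now->tv_nsec);
    if (ns <= 0)
        return 0;                        /* timer is already due */

    long long ms = (ns + 999999) / 1000000 + slack_ms;
    return ms > INT_MAX ? INT_MAX : (int)ms;
}
```

With slack_ms set to 0 the result is the plain rounded-up distance to
the next timer; a caller that wants to avoid the extra 1ms round trip
can pass 1 instead.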

> With the poll() behaviour, if a previous poll() finished _just_
> before the timer event is ready, the application will call poll()
> again with timeout 1, and then it will wait 10-20ms (on a 100 Hz
> kernel) instead of the far more desirable 0-10ms.

Well, if the kernel measured the delay more accurately than to a clock
tick, it could notice that a requested 1ms would be satisfied by, say,
8ms which remained from the current tick.

* * *

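There is another point where the man page is misleading: it says that
closing an fd will automatically unregister it from epoll sets. In
reality it is unregistered only when the underlying file structure is
released, that is, once every descriptor referring to it (including
duplicates made by dup or inherited across fork) has been closed.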
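
A small test along these lines shows the difference. It is a sketch
only, and the function name events_after_close is made up for this
mail. It registers the read end of a pipe, optionally keeps a dup of
it, closes the registered descriptor, and then asks epoll_wait whether
the pipe is still being watched.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical test helper: returns the number of events epoll_wait()
 * reports after the registered descriptor has been closed, or -1 on a
 * setup error. If keep_dup is nonzero, a dup() of the descriptor keeps
 * the underlying file alive across the close(). */
int events_after_close(int keep_dup)
{
    int p[2], dupfd = -1, n = -1;
    struct epoll_event ev = { .events = EPOLLIN }, out;
    int epfd = epoll_create(1);
    if (epfd < 0)
        return -1;
    if (pipe(p) < 0) {
        close(epfd);
        return -1;
    }

    ev.data.fd = p[0];
    /* Write before closing the read end, so the write cannot hit EPIPE. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        if (keep_dup)
            dupfd = dup(p[0]);           /* second reference to the same file */
        if (!keep_dup || dupfd >= 0) {
            close(p[0]);                 /* the registered fd number is gone */
            p[0] = -1;
            n = epoll_wait(epfd, &out, 1, 0);
        }
    }

    if (p[0] >= 0)
        close(p[0]);
    if (dupfd >= 0)
        close(dupfd);
    close(p[1]);
    close(epfd);
    return n;
}
```

If the behaviour is as I describe, events_after_close(1) reports one
event even though the registered descriptor was closed, while
events_after_close(0) reports none.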

* * *

While I understand that the current semantics of sharing epoll fd
across a fork is a consequence of its design, it is inconvenient in
my case. I have to epoll_create again and reregister all descriptors
after a fork, in order for the epoll sets in the two processes to be
independent.
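
In code, the workaround looks roughly like the sketch below. The
function rebuild_epoll is a hypothetical helper, and registering every
descriptor for EPOLLIN only is a simplification; a real program would
restore whatever event mask and user data each descriptor had.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create a fresh epoll set and register each of
 * the nfds descriptors in fds for EPOLLIN. Returns the new epoll fd,
 * or -1 on error. */
int rebuild_epoll(const int *fds, int nfds)
{
    int epfd = epoll_create(nfds > 0 ? nfds : 1);   /* size is only a hint */
    if (epfd < 0)
        return -1;

    for (int i = 0; i < nfds; i++) {
        struct epoll_event ev = { .events = EPOLLIN };
        ev.data.fd = fds[i];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0) {
            close(epfd);
            return -1;
        }
    }
    return epfd;
}
```

The child would call this right after fork() and then close its copy
of the inherited epoll fd, so that later changes in one process do not
show up in the other.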

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/