From: "Michael Kerrisk" <mtk-manpages@gmx.net>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Lee.Schermerhorn@hp.com, torvalds@linux-foundation.org,
	vda.linux@googlemail.com, rdunlap@xenotime.net, corbet@lwn.net,
	hch@lst.de, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, geoff@gclare.org.uk,
	drepper@redhat.com, davidel@xmailserver.org,
	"David Härdeman" <david@hardeman.nu>
Subject: Re: RFC: A revised timerfd API
Date: Tue, 18 Sep 2007 11:30:07 +0200
Message-ID: <20070918093007.223350@gmx.net>
In-Reply-To: <1190106607.2995.97.camel@chaos>

Hi Thomas,

> On Tue, 2007-09-18 at 09:30 +0200, Michael Kerrisk wrote:
> > ====> a) Add an argument (a multiplexing timerfd() system call)
> > Disadvantage:
> > Jon Corbet pointed out
> > (http://thread.gmane.org/gmane.linux.kernel/559193/focus=570709 )
> > that this interface was starting to look like a multiplexing syscall,
> > because there is no case where all of the arguments are used (see
> > the use-case descriptions in the earlier mail).
> > 
> > I'm inclined to agree with Jon; therefore one of the remaining
> > solutions may be preferable.
> 
> I agree. It's ugly.

Fair enough.  I mainly tried to do things that way to minimize
the change from Davide's original interface.

> > ====> b) Create a timerfd interface analogous to POSIX timers
> > 
> > Create an interface analogous to POSIX timers:
> > 
> > fd = timerfd_create(clockid, flags);
> > 
> > timerfd_settime(fd, flags, newtimervalue, &time_to_next_expire);
> > 
> > timerfd_gettime(fd, &time_to_next_expire);
> > 
> > Under this proposal, the manner of making a timer that does not
> > need "get-while-set" functionality remains fairly simple:
> > 
> >     fd = timerfd_create(clockid, flags);
> > 
> >     timerfd_settime(fd, flags, newtimervalue, NULL);
> > 
> > Advantage: this would be a clean, fully functional API, and well
> > understood by virtue of its analogy with the POSIX timers API.
> > 
> > Disadvantage: 3 new system calls, rather than 1.
> > 
> > This solution would be sufficient, IMO, but one of the
> > next solutions might be better.
> 
> I'm not scared by the 3 system calls. I rather fear that we end up
> reimplementing half of the existing posix timer code.

Yes.  Perhaps some refactoring might be required, if we went 
down this route.

> > ====> c) Integrate timerfd with POSIX timers
> > 
> > Make a very simple timerfd call that is integrated with the
> > POSIX timers API.  The POSIX timers API is detailed here:
> > http://linux.die.net/man/3/timer_create
> > http://linux.die.net/man/3/timer_settime
> > 
> > Under the POSIX timers API, a new timer is created using:
> > 
> > int timer_create(clockid_t clockid, struct sigevent *evp,
> >         timer_t *timerid);
> > 
> > We could then have a timerfd() call that returns a file descriptor
> > for the newly created 'timerid':
> > 
> > fd = timerfd(timer_t timerid);
> > 
> > We could then use the POSIX timers API to operate on the timer
> > (start it / modify it / fetch timer value):
> > 
> > int timer_settime(timer_t timerid, int flags,
> >         const struct itimerspec *value,
> >         struct itimerspec *ovalue);
> > int timer_gettime(timer_t timerid, struct itimerspec *value);
> > 
> > And then read() from 'fd' as before.
> > 
> > In the simple case (no "get" or "get-while-setting" functionality),
> > the use of API (c) would be:
> > 
> >     timer_create(clockid, &evp, &timerid);
> > 
> >     fd = timerfd(timerid);
> > 
> >     timer_settime(timerid, flags, &newvalue, NULL);
> > 
> > Advantages:
> >   1. Integration with an existing API.
> >   2. Adds just a single system call.
> >   3. It _might_ be possible to construct an interface that allows
> >      userland programs to do things like creating a timer fd for
> >      a POSIX timer that was created via some library that doesn't
> >      actually know about timer fds.  (I can already see problems with
> >      this, since that library will already expect to be delivering
> >      timer notifications somehow (via threads or signals), and it may
> >      be difficult to make the two notification mechanisms play
> >      together in a sane way.  But maybe someone else has a take on
> >      this that can rescue this idea.)
> > 
> > Disadvantages:
> >   1. Starts to get a little more clunky to use in the simple
> >      case shown above.
> > 
> > This strikes me as a more attractive solution than (b), if we can do
> > it properly -- that means: if we can achieve advantage 3
> > in some reasonable way.  If we can't achieve that, then probably
> > the next solution is better.
> 
> The main problem here is that there is no way to tell the posix timer
> code that the delivery of the timer is through the file descriptor and
> not via the usual posix timer mechanisms. We need something like the
> SIGEV_TIMERFD flag to make the posix timer code aware of that.

Well, I left it kind of open whether the expiration
notification might be delivered via both the traditional
mechanism and via the timerfd.  But I realize that it all
may get overly complex.

> > ====> d) extend the POSIX timers API
> > 
> > Under the POSIX timers API, the evp argument of timer_create() is a
> > structure that allows the caller to specify how timer expirations
> > should be notified.  There are the following possibilities
> > (differentiated by the value assigned to evp.sigev_notify):
> > 
> >   i) notify via a signal: the caller specifies which signal the
> >      kernel should deliver when the timer expires.
> >      (SIGEV_SIGNAL)
> >  ii) notify by delivering a signal to the thread whose thread ID
> >      is specified in evp.  (This is Linux specific.)
> >      (SIGEV_THREAD_ID)
> > iii) notify via a thread: when the timer expires, the system starts
> >      a new thread which receives an argument that was specified in
> >      the evp structure. (SIGEV_THREAD)
> >  iv) no notification: the caller can monitor the timer state using
> >      timer_gettime(). (SIGEV_NONE)
> > 
> > In all of the above cases, the return value from timer_create()
> > is 0 for success, or -1 for failure.
> > 
> > We could extend the interface as follows:
> > 
> > 1) Add a new flag for evp.sigev_notify: SIGEV_TIMERFD.
> >    This flag indicates that the caller wants timer
> >    notification via a file descriptor.
> > 2) When evp.sigev_notify == SIGEV_TIMERFD, have a successful
> >    timer_create() call return a file descriptor (i.e., an
> >    integer >= 0).
> > 
> > Advantages:
> >   1. Integration with an existing API.
> >   2. No new system calls are required.
> >   3. This idea might even have a chance of getting standardized in
> >      POSIX one day, since (IMO) it integrates fairly cleanly with
> >      an existing API.
> > 
> > Disadvantages:
> >   1. The fact that the return value of a successful timer_create()
> >      is different for the SIGEV_TIMERFD case is a bit of a wart.
> 
> What happens on close(fd) ? Is the posix timer automatically destroyed ?

I would say not (see also my reply to David Härdeman.)

> Is the file descriptor invalidated when the timer is destroyed via
> timer_delete(timer_id) ? The automatic file descriptor creation is a bit
> ugly.

Yes, it is a little ugly.

> I'd rather see a combination of c) and d) as a solution:
> 
> Notify the posix timer code that the timer delivery is done via the file
> descriptor mechanism (SIGEV_TIMERFD). 
> 
> Use a new syscall to open a file descriptor on that timer. 
> 
> When the file descriptor is closed the timer is not destroyed, but
> delivery disabled (analogous to the SIGEV_NONE case), so you can reopen
> and reactivate it later on.
> 
> This way we have it nicely integrated into the posix timer code and keep
> the existing semantics of posix timers intact.
> 
> We need to think about the open file descriptor in the timer_delete()
> case as well, but this should not be too hard to sort out.

This seems like a workable idea also.  But note David Härdeman's
critique of options c & d: the existence of a coupled timerfd 
and a timerid means that the application must maintain a mapping
between the two, so that after an epoll call (for example) that 
says the timerfd is ready, the timer can be manipulated using
the corresponding timerid.  This isn't IMO a fatal flaw, but
it does make the API a little more clumsy.

Cheers,

Michael
-- 
Michael Kerrisk
maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 

Want to help with man page maintenance?  
Grab the latest tarball at
http://www.kernel.org/pub/linux/docs/manpages , 
read the HOWTOHELP file and grep the source 
files for 'FIXME'.

