From: "H. Peter Anvin" <hpa@zytor.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <dada1@cosmosbay.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	linux-kernel@vger.kernel.org, cotte@de.ibm.com, hugh@veritas.com,
	neilb@suse.de, zanussi@us.ibm.com, hch@infradead.org
Subject: Re: [PATCH] sendfile removal
Date: Fri, 01 Jun 2007 09:53:33 -0700
Message-ID: <46604F0D.9070806@zytor.com>
In-Reply-To: <alpine.LFD.0.98.0706010904110.3957@woody.linux-foundation.org>

Linus Torvalds wrote:
> And the thing is, neither poll nor select work on regular files. And no, 
> that is _not_ just an implementation issue. It's very fundamental: neither 
> poll nor select get the file offset to wait for!
> 
> And that file offset is _critical_ for a regular file, in a way it 
> obviously is _not_ for a socket, pipe, or other special file. Because 
> without knowing the file offset, you cannot know which page you should be 
> waiting for!
> 
> And no, the file offset is not "f_pos". sendfile(), along with 
> pread/pwrite, uses a totally separate file offset, so if select/poll were 
> to base their decision on f_pos, they'd be _wrong_.

This is obviously correct, although at the time those interfaces were
designed, I don't believe either pread/pwrite or sendfile() existed,
and even then select/poll couldn't wait on regular files.  The lack of
a suitable way to wait for a file at a given offset is probably a
result of that history.

Waiting at f_pos is still a possible interface, of course; it would mean
that pread/pwrite/sendfile users would have to seek before waiting.
However, POSIX requires select/poll to always report regular files as
ready, so actually waiting on them would need some sort of
Linux-specific flag anyway.
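
To make both points concrete, here is a minimal sketch in Python, whose
os and select modules are thin wrappers around the same system calls.
The helper names poll_regular_file and offsets_after_pread are made up
for this example.  It assumes a POSIX system where select.poll is
available.

```python
import os
import select
import tempfile


def poll_regular_file(fd, timeout_ms=0):
    """Return the event mask poll() reports for fd, or 0 on timeout."""
    poller = select.poll()
    poller.register(fd, select.POLLIN | select.POLLOUT)
    events = poller.poll(timeout_ms)
    return events[0][1] if events else 0


def offsets_after_pread(fd, nbytes, offset):
    """pread() at an explicit offset and report f_pos before and after."""
    before = os.lseek(fd, 0, os.SEEK_CUR)
    data = os.pread(fd, nbytes, offset)
    after = os.lseek(fd, 0, os.SEEK_CUR)
    return data, before, after


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.unlink(path)                      # keep only the open descriptor
    os.write(fd, b"x" * 8192)
    os.lseek(fd, 0, os.SEEK_SET)

    mask = poll_regular_file(fd)
    print("poll says readable:", bool(mask & select.POLLIN))

    data, before, after = offsets_after_pread(fd, 16, 4096)
    print("read", len(data), "bytes; f_pos before/after:", before, after)
    os.close(fd)
```

poll() reports the file as readable without knowing anything about
which offset the caller cares about, and pread() leaves f_pos where it
was, so a wait keyed on f_pos would tell a pread() user nothing.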

It seems that being able to do nonblocking I/O on files would be a
useful thing.  This really *does* require proper nonblocking I/O and not
just the ability to wait, since the kernel can evict the page you are
about to read from the cache at any time between the wait and the read.

> So there's a few things to take away from this:
> 
>  - regular file access MUST NOT return EAGAIN just because a page isn't 
>    in the cache. Doing so is simply a bug. No ifs, buts or maybe's about 
>    it!
> 
>    Busy-looping is NOT ACCEPTABLE!
> 
>  - you *could* make some alternative conventions:
> 
> 	(a) you could make O_NONBLOCK mean that you'll at least 
> 	    guarantee that you *start* the IO, and while you never return 
> 	    EAGAIN, you might validly return a _partial_ result!
> 
> 	(b) variation on (a): it's ok to return EAGAIN if _you_ were the 
> 	    one who started the IO during this particular time around the 
> 	    loop. But if you find a page that isn't up-to-date yet, and 
> 	    you didn't start the IO, you *must* wait for it, so that you 
> 	    end up returning EAGAIN at most once! Exactly because 
> 	    busy-looping is simply not acceptable behaviour!

(b) seems really ugly.  (a) is at least well-defined.  Either seems
wrong, though.
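
For what it's worth, a caller written against convention (a) would look
roughly like the sketch below (Python again, with a made-up helper name
read_range).  It assumes every call before end of file returns at least
one byte, which (a) as stated does not strictly promise.

```python
import os


def read_range(fd, offset, length):
    """Read up to length bytes starting at offset, tolerating short reads.

    Under convention (a) a short read means "the I/O has been started,
    here is what was already available"; it is never reported as EAGAIN,
    so the caller simply asks again for the remainder.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.pread(fd, remaining, offset)
        if not chunk:                    # zero bytes: end of file
            break
        chunks.append(chunk)
        offset += len(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The loop never spins on EAGAIN; in the worst case a call blocks, or
returns less than was asked for.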

	-hpa
