public inbox for linux-kernel@vger.kernel.org
From: Jeff Garzik <jgarzik@mandrakesoft.com>
To: Andrew Morton <akpm@zip.com.au>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: fadvise syscall?
Date: Sun, 17 Mar 2002 04:10:21 -0500
Message-ID: <3C945D7D.8040703@mandrakesoft.com>
In-Reply-To: <3C945635.4050101@mandrakesoft.com> <3C945A5A.9673053F@zip.com.au>

Andrew Morton wrote:

>Jeff Garzik wrote:
>
>>Has anyone ever done an madvise(2)-type syscall for file descriptors?
>>(or does the capability exist and I'm missing it?)
>>
>
>Well, question is: is madvise() any use? :)
>
:)

>>was thinking, in playing around with stuff like cp(1) I've found that
>>standard read(2) and write(2) of a 4-8K buffer is the fastest solution
>>overall, in addition to providing the useful side effect of better error
>>reporting, such as ENOSPC report.  Better error reporting than the
>>alternative I see anyway, mmap(2).
>>
>
>4k to 8k is best on x86 at least.  And if you're actually going to *use*
>each byte in the file, the zero-copy characteristics of mmap aren't
>worth much at all.
>

That's exactly what I found through experimentation.
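
A minimal sketch of the read(2)/write(2) copy loop discussed above, using an 8K buffer (the upper end of the 4-8K range). The helper name copy_fd is hypothetical and not taken from cp(1); the point is only that a failing write(2) hands back a precise errno such as ENOSPC at the moment it happens.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd.
 * Returns 0 on success, or -1 with errno left as set by the
 * read(2) or write(2) call that failed (for example ENOSPC). */
static int copy_fd(int in_fd, int out_fd)
{
    char buf[8192];   /* 8K buffer, an assumption within the 4-8K range */

    for (;;) {
        ssize_t nread = read(in_fd, buf, sizeof(buf));
        if (nread == 0)
            return 0;             /* end of file */
        if (nread < 0) {
            if (errno == EINTR)
                continue;         /* interrupted, retry the read */
            return -1;
        }

        /* write(2) may be short, so loop until the chunk is out. */
        char *p = buf;
        while (nread > 0) {
            ssize_t nwritten = write(out_fd, p, (size_t)nread);
            if (nwritten < 0) {
                if (errno == EINTR)
                    continue;     /* interrupted, retry the write */
                return -1;        /* errno tells the caller why */
            }
            p += nwritten;
            nread -= nwritten;
        }
    }
}
```

A caller would check the return value and report strerror(errno) on failure, which is the error reporting that a store through an mmap(2) mapping does not give as directly.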

>>So... we have madvise, why not fadvise?  I would love the capability for
>>applications to provide hints to the OS like madvise, but for file
>>descriptors...
>>
>
>The one hint which I can think of which would be beneficial would
>be an equivalent to MADV_SEQUENTIAL.  Something which says "this
>is a big streaming read/write - don't go and evict other stuff because
>of it".  O_STREAMING perhaps.  Or working dropbehind heuristics,
>although I suspect that explicit controls will always do better.
>
>For MADV_RANDOM, readahead window scaling should get that right.
>
>What else were you thinking of?
>

Hints for:
* sequential read
* sequential write
* sequential write, where the application considers the data it's 
writing unlikely to be read again any time soon (hopefully 
implying to the page cache that these pages have low value as cacheable 
objects)
* some sort of streaming hint, implying that the application cares a 
lot about maintaining some minimum I/O rate.  Note I said hint, not 
requirement.  -not- guaranteed-rate-IO.

I might even go so far as to advocate identifying common usage patterns, 
and creating hint constants for them, even if we don't support them in 
the kernel immediately (if ever).  Makes the interface much more 
future-proof, at the expense of a few integers in a 32-bit numberspace, 
and a few more bytes in the C compiler's symbol table.
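
A rough sketch of what reserving hint constants ahead of kernel support could look like. Every name here (the FADV_* constants and fadvise_sketch) is hypothetical and invented for illustration; none of it is an existing kernel interface. The idea shown is that a defined but unimplemented hint is accepted and ignored, while an unknown value is rejected.

```c
#include <errno.h>

/* Hypothetical hint constants, one per usage pattern listed above. */
enum fadv_hint {
    FADV_SEQ_READ = 1,       /* sequential read */
    FADV_SEQ_WRITE,          /* sequential write */
    FADV_SEQ_WRITE_NOREUSE,  /* sequential write, data unlikely to be
                                read again soon */
    FADV_STREAMING,          /* wants some minimum I/O rate; a hint,
                                not a guarantee */
    FADV_HINT_MAX            /* one past the last defined hint */
};

/* Hypothetical entry point.  It only validates its arguments: any
 * defined hint succeeds without doing anything, which is what lets
 * constants be handed out before the kernel acts on them. */
static int fadvise_sketch(int fd, int hint)
{
    if (fd < 0) {
        errno = EBADF;
        return -1;
    }
    if (hint < FADV_SEQ_READ || hint >= FADV_HINT_MAX) {
        errno = EINVAL;
        return -1;
    }
    return 0;   /* hint accepted; free to be ignored */
}
```

Since these are hints, an application would normally carry on regardless of the return value.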

    Jeff



