public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Marr <marr@flex.com>
To: Hans Reiser <reiser@namesys.com>, libc-alpha@gnu.org
Cc: linux-kernel@vger.kernel.org, reiserfs-dev@namesys.com,
	drepper@redhat.com, Andrew Morton <akpm@osdl.org>,
	Mark Lord <lkml@rtr.ca>, Linda Walsh <lkml@tlinx.org>,
	Bill Davidsen <davidsen@tmr.com>, Gerold Jury <gjury@inode.at>,
	Robert Hancock <hancockr@shaw.ca>, Al Boldi <a1426z@gawab.com>,
	Ingo Oeser <ioe-lkml@rameria.de>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Arjan van de Ven <arjan@infradead.org>,
	marr@flex.com
Subject: Re: Readahead value 128K? (was Re: Drastic Slowdown of 'fseek()'
Date: Mon, 27 Mar 2006 14:12:57 -0500	[thread overview]
Message-ID: <200603271412.58364.marr@flex.com> (raw)
In-Reply-To: <442833FC.7080109@namesys.com>

On Monday 27 March 2006 1:50pm, Hans Reiser wrote:
> Thanks Marr.
>
> My concern here is with the users who have no idea what fseek is, and
> just see their apps getting slow.  libc is to my mind doing the clearly
> incorrect thing here.
>
> Is there a libc developers mailing list, maybe we should try them if
> Ulrich is no longer active in libc maintaining?

Good point. I've found a 'glibc' developers' mailing list, so I'm including 
them on this reply. Hopefully someone there will pick up on this thread and 
respond.

Bill Marr

> Marr wrote:
> >Greetings, Ulrich, Hans, et al,
> >
> >*** Please CC: me on replies -- I'm not subscribed.
> >
> >After some more testing and some input (off-list) from others, here is a
> >summary of this problem and its various work-arounds to date....
> >
> >On Monday 27 February 2006 4:53pm, Hans Reiser wrote:
> >>Andrew Morton wrote:
> >>>runs like a dog on 2.6's reiserfs.  libc is doing a (probably) 128k read
> >>>on every fseek.
> >>>
> >>>- There may be a libc stdio function which allows you to tune this
> >>> behaviour.
> >
> >It turns out that there is just such a function. Thanks to some sage
> >(off-list) advice from Gerold Jury, this is an effective way to switch the
> >file's stream to "unbuffered" mode:
> >
> >   setvbuf( inp_fh, 0, _IONBF, 0 );
> >
> >This results in incredible speedups on the ReiserFS+2.6.x setup, without
> > the need to even use the 'nolargeio=1' mount option. Basically, we're
> > going from 128KB read-ahead on every 'fseek()' call to no read-ahead.
> >
> >>>- libc should probably be a bit more defensive about this anyway -
> >>> plainly the filesystem is being silly.
> >>
> >>I really thank you for isolating the problem, but I don't see how you
> >>can do other than blame glibc for this.  The recommended IO size is only
> >>relevant to uncached data, and glibc is using it regardless of whether
> >>or not it is cached or uncached.   Do I misunderstand something myself
> >>here?
> >
> >To date, I've not seen anyone address this implicit question/issue that
> > Hans raised. To wit: Is the "recommended I/O size" only relevant to
> > _uncached_ data???
> >
> >If not, then anyone using ReiserFS on a 2.6.x kernel had best be well
> > aware that 128KB read-aheads are going to occur with every 'fseek()'
> > call, degrading performance drastically. This seems like a good reason
> > for the ReiserFS folks to re-evaluate the use of 128KB as the default
> > value for read-ahead.
> >
> >Alternatively, if "recommended I/O size" _is_ (intended to be) only
> > relevant to _uncached_ data, then the question becomes this: Is 'glibc'
> > erroneously using that recommended size regardless of whether the data is
> > cached or uncached?
> >
> >Ulrich, we'd really appreciate your input on this matter. Please advise.
> > Even a simple reply of "buzz off" would be useful at this point! ;^)
> >
> >------------------------------
> >
> >In summary, the problem still exists, but any of the following
> > work-arounds are effective, ordered here from best to worst:
> >
> >(A) Use a 'setvbuf()' call in the target application to disable (or
> > reduce) buffering on the input stream.
> >
> >Under certain conditions, this should be useful even when not using
> > ReiserFS and/or when not running a 2.6.x kernel. However, it's almost
> > essential (currently) with ReiserFS and 2.6.x kernels, for apps which do
> > a lot of file seeks using ANSI C file I/O (i.e. 'fseek()').
> >
> >OR
> >
> >(B) Use the `nolargeio=1' option when mounting a ReiserFS partition under
> >2.6.x kernels. This effectively changes the recommended I/O read-ahead
> > after each 'fseek()' call from 128KB to 4KB.
> >
> >Unlike option (A) above, this is useful for situations where you don't
> > have access to the source code of the target application(s).
> >
> >However, Andrew Morton mentioned this possible negative side-effect:
> >>  This will alter the behaviour of every reiserfs filesystem in the
> >>  machine.  Even the already mounted ones.
> >
> >OR
> >
> >(C) Don't use ReiserFS (v3) under 2.6.x kernels (for apps which do a lot
> > of file seeks using ANSI C file I/O).
> >
> >For example, the 'ext2'/'ext3' filesystems seem to still use the 4KB
> >read-ahead, resulting in _much_ better performance when performing
> > multiple seeks (outside the range of the 'read-ahead' setting).
> >
> >------------------------------
> >
> >Of course, the unmentioned option (which basically bypasses the whole
> > issue) is to convert the underlying application to use raw, unbuffered
> > Unix file I/O (i.e. 'lseek() + read()' [or even just 'pread()', as
> > suggested by Andrew Morton]) instead of ANSI C file I/O ('fseek() +
> > fread()'), but that is considered out-of-scope for purposes of this
> > discussion.
> >
> >-----------------------------
> >
> >Thanks to all who supplied input. Special thanks to Andrew Morton and
> > Gerold Jury who supplied what effectively turned out to be the
> > most-useful work-arounds.
> >
> >*** Please CC: me on replies -- I'm not subscribed.
> >
> >Bill Marr

      reply	other threads:[~2006-03-27 19:17 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-03-13 11:37 Readahead value 128K? (was Re: Drastic Slowdown of 'fseek()' Al Boldi
2006-03-13 20:01 ` Marr
2006-03-26 22:25   ` Marr
2006-03-27 18:50     ` Hans Reiser
2006-03-27 19:12       ` Marr [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200603271412.58364.marr@flex.com \
    --to=marr@flex.com \
    --cc=a1426z@gawab.com \
    --cc=akpm@osdl.org \
    --cc=arjan@infradead.org \
    --cc=davidsen@tmr.com \
    --cc=drepper@redhat.com \
    --cc=gjury@inode.at \
    --cc=hancockr@shaw.ca \
    --cc=ioe-lkml@rameria.de \
    --cc=libc-alpha@gnu.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkml@rtr.ca \
    --cc=lkml@tlinx.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=reiser@namesys.com \
    --cc=reiserfs-dev@namesys.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox