From: Marr <marr@flex.com>
To: Hans Reiser <reiser@namesys.com>, libc-alpha@gnu.org
Cc: linux-kernel@vger.kernel.org, reiserfs-dev@namesys.com,
drepper@redhat.com, Andrew Morton <akpm@osdl.org>,
Mark Lord <lkml@rtr.ca>, Linda Walsh <lkml@tlinx.org>,
Bill Davidsen <davidsen@tmr.com>, Gerold Jury <gjury@inode.at>,
Robert Hancock <hancockr@shaw.ca>, Al Boldi <a1426z@gawab.com>,
Ingo Oeser <ioe-lkml@rameria.de>,
Nick Piggin <nickpiggin@yahoo.com.au>,
Arjan van de Ven <arjan@infradead.org>,
marr@flex.com
Subject: Re: Readahead value 128K? (was Re: Drastic Slowdown of 'fseek()'
Date: Mon, 27 Mar 2006 14:12:57 -0500
Message-ID: <200603271412.58364.marr@flex.com>
In-Reply-To: <442833FC.7080109@namesys.com>
On Monday 27 March 2006 1:50pm, Hans Reiser wrote:
> Thanks Marr.
>
> My concern here is with the users who have no idea what fseek is, and
> just see their apps getting slow. libc is to my mind doing the clearly
> incorrect thing here.
>
> Is there a libc developers mailing list, maybe we should try them if
> Ulrich is no longer active in libc maintaining?
Good point. I've found a 'glibc' developers' mailing list, so I'm including
them on this reply. Hopefully someone there will pick up on this thread and
respond.
Bill Marr
> Marr wrote:
> >Greetings, Ulrich, Hans, et al,
> >
> >*** Please CC: me on replies -- I'm not subscribed.
> >
> >After some more testing and some input (off-list) from others, here is a
> >summary of this problem and its various work-arounds to date....
> >
> >On Monday 27 February 2006 4:53pm, Hans Reiser wrote:
> >>Andrew Morton wrote:
> >>>runs like a dog on 2.6's reiserfs. libc is doing a (probably) 128k read
> >>>on every fseek.
> >>>
> >>>- There may be a libc stdio function which allows you to tune this
> >>> behaviour.
> >
> >It turns out that there is just such a function. Thanks to some sage
> >(off-list) advice from Gerold Jury, this is an effective way to switch the
> >file's stream to "unbuffered" mode:
> >
> > setvbuf( inp_fh, 0, _IONBF, 0 );
> >
> >This results in incredible speedups on the ReiserFS+2.6.x setup, without
> > the need to even use the 'nolargeio=1' mount option. Basically, we're
> > going from 128KB read-ahead on every 'fseek()' call to no read-ahead.
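For anyone who wants to try this, here is a minimal sketch of that call in context. The helper name 'open_unbuffered' is made up for illustration; only 'fopen()', 'setvbuf()', and 'fclose()' are real library calls. Note that 'setvbuf()' has to be called after the stream is opened and before any other operation is performed on it.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original application): open a file
 * for binary reading and turn stdio buffering off, so that each fseek()
 * is not followed by a large buffer refill. */
FILE *open_unbuffered(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Must come before any read or seek on the stream. */
    if (setvbuf(fp, NULL, _IONBF, 0) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

The trade-off is that every 'fread()' then turns into its own read request, so this is likely to help seek-heavy access patterns and hurt long sequential reads.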
> >
> >>>- libc should probably be a bit more defensive about this anyway -
> >>> plainly the filesystem is being silly.
> >>
> >>I really thank you for isolating the problem, but I don't see how you
> >>can do other than blame glibc for this. The recommended IO size is only
> >>relevant to uncached data, and glibc is using it regardless of whether
> >>or not it is cached or uncached. Do I misunderstand something myself
> >>here?
> >
> >To date, I've not seen anyone address this implicit question/issue that
> > Hans raised. To wit: Is the "recommended I/O size" only relevant to
> > _uncached_ data???
> >
> >If not, then anyone using ReiserFS on a 2.6.x kernel had best be well
> > aware that 128KB read-aheads are going to occur with every 'fseek()'
> > call, degrading performance drastically. This seems like a good reason
> > for the ReiserFS folks to re-evaluate the use of 128KB as the default
> > value for read-ahead.
> >
> >Alternatively, if "recommended I/O size" _is_ (intended to be) only
> > relevant to _uncached_ data, then the question becomes this: Is 'glibc'
> > erroneously using that recommended size regardless of whether the data is
> > cached or uncached?
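To check what a given filesystem actually reports, a small sketch follows. It assumes that the "recommended I/O size" under discussion is the value 'stat()' returns in 'st_blksize' (which POSIX describes as the preferred I/O block size for the object); that mapping is my assumption, and the helper name is made up.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: return the preferred I/O block size that the
 * filesystem reports for 'path', or -1 if stat() fails.
 * Assumption: this is the "recommended I/O size" discussed above. */
long preferred_io_size(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_blksize;
}
```

Calling this on a file from each mounted filesystem should make it easy to compare, for example, a ReiserFS mount against an 'ext3' mount on the same machine.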
> >
> >Ulrich, we'd really appreciate your input on this matter. Please advise.
> > Even a simple reply of "buzz off" would be useful at this point! ;^)
> >
> >------------------------------
> >
> >In summary, the problem still exists, but any of the following
> > work-arounds are effective, ordered here from best to worst:
> >
> >(A) Use a 'setvbuf()' call in the target application to disable (or
> > reduce) buffering on the input stream.
> >
> >Under certain conditions, this should be useful even when not using
> > ReiserFS and/or when not running a 2.6.x kernel. However, it's almost
> > essential (currently) with ReiserFS and 2.6.x kernels, for apps which do
> > a lot of file seeks using ANSI C file I/O (i.e. 'fseek()').
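As a sketch of the "reduce" variant of (A): the helper below keeps buffering on but supplies a small buffer of its own. The 4KB size simply mirrors the figure mentioned for other filesystems and is an assumption, as are the helper and macro names. An explicit buffer is passed because the C standard only says that the size "may" be honored when the buffer argument is NULL, and my understanding is that some implementations ignore it in that case.

```c
#include <stdio.h>
#include <stdlib.h>

#define SMALL_BUF_SIZE 4096  /* assumed size, chosen for illustration */

/* Hypothetical helper: open 'path' for binary reading with a 4KB,
 * fully buffered stream. The buffer is handed back through *buf_out
 * and must be freed by the caller, but only after fclose(). */
FILE *open_small_buffered(const char *path, char **buf_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    char *buf = malloc(SMALL_BUF_SIZE);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    /* Must come before any read or seek on the stream. */
    if (setvbuf(fp, buf, _IOFBF, SMALL_BUF_SIZE) != 0) {
        fclose(fp);
        free(buf);
        return NULL;
    }

    *buf_out = buf;
    return fp;
}
```

The buffer has to outlive the stream, which is why it is returned to the caller rather than declared as a local array inside the helper.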
> >
> >OR
> >
> >(B) Use the 'nolargeio=1' option when mounting a ReiserFS partition under
> >2.6.x kernels. This effectively changes the recommended I/O read-ahead
> > after each 'fseek()' call from 128KB to 4KB.
> >
> >Unlike option (A) above, this is useful for situations where you don't
> > have access to the source code of the target application(s).
> >
> >However, Andrew Morton mentioned this possible negative side-effect:
> >> This will alter the behaviour of every reiserfs filesystem in the
> >> machine. Even the already mounted ones.
> >
> >OR
> >
> >(C) Don't use ReiserFS (v3) under 2.6.x kernels (for apps which do a lot
> > of file seeks using ANSI C file I/O).
> >
> >For example, the 'ext2'/'ext3' filesystems seem to still use the 4KB
> >read-ahead, resulting in _much_ better performance when performing
> > multiple seeks (outside the range of the 'read-ahead' setting).
> >
> >------------------------------
> >
> >Of course, the unmentioned option (which basically bypasses the whole
> > issue) is to convert the underlying application to use raw, unbuffered
> > Unix file I/O (i.e. 'lseek() + read()' [or even just 'pread()', as
> > suggested by Andrew Morton]) instead of ANSI C file I/O ('fseek() +
> > fread()'), but that is considered out-of-scope for purposes of this
> > discussion.
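For completeness, a rough sketch of what the 'pread()' approach could look like. The helper name 'read_at' and its return convention are made up for illustration; 'pread()' itself is the standard POSIX call, and it reads at a given offset without touching the file position or any stdio buffer.

```c
#define _XOPEN_SOURCE 700  /* expose pread() under strict -std= modes */
#include <errno.h>
#include <fcntl.h>         /* open(), which callers use to obtain 'fd' */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly 'len' bytes starting at 'offset'.
 * Returns 0 on success, -1 on error or if EOF is hit first. */
int read_at(int fd, void *dst, size_t len, off_t offset)
{
    char *p = dst;

    while (len > 0) {
        ssize_t n = pread(fd, p, len, offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted, try again */
            return -1;
        }
        if (n == 0)
            return -1;         /* end of file before 'len' bytes */
        p += n;
        len -= (size_t)n;
        offset += n;
    }
    return 0;
}
```

A caller would obtain 'fd' once with 'open(path, O_RDONLY)' and then call 'read_at()' for each record, in place of each 'fseek() + fread()' pair.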
> >
> >-----------------------------
> >
> >Thanks to all who supplied input. Special thanks to Andrew Morton and
> > Gerold Jury who supplied what effectively turned out to be the
> > most-useful work-arounds.
> >
> >*** Please CC: me on replies -- I'm not subscribed.
> >
> >Bill Marr