Re: 2.4.14-pre6 - Andrew Morton

From: Andrew Morton <akpm@zip.com.au>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: 2.4.14-pre6
Date: Wed, 31 Oct 2001 10:36:24 -0800
Message-ID: <3BE044A8.E91A4F01@zip.com.au>
In-Reply-To: <Pine.LNX.4.33.0110310809200.32460-100000@penguin.transmeta.com>

Linus Torvalds wrote:
> 
> In article <3BDFBFF5.9F54B938@zip.com.au>,
> Andrew Morton  <akpm@zip.com.au> wrote:
> >
> >Appended here is a program which creates 100,000 small files.
> >Using ext2 on -pre5.  We see how long it takes to run
> >
> >       (make-many-files ; sync)
> >
> >For several values of queue_nr_requests:
> >
> >queue_nr_requests:     128     8192    32768
> >execution time:        4:43    3:25    3:20
> >
> >Almost all of the execution time is in the `sync'.
> 
> Hmm..  I don't consider "sync" to be a benchmark, and one of the things
> that made me limit the queue size was in fact that Linux in the
> timeframe before roughly 2.4.7 or so was _completely_ unresponsive when
> you did a big "untar" followed by a "sync".

Sure.  I chose `sync' because it's measurable.  That sync took
four minutes, so the machine will be locked up seeking for four
minutes whether the writeback was initiated by /bin/sync or by
kupdate/bdflush.

> I'd rather have a machine where I don't even much notice the sync than
> one where a made-up-load and a "sync" that serves no purpose shows
> lower throughput.
> 
> Do you actually have any real load that cares?

All I do is compile kernels :)

Actually, ext3 journal replay can sometimes take thirty seconds
or longer - it reads maybe ten megs from the journal and then
it has to splatter it all over the platter and wait on it.

> ...
> We have actually talked about some higher-level ordering of the dirty list
> for at least five years, but nobody has ever done it. And I bet you $5
> that you'll get (a) better throughput than by making the queues longer and
> (b) you'll have fine latency while you write and (c) that you want to
> order the write-queue anyway for filesystems that care about ordering.

I'll buy that.  It's not just the dirty list, either.  I've seen 
various incarnations of page_launder() and its successor which
were pretty suboptimal from a write clustering pov.

But it's actually quite seductive to take a huge amount of data and
just chuck it at the request layer and let Jens sort it out. This
usually works well and keeps the complexity in one place.

One does wonder whether everything is working as it should, though.
Creating those 100,000 4k files is going to require writeout of
how many blocks?  120,000?  And four minutes is enough time for
34,000 seven-millisecond seeks.  And ext2 is pretty good at laying
things out contiguously.  These numbers don't gel.
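
A quick check of that arithmetic, as a sketch only.  The function name
is made up for illustration, and the 7 ms seek time is the round figure
used above, not a measurement:

```c
/* Number of whole seeks of seek_ms milliseconds that fit in secs seconds.
 * Hypothetical helper, for illustration only. */
long seeks_that_fit(long secs, long seek_ms)
{
	return (secs * 1000) / seek_ms;
}
```

Four minutes is 240 seconds, so seeks_that_fit(240, 7) comes to roughly
34,000.  Against an estimated 120,000 blocks that is about one seek for
every three or four blocks, which seems like a lot if the layout is
mostly contiguous.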

Ah-ha.  Look at the sync_inodes stuff:

	for (zillions of files) {
		filemap_fdatasync(file)
		filemap_fdatawait(file)
	}

If we turn this into

	for (zillions of files)
		filemap_fdatasync(file)
	for (zillions of files)
		filemap_fdatawait(file)

I suspect that interesting effects will be observed, yes?  Especially
if we have a nice long request queue, and the results of the
preceding sync_buffers() are still available for being merged with.
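
A toy userspace model of why the split loop might help.  This is not
kernel code: every name here is hypothetical, each "file" is assumed to
have a single dirty block at a given disk position, and the request
layer is assumed to service a queued batch in ascending block order.

```c
#include <stdlib.h>
#include <string.h>

/* Total head movement when blocks are serviced in the given order,
 * starting with the head at position 0. */
static long head_travel(const long *pos, size_t n)
{
	long head = 0, total = 0;

	for (size_t i = 0; i < n; i++) {
		total += labs(pos[i] - head);
		head = pos[i];
	}
	return total;
}

static int cmp_long(const void *a, const void *b)
{
	long x = *(const long *)a, y = *(const long *)b;

	return (x > y) - (x < y);
}

/* Submit-then-wait per file: at most one request is queued at a time,
 * so blocks are serviced in submission order. */
long travel_interleaved(const long *pos, size_t n)
{
	return head_travel(pos, n);
}

/* Submit everything, then wait: the whole batch is queued at once, so
 * (by assumption) it can be serviced in sorted order.
 * Returns -1 if the scratch buffer cannot be allocated. */
long travel_batched(const long *pos, size_t n)
{
	long *sorted, total;

	if (n == 0)
		return 0;
	sorted = malloc(n * sizeof(*sorted));
	if (sorted == NULL)
		return -1;
	memcpy(sorted, pos, n * sizeof(*sorted));
	qsort(sorted, n, sizeof(*sorted), cmp_long);
	total = head_travel(sorted, n);
	free(sorted);
	return total;
}
```

With positions {900, 10, 800, 20, 700} the interleaved order moves the
head 4040 units and the batched order 900.  The scattered positions are
an assumption of the model; how much a real ext2 layout would gain
depends on how contiguous the blocks already are.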

kupdate runs this code path as well. Why is there any need for
kupdate to wait on the writes?

Anyway.  I'll take a look....
