From: Andrew Morton <akpm@zip.com.au>
To: Andrea Arcangeli <andrea@suse.de>
Cc: "Griffiths, Richard A" <richard.a.griffiths@intel.com>,
	"'Marcelo Tosatti'" <marcelo@conectiva.com.br>,
	"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
	"'Carter K. George'" <carter@polyserve.com>,
	"'Don Norton'" <djn@polyserve.com>,
	"'James S. Tybur'" <jtybur@polyserve.com>,
	"Gross, Mark" <mark.gross@intel.com>
Subject: Re: fsync fixes for 2.4
Date: Mon, 15 Jul 2002 11:36:56 -0700
Message-ID: <3D331648.64AC0C25@zip.com.au>
In-Reply-To: 20020715100719.GE34@dualathlon.random

Andrea Arcangeli wrote:
> 
> ...
> As for scaling async flushes to multiple devices: 2.4 has a
> single flushing thread; 2.5, as Andrew said, (partly) fixes this with
> multiple pdflush threads, as he explained to me at OLS. The only issue I
> see in his design is that it works per superblock, so if a filesystem
> sits on top of an LVM volume backed by a dozen different hard disks, only
> one pdflush will pump those dozen physical request queues, because the
> first pdflush to enter the superblock keeps the others from working on
> the same superblock. So the first physical queue that fills up will
> stop pdflush from pushing more dirty pages to the other, possibly empty,
> physical queues.

Well.  There's no way in which we can get effective writeback against
200 spindles by relying on pdflush, so that daemon is mainly there
to permit background writeback under light-to-moderate loads.

Once things get heavy, the only sane approach is to use the actual
caller of write(2) as the resource for performing the writeback,
as we currently do in balance_dirty[_pages]().  But the
problem there is that in both 2.4 and 2.5, a caller of that function
can easily get stuck on the wrong queue, and bandwidth really suffers.

I've been working on changing 2.5 so that the write(2) caller no
longer performs a general "writeback of everything".  Instead, that
caller performs writeback specifically against the queue which
he just dirtied.  This is done by using the address_space->backing_dev_info
as a key during a search across the superblocks and blockdev inodes.
That works quite well.
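
To make the keyed search concrete, here is a rough user-space sketch of
the idea.  It is not the 2.5 code: toy_bdi, toy_inode, toy_sb and
toy_writeback() are hypothetical stand-ins for the real kernel objects,
and "writing" a page is just decrementing a counter.

```c
#include <stddef.h>

/* Toy stand-ins for kernel objects; every name here is hypothetical. */
struct toy_bdi { const char *name; };   /* one per request queue */

struct toy_inode {
    struct toy_bdi *bdi;    /* queue backing this inode's mapping */
    int dirty_pages;        /* pages awaiting writeback */
};

struct toy_sb {
    struct toy_inode *inodes;
    size_t nr_inodes;
};

/*
 * Walk all superblocks and write back dirty pages, but only for inodes
 * whose mapping is backed by `key`.  Passing key == NULL gives the old
 * "writeback of everything" behaviour.  Returns the number of pages
 * written, never more than nr_to_write.
 */
static int toy_writeback(struct toy_sb *sbs, size_t nr_sbs,
                         const struct toy_bdi *key, int nr_to_write)
{
    int written = 0;

    for (size_t s = 0; s < nr_sbs && written < nr_to_write; s++) {
        for (size_t i = 0;
             i < sbs[s].nr_inodes && written < nr_to_write; i++) {
            struct toy_inode *inode = &sbs[s].inodes[i];
            int n;

            if (key && inode->bdi != key)
                continue;   /* different queue: leave it alone */

            n = inode->dirty_pages;
            if (n > nr_to_write - written)
                n = nr_to_write - written;
            inode->dirty_pages -= n;    /* pretend to submit n pages */
            written += n;
        }
    }
    return written;
}
```

The point of the key is that a writer who dirtied pages on one queue
never touches, and so never blocks on, the dirty pages of another.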

But there's still a problem where pdflush goes to writeback a queue
and fills it up, so the userspace program ends up blocking (due to
pdflush's activity) when it really should not.  Still undecided about
what to do about that.

And yes, point taken on the LVM thing.  If the chunk size is reasonably
small (a few megabytes) then we should normally get decent concurrency,
but there will be corner-cases.
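
A back-of-the-envelope sketch of why chunk size matters, assuming plain
round-robin striping (real LVM layouts may differ).  stripe_disk() and
disks_touched() are hypothetical helpers, not kernel or LVM functions.

```c
#include <stdint.h>

/* Which physical disk holds byte `offset` of a round-robin striped volume? */
static unsigned stripe_disk(uint64_t offset, uint64_t chunk_bytes,
                            unsigned nr_disks)
{
    return (unsigned)((offset / chunk_bytes) % nr_disks);
}

/*
 * How many distinct disks does the byte range [start, start + len) touch?
 * Chunks are laid out round-robin, so the answer is the number of chunks
 * the range spans, capped at nr_disks.
 */
static unsigned disks_touched(uint64_t start, uint64_t len,
                              uint64_t chunk_bytes, unsigned nr_disks)
{
    uint64_t first, last, chunks;

    if (len == 0)
        return 0;
    first = start / chunk_bytes;
    last = (start + len - 1) / chunk_bytes;
    chunks = last - first + 1;
    return chunks < nr_disks ? (unsigned)chunks : nr_disks;
}
```

With a dozen disks and 4MB chunks, 32MB of contiguous dirty data spans
eight disks, so a single flusher keeps eight queues busy.  With 64MB
chunks the same 32MB lands on one disk, which is the corner case.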
