From: Fengguang Wu <wfg@mail.ustc.edu.cn>
To: David Chinner <dgc@sgi.com>
Cc: Andrew Morton <akpm@osdl.org>,
linux-kernel@vger.kernel.org, Ken Chen <kenchen@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Michael Rubin <mrubin@google.com>
Subject: Re: [PATCH 5/5] writeback: introduce writeback_control.more_io to indicate more io
Date: Thu, 4 Oct 2007 10:21:33 +0800
Message-ID: <391464497.22880@ustc.edu.cn>
Message-ID: <20071004022133.GA6244@mail.ustc.edu.cn>
In-Reply-To: <20071003024119.GL23367404@sgi.com>
On Wed, Oct 03, 2007 at 12:41:19PM +1000, David Chinner wrote:
> On Wed, Oct 03, 2007 at 09:34:39AM +0800, Fengguang Wu wrote:
> > On Wed, Oct 03, 2007 at 07:47:45AM +1000, David Chinner wrote:
> > > On Tue, Oct 02, 2007 at 04:41:48PM +0800, Fengguang Wu wrote:
> > > > wbc.pages_skipped = 0;
> > > > @@ -560,8 +561,9 @@ static void background_writeout(unsigned
> > > > min_pages -= MAX_WRITEBACK_PAGES - wbc.nr_to_write;
> > > > if (wbc.nr_to_write > 0 || wbc.pages_skipped > 0) {
> > > > /* Wrote less than expected */
> > > > - congestion_wait(WRITE, HZ/10);
> > > > - if (!wbc.encountered_congestion)
> > > > + if (wbc.encountered_congestion || wbc.more_io)
> > > > + congestion_wait(WRITE, HZ/10);
> > > > + else
> > > > break;
> > > > }
> > >
> > > Why do you call congestion_wait() if there is more I/O to issue? If
> > > we have a fast filesystem, this might cause the device queues to
> > > fill, then drain on congestion_wait(), then fill again, etc. i.e. we
> > > will have trouble keeping the queues full, right?
> >
> > You mean slow writers and fast RAID? That would be exactly the case
> > these patches try to improve.
>
> I mean any writers and a fast block device (raid or otherwise).
>
> > This patchset makes kupdate/background writeback more responsible,
> > so that if (avg-write-speed < device-capabilities), the dirty data are
> > synced timely, and we don't have to go for balance_dirty_pages().
>
> Sure, but I'm asking about the effect of the patches on the
> (avg-write-speed == device-capabilities) case. I agree that
> they are necessary for timely syncing of data but I'm trying
> to understand what effect they have on the normal write case
> (i.e. keeping the disk at full write throughput).
OK, I guess this is the focus of all your questions: why should we sleep
in congestion_wait() and possibly hurt the write throughput? I'll try
to summarize:

- congestion_wait() is necessary
  Besides device congestion, there may be other blockades we have to
  wait on, e.g. temporary page locks, NFS/journal issues (I guess).

- congestion_wait() is called only when necessary
  congestion_wait() will only be called when we saw blockades:
          if (wbc.nr_to_write > 0 || wbc.pages_skipped > 0) {
                  congestion_wait(WRITE, HZ/10);
          }
  So in the normal case, it may well write 128MB of data without any
  waiting.

- congestion_wait() won't hurt write throughput
  When not congested, congestion_wait() will be woken up on each write
  completion. Note that MAX_WRITEBACK_PAGES=1024 and
  /sys/block/sda/queue/max_sectors_kb=512 (for me),
  which means we are given the chance to sync 4MB on every 512KB written,
  i.e. we are able to submit write IOs 8 times faster than the
  device capability. congestion_wait() is a magical timer :-)
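
The "8 times" figure above can be checked with a few lines of C. This is
only a sketch of the arithmetic: the 4KB page size is an assumption (the
message does not state it), and pass_kb() and submit_ratio() are
hypothetical helper names, not kernel functions.

```c
#include <assert.h>

/* Values quoted in the message, except PAGE_SIZE_KB, which is assumed. */
enum {
    MAX_WRITEBACK_PAGES = 1024, /* pages per writeback pass */
    PAGE_SIZE_KB        = 4,    /* ASSUMPTION: 4KB pages */
    MAX_SECTORS_KB      = 512   /* /sys/block/sda/queue/max_sectors_kb */
};

/* KB that one writeback pass is allowed to submit (hypothetical helper). */
unsigned long pass_kb(unsigned long pages, unsigned long page_kb)
{
    return pages * page_kb;
}

/*
 * How many maximum-sized requests fit into one pass, i.e. how much more
 * data may be submitted per wakeup than one completed request carried
 * (hypothetical helper).
 */
unsigned long submit_ratio(unsigned long pass_size_kb, unsigned long request_kb)
{
    assert(request_kb > 0); /* avoid dividing by zero */
    return pass_size_kb / request_kb;
}
```

With these values, pass_kb(MAX_WRITEBACK_PAGES, PAGE_SIZE_KB) is 4096 (4MB)
and submit_ratio(4096, MAX_SECTORS_KB) is 8, matching the numbers in the
text. A different page size or max_sectors_kb setting changes the ratio.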
> > So for your question of queue depth, the answer is: the queue length
> > will not build up in the first place.
>
> Which queue are you talking about here? The block device queue?
Yes, the elevator's queues.
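
For reference, the wait-or-break decision made by the patched hunk quoted
at the top of this message can be modelled in user space roughly as
follows. This is a sketch, not kernel code: struct wbc_model, enum
wb_action and after_pass() are hypothetical names, only the four fields
that the hunk tests are modelled, and the other exit conditions of the
background_writeout() loop are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the writeback_control fields the hunk tests. */
struct wbc_model {
    long nr_to_write;            /* budget left over after the pass */
    long pages_skipped;          /* pages the pass had to skip */
    bool encountered_congestion; /* a congested queue was seen */
    bool more_io;                /* more dirty data is still queued */
};

enum wb_action {
    WB_CONTINUE, /* full budget written, go straight to the next pass */
    WB_WAIT,     /* corresponds to congestion_wait(WRITE, HZ/10) */
    WB_BREAK     /* nothing left to do, leave the loop */
};

/* Mirrors the patched condition: wait only on congestion or pending I/O. */
enum wb_action after_pass(const struct wbc_model *wbc)
{
    assert(wbc != NULL);

    if (wbc->nr_to_write > 0 || wbc->pages_skipped > 0) {
        /* Wrote less than expected */
        if (wbc->encountered_congestion || wbc->more_io)
            return WB_WAIT;
        return WB_BREAK;
    }
    return WB_CONTINUE;
}
```

In this model a pass that uses its whole budget and skips nothing never
waits, which is the "called only when necessary" point made above.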