From: Fengguang Wu <wfg@mail.ustc.edu.cn>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Chakri n <chakriin5@gmail.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>, Greg KH <gregkh@suse.de>,
	lkml <linux-kernel@vger.kernel.org>,
	Chuck Ebbert <cebbert@redhat.com>,
	Krzysztof Oledzki <olel@ans.pl>,
	linux-pm <linux-pm@lists.linux-foundation.org>,
	richard kennedy <richard@rsk.demon.co.uk>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [PATCH] writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi
Date: Tue, 2 Oct 2007 20:13:27 +0800
Message-ID: <20071002121327.GA5718@mail.ustc.edu.cn>
In-Reply-To: <20071001191457.2f7c7538.akpm@linux-foundation.org>

On Mon, Oct 01, 2007 at 07:14:57PM -0700, Andrew Morton wrote:
> On Tue, 2 Oct 2007 10:00:40 +0800 Fengguang Wu <wfg@mail.ustc.edu.cn> wrote:
> 
> > writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi
> > 
> > On a busy-writing system, a writer could be held up indefinitely on a
> > light-load device. It will be trying to sync more dirty data than is available.
> > 
> > The problem case:
> > 
> > 0. sda/nr_dirty >= dirty_limit;
> >    sdb/nr_dirty == 0
> > 1. dd writes 32 pages on sdb
> > 2. balance_dirty_pages() blocks dd, and tries to write 6MB.
> > 3. it never gets there: there's only 128KB dirty data.
> > 4. dd may be blocked for a loooong time
> 
> Please quantify loooong.

There are only two 'break' conditions in the loop:
1. nr_dirty + nr_unstable + nr_writeback < dirty_limit
   => *mostly* FALSE on a busy system
   => *always* FALSE in Chakri's stuck NFS case
2. nr_written >= 6MB
   for a light-load bdi:
   => *never* TRUE until many new writers arrive and contribute
      more dirty pages to sync
   => worse, those new writers will get stuck here too...
      the obvious imbalance is:
           each writer contributes only 32KB of new dirty pages, but
           wants to consume 6MB (which is not necessarily available)

So loooong = min(global-less-busy-time, bdi-many-new-writers-arrival-time).
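
A minimal sketch of the two 'break' conditions as a toy model, not the
kernel code. The function name and parameters are hypothetical, all
quantities are in pages, and it assumes the busy device keeps the global
count pinned at or above the limit while the writer is throttled:

```c
#include <assert.h>

/*
 * Toy model of the throttle loop (hypothetical names, not kernel code).
 * Returns the pass on which the writer leaves the loop, or -1 if neither
 * 'break' condition is met within max_passes.
 *
 * Assumption: global_dirty stays constant, i.e. the busy device keeps the
 * system at its dirty limit for the whole time.
 */
long toy_throttle_passes(long global_dirty, long dirty_limit,
			 long bdi_dirty, long write_chunk, long max_passes)
{
	long pages_written = 0;
	long pass;

	assert(write_chunk > 0 && max_passes > 0);

	for (pass = 1; pass <= max_passes; pass++) {
		long want, done;

		/* break condition 1: the system is below the global limit */
		if (global_dirty < dirty_limit)
			return pass;

		/* the writer can only sync what its own bdi has dirty */
		want = write_chunk - pages_written;
		done = bdi_dirty < want ? bdi_dirty : want;
		bdi_dirty -= done;
		pages_written += done;

		/* break condition 2: a full write_chunk has been written */
		if (pages_written >= write_chunk)
			return pass;
	}
	return -1;	/* still stuck after max_passes */
}
```

With 4KB pages, 6MB is 1536 pages. In this model,
toy_throttle_passes(10000, 8000, 32, 1536, 1000) returns -1: the 32 dirty
pages are written on the first pass and nothing is left to reach 1536.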

> > Fix it by returning on 'zero dirty inodes' in the current bdi.
> > (In fact there are slight differences between 'dirty inodes' and 'dirty pages'.
> > But there are no counters available for 'dirty pages'.)
> > 
> > But the newly introduced 'break' could let nr_writeback drift
> > above the dirty limit. The workaround is to keep the error under 1MB.
> 
> I'm still not sure that we fully understand this yet.
> 
> If the sdb writer is stuck in balance_dirty_pages() then all sda writers
> will be in balance_dirty_pages() too, madly writing stuff out to sda.  And
> pdflush will be writing out sda as well.  All this writeout to sda should
> release the sdb writer.
> 
> Why isn't this happening?

Your reasoning is right. The exact consequence is:
        the light-load sdb is made as _unresponsive_ as the busy sda

Hence Chakri's case: whenever NFS is stuck, every device gets stuck.

> 
> > Cc: Chuck Ebbert <cebbert@redhat.com>
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
> > ---
> >  mm/page-writeback.c |    5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > --- linux-2.6.22.orig/mm/page-writeback.c
> > +++ linux-2.6.22/mm/page-writeback.c
> > @@ -250,6 +250,11 @@ static void balance_dirty_pages(struct a
> >  			pages_written += write_chunk - wbc.nr_to_write;
> >  			if (pages_written >= write_chunk)
> >  				break;		/* We've done our duty */
> > +			if (list_empty(&mapping->host->i_sb->s_dirty) &&
> > +			    list_empty(&mapping->host->i_sb->s_io) &&
> > +			    nr_reclaimable + global_page_state(NR_WRITEBACK) <=
> > +				    dirty_thresh + (1 << (20-PAGE_CACHE_SHIFT)))
> > +				break;
> >  		}
> >  		congestion_wait(WRITE, HZ/10);
> >  	}
> 
> Well that has a nice safety net.  Perhaps it could fail a bit later on,
> but that depends on why it's failing.

In theory, every CPU/parallel writer could contribute 8 pages (32KB with
4KB pages) of error. Hence the 1MB margin covers 1MB/32KB = 32 CPUs/writers.

A more serious problem is that a busy writer could also drain all the
dirty pages and make (nr_writeback == dirty_limit+1MB). In that case,
I suspect the light-load sdb writer still has a good chance of
making progress (needs confirmation).
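
The arithmetic can be checked with a small sketch. It assumes 4KB pages
(a page shift of 12). TOY_PAGE_SHIFT and both function names are
hypothetical, chosen only for this illustration:

```c
#include <assert.h>

/* Assumption: 4KB pages, i.e. a page shift of 12 (hypothetical name). */
enum { TOY_PAGE_SHIFT = 12 };

/* Number of pages in the 1MB slack: 1 << (20 - page_shift). */
long slack_pages(int page_shift)
{
	assert(page_shift >= 0 && page_shift <= 20);
	return 1L << (20 - page_shift);
}

/* Number of writers the slack absorbs if each overshoots by err_pages. */
long writers_covered(int page_shift, long err_pages)
{
	assert(err_pages > 0);
	return slack_pages(page_shift) / err_pages;
}
```

Here slack_pages(TOY_PAGE_SHIFT) is 256 pages, and
writers_covered(TOY_PAGE_SHIFT, 8) is 32.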
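
A minimal sketch of the added 'break' test as a standalone predicate, to
show the boundary case. The function and parameter names are hypothetical;
bdi_has_dirty_inodes stands in for the two list_empty() checks in the
patch, and all counts are in pages:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy version of the extra 'break' condition (hypothetical names).
 * The writer may leave the loop when its bdi has no dirty inodes and the
 * global count is within 'slack' pages of the dirty threshold.
 */
bool toy_patched_break(bool bdi_has_dirty_inodes, long nr_reclaimable,
		       long nr_writeback, long dirty_thresh, long slack)
{
	assert(slack >= 0);
	return !bdi_has_dirty_inodes &&
	       nr_reclaimable + nr_writeback <= dirty_thresh + slack;
}
```

Because the comparison is '<=', a state with nr_reclaimable == 0 and
nr_writeback == dirty_thresh + slack still lets the light-load writer out,
and one page more does not. This only illustrates the condition itself; it
says nothing about which states the real system actually reaches.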

> How well tested was this?

Not well tested so far. My system becomes unusable soon after
starting the NFS write (even before plugging the network). I'm seeing
large latencies in try_to_wake_up(). I hope Ingo can help with that.

> If we merge this for 2.6.23 then I expect that we'll immediately unmerge it
> for 2.6.24 because Peter's stuff fixes this problem by other means.
> 
> Do we all agree with the above sentence?

Yeah, Peter and I were both aware of the timing.
This patch is only meant for 2.6.23 and 2.6.22.10.

Fengguang
