All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>,
	linux-kernel@vger.kernel.org, WU Fengguang <wfg@mail.ustc.edu.cn>,
	Miklos Szeredi <miklos@szeredi.hu>
Subject: Re: Temporary lockup on loopback block device
Date: Sun, 11 Nov 2007 00:02:32 +0100	[thread overview]
Message-ID: <1194735752.5828.1.camel@lappy> (raw)
In-Reply-To: <20071110145444.a6993df1.akpm@linux-foundation.org>


On Sat, 2007-11-10 at 14:54 -0800, Andrew Morton wrote:
> On Sat, 10 Nov 2007 20:51:31 +0100 (CET) Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> wrote:
> 
> > Hi
> > 
> > I am experiencing a transient lockup in 'D' state with loopback device. It 
> > happens when process writes to a filesystem in loopback with command like
> > dd if=/dev/zero of=/s/fill bs=4k 
> > 
> > CPU is idle, disk is idle too, yet the dd process is waiting in 'D' in 
> > congestion_wait called from balance_dirty_pages.
> > 
> > After about 30 seconds, the lockup is gone and dd resumes, but it locks up 
> > soon again.
> > 
> > I added a printk to the balance_dirty_pages
> > printk("wait: nr_reclaimable %d, nr_writeback %d, dirty_thresh %d, 
> > pages_written %d, write_chunk %d\n", nr_reclaimable, 
> > global_page_state(NR_WRITEBACK), dirty_thresh, pages_written, 
> > write_chunk);
> > 
> > and it shows this during the lockup:
> > 
> > wait: nr_reclaimable 3099, nr_writeback 0, dirty_thresh 2985, 
> > pages_written 1021, write_chunk 1522
> > wait: nr_reclaimable 3099, nr_writeback 0, dirty_thresh 2985, 
> > pages_written 1021, write_chunk 1522
> > wait: nr_reclaimable 3099, nr_writeback 0, dirty_thresh 2985, 
> > pages_written 1021, write_chunk 1522
> > 
> > What apparently happens:
> > 
> > writeback_inodes syncs inodes only on the given wbc->bdi, however 
> > balance_dirty_pages checks against global counts of dirty pages. So if 
> > there's nothing to sync on a given device, but there are other dirty pages 
> > so that the counts are over the limit, it will loop without doing any 
> > work.
> > 
> > To reproduce it, you need totally idle machine (no GUI, etc.) -- if 
> > something writes to the backing device, it flushes the dirty pages 
> > generated by the loopback and the lockup is gone. If you add printk, don't 
> > forget to stop klogd, otherwise logging would end the lockup.
> 
> erk.

known issue.

> > The hotfix (that I verified to work) is to not set wbc->bdi, so that all 
> > devices are flushed ... but the code probably needs some redesign (i.e. 
> > either account per-device and flush per-device, or account-global and 
> > flush-global).

.24 will have the per-device solution.

> > 
> > diff -u -r ../x/linux-2.6.23.1/mm/page-writeback.c mm/page-writeback.c
> > --- ../x/linux-2.6.23.1/mm/page-writeback.c     2007-10-12 18:43:44.000000000 +0200
> > +++ mm/page-writeback.c 2007-11-10 20:32:43.000000000 +0100
> > @@ -214,7 +214,6 @@
> > 
> > 	for (;;) {
> > 		struct writeback_control wbc = {
> > -			.bdi            = bdi,
> > 			.sync_mode      = WB_SYNC_NONE,
> > 			.older_than_this = NULL,
> > 			.nr_to_write    = write_chunk,
> 
> Arguably we just have the wrong backing-device here, and what we should do
> is to propagate the real backing device's pointer through up into the
> filesystem.  There's machinery for this which things like DM stacks use.
> 
> I wonder if the post-2.6.23 changes happened to make this problem go away.

The per BDI dirty stuff in 24 should make this work, I just checked and
loopback thingies seem to have their own BDI, so all should be well.



  reply	other threads:[~2007-11-10 23:03 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-10 19:51 Temporary lockup on loopback block device Mikulas Patocka
2007-11-10 22:54 ` Andrew Morton
2007-11-10 23:02   ` Peter Zijlstra [this message]
2007-11-11  0:38     ` Mikulas Patocka
2007-11-11  7:50       ` Miklos Szeredi
2007-11-11 18:29         ` Mikulas Patocka
2007-11-12 13:32           ` Miklos Szeredi
2007-11-15 22:35             ` Mikulas Patocka
2007-11-11  0:33   ` Mikulas Patocka
2007-11-11  3:56     ` Mikulas Patocka
2007-11-11  5:33       ` Mikulas Patocka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1194735752.5828.1.camel@lappy \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=mikulas@artax.karlin.mff.cuni.cz \
    --cc=wfg@mail.ustc.edu.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.