From: Andrew Morton <akpm@linux-foundation.org>
To: Chuck Ebbert <cebbert@redhat.com>
Cc: Matthias Hensler <matthias@wspse.de>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	richard kennedy <richard@rsk.demon.co.uk>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: Processes spinning forever, apparently in lock_timer_base()?
Date: Thu, 20 Sep 2007 15:36:54 -0700
Message-ID: <20070920153654.b9e90616.akpm@linux-foundation.org>
In-Reply-To: <46F2EE76.4000203@redhat.com>

On Thu, 20 Sep 2007 18:04:38 -0400
Chuck Ebbert <cebbert@redhat.com> wrote:

> > 
> >> Can we get some kind of band-aid, like making the endless 'for' loop in
> >> balance_dirty_pages() terminate after some number of iterations? Clearly
> >> if we haven't written "write_chunk" pages after a few tries, *and* we
> >> haven't encountered congestion, there's no point in trying forever...
> > 
> > Did my above questions get looked at?
> > 
> > Is anyone able to reproduce this?
> > 
> > Do we have a clue what's happening?
> 
> There are a ton of dirty pages for one disk, and zero or close to zero dirty
> for a different one. Kernel spins forever trying to write some arbitrary
> minimum amount of data ("write_chunk" pages) to the second disk...

That should be OK.  The caller will sit in that loop, sleeping in
congestion_wait(), polling the correct backing-dev occasionally and waiting
until the dirty page counts subside to an acceptable level, at which stage this:

			if (nr_reclaimable +
				global_page_state(NR_WRITEBACK)
					<= dirty_thresh)
						break;


will happen and we leave balance_dirty_pages().
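
As a rough illustration of that exit path, here is a user-space sketch of the
polling loop. It is not the kernel code: struct dirty_state,
under_dirty_thresh(), passes_until_exit() and the cleaned_per_pass parameter
are hypothetical names made up for this example, and each loop pass merely
stands in for one congestion_wait() sleep during which some other writer is
assumed to clean pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global counters the real loop samples. */
struct dirty_state {
	long nr_reclaimable;	/* pages that could be written back */
	long nr_writeback;	/* pages currently under writeback */
};

/* Same shape as the exit test quoted above. */
static bool under_dirty_thresh(const struct dirty_state *s, long dirty_thresh)
{
	return s->nr_reclaimable + s->nr_writeback <= dirty_thresh;
}

/*
 * Model of the polling loop. Each pass stands for one sleep, during which
 * cleaned_per_pass pages are ASSUMED to be cleaned by someone else.
 * Returns the number of passes taken before the exit test succeeds,
 * or -1 if it never succeeds within max_passes.
 */
static int passes_until_exit(struct dirty_state s, long dirty_thresh,
			     long cleaned_per_pass, int max_passes)
{
	assert(cleaned_per_pass >= 0);

	for (int pass = 0; pass < max_passes; pass++) {
		if (under_dirty_thresh(&s, dirty_thresh))
			return pass;

		/* Stand-in for writeback progress made while we sleep. */
		s.nr_reclaimable -= cleaned_per_pass;
		if (s.nr_reclaimable < 0)
			s.nr_reclaimable = 0;
	}
	return -1;
}
```

With cleaned_per_pass set to zero the model never exits, which is the
situation being reported: nothing is writing the dirty pages out, so the
totals never drop below the threshold.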

That's all a bit crappy if the wrong races happen and some other task is
somehow exceeding the dirty limits each time this task polls them.  Seems
unlikely that such a condition would persist forever.

So the question is, why do we have large amounts of dirty pages for one
disk which appear to be sitting there not getting written?

Do we know if there's any writeout at all happening when the system is in
this state?

I guess it's possible that the dirty inodes on the "other" disk got
themselves onto the wrong per-sb inode list, or are on the correct list,
but in the wrong place.  If so, these:

writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-2.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-3.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-4.patch
writeback-fix-comment-use-helper-function.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-5.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-6.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-7.patch
writeback-fix-periodic-superblock-dirty-inode-flushing.patch

from 2.6.23-rc6-mm1 should help.
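
To make the "misplaced on the list" case concrete, here is a toy model,
assuming the dirty list is meant to be ordered by the time each inode was
dirtied and that the scan stops at the first inode that is too new. The
struct toy_inode type, write_expired() and the older_than cutoff are
hypothetical simplifications for this sketch, not the kernel's data
structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified inode: only the time it was dirtied. */
struct toy_inode {
	unsigned long dirtied_when;
	int written;
};

/*
 * Walk the list from the end ASSUMED to hold the oldest inode and stop at
 * the first inode dirtied after older_than, trusting that everything
 * behind it is newer still. Returns how many inodes were written.
 */
static size_t write_expired(struct toy_inode *list, size_t n,
			    unsigned long older_than)
{
	size_t written = 0;

	assert(list != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (list[i].dirtied_when > older_than)
			break;	/* the rest is assumed to be newer */
		list[i].written = 1;
		written++;
	}
	return written;
}
```

On a correctly ordered list such as {10, 20, 90} with a cutoff of 50, both
old inodes are written. On a misordered list such as {10, 90, 20}, the scan
stops at 90 and the inode dirtied at time 20 stays dirty, however often the
scan is repeated with that cutoff.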


Did anyone try running /bin/sync when the system is in this state?
