From: "Chakri n" <chakriin5@gmail.com>
To: "Andrew Morton" <akpm@linux-foundation.org>
Cc: a.p.zijlstra@chello.nl, linux-pm@lists.linux-foundation.org,
	nfs@lists.sourceforge.net, linux-kernel@vger.kernel.org,
	Trond Myklebust <trond.myklebust@fys.uio.no>
Subject: Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)
Date: Fri, 28 Sep 2007 16:33:18 -0700
Message-ID: <92cbf19b0709281633ib4cfbdet5edc19381beeece3@mail.gmail.com>
In-Reply-To: <92cbf19b0709281436i41247863t6cbc919c33e972a3@mail.gmail.com>

No change in behavior even on a low-memory system. I confirmed
this by running the test on a machine with 1 GB of memory.

Thanks
--Chakri

On 9/28/07, Chakri n <chakriin5@gmail.com> wrote:
> Here is the snapshot of vmstats from when the problem happened. I believe
> this could help a little.
>
> crash> kmem -V
>        NR_FREE_PAGES: 680853
>          NR_INACTIVE: 95380
>            NR_ACTIVE: 26891
>        NR_ANON_PAGES: 2507
>       NR_FILE_MAPPED: 1832
>        NR_FILE_PAGES: 119779
>        NR_FILE_DIRTY: 0
>         NR_WRITEBACK: 18272
>  NR_SLAB_RECLAIMABLE: 1305
> NR_SLAB_UNRECLAIMABLE: 2085
>         NR_PAGETABLE: 123
>      NR_UNSTABLE_NFS: 0
>            NR_BOUNCE: 0
>      NR_VMSCAN_WRITE: 0
>
> In my testing, I always saw the processes waiting in
> balance_dirty_pages_ratelimited(), never in the throttle_vm_writeout()
> path.
>
> But this could be because I have about 4 GB of memory in the system
> and plenty of memory is still available.
>
> I will rerun the test with memory limited to 1024 MB and see whether it
> takes a different path.
>
> Thanks
> --Chakri
>
>
> On 9/28/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Fri, 28 Sep 2007 16:32:18 -0400
> > Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> >
> > > On Fri, 2007-09-28 at 13:10 -0700, Andrew Morton wrote:
> > > > On Fri, 28 Sep 2007 15:52:28 -0400
> > > > Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > > >
> > > > > On Fri, 2007-09-28 at 12:26 -0700, Andrew Morton wrote:
> > > > > > On Fri, 28 Sep 2007 15:16:11 -0400 Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > > > > > > Looking back, they were getting caught up in
> > > > > > > balance_dirty_pages_ratelimited() and friends. See the attached
> > > > > > > example...
> > > > > >
> > > > > > that one is nfs-on-loopback, which is a special case, isn't it?
> > > > >
> > > > > I'm not sure that the hang that is illustrated here is so special. It is
> > > > > an example of a bog-standard ext3 write, that ends up calling the NFS
> > > > > client, which is hanging. The fact that it happens to be hanging on the
> > > > > nfsd process is more or less irrelevant here: the same thing could
> > > > > happen to any other process in the case where we have an NFS server that
> > > > > is down.
> > > >
> > > > hm, so ext3 got stuck in nfs via __alloc_pages direct reclaim?
> > > >
> > > > We should be able to fix that by marking the backing device as
> > > > write-congested.  That'll have small race windows, but it should be a 99.9%
> > > > fix?
> > >
> > > No. The problem would rather appear to be that we're doing
> > > per-backing_dev writeback (if I read sync_sb_inodes() correctly), but
> > > we're measuring variables which are global to the VM. The backing device
> > > that we are selecting may not be writing out any dirty pages, in which
> > > case, we're just spinning in balance_dirty_pages_ratelimited().
> >
> > OK, so it's unrelated to page reclaim.
> >
> > > Should we therefore perhaps be looking at adding per-backing_dev stats
> > > too?
> >
> > That's what mm-per-device-dirty-threshold.patch and friends are doing.
> > Whether it works adequately is not really known at this time.
> > Unfortunately kernel developers don't test -mm much.
> >
>
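
For readers who want the counters above in more familiar units, here is a
minimal sketch that converts the page counts from the quoted `kmem -V`
snapshot into MiB. The 4 KiB page size is an assumption (the thread does not
state it), and `pages_to_mib` is a hypothetical helper name used only for
illustration.

```python
# Assumption: 4 KiB pages. The thread does not state the page size.
PAGE_SIZE = 4096

# Page counts copied from the quoted "crash> kmem -V" snapshot.
counters = {
    "NR_FREE_PAGES": 680853,
    "NR_FILE_PAGES": 119779,
    "NR_FILE_DIRTY": 0,
    "NR_WRITEBACK": 18272,
}


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to mebibytes (hypothetical helper)."""
    return pages * page_size / (1024 * 1024)


for name, pages in counters.items():
    print(f"{name:>14}: {pages:>7} pages = {pages_to_mib(pages):8.1f} MiB")
```

Under that assumption, no pages are dirty while a sizeable amount of data is
stuck under writeback, with plenty of memory still free. That would fit the
report that writers block in balance_dirty_pages_ratelimited() rather than in
the reclaim path.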

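The point Trond raises above (writeback is done per backing device, but the
throttling decision uses VM-global counters) can be illustrated with a toy
model. This is only a sketch under stated assumptions, not the kernel's
actual balance_dirty_pages() logic: the `BackingDev` class, both `throttle_*`
functions, the thresholds, and the loop limit are all hypothetical. Only the
18272 pages under writeback come from the snapshot.

```python
from dataclasses import dataclass


@dataclass
class BackingDev:
    """Toy backing device; all fields are illustrative, not kernel structures."""
    name: str
    dirty: int = 0          # pages dirty against this device
    writeback: int = 0      # pages currently under writeback
    stalled: bool = False   # True if I/O to this device never completes

    def write_some(self, nr):
        """Start writeback on up to nr dirty pages; finish it unless stalled."""
        started = min(nr, self.dirty)
        self.dirty -= started
        self.writeback += started
        if not self.stalled:
            self.writeback = 0  # healthy device: the I/O completes
        return started


def throttle_global(writer, devs, thresh, max_loops=5):
    """Throttle a writer against a system-wide total (the reported behavior)."""
    for loops in range(max_loops):
        total = sum(d.dirty + d.writeback for d in devs)
        if total <= thresh:
            return loops        # under the limit: the writer may continue
        writer.write_some(32)   # can only push pages for its own device
    return None                 # never got under the limit: writer stays blocked


def throttle_per_dev(writer, thresh, max_loops=5):
    """Throttle a writer against its own device's counters only."""
    for loops in range(max_loops):
        if writer.dirty + writer.writeback <= thresh:
            return loops
        writer.write_some(32)
    return None


nfs = BackingDev("nfs", writeback=18272, stalled=True)  # server is down
ext3 = BackingDev("ext3", dirty=10)

blocked = throttle_global(ext3, [nfs, ext3], thresh=1000)
per_dev = throttle_per_dev(ext3, thresh=500)
print("global accounting:", "blocked" if blocked is None else "ok")
print("per-device accounting:", "blocked" if per_dev is None else "ok")
```

In this model the ext3 writer flushes its own few dirty pages at once, yet the
global total stays above the limit because the stalled NFS device never
completes its writeback, so the writer keeps looping. With per-device
accounting the same writer is not throttled at all, which is the direction
the per-device dirty threshold patches mentioned above take.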