* [PATCH 0/2] xfs: write back inodes during reclaim
@ 2011-04-07 6:19 Dave Chinner
2011-04-07 6:19 ` [PATCH 1/2] bdi: mark the bdi flusher busy when being forked Dave Chinner
2011-04-07 6:19 ` [PATCH 2/2] xfs: kick inode writeback when low on memory Dave Chinner
0 siblings, 2 replies; 10+ messages in thread
From: Dave Chinner @ 2011-04-07 6:19 UTC (permalink / raw)
To: xfs; +Cc: linux-fsdevel
This series fixes an OOM problem where VFS-only dirty inodes
accumulate on an XFS filesystem due to atime updates, eventually
triggering the OOM killer.
The first patch fixes a deadlock triggering bdi-flusher writeback
from memory reclaim when a new bdi-flusher thread needs to be forked
and no memory is available.
The second adds a bdi-flusher kick from XFS's inode cache shrinker
so that when memory is low the VFS starts writing back dirty inodes
so they can be reclaimed as they get cleaned rather than remaining
dirty and pinning the inode cache in memory.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2] bdi: mark the bdi flusher busy when being forked
2011-04-07 6:19 [PATCH 0/2] xfs: write back inodes during reclaim Dave Chinner
@ 2011-04-07 6:19 ` Dave Chinner
2011-04-11 18:34 ` Christoph Hellwig
2011-04-13 19:29 ` Alex Elder
2011-04-07 6:19 ` [PATCH 2/2] xfs: kick inode writeback when low on memory Dave Chinner
1 sibling, 2 replies; 10+ messages in thread
From: Dave Chinner @ 2011-04-07 6:19 UTC (permalink / raw)
To: xfs; +Cc: linux-fsdevel
From: Dave Chinner <dchinner@redhat.com>
Recent attempts to use writeback_inodes_sb_nr_if_idle() in XFS from
memory reclaim context have caused deadlocks because memory reclaim
can be called from a failed allocation while forking a flusher
thread. The shrinker then attempts to trigger writeback, the bdi
is considered idle because writeback is not yet in progress, and it
then deadlocks because bdi_queue_work() blocks waiting for the
BDI_pending bit to clear, which will never happen because that
requires the fork to complete.
To avoid this deadlock, consider writeback to be in progress if the
flusher thread is being created. This prevents reclaim from blocking
waiting for it to be forked and hence avoids the deadlock.
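For reference, the helper this check gates looks roughly like this (a
paraphrased sketch of the if_idle variant, not a verbatim copy of the
tree):

int writeback_inodes_sb_nr_if_idle(struct super_block *sb, unsigned long nr)
{
	/* only queue writeback work if the bdi isn't already busy */
	if (!writeback_in_progress(sb->s_bdi)) {
		down_read(&sb->s_umount);
		writeback_inodes_sb_nr(sb, nr);
		up_read(&sb->s_umount);
		return 1;
	}
	return 0;
}

With BDI_pending treated as writeback in progress, a reclaim caller now
gets a zero return while the flusher thread is still being forked,
instead of queueing work and blocking on it.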
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/fs-writeback.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index b5ed541..64e2aba 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -62,11 +62,14 @@ int nr_pdflush_threads;
* @bdi: the device's backing_dev_info structure.
*
* Determine whether there is writeback waiting to be handled against a
- * backing device.
+ * backing device. If the flusher thread is being created, then writeback is in
+ * the process of being started, so indicate that writeback is not idle at
+ * this point in time.
*/
int writeback_in_progress(struct backing_dev_info *bdi)
{
- return test_bit(BDI_writeback_running, &bdi->state);
+ return test_bit(BDI_writeback_running, &bdi->state) ||
+ test_bit(BDI_pending, &bdi->state);
}
static inline struct backing_dev_info *inode_to_bdi(struct inode *inode)
--
1.7.2.3
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/2] xfs: kick inode writeback when low on memory
2011-04-07 6:19 [PATCH 0/2] xfs: write back inodes during reclaim Dave Chinner
2011-04-07 6:19 ` [PATCH 1/2] bdi: mark the bdi flusher busy when being forked Dave Chinner
@ 2011-04-07 6:19 ` Dave Chinner
2011-04-11 18:36 ` Christoph Hellwig
2011-04-13 20:33 ` Alex Elder
1 sibling, 2 replies; 10+ messages in thread
From: Dave Chinner @ 2011-04-07 6:19 UTC (permalink / raw)
To: xfs; +Cc: linux-fsdevel
From: Dave Chinner <dchinner@redhat.com>
When the inode cache shrinker runs, we may have lots of dirty inodes queued up
in the VFS dirty queues that have not been expired. The typical case for this
with XFS is atime updates. The result is that a highly concurrent workload that
copies files and then later reads them (say to verify checksums) dirties all
the inodes again, even when relatime is used.
In a constrained memory environment, this results in a large number of dirty
inodes using all of the available memory and memory reclaim being unable to free
them as dirty inodes are considered active. This problem was uncovered by Chris
Mason during recent low memory stress testing.
The fix is to trigger VFS level writeback from the XFS inode cache shrinker if
there isn't already writeback in progress. This ensures that when we enter a
low memory situation we start cleaning inodes (via the flusher thread) on the
filesystem immediately, thereby making it more likely that we will be able to
evict those dirty inodes from the VFS in the near future.
The mechanism is not perfect - it only acts on the current filesystem, so if
all the dirty inodes are on a different filesystem it won't help. However, it
seems to be a valid assumption that the filesystem with lots of dirty inodes
is going to have the shrinker called very soon after the memory shortage
begins, so this shouldn't be an issue.
The other flaw is that there is no guarantee that the flusher thread will make
progress fast enough to clean the dirty inodes so they can be reclaimed in the
near future. However, this mechanism does improve the resilience of the
filesystem under the test conditions - instead of reliably triggering the OOM
killer 20 minutes into the stress test, it took more than 6 hours before it
happened.
This small addition definitely improves the low memory resilience of XFS on
this type of workload, and best of all it has no impact on performance when
memory is not constrained.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/linux-2.6/xfs_sync.c | 11 +++++++++++
1 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
index 9ad9560..c240d46 100644
--- a/fs/xfs/linux-2.6/xfs_sync.c
+++ b/fs/xfs/linux-2.6/xfs_sync.c
@@ -1038,6 +1038,17 @@ xfs_reclaim_inode_shrink(
if (!(gfp_mask & __GFP_FS))
return -1;
+ /*
+ * make sure the VFS is cleaning inodes so they can be pruned
+ * and marked for reclaim in the XFS inode cache. If we don't
+ * do this the VFS can accumulate dirty inodes and we can OOM
+ * before they are cleaned by the periodic VFS writeback.
+ *
+ * This takes VFS level locks, so we can only do this after
+ * the __GFP_FS check, otherwise lockdep gets really unhappy.
+ */
+ writeback_inodes_sb_nr_if_idle(mp->m_super, nr_to_scan);
+
xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT,
&nr_to_scan);
/* terminate if we don't exhaust the scan */
--
1.7.2.3
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] bdi: mark the bdi flusher busy when being forked
2011-04-07 6:19 ` [PATCH 1/2] bdi: mark the bdi flusher busy when being forked Dave Chinner
@ 2011-04-11 18:34 ` Christoph Hellwig
2011-04-13 19:29 ` Alex Elder
1 sibling, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2011-04-11 18:34 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs, linux-fsdevel
On Thu, Apr 07, 2011 at 04:19:55PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Recent attempts to use writeback_inodes_sb_nr_if_idle() in XFS from
> memory reclaim context have caused deadlocks because memory reclaim
> can be called from a failed allocation while forking a flusher
> thread. The shrinker then attempts to trigger writeback, the bdi
> is considered idle because writeback is not yet in progress, and it
> then deadlocks because bdi_queue_work() blocks waiting for the
> BDI_pending bit to clear, which will never happen because that
> requires the fork to complete.
>
> To avoid this deadlock, consider writeback to be in progress if the
> flusher thread is being created. This prevents reclaim from blocking
> waiting for it to be forked and hence avoids the deadlock.
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] xfs: kick inode writeback when low on memory
2011-04-07 6:19 ` [PATCH 2/2] xfs: kick inode writeback when low on memory Dave Chinner
@ 2011-04-11 18:36 ` Christoph Hellwig
2011-04-11 21:14 ` Dave Chinner
2011-04-13 20:33 ` Alex Elder
1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2011-04-11 18:36 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs, linux-fsdevel
How do you produce so many atime-dirty inodes? With relatime we
should have cut down on the requirement for those a lot.
Do you have traces that show if we're kicking off additional data
writeback this way too, or just pushing timestamp updates into
the AIL?
Either way the actual patch looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] xfs: kick inode writeback when low on memory
2011-04-11 18:36 ` Christoph Hellwig
@ 2011-04-11 21:14 ` Dave Chinner
0 siblings, 0 replies; 10+ messages in thread
From: Dave Chinner @ 2011-04-11 21:14 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs, linux-fsdevel
On Mon, Apr 11, 2011 at 02:36:53PM -0400, Christoph Hellwig wrote:
> How do you produce so many atime-dirty inodes? With relatime we
> should have cut down on the requirement for those a lot.
Copy a bunch of files, then md5sum them.
The copy modifies c/mtime; the md5sum read then sees that mtime is
younger than atime, so relatime updates atime and dirties the inode
again (a sketch of that rule follows the transcript below). i.e.:
$ touch foo
$ stat foo
File: `foo'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: fe02h/65026d Inode: 150756489 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ dave) Gid: ( 1000/ dave)
Access: 2011-04-12 07:08:24.668542636 +1000
Modify: 2011-04-12 07:08:24.668542636 +1000
Change: 2011-04-12 07:08:24.668542636 +1000
$ cp README foo
cp: overwrite `foo'? y
$ stat foo
File: `foo'
Size: 17525 Blocks: 40 IO Block: 4096 regular file
Device: fe02h/65026d Inode: 150756489 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ dave) Gid: ( 1000/ dave)
Access: 2011-04-12 07:08:24.668542636 +1000
Modify: 2011-04-12 07:08:44.676108103 +1000
Change: 2011-04-12 07:08:44.676108103 +1000
$ md5sum foo
9eb709847626f3663ea66121f10a27d7 foo
$ stat foo
File: `foo'
Size: 17525 Blocks: 40 IO Block: 4096 regular file
Device: fe02h/65026d Inode: 150756489 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ dave) Gid: ( 1000/ dave)
Access: 2011-04-12 07:09:00.223770431 +1000
Modify: 2011-04-12 07:08:44.676108103 +1000
Change: 2011-04-12 07:08:44.676108103 +1000
$
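The relatime rule boils down to roughly the following (a standalone
paraphrase for illustration, not the kernel source; the helper name is
made up):

#include <stdbool.h>
#include <time.h>

/*
 * atime is only updated on read if it is older than mtime or ctime,
 * or if it has not been updated for at least a day.
 */
bool relatime_needs_update(time_t i_atime, time_t i_mtime,
			   time_t i_ctime, time_t now)
{
	if (i_mtime >= i_atime)			/* mtime younger than atime */
		return true;
	if (i_ctime >= i_atime)			/* ctime younger than atime */
		return true;
	if (now - i_atime >= 24 * 60 * 60)	/* atime at least a day old */
		return true;
	return false;
}

So after the copy bumps m/ctime, the very next read still dirties the
inode even with relatime, which is where all the VFS-dirty inodes come
from.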
> Do you have traces that show if we're kicking off additional data
> writeback this way too, or just pushing timestamp updates into
> the AIL?
For the test workload, it just pushes timestamp updates into the AIL
as that is the only thing that is dirtying the inodes when the OOM
occurs. In other situations, I have no evidence either way, but
I have not noticed any performance changes from a high level.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] bdi: mark the bdi flusher busy when being forked
2011-04-07 6:19 ` [PATCH 1/2] bdi: mark the bdi flusher busy when being forked Dave Chinner
2011-04-11 18:34 ` Christoph Hellwig
@ 2011-04-13 19:29 ` Alex Elder
1 sibling, 0 replies; 10+ messages in thread
From: Alex Elder @ 2011-04-13 19:29 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs, linux-fsdevel
On Thu, 2011-04-07 at 16:19 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Recent attempts to use writeback_inodes_sb_nr_if_idle() in XFS from
> memory reclaim context have caused deadlocks because memory reclaim
> can be called from a failed allocation while forking a flusher
> thread. The shrinker then attempts to trigger writeback, the bdi
> is considered idle because writeback is not yet in progress, and it
> then deadlocks because bdi_queue_work() blocks waiting for the
> BDI_pending bit to clear, which will never happen because that
> requires the fork to complete.
>
> To avoid this deadlock, consider writeback to be in progress if the
> flusher thread is being created. This prevents reclaim from blocking
> waiting for it to be forked and hence avoids the deadlock.
I don't believe it matters, but BDI_pending is also set
while a writeback flusher thread is being shut down.
In any case, a handy use of that flag bit.
Reviewed-by: Alex Elder <aelder@sgi.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] xfs: kick inode writeback when low on memory
2011-04-07 6:19 ` [PATCH 2/2] xfs: kick inode writeback when low on memory Dave Chinner
2011-04-11 18:36 ` Christoph Hellwig
@ 2011-04-13 20:33 ` Alex Elder
2011-04-14 5:08 ` Dave Chinner
1 sibling, 1 reply; 10+ messages in thread
From: Alex Elder @ 2011-04-13 20:33 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs, linux-fsdevel
On Thu, 2011-04-07 at 16:19 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> When the inode cache shrinker runs, we may have lots of dirty inodes queued up
> in the VFS dirty queues that have not been expired. The typical case for this
> with XFS is atime updates. The result is that a highly concurrent workload that
> copies files and then later reads them (say to verify checksums) dirties all
> the inodes again, even when relatime is used.
>
> In a constrained memory environment, this results in a large number of dirty
> inodes using all of the available memory and memory reclaim being unable to free
> them as dirty inodes are considered active. This problem was uncovered by Chris
> Mason during recent low memory stress testing.
>
> The fix is to trigger VFS level writeback from the XFS inode cache shrinker if
> there isn't already writeback in progress. This ensures that when we enter a
> low memory situation we start cleaning inodes (via the flusher thread) on the
> filesystem immediately, thereby making it more likely that we will be able to
> evict those dirty inodes from the VFS in the near future.
>
> The mechanism is not perfect - it only acts on the current filesystem, so if
> all the dirty inodes are on a different filesystem it won't help. However, it
> seems to be a valid assumption that the filesystem with lots of dirty inodes
> is going to have the shrinker called very soon after the memory shortage
> begins, so this shouldn't be an issue.
>
> The other flaw is that there is no guarantee that the flusher thread will make
> progress fast enough to clean the dirty inodes so they can be reclaimed in the
> near future. However, this mechanism does improve the resilience of the
> filesystem under the test conditions - instead of reliably triggering the OOM
> killer 20 minutes into the stress test, it took more than 6 hours before it
> happened.
>
> This small addition definitely improves the low memory resilience of XFS on
> this type of workload, and best of all it has no impact on performance when
> memory is not constrained.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
Looks good to me.
Reviewed-by: Alex Elder <aelder@sgi.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] xfs: kick inode writeback when low on memory
2011-04-13 20:33 ` Alex Elder
@ 2011-04-14 5:08 ` Dave Chinner
2011-04-15 8:09 ` Christoph Hellwig
0 siblings, 1 reply; 10+ messages in thread
From: Dave Chinner @ 2011-04-14 5:08 UTC (permalink / raw)
To: Alex Elder; +Cc: xfs, linux-fsdevel
On Wed, Apr 13, 2011 at 03:33:42PM -0500, Alex Elder wrote:
> On Thu, 2011-04-07 at 16:19 +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > When the inode cache shrinker runs, we may have lots of dirty inodes queued up
> > in the VFS dirty queues that have not been expired. The typical case for this
> > with XFS is atime updates. The result is that a highly concurrent workload that
> > copies files and then later reads them (say to verify checksums) dirties all
> > the inodes again, even when relatime is used.
> >
> > In a constrained memory environment, this results in a large number of dirty
> > inodes using all of the available memory and memory reclaim being unable to free
> > them as dirty inodes are considered active. This problem was uncovered by Chris
> > Mason during recent low memory stress testing.
> >
> > The fix is to trigger VFS level writeback from the XFS inode cache shrinker if
> > there isn't already writeback in progress. This ensures that when we enter a
> > low memory situation we start cleaning inodes (via the flusher thread) on the
> > filesystem immediately, thereby making it more likely that we will be able to
> > evict those dirty inodes from the VFS in the near future.
> >
> > The mechanism is not perfect - it only acts on the current filesystem, so if
> > all the dirty inodes are on a different filesystem it won't help. However, it
> > seems to be a valid assumption that the filesystem with lots of dirty inodes
> > is going to have the shrinker called very soon after the memory shortage
> > begins, so this shouldn't be an issue.
> >
> > The other flaw is that there is no guarantee that the flusher thread will make
> > progress fast enough to clean the dirty inodes so they can be reclaimed in the
> > near future. However, this mechanism does improve the resilience of the
> > filesystem under the test conditions - instead of reliably triggering the OOM
> > killer 20 minutes into the stress test, it took more than 6 hours before it
> > happened.
> >
> > This small addition definitely improves the low memory resilience of XFS on
> > this type of workload, and best of all it has no impact on performance when
> > memory is not constrained.
> >
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
>
> Looks good to me.
>
> Reviewed-by: Alex Elder <aelder@sgi.com>
Unfortunately, we simply can't take the s_umount lock in reclaim
context. So further hackery is going to be required here - I think
that writeback_inodes_sb_nr_if_idle() needs to use trylocks. If the
s_umount lock is taken in write mode, then it's pretty certain that
the sb is busy....
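i.e. something along these lines (an untested sketch of the trylock
idea, not a tested patch):

int writeback_inodes_sb_nr_if_idle(struct super_block *sb, unsigned long nr)
{
	if (writeback_in_progress(sb->s_bdi))
		return 0;
	/* never sleep on s_umount from reclaim context - just skip */
	if (!down_read_trylock(&sb->s_umount))
		return 0;
	writeback_inodes_sb_nr(sb, nr);
	up_read(&sb->s_umount);
	return 1;
}

That way anything holding s_umount for write (e.g. a racing unmount)
just makes the shrinker skip the writeback kick for that pass. The
lockdep report that prompted this: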
[ 2226.939859] =================================
[ 2226.940026] [ INFO: inconsistent lock state ]
[ 2226.940026] 2.6.39-rc3-dgc+ #1162
[ 2226.940026] ---------------------------------
[ 2226.940026] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
[ 2226.940026] diff/23704 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 2226.940026] (&type->s_umount_key#23){+++++?}, at: [<ffffffff81191bf0>] writeback_inodes_sb_nr_if_idle+0x50/0x80
[ 2226.940026] {RECLAIM_FS-ON-W} state was registered at:
[ 2226.940026] [<ffffffff810c06a7>] mark_held_locks+0x67/0x90
[ 2226.940026] [<ffffffff810c0796>] lockdep_trace_alloc+0xc6/0x100
[ 2226.940026] [<ffffffff8115fec9>] kmem_cache_alloc+0x39/0x1e0
[ 2226.940026] [<ffffffff814afce7>] kmem_zone_alloc+0x77/0xf0
[ 2226.940026] [<ffffffff814afd7e>] kmem_zone_zalloc+0x1e/0x50
[ 2226.940026] [<ffffffff814a5f51>] _xfs_trans_alloc+0x31/0x80
[ 2226.940026] [<ffffffff814a1b74>] xfs_log_sbcount+0x84/0xf0
[ 2226.940026] [<ffffffff814a26be>] xfs_unmountfs+0xde/0x1a0
[ 2226.940026] [<ffffffff814bd466>] xfs_fs_put_super+0x46/0x80
[ 2226.940026] [<ffffffff8116cb92>] generic_shutdown_super+0x72/0x100
[ 2226.940026] [<ffffffff8116cc51>] kill_block_super+0x31/0x80
[ 2226.940026] [<ffffffff8116d415>] deactivate_locked_super+0x45/0x60
[ 2226.940026] [<ffffffff8116e10a>] deactivate_super+0x4a/0x70
[ 2226.940026] [<ffffffff8118951c>] mntput_no_expire+0xec/0x140
[ 2226.940026] [<ffffffff81189a08>] sys_umount+0x78/0x3c0
[ 2226.940026] [<ffffffff81b76c82>] system_call_fastpath+0x16/0x1b
[ 2226.940026] irq event stamp: 2767751
[ 2226.940026] hardirqs last enabled at (2767751): [<ffffffff810ee0b6>] __call_rcu+0xa6/0x190
[ 2226.940026] hardirqs last disabled at (2767750): [<ffffffff810ee05a>] __call_rcu+0x4a/0x190
[ 2226.940026] softirqs last enabled at (2758484): [<ffffffff8108d1a3>] __do_softirq+0x143/0x220
[ 2226.940026] softirqs last disabled at (2758471): [<ffffffff81b77e9c>] call_softirq+0x1c/0x30
[ 2226.940026]
[ 2226.940026] other info that might help us debug this:
[ 2226.940026] 3 locks held by diff/23704:
[ 2226.940026] #0: (xfs_iolock_active){++++++}, at: [<ffffffff81487408>] xfs_ilock+0x138/0x190
[ 2226.940026] #1: (&mm->mmap_sem){++++++}, at: [<ffffffff81b7258b>] do_page_fault+0xeb/0x4f0
[ 2226.940026] #2: (shrinker_rwsem){++++..}, at: [<ffffffff8112cb6d>] shrink_slab+0x3d/0x1a0
[ 2226.940026]
[ 2226.940026] stack backtrace:
[ 2226.940026] Pid: 23704, comm: diff Not tainted 2.6.39-rc3-dgc+ #1162
[ 2226.940026] Call Trace:
[ 2226.940026] [<ffffffff810bf5fa>] print_usage_bug+0x18a/0x190
[ 2226.940026] [<ffffffff8104982f>] ? save_stack_trace+0x2f/0x50
[ 2226.940026] [<ffffffff810bf770>] ? print_irq_inversion_bug+0x170/0x170
[ 2226.940026] [<ffffffff810c055e>] mark_lock+0x35e/0x440
[ 2226.940026] [<ffffffff810c1227>] __lock_acquire+0x447/0x14b0
[ 2226.940026] [<ffffffff81065ed8>] ? pvclock_clocksource_read+0x58/0xd0
[ 2226.940026] [<ffffffff814a84c8>] ? xfs_ail_push_all+0x78/0x80
[ 2226.940026] [<ffffffff810650b9>] ? kvm_clock_read+0x19/0x20
[ 2226.940026] [<ffffffff81042bc9>] ? sched_clock+0x9/0x10
[ 2226.940026] [<ffffffff810aff15>] ? sched_clock_local+0x25/0x90
[ 2226.940026] [<ffffffff810c2344>] lock_acquire+0xb4/0x140
[ 2226.940026] [<ffffffff81191bf0>] ? writeback_inodes_sb_nr_if_idle+0x50/0x80
[ 2226.940026] [<ffffffff81b76a16>] ? ftrace_call+0x5/0x2b
[ 2226.940026] [<ffffffff81b6d731>] down_read+0x51/0xa0
[ 2226.940026] [<ffffffff81191bf0>] ? writeback_inodes_sb_nr_if_idle+0x50/0x80
[ 2226.940026] [<ffffffff81191bf0>] writeback_inodes_sb_nr_if_idle+0x50/0x80
[ 2226.940026] [<ffffffff814bec18>] ? xfs_syncd_queue_reclaim+0x28/0xc0
[ 2226.940026] [<ffffffff814c02e9>] xfs_reclaim_inode_shrink+0x99/0xc0
[ 2226.940026] [<ffffffff8112cc67>] shrink_slab+0x137/0x1a0
[ 2226.940026] [<ffffffff8112e40c>] do_try_to_free_pages+0x20c/0x440
[ 2226.940026] [<ffffffff8112e7a2>] try_to_free_pages+0x92/0x130
[ 2226.940026] [<ffffffff81124826>] __alloc_pages_nodemask+0x496/0x930
[ 2226.940026] [<ffffffff810aff15>] ? sched_clock_local+0x25/0x90
[ 2226.940026] [<ffffffff81b76a16>] ? ftrace_call+0x5/0x2b
[ 2226.940026] [<ffffffff8115c169>] alloc_pages_vma+0x99/0x150
[ 2226.940026] [<ffffffff811681b3>] do_huge_pmd_anonymous_page+0x143/0x380
[ 2226.940026] [<ffffffff81b76a16>] ? ftrace_call+0x5/0x2b
[ 2226.940026] [<ffffffff81141b26>] handle_mm_fault+0x136/0x290
[ 2226.940026] [<ffffffff81b72601>] do_page_fault+0x161/0x4f0
[ 2226.940026] [<ffffffff810b0038>] ? sched_clock_cpu+0xb8/0x110
[ 2226.940026] [<ffffffff810c1116>] ? __lock_acquire+0x336/0x14b0
[ 2226.940026] [<ffffffff811275b8>] ? __do_page_cache_readahead+0x208/0x2b0
[ 2226.940026] [<ffffffff81065ed8>] ? pvclock_clocksource_read+0x58/0xd0
[ 2226.940026] [<ffffffff816d527d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 2226.940026] [<ffffffff81b6f265>] page_fault+0x25/0x30
[ 2226.940026] [<ffffffff8111bed4>] ? file_read_actor+0x114/0x1d0
[ 2226.940026] [<ffffffff8111bde1>] ? file_read_actor+0x21/0x1d0
[ 2226.940026] [<ffffffff8111dfad>] generic_file_aio_read+0x35d/0x7b0
[ 2226.940026] [<ffffffff814b769e>] xfs_file_aio_read+0x15e/0x2e0
[ 2226.940026] [<ffffffff8116a4d0>] ? do_sync_write+0x120/0x120
[ 2226.940026] [<ffffffff8116a5aa>] do_sync_read+0xda/0x120
[ 2226.940026] [<ffffffff8169aeee>] ? security_file_permission+0x8e/0x90
[ 2226.940026] [<ffffffff8116acdd>] vfs_read+0xcd/0x180
[ 2226.940026] [<ffffffff8116ae94>] sys_read+0x54/0xa0
[ 2226.940026] [<ffffffff81b76c82>] system_call_fastpath+0x16/0x1b
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] xfs: kick inode writeback when low on memory
2011-04-14 5:08 ` Dave Chinner
@ 2011-04-15 8:09 ` Christoph Hellwig
0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2011-04-15 8:09 UTC (permalink / raw)
To: Dave Chinner; +Cc: Alex Elder, linux-fsdevel, xfs
On Thu, Apr 14, 2011 at 03:08:46PM +1000, Dave Chinner wrote:
> Unfortunately, we simply can't take the s_umount lock in reclaim
> context. So further hackery is going to be required here - I think
> that writeback_inodes_sb_nr_if_idle() needs to use trylocks. If the
> s_umount lock is taken in write mode, then it's pretty certain that
> the sb is busy....
http://thread.gmane.org/gmane.linux.file-systems/48373/focus=48628
^ permalink raw reply [flat|nested] 10+ messages in thread
Thread overview: 10+ messages
2011-04-07 6:19 [PATCH 0/2] xfs: write back inodes during reclaim Dave Chinner
2011-04-07 6:19 ` [PATCH 1/2] bdi: mark the bdi flusher busy when being forked Dave Chinner
2011-04-11 18:34 ` Christoph Hellwig
2011-04-13 19:29 ` Alex Elder
2011-04-07 6:19 ` [PATCH 2/2] xfs: kick inode writeback when low on memory Dave Chinner
2011-04-11 18:36 ` Christoph Hellwig
2011-04-11 21:14 ` Dave Chinner
2011-04-13 20:33 ` Alex Elder
2011-04-14 5:08 ` Dave Chinner
2011-04-15 8:09 ` Christoph Hellwig