* [regression,bisected] 2.6.32.12: find(1) on xfs causes OOM
From: Peter Palfrader @ 2010-05-03 11:54 UTC
To: linux-kernel; +Cc: stable, xfs
Hi,
I have an xfs filesystem in a KVM domain with 512megs of memory and 2 gigs of
swap.
The filesystem is 750g in size, of which some 500g are in use in about 6
million files. (This XFS filesystem is exported via nfs4. I haven't tested if
this makes any difference.)
Starting in 2.6.32.12 running something like "find | wc -l" on this
filesystem's mountpoint causes the OOM killer to kill off most of the
system. (See kern.log[1])
With 2.6.32.11 the system does not behave like this.
Bisecting turned up the following commit. Reverting it in 2.6.32.12
also results in a system that works.
| 9e1e9675fb29c0e94a7c87146138aa2135feba2f is first bad commit
| commit 9e1e9675fb29c0e94a7c87146138aa2135feba2f
| Author: Dave Chinner <david@fromorbit.com>
| Date: Fri Mar 12 09:42:10 2010 +1100
|
| xfs: reclaim all inodes by background tree walks
|
| commit 57817c68229984818fea9e614d6f95249c3fb098 upstream
|
| We cannot do direct inode reclaim without taking the flush lock to
| ensure that we do not reclaim an inode under IO. We check the inode
| is clean before doing direct reclaim, but this is not good enough
| because the inode flush code marks the inode clean once it has
| copied the in-core dirty state to the backing buffer.
|
| It is the flush lock that determines whether the inode is still
| under IO, even though it is marked clean, and the inode is still
| required at IO completion so we can't reclaim it even though it is
| clean in core. Hence the requirement that we need to take the flush
| lock even on clean inodes because this guarantees that the inode
| writeback IO has completed and it is safe to reclaim the inode.
|
| With delayed write inode flushing, we could end up waiting a long
| time on the flush lock even for a clean inode. The background
| reclaim already handles this efficiently, so avoid all the problems
| by killing the direct reclaim path altogether.
|
| Signed-off-by: Dave Chinner <david@fromorbit.com>
| Reviewed-by: Christoph Hellwig <hch@lst.de>
| Signed-off-by: Alex Elder <aelder@sgi.com>
| Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
| index a82a93d..ea7a59a 100644
| --- a/fs/xfs/linux-2.6/xfs_super.c
| +++ b/fs/xfs/linux-2.6/xfs_super.c
| @@ -953,16 +953,14 @@ xfs_fs_destroy_inode(
|      ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIM));
|
|      /*
| -     * If we have nothing to flush with this inode then complete the
| -     * teardown now, otherwise delay the flush operation.
| +     * We always use background reclaim here because even if the
| +     * inode is clean, it still may be under IO and hence we have
| +     * to take the flush lock. The background reclaim path handles
| +     * this more efficiently than we can here, so simply let background
| +     * reclaim tear down all inodes.
|       */
| -    if (!xfs_inode_clean(ip)) {
| -        xfs_inode_set_reclaim_tag(ip);
| -        return;
| -    }
| -
|  out_reclaim:
| -    xfs_ireclaim(ip);
| +    xfs_inode_set_reclaim_tag(ip);
|  }
|
|  /*
Cheers,
Peter
1. http://asteria.noreply.org/~weasel/volatile/2010-05-03-Aju29kSrm0A/kern.log
2. http://asteria.noreply.org/~weasel/volatile/2010-05-03-Aju29kSrm0A/config-2.6.32.12-dsa-amd64
--
                           |  .''`.   ** Debian GNU/Linux **
      Peter Palfrader      | : :' :       The universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: [regression,bisected] 2.6.32.12: find(1) on xfs causes OOM
From: Dave Chinner @ 2010-05-03 12:36 UTC
To: Peter Palfrader, linux-kernel, xfs, stable
On Mon, May 03, 2010 at 01:54:38PM +0200, Peter Palfrader wrote:
> Hi,
>
> I have an xfs filesystem in a KVM domain with 512megs of memory and 2 gigs of
> swap.
>
> The filesystem is 750g in size, of which some 500g are in use in about 6
> million files. (This XFS filesystem is exported via nfs4. I haven't tested if
> this makes any difference.)
>
> Starting in 2.6.32.12 running something like "find | wc -l" on this
> filesystem's mountpoint causes the OOM killer to kill off most of the
> system. (See kern.log[1])
Known problem.
As a workaround, you can increase the frequency at which xfssyncd
runs, so that the interval between background reclaim runs is
shorter than the default 30s.
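A minimal sketch of that workaround, assuming the xfssyncd interval is
exposed as the fs.xfs.xfssyncd_centisecs sysctl (units of 1/100 s, so
the 30s default corresponds to 3000). The 5 second target and the
helper name are illustrative assumptions, not tested recommendations:

```shell
#!/bin/sh
# Convert an interval in whole seconds to centiseconds, the unit that
# fs.xfs.xfssyncd_centisecs is assumed to use.
secs_to_centisecs() {
    echo $(( $1 * 100 ))
}

# Hypothetical target: run xfssyncd every 5s instead of every 30s.
interval=$(secs_to_centisecs 5)

# Print the command instead of running it; applying it needs root.
echo "sysctl -w fs.xfs.xfssyncd_centisecs=${interval}"
```

A setting made this way does not survive a reboot, and a shorter
interval only makes background reclaim run more often; it does not
replace the actual fix.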
> With 2.6.32.11 the system does not behave like this.
>
> Bisecting turned up the following commit. Reverting it in 2.6.32.12
> also results in a system that works.
>
> | 9e1e9675fb29c0e94a7c87146138aa2135feba2f is first bad commit
> | commit 9e1e9675fb29c0e94a7c87146138aa2135feba2f
> | Author: Dave Chinner <david@fromorbit.com>
> | Date: Fri Mar 12 09:42:10 2010 +1100
> |
> | xfs: reclaim all inodes by background tree walks
Reverting this leaves you running with a subtly altered and
completely untested reclaim path that I'm not sure does the right
thing in all situations. I wouldn't run that revert on my machines,
nor recommend it for anyone else. But it's up to you if you want to
run it on your machines....
The fix for this problem only got to mainline a couple of days ago.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=9bf729c0af67897ea8498ce17c29b0683f7f2028
I've got to backport it to the stable kernel tree so the next stable
kernel should fix this.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [regression,bisected] 2.6.32.12: find(1) on xfs causes OOM
From: Peter Palfrader @ 2010-05-03 12:44 UTC
To: Dave Chinner; +Cc: stable, linux-kernel, xfs
On Mon, 03 May 2010, Dave Chinner wrote:
> > Starting in 2.6.32.12 running something like "find | wc -l" on this
> > filesystem's mountpoint causes the OOM killer to kill off most of the
> > system. (See kern.log[1])
>
> Known problem.
> The fix for this problem only got to mainline a couple of days ago.
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=9bf729c0af67897ea8498ce17c29b0683f7f2028
>
> I've got to backport it to the stable kernel tree so the next stable
> kernel should fix this.
Thanks, I'll stay on .11 on that machine for now then.