From: Jeff Layton <jlayton@poochiereds.net>
To: Dave Chinner <david@fromorbit.com>
Cc: Hugh Dickins <hughd@google.com>, angelo <angelo70@gmail.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@infradead.org>,
Andi Kleen <andi@firstfloor.org>, Eryu Guan <eguan@redhat.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH] mm: fix cpu hangs on truncating last page of a 16t sparse file
Date: Sun, 27 Sep 2015 19:56:55 -0400 [thread overview]
Message-ID: <20150927195655.18e20003@tlielax.poochiereds.net> (raw)
In-Reply-To: <20150927232645.GW3902@dastard>
On Mon, 28 Sep 2015 09:26:45 +1000
Dave Chinner <david@fromorbit.com> wrote:
> On Sun, Sep 27, 2015 at 10:59:33AM -0700, Hugh Dickins wrote:
> > On Sun, 27 Sep 2015, angelo wrote:
> > > On 27/09/2015 03:36, Hugh Dickins wrote:
> > > > Let's Cc linux-fsdevel, who will be more knowledgable.
> > > >
> > > > On Sun, 27 Sep 2015, angelo wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > running xfstests, generic 308 on whatever 32bit arch is possible
> > > > > to observe cpu to hang near 100% on unlink.
> >
> > I have since tried to repeat your result, but generic/308 on 32-bit just
> > skipped the test for me. I didn't investigate why: it's quite possible
> > that I had a leftover 64-bit executable in the path that it tried to use,
> > but didn't show the relevant error message.
> >
> > I did verify your result with a standalone test; and that proves that
> > nobody has actually been using such files in practice before you,
> > since unmounting the xfs filesystem would hang in the same way if
> > they didn't unlink them.
>
> It used to work - this is a regression. Just because nobody has
> reported it recently simply means nobody has run xfstests on 32 bit
> storage recently. There are 32 bit systems out there that expect
> this to work, and we've broken it.
>
> The regression was introduced in 3.11 by this commit:
>
> commit 5a7203947a1d9b6f3a00a39fda08c2466489555f
> Author: Lukas Czerner <lczerner@redhat.com>
> Date: Mon May 27 23:32:35 2013 -0400
>
> mm: teach truncate_inode_pages_range() to handle non page aligned ranges
>
> This commit changes truncate_inode_pages_range() so it can handle non
> page aligned regions of the truncate. Currently we can hit BUG_ON when
> the end of the range is not page aligned, but we can handle unaligned
> start of the range.
>
> Being able to handle non page aligned regions of the page can help file
> system punch_hole implementations and save some work, because once we're
> holding the page we might as well deal with it right away.
>
> In previous commits we've changed ->invalidatepage() prototype to accept
> 'length' argument to be able to specify range to invalidate. No we can
> use that new ability in truncate_inode_pages_range().
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Hugh Dickins <hughd@google.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
>
>
> > > > > The test removes a sparse file of length 16tera where only the last
> > > > > 4096 bytes block is mapped.
> > > > > At line 265 of truncate.c there is a
> > > > > if (index >= end)
> > > > > break;
> > > > > But if index is, as in this case, a 4294967295, it match -1 used as
> > > > > eof. Hence the cpu loops 100% just after.
> > > > >
> > > > That's odd. I've not checked your patch, because I think the problem
> > > > would go beyond truncate, and the root cause lie elsewhere.
> > > >
> > > > My understanding is that the 32-bit
> > > > #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
> > > > makes a page->index of -1 (or any "negative") impossible to reach.
>
> We've supported > 8TB files on 32 bit XFS file systems since
> since mid 2003:
>
> http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=d13d78f6b83eefbd90a6cac5c9fbe42560c6511e
>
> And it's been documented as such for a long time, too:
>
> http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch02s04.html
>
> (that was written, IIRC, back in 2007).
>
> i.e. whatever the definition says about MAX_LFS_FILESIZE being an
> 8TB limit on 32 bit is stale and has been for a very long time.
>
> > A surprise to me, and I expect to others, that 32-bit xfs is not
> > respecting MAX_LFS_FILESIZE: going its own way with 0xfff ffffffff
> > instead of 0x7ff ffffffff (on a PAGE_CACHE_SIZE 4096 system).
> >
> > MAX_LFS_FILESIZE has been defined that way ever since v2.5.4:
> > this is probably just an oversight from when xfs was later added
> > into the Linux tree.
>
> We supported >8 TB file offsets on 32 bit systems on 2.4 kernels
> with XFS, so it sounds like it was wrong even when it was first
> committed. Of course, XFS wasn't merged until 2.5.36, so I guess
> nobody realised... ;)
>
> > > But if s_maxbytes doesn't have to be greater than MAX_LFS_FILESIZE,
> > > i agree the issue should be fixed in layers above.
> >
> > There is a "filesystems should never set s_maxbytes larger than
> > MAX_LFS_FILESIZE" comment in fs/super.c, but unfortunately its
> > warning is written with just 64-bit in mind (testing for negative).
>
> Yup, introduced here:
>
> commit 42cb56ae2ab67390da34906b27bedc3f2ff1393b
> Author: Jeff Layton <jlayton@redhat.com>
> Date: Fri Sep 18 13:05:53 2009 -0700
>
> vfs: change sb->s_maxbytes to a loff_t
>
> sb->s_maxbytes is supposed to indicate the maximum size of a file that can
> exist on the filesystem. It's declared as an unsigned long long.
>
> And yes, that will never fire on a 32bit filesystem, because loff_t
> is a "long long" type....
>
Hmm...should we change that to something like this instead?
WARN(((unsigned long long)sb->s_maxbytes > (unsigned long long)MAX_LFS_FILESIZE,
"%s set sb->s_maxbytes to too large a value (0x%llx)\n", type->name, sb->s_maxbytes);
--
Jeff Layton <jlayton@poochiereds.net>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-09-27 23:56 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <560723F8.3010909@gmail.com>
2015-09-27 1:36 ` [PATCH] mm: fix cpu hangs on truncating last page of a 16t sparse file Hugh Dickins
2015-09-27 2:14 ` angelo
2015-09-27 2:21 ` angelo
2015-09-27 17:59 ` Hugh Dickins
2015-09-27 23:26 ` Dave Chinner
2015-09-27 23:56 ` Jeff Layton [this message]
2015-09-28 1:06 ` Dave Chinner
2015-09-28 17:03 ` Andi Kleen
2015-09-28 18:08 ` Hugh Dickins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150927195655.18e20003@tlielax.poochiereds.net \
--to=jlayton@poochiereds.net \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=angelo70@gmail.com \
--cc=david@fromorbit.com \
--cc=eguan@redhat.com \
--cc=hch@infradead.org \
--cc=hughd@google.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).