* Re: [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed
2013-04-12 10:26 [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed Jeff Liu
@ 2013-04-12 15:20 ` Michael L. Semon
2013-04-13 5:03 ` Michael L. Semon
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Michael L. Semon @ 2013-04-12 15:20 UTC (permalink / raw)
To: Jeff Liu; +Cc: xfs@oss.sgi.com
I'll try this patch tonight. Thanks!
BTW, after failures with CONFIG_LBDAF=n in previous xfstests, my kernels
should have CONFIG_LBDAF=y, but I could be wrong. I'll check this when
I get back to my test PC and test your patch with both settings.
Michael
On 04/12/2013 06:26 AM, Jeff Liu wrote:
> From: Jie Liu <jeff.liu@oracle.com>
>
> This patch fix the s_max_bytes to MAX_LFS_FILESIZE if the pre-calculated value is greater
> than it.
>
> Reported-by: Michael L. Semon <mlsemon35@gmail.com>
> Signed-off-by: Jie Liu <jeff.liu@oracle.com>
>
> ---
> fs/xfs/xfs_super.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index ea341ce..0644d61 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -585,6 +585,7 @@ xfs_max_file_offset(
> {
> unsigned int pagefactor = 1;
> unsigned int bitshift = BITS_PER_LONG - 1;
> + __uint64_t offset;
>
> /* Figure out maximum filesize, on Linux this can depend on
> * the filesystem blocksize (on 32 bit platforms).
> @@ -610,7 +611,10 @@ xfs_max_file_offset(
> # endif
> #endif
>
> - return (((__uint64_t)pagefactor) << bitshift) - 1;
> + offset = (((__uint64_t)pagefactor) << bitshift) - 1;
> +
> + /* Check against VM & VFS exposed limits */
> + return (offset > MAX_LFS_FILESIZE) ? MAX_LFS_FILESIZE : offset;
> }
>
> xfs_agnumber_t
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed
2013-04-12 10:26 [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed Jeff Liu
2013-04-12 15:20 ` Michael L. Semon
@ 2013-04-13 5:03 ` Michael L. Semon
2013-04-13 21:20 ` Michael L. Semon
2013-07-10 6:28 ` Jeff Liu
3 siblings, 0 replies; 9+ messages in thread
From: Michael L. Semon @ 2013-04-13 5:03 UTC (permalink / raw)
To: Jeff Liu; +Cc: xfs@oss.sgi.com
I'll have to test this yet more, but preliminary results on a patched
3.9-rc6-git-sgi-dave-crc kernel look good:
These were done on a 32-bit Pentium 4, BTW:
generic/308, in order of testing...
[F/F] CONFIG_LBDAF=n, without Liu MAX_LFS_FILESIZE patch: PASS
[T/F] CONFIG_LBDAF=y, without Liu patch: HANG, possible FS corruption
[T/T] CONFIG_LBDAF=y, with Liu patch: PASS
[F/T] CONFIG_LBDAF=n, with Liu patch: PASS
It was a surprise that the F/F case passed because it is somewhat in
conflict with your write-up. This will have to be tested more, though,
on the original testing hardware, with the original generic/308, so it's
not a full conflict yet.
The patch was first tested after the [T/F] case above, without creating
a new XFS filesystem first, and I got a soft oops (captured) and had to
do a SysRq reboot. Attempts to mount the partition again led to another
oops (not captured).
Tests on a new XFS filesystem came out fine.
This means I'll have to look at the aftermath of generic/308 a little
bit more, and report on it, too.
Good job so far!
Michael
[ 163.479270] ------------[ cut here ]------------
[ 163.480027] kernel BUG at fs/xfs/xfs_message.c:100!
[ 163.480027] invalid opcode: 0000 [#1]
[ 163.480027] Pid: 1039, comm: rm Not tainted 3.9.0-rc6+ #3 Dell
Computer Corporation Dimension 2350/07W080
[ 163.480027] EIP: 0060:[<c11ad904>] EFLAGS: 00010292 CPU: 0
[ 163.480027] EIP is at assfail+0x2b/0x2d
[ 163.480027] EAX: 00000057 EBX: ed2c2c80 ECX: 00000000 EDX: c16fe980
[ 163.480027] ESI: ecdcac00 EDI: 00000001 EBP: ea45deb4 ESP: ea45dea0
[ 163.480027] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 163.480027] CR0: 8005003b CR2: b765c000 CR3: 2ae8c000 CR4: 000007d0
[ 163.480027] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 163.480027] DR6: ffff0ff0 DR7: 00000400
[ 163.480027] Process rm (pid: 1039, ti=ea45c000 task=eaca9810
task.ti=ea45c000)
[ 163.480027] Stack:
[ 163.480027] 00000000 c167ee3c c166f028 c166efaa 00000159 ea45def0
c11b157b 00000000
[ 163.480027] 00000000 00000002 ea45def0 c118f229 00000000 ecad3000
ed2c2c80 ed2c2db8
[ 163.480027] ea45def0 ee9cee10 ed2c2c80 ed2c2db8 ea45df04 c11aeb59
ed2c2db8 c154b760
[ 163.480027] Call Trace:
[ 163.480027] [<c11b157b>] xfs_inactive+0x3d6/0x4ea
[ 163.480027] [<c118f229>] ? ftrace_raw_event_xfs_inode_class+0x88/0x90
[ 163.480027] [<c11aeb59>] xfs_fs_evict_inode+0x6c/0x8f
[ 163.480027] [<c10cf8a6>] evict+0x7a/0x148
[ 163.480027] [<c10d0131>] iput+0xcd/0x129
[ 163.480027] [<c10c85e7>] do_unlinkat+0x121/0x177
[ 163.480027] [<c10c8660>] sys_unlinkat+0x23/0x34
[ 163.480027] [<c1521c3b>] sysenter_do_call+0x12/0x22
[ 163.480027] Code: 55 89 e5 83 ec 14 3e 8d 74 26 00 89 4c 24 10 89 54
24 0c 89 44 24 08 c7 44 24 04 3c ee 67 c1 c7 04 24 00 00 00 00 e8 e9 fd
ff ff <0f> 0b 55 89 e5 83 ec 14 3e 8d 74 26 00 c7 44 24 10 01 00 00 00
[ 163.480027] EIP: [<c11ad904>] assfail+0x2b/0x2d SS:ESP 0068:ea45dea0
[ 163.514560] ---[ end trace 2a80fb79142bf578 ]---
Message from syslogd@plbearer at Fri Apr 12 23:23:10 2013 ...
plbearer kernel: [ 163.478205] XFS: Assertion failed:
ip->i_d.di_nextents == 0, file: fs/xfs/xfs_vnodeops.c, line: 345
Message from syslogd@plbearer at Fri Apr 12 23:23:10 2013 ...
plbearer kernel: [ 163.480027] EIP: [<c11ad904>] assfail+0x2b/0x2d
SS:ESP 0068:ea45dea0
Message from syslogd@plbearer at Fri Apr 12 23:23:10 2013 ...
plbearer kernel: [ 163.480027] Code: 55 89 e5 83 ec 14 3e 8d 74 26 00
89 4c 24 10 89 54 24 0c 89 44 24 08 c7 44 24 04 3c ee 67
c1 c7 04 24 00 00 00 00 e8 e9 fd ff ff <0f> 0b 55 89 e5 83 ec 14 3e 8d
74 26 00 c7 44 24 10 01 00 00 00
Message from syslogd@plbearer at Fri Apr 12 23:23:10 2013 ...
plbearer kernel: [ 163.480027] Call Trace:
Message from syslogd@plbearer at Fri Apr 12 23:23:10 2013 ...
plbearer kernel: [ 163.480027] Stack:
Message from syslogd@plbearer at Fri Apr 12 23:23:10 2013 ...
plbearer kernel: [ 163.480027] Process rm (pid: 1039, ti=ea45c000
task=eaca9810 task.ti=ea45c000)
- output mismatch (see /usr/src/xfs/xfstests/results/generic/308.out.bad)
--- tests/generic/308.out 2013-04-05 16:00:27.879187036 -0400
+++ /usr/src/xfs/xfstests/results/generic/308.out.bad
2013-04-12 23:23:10.528872994 -0400
@@ -1,2 +1,3 @@
QA output created by 308
Silence is golden
+./tests/generic/308: line 33: 1039 Segmentation fault exit
...
(Run 'diff -u tests/generic/308.out
/usr/src/xfs/xfstests/results/generic/308.out.bad' to see the entire diff)
umount: /tests/testdir: target is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
_check_xfs_filesystem: filesystem on /dev/sda5 is inconsistent (c) (see
/usr/src/xfs/xfstests/results/generic/308.full)
_check_xfs_filesystem: filesystem on /dev/sda5 is inconsistent (r) (see
/usr/src/xfs/xfstests/results/generic/308.full)
Ran: generic/308
Failures: generic/308
Failed 1 of 1 tests
On 04/12/2013 06:26 AM, Jeff Liu wrote:
> From: Jie Liu <jeff.liu@oracle.com>
>
> On 32-bit machine, the s_maxbytes is larger than the MAX_LFS_FILESIZE limits if CONFIG_LBDAF is
> not enabled. Hence it's possible to create a huge file via buffered-IO write with a given offset
> beyond this limitation. e.g.
>
> # block_size=4096
> # offset=$(((2**32 - 1) * $block_size))
> # xfs_io -f -c "pwrite $offset $block_size" /storage/test_file
>
> In this case, xfs_io will hang at the page writeback stage soon since the given offset would
> cause an overflow at xfs_vm_writepage():
>
> end_index = offset >> PAGE_CACHE_SHIFT;
> last_index = (offset - 1) >> PAGE_CACHE_SHIFT;
> if (page->index >= end_index) {
> unsigned offset_into_page = offset & (PAGE_CACHE_SIZE - 1);
>
> /*
> * Just skip the page if it is fully outside i_size, e.g. due
> * to a truncate operation that is in progress.
> */
> if (page->index >= end_index + 1 || offset_into_page == 0) {
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> unlock_page(page);
> return 0;
> }
> end_index is unsigned long so that the max value is '2^32-1 = 4294967295', and it
> would be evaluated to the max value with the given offset(when writing the page offset
> up to s_max_bytes) for above test case. As a result, (page->index >= end_index + 1) is
> ok as (end_index + 1) is overflowed to ZERO.
>
> Actually, create a file as above on 32-bit machine should be failed with EFBIG error returned
> because there has strict check up at generic_write_checks() against the given offset with a
> *correct* s_max_bytes.
>
> This patch fix the s_max_bytes to MAX_LFS_FILESIZE if the pre-calculated value is greater
> than it.
>
> Reported-by: Michael L. Semon <mlsemon35@gmail.com>
> Signed-off-by: Jie Liu <jeff.liu@oracle.com>
>
> ---
> fs/xfs/xfs_super.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index ea341ce..0644d61 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -585,6 +585,7 @@ xfs_max_file_offset(
> {
> unsigned int pagefactor = 1;
> unsigned int bitshift = BITS_PER_LONG - 1;
> + __uint64_t offset;
>
> /* Figure out maximum filesize, on Linux this can depend on
> * the filesystem blocksize (on 32 bit platforms).
> @@ -610,7 +611,10 @@ xfs_max_file_offset(
> # endif
> #endif
>
> - return (((__uint64_t)pagefactor) << bitshift) - 1;
> + offset = (((__uint64_t)pagefactor) << bitshift) - 1;
> +
> + /* Check against VM & VFS exposed limits */
> + return (offset > MAX_LFS_FILESIZE) ? MAX_LFS_FILESIZE : offset;
> }
>
> xfs_agnumber_t
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed
2013-04-12 10:26 [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed Jeff Liu
2013-04-12 15:20 ` Michael L. Semon
2013-04-13 5:03 ` Michael L. Semon
@ 2013-04-13 21:20 ` Michael L. Semon
2013-04-16 5:40 ` Jeff Liu
2013-07-10 6:28 ` Jeff Liu
3 siblings, 1 reply; 9+ messages in thread
From: Michael L. Semon @ 2013-04-13 21:20 UTC (permalink / raw)
To: Jeff Liu; +Cc: xfs@oss.sgi.com
Update: My tests on my original hardware go exactly as they did in my
Pentium 4 test. xfstests shared/[0-9][0-9][0-9] and xfs/003 through
xfs/136 were run against it. No problems. Good job. I'm keeping the
patch.
My final version of the bug summary goes like this:
On a 32-bit x86 PC, with a Linux kernel that has CONFIG_LBDAF=y...
xfstests generic/308, by writing to a file at an address just before
2**32, causes the following conditions on an XFS filesystem:
1) CPU usage becomes very high,
2) The xfs_io process cannot be killed,
3) The best way to shut down the PC is through use of the magic SysRq keys.
4) Afterwards, attempts to mount the filesystem result in a soft oops.
5) After an `xfs_repair -L` on the filesystem, all is OK, other than for
what was lost by zeroing the log.
J. Liu wrote a patch that solves this problem, but he found the answers
with CONFIG_LBDAF=n, which is a condition for which xfstests generic/308
passes on the two test PCs used.
Tests were conducted on a Pentium III (kernel 3.9-rc4 with numerous SGI
patches) and on a Pentium 4 (kernel 3.9-rc6 with numerous SGI patches).
Could you verify these things by memory (no need to retest)?
a) With CONFIG_LBDAF=y, generic/308 caused filesystem corruption, and
b) With CONFIG_LBDAF=n, generic/308 passed the test.
c) Having CONFIG_LBDAF=n helped you to find the answers and write this
fine patch.
Otherwise, the conclusion is "I don't know how you got there, but you
got there. Good job! and thanks for finding the root cause of the problem."
Thanks again!
Michael
On 04/12/2013 06:26 AM, Jeff Liu wrote:
> From: Jie Liu <jeff.liu@oracle.com>
>
> On 32-bit machine, the s_maxbytes is larger than the MAX_LFS_FILESIZE limits if CONFIG_LBDAF is
> not enabled. Hence it's possible to create a huge file via buffered-IO write with a given offset
> beyond this limitation. e.g.
>
> # block_size=4096
> # offset=$(((2**32 - 1) * $block_size))
> # xfs_io -f -c "pwrite $offset $block_size" /storage/test_file
>
> In this case, xfs_io will hang at the page writeback stage soon since the given offset would
> cause an overflow at xfs_vm_writepage():
>
> end_index = offset >> PAGE_CACHE_SHIFT;
> last_index = (offset - 1) >> PAGE_CACHE_SHIFT;
> if (page->index >= end_index) {
> unsigned offset_into_page = offset & (PAGE_CACHE_SIZE - 1);
>
> /*
> * Just skip the page if it is fully outside i_size, e.g. due
> * to a truncate operation that is in progress.
> */
> if (page->index >= end_index + 1 || offset_into_page == 0) {
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> unlock_page(page);
> return 0;
> }
> end_index is unsigned long so that the max value is '2^32-1 = 4294967295', and it
> would be evaluated to the max value with the given offset(when writing the page offset
> up to s_max_bytes) for above test case. As a result, (page->index >= end_index + 1) is
> ok as (end_index + 1) is overflowed to ZERO.
>
> Actually, create a file as above on 32-bit machine should be failed with EFBIG error returned
> because there has strict check up at generic_write_checks() against the given offset with a
> *correct* s_max_bytes.
>
> This patch fix the s_max_bytes to MAX_LFS_FILESIZE if the pre-calculated value is greater
> than it.
>
> Reported-by: Michael L. Semon <mlsemon35@gmail.com>
> Signed-off-by: Jie Liu <jeff.liu@oracle.com>
>
> ---
> fs/xfs/xfs_super.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index ea341ce..0644d61 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -585,6 +585,7 @@ xfs_max_file_offset(
> {
> unsigned int pagefactor = 1;
> unsigned int bitshift = BITS_PER_LONG - 1;
> + __uint64_t offset;
>
> /* Figure out maximum filesize, on Linux this can depend on
> * the filesystem blocksize (on 32 bit platforms).
> @@ -610,7 +611,10 @@ xfs_max_file_offset(
> # endif
> #endif
>
> - return (((__uint64_t)pagefactor) << bitshift) - 1;
> + offset = (((__uint64_t)pagefactor) << bitshift) - 1;
> +
> + /* Check against VM & VFS exposed limits */
> + return (offset > MAX_LFS_FILESIZE) ? MAX_LFS_FILESIZE : offset;
> }
>
> xfs_agnumber_t
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed
2013-04-13 21:20 ` Michael L. Semon
@ 2013-04-16 5:40 ` Jeff Liu
2013-04-16 5:55 ` Michael L. Semon
0 siblings, 1 reply; 9+ messages in thread
From: Jeff Liu @ 2013-04-16 5:40 UTC (permalink / raw)
To: Michael L. Semon; +Cc: xfs@oss.sgi.com
Hi Michael,
Thanks a lot for help verifying this fix and sorry for my too late
response since I have traveled to US two days ago.
On 04/14/2013 05:20 AM, Michael L. Semon wrote:
> Update: My tests on my original hardware go exactly as they did in my
> Pentium 4 test. xfstests shared/[0-9][0-9][0-9] and xfs/003 through
> xfs/136 were run against it. No problems. Good job. I'm keeping the
> patch.
>
> My final version of the bug summary goes like this:
>
> On a 32-bit x86 PC, with a Linux kernel that has CONFIG_LBDAF=y...
>
> xfstests generic/308, by writing to a file at an address just before
> 2**32, causes the following conditions on an XFS filesystem:
>
> 1) CPU usage becomes very high,
>
> 2) The xfs_io process cannot be killed,
>
> 3) The best way to shut down the PC is through use of the magic SysRq keys.
>
> 4) Afterwards, attempts to mount the filesystem result in a soft oops.
>
> 5) After an `xfs_repair -L` on the filesystem, all is OK, other than for
> what was lost by zeroing the log.
>
> J. Liu wrote a patch that solves this problem, but he found the answers
> with CONFIG_LBDAF=n, which is a condition for which xfstests generic/308
> passes on the two test PCs used.
Ooops! it's wrong. My answer is misleading, you can think that I drink
too much at that time.:( Actually, it quite the reverse, i.e. this
issue can be reproduced against 32-bit kernel with CONFIG_LBADF=y, this
is the default configuration of mine.
In this case, I observed that the s_maxbytes is larger than the
MAX_LFS_FILESIZE. Hence, the current patch is written to make sure that
the s_maxbytes should not beyond this limits.
For 32-bit kernel with CONFIG_LBADF=n, the s_maxbytes is just equal to
MAX_LFS_FILESIZE, so the test is works to me. BTW, I only verified this
fix upon the default mkfs options. i.e, 4k blocksize, that is, mkfs.xfs
/dev/sdX.
>
> Tests were conducted on a Pentium III (kernel 3.9-rc4 with numerous SGI
> patches) and on a Pentium 4 (kernel 3.9-rc6 with numerous SGI patches).
>
> Could you verify these things by memory (no need to retest)?
As I mentioned above.
> a) With CONFIG_LBDAF=y, generic/308 caused filesystem corruption, and
In this case, the operation should be denied with -EFBIG error returned
if trying to create a huge file.
>
> b) With CONFIG_LBDAF=n, generic/308 passed the test.
Even don't applying this patch, the test run passed for the default mkfs
setup.
>
> c) Having CONFIG_LBDAF=n helped you to find the answers and write this
> fine patch.
CONFIG_LBADF=y instead.
Thanks again!
-Jeff
>
> Otherwise, the conclusion is "I don't know how you got there, but you
> got there. Good job! and thanks for finding the root cause of the problem."
>
> Thanks again!
>
> Michael
>
> On 04/12/2013 06:26 AM, Jeff Liu wrote:
>> From: Jie Liu <jeff.liu@oracle.com>
>>
>> On 32-bit machine, the s_maxbytes is larger than the MAX_LFS_FILESIZE limits if CONFIG_LBDAF is
>> not enabled. Hence it's possible to create a huge file via buffered-IO write with a given offset
>> beyond this limitation. e.g.
>>
>> # block_size=4096
>> # offset=$(((2**32 - 1) * $block_size))
>> # xfs_io -f -c "pwrite $offset $block_size" /storage/test_file
>>
>> In this case, xfs_io will hang at the page writeback stage soon since the given offset would
>> cause an overflow at xfs_vm_writepage():
>>
>> end_index = offset >> PAGE_CACHE_SHIFT;
>> last_index = (offset - 1) >> PAGE_CACHE_SHIFT;
>> if (page->index >= end_index) {
>> unsigned offset_into_page = offset & (PAGE_CACHE_SIZE - 1);
>>
>> /*
>> * Just skip the page if it is fully outside i_size, e.g. due
>> * to a truncate operation that is in progress.
>> */
>> if (page->index >= end_index + 1 || offset_into_page == 0) {
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> unlock_page(page);
>> return 0;
>> }
>> end_index is unsigned long so that the max value is '2^32-1 = 4294967295', and it
>> would be evaluated to the max value with the given offset(when writing the page offset
>> up to s_max_bytes) for above test case. As a result, (page->index >= end_index + 1) is
>> ok as (end_index + 1) is overflowed to ZERO.
>>
>> Actually, create a file as above on 32-bit machine should be failed with EFBIG error returned
>> because there has strict check up at generic_write_checks() against the given offset with a
>> *correct* s_max_bytes.
>>
>> This patch fix the s_max_bytes to MAX_LFS_FILESIZE if the pre-calculated value is greater
>> than it.
>>
>> Reported-by: Michael L. Semon <mlsemon35@gmail.com>
>> Signed-off-by: Jie Liu <jeff.liu@oracle.com>
>>
>> ---
>> fs/xfs/xfs_super.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
>> index ea341ce..0644d61 100644
>> --- a/fs/xfs/xfs_super.c
>> +++ b/fs/xfs/xfs_super.c
>> @@ -585,6 +585,7 @@ xfs_max_file_offset(
>> {
>> unsigned int pagefactor = 1;
>> unsigned int bitshift = BITS_PER_LONG - 1;
>> + __uint64_t offset;
>>
>> /* Figure out maximum filesize, on Linux this can depend on
>> * the filesystem blocksize (on 32 bit platforms).
>> @@ -610,7 +611,10 @@ xfs_max_file_offset(
>> # endif
>> #endif
>>
>> - return (((__uint64_t)pagefactor) << bitshift) - 1;
>> + offset = (((__uint64_t)pagefactor) << bitshift) - 1;
>> +
>> + /* Check against VM & VFS exposed limits */
>> + return (offset > MAX_LFS_FILESIZE) ? MAX_LFS_FILESIZE : offset;
>> }
>>
>> xfs_agnumber_t
>>
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed
2013-04-16 5:40 ` Jeff Liu
@ 2013-04-16 5:55 ` Michael L. Semon
0 siblings, 0 replies; 9+ messages in thread
From: Michael L. Semon @ 2013-04-16 5:55 UTC (permalink / raw)
To: Jeff Liu; +Cc: xfs@oss.sgi.com
You're welcome. Thanks for the explanation. Now everything makes
logical sense.
I'll re-run the tests with different block sizes. The tests have been
run already with a) default mkfs options and b) with an external journal
and realtime device.
Your related patch, "xfs: don't return 0 if generic_segment_checks()
find nothing to write," has also been applied. The suggested test case
hasn't been tried, but the patch hasn't caused any additional problems,
either.
Anyway, I hope you had/have a good stay here in the US.
Michael
On 04/16/2013 01:40 AM, Jeff Liu wrote:
> Hi Michael,
>
> Thanks a lot for help verifying this fix and sorry for my too late
> response since I have traveled to US two days ago.
>
> On 04/14/2013 05:20 AM, Michael L. Semon wrote:
>> Update: My tests on my original hardware go exactly as they did in my
>> Pentium 4 test. xfstests shared/[0-9][0-9][0-9] and xfs/003 through
>> xfs/136 were run against it. No problems. Good job. I'm keeping the
>> patch.
>>
>> My final version of the bug summary goes like this:
>>
>> On a 32-bit x86 PC, with a Linux kernel that has CONFIG_LBDAF=y...
>>
>> xfstests generic/308, by writing to a file at an address just before
>> 2**32, causes the following conditions on an XFS filesystem:
>>
>> 1) CPU usage becomes very high,
>>
>> 2) The xfs_io process cannot be killed,
>>
>> 3) The best way to shut down the PC is through use of the magic SysRq keys.
>>
>> 4) Afterwards, attempts to mount the filesystem result in a soft oops.
>>
>> 5) After an `xfs_repair -L` on the filesystem, all is OK, other than for
>> what was lost by zeroing the log.
>>
>> J. Liu wrote a patch that solves this problem, but he found the answers
>> with CONFIG_LBDAF=n, which is a condition for which xfstests generic/308
>> passes on the two test PCs used.
> Ooops! it's wrong. My answer is misleading, you can think that I drink
> too much at that time.:( Actually, it quite the reverse, i.e. this
> issue can be reproduced against 32-bit kernel with CONFIG_LBADF=y, this
> is the default configuration of mine.
> In this case, I observed that the s_maxbytes is larger than the
> MAX_LFS_FILESIZE. Hence, the current patch is written to make sure that
> the s_maxbytes should not beyond this limits.
>
> For 32-bit kernel with CONFIG_LBADF=n, the s_maxbytes is just equal to
> MAX_LFS_FILESIZE, so the test is works to me. BTW, I only verified this
> fix upon the default mkfs options. i.e, 4k blocksize, that is, mkfs.xfs
> /dev/sdX.
>>
>> Tests were conducted on a Pentium III (kernel 3.9-rc4 with numerous SGI
>> patches) and on a Pentium 4 (kernel 3.9-rc6 with numerous SGI patches).
>>
>> Could you verify these things by memory (no need to retest)?
> As I mentioned above.
>> a) With CONFIG_LBDAF=y, generic/308 caused filesystem corruption, and
> In this case, the operation should be denied with -EFBIG error returned
> if trying to create a huge file.
>>
>> b) With CONFIG_LBDAF=n, generic/308 passed the test.
> Even don't applying this patch, the test run passed for the default mkfs
> setup.
>>
>> c) Having CONFIG_LBDAF=n helped you to find the answers and write this
>> fine patch.
> CONFIG_LBADF=y instead.
>
> Thanks again!
> -Jeff
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed
2013-04-12 10:26 [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed Jeff Liu
` (2 preceding siblings ...)
2013-04-13 21:20 ` Michael L. Semon
@ 2013-07-10 6:28 ` Jeff Liu
2013-07-10 6:48 ` Dave Chinner
3 siblings, 1 reply; 9+ messages in thread
From: Jeff Liu @ 2013-07-10 6:28 UTC (permalink / raw)
To: xfs@oss.sgi.com; +Cc: Michael L. Semon
Could anyone help to review this patch?
Thanks,
-Jeff
On 04/12/2013 06:26 PM, Jeff Liu wrote:
> From: Jie Liu <jeff.liu@oracle.com>
>
> On 32-bit machine, the s_maxbytes is larger than the MAX_LFS_FILESIZE limits if CONFIG_LBDAF is
> not enabled. Hence it's possible to create a huge file via buffered-IO write with a given offset
> beyond this limitation. e.g.
>
> # block_size=4096
> # offset=$(((2**32 - 1) * $block_size))
> # xfs_io -f -c "pwrite $offset $block_size" /storage/test_file
>
> In this case, xfs_io will hang at the page writeback stage soon since the given offset would
> cause an overflow at xfs_vm_writepage():
>
> end_index = offset >> PAGE_CACHE_SHIFT;
> last_index = (offset - 1) >> PAGE_CACHE_SHIFT;
> if (page->index >= end_index) {
> unsigned offset_into_page = offset & (PAGE_CACHE_SIZE - 1);
>
> /*
> * Just skip the page if it is fully outside i_size, e.g. due
> * to a truncate operation that is in progress.
> */
> if (page->index >= end_index + 1 || offset_into_page == 0) {
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> unlock_page(page);
> return 0;
> }
> end_index is unsigned long so that the max value is '2^32-1 = 4294967295', and it
> would be evaluated to the max value with the given offset(when writing the page offset
> up to s_max_bytes) for above test case. As a result, (page->index >= end_index + 1) is
> ok as (end_index + 1) is overflowed to ZERO.
>
> Actually, create a file as above on 32-bit machine should be failed with EFBIG error returned
> because there has strict check up at generic_write_checks() against the given offset with a
> *correct* s_max_bytes.
>
> This patch fix the s_max_bytes to MAX_LFS_FILESIZE if the pre-calculated value is greater
> than it.
>
> Reported-by: Michael L. Semon <mlsemon35@gmail.com>
> Signed-off-by: Jie Liu <jeff.liu@oracle.com>
>
> ---
> fs/xfs/xfs_super.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index ea341ce..0644d61 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -585,6 +585,7 @@ xfs_max_file_offset(
> {
> unsigned int pagefactor = 1;
> unsigned int bitshift = BITS_PER_LONG - 1;
> + __uint64_t offset;
>
> /* Figure out maximum filesize, on Linux this can depend on
> * the filesystem blocksize (on 32 bit platforms).
> @@ -610,7 +611,10 @@ xfs_max_file_offset(
> # endif
> #endif
>
> - return (((__uint64_t)pagefactor) << bitshift) - 1;
> + offset = (((__uint64_t)pagefactor) << bitshift) - 1;
> +
> + /* Check against VM & VFS exposed limits */
> + return (offset > MAX_LFS_FILESIZE) ? MAX_LFS_FILESIZE : offset;
> }
>
> xfs_agnumber_t
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed
2013-07-10 6:28 ` Jeff Liu
@ 2013-07-10 6:48 ` Dave Chinner
2013-07-10 13:14 ` Jeff Liu
0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2013-07-10 6:48 UTC (permalink / raw)
To: Jeff Liu; +Cc: Michael L. Semon, xfs@oss.sgi.com
On Wed, Jul 10, 2013 at 02:28:20PM +0800, Jeff Liu wrote:
> Could anyone help to review this patch?
Sorry, I missed it.
> > On 32-bit machine, the s_maxbytes is larger than the MAX_LFS_FILESIZE limits if CONFIG_LBDAF is
> > not enabled. Hence it's possible to create a huge file via buffered-IO write with a given offset
> > beyond this limitation. e.g.
> >
> > # block_size=4096
> > # offset=$(((2**32 - 1) * $block_size))
> > # xfs_io -f -c "pwrite $offset $block_size" /storage/test_file
> >
> > In this case, xfs_io will hang at the page writeback stage soon since the given offset would
> > cause an overflow at xfs_vm_writepage():
> >
> > end_index = offset >> PAGE_CACHE_SHIFT;
> > last_index = (offset - 1) >> PAGE_CACHE_SHIFT;
> > if (page->index >= end_index) {
> > unsigned offset_into_page = offset & (PAGE_CACHE_SIZE - 1);
> >
> > /*
> > * Just skip the page if it is fully outside i_size, e.g. due
> > * to a truncate operation that is in progress.
> > */
> > if (page->index >= end_index + 1 || offset_into_page == 0) {
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > unlock_page(page);
> > return 0;
> > }
> > end_index is unsigned long so that the max value is '2^32-1 = 4294967295', and it
> > would be evaluated to the max value with the given offset(when writing the page offset
> > up to s_max_bytes) for above test case. As a result, (page->index >= end_index + 1) is
> > ok as (end_index + 1) is overflowed to ZERO.
> >
> > Actually, create a file as above on 32-bit machine should be failed with EFBIG error returned
> > because there has strict check up at generic_write_checks() against the given offset with a
> > *correct* s_max_bytes.
> >
> > This patch fix the s_max_bytes to MAX_LFS_FILESIZE if the pre-calculated value is greater
> > than it.
Isn't MAX_LFS_FILESIZE defined on 32 bit systems to 8TB and the
problem here is that we are overflowing at 16TB? If so, that means
addin gthis patch will potentially cause problems with existing
working setups that have (sparse) files larger than 8TB on 32 bit
systems.
So, can't we simply subtract PAGE_CACHE_SIZE from the offset
being returned to avoid this overflow?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] xfs: fix s_max_bytes to MAX_LFS_FILESIZE if needed
2013-07-10 6:48 ` Dave Chinner
@ 2013-07-10 13:14 ` Jeff Liu
0 siblings, 0 replies; 9+ messages in thread
From: Jeff Liu @ 2013-07-10 13:14 UTC (permalink / raw)
To: Dave Chinner; +Cc: Michael L. Semon, xfs@oss.sgi.com
Hi Dave,
Thanks for the quick response.
On 07/10/2013 02:48 PM, Dave Chinner wrote:
> On Wed, Jul 10, 2013 at 02:28:20PM +0800, Jeff Liu wrote:
>> Could anyone help to review this patch?
>
> Sorry, I missed it.
>
>>> On 32-bit machine, the s_maxbytes is larger than the MAX_LFS_FILESIZE limits if CONFIG_LBDAF is
>>> not enabled. Hence it's possible to create a huge file via buffered-IO write with a given offset
>>> beyond this limitation. e.g.
>>>
>>> # block_size=4096
>>> # offset=$(((2**32 - 1) * $block_size))
>>> # xfs_io -f -c "pwrite $offset $block_size" /storage/test_file
>>>
>>> In this case, xfs_io will hang at the page writeback stage soon since the given offset would
>>> cause an overflow at xfs_vm_writepage():
>>>
>>> end_index = offset >> PAGE_CACHE_SHIFT;
>>> last_index = (offset - 1) >> PAGE_CACHE_SHIFT;
>>> if (page->index >= end_index) {
>>> unsigned offset_into_page = offset & (PAGE_CACHE_SIZE - 1);
>>>
>>> /*
>>> * Just skip the page if it is fully outside i_size, e.g. due
>>> * to a truncate operation that is in progress.
>>> */
>>> if (page->index >= end_index + 1 || offset_into_page == 0) {
>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> unlock_page(page);
>>> return 0;
>>> }
>>> end_index is unsigned long so that the max value is '2^32-1 = 4294967295', and it
>>> would be evaluated to the max value with the given offset(when writing the page offset
>>> up to s_max_bytes) for above test case. As a result, (page->index >= end_index + 1) is
>>> ok as (end_index + 1) is overflowed to ZERO.
>>>
>>> Actually, create a file as above on 32-bit machine should be failed with EFBIG error returned
>>> because there has strict check up at generic_write_checks() against the given offset with a
>>> *correct* s_max_bytes.
>>>
>>> This patch fix the s_max_bytes to MAX_LFS_FILESIZE if the pre-calculated value is greater
>>> than it.
>
> Isn't MAX_LFS_FILESIZE defined on 32 bit systems to 8TB and the
> problem here is that we are overflowing at 16TB? If so, that means
> addin gthis patch will potentially cause problems with existing
> working setups that have (sparse) files larger than 8TB on 32 bit
> systems.
Yes, but maybe I should say end_index is wrapped to zero rather than
overflow in this situation.
>
> So, can't we simply subtract PAGE_CACHE_SIZE from the offset
> being returned to avoid this overflow?
It seems that this change does not works to me because page->index is
greater than end_index in most cases for large files.
IOWs, what I mentioned in comment log is incorrect and it misled you.
I worked out another patch, it looks works for this test but I need to
run some extra tests before posting.
Thanks,
-Jeff
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 9+ messages in thread