* [PATCH] xfs: fix ASSERTION failure in xfs_vm_write_failed()
@ 2013-03-17 15:01 Jeff Liu
2013-03-17 23:53 ` Dave Chinner
2013-03-18 4:17 ` Michael L. Semon
0 siblings, 2 replies; 4+ messages in thread
From: Jeff Liu @ 2013-03-17 15:01 UTC (permalink / raw)
To: xfs@oss.sgi.com; +Cc: Michael L. Semon
Hello,
Yesterday, Michael reported that ran xfstest-078 got kernel panic when growing a
small partition to a huge size(64-bit) on 32-bit system, this issue can be easily
reproduced on the latest upstream 38 to 39-rc2 if CONFIG_XFS_DEBUG is enabled.
According to my investigation, if the request pos offset exceeding 32-bit integer
range, evaluate block_offset with pos & PAGE_MASK will cause overflows so that
it most likely that the assert is not true.
This patch fix it by checking "block_offset + from == (pos & ~0UL)" instead.
Thanks,
-Jeff
From: Jie Liu <jeff.liu@oracle.com>
XFSTEST-078 cause kernel panic on 32-bit system if CONFIG_XFS_DEBUG is enabled:
XFS: Assertion failed: block_offset + from == pos, file: fs/xfs/xfs_aops.c, line: 1504
------------[ cut here ]------------
kernel BUG at fs/xfs/xfs_message.c:100!
invalid opcode: 0000 [#1] SMP
Modules linked in: netconsole configfs xfs(O) libcrc32c joydev
Pid: 4280, comm: mkfs.xfs Tainted: G O 3.9.0-rc2 #1 innotek GmbH VirtualBox
EIP: 0060:[<f9492e5b>] EFLAGS: 00010282 CPU: 0
EIP is at assfail+0x2b/0x30 [xfs]
EAX: 00000056 EBX: f6dca440 ECX: 00000007 EDX: f57d22a4
ESI: 1c2f8000 EDI: 00000000 EBP: eababd30 ESP: eababd1c
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 8005003b CR2: b2e43000 CR3: 2b92c000 CR4: 000006f0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process mkfs.xfs (pid: 4280, ti=eabaa000 task=ebf35a90 task.ti=eabaa000)
Stack:
00000000 f9510c14 f950aa70 f950a96b 000005e0 eababd7c f947fb04 c19b0ea2
00000066 f30b22a0 c19b0ea2 00000000 f3dbc118 00001000 00000000 00000000
00000000 c15c7e89 00000000 1c2f8000 00000000 00000000 1c2f8000 00000080
Call Trace:
[<f947fb04>] xfs_vm_write_failed+0x74/0x1b0 [xfs]
[<c15c7e89>] ? printk+0x4d/0x4f
[<f947fd4d>] xfs_vm_write_begin+0x10d/0x170 [xfs]
[<c110a34c>] generic_file_buffered_write+0xdc/0x210
[<f9486639>] xfs_file_buffered_aio_write+0xf9/0x190 [xfs]
[<f94867c3>] xfs_file_aio_write+0xf3/0x160 [xfs]
[<c115e504>] do_sync_write+0x94/0xd0
[<c115ed1f>] vfs_write+0x8f/0x160
[<c115e470>] ? wait_on_retry_sync_kiocb+0x50/0x50
[<c115f017>] sys_write+0x47/0x80
[<c15d860d>] sysenter_do_call+0x12/0x28
Code: 55 89 e5 83 ec 14 3e 8d 74 26 00 89 4c 24 10 89 54 24 0c 89 44
24 08 c7 44 24 04 14 0c 51 f9 c7 04 24 00 00 00 00 e8 d5 fd ff ff <0f>
0b 8d 76 00 55 89 e5 83 ec 14 3e 8d 74 26 00 b9 01 00 00 00
EIP: [<f9492e5b>] assfail+0x2b/0x30 [xfs] SS:ESP 0068:eababd1c
---[ end trace 3348226f4d43bec2 ]---
Kernel panic - not syncing: Fatal exception
On 32-bit system, if the request pos is 64-bit and evaluate block_offset
with (pos & PAGE_MASK) will result in overflows, therefore the assertion
will failed. We have to check the write offset against (pos & ~0UL) to
avoid this issue as it can evaluate the highest 20 bits on 32-bit correctly
if the pos request is 64-bit and keep the expected result of 64-bit pos request
on 64-bit system unchanged.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
---
fs/xfs/xfs_aops.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 5f707e5..2fc7367 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1501,7 +1501,12 @@ xfs_vm_write_failed(
loff_t to = from + len;
struct buffer_head *bh, *head;
- ASSERT(block_offset + from == pos);
+ /*
+ * Evaluate block_offset via (pos & PAGE_MASK) on 32-bit system
+ * can cause overflow if the request pos is 64-bit. Hence we
+ * have to verify the write offset with (pos & ~0UL) to avoid it.
+ */
+ ASSERT(block_offset + from == (pos & ~0UL));
head = page_buffers(page);
block_start = 0;
--
1.7.9.5
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: fix ASSERTION failure in xfs_vm_write_failed()
2013-03-17 15:01 [PATCH] xfs: fix ASSERTION failure in xfs_vm_write_failed() Jeff Liu
@ 2013-03-17 23:53 ` Dave Chinner
2013-03-18 4:12 ` Jeff Liu
2013-03-18 4:17 ` Michael L. Semon
1 sibling, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2013-03-17 23:53 UTC (permalink / raw)
To: Jeff Liu; +Cc: Michael L. Semon, xfs@oss.sgi.com
On Sun, Mar 17, 2013 at 11:01:08PM +0800, Jeff Liu wrote:
> Hello,
>
> Yesterday, Michael reported that ran xfstest-078 got kernel panic when growing a
> small partition to a huge size(64-bit) on 32-bit system, this issue can be easily
> reproduced on the latest upstream 38 to 39-rc2 if CONFIG_XFS_DEBUG is enabled.
> According to my investigation, if the request pos offset exceeding 32-bit integer
> range, evaluate block_offset with pos & PAGE_MASK will cause overflows so that
> it most likely that the assert is not true.
Hi Jeff,
Nice work finding the root of the problem, but I think your fix
is wrong. Here's how I look at the problem - all the parameters are
loff_t, so everything should validate cleanly as 64 bit values.
typedef __kernel_loff_t loff_t
typedef long long __kernel_loff_t;
So there shouldn't be any overflows occurring that need masking like
this:
> This patch fix it by checking "block_offset + from == (pos & ~0UL)" instead.
So, what's the real problem?
#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
On a 32 bit system, PAGE_MASK = 0xfffff000, as an unsigned long.
And so this:
loff_t block_offset = pos & PAGE_MASK;
is also masking off the high 32 bits in pos.
That's why:
> + /*
> + * Evaluate block_offset via (pos & PAGE_MASK) on 32-bit system
> + * can cause overflow if the request pos is 64-bit. Hence we
> + * have to verify the write offset with (pos & ~0UL) to avoid it.
> + */
> + ASSERT(block_offset + from == (pos & ~0UL));
Masking off the high bits of pos here makes the ASSERT failure go
away. However, it doesn't fix the problem - it just shuts up the
warning that there's a problem.
The bug is that block_offset is passed into
xfs_vm_kill_delalloc_range(), and from above we now know that
block_offset doesn't have the correct value. This is a
potential data corruption bug, and catching this problem was the
reason the ASSERT() was placed in the code. i.e. ensuring we are
punching at the correct block offset into the file.
IOWs, the intention of the code is that block_offset should be a 64
bit value with the lower 12 bits masked out. Something like this:
block_offset = (pos >> PAGE_CACHE_SHIFT) << PAGE_CACHE_SHIFT;
Will clear the lower 12 bits, and then the ASSERT() should evaluate
correctly and the correct offset get passed to
xfs_vm_kill_delalloc_range(), fixing the bug....
i.e. whenever an ASSERT() fires, you need to look at the code for
bugs - more often than not the ASSERT() is correct and the fact that
it fired is indicative of a bug in the code. Hence changing the
ASSERT to stop it firing is almost always the wrong "fix". :)
Cheers,
Dave.
is telling us we have a bug in the code, not that the ASSERT needs
fixing.
i.e.
So, there's no overflow here - PAGE_MASK just doesn't work on 64 bit
offsets.
>
> head = page_buffers(page);
> block_start = 0;
> --
> 1.7.9.5
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>
>
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: fix ASSERTION failure in xfs_vm_write_failed()
2013-03-17 23:53 ` Dave Chinner
@ 2013-03-18 4:12 ` Jeff Liu
0 siblings, 0 replies; 4+ messages in thread
From: Jeff Liu @ 2013-03-18 4:12 UTC (permalink / raw)
To: Dave Chinner; +Cc: Michael L. Semon, xfs@oss.sgi.com
Hi Dave,
On 03/18/2013 07:53 AM, Dave Chinner wrote:
> On Sun, Mar 17, 2013 at 11:01:08PM +0800, Jeff Liu wrote:
>> Hello,
>>
>> Yesterday, Michael reported that ran xfstest-078 got kernel panic when growing a
>> small partition to a huge size(64-bit) on 32-bit system, this issue can be easily
>> reproduced on the latest upstream 38 to 39-rc2 if CONFIG_XFS_DEBUG is enabled.
>> According to my investigation, if the request pos offset exceeding 32-bit integer
>> range, evaluate block_offset with pos & PAGE_MASK will cause overflows so that
>> it most likely that the assert is not true.
>
> Hi Jeff,
>
> Nice work finding the root of the problem, but I think your fix
> is wrong. Here's how I look at the problem - all the parameters are
> loff_t, so everything should validate cleanly as 64 bit values.
>
> typedef __kernel_loff_t loff_t
> typedef long long __kernel_loff_t;
>
> So there shouldn't be any overflows occurring that need masking like
> this:
>
>> This patch fix it by checking "block_offset + from == (pos & ~0UL)" instead.
>
> So, what's the real problem?
>
> #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
> #define PAGE_MASK (~(PAGE_SIZE-1))
>
> On a 32 bit system, PAGE_MASK = 0xfffff000, as an unsigned long.
>
> And so this:
>
> loff_t block_offset = pos & PAGE_MASK;
>
> is also masking off the high 32 bits in pos.
Exactly. It's not an overflow as the block_offset is 64-bit, but the
original evaluated value missing the high 32-bits of pos.
>
> That's why:
>
>> + /*
>> + * Evaluate block_offset via (pos & PAGE_MASK) on 32-bit system
>> + * can cause overflow if the request pos is 64-bit. Hence we
>> + * have to verify the write offset with (pos & ~0UL) to avoid it.
>> + */
>> + ASSERT(block_offset + from == (pos & ~0UL));
>
> Masking off the high bits of pos here makes the ASSERT failure go
> away. However, it doesn't fix the problem - it just shuts up the
> warning that there's a problem.
>
> The bug is that block_offset is passed into
> xfs_vm_kill_delalloc_range(), and from above we now know that
> block_offset doesn't have the correct value. This is a
> potential data corruption bug, and catching this problem was the
> reason the ASSERT() was placed in the code. i.e. ensuring we are
> punching at the correct block offset into the file.
>
> IOWs, the intention of the code is that block_offset should be a 64
> bit value with the lower 12 bits masked out. Something like this:
>
> block_offset = (pos >> PAGE_CACHE_SHIFT) << PAGE_CACHE_SHIFT;
>
> Will clear the lower 12 bits, and then the ASSERT() should evaluate
> correctly and the correct offset get passed to
> xfs_vm_kill_delalloc_range(), fixing the bug....
Yep. so we figure the 'from' out with the lower 12 bits in pos only, and
have block_offset with left higher bits in this way.
I will resent a patch according to your comments.
>
> i.e. whenever an ASSERT() fires, you need to look at the code for
> bugs - more often than not the ASSERT() is correct and the fact that
> it fired is indicative of a bug in the code. Hence changing the
> ASSERT to stop it firing is almost always the wrong "fix". :)
Thanks for your teaching!
-Jeff
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: fix ASSERTION failure in xfs_vm_write_failed()
2013-03-17 15:01 [PATCH] xfs: fix ASSERTION failure in xfs_vm_write_failed() Jeff Liu
2013-03-17 23:53 ` Dave Chinner
@ 2013-03-18 4:17 ` Michael L. Semon
1 sibling, 0 replies; 4+ messages in thread
From: Michael L. Semon @ 2013-03-18 4:17 UTC (permalink / raw)
To: Jeff Liu; +Cc: xfs@oss.sgi.com
On 03/17/2013 11:01 AM, Jeff Liu wrote:
> On 32-bit system, if the request pos is 64-bit and evaluate block_offset
> with (pos & PAGE_MASK) will result in overflows, therefore the assertion
> will failed. We have to check the write offset against (pos & ~0UL) to
> avoid this issue as it can evaluate the highest 20 bits on 32-bit correctly
> if the pos request is 64-bit and keep the expected result of 64-bit pos request
> on 64-bit system unchanged.
>
> Signed-off-by: Jie Liu <jeff.liu@oracle.com>
> Reported-by: Michael L. Semon <mlsemon35@gmail.com>
> Cc: Dave Chinner <david@fromorbit.com>
> ---
> fs/xfs/xfs_aops.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 5f707e5..2fc7367 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -1501,7 +1501,12 @@ xfs_vm_write_failed(
> loff_t to = from + len;
> struct buffer_head *bh, *head;
>
> - ASSERT(block_offset + from == pos);
> + /*
> + * Evaluate block_offset via (pos & PAGE_MASK) on 32-bit system
> + * can cause overflow if the request pos is 64-bit. Hence we
> + * have to verify the write offset with (pos & ~0UL) to avoid it.
> + */
> + ASSERT(block_offset + from == (pos & ~0UL));
>
> head = page_buffers(page);
> block_start = 0;
Thanks! I can't help but admire the effort. That stated, I did read
Dave's review and now understand the "..." that he left as comments to
the original bug report...
My original reason for writing was to refine the test case a little bit.
On this 32-bit Pentium III PC, xfstests #078 succeeds on a 560MB
device-mapper linear target (1146880 sectors), but it fails with an oops
on a 544MB dm-linear target (1114112 sectors). Looking at the output of
the `df` command over and over during the test, the data does stop
growing at a point between those two numbers, proving Dave's initial
observation correct.
Michael
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-03-18 4:17 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-17 15:01 [PATCH] xfs: fix ASSERTION failure in xfs_vm_write_failed() Jeff Liu
2013-03-17 23:53 ` Dave Chinner
2013-03-18 4:12 ` Jeff Liu
2013-03-18 4:17 ` Michael L. Semon
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox