From: Jeff Liu <jeff.liu@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Michael L. Semon" <mlsemon35@gmail.com>,
	"xfs@oss.sgi.com" <xfs@oss.sgi.com>
Subject: Re: [PATCH v2] xfs: fix assertion failure in xfs_vm_write_failed()
Date: Wed, 20 Mar 2013 10:18:04 +0800
Message-ID: <51491C5C.4040102@oracle.com>
In-Reply-To: <20130319192322.GB6369@dastard>

On 03/20/2013 03:23 AM, Dave Chinner wrote:
> On Tue, Mar 19, 2013 at 02:08:27PM +0800, Jeff Liu wrote:
>> On 03/19/2013 07:30 AM, Dave Chinner wrote:
>> From: Jie Liu <jeff.liu@oracle.com>
>>
>> In xfs_vm_write_failed(), we evaluate the block_offset of pos with PAGE_MASK
>> which is 0xfffff000 as an unsigned long,
> 
> That's the 32 bit value. if it's a 64 bit value, it's
> 0xfffffffffffff000.
> 
>> that is fine on 64-bit platforms no
>> matter the request pos is 32-bit or 64-bit.  However, on 32-bit platforms,
>> the high 32-bit in it will be masked off with (pos & PAGE_MASK) for 64-bit pos
>> request.  As a result, the evaluated block_offset is incorrect which will cause
>> the ASSERT() failed: ASSERT(block_offset + from == pos);
> 
> So I'd just rearrange this slightly:
> 
>> In xfs_vm_write_failed(), we evaluate the block_offset of pos with PAGE_MASK
>> which is an unsigned long. That is fine on 64-bit platforms
>> regardless of whether the request pos is 32-bit or 64-bit.
>> However, on 32-bit platforms, the value is 0xfffff000 and so
>> the high 32 bits in it will be masked off with (pos & PAGE_MASK)
>> for a 64-bit pos. As a result, the evaluated block_offset is
>> incorrect which will cause this failure ASSERT(block_offset + from
>> == pos); and potentially pass the wrong block to
>> xfs_vm_kill_delalloc_range().
> 
> ...
>> This patch fixes the block_offset evaluation to clear the lower 12 bits as:
>> block_offset = (pos >> PAGE_CACHE_SHIFT) << PAGE_CACHE_SHIFT
>> Hence, the ASSERTION should be correct because the from offset in a page is
>> evaluated to have the lower 12 bits only.
> 
> Saying we are clearing the lower 12 bits is not technically correct,
> as there are platforms with different page sizes. What we are
> actually calculating is the offset at the start of the page....
> 
>> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
>> index 5f707e5..f26a341 100644
>> --- a/fs/xfs/xfs_aops.c
>> +++ b/fs/xfs/xfs_aops.c
>> @@ -1494,13 +1494,25 @@ xfs_vm_write_failed(
>>  	loff_t			pos,
>>  	unsigned		len)
>>  {
>> -	loff_t			block_offset = pos & PAGE_MASK;
>> +	loff_t			block_offset;
>>  	loff_t			block_start;
>>  	loff_t			block_end;
>>  	loff_t			from = pos & (PAGE_CACHE_SIZE - 1);
>>  	loff_t			to = from + len;
>>  	struct buffer_head	*bh, *head;
>>  
>> +	/*
>> +	 * The request pos offset might be 32 or 64 bit, this is all fine
>> +	 * on 64-bit platform.  However, for 64-bit pos request on 32-bit
>> +	 * platform, the high 32-bit will be masked off if we evaluate the
>> +	 * block_offset via (pos & PAGE_MASK) because the PAGE_MASK is
>> +	 * 0xfffff000 as an unsigned long, hence the result is incorrect
>> +	 * which could cause the following ASSERT failed in most cases.
>> +	 * In order to avoid this, we can evaluate the block_offset with
>> +	 * the lower 12-bit masked out and the ASSERT should be correct.
> 
> Same here:
> 
> 	* In order to avoid this, we can evaluate the block_offset
> 	* of the start of the page by using shifts rather than masks,
> 	* which avoids the mismatch problem.
>> +	 */
>> +	block_offset = (pos >> PAGE_CACHE_SHIFT) << PAGE_CACHE_SHIFT;
>> +
>>  	ASSERT(block_offset + from == pos);
>>  
>>  	head = page_buffers(page);
> 
> As for the code, it looks fine. Hence with the comments/commit
> fixups, you can add:
> 
> Reviewed-by: Dave Chinner <dchinner@redhat.com>

Thanks, Dave, for the detailed corrections.  The revised patch follows.

Regards,
-Jeff


In xfs_vm_write_failed(), we evaluate the block_offset of pos with PAGE_MASK
which is an unsigned long.  That is fine on 64-bit platforms regardless of
whether the request pos is 32-bit or 64-bit.  However, on 32-bit platforms
the value is 0xfffff000 and so the high 32 bits in it will be masked off with
(pos & PAGE_MASK) for a 64-bit pos.  As a result, the evaluated block_offset is
incorrect which will cause this failure ASSERT(block_offset + from == pos); and
potentially pass the wrong block to xfs_vm_kill_delalloc_range().
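
The truncation described above can be reproduced in user space.  The
following is only a minimal sketch, not kernel code: DEMO_PAGE_SHIFT,
DEMO_PAGE_MASK_32 and the demo_* helpers are hypothetical names, and the
32-bit unsigned long PAGE_MASK is modelled with a uint32_t, assuming 4 KiB
pages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins that model a 32-bit kernel with 4 KiB pages. */
#define DEMO_PAGE_SHIFT		12
/* 0xfffff000, held in a 32-bit unsigned type like PAGE_MASK on 32-bit. */
#define DEMO_PAGE_MASK_32	\
	((uint32_t)~((UINT32_C(1) << DEMO_PAGE_SHIFT) - 1))

/*
 * Mimics (pos & PAGE_MASK) with a 64-bit pos and a 32-bit mask.  The mask
 * is zero-extended to 64 bits before the AND, so the upper 32 bits of pos
 * are cleared along with the in-page bits.
 */
static int64_t demo_block_offset_masked(int64_t pos)
{
	assert(pos >= 0);	/* file offsets are assumed non-negative */
	return pos & DEMO_PAGE_MASK_32;
}

/* Offset of pos within its page, as in pos & (PAGE_CACHE_SIZE - 1). */
static int64_t demo_from(int64_t pos)
{
	return pos & ((INT64_C(1) << DEMO_PAGE_SHIFT) - 1);
}
```

With an illustrative pos of 0x11c2fb080 (just above 4 GiB), the masked
helper returns 0x1c2fb000, so block_offset + from no longer equals pos.  For
a pos below 4 GiB the two still add up, which is why the problem only shows
with large offsets.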

In this case, the kernel panics as follows if CONFIG_XFS_DEBUG is enabled:

[   68.700573] XFS: Assertion failed: block_offset + from == pos, file: fs/xfs/xfs_aops.c, line: 1504
[   68.700656] ------------[ cut here ]------------
[   68.700692] kernel BUG at fs/xfs/xfs_message.c:100!
[   68.700742] invalid opcode: 0000 [#1] SMP
........
[   68.701678] Pid: 4057, comm: mkfs.xfs Tainted: G           O 3.9.0-rc2 #1
[   68.701722] EIP: 0060:[<f94a7e8b>] EFLAGS: 00010282 CPU: 0
[   68.701783] EIP is at assfail+0x2b/0x30 [xfs]
[   68.701819] EAX: 00000056 EBX: f6ef28a0 ECX: 00000007 EDX: f57d22a4
[   68.701852] ESI: 1c2fb000 EDI: 00000000 EBP: ea6b5d30 ESP: ea6b5d1c
[   68.701895]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   68.701934] CR0: 8005003b CR2: 094f3ff4 CR3: 2bcb4000 CR4: 000006f0
[   68.701970] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[   68.702011] DR6: ffff0ff0 DR7: 00000400
[   68.702046] Process mkfs.xfs (pid: 4057, ti=ea6b4000 task=ea5799e0 task.ti=ea6b4000)
[   68.702086] Stack:
[   68.702124]  00000000 f9525c48 f951fa80 f951f96b 000005e4 ea6b5d7c f9494b34 c19b0ea2
[   68.702445]  00000066 f3d6c620 c19b0ea2 00000000 e9a91458 00001000 00000000 00000000
[   68.702868]  00000000 c15c7e89 00000000 1c2fb000 00000000 00000000 1c2fb000 00000080
[   68.703192] Call Trace:
[   68.703248]  [<f9494b34>] xfs_vm_write_failed+0x74/0x1b0 [xfs]
[   68.703441]  [<c15c7e89>] ? printk+0x4d/0x4f
[   68.703496]  [<f9494d7d>] xfs_vm_write_begin+0x10d/0x170 [xfs]
[   68.703535]  [<c110a34c>] generic_file_buffered_write+0xdc/0x210
[   68.703583]  [<f949b669>] xfs_file_buffered_aio_write+0xf9/0x190 [xfs]
[   68.703629]  [<f949b7f3>] xfs_file_aio_write+0xf3/0x160 [xfs]
[   68.703668]  [<c115e504>] do_sync_write+0x94/0xd0
[   68.703716]  [<c115ed1f>] vfs_write+0x8f/0x160
[   68.703753]  [<c115e470>] ? wait_on_retry_sync_kiocb+0x50/0x50
[   68.703794]  [<c115f017>] sys_write+0x47/0x80
[   68.703830]  [<c15d860d>] sysenter_do_call+0x12/0x28
.............
[   68.704203] EIP: [<f94a7e8b>] assfail+0x2b/0x30 [xfs] SS:ESP 0068:ea6b5d1c
[   68.706615] ---[ end trace cdd9af4f4ecab42f ]---
[   68.706687] Kernel panic - not syncing: Fatal exception

In order to avoid this, we can evaluate the block_offset of the start of the
page by using shifts rather than masks, which avoids the type mismatch problem.
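
As a rough user-space illustration of why shifting is safe, the sketch below
computes the page start purely in the 64-bit type of pos, so no narrower mask
is involved and the result does not depend on a particular page size.  The
demo_* names and the page_shift parameter are hypothetical and only stand in
for PAGE_CACHE_SHIFT.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Start of the page containing pos, computed as
 * (pos >> page_shift) << page_shift.  Both shifts operate on the 64-bit
 * value, so the high 32 bits survive.
 */
static int64_t demo_page_start(int64_t pos, unsigned int page_shift)
{
	assert(pos >= 0);	/* file offsets are assumed non-negative */
	return (pos >> page_shift) << page_shift;
}

/* Offset of pos within its page for the given page size. */
static int64_t demo_page_from(int64_t pos, unsigned int page_shift)
{
	return pos & ((INT64_C(1) << page_shift) - 1);
}
```

For an illustrative pos of 0x11c2fb080, a page_shift of 12 gives a page start
of 0x11c2fb000 and a page_shift of 16 gives 0x11c2f0000.  In both cases
demo_page_start() + demo_page_from() equals pos, which is the relationship
the ASSERT checks.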

Thanks to Dave Chinner for help finding and fixing this bug.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
 fs/xfs/xfs_aops.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 5f707e5..7b5d6b1 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1494,13 +1494,26 @@ xfs_vm_write_failed(
 	loff_t			pos,
 	unsigned		len)
 {
-	loff_t			block_offset = pos & PAGE_MASK;
+	loff_t			block_offset;
 	loff_t			block_start;
 	loff_t			block_end;
 	loff_t			from = pos & (PAGE_CACHE_SIZE - 1);
 	loff_t			to = from + len;
 	struct buffer_head	*bh, *head;
 
+	/*
+	 * The request pos offset might be 32 or 64 bit, which is all fine
+	 * on a 64-bit platform.  However, for a 64-bit pos request on a
+	 * 32-bit platform, the high 32 bits will be masked off if we
+	 * evaluate the block_offset via (pos & PAGE_MASK), because
+	 * PAGE_MASK is 0xfffff000 as an unsigned long.  The result is then
+	 * incorrect, which could cause the following ASSERT to fail in
+	 * most cases.  In order to avoid this, we evaluate the
+	 * block_offset of the start of the page by using shifts rather
+	 * than masks, which avoids the mismatch problem.
+	 */
+	block_offset = (pos >> PAGE_CACHE_SHIFT) << PAGE_CACHE_SHIFT;
+
 	ASSERT(block_offset + from == pos);
 
 	head = page_buffers(page);
-- 
1.7.9.5


Thread overview: 7+ messages
2013-03-18  4:48 [PATCH v2] xfs: fix assertion failure in xfs_vm_write_failed() Jeff Liu
2013-03-18 20:03 ` Michael L. Semon
2013-03-18 23:30 ` Dave Chinner
2013-03-19  6:08   ` Jeff Liu
2013-03-19 19:23     ` Dave Chinner
2013-03-20  2:18       ` Jeff Liu [this message]
2013-04-08 21:47         ` Mark Tinguely
