public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Liu <jeff.liu@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Michael L. Semon" <mlsemon35@gmail.com>,
	"xfs@oss.sgi.com" <xfs@oss.sgi.com>
Subject: Re: [PATCH] xfs: fix ASSERTION failure in xfs_vm_write_failed()
Date: Mon, 18 Mar 2013 12:12:47 +0800	[thread overview]
Message-ID: <5146943F.5080003@oracle.com> (raw)
In-Reply-To: <20130317235302.GL6369@dastard>

Hi Dave,

On 03/18/2013 07:53 AM, Dave Chinner wrote:
> On Sun, Mar 17, 2013 at 11:01:08PM +0800, Jeff Liu wrote:
>> Hello,
>>
>> Yesterday, Michael reported that ran xfstest-078 got kernel panic when growing a
>> small partition to a huge size(64-bit) on 32-bit system, this issue can be easily
>> reproduced on the latest upstream 38 to 39-rc2 if CONFIG_XFS_DEBUG is enabled.
>> According to my investigation, if the request pos offset exceeding 32-bit integer
>> range, evaluate block_offset with pos & PAGE_MASK will cause overflows so that
>> it most likely that the assert is not true.
> 
> Hi Jeff,
> 
> Nice work finding the root of the problem, but I think your fix
> is wrong. Here's how I look at the problem - all the parameters are
> loff_t, so everything should validate cleanly as 64 bit values.
> 
> typedef __kernel_loff_t         loff_t
> typedef long long       __kernel_loff_t;
> 
> So there shouldn't be any overflows occurring that need masking like
> this:
> 
>> This patch fix it by checking "block_offset + from == (pos & ~0UL)" instead.
> 
> So, what's the real problem?
> 
> #define PAGE_SIZE       (_AC(1,UL) << PAGE_SHIFT)
> #define PAGE_MASK       (~(PAGE_SIZE-1))
> 
> On a 32 bit system, PAGE_MASK = 0xfffff000, as an unsigned long.
> 
> And so this:
> 
> loff_t                  block_offset = pos & PAGE_MASK;
> 
> is also masking off the high 32 bits in pos.
Exactly.  It's not an overflow as the block_offset is 64-bit, but the
original evaluated value missing the high 32-bits of pos.
> 
> That's why:
> 
>> +	/*
>> +	 * Evaluate block_offset via (pos & PAGE_MASK) on 32-bit system
>> +	 * can cause overflow if the request pos is 64-bit.  Hence we
>> +	 * have to verify the write offset with (pos & ~0UL) to avoid it.
>> +	 */
>> +	ASSERT(block_offset + from == (pos & ~0UL));
> 
> Masking off the high bits of pos here makes the ASSERT failure go
> away.  However, it doesn't fix the problem - it just shuts up the
> warning that there's a problem.
> 
> The bug is that block_offset is passed into
> xfs_vm_kill_delalloc_range(), and from above we now know that
> block_offset doesn't have the correct value. This is a
> potential data corruption bug, and catching this problem was the
> reason the ASSERT() was placed in the code. i.e. ensuring we are
> punching at the correct block offset into the file.
> 
> IOWs, the intention of the code is that block_offset should be a 64
> bit value with the lower 12 bits masked out. Something like this:
> 
> 	block_offset = (pos >> PAGE_CACHE_SHIFT) << PAGE_CACHE_SHIFT;
> 
> Will clear the lower 12 bits, and then the ASSERT() should evaluate
> correctly and the correct offset get passed to
> xfs_vm_kill_delalloc_range(), fixing the bug....
Yep. so we figure the 'from' out with the lower 12 bits in pos only, and
have block_offset with left higher bits in this way.
I will resent a patch according to your comments.
> 
> i.e. whenever an ASSERT() fires, you need to look at the code for
> bugs - more often than not the ASSERT() is correct and the fact that
> it fired is indicative of a bug in the code. Hence changing the
> ASSERT to stop it firing is almost always the wrong "fix". :)
Thanks for your teaching!

-Jeff

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2013-03-18  4:13 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-17 15:01 [PATCH] xfs: fix ASSERTION failure in xfs_vm_write_failed() Jeff Liu
2013-03-17 23:53 ` Dave Chinner
2013-03-18  4:12   ` Jeff Liu [this message]
2013-03-18  4:17 ` Michael L. Semon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5146943F.5080003@oracle.com \
    --to=jeff.liu@oracle.com \
    --cc=david@fromorbit.com \
    --cc=mlsemon35@gmail.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox