From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dkim2.fusionio.com ([66.114.96.54]:33503 "EHLO dkim2.fusionio.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751780Ab3JAMic (ORCPT ); Tue, 1 Oct 2013 08:38:32 -0400 Received: from mx2.fusionio.com (unknown [10.101.1.160]) by dkim2.fusionio.com (Postfix) with ESMTP id 770249A069F for ; Tue, 1 Oct 2013 06:38:32 -0600 (MDT) Date: Tue, 1 Oct 2013 08:38:30 -0400 From: Josef Bacik To: Liu Bo CC: Josef Bacik , Subject: Re: [PATCH] Btrfs: fix crash of compressed writes Message-ID: <20131001123830.GE27490@localhost.localdomain> References: <1380544797-29798-1-git-send-email-bo.li.liu@oracle.com> <20130930170249.GO18681@localhost.localdomain> <20131001033721.GA32575@localhost.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" In-Reply-To: <20131001033721.GA32575@localhost.localdomain> Sender: linux-btrfs-owner@vger.kernel.org List-ID: On Tue, Oct 01, 2013 at 11:37:22AM +0800, Liu Bo wrote: > On Mon, Sep 30, 2013 at 01:02:49PM -0400, Josef Bacik wrote: > > On Mon, Sep 30, 2013 at 08:39:57PM +0800, Liu Bo wrote: > > > The crash[1] is found by xfstests/generic/208 with "-o compress", > > > it's not reproduced everytime, but it does panic. > > > > > > The bug is quite interesting, it's actually introduced by a recent commit > > > (573aecafca1cf7a974231b759197a1aebcf39c2a, > > > Btrfs: actually limit the size of delalloc range). > > > > > > Btrfs implements delay allocation, so during writeback, we > > > (1) get a page A and lock it > > > (2) search the state tree for delalloc bytes and lock all pages within the range > > > (3) process the delalloc range, including find disk space and create > > > ordered extent and so on. > > > (4) submit the page A. > > > > > > It runs well in normal cases, but if we're in a racy case, eg. > > > buffered compressed writes and aio-dio writes, > > > sometimes we may fail to lock all pages in the 'delalloc' range, > > > in which case, we need to fall back to search the state tree again with > > > a smaller range limit(max_bytes = PAGE_CACHE_SIZE - offset). > > > > > > The mentioned commit has a side effect, that is, in the fallback case, > > > we can find delalloc bytes before the index of the page we already have locked, > > > so we're in the case of (delalloc_end <= *start) and return with (found > 0). > > > > > > This ends with not locking delalloc pages but making ->writepage still > > > process them, and the crash happens. > > > > > > This fixes it by not enforcing the 'max_bytes' limit in the special fallback > > > case. > > > > > > > Great analysis, thank you for that, however I don't like the fix ;). Instead We > > need to change the > > > > if (!found || delalloc_end <= *start) > > > > to always return 0 since we should not call fill delalloc if the delalloc range > > doesn't start at our current offset. Secondly the > > Setting 'found = 0' is also the first idea into my mind :), but I'm not sure if > it's right. > > I've check it out, commit 70b99e6959a4c28ae1b314985eca731f3db72f1d > (Btrfs: Compression corner fixes, > Signed-off-by: Chris Mason ) > introduced the (delalloc_end <= *start). > > The commit log says, > "Make sure we lock pages in the correct order during delalloc. The > search into the state tree for delalloc bytes can return bytes > before the page we already have locked." > > But I think if we find bytes before the page we locked, then we > should get (found = 0). > > So I don't know why we need the check (delalloc_end <= *start), > maybe some corner cases need that? Chris can explain a bit? > No we needed his other addition below which is important, for when the delalloc range starts before but still encompasses our page. The other part is not needed, we should be returning 0 if delalloc_end lands before or on our start, which should fix your bug and make it a bit more sane. > > > > max_bytes = PAGE_CACHE_SIZE - offset; > > > > is completely crap since we are talking about the entire page/sector. We should > > simply set max_bytes to sectorszie or PAGE_CACHE_SIZE that way we avoid this > > issue. Thanks, > > This might be due to historical reasons, but for now, it doesn't cause > problems as the passed @start is page aligned and @offset is always 0. > Yeah you are right, still I'd kill it anyway since it doesn't make sense. Thanks, Josef