From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from aserp1040.oracle.com ([141.146.126.69]:47622 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755157AbaFYHee (ORCPT ); Wed, 25 Jun 2014 03:34:34 -0400
Date: Wed, 25 Jun 2014 15:34:24 +0800
From: Liu Bo
To: Satoru Takeuchi
Cc: linux-btrfs
Subject: Re: [PATCH] Btrfs: fix crash when mounting raid5 btrfs with missing disks
Message-ID: <20140625073423.GB3642@localhost.localdomain>
Reply-To: bo.li.liu@oracle.com
References: <1403595556-32753-1-git-send-email-bo.li.liu@oracle.com> <53AA794D.2090407@jp.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <53AA794D.2090407@jp.fujitsu.com>
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Hi Satoru,

On Wed, Jun 25, 2014 at 04:25:01PM +0900, Satoru Takeuchi wrote:
> Hi Liu,
>
> (2014/06/24 16:39), Liu Bo wrote:
> > The reproducer is
> >
> > $ mkfs.btrfs D1 D2 D3 -mraid5
> > $ mkfs.ext4 D2 && mkfs.ext4 D3
> > $ mount D1 /btrfs -odegraded
>
> Tested-by: Satoru Takeuchi
>
> Here is the result of the last mount.
>
> ===
> ...
> mount: wrong fs type, bad option, bad superblock on /dev/vdb1,
>        missing codepage or helper program, or other error
>
>        In some cases useful info is found in syslog - try
>        dmesg | tail or so.
> ===
>
> It "correctly" failed :-)

Thanks for testing it :)

thanks,
-liubo

>
> Thanks,
> Satoru
>
> >
> > -------------------
> >
> > [ 87.672992] ------------[ cut here ]------------
> > [ 87.673845] kernel BUG at fs/btrfs/raid56.c:1828!
> > ...
> > [ 87.673845] RIP: 0010:[] [] __raid_recover_end_io+0x4ae/0x4d0
> > ...
> > [ 87.673845] Call Trace:
> > [ 87.673845] [] ? mempool_free+0x36/0xa0
> > [ 87.673845] [] raid_recover_end_io+0x75/0xa0
> > [ 87.673845] [] bio_endio+0x5b/0xa0
> > [ 87.673845] [] bio_endio_nodec+0x12/0x20
> > [ 87.673845] [] end_workqueue_fn+0x41/0x50
> > [ 87.673845] [] normal_work_helper+0xca/0x2c0
> > [ 87.673845] [] process_one_work+0x1eb/0x530
> > [ 87.673845] [] ? process_one_work+0x189/0x530
> > [ 87.673845] [] worker_thread+0x11b/0x4f0
> > [ 87.673845] [] ? rescuer_thread+0x290/0x290
> > [ 87.673845] [] kthread+0xe4/0x100
> > [ 87.673845] [] ? kthread_create_on_node+0x220/0x220
> > [ 87.673845] [] ret_from_fork+0x7c/0xb0
> > [ 87.673845] [] ? kthread_create_on_node+0x220/0x220
> >
> > -------------------
> >
> > It's because that we miscalculate @rbio->bbio->error so that it doesn't
> > reach maximum of tolerable errors while it should have.
> >
> > Signed-off-by: Liu Bo
> > ---
> >  fs/btrfs/raid56.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
> > index 4055291..4a88f07 100644
> > --- a/fs/btrfs/raid56.c
> > +++ b/fs/btrfs/raid56.c
> > @@ -1956,9 +1956,10 @@ static int __raid56_parity_recover(struct btrfs_raid_bio *rbio)
> >  	 * pages are going to be uptodate.
> >  	 */
> >  	for (stripe = 0; stripe < bbio->num_stripes; stripe++) {
> > -		if (rbio->faila == stripe ||
> > -		    rbio->failb == stripe)
> > +		if (rbio->faila == stripe || rbio->failb == stripe) {
> > +			atomic_inc(&rbio->bbio->error);
> >  			continue;
> > +		}
> >  
> >  		for (pagenr = 0; pagenr < nr_pages; pagenr++) {
> >  			struct page *p;
> >
>