From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Ivan P <chrnosphered@gmail.com>, Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: scrub: Tree block spanning stripes, ignored
Date: Thu, 7 Apr 2016 08:58:40 +0800
Message-ID: <5705B0C0.6070606@cn.fujitsu.com>
In-Reply-To: <CADzmB23BwRsrvCeM7ntPt=0wz4tEVc6Vu5Gm0X7tKoQ20n_WTQ@mail.gmail.com>
Ivan P wrote on 2016/04/06 21:39 +0200:
> Ok, I'm cautiously optimistic: after running btrfsck
> --init-extent-tree --repair and running scrub, it finished without
> errors.
> Will run a file compare against my backup copy, but it seems the
> repair was successful.
Better run btrfsck again to make sure there are no other problems.
For the backref problem, did you ever mount the fs read-write with an older kernel such as 4.2?
IIRC, I introduced a delayed_ref regression in that version.
Maybe it's related to this bug.
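
For the file compare against the backup copy mentioned in the quoted text, a minimal sketch could look like the following. The helper name compare_trees and the example paths are assumptions for illustration and do not come from this thread; it relies only on the -r and -q options of diff.

```shell
#!/bin/sh
# Hypothetical helper: list files that differ between the repaired
# filesystem and the backup copy. Both paths are supplied by the caller.
compare_trees() {
    src=$1
    backup=$2
    # -r: recurse into subdirectories
    # -q: only report which files differ, not the differences themselves
    # diff exits 0 if the trees match, 1 if they differ, >1 on trouble.
    diff -rq "$src" "$backup"
}

# Example call (paths are placeholders, not taken from the thread):
# compare_trees /mnt/vault /mnt/backup && echo "trees match"
```

An exit status of 0 only says that the file contents match; it says nothing about the state of the filesystem metadata, so it does not replace another btrfsck run.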
Thanks,
Qu
>
> Here is the btrfs-image btw:
> https://dl.dropboxusercontent.com/u/19330332/image.btrfs (821Mb)
>
> Maybe you will be able to track down whatever caused this.
>
> Regards,
> Ivan.
>
> On Sun, Apr 3, 2016 at 3:24 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>> On 04/03/2016 12:29 AM, Ivan P wrote:
>>>
>>> It's about 800Mb, I think I could upload that.
>>>
>>> I ran it with the -s parameter, is that enough to remove all personal
>>> info from the image?
>>> Also, I had to run it with -w because otherwise it died on the same
>>> corrupt node.
>>
>>
>> You can also use -c9 to further compress the data.
>>
>> Thanks,
>> Qu
>>
>>>
>>> On Fri, Apr 1, 2016 at 2:25 AM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>>>>
>>>>
>>>>
>>>> Ivan P wrote on 2016/03/31 18:04 +0200:
>>>>>
>>>>>
>>>>> Ok, it will take a while until I can attempt repairing it, since I
>>>>> will have to order a spare HDD to copy the data to.
>>>>> Should I take some sort of debug snapshot of the fs so you can take a
>>>>> look at it? I think I read somewhere about a snapshot that only
>>>>> contains the fs but not the data.
>>>>
>>>>
>>>> That's btrfs-image.
>>>>
>>>> It would be good, but if your metadata is over 3G, I think it would
>>>> take a lot of time to upload.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>> Regards,
>>>>> Ivan.
>>>>>
>>>>> On Tue, Mar 29, 2016 at 3:57 AM, Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ivan P wrote on 2016/03/28 23:21 +0200:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Well, the file in this inode is fine, I was able to copy it off the
>>>>>>> disk. However, rm-ing the file causes a segmentation fault. Shortly
>>>>>>> after that, I get a kernel oops. Same thing happens if I attempt to
>>>>>>> re-run scrub.
>>>>>>>
>>>>>>> How can I delete that inode? Could deleting it destroy the filesystem
>>>>>>> beyond repair?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The kernel oops should protect you from completely destroying the fs.
>>>>>>
>>>>>> However, the oops also shows that the problem is beyond what the
>>>>>> kernel can handle.
>>>>>>
>>>>>> So there is no safe recovery method now.
>>>>>>
>>>>>> From now on, any repair advice from me *MAY* *destroy* your fs.
>>>>>> So please back up your data while you still can.
>>>>>>
>>>>>>
>>>>>> The best possible try would be "btrfsck --init-extent-tree --repair".
>>>>>>
>>>>>> If it works, then mount it and run "btrfs balance start <mnt>".
>>>>>> Lastly, umount and use btrfsck to re-check if it fixes the problem.
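
The sequence above can be written down as a dry-run sketch that only prints the commands in order and runs nothing. The function name repair_plan is hypothetical; /dev/sda and /mnt/vault are the device and mount point that appear elsewhere in this thread and should be replaced as needed. Since the first step may destroy the filesystem, this is meant as a checklist, not as something to run unattended.

```shell
#!/bin/sh
# Dry-run sketch: prints the suggested steps, executes none of them.
# $1 = block device, $2 = mount point (both supplied by the caller).
repair_plan() {
    dev=$1
    mnt=$2
    # 1. Rebuild the extent tree (dangerous, back up first).
    echo "btrfsck --init-extent-tree --repair $dev"
    # 2. If that worked, mount the fs and balance it.
    echo "mount $dev $mnt"
    echo "btrfs balance start $mnt"
    # 3. Unmount and re-check.
    echo "umount $mnt"
    echo "btrfsck $dev"
}

repair_plan /dev/sda /mnt/vault
```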
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Ivan
>>>>>>>
>>>>>>> On Mon, Mar 28, 2016 at 3:10 AM, Qu Wenruo <quwenruo.btrfs@gmx.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ivan P wrote on 2016/03/27 16:31 +0200:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks for the reply,
>>>>>>>>>
>>>>>>>>> the raid1 array was created from scratch, so not converted from ext*.
>>>>>>>>> I used btrfs-progs version 4.2.3 on kernel 4.2.5 to create the array,
>>>>>>>>> btw.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I don't remember any strange behavior after 4.0, so no clue here.
>>>>>>>>
>>>>>>>> Go to subvolume 5 (the top-level subvolume), find inode 71723 and try
>>>>>>>> to remove it.
>>>>>>>> Then use 'btrfs filesystem sync <mount point>' to sync the inode
>>>>>>>> removal.
>>>>>>>>
>>>>>>>> Finally, use the latest btrfs-progs to check whether the problem
>>>>>>>> disappears.
>>>>>>>>
>>>>>>>> This problem is quite strange, so I can't locate the root cause,
>>>>>>>> but try to remove the file and hope the kernel can handle it.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Qu
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Is there a way to fix the current situation without taking the whole
>>>>>>>>> data off the disk?
>>>>>>>>> I'm not familiar with file system terms, so what exactly could I have
>>>>>>>>> lost, if anything?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Ivan
>>>>>>>>>
>>>>>>>>> On Sun, Mar 27, 2016 at 4:23 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 03/27/2016 05:54 PM, Ivan P wrote:
>>>>>>>>>
>>>>>>>>> Read the info on the wiki, here's the rest of the requested
>>>>>>>>> information:
>>>>>>>>>
>>>>>>>>> # uname -r
>>>>>>>>> 4.4.5-1-ARCH
>>>>>>>>>
>>>>>>>>> # btrfs fi show
>>>>>>>>> Label: 'ArchVault' uuid: cd8a92b6-c5b5-4b19-b5e6-a839828d12d8
>>>>>>>>> Total devices 1 FS bytes used 2.10GiB
>>>>>>>>> devid 1 size 14.92GiB used 4.02GiB path /dev/sdc1
>>>>>>>>>
>>>>>>>>> Label: 'Vault' uuid: 013cda95-8aab-4cb2-acdd-2f0f78036e02
>>>>>>>>> Total devices 2 FS bytes used 800.72GiB
>>>>>>>>> devid 1 size 931.51GiB used 808.01GiB path /dev/sda
>>>>>>>>> devid 2 size 931.51GiB used 808.01GiB path /dev/sdb
>>>>>>>>>
>>>>>>>>> # btrfs fi df /mnt/vault/
>>>>>>>>> Data, RAID1: total=806.00GiB, used=799.81GiB
>>>>>>>>> System, RAID1: total=8.00MiB, used=128.00KiB
>>>>>>>>> Metadata, RAID1: total=2.00GiB, used=936.20MiB
>>>>>>>>> GlobalReserve, single: total=320.00MiB, used=0.00B
>>>>>>>>>
>>>>>>>>> On Fri, Mar 25, 2016 at 3:16 PM, Ivan P <chrnosphered@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> using kernel 4.4.5 and btrfs-progs 4.4.1, I today ran a scrub on my
>>>>>>>>> 2x1Tb btrfs raid1 array and it finished with 36 unrecoverable errors
>>>>>>>>> [1], all blaming the treeblock 741942071296. Running "btrfs check
>>>>>>>>> --readonly" on one of the devices lists that extent as corrupted [2].
>>>>>>>>>
>>>>>>>>> How can I recover, how much did I really lose, and how can I prevent
>>>>>>>>> it from happening again?
>>>>>>>>> If you need me to provide more info, do tell.
>>>>>>>>>
>>>>>>>>> [1] http://cwillu.com:8080/188.110.141.36/1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This message itself is normal; it just means a tree block is
>>>>>>>>> crossing a 64K stripe boundary.
>>>>>>>>> Due to a scrub limitation, it can't check whether the block is good
>>>>>>>>> or bad.
>>>>>>>>> But....
>>>>>>>>>
>>>>>>>>> [2] http://pastebin.com/xA5zezqw
>>>>>>>>>
>>>>>>>>> This one is much more meaningful, showing several strange
>>>>>>>>> bugs.
>>>>>>>>>
>>>>>>>>> 1. corrupt extent record: key 741942071296 168 1114112
>>>>>>>>> This is an EXTENT_ITEM (168), and according to the offset, the
>>>>>>>>> length of the extent is 1088K, which is definitely not a valid
>>>>>>>>> tree block size.
>>>>>>>>>
>>>>>>>>> But according to [1], the kernel thinks it's a tree block, which is
>>>>>>>>> quite strange.
>>>>>>>>> Normally, such a mismatch only happens in a fs converted from ext*.
>>>>>>>>> 2. Backref 741942071296 root 5 owner 71723 offset 2589392896
>>>>>>>>> num_refs 0 not found in extent tree
>>>>>>>>>
>>>>>>>>> num_refs 0 is also strange; a normal backref won't have a zero
>>>>>>>>> reference count.
>>>>>>>>>
>>>>>>>>> 3. bad metadata [741942071296, 741943185408) crossing stripe boundary
>>>>>>>>> It could be a false warning that is fixed in the latest btrfsck.
>>>>>>>>> But you're using 4.4.1, so I think that's the problem.
>>>>>>>>>
>>>>>>>>> 4. bad extent [741942071296, 741943185408), type mismatch with chunk
>>>>>>>>> This seems to explain the problem: a data extent appears in a
>>>>>>>>> metadata chunk.
>>>>>>>>> It seems that you're really using a converted btrfs.
>>>>>>>>>
>>>>>>>>> If so, just roll it back to ext*. The current btrfs-convert has a
>>>>>>>>> known bug, and the fix is still under review.
>>>>>>>>>
>>>>>>>>> If you want to use btrfs, use a newly created fs instead of one made
>>>>>>>>> by btrfs-convert.
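
The arithmetic behind points 1 and 3 above can be checked with a small shell sketch. The numbers are copied from the quoted btrfsck output; the function names are hypothetical. The stripe test is a simplification: it assumes 64K stripes that start at multiples of 64K of the logical address, which may not match how a particular btrfsck version does the check.

```shell
#!/bin/sh
# Values copied from the btrfsck output quoted in this thread.
bytenr=741942071296   # key objectid: start of the extent, in bytes
key_type=168          # key type: 168 is EXTENT_ITEM
length=1114112        # key offset: length of the extent, in bytes
stripe=65536          # 64K stripe length (assumed alignment, see above)

# Convert a byte count to KiB (integer division).
extent_kib() { echo $(( $1 / 1024 )); }

# Print "yes" if [start, start+len) touches more than one stripe.
crosses_stripe() {
    start=$1; len=$2; stripe_len=$3
    first=$(( start / stripe_len ))
    last=$(( (start + len - 1) / stripe_len ))
    if [ "$first" -ne "$last" ]; then echo yes; else echo no; fi
}

echo "key type: $key_type"
echo "length: $(extent_kib "$length") KiB"
echo "end of extent: $(( bytenr + length ))"
echo "crosses 64K stripe: $(crosses_stripe "$bytenr" "$length" "$stripe")"
```

The computed end of the extent matches the upper bound of the range [741942071296, 741943185408) reported in points 3 and 4.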
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Qu
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Soukyuu
>>>>>>>>>
>>>>>>>>> P.S.: please add me to CC when replying as I did not subscribe to the
>>>>>>>>> mailing list. Majordomo won't let me use my hotmail address and I
>>>>>>>>> don't want that much traffic on this address.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>>>>>> linux-btrfs" in
>>>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>>>> <mailto:majordomo@vger.kernel.org>
>>>>>>>>> More majordomo info at
>>>>>>>>> http://vger.kernel.org/majordomo-info.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>