From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from plane.gmane.org ([80.91.229.3]:38897 "EHLO plane.gmane.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750861AbaGBFlC (ORCPT ); Wed, 2 Jul 2014 01:41:02 -0400
Received: from list by plane.gmane.org with local (Exim 4.69)
	(envelope-from ) id 1X2DHw-0007iq-T5
	for linux-btrfs@vger.kernel.org; Wed, 02 Jul 2014 07:41:00 +0200
Received: from ip68-231-22-224.ph.ph.cox.net ([68.231.22.224])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00 for ; Wed, 02 Jul 2014 07:41:00 +0200
Received: from 1i5t5.duncan by ip68-231-22-224.ph.ph.cox.net with local
	(Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00
	for ; Wed, 02 Jul 2014 07:41:00 +0200
To: linux-btrfs@vger.kernel.org
From: Duncan <1i5t5.duncan@cox.net>
Subject: Re: Corrupt filesystem after hardware failure: Scrub causes kernel GPF
Date: Wed, 2 Jul 2014 05:40:48 +0000 (UTC)
Message-ID:
References: <53B2DF4B.4080708@fos4x.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Philipp Tölke posted on Tue, 01 Jul 2014 18:18:19 +0200 as excerpted:

> root@filer:~# btrfs fi df /home
> Data, single: total=9.61TiB, used=9.32TiB
> System, single: total=32.00MiB, used=1.04MiB
> Metadata, single: total=19.00GiB, used=17.37GiB
> unknown, single: total=512.00MiB, used=0.00
> root@filer:~# uname -a
> Linux filer 3.15-trunk-amd64 #1 SMP Debian
> 3.15.1-1~exp1 (2014-06-20) x86_64 GNU/Linux
> Doing a scrub scrubs over the first TiB of the filesystem and then
> caused this OOPS:

Well, it shouldn't GPF, and there are obviously other, more complex
problems that I won't attempt to address, but as a btrfs user and list
regular I can pick off the low-hanging fruit for you...

Btrfs scrub is designed to detect and possibly fix exactly one sort of
problem: bad checksums.
Since btrfs does checksumming by default, btrfs scrub should detect
bad checksums whenever the calculated checksum doesn't match the recorded
one, but it can only /correct/ the problem if there's another copy of the
data available that still has a /valid/ checksum.

And your filesystem, as reported above, is all single: data single,
metadata single, system single, and "unknown" single. (Kernel 3.15
started reporting what I believe is the global block reserve as its own
type, but there's no corresponding btrfs-progs release to label it yet,
so current userspace simply lists it as "unknown".)

Single means there's only the one copy, so scrub couldn't correct any
invalid checksums it detected anyway, although at least it should detect
them, and it should NOT GPF.

So as I said, there's obviously at least one more complex problem as
well, but scrub couldn't fix anything for you anyway. The only way it can
fix a bad block is from a second copy (single-device dup mode,
multi-device raid1/10 mode, etc.) that itself passes checksum
verification, and you have single mode for everything, so there's no
further copy to verify and restore the bad one from.

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
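
For illustration only, the detect-versus-correct rule described above can be sketched as a toy model in Python. This is not how the kernel implements scrub: `Block`, `checksum` and `scrub_block` are hypothetical names made up for the sketch, and `zlib.crc32` merely stands in for the crc32c that btrfs uses by default.

```python
import zlib
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical stand-in for one checksummed block."""
    copies: List[bytes]  # 1 entry for "single", 2 for dup/raid1-style profiles
    csum: int            # checksum recorded when the block was written


def checksum(data: bytes) -> int:
    # Assumption: plain CRC32 as a stand-in for btrfs's default crc32c.
    return zlib.crc32(data)


def scrub_block(block: Block) -> str:
    """Verify every copy; repair bad copies only from a copy that verifies."""
    good = [c for c in block.copies if checksum(c) == block.csum]
    if len(good) == len(block.copies):
        return "ok"
    if not good:
        # Mismatch detected, but no valid copy exists to restore from.
        return "uncorrectable"
    # At least one copy verifies: overwrite every copy with it.
    block.copies = [good[0]] * len(block.copies)
    return "corrected"


data = b"important data"
damaged = b"importent data"

single = Block(copies=[damaged], csum=checksum(data))
mirrored = Block(copies=[damaged, data], csum=checksum(data))

print(scrub_block(single))    # uncorrectable
print(scrub_block(mirrored))  # corrected
```

With only one copy, the mismatch is still detected but there is nothing to repair from, which is the all-single situation reported in the quoted `btrfs fi df` output.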