From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Disk "failed" while doing scrub
Date: Mon, 13 Jul 2015 08:12:27 +0000 (UTC)

Dāvis Mosāns posted on Mon, 13 Jul 2015 09:26:05 +0300 as excerpted:

> Short version: while doing scrub on 5 disk btrfs filesystem, /dev/sdd
> "failed" and also had some error on other disk (/dev/sdh)

You say five disks, but nowhere in your post do you mention what raid
mode you were using, nor do you post the output of btrfs filesystem show
and btrfs filesystem df, as suggested on the wiki, which would list that
information.

FWIW, the btrfs defaults for a multi-device filesystem are raid1
metadata, raid0 data.  If you didn't specify a raid level at mkfs time,
it's very likely that's what you're using.  The scrub results seem to
support this: if the data had been raid1 or raid10, nearly all the
errors should have been correctable by pulling from the second copy.
And raid5/6 should have been able to recover from parity, tho that mode
is new enough that it's still not recommended, as the chances of bugs,
and thus of failure to work properly, are much higher.

So you really should have been using raid1/10 if you wanted
device-failure tolerance, but you didn't say, and if you're using the
defaults, as seems reasonably likely, your data was raid0, and thus it's
likely many/most files are either gone or damaged beyond repair.

(As it happens, I have a number of btrfs raid1 data/metadata filesystems
on a pair of partitioned ssds, with each btrfs on a corresponding
partition on both of them, and one of the ssds is developing bad sectors
and basically slowly failing.  But the other member of the raid1 pair is
solid and I have backups, as well as a spare I can replace the failing
one with when I decide it's time, so I've been letting the bad one stick
around, due as much as anything to morbid curiosity, watching it slowly
fail.  So I know exactly how scrub on btrfs raid1 behaves in a
bad-sector case, pulling the copy from the good device to overwrite the
bad copy, triggering the device's sector remapping in the process.
Despite all the read errors, they've all been correctable, because I'm
using raid1 for both data and metadata.)

> Because filesystem still mounts, I assume I should do "btrfs device
> delete /dev/sdd /mntpoint" and then restore damaged files from backup.

You can try a replace, but with a failing drive still connected, people
report mixed results.  It's likely to fail, as it can't read certain
blocks to transfer them to the new device.
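For reference, roughly the commands involved.  This is untested here,
/dev/sde is purely a placeholder for whatever new device you'd attach,
and the show/df output at the top is what would have answered the
raid-mode question above:

  # what raid profiles are data and metadata actually using?
  btrfs filesystem show /mntpoint
  btrfs filesystem df /mntpoint

  # if it does turn out to be raid1/10, starting the replace with -r
  # reads from the other mirrors where a good copy exists, touching the
  # failing /dev/sdd only when it has to
  btrfs replace start -r /dev/sdd /dev/sde /mntpoint
  btrfs replace status /mntpoint

With raid0 data there's no second copy to read from, so -r can't save a
replace from unreadable blocks either.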
With raid1 or better, physically disconnecting the failing device and
doing a device delete missing (or replace missing, but AFAIK that
doesn't work with released versions and I'm not sure it's even in
integration yet, tho there are patches on-list that should make it work)
can work.  With raid0/single, you can still mount with a missing device
if you use degraded,ro, but obviously that'll only let you try to copy
files off, and you'll likely not have a lot of luck with raid0, with
files missing, tho a bit more luck with single.

In the likely raid0/single case, your best bet is probably to try
copying off what you can, and/or restoring from backups.  See the
discussion below.

> Are all affected files listed in journal? there's messages about "x
> callbacks suppressed" so I'm not sure and if there aren't how to get
> full list of damaged files?
> Also I wonder if there are any tools to recover partial file fragments
> and reconstruct file? (where missing fragments filled with nulls)
> I assume that there's no point in running "btrfs check
> --check-data-csum" because scrub already does check that?

There's no such partial-file, null-fill tool shipped just yet.  Those
files normally simply trigger errors when you try to read them, because
btrfs won't let you at them if the checksum doesn't verify.

There /is/, however, a command that can be used to either regenerate or
zero out the checksum tree.  See btrfs check --init-csum-tree.  Current
versions recalculate the csums; older versions (btrfsck, as it was
before btrfs check) simply zeroed it out.  Then you can read the files
despite bad checksums, tho you'll still get errors if a block physically
cannot be read.

There's also btrfs restore, which works on the unmounted filesystem
without actually writing to it, copying the files it can read to a new
location, which of course has to be a filesystem with enough room to
restore the files to, altho it's possible to tell restore to do only
specific subdirs, for instance.

What I'd recommend depends on how complete and how recent your backup
is.  If it's complete and recent enough, probably the easiest thing is
to simply blow away the bad filesystem and start over, recovering from
the backup to a new filesystem.

If there are files you'd like to get back that weren't backed up, or
where the backup is old, then since the filesystem is mountable, I'd
probably copy everything off it that I could.  Then I'd try restore,
letting it restore to the same location I had copied to, but NOT using
the --overwrite option, so it only writes files it could restore that
the copy wasn't able to get you, as those might be slightly older
versions.

Then, if you really need more of the files, you can try btrfs check
--init-csum-tree as mentioned above, and then try mounting and see if
you can access more files.  But as these are likely to be somewhat
corrupt, I'd probably /not/ copy them to the same location as the
others.  If you have space for two copies, you might duplicate the set
of files you were able to recover with the initial copy and restore,
and use the same don't-overwrite technique on one of the sets, marking
it the possibly corrupted version.  Then you can do a diff or an rsync
dry-run to see the differences between the good version and the bad,
and examine anything spit out by the diff/rsync individually.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
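P.S.  A rough sketch of that copy / restore / compare sequence, in case
it helps.  The device node and the /mnt/... paths are just placeholders
for your setup, so adjust them and check the manpages before running any
of it:

  # 1: copy off whatever the mounted filesystem will still give you
  #    (it still mounts normally for you; degraded only matters once a
  #    device is actually missing)
  mount -o ro /dev/sdX /mnt/btrfs
  rsync -a /mnt/btrfs/ /mnt/recovery/good/
  umount /mnt/btrfs

  # 2: with it unmounted, let restore fill in what the copy missed;
  #    without -o/--overwrite it leaves already-copied files alone
  btrfs restore /dev/sdX /mnt/recovery/good/

  # 3: only if you still need more files: regenerate the csum tree,
  #    remount, and keep this possibly-corrupt set separate
  btrfs check --init-csum-tree /dev/sdX
  mount /dev/sdX /mnt/btrfs
  rsync -a /mnt/btrfs/ /mnt/recovery/suspect/

  # 4: compare the two sets; -n is a dry run, -c compares by checksum
  rsync -nac /mnt/recovery/suspect/ /mnt/recovery/good/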