linux-btrfs.vger.kernel.org archive mirror
From: Stuart Pook <slp644161@pook.it>
To: linux-btrfs@vger.kernel.org
Subject: uncorrectable errors after btrfs replace
Date: Sun, 18 Aug 2013 21:12:29 +0200
Message-ID: <52111C9D.3090704@pook.it>
In-Reply-To: <S1753593Ab3HRQvp/20130818165145Z+301@vger.kernel.org>

hi all

I moved my btrfs filesystems around using "btrfs replace" and now I have errors (lots of errors):

[63724.419779] BTRFS info (device dm-12): csum failed ino 9340 off 8192 csum 717036259 private 94677163

: root; time  btrfs  scrub start -Bd /disks/backups
scrub device /dev/dm-11 (id 1) done
	scrub started at Sun Aug 18 15:17:50 2013 and finished after 4487 seconds
	total bytes scrubbed: 576.46GB with 261883 errors
	error details: csum=261883
	corrected errors: 0, uncorrectable errors: 261883, unverified errors: 0

I had two 2 TB disks whose data I needed to swap (/mnt on a WD-Black and /disks/backups on an HD204UI). Both had btrfs filesystems, but /disks/backups was encrypted using LUKS. I had a spare 640 GB WD-Blue disk that I plugged into a SATA dock for this operation.

I shrank /disks/backups with "btrfs filesystem resize" to fit in 590 GB, then I "btrfs replace"d /disks/backups to a new LUKS partition on the WD-Blue disk. Then I "btrfs replace"d /mnt to the HD204UI. Then I "btrfs replace"d the backup data to a new LUKS partition on the WD-Black. I then got I/O errors reading /disks/backups.

I'm using: Linux kooka 3.10-2-amd64 #1 SMP Debian 3.10.5-1 (2013-08-07) x86_64 GNU/Linux
and btrfs-tools 0.19+20130315-5

rsync: write failed on "/disks/backups/snapshot_rsync/stuart/secret/current/.purple/accounts.xml": Input/output error (5)

Lots of files on /disks/backups have errors. smartctl reports PASSED for all the drives.

This is a summary of what I did:

     6  btrfs filesystem resize 580g .
     9  time btrfs  balance start -musage=1 -dusage=1 . && time  btrfs filesystem resize 580g .
    10  time  btrfs filesystem resize 590g .
    12  cryptsetup luksOpen /dev/sdd2 640Gb
    13  time btrfs replace start  /dev/dm-11 /dev/dm-12 -B /disks/backups
    14  time btrfs replace start  /dev/dm-11 /dev/dm-12 -B /disks/backups
    18  cryptsetup remove _dev_sdc2
    19  fdisk /dev/sdc
    32  time btrfs replace start  /dev/sdb1  /dev/sdc2 -B /mnt
    34  btrfs filesystem label  /dev/dm-12
    36   btrfs filesystem label /disks/backups backups2Tb
    38   btrfs filesystem label /disks/backups
    39  cryptsetup luksFormat /dev/sdb2
    40  cryptsetup luksAddKey /dev/sdb2
    41  cryptsetup open  /dev/sdb2 newbackups
    43  time btrfs replace start  /dev/dm-12  /dev/dm-11 -B /disks/backups
    44  btrfs filesystem show
    45  cryptsetup status 640Gb
    46  cryptsetup remove 640Gb
    47  btrfs filesystem show
    49  btrfs filesystem resize max /disks/backups/
    54  /etc/local/backups
# errors !
    57  time  btrfs  scrub start -Bd /disks/backups

There are lots of errors in /var/log/syslog:

Aug 18 12:27:51 kooka kernel: [54113.507151] btrfs: dev_replace from /dev/mapper/640Gb (devid 1) to /dev/dm-11) started
Aug 18 12:27:51 kooka kernel: [54113.601334] device label backups2Tb devid 1 transid 39282 /dev/dm-12
Aug 18 12:28:03 kooka kernel: [54125.020038] ata10.00: exception Emask 0x10 SAct 0x3dfe0ff0 SErr 0x780100 action 0x6
Aug 18 12:28:03 kooka kernel: [54125.020043] ata10.00: irq_stat 0x08000000
Aug 18 12:28:03 kooka kernel: [54125.020047] ata10: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }
Aug 18 12:28:03 kooka kernel: [54125.020050] ata10.00: failed command: READ FPDMA QUEUED
Aug 18 12:28:03 kooka kernel: [54125.020056] ata10.00: cmd 60/18:20:c0:18:0b/00:00:00:00:00/40 tag 4 ncq 12288 in
Aug 18 12:28:03 kooka kernel: [54125.020056]          res 40/00:5c:f0:1a:0b/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Aug 18 12:28:03 kooka kernel: [54125.020059] ata10.00: status: { DRDY }
[...]
Aug 18 12:28:03 kooka kernel: [54125.020262] ata10: hard resetting link
Aug 18 12:28:03 kooka kernel: [54125.512032] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 18 12:28:03 kooka kernel: [54125.523759] ata10.00: configured for UDMA/133
Aug 18 12:28:03 kooka kernel: [54125.536380] ata10: EH complete
Aug 18 12:28:04 kooka kernel: [54125.770176] ata10.00: exception Emask 0x10 SAct 0x7fffffff SErr 0x780100 action 0x6
Aug 18 12:28:04 kooka kernel: [54125.770181] ata10.00: irq_stat 0x08000000
Aug 18 12:28:04 kooka kernel: [54125.770184] ata10: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }
[...]
Aug 18 12:28:17 kooka kernel: [54138.957095] ata10.00: status: { DRDY }
Aug 18 12:28:17 kooka kernel: [54138.957100] ata10: hard resetting link
Aug 18 12:28:17 kooka kernel: [54139.448029] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 18 12:28:17 kooka kernel: [54139.449972] ata10.00: configured for UDMA/133
Aug 18 12:28:17 kooka kernel: [54139.464065] ata10: EH complete
[...]

Aug 18 12:38:31 kooka kernel: [54753.527070] btrfs: checksum error at logical 52642709504 on dev /dev/dm-12, sector 104931328, root 1281, inode 42152, offset 0, length 4096, links 1 (path: XXXXX)
[...]
Aug 18 12:38:31 kooka kernel: [54753.606566] btrfs: bdev /dev/dm-12 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[...]
Aug 18 12:38:32 kooka kernel: [54753.679513] btrfs: bdev /dev/dm-12 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
Aug 18 12:38:36 kooka kernel: [54758.076089] scrub_handle_errored_block: 15173 callbacks suppressed
[...]
Aug 18 12:38:52 kooka kernel: [54774.647414] btrfs: bdev /dev/dm-12 errs: wr 0, rd 0, flush 0, corrupt 65313, gen 0
[...]
Aug 18 15:24:03 kooka kernel: [64685.641464] btrfs: unable to fixup (regular) error at logical 52643758080 on dev /dev/dm-11

It appears that my WD-Blue or its connection is bad, but why didn't "btrfs replace" give me an error? It seems to have read bad data without checking the checksum and then written the bad data to the new disk.
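The "errs: wr 0, rd 0, flush 0, corrupt N, gen 0" lines in the log above are per-device error counters, and "btrfs device stats" reports the same kind of counters on demand. The following is a minimal sketch for checking whether those counters moved during a replace. It assumes the usual "[device].counter value" output format of that command, which should be verified against the installed btrfs-progs version; total_device_errors is a hypothetical helper name, not part of btrfs-progs.

```shell
#!/bin/sh
# Sum every error counter in `btrfs device stats` output read from stdin.
# Assumption: each line looks like "[/dev/dm-12].corruption_errs 65313",
# i.e. the counter value is the last whitespace-separated field.
total_device_errors() {
    awk '{ total += $NF } END { print total + 0 }'
}

# Hypothetical usage around a replace (paths taken from the post):
#   btrfs device stats /disks/backups | total_device_errors   # before
#   btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
#   btrfs device stats /disks/backups | total_device_errors   # after
```

If the total grows between the two readings, the copy hit errors even though the replace command itself returned normally.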

ata10 is the WD-Blue:

Aug 17 21:26:19 kooka kernel: [    1.410573] ata10.00: ATA-8: WDC WD6400AAKS-00A7B2, 01.03B01, max UDMA/133

: root; sleep 2m;  smartctl -a   /dev/sdd
[...]
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
   3 Spin_Up_Time            0x0027   161   158   021    Pre-fail  Always       -       4933
   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       327
   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
   7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
   9 Power_On_Hours          0x0032   070   070   000    Old_age   Always       -       22077
  10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
  11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       245
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       169
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       327
194 Temperature_Celsius     0x0022   096   090   000    Old_age   Always       -       51
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12080
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
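The overall SMART verdict does not reflect link-level problems, but attribute 199 (UDMA_CRC_Error_Count) in the table above has a raw value of 12080, which usually points at the cable, dock, or connector rather than the platters. A minimal sketch for extracting that raw value, assuming the 10-column attribute table layout shown above; udma_crc_raw is a hypothetical helper name:

```shell
#!/bin/sh
# Print the raw value of SMART attribute 199 (UDMA_CRC_Error_Count)
# from `smartctl -A` output supplied on stdin.
# Assumption: the attribute table has 10 columns, with RAW_VALUE last,
# as in the smartctl output quoted in this post.
udma_crc_raw() {
    awk '$1 == 199 { print $10 }'
}

# Hypothetical usage (device name taken from the post):
#   smartctl -A /dev/sdd | udma_crc_raw
```

Comparing the value before and after a large copy shows whether new CRC errors occurred on the link during that copy.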

I guess that /disks/backups is mostly dead and that I should just reformat it. What do you think? Next time I'll watch /var/log/syslog, but I would have preferred that "btrfs replace" stop when it hits errors.
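Watching the log during a long replace could be sketched roughly as follows. The match strings are copied from the log excerpts in this post, so other kernel versions may word these messages differently; count_storage_errors is a hypothetical helper name.

```shell
#!/bin/sh
# Count link-level ATA errors and btrfs checksum errors in a kernel log file.
# Assumption: the messages contain the exact strings seen in this post's log
# ("ATA bus error" and "btrfs: checksum error").
count_storage_errors() {
    log=$1
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    ata=$(grep -c -e 'ATA bus error' "$log" || true)
    csum=$(grep -c -e 'btrfs: checksum error' "$log" || true)
    echo "ata_bus_errors=$ata btrfs_csum_errors=$csum"
}

# Hypothetical usage while a replace is running:
#   count_storage_errors /var/log/syslog
```

Non-zero counts while the replace is still running would be a reason to stop and check the cabling before the source device is removed.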

thanks, Stuart


