From: Charles Cazabon <charlesc-lists-btrfs@pyropus.ca>
To: btrfs list <linux-btrfs@vger.kernel.org>
Subject: Re: Is `btrfsck --repair` supposed to actually repair problems?
Date: Tue, 1 Oct 2013 17:46:55 -0600
Message-ID: <20131001234655.GA7937@pyropus.ca>
In-Reply-To: <13617928-CDCE-439D-B887-ED42F0E43F12@colorremedies.com>

[-- Attachment #1: Type: text/plain, Size: 3034 bytes --]

Hi, Chris,

Chris Murphy <lists@colorremedies.com> wrote:
> On Oct 1, 2013, at 3:12 PM, Charles Cazabon
> <charlesc-lists-btrfs@pyropus.ca> wrote:
> 
> > Running btrfsck with the --repair option, however, does not appear to fix
> > these [checksum verify] problems.  I'll attach the complete output of
> > running with the --repair option; running btrfsck in check-only mode
> > afterwards reports largely the same checksum errors as it did originally,
> > prior to "repair".  Is something wrong?
> 
> It looks like the file system thinks the file has changed and isn't matching
> checksum. That's not obviously fixable unless both data and metadata are
> raid1.

Perhaps this wasn't clear from my original message, but I'm not using btrfs'
RAID or lvm-like capabilities.  The filesystem is on an LVM logical volume,
with the actual underlying storage being an 8-disk RAID-6 array (mdadm array).
So the stack is:

    vanilla btrfs filesystem (not using subvolumes, btrfs' multiple device
       support or any other advanced features)

    LVM logical volume

    LVM volume group

    LVM physical volume

    dm-crypt / LUKS encrypted volume

    mdadm RAID-6 array

    8 x SATA disks

> More information is needed:

Okay:

  # btrfs fi df /media/bigbackup/
  Data: total=4.53TB, used=4.22TB
  System, DUP: total=8.00MB, used=508.00KB
  System: total=4.00MB, used=0.00
  Metadata, DUP: total=18.00GB, used=17.13GB
  Metadata: total=8.00MB, used=0.00
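
For readers who want to see how full each chunk type is, the output above can be turned into percentages. The following is only a minimal sketch: the function name `parse_fi_df` is hypothetical, the units are assumed to be binary multiples (1 KB = 1024 bytes), and a bare `0.00` is assumed to mean zero bytes.

```python
import re

# The `btrfs fi df` output quoted in this message, copied verbatim.
FI_DF_OUTPUT = """\
Data: total=4.53TB, used=4.22TB
System, DUP: total=8.00MB, used=508.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=18.00GB, used=17.13GB
Metadata: total=8.00MB, used=0.00
"""

# Assumption: the suffixes are binary multiples.
UNITS = {"": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

LINE_RE = re.compile(
    r"^\s*(?P<kind>[^:]+):\s*"
    r"total=(?P<total>[\d.]+)(?P<tu>[KMGT]B)?,\s*"
    r"used=(?P<used>[\d.]+)(?P<uu>[KMGT]B)?\s*$"
)


def to_bytes(number, unit):
    """Convert a number and an optional unit suffix to bytes."""
    return float(number) * UNITS[unit or ""]


def parse_fi_df(text):
    """Return (kind, total_bytes, used_bytes, percent_used) per matching line."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a "kind: total=..., used=..." line
        total = to_bytes(m["total"], m["tu"])
        used = to_bytes(m["used"], m["uu"])
        pct = 100.0 * used / total if total else 0.0
        rows.append((m["kind"].strip(), total, used, pct))
    return rows


if __name__ == "__main__":
    for kind, total, used, pct in parse_fi_df(FI_DF_OUTPUT):
        print(f"{kind:14s} {pct:5.1f}% used")
```

Run against the figures above, this puts both the Data chunks and the DUP Metadata chunks at over 90% of their allocated space.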

> btrfs show

This fails with `btrfs: unknown token 'show'`.

> dmesg | grep -i btrfs

After mounting the filesystem read-only, the following ends up in the syslog:

  [13333.117462] Btrfs loaded
  [13333.157078] device label bigbackup devid 1 transid 5249
      /dev/mapper/extbackup-bigbackup
  [13333.158445] btrfs: disk space caching is enabled

That's the only btrfs-related info that gets logged.

> dmesg | grep ata<port#>
> 
> I'm assuming it's a SATA drive,

As I say, it's 8 disks (yes, SATA).  What info exactly do you want about the
disks and ports?  The log is quite noisy because these are behind SATA port
multipliers, and there are a bunch of other SATA drives in the system.  But if
I filter out all the extra stuff, then when I power up the port-multiplier
boxes that the disks are in, what's logged is 126 lines (much of it garbage
from not all possible multiplier ports being in use), log attached.

The 8 disks are, as you can see, all identical Seagate units:

  ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133

> And report the version of btrfs-progs.

Btrfs v0.20-rc1-358-g194aa4a-dirty

That's what I get when I build from the git repository at
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git

git insists I'm fully up to date, though the last time I pulled before today
was over a month ago.

Charles

-- 
-----------------------------------------------------------------------
Charles Cazabon
GPL'ed software available at:               http://pyropus.ca/software/
-----------------------------------------------------------------------

[-- Attachment #2: sata.log --]
[-- Type: text/plain, Size: 7307 bytes --]

[    1.927026] ata11: SATA max UDMA/100 host m128@0xfd8ff000 port 0xfd8f8000 irq 19
[    1.927065] ata12: SATA max UDMA/100 host m128@0xfd8ff000 port 0xfd8fa000 irq 19
[    4.008746] ata11: SATA link down (SStatus 0 SControl 0)
[    6.091302] ata12: SATA link down (SStatus 0 SControl 0)
[  372.741259] ata11: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen
[  372.741270] ata11: irq_stat 0x00b40090, PHY RDY changed
[  372.741284] ata11: hard resetting link
[  374.710712] ata12: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen
[  374.710724] ata12: irq_stat 0x00b40090, PHY RDY changed
[  374.710738] ata12: hard resetting link
[  382.758711] ata11: softreset failed (timeout)
[  382.758724] ata11: hard resetting link
[  384.729193] ata12: softreset failed (timeout)
[  384.729206] ata12: hard resetting link
[  387.941314] ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[  387.941715] ata11.15: Port Multiplier 1.2, 0x197b:0x0325 r0, 15 ports, feat 0x5/0xf
[  387.946096] ata11.00: hard resetting link
[  388.314054] ata11.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  388.314105] ata11.01: hard resetting link
[  388.682496] ata11.01: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  388.682548] ata11.02: hard resetting link
[  389.051042] ata11.02: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  389.051095] ata11.03: hard resetting link
[  389.419480] ata11.03: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  389.419535] ata11.04: hard resetting link
[  389.927921] ata12: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[  389.928310] ata12.15: Port Multiplier 1.2, 0x197b:0x0325 r0, 15 ports, feat 0x5/0xf
[  389.939731] ata12.00: hard resetting link
[  390.308622] ata12.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  390.308677] ata12.01: hard resetting link
[  390.448517] ata11.04: failed to resume link (SControl 0)
[  390.448851] ata11.04: SATA link down (SStatus 0 SControl 0)
[  390.448932] ata11.05: hard resetting link
[  390.677099] ata12.01: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  390.677155] ata12.02: hard resetting link
[  391.045600] ata12.02: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  391.045654] ata12.03: hard resetting link
[  391.414090] ata12.03: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  391.414143] ata12.04: hard resetting link
[  391.477925] ata11.05: failed to resume link (SControl 0)
[  391.478259] ata11.05: SATA link down (SStatus 0 SControl 0)
[  391.478339] ata11.06: hard resetting link
[  392.443117] ata12.04: failed to resume link (SControl 0)
[  392.443458] ata12.04: SATA link down (SStatus 0 SControl 0)
[  392.443540] ata12.05: hard resetting link
[  392.507226] ata11.06: failed to resume link (SControl 0)
[  392.507563] ata11.06: SATA link down (SStatus 0 SControl 0)
[  392.507644] ata11.07: hard resetting link
[  393.472419] ata12.05: failed to resume link (SControl 0)
[  393.472758] ata12.05: SATA link down (SStatus 0 SControl 0)
[  393.472842] ata12.06: hard resetting link
[  393.536548] ata11.07: failed to resume link (SControl 0)
[  393.536884] ata11.07: SATA link down (SStatus 0 SControl 0)
[  393.536964] ata11.08: hard resetting link
[  394.501715] ata12.06: failed to resume link (SControl 0)
[  394.502072] ata12.06: SATA link down (SStatus 0 SControl 0)
[  394.502154] ata12.07: hard resetting link
[  394.565850] ata11.08: failed to resume link (SControl 0)
[  394.566187] ata11.08: SATA link down (SStatus 0 SControl 0)
[  394.566319] ata11.09: hard resetting link
[  395.531029] ata12.07: failed to resume link (SControl 0)
[  395.531363] ata12.07: SATA link down (SStatus 0 SControl 0)
[  395.531446] ata12.08: hard resetting link
[  395.595131] ata11.09: failed to resume link (SControl 0)
[  395.595469] ata11.09: SATA link down (SStatus 0 SControl 0)
[  395.595550] ata11.10: hard resetting link
[  396.560399] ata12.08: failed to resume link (SControl 0)
[  396.560736] ata12.08: SATA link down (SStatus 0 SControl 0)
[  396.560818] ata12.09: hard resetting link
[  396.624462] ata11.10: failed to resume link (SControl 0)
[  396.624855] ata11.10: SATA link down (SStatus 0 SControl 0)
[  396.624963] ata11.11: hard resetting link
[  397.589718] ata12.09: failed to resume link (SControl 0)
[  397.590056] ata12.09: SATA link down (SStatus 0 SControl 0)
[  397.590137] ata12.10: hard resetting link
[  397.653780] ata11.11: failed to resume link (SControl 0)
[  397.654112] ata11.11: SATA link down (SStatus 0 SControl 0)
[  397.654193] ata11.12: hard resetting link
[  398.619001] ata12.10: failed to resume link (SControl 0)
[  398.619338] ata12.10: SATA link down (SStatus 0 SControl 0)
[  398.619420] ata12.11: hard resetting link
[  398.683119] ata11.12: failed to resume link (SControl 0)
[  398.683451] ata11.12: SATA link down (SStatus 0 SControl 0)
[  398.683530] ata11.13: hard resetting link
[  399.648291] ata12.11: failed to resume link (SControl 0)
[  399.648655] ata12.11: SATA link down (SStatus 0 SControl 0)
[  399.648744] ata12.12: hard resetting link
[  399.712480] ata11.13: failed to resume link (SControl 0)
[  399.712817] ata11.13: SATA link down (SStatus 0 SControl 0)
[  399.712897] ata11.14: hard resetting link
[  400.677675] ata12.12: failed to resume link (SControl 0)
[  400.678012] ata12.12: SATA link down (SStatus 0 SControl 0)
[  400.678097] ata12.13: hard resetting link
[  400.741762] ata11.14: failed to resume link (SControl 0)
[  400.742101] ata11.14: SATA link down (SStatus 0 SControl 0)
[  400.742911] ata11.00: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  400.742921] ata11.00: 5860533168 sectors, multi 0: LBA48 
[  400.743635] ata11.00: configured for UDMA/100
[  400.744397] ata11.01: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  400.744409] ata11.01: 5860533168 sectors, multi 0: LBA48 
[  400.765387] ata11.01: configured for UDMA/100
[  400.766129] ata11.02: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  400.766140] ata11.02: 5860533168 sectors, multi 0: LBA48 
[  400.787661] ata11.02: configured for UDMA/100
[  400.788424] ata11.03: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  400.788434] ata11.03: 5860533168 sectors, multi 0: LBA48 
[  400.808638] ata11.03: configured for UDMA/100
[  400.808738] ata11: EH complete
[  401.706984] ata12.13: failed to resume link (SControl 0)
[  401.707321] ata12.13: SATA link down (SStatus 0 SControl 0)
[  401.707405] ata12.14: hard resetting link
[  402.736244] ata12.14: failed to resume link (SControl 0)
[  402.736603] ata12.14: SATA link down (SStatus 0 SControl 0)
[  402.737449] ata12.00: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  402.737460] ata12.00: 5860533168 sectors, multi 0: LBA48 
[  402.760315] ata12.00: configured for UDMA/100
[  402.761058] ata12.01: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  402.761068] ata12.01: 5860533168 sectors, multi 0: LBA48 
[  402.761803] ata12.01: configured for UDMA/100
[  402.762551] ata12.02: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  402.762560] ata12.02: 5860533168 sectors, multi 0: LBA48 
[  402.763284] ata12.02: configured for UDMA/100
[  402.764008] ata12.03: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  402.764014] ata12.03: 5860533168 sectors, multi 0: LBA48 
[  402.764778] ata12.03: configured for UDMA/100
[  402.764876] ata12: EH complete
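
Most of the log above is noise from unpopulated port-multiplier ports. As a rough illustration of how it could be filtered down to the drives that were actually detected, here is a minimal sketch. The function name `summarize` and the two regular expressions are assumptions based only on the line formats visible in this log, and `SAMPLE_LOG` is a short excerpt of it rather than the whole file.

```python
import re
from collections import Counter

# A few lines copied from the attached log, used as sample input.
SAMPLE_LOG = """\
[  387.941314] ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[  388.314054] ata11.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  390.448851] ata11.04: SATA link down (SStatus 0 SControl 0)
[  400.742911] ata11.00: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  400.744397] ata11.01: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
[  402.737449] ata12.00: ATA-8: ST3000DM001-1E6166, CC45, max UDMA/133
"""

# Link state of a port behind a multiplier, e.g. "ata11.04: SATA link down".
LINK_RE = re.compile(r"\] (ata\d+\.\d+): SATA link (up|down)\b")
# Drive identification, e.g. "ata11.00: ATA-8: <model>, <firmware>, ...".
DISK_RE = re.compile(r"\] (ata\d+\.\d+): ATA-\d+: ([^,]+), ([^,]+),")


def summarize(log_text):
    """Return (link_state, disks, per_host) for the multiplier ports in log_text."""
    link_state = {}  # port -> "up" or "down" (last state seen)
    disks = {}       # port -> (model, firmware)
    for line in log_text.splitlines():
        m = LINK_RE.search(line)
        if m:
            link_state[m.group(1)] = m.group(2)
            continue
        m = DISK_RE.search(line)
        if m:
            disks[m.group(1)] = (m.group(2), m.group(3))
    # Count identified drives per host port (the part before the dot).
    per_host = Counter(port.split(".")[0] for port in disks)
    return link_state, disks, per_host


if __name__ == "__main__":
    link_state, disks, per_host = summarize(SAMPLE_LOG)
    for port, (model, firmware) in sorted(disks.items()):
        print(port, model, firmware)
    print(dict(per_host))
```

Passing the full contents of sata.log to `summarize` instead of the excerpt should list every identified drive and leave out the "failed to resume link" lines for the empty ports.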

