open_ctree failed after ATA errors

All of lore.kernel.org
 help / color / mirror / Atom feed

* open_ctree failed after ATA errors
@ 2014-11-11 15:51 Florian Bruhin
  2014-11-11 18:07 ` Robert White
  2014-11-11 20:08 ` Chris Murphy
  0 siblings, 2 replies; 4+ messages in thread
From: Florian Bruhin @ 2014-11-11 15:51 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 4642 bytes --]

I have the following setup:

- Two harddisks
    - Both individually encrypted using LUKS
    - Both combined into a btrfs using the btrfs raid1 feature

- The above duplicated twice:
    - /dev/mapper/data1 and /dev/mapper/data2 -> /mnt/data
    - /dev/mapper/secdata1 and /dev/mapper/secdata2 -> /mnt/secdata

Recently, I saw the following messages in my kernel logs all few days:

    ata6.00: exception Emask 0x10 SAct 0x40000 SErr 0x400000 action 0x6 frozen
    ata6.00: irq_stat 0x08000000, interface fatal error
    ata6: SError: { Handshk }
    ata6.00: failed command: WRITE FPDMA QUEUED
    ata6.00: cmd 61/08:90:e8:29:85/01:00:03:00:00/40 tag 18 ncq 135168 out
             res 40/00:94:e8:29:85/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
    ata6.00: status: { DRDY }
    ata6: hard resetting link
    ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata6.00: configured for UDMA/133
    ata6: EH complete
    ata6.00: exception Emask 0x10 SAct 0x800000 SErr 0x400000 action 0x6 frozen
    ata6.00: irq_stat 0x08000000, interface fatal error
    ata6: SError: { Handshk }
    ata6.00: failed command: WRITE FPDMA QUEUED
    ata6.00: cmd 61/00:b8:f0:2a:85/02:00:03:00:00/40 tag 23 ncq 262144 out
             res 40/00:bc:f0:2a:85/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
    ata6.00: status: { DRDY }
    ata6: hard resetting link
    ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata6.00: configured for UDMA/133
    ata6: EH complete

I thought maybe it was just a temporary problem or related to
upgrading the kernel recently (3.17.1 -> 3.17.2) and not rebooting
yet, so I rebooted.

Since then, I could run cryptsetup luksOpen without problems, but
mounting the devices hanged for ~15 seconds and then returned without
error, but didn't mount anything.

When strace'ing mount, it hanged here:

    mount("/dev/mapper/data1", "/mnt/data", "btrfs", MS_MGC_VAL, NULL)

(which then returned 0). I didn't see anything in the kernel logs.

I then tried the following:

    # cryptsetup luksClose ... # for all 4 disks
    # cryptsetup luksOpen ...  # for all 4 disks
    # btrfs device scan --all-devices
    # mount /dev/mapper/data1 /mnt/data
    # mount /dev/mapper/secdata1 /mnt/data

The same thing happened, and I then saw this in the kernel logs:

    [Nov11 15:33] BTRFS info (device dm-3): disk space caching is enabled
    [Nov11 15:34] BTRFS info (device dm-3): disk space caching is enabled
    [Nov11 15:35] BTRFS info (device dm-3): disk space caching is enabled
    [Nov11 15:36] BTRFS info (device dm-3): disk space caching is enabled
    [Nov11 15:37] BTRFS info (device dm-3): disk space caching is enabled
    [ +16.054127] BTRFS: open_ctree failed
    [Nov11 15:38] BTRFS info (device dm-3): disk space caching is enabled
    [Nov11 16:02] BTRFS info (device dm-2): disk space caching is enabled

How could I mount these volumes again? Is it a good idea to use
btrfs-zero-log as described in [1]?

Some other information:

- Distribution: Archlinux
- uname -a: Linux moody 3.17.2-1-ARCH #1 SMP PREEMPT Thu Oct 30 20:49:39 CET 2014 x86_64 GNU/Linux
- btrfs --version: Btrfs v3.17
- btrfs fi show:

    Label: 'secdata2'  uuid: 38267260-b656-4c66-a123-5f9214066ae1
            Total devices 2 FS bytes used 2.06TiB
            devid    1 size 3.64TiB used 2.06TiB path /dev/mapper/secdata2
            devid    2 size 3.64TiB used 2.06TiB path /dev/mapper/secdata1

    Label: 'data2'  uuid: b67ca50d-dbde-445d-922a-3479849b5499
            Total devices 2 FS bytes used 2.35TiB
            devid    1 size 2.73TiB used 2.69TiB path /dev/mapper/data1
            devid    3 size 2.73TiB used 2.69TiB path /dev/mapper/data2

- btrfs fi df /mnt/data:

    Data, single: total=58.42GiB, used=14.48GiB
    System, single: total=4.00MiB, used=12.00KiB
    Metadata, single: total=1.01GiB, used=471.80MiB
    GlobalReserve, single: total=160.00MiB, used=0.00B

- btrfs fi df /mnt/secdata

    Data, single: total=58.42GiB, used=14.48GiB
    System, single: total=4.00MiB, used=12.00KiB
    Metadata, single: total=1.01GiB, used=471.80MiB
    GlobalReserve, single: total=160.00MiB, used=0.00B

If there's anything else I can provide please let me know. Please Cc
me on replies, as I'm not on the list. Thanks in advance!

Florian

[1] https://btrfs.wiki.kernel.org/index.php/Btrfs-zero-log

-- 
http://www.the-compiler.org | me@the-compiler.org (Mail/XMPP)
             GPG 0xFD55A072 | http://the-compiler.org/pubkey.asc
         I love long mails! | http://email.is-not-s.ms/

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: open_ctree failed after ATA errors
  2014-11-11 15:51 open_ctree failed after ATA errors Florian Bruhin
@ 2014-11-11 18:07 ` Robert White
  2014-11-12  5:42   ` Florian Bruhin
  2014-11-11 20:08 ` Chris Murphy
  1 sibling, 1 reply; 4+ messages in thread
From: Robert White @ 2014-11-11 18:07 UTC (permalink / raw)
  To: Florian Bruhin, linux-btrfs

The below is a hard disk going bad or other systematic problem at the 
hardware level (controller card, interrupt conflict, etc).

In fact, given "ata6.00: irq_stat 0x08000000, interface fatal error" its 
pretty much a smoking gun about your controller.

Since you just upgraded your kernel I'd check to make sure you have the 
correct chipset and controller card selected. Look at /proc/interrupts 
and see if the controller is sharing an interrupt with some other device 
that could be crossing it up. Play with your MSI/MSI-X settings (if they 
are in use try disabling them).

I'd also actvate SMART and get the smart tools (e.g. "smartmontools" in 
gentoo, so probably something similar for your distro) and check the 
drive health.

So the stack is
Application ->
  File System ->
   Device Mapper ->
    Encryption ->
     Controller ->
      Wiring ->
       Drive

You are seeing write failures in the controller->wiring->drive section 
somewhere.

Cryptsetup is succeeding because the open operation is read-only. That 
is cryptsetup reads the LUKS block (first 4k of the partition) and does 
the key work and device mapper setup completely in memory without 
writing to the physical media at all.

Another possible area is if you ever resized the physical partitions but 
didn't properly resize the cryptsetup layer with "cryptsetup resize", 
but that woudl be unlikly to affect multiple drives (unless the mistake 
was symmetric, e.g. you did it to both drives).

Basically your problem is _way_ below the BTRFS level, but BTRFS is the 
first layer thats actually trying to write to the drives so it's the 
first level client to fail.

On 11/11/2014 07:51 AM, Florian Bruhin wrote:
>      ata6.00: exception Emask 0x10 SAct 0x40000 SErr 0x400000 action 0x6 frozen
>      ata6.00: irq_stat 0x08000000, interface fatal error
>      ata6: SError: { Handshk }
>      ata6.00: failed command: WRITE FPDMA QUEUED
>      ata6.00: cmd 61/08:90:e8:29:85/01:00:03:00:00/40 tag 18 ncq 135168 out
>               res 40/00:94:e8:29:85/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
>      ata6.00: status: { DRDY }
>      ata6: hard resetting link
>      ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>      ata6.00: configured for UDMA/133
>      ata6: EH complete
>      ata6.00: exception Emask 0x10 SAct 0x800000 SErr 0x400000 action 0x6 frozen
>      ata6.00: irq_stat 0x08000000, interface fatal error
>      ata6: SError: { Handshk }
>      ata6.00: failed command: WRITE FPDMA QUEUED
>      ata6.00: cmd 61/00:b8:f0:2a:85/02:00:03:00:00/40 tag 23 ncq 262144 out
>               res 40/00:bc:f0:2a:85/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
>      ata6.00: status: { DRDY }
>      ata6: hard resetting link
>      ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>      ata6.00: configured for UDMA/133
>      ata6: EH complete

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: open_ctree failed after ATA errors
  2014-11-11 15:51 open_ctree failed after ATA errors Florian Bruhin
  2014-11-11 18:07 ` Robert White
@ 2014-11-11 20:08 ` Chris Murphy
  1 sibling, 0 replies; 4+ messages in thread
From: Chris Murphy @ 2014-11-11 20:08 UTC (permalink / raw)
  To: Btrfs BTRFS


On Nov 11, 2014, at 8:51 AM, Florian Bruhin <me@the-compiler.org> wrote:

> I have the following setup:
> 
> - Two harddisks
>    - Both individually encrypted using LUKS
>    - Both combined into a btrfs using the btrfs raid1 feature
> 
> - The above duplicated twice:
>    - /dev/mapper/data1 and /dev/mapper/data2 -> /mnt/data
>    - /dev/mapper/secdata1 and /dev/mapper/secdata2 -> /mnt/secdata
> 
> Recently, I saw the following messages in my kernel logs all few days:
> 
>    ata6.00: exception Emask 0x10 SAct 0x40000 SErr 0x400000 action 0x6 frozen
>    ata6.00: irq_stat 0x08000000, interface fatal error
>    ata6: SError: { Handshk }
>    ata6.00: failed command: WRITE FPDMA QUEUED
>    ata6.00: cmd 61/08:90:e8:29:85/01:00:03:00:00/40 tag 18 ncq 135168 out
>             res 40/00:94:e8:29:85/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
>    ata6.00: status: { DRDY }
>    ata6: hard resetting link
>    ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>    ata6.00: configured for UDMA/133
>    ata6: EH complete
>    ata6.00: exception Emask 0x10 SAct 0x800000 SErr 0x400000 action 0x6 frozen
>    ata6.00: irq_stat 0x08000000, interface fatal error
>    ata6: SError: { Handshk }
>    ata6.00: failed command: WRITE FPDMA QUEUED
>    ata6.00: cmd 61/00:b8:f0:2a:85/02:00:03:00:00/40 tag 23 ncq 262144 out
>             res 40/00:bc:f0:2a:85/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
>    ata6.00: status: { DRDY }
>    ata6: hard resetting link
>    ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>    ata6.00: configured for UDMA/133
>    ata6: EH complete
> 
> I thought maybe it was just a temporary problem or related to
> upgrading the kernel recently (3.17.1 -> 3.17.2) and not rebooting
> yet, so I rebooted.
> 
> Since then, I could run cryptsetup luksOpen without problems, but
> mounting the devices hanged for ~15 seconds and then returned without
> error, but didn't mount anything.
> 
> When strace'ing mount, it hanged here:
> 
>    mount("/dev/mapper/data1", "/mnt/data", "btrfs", MS_MGC_VAL, NULL)
> 
> (which then returned 0). I didn't see anything in the kernel logs.
> 
> I then tried the following:
> 
>    # cryptsetup luksClose ... # for all 4 disks
>    # cryptsetup luksOpen ...  # for all 4 disks
>    # btrfs device scan --all-devices
>    # mount /dev/mapper/data1 /mnt/data
>    # mount /dev/mapper/secdata1 /mnt/data
> 
> The same thing happened, and I then saw this in the kernel logs:
> 
>    [Nov11 15:33] BTRFS info (device dm-3): disk space caching is enabled
>    [Nov11 15:34] BTRFS info (device dm-3): disk space caching is enabled
>    [Nov11 15:35] BTRFS info (device dm-3): disk space caching is enabled
>    [Nov11 15:36] BTRFS info (device dm-3): disk space caching is enabled
>    [Nov11 15:37] BTRFS info (device dm-3): disk space caching is enabled
>    [ +16.054127] BTRFS: open_ctree failed
>    [Nov11 15:38] BTRFS info (device dm-3): disk space caching is enabled
>    [Nov11 16:02] BTRFS info (device dm-2): disk space caching is enabled
> 
> How could I mount these volumes again? Is it a good idea to use
> btrfs-zero-log as described in [1]?

First sort out the cause of the hardware problems reported. Persistent errors are going to make things worse. Then you can try -o ro,recovery and see if that works while likely also not altering anything on the drive. If it works, take the opportunity to update backups. Then you can see whether -o recovery works and fixes the problem permanently.


Chris Murphy

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: open_ctree failed after ATA errors
  2014-11-11 18:07 ` Robert White
@ 2014-11-12  5:42   ` Florian Bruhin
  0 siblings, 0 replies; 4+ messages in thread
From: Florian Bruhin @ 2014-11-12  5:42 UTC (permalink / raw)
  To: Robert White; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 4776 bytes --]

Hi,

First of all: I noticed was able to mount my partitions when doing
with a different path, which made me investigate my /etc/fstab.

It contained this:

    LABEL=data1       /mnt/data            btrfs           defaults,noatime,nofail,device=/dev/disk/by-label/data1,device=/dev/disk/by-label/data2        0 0
    LABEL=secdata1    /mnt/secdata         btrfs            defaults,noatime,nofail,device=/dev/disk/by-label/secdata1,device=/dev/disk/by-label/secdata2        0 0

I now changed it to:

    /dev/mapper/data1    /mnt/data    btrfs defaults,noatime,nofail 0 0
    /dev/mapper/secdata1 /mnt/secdata btrfs defaults,noatime,nofail 0 0

since my initramfs scans for btrfs devices anyways. Looking at
/dev/disk/by-label, only the second disk respectively shows up:

    lrwxrwxrwx 1 root root 10 Nov 11 19:32 bootfs -> ../../sde1
    lrwxrwxrwx 1 root root 10 Nov 11 19:32 data2 -> ../../dm-3
    lrwxrwxrwx 1 root root 10 Nov 11 19:32 secdata2 -> ../../dm-4

However in /dev/mapper, all of them are listed:

    lrwxrwxrwx 1 root root       7 Nov 11 19:32 data1 -> ../dm-3
    lrwxrwxrwx 1 root root       7 Nov 11 19:32 data2 -> ../dm-1
    lrwxrwxrwx 1 root root       7 Nov 11 19:32 rootfs -> ../dm-0
    lrwxrwxrwx 1 root root       7 Nov 11 19:32 secdata1 -> ../dm-2
    lrwxrwxrwx 1 root root       7 Nov 11 19:32 secdata2 -> ../dm-4

I don't know what's going on there exactly (pointers welcome!) but it
seems the inability to mount is a different issue than the error
messages.

* Robert White <rwhite@pobox.com> [2014-11-11 10:07:25 -0800]:
> Since you just upgraded your kernel I'd check to make sure you have the
> correct chipset and controller card selected. Look at /proc/interrupts and
> see if the controller is sharing an interrupt with some other device that
> could be crossing it up.

I don't really get how to interpret that file I'm afraid. These are
the contents:

               CPU0       CPU1
      0:     754372          0   IO-APIC-edge      timer
      8:          0          1   IO-APIC-edge      rtc0
      9:          0          0   IO-APIC-fasteoi   acpi
     17:        600     114573   IO-APIC  17-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
     18:        521    1240697   IO-APIC  18-fasteoi   ohci_hcd:usb4, ohci_hcd:usb5, ohci_hcd:usb6, radeon
     24:          0          0   PCI-MSI-edge      PCIe PME
     25:        667     373771   PCI-MSI-edge      ahci
     26:        408     179898   PCI-MSI-edge      eth0
    NMI:         31         39   Non-maskable interrupts
    LOC:      76628     498768   Local timer interrupts
    SPU:          0          0   Spurious interrupts
    PMI:         31         39   Performance monitoring interrupts
    IWI:          0          2   IRQ work interrupts
    RTR:          0          0   APIC ICR read retries
    RES:     826089     252701   Rescheduling interrupts
    CAL:        270        504   Function call interrupts
    TLB:       5704       5023   TLB shootdowns
    TRM:          0          0   Thermal event interrupts
    THR:          0          0   Threshold APIC interrupts
    MCE:          0          0   Machine check exceptions
    MCP:        129        129   Machine check polls
    THR:          0          0   Hypervisor callback interrupts
    ERR:          0
    MIS:          0

> Play with your MSI/MSI-X settings (if they are in use try disabling them).

I'll try that if the errors show up again in the next few days - maybe
the reboot actually fixed it after all.

> I'd also actvate SMART and get the smart tools (e.g. "smartmontools" in
> gentoo, so probably something similar for your distro) and check the drive
> health.

I already have a monitoring running which also checks SMART, never had
any problems there. But I'll re-check by hand to be sure.

> So the stack is
> Application ->
>  File System ->
>   Device Mapper ->
>    Encryption ->
>     Controller ->
>      Wiring ->
>       Drive
> 
> You are seeing write failures in the controller->wiring->drive section
> somewhere.

Since it started happening after the upgrade, I can still hope it was
just some temporary issue if it doesn't show up again, right? ;)

> Another possible area is if you ever resized the physical partitions but
> didn't properly resize the cryptsetup layer with "cryptsetup resize", but
> that woudl be unlikly to affect multiple drives (unless the mistake was
> symmetric, e.g. you did it to both drives).

This isn't the case.

Thanks!

Florian

-- 
http://www.the-compiler.org | me@the-compiler.org (Mail/XMPP)
             GPG 0xFD55A072 | http://the-compiler.org/pubkey.asc
         I love long mails! | http://email.is-not-s.ms/

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-11-12  5:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-11 15:51 open_ctree failed after ATA errors Florian Bruhin
2014-11-11 18:07 ` Robert White
2014-11-12  5:42   ` Florian Bruhin
2014-11-11 20:08 ` Chris Murphy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.