From: Daniel Pocock <daniel@pocock.pro>
To: linux-btrfs@vger.kernel.org
Subject: disk failure but no alert
Date: Wed, 19 Aug 2015 10:53:29 +0200
Message-ID: <55D44409.40100@pocock.pro>



There are two large disks. Part of each disk is partitioned for MD RAID1
and the rest of each disk is partitioned for Btrfs RAID1.

One of the disks (/dev/sdd) appears to have failed. There were plenty of
alerts from MD (including dmesg and emails) but nothing from the Btrfs
filesystem.

Could this just be a problem on a sector within the MD RAID1 partition
(/dev/sdd2), or is Btrfs failing to alert?  If there is a failure on
another partition on the same disk, should Btrfs be notified by the
kernel in some way, and should it consider the filesystem to be at risk?

Should I do anything proactively to stop Btrfs using the /dev/sdd3
partition now?  Unfortunately it is not possible to get a new disk to
this server today, so it may just be shut down until the disk can be
replaced.

# uname -a
Linux - 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1+deb8u3 (2015-08-04) x86_64 GNU/Linux

# btrfs fi show /dev/sdd3
Label: none  uuid: -----------------------------
    Total devices 2 FS bytes used 1.74TiB
    devid    1 size 4.55TiB used 1.75TiB path /dev/sdd3
    devid    2 size 4.55TiB used 1.75TiB path /dev/sda3

Btrfs v3.17


Here is the dmesg output:

[996932.734999] sd 0:0:3:0: [sdd] 
[996932.735039] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[996932.735047] sd 0:0:3:0: [sdd] 
[996932.735053] Sense Key : Illegal Request [current]
[996932.735062] Info fld=0x80808
[996932.735069] sd 0:0:3:0: [sdd] 
[996932.735078] Add. Sense: Logical block address out of range
[996932.735085] sd 0:0:3:0: [sdd] CDB:
[996932.735089] Write(16): 8a 00 00 00 00 00 00 08 08 08 00 00 00 02 00 00
[996932.735110] end_request: critical target error, dev sdd, sector 526344
[996932.735280] md: super_written gets error=-121, uptodate=0
[996932.735290] md/raid1:md2: Disk failure on sdd2, disabling device.
md/raid1:md2: Operation continuing on 1 devices.
[996932.777853] RAID1 conf printout:
[996932.777917]  --- wd:1 rd:2
[996932.777925]  disk 0, wo:0, o:1, dev:sda2
[996932.777931]  disk 1, wo:1, o:0, dev:sdd2
[996932.794052] RAID1 conf printout:
[996932.794063]  --- wd:1 rd:2
[996932.794069]  disk 0, wo:0, o:1, dev:sda2
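
As a cross-check on the log above, the LBA in the Write(16) CDB can be
decoded by hand. This is only a minimal sketch, assuming the usual
WRITE(16) layout (opcode in byte 0, an 8-byte big-endian LBA in bytes
2-9, a 4-byte transfer length in bytes 10-13). The variable names are
illustrative and not taken from any tool.

```shell
#!/bin/sh
# Decode the LBA and transfer length from the WRITE(16) CDB in the dmesg output.
# Assumed layout: byte 0 = opcode (0x8a), byte 1 = flags,
# bytes 2-9 = LBA (big-endian), bytes 10-13 = transfer length in blocks.
cdb="8a 00 00 00 00 00 00 08 08 08 00 00 00 02 00 00"

# Split the CDB into positional parameters, one byte each.
set -- $cdb
opcode=$1
shift 2                       # drop the opcode and flags bytes
lba_hex="$1$2$3$4$5$6$7$8"    # 8 bytes of LBA
shift 8
len_hex="$1$2$3$4"            # 4 bytes of transfer length

# printf converts the hexadecimal strings to decimal.
lba=$(printf '%d' "0x$lba_hex")
len=$(printf '%d' "0x$len_hex")

echo "opcode=0x$opcode lba=$lba blocks=$len"
```

The decoded LBA agrees with both "Info fld=0x80808" and "sector 526344"
in the end_request line. The LBA in the CDB is relative to the whole
disk, so on its own it does not say which partition the write landed
in; that depends on the partition table, which is not shown here.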



