linux-raid.vger.kernel.org archive mirror
From: Jon Buckingham <jbuckingham@blueyonder.co.uk>
To: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org, Jon Buckingham <jon.buckingham@hp.com>
Subject: Re: mdadm: making a spare actie
Date: Fri, 20 Jun 2008 23:18:16 +0100
Message-ID: <485C2CA8.7050506@blueyonder.co.uk>
In-Reply-To: <18522.64213.285243.770425@notabene.brown>


Hi Neil
> 
> What would be interesting to see is the --examine output and the dmesg
> just as the recovery after the add has completed.  i.e. just before
> the reboot.
> 
> The dmesg you have included is after the reboot.  It confirms that
> sdb5 is non-fresh, presumably the event count is behind for some
> reason (as can be seen from the --examine output you sent in the first
> email).  However it doesn't contain any hint as to why.
> 
> NeilBrown
> 
> 

OK, once the resync completed, the disk was marked as faulty.
dmesg also reports a large number of I/O errors on that drive,
and its other partition (sdb2), which had been fine until now,
has become unreadable.
So your earlier suspicion that there were I/O errors was correct.

I will now try some system rebuilding!

FYI, the various outputs are appended.

Thanks for your help

Jon B


nas:~ # cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sda5[0] sdb5[4](F) sdd5[3] sdc5[2]
       733142016 blocks level 5, 64k chunk, algorithm 2 [4/3] [U_UU]

unused devices: <none>
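
A minimal Python sketch of how the degraded state shown above could be
picked out of /proc/mdstat text programmatically. The function name
parse_mdstat and the layout of the returned dictionary are illustrative
assumptions, not part of mdadm or the kernel; the sample text is the
output pasted above.

```python
import re

# Sample taken from the /proc/mdstat output above.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sda5[0] sdb5[4](F) sdd5[3] sdc5[2]
       733142016 blocks level 5, 64k chunk, algorithm 2 [4/3] [U_UU]

unused devices: <none>
"""

def parse_mdstat(text):
    """Hypothetical helper: summarise each md array found in mdstat text."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(md\w+)\s*:\s*(.*)$", line)
        if header:
            current = header.group(1)
            # Member tokens look like "sdb5[4]" or "sdb5[4](F)".
            members = [t for t in header.group(2).split() if "[" in t]
            failed = [t.split("[")[0] for t in members if t.endswith("(F)")]
            arrays[current] = {
                "failed": failed,
                "expected": None,
                "active": None,
                "missing_slots": [],
                "degraded": False,
            }
            continue
        # Status line carries "[raid devices/active devices] [U_UU]".
        counts = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if counts and current:
            info = arrays[current]
            info["expected"] = int(counts.group(1))
            info["active"] = int(counts.group(2))
            # "_" marks a slot with no working device.
            info["missing_slots"] = [
                i for i, c in enumerate(counts.group(3)) if c == "_"
            ]
            info["degraded"] = info["active"] < info["expected"]
    return arrays

if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE_MDSTAT).items():
        print(name, info)
```

On a live system the same function could be fed the contents of
/proc/mdstat instead of the sample string.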

nas:~ # mdadm -E /dev/sda5
/dev/sda5:
           Magic : a92b4efc
         Version : 00.90.03
            UUID : b54e46e1:b6a6e6ea:3ae5a5a5:04e207e4
   Creation Time : Fri Aug  4 22:42:14 2006
      Raid Level : raid5
   Used Dev Size : 244380672 (233.06 GiB 250.25 GB)
      Array Size : 733142016 (699.18 GiB 750.74 GB)
    Raid Devices : 4
   Total Devices : 4
Preferred Minor : 0

     Update Time : Fri Jun 20 13:05:54 2008
           State : clean
  Active Devices : 3
Working Devices : 3
  Failed Devices : 1
   Spare Devices : 0
        Checksum : f11d55f5 - correct
          Events : 0.3796224

          Layout : left-symmetric
      Chunk Size : 64K

       Number   Major   Minor   RaidDevice State
this     0       8        5        0      active sync   /dev/sda5

    0     0       8        5        0      active sync   /dev/sda5
    1     1       0        0        1      faulty removed
    2     2       8       37        2      active sync   /dev/sdc5
    3     3       8       53        3      active sync   /dev/sdd5

nas:~ # mdadm -E /dev/sdb5
mdadm: No md superblock detected on /dev/sdb5.


ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata4.00: BMDMA stat 0x24
ata4.00: cmd 35/00:30:9a:e7:63/00:02:1b:00:00/e0 tag 0 cdb 0x0 data 286720 out
          res 61/04:01:e3:e8:63/04:00:1b:00:00/e0 Emask 0x1 (device error)
ata4.00: failed to set xfermode (err_mask=0x1)
ata4: failed to recover some devices, retrying in 5 secs
Marking TSC unstable due to: cpufreq changes.
Time: acpi_pm clocksource has been installed.
Clocksource tsc unstable (delta = -163018120 ns)
ata4: soft resetting link
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: failed to set xfermode (err_mask=0x1)
ata4: limiting SATA link speed to 1.5 Gbps
ata4.00: limiting speed to UDMA/133:PIO3
ata4: failed to recover some devices, retrying in 5 secs
ata4: hard resetting link
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata4.00: failed to set xfermode (err_mask=0x1)
ata4.00: disabled
ata4: EH complete
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 459532186
raid5: Disk failure on sdb5, disabling device. Operation continuing on 3 devices
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 459532746
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 459533570
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 883198
Buffer I/O error on device sdb2, logical block 98351
lost page write due to I/O error on sdb2
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 1049150
Buffer I/O error on device sdb2, logical block 119095
lost page write due to I/O error on sdb2
Aborting journal on device sdb2.
journal commit I/O error
ext3_abort called.
EXT3-fs error (device sdb2): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
sd 3:0:0:0: [sdb] READ CAPACITY failed
sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
sd 3:0:0:0: [sdb] Sense not available.
sd 3:0:0:0: [sdb] Write Protect is off
sd 3:0:0:0: [sdb] Mode Sense: 00 00 00 00
sd 3:0:0:0: [sdb] Asking for cache data failed
sd 3:0:0:0: [sdb] Assuming drive cache: write through
md: md0: recovery done.
RAID5 conf printout:
  --- rd:4 wd:3
  disk 0, o:1, dev:sda5
  disk 1, o:0, dev:sdb5
  disk 2, o:1, dev:sdc5
  disk 3, o:1, dev:sdd5
RAID5 conf printout:
  --- rd:4 wd:3
  disk 0, o:1, dev:sda5
  disk 2, o:1, dev:sdc5
  disk 3, o:1, dev:sdd5
Buffer I/O error on device sdb2, logical block 98350
lost page write due to I/O error on sdb2
SFW2-INext-DROP-DEFLT IN=eth0 OUT= MAC=00:17:31:4c:c2:28:00:09:5b:25:14:ee:08:00 SRC=212.13.194.96 DST=192.168.1.11 LEN=76 TOS=0x00 PREC=0x00 TTL=53 ID=0 DF PROTO=UDP SPT=123 DPT=123 LEN=56
SFW2-INext-DROP-DEFLT IN=eth0 OUT= MAC=00:17:31:4c:c2:28:00:09:5b:25:14:ee:08:00 SRC=212.13.194.96 DST=192.168.1.11 LEN=76 TOS=0x00 PREC=0x00 TTL=53 ID=0 DF PROTO=UDP SPT=123 DPT=123 LEN=56
SFW2-INext-DROP-DEFLT IN=eth0 OUT= MAC=00:17:31:4c:c2:28:00:09:5b:25:14:ee:08:00 SRC=212.13.194.96 DST=192.168.1.11 LEN=76 TOS=0x00 PREC=0x00 TTL=53 ID=0 DF PROTO=UDP SPT=123 DPT=123 LEN=56
SFW2-INext-DROP-DEFLT IN=eth0 OUT= MAC=00:17:31:4c:c2:28:00:09:5b:25:14:ee:08:00 SRC=212.13.194.96 DST=192.168.1.11 LEN=76 TOS=0x00 PREC=0x00 TTL=53 ID=0 DF PROTO=UDP SPT=123 DPT=123 LEN=56
SFW2-INext-DROP-DEFLT IN=eth0 OUT= MAC=00:17:31:4c:c2:28:00:09:5b:25:14:ee:08:00 SRC=212.13.194.96 DST=192.168.1.11 LEN=76 TOS=0x00 PREC=0x00 TTL=53 ID=0 DF PROTO=UDP SPT=123 DPT=123 LEN=56
SFW2-INext-ACC-TCP IN=eth0 OUT= MAC=00:17:31:4c:c2:28:00:40:ca:3b:a6:05:08:00 SRC=192.168.1.12 DST=192.168.1.11 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=11827 DF PROTO=TCP SPT=27999 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT (020405B40402080A004B8D400000000001030306)
Buffer I/O error on device sdb5, logical block 488761344
Buffer I/O error on device sdb5, logical block 488761345
Buffer I/O error on device sdb5, logical block 488761346
Buffer I/O error on device sdb5, logical block 488761347
Buffer I/O error on device sdb5, logical block 488761348
Buffer I/O error on device sdb5, logical block 488761349
Buffer I/O error on device sdb5, logical block 488761350
Buffer I/O error on device sdb5, logical block 488761351
Buffer I/O error on device sdb5, logical block 488761344
Buffer I/O error on device sdb5, logical block 488761345
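
A rough Python sketch for condensing kernel messages like those above
into a per-device list of failing sectors and blocks. It assumes the two
message formats seen in this log ("end_request: I/O error, dev X, sector
N" and "Buffer I/O error on device X, logical block N"); other kernel
versions may word these differently. The name summarize_io_errors is
hypothetical.

```python
import re
from collections import defaultdict

# Patterns match the two message formats that appear in the log above.
END_REQUEST = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
BUFFER_ERR = re.compile(r"Buffer I/O error on device (\w+), logical block (\d+)")

# A few lines copied from the dmesg output above.
SAMPLE_LOG = """\
end_request: I/O error, dev sdb, sector 459532186
raid5: Disk failure on sdb5, disabling device. Operation continuing on 3 devices
end_request: I/O error, dev sdb, sector 883198
Buffer I/O error on device sdb2, logical block 98351
Buffer I/O error on device sdb5, logical block 488761344
Buffer I/O error on device sdb5, logical block 488761344
"""

def summarize_io_errors(log_text):
    """Hypothetical helper: group failing sectors/blocks by device name."""
    sectors = defaultdict(list)   # whole-disk sectors, in log order
    blocks = defaultdict(set)     # per-partition logical blocks, deduplicated
    for line in log_text.splitlines():
        m = END_REQUEST.search(line)
        if m:
            sectors[m.group(1)].append(int(m.group(2)))
            continue
        m = BUFFER_ERR.search(line)
        if m:
            blocks[m.group(1)].add(int(m.group(2)))
    return dict(sectors), {dev: sorted(b) for dev, b in blocks.items()}

if __name__ == "__main__":
    sectors, blocks = summarize_io_errors(SAMPLE_LOG)
    print("failing sectors:", sectors)
    print("failing blocks:", blocks)
```

Errors on more than one partition of the same disk, as here with sdb2
and sdb5, point at the drive or its link rather than at the md array.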
nas:~ # ll /var
ls: cannot access /var/adm: Input/output error
ls: cannot access /var/X11R6: Input/output error
total 52
d?????????  ? ?     ?         ?                ? adm
drwxr-xr-x  8 root  root   4096 2007-11-21 23:42 cache
drwxrwxr-x  3 games games  4096 2007-10-28 23:23 games
drwxr-xr-x 20 root  root   4096 2007-11-01 23:09 lib
drwxrwxr-t  5 root  uucp   4096 2008-06-20 10:03 lock
drwxr-xr-x  8 root  root   4096 2008-06-20 10:02 log
drwx------  2 root  root  16384 2007-10-28 23:09 lost+found
lrwxrwxrwx  1 root  root     10 2007-10-28 23:09 mail -> spool/mail
drwxr-xr-x  2 root  root   4096 2007-09-21 23:04 opt
drwxr-xr-x 10 root  root   4096 2008-06-20 10:03 run
drwxr-xr-x  9 root  root   4096 2007-11-01 23:09 spool
drwxrwxrwt  4 root  root   4096 2008-06-20 00:03 tmp
d?????????  ? ?     ?         ?                ? X11R6
nas:~ #
