From: Phil Lobbes <phil@perkpartners.com>
To: linux-raid@vger.kernel.org
Subject: RAID1 == two different ARRAY in scan, and Q on read error corrected
Date: Fri, 18 Apr 2008 15:35:59 -0400
Message-ID: <27567.1208547359@perkpartners.com>
Hi,
I have been lurking on this mailing list for a little while and doing
some investigation on my own. I don't mean to impose, and hopefully this
is the right forum for these questions. If anyone has
suggestions/recommendations/guidance on the following two questions, I'm
all ears!
_________________________________________________________________
* Q1: RAID1 == two different ARRAY in scan
I recently upgraded my server from Fedora Core 5 to Fedora 8, and along
the way I noticed something that I either overlooked before or that was
perhaps caused during the upgrade. On that system I have a 300G RAID1
mirror:
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[0] sdd1[1]
      293049600 blocks [2/2] [UU]

unused devices: <none>
When I use mdadm --examine --scan, my 300G RAID1 mirror returns two
separate UUIDs with different devices for each:
* (correct) the full-disk partitions, aka /dev/sd{c,d}1
* (bogus) the entire devices, aka /dev/sd{c,d}
# mdadm --examine --scan --verbose
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=12c2d7a3:0b791468:9e965247:f4354b36
devices=/dev/sdd,/dev/sdc
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=7b879b21:7cc83b9c:765dd3f3:2af46d19
devices=/dev/sdd1,/dev/sdc1
I didn't find a match in any FAQ or other postings, so I was hoping to
get some insight/pointers here.
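To double-check that these really are two distinct superblocks (and not
one superblock being picked up twice), I figure I can examine the whole
device and the partition separately and compare UUIDs -- assuming I'm
reading the --examine output correctly:

# mdadm --examine /dev/sdc | grep -i uuid
# mdadm --examine /dev/sdc1 | grep -i uuid

If the scan output above is any guide, the whole device should report
the 12c2d7a3:... UUID and the partition the 7b879b21:... one, i.e. two
separate superblocks.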
Should I:
a. Ignore this?
b. Zero out the superblocks on sd{c,d}? I'm no expert here, so I'm not
   positive this is a good option. My theory is that the superblock for
   sdc must be distinct from the superblock for sdc1, so if that is
   correct the "fix" might be something like:
      # mdadm --zero-superblock /dev/sdc /dev/sdd
   Is this correct and safe? No worries about it somehow impacting
   /dev/sdc1, /dev/sdd1, and the good mirror, right? (The fuller
   sequence I have in mind is sketched just after this list.)
c. Something else altogether?
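If (b) is reasonable, the rough sequence I have in mind is this -- just
a sketch, on the assumption that --zero-superblock only erases the
superblock it finds on the exact device named:

# mdadm --examine /dev/sdc            (should show only the bogus UUID)
# mdadm --zero-superblock /dev/sdc /dev/sdd
# mdadm --examine /dev/sdc1           (good superblock should be untouched)
# mdadm --examine --scan --verbose    (should now list a single ARRAY)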
For what it's worth, I suppose there is a chance I caused this myself
by trying to 'rename' the md# used by the array: /dev/md0 => /dev/md3.
-----------------------------------------------------------------
* Disk/Partition info:
NOTE: The valid mirror is on the partitions /dev/sd{c,d}1 (not the
whole devices /dev/sd{c,d}).
# fdisk -l /dev/sdc /dev/sdd

Disk /dev/sdc: 300.0 GB, 300090728448 bytes
255 heads, 63 sectors/track, 36483 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       36483   293049666   fd  Linux raid autodetect

Disk /dev/sdd: 300.0 GB, 300090728448 bytes
255 heads, 63 sectors/track, 36483 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       36483   293049666   fd  Linux raid autodetect
_________________________________________________________________
* Q2: On 'read error corrected' messages
On an unrelated note, during/after the upgrade I noticed that I'm now
seeing a few of these events logged:
Apr 15 11:07:14 kernel: raid1: sdc1: rescheduling sector 517365296
Apr 15 11:07:54 kernel: raid1:md0: read error corrected (8 sectors at 517365296 on sdc1)
Apr 15 11:07:54 kernel: raid1: sdc1: redirecting sector 517365296 to another mirror
Apr 15 11:08:32 kernel: raid1: sdc1: rescheduling sector 517365472
Apr 15 11:09:09 kernel: raid1:md0: read error corrected (8 sectors at 517365472 on sdc1)
Apr 15 11:09:09 kernel: raid1: sdc1: redirecting sector 517365472 to another mirror
And also more of these:
Apr 18 14:01:45 smartd[2104]: Device: /dev/sdc, 3 Currently unreadable (pending) sectors
Apr 18 14:01:45 smartd[2104]: Device: /dev/sdc, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 240 to 241
Apr 18 14:01:45 smartd[2104]: Device: /dev/sdd, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 238 to 239
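One thing I was considering, assuming my kernel exposes the md sysfs
interface: forcing md to read every sector of the array. As I understand
it, raid1 rewrites any sector that fails to read during such a check,
which should get the drive to remap its pending sectors:

# echo check > /sys/block/md0/md/sync_action
# cat /proc/mdstat                      (shows the check's progress)
# cat /sys/block/md0/md/mismatch_cnt    (mismatches found by the check)

Is that a sane thing to do on a drive in this state?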
Here's some info from smartctl:
# smartctl -a /dev/sdc
smartctl version 5.38 [i386-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Model Family: Maxtor DiamondMax 10 family (ATA/133 and SATA/150)
Device Model: Maxtor 6B300S0
Serial Number: B60370HH
Firmware Version: BANC1980
User Capacity: 300,090,728,448 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
Local Time is: Fri Apr 18 15:09:02 2008 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
...
SMART Error Log Version: 1
ATA Error Count: 36 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 36 occurred at disk power-on lifetime: 27108 hours (1129 days + 12 hours)
When the command that caused the error occurred, the device was in an unknown state.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
5e 00 00 00 00 00 a0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
00 00 00 00 00 00 a0 00 18d+12:45:51.593 NOP [Abort queued commands]
00 00 08 1f 5f d6 e0 00 18d+12:45:48.339 NOP [Abort queued commands]
00 00 00 00 00 00 e0 00 18d+12:45:48.338 NOP [Abort queued commands]
00 00 00 00 00 00 a0 00 18d+12:45:48.335 NOP [Abort queued commands]
00 03 46 00 00 00 a0 00 18d+12:45:48.332 NOP [Reserved subcommand]
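In the meantime, my plan (assuming I'm driving smartctl correctly) is
to keep an eye on the pending/reallocated sector attributes and to run
a long self-test to see whether the drive flags anything on its own:

# smartctl -A /dev/sdc | grep -i -e pending -e reallocated
# smartctl -t long /dev/sdc             (starts an extended self-test)
# smartctl -l selftest /dev/sdc         (results once it completes)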
Luckily, I'm not an expert on hard drives (nor their failures), but I'm
hoping somebody might be able to give me some insight on any of this:
should I be concerned, or should I just consider these unreadable
sectors "normal" in the life of a drive?
Sincerely,
Phil