linux-raid.vger.kernel.org archive mirror
* raid5 won't resync
@ 2004-08-31  3:08 Jon Lewis
  2004-08-31  4:08 ` Guy
  0 siblings, 1 reply; 11+ messages in thread
From: Jon Lewis @ 2004-08-31  3:08 UTC (permalink / raw)
  To: linux-raid; +Cc: aaron

We had a large mail server lose a drive today (not the first time), but
we've been having a lot of trouble with the resync this time.

mdadm told us /dev/sde1 had failed.  A coworker did a raidhotadd with a hot
spare (/dev/sdg1).  The machine was under heavy load, so we weren't surprised
that the rebuild was going kind of slowly.  About 4 hours later, the
system locked up with lots of "qlogicfc0: no handle slots, this should not
happen" error messages.

At this point, we moved the drives (a Fibre Channel-attached SCA SCSI drive
array) to a spare system with its own qlogic card.  The kernel sees the RAID5
and says that /dev/sde1 is bad.  It starts trying to resync the array, but
using a different spare drive.  After about 10% of the resync, the
resync speed slows to a few hundred K/sec and keeps getting slower.
At this point the FS on the RAID5 isn't even mounted, so there shouldn't
be any system activity competing with the RAID rebuild.
/proc/sys/dev/raid/speed_limit_max is set to 100000.
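
For reference, a rough sketch of checking and raising both md rebuild
throttles (the numbers are arbitrary examples, not tuned values; both files
take KB/sec):

# Show the current rebuild speed floor and ceiling (KB/sec).
cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max

# Example values only: raise the floor so the rebuild isn't throttled
# down when md believes there is competing I/O.
echo 50000  > /proc/sys/dev/raid/speed_limit_min
echo 200000 > /proc/sys/dev/raid/speed_limit_max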

Personalities : [raid5]
read_ahead 1024 sectors
md2 : active raid5 sdf1[10] sdm1[9] sdl1[8] sdk1[7] sdj1[6] sdn1[5] sdg1[3] sdd1[2] sdc1[1] sdb1[0]
     315266688 blocks level 5, 64k chunk, algorithm 2 [10/9] [UUUU_UUUUU]
     [==>..................]  recovery = 11.6% (4065836/35029632) finish=1400.0min speed=368K/sec

The kernel version on the original system, where the drive failed and the
lockup happened during resync, was 2.4.20-28.rh8.0.atsmp from
http://atrpms.net.  ATrpms simply rebuilds the Red Hat kernel with
the XFS patches applied.

That system will also crash with the following ATrpms kernels:
2.4.20-35
2.4.20-19
2.4.18-14

The kernel version on the spare system doing the slow resync is 2.4.22 from
kernel.org with XFS patches from http://oss.sgi.com/projects/xfs/.  The
big RAID5 holds an XFS filesystem.

Each system has 2 qlogic cards (all of which are the same).  The cards in
the system where it's resyncing now are:

QLogic ISP2100 SCSI on PCI bus 01 device 10 irq 27 base 0xe800
QLogic ISP2100 SCSI on PCI bus 01 device 18 irq 23 base 0xe400

The drives are all:
  Vendor: IBM      Model: DRHL36L  CLAR36  Rev: 3347
  Type:   Direct-Access                    ANSI SCSI revision: 02

Both systems are dual PIII 1.4GHz machines with 4GB RAM.

Anyone have any idea what bug(s) we're running into or have suggestions
for getting this RAID5 back in sync and in service?

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________


* RE: raid5 won't resync
  2004-08-31  3:08 raid5 won't resync Jon Lewis
@ 2004-08-31  4:08 ` Guy
  2004-08-31  8:08   ` Jon Lewis
  0 siblings, 1 reply; 11+ messages in thread
From: Guy @ 2004-08-31  4:08 UTC (permalink / raw)
  To: 'Jon Lewis', linux-raid; +Cc: aaron

I have read about someone else who had a similar problem.
The slowdown was caused by a bad hard disk.

Do a dd read test of each disk in the array.

Example:
time dd if=/dev/sdj of=/dev/null bs=64k

Open separate windows and test all of the disks at the same time, one per
window.  If you test them all from the same window using "&", the output
will get mixed (or redirect each disk's output to its own file, as in the
sketch below).

The time command is there so you can compare the performance of each disk;
it is optional.
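
If juggling windows is a pain, here is a rough sketch of the same test run
from one shell, with each disk's output kept in its own file (adjust the
device list to the actual array members):

# One background dd per disk, each logged separately so nothing gets mixed.
for d in /dev/sdb /dev/sdc /dev/sdd; do       # ...list all member disks here
    ( time dd if=$d of=/dev/null bs=64k ) > /tmp/ddtest.${d##*/} 2>&1 &
done
wait                       # let every dd finish
grep -H . /tmp/ddtest.*    # then compare the timings and any errors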

Someone else has said that performance can be bad if the disk controller is
sharing an interrupt with another device.  It is OK for two cards of the
same model to share one interrupt.

Use this to determine which interrupts are being used:
cat /proc/interrupts

Moving the card may change the interrupt.
You may also be able to change interrupt assignments in the BIOS.

That said, I don't think an interrupt problem would cause a slowdown that
gets worse over time.  I bet you have a problem with a disk drive.

I hope this helps!

Guy


* RE: raid5 won't resync
  2004-08-31  4:08 ` Guy
@ 2004-08-31  8:08   ` Jon Lewis
  2004-08-31  9:22     ` BUG: mdadm --fail makes the kernel lose count (was Re: raid5 won't resync) David Greaves
  2004-08-31 14:50     ` raid5 won't resync Guy
  0 siblings, 2 replies; 11+ messages in thread
From: Jon Lewis @ 2004-08-31  8:08 UTC (permalink / raw)
  To: Guy; +Cc: linux-raid, aaron

On Tue, 31 Aug 2004, Guy wrote:

> I have read where someone else had a similar problem.
> The slowdown was caused by a bad hard disk.
>
> Do a dd read test of each disk in the array.
>
> Example:
> time dd if=/dev/sdj of=/dev/null bs=64k

All of these finished at about the same time with no read errors reported.

> Someone else has said:
> Performance can be bad if the disk controller is sharing an interrupt with
> another device.
> It is ok for 2 of the same model cards to share 1 interrupt.

Since it's an SMP system, the IO-APIC gives us lots of IRQs and there is no
sharing.

           CPU0       CPU1
  0:     739040    1188881    IO-APIC-edge  timer
  1:        173        178    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
 14:     355893     353513    IO-APIC-edge  ide0
 15:    1963919    1944260    IO-APIC-edge  ide1
 20:       7171       7690   IO-APIC-level  eth0
 21:          2          3   IO-APIC-level  eth1
 23:    1540742    1537849   IO-APIC-level  qlogicfc
 27:    1540624    1539874   IO-APIC-level  qlogicfc

Since the recovery had stopped making progress, I decided to fail the
drive it had brought in as the spare with mdadm /dev/md2 -f /dev/sdf1.
That worked as expected.  mdadm /dev/md2 -r /dev/sdf1 seems to have hung.
The mdadm process is in state D and I can't kill it.  Trying to add a new
spare fails because mdadm can't get a lock on /dev/md2 while the previous
invocation is stuck.

I suspect at this point, we're going to have to just reboot again.

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________


* BUG: mdadm --fail makes the kernel lose count (was Re: raid5 won't resync)
  2004-08-31  8:08   ` Jon Lewis
@ 2004-08-31  9:22     ` David Greaves
  2004-09-01  0:36       ` Neil Brown
  2004-08-31 14:50     ` raid5 won't resync Guy
  1 sibling, 1 reply; 11+ messages in thread
From: David Greaves @ 2004-08-31  9:22 UTC (permalink / raw)
  To: Jon Lewis, neilb; +Cc: Guy, linux-raid, aaron

Neil,
I've copied you as I think there's a bug in resync behaviour (kernel.org 2.6.6).

Summary: no data loss, but a resync in progress doesn't stop when mdadm
fails the resyncing device, and the kernel loses count of working disks.
When the resync completed:
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [raid6]
md0 : active raid5 sdd1[3] sdc1[1] sdb1[2] sda1[0] hdb1[4]
      980446208 blocks level 5, 128k chunk, algorithm 2 [5/4] [UUUUU]

That should be [5/5], shouldn't it?

Apologies if this is known and fixed in a later kernel.

Jon Lewis wrote:

>Since the recovery had stopped making progress, I decided to fail the
>drive it had brought in as the spare with mdadm /dev/md2 -f /dev/sdf1.
>That worked as expected.  mdadm /dev/md2 -r /dev/sdf1 seems to have hung.
>It's in state D and I can't terminate it.  Trying to add a new spare,
>mdadm can't get a lock on /dev/md2 because the previous one is stuck.
>
>I suspect at this point, we're going to have to just reboot again.
>  
>
Jon,
Since I had a similar problem (manually 'failing' a device during a resync
- I have a 5-device RAID5 with no spares), I thought I'd ask whether you
noticed anything like this at all?


David
PS full story, messages etc below

Whilst having my own problems the other day I saw the following odd
behaviour:

Disk sdd1 failed (I think a single spurious bad-block read).
/proc/mdstat and --detail showed it marked faulty.
I mdadm-removed it from the array.
I checked it and found no errors.
I mdadm-added it and a resync started.
I then realised I'd made a mistake and had checked the partition, not the disk.
To see what was happening, I did an mdadm --detail /dev/md0:
--
/dev/md0:
        Version : 00.90.01
  Creation Time : Sat Jun  5 18:13:04 2004
     Raid Level : raid5
     Array Size : 980446208 (935.03 GiB 1003.98 GB)
    Device Size : 245111552 (233.76 GiB 250.99 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Aug 29 21:08:35 2004
          State : clean, degraded, recovering
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 128K

 Rebuild Status : 0% complete

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       33        1      active sync   /dev/sdc1
       2       8       17        2      active sync   /dev/sdb1
       3       0        0       -1      removed
       4       3       65        4      active sync   /dev/hdb1
       5       8       49        3      spare   /dev/sdd1
           UUID : 19779db7:1b41c34b:f70aa853:062c9fe5
         Events : 0.1979229
--

I mdadm-failed the device _whilst it was syncing_.
The kernel reported "Operation continuing on 3 devices" (not 4).
[I thought at this point that I'd lost the lot!
The kernel not counting properly is not confidence-inspiring.]
At this point I had:
--
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [raid6]
md0 : active raid5 sdd1[5](F) sdc1[1] sdb1[2] sda1[0] hdb1[4]
      980446208 blocks level 5, 128k chunk, algorithm 2 [5/3] [UUU_U]
      [>....................]  recovery =  0.3% (920724/245111552) finish=349.5min s
--
Not nice looking at all!!!
Another  mdadm --detail /dev/md0
--
/dev/md0:
        Version : 00.90.01
  Creation Time : Sat Jun  5 18:13:04 2004
     Raid Level : raid5
     Array Size : 980446208 (935.03 GiB 1003.98 GB)
    Device Size : 245111552 (233.76 GiB 250.99 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Aug 29 21:09:06 2004
          State : clean, degraded, recovering
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 128K

 Rebuild Status : 0% complete

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       33        1      active sync   /dev/sdc1
       2       8       17        2      active sync   /dev/sdb1
       3       0        0       -1      removed
       4       3       65        4      active sync   /dev/hdb1
       5       8       49        3      faulty   /dev/sdd1
           UUID : 19779db7:1b41c34b:f70aa853:062c9fe5
         Events : 0.1979246
--
Now mdadm reports the drive faulty but:
mdadm /dev/md0 --remove /dev/sdd1
mdadm: hot remove failed for /dev/sdd1: Device or resource busy

OK, fail the drive again and try to remove it.
Nope.
Oh-oh.

I figured leaving it was the safest thing at this point.
Later that night it finished.

Aug 30 01:37:55 cu kernel: md: md0: sync done.
Aug 30 01:37:55 cu kernel: RAID5 conf printout:
Aug 30 01:37:55 cu kernel:  --- rd:5 wd:3 fd:1
Aug 30 01:37:55 cu kernel:  disk 0, o:1, dev:sda1
Aug 30 01:37:55 cu kernel:  disk 1, o:1, dev:sdc1
Aug 30 01:37:55 cu kernel:  disk 2, o:1, dev:sdb1
Aug 30 01:37:55 cu kernel:  disk 3, o:0, dev:sdd1
Aug 30 01:37:55 cu kernel:  disk 4, o:1, dev:hdb1
Aug 30 01:37:55 cu kernel: RAID5 conf printout:
Aug 30 01:37:55 cu kernel:  --- rd:5 wd:3 fd:1
Aug 30 01:37:55 cu kernel:  disk 0, o:1, dev:sda1
Aug 30 01:37:55 cu kernel:  disk 1, o:1, dev:sdc1
Aug 30 01:37:55 cu kernel:  disk 2, o:1, dev:sdb1
Aug 30 01:37:55 cu kernel:  disk 3, o:0, dev:sdd1
Aug 30 01:37:55 cu kernel:  disk 4, o:1, dev:hdb1
Aug 30 01:37:55 cu kernel: RAID5 conf printout:
Aug 30 01:37:55 cu kernel:  --- rd:5 wd:3 fd:1
Aug 30 01:37:55 cu kernel:  disk 0, o:1, dev:sda1
Aug 30 01:37:55 cu kernel:  disk 1, o:1, dev:sdc1
Aug 30 01:37:55 cu kernel:  disk 2, o:1, dev:sdb1
Aug 30 01:37:55 cu kernel:  disk 4, o:1, dev:hdb1

Next morning:
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [raid6]
md0 : active raid5 sdd1[5](F) sdc1[1] sdb1[2] sda1[0] hdb1[4]
      980446208 blocks level 5, 128k chunk, algorithm 2 [5/3] [UUU_U]

unused devices: <none>
# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Sat Jun  5 18:13:04 2004
     Raid Level : raid5
     Array Size : 980446208 (935.03 GiB 1003.98 GB)
    Device Size : 245111552 (233.76 GiB 250.99 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Aug 30 08:45:35 2004
          State : clean, degraded
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 128K

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       33        1      active sync   /dev/sdc1
       2       8       17        2      active sync   /dev/sdb1
       3       0        0       -1      removed
       4       3       65        4      active sync   /dev/hdb1
       5       8       49       -1      faulty   /dev/sdd1
           UUID : 19779db7:1b41c34b:f70aa853:062c9fe5
         Events : 0.1986057

I don't know why it was still (F). As if the last fail and remove were 
'queued'?


Finally I did mdadm /dev/md0 --remove /dev/sdd1

mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Sat Jun  5 18:13:04 2004
     Raid Level : raid5
     Array Size : 980446208 (935.03 GiB 1003.98 GB)
    Device Size : 245111552 (233.76 GiB 250.99 GB)
   Raid Devices : 5
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Aug 30 08:54:28 2004
          State : clean, degraded
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 128K

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       33        1      active sync   /dev/sdc1
       2       8       17        2      active sync   /dev/sdb1
       3       0        0       -1      removed
       4       3       65        4      active sync   /dev/hdb1
           UUID : 19779db7:1b41c34b:f70aa853:062c9fe5
         Events : 0.1986058
cu:/var/cache/apt-cacher# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [raid6]
md0 : active raid5 sdc1[1] sdb1[2] sda1[0] hdb1[4]
      980446208 blocks level 5, 128k chunk, algorithm 2 [5/3] [UUU_U]

unused devices: <none>


mdadm /dev/md0 --add /dev/sdd1

cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [raid6]
md0 : active raid5 sdd1[5] sdc1[1] sdb1[2] sda1[0] hdb1[4]
      980446208 blocks level 5, 128k chunk, algorithm 2 [5/3] [UUU_U]
      [>....................]  recovery =  0.0% (161328/245111552) finish=252.9min speed=16132K/sec
unused devices: <none>


Eventually:
Aug 30 17:24:07 cu kernel: md: md0: sync done.
Aug 30 17:24:07 cu kernel: RAID5 conf printout:
Aug 30 17:24:07 cu kernel:  --- rd:5 wd:4 fd:0
Aug 30 17:24:07 cu kernel:  disk 0, o:1, dev:sda1
Aug 30 17:24:07 cu kernel:  disk 1, o:1, dev:sdc1
Aug 30 17:24:07 cu kernel:  disk 2, o:1, dev:sdb1
Aug 30 17:24:07 cu kernel:  disk 3, o:1, dev:sdd1
Aug 30 17:24:07 cu kernel:  disk 4, o:1, dev:hdb1

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [raid6]
md0 : active raid5 sdd1[3] sdc1[1] sdb1[2] sda1[0] hdb1[4]
      980446208 blocks level 5, 128k chunk, algorithm 2 [5/4] [UUUUU]

unused devices: <none>
# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Sat Jun  5 18:13:04 2004
     Raid Level : raid5
     Array Size : 980446208 (935.03 GiB 1003.98 GB)
    Device Size : 245111552 (233.76 GiB 250.99 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Aug 30 17:24:07 2004
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 128K

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       33        1      active sync   /dev/sdc1
       2       8       17        2      active sync   /dev/sdb1
       3       8       49        3      active sync   /dev/sdd1
       4       3       65        4      active sync   /dev/hdb1
           UUID : 19779db7:1b41c34b:f70aa853:062c9fe5
         Events : 0.2014548

So it's back to normal and happy - but I guess the md0 device needs a
restart now to get the count right, which is bad.

David




* RE: raid5 won't resync
  2004-08-31  8:08   ` Jon Lewis
  2004-08-31  9:22     ` BUG: mdadm --fail makes the kernel lose count (was Re: raid5 won't resync) David Greaves
@ 2004-08-31 14:50     ` Guy
  2004-08-31 20:09       ` Jon Lewis
  1 sibling, 1 reply; 11+ messages in thread
From: Guy @ 2004-08-31 14:50 UTC (permalink / raw)
  To: 'Jon Lewis'; +Cc: linux-raid, aaron

At this point you need professional help!  :)

I don't know what to tell you.

Good luck,
Guy


* RE: raid5 won't resync
  2004-08-31 14:50     ` raid5 won't resync Guy
@ 2004-08-31 20:09       ` Jon Lewis
  2004-08-31 20:40         ` Guy
  0 siblings, 1 reply; 11+ messages in thread
From: Jon Lewis @ 2004-08-31 20:09 UTC (permalink / raw)
  To: linux-raid; +Cc: aaron

Now we've got a new problem with the raid array from last night.  We've
switched qlogic drivers to one that some people have reported is more stable
than the one we were using.  This unfortunately changed all the scsi
device names, i.e.

abcdefg hijklmn has become
hijklmn abcdefg

I put the following in /etc/mdadm.conf:

DEVICE /dev/sd[abcdefghijklmn][1]
ARRAY /dev/md2 level=raid5 num-devices=10 UUID=532d4b61:48f5278b:4fd2e730:6dd4a608

That DEVICE line should cover all the members (under their new device
names) for the raid5 array.
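
As a sanity check, here is a rough sketch (the /dev/sd[b-n]1 glob is just an
example; adjust it to the real partitions) for confirming which of the
renamed devices actually carry that UUID in their superblocks:

# Print the array UUID recorded in each candidate member's superblock.
for d in /dev/sd[b-n]1; do
    echo -n "$d: "
    mdadm -E $d | grep UUID
done

# mdadm can also propose ARRAY lines from whatever superblocks it finds:
mdadm --examine --scan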

then I ran:

mdadm --assemble /dev/md2 --uuid 532d4b61:48f5278b:4fd2e730:6dd4a608
or
mdadm --assemble /dev/md2 --scan

Both terminate with the same result:

mdadm: /dev/md2 assembled from 4 drives and 1 spare - not enough to start
the array.

but if I look at /proc/mdstat, it did find all 10 (actually 11) devices.

# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md2 : inactive sdc1[6] sdm1[10] sdf1[9] sde1[8] sdd1[7] sdg1[5] sdl1[4] sdn1[3] sdk1[2] sdj1[1] sdi1[0]
      0 blocks
md1 : active raid1 hda1[0] hdc1[1]
      30716160 blocks [2/2] [UU]
      [>....................]  resync =  3.5% (1098392/30716160) finish=298.2min speed=1654K/sec
md0 : active raid1 sdh2[0] sda2[1]
      104320 blocks [2/2] [UU]

unused devices: <none>

I suspect it's found both the failed drive (originally sde1, now named
sdl1) and the spare that it had started, but never finished, rebuilding
onto (sdg1, now sdn1).  Why is mdadm saying there are only 4 devices + 1
spare?  What's the best way to proceed at this point to try to get this
array repaired?

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________


* RE: raid5 won't resync
  2004-08-31 20:09       ` Jon Lewis
@ 2004-08-31 20:40         ` Guy
  2004-08-31 21:27           ` Jon Lewis
  0 siblings, 1 reply; 11+ messages in thread
From: Guy @ 2004-08-31 20:40 UTC (permalink / raw)
  To: 'Jon Lewis', linux-raid; +Cc: aaron

I think what you did should work, but...
I have had similar problems.
Try again, but this time don't include any spare disks, or any other disks.
Only include the disks you know have the data.
Or, just list the disks on the command line.

Keep your fingers crossed!

Guy


* RE: raid5 won't resync
  2004-08-31 20:40         ` Guy
@ 2004-08-31 21:27           ` Jon Lewis
  2004-08-31 22:37             ` Guy
  0 siblings, 1 reply; 11+ messages in thread
From: Jon Lewis @ 2004-08-31 21:27 UTC (permalink / raw)
  To: Guy; +Cc: linux-raid, aaron

On Tue, 31 Aug 2004, Guy wrote:

> I think what you did should work, but...
> I have had similar problems.
> Try again, but this time don't include any spare disks, or any other disks.
> Only include the disks you know have the data.
> Or, just list the disks on the command line.

# mdadm --assemble /dev/md2 /dev/sdc1 /dev/sdm1 /dev/sdf1 /dev/sde1
/dev/sdd1 /dev/sdg1 /dev/sdk1 /dev/sdj1 /dev/sdi1
mdadm: /dev/md2 assembled from 4 drives and 1 spare - not enough to start
the array.

I've left sdl1 and sdn1 out of the above as they're the failed drive and
the partially rebuilt spare.

I see a pattern that could explain why mdadm thinks there are only 4
drives.  From mdadm -E on each drive:

sdc1:    Update Time : Tue Aug 31 03:47:27 2004
sdd1:    Update Time : Tue Aug 31 03:47:27 2004
sde1:    Update Time : Tue Aug 31 03:47:27 2004
sdf1:    Update Time : Tue Aug 31 03:47:27 2004
sdg1:    Update Time : Mon Aug 30 22:42:36 2004
sdi1:    Update Time : Mon Aug 30 22:42:36 2004
sdj1:    Update Time : Mon Aug 30 22:42:36 2004
sdk1:    Update Time : Mon Aug 30 22:42:36 2004
sdl1:    Update Time : Tue Jul 13 02:08:37 2004
sdm1:    Update Time : Mon Aug 30 22:42:36 2004
sdn1:    Update Time : Mon Aug 30 22:42:36 2004

Is mdadm --assemble seeing that 4 drives have a more recent Update Time
than the rest and ignoring the rest?
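
For what it's worth, the Events counters in the superblocks tell the same
story; a quick sketch for pulling both fields from every member (the device
glob is just an example):

# mdadm judges freshness by the superblock event counts when assembling;
# compare Events and Update Time across all members.
for d in /dev/sd[c-n]1; do
    echo "== $d"
    mdadm -E $d | grep -E 'Update Time|Events'
done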

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________


* RE: raid5 won't resync
  2004-08-31 21:27           ` Jon Lewis
@ 2004-08-31 22:37             ` Guy
  2004-09-01  0:25               ` Jon Lewis
  0 siblings, 1 reply; 11+ messages in thread
From: Guy @ 2004-08-31 22:37 UTC (permalink / raw)
  To: 'Jon Lewis'; +Cc: linux-raid, aaron

You have 2 failed drives?
RAID5 can only tolerate 1 failed drive.

Have you tested the drives to determine if they are good?
Example:
dd if=/dev/sdf of=/dev/null bs=64k

If you can find enough good drives, use the force option on assemble (see
the sketch below).  But don't include any disks that don't have 100% of the
data; a spare that only did a partial rebuild is no good to use at this point.

So, if your array had 10 disks, you need to find 9 of them that are still
working.
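
A rough sketch of what that could look like here, assuming the nine
data-bearing members are the ones listed in the previous mail (sdc1 sdd1
sde1 sdf1 sdg1 sdi1 sdj1 sdk1 sdm1) - double-check the list before running:

mdadm --stop /dev/md2        # stop the half-assembled, inactive array first
mdadm --assemble --force /dev/md2 \
    /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 \
    /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdm1
# --force lets mdadm bring in members whose event counts are slightly stale,
# so the array can start degraded on 9 of 10 disks.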

Guy


* RE: raid5 won't resync
  2004-08-31 22:37             ` Guy
@ 2004-09-01  0:25               ` Jon Lewis
  0 siblings, 0 replies; 11+ messages in thread
From: Jon Lewis @ 2004-09-01  0:25 UTC (permalink / raw)
  To: Guy; +Cc: linux-raid, aaron

I don't believe we have 2 failed drives, and AFAICT from doing the dd read
tests last night, none are actually bad.  md decided for whatever reason
(a qlogic driver bug, I'm guessing) that 1 drive had failed.  We put in a
spare drive to let it rebuild, but that rebuild never completed.  Unless
something else happened that I'm not aware of (quite possible since I'm
125 miles away), we should still have a 10-drive raid5 with one failed
drive...so we ought to be able to get it up and running from the 9 good
drives, reconstructing the missing bits from parity.

On Tue, 31 Aug 2004, Guy wrote:

> You have 2 failed drives?
> RAID5 only supports 1 failed drive.
>
> Have you tested the drives to determine if they are good?
> Example:
> dd if=/dev/sdf of=/dev/null bs=64k
>
> If you can find enough good drives, use the force option on assemble.
> But don't include any disks that don't have 100% of the data.
> A spare that did a partial re-build is not good to use at this point.
>
> So, if your array had 10 disks, you need to find 9 of them that are still
> working.
>
> Guy

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________


* Re: BUG: mdadm --fail makes the kernel lose count (was Re: raid5 won't resync)
  2004-08-31  9:22     ` BUG: mdadm --fail makes the kernel lose count (was Re: raid5 won't resync) David Greaves
@ 2004-09-01  0:36       ` Neil Brown
  0 siblings, 0 replies; 11+ messages in thread
From: Neil Brown @ 2004-09-01  0:36 UTC (permalink / raw)
  To: David Greaves; +Cc: Jon Lewis, Guy, linux-raid, aaron

On Tuesday August 31, david@dgreaves.com wrote:
> Neil
> copied you as I think there's a bug in resync behaviour (kernel.org 2.6.6)

Yes, thanks.
It's just a small bug.  It only affects the content of /proc/mdstat
and the "continuing on ... drives" message.
Here is the patch, which will be submitted shortly.

Thanks again,
NeilBrown

=============================================
Correct "working_disk" counts for raid5 and raid6

This error only affects two messages (and sysadmin heart rate).
It does not risk data.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>

### Diffstat output
 ./drivers/md/raid5.c     |    2 +-
 ./drivers/md/raid6main.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~	2004-08-30 14:43:27.000000000 +1000
+++ ./drivers/md/raid5.c	2004-09-01 10:17:19.000000000 +1000
@@ -477,8 +477,8 @@ static void error(mddev_t *mddev, mdk_rd
 
 	if (!rdev->faulty) {
 		mddev->sb_dirty = 1;
-		conf->working_disks--;
 		if (rdev->in_sync) {
+			conf->working_disks--;
 			mddev->degraded++;
 			conf->failed_disks++;
 			rdev->in_sync = 0;

diff ./drivers/md/raid6main.c~current~ ./drivers/md/raid6main.c
--- ./drivers/md/raid6main.c~current~	2004-08-30 14:43:27.000000000 +1000
+++ ./drivers/md/raid6main.c	2004-09-01 10:17:46.000000000 +1000
@@ -498,8 +498,8 @@ static void error(mddev_t *mddev, mdk_rd
 
 	if (!rdev->faulty) {
 		mddev->sb_dirty = 1;
-		conf->working_disks--;
 		if (rdev->in_sync) {
+			conf->working_disks--;
 			mddev->degraded++;
 			conf->failed_disks++;
 			rdev->in_sync = 0;


Thread overview: 11+ messages
2004-08-31  3:08 raid5 won't resync Jon Lewis
2004-08-31  4:08 ` Guy
2004-08-31  8:08   ` Jon Lewis
2004-08-31  9:22     ` BUG: mdadm --fail makes the kernel lose count (was Re: raid5 won't resync) David Greaves
2004-09-01  0:36       ` Neil Brown
2004-08-31 14:50     ` raid5 won't resync Guy
2004-08-31 20:09       ` Jon Lewis
2004-08-31 20:40         ` Guy
2004-08-31 21:27           ` Jon Lewis
2004-08-31 22:37             ` Guy
2004-09-01  0:25               ` Jon Lewis
