linux-scsi.vger.kernel.org archive mirror
* PERC2/Si won't failover
From: Kit Gerrits @ 2005-01-31 13:49 UTC
  To: linux-scsi

Hey all!

I have a PowerEdge 2400 with a PERC2/Si and 4x9GB drives, running Red Hat EL 3.0:
Container 0: plain 9GB drive (O/S)
Container 1: 3x9GB in RAID5 (data)

After getting I/O errors (and a strange noise from drive 0:3:0), I did the
unthinkable: I pulled the drive from the chassis without shutting it down.
(oops)
I have since verified the drive, cleaned off its partition and rescanned the
bus.
...but the controller won't fail over to the drive.

I have assigned it as the failover drive, but the PERC won't fail over to it,
even after a (warm) reboot.

Did I forget anything?

Thanks in advance,

Kit Gerrits
kit@gerritsaa.nl


---------------
Debugging info:
---------------

AFA0> disk list
Executing: disk list

B:ID:L  Device Type     Blocks    Bytes/Block Usage            Shared Rate
------  --------------  --------- ----------- ---------------- ------ ----
0:00:0   Disk            17783240  512         Initialized      NO     80
0:01:0   Disk            17783240  512         Initialized      NO     80
0:02:0   Disk            17783240  512         Initialized      NO     80
0:03:0   Disk            17783240  512         Initialized      NO     80

AFA0> container show failover
Executing: container show failover

Container Scsi B:ID:L
--------- ----------------------------------
  0       --- No Devices Assigned ---
  1       0:03:0

AFA0> container list
Executing: container list
Num          Total  Oth Chunk          Scsi   Partition
Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size
----- ------ ------ --- ------ ------- ------ -------------
 0    Volume 8.47GB            Open    0:00:0 64.0KB:8.47GB
 /dev/sda             NT

 1    RAID-5 16.9GB       32KB Open    0:01:0 64.0KB:8.47GB
 /dev/sdb             DATA             0:02:0 64.0KB:8.47GB
                                       ?:??:?  - Missing -

AFA0> controller show au
Executing: controller show automatic_failover
Automatic failover ENABLED

AFA0> container scrub 1
Executing: container scrub 1
Command Error: <The controller was unable to perform a scrub (consistency check)
 operation on the container because one or more of the container's partitions
 failed.>
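
The telling symptom in the `container list` output above is the `- Missing -` row under container 1. As a rough illustration of spotting that automatically, here is a minimal Python sketch that scans a pasted listing for it. The function name `degraded_containers` is made up for this example, the row pattern is inferred only from the output shown in this post, and the sketch does not talk to afacli itself.

```python
import re

# Text copied from the `container list` output in this post.
CONTAINER_LIST = """\
Num          Total  Oth Chunk          Scsi   Partition
Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size
----- ------ ------ --- ------ ------- ------ -------------
 0    Volume 8.47GB            Open    0:00:0 64.0KB:8.47GB
 /dev/sda             NT

 1    RAID-5 16.9GB       32KB Open    0:01:0 64.0KB:8.47GB
 /dev/sdb             DATA             0:02:0 64.0KB:8.47GB
                                       ?:??:?  - Missing -
"""

# Assumption: a container's first row is "<number> <type> <size>".
CONTAINER_ROW = re.compile(r"\s*(\d+)\s+\S+\s+[\d.]+[KMGT]B")


def degraded_containers(text):
    """Return {container_number: count of '- Missing -' member rows}."""
    missing = {}
    current = None
    for line in text.splitlines():
        row = CONTAINER_ROW.match(line)
        if row:
            current = int(row.group(1))
        elif "- Missing -" in line and current is not None:
            missing[current] = missing.get(current, 0) + 1
    return missing


print(degraded_containers(CONTAINER_LIST))
```

A container that shows up in the result is running without one of its members, which is consistent with the scrub refusing to start.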




* RE: PERC2/Si won't failover
From: Kit Gerrits @ 2005-01-31 16:32 UTC
  To: linux-scsi

I found this in a diagnostic dump; maybe someone can make sense of it?

[76]: CT_FindMissingEntry: dev = 3
[77]: CT_SwitchFillPrimary: recover missing entry, container Inde
[78]: x 1, dev 3, signature 0xf7b74944
[79]: CT_CONFIG_WARNING: at Line 8983 - CFG_MISSING_PARTITION: ar
[80]: g1 = 0x2 arg2 = 0x0
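
The dump rows above look like one message stream chopped into fixed-width rows ("container Inde" / "x 1", "ar" / "g1"). A minimal Python sketch for stitching them back together follows. It assumes, purely from this excerpt, that a row whose payload is as long as the longest payload continues on the next row; the helper name `rejoin` is hypothetical and the real buffer layout may differ.

```python
import re

# Rows copied from the diagnostic dump in this message.
DUMP = """\
[76]: CT_FindMissingEntry: dev = 3
[77]: CT_SwitchFillPrimary: recover missing entry, container Inde
[78]: x 1, dev 3, signature 0xf7b74944
[79]: CT_CONFIG_WARNING: at Line 8983 - CFG_MISSING_PARTITION: ar
[80]: g1 = 0x2 arg2 = 0x0
"""


def rejoin(dump):
    """Merge rows that were split at the (assumed) fixed row width."""
    # Drop the "[NN]: " row prefix to get the raw payload of each row.
    payloads = [re.sub(r"^\[\d+\]: ", "", row) for row in dump.splitlines()]
    # Assumption: a payload that fills the widest row seen is a continued row.
    width = max(len(p) for p in payloads)
    messages, pending = [], ""
    for payload in payloads:
        pending += payload
        if len(payload) < width:
            messages.append(pending)
            pending = ""
    if pending:
        messages.append(pending)
    return messages


for message in rejoin(DUMP):
    print(message)
```

Read that way, the controller is reporting a missing partition for container index 1, device 3.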

Kit




* Re: PERC2/Si won't failover
From: Andrew Kinney @ 2005-01-31 18:42 UTC
  To: Kit Gerrits; +Cc: linux-scsi

We had the same thing happen on a PE2500 with PERC3/DI, different 
drive configuration.

Text from my tech "diary" of sorts that I keep regarding unique 
problems I run into:

"The system would not boot after all the replacements were there, 
claiming 'no boot device.' For one reason or another, the RAID 
controller detected the new drive, but rather than just adding the drive 
to the container and beginning the rebuild, it simply offlined the 
container, which resulted in the 'no boot device' message. I suspect 
this was because we did not properly inform the controller that we 
were taking a drive offline before we did so. We did it while the 
machine was turned off. We probably would have had better luck if we 
had booted the system, gone into afacli, prepared the enclosure slot 
for removal, removed the drive, inserted the new drive, and 
issued the proper commands to begin the rebuild if it didn't start 
automatically. The OS afacli is much more robust than the RAID BIOS 
utilities.

The solution to get the drive into the RAID container and get it to 
begin rebuilding was to go into the RAID BIOS and assign the new 
drive as the failover drive. As soon as I did that, the container 
started rebuilding. I exited the utility (which automatically saves 
any settings) and rebooted. The system came back up just fine, like 
normal, and is now happily rebuilding the RAID array. No data was 
lost, and we now have as close to a completely new system as you can 
get short of replacing the entire thing."

It was just one part of a major system overhaul due to a "ghost" in 
the SCSI system that kept offlining our container under near-random, 
non-reproducible conditions, but I suspect a similar procedure may help 
in your case.
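
One way to tell whether the controller has actually consumed the failover drive, as opposed to merely having it assigned, is to compare the `container show failover` and `container list` output from the original post. The Python sketch below is only illustrative: the function names are invented here, the parsing is based solely on the output format shown in this thread, and it works on pasted text rather than calling afacli.

```python
import re

# Output copied from the original post.
FAILOVER = """\
Container Scsi B:ID:L
--------- ----------------------------------
  0       --- No Devices Assigned ---
  1       0:03:0
"""

CONTAINER_LIST = """\
 0    Volume 8.47GB            Open    0:00:0 64.0KB:8.47GB
 /dev/sda             NT

 1    RAID-5 16.9GB       32KB Open    0:01:0 64.0KB:8.47GB
 /dev/sdb             DATA             0:02:0 64.0KB:8.47GB
                                       ?:??:?  - Missing -
"""


def failover_assignments(text):
    """Map container number -> failover devices assigned to it."""
    assigned = {}
    for line in text.splitlines():
        row = re.match(r"\s*(\d+)\s+(.*)", line)
        if row:
            assigned[int(row.group(1))] = re.findall(r"\d+:\d+:\d+", row.group(2))
    return assigned


def container_members(text):
    """Map container number -> member devices (B:ID:L before Offset:Size)."""
    members, current = {}, None
    for line in text.splitlines():
        row = re.match(r"\s*(\d+)\s+\S+\s+[\d.]+[KMGT]B", line)
        if row:
            current = int(row.group(1))
            members[current] = []
        if current is not None:
            members[current] += re.findall(r"(\d+:\d+:\d+)\s+[\d.]+[KMGT]B:", line)
    return members


def unused_failover(failover_text, list_text):
    """Failover devices that are assigned but not yet container members."""
    members = container_members(list_text)
    return {
        container: [d for d in devices if d not in members.get(container, [])]
        for container, devices in failover_assignments(failover_text).items()
        if devices
    }


print(unused_failover(FAILOVER, CONTAINER_LIST))
```

If the assigned drive still shows up here after the failover assignment is redone, the rebuild has not picked it up.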

Andrew



Sincerely,
Andrew Kinney
President and
Chief Technology Officer
Advantagecom Networks, Inc.
http://www.advantagecom.net



