From: "Andrew Kinney" <andykinney@advantagecom.net>
To: Kit Gerrits <kit@gerritsacc.nl>
Cc: linux-scsi@vger.kernel.org
Subject: Re: PERC2/Si won't failover
Date: Mon, 31 Jan 2005 10:42:18 -0800
Message-ID: <41FE0B8A.10635.F25DCB4C@localhost>
In-Reply-To: <20050131134506.34A3AC4@frisbee.gerritsacc.nl>
We had the same thing happen on a PE2500 with PERC3/DI, different
drive configuration.
Text from my tech "diary" of sorts that I keep regarding unique
problems I run into:
"The system would not boot after all the replacements were there,
claiming "no boot device." For one reason or another, the RAID
controller detected the new drive, but rather than just add the drive
into the container and begin rebuilding, it simply offlined the
container, which resulted in the "no boot device" message. I suspect
this was because we did not properly inform the controller that we
were taking a drive offline before we did so. We did it while the
machine was turned off. We probably would have had better luck if we
had booted the system, gone into afacli, prepared the enclosure slot
for removal, removed the drive, inserted the new drive, and
issued the proper commands to begin the rebuild if it didn't start
automatically. The OS afacli is much more robust than the RAID BIOS
utilities.
The solution to get the drive into the RAID container and get it to
begin rebuilding was to go into the RAID BIOS and assign the new
drive as the failover drive. As soon as I did that, the container
started rebuilding. I exited the utility (which automatically saves
any settings) and rebooted. The system came back up just fine, like
normal, and is now happily rebuilding the RAID array. No data was
lost and we now have as close to a completely new system as you can
get short of replacing the entire thing."
It was just one part of a major system overhaul due to a "ghost" in
the SCSI system that kept offlining our container under near-random,
non-reproducible conditions, but I suspect a similar procedure may help
in your instance.
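
If you want to check the controller state from a script before and after
assigning the failover drive, something like the following might do. It is
only a rough sketch in Python, assuming the afacli text output looks like
the "container list" and "container show failover" output quoted below. The
function names and the parsing rules are my own assumptions, not anything
afacli provides, and the sample text is copied from the quoted message.

```python
import re

# Sample output copied from the quoted message (whitespace is not significant here).
CONTAINER_LIST = """\
Num Total Oth Chunk Scsi Partition
Label Type Size Ctr Size Usage B:ID:L Offset:Size
----- ------ ------ --- ------ ------- ------ -------------
0 Volume 8.47GB Open 0:00:0 64.0KB:8.47GB
/dev/sda NT

1 RAID-5 16.9GB 32KB Open 0:01:0 64.0KB:8.47GB
/dev/sdb DATA 0:02:0 64.0KB:8.47GB
?:??:? - Missing -
"""

FAILOVER = """\
Container Scsi B:ID:L
--------- ----------------------------------
0 --- No Devices Assigned ---
1 0:03:0
"""


def degraded_containers(container_list_output):
    """Return the numbers of containers that list a '- Missing -' partition."""
    degraded = []
    current = None
    for raw in container_list_output.splitlines():
        line = raw.lstrip("> ")  # drop email quote markers and leading spaces
        # Assumed shape of a container's first row: number, type, total size.
        start = re.match(r"^(\d+)\s+\S+\s+[\d.]+[KMGT]B", line)
        if start:
            current = int(start.group(1))
        elif "Missing" in line and current is not None and current not in degraded:
            degraded.append(current)
    return degraded


def failover_assignments(show_failover_output):
    """Map container number -> list of B:ID:L devices assigned as failover."""
    assignments = {}
    for raw in show_failover_output.splitlines():
        line = raw.lstrip("> ")
        row = re.match(r"^(\d+)\s+(.*)$", line)
        if row:
            assignments[int(row.group(1))] = re.findall(r"\b\d+:\d+:\d+\b", row.group(2))
    return assignments


def degraded_with_failover(container_list_output, show_failover_output):
    """For each degraded container, report which failover devices it has."""
    assigned = failover_assignments(show_failover_output)
    return {n: assigned.get(n, []) for n in degraded_containers(container_list_output)}


if __name__ == "__main__":
    for number, devices in degraded_with_failover(CONTAINER_LIST, FAILOVER).items():
        print("container", number, "is degraded; failover devices:", devices or "none")
```

A degraded container that already has a failover device assigned but is not
rebuilding is the situation you describe, so a check like this only tells
you where things stand. It does not start the rebuild.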
Andrew
On 31 Jan 2005 at 14:49, Kit Gerrits wrote:
> Hey all!
>
> I have a PowerEDGE 2400 with PERC2/Si with 4x9GB Drives with RedHat
> EL 3.0
>   Container 0: plain 9GB drive (O/S)
>   Container 1: 3x9GB in RAID5 (data)
>
> After getting I/O errors (and getting a strange noise from drive
> 0:3:0), I did the unthinkable: I pulled the drive from the chassis
> without shutting it down. (oops)
>
> I have now verified the drive, cleaned off the partition and rescanned
> the bus... but the drive won't failover.
>
> I have set it to failover, but the PERC won't failover the drive, even
> after a (warm) reboot.
>
> Did I forget anything?
>
> Thanks in advance,
>
> Kit Gerrits
> kit@gerritsaa.nl
>
>
> ---------------
> Debugging info:
> ---------------
>
> AFA0> disk list
> Executing: disk list
>
> B:ID:L  Device Type     Blocks    Bytes/Block Usage            Shared Rate
> ------  --------------  --------- ----------- ---------------- ------ ----
> 0:00:0  Disk            17783240  512         Initialized      NO     80
> 0:01:0  Disk            17783240  512         Initialized      NO     80
> 0:02:0  Disk            17783240  512         Initialized      NO     80
> 0:03:0  Disk            17783240  512         Initialized      NO     80
>
> AFA0> container show failover
> Executing: container show failover
>
> Container Scsi B:ID:L
> --------- ----------------------------------
> 0 --- No Devices Assigned ---
> 1 0:03:0
>
> AFA0> container list
> Executing: container list
> Num          Total  Oth Chunk          Scsi   Partition
> Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size
> ----- ------ ------ --- ------ ------- ------ -------------
>  0    Volume 8.47GB            Open    0:00:0 64.0KB:8.47GB
>  /dev/sda             NT
>
>  1    RAID-5 16.9GB     32KB   Open    0:01:0 64.0KB:8.47GB
>  /dev/sdb             DATA             0:02:0 64.0KB:8.47GB
>                                        ?:??:? - Missing -
>
> AFA0> controller show au
> Executing: controller show automatic_failover
> Automatic failover ENABLED
>
> AFA0> container scrub 1
> Executing: container scrub 1
> Command Error: <The controller was unable to perform a scrub
> (consistency check) operation on the container because one or more of
> the container's partitions failed.>
Sincerely,
Andrew Kinney
President and
Chief Technology Officer
Advantagecom Networks, Inc.
http://www.advantagecom.net