From: Iordan Iordanov <iordan@cdf.toronto.edu>
To: Roberto Nunnari <roberto.nunnari@supsi.ch>
Cc: linux-raid@vger.kernel.org
Subject: Re: failed drive in raid 1 array
Date: Thu, 24 Feb 2011 11:05:53 -0500
Message-ID: <4D6681E1.5010905@cdf.toronto.edu>
In-Reply-To: <4D658672.4000503@supsi.ch>

Hi guys,

I saw a lot of discussion about devices changing names when hot-plugged.
As long as you pick the right device when you first add it to the
array, all is good: from then on the superblock on the device, not its
name, is used to "discover" it.
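
Since names can shift, it may help to check which device nodes the
kernel currently lists as members of an array before adding or removing
anything. Below is a minimal sketch, assuming the /proc/mdstat layout
quoted further down in this thread; md_members is a made-up helper name,
not part of mdadm.

```shell
#!/bin/sh
# Hypothetical helper: print the member devices of one md array, one per
# line, reading mdstat-format text from stdin.
# Members look like "sdb4[2]" or "sdb4[2](F)"; the "(F)" marker on a
# failed member is kept so it stays visible.
md_members() {
    awk -v md="$1" '$1 == md && $2 == ":" {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\[[0-9]+\]/) print $i
    }'
}
```

Typical use would be `md_members md1 < /proc/mdstat`.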

However, to make things easier/clearer, and to avoid errors, one can 
take a look at the set of directories:

/dev/disk/by-id
/dev/disk/by-path
/dev/disk/by-uuid
/dev/disk/by-label

for a predictable, more stable view of the drives. udev creates the
symlinks in these directories, and each one simply points to a "real"
device node such as /dev/sd{a-z}*. You can use the symlinks just to
verify that you are adding the right device, or you can pass the
symlink itself when adding the device.
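
The verification step can be sketched in a few lines of shell. This is
only an illustration: the function names are made up, and it assumes a
readlink that supports -f (GNU coreutils does).

```shell
#!/bin/sh
# Print the device node that a persistent symlink resolves to.
resolve_disk_link() {
    readlink -f "$1"
}

# Succeed only if the symlink resolves to the expected device node.
# Both sides are canonicalized so the comparison is path-for-path.
link_points_to() {
    [ "$(resolve_disk_link "$1")" = "$(readlink -f "$2")" ]
}

# List every symlink in a directory with the node it resolves to.
# Defaults to /dev/disk/by-id when no directory is given.
list_disk_links() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(resolve_disk_link "$link")"
    done
}
```

With these, something like
`link_points_to /dev/disk/by-id/<link> /dev/sdb4 && mdadm /dev/md1 -a /dev/sdb4`
would add the partition only when the name and the link agree, where
<link> is a placeholder for whatever entry belongs to the new drive.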

At our location, we even augmented udev to add links to labeled GPT 
partitions in /dev/disk/by-label, and now our drives/partitions look 
like this:

iscsi00-drive00-part00 -> ../../sda1
iscsi00-drive01-part00 -> ../../sdb1
iscsi00-drive02-part00 -> ../../sdc1
iscsi00-drive03-part00 -> ../../sdd1
iscsi00-drive04-part00 -> ../../sde1

This way, we know exactly which bay contains which drive, and it stays
this way. If you guys want, I can share the necessary udev changes and
the script that extracts the GPT label and reports it to udev for this
magic to happen :). Please reply to this thread with a request if you
think it may be useful to you.
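
For a rough idea of the shape such a setup could take, here is a minimal
sketch. It is not the script offered above: the helper name, the install
path, and the rule in the comment are all assumptions, and it relies on
a blkid recent enough to report the PARTLABEL tag.

```shell
#!/bin/sh
# Hypothetical helper: print the GPT partition name of a device node.
# Characters that are awkward in a symlink name (spaces, slashes, ...)
# are replaced with "_". Prints nothing if blkid reports no PARTLABEL.
gpt_label() {
    blkid -o value -s PARTLABEL "$1" | tr -c 'A-Za-z0-9._\n-' '_'
}

# A udev rule along these lines could then turn the label into a symlink
# (assumes the helper is installed as an executable script at
# /usr/local/sbin/gpt-label):
#
#   SUBSYSTEM=="block", ENV{DEVTYPE}=="partition", \
#     PROGRAM="/usr/local/sbin/gpt-label $devnode", RESULT=="?*", \
#     SYMLINK+="disk/by-label/$result"
```

Recent udev versions already ship a /dev/disk/by-partlabel directory, so
on a current system a custom rule may not be needed at all.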

Cheers,
Iordan


On 02/23/11 17:13, Roberto Nunnari wrote:
> Roberto Spadim wrote:
>> hmm, maybe you are using mdadm.conf or autodetect. without autodetect
>> it should be something like this
>> (i don't know the best solution, but it works hehehe):
>>
>> kernel /vmlinuz-2.6.9-89.31.1.ELsmp ro root=/dev/md0 rhgb
>> quiet md=0,/dev/sda,/dev/sdb md=1,xxxx,yyyy.....
>>
>> or another md array...
>>
>> hmm, i read the sata specification and removing the cable isn't a
>> problem: at the electrical level the sata channel is only data, no
>> power source, and all channels are differential (like rs422 or
>> rs485), so i don't see any problem removing it. i tried to hot plug a
>> revodrive (pci-express ssd) and it didn't work (reboot) hehehe,
>> pci-express isn't hot plug =P, sata2 doesn't have that problem. the
>> main risk is a short circuit at the power source; if you remove it
>> with caution, no problems =)
>>
>> i tried some other distros and udev created a new device name when i
>> added a different disk: for example, removing sdb and adding another
>> disk created sdc (not sdb). maybe with another udev configuration it
>> would work
>
> Ok. I'll keep all that in mind tomorrow.
> Best regards.
> Robi
>
>
>>
>>
>> 2011/2/23 Roberto Nunnari <roberto.nunnari@supsi.ch>:
>>> Roberto Spadim wrote:
>>>> i don't know how you set up your kernel (with or without raid
>>> I use the official CentOS kernel with no modification and don't
>>> know about raid autodetect, but:
>>> # cat /boot/config-2.6.24-28-server |grep -i raid
>>> CONFIG_BLK_DEV_3W_XXXX_RAID=m
>>> CONFIG_MD_RAID0=m
>>> CONFIG_MD_RAID1=m
>>> CONFIG_MD_RAID10=m
>>> CONFIG_MD_RAID456=m
>>> CONFIG_MD_RAID5_RESHAPE=y
>>> CONFIG_MEGARAID_LEGACY=m
>>> CONFIG_MEGARAID_MAILBOX=m
>>> CONFIG_MEGARAID_MM=m
>>> CONFIG_MEGARAID_NEWGEN=y
>>> CONFIG_MEGARAID_SAS=m
>>> CONFIG_RAID_ATTRS=m
>>> CONFIG_SCSI_AACRAID=m
>>>
>>>
>>>> autodetect?) do you use the kernel command line to set up raid? autodetect?
>>> /dev/md0 in grub
>>> I don't know if that means autodetect, but I guess so..
>>>
>>>
>>>> here on my test machine i'm using the kernel command line (grub). i
>>>> don't have a server with a hotplug bay, so i open the case and remove
>>>> the cable by hand =) after reconnecting it with another device the kernel
>>> Is it safe? Isn't it a blind bet to fry up the controller and/or disk?
>>>
>>>
>>>> recognizes the new device, rereads the partitions etc etc and i can add
>>>> it to the array again
>>>> my grub is something like:
>>>>
>>>> md=0,/dev/sda,/dev/sdb .....
>>>>
>>>> internal metadata, raid1. i didn't like the autodetect (it's good)
>>>> but i prefer a hardcoded kernel command line (it's not good with usb
>>>> devices)
>>> the relevant part of my grub is:
>>>
>>> default=0
>>> timeout=5
>>> splashimage=(hd0,0)/grub/splash.xpm.gz
>>> hiddenmenu
>>> title CentOS (2.6.9-89.31.1.ELsmp)
>>> root (hd0,0)
>>> kernel /vmlinuz-2.6.9-89.31.1.ELsmp ro root=/dev/md0 rhgb quiet
>>> initrd /initrd-2.6.9-89.31.1.ELsmp.img
>>>
>>> Best regards.
>>> Robi
>>>
>>>
>>>> 2011/2/23 Roberto Nunnari <roberto.nunnari@supsi.ch>:
>>>>> Roberto Spadim wrote:
>>>>>> sata2 without hot plug?
>>>>> Hi Roberto.
>>>>>
>>>>> I mean that there is no hot-plug bay, with sliding rails etc..
>>>>> The drives are connected to the mb using standard sata cables.
>>>>>
>>>>>
>>>>>> check if your sda sdb sdc will change after removing it, it depends
>>>>>> on your udev or another /dev filesystem
>>>>> Ok, thank you.
>>>>> That means that if I take care to check the above, and
>>>>> the new drive will be sdb, then taking the steps indicated
>>>>> in my original post will do the job?
>>>>>
>>>>> Best regards.
>>>>> Robi
>>>>>
>>>>>
>>>>>> 2011/2/23 Roberto Nunnari <roberto.nunnari@supsi.ch>:
>>>>>>> Hello.
>>>>>>>
>>>>>>> I have a linux box, with two 2TB sata HD in raid 1.
>>>>>>>
>>>>>>> Now, one disk is in failed state and it has no spares:
>>>>>>> # cat /proc/mdstat
>>>>>>> Personalities : [raid1]
>>>>>>> md1 : active raid1 sdb4[2](F) sda4[0]
>>>>>>>       1910200704 blocks [2/1] [U_]
>>>>>>>
>>>>>>> md0 : active raid1 sdb1[1] sda2[0]
>>>>>>>       40957568 blocks [2/2] [UU]
>>>>>>>
>>>>>>> unused devices: <none>
>>>>>>>
>>>>>>>
>>>>>>> The drives are not hot-plug, so I need to shutdown the box.
>>>>>>>
>>>>>>> My plan is to:
>>>>>>> # sfdisk -d /dev/sdb > sdb.sfdisk
>>>>>>> # mdadm /dev/md1 -r /dev/sdb4
>>>>>>> # mdadm /dev/md0 -r /dev/sdb1
>>>>>>> # shutdown -h now
>>>>>>>
>>>>>>> replace the disk and boot (it should come back up, even without
>>>>>>> one drive, right?)
>>>>>>>
>>>>>>> # sfdisk /dev/sdb < sdb.sfdisk
>>>>>>> # mdadm /dev/md1 -a /dev/sdb4
>>>>>>> # mdadm /dev/md0 -a /dev/sdb1
>>>>>>>
>>>>>>> and the drives should start to resync, right?
>>>>>>>
>>>>>>> This is the first time I have done such a thing, so please correct
>>>>>>> me if the above is not correct, or is not best practice for
>>>>>>> my configuration.
>>>>>>>
>>>>>>> My last backup of md1 is from mid-November, so I need to be
>>>>>>> pretty sure I will not lose my data (over 1TB).
>>>>>>>
>>>>>>> A bit about my environment:
>>>>>>> # mdadm --version
>>>>>>> mdadm - v1.12.0 - 14 June 2005
>>>>>>> # cat /etc/redhat-release
>>>>>>> CentOS release 4.8 (Final)
>>>>>>> # uname -rms
>>>>>>> Linux 2.6.9-89.31.1.ELsmp i686
>>>>>>>
>>>>>>> Thank you very much and best regards.
>>>>>>> Robi
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
