Linux RAID subsystem development
* Questions
@ 2016-02-14  4:28 o1bigtenor
  2016-02-14  6:34 ` Questions Adam Goryachev
  0 siblings, 1 reply; 14+ messages in thread
From: o1bigtenor @ 2016-02-14  4:28 UTC (permalink / raw)
  To: Linux-RAID

Greetings

My raid 10 array was the subject of a number of exchanges on this
board a few months ago.
With the generous assistance of members here things were reestablished
and have been working well. Today a VirtualBox VM crashed and, in
the process, caused other system issues. To clear the mess I used a
number of hard stops (shutting the system off using the button on the
case). On rebooting I found that one of the drives in the
array no longer responds; it issues a number of clicks during boot
and nothing else happens. Even though it is a RAID 10
array, the array is no longer mounted or available. I have removed the
faulty drive already. I have an appropriately sized drive available
that I could place into the machine.

1. should I reformat the drive (to be placed into the machine)?
2. what sequence of commands should I be using for this new drive to
be included into the array?
3. what sequence of commands should I use to remount the array?

TIA

Dee


* Re: Questions
  2016-02-14  4:28 Questions o1bigtenor
@ 2016-02-14  6:34 ` Adam Goryachev
  2016-02-14 11:53   ` Questions o1bigtenor
  0 siblings, 1 reply; 14+ messages in thread
From: Adam Goryachev @ 2016-02-14  6:34 UTC (permalink / raw)
  To: o1bigtenor, Linux-RAID



On 14/02/2016 15:28, o1bigtenor wrote:
> Greetings
>
> My raid 10 array was the subject of a number of exchanges on this
> board a few months ago.
> With the generous assistance of members here things were reestablished
> and have been working well. Today I had a VirtualBox VM crater and in
> the process cause other system issues. In process to clear the mess a
> number of hard stops (shutting the system off using the button on the
> case) were used. In rebooting I found that one of the drives in the
> array is no longer responding issuing a number of clicks in the boot
> up process with nothing else happening. Even though it is a RAID 10
> array the array is no longer mounted nor available. I have removed the
> faulty drive already. I have an appropriately sized drive available
> that I could place into the machine.
>
> 1. should I reformat the drive (to be placed into the machine)?
> 2. what sequence of commands should I be using for this new drive to
> be included into the array?
> 3. what sequence of commands should I use to remount the array?

The first thing I would suggest is to let everyone know the status of the
current array, so we can work out how to get it working.

Can you send the output of cat /proc/mdstat and mdadm --misc --detail 
/dev/md?

Assuming the existing array is in a "normal" status, albeit degraded, 
then it should be pretty simple to just partition the new drive to match 
the other members, and then simply add the new partition to the array 
(mdadm --manage /dev/md? --add /dev/sdxy)

Try to take things slowly, as doing the wrong thing might make a simple 
recovery into a very sad event (loss of all the data).
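
As a rough sketch of that first status check (my own illustration, not a
tested procedure): the helper below reads mdstat-format text and prints the
name of any array whose member map, such as [U_UU], contains an underscore,
which marks a missing member. The function name degraded_arrays is made up
for this example, and it assumes the member map is the last field on its
line, as in typical /proc/mdstat output.

```shell
#!/bin/sh
# Print the names of md arrays that report a missing member.
# Reads mdstat-format text on stdin, so it can be tried on saved output first.
degraded_arrays() {
  awk '
    /^md[0-9]+ :/ { name = $1 }                  # remember which array this block describes
    /\[[U_]+\]$/  { if ($NF ~ /_/) print name }  # "_" in the member map = absent member
  '
}

# Demo on a sample in /proc/mdstat format; prints: md0
degraded_arrays <<'EOF'
Personalities : [raid10]
md0 : active (auto-read-only) raid10 sde1[5] sdc1[4] sdb1[3]
      1953518592 blocks super 1.2 512K chunks 2 near-copies [4/3] [U_UU]

unused devices: <none>
EOF

# On a live system (read-only, safe to run): degraded_arrays < /proc/mdstat
```

Because it only reads text, it can be run before touching the array at all.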

Regards,
Adam



* Re: Questions
  2016-02-14  6:34 ` Questions Adam Goryachev
@ 2016-02-14 11:53   ` o1bigtenor
  2016-02-14 12:24     ` Questions Adam Goryachev
  0 siblings, 1 reply; 14+ messages in thread
From: o1bigtenor @ 2016-02-14 11:53 UTC (permalink / raw)
  To: Adam Goryachev; +Cc: Linux-RAID

On Sun, Feb 14, 2016 at 12:34 AM, Adam Goryachev
<mailinglists@websitemanagers.com.au> wrote:
>
>
> On 14/02/2016 15:28, o1bigtenor wrote:
>>
>> Greetings
>>
>> My raid 10 array was the subject of a number of exchanges on this
>> board a few months ago.
>> With the generous assistance of members here things were reestablished
>> and have been working well. Today I had a VirtualBox VM crater and in
>> the process cause other system issues. In process to clear the mess a
>> number of hard stops (shutting the system off using the button on the
>> case) were used. In rebooting I found that one of the drives in the
>> array is no longer responding issuing a number of clicks in the boot
>> up process with nothing else happening. Even though it is a RAID 10
>> array the array is no longer mounted nor available. I have removed the
>> faulty drive already. I have an appropriately sized drive available
>> that I could place into the machine.
>>
>> 1. should I reformat the drive (to be placed into the machine)?
>> 2. what sequence of commands should I be using for this new drive to
>> be included into the array?
>> 3. what sequence of commands should I use to remount the array?
>
>
> First thing I would suggest is to let everyone know the status of the
> current array, and how to get it working.
>
> Can you send the output of cat /proc/mdstat and mdadm --misc --detail
> /dev/md?

# cat /proc/mdstat
Personalities : [raid10]
md0 : active (auto-read-only) raid10 sde1[5] sdc1[4] sdb1[3]
      1953518592 blocks super 1.2 512K chunks 2 near-copies [4/3] [U_UU]

unused devices: <none>

# mdadm --misc --detail /dev/md
mdadm: /dev/md does not appear to be an md device

# mdadm --misc --detail /dev/md/0
/dev/md/0:
        Version : 1.2
  Creation Time : Mon Mar  5 08:26:28 2012
     Raid Level : raid10
     Array Size : 1953518592 (1863.02 GiB 2000.40 GB)
  Used Dev Size : 976759296 (931.51 GiB 1000.20 GB)
   Raid Devices : 4
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Sat Feb 13 17:21:51 2016
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : debianbase:0  (local to host debianbase)
           UUID : 79baaa2f:0aa2b9fa:18e2ea6b:6e2846b3
         Events : 60241

    Number   Major   Minor   RaidDevice State
       5       8       65        0      active sync set-A   /dev/sde1
       2       0        0        2      removed
       4       8       33        2      active sync set-A   /dev/sdc1
       3       8       17        3      active sync set-B   /dev/sdb1


>
> Assuming the existing array is in a "normal" status, albeit degraded, then
> it should be pretty simple to just partition the new drive to match the
> other members, and then simply add the new partition to the array (mdadm
> --manage /dev/md? --add /dev/sdxy)

Drive I wish to use for replacement has had some use.
Should I be reformatting it?

Thank you Adam!

Regards

Dee


* Re: Questions
  2016-02-14 11:53   ` Questions o1bigtenor
@ 2016-02-14 12:24     ` Adam Goryachev
  2016-02-15 12:12       ` Questions o1bigtenor
  0 siblings, 1 reply; 14+ messages in thread
From: Adam Goryachev @ 2016-02-14 12:24 UTC (permalink / raw)
  To: o1bigtenor; +Cc: Linux-RAID

On 14 February 2016 10:53:48 pm AEDT, o1bigtenor <o1bigtenor@gmail.com> wrote:
>On Sun, Feb 14, 2016 at 12:34 AM, Adam Goryachev
><mailinglists@websitemanagers.com.au> wrote:
>>
>>
>> On 14/02/2016 15:28, o1bigtenor wrote:
>>>
>>> Greetings
>>>
>>> My raid 10 array was the subject of a number of exchanges on this
>>> board a few months ago.
>>> With the generous assistance of members here things were
>reestablished
>>> and have been working well. Today I had a VirtualBox VM crater and
>in
>>> the process cause other system issues. In process to clear the mess
>a
>>> number of hard stops (shutting the system off using the button on
>the
>>> case) were used. In rebooting I found that one of the drives in the
>>> array is no longer responding issuing a number of clicks in the boot
>>> up process with nothing else happening. Even though it is a RAID 10
>>> array the array is no longer mounted nor available. I have removed
>the
>>> faulty drive already. I have an appropriately sized drive available
>>> that I could place into the machine.
>>>
>>> 1. should I reformat the drive (to be placed into the machine)?
>>> 2. what sequence of commands should I be using for this new drive to
>>> be included into the array?
>>> 3. what sequence of commands should I use to remount the array?
>>
>>
>> First thing I would suggest is to let everyone know the status of the
>> current array, and how to get it working.
>>
>> Can you send the output of cat /proc/mdstat and mdadm --misc --detail
>> /dev/md?
>
># cat /proc/mdstat
>Personalities : [raid10]
>md0 : active (auto-read-only) raid10 sde1[5] sdc1[4] sdb1[3]
>     1953518592 blocks super 1.2 512K chunks 2 near-copies [4/3] [U_UU]
>
>unused devices: <none>
>
># mdadm --misc --detail /dev/md
>mdadm: /dev/md does not appear to be an md device
>
># mdadm --misc --detail /dev/md/0
>/dev/md/0:
>        Version : 1.2
>  Creation Time : Mon Mar  5 08:26:28 2012
>     Raid Level : raid10
>     Array Size : 1953518592 (1863.02 GiB 2000.40 GB)
>  Used Dev Size : 976759296 (931.51 GiB 1000.20 GB)
>   Raid Devices : 4
>  Total Devices : 3
>    Persistence : Superblock is persistent
>
>    Update Time : Sat Feb 13 17:21:51 2016
>          State : clean, degraded
> Active Devices : 3
>Working Devices : 3
> Failed Devices : 0
>  Spare Devices : 0
>
>         Layout : near=2
>     Chunk Size : 512K
>
>           Name : debianbase:0  (local to host debianbase)
>           UUID : 79baaa2f:0aa2b9fa:18e2ea6b:6e2846b3
>         Events : 60241
>
>    Number   Major   Minor   RaidDevice State
>       5       8       65        0      active sync set-A   /dev/sde1
>       2       0        0        2      removed
>       4       8       33        2      active sync set-A   /dev/sdc1
>       3       8       17        3      active sync set-B   /dev/sdb1
>
>
>>
>> Assuming the existing array is in a "normal" status, albeit degraded,
>then
>> it should be pretty simple to just partition the new drive to match
>the
>> other members, and then simply add the new partition to the array
>(mdadm
>> --manage /dev/md? --add /dev/sdxy)
>
>Drive I wish to use for replacement has had some use.
>Should I be reformatting it?
>

Not needed, the resync will overwrite the content. You just need to partition it the same as the other drive members. 

Then you can simply add it to the array and it will sync.
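
A minimal sketch of those two steps, written as a dry run that only prints
the commands so they can be checked before anything is executed. The helper
name plan_add_member and the device names are placeholders, and it assumes
MBR-style partition tables and that the array member is the first partition
on each disk; a GPT disk may need a different tool to copy the layout.

```shell
#!/bin/sh
# Print (do not run) the commands to copy a partition layout onto a new disk
# and add its first partition to a degraded md array.
plan_add_member() {
  md="$1"; healthy_disk="$2"; new_disk="$3"
  # Dump the partition table of a healthy member and write it to the new disk.
  echo "sfdisk -d $healthy_disk | sfdisk $new_disk"
  # Add the new partition; md starts rebuilding onto it.
  echo "mdadm --manage $md --add ${new_disk}1"
  # Check rebuild progress.
  echo "cat /proc/mdstat"
}

# /dev/sdX is a placeholder for whatever name the new drive actually gets.
plan_add_member /dev/md0 /dev/sdb /dev/sdX
```

Double-check which disk is the source and which is the target before running
the printed sfdisk line; getting them the wrong way round would overwrite the
partition table of a good member.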

Also, you should already have access to your data if you just mount the array (assuming it contains a file system).

Let us know if you need any more help. 

Regards
Adam





* Re: Questions
  2016-02-14 12:24     ` Questions Adam Goryachev
@ 2016-02-15 12:12       ` o1bigtenor
  2016-02-15 19:50         ` Questions Wols Lists
  0 siblings, 1 reply; 14+ messages in thread
From: o1bigtenor @ 2016-02-15 12:12 UTC (permalink / raw)
  To: Adam Goryachev; +Cc: Linux-RAID

Greetings

On Sun, Feb 14, 2016 at 6:24 AM, Adam Goryachev
<mailinglists@websitemanagers.com.au> wrote:
> On 14 February 2016 10:53:48 pm AEDT, o1bigtenor <o1bigtenor@gmail.com> wrote:
>>On Sun, Feb 14, 2016 at 12:34 AM, Adam Goryachev
>><mailinglists@websitemanagers.com.au> wrote:
>>>
>>>
>>> On 14/02/2016 15:28, o1bigtenor wrote:
>>>>
>>>> Greetings
>>>>
>>>> My raid 10 array was the subject of a number of exchanges on this
>>>> board a few months ago.
>>>> With the generous assistance of members here things were
>>reestablished
>>>> and have been working well. Today I had a VirtualBox VM crater and
>>in
>>>> the process cause other system issues. In process to clear the mess
>>a
>>>> number of hard stops (shutting the system off using the button on
>>the
>>>> case) were used. In rebooting I found that one of the drives in the
>>>> array is no longer responding issuing a number of clicks in the boot
>>>> up process with nothing else happening. Even though it is a RAID 10
>>>> array the array is no longer mounted nor available. I have removed
>>the
>>>> faulty drive already. I have an appropriately sized drive available
>>>> that I could place into the machine.
>>>>
>>>> 1. should I reformat the drive (to be placed into the machine)?
>>>> 2. what sequence of commands should I be using for this new drive to
>>>> be included into the array?
>>>> 3. what sequence of commands should I use to remount the array?
>>>
>>>
>>> First thing I would suggest is to let everyone know the status of the
>>> current array, and how to get it working.
>>>
>>> Can you send the output of cat /proc/mdstat and mdadm --misc --detail
>>> /dev/md?
>>
>># cat /proc/mdstat
>>Personalities : [raid10]
>>md0 : active (auto-read-only) raid10 sde1[5] sdc1[4] sdb1[3]
>>     1953518592 blocks super 1.2 512K chunks 2 near-copies [4/3] [U_UU]
>>
>>unused devices: <none>
>>
>># mdadm --misc --detail /dev/md
>>mdadm: /dev/md does not appear to be an md device
>>
>># mdadm --misc --detail /dev/md/0
>>/dev/md/0:
>>        Version : 1.2
>>  Creation Time : Mon Mar  5 08:26:28 2012
>>     Raid Level : raid10
>>     Array Size : 1953518592 (1863.02 GiB 2000.40 GB)
>>  Used Dev Size : 976759296 (931.51 GiB 1000.20 GB)
>>   Raid Devices : 4
>>  Total Devices : 3
>>    Persistence : Superblock is persistent
>>
>>    Update Time : Sat Feb 13 17:21:51 2016
>>          State : clean, degraded
>> Active Devices : 3
>>Working Devices : 3
>> Failed Devices : 0
>>  Spare Devices : 0
>>
>>         Layout : near=2
>>     Chunk Size : 512K
>>
>>           Name : debianbase:0  (local to host debianbase)
>>           UUID : 79baaa2f:0aa2b9fa:18e2ea6b:6e2846b3
>>         Events : 60241
>>
>>    Number   Major   Minor   RaidDevice State
>>       5       8       65        0      active sync set-A   /dev/sde1
>>       2       0        0        2      removed
>>       4       8       33        2      active sync set-A   /dev/sdc1
>>       3       8       17        3      active sync set-B   /dev/sdb1
>>
>>
>>>
>>> Assuming the existing array is in a "normal" status, albeit degraded,
>>then
>>> it should be pretty simple to just partition the new drive to match
>>the
>>> other members, and then simply add the new partition to the array
>>(mdadm
>>> --manage /dev/md? --add /dev/sdxy)
>>
>>Drive I wish to use for replacement has had some use.
>>Should I be reformatting it?
>>
>
> Not needed, the resync will overwrite the content. You just need to partition it the same as the other drive members.
>
> Then you can simply add it to the array and it will sync.
>
> Also, you should have access to your stay content already if you just mount it (assuming it contains a file system )
>
> Let us know if you need any more help.
>

Thank you to Mr Adam for his assistance!

Installed the drive and, with a wait time for results, everything is
working very well.

I am looking at replacing ALL the drives in the array so that I can
reduce the likelihood of these kinds of issues for longer than
a few months.

Would you be able to tell me what steps I should take to replace
the entire array?

Should I replace the drives one at a time (sort of just like I did
this time) using the same commands?

If so is there an easy way of mounting the array?

Regards

Dee


* Re: Questions
  2016-02-15 12:12       ` Questions o1bigtenor
@ 2016-02-15 19:50         ` Wols Lists
  2016-02-15 21:01           ` Questions o1bigtenor
  0 siblings, 1 reply; 14+ messages in thread
From: Wols Lists @ 2016-02-15 19:50 UTC (permalink / raw)
  To: o1bigtenor, Adam Goryachev; +Cc: Linux-RAID

On 15/02/16 12:12, o1bigtenor wrote:
> I am looking at replacing ALL the drives in the array so that I can
> reduce the likelihood of these kind of issues for a longer period than
> a few months.
> 
> Would you be able to tell me what steps should I be taking to replace
> the entire array?
> 
> Should I replace the drives one at a time (sort of just like I did
> this time) using the same commands?

If you have a spare SATA (I presume they're sata drives) slot, then
definitely not. It might be worth buying an add-in card to give you an
extra slot, they're only a few quid.

Look up "mdadm --replace". That'll keep the old drive in the array until
the new one has replaced it, keeping the array fully functional all the
time.

Doing --fail, --remove, --add places the array in danger until the
entire sequence is complete, and if you're doing it several times then
you're massively increasing the chances of trouble.

In fact, if you get an add-in card, you might be able to replace several
drives at once - I don't know - read up and see if it's supported (or not).

Cheers,
Wol


* Re: Questions
  2016-02-15 19:50         ` Questions Wols Lists
@ 2016-02-15 21:01           ` o1bigtenor
  2016-02-15 22:05             ` Questions Adam Goryachev
  2016-02-15 22:09             ` Questions Wols Lists
  0 siblings, 2 replies; 14+ messages in thread
From: o1bigtenor @ 2016-02-15 21:01 UTC (permalink / raw)
  To: Wols Lists; +Cc: Adam Goryachev, Linux-RAID

On Mon, Feb 15, 2016 at 1:50 PM, Wols Lists <antlists@youngman.org.uk> wrote:
> On 15/02/16 12:12, o1bigtenor wrote:
>> I am looking at replacing ALL the drives in the array so that I can
>> reduce the likelihood of these kind of issues for a longer period than
>> a few months.
>>
>> Would you be able to tell me what steps should I be taking to replace
>> the entire array?
>>
>> Should I replace the drives one at a time (sort of just like I did
>> this time) using the same commands?
>
> If you have a spare SATA (I presume they're sata drives) slot, then
> definitely not. It might be worth buying an add-in card to give you an
> extra slot, they're only a few quid.
>
> Look up "mdadm --replace". That'll keep the old drive in the array until
> the new one has replaced it, keeping the array fully functional all the
> time.
>
> Doing --fail, --remove, --add places the array in danger until the
> entire sequence is complete, and if you're doing it several times then
> you're massively increasing the chances of trouble.
>
> In fact, if you get an add-in card, you might be able to replace several
> drives at once - I don't know - read up and see if it's supported (or not).

Looked through well over a hundred different pages and only found 1 page
that even hinted at what I am trying to do and details there were sparse.

Read through the man page (likely one of the most understandable that
I have found so far) and it doesn't address the wholesale replacement of all
the drives in an array. Most of the other pages that I found were what I would
call old, i.e. before 2010. I have found that all too often the software being
discussed has since changed, so the instructions given are not always useful;
there is only occasionally some value in these old pages.

Most often the documentation was focused on replacing a faulty disc.

This is NOT what I am proposing to do.

I have a working array at present and wish to replace its components with new
drives of the same size (which are NAS rated). I was thinking that using
the fail, remove and add process 4 separate times might not be a good thing,
but I do not know of a different option. Compounding the difficulty is that
there are no empty hard drive slots in the machine. I do have an external
USB 3.0 2-drive holder that could be used.

The only suggestion in all the documents I perused was to place spare drives
into something like this external box and then add the drives into the array.
The process was not laid out and leaves me with a number of questions.

Is there a suggested method for replacing ALL the drives in an array (raid 10
in this case)?

If I use the external box (it only holds 2 drives), how do I transfer the
information from the drives in the array to the new drives and then just
swap the drives into the machine 2 at a time without issues? During the
transfer the new drives will be sdg and sdh (AFAIK), and later they will be
some of sdb, sdc, sde, and/or sdf.

If one of the gurus on the list has an already prepared process list with the
steps and commands, that would be wonderful. I am thinking that such a document
would be very much appreciated by many others out there who are not on this list.

TIA

Dee


* Re: Questions
  2016-02-15 21:01           ` Questions o1bigtenor
@ 2016-02-15 22:05             ` Adam Goryachev
  2016-02-16 11:46               ` Questions o1bigtenor
  2016-02-15 22:09             ` Questions Wols Lists
  1 sibling, 1 reply; 14+ messages in thread
From: Adam Goryachev @ 2016-02-15 22:05 UTC (permalink / raw)
  To: o1bigtenor, Wols Lists; +Cc: Linux-RAID

On 16/02/16 08:01, o1bigtenor wrote:
> On Mon, Feb 15, 2016 at 1:50 PM, Wols Lists <antlists@youngman.org.uk> wrote:
>> On 15/02/16 12:12, o1bigtenor wrote:
>>> I am looking at replacing ALL the drives in the array so that I can
>>> reduce the likelihood of these kind of issues for a longer period than
>>> a few months.
>>>
>>> Would you be able to tell me what steps should I be taking to replace
>>> the entire array?
>>>
>>> Should I replace the drives one at a time (sort of just like I did
>>> this time) using the same commands?
>> If you have a spare SATA (I presume they're sata drives) slot, then
>> definitely not. It might be worth buying an add-in card to give you an
>> extra slot, they're only a few quid.
>>
>> Look up "mdadm --replace". That'll keep the old drive in the array until
>> the new one has replaced it, keeping the array fully functional all the
>> time.
>>
>> Doing --fail, --remove, --add places the array in danger until the
>> entire sequence is complete, and if you're doing it several times then
>> you're massively increasing the chances of trouble.
>>
>> In fact, if you get an add-in card, you might be able to replace several
>> drives at once - I don't know - read up and see if it's supported (or not).
> Looked through well over a hundred different pages and only found 1 page
> that even hinted at what I am trying to do and details there were sparse.
>
> Read through the man page (likely one of the best for understandability that
> I have found so far) and it doesn't address the wholesale replacement of all
> the drives in an array. Most of the other pages that I found were what I would
> call old, ie before 2010, as I have found that all too often the software being
> discussed has changed and then the instructions given are not always useful,
> occasionally there is some value in these old pages but only sometimes.
>
> Most often the documentation was focused on replacing a faulty disc.
>
> This is NOT what I am proposing to do.
>
> I have a present working array and with to replace its components with the
> same size but new drives (which are NAS rated). Was thinking that using

I wanted to mention this: what drives do you have right now, and do you know
about SCT ERC?
Maybe start here (but you probably need to read more):
http://www.spinics.net/lists/raid/msg48199.html

Essentially, your current disks might be fine, but if you don't have the 
right settings, they could be "failing" regularly putting your data at 
risk. You should fix any issue here before you attempt to replace your 
drives.
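
To make the "right settings" a little more concrete, here is a hedged sketch
that only prints the commands usually suggested for this check. The helper
name plan_erc is made up; the 7.0 second ERC value (70, in units of 100 ms)
and the 180 second kernel timeout are commonly quoted figures, not values
taken from this thread, so treat them as assumptions to verify for your
drives.

```shell
#!/bin/sh
# Print (do not run) the per-disk commands for inspecting and setting
# SCT ERC, plus the kernel-timeout fallback for drives without ERC support.
plan_erc() {
  for disk in "$@"; do
    # Show whether the drive supports SCT ERC and its current setting.
    echo "smartctl -l scterc $disk"
    # If supported: limit read/write error recovery to 7.0 seconds.
    echo "smartctl -l scterc,70,70 $disk"
    # If NOT supported: raise the kernel's command timeout for this disk instead.
    echo "echo 180 > /sys/block/${disk#/dev/}/device/timeout"
  done
}

plan_erc /dev/sdb /dev/sdc /dev/sde
```

Only one of the last two printed lines applies to a given disk, depending on
what the first command reports. Neither setting normally survives a reboot or
power cycle, so it would have to be reapplied at boot.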

> the fail remove and add process 4 seperate times might not be a good thing
> but I do not know of a different option. Compounding the difficulty is
> that there
> are no empty hard drive slots in the machine. I do have an external USB 3.0
> 2 drive holder that could be used.
>
> The only suggestion in all the documents I perused was to place spare drives
> into something like this external box and then add the drives into the array.
> The process was not laid out and leaves me with a number of questions.
>
> Is there a suggested method for replacing ALL the drives in an array (raid 10
> in this case)?

In order to replace all drives, I would suggest that you simply replace 
one drive 4 times (different drive each time).
The first question to ask, is your external USB drive bay reliable? If 
not, then there are other solutions that are probably less dangerous.

So, put your spare drive in the external USB drive bay; it should show
up as /dev/sdx (for example).
Partition it to match the rest of your existing drives.
Add the new partition to your existing array: mdadm --manage /dev/md0
--add /dev/sdx1
Replace one of the existing members (e.g. /dev/sdb1) with the new one:
mdadm /dev/md0 --replace /dev/sdb1 --with /dev/sdx1
Personally, because I distrust the external USB drive bay (don't ask me 
why, it just seems less reliable than internal sata), once the drive has 
finished being replaced, I would shutdown, remove the old drive, and 
install the replacement drive, then add another new spare, and repeat.
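
One round of that cycle could be sketched as below, again as a dry run that
only prints the commands. The helper name plan_replace and the device names
are placeholders; the sketch assumes an mdadm and kernel recent enough to
support --replace, and that the new partition has already been created.

```shell
#!/bin/sh
# Print (do not run) the commands for swapping one array member for a new one
# while the old member stays active until the copy has finished.
plan_replace() {
  md="$1"; old="$2"; new="$3"
  # Add the new partition as a spare.
  echo "mdadm --manage $md --add $new"
  # Copy the old member onto the spare; the array keeps its redundancy meanwhile.
  echo "mdadm $md --replace $old --with $new"
  # Block until the copy has finished.
  echo "mdadm --wait $md"
  # The replaced member is marked faulty once the copy is done; drop it from the array.
  echo "mdadm $md --remove $old"
}

# Example round: old member /dev/sdb1, new partition /dev/sdx1 (both placeholders).
plan_replace /dev/md0 /dev/sdb1 /dev/sdx1
```

After the last step the machine can be shut down and the drives physically
swapped, then the same round repeated with the next old member.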

You can see this page for some extra information:
http://unix.stackexchange.com/questions/74924/how-to-safely-replace-a-not-yet-failed-disk-in-a-linux-raid5-array
> If I use the external box how do I do this (external box only holds 2 drives) so
> that I can transfer the information on the drives from the array to
> the new drives
> and then just replace the drives 2 at a time into the machine without
> there being
> issues because in the information transfer the drives will be sdg and
> sdh (AFAIK)
> and later they will be some of sdb, sdc, sde, and/or sdf.
I would suggest replacing one at a time.
> If one of the gurus on the list has an already prepared process list
> with the steps
> and commands that would be wonderful. I am thinking that this document would
> very much be appreciated by many others out there that are not on this list.
>
> TIA
>
>

Hope that helps.

Regards,
Adam

-- 
Adam Goryachev Website Managers www.websitemanagers.com.au


* Re: Questions
  2016-02-15 21:01           ` Questions o1bigtenor
  2016-02-15 22:05             ` Questions Adam Goryachev
@ 2016-02-15 22:09             ` Wols Lists
  1 sibling, 0 replies; 14+ messages in thread
From: Wols Lists @ 2016-02-15 22:09 UTC (permalink / raw)
  To: o1bigtenor; +Cc: Adam Goryachev, Linux-RAID

On 15/02/16 21:01, o1bigtenor wrote:
> I have a present working array and with to replace its components with the
> same size but new drives (which are NAS rated). Was thinking that using
> the fail remove and add process 4 seperate times might not be a good thing
> but I do not know of a different option. Compounding the difficulty is
> that there
> are no empty hard drive slots in the machine. I do have an external USB 3.0
> 2 drive holder that could be used.

Does it have a spare PCI slot? That's what I was getting at - can you
add two more sata slots? Presumably not if it's a cage, but if it's a
computer case? As I said, a spare PCI SATA card should be dirt cheap to
add temporary extra capacity. And does it matter if the case is open and
the drives lying around temporarily?
> 
> The only suggestion in all the documents I perused was to place spare drives
> into something like this external box and then add the drives into the array.
> The process was not laid out and leaves me with a number of questions.
> 
> Is there a suggested method for replacing ALL the drives in an array (raid 10
> in this case)?

As far as I'm aware, there's just the "mdadm --replace" I mentioned -
drive by drive. Given that it's raid 10, maybe you can just add another
mirror then fail an old one.

I'd just plug in the extra drives, run mdadm --replace, and then remove
the old drives. Just make sure you get the right drives! And always use
uuids so you know which drive is which!

Get Phil's lsdrv and it will probably give you all the drives, with
serial numbers, etc etc. I haven't managed to run it so I can't be sure
:-) But it's intended to give you all the stuff you need to recover an
array so it should give you the information you need to rebuild it.
> 
> If I use the external box how do I do this (external box only holds 2 drives) so
> that I can transfer the information on the drives from the array to
> the new drives
> and then just replace the drives 2 at a time into the machine without
> there being
> issues because in the information transfer the drives will be sdg and
> sdh (AFAIK)
> and later they will be some of sdb, sdc, sde, and/or sdf.

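That's why they now have uuids and other persistent names.

ls -l /dev/disk/by-id    (persistent names built from model and serial number)
ls -l /dev/disk/by-uuid  (filesystem uuids)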

I *think* raid uses uuids internally. So swapping the drives out won't
be a problem - the sdx names are just used as a human-readable output.
But don't take that as gospel ... However, bear in mind that the kernel
does NOT guarantee that a drive will get the same sdx name from one boot
to the next. It so happens that that is the norm, but it's not
guaranteed ... so x changing for any value of sdx shouldn't be a problem.

Regardless, you should not be using sdx. Everything should be using
uuids, my /etc/fstab is a lovely mangle with all those long uuids
everywhere :-)
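
As a small illustration of working with the array's uuid rather than sdx
names (a sketch only; the helper name array_uuid is made up): the function
below pulls the UUID field out of "mdadm --detail" output, and the demo feeds
it the relevant lines from the output posted earlier in this thread.

```shell
#!/bin/sh
# Print the array UUID from `mdadm --detail` output supplied on stdin.
array_uuid() {
  # The field appears as "UUID : <value>", so match on the first two columns.
  awk '$1 == "UUID" && $2 == ":" { print $3 }'
}

# Demo on saved output; prints: 79baaa2f:0aa2b9fa:18e2ea6b:6e2846b3
array_uuid <<'EOF'
           Name : debianbase:0  (local to host debianbase)
           UUID : 79baaa2f:0aa2b9fa:18e2ea6b:6e2846b3
         Events : 60241
EOF

# On a live system: mdadm --detail /dev/md0 | array_uuid
```

The same uuid is what an ARRAY line in mdadm.conf would refer to, so the
array can be found again whichever sdx names the member drives end up with.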

Cheers,
Wol


* Re: Questions
  2016-02-15 22:05             ` Questions Adam Goryachev
@ 2016-02-16 11:46               ` o1bigtenor
  2016-02-16 14:00                 ` Questions Adam Goryachev
  2016-02-16 14:32                 ` Questions Wols Lists
  0 siblings, 2 replies; 14+ messages in thread
From: o1bigtenor @ 2016-02-16 11:46 UTC (permalink / raw)
  To: Adam Goryachev; +Cc: Wols Lists, Linux-RAID

On Mon, Feb 15, 2016 at 4:05 PM, Adam Goryachev
<mailinglists@websitemanagers.com.au> wrote:


>
> I wanted to mention this, what drives do you have right now, and do know
> about SCT ERC?
> Maybe start here (but you probably need to read more):
> http://www.spinics.net/lists/raid/msg48199.html

That is a major reason why the drives are getting replaced. Back in early 2012
when I set up the machine there was no obvious information that ERC-capable
drives were needed, so I just bought vanilla drives.
>
> Essentially, your current disks might be fine, but if you don't have the
> right settings, they could be "failing" regularly putting your data at risk.
> You should fix any issue here before you attempt to replace your drives.

I now have 2 long-term drives which are likely still good and 2 cheap drives
that are quite new but which I don't trust for long-term reliability, hence
the push to change them all.
>
>> the fail remove and add process 4 seperate times might not be a good thing
>> but I do not know of a different option. Compounding the difficulty is
>> that there
>> are no empty hard drive slots in the machine. I do have an external USB
>> 3.0
>> 2 drive holder that could be used.
>>
>> The only suggestion in all the documents I perused was to place spare
>> drives
>> into something like this external box and then add the drives into the
>> array.
>> The process was not laid out and leaves me with a number of questions.
>>
>> Is there a suggested method for replacing ALL the drives in an array (raid
>> 10
>> in this case)?
>
>
> In order to replace all drives, I would suggest that you simply replace one
> drive 4 times (different drive each time).
> The first question to ask, is your external USB drive bay reliable? If not,
> then there are other solutions that are probably less dangerous.
>
> So, add your spare drive to the external USB drive bay, it should show up as
> /dev/sdy (for example)
> Partition to match the rest of your existing drives
> Add the new partition to your existing array: mdadm --manage /dev/md0 --add
> /dev/sdx1
> Replace one of the existing drives with the new one: mdadm /dev/md0
> --replace /dev/sda1 --with /dev/sdx1
> Personally, because I distrust the external USB drive bay (don't ask me why,
> it just seems less reliable than internal sata), once the drive has finished
> being replaced, I would shutdown, remove the old drive, and install the
> replacement drive, then add another new spare, and repeat.
>
> You can see this page for some extra information:
> http://unix.stackexchange.com/questions/74924/how-to-safely-replace-a-not-yet-failed-disk-in-a-linux-raid5-array
>>
>> If I use the external box how do I do this (external box only holds 2
>> drives) so
>> that I can transfer the information on the drives from the array to
>> the new drives
>> and then just replace the drives 2 at a time into the machine without
>> there being
>> issues because in the information transfer the drives will be sdg and
>> sdh (AFAIK)
>> and later they will be some of sdb, sdc, sde, and/or sdf.
>
> I would suggest replacing one at a time.

Is there no way to copy over all four one after another and then shut
the box down only once or, failing that, to do the process twice so
that only 2 shutdowns are needed instead of 4? The external USB box
does have room for 2 drives at once.

TIA

Dee


* Re: Questions
  2016-02-16 11:46               ` Questions o1bigtenor
@ 2016-02-16 14:00                 ` Adam Goryachev
  2016-02-16 18:33                   ` Questions o1bigtenor
  2016-02-16 14:32                 ` Questions Wols Lists
  1 sibling, 1 reply; 14+ messages in thread
From: Adam Goryachev @ 2016-02-16 14:00 UTC (permalink / raw)
  To: o1bigtenor; +Cc: Wols Lists, Linux-RAID



On 16/02/2016 22:46, o1bigtenor wrote:
> On Mon, Feb 15, 2016 at 4:05 PM, Adam Goryachev
> <mailinglists@websitemanagers.com.au> wrote:
>
>
>> I wanted to mention this, what drives do you have right now, and do know
>> about SCT ERC?
>> Maybe start here (but you probably need to read more):
>> http://www.spinics.net/lists/raid/msg48199.html
> A major reason as to why the drives are getting replaced. Back in early 2012
> when I setup the machine there was no obvious information that the ERC
> type drives were needed so I just bought vanilla drives.
>> Essentially, your current disks might be fine, but if you don't have the
>> right settings, they could be "failing" regularly putting your data at risk.
>> You should fix any issue here before you attempt to replace your drives.
> I now have 2 long term drives which are likely still good and 2 cheap
> drives that
> are quite new but which I don't trust for long term reliability,
> therefore the push
> to change them all.

Note, most drives either still support SCT ERC, or else can be worked
around to avoid the timeout mismatch. You should do this before
continuing to replace the drives, as you want to avoid a timeout
problem happening in the middle of replacing drives.
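
A minimal sketch of the usual check and workaround, assuming
smartmontools is installed and that the array members are sdb, sdc, sde
and sdf (names taken from elsewhere in this thread, adjust to suit). The
function only prints the commands so they can be reviewed before being
run as root. The 7.0 second ERC value and the 180 second kernel timeout
are commonly used figures, not something stated in this thread, and the
helper name erc_plan is made up for illustration.

```shell
#!/bin/sh
# Dry run: print the commands for one disk instead of executing them.
# $1 is a bare device name such as "sdb" (an assumption for illustration).
erc_plan() {
  disk=$1
  # Show whether the drive supports SCT ERC and its current setting.
  echo "smartctl -l scterc /dev/$disk"
  # If supported: limit error recovery to 7.0 s (units are tenths of a second).
  echo "smartctl -l scterc,70,70 /dev/$disk"
  # If not supported: raise the kernel's command timeout to 180 s instead.
  echo "echo 180 > /sys/block/$disk/device/timeout"
}

for d in sdb sdc sde sdf; do
  erc_plan "$d"
done
```

Neither setting survives a power cycle, so whichever one applies would
normally be reapplied from a boot script.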
>>> the fail remove and add process 4 separate times might not be a good thing
>>> but I do not know of a different option. Compounding the difficulty is
>>> that there
>>> are no empty hard drive slots in the machine. I do have an external USB
>>> 3.0
>>> 2 drive holder that could be used.
>>>
>>> The only suggestion in all the documents I perused was to place spare
>>> drives
>>> into something like this external box and then add the drives into the
>>> array.
>>> The process was not laid out and leaves me with a number of questions.
>>>
>>> Is there a suggested method for replacing ALL the drives in an array (raid
>>> 10
>>> in this case)?
>>
>> In order to replace all drives, I would suggest that you simply replace one
>> drive 4 times (different drive each time).
>> The first question to ask, is your external USB drive bay reliable? If not,
>> then there are other solutions that are probably less dangerous.
>>
>> So, add your spare drive to the external USB drive bay, it should show up as
>> /dev/sdx (for example)
>> Partition to match the rest of your existing drives
>> Add the new partition to your existing array: mdadm --manage /dev/md0 --add
>> /dev/sdx1
>> Replace one of the existing drives with the new one: mdadm /dev/md0
>> --replace /dev/sda1 --with /dev/sdx1
>> Personally, because I distrust the external USB drive bay (don't ask me why,
>> it just seems less reliable than internal sata), once the drive has finished
>> being replaced, I would shutdown, remove the old drive, and install the
>> replacement drive, then add another new spare, and repeat.
>>
>> You can see this page for some extra information:
>> http://unix.stackexchange.com/questions/74924/how-to-safely-replace-a-not-yet-failed-disk-in-a-linux-raid5-array
>>> If I use the external box how do I do this (external box only holds 2
>>> drives) so
>>> that I can transfer the information on the drives from the array to
>>> the new drives
>>> and then just replace the drives 2 at a time into the machine without
>>> there being
>>> issues because in the information transfer the drives will be sdg and
>>> sdh (AFAIK)
>>> and later they will be some of sdb, sdc, sde, and/or sdf.
>> I would suggest replacing one at a time.
> There is no way to do them one after another copying over all four and then
> only needing to shut the box down once or failing that doing the process 2 times
> necessitating only 2 shutdowns instead of 4 is there? The external USB box
> does have room for 2 drives at once.

Put two blank drives into the USB enclosure, replace the first one,
then replace the second one (I would suggest that the second drive you
replace is not the mirror partner of the first). Shut down, move the
two new drives inside the machine, place two more new drives into the
USB drive bay, and repeat for the last two drives.
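
The first round could be sketched as below. This is a dry run that only
prints the mdadm commands. The names are assumptions: /dev/md0 and the
partition number 1 come from my earlier example, sde and sdc are your
guess at one member from each pair, and sdg and sdh are where you
expect the USB drives to appear. The helper name replace_plan is made up
for illustration.

```shell
#!/bin/sh
# Dry run: print the mdadm commands for one hot replacement.
# $1 = array, $2 = member being retired, $3 = new partition in the USB bay.
replace_plan() {
  array=$1; old=$2; new=$3
  # Add the new partition to the array as a spare.
  echo "mdadm --manage $array --add $new"
  # Copy onto the spare while the old member stays active in the array.
  echo "mdadm $array --replace $old --with $new"
  # Check progress; wait for the copy to finish before the next step.
  echo "cat /proc/mdstat"
  # When the copy completes the old member is marked faulty and can go.
  echo "mdadm $array --remove $old"
}

# First round: one member from each presumed mirror pair.
replace_plan /dev/md0 /dev/sde1 /dev/sdg1
replace_plan /dev/md0 /dev/sdc1 /dev/sdh1
```

Check the real device names with mdadm --detail before running anything,
since the sdX letters can change when the USB enclosure is attached.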

I would suggest adding a write-intent bitmap for at least the time you
are doing the replacements. Then, if you have a failure on the USB
enclosure during the 2nd or 4th drive, at least the 1st or 3rd will
re-sync quickly.
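
A rough sketch of turning the bitmap on and off, again as a dry run that
only prints the commands. /dev/md0 is the example array name used above,
and the helper name bitmap_plan is made up for illustration.

```shell
#!/bin/sh
# Dry run: print the command to add or drop an internal write-intent bitmap.
# $1 = array, $2 = "on" or "off".
bitmap_plan() {
  array=$1
  case $2 in
    on)  echo "mdadm --grow --bitmap=internal $array" ;;
    off) echo "mdadm --grow --bitmap=none $array" ;;
    *)   echo "usage: bitmap_plan ARRAY on|off" >&2; return 1 ;;
  esac
}

bitmap_plan /dev/md0 on    # before the first replacement
bitmap_plan /dev/md0 off   # optional, once all four drives are done
```

The bitmap records which regions of the array have pending writes, so a
member that drops out briefly only needs those regions copied again
rather than a full resync.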

Regards,
Adam


* Re: Questions
  2016-02-16 11:46               ` Questions o1bigtenor
  2016-02-16 14:00                 ` Questions Adam Goryachev
@ 2016-02-16 14:32                 ` Wols Lists
  2016-02-16 18:37                   ` Questions o1bigtenor
  1 sibling, 1 reply; 14+ messages in thread
From: Wols Lists @ 2016-02-16 14:32 UTC (permalink / raw)
  To: o1bigtenor, Adam Goryachev; +Cc: Linux-RAID

On 16/02/16 11:46, o1bigtenor wrote:
> A major reason as to why the drives are getting replaced. Back in early 2012
> when I setup the machine there was no obvious information that the ERC
> type drives were needed so I just bought vanilla drives.

This is just a gut feeling, but it seems to make sense of what I've
seen on the list ...

"New" drives of 2TB or more have the crippled drive firmware unless you
specifically buy RAID drives.

The original release date of drives of 1TB and less predates the
crippling, and the manufacturers haven't bothered to go back and "fix"
this.

So 1TB drives - even desktop ones - seem usually to be okay. Anything
over that is suspect.

Cheers,
Wol

Incidentally, any reason for sticking with the same size drives? I'm
looking at replacing my Barracudas and might well upgrade from 3TB to
4TB, just because the price difference is minimal. That's despite /home
only being half full even though I have a 24MP camera ...



* Re: Questions
  2016-02-16 14:00                 ` Questions Adam Goryachev
@ 2016-02-16 18:33                   ` o1bigtenor
  0 siblings, 0 replies; 14+ messages in thread
From: o1bigtenor @ 2016-02-16 18:33 UTC (permalink / raw)
  To: Adam Goryachev; +Cc: Wols Lists, Linux-RAID

On Tue, Feb 16, 2016 at 8:00 AM, Adam Goryachev
<mailinglists@websitemanagers.com.au> wrote:
>snip
>>
>> A major reason as to why the drives are getting replaced. Back in early
>> 2012
>> when I setup the machine there was no obvious information that the ERC
>> type drives were needed so I just bought vanilla drives.
>>>
>>> Essentially, your current disks might be fine, but if you don't have the
>>> right settings, they could be "failing" regularly putting your data at
>>> risk.
>>> You should fix any issue here before you attempt to replace your drives.
>>
>> I now have 2 long term drives which are likely still good and 2 cheap
>> drives that
>> are quite new but which I don't trust for long term reliability,
>> therefore the push
>> to change them all.
>
>
> Note, most drives either still support it, or else can be worked around to
> avoid the timeout mismatch. You should do this before continue to replace
> the drives, as you want to avoid this happening in the middle of replacing
> drives.
snip
>>>
>>> I would suggest replacing one at a time.
>>
>> There is no way to do them one after another copying over all four and
>> then
>> only needing to shut the box down once or failing that doing the process 2
>> times
>> necessitating only 2 shutdowns instead of 4 is there? The external USB box
>> does have room for 2 drives at once.
>
>
> Put two blank drives into the USB, replace first one, replace second one
> (would suggest second drive as non-matching pair). Shutdown, move two new
> drives internal, place two new drives into drive bay, and repeat for the
> last two drives.

How do I determine exactly which are the matching pairs?

Raid 10 means striped and mirrored so I have 2 pairs (AIUI).
I'm thinking it's sde and sdf as one pair and sdb and sdc as the other,
so if that's true then I would start with, say, sde and sdc for the
first set and then do sdf and sdb for the second.
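
A rough sketch of how the pairing can be worked out, assuming a
four-drive array in the default "near" layout with 2 copies. The pairing
follows each member's RaidDevice number, not its sdX name, and
mdadm --detail /dev/md0 reports both the layout and the RaidDevice
column. Under that assumption members 0 and 1 hold the same data, as do
members 2 and 3. The helper name mirror_group is made up for
illustration.

```shell
#!/bin/sh
# For a near-layout RAID 10, slots that share slot/copies hold the same data.
# $1 = RaidDevice number, $2 = number of copies (2 for near=2).
mirror_group() {
  echo $(( $1 / $2 ))
}

# Four-drive near=2 array: print the mirror group of each slot.
for slot in 0 1 2 3; do
  echo "RaidDevice $slot -> mirror group $(mirror_group "$slot" 2)"
done
```

Two members in the same group are a matching pair, so the two drives
replaced in one round would come from different groups.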
>
> I would suggest adding a bitmap for at least the time you are doing the
> replacements, then if you have a failure on the USB enclosure during the 2nd
> or 4th drive, at least the 1st or 3rd will re-sync quickly.

Sorry - - - a bitmap - - - not sure what you mean?

Thanking you for your assistance.

Dee


* Re: Questions
  2016-02-16 14:32                 ` Questions Wols Lists
@ 2016-02-16 18:37                   ` o1bigtenor
  0 siblings, 0 replies; 14+ messages in thread
From: o1bigtenor @ 2016-02-16 18:37 UTC (permalink / raw)
  To: Wols Lists; +Cc: Adam Goryachev, Linux-RAID

On Tue, Feb 16, 2016 at 8:32 AM, Wols Lists <antlists@youngman.org.uk> wrote:
> On 16/02/16 11:46, o1bigtenor wrote:
>> A major reason as to why the drives are getting replaced. Back in early 2012
>> when I setup the machine there was no obvious information that the ERC
>> type drives were needed so I just bought vanilla drives.
>
> This is just what I've got a gut feel for, it seems to make sense of
> what I've seen on the list ...
>
> "new" drives of 2TB or more have the crippled drive firmware unless you
> specifically buy raid.
>
> The original release date of 1TB and less drives predates the crippling,
> and the manufacturers haven't bothered to go back and "fix" this.

I'll take your word for that but did buy NAS-rated drives (just missed
getting some enterprise drives which should have been even better!).
>
> So 1TB drives - even desktop ones - seem usually to be okay. Anything
> over that is suspect.
>
> Cheers,
> Wol
>
> Incidentally, any reason for sticking with the same size drives? I'm
> looking at replacing my Barracudas and might well upgrade from 3TB to
> 4TB, just because the price difference is minimal. That's despite /home
> only being half full even though I have a 24MP camera ...
>

Sticking with the same size drives for a number of reasons. The price
difference is minimal but it is there. In 4 years of use I'm now at
300 GB, which is not that large compared to a 2TB array. I am also in
the process of setting up a server which has four 3TB drives in a
raid 10 array for backup level 1.

Thanking you for your time and tips!

Dee

