linux-raid.vger.kernel.org archive mirror
* growing a RAID-10 array with mdadm 3.3.1+ ?
@ 2016-10-11 17:26 moft
  2016-10-11 18:29 ` Anthony Youngman
  2016-10-11 20:26 ` Phil Turmel
  0 siblings, 2 replies; 6+ messages in thread
From: moft @ 2016-10-11 17:26 UTC (permalink / raw)
  To: linux-raid

Hi

I have a 4-disk RAID10 array

	md0 : active raid10 sda1[4] sdb1[3] sdc1[2] sdd1[1]
	      1953259520 blocks super 1.2 512K chunks 2 far-copies [4/4] [UUUU]
	      bitmap: 0/15 pages [0KB], 65536KB chunk

It was created with this command

	mdadm --create /dev/md0 --level=raid10 --raid-devices=4 \
	 --name=md0 --homehost="<none>" \
	 --metadata=1.2 --bitmap=internal --layout=f2 --chunk=512  \
	 /dev/sd[abcd]1

It's running on a linux machine

	uname -rm
		4.8.1-2.g4861355-default x86_64
	mdadm --version
		mdadm - v3.3.1 - 5th June 2014

I need to add storage to the array.

I'd like to grow it by adding two disks (/dev/sd[ef]), to end up with a 6-disk array.

I know I can completely wipe it out and recreate it with 6 disks.

But I'd rather grow/extend it instead.

*CAN* I safely grow/expand it?

The ChangeLog for mdadm 3.3.1 says

	Changes Prior to release 3.3
	- Some array reshapes can proceed without needing backup file.
	  This is done by changing the 'data_offset' so we never need to write
	  any data back over where it was before.  If there is no "head space"
	  or "tail space" to allow data_offset to change, the old mechanism
	  with a backup file can still be used.
	- RAID10 arrays can be reshaped to change the number of devices,
	  change the chunk size, or change the layout between 'near'
	  and 'offset'.
	  This will always change data_offset, and will fail if there is no
	  room for data_offset to be moved.

So far I haven't found any specific "how to" for this process.

(1) The changelog refers to 'near' and 'offset' layouts, but doesn't mention 'far'.

CAN I safely grow this layout=f2 array ?

(2) If I can, what's the detailed procedure to do it?

Thanks

Mike


* Re: growing a RAID-10 array with mdadm 3.3.1+ ?
  2016-10-11 17:26 growing a RAID-10 array with mdadm 3.3.1+ ? moft
@ 2016-10-11 18:29 ` Anthony Youngman
  2016-10-11 18:37   ` moft
  2016-10-11 20:26 ` Phil Turmel
  1 sibling, 1 reply; 6+ messages in thread
From: Anthony Youngman @ 2016-10-11 18:29 UTC (permalink / raw)
  To: moft, linux-raid

Okay, this is a first response, so you'll probably need more experienced 
people to chime in, but

FIRST - BACKUP BACKUP!! BACKUP!!!!

Growing an array is pretty safe, but like anything here, it does have 
its dangers.

Second, what distro are you running? Is it a systemd-based distro?

There are a few problems with resizing arrays at the moment, and my gut 
feeling is that systemd is "to blame". It's very unlikely you'll lose 
data, but you might well find the resize fails and you have to copy to a 
new array anyway.

More notes inline ...

On 11/10/16 18:26, moft@fmailbox.com wrote:
> Hi
>
> I have a 4-disk RAID10 array
>
> 	md0 : active raid10 sda1[4] sdb1[3] sdc1[2] sdd1[1]
> 	      1953259520 blocks super 1.2 512K chunks 2 far-copies [4/4] [UUUU]
> 	      bitmap: 0/15 pages [0KB], 65536KB chunk
>
> It was created with this command
>
> 	mdadm --create /dev/md0 --level=raid10 --raid-devices=4 \
> 	 --name=md0 --homehost="<none>" \
> 	 --metadata=1.2 --bitmap=internal --layout=f2 --chunk=512  \
> 	 /dev/sd[abcd]1
>
> It's running on a linux machine
>
> 	uname -rm
> 		4.8.1-2.g4861355-default x86_64
> 	mdadm --version
> 		mdadm - v3.3.1 - 5th June 2014
>
> I need to add storage to the array.
>
> I'd like to grow it by adding two disks (/dev/sd[ef]), to end up with a 6-disk array.
>
> I know I can completely wipe it out and recreate it with 6-disks.
>
> But I'd rather grow/extend it, Instead.
>
> *CAN* I safely grow/expand it?

Bugs excepted - yes you should be able to, without problems.
>
> The ChangeLog for mdadm 3.3.1 says
>
> 	Changes Prior to release 3.3
> 	- Some array reshapes can proceed without needing backup file.
> 	  This is done by changing the 'data_offset' so we never need to write
> 	  any data back over where it was before.  If there is no "head space"
> 	  or "tail space" to allow data_offset to change, the old mechanism
> 	  with a backup file can still be used.

If you're growing the array, you shouldn't need a backup file. You might 
need a backup for the first second or so, but then it's no longer 
necessary. And mdadm can probably use the space in the two new disks to 
store the backup data.

(What I understand happens, is that mdadm will read old stripes 1 & 2. 
It then writes new stripe 1 and sets the watermark to stripe 1. That 
says that the new array is complete up to 1, and if the data isn't 
there, fetch it from the old array. It then reads old stripe 3 and 
writes new stripe 2, then sets the watermark to 2. Old 4 & 5 become new 
3, then old 6 makes new 4. Etc etc. Plus, of course, all the locking and 
safeguards to make sure nothing reads the stripe that's actively being 
updated ... :-)
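
The stripe bookkeeping described above can be illustrated with a little shell arithmetic. This is only a sketch of a simplified model, not of what mdadm or the kernel actually does: it assumes 2 data chunks per old stripe (4 devices, 2 copies) and 3 per new stripe (6 devices, 2 copies), and it ignores mirror copies and the real RAID10 layout. The function name reshape_plan is made up for this example.

```shell
#!/bin/sh
# Simplified model: data chunks are numbered 1,2,3,... and packed
# old_width per stripe before the reshape, new_width per stripe after it.
reshape_plan() {
    old_width=$1   # data chunks per old stripe (assumed value: 2)
    new_width=$2   # data chunks per new stripe (assumed value: 3)
    count=$3       # how many new stripes to list
    n=1
    while [ "$n" -le "$count" ]; do
        first_chunk=$(( (n - 1) * new_width + 1 ))
        last_chunk=$(( n * new_width ))
        # The old stripe holding chunk c is ceil(c / old_width).
        first_old=$(( (first_chunk + old_width - 1) / old_width ))
        last_old=$(( (last_chunk + old_width - 1) / old_width ))
        echo "new stripe $n <- old stripes $first_old..$last_old"
        n=$(( n + 1 ))
    done
}

reshape_plan 2 3 4
# new stripe 1 <- old stripes 1..2
# new stripe 2 <- old stripes 2..3
# new stripe 3 <- old stripes 4..5
# new stripe 4 <- old stripes 5..6
```

In this model the old stripes being read are never behind the new stripe being written, which is the intuition for why a backup only matters at the very start of a grow.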

Anyways, if it needs a backup file, it will tell you.

> 	- RAID10 arrays can be reshaped to change the number of devices,
> 	  change the chunk size, or change the layout between 'near'
> 	  and 'offset'.
> 	  This will always change data_offset, and will fail if there is no
> 	  room for data_offset to be moved.
>
> So far I haven't found any specific "how to" for this process.

mdadm /dev/md0 --add /dev/sde1 /dev/sdf1
mdadm --grow /dev/md0 --raid-devices=6

The first command will add your two drives as spares. The second will 
make them part of the array. It's the second command that's the risky 
one... and bearing in mind I don't know raid10, it might just add them 
on the end and not need any reconstruction at all ...
>
> (1) The changelog refers to 'near' and 'offset' layouts, but doesn't mention 'far'.
>
> CAN I safely grow this layout=f2 array ?
>
> (2) If I can, what's the detailed procedure to do it?
>
I'll be interested in knowing how this pans out, too, so I can add it to 
the wiki :-)

Cheers,
Wol


* Re: growing a RAID-10 array with mdadm 3.3.1+ ?
  2016-10-11 18:29 ` Anthony Youngman
@ 2016-10-11 18:37   ` moft
  2016-10-11 18:50     ` Anthony Youngman
  2016-10-28  5:47     ` NeilBrown
  0 siblings, 2 replies; 6+ messages in thread
From: moft @ 2016-10-11 18:37 UTC (permalink / raw)
  To: Anthony Youngman, linux-raid

On Tue, Oct 11, 2016, at 11:29 AM, Anthony Youngman wrote:
> Growing an array is pretty safe, but like anything here, it does have  its dangers.
> 
> Second, what distro are you running? Is it a systemd-based distro?

Opensuse. Yes.

> feeling is that systemd is "to blame". 

I have no idea why that'd be the case.  That's the first time I've heard anybody suggest that.

> > *CAN* I safely grow/expand it?
> 
> Bugs excepted - yes you should be able to, without problems.

So growing 'far' layouts is now supported?  Do you have a reference/source for that?

> > 	  This will always change data_offset, and will fail if there is no
> > 	  room for data_offset to be moved.

So a 'fail' means -- just won't start? as opposed to 'oops, it's now broken'?

> > So far I haven't found any specific "how to" for this process.
> 
> mdadm /dev/md0 --add /dev/sde1 /dev/sdf1
> mdadm --grow /dev/md0 --raid-devices=6
> 
> The first command will add your two drives as spares. The second will 
> make them part of the array. It's the second command that's the risky 
> one... and bearing in mind I don't know raid10, it might just add them 
> on the end and not need any reconstruction at all ...

Well, that's the missing critical detail here.

> > (1) The changelog refers to 'near' and 'offset' layouts, but doesn't mention 'far'.
> >
> > CAN I safely grow this layout=f2 array ?
> >
> > (2) If I can, what's the detailed procedure to do it?

Still need to understand the 'far' support, namely yes/no.

> I'll be interested in knowing how this pans out, too, so I can add it to 
> the wiki :-)

Thanks

Mike


* Re: growing a RAID-10 array with mdadm 3.3.1+ ?
  2016-10-11 18:37   ` moft
@ 2016-10-11 18:50     ` Anthony Youngman
  2016-10-28  5:47     ` NeilBrown
  1 sibling, 0 replies; 6+ messages in thread
From: Anthony Youngman @ 2016-10-11 18:50 UTC (permalink / raw)
  To: moft, linux-raid



On 11/10/16 19:37, moft@fmailbox.com wrote:
>> feeling is that systemd is "to blame".
> I have no idea why that'd be the case.  That's the first time I've heard anybody suggest that.
>
That's why it's a "gut feel" :-)

But the impression I'm getting is that when mdadm runs in the foreground 
with root's permissions it runs fine. When it detects systemd and 
backgrounds into daemon mode, something goes wrong.

But I repeat - this is just a gut feel. I could be completely wrong :-)

Cheers,
Wol


* Re: growing a RAID-10 array with mdadm 3.3.1+ ?
  2016-10-11 17:26 growing a RAID-10 array with mdadm 3.3.1+ ? moft
  2016-10-11 18:29 ` Anthony Youngman
@ 2016-10-11 20:26 ` Phil Turmel
  1 sibling, 0 replies; 6+ messages in thread
From: Phil Turmel @ 2016-10-11 20:26 UTC (permalink / raw)
  To: moft, linux-raid

On 10/11/2016 01:26 PM, moft@fmailbox.com wrote:
> (1) The changelog refers to 'near' and 'offset' layouts, but doesn't mention 'far'.

Historically "far" has not been reshapeable at all.  I don't recall
seeing a patch that implemented it.  If you attempt it and it doesn't
support it, mdadm will refuse without hurting your array.  Same is true
for other reasons to reject growing.  mdadm gives an error before
touching the array.

You can get a definitive answer by setting up a set of small loop
devices in an array that mimics your setup and attempting to grow that
test array.
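
A minimal sketch of such a loop-device test follows. It only prints the commands so they can be reviewed first; nothing is executed. The names /dev/md100 and /tmp/md-grow-test, the 200M image size and the function name print_grow_test are assumptions made for this example. The create options copy the array in the original post.

```shell
#!/bin/sh
# Print (do not execute) the commands for a throwaway test of growing a
# 4-device RAID10 f2 array to 6 devices, using loop devices over sparse files.
# Assumptions: /dev/md100 is unused and /tmp/md-grow-test is scratch space.
print_grow_test() {
    md=/dev/md100
    dir=/tmp/md-grow-test
    echo "mkdir -p $dir"
    i=0
    while [ "$i" -lt 6 ]; do
        echo "truncate -s 200M $dir/disk$i.img"
        # losetup picks a free loop device and prints its name into L0..L5.
        echo "L$i=\$(losetup --find --show $dir/disk$i.img)"
        i=$(( i + 1 ))
    done
    # Same geometry as the real array: RAID10, f2 layout, 512K chunks.
    echo "mdadm --create $md --level=raid10 --raid-devices=4 --metadata=1.2 --bitmap=internal --layout=f2 --chunk=512 \$L0 \$L1 \$L2 \$L3"
    # The two steps under test: add spares, then try to reshape onto them.
    echo "mdadm $md --add \$L4 \$L5"
    echo "mdadm --grow $md --raid-devices=6"
    # Inspect the outcome, then tear everything down.
    echo "cat /proc/mdstat"
    echo "mdadm --stop $md"
    i=0
    while [ "$i" -lt 6 ]; do
        echo "losetup -d \$L$i"
        i=$(( i + 1 ))
    done
    echo "rm -r $dir"
}

print_grow_test
```

The printed lines are meant to be pasted one at a time into a single root shell, so that the L0..L5 variables stay set and a refusal from the --grow step can be read before the cleanup commands run.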

There have been bugs with SELinux and systemd preventing the reshape
task from forking properly from the command line tool.  The array is
then stuck at reshape position 0.

Phil


* Re: growing a RAID-10 array with mdadm 3.3.1+ ?
  2016-10-11 18:37   ` moft
  2016-10-11 18:50     ` Anthony Youngman
@ 2016-10-28  5:47     ` NeilBrown
  1 sibling, 0 replies; 6+ messages in thread
From: NeilBrown @ 2016-10-28  5:47 UTC (permalink / raw)
  To: moft, Anthony Youngman, linux-raid


On Wed, Oct 12 2016, moft@fmailbox.com wrote:

> On Tue, Oct 11, 2016, at 11:29 AM, Anthony Youngman wrote:
>> Growing an array is pretty safe, but like anything here, it does have  its dangers.
>> 
>> Second, what distro are you running? Is it a systemd-based distro?
>
> Opensuse. Yes.

Specifically, which openSUSE?  What version of mdadm?

>
>> feeling is that systemd is "to blame". 
>
> I have no idea why that'd be the case.  That's the first time I've heard anybody suggest that.

Debian bug 840743 helped me see a possible reason.

In some cases mdadm needs to remain running in the background to monitor
the reshape.  Systemd doesn't like us to do that (it likes to kill
background tasks when you log off).
So on systemd installs we use a systemd service to run
  mdadm --grow --continue
There was a bug in mdadm 3.3.x, fixed in 3.4, which caused that mdadm to
fail (at least in some circumstances).
Then the reshape froze at the start.
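
A rough way to spot a reshape frozen like this is to sample the md sysfs attributes sync_action and sync_completed twice and compare. The sketch below assumes those two files under /sys/block/md0/md. The function name reshape_stalled and the 10-second default interval are invented for this example, and a slow but healthy reshape could in principle look the same over a short interval.

```shell
#!/bin/sh
# Return success (0) if the array is in a reshape and its sync_completed
# value does not change over the sampling interval.
# Usage: reshape_stalled /sys/block/md0/md [seconds]
reshape_stalled() {
    md_dir=$1
    interval=${2:-10}
    # Only meaningful while a reshape is in progress.
    [ "$(cat "$md_dir/sync_action")" = "reshape" ] || return 1
    before=$(cat "$md_dir/sync_completed")
    sleep "$interval"
    after=$(cat "$md_dir/sync_completed")
    [ "$before" = "$after" ]
}

# Example call, guarded so that it is skipped where md0 does not exist.
if [ -d /sys/block/md0/md ]; then
    if reshape_stalled /sys/block/md0/md; then
        echo "md0: reshape is not making progress"
    fi
fi
```

If the check fires, the mdadm --grow --continue command mentioned above is the piece to look at, ideally with an mdadm release that has the fix.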


>
>> > *CAN* I safely grow/expand it?
>> 
>> Bugs excepted - yes you should be able to, without problems.
>
> So growing 'far' layouts is now supported?  Do you have a
> reference/source for that?

No, 'far' layout RAID10 cannot be reshaped.  There are some messy issues
with making that work sensibly which I never bothered to resolve.
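
Because the layout decides the matter, it can be worth checking it before planning a grow. The sketch below looks for the "far-copies" text that /proc/mdstat prints for such arrays, as in the status output quoted at the top of this thread. The function name uses_far_layout is made up for this example, and the sample text is copied from the original post.

```shell
#!/bin/sh
# Succeed (0) if the named array's /proc/mdstat entry mentions "far-copies".
# Usage: uses_far_layout md0 [mdstat-file]   (default file: /proc/mdstat)
uses_far_layout() {
    array=$1
    mdstat=${2:-/proc/mdstat}
    # Print the array's header line and the line after it, then search them.
    awk -v name="$array" '
        $1 == name && $2 == ":" { found = 1; print; next }
        found { print; exit }
    ' "$mdstat" | grep -q 'far-copies'
}

# Demonstration against the mdstat text from the original post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
md0 : active raid10 sda1[4] sdb1[3] sdc1[2] sdd1[1]
      1953259520 blocks super 1.2 512K chunks 2 far-copies [4/4] [UUUU]
EOF
if uses_far_layout md0 "$sample"; then
    echo "md0 uses the 'far' layout; expect mdadm to refuse the reshape"
fi
rm -f "$sample"
```

On a live system the second argument can be dropped so that the function reads /proc/mdstat directly.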

NeilBrown


