From: NeilBrown <neilb@suse.de>
To: linbloke <linbloke@fastmail.fm>
Cc: "Jérôme Poulin" <jeromepoulin@gmail.com>,
	"Pavel Hofman" <pavel.hofman@ivitera.com>,
	linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Rotating RAID 1
Date: Wed, 26 Oct 2011 08:47:53 +1100
Message-ID: <20111026084753.1db251b2@notabene.brown>
In-Reply-To: <4EA666A1.9000904@fastmail.fm>

On Tue, 25 Oct 2011 18:34:57 +1100 linbloke <linbloke@fastmail.fm> wrote:

> On 16/08/11 9:55 AM, NeilBrown wrote:
> > On Mon, 15 Aug 2011 19:32:04 -0400 Jérôme Poulin<jeromepoulin@gmail.com>
> > wrote:
> >
> >> On Mon, Aug 15, 2011 at 6:42 PM, NeilBrown<neilb@suse.de>  wrote:
> >>> So if there are 3 drives A, X, Y where A is permanent and X and Y are rotated
> >>> off-site, then I create two RAID1s like this:
> >>>
> >>>
> >>> mdadm -C /dev/md0 -l1 -n2 --bitmap=internal /dev/A /dev/X
> >>> mdadm -C /dev/md1 -l1 -n2 --bitmap=internal /dev/md0 /dev/Y
> >> That seems nice for 2 disks, but adding another one later would be a
> >> mess. Is there any way to play with slots number manually to make it
> >> appear as an always degraded RAID ? I can't plug all the disks at once
> >> because of the maximum of 2 ports.
> > Yes, adding another one later would be difficult.  But if you know up-front that
> > you will want three off-site devices it is easy.
> >
> > You could
> >
> >   mdadm -C /dev/md0 -l1 -n2 -b internal /dev/A missing
> >   mdadm -C /dev/md1 -l1 -n2 -b internal /dev/md0 missing
> >   mdadm -C /dev/md2 -l1 -n2 -b internal /dev/md1 missing
> >   mdadm -C /dev/md3 -l1 -n2 -b internal /dev/md2 missing
> >
> >   mkfs /dev/md3 ; mount ..
> >
> >   So you now have 4 "missing" devices.  Each time you plug in a device that
> >   hasn't been in an array before, explicitly add it to the array that you want
> >   it to be a part of and let it recover.
> >   When you plug in a device that was previously plugged in, just "mdadm
> >   -I /dev/XX" and it will automatically be added and recover based on the
> >   bitmap.
> >
> >   You can have as many or as few of the transient drives plugged in at any
> >   time as you like.
> >
> >   There is a cost here of course.  Every write potentially needs to update
> >   every bitmap, so the more bitmaps, the more overhead in updating them.  So
> >   don't create more than you need.
> >
> >   Also, it doesn't have to be a linear stack.  It could be a binary tree
> >   though that might take a little more care to construct.  Then when an
> >   adjacent pair of leaves are both off-site, their bitmap would not need
> >   updating.
> >
> > NeilBrown
> 
> Hi Neil, Jérôme and Pavel,
> 
> I'm in the process of testing the solution described above and have been 
> successful at those steps (I now have sync'd devices that I have failed 
> and removed from their respective arrays - the "backups"). I can add new 
> devices and also incrementally re-add the devices back to their 
> respective arrays and all my tests show this process works well. The 
> point which I'm now trying to resolve is how to create a new array from 
> one of the off-site components - ie, the restore from backup test.
> Below are the steps I've taken to implement and verify the setup; you 
> can skip to the bottom section "Restore from off-site device" to get to 
> the point if you like. When the wiki is back up, I'll post this process 
> there for others who are looking for mdadm based offline backups. Any 
> corrections would be gratefully appreciated.
> 
> Based on the example above, for a target setup of 7 off-site devices 
> synced to a two device RAID1, my test setup is:
> 
> RAID Array    Online Device    Off-site Device
> md100         sdc              sdd
> md101         md100            sde
> md102         md101            sdf
> md103         md102            sdg
> md104         md103            sdh
> md105         md104            sdi
> md106         md105            sdj
> 
> root@deb6dev:~# uname -a
> Linux deb6dev 2.6.32-5-686 #1 SMP Tue Mar 8 21:36:00 UTC 2011 i686 GNU/Linux
> root@deb6dev:~# mdadm -V
> mdadm - v3.1.4 - 31st August 2010
> root@deb6dev:~# cat /etc/debian_version
> 6.0.1
> 
> Create the nested arrays
> ---------------------
> root@deb6dev:~# mdadm -C /dev/md100 -l1 -n2 -b internal -e 1.2 /dev/sdc missing
> mdadm: array /dev/md100 started.
> root@deb6dev:~# mdadm -C /dev/md101 -l1 -n2 -b internal -e 1.2 /dev/md100 missing
> mdadm: array /dev/md101 started.
> root@deb6dev:~# mdadm -C /dev/md102 -l1 -n2 -b internal -e 1.2 /dev/md101 missing
> mdadm: array /dev/md102 started.
> root@deb6dev:~# mdadm -C /dev/md103 -l1 -n2 -b internal -e 1.2 /dev/md102 missing
> mdadm: array /dev/md103 started.
> root@deb6dev:~# mdadm -C /dev/md104 -l1 -n2 -b internal -e 1.2 /dev/md103 missing
> mdadm: array /dev/md104 started.
> root@deb6dev:~# mdadm -C /dev/md105 -l1 -n2 -b internal -e 1.2 /dev/md104 missing
> mdadm: array /dev/md105 started.
> root@deb6dev:~# mdadm -C /dev/md106 -l1 -n2 -b internal -e 1.2 /dev/md105 missing
> mdadm: array /dev/md106 started.
> 
> root@deb6dev:~# cat /proc/mdstat
> Personalities : [raid1]
> md106 : active (auto-read-only) raid1 md105[0]
>        51116 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md105 : active raid1 md104[0]
>        51128 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md104 : active raid1 md103[0]
>        51140 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md103 : active raid1 md102[0]
>        51152 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md102 : active raid1 md101[0]
>        51164 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md101 : active raid1 md100[0]
>        51176 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md100 : active raid1 sdc[0]
>        51188 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> unused devices: <none>
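
The seven create commands above all follow one pattern (each array is built
on the previous one plus a "missing" slot), so they could be generated in a
loop.  What follows is only a sketch: gen_stack_cmds is a hypothetical helper
name, not part of mdadm, and it just prints the commands so they can be
reviewed before anything is run as root.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the mdadm commands that build a
# linear stack of degraded RAID1 arrays, one level per off-site device.
# gen_stack_cmds is a hypothetical helper, not an mdadm feature.
gen_stack_cmds() {
    base_dev=$1     # permanent device, e.g. /dev/sdc
    first_md=$2     # number of the lowest array, e.g. 100
    levels=$3       # number of nested arrays, e.g. 7
    lower=$base_dev
    i=0
    while [ "$i" -lt "$levels" ]; do
        md=/dev/md$((first_md + i))
        # Same options as in the transcript above.
        echo "mdadm -C $md -l1 -n2 -b internal -e 1.2 $lower missing"
        lower=$md   # the next level is built on top of this array
        i=$((i + 1))
    done
}

# Reproduces the md100..md106 commands from the transcript.
gen_stack_cmds /dev/sdc 100 7
```

Piping the output to "sh" would execute it; keeping it as a dry run by
default seems the safer choice for commands that write superblocks.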
> 
> Create and mount a filesystem
> --------------------------
> root@deb6dev:~# mkfs.ext3 /dev/md106
> <<successful mkfs output snipped>>
> root@deb6dev:~# mount -t ext3 /dev/md106 /mnt/backup
> root@deb6dev:~# df | grep backup
> /dev/md106               49490      4923     42012  11% /mnt/backup
> 
> Plug in a device that hasn't been in an array before
> -------------------------------------------
> root@deb6dev:~# mdadm -vv /dev/md100 --add /dev/sdd
> mdadm: added /dev/sdd
> root@deb6dev:~# cat /proc/mdstat
> Personalities : [raid1]
> md106 : active raid1 md105[0]
>        51116 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md105 : active raid1 md104[0]
>        51128 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md104 : active raid1 md103[0]
>        51140 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md103 : active raid1 md102[0]
>        51152 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md102 : active raid1 md101[0]
>        51164 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md101 : active raid1 md100[0]
>        51176 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> md100 : active raid1 sdd[2] sdc[0]
>        51188 blocks super 1.2 [2/2] [UU]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> 
> Write to the array
> ---------------
> root@deb6dev:~# dd if=/dev/urandom of=a.blob bs=1M count=20
> 20+0 records in
> 20+0 records out
> 20971520 bytes (21 MB) copied, 5.05528 s, 4.1 MB/s
> root@deb6dev:~# dd if=/dev/urandom of=b.blob bs=1M count=10
> 10+0 records in
> 10+0 records out
> 10485760 bytes (10 MB) copied, 2.59361 s, 4.0 MB/s
> root@deb6dev:~# dd if=/dev/urandom of=c.blob bs=1M count=5
> 5+0 records in
> 5+0 records out
> 5242880 bytes (5.2 MB) copied, 1.35619 s, 3.9 MB/s
> root@deb6dev:~# md5sum *blob > md5sums.txt
> root@deb6dev:~# ls -l
> total 35844
> -rw-r--r-- 1 root root 20971520 Oct 25 15:57 a.blob
> -rw-r--r-- 1 root root 10485760 Oct 25 15:57 b.blob
> -rw-r--r-- 1 root root  5242880 Oct 25 15:57 c.blob
> -rw-r--r-- 1 root root      123 Oct 25 15:57 md5sums.txt
> root@deb6dev:~# cp *blob /mnt/backup
> root@deb6dev:~# ls -l /mnt/backup
> total 35995
> -rw-r--r-- 1 root root 20971520 Oct 25 15:58 a.blob
> -rw-r--r-- 1 root root 10485760 Oct 25 15:58 b.blob
> -rw-r--r-- 1 root root  5242880 Oct 25 15:58 c.blob
> drwx------ 2 root root    12288 Oct 25 15:27 lost+found
> root@deb6dev:~# df | grep backup
> /dev/md106               49490     40906      6029  88% /mnt/backup
> root@deb6dev:~# cat /proc/mdstat | grep -A 3 '^md100'
> md100 : active raid1 sdd[2] sdc[0]
>        51188 blocks super 1.2 [2/2] [UU]
>        bitmap: 0/1 pages [0KB], 65536KB chunk
> 
> Data written and array devices in sync (bitmap 0/1)
> 
> Fail and remove device
> -------------------
> root@deb6dev:~# sync
> root@deb6dev:~# mdadm -vv /dev/md100 --fail /dev/sdd --remove /dev/sdd
> mdadm: set /dev/sdd faulty in /dev/md100
> mdadm: hot removed /dev/sdd from /dev/md100
> root@deb6dev:~# cat /proc/mdstat | grep -A 3 '^md100'
> md100 : active raid1 sdc[0]
>        51188 blocks super 1.2 [2/1] [U_]
>        bitmap: 0/1 pages [0KB], 65536KB chunk
> 
> Device may now be unplugged
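
The detach step above could be wrapped in a small guard that refuses to fail
a device unless the array currently shows [UU].  This is a sketch under
assumptions: array_is_full and detach_cmds are hypothetical names, the [UU]
check is an added precaution and not part of the procedure as posted, and the
helper only prints the commands instead of running them.

```shell
#!/bin/sh
# array_is_full (hypothetical): read /proc/mdstat-style text on stdin and
# succeed only if the stanza for the named array contains "[UU]".
array_is_full() {
    awk -v md="$1" '
        $1 == md && $2 == ":" { in_md = 1; next }   # stanza header line
        in_md && NF == 0      { in_md = 0 }         # blank line ends stanza
        in_md && /\[UU\]/     { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# detach_cmds (hypothetical): print the sync/fail/remove sequence used above,
# but only when the array is fully populated.  The third argument is the
# mdstat file to read and defaults to /proc/mdstat.
detach_cmds() {
    md=$1
    dev=$2
    mdstat=${3:-/proc/mdstat}
    if array_is_full "$md" < "$mdstat"; then
        echo "sync"
        echo "mdadm /dev/$md --fail /dev/$dev --remove /dev/$dev"
    else
        echo "refusing: $md does not show [UU]" >&2
        return 1
    fi
}
```

For the example above this would be called as "detach_cmds md100 sdd".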
> 
> 
> Write to the array again
> --------------------
> root@deb6dev:~# rm /mnt/backup/b.blob
> root@deb6dev:~# ls -l /mnt/backup
> total 25714
> -rw-r--r-- 1 root root 20971520 Oct 25 15:58 a.blob
> -rw-r--r-- 1 root root  5242880 Oct 25 15:58 c.blob
> drwx------ 2 root root    12288 Oct 25 15:27 lost+found
> root@deb6dev:~# df | grep backup
> /dev/md106               49490     30625     16310  66% /mnt/backup
> root@deb6dev:~# cat /proc/mdstat | grep -A 3 '^md100'
> md100 : active raid1 sdc[0]
>        51188 blocks super 1.2 [2/1] [U_]
>        bitmap: 1/1 pages [4KB], 65536KB chunk
> 
> bitmap 1/1 shows array is not in sync (we know it's due to the writes 
> pending for the device we previously failed)
> 
> Plug in a device that was previously plugged in
> ----------------------------------------
> root@deb6dev:~# mdadm -vv -I /dev/sdd --run
> mdadm: UUID differs from /dev/md/0.
> mdadm: UUID differs from /dev/md/1.
> mdadm: /dev/sdd attached to /dev/md100 which is already active.
> root@deb6dev:~# sync
> root@deb6dev:~# cat /proc/mdstat | grep -A 3 '^md100'
> md100 : active raid1 sdd[2] sdc[0]
>        51188 blocks super 1.2 [2/2] [UU]
>        bitmap: 0/1 pages [0KB], 65536KB chunk
> 
> Device reconnected [UU] and in sync (bitmap 0/1)
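
The "bitmap: 0/1 pages" check used above can be scripted instead of read by
eye.  A minimal sketch, assuming the /proc/mdstat layout shown in this
message; bitmap_pages and bitmap_idle are hypothetical helper names, and the
pages counter is treated as the same rough in-sync indicator used in the
notes above, not as an authoritative test.

```shell
#!/bin/sh
# bitmap_pages (hypothetical): read /proc/mdstat-style text on stdin and
# print the "used/total" pages field from the bitmap line of one array.
bitmap_pages() {
    awk -v md="$1" '
        $1 == md && $2 == ":"    { in_md = 1; next }   # stanza header line
        in_md && NF == 0         { in_md = 0 }         # blank line ends stanza
        in_md && $1 == "bitmap:" { print $2; exit }
    '
}

# bitmap_idle (hypothetical): succeed when the used-pages count is 0,
# which is how "in sync" was read off the output above.
bitmap_idle() {
    case "$(bitmap_pages "$1")" in
        0/*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

Typical use would be "bitmap_idle md100 < /proc/mdstat && echo in-sync".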
> 
> Restore from off-site device
> ------------------------
> Remove device from array
> root@deb6dev:~# mdadm -vv /dev/md100 --fail /dev/sdd --remove /dev/sdd
> mdadm: set /dev/sdd faulty in /dev/md100
> mdadm: hot removed /dev/sdd from /dev/md100
> root@deb6dev:~# mdadm -Ev /dev/sdd
> /dev/sdd:
>            Magic : a92b4efc
>          Version : 1.2
>      Feature Map : 0x1
>       Array UUID : 4c957fac:d7dbc792:b642daf0:d22e313e
>             Name : deb6dev:100  (local to host deb6dev)
>    Creation Time : Tue Oct 25 15:22:19 2011
>       Raid Level : raid1
>     Raid Devices : 2
> 
>   Avail Dev Size : 102376 (50.00 MiB 52.42 MB)
>       Array Size : 102376 (50.00 MiB 52.42 MB)
>      Data Offset : 24 sectors
>     Super Offset : 8 sectors
>            State : clean
>      Device UUID : 381f453f:5a97f1f6:bb5098bb:8c071a95
> 
> Internal Bitmap : 8 sectors from superblock
>      Update Time : Tue Oct 25 17:27:53 2011
>         Checksum : acbcee5f - correct
>           Events : 250
> 
> 
>     Device Role : Active device 1
>     Array State : AA ('A' == active, '.' == missing)
> 
> Assemble a new array from off-site component:
> root@deb6dev:~# mdadm -vv -A /dev/md200 --run /dev/sdd
> mdadm: looking for devices for /dev/md200
> mdadm: /dev/sdd is identified as a member of /dev/md200, slot 1.
> mdadm: no uptodate device for slot 0 of /dev/md200
> mdadm: added /dev/sdd to /dev/md200 as 1
> mdadm: /dev/md200 has been started with 1 drive (out of 2).
> root@deb6dev:~#
> 
> Check file-system on new array
> root@deb6dev:~# fsck.ext3 -f -n /dev/md200
> e2fsck 1.41.12 (17-May-2010)
> fsck.ext3: Superblock invalid, trying backup blocks...
> fsck.ext3: Bad magic number in super-block while trying to open /dev/md200
> 
> The superblock could not be read or does not describe a correct ext2
> filesystem.  If the device is valid and it really contains an ext2
> filesystem (and not swap or ufs or something else), then the superblock
> is corrupt, and you might try running e2fsck with an alternate superblock:
>      e2fsck -b 8193 <device>
> 
> 
> How do I use these devices in a new array?
> 

You also need to assemble md201, md202, md203, md204, md205 and md206,
and then fsck/mount md206.
Each of these is made by assembling the single previous md20X array.

mdadm -A /dev/md201 --run /dev/md200
mdadm -A /dev/md202 --run /dev/md201
....
mdadm -A /dev/md206 --run /dev/md205
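
The same chain could be generated in a loop.  This is only a sketch:
gen_restore_cmds is a hypothetical helper name, and it prints the commands
without running them.  The number of levels is the number of arrays from the
one the off-site device belonged to up to the top of the stack, so 7 for sdd
(a member of md100) but only 1 for sdj (a member of md106).

```shell
#!/bin/sh
# Dry-run sketch: print the assemble chain for restoring from one off-site
# component, followed by the read-only fsck of the top array.
# gen_restore_cmds is a hypothetical helper, not an mdadm feature.
gen_restore_cmds() {
    component=$1    # off-site device, e.g. /dev/sdd
    first_md=$2     # number for the first restored array, e.g. 200
    levels=$3       # arrays to assemble, e.g. 7 for a member of md100
    lower=$component
    i=0
    while [ "$i" -lt "$levels" ]; do
        md=/dev/md$((first_md + i))
        echo "mdadm -A $md --run $lower"
        lower=$md   # the next array is assembled from this one
        i=$((i + 1))
    done
    # The filesystem lives on the top array only.
    echo "fsck.ext3 -f -n $lower"
}

gen_restore_cmds /dev/sdd 200 7
```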

All the rest of your description looks good!

Thanks,
NeilBrown


