* Network based (iSCSI) RAID1 setup
@ 2017-05-10 8:42 Gionatan Danti
2017-05-10 9:03 ` Roman Mamedov
2017-05-10 12:08 ` Adam Goryachev
0 siblings, 2 replies; 5+ messages in thread
From: Gionatan Danti @ 2017-05-10 8:42 UTC (permalink / raw)
To: linux-raid
Hi all,
I'm trying to understand if, and how, mdadm can be used with network
attached devices (iSCSI, in this case). I have a very simple setup with
two 1 GB drives, the first being a local disk (a logical volume, really)
and the second a remote iSCSI disk.
First question: even if in my preliminary tests this seems to work
reasonably well, do you feel that such a solution can be used for
production workloads? Or does something with a more specific focus, such
as DRBD, remain the preferred solution?
I'm using two CentOS 7.3 x86-64 boxes, with kernel version
3.10.0-514.16.1.el7.x86_64 and mdadm v3.4 - 28th January 2016. Here you
can find my current RAID1 setup, where /dev/sdb is the iSCSI disk:
[root@gdanti-laptop g.danti]# cat /proc/mdstat
Personalities : [raid1]
md200 : active raid1 sdb[1] dm-3[0]
1047552 blocks super 1.2 [2/2] [UU]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
[root@gdanti-laptop g.danti]# mdadm -D /dev/md200
/dev/md200:
Version : 1.2
Creation Time : Wed May 10 08:53:12 2017
Raid Level : raid1
Array Size : 1047552 (1023.00 MiB 1072.69 MB)
Used Dev Size : 1047552 (1023.00 MiB 1072.69 MB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed May 10 10:27:35 2017
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : gdanti-laptop.assyoma.it:200 (local to host
gdanti-laptop.assyoma.it)
UUID : 9d6fb056:c1d49780:149f9391:68fc267f
Events : 62
Number Major Minor RaidDevice State
0 253 3 0 active sync /dev/dm-3
1 8 16 1 active sync /dev/sdb
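For reference, the array was created with something along these lines,
where <local_lv> is only a placeholder for the actual logical volume path:

  mdadm --create /dev/md200 --level=1 --raid-devices=2 \
        --bitmap=internal /dev/<local_lv> /dev/sdb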
So far, so good: the array seems to work well, with good read/write
speed. Now, suppose the remote disk becomes unavailable:
[root@gdanti-laptop g.danti]# iscsiadm -m node --targetname
iqn.2008-09.com.example:server.target1 --portal 172.31.255.11 --logout
Logging out of session [sid: 6, target:
iqn.2008-09.com.example:server.target1, portal: 172.31.255.11,3260]
Logout of [sid: 6, target: iqn.2008-09.com.example:server.target1,
portal: 172.31.255.11,3260] successful.
[root@gdanti-laptop g.danti]# cat /proc/mdstat
Personalities : [raid1]
md200 : active (auto-read-only) raid1 dm-3[0]
1047552 blocks super 1.2 [2/1] [U_]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
The device is correctly kicked out of the array.
So, second question: how to enable auto re-add for the remote device
when it becomes available again? For example:
[root@gdanti-laptop g.danti]# iscsiadm -m node --targetname
iqn.2008-09.com.example:server.target1 --portal 172.31.255.11 --login
Logging in to [iface: default, target:
iqn.2008-09.com.example:server.target1, portal: 172.31.255.11,3260]
(multiple)
Login to [iface: default, target:
iqn.2008-09.com.example:server.target1, portal: 172.31.255.11,3260]
successful.
[root@gdanti-laptop g.danti]# cat /proc/mdstat
Personalities : [raid1]
md200 : active (auto-read-only) raid1 dm-3[0]
1047552 blocks super 1.2 [2/1] [U_]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
Even if /dev/sdb is now visible, it is not auto re-added to the array.
If I run mdadm /dev/sdb --incremental --run I see the device added as a
spare:
[root@gdanti-laptop g.danti]# cat /proc/mdstat
Personalities : [raid1]
md200 : active (auto-read-only) raid1 sdb[1](S) dm-3[0]
1047552 blocks super 1.2 [2/1] [U_]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
Third question: why does --incremental add the device as a spare, rather
than as an active member?
I've looked at the POLICY directive in mdadm.conf, but I have been unable
to make it automatically re-add iSCSI devices when they come back up.
Sorry for the long post, I am really trying to learn something!
Thanks.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
* Re: Network based (iSCSI) RAID1 setup
2017-05-10 8:42 Network based (iSCSI) RAID1 setup Gionatan Danti
@ 2017-05-10 9:03 ` Roman Mamedov
2017-05-10 10:25 ` Gionatan Danti
2017-05-10 12:08 ` Adam Goryachev
1 sibling, 1 reply; 5+ messages in thread
From: Roman Mamedov @ 2017-05-10 9:03 UTC (permalink / raw)
To: Gionatan Danti; +Cc: linux-raid
On Wed, 10 May 2017 10:42:54 +0200
Gionatan Danti <g.danti@assyoma.it> wrote:
> even if in my preliminary tests this seems to work
> reasonably well, do you feel that such a solution can be used for
> production workloads? Or does something with a more specific focus,
> such as DRBD, remain the preferred solution?
First thing that comes to mind, you should look into setting the remote device
as --write-mostly, so that the local one is preferred for all reads (as long as
it's up).
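At creation time it is just a matter of flagging the device (the local
volume path below is only a placeholder):

  mdadm --create /dev/md200 --level=1 --raid-devices=2 \
        /dev/<local_lv> --write-mostly /dev/sdb

On an already-running array I believe the flag can also be toggled through
sysfs (untested, from memory):

  echo writemostly > /sys/block/md200/md/dev-sdb/state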
But to be honest DRBD may indeed be a better solution for this use case, as
it's built specifically with it in mind, and likely has all the various
gotchas that might arise already thought about and handled properly.
--
With respect,
Roman
* Re: Network based (iSCSI) RAID1 setup
2017-05-10 9:03 ` Roman Mamedov
@ 2017-05-10 10:25 ` Gionatan Danti
0 siblings, 0 replies; 5+ messages in thread
From: Gionatan Danti @ 2017-05-10 10:25 UTC (permalink / raw)
To: Roman Mamedov; +Cc: linux-raid
On 10/05/2017 11:03, Roman Mamedov wrote:
> First thing that comes to mind, you should look into setting the remote device
> as --write-mostly, so that the local one is preferred for all reads (as long as
> it's up).
Sure, and it is planned. However, for initial testing, I want to leave as
many parameters as possible at their default settings.
> But to be honest DRBD may indeed be a better solution for this use case, as
> it's built specifically with it in mind, and likely has all the various
> gotchas that might arise already thought about and handled properly.
>
To tell the truth, I already use DRBD 8.4 in production workloads and I
am quite satisfied with it. However, DRBD 8.4 only supports two hosts
(i.e. master and slave), and DRBD 9.x (which supports multi-node
replication) is a relatively new, deep rewrite of the old codebase with a
significantly expanded scope.
Hence my interest in mdadm-based network replication...
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
* Re: Network based (iSCSI) RAID1 setup
2017-05-10 8:42 Network based (iSCSI) RAID1 setup Gionatan Danti
2017-05-10 9:03 ` Roman Mamedov
@ 2017-05-10 12:08 ` Adam Goryachev
2017-05-11 15:09 ` Gionatan Danti
1 sibling, 1 reply; 5+ messages in thread
From: Adam Goryachev @ 2017-05-10 12:08 UTC (permalink / raw)
To: Gionatan Danti, linux-raid
On 10/5/17 18:42, Gionatan Danti wrote:
> Hi all,
> I'm trying to understand if, and how, mdadm can be used with network
> attached devices (iSCSI, in this case). I have a very simple setup
> with two 1 GB drives, the first being a local disk (a logical volume,
> really) and the second a remote iSCSI disk.
>
> First question: even if in my preliminary tests this seems to work
> reasonably well, do you feel that such a solution can be used for
> production workloads? Or does something with a more specific focus,
> such as DRBD, remain the preferred solution?
>
It depends on your definition of production, but for me, the answer is
no. Once upon a time, I used MD to do RAID1 between a local SSD and a
remote device with NBD and that worked well (apart from the fact that I
needed to manually re-add the remote device after a reboot, or whenever
it dropped out for any other reason). It did save me when the local SSD
died, and I was able to keep running purely from the remote NBD device
until I could get in and replace the local SSD.
Today, I use DRBD, and would much prefer that compared to MD + NBD.
> I'm using two CentOS 7.3 x86-64 boxes, with kernel version
> 3.10.0-514.16.1.el7.x86_64 and mdadm v3.4 - 28th January 2016. Here
> you can find my current RAID1 setup, where /dev/sdb is the iSCSI disk:
>
> So, second question: how to enable auto re-add for the remote device
> when it becomes available again? For example:
I don't know, but I guess you need to work out what udev rules are
triggered when the iSCSI device is "connected", and then get that to
trigger the MD add rules. Possibly you could try to create a partition
on the iSCSI disk, and then use sdb1 for the RAID array; there might be
better handling by udev in that case (I really don't know, just making
random suggestions here).
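Something vaguely like the following is what I have in mind (completely
untested; the ID_VENDOR match is just an example for a LIO target, check
"udevadm info" on the real device for properties worth matching on):

  # /etc/udev/rules.d/99-md-iscsi-readd.rules (hypothetical)
  ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?", ENV{ID_VENDOR}=="LIO-ORG", RUN+="/sbin/mdadm --incremental --run $devnode"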
> Even if /dev/sdb is now visible, it is not auto re-added to the array.
> If I run mdadm /dev/sdb --incremental --run I see the device added as
> a spare:
>
> [root@gdanti-laptop g.danti]# cat /proc/mdstat
> Personalities : [raid1]
> md200 : active (auto-read-only) raid1 sdb[1](S) dm-3[0]
> 1047552 blocks super 1.2 [2/1] [U_]
> bitmap: 0/1 pages [0KB], 65536KB chunk
>
> unused devices: <none>
>
> Third question: why does --incremental add the device as a spare, rather
> than as an active member?
>
Is it because the raid isn't actually running? Perhaps you need to start
the array first?
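Maybe something like this would take it out of auto-read-only first
(again, just a guess):

  mdadm --readwrite /dev/md200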
> I've looked at the POLICY directive in mdadm.conf, but I have been
> unable to make it automatically re-add iSCSI devices when they come back
> up.
I'd suggest using DRBD; it handles all these things a lot better because
they are normal events for it, and a lot more people will be able to
assist when something goes wrong with it.
Regards,
Adam
* Re: Network based (iSCSI) RAID1 setup
2017-05-10 12:08 ` Adam Goryachev
@ 2017-05-11 15:09 ` Gionatan Danti
0 siblings, 0 replies; 5+ messages in thread
From: Gionatan Danti @ 2017-05-11 15:09 UTC (permalink / raw)
To: Adam Goryachev, linux-raid
On 10/05/2017 14:08, Adam Goryachev wrote:
> It depends on your definition of production, but for me, the answer is
> no. Once upon a time, I used MD to do RAID1 between a local SSD and a
> remote device with NBD and that worked well (apart from the fact that I
> needed to manually re-add the remote device after a reboot, or whenever
> it dropped out for any other reason). It did save me when the local SSD
> died, and I was able to keep running purely from the remote NBD device
> until I could get in and replace the local SSD.
>
> Today, I use DRBD, and would much prefer that compared to MD + NBD.
Thanks for your feedback, Adam. I agree with you, for the reasons
explained below. Hoping it will be useful to others, I'll document my
findings here.
>> I'm using two CentOS 7.3 x86-64 boxes, with kernel version
>> 3.10.0-514.16.1.el7.x86_64 and mdadm v3.4 - 28th January 2016. Here
>> you can find my current RAID1 setup, where /dev/sdb is the iSCSI disk:
>>
>> So, second question: how to enable auto re-add for the remote device
>> when it becomes available again? For example:
>
> I don't know, but I guess you need to work out what udev rules are
> triggered when the iSCSI device is "connected", and then get that to
> trigger the MD add rules. Possibly you could try to create a partition
> on the iSCSI disk, and then use sdb1 for the RAID array; there might be
> better handling by udev in that case (I really don't know, just making
> random suggestions here).
I worked out how to enable auto re-add: the key was to include a default
"POLICY action=re-add" in /etc/mdadm.conf; at that point, any removed
disk will be re-attached automatically when it becomes visible again.
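The relevant lines in my mdadm.conf now look roughly like this (the UUID
is the one reported by mdadm -D above):

  ARRAY /dev/md200 metadata=1.2 UUID=9d6fb056:c1d49780:149f9391:68fc267f
  POLICY action=re-add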
With a catch: in iSCSI, when the remote disk becomes unresponsive and is
dropped, its entries in /dev are *not* removed by udev. As both the
removal and the auto-re-add processes depend on udev events triggering
changes to the /dev directory (i.e. a drive disappearing and/or
reappearing), rebooting the remote host will cause the iSCSI-imported
disk to be marked as failed, but not as removed (because its entries in
/dev are *not* removed); later, when the iSCSI disk becomes visible
again, it is re-added only as a spare.
So, while mdadm by itself worked quite well with networked disks, I agree
with Adam that, for production workloads, this setup is too fragile.
Specifically:
- the default iSCSI timeout (120 seconds) is way too high and needs to be
adjusted (see the iscsid.conf snippet after this list);
- as, by default, mdadm scans all disk devices for valid arrays, *both*
the local and the remote machines can see the md array. This must be
avoided by using the ARRAY <ignore> directive in the mdadm.conf file on
the remote machine (i.e. the one which exports the iSCSI drives; again,
see the snippet after this list);
- udev is clearly not geared toward managing events caused by remote
disks which suddenly disconnect without notice.
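For the first two points, the changes boil down to something like this
(the timeout value is just the starting point I would test with, not a
tuned recommendation):

  # /etc/iscsi/iscsid.conf on the initiator: report a dead target to the
  # upper layers (and so to md) much sooner than the default 120 seconds
  node.session.timeo.replacement_timeout = 10

  # /etc/mdadm.conf on the machine exporting the iSCSI drive: never
  # assemble the array locally
  ARRAY <ignore> UUID=9d6fb056:c1d49780:149f9391:68fc267f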
In the end, none of the problems seem related to mdadm itself, which
remains a wonderful and extremely flexible tool for managing software
RAID. Thank you all for the hard work.
Regards.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8