From: Wols Lists <antlists@youngman.org.uk>
To: "Guilherme G. Piccoli" <gpiccoli@canonical.com>,
	linux-raid@vger.kernel.org
Cc: linux-block@vger.kernel.org, kernel@gpiccoli.net,
	linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org
Subject: Re: Should a raid-0 array immediately stop if a component disk is removed?
Date: Fri, 27 Apr 2018 23:11:23 +0100
Message-ID: <5AE3A00B.6050801@youngman.org.uk>
In-Reply-To: <301f2a37-81ea-938b-3bd1-947f716c0f63@canonical.com>

On 27/04/18 22:49, Guilherme G. Piccoli wrote:
> Hello, we've noticed an interesting behavior when using a raid-0 md
> array. Suppose we have a 2-disk raid-0 array that has a mount point
> set - in our tests, we've used an ext4 filesystem. If we remove one of
> the component disks via sysfs[0], userspace is notified, but the mdadm
> tool fails to stop the array[1] (it cannot open the array device node
> with the O_EXCL flag, hence it fails to issue the STOP_ARRAY ioctl).
> Even if we circumvent the mdadm O_EXCL open, the md driver will fail to
> execute the ioctl given the array is mounted.

Sounds like you're not using mdadm to remove the disk. So why do you
expect mdadm to stop the array immediately? It doesn't know anything is
wrong until it trips over the missing disk.
> 
> As a result, the array stays mounted and we can even read/write from
> it, although it's possible to observe filesystem errors in dmesg[2].
> Eventually, after some _minutes_, the filesystem gets remounted as
> read-only.

Is your array linear or striped? If it's striped, I would expect it to
fall over in a heap very quickly. If it's linear, it depends whether you
remove drive 0 or drive 1. If you remove drive 0, it will fall over very
quickly. If you remove drive 1, the fuller your array the quicker it
will fall over (if your array isn't very full, drive 1 may well not be
used, in which case the array might not fall over at all!)
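
To illustrate the distinction, here is a rough Python sketch of which
member disk a given byte offset lands on. It is a simplification, not
the md driver's actual code: it assumes equal-sized members for the
striped case and a 512 KiB chunk, and the function names are made up
for illustration.

```python
# Assumed chunk size in bytes (an illustrative value, not read from any array).
CHUNK = 512 * 1024


def striped_member(offset, n_disks=2, chunk=CHUNK):
    """Member index for a byte offset when chunks are interleaved round-robin.

    Hypothetical helper: assumes all members are the same size.
    """
    return (offset // chunk) % n_disks


def linear_member(offset, disk_sizes):
    """Member index for a byte offset when members are simply concatenated.

    Hypothetical helper: disk_sizes is a list of member sizes in bytes.
    """
    for index, size in enumerate(disk_sizes):
        if offset < size:
            return index
        offset -= size
    raise ValueError("offset beyond end of array")
```

With striping, any I/O that spans more than one chunk touches both
disks, so a missing member is hit almost at once. With linear, drive 1
is only touched once offsets pass the end of drive 0.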
> 
> During this weird window in which the array had a component disk removed
> but was still mounted/active (and accepting reads/writes), we tried to
> perform reads, writes and the sync command, all of which "succeeded"
> (meaning the commands themselves didn't fail, although errors were
> observed in dmesg). When "dd" was executed with "oflag=direct", the
> writes failed immediately. This was observed with both nvme and scsi
> disks composing the raid-0 array.
> 
> We've started to pursue a solution to this, which seems to be odd
> behavior. But it's worth checking with the CC'ed lists whether this is
> "by design" or whether it was already discussed in the past (maybe an
> idea was proposed). Tests were executed with v4.17-rc2 and the upstream
> mdadm tool.

Note that raid-0 is NOT redundant. Standard advice is "if a drive fails,
expect to lose your data". So the fact that your array limps on should
be the pleasant surprise, not that it blows up in ways you didn't expect.
> 
> Thanks in advance,
> 
> 
> Guilherme

Cheers,
Wol



