Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: pg@btrfs.list.sabi.co.UK (Peter Grandi)
To: Linux fs Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Unexpected raid1 behaviour
Date: Sun, 17 Dec 2017 15:48:20 +0000	[thread overview]
Message-ID: <23094.37316.66397.431081@tree.ty.sabi.co.uk> (raw)
In-Reply-To: <pan$36ed4$200dd538$4324239e$ed0f9f4@cox.net>

"Duncan"'s reply is slightly optimistic in parts, so some
further information...

[ ... ]

> Basically, at this point btrfs doesn't have "dynamic" device
> handling.  That is, if a device disappears, it doesn't know
> it.

That's just the consequence of what is a completely broken
conceptual model: the current way most multi-device profiles are
designed is that block-devices and only be "added" or "removed",
and cannot be "broken"/"missing". Therefore if IO fails, that is
just one IO failing, not the entire block-device going away.
The time when a block-device is noticed as sort-of missing is
when it is not available for "add"-ing at start.

Put another way, the multi-device design is/was based on the
demented idea that block-devices that are missing are/should be
"remove"d, so that a 2-device volume with a 'raid1' profile
becomes a 1-device volume with a 'single'/'dup' profile, and not
a 2-device volume with a missing block-device and an incomplete
'raid1' profile, even if things have been awkwardly moving in
that direction in recent years.

Note the above is not totally accurate today because various
hacks have been introduced to work around the various issues.

> Thus, if a device disappears, to get it back you really have
> to reboot, or at least unload/reload the btrfs kernel module,
> in ordered to clear the stale device state and have btrfs
> rescan and reassociate devices with the matching filesystems.

IIRC that is not quite accurate: a "missing" device can be
nowadays "replace"d (by "devid") or "remove"d, the latter
possibly implying profile changes:

  https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Using_add_and_delete

Terrible tricks like this also work:

  https://www.spinics.net/lists/linux-btrfs/msg48394.html

> Meanwhile, as mentioned above, there's active work on proper
> dynamic btrfs device tracking and management. It may or may
> not be ready for 4.16, but once it goes in, btrfs should
> properly detect a device going away and react accordingly,

I haven't seen that, but I doubt that it is the radical redesign
of the multi-device layer of Btrfs that is needed to give it
operational semantics similar to those of MD RAID, and that I
have vaguely described previously.

> and it should detect a device coming back as a different
> device too.

That is disagreeable because of poor terminology: I guess that
what was intended that it should be able to detect a previous
member block-device becoming available again as a different
device inode, which currently is very dangerous in some vital
situations.

> Longer term, there's further patches that will provide a
> hot-spare functionality, automatically bringing in a device
> pre-configured as a hot- spare if a device disappears, but
> that of course requires that btrfs properly recognize devices
> disappearing and coming back first, so one thing at a time.

That would be trivial if the complete redesign of block-device
states of the Btrfs multi-device layer happened, adding an
"active" flag to an "accessible" flag to describe new member
states, for example.

My guess that while logically consistent, the current
multi-device logic is fundamentally broken from an operational
point of view, and needs a complete replacement instead of
fixes.

  reply	other threads:[~2017-12-17 15:48 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-16 19:50 Unexpected raid1 behaviour Dark Penguin
2017-12-17 11:58 ` Duncan
2017-12-17 15:48   ` Peter Grandi [this message]
2017-12-17 20:42     ` Chris Murphy
2017-12-18  8:49       ` Anand Jain
2017-12-18  8:49     ` Anand Jain
2017-12-18 10:36       ` Peter Grandi
2017-12-18 12:10       ` Nikolay Borisov
2017-12-18 13:43         ` Anand Jain
2017-12-18 22:28       ` Chris Murphy
2017-12-18 22:29         ` Chris Murphy
2017-12-19 12:30         ` Adam Borowski
2017-12-19 12:54         ` Andrei Borzenkov
2017-12-19 12:59         ` Peter Grandi
2017-12-18 13:06     ` Austin S. Hemmelgarn
2017-12-18 19:43       ` Tomasz Pala
2017-12-18 22:01         ` Peter Grandi
2017-12-19 12:46           ` Austin S. Hemmelgarn
2017-12-19 12:25         ` Austin S. Hemmelgarn
2017-12-19 14:46           ` Tomasz Pala
2017-12-19 16:35             ` Austin S. Hemmelgarn
2017-12-19 17:56               ` Tomasz Pala
2017-12-19 19:47                 ` Chris Murphy
2017-12-19 21:17                   ` Tomasz Pala
2017-12-20  0:08                     ` Chris Murphy
2017-12-23  4:08                       ` Tomasz Pala
2017-12-23  5:23                         ` Duncan
2017-12-20 16:53                   ` Andrei Borzenkov
2017-12-20 16:57                     ` Austin S. Hemmelgarn
2017-12-20 20:02                     ` Chris Murphy
2017-12-20 20:07                       ` Chris Murphy
2017-12-20 20:14                         ` Austin S. Hemmelgarn
2017-12-21  1:34                           ` Chris Murphy
2017-12-21 11:49                         ` Andrei Borzenkov
2017-12-19 20:11                 ` Austin S. Hemmelgarn
2017-12-19 21:58                   ` Tomasz Pala
2017-12-20 13:10                     ` Austin S. Hemmelgarn
2017-12-19 23:53                   ` Chris Murphy
2017-12-20 13:12                     ` Austin S. Hemmelgarn
2017-12-19 18:31             ` George Mitchell
2017-12-19 20:28               ` Tomasz Pala
2017-12-19 19:35             ` Chris Murphy
2017-12-19 20:41               ` Tomasz Pala
2017-12-19 20:47                 ` Austin S. Hemmelgarn
2017-12-19 22:23                   ` Tomasz Pala
2017-12-20 13:33                     ` Austin S. Hemmelgarn
2017-12-20 17:28                       ` Duncan
2017-12-21 11:44                   ` Andrei Borzenkov
2017-12-21 12:27                     ` Austin S. Hemmelgarn
2017-12-22 16:05                       ` Tomasz Pala
2017-12-22 21:04                         ` Chris Murphy
2017-12-23  2:52                           ` Tomasz Pala
2017-12-23  5:40                             ` Duncan
2017-12-19 23:59                 ` Chris Murphy
2017-12-20  8:34                   ` Tomasz Pala
2017-12-20  8:51                     ` Tomasz Pala
2017-12-20 19:49                     ` Chris Murphy
2017-12-18  5:11   ` Anand Jain
2017-12-18  1:20 ` Qu Wenruo
2017-12-18 13:31 ` Austin S. Hemmelgarn
2018-01-12 12:26   ` Dark Penguin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=23094.37316.66397.431081@tree.ty.sabi.co.uk \
    --to=pg@btrfs.list.sabi.co.uk \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox