Linux Btrfs filesystem development
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Peter Grandi <pg@btrfs.list.sabi.co.UK>,
	Linux fs Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Unexpected raid1 behaviour
Date: Tue, 19 Dec 2017 07:46:58 -0500	[thread overview]
Message-ID: <41e8ba5f-205d-ede5-e613-86896d463e4f@gmail.com> (raw)
In-Reply-To: <23096.15063.957412.712081@tree.ty.sabi.co.uk>

On 2017-12-18 17:01, Peter Grandi wrote:
>>> The fact is, the only cases where this is really an issue is
>>> if you've either got intermittently bad hardware, or are
>>> dealing with external
> 
>> Well, RAID1+ is all about failing hardware.
> 
>>> storage devices. For the majority of people who are using
>>> multi-device setups, the common case is internally connected
>>> fixed storage devices with properly working hardware, and for
>>> that use case, it works perfectly fine.
> 
>> If you're talking about "RAID"-0 or storage pools (volume
>> management), that is true. But if you imply that RAID1+ "works
>> perfectly fine as long as hardware works fine", that is
>> fundamentally wrong.
> 
> I really agree with this; the argument about "properly working
> hardware" is utterly ridiculous. I'll add to this: apparently I am
> not the first one to discover the "anomalies" in the "RAID"
> profiles, but I may have been the first to document some of
> them, e.g. the famous issues with the 'raid1' profile. How did I
> discover them? Well, I had used Btrfs in single-device mode for
> a bit and wanted to try multi-device, and the docs seemed
> "strange", so I ran tests before trying it out.
> 
> The tests were simple: on a spare PC with a bunch of old disks I
> created two block devices (partitions) and put them in 'raid1',
> first natively, then by adding a new member to an existing
> partition; then I would 'remove' one, or simply unplug it
> (actually 'echo 1 > /sys/block/.../device/delete') initially. I
> wanted to check exactly what happened: resync times, speed,
> behaviour and speed when degraded, just ordinary operational tasks.
> 
> Well, I found significant problems after less than one hour. I
> can't imagine anyone with some experience of hw or sw RAID
> (especially hw RAID, as hw RAID firmware is often fantastically
> buggy, especially as to RAID operations) who wouldn't have done
> the same tests before operational use, and wouldn't have found
> the same issues straight away too. The only conclusion I could draw
> is that whoever designed the "RAID" profiles had zero operational
> system administration experience.
Or possibly you didn't read the documentation thoroughly, which any 
reasonable system administrator would do before even starting to test 
something.  Unless you were doing stupid stuff like running for 
extended periods of time with half an array, or not even trying to 
repair things after the device reappeared, none of what you described 
should have caused any issues.
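For reference, the kind of test sequence described above, together with the documented recovery steps, can be sketched roughly as follows. This is an illustrative sketch only: device names are made up, the commands require root and real (or virtual) block devices, and exact behaviour depends on the kernel and btrfs-progs versions in use.

```shell
# Create a two-device raid1 filesystem (illustrative device names).
mkfs.btrfs -d raid1 -m raid1 /dev/sdb1 /dev/sdc1
mount /dev/sdb1 /mnt

# Simulate a device disappearing, as described in the quoted test.
echo 1 > /sys/block/sdc/device/delete

# With one device missing, the documented path is to mount degraded,
# not to keep running read-write with half an array indefinitely:
mount -o degraded /dev/sdb1 /mnt

# Once the device is back, resynchronize the copies; a scrub
# verifies checksums and rewrites bad copies from the good mirror.
btrfs scrub start /mnt
```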
> 
>> If the hardware needs to work properly for the RAID to work
>> properly, no one would need RAID in the first place.
> 
> It is not just that: some maintenance operations are needed
> even if the hardware works properly, for example preventive
> maintenance, replacing drives that are becoming too old,
> expanding capacity, and periodically testing hardware. Systems
> engineers don't just say "it works, let's assume it continues to
> work properly, why worry".
Really?  So replacing hard drives just doesn't work on BTRFS?

Hmm...

Then that means that all the testing I regularly do of reshaping arrays 
and replacing devices, which consistently works (except for raid5 and 
raid6, but those have other issues too right now), must be a complete 
fluke.  I guess I have to go check my hardware and the QEMU sources to 
figure out how they are broken such that all of this is working 
successfully...

Seriously though, did you even _test_ replacing devices using the 
procedures described in the documentation, or did you just see that 
things didn't work in the couple of cases you thought were most 
important and assume nothing else worked?
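The documented device-replacement procedure being referred to is `btrfs replace`, which copies data to the new device while the filesystem stays online. A rough sketch, with illustrative device names:

```shell
# Replace a still-present device in place; data is copied in the
# background while the filesystem remains mounted and usable.
btrfs replace start /dev/sdc1 /dev/sdd1 /mnt

# Monitor progress of the ongoing copy.
btrfs replace status /mnt

# If the old device is already missing, refer to it by its devid
# (as shown by 'btrfs filesystem show') instead of a device path:
btrfs replace start 2 /dev/sdd1 /mnt
```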
> 
> My impression is that multi-device and "chunks" were designed
> one way by someone, and someone else did not understand the
> intent, confused them with "RAID", and based the 'raid'
> profiles on that confusion. For example, the 'raid10' profile
> seems the least confused to me, and that is, I think, because the
> "RAID" aspect is kept more distinct from the "multi-device"
> aspect. But perhaps I am an optimist...
The names were a stupid choice intended to convey the basic behavior in 
a way that idiots who have no business being sysadmins could understand 
(and yes, the raid1 profiles do behave as someone with a naive 
understanding of RAID1 as simple replication would expect). 
Unfortunately, we're stuck with them now, and there's no point in 
complaining beyond just acknowledging that the names were a poor choice.
> 
> To simplify a longer discussion: to have "RAID" one needs an
> explicit design concept of "stripe", which in Btrfs needs to be
> quite different from that of "set of member devices" and
> "chunks", so that, for example, adding/removing to a "stripe" is
> not quite the same thing as adding/removing members to a volume;
> plus a distinction between online and offline members,
> not just added and removed ones, and well-defined state-machine
> transitions (e.g. in response to hardware problems) among all
> those, like in MD RAID. But the importance of such distinctions
> may not be apparent to everybody.
Or maybe people are sensible and don't care about such distinctions as 
long as things work within the defined parameters?  It's only engineers 
and scientists that care about how and why (or stuffy bureaucrats who 
want control over things).  Regular users, and even some developers, 
don't care about the exact implementation provided it works how they 
need it to work.
> 
[Obviously intentionally inflammatory comment removed]

