Re: Is it safe to use btrfs on top of different types of devices?

From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Adam Borowski <kilobyte@angband.pl>
Cc: Zoltan <zoltan1980@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: Is it safe to use btrfs on top of different types of devices?
Date: Tue, 17 Oct 2017 15:19:09 -0400
Message-ID: <1d5e9875-1c1e-f67e-1f5b-0741555d9517@gmail.com>
In-Reply-To: <20171017170626.amfrohfyqlujdueu@angband.pl>

On 2017-10-17 13:06, Adam Borowski wrote:
> On Tue, Oct 17, 2017 at 08:40:20AM -0400, Austin S. Hemmelgarn wrote:
>> On 2017-10-17 07:42, Zoltan wrote:
>>> On Tue, Oct 17, 2017 at 1:26 PM, Austin S. Hemmelgarn
>>> <ahferroin7@gmail.com> wrote:
>>>
>>>> I forget sometimes that people insist on storing large volumes of data on
>>>> unreliable storage...
>>>
>>> In my opinion the unreliability of the storage is the exact reason for
>>> wanting to use raid1. And I think any problem one encounters with an
>>> unreliable disk can likely happen with more reliable ones as well,
>>> only less frequently, so if I don't feel comfortable using raid1 on an
>>> unreliable medium then I wouldn't trust it on a more reliable one
>>> either.
> 
>> The thing is that you need some minimum degree of reliability in the other
>> components in the storage stack for it to be viable to use any given storage
>> technology.  If you don't meet that minimum degree of reliability, then you
>> can't count on the reliability guarantees of the storage technology.
> 
> The thing is, reliability guarantees required vary WILDLY depending on your
> particular use cases.  On one hand, there's "even a one-minute downtime
> would cost us mucho $$$s, can't have that!" -- on the other, "it died?
> Okay, we got backups, lemme restore it after the weekend".
Yes, but if you are in the second case, you arguably don't need 
replication, and would be better served by improving the reliability of 
your underlying storage stack than trying to work around its problems. 
Even in that case, your overall reliability is still constrained by the 
least reliable component (in more idiomatic terms, 'a chain is only as 
strong as its weakest link').

Using replication with a reliable device and a questionable device is 
essentially the same as trying to add redundancy to a machine by adding 
an extra linkage that doesn't always work and can get in the way of the 
main linkage it's supposed to be protecting from failure.  Yes, it will 
work most of the time, but the system is going to be less reliable than 
it is without the 'redundancy'.
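
To put rough numbers on that, here is a toy model, not a measurement. 
All three probabilities are made-up inputs, and the 'disrupt' term (the 
chance that the questionable device's misbehavior takes the whole array 
down with it) is my own simplifying assumption about how such a device 
gets in the way:

```shell
#!/bin/sh
# Toy model only: every probability here is an invented input.
#   $1  p_good    chance the reliable device survives the period
#   $2  p_flaky   chance the questionable device survives the period
#   $3  p_disrupt chance the questionable device's misbehavior (e.g.
#                 dropping out and coming back stale) breaks the array
array_survival() {
    awk -v g="$1" -v f="$2" -v d="$3" 'BEGIN {
        mirror = 1 - (1 - g) * (1 - f)      # at least one copy survives
        printf "%.3f\n", (1 - d) * mirror   # ...and the array was not disrupted
    }'
}

# Good device at 0.99, flaky device at 0.70, 5% chance of disruption.
array_survival 0.99 0.70 0.05
```

With those invented numbers it prints 0.947, which is below the 0.99 you 
get from the good device on its own.  Set the disrupt term to 0 and the 
mirror wins again (0.997), which is exactly the case where the second 
device can't get in the way.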
> 
> Lemme tell you a btrfs blockdev disconnects story.
> I have an Odroid-U2, a cheap ARM SoC that, despite being 5 years old and
> costing mere $79 (+$89 eMMC...) still beats the performance of much newer
> SoCs that have far better theoretical specs, including subsequent Odroids.
> After ~1.5 year of CPU-bound stress tests for one program, I switched this
> machine to doing Debian package rebuilds, 24/7/365¼, for QA purposes.
> Being a moron, I did not realize until pretty late that high parallelism to
> keep all cores utilized is still a net performance loss when a memory-hungry
> package goes into a swappeathon, even despite the latter being fairly rare.
> Thus, I can say disk utilization was pretty much 100%, with almost as much
> writing as reading.  The eMMC card endured all of this until very recently
> (nowadays it sadly throws errors from time to time).
> 
> Thus, I switched the machine to NBD (albeit it sucks on 100Mbit eth).  Alas,
> the network driver allocates memory with GFP_NOIO which causes NBD
> disconnects (somehow, this doesn't ever happen on swap where GFP_NOIO would
> be obvious but on regular filesystem where throwing out userspace memory is
> safe).  The disconnects happen around once per week.
Somewhat off-topic, but you might try looking at ATAoE as an 
alternative.  It's more reliable in my experience (if you've got a 
reliable network), gives better performance (there's less protocol 
overhead than NBD, and it runs on top of layer 2 instead of layer 4), 
and you can even boot with an ATAoE device as root without needing an 
initramfs if you have network auto-configuration in the kernel.  The 
generic server-side component is called 'vblade', and you actually don't 
need anything on the client side other than the `aoe` kernel module 
(loading the module scans for devices automatically, and you can easily 
manage things through the various nodes it creates in /dev).
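
As a rough sketch of what that looks like: the interface name, shelf 
and slot numbers, backing device and mount point below are placeholders 
I picked for illustration, not anything from your setup.  The commands 
are only printed unless you set DRY_RUN=0, since the real ones need root 
and a block device you are willing to export.

```shell
#!/bin/sh
# Sketch of a minimal ATAoE setup. eth0, /dev/sdb, shelf 0, slot 1 and
# /mnt/build are illustrative placeholders.
# Commands are echoed unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Server side: export a block device on the given interface.
# vblade usage: vblade <shelf> <slot> <netif> <file-or-device>
aoe_export() {
    run vblade "${SHELF:-0}" "${SLOT:-1}" "${SERVER_IF:-eth0}" \
        "${BACKING_DEV:-/dev/sdb}"
}

# Client side: loading the module scans for exported devices, which then
# appear as /dev/etherd/e<shelf>.<slot> (possibly after a short delay).
aoe_attach() {
    run modprobe aoe
    run mount "/dev/etherd/e${SHELF:-0}.${SLOT:-1}" "${MOUNTPOINT:-/mnt/build}"
}
```

You would run aoe_export on the server and aoe_attach on the client.  If 
the device doesn't show up, writing anything to /dev/etherd/discover 
forces a rescan.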
> 
> It's a single-device filesystem, thus disconnects are obviously fatal.  But,
> they never caused even a single bit of damage (as scrub goes), thus proving
> btrfs handles this kind of disconnects well.  Unlike times past, the kernel
> doesn't get confused thus no reboot is needed, merely an unmount, "service
> nbd-client restart", mount, restart the rebuild jobs.
That's expected behavior though.  _Single_ device BTRFS has nothing to 
get out of sync most of the time.  The only time there's any possibility 
of an issue is when the system or device dies after writing only the 
first copy of a block that's in a dup profile chunk, but even that is 
not very likely to cause problems (you'll just lose at most the last 
<commit-time> worth of data).  The moment you add another device though, 
that simplicity goes out the window.
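
For reference, the recovery you describe, plus a check of which chunk 
profiles are in use and a scrub to confirm nothing got damaged, would 
look roughly like this.  /dev/nbd0 and /mnt/build are placeholder names 
of mine, and the commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch only: /dev/nbd0 and /mnt/build are placeholder names.
# Commands are echoed unless DRY_RUN=0 is set, since they need root.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_after_disconnect() {
    dev="${1:-/dev/nbd0}"
    mnt="${2:-/mnt/build}"
    run umount "$mnt"
    run service nbd-client restart
    run mount "$dev" "$mnt"
    # Show the profile of each chunk type (e.g. single data, DUP metadata).
    run btrfs filesystem df "$mnt"
    # -B keeps the scrub in the foreground until it has verified everything.
    run btrfs scrub start -B "$mnt"
}
```

On a single device, dup is the only profile with a second copy that 
could lag behind, which is why the scrub coming back clean is not 
surprising here.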
> 
> I also can recreate this filesystem and the build environment on it with
> just a few commands, thus, unlike /, there's no need for backups.  But I
> had no need to recreate it yet.
> 
> This is single-device not RAID5, but it's a good example for a use case
> where an unreliable storage medium is acceptable (even if the GFP_NOIO issue
> is still worth fixing).
Again, single device mode with BTRFS handles it just fine (at least, it 
handles it just as well as any other filesystem does).  Multi-device 
BTRFS is the issue here.

