From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Help with space
Date: Fri, 2 May 2014 08:23:24 +0000 (UTC)
Message-ID: <pan$b29aa$ed2c5b17$e83ea42c$5e85b880@cox.net>
In-Reply-To: 201405021148.07577.russell@coker.com.au
Russell Coker posted on Fri, 02 May 2014 11:48:07 +1000 as excerpted:
> On Thu, 1 May 2014, Duncan <1i5t5.duncan@cox.net> wrote:
>
> Am I missing something or is it impossible to do a disk replace on BTRFS
> right now?
>
> I can delete a device, I can add a device, but I'd like to replace a
> device.
You're missing something... but it's an easy thing to miss, as I almost
missed it too even tho I was sure it was there.
Something tells me btrfs replace (not device replace, simply replace)
should be moved to btrfs device replace...
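For the archive, a minimal sketch of what that looks like in practice
(the device paths and mount point below are placeholders, not from this
thread):

```shell
# Online, single-step replacement of one device with another
# (/dev/sdX, /dev/sdY and /mnt are example names -- substitute your own).
btrfs replace start /dev/sdX /dev/sdY /mnt

# It runs in the background; check on it with:
btrfs replace status /mnt

# If the old device is dead or missing, name it by devid instead
# (find the devid with: btrfs filesystem show /mnt):
btrfs replace start 2 /dev/sdY /mnt
```

That's the one-step equivalent of the device add then device delete
dance, without the intermediate rebalance.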
> http://www.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-391.pdf
>
> Whether a true RAID-1 means just 2 copies or N copies is a matter of
> opinion. Papers such as the above seem to clearly imply that RAID-1 is
> strictly 2 copies of data.
Thanks for that link. =:^)
My position would be that that reflects the original, but not the modern,
definition. The paper seems to describe as raid1 what would later come
to be called raid1+0, which quickly morphed into raid10, leaving the
raid1 description covering only pure mirror-raid.
And even then, the paper says mirrors in spots without specifically
defining it as (only) two mirrors, but in others it seems to /assume/,
without further explanation, just two mirrors. So I'd argue that even
then the definition of raid1 allowed more than two mirrors, but that it
just so happened that the examples and formulae given dealt with only two
mirrors.
Tho certainly I can see the room for differing opinions on the matter as
well.
> I don't have a strong opinion on how many copies of data can be involved
> in a RAID-1, but I think that there's no good case to claim that only 2
> copies means that something isn't "true RAID-1".
Well, I'd say two copies, if it's only two devices in the raid1, would
be true raid1. But if it's, say, four devices in the raid1, as is
certainly possible with btrfs raid1, then if it's not mirrored 4-way
across all devices, it's not true raid1, but rather some sort of hybrid
raid: raid10 (or raid01) if the devices are so arranged, raid1+linear if
arranged that way, or some form that doesn't fall neatly into a well-
defined raid level categorization.
But still, opinions can differ. Point well made... and taken. =:^)
>> Surprisingly, after shutting everything down, getting a new AC, and
>> letting the system cool for a few hours, it pretty much all came back
>> to life, including the CPU(s) (that was pre-multi-core, but I don't
>> remember whether it was my dual socket original Opteron, or
>> pre-dual-socket for me as well) which I had feared would be dead.
>
> CPUs have had thermal shutdown for a long time. When a CPU lacks such
> controls (as some buggy Opteron chips did a few years ago) it makes the
> IT news.
That was certainly some years ago, and I remember that for awhile, AMD
Athlons didn't have thermal shutdown yet, while Intel CPUs of the time
did. And that was an AMD CPU, as I've run mostly AMD (with only specific
exceptions) for literally decades now. But what I don't recall for sure
is whether it was my original AMD Athlon (500 MHz), the Athlon C @ 1.2
GHz, or the dual Opteron 242s I ran for several years. If it was the
original Athlon, it wouldn't have had thermal shutdown. If it was the
Opterons, I think they did, but I think the Athlon Cs were in the period
when Intel had introduced thermal shutdown but AMD hadn't, and Tom's
Hardware among others had dramatic videos of just exactly what happened
if one actually tried to run the things without cooling, compared to
running an Intel of the period.
But I remember being rather surprised that the CPU(s) was/were unharmed,
which means it very well could have been the Athlon C era, and I had seen
the dramatic videos and knew my CPU wasn't protected.
> I'd like to be able to run a combination of "dup" and RAID-1 for
> metadata. ZFS has a "copies" option, it would be good if we could do
> that.
Well, if N-way-mirroring were possible, one could do more or less just
that easily enough, with suitable partitioning and setting the data vs
metadata number of mirrors as appropriate... but of course with only two-
way-mirroring and dup as the choices, the only way to do it would be
layering btrfs atop something else, say md/raid. And without real-time
checksum verification at the md/raid level, that layer has no way to
know which copy is the good one.
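For comparison, the ZFS knob in question and what btrfs offers today
(the pool/dataset and device names below are made-up examples):

```shell
# ZFS: redundancy set per dataset, on top of whatever the pool provides.
zfs set copies=2 tank/important

# btrfs: one profile for metadata and one for data, filesystem-wide,
# chosen at mkfs time:
mkfs.btrfs -m raid1 -d raid1 /dev/sdX1 /dev/sdY1

# Profiles can be converted later with balance filters, e.g. raid1
# metadata over single data:
btrfs balance start -mconvert=raid1 -dconvert=single /mnt
```

The btrfs profiles apply to the whole filesystem rather than per-subtree,
which is exactly the gap a ZFS-style "copies" option would fill.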
> I use BTRFS for all my backups too. I think that the chance of data
> patterns triggering filesystem bugs that break backups as well as
> primary storage is vanishingly small. The chance of such bugs being
> latent for long enough that I can't easily recreate the data isn't worth
> worrying about.
The fact that my primary filesystems and their first backups are btrfs
raid1 on dual SSDs, while secondary backups are on spinning rust, does
factor into my calculations here.
I ran reiserfs for many years, since I first switched to Linux full time
in the early kernel 2.4 era in fact. While it had its problems early
on, since the introduction of ordered data mode in IIRC 2.6.16 or some
such, reiserfs has proven its reliability here thru all sorts of hardware
issues, including faulty memory, bad power, and that overheated disk,
and thru the infamous ext3 writeback-journal-by-default period as well.
Of course I attribute a good part of that reliability to the fact that
kernel hackers who think they know enough about ext* to mess with it are
afraid to touch reiserfs, leaving that to the experts. That (and
memories of its earlier writeback issues) is precisely why reiserfs
didn't suffer the same writeback-by-default problems that ext3 had, when
kernel hackers thought they knew ext3 well enough to try writeback with
it.
And it's in no small part due to Chris Mason's history with reiserfs and
the introduction of ordered journaling there, that I trust btrfs to the
degree I trust it today.
But reiserfs, while it has proven its reliability here time and again,
simply wasn't designed for SSDs, nor is it appropriate for them. So
while I had tried btrfs on spinning rust somewhat earlier and decided it
wasn't mature enough for my usage at that time, when I switched to SSD I
needed to find a new filesystem as well. In part because I do /not/
trust the kernel hackers to keep their hands off ext*, while at the same
time I /do/ routinely run pre-release kernels, including occasionally
pre-rc1 kernels, thereby heightening my exposure to ext* kernel-hacking
risks, I wasn't particularly enthusiastic about switching to ext4 and
its ssd mode. Moreover, having run reiserfs with tail-packing for years,
I viewed the restrictions of whole-block allocations as a regression I
didn't want to deal with.
As a result, when I switched to SSD and needed something more suited to
ssd than reiserfs, it was little surprise that I decided on a new
filesystem with a lead developer instrumental in making reiserfs as
stable as it has been for me, even while keeping my spinning rust backups
on the reiserfs that has time and again demonstrated for me surprisingly
good stability in the face of hardware issues.
Meanwhile, I'm not so much afraid of data-pattern-triggered btrfs bugs,
at least not directly, as I am of the possibility of some new
development-version btrfs bug eating my working fs, and then, when I
boot to the backup to try to recover, eating it too, if it too is btrfs.
If that backup is instead my trusted reiserfs, which I've found so
stable over the years, then that new btrfs bug shouldn't affect it.
While I'll discover that my attempt to restore from reiserfs to btrfs
doesn't work, due to that btrfs bug again eating the btrfs as soon as I
load it up, at least the reiserfs copy of the same data should still be
safe, since the btrfs bug wouldn't affect it.
In that regard, having reiserfs on the second-level backups on spinning
rust, while running btrfs on the working copy and primary-level backups
on ssd, serves as a firewall against bugs from the still-under-
development btrfs eating first my working copy, then the primary backup,
then the secondary backup and beyond. The secondary and beyond backups
sit behind the firewall on a totally different filesystem, which
shouldn't be susceptible to the same bugs.
Another risk reduction I take some comfort in is the fact that I keep my
rootfs mounted read-only by default, only remounting it read-write for
updates to packages or configuration. Since the rootfs is thus likely to
be read-only mounted at the time of a crash, and will almost certainly
be read-only mounted if I'm booting from backup in order to restore a
damaged working filesystem, it's even more unlikely that a bug that
might destroy the working copy could destroy the backup as I boot from
it to try to restore the working copy. =:^) Of course if the bug
triggers on /home or the like, it could still destroy the backup /home
as well, at least the primary btrfs backup, but in that case, chances
remain quite good that the read-only rootfs, with all the usual recovery
tools, etc, will remain intact and usable to rescue the other
filesystems.
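The remount dance itself is nothing fancy, something like this (the
fstab line is an illustrative sketch, with a placeholder UUID, not my
actual entry):

```shell
# /etc/fstab: root mounted read-only by default.
#   UUID=xxxx-xxxx  /  btrfs  ro,noatime  0 0

# Flip to read-write only for the duration of an update:
mount -o remount,rw /
# ... update packages/config ...
mount -o remount,ro /
```

A filesystem that's read-only at the block level when a bug (or a crash)
hits simply has far fewer ways to get scribbled on.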
And yet another risk reduction is the fact that I run totally separate
and independent filesystems on separate partitions, not subvolumes
sharing one partition and one common set of base filesystem structures,
which is what a lot of btrfs users are choosing to do. If btrfs suffers
a structure-destroying bug, they'll lose and have to restore everything,
while I'll lose and have to restore just the one filesystem, with its
rather more limited dataset.
Meanwhile, those partitions are all on dual-copy, checksummed gpt
(nowadays I use gpt even on USB sticks), with an identical partitioning
scheme on each of two different physical devices. So if one partition
table gets corrupted, the other copy will kick in. And if both
partition tables, at opposite ends of the same device, get corrupted,
presumably in some power-failure accident while I was actually editing
them or something, then there's still the other physical device I can
boot from, using its partition table and gptfdisk to redo the corrupted
partition table on the damaged device.
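gptfdisk's sgdisk makes that recovery routine scriptable; a sketch, with
placeholder device names and backup paths:

```shell
# Keep a file backup of each device's GPT somewhere safe:
sgdisk --backup=/root/gpt-sdX.bak /dev/sdX

# If both on-disk copies of a device's table get mangled, restore
# from the file backup:
sgdisk --load-backup=/root/gpt-sdX.bak /dev/sdX

# Or clone the layout from the surviving, identically-partitioned
# device (note: -R copies ONTO the device named with -R), then
# randomize the GUIDs so the two devices don't collide:
sgdisk -R=/dev/sdX /dev/sdY
sgdisk -G /dev/sdX
```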
If all that fails and my final backup, an external spinning rust device
not normally even attached to the computer, fails as well, say due to a
fire or flood or something, I figure at that point I'll have rather more
important things to worry about, like just surviving and finding a new
home, than what happened to all those checksum-verified both logically
and physically redundant layers of backup. And when I do get around to
worrying about computers again, well, the really valuable stuff's in my
head anyway, and if *THAT* copy dies too, well, come visit me in the
mental ward or cemetery!
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman