From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: compress=lzo safe to use?
Date: Mon, 12 Sep 2016 04:36:07 +0000 (UTC)
Message-ID: <pan$56316$8902276c$b806652$f5dac40c@cox.net>
In-Reply-To: 6ef80ffd-6a56-3538-0778-a99cb4b9851e@mendix.com
Hans van Kranenburg posted on Sun, 11 Sep 2016 22:49:58 +0200 as
excerpted:
> So, you can use a lot of compress without problems for years.
>
> Only if your hardware is starting to break in a specific way, causing
> lots and lots of checksum errors, the kernel might not be able to handle
> all of them at the same time currently.
>
> The compress might be super stable itself, but in this case another part
> of the filesystem is not perfectly able to handle certain failure
> scenarios involving it.
Well put.
In my case the problems were triggered by exactly two things, tho there
are obviously other ways of triggering the same issues, including a crash
in the middle of a commit, with one copy of the raid1 already updated
while the other is still being written:
1) I first discovered the problem when one of my pair of ssds was going
bad. Because I had btrfs raid1 and could normally scrub-fix things, and
because I had backups anyway, I chose to continue running it for some
time, just to see how it handled things, as more and more sectors became
unwritable and were replaced by spares. By the end I had several MiB
worth of spares in-use, altho SMART reported I had only used about 15% of
the available spares, but by then it was getting bad enough and the
newness had worn off, so I just replaced it and got rid of the hassle.
But as a result of the above, I had a *LOT* of practice with btrfs
recovery, mostly running scrub.
And what I found was that if btrfs raid1 encounters too many checksum
errors in compressed data it will crash btrfs and the kernel, even when
it *SHOULD* recover from the other device because it has a good copy, as
demonstrated by the fact that after a reboot, I could run a scrub and fix
everything, no uncorrected errors at all.
At first I thought it was just the way btrfs worked -- that it could
handle a few checksum errors but not too many at once. I had no idea it
was compression related. But nobody else seemed to mention the problem,
which I thought a bit strange, until someone /did/ mention it, and
furthermore, actually tested both compressed and uncompressed btrfs, and
found the problem only when btrfs was reading compressed data. If the
data wasn't compressed, btrfs went ahead and read the second copy
correctly, without crashing the system, every time.
The extra kink in this is that at the time, I had a boot-time service
set up to cache (via cat > /dev/null) a bunch of files in a particular
directory. This particular directory is a cache for news archives, with
articles on some groups going back over a decade to 2002, and my news
client (pan) is slow to start up with several gigs of cached messages like
that, so I had the boot-time service pre-cache everything, so by the time
I started X and pan, it would be done or nearly so and I'd not have to
wait for pan to start up.
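For anyone wanting to reproduce that kind of read load, here's a rough
sketch of what such a pre-cache job does, written in Python rather than
the shell one-liner I actually used. The function name and the chunk size
are made up for illustration. It simply reads every regular file under a
directory and throws the data away, so the pages land in the kernel's
cache. On btrfs each of those reads is checksum-verified, which is why a
job like this touches every bad block in the directory in one go.

```python
import os


def precache(directory, chunk_size=1 << 20):
    """Read every regular file under `directory` and discard the data.

    Returns a (files_read, bytes_read) tuple. The name and the 1 MiB
    chunk size are illustrative assumptions, not from the original job.
    """
    files_read = 0
    bytes_read = 0
    for root, _dirs, names in os.walk(directory):
        for name in names:
            path = os.path.join(root, name)
            # Skip sockets, FIFOs, broken symlinks and the like.
            if not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
            except OSError:
                # An unreadable file is skipped rather than aborting the
                # whole run.
                continue
            files_read += 1
    return files_read, bytes_read
```

Calling precache() on the news cache directory at boot would have the
same effect as the cat-to-/dev/null loop, just with a count of what it
read.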
The problem was that many of the new files were in this directory, and
all that activity tended to hit the going-bad sectors on that ssd rather
frequently, making one copy often bad. Additionally, these are mostly
text messages, so they compress quite well, meaning compress=lzo would
trigger compression on many of them.
And because I had it reading them at boot, the kernel tended to overload
on checksum errors before it finished booting, far more frequently than
it would have otherwise. Of course, that would crash the system before I
could get a login in order to run btrfs scrub and fix the problem.
What I had to do then was boot to rescue mode, with the filesystems
mounted but before normal services (including this caching service) ran,
run the scrub from there, and then continue boot, which would then work
just fine because I'd fixed all the checksum errors.
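The scrub step itself is nothing fancy. As a rough sketch, assuming the
filesystem is already mounted and you have root, it amounts to running
btrfs scrub start in the foreground and waiting for it to finish before
letting the boot continue. The Python wrapper and its function names
below are purely illustrative; from a rescue shell I just typed the
command directly.

```python
import subprocess


def scrub_command(mountpoint):
    """Build the argv for a foreground scrub of the btrfs at `mountpoint`."""
    # -B keeps the scrub in the foreground, so the caller blocks until
    # it has finished instead of returning immediately.
    return ["btrfs", "scrub", "start", "-B", mountpoint]


def run_scrub(mountpoint):
    """Run the scrub (needs root) and return the command's exit status.

    A nonzero status means something went wrong; see btrfs-scrub(8)
    for what the individual values mean.
    """
    return subprocess.run(scrub_command(mountpoint), check=False).returncode
```

Something like run_scrub("/") from the rescue environment, followed by
resuming the normal boot only if it returns 0, matches the manual
procedure described above.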
But, as I said I eventually got tired of the hassle and just replaced the
failing device. Btrfs replace worked nicely. =:^)
2a) My second trigger is that I've found that multiple devices, as in
multi-device btrfs, but also back when I used to run mdraid, don't always
resume from suspend-to-RAM very well. Often one device takes longer to
wake up than the other(s), and the kernel will try to resume while one
still isn't responding properly. (FWIW, I ran into this problem on
spinning rust back on mdraid, but I see it now on ssds on btrfs as well,
so it seems to be a common issue, which probably remains relatively
obscure I'd guess because relatively few people with multi-device btrfs
or mdraid do suspend-to-RAM.)
The result is that btrfs will try to write to the remaining device(s),
getting them out of sync with the one that isn't responding properly
yet. Ultimately this leads to a crash if I don't catch it and complete a
controlled shutdown before that, and sometimes I see the same crash-on-
boot-due-to-too-many-checksum-errors problem I saw with #1. I no longer
have that caching job running at boot and thus don't see it as often, but
it still happens occasionally. Again, once I boot to rescue mode and run
scrub, it fixes the problem and I can resume the normal mode boot without
further issue.
So I pretty much quit suspending to RAM, at least for any longer period,
and just shut down and reboot, now. With systemd and ssds, the boot
doesn't take significantly longer anyway, tho it does mean I can't simply
resume and pick up where I was; I have to reopen my work, etc.
2b) Closely related to #2a and most recent, since I'm no longer trying to
suspend to RAM, I think one of the ssds now has a bad backup capacitor or
something, as if I leave it idle for too long it'll fail to respond once
I start trying to use it again. Same story, the other device gets writes
that the unresponsive device is missing, and eventually if I don't reboot
I crash. Upon reboot, again, if there were too many things written to
the device that stayed up that didn't make it to the other one, it can
trigger a crash due to checksum failure. However, if I can get a command
prompt, either because it boots all the way or because I boot to rescue
mode, I can run a scrub and update the bad device from the good one, and
then everything works fine once again... until the device goes
unresponsive, again.
Again, I once thought all this was just the stage at which btrfs was,
until I found out that it doesn't seem to happen if btrfs compression
isn't being used. Something about the way it recovers from checksum
errors on compressed data differs from the way it recovers from checksum
errors on uncompressed data, and there's a bug in the compressed data
processing path. But beyond that, I'm not a dev and it gets a bit fuzzy,
which also explains why I've not gone code diving and submitted patches
to try to fix it, myself.
But if I'm correct, it probably doesn't matter what the compression type
is, only how much of it there is. So compress-force would tend to
trigger the issue far more frequently than simply compress, unless of
course your use-case is a corner-case like my trying to read all those
compressible text messages into cache at boot was, but compress (or
compress-force) =lzo vs =zlib shouldn't matter.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman