From: Steven Haigh <netwiz@crc.id.au>
To: Martin Steigerwald <martin@lichtvoll.de>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: compress=lzo safe to use?
Date: Mon, 12 Sep 2016 11:00:45 +1000
Message-ID: <bcd02f62aa49ca4f17bb94e0253f048a@crc.id.au>
In-Reply-To: <4096253.hu8ZAHGEqT@merkaba>
On 2016-09-12 05:48, Martin Steigerwald wrote:
> On Sunday, 26 June 2016, 13:13:04 CEST, Steven Haigh wrote:
>> On 26/06/16 12:30, Duncan wrote:
>> > Steven Haigh posted on Sun, 26 Jun 2016 02:39:23 +1000 as excerpted:
>> >> In every case, it was a flurry of csum error messages, then instant
>> >> death.
>> >
>> > This is very possibly a known bug in btrfs, that occurs even in raid1
>> > where a later scrub repairs all csum errors. While in theory btrfs raid1
>> > should simply pull from the mirrored copy if its first try fails checksum
>> > (assuming the second one passes, of course), and it seems to do this just
>> > fine if there's only an occasional csum error, if it gets too many at
>> > once, it *does* unfortunately crash, despite the second copy being
>> > available and being just fine as later demonstrated by the scrub fixing
>> > the bad copy from the good one.
>> >
>> > I'm used to dealing with that here any time I have a bad shutdown (and
>> > I'm running live-git kde, which currently has a bug that triggers a
>> > system crash if I let it idle and shut off the monitors, so I've been
>> > getting crash shutdowns and having to deal with this unfortunately often,
>> > recently). Fortunately I keep my root, with all system executables, etc,
>> > mounted read-only by default, so it's not affected and I can /almost/
>> > boot normally after such a crash. The problem is /var/log and /home
>> > (which has some parts of /var that need to be writable symlinked into /
>> > home/var, so / can stay read-only). Something in the normal after-crash
>> > boot triggers enough csum errors there that I often crash again.
>> >
>> > So I have to boot to emergency mode and manually mount the filesystems in
>> > question, so nothing's trying to access them until I run the scrub and
>> > fix the csum errors. Scrub itself doesn't trigger the crash, thankfully,
>> > and once it has repaired all the csum errors due to partial writes on one
>> > mirror that either were never made or were properly completed on the
>> > other mirror, I can exit emergency mode and complete the normal boot (to
>> > the multi-user default target). As there's no more csum errors then
>> > because scrub fixed them all, the boot doesn't crash due to too many such
>> > errors, and I'm back in business.
>> >
>> >
>> > Tho I believe at least the csum bug that affects me may only trigger if
>> > compression is (or perhaps has been in the past) enabled. Since I run
>> > compress=lzo everywhere, that would certainly affect me. It would also
>> > explain why the bug has remained around for quite some time as well,
>> > since presumably the devs don't run with compression on enough for this
>> > to have become a personal itch they needed to scratch, thus its remaining
>> > untraced and unfixed.
>> >
>> > So if you weren't using the compress option, your bug is probably
>> > different, but either way, the whole thing about too many csum errors at
>> > once triggering a system crash sure does sound familiar, here.
>>
>> Yes, I was running the compress=lzo option as well... Maybe here lies a
>> common problem?
>
> Hmm… I found this thread through a reference on the Debian wiki page on
> BTRFS¹.
>
> I have used compress=lzo on BTRFS RAID 1 since April 2014 and have never
> found an issue. Steven, your filesystem wasn't RAID 1 but RAID 5 or 6?

Yes, I was using RAID6 - and it has had a track record of eating data.
There are lots of problems with the implementation / correctness of
RAID5/6 parity - which I'm pretty sure haven't been nailed down yet. The
recommendation at the moment is just not to use RAID5 or RAID6 modes of
BTRFS. The last I heard, if you were using RAID5/6 in BTRFS, the
recommended action was to migrate your data to a different profile or a
different FS.
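As a rough sketch of what migrating to a different profile could look like
with btrfs-progs (this is not taken from the thread): the mount point
/mnt/data and the raid1 target are assumptions, and the `run` helper is
hypothetical, there only so that the script prints the commands unless
DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: convert data and metadata of a mounted btrfs filesystem
# from RAID5/6 to another profile. MNT and TARGET are assumed values.
MNT="${MNT:-/mnt/data}"       # hypothetical mount point
TARGET="${TARGET:-raid1}"     # assumed target profile
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print the commands

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show which profiles are currently in use.
run btrfs filesystem df "$MNT"
# Rewrite all data (-d) and metadata (-m) chunks into the target profile.
run btrfs balance start -dconvert="$TARGET" -mconvert="$TARGET" "$MNT"
# Report whether a balance is in progress (useful from a second shell).
run btrfs balance status "$MNT"
```

A current backup beforehand would be prudent, and the target profile needs
enough devices and free space (raid1 needs at least two devices).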
> I just want to assess whether compress=lzo might be dangerous to use in
> my setup. Actually, right now I would like to keep using it, since I think
> at least one of the SSDs does not compress. And… well… /home and /, where
> I use it, are both quite full already.

I don't believe the compress=lzo option by itself was a problem - but it
*may* have an impact on the RAID5/6 parity problems? I'd be guessing
here, but am happy to be corrected.
--
Steven Haigh
Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897