From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: "Niccolò Belli" <darkbasic@linuxsystems.it>
Cc: linux-btrfs@vger.kernel.org,
Clemens Eisserer <linuxhippy@gmail.com>,
"Austin S. Hemmelgarn" <ahferroin7@gmail.com>,
Patrik Lundquist <patrik.lundquist@gmail.com>,
Chris Murphy <lists@colorremedies.com>,
Qu Wenruo <quwenruo@cn.fujitsu.com>,
Omar Sandoval <osandov@osandov.com>,
1i5t5.duncan@cox.net
Subject: Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
Date: Thu, 12 May 2016 12:48:17 -0400
Message-ID: <20160512164817.GD15597@hungrycats.org>
In-Reply-To: <c5fa6a35-f6bd-4546-8297-7f6225696157@linuxsystems.it>
On Thu, May 12, 2016 at 04:35:24PM +0200, Niccolò Belli wrote:
> When doing the btrfs check I also always do a btrfs scrub and it never found
> any error. Once it didn't manage to finish the scrub because of:
> BTRFS critical (device dm-0): corrupt leaf, slot offset bad:
> block=670597120,root=1, slot=6
> and btrfs scrub status reported "was aborted after 00:00:10".
>
> Talking about scrub I created a systemd timer to run scrub hourly and I
> noticed 2 *uncorrectable* errors suddenly appeared on my system. So I
> immediately re-run the scrub just to confirm it and then I rebooted into the
> Arch live usb and runned btrfs check: the metadata were perfect. So I runned
> btrfs scrub from the live usb and there were no errors at all! I rebooted
> into my system and runned scrub once again and the uncorrectable errors
> where really gone! It happened two times in the past few days.
That's what a RAM corruption problem looks like when you run btrfs scrub.
Maybe the RAM itself is OK, but *something* is scribbling on it.
Does the Arch live usb use the same kernel as your normal system?
> Almost no patches get applied by the Arch kernel team:
> https://git.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/linux
> At the moment the only one is an harmless
> "change-default-console-loglevel.patch".
Did you try an older (or newer) kernel? I've been running 4.5.x on a few
canary systems, but so far none of them have survived more than a day.
Contrast with 4.1.x and 4.4.x, which run for months between reboots
for me. Maybe there's a regression in 4.5.x, maybe I did something
wrong in my config or build, or maybe I just have too few data points
to draw any conclusions, but my data so far is telling me to stay on
4.4.x until something changes (i.e. wait for a 4.5.x stable update or
skip directly to 4.6.x). :-/
It's always worth trying this if only to eliminate regression as a
possible root cause early. In practice, every mainline kernel release
has a regression that affects at least one combination of config options
and hardware. btrfs is stable enough now that you can be running one
or two releases behind to avoid a problem elsewhere in the kernel.
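A minimal sketch, in Python, of checking whether the running kernel belongs
to a series you have decided to avoid. The SUSPECT_SERIES set and the helper
names are hypothetical; the 4.5 entry only mirrors my own experience above
and is not a confirmed regression.

```python
import platform
import re


def kernel_series(release):
    """Return (major, minor) parsed from a release string like '4.5.4-1-ARCH'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % (release,))
    return int(match.group(1)), int(match.group(2))


# Hypothetical list of series to avoid, based on local observations only.
SUSPECT_SERIES = {(4, 5)}


def is_suspect(release):
    """True if the release string falls in one of the suspect series."""
    return kernel_series(release) in SUSPECT_SERIES


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` reports on Linux
    try:
        verdict = "suspect" if is_suspect(running) else "not on the suspect list"
    except ValueError:
        verdict = "unparseable"
    print("%s: %s" % (running, verdict))
```

Running the same script from the live usb and from the installed system
shows at a glance whether the two environments are on different kernel
series.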
> Another option will be crashing it with my car's wheels hoping that because
> of my comprehensive insurance policy Dell will give me the next model (the
> Skylake one) as a replacement (hoping that it will not suffer from the same
> issue of the Broadwell one).
The first rule of Insurance Fraud Club: don't talk about Insurance
Fraud Club. ;)
It's possible there's a problem that affects only very specific chipsets.
You seem to have eliminated RAM in isolation, but there could be a problem
in the kernel that affects only your chipset.