From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: "Niccolò Belli" <darkbasic@linuxsystems.it>
Cc: linux-btrfs@vger.kernel.org,
Clemens Eisserer <linuxhippy@gmail.com>,
"Austin S. Hemmelgarn" <ahferroin7@gmail.com>,
Patrik Lundquist <patrik.lundquist@gmail.com>,
Chris Murphy <lists@colorremedies.com>,
Qu Wenruo <quwenruo@cn.fujitsu.com>,
Omar Sandoval <osandov@osandov.com>,
1i5t5.duncan@cox.net
Subject: Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
Date: Thu, 12 May 2016 12:48:17 -0400
Message-ID: <20160512164817.GD15597@hungrycats.org>
In-Reply-To: <c5fa6a35-f6bd-4546-8297-7f6225696157@linuxsystems.it>
On Thu, May 12, 2016 at 04:35:24PM +0200, Niccolò Belli wrote:
> When doing the btrfs check I also always do a btrfs scrub and it never found
> any error. Once it didn't manage to finish the scrub because of:
> BTRFS critical (device dm-0): corrupt leaf, slot offset bad:
> block=670597120,root=1, slot=6
> and btrfs scrub status reported "was aborted after 00:00:10".
>
> Talking about scrub I created a systemd timer to run scrub hourly and I
> noticed 2 *uncorrectable* errors suddenly appeared on my system. So I
> immediately re-ran the scrub just to confirm it and then I rebooted into the
> Arch live usb and ran btrfs check: the metadata were perfect. So I ran
> btrfs scrub from the live usb and there were no errors at all! I rebooted
> into my system and ran scrub once again and the uncorrectable errors
> were really gone! It happened two times in the past few days.
That's what a RAM corruption problem looks like when you run btrfs scrub.
Maybe the RAM itself is OK, but *something* is scribbling on it.
Does the Arch live usb use the same kernel as your normal system?
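One way to look for the same symptom outside of scrub is to hash a large
file several times in a row and check that every pass agrees. The sketch
below is only an illustration of that idea, not a btrfs tool: the function
names are hypothetical, and it assumes repeated reads are mostly served
from the page cache, so a mismatch would point at memory or the data path
rather than at the disk.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def distinct_digests(path, passes=5):
    """Hash the same file `passes` times and return the set of digests seen.

    One element means every pass agreed. More than one means the bytes
    changed between reads, which should not happen for an unmodified file.
    """
    return {file_digest(path) for _ in range(passes)}
```

Calling distinct_digests("/path/to/large/file") on a file nothing else is
writing to should return a set with exactly one digest. A clean result
proves little, since intermittent corruption can easily miss a handful of
passes, but more than one digest is a strong hint that something is
scribbling on memory.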
> Almost no patches get applied by the Arch kernel team:
> https://git.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/linux
> At the moment the only one is a harmless
> "change-default-console-loglevel.patch".
Did you try an older (or newer) kernel? I've been running 4.5.x on a few
canary systems, but so far none of them have survived more than a day.
Contrast with 4.1.x and 4.4.x, which run for months between reboots
for me. Maybe there's a regression in 4.5.x, maybe I did something
wrong in my config or build, or maybe I just have too few data points
to draw any conclusions, but my data so far is telling me to stay on
4.4.x until something changes (i.e. wait for a 4.5.x stable update or
skip directly to 4.6.x). :-/
It's always worth trying a different kernel version, if only to eliminate regression as a
possible root cause early. In practice, every mainline kernel release
has a regression that affects at least one combination of config options
and hardware. btrfs is stable enough now that you can be running one
or two releases behind to avoid a problem elsewhere in the kernel.
> Another option will be crashing it with my car's wheels hoping that because
> of my comprehensive insurance policy Dell will give me the next model (the
> Skylake one) as a replacement (hoping that it will not suffer from the same
> issue of the Broadwell one).
The first rule of Insurance Fraud Club: don't talk about Insurance
Fraud Club. ;)
It's possible there's a problem that affects only very specific chipsets.
You seem to have eliminated RAM in isolation, but there could be a problem
in the kernel that affects only your chipset.