From: Hugo Mills <hugo@carfax.org.uk>
To: Chris Murphy <lists@colorremedies.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: 6TB partition, Data only 2TB - aka When you haven't hit the "usual" problem
Date: Mon, 11 Jan 2016 23:07:27 +0000
Message-ID: <20160111230727.GF422@carfax.org.uk>
In-Reply-To: <CAJCQCtQBUFkPc_Tqfx5HCbVvNEhBbGhxRzHAZ+j75e1f74NUjg@mail.gmail.com>
On Mon, Jan 11, 2016 at 03:39:43PM -0700, Chris Murphy wrote:
> On Mon, Jan 11, 2016 at 3:30 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> > On Mon, Jan 11, 2016 at 03:20:36PM -0700, Chris Murphy wrote:
> >> On Mon, Jan 11, 2016 at 3:10 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> >> > On Mon, Jan 11, 2016 at 02:31:41PM -0700, Chris Murphy wrote:
> >> >> On Mon, Jan 11, 2016 at 2:03 AM, Hugo Mills <hugo@carfax.org.uk> wrote:
> >> >> > On Sun, Jan 10, 2016 at 05:13:28PM -0700, Chris Murphy wrote:
> >> >> >> On Sat, Jan 9, 2016 at 2:04 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> >> >> >> > On Sat, Jan 09, 2016 at 09:59:29PM +0100, cheater00 . wrote:
> >> >> >> >> OK. How do we track down that bug and get it fixed?
> >> >> >> >
> >> >> >> > I have no idea. I'm not a btrfs dev, I'm afraid.
> >> >> >> >
> >> >> >> > It's been around for a number of years. None of the devs has, I
> >> >> >> > think, had the time to look at it. When Josef was still (publicly)
> >> >> >> > active, he had it second on his list of bugs to look at for many
> >> >> >> > months -- but it always got trumped by some new bug that could cause
> >> >> >> > data loss.
> >> >> >>
> >> >> >>
> >> >> >> Interesting. I did not know of this bug. It's pretty rare.
> >> >> >
> >> >> > Not really. It shows up maybe on average once a week on IRC. It
> >> >> > gets reported much less on the mailing list.
> >> >>
> >> >> Is there a pattern? Does it only happen at a 2TiB threshold?
> >> >
> >> > No, and no.
> >> >
> >> > There is, as far as I can tell from some years of seeing reports of
> >> > this bug, no correlation with RAID level, hardware, OS, kernel
> >> > version, FS size, usage of the FS at failure, or allocation level of
> >> > either data or metadata at failure.
> >> >
> >> > I haven't tried correlating with the phase of the moon or the
> >> > losses on Lloyds Register yet.
> >>
> >> Huh. So it's goofy cakes.
> >>
> >> This is specifically where btrfs_free_extent produces errno -28 no
> >> space left, and then the fs goes read-only?
> >
> > The symptoms I'm using for a diagnosis of this bug are that the FS
> > runs out of (usually data) space when there's still unallocated space
> > remaining that it could use for another block group.
> >
> > Forced RO isn't usually a symptom, although the FS can get into a
> > state where you can't modify it (as distinct from being explicitly
> > read-only).
> >
> > Block-group level operations, like balance, device delete, device
> > add sometimes seem to have some kind of (usually small) effect on the
> > point at which the error occurs. If you hit the problem and run a
> > balance, you might end up making things worse by a couple of
> > gigabytes, or making things better by the same amount, or having no
> > effect at all.
>
> Are there any compile time options not normally set that would help find it?
> # CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
> # CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
> # CONFIG_BTRFS_DEBUG is not set
> # CONFIG_BTRFS_ASSERT is not set
>
> Once it starts to happen, it sounds like it's straightforward to
> reproduce in a short amount of time. I'm kinda surprised I've never
> run into this.

It does sometimes have a repeating nature: I'm reasonably sure
we've seen a few people get it repeatedly on different filesystems.
This might point at a particular workload needed to trigger it. (Or
just bad luck / statistical likelihood). Some people have never hit
it.

There is (or at least, was) an ENOSPC debugging option. I think
that's the enospc_debug mount option. That's probably the most useful
one, but the range of usefulness of existing debug output may be very
small. :)
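
As a rough illustration of the symptom described earlier in the thread
(running out of space while there is still unallocated space that could
hold another block group), here is a minimal Python sketch. The function
name, the 1 GiB block-group size and the example figures are assumptions
made for illustration only; the real numbers would have to come from
something like `btrfs filesystem usage`, and real block-group sizes vary
with filesystem size and profile.

```python
GIB = 1024 ** 3

# Assumption: a new data block group needs roughly 1 GiB of unallocated
# space. This is a placeholder, not a value read from the filesystem.
ASSUMED_BLOCK_GROUP_BYTES = 1 * GIB


def matches_symptom(device_size, device_allocated, got_enospc,
                    block_group_bytes=ASSUMED_BLOCK_GROUP_BYTES):
    """Hypothetical helper: True if ENOSPC was reported even though
    another block group would still fit in the unallocated space."""
    unallocated = device_size - device_allocated
    return got_enospc and unallocated >= block_group_bytes


# Figures loosely modelled on the thread subject: a 6 TB device with
# about 2 TiB allocated, on which writes fail with ENOSPC.
print(matches_symptom(6000 * 10**9, 2 * 1024 * GIB, got_enospc=True))
```

A filesystem that is simply full (no unallocated space left) would not
match, which is the distinction from the "usual" problem.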

(Sorry for the vague nature of this reply -- it's been a very long
day).

Hugo.
--
Hugo Mills | "What are we going to do tonight?"
hugo@... carfax.org.uk | "The same thing we do every night, Pinky. Try to
http://carfax.org.uk/ | take over the world!"
PGP: E2AB1DE4 |