From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: fdmanana@gmail.com
Cc: "dsterba@suse.cz" <dsterba@suse.cz>, Waxhead <waxhead@online.no>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Is stability a joke? (wiki updated)
Date: Mon, 12 Sep 2016 13:42:18 -0400
Message-ID: <6315ec53-40e8-d32f-a3e4-8589b6a5e3e1@gmail.com>
In-Reply-To: <CAL3q7H7TP=XtUZkEcAfOc-qTHvncU=coTiB+EeYUEOFYeSk2-A@mail.gmail.com>

On 2016-09-12 13:29, Filipe Manana wrote:
> On Mon, Sep 12, 2016 at 5:56 PM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>> On 2016-09-12 12:27, David Sterba wrote:
>>>
>>> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
>>>>>
>>>>> I therefore would like to propose that some sort of feature / stability
>>>>> matrix for the latest kernel is added to the wiki preferably somewhere
>>>>> where it is easy to find. It would be nice to archive old matrices as
>>>>> well in case someone runs on a bit older kernel (we who use Debian tend
>>>>> to like older kernels). In my opinion it would make things a bit easier
>>>>> and perhaps a bit less scary too. Remember if you get bitten badly once
>>>>> you tend to stay away from it all just in case, if you on the other
>>>>> hand know what bites you can safely pet the fluffy end instead :)
>>>>
>>>>
>>>> Somebody has put that table on the wiki, so it's a good starting point.
>>>> I'm not sure we can fit everything into one table, some combinations do
>>>> not bring new information and we'd need n-dimensional matrix to get the
>>>> whole picture.
>>>
>>>
>>> https://btrfs.wiki.kernel.org/index.php/Status
>>
>>
>> Some things to potentially add based on my own experience:
>>
>> Things listed as TBD status:
>> 1. Seeding: Seems to work fine the couple of times I've tested it, however
>> I've only done very light testing, and the whole feature is pretty much
>> undocumented.
>> 2. Device Replace: Works perfectly as long as the filesystem itself is not
>> corrupted, all the component devices are working, and the FS isn't using any
>> raid56 profiles.  Works fine if only the device being replaced is failing.
>> I've not done much testing WRT replacement when multiple devices are
>> suspect, but what I've done seems to suggest that it might be possible to
>> make it work, but it doesn't currently.  On raid56 it sometimes works fine,
>> sometimes corrupts data, and sometimes takes an insanely long time to
>> complete (putting data at risk from subsequent failures while the replace is
>> running).
>> 3. Balance: Works perfectly as long as the filesystem is not corrupted and
>> nothing throws any read or write errors.  IOW, only run this on a generally
>> healthy filesystem.  Similar caveats to those for replace with raid56 apply
>> here too.
>> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS
>> is healthy.
>
> Virtually all other features work fine if the fs is healthy...
I would add more, but I don't often have the time to test broken 
filesystems...

TBH though, that's most of the issue I see with BTRFS in general at the 
moment.  RAID5/6 works fine, as long as all the devices keep working and 
you don't try to replace them and don't lose power.  Qgroups appear to 
work fine as long as no other bug shows up (other than the issues with 
accounting and returning ENOSPC instead of EDQUOT).  We do so much 
testing on pristine filesystems, but most of the utilities and less 
widely used features have had near zero testing on filesystems that are 
in bad shape.  If you pay attention, many (possibly most?) of the 
recently reported bugs are from broken (or poorly curated) filesystems, 
not some random kernel bug.  New features are nice, but they generally 
don't improve stability, and for BTRFS to be truly production ready 
outside of constrained environments like Facebook, it needs to not choke 
on encountering a FS with some small amount of corruption.
>
>>
>> Other stuff:
>> 1. Compression: The specific known issue is that compressed extents don't
>> always get recovered properly on failed reads when dealing with lots of
>> failed reads.  This can be demonstrated by generating a large raid1
>> filesystem image with huge numbers of small (1MB) readily compressible
>> files, then putting that on top of a dm-flaky or dm-error target set to give
>> a high read-error rate, then mounting and running cat `find .` > /dev/null
>> from the top level of the FS multiple times in a row.
>
>> 2. Send: The particular edge case appears to be caused by metadata
>> corruption on the sender and results in send choking on the same file every
>> time you try to run it.  The quick fix is to copy the contents of the file
>> to another file and rename that over the original.
>
> I don't remember having seen such a case at least for the last 2 or 3
> years, all the problems I've seen/solved or seen fixes from others
> were all related to bugs in the send algorithm and definitely not any
> metadata corruption.
> So I wonder what evidence you have about this.
For the compression related issues, I can still reproduce it, but it 
takes a while.
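
For anyone who wants to try the read side of that reproduction, here is a
rough sketch in Python instead of the shell one-liner quoted above. It is
not the exact procedure I use. It only covers generating small, readily
compressible files and reading every one of them back while counting read
errors. Creating the raid1 filesystem, mounting it with compression, and
putting it on top of dm-flaky or dm-error are not done here and have to be
set up separately. The function names, file names, and sizes are made up
for illustration. Walking the tree also avoids the argument-length limit
that cat `find .` can hit with huge numbers of files.

```python
import os
import tempfile


def make_compressible_files(root, count, size=1024 * 1024):
    """Create `count` files of `size` bytes, each a repeating text pattern."""
    pattern = b"btrfs-compress-test\n"
    payload = (pattern * (size // len(pattern) + 1))[:size]
    for i in range(count):
        # Hypothetical naming scheme, chosen only for this sketch.
        with open(os.path.join(root, f"file{i:06d}.dat"), "wb") as f:
            f.write(payload)


def read_all(root, chunk=1024 * 1024):
    """Read every file under `root`; return (files_ok, files_failed)."""
    ok, failed = 0, 0
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    while f.read(chunk):
                        pass  # discard the data, like "> /dev/null"
                ok += 1
            except OSError as exc:
                # A failed read (for example EIO) is counted, not fatal.
                failed += 1
                print(f"read error on {path}: {exc}")
    return ok, failed


def demo():
    """Tiny self-check on a throwaway directory, no error injection."""
    with tempfile.TemporaryDirectory() as root:
        make_compressible_files(root, count=4, size=64 * 1024)
        return read_all(root)
```

On a real test you would call make_compressible_files() once on the
mounted filesystem, then call read_all() several times in a row and watch
whether the failure count changes between passes. Keep in mind that
repeated passes may be served from the page cache unless caches are
dropped or the filesystem is remounted in between.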

As for the send issues, I do still see these on rare occasion, but only 
on 2+ year old filesystems, and I think the last time I saw it happen 
was more than 3 months ago.
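
The quick fix for send mentioned in the quoted text (copy the contents of
the file to another file and rename that over the original) can be
sketched roughly as below. This is only an illustration under some
assumptions: rewrite_file is a made-up helper name, the copy is a plain
read/write copy rather than a reflink, and only mode and timestamps are
carried over (not ownership, and hard links to the old file are not
updated). I have not verified that this particular script clears the
problem; it just shows the steps.

```python
import os
import shutil
import tempfile


def rewrite_file(path):
    """Copy `path` into a new file in the same directory, then rename
    the copy over the original so the name points at fresh extents."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".rewrite-")
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            shutil.copyfileobj(src, dst)
            dst.flush()
            os.fsync(dst.fileno())
        shutil.copystat(path, tmp)  # keep permission bits and timestamps
        os.replace(tmp, path)       # rename the copy over the original
    except BaseException:
        # Leave the original untouched and clean up the partial copy.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Usage would be rewrite_file("/path/to/the/file/send/chokes/on") on the
sending side, followed by a new snapshot before retrying the send.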

