Re: Is stability a joke? - Nicholas D Steeves


From: Nicholas D Steeves <nsteeves@gmail.com>
To: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
Cc: Hugo Mills <hugo@carfax.org.uk>, Waxhead <waxhead@online.no>,
	Martin Steigerwald <martin@lichtvoll.de>,
	linux-btrfs@vger.kernel.org
Subject: Re: Is stability a joke?
Date: Wed, 14 Sep 2016 21:05:52 -0400
Message-ID: <20160915010552.GC32452@DigitalMercury.dynalias.net>
In-Reply-To: <be04c51d-c35d-39fe-c5f7-a7ab13d72cc5@gmail.com>


On Mon, Sep 12, 2016 at 08:20:20AM -0400, Austin S. Hemmelgarn wrote:
> On 2016-09-11 09:02, Hugo Mills wrote:
> >On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:
> >>Martin Steigerwald wrote:
> >>>On Sunday, 11 September 2016, 13:43:59 CEST, Martin Steigerwald wrote:
> >>>>>>Thing is: this just seems to be a matrix of when a feature was
> >>>>>>implemented, not of when it is considered to be stable. I think this
> >>>>>>could be done with colors or so, like red for not supported, yellow
> >>>>>>for implemented, and green for production ready.
> >>>>>Exactly, just like the Nouveau matrix. It clearly shows what you can
> >>>>>expect from it.
> >>>I mentioned this matrix as a good *starting* point. And I think it would be
> >>>easy to extend it:
> >>>
> >>>Just add another column called "Production ready". Then research / ask about
> >>>production stability of each feature. The only challenge is: Who is
> >>>authoritative on that? I´d certainly ask the developer of a feature, but I´d
> >>>also consider user reports to some extent.
> >>>
> >>>Maybe that's the real challenge.
> >>>
> >>>If you wish, I´d go through each feature there and give my own estimation. But
> >>>I think there are others who are deeper into this.
> >>That is exactly the same reason I don't edit the wiki myself. I
> >>could of course get it started and hopefully someone will correct
> >>what I write, but I feel that if I start this off I don't have deep
> >>enough knowledge to do a proper start. Perhaps I will change my mind
> >>about this.
> >
> >   Given that nobody else has done it yet, what are the odds that
> >someone else will step up to do it now? I would say that you should at
> >least try. Yes, you don't have as much knowledge as some others, but
> >if you keep working at it, you'll gain that knowledge. Yes, you'll
> >probably get it wrong to start with, but you probably won't get it
> >*very* wrong. You'll probably get it horribly wrong at some point, but
> >even the more knowledgable people you're deferring to didn't identify
> >the problems with parity RAID until Zygo and Austin and Chris (and
> >others) put in the work to pin down the exact issues.
> FWIW, here's a list of what I personally consider stable (as in, I'm willing
> to bet against reduced uptime to use this stuff on production systems at
> work and personal systems at home):
> 1. Single device mode, including DUP data profiles on single device without
> mixed-bg.
> 2. Multi-device raid0, raid1, and raid10 profiles with symmetrical devices
> (all devices are the same size).
> 3. Multi-device single profiles with asymmetrical devices.
> 4. Small numbers (max double digit) of snapshots, taken at infrequent
> intervals (no more than once an hour).  I use single snapshots regularly to
> get stable images of the filesystem for backups, and I keep hourly ones of
> my home directory for about 48 hours.
> 5. Subvolumes used to isolate parts of a filesystem from snapshots.  I use
> this regularly to isolate areas of my filesystems from backups.
> 6. Non-incremental send/receive (no clone source, no parents, no
> deduplication).  I use this regularly for cloning virtual machines.
> 7. Checksumming and scrubs using any of the profiles I've listed above.
> 8. Defragmentation, including autodefrag.
> 9. All of the compat_features, including no-holes and skinny-metadata.
> 
> Things I consider stable enough that I'm willing to use them on my personal
> systems but not systems at work:
> 1. In-line data compression with compress=lzo.  I use this on my laptop and
> home server system.  I've never had any issues with it myself, but I know
> that other people have, and it does seem to make other things more likely to
> have issues.
> 2. Batch deduplication.  I only use this on the back-end filesystems for my
> personal storage cluster, and only because I have multiple copies as a
> result of GlusterFS on top of BTRFS.  I've not had any significant issues
> with it, and I don't remember any reports of data loss resulting from it,
> but it's something that people should not be using if they don't understand
> all the implications.
> 
> Things that I don't consider stable but some people do:
> 1. Quotas and qgroups.  Some people (such as SUSE) consider these to be
> stable.  There are a couple of known issues with them still however (such as
> returning the wrong errno when a quota is hit (should be returning -EDQUOT,
> instead returns -ENOSPC)).
> 2. RAID5/6.  There are a few people who use this, but it's generally agreed
> to be unstable.  There are still at least 3 known bugs which can cause
> complete loss of a filesystem, and there's also a known issue with rebuilds
> taking insanely long, which puts data at risk as well.
> 3. Multi device filesystems with asymmetrical devices running raid0, raid1,
> or raid10.  The issue I have here is that it's much easier to hit errors
> regarding free space than it should be on a reliable system.  It's possible to
> avoid with careful planning (for example, a 3 disk raid1 profile with 1 disk
> exactly twice the size of the other two will work fine, albeit with more
> load on the larger disk).
> 
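
The 3-disk example above can be sanity-checked with a little arithmetic.
What follows is a rough sketch, not what the btrfs allocator actually
runs: it assumes a two-copy raid1 profile and ignores metadata overhead
and chunk granularity.  The function names are made up for illustration.

```python
def raid1_usable(sizes):
    """Approximate usable data capacity of a two-copy raid1 profile.

    Every block needs one copy on each of two different devices, so
    capacity is capped both by half the raw total and by how much the
    remaining devices can mirror for the largest one.
    Assumption: metadata and chunk granularity are ignored.
    """
    if len(sizes) < 2:
        raise ValueError("raid1 needs at least two devices")
    total = sum(sizes)
    return min(total / 2, total - max(sizes))


def raid1_stranded(sizes):
    """Raw space that can never be paired with a second copy."""
    return sum(sizes) - 2 * raid1_usable(sizes)


if __name__ == "__main__":
    # Sizes are in arbitrary units: one disk twice the size of the
    # other two, then one disk three times the size of the other two.
    for layout in ([2, 1, 1], [3, 1, 1]):
        print(layout, raid1_usable(layout), raid1_stranded(layout))
```

Under these assumptions the 2:1:1 layout strands nothing, while a 3:1:1
layout leaves one unit on the large disk that can never be mirrored,
which is the kind of free-space surprise described above.
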
...
> As far as documentation though, we [BTRFS] really do need to get our act
> together.  It really doesn't look good to have most of the best
> documentation be in the distro's wikis instead of ours.  I'm not trying to
> say the distros shouldn't be documenting BTRFS, but the point at which
> Debian (for example) has better documentation of the upstream version of
> BTRFS than the upstream project itself does, that starts to look bad.

I would have loved to have this feature-to-stability list when I
started working on the Debian documentation!  I started it because I
was saddened by the number of horror-story "adventures with btrfs"
articles and posts I had read, combined with the view held by
certain members of the Debian community that it was a toy fs.

Are my contributions to that wiki of a high enough quality that I
can work on the upstream one?  Do you think the broader btrfs
community is interested in citations and curated links to discussions?

E.g., if a company wants to use btrfs, they check the status page, see
that a feature they want is still in the yellow zone of stabilisation,
and then follow the links to familiarise themselves with past discussions.
I imagine this would also help individuals or grad students more
quickly familiarise themselves with the available literature before
choosing a specific project.  If regular updates from SUSE, STRATO,
Facebook, and Fujitsu are also publicly available, the k.org wiki would
be a wonderful place to syndicate them!
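
As a minimal sketch of what one such status-page entry with citations
could look like: the class and field names below are assumptions made up
for illustration, the three colours follow the red/yellow/green
suggestion quoted above, and the ratings are only Austin's personal
assessment from the mail being replied to (cited by its Message-ID), not
an official status.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    RED = "not supported"
    YELLOW = "implemented"
    GREEN = "production ready"


@dataclass
class FeatureEntry:
    name: str
    status: Status
    # Message-IDs of list discussions that back up the rating.
    citations: list[str] = field(default_factory=list)


# Message-ID of Austin's mail, taken from this message's In-Reply-To.
AUSTIN = "be04c51d-c35d-39fe-c5f7-a7ab13d72cc5@gmail.com"

entries = [
    FeatureEntry("raid1, symmetrical devices", Status.GREEN, [AUSTIN]),
    FeatureEntry("compress=lzo", Status.YELLOW, [AUSTIN]),
    FeatureEntry("RAID5/6", Status.YELLOW, [AUSTIN]),
]


def not_production_ready(items):
    """Entries a reader should research further before deploying."""
    return [e for e in items if e.status is not Status.GREEN]


if __name__ == "__main__":
    for entry in not_production_ready(entries):
        print(entry.name, entry.status.value, entry.citations)
```
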

Sincerely,
Nicholas

> >
> >   So, go for it. You have a lot to offer the community.
> >
> >   Hugo.
> >
> >>>I do think for example that scrubbing and auto raid repair are stable, except
> >>>for RAID 5/6. Also device statistics and RAID 0 and 1 I consider to be stable.
> >>>I think RAID 10 is also stable, but as I do not run it, I don´t know. For me
> >>>also skinny-metadata is stable. For me so far even compress=lzo seems to be
> >>>stable, but well for others it may not.
> >>>
> >>>Since what kernel version? Now, there you go. I have no idea. All I know is
> >>>that I started BTRFS with Kernel 2.6.38 or 2.6.39 on my laptop, but not as
> >>>RAID 1 at that time.
> >>>
> >>>See, the implementation time of a feature is much easier to assess. Maybe
> >>>that's part of the reason why there is no stability matrix: Maybe no one
> >>>*exactly* knows *for sure*. How could you? So I would even put a footnote on
> >>>that "production ready" column explaining "Considered to be stable by
> >>>developer and user opinions".
> >>>
> >>>Of course additionally it would be good to read about experiences of corporate
> >>>usage of BTRFS. I know at least Fujitsu, SUSE, Facebook, Oracle are using it.
> >>>But I don´t know in what configurations and with what experiences. One Oracle
> >>>developer invests a lot of time to bring BTRFS like features to XFS and RedHat
> >>>still favors XFS over BTRFS, even SLES defaults to XFS for /home and other non
> >>>/-filesystems. That also tells a story.
> >>>
> >>>Some ideas you can get from the SUSE release notes. Even if you do not want
> >>>to use it, they tell you something, and I bet they are one of the better
> >>>sources of information regarding your question you can get at this time. I
> >>>believe SUSE developers invested some time to assess the stability of
> >>>features, because they would carefully assess what they can support in
> >>>enterprise environments. There is also someone from Fujitsu who shared
> >>>experiences in a talk; I can search for the URL to the slides again.
> >>By all means, SUSE's wiki is very valuable. I just said that I
> >>*prefer* to have that stuff on the BTRFS wiki and feel that is the
> >>right place for it.
> >>>
> >>>I bet Chris Mason and other BTRFS developers at Facebook have some idea on
> >>>what they use within Facebook as well. To what extent they are allowed to talk
> >>>about it… I don´t know. My personal impression is that as soon as Chris went
> >>>to Facebook he became quite quiet. Maybe just due to being busy. Maybe due to
> >>>Facebook being concerned much more about the privacy of itself than of its
> >>>users.
> >>>
> >>>Thanks,
> >>
> >
> 

