Citation Needed: BTRFS Failure Resistance

* Citation Needed: BTRFS Failure Resistance
@ 2019-05-22 18:46 Cerem Cem ASLAN
  2019-05-22 19:00 ` Hugo Mills
  2019-05-23 11:19 ` Austin S. Hemmelgarn
  0 siblings, 2 replies; 12+ messages in thread
From: Cerem Cem ASLAN @ 2019-05-22 18:46 UTC (permalink / raw)
  To: Btrfs BTRFS

Could you confirm or disclaim the following explanation:
https://unix.stackexchange.com/a/520063/65781

* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-22 18:46 Citation Needed: BTRFS Failure Resistance Cerem Cem ASLAN
@ 2019-05-22 19:00 ` Hugo Mills
  2019-05-23 16:48   ` Jeff Mahoney
  2019-05-23 11:19 ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 12+ messages in thread
From: Hugo Mills @ 2019-05-22 19:00 UTC (permalink / raw)
  To: Cerem Cem ASLAN; +Cc: Btrfs BTRFS

On Wed, May 22, 2019 at 09:46:42PM +0300, Cerem Cem ASLAN wrote:
> Could you confirm or disclaim the following explanation:
> https://unix.stackexchange.com/a/520063/65781

   Well, the quoted comment at the top is accurate (although I haven't
looked for the IRC conversation in question).

   However, there are some inaccuracies in the detailed comment
below. These aren't particularly relevant to the argument addressing
your question, but do detract somewhat from the authority of the
answer. :)

   Specifically: Btrfs doesn't use Merkle trees. It uses CoW-friendly
B-trees -- there's no csum of tree contents. It also doesn't make a
complete copy of the tree (that would take a long time). Instead,
it'll only update the blocks in the tree that need updating, which
will bubble the changes up through the tree node path to the top
level.
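
To make the "only update the blocks that need updating" point concrete, here is a minimal path-copying sketch. It is not btrfs code: it assumes a plain binary search tree instead of a B-tree, and `Node` and `cow_insert` are hypothetical names invented for the illustration. An update allocates new nodes only along the path from the root to the changed key; every other subtree is shared with the previous version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node; stands in for an on-disk tree block (assumption)."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cow_insert(node: Optional[Node], key: int, value: str) -> Node:
    """Return a new root. Only nodes on the path to `key` are copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        # Copy this node, rebuild the left side, share the right subtree.
        return Node(node.key, node.value, cow_insert(node.left, key, value), node.right)
    if key > node.key:
        # Copy this node, share the left subtree, rebuild the right side.
        return Node(node.key, node.value, node.left, cow_insert(node.right, key, value))
    # Same key: replace the value, keep both children shared.
    return Node(key, value, node.left, node.right)


if __name__ == "__main__":
    root = None
    for k in (50, 30, 70, 20, 40):
        root = cow_insert(root, k, f"v{k}")
    new_root = cow_insert(root, 20, "updated")
    # The untouched right subtree is the very same object in both versions.
    print(new_root.right is root.right)
```

After the update the old root still describes the old tree, and the untouched subtrees are the same objects in both versions, which is why a complete copy is unnecessary.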

   There's a detailed description of the issues of broken hardware on
the btrfs wiki, here:

https://btrfs.wiki.kernel.org/index.php/FAQ#What_does_.22parent_transid_verify_failed.22_mean.3F

   Hugo.

-- 
Hugo Mills             | Why play all the notes, when you need only play the
hugo@... carfax.org.uk | most beautiful?
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                           Miles Davis


* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-22 19:00 ` Hugo Mills
@ 2019-05-23 16:48   ` Jeff Mahoney
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff Mahoney @ 2019-05-23 16:48 UTC (permalink / raw)
  To: Hugo Mills, Cerem Cem ASLAN, Btrfs BTRFS; +Cc: Johannes Thumshirn


On 5/22/19 3:00 PM, Hugo Mills wrote:
> On Wed, May 22, 2019 at 09:46:42PM +0300, Cerem Cem ASLAN wrote:
>> Could you confirm or disclaim the following explanation:
>> https://unix.stackexchange.com/a/520063/65781
> 
>    Well, the quoted comment at the top is accurate (although I haven't
> looked for the IRC conversation in question).
> 
>    However, there are some inaccuracies in the detailed comment
> below. These aren't particularly relevant to the argument addressing
> your question, but do detract somewhat from the authority of the
> answer. :)
> 
>    Specifically: Btrfs doesn't use Merkle trees. It uses CoW-friendly
> B-trees -- there's no csum of tree contents. It also doesn't make a
> complete copy of the tree (that would take a long time). Instead,
> it'll only update the blocks in the tree that need updating, which
> will bubble the changes up through the tree node path to the top
> level.

There are csums of tree contents -- they're part of the header for every
tree node and leaf.  The tree doesn't currently function as a Merkle tree,
though, since there is no external reference to verify it.  There are
two potential solutions to this:

1) Change the tree nodes to contain checksums for each of the next blocks.
2) Use an HMAC in each tree node and leaf, where the signature functions
as the external reference.

Either solution requires checksums be added to the superblock for the
tree root, the chunk root, and the log tree root.
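
The difference between a per-block checksum and an external reference can be sketched as follows. This is an illustration of option 1 only, under loud assumptions: `Block`, `make_parent` and `verify_child` are hypothetical names, the layout has nothing to do with the real on-disk format, and SHA-256 is used purely as a stand-in hash. A block whose header checksum matches its own contents passes a self-check even when it is the wrong block; only a checksum held by the parent can catch that.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def csum(data: bytes) -> bytes:
    # SHA-256 is a stand-in hash for this sketch only.
    return hashlib.sha256(data).digest()


@dataclass
class Block:
    payload: bytes
    # Option 1: the parent records the checksum of each child block.
    child_csums: List[bytes] = field(default_factory=list)
    header_csum: bytes = b""

    def body(self) -> bytes:
        return self.payload + b"".join(self.child_csums)

    def seal(self) -> "Block":
        # Store the checksum of this block's own contents in its header.
        self.header_csum = csum(self.body())
        return self

    def self_check(self) -> bool:
        # Detects corruption inside the block, nothing more.
        return self.header_csum == csum(self.body())


def make_parent(payload: bytes, children: List[Block]) -> Block:
    return Block(payload, [c.header_csum for c in children]).seal()


def verify_child(parent: Block, index: int, child: Block) -> bool:
    # External reference: the child must match what the parent recorded.
    return child.self_check() and parent.child_csums[index] == child.header_csum
```

With `current` and `stale` both sealed and a parent built from `current`, `stale.self_check()` still succeeds, while `verify_child(parent, 0, stale)` fails. Extending the same chain up to the superblock is what the last paragraph above refers to.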

-Jeff

-- 
Jeff Mahoney
SUSE Labs


* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-22 18:46 Citation Needed: BTRFS Failure Resistance Cerem Cem ASLAN
  2019-05-22 19:00 ` Hugo Mills
@ 2019-05-23 11:19 ` Austin S. Hemmelgarn
  2019-05-23 16:24   ` Chris Murphy
  1 sibling, 1 reply; 12+ messages in thread
From: Austin S. Hemmelgarn @ 2019-05-23 11:19 UTC (permalink / raw)
  To: Cerem Cem ASLAN, Btrfs BTRFS

On 2019-05-22 14:46, Cerem Cem ASLAN wrote:
> Could you confirm or disclaim the following explanation:
> https://unix.stackexchange.com/a/520063/65781
> 
Aside from what Hugo mentioned (which is correct), it's worth mentioning 
that the example listed in the answer of how hardware issues could screw 
things up assumes that for some reason write barriers aren't honored. 
BTRFS explicitly requests write barriers to prevent that type of 
reordering of writes from happening, and it's actually pretty unusual on 
modern hardware for those write barriers to not be honored unless the 
user is doing something stupid (like mounting with 'nobarrier' or using 
LVM with write barrier support disabled).
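
A toy model of why the barrier matters, offered as a sketch rather than a description of the real block layer. It assumes a device with a volatile cache that, on power loss, may have persisted any subset of the writes issued since the last flush. `Device`, `commit`, and the block names are all hypothetical.

```python
from itertools import combinations


class Device:
    """Toy disk with a volatile write cache (an assumption of this sketch)."""

    def __init__(self, durable):
        self.durable = dict(durable)  # contents that survive power loss
        self.pending = []             # cached writes, not yet durable

    def write(self, location, value):
        self.pending.append((location, value))

    def flush(self):
        # Barrier: everything written so far becomes durable.
        for location, value in self.pending:
            self.durable[location] = value
        self.pending.clear()

    def crash_states(self):
        # On power loss, any subset of the pending writes may have landed.
        states = []
        for n in range(len(self.pending) + 1):
            for subset in combinations(self.pending, n):
                state = dict(self.durable)
                state.update(subset)
                states.append(state)
        return states


def commit(device, use_barrier):
    device.write("root_v2", "new tree")   # new tree blocks first
    if use_barrier:
        device.flush()                    # barrier before the superblock
    device.write("super", "root_v2")      # superblock points at the new root


def consistent(state):
    # The superblock must point at a root that actually reached the disk.
    return state["super"] in state


def always_consistent(use_barrier):
    device = Device({"root_v1": "old tree", "super": "root_v1"})
    commit(device, use_barrier)
    return all(consistent(s) for s in device.crash_states())
```

In this model `always_consistent(True)` holds because the superblock write can only land after the new root is durable, while `always_consistent(False)` does not: one possible crash state has the superblock pointing at a root that never made it to disk.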

* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-23 11:19 ` Austin S. Hemmelgarn
@ 2019-05-23 16:24   ` Chris Murphy
  2019-05-23 16:34     ` Adam Borowski
  2019-05-23 17:13     ` Austin S. Hemmelgarn
  0 siblings, 2 replies; 12+ messages in thread
From: Chris Murphy @ 2019-05-23 16:24 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Cerem Cem ASLAN, Btrfs BTRFS

On Thu, May 23, 2019 at 5:19 AM Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
>
> On 2019-05-22 14:46, Cerem Cem ASLAN wrote:
> > Could you confirm or disclaim the following explanation:
> > https://unix.stackexchange.com/a/520063/65781
> >
> Aside from what Hugo mentioned (which is correct), it's worth mentioning
> that the example listed in the answer of how hardware issues could screw
> things up assumes that for some reason write barriers aren't honored.
> BTRFS explicitly requests write barriers to prevent that type of
> reordering of writes from happening, and it's actually pretty unusual on
> modern hardware for those write barriers to not be honored unless the
> user is doing something stupid (like mounting with 'nobarrier' or using
> LVM with write barrier support disabled).

'man xfs'

       barrier|nobarrier
              Note: This option has been deprecated as of kernel v4.10;
              in that version, integrity operations are always performed
              and the mount option is ignored.  These mount options will
              be removed no earlier than kernel v4.15.

Since they're getting rid of it, I wonder whether it's sane for almost
any file system use case.

-- 
Chris Murphy

* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-23 16:24   ` Chris Murphy
@ 2019-05-23 16:34     ` Adam Borowski
  2019-05-23 16:46       ` Chris Murphy
  2019-05-23 17:13     ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 12+ messages in thread
From: Adam Borowski @ 2019-05-23 16:34 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Austin S. Hemmelgarn, Cerem Cem ASLAN, Btrfs BTRFS

On Thu, May 23, 2019 at 10:24:28AM -0600, Chris Murphy wrote:
> On Thu, May 23, 2019 at 5:19 AM Austin S. Hemmelgarn
> > BTRFS explicitly requests write barriers to prevent that type of
> > reordering of writes from happening, and it's actually pretty unusual on
> > modern hardware for those write barriers to not be honored unless the
> > user is doing something stupid (like mounting with 'nobarrier' or using
> > LVM with write barrier support disabled).
> 
> 'man xfs'
> 
>        barrier|nobarrier
>               Note: This option has been deprecated as of kernel
> v4.10; in that version, integrity operations are always performed and
> the mount option is ignored.  These mount options will be removed no
> earlier than kernel v4.15.
> 
> Since they're getting rid of it, I wonder if it's sane for most any
> sane file system use case.

A volatile filesystem: one that you're willing to rebuild from scratch (or
backups) on power loss.  This includes any filesystem in a volatile VM.

Example use case: a build machine, where the build filesystem wants btrfs
for snapshots (the build environment takes several minutes to recreate), yet
with the environment recreated weekly, a crash can be considered an
additional start of a week. :)

Or, some clusters consider a crashed node to be dead and needing rebuild;
the filesystem's contents will be cloned from a master anyway.

In all of these cases, fsyncs can be ignored as well.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ At least spammers get it right: "Hello beautiful!".
⠈⠳⣄⠀⠀⠀⠀

* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-23 16:34     ` Adam Borowski
@ 2019-05-23 16:46       ` Chris Murphy
  2019-05-23 17:04         ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Murphy @ 2019-05-23 16:46 UTC (permalink / raw)
  To: Adam Borowski
  Cc: Chris Murphy, Austin S. Hemmelgarn, Cerem Cem ASLAN, Btrfs BTRFS

On Thu, May 23, 2019 at 10:34 AM Adam Borowski <kilobyte@angband.pl> wrote:
>
> On Thu, May 23, 2019 at 10:24:28AM -0600, Chris Murphy wrote:
> > On Thu, May 23, 2019 at 5:19 AM Austin S. Hemmelgarn
> > > BTRFS explicitly requests write barriers to prevent that type of
> > > reordering of writes from happening, and it's actually pretty unusual on
> > > modern hardware for those write barriers to not be honored unless the
> > > user is doing something stupid (like mounting with 'nobarrier' or using
> > > LVM with write barrier support disabled).
> >
> > 'man xfs'
> >
> >        barrier|nobarrier
> >               Note: This option has been deprecated as of kernel
> > v4.10; in that version, integrity operations are always performed and
> > the mount option is ignored.  These mount options will be removed no
> > earlier than kernel v4.15.
> >
> > Since they're getting rid of it, I wonder if it's sane for most any
> > sane file system use case.
>
> A volatile filesystem: one that you're willing to rebuild from scratch (or
> backups) on power loss.  This includes any filesystem in a volatile VM.
>
> Example use case: a build machine, where the build filesystem wants btrfs
> for snapshots (the build environment takes several minutes to recreate), yet with
> the environment recreated weekly, a crash can be considered an additional
> start of a week. :)
>
> Or, some clusters consider a crashed node to be dead and needing rebuild;
> the filesystem's contents will be cloned from a master anyway.
>
> In all of these cases, fsyncs can be ignored as well.

I would not mind a mount option to ignore application fsync and
fdatasync, while maintaining the Btrfs data->metadata->super write
order guarantee. I'd expect that would be a more commonly preferred
use case than volatile/disposable file systems. But what do you
suppose the real world performance increase is between the former and
latter?


-- 
Chris Murphy

* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-23 16:46       ` Chris Murphy
@ 2019-05-23 17:04         ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 12+ messages in thread
From: Austin S. Hemmelgarn @ 2019-05-23 17:04 UTC (permalink / raw)
  To: Chris Murphy, Adam Borowski; +Cc: Cerem Cem ASLAN, Btrfs BTRFS

On 2019-05-23 12:46, Chris Murphy wrote:
> On Thu, May 23, 2019 at 10:34 AM Adam Borowski <kilobyte@angband.pl> wrote:
>>
>> On Thu, May 23, 2019 at 10:24:28AM -0600, Chris Murphy wrote:
>>> On Thu, May 23, 2019 at 5:19 AM Austin S. Hemmelgarn
>>>> BTRFS explicitly requests write barriers to prevent that type of
>>>> reordering of writes from happening, and it's actually pretty unusual on
>>>> modern hardware for those write barriers to not be honored unless the
>>>> user is doing something stupid (like mounting with 'nobarrier' or using
>>>> LVM with write barrier support disabled).
>>>
>>> 'man xfs'
>>>
>>>         barrier|nobarrier
>>>                Note: This option has been deprecated as of kernel
>>> v4.10; in that version, integrity operations are always performed and
>>> the mount option is ignored.  These mount options will be removed no
>>> earlier than kernel v4.15.
>>>
>>> Since they're getting rid of it, I wonder if it's sane for most any
>>> sane file system use case.
>>
>> A volatile filesystem: one that you're willing to rebuild from scratch (or
>> backups) on power loss.  This includes any filesystem in a volatile VM.
>>
>> Example use case: a build machine, where the build filesystem wants btrfs
>> for snapshots (the build environment takes several minutes to recreate), yet with
>> the environment recreated weekly, a crash can be considered an additional
>> start of a week. :)
>>
>> Or, some clusters consider a crashed node to be dead and needing rebuild;
>> the filesystem's contents will be cloned from a master anyway.
>>
>> In all of these cases, fsyncs can be ignored as well.
> 
> I would not mind a mount option to ignore application fsync and
> fdatasync, while maintaining the Btrfs data->metadata->super write
> order guarantee. I'd expect that would be a more commonly preferred
> use case than volatile/disposable file systems. But what do you
> suppose the real world performance increase is between the former and
> latter?
> 
There's a LD_PRELOAD for that!

Search for 'libeatmydata' or 'eatmydata' in your preferred distro's package
manager; most of them have it.  It's an LD_PRELOAD library that stubs
out fsync and fdatasync.  Realistically, how much it helps is _really_
dependent on the application.  Some package managers can see huge
benefits because they call fsync regularly (for example, APT on Debian
can show a big improvement, because each call it makes to dpkg that
actually modifies system state makes at least 1 call to fsync, usually
more).
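
A rough way to get a feel for how much a given workload pays for fsync is to time the same writes with and without it. The sketch below is only an approximation of what stubbing out fsync does: `write_files` is a hypothetical helper, the file count and payload size are arbitrary, and the numbers depend entirely on the filesystem and hardware (on tmpfs, where the default temporary directory often lives, the difference can vanish, so point it at a directory on a real disk).

```python
import os
import tempfile
import time


def write_files(directory, count, payload, fsync_each):
    """Write `count` small files; optionally fsync each one. Returns seconds."""
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"file{i:04d}")
        with open(path, "wb") as f:
            f.write(payload)
            if fsync_each:
                f.flush()             # push Python's buffer to the kernel
                os.fsync(f.fileno())  # ask the kernel to make it durable
    return time.perf_counter() - start


if __name__ == "__main__":
    payload = b"x" * 4096
    for fsync_each in (True, False):
        with tempfile.TemporaryDirectory() as d:
            elapsed = write_files(d, 100, payload, fsync_each)
            print(f"fsync_each={fsync_each}: {elapsed:.3f}s")
```

A workload that resembles the first loop (many small files, one fsync each) is the kind that benefits most; one that rarely calls fsync will see little change.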

* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-23 16:24   ` Chris Murphy
  2019-05-23 16:34     ` Adam Borowski
@ 2019-05-23 17:13     ` Austin S. Hemmelgarn
  2019-05-23 17:31       ` Martin Raiber
  1 sibling, 1 reply; 12+ messages in thread
From: Austin S. Hemmelgarn @ 2019-05-23 17:13 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Cerem Cem ASLAN, Btrfs BTRFS

On 2019-05-23 12:24, Chris Murphy wrote:
> On Thu, May 23, 2019 at 5:19 AM Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>>
>> On 2019-05-22 14:46, Cerem Cem ASLAN wrote:
>>> Could you confirm or disclaim the following explanation:
>>> https://unix.stackexchange.com/a/520063/65781
>>>
>> Aside from what Hugo mentioned (which is correct), it's worth mentioning
>> that the example listed in the answer of how hardware issues could screw
>> things up assumes that for some reason write barriers aren't honored.
>> BTRFS explicitly requests write barriers to prevent that type of
>> reordering of writes from happening, and it's actually pretty unusual on
>> modern hardware for those write barriers to not be honored unless the
>> user is doing something stupid (like mounting with 'nobarrier' or using
>> LVM with write barrier support disabled).
> 
> 'man xfs'
> 
>         barrier|nobarrier
>                Note: This option has been deprecated as of kernel
> v4.10; in that version, integrity operations are always performed and
> the mount option is ignored.  These mount options will be removed no
> earlier than kernel v4.15.
> 
> Since they're getting rid of it, I wonder if it's sane for most any
> sane file system use case.
> 
As Adam mentioned, it's mostly volatile storage that benefits from this.
For example, on the systems where I have /var/cache configured as a
separate filesystem, I mount it with barriers disabled because the data
there just doesn't matter (all of it can be regenerated easily) and it
gives me a few percent better performance.  In essence, it's mostly the
same type of stuff where you might consider running ext4 without a
journal for performance reasons.
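
To audit which filesystems on a machine are mounted this way, one could scan the mount table. A minimal sketch, assuming the /proc/mounts layout (device, mount point, type, comma-separated options) and assuming a disabled barrier shows up as either `nobarrier` or `barrier=0` in the options field; `mounts_without_barriers` is a hypothetical helper and the sample text is made up.

```python
def mounts_without_barriers(mounts_text):
    """Return (mountpoint, fstype) pairs whose options disable barriers."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mountpoint, fstype, options = fields[:4]
        opts = options.split(",")
        # Assumption: these two spellings cover the filesystems of interest.
        if "nobarrier" in opts or "barrier=0" in opts:
            found.append((mountpoint, fstype))
    return found


# Made-up sample in /proc/mounts format.
SAMPLE = """\
/dev/sda2 / btrfs rw,relatime,space_cache 0 0
/dev/sda3 /var/cache btrfs rw,relatime,nobarrier 0 0
/dev/sdb1 /scratch ext4 rw,noatime,barrier=0 0 0
"""

if __name__ == "__main__":
    print(mounts_without_barriers(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/mounts instead of the sample string.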

In the case of XFS, it probably got removed to keep people who fancy 
themselves to be power users but really have no clue what they're doing 
from shooting themselves in the foot to try and get some more performance.

IIRC, the option originally got added to both XFS and ext* because early 
write barrier support was a bigger performance hit than it is today, and 
BTRFS just kind of inherited it.

* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-23 17:13     ` Austin S. Hemmelgarn
@ 2019-05-23 17:31       ` Martin Raiber
  2019-05-23 17:41         ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Raiber @ 2019-05-23 17:31 UTC (permalink / raw)
  To: Austin S. Hemmelgarn, Chris Murphy; +Cc: Cerem Cem ASLAN, Btrfs BTRFS

On 23.05.2019 19:13 Austin S. Hemmelgarn wrote:
> On 2019-05-23 12:24, Chris Murphy wrote:
>> On Thu, May 23, 2019 at 5:19 AM Austin S. Hemmelgarn
>> <ahferroin7@gmail.com> wrote:
>>>
>>> On 2019-05-22 14:46, Cerem Cem ASLAN wrote:
>>>> Could you confirm or disclaim the following explanation:
>>>> https://unix.stackexchange.com/a/520063/65781
>>>>
>>> Aside from what Hugo mentioned (which is correct), it's worth
>>> mentioning
>>> that the example listed in the answer of how hardware issues could
>>> screw
>>> things up assumes that for some reason write barriers aren't honored.
>>> BTRFS explicitly requests write barriers to prevent that type of
>>> reordering of writes from happening, and it's actually pretty
>>> unusual on
>>> modern hardware for those write barriers to not be honored unless the
>>> user is doing something stupid (like mounting with 'nobarrier' or using
>>> LVM with write barrier support disabled).
>>
>> 'man xfs'
>>
>>         barrier|nobarrier
>>                Note: This option has been deprecated as of kernel
>> v4.10; in that version, integrity operations are always performed and
>> the mount option is ignored.  These mount options will be removed no
>> earlier than kernel v4.15.
>>
>> Since they're getting rid of it, I wonder if it's sane for most any
>> sane file system use case.
>>
> As Adam mentioned, it's mostly volatile storage that benefits from
> this.  For example, on the systems where I have /var/cache configured
> as a separate filesystem, I mount it with barriers disabled because
> the data there just doesn't matter (all of it can be regenerated
> easily) and it gives me a few percent better performance.  In essence,
> it's mostly the same type of stuff where you might consider running
> ext4 without a journal for performance reasons.
>
> In the case of XFS, it probably got removed to keep people who fancy
> themselves to be power users but really have no clue what they're
> doing from shooting themselves in the foot to try and get some more
> performance.
>
> IIRC, the option originally got added to both XFS and ext* because
> early write barrier support was a bigger performance hit than it is
> today, and BTRFS just kind of inherited it.

When I google for it I find that flushing the device can also be
disabled via

echo "write through" > /sys/block/$device/queue/write_cache

I actually used nobarrier recently (albeit with ext4), because a Steam
download was taking forever (hours); after remounting with nobarrier it
went down to minutes (the next time I started it with eatmydata). But ext4
fsck is probably able to recover nobarrier file systems after unfortunate
power losses and btrfs fsck... isn't. So combined with the above I'd
remove nobarrier.


* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-23 17:31       ` Martin Raiber
@ 2019-05-23 17:41         ` Austin S. Hemmelgarn
  2019-05-24 13:41           ` Martin Raiber
  0 siblings, 1 reply; 12+ messages in thread
From: Austin S. Hemmelgarn @ 2019-05-23 17:41 UTC (permalink / raw)
  To: Martin Raiber, Chris Murphy; +Cc: Cerem Cem ASLAN, Btrfs BTRFS

On 2019-05-23 13:31, Martin Raiber wrote:
> On 23.05.2019 19:13 Austin S. Hemmelgarn wrote:
>> On 2019-05-23 12:24, Chris Murphy wrote:
>>> On Thu, May 23, 2019 at 5:19 AM Austin S. Hemmelgarn
>>> <ahferroin7@gmail.com> wrote:
>>>>
>>>> On 2019-05-22 14:46, Cerem Cem ASLAN wrote:
>>>>> Could you confirm or disclaim the following explanation:
>>>>> https://unix.stackexchange.com/a/520063/65781
>>>>>
>>>> Aside from what Hugo mentioned (which is correct), it's worth
>>>> mentioning
>>>> that the example listed in the answer of how hardware issues could
>>>> screw
>>>> things up assumes that for some reason write barriers aren't honored.
>>>> BTRFS explicitly requests write barriers to prevent that type of
>>>> reordering of writes from happening, and it's actually pretty
>>>> unusual on
>>>> modern hardware for those write barriers to not be honored unless the
>>>> user is doing something stupid (like mounting with 'nobarrier' or using
>>>> LVM with write barrier support disabled).
>>>
>>> 'man xfs'
>>>
>>>          barrier|nobarrier
>>>                 Note: This option has been deprecated as of kernel
>>> v4.10; in that version, integrity operations are always performed and
>>> the mount option is ignored.  These mount options will be removed no
>>> earlier than kernel v4.15.
>>>
>>> Since they're getting rid of it, I wonder if it's sane for most any
>>> sane file system use case.
>>>
>> As Adam mentioned, it's mostly volatile storage that benefits from
>> this.  For example, on the systems where I have /var/cache configured
>> as a separate filesystem, I mount it with barriers disabled because
>> the data there just doesn't matter (all of it can be regenerated
>> easily) and it gives me a few percent better performance.  In essence,
>> it's mostly the same type of stuff where you might consider running
>> ext4 without a journal for performance reasons.
>>
>> In the case of XFS, it probably got removed to keep people who fancy
>> themselves to be power users but really have no clue what they're
>> doing from shooting themselves in the foot to try and get some more
>> performance.
>>
>> IIRC, the option originally got added to both XFS and ext* because
>> early write barrier support was a bigger performance hit than it is
>> today, and BTRFS just kind of inherited it.
> 
> When I google for it I find that flushing the device can also be
> disabled via
> 
> echo "write through" > /sys/block/$device/queue/write_cache
Disabling write caching (which is what that does) is not really the same 
as mounting with 'nobarrier'.  Write caching actually improves 
performance in most cases, it just makes things a bit riskier because of 
the possibility of write reordering (which barriers prevent).
> 
> I actually used nobarrier recently (albeit with ext4), because a steam
> download was taking forever (hours), when remounting with nobarrier it
> went down to minutes (next time I started it with eatmydata). But ext4
> fsck is probably able to recover nobarrier file systems with unfortunate
> powerlosses and btrfs fsck... isn't. So combined with the above I'd
> remove nobarrier.
> 
Yeah, Steam is another pathological case actually, though that's mostly 
because their distribution format is generously described as 
'excessively segmented' and they fsync after _every single file_.  If 
you ever use Steam's game backup feature, you'll see similar results 
because it actually serializes the data to the same format that is used 
when downloading the game in the first place.

* Re: Citation Needed: BTRFS Failure Resistance
  2019-05-23 17:41         ` Austin S. Hemmelgarn
@ 2019-05-24 13:41           ` Martin Raiber
  0 siblings, 0 replies; 12+ messages in thread
From: Martin Raiber @ 2019-05-24 13:41 UTC (permalink / raw)
  To: Austin S. Hemmelgarn, Martin Raiber, Chris Murphy
  Cc: Cerem Cem ASLAN, Btrfs BTRFS

On 23.05.2019 19:41 Austin S. Hemmelgarn wrote:
> On 2019-05-23 13:31, Martin Raiber wrote:
>> On 23.05.2019 19:13 Austin S. Hemmelgarn wrote:
>>> On 2019-05-23 12:24, Chris Murphy wrote:
>>>> On Thu, May 23, 2019 at 5:19 AM Austin S. Hemmelgarn
>>>> <ahferroin7@gmail.com> wrote:
>>>>>
>>>>> On 2019-05-22 14:46, Cerem Cem ASLAN wrote:
>>>>>> Could you confirm or disclaim the following explanation:
>>>>>> https://unix.stackexchange.com/a/520063/65781
>>>>>>
>>>>> Aside from what Hugo mentioned (which is correct), it's worth
>>>>> mentioning
>>>>> that the example listed in the answer of how hardware issues could
>>>>> screw
>>>>> things up assumes that for some reason write barriers aren't honored.
>>>>> BTRFS explicitly requests write barriers to prevent that type of
>>>>> reordering of writes from happening, and it's actually pretty
>>>>> unusual on
>>>>> modern hardware for those write barriers to not be honored unless the
>>>>> user is doing something stupid (like mounting with 'nobarrier' or
>>>>> using
>>>>> LVM with write barrier support disabled).
>>>>
>>>> 'man xfs'
>>>>
>>>>          barrier|nobarrier
>>>>                 Note: This option has been deprecated as of kernel
>>>> v4.10; in that version, integrity operations are always performed and
>>>> the mount option is ignored.  These mount options will be removed no
>>>> earlier than kernel v4.15.
>>>>
>>>> Since they're getting rid of it, I wonder if it's sane for most any
>>>> sane file system use case.
>>>>
>>> As Adam mentioned, it's mostly volatile storage that benefits from
>>> this.  For example, on the systems where I have /var/cache configured
>>> as a separate filesystem, I mount it with barriers disabled because
>>> the data there just doesn't matter (all of it can be regenerated
>>> easily) and it gives me a few percent better performance.  In essence,
>>> it's mostly the same type of stuff where you might consider running
>>> ext4 without a journal for performance reasons.
>>>
>>> In the case of XFS, it probably got removed to keep people who fancy
>>> themselves to be power users but really have no clue what they're
>>> doing from shooting themselves in the foot to try and get some more
>>> performance.
>>>
>>> IIRC, the option originally got added to both XFS and ext* because
>>> early write barrier support was a bigger performance hit than it is
>>> today, and BTRFS just kind of inherited it.
>>
>> When I google for it I find that flushing the device can also be
>> disabled via
>>
>> echo "write through" > /sys/block/$device/queue/write_cache
> Disabling write caching (which is what that does) is not really the
> same as mounting with 'nobarrier'.  Write caching actually improves
> performance in most cases, it just makes things a bit riskier because
> of the possibility of write reordering (which barriers prevent).

According to documentation it doesn't change any caching. This changes
how the kernel sees what kind of caching the device does. If the device
claims it does "write through" caching (e.g. battery backed RAID card)
the kernel doesn't need to send device cache flushes, otherwise it does.
If you set a device that has "write back" there to "write through", the
kernel will think it does not require flushes and not send any, thus
causing data loss at power loss (because the device obviously still does
write back caching).
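
The distinction can be written down as a small helper. This is a sketch, assuming the attribute holds either "write back" or "write through" as described above; `kernel_sends_flushes` and `read_write_cache` are hypothetical names, and the helper only reports what the kernel believes about the device, not what the device actually does.

```python
from pathlib import Path


def kernel_sends_flushes(write_cache_value: str) -> bool:
    """Interpret the contents of /sys/block/<device>/queue/write_cache."""
    mode = write_cache_value.strip()
    if mode == "write back":
        return True    # kernel assumes a volatile cache and issues flushes
    if mode == "write through":
        return False   # kernel assumes no volatile cache and skips flushes
    raise ValueError(f"unexpected write_cache value: {mode!r}")


def read_write_cache(device: str, sysfs_root: str = "/sys/block") -> str:
    """Read the current setting for a block device such as 'sda'."""
    return (Path(sysfs_root) / device / "queue" / "write_cache").read_text().strip()


if __name__ == "__main__":
    for value in ("write back", "write through"):
        print(value, "->", kernel_sends_flushes(value))
```

If the helper reports no flushes for a device that really does write-back caching, that is exactly the data-loss configuration described in the paragraph above.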

>>
>> I actually used nobarrier recently (albeit with ext4), because a steam
>> download was taking forever (hours), when remounting with nobarrier it
>> went down to minutes (next time I started it with eatmydata). But ext4
>> fsck is probably able to recover nobarrier file systems with unfortunate
>> powerlosses and btrfs fsck... isn't. So combined with the above I'd
>> remove nobarrier.
>>
> Yeah, Steam is another pathological case actually, though that's
> mostly because their distribution format is generously described as
> 'excessively segmented' and they fsync after _every single file_.  If
> you ever use Steam's game backup feature, you'll see similar results
> because it actually serializes the data to the same format that is
> used when downloading the game in the first place.


