linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Hendrik Friedel <hendrik@friedels.name>,
	"Austin S. Hemmelgarn" <ahferroin7@gmail.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: btrfs and write barriers
Date: Mon, 29 Apr 2019 07:53:49 +0800
Message-ID: <aa8774cd-d337-b527-c6ec-72c4495ad9d9@gmx.com>
In-Reply-To: <emf90f6f49-9542-47ad-aad7-7ed271887782@ryzen>


On 2019/4/29 3:27 AM, Hendrik Friedel wrote:
> Hello,
> thanks for your reply.
> 
>>>3) Even more, it would be good, if btrfs would disable the write cache
>>> in that case, so that one does not need to rely on the user
>>
>>Personally speaking, if user really believes it's write cache causing
>>the problem or want to be extra safe, then they should disable cache.
> How many percent of the users will be able to judge that?
> 
>>As long as FLUSH is implemented without problem, the only faulty part is
>>btrfs itself and I haven't found any proof of either yet.
> But you have searched?
> 
>>>2) I find the location of the (only?) warning -dmesg- well hidden. I
> think it would be better to notify the user when creating the file-system.
>>A notification on creating the volume and one when adding devices
> (either via `device add` or via a replace operation)
>>would indeed be nice, but we should still keep the kernel log warning.
> 
> Ok, so what would be the way to move forward on that? Would it help if I
> create an issue at https://bugzilla.kernel.org/ ?

No need. See comment below.

> 
>>>3) Even more, it would be good, if btrfs would disable the write cache
> in that case, so that one does not need to rely on the user
>> I would tend to disagree here. We should definitely _recommend_ this
> to the user if we know there is no barrier support, but just
>> doing it behind their back is not a good idea.
> 
> Well, there is some room between 'automatic' and 'behind their back'. E.g.
> "Barriers are not supported by /dev/sda. Automatically disabling
> write-cache on mount. You can suppress this with the
> 'enable-cache-despite-no-barrier-support-I-know-what-I-am-doing' mount
> option" (maybe we can shorten the option).

There is no problem using the write cache as long as the device supports
flush. The SATA and NVMe protocols specify that all devices should
support flush.
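
For anyone who wants to see what the kernel believes about a given
device, here is a minimal sketch in Python. It assumes a reasonably
recent kernel that exposes the queue/write_cache and queue/fua
attributes under /sys/block. The function names are made up for
illustration. Note that this only reports what the device advertises;
it cannot prove that the firmware implements flush correctly.

```python
from pathlib import Path


def parse_cache_attrs(write_cache_text, fua_text):
    """Interpret the raw contents of queue/write_cache and queue/fua."""
    mode = write_cache_text.strip()
    return {
        # "write back" means the kernel treats the cache as volatile and
        # sends flush requests; "write through" means it sends none.
        "volatile_cache": mode == "write back",
        # "1" means the driver accepts FUA writes natively; "0" means the
        # block layer has to emulate them with a flush.
        "native_fua": fua_text.strip() == "1",
    }


def query_device(name, sysfs_root="/sys/block"):
    """Read both attributes for a block device such as 'sda'."""
    queue = Path(sysfs_root) / name / "queue"
    return parse_cache_attrs(
        (queue / "write_cache").read_text(),
        (queue / "fua").read_text(),
    )


if __name__ == "__main__":
    import sys

    # Usage (script name is arbitrary): python3 check_cache.py sda nvme0n1
    for dev in sys.argv[1:]:
        print(dev, query_device(dev))
```

A device that reports a volatile cache but no native FUA is still fine
under the reasoning above, provided its flush really works.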

As long as flush is supported, FUA can be emulated.
Thus the write cache is not a problem at all, as long as flush is
implemented correctly.
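
To illustrate the reasoning, here is a toy model in Python. It is only
a sketch: ToyDisk, write_fua_emulated and commit are hypothetical names,
and the commit sequence is a simplified picture of how a CoW filesystem
might order a superblock update, not the actual btrfs code. It shows
that a write followed by a flush gives the same durability guarantee as
a FUA write, so a volatile cache loses nothing that was committed.

```python
class ToyDisk:
    """Hypothetical model of a disk with a volatile write cache."""

    def __init__(self):
        self.cache = {}  # contents are lost on power failure
        self.media = {}  # contents survive power failure

    def write(self, lba, data):
        # A plain write may be acknowledged once it reaches the cache.
        self.cache[lba] = data

    def flush(self):
        # FLUSH: everything acknowledged so far must reach stable media.
        self.media.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Whatever was only in the cache is gone.
        self.cache.clear()


def write_fua_emulated(disk, lba, data):
    """Emulate a FUA write as an ordinary write followed by a flush."""
    disk.write(lba, data)
    disk.flush()


def commit(disk, new_blocks, super_lba, super_data):
    """Write new blocks, flush, then publish them via the superblock."""
    for lba, data in new_blocks.items():
        disk.write(lba, data)
    disk.flush()  # barrier: the new blocks are durable before the pointer
    write_fua_emulated(disk, super_lba, super_data)
```

In this model, a power loss before commit() finishes leaves the old
superblock pointing at old, intact blocks, and a power loss afterwards
finds everything on stable media.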

> 
>> There are also plenty of valid reasons to want to use the write cache
> anyway.
> 
> I cannot think of one. Who would sacrifice data integrity/potential
> total loss of the filesystem for speed?

No data integrity is lost, and performance is greatly improved with
write cache.

Thanks,
Qu
> 
>> As far as FUA/DPO, I know of exactly _zero_ devices that lie about
> implementing it and don't.
> ...
>> but the fact that Linux used to not issue a FLUSH command to the disks
> when you called fsync in userspace.
> 
> Ok, thanks for that clarification.
> 
> 
> Greetings,
> Hendrik
> 
> ------ Original Message ------
> From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
> To: "Hendrik Friedel" <hendrik@friedels.name>; "Qu Wenruo"
> <quwenruo.btrfs@gmx.com>; linux-btrfs@vger.kernel.org
> Sent: 03.04.2019 20:44:09
> Subject: Re: btrfs and write barriers
> 
>> On 2019-04-03 14:17, Hendrik Friedel wrote:
>>> Hello,
>>>
>>> thanks for your reply.
>>>
>>>>> 3) Even more, it would be good, if btrfs would disable the write cache
>>>>> in that case, so that one does not need to rely on the user
>>>> Personally speaking, if user really believes it's write cache causing
>>>> the problem or want to be extra safe, then they should disable cache.
>>> How many percent of the users will be able to judge that?
>>>> As long as FLUSH is implemented without problem, the only faulty
>>>> part is
>>>> btrfs itself and I haven't found any proof of either yet.
>>> But you have searched?
>>>
>>>  >>2) I find the location of the (only?) warning -dmesg- well hidden.
>>> I think it would be better to notify the user when creating the
>>> file-system.
>>>  >A notification on creating the volume and one when adding devices
>>> (either via `device add` or via a replace operation)
>>>  >would indeed be nice, but we should still keep the kernel log warning.
>>>
>>> Ok, so what would be the way to move forward on that? Would it help
>>> if I create an issue in a https://bugzilla.kernel.org/ ?
>> The biggest issue is actually figuring out if the devices don't
>> support write barriers (which means no FLUSH or broken FLUSH on Linux,
>> not no FUA/DPO, because as long as the device properly implements
>> FLUSH (and most do), Linux will provide a FUA emulation which works
>> for write barriers).  Once you've got that, it should be pretty
>> trivial to add to the messages.
>>>
>>>  >>3) Even more, it would be good, if btrfs would disable the write
>>> cache in that case, so that one does not need to rely on the user
>>>  > I would tend to disagree here. We should definitely _recommend_
>>> this to the user if we know there is no barrier support, but just
>>>  > doing it behind their back is not a good idea.
>>>
>>> Well, there is some room between 'automatic' and 'behind their back'.
>>> E.g.
>>> "Barriers are not supported by /dev/sda. Automatically disabling
>>> write-cache on mount. You can suppress this with the
>>> 'enable-cache-despite-no-barrier-support-I-know-what-I-am-doing'
>>> mount option" (maybe we can shorten the option).
>> And that's still 'behind the back' because it's a layering violation.
>> Even LVM and MD don't do this, and they have even worse issues than we
>> do because they aren't CoW.
>>>
>>>  > There are also plenty of valid reasons to want to use the write
>>> cache anyway.
>>> I cannot think of one. Who would sacrifice data integrity/potential
>>> total loss of the filesystem for speed?
>> There are quite a few cases where the risk of data loss _just doesn't
>> matter_, and any data that could be invalid is also inherently stale.
>> Some trivial examples:
>>
>> * /run on any modern Linux system. Primarily contains sockets used by
>> running services, PID files for daemons, and other similar things that
>> only matter for the duration of the current boot of the system. These
>> days, it's usually in-memory, but some people with really tight memory
>> constraints still use persistent storage for it to save memory.
>> * /tmp on any sane UNIX system. Similar case to above, but usually for
>> stuff that only matters on the scale of session lifetimes, or even
>> just process lifetimes.
>> * /var/tmp on most Linux systems. Usually the same case as /tmp.
>> * /var/cache on any sane UNIX system. By definition, if the data here
>> is lost, it doesn't matter, as it only exists for performance reasons
>> anyway. Smart applications will even validate the files they put here,
>> so corruption isn't an issue either.
>>
>> There are bunches of other examples I could list, but all of them are
>> far more situational and application specific.
>>>
>>>  > As far as FUA/DPO, I know of exactly _zero_ devices that lie about
>>> implementing it and don't.
>>> ...
>>>  > but the fact that Linux used to not issue a FLUSH command to the
>>> disks when you called fsync in userspace.
>>> Ok, thanks for that clarification.
> 



