From: Zhang Yi <yi.zhang@huawei.com>
To: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>, <tytso@mit.edu>,
<adilger.kernel@dilger.ca>, <yukuai3@huawei.com>
Subject: Re: [PATCH] ext4: add barrier info if journal device write cache is not enabled
Date: Tue, 29 Nov 2022 14:16:47 +0800
Message-ID: <91aff807-ecde-b37f-444c-010276fd09f7@huawei.com>
In-Reply-To: <20221128151551.fo6ct7nbozlqjvci@quack3>
On 2022/11/28 23:15, Jan Kara wrote:
> On Mon 28-11-22 21:01:07, Zhang Yi wrote:
>> On 2022/11/28 18:11, Jan Kara wrote:
>>> On Thu 24-11-22 21:57:44, Zhang Yi wrote:
>>>> The block layer checks for and suppresses flush bios if the device write
>>>> cache is not enabled, so the journal barrier does not take effect even
>>>> if the user specifies the 'barrier=1' mount option. This is dangerous if
>>>> the reported write cache state is a false negative, and we cannot easily
>>>> detect such a case. So just add an info message and a query interface to
>>>> let the sysadmin know that the barrier is suppressed because the write
>>>> cache is not enabled.
>>>>
>>>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
>>>
>>> Hum, so have you seen a situation where the write cache information is
>>> incorrect in the block layer? Does it happen often enough that it warrants
>>> an extra sysfs file?
>>>
>>
>> Thanks for the response. Yes, it often happens on some SCSI devices behind a
>> RAID card: the disks below the RAID card have their write cache enabled, but
>> the RAID driver declares the write cache disabled when probing, and the RAID
>> card apparently cannot guarantee that data is written back to the disk medium
>> on power failure. So the ext4 filesystem will probably be corrupted at the
>> next startup. It's difficult to tell whether it's a hardware or a software
>> problem. I am not familiar with the RAID card, so I don't know why the cache
>> state is incorrect (maybe a misconfiguration or a firmware bug).
>
> OK, thanks for the info. I believe usually you're expected to disable write
> cache on the disks themselves and leave caching to the RAID card. But I'm
> not an expert here and it's a bit beside the point anyway ;)
>
>>> After all you should be able to query what the block layer thinks about the
>>> write cache - you definitely can for SCSI devices, I'm not sure about
>>> others. So you can have a look there. Providing this info in the filesystem
>>> seems like doing it in the wrong layer - I don't see anything jbd2/ext4
>>> specific here...
>>>
>>
>> Yes, the best way is to figure out the RAID card problem.
>> This patch does not aim to fix anything in ext4. The reason I want to add
>> this to ext4 is just to give a hint from the fs barrier's point of view: it
>> shows the barrier's running state at mount time, which could help us narrow
>> down the cache problem more easily when we find ext4 corruption after a power
>> failure. Before this patch, we could do that through the SCSI probing info and
>> /sys/block/sda/queue/write_cache (maybe some others?), but it's not quite clear.
>>
>> [ 2.520176] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>
>> [root@localhost ~]# cat /sys/block/sda/queue/write_cache
>> write back
>
> Yes. /sys/block/<device>/queue/write_cache is what you should query to find
> whether barriers will be ignored or not. My point is: you need this for
> ext4; if you start using the XFS filesystem you'd need a similar patch for
> XFS, and if you then transition to btrfs you'd need this for btrfs as well.
> All this duplication arises because you are querying a property of the
> underlying block device through the filesystem. So why not ask the
> block device directly?
>
> I understand it may be more *convenient* to grab the information from the
> filesystem given the infrastructure you have for gathering filesystem
> information. But carrying around various sysfs files has its cost as well.
>
OK, it's fine, let's keep querying the block layer.
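
For reference, a minimal sketch of asking the block device directly, as
discussed above. It assumes the queue/write_cache attribute reads back
either "write back" or "write through"; the helper names
parse_write_cache() and flushes_suppressed() are made up for this
illustration and are not part of any existing tool. Note that it only
reports what the block layer believes, so it cannot detect the false
negative case by itself.

```python
from pathlib import Path

# Standard sysfs location for block devices.
SYSFS_BLOCK = Path("/sys/block")


def parse_write_cache(text: str) -> bool:
    """Return True if the block layer considers the write cache enabled."""
    mode = text.strip()
    if mode == "write back":
        return True
    if mode == "write through":
        return False
    raise ValueError(f"unexpected write_cache value: {mode!r}")


def flushes_suppressed(device: str, root: Path = SYSFS_BLOCK) -> bool:
    """Return True if the device is reported as write through.

    In that state the block layer drops flush requests, so the journal
    barrier has no effect even with 'barrier=1'.
    """
    attr = root / device / "queue" / "write_cache"
    return not parse_write_cache(attr.read_text())


if __name__ == "__main__":
    import sys

    # Hypothetical default of "sda", matching the example earlier in the thread.
    for dev in sys.argv[1:] or ["sda"]:
        try:
            state = "suppressed" if flushes_suppressed(dev) else "passed down"
        except (OSError, ValueError) as exc:
            state = f"unknown ({exc})"
        print(f"{dev}: flush requests {state}")
```

The same check works regardless of which filesystem sits on top of the
device, which is the point of doing it at this layer.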
Thanks,
Yi.