From: Chao Yu <yuchao0@huawei.com>
To: Gao Xiang <hsiangkao@aol.com>, Hongwei Qin <glqinhongwei@gmail.com>
Cc: linux-f2fs-devel <linux-f2fs-devel@lists.sourceforge.net>
Subject: Re: [f2fs-dev] Potential data corruption?
Date: Mon, 9 Dec 2019 18:46:28 +0800
Message-ID: <11aeed7b-24e2-61ba-fddc-6684aac2b152@huawei.com>
In-Reply-To: <20191208135117.GA12771@hsiangkao-HP-ZHAN-66-Pro-G1>
On 2019/12/8 21:51, Gao Xiang via Linux-f2fs-devel wrote:
> Hi,
>
> On Sun, Dec 08, 2019 at 09:15:55PM +0800, Hongwei Qin wrote:
>> Hi,
>>
>> On Sun, Dec 8, 2019 at 12:01 PM Chao Yu <chao@kernel.org> wrote:
>>>
>>> Hello,
>>>
>>> On 2019-12-7 18:10, 红烧的威化饼 wrote:
>>>> Hi F2FS experts,
>>>> The following confuses me:
>>>>
>>>> A typical fsync() goes like this:
>>>> 1) Issue data block IOs
>>>> 2) Wait for completion
>>>> 3) Issue chained node block IOs
>>>> 4) Wait for completion
>>>> 5) Issue flush command
>>>>
>>>> In order to preserve data consistency under sudden power failure, it requires that the storage device persists data blocks prior to node blocks.
>>>> Otherwise, under sudden power failure, it's possible that the persisted node block points to NULL data blocks.
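The concern in the steps above can be sketched with a toy model. It assumes (as the question does) that "completion" in steps 2 and 4 only means the write reached the device's volatile cache, and that the device may write cached blocks back in any order until the flush in step 5. The function name and the block labels are made up for illustration; this is not F2FS code.

```python
from itertools import combinations


def possible_crash_states(cached_writes):
    """Enumerate every subset of cached writes that could be on the media
    if power is lost before a flush (assumed: the device may write back
    cached blocks in any order, or not at all)."""
    states = []
    for k in range(len(cached_writes) + 1):
        for subset in combinations(cached_writes, k):
            states.append(frozenset(subset))
    return states


# Steps 1-4 have completed into the device cache; step 5 (flush) has not run.
writes = ["data", "node"]
states = possible_crash_states(writes)

# The problematic state: the node block is on the media, its data block is not.
dangerous = [s for s in states if "node" in s and "data" not in s]
print(dangerous)
```

Under these assumptions, one of the four possible on-media states is "node without data", which is the case the question is about.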
>>>
>>> First, this doesn't break POSIX semantics, right? fsync() had not returned
>>> successfully before the sudden power cut, so we cannot guarantee that the data
>>> is fully persisted in that case.
>>>
>>> However, what you want looks like atomic write semantics, which databases
>>> mostly want guaranteed during db file updates.
>>>
>>> F2FS supports atomic_write via ioctl, which is officially used by SQLite. I
>>> guess you can check its implementation details.
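For reference, a rough sketch of how the atomic write ioctls might be driven from user space. It assumes the request numbers are _IO(0xf5, 1) for F2FS_IOC_START_ATOMIC_WRITE and _IO(0xf5, 2) for F2FS_IOC_COMMIT_ATOMIC_WRITE, and that _IO() uses the x86/ARM encoding (direction and size bits zero); other architectures encode _IO() differently. The helper atomic_update() is hypothetical, has no error handling beyond closing the fd, and only makes sense on a file that lives on F2FS.

```python
import fcntl
import os

F2FS_IOCTL_MAGIC = 0xF5


def _io(magic, nr):
    # _IO(type, nr) with direction = none and size = 0.
    # Assumption: x86/ARM-style ioctl number layout.
    return (magic << 8) | nr


F2FS_IOC_START_ATOMIC_WRITE = _io(F2FS_IOCTL_MAGIC, 1)
F2FS_IOC_COMMIT_ATOMIC_WRITE = _io(F2FS_IOCTL_MAGIC, 2)


def atomic_update(path, offset, payload):
    """Hypothetical helper: write `payload` at `offset` inside one
    atomic-write window on an F2FS file."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.ioctl(fd, F2FS_IOC_START_ATOMIC_WRITE)
        os.pwrite(fd, payload, offset)
        fcntl.ioctl(fd, F2FS_IOC_COMMIT_ATOMIC_WRITE)
    finally:
        os.close(fd)
```

The intent is that writes between start and commit become visible together or not at all; check the kernel source for the exact semantics on your kernel version.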
>>>
>>> Thanks,
>>>
>>
>> Thanks for your kind reply.
>> It's true that if we hit a power failure before fsync() completes,
>> POSIX doesn't require the FS to recover the file. However, consider the
>> following situation:
>>
>> 1) Data block IOs (Not persisted)
>> 2) Node block IOs (All Persisted)
>> 3) Power failure
>>
>> Since the node blocks are all persisted before the power failure, the node
>> chain isn't broken. Note that this file's new data was not properly
>> persisted before the crash, so the recovery process should be able to
>> recognize this situation and avoid recovering this file. However, since
>> the node chain is not broken, perhaps the recovery process will regard
>> this file as recoverable?
>
> As far as my own limited understanding goes, I'm afraid this seems true in
> the extreme case. Without a proper FLUSH command, newer nodes could be
> recovered even though no newer data was persisted.
>
> So if fsync() is not successful, the old data should be read,
> but in this case unexpected data (neither A nor A'; it could be random data
> C) will be considered valid since its node is OK.
>
> It seems it should FLUSH data before the related node chain is written, or
> introduce some data checksum.
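The data checksum idea could look roughly like the sketch below. This is purely illustrative: the node layout (a dict with "addr" and "crc"), the helper names, and the use of CRC32 are assumptions, not something F2FS does today.

```python
import zlib


def make_node(block_addr, data):
    """Hypothetical node entry: block address plus a CRC32 of the new data."""
    return {"addr": block_addr, "crc": zlib.crc32(data)}


def recoverable(node, media):
    """Recovery accepts the node only if the block on the media matches
    the checksum recorded in the node."""
    data = media.get(node["addr"])
    return data is not None and zlib.crc32(data) == node["crc"]


new_data = b"A'" * 4
node = make_node(7, new_data)       # node persisted, pointing at block 7

media = {7: b"C" * 8}               # block 7 still holds unrelated data C
stale_ok = recoverable(node, media)

media[7] = new_data                 # the new data did reach the media
fresh_ok = recoverable(node, media)

print(stale_ok, fresh_ok)
```

With such a check, recovery would skip a node whose data block never reached the media instead of exposing C, at the cost of checksumming every fsynced block.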
>
> If I am wrong, kindly correct me...

Yes, I guess if the user wants a stronger consistency guarantee from fsync() than
the POSIX one, we can refactor fsync_mode=strict a bit to handle fsync() IOs the
way we do atomic write IOs, keeping a strict data/node IO order. But note that
such a consistency guarantee is weak: after a sudden power cut, the recovered file
may contain mixed old and new data (fsynced data only partially persisted), which
may also crash the apps.
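A toy illustration of the mixed old/new case. It assumes a simplified model in which the file is covered by several (data, node) pairs written strictly in order, and the crash lets only the first few pairs reach the media; recovery then picks up the new version for those ranges and the old version for the rest. The function and the block labels are made up for illustration.

```python
def recovered_file(old, new, pairs_persisted):
    """Contents after a crash, given that only the first `pairs_persisted`
    (data, node) pairs reached the media (simplified model)."""
    return [new[i] if i < pairs_persisted else old[i] for i in range(len(old))]


old = ["A0", "A1", "A2", "A3"]
new = ["B0", "B1", "B2", "B3"]

for k in range(len(old) + 1):
    print(k, recovered_file(old, new, k))
```

Every state with 0 < k < 4 is a file that never existed from the application's point of view, which is why strict ordering alone does not give atomic-write semantics.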

Thanks,
>
> Thanks,
> Gao Xiang
>
>
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>