From: Raouf Rokhjavan <rokhjavan.r@gmail.com>
To: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: linux-f2fs-devel@lists.sourceforge.net
Subject: Re: f2fs Crash Consistency Problem
Date: Thu, 25 May 2017 00:28:18 +0430
Message-ID: <2ac7e872-d127-679e-a66f-52b585ae6813@gmail.com>
In-Reply-To: <20170517180109.GC21506@jaegeuk.local>

Hi,

First of all, I'm really sorry for my absence and for replying so late.

On 05/17/17 22:31, Jaegeuk Kim wrote:
> On 05/17, Raouf Rokhjavan wrote:
>> Hi,
>>
>> Since I want to make sure that my system, having a database app, stay
>> operational after the power failure, I test database system on top of f2fs.
>> Accordingly, I use sysbench and dm-log-writes to serve this purpose. I took
>> advantage of lua scripting facility in sysbench to implement write only
>> operations in database:
>>
>> #sysbench --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua
>> --db-driver=mysql --oltp-table-size=1000 --mysql-db=sysbench
>> --mysql-user=sysbench --mysql-password=password --max-requests=100
>> --num-threads=1 --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock
>> run
>>
>> I ran this test on 3 configurations:
>> 1- ext4 (ordered, noatime) - success 15/15
>> 2- ext4 (norecovery, noatime) - success 0/15
>> 3- f2fs (noatime) - success 3/15
>>
>> Success, here, means whether file system is operational without running fsck
>> and fixing after each replay.
>> As the result show, ext4 with ordered journaling could surmount this test,
>> but ,as it had been expected, ext4 without journaling like ext2 needs fsck
>> to recover file system after simulated power loss.
>> The surprising part of this test is f2fs. As f2fs always maintains a stable
>> checkpoint of file system, and based on its FAST paper, it always rolls back
>> to its stable checkpoint after power loss, I didn't expect to see f2fs in
>> inconsistent state after replaying logs as fsck.f2fs reports. (It's
>> necessary to mention that we check consistency of f2fs after mkfs.f2fs.
>> ext4's results verify this notion.)
>>
>> Unfortunately, the results are not reproducible, and inconsistency occurs in
>> different logs; moreover, fsck.f2fs passes this test occasionally.
>> To give more accurate information, I uploaded the output of fsck.f2fs on
>> Google Drive.
>>
>> https://drive.google.com/open?id=0BxdqCs3G6wd3UWtDTmRGbFBiYmc
> Hi,
Honestly speaking, I didn't expect to encounter such a confusing
situation when I decided to verify the resiliency of f2fs after power
failure! :)
What baffles me most is that ext4 and f2fs do not behave consistently.
As I said before, ext4 passes all the sysbench tests, in which each single
log-writes entry is replayed and then checked with fsck. It doesn't show
any inconsistency.
Moreover, ext4 with the norecovery option, as we expect, fails in all tests
and needs the file system fixed after the simulated power failure.
In contrast, f2fs shows peculiar behavior: it haphazardly passes or
fails the same test on different runs!
> Could you please check:
> - did you use a snapshot device?
To show that I use dm-snapshot appropriately in my scripts, I
developed fsck_snap_f2fs_only.sh, which logs the f2fs CKPT version at
different stages: before, during, and after the snapshot. You can see it here:
CKPT version output, passed test -
https://drive.google.com/file/d/0BxdqCs3G6wd3aTNPS1pfRWlIWk0/view?usp=sharing
fsck output, passed test -
https://drive.google.com/file/d/0BxdqCs3G6wd3Nm5DSk9DX0tLUDg/view?usp=sharing
> - what command was issued at #1687?
An important point is that the failures don't occur at fixed positions;
consequently, they aren't reproducible. As for the command issued at
#1687, I don't know exactly, since my bash script calls sysbench to run
a write-only database benchmark while I capture the disk writes via
log-writes; sysbench, in turn, calls a Lua script to do the actual work.
> - how's result of fsck.f2fs -d 3?
I ran another test (with FSCK_SCRIPT=./fsck_script/fsck_snap.sh in the
config) to capture the inconsistent state. The outputs are
available here:
fsck outputs, failed test -
https://drive.google.com/file/d/0BxdqCs3G6wd3cy04TXd6QTBsbzA/view?usp=sharing
fsck -d3 output, failed test -
https://drive.google.com/file/d/0BxdqCs3G6wd3MXVzUHBGZEhlSFk/view?usp=sharing
> - can you share your log-dev image?
After you asked me to share my log-dev, I was curious to replay the
log-dev that had shown the inconsistency once more, but, surprisingly,
fsck.f2fs didn't complain at that point, and the behavior was again
unpredictable! What I mean is that, while replaying the log-dev in which
fsck.f2fs had reported an inconsistency, fsck_snap.sh passed one time and
failed another time at a different log number! A few theories come
to mind:
1) A bug in log-writes causes this behavior.
2) The virtualized block device in VMware causes this behavior,
because it's not an SSD.
3) Something is wrong with fsck.f2fs.
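
To put a number on how often the same log-dev passes or fails, the check
could be repeated and tallied. This is only a rough sketch: tally_runs is a
hypothetical helper, not part of my test scripts, and the command it runs is
whatever replay-and-fsck check you choose.

```shell
# tally_runs N CMD [ARGS...]
# Hypothetical helper: run CMD N times and print "passed/total".
# A run counts as passed when CMD exits with status 0.
tally_runs() {
    total=$1
    shift
    passed=0
    i=0
    while [ "$i" -lt "$total" ]; do
        # Discard the output of CMD; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            passed=$((passed + 1))
        fi
        i=$((i + 1))
    done
    echo "$passed/$total"
}
```

For example, "tally_runs 15 ./fsck_script/fsck_snap.sh" would print something
like "3/15", assuming the script needs no arguments and returns non-zero
when fsck.f2fs reports an inconsistency (both are assumptions here). If
identical input gives a mixed tally, that would point more toward the
replay or snapshot setup than toward the logged writes themselves.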
Another important point is the continuous stream of errors in the kernel
log while replaying and checking the consistency of the file system:
- Buffer I/O error on dm-2, logical block X, async page read
(replay-base; snapshot origin device)
- Buffer I/O error on dm-3, logical block Y, async page read
(replay-cow; COW-based snapshot device)
- ...
- buffer_io_error: Z calls suppressed
However, these kernel log errors are generated in all cases:
f2fs (both passing and failing runs) and ext4.
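
To compare these errors between passing and failing runs, they could be
counted per device. The following is a minimal sketch, assuming the kernel
log lines look like the ones quoted above; count_buffer_io_errors is a
hypothetical helper name, and it accepts the device name with or without a
leading "dev".

```shell
# count_buffer_io_errors: read kernel log text on stdin and print
# "<device> <count>" for every device named in a "Buffer I/O error on" line.
# Assumed usage: dmesg | count_buffer_io_errors
count_buffer_io_errors() {
    awk '/Buffer I\/O error on / {
             line = $0
             # Drop everything up to and including "Buffer I/O error on [dev ]".
             sub(/.*Buffer I\/O error on (dev )?/, "", line)
             # Drop ", logical block ..." so that only the device name remains.
             sub(/,.*/, "", line)
             count[line]++
         }
         END { for (d in count) print d, count[d] }' | sort
}
```

The "calls suppressed" lines are not counted, so the totals are only a lower
bound on the number of failed reads.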
*** IMPORTANT ***
The most interesting part of my tests happened when I added
fsck_snap_f2fs_only.sh to check that my scripts use dm-snapshot correctly.
As I said, I get the CKPT by calling dump.f2fs, grepping for CKPT, and
logging it, nothing more; however, the results are absolutely surprising:
15/15 tests passed. I don't know why, because there is no change in my
tests' logic. The main difference is that the tests take longer to finish,
since I call more programs to grep the CKPT.
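
The CKPT logging step is roughly the following. This is a simplified sketch,
not the actual content of fsck_snap_f2fs_only.sh: log_ckpt is a hypothetical
helper name, and the device path and log file in the usage comment are
assumptions.

```shell
# log_ckpt STAGE: read tool output on stdin, keep only the lines that
# mention CKPT, and prefix each with the stage label.
# STAGE must not contain "/" or "&", because it is pasted into a sed
# replacement.
# Assumed usage:
#   dump.f2fs /dev/mapper/replay-base | log_ckpt before >> ckpt.log
log_ckpt() {
    stage=$1
    grep CKPT | sed "s/^/[$stage] /"
}
```

Comparing the lines logged before, during, and after the snapshot shows
whether the checkpoint seen through the snapshot matches the one on the
origin device.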
I put the code for my tests on GitHub; you can run it and get the results:
http://github.com/raoufro/CrashConsistencyTest.git
What causes the weird behavior of f2fs in these tests?
Regards,
>
> Thanks,
>
>> Regards,