From: Jan Kara <jack@suse.cz>
To: Ritesh Harjani <ritesh.list@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Gao Xiang <hsiangkao@linux.alibaba.com>,
linux-ext4@vger.kernel.org, Theodore Ts'o <tytso@mit.edu>,
Matthew Bobrowski <mbobrowski@mbobrowski.org>,
Christoph Hellwig <hch@lst.de>,
Joseph Qi <joseph.qi@linux.alibaba.com>
Subject: Re: [bug report] ext4 misses final i_size meta sync under O_DIRECT | O_SYNC semantics after iomap DIO conversion
Date: Mon, 25 Sep 2023 14:08:22 +0200 [thread overview]
Message-ID: <20230925120822.g2mvg2wsd42dpsh4@quack3> (raw)
In-Reply-To: <87y1gy5s9c.fsf@doe.com>
On Fri 22-09-23 17:33:59, Ritesh Harjani wrote:
> Jan Kara <jack@suse.cz> writes:
>
> > On Wed 20-09-23 17:08:19, Ritesh Harjani wrote:
> >> Jan Kara <jack@suse.cz> writes:
> >>
> >> > Hello!
> >> >
> >> > On Tue 19-09-23 14:00:04, Gao Xiang wrote:
> >> >> Our consumer reports a behavior change between pre-iomap and iomap
> >> >> direct io conversion:
> >> >>
> >> >> If the system crashes after an appending write to a file open with
> >> >> O_DIRECT | O_SYNC flag set, file i_size won't be updated even if
> >> >> O_SYNC was marked before.
> >> >>
> >> >> It can be reproduced by a test program in the attachment with
> >> >> gcc -o repro repro.c && ./repro testfile && echo c > /proc/sysrq-trigger
> >> >>
> >> >> After some analysis, we found that before iomap direct I/O conversion,
> >> >> the timing was roughly (taking Linux 3.10 codebase as an example):
> >> >>
> >> >> ..
> >> >> - ext4_file_dio_write
> >> >> - __generic_file_aio_write
> >> >> ..
> >> >> - ext4_direct_IO # generic_file_direct_write
> >> >> - ext4_ext_direct_IO
> >> >> - ext4_ind_direct_IO # final_size > inode->i_size
> >> >> - ..
> >> >> - ret = blockdev_direct_IO()
> >> >> - i_size_write(inode, end) # orphan && ret > 0 &&
> >> >> # end > inode->i_size
> >> >> - ext4_mark_inode_dirty()
> >> >> - ...
> >> >> - generic_write_sync # handling O_SYNC
> >> >>
> >> >> So the dirty inode meta will be committed into journal immediately
> >> >> if O_SYNC is set. However, After commit 569342dc2485 ("ext4: move
> >> >> inode extension/truncate code out from ->iomap_end() callback"),
> >> >> the new behavior seems as below:
> >> >>
> >> >> ..
> >> >> - ext4_dio_write_iter
> >> >> - ext4_dio_write_checks # extend = 1
> >> >> - iomap_dio_rw
> >> >> - __iomap_dio_rw
> >> >> - iomap_dio_complete
> >> >> - generic_write_sync
> >> >> - ext4_handle_inode_extension # extend = 1
> >>
> >> Yes, since ext4_handle_inode_extension() will handle inode i_disksize
> >> update and mark the inode dirty, generic_write_sync() call should happen
> >> after that.
> >>
> >> That also means then we don't have any generic FS testcase which can
> >> validate this behaviour.
> >
> > Yeah.
> >
>
> Ok. Let me then first send a fstest in response to this integrity
> problem with directio and o_sync.
Thanks for working on the testcase! I have written a fix in the meantime
but so far it causes weird inconsistencies in fsstress ENOSPC hitters tests
so I'm still debugging that.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
prev parent reply other threads:[~2023-09-25 12:08 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-19 6:00 [bug report] ext4 misses final i_size meta sync under O_DIRECT | O_SYNC semantics after iomap DIO conversion Gao Xiang
2023-09-19 12:05 ` Jan Kara
2023-09-19 13:47 ` Gao Xiang
2023-09-20 7:29 ` Dave Chinner
2023-09-20 7:36 ` Gao Xiang
2023-09-20 11:38 ` Ritesh Harjani
2023-09-20 15:20 ` Jan Kara
2023-09-22 12:03 ` Ritesh Harjani
2023-09-22 12:10 ` [PATCH] generic: Add integrity tests with synchronous directio Ritesh Harjani (IBM)
2023-09-22 13:29 ` Gao Xiang
2023-09-22 16:09 ` Ritesh Harjani
2023-09-22 15:21 ` Darrick J. Wong
2023-09-22 16:13 ` Ritesh Harjani
2023-09-22 17:06 ` Zorro Lang
2023-09-23 10:25 ` Ritesh Harjani
2023-09-23 12:00 ` [PATCHv2 1/2] aio-dio-write-verify: Add sync and noverify option Ritesh Harjani (IBM)
2023-09-23 12:00 ` [PATCHv2 2/2] generic: Add integrity tests with synchronous directio Ritesh Harjani (IBM)
2023-09-28 3:42 ` Zorro Lang
2023-09-28 4:51 ` Ritesh Harjani
2023-09-25 12:08 ` Jan Kara [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230925120822.g2mvg2wsd42dpsh4@quack3 \
--to=jack@suse.cz \
--cc=hch@lst.de \
--cc=hsiangkao@linux.alibaba.com \
--cc=joseph.qi@linux.alibaba.com \
--cc=linux-ext4@vger.kernel.org \
--cc=mbobrowski@mbobrowski.org \
--cc=ritesh.list@gmail.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).