From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: use flush command instead of FUA for zoned device
Date: Thu, 21 Apr 2022 08:20:22 -0700
Message-ID: <YmF2Nqu8Rtc4cx52@google.com>
In-Reply-To: <42e10758-e50a-7aaa-dfa9-dcf6338ebaff@opensource.wdc.com>

On 04/21, Damien Le Moal wrote:
> On 4/20/22 06:57, Jaegeuk Kim wrote:
> > The block layer for zoned disk can reorder the FUA'ed IOs. Let's use flush
> > command to keep the write order.
> 
> Strictly speaking, for a request that has data, the problem is triggered
> by REQ_PREFLUSH since in this case the request does not go through the
> scheduler and is processed through the blk-flush machinery. REQ_FUA on its
> own should not matter if the device supports it. If the device does not
> support FUA, then the same problem can happen due to POSTFLUSH (again no
> scheduler).

I think the problem is data piggy-backed on a flush or FUA request, whichever
it is; that is why I switched to a separate flush command.

> 
> Bypassing the scheduler leads to the write not write-locking the zone,
> which leads to reordering... Completely overlooked that case when the zone
> write locking was implemented.
> 
> Ideally, the FS should not have to care about this. blk-flush machinery
> should be a little more intelligent and process the write phase of the
> request using the scheduler. Need to look into that.

Please do. I'm okay with reverting this once the block layer supports it.
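
To illustrate why the ordering matters, here is a toy userspace model, not
kernel code. All names (toy_zone, toy_write, toy_dispatch) are made up for
illustration. It only assumes the basic rule of a sequential-write-required
zone: a write is accepted only if it starts at the zone's write pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one sequential-write-required zone (hypothetical, not kernel code). */
struct toy_zone {
	unsigned long long start;	/* first LBA of the zone */
	unsigned long long wp;		/* write pointer: next LBA that may be written */
	unsigned long long len;		/* zone length in blocks */
};

struct toy_write {
	unsigned long long lba;
	unsigned long long nr_blocks;
};

/* Accept a write only if it starts exactly at the write pointer and fits. */
bool toy_zone_write(struct toy_zone *z, const struct toy_write *w)
{
	if (w->lba != z->wp || w->lba + w->nr_blocks > z->start + z->len)
		return false;
	z->wp += w->nr_blocks;
	return true;
}

/* Dispatch writes in the given order; return how many the zone accepted. */
size_t toy_dispatch(struct toy_zone *z, const struct toy_write *w, size_t n)
{
	size_t ok = 0;

	for (size_t i = 0; i < n; i++)
		if (toy_zone_write(z, &w[i]))
			ok++;
	return ok;
}
```

If the writes {0,4}, {4,4}, {8,4} are dispatched in that order, all three are
accepted. If the last one overtakes the other two, as a write that skips the
zone write lock could, it is rejected because the write pointer is still at 0.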

> 
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/file.c | 4 +++-
> >  fs/f2fs/node.c | 2 +-
> >  2 files changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index f08e6208e183..2aef0632f35b 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -372,7 +372,9 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
> >  	f2fs_remove_ino_entry(sbi, ino, APPEND_INO);
> >  	clear_inode_flag(inode, FI_APPEND_WRITE);
> >  flush_out:
> > -	if (!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER)
> > +	if ((!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER) ||
> > +			(atomic && !test_opt(sbi, NOBARRIER) &&
> > +					f2fs_sb_has_blkzoned(sbi)))
> 
> Aligning the conditions and not breaking the second line would make this a
> lot easier to read...

Sure.
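
For reference, the same condition can be read as a small predicate. This is
only a sketch to make the logic easier to follow: struct sync_ctx and
need_issue_flush() are hypothetical stand-ins, not f2fs code, and each field
stands for the kernel expression named in its comment.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the state tested in f2fs_do_sync_file(). */
struct sync_ctx {
	bool atomic;		/* atomic fsync path */
	bool fsync_nobarrier;	/* fsync_mode == FSYNC_MODE_NOBARRIER */
	bool opt_nobarrier;	/* test_opt(sbi, NOBARRIER) */
	bool blkzoned;		/* f2fs_sb_has_blkzoned(sbi) */
};

/* Mirrors the condition in the hunk above: should a separate flush be issued? */
bool need_issue_flush(const struct sync_ctx *c)
{
	/* Non-atomic fsync: unchanged, flush unless fsync_mode is nobarrier. */
	if (!c->atomic)
		return !c->fsync_nobarrier;

	/* Atomic fsync: new case, flush only on zoned devices with barriers on. */
	return !c->opt_nobarrier && c->blkzoned;
}
```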

> 
> >  		ret = f2fs_issue_flush(sbi, inode->i_ino);
> >  	if (!ret) {
> >  		f2fs_remove_ino_entry(sbi, ino, UPDATE_INO);
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index c280f482c741..7224a980056f 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -1633,7 +1633,7 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
> >  		goto redirty_out;
> >  	}
> >  
> > -	if (atomic && !test_opt(sbi, NOBARRIER))
> > +	if (atomic && !test_opt(sbi, NOBARRIER) && !f2fs_sb_has_blkzoned(sbi))
> >  		fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
> 
> Is this really OK to do? Flush + write as different operations may not
> lead to the same result as a preflush+fua write.
> 
> Until the block layer is fixed to properly handle this, a simpler fix for
> f2fs would be to force-enable the NOBARRIER option for zoned drives? That
> would avoid these changes, no?

No, that would hurt FS metadata consistency.
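
A rough sketch of the difference, modelling only the two hunks of this patch.
The TOY_* flags and both function names are hypothetical, not the kernel's.
With the patch, an atomic sync on a zoned device still gets a barrier, just as
a separate flush. With NOBARRIER forced on, it gets none at all.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for REQ_PREFLUSH and REQ_FUA. */
#define TOY_PREFLUSH	(1u << 0)
#define TOY_FUA		(1u << 1)

/* Models the node.c hunk: flags attached to an atomic node page write. */
unsigned int node_write_flags(bool atomic, bool nobarrier, bool zoned)
{
	if (atomic && !nobarrier && !zoned)
		return TOY_PREFLUSH | TOY_FUA;
	return 0;
}

/* Does an atomic sync end up with any barrier (flagged write or separate flush)? */
bool atomic_sync_has_barrier(bool nobarrier, bool zoned)
{
	bool flagged = node_write_flags(true, nobarrier, zoned) != 0;
	/* Models the atomic branch of the file.c hunk. */
	bool separate_flush = !nobarrier && zoned;

	return flagged || separate_flush;
}
```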

> 
> Also, with all the testing we do on SMR disks and f2fs (smaller, older SMR
> disks due to the 16TB limit), we have never triggered this problem. How
> did you trigger it?

This happens only on Android, since atomic_write for sqlite takes this path.

> 
> >  
> >  	/* should add to global list before clearing PAGECACHE status */
> 
> 
> -- 
> Damien Le Moal
> Western Digital Research



