linux-fsdevel.vger.kernel.org archive mirror
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 1/5] iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents
Date: Tue, 20 Nov 2018 15:00:17 -0800
Message-ID: <20181120230017.GL6792@magnolia>
In-Reply-To: <20181119211742.8824-2-david@fromorbit.com>

On Tue, Nov 20, 2018 at 08:17:38AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> When we write into an unwritten extent via direct IO, we dirty
> metadata on IO completion to convert the unwritten extent to
> written. However, when we do the FUA optimisation checks, the inode
> may be clean and so we issue a FUA write into the unwritten extent.
> This means we then bypass the generic_write_sync() call after
> unwritten extent conversion has been done and we don't force the
> modified metadata to stable storage.
> 
> This violates O_DSYNC semantics. The window of exposure is a single
> IO, as the next DIO write will see the inode has dirty metadata and
> hence will not use the FUA optimisation. Calling
> generic_write_sync() after completion of the second IO will also
> sync the first write and its metadata.
> 
> Fix this by avoiding the FUA optimisation when writing to unwritten
> extents.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/iomap.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/iomap.c b/fs/iomap.c
> index 64ce240217a1..72f3864a2e6b 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -1596,12 +1596,13 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  
>  	if (iomap->flags & IOMAP_F_NEW) {
>  		need_zeroout = true;
> -	} else {
> +	} else if (iomap->type == IOMAP_MAPPED) {
>  		/*
> -		 * Use a FUA write if we need datasync semantics, this
> -		 * is a pure data IO that doesn't require any metadata
> -		 * updates and the underlying device supports FUA. This
> -		 * allows us to avoid cache flushes on IO completion.
> +		 * Use a FUA write if we need datasync semantics, this is a pure
> +		 * data IO that doesn't require any metadata updates (including
> +		 * after IO completion such as unwritten extent conversion) and
> +		 * the underlying device supports FUA. This allows us to avoid
> +		 * cache flushes on IO completion.
>  		 */
>  		if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) &&
>  		    (dio->flags & IOMAP_DIO_WRITE_FUA) &&
> -- 
> 2.19.1
> 
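
For readers following the thread, the change in the FUA decision can be
modelled in user space. This is only a sketch of the logic described in the
commit message and the hunk above, not the kernel code: every name prefixed
with model_ or MODEL_ is hypothetical, the flag values are arbitrary, and the
device_fua field is an assumption standing in for "the underlying device
supports FUA" from the patch comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the iomap extent types; not the kernel's. */
enum model_iomap_type {
	MODEL_IOMAP_HOLE,
	MODEL_IOMAP_MAPPED,
	MODEL_IOMAP_UNWRITTEN,
};

/* Hypothetical flag bits; the values are illustrative only. */
#define MODEL_F_NEW    (1u << 0)	/* models IOMAP_F_NEW */
#define MODEL_F_DIRTY  (1u << 1)	/* models IOMAP_F_DIRTY */
#define MODEL_F_SHARED (1u << 2)	/* models IOMAP_F_SHARED */

struct model_write {
	enum model_iomap_type type;
	unsigned int flags;
	bool wants_fua;		/* models dio->flags & IOMAP_DIO_WRITE_FUA */
	bool device_fua;	/* assumption: device advertises FUA support */
};

/* The flag checks shared by both versions of the decision. */
static bool model_fua_checks(const struct model_write *w)
{
	return !(w->flags & (MODEL_F_SHARED | MODEL_F_DIRTY)) &&
	       w->wants_fua && w->device_fua;
}

/* Before the patch: any extent without the NEW flag could take the FUA path. */
bool model_use_fua_old(const struct model_write *w)
{
	if (w->flags & MODEL_F_NEW)
		return false;
	return model_fua_checks(w);
}

/* After the patch: only extents of type MAPPED are eligible for FUA. */
bool model_use_fua_new(const struct model_write *w)
{
	if (w->flags & MODEL_F_NEW)
		return false;
	if (w->type != MODEL_IOMAP_MAPPED)
		return false;
	return model_fua_checks(w);
}

/* Walks through the cases discussed in the commit message. */
void model_selftest(void)
{
	struct model_write w = { MODEL_IOMAP_UNWRITTEN, 0, true, true };

	/* Clean inode, unwritten extent: the old check picked FUA. */
	assert(model_use_fua_old(&w));
	/* The new check refuses, so the normal sync path still runs. */
	assert(!model_use_fua_new(&w));

	/* A plain overwrite of a written extent keeps the optimisation. */
	w.type = MODEL_IOMAP_MAPPED;
	assert(model_use_fua_new(&w));

	/* Dirty metadata disables FUA in both versions. */
	w.flags = MODEL_F_DIRTY;
	assert(!model_use_fua_old(&w));
	assert(!model_use_fua_new(&w));
}
```

Calling model_selftest() from a small main() exercises the unwritten-extent
case that the patch closes, along with the mapped and dirty-inode cases that
behave the same before and after.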

Thread overview: 16+ messages
2018-11-19 21:17 [PATCH 0/5] iomap: data corruption fixes and more Dave Chinner
2018-11-19 21:17 ` [PATCH 1/5] iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents Dave Chinner
2018-11-20  7:48   ` Christoph Hellwig
2018-11-20 23:00   ` Darrick J. Wong [this message]
2018-11-19 21:17 ` [PATCH 2/5] iomap: sub-block dio needs to zeroout beyond EOF Dave Chinner
2018-11-20 23:01   ` Darrick J. Wong
2018-11-19 21:17 ` [PATCH 3/5] iomap: dio data corruption and spurious errors when pipes fill Dave Chinner
2018-11-19 21:17 ` [PATCH 4/5] splice: increase pipe size in splice_direct_to_actor() Dave Chinner
2018-11-20  7:49   ` Christoph Hellwig
2018-11-20 23:02   ` Darrick J. Wong
2018-11-19 21:17 ` [PATCH 5/5] vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP Dave Chinner
2018-11-20  7:49   ` Christoph Hellwig
2018-11-21  6:50 ` [PATCH 6/5] iomap: readpages doesn't zero page tail beyond EOF properly Dave Chinner
2018-11-21  8:27   ` [PATCH 6/5 V2] " Dave Chinner
2018-11-21 16:20     ` Darrick J. Wong
2018-11-21 16:34     ` Christoph Hellwig
