From: Christoph Hellwig <hch@lst.de>
To: stable@vger.kernel.org
Cc: linux-xfs@vger.kernel.org, Amir Goldstein <amir73il@gmail.com>,
Josef Bacik <jbacik@fb.com>,
"Darrick J . Wong" <darrick.wong@oracle.com>
Subject: [PATCH 22/25] xfs: fix incorrect log_flushed on fsync
Date: Sun, 17 Sep 2017 14:06:28 -0700 [thread overview]
Message-ID: <20170917210631.10725-23-hch@lst.de> (raw)
In-Reply-To: <20170917210631.10725-1-hch@lst.de>
From: Amir Goldstein <amir73il@gmail.com>
commit 47c7d0b19502583120c3f396c7559e7a77288a68 upstream.
When calling into _xfs_log_force{,_lsn}() with a pointer
to log_flushed variable, log_flushed will be set to 1 if:
1. xlog_sync() is called to flush the active log buffer
AND/OR
2. xlog_wait() is called to wait on a syncing log buffers
xfs_file_fsync() checks the value of log_flushed after
_xfs_log_force_lsn() call to optimize away an explicit
PREFLUSH request to the data block device after writing
out all the file's pages to disk.
This optimization is incorrect in the following sequence of events:
Task A Task B
-------------------------------------------------------
xfs_file_fsync()
_xfs_log_force_lsn()
xlog_sync()
[submit PREFLUSH]
xfs_file_fsync()
file_write_and_wait_range()
[submit WRITE X]
[endio WRITE X]
_xfs_log_force_lsn()
xlog_wait()
[endio PREFLUSH]
The write X is not guarantied to be on persistent storage
when PREFLUSH request in completed, because write A was submitted
after the PREFLUSH request, but xfs_file_fsync() of task A will
be notified of log_flushed=1 and will skip explicit flush.
If the system crashes after fsync of task A, write X may not be
present on disk after reboot.
This bug was discovered and demonstrated using Josef Bacik's
dm-log-writes target, which can be used to record block io operations
and then replay a subset of these operations onto the target device.
The test goes something like this:
- Use fsx to execute ops of a file and record ops on log device
- Every now and then fsync the file, store md5 of file and mark
the location in the log
- Then replay log onto device for each mark, mount fs and compare
md5 of file to stored value
Cc: Christoph Hellwig <hch@lst.de>
Cc: Josef Bacik <jbacik@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
fs/xfs/xfs_log.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index bcb2f860e508..c5107c7bc4bf 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -3375,8 +3375,6 @@ _xfs_log_force(
*/
if (iclog->ic_state & XLOG_STATE_IOERROR)
return -EIO;
- if (log_flushed)
- *log_flushed = 1;
} else {
no_sleep:
@@ -3480,8 +3478,6 @@ _xfs_log_force_lsn(
xlog_wait(&iclog->ic_prev->ic_write_wait,
&log->l_icloglock);
- if (log_flushed)
- *log_flushed = 1;
already_slept = 1;
goto try_again;
}
@@ -3515,9 +3511,6 @@ _xfs_log_force_lsn(
*/
if (iclog->ic_state & XLOG_STATE_IOERROR)
return -EIO;
-
- if (log_flushed)
- *log_flushed = 1;
} else { /* just return */
spin_unlock(&log->l_icloglock);
}
--
2.14.1
next prev parent reply other threads:[~2017-09-17 21:06 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-17 21:06 4.13-stable updates for XFS Christoph Hellwig
2017-09-17 21:06 ` [PATCH 01/25] xfs: write unmount record for ro mounts Christoph Hellwig
2017-09-17 21:06 ` [PATCH 02/25] xfs: toggle readonly state around xfs_log_mount_finish Christoph Hellwig
2017-09-17 21:06 ` [PATCH 03/25] xfs: Add infrastructure needed for error propagation during buffer IO failure Christoph Hellwig
2017-09-17 21:06 ` [PATCH 04/25] xfs: Properly retry failed inode items in case of error during buffer writeback Christoph Hellwig
2017-09-17 21:06 ` [PATCH 05/25] xfs: fix recovery failure when log record header wraps log end Christoph Hellwig
2017-09-17 21:06 ` [PATCH 06/25] xfs: always verify the log tail during recovery Christoph Hellwig
2017-09-17 21:06 ` [PATCH 07/25] xfs: fix log recovery corruption error due to tail overwrite Christoph Hellwig
2017-09-17 21:06 ` [PATCH 08/25] xfs: handle -EFSCORRUPTED during head/tail verification Christoph Hellwig
2017-09-17 21:06 ` [PATCH 09/25] xfs: stop searching for free slots in an inode chunk when there are none Christoph Hellwig
2017-09-17 21:06 ` [PATCH 10/25] xfs: evict all inodes involved with log redo item Christoph Hellwig
2017-09-17 21:06 ` [PATCH 11/25] xfs: check for race with xfs_reclaim_inode() in xfs_ifree_cluster() Christoph Hellwig
2017-09-17 21:06 ` [PATCH 12/25] xfs: open-code xfs_buf_item_dirty() Christoph Hellwig
2017-09-17 21:06 ` [PATCH 13/25] xfs: remove unnecessary dirty bli format check for ordered bufs Christoph Hellwig
2017-09-17 21:06 ` [PATCH 14/25] xfs: ordered buffer log items are never formatted Christoph Hellwig
2017-09-17 21:06 ` [PATCH 15/25] xfs: refactor buffer logging into buffer dirtying helper Christoph Hellwig
2017-09-17 21:06 ` [PATCH 16/25] xfs: don't log dirty ranges for ordered buffers Christoph Hellwig
2017-09-17 21:06 ` [PATCH 17/25] xfs: skip bmbt block ino validation during owner change Christoph Hellwig
2017-09-17 21:06 ` [PATCH 18/25] xfs: move bmbt owner change to last step of extent swap Christoph Hellwig
2017-09-17 21:06 ` [PATCH 19/25] xfs: disallow marking previously dirty buffers as ordered Christoph Hellwig
2017-09-17 21:06 ` [PATCH 20/25] xfs: relog dirty buffers during swapext bmbt owner change Christoph Hellwig
2017-09-17 21:06 ` [PATCH 21/25] xfs: disable per-inode DAX flag Christoph Hellwig
2017-09-17 21:06 ` Christoph Hellwig [this message]
2017-09-17 21:06 ` [PATCH 23/25] xfs: don't set v3 xflags for v2 inodes Christoph Hellwig
2017-09-17 21:06 ` [PATCH 24/25] xfs: open code end_buffer_async_write in xfs_finish_page_writeback Christoph Hellwig
2017-09-17 21:06 ` [PATCH 25/25] xfs: use kmem_free to free return value of kmem_zalloc Christoph Hellwig
2017-09-18 8:31 ` 4.13-stable updates for XFS Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170917210631.10725-23-hch@lst.de \
--to=hch@lst.de \
--cc=amir73il@gmail.com \
--cc=darrick.wong@oracle.com \
--cc=jbacik@fb.com \
--cc=linux-xfs@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).