From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Christoph Hellwig <hch@lst.de>, Josef Bacik <jbacik@fb.com>,
	Amir Goldstein <amir73il@gmail.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>
Subject: [PATCH 4.9 74/78] xfs: fix incorrect log_flushed on fsync
Date: Mon, 18 Sep 2017 11:12:23 +0200
Message-ID: <20170918091137.710824174@linuxfoundation.org>
In-Reply-To: <20170918091126.077483037@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Amir Goldstein <amir73il@gmail.com>

commit 47c7d0b19502583120c3f396c7559e7a77288a68 upstream.

When calling into _xfs_log_force{,_lsn}() with a pointer
to a log_flushed variable, log_flushed will be set to 1 if:
1. xlog_sync() is called to flush the active log buffer
AND/OR
2. xlog_wait() is called to wait on a syncing log buffer

xfs_file_fsync() checks the value of log_flushed after the
_xfs_log_force_lsn() call to optimize away an explicit
PREFLUSH request to the data block device after writing
out all the file's pages to disk.
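
As a rough illustration of that optimization (a toy model, not the
kernel code), the tail of an fsync path can be sketched as below.
struct toy_bdev and fsync_tail() are hypothetical names introduced
only for this sketch; the flush is modeled as a counter.

```c
/* Hypothetical stand-in for the data block device: counts the explicit
 * cache flushes that were sent to it. */
struct toy_bdev {
	int flushes_issued;
};

/*
 * Toy version of the decision made after the log force returns:
 * if the log force reported log_flushed != 0, assume the device cache
 * was already flushed and skip the explicit flush; otherwise send one.
 */
static void fsync_tail(struct toy_bdev *data_dev, int log_flushed)
{
	if (!log_flushed)
		data_dev->flushes_issued++;	/* explicit flush to the data device */
}
```

The sketch makes the dependency visible: whenever log_flushed is
reported as 1, durability of the file data rests entirely on the flush
that the log force is assumed to have performed.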

This optimization is incorrect in the following sequence of events:

 Task A                    Task B
 -------------------------------------------------------
 xfs_file_fsync()
   _xfs_log_force_lsn()
     xlog_sync()
        [submit PREFLUSH]
                           xfs_file_fsync()
                             file_write_and_wait_range()
                               [submit WRITE X]
                               [endio  WRITE X]
                             _xfs_log_force_lsn()
                               xlog_wait()
        [endio  PREFLUSH]

Write X is not guaranteed to be on persistent storage
when the PREFLUSH request is completed, because write X was submitted
after the PREFLUSH request, but xfs_file_fsync() of task B will
be notified of log_flushed=1 and will skip the explicit flush.

If the system crashes after the fsync of task B, write X may not be
present on disk after reboot.
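
The event order in the table can be replayed against a toy device
model. This is only a sketch under stated assumptions: the device is
modeled pessimistically, so a flush persists exactly the writes that
had completed before the flush was submitted. All names here
(struct toy_dev, write_x_survives(), and so on) are hypothetical.

```c
#include <stdbool.h>

#define MAX_WRITES 8

/* Toy device: completed writes sit in a volatile cache until a flush
 * that was submitted after their completion finishes. */
struct toy_dev {
	bool cached[MAX_WRITES];	/* write completed, in volatile cache */
	bool covered[MAX_WRITES];	/* snapshot taken at flush submission */
	bool persistent[MAX_WRITES];	/* would survive a crash */
};

static void submit_preflush(struct toy_dev *d)
{
	for (int i = 0; i < MAX_WRITES; i++)
		d->covered[i] = d->cached[i];
}

static void complete_write(struct toy_dev *d, int id)
{
	d->cached[id] = true;
}

static void complete_preflush(struct toy_dev *d)
{
	for (int i = 0; i < MAX_WRITES; i++)
		if (d->covered[i])
			d->persistent[i] = true;
}

/*
 * Replay the sequence from the table. Returns true if write X is on
 * persistent media by the time the fsync of the waiting task returns.
 */
static bool write_x_survives(bool waiter_sees_log_flushed)
{
	struct toy_dev d = {0};
	const int X = 0;

	submit_preflush(&d);	/* task A: xlog_sync() submits PREFLUSH */
	complete_write(&d, X);	/* task B: WRITE X submitted and completed */
	complete_preflush(&d);	/* PREFLUSH ends, task B wakes from xlog_wait() */

	if (!waiter_sees_log_flushed) {
		/* waiter was not told the log was flushed: explicit flush */
		submit_preflush(&d);
		complete_preflush(&d);
	}
	return d.persistent[X];
}
```

In this model write_x_survives(true) returns false, which is the data
loss described above, and write_x_survives(false) returns true because
the waiter issues its own flush after write X has completed.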

This bug was discovered and demonstrated using Josef Bacik's
dm-log-writes target, which can be used to record block io operations
and then replay a subset of these operations onto the target device.
The test goes something like this:
- Use fsx to execute ops on a file and record ops on the log device
- Every now and then fsync the file, store md5 of file and mark
  the location in the log
- Then replay log onto device for each mark, mount fs and compare
  md5 of file to stored value
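
The replay-and-compare idea can be modeled in a few lines. This is a
toy sketch, not dm-log-writes or the actual test: the log is an
in-memory array, the "device" is a small byte array, and an FNV-1a
hash stands in for md5. It also assumes, for illustration only, that
a write which was not yet durable when fsync returned shows up in the
log after the corresponding mark. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define DEV_SIZE 16

enum op_kind { OP_WRITE, OP_MARK };

struct log_op {
	enum op_kind kind;
	size_t off;	/* OP_WRITE: byte offset on the toy device */
	uint8_t val;	/* OP_WRITE: byte value written */
	uint32_t sum;	/* OP_MARK: checksum stored at fsync time */
};

/* FNV-1a over the whole toy device; stand-in for md5 of the file. */
static uint32_t checksum(const uint8_t *dev)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < DEV_SIZE; i++) {
		h ^= dev[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Replay ops[0..n) onto a zeroed device. At each mark, compare the
 * device checksum with the value stored when the mark was recorded.
 * Returns the number of marks whose checksum did not match.
 */
static int replay_and_check(const struct log_op *ops, size_t n)
{
	uint8_t dev[DEV_SIZE] = {0};
	int bad = 0;

	for (size_t i = 0; i < n; i++) {
		if (ops[i].kind == OP_WRITE)
			dev[ops[i].off] = ops[i].val;
		else if (checksum(dev) != ops[i].sum)
			bad++;
	}
	return bad;
}
```

A log in which the write precedes its mark replays cleanly, while a
log in which the write lands after the mark produces a mismatch at
that mark, which is how a lost write would surface in this model.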

Cc: Christoph Hellwig <hch@lst.de>
Cc: Josef Bacik <jbacik@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_log.c |    7 -------
 1 file changed, 7 deletions(-)

--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -3337,8 +3337,6 @@ maybe_sleep:
 		 */
 		if (iclog->ic_state & XLOG_STATE_IOERROR)
 			return -EIO;
-		if (log_flushed)
-			*log_flushed = 1;
 	} else {
 
 no_sleep:
@@ -3442,8 +3440,6 @@ try_again:
 
 				xlog_wait(&iclog->ic_prev->ic_write_wait,
 							&log->l_icloglock);
-				if (log_flushed)
-					*log_flushed = 1;
 				already_slept = 1;
 				goto try_again;
 			}
@@ -3477,9 +3473,6 @@ try_again:
 			 */
 			if (iclog->ic_state & XLOG_STATE_IOERROR)
 				return -EIO;
-
-			if (log_flushed)
-				*log_flushed = 1;
 		} else {		/* just return */
 			spin_unlock(&log->l_icloglock);
 		}
