All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-xfs@vger.kernel.org,
	"Darrick J . Wong" <darrick.wong@oracle.com>,
	Christoph Hellwig <hch@lst.de>, Brian Foster <bfoster@redhat.com>
Subject: [PATCH 4.13 49/52] xfs: open code end_buffer_async_write in xfs_finish_page_writeback
Date: Mon, 18 Sep 2017 11:10:17 +0200	[thread overview]
Message-ID: <20170918090911.500831056@linuxfoundation.org> (raw)
In-Reply-To: <20170918090904.072766209@linuxfoundation.org>

4.13-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Christoph Hellwig <hch@lst.de>

commit 8353a814f2518dcfa79a5bb77afd0e7dfa391bb1 upstream.

Our loop in xfs_finish_page_writeback, which iterates over all buffer
heads in a page and then calls end_buffer_async_write, which also
iterates over all buffers in the page to check if any I/O is in flight
is not only inefficient, but also potentially dangerous as
end_buffer_async_write can cause the page and all buffers to be freed.

Replace it with a single loop that does the work of end_buffer_async_write
on a per-page basis.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_aops.c |   71 +++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 47 insertions(+), 24 deletions(-)

--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -85,11 +85,11 @@ xfs_find_bdev_for_inode(
  * associated buffer_heads, paying attention to the start and end offsets that
  * we need to process on the page.
  *
- * Landmine Warning: bh->b_end_io() will call end_page_writeback() on the last
- * buffer in the IO. Once it does this, it is unsafe to access the bufferhead or
- * the page at all, as we may be racing with memory reclaim and it can free both
- * the bufferhead chain and the page as it will see the page as clean and
- * unused.
+ * Note that we open code the action in end_buffer_async_write here so that we
+ * only have to iterate over the buffers attached to the page once.  This is not
+ * only more efficient, but also ensures that we only calls end_page_writeback
+ * at the end of the iteration, and thus avoids the pitfall of having the page
+ * and buffers potentially freed after every call to end_buffer_async_write.
  */
 static void
 xfs_finish_page_writeback(
@@ -97,29 +97,44 @@ xfs_finish_page_writeback(
 	struct bio_vec		*bvec,
 	int			error)
 {
-	unsigned int		end = bvec->bv_offset + bvec->bv_len - 1;
-	struct buffer_head	*head, *bh, *next;
+	struct buffer_head	*head = page_buffers(bvec->bv_page), *bh = head;
+	bool			busy = false;
 	unsigned int		off = 0;
-	unsigned int		bsize;
+	unsigned long		flags;
 
 	ASSERT(bvec->bv_offset < PAGE_SIZE);
 	ASSERT((bvec->bv_offset & (i_blocksize(inode) - 1)) == 0);
-	ASSERT(end < PAGE_SIZE);
+	ASSERT(bvec->bv_offset + bvec->bv_len <= PAGE_SIZE);
 	ASSERT((bvec->bv_len & (i_blocksize(inode) - 1)) == 0);
 
-	bh = head = page_buffers(bvec->bv_page);
-
-	bsize = bh->b_size;
+	local_irq_save(flags);
+	bit_spin_lock(BH_Uptodate_Lock, &head->b_state);
 	do {
-		if (off > end)
-			break;
-		next = bh->b_this_page;
-		if (off < bvec->bv_offset)
-			goto next_bh;
-		bh->b_end_io(bh, !error);
-next_bh:
-		off += bsize;
-	} while ((bh = next) != head);
+		if (off >= bvec->bv_offset &&
+		    off < bvec->bv_offset + bvec->bv_len) {
+			ASSERT(buffer_async_write(bh));
+			ASSERT(bh->b_end_io == NULL);
+
+			if (error) {
+				mark_buffer_write_io_error(bh);
+				clear_buffer_uptodate(bh);
+				SetPageError(bvec->bv_page);
+			} else {
+				set_buffer_uptodate(bh);
+			}
+			clear_buffer_async_write(bh);
+			unlock_buffer(bh);
+		} else if (buffer_async_write(bh)) {
+			ASSERT(buffer_locked(bh));
+			busy = true;
+		}
+		off += bh->b_size;
+	} while ((bh = bh->b_this_page) != head);
+	bit_spin_unlock(BH_Uptodate_Lock, &head->b_state);
+	local_irq_restore(flags);
+
+	if (!busy)
+		end_page_writeback(bvec->bv_page);
 }
 
 /*
@@ -133,8 +148,10 @@ xfs_destroy_ioend(
 	int			error)
 {
 	struct inode		*inode = ioend->io_inode;
-	struct bio		*last = ioend->io_bio;
-	struct bio		*bio, *next;
+	struct bio		*bio = &ioend->io_inline_bio;
+	struct bio		*last = ioend->io_bio, *next;
+	u64			start = bio->bi_iter.bi_sector;
+	bool			quiet = bio_flagged(bio, BIO_QUIET);
 
 	for (bio = &ioend->io_inline_bio; bio; bio = next) {
 		struct bio_vec	*bvec;
@@ -155,6 +172,11 @@ xfs_destroy_ioend(
 
 		bio_put(bio);
 	}
+
+	if (unlikely(error && !quiet)) {
+		xfs_err_ratelimited(XFS_I(inode)->i_mount,
+			"writeback error on sector %llu", start);
+	}
 }
 
 /*
@@ -423,7 +445,8 @@ xfs_start_buffer_writeback(
 	ASSERT(!buffer_delay(bh));
 	ASSERT(!buffer_unwritten(bh));
 
-	mark_buffer_async_write(bh);
+	bh->b_end_io = NULL;
+	set_buffer_async_write(bh);
 	set_buffer_uptodate(bh);
 	clear_buffer_dirty(bh);
 }



  parent reply	other threads:[~2017-09-18  9:12 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-18  9:09 [PATCH 4.13 00/52] 4.13.3-stable review Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 01/52] Revert "net: use lib/percpu_counter API for fragmentation mem accounting" Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 02/52] Revert "net: fix percpu memory leaks" Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 03/52] gianfar: Fix Tx flow control deactivation Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 04/52] vhost_net: correctly check tx avail during rx busy polling Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 05/52] ip6_gre: update mtu properly in ip6gre_err Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 06/52] udp: drop head states only when all skb references are gone Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 07/52] ipv6: fix memory leak with multiple tables during netns destruction Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 08/52] ipv6: fix typo in fib6_net_exit() Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 09/52] sctp: fix missing wake ups in some situations Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 10/52] tcp: fix a request socket leak Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 11/52] ip_tunnel: fix setting ttl and tos value in collect_md mode Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 12/52] f2fs: let fill_super handle roll-forward errors Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 13/52] f2fs: check hot_data for roll-forward recovery Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 14/52] thunderbolt: Remove superfluous check Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 15/52] thunderbolt: Make key root-only accessible Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 16/52] thunderbolt: Allow clearing the key Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 17/52] x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 18/52] x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 19/52] x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 20/52] x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages Greg Kroah-Hartman
2017-09-18  9:09   ` Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 21/52] ovl: fix false positive ESTALE on lookup Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 22/52] fuse: allow server to run in different pid_ns Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 23/52] idr: remove WARN_ON_ONCE() when trying to replace negative ID Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 24/52] libnvdimm, btt: check memory allocation failure Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 25/52] libnvdimm: fix integer overflow static analysis warning Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 26/52] xfs: write unmount record for ro mounts Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 27/52] xfs: toggle readonly state around xfs_log_mount_finish Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 28/52] xfs: Add infrastructure needed for error propagation during buffer IO failure Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 29/52] xfs: Properly retry failed inode items in case of error during buffer writeback Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 30/52] xfs: fix recovery failure when log record header wraps log end Greg Kroah-Hartman
2017-09-18  9:09 ` [PATCH 4.13 31/52] xfs: always verify the log tail during recovery Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 32/52] xfs: fix log recovery corruption error due to tail overwrite Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 33/52] xfs: handle -EFSCORRUPTED during head/tail verification Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 34/52] xfs: stop searching for free slots in an inode chunk when there are none Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 35/52] xfs: evict all inodes involved with log redo item Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 36/52] xfs: check for race with xfs_reclaim_inode() in xfs_ifree_cluster() Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 37/52] xfs: open-code xfs_buf_item_dirty() Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 38/52] xfs: remove unnecessary dirty bli format check for ordered bufs Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 39/52] xfs: ordered buffer log items are never formatted Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 40/52] xfs: refactor buffer logging into buffer dirtying helper Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 41/52] xfs: dont log dirty ranges for ordered buffers Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 42/52] xfs: skip bmbt block ino validation during owner change Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 43/52] xfs: move bmbt owner change to last step of extent swap Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 44/52] xfs: disallow marking previously dirty buffers as ordered Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 45/52] xfs: relog dirty buffers during swapext bmbt owner change Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 46/52] xfs: disable per-inode DAX flag Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 47/52] xfs: fix incorrect log_flushed on fsync Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 48/52] xfs: dont set v3 xflags for v2 inodes Greg Kroah-Hartman
2017-09-18  9:10 ` Greg Kroah-Hartman [this message]
2017-09-18  9:10 ` [PATCH 4.13 50/52] xfs: use kmem_free to free return value of kmem_zalloc Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 51/52] md/raid1/10: reset bio allocated from mempool Greg Kroah-Hartman
2017-09-18  9:10 ` [PATCH 4.13 52/52] md/raid5: release/flush io in raid5_do_work() Greg Kroah-Hartman
2017-09-18 19:29 ` [PATCH 4.13 00/52] 4.13.3-stable review Guenter Roeck
2017-09-18 20:17 ` Shuah Khan
2017-09-19  6:33   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170918090911.500831056@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=hch@lst.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.