From: Joanne Koong <joannelkoong@gmail.com>
To: brauner@kernel.org
Cc: bfoster@redhat.com, hch@infradead.org, djwong@kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: [PATCH v3 2/2] iomap: fix race when reading in all bytes of a folio
Date: Tue, 28 Oct 2025 11:11:33 -0700
Message-ID: <20251028181133.1285219-3-joannelkoong@gmail.com>
In-Reply-To: <20251028181133.1285219-1-joannelkoong@gmail.com>

There is a race when all bytes in a folio need to be read in. If the
filesystem finishes reading every byte before iomap_read_end() is
called, the IO helper has already dropped ifs->read_bytes_pending to 0
and ended the read. bytes_not_submitted in iomap_read_end() is then 0,
so ifs->read_bytes_pending is still 0 after
"ifs->read_bytes_pending -= bytes_not_submitted", which triggers an
extra folio_end_read() call. This extra folio_end_read() unlocks the
folio a second time, which sets the lock bit on the folio again and
results in a permanent lockup.

Fix this by adding a +1 bias when doing the initialization for
ifs->read_bytes_pending. The bias will be subtracted in
iomap_read_end(). This ensures the folio is locked when
iomap_read_end() is called, even if the IO helper has finished reading
in the entire folio.

Additionally, add some comments to clarify how this logic works.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Fixes: 51311f045375 ("iomap: track pending read bytes more optimally")
Reported-by: Brian Foster <bfoster@redhat.com>
---
 fs/iomap/buffered-io.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 4c0d66612a67..5d6c578338dd 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -358,12 +358,42 @@ static void iomap_read_init(struct folio *folio)
 	if (ifs) {
 		size_t len = folio_size(folio);
 
+		/*
+		 * ifs->read_bytes_pending is used to track how many bytes are
+		 * read in asynchronously by the IO helper. We need to track
+		 * this so that we can know when the IO helper has finished
+		 * reading in all the necessary ranges of the folio and can end
+		 * the read.
+		 *
+		 * Increase ->read_bytes_pending by the folio size to start, and
+		 * add a +1 bias. We'll subtract the bias and any uptodate/zeroed
+		 * ranges that did not require IO in iomap_read_end() after we're
+		 * done processing the folio.
+		 *
+		 * We do this because otherwise, we would have to increment
+		 * ifs->read_bytes_pending every time a range in the folio needs
+		 * to be read in, which can get expensive since the spinlock
+		 * needs to be held whenever modifying ifs->read_bytes_pending.
+		 *
+		 * We add the bias to ensure the folio is still locked when
+		 * iomap_read_end() is called, even if the IO helper has already
+		 * finished reading in the entire folio.
+		 */
 		spin_lock_irq(&ifs->state_lock);
-		ifs->read_bytes_pending += len;
+		ifs->read_bytes_pending += len + 1;
 		spin_unlock_irq(&ifs->state_lock);
 	}
 }
 
+/*
+ * This ends IO if no bytes were submitted to an IO helper.
+ *
+ * Otherwise, this calibrates ifs->read_bytes_pending to represent only the
+ * submitted bytes (see comment in iomap_read_init()). If all bytes submitted
+ * have already been completed by the IO helper, then this will end the read.
+ * Else the IO helper will end the read after all submitted ranges have been
+ * read.
+ */
 static void iomap_read_end(struct folio *folio, size_t bytes_submitted)
 {
 	struct iomap_folio_state *ifs;
@@ -381,7 +411,12 @@ static void iomap_read_end(struct folio *folio, size_t bytes_submitted)
 	ifs = folio->private;
 	if (ifs) {
 		bool end_read, uptodate;
-		size_t bytes_not_submitted = folio_size(folio) -
+		/*
+		 * Subtract any bytes that were initially accounted to
+		 * read_bytes_pending but skipped for IO.
+		 * The +1 accounts for the bias we added in iomap_read_init().
+		 */
+		size_t bytes_not_submitted = folio_size(folio) + 1 -
 				bytes_submitted;
 
 		spin_lock_irq(&ifs->state_lock);
-- 
2.47.3

