public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	dm-devel@lists.linux.dev, Mike Snitzer <snitzer@redhat.com>
Subject: [PATCH v3 04/14] block: Fix reference counting for zone write plugs in error state
Date: Wed,  1 May 2024 20:08:57 +0900	[thread overview]
Message-ID: <20240501110907.96950-5-dlemoal@kernel.org> (raw)
In-Reply-To: <20240501110907.96950-1-dlemoal@kernel.org>

When zone is reset or finished, disk_zone_wplug_set_wp_offset() is
called to update the zone write plug write pointer offset and to clear
the zone error state (BLK_ZONE_WPLUG_ERROR flag) if it is set.
However, this processing is missing dropping the reference to the zone
write plug that was taken in disk_zone_wplug_set_error() when the error
flag was first set. Furthermore, the error state handling must release
the zone write plug lock to first execute a report zones command. When
the report zone races with a reset or finish operation that clears the
error, we can end up decrementing the zone write plug reference count
twice: once in disk_zone_wplug_set_wp_offset() for the reset/finish
operation and one more time in disk_zone_wplugs_work() once
disk_zone_wplug_handle_error() completes.

Fix this by introducing disk_zone_wplug_clear_error() as the symmetric
function of disk_zone_wplug_set_error(). disk_zone_wplug_clear_error()
decrements the zone write plug reference count obtained in
disk_zone_wplug_set_error() only if the error handling has not started
yet, that is, only if disk_zone_wplugs_work() has not yet taken the zone
write plug off the error list. This ensure that either
disk_zone_wplug_clear_error() or disk_zone_wplugs_work() drop the zone
write plug reference count.

Fixes: dd291d77cc90 ("block: Introduce zone write plugging")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 block/blk-zoned.c | 75 +++++++++++++++++++++++++++++++----------------
 1 file changed, 49 insertions(+), 26 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 7824bd52c82c..23ad1de0da62 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -658,6 +658,54 @@ static void disk_zone_wplug_abort_unaligned(struct gendisk *disk,
 	bio_list_merge(&zwplug->bio_list, &bl);
 }
 
+static inline void disk_zone_wplug_set_error(struct gendisk *disk,
+					     struct blk_zone_wplug *zwplug)
+{
+	unsigned long flags;
+
+	if (zwplug->flags & BLK_ZONE_WPLUG_ERROR)
+		return;
+
+	/*
+	 * At this point, we already have a reference on the zone write plug.
+	 * However, since we are going to add the plug to the disk zone write
+	 * plugs work list, increase its reference count. This reference will
+	 * be dropped in disk_zone_wplugs_work() once the error state is
+	 * handled, or in disk_zone_wplug_clear_error() if the zone is reset or
+	 * finished.
+	 */
+	zwplug->flags |= BLK_ZONE_WPLUG_ERROR;
+	atomic_inc(&zwplug->ref);
+
+	spin_lock_irqsave(&disk->zone_wplugs_lock, flags);
+	list_add_tail(&zwplug->link, &disk->zone_wplugs_err_list);
+	spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
+}
+
+static inline void disk_zone_wplug_clear_error(struct gendisk *disk,
+					       struct blk_zone_wplug *zwplug)
+{
+	unsigned long flags;
+
+	if (!(zwplug->flags & BLK_ZONE_WPLUG_ERROR))
+		return;
+
+	/*
+	 * We are racing with the error handling work which drops the reference
+	 * on the zone write plug after handling the error state. So remove the
+	 * plug from the error list and drop its reference count only if the
+	 * error handling has not yet started, that is, if the zone write plug
+	 * is still listed.
+	 */
+	spin_lock_irqsave(&disk->zone_wplugs_lock, flags);
+	if (!list_empty(&zwplug->link)) {
+		list_del_init(&zwplug->link);
+		zwplug->flags &= ~BLK_ZONE_WPLUG_ERROR;
+		disk_put_zone_wplug(zwplug);
+	}
+	spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
+}
+
 /*
  * Set a zone write plug write pointer offset to either 0 (zone reset case)
  * or to the zone size (zone finish case). This aborts all plugged BIOs, which
@@ -691,12 +739,7 @@ static void disk_zone_wplug_set_wp_offset(struct gendisk *disk,
 	 * in a good state. So clear the error flag and decrement the
 	 * error count if we were in error state.
 	 */
-	if (zwplug->flags & BLK_ZONE_WPLUG_ERROR) {
-		zwplug->flags &= ~BLK_ZONE_WPLUG_ERROR;
-		spin_lock(&disk->zone_wplugs_lock);
-		list_del_init(&zwplug->link);
-		spin_unlock(&disk->zone_wplugs_lock);
-	}
+	disk_zone_wplug_clear_error(disk, zwplug);
 
 	/*
 	 * The zone write plug now has no BIO plugged: remove it from the
@@ -885,26 +928,6 @@ void blk_zone_write_plug_attempt_merge(struct request *req)
 	spin_unlock_irqrestore(&zwplug->lock, flags);
 }
 
-static inline void disk_zone_wplug_set_error(struct gendisk *disk,
-					     struct blk_zone_wplug *zwplug)
-{
-	if (!(zwplug->flags & BLK_ZONE_WPLUG_ERROR)) {
-		unsigned long flags;
-
-		/*
-		 * Increase the plug reference count. The reference will be
-		 * dropped in disk_zone_wplugs_work() once the error state
-		 * is handled.
-		 */
-		zwplug->flags |= BLK_ZONE_WPLUG_ERROR;
-		atomic_inc(&zwplug->ref);
-
-		spin_lock_irqsave(&disk->zone_wplugs_lock, flags);
-		list_add_tail(&zwplug->link, &disk->zone_wplugs_err_list);
-		spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
-	}
-}
-
 /*
  * Check and prepare a BIO for submission by incrementing the write pointer
  * offset of its zone write plug and changing zone append operations into
-- 
2.44.0


  parent reply	other threads:[~2024-05-01 11:09 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-01 11:08 [PATCH v3 00/14] Zone write plugging fixes and cleanup Damien Le Moal
2024-05-01 11:08 ` [PATCH v3 01/14] dm: Check that a zoned table leads to a valid mapped device Damien Le Moal
2024-05-01 11:08 ` [PATCH v3 02/14] block: Exclude conventional zones when faking max open limit Damien Le Moal
2024-05-02  5:52   ` Hannes Reinecke
2024-05-01 11:08 ` [PATCH v3 03/14] block: Fix zone write plug initialization from blk_revalidate_zone_cb() Damien Le Moal
2024-05-02  5:53   ` Hannes Reinecke
2024-05-01 11:08 ` Damien Le Moal [this message]
2024-05-02  6:01   ` [PATCH v3 04/14] block: Fix reference counting for zone write plugs in error state Hannes Reinecke
2024-05-02  9:37     ` Damien Le Moal
2024-05-01 11:08 ` [PATCH v3 05/14] block: Hold a reference on zone write plugs to schedule submission Damien Le Moal
2024-05-02  6:04   ` Hannes Reinecke
2024-05-01 11:08 ` [PATCH v3 06/14] block: Unhash a zone write plug only if needed Damien Le Moal
2024-05-02  6:05   ` Hannes Reinecke
2024-05-01 11:09 ` [PATCH v3 07/14] block: Do not remove zone write plugs still in use Damien Le Moal
2024-05-02  5:38   ` Christoph Hellwig
2024-05-02  6:09   ` Hannes Reinecke
2024-05-01 11:09 ` [PATCH v3 08/14] block: Fix flush request sector restore Damien Le Moal
2024-05-02  6:10   ` Hannes Reinecke
2024-05-01 11:09 ` [PATCH v3 09/14] block: Fix handling of non-empty flush write requests to zones Damien Le Moal
2024-05-02  6:11   ` Hannes Reinecke
2024-05-01 11:09 ` [PATCH v3 10/14] block: Improve blk_zone_write_plug_bio_merged() Damien Le Moal
2024-05-02  6:13   ` Hannes Reinecke
2024-05-01 11:09 ` [PATCH v3 11/14] block: Improve zone write request completion handling Damien Le Moal
2024-05-02  6:14   ` Hannes Reinecke
2024-05-01 11:09 ` [PATCH v3 12/14] block: Simplify blk_zone_write_plug_bio_endio() Damien Le Moal
2024-05-02  6:15   ` Hannes Reinecke
2024-05-01 11:09 ` [PATCH v3 13/14] block: Simplify zone write plug BIO abort Damien Le Moal
2024-05-02  6:15   ` Hannes Reinecke
2024-05-01 11:09 ` [PATCH v3 14/14] block: Cleanup blk_revalidate_zone_cb() Damien Le Moal
2024-05-02  6:17   ` Hannes Reinecke
2024-05-01 14:08 ` [PATCH v3 00/14] Zone write plugging fixes and cleanup Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240501110907.96950-5-dlemoal@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@lists.linux.dev \
    --cc=linux-block@vger.kernel.org \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox