From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC8E2466B4E; Tue, 16 Jun 2026 16:39:36 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781627978; cv=none; b=Yq2aEH+2L17saF+Yc7Q8o05zpOGLOdBLSVgo9eODOyxQ+eFnmMX+35Vtv9kqA7a+FunszPuo5r8ExTpd5LwzUCppuObLwnhbpcFw2RvjaLz8FbHBxUryEB0rvvojzjQxKWi7fp6+3E4L+6XTKL2LJwLvSYOcbpunXFFZGoAu1DY=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781627978; c=relaxed/simple; bh=I+NqjtHfwrRyXV6GfuAZBW3wAHKavblB+g5S2vr/SSo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=dYSrqDMdS8Kegw8XC6tN7L7jrHHIMosxMb7rfNll8Jp83GS2XboXs14DvAhXW5/zGopr5p93y8AFr5X2tEugMU9jRGteq+TocPXy2zpm+8LBsaI5ANoGuSsGWV+0iNsZ7WLgXnPUvCwaj1bJRYKndxLP+1FyVUgdaxfFmFj48eo=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=lDludqoH; arc=none smtp.client-ip=100.103.45.18
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="lDludqoH"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8E5841F000E9; Tue, 16 Jun 2026 16:39:35 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxfoundation.org; s=korg; t=1781627976; bh=kj0N2ZxOgWo/1MfOpSuoMdzltG2Ey+1z/+/2ZfwVub8=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=lDludqoHvKbPS2fV6JTUY3z/Jfq2aCwmV81PuKMN21xRs4XkjX5Sh5/yPT2DH9IVh ZgrKk0DWw/bt/jNh5DvwWgpt830ljw4sVYofQ5CrWQ+M8u9cpft/JYvcLH+bi+XjT0 KXA7aQxtOtxZnHDBqyWS0egHt6a5vag/CUDUsorg=
From: Greg Kroah-Hartman
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman, patches@lists.linux.dev, Shinichiro Kawasaki, Damien Le Moal, Jens Axboe, Gyokhan Kochmarla
Subject: [PATCH 6.12 250/261] block: fix handling of dead zone write plugs
Date: Tue, 16 Jun 2026 20:31:28 +0530
Message-ID: <20260616145056.663049551@linuxfoundation.org>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260616145044.869532709@linuxfoundation.org>
References: <20260616145044.869532709@linuxfoundation.org>
User-Agent: quilt/0.69
X-stable: review
X-Patchwork-Hint: ignore
Precedence: bulk
X-Mailing-List: patches@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

6.12-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Damien Le Moal

commit 836efd35c472d89c838d7b17ef339ddb3286ffc5 upstream.

Shin'ichiro reported hard-to-reproduce unaligned write errors with zoned
block devices. Under normal operating conditions (e.g. running XFS on an
SMR disk), these errors are nearly impossible to trigger. But with a
"slow" kernel that has many debug options enabled and some specific use
cases (e.g. fio zbd test case 46), the errors can be reproduced fairly
easily.

The unaligned write errors come from mishandling a valid reference
counting pattern of zone write plugs. Such a pattern triggers, for
instance, if a process A writes a zone (not necessarily to the full
state), another process B immediately resets the zone and, immediately
following the completion of the zone reset, starts issuing writes to the
zone. With such a pattern, in some cases, the zone write plugs worker
thread of the device may still be holding a reference to the zone write
plug of the zone, taken when process A was writing to the zone. The
following zone reset from process B marks the zone as dead but does not
remove the zone write plug from the device hash table, as a reference to
the plug still exists.
Once process B starts issuing new writes, the zone write plug is seen as
dead and the writes from process B are immediately failed, despite this
write pattern being perfectly legal.

Fix this by allowing a dead zone write plug to be restored to a live
state if a write is issued to the zone when the zone write plug is
marked as dead, the zone is empty, and the write sector corresponds to
the first sector of the zone (that is, the write is aligned to the zone
write pointer). This is done with the new helper function
disk_check_zone_wplug_dead(), which restores a dead zone write plug to a
live state by clearing the BLK_ZONE_WPLUG_DEAD flag and restoring the
initial reference to the zone write plug taken when the plug was added
to the device hash table.

Reported-by: Shin'ichiro Kawasaki
Fixes: b7d4ffb51037 ("block: fix zone write plug removal")
Signed-off-by: Damien Le Moal
Tested-by: Shin'ichiro Kawasaki
Link: https://patch.msgid.link/20260513111129.108809-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe
[ context conflict due to different line offsets in blk-zoned.c ]
Signed-off-by: Gyokhan Kochmarla
Signed-off-by: Greg Kroah-Hartman
---
 block/blk-zoned.c |   32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -517,6 +517,28 @@ static void disk_mark_zone_wplug_dead(st
 	}
 }
 
+static inline bool disk_check_zone_wplug_dead(struct blk_zone_wplug *zwplug)
+{
+	if (!(zwplug->flags & BLK_ZONE_WPLUG_DEAD))
+		return false;
+
+	/*
+	 * If a new write is received right after a zone reset completes and
+	 * while the disk_zone_wplugs_worker() thread has not yet released the
+	 * reference on the zone write plug after processing the last write to
+	 * the zone, then the new write BIO will see the zone write plug marked
+	 * as dead. This case is however a false positive and a perfectly valid
+	 * pattern. In such case, restore the zone write plug to a live one.
+	 */
+	if (!zwplug->wp_offset && bio_list_empty(&zwplug->bio_list)) {
+		zwplug->flags &= ~BLK_ZONE_WPLUG_DEAD;
+		refcount_inc(&zwplug->ref);
+		return false;
+	}
+
+	return true;
+}
+
 static void blk_zone_wplug_bio_work(struct work_struct *work);
 
 /*
@@ -1037,12 +1059,12 @@ static bool blk_zone_wplug_handle_write(
 	}
 
 	/*
-	 * If we got a zone write plug marked as dead, then the user is issuing
-	 * writes to a full zone, or without synchronizing with zone reset or
-	 * zone finish operations. In such case, fail the BIO to signal this
-	 * invalid usage.
+	 * Check if we got a zone write plug marked as dead. If yes, then the
+	 * user is likely issuing writes to a full zone, or without
+	 * synchronizing with zone reset or zone finish operations. In such
+	 * case, fail the BIO to signal this invalid usage.
 	 */
-	if (zwplug->flags & BLK_ZONE_WPLUG_DEAD) {
+	if (disk_check_zone_wplug_dead(zwplug)) {
 		spin_unlock_irqrestore(&zwplug->lock, flags);
 		disk_put_zone_wplug(zwplug);
 		bio_io_error(bio);
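
Illustration only, not part of the patch: the reference counting pattern
described in the commit message can be modeled in plain userspace C. This
is a minimal sketch under stated assumptions. struct model_wplug,
MODEL_WPLUG_DEAD, model_mark_wplug_dead() and model_check_wplug_dead() are
hypothetical names invented for this note, a plain counter stands in for
refcount_t, a counter stands in for the plug BIO list, and the "mark dead"
step is assumed to drop the initial hash table reference exactly once.
Locking is left out entirely.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
#define MODEL_WPLUG_DEAD (1u << 0)

struct model_wplug {
	unsigned int flags;
	unsigned int wp_offset;  /* write pointer offset; 0 means an empty zone */
	unsigned int nr_plugged; /* stands in for the length of zwplug->bio_list */
	unsigned int ref;        /* stands in for refcount_t zwplug->ref */
};

/*
 * Assumption: marking a plug dead sets the flag and drops the initial
 * (hash table) reference, and does so only once.
 */
static void model_mark_wplug_dead(struct model_wplug *zwplug)
{
	if (!(zwplug->flags & MODEL_WPLUG_DEAD)) {
		zwplug->flags |= MODEL_WPLUG_DEAD;
		zwplug->ref--;
	}
}

/*
 * Same decision structure as disk_check_zone_wplug_dead() in the patch:
 * a dead plug for an empty zone with no plugged BIOs is revived by
 * clearing the flag and taking the initial reference again. Returns true
 * only if the plug must still be treated as dead.
 */
static bool model_check_wplug_dead(struct model_wplug *zwplug)
{
	if (!(zwplug->flags & MODEL_WPLUG_DEAD))
		return false;

	if (!zwplug->wp_offset && !zwplug->nr_plugged) {
		zwplug->flags &= ~MODEL_WPLUG_DEAD;
		zwplug->ref++;
		return false;
	}

	return true;
}
```

In this model, a plug holding two references (the initial one plus the one
still held by the worker) that is marked dead by a reset keeps one
reference and stays in place. A following write to the empty zone then
revives it and restores the count to two, while a dead plug with a
non-zero write pointer offset is still reported as dead.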