From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <99c22bd8-2898-4b72-91bb-e80847cda065@kernel.org>
Date: Tue, 24 Feb 2026 10:07:38 +0900
Subject: Re: [PATCH 0/8] Improve zoned (SMR) HDD write throughput
To: Bart Van Assche, Jens Axboe, linux-block@vger.kernel.org
References: <20260221004411.548482-1-dlemoal@kernel.org>
 <3ea1b7da-0639-4cf7-a8b4-132b26eedba8@acm.org>
From: Damien Le Moal
Organization: Western Digital Research
In-Reply-To: <3ea1b7da-0639-4cf7-a8b4-132b26eedba8@acm.org>

On 2/24/26 2:03 AM, Bart Van Assche wrote:
> On 2/20/26 4:44 PM, Damien Le Moal wrote:
>> This patch series cleans up the zone write plugging code and introduces
>> the ability to issue all write BIOs from a single context (a kthread)
>> instead of allowing multiple zones to be written at the same time using
>> a per-zone work. As shown in patch 6, raw block device tests and XFS
>> tests with an SMR HDD show that this can significantly increase write
>> throughput (up to 40% over the current zone write plugging).
>
> Hi Damien,
>
> Is a new kthread necessary? Has the following approach been considered?
> * Make the dm drivers that support rotational zoned storage devices
>   request-based instead of bio-based.

What you are suggesting would be an enormous amount of work (dm-linear,
dm-flakey, dm-error, dm-crypt) to change generic code into DM targets that
would be very specialized for just SMR HDDs. I do not understand why you
think that would be a good idea.

> * Modify blk_mq_get_tag() such that only one tag can be allocated at a
>   time for zoned write requests.
Sure, doing that would limit the number of write requests to zones to at
most 1 at any time, but it would also result in a total loss of control
over which zone write BIO work gets that single tag, meaning that the
writes would in the end be mostly random again, like they are now. So with
this solution, I can say goodbye to the +40% write throughput increase
that I am seeing with the kthread.

Also note that this idea of limiting write tags, combined with your idea
of using request-based DM targets, would likely negatively impact dm-crypt
performance, as we would lose the ability to encrypt multiple writes in
parallel on different CPUs.

> I think that would be sufficient to serialize zoned writes.
> Additionally, this approach doesn't increase request processing latency
> by forcing a context switch to a kthread.

This point is in my opinion moot because we currently use work items to
issue the write commands. We are already scheduling the zone write plug
BIO works in the submission and completion paths, so the context switch
overhead is already there. And I would argue that using work items
potentially has even more overhead than a fixed kthread, since the work
items need to be assigned to CPUs and worker threads.

Granted, your point is valid for a QD=1 workload. In that case, I am
indeed introducing a context switch where there is none now. But that is
not really the use case we are looking at here: file system writeback does
not happen at QD=1 per zone. This added overhead also does not really
matter much for HDDs, and if it ever is an issue, the user can enable the
legacy zone write plugging behavior with
"echo 0 > /sys/block/sdX/queue/zoned_qd1_writes".

-- 
Damien Le Moal
Western Digital Research
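[For readers following the thread: the single-issuer idea under discussion
can be illustrated with a small userspace model. This is an assumption-laden
sketch, not the kernel implementation: all names (submit_bio,
zone_write_issuer, plug_queues) are hypothetical stand-ins. It shows the key
property Damien relies on: many submitters plug writes on their zone, but one
dedicated thread drains a zone's queue atomically, so each zone's writes
always reach the "device" back to back and in submission order, regardless of
which context submitted them.]

```python
# Illustrative userspace model of issuing zoned writes from one context.
# Not kernel code; names are invented for the example.
import threading
import queue
from collections import defaultdict

issued = []                        # order in which writes reach the "device"
plug_queues = defaultdict(list)    # per-zone plugged writes (zone write plug)
ready = queue.Queue()              # zones that have plugged work, FIFO
lock = threading.Lock()

def submit_bio(zone, sector):
    """Submitter side: plug the write on its zone; wake the issuer only
    when the zone goes from empty to non-empty."""
    with lock:
        was_empty = not plug_queues[zone]
        plug_queues[zone].append(sector)
    if was_empty:
        ready.put(zone)

def zone_write_issuer():
    """Single issuing context: drains one zone at a time, so a zone's
    writes are issued back to back and in plug order."""
    while True:
        zone = ready.get()
        if zone is None:           # shutdown sentinel
            return
        with lock:
            bios, plug_queues[zone] = plug_queues[zone], []
        for sector in bios:
            issued.append((zone, sector))

issuer = threading.Thread(target=zone_write_issuer)
issuer.start()
for s in range(3):
    submit_bio("zone0", s)
for s in range(3):
    submit_bio("zone1", s)
ready.put(None)
issuer.join()

# Per-zone issue order is sequential, which is what a zoned device requires.
assert [s for z, s in issued if z == "zone0"] == [0, 1, 2]
assert [s for z, s in issued if z == "zone1"] == [0, 1, 2]
```

Note the contrast with the tag-limiting proposal: a single shared tag only
caps concurrency at one in-flight write, while a single issuing context also
decides *which* zone's write goes next, which is where the sequentiality
(and the quoted throughput gain) comes from.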