linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>, Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
	Christoph Hellwig <hch@lst.de>, Jaegeuk Kim <jaegeuk@kernel.org>
Subject: Re: [PATCH v16 05/26] blk-zoned: Fix a deadlock triggered by unaligned writes
Date: Mon, 25 Nov 2024 13:00:53 +0900	[thread overview]
Message-ID: <2e55151f-2001-46b6-8f82-cf961bbdf7cf@kernel.org> (raw)
In-Reply-To: <03c6dc60-fc51-4906-b433-aeac6be87083@acm.org>

On 11/22/24 2:51 AM, Bart Van Assche wrote:
> 
> On 11/20/24 7:32 PM, Damien Le Moal wrote:
>> On 11/20/24 06:04, Bart Van Assche wrote:
>>> On 11/18/24 6:57 PM, Damien Le Moal wrote:
>>>> On 11/19/24 9:27 AM, Bart Van Assche wrote:
>>>>> If the queue is filled with unaligned writes then the following
>>>>> deadlock occurs:
>>>>>
>>>>> Call Trace:
>>>>>    <TASK>
>>>>>    __schedule+0x8cc/0x2190
>>>>>    schedule+0xdd/0x2b0
>>>>>    blk_queue_enter+0x2ce/0x4f0
>>>>>    blk_mq_alloc_request+0x303/0x810
>>>>>    scsi_execute_cmd+0x3f4/0x7b0
>>>>>    sd_zbc_do_report_zones+0x19e/0x4c0
>>>>>    sd_zbc_report_zones+0x304/0x920
>>>>>    disk_zone_wplug_handle_error+0x237/0x920
>>>>>    disk_zone_wplugs_work+0x17e/0x430
>>>>>    process_one_work+0xdd0/0x1490
>>>>>    worker_thread+0x5eb/0x1010
>>>>>    kthread+0x2e5/0x3b0
>>>>>    ret_from_fork+0x3a/0x80
>>>>>    </TASK>
>>>>>
>>>>> Fix this deadlock by removing the disk->fops->report_zones() call and by
>>>>> deriving the write pointer information from successfully completed zoned
>>>>> writes.
>>>>>
>>>>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>>>>
>>>> Doesn't this need a Fixes tag and CC stable, and come earlier in the series
>>>> (or
>>>> sent separately) ?
>>>
>>> I will add Fixes: and Cc: stable tags.
>>>
>>> I'm not sure how to move this patch earlier since it depends on the
>>> previous patch in this series ("blk-zoned: Only handle errors after
>>> pending zoned writes have completed"). Without that patch, it is not
>>> safe to use zwplug->wp_offset_compl in the error handler.
>>>
>>>> Overall, this patch seems wrong anyway as zone reset and zone finish may be
>>>> done between 2 writes, failing the next one but the recovery done here will
>>>> use
>>>> the previous succeful write end position as the wp, which is NOT correct as
>>>> reset or finish changed that...
>>>
>>> I will add support for the zone reset and zone finish commands in this
>>> patch.
>>>
>>>> And we also have the possibility of torn writes
>>>> (partial writes) with SAS SMR drives. So I really think that you cannot avoid
>>>> doing a report zone to recover errors.
>>>
>>> Thanks for having brought this up. This is something I was not aware of.
>>>
>>> disk_zone_wplug_handle_error() submits a new request to retrieve zone
>>> information while handling an error triggered by other requests. This
>>> can easily lead to a deadlock as the above call trace shows. How about
>>> introducing a queue flag for the "report zones" approach in
>>> disk_zone_wplug_handle_error() such that the "report zones" approach is
>>> only used for SAS SMR drives?
>>
>> Sure, but how would that solve the potential deadlock problem ? ALso, I am not
>> entirely clear on how the deadlock can happen given that zone write plugs are
>> queueing/blocking BIOs, not requests. So even assuming you have a large number
>> of BIOs plugged in a zone write plug, the error handler work should still be
>> able to issue a request to do a report zones, no ? On which resource can the
>> deadlock happen ? Plugged BIOs do not yet use a tag, right ?
>>
>> What am I missing here ? Or is it maybe something that can happen with your
>> modifications because you changed the zone write plug behavior to allow for more
>> than one BIO at a time being unplugged and issued to the device ?
>>
>> Note that if you do have a test case for this triggering the deadlock, we
>> definitely need to solve this and ideally have a blktest case checking it.
> 
> Hi Damien,
> 
> The call trace mentioned above comes from the kernel log and was
> encountered while I was testing this patch series. A reproducer has
> already been shared - see also
> https://lore.kernel.org/linux-block/e840b66a-79f0-4169-9ab1-
> c475d9608e4d@acm.org/. The lockup happened after the queue was filled up
> with requests and hence sd_zbc_report_zones() failed to allocate an
> additional request for the zone report.
> 
> I'm wondering whether this lockup can also happen with the current
> upstream kernel by submitting multiple unaligned writes simultaneously
> where each write affects another zone and where the number of writes
> matches the queue depth.

Let me check. This indeed may be a possibility.

> 
> Thanks,
> 
> Bart.
> 
> 


-- 
Damien Le Moal
Western Digital Research

  reply	other threads:[~2024-11-25  4:01 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-19  0:27 [PATCH v16 00/26] Improve write performance for zoned UFS devices Bart Van Assche
2024-11-19  0:27 ` [PATCH v16 01/26] blk-zoned: Fix a reference count leak Bart Van Assche
2024-11-19  2:23   ` Damien Le Moal
2024-11-19 20:21     ` Bart Van Assche
2024-11-19  0:27 ` [PATCH v16 02/26] blk-zoned: Split disk_zone_wplugs_work() Bart Van Assche
2024-11-19  0:27 ` [PATCH v16 03/26] blk-zoned: Split queue_zone_wplugs_show() Bart Van Assche
2024-11-19  2:25   ` Damien Le Moal
2024-11-19  0:27 ` [PATCH v16 04/26] blk-zoned: Only handle errors after pending zoned writes have completed Bart Van Assche
2024-11-19  2:50   ` Damien Le Moal
2024-11-19 20:51     ` Bart Van Assche
2024-11-21  3:23       ` Damien Le Moal
2024-11-21 17:43         ` Bart Van Assche
2024-11-19  0:27 ` [PATCH v16 05/26] blk-zoned: Fix a deadlock triggered by unaligned writes Bart Van Assche
2024-11-19  2:57   ` Damien Le Moal
2024-11-19 21:04     ` Bart Van Assche
2024-11-21  3:32       ` Damien Le Moal
2024-11-21 17:51         ` Bart Van Assche
2024-11-25  4:00           ` Damien Le Moal [this message]
2024-11-25  4:19             ` Damien Le Moal
2025-01-09 19:11     ` Bart Van Assche
2025-01-10  5:07       ` Damien Le Moal
2025-01-10 18:17         ` Bart Van Assche
2024-11-19  0:27 ` [PATCH v16 06/26] blk-zoned: Fix requeuing of zoned writes Bart Van Assche
2024-11-19  3:00   ` Damien Le Moal
2024-11-19 21:06     ` Bart Van Assche
2024-11-19  0:27 ` [PATCH v16 07/26] block: Support block drivers that preserve the order of write requests Bart Van Assche
2024-11-19  7:37   ` Damien Le Moal
2024-11-19 21:08     ` Bart Van Assche
2024-11-19  0:27 ` [PATCH v16 08/26] dm-linear: Report to the block layer that the write order is preserved Bart Van Assche
2024-11-19  0:27 ` [PATCH v16 09/26] mq-deadline: Remove a local variable Bart Van Assche
2024-11-19  7:38   ` Damien Le Moal
2024-11-19 21:11     ` Bart Van Assche
2024-11-19  0:27 ` [PATCH v16 10/26] blk-mq: Clean up blk_mq_requeue_work() Bart Van Assche
2024-11-19  7:39   ` Damien Le Moal
2024-11-19  0:28 ` [PATCH v16 11/26] block: Optimize blk_mq_submit_bio() for the cache hit scenario Bart Van Assche
2024-11-19  7:40   ` Damien Le Moal
2024-11-19  0:28 ` [PATCH v16 12/26] block: Rework request allocation in blk_mq_submit_bio() Bart Van Assche
2024-11-19  7:44   ` Damien Le Moal
2024-11-19  0:28 ` [PATCH v16 13/26] block: Support allocating from a specific software queue Bart Van Assche
2024-11-19  0:28 ` [PATCH v16 14/26] blk-mq: Restore the zoned write order when requeuing Bart Van Assche
2024-11-19  7:52   ` Damien Le Moal
2024-11-19 21:16     ` Bart Van Assche
2024-11-19  0:28 ` [PATCH v16 15/26] blk-zoned: Document the locking order Bart Van Assche
2024-11-19  7:52   ` Damien Le Moal
2024-11-19  0:28 ` [PATCH v16 16/26] blk-zoned: Document locking assumptions Bart Van Assche
2024-11-19  7:53   ` Damien Le Moal
2024-11-19 21:18     ` Bart Van Assche
2024-11-21  3:34       ` Damien Le Moal
2024-11-19  0:28 ` [PATCH v16 17/26] blk-zoned: Uninline functions that are not in the hot path Bart Van Assche
2024-11-19  7:55   ` Damien Le Moal
2024-11-19 21:20     ` Bart Van Assche
2024-11-21  3:36       ` Damien Le Moal
2024-11-19  0:28 ` [PATCH v16 18/26] blk-zoned: Make disk_should_remove_zone_wplug() more robust Bart Van Assche
2024-11-19  7:58   ` Damien Le Moal
2024-11-19  0:28 ` [PATCH v16 19/26] blk-zoned: Add an argument to blk_zone_plug_bio() Bart Van Assche
2024-11-19  0:28 ` [PATCH v16 20/26] blk-zoned: Support pipelining of zoned writes Bart Van Assche
2024-11-19  0:28 ` [PATCH v16 21/26] scsi: core: Retry unaligned " Bart Van Assche
2024-11-19  0:28 ` [PATCH v16 22/26] scsi: sd: Increase retry count for " Bart Van Assche
2024-11-19  0:28 ` [PATCH v16 23/26] scsi: scsi_debug: Add the preserves_write_order module parameter Bart Van Assche
2024-11-19  0:28 ` [PATCH v16 24/26] scsi: scsi_debug: Support injecting unaligned write errors Bart Van Assche
2024-11-19  0:28 ` [PATCH v16 25/26] scsi: scsi_debug: Skip host/bus reset settle delay Bart Van Assche
2024-11-19  0:28 ` [PATCH v16 26/26] scsi: ufs: Inform the block layer about write ordering Bart Van Assche
     [not found]   ` <37f95f44-ab1d-20db-e0c7-94946cb9d4eb@quicinc.com>
2024-11-22 18:20     ` Bart Van Assche
2024-11-23  0:34       ` Can Guo
2024-11-19  8:01 ` [PATCH v16 00/26] Improve write performance for zoned UFS devices Damien Le Moal
2024-11-19 19:08   ` Bart Van Assche
2024-11-21  3:20     ` Damien Le Moal
2024-11-21 18:00       ` Bart Van Assche
2024-11-25  3:59         ` Damien Le Moal
2025-01-09 19:02   ` Bart Van Assche
2025-01-10  5:10     ` Damien Le Moal
2024-11-19 12:25 ` Christoph Hellwig
2024-11-19 18:52   ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2e55151f-2001-46b6-8f82-cf961bbdf7cf@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=hch@lst.de \
    --cc=jaegeuk@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).