public inbox for linux-scsi@vger.kernel.org
From: Bart Van Assche <Bart.VanAssche@wdc.com>
To: "lduncan@suse.com" <lduncan@suse.com>,
	"tang.chen@huawei.com" <tang.chen@huawei.com>,
	"cleech@redhat.com" <cleech@redhat.com>,
	"axboe@kernel.dk" <axboe@kernel.dk>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"guijianfeng@huawei.com" <guijianfeng@huawei.com>,
	"zhengchuan@huawei.com" <zhengchuan@huawei.com>
Subject: Re: [iscsi] Deadlock occurred when network is in error
Date: Mon, 14 Aug 2017 15:17:17 +0000
Message-ID: <1502723836.2333.3.camel@wdc.com>
In-Reply-To: <22E823DBB7698E489DC113638F7470729C17B6@DGGEMM506-MBX.china.huawei.com>

On Mon, 2017-08-14 at 11:23 +0000, Tangchen (UVP) wrote:
> Problem 2:
> 
> ***************
> [What it looks like]
> ***************
> When a SCSI device is removed while a network error occurs, __blk_drain_queue() can hang forever.
> 
> # cat /proc/19160/stack 
> [<ffffffff8005886d>] msleep+0x1d/0x30
> [<ffffffff80201a84>] __blk_drain_queue+0xe4/0x160
> [<ffffffff80202766>] blk_cleanup_queue+0x106/0x2e0
> [<ffffffffa000fb02>] __scsi_remove_device+0x52/0xc0 [scsi_mod]
> [<ffffffffa000fb9b>] scsi_remove_device+0x2b/0x40 [scsi_mod]
> [<ffffffffa000fbc0>] sdev_store_delete_callback+0x10/0x20 [scsi_mod]
> [<ffffffff801a4e75>] sysfs_schedule_callback_work+0x15/0x80
> [<ffffffff80062d69>] process_one_work+0x169/0x340
> [<ffffffff800667e3>] worker_thread+0x183/0x490
> [<ffffffff8006a526>] kthread+0x96/0xa0
> [<ffffffff8041ebb4>] kernel_thread_helper+0x4/0x10
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> The request queue of this device was stopped, so the following check is true forever:
> void __blk_run_queue(struct request_queue *q)
> {
>         if (unlikely(blk_queue_stopped(q)))
>                 return;
> 
>         __blk_run_queue_uncond(q);
> }
> 
> So __blk_run_queue_uncond() is never called, and the process hangs.
> 
> [ ... ]
>
> ****************
> [How to reproduce]
> ****************
> Unfortunately, I cannot reproduce it on the latest kernel.
> The script below helps to reproduce it, but not very often.
> 
> # create network error
> tc qdisc add dev eth1 root netem loss 60%
> 
> # restart iscsid and rescan scsi bus again and again
> while [ 1 ]
> do
> systemctl restart iscsid
> rescan-scsi-bus    # see http://manpages.ubuntu.com/manpages/trusty/man8/rescan-scsi-bus.8.html
> done

This should have been fixed by commit 36e3cf273977 ("scsi: Avoid that SCSI
queues get stuck"). The first mainline kernel that includes this commit is
kernel v4.11.
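
A rough way to check whether the running kernel is at or past v4.11 is
sketched below. This is only an illustration: the version_ge helper is a
hypothetical name, it assumes a sort(1) that supports -V, and the version
number alone is not conclusive because distribution kernels may carry
backports.

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B.
# Hypothetical helper; assumes sort(1) supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip any local suffix (e.g. "-rc1", "-generic") from the kernel release.
running=$(uname -r | sed 's/[^0-9.].*//')

if version_ge "$running" 4.11; then
    echo "kernel $running is v4.11 or later; commit 36e3cf273977 should be present"
else
    echo "kernel $running predates v4.11; check for a backport of 36e3cf273977"
fi
```

In a clone of the mainline git tree, "git tag --contains 36e3cf273977"
lists the tags that include the commit, which is a more direct check than
comparing version strings.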

> void __blk_run_queue(struct request_queue *q)
> {
> -       if (unlikely(blk_queue_stopped(q)))
> +       if (unlikely(blk_queue_stopped(q)) && unlikely(!blk_queue_dying(q)))
>                 return;
> 
>         __blk_run_queue_uncond(q);

Are you aware that the single queue block layer is on its way out and will
be removed sooner or later? Please focus your testing on scsi-mq. 

Regarding the above patch: it is wrong because it will cause lockups during
path removal for other block drivers. Please drop this patch.

Bart.
