From: Ming Lei <ming.lei@redhat.com>
To: "Holger Hoffstätte" <holger@applied-asynchrony.com>
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] SCSI: delay run queue if device is blocked in scsi_dev_queue_ready()
Date: Tue, 5 Dec 2017 14:56:42 +0800
Message-ID: <20171205065641.GC9989@ming.t460p>
In-Reply-To: <20171205051624.GB9989@ming.t460p>

On Tue, Dec 05, 2017 at 01:16:24PM +0800, Ming Lei wrote:
> On Mon, Dec 04, 2017 at 11:48:07PM +0000, Holger Hoffstätte wrote:
> > On Tue, 05 Dec 2017 06:45:08 +0800, Ming Lei wrote:
> > 
> > > On Mon, Dec 04, 2017 at 03:09:20PM +0000, Bart Van Assche wrote:
> > >> On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
> > >> > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
> > >> 
> > >> It might be safer to revert commit 0df21c86bdbf instead of trying to fix all
> > >> issues introduced by that commit for kernel version v4.15 ...
> > > 
> > > What are all issues in v4.15-rc? Up to now, it is the only issue reported,
> > > and can be fixed by this simple patch, which one can be thought as cleanup
> > > too.
> > 
> > Even with this patch I've encountered at least one hang that
> > seemed related. I'm using most of block/scsi-4.15 on top of 4.14 and
> > the hang in question was on a rotating disk. It could be solved by activating
> > a different scheduler on the hanging device; all hanging sync/df processes got
> > unstuck and all was fine again, which leads me to believe that there is at least
> > one more rare condition where delaying requests (as done in the budget patch)
> > leads to a hang.
> > 
> > This happened with mq-deadline which I was testing specifically to avoid
> > any BFQ-related side effects.
> 
> OK, this looks a new report.
> 
> Without any log, we can't make any progress, and even we can't guess
> what the issue is related with.
> 
> Could you post your dmesg log(include the hang process stack trace)? And
> dump the debugfs log by the following script when this hang happens?
> 
> 	http://people.redhat.com/minlei/tests/tools/dump-blk-info
> 
> BTW, you just need to pass the disk name to the script, such as: /dev/sda.

Thinking about the issue further, this patch only covers the case of
scsi_set_blocked(), but doesn't consider the case in which .get_budget()
is called inside blk_mq_dispatch_rq_list() for a request coming from
hctx->dispatch_list.

If .get_budget() is called from either blk_mq_do_dispatch_sched() or
blk_mq_do_dispatch_ctx(), we don't need to run the queue if the queue
is idle. But if it is called from blk_mq_dispatch_rq_list() for a request
coming from hctx->dispatch_list, we have to run the queue if the queue is
idle, as before.
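
To make the rule above concrete, here is a minimal user-space sketch of the
decision, not kernel code. The enum, its labels, and need_run_queue() are
hypothetical names made up for illustration; only the three dispatch
functions named in the comments come from the text above. The non-idle
branch is an assumption (that a completing request reruns the queue), which
is not spelled out in this mail.

```c
#include <stdbool.h>

/* Hypothetical labels for the place .get_budget() was called from. */
enum budget_caller {
	FROM_DISPATCH_SCHED,	/* blk_mq_do_dispatch_sched() */
	FROM_DISPATCH_CTX,	/* blk_mq_do_dispatch_ctx() */
	FROM_DISPATCH_RQ_LIST,	/* blk_mq_dispatch_rq_list(), request
				 * taken from hctx->dispatch_list */
};

/*
 * Hypothetical helper: after .get_budget() has failed, does the
 * queue have to be run again explicitly?
 */
static bool need_run_queue(enum budget_caller caller, bool queue_idle)
{
	/*
	 * Assumption, not stated in the mail: if the queue is not
	 * idle, a completing request will rerun it, so nothing extra
	 * is needed here.
	 */
	if (!queue_idle)
		return false;

	/*
	 * Idle queue: only the hctx->dispatch_list path has to run
	 * the queue, as before. The two scheduler/ctx dispatch paths
	 * do not.
	 */
	return caller == FROM_DISPATCH_RQ_LIST;
}
```

With this model, need_run_queue(FROM_DISPATCH_RQ_LIST, true) is the only
combination that asks for the queue to be run.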

So please ignore this patch; I will submit V2 to cover both cases.

Thanks,
Ming

