From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <Bart.VanAssche@wdc.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"hch@infradead.org" <hch@infradead.org>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"john.garry@huawei.com" <john.garry@huawei.com>,
	"osandov@fb.com" <osandov@fb.com>,
	"jejb@linux.vnet.ibm.com" <jejb@linux.vnet.ibm.com>,
	"loberman@redhat.com" <loberman@redhat.com>
Subject: Re: [PATCH] SCSI: don't get target/host busy_count in scsi_mq_get_budget()
Date: Thu, 9 Nov 2017 10:05:08 +0800
Message-ID: <20171109020506.GB14520@ming.t460p>
In-Reply-To: <1510159293.24237.19.camel@wdc.com>
On Wed, Nov 08, 2017 at 04:41:35PM +0000, Bart Van Assche wrote:
> On Tue, 2017-11-07 at 20:06 -0700, Jens Axboe wrote:
> > At this point, I have no idea what Bart's setup looks like. Bart, it
> > would be REALLY helpful if you could tell us how you are reproducing
> > your hang. I don't know why this has to be dragged out.
>
> Hello Jens,
>
> It is a disappointment to me that you have allowed Ming to evaluate other
> approaches than reverting "blk-mq: don't handle TAG_SHARED in restart". That

As I mentioned in another email to Jens, I agree to revert that patch
because of the TAG_WAITING issue in Jens's test case.

> patch namely replaces an algorithm that is trusted by the community with an
> algorithm of which even Ming acknowledged that it is racy. A quote from [1]:
> "IO hang may be caused if all requests are completed just before the current
> SCSI device is added to shost->starved_list". I don't know of any way to fix
> that race other than serializing request submission and completion by adding
> locking around these actions, which is something we don't want. Hence my
> request to revert that patch.

That can't be the reason for this revert.

This issue[1] is fixed by '[PATCH] SCSI: don't get target/host busy_count in
scsi_mq_get_budget()', which follows this idea:

- we add the sdev to shost->starved_list in scsi_target_queue_ready(),
  and the return value is set to BLK_STS_RESOURCE

- atomic_read(&sdev->device_busy) is checked to see if there is any
  pending request; the queue is run if it is zero, otherwise we
  depend on scsi_end_request() from a pending request to restart the queue

- you may mention that sdev->device_busy may become 0 just after the
  check; in that case the completion still sees the sdev in
  shost->starved_list and does the restart, so there is no IO hang

If you think something above is wrong, please comment on it directly.
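
The three steps above can be sketched as a small single-threaded model in
userspace C. This is only an illustration under stated assumptions, not the
kernel code: struct model_sdev, the model_* functions and the enum are all
hypothetical names, a plain int stands in for the atomic sdev->device_busy,
and a bool stands in for membership of shost->starved_list. The model also
assumes that the completion-side restart takes the device off the starved
list. It enumerates the three points at which the last in-flight request can
complete relative to the dispatch path and counts how often the queue gets
restarted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are NOT the kernel's types. */
struct model_sdev {
	int device_busy;      /* stands in for the atomic sdev->device_busy */
	bool on_starved_list; /* stands in for membership of shost->starved_list */
	int queue_runs;       /* how many times the queue was (re)started */
};

/* Step 1: target not ready, park the device (dispatch gets BLK_STS_RESOURCE). */
static void model_mark_starved(struct model_sdev *sdev)
{
	sdev->on_starved_list = true;
}

/* Step 2: with nothing in flight no completion will restart us, so run the queue. */
static void model_check_busy(struct model_sdev *sdev)
{
	if (sdev->device_busy == 0)
		sdev->queue_runs++;
}

/* Completion side: restart the queue if the device is parked (model assumption). */
static void model_end_request(struct model_sdev *sdev)
{
	assert(sdev->device_busy > 0);
	sdev->device_busy--;
	if (sdev->on_starved_list) {
		sdev->on_starved_list = false;
		sdev->queue_runs++;
	}
}

/* Where the single in-flight request completes relative to the dispatch path. */
enum model_order {
	COMPLETE_BEFORE_MARK,  /* before the device is added to the starved list */
	COMPLETE_BETWEEN,      /* after the add, before the device_busy check */
	COMPLETE_AFTER_CHECK,  /* just after the device_busy check */
};

/* Returns the number of queue restarts for the given interleaving. */
int model_run(enum model_order order)
{
	struct model_sdev sdev = { .device_busy = 1 };

	if (order == COMPLETE_BEFORE_MARK)
		model_end_request(&sdev);
	model_mark_starved(&sdev);
	if (order == COMPLETE_BETWEEN)
		model_end_request(&sdev);
	model_check_busy(&sdev);
	if (order == COMPLETE_AFTER_CHECK)
		model_end_request(&sdev);
	return sdev.queue_runs;
}
```

In this model, model_run() returns at least 1 for every ordering, so the
queue is always restarted by either the dispatch path or the completion. The
model does not cover memory ordering between the two CPUs, which the real
code has to get right separately.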

Without this patch, an IO hang can be triggered in test 01 of srp-test
with no out-of-tree patch needed. With this patch applied on V4.14-rc4,
no IO hang is observed any more.

>
> Regarding the test I run, here is a summary of what I mentioned in previous
> e-mails:
> * I modified the SRP initiator such that the SCSI target queue depth is
> reduced to one by setting starget->can_queue to 1 from inside
> scsi_host_template.target_alloc.
> * With that modified SRP initiator I run the srp-test software as follows
> until something breaks:
> while ./run_tests -f xfs -d -e deadline -r 60; do :; done
>
> Today a system with at least one InfiniBand HCA is required to run that test.
> When I have the time I will post the SRP initiator and target patches on the
> linux-rdma mailing list that make it possible to run that test against the
> SoftRoCE driver (drivers/infiniband/sw/rxe). The only hardware required to
> use that driver is an Ethernet adapter.

The thing is that we still don't know the root cause of your issue, and
keeping the restart for TAG_SHARED can be thought of as a workaround. Maybe
it is the same as Jens's issue, maybe it is something else; we don't know,
and no log, such as sched_tags or tags, has been provided.

It is easy to see a > 20% IOPS drop with the restart for TAG_SHARED in an
8-LUN scsi_debug test.

--
Ming