From: Ming Lei <ming.lei@redhat.com>
To: John Garry <john.garry@huawei.com>
Cc: Christoph Hellwig <hch@infradead.org>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
jejb@linux.vnet.ibm.com, linuxarm@huawei.com,
linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
chenxiang <chenxiang66@hisilicon.com>,
Kashyap Desai <kashyap.desai@broadcom.com>
Subject: Re: [PATCH 0/7] hisi_sas: Misc bugfixes and an optimisation patch
Date: Fri, 12 Oct 2018 07:36:34 +0800
Message-ID: <20181011233633.GA30271@ming.t460p>
In-Reply-To: <519812c7-970a-976c-7db5-a035bf4f4e24@huawei.com>
On Thu, Oct 11, 2018 at 03:07:33PM +0100, John Garry wrote:
> On 11/10/2018 14:32, Ming Lei wrote:
> > On Thu, Oct 11, 2018 at 02:12:11PM +0100, John Garry wrote:
> > > On 11/10/2018 11:15, Christoph Hellwig wrote:
> > > > On Thu, Oct 11, 2018 at 10:59:11AM +0100, John Garry wrote:
> > > > >
> > > > > > blk-mq tags are always per-host (which has actually caused problems for
> > > > > > ATA, which is now using its own per-device tags).
> > > > > >
> > > > >
> > > > > So, for example, if Scsi_host.can_queue = 2048 and Scsi_host.nr_hw_queues =
> > > > > 16, then rq tags are still in range [0, 2048) for that HBA, i.e. invariant
> > > > > on queue count?
> > > >
> > > > Yes, if can_queue is 2048 you will get tags from 0..2047.
> > > >
> > >
> > > I should be clear about some things before discussing this further. Our
> > > device has 16 hw queues. And each command we send to any queue in the device
> > > must have a unique tag across all hw queues for that device, and should be
> > > in the range [0, 2048) - it's called an IPTT. So Scsi_host.can_queue = 2048.
> >
> > Could you describe a bit about IPTT?
> >
>
> IPTT is an "Initiator Tag". It is a tag to map to the context of a hw queue
> command. It is related to SAS protocol Initiator Port tag. I think that most
> SAS HBAs have a similar concept.
>
> IPTTs are limited, and must be recycled when an IO completes. Our hw
> supports up to 2048. So we have a limit of 2048 commands issued at any point
> in time.
>
> Previously we had been managing IPTT in LLDD, but found rq tag can be used
> as IPTT (as in 6/7), which gave a good performance boost.
>
> > Looks like the 16 hw queues are like reply queues in other drivers,
> > such as megaraid_sas, but given that all 16 reply queues share one tagset,
> > the hw queue number has to be 1 from blk-mq's view.
> >
> > >
> > > However today we only expose a single queue to upper layer (for unrelated
> > > LLDD error handling restriction). We hope to expose all 16 queues in future,
> > > which is what I meant by "enabling SCSI MQ in the driver". However, with
> > > 6/7, this creates a problem, below.
> >
> > If the tag of each request from all hw queues has to be unique, you
> > can't expose all 16 queues.
>
> Well we can if we generate and manage the IPTT in the LLDD, as we had been
> doing. If we want to use the rq tag - which 6/7 is for - then we can't.
In theory, you can still generate and manage the IPTT in the LLDD by
simply ignoring rq->tag, while enabling SCSI_MQ with 16 hw queues.

However, I am not sure how much this would improve performance, and it may
even degrade IO perf. If 16 hw queues are exposed to blk-mq, up to
16 * .can_queue requests may be queued to the driver, and allocation & free
on the single IPTT pool will become a bottleneck.

From my experiments on the host tagset, it might be a good tradeoff to
allocate one hw queue for each NUMA node, to avoid remote access to the
dispatch data/request structures in this case. But your IPTT pool is still
shared by all CPUs, so maybe you can try the smart sbitmap:

https://www.spinics.net/lists/linux-scsi/msg117920.html
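
To make the bottleneck concrete, here is a minimal userspace sketch of a single IPTT pool shared by all submitters. It is not the hisi_sas code and it is not the sbitmap from the link above: the names iptt_pool, iptt_alloc(), iptt_free() and max_inflight_requests() are hypothetical, the sizes 2048 and 16 come from the figures quoted in this thread, and __builtin_ctzll() assumes GCC or Clang. Every allocation and free goes through the same few atomic words, and those are the cache lines all CPUs would contend on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define IPTT_COUNT    2048   /* hw limit quoted in the thread */
#define NR_HW_QUEUES  16     /* hw queue count quoted in the thread */
#define BITS_PER_WORD 64
#define IPTT_WORDS    (IPTT_COUNT / BITS_PER_WORD)

/* One bitmap shared by every submitting CPU (hypothetical layout). */
struct iptt_pool {
    _Atomic uint64_t map[IPTT_WORDS];
};

/* Returns a free IPTT in [0, IPTT_COUNT), or -1 if the pool is exhausted. */
static int iptt_alloc(struct iptt_pool *pool)
{
    for (int w = 0; w < IPTT_WORDS; w++) {
        uint64_t old = atomic_load(&pool->map[w]);

        while (old != UINT64_MAX) {
            int bit = __builtin_ctzll(~old);        /* lowest clear bit */
            uint64_t updated = old | (1ULL << bit);

            /* On failure 'old' is reloaded and the word is retried. */
            if (atomic_compare_exchange_weak(&pool->map[w], &old, updated))
                return w * BITS_PER_WORD + bit;
        }
    }
    return -1;
}

/* Returns an IPTT to the pool when the IO completes. */
static void iptt_free(struct iptt_pool *pool, int iptt)
{
    assert(iptt >= 0 && iptt < IPTT_COUNT);
    atomic_fetch_and(&pool->map[iptt / BITS_PER_WORD],
                     ~(1ULL << (iptt % BITS_PER_WORD)));
}

/* Upper bound on requests that may reach the driver if each hw queue
 * has its own tag space of depth can_queue. */
static unsigned int max_inflight_requests(unsigned int nr_hw_queues,
                                          unsigned int can_queue)
{
    return nr_hw_queues * can_queue;
}
```

With the numbers from this thread, max_inflight_requests(16, 2048) is 32768 requests competing for 2048 IPTTs, so most of them would have to wait or be requeued in the driver.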
>
> >
> > >
> > > > IFF your device needs different tags for different queues it can use
> > > > the blk_mq_unique_tag helper to generate a unique global tag.
> > >
> > > So this helper can't help, as fundamentally the issue is "the tag field in
> > > struct request is unique per hardware queue but not across all hw queues".
> > > Indeed blk_mq_unique_tag() does give a unique global tag, but cannot be used
> > > for the IPTT.
> > >
> > > OTOH, We could expose 16 queues to upper layer, and drop 6/7, but we found
> > > it performs worse.
> >
> > We discussed this issue before, but have not found a good solution yet for
> > exposing multiple hw queues to blk-mq.
>
> I just think that it's unfortunate that enabling blk-mq means that the LLDD
> loses this unique tag across all queues in range [0, Scsi_host.can_queue),
> so much so that we found performance better by not exposing multiple queues
> and continuing to use single rq tag...
It isn't a new problem; we discussed it a lot for megaraid_sas, which is in
the same situation as yours. You may find the discussion on the block list.
Kashyap Desai did lots of testing on this case.
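
For reference on the blk_mq_unique_tag() point quoted above, the following is a userspace model of how that helper combines the hw queue index and the per-queue tag. The 16-bit split is recalled from include/linux/blk-mq.h and should be checked against the tree in use; the model_* names and fits_iptt_range() are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Assumed to mirror BLK_MQ_UNIQUE_TAG_BITS / BLK_MQ_UNIQUE_TAG_MASK. */
#define MODEL_UNIQUE_TAG_BITS 16
#define MODEL_UNIQUE_TAG_MASK ((1U << MODEL_UNIQUE_TAG_BITS) - 1)

/* hw queue index in the upper bits, per-queue tag in the lower 16 bits. */
static uint32_t model_unique_tag(uint32_t hwq, uint32_t tag)
{
    return (hwq << MODEL_UNIQUE_TAG_BITS) | (tag & MODEL_UNIQUE_TAG_MASK);
}

/* Recover the hw queue index from a unique tag. */
static uint32_t model_unique_tag_to_hwq(uint32_t unique_tag)
{
    return unique_tag >> MODEL_UNIQUE_TAG_BITS;
}

/* Recover the per-queue tag from a unique tag. */
static uint32_t model_unique_tag_to_tag(uint32_t unique_tag)
{
    return unique_tag & MODEL_UNIQUE_TAG_MASK;
}

/* An IPTT must lie in [0, can_queue); returns 1 if 'value' qualifies. */
static int fits_iptt_range(uint32_t value, uint32_t can_queue)
{
    return value < can_queue;
}
```

Under this model, tag 5 on hw queue 1 becomes 65541. The value is globally unique but falls outside [0, 2048), which is why it cannot serve directly as an IPTT.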
>
> >
> > However, we still get good performance in case of none scheduler by the
> > following patches:
> >
> > 8824f62246be blk-mq: fail the request in case issue failure
> > 6ce3dd6eec11 blk-mq: issue directly if hw queue isn't busy in case of 'none'
> >
>
> I think that these patches would have been included in our testing. I need
> to check.
Please switch to the none IO scheduler in your test; with it, IO perf has
been observed to become good on megaraid_sas.
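
One way to switch the scheduler from a test harness is to write the elevator name into the device's sysfs attribute. This is a minimal sketch: set_io_scheduler() is a hypothetical helper, and the attribute path is passed in rather than assumed.

```c
#include <stdio.h>

/*
 * Write an elevator name (e.g. "none") into a scheduler attribute file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int set_io_scheduler(const char *attr_path, const char *name)
{
    FILE *f = fopen(attr_path, "w");
    int ok;

    if (!f)
        return -1;

    ok = fputs(name, f) >= 0;
    if (fclose(f) != 0)     /* write errors may only surface on close */
        ok = 0;

    return ok ? 0 : -1;
}
```

For a real disk the call would look like set_io_scheduler("/sys/block/sda/queue/scheduler", "none"), where sda is a placeholder device name and the caller needs root privileges.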
Thanks,
Ming