From: Bart Van Assche <bvanassche@acm.org>
To: open-iscsi@googlegroups.com, Hannes Reinecke <hare@suse.de>,
	Sagi Grimberg <sagig@dev.mellanox.co.il>,
	lsf-pc@lists.linux-foundation.org
Cc: linux-scsi <linux-scsi@vger.kernel.org>,
	target-devel <target-devel@vger.kernel.org>
Subject: Re: [LSF/MM TOPIC] iSCSI MQ adoption via MCS discussion
Date: Thu, 08 Jan 2015 08:50:38 +0100
Message-ID: <54AE36CE.8020509@acm.org>
In-Reply-To: <54ADA777.6090801@cs.wisc.edu>

On 01/07/15 22:39, Mike Christie wrote:
> On 01/07/2015 10:57 AM, Hannes Reinecke wrote:
>> On 01/07/2015 05:25 PM, Sagi Grimberg wrote:
>>> Hi everyone,
>>>
>>> Now that scsi-mq is fully included, we need an iSCSI initiator that
>>> would use it to achieve scalable performance. The need is even greater
>>> for iSCSI offload devices and transports that support multiple HW
>>> queues. As iSER maintainer I'd like to discuss the way we would choose
>>> to implement that in iSCSI.
>>>
>>> My measurements show that the iSER initiator can scale up to ~2.1M IOPS
>>> with multiple sessions but only ~630K IOPS with a single session, where
>>> the most significant bottleneck is the (single) core processing
>>> completions.
>>>
>>> In the existing single connection per session model, given that command
>>> ordering must be preserved session-wide, we end up in a serial command
>>> execution over a single connection which is basically a single queue
>>> model. The best fit seems to be plugging iSCSI MCS as a multi-queued
>>> scsi LLDD. In this model, a hardware context will have a 1x1 mapping
>>> with an iSCSI connection (TCP socket or a HW queue).
>>>
>>> iSCSI MCS and its role in the presence of the dm-multipath layer was
>>> discussed several times in the past decade(s). The basic need for MCS is
>>> implementing a multi-queue data path, so perhaps we may want to avoid
>>> doing any type of link aggregation or load balancing so as not to overlap
>>> dm-multipath. For example we can implement ERL=0 (which is basically the
>>> scsi-mq ERL) and/or restrict a session to a single portal.
>>>
>>> As I see it, the to-dos are:
>>> 1. Getting MCS to work (kernel + user-space) with ERL=0 and a
>>>     round-robin connection selection (per scsi command execution).
>>> 2. Plug into scsi-mq - exposing num_connections as nr_hw_queues and
>>>     using blk-mq based queue (conn) selection.
>>> 3. Rework iSCSI core locking scheme to avoid session-wide locking
>>>     as much as possible.
>>> 4. Use blk-mq pre-allocation and tagging facilities.
>>>
>>> I've recently started looking into this. I would like the community to
>>> agree (or debate) on this scheme and also talk about implementation
>>> with anyone who is also interested in this.
>>>
>> Yes, that's a really good topic.
>>
>> I've pondered implementing MC/S for iscsi/TCP but then I've figured my
>> network implementation knowledge doesn't spread that far.
>> So yeah, a discussion here would be good.
>>
>> Mike? Any comments?
>
> I have been working under the assumption that people would be ok with
> MCS upstream if we are only using it to handle the issue where we want
> to do something like have a tcp/iscsi connection per CPU then map the
> connection to a blk_mq_hw_ctx. In this more limited MCS implementation
> there would be no iscsi layer code to do something like load balance
> across ports or transport paths like how dm-multipath does, so there
> would be no feature/code duplication. For balancing across hctxs, the
> iscsi layer would also leave that up to whatever we end up with in
> upper layers, so again no feature/code duplication with upper layers.
>
> So pretty non controversial I hope :)
>
> If people want to add something like round robin connection selection in
> the iscsi layer, then I think we want to leave that for after the
> initial merge, so people can argue about that separately.

Hello Sagi and Mike,

I agree with Sagi that adding scsi-mq support in the iSER initiator
would help iSER users because that would allow these users to configure
a single iSER target and use the multiqueue feature instead of having to
configure multiple iSER targets to spread the workload over multiple
CPUs at the target side.

And I agree with Mike that implementing scsi-mq support in the iSER
initiator as multiple independent connections is probably a better
choice than MC/S. RFC 3720 requires that iSCSI command numbering
(CmdSN) be session-wide. With MC/S this means maintaining a single
counter that is shared by all connections of a session. Such a counter
would be a contention point. I'm afraid that, because of that counter,
performance on a multi-socket initiator system with a scsi-mq
implementation based on MC/S could be worse than with the approach of
multiple iSER targets. Hence my preference for an approach based on
multiple independent iSER connections instead of MC/S.
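
To make the contention argument concrete, here is a minimal user-space
sketch in C11. It is not taken from the iSCSI or iSER code: the struct
and function names, NUM_CONNS and the 64-byte cache line size are all
assumptions made for illustration. With MC/S every connection
increments the same session-wide counter, so that cache line bounces
between CPUs; with independent connections each one has its own
counter on its own cache line.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_CONNS  4   /* hypothetical: one connection per hardware queue */
#define CACHE_LINE 64  /* assumed cache line size in bytes */

/* MC/S-style session: command numbering is session-wide, so all
 * connections of the session share this one counter. */
struct mcs_session {
	_Atomic uint32_t cmdsn;
};

/* Independent connections: each connection is its own session and
 * owns a counter that sits on a separate cache line. */
struct indep_conn {
	_Alignas(CACHE_LINE) _Atomic uint32_t cmdsn;
};

static_assert(sizeof(struct indep_conn) >= CACHE_LINE,
	      "per-connection counters must not share a cache line");

/* Every submitting CPU hits the same atomic, whatever the connection. */
static inline uint32_t mcs_next_cmdsn(struct mcs_session *s, unsigned conn_id)
{
	(void)conn_id; /* unused: the counter does not depend on the connection */
	return atomic_fetch_add_explicit(&s->cmdsn, 1, memory_order_relaxed);
}

/* Each submitting CPU only touches the counter of its own connection. */
static inline uint32_t indep_next_cmdsn(struct indep_conn *conns,
					 unsigned conn_id)
{
	return atomic_fetch_add_explicit(&conns[conn_id].cmdsn, 1,
					 memory_order_relaxed);
}
```

In the first variant two commands submitted on different connections
still get consecutive numbers from one shared location. In the second
variant connections 0 and 1 can both hand out the same number without
ever touching each other's cache line. A real initiator would also
have to track the command window per session, which this sketch leaves
out.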

Bart.

