From: Shirley Ma <shirley.ma-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
Sagi Grimberg
<sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Cc: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>,
linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: IB_CQ_VECTOR_LEAST_ATTACHED
Date: Sun, 07 Dec 2014 16:46:01 -0800 [thread overview]
Message-ID: <5484F4C9.6010304@oracle.com> (raw)
In-Reply-To: <A1DD5C9B-ED0E-42D1-A20C-710C7DAB514B-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
On 12/07/2014 12:08 PM, Chuck Lever wrote:
>
> On Dec 7, 2014, at 5:20 AM, Sagi Grimberg <sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>
>> On 12/4/2014 9:41 PM, Shirley Ma wrote:
>>> On 12/04/2014 10:43 AM, Bart Van Assche wrote:
>>>> On 12/04/14 17:47, Shirley Ma wrote:
>>>>> What's the history of this patch?
>>>>> http://lists.openfabrics.org/pipermail/general/2008-May/050813.html
>>>>>
>>>>> I am working on a multiple-QP workload, and I implemented a similar
>>>>> approach to IB_CQ_VECTOR_LEAST_ATTACHED, which brings roughly a 17%
>>>>> improvement in small-I/O performance. I think this completion-vector
>>>>> load balancing should be maintained in the provider, not in the
>>>>> caller. I didn't see this patch submitted to the mainline kernel;
>>>>> is there a reason why?
>>>>
>>>> My interpretation is that an approach similar to IB_CQ_VECTOR_LEAST_ATTACHED is useful on single-socket systems but suboptimal on multi-socket systems. Hence the code for associating CQ sets with CPU sockets in the SRP initiator. These changes have been queued for kernel 3.19. See also branch drivers-for-3.19 in git repo git://git.infradead.org/users/hch/scsi-queue.git.
>>>
>>> What I did was manually pin the IRQ and the worker thread to the same
>>> socket. In NFS/RDMA the CQ is created when the file system is mounted,
>>> but the workload thread might start on a different socket, so a
>>> per-CPU based implementation might not apply. I will look at the SRP
>>> implementation.
>>>
>>
>> Hey Shirley,
>>
>> Bart is correct: in general the LEAST_ATTACHED approach might not be
>> optimal in the NUMA case. The thread <-> QP/CQ/CPU assignment is
>> addressed by the multi-channel approach, which to my understanding
>> won't be implemented in NFSoRDMA in the near future (right, Chuck?)
>
> As I understand it, the preference of the Linux NFS community is that
> any multi-pathing solution should be transparent to the ULP (NFS and
> RPC, in this case). mp-tcp is ideal in that the ULP is presented with
> a single virtual transport instance, but under the covers, that instance
> can be backed by multiple active paths.
>
> Alternately, pNFS can be deployed. This allows a dataset to be striped
> across multiple servers (and networks). There is a rather high bar to
> entering this arena however.
>
> Speculating aloud, multiple QPs per transport instance may require
> implementation changes on the server as well as the client. Any
> interoperability dependencies should be documented via a standards
> process.
>
> And note that an RPC transport (at least in kernel) is shared across
> many user applications and mount points. I find it difficult to visualize
> an intuitive and comprehensive administrative interface where enough
> guidance is provided to place a set of NFS applications and an RPC
> transport in the same resource domain (maybe cgroups?).
>
> So for the time being I prefer staying with a single QP per client-
> server pair.
>
> A large NFS client can actively use many NFS servers, however. Each
> client-server pair would benefit from finding "least-used" resources
> when QP and CQs are created. That is something we can leverage today.
>
Yes, that's something I am evaluating now for a single NFS client talking
to multiple destination servers. I can see a better-than-15% bandwidth
increase when simulating multiple server mount points through different
IPoIB child interfaces and changing the create_cq completion vector from
0 to "least-used", so that completion vectors are balanced among the QPs.
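For reference, the selection policy I'm experimenting with boils down to
something like the following. This is a minimal user-space sketch of the
idea only, not kernel code: the cq_count[] array and pick_least_attached()
helper are hypothetical stand-ins for whatever per-vector bookkeeping the
provider would keep; the real entry point is ib_create_cq()/ibv_create_cq(),
which just takes the chosen comp_vector as an argument.

```c
#include <assert.h>

/*
 * Hypothetical sketch of "least attached" completion-vector selection.
 * cq_count[v] tracks how many CQs are currently attached to vector v.
 */

#define MAX_COMP_VECTORS 16

static unsigned int cq_count[MAX_COMP_VECTORS];

/* Return the vector with the fewest attached CQs, and account for
 * the new attachment. Ties go to the lowest-numbered vector. */
static int pick_least_attached(int num_vectors)
{
    int best = 0;

    for (int v = 1; v < num_vectors; v++) {
        if (cq_count[v] < cq_count[best])
            best = v;
    }
    cq_count[best]++;
    return best;
}
```

With four vectors, successive CQ creations fan out 0, 1, 2, 3 and then wrap
back to 0, which is exactly the balancing effect I'm seeing instead of every
CQ piling onto vector 0.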
>> However, the LEAST_ATTACHED vector hint will be revived in the future,
>> as there is a need to spread applications across different interrupt
>> vectors (especially for user space).
>>
>> CC'ing Matan who is working on this, perhaps he can comment on this as
>> well.
>
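To Bart's and Sagi's NUMA point, the same hint could be constrained to
vectors whose interrupts are affinitized to the caller's socket. Again a
hedged user-space sketch: the vec_node[] table is an assumed
vector-to-NUMA-node mapping (in practice it would be derived from the
device's MSI-X IRQ affinity), not a real verbs API.

```c
#include <assert.h>

/*
 * Hypothetical sketch: least-attached selection restricted to completion
 * vectors whose IRQs are affinitized to a given NUMA node. vec_node[] is
 * an assumed mapping for an 8-vector device spanning two sockets.
 */

#define NVEC 8

static const int vec_node[NVEC] = { 0, 0, 0, 0, 1, 1, 1, 1 };
static unsigned int attached[NVEC];

/* Return the least-attached vector on 'node', or -1 if the node has
 * no vectors; account for the new attachment on success. */
static int pick_least_attached_on_node(int node)
{
    int best = -1;

    for (int v = 0; v < NVEC; v++) {
        if (vec_node[v] != node)
            continue;
        if (best < 0 || attached[v] < attached[best])
            best = v;
    }
    if (best >= 0)
        attached[best]++;
    return best;
}
```

A caller on socket 1 would cycle through vectors 4-7 only, so completions
stay local even while the load stays balanced within the socket.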