* IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-04 16:47 Shirley Ma
From: Shirley Ma @ 2014-12-04 16:47 UTC (permalink / raw)
To: linux-rdma, Or Gerlitz, eli-VPRAkNaXOzVWk0Htik3J/w
Hello Or, Eli,
What's the history of this patch?
http://lists.openfabrics.org/pipermail/general/2008-May/050813.html
I am working on a multiple-QP workload, and I created a similar approach
with IB_CQ_VECTOR_LEAST_ATTACHED, which brings roughly a 17% improvement in
small I/O performance. I think this completion-vector load balancing should
be maintained in the provider, not the caller. I didn't see this patch
submitted to the mainline kernel; is there a reason why not?
Thanks
Shirley
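
The idea behind the hint is that the provider keeps a per-device count of
how many CQs are attached to each completion vector and hands out the least
loaded vector at CQ creation time. A minimal stand-alone C model of that
policy (the array and helper names below are illustrative only, not taken
from the 2008 patch or from any driver):

#include <limits.h>
#include <stdio.h>

#define NUM_COMP_VECTORS 8

/* Per-device count of CQs currently attached to each completion vector. */
static unsigned int cq_count[NUM_COMP_VECTORS];

/*
 * Model of a "least attached" policy: return the vector with the fewest
 * CQs bound to it and account for the new attachment.
 */
static int pick_least_attached_vector(void)
{
	unsigned int best_count = UINT_MAX;
	int best = 0, i;

	for (i = 0; i < NUM_COMP_VECTORS; i++) {
		if (cq_count[i] < best_count) {
			best_count = cq_count[i];
			best = i;
		}
	}
	cq_count[best]++;
	return best;
}

int main(void)
{
	int i;

	/* Ten CQs spread 2-2-1-1-1-1-1-1 across the eight vectors. */
	for (i = 0; i < 10; i++)
		printf("CQ %d -> comp_vector %d\n", i,
		       pick_least_attached_vector());
	return 0;
}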
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-04 18:43 Bart Van Assche
From: Bart Van Assche @ 2014-12-04 18:43 UTC (permalink / raw)
To: Shirley Ma, linux-rdma, Or Gerlitz, eli-VPRAkNaXOzVWk0Htik3J/w

On 12/04/14 17:47, Shirley Ma wrote:
> What's the history of this patch?
> http://lists.openfabrics.org/pipermail/general/2008-May/050813.html
>
> I am working on a multiple-QP workload, and I created a similar approach
> with IB_CQ_VECTOR_LEAST_ATTACHED, which brings roughly a 17% improvement
> in small I/O performance. I think this completion-vector load balancing
> should be maintained in the provider, not the caller. I didn't see this
> patch submitted to the mainline kernel; is there a reason why not?

My interpretation is that an approach similar to IB_CQ_VECTOR_LEAST_ATTACHED
is useful on single-socket systems but suboptimal on multi-socket systems.
Hence the code for associating CQ sets with CPU sockets in the SRP
initiator. These changes have been queued for kernel 3.19. See also branch
drivers-for-3.19 in git repo git://git.infradead.org/users/hch/scsi-queue.git.

Bart.
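
A rough sketch of the per-socket association Bart describes, assuming the
driver affinitizes a fixed group of completion vectors to each CPU socket
(the constants and helper below are purely illustrative and are not the
actual SRP initiator code):

#include <stdio.h>

#define NUM_SOCKETS      2
#define VECTORS_PER_NODE 4   /* assumed: vectors are grouped per socket */

/*
 * Model of a per-socket CQ set: a ULP that knows which socket its worker
 * threads run on picks a vector owned by that socket, rather than a
 * globally "least attached" one.
 */
static int vector_for_socket(int socket, int index)
{
	return socket * VECTORS_PER_NODE + (index % VECTORS_PER_NODE);
}

int main(void)
{
	int s, i;

	for (s = 0; s < NUM_SOCKETS; s++)
		for (i = 0; i < 2; i++)
			printf("socket %d, CQ %d -> comp_vector %d\n",
			       s, i, vector_for_socket(s, i));
	return 0;
}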
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-04 19:41 Shirley Ma
From: Shirley Ma @ 2014-12-04 19:41 UTC (permalink / raw)
To: Bart Van Assche, linux-rdma, Or Gerlitz, eli-VPRAkNaXOzVWk0Htik3J/w

On 12/04/2014 10:43 AM, Bart Van Assche wrote:
> On 12/04/14 17:47, Shirley Ma wrote:
>> What's the history of this patch?
>> http://lists.openfabrics.org/pipermail/general/2008-May/050813.html
>>
>> I am working on a multiple-QP workload, and I created a similar approach
>> with IB_CQ_VECTOR_LEAST_ATTACHED, which brings roughly a 17% improvement
>> in small I/O performance. I think this completion-vector load balancing
>> should be maintained in the provider, not the caller. I didn't see this
>> patch submitted to the mainline kernel; is there a reason why not?
>
> My interpretation is that an approach similar to
> IB_CQ_VECTOR_LEAST_ATTACHED is useful on single-socket systems but
> suboptimal on multi-socket systems. Hence the code for associating CQ
> sets with CPU sockets in the SRP initiator. These changes have been
> queued for kernel 3.19. See also branch drivers-for-3.19 in git repo
> git://git.infradead.org/users/hch/scsi-queue.git.

What I did was manually control the IRQ and the working thread on the same
socket. The CQ is created when mounting the file system in NFS/RDMA, but
the workload thread might start on a different socket, so a per-CPU based
implementation might not apply. I will look at the SRP implementation.

Thanks,
Shirley
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-07 10:20 Sagi Grimberg
From: Sagi Grimberg @ 2014-12-07 10:20 UTC (permalink / raw)
To: Shirley Ma, Bart Van Assche, linux-rdma, Or Gerlitz, eli-VPRAkNaXOzVWk0Htik3J/w
Cc: Matan Barak

On 12/4/2014 9:41 PM, Shirley Ma wrote:
> On 12/04/2014 10:43 AM, Bart Van Assche wrote:
>> On 12/04/14 17:47, Shirley Ma wrote:
>>> What's the history of this patch?
>>> http://lists.openfabrics.org/pipermail/general/2008-May/050813.html
>>>
>>> I am working on a multiple-QP workload, and I created a similar
>>> approach with IB_CQ_VECTOR_LEAST_ATTACHED, which brings roughly a 17%
>>> improvement in small I/O performance. I think this completion-vector
>>> load balancing should be maintained in the provider, not the caller.
>>> I didn't see this patch submitted to the mainline kernel; is there a
>>> reason why not?
>>
>> My interpretation is that an approach similar to
>> IB_CQ_VECTOR_LEAST_ATTACHED is useful on single-socket systems but
>> suboptimal on multi-socket systems. Hence the code for associating CQ
>> sets with CPU sockets in the SRP initiator. These changes have been
>> queued for kernel 3.19. See also branch drivers-for-3.19 in git repo
>> git://git.infradead.org/users/hch/scsi-queue.git.
>
> What I did was manually control the IRQ and the working thread on the
> same socket. The CQ is created when mounting the file system in NFS/RDMA,
> but the workload thread might start on a different socket, so a per-CPU
> based implementation might not apply. I will look at the SRP
> implementation.

Hey Shirley,

Bart is correct; in general the LEAST_ATTACHED approach might not be
optimal in the NUMA case. The thread <-> QP/CQ/CPU assignment is
addressed by the multi-channel approach, which to my understanding won't
be implemented in NFSoRDMA in the near future (right, Chuck?).

However, the LEAST_ATTACHED vector hint will come up again in the
future, as there is a need to spread applications across different
interrupt vectors (especially for user space).

CC'ing Matan, who is working on this; perhaps he can comment on this as
well.

Sagi.
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-07 12:22 Matan Barak
From: Matan Barak @ 2014-12-07 12:22 UTC (permalink / raw)
To: Sagi Grimberg, Shirley Ma, Bart Van Assche, linux-rdma, Or Gerlitz, eli-VPRAkNaXOzVWk0Htik3J/w

On 12/7/2014 12:20 PM, Sagi Grimberg wrote:
> On 12/4/2014 9:41 PM, Shirley Ma wrote:
>> On 12/04/2014 10:43 AM, Bart Van Assche wrote:
>>> On 12/04/14 17:47, Shirley Ma wrote:
>>>> What's the history of this patch?
>>>> http://lists.openfabrics.org/pipermail/general/2008-May/050813.html
>>>>
>>>> I am working on a multiple-QP workload, and I created a similar
>>>> approach with IB_CQ_VECTOR_LEAST_ATTACHED, which brings roughly a 17%
>>>> improvement in small I/O performance. I think this completion-vector
>>>> load balancing should be maintained in the provider, not the caller.
>>>> I didn't see this patch submitted to the mainline kernel; is there a
>>>> reason why not?
>>>
>>> My interpretation is that an approach similar to
>>> IB_CQ_VECTOR_LEAST_ATTACHED is useful on single-socket systems but
>>> suboptimal on multi-socket systems. Hence the code for associating CQ
>>> sets with CPU sockets in the SRP initiator. These changes have been
>>> queued for kernel 3.19. See also branch drivers-for-3.19 in git repo
>>> git://git.infradead.org/users/hch/scsi-queue.git.
>>
>> What I did was manually control the IRQ and the working thread on the
>> same socket. The CQ is created when mounting the file system in
>> NFS/RDMA, but the workload thread might start on a different socket, so
>> a per-CPU based implementation might not apply. I will look at the SRP
>> implementation.
>
> Hey Shirley,
>
> Bart is correct; in general the LEAST_ATTACHED approach might not be
> optimal in the NUMA case. The thread <-> QP/CQ/CPU assignment is
> addressed by the multi-channel approach, which to my understanding won't
> be implemented in NFSoRDMA in the near future (right, Chuck?).
>
> However, the LEAST_ATTACHED vector hint will come up again in the
> future, as there is a need to spread applications across different
> interrupt vectors (especially for user space).
>
> CC'ing Matan, who is working on this; perhaps he can comment on this as
> well.
>
> Sagi.

Hi,

I'm not sure LEAST_ATTACHED is the best practice here. Applications might
want to create a CQ on n different cores, and you can't guarantee that
with a LEAST_ATTACHED policy. Anything smarter would probably require an
API change or some tricks. We might, for example, add an API like "give me
the least attached CQ vector which isn't in the following list
{a, b, c, ...}". Another option might be that when several CQs are
registered with LEAST_ATTACHED on the same PD, we try to give them
different vectors. Anyway, these are rough ideas; we should think about
this thoroughly before implementing.

Regards,
Matan
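
The exclusion-list variant Matan mentions could look roughly like the
following stand-alone model (the mask-based helper is a hypothetical
illustration, not a proposed verbs API):

#include <limits.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_COMP_VECTORS 8

static unsigned int cq_count[NUM_COMP_VECTORS];

/*
 * Hypothetical variant of the least-attached hint: skip vectors the caller
 * already holds (exclude_mask), so an application asking for n CQs ends up
 * on n different vectors.
 */
static int least_attached_excluding(uint32_t exclude_mask)
{
	unsigned int best_count = UINT_MAX;
	int best = -1, i;

	for (i = 0; i < NUM_COMP_VECTORS; i++) {
		if (exclude_mask & (1u << i))
			continue;
		if (cq_count[i] < best_count) {
			best_count = cq_count[i];
			best = i;
		}
	}
	if (best >= 0)
		cq_count[best]++;
	return best;	/* -1 if every vector was excluded */
}

int main(void)
{
	uint32_t mine = 0;
	int i, v;

	/* Ask for three CQs, each on a vector we do not already use. */
	for (i = 0; i < 3; i++) {
		v = least_attached_excluding(mine);
		mine |= 1u << v;
		printf("CQ %d -> comp_vector %d\n", i, v);
	}
	return 0;
}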
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-07 12:59 Or Gerlitz
From: Or Gerlitz @ 2014-12-07 12:59 UTC (permalink / raw)
To: Matan Barak, Sagi Grimberg, Shirley Ma, Bart Van Assche, linux-rdma, eli-VPRAkNaXOzVWk0Htik3J/w

On 12/7/2014 2:22 PM, Matan Barak wrote:
> Applications might want to create a CQ on n different cores

You mean like an IRQ can flush on a mask potentially made of multiple CPUs?

Or.
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-07 16:58 Matan Barak
From: Matan Barak @ 2014-12-07 16:58 UTC (permalink / raw)
To: Or Gerlitz, Sagi Grimberg, Shirley Ma, Bart Van Assche, linux-rdma, eli-VPRAkNaXOzVWk0Htik3J/w

On 12/7/2014 2:59 PM, Or Gerlitz wrote:
> On 12/7/2014 2:22 PM, Matan Barak wrote:
>> Applications might want to create a CQ on n different cores
>
> You mean like an IRQ can flush on a mask potentially made of multiple
> CPUs?

Sort of. In both cases you try to spread the resources such that you get
the best performance (and that should be done by the device driver
itself). The user needs to somehow get n different least-used resources.
Hopefully, if the device driver does a decent job, the user potentially
gets those resources on multiple CPUs.

Matan
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-09 20:11 Or Gerlitz
From: Or Gerlitz @ 2014-12-09 20:11 UTC (permalink / raw)
To: Matan Barak
Cc: Sagi Grimberg, Shirley Ma, Bart Van Assche, linux-rdma, Eli Cohen, Eyal Salomon

On Sun, Dec 7, 2014 at 6:58 PM, Matan Barak
<matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 12/7/2014 2:59 PM, Or Gerlitz wrote:
>> On 12/7/2014 2:22 PM, Matan Barak wrote:
>>> Applications might want to create a CQ on n different cores
>> You mean like an IRQ can flush on a mask potentially made of multiple
>> CPUs?
> Sort of. In both cases you try to spread the resources such that you get
> the best performance (and that should be done by the device driver
> itself). The user needs to somehow get n different least-used resources.
> Hopefully, if the device driver does a decent job, the user potentially
> gets those resources on multiple CPUs.

I am not sure I follow the "n different least-used resources" part.
Thinking about this a little further, what user-space (and maybe kernel,
too?) apps would want follows the rmap (reverse map where cpu --> set of
IRQs) used by the kernel aRFS logic, where we let the app choose some
primitive that causes the interrupt to be raised on the CPU it wants. In
this context, the parameter to the CQ creation verb need not be the vector
number, but rather the CPU number (maybe with a nice default of THIS_CPU)
or a set of CPUs?
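
A toy model of the reverse mapping Or describes, assuming the driver simply
spread vector IRQ affinities round-robin across CPUs (the names and layout
are illustrative only, not an existing verbs interface):

#include <stdio.h>

#define NUM_COMP_VECTORS 8
#define NUM_CPUS         16

/*
 * Rough model of a cpu --> completion-vector reverse map.  A CQ creation
 * verb that took a CPU (with a default of the calling CPU) could do this
 * lookup internally instead of exposing raw vector numbers to the caller.
 */
static int vector_for_cpu(int cpu)
{
	return cpu % NUM_COMP_VECTORS;
}

int main(void)
{
	int cpu;

	for (cpu = 0; cpu < NUM_CPUS; cpu += 5)
		printf("cpu %d -> comp_vector %d\n", cpu, vector_for_cpu(cpu));
	return 0;
}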
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-07 20:08 Chuck Lever
From: Chuck Lever @ 2014-12-07 20:08 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Shirley Ma, Bart Van Assche, linux-rdma, Or Gerlitz, eli-VPRAkNaXOzVWk0Htik3J/w, Matan Barak

On Dec 7, 2014, at 5:20 AM, Sagi Grimberg
<sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:

> On 12/4/2014 9:41 PM, Shirley Ma wrote:
>> On 12/04/2014 10:43 AM, Bart Van Assche wrote:
>>> On 12/04/14 17:47, Shirley Ma wrote:
>>>> What's the history of this patch?
>>>> http://lists.openfabrics.org/pipermail/general/2008-May/050813.html
>>>>
>>>> I am working on a multiple-QP workload, and I created a similar
>>>> approach with IB_CQ_VECTOR_LEAST_ATTACHED, which brings roughly a 17%
>>>> improvement in small I/O performance. I think this completion-vector
>>>> load balancing should be maintained in the provider, not the caller.
>>>> I didn't see this patch submitted to the mainline kernel; is there a
>>>> reason why not?
>>>
>>> My interpretation is that an approach similar to
>>> IB_CQ_VECTOR_LEAST_ATTACHED is useful on single-socket systems but
>>> suboptimal on multi-socket systems. Hence the code for associating CQ
>>> sets with CPU sockets in the SRP initiator. These changes have been
>>> queued for kernel 3.19. See also branch drivers-for-3.19 in git repo
>>> git://git.infradead.org/users/hch/scsi-queue.git.
>>
>> What I did was manually control the IRQ and the working thread on the
>> same socket. The CQ is created when mounting the file system in
>> NFS/RDMA, but the workload thread might start on a different socket, so
>> a per-CPU based implementation might not apply. I will look at the SRP
>> implementation.
>
> Hey Shirley,
>
> Bart is correct; in general the LEAST_ATTACHED approach might not be
> optimal in the NUMA case. The thread <-> QP/CQ/CPU assignment is
> addressed by the multi-channel approach, which to my understanding won't
> be implemented in NFSoRDMA in the near future (right, Chuck?).

As I understand it, the preference of the Linux NFS community is that
any multi-pathing solution should be transparent to the ULP (NFS and
RPC, in this case). mp-tcp is ideal in that the ULP is presented with
a single virtual transport instance, but under the covers, that instance
can be backed by multiple active paths.

Alternately, pNFS can be deployed. This allows a dataset to be striped
across multiple servers (and networks). There is a rather high bar to
entering this arena, however.

Speculating aloud, multiple QPs per transport instance may require
implementation changes on the server as well as the client. Any
interoperability dependencies should be documented via a standards
process.

And note that an RPC transport (at least in kernel) is shared across
many user applications and mount points. I find it difficult to visualize
an intuitive and comprehensive administrative interface where enough
guidance is provided to place a set of NFS applications and an RPC
transport in the same resource domain (maybe cgroups?).

So for the time being I prefer staying with a single QP per client-
server pair.

A large NFS client can actively use many NFS servers, however. Each
client-server pair would benefit from finding "least-used" resources
when QPs and CQs are created. That is something we can leverage today.

> However, the LEAST_ATTACHED vector hint will come up again in the
> future, as there is a need to spread applications across different
> interrupt vectors (especially for user space).
>
> CC'ing Matan, who is working on this; perhaps he can comment on this as
> well.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-08 0:46 Shirley Ma
From: Shirley Ma @ 2014-12-08 0:46 UTC (permalink / raw)
To: Chuck Lever, Sagi Grimberg
Cc: Bart Van Assche, linux-rdma, Or Gerlitz, eli-VPRAkNaXOzVWk0Htik3J/w, Matan Barak

On 12/07/2014 12:08 PM, Chuck Lever wrote:
> On Dec 7, 2014, at 5:20 AM, Sagi Grimberg
> <sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>> On 12/4/2014 9:41 PM, Shirley Ma wrote:
>>> On 12/04/2014 10:43 AM, Bart Van Assche wrote:
>>>> On 12/04/14 17:47, Shirley Ma wrote:
>>>>> What's the history of this patch?
>>>>> http://lists.openfabrics.org/pipermail/general/2008-May/050813.html
>>>>>
>>>>> I am working on a multiple-QP workload, and I created a similar
>>>>> approach with IB_CQ_VECTOR_LEAST_ATTACHED, which brings roughly a
>>>>> 17% improvement in small I/O performance. I think this
>>>>> completion-vector load balancing should be maintained in the
>>>>> provider, not the caller. I didn't see this patch submitted to the
>>>>> mainline kernel; is there a reason why not?
>>>>
>>>> My interpretation is that an approach similar to
>>>> IB_CQ_VECTOR_LEAST_ATTACHED is useful on single-socket systems but
>>>> suboptimal on multi-socket systems. Hence the code for associating
>>>> CQ sets with CPU sockets in the SRP initiator. These changes have
>>>> been queued for kernel 3.19. See also branch drivers-for-3.19 in git
>>>> repo git://git.infradead.org/users/hch/scsi-queue.git.
>>>
>>> What I did was manually control the IRQ and the working thread on the
>>> same socket. The CQ is created when mounting the file system in
>>> NFS/RDMA, but the workload thread might start on a different socket,
>>> so a per-CPU based implementation might not apply. I will look at the
>>> SRP implementation.
>>
>> Hey Shirley,
>>
>> Bart is correct; in general the LEAST_ATTACHED approach might not be
>> optimal in the NUMA case. The thread <-> QP/CQ/CPU assignment is
>> addressed by the multi-channel approach, which to my understanding
>> won't be implemented in NFSoRDMA in the near future (right, Chuck?).
>
> As I understand it, the preference of the Linux NFS community is that
> any multi-pathing solution should be transparent to the ULP (NFS and
> RPC, in this case). mp-tcp is ideal in that the ULP is presented with
> a single virtual transport instance, but under the covers, that
> instance can be backed by multiple active paths.
>
> Alternately, pNFS can be deployed. This allows a dataset to be striped
> across multiple servers (and networks). There is a rather high bar to
> entering this arena, however.
>
> Speculating aloud, multiple QPs per transport instance may require
> implementation changes on the server as well as the client. Any
> interoperability dependencies should be documented via a standards
> process.
>
> And note that an RPC transport (at least in kernel) is shared across
> many user applications and mount points. I find it difficult to
> visualize an intuitive and comprehensive administrative interface where
> enough guidance is provided to place a set of NFS applications and an
> RPC transport in the same resource domain (maybe cgroups?).
>
> So for the time being I prefer staying with a single QP per client-
> server pair.
>
> A large NFS client can actively use many NFS servers, however. Each
> client-server pair would benefit from finding "least-used" resources
> when QPs and CQs are created. That is something we can leverage today.

Yes, that's something I am evaluating now for one NFS client talking to
different destination servers. I can see a more than 15% bandwidth
increase when simulating multiple server mount points through different
IPoIB child interfaces and changing the create_cq completion vector from
0 to "least-used", so that completion vectors are balanced among the QPs.

>> However, the LEAST_ATTACHED vector hint will come up again in the
>> future, as there is a need to spread applications across different
>> interrupt vectors (especially for user space).
>>
>> CC'ing Matan, who is working on this; perhaps he can comment on this
>> as well.
* Re: IB_CQ_VECTOR_LEAST_ATTACHED
@ 2014-12-09 11:29 Sagi Grimberg
From: Sagi Grimberg @ 2014-12-09 11:29 UTC (permalink / raw)
To: Chuck Lever
Cc: Shirley Ma, Bart Van Assche, linux-rdma, Or Gerlitz, eli-VPRAkNaXOzVWk0Htik3J/w, Matan Barak

On 12/7/2014 10:08 PM, Chuck Lever wrote:
> On Dec 7, 2014, at 5:20 AM, Sagi Grimberg
> <sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>> On 12/4/2014 9:41 PM, Shirley Ma wrote:
>>> On 12/04/2014 10:43 AM, Bart Van Assche wrote:
>>>> On 12/04/14 17:47, Shirley Ma wrote:
>>>>> What's the history of this patch?
>>>>> http://lists.openfabrics.org/pipermail/general/2008-May/050813.html
>>>>>
>>>>> I am working on a multiple-QP workload, and I created a similar
>>>>> approach with IB_CQ_VECTOR_LEAST_ATTACHED, which brings roughly a
>>>>> 17% improvement in small I/O performance. I think this
>>>>> completion-vector load balancing should be maintained in the
>>>>> provider, not the caller. I didn't see this patch submitted to the
>>>>> mainline kernel; is there a reason why not?
>>>>
>>>> My interpretation is that an approach similar to
>>>> IB_CQ_VECTOR_LEAST_ATTACHED is useful on single-socket systems but
>>>> suboptimal on multi-socket systems. Hence the code for associating
>>>> CQ sets with CPU sockets in the SRP initiator. These changes have
>>>> been queued for kernel 3.19. See also branch drivers-for-3.19 in git
>>>> repo git://git.infradead.org/users/hch/scsi-queue.git.
>>>
>>> What I did was manually control the IRQ and the working thread on the
>>> same socket. The CQ is created when mounting the file system in
>>> NFS/RDMA, but the workload thread might start on a different socket,
>>> so a per-CPU based implementation might not apply. I will look at the
>>> SRP implementation.
>>
>> Hey Shirley,
>>
>> Bart is correct; in general the LEAST_ATTACHED approach might not be
>> optimal in the NUMA case. The thread <-> QP/CQ/CPU assignment is
>> addressed by the multi-channel approach, which to my understanding
>> won't be implemented in NFSoRDMA in the near future (right, Chuck?).
>
> As I understand it, the preference of the Linux NFS community is that
> any multi-pathing solution should be transparent to the ULP (NFS and
> RPC, in this case).

Agree.

> mp-tcp is ideal in that the ULP is presented with a single virtual
> transport instance, but under the covers, that instance can be backed
> by multiple active paths.
>
> Alternately, pNFS can be deployed. This allows a dataset to be striped
> across multiple servers (and networks). There is a rather high bar to
> entering this arena, however.
>
> Speculating aloud, multiple QPs per transport instance may require
> implementation changes on the server as well as the client. Any
> interoperability dependencies should be documented via a standards
> process.

Correct, this obviously needs negotiation, but that is specific to the
NFSoRDMA standard.

> And note that an RPC transport (at least in kernel) is shared across
> many user applications and mount points. I find it difficult to
> visualize an intuitive and comprehensive administrative interface where
> enough guidance is provided to place a set of NFS applications and an
> RPC transport in the same resource domain (maybe cgroups?).

This is why a multi-channel approach will solve the problem. Each I/O
operation selects a channel by best fit (for example, the running CPU
id). This gives a *very* high gain and can possibly max out HW
performance even over a single mount. Having said that, I think this
discussion is ahead of its time...

> So for the time being I prefer staying with a single QP per client-
> server pair.
>
> A large NFS client can actively use many NFS servers, however. Each
> client-server pair would benefit from finding "least-used" resources
> when QPs and CQs are created. That is something we can leverage today.

I agree that for the current state, least-used can give some benefit by
separating interrupt vectors for each client-server pair.

Sagi.
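
A stand-alone model of the per-CPU channel selection Sagi describes,
assuming channels are pinned to contiguous CPU groups (the constants and
function below are hypothetical, not NFS or SRP code):

#include <stdio.h>

#define NUM_CHANNELS 4

/*
 * Model of multi-channel selection: each channel is a QP/CQ pair pinned to
 * a group of CPUs, and an I/O issued on CPU c uses the channel covering c,
 * so submission and completion processing stay local.
 */
static int channel_for_cpu(int cpu, int num_cpus)
{
	int cpus_per_channel = num_cpus / NUM_CHANNELS;

	if (cpus_per_channel == 0)
		cpus_per_channel = 1;
	return (cpu / cpus_per_channel) % NUM_CHANNELS;
}

int main(void)
{
	int cpu, num_cpus = 16;

	for (cpu = 0; cpu < num_cpus; cpu += 3)
		printf("cpu %d -> channel %d\n",
		       cpu, channel_for_cpu(cpu, num_cpus));
	return 0;
}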
Thread overview: 11+ messages

2014-12-04 16:47 IB_CQ_VECTOR_LEAST_ATTACHED Shirley Ma
2014-12-04 18:43 ` IB_CQ_VECTOR_LEAST_ATTACHED Bart Van Assche
2014-12-04 19:41   ` IB_CQ_VECTOR_LEAST_ATTACHED Shirley Ma
2014-12-07 10:20     ` IB_CQ_VECTOR_LEAST_ATTACHED Sagi Grimberg
2014-12-07 12:22       ` IB_CQ_VECTOR_LEAST_ATTACHED Matan Barak
2014-12-07 12:59         ` IB_CQ_VECTOR_LEAST_ATTACHED Or Gerlitz
2014-12-07 16:58           ` IB_CQ_VECTOR_LEAST_ATTACHED Matan Barak
2014-12-09 20:11             ` IB_CQ_VECTOR_LEAST_ATTACHED Or Gerlitz
2014-12-07 20:08       ` IB_CQ_VECTOR_LEAST_ATTACHED Chuck Lever
2014-12-08  0:46         ` IB_CQ_VECTOR_LEAST_ATTACHED Shirley Ma
2014-12-09 11:29         ` IB_CQ_VECTOR_LEAST_ATTACHED Sagi Grimberg