From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sagi Grimberg
Subject: Re: [PATCH v3 11/13] IB/srp: Make HCA completion vector configurable
Date: Tue, 16 Jul 2013 13:11:34 +0300
Message-ID: <51E51C56.50906@mellanox.com>
References: <51D41C03.4020607@acm.org> <51D41FFC.6070105@acm.org> <51E272A4.5030707@mellanox.com> <51E3D79D.9070808@acm.org> <51E3F931.9080903@mellanox.com> <51E43E22.2060502@acm.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <51E43E22.2060502-HInyCGIudOg@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Bart Van Assche
Cc: Roland Dreier, David Dillow, Vu Pham, Sebastian Riemer, Jinpu Wang, linux-rdma
List-Id: linux-rdma@vger.kernel.org

On 7/15/2013 9:23 PM, Bart Van Assche wrote:
> On 15/07/2013 7:29, Sagi Grimberg wrote:
>> srp_daemon is a package designated for the customer to automatically
>> detect targets in the IB fabric. From our experience here in Mellanox,
>> customers/users like automatic "plug&play" tools.
>> They are reluctant to build their own scriptology to enhance performance
>> and settle with srp_daemon which is preferred over use of ibsrpdm and
>> manual adding new targets.
>> Regardless, the completion vectors assignment is meaningless without
>> setting proper IRQ affinity, so in the worst case where the user didn't
>> set his IRQ affinity,
>> this assignment will perform like the default completion vector
>> assignment as all IRQs are directed without any masking i.e. core 0.
>>
>> From my experiments in NUMA systems, optimal performance is gained
>> where all IRQs are directed to half of the cores on the NUMA node close
>> to the HCA, and all traffic generators share the other half of the cores
>> on the same NUMA node.
>> So based on that knowledge, I thought that
>> srp_daemon/srp driver will assign it's CQs across the HCAs completion
>> vectors, and the user is encouraged to set the IRQ affinity as described
>> above to gain optimal performance.
>> Adding connections over the far NUMA node don't seem to benefit
>> performance too much...
>>
>> As I mentioned, a use-case I see that may raise a problem here, is if
>> the user would like to maintain multiple SRP connections and reserve
>> some completion vectors for other IB applications on the system.
>> in this case the user will be able to disable srp_daemon/srp driver
>> completion vectors assignment.
>>
>> So, this was just an idea, and easy implementation that would
>> potentially give the user semi-automatic performance optimized
>> configuration...
>
> Hello Sagi,
>
> I agree with you that it would help a lot if completion vector
> assignment could be automated such that end users do not have to care
> about assigning completion vector numbers. The challenge is to find an
> approach that is general enough such that it works for all possible
> use cases. One possible approach is to let a tool that has knowledge
> about the application fill in completion vector numbers in
> srp_daemon.conf and let srp_daemon use the values generated by this
> tool. That approach would avoid that srp_daemon has to have any
> knowledge about the application but would still allow srp_daemon to
> assign the completion vector numbers.
>
> Bart.

Hey Bart,

This sounds like a nice idea, but there is an inherent problem: applications
come and go while the connections are (somewhat) static. How can you control
the pinning of an arbitrary application running (over SRP devices, of course)
at a certain point in time?

So will you agree at least to give target->comp_vector a default of
IB_CQ_VECTOR_LEAST_ATTACHED?
From my point of view, for a user who doesn't have the slightest clue about
completion vectors and performance optimization, this is somewhat better
than doing nothing...

-Sagi
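
To make the proposed default a bit more concrete, here is a minimal
user-space sketch in C of what a "least attached" policy could look like:
pick the completion vector that currently has the fewest CQs attached.
The type struct vec_table and the functions least_attached_vector() and
attach_cq() are hypothetical names made up for illustration. This is not
ib_srp or HCA driver code, and how a driver would actually resolve
IB_CQ_VECTOR_LEAST_ATTACHED is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical bookkeeping for one HCA: attached[i] is the number of CQs
 * currently bound to completion vector i. Not a kernel data structure.
 */
struct vec_table {
	size_t nvec;            /* number of completion vectors */
	unsigned int *attached; /* per-vector count of attached CQs */
};

/*
 * Return the index of the completion vector with the fewest attached CQs.
 * Ties are resolved in favor of the lowest index.
 */
static size_t least_attached_vector(const struct vec_table *t)
{
	size_t best = 0;

	assert(t != NULL && t->nvec > 0);
	for (size_t i = 1; i < t->nvec; i++) {
		if (t->attached[i] < t->attached[best])
			best = i;
	}
	return best;
}

/*
 * Simulate creating one more CQ: choose the least loaded vector,
 * account for the new CQ on it, and return the chosen index.
 */
static size_t attach_cq(struct vec_table *t)
{
	size_t v = least_attached_vector(t);

	t->attached[v]++;
	return v;
}
```

With such a policy, successive connections spread across the vectors
without the user picking numbers by hand. As noted in the quoted text,
this only pays off once IRQ affinity for those vectors is set sensibly.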