From mboxrd@z Thu Jan 1 00:00:00 1970
From: Doug Ledford
Subject: Re: [PATCH rdma-next] Revert "IB/core: Add flow control to the portmapper netlink calls"
Date: Mon, 05 Jun 2017 10:27:47 -0400
Message-ID: <1496672867.7171.146.camel@redhat.com>
References: <20170529082423.1180-1-leon@kernel.org>
 <20170530212431.GA21008@ssaleem-MOBL4.amr.corp.intel.com>
 <20170531040437.GE5406@mtr-leonro.local>
 <20170531174245.GA16304@ssaleem-MOBL4.amr.corp.intel.com>
 <1496261429.2608.15.camel@sandisk.com>
 <20170602162849.GA28660@ssaleem-MOBL4.amr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Return-path:
In-Reply-To: <20170602162849.GA28660-GOXS9JX10wfOxmVO0tvppfooFf0ArEBIu+b9c/7xato@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Shiraz Saleem, Bart Van Assche
Cc: "leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org",
 "Latif, Faisal",
 "leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org",
 "Ismail, Mustafa",
 "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org",
 "swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org"
List-Id: linux-rdma@vger.kernel.org

On Fri, 2017-06-02 at 11:28 -0500, Shiraz Saleem wrote:
> On Wed, May 31, 2017 at 02:10:31PM -0600, Bart Van Assche wrote:
> >
> > On Wed, 2017-05-31 at 12:42 -0500, Shiraz Saleem wrote:
> > >
> > > >
> > > > 5. I proposed a solution -> go and fix your user space program.
> > >
> > > This is a kernel patch you are trying to revert, you are breaking
> > > existing kernel functionality.  Nothing to do with user space.
> > >
> > > Bottom line, come up with a solution that will address both port
> > > mapper functionality and your issue.
> >
> > Hello Shiraz,
> >
> > Sorry that this means additional work for you, but I agree with
> > Leon that user space software should not assume that netlink
> > sockets are a reliable communication mechanism.
>
> Hi Bart - Thank you for your response.
>
> The original problem was that ibnl_unicast, which is used to send nl
> messages from portmapper kernel space to user-space, would
> occasionally and momentarily fail under stress.
> We could have retried the call for a certain amount of time, but
> since netlink_unicast has a nonblock/block parameter, we chose to use
> the blocking option with a timeout. So we thought we did account for
> deadlocks with this timeout.

Just to be clear, replacing a non-blocking and occasionally dropping
function with a blocking function but with a timeout does not actually
solve the problem, it merely moves the goal post out.  It is entirely
possible that you will have the problem again given sufficient load.

-- 
Doug Ledford
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
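
To illustrate the point about the timeout only moving the goal post, here is a
rough sketch of a toy model. It is not the kernel code and does not call
netlink at all: the bounded buffer, the arrival and drain rates, and the
function name simulate_sender are all assumptions made up for the
illustration. A timeout of 0 stands in for the non-blocking send, a positive
timeout for the blocking send that gives up after that many ticks.

```python
from collections import deque


def simulate_sender(ticks, arrivals_per_tick, drain_per_tick, capacity, timeout):
    """Toy model of a sender feeding a bounded receive buffer.

    Hypothetical model, not kernel behaviour: each tick the receiver drains
    up to drain_per_tick messages, then arrivals_per_tick new messages show
    up. A message that finds the buffer full waits for space and is dropped
    once it has waited `timeout` ticks (timeout=0 models a non-blocking send).

    Returns (dropped, first_drop_tick); first_drop_tick is None if nothing
    was dropped.
    """
    buffered = 0            # messages currently sitting in the buffer
    waiting = deque()       # arrival ticks of messages blocked on a full buffer
    dropped = 0
    first_drop_tick = None

    for tick in range(ticks):
        # Receiver makes progress first.
        buffered = max(0, buffered - drain_per_tick)

        # New messages arrive and queue up behind any that are already blocked.
        for _ in range(arrivals_per_tick):
            waiting.append(tick)

        # Oldest blocked messages get into the buffer while there is room.
        while waiting and buffered < capacity:
            waiting.popleft()
            buffered += 1

        # Anything that has now waited for the full timeout gives up.
        while waiting and tick - waiting[0] >= timeout:
            waiting.popleft()
            dropped += 1
            if first_drop_tick is None:
                first_drop_tick = tick

    return dropped, first_drop_tick


if __name__ == "__main__":
    # Sustained overload: 3 messages arrive per tick, only 2 are drained.
    for timeout in (0, 5, 20):
        print(timeout, simulate_sender(200, 3, 2, 10, timeout))
```

Under sustained overload in this model, a larger timeout delays the first
failure but never prevents it; only when the receiver keeps up on average do
the drops go away, and in that case the non-blocking variant does not drop
either.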