From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marcelo Ricardo Leitner Subject: Re: [RFC PATCH 0/2] sctp: add new getsockopt option SCTP_SOCKOPT_PEELOFF_KERNEL Date: Wed, 17 Jun 2015 10:40:26 -0300 Message-ID: <558178CA.3010806@gmail.com> References: <20150617102119.GA24677@hmsreliant.think-freely.org> <55815C22.2000605@gmail.com> <20150617122004.GC24677@hmsreliant.think-freely.org> <55816AC0.2070508@gmail.com> <20150617131658.GD24677@hmsreliant.think-freely.org> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: netdev@vger.kernel.org, Vlad Yasevich , linux-sctp@vger.kernel.org To: Neil Horman Return-path: Received: from mail-qc0-f174.google.com ([209.85.216.174]:33742 "EHLO mail-qc0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753582AbbFQNkc (ORCPT ); Wed, 17 Jun 2015 09:40:32 -0400 In-Reply-To: <20150617131658.GD24677@hmsreliant.think-freely.org> Sender: netdev-owner@vger.kernel.org List-ID: On 17-06-2015 10:16, Neil Horman wrote: > On Wed, Jun 17, 2015 at 09:40:32AM -0300, Marcelo Ricardo Leitner wrote: >> On 17-06-2015 09:20, Neil Horman wrote: >>> On Wed, Jun 17, 2015 at 08:38:10AM -0300, Marcelo Ricardo Leitner wrote: >>>> On 17-06-2015 07:21, Neil Horman wrote: >>>>> On Tue, Jun 16, 2015 at 07:42:31PM -0300, Marcelo Ricardo Leitner wrote: >>>>>> Hi, >>>>>> >>>>>> I'm trying to remove a direct dependency of dlm module on sctp one. >>>>>> Currently dlm code is calling sctp_do_peeloff() directly and only this >>>>>> call is causing the load of sctp module together with dlm. For that, we >>>>>> have basically 3 options: >>>>>> - Doing a module split on dlm >>>>>> - which I'm avoiding because it was already split and was merged (more >>>>>> info on patch2 changelog) >>>>>> - and the sctp code on it is rather small if compared with sctp module >>>>>> itself >>>>>> - Using some other infra that gets indirectly activated, like getsockopt() >>>>>> - It was like this before, but the exposed sockopt created a file >>>>>> descriptor for the new socket and that create some serious issues. >>>>>> More info on 2f2d76cc3e93 ("dlm: Do not allocate a fd for peeloff") >>>>>> - Doing something like ipv6_stub (which is used by vxlan) or similar >>>>>> - but I don't feel that's a good way out here, it doesn't feel right. >>>>>> >>>>>> So I'm approaching this by going with 2nd option again but this time >>>>>> also creating a new sockopt that is only accessible for kernel users of >>>>>> this protocol, so that we are safe to directly return a struct socket * >>>>>> via getsockopt() results. This is the tricky part of it of this series. >>>>>> >>>>>> It smells hacky yes but currently most of sctp calls are wrapped behind >>>>>> kernel_*(). Even if we set a flag (like netlink does) saying that this >>>>>> is a kernel socket, we still have the issue of getting the function call >>>>>> through and returning such non-usual return value. >>>>>> >>>>>> I kept __user marker on sctp_getsockopt_peeloff_kernel() prototype and >>>>>> its helpers just to avoid issues with static checkers. >>>>>> >>>>>> Kernel path not really tested yet.. mainly willing to know what do you >>>>>> think, is this feasible? getsockopt option only reachable by kernel >>>>>> itself? Couldn't find any other like this. >>>>>> >>>>>> Thanks, >>>>>> Marcelo >>>>>> >>>>>> Marcelo Ricardo Leitner (2): >>>>>> sctp: add new getsockopt option SCTP_SOCKOPT_PEELOFF_KERNEL >>>>>> dlm: avoid using sctp_do_peeloff directly >>>>>> >>>>>> fs/dlm/lowcomms.c | 17 ++++++++--------- >>>>>> include/uapi/linux/sctp.h | 12 ++++++++++++ >>>>>> net/sctp/socket.c | 39 +++++++++++++++++++++++++++++++++++++++ >>>>>> 3 files changed, 59 insertions(+), 9 deletions(-) >>>>>> >>>>>> -- >>>>>> 2.4.1 >>>>>> >>>>>> >>>>> >>>>> Why not just use the existing PEELOFF socket option with the kernel_getsockopt >>>>> interface, and sockfd_lookup to translate the returned value back to a socket >>>>> struct? That seems less redundant and less hack-ish to me. >>>> >>>> It was like that before commit 2f2d76cc3e93 ("dlm: Do not allocate a fd for >>>> peeloff"), but it caused serious issues due to the fd allocation, so that's >>>> what I'm willing to avoid now. >>>> >>>> References: >>>> http://article.gmane.org/gmane.linux.network.drbd/22529 >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1075629 (this one is closed, >>>> sorry) >>>> >>>> Marcelo >>>> >>> Ah, I see. You're using the new socket option as a differentiator to just skip >>> the creation of an FD. >> >> Exactly. >> >>> I get your reasoning, but I'm still not in love with the idea of duplicating >>> code paths to avoid that action. Can we use some data inside the socket >>> structure to do this differentiation? Specifically here I'm thinking of >>> sock->file. IIRC that will be non-null for any sockets created in user space, >> >> I had thought about using some socket flags like netlink does but couldn't >> get around with that. Hadn't thought about sock->file though, nice idea. >> >>> but will always be NULL for dlm created sockets (since we use sock_create >>> directly to create them. If that is a sufficient differentiator, then we can >>> just optionally allocate the new socket fd for the peeled off socket, iff the >>> parent sock->file pointer is non-null. >>> >>> Thoughts? >>> Neil >> >> We can re-use the current code path, by either checking it via sock->file or >> via get_fs(). That will require us to change the option arg format so we >> keep it nice and clean but as it would be kernel-side only, it should be ok >> right? It currently is: >> >> typedef struct { >> sctp_assoc_t associd; >> int sd; >> } sctp_peeloff_arg_t; >> >> And we would have to fit a pointer in there, something like: >> typedef union { >> struct { >> sctp_assoc_t associd; >> int sd; >> }; >> void *sock; >> } sctp_peeloff_arg_t; >> >> Sounds good? >> > Yes, sounds reasonable. > > Thanks! > Neil Cool, thanks Neil. I'll rework these now but will post the new version probably by next week only, as we can get dlm properly tested too. Cheers, Marcelo