From mboxrd@z Thu Jan 1 00:00:00 1970
From: 'Christoph Hellwig'
Subject: Re: remove kernel_setsockopt and kernel_getsockopt v2
Date: Thu, 21 May 2020 11:11:50 +0200
Message-ID: <20200521091150.GA8401@lst.de>
References: <20200520195509.2215098-1-hch@lst.de> <138a17dfff244c089b95f129e4ea2f66@AcuMS.aculab.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <138a17dfff244c089b95f129e4ea2f66-1XygrNkDbNvwg4NCKwmqgw@public.gmane.org>
Sender: drbd-dev-bounces-cunTk1MwBs8qoQakbn7OcQ@public.gmane.org
Errors-To: drbd-dev-bounces-cunTk1MwBs8qoQakbn7OcQ@public.gmane.org
To: David Laight
Cc: Marcelo Ricardo Leitner , Eric Dumazet ,
 "linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org" ,
 "linux-sctp-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" ,
 "target-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" ,
 "linux-afs-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org" ,
 "drbd-dev-cunTk1MwBs8qoQakbn7OcQ@public.gmane.org" ,
 "linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" ,
 "rds-devel-N0ozoZBvEnrZJqsBc5GL+g@public.gmane.org" ,
 "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" ,
 'Christoph Hellwig' ,
 "cluster-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org" ,
 Alexey Kuznetsov , Jakub Kicinski ,
 "ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" ,
 "linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" ,
 Neil Horman
List-Id: ceph-devel.vger.kernel.org

On Thu, May 21, 2020 at 08:01:33AM +0000, David Laight wrote:
> How much does this increase the kernel code by?

 44 files changed, 660 insertions(+), 843 deletions(-)

> You are also replicating a lot of code making it more
> difficult to maintain.

No, I specifically don't.

> I don't think the performance of a socket option code
> really matters - it is usually done once when a socket
> is initialised and the other costs of establishing a
> connection will dominate.
>
> Pulling the user copies outside the [gs]etsockopt switch
> statement not only reduces the code size (source and object)
> but also trivially allows kernel_[sg]etsockopt() to be added to
> the list of socket calls.
>
> It probably isn't possible to pull the usercopies right
> out into the syscall wrapper because of some broken
> requests.

Please read through the previous discussion of the rationale and the
options.  We've been there before.

> I worried about whether getsockopt() should read the entire
> user buffer first. SCTP needs some of it often (including a
> sockaddr_storage in one case), TCP needs it once.
> However the cost of reading a few words is small, and a big
> buffer probably needs setting to avoid leaking kernel
> memory if the structure has holes or fields that don't get set.
> Reading from userspace solves both issues.

As mentioned in the thread on the last series:  That was my first idea,
but we have way too many sockopts, especially in obscure protocols, that
just hard code the size.  The chance of breaking userspace in a way that
can't be fixed without going back to passing user pointers to
get/setsockopt is way too high to commit to such a change, unfortunately.
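
The compatibility concern in the last paragraph can be illustrated with a minimal userspace sketch. This is not kernel code and not taken from the patch series: `legacy_set_flag`, `generic_set_flag` and `sockopt_demo` are hypothetical names, and plain `memcpy` stands in for a user copy. It assumes a handler that ignores `optlen` and always reads `sizeof(int)` bytes, which is what "hard code the size" is taken to mean here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical protocol handler: ignores optlen and always reads
 * sizeof(int) bytes straight from the caller's buffer. */
int legacy_set_flag(const void *optval, size_t optlen, int *flag)
{
	(void)optlen;			/* the size is hard-coded below */
	memcpy(flag, optval, sizeof(*flag));
	return 0;
}

/* Hypothetical generic entry point: copies only optlen bytes up front
 * and hands the handler that bounded, zero-padded copy. */
int generic_set_flag(const void *optval, size_t optlen, int *flag)
{
	unsigned char kbuf[sizeof(int)] = { 0 };

	if (optlen > sizeof(kbuf))
		optlen = sizeof(kbuf);
	memcpy(kbuf, optval, optlen);	/* only optlen bytes are copied in */
	return legacy_set_flag(kbuf, optlen, flag);
}

/* A caller that passes a valid pointer but a bogus optlen of 0 works
 * with the legacy handler and silently changes behaviour once the copy
 * is done generically. */
void sockopt_demo(void)
{
	int one = 1, legacy = 0, generic = 0;

	legacy_set_flag(&one, 0, &legacy);
	generic_set_flag(&one, 0, &generic);
	assert(legacy == 1);	/* handler read the value despite optlen == 0 */
	assert(generic == 0);	/* up-front copy saw zero bytes */
}
```

With a correct `optlen` of `sizeof(int)` both paths agree; the difference only shows up for callers that relied on the handler ignoring the length, and those are the callers a generic copy would break.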