From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
Subject: Re: [RFC] API to modify /proc/sys/net/ipv4/ip_local_reserved_ports
Date: Tue, 10 Apr 2012 15:13:42 -0700
Message-ID:
References: <4F5BE563.9050506@gmx.de> <4F5FAF28.5030205@gmx.de> <4F611835.4080904@gmx.de> <4F7CADE8.3060205@gmx.de> <1333960981.414.24.camel@cr0> <4F84A06F.3090808@gmx.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Cong Wang, Octavian Purdila, netdev@vger.kernel.org, David Miller, Andrew Morton, Frank Danapfel, Laszlo Ersek, shemminger@vyatta.com
To: Helge Deller
Return-path:
Received: from out08.mta.xmission.com ([166.70.13.238]:49058 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-FAIL)
	by vger.kernel.org with ESMTP id S1755615Ab2DJWJv (ORCPT ); Tue, 10 Apr 2012 18:09:51 -0400
In-Reply-To: <4F84A06F.3090808@gmx.de> (Helge Deller's message of "Tue, 10 Apr 2012 23:04:47 +0200")
Sender: netdev-owner@vger.kernel.org
List-ID:

Helge Deller writes:

> On 04/09/2012 10:43 AM, Cong Wang wrote:
>> On Wed, 2012-04-04 at 22:24 +0200, Helge Deller wrote:
>>> I would like to follow up on my last patch series to be able to modify
>>> the contents of the /proc/sys/net/ipv4/ip_local_reserved_ports port list
>>> from userspace.
>>>
>>> My last patch (https://lkml.org/lkml/2012/3/10/187) was based on
>>> modifications to the proc interface, which - based on the feedback here
>>> on the list - seemed to not be the right way to go (although I personally
>>> still like the idea very much :-)).
>>>
>>> Anyway, with this RFC I would like to get feedback about a new proposed
>>> API and attached kernel patch.
>>>
>>> The idea is to introduce a new value for get/setsockopt()
>>> named SO_RESERVED_PORTS to get/set the ip_local_reserved_ports
>>> bitmap via standard get/setsockopt() syscalls.
>>> As far as I understand this seems to be similar to how iptables works.
>>>
>>> An untested kernel patch for review and feedback is attached below.
>>>
>>> In userspace it then would be possible to write a new tool or to extend
>>> for example the "ip" tool to accept commands like:
>>> $> ip reserved_ports add 100-2000
>>> $> ip reserved_ports remove 50-60
>>> $> ip reserved_ports list (to show current reserved port list)
>>>
>>> This userspace tool could then read the port bitmap from kernel via
>>> a) socket(PF_INET, SOCK_RAW, IPPROTO_RAW)
>>> b) getsockopt(3, SOL_SOCKET, SO_RESERVED_PORTS,)
>>> and write back the results after modification via
>>> c) setsockopt(3, SOL_SOCKET, SO_RESERVED_PORTS,)
>>>
>>> Would that be an acceptable solution?
>> Hmm, it is indeed that bitmap fits for syscall rather than /proc file.
>>
>> But it seems that using getsockopt()/setsockopt() makes it like it is a
>> per-socket setting, actually it is a system-wide setting.
> Yes, that's the reason why I used SOL_SOCKET which configures at least
> a few system-wide settings too.
>
>> So I am
>> wondering if exporting a binary /proc file for this is a better
>> solution.
> Yeah - that's another solution, but (65536 ports)/(8 bits per byte) = 8 KByte,
> so we may again hit the 4k limit of /proc (unless you do binary reads which
> should be done with a binary /proc-entry anyway).
>
> Again, I'm open to develop any kind of solution which would get an OK
> here.

Just looking at proc_do_large_bitmap, it does appear that there is a very
local 4k limit on writes.

Can you please just modify proc_do_large_bitmap so that there is not a
4k limit on writes?

Ideally the code would just read another 4k from userspace when it is
getting close to the end of its 4k buffer, or perhaps we just read
everything directly from userspace and run slowly.

The bitmap is installed atomically at the end, so any weird partial
states should not be a problem.

Eric
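
To make the chunked-read idea concrete, here is a rough userspace sketch in C. It is not the kernel's proc_do_large_bitmap and makes no claim about how that function is written. The names range_parser, parser_feed, parser_finish, parse_in_chunks and port_is_set are hypothetical, the byte-wise bit layout is an assumption, and the final memcpy() only stands in for whatever atomic install the kernel performs. The point it illustrates is that parser state can be carried across 4k reads, so a range such as "100-2000" may straddle a chunk boundary, and that the live bitmap is only touched once the whole input has parsed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_PORTS    65536
#define BITMAP_BYTES (NUM_PORTS / 8)  /* 8192 bytes, i.e. two 4k buffers */

/* Hypothetical parser state; it survives across chunk boundaries. */
struct range_parser {
    uint8_t scratch[BITMAP_BYTES];  /* bitmap being built, not yet live */
    unsigned long cur;              /* number currently being read */
    unsigned long first;            /* range start, valid when in_range */
    bool have_digit;                /* cur holds at least one digit */
    bool in_range;                  /* a '-' has been seen in this token */
};

static void parser_init(struct range_parser *p)
{
    memset(p, 0, sizeof(*p));
}

/* Close the current token ("N" or "N-M") and set its bits in scratch. */
static int parser_emit(struct range_parser *p)
{
    unsigned long lo, hi, port;

    if (!p->have_digit)
        return p->in_range ? -1 : 0;  /* "5-" is an error, "" is fine */
    lo = p->in_range ? p->first : p->cur;
    hi = p->cur;
    if (lo > hi)
        return -1;
    for (port = lo; port <= hi; port++)
        p->scratch[port / 8] |= (uint8_t)(1u << (port % 8));
    p->cur = 0;
    p->have_digit = false;
    p->in_range = false;
    return 0;
}

/* Consume one chunk; a token may start in one chunk and end in the next. */
static int parser_feed(struct range_parser *p, const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        char c = buf[i];

        if (c >= '0' && c <= '9') {
            p->cur = p->cur * 10 + (unsigned long)(c - '0');
            if (p->cur >= NUM_PORTS)
                return -1;            /* port number out of range */
            p->have_digit = true;
        } else if (c == '-') {
            if (!p->have_digit || p->in_range)
                return -1;            /* "-5" or "1-2-3" */
            p->first = p->cur;
            p->cur = 0;
            p->have_digit = false;
            p->in_range = true;
        } else if (c == ',' || c == '\n') {
            if (parser_emit(p) != 0)
                return -1;
        } else {
            return -1;                /* unexpected character */
        }
    }
    return 0;
}

/* Flush the last token, then install the result in one step.
 * memcpy() is only a stand-in for an atomic install. */
static int parser_finish(struct range_parser *p, uint8_t *live)
{
    if (parser_emit(p) != 0)
        return -1;
    memcpy(live, p->scratch, BITMAP_BYTES);
    return 0;
}

/* Feed text to the parser chunk bytes at a time (4096 in the real case). */
static int parse_in_chunks(const char *text, size_t len, size_t chunk,
                           uint8_t *live)
{
    struct range_parser p;
    size_t off;

    assert(chunk > 0);
    parser_init(&p);
    for (off = 0; off < len; off += chunk) {
        size_t n = len - off < chunk ? len - off : chunk;

        if (parser_feed(&p, text + off, n) != 0)
            return -1;                /* live bitmap is left untouched */
    }
    return parser_finish(&p, live);
}

static bool port_is_set(const uint8_t *bitmap, unsigned int port)
{
    return (bitmap[port / 8] >> (port % 8)) & 1u;
}
```

Calling parse_in_chunks(text, len, 4096, live) mimics the 4k-at-a-time case; a much smaller chunk size is handy for checking that ranges split across reads still parse. Because all bits go into the scratch bitmap first, a malformed list returns -1 and leaves the previously installed bitmap as it was.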