From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754047Ab2A3AMj (ORCPT ); Sun, 29 Jan 2012 19:12:39 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:55005 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751694Ab2A3AMi (ORCPT ); Sun, 29 Jan 2012 19:12:38 -0500 From: ebiederm@xmission.com (Eric W. Biederman) To: Earl Chew Cc: Andrew Morton , Ingo Molnar , Peter Zijlstra , Eric Paris , "Serge E. Hallyn" , "linux-kernel\@vger.kernel.org" , Subject: Re: [PATCH] Support single byte reads from integers published in procfs by kernel/sysctl.c References: <4F1C5995.3060806@ixiacom.com> <4F1D8180.9030509@ixiacom.com> <4F1D8F37.6020806@ixiacom.com> <4F25CEB5.8070704@ixiacom.com> Date: Sun, 29 Jan 2012 16:15:10 -0800 In-Reply-To: <4F25CEB5.8070704@ixiacom.com> (Earl Chew's message of "Sun, 29 Jan 2012 14:56:53 -0800") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-XM-SPF: eid=;;;mid=;;;hst=in01.mta.xmission.com;;;ip=98.207.153.68;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX18ALaFwBJq/QTXCv6/tiUfsdNt3+CooOSk= X-SA-Exim-Connect-IP: 98.207.153.68 X-SA-Exim-Mail-From: ebiederm@xmission.com X-SA-Exim-Scanned: No (on in01.mta.xmission.com); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Earl Chew writes: >> If you are interested in fixing this properly with a tiny buffer >> reachable from struct file I think this can be worth fixing. I think >> this is doable by using seq_file in proc_sys_read. > > I've looked into making proc_sys_read() use seq_file, but there are a few > issues to work through. > > I'm assuming that the existing kernel modules must continue to use the > ctl_table/proc_dointvec/etc interface without requiring patching. > > The two main consequences of this are: > > a. The existing struct ctl_table interface should be preserved. > Kernel modules use this to publish into procfs. Generally. Changing all 200 or so users of the ctl_table interface is a pain to audit and make certain you having done something silly. > b. The existing proc_dointvec(), etc, functions must be preserved > because they are exported via EXPORT_SYMBOL. Kernel modules > may rely on these symbols in arbitrary ways. You only have to worry about in kernel users, and a quick source audit should allow to be certain that there are no arbitrary weird uses of proc_dointvec. > I tried use seq_file in proc_sys_read, and it comes close to solving > the problem. The issue is that proc_dointvec() requires a __user pointer. > > A seq_file has an internal buffer that could be filled by proc_dointvec() -- but that > seq_file buffer is in kernel space. > > This is not a problem isolated to the seq_file implementation. I think that > any approach involving a small local buffer would mean that that buffer is > in kernel space, and that buffer cannot be passed to proc_dointvec() as it > stands now because proc_dointvec() requires a __user buffer. We can definitely change the signature of proc_handler and thus proc_dointvec and friends to be friendlier to an internal buffer. For prototyping it probably makes sense to use set_fs so you can use an internal buffer without needing to change the method. > One approach that might work is to add a new field to struct ctl_table : > > struct ctl_table > { > ... > proc_seq_handler *proc_seq_handler; > }; > > Existing modules can continue to use proc_dointvec() etc and fixed > code can use the new proc_seq_handler interface using the new > proc_seq_dointvec() etc. > > > Although this preserves the old interface, it seems to me that this > approach is flawed in that kernel modules using struct ctl_table and proc_dointvec() > continue to be broken -- they are not fixed. It might make a good intermediate transition scheme while existing tables are being updated. > So, perhaps breaking assumption (b) might be a reasonable thing to do ? Definitely. > If we accept that proc_dointvec() etc are only or mainly used in the > context of filling out struct ctl_table, and that other kernel modules > don't use proc_dointvec() in other contexts, then we can change the > call signature of proc_dointvec() to stop using a __user pointer: Exactly. > 1. Change signature of proc_dointvec() etc to stop using __user pointer for reads > 2. Change definition of typedef proc_call_handler to stop using __user pointer for reads > 3. In kernel/sysctl.c don't use copy_to_user() for reads All of which sound like nice cleanups, and likely to reduce the number of places where people can badly get things wrong. > Then proc_sys_read() can use proc_dointvec() etc to fill seq_file. Yes. That sounds like a nice cleanup. > For writes, the existing behaviour needs to be preserved. One approach > that would solve this would be to add: > > union proc_buffer { > void __user *uptr; > void *kptr; > }; > > typedef int proc_handler (struct ctl_table *ctl, int write, > union proc_buffer buffer, size_t *lenp, loff_t *ppos); > I think this would be preferable to either casting away __user for reads, or > adding two pointers to the proc_handler signature (one with __user, one without). > > If write is indicated, use buffer.uptr, and if not the handlers would use buffer.kptr. For interface design I would look carefully at the bitmap interface that comes out of sysctl. I think that is the most data interface we have. I suspect what we want for writes is to read the existing value into an internal buffer, and then allow partial writes to the internal buffer. I don't know if writable seq_files exist. > But this is not perfect. Kernel modules containing their _own_ proc_handler > definitions would now be broken severely. > > > And now I circle back to the proc_seq_handler approach. Each ctl_table client > must be fixed individually (ie switch from proc_handler to proc_seq_handler). > > What do you think ? It sounds like you are thinking things through well. For a transition strategy we may want to change proc_handler into operations structure instead of just a function pointer. Only having a single method we can call sounds like a pain. struct ctl_proc_operations { int (*proc_handler)(....); int (*proc_seq_handler)(...); }; Or maybe: struct ctl_proc_operations { int (*proc_handler)(....); int (*proc_seq_read_handler)(...); int (*proc_seq_write_handler)(...); }; Having the option of different read/write methods is a plus. More usefully this should allow us to update what the implementation of things like proc_dointvec mean without having to update all of the tables. Which should make it much easier to preserve correctness, because we don't a huge number of manual edits to ctl_table structures all over the kernel. Eric