From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756437AbZLUCbd (ORCPT ); Sun, 20 Dec 2009 21:31:33 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754019AbZLUCbc (ORCPT ); Sun, 20 Dec 2009 21:31:32 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:57107 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753815AbZLUCbb (ORCPT ); Sun, 20 Dec 2009 21:31:31 -0500 To: Andi Kleen Cc: linux-kernel@vger.kernel.org, paulmck@linux.vnet.ibm.com References: <20091221220.243954235@firstfloor.org> <20091221020427.GA25372@basil.fritz.box> From: ebiederm@xmission.com (Eric W. Biederman) Date: Sun, 20 Dec 2009 18:31:23 -0800 In-Reply-To: <20091221020427.GA25372@basil.fritz.box> (Andi Kleen's message of "Mon\, 21 Dec 2009 03\:04\:28 +0100") Message-ID: User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-XM-SPF: eid=;;;mid=;;;hst=in02.mta.xmission.com;;;ip=76.21.114.89;;;frm=ebiederm@xmission.com;;;spf=neutral X-SA-Exim-Connect-IP: 76.21.114.89 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: Re: [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls X-SA-Exim-Version: 4.2.1 (built Thu, 25 Oct 2007 00:26:12 +0000) X-SA-Exim-Scanned: No (on in02.mta.xmission.com); Unknown failure Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Andi Kleen writes: > On Sun, Dec 20, 2009 at 05:59:59PM -0800, Eric W. Biederman wrote: >> Andi Kleen writes: >> >> > With BKL-less sysctls most of the writable string sysctls are racy. There >> > is no locking on the reader side, so a reader could see an inconsistent >> > string or worse miss the terminating null and walk of beyond it. >> >> The walk will only extend up to the maximum length of the string. >> So the worst case really is inconsistent data. > > It could still miss the 0 byte and walk out, can't it? Looking again. Yes it appears there is a small vulnerability there. The code does: len = strlen(data); if (len > maxlen) len = maxlen; So we should be calling: len = strnlen(data, maxlen); At which point we won't be able to walk out. The write side appears to be in need of strnlen_user as well, so it does not walk all of user space looking for null byte. >> This is an unfortunate corner case. This is not a regression as this >> has been the way things have worked for years. So probably 2.6.34 >> material. > > The one that's a clear regression is the core pattern one, that > was protected before by the BKL. A lot of others were always > broken yes. Nope. The core pattern just thought it was protected by BKL. I did not change the /proc/sys code path to remove the BKL. I don't know if we ever took the BKL on the /proc/sys codepath. I remember looking at the core pattern earlier and my memory is that sysctl is new enough that core pattern was not protected by the BKL on the /proc/sys path when it was introduced. There was a lot of confusing code in the sys_sysctl code path (which grabbed the BKL) so I expect people thought they were safe due to the BKL when they were not. So we have sysctl have locking problems, not new sysctl regressions. >> > This patch kit adds a new "rcu string" variant to avoid these >> > problems and convers the racy users. One the writer side the strings are >> > always copied to new memory and the readers use rcu_read_lock() >> > to get a stable view. For readers who access the string over >> > sleeps the reader copies the string. >> >> I will have to look more after the holidays. This rcu_string looks like >> it introduces allocations on paths that did not use them before, which >> has me wondering a bit. > > On the reader side about all of them allocated before, e.g. for > call_usermodehelper. That sounds like less of an issue. > If the strings were made a bit smaller this could be also > put on the stack, but I didn't dare for 256 bytes. Hmm. rcu wise that sounds wrong, but I haven't looked into your cool new data structure yet. Eric