From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755105Ab0G1MH7 (ORCPT ); Wed, 28 Jul 2010 08:07:59 -0400 Received: from mx1.redhat.com ([209.132.183.28]:32347 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754763Ab0G1MH5 (ORCPT ); Wed, 28 Jul 2010 08:07:57 -0400 Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 From: David Howells In-Reply-To: References: <20100727155023.GF1967@jolsa.brq.redhat.com> <24865.1280249187@redhat.com> To: Linus Torvalds , Jiri Olsa Cc: dhowells@redhat.com, Andrew Morton , Eric Dumazet , linux-kernel@vger.kernel.org, "Paul E. McKenney" , linux-security-module@vger.kernel.org Subject: Re: [PATCH] cred - synchronize rcu before releasing cred Date: Wed, 28 Jul 2010 13:07:42 +0100 Message-ID: <18537.1280318862@redhat.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Linus Torvalds wrote: > So it looks like the validation is simply wrong. The __task_cred() > helper is buggy. It's used for two different cases, and they have > totally different locking requirements. I think part of my comment on __task_cred(): * The caller must make sure task doesn't go away, either by holding a * ref on task or by holding tasklist_lock to prevent it from being * unlinked. may be obvious and perhaps unnecessary - anyone attempting to access a task_struct should know that they need to prevent it from going away whilst they do it - and I think this has led to Paul McKenny misunderstanding the intent. What I was trying to convey is the destruction of the task_struct involves discarding the credentials it points to. Either I should insert the word 'also' into that comment after 'must' or just get rid of it entirely. I think I should remove lock_dep_tasklist_lock_is_held() from the stated checks. It doesn't add anything, and someone calling __task_cred() on their own process (perhaps indirectly) doesn't need the tasklist lock. > Umm. In that case, get_task_cred() is actively misleading. > > What you are saying is that you cannot do > > rcu_read_lock() > __cred = (struct cred *) __task_cred((task)); > get_cred(__cred); > rcu_read_unlock(); Yeah. I think there are three alternatives: (1) get_task_cred() could be done by doing { __task_cred(), atomic_inc_not_zero() } in a loop until we manage to come up with the goods. It probably wouldn't be all that inefficient as creds don't change very often as a rule. (2) Get a spinlock in commit_creds() around the point where we alter the cred pointers. (3) Try and get rid of get_task_cred() and use other means. I've gone through all the users of get_task_cred() (see below) and this may be viable, though restrictive, in places. Any thoughts as to which is the best? > So presumably Jiri's patch is correct, but the reason the bug happened > in the first place is that all those accessor functions are totally > confused about how they supposed to be used, with incorrect comments > and incorrect access checks. At some point in the past I think I discarded a lock from the code:-/ > That should get fixed. Who knows how many other buggy users there are > due to the confusion? Some. warthog>grep get_task_cred `find . -name "*.[ch]"` ./kernel/auditsc.c: const struct cred *cred = get_task_cred(tsk); This is audit_filter_rules() which is used by: - audit_filter_task() - audit_alloc() as called from copy_process() with the new process - audit_filter_syscall() - audit_get_context() - audit_free() as called from copy_process() with the new, now dead process - audit_free() as called from do_exit() with the dead process - audit_syscall_exit() which passes current. - audit_syscall_entry() which passes current. - audit_filter_inodes() - audit_update_watch() which passes current - audit_get_context() - see above. which I think are all safe because when it's a new/dead process being accessed, that process can't be modifying itself at that point, otherwise it's current being accessed. ./kernel/cred.c: old = get_task_cred(daemon); This is prepare_kernel_cred() which is used by: - cachefiles_get_security_ID() which passes current. so this is safe. ./include/linux/cred.h: * get_task_cred - Get another task's objective credentials ./include/linux/cred.h:#define get_task_cred(task) \ The header file stuff under discussion. ./security/security.c: cred = get_task_cred(tsk); ./security/security.c: cred = get_task_cred(tsk); These are security_real_capable{,_noaudit}() if CONFIG_SECURITY=y, which could be a problem since they're used by has_capability{,_noaudit}() and called by things like the OOM killer with tasks other than current. However, I'm not sure it's necessary for get_task_cred() to be used here (in security/security.c) as it doesn't appear that the capable() LSM method sleeps in the one LSM that implements it (SELinux) or in the commoncap code. David