From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752482AbdBUUZJ (ORCPT ); Tue, 21 Feb 2017 15:25:09 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:39064 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752094AbdBUUYz (ORCPT ); Tue, 21 Feb 2017 15:24:55 -0500 From: ebiederm@xmission.com (Eric W. Biederman) To: Oleg Nesterov Cc: Andrew Morton , Mika =?utf-8?Q?Penttil?= =?utf-8?Q?=C3=A4?= , Aleksa Sarai , Andy Lutomirski , Attila Fazekas , Jann Horn , Kees Cook , Michal Hocko , Ulrich Obergfell , linux-kernel@vger.kernel.org References: <20170213141452.GA30203@redhat.com> <20170213141516.GA30233@redhat.com> <20170213180454.GA2858@redhat.com> <87zihmpdkf.fsf@xmission.com> <20170220152202.GA13726@redhat.com> <87vas4wl3z.fsf@xmission.com> <20170221175354.GA31436@redhat.com> Date: Wed, 22 Feb 2017 09:20:15 +1300 In-Reply-To: <20170221175354.GA31436@redhat.com> (Oleg Nesterov's message of "Tue, 21 Feb 2017 18:53:55 +0100") Message-ID: <871surmh34.fsf@xmission.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-XM-SPF: eid=1cgGzU-00054v-Ku;;;mid=<871surmh34.fsf@xmission.com>;;;hst=in01.mta.xmission.com;;;ip=101.100.131.232;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX188skXYBBATwJkCzBeUHT4coyXc5OaMywA= X-SA-Exim-Connect-IP: 101.100.131.232 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.7 XMSubLong Long Subject * 0.0 T_TM2_M_HEADER_IN_MSG BODY: No description available. * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.4999] * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa04 1397; Body=1 Fuz1=1 Fuz2=1] * 1.0 T_XMHurry_00 Hurry and Do Something * 0.0 T_TooManySym_01 4+ unique symbols in subject * 0.0 T_TooManySym_02 5+ unique symbols in subject X-Spam-DCC: XMission; sa04 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: *;Oleg Nesterov X-Spam-Relay-Country: X-Spam-Timing: total 382 ms - load_scoreonly_sql: 0.05 (0.0%), signal_user_changed: 3.7 (1.0%), b_tie_ro: 2.6 (0.7%), parse: 1.60 (0.4%), extract_message_metadata: 6 (1.7%), get_uri_detail_list: 3.7 (1.0%), tests_pri_-1000: 6 (1.6%), tests_pri_-950: 1.81 (0.5%), tests_pri_-900: 1.48 (0.4%), tests_pri_-400: 31 (8.1%), check_bayes: 29 (7.7%), b_tokenize: 11 (3.0%), b_tok_get_all: 9 (2.4%), b_comp_prob: 3.4 (0.9%), b_tok_touch_all: 3.2 (0.8%), b_finish: 0.78 (0.2%), tests_pri_0: 314 (82.1%), check_dkim_signature: 0.62 (0.2%), check_dkim_adsp: 3.3 (0.9%), tests_pri_500: 4.0 (1.0%), rewrite_mail: 0.00 (0.0%) Subject: Re: [PATCH V2 1/2] exec: don't wait for zombie threads with cred_guard_mutex held X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Oleg Nesterov writes: > On 02/21, Eric W. Biederman wrote: >> >> Today cred_guard_mutex is part of making exec appear to be an atomic >> operation to ptrace and and proc. To make exec appear to be atomic >> we do need to take the mutex at the beginning and release it at the end >> of exec. >> >> The semantics of exec appear atomic to ptrace_attach and to proc readers >> are necessary to ensure we use the proper process credentials in the >> event of a suid exec. > > This is clear. My point is that imo a) it is over-used in fs/proc and b) > the scope of this mutex if execve is too huge. I see absolutely no reason > to do copy_strings() with this mutex held, for example. And note that > copy_strings() can use a lot of memory/time, it can trigger oom,swapping, > etc. I agree that we can do things like copy_strings that don't change the data structures and of the task without before taking the cred_guard_mutex. > But let me repeat, this is a bit off-topic right now, this patch doesn't > change anything in this respect, afaics. > > >> I believe making cred_guard_mutex per task is an option. Reducing the >> scope of cred_guard_mutex concerns me. There appear to be some fields >> like sighand that we currently expose in proc > > please see another email, collect_sigign_sigcatch() is called without this > mutex. I agree that it is called without the mutex. It is not clear to me that is the correct behavior. It violates the fundamental property that exec of a setuid executable should be an atomic operation. I don't know how much we care but it disturbs me that we can read something of a processes signal handling state with the wrong credentials. Adopting an implementation where we can never fix this apparent bug really really disturbs me. >> Do you know if we can make cred_guard_mutex a per-task lock again? > > I think we can, but this needs some (afaics simple) changes too. > > But for what? Note that the problem fixed by this series won't go away > if we do this. I believe it will if the other waiters use mutex_lock_killable. > So what do you think about this series? I like the second patch. That seems clean and reasonable. I really don't like the first patch. It makes an information leak part a required detail of the implementation and as such possibly something we can never change. It attempts to paint a picture for a full fix in the future that appears to result in an incorrect kernel. That really bugs me. I suspect that a good fix that respects that proc and ptrace_attach need to exclude the setuid exec case for semantic reasons would have a similar complexity. I think a mutex doing the job that cred_guard_mutex is doing especially when we have multiple readers and a single writer is the wrong locking primative. A reader-writer lock or something even cheaper would probably be much better. I think fixing the deadlock is important. I think structuring the fix in such a way that the code is easily maintainable in the future and is also very important. Right now it feels like your fix in patch 1 makes things a bit more brittle and I don't like that at all. Eric