From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1756897AbcHDDfh (ORCPT ); Wed, 3 Aug 2016 23:35:37 -0400
Received: from out02.mta.xmission.com ([166.70.13.232]:47810 "EHLO
    out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1751434AbcHDDff (ORCPT );
    Wed, 3 Aug 2016 23:35:35 -0400
From: ebiederm@xmission.com (Eric W. Biederman)
To: Peter Zijlstra
Cc: Kees Cook, Jeff Vander Stoep, Ingo Molnar,
    Arnaldo Carvalho de Melo, Alexander Shishkin,
    "linux-doc\@vger.kernel.org",
    "kernel-hardening\@lists.openwall.com", LKML, Jonathan Corbet
References: <1469630746-32279-1-git-send-email-jeffv@google.com>
    <20160802095243.GD6862@twins.programming.kicks-ass.net>
    <20160802203037.GC6879@twins.programming.kicks-ass.net>
    <87shulix2z.fsf@x220.int.ebiederm.org>
    <20160803214437.GI6879@twins.programming.kicks-ass.net>
Date: Wed, 03 Aug 2016 21:50:37 -0500
In-Reply-To: <20160803214437.GI6879@twins.programming.kicks-ass.net>
    (Peter Zijlstra's message of "Wed, 3 Aug 2016 23:44:37 +0200")
Message-ID: <87fuqldz7m.fsf@x220.int.ebiederm.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow
    further restriction of perf_event_open
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Peter Zijlstra writes:

> On Wed, Aug 03, 2016 at 11:53:41AM -0700, Kees Cook wrote:
>> > Kees Cook writes:
>> >
>> >> On Tue, Aug 2, 2016 at 1:30 PM, Peter Zijlstra wrote:
>> >> Let me take this another way instead. What would be a better way to
>> >> provide a mechanism for system owners to disable perf without an LSM?
>> >> (Since far fewer folks run with an enforcing "big" LSM: I'm seeking as
>> >> wide a coverage as possible.)
>> >
>> > I vote for sandboxes. Perhaps seccomp. Perhaps a per userns sysctl.
>> > Perhaps something else.
>>
>> Peter, did you happen to see Eric's solution to this problem for
>> namespaces? Basically, a per-userns sysctl instead of a global sysctl.
>> Is that something that would be acceptable here?
>
> Someone would have to educate me on what a userns is and how that would
> help here.

userns is an abbreviation for user namespace.  How it might help is that
it is an easy unescapable context for processes.  Essentially the idea
is to limit the scope of the sysctl to a container.

User namespaces run into flak because, while tremendously simple in
themselves, the code takes advantage of the fact that suid root
executables in a user namespace do not have privileges on anything
outside of the user namespace.  This means that it is semantically safe
to allow operations like creating mount namespaces, mounting
filesystems, creating network namespaces, manipulating the network
stack, etc.  All of which allows unprivileged users (that can create
network namespaces) to exercise more kernel code, and with it whatever
bugs that code contains.

Fundamentally, user namespaces, as objects you can create, need a limit
on the maximum number that can be created, to catch runaway processes.
Set that limit to 0 and you get what Kees wants.

In my pending patches that were not quite ready for the merge window, I
added a sysctl that described the maximum number of user namespaces that
could be created (default value threads-max), and implemented the sysctl
in a per user way, such that counts and limits were kept for every user
namespace.  In a nested user namespace (which is all of them except for
the initial user namespace) the count would first be checked against the
limit in the current user namespace; then the count would be incremented
in the parent and verified to be below the limit in the parent user
namespace.

What this means in practice is that user namespaces can be enabled by
default on a system, and yet you can easily disable them in a sandbox
that was built with a user namespace.
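
The counting scheme described above could be sketched roughly as
follows.  This is a standalone Python illustration, not the code from
the pending patches: UserNS, charge_userns and create_userns are
hypothetical names, the limit is passed at creation time rather than set
through a sysctl, the per user part of the accounting is left out, and
charging every ancestor (rather than only the immediate parent) is one
reading of the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserNS:
    """Hypothetical stand-in for a user namespace (not a kernel structure)."""
    limit: int                          # this level's max_user_namespaces
    parent: Optional["UserNS"] = None   # None for the initial user namespace
    count: int = 0                      # namespaces charged to this level


def charge_userns(ns: UserNS) -> bool:
    """Charge one new user namespace to ns and each of its ancestors.

    Returns False, leaving every count unchanged, if any level is at its limit.
    """
    charged = []
    level = ns
    while level is not None:
        if level.count >= level.limit:
            for done in charged:        # roll back the levels already charged
                done.count -= 1
            return False
        level.count += 1
        charged.append(level)
        level = level.parent
    return True


def create_userns(creator: UserNS, limit: int) -> Optional[UserNS]:
    """Create a child of creator, or return None if a limit refuses it."""
    if not charge_userns(creator):
        return None
    return UserNS(limit=limit, parent=creator)


init_ns = UserNS(limit=4)                   # namespaces enabled system-wide
sandbox = create_userns(init_ns, limit=0)   # limit 0 inside the sandbox
nested = create_userns(sandbox, limit=0)    # refused by the sandbox's limit
print(sandbox is not None, nested is None, init_ns.count)
```

With a limit of 0 in the sandbox, the attempt to create a nested user
namespace is refused there, while the initial user namespace keeps its
own nonzero limit.
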

I named the new sysctls in my patch:

    /proc/sys/userns/max_user_namespaces
    /proc/sys/userns/max_pid_namespaces
    /proc/sys/userns/max_net_namespaces
    /proc/sys/userns/max_uts_namespaces
    /proc/sys/userns/max_ipc_namespaces
    /proc/sys/userns/max_cgroup_namespaces
    /proc/sys/userns/max_mnt_namespaces

What Kees was suggesting was to add a similar sysctl, say:

    /proc/sys/userns/perf_event_enabled

and have the ability to disable perf events in each user namespace,
while still being able to leave the use of perf events enabled by
default.

I don't know if any of that is a good fit for perf events.  For purposes
of this discussion I assume we are limiting ourselves to discussing
userspace tracing, which semantically is 100% fine for access by
userspace.

Eric