From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754172AbbI3HPp (ORCPT ); Wed, 30 Sep 2015 03:15:45 -0400
Received: from mail-wi0-f196.google.com ([209.85.212.196]:35586 "EHLO
	mail-wi0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753714AbbI3HPn (ORCPT ); Wed, 30 Sep 2015 03:15:43 -0400
Date: Wed, 30 Sep 2015 09:15:37 +0200
From: Ingo Molnar
To: Andy Lutomirski , Kees Cook
Cc: Thomas Gleixner , Dmitry Vyukov , Andrey Ryabinin , Ingo Molnar ,
	"H. Peter Anvin" , Andy Lutomirski , Borislav Petkov ,
	Denys Vlasenko , "x86@kernel.org" , LKML , Kostya Serebryany ,
	Alexander Potapenko , Andrey Konovalov , Sasha Levin , Andi Kleen ,
	kasan-dev , Linus Torvalds , Peter Zijlstra , Andrew Morton ,
	Kees Cook , Al Viro
Subject: [PATCH] fs/proc: Don't expose absolute kernel addresses via wchan
Message-ID: <20150930071537.GA19048@gmail.com>
References: <1443430839-13225-1-git-send-email-dvyukov@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Andy Lutomirski wrote:

> > +	 *	 ----------- bottom = start + sizeof(thread_info)
> > +	 *	 thread_info
> > +	 *	 ----------- start
> > +	 *
> > +	 * The tasks stack pointer points at the location where the
> > +	 * framepointer is stored. The data on the stack is:
> > +	 * ... IP FP ... IP FP
> > +	 *
> > +	 * We need to read FP and IP, so we need to adjust the upper
> > +	 * bound by another unsigned long.
> > +	 */
> > +	top = start + THREAD_SIZE - 2 * sizeof(unsigned long);
> > +	bottom = start + sizeof(struct thread_info);
> > +
> > +	sp = p->thread.sp;
> > +	if (sp < bottom || sp > top)
> > +		return 0;
> > +
> > +	fp = *(unsigned long *)sp;
> >  	do {
> > -		if (fp < (unsigned long)stack ||
> > -		    fp >= (unsigned long)stack+THREAD_SIZE)
> > +		if (fp < bottom || fp > top)
> >  			return 0;
> > -		ip = *(u64 *)(fp+8);
> > +		ip = *(unsigned long *)(fp + sizeof(unsigned long));
> >  		if (!in_sched_functions(ip))
> >  			return ip;
> > -		fp = *(u64 *)fp;
> > +		fp = *(unsigned long *)fp;
> >  	} while (count++ < 16);
>
> I'd be vaguely amazed if this isn't an exploitable info leak even
> without the out of bounds thing.  Can we really not find a way to do
> this without walking the stack?

So wchan leaks the absolute kernel addresses of functions that sleep to
unprivileged user-space:

  static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
                            struct pid *pid, struct task_struct *task)
  {
          unsigned long wchan;
          char symname[KSYM_NAME_LEN];

          wchan = get_wchan(task);

          if (lookup_symbol_name(wchan, symname) < 0) {
                  if (!ptrace_may_access(task, PTRACE_MODE_READ))
                          return 0;
                  seq_printf(m, "%lu", wchan);
          } else {
                  seq_printf(m, "%s", symname);
          }

          return 0;
  }

So for example it trivially leaks the KASLR offset to any local attacker:

  fomalhaut:~> printf "%016lx\n" $(cat /proc/$$/stat | cut -d' ' -f35)
  ffffffff8123b380

Most real-life uses of wchan are symbolic:

  ps -eo pid:10,tid:10,wchan:30,comm

and procps uses /proc/PID/wchan, not the absolute address in /proc/PID/stat:

  triton:~/tip> strace -f ps -eo pid:10,tid:10,wchan:30,comm 2>&1 | grep wchan | tail -1
  open("/proc/30833/wchan", O_RDONLY)     = 6

So shouldn't we try to set all numeric output to 0 and only allow symbolic
output via /proc/PID/wchan?

These days there's very little legitimate reason user-space would be
interested in the absolute address.
The absolute address is mostly historic: from the days when we didn't
have kallsyms and user-space procps had to do the decoding itself via the
System.map.

( The absolute sleep address can generally still be profiled via perf, by
  tasks with sufficient privileges. )

I.e. how about something like the patch below? (completely untested.)

Thanks,

	Ingo

======================>

 fs/proc/array.c | 2 +-
 fs/proc/base.c  | 7 +------
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index f60f0121e331..99082730b2ac 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -507,7 +507,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	seq_put_decimal_ull(m, ' ', task->blocked.sig[0] & 0x7fffffffUL);
 	seq_put_decimal_ull(m, ' ', sigign.sig[0] & 0x7fffffffUL);
 	seq_put_decimal_ull(m, ' ', sigcatch.sig[0] & 0x7fffffffUL);
-	seq_put_decimal_ull(m, ' ', wchan);
+	seq_puts(m, " 0"); /* Used to be numeric wchan - replaced by /proc/PID/wchan */
 	seq_put_decimal_ull(m, ' ', 0);
 	seq_put_decimal_ull(m, ' ', 0);
 	seq_put_decimal_ll(m, ' ', task->exit_signal);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index b25eee4cead5..2fdbf303e3eb 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -430,13 +430,8 @@ static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
 
 	wchan = get_wchan(task);
 
-	if (lookup_symbol_name(wchan, symname) < 0) {
-		if (!ptrace_may_access(task, PTRACE_MODE_READ))
-			return 0;
-		seq_printf(m, "%lu", wchan);
-	} else {
+	if (!lookup_symbol_name(wchan, symname))
 		seq_printf(m, "%s", symname);
-	}
 
 	return 0;
 }
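
For anyone who wants to check from user-space whether field 35 of
/proc/PID/stat still carries a numeric wchan, here is a rough sketch in C.
It is only an illustration and not part of the patch: stat_field() and the
WCHAN_DEMO_MAIN guard are made-up names. The helper scans from the last ')'
because the comm field may itself contain spaces and parentheses, which is
something the cut(1) one-liner above does not handle.

```c
/* Sketch only: stat_field() is a hypothetical helper, not a kernel or libc API. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Store field number `field` (1-based, must be >= 3) of a /proc/PID/stat
 * line in *out. Returns 0 on success, -1 if the line is malformed, the
 * field is missing, or the field is not an unsigned decimal number.
 */
static int stat_field(const char *line, int field, unsigned long long *out)
{
	const char *p = strrchr(line, ')');	/* last ')' closes field 2 (comm) */
	int cur = 2;

	if (!p || field < 3)
		return -1;
	p++;

	while (*p) {
		while (*p == ' ')
			p++;
		if (!*p)
			break;
		if (++cur == field) {
			char *end;

			errno = 0;
			*out = strtoull(p, &end, 10);
			return (end == p || errno) ? -1 : 0;
		}
		while (*p && *p != ' ')
			p++;
	}
	return -1;
}

#ifdef WCHAN_DEMO_MAIN	/* assumption: build with -DWCHAN_DEMO_MAIN to get a demo binary */
int main(void)
{
	char buf[4096];
	unsigned long long wchan;
	FILE *f = fopen("/proc/self/stat", "r");

	if (!f || !fgets(buf, sizeof(buf), f)) {
		perror("/proc/self/stat");
		return 1;
	}
	fclose(f);

	if (stat_field(buf, 35, &wchan)) {
		fprintf(stderr, "cannot parse field 35\n");
		return 1;
	}
	/* A running task has no wait channel, so 0 here is not conclusive by itself. */
	printf("numeric wchan: %016llx\n", wchan);
	return 0;
}
#endif
```

A task reading its own stat file is running, so it is more telling to point
the same helper at the stat line of a sleeping task. With a change along the
lines of the patch above, field 35 would be expected to read 0 for every task.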