From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756510AbbJAH5X (ORCPT ); Thu, 1 Oct 2015 03:57:23 -0400 Received: from mail-wi0-f180.google.com ([209.85.212.180]:37858 "EHLO mail-wi0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755568AbbJAH5V (ORCPT ); Thu, 1 Oct 2015 03:57:21 -0400 Date: Thu, 1 Oct 2015 09:57:15 +0200 From: Ingo Molnar To: Kees Cook Cc: Thomas Gleixner , Andy Lutomirski , Dmitry Vyukov , Andrey Ryabinin , Ingo Molnar , "H. Peter Anvin" , Andy Lutomirski , Borislav Petkov , Denys Vlasenko , "x86@kernel.org" , LKML , Kostya Serebryany , Alexander Potapenko , Andrey Konovalov , Sasha Levin , Andi Kleen , kasan-dev , Linus Torvalds , Peter Zijlstra , Andrew Morton , Al Viro Subject: [PATCH v3] fs/proc, core/debug: Don't expose absolute kernel addresses via wchan Message-ID: <20151001075715.GA23430@gmail.com> References: <1443430839-13225-1-git-send-email-dvyukov@google.com> <20150930071537.GA19048@gmail.com> <20150930135917.GA3285@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * Kees Cook wrote: > > @@ -507,7 +505,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns, > > seq_put_decimal_ull(m, ' ', task->blocked.sig[0] & 0x7fffffffUL); > > seq_put_decimal_ull(m, ' ', sigign.sig[0] & 0x7fffffffUL); > > seq_put_decimal_ull(m, ' ', sigcatch.sig[0] & 0x7fffffffUL); > > - seq_put_decimal_ull(m, ' ', wchan); > > + seq_puts(m, " 0"); /* Used to be numeric wchan - replaced by /proc/PID/wchan */ > > Probably should also update Documentation/filesystems/proc.txt with > something like: > > --- a/Documentation/filesystems/proc.txt > +++ b/Documentation/filesystems/proc.txt > @@ -310,7 +310,7 @@ Table 1-4: Contents of the stat files (as of 2.6.30-rc7) > blocked bitmap of blocked signals > sigign bitmap of ignored signals > sigcatch bitmap of caught signals > - wchan address where process went to sleep > + 0 (place holder, was wchan, see /proc/PID/wchan instead) > 0 (place holder) > 0 (place holder) > exit_signal signal to send to parent thread on exit Indeed - I ended up clarifying both wchan explanations, see the changes below. I also made the 'no symbols' output "0" (instead of an empty string), to better match the /proc/PID/stat behavior and previous output. I'll push it out after a bit more testing and if nothing goes wrong I'll send this patch to Linus in the v4.4 merge window. Thanks, Ingo ============> >>From bc43bb95763e5b215e389f75860eca0952ca4704 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Wed, 30 Sep 2015 15:59:17 +0200 Subject: [PATCH] fs/proc, core/debug: Don't expose absolute kernel addresses via wchan So the /proc/PID/stat 'wchan' field (the 30th field) leaks absolute kernel addresses to unprivileged user-space, of kernel functions that sleep: seq_put_decimal_ull(m, ' ', wchan); The absolute address might also leak via /proc/PID/wchan, if KALLSYMS is turned off or if the symbol lookup fails for some reason: static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *task) { unsigned long wchan; char symname[KSYM_NAME_LEN]; wchan = get_wchan(task); if (lookup_symbol_name(wchan, symname) < 0) { if (!ptrace_may_access(task, PTRACE_MODE_READ)) return 0; seq_printf(m, "%lu", wchan); } else { seq_printf(m, "%s", symname); } return 0; } This isn't ideal, because for example it trivially leaks the KASLR offset to any local attacker: fomalhaut:~> printf "%016lx\n" $(cat /proc/$$/stat | cut -d' ' -f35) ffffffff8123b380 Most real-life uses of wchan are symbolic: ps -eo pid:10,tid:10,wchan:30,comm and procps uses /proc/PID/wchan, not the absolute address in /proc/PID/stat: triton:~/tip> strace -f ps -eo pid:10,tid:10,wchan:30,comm 2>&1 | grep wchan | tail -1 open("/proc/30833/wchan", O_RDONLY) = 6 These days there's very little legitimate reason user-space would be interested in the absolute address. The absolute address is mostly historic: from the days when we didn't have kallsyms and user-space procps had to do the decoding itself via the System.map. So this patch sets all numeric output to "0" and keeps only symbolic output, in /proc/PID/wchan. ( The absolute sleep address can generally still be profiled via perf, by tasks with sufficient privileges. ) Reviewed-by: Thomas Gleixner Acked-by: Kees Cook Acked-by: Linus Torvalds Cc: Cc: Al Viro Cc: Alexander Potapenko Cc: Andrey Konovalov Cc: Andrey Ryabinin Cc: Andy Lutomirski Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Denys Vlasenko Cc: Dmitry Vyukov Cc: Kostya Serebryany Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Peter Zijlstra Cc: Sasha Levin Cc: kasan-dev Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20150930135917.GA3285@gmail.com Signed-off-by: Ingo Molnar --- Documentation/filesystems/proc.txt | 5 +++-- fs/proc/array.c | 6 ++---- fs/proc/base.c | 9 +++------ 3 files changed, 8 insertions(+), 12 deletions(-) diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index d411ca63c8b6..db64f7d6492d 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -140,7 +140,8 @@ Table 1-1: Process specific entries in /proc stat Process status statm Process memory status information status Process status in human readable form - wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan + wchan If CONFIG_KALLSYMS=y, wchan (the kernel function the process is + blocked in) symbol string. "0" if not blocked or !KALLSYMS. pagemap Page table stack Report full stack trace, enable via CONFIG_STACKTRACE smaps a extension based on maps, showing the memory consumption of @@ -310,7 +311,7 @@ Table 1-4: Contents of the stat files (as of 2.6.30-rc7) blocked bitmap of blocked signals sigign bitmap of ignored signals sigcatch bitmap of caught signals - wchan address where process went to sleep + 0 (place holder, used to be the wchan address, use /proc/PID/wchan instead) 0 (place holder) 0 (place holder) exit_signal signal to send to parent thread on exit diff --git a/fs/proc/array.c b/fs/proc/array.c index f60f0121e331..ad5ad1e376ad 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -375,7 +375,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns, static int do_task_stat(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *task, int whole) { - unsigned long vsize, eip, esp, wchan = ~0UL; + unsigned long vsize, eip, esp; int priority, nice; int tty_pgrp = -1, tty_nr = 0; sigset_t sigign, sigcatch; @@ -454,8 +454,6 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns, unlock_task_sighand(task, &flags); } - if (permitted && (!whole || num_threads < 2)) - wchan = get_wchan(task); if (!whole) { min_flt = task->min_flt; maj_flt = task->maj_flt; @@ -507,7 +505,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns, seq_put_decimal_ull(m, ' ', task->blocked.sig[0] & 0x7fffffffUL); seq_put_decimal_ull(m, ' ', sigign.sig[0] & 0x7fffffffUL); seq_put_decimal_ull(m, ' ', sigcatch.sig[0] & 0x7fffffffUL); - seq_put_decimal_ull(m, ' ', wchan); + seq_puts(m, " 0"); /* Used to be numeric wchan - replaced by /proc/PID/wchan */ seq_put_decimal_ull(m, ' ', 0); seq_put_decimal_ull(m, ' ', 0); seq_put_decimal_ll(m, ' ', task->exit_signal); diff --git a/fs/proc/base.c b/fs/proc/base.c index b25eee4cead5..6f05aabce3aa 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -430,13 +430,10 @@ static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns, wchan = get_wchan(task); - if (lookup_symbol_name(wchan, symname) < 0) { - if (!ptrace_may_access(task, PTRACE_MODE_READ)) - return 0; - seq_printf(m, "%lu", wchan); - } else { + if (!lookup_symbol_name(wchan, symname)) seq_printf(m, "%s", symname); - } + else + seq_putc(m, '0'); return 0; }