Date: Thu, 23 Sep 2021 17:22:30 -0700
From: Vito Caputo
To: Jann Horn
Cc: Vito Caputo, Kees Cook, Thomas Gleixner, Josh Poimboeuf,
    Ingo Molnar, Borislav Petkov, "H. Peter Anvin", Jens Axboe,
    Mark Rutland, Peter Zijlstra, Stefan Metzmacher, Andy Lutomirski,
    Lai Jiangshan, Christian Brauner, Andrew Morton,
    "Kenta.Tada@sony.com", Daniel Bristot de Oliveira, Michael Weiß,
    Anand K Mistry, Alexey Gladkov, Michal Hocko, Helge Deller,
    Dave Hansen, Andrea Righi, Ohhoon Kwon, Kalesh Singh, YiFei Zhu,
    "Eric W. Biederman", linux-kernel@vger.kernel.org, x86@kernel.org,
    linux-fsdevel@vger.kernel.org, linux-hardening@vger.kernel.org
Subject: Re: [PATCH] proc: Disable /proc/$pid/wchan
Message-ID: <20210924002230.sijoedia65hf5bj7@shells.gnugeneration.com>
References: <20210923233105.4045080-1-keescook@chromium.org>
    <20210923234917.pqrxwoq7yqnvfpwu@shells.gnugeneration.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Fri, Sep 24, 2021 at 02:08:45AM +0200, Jann Horn wrote:
> On Fri, Sep 24, 2021 at 1:59 AM Vito Caputo wrote:
> > On Thu, Sep 23, 2021 at 04:31:05PM -0700, Kees Cook wrote:
> > > The /proc/$pid/wchan file has been broken by default on x86_64 for 4
> > > years now[1]. As this remains a potential leak of either kernel
> > > addresses (when symbolization fails) or limited observation of kernel
> > > function progress, just remove the contents for good.
> > >
> > > Unconditionally set the contents to "0" and also mark the wchan
> > > field in /proc/$pid/stat with 0.
> > >
> > > This leaves kernel/sched/fair.c as the only user of get_wchan(). But
> > > again, since this was broken for 4 years, was this profiling logic
> > > actually doing anything useful?
> > >
> > > [1] https://lore.kernel.org/lkml/20210922001537.4ktg3r2ky3b3r6yp@treble/
> > >
> > > Cc: Josh Poimboeuf
> > > Cc: Vito Caputo
> > > Signed-off-by: Kees Cook
> > >
> >
> > Please don't deliberately break WCHANs wholesale. This is a very
> > useful tool for sysadmins to get a vague sense of where processes are
> > spending time in the kernel on production systems without affecting
> > performance or having to restart things under instrumentation.
>
> Wouldn't /proc/$pid/stack be more useful for that anyway? As long as
> you have root privileges, you can read that to get the entire stack,
> not just a single method name.
>
> (By the way, I guess that might be an alternative to ripping wchan out
> completely - require CAP_SYS_ADMIN like for /proc/$pid/stack?)

WCHAN is a first-class concept of the OS. As a result we have
long-standing useful tools exposing them in far more organized,
documented, and discoverable ways than poking around linux-specific
/proc files at the shell. Even `top` can show WCHAN in a column
alongside everything else it exposes, complete with sorting etc, and
I've already demonstrated the support in `ps`.

I also think it's worth preserving the ability for regular users to
observe the WCHAN of their own processes. It's unclear to me why this
is such a worry.

If the WCHAN as-implemented is granular enough to expose too much
kernel inner workings, then it should be watered down to be more
vague. Even if it just said "ioctl" when a process was blocked in D
state through making an ioctl() it would still be much more useful
than saying nothing at all. Can't regular users see this much about
their own processes via strace/gdb anyways?

Instead of unwinding stacks maybe the kernel should be sticking an
entrypoint address in the current task struct for get_wchan() to
access, whenever userspace enters the kernel?

Regards,
Vito Caputo
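
For reference on the wchan field of /proc/$pid/stat that the quoted
patch description mentions: the following is only a rough sketch of how
a userspace tool might pull that field out of a stat line. It assumes
the layout documented in proc(5), where wchan is field 35. The names
parse_wchan_field and WCHAN_FIELD are made up for illustration and are
not taken from ps, top, or the kernel.

```python
WCHAN_FIELD = 35  # assumed 1-based position of wchan in /proc/<pid>/stat, per proc(5)


def parse_wchan_field(stat_line: str) -> int:
    """Return the numeric wchan field from one /proc/<pid>/stat line.

    Hypothetical helper for illustration only.
    """
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' rather than on whitespace.
    head, sep, rest = stat_line.rpartition(")")
    if not sep:
        raise ValueError("no comm field found in stat line")
    fields = rest.split()
    # 'rest' begins at field 3 (state), so field N sits at index N - 3.
    index = WCHAN_FIELD - 3
    if len(fields) <= index:
        raise ValueError("stat line has too few fields")
    return int(fields[index])
```

On a Linux system, passing it the contents of /proc/self/stat should
yield the raw value of that field. If the patch under discussion were
applied as described, that value would presumably always be 0.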