From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <59534db5-b251-c0c8-791f-58aca5c00a2b@linux.com>
Date: Tue, 16 Nov 2021 12:12:16 +0300
Subject: Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
From: Alexander Popov
To: Kees Cook, Steven Rostedt, Linus Torvalds
Cc: Lukas Bulwahn, Linus Torvalds, Jonathan Corbet, Paul McKenney,
 Andrew Morton, Thomas Gleixner, Peter Zijlstra, Joerg Roedel,
 Maciej Rozycki, Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap,
 Lu Baolu, Petr Mladek, Luis Chamberlain, Wei Liu, John Ogness,
 Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
 Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
 Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
 Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier, Jessica Yu,
 Iurii Zaikin, Rasmus Villemoes, Wang Qing, Mel Gorman,
 Mauro Carvalho Chehab, Andrew Klychkov, Mathieu Chouquet-Stringer,
 Daniel Borkmann, Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer,
 Mike Rapoport, Bjorn Andersson, Kernel Hardening,
 linux-hardening@vger.kernel.org, "open list:DOCUMENTATION", linux-arch,
 Linux Kernel Mailing List, linux-fsdevel, notify@kernel.org,
 main@lists.elisa.tech, safety-architecture@lists.elisa.tech,
 devel@lists.elisa.tech, Shuah Khan
References: <20211027233215.306111-1-alex.popov@linux.com>
 <77b79f0c-48f2-16dd-1d00-22f3a1b1f5a6@linux.com>
 <20211115110649.4f9cb390@gandalf.local.home>
 <202111151116.933184F716@keescook>
In-Reply-To: <202111151116.933184F716@keescook>

On 16.11.2021 01:06, Kees Cook wrote:
> Hmm, yes. What it originally boiled down to, which is why Linus first
> objected to BUG(), was that we don't know what other parts of the system
> have been disrupted.
> The best example is just that of locking: if we
> BUG() or do_exit() in the middle of holding a lock, we'll wreck whatever
> subsystem that was attached to. Without a deterministic system state
> unwinder, there really isn't a "safe" way to just stop a kernel thread.
>
> With this pkill_on_warn, we avoid the BUG problem (since the thread of
> execution continues and stops at an 'expected' place: the signal
> handler).
>
> However, now we have the newer objection from Linus, which is one of
> attribution: the WARN might be hit during an "unrelated" thread of
> execution and "current" gets blamed, etc. And beyond that, if we take
> down a portion of userspace, what in userspace may be destabilized? In
> theory, we get a case where any required daemons would be restarted by
> init, but that's not "known".
>
> The safest version of this I can think of is for processes to opt into
> this mitigation. That would also cover the "special cases" we've seen
> exposed too. i.e. init and kthreads would not opt in.
>
> However, that's a lot to implement when Marco's tracing suggestion might
> be sufficient and policy could be entirely implemented in userspace. It
> could be as simple as this (totally untested):

I don't think that this userspace warning handling can work as a
replacement for pkill_on_warn.

1. Kernel code execution continues after WARN_ON(); it will not wait for
some userspace daemon that is polling trace events. That is no different
from ignoring the warning and suffering all the negative effects after
WARN_ON().

2. This userspace policy will miss WARN_ON_ONCE(), WARN_ONCE() and
WARN_TAINT_ONCE() after the first hit.

Oh, wait...
I got a crazy idea that may bring more consistency to the error handling
mess.

What if the Linux kernel had an LSM responsible for error handling policy?
That would require adding LSM hooks to BUG*(), WARN*(), KERN_EMERG
messages, etc. In such an LSM policy we could decide immediately how to
react to a kernel error. We could even decide depending on the subsystem
and similar criteria.

(idea for brainstorming)

Best regards,
Alexander