From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi0-f71.google.com (mail-oi0-f71.google.com [209.85.218.71]) by kanga.kvack.org (Postfix) with ESMTP id 3831E6B3FDC for ; Mon, 27 Aug 2018 05:55:29 -0400 (EDT) Received: by mail-oi0-f71.google.com with SMTP id j17-v6so14115741oii.8 for ; Mon, 27 Aug 2018 02:55:29 -0700 (PDT) Received: from mail-sor-f41.google.com (mail-sor-f41.google.com. [209.85.220.41]) by mx.google.com with SMTPS id u141-v6sor7942113oie.151.2018.08.27.02.55.28 for (Google Transport Security); Mon, 27 Aug 2018 02:55:28 -0700 (PDT) MIME-Version: 1.0 References: <56A9902F-44BE-4520-A17C-26650FCC3A11@gmail.com> <9A38D3F4-2F75-401D-8B4D-83A844C9061B@gmail.com> <8E0D8C66-6F21-4890-8984-B6B3082D4CC5@gmail.com> <20180826112341.f77a528763e297cbc36058fa@kernel.org> <20180826090958.GT24124@hirez.programming.kicks-ass.net> <20180827120305.01a6f26267c64610cadec5d8@kernel.org> <20180827081329.GZ24124@hirez.programming.kicks-ass.net> In-Reply-To: <20180827081329.GZ24124@hirez.programming.kicks-ass.net> From: Jann Horn Date: Mon, 27 Aug 2018 11:55:00 +0200 Message-ID: Subject: Re: TLB flushes on fixmap changes Content-Type: text/plain; charset="UTF-8" Sender: owner-linux-mm@kvack.org List-ID: To: Peter Zijlstra , Andy Lutomirski , Dave Hansen Cc: mhiramat@kernel.org, Kees Cook , Nadav Amit , Linus Torvalds , Paolo Bonzini , jkosina@suse.cz, Will Deacon , benh@au1.ibm.com, npiggin@gmail.com, the arch/x86 maintainers , Borislav Petkov , Rik van Riel , Adin Scannell , kernel list , Linux-MM , "David S. Miller" , Martin Schwidefsky , Michael Ellerman On Mon, Aug 27, 2018 at 10:13 AM Peter Zijlstra wrote: > > On Mon, Aug 27, 2018 at 12:03:05PM +0900, Masami Hiramatsu wrote: > > On Sun, 26 Aug 2018 11:09:58 +0200 > > Peter Zijlstra wrote: > > > > FWIW, before text_poke_bp(), text_poke() would only be used from > > > stop_machine, so all the other CPUs would be stuck busy-waiting with > > > IRQs disabled. These days, yeah, that's lots more dodgy, but yes > > > text_mutex should be serializing all that. > > > > I'm still not sure that speculative page-table walk can be done > > over the mutex. Also, if the fixmap area is for aliasing > > pages (which always mapped to memory), what kind of > > security issue can happen? > > So suppose CPU-A is doing the text_poke (let's say through text_poke_bp, > such that other CPUs get to continue with whatever they're doing). > > While at that point, CPU-B gets an interrupt, and the CPU's > branch-trace-buffer for the IRET points to / near our fixmap. Then the > CPU could do a speculative TLB fill based on the BTB value, either > directly or indirectly (through speculative driven fault-ahead) of > whatever is in te fixmap at the time. Worse: The way academics have been defeating KASLR for a while is based on TLB fills for kernel addresses, triggered from userspace. Quoting https://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf : | Additionally, even if a permission error occurs, this still allows to | launch address translations and, hence, generate valid TLB entries | by accessing privileged kernel space memory from user mode. This was actually part of the original motivation for KAISER/KPTI. Quoting https://gruss.cc/files/kaiser.pdf : | Modern operating system kernels employ address space layout | randomization (ASLR) to prevent control-flow hijacking attacks and | code-injection attacks. While kernel security relies fundamentally on preventing | access to address information, recent attacks have shown that the | hardware directly leaks this information. I believe that PTI probably prevents this way of directly triggering TLB fills for now (under the assumption that hyperthreads with equal CR3 don't share TLB entries), but I would still assume that an attacker can probably trigger TLB fills for arbitrary addresses anytime. And at some point in the future, I believe people would probably like to be able to disable PTI again? > Then CPU-A completes the text_poke and only does a local TLB invalidate > on CPU-A, leaving CPU-B with an active translation. > > *FAIL*