From mboxrd@z Thu Jan 1 00:00:00 1970 References: <20170311000501.46607-1-thgarnie@google.com> <20170311000501.46607-2-thgarnie@google.com> <20170311094200.GA27700@gmail.com> <733ed189-6c01-2975-a81a-6fbfe4b7b593@zytor.com> <2d9aad2a-a677-40d2-c179-379fb6e9f194@zytor.com> From: "H. Peter Anvin" Message-ID: <7389c6e7-87dc-ea0d-5b2a-7925b8c8d33e@zytor.com> Date: Tue, 14 Mar 2017 09:44:56 -0700 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Subject: [kernel-hardening] Re: [PATCH v3 2/4] x86/syscalls: Specific usage of verify_pre_usermode_state To: Thomas Garnier , Andy Lutomirski Cc: Ingo Molnar , Martin Schwidefsky , Heiko Carstens , David Howells , Arnd Bergmann , Al Viro , Dave Hansen , =?UTF-8?Q?Ren=c3=a9_Nyffenegger?= , Andrew Morton , Kees Cook , "Paul E . McKenney" , Andy Lutomirski , Ard Biesheuvel , Nicolas Pitre , Petr Mladek , Sebastian Andrzej Siewior , Sergey Senozhatsky , Helge Deller , Rik van Riel , John Stultz , Thomas Gleixner , Oleg Nesterov , Stephen Smalley , Pavel Tikhomirov , Frederic Weisbecker , Stanislav Kinsburskiy , Ingo Molnar , Paolo Bonzini , Dmitry Safonov , Borislav Petkov , Josh Poimboeuf , Brian Gerst , Jan Beulich , Christian Borntraeger , Fenghua Yu , He Chen , Russell King , Vladimir Murzin , Will Deacon , Catalin Marinas , Mark Rutland , James Morse , "David A . Long" , Pratyush Anand , Laura Abbott , Andre Przywara , Chris Metcalf , linux-s390 , LKML , Linux API , the arch/x86 maintainers , "linux-arm-kernel@lists.infradead.org" , Kernel Hardening List-ID: On 03/14/17 09:29, Thomas Garnier wrote: > > We want to enforce the address limit by default, not only when > CONFIG_BUG_ON_DATA_CORRUPTION is enabled. I tested this one: > > /* Check user-mode state on fast path return. */ > movq PER_CPU_VAR(current_task), %rax > btq $63, TASK_addr_limit(%rax) > jnc 1f > #ifdef CONFIG_BUG_ON_DATA_CORRUPTION > call syscall_return_slowpath > jmp return_from_SYSCALL_64 > #else > movq $TASK_SIZE_MAX, %rcx > movq %rcx, TASK_addr_limit(%rax) > #endif > 1: > > I saw that syscall_return_slowpath is supposed to be called not jumped > to. I could just call verify_pre_usermode_state that would be about > the same. I wanted to comment on that thing: why on earth isn't verify_pre_usermode_state() an inline? Making it an out-of-line function adds pointless extra overhead to the C code when we are talking about a few instructions. Second, you never do a branch-around to handle an exceptional condition on the fast path: you jump *out of line* to handle the special condition; a forward branch is preferred since it is slightly more likely to be predicted not taken. Now, I finally had a chance to actually look at the full file (I was preoccupied yesterday), and am a bit disappointed, to say the least. First of all, the jump target you need is only a handful of instructions further down the code path; you need to do *exactly* that is done when the test of _TIF_ALLWORK_MASK right above is tested! Not only that, but you already have PER_CPU_VAR(current_task) in %r11 just ready to be used! This was all in the three instructions immediately prior to the code you modified... So, all you'd need would be: movq $TASK_SIZE_MAX, %rcx cmpq %rcx, TASK_addr_limit(%r11) jne 1f We even get a short jump instruction! (Using bt saves one more instruction, but see previous caveats about it.) -hpa