From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933169AbdELW6Q (ORCPT); Fri, 12 May 2017 18:58:16 -0400
Received: from zeniv.linux.org.uk ([195.92.253.2]:53150 "EHLO
	ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750796AbdELW6O (ORCPT);
	Fri, 12 May 2017 18:58:14 -0400
Date: Fri, 12 May 2017 23:57:55 +0100
From: Al Viro
To: Rik van Riel
Cc: Kees Cook, Russell King - ARM Linux, Linus Torvalds, Mark Rutland,
	Kernel Hardening, Greg KH, Heiko Carstens, LKML, David Howells,
	Dave Hansen, "H . Peter Anvin", Ingo Molnar, Pavel Tikhomirov,
	linux-s390, the arch/x86 maintainers, Will Deacon,
	Christian Borntraeger, René Nyffenegger, Catalin Marinas,
	"Paul E . McKenney", Peter Zijlstra, Arnd Bergmann, Brian Gerst,
	Borislav Petkov, Andy Lutomirski, Josh Poimboeuf, Thomas Gleixner,
	Ingo Molnar, "linux-arm-kernel@lists.infradead.org", Linux API,
	Oleg Nesterov, Daniel Micay, James Morse, "Eric W . Biederman",
	Martin Schwidefsky, Paolo Bonzini, Andrew Morton, Thomas Garnier,
	"Kirill A . Shutemov"
Subject: Re: [kernel-hardening] Re: [PATCH v9 1/4] syscalls: Verify address
	limit before returning to user-mode
Message-ID: <20170512225755.GU390@ZenIV.linux.org.uk>
References: <20170512075458.09a3a1ce@mschwideX1>
	<20170512202106.GO22219@n2100.armlinux.org.uk>
	<20170512210645.GS390@ZenIV.linux.org.uk>
	<20170512214144.GT390@ZenIV.linux.org.uk>
	<1494625675.29205.21.camel@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1494625675.29205.21.camel@redhat.com>
User-Agent: Mutt/1.8.0 (2017-02-23)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, May 12, 2017 at 05:47:55PM -0400, Rik van Riel wrote:
> > Seriously, look at these beasts.
> > Overwriting ->addr_limit is nowhere near the top threat.  If attacker
> > can overwrite thread_info, you have lost.
>
> That is why THREAD_INFO_IN_TASK exists. It moves
> the struct thread_info to a location away from the
> stack, which means a stack overflow will not overwrite
> the thread_info.

... in which case such attacks on ->addr_limit also become a non-issue.

AFAICS, we are mixing several unrelated issues here:
* number of places where set_fs() is called.  Sure, reducing it is a good
  idea and we want to move to primitives like kernel_write() et al.  Fewer
  users => lower odds of screwing it up.
* making sure that remaining callers are properly paired.  Ditto.
* switching to ->read_iter()/->write_iter() where it makes sense.  Again,
  no problem with that.
* providing sane environment for places like perf/oprofile.  Again, a good
  idea, and set_fs(USER_DS) is only a part of what's needed there.
* switching _everything_ to ->read_iter()/->write_iter().  Flat-out insane
  and AFAICS nobody is signing up for that.
* getting rid of set_fs() entirely.  I'm afraid that it's not feasible
  without the previous one and frankly, I don't see much point.
* sanity-checking on return to userland.  Maybe useful, maybe not.
* taking thread_info out of the way of stack overflows.  Reasonable, but
  has very little to do with the rest of that.
* protecting against Lovecraftian horrors slithering in from outer space
  only to commit unspeakable acts against ->addr_limit and ignoring much
  tastier targets next to it, but then what do you expect from degenerate
  spawn of Great Old Ones - sanity?
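
For readers who want the "properly paired" and "sanity-checking on return
to userland" items spelled out, here is a rough userspace model.  It is
only a sketch under loud assumptions: every toy_/TOY_ name is hypothetical,
the limit values are arbitrary, and none of this is kernel code or the
patch under discussion.  It merely shows how an unpaired set_fs()-style
call leaves a widened limit behind and how an exit-time check could notice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits for the model; real values are per-architecture. */
#define TOY_USER_DS   ((uintptr_t)0x7fffffffu)
#define TOY_KERNEL_DS (UINTPTR_MAX)

/* Toy stand-in for the per-task state holding ->addr_limit. */
struct toy_task {
	uintptr_t addr_limit;
};

struct toy_task toy_current = { .addr_limit = TOY_USER_DS };

uintptr_t toy_get_fs(void)
{
	return toy_current.addr_limit;
}

void toy_set_fs(uintptr_t limit)
{
	toy_current.addr_limit = limit;
}

/* Range check against the current limit, written to avoid overflow. */
bool toy_access_ok(uintptr_t addr, size_t len)
{
	uintptr_t limit = toy_current.addr_limit;

	return len <= limit && addr <= limit - len;
}

/* Properly paired caller: save, widen, do the work, restore. */
bool toy_op_paired(uintptr_t kaddr, size_t len)
{
	uintptr_t old_fs = toy_get_fs();
	bool ok;

	toy_set_fs(TOY_KERNEL_DS);
	ok = toy_access_ok(kaddr, len);
	toy_set_fs(old_fs);
	return ok;
}

/* Buggy caller: widens the limit and forgets to restore it. */
bool toy_op_leaky(uintptr_t kaddr, size_t len)
{
	toy_set_fs(TOY_KERNEL_DS);
	return toy_access_ok(kaddr, len);
}

/* Exit-time check: true if the limit is back at the user value. */
bool toy_exit_check(void)
{
	return toy_get_fs() == TOY_USER_DS;
}

/* Small walk-through; returns 0 if every expectation holds. */
int toy_demo(void)
{
	const uintptr_t kaddr = (uintptr_t)0xc0000000u; /* above TOY_USER_DS */

	toy_set_fs(TOY_USER_DS);
	assert(!toy_access_ok(kaddr, 16));   /* rejected under user limit */
	assert(toy_op_paired(kaddr, 16));    /* allowed while widened */
	assert(toy_exit_check());            /* limit was restored */
	assert(toy_op_leaky(kaddr, 16));
	assert(!toy_exit_check());           /* leak is visible at "exit" */
	toy_set_fs(TOY_USER_DS);             /* clean up the model state */
	return 0;
}
```

Calling toy_demo() from a main() exercises both callers.  Note that the
model says nothing about an attacker who can write to the task state
directly, which is the separate question raised in the thread.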