From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Alcock Date: Fri, 27 May 2016 21:44:56 +0000 Subject: Re: [4.1.x -- 4.6.x and probably HEAD] Reproducible unprivileged panic/TLB BUG on sparc via a stack- Message-Id: <8760tz2n1j.fsf@esperi.org.uk> List-Id: References: <87fut34unx.fsf@esperi.org.uk> <87fut34unx.fsf@esperi.org.uk> <87twhj3ag0.fsf@esperi.org.uk> <20160527.123731.2105286005500436503.davem@davemloft.net> In-Reply-To: <20160527.123731.2105286005500436503.davem@davemloft.net> (David Miller's message of "Fri, 27 May 2016 12:37:31 -0700 (PDT)") MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: David Miller Cc: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, fweimer@redhat.com On 27 May 2016, David Miller stated: > From: Nick Alcock > Date: Fri, 27 May 2016 14:19:27 +0100 > >> The only difference between the two series above is that in the crashing >> series, the ka_restorer stub functions __rt_sigreturn_stub and >> __sigreturn_stub (on sparc32) and __rt_sigreturn_stub (on sparc64) get >> stack-protected; in the non-crashing series, they do not; the same is >> true without --enable-stack-protector=all, because the functions have no >> local variables at all, so without -fstack-protector-all they don't get >> stack-protected in any case. Passing such a stack-protected function in >> as the ka_restorer stub seems to suffice to cause this crash at some >> later date. I'm wondering if the stack canary is clobbering something >> that the caller does not expect to be clobbered: we saw this cause >> trouble on x86 in a different context (see upstream commit >> 7a25d6a84df9fea56963569ceccaaf7c2a88f161). > > This is amazing that it makes a difference since the sigreturn stub is > implemented entirely in inline assembler :-) I was fairly surprised as well, but not shocked, because people who write a function that consists of one single inline assembler instruction might well be rather surprised to find a massive pile of prologue and epilogue code dumped around it! > Normally the 64-bit stub is emitted as: > > __rt_sigreturn_stub: > mov 101, %g1 > ta 0x6d > > and with -fstack-protector-all we get: > > __rt_sigreturn_stub: > save %sp, -192, %sp > ldx [%g7+40], %g1 > stx %g1, [%fp+2039] > mov 0, %g1 > > mov 101, %g1 > ta 0x6d > > ldx [%fp+2039], %g1 > ldx [%g7+40], %g2 > xor %g1, %g2, %g1 > mov 0, %g2 > brnz,pn %g1, .LL4 > nop > return %i7+8 > nop > .LL4: > call __stack_chk_fail, 0 > nop > nop > > That 'save' is the problem. > > One can't change the register window or the stack pointer in this > function, as the kernel has setup the restore frame at a precise > location relative to the stack pointer when the stub is invoked. Oops! > Basically, do_rt_sigreturn is restoring garbage into the cpu > registers. Oh gods is it supposed to do register restoration? i.e. the usual ABI rules in re stack changes, etc just don't apply to it? Right, that's a disaster for stack-protection, obviously. The stack-protector prologue/epilogue does rather assume that it's being wrapped around a function, and in a very real sense this thing isn't a function in the normal sense at all. This is exactly what I thought was going on with the x86 code, but in the end that turned out to be a simple case of the (assembly) caller assuming a call-clobbered register had survived unchanged when the stack-protector epilogue had clobbered it (as it was quite within its rights to). > It obviously shouldn't crash, which I'll look into, but it is clear > that we can't enable -fstack-protector-all for this function. And now I have a good explanation of why that is for the commit log. Thank you! > So far I'm playing with the patch below to do some basic sanity > checks on the values inside of the sigreturn frame: Good move. Segfaulting the process is fine! :) Any process that does this sort of thing is clearly either terminally buggy, written by an idiot who doesn't know what he's doing (i.e. my original patch) or malicious. These all deserve SEGVs. (I still don't understand why this leads to spurious TLB faults, though. Filling the userland CPU registers with garbage is bad, but should still be reasonably harmless to the kernel, surely?) -- NULL && (void) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756214AbcE0VpJ (ORCPT ); Fri, 27 May 2016 17:45:09 -0400 Received: from icebox.esperi.org.uk ([81.187.191.129]:41602 "EHLO mail.esperi.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752616AbcE0VpH (ORCPT ); Fri, 27 May 2016 17:45:07 -0400 From: Nick Alcock To: David Miller Cc: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, fweimer@redhat.com Subject: Re: [4.1.x -- 4.6.x and probably HEAD] Reproducible unprivileged panic/TLB BUG on sparc via a stack-protected rt_sigaction() ka_restorer, courtesy of the glibc testsuite References: <87fut34unx.fsf@esperi.org.uk> <87fut34unx.fsf@esperi.org.uk> <87twhj3ag0.fsf@esperi.org.uk> <20160527.123731.2105286005500436503.davem@davemloft.net> Emacs: no job too big... no job. Date: Fri, 27 May 2016 22:44:56 +0100 In-Reply-To: <20160527.123731.2105286005500436503.davem@davemloft.net> (David Miller's message of "Fri, 27 May 2016 12:37:31 -0700 (PDT)") Message-ID: <8760tz2n1j.fsf@esperi.org.uk> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-DCC-URT-Metrics: spindle 1060; Body=4 Fuz1=4 Fuz2=4 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 27 May 2016, David Miller stated: > From: Nick Alcock > Date: Fri, 27 May 2016 14:19:27 +0100 > >> The only difference between the two series above is that in the crashing >> series, the ka_restorer stub functions __rt_sigreturn_stub and >> __sigreturn_stub (on sparc32) and __rt_sigreturn_stub (on sparc64) get >> stack-protected; in the non-crashing series, they do not; the same is >> true without --enable-stack-protector=all, because the functions have no >> local variables at all, so without -fstack-protector-all they don't get >> stack-protected in any case. Passing such a stack-protected function in >> as the ka_restorer stub seems to suffice to cause this crash at some >> later date. I'm wondering if the stack canary is clobbering something >> that the caller does not expect to be clobbered: we saw this cause >> trouble on x86 in a different context (see upstream commit >> 7a25d6a84df9fea56963569ceccaaf7c2a88f161). > > This is amazing that it makes a difference since the sigreturn stub is > implemented entirely in inline assembler :-) I was fairly surprised as well, but not shocked, because people who write a function that consists of one single inline assembler instruction might well be rather surprised to find a massive pile of prologue and epilogue code dumped around it! > Normally the 64-bit stub is emitted as: > > __rt_sigreturn_stub: > mov 101, %g1 > ta 0x6d > > and with -fstack-protector-all we get: > > __rt_sigreturn_stub: > save %sp, -192, %sp > ldx [%g7+40], %g1 > stx %g1, [%fp+2039] > mov 0, %g1 > > mov 101, %g1 > ta 0x6d > > ldx [%fp+2039], %g1 > ldx [%g7+40], %g2 > xor %g1, %g2, %g1 > mov 0, %g2 > brnz,pn %g1, .LL4 > nop > return %i7+8 > nop > .LL4: > call __stack_chk_fail, 0 > nop > nop > > That 'save' is the problem. > > One can't change the register window or the stack pointer in this > function, as the kernel has setup the restore frame at a precise > location relative to the stack pointer when the stub is invoked. Oops! > Basically, do_rt_sigreturn is restoring garbage into the cpu > registers. Oh gods is it supposed to do register restoration? i.e. the usual ABI rules in re stack changes, etc just don't apply to it? Right, that's a disaster for stack-protection, obviously. The stack-protector prologue/epilogue does rather assume that it's being wrapped around a function, and in a very real sense this thing isn't a function in the normal sense at all. This is exactly what I thought was going on with the x86 code, but in the end that turned out to be a simple case of the (assembly) caller assuming a call-clobbered register had survived unchanged when the stack-protector epilogue had clobbered it (as it was quite within its rights to). > It obviously shouldn't crash, which I'll look into, but it is clear > that we can't enable -fstack-protector-all for this function. And now I have a good explanation of why that is for the commit log. Thank you! > So far I'm playing with the patch below to do some basic sanity > checks on the values inside of the sigreturn frame: Good move. Segfaulting the process is fine! :) Any process that does this sort of thing is clearly either terminally buggy, written by an idiot who doesn't know what he's doing (i.e. my original patch) or malicious. These all deserve SEGVs. (I still don't understand why this leads to spurious TLB faults, though. Filling the userland CPU registers with garbage is bad, but should still be reasonably harmless to the kernel, surely?) -- NULL && (void)