From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>,
linux-parisc <linux-parisc@vger.kernel.org>,
John David Anglin <dave.anglin@bell.net>
Subject: Re: [PATCH] parisc: Fix syscall restarts
Date: Mon, 21 Dec 2015 14:39:01 +0000 (UTC) [thread overview]
Message-ID: <1067033558.279034.1450708741358.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <1450685026.2256.20.camel@HansenPartnership.com>
----- On Dec 21, 2015, at 3:03 AM, James Bottomley James.Bottomley@HansenPartnership.com wrote:
> On Sun, 2015-12-20 at 21:35 +0100, Helge Deller wrote:
>> On 20.12.2015 17:50, James Bottomley wrote:
>> > On Sun, 2015-12-20 at 16:49 +0100, Helge Deller wrote:
>> > > On 20.12.2015 15:09, Mathieu Desnoyers wrote:
>> > > > ----- On Dec 20, 2015, at 8:59 AM, Mathieu Desnoyers
>> > > > mathieu.desnoyers@efficios.com wrote:
>> > > >
>> > > > > ----- On Dec 18, 2015, at 6:30 PM, Helge Deller deller@gmx.de
>> > > > > wro
>> > > > > te:
>> > > > >
>> > > > > > On parisc syscalls which are interrupted by signals
>> > > > > > sometimes
>> > > > > > fail to restart
>> > > > > > and instead return -ENOSYS which then in the worst case
>> > > > > > lead to
>> > > > > > userspace
>> > > > > > crashes.
>> > > > > > A similiar problem existed on MIPS and was fixed by commit
>> > > > > > e967ef02
>> > > > > > ("MIPS: Fix restart of indirect syscalls").
>> > > > > >
>> > > > > > On parisc the current syscall restart code assumes hat the
>> > > > > > syscall number is
>> > > > > > always loaded in the delay branch of the ble instruction as
>> > > > > > defined in the
>> > > > > > unistd.h header file and as such never restored %r20 before
>> > > > > > returning to
>> > > > > > userspace:
>> > > > > > ble 0x100(%sr2, %r0)
>> > > > > > ldi #syscall_nr, %r20
>> > > > > >
>> > > > > > This assumption is at least not true for code which uses
>> > > > > > the
>> > > > > > syscall() glibc
>> > > > > > function, which instead uses this syntax:
>> > > > > > ble 0x100(%sr2, %r0)
>> > > > > > copy regX, %r20
>> > > > > > where regX depend on how the compiler optimizes the code
>> > > > > > and
>> > > > > > register usage.
>> > > > > >
>> > > > > > This patch fixes this problem by adding code to analyze how
>> > > > > > the
>> > > > > > syscall number
>> > > > > > is loaded in the delay branch and - if needed - copy the
>> > > > > > syscall number to regX
>> > > > > > prior returning to userspace for the syscall restart.
>> > > > > >
>> > > > > > Signed-off-by: Helge Deller <deller@gmx.de>
>> > > > > > Cc: stable@vger.kernel.org
>> > > > > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> > > > > >
>> > > > > > diff --git a/arch/parisc/kernel/signal.c
>> > > > > > b/arch/parisc/kernel/signal.c
>> > > > > > index dc1ea79..b0414ad 100644
>> > > > > > --- a/arch/parisc/kernel/signal.c
>> > > > > > +++ b/arch/parisc/kernel/signal.c
>> > > > > > @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig,
>> > > > > > struct
>> > > > > > pt_regs *regs,
>> > > > > > int in_syscall)
>> > > > > > regs->gr[28]);
>> > > > > > }
>> > > > > >
>> > > > > > +/*
>> > > > > > + * Check the delay branch in userspace how the syscall
>> > > > > > number
>> > > > > > gets loaded into
>> > > > > > + * %r20 and adjust as needed.
>> > > > >
>> > > > > I'm pretty sure "Check the delay branch in userspace how the
>> > > > > syscall..."
>> > > > > is not an English construct. ;-) Suggested rewording:
>> > > > >
>> > > > > "Check how the syscall number gets loaded into %r20 within
>> > > > > the delay branch in userspace and adjust as needed."
>> > >
>> > > Thanks!
>> > > I'll change that.
>> > >
>> > >
>> > > > > > + */
>> > > > > > +
>> > > > > > +static void check_syscallno_in_delay_branch(struct pt_regs
>> > > > > > *regs)
>> > > > > > +{
>> > > > > > + unsigned int opcode, source_reg;
>> > > > >
>> > > > > Why "unsigned int" above rather than u32 ? Since we're using
>> > > > > opcode as target variable for a get_user, it would be clearer
>> > > > > if the type of the __user * match the type of the target
>> > > > > kernel
>> > > > > variable. (understood that those happen to have the same
>> > > > > bitness
>> > > > > and type size on all Linux architectures, but it would be
>> > > > > clearer
>> > > > > nevertheless).
>> > >
>> > > Yes, seems OK.
>> > > I'll change that.
>> > >
>> > > > > > + u32 __user *uaddr;
>> > > > > > +
>> > > > > > + /* Usually we don't have to restore %r20 (the
>> > > > > > system
>> > > > > > call number)
>> > > > > > + * because it gets loaded in the delay slot of the
>> > > > > > branch external
>> > > > > > + * instruction via the ldi instruction.
>> > > > > > + * In some cases a register-to-register copy
>> > > > > > instruction might have
>> > > > > > + * been used instead, in which case we need to
>> > > > > > copy
>> > > > > > the syscall
>> > > > > > + * number into the source register before
>> > > > > > returning to
>> > > > > > userspace.
>> > > > > > + */
>> > > > > > +
>> > > > > > + /* A syscall is just a branch, so all
>> > > > > > + * we have to do is fiddle the return pointer.
>> > > > > > + */
>> > > > > > + regs->gr[31] -= 8; /* delayed branching */
>> > > > > > +
>> > > > > > + /* Get assembler opcode of code in delay branch */
>> > > > > > + uaddr = (unsigned int *) (regs->gr[31] + 1);
>> > > > > > + get_user(opcode, uaddr);
>> > > > >
>> > > > > get_user() can fail due to EFAULT. This error should be
>> > > > > handled here, otherwise this could lead to the following
>> > > > > code using an uninitialized opcode variable, which could
>> > > > > indirectly leak a few bits of kernel stack information
>> > > > > to userspace (security concern). One attack vector I have
>> > > > > in mind for this is ptrace(), which might be able to tweak
>> > > > > those register values.
>> > >
>> > > Yes, generally get_user() can fail.
>> > > But this would be rather strange in that case, because
>> > > the syscall was started by userspace from this address.
>> > > So, without the code at that address in userspace, we would
>> > > never have reached this get_user().
>> >
>> > Actually, that's not necessarily a safe assumption. Any memory
>> > allocation in a syscall path (except GFP_ATOMIC) can trigger
>> > reclaim
>> > and since this is a signal restart path, that's entirely possible.
>> > Reclaim could pull the backing page out from under the syscall, so
>> > in
>> > a low memory situation it is possible get_user() could fail with
>> > EFAULT
>>
>> Really?
>> Maybe I misunderstood...?
>> So, let's say we have low memory and the kernel "swapped" out the
>> userspace.
>> I assume that when there is no memory pressure get_user() would
>> pull the page in again, and if it's memory pressure, then with the
>> assumption
>> there is no memory left it probably return EFAULT (is that true?).
>> But what happens then when we return to userspace?
>> I expect userspace then to segfault.
>>
>> I will add the check for EFAULT, but just trying to understand...
>
> Actually, you're right, I was misremembering why these functions return
> EFAULT. It's if the access is invalid. They will actually try to page
> back in via the fault handler if the backing has gone.
I think you could still trigger the EFAULT by having a
concurrent thread running within the same process address
space invoking munmap on the memory range that
includes the get_user() target memory area.
This would clearly be a hostile user-space application,
but it's important to validate all user inputs very
carefully, especially for misbehaving applications.
Thanks,
Mathieu
>
> James
>
>> > unless get_user_page() has been called somewhere to pin the page.
>>
>> Helge
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux
>> -parisc" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
next prev parent reply other threads:[~2015-12-21 14:39 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-12-18 23:30 [PATCH] parisc: Fix syscall restarts Helge Deller
2015-12-20 13:59 ` Mathieu Desnoyers
2015-12-20 14:09 ` Mathieu Desnoyers
2015-12-20 15:49 ` Helge Deller
2015-12-20 16:50 ` James Bottomley
2015-12-20 20:35 ` Helge Deller
2015-12-21 8:03 ` James Bottomley
2015-12-21 14:39 ` Mathieu Desnoyers [this message]
2015-12-20 18:31 ` John David Anglin
2015-12-20 19:32 ` Helge Deller
2015-12-20 19:46 ` John David Anglin
2015-12-20 20:06 ` Helge Deller
2015-12-20 23:57 ` John David Anglin
2015-12-21 14:42 ` Mathieu Desnoyers
2015-12-21 15:12 ` John David Anglin
2015-12-20 19:39 ` John David Anglin
2015-12-20 19:48 ` Helge Deller
2015-12-20 20:01 ` John David Anglin
2015-12-20 20:18 ` Helge Deller
2015-12-20 20:45 ` John David Anglin
2015-12-20 20:14 ` John David Anglin
2015-12-20 20:19 ` Helge Deller
2015-12-20 20:21 ` Helge Deller
2015-12-20 20:53 ` John David Anglin
2015-12-21 9:19 ` [PATCH] parisc: Fix syscall restarts (v2) Helge Deller
2015-12-21 13:11 ` John David Anglin
2015-12-21 20:27 ` Mathieu Desnoyers
2015-12-21 20:54 ` Helge Deller
2015-12-24 16:07 ` Mathieu Desnoyers
2015-12-24 16:51 ` John David Anglin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1067033558.279034.1450708741358.JavaMail.zimbra@efficios.com \
--to=mathieu.desnoyers@efficios.com \
--cc=James.Bottomley@HansenPartnership.com \
--cc=dave.anglin@bell.net \
--cc=deller@gmx.de \
--cc=linux-parisc@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox