public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* access_ok macor
@ 2009-07-14 12:56 Michal Simek
  2009-07-14 13:21 ` Arnd Bergmann
       [not found] ` <200907141652.59049.arnd@arndb.de>
  0 siblings, 2 replies; 15+ messages in thread
From: Michal Simek @ 2009-07-14 12:56 UTC (permalink / raw)
  To: Linux Kernel list, Arnd Bergmann, LTP

Hi,

I am trying to solve one thing around access_ok macro.
Microblaze memory map is below
Text address for user app is 0x1000 0000
for library 0x4800 0000
and stack below 0xc000 0000


# cat /proc/1/maps
10000000-10106000 r-xp 00000000 00:01 379        /bin/busybox
10106000-10107000 rw-p 00106000 00:01 379        /bin/busybox
10107000-1012a000 rwxp 00000000 00:00 0          [heap]
48000000-4801c000 r-xp 00000000 00:01 400        /lib/ld-2.3.3.so
4801c000-4801e000 rw-p 0001b000 00:01 400        /lib/ld-2.3.3.so
4801e000-480ec000 r-xp 00000000 00:01 396        /lib/libm-2.3.3.so
480ec000-480ee000 rw-p 000ce000 00:01 396        /lib/libm-2.3.3.so
480ee000-4824c000 r-xp 00000000 00:01 390        /lib/libc-2.3.3.so
4824c000-4824f000 r--p 0015d000 00:01 390        /lib/libc-2.3.3.so
4824f000-48251000 rw-p 00160000 00:01 390        /lib/libc-2.3.3.so
48251000-48255000 rw-p 00000000 00:00 0
bfa38000-bfa4d000 rwxp 00000000 00:00 0          [stack]
#

I found that I can setup text base in binutils/ld/emulparam/elf32mb_linux.sh

The problem which I have is that if I run socketpair, getsockname, getpeername LTP
tests with invalid salen pointer there are addresses close to 0x0. Microblaze
has no text there and the sigsegv fault is generated.

This fault could be fixed by changed access_ok macro where I check bottom limit
at 0x1000 0000 too. After this change the LTP program not failed but I am not sure
if is the right solution because none arch do it. All archs just check upper limit
not lower.

What is the correct solution for it? Moving .text base to 0x0 or is there any other
elegant solution?

Thanks,
Michal




-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-14 12:56 access_ok macor Michal Simek
@ 2009-07-14 13:21 ` Arnd Bergmann
  2009-07-14 13:45   ` Michal Simek
       [not found] ` <200907141652.59049.arnd@arndb.de>
  1 sibling, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2009-07-14 13:21 UTC (permalink / raw)
  To: monstr; +Cc: Linux Kernel list, LTP

On Tuesday 14 July 2009, Michal Simek wrote:
> I found that I can setup text base in binutils/ld/emulparam/elf32mb_linux.sh
> 
> The problem which I have is that if I run socketpair, getsockname, getpeername LTP
> tests with invalid salen pointer there are addresses close to 0x0. Microblaze
> has no text there and the sigsegv fault is generated.

This sounds like a classic NULL pointer dereference that is handled correctly
by the kernel. The question is where the address came from.

> This fault could be fixed by changed access_ok macro where I check bottom limit
> at 0x1000 0000 too. After this change the LTP program not failed but I am not sure
> if is the right solution because none arch do it. All archs just check upper limit
> not lower.
> 
> What is the correct solution for it? Moving .text base to 0x0 or is there any other
> elegant solution?

Moving .text is not the right solution, because it only papers over real bugs.
access_ok() is also not the right place to check this, the only purpose it has
is to make sure that the argument is not a valid kernel address but either a
valid user address or possibly invalid address. Also, access_ok() is only used
together with the copy_from/to_user and get/put_user function families. These
need to catch invalid addresses with a fixup table entry in the kernel.

I briefly looked at your implementation but could not find any problems in
this area. Could you use gdb to find out whether the sigsegv happens in the
kernel at all, or in user space?

	Arnd <><

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-14 13:21 ` Arnd Bergmann
@ 2009-07-14 13:45   ` Michal Simek
  2009-07-14 14:45     ` Arnd Bergmann
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Simek @ 2009-07-14 13:45 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Linux Kernel list, LTP

Arnd Bergmann wrote:
> On Tuesday 14 July 2009, Michal Simek wrote:
>> I found that I can setup text base in binutils/ld/emulparam/elf32mb_linux.sh
>>
>> The problem which I have is that if I run socketpair, getsockname, getpeername LTP
>> tests with invalid salen pointer there are addresses close to 0x0. Microblaze
>> has no text there and the sigsegv fault is generated.
> 
> This sounds like a classic NULL pointer dereference that is handled correctly
> by the kernel. The question is where the address came from.

It is not anly NULL pointer - is LTP tests are some fake addresses. From my tests I see
that I am not able to access place till 1000 0000 in dec.
Bad address come from tests to test it.

Look at
http://developer.petalogix.com/git/gitweb.cgi?p=ltp-microblaze.git;a=commitdiff;h=45f4cd783ce8b94f1267bb87c0c46e8536f62eca

There are three affected tests and my quick fixes which I am trying to solve now.


> 
>> This fault could be fixed by changed access_ok macro where I check bottom limit
>> at 0x1000 0000 too. After this change the LTP program not failed but I am not sure
>> if is the right solution because none arch do it. All archs just check upper limit
>> not lower.
>>
>> What is the correct solution for it? Moving .text base to 0x0 or is there any other
>> elegant solution?
> 
> Moving .text is not the right solution, because it only papers over real bugs.

I can confirm it - I moved it and rebuild toolchain.

> access_ok() is also not the right place to check this, the only purpose it has
> is to make sure that the argument is not a valid kernel address but either a
> valid user address or possibly invalid address. Also, access_ok() is only used
> together with the copy_from/to_user and get/put_user function families. These
> need to catch invalid addresses with a fixup table entry in the kernel.

ok - that mean that problem could be in bad fixup table?


> 
> I briefly looked at your implementation but could not find any problems in
> this area. Could you use gdb to find out whether the sigsegv happens in the
> kernel at all, or in user space?

We don't have gdb in place.

The problem should come from get_user macro. net/socket.c:212
I was looking for it in the morning. I am checking it again.

Thanks,
Michal


int move_addr_to_user(struct sockaddr *kaddr, int klen, void __user *uaddr,
		      int __user *ulen)
{
	int err;
	int len;

	err = get_user(len, ulen);
	if (err)
		return err;


> 
> 	Arnd <><


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-14 13:45   ` Michal Simek
@ 2009-07-14 14:45     ` Arnd Bergmann
  2009-07-14 15:06       ` Michal Simek
  0 siblings, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2009-07-14 14:45 UTC (permalink / raw)
  To: monstr; +Cc: Linux Kernel list, LTP

On Tuesday 14 July 2009, Michal Simek wrote:
> Arnd Bergmann wrote:
> Look at
> http://developer.petalogix.com/git/gitweb.cgi?p=ltp-microblaze.git;a=commitdiff;h=45f4cd783ce8b94f1267bb87c0c46e8536f62eca
> 
> There are three affected tests and my quick fixes which I am trying to solve now.
> 

ok, I see.
 
> int move_addr_to_user(struct sockaddr *kaddr, int klen, void __user *uaddr,
> 		      int __user *ulen)
> {
> 	int err;
> 	int len;
> 
> 	err = get_user(len, ulen);
> 	if (err)
> 		return err;
> 

So the code looks something like

                        "1:     lw      %1, %2, r0;                     \
                                addk    %0, r0, r0;                     \
                        2:                                              \
                        .section .fixup,\"ax\";                         \
                        3:      brid    2b;                             \
                                addik   %0, r0, %3;                     \
                        .previous;                                      \
                        .section ,\"a\";                      		\
                        .word   1b,3b;                                  \
                        .previous;"                                     \

Not much that can go wrong there. First of all, I'd check that the
code actually looks the same in the binary. I assume that the 'addik'
gets executed when the brid branches, right?

I would guess that some of the logic in do_page_fault might be
broken and does not actually call the fixup.

	Arnd <><

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-14 14:45     ` Arnd Bergmann
@ 2009-07-14 15:06       ` Michal Simek
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Simek @ 2009-07-14 15:06 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Linux Kernel list, LTP

Arnd Bergmann wrote:
> On Tuesday 14 July 2009, Michal Simek wrote:
>> Arnd Bergmann wrote:
>> Look at
>> http://developer.petalogix.com/git/gitweb.cgi?p=ltp-microblaze.git;a=commitdiff;h=45f4cd783ce8b94f1267bb87c0c46e8536f62eca
>>
>> There are three affected tests and my quick fixes which I am trying to solve now.
>>
> 
> ok, I see.
>  
>> int move_addr_to_user(struct sockaddr *kaddr, int klen, void __user *uaddr,
>> 		      int __user *ulen)
>> {
>> 	int err;
>> 	int len;
>>
>> 	err = get_user(len, ulen);
>> 	if (err)
>> 		return err;
>>
> 
> So the code looks something like
> 
>                         "1:     lw      %1, %2, r0;                     \
>                                 addk    %0, r0, r0;                     \
>                         2:                                              \
>                         .section .fixup,\"ax\";                         \
>                         3:      brid    2b;                             \
>                                 addik   %0, r0, %3;                     \
>                         .previous;                                      \
>                         .section ,\"a\";                      		\
>                         .word   1b,3b;                                  \
>                         .previous;"                                     \

yes,

> 
> Not much that can go wrong there. First of all, I'd check that the
> code actually looks the same in the binary. I assume that the 'addik'
> gets executed when the brid branches, right?

yes, it should bri-d meant branch with delay slot and addik is in delay.

> 
> I would guess that some of the logic in do_page_fault might be
> broken and does not actually call the fixup.

ok.

Michal

> 
> 	Arnd <><


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
       [not found]   ` <4A5CAEFF.9080206@monstr.eu>
@ 2009-07-14 16:43     ` Arnd Bergmann
  2009-07-14 16:56       ` Michal Simek
       [not found]       ` <9e6f3dfd0907141811p512b4edp3f9dd0fdeae1123e@mail.gmail.com>
  0 siblings, 2 replies; 15+ messages in thread
From: Arnd Bergmann @ 2009-07-14 16:43 UTC (permalink / raw)
  To: monstr; +Cc: Linux Kernel list, LTP

On Tuesday 14 July 2009, Michal Simek wrote:
> Arnd Bergmann wrote:

> >>  r29=00000000, r30=00000000, r31=CE9759A4, rPC=C000123C
> >>  msr=800045AE, ear=00000001, esr=000000B2, fsr=000080D0
> >> Segmentation fault
> >>
> > 
> > I guess then you should check if 0xc000123c is in your
> > exception table, or why it is not.
> 
> on that address is load instruction for unaligned exception because addr is odd number
> that's why is called unaligned exception handler and from this function
> is called load instruction which failed. :-(
> 
> Currently this make more sense why that tests failed. If that pointers are
> even number exception is not taken and exception sure don't have fixup for it because
> this is generic code. :-(
> 
> That's the problem because we are looking for regs->pc but this point to unaligned exception
> handler.

Ok, that makes a lot of sense.

The solution then is to handle fixups from the unaligned exception handler
if you come from the kernel. That should fix the three text cases.

I don't fully understand your exception handling there, but I think you
also need to add code checking for __range_ok() to your unaligned handler,
to prevent malicious user space code from accessing the kernel through
unaligned pointers.

	Arnd <><

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-14 16:43     ` Arnd Bergmann
@ 2009-07-14 16:56       ` Michal Simek
  2009-07-14 17:13         ` Arnd Bergmann
       [not found]       ` <9e6f3dfd0907141811p512b4edp3f9dd0fdeae1123e@mail.gmail.com>
  1 sibling, 1 reply; 15+ messages in thread
From: Michal Simek @ 2009-07-14 16:56 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Linux Kernel list, LTP

Arnd Bergmann wrote:
> On Tuesday 14 July 2009, Michal Simek wrote:
>> Arnd Bergmann wrote:
> 
>>>>  r29=00000000, r30=00000000, r31=CE9759A4, rPC=C000123C
>>>>  msr=800045AE, ear=00000001, esr=000000B2, fsr=000080D0
>>>> Segmentation fault
>>>>
>>> I guess then you should check if 0xc000123c is in your
>>> exception table, or why it is not.
>> on that address is load instruction for unaligned exception because addr is odd number
>> that's why is called unaligned exception handler and from this function
>> is called load instruction which failed. :-(
>>
>> Currently this make more sense why that tests failed. If that pointers are
>> even number exception is not taken and exception sure don't have fixup for it because
>> this is generic code. :-(
>>
>> That's the problem because we are looking for regs->pc but this point to unaligned exception
>> handler.
> 
> Ok, that makes a lot of sense.
> 
> The solution then is to handle fixups from the unaligned exception handler
> if you come from the kernel. That should fix the three text cases.
> 
> I don't fully understand your exception handling there, but I think you
> also need to add code checking for __range_ok() to your unaligned handler,
> to prevent malicious user space code from accessing the kernel through
> unaligned pointers.

when the code tried to read/write from unaligned address (and in cpu is turn on unaligned exception)
then is caused unaligned exception and asm code assemble/return value which is on that unaligned
address. (Assemble it that read/write every byte separately). That will be harder to prevent all
this cases because unaligned exception is in generic code.
What do you mean add __range_ok? Range checking is ok. The problem is when in case get_user kernel
try to load unaligned addr - unaligned exception is perform and try to load that value separately.
If that page is not there, page fault handler is called and not find it, it is performed search
from exception table and that address is not there of course - because address in pc is generic
unaligned code. I think that handling this needs more code.
Maybe if the address with from unaligned exception handler (there are some address which can caused
it) and find out which aligned address is there and find out proper fixup for it.
I think that this could work.

What do you think?

Michal



> 
> 	Arnd <><


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-14 16:56       ` Michal Simek
@ 2009-07-14 17:13         ` Arnd Bergmann
  2009-07-14 17:45           ` Michal Simek
  2009-07-15  9:21           ` Paul Mundt
  0 siblings, 2 replies; 15+ messages in thread
From: Arnd Bergmann @ 2009-07-14 17:13 UTC (permalink / raw)
  To: monstr; +Cc: Linux Kernel list, LTP

On Tuesday 14 July 2009, Michal Simek wrote:
> when the code tried to read/write from unaligned address (and in cpu is turn on unaligned exception)
> then is caused unaligned exception and asm code assemble/return value which is on that unaligned
> address. (Assemble it that read/write every byte separately). That will be harder to prevent all
> this cases because unaligned exception is in generic code.
> What do you mean add __range_ok? Range checking is ok. The problem is when in case get_user kernel
> try to load unaligned addr - unaligned exception is perform and try to load that value separately.
> If that page is not there, page fault handler is called and not find it, it is performed search
> from exception table and that address is not there of course - because address in pc is generic
> unaligned code. I think that handling this needs more code.
> Maybe if the address with from unaligned exception handler (there are some address which can caused
> it) and find out which aligned address is there and find out proper fixup for it.
> I think that this could work.
> 
> What do you think?

I think the key point is that the kernel should never try an unaligned
access. Other architectures already rely on this, so you can too.
Instead of doing an aligned access from the unaligned exception handler,
you should not access the data at all when you come from kernel mode
but only search the fixup table to see if you should continue anyway.

Sorry for adding confusion by mentioning the __range_ok, which is addressing
a separate problem. I should have made that clearer.

In pseudo-code, I think you should do:

void unaligned_exception(struct pt_regs *regs, unsigned long address)
{
	if (kernel_mode(regs)) {
		/*
		 * unaligned access not allowed in kernel,
		 * but search exception table.
		 */
		bad_page_fault(regs, address, 0);
	} else {
		if (!access_ok(address, 4))
			force_sig(SIGBUS, current);
		else
			fixup_unaligned_access(regs);
	}
}

	Arnd <><

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-14 17:13         ` Arnd Bergmann
@ 2009-07-14 17:45           ` Michal Simek
  2009-07-15  9:21           ` Paul Mundt
  1 sibling, 0 replies; 15+ messages in thread
From: Michal Simek @ 2009-07-14 17:45 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Linux Kernel list, LTP

Arnd Bergmann wrote:
> On Tuesday 14 July 2009, Michal Simek wrote:
>> when the code tried to read/write from unaligned address (and in cpu is turn on unaligned exception)
>> then is caused unaligned exception and asm code assemble/return value which is on that unaligned
>> address. (Assemble it that read/write every byte separately). That will be harder to prevent all
>> this cases because unaligned exception is in generic code.
>> What do you mean add __range_ok? Range checking is ok. The problem is when in case get_user kernel
>> try to load unaligned addr - unaligned exception is perform and try to load that value separately.
>> If that page is not there, page fault handler is called and not find it, it is performed search
>> from exception table and that address is not there of course - because address in pc is generic
>> unaligned code. I think that handling this needs more code.
>> Maybe if the address with from unaligned exception handler (there are some address which can caused
>> it) and find out which aligned address is there and find out proper fixup for it.
>> I think that this could work.
>>
>> What do you think?
> 
> I think the key point is that the kernel should never try an unaligned
> access. Other architectures already rely on this, so you can too.

I think that especially for this case get/put_user (+ user_copy macros)
kernel can do it because just copy/paste data from user space and data
can be unaligned.


> Instead of doing an aligned access from the unaligned exception handler,
> you should not access the data at all when you come from kernel mode
> but only search the fixup table to see if you should continue anyway.

> 
> Sorry for adding confusion by mentioning the __range_ok, which is addressing
> a separate problem. I should have made that clearer.

it is ok.

> 
> In pseudo-code, I think you should do:
> 
> void unaligned_exception(struct pt_regs *regs, unsigned long address)
> {
> 	if (kernel_mode(regs)) {
> 		/*
> 		 * unaligned access not allowed in kernel,
> 		 * but search exception table.
> 		 */
> 		bad_page_fault(regs, address, 0);
> 	} else {
> 		if (!access_ok(address, 4))
> 			force_sig(SIGBUS, current);
> 		else
> 			fixup_unaligned_access(regs);
> 	}
> }

Not sure about this idea because I think we have to use unaligned access for kernel too.
Your idea to skip unaligned access for kernel code should work but I think we have to find out
a solution for case when we need to use it in kernel code.

My idea was


void bad_page_fault()
{
	fixup = search_exception_tables(regs->pc);
	if (fixup) {
		regs->pc = fixup->fixup;
		return;
	}

	fixup = search_exception_tables(unaligned exception addr & 0xffff fffc) //mask last 2bits to get
aligned address
	if (fixup) {
		regs->pc = fixup->fixup;
		return;
	}

	die();

}


but not sure if will work because this setup bad fixup. In case that tlb load correct data
and kernel is able to access that place then unaligned exception needn't caused next page fault
and exception handler can load that value.

Maybe this could be work.

Michal





> 
> 	Arnd <><


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-14 17:13         ` Arnd Bergmann
  2009-07-14 17:45           ` Michal Simek
@ 2009-07-15  9:21           ` Paul Mundt
  2009-07-15 10:03             ` Michal Simek
  1 sibling, 1 reply; 15+ messages in thread
From: Paul Mundt @ 2009-07-15  9:21 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: monstr, Linux Kernel list, LTP

On Tue, Jul 14, 2009 at 07:13:20PM +0200, Arnd Bergmann wrote:
> On Tuesday 14 July 2009, Michal Simek wrote:
> > when the code tried to read/write from unaligned address (and in cpu
> > is turn on unaligned exception) then is caused unaligned exception
> > and asm code assemble/return value which is on that unaligned
> > address. (Assemble it that read/write every byte separately). That
> > will be harder to prevent all this cases because unaligned exception
> > is in generic code.  What do you mean add __range_ok? Range checking
> > is ok. The problem is when in case get_user kernel try to load
> > unaligned addr - unaligned exception is perform and try to load that
> > value separately.  If that page is not there, page fault handler is
> > called and not find it, it is performed search from exception table
> > and that address is not there of course - because address in pc is
> > generic unaligned code. I think that handling this needs more code.
> > Maybe if the address with from unaligned exception handler (there are
> > some address which can caused it) and find out which aligned address
> > is there and find out proper fixup for it.  I think that this could
> > work.
> > 
> > What do you think?
> 
> I think the key point is that the kernel should never try an unaligned
> access. Other architectures already rely on this, so you can too.

No, other architectures used to rely on this, until it was no longer
possible to do so. See for example, nfs. Unaligned accesses by the kernel
must be handled by the architecture, unaligned accesses by userspace can
be optionally fixed up.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-15  9:21           ` Paul Mundt
@ 2009-07-15 10:03             ` Michal Simek
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Simek @ 2009-07-15 10:03 UTC (permalink / raw)
  To: Paul Mundt, Arnd Bergmann, monstr, Linux Kernel list, LTP

Paul Mundt wrote:
> On Tue, Jul 14, 2009 at 07:13:20PM +0200, Arnd Bergmann wrote:
>> On Tuesday 14 July 2009, Michal Simek wrote:
>>> when the code tried to read/write from unaligned address (and in cpu
>>> is turn on unaligned exception) then is caused unaligned exception
>>> and asm code assemble/return value which is on that unaligned
>>> address. (Assemble it that read/write every byte separately). That
>>> will be harder to prevent all this cases because unaligned exception
>>> is in generic code.  What do you mean add __range_ok? Range checking
>>> is ok. The problem is when in case get_user kernel try to load
>>> unaligned addr - unaligned exception is perform and try to load that
>>> value separately.  If that page is not there, page fault handler is
>>> called and not find it, it is performed search from exception table
>>> and that address is not there of course - because address in pc is
>>> generic unaligned code. I think that handling this needs more code.
>>> Maybe if the address with from unaligned exception handler (there are
>>> some address which can caused it) and find out which aligned address
>>> is there and find out proper fixup for it.  I think that this could
>>> work.
>>>
>>> What do you think?
>> I think the key point is that the kernel should never try an unaligned
>> access. Other architectures already rely on this, so you can too.
> 
> No, other architectures used to rely on this, until it was no longer
> possible to do so. See for example, nfs. Unaligned accesses by the kernel
> must be handled by the architecture, unaligned accesses by userspace can
> be optionally fixed up.

Can you please look at John's email in this thread. How does SH handle this case?
I mean when for get/put user address is added unaligned address. How does your kernel handle it?

Michal


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
       [not found]       ` <9e6f3dfd0907141811p512b4edp3f9dd0fdeae1123e@mail.gmail.com>
@ 2009-07-15 10:14         ` Arnd Bergmann
  2009-07-15 11:39           ` Michal Simek
  2009-07-15 12:05           ` Ralf Baechle
  0 siblings, 2 replies; 15+ messages in thread
From: Arnd Bergmann @ 2009-07-15 10:14 UTC (permalink / raw)
  To: John Williams; +Cc: monstr, Linux Kernel list, LTP, Ralf Baechle

On Wednesday 15 July 2009, John Williams wrote:
> On Wed, Jul 15, 2009 at 2:43 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > The solution then is to handle fixups from the unaligned exception handler
> > if you come from the kernel. That should fix the three text cases.
> >
> > I don't fully understand your exception handling there, but I think you
> > also need to add code checking for __range_ok() to your unaligned handler,
> > to prevent malicious user space code from accessing the kernel through
> > unaligned pointers.
> 
> 
> Just to try to clarify - are there any alignment rules in the ABI on
> user-space pointers (which end up going to get/put_user)?

The kernel normally expects aligned input from user space, but I guess
it can't hurt to handle it anyway. arch/mips/kernel/alignment.c seems
to handle that case. Maybe Ralf can give some more insight.

> It seems the failure path is like this:
> 
> 1. userspace passes unaligned pointer
> 2. get_user attempts to access
> 3. CPU raises unaligned exception (if only it would raise the segfault as
> higher priority, before the unaligned!)
> 4. unaligned exception handler attempts to simulate the unaligned access
> with multiple partial read/write ops
> 5. CPU raises MMU exception on the read/write by the unaligned handler
> 6. kernel segfault handler looks up faulting address, it is in the unaligned
> exception handler, which has no fixup.
> 7. no fixup -> failure

Right.

> So, I suppose the question is - where in the sequence is the true failure?

I think in step 4. AFIACT, the kernel must do a number of checks on accesses
to random pointers.

> Clearly LTP thinks it's ok to pass unaligned pointers to the kernel,
> suggesting (1) is fine - thus my question about alignment rules in the ABI.

No, LTP thinks it should get a -EFAULT error code for that access. It does
specify whether it expects this because of an unaligned address or because
of an invalid page.

> Do we need fixups on the unaligned handler itself? This will be ugly ugly
> ugly. 

That's what ARM does. You don't have to do it from assembly though,
implementing it in C is probably easier.

> Or, some way of tracing the segfault back through the unaligned
> exception and to the root cause (the get/put-user), and call that fixup as
> required?

Yes, I guess that would have to look roughly like this:

int emulate_insn(struct pt_regs *regs, unsigned long addr, unsigned long len)
{
	/* use inline assembly with fixups here, return -EFAULT on bad addr */		
}

void alignment_exception(struct pt_regs *regs, unsigned long addr, unsigned long len)
{
	const struct exception_table_entry *fixup;
	int err;

	if (user_mode(regs)) {
		if (!access_ok(addr, len))
			goto segv;
		if (emulate_insn(regs) == -EFAULT))
			goto segv;
	} else {
		if (!access_ok(addr, len))
			goto fixup;
		if (emulate_insn(regs, addr, len) == -EFAULT))
			goto fixup;
	return;

fixup:
	fixup = search_exception_tables(regs->ip);
	if (!fixup)
		goto segv;

	regs->ip = fixup->fixup;
	return;

segv:
	force_sig(SIGSEGV, current));
}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-15 10:14         ` Arnd Bergmann
@ 2009-07-15 11:39           ` Michal Simek
  2009-07-15 12:05           ` Ralf Baechle
  1 sibling, 0 replies; 15+ messages in thread
From: Michal Simek @ 2009-07-15 11:39 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: John Williams, Linux Kernel list, LTP, Ralf Baechle,
	subrata@linux.vnet.ibm.com

Arnd Bergmann wrote:
> On Wednesday 15 July 2009, John Williams wrote:
>> On Wed, Jul 15, 2009 at 2:43 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>>> The solution then is to handle fixups from the unaligned exception handler
>>> if you come from the kernel. That should fix the three text cases.
>>>
>>> I don't fully understand your exception handling there, but I think you
>>> also need to add code checking for __range_ok() to your unaligned handler,
>>> to prevent malicious user space code from accessing the kernel through
>>> unaligned pointers.
>>
>> Just to try to clarify - are there any alignment rules in the ABI on
>> user-space pointers (which end up going to get/put_user)?
> 
> The kernel normally expects aligned input from user space, but I guess
> it can't hurt to handle it anyway. arch/mips/kernel/alignment.c seems
> to handle that case. Maybe Ralf can give some more insight.

you meant unaligned.c.

> 
>> It seems the failure path is like this:
>>
>> 1. userspace passes unaligned pointer
>> 2. get_user attempts to access
>> 3. CPU raises unaligned exception (if only it would raise the segfault as
>> higher priority, before the unaligned!)
>> 4. unaligned exception handler attempts to simulate the unaligned access
>> with multiple partial read/write ops
>> 5. CPU raises MMU exception on the read/write by the unaligned handler
>> 6. kernel segfault handler looks up faulting address, it is in the unaligned
>> exception handler, which has no fixup.
>> 7. no fixup -> failure
> 
> Right.
> 
>> So, I suppose the question is - where in the sequence is the true failure?
> 
> I think in step 4. AFIACT, the kernel must do a number of checks on accesses
> to random pointers.
> 
>> Clearly LTP thinks it's ok to pass unaligned pointers to the kernel,
>> suggesting (1) is fine - thus my question about alignment rules in the ABI.
> 
> No, LTP thinks it should get a -EFAULT error code for that access. It does
> specify whether it expects this because of an unaligned address or because
> of an invalid page.

IMHO author of this test not expect that caused too much troubles. From that tests
EFAULT should be return from copy_to_user macro not caused kernel fault. LTP should contain
special testcases for testing unaligned address.
I think we should add one more test with invalid aligned argument for that 3 tests + some doc.
I'll send it.

M

> 
>> Do we need fixups on the unaligned handler itself? This will be ugly ugly
>> ugly. 
> 
> That's what ARM does. You don't have to do it from assembly though,
> implementing it in C is probably easier.
> 
>> Or, some way of tracing the segfault back through the unaligned
>> exception and to the root cause (the get/put-user), and call that fixup as
>> required?
> 
> Yes, I guess that would have to look roughly like this:


> 
> int emulate_insn(struct pt_regs *regs, unsigned long addr, unsigned long len)
> {
> 	/* use inline assembly with fixups here, return -EFAULT on bad addr */		
> }
> 
> void alignment_exception(struct pt_regs *regs, unsigned long addr, unsigned long len)
> {
> 	const struct exception_table_entry *fixup;
> 	int err;
> 
> 	if (user_mode(regs)) {
> 		if (!access_ok(addr, len))
> 			goto segv;
> 		if (emulate_insn(regs) == -EFAULT))
> 			goto segv;
> 	} else {
> 		if (!access_ok(addr, len))
> 			goto fixup;
> 		if (emulate_insn(regs, addr, len) == -EFAULT))
> 			goto fixup;
> 	return;
> 
> fixup:
> 	fixup = search_exception_tables(regs->ip);
> 	if (!fixup)
> 		goto segv;
> 
> 	regs->ip = fixup->fixup;
> 	return;
> 
> segv:
> 	force_sig(SIGSEGV, current));
> }


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-15 10:14         ` Arnd Bergmann
  2009-07-15 11:39           ` Michal Simek
@ 2009-07-15 12:05           ` Ralf Baechle
  2009-07-15 13:27             ` Arnd Bergmann
  1 sibling, 1 reply; 15+ messages in thread
From: Ralf Baechle @ 2009-07-15 12:05 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: John Williams, monstr, Linux Kernel list, LTP

On Wed, Jul 15, 2009 at 12:14:52PM +0200, Arnd Bergmann wrote:

> On Wednesday 15 July 2009, John Williams wrote:
> > On Wed, Jul 15, 2009 at 2:43 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > > The solution then is to handle fixups from the unaligned exception handler
> > > if you come from the kernel. That should fix the three text cases.
> > >
> > > I don't fully understand your exception handling there, but I think you
> > > also need to add code checking for __range_ok() to your unaligned handler,
> > > to prevent malicious user space code from accessing the kernel through
> > > unaligned pointers.
> > 
> > 
> > Just to try to clarify - are there any alignment rules in the ABI on
> > user-space pointers (which end up going to get/put_user)?
> 
> The kernel normally expects aligned input from user space, but I guess
> it can't hurt to handle it anyway. arch/mips/kernel/alignment.c seems
> to handle that case. Maybe Ralf can give some more insight.

It's arch/arm/mm/alignment.c rsp. arch/mips/kernel/unaliged.c.

> > It seems the failure path is like this:
> > 
> > 1. userspace passes unaligned pointer
> > 2. get_user attempts to access
> > 3. CPU raises unaligned exception (if only it would raise the segfault as
> > higher priority, before the unaligned!)
> > 4. unaligned exception handler attempts to simulate the unaligned access
> > with multiple partial read/write ops
> > 5. CPU raises MMU exception on the read/write by the unaligned handler
> > 6. kernel segfault handler looks up faulting address, it is in the unaligned
> > exception handler, which has no fixup.
> > 7. no fixup -> failure
> 
> Right.
> 
> > So, I suppose the question is - where in the sequence is the true failure?
> 
> I think in step 4. AFIACT, the kernel must do a number of checks on accesses
> to random pointers.

Yes; but the checks that unaligned.c does on MIPS are no different from
any other get_user(), that is it ensures that the entire 8/16/32/64-bit
access is in userspace.  That's done using access_ok().

> > Clearly LTP thinks it's ok to pass unaligned pointers to the kernel,
> > suggesting (1) is fine - thus my question about alignment rules in the ABI.
> 
> No, LTP thinks it should get a -EFAULT error code for that access. It does
> specify whether it expects this because of an unaligned address or because
> of an invalid page.
> 
> > Do we need fixups on the unaligned handler itself? This will be ugly ugly
> > ugly. 
> 
> That's what ARM does. You don't have to do it from assembly though,
> implementing it in C is probably easier.

The MIPS psABI of which the Linux/MIPS ABI is sort of derived doesn't
specify any alignment-specific behaviour for syscalls.  That is they're not
treated any different than any random piece of library code.  I don't
know of any other standards documents (POSIX, single UNIX etc.) that
would have anything relevant either.

There are several historic precedents for why Linux handles miss-alignment
this way:

  o The networking stack where some cards may align the header in memory
    such that IP addresses are always on an address that is aligned to a
    16-bit but not a 32-bit boundary.  The tulip cards do that for example.
    It was considered a better solution to emulate the access or copy the
    entire rx packet before further processing than hacking up the
    networking stack to handle with ill-designed hardware.

  o Another in-kernel case was the MS-DOS partition table parsing.

  o The were also a bunch of historic Linux applications originally written
    for x86 which were relativly hard to port, so generally handling
    exceptions, was the by far easiest thing to do and probably for most
    architectures where this is relevant it's the easiest to handle all
    miss-alignment cases, not just selectively userland and maybe IP and
    MS-DOS partition table parsing.

We've moved on.  Handling miss-alignment is no longer a very relavent part
of the syscall interface.  So I think for the definition of the Linux
ABI this should just be left as implementation defined behaviour but
whatever an architecture does, it should be done consistently for all
system interfaces.

  Ralf

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: access_ok macor
  2009-07-15 12:05           ` Ralf Baechle
@ 2009-07-15 13:27             ` Arnd Bergmann
  0 siblings, 0 replies; 15+ messages in thread
From: Arnd Bergmann @ 2009-07-15 13:27 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: John Williams, monstr, Linux Kernel list, LTP

On Wednesday 15 July 2009, Ralf Baechle wrote:
 
> > I think in step 4. AFIACT, the kernel must do a number of checks on accesses
> > to random pointers.
>
> Yes; but the checks that unaligned.c does on MIPS are no different from
> any other get_user(), that is it ensures that the entire 8/16/32/64-bit
> access is in userspace.  That's done using access_ok().

Well, access_ok() plus the fixup code for unmapped or write-protected user
pages, I guess. But I agree, there are no checks beyond what get/put_user
does. In particular, the access_ok() check should only be needed for
faults from user space, while any fault coming from kernel space is
either some kernel code accessing its own data (as you mention below) or
a __get_user/__put_user that has already checked access permissions.

In my example, I got the case from kernel space wrong, it should not
check access_ok() if !user_space(regs), or it needs to do set_fs(KERNEL_DS)
first, like the mips code does.

> We've moved on.  Handling miss-alignment is no longer a very relavent part
> of the syscall interface.  So I think for the definition of the Linux
> ABI this should just be left as implementation defined behaviour but
> whatever an architecture does, it should be done consistently for all
> system interfaces.

Ok, thanks, for explanation.

I guess for the microblaze unaligned handler, both cases mean that
it needs to do a fixup table lookup for the original fault and
it needs to handle fixups on its own emulation code. The only difference
between disallowing unaligned accesses in the ABI and allowing them
is which of the two ways (check fixup table, emulate instruction) it
tries first.

	Arnd <><

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2009-07-15 13:27 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-14 12:56 access_ok macor Michal Simek
2009-07-14 13:21 ` Arnd Bergmann
2009-07-14 13:45   ` Michal Simek
2009-07-14 14:45     ` Arnd Bergmann
2009-07-14 15:06       ` Michal Simek
     [not found] ` <200907141652.59049.arnd@arndb.de>
     [not found]   ` <4A5CAEFF.9080206@monstr.eu>
2009-07-14 16:43     ` Arnd Bergmann
2009-07-14 16:56       ` Michal Simek
2009-07-14 17:13         ` Arnd Bergmann
2009-07-14 17:45           ` Michal Simek
2009-07-15  9:21           ` Paul Mundt
2009-07-15 10:03             ` Michal Simek
     [not found]       ` <9e6f3dfd0907141811p512b4edp3f9dd0fdeae1123e@mail.gmail.com>
2009-07-15 10:14         ` Arnd Bergmann
2009-07-15 11:39           ` Michal Simek
2009-07-15 12:05           ` Ralf Baechle
2009-07-15 13:27             ` Arnd Bergmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox