* Unaligned address handling, and the cause of that login problem
@ 2000-04-16 22:19 Mike Klar
2000-04-16 22:19 ` Mike Klar
2000-04-17 23:43 ` Ralf Baechle
0 siblings, 2 replies; 7+ messages in thread
From: Mike Klar @ 2000-04-16 22:19 UTC (permalink / raw)
To: linux; +Cc: linux-mips
While tracking down a random memory corruption bug, I stumbled across the
cause of that telnet/ssh problem in recent kernels reported about a month
ago:
The version of down_trylock() for CPUs that support LL/SC assumes that
struct semaphore is 64-bit aligned, since it accesses count and waking as a
single doubleword (with lld/scd). Nothing in struct semaphore guarantees this
alignment, and in fact, struct tty_struct has a struct semaphore that is not
64-bit aligned. Depending on how a tty is used (I think it's a non-blocking
read that triggers the problem, in drivers/char/n_tty.c), the kernel will
attempt an unaligned lld, which causes an address error, and the handler in
arch/mips/kernel/unaligned.c will kill current with SIGBUS (since lld/scd
cannot be properly simulated).
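
A minimal sketch of the layout problem described above. The struct names and
the single 32-bit member in front of the semaphore are hypothetical stand-ins,
not the real kernel definitions; the point is only that two 32-bit fields carry
4-byte alignment, so the pair can land on an address that lld/scd cannot use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct semaphore: two 32-bit fields that the
 * LL/SC version of down_trylock() reads together as one 64-bit quantity. */
struct sem_sketch {
    int32_t count;
    int32_t waking;
};

/* Hypothetical stand-in for struct tty_struct: a single 32-bit member ahead
 * of the semaphore is enough to push it off a 64-bit boundary. */
struct tty_sketch {
    int32_t flags;
    struct sem_sketch sem;
};

/* Offset of the semaphore inside the containing structure. */
static size_t sem_offset_in_tty(void)
{
    return offsetof(struct tty_sketch, sem);
}

/* Nonzero if an object at this offset, inside an 8-byte-aligned container,
 * would be a legal target for a doubleword access. */
static int is_doubleword_aligned(size_t offset)
{
    return (offset % 8) == 0;
}
```

With this layout the semaphore sits at offset 4, so a doubleword access to it
is misaligned even when the containing structure itself is 8-byte aligned.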
The quick-and-dirty workaround is to put 32 bits of padding before the
atomic_read member of struct tty_struct. Of course, that doesn't fix the
real problem, and there may well be other non-64-bit aligned struct
semaphores out there. A proper fix would be to either hack up struct
semaphore to guarantee doubleword alignment, or rework the way down_trylock()
does its thing.
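
The padding workaround can be sketched as follows, again with hypothetical
struct and member names rather than the real struct tty_struct. It only helps
if the containing structure is itself allocated on an 8-byte boundary, and it
breaks again as soon as someone adds or removes a member ahead of the padding,
which is why it is no more than a workaround.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct semaphore. */
struct sem_sketch {
    int32_t count;
    int32_t waking;
};

/* Hypothetical container with 32 bits of explicit padding inserted before
 * the semaphore, moving it from offset 4 to offset 8. */
struct tty_padded_sketch {
    int32_t flags;
    int32_t pad;            /* the 32 bits of padding from the workaround */
    struct sem_sketch sem;
};

/* Offset of the semaphore once the padding is in place. */
static size_t padded_sem_offset(void)
{
    return offsetof(struct tty_padded_sketch, sem);
}
```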
While I'm on the topic of unaligned handling, this behavior of sending
SIGBUS, SIGSEGV, or SIGILL to current on unaligned accesses seems to me like
incorrect behavior if the original fault happened in kernel mode. The above
example of an unaligned lld sending SIGBUS is not too bad, since the fault
does happen while doing something on behalf of the current process.
Consider this example, though: If kernel code attempts an unaligned word
read to virtual address 0x00000001 (for example), the unaligned handler will
attempt to simulate with 2 aligned reads, which will fault, and since the
unaligned handler catches those faults, it will wind up sending SIGSEGV to
current. I would think that condition should cause an oops, since that's
what an equivalent aligned access would do, and especially since the access
may have had nothing to do with current (it may happen from an interrupt,
for example).
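
The policy being proposed here can be sketched as a small decision function.
All names below are hypothetical and this is not the code in
arch/mips/kernel/unaligned.c; it only illustrates the suggested distinction
between faults that originate in user mode and faults that originate in kernel
mode.

```c
#include <stdbool.h>

/* Possible outcomes after the handler tries to simulate an unaligned access. */
enum fault_action {
    ACTION_EMULATED_OK,   /* simulation succeeded, resume execution */
    ACTION_SEND_SIGNAL,   /* deliver SIGBUS/SIGSEGV/SIGILL to current */
    ACTION_OOPS           /* treat as a kernel bug, like a bad aligned access */
};

/* Hypothetical summary of the fault context. */
struct fault_ctx_sketch {
    bool from_user_mode;     /* faulting instruction ran in user mode */
    bool emulation_faulted;  /* the simulated aligned accesses faulted too */
};

/* Proposed policy: only signal current when the original fault came from
 * user mode; a failed simulation of a kernel-mode access should oops. */
static enum fault_action unaligned_fault_policy(const struct fault_ctx_sketch *ctx)
{
    if (!ctx->emulation_faulted)
        return ACTION_EMULATED_OK;
    if (ctx->from_user_mode)
        return ACTION_SEND_SIGNAL;
    return ACTION_OOPS;
}
```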
Comments?
Mike Klar
Wyldfier Technology
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Unaligned address handling, and the cause of that login problem
@ 2000-04-17 7:32 Kevin D. Kissell
2000-04-17 7:32 ` Kevin D. Kissell
0 siblings, 1 reply; 7+ messages in thread
From: Kevin D. Kissell @ 2000-04-17 7:32 UTC (permalink / raw)
To: Mike Klar, linux; +Cc: linux-mips
Note that, as part of what we had to do to support
the newer generations of MIPS32 chips that support
LL/SC, but only for 32-bit quantities, I did a complete
rework of the semaphore support primitives to
eliminate this dependency on 64-bit LL/SC.
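
To illustrate the general idea of a trylock that needs only 32-bit atomic
operations, here is a minimal sketch using C11 atomics. It is not the
implementation from the patches mentioned in this message: the names are
hypothetical, and the bookkeeping a real kernel semaphore needs for sleeping
waiters is left out. On a MIPS target a compiler would typically build a
32-bit compare-and-swap loop like this from ll/sc.

```c
#include <stdatomic.h>

/* Hypothetical semaphore that keeps its whole state in one 32-bit counter. */
struct sem32_sketch {
    atomic_int count;
};

static void sem32_init(struct sem32_sketch *s, int n)
{
    atomic_init(&s->count, n);
}

/* Try to take the semaphore without blocking.
 * Returns 0 on success and 1 on failure, following the down_trylock()
 * convention. */
static int sem32_trylock(struct sem32_sketch *s)
{
    int old = atomic_load(&s->count);

    while (old > 0) {
        /* On failure the CAS reloads 'old' with the current value. */
        if (atomic_compare_exchange_weak(&s->count, &old, old - 1))
            return 0;
    }
    return 1;
}

/* Release the semaphore. */
static void sem32_up(struct sem32_sketch *s)
{
    atomic_fetch_add(&s->count, 1);
}
```

Because only a single 32-bit word is ever updated atomically, the structure
needs nothing stricter than 4-byte alignment.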
See the source tar and patch file on the MIPS
FTP server ftp://ftp.mips.com/pub/linux/mips/kernel
or even the slightly earlier patches webbed on the
www.paralogos.com/mipslinux pages.
It *was* a rewrite from first principles, based
on study of the documentation and the x86
and PPC code, and while I can guarantee it
won't have the unaligned doubleword problem,
I'd be interested in anyone else's critique of the
implementation.
Regards,
Kevin K.
-----Original Message-----
From: Mike Klar <mfklar@ponymail.com>
To: linux@cthulhu.engr.sgi.com <linux@cthulhu.engr.sgi.com>
Cc: linux-mips@fnet.fr <linux-mips@fnet.fr>
Date: Monday, April 17, 2000 12:36 AM
Subject: Unaligned address handling, and the cause of that login problem
>While tracking down a random memory corruption bug, I stumbled across the
>cause of that telnet/ssh problem in recent kernels reported about a month
>ago:
>
>The version of down_trylock() for CPUs with support LL/SC assumes that
>struct semaphore is 64-bit aligned, since it accesses count and waking as a
>single dualword (with lld/scd). Nothing in struct semaphore guarantees this
>alignment, and in fact, struct tty_struct has a struct semaphore that is not
>64-bit aligned. Depending on how a tty is used (I think it's a non-blocking
>read that triggers the problem, in drivers/char/n_tty.c), the kernel will
>attempt an unaligned lld, it will cause an address error, and the handler in
>arch/mips/kernel/unaligned.c will kill current with SIGBUS (since lld/scd
>cannot be properly simulated).
>
>The quick-and-dirty workaround is to put 32 bits of padding before the
>atomic_read member of struct tty_struct. Of course, that doesn't fix the
>real problem, and there may well be other non-64-bit aligned struct
>semaphore's out there. A proper fix would be to either hack up struct
>semaphore to guarantee dualword alignment, or rework the was down_trylock
>does its thing.
>
>While I'm on the topic of unaligned handling, this behavior of sending
>SIGBUS, SIGSEGV, or SIGILL to current on unaligned accesses seems to me like
>incorrect behavior if the original fault happened in kernel mode. The above
>example of an unaligned lld sending SIGBUS is not too bad, since the fault
>does happen while doing something on behalf of the current process.
>Consider this example, though: If kernel code attempts an unaligned word
>read to virtual address 0x00000001 (for example), the unaligned handler will
>attempt to simulate with 2 aligned reads, which will fault, and since the
>unaligned handler catches those faults, it will wind up sending SIGSEGV to
>current. I would think that condition should cause an oops, since that's
>what an equivalent aligned access would do, and especially since the access
>may have had nothing to do with current (it may happen from an interrupt,
>for example).
>
>Comments?
>
>Mike Klar
>Wyldfier Technology
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Unaligned address handling, and the cause of that login problem
2000-04-16 22:19 Mike Klar
2000-04-16 22:19 ` Mike Klar
@ 2000-04-17 23:43 ` Ralf Baechle
2000-04-18 15:13 ` Geert Uytterhoeven
1 sibling, 1 reply; 7+ messages in thread
From: Ralf Baechle @ 2000-04-17 23:43 UTC (permalink / raw)
To: Mike Klar; +Cc: linux, linux-mips
On Sun, Apr 16, 2000 at 03:19:01PM -0700, Mike Klar wrote:
> While tracking down a random memory corruption bug, I stumbled across the
> cause of that telnet/ssh problem in recent kernels reported about a month
> ago:
>
> The version of down_trylock() for CPUs with support LL/SC assumes that
> struct semaphore is 64-bit aligned, since it accesses count and waking as a
> single dualword (with lld/scd). Nothing in struct semaphore guarantees this
> alignment, and in fact, struct tty_struct has a struct semaphore that is not
> 64-bit aligned. Depending on how a tty is used (I think it's a non-blocking
> read that triggers the problem, in drivers/char/n_tty.c), the kernel will
> attempt an unaligned lld, it will cause an address error, and the handler in
> arch/mips/kernel/unaligned.c will kill current with SIGBUS (since lld/scd
> cannot be properly simulated).
>
> The quick-and-dirty workaround is to put 32 bits of padding before the
> atomic_read member of struct tty_struct. Of course, that doesn't fix the
> real problem, and there may well be other non-64-bit aligned struct
> semaphore's out there. A proper fix would be to either hack up struct
> semaphore to guarantee dualword alignment, or rework the was down_trylock
> does its thing.
I'll put __attribute__ ((aligned(64))) to the structure which will fix this.
This will have to be changed again when we add support for 32-bit processors
with ll / sc instructions but for now we don't support them, so it's the
right thing.
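
A sketch of what the alignment attribute does, using a hypothetical stand-in
for struct semaphore and the GCC/Clang attribute syntax. The argument to
aligned() is in bytes, so doubleword alignment is 8 (the figure is corrected
to 8 later in this thread).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct semaphore, forced to 8-byte alignment. */
struct sem_aligned_sketch {
    int32_t count;
    int32_t waking;
} __attribute__((aligned(8)));

/* Hypothetical container: the compiler now inserts the padding itself. */
struct tty_aligned_sketch {
    int32_t flags;
    struct sem_aligned_sketch sem;
};

/* Offset of the semaphore inside the container. */
static size_t aligned_sem_offset(void)
{
    return offsetof(struct tty_aligned_sketch, sem);
}

/* Alignment requirement the type now carries wherever it is embedded. */
static size_t aligned_sem_alignment(void)
{
    return _Alignof(struct sem_aligned_sketch);
}
```

Unlike manual padding, the requirement travels with the type, so every
structure that embeds the semaphore gets the correct offset and itself
inherits at least 8-byte alignment.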
> While I'm on the topic of unaligned handling, this behavior of sending
> SIGBUS, SIGSEGV, or SIGILL to current on unaligned accesses seems to me like
> incorrect behavior if the original fault happened in kernel mode.
> The above
> example of an unaligned lld sending SIGBUS is not too bad, since the fault
> does happen while doing something on behalf of the current process.
The assumption is that the kernel should never ever use ll, lld, sc and scd
on improperly aligned memory objects, so not checking is ok. In other
words it's perfectly ok if the kernel dies or behaves silly following
such a can-not-happen case.
Note that we don't attempt to handle misaligned ll/sc/lld/scd
instructions because that would break atomicity on SMP machines. On the
other hand, emulating them on CPUs that don't have them at all, like
the R3000, is ok because those are not used on SMP systems. That is not
counting the oddball SMP systems which we'll probably not support ever.
Ralf
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Unaligned address handling, and the cause of that login problem
2000-04-17 23:43 ` Ralf Baechle
@ 2000-04-18 15:13 ` Geert Uytterhoeven
2000-04-18 21:40 ` Ralf Baechle
0 siblings, 1 reply; 7+ messages in thread
From: Geert Uytterhoeven @ 2000-04-18 15:13 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Mike Klar, linux, linux-mips
On Mon, 17 Apr 2000, Ralf Baechle wrote:
> I'll put __attribute__ ((aligned(64))) to the structure which will fix this.
                                   ^^
8, I suppose?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- Linux/{m68k~Amiga,PPC~CHRP} -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Unaligned address handling, and the cause of that login problem
2000-04-18 15:13 ` Geert Uytterhoeven
@ 2000-04-18 21:40 ` Ralf Baechle
0 siblings, 0 replies; 7+ messages in thread
From: Ralf Baechle @ 2000-04-18 21:40 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: Mike Klar, linux, linux-mips
On Tue, Apr 18, 2000 at 05:13:48PM +0200, Geert Uytterhoeven wrote:
> On Mon, 17 Apr 2000, Ralf Baechle wrote:
> > I'll put __attribute__ ((aligned(64))) to the structure which will fix this.
>                                    ^^
> 8, I suppose?
Of course.
Ralf
^ permalink raw reply [flat|nested] 7+ messages in thread