* Fixing up unaligned userspace access
@ 2008-03-07 14:30 Kieran Bingham
  2008-03-07 14:44 ` Adrian McMenamin
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Kieran Bingham @ 2008-03-07 14:30 UTC (permalink / raw)
  To: linux-sh

Hi guys,

I'm confused about how unaligned accesses are handled on SH2/SH2a.

Our busybox build and userland seem to be triggering these exceptions
far too often.

For example, pressing Tab to complete a filename causes these two
exceptions to fire, and soon afterwards the whole board stalls.



Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccda168 ins=0x60f0

Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccd9eac ins=0x9015



Mainly I'm confused because, looking at the disassembly, the
instructions themselves aren't unaligned and don't appear to be making
any unaligned access. Is there something else that could be going on here?


In the first instance, the code disassembly from HEW shows :

DissAddr Obj Code          Dissassembly

0CCDA158 6893              MOV       R9,R8
0CCDA15A 410B              JSR       @R1
0CCDA15C 7801              ADD       #H'01,R8
0CCDA15E 64A2              MOV.L     @R10,R4
0CCDA160 3488              SUB       R8,R4
0CCDA162 D13C              MOV.L     @(H'00F0:8,PC),R1
0CCDA164 410B              JSR       @R1
0CCDA166 0009              NOP       
0CCDA168 60F0              MOV.B     @R15,R0     ## Faulting Address
0CCDA16A 8809              CMP/EQ    #H'09,R0
0CCDA16C 8904              BT        @H'CCDA178:8
0CCDA16E 9162              MOV.W     @(H'00C4:8,PC),R1
0CCDA170 31FC              ADD       R15,R1
0CCDA172 9061              MOV.W     @(H'00C2:8,PC),R0
0CCDA174 5119              MOV.L     @(H'24:4,R1),R1
0CCDA176 0F16              MOV.L     R1,@(R0,R15)
0CCDA178 AE21              BRA       @H'CCD9DBE:12
0CCDA17A 0009              NOP       

That reads fine: move the byte pointed to by R15 into R0, then compare
it with H'09 ...

And at the second exception :

DissAddr Obj Code          Dissassembly

0CCD9E96 A0D5              BRA       @H'CCDA044:12
0CCD9E98 0009              NOP       
0CCD9E9A 7101              ADD       #H'01,R1
0CCD9E9C 3010              CMP/EQ    R1,R0
0CCD9E9E 8D4A              BT/S      @H'CCD9F36:8
0CCD9EA0 627C              EXTU.B    R7,R2
0CCD9EA2 A12E              BRA       @H'CCDA102:12
0CCD9EA4 E11F              MOV       #H'1F,R1
0CCD9EA6 D118              MOV.L     @(H'0060:8,PC),R1
0CCD9EA8 410B              JSR       @R1
0CCD9EAA 0009              NOP       
0CCD9EAC 9015              MOV.W     @(H'002A:8,PC),R0 ## Faulting Addr
0CCD9EAE E101              MOV       #H'01,R1
0CCD9EB0 A164              BRA       @H'CCDA17C:12
0CCD9EB2 0F16              MOV.L     R1,@(R0,R15)
0CCD9EB4 D114              MOV.L     @(H'0050:8,PC),R1
0CCD9EB6 410B              JSR       @R1
0CCD9EB8 0009              NOP       
0CCD9EBA D114              MOV.L     @(H'0050:8,PC),R1
0CCD9EBC E200              MOV       #H'00,R2
0CCD9EBE 900C              MOV.W     @(OFF_R6:8,PC),R0
0CCD9EC0 E3FF              MOV       #H'FF,R3



Does anyone have a clue which direction I should start poking in to
find out what is actually causing these exceptions?

Cheers

Hmm, and I've confused myself further: tab completion in the shell
works without issue on the NFS mounts, but not when completing
filenames within the root/mtd folders. I'm not sure that's much of a
clue, but perhaps the bug is somewhere in the FS rather than in the
busybox shell.


-- 

Kieran Bingham,



* Re: Fixing up unaligned userspace access
  2008-03-07 14:30 Fixing up unaligned userspace access Kieran Bingham
@ 2008-03-07 14:44 ` Adrian McMenamin
  2008-03-12 18:41 ` Kieran Bingham
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Adrian McMenamin @ 2008-03-07 14:44 UTC (permalink / raw)
  To: linux-sh

On Fri, March 7, 2008 2:30 pm, Kieran Bingham wrote:
> Hi guys,
>
> I'm getting confused about how the unaligned accesses are being handled
> in SH2/SH2a.
>
> Our busybox build and userland seems to be causing these exceptions far
> too often.
>
> For example, pressing "Tab" to tab complete a filename, causes these two
> exceptions to fire, and then soon after the whole board stalls.
>
>
>
> Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccda168
> ins=0x60f0
>
> Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccd9eac
> ins=0x9015
>
>
>
> Mainly I'm confused though, as looking at the assembly the instructions
> aren't unaligned ? and don't seem to be trying to do anything
> un-aligned ? - Is there something else that could be going on here?
>

My experience - admittedly with SH4 - is that these errors are almost
always caused by a memory leak, or access to uninitialised memory, elsewhere
in the kernel.





* Re: Fixing up unaligned userspace access
  2008-03-07 14:30 Fixing up unaligned userspace access Kieran Bingham
  2008-03-07 14:44 ` Adrian McMenamin
@ 2008-03-12 18:41 ` Kieran Bingham
  2008-03-13 10:39 ` Paul Mundt
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Kieran Bingham @ 2008-03-12 18:41 UTC (permalink / raw)
  To: linux-sh

I think I've found the fault! :)


On 07/03/2008, Adrian McMenamin <adrian@newgolddream.dyndns.info> wrote:
> On Fri, March 7, 2008 2:30 pm, Kieran Bingham wrote:
>  >
>  > Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccda168
>  > ins=0x60f0
>  >
>  > Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccd9eac
>  > ins=0x9015
>  > Mainly I'm confused though, as looking at the assembly the instructions
>  > aren't unaligned ? and don't seem to be trying to do anything
>  > un-aligned ? - Is there something else that could be going on here?

> My experience - admittedly with SH4 - is that these errors are almost
>  always caused by a memory leak, or access to uninitialized memory, elsewhere
>  in the kernel.

The instructions don't look like they do anything bad, because they don't!

They are the wrong instructions. It seems the address error trap
handler adds 4 to the regs pointer before it calls do_address_error,
so regs->pc was actually returning the PR!


Does anyone know why this code adds 4?
Can we remove it if it's incorrect? (patch below)

I'm working on SH2a, so I don't know whether it's an SH2-specific thing
that was put in.



Remove erroneous offset on SH2a address error handling

Signed-off-by: Kieran Bingham <kbingham@mpc-data.co.uk>
---

diff --git a/arch/sh/kernel/cpu/sh2/entry.S b/arch/sh/kernel/cpu/sh2/entry.S
index 7a26569..0fc8906 100644
--- a/arch/sh/kernel/cpu/sh2/entry.S
+++ b/arch/sh/kernel/cpu/sh2/entry.S
@@ -267,7 +267,6 @@ ENTRY(sh_bios_handler)

 ENTRY(address_error_trap_handler)
        mov     r15,r4                          ! regs
-       add     #4,r4
        mov     #OFF_PC,r0
        mov.l   @(r0,r15),r6                    ! pc
        mov.l   1f,r0


* Re: Fixing up unaligned userspace access
  2008-03-07 14:30 Fixing up unaligned userspace access Kieran Bingham
  2008-03-07 14:44 ` Adrian McMenamin
  2008-03-12 18:41 ` Kieran Bingham
@ 2008-03-13 10:39 ` Paul Mundt
  2008-03-14 14:10 ` Kieran Bingham
  2008-03-21  8:49 ` Paul Mundt
  4 siblings, 0 replies; 6+ messages in thread
From: Paul Mundt @ 2008-03-13 10:39 UTC (permalink / raw)
  To: linux-sh

On Wed, Mar 12, 2008 at 06:41:57PM +0000, Kieran Bingham wrote:
> On 07/03/2008, Adrian McMenamin <adrian@newgolddream.dyndns.info> wrote:
> > On Fri, March 7, 2008 2:30 pm, Kieran Bingham wrote:
> >  >
> >  > Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccda168
> >  > ins=0x60f0
> >  >
> >  > Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccd9eac
> >  > ins=0x9015
> >  > Mainly I'm confused though, as looking at the assembly the instructions
> >  > aren't unaligned ? and don't seem to be trying to do anything
> >  > un-aligned ? - Is there something else that could be going on here?
> 
> > My experience - admittedly with SH4 - is that these errors are almost
> >  always caused by a memory leak, or access to uninitialized memory, elsewhere
> >  in the kernel.
> 
> The instructions don't look like they perform anything bad, because they don't!
> 
> They are the wrong instructions. ... would seem that the address error
> trap handler is adding 4 to the regs pointer before it calls
> do_address_error, so regs->pc was actually returning the PR!!
> 
> Does anyone know why this code adds 4 ?
> Can we remove it if its incorrect (patch below)
> 
> I'm working on SH2a, so I don't know if its an SH2 specific thing
> thats been put in ?
> 
It's definitely not an SH-2 thing. I wonder if it's a leftover from when
we were placing markers on the stack in the early days of the SH-2 port.
Anyway, dumping the stack from the address error path makes it pretty
obvious that the add is forcing all of the state to be off by one
register, which also explains why the regs->sr check was failing.

Given that, I'll add your patch to the 2.6.25 queue. Though it seems like
there are still a few corruption issues outstanding, which the slab
caches in particular seem to hit. There's occasional garbage in regs->pc,
which suggests that the exceptions are nesting and we're hitting the case
where the saved PC value is undefined. I've been debugging this most of
the day, and found a number of other bugs in the nommu code, but most of
the corruption issues are still outstanding (using current git both on
the SH7203 RSK and the SH7206 SolutionEngine).


* Re: Fixing up unaligned userspace access
  2008-03-07 14:30 Fixing up unaligned userspace access Kieran Bingham
                   ` (2 preceding siblings ...)
  2008-03-13 10:39 ` Paul Mundt
@ 2008-03-14 14:10 ` Kieran Bingham
  2008-03-21  8:49 ` Paul Mundt
  4 siblings, 0 replies; 6+ messages in thread
From: Kieran Bingham @ 2008-03-14 14:10 UTC (permalink / raw)
  To: linux-sh

Hmm, a couple of issues have come up from looking at the address error
exception handling.

If I find the offending statement in HEW, set breakpoints before and
after it, and use the continue button to step through the
instructions, the exception fires on the correct line and all of the
regs match my expectations:

root:/home/nfs/apps> ./unaligned-access

Pid : 56, Comm:      unaligned-acces
PC is at 0xcfa0112
PC  : 0cfa0112 SP  : 0cfbff34 SR  : 00000001                   Not tainted
R0  : 0cfa00f0 R1  : f2f9f1f9 R2  : 0cfbff88 R3  : 0cfbff21
R4  : 00000001 R5  : 0cfbff6c R6  : 0cfbff74 R7  : ffffffff
R8  : 0cfbff6c R9  : 0cfbff74 R10 : 00000001 R11 : 0c31ffc0
R12 : 0c3083dc R13 : 0c2e06a8 R14 : 0cfbff34
MACH: 00000000 MACL: 00000015 GBR : 00000000 PR  : 0cfa0242
Fixing up unaligned userspace access in "unaligned-acces" pid=56 pc=0x0cfa0112 ins=0x6112
instruction : 6112
Fixing up a Mov.[bwl] Ins=6112, rm=cd35fa4, rn=cd35fa4, *rm=f2f9f1f9, count=4
calling copy_from_user(dst : cd35fa4, src : f2f9f1f9, count 4)
Killing process "unaligned-acces" due to unaligned access
SIGSEGV

Execution broke at 0x0cfa0112 as expected, and the handler can try to
deal with the correct instruction.
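
For reference, a tiny Python check of why this access, unlike the earlier reports, is a genuine address error. The address and size are taken from the log above; the helper name is made up for illustration.

```python
def misalignment(addr, size):
    """Bytes by which addr overshoots the previous size-aligned boundary."""
    return addr % size


SRC = 0xF2F9F1F9   # "src" / "*rm" value from the log above
COUNT = 4          # "count" from the log: a 32-bit MOV.L access

for size, name in ((1, "byte"), (2, "word"), (4, "long")):
    off = misalignment(SRC, size)
    state = "aligned" if off == 0 else "misaligned by %d" % off
    print("%-4s access at %08x: %s" % (name, SRC, state))
```

An odd source address can only satisfy a byte access, so a MOV.L from it is expected to trap.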


However, if I just let it run its course, with no interference from HEW:


Pid : 57, Comm:      unaligned-acces
PC is at 0xcfa0116
PC  : 0cfa0116 SP  : 0cfbff34 SR  : 00000001                   Not tainted
R0  : 0cfa00f0 R1  : f2f9f1f9 R2  : 0cfbff88 R3  : 0cfbff21
R4  : 00000001 R5  : 0cfbff6c R6  : 0cfbff74 R7  : ffffffff
R8  : 0cfbff6c R9  : 0cfbff74 R10 : 00000001 R11 : 0c31ffc0
R12 : 0c3083dc R13 : 0c2e06f8 R14 : 0cfbff34
MACH: 00000000 MACL: 00000015 GBR : 00000000 PR  : 0cfa0242
Fixing up unaligned userspace access in "unaligned-acces" pid=57 pc=0x0cfa0116 ins=0x0009
instruction : 9

Code from above:

0CFA010A 0009              NOP
0CFA010C 0009              NOP
0CFA010E 0009              NOP
0CFA0110 51E1              MOV.L     @(H'04:4,R14),R1    # R1 is 0x4466 at this point
0CFA0112 6112              MOV.L     @R1,R1
0CFA0114 1E13              MOV.L     R1,@(H'0C:4,R14)
0CFA0116 0009              NOP
0CFA0118 0009              NOP
0CFA011A 0009              NOP


Is there some sort of timing issue here, whereby by the time the
exception is raised the CPU has already started executing the
following instructions? Something in the pipeline, perhaps? Surely when
an address error occurs, that's it, and execution should stop? But maybe
I'm missing something...


Thoughts / Comments anyone?
--
Cheers
Kieran


* Re: Fixing up unaligned userspace access
  2008-03-07 14:30 Fixing up unaligned userspace access Kieran Bingham
                   ` (3 preceding siblings ...)
  2008-03-14 14:10 ` Kieran Bingham
@ 2008-03-21  8:49 ` Paul Mundt
  4 siblings, 0 replies; 6+ messages in thread
From: Paul Mundt @ 2008-03-21  8:49 UTC (permalink / raw)
  To: linux-sh

On Fri, Mar 14, 2008 at 02:10:59PM +0000, Kieran Bingham wrote:
> Fixing up unaligned userspace access in "unaligned-acces" pid=57
> pc=0x0cfa0116 ins=0x0009
> instruction : 9
> 
> Code from above:
> 
> 0CFA010A 0009              NOP
> 0CFA010C 0009              NOP
> 0CFA010E 0009              NOP
> 0CFA0110 51E1              MOV.L     @(H'04:4,R14),R1    # R1 is 0x4466 at this point
> 0CFA0112 6112              MOV.L     @R1,R1
> 0CFA0114 1E13              MOV.L     R1,@(H'0C:4,R14)
> 0CFA0116 0009              NOP
> 0CFA0118 0009              NOP
> 0CFA011A 0009              NOP
> 
> 
> Is there some sort of timing issue here where by the time the
> exception is raised, the CPU has already started to execute the
> following instructions ? Something in a pipeline perhaps ? Surely when
> an address error occurs - thats it - it should stop ? But maybe I'm
> missing something...
> 
It would be helpful to know what the stack layout is at the time you
enter the exception, both from HEW and from regular processing. The saved
PC value in this case is the next instruction to be executed, and even if
it's partially split out in the pipeline, the exec stage should not be
hit until execution resumes.

Looking at address_error_trap_handler(), I wonder if we have an
inconsistency between the stack-relative saved (ie, OFF_PC) PC and the
saved PC on the top of the stack pushed by the hardware on top of the
saved SR before the processing begins. You may wish to pop the saved PC
off and toss that in r7 or some such thing so you can more easily
compare. The only issue is that the hardware-saved PC at the top of the
stack can occasionally be undefined (ie, in the case of nested
exceptions), in which case we have to go to the OFF_PC value on-stack
regardless. This is fairly easy to test for however.
