linux-kernel.vger.kernel.org archive mirror
From: Jose Goncalves <jose.goncalves@inov.pt>
To: Frederik Deweerdt <deweerdt@free.fr>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: Serial related oops
Date: Thu, 22 Feb 2007 15:02:46 +0000
Message-ID: <45DDB096.2020807@inov.pt>
In-Reply-To: <20070221230503.GA28156@flint.arm.linux.org.uk>

Russell King wrote:
> On Wed, Feb 21, 2007 at 02:13:15PM +0000, Jose Goncalves wrote:
>   
>> <1>[18840.304048] Unable to handle kernel NULL pointer dereference at virtual address 00000012
>> <1>[18840.313046]  printing eip:
>> <4>[18840.321687] c01bfa7a
>> <1>[18840.321714] *pde = 00000000
>> <0>[18840.331287] Oops: 0000 [#1]
>> <4>[18840.340687] Modules linked in:
>> <0>[18840.349749] CPU:    0
>> <4>[18840.349767] EIP:    0060:[<c01bfa7a>]    Not tainted VLI
>> <4>[18840.349782] EFLAGS: 00010202   (2.6.16.41-mtm5-debug1 #1) 
>> <0>[18840.377277] EIP is at serial_in+0xa/0x4a
>> <0>[18840.387221] eax: 00000060   ebx: 00000000   ecx: 00000000   edx: 00000000
>> <0>[18840.397805] esi: 00000000   edi: 00000040   ebp: c728fe1c   esp: c728fe18
>> <0>[18840.408579] ds: 007b   es: 007b   ss: 0068
>> <0>[18840.419624] Process gp_position (pid: 11629, threadinfo=c728e000 task=c7443a90)
>> <0>[18840.420509] Stack: <0>00000000 00000000 c01c0f88 00000000 00000000 c031fef0 00000005 00000202 
>> <0>[18840.445655]        c7161a1c c031fef0 c124b510 c728fe60 c01bd97d c031fef0 c124b510 c124b510 
>> <0>[18840.460540]        00000000 c773dbcc c728fe7c c01befe7 c124b510 00000000 ffffffed c773dbcc 
>>     
>
> Okay, this one is even more plainly "not a coding error".
>
>   
>> <0>[18840.566645]  [<c01c0f88>] serial8250_startup+0x28f/0x2a9
>>     
>
> The code around this point (with the return point marked) is:
>
>   
>> c01c0f78:	6a 05                	push   $0x5
>> c01c0f7a:	53                   	push   %ebx
>> c01c0f7b:	e8 f0 ea ff ff       	call   c01bfa70 <serial_in>
>> c01c0f80:	6a 00                	push   $0x0
>> c01c0f82:	53                   	push   %ebx
>> c01c0f83:	e8 e8 ea ff ff       	call   c01bfa70 <serial_in>
>> c01c0f88<<<	6a 02                	push   $0x2
>> c01c0f8a:	53                   	push   %ebx
>> c01c0f8b:	e8 e0 ea ff ff       	call   c01bfa70 <serial_in>
>>     
>
> and corresponds with this C code:
>
>         (void) serial_inp(up, UART_LSR);
>         (void) serial_inp(up, UART_RX);
>         (void) serial_inp(up, UART_IIR);
>
> Now let's look at the words pushed on the stack around this code:
>
>   00000000
>   00000000
>   c01c0f88 <- return address for serial_in (serial8250_startup+0x28f/0x2a9)
>   00000000 <- from push %ebx at c01c0f82
>   00000000 <- from push $0x0 at c01c0f80
>   c031fef0 <- from push %ebx at c01c0f7a
>   00000005 <- from push $0x5 at c01c0f78
>
> Plainly, %ebx changed across the call to serial_in() at c01c0f7b.
> First thing to notice is this violates the C code - "up" can not
> change.
>
> Now let's look at serial_in:
>
> c01bfa70:       55                      push   %ebp
> c01bfa71:       89 e5                   mov    %esp,%ebp
> c01bfa73:       53                      push   %ebx
> ...
> c01bfab7:       5b                      pop    %ebx
> c01bfab8:       5d                      pop    %ebp
> c01bfab9:       c3                      ret
>
> This code tells the CPU to preserve %ebx and %ebp.  But we know %ebx
> _wasn't_ preserved.  Ergo, your CPU is plainly not doing what the code
> told it to do.
>
> Moreover, serial_in() has preserved %ebx in the past, otherwise we'd
> never have got past all the other serial_in()s in serial8250_startup().
>
> So I think it's very demonstrably a hardware fault, and not software
> related.
>   

It may be a silly question (bear with me, as I'm not familiar with
such low-level programming), but couldn't an interrupt hit in the
middle of the serial_in() calls and mess with %ebx?

What I find really hard to understand is why a hardware fault would
always happen at the same instruction! I would expect a hardware fault
to hit randomly...

I left my application running overnight on a 2.6.16.41 kernel with an
unpatched serial driver (my last Oops report was with Frederik's patch
to remove the insertion made in 2.6.12), and it crashed again at
exactly the same point!

> For all we know, it could be a one-off fault on the hardware you
> happen to have - other identical units may not behave the same (can
> you check?)
>   

Yes, I have other units that I can test on. I'll do that to see whether
it's really a one-off fault on this hardware.
If it still crashes on the other units, I will then test with the
msleep(10) before the "And clear the interrupt registers again for
luck.", as you suggested earlier.

> If it is a one off case, you are welcome to patch that test out in
> your kernel build to remove the problem, and if it's an isolated case
> I encourage you to do this.  This is one of the great advantages of
> open source - if you hit such a problem rather than throwing the
> hardware away you can work around such issues.
>   

I didn't understand what you meant by "you are welcome to patch that
test out in your kernel build to remove the problem". Which test are
you talking about?

Regards,
José Gonçalves

