public inbox for linux-kernel@vger.kernel.org
* EIP Errors kernel 2.6.18 ...
@ 2006-09-26 15:34 Ben Duncan
  2006-09-26 15:51 ` Michal Piotrowski
  0 siblings, 1 reply; 6+ messages in thread
From: Ben Duncan @ 2006-09-26 15:34 UTC
  To: linux-kernel

Getting EIP errors with 2.6.18 ...
The following is from syslog ..

Any ideas what is going on?
Thanks ...

Sep 26 04:46:23 desktop kernel: ------------[ cut here ]------------
Sep 26 04:46:23 desktop kernel: kernel BUG at lib/radix-tree.c:404!
Sep 26 04:46:23 desktop kernel: invalid opcode: 0000 [#1]
Sep 26 04:46:23 desktop kernel: DEBUG_PAGEALLOC
Sep 26 04:46:23 desktop kernel: Modules linked in: sr_mod nvidia uhci_hcd nvidia_agp 
i2c_nforce2 sata_nv sd_mod ide_scsi agpgart sata_sil libata genrtc
Sep 26 04:46:23 desktop kernel: CPU:    0
Sep 26 04:46:23 desktop kernel: EIP:    0060:[<c01ba714>]    Tainted: P      VLI
Sep 26 04:46:23 desktop kernel: EFLAGS: 00010046   (2.6.18 #1)
Sep 26 04:46:23 desktop kernel: EIP is at radix_tree_tag_set+0x6a/0xa2
Sep 26 04:46:23 desktop kernel: eax: 00000001   ebx: 00000001   ecx: 00000001   edx: 00000001
Sep 26 04:46:23 desktop kernel: esi: 00000000   edi: 00000000   ebp: f7cf3d70   esp: f7cf3d54
Sep 26 04:46:23 desktop kernel: ds: 007b   es: 007b   ss: 0068
Sep 26 04:46:23 desktop kernel: Process pdflush (pid: 181, ti=f7cf2000 task=f7cd6a90 
task.ti=f7cf2000)
Sep 26 04:46:23 desktop kernel: Stack: 00000001 00000001 00050001 f71a2f4c c1061be0 f71a2f48 
00000000 f7cf3d88
Sep 26 04:46:23 desktop kernel:        c013579e 00000286 f57d5f68 c1061be0 00050002 f7cf3db8 
c014ccc8 00000000
Sep 26 04:46:23 desktop kernel:        00001000 f57d5f68 000773b0 00000000 c0150760 f71a2e70 
00000000 c1061be0
Sep 26 04:46:23 desktop kernel: Call Trace:
Sep 26 04:46:23 desktop kernel:  [<c0103485>] show_stack_log_lvl+0x8f/0x97
Sep 26 04:46:23 desktop kernel:  [<c01035e6>] show_registers+0x116/0x17f
Sep 26 04:46:23 desktop kernel:  [<c01037cb>] die+0x108/0x1ba
Sep 26 04:46:23 desktop kernel:  [<c01038f9>] do_trap+0x7c/0x96
Sep 26 04:46:23 desktop kernel:  [<c0103b64>] do_invalid_op+0x95/0x9c
Sep 26 04:46:23 desktop kernel:  [<c0103161>] error_code+0x39/0x40
Sep 26 04:46:23 desktop kernel:  [<c013579e>] test_set_page_writeback+0x79/0xd1
Sep 26 04:46:23 desktop kernel:  [<c014ccc8>] __block_write_full_page+0x18f/0x292
Sep 26 04:46:23 desktop kernel:  [<c014e087>] block_write_full_page+0x9e/0xa6
Sep 26 04:46:23 desktop kernel:  [<c0150850>] blkdev_writepage+0xf/0x11
Sep 26 04:46:23 desktop kernel:  [<c0168e74>] mpage_writepages+0x1aa/0x304
Sep 26 04:46:23 desktop kernel:  [<c01519a3>] generic_writepages+0xa/0xf
Sep 26 04:46:23 desktop kernel:  [<c013531b>] do_writepages+0x25/0x38
Sep 26 04:46:23 desktop kernel:  [<c016788e>] __sync_single_inode+0x62/0x1bc
Sep 26 04:46:23 desktop kernel:  [<c0167b31>] __writeback_single_inode+0x149/0x151
Sep 26 04:46:23 desktop kernel:  [<c0167ce3>] sync_sb_inodes+0x1aa/0x275
Sep 26 04:46:23 desktop kernel:  [<c0167e39>] writeback_inodes+0x8b/0xd9
Sep 26 04:46:23 desktop kernel:  [<c01351aa>] wb_kupdate+0x70/0xd3
Sep 26 04:46:23 desktop kernel:  [<c013590a>] __pdflush+0xda/0x171
Sep 26 04:46:23 desktop kernel:  [<c01359ca>] pdflush+0x29/0x2b
Sep 26 04:46:23 desktop kernel:  [<c01242ee>] kthread+0x79/0xa1
Sep 26 04:46:23 desktop kernel:  [<c0100d19>] kernel_thread_helper+0x5/0xb
Sep 26 04:46:23 desktop kernel: Code: 83 e0 3f 89 45 e4 8d 14 ce 0f a3 82 04 01 00 00 19 c0 
85 c0 75 0a 8b 45 e4 0f ab 82 04 01 00 00 8b 55 e4 8b 74 96 04 85 f6 75 08 <0f> 0b
94 01 70 35 31 c0 83 ef 06 4b 75 bd 85 f6 74 1c 8b 4d e8
Sep 26 04:46:23 desktop kernel: EIP: [<c01ba714>] radix_tree_tag_set+0x6a/0xa2 SS:ESP 
0068:f7cf3d54
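
As a cross-check on the trace above, the bytes at the faulting EIP (starting at the <0f> marker in the Code: line) can be read by hand. The sketch below assumes the i386 BUG() layout of that era: a ud2 opcode (0f 0b) followed by a 16-bit little-endian line number and a 32-bit pointer to the file name. That layout is an assumption about this particular build, and the helper name decode_bug_marker is made up for illustration.

```python
import struct


def decode_bug_marker(code_bytes: str) -> dict:
    """Decode the bytes at the faulting EIP of a BUG() trap.

    Assumed layout (not confirmed for this build): ud2 (0f 0b),
    u16 line number, u32 file-name pointer, all little-endian.
    """
    raw = bytes.fromhex(code_bytes.replace(" ", ""))
    if raw[:2] != b"\x0f\x0b":
        raise ValueError("bytes do not start with ud2 (0f 0b)")
    # Skip the two opcode bytes, then read the line number and the pointer.
    line, file_ptr = struct.unpack_from("<HI", raw, 2)
    return {"line": line, "file_ptr": hex(file_ptr)}


# Bytes copied from the Code: line, starting at the <0f> marker.
info = decode_bug_marker("0f 0b 94 01 70 35 31 c0")
print(info)
```

If the assumption holds, the decoded line number should agree with the "kernel BUG at lib/radix-tree.c:404!" message, which suggests the trap is a deliberate BUG() check rather than a jump into garbage.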


-- 
Ben Duncan   - Business Network Solutions, Inc. 336 Elton Road  Jackson MS, 39212
"Never attribute to malice, that which can be adequately explained by stupidity"
        - Hanlon's Razor



* Re: EIP Errors kernel 2.6.18 ...
  2006-09-26 15:34 EIP Errors kernel 2.6.18 Ben Duncan
@ 2006-09-26 15:51 ` Michal Piotrowski
  2006-09-26 17:39   ` EIP Errors kernel 2.6.18 .AND hard lockup Ben Duncan
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Piotrowski @ 2006-09-26 15:51 UTC
  To: Ben Duncan; +Cc: linux-kernel

Hi,

On 26/09/06, Ben Duncan <ben@versaccounting.com> wrote:
> Getting EIP errors with 2.6.18 ...
> The following is from syslog ..
>
> Any ideas what is going on?
> Thanks ...
>
> Sep 26 04:46:23 desktop kernel: ------------[ cut here ]------------
> Sep 26 04:46:23 desktop kernel: kernel BUG at lib/radix-tree.c:404!
> Sep 26 04:46:23 desktop kernel: invalid opcode: 0000 [#1]
> Sep 26 04:46:23 desktop kernel: DEBUG_PAGEALLOC
> Sep 26 04:46:23 desktop kernel: Modules linked in: sr_mod nvidia uhci_hcd nvidia_agp
> i2c_nforce2 sata_nv sd_mod ide_scsi agpgart sata_sil libata genrtc
> Sep 26 04:46:23 desktop kernel: CPU:    0
> Sep 26 04:46:23 desktop kernel: EIP:    0060:[<c01ba714>]    Tainted: P      VLI

"When emailing linux-bugs@nvidia.com, please attach an
nvidia-bug-report.log, which is generated by running
"nvidia-bug-report.sh". "
http://www.nvidia.com/object/linux_display_ia32_1.0-8774.html

Please send this report to Nvidia.

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)


* Re: EIP Errors kernel 2.6.18 .AND hard lockup ...
  2006-09-26 15:51 ` Michal Piotrowski
@ 2006-09-26 17:39   ` Ben Duncan
  2006-09-26 18:00     ` Valdis.Kletnieks
  2006-10-25 13:56     ` EIP Errors kernel 2.6.18 .AND hard lockup ... Revisted Ben Duncan
  0 siblings, 2 replies; 6+ messages in thread
From: Ben Duncan @ 2006-09-26 17:39 UTC
  To: linux-kernel

Ok, first, IANAKP (I am not a kernel programmer) ;-> ...
But I just got another hard lockup and EIP code ...

I am not sure why this should go to nVidia (Please, I
personally know the Head of nVidia's Linux driver development,
so if it is an nVidia problem, I can help there).

Anyway,

EIP shows :

desktop kernel: CPU:    0
desktop kernel: EIP:    0060:[<c01ba714>]    Tainted: P      VLI
EFLAGS: 00010046   (2.6.18 #1)
desktop kernel: EIP is at radix_tree_tag_set+0x6a/0xa2
desktop kernel: eax: 00000001   ebx: 00000001   ecx: 00000001   edx: 00000001
desktop kernel: esi: 00000000   edi: 00000000   ebp: f7cf3d70   esp: f7cf3d54
desktop kernel: ds: 007b   es: 007b   ss: 0068
desktop kernel: Process pdflush (pid: 181, ti=f7cf2000 task=f7cd6a90 task.ti=f7cf2000)
desktop kernel: Stack: 00000001 00000001 00050001 f71a2f4c c1061be0 f71a2f48 00000000 f7cf3d88
desktop kernel:        c013579e 00000286 f57d5f68 c1061be0 00050002 f7cf3db8 c014ccc8 00000000
desktop kernel:        00001000 f57d5f68 000773b0 00000000 c0150760 f71a2e70 00000000 c1061be0

The current System.map shows:

c01ba3cc t radix_tree_node_alloc
c01ba416 T radix_tree_preload
c01ba475 t radix_tree_extend
c01ba4ff T radix_tree_insert
c01ba5fc T radix_tree_lookup_slot
c01ba651 T radix_tree_lookup
c01ba6aa T radix_tree_tag_set
c01ba74c T radix_tree_tag_clear
c01ba81a t __lookup
c01ba906 T radix_tree_gang_lookup

The "radix lookup" seems to be occurring inside radix_tree_tag_set,
and in particular in the pdflush process.
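
The lookup described above (finding which function contains the EIP) can be sketched as follows. This is only an illustration: the map text is the excerpt quoted in this message, and the helper name resolve is hypothetical. The size after the slash is estimated as the distance to the next symbol in the excerpt, which is an assumption that holds only when no other symbol lies in between.

```python
import bisect

# Excerpt of System.map as quoted in this message.
SYSTEM_MAP = """\
c01ba3cc t radix_tree_node_alloc
c01ba416 T radix_tree_preload
c01ba475 t radix_tree_extend
c01ba4ff T radix_tree_insert
c01ba5fc T radix_tree_lookup_slot
c01ba651 T radix_tree_lookup
c01ba6aa T radix_tree_tag_set
c01ba74c T radix_tree_tag_clear
c01ba81a t __lookup
c01ba906 T radix_tree_gang_lookup
"""


def resolve(addr, system_map_text):
    """Return 'symbol+offset/size' for the symbol that contains addr."""
    symbols = []
    for line in system_map_text.splitlines():
        if not line.strip():
            continue
        start, _kind, name = line.split()
        symbols.append((int(start, 16), name))
    symbols.sort()
    starts = [s for s, _ in symbols]
    # Last symbol whose start address is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        raise ValueError("address is below the first symbol")
    base, name = symbols[i]
    text = f"{name}+{addr - base:#x}"
    if i + 1 < len(symbols):
        # Estimated size: distance to the next symbol in the map.
        text += f"/{symbols[i + 1][0] - base:#x}"
    return text


result = resolve(0xC01BA714, SYSTEM_MAP)
print(result)  # radix_tree_tag_set+0x6a/0xa2
```

The result agrees with the "EIP is at radix_tree_tag_set+0x6a/0xa2" line, so the running kernel and this System.map appear to match.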

fgrep'ing the kernel source shows this is in lib/radix-tree.c ..

To me this seems to be a PDFLUSH EIP, and the nvidia stuff is just
a by-product of the loaded modules, no?

Thanks ..

Michal Piotrowski wrote:
> Hi,
> 
<SNIP>
> 
> "When emailing linux-bugs@nvidia.com, please attach an
> nvidia-bug-report.log, which is generated by running
> "nvidia-bug-report.sh". "
> http://www.nvidia.com/object/linux_display_ia32_1.0-8774.html
> 
> Please send this report to Nvidia.
> 
> Regards,
> Michal
> 

-- 
Ben Duncan   - Business Network Solutions, Inc. 336 Elton Road  Jackson MS, 39212
"Never attribute to malice, that which can be adequately explained by stupidity"
        - Hanlon's Razor



* Re: EIP Errors kernel 2.6.18 .AND hard lockup ...
  2006-09-26 17:39   ` EIP Errors kernel 2.6.18 .AND hard lockup Ben Duncan
@ 2006-09-26 18:00     ` Valdis.Kletnieks
  2006-09-26 18:09       ` Ben Duncan
  2006-10-25 13:56     ` EIP Errors kernel 2.6.18 .AND hard lockup ... Revisted Ben Duncan
  1 sibling, 1 reply; 6+ messages in thread
From: Valdis.Kletnieks @ 2006-09-26 18:00 UTC
  To: Ben Duncan; +Cc: linux-kernel


On Tue, 26 Sep 2006 12:39:49 CDT, Ben Duncan said:
> I am not sure why this should go to nVidia (Please, I
> personally know the Head of nVidia's Linux driver development,
> so if it is an nVidia problem, I can help there).

Maybe is, maybe isn't.

> desktop kernel: EIP:    0060:[<c01ba714>]    Tainted: P      VLI
                             proprietary module loaded--^

> To me seems to be a PDFLUSH eip and the nvidia stuff is just
> a by product of loaded modules, no?

The point is that we can't know that the NVidia module hasn't stomped on
some random memory location that happened to corrupt a radix tree.  Note
that this is true even if you've loaded and then unloaded the module - it
may have splatted something before it departed....

Is it a replicatable error, and if so, can you replicate it without loading
the NVidia module?  If you can come up with a traceback that doesn't have
an NVidia tainting in it, we'll be glad to look at it.  Conversely, if you're
able to replicate it with nvidia loaded, but not without, toss it over
the fence to your friend.
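
For anyone triaging traces like this, the check described above can be sketched in a few lines. This is a minimal illustration, not a complete decoder: it only distinguishes "Not tainted" from a taint flag string and looks for the P (proprietary module) flag pointed out above; other flag letters exist and are not interpreted here. The function name taint_flags is hypothetical, and the second sample line is taken from the later trace in this thread.

```python
import re


def taint_flags(eip_line):
    """Return the taint flag string from an oops EIP line.

    Returns '' for an untainted kernel and None if the line carries
    no taint information at all.
    """
    if "Not tainted" in eip_line:
        return ""
    match = re.search(r"Tainted:\s+(\S+)", eip_line)
    return match.group(1) if match else None


first = "EIP:    0060:[<c01ba714>]    Tainted: P      VLI"
later = "EIP:    0060:[<c0160620>]    Not tainted VLI"

for line in (first, later):
    flags = taint_flags(line)
    proprietary = bool(flags) and "P" in flags
    print(f"flags={flags!r} proprietary_module_loaded={proprietary}")
```

A trace is only worth sending to the list when the proprietary flag is absent, as in the second sample line.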



* Re: EIP Errors kernel 2.6.18 .AND hard lockup ...
  2006-09-26 18:00     ` Valdis.Kletnieks
@ 2006-09-26 18:09       ` Ben Duncan
  0 siblings, 0 replies; 6+ messages in thread
From: Ben Duncan @ 2006-09-26 18:09 UTC
  To: linux-kernel

Ok, I can remove the module so it is no longer loaded ..

It is reproducible, but only at random. It seems to occur when I hammer on
the SATA drive in the system, which is running on an add-on SIL 3112A
controller card.

Anyway, driver is removed, system rebooted, ksyms logged.
I will hammer again on the system to see if it fails ...

Thanks  ...

Valdis.Kletnieks@vt.edu wrote:
> 
> 
>>desktop kernel: EIP:    0060:[<c01ba714>]    Tainted: P      VLI
> 
>                              proprietary module loaded--^
> 
> 
>>To me seems to be a PDFLUSH eip and the nvidia stuff is just
>>a by product of loaded modules, no?
> 
> 
> The point is that we can't know that the NVidia module hasn't stomped on
> some random memory location that happened to corrupt a radix tree.  Note
> that this is true even if you've loaded and then unloaded the module - it
> may have splatted something before it departed....
> 
> Is it a replicatable error, and if so, can you replicate it without loading
> the NVidia module?  If you can come up with a traceback that doesn't have
> an NVidia tainting in it, we'll be glad to look at it.  Conversely, if you're
> able to replicate it with nvidia loaded, but not without, toss it over
> the fence to your friend.

-- 
Ben Duncan   - Business Network Solutions, Inc. 336 Elton Road  Jackson MS, 39212
"Never attribute to malice, that which can be adequately explained by stupidity"
        - Hanlon's Razor



* EIP Errors kernel 2.6.18 .AND hard lockup ... Revisted
  2006-09-26 17:39   ` EIP Errors kernel 2.6.18 .AND hard lockup Ben Duncan
  2006-09-26 18:00     ` Valdis.Kletnieks
@ 2006-10-25 13:56     ` Ben Duncan
  1 sibling, 0 replies; 6+ messages in thread
From: Ben Duncan @ 2006-10-25 13:56 UTC
  To: linux-kernel

Ok, same stuff, different day. Took the nVidia drivers out.
Ran HOT MEM checks, all passed (eliminating RAM issues).

Same as this time last month.

Got hard lockups every 2-3 days. No syslog / debug output was ever given;
when top is left running on the console, pdflush shows 100% CPU usage.

FINALLY got an EIP error this morning:

Oct 25 04:45:02 desktop kernel: BUG: unable to handle kernel NULL pointer dereference at 
virtual address 00000020
Oct 25 04:45:02 desktop kernel:  printing eip:
Oct 25 04:45:02 desktop kernel: c0160620
Oct 25 04:45:02 desktop kernel: *pde = 00000000
Oct 25 04:45:02 desktop kernel: Oops: 0000 [#1]
Oct 25 04:45:02 desktop kernel: DEBUG_PAGEALLOC
Oct 25 04:45:02 desktop kernel: Modules linked in: nls_iso8859_15 usb_storage uhci_hcd 
nvidia_agp i2c_nforce2 sata_nv sd_mod ide_scsi agpgart sata_sil libata genrtc
Oct 25 04:45:02 desktop kernel: CPU:    0
Oct 25 04:45:02 desktop kernel: EIP:    0060:[<c0160620>]    Not tainted VLI
Oct 25 04:45:02 desktop kernel: EFLAGS: 00010283   (2.6.18 #1)
Oct 25 04:45:02 desktop kernel: EIP is at iput+0x17/0x65
Oct 25 04:45:02 desktop kernel: eax: 00000000   ebx: cf2aee70   ecx: cf2aee88   edx: f7cf4000
Oct 25 04:45:02 desktop kernel: esi: cf2aee70   edi: d5e8c128   ebp: f7cf5e84   esp: f7cf5e80
Oct 25 04:45:02 desktop kernel: ds: 007b   es: 007b   ss: 0068
Oct 25 04:45:02 desktop kernel: Process kswapd0 (pid: 182, ti=f7cf4000 task=f7cdaa90 
task.ti=f7cf4000)
Oct 25 04:45:02 desktop kernel: Stack: d5e8c120 f7cf5e98 c015dc4e d5e8c120 d5e8c120 d5e8c128 
f7cf5ea8 c015e093
Oct 25 04:45:02 desktop kernel:        f5e96e34 d5e8c120 f7cf5ec4 c015e1cc 00000000 0000003c 
00017124 00000000
Oct 25 04:45:02 desktop kernel:        000000a6 f7cf5ecc c015e441 f7cf5f08 c0136fef 005c4900 
00000000 00017124
Oct 25 04:45:02 desktop kernel: Call Trace:
Oct 25 04:45:02 desktop kernel:  [<c0103485>] show_stack_log_lvl+0x8f/0x97
Oct 25 04:45:02 desktop kernel:  [<c01035e6>] show_registers+0x116/0x17f
Oct 25 04:45:02 desktop kernel:  [<c01037cb>] die+0x108/0x1ba
Oct 25 04:45:02 desktop kernel:  [<c010ecb3>] do_page_fault+0x3a4/0x481
Oct 25 04:45:02 desktop kernel:  [<c0103161>] error_code+0x39/0x40
Oct 25 04:45:02 desktop kernel:  [<c015dc4e>] dentry_iput+0x5b/0x73
Oct 25 04:45:02 desktop kernel:  [<c015e093>] prune_one_dentry+0x56/0x79
Oct 25 04:45:02 desktop kernel:  [<c015e1cc>] prune_dcache+0x116/0x14a
Oct 25 04:45:02 desktop kernel:  [<c015e441>] shrink_dcache_memory+0x19/0x31
Oct 25 04:45:02 desktop kernel:  [<c0136fef>] shrink_slab+0x12f/0x18a
Oct 25 04:45:02 desktop kernel:  [<c013803b>] balance_pgdat+0x1c4/0x29c
Oct 25 04:45:02 desktop kernel:  [<c0138207>] kswapd+0xf4/0xf6
Oct 25 04:45:02 desktop kernel:  [<c01242ee>] kthread+0x79/0xa1
Oct 25 04:45:02 desktop kernel:  [<c0100d19>] kernel_thread_helper+0x5/0xb
Oct 25 04:45:02 desktop kernel: Code: 00 55 89 e5 75 07 e8 de fd ff ff eb 05 e8 bd fe ff ff 
c9 c3 55 85 c0 89 e5 53 89 c3 74 58 83 bb 70 01 00 00 20 8b 80 cc 00 00 00 <8b> 40 20
75 08 0f 0b 73 04 e6 94 30 c0 85 c0 74 0b 8b 50 14 85
Oct 25 04:45:02 desktop kernel: EIP: [<c0160620>] iput+0x17/0x65 SS:ESP 0068:f7cf5e80
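
The faulting address in this trace can be reproduced from the Code: line and the register dump. The sketch below is a deliberately narrow illustration: it handles only the one x86 encoding seen at the <8b> marker (opcode 8b with an 8-bit displacement and no SIB byte, i.e. a load from [register + disp8]), and the function name effective_address is made up for this example.

```python
REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def effective_address(code, regs):
    """Address read by `mov r32, [base + disp8]` (opcode 8b, ModRM mod=01).

    Only this single encoding is supported; anything else raises.
    """
    opcode, modrm, disp = code[0], code[1], code[2]
    mod, rm = modrm >> 6, modrm & 0b111
    if opcode != 0x8B or mod != 0b01 or rm == 0b100:
        raise ValueError("unsupported instruction encoding")
    if disp >= 0x80:
        disp -= 0x100  # disp8 is sign-extended
    return (regs[REG_NAMES[rm]] + disp) & 0xFFFFFFFF


# Register values copied from the oops above.
regs = {
    "eax": 0x00000000, "ebx": 0xCF2AEE70, "ecx": 0xCF2AEE88, "edx": 0xF7CF4000,
    "esi": 0xCF2AEE70, "edi": 0xD5E8C128, "ebp": 0xF7CF5E84, "esp": 0xF7CF5E80,
}

# The three bytes starting at the <8b> marker in the Code: line.
addr = effective_address(bytes.fromhex("8b4020"), regs)
print(f"{addr:08x}")
```

With eax equal to zero, the load targets offset 0x20 from a NULL pointer, which lines up with the "NULL pointer dereference at virtual address 00000020" message. Which structure field lives at that offset is not something this sketch can tell; that needs the kernel source and config for this build.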

--------------------------------------------------------------------------------------

The problems seem to have started when I added, at the start of summer, a SIL 3112A SATA
controller and a WD WD2500SD-01K (Rev: 08.0) 250GB SATA disk.


-- 
Ben Duncan   - Business Network Solutions, Inc. 336 Elton Road  Jackson MS, 39212
"Never attribute to malice, that which can be adequately explained by stupidity"
        - Hanlon's Razor

