* Kernel BUG at objrmap:325 in 2.6.5-7.151-smp (SuSE, x86_64)
@ 2005-07-13 9:07 Christian Boehme
2005-07-13 9:16 ` Arjan van de Ven
2005-07-13 9:59 ` Jan Engelhardt
0 siblings, 2 replies; 5+ messages in thread
From: Christian Boehme @ 2005-07-13 9:07 UTC (permalink / raw)
To: linux-kernel
We frequently see the following kernel BUG in our logs:
kernel: Kernel BUG at objrmap:325
kernel: invalid operand: 0000 [1] SMP
kernel: CPU 0
kernel: Pid: 4752, comm: mhd3d.opteron Tainted: G U (2.6.5-7.151-smp SLES9_SP1_BRANCH-200503181131210000)
kernel: RIP: 0010:[<ffffffff8017d1de>] <ffffffff8017d1de>{page_add_rmap+334}
kernel: RSP: 0000:00000103e68d5db8 EFLAGS: 00010246
kernel: RAX: 000000000100806d RBX: 0000010007cefc48 RCX: 0000000000000000
kernel: RDX: ffffffff80596340 RSI: 00000101fb0d9d90 RDI: 0000010007cefc48
kernel: RBP: 00000000000165ae R08: 0000000000000000 R09: 000000000043c300
kernel: R10: 000000000d9b0b10 R11: 0000000000000002 R12: 00000101fb0d9d90
kernel: R13: 0000000000000003 R14: 00000103f0947800 R15: 000000000043c300
kernel: FS: 0000002a961864c0(0000) GS:ffffffff80554200(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 000000000043c300 CR3: 0000000000101000 CR4: 00000000000006e0
kernel: Process mhd3d.opteron (pid: 4752, threadinfo 00000103e68d4000, task 00000101fb6e8c40)
kernel: Stack: 0000000000000246 ffffffff801740db 00000103e5028800 0000010231caf010
kernel: 0000000100000000 00000101fb0d9d90 00000103ea3b01e0 0000000000000000
kernel: 000000000043c300 00000103f0947800
kernel: Call Trace:<ffffffff801740db>{do_no_page+2987} <ffffffff801758a5>{handle_mm_fault+405}
kernel: <ffffffff80122554>{do_page_fault+468} <ffffffff801477c1>{sys_rt_sigaction+113}
kernel: <ffffffff80111041>{error_exit+0}
kernel:
kernel: Code: 0f 0b 10 ec 37 80 ff ff ff ff 45 01 8b 07 a9 00 80 00 00 75
kernel: RIP <ffffffff8017d1de>{page_add_rmap+334} RSP <00000103e68d5db8>
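The "Code:" bytes in the oops above can be cross-checked against the reported line number. The following is a minimal sketch, assuming the packed BUG() frame layout used by x86_64 kernels of the 2.6 era (a two-byte ud2 opcode, an 8-byte pointer to the file name string, then a 2-byte line number). That layout is an assumption recalled from the kernel headers of that period and is not stated anywhere in this thread; the function name is made up for illustration.

```python
import struct

# "Code:" line copied verbatim from the oops above.
CODE_LINE = "0f 0b 10 ec 37 80 ff ff ff ff 45 01 8b 07 a9 00 80 00 00 75"


def decode_bug_frame(code_line):
    """Decode an assumed BUG() frame: ud2, 8-byte pointer, 2-byte line."""
    raw = bytes.fromhex(code_line.replace(" ", ""))
    if raw[:2] != b"\x0f\x0b":
        raise ValueError("code does not start with ud2 (0f 0b)")
    # Little-endian, no padding: unsigned 64-bit pointer, unsigned 16-bit line.
    filename_ptr, line = struct.unpack_from("<QH", raw, 2)
    return filename_ptr, line


if __name__ == "__main__":
    ptr, line = decode_bug_frame(CODE_LINE)
    print(f"file name string at {ptr:#x}, line {line}")
```

Under that assumption the bytes 45 01 decode to 325, which agrees with the "objrmap:325" in the first line of the oops.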
The bug always hits the same user-compiled executable, and
always at the time the daily cron jobs run, so it seems to
depend on another process. Process 4752 is part of an
MPI-parallel application, and another instance of it runs on
the same node with PID 4751. Interestingly, after this bug the
/proc/4751 directory (i.e., that of the instance not named in
the bug report) is inaccessible. Thanks for any help with this
issue!
Regards
Christian Boehme
--
Dr. Christian Boehme
GWDG Private:
Am Fassberg Wilhelm-Raabe-Str. 15
37077 Göttingen 37083 Göttingen
email: Christian.Boehme@gwdg.de ChristianBoehme@web.de
phone: +49 (0)551 201-1839 +49 (0)551 3077000
fax: +49 (0)551 201-2150 +49 (0)551 3077077
* Re: Kernel BUG at objrmap:325 in 2.6.5-7.151-smp (SuSE, x86_64)
From: Arjan van de Ven @ 2005-07-13 9:16 UTC (permalink / raw)
To: Christian Boehme; +Cc: linux-kernel
On Wed, 2005-07-13 at 11:07 +0200, Christian Boehme wrote:
> We often see the following kernel-bug in our logs:
you really should call SuSE support for this... after all that's what
you're paying them for ;)
>
> kernel: Kernel BUG at objrmap:325
> kernel: invalid operand: 0000 [1] SMP
> kernel: CPU 0
> kernel: Pid: 4752, comm: mhd3d.opteron Tainted: G U (2.6.5-7.151-smp SLES9_SP1_BRANCH-200503181131210000)
Which modules do you use?
* Re: Kernel BUG at objrmap:325 in 2.6.5-7.151-smp (SuSE, x86_64)
From: Christian Boehme @ 2005-07-13 9:52 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: linux-kernel
Arjan van de Ven wrote:
> you really should call SuSE support for this... after all that's what
> you're paying them for ;)
We use SLES because it is a requirement for getting support for some
vendor-supplied MPI libraries... but you are of course still right. I
will file a bug report with them as well.
>>kernel: Kernel BUG at objrmap:325
>>kernel: invalid operand: 0000 [1] SMP
>>kernel: CPU 0
>>kernel: Pid: 4752, comm: mhd3d.opteron Tainted: G U (2.6.5-7.151-smp SLES9_SP1_BRANCH-200503181131210000)
>
>
> which modules do you use ?
Here's the list:
joydev 19584 0
sg 51000 0
st 51236 0
sr_mod 26788 0
thermal 22668 0
processor 28352 1 thermal
fan 12808 0
button 15648 0
battery 17928 0
ac 13832 0
ipv6 317304 21
tg3 88324 0
ohci_hcd 29700 0
evdev 18944 0
usbcore 131824 3 ohci_hcd
ib_sdp 136000 0
ib_useraccess_cm 28320 0
ib_cm 59000 2 ib_sdp,ib_useraccess_cm
ib_udapl 49852 0
ib_ip2pr 38200 3 ib_sdp,ib_useraccess_cm,ib_udapl
ib_ipoib 75160 2 ib_udapl,ib_ip2pr
ib_useraccess 21540 0
ib_sa_client 41232 3 ib_udapl,ib_ip2pr,ib_ipoib
ib_client_query 27040 4 ib_udapl,ib_ip2pr,ib_ipoib,ib_sa_client
ib_tavor 40964 7 ib_useraccess_cm
mod_vapi 169824 3 ib_useraccess_cm,ib_udapl,ib_tavor
mod_vipkl 262520 1 mod_vapi
mod_thh 258272 1 mod_vapi
mod_hh 31888 2 mod_vipkl,mod_thh
mod_mpga 35200 1 mod_vapi
mod_vapi_common 103296 6 ib_useraccess_cm,ib_udapl,ib_tavor,mod_vapi,mod_vipkl,mod_thh
mosal 155340 5 mod_vapi,mod_vipkl,mod_thh,mod_mpga,mod_vapi_common
ib_mad 34200 4 ib_cm,ib_useraccess,ib_client_query,ib_tavor
ib_core 286744 10 ib_sdp,ib_useraccess_cm,ib_cm,ib_udapl,ib_ip2pr,ib_ipoib,ib_useraccess,ib_sa_client,ib_tavor,ib_mad
ib_poll 34768 4 ib_sdp,ib_cm,ib_ip2pr,ib_client_query
ib_services 28484 13 ib_sdp,ib_useraccess_cm,ib_cm,ib_udapl,ib_ip2pr,ib_ipoib,ib_useraccess,ib_sa_client,ib_client_query,ib_tavor,ib_mad,ib_core,ib_poll
mst_pciconf 95488 0
mst_pci 93312 2
w83627hf 39172 0
lm85 34052 0
i2c_sensor 11520 2 w83627hf,lm85
i2c_isa 11008 0
i2c_amd756 15620 0
i2c_core 35716 5 w83627hf,lm85,i2c_sensor,i2c_isa,i2c_amd756
dm_mod 67776 0
ext3 133104 3
jbd 83144 1 ext3
sata_sil 17284 4
libata 51200 1 sata_sil,[permanent]
sd_mod 29568 5
scsi_mod 140800 5 sg,st,sr_mod,libata,sd_mod
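A listing in this layout (name, size, use count, optional comma-separated users) can be filtered mechanically, for example to pull out the InfiniBand modules. This is a minimal sketch; the function names are made up for illustration, and the sample holds only a few lines copied from the list above.

```python
from collections import namedtuple

Module = namedtuple("Module", "name size refcount used_by")

# A few lines copied from the listing above.
SAMPLE = """\
ib_cm 59000 2 ib_sdp,ib_useraccess_cm
mod_thh 258272 1 mod_vapi
libata 51200 1 sata_sil,[permanent]
ext3 133104 3
"""


def parse_lsmod(text):
    """Parse lsmod-style lines into Module tuples."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[1].isdigit():
            continue  # skip blank lines and a "Module Size Used by" header
        used_by = fields[3].split(",") if len(fields) > 3 else []
        modules.append(Module(fields[0], int(fields[1]), int(fields[2]), used_by))
    return modules


def with_prefix(modules, prefixes):
    """Return the names of modules starting with any of the given prefixes."""
    return [m.name for m in modules if m.name.startswith(tuple(prefixes))]


if __name__ == "__main__":
    print(with_prefix(parse_lsmod(SAMPLE), ["ib_"]))
```

Passing the full output of lsmod instead of SAMPLE gives the complete ib_* set.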
The ib_* modules are for the InfiniBand network connections. Thanks for
your help!
Best wishes
Christian Boehme
* Re: Kernel BUG at objrmap:325 in 2.6.5-7.151-smp (SuSE, x86_64)
From: Jan Engelhardt @ 2005-07-13 9:59 UTC (permalink / raw)
To: Christian Boehme; +Cc: linux-kernel
> We often see the following kernel-bug in our logs:
> kernel: Process mhd3d.opteron (pid: 4752, threadinfo 00000103e68d4000,
> kernel: task 00000101fb6e8c40)
> kernel: Stack: 0000000000000246 ffffffff801740db 00000103e5028800
> kernel: 0000010231caf010
> kernel: 0000000100000000 00000101fb0d9d90 00000103ea3b01e0
> kernel: 0000000000000000
> kernel: 000000000043c300 00000103f0947800
> kernel: Call Trace:<ffffffff801740db>{do_no_page+2987}
> kernel: <ffffffff801758a5>{handle_mm_fault+405}
> kernel: <ffffffff80122554>{do_page_fault+468}
> kernel: <ffffffff801477c1>{sys_rt_sigaction+113}
> kernel: <ffffffff80111041>{error_exit+0}
Contact SUSE (bugzilla.novell.com) (or ask Eberhard). But you probably
won't learn much there either, since the stack trace doesn't contain much.
I assume the userspace application that triggers this gets a segfault. So
you could run it under strace and see which "userspace syscall" triggers
it. Actually, it's right there: rt_sigaction() [man rt_sigaction], but
there isn't much to that either.
What is mhd3d.opteron anyway? Otherwise, just update to a newer SUSE; they
have already made the jump from 2.6.5 to 2.6.8 to 2.6.11 in a short time,
at least in the KOTD.
> Interestingly
> after this bug the /proc/4751 dir (i.e., the one of the instance
> not cited in the bug) is inaccessible. Thanks for any help
> with this issue!
Which error number? Have you tried getting in as root?
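One way to answer the error-number question is to attempt the directory listing and report the errno it fails with. This is a minimal sketch; the helper name is made up for illustration, and if the kernel is wedged on a lock the call may hang rather than return an error.

```python
import errno
import os


def probe_dir(path):
    """Return (errno value, symbolic name), or (0, "OK") if listing works."""
    try:
        os.listdir(path)
    except OSError as exc:
        return exc.errno, errno.errorcode.get(exc.errno, "UNKNOWN")
    return 0, "OK"


if __name__ == "__main__":
    # 4751 is the PID from the report; substitute the affected PID.
    print(probe_dir("/proc/4751"))
```

Running it once as the job's owner and once as root would also show whether the failure is a permission problem or something else.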
Jan Engelhardt
--
| Alphagate Systems, http://alphagate.hopto.org/
* Re: Kernel BUG at objrmap:325 in 2.6.5-7.151-smp (SuSE, x86_64)
From: Jan Engelhardt @ 2005-07-13 10:03 UTC (permalink / raw)
Cc: Linux Kernel Mailing List
>> not cited in the bug) is inaccessible. Thanks for any help
>> with this issue!
>Which error number? Have you tried getting in as root?
<whoops, this should not have made it to the list; sorry>