From: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-scsi@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	linuxppc-dev@ozlabs.org, Balbir Singh <balbir@linux.vnet.ibm.com>
Subject: Re: [BUG] 2.6.24-rc3-git2 softlockup detected
Date: Thu, 29 Nov 2007 12:01:08 +0530
Message-ID: <474E5CAC.9000108@linux.vnet.ibm.com>
In-Reply-To: <20071127232552.ad16e95c.akpm@linux-foundation.org>

Andrew Morton wrote:
> On Wed, 28 Nov 2007 12:47:19 +0530 Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> 
>> Andrew Morton wrote:
>>> On Wed, 28 Nov 2007 11:59:00 +0530 Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
>>>
>>>> Hi,
>>> (cc linux-scsi, for sym53c8xx)
>>>
>>>> Soft lockup is detected while bootup with 2.6.24-rc3-git2 on powerbox
>>> I assume this is a post-2.6.23 regression?
>>>
>>>> BUG: soft lockup - CPU#1 stuck for 11s! [insmod:375]
>>>> NIP: c00000000002f02c LR: d0000000001414fc CTR: c00000000002f018
>>>> REGS: c00000077cbef0b0 TRAP: 0901   Not tainted  (2.6.24-rc3-git2-autotest)
>>>> MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 24022088  XER: 00000000
>>>> TASK = c00000077cbd8000[375] 'insmod' THREAD: c00000077cbec000 CPU: 1
>>>> GPR00: d0000000001414fc c00000077cbef330 c00000000052b930 d000080080002014 
>>>> GPR04: d00008008000202c 0000000000000000 c00000077ca1cb00 d00000000014ce54 
>>>> GPR08: c00000077ca1c63c 0000000000000000 000000000000002a c00000000002f018 
>>>> GPR12: d000000000143610 c000000000473d00 
>>>> NIP [c00000000002f02c] .ioread8+0x14/0x60
>>>> LR [d0000000001414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx]
>>>> Call Trace:
>>>> [c00000077cbef330] [c00000077cbef3c0] 0xc00000077cbef3c0 (unreliable)
>>>> [c00000077cbef3a0] [d0000000001414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx]
>>>> [c00000077cbef470] [d0000000001395f8] .sym2_probe+0x700/0x99c [sym53c8xx]
>>>> [c00000077cbef710] [c0000000001bc118] .pci_device_probe+0x124/0x1b0
>>>> [c00000077cbef7b0] [c000000000221138] .driver_probe_device+0x144/0x20c
>>>> [c00000077cbef850] [c000000000221450] .__driver_attach+0xcc/0x154
>>>> [c00000077cbef8e0] [c00000000021ff94] .bus_for_each_dev+0x7c/0xd4
>>>> [c00000077cbef9a0] [c000000000220e9c] .driver_attach+0x28/0x40
>>>> [c00000077cbefa20] [c0000000002204d8] .bus_add_driver+0x90/0x228
>>>> [c00000077cbefac0] [c000000000221858] .driver_register+0x94/0xb0
>>>> [c00000077cbefb40] [c0000000001bc430] .__pci_register_driver+0x6c/0xcc
>>>> [c00000077cbefbe0] [d000000000143428] .sym2_init+0x108/0x15b0 [sym53c8xx]
>>>> [c00000077cbefc80] [c00000000008ce80] .sys_init_module+0x17c4/0x1958
>>>> [c00000077cbefe30] [c00000000000872c] syscall_exit+0x0/0x40
>>>> Instruction dump:
>>>> 60000000 786b0420 38210070 7d635b78 e8010010 7c0803a6 4e800020 7c0802a6 
>>>> f8010010 f821ff91 7c0004ac 89230000 <0c090000> 4c00012c 79290620 2f8900ff 
>>> I see no obvious lockup sites near the end of sym_hcb_attach().  Maybe it's
>>> being called lots of times from a higher level..  Do the traces all look
>>> the same?
>> Hi Andrew,
>>
>> I see this call trace twice and both looks similar and on another reboot
>> the following trace is seen twice in different cpu
>>
>> BUG: soft lockup detected on CPU#3!
>> Call Trace:
>> [C00000003FEDEDA0] [C000000000010220] .show_stack+0x68/0x1b0 (unreliable)
>> [C00000003FEDEE40] [C0000000000A061C] .softlockup_tick+0xf0/0x13c
>> [C00000003FEDEEF0] [C000000000072E2C] .run_local_timers+0x1c/0x30
>> [C00000003FEDEF70] [C000000000022FA0] .timer_interrupt+0xa8/0x488
>> [C00000003FEDF050] [C0000000000034EC] decrementer_common+0xec/0x100
>> --- Exception: 901 at .ioread8+0x14/0x60
>>     LR = .sym_hcb_attach+0x1194/0x1384 [sym53c8xx]
>> [C00000003FEDF340] [D0000000002B3BC0] 0xd0000000002b3bc0 (unreliable)
>> [C00000003FEDF3B0] [D00000000029A3C0] .sym_hcb_attach+0x1194/0x1384 [sym53c8xx]
>> [C00000003FEDF480] [D000000000291D30] .sym2_probe+0x75c/0x9f8 [sym53c8xx]
>> [C00000003FEDF710] [C0000000001B65A4] .pci_device_probe+0x13c/0x1dc
>> [C00000003FEDF7D0] [C000000000219A0C] .driver_probe_device+0xa0/0x15c
>> [C00000003FEDF870] [C000000000219C64] .__driver_attach+0xb4/0x138
>> [C00000003FEDF900] [C00000000021913C] .bus_for_each_dev+0x7c/0xd4
>> [C00000003FEDF9C0] [C0000000002198B0] .driver_attach+0x28/0x40
>> [C00000003FEDFA40] [C000000000218BA4] .bus_add_driver+0x98/0x18c
>> [C00000003FEDFAE0] [C00000000021A064] .driver_register+0xa8/0xc4
>> [C00000003FEDFB60] [C0000000001B68AC] .__pci_register_driver+0x5c/0xa4
>> [C00000003FEDFBF0] [D00000000029C204] .sym2_init+0x104/0x1550 [sym53c8xx]
>> [C00000003FEDFC90] [C00000000008D1F4] .sys_init_module+0x1764/0x1998
>> [C00000003FEDFE30] [C00000000000869C] syscall_exit+0x0/0x40
>>
> 
> hm, odd.
> 
> Can you look up sym_hcb_attach+0x1194/0x1384 in gdb?  Something like
> 
Hi Andrew,

I tried 2.6.24-rc3-git3 and got the following trace:

BUG: soft lockup - CPU#2 stuck for 11s! [insmod:375]
NIP: c00000000002f02c LR: d0000000001414fc CTR: c00000000002f018
REGS: c00000077ca3b0b0 TRAP: 0901   Not tainted  (2.6.24-rc3-git3-autokern1)
MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 24022088  XER: 00000000
TASK = c00000077cc58000[375] 'insmod' THREAD: c00000077ca38000 CPU: 2
GPR00: d0000000001414fc c00000077ca3b330 c00000000052b880 d000080080002014 
GPR04: d00008008000202c 0000000000000000 c00000077c82eb00 d00000000014ce54 
GPR08: c00000077c82e63c 0000000000000000 000000000000002a c00000000002f018 
GPR12: d000000000143610 c000000000473f80 
NIP [c00000000002f02c] .ioread8+0x14/0x60
LR [d0000000001414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx]

Call Trace:
[c00000077ca3b330] [c00000077ca3b3c0] 0xc00000077ca3b3c0 (unreliable)
[c00000077ca3b3a0] [d0000000001414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx]
[c00000077ca3b470] [d0000000001395f8] .sym2_probe+0x700/0x99c [sym53c8xx]
[c00000077ca3b710] [c0000000001bc098] .pci_device_probe+0x124/0x1b0
[c00000077ca3b7b0] [c0000000002210c4] .driver_probe_device+0x144/0x20c
[c00000077ca3b850] [c0000000002213dc] .__driver_attach+0xcc/0x154
[c00000077ca3b8e0] [c00000000021ff20] .bus_for_each_dev+0x7c/0xd4
[c00000077ca3b9a0] [c000000000220e28] .driver_attach+0x28/0x40
[c00000077ca3ba20] [c000000000220464] .bus_add_driver+0x90/0x228
[c00000077ca3bac0] [c0000000002217e4] .driver_register+0x94/0xb0
[c00000077ca3bb40] [c0000000001bc3b0] .__pci_register_driver+0x6c/0xcc
[c00000077ca3bbe0] [d000000000143428] .sym2_init+0x108/0x15b0 [sym53c8xx]
[c00000077ca3bc80] [c00000000008ce80] .sys_init_module+0x17c4/0x1958
[c00000077ca3be30] [c00000000000872c] syscall_exit+0x0/0x40

Instruction dump:
60000000 786b0420 38210070 7d635b78 e8010010 7c0803a6 4e800020 7c0802a6
f8010010 f821ff91 7c0004ac 89230000 <0c090000> 4c00012c 79290620 2f8900ff

gdb lists the following source for the LR address in the above trace:

0xa4fc is in sym_hcb_attach (drivers/scsi/sym53c8xx_2/sym_hipd.c:1041).
1036            OUTL_DSP(np, pc);
1037            /*
1038             *  Wait 'til done (with timeout)
1039             */
1040            for (i=0; i<SYM_SNOOP_TIMEOUT; i++)
1041                    if (INB(np, nc_istat) & (INTF|SIP|DIP))
1042                            break;
1043            if (i>=SYM_SNOOP_TIMEOUT) {
1044                    printf ("CACHE TEST FAILED: timeout.\n");
1045                    return (0x20);
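
As a cross-check that the gdb address and the LR in the trace refer to the
same instruction, the offsets can be worked through by hand. The sketch
below is only arithmetic on the values printed above. It assumes that gdb
was run against the sym53c8xx module object and that 0xa4fc is the
object-relative address matching .sym_hcb_attach+0x1188 (the mail does not
show the gdb command that was used). The helper names are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the 2.6.24-rc3-git3 trace and the gdb output above. */
#define TRACE_LR     0xd0000000001414fcULL /* LR reported in the soft lockup */
#define TRACE_OFFSET 0x1188ULL             /* from ".sym_hcb_attach+0x1188"  */
#define GDB_ADDR     0xa4fcULL             /* address that gdb resolved      */

/* Start of a function, given an address inside it and its "+0x..." offset. */
uint64_t symbol_start(uint64_t addr, uint64_t offset)
{
    assert(addr >= offset);
    return addr - offset;
}

/* Start of sym_hcb_attach in the running kernel's address space. */
uint64_t runtime_func_start(void)
{
    return symbol_start(TRACE_LR, TRACE_OFFSET);
}

/* Start of sym_hcb_attach in the object gdb was pointed at (assumption). */
uint64_t object_func_start(void)
{
    return symbol_start(GDB_ADDR, TRACE_OFFSET);
}

/* Implied offset between object addresses and runtime addresses. */
uint64_t load_bias(void)
{
    return runtime_func_start() - object_func_start();
}
```

Under those assumptions the function starts at 0xd000000000140374 at run
time and at 0x9374 in the object, so the implied load bias is
0xd000000000137000. That value is page aligned, which is consistent with
0xa4fc and the LR being the same return address, i.e. the return from the
INB() call on line 1041.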


-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
