From: Shailabh Nagar <nagar@watson.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Keith Mannthey <kmannth@gmail.com>, linux-kernel@vger.kernel.org
Subject: Re: 2.6.18-rc1-mm1 panic on boot x86_64 NMI watchdog detected LOCKUP
Date: Tue, 11 Jul 2006 16:23:50 -0400
Message-ID: <44B408D6.1090505@watson.ibm.com>
In-Reply-To: <20060711132102.acb46e5c.akpm@osdl.org>

Andrew Morton wrote:
> On Tue, 11 Jul 2006 11:13:00 -0700
> "Keith Mannthey" <kmannth@gmail.com> wrote:
> 
> 
>>Hello,
>>  I just tried booting 2.6.18-rc1-mm1 (I was booting 2.6.17-mm6 just
>>fine) and got the following error on boot.
>>
>>CPU 15: synchronized TSC with CPU 0 (last diff 49 cycles, maxerr 4698 cycles)
>>Brought up 16 CPUs
>>testing NMI watchdog ... OK.
>>time.c: Using 333.333333 MHz WALL PIT GTOD PIT/HPET timer.
>>time.c: Detected 3002.570 MHz processor.
>>migration_cost=9,1121,16845
>>checking if image is initramfs... it is
>>Freeing initrd memory: 2770k freed
>>NMI Watchdog detected LOCKUP on CPU 8
>>CPU 8
>>Modules linked in:
>>Pid: 51, comm: khelper Not tainted 2.6.18-rc1-mm1-smp #2
>>RIP: 0010:[<ffffffff803dd6f5>]  [<ffffffff803dd6f5>] .text.lock.spinlock+0x31/0x8a
>>RSP: 0000:ffff81065f91be70  EFLAGS: 00000086
>>RAX: 0000000000000000 RBX: ffff810476ce3380 RCX: 0000000000000000
>>RDX: ffff81046fad4108 RSI: ffff81046fad4000 RDI: ffff810476ce3384
>>RBP: ffff810476ce3380 R08: 0000000000000000 R09: 000000000036f849
>>R10: 0000000000000000 R11: 0000000000000002 R12: ffff81065f91bf04
>>R13: ffff81065f91bef8 R14: ffff810476dcdd18 R15: ffffffff8023f7a8
>>FS:  0000000000000000(0000) GS:ffff810476f79140(0000) knlGS:0000000000000000
>>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>>CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
>>Process khelper (pid: 51, threadinfo ffff81065f91a000, task ffff81046fedd080)
>>Stack:  ffffffff803dd040 ffff81047003f8c0 ffff81065f91bef8 ffff810476dcdd18
>> 0000000000000246 ffff81046fad4108 ffff810476ce3380 ffff81046fad4108
>> ffffffff8025b211 0000000000000000 0000000000000000 ffff81046fedd080
>>Call Trace:
>> [<ffffffff803dd040>] __down_read+0x12/0x9a
>> [<ffffffff8025b211>] taskstats_exit_alloc+0x59/0x8a
>> [<ffffffff80232e89>] do_exit+0x178/0x8f6
>> [<ffffffff8023f940>] request_module+0x0/0x150
>> [<ffffffff8020a05a>] child_rip+0x8/0x12
>> [<ffffffff8023f7a8>] __call_usermodehelper+0x0/0x47
>> [<ffffffff8023f866>] ____call_usermodehelper+0x0/0xda
>> [<ffffffff8020a052>] child_rip+0x0/0x12
>>
>>
>>Code: 7e f9 e9 d3 fe ff ff f3 90 83 3b 00 7e f9 e9 da fe ff ff e8
>>console shuts up ...
>>
>>
>>Any ideas, have we seen this?  I can attach config and full dmesg if needed.
>>
> 
> 
> Thanks.  Shailabh sent the below patch through yesterday.  It looks awfully
> similar.


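Yes, this lockup on boot is caused by not initializing the per-cpu
semaphores early enough: khelper exits during boot, and do_exit() ->
taskstats_exit_alloc() calls down_read() on a per-cpu semaphore before
taskstats_init() has run init_rwsem() on it. The patch below moves that
initialization into taskstats_init_early() and should fix it.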

--Shailabh

> 
> From: Shailabh Nagar <nagar@watson.ibm.com>
> 
> Shift initialization of the semaphores taken on the exit() path to earlier
> in the bootup sequence.  Without this fix, booting on machines with many
> CPUs hangs at down_read() called on one of the per-cpu semaphores declared
> in taskstats.
> 
> Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
> 
>  kernel/taskstats.c |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff -puN kernel/taskstats.c~per-task-delay-accounting-taskstats-interface-control-exit-data-through-cpumasks-fix-2 kernel/taskstats.c
> --- a/kernel/taskstats.c~per-task-delay-accounting-taskstats-interface-control-exit-data-through-cpumasks-fix-2
> +++ a/kernel/taskstats.c
> @@ -501,15 +501,20 @@ static struct genl_ops taskstats_ops = {
>  /* Needed early in initialization */
>  void __init taskstats_init_early(void)
>  {
> +	unsigned int i;
> +
>  	taskstats_cache = kmem_cache_create("taskstats_cache",
>  						sizeof(struct taskstats),
>  						0, SLAB_PANIC, NULL, NULL);
> +	for_each_possible_cpu(i) {
> +		INIT_LIST_HEAD(&(per_cpu(listener_array, i).list));
> +		init_rwsem(&(per_cpu(listener_array, i).sem));
> +	}
>  }
>  
>  static int __init taskstats_init(void)
>  {
>  	int rc;
> -	unsigned int i;
>  
>  	rc = genl_register_family(&family);
>  	if (rc)
> @@ -519,11 +524,6 @@ static int __init taskstats_init(void)
>  	if (rc < 0)
>  		goto err;
>  
> -	for_each_possible_cpu(i) {
> -		INIT_LIST_HEAD(&(per_cpu(listener_array, i).list));
> -		init_rwsem(&(per_cpu(listener_array, i).sem));
> -	}
> -
>  	family_registered = 1;
>  	return 0;
>  err:
> _
> 

