From: Mike Waychison <mikew@google.com>
To: YhLu <YhLu@tyan.com>
Cc: Peter Buckingham <peter@pantasys.com>,
	linux-kernel@vger.kernel.org,
	"'discuss@x86-64.org'" <discuss@x86-64.org>
Subject: Re: 2.6.13-rc2 with dual way dual core ck804 MB
Date: Wed, 10 Aug 2005 16:14:19 -0700
Message-ID: <42FA8A4B.4090408@google.com>
In-Reply-To: <3174569B9743D511922F00A0C94314230AF97867@TYANWEB>

YhLu wrote:
> andi,
> 
> please refer to the patch; it moves the cpu_set(..., cpu_callin_map) call
> from smp_callin to start_secondary.


This patch fixes an apparent race / lockup on our 2-way dual-core
machines (when applied against 2.6.12.3).  The machines were locking up
after "Initializing CPU#2".

Mike Waychison

> 
> --- /home/yhlu/xx1/linux-2.6.13-rc2/arch/x86_64/kernel/smpboot.c.orig	2005-07-06 18:41:16.789767168 -0700
> +++ /home/yhlu/xx1/linux-2.6.13-rc2/arch/x86_64/kernel/smpboot.c	2005-07-06 18:45:11.923021480 -0700
> @@ -442,7 +442,7 @@
>         /*
>          * Allow the master to continue.
>          */
> -       cpu_set(cpuid, cpu_callin_map);
> +//     cpu_set(cpuid, cpu_callin_map); // moved to start_secondary by yhlu
>  }
> 
>  static inline void set_cpu_sibling_map(int cpu)
> @@ -529,8 +529,11 @@
>         /* Wait for TSC sync to not schedule things before.
>            We still process interrupts, which could see an inconsistent
>            time in that window unfortunately. */
> +
>         tsc_sync_wait();
> 
> +       cpu_set(smp_processor_id(), cpu_callin_map); // moved from smp_callin by yhlu
> +
>         cpu_idle();
>  }
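
To make the effect of the reordering above concrete, here is a minimal
single-threaded model in plain C. It is only a sketch, not kernel code:
the step labels, the function name and the two arrays are hypothetical
names made up for illustration. It replays the secondary's steps in a
given order and reports whether the TSC sync step had already run when
the callin bit was set, which is the point at which a master polling
cpu_callin_map would stop waiting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step labels for one secondary CPU; not kernel symbols. */
enum ap_step { AP_TSC_SYNC, AP_SET_CALLIN };

/*
 * Replay the secondary's steps in the given order and report whether the
 * TSC sync step had already run at the moment the callin bit was set,
 * i.e. at the moment a master polling that bit would stop waiting.
 */
static bool synced_when_callin_set(const enum ap_step *steps, size_t n)
{
    bool tsc_synced = false;

    for (size_t i = 0; i < n; i++) {
        if (steps[i] == AP_TSC_SYNC)
            tsc_synced = true;
        else if (steps[i] == AP_SET_CALLIN)
            return tsc_synced;
    }
    return false; /* callin bit never set: the master would time out */
}

/* Order before the patch: callin bit set in smp_callin, sync afterwards. */
static const enum ap_step before_patch[] = { AP_SET_CALLIN, AP_TSC_SYNC };

/* Order after the patch: sync first, then the callin bit. */
static const enum ap_step after_patch[] = { AP_TSC_SYNC, AP_SET_CALLIN };
```

With these arrays, synced_when_callin_set(before_patch, 2) returns false
and synced_when_callin_set(after_patch, 2) returns true. The model says
nothing about why the old order locks up on this hardware; it only shows
what the master can observe in each case.
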
> 
> another solution would be to change cpu_callin_map to cpu_online_map in
> do_boot_cpu:
> 
>                 /*
>                  * allow APs to start initializing.
>                  */
>                 Dprintk("Before Callout %d.\n", cpu);
>                 cpu_set(cpu, cpu_callout_map);
>                 Dprintk("After Callout %d.\n", cpu);
> 
>                 /*
>                  * Wait 5s total for a response
>                  */
>                 for (timeout = 0; timeout < 50000; timeout++) {
>                         if (cpu_isset(cpu, cpu_callin_map))
> --------------------------> cpu_online_map
>                                 break;  /* It has booted */
>                         udelay(100);
>                 }
> 
>                 if (cpu_isset(cpu, cpu_callin_map)) {
> --------------------------------> cpu_online_map
>                         /* number CPUs logically, starting from 1 (BSP is 0) */
>                         Dprintk("CPU has booted.\n");
>                 } else {
>                         boot_error = 1;
>                         if (*((volatile unsigned char *)phys_to_virt(SMP_TRAMPOLINE_BASE))
>                                         == 0xA5)
>                                 /* trampoline started but...? */
>                                 printk("Stuck ??\n");
>                         else
>                                 /* trampoline code not run */
>                                 printk("Not responding.\n");
> #if APIC_DEBUG
>                         inquire_remote_apic(apicid);
> #endif
>                 }
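
For reference, the wait loop quoted above polls up to 50000 times with
udelay(100) between polls, which is where the "5s total" in its comment
comes from. The following user-space sketch restates that bounded poll.
Everything in it is an assumption for illustration: mask_t is a
hypothetical 64-bit stand-in for cpu_callin_map / cpu_online_map, and
the helper names and the optional delay callback are not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 64-CPU mask standing in for the kernel's CPU maps. */
typedef uint64_t mask_t;

/* Test one CPU's bit; assumes cpu < 64. */
static bool mask_isset(mask_t m, unsigned cpu)
{
    return (m >> cpu) & 1u;
}

/* Worst-case wait in microseconds: number of polls times delay per poll. */
static uint64_t worst_case_wait_us(unsigned polls, unsigned delay_us)
{
    return (uint64_t)polls * delay_us;
}

/*
 * Poll the map up to `polls` times. delay_fn is a caller-supplied sleep
 * routine (may be NULL to skip sleeping). Returns true as soon as the
 * CPU's bit is seen, false if the bound is reached first.
 */
static bool wait_for_cpu(const volatile mask_t *map, unsigned cpu,
                         unsigned polls,
                         void (*delay_fn)(unsigned usec), unsigned delay_us)
{
    for (unsigned i = 0; i < polls; i++) {
        if (mask_isset(*map, cpu))
            return true;
        if (delay_fn)
            delay_fn(delay_us);
    }
    return false;
}
```

worst_case_wait_us(50000, 100) gives 5000000 microseconds, i.e. 5
seconds. Which map is passed in decides what "booted" means to the
caller, and that choice is exactly the difference between the two
proposals in this mail.
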
> 
> 
> the resulting boot log:
> 
> Booting processor 1/1 rip 6000 rsp ffff81013ff89f58
> Initializing CPU#1
> masked ExtINT on CPU#1
> Calibrating delay using timer specific routine.. 4422.98 BogoMIPS (lpj=8845965)
> CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
> CPU: L2 Cache: 1024K (64 bytes/line)
> CPU 1(2) -> Node 0 -> Core 1
>  stepping 00
> CPU 1: Syncing TSC to CPU 0.
> sync_master: 1 smp_processor_id() = 00, boot_cpu_id= 00
> sync_master: 2 smp_processor_id() = 00, boot_cpu_id= 00
> CPU 1: synchronized TSC with CPU 0 (last diff 0 cycles, maxerr 595 cycles)
> ---------------------> it is in the right place.
> Booting processor 2/2 rip 6000 rsp ffff81023ff1df58
> Initializing CPU#2
> masked ExtINT on CPU#2
> Calibrating delay using timer specific routine.. 4422.99 BogoMIPS (lpj=8845997)
> CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
> CPU: L2 Cache: 1024K (64 bytes/line)
> CPU 2(2) -> Node 1 -> Core 0
>  stepping 00
> CPU 2: Syncing TSC to CPU 0.
> sync_master: 1 smp_processor_id() = 00, boot_cpu_id= 00
> sync_master: 1 smp_processor_id() = 01, boot_cpu_id= 00
> sync_master: 2 smp_processor_id() = 00, boot_cpu_id= 00
> CPU 2: synchronized TSC with CPU 0 (last diff -4 cycles, maxerr 1097 cycles)
> Booting processor 3/3 rip 6000 rsp ffff81013ff53f58
> Initializing CPU#3
> masked ExtINT on CPU#3
> Calibrating delay using timer specific routine.. 4423.03 BogoMIPS (lpj=8846075)
> CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
> CPU: L2 Cache: 1024K (64 bytes/line)
> CPU 3(2) -> Node 1 -> Core 1
>  stepping 00
> CPU 3: Syncing TSC to CPU 0.
> sync_master: 1 smp_processor_id() = 00, boot_cpu_id= 00
> sync_master: 1 smp_processor_id() = 01, boot_cpu_id= 00
> sync_master: 1 smp_processor_id() = 02, boot_cpu_id= 00
> sync_master: 2 smp_processor_id() = 00, boot_cpu_id= 00
> CPU 3: synchronized TSC with CPU 0 (last diff -4 cycles, maxerr 1097 cycles)
> Brought up 4 CPUs
> 
> 
> 
>>-----Original Message-----
>>From: YhLu 
>>Sent: Wednesday, July 06, 2005 3:25 PM
>>To: Andi Kleen
>>Cc: Peter Buckingham; linux-kernel@vger.kernel.org
>>Subject: 2.6.13-rc2 with dual way dual core ck804 MB
>>
>>andi,
>>
>>core1/node0 takes a long time to get its TSC synchronized. Is
>>that normal?
>>I would expect
>>"CPU 1: synchronized TSC with CPU 0" to appear right after
>>"CPU 1: Syncing TSC to CPU 0."
>>
>>YH
>>
>>
>>cpu 1: setting up apic clock
>>cpu 1: enabling apic timer
>>CPU 1: Syncing TSC to CPU 0.
>>CPU has booted.
>>waiting for cpu 1
>>
>>cpu 2: setting up apic clock
>>cpu 2: enabling apic timer
>>CPU 2: Syncing TSC to CPU 0.
>>CPU 2: synchronized TSC with CPU 0 (last diff -4 cycles, maxerr 1097 cycles)
>>CPU has booted.
>>waiting for cpu 2
>>
>>cpu 3: setting up apic clock
>>cpu 3: enabling apic timer
>>CPU 3: Syncing TSC to CPU 0.
>>CPU 3: synchronized TSC with CPU 0 (last diff 1 cycles, maxerr 1087 cycles)
>>CPU has booted.
>>waiting for cpu 3
>>
>>testing NMI watchdog ... CPU#1: NMI appears to be stuck (1->1)!
>>checking if image is initramfs...<6>CPU 1: synchronized TSC 
>>with CPU 0 (last diff 0 cycles, maxerr 595 cycles) it isn't 
>>(no cpio magic); looks like an initrd
>>
>>
>>the
>>


