public inbox for linux-kernel@vger.kernel.org
* RE: 2.6.13-rc2 with dual way dual core ck804 MB
@ 2005-07-07  0:56 YhLu
  2005-08-10 23:14 ` Mike Waychison
  0 siblings, 1 reply; 18+ messages in thread
From: YhLu @ 2005-07-07  0:56 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Peter Buckingham, linux-kernel, 'discuss@x86-64.org'

andi,

Please refer to the patch below; it moves the cpu_set(..., cpu_callin_map) call
from smp_callin to start_secondary.

--- /home/yhlu/xx1/linux-2.6.13-rc2/arch/x86_64/kernel/smpboot.c.orig	2005-07-06 18:41:16.789767168 -0700
+++ /home/yhlu/xx1/linux-2.6.13-rc2/arch/x86_64/kernel/smpboot.c	2005-07-06 18:45:11.923021480 -0700
@@ -442,7 +442,7 @@
        /*
         * Allow the master to continue.
         */
-       cpu_set(cpuid, cpu_callin_map);
+//     cpu_set(cpuid, cpu_callin_map); // moved to start_secondary by yhlu
 }

 static inline void set_cpu_sibling_map(int cpu)
@@ -529,8 +529,11 @@
        /* Wait for TSC sync to not schedule things before.
           We still process interrupts, which could see an inconsistent
           time in that window unfortunately. */
+
        tsc_sync_wait();

+       cpu_set(smp_processor_id(), cpu_callin_map); // moved from smp_callin by yhlu
+
        cpu_idle();
 }

The other solution would be to change cpu_callin_map to cpu_online_map in
do_boot_cpu:

                /*
                 * allow APs to start initializing.
                 */
                Dprintk("Before Callout %d.\n", cpu);
                cpu_set(cpu, cpu_callout_map);
                Dprintk("After Callout %d.\n", cpu);

                /*
                 * Wait 5s total for a response
                 */
                for (timeout = 0; timeout < 50000; timeout++) {
                        if (cpu_isset(cpu, cpu_callin_map))
--------------------------> cpu_online_map
                                break;  /* It has booted */
                        udelay(100);
                }

                if (cpu_isset(cpu, cpu_callin_map)) {
--------------------------------> cpu_online_map
                        /* number CPUs logically, starting from 1 (BSP is 0) */
                        Dprintk("CPU has booted.\n");
                } else {
                        boot_error = 1;
                        if (*((volatile unsigned char *)phys_to_virt(SMP_TRAMPOLINE_BASE))
                                        == 0xA5)
                                /* trampoline started but...? */
                                printk("Stuck ??\n");
                        else
                                /* trampoline code not run */
                                printk("Not responding.\n");
#if APIC_DEBUG
                        inquire_remote_apic(apicid);
#endif
                }


the result will be

Booting processor 1/1 rip 6000 rsp ffff81013ff89f58
Initializing CPU#1
masked ExtINT on CPU#1
Calibrating delay using timer specific routine.. 4422.98 BogoMIPS (lpj=8845965)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1(2) -> Node 0 -> Core 1
 stepping 00
CPU 1: Syncing TSC to CPU 0.
sync_master: 1 smp_processor_id() = 00, boot_cpu_id= 00
sync_master: 2 smp_processor_id() = 00, boot_cpu_id= 00
CPU 1: synchronized TSC with CPU 0 (last diff 0 cycles, maxerr 595 cycles)
---------------------> it is in the right place.
Booting processor 2/2 rip 6000 rsp ffff81023ff1df58
Initializing CPU#2
masked ExtINT on CPU#2
Calibrating delay using timer specific routine.. 4422.99 BogoMIPS (lpj=8845997)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 2(2) -> Node 1 -> Core 0
 stepping 00
CPU 2: Syncing TSC to CPU 0.
sync_master: 1 smp_processor_id() = 00, boot_cpu_id= 00
sync_master: 1 smp_processor_id() = 01, boot_cpu_id= 00
sync_master: 2 smp_processor_id() = 00, boot_cpu_id= 00
CPU 2: synchronized TSC with CPU 0 (last diff -4 cycles, maxerr 1097 cycles)
Booting processor 3/3 rip 6000 rsp ffff81013ff53f58
Initializing CPU#3
masked ExtINT on CPU#3
Calibrating delay using timer specific routine.. 4423.03 BogoMIPS (lpj=8846075)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 3(2) -> Node 1 -> Core 1
 stepping 00
CPU 3: Syncing TSC to CPU 0.
sync_master: 1 smp_processor_id() = 00, boot_cpu_id= 00
sync_master: 1 smp_processor_id() = 01, boot_cpu_id= 00
sync_master: 1 smp_processor_id() = 02, boot_cpu_id= 00
sync_master: 2 smp_processor_id() = 00, boot_cpu_id= 00
CPU 3: synchronized TSC with CPU 0 (last diff -4 cycles, maxerr 1097 cycles)
Brought up 4 CPUs


> -----Original Message-----
> From: YhLu 
> Sent: Wednesday, July 06, 2005 3:25 PM
> To: Andi Kleen
> Cc: Peter Buckingham; linux-kernel@vger.kernel.org
> Subject: 2.6.13-rc2 with dual way dual core ck804 MB
> 
> andi,
> 
> the core1/node0 take a long while to get TSC synchronized. Is 
> it normal?
> i guess
> "CPU 1: synchronized TSC with CPU 0"  should be just after 
> "CPU 1: Syncing TSC to CPU0"
> 
> YH
> 
> 
> cpu 1: setting up apic clock
> cpu 1: enabling apic timer
> CPU 1: Syncing TSC to CPU 0.
> CPU has booted.
> waiting for cpu 1
> 
> cpu 2: setting up apic clock
> cpu 2: enabling apic timer
> CPU 2: Syncing TSC to CPU 0.
> CPU 2: synchronized TSC with CPU 0 (last diff -4 cycles, 
> maxerr 1097 cycles) CPU has booted.
> waiting for cpu 2
> 
> cpu 3: setting up apic clock
> cpu 3: enabling apic timer
> CPU 3: Syncing TSC to CPU 0.
> CPU 3: synchronized TSC with CPU 0 (last diff 1 cycles, 
> maxerr 1087 cycles) CPU has booted.
> waiting for cpu 3
> 
> testing NMI watchdog ... CPU#1: NMI appears to be stuck (1->1)!
> checking if image is initramfs...<6>CPU 1: synchronized TSC 
> with CPU 0 (last diff 0 cycles, maxerr 595 cycles) it isn't 
> (no cpio magic); looks like an initrd
> 
> 
> the
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-07-07  0:56 2.6.13-rc2 with dual way dual core ck804 MB YhLu
@ 2005-08-10 23:14 ` Mike Waychison
  2005-08-10 23:26   ` [discuss] " Andi Kleen
  2005-08-10 23:31   ` Peter Buckingham
  0 siblings, 2 replies; 18+ messages in thread
From: Mike Waychison @ 2005-08-10 23:14 UTC (permalink / raw)
  To: YhLu; +Cc: Peter Buckingham, linux-kernel, 'discuss@x86-64.org'

YhLu wrote:
> andi,
> 
> please refer the patch, it will move cpu_set(, cpu_callin_map) from
> smi_callin to start_secondary.


This patch fixes an apparent race / lockup on our 2-way dual cores (when 
applied against 2.6.12.3).  The machine was locking up after 
"Initializing CPU#2".

Mike Waychison



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-10 23:14 ` Mike Waychison
@ 2005-08-10 23:26   ` Andi Kleen
  2005-08-10 23:42     ` yhlu
  2005-08-10 23:49     ` Mike Waychison
  2005-08-10 23:31   ` Peter Buckingham
  1 sibling, 2 replies; 18+ messages in thread
From: Andi Kleen @ 2005-08-10 23:26 UTC (permalink / raw)
  To: Mike Waychison
  Cc: YhLu, Peter Buckingham, linux-kernel,
	'discuss@x86-64.org'

On Wed, Aug 10, 2005 at 04:14:19PM -0700, Mike Waychison wrote:
> YhLu wrote:
> >andi,
> >
> >please refer the patch, it will move cpu_set(, cpu_callin_map) from
> >smi_callin to start_secondary.
> 
> 
> This patch fixes an apparent race / lockup on our 2-way dual cores (when 
> applied against 2.6.12.3).  The machine was locking up after 
> "Initializing CPU#2".

The real solution for this issue is the smp_call_function_single patch from Eric
that I reposted yesterday. Yh's patch just changed the timing slightly.


-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-10 23:14 ` Mike Waychison
  2005-08-10 23:26   ` [discuss] " Andi Kleen
@ 2005-08-10 23:31   ` Peter Buckingham
  1 sibling, 0 replies; 18+ messages in thread
From: Peter Buckingham @ 2005-08-10 23:31 UTC (permalink / raw)
  To: Mike Waychison; +Cc: YhLu, linux-kernel, 'discuss@x86-64.org'

Mike Waychison wrote:
> This patch fixes an apparent race / lockup on our 2-way dual cores (when 
> applied against 2.6.12.3).  The machine was locking up after 
> "Initializing CPU#2".

The better way is to use the patch from Eric that Andi posted to stable
yesterday:

	http://x86-64.org/lists/discuss/msg06943.html

peter

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-10 23:26   ` [discuss] " Andi Kleen
@ 2005-08-10 23:42     ` yhlu
  2005-08-11  0:04       ` Andi Kleen
  2005-08-10 23:49     ` Mike Waychison
  1 sibling, 1 reply; 18+ messages in thread
From: yhlu @ 2005-08-10 23:42 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

andi,

You can see the difference. With the patch:
Booting processor 1/1 rip 6000 rsp ffff810181c61f58
Initializing CPU#1
masked ExtINT on CPU#1
Calibrating delay using timer specific routine.. 4000.31 BogoMIPS (lpj=8000624)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
 stepping 0a
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff 0 cycles, maxerr 886 cycles)
Booting processor 2/2 rip 6000 rsp ffff81017ffa3f58
Initializing CPU#2
masked ExtINT on CPU#2
Calibrating delay using timer specific routine.. 4000.30 BogoMIPS (lpj=8000605)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
 stepping 0a
CPU 2: Syncing TSC to CPU 0.
CPU 2: synchronized TSC with CPU 0 (last diff 1 cycles, maxerr 901 cycles)
Booting processor 3/3 rip 6000 rsp ffff8101fffa9f58
Initializing CPU#3
masked ExtINT on CPU#3
Calibrating delay using timer specific routine.. 4000.31 BogoMIPS (lpj=8000622)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
 stepping 0a
CPU 3: Syncing TSC to CPU 0.
CPU 3: synchronized TSC with CPU 0 (last diff -3 cycles, maxerr 1504 cycles)
Brought up 4 CPUs


Without the patch:
Booting processor 1/4 APIC 0x1
Initializing CPU#1
masked ExtINT on CPU#1
Calibrating delay using timer specific routine.. 4000.30 BogoMIPS (lpj=8000608)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1(1) -> Node 1 -> Core 0
 stepping 0a
CPU 1: Syncing TSC to CPU 0.
Booting processor 2/4 APIC 0x2
Initializing CPU#2
masked ExtINT on CPU#2
CPU 1: synchronized TSC with CPU 0 (last diff 1 cycles, maxerr 893 cycles)
Calibrating delay using timer specific routine.. 4000.36 BogoMIPS (lpj=8000724)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 2(1) -> Node 2 -> Core 0
 stepping 0a
CPU 2: Syncing TSC to CPU 0.
Booting processor 3/4 APIC 0x3
Initializing CPU#3
masked ExtINT on CPU#3
CPU 2: synchronized TSC with CPU 0 (last diff 0 cycles, maxerr 904 cycles)
Calibrating delay using timer specific routine.. 4000.16 BogoMIPS (lpj=8000335)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 3(1) -> Node 3 -> Core 0
 stepping 0a
CPU 3: Syncing TSC to CPU 0.
Brought up 4 CPUs
time.c: Using PIT/TSC based timekeeping.
testing NMI watchdog ... OK.
checking if image is initramfs...<6>CPU 3: synchronized TSC with CPU 0
(last diff -18 cycles, maxerr 1504 cycles)
it isn't (no cpio magic); looks like an initrd

So my patch can still be used together with Eric's; it just serializes the
TSC_SYNC between CPUs.

I wonder if you could refine things so that TSC_SYNC is serialized between
CPUs. That would make the
CPU X: synchronized TSC ...
message show up at a fixed position and with fixed timing.

YH


On 8/10/05, Andi Kleen <ak@suse.de> wrote:
> On Wed, Aug 10, 2005 at 04:14:19PM -0700, Mike Waychison wrote:
> > YhLu wrote:
> > >andi,
> > >
> > >please refer the patch, it will move cpu_set(, cpu_callin_map) from
> > >smi_callin to start_secondary.
> >
> >
> > This patch fixes an apparent race / lockup on our 2-way dual cores (when
> > applied against 2.6.12.3).  The machine was locking up after
> > "Initializing CPU#2".
> 
> The real solution for this issue is the smp_call_function_single patch from Eric
> that I reposted yesterday. Yh's patch just changed the timing slightly.
> 
> 
> -Andi
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-10 23:26   ` [discuss] " Andi Kleen
  2005-08-10 23:42     ` yhlu
@ 2005-08-10 23:49     ` Mike Waychison
  1 sibling, 0 replies; 18+ messages in thread
From: Mike Waychison @ 2005-08-10 23:49 UTC (permalink / raw)
  To: Andi Kleen
  Cc: YhLu, Peter Buckingham, linux-kernel,
	'discuss@x86-64.org'

Andi Kleen wrote:
> On Wed, Aug 10, 2005 at 04:14:19PM -0700, Mike Waychison wrote:
> 
>>YhLu wrote:
>>
>>>andi,
>>>
>>>please refer the patch, it will move cpu_set(, cpu_callin_map) from
>>>smi_callin to start_secondary.
>>
>>
>>This patch fixes an apparent race / lockup on our 2-way dual cores (when 
>>applied against 2.6.12.3).  The machine was locking up after 
>>"Initializing CPU#2".
> 
> 
> The real solution for this issue is the smp_call_function_single patch from Eric
> that I reposted yesterday. Yh's patch just changed the timing slightly.
> 
> 

Indeed.

I had a report here that the smp_call_function_single patch wasn't 
fixing our problem.  I just tried it myself and it appears to also fix 
the problem.

Thanks,

Mike Waychison

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-10 23:42     ` yhlu
@ 2005-08-11  0:04       ` Andi Kleen
  2005-08-11  0:17         ` yhlu
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2005-08-11  0:04 UTC (permalink / raw)
  To: yhlu
  Cc: Andi Kleen, Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

> So my patch still can be used with Eric's, It just serialize the
> TSC_SYNC between cpu.
> 
> I wonder it you can refine to make TSC_SYNC serialize that beteen CPU.
> That will make
> CPU X:synchronized TSC ... 
> in fixed postion and timming.

Why would we want that? 

Boot time is critical so it's better to do things asynchronous.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-11  0:04       ` Andi Kleen
@ 2005-08-11  0:17         ` yhlu
  2005-08-11  0:23           ` yhlu
  0 siblings, 1 reply; 18+ messages in thread
From: yhlu @ 2005-08-11  0:17 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

In LinuxBIOS, we could run init_ecc asynchronously, and the time was reduced
from 8x to 2.1x for an 8-way system. 1x means 5s for 4G on one CPU; 16G
would take 20s.

For asynchronous TSC_SYNC, maybe you can get back 0.1s...

YH

On 8/10/05, Andi Kleen <ak@suse.de> wrote:
> > So my patch still can be used with Eric's, It just serialize the
> > TSC_SYNC between cpu.
> >
> > I wonder it you can refine to make TSC_SYNC serialize that beteen CPU.
> > That will make
> > CPU X:synchronized TSC ...
> > in fixed postion and timming.
> 
> Why would we want that?
> 
> Boot time is critical so it's better to do things asynchronous.
> 
> -Andi
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-11  0:17         ` yhlu
@ 2005-08-11  0:23           ` yhlu
  2005-08-11  0:28             ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: yhlu @ 2005-08-11  0:23 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

I wonder if you could make the BSP start the callin of all the APs at the
same time, asynchronously, so that you could save 2s or more.

YH

On 8/10/05, yhlu <yhlu.kernel@gmail.com> wrote:
> In LinuxBIOS, we could init_ecc asynchronous and the time reduced from
> 8x to 2.1x for 8 ways system. 1x mean 5s for 4G in one cpu. If 16G
> will take 20s.
> 
> for TSC_SYNC asynchronous maybe you can get back 0.1s...
> 
> YH
> 
> On 8/10/05, Andi Kleen <ak@suse.de> wrote:
> > > So my patch still can be used with Eric's, It just serialize the
> > > TSC_SYNC between cpu.
> > >
> > > I wonder it you can refine to make TSC_SYNC serialize that beteen CPU.
> > > That will make
> > > CPU X:synchronized TSC ...
> > > in fixed postion and timming.
> >
> > Why would we want that?
> >
> > Boot time is critical so it's better to do things asynchronous.
> >
> > -Andi
> >
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-11  0:23           ` yhlu
@ 2005-08-11  0:28             ` Andi Kleen
  2005-08-11  0:43               ` yhlu
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2005-08-11  0:28 UTC (permalink / raw)
  To: yhlu
  Cc: Andi Kleen, Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

On Wed, Aug 10, 2005 at 05:23:31PM -0700, yhlu wrote:
> I wonder if you can make the bsp can start the APs callin in the same
> time, and make it asynchronous, So you make spare 2s or more.

The setting of cpu_callin_map in the AP could be moved earlier yes.
But it's not entirely trivial because there are some races to consider.

And the 1s quiet period on the AP could be probably also reduced
on modern systems. I doubt it is needed on Xeons or Opterons.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-11  0:28             ` Andi Kleen
@ 2005-08-11  0:43               ` yhlu
  2005-08-11  0:51                 ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: yhlu @ 2005-08-11  0:43 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

Yes, I mean something more aggressive:

static void __init smp_init(void)
{
        unsigned int i;

        /* FIXME: This should be done in userspace --RR */
        for_each_present_cpu(i) {
                if (num_online_cpus() >= max_cpus)
                        break;
                if (!cpu_online(i))
                        cpu_up(i);
        }


Let cpu_up take an array of CPUs instead of a single int.

So  in do_boot_cpu() of smpboot.c
                /*
                 * Wait 5s total for a response
                 */
                for (timeout = 0; timeout < 50000; timeout++) {
                        if (cpu_isset(cpu, cpu_callin_map))
                                break;  /* It has booted */
                        udelay(100);
                }

could wait until cpu_callin_map is set for all of them.

Then we can save more time.
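
A rough sketch of the saving being argued for, assuming an idealized model in which each AP needs a fixed number of polls of the wait loop before its callin bit appears and the APs do not slow each other down. The function names and the poll counts in the note below are made up for illustration; nothing here is measured:

```c
#include <assert.h>
#include <stddef.h>

/*
 * BSP boots the APs one at a time and waits for each callin bit in turn:
 * the per-AP waits add up.
 */
static unsigned long serial_wait_polls(const unsigned long *polls, size_t n)
{
    unsigned long total = 0;

    assert(polls != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += polls[i];
    return total;
}

/*
 * BSP kicks all APs first and then waits until every callin bit is set:
 * in this idealized model only the slowest AP matters.
 */
static unsigned long parallel_wait_polls(const unsigned long *polls, size_t n)
{
    unsigned long worst = 0;

    assert(polls != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (polls[i] > worst)
            worst = polls[i];
    return worst;
}
```

For example, with hypothetical per-AP counts of 3000, 2500 and 4000 polls, the serial wait is 9500 polls and the parallel wait is 4000 polls; at udelay(100) per poll that would be roughly 0.95s versus 0.4s.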

YH


On 8/10/05, Andi Kleen <ak@suse.de> wrote:
> On Wed, Aug 10, 2005 at 05:23:31PM -0700, yhlu wrote:
> > I wonder if you can make the bsp can start the APs callin in the same
> > time, and make it asynchronous, So you make spare 2s or more.
> 
> The setting of cpu_callin_map in the AP could be moved earlier yes.
> But it's not entirely trivial because there are some races to consider.
> 
> And the 1s quiet period on the AP could be probably also reduced
> on modern systems. I doubt it is needed on Xeons or Opterons.
> 
> -Andi
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-11  0:43               ` yhlu
@ 2005-08-11  0:51                 ` Andi Kleen
  2005-08-12  6:59                   ` yhlu
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2005-08-11  0:51 UTC (permalink / raw)
  To: yhlu
  Cc: Andi Kleen, Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

On Wed, Aug 10, 2005 at 05:43:23PM -0700, yhlu wrote:
> Yes, I mean more aggressive
> 
> static void __init smp_init(void)
> {
>         unsigned int i;
> 
>         /* FIXME: This should be done in userspace --RR */
>         for_each_present_cpu(i) {
>                 if (num_online_cpus() >= max_cpus)
>                         break;
>                 if (!cpu_online(i))
>                         cpu_up(i);
>         }
> 
> 
> let cpu_up take one array instead of one int.

It can be done already by just not starting the CPUs and
then do it multithreaded from user space using sysfs with
the CPU hotplug infrastructure. Unfortunately cpu_up
right now has a global semaphore, so it won't save you any
time. However it could be done in parallel with other 
startup jobs.
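
A minimal user-space sketch of the sysfs approach, assuming the usual CPU hotplug control file at /sys/devices/system/cpu/cpuN/online and a kernel built with CPU hotplug support. The helper names cpu_online_path and set_cpu_online are made up for illustration, and writing the file needs root:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<root>/cpu<N>/online"; returns 0 on success, -1 if buf is too small. */
static int cpu_online_path(char *buf, size_t len, const char *root, unsigned cpu)
{
    int n;

    assert(buf != NULL && root != NULL);
    n = snprintf(buf, len, "%s/cpu%u/online", root, cpu);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" (bring up) or "0" (take down) to the control file; 0 on success. */
static int set_cpu_online(const char *path, int online)
{
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;
    if (fputs(online ? "1\n" : "0\n", f) == EOF) {
        fclose(f);
        return -1;
    }
    /* A rejected write may only be reported when the stream is flushed. */
    return fclose(f) == 0 ? 0 : -1;
}
```

A boot script or init helper could call cpu_online_path(buf, sizeof buf, "/sys/devices/system/cpu", 1) and then set_cpu_online(buf, 1) for each CPU that was left offline, while other startup jobs keep running.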

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-11  0:51                 ` Andi Kleen
@ 2005-08-12  6:59                   ` yhlu
  2005-08-12  7:04                     ` yhlu
  2005-08-12 13:07                     ` Andi Kleen
  0 siblings, 2 replies; 18+ messages in thread
From: yhlu @ 2005-08-12  6:59 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

andi,

Is the following possible?
After AP1's call_in is done, and before AP1 gets into tsc_sync_wait,
AP2's call_in is done as well. Then AP1 gets into tsc_sync_wait, and before
it is done, AP2 gets into tsc_sync_wait too.

sync_master cannot tell whether a request comes from AP1 or AP2, because
there are only go[MASTER] and go[SLAVE].

YH

On 8/10/05, Andi Kleen <ak@suse.de> wrote:
> On Wed, Aug 10, 2005 at 05:43:23PM -0700, yhlu wrote:
> > Yes, I mean more aggressive
> >
> > static void __init smp_init(void)
> > {
> >         unsigned int i;
> >
> >         /* FIXME: This should be done in userspace --RR */
> >         for_each_present_cpu(i) {
> >                 if (num_online_cpus() >= max_cpus)
> >                         break;
> >                 if (!cpu_online(i))
> >                         cpu_up(i);
> >         }
> >
> >
> > let cpu_up take one array instead of one int.
> 
> It can be done already by just not starting the CPUs and
> then do it multithreaded from user space using sysfs with
> the CPU hotplug infrastructure. Unfortunately cpu_up
> right now has a global semaphore, so it won't save you any
> time. However it could be done in parallel with other
> startup jobs.
> 
> -Andi
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-12  6:59                   ` yhlu
@ 2005-08-12  7:04                     ` yhlu
  2005-08-12 13:07                     ` Andi Kleen
  1 sibling, 0 replies; 18+ messages in thread
From: yhlu @ 2005-08-12  7:04 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

andi,

It seems ia64 sets the callin_map only after it is done with the tsc_sync
(the ITC sync there):

YH

        if (!(sal_platform_features & IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT)) {
                /*
                 * Synchronize the ITC with the BP.  Need to do this after irqs are
                 * enabled because ia64_sync_itc() calls smp_call_function_single(), which
                 * calls spin_unlock_bh(), which calls spin_unlock_bh(), which calls
                 * local_bh_enable(), which bugs out if irqs are not enabled...
                 */
                Dprintk("Going to syncup ITC with BP.\n");
                ia64_sync_itc(0);
        }

        /*
         * Get our bogomips.
         */
        ia64_init_itm();
        calibrate_delay();
        local_cpu_data->loops_per_jiffy = loops_per_jiffy;

#ifdef CONFIG_IA32_SUPPORT
        ia32_gdt_init();
#endif

        /*
         * Allow the master to continue.
         */
        cpu_set(cpuid, cpu_callin_map);



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-12  6:59                   ` yhlu
  2005-08-12  7:04                     ` yhlu
@ 2005-08-12 13:07                     ` Andi Kleen
  2005-08-12 16:18                       ` yhlu
  1 sibling, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2005-08-12 13:07 UTC (permalink / raw)
  To: yhlu
  Cc: Andi Kleen, Mike Waychison, YhLu, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

On Thu, Aug 11, 2005 at 11:59:21PM -0700, yhlu wrote:
> andi,
> 
> is it possible for
> after the AP1 call_in is done and before AP1 get in tsc_sync_wait
> The AP2 call_in done.  and then AP1 get in tsc_sync_wait and before it
> done, AP2 get in tsc_sync_wait too.
> 
> sync_master can not figure out from AP1 or AP2 because only have
> go[MASTER] and go{SLAVE].

Ok, you're right. It's better to move it to before callin map.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-12 13:07                     ` Andi Kleen
@ 2005-08-12 16:18                       ` yhlu
  2005-08-12 16:41                         ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: yhlu @ 2005-08-12 16:18 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mike Waychison, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

good, I will produce one patch next week.

YH

On 8/12/05, Andi Kleen <ak@suse.de> wrote:
> On Thu, Aug 11, 2005 at 11:59:21PM -0700, yhlu wrote:
> > andi,
> >
> > is it possible for
> > after the AP1 call_in is done and before AP1 get in tsc_sync_wait
> > The AP2 call_in done.  and then AP1 get in tsc_sync_wait and before it
> > done, AP2 get in tsc_sync_wait too.
> >
> > sync_master can not figure out from AP1 or AP2 because only have
> > go[MASTER] and go{SLAVE].
> 
> Ok, you're right. It's better to move it to before callin map.
> 
> -Andi
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-12 16:18                       ` yhlu
@ 2005-08-12 16:41                         ` Andi Kleen
  2005-08-12 17:36                           ` yhlu
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2005-08-12 16:41 UTC (permalink / raw)
  To: yhlu
  Cc: Andi Kleen, Mike Waychison, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

On Fri, Aug 12, 2005 at 09:18:07AM -0700, yhlu wrote:
> good, I will produce one patch next week.

I already did it in my tree.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [discuss] Re: 2.6.13-rc2 with dual way dual core ck804 MB
  2005-08-12 16:41                         ` Andi Kleen
@ 2005-08-12 17:36                           ` yhlu
  0 siblings, 0 replies; 18+ messages in thread
From: yhlu @ 2005-08-12 17:36 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mike Waychison, Peter Buckingham, linux-kernel,
	discuss@x86-64.org

Oh.

On 8/12/05, Andi Kleen <ak@suse.de> wrote:
> On Fri, Aug 12, 2005 at 09:18:07AM -0700, yhlu wrote:
> > good, I will produce one patch next week.
> 
> I already did it in my tree.
> 
> -Andi
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2005-08-12 17:36 UTC | newest]
