* SMP on assym. x86
@ 2001-03-21 15:55 Kurt Garloff
2001-03-21 16:32 ` Mark Hahn
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Kurt Garloff @ 2001-03-21 15:55 UTC (permalink / raw)
To: Ingo Molnar, Linus Torvalds, Alan Cox; +Cc: Linux kernel list
[-- Attachment #1.1: Type: text/plain, Size: 5262 bytes --]
Hi,
recently upgrading one of my two CPUs, I found kernel-2.4.2 to be unable to
handle the situation with 2 different CPUs (AMP = Assymmetric
multiprocessing ;-) correctly.
Some details on my system:
Dual BX board (DFI P2XBL/D), iPII 350 (Deschutes) + iPIII 850 (Coppermine)
Note: The difference in features is the XMM (SSE) flag.
The problems are twofold
(a) Determination of the correct common features (=: COMCAP), i.e.
boot_cpu_data.x86_capaility[0] at the correct time
(b) TSC stuff
Ad (a):
There is code in identify_cpu() to make sure the common subset of features
is stored in the COMCAP, however this does not work at all, as the freshly
booted CPUs just overwrite it in head.S.
Fixed this, so the system basically worked, if the iPII was the boot CPU.
With iPIII as boot CPU, it rebooted after (captured via serial console):
Asserting INIT.
Waiting for send to finish...
+Deasserting INIT.
Waiting for send to finish...
+#startup loops: 2.
Sending STARTUP #1.
After apic_write.
The problem is that the C/FPU gets initialized to use XMM instructions,
before the other CPUs are even started and checked for capabilities.
Booting with nofxsr (or noxmm which I added) helps.
However, the kernel should be able to find out himself.
To find out about the COMCAP earlier, one has two choices
* boot the secondary CPUs earlier (before initialiying usage of FXSR/XMM)
* try to find out without booting the 2ndary CPUs
The only way I found to do the latter is to use the results of the MPtable
parsing. However, this information may not be reliable. I did implement
this, but on my system, a lot of features were not reported at all ...
So I chose to do smp_init() earlier. It's a good idea to have this info
available before we get to do RAID5 init anyway.
I chose to move smp_init()/smp_boot_cpus() [which now calls identify_cpu()
and check_config() for the boot_cpu and later identify_cpu() on the secondary
CPUs and sets COMCAP correctly], then check_bugs() [which does not do the
identify_cpu() on SMP anymore) before bdev_init().
Problem solved. It does certainly solve XMM/SSE and FXSR issues. PSE36
differences, e.g., may still cause problems on 64GB kernels ...
Ad (b):
The fast_gettimeoffset() does return nonsense, if it runs on the wrong CPU.
There are two reasons: On the one hand, the TSC speed is only calculated for
one CPU, which does not fit the other. On the other hand, xtime gets updated
and the low TSC bits are saved. If gettimeofday() is run on the other CPU,
you compare the TSC of this CPU with the saved one from the other, which
will return nonsense. I've even seen negative numbers.
This behaviour screws up timing information. xntp did not work any more and
the glx module crashed the AGP/PCI bus for my MGA G400.
notsc could work around this of course.
I decided to fix this problem as well.
Therefore, cpu_khz, fast_gettimeoffset_quotient, last_tsc_low have been
moved to, and xtime duplicated in the cpuinfo_x86 struct.
identify_cpu() now calls time_init_cpu() for the secondary CPUs, which
initializes cpu_khz and the quotient.
The timer irq now stores the last_tsc_low in the CPU specific struct.
It also copies the xtime to the current CPU's struct.
gettimeofday() just uses this CPU specific info.
It works.
Find here some evidence:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 6
cpu MHz : 851.946
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1701.88
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 5
model name : Pentium II (Deschutes)
stepping : 2
cpu MHz : 350.800
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr
bogomips : 701.54
ntpq> pe
remote refid st t when poll reach delay offset jitter
==============================================================================
LOCAL(0) LOCAL(0) 10 l 15 64 377 0.000 0.000 0.000
*casa.casa-etp.n ntp2.ptb.de 2 u 15 64 377 0.365 1.453 5.000
boot_cpu_data (COMCAP) contains:
Intel Pentium III (Coppermine) caps: 0183fbff fpu vme de pse tsc msr pae mce
cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr stepping 02
Patch against 2.4.2 is attached.
Feedback is welcome. I think the patch is safe, but I could imagine that the
earlier SMP initialization might cause problems for some people or other archs.
I would like this patch to go into the mainstream kernel, if no problem can
be found.
Regards,
--
Kurt Garloff <kurt@garloff.de> [Eindhoven, NL]
Physics: Plasma simulations <K.Garloff@Phys.TUE.NL> [TU Eindhoven, NL]
Linux: SCSI, Security <garloff@suse.de> [SuSE Nuernberg, FRG]
(See mail header or public key servers for PGP2 and GPG public keys.)
[-- Attachment #1.2: linux242.amp-support.diff --]
[-- Type: text/plain, Size: 13445 bytes --]
diff -uNr linux-2.4.2.sym/arch/i386/kernel/head.S linux-2.4.2.amp/arch/i386/kernel/head.S
--- linux-2.4.2.sym/arch/i386/kernel/head.S Sat Feb 17 00:53:08 2001
+++ linux-2.4.2.amp/arch/i386/kernel/head.S Wed Mar 21 03:22:49 2001
@@ -229,6 +229,12 @@
movb %al,X86_MODEL
andb $0x0f,%cl # mask mask revision
movb %cl,X86_MASK
+ /* prevent overwriting of corrected cpu_boot_data.x86_capabilty[0] */
+#ifdef CONFIG_SMP
+ movb ready,%cl
+ cmpb $0,%cl
+ jne is486
+#endif
movl %edx,X86_CAPABILITY
is486:
diff -uNr linux-2.4.2.sym/arch/i386/kernel/i387.c linux-2.4.2.amp/arch/i386/kernel/i387.c
--- linux-2.4.2.sym/arch/i386/kernel/i387.c Sat Feb 3 21:09:34 2001
+++ linux-2.4.2.amp/arch/i386/kernel/i387.c Wed Mar 21 02:54:27 2001
@@ -28,7 +28,7 @@
* The _current_ task is using the FPU for the first time
* so initialize it and set the mxcsr to its default
* value at reset if we support XMM instructions and then
- * remeber the current task has used the FPU.
+ * remember the current task has used the FPU.
*/
void init_fpu(void)
{
diff -uNr linux-2.4.2.sym/arch/i386/kernel/mpparse.c linux-2.4.2.amp/arch/i386/kernel/mpparse.c
--- linux-2.4.2.sym/arch/i386/kernel/mpparse.c Wed Nov 15 06:25:34 2000
+++ linux-2.4.2.amp/arch/i386/kernel/mpparse.c Wed Mar 21 14:51:17 2001
@@ -106,6 +106,8 @@
return n;
}
+unsigned long mp_max_features = 0xffffffff;
+
static void __init MP_processor_info (struct mpc_config_processor *m)
{
int ver;
@@ -171,6 +173,9 @@
boot_cpu_id = m->mpc_apicid;
}
num_processors++;
+
+ /* Clear features that are not present on all CPUs */
+ mp_max_features &= m->mpc_featureflag;
if (m->mpc_apicid > MAX_APICS) {
printk("Processor #%d INVALID. (Max ID: %d).\n",
diff -uNr linux-2.4.2.sym/arch/i386/kernel/setup.c linux-2.4.2.amp/arch/i386/kernel/setup.c
--- linux-2.4.2.sym/arch/i386/kernel/setup.c Sat Feb 17 01:02:37 2001
+++ linux-2.4.2.amp/arch/i386/kernel/setup.c Wed Mar 21 14:52:58 2001
@@ -106,6 +106,9 @@
struct cpuinfo_x86 boot_cpu_data = { 0, 0, 0, 0, -1, 1, 0, 0, -1 };
unsigned long mmu_cr4_features;
+#ifdef CONFIG_SMP
+extern unsigned long mp_max_features;
+#endif
/*
* Bus types ..
@@ -144,10 +147,11 @@
extern int root_mountflags;
extern char _text, _etext, _edata, _end;
-extern unsigned long cpu_khz;
+//extern unsigned long cpu_khz;
static int disable_x86_serial_nr __initdata = 1;
static int disable_x86_fxsr __initdata = 0;
+static int disable_x86_xmm __initdata = 0;
/*
* This is set up by the setup-routine at boot-time
@@ -1805,6 +1809,11 @@
}
__setup("nofxsr", x86_fxsr_setup);
+int __init x86_xmm_setup(char* s)
+{ disable_x86_xmm = 1;
+ return 1;
+}
+__setup("noxmm", x86_xmm_setup);
/* Standard macro to see if a specific flag is changeable */
static inline int flag_is_changeable_p(u32 flag)
@@ -1912,6 +1921,8 @@
return have_cpuid_p(); /* Check to see if CPUID now enabled? */
}
+
+extern void time_init_cpu(struct cpuinfo_x86 *);
/*
* This does the hard work of actually picking apart the CPU stuff...
*/
@@ -2046,6 +2057,10 @@
clear_bit(X86_FEATURE_XMM, &c->x86_capability);
}
+ /* XMM/SSE disabled? */
+ if (disable_x86_xmm)
+ clear_bit(X86_FEATURE_XMM, &c->x86_capability);
+
/* Disable the PN if appropriate */
squash_the_stupid_serial_number(c);
@@ -2079,7 +2094,12 @@
/* AND the already accumulated flags with these */
for ( i = 0 ; i < NCAPINTS ; i++ )
boot_cpu_data.x86_capability[i] &= c->x86_capability[i];
+ time_init_cpu (c);
}
+#if 0 //def CONFIG_SMP
+ else
+ c->x86_capability[0] &= mp_max_features;
+#endif
printk("CPU: Common caps: %08x %08x %08x %08x\n",
boot_cpu_data.x86_capability[0],
@@ -2196,7 +2216,7 @@
if ( test_bit(X86_FEATURE_TSC, &c->x86_capability) ) {
p += sprintf(p, "cpu MHz\t\t: %lu.%03lu\n",
- cpu_khz / 1000, (cpu_khz % 1000));
+ c->cpu_khz / 1000, (c->cpu_khz % 1000));
}
/* Cache size */
diff -uNr linux-2.4.2.sym/arch/i386/kernel/smpboot.c linux-2.4.2.amp/arch/i386/kernel/smpboot.c
--- linux-2.4.2.sym/arch/i386/kernel/smpboot.c Tue Feb 13 23:13:43 2001
+++ linux-2.4.2.amp/arch/i386/kernel/smpboot.c Wed Mar 21 15:04:03 2001
@@ -29,6 +29,7 @@
* Ingo Molnar : various cleanups and rewrites
* Tigran Aivazian : fixed "0.00 in /proc/uptime on SMP" bug.
* Maciej W. Rozycki : Bits for genuine 82489DX APICs
+ * Kurt Garloff : Support for SMP with non-identical CPUs
*/
#include <linux/config.h>
@@ -44,6 +45,7 @@
#include <linux/mc146818rtc.h>
#include <asm/mtrr.h>
#include <asm/pgalloc.h>
+#include <asm/bugs.h>
/* Set if we find a B stepping CPU */
static int smp_b_stepping;
@@ -198,7 +200,7 @@
#define NR_LOOPS 5
-extern unsigned long fast_gettimeoffset_quotient;
+//extern unsigned long fast_gettimeoffset_quotient;
/*
* accurate 64-bit/32-bit division, expanded to 32-bit divisions and 64-bit
@@ -239,7 +241,7 @@
printk("checking TSC synchronization across CPUs: ");
- one_usec = ((1<<30)/fast_gettimeoffset_quotient)*(1<<2);
+ one_usec = ((1<<30)/current_cpu_data.fast_gettimeoffset_quotient)*(1<<2);
atomic_set(&tsc_start_flag, 1);
wmb();
@@ -342,6 +344,38 @@
}
#undef NR_LOOPS
+/* Note: cpu_boot_data caps on SMP machines is filled in as follows:
+ * (1) Initialized with the boot CPU's caps (head.S)
+ * (2) AND with the common caps from the MP table (identify_cpu())
+ * (3) AND with the caps from the started CPUs (identify_cpu())
+ * (4) Finally calculated and overwritten here (smp_make_common_caps())
+ * Step 2 is actually disabled, as the info is not very reliable.
+ * Step 4 should not be needed then, actually.
+ * KG <garloff@suse.de>, 2001-03-21
+ */
+
+/* Store common subset of capabilities in boot_cpu_data */
+void smp_make_common_caps(void)
+{
+ int cpu, c;
+ unsigned long x86cap[NCAPINTS];
+ for (c = 0; c < NCAPINTS; c++)
+ x86cap[c] = 0xffffffff;
+ for (cpu = 0; cpu < NR_CPUS; cpu++) {
+ if (!(cpu_online_map & (1<<cpu)))
+ continue;
+ for (c = 0; c < NCAPINTS; c++)
+ x86cap[c] &= cpu_data[cpu].x86_capability[c];
+ }
+ for (c = 0; c < NCAPINTS; c++)
+ boot_cpu_data.x86_capability[c] = x86cap[c];
+ printk("CPU: Common caps: %08x %08x %08x %08x\n",
+ boot_cpu_data.x86_capability[0],
+ boot_cpu_data.x86_capability[1],
+ boot_cpu_data.x86_capability[2],
+ boot_cpu_data.x86_capability[3]);
+}
+
extern void calibrate_delay(void);
static atomic_t init_deasserted;
@@ -780,12 +814,13 @@
}
cycles_t cacheflush_time;
-extern unsigned long cpu_khz;
+//extern unsigned long cpu_khz;
static void smp_tune_scheduling (void)
{
unsigned long cachesize; /* kB */
unsigned long bandwidth = 350; /* MB/s */
+ unsigned long cpu_khz = boot_cpu_data.cpu_khz;
/*
* Rough estimation for SMP scheduling, this is the number of
* cycles it takes for a fully memory-limited process to flush
@@ -831,6 +866,12 @@
{
int apicid, cpu;
+ /* Pieces from check_bugs() ... */
+ identify_cpu(&boot_cpu_data);
+ printk("Boot CPU: ");
+ print_cpu_info(&boot_cpu_data);
+ check_config();
+
#ifdef CONFIG_MTRR
/* Must be done before other processors booted */
mtrr_init_boot_cpu ();
@@ -1019,6 +1060,7 @@
synchronize_tsc_bp();
smp_done:
+ smp_make_common_caps ();
zap_low_mappings();
}
diff -uNr linux-2.4.2.sym/arch/i386/kernel/time.c linux-2.4.2.amp/arch/i386/kernel/time.c
--- linux-2.4.2.sym/arch/i386/kernel/time.c Fri Dec 29 23:07:57 2000
+++ linux-2.4.2.amp/arch/i386/kernel/time.c Wed Mar 21 02:15:42 2001
@@ -64,19 +64,19 @@
#include <linux/irq.h>
-unsigned long cpu_khz; /* Detected as we calibrate the TSC */
+//unsigned long cpu_khz; /* Detected as we calibrate the TSC */
/* Number of usecs that the last interrupt was delayed */
static int delay_at_last_interrupt;
-static unsigned long last_tsc_low; /* lsb 32 bits of Time Stamp Counter */
+//static unsigned long last_tsc_low; /* lsb 32 bits of Time Stamp Counter */
/* Cached *multiplier* to convert TSC counts to microseconds.
* (see the equation below).
* Equal to 2^32 * (1 / (clocks per usec) ).
* Initialized in time_init.
*/
-unsigned long fast_gettimeoffset_quotient;
+//unsigned long fast_gettimeoffset_quotient;
extern rwlock_t xtime_lock;
extern unsigned long wall_jiffies;
@@ -92,7 +92,7 @@
rdtsc(eax,edx);
/* .. relative to previous jiffy (32 bits is enough) */
- eax -= last_tsc_low; /* tsc_low delta */
+ eax -= current_cpu_data.last_tsc_low; /* tsc_low delta */
/*
* Time offset = (tsc_low delta) * fast_gettimeoffset_quotient
@@ -105,7 +105,7 @@
__asm__("mull %2"
:"=a" (eax), "=d" (edx)
- :"rm" (fast_gettimeoffset_quotient),
+ :"rm" (current_cpu_data.fast_gettimeoffset_quotient),
"0" (eax));
/* our adjusted time offset in microseconds */
@@ -266,8 +266,8 @@
if (lost)
usec += lost * (1000000 / HZ);
}
- sec = xtime.tv_sec;
- usec += xtime.tv_usec;
+ sec = current_cpu_data.xtime.tv_sec;
+ usec += current_cpu_data.xtime.tv_usec;
read_unlock_irqrestore(&xtime_lock, flags);
while (usec >= 1000000) {
@@ -431,6 +431,7 @@
else
last_rtc_update = xtime.tv_sec - 600; /* do it again in 60 s */
}
+ current_cpu_data.xtime = xtime;
#ifdef CONFIG_MCA
if( MCA_bus ) {
@@ -485,7 +486,7 @@
/* read Pentium cycle counter */
- rdtscl(last_tsc_low);
+ rdtscl(current_cpu_data.last_tsc_low);
spin_lock(&i8253_lock);
outb_p(0x00, 0x43); /* latch the count ASAP */
@@ -586,7 +587,7 @@
} while ((inb(0x61) & 0x20) == 0);
rdtsc(endlow,endhigh);
- last_tsc_low = endlow;
+ current_cpu_data.last_tsc_low = endlow;
/* Error: ECTCNEVERSET */
if (count <= 1)
@@ -623,7 +624,8 @@
return 0;
}
-void __init time_init(void)
+
+void __init time_init_cpu(struct cpuinfo_x86 *c)
{
extern int x86_udelay_tsc;
@@ -660,7 +662,8 @@
if (cpu_has_tsc) {
unsigned long tsc_quotient = calibrate_tsc();
if (tsc_quotient) {
- fast_gettimeoffset_quotient = tsc_quotient;
+ unsigned long cpu_khz;
+ c->fast_gettimeoffset_quotient = tsc_quotient;
use_tsc = 1;
/*
* We could be more selective here I suspect
@@ -683,6 +686,7 @@
"0" (eax), "1" (edx));
printk("Detected %lu.%03lu MHz processor.\n", cpu_khz / 1000, cpu_khz % 1000);
}
+ c->cpu_khz = cpu_khz;
}
}
@@ -703,4 +707,11 @@
#else
setup_irq(0, &irq0);
#endif
+}
+
+extern struct cpuinfo_x86 boot_cpu_data;
+
+void inline __init time_init (void)
+{
+ time_init_cpu (&boot_cpu_data);
}
diff -uNr linux-2.4.2.sym/include/asm-i386/bugs.h linux-2.4.2.amp/include/asm-i386/bugs.h
--- linux-2.4.2.sym/include/asm-i386/bugs.h Wed Mar 21 02:14:22 2001
+++ linux-2.4.2.amp/include/asm-i386/bugs.h Wed Mar 21 14:47:23 2001
@@ -21,6 +21,7 @@
*/
#include <linux/config.h>
+#include <linux/utsname.h>
#include <asm/processor.h>
#include <asm/i387.h>
#include <asm/msr.h>
@@ -204,12 +205,16 @@
static void __init check_bugs(void)
{
- identify_cpu(&boot_cpu_data);
+ /* identify_cpu(&boot_cpu_data) and
+ * check_config()
+ * are called in smp_boot_cpus() on SMP kernels
+ */
#ifndef CONFIG_SMP
+ identify_cpu(&boot_cpu_data);
printk("CPU: ");
print_cpu_info(&boot_cpu_data);
-#endif
check_config();
+#endif
check_fpu();
check_hlt();
check_popad();
diff -uNr linux-2.4.2.sym/include/asm-i386/processor.h linux-2.4.2.amp/include/asm-i386/processor.h
--- linux-2.4.2.sym/include/asm-i386/processor.h Tue Mar 6 23:25:52 2001
+++ linux-2.4.2.amp/include/asm-i386/processor.h Wed Mar 21 02:10:14 2001
@@ -16,6 +16,7 @@
#include <asm/cpufeature.h>
#include <linux/config.h>
#include <linux/threads.h>
+#include <linux/time.h>
/*
* Default implementation of macro that returns current
@@ -52,6 +53,10 @@
unsigned long *pmd_quick;
unsigned long *pte_quick;
unsigned long pgtable_cache_sz;
+ unsigned long cpu_khz;
+ unsigned long fast_gettimeoffset_quotient;
+ unsigned long last_tsc_low;
+ struct timeval xtime;
};
#define X86_VENDOR_INTEL 0
diff -uNr linux-2.4.2.sym/init/main.c linux-2.4.2.amp/init/main.c
--- linux-2.4.2.sym/init/main.c Thu Mar 1 16:44:10 2001
+++ linux-2.4.2.amp/init/main.c Wed Mar 21 14:39:32 2001
@@ -592,6 +592,12 @@
page_cache_init(mempages);
kiobuf_setup();
signals_init();
+ /* Moved smp_init() up, as bdev_init(), i.e. RAID5, may depend on the CPU features
+ * being present, which is unfortunately only known after all CPUs have been booted
+ * check_bugs() follows next to take care of FPU ...
+ * KG <garloff@suse.de>, 2001-01-21 */
+ smp_init();
+ check_bugs();
bdev_init();
inode_init(mempages);
#if defined(CONFIG_SYSVIPC)
@@ -600,7 +606,6 @@
#if defined(CONFIG_QUOTA)
dquot_init_hash();
#endif
- check_bugs();
printk("POSIX conformance testing by UNIFIX\n");
/*
@@ -608,7 +613,6 @@
* Like idlers init is an unlocked kernel thread, which will
* make syscalls (and thus be locked).
*/
- smp_init();
kernel_thread(init, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGNAL);
unlock_kernel();
current->need_resched = 1;
[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: SMP on assym. x86
2001-03-21 15:55 SMP on assym. x86 Kurt Garloff
@ 2001-03-21 16:32 ` Mark Hahn
2001-03-21 23:41 ` Alan Cox
2001-03-21 17:16 ` Linus Torvalds
[not found] ` <99anl4oi@penguin.transmeta.com>
2 siblings, 1 reply; 10+ messages in thread
From: Mark Hahn @ 2001-03-21 16:32 UTC (permalink / raw)
To: Linux kernel list
> recently upgrading one of my two CPUs, I found kernel-2.4.2 to be unable to
> handle the situation with 2 different CPUs (AMP = Assymmetric
> multiprocessing ;-) correctly.
"correctly". Intel doesn't support this (mis)configuration:
especially with different steppings, not to mention models.
Alan has, or is working on, a workaround to handle differing
multipliers by turning off the use of RDTSC. this is the right approach
to take in the kernel: disable features not shared by both processors,
so correctly-configured machines are not penalized.
and the kernel should LOUDLY WARN ABOUT this stuff on boot.
regards, mark hahn.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: SMP on assym. x86
2001-03-21 15:55 SMP on assym. x86 Kurt Garloff
2001-03-21 16:32 ` Mark Hahn
@ 2001-03-21 17:16 ` Linus Torvalds
[not found] ` <99anl4oi@penguin.transmeta.com>
2 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2001-03-21 17:16 UTC (permalink / raw)
To: linux-kernel
In article <20010321165541.H3514@garloff.casa-etp.nl>,
Kurt Garloff <garloff@suse.de> wrote:
>
>recently upgrading one of my two CPUs, I found kernel-2.4.2 to be unable to
>handle the situation with 2 different CPUs (AMP =3D Assymmetric
>multiprocessing ;-) correctly.
This is not really a configuration Linux supports. You can hack it to
work in many cases, but I'm generally not inclined to make this a an
issue for me because:
- intel explicitly doesn't support it necessarily even in hardware.
You're supposed to only mix CPU's of the same stepping within a
family, never mind different families. They sometimes explicitly say
which steppings are compatible and can be mixed.
NOTE! For all I know, this might, for all I know, actually be due to
fundamental issues like cache coherency protocol timing or similar.
Safe answer: just say no.
- The boot CPU under Linux is special, and will be used to determine
things like support for 4M pages etc. It will then re-write the page
tables to be more efficient. If the other CPU's don't support all the
features the boot CPU has, they'll have serious trouble booting up.
NOTE! I'm not all that interested in trying to complicate the bootup
logic to take into account all the differences that can occur.
Especially as it only happens on arguably very broken hardware that
doesn't meet the specs anyway.
So I'm perfectly happy with you fixing it on your machine, but right now
I have no incentives to make this a "real" option for a standard kernel.
I retain the right to change my mind, as always. Le Linus e mobile.
Linus
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: SMP on assym. x86
@ 2001-03-21 19:13 James Bottomley
0 siblings, 0 replies; 10+ messages in thread
From: James Bottomley @ 2001-03-21 19:13 UTC (permalink / raw)
To: Kurt Garloff; +Cc: linux-kernel
> recently upgrading one of my two CPUs, I found kernel-2.4.2 to be
> unable to handle the situation with 2 different CPUs (AMP =3D
> Assymmetric multiprocessing ;-) correctly. Some details on my system:
> Dual BX board (DFI P2XBL/D), iPII 350 (Deschutes) + iPIII 850
> (Coppermine) Note: The difference in features is the XMM (SSE) flag.
> The problems are twofold (a) Determination of the correct common
> features (=3D: COMCAP), i.e.
> boot_cpu_data.x86_capaility[0] at the correct time (b) TSC stuff
I have similar problems. I've got a reconfigurable non-APIC 8 way system with
(currently) 4 p5-66 and 4 p5-166 processors. I found the answer to (b) was
simply to disable the TSC stuff---my processors aren't even guaranteed to be
fed from the same clock, so there's no hope for TSC coherency.
I run into your problem (a) when trying a mixture of 486 and 586 processors.
The simplest work around I find is just to make sure that the boot CPU has the
lowest capability set (i.e. boot off a 486). Could you just swap the order of
your processors to achieve the same effect?
James Bottomley
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: SMP on assym. x86
2001-03-21 16:32 ` Mark Hahn
@ 2001-03-21 23:41 ` Alan Cox
2001-03-22 12:00 ` Kurt Garloff
0 siblings, 1 reply; 10+ messages in thread
From: Alan Cox @ 2001-03-21 23:41 UTC (permalink / raw)
To: Mark Hahn; +Cc: Linux kernel list
> > handle the situation with 2 different CPUs (AMP = Assymmetric
> > multiprocessing ;-) correctly.
>
> "correctly". Intel doesn't support this (mis)configuration:
> especially with different steppings, not to mention models.
Actually for a lot of cases its quite legal.
> Alan has, or is working on, a workaround to handle differing
> multipliers by turning off the use of RDTSC. this is the right approach
> to take in the kernel: disable features not shared by both processors,
> so correctly-configured machines are not penalized.
> and the kernel should LOUDLY WARN ABOUT this stuff on boot.
I've been working on reading the multipliers directly from the MSR 0x2A data,
Kurt is redoing the timing each run - possibly thats not so clean but its
more robust.
I rather like Kurt's patch
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: SMP on assym. x86
2001-03-21 23:41 ` Alan Cox
@ 2001-03-22 12:00 ` Kurt Garloff
2001-03-22 13:48 ` Mark Hahn
2001-03-23 10:19 ` Eric W. Biederman
0 siblings, 2 replies; 10+ messages in thread
From: Kurt Garloff @ 2001-03-22 12:00 UTC (permalink / raw)
To: Alan Cox; +Cc: Linux kernel list
[-- Attachment #1: Type: text/plain, Size: 2538 bytes --]
On Wed, Mar 21, 2001 at 11:41:33PM +0000, Alan Cox wrote:
> > > handle the situation with 2 different CPUs (AMP = Assymmetric
> > > multiprocessing ;-) correctly.
> >
> > "correctly". Intel doesn't support this (mis)configuration:
> > especially with different steppings, not to mention models.
I wouldn't call it misconfiguration, just because it's a bit more difficult
to handle.
On the iontel side: You should watch out for matching APICs, voltages and
cache coherency (MESI) protocol. Actually, Deschutes and Coppermine just
work fine in spite of slightly different voltage.
> Actually for a lot of cases its quite legal.
It should be always legal to have the same CPU with different speeds. (FSB
must be the same for obvious reasons), e.g.. The TSC part of the patch
addresses this.
> > Alan has, or is working on, a workaround to handle differing
> > multipliers by turning off the use of RDTSC. this is the right approach
> > to take in the kernel: disable features not shared by both processors,
> > so correctly-configured machines are not penalized.
TSC is supported by both CPUs ... so why not use the nice rdtsc based time
routine.
The penalty:
It is 5 (20 bytes) ints more in the struct cpuinfo_x86, so you have
a kernel with (NR_CPUS*20 - 12) = 628 bytes more kernel data ...
Your bootup time may be a fraction of a second longer, as every CPU
calibrates the TSC on its own; OTOH they do it in parallel ...
The timer interrupt does an extra copy of xtime (2 ints) to the to the per
CPU struct. And one extra indirection for accessing the per CPU struct.
Something like 4 CPU cycles per timer interrupt.
gettimeofday() also has this extra indirection; OTOH not accessing global
data saves a cache line flush the next time the structure is written to ...
So this is probably a net cost of zero.
> > and the kernel should LOUDLY WARN ABOUT this stuff on boot.
>
> I've been working on reading the multipliers directly from the MSR 0x2A data,
> Kurt is redoing the timing each run - possibly thats not so clean but its
> more robust.
>
> I rather like Kurt's patch
Thx!
If you have some requests to make it suitable for -ac kernels, I'll do my
best.
Regards,
--
Kurt Garloff <kurt@garloff.de> [Eindhoven, NL]
Physics: Plasma simulations <K.Garloff@Phys.TUE.NL> [TU Eindhoven, NL]
Linux: SCSI, Security <garloff@suse.de> [SuSE Nuernberg, FRG]
(See mail header or public key servers for PGP2 and GPG public keys.)
[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: SMP on assym. x86
[not found] ` <99anl4oi@penguin.transmeta.com>
@ 2001-03-22 13:20 ` Pavel Machek
2001-03-23 10:18 ` Kurt Garloff
0 siblings, 1 reply; 10+ messages in thread
From: Pavel Machek @ 2001-03-22 13:20 UTC (permalink / raw)
To: garlof; +Cc: linux-kernel
Hi!
> Kurt Garloff <garloff@suse.de> wrote:
> >
> >recently upgrading one of my two CPUs, I found kernel-2.4.2 to be unable to
> >handle the situation with 2 different CPUs (AMP =3D Assymmetric
> >multiprocessing ;-) correctly.
>
> This is not really a configuration Linux supports. You can hack it to
> work in many cases, but I'm generally not inclined to make this a an
> issue for me because:
Notice, that one of your CPUs is twice as fast as second one. You'll
need some heavy updates in scheduler.
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: SMP on assym. x86
2001-03-22 12:00 ` Kurt Garloff
@ 2001-03-22 13:48 ` Mark Hahn
2001-03-23 10:19 ` Eric W. Biederman
1 sibling, 0 replies; 10+ messages in thread
From: Mark Hahn @ 2001-03-22 13:48 UTC (permalink / raw)
To: Linux kernel list
> > > > handle the situation with 2 different CPUs (AMP = Assymmetric
> > > > multiprocessing ;-) correctly.
> > >
> > > "correctly". Intel doesn't support this (mis)configuration:
> > > especially with different steppings, not to mention models.
>
> I wouldn't call it misconfiguration, just because it's a bit more difficult
> to handle.
again, I *would* call it misconfiguration. intel says explicitly that
they don't support mixing model/family parts. and they only test
same-clock combinations (but do support mixed steppings.) just so people
don't get the impression that random, different CPUs are a sure thing...
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: SMP on assym. x86
2001-03-22 13:20 ` Pavel Machek
@ 2001-03-23 10:18 ` Kurt Garloff
0 siblings, 0 replies; 10+ messages in thread
From: Kurt Garloff @ 2001-03-23 10:18 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux kernel list
[-- Attachment #1: Type: text/plain, Size: 750 bytes --]
On Thu, Mar 22, 2001 at 01:20:40PM +0000, Pavel Machek wrote:
> > Kurt Garloff <garloff@suse.de> wrote:
> Notice, that one of your CPUs is twice as fast as second one. You'll
> need some heavy updates in scheduler.
I know that making sure to have a fair scheduling on non-symmetric
multiprocessor machiens would require some more work.
I can live with imperfect scheduling ...
Or are you refering to something more serious than non-fairness?
Note: My machine just runs fine for a couple of days now ...
Regards,
--
Kurt Garloff <garloff@suse.de> Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE GmbH, Nuernberg, FRG SCSI, Security
[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: SMP on assym. x86
2001-03-22 12:00 ` Kurt Garloff
2001-03-22 13:48 ` Mark Hahn
@ 2001-03-23 10:19 ` Eric W. Biederman
1 sibling, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2001-03-23 10:19 UTC (permalink / raw)
To: Kurt Garloff; +Cc: Alan Cox, Linux kernel list
Kurt Garloff <garloff@suse.de> writes:
> On Wed, Mar 21, 2001 at 11:41:33PM +0000, Alan Cox wrote:
> > > > handle the situation with 2 different CPUs (AMP = Assymmetric
> > > > multiprocessing ;-) correctly.
> > >
> > > "correctly". Intel doesn't support this (mis)configuration:
> > > especially with different steppings, not to mention models.
>
> I wouldn't call it misconfiguration, just because it's a bit more difficult
> to handle.
> On the iontel side: You should watch out for matching APICs, voltages and
> cache coherency (MESI) protocol. Actually, Deschutes and Coppermine just
> work fine in spite of slightly different voltage.
The spooky thing is if there is that it may work just fine most of the
time but the differences between the CPU's might cause very strange
behavior every once in a great while. Which is a hardware argument, for
why you shouldn't trust such a configuration.
However it is still worth some thought. The hardware argument gets much
weaker when you have something like dual AMD's. The reason is that
with a point to point bus you may actually be able to sanely support
multiple cpu revs and speeds without even any theoretical hardware consequences.
And NUMA machines make this argument even stronger.
However I would suggest that we build some good kernel->kernel apis
for dealing with kernels with a wicked fast interconnect. And then
for NUMA and for the other cases where it really matters we can run multiple
kernels, and the mismatch problems just drop away.
Eric
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2001-03-23 10:21 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-03-21 15:55 SMP on assym. x86 Kurt Garloff
2001-03-21 16:32 ` Mark Hahn
2001-03-21 23:41 ` Alan Cox
2001-03-22 12:00 ` Kurt Garloff
2001-03-22 13:48 ` Mark Hahn
2001-03-23 10:19 ` Eric W. Biederman
2001-03-21 17:16 ` Linus Torvalds
[not found] ` <99anl4oi@penguin.transmeta.com>
2001-03-22 13:20 ` Pavel Machek
2001-03-23 10:18 ` Kurt Garloff
-- strict thread matches above, loose matches on Subject: below --
2001-03-21 19:13 James Bottomley
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox