public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* lguest broken in 2.6.22-rc1-mm1
@ 2007-05-22 22:38 Matt Mackall
  2007-05-22 23:27 ` Rusty Russell
  0 siblings, 1 reply; 18+ messages in thread
From: Matt Mackall @ 2007-05-22 22:38 UTC (permalink / raw)
  To: rusty; +Cc: linux-kernel, akpm

$ lguest 1024m vmlinux --tunnet=192.168.19.1 --block=rootfs root=/dev/lgba
[    0.000000] Reserving virtual address space above 0xffc00000
[    0.000000] Linux version 2.6.22-rc1-mm1 (mpm@cinder) (gcc
version 4.1.3 20070429 (prerelease) (Debian 4.1.2-6)) #125 PREEMPT Tue
May 22 16:50:02 CDT 2007
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  LGUEST: 0000000000000000 - 0000000040000000 (usable)
[    0.000000] 132MB HIGHMEM available.
[    0.000000] 892MB LOWMEM available.
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA             0 ->     4096
[    0.000000]   Normal       4096 ->   228352
[    0.000000]   HighMem    228352 ->   262144
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[1] active PFN ranges
[    0.000000]     0:        0 ->   262144
[    0.000000] DMI not present or invalid.
[    0.000000] Allocating PCI resources starting at 50000000 (gap:
40000000:c0000000)
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.
Total pages: 260096
[    0.000000] Kernel command line: root=/dev/lgba slub_debug
init=/sbin/init 
[    0.000000] Initializing CPU#0
[    0.000000] CPU 0 irqstacks, hard=c05c2000 soft=c05c1000
[    0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes)
[    0.000000] Console: colour dummy device 80x25
[    0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat,
Inc., Ingo Molnar
[    0.000000] ... MAX_LOCKDEP_SUBCLASSES:    8
[    0.000000] ... MAX_LOCK_DEPTH:          30
[    0.000000] ... MAX_LOCKDEP_KEYS:        2048
[    0.000000] ... CLASSHASH_SIZE:           1024
[    0.000000] ... MAX_LOCKDEP_ENTRIES:     8192
[    0.000000] ... MAX_LOCKDEP_CHAINS:      16384
[    0.000000] ... CHAINHASH_SIZE:          8192
[    0.000000]  memory used by lock dependency info: 992 kB
[    0.000000]  per task-struct memory footprint: 1200 bytes
[    0.000000] Dentry cache hash table entries: 131072 (order: 7,
524288 bytes)
[    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144
bytes)
[    0.000000] Memory: 1031316k/1048576k available (2838k kernel code,
17064k reserved, 1817k data, 188k init, 135168k highmem)
[    0.000000] virtual kernel memory layout:
[    0.000000]     fixmap  : 0xffbea000 - 0xffbff000   (  84 kB)
[    0.000000]     pkmap   : 0xff400000 - 0xff800000   (4096 kB)
[    0.000000]     vmalloc : 0xf8800000 - 0xff3fe000   ( 107 MB)
[    0.000000]     lowmem  : 0xc0000000 - 0xf7c00000   ( 892 MB)
[    0.000000]       .init : 0xc058d000 - 0xc05bc000   ( 188 kB)
[    0.000000]       .data : 0xc03c587f - 0xc058bdb0   (1817 kB)
[    0.000000]       .text : 0xc0100000 - 0xc03c587f   (2838 kB)
[    0.000000] Checking if this processor honours the WP bit even in
supervisor mode... Ok.
[    0.104000] Mount-cache hash table entries: 512
[    0.112000] CPU: L1 I cache: 32K, L1 D cache: 32K
[    0.112000] CPU: L2 cache: 2048K
[    0.112000] Compat vDSO mapped to ffbfe000.
[    0.112000] CPU: Intel(R) Pentium(R) M processor 1.70GHz stepping
06
[    0.120007] divide error: 0000 [#1]
[    0.120007] PREEMPT 
[    0.120007] Modules linked in:
[    0.120007] CPU:    0
[    0.120007] EIP:    0061:[<c01090f3>]    Not tainted VLI
[    0.120007] EFLAGS: 00010246   (2.6.22-rc1-mm1 #125)
[    0.120007] EIP is at resync_sc_freq+0x4b/0x56
[    0.120007] eax: 3d090000   ebx: c05529e0   ecx: 0000002a   edx:
00000000
[    0.120007] esi: 00000000   edi: c1c03f6c   ebp: c1c03f40   esp:
c1c03f38
[    0.120007] ds: 007b   es: 007b   fs: 0000  gs: 0000  ss: 0069
[    0.120007] Process swapper (pid: 1, ti=c1c03000 task=c1c214d0
task.ti=c1c03000)
[    0.120007] Stack: 00000000 00000000 c1c03f50 c01091a9 00000000
00000000 c1c03f70 c058fb56 
[    0.120007]        00000010 00000000 00000000 00000000 00000000
00000000 c1c03fe0 c058d6c4 
[    0.120007]        c1c214d0 c0103f24 00000000 c1c03fa4 c0132ab9
c1c03fac c01169ee 00000000 
[    0.120007] Call Trace:
[    0.120007]  [<c01091a9>] call_r_s_f+0x2b/0x2d
[    0.120007]  [<c058fb56>] init_sched_clock+0x43/0x77
[    0.120007]  [<c058d6c4>] kernel_init+0xbc/0x23e
[    0.120007]  [<c010420f>] kernel_thread_helper+0x7/0x10
[    0.120007]  ============

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: lguest broken in 2.6.22-rc1-mm1
  2007-05-22 22:38 lguest broken in 2.6.22-rc1-mm1 Matt Mackall
@ 2007-05-22 23:27 ` Rusty Russell
  2007-06-04 17:19   ` lguest rebroken in 2.6.22-rc3-mm1 Matt Mackall
  0 siblings, 1 reply; 18+ messages in thread
From: Rusty Russell @ 2007-05-22 23:27 UTC (permalink / raw)
  To: Matt Mackall; +Cc: linux-kernel, akpm

On Tue, 2007-05-22 at 17:38 -0500, Matt Mackall wrote:
> [    0.120007] EIP is at resync_sc_freq+0x4b/0x56

Hi Matt,

	Thanks for the report!  Andrew should have these two patches queued,
but here they are again:

If you set tsc_disable (eg "notsc" on cmdline), sched-clock.c gives a
divide by zero on boot.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

---
 arch/i386/kernel/sched-clock.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

===================================================================
--- a/arch/i386/kernel/sched-clock.c
+++ b/arch/i386/kernel/sched-clock.c
@@ -103,7 +103,7 @@ static void resync_sc_freq(struct sc_dat
 static void resync_sc_freq(struct sc_data *sc, unsigned int newfreq)
 {
 	sc->sync_base = jiffies;
-	if (!cpu_has_tsc) {
+	if (!cpu_has_tsc || tsc_disable) {
 		sc->unstable = 1;
 		return;
 	}

Lguest guests don't use the TSC, and so we must disable it otherwise
sched-clock.c barfs.

Also, we no longer need to explicitly set the PGE feature bit:
cpu_detect->cpuid->lguest_cpuid does that for us now that cpu_detect
uses paravirt_ops (IIRC it used to do a direct cpuid from assembler).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

---
 drivers/lguest/lguest.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

===================================================================
--- a/drivers/lguest/lguest.c
+++ b/drivers/lguest/lguest.c
@@ -504,10 +504,10 @@ __init void lguest_init(void *boot)
 	reserve_top_address(lguest_data.reserve_mem);
 
 	cpu_detect(&new_cpu_data);
-	/* Need this before paging_init. */
-	set_bit(X86_FEATURE_PGE, new_cpu_data.x86_capability);
 	/* Math is always hard! */
 	new_cpu_data.hard_math = 1;
+
+	tsc_disable = 1;
 
 #ifdef CONFIG_X86_MCE
 	mce_disabled = 1;



^ permalink raw reply	[flat|nested] 18+ messages in thread

* lguest rebroken in 2.6.22-rc3-mm1
  2007-05-22 23:27 ` Rusty Russell
@ 2007-06-04 17:19   ` Matt Mackall
  2007-06-04 17:28     ` Andrew Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Matt Mackall @ 2007-06-04 17:19 UTC (permalink / raw)
  To: Rusty Russell; +Cc: linux-kernel, akpm

On Wed, May 23, 2007 at 09:27:40AM +1000, Rusty Russell wrote:
> On Tue, 2007-05-22 at 17:38 -0500, Matt Mackall wrote:
> > [    0.120007] EIP is at resync_sc_freq+0x4b/0x56
> 
> Hi Matt,
> 
> 	Thanks for the report!  Andrew should have these two patches queued,
> but here they are again:
> 
> If you set tsc_disable (eg "notsc" on cmdline), sched-clock.c gives a
> divide by zero on boot.
> 
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> 
> ---
>  arch/i386/kernel/sched-clock.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> ===================================================================
> --- a/arch/i386/kernel/sched-clock.c
> +++ b/arch/i386/kernel/sched-clock.c
> @@ -103,7 +103,7 @@ static void resync_sc_freq(struct sc_dat
>  static void resync_sc_freq(struct sc_data *sc, unsigned int newfreq)
>  {
>  	sc->sync_base = jiffies;
> -	if (!cpu_has_tsc) {
> +	if (!cpu_has_tsc || tsc_disable) {
>  		sc->unstable = 1;
>  		return;
>  	}

Looks like this one got lost in rc3-mm1.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: lguest rebroken in 2.6.22-rc3-mm1
  2007-06-04 17:19   ` lguest rebroken in 2.6.22-rc3-mm1 Matt Mackall
@ 2007-06-04 17:28     ` Andrew Morton
  2007-06-04 17:46       ` Matt Mackall
  2007-06-04 18:12       ` Andi Kleen
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2007-06-04 17:28 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Rusty Russell, linux-kernel, Andi Kleen

On Mon, 4 Jun 2007 12:19:33 -0500 Matt Mackall <mpm@selenic.com> wrote:

> On Wed, May 23, 2007 at 09:27:40AM +1000, Rusty Russell wrote:
> > On Tue, 2007-05-22 at 17:38 -0500, Matt Mackall wrote:
> > > [    0.120007] EIP is at resync_sc_freq+0x4b/0x56
> > 
> > Hi Matt,
> > 
> > 	Thanks for the report!  Andrew should have these two patches queued,
> > but here they are again:
> > 
> > If you set tsc_disable (eg "notsc" on cmdline), sched-clock.c gives a
> > divide by zero on boot.
> > 
> > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > 
> > ---
> >  arch/i386/kernel/sched-clock.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > ===================================================================
> > --- a/arch/i386/kernel/sched-clock.c
> > +++ b/arch/i386/kernel/sched-clock.c
> > @@ -103,7 +103,7 @@ static void resync_sc_freq(struct sc_dat
> >  static void resync_sc_freq(struct sc_data *sc, unsigned int newfreq)
> >  {
> >  	sc->sync_base = jiffies;
> > -	if (!cpu_has_tsc) {
> > +	if (!cpu_has_tsc || tsc_disable) {
> >  		sc->unstable = 1;
> >  		return;
> >  	}
> 
> Looks like this one got lost in rc3-mm1.
> 

Andi said that he fixed the zero-divide by other means?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: lguest rebroken in 2.6.22-rc3-mm1
  2007-06-04 17:28     ` Andrew Morton
@ 2007-06-04 17:46       ` Matt Mackall
  2007-06-04 18:12       ` Andi Kleen
  1 sibling, 0 replies; 18+ messages in thread
From: Matt Mackall @ 2007-06-04 17:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Rusty Russell, linux-kernel, Andi Kleen

On Mon, Jun 04, 2007 at 10:28:20AM -0700, Andrew Morton wrote:
> On Mon, 4 Jun 2007 12:19:33 -0500 Matt Mackall <mpm@selenic.com> wrote:
> 
> > On Wed, May 23, 2007 at 09:27:40AM +1000, Rusty Russell wrote:
> > > On Tue, 2007-05-22 at 17:38 -0500, Matt Mackall wrote:
> > > > [    0.120007] EIP is at resync_sc_freq+0x4b/0x56
> > > 
> > > Hi Matt,
> > > 
> > > 	Thanks for the report!  Andrew should have these two patches queued,
> > > but here they are again:
> > > 
> > > If you set tsc_disable (eg "notsc" on cmdline), sched-clock.c gives a
> > > divide by zero on boot.
> > > 
> > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > > 
> > > ---
> > >  arch/i386/kernel/sched-clock.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > ===================================================================
> > > --- a/arch/i386/kernel/sched-clock.c
> > > +++ b/arch/i386/kernel/sched-clock.c
> > > @@ -103,7 +103,7 @@ static void resync_sc_freq(struct sc_dat
> > >  static void resync_sc_freq(struct sc_data *sc, unsigned int newfreq)
> > >  {
> > >  	sc->sync_base = jiffies;
> > > -	if (!cpu_has_tsc) {
> > > +	if (!cpu_has_tsc || tsc_disable) {
> > >  		sc->unstable = 1;
> > >  		return;
> > >  	}
> > 
> > Looks like this one got lost in rc3-mm1.
> > 
> 
> Andi said that he fixed the zero-divide by other means?

Got this, patch fixed it:

[    0.392024] divide error: 0000 [#1]
[    0.392024] PREEMPT 
[    0.392024] Modules linked in:
[    0.392024] CPU:    0
[    0.392024] EIP:    0061:[<c010995a>]    Not tainted VLI
[    0.392024] EFLAGS: 00010246   (2.6.22-rc3-mm1 #145)
[    0.392024] EIP is at resync_freq+0x7a/0x85
[    0.392024] eax: 3d090000   ebx: c1c04f70   ecx: 00000e70   edx: 00000000
[    0.392024] esi: 00000000   edi: c1c04f80   ebp: c1c04f68   esp: c1c04f64
[    0.392024] ds: 007b   es: 007b   fs: 0000  gs: 0000  ss: 0069
[    0.392024] Process swapper (pid: 1, ti=c1c04000 task=c1c20ae0 task.ti=c1c04000)
[    0.392024] Stack: 00000000 c1c04f84 c058eb2d 00000000 00000000 00000000 00000000 00000000 
[    0.392024]        c1c04fe0 c058c6c4 c1c04fac c0119c22 00000000 00000002 c054c320 c1c20ae0 
[    0.392024]        00000000 c05b2dc0 c1c20ae0 00000000 00000001 c058c608 00000000 00000000 
[    0.392024] Call Trace:
[    0.392024]  [<c058eb2d>] init_sched_clock+0x5c/0x90
[    0.392024]  [<c058c6c4>] kernel_init+0xbc/0x23e
[    0.392024]  [<c0104a6b>] kernel_thread_helper+0x7/0x10
[    0.392024]  =======================
[    0.392024] INFO: lockdep is turned off.
[    0.392024] Code: 30 0c 55 c0 89 15 34 0c 55 c0 e8 75 a6 00 00 90 31 c9 89 d1 a3 28 0c 55 c0 ba 00 00 00 00 b8 00 00 09 3d 89 0d 2c 0c 55 c0 31 d2 <f7> 73 08 a3 20 0c 55 c0 5b 5d c3 55 b8 01 00 00 00 89 e5 57 56 
[    0.392024] EIP: [<c010995a>] resync_freq+0x7a/0x85 SS:ESP 0069:c1c04f64
[    0.392024] Kernel panic - not syncing: Attempted to kill init!


-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: lguest rebroken in 2.6.22-rc3-mm1
  2007-06-04 17:28     ` Andrew Morton
  2007-06-04 17:46       ` Matt Mackall
@ 2007-06-04 18:12       ` Andi Kleen
  2007-06-05  2:48         ` Rusty Russell
  1 sibling, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2007-06-04 18:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matt Mackall, Rusty Russell, linux-kernel


> >
> > Looks like this one got lost in rc3-mm1.
>
> Andi said that he fixed the zero-divide by other means?

I determined it cannot happen in my source tree. When notsc
is passed TSC CPUID is cleared and sched-clock works.

I suspect what happens is that lguest forgets to clear the TSC cpuid
bit when it disables TSC. Then the TSC frequency doesn't get computed
and sched-clock can divide by zero.That's purely a lguest bug that needs
to be fixed in lguest with a 
clear_bit(X86_FEATURE_TSC, &boot_cpu_data.x86_capability) 
somewhere

-Andi
 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: lguest rebroken in 2.6.22-rc3-mm1
  2007-06-04 18:12       ` Andi Kleen
@ 2007-06-05  2:48         ` Rusty Russell
  2007-06-05 10:01           ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: Rusty Russell @ 2007-06-05  2:48 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Matt Mackall, linux-kernel

On Mon, 2007-06-04 at 20:12 +0200, Andi Kleen wrote:
> > >
> > > Looks like this one got lost in rc3-mm1.
> >
> > Andi said that he fixed the zero-divide by other means?
> 
> I determined it cannot happen in my source tree. When notsc
> is passed TSC CPUID is cleared and sched-clock works.
> 
> I suspect what happens is that lguest forgets to clear the TSC cpuid
> bit when it disables TSC. Then the TSC frequency doesn't get computed
> and sched-clock can divide by zero.That's purely a lguest bug that needs
> to be fixed in lguest with a 
> clear_bit(X86_FEATURE_TSC, &boot_cpu_data.x86_capability) 
> somewhere

It's not quite that simple; lguest's paravirt_ops->cpuid sets TSC off,
and indeed X86_FEATURE_TSC isn't set in boot_cpu_data.x86_capability.

But TSC is a "required feature", so "cpu_has_tsc" is always true.

How about this patch:
===
Don't try to disable the TSC: it's a required feature under modern
configurations, so just mark the sched clock unstable which has the
same effect.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 drivers/lguest/lguest.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

===================================================================
--- a/drivers/lguest/lguest.c
+++ b/drivers/lguest/lguest.c
@@ -37,6 +37,7 @@
 #include <asm/e820.h>
 #include <asm/mce.h>
 #include <asm/io.h>
+#include <asm/sched-clock.h>
 
 /* Declarations for definitions in lguest_guest.S */
 extern char lguest_noirq_start[], lguest_noirq_end[];
@@ -508,7 +509,8 @@ __init void lguest_init(void *boot)
 	/* Math is always hard! */
 	new_cpu_data.hard_math = 1;
 
-	tsc_disable = 1;
+	/* Sched clock is unusable: you'll just hurt yourself if you try. */
+	__get_cpu_var(sc_data).unstable++;
 
 #ifdef CONFIG_X86_MCE
 	mce_disabled = 1;



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: lguest rebroken in 2.6.22-rc3-mm1
  2007-06-05  2:48         ` Rusty Russell
@ 2007-06-05 10:01           ` Andi Kleen
  2007-06-05 13:11             ` [PATCH] lguest-fix-divide-error-implement-sched_clock Rusty Russell
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2007-06-05 10:01 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Andrew Morton, Matt Mackall, linux-kernel


> But TSC is a "required feature", so "cpu_has_tsc" is always true.

Hmm? It isn't. What makes you think so?

cpufeature.h:
#define cpu_has_tsc             boot_cpu_has(X86_FEATURE_TSC)

% grep -i tsc include/asm-i386/required-features.h
% 

> How about this patch:
> ===
> Don't try to disable the TSC: it's a required feature under modern
> configurations, so just mark the sched clock unstable which has the
> same effect.

No, using the cpuid bit is the correct way. Or better fix lguest to support
TSC properly. You cannot stay forever in the keep-it-over-simple Minix trap anyways.

-Andi

 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 10:01           ` Andi Kleen
@ 2007-06-05 13:11             ` Rusty Russell
  2007-06-05 14:24               ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: Rusty Russell @ 2007-06-05 13:11 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Matt Mackall, linux-kernel

On Tue, 2007-06-05 at 12:01 +0200, Andi Kleen wrote:
> > But TSC is a "required feature", so "cpu_has_tsc" is always true.
> 
> Hmm? It isn't. What makes you think so?

Interestingly it seems to be only in -mm.

> > How about this patch:
> > ===
> > Don't try to disable the TSC: it's a required feature under modern
> > configurations, so just mark the sched clock unstable which has the
> > same effect.
> 
> No, using the cpuid bit is the correct way. Or better fix lguest to support
> TSC properly.

I have a patch for a tsc-based clock, but that's a little orthogonal: I
actually need to override sched_clock.

(I'd really like to do stolen time and everything, but looking at Xen
it's a lot of code to do right, and noone's asked for it yet).

This time for sure!

===
In recent -mm kernels, the TSC capability cannot be disabled,
resulting in a divide by zero error in the normal sched_clock.

The correct fix is to have a special lguest sched_clock
implementation: this is as simple as it gets.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 drivers/lguest/lguest.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

===================================================================
--- a/drivers/lguest/lguest.c
+++ b/drivers/lguest/lguest.c
@@ -351,11 +351,18 @@ static void lguest_time_irq(unsigned int
 	update_process_times(user_mode_vm(get_irq_regs()));
 }
 
+static u64 sched_clock_base;
 static void lguest_time_init(void)
 {
 	set_irq_handler(0, lguest_time_irq);
 	hcall(LHCALL_TIMER_READ, 0, 0, 0);
+	sched_clock_base = jiffies_64;
 	enable_lguest_irq(0);
+}
+
+static unsigned long long lguest_sched_clock(void)
+{
+	return (jiffies_64 - sched_clock_base) * (1000000000 / HZ);
 }
 
 static void lguest_load_esp0(struct tss_struct *tss,
@@ -494,6 +501,7 @@ __init void lguest_init(void *boot)
 	paravirt_ops.time_init = lguest_time_init;
 	paravirt_ops.set_lazy_mode = lguest_lazy_mode;
 	paravirt_ops.wbinvd = lguest_wbinvd;
+	paravirt_ops.sched_clock = lguest_sched_clock;
 
 	hcall(LHCALL_LGUEST_INIT, __pa(&lguest_data), 0, 0);
 
@@ -507,8 +515,6 @@ __init void lguest_init(void *boot)
 	cpu_detect(&new_cpu_data);
 	/* Math is always hard! */
 	new_cpu_data.hard_math = 1;
-
-	tsc_disable = 1;
 
 #ifdef CONFIG_X86_MCE
 	mce_disabled = 1;




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 13:11             ` [PATCH] lguest-fix-divide-error-implement-sched_clock Rusty Russell
@ 2007-06-05 14:24               ` Andi Kleen
  2007-06-05 16:07                 ` Andrew Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2007-06-05 14:24 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Andrew Morton, Matt Mackall, linux-kernel

On Tuesday 05 June 2007 15:11, Rusty Russell wrote:
> On Tue, 2007-06-05 at 12:01 +0200, Andi Kleen wrote:
> > > But TSC is a "required feature", so "cpu_has_tsc" is always true.
> >
> > Hmm? It isn't. What makes you think so?
>
> Interestingly it seems to be only in -mm.

If it is then it doesn't come out of my tree. Also sounds broken to me.
The required bits are only for features needed by the compiler's generated
code where not having them could cause an early crash.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 14:24               ` Andi Kleen
@ 2007-06-05 16:07                 ` Andrew Morton
  2007-06-05 16:16                   ` H. Peter Anvin
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2007-06-05 16:07 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Rusty Russell, Matt Mackall, linux-kernel, H. Peter Anvin

On Tue, 5 Jun 2007 16:24:52 +0200 Andi Kleen <ak@suse.de> wrote:

> On Tuesday 05 June 2007 15:11, Rusty Russell wrote:
> > On Tue, 2007-06-05 at 12:01 +0200, Andi Kleen wrote:
> > > > But TSC is a "required feature", so "cpu_has_tsc" is always true.
> > >
> > > Hmm? It isn't. What makes you think so?
> >
> > Interestingly it seems to be only in -mm.
> 
> If it is then it doesn't come out of my tree. Also sounds broken to me.
> The required bits are only for features needed by the compiler's generated
> code where not having them could cause an early crash.
> 

This?

-#define REQUIRED_MASK1 (NEED_PAE|NEED_CMOV|NEED_CMPXCHG64)
+#define REQUIRED_MASK0 (NEED_FPU|NEED_TSC|NEED_PAE|NEED_CMOV|NEED_CX8)

Came in via git-newsetup

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 16:07                 ` Andrew Morton
@ 2007-06-05 16:16                   ` H. Peter Anvin
  2007-06-05 16:37                     ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2007-06-05 16:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Andi Kleen, Rusty Russell, Matt Mackall, linux-kernel

Andrew Morton wrote:
> On Tue, 5 Jun 2007 16:24:52 +0200 Andi Kleen <ak@suse.de> wrote:
> 
>> On Tuesday 05 June 2007 15:11, Rusty Russell wrote:
>>> On Tue, 2007-06-05 at 12:01 +0200, Andi Kleen wrote:
>>>>> But TSC is a "required feature", so "cpu_has_tsc" is always true.
>>>> Hmm? It isn't. What makes you think so?
>>> Interestingly it seems to be only in -mm.
>> If it is then it doesn't come out of my tree. Also sounds broken to me.
>> The required bits are only for features needed by the compiler's generated
>> code where not having them could cause an early crash.

Yes.  Since there is now a mechanism to get a clean message out, it
seemed like a good idea to extend the benefit of static determination.
Andi already had in his tree -- and I copied it -- code to deal with
stuff like "cpu_has_tsc" as a compile-time constant, eliminating the
"else" clause.

Depending on the configuration it affects FPU, TSC, 3Dnow.

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 16:16                   ` H. Peter Anvin
@ 2007-06-05 16:37                     ` Andi Kleen
  2007-06-05 17:02                       ` H. Peter Anvin
                                         ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Andi Kleen @ 2007-06-05 16:37 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Andrew Morton, Rusty Russell, Matt Mackall, linux-kernel

On Tuesday 05 June 2007 18:16, H. Peter Anvin wrote:

> Yes.  Since there is now a mechanism to get a clean message out, it
> seemed like a good idea to extend the benefit of static determination.
> Andi already had in his tree -- and I copied it -- code to deal with
> stuff like "cpu_has_tsc" as a compile-time constant, eliminating the
> "else" clause.
>
> Depending on the configuration it affects FPU, TSC, 3Dnow.

I don't think it's a good idea for the TSC. There are various
setups where it is unreliable and also often simulators don't 
implement it correctly. And it's always a valuable workaround
to be able to turn it off.

Except possibly for the FPU only features used by the gcc output
should be tested this way. For everything else it is better to 
test at runtime.

That is x86-64 makes some more assumptions. But even it
doesn't assume TSC. 

I added the mechanism to statically evaluate mostly to share cpufeatures.h
between 32bit and 64bit at some point -- but didn't quite finish that work 
before the last merge.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 16:37                     ` Andi Kleen
@ 2007-06-05 17:02                       ` H. Peter Anvin
  2007-06-05 18:43                       ` H. Peter Anvin
  2007-06-05 21:15                       ` H. Peter Anvin
  2 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2007-06-05 17:02 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Rusty Russell, Matt Mackall, linux-kernel

Andi Kleen wrote:
> 
> I don't think it's a good idea for the TSC. There are various
> setups where it is unreliable and also often simulators don't 
> implement it correctly. And it's always a valuable workaround
> to be able to turn it off.
> 
> Except possibly for the FPU only features used by the gcc output
> should be tested this way. For everything else it is better to 
> test at runtime.
> 
> That is x86-64 makes some more assumptions. But even it
> doesn't assume TSC. 
> 
> I added the mechanism to statically evaluate mostly to share cpufeatures.h
> between 32bit and 64bit at some point -- but didn't quite finish that work 
> before the last merge.
> 

Note that tsc_setup in i386 at least has:


static int __init tsc_setup(char *str)
{
        printk(KERN_WARNING "notsc: Kernel compiled with CONFIG_X86_TSC,"
                                "cannot disable TSC.\n");
        return 1;
}


If TSC isn't actually a compile-time feature then we should allow it to
be disabled on the command line!


This is what I have for i386:

#ifndef CONFIG_MATH_EMULATION
# define NEED_FPU       (1<<(X86_FEATURE_FPU & 31))
#else
# define NEED_FPU       0
#endif

#ifdef CONFIG_X86_TSC
# define NEED_TSC       (1<<(X86_FEATURE_TSC & 31))
#else
# define NEED_TSC       0
#endif

#ifdef CONFIG_X86_PAE
# define NEED_PAE       (1<<(X86_FEATURE_PAE & 31))
#else
# define NEED_PAE       0
#endif

#ifdef CONFIG_X86_CMOV
# define NEED_CMOV      (1<<(X86_FEATURE_CMOV & 31))
#else
# define NEED_CMOV      0
#endif

#ifdef CONFIG_X86_CMPXCHG64
# define NEED_CX8       (1<<(X86_FEATURE_CX8 & 31))
#else
# define NEED_CX8       0
#endif

#define REQUIRED_MASK0  (NEED_FPU|NEED_TSC|NEED_PAE|NEED_CMOV|NEED_CX8)

#ifdef CONFIG_X86_USE_3DNOW
# define NEED_3DNOW     (1<<(X86_FEATURE_3DNOW & 31))
#else
# define NEED_3DNOW     0
#endif

#define REQUIRED_MASK1  (NEED_3DNOW)

And for x86-64:


/* x86-64 baseline features */
#define NEED_FPU        (1<<(X86_FEATURE_FPU & 31))
#define NEED_PSE        (1<<(X86_FEATURE_PSE & 31))
#define NEED_TSC        (1<<(X86_FEATURE_TSC & 31))
#define NEED_MSR        (1<<(X86_FEATURE_MSR & 31))
#define NEED_PAE        (1<<(X86_FEATURE_PAE & 31))
#define NEED_CX8        (1<<(X86_FEATURE_CX8 & 31))
#define NEED_PGE        (1<<(X86_FEATURE_PGE & 31))
#define NEED_FXSR       (1<<(X86_FEATURE_FXSR & 31))
#define NEED_CMOV       (1<<(X86_FEATURE_CMOV & 31))
#define NEED_XMM        (1<<(X86_FEATURE_XMM & 31))
#define NEED_XMM2       (1<<(X86_FEATURE_XMM2 & 31))

#define REQUIRED_MASK0  (NEED_FPU|NEED_PSE|NEED_TSC|NEED_MSR|NEED_PAE|\
                         NEED_CX8|NEED_PGE|NEED_FXSR|NEED_CMOV|\
                         NEED_XMM|NEED_XMM2)
#define SSE_MASK        (NEED_XMM|NEED_XMM2)

/* x86-64 baseline features */
#define NEED_LM         (1<<(X86_FEATURE_LM & 31))

#ifdef CONFIG_X86_USE_3DNOW
# define NEED_3DNOW     (1<<(X86_FEATURE_3DNOW & 31))
#else
# define NEED_3DNOW     0
#endif

#define REQUIRED_MASK1  (NEED_LM|NEED_3DNOW)




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 16:37                     ` Andi Kleen
  2007-06-05 17:02                       ` H. Peter Anvin
@ 2007-06-05 18:43                       ` H. Peter Anvin
  2007-06-05 18:51                         ` Andi Kleen
  2007-06-05 21:15                       ` H. Peter Anvin
  2 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2007-06-05 18:43 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Rusty Russell, Matt Mackall, linux-kernel

Andi Kleen wrote:
> 
> I don't think it's a good idea for the TSC. There are various
> setups where it is unreliable and also often simulators don't 
> implement it correctly. And it's always a valuable workaround
> to be able to turn it off.
> 

I dug some more into the TSC code, and found some other annoying stuff:

/*
 * Make an educated guess if the TSC is trustworthy and synchronized
 * over all CPUs.
 */
__cpuinit int unsynchronized_tsc(void)
{
        if (!cpu_has_tsc || tsc_unstable)
                return 1;
        /*
         * Intel systems are normally all synchronized.
         * Exceptions must mark TSC as unstable:
         */
        if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) {
                /* assume multi socket systems are not synchronized: */
                if (num_possible_cpus() > 1)
                        tsc_unstable = 1;
        }
        return tsc_unstable;
}

That's a vendor check foul.  That should be a CPU feature flag.

Looks like there is some work to be done here.

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 18:43                       ` H. Peter Anvin
@ 2007-06-05 18:51                         ` Andi Kleen
  0 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2007-06-05 18:51 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Andrew Morton, Rusty Russell, Matt Mackall, linux-kernel


> 
> That's a vendor check foul.  That should be a CPU feature flag.
>
> Looks like there is some work to be done here.

No.

That would just move that code elsewhere, but there is still only
a single caller who actually uses this. Besides there are further
checks to be done here (see x86-64) which does some things here
(like IO-APIC accesses) that you definitely don't want in the cpu init code.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 16:37                     ` Andi Kleen
  2007-06-05 17:02                       ` H. Peter Anvin
  2007-06-05 18:43                       ` H. Peter Anvin
@ 2007-06-05 21:15                       ` H. Peter Anvin
  2007-06-05 21:31                         ` Andi Kleen
  2 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2007-06-05 21:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Rusty Russell, Matt Mackall, linux-kernel

Andi Kleen wrote:
> 
> I don't think it's a good idea for the TSC. There are various
> setups where it is unreliable and also often simulators don't 
> implement it correctly. And it's always a valuable workaround
> to be able to turn it off.
> 

For all I can tell, if this is the case, then CONFIG_X86_TSC should be
removed completely.  We still support the TSC when CONFIG_X86_TSC is not
compiled in; that attribute controls if it is unconditionally required.

(We seem to still not always use it, which may be a good indication that
CONFIG_X86_TSC has outlived its usefulness.  Should it be removed?)

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] lguest-fix-divide-error-implement-sched_clock
  2007-06-05 21:15                       ` H. Peter Anvin
@ 2007-06-05 21:31                         ` Andi Kleen
  0 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2007-06-05 21:31 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Andrew Morton, Rusty Russell, Matt Mackall, linux-kernel

On Tuesday 05 June 2007 23:15:28 H. Peter Anvin wrote:
> Andi Kleen wrote:
> > 
> > I don't think it's a good idea for the TSC. There are various
> > setups where it is unreliable and also often simulators don't 
> > implement it correctly. And it's always a valuable workaround
> > to be able to turn it off.
> > 
> 
> For all I can tell, if this is the case, then CONFIG_X86_TSC should be
> removed completely.

Yes it should.

> We still support the TSC when CONFIG_X86_TSC is not 
> compiled in; that attribute controls if it is unconditionally required.
> 
> (We seem to still not always use it, which may be a good indication that
> CONFIG_X86_TSC has outlived its usefulness.  Should it be removed?)

Yes.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-06-05 21:31 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-22 22:38 lguest broken in 2.6.22-rc1-mm1 Matt Mackall
2007-05-22 23:27 ` Rusty Russell
2007-06-04 17:19   ` lguest rebroken in 2.6.22-rc3-mm1 Matt Mackall
2007-06-04 17:28     ` Andrew Morton
2007-06-04 17:46       ` Matt Mackall
2007-06-04 18:12       ` Andi Kleen
2007-06-05  2:48         ` Rusty Russell
2007-06-05 10:01           ` Andi Kleen
2007-06-05 13:11             ` [PATCH] lguest-fix-divide-error-implement-sched_clock Rusty Russell
2007-06-05 14:24               ` Andi Kleen
2007-06-05 16:07                 ` Andrew Morton
2007-06-05 16:16                   ` H. Peter Anvin
2007-06-05 16:37                     ` Andi Kleen
2007-06-05 17:02                       ` H. Peter Anvin
2007-06-05 18:43                       ` H. Peter Anvin
2007-06-05 18:51                         ` Andi Kleen
2007-06-05 21:15                       ` H. Peter Anvin
2007-06-05 21:31                         ` Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox