public inbox for kvm@vger.kernel.org
* w2k8 - reboots unexpected
@ 2009-06-18 11:27 Andreas Jud
  2009-06-18 11:37 ` Gleb Natapov
  0 siblings, 1 reply; 8+ messages in thread
From: Andreas Jud @ 2009-06-18 11:27 UTC (permalink / raw)
  To: kvm

Hi

We are running kvm-84 on Debian x64 Linux.

We have several guests running on this host.

One guest, which runs Windows Server 2008 and is used as a terminal server,
sometimes reboots unexpectedly.

All I have found so far is the following error in the kernel log:
Jun 18 10:33:17 sov07l kernel: [1470365.925930] kvm_handle_exit: Breaking out of NMI-blocked state on VCPU 1 after 1 s timeout
Jun 18 10:33:17 sov07l kernel: [1470365.929077] kvm_handle_exit: Breaking out of NMI-blocked state on VCPU 2 after 1 s timeout
Jun 18 10:33:17 sov07l kernel: [1470365.929077] kvm_handle_exit: Breaking out of NMI-blocked state on VCPU 3 after 1 s timeout

Does anybody have an idea what this means, and what we could do to resolve
the error and make the system stable?

Thanks

Andy


* Re: w2k8 - reboots unexpected
  2009-06-18 11:27 w2k8 - reboots unexpected Andreas Jud
@ 2009-06-18 11:37 ` Gleb Natapov
  2009-06-18 12:45   ` Avi Kivity
  2009-06-18 13:55   ` Andreas Jud
  0 siblings, 2 replies; 8+ messages in thread
From: Gleb Natapov @ 2009-06-18 11:37 UTC (permalink / raw)
  To: Andreas Jud; +Cc: kvm

On Thu, Jun 18, 2009 at 01:27:25PM +0200, Andreas Jud wrote:
> Hi
> 
> We are running kvm-84 on Debian x64 Linux.
> 
> We have several guests running on this host.
> 
> One guest, which runs Windows Server 2008 and is used as a terminal server,
> sometimes reboots unexpectedly.
> 
> All I have found so far is the following error in the kernel log:
> Jun 18 10:33:17 sov07l kernel: [1470365.925930] kvm_handle_exit: Breaking out of NMI-blocked state on VCPU 1 after 1 s timeout
> Jun 18 10:33:17 sov07l kernel: [1470365.929077] kvm_handle_exit: Breaking out of NMI-blocked state on VCPU 2 after 1 s timeout
> Jun 18 10:33:17 sov07l kernel: [1470365.929077] kvm_handle_exit: Breaking out of NMI-blocked state on VCPU 3 after 1 s timeout
> 
> Does anybody have an idea what this means, and what we could do to resolve
> the error and make the system stable?
> 
Your host CPU does not support NMI injection in VMX (your Intel processor
is too old). I can't tell for sure whether this is what causes w2k8 to
reboot, but it is possible. Does w2k8 have some kind of NMI watchdog?

--
			Gleb.


* Re: w2k8 - reboots unexpected
  2009-06-18 11:37 ` Gleb Natapov
@ 2009-06-18 12:45   ` Avi Kivity
  2009-06-18 14:16     ` Andreas Jud
  2009-06-18 14:24     ` Andreas Jud
  2009-06-18 13:55   ` Andreas Jud
  1 sibling, 2 replies; 8+ messages in thread
From: Avi Kivity @ 2009-06-18 12:45 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Andreas Jud, kvm

On 06/18/2009 02:37 PM, Gleb Natapov wrote:
> Your host CPU does not support NMI injection in VMX (your Intel processor
> is too old). I can't tell for sure whether this is what causes w2k8 to
> reboot, but it is possible. Does w2k8 have some kind of NMI watchdog?
>
>    

It doesn't inject NMIs here.

Can you set up memory dumping on BSODs and run the !analyze command in 
windbg?

-- 
error compiling committee.c: too many arguments to function



* Re: w2k8 - reboots unexpected
  2009-06-18 11:37 ` Gleb Natapov
  2009-06-18 12:45   ` Avi Kivity
@ 2009-06-18 13:55   ` Andreas Jud
  2009-06-18 14:02     ` Gleb Natapov
  1 sibling, 1 reply; 8+ messages in thread
From: Andreas Jud @ 2009-06-18 13:55 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

Hi Gleb

Thank you for the fast reply.

On Thu, 18 Jun 2009 14:37:47 +0300, Gleb Natapov wrote
> Your host CPU does not support NMI injection in VMX (your Intel processor
> is too old). I can't tell for sure whether this is what causes w2k8 to
> reboot, but it is possible.
> 

It's a system with 2x L5320 (Quad Core Xeon). (see below)

The reboots are random: sometimes during working hours, sometimes during
the night when nobody is logged in. Sometimes the system stays up for over
a week, sometimes it reboots twice a day.


> Does w2k8 have some kind of NMI watchdog?

I don't know what that is, so I can't say whether it has one.

Thanks 

Andy

--

cat /proc/cpuinfo:

processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU           L5320  @ 1.86GHz
stepping        : 7
cpu MHz         : 1866.732
cache size      : 4096 KB
physical id     : 1
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx tm2 ssse3 cx16 xtpr dca lahf_lm
bogomips        : 3733.58
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
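
Reading a wrapped flags list by eye is error-prone, so a small script can check
for a given flag instead. The following is only a sketch: the helper name
cpu_flags is made up for illustration, and it assumes the usual
"key : value" layout of /proc/cpuinfo with the flags on a single line, as the
kernel prints them. Note that the presence of vmx only shows that VT-x is
available; it does not by itself say anything about the NMI injection support
Gleb mentions.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line.

    cpuinfo_text is the content of /proc/cpuinfo as one string.
    An empty set is returned if no 'flags' line is present.
    """
    for line in cpuinfo_text.splitlines():
        # Each line looks like "key<padding>: value"; split at the first colon.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Hypothetical usage on a live host:
#     with open("/proc/cpuinfo") as f:
#         print("vmx" in cpu_flags(f.read()))
```

Only the first processor entry is inspected, which is normally enough because
all cores of the same CPU model report the same flags.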




* Re: w2k8 - reboots unexpected
  2009-06-18 13:55   ` Andreas Jud
@ 2009-06-18 14:02     ` Gleb Natapov
  2009-06-19  5:44       ` Andreas Jud
  0 siblings, 1 reply; 8+ messages in thread
From: Gleb Natapov @ 2009-06-18 14:02 UTC (permalink / raw)
  To: Andreas Jud; +Cc: kvm

On Thu, Jun 18, 2009 at 03:55:32PM +0200, Andreas Jud wrote:
> Hi Gleb
> 
> Thank you for the fast reply.
> 
> On Thu, 18 Jun 2009 14:37:47 +0300, Gleb Natapov wrote
> > Your host CPU does not support NMI injection in VMX (your Intel processor
> > is too old). I can't tell for sure whether this is what causes w2k8 to
> > reboot, but it is possible.
> > 
> 
> It's a system with 2x L5320 (Quad Core Xeon). (see below)
> 
> The reboots are random: sometimes during working hours, sometimes during
> the night when nobody is logged in. Sometimes the system stays up for over
> a week, sometimes it reboots twice a day.
> 
> 
Is this the only VM that runs on this host? If not, what other guests are
you running? Maybe the messages you see are not generated by the
problematic guest after all.

--
			Gleb.


* Re: w2k8 - reboots unexpected
  2009-06-18 12:45   ` Avi Kivity
@ 2009-06-18 14:16     ` Andreas Jud
  2009-06-18 14:24     ` Andreas Jud
  1 sibling, 0 replies; 8+ messages in thread
From: Andreas Jud @ 2009-06-18 14:16 UTC (permalink / raw)
  To: Avi Kivity, Gleb Natapov; +Cc: kvm

> 
> Can you set up memory dumping on BSODs and run the !analyze command 
> in windbg?
> 

Hopefully I've done it right.

See below for the output of windbg:

--
0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0000000000000000, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffffa8007db5797, address which referenced memory

Debugging Details:
------------------


READ_ADDRESS:  0000000000000000 

CURRENT_IRQL:  2

FAULTING_IP: 
+0
fffffa80`07db5797 ??              ???

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

BUGCHECK_STR:  0xD1

PROCESS_NAME:  System

TRAP_FRAME:  fffff80003e7f610 -- (.trap 0xfffff80003e7f610)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000201
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=fffffa8007db5797 rsp=fffff80003e7f7a0 rbp=fffff80003e7f800
 r8=000000000000082f  r9=000000000000000d r10=0000000000000000
r11=fffff80001a45640 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei ng nz na po nc
fffffa80`07db5797 ??              ???
Resetting default scope

LAST_CONTROL_TRANSFER:  from fffff800018703ee to fffff80001870650

STACK_TEXT:  
fffff800`03e7f4c8 fffff800`018703ee : 00000000`0000000a 00000000`00000000 00000000`00000002 00000000`00000000 : nt!KeBugCheckEx
fffff800`03e7f4d0 fffff800`0186f2cb : 00000000`00000000 fffffa80`04a7d000 fffffa80`07a28d00 fffffa80`07db418f : nt!KiBugCheckDispatch+0x6e
fffff800`03e7f610 fffffa80`07db5797 : fffff800`03e7f7b0 fffff800`01d43cb1 00000000`00000000 fffffa60`00e5c12d : nt!KiPageFault+0x20b
fffff800`03e7f7a0 fffff800`03e7f7b0 : fffff800`01d43cb1 00000000`00000000 fffffa60`00e5c12d fffffa80`0573bbb0 : 0xfffffa80`07db5797
fffff800`03e7f7a8 fffff800`01d43cb1 : 00000000`00000000 fffffa60`00e5c12d fffffa80`0573bbb0 fffffa60`005ec180 : 0xfffff800`03e7f7b0
fffff800`03e7f7b0 fffff800`01877005 : fffffa80`0768a510 fffffa60`005ec180 fffffa60`0085e110 fffffa80`050e6340 : hal!HalpRequestIpiSpecifyVector+0x81
fffff800`03e7f7e0 fffff800`01876773 : ffffffff`00000001 fffff800`01990680 00000000`00000000 00000000`00000002 : nt!KiDeferredReadyThread+0x405
fffff800`03e7f830 00000000`fffffa80 : 01868e03`0010e380 00000000`fffff800 00000000`00000000 00000000`00000000 : nt!KeSetEvent+0x1f3
fffff800`03e7f8a0 01868e03`0010e380 : 00000000`fffff800 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffffa80
fffff800`03e7f8a8 00000000`fffff800 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x1868e03`0010e380
fffff800`03e7f8b0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffff800


STACK_COMMAND:  kb

FOLLOWUP_IP: 
nt!KiPageFault+20b
fffff800`0186f2cb 488d058e320000  lea     rax,[nt!RtlInterlockedPopEntrySList (fffff800`01872560)]

SYMBOL_STACK_INDEX:  2

SYMBOL_NAME:  nt!KiPageFault+20b

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

IMAGE_NAME:  ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  49ac93e1

FAILURE_BUCKET_ID:  X64_0xD1_nt!KiPageFault+20b

BUCKET_ID:  X64_0xD1_nt!KiPageFault+20b

Followup: MachineOwner
---------


* Re: w2k8 - reboots unexpected
  2009-06-18 12:45   ` Avi Kivity
  2009-06-18 14:16     ` Andreas Jud
@ 2009-06-18 14:24     ` Andreas Jud
  1 sibling, 0 replies; 8+ messages in thread
From: Andreas Jud @ 2009-06-18 14:24 UTC (permalink / raw)
  To: Avi Kivity, Gleb Natapov; +Cc: kvm

On Thu, 18 Jun 2009 15:45:54 +0300, Avi Kivity wrote

> It doesn't inject NMIs here.
> 
> Can you set up memory dumping on BSODs and run the !analyze command 
> in windbg?

I've found another dump (a minidump) which has some additional information
(see below):


--

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0000000000000000, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffffa8007db5797, address which referenced memory

Debugging Details:
------------------


READ_ADDRESS: GetPointerFromAddress: unable to read from fffff80001a45080
 0000000000000000 

CURRENT_IRQL:  2

FAULTING_IP: 
+0
fffffa80`07db5797 ??              ???

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

BUGCHECK_STR:  0xD1

PROCESS_NAME:  System

TRAP_FRAME:  fffff80003e7f610 -- (.trap 0xfffff80003e7f610)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000201
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=fffffa8007db5797 rsp=fffff80003e7f7a0 rbp=fffff80003e7f800
 r8=000000000000082f  r9=000000000000000d r10=0000000000000000
r11=fffff80001a45640 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei ng nz na po nc
fffffa80`07db5797 ??              ???
Resetting default scope

LAST_CONTROL_TRANSFER:  from fffff800018703ee to fffff80001870650

STACK_TEXT:  
fffff800`03e7f4c8 fffff800`018703ee : 00000000`0000000a 00000000`00000000 00000000`00000002 00000000`00000000 : nt!KeBugCheckEx
fffff800`03e7f4d0 fffff800`0186f2cb : 00000000`00000000 fffffa80`04a7d000 fffffa80`07a28d00 fffffa80`07db418f : nt!KiBugCheckDispatch+0x6e
fffff800`03e7f610 fffffa80`07db5797 : fffff800`03e7f7b0 fffff800`01d43cb1 00000000`00000000 fffffa60`00e5c12d : nt!KiPageFault+0x20b
fffff800`03e7f7a0 fffff800`03e7f7b0 : fffff800`01d43cb1 00000000`00000000 fffffa60`00e5c12d fffffa80`0573bbb0 : 0xfffffa80`07db5797
fffff800`03e7f7a8 fffff800`01d43cb1 : 00000000`00000000 fffffa60`00e5c12d fffffa80`0573bbb0 fffffa60`005ec180 : 0xfffff800`03e7f7b0
fffff800`03e7f7b0 fffff800`01877005 : fffffa80`0768a510 fffffa60`005ec180 fffffa60`0085e110 fffffa80`050e6340 : hal!HalpRequestIpiSpecifyVector+0x81
fffff800`03e7f7e0 fffff800`01876773 : ffffffff`00000001 fffff800`01990680 00000000`00000000 00000000`00000002 : nt!KiDeferredReadyThread+0x405
fffff800`03e7f830 00000000`fffffa80 : 01868e03`0010e380 00000000`fffff800 00000000`00000000 00000000`00000000 : nt!KeSetEvent+0x1f3
fffff800`03e7f8a0 01868e03`0010e380 : 00000000`fffff800 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffffa80
fffff800`03e7f8a8 00000000`fffff800 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x1868e03`0010e380
fffff800`03e7f8b0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffff800


STACK_COMMAND:  kb

FOLLOWUP_IP: 
nt!KiPageFault+20b
fffff800`0186f2cb 488d058e320000  lea     rax,[nt!RtlInterlockedPopEntrySList (fffff800`01872560)]

SYMBOL_STACK_INDEX:  2

SYMBOL_NAME:  nt!KiPageFault+20b

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

IMAGE_NAME:  ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  49ac93e1

FAILURE_BUCKET_ID:  X64_0xD1_nt!KiPageFault+20b

BUCKET_ID:  X64_0xD1_nt!KiPageFault+20b

Followup: MachineOwner


* Re: w2k8 - reboots unexpected
  2009-06-18 14:02     ` Gleb Natapov
@ 2009-06-19  5:44       ` Andreas Jud
  0 siblings, 0 replies; 8+ messages in thread
From: Andreas Jud @ 2009-06-19  5:44 UTC (permalink / raw)
  To: kvm

> Is this the only VM that runs on this host? If not what other guests 
> are you running. May be the messages you see are not generated by 
> the problematic guest after all. 

No, there are about 15 VMs on this host.

They run Vista, W2k3, W2k8 and Debian Linux. I think all of the guest OSes
are x64.

The time at which this error shows up in the kernel log matches exactly the
time at which the machine reboots.

I am currently trying to use WinDbg, but without success so far.
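
To compare the kernel-log timestamps against the guest's reboot times more
systematically, the matching messages could be pulled out with a short script.
This is only a sketch under a few assumptions: the function names are invented
for illustration, the log uses the syslog line format quoted earlier in this
thread (with each message on one unwrapped line), and the log file path, for
example /var/log/kern.log on Debian, is a guess about the host's setup.

```python
import re
from collections import defaultdict

# Matches syslog lines such as:
#   Jun 18 10:33:17 sov07l kernel: [...] kvm_handle_exit: Breaking out of
#   NMI-blocked state on VCPU 1 after 1 s timeout   (all on one line)
NMI_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:.*"
    r"Breaking out of NMI-blocked state on VCPU (?P<vcpu>\d+)"
)


def nmi_events(log_lines):
    """Yield (timestamp string, VCPU number) for each matching log line."""
    for line in log_lines:
        match = NMI_RE.search(line)
        if match:
            yield match.group("stamp"), int(match.group("vcpu"))


def vcpus_by_time(log_lines):
    """Group the affected VCPU numbers by syslog timestamp."""
    grouped = defaultdict(list)
    for stamp, vcpu in nmi_events(log_lines):
        grouped[stamp].append(vcpu)
    return dict(grouped)


# Hypothetical usage (the path is an assumption):
#     with open("/var/log/kern.log") as f:
#         for stamp, vcpus in vcpus_by_time(f).items():
#             print(stamp, vcpus)
```

The message does not name the guest, so the timestamps still have to be matched
by hand against the reboot times recorded inside the Windows guest.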

