public inbox for kvm@vger.kernel.org
* kexec/kdump of a kvm guest?
@ 2008-06-26 18:58 Mike Snitzer
  2008-07-05 11:20 ` Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2008-06-26 18:58 UTC (permalink / raw)
  To: kvm

My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).

When I configure kdump in the guest (running 2.6.22.19) and force a
crash (with 'echo c > /proc/sysrq-trigger') kexec boots the kdump
kernel but then the kernel hangs (before it gets to /sbin/init et al).
 On the host, the associated qemu is consuming 100% cpu.
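
For reference, the kdump flow being exercised above amounts to roughly
the following; the kernel paths, initrd name, and crashkernel size are
illustrative, not the exact values used here:

```shell
# In the guest: reserve memory for the capture kernel on the boot
# command line, e.g.  crashkernel=128M@16M  (2.6.22-era syntax).
# Then load the capture kernel:
kexec -p /boot/vmlinuz-kdump \
      --initrd=/boot/initrd-kdump.img \
      --append="root=/dev/sda1 irqpoll maxcpus=1"
# Force a crash so the capture kernel takes over:
echo c > /proc/sysrq-trigger
```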

I really need to be able to collect vmcores from my kvm guests.  So
far I can't (on raw hardware all works fine).

Any pointers would be appreciated.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kexec/kdump of a kvm guest?
  2008-06-26 18:58 kexec/kdump of a kvm guest? Mike Snitzer
@ 2008-07-05 11:20 ` Avi Kivity
  2008-07-24  0:13   ` Mike Snitzer
  0 siblings, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2008-07-05 11:20 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: kvm

Mike Snitzer wrote:
> My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).
>
> When I configure kdump in the guest (running 2.6.22.19) and force a
> crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
> kernel but then the kernel hangs (before it gets to /sbin/init et al).
>  On the host, the associated qemu is consuming 100% cpu.
>
> I really need to be able to collect vmcores from my kvm guests.  So
> far I can't (on raw hardware all works fine).
>
>   

I tested this a while ago and it worked (though I tested regular 
kexecs, not crashes); this may be a regression.

Please run kvm_stat to see what's happening at the time of the crash.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



* Re: kexec/kdump of a kvm guest?
  2008-07-05 11:20 ` Avi Kivity
@ 2008-07-24  0:13   ` Mike Snitzer
  2008-07-24  8:39     ` Alexander Graf
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2008-07-24  0:13 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Sat, Jul 5, 2008 at 7:20 AM, Avi Kivity <avi@qumranet.com> wrote:
> Mike Snitzer wrote:
>>
>> My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).
>>
>> When I configure kdump in the guest (running 2.6.22.19) and force a
>> crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
>> kernel but then the kernel hangs (before it gets to /sbin/init et al).
>>  On the host, the associated qemu is consuming 100% cpu.
>>
>> I really need to be able to collect vmcores from my kvm guests.  So
>> far I can't (on raw hardware all works fine).
>>
>>
>
> I've tested this a while ago and it worked (though I tested regular kexecs,
> not crashes); this may be a regression.
>
> Please run kvm_stat to see what's happening at the time of the crash.

OK, I can look into kvm_stat, but I just discovered that merely having
kvm-intel and kvm loaded into my 2.6.22.19 kernel actually prevents
the host from being able to kexec/kdump too!?  I didn't have any
guests running (only the kvm modules were loaded).  As soon as I
unloaded the kvm modules, kdump worked as expected.

Something about kvm is completely breaking kexec/kdump on both the
host and guest kernels.
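
Concretely, the workaround observed here was just unloading the
modules before triggering kdump (module names as they appear on an
Intel host):

```shell
modprobe -r kvm_intel kvm    # after this, the kdump path works again
echo c > /proc/sysrq-trigger
```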

Mike


* Re: kexec/kdump of a kvm guest?
  2008-07-24  0:13   ` Mike Snitzer
@ 2008-07-24  8:39     ` Alexander Graf
  2008-07-24 11:49       ` Mike Snitzer
  0 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2008-07-24  8:39 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Avi Kivity, kvm, kexec


On Jul 24, 2008, at 2:13 AM, Mike Snitzer wrote:

> On Sat, Jul 5, 2008 at 7:20 AM, Avi Kivity <avi@qumranet.com> wrote:
>> Mike Snitzer wrote:
>>>
>>> My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).
>>>
>>> When I configure kdump in the guest (running 2.6.22.19) and force a
>>> crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
>>> kernel but then the kernel hangs (before it gets to /sbin/init et  
>>> al).
>>> On the host, the associated qemu is consuming 100% cpu.
>>>
>>> I really need to be able to collect vmcores from my kvm guests.  So
>>> far I can't (on raw hardware all works fine).
>>>
>>>
>>
>> I've tested this a while ago and it worked (though I tested regular  
>> kexecs,
>> not crashes); this may be a regression.
>>
>> Please run kvm_stat to see what's happening at the time of the crash.
>
> OK, I can look into kvm_stat but I just discovered that just having
> kvm-intel and kvm loaded into my 2.6.22.19 kernel actually prevents

Is 2.6.22.19 your host or your guest kernel? It's very unlikely that  
you loaded kvm modules in the guest.

> the host from being able to kexec/kdump too!?  I didn't have any
> guests running (only the kvm modules were loaded).  As soon as I
> unloaded the kvm modules kdump worked as expected.
>
> Something about kvm is completely breaking kexec/kdump on both the
> host and guest kernels.

I guess the kexec people would be pretty interested in this as well,  
so I'll just CC them for now.
As you're stating that the host kernel breaks with kvm modules loaded,  
maybe someone there could give a hint.

Alex


* Re: kexec/kdump of a kvm guest?
  2008-07-24  8:39     ` Alexander Graf
@ 2008-07-24 11:49       ` Mike Snitzer
  2008-07-24 13:15         ` Vivek Goyal
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2008-07-24 11:49 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Avi Kivity, kvm, kexec

On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <agraf@suse.de> wrote:
>
> On Jul 24, 2008, at 2:13 AM, Mike Snitzer wrote:
>
>> On Sat, Jul 5, 2008 at 7:20 AM, Avi Kivity <avi@qumranet.com> wrote:
>>>
>>> Mike Snitzer wrote:
>>>>
>>>> My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).
>>>>
>>>> When I configure kdump in the guest (running 2.6.22.19) and force a
>>>> crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
>>>> kernel but then the kernel hangs (before it gets to /sbin/init et al).
>>>> On the host, the associated qemu is consuming 100% cpu.
>>>>
>>>> I really need to be able to collect vmcores from my kvm guests.  So
>>>> far I can't (on raw hardware all works fine).
>>>>
>>>>
>>>
>>> I've tested this a while ago and it worked (though I tested regular
>>> kexecs,
>>> not crashes); this may be a regression.
>>>
>>> Please run kvm_stat to see what's happening at the time of the crash.
>>
>> OK, I can look into kvm_stat but I just discovered that just having
>> kvm-intel and kvm loaded into my 2.6.22.19 kernel actually prevents
>
> Is 2.6.22.19 your host or your guest kernel? It's very unlikely that you
> loaded kvm modules in the guest.

Correct, 2.6.22.19 is my host kernel.

>> the host from being able to kexec/kdump too!?  I didn't have any
>> guests running (only the kvm modules were loaded).  As soon as I
>> unloaded the kvm modules kdump worked as expected.
>>
>> Something about kvm is completely breaking kexec/kdump on both the
>> host and guest kernels.
>
> I guess the kexec people would be pretty interested in this as well, so I'll
> just CC them for now.
> As you're stating that the host kernel breaks with kvm modules loaded, maybe
> someone there could give a hint.

OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
see how kexec/kdump of the host fares when kvm modules are loaded.

On the guest side of things, as I mentioned in my original post,
kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
running 2.6.25.4 (with kvm-70).

Mike


* Re: kexec/kdump of a kvm guest?
  2008-07-24 11:49       ` Mike Snitzer
@ 2008-07-24 13:15         ` Vivek Goyal
  2008-07-24 19:03           ` Mike Snitzer
  0 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2008-07-24 13:15 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Alexander Graf, kexec, kvm, Avi Kivity

On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <agraf@suse.de> wrote:
> >
> > On Jul 24, 2008, at 2:13 AM, Mike Snitzer wrote:
> >
> >> On Sat, Jul 5, 2008 at 7:20 AM, Avi Kivity <avi@qumranet.com> wrote:
> >>>
> >>> Mike Snitzer wrote:
> >>>>
> >>>> My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).
> >>>>
> >>>> When I configure kdump in the guest (running 2.6.22.19) and force a
> >>>> crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
> >>>> kernel but then the kernel hangs (before it gets to /sbin/init et al).
> >>>> On the host, the associated qemu is consuming 100% cpu.
> >>>>
> >>>> I really need to be able to collect vmcores from my kvm guests.  So
> >>>> far I can't (on raw hardware all works fine).
> >>>>
> >>>>
> >>>
> >>> I've tested this a while ago and it worked (though I tested regular
> >>> kexecs,
> >>> not crashes); this may be a regression.
> >>>
> >>> Please run kvm_stat to see what's happening at the time of the crash.
> >>
> >> OK, I can look into kvm_stat but I just discovered that just having
> >> kvm-intel and kvm loaded into my 2.6.22.19 kernel actually prevents
> >
> > Is 2.6.22.19 your host or your guest kernel? It's very unlikely that you
> > loaded kvm modules in the guest.
> 
> Correct, 2.6.22.19 is my host kernel.
> 
> >> the host from being able to kexec/kdump too!?  I didn't have any
> >> guests running (only the kvm modules were loaded).  As soon as I
> >> unloaded the kvm modules kdump worked as expected.
> >>
> >> Something about kvm is completely breaking kexec/kdump on both the
> >> host and guest kernels.
> >
> > I guess the kexec people would be pretty interested in this as well, so I'll
> > just CC them for now.
> > As you're stating that the host kernel breaks with kvm modules loaded, maybe
> > someone there could give a hint.
> 
> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
> see how kexec/kdump of the host fairs when kvm modules are loaded.
> 
> On the guest side of things, as I mentioned in my original post,
> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
> running 2.6.25.4 (with kvm-70).
> 

Hi Mike,

I have never tried kexec/kdump inside a kvm guest, so I don't know
whether they have historically worked or not.

Having said that, why do we need kdump to work inside the guest? In
this case qemu knows about the guest kernel's memory and should be
able to capture a kernel crash dump itself. I am not sure if qemu
already does that; if not, then we should probably think about it.

To me, kdump is a good solution for bare metal but not for a
virtualized environment, where we already have another piece of
software running that can do the job for us. Otherwise we end up
wasting memory in every instance of the guest (memory reserved for the
kdump kernel in every guest).

It will be interesting to look at your results with 2.6.25.x kernels
with the kvm modules inserted. Currently I can't think of what could
possibly be wrong.

Thanks
Vivek


* Re: kexec/kdump of a kvm guest?
  2008-07-24 13:15         ` Vivek Goyal
@ 2008-07-24 19:03           ` Mike Snitzer
  2008-07-24 19:50             ` Anthony Liguori
                               ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Mike Snitzer @ 2008-07-24 19:03 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Alexander Graf, kexec, kvm, Avi Kivity, libvir-list

On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
>> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <agraf@suse.de> wrote:

>> > As you're stating that the host kernel breaks with kvm modules loaded, maybe
>> > someone there could give a hint.
>>
>> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
>> see how kexec/kdump of the host fairs when kvm modules are loaded.
>>
>> On the guest side of things, as I mentioned in my original post,
>> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
>> running 2.6.25.4 (with kvm-70).
>>
>
> Hi Mike,
>
> I have never tried kexec/kdump inside a kvm guest. So I don't know if
> historically they have been working or not.

Avi indicated that he seems to remember at least kexec working the
last time he tried (though he didn't say when or what he tried).

> Having said that, Why do we need kdump to work inside the guest? In this
> case qemu should be knowing about the memory of guest kernel and should
> be able to capture a kernel crash dump? I am not sure if qemu already does
> that. If not, then probably we should think about it?
>
> To me, kdump is a good solution for baremetal but not for virtualized
> environment where we already have another piece of software running which
> can do the job for us. We will end up wasting memory in every instance
> of guest (memory reserved for kdump kernel in every guest).

I haven't looked into what mechanics qemu provides for collecting the
entire guest memory image; I'll dig deeper at some point.  It seems
the libvirt mid-layer ("virsh dump" - dump the core of a domain to a
file for analysis) doesn't support saving a kvm guest core:
# virsh dump guest10 guest10.dump
libvir: error : this function is not supported by the hypervisor:
virDomainCoreDump
error: Failed to core dump domain guest10 to guest10.dump

Seems that libvirt functionality isn't available yet with kvm (I'm
using libvirt 0.4.2, I'll give libvirt 0.4.4 a try).  cc'ing the
libvirt-list to get their insight.

That aside, having the crash dump collection be multi-phased really
isn't workable (that is, if it requires a crashed guest to be manually
saved after the fact).  The host system _could_ be rebooted, thereby
losing the guest's core image.  So automating qemu and/or libvirtd to
trigger a dump would seem worthwhile (maybe it's already done?).

So while I agree with you that it's ideal not to waste memory in each
guest for the purposes of kdump, if users want to model a guest image
as closely as possible on what will be deployed on bare metal, it
really would be ideal to support a 1:1 functional equivalent with kvm.
 I work with people who refuse to use kvm because of the lack of
kexec/kdump support.

I can do further research but welcome others' insight: do others have
advice on how best to collect a crashed kvm guest's core?

> It will be interesting to look at your results with 2.6.25.x kernels with
> kvm module inserted. Currently I can't think what can possibly be wrong.

If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
loaded, kexec/kdump does _not_ work (it simply hangs the system).  If
I only have the kvm module loaded, kexec/kdump works as expected
(likewise if no kvm modules are loaded at all).  So it would appear
that kvm-intel and kexec are definitely mutually exclusive at the
moment (at least on both 2.6.22.x and 2.6.25.x).

Mike


* Re: kexec/kdump of a kvm guest?
  2008-07-24 19:03           ` Mike Snitzer
@ 2008-07-24 19:50             ` Anthony Liguori
  2008-07-25  1:12             ` Vivek Goyal
       [not found]             ` <170fa0d20807241203h7065b643k7df1187ef7e76f87-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2 siblings, 0 replies; 13+ messages in thread
From: Anthony Liguori @ 2008-07-24 19:50 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Vivek Goyal, Alexander Graf, kexec, kvm, Avi Kivity, libvir-list

Mike Snitzer wrote:
> On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>   
>> On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
>>     
>>> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <agraf@suse.de> wrote:
>>>       
> I can do further research but welcome others' insight: do others have
> advice on how best to collect a crashed kvm guest's core?
>   

I don't know what you do in libvirt, but you can start a gdbstub in 
QEMU, connect with gdb, and then have gdb dump out a core.
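
A rough sketch of that approach (the port, file names, and qemu
invocation here are illustrative, and whether gdb's
generate-core-file works against a remote stub may depend on the gdb
version):

```shell
# Start the guest with a gdbstub listening, e.g.:
#   qemu-system-x86_64 ... -gdb tcp::1234     (or the -s shorthand)
# Then attach from the host and write the guest memory out as a core:
gdb --batch \
    -ex "target remote localhost:1234" \
    -ex "generate-core-file /tmp/guest10.core" \
    -ex "detach"
```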

Regards,

Anthony Liguori

>> It will be interesting to look at your results with 2.6.25.x kernels with
>> kvm module inserted. Currently I can't think what can possibly be wrong.
>>     
>
> If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
> loaded kexec/kdump does _not_ work (simply hangs the system).  If I
> only have the kvm module loaded kexec/kdump works as expected
> (likewise if no kvm modules are loaded at all).  So it would appear
> that kvm-intel and kexec are definitely mutually exclusive at the
> moment (at least on both 2.6.22.x and 2.6.25.x).
>
> Mike
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>   



* Re: kexec/kdump of a kvm guest?
  2008-07-24 19:03           ` Mike Snitzer
  2008-07-24 19:50             ` Anthony Liguori
@ 2008-07-25  1:12             ` Vivek Goyal
  2008-07-27  8:32               ` Avi Kivity
  2008-08-25 15:56               ` Mike Snitzer
       [not found]             ` <170fa0d20807241203h7065b643k7df1187ef7e76f87-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2 siblings, 2 replies; 13+ messages in thread
From: Vivek Goyal @ 2008-07-25  1:12 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Alexander Graf, kexec, kvm, Avi Kivity, libvir-list

On Thu, Jul 24, 2008 at 03:03:33PM -0400, Mike Snitzer wrote:
> On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
> >> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <agraf@suse.de> wrote:
> 
> >> > As you're stating that the host kernel breaks with kvm modules loaded, maybe
> >> > someone there could give a hint.
> >>
> >> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
> >> see how kexec/kdump of the host fairs when kvm modules are loaded.
> >>
> >> On the guest side of things, as I mentioned in my original post,
> >> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
> >> running 2.6.25.4 (with kvm-70).
> >>
> >
> > Hi Mike,
> >
> > I have never tried kexec/kdump inside a kvm guest. So I don't know if
> > historically they have been working or not.
> 
> Avi indicated he seems to remember that at least kexec worked last he
> tried (didn't provide when/what he tried though).
> 
> > Having said that, Why do we need kdump to work inside the guest? In this
> > case qemu should be knowing about the memory of guest kernel and should
> > be able to capture a kernel crash dump? I am not sure if qemu already does
> > that. If not, then probably we should think about it?
> >
> > To me, kdump is a good solution for baremetal but not for virtualized
> > environment where we already have another piece of software running which
> > can do the job for us. We will end up wasting memory in every instance
> > of guest (memory reserved for kdump kernel in every guest).
> 
> I haven't looked into what mechanics qemu provides for collecting the
> entire guest memory image; I'll dig deeper at some point.  It seems
> the libvirt mid-layer ("virsh dump" - dump the core of a domain to a
> file for analysis) doesn't support saving a kvm guest core:
> # virsh dump guest10 guest10.dump
> libvir: error : this function is not supported by the hypervisor:
> virDomainCoreDump
> error: Failed to core dump domain guest10 to guest10.dump
> 
> Seems that libvirt functionality isn't available yet with kvm (I'm
> using libvirt 0.4.2, I'll give libvirt 0.4.4 a try).  cc'ing the
> libvirt-list to get their insight.
> 
> That aside, having the crash dump collection be multi-phased really
> isn't workable (that is if it requires a crashed guest to be manually
> saved after the fact).  The host system _could_ be rebooted; whereby
> losing the guest's core image.  So automating qemu and/or libvirtd to
> trigger a dump would seem worthwhile (maybe its already done?).
> 

That's a good point. Ideally, one would like the dump to be captured
automatically if the kernel crashes, followed by a reboot back to the
production kernel. I am not sure what we can do to let qemu know about
the crash so that it can automatically save the dump.

What happens in the case of Xen guests? Is the dump captured
automatically, or does one have to force the dump capture externally?

> So while I agree with you its ideal to not have to waste memory in
> each guest for the purposes of kdump; if users want to model a guest
> image as closely as possible to what will be deployed on bare metal it
> really would be ideal to support a 1:1 functional equivalent with kvm.

Agreed. Making kdump work inside a kvm guest does no harm.

>  I work with people who refuse to use kvm because of the lack of
> kexec/kdump support.
> 

Interesting.

> I can do further research but welcome others' insight: do others have
> advice on how best to collect a crashed kvm guest's core?
> 
> > It will be interesting to look at your results with 2.6.25.x kernels with
> > kvm module inserted. Currently I can't think what can possibly be wrong.
> 
> If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
> loaded kexec/kdump does _not_ work (simply hangs the system).  If I
> only have the kvm module loaded kexec/kdump works as expected
> (likewise if no kvm modules are loaded at all).  So it would appear
> that kvm-intel and kexec are definitely mutually exclusive at the
> moment (at least on both 2.6.22.x and 2.6.25.x).

OK. So the first task is to fix host kexec/kdump with the kvm-intel
module inserted.

Can you do a little debugging to find out where the system hangs? I
generally try a few things when debugging kexec-related issues.

1. Specify the earlyprintk= parameter for the second kernel and see if
   control is reaching the second kernel.

2. Otherwise specify the --console-serial parameter on the "kexec -l"
   command line; it should display the message "I am in purgatory" on
   the serial console. That just means control has reached at least as
   far as purgatory.

3. If that also does not work, then most likely the first kernel itself
   got stuck somewhere, and we need to put some printks in the first
   kernel to find out what's wrong.
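
Putting (1) and (2) together, the capture-kernel load might look like
this (the kernel/initrd paths and serial console settings are
illustrative):

```shell
kexec -p /boot/vmlinuz-capture \
      --initrd=/boot/initrd-capture.img \
      --console-serial \
      --append="console=ttyS0,115200 earlyprintk=serial,ttyS0,115200"
```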


Thanks
Vivek


* Re: kexec/kdump of a kvm guest?
  2008-07-25  1:12             ` Vivek Goyal
@ 2008-07-27  8:32               ` Avi Kivity
  2008-08-25 15:56               ` Mike Snitzer
  1 sibling, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2008-07-27  8:32 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Mike Snitzer, Alexander Graf, kexec, kvm, libvir-list

Vivek Goyal wrote:
>> Seems that libvirt functionality isn't available yet with kvm (I'm
>> using libvirt 0.4.2, I'll give libvirt 0.4.4 a try).  cc'ing the
>> libvirt-list to get their insight.
>>
>> That aside, having the crash dump collection be multi-phased really
>> isn't workable (that is if it requires a crashed guest to be manually
>> saved after the fact).  The host system _could_ be rebooted; whereby
>> losing the guest's core image.  So automating qemu and/or libvirtd to
>> trigger a dump would seem worthwhile (maybe its already done?).
>>
>>     
>
> That's a good point. Ideally, one would like dump to be captured
> automatically if kernel crashes and then reboot back to production
> kernel. I am not sure what can we do to let qemu know after crash
> so that it can automatically save dump.
>   

We can expose a virtual PCI device that, when accessed, causes qemu to 
dump the guest's core.

>
> Ok. So first task is to fix host kexec/kdump with kvm-intel module
> inserted.
>
> Can you do little debugging to find out where system hangs. I generally
> try few things for kexec related issue debugging.
>
> 1. Specify earlyprintk= parameter for second kernel and see if control
>    is reaching to second kernel.
>
> 2. Otherwise specify --console-serial parameter on "kexec -l" commandline
>    and it should display a message "I am in purgatory" on serial console.
>    This will just mean that control has reached at least till purgatory.
>
> 3. If that also does not work, then most likely first kernel itself got
>    stuck somewhere and we need to put some printks in first kernel to find
>    out what's wrong.
>
>   

kvm has a reboot notifier to turn off vmx when rebooting.  See 
kvm_reboot_notifier and kvm_reboot().  Maybe something similar is needed 
for kexec?

-- 
error compiling committee.c: too many arguments to function



* Re: kexec/kdump of a kvm guest?
       [not found]             ` <170fa0d20807241203h7065b643k7df1187ef7e76f87-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-07-27  9:12               ` Avi Kivity
  0 siblings, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2008-07-27  9:12 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: libvir-list-H+wXaHxf7aLQT0dZR+AlfA,
	kexec-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Graf,
	Vivek Goyal, kvm-u79uwXL29TY76Z2rM5mHXA

Mike Snitzer wrote:
> Avi indicated he seems to remember that at least kexec worked last he
> tried (didn't provide when/what he tried though).
>
>   

kexec inside a guest.  Months ago.


-- 
error compiling committee.c: too many arguments to function


* Re: kexec/kdump of a kvm guest?
  2008-07-25  1:12             ` Vivek Goyal
  2008-07-27  8:32               ` Avi Kivity
@ 2008-08-25 15:56               ` Mike Snitzer
       [not found]                 ` <170fa0d20808250856w7dd480a9x35f4112f2464a7cd-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2008-08-25 15:56 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Alexander Graf, kexec, kvm, Avi Kivity

On Thu, Jul 24, 2008 at 9:12 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, Jul 24, 2008 at 03:03:33PM -0400, Mike Snitzer wrote:
>> On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
>> >> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <agraf@suse.de> wrote:
>>
>> >> > As you're stating that the host kernel breaks with kvm modules loaded, maybe
>> >> > someone there could give a hint.
>> >>
>> >> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
>> >> see how kexec/kdump of the host fairs when kvm modules are loaded.
>> >>
>> >> On the guest side of things, as I mentioned in my original post,
>> >> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
>> >> running 2.6.25.4 (with kvm-70).
>> >>
>> >
>> > Hi Mike,
>> >
>> > I have never tried kexec/kdump inside a kvm guest. So I don't know if
>> > historically they have been working or not.
>>
>> Avi indicated he seems to remember that at least kexec worked last he
>> tried (didn't provide when/what he tried though).
>>
>> > Having said that, Why do we need kdump to work inside the guest? In this
>> > case qemu should be knowing about the memory of guest kernel and should
>> > be able to capture a kernel crash dump? I am not sure if qemu already does
>> > that. If not, then probably we should think about it?
>> >
>> > To me, kdump is a good solution for baremetal but not for virtualized
>> > environment where we already have another piece of software running which
>> > can do the job for us. We will end up wasting memory in every instance
>> > of guest (memory reserved for kdump kernel in every guest).
>>
>> I haven't looked into what mechanics qemu provides for collecting the
>> entire guest memory image; I'll dig deeper at some point.  It seems
>> the libvirt mid-layer ("virsh dump" - dump the core of a domain to a
>> file for analysis) doesn't support saving a kvm guest core:
>> # virsh dump guest10 guest10.dump
>> libvir: error : this function is not supported by the hypervisor:
>> virDomainCoreDump
>> error: Failed to core dump domain guest10 to guest10.dump
>>
>> Seems that libvirt functionality isn't available yet with kvm (I'm
>> using libvirt 0.4.2, I'll give libvirt 0.4.4 a try).  cc'ing the
>> libvirt-list to get their insight.
>>
>> That aside, having the crash dump collection be multi-phased really
>> isn't workable (that is if it requires a crashed guest to be manually
>> saved after the fact).  The host system _could_ be rebooted; whereby
>> losing the guest's core image.  So automating qemu and/or libvirtd to
>> trigger a dump would seem worthwhile (maybe its already done?).
>>
>
> That's a good point. Ideally, one would like dump to be captured
> automatically if kernel crashes and then reboot back to production
> kernel. I am not sure what can we do to let qemu know after crash
> so that it can automatically save dump.
>
> What happens in the case of xen guests. Is dump automatically captured
> or one has to force the dump capture externally.
>
>> So while I agree with you its ideal to not have to waste memory in
>> each guest for the purposes of kdump; if users want to model a guest
>> image as closely as possible to what will be deployed on bare metal it
>> really would be ideal to support a 1:1 functional equivalent with kvm.
>
> Agreed. Making kdump work inside kvm guest does not harm.
>
>>  I work with people who refuse to use kvm because of the lack of
>> kexec/kdump support.
>>
>
> Interesting.
>
>> I can do further research but welcome others' insight: do others have
>> advice on how best to collect a crashed kvm guest's core?
>>
>> > It will be interesting to look at your results with 2.6.25.x kernels with
>> > kvm module inserted. Currently I can't think what can possibly be wrong.
>>
>> If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
>> loaded kexec/kdump does _not_ work (simply hangs the system).  If I
>> only have the kvm module loaded kexec/kdump works as expected
>> (likewise if no kvm modules are loaded at all).  So it would appear
>> that kvm-intel and kexec are definitely mutually exclusive at the
>> moment (at least on both 2.6.22.x and 2.6.25.x).
>
> Ok. So first task is to fix host kexec/kdump with kvm-intel module
> inserted.
>
> Can you do little debugging to find out where system hangs. I generally
> try few things for kexec related issue debugging.
>
> 1. Specify earlyprintk= parameter for second kernel and see if control
>   is reaching to second kernel.
>
> 2. Otherwise specify --console-serial parameter on "kexec -l" commandline
>   and it should display a message "I am in purgatory" on serial console.
>   This will just mean that control has reached at least till purgatory.
>
> 3. If that also does not work, then most likely first kernel itself got
>   stuck somewhere and we need to put some printks in first kernel to find
>   out what's wrong.

Vivek,

I've been unable to put time into chasing this (and I can't yet see
when I'll be able to get back to it).  I hope that others will be
willing to take a look before me.

The kvm-intel and kexec incompatibility issue is not exclusive to my
local environment (you simply need a cpu that supports kvm-intel).

regards,
Mike


* Re: kexec/kdump of a kvm guest?
       [not found]                 ` <170fa0d20808250856w7dd480a9x35f4112f2464a7cd-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-08-25 16:05                   ` Vivek Goyal
  0 siblings, 0 replies; 13+ messages in thread
From: Vivek Goyal @ 2008-08-25 16:05 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Avi Kivity, kexec-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Alexander Graf, kvm-u79uwXL29TY76Z2rM5mHXA

On Mon, Aug 25, 2008 at 11:56:11AM -0400, Mike Snitzer wrote:
> On Thu, Jul 24, 2008 at 9:12 PM, Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > On Thu, Jul 24, 2008 at 03:03:33PM -0400, Mike Snitzer wrote:
> >> On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> >> > On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
> >> >> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org> wrote:
> >>
> >> >> > As you're stating that the host kernel breaks with kvm modules loaded, maybe
> >> >> > someone there could give a hint.
> >> >>
> >> >> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
> >> >> see how kexec/kdump of the host fares when kvm modules are loaded.
> >> >>
> >> >> On the guest side of things, as I mentioned in my original post,
> >> >> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
> >> >> running 2.6.25.4 (with kvm-70).
> >> >>
> >> >
> >> > Hi Mike,
> >> >
> >> > I have never tried kexec/kdump inside a kvm guest, so I don't know
> >> > whether they have historically worked or not.
> >>
> >> Avi indicated he seems to remember that at least kexec worked the last
> >> time he tried (though he didn't say when or what he tried).
> >>
> >> > Having said that, why do we need kdump to work inside the guest? In this
> >> > case qemu knows about the guest kernel's memory and should be able to
> >> > capture a kernel crash dump. I am not sure if qemu already does that;
> >> > if not, then we should probably think about it.
> >> >
> >> > To me, kdump is a good solution for bare metal but not for a virtualized
> >> > environment, where we already have another piece of software running that
> >> > can do the job for us. We would end up wasting memory in every guest
> >> > (memory reserved for the kdump kernel in each one).
> >>
> >> I haven't looked into what mechanics qemu provides for collecting the
> >> entire guest memory image; I'll dig deeper at some point.  It seems
> >> the libvirt mid-layer ("virsh dump" - dump the core of a domain to a
> >> file for analysis) doesn't support saving a kvm guest core:
> >> # virsh dump guest10 guest10.dump
> >> libvir: error : this function is not supported by the hypervisor:
> >> virDomainCoreDump
> >> error: Failed to core dump domain guest10 to guest10.dump
> >>
> >> It seems that libvirt functionality isn't available yet with kvm (I'm
> >> using libvirt 0.4.2; I'll give libvirt 0.4.4 a try).  cc'ing the
> >> libvirt-list to get their insight.
> >>
> >> That aside, having the crash dump collection be multi-phased really
> >> isn't workable (that is, if it requires a crashed guest to be manually
> >> saved after the fact).  The host system _could_ be rebooted, thereby
> >> losing the guest's core image.  So automating qemu and/or libvirtd to
> >> trigger a dump would seem worthwhile (maybe it's already done?).
> >>
> >
> > That's a good point. Ideally, one would like the dump to be captured
> > automatically if the kernel crashes, and then to reboot back to the
> > production kernel. I am not sure what we can do to let qemu know after
> > a crash so that it can automatically save a dump.
> >
> > What happens in the case of Xen guests? Is the dump captured
> > automatically, or does one have to force the dump capture externally?
> >
> >> So while I agree with you that it's ideal not to waste memory in each
> >> guest for the purposes of kdump, if users want to model a guest image
> >> as closely as possible on what will be deployed on bare metal, it
> >> really would be ideal for kvm to support a 1:1 functional equivalent.
> >
> > Agreed. Making kdump work inside a kvm guest does no harm.
> >
> >>  I work with people who refuse to use kvm because of the lack of
> >> kexec/kdump support.
> >>
> >
> > Interesting.
> >
> >> I can do further research but welcome others' insight: do others have
> >> advice on how best to collect a crashed kvm guest's core?
> >>
> >> > It will be interesting to look at your results with 2.6.25.x kernels with
> >> > the kvm module inserted. Currently I can't think of what could possibly
> >> > be wrong.
> >>
> >> If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
> >> loaded kexec/kdump does _not_ work (simply hangs the system).  If I
> >> only have the kvm module loaded kexec/kdump works as expected
> >> (likewise if no kvm modules are loaded at all).  So it would appear
> >> that kvm-intel and kexec are definitely mutually exclusive at the
> >> moment (at least on both 2.6.22.x and 2.6.25.x).
> >
> > OK. So the first task is to fix host kexec/kdump with the kvm-intel
> > module inserted.
> >
> > Can you do a little debugging to find out where the system hangs? I
> > generally try a few things when debugging kexec-related issues.
> >
> > 1. Specify the earlyprintk= parameter for the second kernel and see if
> >   control is reaching the second kernel.
> >
> > 2. Otherwise specify the --console-serial parameter on the "kexec -l"
> >   command line; it should display the message "I am in purgatory" on the
> >   serial console. This just means that control has reached at least as
> >   far as purgatory.
> >
> > 3. If that also does not work, then most likely the first kernel itself
> >   got stuck somewhere, and we need to put some printks in the first
> >   kernel to find out what's wrong.
> 
> Vivek,
> 
> I've been unable to put time into chasing this (and I don't see when
> I'll be able to get back to it).  I hope that others will be willing
> to take a look before me.
> 
> The kvm-intel/kexec incompatibility is not exclusive to my local
> environment (you simply need a CPU that supports kvm-intel to
> reproduce it).
> 

Thanks, Mike. Let me see if I can find some free cycles to debug it.

Thanks
Vivek
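
For the host-side capture discussed in the quoted exchange, the libvirt route would look like the sketch below once virDomainCoreDump is implemented for QEMU/KVM domains (it was not in the libvirt 0.4.x versions tried above). The domain name is taken from the thread and the output path is an assumption; the command is echoed rather than executed, since running it needs libvirtd and a running domain:

```shell
# Sketch: dump a KVM guest's memory from the host via libvirt.
DOMAIN=guest10                      # domain name from the thread
CORE=/var/crash/${DOMAIN}.core      # assumed output path
CMD="virsh dump $DOMAIN $CORE"
echo "$CMD"   # echoed, not executed
```

The resulting core file could then be analyzed much like an in-guest kdump vmcore, without reserving crash-kernel memory in every guest.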

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2008-08-25 16:05 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-06-26 18:58 kexec/kdump of a kvm guest? Mike Snitzer
2008-07-05 11:20 ` Avi Kivity
2008-07-24  0:13   ` Mike Snitzer
2008-07-24  8:39     ` Alexander Graf
2008-07-24 11:49       ` Mike Snitzer
2008-07-24 13:15         ` Vivek Goyal
2008-07-24 19:03           ` Mike Snitzer
2008-07-24 19:50             ` Anthony Liguori
2008-07-25  1:12             ` Vivek Goyal
2008-07-27  8:32               ` Avi Kivity
2008-08-25 15:56               ` Mike Snitzer
     [not found]                 ` <170fa0d20808250856w7dd480a9x35f4112f2464a7cd-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-08-25 16:05                   ` Vivek Goyal
     [not found]             ` <170fa0d20807241203h7065b643k7df1187ef7e76f87-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2008-07-27  9:12               ` Avi Kivity

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox