* Export offsets of VMCS fields as note information for kdump
@ 2012-08-27  7:06 Zhang Yanfei
  2012-09-09 12:00 ` Avi Kivity
  0 siblings, 1 reply; 2+ messages in thread
From: Zhang Yanfei @ 2012-08-27  7:06 UTC (permalink / raw)
  To: Avi Kivity
  Cc: mtosatti, ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, kvm@vger.kernel.org, Greg KH, HATAYAMA Daisuke,
	masanori.yoshida.tv, digitaleric

Hello Avi,

About this VMCSINFO patch: we really need this functionality in our development.
YOSHIDA Masanori (masanori.yoshida.tv@hitachi.com), a developer from Hitachi,
has said they need it too. So could you please tell us why the patch is unacceptable?
Do you dislike the whole idea of exporting VMCSINFO, or only the way
we implemented the patch? And do you have any suggestions about all this?

Below is why we need this patch and how we will use it in our development.

We once hit an abnormal situation: a host scheduler bug caused a guest machine's
vcpu to stop for a long time, which then stopped the guest's heartbeat (the host
was still running).
     
We want an efficient way to analyse the bug when we hit similar situations, where
a guest machine does not work well because of something on the host machine's side.
Actually, these situations have happened many times, in particular during development.
  
So here is the requirement:
To find the root cause, we have to debug both the host machine's side and the guest
machine's side. But first we need both the host machine's crash dump and the guest
machine's crash dump, and they must be taken at the same time, while the abnormal
situation persists. So the only way to do this is to panic the host with the abnormal
guest running on it, so that the guest's image is contained in the host's crash dump.

Logically, retrieving the guest's crash dump from the host's crash dump is the key
step toward our goal. Unfortunately, in the kvm implementation, some of the guest's
register values are hidden in the vmcs, and the internal layout of the vmcs is not
disclosed by Intel. If we cannot retrieve these registers from the vmcs, the guest
crash dump we produce is incomplete, and some key information is lost when we analyse
it.

So we made this patch to export the vmcs internals, that is, the offsets of the vmcs
fields. With the patch applied, we can write the register values stored in the vmcs
into the guest's crash dump. And that's what we want.
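
To illustrate the idea, here is a minimal sketch of what a userland tool could do
once it knows the field offsets. Everything named here is an assumption made for
illustration: struct vmcs_offsets, struct guest_regs, extract_guest_regs() and the
choice of three fields are hypothetical, and the real note format produced by the
patch is not shown. It also assumes the tool runs on a machine with the same byte
order as the dumped host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical offset table. In practice the values would be taken from the
 * note information in the host crash dump; this field set is illustrative. */
struct vmcs_offsets {
    size_t guest_rip;
    size_t guest_rsp;
    size_t guest_cr3;
};

/* Guest register values recovered from the raw VMCS image. */
struct guest_regs {
    uint64_t rip;
    uint64_t rsp;
    uint64_t cr3;
};

/* Copy one 64-bit field out of the raw VMCS image, with a bounds check.
 * Returns 0 on success, -1 if the field would fall outside the image. */
static int read_field64(const uint8_t *vmcs, size_t vmcs_size,
                        size_t off, uint64_t *out)
{
    if (off > vmcs_size || vmcs_size - off < sizeof(*out))
        return -1;
    memcpy(out, vmcs + off, sizeof(*out)); /* same byte order assumed */
    return 0;
}

/* Fill *regs from a VMCS image found in the host dump.
 * Returns 0 on success, -1 if any offset is out of range. */
int extract_guest_regs(const uint8_t *vmcs, size_t vmcs_size,
                       const struct vmcs_offsets *offs,
                       struct guest_regs *regs)
{
    if (read_field64(vmcs, vmcs_size, offs->guest_rip, &regs->rip) ||
        read_field64(vmcs, vmcs_size, offs->guest_rsp, &regs->rsp) ||
        read_field64(vmcs, vmcs_size, offs->guest_cr3, &regs->cr3))
        return -1;
    return 0;
}
```

The offsets are kept in a table rather than hard-coded because the vmcs layout is
not documented and may differ between processors, which is why they have to be
exported from the running host.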
  
If a bug is found in a customer's environment, we have two ways to avoid
affecting other guest machines running on the same host. First, we can reproduce
the buggy situation in another environment and do the bug analysis there; second,
we can migrate the other guest machines to other hosts.

After the abnormal situation is reproduced, we panic the host *manually*.
Then we can use userland tools to extract the guest machine's crash dump from the
host machine's crash dump, using the feature provided by this patch. Finally we can
analyse the two dumps separately to find which side causes the problem.
