Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1

From: Peter Lieven <pl@dlhnet.de>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Gleb Natapov <gleb@redhat.com>
Subject: Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
Date: Thu, 28 Jun 2012 12:13:20 +0200
Message-ID: <4FEC2E40.5090400@dlhnet.de>
In-Reply-To: <4FEC263C.5030504@siemens.com>

On 28.06.2012 11:39, Jan Kiszka wrote:
> On 2012-06-28 11:31, Peter Lieven wrote:
>> On 28.06.2012 11:21, Jan Kiszka wrote:
>>> On 2012-06-28 11:11, Peter Lieven wrote:
>>>> On 27.06.2012 18:54, Jan Kiszka wrote:
>>>>> On 2012-06-27 17:39, Peter Lieven wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I debugged this further and found out that kvm-kmod-3.0 works with
>>>>>> qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 do not. What also
>>>>>> works is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
>>>>>> Does anyone have a clue which new KVM feature could cause this if a vcpu
>>>>>> is in an infinite loop?
>>>>> Before accusing kvm-kmod ;), can you check if the effect is visible with
>>>>> an original Linux 3.3.x or 3.4.x kernel as well?
>>>> Sorry, I should have been more specific. Maybe I also misunderstood
>>>> something. I believed that kvm-kmod-3.0 is basically what is in the
>>>> vanilla 3.0 kernel. If I use the Ubuntu kernel from Ubuntu oneiric (3.0.0)
>>>> it works; if I use a self-compiled kvm-kmod-3.3/3.4 with that kernel it
>>>> doesn't.
>>>> However, maybe we don't have to dig too deep - see below.
>>> kvm-kmod wraps and patches things to make the kvm code from 3.3/3.4
>>> work on an older kernel. This step may introduce bugs of its own.
>>> Therefore my suggestion to use a "real" 3.x kernel to exclude that risk
>>> first of all.
>>>
>>>>> Then, bisecting the change in qemu-kvm that apparently resolved the
>>>>> issue would be interesting.
>>>>>
>>>>> If we have to dig deeper, tracing [1] the lockup would likely be helpful
>>>>> (all events of the qemu process, not just KVM related ones: trace-cmd
>>>>> record -e all qemu-system-x86_64 ...).
>>>> This here is basically what's going on:
>>>>
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>>>>
>>>> It's doing that forever. This is tracing the kvm module. Doing the
>>>> qemu-system-x86_64 trace is a bit complicated, but
>>>> maybe this is already sufficient. Otherwise I will of course gather this
>>>> info as well.
>>> That's only tracing KVM events, and it's tracing when things had already
>>> gone wrong. We may need a full trace (-e all) specifically for the period
>>> when the pattern above started.
>> I will do that. Maybe I should explain that the vcpu is executing
>> garbage when the pattern above starts. It's basically booting from an
>> empty hard disk.
>>
>> If I understand correctly, qemu-kvm loops in kvm_cpu_exec(CPUState *env);
>>
>> Maybe the time to handle the monitor/qmp connection is just too short.
>> If I further understand correctly, it can only handle monitor connections
>> while qemu-kvm is executing kvm_vcpu_ioctl(env, KVM_RUN, 0); or am I
>> wrong here? The time spent in this state might be rather short.
> Unless you played with priorities and affinities, the Linux scheduler
> should provide the required time to the iothread.
I have a 1.1 GB (85 MB compressed) trace file. If you have time to
look at it, I could drop it somewhere.
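
For a first look at a trace of this size, a rough sketch like the
following could condense it before anyone reads it line by line. It
assumes the text format shown in the quoted excerpt above
(task-pid, [cpu], timestamp, event, details); the function name
summarize_trace and the script name used below are made up for
illustration, and this is not part of trace-cmd.

```python
import re
import sys
from collections import Counter

# Matches one line of textual trace output as it appears in this thread:
#   <task>-<pid>  [<cpu>] <timestamp>: <event>: <details>
LINE_RE = re.compile(
    r"^\s*(?P<task>\S+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>\w+):\s*(?P<info>.*)$"
)


def summarize_trace(lines):
    """Count identical (event, details) pairs; non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            # Collapse runs of whitespace so column padding does not matter.
            info = " ".join(m.group("info").split())
            counts[(m.group("event"), info)] += 1
    return counts


if __name__ == "__main__":
    # Print the ten most frequent (event, details) pairs read from stdin.
    for (event, info), n in summarize_trace(sys.stdin).most_common(10):
        print(f"{n:8d}  {event}: {info}")
```

Something like "trace-cmd report | python3 summarize_trace.py" would then
list the most frequent events; a loop such as the KVM_EXIT_MMIO one
should dominate the top of that list.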

We currently run all VMs with nice 1 because we observed that
this improves the controllability of the node when all VMs
have excessive CPU load. Unfortunately, running the VM un-niced
does not change the behaviour.
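
To double-check which priority and nice value each thread of the qemu
process (vcpu threads and iothread) actually runs with, a small sketch
like this reads them from /proc on Linux. The helper names parse_stat
and thread_priorities are made up for illustration; the field positions
follow the proc(5) description of /proc/[pid]/stat (priority is
field 18, nice is field 19).

```python
import os


def parse_stat(stat_line):
    """Return (comm, priority, nice) from one /proc/<pid>/task/<tid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split at the last ')' instead of splitting on whitespace.
    head, _, tail = stat_line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # tail begins at field 3 (state), so field n sits at index n - 3.
    return comm, int(fields[15]), int(fields[16])


def thread_priorities(pid):
    """Map each thread id of process `pid` to its (comm, priority, nice)."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        with open(f"{task_dir}/{tid}/stat") as f:
            result[int(tid)] = parse_stat(f.read())
    return result


if __name__ == "__main__":
    # Without a real qemu pid at hand, inspect the current process instead.
    for tid, (comm, prio, nice) in sorted(thread_priorities(os.getpid()).items()):
        print(tid, comm, prio, nice)
```

Calling thread_priorities() with the pid of the qemu process would show
whether all of its threads share the same nice value.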

Peter
>> My concern is not that the machine hangs, just that the hypervisor is
>> unresponsive and it's impossible to reset or quit gracefully. The only
>> way to end the hypervisor is via SIGKILL.
> Right. Even if the guest runs wild, you must be able to control the vm
> via the monitor etc. If not, that's a bug.
>
> Jan
>
