From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH 0/8] KVM updates for 2.6.20-rc2
Date: Thu, 28 Dec 2006 13:04:23 +0200
Message-ID: <4593A4B7.2070404@qumranet.com>
References: <45939755.7010603@qumranet.com> <20061228103345.GA4708@elte.hu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: kvm-devel
To: Ingo Molnar
In-Reply-To: <20061228103345.GA4708-X9Un+BFzKDI@public.gmane.org>
Sender: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Errors-To: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
List-Id: kvm.vger.kernel.org

Ingo Molnar wrote:
> btw., we've got a problem with KVM in -rt:
>
> BUG: scheduling while atomic: qemu/0x00000001/11445, CPU#0
>  [] dump_trace+0x63/0x1e8
>  [] show_trace_log_lvl+0x19/0x2e
>  [] show_trace+0x12/0x14
>  [] dump_stack+0x14/0x16
>  [] __schedule+0xae/0xd9f
>  [] schedule+0xec/0x10f
>  [] rt_spin_lock_slowlock+0xd2/0x15b
>  [] rt_spin_lock+0x1e/0x20
>  [] kunmap_high+0x11/0x99
>  [] kunmap+0x40/0x42
>  [] kunmap_virt+0x36/0x38
>  [] emulator_read_std+0x92/0xaf [kvm]
>  [] x86_emulate_memop+0xc6/0x295c [kvm]
>  [] emulate_instruction+0xef/0x202 [kvm]
>  [] handle_exception+0x107/0x1c8 [kvm_intel]
>  [] kvm_vmx_return+0x145/0x18d [kvm_intel]
>  [] kvm_dev_ioctl+0x253/0xd76 [kvm]
>  [] do_ioctl+0x21/0x66
>  [] vfs_ioctl+0x255/0x268
>  [] sys_ioctl+0x48/0x62
>  [] syscall_call+0x7/0xb
>  [<4a4538b2>] 0x4a4538b2
>
> NOTE: this is not a worry for upstream kernel, it is caused by
> PREEMPT_RT scheduling in previously atomic APIs like kunmap(). But KVM
> used to work pretty nicely in -rt and this problem got introduced fairly
> recently, related to some big-page changes IIRC.
>

This is not a recent change. The x86 emulator is fetching an
instruction or a memory operand.

> Any suggestions of how to best fix this in -rt? Basically, it would be
> nice to decouple KVM from get_cpu/preempt_disable type of interfaces as
> much as possible, to make it fully preemptible.

kvm on intel uses hidden cpu registers, so it can't be migrated as a
normal process. We'd need a hook on context switch to check if we
migrated to another cpu, issue an ipi to unload the registers on the
previous cpu, and reload the registers on the current cpu. The hook
would need to be protected from context switches. See
vmx.c:vmx_vcpu_load() for the gory details.

The actual launch would require a small non-preemptible section as well,
on both intel and AMD.

Talking about the scheduler (and to the scheduler's author :), it would
be nice to hook the migration weight algorithm too. kvm guests are
considerably more expensive to migrate, at least on intel.

> Below are my current
> fixups for KVM, but the problem above looks harder to fix.
>

These make sense for mainline (especially the first hunk), so I'll apply
them.

> 	Ingo
>
> ---
>  drivers/kvm/vmx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux/drivers/kvm/vmx.c
> ===================================================================
> --- linux.orig/drivers/kvm/vmx.c
> +++ linux/drivers/kvm/vmx.c
> @@ -117,7 +117,7 @@ static void vmcs_clear(struct vmcs *vmcs
>  static void __vcpu_clear(void *arg)
>  {
>  	struct kvm_vcpu *vcpu = arg;
> -	int cpu = smp_processor_id();
> +	int cpu = raw_smp_processor_id();
>
>  	if (vcpu->cpu == cpu)
>  		vmcs_clear(vcpu->vmcs);
> @@ -576,7 +576,7 @@ static struct vmcs *alloc_vmcs_cpu(int c
>
>  static struct vmcs *alloc_vmcs(void)
>  {
> -	return alloc_vmcs_cpu(smp_processor_id());
> +	return alloc_vmcs_cpu(raw_smp_processor_id());
>  }
>
>  static void free_vmcs(struct vmcs *vmcs)

-- 
error compiling committee.c: too many arguments to function
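
For readers following the thread, the migration handling described in the
reply (check on context switch whether the vcpu moved to another cpu,
unload its registers on the previous cpu, reload on the current one) can
be modelled roughly as below. This is a user-space sketch under stated
assumptions, not the kvm code: struct model_vcpu, loaded_on[],
model_vcpu_clear(), model_vcpu_load() and model_vcpu_run() are
hypothetical names, the IPI is replaced by a direct function call, and the
current cpu is passed in as a parameter instead of being read with
preemption disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4
#define NO_CPU  (-1)

/* Hypothetical stand-in for struct kvm_vcpu; only what this model needs. */
struct model_vcpu {
	int  cpu;       /* cpu holding this vcpu's hidden state, or NO_CPU */
	bool launched;  /* cleared whenever the state has to be reloaded */
	int  ipis;      /* number of simulated cross-cpu unload requests */
};

/* Which vcpu each simulated cpu currently has loaded. */
static struct model_vcpu *loaded_on[NR_CPUS];

/*
 * Simulated unload on the previous cpu. In the kernel this work would be
 * done on that cpu in response to an IPI; here it is a plain call.
 */
static void model_vcpu_clear(struct model_vcpu *vcpu, int old_cpu)
{
	if (vcpu->cpu == old_cpu && loaded_on[old_cpu] == vcpu)
		loaded_on[old_cpu] = NULL;
	vcpu->ipis++;
}

/*
 * The check a context-switch hook would perform. A real hook would itself
 * need protection from context switches; this model has no preemption.
 */
static void model_vcpu_load(struct model_vcpu *vcpu, int cur_cpu)
{
	assert(cur_cpu >= 0 && cur_cpu < NR_CPUS);

	if (vcpu->cpu != cur_cpu) {
		/* Migrated: unload from the previous cpu, if there was one. */
		if (vcpu->cpu != NO_CPU)
			model_vcpu_clear(vcpu, vcpu->cpu);
		vcpu->launched = false;
		vcpu->cpu = cur_cpu;
	}
	loaded_on[cur_cpu] = vcpu;
}

/* Stand-in for the short non-preemptible launch section. */
static void model_vcpu_run(struct model_vcpu *vcpu, int cur_cpu)
{
	model_vcpu_load(vcpu, cur_cpu);
	vcpu->launched = true;
}
```

A caller would initialise a struct model_vcpu with cpu set to NO_CPU and
call model_vcpu_run() with the cpu it is running on. Repeated runs on the
same cpu cost nothing extra, while a run on a different cpu triggers one
simulated unload, which is the cost the reply suggests the migration
weight algorithm should know about.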