From: Philipp Hahn
Subject: Re: RFH: kvm-1.0+git on 3.3.0 stuck in flash_tlb_others_ipi()
Date: Fri, 30 Mar 2012 19:44:50 +0200
Message-ID: <201203301944.55376.hahn@univention.de>
In-Reply-To: <201201091241.45281.hahn@univention.de>
References: <201201091241.45281.hahn@univention.de>
To: kvm

Hello,

On Monday 09 January 2012 12:41:41 Philipp Hahn wrote:
> one of our VMs regularly gets stuck: the VM is completely unresponsive (no
> ssh, no serial console, no VNC). Using "gdbserver" and a remote system to
> debug the running VM, I see 3 CPUs (1,3,4) stuck in
>   pgd_alloc() → spin_lock_irqsave(pgd_lock)
> while the 4th CPU (2) is waiting in
>   pgd_alloc() → pgd_prepopulate_pmd() → ... → flush_tlb_others_ipi()
>
> 195             while (!cpumask_empty(to_cpumask(f->flush_cpumask)))
> 196                     cpu_relax();
> (gdb) print f->flush_cpumask
> $5 = {1}
>
> CPU 1 is doing a do_exec() syscall, while CPUs 2-4 are doing a do_fork()
> syscall according to "thread apply all backtrace".
>
> After a "set variable f->flush_cpumask 0" from gdb the kernel continued
> dumping the trace information, which I attached.

I repeated the test today with current versions:

> Host: Debian linux-2.6.32-38-amd64 (=2.6.32.42), 8 Cores

3.3.0 doesn't help either.

> Kvm: 0.14.1+dfsg

qemu-kvm-1.0-1587-ga0bc8c3 doesn't help either.

> Guest: Debian linux-2.6.32-38-i686-bigmem, 4 CPUs

I can reproduce the bug very reliably when building OpenOffice.org and/or
samba4 with -j4.

> Is this a known bug and/or is a fix available?
> I can gather more information from the VM if needed.

Any ideas where to look next?
How can I check that the IPI handling in KVM works?

Sincerely
Philipp
-- 
Philipp Hahn           Open Source Software Engineer      hahn@univention.de
Univention GmbH        be open.                       fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                   http://www.univention.de/