From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kiszka Subject: Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1 Date: Wed, 27 Jun 2012 18:54:30 +0200 Message-ID: <4FEB3AC6.6010206@web.de> References: <4FEB2945.1030607@dlhnet.de> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigF9C2F62C5CDC0AC9C5E57BDE" Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Gleb Natapov To: Peter Lieven Return-path: Received: from mout.web.de ([212.227.17.11]:52422 "EHLO mout.web.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755653Ab2F0Qyz (ORCPT ); Wed, 27 Jun 2012 12:54:55 -0400 In-Reply-To: <4FEB2945.1030607@dlhnet.de> Sender: kvm-owner@vger.kernel.org List-ID: This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigF9C2F62C5CDC0AC9C5E57BDE Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On 2012-06-27 17:39, Peter Lieven wrote: > Hi all, >=20 > i debugged this further and found out that kvm-kmod-3.0 is working with= > qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is > working as well is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0)= =2E > Has anyone a clue which new KVM feature could cause this if a vcpu is i= n > an infinite loop? Before accusing kvm-kmod ;), can you check if the effect is visible with an original Linux 3.3.x or 3.4.x kernel as well? Then, bisection the change in qemu-kvm that apparently resolved the issue would be interesting. If we have to dig deeper, tracing [1] the lockup would likely be helpful (all events of the qemu process, not just KVM related ones: trace-cmd record -e all qemu-system-x86_64 ...). Jan [1] http://www.linux-kvm.org/page/Tracing > --- >=20 > Hi, >=20 > we recently came across multiple VMs racing and stopping working. It > seems to happen when the system is at 100% cpu. > One way to reproduce this is: > qemu-kvm-1.0.1 with vnc-thread enabled >=20 > cmdline (or similar): > /usr/bin/qemu-kvm-1.0.1 -net > tap,vlan=3D141,script=3Dno,downscript=3Dno,ifname=3Dtap15,vnet_hdr -net= > nic,vlan=3D141,model=3Dvirtio,macaddr=3D52:54:00:ff:00:f7 -drive > format=3Dhost_device,file=3D/dev/mapper/iqn.2001-05.com.equallogic:0-8a= 0906-efdf4e007-16700198c7f4fead-02-debug-race-hd01,if=3Dvirtio,cache=3Dno= ne,aio=3Dnative > -m 2048 -smp 2,sockets=3D1,cores=3D2,threads=3D1 -monitor > tcp:0:4026,server,nowait -vnc :26 -qmp tcp:0:3026,server,nowait -name > 02-debug-race -boot order=3Ddc,menu=3Doff -cdrom > /home/kvm/cdrom//root/ubuntu-12.04-server-amd64.iso -k de -pidfile > /var/run/qemu/vm-221.pid -mem-prealloc -cpu > host,+x2apic,model_id=3DIntel(R) Xeon(R) CPU L5640 @ > 2.27GHz,-tsc -rtc base=3Dutc -usb -usbdevice tablet -no-hpet -vga cirru= s >=20 > it is important that the attached virtio image contains only zeroes. if= > the system boots from cd, select boot from first harddisk. > the hypervisor then hangs at 100% cpu and neither monitor nor qmp are > responsive anymore. >=20 > i have also seen customers reporting this when a VM is shut down. >=20 > if this is connected to the threaded vnc server it might be important t= o > connected at this time. >=20 > debug backtrace attached. >=20 > Thanks, > Peter >=20 > --=20 >=20 > (gdb) file /usr/bin/qemu-kvm-1.0.1 > Reading symbols from /usr/bin/qemu-kvm-1.0.1...done. > (gdb) attach 5145 > Attaching to program: /usr/bin/qemu-kvm-1.0.1, process 5145 > Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbol= s > found)...done. > Loaded symbols for /lib64/ld-linux-x86-64.so.2 > [Thread debugging using libthread_db enabled] > [New Thread 0x7f54d08b9700 (LWP 5253)] > [New Thread 0x7f5552757700 (LWP 5152)] > [New Thread 0x7f5552f58700 (LWP 5151)] > 0x00007f5553c6b5a3 in select () from /lib/libc.so.6 > (gdb) info threads > 4 Thread 0x7f5552f58700 (LWP 5151) 0x00007f5553c6a747 in ioctl () > from /lib/libc.so.6 > 3 Thread 0x7f5552757700 (LWP 5152) 0x00007f5553c6a747 in ioctl () > from /lib/libc.so.6 > 2 Thread 0x7f54d08b9700 (LWP 5253) 0x00007f5553f1a85c in > pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 > * 1 Thread 0x7f555550d700 (LWP 5145) 0x00007f5553c6b5a3 in select () > from /lib/libc.so.6 > (gdb) thread apply all bt >=20 > Thread 4 (Thread 0x7f5552f58700 (LWP 5151)): > #0 0x00007f5553c6a747 in ioctl () from /lib/libc.so.6 > #1 0x00007f5555727830 in kvm_vcpu_ioctl (env=3D0x7f5557652f10, > type=3D44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101 > #2 0x00007f555572728a in kvm_cpu_exec (env=3D0x7f5557652f10) at > /usr/src/qemu-kvm-1.0.1/kvm-all.c:987 > #3 0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=3D0x7f5557652f10)= at > /usr/src/qemu-kvm-1.0.1/cpus.c:740 > #4 0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0 > #5 0x00007f5553c72cdd in clone () from /lib/libc.so.6 > #6 0x0000000000000000 in ?? () >=20 > Thread 3 (Thread 0x7f5552757700 (LWP 5152)): > #0 0x00007f5553c6a747 in ioctl () from /lib/libc.so.6 > #1 0x00007f5555727830 in kvm_vcpu_ioctl (env=3D0x7f555766ae60, > type=3D44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101 > #2 0x00007f555572728a in kvm_cpu_exec (env=3D0x7f555766ae60) at > /usr/src/qemu-kvm-1.0.1/kvm-all.c:987 > #3 0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=3D0x7f555766ae60)= at > /usr/src/qemu-kvm-1.0.1/cpus.c:740 > #4 0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0 > #5 0x00007f5553c72cdd in clone () from /lib/libc.so.6 > #6 0x0000000000000000 in ?? () >=20 > Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)): > #0 0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from > /lib/libpthread.so.0 > #1 0x00007f5555679f5d in qemu_cond_wait (cond=3D0x7f5557ede1e0, > mutex=3D0x7f5557ede210) at qemu-thread-posix.c:113 > #2 0x00007f55556b06a1 in vnc_worker_thread_loop (queue=3D0x7f5557ede1e= 0) > at ui/vnc-jobs-async.c:222 > #3 0x00007f55556b0b7f in vnc_worker_thread (arg=3D0x7f5557ede1e0) at > ui/vnc-jobs-async.c:318 > #4 0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0 > #5 0x00007f5553c72cdd in clone () from /lib/libc.so.6 > #6 0x0000000000000000 in ?? () >=20 > Thread 1 (Thread 0x7f555550d700 (LWP 5145)): > #0 0x00007f5553c6b5a3 in select () from /lib/libc.so.6 > #1 0x00007f55556516be in main_loop_wait (nonblocking=3D0) at main-loop= =2Ec:456 > #2 0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:= 1482 > #3 0x00007f555564c698 in main (argc=3D38, argv=3D0x7ffff9d894a8, > envp=3D0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523 > (gdb) thread apply all bt full >=20 > Thread 4 (Thread 0x7f5552f58700 (LWP 5151)): > #0 0x00007f5553c6a747 in ioctl () from /lib/libc.so.6 > No symbol table info available. > #1 0x00007f5555727830 in kvm_vcpu_ioctl (env=3D0x7f5557652f10, > type=3D44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101 > ret =3D 32597 > arg =3D 0x0 > ap =3D {{gp_offset =3D 24, fp_offset =3D 48, overflow_arg_area = =3D > 0x7f5552f57e50, reg_save_area =3D 0x7f5552f57d90}} > #2 0x00007f555572728a in kvm_cpu_exec (env=3D0x7f5557652f10) at > /usr/src/qemu-kvm-1.0.1/kvm-all.c:987 > run =3D 0x7f55553e2000 > ret =3D 0 > run_ret =3D 0 > #3 0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=3D0x7f5557652f10)= at > /usr/src/qemu-kvm-1.0.1/cpus.c:740 > env =3D 0x7f5557652f10 > r =3D 0 > #4 0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0 > No symbol table info available. > #5 0x00007f5553c72cdd in clone () from /lib/libc.so.6 > No symbol table info available. > #6 0x0000000000000000 in ?? () > No symbol table info available. >=20 > Thread 3 (Thread 0x7f5552757700 (LWP 5152)): > #0 0x00007f5553c6a747 in ioctl () from /lib/libc.so.6 > No symbol table info available. > #1 0x00007f5555727830 in kvm_vcpu_ioctl (env=3D0x7f555766ae60, > type=3D44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101 > ret =3D 0 > arg =3D 0x0 > ap =3D {{gp_offset =3D 24, fp_offset =3D 48, overflow_arg_area = =3D > 0x7f5552756e50, reg_save_area =3D 0x7f5552756d90}} > #2 0x00007f555572728a in kvm_cpu_exec (env=3D0x7f555766ae60) at > /usr/src/qemu-kvm-1.0.1/kvm-all.c:987 > run =3D 0x7f55553df000 > ret =3D 32597 > run_ret =3D 1433358864 > #3 0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=3D0x7f555766ae60)= at > /usr/src/qemu-kvm-1.0.1/cpus.c:740 > env =3D 0x7f555766ae60 > r =3D 65536 > #4 0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0 > No symbol table info available. > #5 0x00007f5553c72cdd in clone () from /lib/libc.so.6 > No symbol table info available. > #6 0x0000000000000000 in ?? () > No symbol table info available. >=20 > Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)): > #0 0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from > /lib/libpthread.so.0 > ---Type to continue, or q to quit--- > No symbol table info available. > #1 0x00007f5555679f5d in qemu_cond_wait (cond=3D0x7f5557ede1e0, > mutex=3D0x7f5557ede210) at qemu-thread-posix.c:113 > err =3D 32597 > __func__ =3D "qemu_cond_wait" > #2 0x00007f55556b06a1 in vnc_worker_thread_loop (queue=3D0x7f5557ede1e= 0) > at ui/vnc-jobs-async.c:222 > job =3D 0x7f5557edd510 > entry =3D 0x0 > tmp =3D 0x0 > vs =3D {csock =3D -1, ds =3D 0x7f5557e8ec40, dirty =3D {{0, 0, = 0} > }, lossy_rect =3D 0x7f5557edd570, vd =3D 0x7f54d08b= a010, > need_update =3D 0, force_update =3D 0, > features =3D 227, absolute =3D 0, last_x =3D 0, last_y =3D 0,= > client_width =3D 0, client_height =3D 0, vnc_encoding =3D 6, major =3D = 0, minor > =3D 0, auth =3D 0, > challenge =3D '\000' , info =3D 0x0, output= =3D > {capacity =3D 1230913, offset =3D 1448, buffer =3D 0x7f5558176d60 ""}, = input =3D > {capacity =3D 0, offset =3D 0, > buffer =3D 0x0}, write_pixels =3D 0x7f55556b2aaf > , clientds =3D {flags =3D 0 '\000', width =3D= 640, > height =3D 480, linesize =3D 1280, > data =3D 0x7f54d0c00000 "B\a", pf =3D {bits_per_pixel =3D 3= 2 ' ', > bytes_per_pixel =3D 4 '\004', depth =3D 24 '\030', rmask =3D 16711680, = gmask =3D > 65280, bmask =3D 255, amask =3D 0, > rshift =3D 16 '\020', gshift =3D 8 '\b', bshift =3D 0 '\0= 00', > ashift =3D 0 '\000', rmax =3D 255 '\377', gmax =3D 255 '\377', bmax =3D= 255 > '\377', amax =3D 0 '\000', > rbits =3D 8 '\b', gbits =3D 8 '\b', bbits =3D 8 '\b', abi= ts =3D 0 > '\000'}}, audio_cap =3D 0x0, as =3D {freq =3D 0, nchannels =3D 0, fmt =3D= > AUD_FMT_U8, endianness =3D 0}, > read_handler =3D 0, read_handler_expect =3D 0, modifiers_stat= e =3D > '\000' , led =3D 0x0, abort =3D false, output_mutex = =3D > {lock =3D {__data =3D {__lock =3D 0, > __count =3D 0, __owner =3D 0, __nusers =3D 0, __kind =3D= 0, > __spins =3D 0, __list =3D {__prev =3D 0x0, __next =3D 0x0}}, __size =3D= '\000' > , __align =3D 0}}, > bh =3D 0x0, jobs_buffer =3D {capacity =3D 0, offset =3D 0, bu= ffer =3D > 0x0}, tight =3D {type =3D 0, quality =3D 255 '\377', compression =3D 9 = '\t', > pixel24 =3D 0 '\000', tight =3D { > capacity =3D 0, offset =3D 0, buffer =3D 0x0}, tmp =3D {c= apacity =3D > 0, offset =3D 0, buffer =3D 0x0}, zlib =3D {capacity =3D 0, offset =3D = 0, buffer =3D > 0x0}, gradient =3D { > capacity =3D 0, offset =3D 0, buffer =3D 0x0}, levels =3D= {0, 0, > 0, 0}, stream =3D {{next_in =3D 0x0, avail_in =3D 0, total_in =3D 0, ne= xt_out =3D > 0x0, avail_out =3D 0, > total_out =3D 0, msg =3D 0x0, state =3D 0x0, zalloc =3D= 0, zfree > =3D 0, opaque =3D 0x0, data_type =3D 0, adler =3D 0, reserved =3D 0}, {= next_in =3D > 0x0, avail_in =3D 0, total_in =3D 0, > next_out =3D 0x0, avail_out =3D 0, total_out =3D 0, msg= =3D 0x0, > state =3D 0x0, zalloc =3D 0, zfree =3D 0, opaque =3D 0x0, data_type =3D= 0, adler =3D > 0, reserved =3D 0}, { > next_in =3D 0x0, avail_in =3D 0, total_in =3D 0, next_o= ut =3D > 0x0, avail_out =3D 0, total_out =3D 0, msg =3D 0x0, state =3D 0x0, zall= oc =3D 0, > zfree =3D 0, opaque =3D 0x0, > data_type =3D 0, adler =3D 0, reserved =3D 0}, {next_in= =3D 0x0, > avail_in =3D 0, total_in =3D 0, next_out =3D 0x0, avail_out =3D 0, tota= l_out =3D > 0, msg =3D 0x0, state =3D 0x0, > zalloc =3D 0, zfree =3D 0, opaque =3D 0x0, data_type =3D= 0, > adler =3D 0, reserved =3D 0}}}, zlib =3D {zlib =3D {capacity =3D 122948= 8, offset =3D > 1152000, > buffer =3D 0x7f5557fe9320 ""}, tmp =3D {capacity =3D 1230= 913, > offset =3D 20, buffer =3D 0x7f5558176d60 ""}, stream =3D {next_in =3D > 0x7f5558102720 "\030\b ", avail_in =3D 0, > total_in =3D 7658880, next_out =3D 0x7f5558177308 "", > avail_out =3D 1229465, total_out =3D 49027, msg =3D 0x0, state =3D 0x7f= 55581155e0, > zalloc =3D 0x7f55556a622f , zfree =3D > 0x7f55556a626c , opaque =3D 0x7f54d08a6810, data_type =3D= 0, > adler =3D 197164569, reserved =3D 0}, > level =3D 9}, hextile =3D {send_tile =3D 0x7f55556a1180 > }, zrle =3D {type =3D 0, fb =3D {capacity= =3D 0, > offset =3D 0, buffer =3D 0x0}, zrle =3D { > capacity =3D 0, offset =3D 0, buffer =3D 0x0}, tmp =3D {c= apacity =3D > 0, offset =3D 0, buffer =3D 0x0}, zlib =3D {capacity =3D 0, offset =3D = 0, buffer =3D > 0x0}, stream =3D {next_in =3D 0x0, > avail_in =3D 0, total_in =3D 0, next_out =3D 0x0, avail_o= ut =3D 0, > total_out =3D 0, msg =3D 0x0, state =3D 0x0, zalloc =3D 0, zfree =3D 0,= opaque =3D > 0x0, data_type =3D 0, adler =3D 0, > reserved =3D 0}, palette =3D {pool =3D {{idx =3D 0, color= =3D 0, > next =3D {le_next =3D 0x0, le_prev =3D 0x0}} }, size= =3D 0, > max =3D 0, bpp =3D 0, table =3D {{ > lh_first =3D 0x0} }}}, zywrle =3D = {buf > =3D {0 }}, mouse_mode_notifier =3D {notify =3D 0, n= ode =3D > {tqe_next =3D 0x0, > tqe_prev =3D 0x0}}, next =3D {tqe_next =3D 0x0, tqe_prev = =3D 0x0}} > n_rectangles =3D 1 > saved_offset =3D 2 > #3 0x00007f55556b0b7f in vnc_worker_thread (arg=3D0x7f5557ede1e0) at > ui/vnc-jobs-async.c:318 > queue =3D 0x7f5557ede1e0 > #4 0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0 > No symbol table info available. > #5 0x00007f5553c72cdd in clone () from /lib/libc.so.6 > No symbol table info available. > #6 0x0000000000000000 in ?? () > No symbol table info available. >=20 > ---Type to continue, or q to quit--- > Thread 1 (Thread 0x7f555550d700 (LWP 5145)): > #0 0x00007f5553c6b5a3 in select () from /lib/libc.so.6 > No symbol table info available. > #1 0x00007f55556516be in main_loop_wait (nonblocking=3D0) at main-loop= =2Ec:456 > rfds =3D {fds_bits =3D {14197552, 0 }} > wfds =3D {fds_bits =3D {0 }} > xfds =3D {fds_bits =3D {0 }} > ret =3D 32597 > nfds =3D 23 > tv =3D {tv_sec =3D 0, tv_usec =3D 817147} > timeout =3D 1000 > #2 0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:= 1482 > nonblocking =3D false > last_io =3D 1 > #3 0x00007f555564c698 in main (argc=3D38, argv=3D0x7ffff9d894a8, > envp=3D0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523 > gdbstub_dev =3D 0x0 > i =3D 64 > snapshot =3D 0 > linux_boot =3D 0 > icount_option =3D 0x0 > initrd_filename =3D 0x0 > kernel_filename =3D 0x0 > kernel_cmdline =3D 0x7f55557d8bef "" > boot_devices =3D "dc", '\000' > ds =3D 0x7f5557e8ec40 > dcl =3D 0x0 > cyls =3D 0 > heads =3D 0 > secs =3D 0 > translation =3D 0 > hda_opts =3D 0x0 > opts =3D 0x7f5557637cb0 > olist =3D 0x7ffff9d89268 > optind =3D 38 > optarg =3D 0x7ffff9d89bed "cirrus" > loadvm =3D 0x0 > machine =3D 0x7f5555b53500 > cpu_model =3D 0x7ffff9d89b6f "host,+x2apic,model_id=3DIntel(R) > Xeon(R) CPU", ' ' , "L5640 @ 2.27GHz,-tsc" > pid_file =3D 0x7ffff9d89b43 "/var/run/qemu/vm-221.pid" > incoming =3D 0x0 > show_vnc_port =3D 0 > defconfig =3D 1 > log_mask =3D 0x0 > log_file =3D 0x0 > mem_trace =3D {malloc =3D 0x7f5555649081 , re= alloc > =3D 0x7f55556490b6 , free =3D 0x7f55556490fa > , calloc =3D 0, > ---Type to continue, or q to quit--- > try_malloc =3D 0, try_realloc =3D 0} > trace_events =3D 0x0 > trace_file =3D 0x0 > (gdb) >=20 >=20 >=20 --------------enigF9C2F62C5CDC0AC9C5E57BDE Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/rOsoACgkQitSsb3rl5xScWgCg0g9p8/61/63F6uiYKTqXClkV s9AAoKLLz2SgEWZXwgAn9KVs1lUSxyv3 =K0XM -----END PGP SIGNATURE----- --------------enigF9C2F62C5CDC0AC9C5E57BDE--