From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=48953 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1PxDnP-0000FN-KM
	for qemu-devel@nongnu.org; Wed, 09 Mar 2011 02:27:01 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1PxDnN-0001HZ-7C
	for qemu-devel@nongnu.org; Wed, 09 Mar 2011 02:26:59 -0500
Received: from moutng.kundenserver.de ([212.227.17.8]:53117)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from ) id 1PxDnM-0001HT-Kx
	for qemu-devel@nongnu.org; Wed, 09 Mar 2011 02:26:57 -0500
Message-ID: <4D772BBC.4040603@mail.berlios.de>
Date: Wed, 09 Mar 2011 08:26:52 +0100
From: Stefan Weil
MIME-Version: 1.0
Subject: Re: [Qemu-devel] segmentation fault in qemu-kvm-0.14.0
References: <2640D58E-2101-47FA-99B6-28815666651E@dlh.net>
In-Reply-To: <2640D58E-2101-47FA-99B6-28815666651E@dlh.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Peter Lieven
Cc: qemu-devel , kvm@vger.kernel.org

On 08.03.2011 23:53, Peter Lieven wrote:
> Hi,
>
> during testing of qemu-kvm-0.14.0 i can reproduce the following
> segfault. i have seen similar crash already in 0.13.0, but had no time
> to debug.
> my guess is that this segfault is related to the threaded vnc server
> which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc
> client is attached. it might also be connected to a resolution change
> in the guest. i have a backtrace attached. the debugger is still
> running if someone
> needs more output
>
> Reading symbols from /usr/local/bin/qemu-system-x86_64...done.
> (gdb) r -net tap,vlan=141,script=no,downscript=no,ifname=tap0
>   -net nic,vlan=141,model=rtl8139,macaddr=52:54:00:ff:00:93
>   -drive format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-e6b70e107-e87000e7acf4d4e5-lieven-winxp-r17453,if=ide,boot=on,cache=none,aio=native
>   -m 1024 -monitor tcp:0:4001,server,nowait -vnc :1 -name 'lieven-winxp-test'
>   -boot order=c,menu=on -k de -pidfile /var/run/qemu/vm-265.pid
>   -mem-path /hugepages -mem-prealloc
>   -cpu qemu64,model_id='Intel(R) Xeon(R) CPU E5640 @ 2.67GHz',-nx
>   -rtc base=localtime,clock=vm -vga cirrus -usb -usbdevice tablet
> Starting program: /usr/local/bin/qemu-system-x86_64
>   -net tap,vlan=141,script=no,downscript=no,ifname=tap0
>   -net nic,vlan=141,model=rtl8139,macaddr=52:54:00:ff:00:93
>   -drive format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-e6b70e107-e87000e7acf4d4e5-lieven-winxp-r17453,if=ide,boot=on,cache=none,aio=native
>   -m 1024 -monitor tcp:0:4001,server,nowait -vnc :1 -name 'lieven-winxp-test'
>   -boot order=c,menu=on -k de -pidfile /var/run/qemu/vm-265.pid
>   -mem-path /hugepages -mem-prealloc
>   -cpu qemu64,model_id='Intel(R) Xeon(R) CPU E5640 @ 2.67GHz',-nx
>   -rtc base=localtime,clock=vm -vga cirrus -usb -usbdevice tablet
> [Thread debugging using libthread_db enabled]
>
> [New Thread 0x7ffff694e700 (LWP 29042)]
> [New Thread 0x7ffff6020700 (LWP 29043)]
> [New Thread 0x7ffff581f700 (LWP 29074)]
> [Thread 0x7ffff581f700 (LWP 29074) exited]
> [New Thread 0x7ffff581f700 (LWP 29124)]
> [Thread 0x7ffff581f700 (LWP 29124) exited]
> [New Thread 0x7ffff581f700 (LWP 29170)]
> [Thread 0x7ffff581f700 (LWP 29170) exited]
> [New Thread 0x7ffff581f700 (LWP 29246)]
> [Thread 0x7ffff581f700 (LWP 29246) exited]
> [New Thread 0x7ffff581f700 (LWP 29303)]
> [Thread 0x7ffff581f700 (LWP 29303) exited]
> [New Thread 0x7ffff581f700 (LWP 29349)]
> [Thread 0x7ffff581f700 (LWP 29349) exited]
> [New Thread 0x7ffff581f700 (LWP 29399)]
> [Thread 0x7ffff581f700 (LWP 29399) exited]
> [New Thread 0x7ffff581f700 (LWP 29471)]
> [Thread 0x7ffff581f700 (LWP 29471) exited]
> [New Thread 0x7ffff581f700 (LWP 29521)]
> [Thread 0x7ffff581f700 (LWP 29521) exited]
> [New Thread 0x7ffff581f700 (LWP 29593)]
> [Thread 0x7ffff581f700 (LWP 29593) exited]
> [New Thread 0x7ffff581f700 (LWP 29703)]
> [Thread 0x7ffff581f700 (LWP 29703) exited]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000000000 in ?? ()
> (gdb)
>
> (gdb) thread apply all bt full
>
> Thread 3 (Thread 0x7ffff6020700 (LWP 29043)):
> #0  0x00007ffff79c385c in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib/libpthread.so.0
> No symbol table info available.
> #1  0x00000000004d3ae1 in qemu_cond_wait (cond=0x1612d50, mutex=0x1612d80)
>     at qemu-thread.c:133
>         err = 0
>         __func__ = "qemu_cond_wait"
> #2  0x00000000004d2b39 in vnc_worker_thread_loop (queue=0x1612d50)
>     at ui/vnc-jobs-async.c:198
>         job = 0x7ffff058cd20
>         entry = 0x0
>         tmp = 0x0
>         vs = {csock = -1, ds = 0x15cb380, dirty = {{0, 0, 0, 0,
>           0} }, vd = 0x1607ff0, need_update = 0,
>           force_update = 0, features = 243, absolute = 0, last_x = 0,
>           last_y = 0, client_width = 0, client_height = 0, vnc_encoding = 7,
>           major = 0, minor = 0, challenge = '\000' ,
>           info = 0x0, output = {capacity = 3194, offset = 2723,
>           buffer = 0x1fbbfd0 ""}, input = {capacity = 0, offset = 0,
>           buffer = 0x0}, write_pixels = 0x4c4bc9 ,
>           clientds = {flags = 0 '\000', width = 720, height = 400,
>           linesize = 2880,
>           data = 0x7ffff6021010 ,
>           pf = {bits_per_pixel = 32 ' ', bytes_per_pixel = 4 '\004',
>           depth = 24 '\030', rmask = 0, gmask = 0, bmask = 0, amask = 0,
>           rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\000',
>           ashift = 24 '\030', rmax = 255 '\377', gmax = 255 '\377',
>           bmax = 255 '\377', amax = 255 '\377', rbits = 8 '\b',
>           gbits = 8 '\b', bbits = 8 '\b', abits = 8 '\b'}},
>           audio_cap = 0x0, as = {freq = 0, nchannels = 0, fmt = AUD_FMT_U8,
>           endianness = 0}, read_handler = 0, read_handler_expect = 0,
>           modifiers_state = '\000' , led = 0x0,
>           abort = false, output_mutex = {lock = {__data = {__lock = 0,
>           __count = 0, __owner = 0, __nusers = 0, __kind = 0,
>           __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
>           __size = '\000' , __align = 0}}, tight = {
>           type = 7, quality = 255 '\377', compression = 9 '\t',
>           pixel24 = 1 '\001', tight = {capacity = 3146708, offset = 1376,
>           buffer = 0x7ffff4684010 ""}, tmp = {capacity = 3194,
>           offset = 2672, buffer = 0x1fbbfd0 ""}, zlib = {
>           capacity = 141451, offset = 0,
>           buffer = 0x1805da0
>           "B--G\317\253\031=\257f\364\274\232\321\363jF\317\253\241\326y5"},
>           gradient = {capacity = 0, offset = 0, buffer = 0x0},
>           levels = {9, 9, 9, 0}, stream = {{
>           next_in = 0x7ffff4684460
>           "\002\003\002\003\002\002\002\002\002\002\002\002\002", avail_in = 0,
>           total_in = 9048240,
>           next_out = 0x1805df6
>           "d\276\345c\363\216\237\212\377\314\003\236\246\\$\361\025~\032\311\232\067q&_$\231\251y\262*!\231\067\236\067\363\214\347\315\240\361\221,\274T\257\221\314\333\341\251\362R\373\232_\311a\272\002\061sgt\317\030\332\262~\300\063\267]\307\267\343\033",
>           avail_out = 141365,
>           total_out = 1034664, msg = 0x0, state = 0x16ea550,
>           zalloc = 0x4ca2d8 ,
>           zfree = 0x4ca315 , opaque = 0x7ffff6015920,
>           data_type = 0, adler = 1285581469, reserved = 0}, {
>           next_in = 0x7ffff4684022 "\002\002", avail_in = 0,
>           total_in = 46491,
>           next_out = 0x1805da8 "\257f\364\274\232\321\363jF\317\253\241\326y5",
>           avail_out = 141443, total_out = 9905, msg = 0x0, state = 0x17a3f00,
>           zalloc = 0x4ca2d8 ,
>           zfree = 0x4ca315 , opaque = 0x7ffff6015920,
>           data_type = 0, adler = 2415253306, reserved = 0}, {
>           next_in = 0x7ffff4684570 "\017(`", avail_in = 0,
>           total_in = 2188024,
>           next_out = 0x1805dbd
>           "z\222\024#\233O\214g\341\352\211/U\310s\017\361\245\020\262\343\211TO\222\371\304\207\fryK|\251E\222\343\311+\237\351`>\355\312sR\265\320\272~\001",
>           avail_out = 141422, total_out = 100512, msg = 0x0,
>           state = 0x1682950, zalloc = 0x4ca2d8 ,
>           zfree = 0x4ca315 , opaque = 0x7ffff6015920,
>           data_type = 0, adler = 3999694267, reserved = 0}, {
>           next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0,
>           avail_out = 0, total_out = 0, msg = 0x0, state = 0x0,
>           zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler = 0,
>           reserved = 0}}}, zlib = {zlib = {capacity = 0, offset = 0,
>           buffer = 0x0}, tmp = {capacity = 0, offset = 0, buffer = 0x0},
>           stream = {next_in = 0x0, avail_in = 0, total_in = 0,
>           next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0,
>           state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0,
>           adler = 0, reserved = 0}, level = 0}, hextile = {
>           send_tile = 0x4cd21d },
>           mouse_mode_notifier = {notify = 0x80, node = {tqe_next = 0x1612da8,
>           tqe_prev = 0x7ffff6020700}}, next = {tqe_next = 0x7ffff6020700,
>           tqe_prev = 0x7ffff7ffe128}}
>         n_rectangles = 40
>         saved_offset = 2
>         flush = true
> #3  0x00000000004d302c in vnc_worker_thread (arg=0x1612d50)
>     at ui/vnc-jobs-async.c:302
>         queue = 0x1612d50
> #4  0x00007ffff79be9ca in start_thread () from /lib/libpthread.so.0
> No symbol table info available.
> #5  0x00007ffff6c3970d in clone () from /lib/libc.so.6
> No symbol table info available.
> #6  0x0000000000000000 in ?? ()
> No symbol table info available.
>
> Thread 2 (Thread 0x7ffff694e700 (LWP 29042)):
> #0  0x00007ffff6c31197 in ioctl () from /lib/libc.so.6
> No symbol table info available.
> #1  0x0000000000434c12 in kvm_run (env=0x118c2e0)
>     at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:582
>         r = 0
>         kvm = 0x1172538
>         run = 0x7ffff7fc7000
>         fd = 15
> #2  0x0000000000435ed4 in kvm_cpu_exec (env=0x118c2e0)
>     at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1233
>         r = 0
> #3  0x00000000004364f7 in kvm_main_loop_cpu (env=0x118c2e0)
>     at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1419
>         run_cpu = 1
> #4  0x0000000000436640 in ap_main_loop (_env=0x118c2e0)
>     at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1466
>         env = 0x118c2e0
>         signals = {__val = {18446744067267100671,
>           18446744073709551615 }}
>         data = 0x0
> #5  0x00007ffff79be9ca in start_thread () from /lib/libpthread.so.0
> No symbol table info available.
> #6  0x00007ffff6c3970d in clone () from /lib/libc.so.6
> No symbol table info available.
> #7  0x0000000000000000 in ?? ()
> No symbol table info available.
>
> Thread 1 (Thread 0x7ffff7ff0700 (LWP 29038)):
> #0  0x0000000000000000 in ?? ()
> No symbol table info available.
> #1  0x000000000041d669 in main_loop_wait (nonblocking=0)
>     at /usr/src/qemu-kvm-0.14.0/vl.c:1388
>         pioh = 0x1652870
>         ioh = 0x1613330
>         rfds = {fds_bits = {0 }}
>         wfds = {fds_bits = {1048576, 0 }}
>         xfds = {fds_bits = {0 }}
>         ret = 1
>         nfds = 21
>         tv = {tv_sec = 0, tv_usec = 999998}
>         timeout = 1000
> #2  0x0000000000436a4a in kvm_main_loop ()
>     at /usr/src/qemu-kvm-0.14.0/qemu-kvm.c:1589
>         mask = {__val = {268443712, 0 }}
>         sigfd = 19
> #3  0x000000000041d785 in main_loop () at
>     /usr/src/qemu-kvm-0.14.0/vl.c:1429
>         r = 0
> #4  0x000000000042166a in main (argc=33, argv=0x7fffffffe658,
>     envp=0x7fffffffe768) at /usr/src/qemu-kvm-0.14.0/vl.c:3201
>         gdbstub_dev = 0x0
>         i = 64
>         snapshot = 0
>         linux_boot = 0
>         icount_option = 0x0
>         initrd_filename = 0x0
>         kernel_filename = 0x0
>         kernel_cmdline = 0x5e57a3 ""
>         boot_devices = "c\000d", '\000'
>         ds = 0x15cb380
>         dcl = 0x0
>         cyls = 0
>         heads = 0
>         secs = 0
>         translation = 0
>         hda_opts = 0x0
>         opts = 0x1171a80
>         olist = 0x7fffffffe3f0
>         optind = 33
>         optarg = 0x7fffffffebce "tablet"
>         loadvm = 0x0
>         machine = 0x962cc0
>         cpu_model = 0x7fffffffeb51 "qemu64,model_id=Intel(R) Xeon(R) CPU", ' '
>           , "E5640 @ 2.67GHz,-nx"
>         tb_size = 0
>         pid_file = 0x7fffffffeb10 "/var/run/qemu/vm-265.pid"
>         incoming = 0x0
>         show_vnc_port = 0
>         defconfig = 1
> (gdb) info threads
>   3 Thread 0x7ffff6020700 (LWP 29043)  0x00007ffff79c385c in
>     pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
>   2 Thread 0x7ffff694e700 (LWP 29042)  0x00007ffff6c31197 in ioctl ()
>     from /lib/libc.so.6
> * 1 Thread 0x7ffff7ff0700 (LWP 29038)  0x0000000000000000 in ?? ()
> (gdb)
>
>
> Help appreciated!
>
> Thanks,
> Peter

Hi Peter,

Did you apply this patch, which fixes one of the known VNC problems
(but is still missing in qemu git master)?

http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00256.html

Then you can read this thread:

http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg00313.html

Finally, the following modifications of ui/vnc.c might help to see
whether you experience the same kind of crash as I get in my
environment. They add assertions that catch the bad memory accesses
which sometimes occur when a VNC client is connected to the server
and the screen is refreshed after a resolution change.

The code line with the //~ comment also includes a fix which works for me.

Regards,
Stefan W.

@@ -2382,6 +2384,10 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
     int y;
     uint8_t *guest_row;
     uint8_t *server_row;
+
+    size_t guest_size = vd->guest.ds->linesize * vd->guest.ds->height;
+    size_t server_size = vd->server->linesize * vd->server->height;
+
     int cmp_bytes;
     VncState *vs;
     int has_dirty = 0;
@@ -2399,11 +2405,15 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
      * Update server dirty map.
      */
     cmp_bytes = 16 * ds_get_bytes_per_pixel(vd->ds);
+    if (cmp_bytes > vd->ds->surface->linesize) {
+        //~ fix crash: cmp_bytes = vd->ds->surface->linesize;
+    }
     guest_row = vd->guest.ds->data;
     server_row = vd->server->data;
     for (y = 0; y < vd->guest.ds->height; y++) {
         if (!bitmap_empty(vd->guest.dirty[y], VNC_DIRTY_BITS)) {
             int x;
+            size_t size_offset = 0;
             uint8_t *guest_ptr;
             uint8_t *server_ptr;
 
@@ -2412,6 +2422,9 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
             for (x = 0; x < vd->guest.ds->width;
                     x += 16, guest_ptr += cmp_bytes, server_ptr += cmp_bytes) {
+                assert(size_offset + cmp_bytes <= guest_size);
+                assert(size_offset + cmp_bytes <= server_size);
+                size_offset += cmp_bytes;
                 if (!test_and_clear_bit((x / 16), vd->guest.dirty[y]))
                     continue;
                 if (memcmp(server_ptr, guest_ptr, cmp_bytes) == 0)
@@ -2427,6 +2440,10 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
         }
         guest_row += ds_get_linesize(vd->ds);
         server_row += ds_get_linesize(vd->ds);
+        assert(guest_size >= ds_get_linesize(vd->ds));
+        guest_size -= ds_get_linesize(vd->ds);
+        assert(server_size >= ds_get_linesize(vd->ds));
+        server_size -= ds_get_linesize(vd->ds);
     }
     return has_dirty;
 }
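
For anyone who wants to experiment with the idea outside of QEMU, here is
a minimal stand-alone sketch of the same kind of tile comparison with the
compare width clamped to the row size. It is only an illustration under
my own assumptions, not code from ui/vnc.c: the function name
count_changed_tiles, its parameters and the TILE_PIXELS macro are
hypothetical, both surfaces are assumed to share one geometry, and the
shortening of the last tile in a row is an extra precaution that the
patch above does not contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TILE_PIXELS 16 /* the refresh loop compares 16 pixels at a time */

/*
 * Hypothetical model of the comparison loop (not QEMU code).
 * Returns the number of 16-pixel tiles that differ between two
 * surfaces with identical width, height and linesize.
 */
static int count_changed_tiles(const uint8_t *guest, const uint8_t *server,
                               int width, int height,
                               int bytes_per_pixel, size_t linesize)
{
    size_t cmp_bytes = (size_t)TILE_PIXELS * (size_t)bytes_per_pixel;
    int changed = 0;

    /* Same idea as the "//~ fix crash" line: never compare more than one row. */
    if (cmp_bytes > linesize) {
        cmp_bytes = linesize;
    }

    for (int y = 0; y < height; y++) {
        const uint8_t *guest_row = guest + (size_t)y * linesize;
        const uint8_t *server_row = server + (size_t)y * linesize;
        size_t row_offset = 0;

        for (int x = 0; x < width; x += TILE_PIXELS) {
            size_t n = cmp_bytes;

            /* Assumption beyond the patch: shorten a partial last tile. */
            if (row_offset + n > linesize) {
                n = linesize - row_offset;
            }
            /* Bounds check in the spirit of the assertions in the patch. */
            assert(row_offset + n <= linesize);

            if (memcmp(server_row + row_offset, guest_row + row_offset, n) != 0) {
                changed++;
            }
            row_offset += n;
        }
    }
    return changed;
}
```

With an 8 pixel wide, 32 bit surface (linesize 32), the unclamped compare
width would be 64 bytes, twice the size of a row. The clamp keeps every
memcmp inside the row it belongs to.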