kvm.vger.kernel.org archive mirror
* race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
@ 2012-06-27 15:39 Peter Lieven
  2012-06-27 16:54 ` Jan Kiszka
  0 siblings, 1 reply; 20+ messages in thread
From: Peter Lieven @ 2012-06-27 15:39 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: Gleb Natapov

Hi all,

I debugged this further and found out that kvm-kmod-3.0 works with
qemu-kvm-1.0.1, while kvm-kmod-3.3 and kvm-kmod-3.4 do not. What also
works is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
Does anyone have a clue which new KVM feature could cause this when a
vcpu is in an infinite loop?

Thanks,
Peter

---

Hi,

We recently came across multiple VMs hitting a race and ceasing to work.
It seems to happen when the system is at 100% CPU.
One way to reproduce this is:
qemu-kvm-1.0.1 with vnc-thread enabled

cmdline (or similar):
/usr/bin/qemu-kvm-1.0.1 -net 
tap,vlan=141,script=no,downscript=no,ifname=tap15,vnet_hdr -net 
nic,vlan=141,model=virtio,macaddr=52:54:00:ff:00:f7 -drive 
format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-efdf4e007-16700198c7f4fead-02-debug-race-hd01,if=virtio,cache=none,aio=native 
-m 2048 -smp 2,sockets=1,cores=2,threads=1 -monitor 
tcp:0:4026,server,nowait -vnc :26 -qmp tcp:0:3026,server,nowait -name 
02-debug-race -boot order=dc,menu=off -cdrom 
/home/kvm/cdrom//root/ubuntu-12.04-server-amd64.iso -k de -pidfile 
/var/run/qemu/vm-221.pid -mem-prealloc -cpu 
host,+x2apic,model_id=Intel(R) Xeon(R) CPU           L5640  @ 
2.27GHz,-tsc -rtc base=utc -usb -usbdevice tablet -no-hpet -vga cirrus

It is important that the attached virtio image contains only zeroes. If
the system boots from CD, select booting from the first hard disk.
The hypervisor then hangs at 100% CPU, and neither the monitor nor QMP is
responsive anymore.

I have also seen customers report this when a VM is shut down.

If this is related to the threaded VNC server, it might be important to
have a VNC client connected at the time.

Debug backtrace attached.

Thanks,
Peter

--

(gdb) file /usr/bin/qemu-kvm-1.0.1
Reading symbols from /usr/bin/qemu-kvm-1.0.1...done.
(gdb) attach 5145
Attaching to program: /usr/bin/qemu-kvm-1.0.1, process 5145
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols 
found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
[Thread debugging using libthread_db enabled]
[New Thread 0x7f54d08b9700 (LWP 5253)]
[New Thread 0x7f5552757700 (LWP 5152)]
[New Thread 0x7f5552f58700 (LWP 5151)]
0x00007f5553c6b5a3 in select () from /lib/libc.so.6
(gdb) info threads
   4 Thread 0x7f5552f58700 (LWP 5151)  0x00007f5553c6a747 in ioctl () 
from /lib/libc.so.6
   3 Thread 0x7f5552757700 (LWP 5152)  0x00007f5553c6a747 in ioctl () 
from /lib/libc.so.6
   2 Thread 0x7f54d08b9700 (LWP 5253)  0x00007f5553f1a85c in 
pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
* 1 Thread 0x7f555550d700 (LWP 5145)  0x00007f5553c6b5a3 in select () 
from /lib/libc.so.6
(gdb) thread apply all bt

Thread 4 (Thread 0x7f5552f58700 (LWP 5151)):
#0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
#1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f5557652f10, 
type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
#2  0x00007f555572728a in kvm_cpu_exec (env=0x7f5557652f10) at 
/usr/src/qemu-kvm-1.0.1/kvm-all.c:987
#3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f5557652f10) at 
/usr/src/qemu-kvm-1.0.1/cpus.c:740
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f5552757700 (LWP 5152)):
#0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
#1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f555766ae60, 
type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
#2  0x00007f555572728a in kvm_cpu_exec (env=0x7f555766ae60) at 
/usr/src/qemu-kvm-1.0.1/kvm-all.c:987
#3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f555766ae60) at 
/usr/src/qemu-kvm-1.0.1/cpus.c:740
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)):
#0  0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib/libpthread.so.0
#1  0x00007f5555679f5d in qemu_cond_wait (cond=0x7f5557ede1e0, 
mutex=0x7f5557ede210) at qemu-thread-posix.c:113
#2  0x00007f55556b06a1 in vnc_worker_thread_loop (queue=0x7f5557ede1e0) 
at ui/vnc-jobs-async.c:222
#3  0x00007f55556b0b7f in vnc_worker_thread (arg=0x7f5557ede1e0) at 
ui/vnc-jobs-async.c:318
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f555550d700 (LWP 5145)):
#0  0x00007f5553c6b5a3 in select () from /lib/libc.so.6
#1  0x00007f55556516be in main_loop_wait (nonblocking=0) at main-loop.c:456
#2  0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:1482
#3  0x00007f555564c698 in main (argc=38, argv=0x7ffff9d894a8, 
envp=0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523
(gdb) thread apply all bt full

Thread 4 (Thread 0x7f5552f58700 (LWP 5151)):
#0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
No symbol table info available.
#1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f5557652f10, 
type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
         ret = 32597
         arg = 0x0
         ap = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 
0x7f5552f57e50, reg_save_area = 0x7f5552f57d90}}
#2  0x00007f555572728a in kvm_cpu_exec (env=0x7f5557652f10) at 
/usr/src/qemu-kvm-1.0.1/kvm-all.c:987
         run = 0x7f55553e2000
         ret = 0
         run_ret = 0
#3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f5557652f10) at 
/usr/src/qemu-kvm-1.0.1/cpus.c:740
         env = 0x7f5557652f10
         r = 0
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
No symbol table info available.
#6  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 3 (Thread 0x7f5552757700 (LWP 5152)):
#0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
No symbol table info available.
#1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f555766ae60, 
type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
         ret = 0
         arg = 0x0
         ap = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 
0x7f5552756e50, reg_save_area = 0x7f5552756d90}}
#2  0x00007f555572728a in kvm_cpu_exec (env=0x7f555766ae60) at 
/usr/src/qemu-kvm-1.0.1/kvm-all.c:987
         run = 0x7f55553df000
         ret = 32597
         run_ret = 1433358864
#3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f555766ae60) at 
/usr/src/qemu-kvm-1.0.1/cpus.c:740
         env = 0x7f555766ae60
         r = 65536
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
No symbol table info available.
#6  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)):
#0  0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib/libpthread.so.0
---Type <return> to continue, or q <return> to quit---
No symbol table info available.
#1  0x00007f5555679f5d in qemu_cond_wait (cond=0x7f5557ede1e0, 
mutex=0x7f5557ede210) at qemu-thread-posix.c:113
         err = 32597
         __func__ = "qemu_cond_wait"
#2  0x00007f55556b06a1 in vnc_worker_thread_loop (queue=0x7f5557ede1e0) 
at ui/vnc-jobs-async.c:222
         job = 0x7f5557edd510
         entry = 0x0
         tmp = 0x0
         vs = {csock = -1, ds = 0x7f5557e8ec40, dirty = {{0, 0, 0} 
<repeats 2048 times>}, lossy_rect = 0x7f5557edd570, vd = 0x7f54d08ba010, 
need_update = 0, force_update = 0,
           features = 227, absolute = 0, last_x = 0, last_y = 0, 
client_width = 0, client_height = 0, vnc_encoding = 6, major = 0, minor 
= 0, auth = 0,
           challenge = '\000' <repeats 15 times>, info = 0x0, output = 
{capacity = 1230913, offset = 1448, buffer = 0x7f5558176d60 ""}, input = 
{capacity = 0, offset = 0,
             buffer = 0x0}, write_pixels = 0x7f55556b2aaf 
<vnc_write_pixels_generic>, clientds = {flags = 0 '\000', width = 640, 
height = 480, linesize = 1280,
             data = 0x7f54d0c00000 "B\a", pf = {bits_per_pixel = 32 ' ', 
bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 16711680, gmask = 
65280, bmask = 255, amask = 0,
               rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\000', 
ashift = 0 '\000', rmax = 255 '\377', gmax = 255 '\377', bmax = 255 
'\377', amax = 0 '\000',
               rbits = 8 '\b', gbits = 8 '\b', bbits = 8 '\b', abits = 0 
'\000'}}, audio_cap = 0x0, as = {freq = 0, nchannels = 0, fmt = 
AUD_FMT_U8, endianness = 0},
           read_handler = 0, read_handler_expect = 0, modifiers_state = 
'\000' <repeats 255 times>, led = 0x0, abort = false, output_mutex = 
{lock = {__data = {__lock = 0,
                 __count = 0, __owner = 0, __nusers = 0, __kind = 0, 
__spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' 
<repeats 39 times>, __align = 0}},
           bh = 0x0, jobs_buffer = {capacity = 0, offset = 0, buffer = 
0x0}, tight = {type = 0, quality = 255 '\377', compression = 9 '\t', 
pixel24 = 0 '\000', tight = {
               capacity = 0, offset = 0, buffer = 0x0}, tmp = {capacity 
= 0, offset = 0, buffer = 0x0}, zlib = {capacity = 0, offset = 0, buffer 
= 0x0}, gradient = {
               capacity = 0, offset = 0, buffer = 0x0}, levels = {0, 0, 
0, 0}, stream = {{next_in = 0x0, avail_in = 0, total_in = 0, next_out = 
0x0, avail_out = 0,
                 total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, 
zfree = 0, opaque = 0x0, data_type = 0, adler = 0, reserved = 0}, 
{next_in = 0x0, avail_in = 0, total_in = 0,
                 next_out = 0x0, avail_out = 0, total_out = 0, msg = 
0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, 
adler = 0, reserved = 0}, {
                 next_in = 0x0, avail_in = 0, total_in = 0, next_out = 
0x0, avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, 
zfree = 0, opaque = 0x0,
                 data_type = 0, adler = 0, reserved = 0}, {next_in = 
0x0, avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, 
total_out = 0, msg = 0x0, state = 0x0,
                 zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, 
adler = 0, reserved = 0}}}, zlib = {zlib = {capacity = 1229488, offset = 
1152000,
               buffer = 0x7f5557fe9320 ""}, tmp = {capacity = 1230913, 
offset = 20, buffer = 0x7f5558176d60 ""}, stream = {next_in = 
0x7f5558102720 "\030\b ", avail_in = 0,
               total_in = 7658880, next_out = 0x7f5558177308 "", 
avail_out = 1229465, total_out = 49027, msg = 0x0, state = 0x7f55581155e0,
               zalloc = 0x7f55556a622f <vnc_zlib_zalloc>, zfree = 
0x7f55556a626c <vnc_zlib_zfree>, opaque = 0x7f54d08a6810, data_type = 0, 
adler = 197164569, reserved = 0},
             level = 9}, hextile = {send_tile = 0x7f55556a1180 
<send_hextile_tile_generic_32>}, zrle = {type = 0, fb = {capacity = 0, 
offset = 0, buffer = 0x0}, zrle = {
               capacity = 0, offset = 0, buffer = 0x0}, tmp = {capacity 
= 0, offset = 0, buffer = 0x0}, zlib = {capacity = 0, offset = 0, buffer 
= 0x0}, stream = {next_in = 0x0,
               avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 
0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque 
= 0x0, data_type = 0, adler = 0,
               reserved = 0}, palette = {pool = {{idx = 0, color = 0, 
next = {le_next = 0x0, le_prev = 0x0}} <repeats 256 times>}, size = 0, 
max = 0, bpp = 0, table = {{
                   lh_first = 0x0} <repeats 256 times>}}}, zywrle = {buf 
= {0 <repeats 4096 times>}}, mouse_mode_notifier = {notify = 0, node = 
{tqe_next = 0x0,
               tqe_prev = 0x0}}, next = {tqe_next = 0x0, tqe_prev = 0x0}}
         n_rectangles = 1
         saved_offset = 2
#3  0x00007f55556b0b7f in vnc_worker_thread (arg=0x7f5557ede1e0) at 
ui/vnc-jobs-async.c:318
         queue = 0x7f5557ede1e0
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
No symbol table info available.
#6  0x0000000000000000 in ?? ()
No symbol table info available.

---Type <return> to continue, or q <return> to quit---
Thread 1 (Thread 0x7f555550d700 (LWP 5145)):
#0  0x00007f5553c6b5a3 in select () from /lib/libc.so.6
No symbol table info available.
#1  0x00007f55556516be in main_loop_wait (nonblocking=0) at main-loop.c:456
         rfds = {fds_bits = {14197552, 0 <repeats 15 times>}}
         wfds = {fds_bits = {0 <repeats 16 times>}}
         xfds = {fds_bits = {0 <repeats 16 times>}}
         ret = 32597
         nfds = 23
         tv = {tv_sec = 0, tv_usec = 817147}
         timeout = 1000
#2  0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:1482
         nonblocking = false
         last_io = 1
#3  0x00007f555564c698 in main (argc=38, argv=0x7ffff9d894a8, 
envp=0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523
         gdbstub_dev = 0x0
         i = 64
         snapshot = 0
         linux_boot = 0
         icount_option = 0x0
         initrd_filename = 0x0
         kernel_filename = 0x0
         kernel_cmdline = 0x7f55557d8bef ""
         boot_devices = "dc", '\000' <repeats 30 times>
         ds = 0x7f5557e8ec40
         dcl = 0x0
         cyls = 0
         heads = 0
         secs = 0
         translation = 0
         hda_opts = 0x0
         opts = 0x7f5557637cb0
         olist = 0x7ffff9d89268
         optind = 38
         optarg = 0x7ffff9d89bed "cirrus"
         loadvm = 0x0
         machine = 0x7f5555b53500
         cpu_model = 0x7ffff9d89b6f "host,+x2apic,model_id=Intel(R) 
Xeon(R) CPU", ' ' <repeats 11 times>, "L5640  @ 2.27GHz,-tsc"
         pid_file = 0x7ffff9d89b43 "/var/run/qemu/vm-221.pid"
         incoming = 0x0
         show_vnc_port = 0
         defconfig = 1
         log_mask = 0x0
         log_file = 0x0
         mem_trace = {malloc = 0x7f5555649081 <malloc_and_trace>, 
realloc = 0x7f55556490b6 <realloc_and_trace>, free = 0x7f55556490fa 
<free_and_trace>, calloc = 0,
---Type <return> to continue, or q <return> to quit---
           try_malloc = 0, try_realloc = 0}
         trace_events = 0x0
         trace_file = 0x0
(gdb)
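
As a side note, the ioctl type shown in the two vcpu backtraces,
kvm_vcpu_ioctl(..., type=44672), decodes to KVM_RUN (44672 == 0xAE80 ==
_IO(KVMIO, 0x80)), so both vcpu threads are sitting in the KVM_RUN ioctl.
A quick standalone check, assuming the kernel headers are installed:

    #include <stdio.h>
    #include <linux/kvm.h>

    int main(void)
    {
        /* KVM_RUN is _IO(KVMIO, 0x80) in linux/kvm.h */
        printf("KVM_RUN = %lu (0x%lx)\n",
               (unsigned long)KVM_RUN, (unsigned long)KVM_RUN);
        printf("matches type=44672 from the backtrace: %d\n", KVM_RUN == 44672);
        return 0;
    }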


* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-27 15:39 race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1 Peter Lieven
@ 2012-06-27 16:54 ` Jan Kiszka
  2012-06-28  9:11   ` Peter Lieven
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Kiszka @ 2012-06-27 16:54 UTC (permalink / raw)
  To: Peter Lieven; +Cc: qemu-devel, kvm, Gleb Natapov

On 2012-06-27 17:39, Peter Lieven wrote:
> Hi all,
> 
> i debugged this further and found out that kvm-kmod-3.0 is working with
> qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is
> working as well is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
> Has anyone a clue which new KVM feature could cause this if a vcpu is in
> an infinite loop?

Before accusing kvm-kmod ;), can you check if the effect is visible with
an original Linux 3.3.x or 3.4.x kernel as well?

Then, bisecting the change in qemu-kvm that apparently resolved the
issue would be interesting.

If we have to dig deeper, tracing [1] the lockup would likely be helpful
(all events of the qemu process, not just KVM related ones: trace-cmd
record -e all qemu-system-x86_64 ...).

Jan

[1] http://www.linux-kvm.org/page/Tracing

> ---
> 
> Hi,
> 
> we recently came across multiple VMs racing and stopping working. It
> seems to happen when the system is at 100% cpu.
> One way to reproduce this is:
> qemu-kvm-1.0.1 with vnc-thread enabled
> 
> cmdline (or similar):
> /usr/bin/qemu-kvm-1.0.1 -net
> tap,vlan=141,script=no,downscript=no,ifname=tap15,vnet_hdr -net
> nic,vlan=141,model=virtio,macaddr=52:54:00:ff:00:f7 -drive
> format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-efdf4e007-16700198c7f4fead-02-debug-race-hd01,if=virtio,cache=none,aio=native
> -m 2048 -smp 2,sockets=1,cores=2,threads=1 -monitor
> tcp:0:4026,server,nowait -vnc :26 -qmp tcp:0:3026,server,nowait -name
> 02-debug-race -boot order=dc,menu=off -cdrom
> /home/kvm/cdrom//root/ubuntu-12.04-server-amd64.iso -k de -pidfile
> /var/run/qemu/vm-221.pid -mem-prealloc -cpu
> host,+x2apic,model_id=Intel(R) Xeon(R) CPU           L5640  @
> 2.27GHz,-tsc -rtc base=utc -usb -usbdevice tablet -no-hpet -vga cirrus
> 
> it is important that the attached virtio image contains only zeroes. if
> the system boots from cd, select boot from first harddisk.
> the hypervisor then hangs at 100% cpu and neither monitor nor qmp are
> responsive anymore.
> 
> i have also seen customers reporting this when a VM is shut down.
> 
> if this is connected to the threaded vnc server it might be important to
> connected at this time.
> 
> debug backtrace attached.
> 
> Thanks,
> Peter
> 
> -- 
> 
> (gdb) file /usr/bin/qemu-kvm-1.0.1
> Reading symbols from /usr/bin/qemu-kvm-1.0.1...done.
> (gdb) attach 5145
> Attaching to program: /usr/bin/qemu-kvm-1.0.1, process 5145
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
> found)...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> [Thread debugging using libthread_db enabled]
> [New Thread 0x7f54d08b9700 (LWP 5253)]
> [New Thread 0x7f5552757700 (LWP 5152)]
> [New Thread 0x7f5552f58700 (LWP 5151)]
> 0x00007f5553c6b5a3 in select () from /lib/libc.so.6
> (gdb) info threads
>   4 Thread 0x7f5552f58700 (LWP 5151)  0x00007f5553c6a747 in ioctl ()
> from /lib/libc.so.6
>   3 Thread 0x7f5552757700 (LWP 5152)  0x00007f5553c6a747 in ioctl ()
> from /lib/libc.so.6
>   2 Thread 0x7f54d08b9700 (LWP 5253)  0x00007f5553f1a85c in
> pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
> * 1 Thread 0x7f555550d700 (LWP 5145)  0x00007f5553c6b5a3 in select ()
> from /lib/libc.so.6
> (gdb) thread apply all bt
> 
> Thread 4 (Thread 0x7f5552f58700 (LWP 5151)):
> #0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
> #1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f5557652f10,
> type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
> #2  0x00007f555572728a in kvm_cpu_exec (env=0x7f5557652f10) at
> /usr/src/qemu-kvm-1.0.1/kvm-all.c:987
> #3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f5557652f10) at
> /usr/src/qemu-kvm-1.0.1/cpus.c:740
> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
> #6  0x0000000000000000 in ?? ()
> 
> Thread 3 (Thread 0x7f5552757700 (LWP 5152)):
> #0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
> #1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f555766ae60,
> type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
> #2  0x00007f555572728a in kvm_cpu_exec (env=0x7f555766ae60) at
> /usr/src/qemu-kvm-1.0.1/kvm-all.c:987
> #3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f555766ae60) at
> /usr/src/qemu-kvm-1.0.1/cpus.c:740
> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
> #6  0x0000000000000000 in ?? ()
> 
> Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)):
> #0  0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib/libpthread.so.0
> #1  0x00007f5555679f5d in qemu_cond_wait (cond=0x7f5557ede1e0,
> mutex=0x7f5557ede210) at qemu-thread-posix.c:113
> #2  0x00007f55556b06a1 in vnc_worker_thread_loop (queue=0x7f5557ede1e0)
> at ui/vnc-jobs-async.c:222
> #3  0x00007f55556b0b7f in vnc_worker_thread (arg=0x7f5557ede1e0) at
> ui/vnc-jobs-async.c:318
> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
> #6  0x0000000000000000 in ?? ()
> 
> Thread 1 (Thread 0x7f555550d700 (LWP 5145)):
> #0  0x00007f5553c6b5a3 in select () from /lib/libc.so.6
> #1  0x00007f55556516be in main_loop_wait (nonblocking=0) at main-loop.c:456
> #2  0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:1482
> #3  0x00007f555564c698 in main (argc=38, argv=0x7ffff9d894a8,
> envp=0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523
> (gdb) thread apply all bt full
> 
> Thread 4 (Thread 0x7f5552f58700 (LWP 5151)):
> #0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
> No symbol table info available.
> #1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f5557652f10,
> type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
>         ret = 32597
>         arg = 0x0
>         ap = {{gp_offset = 24, fp_offset = 48, overflow_arg_area =
> 0x7f5552f57e50, reg_save_area = 0x7f5552f57d90}}
> #2  0x00007f555572728a in kvm_cpu_exec (env=0x7f5557652f10) at
> /usr/src/qemu-kvm-1.0.1/kvm-all.c:987
>         run = 0x7f55553e2000
>         ret = 0
>         run_ret = 0
> #3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f5557652f10) at
> /usr/src/qemu-kvm-1.0.1/cpus.c:740
>         env = 0x7f5557652f10
>         r = 0
> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
> No symbol table info available.
> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
> No symbol table info available.
> #6  0x0000000000000000 in ?? ()
> No symbol table info available.
> 
> Thread 3 (Thread 0x7f5552757700 (LWP 5152)):
> #0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
> No symbol table info available.
> #1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f555766ae60,
> type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
>         ret = 0
>         arg = 0x0
>         ap = {{gp_offset = 24, fp_offset = 48, overflow_arg_area =
> 0x7f5552756e50, reg_save_area = 0x7f5552756d90}}
> #2  0x00007f555572728a in kvm_cpu_exec (env=0x7f555766ae60) at
> /usr/src/qemu-kvm-1.0.1/kvm-all.c:987
>         run = 0x7f55553df000
>         ret = 32597
>         run_ret = 1433358864
> #3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f555766ae60) at
> /usr/src/qemu-kvm-1.0.1/cpus.c:740
>         env = 0x7f555766ae60
>         r = 65536
> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
> No symbol table info available.
> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
> No symbol table info available.
> #6  0x0000000000000000 in ?? ()
> No symbol table info available.
> 
> Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)):
> #0  0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from
> /lib/libpthread.so.0
> ---Type <return> to continue, or q <return> to quit---
> No symbol table info available.
> #1  0x00007f5555679f5d in qemu_cond_wait (cond=0x7f5557ede1e0,
> mutex=0x7f5557ede210) at qemu-thread-posix.c:113
>         err = 32597
>         __func__ = "qemu_cond_wait"
> #2  0x00007f55556b06a1 in vnc_worker_thread_loop (queue=0x7f5557ede1e0)
> at ui/vnc-jobs-async.c:222
>         job = 0x7f5557edd510
>         entry = 0x0
>         tmp = 0x0
>         vs = {csock = -1, ds = 0x7f5557e8ec40, dirty = {{0, 0, 0}
> <repeats 2048 times>}, lossy_rect = 0x7f5557edd570, vd = 0x7f54d08ba010,
> need_update = 0, force_update = 0,
>           features = 227, absolute = 0, last_x = 0, last_y = 0,
> client_width = 0, client_height = 0, vnc_encoding = 6, major = 0, minor
> = 0, auth = 0,
>           challenge = '\000' <repeats 15 times>, info = 0x0, output =
> {capacity = 1230913, offset = 1448, buffer = 0x7f5558176d60 ""}, input =
> {capacity = 0, offset = 0,
>             buffer = 0x0}, write_pixels = 0x7f55556b2aaf
> <vnc_write_pixels_generic>, clientds = {flags = 0 '\000', width = 640,
> height = 480, linesize = 1280,
>             data = 0x7f54d0c00000 "B\a", pf = {bits_per_pixel = 32 ' ',
> bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 16711680, gmask =
> 65280, bmask = 255, amask = 0,
>               rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\000',
> ashift = 0 '\000', rmax = 255 '\377', gmax = 255 '\377', bmax = 255
> '\377', amax = 0 '\000',
>               rbits = 8 '\b', gbits = 8 '\b', bbits = 8 '\b', abits = 0
> '\000'}}, audio_cap = 0x0, as = {freq = 0, nchannels = 0, fmt =
> AUD_FMT_U8, endianness = 0},
>           read_handler = 0, read_handler_expect = 0, modifiers_state =
> '\000' <repeats 255 times>, led = 0x0, abort = false, output_mutex =
> {lock = {__data = {__lock = 0,
>                 __count = 0, __owner = 0, __nusers = 0, __kind = 0,
> __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000'
> <repeats 39 times>, __align = 0}},
>           bh = 0x0, jobs_buffer = {capacity = 0, offset = 0, buffer =
> 0x0}, tight = {type = 0, quality = 255 '\377', compression = 9 '\t',
> pixel24 = 0 '\000', tight = {
>               capacity = 0, offset = 0, buffer = 0x0}, tmp = {capacity =
> 0, offset = 0, buffer = 0x0}, zlib = {capacity = 0, offset = 0, buffer =
> 0x0}, gradient = {
>               capacity = 0, offset = 0, buffer = 0x0}, levels = {0, 0,
> 0, 0}, stream = {{next_in = 0x0, avail_in = 0, total_in = 0, next_out =
> 0x0, avail_out = 0,
>                 total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree
> = 0, opaque = 0x0, data_type = 0, adler = 0, reserved = 0}, {next_in =
> 0x0, avail_in = 0, total_in = 0,
>                 next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0,
> state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler =
> 0, reserved = 0}, {
>                 next_in = 0x0, avail_in = 0, total_in = 0, next_out =
> 0x0, avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0,
> zfree = 0, opaque = 0x0,
>                 data_type = 0, adler = 0, reserved = 0}, {next_in = 0x0,
> avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out =
> 0, msg = 0x0, state = 0x0,
>                 zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0,
> adler = 0, reserved = 0}}}, zlib = {zlib = {capacity = 1229488, offset =
> 1152000,
>               buffer = 0x7f5557fe9320 ""}, tmp = {capacity = 1230913,
> offset = 20, buffer = 0x7f5558176d60 ""}, stream = {next_in =
> 0x7f5558102720 "\030\b ", avail_in = 0,
>               total_in = 7658880, next_out = 0x7f5558177308 "",
> avail_out = 1229465, total_out = 49027, msg = 0x0, state = 0x7f55581155e0,
>               zalloc = 0x7f55556a622f <vnc_zlib_zalloc>, zfree =
> 0x7f55556a626c <vnc_zlib_zfree>, opaque = 0x7f54d08a6810, data_type = 0,
> adler = 197164569, reserved = 0},
>             level = 9}, hextile = {send_tile = 0x7f55556a1180
> <send_hextile_tile_generic_32>}, zrle = {type = 0, fb = {capacity = 0,
> offset = 0, buffer = 0x0}, zrle = {
>               capacity = 0, offset = 0, buffer = 0x0}, tmp = {capacity =
> 0, offset = 0, buffer = 0x0}, zlib = {capacity = 0, offset = 0, buffer =
> 0x0}, stream = {next_in = 0x0,
>               avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0,
> total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque =
> 0x0, data_type = 0, adler = 0,
>               reserved = 0}, palette = {pool = {{idx = 0, color = 0,
> next = {le_next = 0x0, le_prev = 0x0}} <repeats 256 times>}, size = 0,
> max = 0, bpp = 0, table = {{
>                   lh_first = 0x0} <repeats 256 times>}}}, zywrle = {buf
> = {0 <repeats 4096 times>}}, mouse_mode_notifier = {notify = 0, node =
> {tqe_next = 0x0,
>               tqe_prev = 0x0}}, next = {tqe_next = 0x0, tqe_prev = 0x0}}
>         n_rectangles = 1
>         saved_offset = 2
> #3  0x00007f55556b0b7f in vnc_worker_thread (arg=0x7f5557ede1e0) at
> ui/vnc-jobs-async.c:318
>         queue = 0x7f5557ede1e0
> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
> No symbol table info available.
> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
> No symbol table info available.
> #6  0x0000000000000000 in ?? ()
> No symbol table info available.
> 
> ---Type <return> to continue, or q <return> to quit---
> Thread 1 (Thread 0x7f555550d700 (LWP 5145)):
> #0  0x00007f5553c6b5a3 in select () from /lib/libc.so.6
> No symbol table info available.
> #1  0x00007f55556516be in main_loop_wait (nonblocking=0) at main-loop.c:456
>         rfds = {fds_bits = {14197552, 0 <repeats 15 times>}}
>         wfds = {fds_bits = {0 <repeats 16 times>}}
>         xfds = {fds_bits = {0 <repeats 16 times>}}
>         ret = 32597
>         nfds = 23
>         tv = {tv_sec = 0, tv_usec = 817147}
>         timeout = 1000
> #2  0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:1482
>         nonblocking = false
>         last_io = 1
> #3  0x00007f555564c698 in main (argc=38, argv=0x7ffff9d894a8,
> envp=0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523
>         gdbstub_dev = 0x0
>         i = 64
>         snapshot = 0
>         linux_boot = 0
>         icount_option = 0x0
>         initrd_filename = 0x0
>         kernel_filename = 0x0
>         kernel_cmdline = 0x7f55557d8bef ""
>         boot_devices = "dc", '\000' <repeats 30 times>
>         ds = 0x7f5557e8ec40
>         dcl = 0x0
>         cyls = 0
>         heads = 0
>         secs = 0
>         translation = 0
>         hda_opts = 0x0
>         opts = 0x7f5557637cb0
>         olist = 0x7ffff9d89268
>         optind = 38
>         optarg = 0x7ffff9d89bed "cirrus"
>         loadvm = 0x0
>         machine = 0x7f5555b53500
>         cpu_model = 0x7ffff9d89b6f "host,+x2apic,model_id=Intel(R)
> Xeon(R) CPU", ' ' <repeats 11 times>, "L5640  @ 2.27GHz,-tsc"
>         pid_file = 0x7ffff9d89b43 "/var/run/qemu/vm-221.pid"
>         incoming = 0x0
>         show_vnc_port = 0
>         defconfig = 1
>         log_mask = 0x0
>         log_file = 0x0
>         mem_trace = {malloc = 0x7f5555649081 <malloc_and_trace>, realloc
> = 0x7f55556490b6 <realloc_and_trace>, free = 0x7f55556490fa
> <free_and_trace>, calloc = 0,
> ---Type <return> to continue, or q <return> to quit---
>           try_malloc = 0, try_realloc = 0}
>         trace_events = 0x0
>         trace_file = 0x0
> (gdb)
> 
> 
> 







* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-27 16:54 ` Jan Kiszka
@ 2012-06-28  9:11   ` Peter Lieven
  2012-06-28  9:21     ` Jan Kiszka
  2012-07-05  8:51     ` Xiao Guangrong
  0 siblings, 2 replies; 20+ messages in thread
From: Peter Lieven @ 2012-06-28  9:11 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: qemu-devel, kvm, Gleb Natapov

On 27.06.2012 18:54, Jan Kiszka wrote:
> On 2012-06-27 17:39, Peter Lieven wrote:
>> Hi all,
>>
>> i debugged this further and found out that kvm-kmod-3.0 is working with
>> qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is
>> working as well is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
>> Has anyone a clue which new KVM feature could cause this if a vcpu is in
>> an infinite loop?
> Before accusing kvm-kmod ;), can you check if the effect is visible with
> an original Linux 3.3.x or 3.4.x kernel as well?
Sorry, I should have been more specific; maybe I also misunderstood something.
I was assuming that kvm-kmod-3.0 is basically what is in the vanilla 3.0
kernel. If I use the Ubuntu kernel from Ubuntu Oneiric (3.0.0) it works; if
I use a self-compiled kvm-kmod-3.3/3.4 with that kernel, it doesn't.
However, maybe we don't have to dig too deep - see below.
> Then, bisection the change in qemu-kvm that apparently resolved the
> issue would be interesting.
>
> If we have to dig deeper, tracing [1] the lockup would likely be helpful
> (all events of the qemu process, not just KVM related ones: trace-cmd
> record -e all qemu-system-x86_64 ...).
Here is basically what's going on:

   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
   qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
   qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
   qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
   qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)

It's doing that forever. This is tracing the kvm module; doing the
qemu-system-x86_64 trace is a bit complicated, but maybe this is already
sufficient. Otherwise I will of course gather this info as well.

thanks
peter


> Jan
>
> [1] http://www.linux-kvm.org/page/Tracing
>
>> ---
>>
>> Hi,
>>
>> we recently came across multiple VMs racing and stopping working. It
>> seems to happen when the system is at 100% cpu.
>> One way to reproduce this is:
>> qemu-kvm-1.0.1 with vnc-thread enabled
>>
>> cmdline (or similar):
>> /usr/bin/qemu-kvm-1.0.1 -net
>> tap,vlan=141,script=no,downscript=no,ifname=tap15,vnet_hdr -net
>> nic,vlan=141,model=virtio,macaddr=52:54:00:ff:00:f7 -drive
>> format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-efdf4e007-16700198c7f4fead-02-debug-race-hd01,if=virtio,cache=none,aio=native
>> -m 2048 -smp 2,sockets=1,cores=2,threads=1 -monitor
>> tcp:0:4026,server,nowait -vnc :26 -qmp tcp:0:3026,server,nowait -name
>> 02-debug-race -boot order=dc,menu=off -cdrom
>> /home/kvm/cdrom//root/ubuntu-12.04-server-amd64.iso -k de -pidfile
>> /var/run/qemu/vm-221.pid -mem-prealloc -cpu
>> host,+x2apic,model_id=Intel(R) Xeon(R) CPU           L5640  @
>> 2.27GHz,-tsc -rtc base=utc -usb -usbdevice tablet -no-hpet -vga cirrus
>>
>> it is important that the attached virtio image contains only zeroes. if
>> the system boots from cd, select boot from first harddisk.
>> the hypervisor then hangs at 100% cpu and neither monitor nor qmp are
>> responsive anymore.
>>
>> i have also seen customers reporting this when a VM is shut down.
>>
>> if this is connected to the threaded vnc server it might be important to
>> connected at this time.
>>
>> debug backtrace attached.
>>
>> Thanks,
>> Peter
>>
>> -- 
>>
>> (gdb) file /usr/bin/qemu-kvm-1.0.1
>> Reading symbols from /usr/bin/qemu-kvm-1.0.1...done.
>> (gdb) attach 5145
>> Attaching to program: /usr/bin/qemu-kvm-1.0.1, process 5145
>> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib64/ld-linux-x86-64.so.2
>> [Thread debugging using libthread_db enabled]
>> [New Thread 0x7f54d08b9700 (LWP 5253)]
>> [New Thread 0x7f5552757700 (LWP 5152)]
>> [New Thread 0x7f5552f58700 (LWP 5151)]
>> 0x00007f5553c6b5a3 in select () from /lib/libc.so.6
>> (gdb) info threads
>>    4 Thread 0x7f5552f58700 (LWP 5151)  0x00007f5553c6a747 in ioctl ()
>> from /lib/libc.so.6
>>    3 Thread 0x7f5552757700 (LWP 5152)  0x00007f5553c6a747 in ioctl ()
>> from /lib/libc.so.6
>>    2 Thread 0x7f54d08b9700 (LWP 5253)  0x00007f5553f1a85c in
>> pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
>> * 1 Thread 0x7f555550d700 (LWP 5145)  0x00007f5553c6b5a3 in select ()
>> from /lib/libc.so.6
>> (gdb) thread apply all bt
>>
>> Thread 4 (Thread 0x7f5552f58700 (LWP 5151)):
>> #0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
>> #1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f5557652f10,
>> type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
>> #2  0x00007f555572728a in kvm_cpu_exec (env=0x7f5557652f10) at
>> /usr/src/qemu-kvm-1.0.1/kvm-all.c:987
>> #3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f5557652f10) at
>> /usr/src/qemu-kvm-1.0.1/cpus.c:740
>> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
>> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
>> #6  0x0000000000000000 in ?? ()
>>
>> Thread 3 (Thread 0x7f5552757700 (LWP 5152)):
>> #0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
>> #1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f555766ae60,
>> type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
>> #2  0x00007f555572728a in kvm_cpu_exec (env=0x7f555766ae60) at
>> /usr/src/qemu-kvm-1.0.1/kvm-all.c:987
>> #3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f555766ae60) at
>> /usr/src/qemu-kvm-1.0.1/cpus.c:740
>> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
>> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
>> #6  0x0000000000000000 in ?? ()
>>
>> Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)):
>> #0  0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/libpthread.so.0
>> #1  0x00007f5555679f5d in qemu_cond_wait (cond=0x7f5557ede1e0,
>> mutex=0x7f5557ede210) at qemu-thread-posix.c:113
>> #2  0x00007f55556b06a1 in vnc_worker_thread_loop (queue=0x7f5557ede1e0)
>> at ui/vnc-jobs-async.c:222
>> #3  0x00007f55556b0b7f in vnc_worker_thread (arg=0x7f5557ede1e0) at
>> ui/vnc-jobs-async.c:318
>> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
>> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
>> #6  0x0000000000000000 in ?? ()
>>
>> Thread 1 (Thread 0x7f555550d700 (LWP 5145)):
>> #0  0x00007f5553c6b5a3 in select () from /lib/libc.so.6
>> #1  0x00007f55556516be in main_loop_wait (nonblocking=0) at main-loop.c:456
>> #2  0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:1482
>> #3  0x00007f555564c698 in main (argc=38, argv=0x7ffff9d894a8,
>> envp=0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523
>> (gdb) thread apply all bt full
>>
>> Thread 4 (Thread 0x7f5552f58700 (LWP 5151)):
>> #0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
>> No symbol table info available.
>> #1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f5557652f10,
>> type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
>>          ret = 32597
>>          arg = 0x0
>>          ap = {{gp_offset = 24, fp_offset = 48, overflow_arg_area =
>> 0x7f5552f57e50, reg_save_area = 0x7f5552f57d90}}
>> #2  0x00007f555572728a in kvm_cpu_exec (env=0x7f5557652f10) at
>> /usr/src/qemu-kvm-1.0.1/kvm-all.c:987
>>          run = 0x7f55553e2000
>>          ret = 0
>>          run_ret = 0
>> #3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f5557652f10) at
>> /usr/src/qemu-kvm-1.0.1/cpus.c:740
>>          env = 0x7f5557652f10
>>          r = 0
>> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
>> No symbol table info available.
>> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
>> No symbol table info available.
>> #6  0x0000000000000000 in ?? ()
>> No symbol table info available.
>>
>> Thread 3 (Thread 0x7f5552757700 (LWP 5152)):
>> #0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
>> No symbol table info available.
>> #1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f555766ae60,
>> type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
>>          ret = 0
>>          arg = 0x0
>>          ap = {{gp_offset = 24, fp_offset = 48, overflow_arg_area =
>> 0x7f5552756e50, reg_save_area = 0x7f5552756d90}}
>> #2  0x00007f555572728a in kvm_cpu_exec (env=0x7f555766ae60) at
>> /usr/src/qemu-kvm-1.0.1/kvm-all.c:987
>>          run = 0x7f55553df000
>>          ret = 32597
>>          run_ret = 1433358864
>> #3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f555766ae60) at
>> /usr/src/qemu-kvm-1.0.1/cpus.c:740
>>          env = 0x7f555766ae60
>>          r = 65536
>> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
>> No symbol table info available.
>> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
>> No symbol table info available.
>> #6  0x0000000000000000 in ?? ()
>> No symbol table info available.
>>
>> Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)):
>> #0  0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/libpthread.so.0
>> ---Type<return>  to continue, or q<return>  to quit---
>> No symbol table info available.
>> #1  0x00007f5555679f5d in qemu_cond_wait (cond=0x7f5557ede1e0,
>> mutex=0x7f5557ede210) at qemu-thread-posix.c:113
>>          err = 32597
>>          __func__ = "qemu_cond_wait"
>> #2  0x00007f55556b06a1 in vnc_worker_thread_loop (queue=0x7f5557ede1e0)
>> at ui/vnc-jobs-async.c:222
>>          job = 0x7f5557edd510
>>          entry = 0x0
>>          tmp = 0x0
>>          vs = {csock = -1, ds = 0x7f5557e8ec40, dirty = {{0, 0, 0}
>> <repeats 2048 times>}, lossy_rect = 0x7f5557edd570, vd = 0x7f54d08ba010,
>> need_update = 0, force_update = 0,
>>            features = 227, absolute = 0, last_x = 0, last_y = 0,
>> client_width = 0, client_height = 0, vnc_encoding = 6, major = 0, minor
>> = 0, auth = 0,
>>            challenge = '\000'<repeats 15 times>, info = 0x0, output =
>> {capacity = 1230913, offset = 1448, buffer = 0x7f5558176d60 ""}, input =
>> {capacity = 0, offset = 0,
>>              buffer = 0x0}, write_pixels = 0x7f55556b2aaf
>> <vnc_write_pixels_generic>, clientds = {flags = 0 '\000', width = 640,
>> height = 480, linesize = 1280,
>>              data = 0x7f54d0c00000 "B\a", pf = {bits_per_pixel = 32 ' ',
>> bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 16711680, gmask =
>> 65280, bmask = 255, amask = 0,
>>                rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\000',
>> ashift = 0 '\000', rmax = 255 '\377', gmax = 255 '\377', bmax = 255
>> '\377', amax = 0 '\000',
>>                rbits = 8 '\b', gbits = 8 '\b', bbits = 8 '\b', abits = 0
>> '\000'}}, audio_cap = 0x0, as = {freq = 0, nchannels = 0, fmt =
>> AUD_FMT_U8, endianness = 0},
>>            read_handler = 0, read_handler_expect = 0, modifiers_state =
>> '\000'<repeats 255 times>, led = 0x0, abort = false, output_mutex =
>> {lock = {__data = {__lock = 0,
>>                  __count = 0, __owner = 0, __nusers = 0, __kind = 0,
>> __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000'
>> <repeats 39 times>, __align = 0}},
>>            bh = 0x0, jobs_buffer = {capacity = 0, offset = 0, buffer =
>> 0x0}, tight = {type = 0, quality = 255 '\377', compression = 9 '\t',
>> pixel24 = 0 '\000', tight = {
>>                capacity = 0, offset = 0, buffer = 0x0}, tmp = {capacity =
>> 0, offset = 0, buffer = 0x0}, zlib = {capacity = 0, offset = 0, buffer =
>> 0x0}, gradient = {
>>                capacity = 0, offset = 0, buffer = 0x0}, levels = {0, 0,
>> 0, 0}, stream = {{next_in = 0x0, avail_in = 0, total_in = 0, next_out =
>> 0x0, avail_out = 0,
>>                  total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree
>> = 0, opaque = 0x0, data_type = 0, adler = 0, reserved = 0}, {next_in =
>> 0x0, avail_in = 0, total_in = 0,
>>                  next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0,
>> state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler =
>> 0, reserved = 0}, {
>>                  next_in = 0x0, avail_in = 0, total_in = 0, next_out =
>> 0x0, avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0,
>> zfree = 0, opaque = 0x0,
>>                  data_type = 0, adler = 0, reserved = 0}, {next_in = 0x0,
>> avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out =
>> 0, msg = 0x0, state = 0x0,
>>                  zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0,
>> adler = 0, reserved = 0}}}, zlib = {zlib = {capacity = 1229488, offset =
>> 1152000,
>>                buffer = 0x7f5557fe9320 ""}, tmp = {capacity = 1230913,
>> offset = 20, buffer = 0x7f5558176d60 ""}, stream = {next_in =
>> 0x7f5558102720 "\030\b ", avail_in = 0,
>>                total_in = 7658880, next_out = 0x7f5558177308 "",
>> avail_out = 1229465, total_out = 49027, msg = 0x0, state = 0x7f55581155e0,
>>                zalloc = 0x7f55556a622f<vnc_zlib_zalloc>, zfree =
>> 0x7f55556a626c<vnc_zlib_zfree>, opaque = 0x7f54d08a6810, data_type = 0,
>> adler = 197164569, reserved = 0},
>>              level = 9}, hextile = {send_tile = 0x7f55556a1180
>> <send_hextile_tile_generic_32>}, zrle = {type = 0, fb = {capacity = 0,
>> offset = 0, buffer = 0x0}, zrle = {
>>                capacity = 0, offset = 0, buffer = 0x0}, tmp = {capacity =
>> 0, offset = 0, buffer = 0x0}, zlib = {capacity = 0, offset = 0, buffer =
>> 0x0}, stream = {next_in = 0x0,
>>                avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0,
>> total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque =
>> 0x0, data_type = 0, adler = 0,
>>                reserved = 0}, palette = {pool = {{idx = 0, color = 0,
>> next = {le_next = 0x0, le_prev = 0x0}}<repeats 256 times>}, size = 0,
>> max = 0, bpp = 0, table = {{
>>                    lh_first = 0x0}<repeats 256 times>}}}, zywrle = {buf
>> = {0<repeats 4096 times>}}, mouse_mode_notifier = {notify = 0, node =
>> {tqe_next = 0x0,
>>                tqe_prev = 0x0}}, next = {tqe_next = 0x0, tqe_prev = 0x0}}
>>          n_rectangles = 1
>>          saved_offset = 2
>> #3  0x00007f55556b0b7f in vnc_worker_thread (arg=0x7f5557ede1e0) at
>> ui/vnc-jobs-async.c:318
>>          queue = 0x7f5557ede1e0
>> #4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
>> No symbol table info available.
>> #5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
>> No symbol table info available.
>> #6  0x0000000000000000 in ?? ()
>> No symbol table info available.
>>
>> ---Type<return>  to continue, or q<return>  to quit---
>> Thread 1 (Thread 0x7f555550d700 (LWP 5145)):
>> #0  0x00007f5553c6b5a3 in select () from /lib/libc.so.6
>> No symbol table info available.
>> #1  0x00007f55556516be in main_loop_wait (nonblocking=0) at main-loop.c:456
>>          rfds = {fds_bits = {14197552, 0<repeats 15 times>}}
>>          wfds = {fds_bits = {0<repeats 16 times>}}
>>          xfds = {fds_bits = {0<repeats 16 times>}}
>>          ret = 32597
>>          nfds = 23
>>          tv = {tv_sec = 0, tv_usec = 817147}
>>          timeout = 1000
>> #2  0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:1482
>>          nonblocking = false
>>          last_io = 1
>> #3  0x00007f555564c698 in main (argc=38, argv=0x7ffff9d894a8,
>> envp=0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523
>>          gdbstub_dev = 0x0
>>          i = 64
>>          snapshot = 0
>>          linux_boot = 0
>>          icount_option = 0x0
>>          initrd_filename = 0x0
>>          kernel_filename = 0x0
>>          kernel_cmdline = 0x7f55557d8bef ""
>>          boot_devices = "dc", '\000'<repeats 30 times>
>>          ds = 0x7f5557e8ec40
>>          dcl = 0x0
>>          cyls = 0
>>          heads = 0
>>          secs = 0
>>          translation = 0
>>          hda_opts = 0x0
>>          opts = 0x7f5557637cb0
>>          olist = 0x7ffff9d89268
>>          optind = 38
>>          optarg = 0x7ffff9d89bed "cirrus"
>>          loadvm = 0x0
>>          machine = 0x7f5555b53500
>>          cpu_model = 0x7ffff9d89b6f "host,+x2apic,model_id=Intel(R)
>> Xeon(R) CPU", ' '<repeats 11 times>, "L5640  @ 2.27GHz,-tsc"
>>          pid_file = 0x7ffff9d89b43 "/var/run/qemu/vm-221.pid"
>>          incoming = 0x0
>>          show_vnc_port = 0
>>          defconfig = 1
>>          log_mask = 0x0
>>          log_file = 0x0
>>          mem_trace = {malloc = 0x7f5555649081<malloc_and_trace>, realloc
>> = 0x7f55556490b6<realloc_and_trace>, free = 0x7f55556490fa
>> <free_and_trace>, calloc = 0,
>> ---Type<return>  to continue, or q<return>  to quit---
>>            try_malloc = 0, try_realloc = 0}
>>          trace_events = 0x0
>>          trace_file = 0x0
>> (gdb)
>>
>>
>>
>
>
>



* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-28  9:11   ` Peter Lieven
@ 2012-06-28  9:21     ` Jan Kiszka
  2012-06-28  9:31       ` Peter Lieven
  2012-07-05  8:51     ` Xiao Guangrong
  1 sibling, 1 reply; 20+ messages in thread
From: Jan Kiszka @ 2012-06-28  9:21 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Gleb Natapov, qemu-devel, kvm

On 2012-06-28 11:11, Peter Lieven wrote:
> On 27.06.2012 18:54, Jan Kiszka wrote:
>> On 2012-06-27 17:39, Peter Lieven wrote:
>>> Hi all,
>>>
>>> i debugged this further and found out that kvm-kmod-3.0 is working with
>>> qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is
>>> working as well is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
>>> Has anyone a clue which new KVM feature could cause this if a vcpu is in
>>> an infinite loop?
>> Before accusing kvm-kmod ;), can you check if the effect is visible with
>> an original Linux 3.3.x or 3.4.x kernel as well?
> sorry, i should have been more specific. maybe I also misunderstood sth.
> I was believing that kvm-kmod-3.0 is basically what is in vanialla kernel
> 3.0. If I use the ubuntu kernel from ubuntu oneiric (3.0.0) it works, if
> I use
> a self-compiled kvm-kmod-3.3/3.4 with that kernel it doesn't.
> however, maybe we don't have to dig to deep - see below.

kvm-kmod wraps and patches things to make the kvm code from 3.3/3.4 work
on an older kernel. This step may introduce bugs of its own, hence my
suggestion to use a "real" 3.x kernel first of all, to exclude that risk.

>> Then, bisection the change in qemu-kvm that apparently resolved the
>> issue would be interesting.
>>
>> If we have to dig deeper, tracing [1] the lockup would likely be helpful
>> (all events of the qemu process, not just KVM related ones: trace-cmd
>> record -e all qemu-system-x86_64 ...).
> that here is bascially whats going on:
> 
>   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read
> len 3 gpa 0xa0000 val 0x10ff
>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
> 0xa0000 gpa 0xa0000 Read GPA
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
> KVM_EXIT_MMIO (6)
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
> read len 3 gpa 0xa0000 val 0x10ff
>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
> 0xa0000 gpa 0xa0000 Read GPA
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
> KVM_EXIT_MMIO (6)
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
> read len 3 gpa 0xa0000 val 0x10ff
>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
> 0xa0000 gpa 0xa0000 Read GPA
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
> KVM_EXIT_MMIO (6)
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
> read len 3 gpa 0xa0000 val 0x10ff
>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
> 0xa0000 gpa 0xa0000 Read GPA
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
> KVM_EXIT_MMIO (6)
> 
> its doing that forever. this is tracing the kvm module. doing the
> qemu-system-x86_64 trace is a bit compilcated, but
> maybe this is already sufficient. otherwise i will of course gather this
> info as well.

That's only tracing KVM events, and it's tracing after things have already
gone wrong. We may need a full trace (-e all), specifically for the period
when the pattern above started.

However, let's focus on bisecting at kernel and qemu-kvm level first.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-28  9:21     ` Jan Kiszka
@ 2012-06-28  9:31       ` Peter Lieven
  2012-06-28  9:38         ` Peter Lieven
  2012-06-28  9:39         ` Jan Kiszka
  0 siblings, 2 replies; 20+ messages in thread
From: Peter Lieven @ 2012-06-28  9:31 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: qemu-devel, kvm, Gleb Natapov

On 28.06.2012 11:21, Jan Kiszka wrote:
> On 2012-06-28 11:11, Peter Lieven wrote:
>> On 27.06.2012 18:54, Jan Kiszka wrote:
>>> On 2012-06-27 17:39, Peter Lieven wrote:
>>>> Hi all,
>>>>
>>>> i debugged this further and found out that kvm-kmod-3.0 is working with
>>>> qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is
>>>> working as well is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
>>>> Has anyone a clue which new KVM feature could cause this if a vcpu is in
>>>> an infinite loop?
>>> Before accusing kvm-kmod ;), can you check if the effect is visible with
>>> an original Linux 3.3.x or 3.4.x kernel as well?
>> sorry, i should have been more specific. maybe I also misunderstood sth.
>> I was believing that kvm-kmod-3.0 is basically what is in vanialla kernel
>> 3.0. If I use the ubuntu kernel from ubuntu oneiric (3.0.0) it works, if
>> I use
>> a self-compiled kvm-kmod-3.3/3.4 with that kernel it doesn't.
>> however, maybe we don't have to dig to deep - see below.
> kvm-kmod wraps and patches things to make the kvm code from 3.3/3.4
> working on an older kernel. This step may introduce bugs of its own.
> Therefore my suggestion to use a "real" 3.x kernel to exclude that risk
> first of all.
>
>>> Then, bisection the change in qemu-kvm that apparently resolved the
>>> issue would be interesting.
>>>
>>> If we have to dig deeper, tracing [1] the lockup would likely be helpful
>>> (all events of the qemu process, not just KVM related ones: trace-cmd
>>> record -e all qemu-system-x86_64 ...).
>> that here is bascially whats going on:
>>
>>    qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read
>> len 3 gpa 0xa0000 val 0x10ff
>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>> 0xa0000 gpa 0xa0000 Read GPA
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>> KVM_EXIT_MMIO (6)
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>> read len 3 gpa 0xa0000 val 0x10ff
>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>> 0xa0000 gpa 0xa0000 Read GPA
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>> KVM_EXIT_MMIO (6)
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>> read len 3 gpa 0xa0000 val 0x10ff
>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>> 0xa0000 gpa 0xa0000 Read GPA
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>> KVM_EXIT_MMIO (6)
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>> read len 3 gpa 0xa0000 val 0x10ff
>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>> 0xa0000 gpa 0xa0000 Read GPA
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>> KVM_EXIT_MMIO (6)
>>
>> its doing that forever. this is tracing the kvm module. doing the
>> qemu-system-x86_64 trace is a bit compilcated, but
>> maybe this is already sufficient. otherwise i will of course gather this
>> info as well.
> That's only tracing KVM event, and it's tracing when things went wrong
> already. We may need a full trace (-e all) specifically for the period
> when this pattern above started.
I will do that. Maybe I should explain that the vcpu is executing garbage
when the pattern above starts: it is basically booting from an empty hard
disk.

If I understand correctly, qemu-kvm loops in kvm_cpu_exec(CPUState *env).

Maybe the time available to handle the monitor/qmp connection is just too
short. If I understand it correctly, qemu-kvm can only handle monitor
connections while the vcpu thread is executing kvm_vcpu_ioctl(env, KVM_RUN, 0),
or am I wrong here? The time spent in this state might be rather short.
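
To make that concrete, here is a toy model I put together (my own simplified
sketch, not qemu code): two threads share one "global mutex", and the
iothread only gets a chance to service the monitor/qmp sockets while the
vcpu thread has dropped the lock around its blocking KVM_RUN ioctl
(simulated here by a sleep):

    /* toy.c - illustration only; build with: gcc -pthread toy.c */
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;

    static void *vcpu_thread(void *arg)
    {
        int i;

        for (i = 0; i < 5; i++) {
            pthread_mutex_lock(&global_mutex);
            printf("vcpu: handling exit %d under the lock\n", i);
            pthread_mutex_unlock(&global_mutex);
            usleep(100 * 1000);   /* "inside KVM_RUN": lock dropped */
        }
        return arg;
    }

    static void *io_thread(void *arg)
    {
        int i;

        for (i = 0; i < 5; i++) {
            pthread_mutex_lock(&global_mutex);
            printf("iothread: servicing monitor/qmp\n");
            pthread_mutex_unlock(&global_mutex);
            usleep(100 * 1000);
        }
        return arg;
    }

    int main(void)
    {
        pthread_t vcpu, io;

        pthread_create(&vcpu, NULL, vcpu_thread, NULL);
        pthread_create(&io, NULL, io_thread, NULL);
        pthread_join(vcpu, NULL);
        pthread_join(io, NULL);
        return 0;
    }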

My concern is not that the machine hangs, just that the hypervisor is
unresponsive and it is impossible to reset or quit it gracefully. The only
way to terminate the hypervisor is via SIGKILL.

thanks
peter

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-28  9:31       ` Peter Lieven
@ 2012-06-28  9:38         ` Peter Lieven
  2012-07-02 15:05           ` Avi Kivity
  2012-06-28  9:39         ` Jan Kiszka
  1 sibling, 1 reply; 20+ messages in thread
From: Peter Lieven @ 2012-06-28  9:38 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: qemu-devel, kvm, Gleb Natapov

Does anyone know what this is here in handle_mmio?

     /* hack: Red Hat 7.1 generates these weird accesses. */
     if ((addr > 0xa0000-4 && addr <= 0xa0000) && kvm_run->mmio.len == 3)
         return 0;

thanks,
peter

On 28.06.2012 11:31, Peter Lieven wrote:
> On 28.06.2012 11:21, Jan Kiszka wrote:
>> On 2012-06-28 11:11, Peter Lieven wrote:
>>> On 27.06.2012 18:54, Jan Kiszka wrote:
>>>> On 2012-06-27 17:39, Peter Lieven wrote:
>>>>> Hi all,
>>>>>
>>>>> i debugged this further and found out that kvm-kmod-3.0 is working 
>>>>> with
>>>>> qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is
>>>>> working as well is kvm-kmod-3.4 with an old userspace 
>>>>> (qemu-kvm-0.13.0).
>>>>> Has anyone a clue which new KVM feature could cause this if a vcpu 
>>>>> is in
>>>>> an infinite loop?
>>>> Before accusing kvm-kmod ;), can you check if the effect is visible 
>>>> with
>>>> an original Linux 3.3.x or 3.4.x kernel as well?
>>> sorry, i should have been more specific. maybe I also misunderstood 
>>> sth.
>>> I was believing that kvm-kmod-3.0 is basically what is in vanialla 
>>> kernel
>>> 3.0. If I use the ubuntu kernel from ubuntu oneiric (3.0.0) it 
>>> works, if
>>> I use
>>> a self-compiled kvm-kmod-3.3/3.4 with that kernel it doesn't.
>>> however, maybe we don't have to dig to deep - see below.
>> kvm-kmod wraps and patches things to make the kvm code from 3.3/3.4
>> working on an older kernel. This step may introduce bugs of its own.
>> Therefore my suggestion to use a "real" 3.x kernel to exclude that risk
>> first of all.
>>
>>>> Then, bisection the change in qemu-kvm that apparently resolved the
>>>> issue would be interesting.
>>>>
>>>> If we have to dig deeper, tracing [1] the lockup would likely be 
>>>> helpful
>>>> (all events of the qemu process, not just KVM related ones: trace-cmd
>>>> record -e all qemu-system-x86_64 ...).
>>> that here is bascially whats going on:
>>>
>>>    qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio 
>>> read
>>> len 3 gpa 0xa0000 val 0x10ff
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>> 0xa0000 gpa 0xa0000 Read GPA
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   
>>> reason
>>> KVM_EXIT_MMIO (6)
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> read len 3 gpa 0xa0000 val 0x10ff
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>> 0xa0000 gpa 0xa0000 Read GPA
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   
>>> reason
>>> KVM_EXIT_MMIO (6)
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> read len 3 gpa 0xa0000 val 0x10ff
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>> 0xa0000 gpa 0xa0000 Read GPA
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   
>>> reason
>>> KVM_EXIT_MMIO (6)
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> read len 3 gpa 0xa0000 val 0x10ff
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>> 0xa0000 gpa 0xa0000 Read GPA
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   
>>> reason
>>> KVM_EXIT_MMIO (6)
>>>
>>> its doing that forever. this is tracing the kvm module. doing the
>>> qemu-system-x86_64 trace is a bit compilcated, but
>>> maybe this is already sufficient. otherwise i will of course gather 
>>> this
>>> info as well.
>> That's only tracing KVM event, and it's tracing when things went wrong
>> already. We may need a full trace (-e all) specifically for the period
>> when this pattern above started.
> i will do that. maybe i should explain that the vcpu is executing
> garbage when this above starts. its basically booting from an empty 
> harddisk.
>
> if i understand correctly qemu-kvm loops in kvm_cpu_exec(CPUState *env);
>
> maybe the time to handle the monitor/qmp connection is just to short.
> if i understand furhter correctly, it can only handle monitor connections
> while qemu-kvm is executing kvm_vcpu_ioctl(env, KVM_RUN, 0); or am i
> wrong here? the time spend in this state might be rather short.
>
> my concern is not that the machine hangs, just the the hypervisor is 
> unresponsive
> and its impossible to reset or quit gracefully. the only way to get 
> the hypervisor
> ended is via SIGKILL.
>
> thanks
> peter


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-28  9:31       ` Peter Lieven
  2012-06-28  9:38         ` Peter Lieven
@ 2012-06-28  9:39         ` Jan Kiszka
  2012-06-28 10:13           ` Peter Lieven
  2012-06-28 10:34           ` Peter Lieven
  1 sibling, 2 replies; 20+ messages in thread
From: Jan Kiszka @ 2012-06-28  9:39 UTC (permalink / raw)
  To: Peter Lieven; +Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Gleb Natapov

On 2012-06-28 11:31, Peter Lieven wrote:
> On 28.06.2012 11:21, Jan Kiszka wrote:
>> On 2012-06-28 11:11, Peter Lieven wrote:
>>> On 27.06.2012 18:54, Jan Kiszka wrote:
>>>> On 2012-06-27 17:39, Peter Lieven wrote:
>>>>> Hi all,
>>>>>
>>>>> i debugged this further and found out that kvm-kmod-3.0 is working with
>>>>> qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is
>>>>> working as well is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
>>>>> Has anyone a clue which new KVM feature could cause this if a vcpu is in
>>>>> an infinite loop?
>>>> Before accusing kvm-kmod ;), can you check if the effect is visible with
>>>> an original Linux 3.3.x or 3.4.x kernel as well?
>>> sorry, i should have been more specific. maybe I also misunderstood sth.
>>> I was believing that kvm-kmod-3.0 is basically what is in vanialla kernel
>>> 3.0. If I use the ubuntu kernel from ubuntu oneiric (3.0.0) it works, if
>>> I use
>>> a self-compiled kvm-kmod-3.3/3.4 with that kernel it doesn't.
>>> however, maybe we don't have to dig to deep - see below.
>> kvm-kmod wraps and patches things to make the kvm code from 3.3/3.4
>> working on an older kernel. This step may introduce bugs of its own.
>> Therefore my suggestion to use a "real" 3.x kernel to exclude that risk
>> first of all.
>>
>>>> Then, bisection the change in qemu-kvm that apparently resolved the
>>>> issue would be interesting.
>>>>
>>>> If we have to dig deeper, tracing [1] the lockup would likely be helpful
>>>> (all events of the qemu process, not just KVM related ones: trace-cmd
>>>> record -e all qemu-system-x86_64 ...).
>>> that here is bascially whats going on:
>>>
>>>    qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read
>>> len 3 gpa 0xa0000 val 0x10ff
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>> 0xa0000 gpa 0xa0000 Read GPA
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>> KVM_EXIT_MMIO (6)
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> read len 3 gpa 0xa0000 val 0x10ff
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>> 0xa0000 gpa 0xa0000 Read GPA
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>> KVM_EXIT_MMIO (6)
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> read len 3 gpa 0xa0000 val 0x10ff
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>> 0xa0000 gpa 0xa0000 Read GPA
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>> KVM_EXIT_MMIO (6)
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> read len 3 gpa 0xa0000 val 0x10ff
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>> 0xa0000 gpa 0xa0000 Read GPA
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>> KVM_EXIT_MMIO (6)
>>>
>>> its doing that forever. this is tracing the kvm module. doing the
>>> qemu-system-x86_64 trace is a bit compilcated, but
>>> maybe this is already sufficient. otherwise i will of course gather this
>>> info as well.
>> That's only tracing KVM event, and it's tracing when things went wrong
>> already. We may need a full trace (-e all) specifically for the period
>> when this pattern above started.
> i will do that. maybe i should explain that the vcpu is executing
> garbage when this above starts. its basically booting from an empty 
> harddisk.
> 
> if i understand correctly qemu-kvm loops in kvm_cpu_exec(CPUState *env);
> 
> maybe the time to handle the monitor/qmp connection is just to short.
> if i understand furhter correctly, it can only handle monitor connections
> while qemu-kvm is executing kvm_vcpu_ioctl(env, KVM_RUN, 0); or am i
> wrong here? the time spend in this state might be rather short.

Unless you played with priorities and affinities, the Linux scheduler
should provide the required time to the iothread.

> 
> my concern is not that the machine hangs, just the the hypervisor is 
> unresponsive
> and its impossible to reset or quit gracefully. the only way to get the 
> hypervisor
> ended is via SIGKILL.

Right. Even if the guest runs wild, you must be able to control the vm
via the monitor etc. If not, that's a bug.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-28  9:39         ` Jan Kiszka
@ 2012-06-28 10:13           ` Peter Lieven
  2012-06-28 10:34           ` Peter Lieven
  1 sibling, 0 replies; 20+ messages in thread
From: Peter Lieven @ 2012-06-28 10:13 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Gleb Natapov

On 28.06.2012 11:39, Jan Kiszka wrote:
> On 2012-06-28 11:31, Peter Lieven wrote:
>> On 28.06.2012 11:21, Jan Kiszka wrote:
>>> On 2012-06-28 11:11, Peter Lieven wrote:
>>>> On 27.06.2012 18:54, Jan Kiszka wrote:
>>>>> On 2012-06-27 17:39, Peter Lieven wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> i debugged this further and found out that kvm-kmod-3.0 is working with
>>>>>> qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is
>>>>>> working as well is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
>>>>>> Has anyone a clue which new KVM feature could cause this if a vcpu is in
>>>>>> an infinite loop?
>>>>> Before accusing kvm-kmod ;), can you check if the effect is visible with
>>>>> an original Linux 3.3.x or 3.4.x kernel as well?
>>>> sorry, i should have been more specific. maybe I also misunderstood sth.
>>>> I was believing that kvm-kmod-3.0 is basically what is in vanialla kernel
>>>> 3.0. If I use the ubuntu kernel from ubuntu oneiric (3.0.0) it works, if
>>>> I use
>>>> a self-compiled kvm-kmod-3.3/3.4 with that kernel it doesn't.
>>>> however, maybe we don't have to dig to deep - see below.
>>> kvm-kmod wraps and patches things to make the kvm code from 3.3/3.4
>>> working on an older kernel. This step may introduce bugs of its own.
>>> Therefore my suggestion to use a "real" 3.x kernel to exclude that risk
>>> first of all.
>>>
>>>>> Then, bisection the change in qemu-kvm that apparently resolved the
>>>>> issue would be interesting.
>>>>>
>>>>> If we have to dig deeper, tracing [1] the lockup would likely be helpful
>>>>> (all events of the qemu process, not just KVM related ones: trace-cmd
>>>>> record -e all qemu-system-x86_64 ...).
>>>> that here is bascially whats going on:
>>>>
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read
>>>> len 3 gpa 0xa0000 val 0x10ff
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>>> 0xa0000 gpa 0xa0000 Read GPA
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>>> KVM_EXIT_MMIO (6)
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> read len 3 gpa 0xa0000 val 0x10ff
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>>> 0xa0000 gpa 0xa0000 Read GPA
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>>> KVM_EXIT_MMIO (6)
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> read len 3 gpa 0xa0000 val 0x10ff
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>>> 0xa0000 gpa 0xa0000 Read GPA
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>>> KVM_EXIT_MMIO (6)
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> read len 3 gpa 0xa0000 val 0x10ff
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>>> 0xa0000 gpa 0xa0000 Read GPA
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>>> KVM_EXIT_MMIO (6)
>>>>
>>>> its doing that forever. this is tracing the kvm module. doing the
>>>> qemu-system-x86_64 trace is a bit compilcated, but
>>>> maybe this is already sufficient. otherwise i will of course gather this
>>>> info as well.
>>> That's only tracing KVM event, and it's tracing when things went wrong
>>> already. We may need a full trace (-e all) specifically for the period
>>> when this pattern above started.
>> i will do that. maybe i should explain that the vcpu is executing
>> garbage when this above starts. its basically booting from an empty
>> harddisk.
>>
>> if i understand correctly qemu-kvm loops in kvm_cpu_exec(CPUState *env);
>>
>> maybe the time to handle the monitor/qmp connection is just to short.
>> if i understand furhter correctly, it can only handle monitor connections
>> while qemu-kvm is executing kvm_vcpu_ioctl(env, KVM_RUN, 0); or am i
>> wrong here? the time spend in this state might be rather short.
> Unless you played with priorities and affinities, the Linux scheduler
> should provide the required time to the iothread.
I have a 1.1 GB (85 MB compressed) trace file. If you have time to
look at it I could drop it somewhere.

We currently run all VMs with nice 1 because we observed that
this improves the controllability of the node in case all VMs
have excessive CPU load. Unfortunately, running the VM without
the nice level does not change the behaviour.

Peter
>> my concern is not that the machine hangs, just the the hypervisor is
>> unresponsive
>> and its impossible to reset or quit gracefully. the only way to get the
>> hypervisor
>> ended is via SIGKILL.
> Right. Even if the guest runs wild, you must be able to control the vm
> via the monitor etc. If not, that's a bug.
>
> Jan
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-28  9:39         ` Jan Kiszka
  2012-06-28 10:13           ` Peter Lieven
@ 2012-06-28 10:34           ` Peter Lieven
  1 sibling, 0 replies; 20+ messages in thread
From: Peter Lieven @ 2012-06-28 10:34 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Gleb Natapov

On 28.06.2012 11:39, Jan Kiszka wrote:
> On 2012-06-28 11:31, Peter Lieven wrote:
>> On 28.06.2012 11:21, Jan Kiszka wrote:
>>> On 2012-06-28 11:11, Peter Lieven wrote:
>>>> On 27.06.2012 18:54, Jan Kiszka wrote:
>>>>> On 2012-06-27 17:39, Peter Lieven wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> i debugged this further and found out that kvm-kmod-3.0 is working with
>>>>>> qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is
>>>>>> working as well is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
>>>>>> Has anyone a clue which new KVM feature could cause this if a vcpu is in
>>>>>> an infinite loop?
>>>>> Before accusing kvm-kmod ;), can you check if the effect is visible with
>>>>> an original Linux 3.3.x or 3.4.x kernel as well?
>>>> sorry, i should have been more specific. maybe I also misunderstood sth.
>>>> I was believing that kvm-kmod-3.0 is basically what is in vanialla kernel
>>>> 3.0. If I use the ubuntu kernel from ubuntu oneiric (3.0.0) it works, if
>>>> I use
>>>> a self-compiled kvm-kmod-3.3/3.4 with that kernel it doesn't.
>>>> however, maybe we don't have to dig to deep - see below.
>>> kvm-kmod wraps and patches things to make the kvm code from 3.3/3.4
>>> working on an older kernel. This step may introduce bugs of its own.
>>> Therefore my suggestion to use a "real" 3.x kernel to exclude that risk
>>> first of all.
>>>
>>>>> Then, bisection the change in qemu-kvm that apparently resolved the
>>>>> issue would be interesting.
>>>>>
>>>>> If we have to dig deeper, tracing [1] the lockup would likely be helpful
>>>>> (all events of the qemu process, not just KVM related ones: trace-cmd
>>>>> record -e all qemu-system-x86_64 ...).
>>>> that here is bascially whats going on:
>>>>
>>>>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read
>>>> len 3 gpa 0xa0000 val 0x10ff
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>>> 0xa0000 gpa 0xa0000 Read GPA
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>>> KVM_EXIT_MMIO (6)
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> read len 3 gpa 0xa0000 val 0x10ff
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>>> 0xa0000 gpa 0xa0000 Read GPA
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>>> KVM_EXIT_MMIO (6)
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> read len 3 gpa 0xa0000 val 0x10ff
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>>> 0xa0000 gpa 0xa0000 Read GPA
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>>> KVM_EXIT_MMIO (6)
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> read len 3 gpa 0xa0000 val 0x10ff
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva
>>>> 0xa0000 gpa 0xa0000 Read GPA
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio
>>>> unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>>>       qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason
>>>> KVM_EXIT_MMIO (6)
>>>>
>>>> its doing that forever. this is tracing the kvm module. doing the
>>>> qemu-system-x86_64 trace is a bit compilcated, but
>>>> maybe this is already sufficient. otherwise i will of course gather this
>>>> info as well.
>>> That's only tracing KVM event, and it's tracing when things went wrong
>>> already. We may need a full trace (-e all) specifically for the period
>>> when this pattern above started.
>> i will do that. maybe i should explain that the vcpu is executing
>> garbage when this above starts. its basically booting from an empty
>> harddisk.
>>
>> if i understand correctly qemu-kvm loops in kvm_cpu_exec(CPUState *env);
>>
>> maybe the time to handle the monitor/qmp connection is just to short.
>> if i understand furhter correctly, it can only handle monitor connections
>> while qemu-kvm is executing kvm_vcpu_ioctl(env, KVM_RUN, 0); or am i
>> wrong here? the time spend in this state might be rather short.
> Unless you played with priorities and affinities, the Linux scheduler
> should provide the required time to the iothread.
>
>> my concern is not that the machine hangs, just the the hypervisor is
>> unresponsive
>> and its impossible to reset or quit gracefully. the only way to get the
>> hypervisor
>> ended is via SIGKILL.
> Right. Even if the guest runs wild, you must be able to control the vm
> via the monitor etc. If not, that's a bug.
What I observed just now is that the monitor is working up to the
point where I try to quit the hypervisor or reset the CPU.

So we were looking at a completely wrong place...

It seems from this short excerpt that the deadlock appears not
during execution but when the vcpus are to be paused.

Program received signal SIGINT, Interrupt.
0x00007fc8ec36785c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib/libpthread.so.0
(gdb) thread apply all bt

Thread 4 (Thread 0x7fc8eb5a9700 (LWP 731)):
#0  0x00007fc8ec0b7747 in ioctl () from /lib/libc.so.6
#1  0x00007fc8ed92eedc in kvm_vcpu_ioctl (env=0x7fc8ef6c4b80, 
type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
#2  0x00007fc8ed92e936 in kvm_cpu_exec (env=0x7fc8ef6c4b80) at 
/usr/src/qemu-kvm-1.0.1/kvm-all.c:987
#3  0x00007fc8ed8fd2b4 in qemu_kvm_cpu_thread_fn (arg=0x7fc8ef6c4b80) at 
/usr/src/qemu-kvm-1.0.1/cpus.c:740
#4  0x00007fc8ec3629ca in start_thread () from /lib/libpthread.so.0
#5  0x00007fc8ec0bfcdd in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7fc8eada8700 (LWP 732)):
#0  0x00007fc8ec36785c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib/libpthread.so.0
#1  0x00007fc8ed882189 in qemu_cond_wait (cond=0x7fc8ef6ead00, 
mutex=0x7fc8ee4f2940) at qemu-thread-posix.c:113
#2  0x00007fc8ed8fd1b7 in qemu_kvm_wait_io_event (env=0x7fc8ef6dcae0) at 
/usr/src/qemu-kvm-1.0.1/cpus.c:710
#3  0x00007fc8ed8fd2d8 in qemu_kvm_cpu_thread_fn (arg=0x7fc8ef6dcae0) at 
/usr/src/qemu-kvm-1.0.1/cpus.c:745
#4  0x00007fc8ec3629ca in start_thread () from /lib/libpthread.so.0
#5  0x00007fc8ec0bfcdd in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7fc8a8fae700 (LWP 733)):
#0  0x00007fc8ec36785c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib/libpthread.so.0
#1  0x00007fc8ed882189 in qemu_cond_wait (cond=0x7fc8efb54550, 
mutex=0x7fc8efb54580) at qemu-thread-posix.c:113
#2  0x00007fc8ed8b7d49 in vnc_worker_thread_loop (queue=0x7fc8efb54550) 
at ui/vnc-jobs-async.c:222
#3  0x00007fc8ed8b8227 in vnc_worker_thread (arg=0x7fc8efb54550) at 
ui/vnc-jobs-async.c:318
#4  0x00007fc8ec3629ca in start_thread () from /lib/libpthread.so.0
#5  0x00007fc8ec0bfcdd in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7fc8ed718700 (LWP 725)):
#0  0x00007fc8ec36785c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib/libpthread.so.0
#1  0x00007fc8ed882189 in qemu_cond_wait (cond=0x7fc8ee16a120, 
mutex=0x7fc8ee4f2940) at qemu-thread-posix.c:113
#2  0x00007fc8ed8fd67b in pause_all_vcpus () at 
/usr/src/qemu-kvm-1.0.1/cpus.c:881
#3  0x00007fc8ed8548cd in main (argc=38, argv=0x7fffcde70a48, 
envp=0x7fffcde70b80) at /usr/src/qemu-kvm-1.0.1/vl.c:3525
(gdb) q

Peter

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-28  9:38         ` Peter Lieven
@ 2012-07-02 15:05           ` Avi Kivity
  2012-07-02 15:57             ` Peter Lieven
  2012-07-03 13:01             ` Peter Lieven
  0 siblings, 2 replies; 20+ messages in thread
From: Avi Kivity @ 2012-07-02 15:05 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Jan Kiszka, qemu-devel, kvm, Gleb Natapov

On 06/28/2012 12:38 PM, Peter Lieven wrote:
> does anyone know whats that here in handle_mmio?
> 
>     /* hack: Red Hat 7.1 generates these weird accesses. */
>     if ((addr > 0xa0000-4 && addr <= 0xa0000) && kvm_run->mmio.len == 3)
>         return 0;
> 

Just what it says.  There is a 4-byte access to address 0x9ffff.  The
first byte lies in RAM, the next three bytes are in mmio.  qemu is
geared to power-of-two accesses even though x86 can generate accesses to
any number of bytes between 1 and 8.

It appears that this has happened with your guest.  It's not impossible
that it's genuine.
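
To make the arithmetic explicit, here is a standalone illustration I put
together (my own sketch, not qemu or kvm code) of how a plain 4-byte read
starting at 0x9ffff splits at the VGA window boundary:

    #include <stdio.h>

    int main(void)
    {
        unsigned long start = 0x9ffff;  /* guest-physical start of the access        */
        unsigned long vga   = 0xa0000;  /* RAM below this address, VGA MMIO above it */
        unsigned len = 4;               /* an ordinary 4-byte read                   */

        unsigned ram = start < vga ? (unsigned)(vga - start) : 0;
        if (ram > len)
            ram = len;

        printf("RAM  part: %u byte(s) starting at 0x%lx\n", ram, start);
        printf("MMIO part: %u byte(s) starting at 0x%lx\n", len - ram, vga);
        return 0;
    }

This prints a 1-byte RAM part at 0x9ffff and a 3-byte MMIO part at 0xa0000,
which is exactly the len == 3 case the hack quoted above checks for.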

-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-07-02 15:05           ` Avi Kivity
@ 2012-07-02 15:57             ` Peter Lieven
  2012-07-03 13:01             ` Peter Lieven
  1 sibling, 0 replies; 20+ messages in thread
From: Peter Lieven @ 2012-07-02 15:57 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Jan Kiszka, qemu-devel, kvm, Gleb Natapov

On 02.07.2012 17:05, Avi Kivity wrote:
> On 06/28/2012 12:38 PM, Peter Lieven wrote:
>> does anyone know whats that here in handle_mmio?
>>
>>      /* hack: Red Hat 7.1 generates these weird accesses. */
>>      if ((addr > 0xa0000-4 && addr <= 0xa0000) && kvm_run->mmio.len == 3)
>>          return 0;
>>
> Just what it says.  There is a 4-byte access to address 0x9ffff.  The
> first byte lies in RAM, the next three bytes are in mmio.  qemu is
> geared to power-of-two accesses even though x86 can generate accesses to
> any number of bytes between 1 and 8.
I just stumbled across the word "hack" in the comment. When the race
occurs the CPU is basically reading from 0xa0000 in an endless loop.
> It appears that this has happened with your guest.  It's not impossible
> that it's genuine.
>
I had a lot to do during the last days, but I have updated our build
environment to Ubuntu LTS 12.04 64-bit Server, which is based on Linux
3.2.0. I still see the issue. If I use the kvm module provided with the
kernel, it works correctly. If I use kvm-kmod-3.4 with qemu-kvm-1.0.1
(both from SourceForge), I can reproduce the race condition.

I will keep you posted when I have more evidence.

Thanks,
Peter

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-07-02 15:05           ` Avi Kivity
  2012-07-02 15:57             ` Peter Lieven
@ 2012-07-03 13:01             ` Peter Lieven
  2012-07-03 13:13               ` Avi Kivity
  1 sibling, 1 reply; 20+ messages in thread
From: Peter Lieven @ 2012-07-03 13:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Jan Kiszka, qemu-devel, kvm, Gleb Natapov

Further output from my testing.

Working:
Linux 2.6.38 with included kvm module
Linux 3.0.0 with included kvm module

Not-Working:
Linux 3.2.0 with included kvm module
Linux 2.6.28 with kvm-kmod 3.4
Linux 3.0.0 with kvm-kmod 3.4
Linux 3.2.0 with kvm-kmod 3.4

I can trigger the race with any of qemu-kvm 0.12.5, 1.0 or 1.0.1.
It might be that the offending change was introduced into the kvm
kernel module somewhere between 3.0.0 and 3.2.0, and that the flaw is
not in qemu-kvm.

Any hints?

Thanks,
Peter


On 02.07.2012 17:05, Avi Kivity wrote:
> On 06/28/2012 12:38 PM, Peter Lieven wrote:
>> does anyone know whats that here in handle_mmio?
>>
>>      /* hack: Red Hat 7.1 generates these weird accesses. */
>>      if ((addr > 0xa0000-4 && addr <= 0xa0000) && kvm_run->mmio.len == 3)
>>          return 0;
>>
> Just what it says.  There is a 4-byte access to address 0x9ffff.  The
> first byte lies in RAM, the next three bytes are in mmio.  qemu is
> geared to power-of-two accesses even though x86 can generate accesses to
> any number of bytes between 1 and 8.
>
> It appears that this has happened with your guest.  It's not impossible
> that it's genuine.
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-07-03 13:01             ` Peter Lieven
@ 2012-07-03 13:13               ` Avi Kivity
  2012-07-03 13:15                 ` Peter Lieven
  2012-07-04 23:12                 ` Peter Lieven
  0 siblings, 2 replies; 20+ messages in thread
From: Avi Kivity @ 2012-07-03 13:13 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Jan Kiszka, qemu-devel, kvm, Gleb Natapov

On 07/03/2012 04:01 PM, Peter Lieven wrote:
> Further output from my testing.
> 
> Working:
> Linux 2.6.38 with included kvm module
> Linux 3.0.0 with included kvm module
> 
> Not-Working:
> Linux 3.2.0 with included kvm module
> Linux 2.6.28 with kvm-kmod 3.4
> Linux 3.0.0 with kvm-kmod 3.4
> Linux 3.2.0 with kvm-kmod 3.4
> 
> I can trigger the race with any of qemu-kvm 0.12.5, 1.0 or 1.0.1.
> It might be that the code was introduced somewhere between 3.0.0
> and 3.2.0 in the kvm kernel module and that the flaw is not
> in qemu-kvm.
> 
> Any hints?
> 

A bisect could tell us where the problem is.

To avoid bisecting all of Linux, try

   git bisect start v3.2 v3.0 -- virt/kvm arch/x86/kvm
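
For completeness, after the bisect start above, each round could look
roughly like this (the build, install and test steps are of course
specific to your setup):

   # build/install the kvm modules, run the reproducer, then one of:
   git bisect good     # hypervisor stays responsive
   git bisect bad      # hypervisor hangs
   # repeat until git reports the first bad commit, then finish with:
   git bisect reset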



-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-07-03 13:13               ` Avi Kivity
@ 2012-07-03 13:15                 ` Peter Lieven
  2012-07-03 13:25                   ` Avi Kivity
  2012-07-04 23:12                 ` Peter Lieven
  1 sibling, 1 reply; 20+ messages in thread
From: Peter Lieven @ 2012-07-03 13:15 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Jan Kiszka, qemu-devel, kvm, Gleb Natapov

On 03.07.2012 15:13, Avi Kivity wrote:
> On 07/03/2012 04:01 PM, Peter Lieven wrote:
>> Further output from my testing.
>>
>> Working:
>> Linux 2.6.38 with included kvm module
>> Linux 3.0.0 with included kvm module
>>
>> Not-Working:
>> Linux 3.2.0 with included kvm module
>> Linux 2.6.28 with kvm-kmod 3.4
>> Linux 3.0.0 with kvm-kmod 3.4
>> Linux 3.2.0 with kvm-kmod 3.4
>>
>> I can trigger the race with any of qemu-kvm 0.12.5, 1.0 or 1.0.1.
>> It might be that the code was introduced somewhere between 3.0.0
>> and 3.2.0 in the kvm kernel module and that the flaw is not
>> in qemu-kvm.
>>
>> Any hints?
>>
> A bisect could tell us where the problem is.
>
> To avoid bisecting all of linux, try
>
>     git bisect v3.2 v3.0 virt/kvm arch/x86/kvm
>
>
Would it also be OK to bisect kvm-kmod?

thanks,
peter


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-07-03 13:15                 ` Peter Lieven
@ 2012-07-03 13:25                   ` Avi Kivity
  2012-07-04 14:57                     ` Peter Lieven
  0 siblings, 1 reply; 20+ messages in thread
From: Avi Kivity @ 2012-07-03 13:25 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Jan Kiszka, qemu-devel, kvm, Gleb Natapov

On 07/03/2012 04:15 PM, Peter Lieven wrote:
> On 03.07.2012 15:13, Avi Kivity wrote:
>> On 07/03/2012 04:01 PM, Peter Lieven wrote:
>>> Further output from my testing.
>>>
>>> Working:
>>> Linux 2.6.38 with included kvm module
>>> Linux 3.0.0 with included kvm module
>>>
>>> Not-Working:
>>> Linux 3.2.0 with included kvm module
>>> Linux 2.6.28 with kvm-kmod 3.4
>>> Linux 3.0.0 with kvm-kmod 3.4
>>> Linux 3.2.0 with kvm-kmod 3.4
>>>
>>> I can trigger the race with any of qemu-kvm 0.12.5, 1.0 or 1.0.1.
>>> It might be that the code was introduced somewhere between 3.0.0
>>> and 3.2.0 in the kvm kernel module and that the flaw is not
>>> in qemu-kvm.
>>>
>>> Any hints?
>>>
>> A bisect could tell us where the problem is.
>>
>> To avoid bisecting all of linux, try
>>
>>     git bisect v3.2 v3.0 virt/kvm arch/x86/kvm
>>
>>
> would it also be ok to bisect kvm-kmod?

Yes, but note that kvm-kmod is spread across two repositories which are
not often tested out of sync, so you may get build failures.


-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-07-03 13:25                   ` Avi Kivity
@ 2012-07-04 14:57                     ` Peter Lieven
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Lieven @ 2012-07-04 14:57 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Jan Kiszka, qemu-devel, kvm, Gleb Natapov

On 07/03/12 15:25, Avi Kivity wrote:
> On 07/03/2012 04:15 PM, Peter Lieven wrote:
>> On 03.07.2012 15:13, Avi Kivity wrote:
>>> On 07/03/2012 04:01 PM, Peter Lieven wrote:
>>>> Further output from my testing.
>>>>
>>>> Working:
>>>> Linux 2.6.38 with included kvm module
>>>> Linux 3.0.0 with included kvm module
>>>>
>>>> Not-Working:
>>>> Linux 3.2.0 with included kvm module
>>>> Linux 2.6.28 with kvm-kmod 3.4
>>>> Linux 3.0.0 with kvm-kmod 3.4
>>>> Linux 3.2.0 with kvm-kmod 3.4
>>>>
>>>> I can trigger the race with any of qemu-kvm 0.12.5, 1.0 or 1.0.1.
>>>> It might be that the code was introduced somewhere between 3.0.0
>>>> and 3.2.0 in the kvm kernel module and that the flaw is not
>>>> in qemu-kvm.
>>>>
>>>> Any hints?
>>>>
>>> A bisect could tell us where the problem is.
>>>
>>> To avoid bisecting all of linux, try
>>>
>>>      git bisect v3.2 v3.0 virt/kvm arch/x86/kvm
>>>
>>>
>> would it also be ok to bisect kvm-kmod?
> Yes, but note that kvm-kmod is spread across two repositories which are
> not often tested out of sync, so you may get build failures.
>
OK, I just started with this with a 3.0 (good) and a 3.2 (bad) vanilla
kernel. I can confirm the bug and I am now starting to bisect. It will
take a while with my equipment; if anyone has a powerful testbed to run
this on, I would greatly appreciate help.

If anyone wants to reproduce:

a) v3.2 from git.kernel.org
b) qemu-kvm 1.0.1 from sourceforge
c) Ubuntu 12.04 64-bit server CD
d) an empty (i.e. all-zero) hard disk image

cmdline:
./qemu-system-x86_64 -m 512 -cdrom 
/home/lieven/Downloads/ubuntu-12.04-server-amd64.iso -hda 
/dev/hd1/vmtest -vnc :1 -monitor stdio -boot dc

Then choose to boot from the first hard disk and try to quit the qemu
monitor with 'quit' -> the hypervisor hangs.

peter




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-07-03 13:13               ` Avi Kivity
  2012-07-03 13:15                 ` Peter Lieven
@ 2012-07-04 23:12                 ` Peter Lieven
  2012-07-05  6:48                   ` Xiao Guangrong
  1 sibling, 1 reply; 20+ messages in thread
From: Peter Lieven @ 2012-07-04 23:12 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Jan Kiszka, Gleb Natapov, qemu-devel, kvm, xiaoguangrong

On 07/03/12 15:13, Avi Kivity wrote:
> On 07/03/2012 04:01 PM, Peter Lieven wrote:
>> Further output from my testing.
>>
>> Working:
>> Linux 2.6.38 with included kvm module
>> Linux 3.0.0 with included kvm module
>>
>> Not-Working:
>> Linux 3.2.0 with included kvm module
>> Linux 2.6.28 with kvm-kmod 3.4
>> Linux 3.0.0 with kvm-kmod 3.4
>> Linux 3.2.0 with kvm-kmod 3.4
>>
>> I can trigger the race with any of qemu-kvm 0.12.5, 1.0 or 1.0.1.
>> It might be that the code was introduced somewhere between 3.0.0
>> and 3.2.0 in the kvm kernel module and that the flaw is not
>> in qemu-kvm.
>>
>> Any hints?
>>
> A bisect could tell us where the problem is.
>
> To avoid bisecting all of linux, try
>
>     git bisect v3.2 v3.0 virt/kvm arch/x86/kvm
here we go:

commit ca7d58f375c650cf36900cb1da1ca2cc99b13393
Author: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Date:   Wed Jul 13 14:31:08 2011 +0800

     KVM: x86: fix broken read emulation spans a page boundary

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-07-04 23:12                 ` Peter Lieven
@ 2012-07-05  6:48                   ` Xiao Guangrong
  0 siblings, 0 replies; 20+ messages in thread
From: Xiao Guangrong @ 2012-07-05  6:48 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Avi Kivity, Jan Kiszka, qemu-devel, kvm, Gleb Natapov,
	xiaoguangrong

On 07/05/2012 07:12 AM, Peter Lieven wrote:
> On 07/03/12 15:13, Avi Kivity wrote:
>> On 07/03/2012 04:01 PM, Peter Lieven wrote:
>>> Further output from my testing.
>>>
>>> Working:
>>> Linux 2.6.38 with included kvm module
>>> Linux 3.0.0 with included kvm module
>>>
>>> Not-Working:
>>> Linux 3.2.0 with included kvm module
>>> Linux 2.6.28 with kvm-kmod 3.4
>>> Linux 3.0.0 with kvm-kmod 3.4
>>> Linux 3.2.0 with kvm-kmod 3.4
>>>
>>> I can trigger the race with any of qemu-kvm 0.12.5, 1.0 or 1.0.1.
>>> It might be that the code was introduced somewhere between 3.0.0
>>> and 3.2.0 in the kvm kernel module and that the flaw is not
>>> in qemu-kvm.
>>>
>>> Any hints?
>>>
>> A bisect could tell us where the problem is.
>>
>> To avoid bisecting all of linux, try
>>
>>     git bisect v3.2 v3.0 virt/kvm arch/x86/kvm
> here we go:
> 
> commit ca7d58f375c650cf36900cb1da1ca2cc99b13393
> Author: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> Date:   Wed Jul 13 14:31:08 2011 +0800
> 
>     KVM: x86: fix broken read emulation spans a page boundary

Ah, I will try to reproduce it and fix it.

Thanks for your work.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-06-28  9:11   ` Peter Lieven
  2012-06-28  9:21     ` Jan Kiszka
@ 2012-07-05  8:51     ` Xiao Guangrong
  2012-07-05 12:42       ` Peter Lieven
  1 sibling, 1 reply; 20+ messages in thread
From: Xiao Guangrong @ 2012-07-05  8:51 UTC (permalink / raw)
  To: Peter Lieven; +Cc: Gleb Natapov, Jan Kiszka, qemu-devel, kvm

On 06/28/2012 05:11 PM, Peter Lieven wrote:

> that here is bascially whats going on:
> 
>   qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>     qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>     qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
> 

There are two mmio emulations after the userspace exit; they are caused
by an mmio read access which spans two pages. But it should be fixed by:

commit f78146b0f9230765c6315b2e14f56112513389ad
Author: Avi Kivity <avi@redhat.com>
Date:   Wed Apr 18 19:22:47 2012 +0300

    KVM: Fix page-crossing MMIO

    MMIO that are split across a page boundary are currently broken - the
    code does not expect to be aborted by the exit to userspace for the
    first MMIO fragment.

    This patch fixes the problem by generalizing the current code for handling
    16-byte MMIOs to handle a number of "fragments", and changes the MMIO
    code to create those fragments.

    Signed-off-by: Avi Kivity <avi@redhat.com>
    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Could you please pull the code from:
https://git.kernel.org/pub/scm/virt/kvm/kvm.git
and trace it again?
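
The core idea of the fix is to split one guest access into per-page
"fragments" and complete them one userspace exit at a time. A standalone
sketch of just the splitting step (a simplified illustration, not the
upstream code):

    #include <stdio.h>

    #define PAGE_SIZE 4096UL

    struct mmio_fragment {
        unsigned long gpa;
        unsigned len;
    };

    static int split_into_fragments(unsigned long gpa, unsigned bytes,
                                    struct mmio_fragment *frag)
    {
        int n = 0;

        while (bytes) {
            /* never let a single fragment cross a page boundary */
            unsigned in_page = PAGE_SIZE - (gpa & (PAGE_SIZE - 1));
            unsigned now = bytes < in_page ? bytes : in_page;

            frag[n].gpa = gpa;
            frag[n].len = now;
            n++;

            gpa += now;
            bytes -= now;
        }
        return n;
    }

    int main(void)
    {
        struct mmio_fragment frag[2];
        /* the access from the trace: starts one byte below 0xa0000, 4 bytes long */
        int i, n = split_into_fragments(0x9ffff, 4, frag);

        for (i = 0; i < n; i++)
            printf("fragment %d: gpa=0x%lx len=%u\n", i, frag[i].gpa, frag[i].len);
        return 0;
    }

With the input from the trace this yields a 1-byte piece at 0x9ffff and a
3-byte piece at 0xa0000; the real code then completes the MMIO pieces one
userspace exit at a time instead of restarting the whole access.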

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
  2012-07-05  8:51     ` Xiao Guangrong
@ 2012-07-05 12:42       ` Peter Lieven
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Lieven @ 2012-07-05 12:42 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Jan Kiszka, qemu-devel, kvm, Gleb Natapov

On 05.07.2012 10:51, Xiao Guangrong wrote:
> On 06/28/2012 05:11 PM, Peter Lieven wrote:
>
>> that here is bascially whats going on:
>>
>>    qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio read len 3 gpa 0xa0000 val 0x10ff
>>      qemu-kvm-1.0-2506  [010] 60996.908000: vcpu_match_mmio:      gva 0xa0000 gpa 0xa0000 Read GPA
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_mmio:             mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0
>>      qemu-kvm-1.0-2506  [010] 60996.908000: kvm_userspace_exit:   reason KVM_EXIT_MMIO (6)
>>
> There are two mmio emulation after user-space-exit, it is caused by mmio
> read access which spans two pages. But it should be fixed by:
>
> commit f78146b0f9230765c6315b2e14f56112513389ad
> Author: Avi Kivity<avi@redhat.com>
> Date:   Wed Apr 18 19:22:47 2012 +0300
>
>      KVM: Fix page-crossing MMIO
>
>      MMIO that are split across a page boundary are currently broken - the
>      code does not expect to be aborted by the exit to userspace for the
>      first MMIO fragment.
>
>      This patch fixes the problem by generalizing the current code for handling
>      16-byte MMIOs to handle a number of "fragments", and changes the MMIO
>      code to create those fragments.
>
>      Signed-off-by: Avi Kivity<avi@redhat.com>
>      Signed-off-by: Marcelo Tosatti<mtosatti@redhat.com>
>
> Could you please pull the code from:
> https://git.kernel.org/pub/scm/virt/kvm/kvm.git
> and trace it again?
Thank you very much, this fixes the issue I have seen.

Thanks,
Peter


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2012-07-05 12:42 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-27 15:39 race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1 Peter Lieven
2012-06-27 16:54 ` Jan Kiszka
2012-06-28  9:11   ` Peter Lieven
2012-06-28  9:21     ` Jan Kiszka
2012-06-28  9:31       ` Peter Lieven
2012-06-28  9:38         ` Peter Lieven
2012-07-02 15:05           ` Avi Kivity
2012-07-02 15:57             ` Peter Lieven
2012-07-03 13:01             ` Peter Lieven
2012-07-03 13:13               ` Avi Kivity
2012-07-03 13:15                 ` Peter Lieven
2012-07-03 13:25                   ` Avi Kivity
2012-07-04 14:57                     ` Peter Lieven
2012-07-04 23:12                 ` Peter Lieven
2012-07-05  6:48                   ` Xiao Guangrong
2012-06-28  9:39         ` Jan Kiszka
2012-06-28 10:13           ` Peter Lieven
2012-06-28 10:34           ` Peter Lieven
2012-07-05  8:51     ` Xiao Guangrong
2012-07-05 12:42       ` Peter Lieven
