qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
@ 2009-08-12 15:01 Chris Webb
  2009-08-12 15:38 ` [Qemu-devel] " Avi Kivity
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Webb @ 2009-08-12 15:01 UTC (permalink / raw)
  To: kvm; +Cc: qemu-devel

I have a couple of clusters hosting qemu-kvm virtual machines. One of these
clusters consists of dual quad-core Xeon E5420s (vmx), the other consists of
dual quad-core Barcelona Opterons (svm), and both are running x86-64 Linux
2.6.30.4 with the kvm modules included with the upstream kernel compiled in.

Running qemu-kvm 0.10.5, I was seeing occasional segfaults from the virtual
machines, perhaps two or three a day across each cluster. The guest OS didn't
appear to be a factor, as both Linux and Windows VMs have crashed. I then
switched to the recently released qemu-kvm 0.10.6, and am still seeing these
segfaults.

It's very hard for me to arrange for core dumps on these live clusters, and the
segfaults are hard to reproduce on test machines because they are rare.
However, I have unstripped copies of the respective binaries and have used gdb
to translate the segfault ip into a source file and line number, which I hope
might be useful. On both clusters and for each version of qemu-kvm, segfaults
are happening at lines #1161 and #1163 of vl.c:

                [...]
                /* stop a timer, but do not dealloc it */
                void qemu_del_timer(QEMUTimer *ts)
                {
                    QEMUTimer **pt, *t;

                    /* NOTE: this code must be signal safe because
                       qemu_timer_expired() can be called from a signal. */
    HERE ==>        pt = &active_timers[ts->clock->type];
                    for(;;) {
    HERE ==>            t = *pt;
                        if (!t)
                            break;
                        if (t == ts) {
                            *pt = t->next;
                            break;
                        }
                        pt = &t->next;
                    }
                }
                [...]

For qemu-kvm 0.10.5, I have large numbers of segfaults in both locations. For
qemu-kvm 0.10.6, my sample is much smaller, but the segfaults I have are all at
line #1161, not #1163.

Final data-point: prior to the 0.10.5 upgrade, we had been successfully running a
(fairly old) kvm-83 userspace without experiencing this segfault problem.

Any help fixing this would be gratefully received!

Cheers,

Chris.

PS One other place I have seen a segfault in 0.10.6 since we rolled it out is
at line #141 of hw/scsi-disk.c, but this has only happened once---very rare
compared to the problem I describe above.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-12 15:01 [Qemu-devel] qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6) Chris Webb
@ 2009-08-12 15:38 ` Avi Kivity
  2009-08-12 16:24   ` Chris Webb
  0 siblings, 1 reply; 11+ messages in thread
From: Avi Kivity @ 2009-08-12 15:38 UTC (permalink / raw)
  To: Chris Webb; +Cc: qemu-devel, kvm

On 08/12/2009 06:01 PM, Chris Webb wrote:
> I have a couple of clusters hosting qemu-kvm virtual machines. One of these
> clusters consists of dual quad-core Xeon E5420s (vmx), the other consists of
> dual quad-core Barcelona Opterons (svm), and both are running x86-64 Linux
> 2.6.30.4 with the kvm modules included with the upstream kernel compiled in.
>
> Running qemu-kvm 0.10.5, I was seeing occasional segfaults from the virtual
> machines, perhaps two or three a day across each cluster. The guest OS didn't
> appear to be a factor, as both Linux and Windows VMs have crashed. I then
> switched to the recently released qemu-kvm 0.10.6, and am still seeing these
> segfaults.
>
> It's very hard for me to arrange for core dumps on these live clusters, and the
> segfaults are hard to reproduce on test machines because they are rare.
> However, I have unstripped copies of the respective binaries and have used gdb
> to translate the segfault ip into a source file and line number, which I hope
> might be useful. On both clusters and for each version of qemu-kvm, segfaults
> are happening at lines #1161 and #1163 of vl.c:
>    

I understand it's hard, but it's nearly impossible to work out the 
problem from so little data, so please do make the effort to obtain dumps.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-12 15:38 ` [Qemu-devel] " Avi Kivity
@ 2009-08-12 16:24   ` Chris Webb
  2009-08-13 12:23     ` Chris Webb
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Webb @ 2009-08-12 16:24 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, kvm

Avi Kivity <avi@redhat.com> writes:

> I understand it's hard, but it's nearly impossible to work out the  
> problem from so little data, so please do make the effort to obtain 
> dumps.

We're trying for this at the moment, but since we can't change the rlimit
for the running qemu-kvm processes (?), we'll have to wait until one of the
new ones dies, which may take some time. I'll follow up when I do have
something.

Cheers,

Chris.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-12 16:24   ` Chris Webb
@ 2009-08-13 12:23     ` Chris Webb
  2009-08-13 12:41       ` Chris Webb
  2009-08-13 12:42       ` Avi Kivity
  0 siblings, 2 replies; 11+ messages in thread
From: Chris Webb @ 2009-08-13 12:23 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, kvm

Chris Webb <chris@arachsys.com> writes:

> Avi Kivity <avi@redhat.com> writes:
> 
> > I understand it's hard, but it's nearly impossible to work out the  
> > problem from so little data, so please do make the effort to obtain 
> > dumps.
> 
> We're trying for this at the moment, but since we can't change the rlimit
> for the running qemu-kvm processes (?), we'll have to wait until one of the
> new ones dies, which may take some time. I'll follow up when I do have
> something.

We've been lucky and relatively quickly got a core dump from one of the new
qemu-kvms with the non-zero core file rlimit. A backtrace looks like this:

  (gdb) bt    
  #0  0x00000000004068f7 in qemu_mod_timer (ts=0x30d1f30, expire_time=430489)
      at /packages/qemu-kvm/src-f39tF1/vl.c:1161
  #1  0x0000000000495dd5 in vnc_update_client (opaque=<value optimized out>) at vnc.c:765
  #2  0x00000000004081da in main_loop_wait (timeout=<value optimized out>) at /packages/qemu-kvm/src-f39tF1/vl.c:1240
  #3  0x000000000051613a in kvm_main_loop () at /packages/qemu-kvm/src-f39tF1/qemu-kvm.c:596
  #4  0x000000000040c7b7 in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
      at /packages/qemu-kvm/src-f39tF1/vl.c:3850

The segfault appears to be a null pointer dereference. ts->clock is NULL
and line 1161 uses ts->clock->type:

  (gdb) p ts   
  $4 = (QEMUTimer *) 0x30d1f30
  (gdb) p ts->clock
  $5 = (QEMUClock *) 0x0

The VncState in vnc_update_client is as follows:

  (gdb) f 1        
  #1  0x0000000000495dd5 in vnc_update_client (opaque=<value optimized out>) at vnc.c:765
  765             qemu_mod_timer(vs->timer, qemu_get_clock(rt_clock) + VNC_REFRESH_INTERVAL);
  (gdb) p *vs                        
  $12 = {timer = 0x30d1f30, csock = -986235208, ds = 0x0, vd = 0x0, need_update = 1, dirty_row = {{0, 0, 4294967295, 
        4294967295} <repeats 768 times>, {4294967295, 4294967295, 4294967295, 4294967295} <repeats 1280 times>}, 
    old_data = 0x7f9b8276f010 <Address 0x7f9b8276f010 out of bounds>, features = 98, absolute = 1, last_x = -1, 
    last_y = -1, vnc_encoding = 5, tight_quality = 6 '\006', tight_compression = 1 '\001', major = 3, minor = 3, 
    challenge = "\032\314i\257<\302t1(\320\312\263\024pH\226", output = {capacity = 1545078, offset = 684, 
      buffer = 0x3107860 ""}, input = {capacity = 5120, offset = 0, buffer = 0x3106450 "\020\220(\003"}, 
    write_pixels = 0x490b50 <vnc_write_pixels_generic>, send_hextile_tile = 0x492030 <send_hextile_tile_generic_32>, 
    clientds = {flags = 0 '\0', width = 800, height = 600, linesize = 3200, 
      data = 0x7f9b82944010 <Address 0x7f9b82944010 out of bounds>, pf = {bits_per_pixel = 32 ' ', 
        bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 0, gmask = 0, bmask = 0, amask = 0, rshift = 16 '\020', 
        gshift = 8 '\b', bshift = 0 '\0', ashift = 24 '\030', rmax = 255 '\377', gmax = 255 '\377', bmax = 255 '\377', 
        amax = 255 '\377', rbits = 8 '\b', gbits = 8 '\b', bbits = 8 '\b', abits = 8 '\b'}}, serverds = {
      flags = 2 '\002', width = 1024, height = 768, linesize = 4096, data = 0x7f9b8246e010 "", pf = {
        bits_per_pixel = 32 ' ', bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 16711680, gmask = 65280, 
        bmask = 255, amask = 0, rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\0', ashift = 24 '\030', 
        rmax = 255 '\377', gmax = 255 '\377', bmax = 255 '\377', amax = 255 '\377', rbits = 8 '\b', gbits = 8 '\b', 
        bbits = 8 '\b', abits = 8 '\b'}}, audio_cap = 0x0, as = {freq = 44100, nchannels = 2, fmt = AUD_FMT_S16, 
      endianness = 0}, read_handler = 0x494b40 <protocol_client_msg>, read_handler_expect = 1, 
    modifiers_state = '\0' <repeats 255 times>, zlib = {capacity = 0, offset = 0, buffer = 0x0}, zlib_tmp = {
      capacity = 0, offset = 0, buffer = 0x0}, zlib_stream = {{next_in = 0x0, avail_in = 0, total_in = 0, 
        next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, 
        data_type = 0, adler = 0, reserved = 0}, {next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0, 
        avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, 
        adler = 0, reserved = 0}, {next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, 
        total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler = 0, 
        reserved = 0}, {next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out = 0, 
        msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler = 0, reserved = 0}}, 
    next = 0x0}

I'm afraid I only have one of these, so I can't say whether the other
segfaults were exactly the same or different (other than knowing the source
line matched), but I'll keep my eye out for more core dumps.

qemu-kvm command line for this guest would have been

  qemu-kvm -m 1024 -smp 1 -usbdevice tablet -boot cd \
    -drive if=ide,bus=0,unit=0,cache=none,file=/dev/mapper/guest:8a2576b2-a523-4126-867a-5f411cb66f18:ide:0:0 \
    -drive if=ide,bus=0,unit=1,media=cdrom,cache=none,file=/dev/mapper/guest:8a2576b2-a523-4126-867a-5f411cb66f18:ide:0:1 \
    -net tap,vlan=0,ifname=tap1,script=no,downscript=no \
    -net nic,model=e1000,macaddr=02:00:53:de:e2:b0,vlan=0 \
    -vnc :2,password -uuid 8a2576b2-a523-4126-867a-5f411cb66f18 \
    -pidfile /var/run/guests/8a2576b2-a523-4126-867a-5f411cb66f18/kvm.pid \
    -monitor unix:/var/lib/guests/8a2576b2-a523-4126-867a-5f411cb66f18/monitor,server,nowait

Best wishes,

Chris.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-13 12:23     ` Chris Webb
@ 2009-08-13 12:41       ` Chris Webb
  2009-08-13 12:42       ` Avi Kivity
  1 sibling, 0 replies; 11+ messages in thread
From: Chris Webb @ 2009-08-13 12:41 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, kvm

Chris Webb <chris@arachsys.com> writes:

> The segfault appears to be a null pointer dereference. ts->clock is NULL
> and line 1161 uses ts->clock->type:
> 
>   (gdb) p ts   
>   $4 = (QEMUTimer *) 0x30d1f30
>   (gdb) p ts->clock
>   $5 = (QEMUClock *) 0x0

Sorry, meant to paste this too:

  (gdb) p *ts
  $1 = {clock = 0x0, expire_time = 49, cb = 0x2b63630, opaque = 0x30fe000, next = 0x495b40}

Cheers,

Chris.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-13 12:23     ` Chris Webb
  2009-08-13 12:41       ` Chris Webb
@ 2009-08-13 12:42       ` Avi Kivity
  2009-08-13 12:43         ` Chris Webb
  1 sibling, 1 reply; 11+ messages in thread
From: Avi Kivity @ 2009-08-13 12:42 UTC (permalink / raw)
  To: Chris Webb; +Cc: qemu-devel, kvm

On 08/13/2009 03:23 PM, Chris Webb wrote:
> We've been lucky and relatively quickly got a core dump from one of the new
> qemu-kvms with the non-zero core file rlimit. A backtrace looks like this:
>
>    (gdb) bt
>    #0  0x00000000004068f7 in qemu_mod_timer (ts=0x30d1f30, expire_time=430489)
>        at /packages/qemu-kvm/src-f39tF1/vl.c:1161
>    #1  0x0000000000495dd5 in vnc_update_client (opaque=<value optimized out>) at vnc.c:765
>    #2  0x00000000004081da in main_loop_wait (timeout=<value optimized out>) at /packages/qemu-kvm/src-f39tF1/vl.c:1240
>    #3  0x000000000051613a in kvm_main_loop () at /packages/qemu-kvm/src-f39tF1/qemu-kvm.c:596
>    #4  0x000000000040c7b7 in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
>        at /packages/qemu-kvm/src-f39tF1/vl.c:3850
>
> The segfault appears to be a null pointer dereference. ts->clock is NULL
> and line 1161 uses ts->clock->type:
>
>    (gdb) p ts
>    $4 = (QEMUTimer *) 0x30d1f30
>    (gdb) p ts->clock
>    $5 = (QEMUClock *) 0x0
>
> The VncState in vnc_update_client is as follows:
>
>    (gdb) f 1
>    #1  0x0000000000495dd5 in vnc_update_client (opaque=<value optimized out>) at vnc.c:765
>    765             qemu_mod_timer(vs->timer, qemu_get_clock(rt_clock) + VNC_REFRESH_INTERVAL);
>    (gdb) p *vs
>    $12 = {timer = 0x30d1f30, csock = -986235208,

csock looks corrupted, should be -1 or an fd.  Was a vnc client connected?

Was the guest playing with the display resolution?

> ds = 0x0, vd = 0x0, need_update = 1, dirty_row = {{0, 0, 4294967295,
>          4294967295}<repeats 768 times>, {4294967295, 4294967295, 4294967295, 4294967295}<repeats 1280 times>},
>      old_data = 0x7f9b8276f010<Address 0x7f9b8276f010 out of bounds>,

old_data is also corrupted according to gdb, though it seems sane.


-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-13 12:42       ` Avi Kivity
@ 2009-08-13 12:43         ` Chris Webb
  2009-08-13 12:45           ` Chris Webb
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Webb @ 2009-08-13 12:43 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, kvm

Avi Kivity <avi@redhat.com> writes:

> csock looks corrupted, should be -1 or an fd.  Was a vnc client connected?
> Was the guest playing with the display resolution?

Yes, I think in this case there was a vncviewer connected, and the guest had
started booting up into windows, which changes the resolution a couple of
times.

Best wishes,

Chris.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-13 12:43         ` Chris Webb
@ 2009-08-13 12:45           ` Chris Webb
  2009-08-13 12:58             ` Avi Kivity
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Webb @ 2009-08-13 12:45 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, kvm

Chris Webb <chris@arachsys.com> writes:

> Avi Kivity <avi@redhat.com> writes:
> 
> > csock looks corrupted, should be -1 or an fd.  Was a vnc client connected?
> > Was the guest playing with the display resolution?
> 
> Yes, I think in this case there was a vncviewer connected, and the guest had
> started booting up into windows, which changes the resolution a couple of
> times.

Also, I think the vncviewer might actually have been disconnecting at about
the time the segfault happened.

Cheers,

Chris.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-13 12:45           ` Chris Webb
@ 2009-08-13 12:58             ` Avi Kivity
  2009-08-19 22:47               ` Chris Webb
  0 siblings, 1 reply; 11+ messages in thread
From: Avi Kivity @ 2009-08-13 12:58 UTC (permalink / raw)
  To: Chris Webb; +Cc: qemu-devel, kvm

On 08/13/2009 03:45 PM, Chris Webb wrote:
> Chris Webb<chris@arachsys.com>  writes:
>
>    
>> Avi Kivity<avi@redhat.com>  writes:
>>
>>      
>>> csock looks corrupted, should be -1 or an fd.  Was a vnc client connected?
>>> Was the guest playing with the display resolution?
>>>        
>> Yes, I think in this case there was a vncviewer connected, and the guest had
>> started booting up into windows, which changes the resolution a couple of
>> times.
>>      
>
> Also, I think the vncviewer might actually have been disconnecting at about
> the time the segfault happened.
>
>    

master branch has a patch that fixes a use-after-free when 
disconnecting.  Unfortunately it doesn't port cleanly to stable-0.10.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-13 12:58             ` Avi Kivity
@ 2009-08-19 22:47               ` Chris Webb
  2009-08-24 15:45                 ` Chris Webb
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Webb @ 2009-08-19 22:47 UTC (permalink / raw)
  To: Avi Kivity, Gerd Hoffmann; +Cc: qemu-devel, kvm

Avi Kivity <avi@redhat.com> writes:

> master branch has a patch that fixes a use-after-free when  
> disconnecting.  Unfortunately it doesn't port cleanly to stable-0.10.

I've collected quite a few more core dumps from segfaults of client virtual
machines now, all of which are VNC related and could quite plausibly be use
of a VncState after it has been freed. I looked at Gerd's patch [198a00:
vnc: rework VncState release workflow] and have taken a stab at the
equivalent patch for stable qemu & qemu-kvm 0.10.

With the following applied, VNC connections and disconnections still work
correctly, so it doesn't horribly break anything, but I can't immediately
confirm whether it will cure the rare segfaults as I haven't yet found a
rapid way of reproducing the crashes other than by waiting for one.


diff --git a/vnc.c b/vnc.c
--- a/vnc.c
+++ b/vnc.c
@@ -200,6 +200,8 @@
 static void vnc_write_u8(VncState *vs, uint8_t value);
 static void vnc_flush(VncState *vs);
 static void vnc_update_client(void *opaque);
+static void vnc_disconnect_start(VncState *vs);
+static void vnc_disconnect_finish(VncState *vs);
 static void vnc_client_read(void *opaque);
 
 static void vnc_colordepth(VncState *vs);
@@ -633,8 +635,6 @@
 
 static void vnc_copy(VncState *vs, int src_x, int src_y, int dst_x, int dst_y, int w, int h)
 {
-    vnc_update_client(vs);
-
     vnc_write_u8(vs, 0);  /* msg id */
     vnc_write_u8(vs, 0);
     vnc_write_u16(vs, 1); /* number of rects */
@@ -647,13 +647,21 @@
 static void vnc_dpy_copy(DisplayState *ds, int src_x, int src_y, int dst_x, int dst_y, int w, int h)
 {
     VncDisplay *vd = ds->opaque;
-    VncState *vs = vd->clients;
-    while (vs != NULL) {
+    VncState *vs, *vn;
+
+    for (vs = vd->clients; vs != NULL; vs = vn) {
+        vn = vs->next;
+        if (vnc_has_feature(vs, VNC_FEATURE_COPYRECT)) {
+            vnc_update_client(vs);
+            /* vs might be free()ed here */
+        }
+    }
+
+    for (vs = vd->clients; vs != NULL; vs = vs->next) {
         if (vnc_has_feature(vs, VNC_FEATURE_COPYRECT))
             vnc_copy(vs, src_x, src_y, dst_x, dst_y, w, h);
         else /* TODO */
             vnc_update(vs, dst_x, dst_y, w, h);
-        vs = vs->next;
     }
 }
 
@@ -763,6 +771,8 @@
 
     if (vs->csock != -1) {
         qemu_mod_timer(vs->timer, qemu_get_clock(rt_clock) + VNC_REFRESH_INTERVAL);
+    } else {
+        vnc_disconnect_finish(vs);
     }
 
 }
@@ -832,6 +842,47 @@
     }
 }
 
+static void vnc_disconnect_start(VncState *vs)
+{
+    if (vs->csock == -1)
+        return;
+    qemu_set_fd_handler2(vs->csock, NULL, NULL, NULL, NULL);
+    closesocket(vs->csock);
+    vs->csock = -1;
+}
+
+static void vnc_disconnect_finish(VncState *vs)
+{
+    qemu_del_timer(vs->timer);
+    qemu_free_timer(vs->timer);
+    if (vs->input.buffer) qemu_free(vs->input.buffer);
+    if (vs->output.buffer) qemu_free(vs->output.buffer);
+#ifdef CONFIG_VNC_TLS
+    if (vs->tls_session) {
+        gnutls_deinit(vs->tls_session);
+        vs->tls_session = NULL;
+    }
+#endif /* CONFIG_VNC_TLS */
+    audio_del(vs);
+
+    VncState *p, *parent = NULL;
+    for (p = vs->vd->clients; p != NULL; p = p->next) {
+        if (p == vs) {
+            if (parent)
+                parent->next = p->next;
+            else
+                vs->vd->clients = p->next;
+            break;
+        }
+        parent = p;
+    }
+    if (!vs->vd->clients)
+        dcl->idle = 1;
+
+    qemu_free(vs->old_data);
+    qemu_free(vs);
+}
+
 static int vnc_client_io_error(VncState *vs, int ret, int last_errno)
 {
     if (ret == 0 || ret == -1) {
@@ -849,36 +900,7 @@
         }
 
 	VNC_DEBUG("Closing down client sock %d %d\n", ret, ret < 0 ? last_errno : 0);
-	qemu_set_fd_handler2(vs->csock, NULL, NULL, NULL, NULL);
-	closesocket(vs->csock);
-        qemu_del_timer(vs->timer);
-        qemu_free_timer(vs->timer);
-        if (vs->input.buffer) qemu_free(vs->input.buffer);
-        if (vs->output.buffer) qemu_free(vs->output.buffer);
-#ifdef CONFIG_VNC_TLS
-	if (vs->tls_session) {
-	    gnutls_deinit(vs->tls_session);
-	    vs->tls_session = NULL;
-	}
-#endif /* CONFIG_VNC_TLS */
-        audio_del(vs);
-
-        VncState *p, *parent = NULL;
-        for (p = vs->vd->clients; p != NULL; p = p->next) {
-            if (p == vs) {
-                if (parent)
-                    parent->next = p->next;
-                else
-                    vs->vd->clients = p->next;
-                break;
-            }
-            parent = p;
-        }
-        if (!vs->vd->clients)
-            dcl->idle = 1;
-
-        qemu_free(vs->old_data);
-        qemu_free(vs);
+        vnc_disconnect_start(vs);
   
 	return 0;
     }
@@ -887,7 +909,8 @@
 
 static void vnc_client_error(VncState *vs)
 {
-    vnc_client_io_error(vs, -1, EINVAL);
+    VNC_DEBUG("Closing down client sock: protocol error\n");
+    vnc_disconnect_start(vs);
 }
 
 static void vnc_client_write(void *opaque)
@@ -947,8 +970,11 @@
 #endif /* CONFIG_VNC_TLS */
 	ret = recv(vs->csock, buffer_end(&vs->input), 4096, 0);
     ret = vnc_client_io_error(vs, ret, socket_error());
-    if (!ret)
+    if (!ret) {
+        if (vs->csock == -1)
+            vnc_disconnect_finish(vs);
 	return;
+    }
 
     vs->input.offset += ret;
 
@@ -957,8 +983,10 @@
 	int ret;
 
 	ret = vs->read_handler(vs, vs->input.buffer, len);
-	if (vs->csock == -1)
+	if (vs->csock == -1) {
+            vnc_disconnect_finish(vs);
 	    return;
+        }
 
 	if (!ret) {
 	    memmove(vs->input.buffer, vs->input.buffer + len, (vs->input.offset - len));
@@ -973,7 +1001,7 @@
 {
     buffer_reserve(&vs->output, len);
 
-    if (buffer_empty(&vs->output)) {
+    if (vs->csock != -1 && buffer_empty(&vs->output)) {
 	qemu_set_fd_handler2(vs->csock, NULL, vnc_client_read, vnc_client_write, vs);
     }
 
@@ -1014,7 +1042,7 @@
 
 static void vnc_flush(VncState *vs)
 {
-    if (vs->output.offset)
+    if (vs->csock != -1 && vs->output.offset)
 	vnc_client_write(vs);
 }
 
@@ -2282,11 +2310,13 @@
     vnc_read_when(vs, protocol_version, 12);
     memset(vs->old_data, 0, ds_get_linesize(vs->ds) * ds_get_height(vs->ds));
     memset(vs->dirty_row, 0xFF, sizeof(vs->dirty_row));
-    vnc_update_client(vs);
     reset_keys(vs);
 
     vs->next = vd->clients;
     vd->clients = vs;
+
+    vnc_update_client(vs);
+    /* vs might be free()ed here */
 }
 
 static void vnc_listen_read(void *opaque)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)
  2009-08-19 22:47               ` Chris Webb
@ 2009-08-24 15:45                 ` Chris Webb
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Webb @ 2009-08-24 15:45 UTC (permalink / raw)
  To: Avi Kivity, Gerd Hoffmann; +Cc: qemu-devel, kvm

Chris Webb <chris@arachsys.com> writes:

> With the following applied, VNC connections and disconnections still work
> correctly, so it doesn't horribly break anything, but I can't immediately
> confirm whether it will cure the rare segfaults as I haven't yet found a
> rapid way of reproducing the crashes other than by waiting for one.

Just to follow up on this: the backported patch has cured the vast majority of
VNC crashes we've been seeing on 0.10.6, although I've still seen this earlier
today:

Core was generated by `qemu-kvm -m 512 -smp 1 -uuid d6f2cb13-7421-4baa-a978-eda9bec9d075 -pidfile /var'.
Program terminated with signal 11, Segmentation fault.
[New process 16847]
[New process 16855]
(gdb) bt
#0  0x00007fe42e9c6cb1 in memcpy () from /lib/libc.so.6
#1  0x00000000004917e4 in vnc_write (vs=0x31a7f50, data=0x7fffe3a19230, len=2) at vnc.c:323
#2  0x00000000004919bf in vnc_write_u16 (vs=0x7fe2f8cae023, value=<value optimized out>) at vnc.c:1035
#3  0x0000000000491bf3 in vnc_framebuffer_update (vs=0x7fe2f8cae023, x=-475950544, y=2, w=16385, h=1, encoding=6)
    at vnc.c:286
#4  0x0000000000496660 in send_framebuffer_update (vs=0x7fe2f8cae023, x=-475950544, y=196, w=208, h=1) at vnc.c:598
#5  0x0000000000496f65 in vnc_update_client (opaque=<value optimized out>) at vnc.c:754
#6  0x000000000040822a in main_loop_wait (timeout=<value optimized out>)
    at /packages/qemu-kvm+vncfix/src-nUlCId/vl.c:1240
#7  0x000000000051753a in kvm_main_loop () at /packages/qemu-kvm+vncfix/src-nUlCId/qemu-kvm.c:596
#8  0x000000000040c8a5 in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
    at /packages/qemu-kvm+vncfix/src-nUlCId/vl.c:3850
(gdb) f 1
#1  0x00000000004917e4 in vnc_write (vs=0x31a7f50, data=0x7fffe3a19230, len=2) at vnc.c:323
323         memcpy(buffer->buffer + buffer->offset, data, len);
(gdb) f 1
#1  0x00000000004917e4 in vnc_write (vs=0x31a7f50, data=0x7fffe3a19230, len=2) at vnc.c:323
323         memcpy(buffer->buffer + buffer->offset, data, len);
(gdb) p *vs
$1 = {timer = 0x2b90b20, csock = 18, ds = 0x28a1a20, vd = 0x28b0fc0, need_update = 1, dirty_row = {{0, 0, 0, 
      0} <repeats 197 times>, {65535, 262128, 0, 0}, {4294967295, 1, 0, 0}, {4294967288, 262143, 0, 0}, {4294443008, 
      262143, 0, 0}, {131071, 262128, 0, 0}, {4294967295, 1, 0, 0}, {4294967292, 262143, 0, 0}, {4294443008, 262143, 
      0, 0}, {131071, 262136, 0, 0}, {4294967295, 1, 0, 0}, {4294967292, 262143, 0, 0}, {4294443008, 262143, 0, 0}, {
      131071, 262136, 0, 0}, {4294967295, 1, 0, 0}, {4294967292, 262143, 0, 0}, {4294705152, 262143, 0, 0}, {131071, 
      262136, 0, 0}, {4294967295, 1, 0, 0}, {4294967294, 262143, 0, 0}, {4294705152, 262143, 0, 0}, {131071, 262140, 
      0, 0}, {4294967295, 1, 0, 0}, {4294967294, 262143, 0, 0}, {4294836224, 262143, 0, 0}, {131071, 262140, 0, 0}, {
      4294967295, 1, 0, 0}, {4294967294, 262143, 0, 0}, {4294836224, 262143, 0, 0}, {131071, 262140, 0, 0}, {
      4294967295, 1, 0, 0}, {4294967295, 262143, 0, 0}, {4294836224, 262143, 0, 0}, {131071, 262142, 0, 0}, {
      4294967295, 1, 0, 0}, {4294967295, 262143, 0, 0}, {4294901760, 262143, 0, 0}, {131071, 262142, 0, 0}, {
      4294967295, 1, 0, 0}, {4294967295, 262143, 0, 0}, {4294901760, 262143, 0, 0}, {131071, 262142, 0, 0}, {
      4294967295, 131073, 0, 0}, {4294967295, 262143, 0, 0}, {4294901760, 262143, 0, 0}, {131071, 262143, 0, 0}, {
      4294967295, 131073, 0, 0}, {4294967295, 262143, 0, 0}, {4294934528, 262143, 0, 0}, {131071, 262143, 0, 0}, {
      4294967295, 131075, 0, 0}, {4294967295, 262143, 0, 0}, {4294934528, 262143, 0, 0}, {131071, 262143, 0, 0}, {
      4294967295, 196611, 0, 0}, {4294967295, 262143, 0, 0}, {4294934528, 262143, 0, 0}, {2147614719, 262143, 0, 0}, {
      4294967295, 196611, 0, 0}, {4294967295, 262143, 0, 0}, {4294950912, 262143, 0, 0}, {2147614719, 262143, 0, 0}, {
      4294967295, 196611, 0, 0}, {4294967295, 262143, 0, 0}, {4294950912, 262143, 0, 0}, {2147614719, 262143, 0, 0}, {
      4294967295, 229379, 0, 0}, {4294967295, 262143, 0, 0}, {4294950912, 262143, 0, 0}, {3221356543, 262143, 0, 0}, {
      4294967295, 229379, 0, 0}, {4294967295, 262143, 0, 0}, {4294950912, 262143, 0, 0}, {3221356543, 262143, 0, 0}, {
      4294967295, 229377, 0, 0}, {4294967295, 262143, 0, 0}, {4294959104, 262143, 0, 0}, {3221356543, 262143, 0, 0}, {
      4294967295, 245761, 0, 0}, {4294967295, 262143, 0, 0}, {4294959104, 262143, 0, 0}, {3758227455, 262143, 0, 0}, {
      4294967295, 245761, 0, 0}, {4294967295, 262143, 0, 0}, {4294959104, 262143, 0, 0}, {3758227455, 262143, 0, 0}, {
      4294967295, 245761, 0, 0}, {4294967295, 262143, 0, 0}, {4294963200, 262143, 0, 0}, {3758227455, 262143, 0, 0}, {
      4294967295, 253953, 0, 0}, {4294967295, 262143, 0, 0}, {4294963200, 262143, 0, 0}, {4026662911, 262143, 0, 0}, {
      4294967295, 253953, 0, 0}, {4294967295, 262143, 0, 0}, {4294963200, 262143, 0, 0}, {4026662911, 262143, 0, 0}, {
      4294967295, 253953, 0, 0}, {4294967295, 262143, 0, 0}, {4294965248, 262143, 0, 0}, {4026662911, 262143, 0, 0}, {
      4294967295, 258049, 0, 0}, {4294967295, 262143, 0, 0}, {4294965248, 262143, 0, 0}, {4026662911, 262143, 0, 0}, {
      4294967295, 258049, 0, 0}, {4294967295, 262143, 0, 0}, {4294965248, 262143, 0, 0}, {4160815103, 262143, 0, 0}, {
      4294967295, 258049, 0, 0}, {4294967295, 262143, 0, 0}, {4294966272, 262143, 0, 0}, {4160815103, 262143, 0, 0}, {
      4294967295, 258049, 0, 0}, {4294967295, 262143, 0, 0}, {4294966272, 262143, 0, 0}, {4160815103, 262143, 0, 0}, {
      4294967295, 260097, 0, 0}, {4294967295, 262143, 0, 0}, {4294966272, 262143, 0, 0}, {4227923967, 262143, 0, 0}, {
      4294967295, 260097, 0, 0}, {4294967295, 131071, 0, 0}, {4294966272, 262143, 0, 0}, {4227923967, 262143, 0, 0}, {
      4294967295, 260097, 0, 0}, {4294967295, 131071, 0, 0}, {4294966784, 262143, 0, 0}, {4227923967, 262143, 0, 0}, {
      4294967295, 261121, 0, 0}, {4294967295, 131071, 0, 0}, {4294966784, 262143, 0, 0}, {4227923967, 262143, 0, 0}, {
      4294967295, 261120, 0, 0}, {4294967295, 131071, 0, 0}, {4294966784, 262143, 0, 0}, {4261478399, 262143, 0, 0}, {
      4294967295, 261120, 0, 0}, {4294967295, 131071, 0, 0}, {4294967040, 262143, 0, 0}, {4261478399, 262143, 0, 0}, {
      4294967295, 261120, 0, 0}, {4294967295, 131071, 0, 0}, {4294967040, 262143, 0, 0}, {4261478399, 262143, 0, 0}, {
      4294967295, 261632, 0, 0}, {4294967295, 131071, 0, 0}, {4294967040, 262143, 0, 0}, {4278222847, 262143, 0, 0}, {
      4294967295, 261632, 0, 0}, {4294967295, 131071, 0, 0}, {4294967040, 262143, 0, 0}, {4278222847, 262143, 0, 0}, {
      4294967295, 261632, 0, 0}, {4294967295, 131071, 0, 0}, {4294967168, 262143, 0, 0}, {4278222847, 262143, 0, 0}, {
      4294967295, 261888, 0, 0}, {4294967295, 131071, 0, 0}, {4294967168, 262143, 0, 0}, {4278222847, 262143, 0, 0}, {
      4294967295, 261888, 0, 0}, {4294967295, 65535, 0, 0}, {4294967168, 262143, 0, 0}, {4286611455, 262143, 0, 0}, {
      4294967295, 261888, 0, 0}, {4294967295, 65535, 0, 0}, {4294967232, 262143, 0, 0}, {4286611455, 262143, 0, 0}, {
      4294967295, 261888, 0, 0}, {4294967295, 65535, 0, 0}, {4294967232, 262143, 0, 0}, {4286611455, 262143, 0, 0}, {
      2147483647, 262016, 0, 0}, {4294967295, 65535, 0, 0}, {4294967232, 262143, 0, 0}, {4286611455, 262143, 0, 0}, {
      2147483647, 262016, 0, 0}, {4294967295, 65535, 0, 0}, {4294967232, 262143, 0, 0}, {4290805759, 262143, 0, 0}, {
      2147483647, 262016, 0, 0}, {4294967295, 65535, 0, 0}, {4294967264, 262143, 0, 0}, {4290805759, 262143, 0, 0}, {
      2147483647, 262016, 0, 0}, {4294967295, 65535, 0, 0}, {4294967264, 262143, 0, 0}, {4290789375, 262143, 0, 0}, {
      2147483647, 262080, 0, 0}...}, old_data = 0x2bbd1d0 "", features = 99, absolute = 1, last_x = -1, last_y = -1, 
  vnc_encoding = 6, tight_quality = 6 '\006', tight_compression = 9 '\t', major = 3, minor = 8, 
  challenge = "\314\351\243\267G\207\002\v\313\363\226\301ZQ\310m", output = {capacity = 2147485644, 
    offset = 18446744071562067987, 
    buffer = 0x7fe378cae010 "H\247BE\311\251\237\305j\200|\344\372\331\232\001\302t`"}, input = {capacity = 5120, 
    offset = 0, buffer = 0x31b0390 "\003\001"}, write_pixels = 0x491830 <vnc_write_pixels_generic>, 
  send_hextile_tile = 0x492db0 <send_hextile_tile_generic_32>, clientds = {flags = 0 '\0', width = 800, height = 600, 
    linesize = 3200, data = 0x7fe40c2ad010 "", pf = {bits_per_pixel = 32 ' ', bytes_per_pixel = 4 '\004', 
      depth = 24 '\030', rmask = 0, gmask = 0, bmask = 0, amask = 0, rshift = 16 '\020', gshift = 8 '\b', 
      bshift = 0 '\0', ashift = 24 '\030', rmax = 255 '\377', gmax = 255 '\377', bmax = 255 '\377', 
      amax = 255 '\377', rbits = 8 '\b', gbits = 8 '\b', bbits = 8 '\b', abits = 8 '\b'}}, serverds = {
    flags = 2 '\002', width = 800, height = 600, linesize = 3200, data = 0x7fe40c2ad010 "", pf = {
      bits_per_pixel = 32 ' ', bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 16711680, gmask = 65280, 
      bmask = 255, amask = 0, rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\0', ashift = 24 '\030', 
      rmax = 255 '\377', gmax = 255 '\377', bmax = 255 '\377', amax = 255 '\377', rbits = 8 '\b', gbits = 8 '\b', 
      bbits = 8 '\b', abits = 8 '\b'}}, audio_cap = 0x0, as = {freq = 44100, nchannels = 2, fmt = AUD_FMT_S16, 
    endianness = 0}, read_handler = 0x495a10 <protocol_client_msg>, read_handler_expect = 1, 
  modifiers_state = '\0' <repeats 255 times>, zlib = {capacity = 1920304, offset = 2176, 
    buffer = 0x3680680 "A\271\237"}, zlib_tmp = {capacity = 2147485644, offset = 2147483353, 
    buffer = 0x7fe378cae010 "H\247BE\311\251\237\305j\200|\344\372\331\232\001\302t`"}, zlib_stream = {{
      next_in = 0x3680f00 ":\244\215", avail_in = 0, total_in = 15141151424, next_out = 0x7fe3f8cae023 "", 
      avail_out = 1977, total_out = 2112923344, msg = 0x0, state = 0x31b2ed0, zalloc = 0x7fe42f528b90 <zcalloc>, 
      zfree = 0x7fe42f528b80 <zcfree>, opaque = 0x31a7f50, data_type = 0, adler = 1623013286, reserved = 0}, {
      next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0, 
      state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler = 0, reserved = 0}, {next_in = 0x0, 
      avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, 
      zfree = 0, opaque = 0x0, data_type = 0, adler = 0, reserved = 0}, {next_in = 0x0, avail_in = 0, total_in = 0, 
      next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, 
      data_type = 0, adler = 0, reserved = 0}}, next = 0x0}

Best wishes,

Chris.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2009-08-24 15:45 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-12 15:01 [Qemu-devel] qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6) Chris Webb
2009-08-12 15:38 ` [Qemu-devel] " Avi Kivity
2009-08-12 16:24   ` Chris Webb
2009-08-13 12:23     ` Chris Webb
2009-08-13 12:41       ` Chris Webb
2009-08-13 12:42       ` Avi Kivity
2009-08-13 12:43         ` Chris Webb
2009-08-13 12:45           ` Chris Webb
2009-08-13 12:58             ` Avi Kivity
2009-08-19 22:47               ` Chris Webb
2009-08-24 15:45                 ` Chris Webb

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).