From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50652)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1c5EVq-00040i-Ss
	for qemu-devel@nongnu.org; Fri, 11 Nov 2016 11:17:14 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1c5EVn-0003Te-Ny
	for qemu-devel@nongnu.org; Fri, 11 Nov 2016 11:17:10 -0500
Received: from hera.aquilenet.fr ([141.255.128.1]:41535)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1c5EVn-0003TT-Hk
	for qemu-devel@nongnu.org; Fri, 11 Nov 2016 11:17:07 -0500
Date: Fri, 11 Nov 2016 17:17:05 +0100
From: Samuel Thibault
Message-ID: <20161111161705.GE2417@var.home>
References: <95e79bc8-4547-b3b1-65b7-f641eb0c92f7@pobox.com>
	<20161104111419.GG9817@stefanha-x1.localdomain>
	<20161106180401.GE27308@var.home>
	<20161107104245.GC5036@stefanha-x1.localdomain>
	<466003bb-a2c4-bb9b-7b0b-7b2d6dcb16d7@pobox.com>
	<20161109112724.GC4682@stefanha-x1.localdomain>
	<02eee090-b017-dd4e-e63c-814d3d7beb72@pobox.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <02eee090-b017-dd4e-e63c-814d3d7beb72@pobox.com>
Subject: Re: [Qemu-devel] Crashing in tcp_close
To: Brian Candler
Cc: Stefan Hajnoczi , qemu-devel@nongnu.org, Jan Kiszka

Hello,

Brian Candler, on Fri 11 Nov 2016 16:02:44 +0000, wrote:
> Aha!! Looking carefully at valgrind output, I see some definite cases of
> use-after-free in tcp_output. Does the info below help?

Ok, that's interesting.
I however still don't see how that could happen :)

> ==18350== Invalid read of size 4
> ==18350==    at 0x550B5B: if_start (if.c:230)
> ==18350==    by 0x552E6C: ip_output (ip_output.c:85)
> ==18350==    by 0x55AA31: tcp_output (tcp_output.c:469)
> ==18350==    by 0x558FD7: tcp_input (tcp_input.c:1386)
> ==18350==    by 0x55543F: slirp_input (slirp.c:867)
> ==18350==    by 0x54AFBF: net_slirp_receive (slirp.c:118)
> ==18350==    by 0x540B18: nc_sendv_compat (net.c:701)
> ==18350==    by 0x540B18: qemu_deliver_packet_iov (net.c:728)
> ==18350==    by 0x5438DA: qemu_net_queue_deliver_iov (queue.c:179)
> ==18350==    by 0x5438DA: qemu_net_queue_send_iov (queue.c:224)
> ==18350==    by 0x36B428: virtio_net_flush_tx (virtio-net.c:1282)
> ==18350==    by 0x36B624: virtio_net_tx_bh (virtio-net.c:1387)
> ==18350==    by 0x5804EC: aio_bh_call (async.c:67)
> ==18350==    by 0x5804EC: aio_bh_poll (async.c:95)
> ==18350==    by 0x58A8FF: aio_dispatch (aio-posix.c:308)

Could you increase the value given to valgrind's --num-callers= so we
can be sure of the context of this call?  Here tcp_input gets the buffer
being freed below from the slirp->tcb list, and sofree happens to drop
it from that list before calling free...  I'm wondering whether we have
some kind of concurrency or recursion here.

> ==18350==  Address 0x9eabec4 is 340 bytes inside a block of size 432 free'd
> ==18350==    at 0x4C2EDEB: free (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==18350==    by 0x55B25E: tcp_close (tcp_subr.c:334)
> ==18350==    by 0x55C7AE: tcp_timers (tcp_timer.c:289)
> ==18350==    by 0x55C7AE: tcp_slowtimo (tcp_timer.c:89)
> ==18350==    by 0x555187: slirp_pollfds_poll (slirp.c:576)
> ==18350==    by 0x5891EB: main_loop_wait (main-loop.c:508)
> ==18350==    by 0x2F4430: main_loop (vl.c:1908)
> ==18350==    by 0x2F4430: main (vl.c:4604)

Samuel