Message-ID: <4F396611.8070103@siemens.com>
Date: Mon, 13 Feb 2012 20:35:45 +0100
From: Jan Kiszka
MIME-Version: 1.0
References: <20120212183407.GA4534@redhat.com> <4F381FE4.3050009@web.de>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] slirp-related crash
To: Zhi Yong Wu
Cc: Stefan Hajnoczi, qemu-devel@nongnu.org, Fabien Chouteau, "Michael S. Tsirkin"

On 2012-02-13 16:27, Zhi Yong Wu wrote:
> On Mon, Feb 13, 2012 at 4:24 AM, Jan Kiszka wrote:
>> On 2012-02-12 19:34, Michael S. Tsirkin wrote:
>>> It seems somewhat easy to crash qemu with slirp if we queue multiple packets.
>>> I didn't investigate further yet so I don't know if this
>>> is a regression. Anyone knowledgeable about slirp wants to take a look?
>>>
>>> /home/mst/qemu-test/bin/qemu-system-x86_64 -enable-kvm -m 1G -drive
>>> file=/home/mst/rhel6.qcow2 -netdev user,id=bar -net
>>> nic,netdev=bar,model=e1000,macaddr=52:54:00:12:34:57 -redir
>>> tcp:8022::22 -vnc :1 -monitor stdio
>>>
>>> While guest is booting, quickly do this
>>>
>>> ssh localhost -p 8022
>>> CTRL-C
>>> ssh localhost -p 8022
>>> CTRL-C
>>> ssh localhost -p 8022
>>> CTRL-C
>>> ssh localhost -p 8022
>>> CTRL-C
>>
>> Confirmed. A single canceled connection prior the interface setup is
>> enough. Possibly something is not properly removed / cleaned up here.
>> Will see if I find some time to debug, can't promise.
> Interesting thing, pls give me some time, and i am trying to debug this issue.

I had a look today, but haven't found a fix yet. The problem is related
to our requeuing of packets if the target MAC is not yet known.
Something goes terribly wrong once it gets resolved (mbuf use after
release?). Maybe it was always wrong and the requeuing just surfaced the
bug, dunno.

After staring at the code for a while, I got the bad feeling of
"unfixable with reasonable effort". The queuing code is horrible (well,
like most of slirp), and the requeuing just made it worse. But maybe I'm
just missing some trick now - yet another one that would make the code
even more unreadable...

I'm inclined to suggest a slirp rewrite (base support, not all features
at once) as a GSOC project. QEMU really deserves something better.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux