Message-ID: <4F4E9E21.1050105@web.de>
Date: Wed, 29 Feb 2012 22:52:33 +0100
From: Jan Kiszka
References: <4F4E9208.6020207@weilnetz.de> <4F4E99BC.50109@web.de> <4F4E9D1C.3070707@weilnetz.de>
In-Reply-To: <4F4E9D1C.3070707@weilnetz.de>
Subject: Re: [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups
To: Stefan Weil
Cc: Zhi Yong Wu, qemu-devel@nongnu.org, Fabien Chouteau, "Michael S. Tsirkin"

On 2012-02-29 22:48, Stefan Weil wrote:
> On 29.02.2012 22:33, Jan Kiszka wrote:
>> On 2012-02-29 22:00, Stefan Weil wrote:
>>> On 29.02.2012 20:15, Jan Kiszka wrote:
>>>> This is an alternative, more complete approach to fix the
>>>> requeuing-related crashes reported recently. See patch 2 for
>>>> details. The rest are simple cleanups.
>>>>
>>>> Please check carefully if I messed something up.
>>>>
>>>
>>> Hi Jan,
>>>
>>> here is the result of MIPS Malta with your patch series applied:
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> 0x000055555577db5b in slirp_remque (a=0x555556cff360) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/misc.c:39
>>> 39          ((struct quehead *)(element->qh_rlink))->qh_link = element->qh_link;
>>> (gdb) i s
>>> #0  0x000055555577db5b in slirp_remque (a=0x555556cff360) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/misc.c:39
>>> #1  0x000055555577b7a2 in if_start (slirp=0x5555564bfb80) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/if.c:208
>>> #2  0x000055555577b607 in if_output (so=0x555556ea0b70, ifm=0x555556cff9e0) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/if.c:139
>>> #3  0x000055555577d040 in ip_output (so=0x555556ea0b70, m0=0x555556cff9e0) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/ip_output.c:84
>>> #4  0x00005555557865d6 in tcp_output (tp=0x555556ea0c20) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/tcp_output.c:456
>>> #5  0x000055555577ff5a in slirp_select_poll (readfds=0x7fffffffda10, writefds=0x7fffffffda90, xfds=0x7fffffffdb10, select_error=0) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/slirp.c:480
>>> #6  0x000055555572d8c0 in main_loop_wait (nonblocking=0) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/main-loop.c:469
>>> #7  0x0000555555721a61 in main_loop () at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/vl.c:1558
>>> #8  0x00005555557284a2 in main (argc=25, argv=0x7fffffffdfe8, envp=0x7fffffffe0b8) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/vl.c:3667
>>> (gdb) p element
>>> $1 = (struct quehead *) 0x555556cff360
>>> (gdb) p *element
>>> $2 = {qh_link = 0x555556cff360, qh_rlink = 0x0}
>>> (gdb) p (struct quehead *)(element->qh_rlink)
>>> $3 = (struct quehead *) 0x0
>>
>> Hmm.
>> Two options:
>>
>> - you try to debug what happens to that mbuf, why its queue anchors
>>   get corrupted (maybe while in if_encap?)
>> - you tell me how to reproduce it (image file, host characteristics)
>>
>> Jan
>
> I'm afraid that the first variant won't happen this or next week
> because of lack of time.
>
> This is my test environment:
>
> Debian Squeeze x86_64 host, Debian Squeeze mips guest.
>
> I use NFS root, and the latest crash happened during boot.
> All other crashes happened after the guest had booted
> when I started apt-get update, so maybe booting from a
> Debian CDROM might also reproduce the crash.
>
> I compiled QEMU with a default configuration, but used
> CFLAGS=-g (no optimization) and started QEMU like this:
>
> gdb --args
> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/bin/debug/x86/mips-softmmu/qemu-system-mips
> --kernel /tftpboot/malta/boot/vmlinux-2.6.26-2-4kc-malta --initrd
> /tftpboot/malta/boot/initrd.img-2.6.26-2-4kc-malta --append "debug
> nohz=off root=/dev/nfs rw ip=::::malta::dhcp
> nfsroot=10.0.2.2:/tftpboot/malta -bootp abc -tftp /tftpboot/malta" -M
> malta --cpu 4KEc -m 256 --net nic,model=pcnet --net user,hostname=malta
> --redir tcp:5800::5800 --redir tcp:5900::5900 --redir tcp:10022::22
> --redir tcp:10080::80
>
> Kernel and initrd are from Debian Squeeze (mips).

OK, thanks.
Here is a last shot (on top of my queue) before I try to
reproduce:

diff --git a/slirp/if.c b/slirp/if.c
index 90bf398..d3bdf58 100644
--- a/slirp/if.c
+++ b/slirp/if.c
@@ -181,13 +181,12 @@ void if_start(Slirp *slirp)
         from_batchq = from_batchq_next;
 
         ifm_next = ifm->ifq_next;
-        if (!from_batchq) {
-            if (ifm_next == &slirp->if_fastq) {
-                /* No more packets in fastq, switch to batchq */
-                ifm_next = slirp->next_m;
-                from_batchq_next = true;
-            }
-        } else if (ifm_next == &slirp->if_batchq) {
+        if (ifm_next == &slirp->if_fastq) {
+            /* No more packets in fastq, switch to batchq */
+            ifm_next = slirp->next_m;
+            from_batchq_next = true;
+        }
+        if (ifm_next == &slirp->if_batchq) {
             /* end of batchq */
             ifm_next = NULL;
         }

>
> I had no slirp problems with that test environment during the last two
> years.

Yes, these regressions are unfortunate. Hope we can resolve them
quickly.

Jan
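
The gdb session quoted earlier in this message shows an element whose
qh_link points at itself and whose qh_rlink is NULL, with the fault
raised on the line in slirp_remque that dereferences qh_rlink. The
following is a minimal sketch in C of that failure mode, not the actual
slirp/misc.c code. Only the struct name and its two field names are
taken from the backtrace; queue_init, queue_insert, queue_is_linked and
queue_remove_checked are hypothetical helpers introduced purely for
illustration, and the sketch assumes a circular doubly linked queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative queue node. The field names follow the gdb output;
 * everything else in this sketch is an assumption. */
struct quehead {
    struct quehead *qh_link;   /* next element */
    struct quehead *qh_rlink;  /* previous element */
};

/* Make head an empty circular queue: both links point at head itself. */
void queue_init(struct quehead *head)
{
    head->qh_link = head;
    head->qh_rlink = head;
}

/* Insert element directly after head. */
void queue_insert(struct quehead *element, struct quehead *head)
{
    element->qh_link = head->qh_link;
    element->qh_rlink = head;
    head->qh_link->qh_rlink = element;
    head->qh_link = element;
}

/* Hypothetical guard: true only if both neighbours exist and each of
 * them points back at element. */
bool queue_is_linked(const struct quehead *element)
{
    return element->qh_link != NULL && element->qh_rlink != NULL &&
           element->qh_link->qh_rlink == element &&
           element->qh_rlink->qh_link == element;
}

/* Unlink element from its queue. Returns false, without touching any
 * pointer, when the links are NULL or inconsistent, instead of
 * dereferencing them. */
bool queue_remove_checked(struct quehead *element)
{
    if (!queue_is_linked(element)) {
        return false;
    }
    element->qh_link->qh_rlink = element->qh_rlink;
    element->qh_rlink->qh_link = element->qh_link;
    element->qh_link = NULL;
    element->qh_rlink = NULL;
    return true;
}
```

With these helpers, an element in the state printed by gdb (qh_link
equal to the element itself, qh_rlink equal to NULL) is rejected by
queue_remove_checked rather than causing a segmentation fault. A check
like this would only hide the symptom; it does not answer the question
raised above of why the queue anchors of that mbuf get corrupted.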