qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups
@ 2012-02-29 19:15 Jan Kiszka
  2012-02-29 19:15 ` [Qemu-devel] [PATCH 1/4] slirp: Keep next_m always valid Jan Kiszka
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Jan Kiszka @ 2012-02-29 19:15 UTC
  To: qemu-devel; +Cc: Stefan Weil, Zhi Yong Wu, Fabien Chouteau, Michael S. Tsirkin

This is an alternative, more complete approach to fix the requeuing-
related crashes reported recently. See patch 2 for details. The rest are
simple cleanups.

Please check carefully if I messed something up.

CC: Fabien Chouteau <chouteau@adacore.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Stefan Weil <sw@weilnetz.de>
CC: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>

Jan Kiszka (4):
  slirp: Keep next_m always valid
  slirp: Fix queue walking in if_start
  slirp: Remove unneeded if_queued
  slirp: Cleanup resources on instance removal

 slirp/if.c       |   68 +++++++++++++++++++++++++++++++-----------------------
 slirp/ip_icmp.c  |    7 +++++
 slirp/ip_icmp.h  |    1 +
 slirp/ip_input.c |    7 +++++
 slirp/mbuf.c     |   21 ++++++++++++++++
 slirp/mbuf.h     |    1 +
 slirp/slirp.c    |   10 +++----
 slirp/slirp.h    |    3 +-
 slirp/tcp_subr.c |    7 +++++
 slirp/udp.c      |    8 ++++++
 slirp/udp.h      |    1 +
 11 files changed, 98 insertions(+), 36 deletions(-)

-- 
1.7.3.4

* [Qemu-devel] [PATCH 1/4] slirp: Keep next_m always valid
  2012-02-29 19:15 [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups Jan Kiszka
@ 2012-02-29 19:15 ` Jan Kiszka
  2012-02-29 19:15 ` [Qemu-devel] [PATCH 2/4] slirp: Fix queue walking in if_start Jan Kiszka
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2012-02-29 19:15 UTC
  To: qemu-devel; +Cc: Stefan Weil, Zhi Yong Wu, Fabien Chouteau

Make sure that next_m always points to a packet if batchq is non-empty.
This will simplify walking the queues in if_start.
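
For illustration, the invariant can be sketched in isolation. This is a
minimal sketch, not slirp code: struct node, struct queues and
enqueue_batch are hypothetical stand-ins, and it assumes a circular
doubly linked list whose head doubles as the "nothing pending" value of
next_m, as if_batchq does in the diff below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf's queue links. */
struct node {
    struct node *next, *prev;
};

/* Hypothetical stand-in for the relevant Slirp fields. */
struct queues {
    struct node batchq;   /* list head; also the "nothing pending" marker */
    struct node *next_m;  /* next packet to send, or &batchq if none */
};

static void queues_init(struct queues *q)
{
    q->batchq.next = q->batchq.prev = &q->batchq;
    q->next_m = &q->batchq;
}

/* Append n at the tail of batchq and keep next_m valid. */
static void enqueue_batch(struct queues *q, struct node *n)
{
    struct node *tail = q->batchq.prev;

    assert(n != &q->batchq);
    n->prev = tail;
    n->next = &q->batchq;
    tail->next = n;
    q->batchq.prev = n;

    /* Set next_m if it did not point at a packet so far. */
    if (q->next_m == &q->batchq) {
        q->next_m = n;
    }
}
```

With this in place a consumer can read next_m directly and only has to
compare it against the list head to detect that nothing is pending.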

CC: Fabien Chouteau <chouteau@adacore.com>
CC: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
CC: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 slirp/if.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/slirp/if.c b/slirp/if.c
index 33f08e1..166852a 100644
--- a/slirp/if.c
+++ b/slirp/if.c
@@ -96,8 +96,13 @@ if_output(struct socket *so, struct mbuf *ifm)
 			ifs_insque(ifm, ifq->ifs_prev);
 			goto diddit;
 		}
-	} else
+        } else {
 		ifq = slirp->if_batchq.ifq_prev;
+                /* Set next_m if the queue was empty so far */
+                if (slirp->next_m == &slirp->if_batchq) {
+                    slirp->next_m = ifm;
+                }
+        }
 
 	/* Create a new doubly linked list for this session */
 	ifm->ifq_so = so;
@@ -170,13 +175,8 @@ void if_start(Slirp *slirp)
         if (slirp->if_fastq.ifq_next != &slirp->if_fastq) {
             ifm = slirp->if_fastq.ifq_next;
         } else {
-            /* Nothing on fastq, see if next_m is valid */
-            if (slirp->next_m != &slirp->if_batchq) {
-                ifm = slirp->next_m;
-            } else {
-                ifm = slirp->if_batchq.ifq_next;
-            }
-
+            /* Nothing on fastq, pick up from batchq via next_m */
+            ifm = slirp->next_m;
             from_batchq = true;
         }
 
-- 
1.7.3.4

* [Qemu-devel] [PATCH 2/4] slirp: Fix queue walking in if_start
  2012-02-29 19:15 [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups Jan Kiszka
  2012-02-29 19:15 ` [Qemu-devel] [PATCH 1/4] slirp: Keep next_m always valid Jan Kiszka
@ 2012-02-29 19:15 ` Jan Kiszka
  2012-02-29 19:15 ` [Qemu-devel] [PATCH 3/4] slirp: Remove unneeded if_queued Jan Kiszka
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2012-02-29 19:15 UTC
  To: qemu-devel; +Cc: Stefan Weil, Zhi Yong Wu, Fabien Chouteau

Another attempt to get this right: We need to carefully walk both the
fastq and the batchq in if_start while trying to send packets to
possibly not yet resolved hosts on the virtual network.

So far we just requeued a delayed packet where it was and then restarted
walking the queues from the top, which could not work. Now we
precalculate the next packet in the queue so that the current one can
safely be removed if it was sent successfully. We also need to take into
account that the next packet comes from the same session if the current
one was sent, or from another session if it was not.
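
The core idea, remembering the successor before the current element may
be unlinked, can be sketched on a single queue. This is a simplified
illustration under stated assumptions: struct pkt, its ready flag and
walk_once are hypothetical, and the per-session sublists and the
fastq/batchq switch handled by the real patch are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet with queue links. */
struct pkt {
    struct pkt *next, *prev;
    bool ready;   /* assumption: true if the packet can be sent now */
    bool sent;    /* set by the walk once the packet has been sent */
};

static void list_init(struct pkt *head)
{
    head->next = head->prev = head;
}

static void list_append(struct pkt *head, struct pkt *p)
{
    p->prev = head->prev;
    p->next = head;
    head->prev->next = p;
    head->prev = p;
}

static void list_remove(struct pkt *p)
{
    assert(p->next != NULL && p->prev != NULL);
    p->prev->next = p->next;
    p->next->prev = p->prev;
    p->next = p->prev = NULL;
}

/* Walk the queue once; returns the number of packets sent. */
static int walk_once(struct pkt *head)
{
    struct pkt *p, *p_next;
    int sent = 0;

    for (p = head->next; p != head; p = p_next) {
        /* Remember the successor before p is possibly unlinked. */
        p_next = p->next;

        if (!p->ready) {
            continue;   /* delayed: stays queued where it is */
        }
        list_remove(p);
        p->sent = true;
        sent++;
    }
    return sent;
}
```

A delayed packet keeps its position, and the walk carries on with the
precomputed successor instead of starting over from the head.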

CC: Fabien Chouteau <chouteau@adacore.com>
CC: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
CC: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 slirp/if.c |   50 ++++++++++++++++++++++++++++++++++----------------
 1 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/slirp/if.c b/slirp/if.c
index 166852a..78a9b78 100644
--- a/slirp/if.c
+++ b/slirp/if.c
@@ -158,26 +158,41 @@ void if_start(Slirp *slirp)
 {
     uint64_t now = qemu_get_clock_ns(rt_clock);
     int requeued = 0;
-    bool from_batchq = false;
-    struct mbuf *ifm, *ifqt;
+    bool from_batchq, from_batchq_next;
+    struct mbuf *ifm, *ifm_next, *ifqt;
 
     DEBUG_CALL("if_start");
 
-    while (slirp->if_queued) {
+    if (slirp->if_fastq.ifq_next != &slirp->if_fastq) {
+        ifm_next = slirp->if_fastq.ifq_next;
+        from_batchq_next = false;
+    } else if (slirp->next_m != &slirp->if_batchq) {
+        /* Nothing on fastq, pick up from batchq via next_m */
+        ifm_next = slirp->next_m;
+        from_batchq_next = true;
+    } else {
+        ifm_next = NULL;
+    }
+
+    while (ifm_next) {
         /* check if we can really output */
-        if (!slirp_can_output(slirp->opaque))
+        if (!slirp_can_output(slirp->opaque)) {
             return;
+        }
 
-        /*
-         * See which queue to get next packet from
-         * If there's something in the fastq, select it immediately
-         */
-        if (slirp->if_fastq.ifq_next != &slirp->if_fastq) {
-            ifm = slirp->if_fastq.ifq_next;
-        } else {
-            /* Nothing on fastq, pick up from batchq via next_m */
-            ifm = slirp->next_m;
-            from_batchq = true;
+        ifm = ifm_next;
+        from_batchq = from_batchq_next;
+
+        ifm_next = ifm->ifq_next;
+        if (!from_batchq) {
+            if (ifm_next == &slirp->if_fastq) {
+                /* No more packets in fastq, switch to batchq */
+                ifm_next = slirp->next_m;
+                from_batchq_next = true;
+            }
+        } else if (ifm_next == &slirp->if_batchq) {
+            /* end of batchq */
+            ifm_next = NULL;
         }
 
         slirp->if_queued--;
@@ -189,7 +204,7 @@ void if_start(Slirp *slirp)
             continue;
         }
 
-        if (from_batchq) {
+        if (ifm == slirp->next_m) {
             /* Set which packet to send on next iteration */
             slirp->next_m = ifm->ifq_next;
         }
@@ -202,6 +217,10 @@ void if_start(Slirp *slirp)
         if (ifm->ifs_next != ifm) {
             insque(ifm->ifs_next, ifqt);
             ifs_remque(ifm);
+            /* Also update ifm_next to point to this next session packet,
+             * same for from_batchq_next */
+            ifm_next = ifm->ifs_next;
+            from_batchq_next = from_batchq;
         }
 
         /* Update so_queued */
@@ -211,7 +230,6 @@ void if_start(Slirp *slirp)
         }
 
         m_free(ifm);
-
     }
 
     slirp->if_queued = requeued;
-- 
1.7.3.4

* [Qemu-devel] [PATCH 3/4] slirp: Remove unneeded if_queued
  2012-02-29 19:15 [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups Jan Kiszka
  2012-02-29 19:15 ` [Qemu-devel] [PATCH 1/4] slirp: Keep next_m always valid Jan Kiszka
  2012-02-29 19:15 ` [Qemu-devel] [PATCH 2/4] slirp: Fix queue walking in if_start Jan Kiszka
@ 2012-02-29 19:15 ` Jan Kiszka
  2012-02-29 19:15 ` [Qemu-devel] [PATCH 4/4] slirp: Cleanup resources on instance removal Jan Kiszka
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2012-02-29 19:15 UTC
  To: qemu-devel

There is now a trivial check on entry of if_start for pending packets,
so we can drop the additional tracking via if_queued.
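
As a rough sketch of why the counter becomes redundant: once the queue
state itself tells whether anything is pending, a separate count only
duplicates that information. The types and the has_pending helper below
are hypothetical; the condition mirrors the entry check that patch 2
adds to if_start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *next, *prev;
};

/* Hypothetical stand-in for the relevant Slirp fields. */
struct queues {
    struct node fastq;
    struct node batchq;
    struct node *next_m;   /* &batchq while no batch packet is pending */
};

static void queues_init(struct queues *q)
{
    q->fastq.next = q->fastq.prev = &q->fastq;
    q->batchq.next = q->batchq.prev = &q->batchq;
    q->next_m = &q->batchq;
}

static void append(struct node *head, struct node *n)
{
    assert(n != head);
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void enqueue_fast(struct queues *q, struct node *n)
{
    append(&q->fastq, n);
}

static void enqueue_batch(struct queues *q, struct node *n)
{
    append(&q->batchq, n);
    if (q->next_m == &q->batchq) {
        q->next_m = n;
    }
}

/* True if a walk would find at least one packet; no counter needed. */
static bool has_pending(const struct queues *q)
{
    return q->fastq.next != &q->fastq || q->next_m != &q->batchq;
}
```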

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 slirp/if.c    |    8 --------
 slirp/slirp.c |    7 +------
 slirp/slirp.h |    1 -
 3 files changed, 1 insertions(+), 15 deletions(-)

diff --git a/slirp/if.c b/slirp/if.c
index 78a9b78..90bf398 100644
--- a/slirp/if.c
+++ b/slirp/if.c
@@ -110,8 +110,6 @@ if_output(struct socket *so, struct mbuf *ifm)
 	insque(ifm, ifq);
 
 diddit:
-	slirp->if_queued++;
-
 	if (so) {
 		/* Update *_queued */
 		so->so_queued++;
@@ -157,7 +155,6 @@ diddit:
 void if_start(Slirp *slirp)
 {
     uint64_t now = qemu_get_clock_ns(rt_clock);
-    int requeued = 0;
     bool from_batchq, from_batchq_next;
     struct mbuf *ifm, *ifm_next, *ifqt;
 
@@ -195,12 +192,9 @@ void if_start(Slirp *slirp)
             ifm_next = NULL;
         }
 
-        slirp->if_queued--;
-
         /* Try to send packet unless it already expired */
         if (ifm->expiration_date >= now && !if_encap(slirp, ifm)) {
             /* Packet is delayed due to pending ARP resolution */
-            requeued++;
             continue;
         }
 
@@ -231,6 +225,4 @@ void if_start(Slirp *slirp)
 
         m_free(ifm);
     }
-
-    slirp->if_queued = requeued;
 }
diff --git a/slirp/slirp.c b/slirp/slirp.c
index 19d69eb..bcffc34 100644
--- a/slirp/slirp.c
+++ b/slirp/slirp.c
@@ -581,12 +581,7 @@ void slirp_select_poll(fd_set *readfds, fd_set *writefds, fd_set *xfds,
                 }
 	}
 
-	/*
-	 * See if we can start outputting
-	 */
-	if (slirp->if_queued) {
-	    if_start(slirp);
-	}
+        if_start(slirp);
     }
 
 	/* clear global file descriptor sets.
diff --git a/slirp/slirp.h b/slirp/slirp.h
index 28a5c03..950eccd 100644
--- a/slirp/slirp.h
+++ b/slirp/slirp.h
@@ -235,7 +235,6 @@ struct Slirp {
     int mbuf_alloced;
 
     /* if states */
-    int if_queued;          /* number of packets queued so far */
     struct mbuf if_fastq;   /* fast queue (for interactive data) */
     struct mbuf if_batchq;  /* queue for non-interactive data */
     struct mbuf *next_m;    /* pointer to next mbuf to output */
-- 
1.7.3.4

* [Qemu-devel] [PATCH 4/4] slirp: Cleanup resources on instance removal
  2012-02-29 19:15 [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups Jan Kiszka
                   ` (2 preceding siblings ...)
  2012-02-29 19:15 ` [Qemu-devel] [PATCH 3/4] slirp: Remove unneeded if_queued Jan Kiszka
@ 2012-02-29 19:15 ` Jan Kiszka
  2012-02-29 19:19 ` [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups Jan Kiszka
  2012-02-29 21:00 ` Stefan Weil
  5 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2012-02-29 19:15 UTC
  To: qemu-devel; +Cc: Michael S. Tsirkin

Close and free sockets when shutting down a slirp instance, and release
all buffers.
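
The cleanup functions in the diff below share one pattern: keep
detaching the first element until the list head points back at itself.
A minimal sketch of that pattern, with a hypothetical struct sock and
helpers; it assumes, as the patch does for udp_detach, icmp_detach and
tcp_close, that detaching unlinks the element, since otherwise the loop
would never terminate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket on a per-protocol list. */
struct sock {
    struct sock *next, *prev;
};

static void list_init(struct sock *head)
{
    head->next = head->prev = head;
}

/* Allocate a socket and link it in right after the head. */
static struct sock *sock_attach(struct sock *head)
{
    struct sock *s = malloc(sizeof(*s));

    if (s == NULL) {
        return NULL;
    }
    s->next = head->next;
    s->prev = head;
    head->next->prev = s;
    head->next = s;
    return s;
}

/* Unlink a socket from its list and release it. */
static void sock_detach(struct sock *s)
{
    assert(s->next != NULL && s->prev != NULL);
    s->prev->next = s->next;
    s->next->prev = s->prev;
    free(s);
}

/* Drain the list; returns the number of sockets released. */
static int list_cleanup(struct sock *head)
{
    int released = 0;

    /* Re-read head->next each time: detaching changes the list. */
    while (head->next != head) {
        sock_detach(head->next);
        released++;
    }
    return released;
}
```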

CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 slirp/ip_icmp.c  |    7 +++++++
 slirp/ip_icmp.h  |    1 +
 slirp/ip_input.c |    7 +++++++
 slirp/mbuf.c     |   21 +++++++++++++++++++++
 slirp/mbuf.h     |    1 +
 slirp/slirp.c    |    3 +++
 slirp/slirp.h    |    2 ++
 slirp/tcp_subr.c |    7 +++++++
 slirp/udp.c      |    8 ++++++++
 slirp/udp.h      |    1 +
 10 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/slirp/ip_icmp.c b/slirp/ip_icmp.c
index 5dbf21d..d571fd0 100644
--- a/slirp/ip_icmp.c
+++ b/slirp/ip_icmp.c
@@ -66,6 +66,13 @@ void icmp_init(Slirp *slirp)
     slirp->icmp_last_so = &slirp->icmp;
 }
 
+void icmp_cleanup(Slirp *slirp)
+{
+    while (slirp->icmp.so_next != &slirp->icmp) {
+        icmp_detach(slirp->icmp.so_next);
+    }
+}
+
 static int icmp_send(struct socket *so, struct mbuf *m, int hlen)
 {
     struct ip *ip = mtod(m, struct ip *);
diff --git a/slirp/ip_icmp.h b/slirp/ip_icmp.h
index b3da1f2..1a1af91 100644
--- a/slirp/ip_icmp.h
+++ b/slirp/ip_icmp.h
@@ -154,6 +154,7 @@ struct icmp {
 	(type) == ICMP_MASKREQ || (type) == ICMP_MASKREPLY)
 
 void icmp_init(Slirp *slirp);
+void icmp_cleanup(Slirp *slirp);
 void icmp_input(struct mbuf *, int);
 void icmp_error(struct mbuf *msrc, u_char type, u_char code, int minsize,
                 const char *message);
diff --git a/slirp/ip_input.c b/slirp/ip_input.c
index c7b3eb4..ce24faf 100644
--- a/slirp/ip_input.c
+++ b/slirp/ip_input.c
@@ -61,6 +61,13 @@ ip_init(Slirp *slirp)
     icmp_init(slirp);
 }
 
+void ip_cleanup(Slirp *slirp)
+{
+    udp_cleanup(slirp);
+    tcp_cleanup(slirp);
+    icmp_cleanup(slirp);
+}
+
 /*
  * Ip input routine.  Checksum and byte swap header.  If fragmented
  * try to reassemble.  Process options.  Pass to next level.
diff --git a/slirp/mbuf.c b/slirp/mbuf.c
index c699c75..4fefb04 100644
--- a/slirp/mbuf.c
+++ b/slirp/mbuf.c
@@ -32,6 +32,27 @@ m_init(Slirp *slirp)
     slirp->m_usedlist.m_next = slirp->m_usedlist.m_prev = &slirp->m_usedlist;
 }
 
+void m_cleanup(Slirp *slirp)
+{
+    struct mbuf *m, *next;
+
+    m = slirp->m_usedlist.m_next;
+    while (m != &slirp->m_usedlist) {
+        next = m->m_next;
+        if (m->m_flags & M_EXT) {
+            free(m->m_ext);
+        }
+        free(m);
+        m = next;
+    }
+    m = slirp->m_freelist.m_next;
+    while (m != &slirp->m_freelist) {
+        next = m->m_next;
+        free(m);
+        m = next;
+    }
+}
+
 /*
  * Get an mbuf from the free list, if there are none
  * malloc one
diff --git a/slirp/mbuf.h b/slirp/mbuf.h
index 8d7951f..3f3ab09 100644
--- a/slirp/mbuf.h
+++ b/slirp/mbuf.h
@@ -116,6 +116,7 @@ struct mbuf {
 					 * it rather than putting it on the free list */
 
 void m_init(Slirp *);
+void m_cleanup(Slirp *slirp);
 struct mbuf * m_get(Slirp *);
 void m_free(struct mbuf *);
 void m_cat(register struct mbuf *, register struct mbuf *);
diff --git a/slirp/slirp.c b/slirp/slirp.c
index bcffc34..1502830 100644
--- a/slirp/slirp.c
+++ b/slirp/slirp.c
@@ -246,6 +246,9 @@ void slirp_cleanup(Slirp *slirp)
 
     unregister_savevm(NULL, "slirp", slirp);
 
+    ip_cleanup(slirp);
+    m_cleanup(slirp);
+
     g_free(slirp->tftp_prefix);
     g_free(slirp->bootp_filename);
     g_free(slirp);
diff --git a/slirp/slirp.h b/slirp/slirp.h
index 950eccd..013a3b3 100644
--- a/slirp/slirp.h
+++ b/slirp/slirp.h
@@ -314,6 +314,7 @@ void if_output(struct socket *, struct mbuf *);
 
 /* ip_input.c */
 void ip_init(Slirp *);
+void ip_cleanup(Slirp *);
 void ip_input(struct mbuf *);
 void ip_slowtimo(Slirp *);
 void ip_stripoptions(register struct mbuf *, struct mbuf *);
@@ -331,6 +332,7 @@ void tcp_setpersist(register struct tcpcb *);
 
 /* tcp_subr.c */
 void tcp_init(Slirp *);
+void tcp_cleanup(Slirp *);
 void tcp_template(struct tcpcb *);
 void tcp_respond(struct tcpcb *, register struct tcpiphdr *, register struct mbuf *, tcp_seq, tcp_seq, int);
 struct tcpcb * tcp_newtcpcb(struct socket *);
diff --git a/slirp/tcp_subr.c b/slirp/tcp_subr.c
index 143a238..6f6585a 100644
--- a/slirp/tcp_subr.c
+++ b/slirp/tcp_subr.c
@@ -55,6 +55,13 @@ tcp_init(Slirp *slirp)
     slirp->tcp_last_so = &slirp->tcb;
 }
 
+void tcp_cleanup(Slirp *slirp)
+{
+    while (slirp->tcb.so_next != &slirp->tcb) {
+        tcp_close(sototcpcb(slirp->tcb.so_next));
+    }
+}
+
 /*
  * Create template to be used to send tcp packets on a connection.
  * Call after host entry created, fills
diff --git a/slirp/udp.c b/slirp/udp.c
index 5b060f3..ced5096 100644
--- a/slirp/udp.c
+++ b/slirp/udp.c
@@ -49,6 +49,14 @@ udp_init(Slirp *slirp)
     slirp->udb.so_next = slirp->udb.so_prev = &slirp->udb;
     slirp->udp_last_so = &slirp->udb;
 }
+
+void udp_cleanup(Slirp *slirp)
+{
+    while (slirp->udb.so_next != &slirp->udb) {
+        udp_detach(slirp->udb.so_next);
+    }
+}
+
 /* m->m_data  points at ip packet header
  * m->m_len   length ip packet
  * ip->ip_len length data (IPDU)
diff --git a/slirp/udp.h b/slirp/udp.h
index 9b5c3cf..9bf31fe 100644
--- a/slirp/udp.h
+++ b/slirp/udp.h
@@ -74,6 +74,7 @@ struct udpiphdr {
 struct mbuf;
 
 void udp_init(Slirp *);
+void udp_cleanup(Slirp *);
 void udp_input(register struct mbuf *, int);
 int udp_output(struct socket *, struct mbuf *, struct sockaddr_in *);
 int udp_attach(struct socket *);
-- 
1.7.3.4

* Re: [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups
  2012-02-29 19:15 [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups Jan Kiszka
                   ` (3 preceding siblings ...)
  2012-02-29 19:15 ` [Qemu-devel] [PATCH 4/4] slirp: Cleanup resources on instance removal Jan Kiszka
@ 2012-02-29 19:19 ` Jan Kiszka
  2012-02-29 21:00 ` Stefan Weil
  5 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2012-02-29 19:19 UTC
  Cc: Stefan Weil, Zhi Yong Wu, qemu-devel, Fabien Chouteau,
	Michael S. Tsirkin

On 2012-02-29 20:15, Jan Kiszka wrote:
> This is an alternative, more complete approach to fix the requeuing-
> related crashes reported recently. See patch 2 for details. The rest are
> simple cleanups.
> 
> Please check carefully if I messed something up.

Oops, outdated intro. Should have been:

"Well, this requeuing bug seems to have a long breath. Previous attempts
to fix it (mine included) neglected the fact that we need to walk the
queue of pending packets, not just restart from the beginning after a
requeue. This version should get it Right(TM).

This also comes with a fix for resource cleanups on slirp shutdown. At
least valgrind is happy now.

Reviews welcome!"

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

* Re: [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups
  2012-02-29 19:15 [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups Jan Kiszka
                   ` (4 preceding siblings ...)
  2012-02-29 19:19 ` [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups Jan Kiszka
@ 2012-02-29 21:00 ` Stefan Weil
  2012-02-29 21:33   ` Jan Kiszka
  5 siblings, 1 reply; 10+ messages in thread
From: Stefan Weil @ 2012-02-29 21:00 UTC
  To: Jan Kiszka; +Cc: Zhi Yong Wu, qemu-devel, Fabien Chouteau, Michael S. Tsirkin

On 29.02.2012 20:15, Jan Kiszka wrote:
> This is an alternative, more complete approach to fix the requeuing-
> related crashes reported recently. See patch 2 for details. The rest are
> simple cleanups.
>
> Please check carefully if I messed something up.
>

Hi Jan,

here is the result of MIPS Malta with your patch series applied:

Program received signal SIGSEGV, Segmentation fault.
0x000055555577db5b in slirp_remque (a=0x555556cff360) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/misc.c:39
39        ((struct quehead *)(element->qh_rlink))->qh_link = element->qh_link;
(gdb) i s
#0  0x000055555577db5b in slirp_remque (a=0x555556cff360) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/misc.c:39
#1  0x000055555577b7a2 in if_start (slirp=0x5555564bfb80) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/if.c:208
#2  0x000055555577b607 in if_output (so=0x555556ea0b70, ifm=0x555556cff9e0) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/if.c:139
#3  0x000055555577d040 in ip_output (so=0x555556ea0b70, m0=0x555556cff9e0) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/ip_output.c:84
#4  0x00005555557865d6 in tcp_output (tp=0x555556ea0c20) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/tcp_output.c:456
#5  0x000055555577ff5a in slirp_select_poll (readfds=0x7fffffffda10, writefds=0x7fffffffda90, xfds=0x7fffffffdb10, select_error=0)
     at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/slirp.c:480
#6  0x000055555572d8c0 in main_loop_wait (nonblocking=0) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/main-loop.c:469
#7  0x0000555555721a61 in main_loop () at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/vl.c:1558
#8  0x00005555557284a2 in main (argc=25, argv=0x7fffffffdfe8, envp=0x7fffffffe0b8) at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/vl.c:3667
(gdb) p element
$1 = (struct quehead *) 0x555556cff360
(gdb) p *element
$2 = {qh_link = 0x555556cff360, qh_rlink = 0x0}
(gdb) p (struct quehead *)(element->qh_rlink)
$3 = (struct quehead *) 0x0
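
The gdb output above pins down the immediate fault: line 39 writes
through element->qh_rlink, and that pointer is NULL. How the element got
into this state is not determined here. The following is only a sketch
of the failure mode, using a two-pointer struct quehead modeled on the
fields gdb printed; q_init, q_insert_after and unlink_checked are
hypothetical helpers, not slirp functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct quehead {
    struct quehead *qh_link;   /* forward link */
    struct quehead *qh_rlink;  /* backward link */
};

static void q_init(struct quehead *head)
{
    head->qh_link = head->qh_rlink = head;
}

/* Link e into the list right after pos. */
static void q_insert_after(struct quehead *e, struct quehead *pos)
{
    assert(pos->qh_link != NULL);
    e->qh_link = pos->qh_link;
    e->qh_rlink = pos;
    pos->qh_link->qh_rlink = e;
    pos->qh_link = e;
}

/*
 * Unlink e only if both links are set. Returns false for an element
 * that is not on a list, where an unchecked unlink would write
 * through a NULL pointer.
 */
static bool unlink_checked(struct quehead *e)
{
    if (e->qh_link == NULL || e->qh_rlink == NULL) {
        return false;
    }
    e->qh_link->qh_rlink = e->qh_rlink;
    e->qh_rlink->qh_link = e->qh_link;
    e->qh_link = e->qh_rlink = NULL;
    return true;
}
```

An element in the state shown by gdb (forward link pointing at itself,
backward link NULL) is rejected by such a check instead of faulting.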

Cheers,

Stefan

* Re: [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups
  2012-02-29 21:00 ` Stefan Weil
@ 2012-02-29 21:33   ` Jan Kiszka
  2012-02-29 21:48     ` Stefan Weil
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kiszka @ 2012-02-29 21:33 UTC
  To: Stefan Weil; +Cc: Zhi Yong Wu, qemu-devel, Fabien Chouteau, Michael S. Tsirkin

On 2012-02-29 22:00, Stefan Weil wrote:
> On 29.02.2012 20:15, Jan Kiszka wrote:
>> This is an alternative, more complete approach to fix the requeuing-
>> related crashes reported recently. See patch 2 for details. The rest are
>> simple cleanups.
>>
>> Please check carefully if I messed something up.
>>
> 
> Hi Jan,
> 
> here is the result of MIPS Malta with your patch series applied:
> 

Hmm. Two options:

 - you try to debug what happens to that mbuf, why its queue anchors
   get corrupted (maybe while in if_encap?)
 - you tell me how to reproduce it (image file, host characteristics)

Jan


* Re: [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups
  2012-02-29 21:33   ` Jan Kiszka
@ 2012-02-29 21:48     ` Stefan Weil
  2012-02-29 21:52       ` Jan Kiszka
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Weil @ 2012-02-29 21:48 UTC
  To: Jan Kiszka; +Cc: Zhi Yong Wu, qemu-devel, Fabien Chouteau, Michael S. Tsirkin

On 29.02.2012 22:33, Jan Kiszka wrote:
> On 2012-02-29 22:00, Stefan Weil wrote:
>> On 29.02.2012 20:15, Jan Kiszka wrote:
>>> This is an alternative, more complete approach to fix the requeuing-
>>> related crashes reported recently. See patch 2 for details. The rest are
>>> simple cleanups.
>>>
>>> Please check carefully if I messed something up.
>>>
>>
>> Hi Jan,
>>
>> here is the result of MIPS Malta with your patch series applied:
>>
>
> Hmm. Two options:
>
> - you try to debug what happens to that mbuf, why its queue anchors
> get corrupted (maybe while in if_encap?)
> - you tell me how to reproduce it (image file, host characteristics)
>
> Jan

I'm afraid the first option won't happen this week or next,
for lack of time.

This is my test environment:

Debian Squeeze x86_64 host, Debian Squeeze mips guest.

I use NFS root, and the latest crash happened during boot.
All other crashes happened after the guest had booted,
when I started apt-get update, so booting from a
Debian CDROM might also reproduce the crash.

I compiled QEMU with a default configuration, but used
CFLAGS=-g (no optimization) and started QEMU like this:

gdb --args 
/home/stefan/src/qemu/repo.or.cz/qemu/ar7/bin/debug/x86/mips-softmmu/qemu-system-mips 
--kernel /tftpboot/malta/boot/vmlinux-2.6.26-2-4kc-malta --initrd 
/tftpboot/malta/boot/initrd.img-2.6.26-2-4kc-malta --append "debug 
nohz=off root=/dev/nfs rw ip=::::malta::dhcp 
nfsroot=10.0.2.2:/tftpboot/malta -bootp abc -tftp /tftpboot/malta" -M 
malta --cpu 4KEc -m 256 --net nic,model=pcnet --net user,hostname=malta 
--redir tcp:5800::5800 --redir tcp:5900::5900 --redir tcp:10022::22 
--redir tcp:10080::80

Kernel and initrd are from Debian Squeeze (mips).

I had no slirp problems with that test environment during the last two 
years.

Regards,

Stefan W.

* Re: [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups
  2012-02-29 21:48     ` Stefan Weil
@ 2012-02-29 21:52       ` Jan Kiszka
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2012-02-29 21:52 UTC
  To: Stefan Weil; +Cc: Zhi Yong Wu, qemu-devel, Fabien Chouteau, Michael S. Tsirkin

On 2012-02-29 22:48, Stefan Weil wrote:
> On 29.02.2012 22:33, Jan Kiszka wrote:
>> On 2012-02-29 22:00, Stefan Weil wrote:
>>> On 29.02.2012 20:15, Jan Kiszka wrote:
>>>> This is an alternative, more complete approach to fix the requeuing-
>>>> related crashes reported recently. See patch 2 for details. The rest
>>>> are
>>>> simple cleanups.
>>>>
>>>> Please check carefully if I messed something up.
>>>>
>>>
>>> Hi Jan,
>>>
>>> here is the result of MIPS Malta with your patch series applied:
>>>
>>
>> Hmm. Two options:
>>
>> - you try to debug what happens to that mbuf, why its queue anchors
>> get corrupted (maybe while in if_encap?)
>> - you tell me how to reproduce it (image file, host characteristics)
>>
>> Jan
> 
> I'm afraid the first option won't happen this week or next,
> for lack of time.
> 
> This is my test environment:
> 
> Debian Squeeze x86_64 host, Debian Squeeze mips guest.
> 
> I use NFS root, and the latest crash happened during boot.
> All other crashes happened after the guest had booted,
> when I started apt-get update, so booting from a
> Debian CDROM might also reproduce the crash.
> 
> I compiled QEMU with a default configuration, but used
> CFLAGS=-g (no optimization) and started QEMU like this:
> 
> gdb --args
> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/bin/debug/x86/mips-softmmu/qemu-system-mips
> --kernel /tftpboot/malta/boot/vmlinux-2.6.26-2-4kc-malta --initrd
> /tftpboot/malta/boot/initrd.img-2.6.26-2-4kc-malta --append "debug
> nohz=off root=/dev/nfs rw ip=::::malta::dhcp
> nfsroot=10.0.2.2:/tftpboot/malta -bootp abc -tftp /tftpboot/malta" -M
> malta --cpu 4KEc -m 256 --net nic,model=pcnet --net user,hostname=malta
> --redir tcp:5800::5800 --redir tcp:5900::5900 --redir tcp:10022::22
> --redir tcp:10080::80
> 
> Kernel and initrd are from Debian Squeeze (mips).

OK, thanks.

Here is a last shot (on top of my queue) before I try to reproduce:

diff --git a/slirp/if.c b/slirp/if.c
index 90bf398..d3bdf58 100644
--- a/slirp/if.c
+++ b/slirp/if.c
@@ -181,13 +181,12 @@ void if_start(Slirp *slirp)
         from_batchq = from_batchq_next;
 
         ifm_next = ifm->ifq_next;
-        if (!from_batchq) {
-            if (ifm_next == &slirp->if_fastq) {
-                /* No more packets in fastq, switch to batchq */
-                ifm_next = slirp->next_m;
-                from_batchq_next = true;
-            }
-        } else if (ifm_next == &slirp->if_batchq) {
+        if (ifm_next == &slirp->if_fastq) {
+            /* No more packets in fastq, switch to batchq */
+            ifm_next = slirp->next_m;
+            from_batchq_next = true;
+        }
+        if (ifm_next == &slirp->if_batchq) {
             /* end of batchq */
             ifm_next = NULL;
         }
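
One difference between the two versions can be read directly off the
diff above: with the nested form, a walk that runs off the end of fastq
picks up next_m without checking whether it is the batchq head, so an
empty batchq hands the list head to the loop as if it were a packet.
The flattened form catches that case. Whether this is the path behind
the reported crash is not established in this thread. A small sketch,
with hypothetical types and helper names, and without the
from_batchq_next bookkeeping of the real code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *next, *prev;
};

/* Hypothetical stand-in for the relevant Slirp fields. */
struct queues {
    struct node fastq;
    struct node batchq;
    struct node *next_m;   /* &batchq while no batch packet is pending */
};

static void queues_init(struct queues *q)
{
    q->fastq.next = q->fastq.prev = &q->fastq;
    q->batchq.next = q->batchq.prev = &q->batchq;
    q->next_m = &q->batchq;
}

static void append(struct node *head, struct node *n)
{
    assert(n != head);
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Nested form: end-of-batchq is only tested for batchq packets. */
static struct node *next_nested(struct queues *q, struct node *cur,
                                bool from_batchq)
{
    struct node *next = cur->next;

    if (!from_batchq) {
        if (next == &q->fastq) {
            next = q->next_m;   /* may be &q->batchq if batchq is empty */
        }
    } else if (next == &q->batchq) {
        next = NULL;
    }
    return next;
}

/* Flattened form: both tests always run, in this order. */
static struct node *next_flat(struct queues *q, struct node *cur)
{
    struct node *next = cur->next;

    if (next == &q->fastq) {
        next = q->next_m;       /* end of fastq, switch to batchq */
    }
    if (next == &q->batchq) {
        next = NULL;            /* end of batchq, or batchq empty */
    }
    return next;
}
```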

> 
> I had no slirp problems with that test environment during the last two
> years.

Yes, these regressions are unfortunate. I hope we can resolve them
quickly.

Jan

