oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
@ 2010-07-08  8:22 Tejun Heo
  2010-07-11  2:36 ` David Miller
  2010-07-11 16:09 ` Ilpo Järvinen
  0 siblings, 2 replies; 29+ messages in thread
From: Tejun Heo @ 2010-07-08  8:22 UTC (permalink / raw)
  To: David S. Miller, lkml, netdev@vger.kernel.org
  Cc: Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: text/plain, Size: 1970 bytes --]

Hello,

We've been seeing oops in tcp_xmit_retransmit_queue() w/ 2.6.32.15.
Please see the attached photoshoot.  This is happening on a HPC
cluster and very interestingly caused by one particular job.  How long
it takes isn't clear yet (at least more than a day) but when it
happens it happens on a lot of machines in relatively short time.

With a bit of disassemblying, I've found that the oops is happening
during tcp_for_write_queue_from() because the skb->next points to
NULL.

 void tcp_xmit_retransmit_queue(struct sock *sk)
 {
 ...
	if (tp->retransmit_skb_hint) {
		skb = tp->retransmit_skb_hint;
		last_lost = TCP_SKB_CB(skb)->end_seq;
		if (after(last_lost, tp->retransmit_high))
			last_lost = tp->retransmit_high;
	} else {
		skb = tcp_write_queue_head(sk);
		last_lost = tp->snd_una;
	}

 =>	tcp_for_write_queue_from(skb, sk) {
		 __u8 sacked = TCP_SKB_CB(skb)->sacked;

		 if (skb == tcp_send_head(sk))
			 break;
		 /* we could do better than to assign each time */
		 if (hole == NULL)

This can happen for one of the following reasons,

1. tp->retransmit_skb_hint is NULL and tcp_write_queue_head() is NULL
   too.  ie. tcp_xmit_retransmit_queue() is called on an empty write
   queue for some reason.

2. tp->retransmit_skb_hint is pointing to a skb which is not on the
   write_queue.  ie. somebody forgot to update hint while removing the
   skb from the write queue.

3. The hint is pointing to a skb on the list but the list itself is
   corrupt.

I added some debug code and the crash is happening when
tp->retransmit_skb_hint is not NULL but tp->retransmit_skb_hint->next
is NULL.  So, #1 is out; unfortunately, I didn't have debug code in
place to discern between #2 and #3.

Does anything ring a bell?  This is a production system and debugging
affects quite a number of people.  I can put debug code in to discern
between #2 and #3 but I'm basically shooting in the dark and it would
be great if someone has a better idea.

Thanks.

-- 
tejun

[-- Attachment #2: oops.jpg --]
[-- Type: image/jpeg, Size: 92585 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-08  8:22 oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15 Tejun Heo
@ 2010-07-11  2:36 ` David Miller
  2010-07-11 16:09 ` Ilpo Järvinen
  1 sibling, 0 replies; 29+ messages in thread
From: David Miller @ 2010-07-11  2:36 UTC (permalink / raw)
  To: tj; +Cc: linux-kernel, netdev, henning.fehrmann, carsten.aulbert

From: Tejun Heo <tj@kernel.org>
Date: Thu, 08 Jul 2010 10:22:02 +0200

> We've been seeing oops in tcp_xmit_retransmit_queue() w/ 2.6.32.15.
 ...
> Does anything ring a bell?

A long time ago we had a packet scheduler bug that could corrupt
the TCP socket queues, but that was fixed in 2.6.27 so would
definitely be fixed in your kernel.

--------------------
commit 69747650c814a8a79fef412c7416adf823293a3e
Author: David S. Miller <davem@davemloft.net>
Date:   Sun Aug 17 23:55:36 2008 -0700

    pkt_sched: Fix return value corruption in HTB and TBF.
    
    Based upon a bug report by Josip Rodin.
    
    Packet schedulers should only return NET_XMIT_DROP iff
    the packet really was dropped.  If the packet does reach
    the device after we return NET_XMIT_DROP then TCP can
    crash because it depends upon the enqueue path return
    values being accurate.
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
--------------------

Nothing else jumps to mind, sorry.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-08  8:22 oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15 Tejun Heo
  2010-07-11  2:36 ` David Miller
@ 2010-07-11 16:09 ` Ilpo Järvinen
  2010-07-11 17:06   ` Eric Dumazet
  2010-07-15 11:58   ` Lennart Schulte
  1 sibling, 2 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-11 16:09 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David S. Miller, lkml, netdev@vger.kernel.org, Fehrmann, Henning,
	Carsten Aulbert, Eric Dumazet

On Thu, 8 Jul 2010, Tejun Heo wrote:

> We've been seeing oops in tcp_xmit_retransmit_queue() w/ 2.6.32.15.
> Please see the attached photoshoot.  This is happening on a HPC
> cluster and very interestingly caused by one particular job.  How long
> it takes isn't clear yet (at least more than a day) but when it
> happens it happens on a lot of machines in relatively short time.
> 
> With a bit of disassemblying, I've found that the oops is happening
> during tcp_for_write_queue_from() because the skb->next points to
> NULL.
> 
>  void tcp_xmit_retransmit_queue(struct sock *sk)
>  {
>  ...
> 	if (tp->retransmit_skb_hint) {
> 		skb = tp->retransmit_skb_hint;
> 		last_lost = TCP_SKB_CB(skb)->end_seq;
> 		if (after(last_lost, tp->retransmit_high))
> 			last_lost = tp->retransmit_high;
> 	} else {
> 		skb = tcp_write_queue_head(sk);
> 		last_lost = tp->snd_una;
> 	}
> 
>  =>	tcp_for_write_queue_from(skb, sk) {
> 		 __u8 sacked = TCP_SKB_CB(skb)->sacked;
> 
> 		 if (skb == tcp_send_head(sk))
> 			 break;
> 		 /* we could do better than to assign each time */
> 		 if (hole == NULL)
> 
> This can happen for one of the following reasons,
> 
> 1. tp->retransmit_skb_hint is NULL and tcp_write_queue_head() is NULL
>    too.  ie. tcp_xmit_retransmit_queue() is called on an empty write
>    queue for some reason.
> 
> 2. tp->retransmit_skb_hint is pointing to a skb which is not on the
>    write_queue.  ie. somebody forgot to update hint while removing the
>    skb from the write queue.

Once again I've read the unlinkers through, and only thing that could 
cause this is tcp_send_synack (others do deal with the hints) but I think 
Eric already proposed a patch to that but we never got anywhere due to 
some counterargument why it wouldn't take place (too far away for me to 
remember, see archives about the discussions). ...But if you want be dead 
sure some WARN_ON there might not hurt. Also the purging of the whole 
queue was a similar suspect I then came across (but that would only 
materialize with sk reuse happening e.g., with nfs which the other guys 
weren't using).

> 3. The hint is pointing to a skb on the list but the list itself is
>    corrupt.
> 
> I added some debug code and the crash is happening when
> tp->retransmit_skb_hint is not NULL but tp->retransmit_skb_hint->next
> is NULL.  So, #1 is out; unfortunately, I didn't have debug code in
> place to discern between #2 and #3.
> 
> Does anything ring a bell?  This is a production system and debugging
> affects quite a number of people.  I can put debug code in to discern
> between #2 and #3 but I'm basically shooting in the dark and it would
> be great if someone has a better idea.

Thanks for taking this up. I've been kind of waiting somebody to show up 
who actually has some way of reproducing it. Once I had one guy in the 
hook but his ability to reproduce was for some reason lost when he tried 
with a debug patch [1]. 

I now realize that the debug patch should probably also print the write 
queue too when the problem is caught in order to discern the cases you 
mention.

Something along these lines:

tcp_for_write_queue(skb, sk) {
	printk("skb %p (%u-%u) next %p prev %p sacked %u\n", ...);
}

Anyway, my debugging patch should be such that in a lucky case it avoids 
crashing the system too, though price to pay might then be a stuck 
connection. In case #3 I'd expect the box to die elsewhere in TCP code 
pretty soon anyway so it depends whether avoiding oops is really so 
useful, but if you're lucky other mechanism in TCP will recover 
the lost one for you (basically RTO driven retransmission).

-- 
 i.

[1] http://marc.info/?l=linux-kernel&m=126624014117610&w=2

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-11 16:09 ` Ilpo Järvinen
@ 2010-07-11 17:06   ` Eric Dumazet
  2010-07-11 17:46     ` Eric Dumazet
  2010-07-15 11:58   ` Lennart Schulte
  1 sibling, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2010-07-11 17:06 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Tejun Heo, David S. Miller, lkml, netdev@vger.kernel.org,
	Fehrmann, Henning, Carsten Aulbert

Le dimanche 11 juillet 2010 à 19:09 +0300, Ilpo Järvinen a écrit :
> On Thu, 8 Jul 2010, Tejun Heo wrote:
> 
> > We've been seeing oops in tcp_xmit_retransmit_queue() w/ 2.6.32.15.
> > Please see the attached photoshoot.  This is happening on a HPC
> > cluster and very interestingly caused by one particular job.  How long
> > it takes isn't clear yet (at least more than a day) but when it
> > happens it happens on a lot of machines in relatively short time.
> > 
> > With a bit of disassemblying, I've found that the oops is happening
> > during tcp_for_write_queue_from() because the skb->next points to
> > NULL.
> > 
> >  void tcp_xmit_retransmit_queue(struct sock *sk)
> >  {
> >  ...
> > 	if (tp->retransmit_skb_hint) {
> > 		skb = tp->retransmit_skb_hint;
> > 		last_lost = TCP_SKB_CB(skb)->end_seq;
> > 		if (after(last_lost, tp->retransmit_high))
> > 			last_lost = tp->retransmit_high;
> > 	} else {
> > 		skb = tcp_write_queue_head(sk);
> > 		last_lost = tp->snd_una;
> > 	}
> > 
> >  =>	tcp_for_write_queue_from(skb, sk) {
> > 		 __u8 sacked = TCP_SKB_CB(skb)->sacked;
> > 
> > 		 if (skb == tcp_send_head(sk))
> > 			 break;
> > 		 /* we could do better than to assign each time */
> > 		 if (hole == NULL)
> > 
> > This can happen for one of the following reasons,
> > 
> > 1. tp->retransmit_skb_hint is NULL and tcp_write_queue_head() is NULL
> >    too.  ie. tcp_xmit_retransmit_queue() is called on an empty write
> >    queue for some reason.
> > 
> > 2. tp->retransmit_skb_hint is pointing to a skb which is not on the
> >    write_queue.  ie. somebody forgot to update hint while removing the
> >    skb from the write queue.
> 
> Once again I've read the unlinkers through, and only thing that could 
> cause this is tcp_send_synack (others do deal with the hints) but I think 
> Eric already proposed a patch to that but we never got anywhere due to 
> some counterargument why it wouldn't take place (too far away for me to 
> remember, see archives about the discussions). ...But if you want be dead 
> sure some WARN_ON there might not hurt. Also the purging of the whole 
> queue was a similar suspect I then came across (but that would only 
> materialize with sk reuse happening e.g., with nfs which the other guys 
> weren't using).
> 

Hmm.

This sounds familiar to me, but I cannot remember the discussion you
mention or the patch.

Or maybe it was the TCP transaction thing ? (including data in SYN or
SYN-ACK packet)

> > 3. The hint is pointing to a skb on the list but the list itself is
> >    corrupt.
> > 
> > I added some debug code and the crash is happening when
> > tp->retransmit_skb_hint is not NULL but tp->retransmit_skb_hint->next
> > is NULL.  So, #1 is out; unfortunately, I didn't have debug code in
> > place to discern between #2 and #3.
> > 
> > Does anything ring a bell?  This is a production system and debugging
> > affects quite a number of people.  I can put debug code in to discern
> > between #2 and #3 but I'm basically shooting in the dark and it would
> > be great if someone has a better idea.
> 
> Thanks for taking this up. I've been kind of waiting somebody to show up 
> who actually has some way of reproducing it. Once I had one guy in the 
> hook but his ability to reproduce was for some reason lost when he tried 
> with a debug patch [1]. 
> 
> I now realize that the debug patch should probably also print the write 
> queue too when the problem is caught in order to discern the cases you 
> mention.
> 
> Something along these lines:
> 
> tcp_for_write_queue(skb, sk) {
> 	printk("skb %p (%u-%u) next %p prev %p sacked %u\n", ...);
> }
> 
> Anyway, my debugging patch should be such that in a lucky case it avoids 
> crashing the system too, though price to pay might then be a stuck 
> connection. In case #3 I'd expect the box to die elsewhere in TCP code 
> pretty soon anyway so it depends whether avoiding oops is really so 
> useful, but if you're lucky other mechanism in TCP will recover 
> the lost one for you (basically RTO driven retransmission).
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-11 17:06   ` Eric Dumazet
@ 2010-07-11 17:46     ` Eric Dumazet
  2010-07-11 18:29       ` Eric Dumazet
  0 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2010-07-11 17:46 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Tejun Heo, David S. Miller, lkml, netdev@vger.kernel.org,
	Fehrmann, Henning, Carsten Aulbert

Le dimanche 11 juillet 2010 à 19:06 +0200, Eric Dumazet a écrit :
> Le dimanche 11 juillet 2010 à 19:09 +0300, Ilpo Järvinen a écrit :
> > On Thu, 8 Jul 2010, Tejun Heo wrote:
> > 
> > > We've been seeing oops in tcp_xmit_retransmit_queue() w/ 2.6.32.15.
> > > Please see the attached photoshoot.  This is happening on a HPC
> > > cluster and very interestingly caused by one particular job.  How long
> > > it takes isn't clear yet (at least more than a day) but when it
> > > happens it happens on a lot of machines in relatively short time.
> > > 
> > > With a bit of disassemblying, I've found that the oops is happening
> > > during tcp_for_write_queue_from() because the skb->next points to
> > > NULL.
> > > 
> > >  void tcp_xmit_retransmit_queue(struct sock *sk)
> > >  {
> > >  ...
> > > 	if (tp->retransmit_skb_hint) {
> > > 		skb = tp->retransmit_skb_hint;
> > > 		last_lost = TCP_SKB_CB(skb)->end_seq;
> > > 		if (after(last_lost, tp->retransmit_high))
> > > 			last_lost = tp->retransmit_high;
> > > 	} else {
> > > 		skb = tcp_write_queue_head(sk);
> > > 		last_lost = tp->snd_una;
> > > 	}
> > > 
> > >  =>	tcp_for_write_queue_from(skb, sk) {
> > > 		 __u8 sacked = TCP_SKB_CB(skb)->sacked;
> > > 
> > > 		 if (skb == tcp_send_head(sk))
> > > 			 break;
> > > 		 /* we could do better than to assign each time */
> > > 		 if (hole == NULL)
> > > 
> > > This can happen for one of the following reasons,
> > > 
> > > 1. tp->retransmit_skb_hint is NULL and tcp_write_queue_head() is NULL
> > >    too.  ie. tcp_xmit_retransmit_queue() is called on an empty write
> > >    queue for some reason.
> > > 
> > > 2. tp->retransmit_skb_hint is pointing to a skb which is not on the
> > >    write_queue.  ie. somebody forgot to update hint while removing the
> > >    skb from the write queue.
> > 
> > Once again I've read the unlinkers through, and only thing that could 
> > cause this is tcp_send_synack (others do deal with the hints) but I think 
> > Eric already proposed a patch to that but we never got anywhere due to 
> > some counterargument why it wouldn't take place (too far away for me to 
> > remember, see archives about the discussions). ...But if you want be dead 
> > sure some WARN_ON there might not hurt. Also the purging of the whole 
> > queue was a similar suspect I then came across (but that would only 
> > materialize with sk reuse happening e.g., with nfs which the other guys 
> > weren't using).
> > 
> 
> Hmm.
> 
> This sounds familiar to me, but I cannot remember the discussion you
> mention or the patch.
> 
> Or maybe it was the TCP transaction thing ? (including data in SYN or
> SYN-ACK packet)

Hmm, I cannot find where we reset restransmit_skb_hint in
tcp_mtu_probe(), if we call tcp_unlink_write_queue().

if (skb->len <= copy) {
	/* We've eaten all the data from this skb.
	 * Throw it away. */
	TCP_SKB_CB(nskb)->flags |= TCP_SKB_CB(skb)->flags;
<<>>	tcp_unlink_write_queue(skb, sk);
	sk_wmem_free_skb(sk, skb);
} else {


Sorry if this was already discussed. We might add a comment here in anycase ;)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-11 17:46     ` Eric Dumazet
@ 2010-07-11 18:29       ` Eric Dumazet
  2010-07-11 19:22         ` Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2010-07-11 18:29 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Tejun Heo, David S. Miller, lkml, netdev@vger.kernel.org,
	Fehrmann, Henning, Carsten Aulbert

Le dimanche 11 juillet 2010 à 19:46 +0200, Eric Dumazet a écrit :
> Le dimanche 11 juillet 2010 à 19:06 +0200, Eric Dumazet a écrit :
> > Le dimanche 11 juillet 2010 à 19:09 +0300, Ilpo Järvinen a écrit :
> > > On Thu, 8 Jul 2010, Tejun Heo wrote:
> > > 
> > > > We've been seeing oops in tcp_xmit_retransmit_queue() w/ 2.6.32.15.
> > > > Please see the attached photoshoot.  This is happening on a HPC
> > > > cluster and very interestingly caused by one particular job.  How long
> > > > it takes isn't clear yet (at least more than a day) but when it
> > > > happens it happens on a lot of machines in relatively short time.
> > > > 
> > > > With a bit of disassemblying, I've found that the oops is happening
> > > > during tcp_for_write_queue_from() because the skb->next points to
> > > > NULL.
> > > > 
> > > >  void tcp_xmit_retransmit_queue(struct sock *sk)
> > > >  {
> > > >  ...
> > > > 	if (tp->retransmit_skb_hint) {
> > > > 		skb = tp->retransmit_skb_hint;
> > > > 		last_lost = TCP_SKB_CB(skb)->end_seq;
> > > > 		if (after(last_lost, tp->retransmit_high))
> > > > 			last_lost = tp->retransmit_high;
> > > > 	} else {
> > > > 		skb = tcp_write_queue_head(sk);
> > > > 		last_lost = tp->snd_una;
> > > > 	}
> > > > 
> > > >  =>	tcp_for_write_queue_from(skb, sk) {
> > > > 		 __u8 sacked = TCP_SKB_CB(skb)->sacked;
> > > > 
> > > > 		 if (skb == tcp_send_head(sk))
> > > > 			 break;
> > > > 		 /* we could do better than to assign each time */
> > > > 		 if (hole == NULL)
> > > > 
> > > > This can happen for one of the following reasons,
> > > > 
> > > > 1. tp->retransmit_skb_hint is NULL and tcp_write_queue_head() is NULL
> > > >    too.  ie. tcp_xmit_retransmit_queue() is called on an empty write
> > > >    queue for some reason.
> > > > 
> > > > 2. tp->retransmit_skb_hint is pointing to a skb which is not on the
> > > >    write_queue.  ie. somebody forgot to update hint while removing the
> > > >    skb from the write queue.
> > > 
> > > Once again I've read the unlinkers through, and only thing that could 
> > > cause this is tcp_send_synack (others do deal with the hints) but I think 
> > > Eric already proposed a patch to that but we never got anywhere due to 
> > > some counterargument why it wouldn't take place (too far away for me to 
> > > remember, see archives about the discussions). ...But if you want be dead 
> > > sure some WARN_ON there might not hurt. Also the purging of the whole 
> > > queue was a similar suspect I then came across (but that would only 
> > > materialize with sk reuse happening e.g., with nfs which the other guys 
> > > weren't using).
> > > 
> > 
> > Hmm.
> > 
> > This sounds familiar to me, but I cannot remember the discussion you
> > mention or the patch.
> > 
> > Or maybe it was the TCP transaction thing ? (including data in SYN or
> > SYN-ACK packet)
> 
> Hmm, I cannot find where we reset restransmit_skb_hint in
> tcp_mtu_probe(), if we call tcp_unlink_write_queue().
> 
> if (skb->len <= copy) {
> 	/* We've eaten all the data from this skb.
> 	 * Throw it away. */
> 	TCP_SKB_CB(nskb)->flags |= TCP_SKB_CB(skb)->flags;
> <<>>	tcp_unlink_write_queue(skb, sk);
> 	sk_wmem_free_skb(sk, skb);
> } else {
> 
> 
> Sorry if this was already discussed. We might add a comment here in anycase ;)
> 

Just in case, here is a patch for this issue, if Tejun wants to try it.

Thanks

[PATCH] tcp: tcp_mtu_probe() and retransmit hints

When removing an skb from tcp write queue, we must take care of various
hints that could be kept on this skb. tcp_mtu_probe() misses this
cleanup.

lkml reference : http://lkml.org/lkml/2010/7/8/63

Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b4ed957..187453f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1666,6 +1666,9 @@ static int tcp_mtu_probe(struct sock *sk)
 			 * Throw it away. */
 			TCP_SKB_CB(nskb)->flags |= TCP_SKB_CB(skb)->flags;
 			tcp_unlink_write_queue(skb, sk);
+			tcp_clear_retrans_hints_partial(tp);
+			if (skb == tp->retransmit_skb_hint)
+				tp->retransmit_skb_hint = nskb;
 			sk_wmem_free_skb(sk, skb);
 		} else {
 			TCP_SKB_CB(nskb)->flags |= TCP_SKB_CB(skb)->flags &

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-11 18:29       ` Eric Dumazet
@ 2010-07-11 19:22         ` Ilpo Järvinen
  2010-07-11 19:25           ` Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-11 19:22 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tejun Heo, David S. Miller, lkml, netdev@vger.kernel.org,
	Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5020 bytes --]

On Sun, 11 Jul 2010, Eric Dumazet wrote:

> Le dimanche 11 juillet 2010 à 19:46 +0200, Eric Dumazet a écrit :
> > Le dimanche 11 juillet 2010 à 19:06 +0200, Eric Dumazet a écrit :
> > > Le dimanche 11 juillet 2010 à 19:09 +0300, Ilpo Järvinen a écrit :
> > > > On Thu, 8 Jul 2010, Tejun Heo wrote:
> > > > 
> > > > > We've been seeing oops in tcp_xmit_retransmit_queue() w/ 2.6.32.15.
> > > > > Please see the attached photoshoot.  This is happening on a HPC
> > > > > cluster and very interestingly caused by one particular job.  How long
> > > > > it takes isn't clear yet (at least more than a day) but when it
> > > > > happens it happens on a lot of machines in relatively short time.
> > > > > 
> > > > > With a bit of disassemblying, I've found that the oops is happening
> > > > > during tcp_for_write_queue_from() because the skb->next points to
> > > > > NULL.
> > > > > 
> > > > >  void tcp_xmit_retransmit_queue(struct sock *sk)
> > > > >  {
> > > > >  ...
> > > > > 	if (tp->retransmit_skb_hint) {
> > > > > 		skb = tp->retransmit_skb_hint;
> > > > > 		last_lost = TCP_SKB_CB(skb)->end_seq;
> > > > > 		if (after(last_lost, tp->retransmit_high))
> > > > > 			last_lost = tp->retransmit_high;
> > > > > 	} else {
> > > > > 		skb = tcp_write_queue_head(sk);
> > > > > 		last_lost = tp->snd_una;
> > > > > 	}
> > > > > 
> > > > >  =>	tcp_for_write_queue_from(skb, sk) {
> > > > > 		 __u8 sacked = TCP_SKB_CB(skb)->sacked;
> > > > > 
> > > > > 		 if (skb == tcp_send_head(sk))
> > > > > 			 break;
> > > > > 		 /* we could do better than to assign each time */
> > > > > 		 if (hole == NULL)
> > > > > 
> > > > > This can happen for one of the following reasons,
> > > > > 
> > > > > 1. tp->retransmit_skb_hint is NULL and tcp_write_queue_head() is NULL
> > > > >    too.  ie. tcp_xmit_retransmit_queue() is called on an empty write
> > > > >    queue for some reason.
> > > > > 
> > > > > 2. tp->retransmit_skb_hint is pointing to a skb which is not on the
> > > > >    write_queue.  ie. somebody forgot to update hint while removing the
> > > > >    skb from the write queue.
> > > > 
> > > > Once again I've read the unlinkers through, and only thing that could 
> > > > cause this is tcp_send_synack (others do deal with the hints) but I think 
> > > > Eric already proposed a patch to that but we never got anywhere due to 
> > > > some counterargument why it wouldn't take place (too far away for me to 
> > > > remember, see archives about the discussions). ...But if you want be dead 
> > > > sure some WARN_ON there might not hurt. Also the purging of the whole 
> > > > queue was a similar suspect I then came across (but that would only 
> > > > materialize with sk reuse happening e.g., with nfs which the other guys 
> > > > weren't using).
> > > > 
> > > 
> > > Hmm.
> > > 
> > > This sounds familiar to me, but I cannot remember the discussion you
> > > mention or the patch.
> > > 
> > > Or maybe it was the TCP transaction thing ? (including data in SYN or
> > > SYN-ACK packet)

No. That's another thing. ...I've already found it with google today but 
cannot seem to find it again. I thought I used tcp_make_synack eric but 
for some reason I only get these transaction fix hits. I'll keep looking.

> > Hmm, I cannot find where we reset restransmit_skb_hint in
> > tcp_mtu_probe(), if we call tcp_unlink_write_queue().
> > 
> > if (skb->len <= copy) {
> > 	/* We've eaten all the data from this skb.
> > 	 * Throw it away. */
> > 	TCP_SKB_CB(nskb)->flags |= TCP_SKB_CB(skb)->flags;
> > <<>>	tcp_unlink_write_queue(skb, sk);
> > 	sk_wmem_free_skb(sk, skb);
> > } else {
> > 
> > 
> > Sorry if this was already discussed. We might add a comment here in anycase ;)
> > 
> 
> Just in case, here is a patch for this issue, if Tejun wants to try it.
> 
> Thanks
> 
> [PATCH] tcp: tcp_mtu_probe() and retransmit hints
> 
> When removing an skb from tcp write queue, we must take care of various
> hints that could be kept on this skb. tcp_mtu_probe() misses this
> cleanup.
>
> lkml reference : http://lkml.org/lkml/2010/7/8/63
> 
> Reported-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index b4ed957..187453f 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1666,6 +1666,9 @@ static int tcp_mtu_probe(struct sock *sk)
>  			 * Throw it away. */
>  			TCP_SKB_CB(nskb)->flags |= TCP_SKB_CB(skb)->flags;
>  			tcp_unlink_write_queue(skb, sk);
> +			tcp_clear_retrans_hints_partial(tp);
> +			if (skb == tp->retransmit_skb_hint)
> +				tp->retransmit_skb_hint = nskb;

...I think we've not sent that skb just yet, so this'll never be true (and 
the rexmit loop terminates at send_head without setting it so we should 
be safe, I'll still need to check that no other code can accidently move 
it to the send_head but I doubt it happens).

>  			sk_wmem_free_skb(sk, skb);
>  		} else {
>  			TCP_SKB_CB(nskb)->flags |= TCP_SKB_CB(skb)->flags &
> 
> 
> 

-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-11 19:22         ` Ilpo Järvinen
@ 2010-07-11 19:25           ` Ilpo Järvinen
  2010-07-11 19:44             ` Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-11 19:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tejun Heo, David S. Miller, lkml, netdev@vger.kernel.org,
	Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3511 bytes --]

On Sun, 11 Jul 2010, Ilpo Järvinen wrote:

> On Sun, 11 Jul 2010, Eric Dumazet wrote:
> 
> > Le dimanche 11 juillet 2010 à 19:46 +0200, Eric Dumazet a écrit :
> > > Le dimanche 11 juillet 2010 à 19:06 +0200, Eric Dumazet a écrit :
> > > > Le dimanche 11 juillet 2010 à 19:09 +0300, Ilpo Järvinen a écrit :
> > > > > On Thu, 8 Jul 2010, Tejun Heo wrote:
> > > > > 
> > > > > > We've been seeing oops in tcp_xmit_retransmit_queue() w/ 2.6.32.15.
> > > > > > Please see the attached photoshoot.  This is happening on a HPC
> > > > > > cluster and very interestingly caused by one particular job.  How long
> > > > > > it takes isn't clear yet (at least more than a day) but when it
> > > > > > happens it happens on a lot of machines in relatively short time.
> > > > > > 
> > > > > > With a bit of disassemblying, I've found that the oops is happening
> > > > > > during tcp_for_write_queue_from() because the skb->next points to
> > > > > > NULL.
> > > > > > 
> > > > > >  void tcp_xmit_retransmit_queue(struct sock *sk)
> > > > > >  {
> > > > > >  ...
> > > > > > 	if (tp->retransmit_skb_hint) {
> > > > > > 		skb = tp->retransmit_skb_hint;
> > > > > > 		last_lost = TCP_SKB_CB(skb)->end_seq;
> > > > > > 		if (after(last_lost, tp->retransmit_high))
> > > > > > 			last_lost = tp->retransmit_high;
> > > > > > 	} else {
> > > > > > 		skb = tcp_write_queue_head(sk);
> > > > > > 		last_lost = tp->snd_una;
> > > > > > 	}
> > > > > > 
> > > > > >  =>	tcp_for_write_queue_from(skb, sk) {
> > > > > > 		 __u8 sacked = TCP_SKB_CB(skb)->sacked;
> > > > > > 
> > > > > > 		 if (skb == tcp_send_head(sk))
> > > > > > 			 break;
> > > > > > 		 /* we could do better than to assign each time */
> > > > > > 		 if (hole == NULL)
> > > > > > 
> > > > > > This can happen for one of the following reasons,
> > > > > > 
> > > > > > 1. tp->retransmit_skb_hint is NULL and tcp_write_queue_head() is NULL
> > > > > >    too.  ie. tcp_xmit_retransmit_queue() is called on an empty write
> > > > > >    queue for some reason.
> > > > > > 
> > > > > > 2. tp->retransmit_skb_hint is pointing to a skb which is not on the
> > > > > >    write_queue.  ie. somebody forgot to update hint while removing the
> > > > > >    skb from the write queue.
> > > > > 
> > > > > Once again I've read the unlinkers through, and only thing that could 
> > > > > cause this is tcp_send_synack (others do deal with the hints) but I think 
> > > > > Eric already proposed a patch to that but we never got anywhere due to 
> > > > > some counterargument why it wouldn't take place (too far away for me to 
> > > > > remember, see archives about the discussions). ...But if you want be dead 
> > > > > sure some WARN_ON there might not hurt. Also the purging of the whole 
> > > > > queue was a similar suspect I then came across (but that would only 
> > > > > materialize with sk reuse happening e.g., with nfs which the other guys 
> > > > > weren't using).
> > > > > 
> > > > 
> > > > Hmm.
> > > > 
> > > > This sounds familiar to me, but I cannot remember the discussion you
> > > > mention or the patch.
> > > > 
> > > > Or maybe it was the TCP transaction thing ? (including data in SYN or
> > > > SYN-ACK packet)
> 
> No. That's another thing. ...I've already found it with google today but 
> cannot seem to find it again. I thought I used tcp_make_synack eric but 
> for some reason I only get these transaction fix hits. I'll keep looking.

Right, this one:

http://kerneltrap.org/mailarchive/linux-netdev/2009/10/29/6259073

-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-11 19:25           ` Ilpo Järvinen
@ 2010-07-11 19:44             ` Ilpo Järvinen
  0 siblings, 0 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-11 19:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric Dumazet, David S. Miller, lkml, netdev@vger.kernel.org,
	Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 4203 bytes --]

On Sun, 11 Jul 2010, Ilpo Järvinen wrote:

> On Sun, 11 Jul 2010, Ilpo Järvinen wrote:
> 
> > On Sun, 11 Jul 2010, Eric Dumazet wrote:
> > 
> > > Le dimanche 11 juillet 2010 à 19:46 +0200, Eric Dumazet a écrit :
> > > > Le dimanche 11 juillet 2010 à 19:06 +0200, Eric Dumazet a écrit :
> > > > > Le dimanche 11 juillet 2010 à 19:09 +0300, Ilpo Järvinen a écrit :
> > > > > > On Thu, 8 Jul 2010, Tejun Heo wrote:
> > > > > > 
> > > > > > > We've been seeing oops in tcp_xmit_retransmit_queue() w/ 2.6.32.15.
> > > > > > > Please see the attached photoshoot.  This is happening on a HPC
> > > > > > > cluster and very interestingly caused by one particular job.  How long
> > > > > > > it takes isn't clear yet (at least more than a day) but when it
> > > > > > > happens it happens on a lot of machines in relatively short time.
> > > > > > > 
> > > > > > > With a bit of disassemblying, I've found that the oops is happening
> > > > > > > during tcp_for_write_queue_from() because the skb->next points to
> > > > > > > NULL.
> > > > > > > 
> > > > > > >  void tcp_xmit_retransmit_queue(struct sock *sk)
> > > > > > >  {
> > > > > > >  ...
> > > > > > > 	if (tp->retransmit_skb_hint) {
> > > > > > > 		skb = tp->retransmit_skb_hint;
> > > > > > > 		last_lost = TCP_SKB_CB(skb)->end_seq;
> > > > > > > 		if (after(last_lost, tp->retransmit_high))
> > > > > > > 			last_lost = tp->retransmit_high;
> > > > > > > 	} else {
> > > > > > > 		skb = tcp_write_queue_head(sk);
> > > > > > > 		last_lost = tp->snd_una;
> > > > > > > 	}
> > > > > > > 
> > > > > > >  =>	tcp_for_write_queue_from(skb, sk) {
> > > > > > > 		 __u8 sacked = TCP_SKB_CB(skb)->sacked;
> > > > > > > 
> > > > > > > 		 if (skb == tcp_send_head(sk))
> > > > > > > 			 break;
> > > > > > > 		 /* we could do better than to assign each time */
> > > > > > > 		 if (hole == NULL)
> > > > > > > 
> > > > > > > This can happen for one of the following reasons,
> > > > > > > 
> > > > > > > 1. tp->retransmit_skb_hint is NULL and tcp_write_queue_head() is NULL
> > > > > > >    too.  ie. tcp_xmit_retransmit_queue() is called on an empty write
> > > > > > >    queue for some reason.
> > > > > > > 
> > > > > > > 2. tp->retransmit_skb_hint is pointing to a skb which is not on the
> > > > > > >    write_queue.  ie. somebody forgot to update hint while removing the
> > > > > > >    skb from the write queue.
> > > > > > 
> > > > > > Once again I've read the unlinkers through, and only thing that could 
> > > > > > cause this is tcp_send_synack (others do deal with the hints) but I think 
> > > > > > Eric already proposed a patch to that but we never got anywhere due to 
> > > > > > some counterargument why it wouldn't take place (too far away for me to 
> > > > > > remember, see archives about the discussions). ...But if you want be dead 
> > > > > > sure some WARN_ON there might not hurt. Also the purging of the whole 
> > > > > > queue was a similar suspect I then came across (but that would only 
> > > > > > materialize with sk reuse happening e.g., with nfs which the other guys 
> > > > > > weren't using).
> > > > > > 
> > > > > 
> > > > > Hmm.
> > > > > 
> > > > > This sounds familiar to me, but I cannot remember the discussion you
> > > > > mention or the patch.
> > > > > 
> > > > > Or maybe it was the TCP transaction thing ? (including data in SYN or
> > > > > SYN-ACK packet)
> > 
> > No. That's another thing. ...I've already found it with google today but 
> > cannot seem to find it again. I thought I used tcp_make_synack eric but 
> > for some reason I only get these transaction fix hits. I'll keep looking.
> 
> Right, this one:
> 
> http://kerneltrap.org/mailarchive/linux-netdev/2009/10/29/6259073

Hmm, another idea... It might be useful to try to disable 
tcp_retrans_try_collapse in tcp_retransmit_skb as a test. I think that it 
might be possible that tcp_xmit_retransmit_queue might end up holding 
a stale reference either in hole or skb. Kind of shot into the dark still, 
no actual theory on how that could happen but that 
tcp_xmit_retransmit_queue logic is rather tricky because of returning to 
the first hole if such exists so that I couldn't immediately rule out the 
possibility.

-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-11 16:09 ` Ilpo Järvinen
  2010-07-11 17:06   ` Eric Dumazet
@ 2010-07-15 11:58   ` Lennart Schulte
  2010-07-15 12:05     ` Eric Dumazet
  1 sibling, 1 reply; 29+ messages in thread
From: Lennart Schulte @ 2010-07-15 11:58 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Tejun Heo, David S. Miller, lkml, netdev@vger.kernel.org,
	Fehrmann, Henning, Carsten Aulbert, Eric Dumazet

I'm testing new reordering algorithms in a virtual testbed, that is the 
nodes are emulated with xen and all the network parameters can be tuned 
with queues.
With one of the algorithms I also got tracebacks which include 
tcp_xmit_retransmit_queue. It only happens with ACK loss. The kernel 
version however is 2.6.31.
When I read this thread I tried the debug patch and got the following:

[ 2754.413150] NULL head, pkts 0
[ 2754.413156] Errors caught so far 1

Hope that is of any help.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-15 11:58   ` Lennart Schulte
@ 2010-07-15 12:05     ` Eric Dumazet
  2010-07-15 12:55       ` Lennart Schulte
  0 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2010-07-15 12:05 UTC (permalink / raw)
  To: Lennart Schulte
  Cc: Ilpo Järvinen, Tejun Heo, David S. Miller, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

Le jeudi 15 juillet 2010 à 13:58 +0200, Lennart Schulte a écrit :
> I'm testing new reordering algorithms in a virtual testbed, that is the 
> nodes are emulated with xen and all the network parameters can be tuned 
> with queues.
> With one of the algorithms I also got tracebacks which include 
> tcp_xmit_retransmit_queue. It only happens with ACK loss. The kernel 
> version however is 2.6.31.
> When I read this thread I tried the debug patch and got the following:
> 
> [ 2754.413150] NULL head, pkts 0
> [ 2754.413156] Errors caught so far 1
> 
> Hope that is of any help.

Not sure I understand. 

Are you saying you reproduce same tcp_xmit_retransmit_queue()  bug, with
a special tc qdisc/class droppping some ACK frames ?

Could it be some sched problem and incorrect return codes in case of
congestion ?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-15 12:05     ` Eric Dumazet
@ 2010-07-15 12:55       ` Lennart Schulte
  2010-07-16 12:02         ` Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: Lennart Schulte @ 2010-07-15 12:55 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ilpo Järvinen, Tejun Heo, David S. Miller, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

Since tcp_xmit_retransmit_queue also gets skb == NULL I'm pretty sure it 
is the same bug.
Up to now I only experienced the problem with ACK loss (without ACK loss 
the test ran about 30min without problems, with ACK loss it had paniced 
within 10min).
The data sender only has a HTB queue for traffic shaping (set to 20 
Mbit/s). The ACK loss is done by another router.
The setup looks like this. This way it seems to be the most realistic.

o sender with HTB
|
|
o netem queue for forward path delay
|
o netem queue for a queue limit
|
o netem queue for backward path delay
|
o netem queue for ACK loss
|
|
o receiver with HTB

Perhaps now it is a little big clearer.


On 15.07.2010 14:05, Eric Dumazet wrote:
> Le jeudi 15 juillet 2010 à 13:58 +0200, Lennart Schulte a écrit :
>    
>> I'm testing new reordering algorithms in a virtual testbed, that is the
>> nodes are emulated with xen and all the network parameters can be tuned
>> with queues.
>> With one of the algorithms I also got tracebacks which include
>> tcp_xmit_retransmit_queue. It only happens with ACK loss. The kernel
>> version however is 2.6.31.
>> When I read this thread I tried the debug patch and got the following:
>>
>> [ 2754.413150] NULL head, pkts 0
>> [ 2754.413156] Errors caught so far 1
>>
>> Hope that is of any help.
>>      
> Not sure I understand.
>
> Are you saying you reproduce same tcp_xmit_retransmit_queue()  bug, with
> a special tc qdisc/class droppping some ACK frames ?
>
> Could it be some sched problem and incorrect return codes in case of
> congestion ?
>    

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-15 12:55       ` Lennart Schulte
@ 2010-07-16 12:02         ` Ilpo Järvinen
  2010-07-16 12:25           ` Lennart Schulte
  2010-07-19 14:57           ` oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15 Tejun Heo
  0 siblings, 2 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-16 12:02 UTC (permalink / raw)
  To: Lennart Schulte
  Cc: Eric Dumazet, Tejun Heo, David S. Miller, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

On Thu, 15 Jul 2010, Lennart Schulte wrote:

> Since tcp_xmit_retransmit_queue also gets skb == NULL I'm pretty sure it is
> the same bug.
> Up to now I only experienced the problem with ACK loss (without ACK loss the
> test ran about 30min without problems, with ACK loss it had paniced within
> 10min).
> The data sender only has a HTB queue for traffic shaping (set to 20 Mbit/s).
> The ACK loss is done by another router.
> The setup looks like this. This way it seems to be the most realistic.
> 
> o sender with HTB
> |
> |
> o netem queue for forward path delay
> |
> o netem queue for a queue limit
> |
> o netem queue for backward path delay
> |
> o netem queue for ACK loss
> |
> |
> o receiver with HTB
> 
> Perhaps now it is a little big clearer.

> > > [ 2754.413150] NULL head, pkts 0
> > > [ 2754.413156] Errors caught so far 1

Thanks for reporting the results.

Could you post the oops too or double check do the timestamps really match 
(and there wasn't more "Errors caught" prints in between)? Since this 
condition doesn't seem to crash the kernel as also send_head should be 
NULL, which saves the day here exiting the loop (unless send head would 
too be corrupt). ...However, I don't like too much anyway that we can end 
up into tcp_xmit_retransmit_queue loop with packets_out being zero and 
only send_head check side-effect causes proper action.

Besides, Tejun has also found that it's hint->next ptr which is NULL in 
his case so this won't solve his case anyway. Tejun, can you confirm 
whether it was retransmit_skb_hint->next being NULL on _entry time_ to 
tcp_xmit_retransmit_queue() or later on in the loop after the updates done 
by the loop itself to the hint (or that your testing didn't conclude 
either)?

-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-16 12:02         ` Ilpo Järvinen
@ 2010-07-16 12:25           ` Lennart Schulte
  2010-07-16 13:19             ` Ilpo Järvinen
  2010-07-19 14:57           ` oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15 Tejun Heo
  1 sibling, 1 reply; 29+ messages in thread
From: Lennart Schulte @ 2010-07-16 12:25 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Eric Dumazet, Tejun Heo, David S. Miller, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

On 16.07.2010 14:02, Ilpo Järvinen wrote:
>
>>>> [ 2754.413150] NULL head, pkts 0
>>>> [ 2754.413156] Errors caught so far 1
>>>>          
> Thanks for reporting the results.
>
> Could you post the oops too or double check do the timestamps really match
> (and there wasn't more "Errors caught" prints in between)? Since this
> condition doesn't seem to crash the kernel as also send_head should be
> NULL, which saves the day here exiting the loop (unless send head would
> too be corrupt).
>    
I can try to do some more testing, perhaps then I will get other 
results. But until now I've always gotten something like above.

With the debug patch the kernel doesn't crash, but I have an oops from a 
run before the patch:

[ 3214.498061] BUG: unable to handle kernel NULL pointer dereference at 
(null)
[ 3214.498085] IP: [<c12657dc>] tcp_xmit_retransmit_queue+0x4c/0x2b0
[ 3214.498121] *pdpt = 00000002cf6fa001
[ 3214.498130] Thread overran stack, or stack corrupted
[ 3214.498138] Oops: 0000 [#1] SMP
[ 3214.498154] last sysfs file: /sys/kernel/uevent_seqnum
[ 3214.498161] Modules linked in: tcp_ancr tcp_ncr
[ 3214.498174]
[ 3214.498180] Pid: 0, comm: swapper Not tainted (2.6.31.9-pae-um-wolff 
#79)
[ 3214.498188] EIP: 0061:[<c12657dc>] EFLAGS: 00010246 CPU: 0
[ 3214.498196] EIP is at tcp_xmit_retransmit_queue+0x4c/0x2b0
[ 3214.498203] EAX: c6da2900 EBX: c6da2880 ECX: 00000000 EDX: e50c512e
[ 3214.498211] ESI: 00000000 EDI: 0000051b EBP: c6da2900 ESP: c13d5cf0
[ 3214.498219]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 3214.498227] Process swapper (pid: 0, ti=c13d4000 task=c13e7a20 
task.ti=c13d4000)
[ 3214.498236] Stack:
[ 3214.498240]  c1005a0b 00000001 00000000 e50c512e c7804300 00000013 
c6da2880 0000051b
[ 3214.498264] <0> e50c512e c1260709 c6cbf840 c6d42000 c1031826 c1288bbd 
c6da2900 c6e09320
[ 3214.498290] <0> c6e09300 00000000 00000000 00000001 e50c512d e521a346 
e50c512e 00000000
[ 3214.498318] Call Trace:
[ 3214.498329]  [<c1005a0b>] ? xen_restore_fl_direct_end+0x0/0x1
[ 3214.498339]  [<c1260709>] ? tcp_ack+0x7f9/0x10d0
[ 3214.498350]  [<c1031826>] ? local_bh_enable+0x56/0x80
[ 3214.498359]  [<c1288bbd>] ? ipt_do_table+0x2dd/0x590
[ 3214.498369]  [<c1261eef>] ? tcp_rcv_state_process+0x41f/0x970
[ 3214.498378]  [<c1267e7f>] ? tcp_v4_do_rcv+0x8f/0x1e0
[ 3214.498387]  [<c1269ccd>] ? tcp_v4_rcv+0x68d/0x7d0
[ 3214.498397]  [<c124d5f0>] ? ip_local_deliver_finish+0x0/0x1e0
[ 3214.498406]  [<c124d687>] ? ip_local_deliver_finish+0x97/0x1e0
[ 3214.498416]  [<c124d5f0>] ? ip_local_deliver_finish+0x0/0x1e0
[ 3214.498425]  [<c124d3eb>] ? ip_rcv_finish+0x13b/0x340
[ 3214.498434]  [<c124d2b0>] ? ip_rcv_finish+0x0/0x340
[ 3214.498442]  [<c124d8c0>] ? ip_rcv+0x0/0x2e0
[ 3214.498452]  [<c1211037>] ? netif_receive_skb+0x2f7/0x4c0
[ 3214.498468]  [<c1213fe0>] ? process_backlog+0x70/0xb0
[ 3214.498476]  [<c1213d98>] ? net_rx_action+0xe8/0x1a0
[ 3214.498486]  [<c103116d>] ? __do_softirq+0x8d/0x120
[ 3214.498494]  [<c100360d>] ? xen_mc_flush+0xed/0x1a0
[ 3214.498504]  [<c1057891>] ? move_native_irq+0x11/0x50
[ 3214.498513]  [<c1031238>] ? do_softirq+0x38/0x40
[ 3214.498523]  [<c11930e2>] ? xen_evtchn_do_upcall+0x142/0x160
[ 3214.498534]  [<c10087d7>] ? xen_do_upcall+0x7/0xc
[ 3214.498543]  [<c10013a7>] ? hypercall_page+0x3a7/0x1010
[ 3214.498552]  [<c100520f>] ? xen_safe_halt+0xf/0x20
[ 3214.498560]  [<c100349c>] ? xen_idle+0x1c/0x30
[ 3214.498569]  [<c1006bba>] ? cpu_idle+0x3a/0x60
[ 3214.498578]  [<c141692a>] ? start_kernel+0x26a/0x300
[ 3214.498616]  [<c1416280>] ? unknown_bootoption+0x0/0x1c0
[ 3214.498630]  [<c141938e>] ? xen_start_kernel+0x3be/0x3e0
[ 3214.498637] Code: 00 00 8b b3 a0 03 00 00 85 f6 0f 84 53 02 00 00 8b 
46 3c 8d ab 80 00 00 00 8b 93 04 04 00 00 39 c2 89 54 24 0c 0f 89 1c 02 
00 00 <8b> 06 0f 18 00 90 39 ee 0f 84 30 01 00 00 39 b3 28 01 00 00 8d
[ 3214.498820] EIP: [<c12657dc>] tcp_xmit_retransmit_queue+0x4c/0x2b0 
SS:ESP 0069:c13d5cf0
[ 3214.498836] CR2: 0000000000000000
[ 3214.498846] ---[ end trace 709a97adf87834a7 ]---
[ 3214.498852] Kernel panic - not syncing: Fatal exception in interrupt
[ 3214.498862] Pid: 0, comm: swapper Tainted: G      D    
2.6.31.9-pae-um-wolff #79
[ 3214.498870] Call Trace:
[ 3214.498878]  [<c102c896>] ? panic+0x46/0x100
[ 3214.498904]  [<c100b428>] ? oops_end+0x98/0xa0
[ 3214.498922]  [<c1019eff>] ? no_context+0x11f/0x1b0
[ 3214.498930]  [<c101a516>] ? do_page_fault+0x66/0x240
[ 3214.498939]  [<c101a4b0>] ? do_page_fault+0x0/0x240
[ 3214.498947]  [<c101a23f>] ? bad_area_nosemaphore+0xf/0x20
[ 3214.498955]  [<c1308b7e>] ? error_code+0x66/0x6c
[ 3214.498963]  [<c101a4b0>] ? do_page_fault+0x0/0x240
[ 3214.498972]  [<c12657dc>] ? tcp_xmit_retransmit_queue+0x4c/0x2b0
[ 3214.498982]  [<c1005a0b>] ? xen_restore_fl_direct_end+0x0/0x1
[ 3214.498991]  [<c1260709>] ? tcp_ack+0x7f9/0x10d0
[ 3214.498999]  [<c1031826>] ? local_bh_enable+0x56/0x80
[ 3214.499009]  [<c1288bbd>] ? ipt_do_table+0x2dd/0x590
[ 3214.499017]  [<c1261eef>] ? tcp_rcv_state_process+0x41f/0x970
[ 3214.499025]  [<c1267e7f>] ? tcp_v4_do_rcv+0x8f/0x1e0
[ 3214.499034]  [<c1269ccd>] ? tcp_v4_rcv+0x68d/0x7d0
[ 3214.499044]  [<c124d5f0>] ? ip_local_deliver_finish+0x0/0x1e0
[ 3214.499053]  [<c124d687>] ? ip_local_deliver_finish+0x97/0x1e0
[ 3214.499063]  [<c124d5f0>] ? ip_local_deliver_finish+0x0/0x1e0
[ 3214.499072]  [<c124d3eb>] ? ip_rcv_finish+0x13b/0x340
[ 3214.499079]  [<c124d2b0>] ? ip_rcv_finish+0x0/0x340
[ 3214.499087]  [<c124d8c0>] ? ip_rcv+0x0/0x2e0
[ 3214.499101]  [<c1211037>] ? netif_receive_skb+0x2f7/0x4c0
[ 3214.499115]  [<c1213fe0>] ? process_backlog+0x70/0xb0
[ 3214.499123]  [<c1213d98>] ? net_rx_action+0xe8/0x1a0
[ 3214.499131]  [<c103116d>] ? __do_softirq+0x8d/0x120
[ 3214.499143]  [<c100360d>] ? xen_mc_flush+0xed/0x1a0
[ 3214.499152]  [<c1057891>] ? move_native_irq+0x11/0x50
[ 3214.499160]  [<c1031238>] ? do_softirq+0x38/0x40
[ 3214.499174]  [<c11930e2>] ? xen_evtchn_do_upcall+0x142/0x160
[ 3214.499188]  [<c10087d7>] ? xen_do_upcall+0x7/0xc
[ 3214.499195]  [<c10013a7>] ? hypercall_page+0x3a7/0x1010
[ 3214.499203]  [<c100520f>] ? xen_safe_halt+0xf/0x20
[ 3214.499214]  [<c100349c>] ? xen_idle+0x1c/0x30
[ 3214.499223]  [<c1006bba>] ? cpu_idle+0x3a/0x60
[ 3214.499231]  [<c141692a>] ? start_kernel+0x26a/0x300
[ 3214.499239]  [<c1416280>] ? unknown_bootoption+0x0/0x1c0
[ 3214.499247]  [<c141938e>] ? xen_start_kernel+0x3be/0x3e0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-16 12:25           ` Lennart Schulte
@ 2010-07-16 13:19             ` Ilpo Järvinen
  2010-07-19  8:06               ` Lennart Schulte
  0 siblings, 1 reply; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-16 13:19 UTC (permalink / raw)
  To: Lennart Schulte, David S. Miller
  Cc: Eric Dumazet, Tejun Heo, lkml, netdev@vger.kernel.org,
	Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2300 bytes --]

On Fri, 16 Jul 2010, Lennart Schulte wrote:

> On 16.07.2010 14:02, Ilpo Järvinen wrote:
> > 
> > > > > [ 2754.413150] NULL head, pkts 0
> > > > > [ 2754.413156] Errors caught so far 1
> > > > >          
> > Thanks for reporting the results.
> > 
> > Could you post the oops too or double check do the timestamps really match
> > (and there wasn't more "Errors caught" prints in between)? Since this
> > condition doesn't seem to crash the kernel as also send_head should be
> > NULL, which saves the day here exiting the loop (unless send head would
> > too be corrupt).

Doh, I think we'll deref skb already to get the sacked (wouldn't be 
absolutely necessary but better to not trust side-effects) so it 
certainly is bad even with the send_head exit.

> I can try to do some more testing, perhaps then I will get other results. But
> until now I've always gotten something like above.

It might then be useful to remove if (!caught_it) which was to prevent 
infinite printout if the problem is such that it would have persisted 
forever (now w/o the crash), but since there's no evidence of that.

> With the debug patch the kernel doesn't crash, but I have an oops from a run
> before the patch:

Right, no crash of course, stupid me :-).

Lets start with this (I'm not sure if this helps Tejun's case but 
much doubt it does):

--
[PATCH] tcp: fix crash in tcp_xmit_retransmit_queue

It can happen that there are no packets in queue while calling
tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
NULL and that gets deref'ed to get sacked into a local var.

There is no work to do if no packets are outstanding so we just
exit early.

There may still be another bug affecting this same function.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
---
 net/ipv4/tcp_output.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b4ed957..7ed9dc1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2208,6 +2208,9 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
 	int mib_idx;
 	int fwd_rexmitting = 0;

+	if (!tp->packets_out)
+		return;
+
 	if (!tp->lost_out)
 		tp->retransmit_high = tp->snd_una;

-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-16 13:19             ` Ilpo Järvinen
@ 2010-07-19  8:06               ` Lennart Schulte
  2010-07-19 11:16                 ` [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: Lennart Schulte @ 2010-07-19  8:06 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David S. Miller, Eric Dumazet, Tejun Heo, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

I ran tests for about 2 hours with this patch and I got no output from 
the debug patch. This seems to have solved at least my problem :)

Thanks!
> [PATCH] tcp: fix crash in tcp_xmit_retransmit_queue
>
> It can happen that there are no packets in queue while calling
> tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
> NULL and that gets deref'ed to get sacked into a local var.
>
> There is no work to do if no packets are outstanding so we just
> exit early.
>
> There may still be another bug affecting this same function.
>
> Signed-off-by: Ilpo Järvinen<ilpo.jarvinen@helsinki.fi>
> Reported-by: Lennart Schulte<lennart.schulte@nets.rwth-aachen.de>
> ---
>   net/ipv4/tcp_output.c |    3 +++
>   1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index b4ed957..7ed9dc1 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2208,6 +2208,9 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
>   	int mib_idx;
>   	int fwd_rexmitting = 0;
>
> +	if (!tp->packets_out)
> +		return;
> +
>   	if (!tp->lost_out)
>   		tp->retransmit_high = tp->snd_una;
>
>    

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
  2010-07-19  8:06               ` Lennart Schulte
@ 2010-07-19 11:16                 ` Ilpo Järvinen
  2010-07-19 14:09                   ` Eric Dumazet
  0 siblings, 1 reply; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-19 11:16 UTC (permalink / raw)
  To: Lennart Schulte, David Miller
  Cc: Eric Dumazet, Tejun Heo, lkml, netdev@vger.kernel.org,
	Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1951 bytes --]

On Mon, 19 Jul 2010, Lennart Schulte wrote:

> I ran tests for about 2 hours with this patch and I got no output from the
> debug patch. This seems to have solved at least my problem :)
> 
> Thanks!
> > [PATCH] tcp: fix crash in tcp_xmit_retransmit_queue
> > 
> > It can happen that there are no packets in queue while calling
> > tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
> > NULL and that gets deref'ed to get sacked into a local var.
> > 
> > There is no work to do if no packets are outstanding so we just
> > exit early.
> > 
> > There may still be another bug affecting this same function.

Thanks for testing.

DaveM, I think this oops was introduced for 2.6.28 (in 
08ebd1721ab8fd362e90ae17b461c07b23fa2824 it seems, to be exact) so to 
stables it should go too please. I've only tweaked the message (so no need 
for Lennart to retest v2 :-)).

--
[PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue

It can happen that there are no packets in queue while calling
tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
NULL and that gets deref'ed to get sacked into a local var.

There is no work to do if no packets are outstanding so we just
exit early.

This oops was introduced by 08ebd1721ab8fd (tcp: remove tp->lost_out
guard to make joining diff nicer).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
Tested-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
---
 net/ipv4/tcp_output.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b4ed957..7ed9dc1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2208,6 +2208,9 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
 	int mib_idx;
 	int fwd_rexmitting = 0;
 
+	if (!tp->packets_out)
+		return;
+
 	if (!tp->lost_out)
 		tp->retransmit_high = tp->snd_una;
 
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
  2010-07-19 11:16                 ` [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue Ilpo Järvinen
@ 2010-07-19 14:09                   ` Eric Dumazet
  2010-07-19 17:25                     ` Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2010-07-19 14:09 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Lennart Schulte, David Miller, Tejun Heo, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

Le lundi 19 juillet 2010 à 14:16 +0300, Ilpo Järvinen a écrit :

> Thanks for testing.
> 
> DaveM, I think this oops was introduced for 2.6.28 (in 
> 08ebd1721ab8fd362e90ae17b461c07b23fa2824 it seems, to be exact) so to 
> stables it should go too please. I've only tweaked the message (so no need 
> for Lennart to retest v2 :-)).
> 
> --
> [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
> 
> It can happen that there are no packets in queue while calling
> tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
> NULL and that gets deref'ed to get sacked into a local var.
> 
> There is no work to do if no packets are outstanding so we just
> exit early.
> 
> This oops was introduced by 08ebd1721ab8fd (tcp: remove tp->lost_out
> guard to make joining diff nicer).
> 

But prior to commit 08ebd1721ab8fd3, we were not testing
tp->packets_out, but tp->lost_out

if it was 0, we were not doing the tcp_for_write_queue_from() loop.

Not sure it makes a difference ?

> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> Reported-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
> Tested-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
> ---
>  net/ipv4/tcp_output.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index b4ed957..7ed9dc1 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2208,6 +2208,9 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
>  	int mib_idx;
>  	int fwd_rexmitting = 0;
>  
> +	if (!tp->packets_out)
> +		return;
> +
>  	if (!tp->lost_out)
>  		tp->retransmit_high = tp->snd_una;
>  

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
  2010-07-19 14:09                   ` Eric Dumazet
@ 2010-07-19 17:25                     ` Ilpo Järvinen
  2010-07-19 17:39                       ` Eric Dumazet
  0 siblings, 1 reply; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-19 17:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Lennart Schulte, David Miller, Tejun Heo, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2153 bytes --]

On Mon, 19 Jul 2010, Eric Dumazet wrote:

> Le lundi 19 juillet 2010 à 14:16 +0300, Ilpo Järvinen a écrit :
> 
> > Thanks for testing.
> > 
> > DaveM, I think this oops was introduced for 2.6.28 (in 
> > 08ebd1721ab8fd362e90ae17b461c07b23fa2824 it seems, to be exact) so to 
> > stables it should go too please. I've only tweaked the message (so no need 
> > for Lennart to retest v2 :-)).
> > 
> > --
> > [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
> > 
> > It can happen that there are no packets in queue while calling
> > tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
> > NULL and that gets deref'ed to get sacked into a local var.
> > 
> > There is no work to do if no packets are outstanding so we just
> > exit early.
> > 
> > This oops was introduced by 08ebd1721ab8fd (tcp: remove tp->lost_out
> > guard to make joining diff nicer).
> > 
> 
> But prior to commit 08ebd1721ab8fd3, we were not testing
> tp->packets_out, but tp->lost_out

That's right, but back then we were not testing it for the same purpose.

> if it was 0, we were not doing the tcp_for_write_queue_from() loop.

This invariant _should_ be true all the time:
 lost_out <= packets_out

...and if it's not we would get Leak printouts every now and then. Thus is 
packets_out is zero no NULL defer with the if lost_out either. The other 
loop too (in pre 08eb kernels) will work because of earlier mentioned 
send_head check side-effects.

> Not sure it makes a difference ?

This difference is well thought and intentional, I didn't use different 
one by accident. We want to make sure we won't use NULL from 
tcp_write_queue_head() while the pre 08ebd1721ab8fd3 kernels was 
interested mainly whether the first loop should run or not (and of course 
ends up avoid the null deref too but it's more optimization like 
thing in there, ie., if there's no lost packets no work to-do). The deref 
could have been fixed by moving TCP_SKB_CB(skb)->sacked a bit later but 
that would again make us depend on the side-effect of the send_head check 
(in the case of packets_out being zero and wq empty) which is something I 
don't like too much.

-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
  2010-07-19 17:25                     ` Ilpo Järvinen
@ 2010-07-19 17:39                       ` Eric Dumazet
  2010-07-19 19:55                         ` David Miller
  0 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2010-07-19 17:39 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Lennart Schulte, David Miller, Tejun Heo, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

Le lundi 19 juillet 2010 à 20:25 +0300, Ilpo Järvinen a écrit :

> This difference is well thought and intentional, I didn't use different 
> one by accident. We want to make sure we won't use NULL from 
> tcp_write_queue_head() while the pre 08ebd1721ab8fd3 kernels was 
> interested mainly whether the first loop should run or not (and of course 
> ends up avoid the null deref too but it's more optimization like 
> thing in there, ie., if there's no lost packets no work to-do). The deref 
> could have been fixed by moving TCP_SKB_CB(skb)->sacked a bit later but 
> that would again make us depend on the side-effect of the send_head check 
> (in the case of packets_out being zero and wq empty) which is something I 
> don't like too much.
> 

Thanks Ilpo.

Do you know in what exact circumstance the bug triggers ?

It's hard to believe thousand of machines on the Internet never hit
it :(

Maybe another problem in congestion control ?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
  2010-07-19 17:39                       ` Eric Dumazet
@ 2010-07-19 19:55                         ` David Miller
  2010-07-20  8:33                           ` Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2010-07-19 19:55 UTC (permalink / raw)
  To: eric.dumazet
  Cc: ilpo.jarvinen, lennart.schulte, tj, linux-kernel, netdev,
	henning.fehrmann, carsten.aulbert

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 19 Jul 2010 19:39:08 +0200

> Do you know in what exact circumstance the bug triggers ?
> 
> It's hard to believe thousand of machines on the Internet never hit
> it :(
> 
> Maybe another problem in congestion control ?

This is something to investigate, but the conditions under which
tcp_fastretrans_alert() (the main invoker of tcp_xmit_retransmit_queue())
does it's thing are complicated enough that I'm going to add this fix
for the time being and push it out to stable too.

Thanks everyone.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
  2010-07-19 19:55                         ` David Miller
@ 2010-07-20  8:33                           ` Ilpo Järvinen
  0 siblings, 0 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-20  8:33 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, lennart.schulte, tj, LKML, Netdev, henning.fehrmann,
	carsten.aulbert

On Mon, 19 Jul 2010, David Miller wrote:

> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 19 Jul 2010 19:39:08 +0200
> 
> > Do you know in what exact circumstance the bug triggers ?
> > 
> > It's hard to believe thousand of machines on the Internet never hit
> > it :(
> > 
> > Maybe another problem in congestion control ?
> 
> This is something to investigate, but the conditions under which
> tcp_fastretrans_alert() (the main invoker of tcp_xmit_retransmit_queue())
> does it's thing are complicated enough that I'm going to add this fix
> for the time being and push it out to stable too.

This is so true. ...So far I've managed to twice rule out of the 
possibility of this being really triggerable (ie., it would mean
Lennart's out of tree changes broke it), and once in the middle came
into opposite conclusion. Thus by majority voting we can deduce that it 
won't happen - how reassuring :-/. It seems that tcp_try_undo_recovery 
causes return if TCP remained in CA_Loss/CA_Recovery and that 
tcp_time_to_recover won't really let past return either under normal 
circumstances (more details below), and tcp_simple_retransmit 
requires lost_out to change; seems safe in mainline to me.

Hmm... It seems that I've just solved another report too. ...Somebody a 
while back found out that setting reordering sysctl to zero (ie. to a 
value which does not make too much sense) crashed the kernel. It seems 
that at least then tcp_time_to_recover() would return true and trigger 
this bug (though I'm not sure if that's the only breakage to happen).

Also worth to keep in mind is the bugzilla entry ("New freez in 
TCP" or something like that) so I'm not really sure I could say for sure 
nobody never hit it. The bugzilla one goes away by disable SACK (at least 
for some) but it might mix two different issues. It seems that there 
really are two different issues, the other may have something to do with 
SACK though there are other variables then involved, e.g., the changes in 
retransmission logic/timing, so it's impossible to say if the SACK disable 
really "fixed" the bugzilla one or not. Also Tejun's ->next == NULL 
finding points out to a different bug than this Lennart's one.

-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-16 12:02         ` Ilpo Järvinen
  2010-07-16 12:25           ` Lennart Schulte
@ 2010-07-19 14:57           ` Tejun Heo
  2010-07-20  8:41             ` Ilpo Järvinen
  2010-09-08  9:32             ` Ilpo Järvinen
  1 sibling, 2 replies; 29+ messages in thread
From: Tejun Heo @ 2010-07-19 14:57 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Lennart Schulte, Eric Dumazet, David S. Miller, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

Hello,

On 07/16/2010 02:02 PM, Ilpo Järvinen wrote:
> Besides, Tejun has also found that it's hint->next ptr which is NULL in 
> his case so this won't solve his case anyway. Tejun, can you confirm 
> whether it was retransmit_skb_hint->next being NULL on _entry time_ to 
> tcp_xmit_retransmit_queue() or later on in the loop after the updates done 
> by the loop itself to the hint (or that your testing didn't conclude 
> either)?

Sorry about the delay.  I was traveling last week.  Unfortunately, I
don't know whether ->next was NULL on entry or not.  I hacked up the
following ugly patch for the next test run.  It should have everything
which has come up till now + list and hint sanity checking before
starting processing them.  I'm planning on deploying it w/ crashdump
enabled in several days.  If I've missed something, please let me
know.

Thanks.

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b4ed957..1c8b1e0 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2190,6 +2190,53 @@ static int tcp_can_forward_retransmit(struct sock *sk)
 	return 1;
 }

+static void print_queue(struct sock *sk, struct sk_buff *old, struct sk_buff *hole)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct sk_buff *skb, *prev;
+	bool do_panic = false;
+
+	skb = tcp_write_queue_head(sk);
+	prev = (struct sk_buff *)(&sk->sk_write_queue);
+
+	if (skb == NULL) {
+		printk("XXX NULL head, pkts %u\n", tp->packets_out);
+		do_panic = true;
+	}
+
+	printk("XXX head %p tail %p sendhead %p oldhint %p now %p hole %p high %u\n",
+	       tcp_write_queue_head(sk), tcp_write_queue_tail(sk),
+	       tcp_send_head(sk), old, tp->retransmit_skb_hint, hole,
+	       tp->retransmit_high);
+
+	while (skb) {
+		printk("XXX skb %p (%u-%u) next %p prev %p sacked %u\n",
+		       skb, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq,
+		       skb->next, skb->prev, TCP_SKB_CB(skb)->sacked);
+		if (prev != skb->prev) {
+			printk("XXX Inconsistent prev\n");
+			do_panic = true;
+		}
+
+		if (skb == tcp_write_queue_tail(sk)) {
+			if (skb->next != (struct sk_buff *)(&sk->sk_write_queue)) {
+				printk("XXX Improper next at tail\n");
+				do_panic = true;
+			}
+			break;
+		}
+
+		prev = skb;
+		skb = skb->next;
+	}
+	if (!skb) {
+		printk("XXX Encountered unexpected NULL\n");
+		do_panic = true;
+	}
+	if (do_panic)
+		panic("XXX panicking");
+}
+
 /* This gets called after a retransmit timeout, and the initially
  * retransmitted data is acknowledged.  It tries to continue
  * resending the rest of the retransmit queue, until either
@@ -2198,19 +2245,53 @@ static int tcp_can_forward_retransmit(struct sock *sk)
  * based retransmit packet might feed us FACK information again.
  * If so, we use it to avoid unnecessarily retransmissions.
  */
+static unsigned int caught_it;
+
 void tcp_xmit_retransmit_queue(struct sock *sk)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
-	struct sk_buff *skb;
+	struct sk_buff *skb, *prev;
 	struct sk_buff *hole = NULL;
+	struct sk_buff *old = tp->retransmit_skb_hint;
 	u32 last_lost;
 	int mib_idx;
 	int fwd_rexmitting = 0;
+	bool saw_hint = false;
+
+	if (!tp->packets_out) {
+		if (net_ratelimit())
+			printk("XXX !tp->packets_out, retransmit_skb_hint=%p, write_queue_head=%p\n",
+			       tp->retransmit_skb_hint, tcp_write_queue_head(sk));
+		return;
+	}

 	if (!tp->lost_out)
 		tp->retransmit_high = tp->snd_una;

+	for (skb = tcp_write_queue_head(sk),
+	     prev = (struct sk_buff *)&sk->sk_write_queue;
+	     skb != (struct sk_buff *)&sk->sk_write_queue;
+	     prev = skb, skb = skb->next) {
+		if (prev != skb->prev) {
+			printk("XXX sanity check: prev corrupt\n");
+			print_queue(sk, old, hole);
+		}
+		if (skb == tp->retransmit_skb_hint)
+			saw_hint = true;
+		if (skb == tcp_write_queue_tail(sk) &&
+		    skb->next != (struct sk_buff *)(&sk->sk_write_queue)) {
+			printk("XXX sanity check: end corrupt\n");
+			print_queue(sk, old, hole);
+		}
+	}
+	if (tp->retransmit_skb_hint && !saw_hint) {
+		printk("XXX sanity check: retransmit_skb_hint=%p is not on list, claring hint\n",
+		       tp->retransmit_skb_hint);
+		print_queue(sk, old, hole);
+		tp->retransmit_skb_hint = NULL;
+	}
+
 	if (tp->retransmit_skb_hint) {
 		skb = tp->retransmit_skb_hint;
 		last_lost = TCP_SKB_CB(skb)->end_seq;
@@ -2218,7 +2299,17 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
 			last_lost = tp->retransmit_high;
 	} else {
 		skb = tcp_write_queue_head(sk);
-		last_lost = tp->snd_una;
+		if (skb)
+			last_lost = tp->snd_una;
+	}
+
+checknull:
+	if (skb == NULL) {
+		print_queue(sk, old, hole);
+		caught_it++;
+		if (net_ratelimit())
+			printk("XXX Errors caught so far %u\n", caught_it);
+		return;
 	}

 	tcp_for_write_queue_from(skb, sk) {
@@ -2261,7 +2352,7 @@ begin_fwd:
 		} else if (!(sacked & TCPCB_LOST)) {
 			if (hole == NULL && !(sacked & (TCPCB_SACKED_RETRANS|TCPCB_SACKED_ACKED)))
 				hole = skb;
-			continue;
+			goto cont;

 		} else {
 			last_lost = TCP_SKB_CB(skb)->end_seq;
@@ -2272,7 +2363,7 @@ begin_fwd:
 		}

 		if (sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))
-			continue;
+			goto cont;

 		if (tcp_retransmit_skb(sk, skb))
 			return;
@@ -2282,6 +2373,9 @@ begin_fwd:
 			inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
 						  inet_csk(sk)->icsk_rto,
 						  TCP_RTO_MAX);
+cont:
+		skb = skb->next;
+		goto checknull;
 	}
 }

-- 
tejun

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-19 14:57           ` oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15 Tejun Heo
@ 2010-07-20  8:41             ` Ilpo Järvinen
  2010-09-08  9:32             ` Ilpo Järvinen
  1 sibling, 0 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2010-07-20  8:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lennart Schulte, Eric Dumazet, David S. Miller, lkml,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 6095 bytes --]

On Mon, 19 Jul 2010, Tejun Heo wrote:

> Hello,
> 
> On 07/16/2010 02:02 PM, Ilpo Järvinen wrote:
> > Besides, Tejun has also found that it's hint->next ptr which is NULL in 
> > his case so this won't solve his case anyway. Tejun, can you confirm 
> > whether it was retransmit_skb_hint->next being NULL on _entry time_ to 
> > tcp_xmit_retransmit_queue() or later on in the loop after the updates done 
> > by the loop itself to the hint (or that your testing didn't conclude 
> > either)?
> 
> Sorry about the delay.  I was traveling last week.  Unfortunately, I
> don't know whether ->next was NULL on entry or not.  I hacked up the
> following ugly patch for the next test run.  It should have everything
> which has come up till now + list and hint sanity checking before
> starting processing them.  I'm planning on deploying it w/ crashdump
> enabled in several days.  If I've missed something, please let me
> know.

One thing that complicates things further is the fact that the write queue 
can change beneath us (ie., in tcp_retrans_try_collapse and tcp_fragment).
I've read them multiple times through and always found them innocent so 
this might be just for-the-record type of a note (at least I hope so).

-- 
 i.

> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index b4ed957..1c8b1e0 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2190,6 +2190,53 @@ static int tcp_can_forward_retransmit(struct sock *sk)
>  	return 1;
>  }
> 
> +static void print_queue(struct sock *sk, struct sk_buff *old, struct sk_buff *hole)
> +{
> +	struct tcp_sock *tp = tcp_sk(sk);
> +	struct sk_buff *skb, *prev;
> +	bool do_panic = false;
> +
> +	skb = tcp_write_queue_head(sk);
> +	prev = (struct sk_buff *)(&sk->sk_write_queue);
> +
> +	if (skb == NULL) {
> +		printk("XXX NULL head, pkts %u\n", tp->packets_out);
> +		do_panic = true;
> +	}
> +
> +	printk("XXX head %p tail %p sendhead %p oldhint %p now %p hole %p high %u\n",
> +	       tcp_write_queue_head(sk), tcp_write_queue_tail(sk),
> +	       tcp_send_head(sk), old, tp->retransmit_skb_hint, hole,
> +	       tp->retransmit_high);
> +
> +	while (skb) {
> +		printk("XXX skb %p (%u-%u) next %p prev %p sacked %u\n",
> +		       skb, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq,
> +		       skb->next, skb->prev, TCP_SKB_CB(skb)->sacked);
> +		if (prev != skb->prev) {
> +			printk("XXX Inconsistent prev\n");
> +			do_panic = true;
> +		}
> +
> +		if (skb == tcp_write_queue_tail(sk)) {
> +			if (skb->next != (struct sk_buff *)(&sk->sk_write_queue)) {
> +				printk("XXX Improper next at tail\n");
> +				do_panic = true;
> +			}
> +			break;
> +		}
> +
> +		prev = skb;
> +		skb = skb->next;
> +	}
> +	if (!skb) {
> +		printk("XXX Encountered unexpected NULL\n");
> +		do_panic = true;
> +	}
> +	if (do_panic)
> +		panic("XXX panicking");
> +}
> +
>  /* This gets called after a retransmit timeout, and the initially
>   * retransmitted data is acknowledged.  It tries to continue
>   * resending the rest of the retransmit queue, until either
> @@ -2198,19 +2245,53 @@ static int tcp_can_forward_retransmit(struct sock *sk)
>   * based retransmit packet might feed us FACK information again.
>   * If so, we use it to avoid unnecessarily retransmissions.
>   */
> +static unsigned int caught_it;
> +
>  void tcp_xmit_retransmit_queue(struct sock *sk)
>  {
>  	const struct inet_connection_sock *icsk = inet_csk(sk);
>  	struct tcp_sock *tp = tcp_sk(sk);
> -	struct sk_buff *skb;
> +	struct sk_buff *skb, *prev;
>  	struct sk_buff *hole = NULL;
> +	struct sk_buff *old = tp->retransmit_skb_hint;
>  	u32 last_lost;
>  	int mib_idx;
>  	int fwd_rexmitting = 0;
> +	bool saw_hint = false;
> +
> +	if (!tp->packets_out) {
> +		if (net_ratelimit())
> +			printk("XXX !tp->packets_out, retransmit_skb_hint=%p, write_queue_head=%p\n",
> +			       tp->retransmit_skb_hint, tcp_write_queue_head(sk));
> +		return;
> +	}
> 
>  	if (!tp->lost_out)
>  		tp->retransmit_high = tp->snd_una;
> 
> +	for (skb = tcp_write_queue_head(sk),
> +	     prev = (struct sk_buff *)&sk->sk_write_queue;
> +	     skb != (struct sk_buff *)&sk->sk_write_queue;
> +	     prev = skb, skb = skb->next) {
> +		if (prev != skb->prev) {
> +			printk("XXX sanity check: prev corrupt\n");
> +			print_queue(sk, old, hole);
> +		}
> +		if (skb == tp->retransmit_skb_hint)
> +			saw_hint = true;
> +		if (skb == tcp_write_queue_tail(sk) &&
> +		    skb->next != (struct sk_buff *)(&sk->sk_write_queue)) {
> +			printk("XXX sanity check: end corrupt\n");
> +			print_queue(sk, old, hole);
> +		}
> +	}
> +	if (tp->retransmit_skb_hint && !saw_hint) {
> +		printk("XXX sanity check: retransmit_skb_hint=%p is not on list, claring hint\n",
> +		       tp->retransmit_skb_hint);
> +		print_queue(sk, old, hole);
> +		tp->retransmit_skb_hint = NULL;
> +	}
> +
>  	if (tp->retransmit_skb_hint) {
>  		skb = tp->retransmit_skb_hint;
>  		last_lost = TCP_SKB_CB(skb)->end_seq;
> @@ -2218,7 +2299,17 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
>  			last_lost = tp->retransmit_high;
>  	} else {
>  		skb = tcp_write_queue_head(sk);
> -		last_lost = tp->snd_una;
> +		if (skb)
> +			last_lost = tp->snd_una;
> +	}
> +
> +checknull:
> +	if (skb == NULL) {
> +		print_queue(sk, old, hole);
> +		caught_it++;
> +		if (net_ratelimit())
> +			printk("XXX Errors caught so far %u\n", caught_it);
> +		return;
>  	}
> 
>  	tcp_for_write_queue_from(skb, sk) {
> @@ -2261,7 +2352,7 @@ begin_fwd:
>  		} else if (!(sacked & TCPCB_LOST)) {
>  			if (hole == NULL && !(sacked & (TCPCB_SACKED_RETRANS|TCPCB_SACKED_ACKED)))
>  				hole = skb;
> -			continue;
> +			goto cont;
> 
>  		} else {
>  			last_lost = TCP_SKB_CB(skb)->end_seq;
> @@ -2272,7 +2363,7 @@ begin_fwd:
>  		}
> 
>  		if (sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))
> -			continue;
> +			goto cont;
> 
>  		if (tcp_retransmit_skb(sk, skb))
>  			return;
> @@ -2282,6 +2373,9 @@ begin_fwd:
>  			inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
>  						  inet_csk(sk)->icsk_rto,
>  						  TCP_RTO_MAX);
> +cont:
> +		skb = skb->next;
> +		goto checknull;
>  	}
>  }
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-07-19 14:57           ` oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15 Tejun Heo
  2010-07-20  8:41             ` Ilpo Järvinen
@ 2010-09-08  9:32             ` Ilpo Järvinen
  2010-09-08 10:25               ` Tejun Heo
  1 sibling, 1 reply; 29+ messages in thread
From: Ilpo Järvinen @ 2010-09-08  9:32 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lennart Schulte, Eric Dumazet, David S. Miller,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 956 bytes --]

On Mon, 19 Jul 2010, Tejun Heo wrote:

> Hello,
> 
> On 07/16/2010 02:02 PM, Ilpo Järvinen wrote:
> > Besides, Tejun has also found that it's hint->next ptr which is NULL in 
> > his case so this won't solve his case anyway. Tejun, can you confirm 
> > whether it was retransmit_skb_hint->next being NULL on _entry time_ to 
> > tcp_xmit_retransmit_queue() or later on in the loop after the updates done 
> > by the loop itself to the hint (or that your testing didn't conclude 
> > either)?
> 
> Sorry about the delay.  I was traveling last week.  Unfortunately, I
> don't know whether ->next was NULL on entry or not.  I hacked up the
> following ugly patch for the next test run.  It should have everything
> which has come up till now + list and hint sanity checking before
> starting processing them.  I'm planning on deploying it w/ crashdump
> enabled in several days.  If I've missed something, please let me
> know.

Any news on this one?

-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-09-08  9:32             ` Ilpo Järvinen
@ 2010-09-08 10:25               ` Tejun Heo
  2010-09-08 10:34                 ` Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2010-09-08 10:25 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Lennart Schulte, Eric Dumazet, David S. Miller,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

Hello,

On 09/08/2010 11:32 AM, Ilpo Järvinen wrote:
>> Sorry about the delay.  I was traveling last week.  Unfortunately, I
>> don't know whether ->next was NULL on entry or not.  I hacked up the
>> following ugly patch for the next test run.  It should have everything
>> which has come up till now + list and hint sanity checking before
>> starting processing them.  I'm planning on deploying it w/ crashdump
>> enabled in several days.  If I've missed something, please let me
>> know.
> 
> Any news on this one?

Unfortunately, we haven't been able to reproduce the problem anymore.
It could be (but not likely given that none of the debugging messages
is triggering) that I was mistaken and the previously posted fixed the
issue.  The network used by the cluster went through some changes at
the time and there have been issues with packet losses.  Given that
the problem needs packet losses to trigger, it's likely that packet
loss pattern here changed such that the patterns of packet losses
which trigger the problem aren't happening anymore.  (Carsten,
Henning, please feel free to fill in if I'm missing something).

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-09-08 10:25               ` Tejun Heo
@ 2010-09-08 10:34                 ` Ilpo Järvinen
  2010-09-09 10:27                   ` Tejun Heo
  0 siblings, 1 reply; 29+ messages in thread
From: Ilpo Järvinen @ 2010-09-08 10:34 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lennart Schulte, Eric Dumazet, David S. Miller,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1460 bytes --]

On Wed, 8 Sep 2010, Tejun Heo wrote:

> On 09/08/2010 11:32 AM, Ilpo Järvinen wrote:
> >> Sorry about the delay.  I was traveling last week.  Unfortunately, I
> >> don't know whether ->next was NULL on entry or not.  I hacked up the
> >> following ugly patch for the next test run.  It should have everything
> >> which has come up till now + list and hint sanity checking before
> >> starting processing them.  I'm planning on deploying it w/ crashdump
> >> enabled in several days.  If I've missed something, please let me
> >> know.
> > 
> > Any news on this one?
> 
> Unfortunately, we haven't been able to reproduce the problem anymore.

With my debug patch or not at all?

> It could be (but not likely given that none of the debugging messages
> is triggering) that I was mistaken and the previously posted fixed the
> issue. The network used by the cluster went through some changes at
> the time and there have been issues with packet losses.  Given that
> the problem needs packet losses to trigger, it's likely that packet
> loss pattern here changed such that the patterns of packet losses
> which trigger the problem aren't happening anymore.  (Carsten,
> Henning, please feel free to fill in if I'm missing something).

That might well be true, however, you're already a second guy who 
cannot reproduce it with the debug patch so I would not rule out other 
possibilities unless you've tried without debug patch too since the 
changes?


-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-09-08 10:34                 ` Ilpo Järvinen
@ 2010-09-09 10:27                   ` Tejun Heo
  2010-09-09 10:45                     ` Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2010-09-09 10:27 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Lennart Schulte, Eric Dumazet, David S. Miller,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

Hello,

On 09/08/2010 12:34 PM, Ilpo Järvinen wrote:
>> Unfortunately, we haven't been able to reproduce the problem anymore.
> 
> With my debug patch or not at all?

With the ugly merged patch I posted previously in this thread which
contained debug messages if any of the worked around condition
triggers.

>> It could be (but not likely given that none of the debugging messages
>> is triggering) that I was mistaken and the previously posted fixed the
>> issue. The network used by the cluster went through some changes at
>> the time and there have been issues with packet losses.  Given that
>> the problem needs packet losses to trigger, it's likely that packet
>> loss pattern here changed such that the patterns of packet losses
>> which trigger the problem aren't happening anymore.  (Carsten,
>> Henning, please feel free to fill in if I'm missing something).
> 
> That might well be true, however, you're already a second guy who 
> cannot reproduce it with the debug patch so I would not rule out other 
> possibilities unless you've tried without debug patch too since the 
> changes?

Unfortunately, I can't really tell one way or the other at this point.
Carsten will be back in a few days.  I'll ask him for more details.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
  2010-09-09 10:27                   ` Tejun Heo
@ 2010-09-09 10:45                     ` Ilpo Järvinen
  0 siblings, 0 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2010-09-09 10:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lennart Schulte, Eric Dumazet, David S. Miller,
	netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1609 bytes --]

On Thu, 9 Sep 2010, Tejun Heo wrote:

> On 09/08/2010 12:34 PM, Ilpo Järvinen wrote:
> >> Unfortunately, we haven't been able to reproduce the problem anymore.
> > 
> > With my debug patch or not at all?
> 
> With the ugly merged patch I posted previously in this thread which
> contained debug messages if any of the worked around condition
> triggers.
> 
> >> It could be (but not likely given that none of the debugging messages
> >> is triggering) that I was mistaken and the previously posted fixed the
> >> issue. The network used by the cluster went through some changes at
> >> the time and there have been issues with packet losses.  Given that
> >> the problem needs packet losses to trigger, it's likely that packet
> >> loss pattern here changed such that the patterns of packet losses
> >> which trigger the problem aren't happening anymore.  (Carsten,
> >> Henning, please feel free to fill in if I'm missing something).
> > 
> > That might well be true, however, you're already a second guy who 
> > cannot reproduce it with the debug patch so I would not rule out other 
> > possibilities unless you've tried without debug patch too since the 
> > changes?
> 
> Unfortunately, I can't really tell one way or the other at this point.
> Carsten will be back in a few days.  I'll ask him for more details.

Once you get the info, if not yet done, I'd recommend you try without the 
debug patch (assuming a possible crash isn't too devasting for the actual 
stuff you're doing with the machines :-)). ...If it crashes without, then 
it's time to start looking into compiler versions, etc.

-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2010-09-09 10:45 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-08  8:22 oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15 Tejun Heo
2010-07-11  2:36 ` David Miller
2010-07-11 16:09 ` Ilpo Järvinen
2010-07-11 17:06   ` Eric Dumazet
2010-07-11 17:46     ` Eric Dumazet
2010-07-11 18:29       ` Eric Dumazet
2010-07-11 19:22         ` Ilpo Järvinen
2010-07-11 19:25           ` Ilpo Järvinen
2010-07-11 19:44             ` Ilpo Järvinen
2010-07-15 11:58   ` Lennart Schulte
2010-07-15 12:05     ` Eric Dumazet
2010-07-15 12:55       ` Lennart Schulte
2010-07-16 12:02         ` Ilpo Järvinen
2010-07-16 12:25           ` Lennart Schulte
2010-07-16 13:19             ` Ilpo Järvinen
2010-07-19  8:06               ` Lennart Schulte
2010-07-19 11:16                 ` [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue Ilpo Järvinen
2010-07-19 14:09                   ` Eric Dumazet
2010-07-19 17:25                     ` Ilpo Järvinen
2010-07-19 17:39                       ` Eric Dumazet
2010-07-19 19:55                         ` David Miller
2010-07-20  8:33                           ` Ilpo Järvinen
2010-07-19 14:57           ` oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15 Tejun Heo
2010-07-20  8:41             ` Ilpo Järvinen
2010-09-08  9:32             ` Ilpo Järvinen
2010-09-08 10:25               ` Tejun Heo
2010-09-08 10:34                 ` Ilpo Järvinen
2010-09-09 10:27                   ` Tejun Heo
2010-09-09 10:45                     ` Ilpo Järvinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).