* Fw: [Bug 18822] New: TCP Communications gets blocked, then resetted
@ 2010-09-20 16:04 Stephen Hemminger
2010-09-21 9:54 ` Ilpo Järvinen
0 siblings, 1 reply; 3+ messages in thread
From: Stephen Hemminger @ 2010-09-20 16:04 UTC
To: netdev
Begin forwarded message:
Date: Mon, 20 Sep 2010 10:39:47 GMT
From: bugzilla-daemon@bugzilla.kernel.org
To: shemminger@linux-foundation.org
Subject: [Bug 18822] New: TCP Communications gets blocked, then resetted
https://bugzilla.kernel.org/show_bug.cgi?id=18822
Summary: TCP Communications gets blocked, then resetted
Product: Networking
Version: 2.5
Kernel Version: 2.6.32-24-generic #43-Ubuntu
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: IPV4
AssignedTo: shemminger@linux-foundation.org
ReportedBy: dc6iq@gmx.de
Regression: No
Created an attachment (id=30682)
--> (https://bugzilla.kernel.org/attachment.cgi?id=30682)
both machine dumps
With a freshly installed bacula server, I could never complete a backup from my
working machine to the bacula server.
The connection transmits approximately 3 GiB, then locks up and is reset.
Transmission gets stuck at a certain point, after which the bacula server does
not reply to retransmitted packets on the IPv4 stack. The retry count on disks
(my working machine) rises to 13, then the connection is gone. The bacula server
then tries to send a push packet (after the keepalive timer runs out) and gets
the final RST packet from disks, because the connection is gone.
The attachment contains the tcpdumps from both machines; note that the
sending machine dropped some packets in its dump.
This might help: disks is running a 64-bit kernel whereas bacula
is running a 32-bit one. I haven't looked into the option bits very closely, but
it looks like there is a problem hidden there:
last ack being ok:
09:32:56.142876 IP bacula.elkenet.bacula-sd > disks.elkenet.50766: Flags [.],
ack 2005754588, win 9582, options [nop,nop,TS val 21825875 ecr 4530083], length
0
next ack packet:
09:32:56.144763 IP bacula.elkenet.bacula-sd > disks.elkenet.50766: Flags [.],
ack 2005773412, win 9308, options [nop,nop,TS val 21825876 ecr
4530083,nop,nop,sack 1 {2005774860:2005776308}], length 0
root@disks:~# uname -a
Linux disks 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:58:24 UTC 2010
x86_64 GNU/Linux
root@bacula:~# uname -a
Linux bacula 2.6.32-24-generic-pae #43-Ubuntu SMP Thu Sep 16 15:30:27 UTC 2010
i686 GNU/Linux
Doing a 20GB backup on a debian server works fine
server:~# uname -a
Linux server 2.6.32-5-486 #1 Sat Sep 18 01:43:00 UTC 2010 i686 GNU/Linux
Doing a 26 GB backup from a 32-bit Ubuntu machine works fine as well. Maybe it's
a 64-bit issue...
root@elke:~# uname -a
Linux elke 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:17:33 UTC 2010 i686
GNU/Linux
If any further input is required, just let me know...
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
--
* Re: Fw: [Bug 18822] New: TCP Communications gets blocked, then resetted
2010-09-20 16:04 Fw: [Bug 18822] New: TCP Communications gets blocked, then resetted Stephen Hemminger
@ 2010-09-21 9:54 ` Ilpo Järvinen
2010-09-21 10:18 ` Ilpo Järvinen
0 siblings, 1 reply; 3+ messages in thread
From: Ilpo Järvinen @ 2010-09-21 9:54 UTC
To: Stephen Hemminger; +Cc: Netdev
On Mon, 20 Sep 2010, Stephen Hemminger wrote:
> Begin forwarded message:
>
> Date: Mon, 20 Sep 2010 10:39:47 GMT
> From: bugzilla-daemon@bugzilla.kernel.org
> To: shemminger@linux-foundation.org
> Subject: [Bug 18822] New: TCP Communications gets blocked, then resetted
>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=18822
>
> Summary: TCP Communications gets blocked, then resetted
> Product: Networking
> Version: 2.5
> Kernel Version: 2.6.32-24-generic #43-Ubuntu
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: IPV4
> AssignedTo: shemminger@linux-foundation.org
> ReportedBy: dc6iq@gmx.de
> Regression: No
>
>
> Created an attachment (id=30682)
> --> (https://bugzilla.kernel.org/attachment.cgi?id=30682)
> both machine dumps
>
> With a freshly installed bacula server, I could never complete a backup from my
> working machine to the bacula server.
>
> The connection transmits approximately 3 GiB, then locks up and is reset.
>
> Transmission gets stuck at a certain point, after which the bacula server does
> not reply to retransmitted packets on the IPv4 stack. The retry count on disks
> (my working machine) rises to 13, then the connection is gone. The bacula server
> then tries to send a push packet (after the keepalive timer runs out) and gets
> the final RST packet from disks, because the connection is gone.
>
> The attachment contains the tcpdumps from both machines; note that the
> sending machine dropped some packets in its dump.
If you haven't already, try capturing with -w directly into a binary file and
then post-process it into text with -r.
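A minimal sketch of that two-step capture, written as Python helpers that only
build the command lines (actually running them needs root and tcpdump
installed). The interface name, file name and host filter are placeholders and
not taken from the report; the flags themselves (-i, -s, -w, -r, -n, -S, -vvv)
are standard tcpdump options.

```python
import shlex

def capture_cmd(iface, pcap_path, peer):
    """Build a tcpdump command that writes raw packets to a binary file."""
    # -s 0 asks for full packets; -w writes pcap data instead of decoded text,
    # so no decoding work is done while the capture is running.
    return ["tcpdump", "-i", iface, "-s", "0", "-w", pcap_path, "host", peer]

def decode_cmd(pcap_path):
    """Build a tcpdump command that decodes a saved capture to text."""
    # -n skips name lookups, -S keeps absolute sequence numbers,
    # -vvv raises the verbosity of the decoded output.
    return ["tcpdump", "-n", "-S", "-vvv", "-r", pcap_path]

if __name__ == "__main__":
    # Placeholder values (hypothetical): eth0, disks.pcap, bacula.
    print(shlex.join(capture_cmd("eth0", "disks.pcap", "bacula")))
    print(shlex.join(decode_cmd("disks.pcap")))
```

Splitting capture from decoding this way also means the same binary file can be
decoded again later with different verbosity.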
> This might help: disks is running a 64-bit kernel whereas bacula
> is running a 32-bit one. I haven't looked into the option bits very closely, but
> it looks like there is a problem hidden there:
>
> last ack being ok:
> 09:32:56.142876 IP bacula.elkenet.bacula-sd > disks.elkenet.50766: Flags [.],
> ack 2005754588, win 9582, options [nop,nop,TS val 21825875 ecr 4530083], length
> 0
> next ack packet:
> 09:32:56.144763 IP bacula.elkenet.bacula-sd > disks.elkenet.50766: Flags [.],
> ack 2005773412, win 9308, options [nop,nop,TS val 21825876 ecr
> 4530083,nop,nop,sack 1 {2005774860:2005776308}], length 0
...I fail to understand what problem you're trying to point out here with
these two ACKs. Could you elaborate, please (if you had something specific
in mind)?
I went through the logs... I cannot do all of the checking,
because tcpdump without enough -v's hides the sequence numbers of pure
ACKs (09:32:56.150182 shows only the ack seqno, not the other seqno, which
is also used by the validator); I think you need at least -vvv to show
them nowadays. The last new data segment at 09:32:56.150132 was still
received, as it is reported in SACK; only the retransmissions that
follow are discarded.
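To make the two quoted ACKs easier to read, here is a small sketch that pulls
the ack number, window and SACK blocks out of one decoded tcpdump line and lists
the byte ranges the receiver is still missing. The regular expressions are
written against the line format quoted above only, and sequence number
wraparound is ignored. For the second ACK it reports a single 1448-byte hole
below one 1448-byte SACKed segment, which on its own looks like ordinary loss or
reordering signalling.

```python
import re

ACK_RE = re.compile(r"ack (\d+), win (\d+)")
SACK_RE = re.compile(r"\{(\d+):(\d+)\}")

def parse_ack(line):
    """Return (ack, win, [(left, right), ...]) from one decoded tcpdump line."""
    m = ACK_RE.search(line)
    if m is None:
        raise ValueError("no 'ack N, win N' found")
    blocks = [(int(a), int(b)) for a, b in SACK_RE.findall(line)]
    return int(m.group(1)), int(m.group(2)), blocks

def holes(ack, blocks):
    """Byte ranges still missing between the cumulative ACK and SACK blocks."""
    # Assumption: no 32-bit sequence wraparound inside the ranges compared.
    missing = []
    edge = ack
    for left, right in sorted(blocks):
        if left > edge:
            missing.append((edge, left))
        edge = max(edge, right)
    return missing

line = ("09:32:56.144763 IP bacula.elkenet.bacula-sd > disks.elkenet.50766: "
        "Flags [.], ack 2005773412, win 9308, options [nop,nop,TS val 21825876 "
        "ecr 4530083,nop,nop,sack 1 {2005774860:2005776308}], length 0")
ack, win, blocks = parse_ack(line)
for left, right in holes(ack, blocks):
    print(f"missing {left}:{right} ({right - left} bytes)")
```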
> root@disks:~# uname -a
> Linux disks 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:58:24 UTC 2010
> x86_64 GNU/Linux
>
> root@bacula:~# uname -a
> Linux bacula 2.6.32-24-generic-pae #43-Ubuntu SMP Thu Sep 16 15:30:27 UTC 2010
> i686 GNU/Linux
>
> Doing a 20GB backup on a debian server works fine
>
> server:~# uname -a
> Linux server 2.6.32-5-486 #1 Sat Sep 18 01:43:00 UTC 2010 i686 GNU/Linux
>
> Doing a 26 GB backup from a 32-bit Ubuntu machine works fine as well. Maybe it's
> a 64-bit issue...
>
> root@elke:~# uname -a
> Linux elke 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:17:33 UTC 2010 i686
> GNU/Linux
Hmm, some ubuntu magic in these kernels.
> If any further input is required, just let me know...
The MIBs might immediately tell what caused the segments between
09:32:56.150310 and 09:46:30.970140 to be discarded (e.g., take before
and after snapshots of /proc/net/netstat and /proc/net/snmp and see which
counters increased).
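A rough sketch of that before/after comparison, assuming the usual layout of
/proc/net/netstat and /proc/net/snmp (pairs of lines, first the counter names
and then the values, both prefixed with the protocol name). The sample text and
all of its numbers are invented for illustration and are not from the affected
machines.

```python
def parse_counters(text):
    """Parse /proc/net/netstat or /proc/net/snmp text into {"Proto.Name": int}."""
    counters = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The files come in pairs: a header line of names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        vproto, nums = values.split(":", 1)
        if proto != vproto:
            raise ValueError(f"mismatched pair: {proto!r} vs {vproto!r}")
        for name, num in zip(names.split(), nums.split()):
            counters[f"{proto}.{name}"] = int(num)
    return counters

def increased(before, after):
    """Return only the counters that grew between two snapshots."""
    return {k: after[k] - before.get(k, 0)
            for k in after if after[k] > before.get(k, 0)}

# Invented sample snapshots (hypothetical values).
BEFORE = """Tcp: InSegs OutSegs RetransSegs
Tcp: 1000 900 5
TcpExt: DelayedACKs TCPTimeouts
TcpExt: 40 2
"""
AFTER = """Tcp: InSegs OutSegs RetransSegs
Tcp: 1200 1100 18
TcpExt: DelayedACKs TCPTimeouts
TcpExt: 40 15
"""

for name, delta in sorted(increased(parse_counters(BEFORE),
                                    parse_counters(AFTER)).items()):
    print(f"{name} +{delta}")
```

On a real machine the two snapshots would come from reading the files
themselves, e.g. open("/proc/net/netstat").read(), once before the transfer and
once after the stall.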
Any TSO enabled?
--
i.
* Re: Fw: [Bug 18822] New: TCP Communications gets blocked, then resetted
2010-09-21 9:54 ` Ilpo Järvinen
@ 2010-09-21 10:18 ` Ilpo Järvinen
0 siblings, 0 replies; 3+ messages in thread
From: Ilpo Järvinen @ 2010-09-21 10:18 UTC
To: dc6iq; +Cc: Stephen Hemminger, Netdev
Now with the original reporter among the recipients too.
On Tue, 21 Sep 2010, Ilpo Järvinen wrote:
> On Mon, 20 Sep 2010, Stephen Hemminger wrote:
>
> > Begin forwarded message:
> >
> > Date: Mon, 20 Sep 2010 10:39:47 GMT
> > From: bugzilla-daemon@bugzilla.kernel.org
> > To: shemminger@linux-foundation.org
> > Subject: [Bug 18822] New: TCP Communications gets blocked, then resetted
> >
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=18822
> >
> > Summary: TCP Communications gets blocked, then resetted
> > Product: Networking
> > Version: 2.5
> > Kernel Version: 2.6.32-24-generic #43-Ubuntu
> > Platform: All
> > OS/Version: Linux
> > Tree: Mainline
> > Status: NEW
> > Severity: normal
> > Priority: P1
> > Component: IPV4
> > AssignedTo: shemminger@linux-foundation.org
> > ReportedBy: dc6iq@gmx.de
> > Regression: No
> >
> >
> > Created an attachment (id=30682)
> > --> (https://bugzilla.kernel.org/attachment.cgi?id=30682)
> > both machine dumps
> >
> > With a freshly installed bacula server, I could never complete a backup from my
> > working machine to the bacula server.
> >
> > The connection transmits approximately 3 GiB, then locks up and is reset.
> >
> > Transmission gets stuck at a certain point, after which the bacula server does
> > not reply to retransmitted packets on the IPv4 stack. The retry count on disks
> > (my working machine) rises to 13, then the connection is gone. The bacula server
> > then tries to send a push packet (after the keepalive timer runs out) and gets
> > the final RST packet from disks, because the connection is gone.
> >
> > The attachment contains the tcpdumps from both machines; note that the
> > sending machine dropped some packets in its dump.
>
> If you haven't already, try capturing with -w directly into a binary file and
> then post-process it into text with -r.
>
> > This might help: disks is running a 64-bit kernel whereas bacula
> > is running a 32-bit one. I haven't looked into the option bits very closely, but
> > it looks like there is a problem hidden there:
> >
> > last ack being ok:
> > 09:32:56.142876 IP bacula.elkenet.bacula-sd > disks.elkenet.50766: Flags [.],
> > ack 2005754588, win 9582, options [nop,nop,TS val 21825875 ecr 4530083], length
> > 0
> > next ack packet:
> > 09:32:56.144763 IP bacula.elkenet.bacula-sd > disks.elkenet.50766: Flags [.],
> > ack 2005773412, win 9308, options [nop,nop,TS val 21825876 ecr
> > 4530083,nop,nop,sack 1 {2005774860:2005776308}], length 0
>
> ...I fail to understand what problem you're trying to point out here with
> these two ACKs. Could you elaborate, please (if you had something specific
> in mind)?
>
> I went through the logs... I cannot do all of the checking,
> because tcpdump without enough -v's hides the sequence numbers of pure
> ACKs (09:32:56.150182 shows only the ack seqno, not the other seqno, which
> is also used by the validator); I think you need at least -vvv to show
> them nowadays. The last new data segment at 09:32:56.150132 was still
> received, as it is reported in SACK; only the retransmissions that
> follow are discarded.
>
> > root@disks:~# uname -a
> > Linux disks 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:58:24 UTC 2010
> > x86_64 GNU/Linux
> >
> > root@bacula:~# uname -a
> > Linux bacula 2.6.32-24-generic-pae #43-Ubuntu SMP Thu Sep 16 15:30:27 UTC 2010
> > i686 GNU/Linux
> >
> > Doing a 20GB backup on a debian server works fine
> >
> > server:~# uname -a
> > Linux server 2.6.32-5-486 #1 Sat Sep 18 01:43:00 UTC 2010 i686 GNU/Linux
> >
> > Doing a 26 GB backup from a 32-bit Ubuntu machine works fine as well. Maybe it's
> > a 64-bit issue...
> >
> > root@elke:~# uname -a
> > Linux elke 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:17:33 UTC 2010 i686
> > GNU/Linux
>
> Hmm, some ubuntu magic in these kernels.
>
> > If any further input is required, just let me know...
>
> The MIBs might immediately tell what caused the segments between
> 09:32:56.150310 and 09:46:30.970140 to be discarded (e.g., take before
> and after snapshots of /proc/net/netstat and /proc/net/snmp and see which
> counters increased).
>
> Any TSO enabled?
>
>
--
i.