public inbox for linux-kernel@vger.kernel.org
* Lost network connectivity in 4.0.x
@ 2015-05-24  2:43 Ken Moffat
  2015-05-24  3:29 ` Ken Moffat
  0 siblings, 1 reply; 5+ messages in thread
From: Ken Moffat @ 2015-05-24  2:43 UTC
  To: linux-kernel

Anybody else suffering from lost network connectivity in 4.0.x
kernels?  A couple of times this week, vim on an nfs-3 mount hung
and I had to reboot.  Both of those occasions were on an AMD desktop
with the r8169 driver, running 4.0.3.  I thought it might be
specific to that machine.  For the last two or three days I've been
using an Intel machine, and about 10 minutes ago it suffered the same problem
while running 4.0.4.  Using ping from another term showed that it
had no connectivity to the server on my local network.

This is a bit hard to diagnose - nothing in the logs.

ĸen
-- 
Nanny Ogg usually went to bed early. After all, she was an old lady.
Sometimes she went to bed as early as 6 a.m.


* Re: Lost network connectivity in 4.0.x
  2015-05-24  2:43 Lost network connectivity in 4.0.x Ken Moffat
@ 2015-05-24  3:29 ` Ken Moffat
  2015-05-28  5:53   ` Cong Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Ken Moffat @ 2015-05-24  3:29 UTC
  To: linux-kernel

On Sun, May 24, 2015 at 03:43:52AM +0100, Ken Moffat wrote:
> Anybody else suffering from lost network connectivity in 4.0.x
> kernels?  A couple of times this week, vim on an nfs-3 mount hung
> and I had to reboot.  Both of those occasions were on an AMD desktop
> with the r8169 driver, running 4.0.3.  I thought it might be
> specific to that machine.  For the last two or three days I've been
> using an Intel machine, and about 10 minutes ago it suffered the same problem
> while running 4.0.4.  Using ping from another term showed that it
> had no connectivity to the server on my local network.
> 
> This is a bit hard to diagnose - nothing in the logs.
> 
I forgot to add that this is with the released gcc-5.1 : I keep
forgetting that some people use old compilers ;-)

ĸen


* Re: Lost network connectivity in 4.0.x
  2015-05-24  3:29 ` Ken Moffat
@ 2015-05-28  5:53   ` Cong Wang
  2015-05-28 14:41     ` Ken Moffat
  0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2015-05-28  5:53 UTC
  To: Ken Moffat; +Cc: LKML, Linux Kernel Network Developers

(Please always Cc netdev for networking bugs.)

On Sat, May 23, 2015 at 8:29 PM, Ken Moffat <zarniwhoop@ntlworld.com> wrote:
> On Sun, May 24, 2015 at 03:43:52AM +0100, Ken Moffat wrote:
>> Anybody else suffering from lost network connectivity in 4.0.x
>> kernels?  A couple of times this week, vim on an nfs-3 mount hung
>> and I had to reboot.  Both of those occasions were on an AMD desktop
>> with the r8169 driver, running 4.0.3.  I thought it might be
>> specific to that machine.  For the last two or three days I've been
>> using an Intel machine, and about 10 minutes ago it suffered the same problem
>> while running 4.0.4.  Using ping from another term showed that it
>> had no connectivity to the server on my local network.
>>
>> This is a bit hard to diagnose - nothing in the logs.
>>
> I forgot to add that this is with the released gcc-5.1 : I keep
> forgetting that some people use old compilers ;-)
>

Is there any way you can help to narrow down the problem?

For example:

1) What is your network setup? iptables? routes? etc.

2) Can you check the stats to see if there is any error?
  `ip -s -s li show`, `ethtool -S <DEV>`

3) Do a bisect?

Thanks!


* Re: Lost network connectivity in 4.0.x
  2015-05-28  5:53   ` Cong Wang
@ 2015-05-28 14:41     ` Ken Moffat
  2015-05-28 16:11       ` Ken Moffat
  0 siblings, 1 reply; 5+ messages in thread
From: Ken Moffat @ 2015-05-28 14:41 UTC
  To: Cong Wang; +Cc: LKML, Linux Kernel Network Developers

On Wed, May 27, 2015 at 10:53:00PM -0700, Cong Wang wrote:
> (Please always Cc netdev for networking bugs.)
> 
> On Sat, May 23, 2015 at 8:29 PM, Ken Moffat <zarniwhoop@ntlworld.com> wrote:
> > On Sun, May 24, 2015 at 03:43:52AM +0100, Ken Moffat wrote:
> >> Anybody else suffering from lost network connectivity in 4.0.x
> >> kernels?  A couple of times this week, vim on an nfs-3 mount hung
> >> and I had to reboot.  Both of those occasions were on an AMD desktop
> >> with the r8169 driver, running 4.0.3.  I thought it might be
> >> specific to that machine.  For the last two or three days I've been
> >> using an Intel machine, and about 10 minutes ago it suffered the same problem
> >> while running 4.0.4.  Using ping from another term showed that it
> >> had no connectivity to the server on my local network.
> >>
> >> This is a bit hard to diagnose - nothing in the logs.
> >>
> > I forgot to add that this is with the released gcc-5.1 : I keep
> > forgetting that some people use old compilers ;-)
> >
> 
> Is there any way you can help to narrow down the problem?
> 

Thanks for the reply.  The problem is continuing to show up, but
irregularly and often only after the machine has been booted for a
long time (with s2ram, but I don't think I've used s2ram on every
occasion).

> For example:
> 
> 1) What is your network setup? iptables? routes? etc.
> 
I'm using iptables.  Ah, yes - it started dropping packets around
the time I last had a problem:

May 27 00:48:26 ac4tv dhclient: DHCPREQUEST on eth0 to 192.168.7.254
port 67
May 27 00:48:27 ac4tv dhclient: DHCPACK from 192.168.7.254
May 27 00:48:27 ac4tv dhclient: bound to 192.168.7.152 -- renewal in
1787 seconds.

That address came from my router, and I had been getting the same
address for an hour, but then the dropped packet messages start
appearing - they are for a different address, one that would have
been offered by my server:

May 27 00:53:16 ac4tv kernel: [31922.316798] IPTABLES Packet
Dropped: IN=eth0 OUT= MAC=c8:60:00:97:07:35:bc:ae:c5:57:70:c5:08:00
SRC=192.168.7.11 DST=192.168.7.121 LEN=60 TOS=0x00 PREC=0x00 TTL=64
ID=0 DF PROTO=TCP SPT=2049 DPT=1005 WINDOW=28960 RES=0x00 ACK SYN
URGP=0 
May 27 00:53:17 ac4tv kernel: [31923.316612] IPTABLES Packet
Dropped: IN=eth0 OUT= MAC=c8:60:00:97:07:35:bc:ae:c5:57:70:c5:08:00
SRC=192.168.7.11 DST=192.168.7.121 LEN=60 TOS=0x00 PREC=0x00 TTL=64
ID=0 DF PROTO=TCP SPT=2049 DPT=1005 WINDOW=28960 RES=0x00 ACK SYN
URGP=0 

and those continued until I forced a reboot.
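
A minimal sketch of how log excerpts like the ones above could be checked
automatically. It assumes the syslog messages are available unwrapped (one
message per line) and that the drop rule logs with the prefix
"IPTABLES Packet Dropped" as shown. The function names are hypothetical and
only Python's standard `re` module is used. It flags dropped packets whose
DST is not the address dhclient most recently bound.

```python
import re

# KEY=VALUE tokens in a netfilter LOG line (SRC=..., DST=..., DPT=...).
FIELD_RE = re.compile(r"\b([A-Z]+)=(\S*)")
# The address dhclient reports after a successful lease.
BOUND_RE = re.compile(r"dhclient: bound to (\d+\.\d+\.\d+\.\d+)")


def parse_drop(line):
    """Return the KEY=VALUE fields of a dropped-packet log line as a dict."""
    return dict(FIELD_RE.findall(line))


def misaddressed_drops(lines, marker="IPTABLES Packet Dropped"):
    """Yield (bound_address, fields) for drops whose DST is not the bound address."""
    current = None
    for line in lines:
        bound = BOUND_RE.search(line)
        if bound:
            current = bound.group(1)
            continue
        if marker in line:
            fields = parse_drop(line)
            if current is not None and fields.get("DST") != current:
                yield current, fields


# Two messages from the excerpt above, rejoined onto single lines.
SAMPLE = [
    "May 27 00:48:27 ac4tv dhclient: bound to 192.168.7.152 -- renewal in 1787 seconds.",
    "May 27 00:53:16 ac4tv kernel: [31922.316798] IPTABLES Packet Dropped: IN=eth0 OUT= "
    "MAC=c8:60:00:97:07:35:bc:ae:c5:57:70:c5:08:00 SRC=192.168.7.11 DST=192.168.7.121 "
    "LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=2049 DPT=1005 "
    "WINDOW=28960 RES=0x00 ACK SYN URGP=0",
]

hits = list(misaddressed_drops(SAMPLE))
for bound_addr, fields in hits:
    print(f"bound to {bound_addr}, but dropped packet was for {fields['DST']} "
          f"(from {fields['SRC']} port {fields['SPT']})")
```

Source port 2049 is the standard NFS port, so a hit like this one points at
replies from the NFS server being sent to an address the client no longer
holds.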

> 2) Can you check the stats to see if there is any error?
>   `ip -s -s li show`, `ethtool -S <DEV>`
> 

I don't have ethtool installed, and that ip command appears ok at
the moment:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
mode DEFAULT group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    RX: bytes  packets  errors  dropped overrun mcast   
    3964       66       0       0       0       0       
    RX errors: length   crc     frame   fifo    missed
               0        0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    3964       66       0       0       0       0       
    TX errors: aborted  fifo   window heartbeat transns
               0        0       0       0       0       
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 1000
    link/ether c8:60:00:97:07:35 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    224661061  277642   0       0       0       0       
    RX errors: length   crc     frame   fifo    missed
               0        0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    278152429  370438   0       0       0       0       
    TX errors: aborted  fifo   window heartbeat transns
               0        0       0       0       6       
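
A rough sketch of how output in this layout could be scanned for non-zero
error counters. It assumes the exact header/value pairing shown above (a
"RX:" or "TX errors:" header line followed by one line of numbers) and
unwrapped interface lines; other iproute2 versions may print a different
layout. The function name and the ignore list are hypothetical.

```python
def nonzero_counters(text, ignore=("bytes", "packets", "mcast")):
    """Map interface -> {counter: value} for every non-zero counter not in `ignore`."""
    report = {}
    iface = None
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line and not line[0].isspace():
            # Unindented "2: eth0: <FLAGS> ..." lines start a new interface block.
            head, sep, rest = line.partition(":")
            if sep and head.isdigit():
                iface = rest.split(":")[0].strip()
            continue
        stripped = line.strip()
        if stripped.startswith(("RX", "TX")) and ":" in stripped and i + 1 < len(lines):
            # Header line names the columns; the next line carries the values.
            prefix, _, names = stripped.partition(":")
            values = lines[i + 1].split()
            for name, value in zip(names.split(), values):
                if name not in ignore and value.isdigit() and int(value) != 0:
                    report.setdefault(iface, {})[f"{prefix} {name}"] = int(value)
    return report


# The eth0 block from the output above, with the wrapped first line rejoined.
SAMPLE = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether c8:60:00:97:07:35 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    224661061  277642   0       0       0       0
    RX errors: length   crc     frame   fifo    missed
               0        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    278152429  370438   0       0       0       0
    TX errors: aborted  fifo   window heartbeat transns
               0        0       0       0       6
"""

report = nonzero_counters(SAMPLE)
print(report)
```

On this sample the only non-zero counter is `transns` on eth0, which appears
to count link carrier transitions rather than transmit errors, so on its own
it does not indicate a fault.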

> 3) Do a bisect?
> 
> Thanks!

That doesn't seem very practical when the machine is ok for a couple
of days at a time.

ĸen


* Re: Lost network connectivity in 4.0.x
  2015-05-28 14:41     ` Ken Moffat
@ 2015-05-28 16:11       ` Ken Moffat
  0 siblings, 0 replies; 5+ messages in thread
From: Ken Moffat @ 2015-05-28 16:11 UTC
  To: Cong Wang; +Cc: LKML, Linux Kernel Network Developers

On Thu, May 28, 2015 at 03:41:49PM +0100, Ken Moffat wrote:
> On Wed, May 27, 2015 at 10:53:00PM -0700, Cong Wang wrote:
> > (Please always Cc netdev for networking bugs.)
> > 
Sorry, I didn't spot that.  But anyway:
> 
> > For example:
> > 
> > 1) What is your network setup? iptables? routes? etc.
> > 
> I'm using iptables.  Ah, yes - it started dropping packets around
> the time I last had a problem:
> 
> May 27 00:48:26 ac4tv dhclient: DHCPREQUEST on eth0 to 192.168.7.254
> port 67
> May 27 00:48:27 ac4tv dhclient: DHCPACK from 192.168.7.254
> May 27 00:48:27 ac4tv dhclient: bound to 192.168.7.152 -- renewal in
> 1787 seconds.
> 
> That address came from my router, and I had been getting the same
> address for an hour, but then the dropped packet messages start
> appearing - they are for a different address, one that would have
> been offered by my server:
> 
Now that I've had time to think about this and look a bit more
deeply, I can see that at one point I got a lease from my server,
but then after a random length of time the client tried to renew and
got a lease, with a different address, from the router.  Some time
after that, NFS failed: iptables rejected the server's packets
because they were still addressed to the old address and so were
"not for me".
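
The sequence described above (two DHCP servers answering, and the bound
address changing on renewal) could be spotted in the dhclient log with
something like the following sketch. The function name is hypothetical, and
the first two sample lines are an invented reconstruction of the earlier
lease from the server: they are not in the logs quoted in this thread, and
the assumption that the server answers DHCP from 192.168.7.11 is only a
guess based on the NFS source address.

```python
import re

ACK_RE = re.compile(r"DHCPACK from (\d+\.\d+\.\d+\.\d+)")
BOUND_RE = re.compile(r"bound to (\d+\.\d+\.\d+\.\d+)")


def lease_changes(lines):
    """Return (old_addr, new_addr, acking_server) each time the bound address changes."""
    changes = []
    server = None   # DHCP server that sent the most recent DHCPACK
    address = None  # address currently bound
    for line in lines:
        ack = ACK_RE.search(line)
        if ack:
            server = ack.group(1)
            continue
        bound = BOUND_RE.search(line)
        if bound:
            new = bound.group(1)
            if address is not None and new != address:
                changes.append((address, new, server))
            address = new
    return changes


LOG = [
    # HYPOTHETICAL: earlier lease from the server, reconstructed for illustration only.
    "dhclient: DHCPACK from 192.168.7.11",
    "dhclient: bound to 192.168.7.121 -- renewal in 1800 seconds.",
    # From the log quoted earlier in the thread.
    "May 27 00:48:27 ac4tv dhclient: DHCPACK from 192.168.7.254",
    "May 27 00:48:27 ac4tv dhclient: bound to 192.168.7.152 -- renewal in 1787 seconds.",
]

changes = lease_changes(LOG)
for old, new, server in changes:
    print(f"address changed {old} -> {new} (lease from {server})")
```

Any reported change means connections opened under the old address, such as
an existing NFS mount, may start failing once a firewall only accepts traffic
for the new one.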

So, not a kernel problem, and the reason I'm (now) seeing this on
4.0+ kernels is that I have not recently booted a system with an old
(3.19 or earlier) kernel and kept it running for a long time.

Thanks again, sorry to waste everybody's bandwidth.

ĸen

