* ICMP echo reply fails
From: Andy Fleming @ 2010-03-26 21:48 UTC
To: netdev

For various reasons, we have been running a stress test on one of our
boards. The test consists of initiating 2-3 flood pings from a
Windows box running Cygwin, plus one additional ping we use as a
"heartbeat". The ping flood is overwhelming our board (we're dropping
packets at a prodigious rate), but the board continues to respond for
a while. In addition, we are running a script on the board which
alternates bringing up and bringing down the interface every ten
seconds.

After a highly variable amount of time, the board stops replying to
the pings. We suspected a driver issue; however, on closer inspection,
we are still able to send and receive packets (I can ping *from* the
board to the PC, and I can *telnet* from the PC to the board). We tried
pinging the board from another PC, and it also failed. Essentially,
ICMP echo requests are being ignored (a glance at memory indicates that
packets are arriving, but no packets are being enqueued to the ethernet
controller).

We still have a lot more debugging to do, but I was wondering if anyone
had ever seen something like this, or might be quicker to realize the
obvious mistake we're making.

Thanks,
Andy Fleming

* Re: ICMP echo reply fails
From: Eric Dumazet @ 2010-03-26 22:06 UTC
To: Andy Fleming; +Cc: netdev

On Friday, 26 March 2010 at 16:48 -0500, Andy Fleming wrote:
> For various reasons, we have been running a stress test on one of our
> boards. The test consists of initiating 2-3 flood pings from a
> Windows box running Cygwin, plus one additional ping we use as a
> "heartbeat". The ping flood is overwhelming our board (we're dropping
> packets at a prodigious rate), but the board continues to respond for
> a while. In addition, we are running a script on the board which
> alternates bringing up and bringing down the interface every ten
> seconds. After a highly variable amount of time, the board stops
> replying to the pings. We suspected a driver issue; however, on
> closer inspection, we are still able to send and receive packets (I
> can ping *from* the board to the PC, and I can *telnet* from the PC to
> the board). We tried pinging the board from another PC, and it also
> failed. Essentially, ICMP echo requests are being ignored (a glance
> at memory indicates that packets are arriving, but no packets are
> being enqueued to the ethernet controller). We still have a lot more
> debugging to do, but I was wondering if anyone had ever seen something
> like this, or might be quicker to realize the obvious mistake we're
> making.
>
> Thanks,
> Andy Fleming

Kernel version?

NIC driver?

Are ICMP echo requests received? (grep Icmp /proc/net/snmp)

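The `grep Icmp /proc/net/snmp` check above prints two lines for ICMP: a header line of counter names and a value line in the same column order. As a rough sketch of how one might pull a single counter such as InEchos or OutEchoReps out of that pair, the helper below walks both lines token by token. The function names `next_token` and `snmp_lookup` are made up for this illustration and are not part of any kernel or libc API; the exact set of columns depends on the kernel version, so the lookup is done by name rather than by position.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Skip whitespace at *p and return the start of the next token, storing
 * its length in *len and advancing *p past it. Returns NULL when the
 * line is exhausted. (Hypothetical helper, for illustration only.) */
static const char *next_token(const char **p, size_t *len)
{
	const char *s = *p + strspn(*p, " \t\n");

	if (*s == '\0')
		return NULL;
	*len = strcspn(s, " \t\n");
	*p = s + *len;
	return s;
}

/* Given the "Icmp:" header line and the matching "Icmp:" value line,
 * find the counter called `name` and store its value in *out.
 * Returns 0 on success, -1 if the name is not present. */
static int snmp_lookup(const char *header, const char *values,
		       const char *name, unsigned long long *out)
{
	const char *h, *v;
	size_t hlen, vlen;

	assert(header && values && name && out);

	for (;;) {
		/* Consume one column from each line in lock step. */
		h = next_token(&header, &hlen);
		v = next_token(&values, &vlen);
		if (!h || !v)
			return -1;
		if (hlen == strlen(name) && strncmp(h, name, hlen) == 0) {
			*out = strtoull(v, NULL, 10);
			return 0;
		}
	}
}
```

To use it, read the two consecutive lines starting with `Icmp:` from /proc/net/snmp (for example with fgets()) and pass them as `header` and `values`. If InEchos keeps growing while OutEchoReps stays flat, the requests are reaching the stack but replies are not being generated.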
* Re: ICMP echo reply fails
From: Eric Dumazet @ 2010-03-26 22:46 UTC
To: Andy Fleming; +Cc: netdev

On Friday, 26 March 2010 at 23:06 +0100, Eric Dumazet wrote:
> On Friday, 26 March 2010 at 16:48 -0500, Andy Fleming wrote:
> > For various reasons, we have been running a stress test on one of our
> > boards. The test consists of initiating 2-3 flood pings from a
> > Windows box running Cygwin, plus one additional ping we use as a
> > "heartbeat". The ping flood is overwhelming our board (we're dropping
> > packets at a prodigious rate), but the board continues to respond for
> > a while. In addition, we are running a script on the board which
> > alternates bringing up and bringing down the interface every ten
> > seconds. After a highly variable amount of time, the board stops
> > replying to the pings. We suspected a driver issue; however, on
> > closer inspection, we are still able to send and receive packets (I
> > can ping *from* the board to the PC, and I can *telnet* from the PC to
> > the board). We tried pinging the board from another PC, and it also
> > failed. Essentially, ICMP echo requests are being ignored (a glance
> > at memory indicates that packets are arriving, but no packets are
> > being enqueued to the ethernet controller). We still have a lot more
> > debugging to do, but I was wondering if anyone had ever seen something
> > like this, or might be quicker to realize the obvious mistake we're
> > making.
> >
> > Thanks,
> > Andy Fleming
>
>
> Kernel version?
>
> NIC driver?
>
> Are ICMP echo requests received? (grep Icmp /proc/net/snmp)
>

vi +1166 net/ipv4/icmp.c

	/* Enough space for 2 64K ICMP packets, including
	 * sk_buff struct overhead.
	 */
	sk->sk_sndbuf =
		(2 * ((64 * 1024) + sizeof(struct sk_buff)));

If your driver loses or leaks many ICMP replies while the interface is
being brought up and down, the ICMP socket can consume its entire sndbuf
reserve, after which no more ICMP replies can be sent (a reboot is
needed).

You could try changing sk->sk_sndbuf to 0x7FFFFFFF to see if the ICMP
replies survive longer in your tests. If they do, then find the
leaks in your driver (tx path; maybe you forgot to free skbs in some
reset cases?).

We should add an SNMP counter for failed ip_append() calls in
icmp_push_reply()...

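The mechanism described above (every leaked reply permanently eats into a fixed send budget until nothing more can be sent) can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not kernel code: `struct toy_sock`, `toy_charge`, `toy_uncharge`, and `toy_run` are invented names, the 256-byte `TOY_SKB_OVERHEAD` stands in for `sizeof(struct sk_buff)` (which varies by kernel and architecture), and the kernel's real accounting checks differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for sizeof(struct sk_buff), which is not fixed. */
#define TOY_SKB_OVERHEAD 256u

/* Toy model of a socket's send budget (hypothetical, not a kernel type). */
struct toy_sock {
	size_t sndbuf;		/* fixed budget, as set in net/ipv4/icmp.c */
	size_t wmem_alloc;	/* bytes currently charged to the socket */
};

static void toy_sock_init(struct toy_sock *sk)
{
	/* Same formula as the snippet quoted from net/ipv4/icmp.c. */
	sk->sndbuf = 2 * ((64 * 1024) + TOY_SKB_OVERHEAD);
	sk->wmem_alloc = 0;
}

/* Charge one outgoing buffer; refuse once the budget is used up. */
static int toy_charge(struct toy_sock *sk, size_t truesize)
{
	if (sk->wmem_alloc >= sk->sndbuf)
		return -1;
	sk->wmem_alloc += truesize;
	return 0;
}

/* What happens when a well-behaved driver frees a transmitted buffer. */
static void toy_uncharge(struct toy_sock *sk, size_t truesize)
{
	assert(sk->wmem_alloc >= truesize);
	sk->wmem_alloc -= truesize;
}

/* Attempt `attempts` replies of `truesize` bytes each. Every
 * `leak_every`-th attempt is never freed (0 means the driver never
 * leaks). Returns how many replies were actually sent. */
static unsigned long toy_run(struct toy_sock *sk, unsigned long attempts,
			     size_t truesize, unsigned long leak_every)
{
	unsigned long i, sent = 0;

	for (i = 1; i <= attempts; i++) {
		if (toy_charge(sk, truesize) < 0)
			continue;	/* budget exhausted: reply is dropped */
		sent++;
		if (leak_every == 0 || i % leak_every != 0)
			toy_uncharge(sk, truesize);
	}
	return sent;
}
```

In this model, a driver that never leaks can send indefinitely, while one that leaks even a small fraction of buffers eventually stalls for good, and stays stalled after the leaking stops. That matches the symptom in the report: other traffic still works, but ICMP replies stop until reboot. Raising the budget, as suggested above, only delays the stall; it does not remove the leak.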
* Re: ICMP echo reply fails
From: Andy Fleming @ 2010-03-27 0:56 UTC
To: netdev

On Fri, Mar 26, 2010 at 5:46 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Friday, 26 March 2010 at 23:06 +0100, Eric Dumazet wrote:
>> On Friday, 26 March 2010 at 16:48 -0500, Andy Fleming wrote:
>> > For various reasons, we have been running a stress test on one of our
>> > boards. The test consists of initiating 2-3 flood pings from a
>> > Windows box running Cygwin, plus one additional ping we use as a
>> > "heartbeat". The ping flood is overwhelming our board (we're dropping
>> > packets at a prodigious rate), but the board continues to respond for
>> > a while. In addition, we are running a script on the board which
>> > alternates bringing up and bringing down the interface every ten
>> > seconds. After a highly variable amount of time, the board stops
>> > replying to the pings. We suspected a driver issue; however, on
>> > closer inspection, we are still able to send and receive packets (I
>> > can ping *from* the board to the PC, and I can *telnet* from the PC to
>> > the board). We tried pinging the board from another PC, and it also
>> > failed. Essentially, ICMP echo requests are being ignored (a glance
>> > at memory indicates that packets are arriving, but no packets are
>> > being enqueued to the ethernet controller). We still have a lot more
>> > debugging to do, but I was wondering if anyone had ever seen something
>> > like this, or might be quicker to realize the obvious mistake we're
>> > making.
>> >
>> > Thanks,
>> > Andy Fleming
>>
>>
>> Kernel version?
>>
>> NIC driver?
>>
>> Are ICMP echo requests received? (grep Icmp /proc/net/snmp)
>>
>
> vi +1166 net/ipv4/icmp.c
>
>	/* Enough space for 2 64K ICMP packets, including
>	 * sk_buff struct overhead.
>	 */
>	sk->sk_sndbuf =
>		(2 * ((64 * 1024) + sizeof(struct sk_buff)));
>
>
> If your driver loses or leaks many ICMP replies while the interface is
> being brought up and down, the ICMP socket can consume its entire sndbuf
> reserve, after which no more ICMP replies can be sent (a reboot is
> needed).
>
> You could try changing sk->sk_sndbuf to 0x7FFFFFFF to see if the ICMP
> replies survive longer in your tests. If they do, then find the
> leaks in your driver (tx path; maybe you forgot to free skbs in some
> reset cases?).
>

Ah, that makes a bunch of sense. I had a feeling the socket was
involved. Thank you so much for your help. I will test this as soon
as I have access to the board again!

* [PATCH net-next-2.6]
From: Eric Dumazet @ 2010-04-02 7:09 UTC
To: Andy Fleming, David Miller; +Cc: netdev

On Friday, 26 March 2010 at 23:46 +0100, Eric Dumazet wrote:
>
> We should add an SNMP counter for failed ip_append() calls in
> icmp_push_reply()...
>

It appears we never increment ICMP_MIB_OUTERRORS, so we can use the
official RFC 2011 SNMP counter for this.

[PATCH net-next-2.6] icmp: Account for ICMP out errors

When ip_append() fails because of socket limit or memory shortage,
increment ICMP_MIB_OUTERRORS counter, so that "netstat -s" can report
these errors.

LANG=C netstat -s | grep "ICMP messages failed"
    0 ICMP messages failed

For IPV6, implement ICMP6_MIB_OUTERRORS counter as well.

# grep Icmp6OutErrors /proc/net/dev_snmp6/*
/proc/net/dev_snmp6/eth0:Icmp6OutErrors                   	0
/proc/net/dev_snmp6/lo:Icmp6OutErrors                   	0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/snmp.h |    1 +
 net/ipv4/icmp.c      |    5 +++--
 net/ipv6/icmp.c      |    2 ++
 net/ipv6/proc.c      |    1 +
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/snmp.h b/include/linux/snmp.h
index d2a9aa3..5279771 100644
--- a/include/linux/snmp.h
+++ b/include/linux/snmp.h
@@ -100,6 +100,7 @@ enum
 	ICMP6_MIB_INMSGS,			/* InMsgs */
 	ICMP6_MIB_INERRORS,			/* InErrors */
 	ICMP6_MIB_OUTMSGS,			/* OutMsgs */
+	ICMP6_MIB_OUTERRORS,			/* OutErrors */
 	__ICMP6_MIB_MAX
 };
 
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 4b4c2bc..d2aa743 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -330,9 +330,10 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param,
 	if (ip_append_data(sk, icmp_glue_bits, icmp_param,
 			   icmp_param->data_len+icmp_param->head_len,
 			   icmp_param->head_len,
-			   ipc, rt, MSG_DONTWAIT) < 0)
+			   ipc, rt, MSG_DONTWAIT) < 0) {
+		ICMP_INC_STATS_BH(sock_net(sk), ICMP_MIB_OUTERRORS);
 		ip_flush_pending_frames(sk);
-	else if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) {
+	} else if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) {
 		struct icmphdr *icmph = icmp_hdr(skb);
 		__wsum csum = 0;
 		struct sk_buff *skb1;
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index eb9abe2..a00c18a 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -482,6 +482,7 @@ route_done:
 				np->tclass, NULL, &fl, (struct rt6_info*)dst,
 				MSG_DONTWAIT);
 	if (err) {
+		ICMP6_INC_STATS_BH(net, idev, ICMP6_MIB_OUTERRORS);
 		ip6_flush_pending_frames(sk);
 		goto out_put;
 	}
@@ -562,6 +563,7 @@ static void icmpv6_echo_reply(struct sk_buff *skb)
 				(struct rt6_info*)dst, MSG_DONTWAIT);
 
 	if (err) {
+		ICMP6_INC_STATS_BH(net, idev, ICMP6_MIB_OUTERRORS);
 		ip6_flush_pending_frames(sk);
 		goto out_put;
 	}
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 58344c0..458eabf 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -97,6 +97,7 @@ static const struct snmp_mib snmp6_icmp6_list[] = {
 	SNMP_MIB_ITEM("Icmp6InMsgs", ICMP6_MIB_INMSGS),
 	SNMP_MIB_ITEM("Icmp6InErrors", ICMP6_MIB_INERRORS),
 	SNMP_MIB_ITEM("Icmp6OutMsgs", ICMP6_MIB_OUTMSGS),
+	SNMP_MIB_ITEM("Icmp6OutErrors", ICMP6_MIB_OUTERRORS),
 	SNMP_MIB_SENTINEL
 };
 

* Re: [PATCH net-next-2.6] icmp: Account for ICMP out errors
From: David Miller @ 2010-04-03 22:09 UTC
To: eric.dumazet; +Cc: afleming, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 02 Apr 2010 09:09:34 +0200

> [PATCH net-next-2.6] icmp: Account for ICMP out errors
>
> When ip_append() fails because of socket limit or memory shortage,
> increment ICMP_MIB_OUTERRORS counter, so that "netstat -s" can report
> these errors.
>
> LANG=C netstat -s | grep "ICMP messages failed"
>     0 ICMP messages failed
>
> For IPV6, implement ICMP6_MIB_OUTERRORS counter as well.
>
> # grep Icmp6OutErrors /proc/net/dev_snmp6/*
> /proc/net/dev_snmp6/eth0:Icmp6OutErrors                   	0
> /proc/net/dev_snmp6/lo:Icmp6OutErrors                   	0
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks Eric.

Thread overview: 6+ messages
2010-03-26 21:48 ICMP echo reply fails Andy Fleming
2010-03-26 22:06 ` Eric Dumazet
2010-03-26 22:46   ` Eric Dumazet
2010-03-27  0:56     ` Andy Fleming
2010-04-02  7:09     ` [PATCH net-next-2.6] Eric Dumazet
2010-04-03 22:09       ` [PATCH net-next-2.6] icmp: Account for ICMP out errors David Miller