* [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
@ 2010-06-01 21:13 Christoph Lameter
  2010-06-01 22:07 ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2010-06-01 21:13 UTC
  To: netdev; +Cc: Stephen Hemminger, David Miller

Something like this would have been very helpful during recent debugging
of multicast issues. Silent discards are bad.


If the kernel perceives that something is wrong with an incoming packet then the
IP stack currently silently discards packets. This makes it difficult to diagnose
problems with the network configurations (such as a misbehaving kernel
subsystem discarding multicast packets because the reverse path filter
does not like multicast subscriptions on the second NIC with rp_filter=1).

It is also necessary to know how many inbound packets are discarded to
assess networking issues in general with a NIC.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>

---
 net/ipv4/route.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux-2.6/net/ipv4/route.c
===================================================================
--- linux-2.6.orig/net/ipv4/route.c	2010-06-01 11:46:10.000000000 -0500
+++ linux-2.6/net/ipv4/route.c	2010-06-01 11:52:55.000000000 -0500
@@ -2981,6 +2981,9 @@ static int inet_rtm_getroute(struct sk_b
 		rt = skb_rtable(skb);
 		if (err == 0 && rt->u.dst.error)
 			err = -rt->u.dst.error;
+		if (err)
+			IP_INC_STATS_BH(dev_net(skb->dev),
+					IPSTATS_MIB_INADDRERRORS);
 	} else {
 		struct flowi fl = {
 			.nl_u = {
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-01 21:13 [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful Christoph Lameter
@ 2010-06-01 22:07 ` Eric Dumazet
  2010-06-01 22:23 ` David Miller
  2010-06-02 15:27 ` Christoph Lameter
  0 siblings, 2 replies; 28+ messages in thread
From: Eric Dumazet @ 2010-06-01 22:07 UTC
  To: Christoph Lameter; +Cc: netdev, Stephen Hemminger, David Miller

On Tuesday, 01 June 2010 at 16:13 -0500, Christoph Lameter wrote:
> Something like this would have been very helpful during recent debugging
> of multicast issues. Silent discards are bad.
>
>
> If the kernel perceives that something is wrong with an incoming packet then the
> IP stack currently silently discards packets. This makes it difficult to diagnose
> problems with the network configurations (such as a misbehaving kernel
> subsystem discarding multicast packets because the reverse path filter
> does not like multicast subscriptions on the second NIC with rp_filter=1).
>
> It is also necessary to know how many inbound packets are discarded to
> assess networking issues in general with a NIC.
>
> Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>
>

I disagree with this patch.

IPSTATS_MIB_INADDRERRORS has a strong meaning, part of RFCS.

In this path, we simulate the routing of a virtual packet, not its
delivery.

This should not affect IPSTATS SNMP entries.

You should use another MIB entry, say LINUX_MIB_INROUTEERRORS ?

Doesn't the inet_rtm_getroute() caller get an error status anyway?

> ---
>  net/ipv4/route.c |    3 +++
>  1 file changed, 3 insertions(+)
>
> Index: linux-2.6/net/ipv4/route.c
> ===================================================================
> --- linux-2.6.orig/net/ipv4/route.c	2010-06-01 11:46:10.000000000 -0500
> +++ linux-2.6/net/ipv4/route.c	2010-06-01 11:52:55.000000000 -0500
> @@ -2981,6 +2981,9 @@ static int inet_rtm_getroute(struct sk_b
>  		rt = skb_rtable(skb);
>  		if (err == 0 && rt->u.dst.error)
>  			err = -rt->u.dst.error;
> +		if (err)
> +			IP_INC_STATS_BH(dev_net(skb->dev),
> +					IPSTATS_MIB_INADDRERRORS);
>  	} else {
>  		struct flowi fl = {
>  			.nl_u = {
>
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-01 22:07 ` Eric Dumazet
@ 2010-06-01 22:23 ` David Miller
  2010-06-02 15:27 ` Christoph Lameter
  1 sibling, 0 replies; 28+ messages in thread
From: David Miller @ 2010-06-01 22:23 UTC
  To: eric.dumazet; +Cc: cl, netdev, shemminger

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 02 Jun 2010 00:07:34 +0200

> IPSTATS_MIB_INADDRERRORS has a strong meaning, part of RFCS.
>
> In this path, we simulate the routing of a virtual packet, not its
> delivery.
>
> This should not affect IPSTATS SNMP entries.
>
> You should use another MIB entry, say LINUX_MIB_INROUTEERRORS ?

Agreed.
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-01 22:07 ` Eric Dumazet
  2010-06-01 22:23 ` David Miller
@ 2010-06-02 15:27 ` Christoph Lameter
  2010-06-02 15:29 ` David Miller
  2010-06-02 15:32 ` Eric Dumazet
  1 sibling, 2 replies; 28+ messages in thread
From: Christoph Lameter @ 2010-06-02 15:27 UTC
  To: Eric Dumazet; +Cc: netdev, Stephen Hemminger, David Miller

On Wed, 2 Jun 2010, Eric Dumazet wrote:

> On Tuesday, 01 June 2010 at 16:13 -0500, Christoph Lameter wrote:
> > Something like this would have been very helpful during recent debugging
> > of multicast issues. Silent discards are bad.
> >
> >
> > If the kernel perceives that something is wrong with an incoming packet then the
> > IP stack currently silently discards packets. This makes it difficult to diagnose
> > problems with the network configurations (such as a misbehaving kernel
> > subsystem discarding multicast packets because the reverse path filter
> > does not like multicast subscriptions on the second NIC with rp_filter=1).
> >
> > It is also necessary to know how many inbound packets are discarded to
> > assess networking issues in general with a NIC.
> >
> > Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
> > Acked-by: Stephen Hemminger <shemminger@vyatta.com>
> >
>
> I disagree with this patch.
>
> IPSTATS_MIB_INADDRERRORS has a strong meaning, part of RFCS.
>
> In this path, we simulate the routing of a virtual packet, not its
> delivery.
>
> This should not affect IPSTATS SNMP entries.
>
> You should use another MIB entry, say LINUX_MIB_INROUTEERRORS ?
>
> Doesn't the inet_rtm_getroute() caller get an error status anyway?

Yes, but they do not increment any counter. If packets are dropped because
of the rp_filter setting interfering, for example, then the packets vanish
without any accounting.

LINUX_MIB_INROUTEERRORS? Does it mean I can create a series of new
counters that allow us to diagnose and distinguish all the different
causes of packet loss? We would love to have that.
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 15:27 ` Christoph Lameter
@ 2010-06-02 15:29 ` David Miller
  2010-06-02 15:32 ` Eric Dumazet
  1 sibling, 0 replies; 28+ messages in thread
From: David Miller @ 2010-06-02 15:29 UTC
  To: cl; +Cc: eric.dumazet, netdev, shemminger

From: Christoph Lameter <cl@linux-foundation.org>
Date: Wed, 2 Jun 2010 10:27:13 -0500 (CDT)

> LINUX_MIB_INROUTEERRORS? Does it mean I can create a series of new
> counters that allow us to diagnose and distinguish all the different
> causes of packet loss? We would love to have that.

Within reason.

If you're going to spam the tree with something like 10 or 20 new stat
counters getting bumped all over the place, that's not what we're
trying to suggest here.

Consolidate as much as possible, add new things when absolutely
nothing existing fits the bill.
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 15:27 ` Christoph Lameter
  2010-06-02 15:29 ` David Miller
@ 2010-06-02 15:32 ` Eric Dumazet
  2010-06-02 16:12 ` Christoph Lameter
  1 sibling, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2010-06-02 15:32 UTC
  To: Christoph Lameter; +Cc: netdev, Stephen Hemminger, David Miller

On Wednesday, 02 June 2010 at 10:27 -0500, Christoph Lameter wrote:

> Yes, but they do not increment any counter. If packets are dropped because
> of the rp_filter setting interfering, for example, then the packets vanish
> without any accounting.
>

But packets are vanishing either way.

> LINUX_MIB_INROUTEERRORS? Does it mean I can create a series of new
> counters that allow us to diagnose and distinguish all the different
> causes of packet loss? We would love to have that.
>

For an example, you could take a look at commit 907cdda5205b
(tcp: Add SNMP counter for DEFER_ACCEPT)

It's pretty straightforward, and won't conflict with monitoring apps.
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 15:32 ` Eric Dumazet
@ 2010-06-02 16:12 ` Christoph Lameter
  2010-06-02 16:19 ` David Miller
  2010-06-02 16:28 ` Eric Dumazet
  0 siblings, 2 replies; 28+ messages in thread
From: Christoph Lameter @ 2010-06-02 16:12 UTC
  To: Eric Dumazet; +Cc: netdev, Stephen Hemminger, David Miller

On Wed, 2 Jun 2010, Eric Dumazet wrote:

> On Wednesday, 02 June 2010 at 10:27 -0500, Christoph Lameter wrote:
>
> > Yes, but they do not increment any counter. If packets are dropped because
> > of the rp_filter setting interfering, for example, then the packets vanish
> > without any accounting.
> >
>
> But packets are vanishing either way.

It's important to know why drops occur (any drops for that matter, drops
mean retransmit which means latency).

The symptom here was that multicast traffic on secondary interfaces was
not being forwarded to the application. The rp_filter just dropped them
and there was no way to easily track down the issue for the people
experiencing the problem.

In 2.6.31 the rp_filter was fixed to work properly with multicast and now
it considers multicast traffic to secondary interfaces to have the wrong
reverse path even though the multicast subscription occurred on the
secondary interface. So it drops multicast traffic. The rp_filter has to
be switched off when using multiple NICs for multicast load balancing.

> > LINUX_MIB_INROUTEERRORS? Does it mean I can create a series of new
> > counters that allow us to diagnose and distinguish all the different
> > causes of packet loss? We would love to have that.
> >
>
> For an example, you could take a look at commit 907cdda5205b
> (tcp: Add SNMP counter for DEFER_ACCEPT)

Well these are all TCP counters. I would add IP and UDP counters?
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 16:12 ` Christoph Lameter
@ 2010-06-02 16:19 ` David Miller
  2010-06-02 16:27 ` Christoph Lameter
  2010-06-02 16:28 ` Eric Dumazet
  1 sibling, 1 reply; 28+ messages in thread
From: David Miller @ 2010-06-02 16:19 UTC
  To: cl; +Cc: eric.dumazet, netdev, shemminger

From: Christoph Lameter <cl@linux-foundation.org>
Date: Wed, 2 Jun 2010 11:12:14 -0500 (CDT)

> It's important to know why drops occur (any drops for that matter, drops
> mean retransmit which means latency).

We know, that's why there is a networking tracepoint that allows you
to see where all drops occur. :-)
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 16:19 ` David Miller
@ 2010-06-02 16:27 ` Christoph Lameter
  2010-06-02 16:33 ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2010-06-02 16:27 UTC
  To: David Miller; +Cc: eric.dumazet, netdev, shemminger

On Wed, 2 Jun 2010, David Miller wrote:

> From: Christoph Lameter <cl@linux-foundation.org>
> Date: Wed, 2 Jun 2010 11:12:14 -0500 (CDT)
>
> > It's important to know why drops occur (any drops for that matter, drops
> > mean retransmit which means latency).
>
> We know, that's why there is a networking tracepoint that allows you
> to see where all drops occur. :-)

Where can I find out more about the network tracepoint?
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 16:27 ` Christoph Lameter
@ 2010-06-02 16:33 ` Eric Dumazet
  2010-06-02 16:49 ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2010-06-02 16:33 UTC
  To: Christoph Lameter; +Cc: David Miller, netdev, shemminger

On Wednesday, 02 June 2010 at 11:27 -0500, Christoph Lameter wrote:
> On Wed, 2 Jun 2010, David Miller wrote:
>
> > From: Christoph Lameter <cl@linux-foundation.org>
> > Date: Wed, 2 Jun 2010 11:12:14 -0500 (CDT)
> >
> > > It's important to know why drops occur (any drops for that matter, drops
> > > mean retransmit which means latency).
> >
> > We know, that's why there is a networking tracepoint that allows you
> > to see where all drops occur. :-)
>
> Where can I find out more about the network tracepoint?
>

take a look at

http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/html/SystemTap_Beginners_Guide/useful-systemtap-scripts.html#dropwatch
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 16:33 ` Eric Dumazet
@ 2010-06-02 16:49 ` Christoph Lameter
  2010-06-02 17:12 ` David Miller
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2010-06-02 16:49 UTC
  To: Eric Dumazet; +Cc: David Miller, netdev, shemminger

On Wed, 2 Jun 2010, Eric Dumazet wrote:

> take a look at
>
> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/html/SystemTap_Beginners_Guide/useful-systemtap-scripts.html#dropwatch

System tap? Oh no. Also skbs may be freed for legitimate reasons.

The point is that the loss detection needs to be usable for a regular
mortal. With the counters in /proc/net/snmp you have something that is
easy to handle.

The approach with systemtap will need lots of work both to get the tracing
environment set up (local competence in systemtap) and to get a developer
involved to figure out what the output means.

Then there is also the additional code overhead that you do not want by
default in the kernel. So we would need a different kernel for
diagnostics.
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 16:49 ` Christoph Lameter
@ 2010-06-02 17:12 ` David Miller
  2010-06-02 17:19 ` Eric Dumazet
  2010-06-02 17:31 ` David Miller
  0 siblings, 2 replies; 28+ messages in thread
From: David Miller @ 2010-06-02 17:12 UTC
  To: cl; +Cc: eric.dumazet, netdev, shemminger

From: Christoph Lameter <cl@linux-foundation.org>
Date: Wed, 2 Jun 2010 11:49:18 -0500 (CDT)

> On Wed, 2 Jun 2010, Eric Dumazet wrote:
>
>> take a look at
>>
>> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/html/SystemTap_Beginners_Guide/useful-systemtap-scripts.html#dropwatch
>
> System tap?

You don't need to use system tap, just the normal tracing stuff using
sysfs files suffices.
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 17:12 ` David Miller
@ 2010-06-02 17:19 ` Eric Dumazet
  2010-06-02 17:41 ` Neil Horman
  2010-06-02 17:31 ` David Miller
  1 sibling, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2010-06-02 17:19 UTC
  To: David Miller; +Cc: cl, netdev, shemminger, Neil Horman

On Wednesday, 02 June 2010 at 10:12 -0700, David Miller wrote:
> From: Christoph Lameter <cl@linux-foundation.org>
> Date: Wed, 2 Jun 2010 11:49:18 -0500 (CDT)
>
> > On Wed, 2 Jun 2010, Eric Dumazet wrote:
> >
> >> take a look at
> >>
> >> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/html/SystemTap_Beginners_Guide/useful-systemtap-scripts.html#dropwatch
> >
> > System tap?
>
> You don't need to use system tap, just the normal tracing stuff using
> sysfs files suffices.
>

It would be good if Neil could give us a man page or something ;)
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 17:19 ` Eric Dumazet
@ 2010-06-02 17:41 ` Neil Horman
  0 siblings, 0 replies; 28+ messages in thread
From: Neil Horman @ 2010-06-02 17:41 UTC
  To: Eric Dumazet; +Cc: David Miller, cl, netdev, shemminger

On Wed, Jun 02, 2010 at 07:19:10PM +0200, Eric Dumazet wrote:
> On Wednesday, 02 June 2010 at 10:12 -0700, David Miller wrote:
> > From: Christoph Lameter <cl@linux-foundation.org>
> > Date: Wed, 2 Jun 2010 11:49:18 -0500 (CDT)
> >
> > > On Wed, 2 Jun 2010, Eric Dumazet wrote:
> > >
> > >> take a look at
> > >>
> > >> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/html/SystemTap_Beginners_Guide/useful-systemtap-scripts.html#dropwatch
> > >
> > > System tap?
> >
> > You don't need to use system tap, just the normal tracing stuff using
> > sysfs files suffices.
> >
>
> It would be good if Neil could give us a man page or something ;)
>

That stap script was really meant to be a stopgap measure. As mentioned, you
can use the debugfs interface to turn tracepoints on and use them any way you
wish. Or, if you want to use the kfree_skb and napi_poll tracepoints in a more
formalized way, you can use the dropwatch user space utility:

https://fedorahosted.org/dropwatch/

Which includes a man page on usage :)

I also recently updated it so that this utility can query /proc/kallsyms to
translate program counter values into symbolic names and offsets for you. :)

Regards
Neil
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 17:12 ` David Miller
  2010-06-02 17:19 ` Eric Dumazet
@ 2010-06-02 17:31 ` David Miller
  2010-06-02 17:46 ` Eric Dumazet
  2010-06-03  3:50 ` [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful Bill Fink
  1 sibling, 2 replies; 28+ messages in thread
From: David Miller @ 2010-06-02 17:31 UTC
  To: cl; +Cc: eric.dumazet, netdev, shemminger

Just in case people are really so clueless as to be unable to figure
this out:

	echo 1 >/sys/kernel/debug/tracing/events/skb/kfree_skb/enable
	...do some stuff...
	cat /sys/kernel/debug/tracing/trace

You can even trace it using 'perf' by passing "skb:kfree_skb"
as the event specifier.
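The two shell commands above boil down to writing "1" into the tracepoint's enable file and then reading the trace buffer back. A minimal C sketch of the same sequence follows, for the case where a monitoring tool wants to toggle the tracepoint itself. The helper names `set_tracepoint` and `copy_file` are made up for this illustration; the two paths are the ones quoted in the message and assume debugfs is mounted at /sys/kernel/debug and that the caller has the privileges to write there.

```c
#include <assert.h>
#include <stdio.h>

/* Paths quoted in the message above; they assume debugfs is mounted
 * at /sys/kernel/debug. */
#define KFREE_SKB_ENABLE "/sys/kernel/debug/tracing/events/skb/kfree_skb/enable"
#define TRACE_BUFFER     "/sys/kernel/debug/tracing/trace"

/* Hypothetical helper: write "1" (on) or "0" (off) to an enable file.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
int set_tracepoint(const char *enable_path, int on)
{
	FILE *f;
	int failed;

	assert(enable_path != NULL);
	f = fopen(enable_path, "w");
	if (!f)
		return -1;
	failed = (fputs(on ? "1\n" : "0\n", f) == EOF);
	if (fclose(f) == EOF)
		failed = 1;
	return failed ? -1 : 0;
}

/* Hypothetical helper: copy a text file (for example the trace buffer)
 * to `out`. Returns the number of bytes copied, or -1 if the file
 * cannot be opened. */
long copy_file(const char *path, FILE *out)
{
	char buf[4096];
	size_t n;
	long total = 0;
	FILE *f;

	assert(path != NULL && out != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	while ((n = fread(buf, 1, sizeof(buf), f)) > 0) {
		fwrite(buf, 1, n, out);
		total += (long)n;
	}
	fclose(f);
	return total;
}
```

Calling `set_tracepoint(KFREE_SKB_ENABLE, 1)`, reproducing the drop, and then calling `copy_file(TRACE_BUFFER, stdout)` mirrors the echo/cat sequence; `set_tracepoint(KFREE_SKB_ENABLE, 0)` switches the tracepoint off again.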
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 17:31 ` David Miller
@ 2010-06-02 17:46 ` Eric Dumazet
  2010-06-02 18:01 ` Christoph Lameter
  2010-06-03  3:50 ` [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful Bill Fink
  1 sibling, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2010-06-02 17:46 UTC
  To: David Miller; +Cc: cl, netdev, shemminger

On Wednesday, 02 June 2010 at 10:31 -0700, David Miller wrote:
> Just in case people are really so clueless as to be unable to figure
> this out:
>
> 	echo 1 >/sys/kernel/debug/tracing/events/skb/kfree_skb/enable
> 	...do some stuff...
> 	cat /sys/kernel/debug/tracing/trace
>
> You can even trace it using 'perf' by passing "skb:kfree_skb"
> as the event specifier.

Thanks !

Here is the patch I cooked to account for RP_FILTER errors in multicast
path.

I will complete it to also do the unicast part before official
submission.

Christoph, the official counter would be IPSTATS_MIB_INNOROUTES

   ipSystemStatsInNoRoutes OBJECT-TYPE
       SYNTAX     Counter32
       MAX-ACCESS read-only
       STATUS     current
       DESCRIPTION
              "The number of input IP datagrams discarded because no route
               could be found to transmit them to their destination.


diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 4f0ed45..f207289 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -284,7 +284,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 	if (no_addr)
 		goto last_resort;
 	if (rpf == 1)
-		goto e_inval;
+		goto e_rpf;
 	fl.oif = dev->ifindex;

 	ret = 0;
@@ -299,7 +299,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,

 last_resort:
 	if (rpf)
-		goto e_inval;
+		goto e_rpf;
 	*spec_dst = inet_select_addr(dev, 0, RT_SCOPE_UNIVERSE);
 	*itag = 0;
 	return 0;
@@ -308,6 +308,8 @@ e_inval_res:
 	fib_res_put(&res);
 e_inval:
 	return -EINVAL;
+e_rpf:
+	return -ENETUNREACH;
 }

 static inline __be32 sk_extract_addr(struct sockaddr *addr)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8495bce..8e9e2f9 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1851,6 +1851,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	__be32 spec_dst;
 	struct in_device *in_dev = in_dev_get(dev);
 	u32 itag = 0;
+	int err;

 	/* Primary sanity checks. */

@@ -1865,10 +1866,12 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 		if (!ipv4_is_local_multicast(daddr))
 			goto e_inval;
 		spec_dst = inet_select_addr(dev, 0, RT_SCOPE_LINK);
-	} else if (fib_validate_source(saddr, 0, tos, 0,
-					dev, &spec_dst, &itag, 0) < 0)
-		goto e_inval;
-
+	} else {
+		err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst,
+					  &itag, 0);
+		if (err < 0)
+			goto e_err;
+	}
 	rth = dst_alloc(&ipv4_dst_ops);
 	if (!rth)
 		goto e_nobufs;
@@ -1922,6 +1925,9 @@ e_nobufs:
 e_inval:
 	in_dev_put(in_dev);
 	return -EINVAL;
+e_err:
+	in_dev_put(in_dev);
+	return err;
 }
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 17:46 ` Eric Dumazet
@ 2010-06-02 18:01 ` Christoph Lameter
  2010-06-02 18:41 ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2010-06-02 18:01 UTC
  To: Eric Dumazet; +Cc: David Miller, netdev, shemminger

On Wed, 2 Jun 2010, Eric Dumazet wrote:

> Here is the patch I cooked to account for RP_FILTER errors in multicast
> path.
>
> I will complete it to also do the unicast part before official
> submission.
>
> Christoph, the official counter would be IPSTATS_MIB_INNOROUTES

Great. Thanks.

>    ipSystemStatsInNoRoutes OBJECT-TYPE
>        SYNTAX     Counter32
>        MAX-ACCESS read-only
>        STATUS     current
>        DESCRIPTION
>               "The number of input IP datagrams discarded because no route
>                could be found to transmit them to their destination.

add "or because the rp_filter rejected the packet"? In the case of MC
traffic you don't really need a route.

In my particular case it is a weird corner case for the rp_filter.

Two NICs are on the same subnet. Different multicast groups are joined
on each (Using two NICs to balance the MC load since the drivers have
some multicast limitations and having different interrupt lines for each
NIC is also beneficial).

The rp_filter rejects all multicast traffic to the subscriptions on the
second NIC. I guess this is because the source address of the MC traffic
(on the same subnet) is also reachable via the first NIC.

So you could add also "because of breakage in the rp_filter (rp_filter
ignores the multicast subscription tables when determining the correct
reverse path of the packet)"
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 18:01 ` Christoph Lameter
@ 2010-06-02 18:41 ` Eric Dumazet
  2010-06-02 18:59 ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2010-06-02 18:41 UTC
  To: Christoph Lameter; +Cc: David Miller, netdev, shemminger

On Wednesday, 02 June 2010 at 13:01 -0500, Christoph Lameter wrote:
> On Wed, 2 Jun 2010, Eric Dumazet wrote:
>
> > Here is the patch I cooked to account for RP_FILTER errors in multicast
> > path.
> >
> > I will complete it to also do the unicast part before official
> > submission.
> >
> > Christoph, the official counter would be IPSTATS_MIB_INNOROUTES
>
> Great. Thanks.
>
> >    ipSystemStatsInNoRoutes OBJECT-TYPE
> >        SYNTAX     Counter32
> >        MAX-ACCESS read-only
> >        STATUS     current
> >        DESCRIPTION
> >               "The number of input IP datagrams discarded because no route
> >                could be found to transmit them to their destination.
>
> add "or because the rp_filter rejected the packet"? In the case of MC
> traffic you don't really need a route.
>

Unicast traffic doesn't need a reverse route, if you only receive packets.
rp_filter is an optional check, not covered by standard MIBS, so it's
borderline.

> In my particular case it is a weird corner case for the rp_filter.
>
> Two NICs are on the same subnet. Different multicast groups are joined
> on each (Using two NICs to balance the MC load since the drivers have
> some multicast limitations and having different interrupt lines for each
> NIC is also beneficial).
>

yeah, I know about this problem, and am working on it too...

> The rp_filter rejects all multicast traffic to the subscriptions on the
> second NIC. I guess this is because the source address of the MC traffic
> (on the same subnet) is also reachable via the first NIC.
>

It's clearly a case where rp_filter should be set to 2, don't you think?

> So you could add also "because of breakage in the rp_filter (rp_filter
> ignores the multicast subscription tables when determining the correct
> reverse path of the packet)"
>

In standard RFC ? I won't change it :)
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 18:41 ` Eric Dumazet
@ 2010-06-02 18:59 ` Christoph Lameter
  2010-06-02 19:25 ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2010-06-02 18:59 UTC
  To: Eric Dumazet; +Cc: David Miller, netdev, shemminger

On Wed, 2 Jun 2010, Eric Dumazet wrote:

> > In my particular case it is a weird corner case for the rp_filter.
> >
> > Two NICs are on the same subnet. Different multicast groups are joined
> > on each (Using two NICs to balance the MC load since the drivers have
> > some multicast limitations and having different interrupt lines for each
> > NIC is also beneficial).
> >
>
> yeah, I know about this problem, and am working on it too...
>
> > The rp_filter rejects all multicast traffic to the subscriptions on the
> > second NIC. I guess this is because the source address of the MC traffic
> > (on the same subnet) is also reachable via the first NIC.
> >
>
> It's clearly a case where rp_filter should be set to 2, don't you think?

The rp_filter is rejecting traffic coming into a NIC for which the kernel
has a multicast join list that indicates that this traffic is expected on
this NIC. You could consult the MC subscription list to verify that the
traffic is coming into the right NIC.

In the MC case the user can explicitly specify through which NIC the
traffic is expected. See IP_ADD_MEMBERSHIP.
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 18:59 ` Christoph Lameter
@ 2010-06-02 19:25 ` Eric Dumazet
  2010-06-02 20:11 ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2010-06-02 19:25 UTC
  To: Christoph Lameter; +Cc: David Miller, netdev, shemminger

On Wednesday, 02 June 2010 at 13:59 -0500, Christoph Lameter wrote:

> The rp_filter is rejecting traffic coming into a NIC for which the kernel
> has a multicast join list that indicates that this traffic is expected on
> this NIC. You could consult the MC subscription list to verify that the
> traffic is coming into the right NIC.
>
> In the MC case the user can explicitly specify through which NIC the
> traffic is expected. See IP_ADD_MEMBERSHIP.

This has little to do with MC. We certainly are not going to check MC
membership in fib_validate_source() !

Say we have eth0 on 192.168.0.1/24 and eth1 on 192.168.0.2/24

Then we cannot use rp_filter = 1, even with unicast traffic.

I really don't understand why you would set up rp_filter in such a
situation. This won't work.

Now, I agree we should have a counter somewhere to help admins to
understand their error ;)

Here is the patch I am currently testing.

I finally created a new counter, because it's a Linux-specific check.

diff --git a/include/linux/snmp.h b/include/linux/snmp.h
index 5279771..ebb0c80 100644
--- a/include/linux/snmp.h
+++ b/include/linux/snmp.h
@@ -229,6 +229,7 @@ enum
 	LINUX_MIB_TCPBACKLOGDROP,
 	LINUX_MIB_TCPMINTTLDROP, /* RFC 5082 */
 	LINUX_MIB_TCPDEFERACCEPTDROP,
+	LINUX_MIB_IPRPFILTER, /* IP Reverse Path Filter (rp_filter) */
 	__LINUX_MIB_MAX
 };

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 4f0ed45..e830f7a 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -284,7 +284,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 	if (no_addr)
 		goto last_resort;
 	if (rpf == 1)
-		goto e_inval;
+		goto e_rpf;
 	fl.oif = dev->ifindex;

 	ret = 0;
@@ -299,7 +299,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,

 last_resort:
 	if (rpf)
-		goto e_inval;
+		goto e_rpf;
 	*spec_dst = inet_select_addr(dev, 0, RT_SCOPE_UNIVERSE);
 	*itag = 0;
 	return 0;
@@ -308,6 +308,8 @@ e_inval_res:
 	fib_res_put(&res);
 e_inval:
 	return -EINVAL;
+e_rpf:
+	return -EXDEV;
 }

 static inline __be32 sk_extract_addr(struct sockaddr *addr)
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index d930dc5..d52c9da 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -340,6 +340,9 @@ static int ip_rcv_finish(struct sk_buff *skb)
 			else if (err == -ENETUNREACH)
 				IP_INC_STATS_BH(dev_net(skb->dev),
 						IPSTATS_MIB_INNOROUTES);
+			else if (err == -EXDEV)
+				NET_INC_STATS_BH(dev_net(skb->dev),
+						 LINUX_MIB_IPRPFILTER);
 			goto drop;
 		}
 	}
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 3dc9914..e320ca6 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -252,6 +252,7 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TCPBacklogDrop", LINUX_MIB_TCPBACKLOGDROP),
 	SNMP_MIB_ITEM("TCPMinTTLDrop", LINUX_MIB_TCPMINTTLDROP),
 	SNMP_MIB_ITEM("TCPDeferAcceptDrop", LINUX_MIB_TCPDEFERACCEPTDROP),
+	SNMP_MIB_ITEM("IPReversePathFilter", LINUX_MIB_IPRPFILTER),
 	SNMP_MIB_SENTINEL
 };

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8495bce..3a264f7 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1851,6 +1851,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	__be32 spec_dst;
 	struct in_device *in_dev = in_dev_get(dev);
 	u32 itag = 0;
+	int err;

 	/* Primary sanity checks. */

@@ -1865,10 +1866,12 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 		if (!ipv4_is_local_multicast(daddr))
 			goto e_inval;
 		spec_dst = inet_select_addr(dev, 0, RT_SCOPE_LINK);
-	} else if (fib_validate_source(saddr, 0, tos, 0,
-					dev, &spec_dst, &itag, 0) < 0)
-		goto e_inval;
-
+	} else {
+		err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst,
+					  &itag, 0);
+		if (err < 0)
+			goto e_err;
+	}
 	rth = dst_alloc(&ipv4_dst_ops);
 	if (!rth)
 		goto e_nobufs;
@@ -1922,6 +1925,9 @@ e_nobufs:
 e_inval:
 	in_dev_put(in_dev);
 	return -EINVAL;
+e_err:
+	in_dev_put(in_dev);
+	return err;
 }


@@ -1985,7 +1991,6 @@ static int __mkroute_input(struct sk_buff *skb,
 		ip_handle_martian_source(in_dev->dev, in_dev, skb, daddr,
 					 saddr);

-		err = -EINVAL;
 		goto cleanup;
 	}

@@ -2191,7 +2196,7 @@ brd_input:
 		err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst,
 					  &itag, skb->mark);
 		if (err < 0)
-			goto martian_source;
+			goto martian_source_keep_err;
 		if (err)
 			flags |= RTCF_DIRECTSRC;
 	}
@@ -2272,8 +2277,10 @@ e_nobufs:
 	goto done;

 martian_source:
+	err = -EINVAL;
+martian_source_keep_err:
 	ip_handle_martian_source(dev, in_dev, skb, daddr, saddr);
-	goto e_inval;
+	goto done;
 }

 int ip_route_input_common(struct sk_buff *skb, __be32 daddr, __be32 saddr,
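The two-NIC example in the message above (eth0 on 192.168.0.1/24, eth1 on 192.168.0.2/24) can be reproduced with a toy model of the reverse path check. The sketch below is a user-space illustration only, not the kernel's fib_validate_source(): the route table layout, the tie-breaking rule (the earliest entry wins among equally specific routes) and all `toy_*` names are assumptions made for this example. It returns -EXDEV on rejection, as the patch does, and treats mode 1 as strict and mode 2 as loose.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order from its four octets. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Toy route entry: destination prefix and the interface it points at. */
struct toy_route {
	uint32_t prefix;	/* already masked, host byte order */
	int prefix_len;		/* 0..32 */
	int ifindex;
};

static uint32_t prefix_mask(int len)
{
	return len == 0 ? 0 : ~(uint32_t)0 << (32 - len);
}

/* Longest-prefix match. On a tie the earliest entry wins, which is an
 * assumption of this sketch. Returns NULL if no route covers `addr`. */
const struct toy_route *toy_lookup(const struct toy_route *tbl, size_t n,
				   uint32_t addr)
{
	const struct toy_route *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if ((addr & prefix_mask(tbl[i].prefix_len)) != tbl[i].prefix)
			continue;
		if (!best || tbl[i].prefix_len > best->prefix_len)
			best = &tbl[i];
	}
	return best;
}

/* mode 0: check disabled; mode 1: strict (the route back to the source
 * must use the receiving interface); mode 2: loose (any route back to
 * the source is enough). Returns 0 to accept, -EXDEV to reject. */
int toy_rp_check(const struct toy_route *tbl, size_t n, uint32_t saddr,
		 int in_ifindex, int mode)
{
	const struct toy_route *rt;

	assert(mode >= 0 && mode <= 2);
	if (mode == 0)
		return 0;
	rt = toy_lookup(tbl, n, saddr);
	if (!rt)
		return -EXDEV;
	if (mode == 1 && rt->ifindex != in_ifindex)
		return -EXDEV;
	return 0;
}
```

With two connected routes for 192.168.0.0/24, one per interface, a packet from 192.168.0.50 arriving on the second interface is rejected in strict mode because the lookup resolves to the first interface, while loose mode accepts it. That is the behaviour described in the thread for rp_filter=1 versus rp_filter=2.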
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 19:25 ` Eric Dumazet
@ 2010-06-02 20:11 ` Christoph Lameter
  2010-06-02 22:05 ` [PATCH net-next-2.6] ipv4: add LINUX_MIB_IPRPFILTER snmp counter Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2010-06-02 20:11 UTC
To: Eric Dumazet; +Cc: David Miller, netdev, shemminger

On Wed, 2 Jun 2010, Eric Dumazet wrote:

> Say we have eth0 on 192.168.0.1/24 and eth1 on 192.168.0.2/24
>
> Then we cannot use rp_filter = 1, even with unicast traffic.
>
> I really don't understand why you would set up rp_filter in such a
> situation. This won't work.

rp_filter was set up in the past and it worked. Stephen fixed it in 2.6.31
for multicast and thus suddenly multicast stopped working on secondary
interfaces when we moved to 2.6.32.

rp_filter having to be off is okay but it does not feel correct.

> Now, I agree we should have a counter somewhere to help admins to
> understand their error ;)

Ah. Good.

> Here is patch I am currently testing.
>
> I finally created a new counter, because it's a Linux-specific check.

Looks good which does not say too much given my limited networking
knowledge.

Reviewed-by: Christoph Lameter <cl@linux-foundation.org>

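[Editorial sketch, not part of the thread.] The two-NICs-on-one-subnet case discussed above can be illustrated with a toy userspace model of a strict reverse-path check. Everything here is an assumption made for illustration: struct toy_route, toy_lookup_oif(), toy_strict_rpf_ok() and the interface indexes are hypothetical names, and the first-match lookup only loosely stands in for the kernel's FIB lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified route: "addr & mask == prefix" leaves via oif. */
struct toy_route {
	uint32_t prefix;	/* host byte order */
	uint32_t mask;
	int oif;		/* made-up interface index */
};

/* First matching entry wins, so with two equal /24 routes only the first
 * one is ever selected. Returns -1 when nothing matches. */
int toy_lookup_oif(const struct toy_route *tbl, size_t n, uint32_t addr)
{
	for (size_t i = 0; i < n; i++)
		if ((addr & tbl[i].mask) == tbl[i].prefix)
			return tbl[i].oif;
	return -1;
}

/* Strict reverse-path check: the route back to the source must leave
 * through the interface the packet arrived on. */
int toy_strict_rpf_ok(const struct toy_route *tbl, size_t n,
		      uint32_t saddr, int iif)
{
	return toy_lookup_oif(tbl, n, saddr) == iif;
}

/* Usage: eth0 (index 1) and eth1 (index 2) both sit on 192.168.0.0/24. */
void toy_rpf_example(void)
{
	const struct toy_route tbl[] = {
		{ 0xC0A80000u, 0xFFFFFF00u, 1 },	/* via eth0 */
		{ 0xC0A80000u, 0xFFFFFF00u, 2 },	/* via eth1, shadowed */
	};
	uint32_t peer = 0xC0A80032u;	/* 192.168.0.50 */

	assert(toy_strict_rpf_ok(tbl, 2, peer, 1));	/* arrives on eth0: accepted */
	assert(!toy_strict_rpf_ok(tbl, 2, peer, 2));	/* arrives on eth1: rejected */
}
```

In this model every packet from the shared subnet that arrives on the second interface fails the check, which is the kind of silent drop the new counter is meant to expose.
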
* [PATCH net-next-2.6] ipv4: add LINUX_MIB_IPRPFILTER snmp counter
  2010-06-02 20:11 ` Christoph Lameter
@ 2010-06-02 22:05 ` Eric Dumazet
  2010-06-03 10:19 ` David Miller
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2010-06-02 22:05 UTC
To: Christoph Lameter; +Cc: David Miller, netdev, shemminger

On Wednesday, 2 June 2010 at 15:11 -0500, Christoph Lameter wrote:
> On Wed, 2 Jun 2010, Eric Dumazet wrote:
>
> > Here is patch I am currently testing.
> >
> > I finally created a new counter, because it's a Linux-specific check.
>
> Looks good which does not say too much given my limited networking
> knowledge.
>
> Reviewed-by: Christoph Lameter <cl@linux-foundation.org>

I had one correction to do, here is the official submission.

I did unicast tests only.

Thanks !

[PATCH net-next-2.6] ipv4: add LINUX_MIB_IPRPFILTER snmp counter

Christoph Lameter mentioned that packets could be dropped in input path
because of rp_filter settings, without any SNMP counter being
incremented. System administrator can have a hard time to track the
problem.

This patch introduces a new counter, LINUX_MIB_IPRPFILTER, incremented
each time we drop a packet because Reverse Path Filter triggers.

(We receive an IPv4 datagram on a given interface, and find the route to
send an answer would use another interface)

netstat -s | grep IPReversePathFilter
    IPReversePathFilter: 21714

Reported-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/snmp.h    |    1 +
 net/ipv4/fib_frontend.c |    6 ++++--
 net/ipv4/ip_input.c     |    3 +++
 net/ipv4/proc.c         |    1 +
 net/ipv4/route.c        |   31 ++++++++++++++++++-------------
 5 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/include/linux/snmp.h b/include/linux/snmp.h
index 5279771..ebb0c80 100644
--- a/include/linux/snmp.h
+++ b/include/linux/snmp.h
@@ -229,6 +229,7 @@ enum
 	LINUX_MIB_TCPBACKLOGDROP,
 	LINUX_MIB_TCPMINTTLDROP, /* RFC 5082 */
 	LINUX_MIB_TCPDEFERACCEPTDROP,
+	LINUX_MIB_IPRPFILTER, /* IP Reverse Path Filter (rp_filter) */
 	__LINUX_MIB_MAX
 };
 
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 4f0ed45..e830f7a 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -284,7 +284,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 	if (no_addr)
 		goto last_resort;
 	if (rpf == 1)
-		goto e_inval;
+		goto e_rpf;
 	fl.oif = dev->ifindex;
 
 	ret = 0;
@@ -299,7 +299,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 
 last_resort:
 	if (rpf)
-		goto e_inval;
+		goto e_rpf;
 	*spec_dst = inet_select_addr(dev, 0, RT_SCOPE_UNIVERSE);
 	*itag = 0;
 	return 0;
@@ -308,6 +308,8 @@ e_inval_res:
 	fib_res_put(&res);
 e_inval:
 	return -EINVAL;
+e_rpf:
+	return -EXDEV;
 }
 
 static inline __be32 sk_extract_addr(struct sockaddr *addr)
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index d930dc5..d52c9da 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -340,6 +340,9 @@ static int ip_rcv_finish(struct sk_buff *skb)
 			else if (err == -ENETUNREACH)
 				IP_INC_STATS_BH(dev_net(skb->dev),
 						IPSTATS_MIB_INNOROUTES);
+			else if (err == -EXDEV)
+				NET_INC_STATS_BH(dev_net(skb->dev),
+						 LINUX_MIB_IPRPFILTER);
 			goto drop;
 		}
 	}
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 3dc9914..e320ca6 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -252,6 +252,7 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TCPBacklogDrop", LINUX_MIB_TCPBACKLOGDROP),
 	SNMP_MIB_ITEM("TCPMinTTLDrop", LINUX_MIB_TCPMINTTLDROP),
 	SNMP_MIB_ITEM("TCPDeferAcceptDrop", LINUX_MIB_TCPDEFERACCEPTDROP),
+	SNMP_MIB_ITEM("IPReversePathFilter", LINUX_MIB_IPRPFILTER),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8495bce..d377b45 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1851,6 +1851,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	__be32 spec_dst;
 	struct in_device *in_dev = in_dev_get(dev);
 	u32 itag = 0;
+	int err;
 
 	/* Primary sanity checks. */
 
@@ -1865,10 +1866,12 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 		if (!ipv4_is_local_multicast(daddr))
 			goto e_inval;
 		spec_dst = inet_select_addr(dev, 0, RT_SCOPE_LINK);
-	} else if (fib_validate_source(saddr, 0, tos, 0,
-					dev, &spec_dst, &itag, 0) < 0)
-		goto e_inval;
-
+	} else {
+		err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst,
+					  &itag, 0);
+		if (err < 0)
+			goto e_err;
+	}
 	rth = dst_alloc(&ipv4_dst_ops);
 	if (!rth)
 		goto e_nobufs;
@@ -1920,8 +1923,10 @@ e_nobufs:
 	return -ENOBUFS;
 
 e_inval:
+	err = -EINVAL;
+e_err:
 	in_dev_put(in_dev);
-	return -EINVAL;
+	return err;
 }
 
 
@@ -1985,7 +1990,6 @@ static int __mkroute_input(struct sk_buff *skb,
 		ip_handle_martian_source(in_dev->dev, in_dev, skb, daddr,
 					 saddr);
 
-		err = -EINVAL;
 		goto cleanup;
 	}
 
@@ -2157,13 +2161,12 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 		goto brd_input;
 
 	if (res.type == RTN_LOCAL) {
-		int result;
-		result = fib_validate_source(saddr, daddr, tos,
+		err = fib_validate_source(saddr, daddr, tos,
 					     net->loopback_dev->ifindex,
 					     dev, &spec_dst, &itag, skb->mark);
-		if (result < 0)
-			goto martian_source;
-		if (result)
+		if (err < 0)
+			goto martian_source_keep_err;
+		if (err)
 			flags |= RTCF_DIRECTSRC;
 		spec_dst = daddr;
 		goto local_input;
@@ -2191,7 +2194,7 @@ brd_input:
 		err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst,
 					  &itag, skb->mark);
 		if (err < 0)
-			goto martian_source;
+			goto martian_source_keep_err;
 		if (err)
 			flags |= RTCF_DIRECTSRC;
 	}
@@ -2272,8 +2275,10 @@ e_nobufs:
 	goto done;
 
 martian_source:
+	err = -EINVAL;
+martian_source_keep_err:
 	ip_handle_martian_source(dev, in_dev, skb, daddr, saddr);
-	goto e_inval;
+	goto done;
 }
 
 int ip_route_input_common(struct sk_buff *skb, __be32 daddr, __be32 saddr,

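[Editorial sketch, not part of the thread.] The accounting added to ip_rcv_finish() in the patch above can be modelled in plain userspace C to show why a distinct error code is needed: the caller can only count a reverse-path drop separately if the route lookup reports -EXDEV rather than the generic -EINVAL. struct toy_counters and toy_account_drop() are hypothetical stand-ins for the kernel's per-namespace SNMP counters, and only the two branches visible in the diff are modelled.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for two of the kernel's SNMP counters. */
struct toy_counters {
	unsigned long in_no_routes;	/* models IPSTATS_MIB_INNOROUTES */
	unsigned long ip_rp_filter;	/* models LINUX_MIB_IPRPFILTER */
};

/* Mirrors the branch order shown in the ip_input.c hunk. */
void toy_account_drop(struct toy_counters *c, int err)
{
	if (err == -ENETUNREACH)
		c->in_no_routes++;
	else if (err == -EXDEV)
		c->ip_rp_filter++;
	/* Any other error: the packet is still dropped in this model,
	 * but neither counter moves. */
}

/* Usage: one reverse-path drop, one generic drop, one "no route" drop. */
void toy_account_example(void)
{
	struct toy_counters c = { 0, 0 };

	toy_account_drop(&c, -EXDEV);
	toy_account_drop(&c, -EINVAL);
	toy_account_drop(&c, -ENETUNREACH);
	assert(c.ip_rp_filter == 1);
	assert(c.in_no_routes == 1);
}
```

This is why the route.c hunks stop overwriting err with -EINVAL on the fib_validate_source() failure paths: the original error value has to survive up to the caller.
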
* Re: [PATCH net-next-2.6] ipv4: add LINUX_MIB_IPRPFILTER snmp counter
  2010-06-02 22:05 ` [PATCH net-next-2.6] ipv4: add LINUX_MIB_IPRPFILTER snmp counter Eric Dumazet
@ 2010-06-03 10:19 ` David Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David Miller @ 2010-06-03 10:19 UTC
To: eric.dumazet; +Cc: cl, netdev, shemminger

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 03 Jun 2010 00:05:27 +0200

> [PATCH net-next-2.6] ipv4: add LINUX_MIB_IPRPFILTER snmp counter
>
> Christoph Lameter mentioned that packets could be dropped in input path
> because of rp_filter settings, without any SNMP counter being
> incremented. System administrator can have a hard time to track the
> problem.
>
> This patch introduces a new counter, LINUX_MIB_IPRPFILTER, incremented
> each time we drop a packet because Reverse Path Filter triggers.
>
> (We receive an IPv4 datagram on a given interface, and find the route to
> send an answer would use another interface)
>
> netstat -s | grep IPReversePathFilter
>     IPReversePathFilter: 21714
>
> Reported-by: Christoph Lameter <cl@linux-foundation.org>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.

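[Editorial sketch, not part of the thread.] Because the counter is registered in snmp4_net_list, it can also be read programmatically instead of through netstat -s. The sketch below assumes the usual paired-line layout of /proc/net/netstat (a line of field names followed by a line of values, both starting with the same prefix such as "TcpExt:") and assumes each line fits in 8191 bytes. netstat_find() is a hypothetical helper name, and the sample text in the usage function is made up apart from the value 21714 quoted in the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NETSTAT_LINE_MAX 8192	/* assumed upper bound on one line */

/* Scan paired "names" / "values" lines and fetch the value of one field.
 * Returns 0 and stores the value in *out when found, -1 otherwise. */
int netstat_find(FILE *fp, const char *prefix, const char *name,
		 unsigned long long *out)
{
	char hdr[NETSTAT_LINE_MAX], val[NETSTAT_LINE_MAX];
	size_t plen = strlen(prefix), nlen = strlen(name);

	while (fgets(hdr, sizeof(hdr), fp) && fgets(val, sizeof(val), fp)) {
		const char *h = hdr, *v = val;

		if (strncmp(hdr, prefix, plen) != 0 ||
		    strncmp(val, prefix, plen) != 0)
			continue;
		h += plen;
		v += plen;
		for (;;) {
			size_t hl, vl;

			/* Step to the next token on both lines in lockstep. */
			h += strspn(h, " \n");
			v += strspn(v, " \n");
			hl = strcspn(h, " \n");
			vl = strcspn(v, " \n");
			if (hl == 0 || vl == 0)
				break;
			if (hl == nlen && memcmp(h, name, nlen) == 0) {
				*out = strtoull(v, NULL, 10);
				return 0;
			}
			h += hl;
			v += vl;
		}
	}
	return -1;
}

/* Usage on made-up sample text written to a temporary file. */
void netstat_find_example(void)
{
	FILE *fp = tmpfile();
	unsigned long long n = 0;

	assert(fp != NULL);
	fputs("TcpExt: TCPDeferAcceptDrop IPReversePathFilter\n"
	      "TcpExt: 0 21714\n", fp);
	rewind(fp);
	assert(netstat_find(fp, "TcpExt:", "IPReversePathFilter", &n) == 0);
	assert(n == 21714ULL);
	fclose(fp);
}
```

On a real system the same function would presumably be called on fopen("/proc/net/netstat", "r"); sampling the value twice and comparing shows whether reverse-path drops are still occurring.
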
* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 17:31 ` David Miller
  2010-06-02 17:46 ` Eric Dumazet
@ 2010-06-03 3:50 ` Bill Fink
  2010-06-03 3:54 ` Eric Dumazet
  1 sibling, 1 reply; 28+ messages in thread
From: Bill Fink @ 2010-06-03 3:50 UTC
To: David Miller; +Cc: cl, eric.dumazet, netdev, shemminger

On Wed, 02 Jun 2010, David Miller wrote:

> Just in case people are really so clueless as to be unable to figure
> this out:
>
> echo 1 >/sys/kernel/debug/tracing/events/skb/kfree_skb/enable
> ...do some stuff...
> cat /sys/kernel/debug/tracing/trace
>
> You can even trace it using 'perf' by passing "skb:kfree_skb"
> as the event specifier.

Could someone remind me where to get documentation for obtaining
and using perf.

-Thanks

-Bill

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-03 3:50 ` [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful Bill Fink
@ 2010-06-03 3:54 ` Eric Dumazet
  2010-06-03 4:56 ` Bill Fink
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2010-06-03 3:54 UTC
To: Bill Fink; +Cc: David Miller, cl, netdev, shemminger

On Wednesday, 2 June 2010 at 23:50 -0400, Bill Fink wrote:
> On Wed, 02 Jun 2010, David Miller wrote:
>
> > Just in case people are really so clueless as to be unable to figure
> > this out:
> >
> > echo 1 >/sys/kernel/debug/tracing/events/skb/kfree_skb/enable
> > ...do some stuff...
> > cat /sys/kernel/debug/tracing/trace
> >
> > You can even trace it using 'perf' by passing "skb:kfree_skb"
> > as the event specifier.
>
> Could someone remind me where to get documentation for obtaining
> and using perf.
>

Its sources are included in the kernel tree:

cd tools/perf
make
./perf --help

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-03 3:54 ` Eric Dumazet
@ 2010-06-03 4:56 ` Bill Fink
  0 siblings, 0 replies; 28+ messages in thread
From: Bill Fink @ 2010-06-03 4:56 UTC
To: Eric Dumazet; +Cc: David Miller, cl, netdev, shemminger

On Thu, 03 Jun 2010, Eric Dumazet wrote:

> On Wednesday, 2 June 2010 at 23:50 -0400, Bill Fink wrote:
> > On Wed, 02 Jun 2010, David Miller wrote:
> >
> > > Just in case people are really so clueless as to be unable to figure
> > > this out:
> > >
> > > echo 1 >/sys/kernel/debug/tracing/events/skb/kfree_skb/enable
> > > ...do some stuff...
> > > cat /sys/kernel/debug/tracing/trace
> > >
> > > You can even trace it using 'perf' by passing "skb:kfree_skb"
> > > as the event specifier.
> >
> > Could someone remind me where to get documentation for obtaining
> > and using perf.
> >
>
> Its sources are included in the kernel tree:
>
> cd tools/perf
> make
> ./perf --help

Thanks. That seems easy enough. From some gitweb digging,
it appears perf first appeared in 2.6.31 kernels, so I'm going
to have to do a kernel upgrade from our current 2.6.30.10 kernels.

For others, I also ran across a perf wiki page at:

https://perf.wiki.kernel.org/index.php/Main_Page

-Bill

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 16:12 ` Christoph Lameter
  2010-06-02 16:19 ` David Miller
@ 2010-06-02 16:28 ` Eric Dumazet
  2010-06-02 16:35 ` Christoph Lameter
  1 sibling, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2010-06-02 16:28 UTC
To: Christoph Lameter; +Cc: netdev, Stephen Hemminger, David Miller

On Wednesday, 2 June 2010 at 11:12 -0500, Christoph Lameter wrote:
> On Wed, 2 Jun 2010, Eric Dumazet wrote:
>
> > On Wednesday, 2 June 2010 at 10:27 -0500, Christoph Lameter wrote:
> >
> > > Yes but they do not increment any counter. If packets are dropped because
> > > of the rp_filter setting interfering, for example, then the packets vanish
> > > without any accounting.
> > >
> >
> > But packets are vanishing either way.
>
> It's important to know why drops occur (any drops for that matter, drops
> mean retransmit which means latency). The symptom here was that
> multicast traffic on secondary interfaces was not being forwarded to the application.
> The rp_filter just dropped them and there was no way to easily track down
> the issue for the people experiencing the problem.
>

I just don't follow you.

Your patch has nothing to do with dropped packets because of rp_filter.

You inserted a counter increment in a path that is not taken at all by
packet delivery.

Maybe I missed something really obvious with your patch ?

It should only matter for the admin running the following command:

ip route get 1.2.3.4 from 192.168.0.1 iif eth0

That is a probe, not a 'packet delivery', this is why I said
incrementing an official MIB counter was probably wrong.

> In 2.6.31 the rp_filter was fixed to work properly with multicast and now
> it considers multicast traffic to secondary interfaces to have the wrong
> reverse path even though the multicast subscription occurred on the
> secondary interface. So it drops multicast traffic. The rp_filter has to
> be switched off when using multiple NICs for multicast load balancing.
>

Have you considered CONFIG_NET_DROP_MONITOR ?

This one catches all possible cases, a developer doesn't have to patch
his kernel to add SNMP counters everywhere...

config NET_DROP_MONITOR
	boolean "Network packet drop alerting service"
	depends on INET && EXPERIMENTAL && TRACEPOINTS
	---help---
	This feature provides an alerting service to userspace in the
	event that packets are discarded in the network stack. Alerts
	are broadcast via netlink socket to any listening user space
	process. If you don't need network drop alerts, or if you are ok
	just checking the various proc files and other utilities for
	drop statistics, say N here.

> > > LINUX_MIB_INROUTEERRORS? Does it mean I can create a series of new
> > > counters that allow us to diagnose and distinguish all the different
> > > causes of packet loss? We would love to have that.
> > >
> >
> > For an example, you could take a look at commit 907cdda5205b
> > (tcp: Add SNMP counter for DEFER_ACCEPT)
>
> Well these are all TCP counters. I would add IP and UDP counters?

Why not ? But as David said, it should be motivated by real use case ;)

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
  2010-06-02 16:28 ` Eric Dumazet
@ 2010-06-02 16:35 ` Christoph Lameter
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2010-06-02 16:35 UTC
To: Eric Dumazet; +Cc: netdev, Stephen Hemminger, David Miller

On Wed, 2 Jun 2010, Eric Dumazet wrote:

> Your patch has nothing to do with dropped packets because of rp_filter.
>
> You inserted a counter increment in a path that is not taken at all by
> packet delivery.
>
> Maybe I missed something really obvious with your patch ?

rp_filter rejects a route and packets are dropped then.

> Have you considered CONFIG_NET_DROP_MONITOR ?
> This one catches all possible cases, a developer doesn't have to patch
> his kernel to add SNMP counters everywhere...

Just looking at it. Great. A hook into skb_free. That will do.
