netdev.vger.kernel.org archive mirror
From: Chris Clayton <chris2553@googlemail.com>
To: David Miller <davem@davemloft.net>
Cc: eric.dumazet@gmail.com, netdev@vger.kernel.org, gpiez@web.de
Subject: Re: Possible networking regression in 3.6.0
Date: Fri, 28 Sep 2012 10:14:56 +0100
Message-ID: <50656A90.5030503@googlemail.com>
In-Reply-To: <20120928.025351.156118608293844465.davem@davemloft.net>



On 09/28/12 07:53, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 27 Sep 2012 23:17:04 +0200
>
>> Yes it seems the problem. On the host I tried :
>>
>> # ip ro get 8.8.8.8 from 192.168.200.1 iif tap1
>> 8.8.8.8 from 192.168.200.1 via 172.30.42.1 dev eth0
>>      cache  iif *
>>
>> So if the guest tries to send a frame to 8.8.8.8 we are going to forward
>> the packet to eth0
>>
>> But if the guest tries to send to 255.255.255.255, we try to deliver the
>> packet to the host itself, instead of broadcasting to eth0
>>
>> # ip ro get 255.255.255.255 from 192.168.200.1 iif tap1
>> broadcast 255.255.255.255 from 192.168.200.1 dev lo
>>      cache <local,brd>  iif *
>>
>> David, maybe you'll have an idea ?
>
> Perhaps this was introduced by:

Thanks, David.

Unfortunately, reverting that patch does not fix the problem. The pings
from the KVM client to the router still time out.

I have bisected this (see
http://marc.info/?l=linux-netdev&m=134797809611847&w=2) and the result was:

$ git bisect bad
d2d68ba9fe8b38eb03124b3176a013bb8aa2b5e5 is the first bad commit
commit d2d68ba9fe8b38eb03124b3176a013bb8aa2b5e5
Author: David S. Miller <davem@davemloft.net>
Date:   Tue Jul 17 12:58:50 2012 -0700

     ipv4: Cache input routes in fib_info nexthops.

     Caching input routes is slightly simpler than output routes, since we
     don't need to be concerned with nexthop exceptions.  (locally
     destined, and routed packets, never trigger PMTU events or redirects
     that will be processed by us).

     However, we have to elide caching for the DIRECTSRC and non-zero itag
     cases.

     Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 6bbc75c1cbe62bf84ea412d3b98adf2b614779cd 3ad7256b4a71e63ca4530977c0550121ea803d35 M      include
:040000 040000 18c2a950a53c4eec9bfa12185d1e382dfed74af8 a2ab6157d6cd54930da395758c6ded3a225d1f04 M      net

Unfortunately, the related patches don't revert cleanly, but a kernel
built from a git checkout of the parent commit
(f2bb4bedf35d5167a073dcdddf16543f351ef3ae) works fine.
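
For anyone repeating the parent-commit check, here is a minimal sketch of how the parent of a first bad commit can be resolved. parent_of is a hypothetical helper name, not a git command, and the sketch assumes a git new enough to support the -C option:

```shell
#!/bin/sh
# parent_of is a hypothetical helper (not part of git).
# It prints the full hash of the first parent of a commit.
#   $1 = path to the repository
#   $2 = commit-ish, e.g. the hash reported by "git bisect bad"
parent_of() {
    # "<rev>^" names the first parent of <rev>; --verify makes
    # rev-parse fail with a non-zero status if it cannot be resolved.
    git -C "$1" rev-parse --verify "$2^"
}
```

In a kernel tree, something like `git checkout "$(parent_of . d2d68ba9fe8b38eb03124b3176a013bb8aa2b5e5)"` should land on the parent commit named above, which can then be built and tested as usual.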

>
> commit 7bd86cc282a458b66c41e3f6676de6656c99b8db
> Author: Yan, Zheng <zheng.z.yan@intel.com>
> Date:   Sun Aug 12 20:09:59 2012 +0000
>
>      ipv4: Cache local output routes
>
>      Commit caacf05e5ad1abf causes big drop of UDP loop back performance.
>      The cause of the regression is that we do not cache the local output
>      routes. Each time we send a datagram from unconnected UDP socket,
>      the kernel allocates a dst_entry and adds it to the rt_uncached_list.
>      It creates lock contention on the rt_uncached_lock.
>
>      Reported-by: Alex Shi <alex.shi@intel.com>
>      Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
>      Signed-off-by: David S. Miller <davem@davemloft.net>
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index e4ba974..fd9ecb5 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2028,7 +2028,6 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *fl4)
>   		}
>   		dev_out = net->loopback_dev;
>   		fl4->flowi4_oif = dev_out->ifindex;
> -		res.fi = NULL;
>   		flags |= RTCF_LOCAL;
>   		goto make_route;
>   	}
>
