From: Chris Clayton <chris2553@googlemail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
netdev@vger.kernel.org, gpiez@web.de,
Dave Jones <davej@redhat.com>, Julian Anastasov <ja@ssi.bg>
Subject: Re: [PATCH] ipv4: add a fib_type to fib_info
Date: Thu, 04 Oct 2012 14:08:31 +0100 [thread overview]
Message-ID: <506D8A4F.3020008@googlemail.com> (raw)
In-Reply-To: <1349349926.16011.48.camel@edumazet-glaptop>
On 10/04/12 12:25, Eric Dumazet wrote:
> On Tue, 2012-10-02 at 17:48 +0200, Eric Dumazet wrote:
>> From: Eric Dumazet <edumazet@google.com>
>>
>> On Tue, 2012-10-02 at 17:35 +0200, Eric Dumazet wrote:
>>> On Mon, 2012-10-01 at 22:04 +0200, Eric Dumazet wrote:
>>>> On Mon, 2012-10-01 at 16:01 -0400, David Miller wrote:
>>>
>>>>> If you can find a way to more reliably trigger the case, that would
>>>>> help us immensely.
>>>>
>>>> I am building a KMEMCHECK kernel, as a last try before my night ;)
>>>
>>> This was a total disaster. KMEMCHECK dies horribly on my machine
>>>
>>> David, shouldnt we use a nh_rth_forward instead of a nh_rth_input in
>>> __mkroute_input() ?
>>>
>>> (And change rt_cache_route() as well ?)
>>>
>>> I am testing a patch right now.
>>
>
> OK so I implemented David idea and it seems to work.
>
> Testers are needed, thanks ! ;)
>
I've tested 3.6.0 with this patch applied and networking in a WinXP KVM
client is now working fine. The patch applies cleanly to 3.6.0, so I
assume the patch will be forwarded to stable in due course.
Tested-by: Chris Clayton <chris2553@googlemail.com>
> [PATCH] ipv4: add a fib_type to fib_info
>
> commit d2d68ba9fe8 (ipv4: Cache input routes in fib_info nexthops.)
> introduced a regression for forwarding.
>
> This was hard to reproduce but the symptom was that packets were
> delivered to local host instead of being forwarded.
>
> David suggested to add fib_type to fib_info so that we dont
> inadvertently share same fib_info for different purposes.
>
> With help from Julian Anastasov who provided very helpful
> hints, reproduced here :
>
> <quote>
> Can it be a problem related to fib_info reuse
> from different routes. For example, when local IP address
> is created for subnet we have:
>
> broadcast 192.168.0.255 dev DEV proto kernel scope link src
> 192.168.0.1
> 192.168.0.0/24 dev DEV proto kernel scope link src 192.168.0.1
> local 192.168.0.1 dev DEV proto kernel scope host src 192.168.0.1
>
> The "dev DEV proto kernel scope link src 192.168.0.1" is
> a reused fib_info structure where we put cached routes.
> The result can be same fib_info for 192.168.0.255 and
> 192.168.0.0/24. RTN_BROADCAST is cached only for input
> routes. Incoming broadcast to 192.168.0.255 can be cached
> and can cause problems for traffic forwarded to 192.168.0.0/24.
> So, this patch should solve the problem because it
> separates the broadcast from unicast traffic.
>
> And the ip_route_input_slow caching will work for
> local and broadcast input routes (above routes 1 and 3) just
> because they differ in scope and use different fib_info.
>
> </quote>
>
> Many thanks to Chris Clayton for his patience and help.
>
> Reported-by: Chris Clayton <chris2553@googlemail.com>
> Bisected-by: Chris Clayton <chris2553@googlemail.com>
> Reported-by: Dave Jones <davej@redhat.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Julian Anastasov <ja@ssi.bg>
> ---
> include/net/ip_fib.h | 1 +
> net/ipv4/fib_semantics.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
> index 926142e..9497be1 100644
> --- a/include/net/ip_fib.h
> +++ b/include/net/ip_fib.h
> @@ -102,6 +102,7 @@ struct fib_info {
> unsigned char fib_dead;
> unsigned char fib_protocol;
> unsigned char fib_scope;
> + unsigned char fib_type;
> __be32 fib_prefsrc;
> u32 fib_priority;
> u32 *fib_metrics;
> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index 3509065..2677530 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -314,6 +314,7 @@ static struct fib_info *fib_find_info(const struct fib_info *nfi)
> nfi->fib_scope == fi->fib_scope &&
> nfi->fib_prefsrc == fi->fib_prefsrc &&
> nfi->fib_priority == fi->fib_priority &&
> + nfi->fib_type == fi->fib_type &&
> memcmp(nfi->fib_metrics, fi->fib_metrics,
> sizeof(u32) * RTAX_MAX) == 0 &&
> ((nfi->fib_flags ^ fi->fib_flags) & ~RTNH_F_DEAD) == 0 &&
> @@ -833,6 +834,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
> fi->fib_flags = cfg->fc_flags;
> fi->fib_priority = cfg->fc_priority;
> fi->fib_prefsrc = cfg->fc_prefsrc;
> + fi->fib_type = cfg->fc_type;
>
> fi->fib_nhs = nhs;
> change_nexthops(fi) {
>
>
next prev parent reply other threads:[~2012-10-04 13:08 UTC|newest]
Thread overview: 59+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-09-17 15:44 Possible networking regression in 3.6.0 Chris Clayton
2012-09-18 14:21 ` Chris Clayton
2012-09-18 14:31 ` Chris Clayton
2012-09-18 14:40 ` Eric Dumazet
2012-09-18 15:51 ` Chris Clayton
2012-09-19 15:26 ` Chris Clayton
2012-09-22 6:26 ` Chris Clayton
2012-09-27 11:50 ` Chris Clayton
2012-09-27 12:14 ` Eric Dumazet
2012-09-27 18:05 ` Chris Clayton
2012-09-27 21:03 ` Eric Dumazet
2012-09-27 21:17 ` Eric Dumazet
2012-09-28 6:53 ` David Miller
2012-09-28 9:14 ` Chris Clayton
2012-09-28 9:22 ` Chris Clayton
2012-09-28 11:26 ` Eric Dumazet
2012-09-28 14:28 ` Chris Clayton
2012-09-30 15:26 ` Chris Clayton
2012-09-30 19:45 ` Eric Dumazet
2012-10-01 8:36 ` Chris Clayton
2012-10-01 9:15 ` Eric Dumazet
2012-10-01 15:13 ` Chris Clayton
2012-10-01 15:31 ` Eric Dumazet
2012-10-01 16:19 ` Chris Clayton
2012-10-01 16:37 ` Eric Dumazet
2012-10-01 18:28 ` Chris Clayton
2012-10-01 18:34 ` Captain Obvious
2012-10-01 19:21 ` Eric Dumazet
2012-10-01 19:55 ` Chris Clayton
2012-10-01 19:22 ` Chris Clayton
2012-10-01 19:34 ` Dave Jones
2012-10-01 20:01 ` David Miller
2012-10-01 20:04 ` Eric Dumazet
2012-10-02 15:27 ` Edivaldo de Araújo Pereira
2012-10-02 15:35 ` Eric Dumazet
2012-10-02 15:48 ` Eric Dumazet
2012-10-02 15:57 ` Dave Jones
2012-10-02 16:06 ` Eric Dumazet
2012-10-02 18:25 ` David Miller
2012-10-02 21:14 ` Alexander Duyck
2012-10-02 21:35 ` Eric Dumazet
2012-10-02 23:24 ` Julian Anastasov
2012-10-03 3:10 ` David Miller
2012-10-03 15:01 ` Chris Clayton
2012-10-03 20:57 ` Julian Anastasov
2012-10-03 7:28 ` [PATCH] udp: increment UDP_MIB_NOPORTS in mcast receive Eric Dumazet
2012-10-03 12:45 ` David Stevens
2012-10-03 13:15 ` Eric Dumazet
2012-10-03 14:09 ` David Stevens
2012-10-03 15:29 ` Eric Dumazet
2012-10-03 17:31 ` David Stevens
2012-10-03 19:30 ` David Miller
2012-10-03 17:39 ` Rick Jones
2012-10-03 2:55 ` Possible networking regression in 3.6.0 David Miller
2012-10-04 11:25 ` [PATCH] ipv4: add a fib_type to fib_info Eric Dumazet
2012-10-04 13:08 ` Chris Clayton [this message]
2012-10-04 13:32 ` Eric Dumazet
2012-10-04 18:14 ` David Miller
2012-09-18 14:44 ` Possible networking regression in 3.6.0 Chris Clayton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=506D8A4F.3020008@googlemail.com \
--to=chris2553@googlemail.com \
--cc=davej@redhat.com \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=gpiez@web.de \
--cc=ja@ssi.bg \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.