* RPS and forwarding
From: Herbert Xu @ 2010-04-26 2:24 UTC
To: David S. Miller, netdev
Hi:
I'm sorry I didn't have time to jump into the RPS discussions
earlier, so in a way I'm just getting what I deserved :)
Anyway, I am specifically concerned about the possibility of
reordering of forwarded traffic.
As RPS is doing fuzzy matching, it is possible (and quite likely
if rps_sock_flow_table is small) for a forwarded flow to be hashed
to the same index as a local flow.
In that case we may end up redirecting a forwarded flow. That
in itself is undesirable because for forwarded flows the best
solution is to stay on the ingress CPU.
What's worse is that if the local flow bounces around different
CPUs, the forwarded flow will follow it.
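A minimal sketch of the collision described above. This is a toy model
in Python, not kernel code: the table size, the hash values and the
helper names are all made up for illustration. The only thing taken
from the discussion is that the table is indexed by a truncated hash,
so two unrelated flows can share a slot.

```python
# Toy model of a flow table indexed by (hash & mask). Not kernel code.
TABLE_SIZE = 8            # assumed: small, power of two
MASK = TABLE_SIZE - 1
NO_CPU = None

sock_flow_table = [NO_CPU] * TABLE_SIZE

def record_local_flow(rxhash, cpu):
    """Remember which CPU the application last read this flow on."""
    sock_flow_table[rxhash & MASK] = cpu

def steer(rxhash, ingress_cpu):
    """Pick the CPU for an incoming packet; default to the ingress CPU."""
    cpu = sock_flow_table[rxhash & MASK]
    return ingress_cpu if cpu is NO_CPU else cpu

local_hash = 0x1234ABC5      # hypothetical hash of a locally terminated flow
forwarded_hash = 0x9F00000D  # hypothetical hash of a forwarded flow

# Both hashes have the same low three bits (0b101), so they share slot 5.
record_local_flow(local_hash, cpu=2)
first = steer(forwarded_hash, ingress_cpu=0)

# The application migrates to CPU 3 and the forwarded flow follows it.
record_local_flow(local_hash, cpu=3)
second = steer(forwarded_hash, ingress_cpu=0)

print(first, second)
```

The forwarded flow never touched a local socket, yet it is steered to
whichever CPU the colliding local flow last ran on.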
For a local flow RPS can guarantee original ordering (assuming
we're not doing anything weird like netfilter queueing), but
this doesn't work for forwarded flows.
Even if netif_receive_skb has completed, the forwarded packet
may still be sitting in a hardware TX queue, selected based
on the processing CPU. If you then bounce the forwarded flow,
subsequent packets may be placed in a different hardware TX queue,
causing reordering.
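The reordering can be sketched with two FIFO queues. This is a toy
model under stated assumptions: one TX queue per CPU, and a NIC that
happens to service the second queue before the first one has drained.
Real queue scheduling depends on the hardware and driver.

```python
from collections import deque

# Assumed: one hardware TX queue per CPU, selected by the processing CPU.
tx_queues = {0: deque(), 1: deque()}

def forward(seq, processing_cpu):
    """Queue a forwarded packet on the TX queue of the CPU that handled it."""
    tx_queues[processing_cpu].append(seq)

# Packets 1-3 of one forwarded flow are processed on CPU 0.
for seq in (1, 2, 3):
    forward(seq, processing_cpu=0)

# The flow is then bounced to CPU 1 while 1-3 still sit in TX queue 0.
for seq in (4, 5):
    forward(seq, processing_cpu=1)

def drain(order):
    """Transmit everything, visiting the queues in the given order."""
    wire = []
    for q in order:
        while tx_queues[q]:
            wire.append(tx_queues[q].popleft())
    return wire

# If queue 1 is serviced first, packets 4 and 5 overtake 1-3 on the wire.
wire = drain(order=(1, 0))
print(wire)
```

Each queue is strictly FIFO, so ordering is only lost because the flow
was split across two queues.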
BTW, selecting hardware TX queues for forwarded flows based
on the rxhash is not a good solution, as that causes cache-line
bouncing between CPUs.
Apart from not using RPS on routers, I suppose people doing
forwarding will simply have to maintain a constant RPS table,
and forgo its local redirection capabilities.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: RPS and forwarding
From: Tom Herbert @ 2010-04-26 4:44 UTC
To: Herbert Xu; +Cc: David S. Miller, netdev
> Apart from not using RPS on routers, I suppose people doing
> forwarding will simply have to maintain a constant RPS table,
> and forgo its local redirection capabilities.
>
I think you'd want RPS on a router (steering packets using hash and CPU
mask). Most of your concerns are about RFS (steering by flows), which
is probably more appropriate for a server.
Tom
* Re: RPS and forwarding
From: Herbert Xu @ 2010-04-26 4:53 UTC
To: Tom Herbert; +Cc: David S. Miller, netdev
On Sun, Apr 25, 2010 at 09:44:05PM -0700, Tom Herbert wrote:
>
> I think you'd want RPS on a router (steering packets using hash and CPU
> mask). Most of your concerns are about RFS (steering by flows), which
> is probably more appropriate for a server.
Right. The problem is that if you run a distribution kernel on
a router with CONFIG_RPS (and hence RFS) enabled, you may suddenly
start seeing packet reordering on forwarded traffic due to the
presence of local traffic.
Can we perhaps add a run-time toggle to disable RFS?
Cheers,
* Re: RPS and forwarding
From: Tom Herbert @ 2010-04-26 5:00 UTC
To: Herbert Xu; +Cc: David S. Miller, netdev
> Right. The problem is that if you run a distribution kernel on
> a router with CONFIG_RPS (and hence RFS) enabled, you may suddenly
> start seeing packet reordering on forwarded traffic due to the
> presence of local traffic.
>
> Can we perhaps add a run-time toggle to disable RFS?
>
RFS is not on at run-time by default. The number of entries in the
global rps_sock_flow_table needs to be set via sysctl, and the number of
entries in rps_dev_flow_table needs to be set per RX queue in sysfs.
You can turn it on for some devices but not others, if that helps.
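As a rough sketch of what that configuration looks like, the helper
below only builds the list of files and values and writes nothing. The
file names (rps_sock_flow_entries under /proc/sys/net/core and
rps_flow_cnt under each rx queue in sysfs) are the ones used by mainline
kernels and are not spelled out in this thread. The device name, the
queue count and the even split across queues are assumptions chosen for
the example.

```python
def rfs_settings(dev, num_rx_queues, sock_flow_entries):
    """Return {path: value} needed to enable RFS on one device.

    Nothing is written; the caller decides how to apply the values.
    A value of 0 everywhere corresponds to RFS being off.
    """
    settings = {
        "/proc/sys/net/core/rps_sock_flow_entries": str(sock_flow_entries),
    }
    # Assumed policy: split the global table size evenly across RX queues.
    per_queue = sock_flow_entries // num_rx_queues
    for n in range(num_rx_queues):
        path = f"/sys/class/net/{dev}/queues/rx-{n}/rps_flow_cnt"
        settings[path] = str(per_queue)
    return settings

# Hypothetical host: enable RFS on eth0 only and leave other devices alone.
settings = rfs_settings("eth0", num_rx_queues=2, sock_flow_entries=32768)
for path, value in settings.items():
    print(f"{value} > {path}")
```

Because the per-queue count is set per device, a box that forwards on
one interface can leave that interface's queues at 0.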
* Re: RPS and forwarding
From: Eric Dumazet @ 2010-04-26 5:15 UTC
To: Tom Herbert; +Cc: Herbert Xu, David S. Miller, netdev
On Sunday, 25 April 2010 at 21:44 -0700, Tom Herbert wrote:
> > Apart from not using RPS on routers, I suppose people doing
> > forwarding will simply have to maintain a constant RPS table,
> > and forgo its local redirection capabilities.
> >
>
> I think you'd want RPS on a router (steering packets using hash and CPU
> mask). Most of your concerns are about RFS (steering by flows), which
> is probably more appropriate for a server.
We could probably rename the RFS parts (currently named rps_*) to rfs_*
to avoid confusion?
* Re: RPS and forwarding
From: David Miller @ 2010-04-26 6:28 UTC
To: therbert; +Cc: herbert, netdev
From: Tom Herbert <therbert@google.com>
Date: Sun, 25 Apr 2010 22:00:09 -0700
>> Right. The problem is that if you run a distribution kernel on
>> a router with CONFIG_RPS (and hence RFS) enabled, you may suddenly
>> start seeing packet reordering on forwarded traffic due to the
>> presence of local traffic.
>>
>> Can we perhaps add a run-time toggle to disable RFS?
>>
> RFS is not on at run-time by default. The number of entries in the
> global rps_sock_flow_table needs to be set via sysctl, and the number of
> entries in rps_dev_flow_table needs to be set per RX queue in sysfs.
> You can turn it on for some devices but not others, if that helps.
Right, none of this stuff is on by default.
It might be possible to somehow make the sock table get bypassed for
forwarded traffic, but I can't think of a cheap way to do that at
the moment.
* Re: RPS and forwarding
From: Herbert Xu @ 2010-04-26 6:31 UTC
To: David Miller; +Cc: therbert, netdev
On Sun, Apr 25, 2010 at 11:28:57PM -0700, David Miller wrote:
>
> Right, none of this stuff is on by default.
Great, that puts my mind at ease :)
> It might be possible to somehow make the sock table get bypassed for
> forwarded traffic, but I can't think of a cheap way to do that at
> the moment.
Yeah I thought about that too but as we need to go through the
routing table before we know whether something is forwarded or
not, it certainly seems non-trivial.
Hmm, maybe if we used a routing cache keyed by rxhash we could
make an approximation? After all, we only need to make sure that
no forwarded traffic is redirected, and not the reverse. IOW,
it's OK if we incorrectly classify some local traffic as forwarded.
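One way to picture such a one-sided approximation is a bitmap keyed by
rxhash. This is only a sketch of the idea, not a proposed patch: the
bitmap size and the function names are invented, and it ignores aging
of stale bits and the fact that the first packet of a flow is seen
before any route lookup has marked it.

```python
BITMAP_BITS = 1024                      # assumed size, power of two
forwarded_seen = bytearray(BITMAP_BITS // 8)

def _slot(rxhash):
    """Map an rxhash to (byte index, bit mask) in the bitmap."""
    bit = rxhash & (BITMAP_BITS - 1)
    return bit >> 3, 1 << (bit & 7)

def mark_forwarded(rxhash):
    """Record that the route lookup classified this hash as forwarded."""
    byte, mask = _slot(rxhash)
    forwarded_seen[byte] |= mask

def may_steer(rxhash):
    """False if any forwarded flow has been seen in this hash bucket."""
    byte, mask = _slot(rxhash)
    return not (forwarded_seen[byte] & mask)

mark_forwarded(0xABCD0123)              # hypothetical forwarded flow

forwarded_ok = may_steer(0xABCD0123)    # the forwarded flow itself
colliding_local = may_steer(0x00000123) # local flow in the same bucket
other_local = may_steer(0x00000124)     # local flow in another bucket
print(forwarded_ok, colliding_local, other_local)
```

The error is one-sided in the way described above: a colliding local
flow loses steering, but a marked forwarded flow is never redirected.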
Cheers,
* Re: RPS and forwarding
From: Eric Dumazet @ 2010-04-26 7:25 UTC
To: Herbert Xu; +Cc: David Miller, therbert, netdev
On Monday, 26 April 2010 at 14:31 +0800, Herbert Xu wrote:
> On Sun, Apr 25, 2010 at 11:28:57PM -0700, David Miller wrote:
>
> > It might be possible to somehow make the sock table get bypassed for
> > forwarded traffic, but I can't think of a cheap way to do that at
> > the moment.
>
> Yeah I thought about that too but as we need to go through the
> routing table before we know whether something is forwarded or
> not, it certainly seems non-trivial.
>
> Hmm, maybe if we used a routing cache keyed by rxhash we could
> make an approximation? After all, we only need to make sure that
> no forwarded traffic is redirected, and not the reverse. IOW,
> it's OK if we incorrectly classify some local traffic as forwarded.
>
RFS already works like that, if RPS is not used in conjunction.
Forwarded traffic will see an untagged RFS flow entry, so the skb will
be processed by this CPU, not a remote one.
Currently, only an application can set an RFS flow entry.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RPS and forwarding
2010-04-26 7:25 ` Eric Dumazet
@ 2010-04-26 7:30 ` Herbert Xu
2010-04-26 7:38 ` Eric Dumazet
0 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2010-04-26 7:30 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, therbert, netdev
On Mon, Apr 26, 2010 at 09:25:16AM +0200, Eric Dumazet wrote:
>
> RFS already works like that, if RPS is not used in conjunction.
>
> Forwarded traffic will see an untagged RFS flow entry, so the skb will
> be processed by this CPU, not a remote one.
>
> Currently, only an application can set an RFS flow entry.
I was referring to the case when a forwarded flow gets hashed to
the same index as a local flow.
Cheers,
* Re: RPS and forwarding
From: Eric Dumazet @ 2010-04-26 7:38 UTC
To: Herbert Xu; +Cc: David Miller, therbert, netdev
On Monday, 26 April 2010 at 15:30 +0800, Herbert Xu wrote:
> On Mon, Apr 26, 2010 at 09:25:16AM +0200, Eric Dumazet wrote:
> >
> > RFS already works like that, if RPS is not used in conjunction.
> >
> > Forwarded traffic will see an untagged RFS flow entry, so the skb will
> > be processed by this CPU, not a remote one.
> >
> > Currently, only an application can set an RFS flow entry.
>
> I was referring to the case when a forwarded flow gets hashed to
> the same index as a local flow.
Ah OK, with the same rxhash I guess?
(We could store rxhash next to cpu in struct rps_dev_flow, or a 16-bit
part of it in the existing padding.)
* Re: RPS and forwarding
From: Herbert Xu @ 2010-04-26 7:41 UTC
To: Eric Dumazet; +Cc: David Miller, therbert, netdev
On Mon, Apr 26, 2010 at 09:38:54AM +0200, Eric Dumazet wrote:
>
> Ah OK, with the same rxhash I guess?
It's worse than that I think, because the actual table contains
fewer than 4G entries. So a collision needs either the same rxhash or
just the same index in the table.
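The two kinds of collision can be told apart in a small model. The
sketch below stores a 16-bit tag next to the CPU, as suggested earlier
in the thread. Using the upper 16 bits of the rxhash as the tag is an
assumption made here, as are the table size and all names. This is not
the layout of struct rps_dev_flow.

```python
TABLE_SIZE = 16                 # assumed: far fewer than 2**32 entries
MASK = TABLE_SIZE - 1

class FlowEntry:
    """Toy flow entry: target CPU plus a 16-bit tag of the owning hash."""
    def __init__(self):
        self.cpu = None
        self.tag = None

table = [FlowEntry() for _ in range(TABLE_SIZE)]

def tag_of(rxhash):
    # Assumed choice: upper 16 bits, since the low bits form the index.
    return (rxhash >> 16) & 0xFFFF

def set_flow(rxhash, cpu):
    entry = table[rxhash & MASK]
    entry.cpu = cpu
    entry.tag = tag_of(rxhash)

def lookup(rxhash, ingress_cpu):
    """Steer only when the slot is set and its tag matches this hash."""
    entry = table[rxhash & MASK]
    if entry.cpu is None or entry.tag != tag_of(rxhash):
        return ingress_cpu
    return entry.cpu

set_flow(0xAAAA0005, cpu=3)                     # hypothetical local flow

owner = lookup(0xAAAA0005, ingress_cpu=0)       # the local flow itself
same_index = lookup(0xBBBB0005, ingress_cpu=0)  # same slot, different tag
same_tag = lookup(0xAAAA1235, ingress_cpu=0)    # same slot and same tag
print(owner, same_index, same_tag)
```

The tag filters out flows that merely share an index, but a flow with
the same tag (or, with a full 32-bit tag, the same rxhash) is still
indistinguishable from the local one.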
Cheers,