From: "David S. Miller" <davem@redhat.com>
To: Julian Anastasov <ja@ssi.bg>
Cc: niv@us.ibm.com, ak@suse.de, ruddk@us.ibm.com,
kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com,
chester.f.johnson@intel.com
Subject: Re: PMTU issues due to TOS field manipulation (for DSCP)
Date: Fri, 12 Dec 2003 00:31:43 -0800
Message-ID: <20031212003143.062598e9.davem@redhat.com>
In-Reply-To: <Pine.LNX.4.44.0312110219210.1519-100000@u.domain.uli>
On Thu, 11 Dec 2003 02:34:51 +0200 (EET)
Julian Anastasov <ja@ssi.bg> wrote:
> On Wed, 10 Dec 2003, David S. Miller wrote:
>
> > But regardless, let us say that your system has complexity O(16)
> > lookups as you mention, your proposal changes this to O(16+8).
>
> It is ~16 :)
>
> ip_rt_max_size = (rt_hash_mask + 1) * 16;
>
> This is what happens on full table, of course. OK,
> some simple numbers for an ideal table:
But look at the default gc_thresh setting, which is the point at which
we start trimming rt cache entries:
ipv4_dst_ops.gc_thresh = (rt_hash_mask + 1);
The ip_rt_max_size value is meant to be a sort of buffer to absorb
the situation where many rt cache entries are unreclaimable.
But this is a separate issue, and we can discuss your further points
regardless.
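To put numbers on the two limits, here is a small user-space sketch, not
kernel code. Only the two formulas come from the lines quoted above; the
function names and the sample mask used in the note below are made up
for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers: only the two formulas are taken from the thread. */

/* Number of hash buckets for a given rt_hash_mask (mask = buckets - 1). */
unsigned int rt_buckets(unsigned int rt_hash_mask)
{
    /* Keep (mask + 1) * 16 from overflowing in this sketch. */
    assert(rt_hash_mask < UINT_MAX / 16);
    return rt_hash_mask + 1;
}

/* Entry count at which trimming starts: one entry per bucket. */
unsigned int rt_gc_thresh(unsigned int rt_hash_mask)
{
    return rt_buckets(rt_hash_mask);
}

/* Hard cap on cache entries: 16 entries per bucket. */
unsigned int rt_max_size(unsigned int rt_hash_mask)
{
    return rt_buckets(rt_hash_mask) * 16;
}

/* Average chain length when the cache holds `entries` entries. */
unsigned int avg_chain_len(unsigned int entries, unsigned int rt_hash_mask)
{
    return entries / rt_buckets(rt_hash_mask);
}
```

With a sample mask of 1023, trimming starts at an average chain length of
1, and the average only reaches 16 if the table fills all the way up to
ip_rt_max_size.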
> 2 cases depending on whether TOS is a hash key (path=saddr->daddr):
>
> 1. TOS is a hash key:
>
> - in each chain we have 16 paths, 1 TOS value per path
> - all 8 TOS values for a path are in 8 different chains
>
> 2. TOS is not a hash key:
>
> 2 paths per chain (2 paths x 8 TOS values => 16 entries)
>
> if all saddr->daddr->tos streams have same packet rate I think
> the CPU time to lookup them will be same.
> This is because 8 (number of TOS values) < 16 (chain length).
>
> And I hope the users always can tune the proposed TOS
> settings if they see DoS and if they do not need TOS as a rt key.
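As an illustration of the quoted analysis (not part of either author's
mail), the two layouts can be sketched in user-space C. It assumes the
ideal full table described above, a chain length of 16 and 8 TOS values
per path; all names are hypothetical.

```c
#include <assert.h>

/* Hypothetical description of what one full hash chain contains. */
struct chain_layout {
    unsigned int paths_per_chain;        /* distinct saddr->daddr paths */
    unsigned int tos_per_path_in_chain;  /* TOS variants of one path    */
};

/*
 * Ideal layout of one chain.  If TOS is part of the hash key, the TOS
 * variants of a path scatter over different chains, so a chain sees one
 * variant per path.  Otherwise all variants of a path share one chain.
 */
struct chain_layout ideal_layout(unsigned int chain_len,
                                 unsigned int tos_values,
                                 int tos_is_hash_key)
{
    struct chain_layout l;

    assert(tos_values > 0 && chain_len % tos_values == 0);
    l.tos_per_path_in_chain = tos_is_hash_key ? 1 : tos_values;
    l.paths_per_chain = chain_len / l.tos_per_path_in_chain;
    return l;
}

/*
 * Twice the mean number of entries compared per successful lookup when
 * every entry in the chain is hit equally often: 2 * (n + 1) / 2 = n + 1.
 * Doubled to stay in integer arithmetic.
 */
unsigned int mean_compares_x2(struct chain_layout l)
{
    return l.paths_per_chain * l.tos_per_path_in_chain + 1;
}
```

Both cases give a chain of 16 entries (16 paths x 1 TOS, or 2 paths x 8
TOS), so under equal packet rates the mean lookup cost comes out the same.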
Ok. I agree with your analysis. Let's propose something concrete.
1) PMTU processing applies PMTU change to all TOS'd instances of
a route. This behavior change is sysctl controllable, and
on by default.
The implementation is to just look up all 8 possible TOS values.
2) Whether TOS is a routing cache hash key is controlled by another
sysctl.
When CONFIG_IP_ROUTE_TOS is set this sysctl defaults to on,
otherwise it defaults to off.
I think #2 should be very safe, because fib node fn_tos values are only
ever set when that config variable is enabled, and fib rule r_tos values
are only compared on lookup when it is enabled as well. A few more
ifdefs could be added to the fib rule code to cover all the assignment
cases too, but let's not worry about that right now.
Comments?
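A minimal user-space sketch of point 1, to show the shape of "look up
all 8 possible TOS values". This is a toy cache with a linear scan, not
the kernel's route cache. Every name here is hypothetical, and the
sketch assumes the 8 values are the ones selected by a 3-bit mask of
0x1C (0x00, 0x04, ..., 0x1C). The pmtu_all_tos flag stands in for the
proposed sysctl.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TOS_MASK   0x1Cu  /* assumption: 3 routing TOS bits => 8 values */
#define TOY_TOS_STEP   0x04u  /* distance between consecutive TOS values    */
#define TOY_CACHE_SIZE 32

struct toy_rt_entry {
    unsigned int saddr, daddr, tos, pmtu;
    int in_use;
};

struct toy_rt_cache {
    struct toy_rt_entry e[TOY_CACHE_SIZE];
    int pmtu_all_tos;  /* stand-in for the proposed sysctl, 1 = on */
};

/* Add an entry; returns 0 on success, -1 if the toy cache is full. */
int toy_add(struct toy_rt_cache *c, unsigned int saddr, unsigned int daddr,
            unsigned int tos, unsigned int pmtu)
{
    size_t i;

    assert(c != NULL);
    for (i = 0; i < TOY_CACHE_SIZE; i++) {
        if (!c->e[i].in_use) {
            c->e[i].saddr = saddr;
            c->e[i].daddr = daddr;
            c->e[i].tos = tos & TOY_TOS_MASK;
            c->e[i].pmtu = pmtu;
            c->e[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Exact-match lookup on (saddr, daddr, masked tos); NULL if absent. */
struct toy_rt_entry *toy_lookup(struct toy_rt_cache *c, unsigned int saddr,
                                unsigned int daddr, unsigned int tos)
{
    size_t i;

    assert(c != NULL);
    for (i = 0; i < TOY_CACHE_SIZE; i++) {
        struct toy_rt_entry *rt = &c->e[i];

        if (rt->in_use && rt->saddr == saddr && rt->daddr == daddr &&
            rt->tos == (tos & TOY_TOS_MASK))
            return rt;
    }
    return NULL;
}

/*
 * Lower the PMTU for a path.  With pmtu_all_tos set, every TOS instance
 * of the path is looked up and updated; otherwise only the entry that
 * matches the given TOS.  Returns the number of entries changed.
 */
int toy_update_pmtu(struct toy_rt_cache *c, unsigned int saddr,
                    unsigned int daddr, unsigned int tos,
                    unsigned int new_mtu)
{
    unsigned int first, last, t;
    int updated = 0;

    assert(c != NULL);
    first = c->pmtu_all_tos ? 0 : (tos & TOY_TOS_MASK);
    last = c->pmtu_all_tos ? TOY_TOS_MASK : first;

    for (t = first; t <= last; t += TOY_TOS_STEP) {
        struct toy_rt_entry *rt = toy_lookup(c, saddr, daddr, t);

        if (rt && new_mtu < rt->pmtu) {
            rt->pmtu = new_mtu;
            updated++;
        }
    }
    return updated;
}
```

The cost of the flag is at most 8 lookups per PMTU event instead of 1,
which is the extra work being weighed earlier in the thread.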
Thread overview: 31+ messages
2003-12-10 18:23 PMTU issues due to TOS field manipulation (for DSCP) Kevin W. Rudd
2003-12-10 19:34 ` Andi Kleen
2003-12-10 20:20 ` Julian Anastasov
2003-12-10 20:33 ` Nivedita Singhvi
2003-12-10 21:18 ` Julian Anastasov
2003-12-10 22:36 ` Nivedita Singhvi
2003-12-10 22:51 ` David S. Miller
2003-12-10 23:15 ` Julian Anastasov
2003-12-10 23:20 ` David S. Miller
2003-12-11 0:06 ` Julian Anastasov
2003-12-11 0:09 ` David S. Miller
2003-12-11 0:34 ` Julian Anastasov
2003-12-12 8:31 ` David S. Miller [this message]
2003-12-12 23:38 ` Julian Anastasov
2003-12-18 23:17 ` Kevin W. Rudd
2004-01-19 22:43 ` Kevin W. Rudd
2004-01-20 4:29 ` David S. Miller
2003-12-13 0:10 ` Julian Anastasov
2004-03-04 9:36 ` David S. Miller
2004-03-04 20:56 ` Julian Anastasov
2004-03-04 22:02 ` kuznet
2004-03-06 11:55 ` Julian Anastasov
2004-03-06 16:02 ` Julian Anastasov
2003-12-10 20:26 ` Kevin W. Rudd
2003-12-10 20:52 ` Andi Kleen
2003-12-10 20:30 ` Nivedita Singhvi
2003-12-10 20:55 ` Andi Kleen
2003-12-10 21:11 ` Nivedita Singhvi
-- strict thread matches above, loose matches on Subject: below --
2003-12-10 21:35 Johnson, Chester F
2003-12-10 23:36 Johnson, Chester F
2003-12-11 0:17 ` Julian Anastasov