netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "David S. Miller" <davem@davemloft.net>
To: hadi@cyberus.ca
Cc: netdev@oss.sgi.com
Subject: Re: [PATCH] Clean up fib_hash datastructures
Date: Mon, 20 Sep 2004 12:11:23 -0700	[thread overview]
Message-ID: <20040920121123.70baf895.davem@davemloft.net> (raw)
In-Reply-To: <1095686672.1049.301.camel@jzny.localdomain>

On 20 Sep 2004 09:24:32 -0400
jamal <hadi@cyberus.ca> wrote:

> On Sun, 2004-09-19 at 22:53, David S. Miller wrote:
> > 
> > There are two problems.  Well, logically one, which is the seperation
> > out of fib_node from the code.
> > 
> > I need to expose the layout of fib_node for other reasons.
> > 
> > If you look at Ben LaHaise's case, the issue becomes evident.
> 
> IIRC, he was having issues when 1000 devices all came up at once and all
> tried to add local scope routes?

Furthermore, once you have 1000 devices the algorithms in
net/ipv4/fib_semantics.c fall apart.

The code in fib_semantics.c was written assuming that next
hop information composed of a small set of unique entries.
So even if your routing tables were huge, those routing
entries pointed to a small group of unique next-hop objects
(which is what fib_info represents).

So all of that code was using a singly linked list of all
fib_info's to do lookups.  Therefore once you have a couple
thousand entries, even the simplest event processing becomes
expensive.

> > Firstly, fib_semantics was not designed to scale at all,
> > thus last weeks patches to add the hash tables.  
> 
> Will have to look at those patches - already in?

Yes, current 2.6.x tree has the new fib_semantics.c code.

> I like the idea except for when enforcing that scheme to be used
> by other algorithms[1]. 

It is the core issue.

I read from this statement that it is envisioned that some
fib_node lookup algorithm could, in a different way, find
all fib_nodes corresponding to a given fib_info.  I ask you
to provide what that mechanism might be before I am willing
to be concerned about this possibility :-)

> > So the general idea I was after was:
> > 
> > 1) All things performing pure longest-matching prefix
> >    lookup on an ipv4 address is confined to fib_node objects
> >    and the actual lookup algorithm.
> > 
> > 2) Everything that occurs once we have a matching fib_node
> >    object, is consolidated into a common pieces of code
> >    that does all the TOS, priority, et al. magic
> 
> sigh. Ok, some of the stuff in fib_node like TOS lookups are
> necessary for _total_ RFC1812 compliance. So i see where you are coming
> from although i am not a big fan of it. I will chew on it in the
> background.

Jamal and others, while I have your brains active in this area
I have a question.

I tried very hard to preserve existing behavior wrt. TOS
handling wrt. the setting of CONFIG_IP_ROUTE_TOS.  Basically,
the r->rtm_tos is ignored when adding routes, and treated as
zero.

I believe I was successful in preserving existing behavior, but
I wonder if it makes sense any longer.  CONFIG_IP_ROUTE_TOS
changes not one data structure.  It does exactly:

1) Controls the presence of a TOS comparison in
   fib_rules.c:fib_lookup()

2) Controls, in fib_hash.c:
	a) TOS comparison in fn_hash_lookup()
	b) whether TOS is paid attention to in fn_hash_insert
	c) similarly in fn_hash_delete()

In the TOS comparison changes, the TOS test treats zero
TOS values as "any TOS".  Therefore unless the user explicitly
tried to add a non-zero TOS route, TOS makes no difference
in behavior both with and without CONFIG_IP_ROUTE_TOS set.

Therefore, I think it behooves us just kill off this config
value.  It saves a mere 6 or 7 lines of code and no data
space.

  reply	other threads:[~2004-09-20 19:11 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-09-19  3:33 [PATCH] Clean up fib_hash datastructures David S. Miller
2004-09-20  0:39 ` jamal
2004-09-20  2:46   ` David S. Miller
2004-09-20  3:14   ` Herbert Xu
2004-09-20  3:17     ` David S. Miller
2004-09-20 13:33       ` jamal
2004-09-20  1:51 ` jamal
2004-09-20  2:53   ` David S. Miller
2004-09-20 13:24     ` jamal
2004-09-20 19:11       ` David S. Miller [this message]
2004-09-21  3:42         ` Herbert Xu
2004-09-21  6:18           ` David S. Miller
2004-09-21  9:04             ` Andi Kleen
2004-09-21  9:32               ` Herbert Xu
2004-09-21 11:03                 ` jamal
2004-09-21 12:07                   ` Andi Kleen
2004-09-21 12:22                     ` jamal
2004-09-21 23:38                   ` Steven Blake
2004-09-22  3:10                     ` jamal
2004-09-22 11:56                       ` Steven Blake
2004-09-22 12:12                         ` jamal
2004-09-22 20:04                           ` Steven Blake
2004-09-22  2:07             ` Herbert Xu
2004-09-22  2:32               ` David S. Miller
2004-09-22  3:57                 ` [IPv4] Check PAGE_SIZE in fz_hash_alloc Herbert Xu
2004-09-23 20:05                   ` David S. Miller
2004-09-22  3:30         ` [PATCH] Clean up fib_hash datastructures jamal
2004-09-22  3:36           ` David S. Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040920121123.70baf895.davem@davemloft.net \
    --to=davem@davemloft.net \
    --cc=hadi@cyberus.ca \
    --cc=netdev@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).