netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [NET 00/06]: Increase number of possible routing tables
@ 2006-08-10 19:29 Patrick McHardy
  2006-08-10 19:29 ` [NET 01/06]: Use u32 for routing table IDs Patrick McHardy
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Patrick McHardy @ 2006-08-10 19:29 UTC (permalink / raw)
  To: davem; +Cc: netdev, Patrick McHardy, patrick

These are the updated patches (against net-2.6.19) to increase the number
of possible routing tables to 2^32. They basically consist of four parts:

- Use u32 for routing table IDs everywhere inside the kernel

- Introduce new netlink attributes to carry extended table IDs and add
  compatibility functions to use either rtm_table or the new attributes
  if given.

- Prepare IPv4/IPv6/DecNET for increasing RT_TABLE_MAX to 2^32 by using
  hash tables instead of a fixed size array of pointers (IPv4 and DecNET,
  IPv6 already contains that part) and replacing iterations over all
  possible table IDs by hash walking.

- Finally, increase RT_TABLE_MAX to 2^32

IPv4 and IPv6 are tested with and without CONFIG_MULTIPLE_TABLES, DecNET
is only compile tested.


 include/linux/fib_rules.h |    4 +
 include/linux/rtnetlink.h |   12 ++-
 include/net/dn_fib.h      |    7 -
 include/net/fib_rules.h   |    7 +
 include/net/ip6_route.h   |    7 +
 include/net/ip_fib.h      |   39 +++-------
 net/core/fib_rules.c      |    5 -
 net/decnet/dn_fib.c       |   62 +---------------
 net/decnet/dn_route.c     |    1 
 net/decnet/dn_rules.c     |    2 
 net/decnet/dn_table.c     |  136 ++++++++++++++++++++++++++----------
 net/ipv4/fib_frontend.c   |  117 ++++++++++++++++++++-----------
 net/ipv4/fib_hash.c       |   30 ++++----
 net/ipv4/fib_lookup.h     |    4 -
 net/ipv4/fib_rules.c      |    7 +
 net/ipv4/fib_semantics.c  |    5 -
 net/ipv4/fib_trie.c       |   32 ++++----
 net/ipv4/route.c          |    1 
 net/ipv6/fib6_rules.c     |    1 
 net/ipv6/ip6_fib.c        |  171 ++++++++++++++++++++++++++++++++++++++++------
 net/ipv6/route.c          |  141 ++-----------------------------------
 21 files changed, 430 insertions(+), 361 deletions(-)

Patrick McHardy:
      [NET]: Use u32 for routing table IDs
      [NET]: Introduce RTA_TABLE/FRA_TABLE attributes
      [IPV4]: Increase number of possible routing tables to 2^32
      [IPV6]: Increase number of possible routing tables to 2^32
      [DECNET]: Increase number of possible routing tables to 2^32
      [NET]: Increate RT_TABLE_MAX to 2^32

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [NET 01/06]: Use u32 for routing table IDs
  2006-08-10 19:29 [NET 00/06]: Increase number of possible routing tables Patrick McHardy
@ 2006-08-10 19:29 ` Patrick McHardy
  2006-08-11  6:08   ` David Miller
  2006-08-10 19:30 ` [NET 02/06]: Introduce RTA_TABLE/FRA_TABLE attributes Patrick McHardy
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Patrick McHardy @ 2006-08-10 19:29 UTC (permalink / raw)
  To: davem; +Cc: netdev, Patrick McHardy, patrick

[NET]: Use u32 for routing table IDs

Use u32 for routing table IDs in net/ipv4 and net/decnet in preparation of
support for a larger number of routing tables. net/ipv6 already uses u32
everywhere and needs no further changes. No functional changes are made by
this patch.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 29a0f4a779543907ddf8fbca55b6f1d0e0017f64
tree c559ca79c2d6ab28ceb4a4c1d5ecd5ea81264f0d
parent 1b471cd32acdff18786bc06542c686d52decbc5a
author Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:45:22 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:45:22 +0200

 include/net/dn_fib.h     |    4 ++--
 include/net/ip_fib.h     |   14 +++++++-------
 net/decnet/dn_fib.c      |    6 +++---
 net/decnet/dn_table.c    |   10 +++++-----
 net/ipv4/fib_frontend.c  |    8 ++++----
 net/ipv4/fib_hash.c      |    4 ++--
 net/ipv4/fib_lookup.h    |    4 ++--
 net/ipv4/fib_rules.c     |    2 +-
 net/ipv4/fib_semantics.c |    4 ++--
 net/ipv4/fib_trie.c      |    6 +++---
 10 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/include/net/dn_fib.h b/include/net/dn_fib.h
index 32bc8ce..cd9c378 100644
--- a/include/net/dn_fib.h
+++ b/include/net/dn_fib.h
@@ -94,7 +94,7 @@ #define DN_FIB_INFO(f) ((f)->fn_info)
 
 
 struct dn_fib_table {
-	int n;
+	u32 n;
 
 	int (*insert)(struct dn_fib_table *t, struct rtmsg *r, 
 			struct dn_kern_rta *rta, struct nlmsghdr *n, 
@@ -137,7 +137,7 @@ extern int dn_fib_sync_up(struct net_dev
 /*
  * dn_tables.c
  */
-extern struct dn_fib_table *dn_fib_get_table(int n, int creat);
+extern struct dn_fib_table *dn_fib_get_table(u32 n, int creat);
 extern struct dn_fib_table *dn_fib_empty_table(void);
 extern void dn_fib_table_init(void);
 extern void dn_fib_table_cleanup(void);
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index adf7358..0dcbf16 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -150,7 +150,7 @@ #define FIB_RES_NETMASK(res)	        (0)
 #endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */
 
 struct fib_table {
-	unsigned char	tb_id;
+	u32		tb_id;
 	unsigned	tb_stamp;
 	int		(*tb_lookup)(struct fib_table *tb, const struct flowi *flp, struct fib_result *res);
 	int		(*tb_insert)(struct fib_table *table, struct rtmsg *r,
@@ -173,14 +173,14 @@ #ifndef CONFIG_IP_MULTIPLE_TABLES
 extern struct fib_table *ip_fib_local_table;
 extern struct fib_table *ip_fib_main_table;
 
-static inline struct fib_table *fib_get_table(int id)
+static inline struct fib_table *fib_get_table(u32 id)
 {
 	if (id != RT_TABLE_LOCAL)
 		return ip_fib_main_table;
 	return ip_fib_local_table;
 }
 
-static inline struct fib_table *fib_new_table(int id)
+static inline struct fib_table *fib_new_table(u32 id)
 {
 	return fib_get_table(id);
 }
@@ -205,9 +205,9 @@ #define ip_fib_main_table (fib_tables[RT
 
 extern struct fib_table * fib_tables[RT_TABLE_MAX+1];
 extern int fib_lookup(struct flowi *flp, struct fib_result *res);
-extern struct fib_table *__fib_new_table(int id);
+extern struct fib_table *__fib_new_table(u32 id);
 
-static inline struct fib_table *fib_get_table(int id)
+static inline struct fib_table *fib_get_table(u32 id)
 {
 	if (id == 0)
 		id = RT_TABLE_MAIN;
@@ -215,7 +215,7 @@ static inline struct fib_table *fib_get_
 	return fib_tables[id];
 }
 
-static inline struct fib_table *fib_new_table(int id)
+static inline struct fib_table *fib_new_table(u32 id)
 {
 	if (id == 0)
 		id = RT_TABLE_MAIN;
@@ -248,7 +248,7 @@ extern int fib_convert_rtentry(int cmd, 
 extern u32  __fib_res_prefsrc(struct fib_result *res);
 
 /* Exported by fib_hash.c */
-extern struct fib_table *fib_hash_init(int id);
+extern struct fib_table *fib_hash_init(u32 id);
 
 #ifdef CONFIG_IP_MULTIPLE_TABLES
 extern int fib4_rules_dump(struct sk_buff *skb, struct netlink_callback *cb);
diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c
index ed5fb5c..7b3bf5c 100644
--- a/net/decnet/dn_fib.c
+++ b/net/decnet/dn_fib.c
@@ -534,8 +534,8 @@ int dn_fib_rtm_newroute(struct sk_buff *
 
 int dn_fib_dump(struct sk_buff *skb, struct netlink_callback *cb)
 {
-	int t;
-	int s_t;
+	u32 t;
+	u32 s_t;
 	struct dn_fib_table *tb;
 
 	if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) &&
@@ -765,7 +765,7 @@ void dn_fib_flush(void)
 {
         int flushed = 0;
         struct dn_fib_table *tb;
-        int id;
+        u32 id;
 
         for(id = RT_TABLE_MAX; id > 0; id--) {
                 if ((tb = dn_fib_get_table(id, 0)) == NULL)
diff --git a/net/decnet/dn_table.c b/net/decnet/dn_table.c
index c6a2e41..b7c6c06 100644
--- a/net/decnet/dn_table.c
+++ b/net/decnet/dn_table.c
@@ -264,7 +264,7 @@ static int dn_fib_nh_match(struct rtmsg 
 }
 
 static int dn_fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event,
-                        u8 tb_id, u8 type, u8 scope, void *dst, int dst_len,
+                        u32 tb_id, u8 type, u8 scope, void *dst, int dst_len,
                         struct dn_fib_info *fi, unsigned int flags)
 {
         struct rtmsg *rtm;
@@ -327,7 +327,7 @@ rtattr_failure:
 }
 
 
-static void dn_rtmsg_fib(int event, struct dn_fib_node *f, int z, int tb_id,
+static void dn_rtmsg_fib(int event, struct dn_fib_node *f, int z, u32 tb_id,
                         struct nlmsghdr *nlh, struct netlink_skb_parms *req)
 {
         struct sk_buff *skb;
@@ -736,7 +736,7 @@ out:
 }
 
 
-struct dn_fib_table *dn_fib_get_table(int n, int create)
+struct dn_fib_table *dn_fib_get_table(u32 n, int create)
 {
         struct dn_fib_table *t;
 
@@ -773,7 +773,7 @@ struct dn_fib_table *dn_fib_get_table(in
         return t;
 }
 
-static void dn_fib_del_tree(int n)
+static void dn_fib_del_tree(u32 n)
 {
 	struct dn_fib_table *t;
 
@@ -787,7 +787,7 @@ static void dn_fib_del_tree(int n)
 
 struct dn_fib_table *dn_fib_empty_table(void)
 {
-        int id;
+        u32 id;
 
         for(id = RT_TABLE_MIN; id <= RT_TABLE_MAX; id++)
                 if (dn_fib_tables[id] == NULL)
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index a83f1aa..06f4b23 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -62,7 +62,7 @@ #define RT_TABLE_MIN 1
 
 struct fib_table *fib_tables[RT_TABLE_MAX+1];
 
-struct fib_table *__fib_new_table(int id)
+struct fib_table *__fib_new_table(u32 id)
 {
 	struct fib_table *tb;
 
@@ -82,7 +82,7 @@ static void fib_flush(void)
 	int flushed = 0;
 #ifdef CONFIG_IP_MULTIPLE_TABLES
 	struct fib_table *tb;
-	int id;
+	u32 id;
 
 	for (id = RT_TABLE_MAX; id>0; id--) {
 		if ((tb = fib_get_table(id))==NULL)
@@ -333,8 +333,8 @@ int inet_rtm_newroute(struct sk_buff *sk
 
 int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
 {
-	int t;
-	int s_t;
+	u32 t;
+	u32 s_t;
 	struct fib_table *tb;
 
 	if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) &&
diff --git a/net/ipv4/fib_hash.c b/net/ipv4/fib_hash.c
index 72c633b..f8d5c80 100644
--- a/net/ipv4/fib_hash.c
+++ b/net/ipv4/fib_hash.c
@@ -765,9 +765,9 @@ static int fn_hash_dump(struct fib_table
 }
 
 #ifdef CONFIG_IP_MULTIPLE_TABLES
-struct fib_table * fib_hash_init(int id)
+struct fib_table * fib_hash_init(u32 id)
 #else
-struct fib_table * __init fib_hash_init(int id)
+struct fib_table * __init fib_hash_init(u32 id)
 #endif
 {
 	struct fib_table *tb;
diff --git a/net/ipv4/fib_lookup.h b/net/ipv4/fib_lookup.h
index ef6609e..ddd5249 100644
--- a/net/ipv4/fib_lookup.h
+++ b/net/ipv4/fib_lookup.h
@@ -30,11 +30,11 @@ extern struct fib_info *fib_create_info(
 extern int fib_nh_match(struct rtmsg *r, struct nlmsghdr *,
 			struct kern_rta *rta, struct fib_info *fi);
 extern int fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event,
-			 u8 tb_id, u8 type, u8 scope, void *dst,
+			 u32 tb_id, u8 type, u8 scope, void *dst,
 			 int dst_len, u8 tos, struct fib_info *fi,
 			 unsigned int);
 extern void rtmsg_fib(int event, u32 key, struct fib_alias *fa,
-		      int z, int tb_id,
+		      int z, u32 tb_id,
 		      struct nlmsghdr *n, struct netlink_skb_parms *req);
 extern struct fib_alias *fib_find_alias(struct list_head *fah,
 					u8 tos, u32 prio);
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index d242e52..58fb91b 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -169,7 +169,7 @@ #endif
 
 static struct fib_table *fib_empty_table(void)
 {
-	int id;
+	u32 id;
 
 	for (id = 1; id <= RT_TABLE_MAX; id++)
 		if (fib_tables[id] == NULL)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 8aa6014..17c7974 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -273,7 +273,7 @@ int ip_fib_check_default(u32 gw, struct 
 }
 
 void rtmsg_fib(int event, u32 key, struct fib_alias *fa,
-	       int z, int tb_id,
+	       int z, u32 tb_id,
 	       struct nlmsghdr *n, struct netlink_skb_parms *req)
 {
 	struct sk_buff *skb;
@@ -935,7 +935,7 @@ u32 __fib_res_prefsrc(struct fib_result 
 
 int
 fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event,
-	      u8 tb_id, u8 type, u8 scope, void *dst, int dst_len, u8 tos,
+	      u32 tb_id, u8 type, u8 scope, void *dst, int dst_len, u8 tos,
 	      struct fib_info *fi, unsigned int flags)
 {
 	struct rtmsg *rtm;
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 23fb9d9..b98f211 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1148,7 +1148,7 @@ fn_trie_insert(struct fib_table *tb, str
 
 	key = ntohl(key);
 
-	pr_debug("Insert table=%d %08x/%d\n", tb->tb_id, key, plen);
+	pr_debug("Insert table=%u %08x/%d\n", tb->tb_id, key, plen);
 
 	mask = ntohl(inet_make_mask(plen));
 
@@ -1943,9 +1943,9 @@ out:
 /* Fix more generic FIB names for init later */
 
 #ifdef CONFIG_IP_MULTIPLE_TABLES
-struct fib_table * fib_hash_init(int id)
+struct fib_table * fib_hash_init(u32 id)
 #else
-struct fib_table * __init fib_hash_init(int id)
+struct fib_table * __init fib_hash_init(u32 id)
 #endif
 {
 	struct fib_table *tb;

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [NET 02/06]: Introduce RTA_TABLE/FRA_TABLE attributes
  2006-08-10 19:29 [NET 00/06]: Increase number of possible routing tables Patrick McHardy
  2006-08-10 19:29 ` [NET 01/06]: Use u32 for routing table IDs Patrick McHardy
@ 2006-08-10 19:30 ` Patrick McHardy
  2006-08-11  6:09   ` David Miller
  2006-08-11  7:02   ` Michael Tokarev
  2006-08-10 19:30 ` [IPV4 03/06]: Increase number of possible routing tables to 2^32 Patrick McHardy
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 19+ messages in thread
From: Patrick McHardy @ 2006-08-10 19:30 UTC (permalink / raw)
  To: davem; +Cc: netdev, Patrick McHardy, patrick

[NET]: Introduce RTA_TABLE/FRA_TABLE attributes

Introduce RTA_TABLE route attribute and FRA_TABLE routing rule attribute
to hold 32 bit routing table IDs. Usespace compatibility is provided by
continuing to accept and send the rtm_table field, but because of its
limited size it can only carry the low 8 bits of the table ID. This
implies that if larger IDs are used, _all_ userspace programs using them
need to use RTA_TABLE.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit a9fe50e925cdc0471b88bcf6f3cc18278b63c984
tree 08d8bfa20011b5afa940126a8bb0c153729584c3
parent 29a0f4a779543907ddf8fbca55b6f1d0e0017f64
author Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:50:19 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:50:19 +0200

 include/linux/fib_rules.h |    4 ++++
 include/linux/rtnetlink.h |    8 ++++++++
 include/net/fib_rules.h   |    7 +++++++
 net/core/fib_rules.c      |    5 +++--
 net/decnet/dn_fib.c       |    7 ++++---
 net/decnet/dn_route.c     |    1 +
 net/decnet/dn_table.c     |    1 +
 net/ipv4/fib_frontend.c   |    7 ++++---
 net/ipv4/fib_rules.c      |    1 +
 net/ipv4/fib_semantics.c  |    1 +
 net/ipv4/route.c          |    1 +
 net/ipv6/fib6_rules.c     |    1 +
 net/ipv6/route.c          |   13 +++++++++----
 13 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/include/linux/fib_rules.h b/include/linux/fib_rules.h
index 5e503f0..19a82b6 100644
--- a/include/linux/fib_rules.h
+++ b/include/linux/fib_rules.h
@@ -36,6 +36,10 @@ enum
 	FRA_UNUSED5,
 	FRA_FWMARK,	/* netfilter mark (IPv4) */
 	FRA_FLOW,	/* flow/class id */
+	FRA_UNUSED6,
+	FRA_UNUSED7,
+	FRA_UNUSED8,
+	FRA_TABLE,	/* Extended table id */
 	__FRA_MAX
 };
 
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 5deca87..b01bc8b 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -264,6 +264,7 @@ enum rtattr_type_t
 	RTA_CACHEINFO,
 	RTA_SESSION,
 	RTA_MP_ALGO,
+	RTA_TABLE,
 	__RTA_MAX
 };
 
@@ -716,6 +717,13 @@ #define BUG_TRAP(x) do { \
 	} \
 } while(0)
 
+static inline u32 rtm_get_table(struct rtattr **rta, u8 table)
+{
+	return RTA_GET_U32(rta[RTA_TABLE-1]);
+rtattr_failure:
+	return table;
+}
+
 #endif /* __KERNEL__ */
 
 
diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index 61375d9..8e2f473 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -74,6 +74,13 @@ static inline void fib_rule_put(struct f
 		call_rcu(&rule->rcu, fib_rule_put_rcu);
 }
 
+static inline u32 frh_get_table(struct fib_rule_hdr *frh, struct nlattr **nla)
+{
+	if (nla[FRA_TABLE])
+		return nla_get_u32(nla[FRA_TABLE]);
+	return frh->table;
+}
+
 extern int			fib_rules_register(struct fib_rules_ops *);
 extern int			fib_rules_unregister(struct fib_rules_ops *);
 
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 2e7ed5d..97b196f 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -187,7 +187,7 @@ int fib_nl_newrule(struct sk_buff *skb, 
 
 	rule->action = frh->action;
 	rule->flags = frh->flags;
-	rule->table = frh->table;
+	rule->table = frh_get_table(frh, tb);
 
 	if (!rule->pref && ops->default_pref)
 		rule->pref = ops->default_pref();
@@ -245,7 +245,7 @@ int fib_nl_delrule(struct sk_buff *skb, 
 		if (frh->action && (frh->action != rule->action))
 			continue;
 
-		if (frh->table && (frh->table != rule->table))
+		if (frh->table && (frh_get_table(frh, tb) != rule->table))
 			continue;
 
 		if (tb[FRA_PRIORITY] &&
@@ -291,6 +291,7 @@ static int fib_nl_fill_rule(struct sk_bu
 
 	frh = nlmsg_data(nlh);
 	frh->table = rule->table;
+	NLA_PUT_U32(skb, FRA_TABLE, rule->table);
 	frh->res1 = 0;
 	frh->res2 = 0;
 	frh->action = rule->action;
diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c
index 7b3bf5c..fb59637 100644
--- a/net/decnet/dn_fib.c
+++ b/net/decnet/dn_fib.c
@@ -491,7 +491,8 @@ static int dn_fib_check_attr(struct rtms
 		if (attr) {
 			if (RTA_PAYLOAD(attr) < 4 && RTA_PAYLOAD(attr) != 2)
 				return -EINVAL;
-			if (i != RTA_MULTIPATH && i != RTA_METRICS)
+			if (i != RTA_MULTIPATH && i != RTA_METRICS &&
+			    i != RTA_TABLE)
 				rta[i-1] = (struct rtattr *)RTA_DATA(attr);
 		}
 	}
@@ -508,7 +509,7 @@ int dn_fib_rtm_delroute(struct sk_buff *
 	if (dn_fib_check_attr(r, rta))
 		return -EINVAL;
 
-	tb = dn_fib_get_table(r->rtm_table, 0);
+	tb = dn_fib_get_table(rtm_get_table(rta, r->rtm_table), 0);
 	if (tb)
 		return tb->delete(tb, r, (struct dn_kern_rta *)rta, nlh, &NETLINK_CB(skb));
 
@@ -524,7 +525,7 @@ int dn_fib_rtm_newroute(struct sk_buff *
 	if (dn_fib_check_attr(r, rta))
 		return -EINVAL;
 
-	tb = dn_fib_get_table(r->rtm_table, 1);
+	tb = dn_fib_get_table(rtm_get_table(rta, r->rtm_table), 1);
 	if (tb) 
 		return tb->insert(tb, r, (struct dn_kern_rta *)rta, nlh, &NETLINK_CB(skb));
 
diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index d5c1758..c5daf35 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -1486,6 +1486,7 @@ static int dn_rt_fill_info(struct sk_buf
 	r->rtm_src_len = 0;
 	r->rtm_tos = 0;
 	r->rtm_table = RT_TABLE_MAIN;
+	RTA_PUT_U32(skb, RTA_TABLE, RT_TABLE_MAIN);
 	r->rtm_type = rt->rt_type;
 	r->rtm_flags = (rt->rt_flags & ~0xFFFF) | RTM_F_CLONED;
 	r->rtm_scope = RT_SCOPE_UNIVERSE;
diff --git a/net/decnet/dn_table.c b/net/decnet/dn_table.c
index b7c6c06..d2ad791 100644
--- a/net/decnet/dn_table.c
+++ b/net/decnet/dn_table.c
@@ -278,6 +278,7 @@ static int dn_fib_dump_info(struct sk_bu
         rtm->rtm_src_len = 0;
         rtm->rtm_tos = 0;
         rtm->rtm_table = tb_id;
+	RTA_PUT_U32(skb, RTA_TABLE, tb_id);
         rtm->rtm_flags = fi->fib_flags;
         rtm->rtm_scope = scope;
 	rtm->rtm_type  = type;
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 06f4b23..2696ede 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -294,7 +294,8 @@ static int inet_check_attr(struct rtmsg 
 		if (attr) {
 			if (RTA_PAYLOAD(attr) < 4)
 				return -EINVAL;
-			if (i != RTA_MULTIPATH && i != RTA_METRICS)
+			if (i != RTA_MULTIPATH && i != RTA_METRICS &&
+			    i != RTA_TABLE)
 				*rta = (struct rtattr*)RTA_DATA(attr);
 		}
 	}
@@ -310,7 +311,7 @@ int inet_rtm_delroute(struct sk_buff *sk
 	if (inet_check_attr(r, rta))
 		return -EINVAL;
 
-	tb = fib_get_table(r->rtm_table);
+	tb = fib_get_table(rtm_get_table(rta, r->rtm_table));
 	if (tb)
 		return tb->tb_delete(tb, r, (struct kern_rta*)rta, nlh, &NETLINK_CB(skb));
 	return -ESRCH;
@@ -325,7 +326,7 @@ int inet_rtm_newroute(struct sk_buff *sk
 	if (inet_check_attr(r, rta))
 		return -EINVAL;
 
-	tb = fib_new_table(r->rtm_table);
+	tb = fib_new_table(rtm_get_table(rta, r->rtm_table));
 	if (tb)
 		return tb->tb_insert(tb, r, (struct kern_rta*)rta, nlh, &NETLINK_CB(skb));
 	return -ENOBUFS;
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 58fb91b..0330b9c 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -184,6 +184,7 @@ static struct nla_policy fib4_rule_polic
 	[FRA_DST]	= { .type = NLA_U32 },
 	[FRA_FWMARK]	= { .type = NLA_U32 },
 	[FRA_FLOW]	= { .type = NLA_U32 },
+	[FRA_TABLE]	= { .type = NLA_U32 },
 };
 
 static int fib4_rule_configure(struct fib_rule *rule, struct sk_buff *skb,
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 17c7974..ffb5966 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -949,6 +949,7 @@ fib_dump_info(struct sk_buff *skb, u32 p
 	rtm->rtm_src_len = 0;
 	rtm->rtm_tos = tos;
 	rtm->rtm_table = tb_id;
+	RTA_PUT_U32(skb, RTA_TABLE, tb_id);
 	rtm->rtm_type = type;
 	rtm->rtm_flags = fi->fib_flags;
 	rtm->rtm_scope = scope;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 77852da..956e4da 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2652,6 +2652,7 @@ #endif
 	r->rtm_src_len	= 0;
 	r->rtm_tos	= rt->fl.fl4_tos;
 	r->rtm_table	= RT_TABLE_MAIN;
+	RTA_PUT_U32(skb, RTA_TABLE, RT_TABLE_MAIN);
 	r->rtm_type	= rt->rt_type;
 	r->rtm_scope	= RT_SCOPE_UNIVERSE;
 	r->rtm_protocol = RTPROT_UNSPEC;
diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c
index 22a2fdb..2c4fbc8 100644
--- a/net/ipv6/fib6_rules.c
+++ b/net/ipv6/fib6_rules.c
@@ -129,6 +129,7 @@ static struct nla_policy fib6_rule_polic
 	[FRA_PRIORITY]	= { .type = NLA_U32 },
 	[FRA_SRC]	= { .minlen = sizeof(struct in6_addr) },
 	[FRA_DST]	= { .minlen = sizeof(struct in6_addr) },
+	[FRA_TABLE]	= { .type = NLA_U32 },
 };
 
 static int fib6_rule_configure(struct fib_rule *rule, struct sk_buff *skb,
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index c30da9a..b8b5132 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1855,7 +1855,8 @@ int inet6_rtm_delroute(struct sk_buff *s
 
 	if (inet6_rtm_to_rtmsg(r, arg, &rtmsg))
 		return -EINVAL;
-	return ip6_route_del(&rtmsg, nlh, arg, &NETLINK_CB(skb), r->rtm_table);
+	return ip6_route_del(&rtmsg, nlh, arg, &NETLINK_CB(skb),
+			     rtm_get_table(arg, r->rtm_table));
 }
 
 int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
@@ -1865,7 +1866,8 @@ int inet6_rtm_newroute(struct sk_buff *s
 
 	if (inet6_rtm_to_rtmsg(r, arg, &rtmsg))
 		return -EINVAL;
-	return ip6_route_add(&rtmsg, nlh, arg, &NETLINK_CB(skb), r->rtm_table);
+	return ip6_route_add(&rtmsg, nlh, arg, &NETLINK_CB(skb),
+			     rtm_get_table(arg, r->rtm_table));
 }
 
 struct rt6_rtnl_dump_arg
@@ -1883,6 +1885,7 @@ static int rt6_fill_node(struct sk_buff 
 	struct nlmsghdr  *nlh;
 	unsigned char	 *b = skb->tail;
 	struct rta_cacheinfo ci;
+	u32 table;
 
 	if (prefix) {	/* user wants prefix routes only */
 		if (!(rt->rt6i_flags & RTF_PREFIX_RT)) {
@@ -1898,9 +1901,11 @@ static int rt6_fill_node(struct sk_buff 
 	rtm->rtm_src_len = rt->rt6i_src.plen;
 	rtm->rtm_tos = 0;
 	if (rt->rt6i_table)
-		rtm->rtm_table = rt->rt6i_table->tb6_id;
+		table = rt->rt6i_table->tb6_id;
 	else
-		rtm->rtm_table = RT6_TABLE_UNSPEC;
+		table = RT6_TABLE_UNSPEC;
+	rtm->rtm_table = table;
+	RTA_PUT_U32(skb, RTA_TABLE, table);
 	if (rt->rt6i_flags&RTF_REJECT)
 		rtm->rtm_type = RTN_UNREACHABLE;
 	else if (rt->rt6i_dev && (rt->rt6i_dev->flags&IFF_LOOPBACK))

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [IPV4 03/06]: Increase number of possible routing tables to 2^32
  2006-08-10 19:29 [NET 00/06]: Increase number of possible routing tables Patrick McHardy
  2006-08-10 19:29 ` [NET 01/06]: Use u32 for routing table IDs Patrick McHardy
  2006-08-10 19:30 ` [NET 02/06]: Introduce RTA_TABLE/FRA_TABLE attributes Patrick McHardy
@ 2006-08-10 19:30 ` Patrick McHardy
  2006-08-11  6:10   ` David Miller
  2006-08-10 19:30 ` [IPV6 04/06]: " Patrick McHardy
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Patrick McHardy @ 2006-08-10 19:30 UTC (permalink / raw)
  To: davem; +Cc: netdev, Patrick McHardy, patrick

[IPV4]: Increase number of possible routing tables to 2^32

Increase the number of possible routing tables to 2^32 by replacing the
fixed sized array of pointers by a hash table and replacing iterations
over all possible table IDs by hash table walking.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 148d1ca7c199005b5a92f8154a7caf3f78529672
tree ee025abdbab6fe6a4eac916791b8a06f0622d71e
parent a9fe50e925cdc0471b88bcf6f3cc18278b63c984
author Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:52:30 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:52:30 +0200

 include/net/ip_fib.h    |   25 ++----------
 net/ipv4/fib_frontend.c |  102 +++++++++++++++++++++++++++++++----------------
 net/ipv4/fib_hash.c     |   26 ++++++------
 net/ipv4/fib_rules.c    |    4 +-
 net/ipv4/fib_trie.c     |   26 ++++++------
 5 files changed, 101 insertions(+), 82 deletions(-)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 0dcbf16..8e9ba56 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -150,6 +150,7 @@ #define FIB_RES_NETMASK(res)	        (0)
 #endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */
 
 struct fib_table {
+	struct hlist_node tb_hlist;
 	u32		tb_id;
 	unsigned	tb_stamp;
 	int		(*tb_lookup)(struct fib_table *tb, const struct flowi *flp, struct fib_result *res);
@@ -200,29 +201,13 @@ static inline void fib_select_default(co
 }
 
 #else /* CONFIG_IP_MULTIPLE_TABLES */
-#define ip_fib_local_table (fib_tables[RT_TABLE_LOCAL])
-#define ip_fib_main_table (fib_tables[RT_TABLE_MAIN])
+#define ip_fib_local_table fib_get_table(RT_TABLE_LOCAL)
+#define ip_fib_main_table fib_get_table(RT_TABLE_MAIN)
 
-extern struct fib_table * fib_tables[RT_TABLE_MAX+1];
 extern int fib_lookup(struct flowi *flp, struct fib_result *res);
-extern struct fib_table *__fib_new_table(u32 id);
-
-static inline struct fib_table *fib_get_table(u32 id)
-{
-	if (id == 0)
-		id = RT_TABLE_MAIN;
-
-	return fib_tables[id];
-}
-
-static inline struct fib_table *fib_new_table(u32 id)
-{
-	if (id == 0)
-		id = RT_TABLE_MAIN;
-
-	return fib_tables[id] ? : __fib_new_table(id);
-}
 
+extern struct fib_table *fib_new_table(u32 id);
+extern struct fib_table *fib_get_table(u32 id);
 extern void fib_select_default(const struct flowi *flp, struct fib_result *res);
 
 #endif /* CONFIG_IP_MULTIPLE_TABLES */
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 2696ede..ad4c14f 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -37,6 +37,7 @@ #include <linux/if_arp.h>
 #include <linux/skbuff.h>
 #include <linux/netlink.h>
 #include <linux/init.h>
+#include <linux/list.h>
 
 #include <net/ip.h>
 #include <net/protocol.h>
@@ -51,48 +52,67 @@ #define FFprint(a...) printk(KERN_DEBUG 
 
 #ifndef CONFIG_IP_MULTIPLE_TABLES
 
-#define RT_TABLE_MIN RT_TABLE_MAIN
-
 struct fib_table *ip_fib_local_table;
 struct fib_table *ip_fib_main_table;
 
-#else
+#define FIB_TABLE_HASHSZ 1
+static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ];
 
-#define RT_TABLE_MIN 1
+#else
 
-struct fib_table *fib_tables[RT_TABLE_MAX+1];
+#define FIB_TABLE_HASHSZ 256
+static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ];
 
-struct fib_table *__fib_new_table(u32 id)
+struct fib_table *fib_new_table(u32 id)
 {
 	struct fib_table *tb;
+	unsigned int h;
 
+	if (id == 0)
+		id = RT_TABLE_MAIN;
+	tb = fib_get_table(id);
+	if (tb)
+		return tb;
 	tb = fib_hash_init(id);
 	if (!tb)
 		return NULL;
-	fib_tables[id] = tb;
+	h = id & (FIB_TABLE_HASHSZ - 1);
+	hlist_add_head_rcu(&tb->tb_hlist, &fib_table_hash[h]);
 	return tb;
 }
 
+struct fib_table *fib_get_table(u32 id)
+{
+	struct fib_table *tb;
+	struct hlist_node *node;
+	unsigned int h;
 
+	if (id == 0)
+		id = RT_TABLE_MAIN;
+	h = id & (FIB_TABLE_HASHSZ - 1);
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(tb, node, &fib_table_hash[h], tb_hlist) {
+		if (tb->tb_id == id) {
+			rcu_read_unlock();
+			return tb;
+		}
+	}
+	rcu_read_unlock();
+	return NULL;
+}
 #endif /* CONFIG_IP_MULTIPLE_TABLES */
 
-
 static void fib_flush(void)
 {
 	int flushed = 0;
-#ifdef CONFIG_IP_MULTIPLE_TABLES
 	struct fib_table *tb;
-	u32 id;
+	struct hlist_node *node;
+	unsigned int h;
 
-	for (id = RT_TABLE_MAX; id>0; id--) {
-		if ((tb = fib_get_table(id))==NULL)
-			continue;
-		flushed += tb->tb_flush(tb);
+	for (h = 0; h < FIB_TABLE_HASHSZ; h++) {
+		hlist_for_each_entry(tb, node, &fib_table_hash[h], tb_hlist)
+			flushed += tb->tb_flush(tb);
 	}
-#else /* CONFIG_IP_MULTIPLE_TABLES */
-	flushed += ip_fib_main_table->tb_flush(ip_fib_main_table);
-	flushed += ip_fib_local_table->tb_flush(ip_fib_local_table);
-#endif /* CONFIG_IP_MULTIPLE_TABLES */
 
 	if (flushed)
 		rt_cache_flush(-1);
@@ -334,29 +354,37 @@ int inet_rtm_newroute(struct sk_buff *sk
 
 int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
 {
-	u32 t;
-	u32 s_t;
+	unsigned int h, s_h;
+	unsigned int e = 0, s_e;
 	struct fib_table *tb;
+	struct hlist_node *node;
+	int dumped = 0;
 
 	if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) &&
 	    ((struct rtmsg*)NLMSG_DATA(cb->nlh))->rtm_flags&RTM_F_CLONED)
 		return ip_rt_dump(skb, cb);
 
-	s_t = cb->args[0];
-	if (s_t == 0)
-		s_t = cb->args[0] = RT_TABLE_MIN;
-
-	for (t=s_t; t<=RT_TABLE_MAX; t++) {
-		if (t < s_t) continue;
-		if (t > s_t)
-			memset(&cb->args[1], 0, sizeof(cb->args)-sizeof(cb->args[0]));
-		if ((tb = fib_get_table(t))==NULL)
-			continue;
-		if (tb->tb_dump(tb, skb, cb) < 0) 
-			break;
+	s_h = cb->args[0];
+	s_e = cb->args[1];
+
+	for (h = s_h; h < FIB_TABLE_HASHSZ; h++, s_e = 0) {
+		e = 0;
+		hlist_for_each_entry(tb, node, &fib_table_hash[h], tb_hlist) {
+			if (e < s_e)
+				goto next;
+			if (dumped)
+				memset(&cb->args[2], 0, sizeof(cb->args) -
+				                 2 * sizeof(cb->args[0]));
+			if (tb->tb_dump(tb, skb, cb) < 0)
+				goto out;
+			dumped = 1;
+next:
+			e++;
+		}
 	}
-
-	cb->args[0] = t;
+out:
+	cb->args[1] = e;
+	cb->args[0] = h;
 
 	return skb->len;
 }
@@ -654,9 +682,15 @@ static struct notifier_block fib_netdev_
 
 void __init ip_fib_init(void)
 {
+	unsigned int i;
+
+	for (i = 0; i < FIB_TABLE_HASHSZ; i++)
+		INIT_HLIST_HEAD(&fib_table_hash[i]);
 #ifndef CONFIG_IP_MULTIPLE_TABLES
 	ip_fib_local_table = fib_hash_init(RT_TABLE_LOCAL);
+	hlist_add_head_rcu(&ip_fib_local_table->tb_hlist, &fib_table_hash[0]);
 	ip_fib_main_table  = fib_hash_init(RT_TABLE_MAIN);
+	hlist_add_head_rcu(&ip_fib_main_table->tb_hlist, &fib_table_hash[0]);
 #else
 	fib4_rules_init();
 #endif
diff --git a/net/ipv4/fib_hash.c b/net/ipv4/fib_hash.c
index f8d5c80..b5bee1a 100644
--- a/net/ipv4/fib_hash.c
+++ b/net/ipv4/fib_hash.c
@@ -684,7 +684,7 @@ fn_hash_dump_bucket(struct sk_buff *skb,
 	struct fib_node *f;
 	int i, s_i;
 
-	s_i = cb->args[3];
+	s_i = cb->args[4];
 	i = 0;
 	hlist_for_each_entry(f, node, head, fn_hash) {
 		struct fib_alias *fa;
@@ -704,14 +704,14 @@ fn_hash_dump_bucket(struct sk_buff *skb,
 					  fa->fa_tos,
 					  fa->fa_info,
 					  NLM_F_MULTI) < 0) {
-				cb->args[3] = i;
+				cb->args[4] = i;
 				return -1;
 			}
 		next:
 			i++;
 		}
 	}
-	cb->args[3] = i;
+	cb->args[4] = i;
 	return skb->len;
 }
 
@@ -722,21 +722,21 @@ fn_hash_dump_zone(struct sk_buff *skb, s
 {
 	int h, s_h;
 
-	s_h = cb->args[2];
+	s_h = cb->args[3];
 	for (h=0; h < fz->fz_divisor; h++) {
 		if (h < s_h) continue;
 		if (h > s_h)
-			memset(&cb->args[3], 0,
-			       sizeof(cb->args) - 3*sizeof(cb->args[0]));
+			memset(&cb->args[4], 0,
+			       sizeof(cb->args) - 4*sizeof(cb->args[0]));
 		if (fz->fz_hash == NULL ||
 		    hlist_empty(&fz->fz_hash[h]))
 			continue;
 		if (fn_hash_dump_bucket(skb, cb, tb, fz, &fz->fz_hash[h])<0) {
-			cb->args[2] = h;
+			cb->args[3] = h;
 			return -1;
 		}
 	}
-	cb->args[2] = h;
+	cb->args[3] = h;
 	return skb->len;
 }
 
@@ -746,21 +746,21 @@ static int fn_hash_dump(struct fib_table
 	struct fn_zone *fz;
 	struct fn_hash *table = (struct fn_hash*)tb->tb_data;
 
-	s_m = cb->args[1];
+	s_m = cb->args[2];
 	read_lock(&fib_hash_lock);
 	for (fz = table->fn_zone_list, m=0; fz; fz = fz->fz_next, m++) {
 		if (m < s_m) continue;
 		if (m > s_m)
-			memset(&cb->args[2], 0,
-			       sizeof(cb->args) - 2*sizeof(cb->args[0]));
+			memset(&cb->args[3], 0,
+			       sizeof(cb->args) - 3*sizeof(cb->args[0]));
 		if (fn_hash_dump_zone(skb, cb, tb, fz) < 0) {
-			cb->args[1] = m;
+			cb->args[2] = m;
 			read_unlock(&fib_hash_lock);
 			return -1;
 		}
 	}
 	read_unlock(&fib_hash_lock);
-	cb->args[1] = m;
+	cb->args[2] = m;
 	return skb->len;
 }
 
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 0330b9c..ce185ac 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -172,8 +172,8 @@ static struct fib_table *fib_empty_table
 	u32 id;
 
 	for (id = 1; id <= RT_TABLE_MAX; id++)
-		if (fib_tables[id] == NULL)
-			return __fib_new_table(id);
+		if (fib_get_table(id) == NULL)
+			return fib_new_table(id);
 	return NULL;
 }
 
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index b98f211..3e7b0d4 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1848,7 +1848,7 @@ static int fn_trie_dump_fa(t_key key, in
 
 	u32 xkey = htonl(key);
 
-	s_i = cb->args[3];
+	s_i = cb->args[4];
 	i = 0;
 
 	/* rcu_read_lock is hold by caller */
@@ -1870,12 +1870,12 @@ static int fn_trie_dump_fa(t_key key, in
 				  plen,
 				  fa->fa_tos,
 				  fa->fa_info, 0) < 0) {
-			cb->args[3] = i;
+			cb->args[4] = i;
 			return -1;
 		}
 		i++;
 	}
-	cb->args[3] = i;
+	cb->args[4] = i;
 	return skb->len;
 }
 
@@ -1886,14 +1886,14 @@ static int fn_trie_dump_plen(struct trie
 	struct list_head *fa_head;
 	struct leaf *l = NULL;
 
-	s_h = cb->args[2];
+	s_h = cb->args[3];
 
 	for (h = 0; (l = nextleaf(t, l)) != NULL; h++) {
 		if (h < s_h)
 			continue;
 		if (h > s_h)
-			memset(&cb->args[3], 0,
-			       sizeof(cb->args) - 3*sizeof(cb->args[0]));
+			memset(&cb->args[4], 0,
+			       sizeof(cb->args) - 4*sizeof(cb->args[0]));
 
 		fa_head = get_fa_head(l, plen);
 
@@ -1904,11 +1904,11 @@ static int fn_trie_dump_plen(struct trie
 			continue;
 
 		if (fn_trie_dump_fa(l->key, plen, fa_head, tb, skb, cb)<0) {
-			cb->args[2] = h;
+			cb->args[3] = h;
 			return -1;
 		}
 	}
-	cb->args[2] = h;
+	cb->args[3] = h;
 	return skb->len;
 }
 
@@ -1917,23 +1917,23 @@ static int fn_trie_dump(struct fib_table
 	int m, s_m;
 	struct trie *t = (struct trie *) tb->tb_data;
 
-	s_m = cb->args[1];
+	s_m = cb->args[2];
 
 	rcu_read_lock();
 	for (m = 0; m <= 32; m++) {
 		if (m < s_m)
 			continue;
 		if (m > s_m)
-			memset(&cb->args[2], 0,
-				sizeof(cb->args) - 2*sizeof(cb->args[0]));
+			memset(&cb->args[3], 0,
+				sizeof(cb->args) - 3*sizeof(cb->args[0]));
 
 		if (fn_trie_dump_plen(t, 32-m, tb, skb, cb)<0) {
-			cb->args[1] = m;
+			cb->args[2] = m;
 			goto out;
 		}
 	}
 	rcu_read_unlock();
-	cb->args[1] = m;
+	cb->args[2] = m;
 	return skb->len;
 out:
 	rcu_read_unlock();

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [IPV6 04/06]: Increase number of possible routing tables to 2^32
  2006-08-10 19:29 [NET 00/06]: Increase number of possible routing tables Patrick McHardy
                   ` (2 preceding siblings ...)
  2006-08-10 19:30 ` [IPV4 03/06]: Increase number of possible routing tables to 2^32 Patrick McHardy
@ 2006-08-10 19:30 ` Patrick McHardy
  2006-08-11  6:11   ` David Miller
  2006-08-10 19:30 ` [DECNET 05/06]: " Patrick McHardy
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Patrick McHardy @ 2006-08-10 19:30 UTC (permalink / raw)
  To: davem; +Cc: netdev, Patrick McHardy, patrick

[IPV6]: Increase number of possible routing tables to 2^32

Increase number of possible routing tables to 2^32 by replacing iterations
over all possible table IDs by hash table walking.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit cad398a8f3ef363abba9e6450dded94a022c96fa
tree 4fea9c50650ab65d942dca9c2545d1810b227839
parent 148d1ca7c199005b5a92f8154a7caf3f78529672
author Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:53:33 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:53:33 +0200

 include/net/ip6_route.h |    7 ++
 net/ipv6/ip6_fib.c      |  171 ++++++++++++++++++++++++++++++++++++++++++-----
 net/ipv6/route.c        |  128 -----------------------------------
 3 files changed, 159 insertions(+), 147 deletions(-)

diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index 9bfa3cc..01bfe40 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -137,6 +137,13 @@ extern int inet6_rtm_newroute(struct sk_
 extern int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
 extern int inet6_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
 
+struct rt6_rtnl_dump_arg
+{
+	struct sk_buff *skb;
+	struct netlink_callback *cb;
+};
+
+extern int rt6_dump_route(struct rt6_info *rt, void *p_arg);
 extern void rt6_ifdown(struct net_device *dev);
 extern void rt6_mtu_change(struct net_device *dev, unsigned mtu);
 
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 1f23161..bececbe 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -158,7 +158,26 @@ static struct fib6_table fib6_main_tbl =
 };
 
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
+#define FIB_TABLE_HASHSZ 256
+#else
+#define FIB_TABLE_HASHSZ 1
+#endif
+static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ];
+
+static void fib6_link_table(struct fib6_table *tb)
+{
+	unsigned int h;
+
+	h = tb->tb6_id & (FIB_TABLE_HASHSZ - 1);
 
+	/*
+	 * No protection necessary, this is the only list mutatation
+	 * operation, tables never disappear once they exist.
+	 */
+	hlist_add_head_rcu(&tb->tb6_hlist, &fib_table_hash[h]);
+}
+
+#ifdef CONFIG_IPV6_MULTIPLE_TABLES
 static struct fib6_table fib6_local_tbl = {
 	.tb6_id		= RT6_TABLE_LOCAL,
 	.tb6_lock	= RW_LOCK_UNLOCKED,
@@ -168,9 +187,6 @@ static struct fib6_table fib6_local_tbl 
 	},
 };
 
-#define FIB_TABLE_HASHSZ 256
-static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ];
-
 static struct fib6_table *fib6_alloc_table(u32 id)
 {
 	struct fib6_table *table;
@@ -186,19 +202,6 @@ static struct fib6_table *fib6_alloc_tab
 	return table;
 }
 
-static void fib6_link_table(struct fib6_table *tb)
-{
-	unsigned int h;
-
-	h = tb->tb6_id & (FIB_TABLE_HASHSZ - 1);
-
-	/*
-	 * No protection necessary, this is the only list mutatation
-	 * operation, tables never disappear once they exist.
-	 */
-	hlist_add_head_rcu(&tb->tb6_hlist, &fib_table_hash[h]);
-}
-
 struct fib6_table *fib6_new_table(u32 id)
 {
 	struct fib6_table *tb;
@@ -263,10 +266,135 @@ struct dst_entry *fib6_rule_lookup(struc
 
 static void __init fib6_tables_init(void)
 {
+	fib6_link_table(&fib6_main_tbl);
 }
 
 #endif
 
+static int fib6_dump_node(struct fib6_walker_t *w)
+{
+	int res;
+	struct rt6_info *rt;
+
+	for (rt = w->leaf; rt; rt = rt->u.next) {
+		res = rt6_dump_route(rt, w->args);
+		if (res < 0) {
+			/* Frame is full, suspend walking */
+			w->leaf = rt;
+			return 1;
+		}
+		BUG_TRAP(res!=0);
+	}
+	w->leaf = NULL;
+	return 0;
+}
+
+static void fib6_dump_end(struct netlink_callback *cb)
+{
+	struct fib6_walker_t *w = (void*)cb->args[2];
+
+	if (w) {
+		cb->args[2] = 0;
+		kfree(w);
+	}
+	cb->done = (void*)cb->args[3];
+	cb->args[1] = 3;
+}
+
+static int fib6_dump_done(struct netlink_callback *cb)
+{
+	fib6_dump_end(cb);
+	return cb->done ? cb->done(cb) : 0;
+}
+
+static int fib6_dump_table(struct fib6_table *table, struct sk_buff *skb,
+			   struct netlink_callback *cb)
+{
+	struct fib6_walker_t *w;
+	int res;
+
+	w = (void *)cb->args[2];
+	w->root = &table->tb6_root;
+
+	if (cb->args[4] == 0) {
+		read_lock_bh(&table->tb6_lock);
+		res = fib6_walk(w);
+		read_unlock_bh(&table->tb6_lock);
+		if (res > 0)
+			cb->args[4] = 1;
+	} else {
+		read_lock_bh(&table->tb6_lock);
+		res = fib6_walk_continue(w);
+		read_unlock_bh(&table->tb6_lock);
+		if (res != 0) {
+			if (res < 0)
+				fib6_walker_unlink(w);
+			goto end;
+		}
+		fib6_walker_unlink(w);
+		cb->args[4] = 0;
+	}
+end:
+	return res;
+}
+
+int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	unsigned int h, s_h;
+	unsigned int e = 0, s_e;
+	struct rt6_rtnl_dump_arg arg;
+	struct fib6_walker_t *w;
+	struct fib6_table *tb;
+	struct hlist_node *node;
+	int res = 0;
+
+	s_h = cb->args[0];
+	s_e = cb->args[1];
+
+	w = (void *)cb->args[2];
+	if (w == NULL) {
+		/* New dump:
+		 *
+		 * 1. hook callback destructor.
+		 */
+		cb->args[3] = (long)cb->done;
+		cb->done = fib6_dump_done;
+
+		/*
+		 * 2. allocate and initialize walker.
+		 */
+		w = kzalloc(sizeof(*w), GFP_ATOMIC);
+		if (w == NULL)
+			return -ENOMEM;
+		w->func = fib6_dump_node;
+		cb->args[2] = (long)w;
+	}
+
+	arg.skb = skb;
+	arg.cb = cb;
+	w->args = &arg;
+
+	for (h = s_h; h < FIB_TABLE_HASHSZ; h++, s_e = 0) {
+		e = 0;
+		hlist_for_each_entry(tb, node, &fib_table_hash[h], tb6_hlist) {
+			if (e < s_e)
+				goto next;
+			res = fib6_dump_table(tb, skb, cb);
+			if (res != 0)
+				goto out;
+next:
+			e++;
+		}
+	}
+out:
+	cb->args[1] = e;
+	cb->args[0] = h;
+
+	res = res < 0 ? res : skb->len;
+	if (res <= 0)
+		fib6_dump_end(cb);
+	return res;
+}
 
 /*
  *	Routing Table
@@ -1187,17 +1315,20 @@ static void fib6_clean_tree(struct fib6_
 void fib6_clean_all(int (*func)(struct rt6_info *, void *arg),
 		    int prune, void *arg)
 {
-	int i;
 	struct fib6_table *table;
+	struct hlist_node *node;
+	unsigned int h;
 
-	for (i = FIB6_TABLE_MIN; i <= FIB6_TABLE_MAX; i++) {
-		table = fib6_get_table(i);
-		if (table != NULL) {
+	rcu_read_lock();
+	for (h = 0; h < FIB_TABLE_HASHSZ; h++) {
+		hlist_for_each_entry_rcu(table, node, &fib_table_hash[h],
+					 tb6_hlist) {
 			write_lock_bh(&table->tb6_lock);
 			fib6_clean_tree(&table->tb6_root, func, prune, arg);
 			write_unlock_bh(&table->tb6_lock);
 		}
 	}
+	rcu_read_unlock();
 }
 
 static int fib6_prune_clone(struct rt6_info *rt, void *arg)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b8b5132..191d12f 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1870,12 +1870,6 @@ int inet6_rtm_newroute(struct sk_buff *s
 			     rtm_get_table(arg, r->rtm_table));
 }
 
-struct rt6_rtnl_dump_arg
-{
-	struct sk_buff *skb;
-	struct netlink_callback *cb;
-};
-
 static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt,
 			 struct in6_addr *dst, struct in6_addr *src,
 			 int iif, int type, u32 pid, u32 seq,
@@ -1972,7 +1966,7 @@ rtattr_failure:
 	return -1;
 }
 
-static int rt6_dump_route(struct rt6_info *rt, void *p_arg)
+int rt6_dump_route(struct rt6_info *rt, void *p_arg)
 {
 	struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg;
 	int prefix;
@@ -1988,126 +1982,6 @@ static int rt6_dump_route(struct rt6_inf
 		     prefix, NLM_F_MULTI);
 }
 
-static int fib6_dump_node(struct fib6_walker_t *w)
-{
-	int res;
-	struct rt6_info *rt;
-
-	for (rt = w->leaf; rt; rt = rt->u.next) {
-		res = rt6_dump_route(rt, w->args);
-		if (res < 0) {
-			/* Frame is full, suspend walking */
-			w->leaf = rt;
-			return 1;
-		}
-		BUG_TRAP(res!=0);
-	}
-	w->leaf = NULL;
-	return 0;
-}
-
-static void fib6_dump_end(struct netlink_callback *cb)
-{
-	struct fib6_walker_t *w = (void*)cb->args[0];
-
-	if (w) {
-		cb->args[0] = 0;
-		kfree(w);
-	}
-	cb->done = (void*)cb->args[1];
-	cb->args[1] = 0;
-}
-
-static int fib6_dump_done(struct netlink_callback *cb)
-{
-	fib6_dump_end(cb);
-	return cb->done ? cb->done(cb) : 0;
-}
-
-int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
-{
-	struct fib6_table *table;
-	struct rt6_rtnl_dump_arg arg;
-	struct fib6_walker_t *w;
-	int i, res = 0;
-
-	arg.skb = skb;
-	arg.cb = cb;
-
-	/*
-	 * cb->args[0] = pointer to walker structure
-	 * cb->args[1] = saved cb->done() pointer
-	 * cb->args[2] = current table being dumped
-	 */
-
-	w = (void*)cb->args[0];
-	if (w == NULL) {
-		/* New dump:
-		 * 
-		 * 1. hook callback destructor.
-		 */
-		cb->args[1] = (long)cb->done;
-		cb->done = fib6_dump_done;
-
-		/*
-		 * 2. allocate and initialize walker.
-		 */
-		w = kzalloc(sizeof(*w), GFP_ATOMIC);
-		if (w == NULL)
-			return -ENOMEM;
-		w->func = fib6_dump_node;
-		w->args = &arg;
-		cb->args[0] = (long)w;
-		cb->args[2] = FIB6_TABLE_MIN;
-	} else {
-		w->args = &arg;
-		i = cb->args[2];
-		if (i > FIB6_TABLE_MAX)
-			goto end;
-
-		table = fib6_get_table(i);
-		if (table != NULL) {
-			read_lock_bh(&table->tb6_lock);
-			w->root = &table->tb6_root;
-			res = fib6_walk_continue(w);
-			read_unlock_bh(&table->tb6_lock);
-			if (res != 0) {
-				if (res < 0)
-					fib6_walker_unlink(w);
-				goto end;
-			}
-		}
-
-		fib6_walker_unlink(w);
-		cb->args[2] = ++i;
-	}
-
-	for (i = cb->args[2]; i <= FIB6_TABLE_MAX; i++) {
-		table = fib6_get_table(i);
-		if (table == NULL)
-			continue;
-
-		read_lock_bh(&table->tb6_lock);
-		w->root = &table->tb6_root;
-		res = fib6_walk(w);
-		read_unlock_bh(&table->tb6_lock);
-		if (res)
-			break;
-	}
-end:
-	cb->args[2] = i;
-
-	res = res < 0 ? res : skb->len;
-	/* res < 0 is an error. (really, impossible)
-	   res == 0 means that dump is complete, but skb still can contain data.
-	   res > 0 dump is not complete, but frame is full.
-	 */
-	/* Destroy walker, if dump of this table is complete. */
-	if (res <= 0)
-		fib6_dump_end(cb);
-	return res;
-}
-
 int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg)
 {
 	struct rtattr **rta = arg;

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [DECNET 05/06]: Increase number of possible routing tables to 2^32
  2006-08-10 19:29 [NET 00/06]: Increase number of possible routing tables Patrick McHardy
                   ` (3 preceding siblings ...)
  2006-08-10 19:30 ` [IPV6 04/06]: " Patrick McHardy
@ 2006-08-10 19:30 ` Patrick McHardy
  2006-08-11  6:11   ` David Miller
  2006-08-10 19:30 ` [NET 06/06]: Increate RT_TABLE_MAX " Patrick McHardy
  2006-08-11  6:39 ` [NET 00/06]: Increase number of possible routing tables Michael Tokarev
  6 siblings, 1 reply; 19+ messages in thread
From: Patrick McHardy @ 2006-08-10 19:30 UTC (permalink / raw)
  To: davem; +Cc: netdev, Patrick McHardy, patrick

[DECNET]: Increase number of possible routing tables to 2^32

Increase the number of possible routing tables to 2^32 by replacing the
fixed sized array of pointers by a hash table and replacing iterations
over all possible table IDs by hash table walking.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 9203e4cdab89d96c474c6a903ef9a1f47c7eee07
tree e0c6a2c5e3a691919863b4eb871fc3a25ebd5d44
parent cad398a8f3ef363abba9e6450dded94a022c96fa
author Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:54:19 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:54:19 +0200

 include/net/dn_fib.h  |    3 -
 net/decnet/dn_fib.c   |   49 -------------------
 net/decnet/dn_rules.c |    2 -
 net/decnet/dn_table.c |  125 ++++++++++++++++++++++++++++++++++++-------------
 4 files changed, 93 insertions(+), 86 deletions(-)

diff --git a/include/net/dn_fib.h b/include/net/dn_fib.h
index cd9c378..d97aa10 100644
--- a/include/net/dn_fib.h
+++ b/include/net/dn_fib.h
@@ -94,6 +94,7 @@ #define DN_FIB_INFO(f) ((f)->fn_info)
 
 
 struct dn_fib_table {
+	struct hlist_node hlist;
 	u32 n;
 
 	int (*insert)(struct dn_fib_table *t, struct rtmsg *r, 
@@ -177,8 +178,6 @@ static inline void dn_fib_res_put(struct
 		fib_rule_put(res->r);
 }
 
-extern struct dn_fib_table *dn_fib_tables[];
-
 #else /* Endnode */
 
 #define dn_fib_init()  do { } while(0)
diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c
index fb59637..5ccca3e 100644
--- a/net/decnet/dn_fib.c
+++ b/net/decnet/dn_fib.c
@@ -532,39 +532,6 @@ int dn_fib_rtm_newroute(struct sk_buff *
 	return -ENOBUFS;
 }
 
-
-int dn_fib_dump(struct sk_buff *skb, struct netlink_callback *cb)
-{
-	u32 t;
-	u32 s_t;
-	struct dn_fib_table *tb;
-
-	if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) &&
-		((struct rtmsg *)NLMSG_DATA(cb->nlh))->rtm_flags&RTM_F_CLONED)
-			return dn_cache_dump(skb, cb);
-
-	s_t = cb->args[0];
-	if (s_t == 0)
-		s_t = cb->args[0] = RT_MIN_TABLE;
-
-	for(t = s_t; t <= RT_TABLE_MAX; t++) {
-		if (t < s_t)
-			continue;
-		if (t > s_t)
-			memset(&cb->args[1], 0,
-			       sizeof(cb->args) - sizeof(cb->args[0]));
-		tb = dn_fib_get_table(t, 0);
-		if (tb == NULL)
-			continue;
-		if (tb->dump(tb, skb, cb) < 0)
-			break;
-	}
-
-	cb->args[0] = t;
-
-	return skb->len;
-}
-
 static void fib_magic(int cmd, int type, __le16 dst, int dst_len, struct dn_ifaddr *ifa)
 {
 	struct dn_fib_table *tb;
@@ -762,22 +729,6 @@ int dn_fib_sync_up(struct net_device *de
         return ret;
 }
 
-void dn_fib_flush(void)
-{
-        int flushed = 0;
-        struct dn_fib_table *tb;
-        u32 id;
-
-        for(id = RT_TABLE_MAX; id > 0; id--) {
-                if ((tb = dn_fib_get_table(id, 0)) == NULL)
-                        continue;
-                flushed += tb->flush(tb);
-        }
-
-        if (flushed)
-                dn_rt_cache_flush(-1);
-}
-
 static struct notifier_block dn_fib_dnaddr_notifier = {
 	.notifier_call = dn_fib_dnaddr_event,
 };
diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c
index 096f127..878312f 100644
--- a/net/decnet/dn_rules.c
+++ b/net/decnet/dn_rules.c
@@ -210,7 +210,7 @@ unsigned dnet_addr_type(__le16 addr)
 	struct flowi fl = { .nl_u = { .dn_u = { .daddr = addr } } };
 	struct dn_fib_res res;
 	unsigned ret = RTN_UNICAST;
-	struct dn_fib_table *tb = dn_fib_tables[RT_TABLE_LOCAL];
+	struct dn_fib_table *tb = dn_fib_get_table(RT_TABLE_LOCAL, 0);
 
 	res.r = NULL;
 
diff --git a/net/decnet/dn_table.c b/net/decnet/dn_table.c
index d2ad791..5701a3f 100644
--- a/net/decnet/dn_table.c
+++ b/net/decnet/dn_table.c
@@ -75,9 +75,9 @@ #define DN_FIB_SCAN_KEY(f, fp, key) \
 for( ; ((f) = *(fp)) != NULL && dn_key_eq((f)->fn_key, (key)); (fp) = &(f)->fn_next)
 
 #define RT_TABLE_MIN 1
-
+#define DN_FIB_TABLE_HASHSZ 256
+static struct hlist_head dn_fib_table_hash[DN_FIB_TABLE_HASHSZ];
 static DEFINE_RWLOCK(dn_fib_tables_lock);
-struct dn_fib_table *dn_fib_tables[RT_TABLE_MAX + 1];
 
 static kmem_cache_t *dn_hash_kmem __read_mostly;
 static int dn_fib_hash_zombies;
@@ -357,7 +357,7 @@ static __inline__ int dn_hash_dump_bucke
 {
 	int i, s_i;
 
-	s_i = cb->args[3];
+	s_i = cb->args[4];
 	for(i = 0; f; i++, f = f->fn_next) {
 		if (i < s_i)
 			continue;
@@ -370,11 +370,11 @@ static __inline__ int dn_hash_dump_bucke
 				(f->fn_state & DN_S_ZOMBIE) ? 0 : f->fn_type,
 				f->fn_scope, &f->fn_key, dz->dz_order, 
 				f->fn_info, NLM_F_MULTI) < 0) {
-			cb->args[3] = i;
+			cb->args[4] = i;
 			return -1;
 		}
 	}
-	cb->args[3] = i;
+	cb->args[4] = i;
 	return skb->len;
 }
 
@@ -385,20 +385,20 @@ static __inline__ int dn_hash_dump_zone(
 {
 	int h, s_h;
 
-	s_h = cb->args[2];
+	s_h = cb->args[3];
 	for(h = 0; h < dz->dz_divisor; h++) {
 		if (h < s_h)
 			continue;
 		if (h > s_h)
-			memset(&cb->args[3], 0, sizeof(cb->args) - 3*sizeof(cb->args[0]));
+			memset(&cb->args[4], 0, sizeof(cb->args) - 4*sizeof(cb->args[0]));
 		if (dz->dz_hash == NULL || dz->dz_hash[h] == NULL)
 			continue;
 		if (dn_hash_dump_bucket(skb, cb, tb, dz, dz->dz_hash[h]) < 0) {
-			cb->args[2] = h;
+			cb->args[3] = h;
 			return -1;
 		}
 	}
-	cb->args[2] = h;
+	cb->args[3] = h;
 	return skb->len;
 }
 
@@ -409,26 +409,63 @@ static int dn_fib_table_dump(struct dn_f
 	struct dn_zone *dz;
 	struct dn_hash *table = (struct dn_hash *)tb->data;
 
-	s_m = cb->args[1];
+	s_m = cb->args[2];
 	read_lock(&dn_fib_tables_lock);
 	for(dz = table->dh_zone_list, m = 0; dz; dz = dz->dz_next, m++) {
 		if (m < s_m)
 			continue;
 		if (m > s_m)
-			memset(&cb->args[2], 0, sizeof(cb->args) - 2*sizeof(cb->args[0]));
+			memset(&cb->args[3], 0, sizeof(cb->args) - 3*sizeof(cb->args[0]));
 
 		if (dn_hash_dump_zone(skb, cb, tb, dz) < 0) {
-			cb->args[1] = m;
+			cb->args[2] = m;
 			read_unlock(&dn_fib_tables_lock);
 			return -1;
 		}
 	}
 	read_unlock(&dn_fib_tables_lock);
-	cb->args[1] = m;
+	cb->args[2] = m;
 
         return skb->len;
 }
 
+int dn_fib_dump(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	unsigned int h, s_h;
+	unsigned int e = 0, s_e;
+	struct dn_fib_table *tb;
+	struct hlist_node *node;
+	int dumped = 0;
+
+	if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) &&
+		((struct rtmsg *)NLMSG_DATA(cb->nlh))->rtm_flags&RTM_F_CLONED)
+			return dn_cache_dump(skb, cb);
+
+	s_h = cb->args[0];
+	s_e = cb->args[1];
+
+	for (h = s_h; h < DN_FIB_TABLE_HASHSZ; h++, s_h = 0) {
+		e = 0;
+		hlist_for_each_entry(tb, node, &dn_fib_table_hash[h], hlist) {
+			if (e < s_e)
+				goto next;
+			if (dumped)
+				memset(&cb->args[2], 0, sizeof(cb->args) -
+				                 2 * sizeof(cb->args[0]));
+			if (tb->dump(tb, skb, cb) < 0)
+				goto out;
+			dumped = 1;
+next:
+			e++;
+		}
+	}
+out:
+	cb->args[1] = e;
+	cb->args[0] = h;
+
+	return skb->len;
+}
+
 static int dn_fib_table_insert(struct dn_fib_table *tb, struct rtmsg *r, struct dn_kern_rta *rta, struct nlmsghdr *n, struct netlink_skb_parms *req)
 {
 	struct dn_hash *table = (struct dn_hash *)tb->data;
@@ -740,6 +777,8 @@ out:
 struct dn_fib_table *dn_fib_get_table(u32 n, int create)
 {
         struct dn_fib_table *t;
+	struct hlist_node *node;
+	unsigned int h;
 
         if (n < RT_TABLE_MIN)
                 return NULL;
@@ -747,8 +786,15 @@ struct dn_fib_table *dn_fib_get_table(u3
         if (n > RT_TABLE_MAX)
                 return NULL;
 
-        if (dn_fib_tables[n]) 
-                return dn_fib_tables[n];
+	h = n & (DN_FIB_TABLE_HASHSZ - 1);
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(t, node, &dn_fib_table_hash[h], hlist) {
+		if (t->n == n) {
+			rcu_read_unlock();
+			return t;
+		}
+	}
+	rcu_read_unlock();
 
         if (!create)
                 return NULL;
@@ -769,33 +815,37 @@ struct dn_fib_table *dn_fib_get_table(u3
         t->flush  = dn_fib_table_flush;
         t->dump = dn_fib_table_dump;
 	memset(t->data, 0, sizeof(struct dn_hash));
-        dn_fib_tables[n] = t;
+	hlist_add_head_rcu(&t->hlist, &dn_fib_table_hash[h]);
 
         return t;
 }
 
-static void dn_fib_del_tree(u32 n)
-{
-	struct dn_fib_table *t;
-
-	write_lock(&dn_fib_tables_lock);
-	t = dn_fib_tables[n];
-	dn_fib_tables[n] = NULL;
-	write_unlock(&dn_fib_tables_lock);
-
-	kfree(t);
-}
-
 struct dn_fib_table *dn_fib_empty_table(void)
 {
         u32 id;
 
         for(id = RT_TABLE_MIN; id <= RT_TABLE_MAX; id++)
-                if (dn_fib_tables[id] == NULL)
+		if (dn_fib_get_table(id, 0) == NULL)
                         return dn_fib_get_table(id, 1);
         return NULL;
 }
 
+void dn_fib_flush(void)
+{
+        int flushed = 0;
+        struct dn_fib_table *tb;
+	struct hlist_node *node;
+	unsigned int h;
+
+	for (h = 0; h < DN_FIB_TABLE_HASHSZ; h++) {
+		hlist_for_each_entry(tb, node, &dn_fib_table_hash[h], hlist)
+	                flushed += tb->flush(tb);
+        }
+
+        if (flushed)
+                dn_rt_cache_flush(-1);
+}
+
 void __init dn_fib_table_init(void)
 {
 	dn_hash_kmem = kmem_cache_create("dn_fib_info_cache",
@@ -806,10 +856,17 @@ void __init dn_fib_table_init(void)
 
 void __exit dn_fib_table_cleanup(void)
 {
-	int i;
-
-	for (i = RT_TABLE_MIN; i <= RT_TABLE_MAX; ++i)
-		dn_fib_del_tree(i);
+	struct dn_fib_table *t;
+	struct hlist_node *node, *next;
+	unsigned int h;
 
-	return;
+	write_lock(&dn_fib_tables_lock);
+	for (h = 0; h < DN_FIB_TABLE_HASHSZ; h++) {
+		hlist_for_each_entry_safe(t, node, next, &dn_fib_table_hash[h],
+		                          hlist) {
+			hlist_del(&t->hlist);
+			kfree(t);
+		}
+	}
+	write_unlock(&dn_fib_tables_lock);
 }

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [NET 06/06]: Increate RT_TABLE_MAX to 2^32
  2006-08-10 19:29 [NET 00/06]: Increase number of possible routing tables Patrick McHardy
                   ` (4 preceding siblings ...)
  2006-08-10 19:30 ` [DECNET 05/06]: " Patrick McHardy
@ 2006-08-10 19:30 ` Patrick McHardy
  2006-08-11  6:12   ` David Miller
  2006-08-11  6:39 ` [NET 00/06]: Increase number of possible routing tables Michael Tokarev
  6 siblings, 1 reply; 19+ messages in thread
From: Patrick McHardy @ 2006-08-10 19:30 UTC (permalink / raw)
  To: davem; +Cc: netdev, Patrick McHardy, patrick

[NET]: Increate RT_TABLE_MAX to 2^32

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit f20cbb83204cd7e2ffa9cf4e8ee8b6353628d5d3
tree 8f0eaa4219506715449e7118037040f396875c99
parent 9203e4cdab89d96c474c6a903ef9a1f47c7eee07
author Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:54:49 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 10 Aug 2006 20:54:49 +0200

 include/linux/rtnetlink.h |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index b01bc8b..a616c68 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -239,10 +239,8 @@ enum rt_class_t
 	RT_TABLE_DEFAULT=253,
 	RT_TABLE_MAIN=254,
 	RT_TABLE_LOCAL=255,
-	__RT_TABLE_MAX
+	RT_TABLE_MAX=0xFFFFFFFF
 };
-#define RT_TABLE_MAX (__RT_TABLE_MAX - 1)
-
 
 
 /* Routing message attributes */

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [NET 01/06]: Use u32 for routing table IDs
  2006-08-10 19:29 ` [NET 01/06]: Use u32 for routing table IDs Patrick McHardy
@ 2006-08-11  6:08   ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2006-08-11  6:08 UTC (permalink / raw)
  To: kaber; +Cc: netdev, patrick

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 10 Aug 2006 21:29:59 +0200 (MEST)

> [NET]: Use u32 for routing table IDs
> 
> Use u32 for routing table IDs in net/ipv4 and net/decnet in preparation of
> support for a larger number of routing tables. net/ipv6 already uses u32
> everywhere and needs no further changes. No functional changes are made by
> this patch.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [NET 02/06]: Introduce RTA_TABLE/FRA_TABLE attributes
  2006-08-10 19:30 ` [NET 02/06]: Introduce RTA_TABLE/FRA_TABLE attributes Patrick McHardy
@ 2006-08-11  6:09   ` David Miller
  2006-08-11  7:02   ` Michael Tokarev
  1 sibling, 0 replies; 19+ messages in thread
From: David Miller @ 2006-08-11  6:09 UTC (permalink / raw)
  To: kaber; +Cc: netdev, patrick

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 10 Aug 2006 21:30:00 +0200 (MEST)

> [NET]: Introduce RTA_TABLE/FRA_TABLE attributes
> 
> Introduce RTA_TABLE route attribute and FRA_TABLE routing rule attribute
> to hold 32 bit routing table IDs. Usespace compatibility is provided by
> continuing to accept and send the rtm_table field, but because of its
> limited size it can only carry the low 8 bits of the table ID. This
> implies that if larger IDs are used, _all_ userspace programs using them
> need to use RTA_TABLE.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied, thanks Patrick.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [IPV4 03/06]: Increase number of possible routing tables to 2^32
  2006-08-10 19:30 ` [IPV4 03/06]: Increase number of possible routing tables to 2^32 Patrick McHardy
@ 2006-08-11  6:10   ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2006-08-11  6:10 UTC (permalink / raw)
  To: kaber; +Cc: netdev, patrick

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 10 Aug 2006 21:30:01 +0200 (MEST)

> [IPV4]: Increase number of possible routing tables to 2^32
> 
> Increase the number of possible routing tables to 2^32 by replacing the
> fixed sized array of pointers by a hash table and replacing iterations
> over all possible table IDs by hash table walking.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [IPV6 04/06]: Increase number of possible routing tables to 2^32
  2006-08-10 19:30 ` [IPV6 04/06]: " Patrick McHardy
@ 2006-08-11  6:11   ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2006-08-11  6:11 UTC (permalink / raw)
  To: kaber; +Cc: netdev, patrick

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 10 Aug 2006 21:30:03 +0200 (MEST)

> [IPV6]: Increase number of possible routing tables to 2^32
> 
> Increase number of possible routing tables to 2^32 by replacing iterations
> over all possible table IDs by hash table walking.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [DECNET 05/06]: Increase number of possible routing tables to 2^32
  2006-08-10 19:30 ` [DECNET 05/06]: " Patrick McHardy
@ 2006-08-11  6:11   ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2006-08-11  6:11 UTC (permalink / raw)
  To: kaber; +Cc: netdev, patrick

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 10 Aug 2006 21:30:04 +0200 (MEST)

> [DECNET]: Increase number of possible routing tables to 2^32
> 
> Increase the number of possible routing tables to 2^32 by replacing the
> fixed sized array of pointers by a hash table and replacing iterations
> over all possible table IDs by hash table walking.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [NET 06/06]: Increate RT_TABLE_MAX to 2^32
  2006-08-10 19:30 ` [NET 06/06]: Increate RT_TABLE_MAX " Patrick McHardy
@ 2006-08-11  6:12   ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2006-08-11  6:12 UTC (permalink / raw)
  To: kaber; +Cc: netdev, patrick

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 10 Aug 2006 21:30:06 +0200 (MEST)

> [NET]: Increate RT_TABLE_MAX to 2^32
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

2^32-1 :-)

Applied, thanks Patrick.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [NET 00/06]: Increase number of possible routing tables
  2006-08-10 19:29 [NET 00/06]: Increase number of possible routing tables Patrick McHardy
                   ` (5 preceding siblings ...)
  2006-08-10 19:30 ` [NET 06/06]: Increate RT_TABLE_MAX " Patrick McHardy
@ 2006-08-11  6:39 ` Michael Tokarev
  2006-08-11  6:44   ` David Miller
  6 siblings, 1 reply; 19+ messages in thread
From: Michael Tokarev @ 2006-08-11  6:39 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: davem, netdev, patrick

Patrick McHardy wrote:
> These are the updated patches (against net-2.6.19) to increase the number
> of possible routing tables to 2^32. They basically consist of four parts:
> 
> - Use u32 for routing table IDs everywhere inside the kernel

Just out of curiocity: why current limit of 2^31 isn't sufficient?
Or am I missing the point?

Thanks.

/mjt

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [NET 00/06]: Increase number of possible routing tables
  2006-08-11  6:39 ` [NET 00/06]: Increase number of possible routing tables Michael Tokarev
@ 2006-08-11  6:44   ` David Miller
  2006-08-11  6:56     ` Michael Tokarev
  0 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2006-08-11  6:44 UTC (permalink / raw)
  To: mjt; +Cc: kaber, netdev, patrick

From: Michael Tokarev <mjt@tls.msk.ru>
Date: Fri, 11 Aug 2006 10:39:19 +0400

> Patrick McHardy wrote:
> > These are the updated patches (against net-2.6.19) to increase the number
> > of possible routing tables to 2^32. They basically consist of four parts:
> > 
> > - Use u32 for routing table IDs everywhere inside the kernel
> 
> Just out of curiocity: why current limit of 2^31 isn't sufficient?
> Or am I missing the point?

The current limit is 256 because the table member of the struct
used to configure them is an 8-bit quantity.

That's the whole purpose of Patrick's patch set, to provide a new
optional attribute that allows specifying a 32-bit rather than
the 8-bit table ID.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [NET 00/06]: Increase number of possible routing tables
  2006-08-11  6:44   ` David Miller
@ 2006-08-11  6:56     ` Michael Tokarev
  2006-08-11  9:48       ` Thomas Graf
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Tokarev @ 2006-08-11  6:56 UTC (permalink / raw)
  To: David Miller; +Cc: kaber, netdev, patrick

David Miller wrote:
> From: Michael Tokarev <mjt@tls.msk.ru>
[]
>>> - Use u32 for routing table IDs everywhere inside the kernel
>> Just out of curiocity: why current limit of 2^31 isn't sufficient?
>> Or am I missing the point?
> 
> The current limit is 256 because the table member of the struct
> used to configure them is an 8-bit quantity.
> 
> That's the whole purpose of Patrick's patch set, to provide a new
> optional attribute that allows specifying a 32-bit rather than
> the 8-bit table ID.

Aha, it was 256, not 2^31.  I remember now.

So the question probably should have been like, why u32 and additional
attribute (to represent former -1) instead of current int?  I mean,
it probably makes no difference whenever there are 2^32 or 2^31 tables
(both values are pretty large), but 2^32 requires more changes for the
existing code.

And while we're at it...  How about using table *names* instead of
numbers in kernel too, a-la iptables?  Once possible number of tables
is large, and we're using hashes for tables now anyway, keeping a
name inside the table structure wont hurt ;)

/mjt

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [NET 02/06]: Introduce RTA_TABLE/FRA_TABLE attributes
  2006-08-10 19:30 ` [NET 02/06]: Introduce RTA_TABLE/FRA_TABLE attributes Patrick McHardy
  2006-08-11  6:09   ` David Miller
@ 2006-08-11  7:02   ` Michael Tokarev
  2006-08-11  7:33     ` David Miller
  1 sibling, 1 reply; 19+ messages in thread
From: Michael Tokarev @ 2006-08-11  7:02 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: davem, netdev, patrick

Patrick McHardy wrote:
> --- a/include/linux/rtnetlink.h
> +++ b/include/linux/rtnetlink.h

> +static inline u32 rtm_get_table(struct rtattr **rta, u8 table)
> +{
> +	return RTA_GET_U32(rta[RTA_TABLE-1]);
> +rtattr_failure:
> +	return table;
> +}
> +

What's that ?  ;)

/mjt

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [NET 02/06]: Introduce RTA_TABLE/FRA_TABLE attributes
  2006-08-11  7:02   ` Michael Tokarev
@ 2006-08-11  7:33     ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2006-08-11  7:33 UTC (permalink / raw)
  To: mjt; +Cc: kaber, netdev, patrick

From: Michael Tokarev <mjt@tls.msk.ru>
Date: Fri, 11 Aug 2006 11:02:43 +0400

> Patrick McHardy wrote:
> > --- a/include/linux/rtnetlink.h
> > +++ b/include/linux/rtnetlink.h
> 
> > +static inline u32 rtm_get_table(struct rtattr **rta, u8 table)
> > +{
> > +	return RTA_GET_U32(rta[RTA_TABLE-1]);
> > +rtattr_failure:
> > +	return table;
> > +}
> > +
> 
> What's that ?  ;)

The RTA_GET_U32 macro internally branches to rtattr_failure.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [NET 00/06]: Increase number of possible routing tables
  2006-08-11  6:56     ` Michael Tokarev
@ 2006-08-11  9:48       ` Thomas Graf
  0 siblings, 0 replies; 19+ messages in thread
From: Thomas Graf @ 2006-08-11  9:48 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: David Miller, kaber, netdev, patrick

* Michael Tokarev <mjt@tls.msk.ru> 2006-08-11 10:56
> And while we're at it...  How about using table *names* instead of
> numbers in kernel too, a-la iptables?  Once possible number of tables
> is large, and we're using hashes for tables now anyway, keeping a
> name inside the table structure wont hurt ;)

This is and should be implemented in userspace.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2006-08-11  9:48 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-08-10 19:29 [NET 00/06]: Increase number of possible routing tables Patrick McHardy
2006-08-10 19:29 ` [NET 01/06]: Use u32 for routing table IDs Patrick McHardy
2006-08-11  6:08   ` David Miller
2006-08-10 19:30 ` [NET 02/06]: Introduce RTA_TABLE/FRA_TABLE attributes Patrick McHardy
2006-08-11  6:09   ` David Miller
2006-08-11  7:02   ` Michael Tokarev
2006-08-11  7:33     ` David Miller
2006-08-10 19:30 ` [IPV4 03/06]: Increase number of possible routing tables to 2^32 Patrick McHardy
2006-08-11  6:10   ` David Miller
2006-08-10 19:30 ` [IPV6 04/06]: " Patrick McHardy
2006-08-11  6:11   ` David Miller
2006-08-10 19:30 ` [DECNET 05/06]: " Patrick McHardy
2006-08-11  6:11   ` David Miller
2006-08-10 19:30 ` [NET 06/06]: Increate RT_TABLE_MAX " Patrick McHardy
2006-08-11  6:12   ` David Miller
2006-08-11  6:39 ` [NET 00/06]: Increase number of possible routing tables Michael Tokarev
2006-08-11  6:44   ` David Miller
2006-08-11  6:56     ` Michael Tokarev
2006-08-11  9:48       ` Thomas Graf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).