Netdev List
* Re: [PATCH 1/5] spidernet: add missing initialization
From: Ishizaki Kou @ 2008-01-17 10:22 UTC (permalink / raw)
  To: jens; +Cc: netdev, cbe-oss-dev
In-Reply-To: <200801111344.35652.jens@de.ibm.com>

Jens-san,

> Hi Ishizaki,
>
> Linas has left the company and is no longer doing kernel related stuff,
> so I suggest, given Jeff is ok with that, that the two of us take over
> spidernet maintainership.
 (snip)
> Change maintainership for spidernet.
>
> Signed-off-by: Jens Osterkamp <jens@de.ibm.com>

I apologize for my late reply.

I would like to accept your suggestion, but I first have to get
authorization from my company to take on maintainership. I have started
negotiating with my boss.


I can't check that the spidernet driver works on Cell Blade, because I
don't have one.  So I hope you will check that the spidernet driver
still works on Cell Blade whenever it changes.

Also, will you review our latest patches?

Best regards,
Kou Ishizaki


* [PATCH 3/3 net-2.6.25] Process FIB rule action in the context of the namespace.
From: Denis V. Lunev @ 2008-01-17 10:09 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, dlezcano, containers, Denis V. Lunev
In-Reply-To: <478F2933.1000007@openvz.org>

Save the namespace context in the FIB rule at rule creation time and
perform the routing lookup in the correct namespace.
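
As a rough illustration of the idea (not the kernel code itself), the
sketch below uses simplified stand-in structs: the rule stores a pointer
to its namespace when it is created, and the action reads that pointer
instead of assuming the initial namespace. The helpers rule_init() and
rule_action() and the main_table_hits counter are hypothetical names
introduced only for this example.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields needed
 * for the illustration are modelled. */
struct net {
	int id;
	int main_table_hits;	/* lookups that were routed to this namespace */
};

struct fib_rule {
	unsigned int table;
	struct net *fr_net;	/* namespace saved when the rule is created */
};

/* Rule creation: remember which namespace the rule belongs to. */
static void rule_init(struct fib_rule *rule, struct net *net,
		      unsigned int table)
{
	rule->table = table;
	rule->fr_net = net;
}

/* Rule action: use the rule's own namespace rather than a hard-coded
 * initial namespace. Returns the id of the namespace that was used. */
static int rule_action(struct fib_rule *rule)
{
	struct net *net = rule->fr_net;

	net->main_table_hits++;
	return net->id;
}
```

With two namespaces, a rule created in the second one is then served
from the second one, which is the behaviour the one-line change to
fib4_rule_action() is after.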

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
---
 include/net/fib_rules.h |    1 +
 net/core/fib_rules.c    |    2 ++
 net/ipv4/fib_rules.c    |    2 +-
 3 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index 7f9f4ae..34349f9 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -22,6 +22,7 @@ struct fib_rule
 	u32			target;
 	struct fib_rule *	ctarget;
 	struct rcu_head		rcu;
+	struct net *		fr_net;
 };
 
 struct fib_lookup_arg
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 3cd4f13..42ccaf5 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -29,6 +29,7 @@ int fib_default_rule_add(struct fib_rules_ops *ops,
 	r->pref = pref;
 	r->table = table;
 	r->flags = flags;
+	r->fr_net = ops->fro_net;
 
 	/* The lock is not required here, the list in unreacheable
 	 * at the moment this function is called */
@@ -242,6 +243,7 @@ static int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 		err = -ENOMEM;
 		goto errout;
 	}
+	rule->fr_net = net;
 
 	if (tb[FRA_PRIORITY])
 		rule->pref = nla_get_u32(tb[FRA_PRIORITY]);
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 3b7affd..d2001f1 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -91,7 +91,7 @@ static int fib4_rule_action(struct fib_rule *rule, struct flowi *flp,
 		goto errout;
 	}
 
-	if ((tbl = fib_get_table(&init_net, rule->table)) == NULL)
+	if ((tbl = fib_get_table(rule->fr_net, rule->table)) == NULL)
 		goto errout;
 
 	err = tbl->tb_lookup(tbl, flp, (struct fib_result *) arg->result);
-- 
1.5.3.rc5



* [PATCH 1/3 net-2.6.25] Add netns to fib_rules_ops.
From: Denis V. Lunev @ 2008-01-17 10:09 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, dlezcano, containers, Denis V. Lunev
In-Reply-To: <478F2933.1000007@openvz.org>

The backward link from the FIB rules operations to the network namespace
will allow the API to be simplified a bit.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
---
 include/net/fib_rules.h |    1 +
 net/decnet/dn_rules.c   |    1 +
 net/ipv4/fib_rules.c    |    2 ++
 net/ipv6/fib6_rules.c   |    1 +
 4 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index 4f47250..6910e01 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -67,6 +67,7 @@ struct fib_rules_ops
 	const struct nla_policy	*policy;
 	struct list_head	rules_list;
 	struct module		*owner;
+	struct net		*fro_net;
 };
 
 #define FRA_GENERIC_POLICY \
diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c
index c1fae23..964e658 100644
--- a/net/decnet/dn_rules.c
+++ b/net/decnet/dn_rules.c
@@ -249,6 +249,7 @@ static struct fib_rules_ops dn_fib_rules_ops = {
 	.policy		= dn_fib_rule_policy,
 	.rules_list	= LIST_HEAD_INIT(dn_fib_rules_ops.rules_list),
 	.owner		= THIS_MODULE,
+	.fro_net	= &init_net,
 };
 
 void __init dn_fib_rules_init(void)
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 72232ab..8d0ebe7 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -315,6 +315,8 @@ int __net_init fib4_rules_init(struct net *net)
 	if (ops == NULL)
 		return -ENOMEM;
 	INIT_LIST_HEAD(&ops->rules_list);
+	ops->fro_net = net;
+
 	fib_rules_register(net, ops);
 
 	err = fib_default_rules_init(ops);
diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c
index 76437a1..ead5ab2 100644
--- a/net/ipv6/fib6_rules.c
+++ b/net/ipv6/fib6_rules.c
@@ -249,6 +249,7 @@ static struct fib_rules_ops fib6_rules_ops = {
 	.policy			= fib6_rule_policy,
 	.rules_list		= LIST_HEAD_INIT(fib6_rules_ops.rules_list),
 	.owner			= THIS_MODULE,
+	.fro_net		= &init_net,
 };
 
 static int __init fib6_default_rules_init(void)
-- 
1.5.3.rc5



* [PATCH 2/3 net-2.6.25] [NETNS] FIB rules API cleanup.
From: Denis V. Lunev @ 2008-01-17 10:09 UTC (permalink / raw)
  To: davem; +Cc: netdev, devel, dlezcano, containers, Denis V. Lunev
In-Reply-To: <478F2933.1000007@openvz.org>

Remove struct net from the fib_rules_register(unregister)/notify_change
paths and shrink the code size a bit.
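
A minimal sketch of the API shape before and after, using simplified
stand-in structs rather than the real kernel definitions. The function
names register_with_net() and register_ops(), the registered_ops
counter, and the NULL check are assumptions made for the example; the
patch itself simply reads ops->fro_net.

```c
#include <stddef.h>

/* Hypothetical, stripped-down versions of the kernel structures. */
struct net {
	int registered_ops;
};

struct fib_rules_ops {
	int family;
	struct net *fro_net;	/* back link to the owning namespace */
};

/* Old shape: the caller passes the namespace and the ops separately,
 * so every call site has to carry both. */
static int register_with_net(struct net *net, struct fib_rules_ops *ops)
{
	(void)ops;
	net->registered_ops++;
	return 0;
}

/* New shape: the namespace is taken from the ops structure itself. */
static int register_ops(struct fib_rules_ops *ops)
{
	struct net *net = ops->fro_net;

	if (net == NULL)
		return -1;	/* illustrative guard, not part of the patch */
	net->registered_ops++;
	return 0;
}
```

The second form only works if fro_net is set before registration, which
is why the fro_net initialization is introduced in patch 1/3 first.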

add/remove: 0/0 grow/shrink: 10/12 up/down: 35/-100 (-65)
function                                     old     new   delta
notify_rule_change                           273     280      +7
trie_show_stats                              471     475      +4
fn_trie_delete                               473     477      +4
fib_rules_unregister                         144     148      +4
fib4_rule_compare                            119     123      +4
resize                                      2842    2845      +3
fn_trie_select_default                       515     518      +3
inet_sk_rebuild_header                       836     838      +2
fib_trie_seq_show                            764     766      +2
__devinet_sysctl_register                    276     278      +2
fn_trie_lookup                              1124    1123      -1
ip_fib_check_default                         133     131      -2
devinet_conf_sysctl                          223     221      -2
snmp_fold_field                              126     123      -3
fn_trie_insert                              2091    2086      -5
inet_create                                  876     870      -6
fib4_rules_init                              197     191      -6
fib_sync_down                                452     444      -8
inet_gso_send_check                          334     325      -9
fib_create_info                             3003    2991     -12
fib_nl_delrule                               568     553     -15
fib_nl_newrule                               883     852     -31

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
---
 include/net/fib_rules.h |    4 ++--
 net/core/fib_rules.c    |   20 +++++++++++++-------
 net/decnet/dn_rules.c   |    4 ++--
 net/ipv4/fib_rules.c    |    6 +++---
 net/ipv6/fib6_rules.c   |    4 ++--
 5 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index 6910e01..7f9f4ae 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -102,8 +102,8 @@ static inline u32 frh_get_table(struct fib_rule_hdr *frh, struct nlattr **nla)
 	return frh->table;
 }
 
-extern int fib_rules_register(struct net *, struct fib_rules_ops *);
-extern void fib_rules_unregister(struct net *, struct fib_rules_ops *);
+extern int fib_rules_register(struct fib_rules_ops *);
+extern void fib_rules_unregister(struct fib_rules_ops *);
 extern void                     fib_rules_cleanup_ops(struct fib_rules_ops *);
 
 extern int			fib_rules_lookup(struct fib_rules_ops *,
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 541728a..3cd4f13 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -37,8 +37,7 @@ int fib_default_rule_add(struct fib_rules_ops *ops,
 }
 EXPORT_SYMBOL(fib_default_rule_add);
 
-static void notify_rule_change(struct net *net, int event,
-			       struct fib_rule *rule,
+static void notify_rule_change(int event, struct fib_rule *rule,
 			       struct fib_rules_ops *ops, struct nlmsghdr *nlh,
 			       u32 pid);
 
@@ -72,10 +71,13 @@ static void flush_route_cache(struct fib_rules_ops *ops)
 		ops->flush_cache();
 }
 
-int fib_rules_register(struct net *net, struct fib_rules_ops *ops)
+int fib_rules_register(struct fib_rules_ops *ops)
 {
 	int err = -EEXIST;
 	struct fib_rules_ops *o;
+	struct net *net;
+
+	net = ops->fro_net;
 
 	if (ops->rule_size < sizeof(struct fib_rule))
 		return -EINVAL;
@@ -112,8 +114,9 @@ void fib_rules_cleanup_ops(struct fib_rules_ops *ops)
 }
 EXPORT_SYMBOL_GPL(fib_rules_cleanup_ops);
 
-void fib_rules_unregister(struct net *net, struct fib_rules_ops *ops)
+void fib_rules_unregister(struct fib_rules_ops *ops)
 {
+	struct net *net = ops->fro_net;
 
 	spin_lock(&net->rules_mod_lock);
 	list_del_rcu(&ops->list);
@@ -333,7 +336,7 @@ static int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 	else
 		list_add_rcu(&rule->list, &ops->rules_list);
 
-	notify_rule_change(net, RTM_NEWRULE, rule, ops, nlh, NETLINK_CB(skb).pid);
+	notify_rule_change(RTM_NEWRULE, rule, ops, nlh, NETLINK_CB(skb).pid);
 	flush_route_cache(ops);
 	rules_ops_put(ops);
 	return 0;
@@ -423,7 +426,7 @@ static int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 		}
 
 		synchronize_rcu();
-		notify_rule_change(net, RTM_DELRULE, rule, ops, nlh,
+		notify_rule_change(RTM_DELRULE, rule, ops, nlh,
 				   NETLINK_CB(skb).pid);
 		fib_rule_put(rule);
 		flush_route_cache(ops);
@@ -561,13 +564,15 @@ static int fib_nl_dumprule(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
-static void notify_rule_change(struct net *net, int event, struct fib_rule *rule,
+static void notify_rule_change(int event, struct fib_rule *rule,
 			       struct fib_rules_ops *ops, struct nlmsghdr *nlh,
 			       u32 pid)
 {
+	struct net *net;
 	struct sk_buff *skb;
 	int err = -ENOBUFS;
 
+	net = ops->fro_net;
 	skb = nlmsg_new(fib_rule_nlmsg_size(ops, rule), GFP_KERNEL);
 	if (skb == NULL)
 		goto errout;
@@ -579,6 +584,7 @@ static void notify_rule_change(struct net *net, int event, struct fib_rule *rule
 		kfree_skb(skb);
 		goto errout;
 	}
+
 	err = rtnl_notify(skb, net, pid, ops->nlgroup, nlh, GFP_KERNEL);
 errout:
 	if (err < 0)
diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c
index 964e658..5b7539b 100644
--- a/net/decnet/dn_rules.c
+++ b/net/decnet/dn_rules.c
@@ -256,12 +256,12 @@ void __init dn_fib_rules_init(void)
 {
 	BUG_ON(fib_default_rule_add(&dn_fib_rules_ops, 0x7fff,
 			            RT_TABLE_MAIN, 0));
-	fib_rules_register(&init_net, &dn_fib_rules_ops);
+	fib_rules_register(&dn_fib_rules_ops);
 }
 
 void __exit dn_fib_rules_cleanup(void)
 {
-	fib_rules_unregister(&init_net, &dn_fib_rules_ops);
+	fib_rules_unregister(&dn_fib_rules_ops);
 }
 
 
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 8d0ebe7..3b7affd 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -317,7 +317,7 @@ int __net_init fib4_rules_init(struct net *net)
 	INIT_LIST_HEAD(&ops->rules_list);
 	ops->fro_net = net;
 
-	fib_rules_register(net, ops);
+	fib_rules_register(ops);
 
 	err = fib_default_rules_init(ops);
 	if (err < 0)
@@ -327,13 +327,13 @@ int __net_init fib4_rules_init(struct net *net)
 
 fail:
 	/* also cleans all rules already added */
-	fib_rules_unregister(net, ops);
+	fib_rules_unregister(ops);
 	kfree(ops);
 	return err;
 }
 
 void __net_exit fib4_rules_exit(struct net *net)
 {
-	fib_rules_unregister(net, net->ipv4.rules_ops);
+	fib_rules_unregister(net->ipv4.rules_ops);
 	kfree(net->ipv4.rules_ops);
 }
diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c
index ead5ab2..695c0ca 100644
--- a/net/ipv6/fib6_rules.c
+++ b/net/ipv6/fib6_rules.c
@@ -274,7 +274,7 @@ int __init fib6_rules_init(void)
 	if (ret)
 		goto out;
 
-	ret = fib_rules_register(&init_net, &fib6_rules_ops);
+	ret = fib_rules_register(&fib6_rules_ops);
 	if (ret)
 		goto out_default_rules_init;
 out:
@@ -287,5 +287,5 @@ out_default_rules_init:
 
 void fib6_rules_cleanup(void)
 {
-	fib_rules_unregister(&init_net, &fib6_rules_ops);
+	fib_rules_unregister(&fib6_rules_ops);
 }
-- 
1.5.3.rc5



* [PATCH 0/3 net-2.6.25] call FIB rule->action in the correct namespace
From: Denis V. Lunev @ 2008-01-17 10:08 UTC (permalink / raw)
  To: David Miller; +Cc: Daniel Lezcano, netdev, Linux Containers, devel

FIB rule->action should operate in the same namespace as fib_lookup.
This is definitely missing right now.

There are two ways to implement this: pass struct net into another rules
API call (2 levels) or place netns into rule struct directly. The second
approach seems better as the code will grow less.

Additionally, the patchset removes struct net from
fib_rules_register/unregister so that the network namespace context is
available at the time of default rules creation.

Signed-off-by: Denis V. Lunev <den@openvz.org>


* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: David Miller @ 2008-01-17 10:06 UTC (permalink / raw)
  To: timo.teras; +Cc: herbert, hadi, netdev
In-Reply-To: <478F276D.8080407@iki.fi>

From: Timo_Teräs <timo.teras@iki.fi>
Date: Thu, 17 Jan 2008 12:01:17 +0200

> David Miller wrote:
> > This is an inherent aspect of AF_KEY (and what it was
> > derived from, BSD routing sockets).
> 
> Yes, this is the way BSD does it.
>  
> > It has to provide dumps atomically, and if there is no
> > space there is no way to provide those entries which
> > would require more rcvbuf space.
> 
> RFC does not say it has to be atomic.

Every application out there in the universe expects BSD socket
semantics, and therefore atomic dumps.  You cannot "fix" things
without breaking applications.


* Broken "Make ip6_frags per namespace" patch
From: Alexey Dobriyan @ 2008-01-17 10:05 UTC (permalink / raw)
  To: dlezcano, davem; +Cc: den, netdev, devel

> commit c064c4811b3e87ff8202f5a966ff4eea0bc54575
> Author: Daniel Lezcano <dlezcano@fr.ibm.com>
> Date:   Thu Jan 10 02:56:03 2008 -0800
> 
>     [NETNS][IPV6]: Make ip6_frags per namespace.
>     
>     The ip6_frags is moved to the network namespace structure.  Because
>     there can be multiple instances of the network namespaces, and the
>     ip6_frags is no longer a global static variable, a helper function has
>     been added to facilitate the initialization of the variables.
>     
>     Until the ipv6 protocol is not per namespace, the variables are
>     accessed relatively from the initial network namespace.

> --- a/include/net/netns/ipv6.h
> +++ b/include/net/netns/ipv6.h

> @@ -11,6 +13,7 @@ struct netns_sysctl_ipv6 {
>  #ifdef CONFIG_SYSCTL
>  	struct ctl_table_header *table;
>  #endif
> +	struct inet_frags_ctl frags;

> --- a/net/ipv6/reassembly.c
> +++ b/net/ipv6/reassembly.c

> @@ -632,6 +625,11 @@ static struct inet6_protocol frag_protocol =
>  	.flags		=	INET6_PROTO_NOPOLICY,
>  };
>  
> +void ipv6_frag_sysctl_init(struct net *net)
> +{
> +	ip6_frags.ctl = &net->ipv6.sysctl.frags;
> +}

_This_ can't work. There is only one ip6_frags, and its ->ctl pointer is
flipped onto per-netns data. The changelog is also misleading: only
ip6_frags_ctl is moved to the netns, not all of ip6_frags.

The oops is below: an f->ctl dereference in preparation for the
mod_timer() call.
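
To make the problem concrete, here is a small userspace sketch under
simplified assumptions: one global object stands in for ip6_frags, and
an init helper modelled on ipv6_frag_sysctl_init() overwrites its single
->ctl pointer on every call. The names frags_global, frag_sysctl_init()
and frags_uses() are hypothetical.

```c
#include <stddef.h>

struct frags_ctl {
	int secret_interval;
};

/* Per-namespace data: each namespace embeds its own control block. */
struct net {
	struct frags_ctl frags;
};

/* One global object shared by every namespace (models ip6_frags). */
struct inet_frags {
	struct frags_ctl *ctl;
};

static struct inet_frags frags_global;

/* Each call replaces the single global pointer with the address of
 * per-namespace data, so the last namespace initialized wins. */
static void frag_sysctl_init(struct net *net)
{
	frags_global.ctl = &net->frags;
}

/* Returns 1 if the global currently points into the given namespace. */
static int frags_uses(const struct net *net)
{
	return frags_global.ctl == &net->frags;
}
```

After a second namespace is initialized, the global no longer refers to
the first one, and once the namespace it does refer to is freed, any
later dereference of ->ctl (for example from a timer) touches freed
memory.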



BUG: unable to handle kernel paging request at virtual address f5da8fc8
printing eip: c11d868a *pdpt = 0000000000003001 *pde = 0000000001728067 *pte = 0000000035da8000 
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: ebt_ip ebt_dnat ebt_arpreply ebt_arp ebt_among ebtable_nat ip6t_REJECT ip6table_filter ip6_tables ebtable_filter ebtable_broute ebt_802_3 ebtables des_generic nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack xt_tcpudp ipt_REJECT iptable_filter ip_tables deflate zlib_deflate zlib_inflate cryptomgr crypto_hash cpufreq_stats cpufreq_ondemand cdrom cbc bridge llc blkcipher crypto_algapi arpt_mangle arptable_filter arp_tables x_tables ah6 af_packet ipv6

Pid: 0, comm: swapper Not tainted (2.6.24-rc7-net-2.6.25-nf-sysfs-n #30)
EIP: 0060:[<c11d868a>] EFLAGS: 00010246 CPU: 1
EIP is at inet_frag_secret_rebuild+0xaa/0xd0
EAX: f5da8fbc EBX: 00000000 ECX: c1310000 EDX: 00000100
ESI: f7cba000 EDI: f898f7a0 EBP: 00000040 ESP: c1310f90
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c1310000 task=f7c9a580 task.ti=f7c9b000)
Stack: f898f7a8 f898f8a8 000ddcbd f898f7a0 f7cba000 c1310fc4 00000100 c1026d60 
       00000002 00000001 c1191183 c4779ddc c11d85e0 f898c860 f898c860 c12c4a88 
       00000001 c1308da0 0000000a c1023477 00000001 c130b640 c130b640 f7c9bf34 
Call Trace:
 [<c1026d60>] run_timer_softirq+0x120/0x190
 [<c1191183>] net_rx_action+0x53/0x220
 [<c11d85e0>] inet_frag_secret_rebuild+0x0/0xd0
 [<c1023477>] __do_softirq+0x87/0x100
 [<c10059cf>] do_softirq+0xaf/0x110
 [<c10233e3>] irq_exit+0x83/0x90
 [<c1010ce7>] smp_apic_timer_interrupt+0x57/0x90
 [<c10036e1>] apic_timer_interrupt+0x29/0x38
 [<c10036eb>] apic_timer_interrupt+0x33/0x38
 [<c1001460>] default_idle+0x0/0x60
 [<c10014a0>] default_idle+0x40/0x60
 [<c1000ea3>] cpu_idle+0x73/0xb0
=======================
Code: 8b 10 85 d2 89 13 74 03 89 5a 04 89 18 89 43 04 85 f6 89 f3 75 bb 45 83 fd 40 75 a5 8b 44 24 04 e8 4c 3f 01 00 8b 87 50 01 00 00 <8b> 50 0c 01 54 24 08 8d 87 38 01 00 00 8b 54 24 08 83 c4 0c 5b 
EIP: [<c11d868a>] inet_frag_secret_rebuild+0xaa/0xd0 SS:ESP 0068:c1310f90
Kernel panic - not syncing: Fatal exception in interrupt



* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: Timo Teräs @ 2008-01-17 10:01 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, hadi, netdev
In-Reply-To: <20080117.014458.16733544.davem@davemloft.net>

David Miller wrote:
> From: Timo_Teräs <timo.teras@iki.fi>
> Date: Thu, 17 Jan 2008 11:38:13 +0200
> 
>> The af_key issue is that in big dumps you get only first X
>> entries. The rest of the entries are dropped because the
>> socket receive buffer goes full. You get data corruption:
>> missing entries.
> 
> This is an inherent aspect of AF_KEY (and what it was
> derived from, BSD routing sockets).

Yes, this is the way BSD does it.
 
> It has to provide dumps atomically, and if there is no
> space there is no way to provide those entries which
> would require more rcvbuf space.

The RFC does not say it has to be atomic.

It does say that the dump is terminated with an SADB_DUMP
message having the sadb_msg_seq field set to zero. Currently
that is dropped too when the problem occurs. Thus the
socket is left in a bad state: the dump never ends. This
can cause applications without any workarounds to hang.
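
The failure mode can be sketched with a toy model, assuming a receive
buffer that holds a fixed number of messages and a sequence number that
counts down to zero on the last dump entry. RCVBUF_SLOTS, send_dump()
and read_dump() are hypothetical names; no real PF_KEY socket is
involved.

```c
#include <stddef.h>

#define RCVBUF_SLOTS 4	/* hypothetical receive-buffer capacity, in messages */

struct dump_msg {
	unsigned int seq;	/* 0 marks the final message of the dump */
};

struct rcvbuf {
	struct dump_msg slot[RCVBUF_SLOTS];
	size_t count;
};

/* Sender side: emit 'total' entries with seq counting down to 0 and
 * silently drop whatever does not fit. Returns the number dropped. */
static size_t send_dump(struct rcvbuf *rb, unsigned int total)
{
	size_t dropped = 0;
	unsigned int i;

	rb->count = 0;
	for (i = 0; i < total; i++) {
		if (rb->count == RCVBUF_SLOTS) {
			dropped++;
			continue;
		}
		rb->slot[rb->count++].seq = total - 1 - i;
	}
	return dropped;
}

/* Reader side: consume until seq == 0. Returns 1 if the terminator was
 * seen, 0 if the buffer ran dry first (a real reader would block). */
static int read_dump(const struct rcvbuf *rb, size_t *entries)
{
	size_t i;

	*entries = 0;
	for (i = 0; i < rb->count; i++) {
		(*entries)++;
		if (rb->slot[i].seq == 0)
			return 1;
	}
	return 0;
}
```

When the dump is larger than the buffer, the entry carrying seq 0 is
among those dropped, so the reader sees a truncated list and never sees
the end-of-dump marker.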

- Timo
 


* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
From: David Miller @ 2008-01-17  9:45 UTC (permalink / raw)
  To: acme; +Cc: elendil, jesse.brandeburg, slavon, netdev, linux-kernel
In-Reply-To: <20080117094007.GF321@ghostprotocols.net>

From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Thu, 17 Jan 2008 07:40:07 -0200

> I'll update this machine today to 2.6.24-rc8-git + net-2.6 and try again
> to reproduce.

Thanks for the datapoints and testing.


* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: David Miller @ 2008-01-17  9:44 UTC (permalink / raw)
  To: timo.teras; +Cc: herbert, hadi, netdev
In-Reply-To: <478F2205.80403@iki.fi>

From: Timo_Teräs <timo.teras@iki.fi>
Date: Thu, 17 Jan 2008 11:38:13 +0200

> The af_key issue is that in big dumps you get only first X
> entries. The rest of the entries are dropped because the
> socket receive buffer goes full. You get data corruption:
> missing entries.

This is an inherent aspect of AF_KEY (and what it was
derived from, BSD routing sockets).

It has to provide dumps atomically, and if there is no
space there is no way to provide those entries which
would require more rcvbuf space.


* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
From: Arnaldo Carvalho de Melo @ 2008-01-17  9:40 UTC (permalink / raw)
  To: David Miller; +Cc: elendil, jesse.brandeburg, slavon, netdev, linux-kernel
In-Reply-To: <20080117.000002.37027317.davem@davemloft.net>

Em Thu, Jan 17, 2008 at 12:00:02AM -0800, David Miller escreveu:
> From: Frans Pop <elendil@planet.nl>
> Date: Thu, 17 Jan 2008 08:51:55 +0100
> 
> > On Thursday 17 January 2008, David Miller wrote:
> > > From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
> > >
> > > > We spent Wednesday trying to reproduce (without the patch) these issues
> > > > without much luck, and have applied the patch cleanly and will continue
> > > > testing it.  Given the simplicity of the changes, and the community
> > > > testing, I'll give my ack and we will continue testing.
> > >
> > > You need a slow CPU, and you need to make sure you do actually
> > > trigger the TX limiting code there.
> > 
> > Hmmm. Is a dual core Pentium D 3.20GHz considered slow these days?
> 
> No of course :-)  I guess it therefore depends upon the load
> as well.

I saw it just once, yesterday:

[root@doppio ~]# uname -r
2.6.24-rc5
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <58>
  TDT                  <8f>
  next_to_use          <8f>
  next_to_clean        <55>
buffer_info[next_to_clean]
  time_stamp           <105e973a9>
  next_to_watch        <56>
  jiffies              <105e97992>
  next_to_watch.status <1>
[root@doppio ~]#

on a Lenovo T60W, a Core 2 Duo machine (2GHz), while using it to stress
test another machine. I was running netperf TCP_STREAM with 1 to 8
streams plus a ping -f with various packet sizes.
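
For reading the register dump above, the ring indices and timestamps
can be turned into plain numbers with a little arithmetic. This is only
a sketch: the helper names are made up, and the ring size of 256
descriptors used in the note below is an assumption, since the report
does not state it.

```c
#include <stdint.h>

/* Distance from one ring index to another, modulo the ring size.
 * Both indices must be smaller than ring_size. */
static uint32_t ring_pending(uint32_t from, uint32_t to, uint32_t ring_size)
{
	return (to + ring_size - from) % ring_size;
}

/* Age, in jiffies, of a buffer stamped at time_stamp. */
static uint64_t buffer_age(uint64_t jiffies, uint64_t time_stamp)
{
	return jiffies - time_stamp;
}
```

With an assumed 256-entry ring, ring_pending(0x58, 0x8f, 256) gives the
descriptors between TDH and TDT, ring_pending(0x55, 0x8f, 256) gives
those between next_to_clean and next_to_use, and
buffer_age(0x105e97992, 0x105e973a9) gives how long the oldest buffer
had been waiting. Converting that age to seconds depends on HZ, which
the report also does not give.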

I'll update this machine today to 2.6.24-rc8-git + net-2.6 and try again
to reproduce.

I also applied David's patch while trying some RT experiments on
another, 8-way machine used as a server, but on this machine I didn't
experience the Tx Unit Hang message with or without the patch.

- Arnaldo


* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: Timo Teräs @ 2008-01-17  9:38 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, hadi, netdev
In-Reply-To: <20080117.013107.241902256.davem@davemloft.net>

David Miller wrote:
> From: Timo_Teräs <timo.teras@iki.fi>
> Date: Thu, 17 Jan 2008 11:20:42 +0200
> 
>> Where as the pfkey bug fix is non-intrusive and helps all
>> legacy applications still using af_key by _fixing a bug in
>> kernel_.
> 
> It's not a bug.  You're fixing a speed issue, not a crash
> or a case where AF_KEY is providing incorrect data.
> 
> That is what I mean when I mean "life support", we fix crashes and
> data corruption.  We don't make performance tweaks.

No. The speed issue is completely handled in the xfrm_state
and xfrm_user changes.

The af_key issue is that in big dumps you get only the first X
entries. The rest of the entries are dropped because the
socket receive buffer fills up. You get data corruption:
missing entries.

- Timo



* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: David Miller @ 2008-01-17  9:31 UTC (permalink / raw)
  To: timo.teras; +Cc: herbert, hadi, netdev
In-Reply-To: <478F1DEA.5070903@iki.fi>

From: Timo_Teräs <timo.teras@iki.fi>
Date: Thu, 17 Jan 2008 11:20:42 +0200

> Where as the pfkey bug fix is non-intrusive and helps all
> legacy applications still using af_key by _fixing a bug in
> kernel_.

It's not a bug.  You're fixing a speed issue, not a crash
or a case where AF_KEY is providing incorrect data.

That is what I mean when I say "life support": we fix crashes and
data corruption.  We don't make performance tweaks.


* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: Timo Teräs @ 2008-01-17  9:20 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, hadi, netdev
In-Reply-To: <20080117.004900.58497170.davem@davemloft.net>

David Miller wrote:
> From: Timo_Teräs <timo.teras@iki.fi>
> Date: Thu, 17 Jan 2008 10:11:17 +0200
> 
>> I thought my patch would qualify as "life support" bug fix.
>> Currently racoon fails to work if there are too many SPDs or SAs
>> because the kernel cannot handle the dump request properly. And this
>> is what my patch fixes for pfkey. It adds no new features or
>> functionality; just makes the dumping work with large databases.
> 
> Racoon should use netlink for reasons far and beyond the
> problem you are trying to address.

Yes. But this is a fairly major thing to do. One needs to create
an API abstraction layer (pfkey is still needed on *BSD) and test it.
That is a lot of work that is not going to happen very soon.

Whereas the pfkey bug fix is non-intrusive and helps all
legacy applications still using af_key by _fixing a bug in
the kernel_.

> The dumping behavior of AF_KEY is just horrific, as one of
> several examples.

If af_key is all that bad and does not qualify for maintenance
bug fixes, why not remove it completely?

That would make userland adapt faster.

>> Then there's also the xfrm dumping changes which change the
>> algorithm from O(n^2) to O(n) with some memory overhead, but
>> that is a different story. Any comments on that?
> 
> I have no general objections to those changes although I am
> backlogged and thus have not studied them in detail.  Jamal
> is having what appears to be a healthy dialogue with you about
> the details so I'm not concerned much :)

Ok. I hope someone can also give feedback on the naming
conventions, and on the API changes to xfrm policy/state
walking.

- Timo



* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: David Miller @ 2008-01-17  8:49 UTC (permalink / raw)
  To: timo.teras; +Cc: herbert, hadi, netdev
In-Reply-To: <478F0DA5.2060401@iki.fi>

From: Timo_Teräs <timo.teras@iki.fi>
Date: Thu, 17 Jan 2008 10:11:17 +0200

> I thought my patch would qualify as "life support" bug fix.
> Currently racoon fails to work if there are too many SPDs or SAs
> because the kernel cannot handle the dump request properly. And this
> is what my patch fixes for pfkey. It adds no new features or
> functionality; just makes the dumping work with large databases.

Racoon should use netlink for reasons far and beyond the
problem you are trying to address.

The dumping behavior of AF_KEY is just horrific, as one of
several examples.

> Then there's also the xfrm dumping changes which change the
> algorithm from O(n^2) to O(n) with some memory overhead, but
> that is a different story. Any comments on that?

I have no general objections to those changes although I am
backlogged and thus have not studied them in detail.  Jamal
is having what appears to be a healthy dialogue with you about
the details so I'm not concerned much :)


* Re: [Bugme-new] [Bug 9767] New: missing native u32 classifier for routing policy
From: Andrew Morton @ 2008-01-17  8:46 UTC (permalink / raw)
  To: netdev; +Cc: bugme-daemon, pupilla
In-Reply-To: <bug-9767-10286@http.bugzilla.kernel.org/>

On Thu, 17 Jan 2008 00:30:49 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9767
> 
>            Summary: missing native u32 classifier for routing policy
>            Product: Networking
>            Version: 2.5
>      KernelVersion: all since 2.2
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: low
>           Priority: P1
>          Component: IPV4
>         AssignedTo: shemminger@linux-foundation.org
>         ReportedBy: pupilla@hotmail.com
> 
> 
> This is not a bug report, but a feature request.
> routing policy database management is supported since linux 2.2, but it lacks
> u32 selector (matching by IP protocols, transport ports).
> fwmark is a workaround for this missing feature, but source ip address
> selection will not work anyway: the mark value can't be used for source address
> selection because at the time source address selection is performed, there is
> no packet yet and thus no mark value.
> 


* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: Timo Teräs @ 2008-01-17  8:11 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, hadi, netdev
In-Reply-To: <20080116.235923.208347316.davem@davemloft.net>

David Miller wrote:
> Doing anything other than "life support" bug fixes for AF_KEY is
> inappropriate.

Yes. I thought my patch would qualify as a "life support" bug fix.
Currently racoon fails to work if there are too many SPDs or SAs
because the kernel cannot handle the dump request properly. And
this is what my patch fixes for pfkey. It adds no new features or
functionality; just makes the dumping work with large databases.

Then there are also the xfrm dumping changes, which change the
algorithm from O(n^2) to O(n) with some memory overhead, but
that is a different story. Any comments on that?
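
For readers unfamiliar with why a batched dump can be quadratic, here is
a hedged sketch that is not taken from the patch. It assumes a dump that
is delivered in batches and compares two ways of resuming: restarting
from the list head and skipping what was already sent, versus keeping a
cursor to the next entry. Both functions count how many list nodes they
touch; all names are hypothetical.

```c
#include <stddef.h>

struct entry {
	struct entry *next;
};

/* Restart-and-skip: every batch walks from the list head and skips the
 * entries already dumped. Returns nodes visited for a complete dump. */
static size_t dump_with_skip(struct entry *head, size_t batch)
{
	size_t visited = 0, done = 0;

	if (batch == 0)
		return 0;
	for (;;) {
		struct entry *e = head;
		size_t idx = 0, emitted = 0;

		while (e != NULL && emitted < batch) {
			visited++;
			if (idx >= done)
				emitted++;	/* entry goes into this batch */
			idx++;
			e = e->next;
		}
		done += emitted;
		if (emitted < batch)	/* short batch: end of list reached */
			return visited;
	}
}

/* Cursor: remember where the previous batch stopped, at the cost of
 * keeping that extra state around between batches. */
static size_t dump_with_cursor(struct entry *head, size_t batch)
{
	struct entry *cursor = head;
	size_t visited = 0;

	if (batch == 0)
		return 0;
	while (cursor != NULL) {
		size_t emitted = 0;

		while (cursor != NULL && emitted < batch) {
			visited++;
			emitted++;
			cursor = cursor->next;
		}
	}
	return visited;
}
```

The skip variant revisits the already-dumped prefix on every batch, so
its work grows roughly with the square of the list length for a fixed
batch size, while the cursor variant touches each node once. The saved
cursor is one plausible form of the "memory overhead" mentioned above.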

- Timo


* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
From: David Miller @ 2008-01-17  8:00 UTC (permalink / raw)
  To: elendil; +Cc: jesse.brandeburg, slavon, netdev, linux-kernel
In-Reply-To: <200801170851.56029.elendil@planet.nl>

From: Frans Pop <elendil@planet.nl>
Date: Thu, 17 Jan 2008 08:51:55 +0100

> On Thursday 17 January 2008, David Miller wrote:
> > From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
> >
> > > We spent Wednesday trying to reproduce (without the patch) these issues
> > > without much luck, and have applied the patch cleanly and will continue
> > > testing it.  Given the simplicity of the changes, and the community
> > > testing, I'll give my ack and we will continue testing.
> >
> > You need a slow CPU, and you need to make sure you do actually
> > trigger the TX limiting code there.
> 
> Hmmm. Is a dual core Pentium D 3.20GHz considered slow these days?

No, of course not :-)  I guess it therefore depends upon the load
as well.


* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: David Miller @ 2008-01-17  7:59 UTC (permalink / raw)
  To: timo.teras; +Cc: herbert, hadi, netdev
In-Reply-To: <478F05E7.6070503@iki.fi>

From: Timo_Teräs <timo.teras@iki.fi>
Date: Thu, 17 Jan 2008 09:38:15 +0200

> David Miller wrote:
> > From: Timo_Teräs <timo.teras@iki.fi>
> > Date: Thu, 17 Jan 2008 08:27:14 +0200
> > 
> >> I don't know about netlink. But pfkey works in *BSD too and it is RFC'd.
> >> So I'd say pfkey might be a bit more portable. Though netlink is definitely
> >> more robust and extensive.
> > 
> > The RFCs say absolutely nothing about policy interfaces for AF_KEY,
> > everybody rolls their own in slightly incompatible ways.
> > 
> > It is therefore anything but standardized.
> 
> Yes, there's non-standardized extensions.

You can't implement a keying daemon without policy support, and policy
support is where the "non-standardized extensions" live.

Doing anything other than "life support" bug fixes for AF_KEY is
inappropriate.


* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
From: Frans Pop @ 2008-01-17  7:51 UTC (permalink / raw)
  To: David Miller; +Cc: jesse.brandeburg, slavon, netdev, linux-kernel
In-Reply-To: <20080116.232037.261622584.davem@davemloft.net>

On Thursday 17 January 2008, David Miller wrote:
> From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
>
> > We spent Wednesday trying to reproduce (without the patch) these issues
> > without much luck, and have applied the patch cleanly and will continue
> > testing it.  Given the simplicity of the changes, and the community
> > testing, I'll give my ack and we will continue testing.
>
> You need a slow CPU, and you need to make sure you do actually
> trigger the TX limiting code there.

Hmmm. Is a dual core Pentium D 3.20GHz considered slow these days?

^ permalink raw reply

* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: Timo Teräs @ 2008-01-17  7:38 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, hadi, netdev
In-Reply-To: <20080116.231654.74131878.davem@davemloft.net>

David Miller wrote:
> From: Timo_Teräs <timo.teras@iki.fi>
> Date: Thu, 17 Jan 2008 08:27:14 +0200
> 
>> I don't know about netlink. But pfkey works in *BSD too and it is RFC'd.
>> So I'd say pfkey might be a bit more portable. Though netlink is definitely
>> more robust and extensive.
> 
> The RFCs say absolutely nothing about policy interfaces for AF_KEY,
> everybody rolls their own in slightly incompatible ways.
> 
> It is therefore anything but standardized.

Yes, there are non-standardized extensions. But the point was that there are
other implementations of pfkey. And ipsec-tools racoon is an example of
a widely used application that runs in Linux and *BSD using this API. So
for the time being I'd consider having pfkey fixes as a good thing. This
pfkey dumping problem seems to be affecting many users.

- Timo

^ permalink raw reply

* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
From: David Miller @ 2008-01-17  7:20 UTC (permalink / raw)
  To: jesse.brandeburg; +Cc: slavon, elendil, netdev, linux-kernel
In-Reply-To: <36D9DB17C6DE9E40B059440DB8D95F520432DA91@orsmsx418.amr.corp.intel.com>

From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
Date: Wed, 16 Jan 2008 23:09:47 -0800

> We spent Wednesday trying to reproduce (without the patch) these issues
> without much luck, and have applied the patch cleanly and will continue
> testing it.  Given the simplicity of the changes, and the community
> testing, I'll give my ack and we will continue testing.

You need a slow CPU, and you need to make sure you do actually
trigger the TX limiting code there.

I bet your cpus are fast enough that it simply never triggers.
:-)

> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

Thanks for reviewing, Jesse.

^ permalink raw reply

* Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key
From: David Miller @ 2008-01-17  7:16 UTC (permalink / raw)
  To: timo.teras; +Cc: herbert, hadi, netdev
In-Reply-To: <478EF542.1010702@iki.fi>

From: Timo_Teräs <timo.teras@iki.fi>
Date: Thu, 17 Jan 2008 08:27:14 +0200

> I don't know about netlink. But pfkey works in *BSD too and it is RFC'd.
> So I'd say pfkey might be a bit more portable. Though netlink is definitely
> more robust and extensive.

The RFCs say absolutely nothing about policy interfaces for AF_KEY,
everybody rolls their own in slightly incompatible ways.

It is therefore anything but standardized.

^ permalink raw reply

* RE: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
From: Brandeburg, Jesse @ 2008-01-17  7:09 UTC (permalink / raw)
  To: David Miller; +Cc: slavon, elendil, netdev, linux-kernel
In-Reply-To: <20080115.210214.170759690.davem@davemloft.net>

David Miller wrote:
> From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
> Date: Tue, 15 Jan 2008 13:53:43 -0800
> 
>> The tx code has an "early exit" that tries to limit the amount of tx
>> packets handled in a single poll loop and requires napi or interrupt
>> rescheduling based on the return value from e1000_clean_tx_irq.
> 
> That explains everything, thanks Jesse.
> 
> Ok, here is the patch I'll propose to fix this.  The goal is to make
> it as simple as possible without regressing the thing we were trying
> to fix.
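
The "early exit" described in the quoted text above can be illustrated with a
small user-space sketch. This is not the e1000 code: the names tx_ring,
TX_WORK_LIMIT, clean_tx_ring, poll_once and drain are all hypothetical, and the
limit value is picked only for illustration. The sketch assumes a clean routine
that stops after a fixed amount of work and reports whether it finished, so the
caller knows it has to be scheduled again.

```c
#include <stdbool.h>

/* Hypothetical per-poll cap; not the value used by the real driver. */
#define TX_WORK_LIMIT 4

/* Hypothetical stand-in for a TX descriptor ring. */
struct tx_ring {
	unsigned int pending;	/* completed descriptors not yet reclaimed */
};

/*
 * Reclaim at most TX_WORK_LIMIT descriptors.
 * Returns true if the ring was fully cleaned, false on early exit.
 */
static bool clean_tx_ring(struct tx_ring *ring)
{
	unsigned int done = 0;

	while (ring->pending > 0) {
		if (done == TX_WORK_LIMIT)
			return false;	/* early exit: work remains */
		ring->pending--;
		done++;
	}
	return true;
}

/* One poll pass. Returns true if another pass must be scheduled. */
static bool poll_once(struct tx_ring *ring)
{
	return !clean_tx_ring(ring);
}

/* Keep polling until the ring is clean; returns the number of passes. */
static int drain(struct tx_ring *ring)
{
	int passes = 1;

	while (poll_once(ring))
		passes++;
	return passes;
}
```

With 10 pending descriptors and a limit of 4, drain() needs three passes. If
the return value of the clean routine were ignored, the leftover descriptors
would sit in the ring until something else triggered another pass, which is
the kind of stall the rescheduling requirement is there to avoid. A ring that
never holds more than TX_WORK_LIMIT entries per pass never takes the early
exit at all.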

We spent Wednesday trying to reproduce (without the patch) these issues
without much luck, and have applied the patch cleanly and will continue
testing it.  Given the simplicity of the changes, and the community
testing, I'll give my ack and we will continue testing.

I think we should fix Robert's (unrelated, but in this thread) reported
issue before 2.6.24 final if we can, and I'll look at that tonight and
tomorrow.

Thanks for your work on this, Dave,
 Jesse

Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

^ permalink raw reply

* [PATCH] e1000e Kconfig: remove ref to nonexistent docs
From: Jason Uhlenkott @ 2008-01-17  7:03 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: netdev, Auke Kok

There is no Documentation/networking/e1000e.txt.

Signed-off-by: Jason Uhlenkott <jasonuhl@jasonuhl.org>
Cc: Auke Kok <auke-jan.h.kok@intel.com>
---

Index: linux/drivers/net/Kconfig
===================================================================
--- linux.orig/drivers/net/Kconfig	2008-01-16 17:48:03.041103083 -0800
+++ linux/drivers/net/Kconfig	2008-01-16 23:00:23.647430487 -0800
@@ -1976,9 +1976,6 @@
 
 	  <http://support.intel.com>
 
-	  More specific information on configuring the driver is in
-	  <file:Documentation/networking/e1000e.txt>.
-
 	  To compile this driver as a module, choose M here. The module
 	  will be called e1000e.
 

^ permalink raw reply