[PATCH 0/4] xfrm fixes and flow structurization

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/4] xfrm fixes and flow structurization
@ 2010-03-31 10:17 Timo Teras
  2010-03-31 10:17 ` Timo Teras
                   ` (5 more replies)
  0 siblings, 6 replies; 40+ messages in thread
From: Timo Teras @ 2010-03-31 10:17 UTC (permalink / raw)
  To: netdev; +Cc: Herbert Xu, Timo Teras

These are fixes and cleanups which should be good for merging in.
Patches 1 and 2 are new. Patches 3 and 4 were previously discussed
with Herbert.

Please review and consider committing these.

Thanks.

Timo Teras (4):
  xfrm: increment genid before bumping state genids
  xfrm_user: verify policy direction at XFRM_MSG_POLEXPIRE handler
  xfrm: remove policy lock when accessing policy->walk.dead
  flow: structurize flow cache

 net/core/flow.c        |  223 +++++++++++++++++++++++++----------------------
 net/xfrm/xfrm_policy.c |   31 ++-----
 net/xfrm/xfrm_state.c  |    3 +-
 net/xfrm/xfrm_user.c   |   10 +-
 4 files changed, 135 insertions(+), 132 deletions(-)


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 0/4] xfrm fixes and flow structurization
  2010-03-31 10:17 [PATCH 0/4] xfrm fixes and flow structurization Timo Teras
@ 2010-03-31 10:17 ` Timo Teras
  2010-03-31 10:17 ` [PATCH 1/4] xfrm: increment genid before bumping state genids Timo Teras
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 40+ messages in thread
From: Timo Teras @ 2010-03-31 10:17 UTC (permalink / raw)
  To: netdev; +Cc: Herbert Xu, Timo Teras

These are fixes and cleanups which should be good for merging in.
Patches 1 and 2 are new. Patches 3 and 4 were previously discussed
with Herbert.

Please review and consider committing.

Thanks.

Timo Teras (4):
  xfrm: increment genid before bumping state genids
  xfrm_user: verify policy direction at XFRM_MSG_POLEXPIRE handler
  xfrm: remove policy lock when accessing policy->walk.dead
  flow: structurize flow cache

 net/core/flow.c        |  223 +++++++++++++++++++++++++----------------------
 net/xfrm/xfrm_policy.c |   31 ++-----
 net/xfrm/xfrm_state.c  |    3 +-
 net/xfrm/xfrm_user.c   |   10 +-
 4 files changed, 135 insertions(+), 132 deletions(-)


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 1/4] xfrm: increment genid before bumping state genids
  2010-03-31 10:17 [PATCH 0/4] xfrm fixes and flow structurization Timo Teras
  2010-03-31 10:17 ` Timo Teras
@ 2010-03-31 10:17 ` Timo Teras
  2010-03-31 10:50   ` Herbert Xu
  2010-03-31 10:17 ` [PATCH 2/4] xfrm_user: verify policy direction at XFRM_MSG_POLEXPIRE handler Timo Teras
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 40+ messages in thread
From: Timo Teras @ 2010-03-31 10:17 UTC (permalink / raw)
  To: netdev; +Cc: Herbert Xu, Timo Teras

__xfrm_state_bump_genids() is used to update the genid of all
matching xfrm_state's, so any bundle using the state would get
refreshed with the newly inserted state.

However, since __xfrm_state_bump_genids() is called before the
__xfrm_state_insert() which actually bumps the genid counter,
it is possible that the genid was not updated at all (if there
was no state inserts previously).

This is fixed by moving the genid incrementation to
__xfrm_state_bump_genids() so the older states are guaranteed
to get different genid.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
---
 net/xfrm/xfrm_state.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 17d5b96..b4efc28 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -923,7 +923,7 @@ static void __xfrm_state_insert(struct xfrm_state *x)
 	struct net *net = xs_net(x);
 	unsigned int h;
 
-	x->genid = ++xfrm_state_genid;
+	x->genid = xfrm_state_genid;
 
 	list_add(&x->km.all, &net->xfrm.state_all);
 
@@ -963,6 +963,7 @@ static void __xfrm_state_bump_genids(struct xfrm_state *xnew)
 	unsigned int h;
 	u32 mark = xnew->mark.v & xnew->mark.m;
 
+	xfrm_state_genid++;
 	h = xfrm_dst_hash(net, &xnew->id.daddr, &xnew->props.saddr, reqid, family);
 	hlist_for_each_entry(x, entry, net->xfrm.state_bydst+h, bydst) {
 		if (x->props.family	== family &&
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 2/4] xfrm_user: verify policy direction at XFRM_MSG_POLEXPIRE handler
  2010-03-31 10:17 [PATCH 0/4] xfrm fixes and flow structurization Timo Teras
  2010-03-31 10:17 ` Timo Teras
  2010-03-31 10:17 ` [PATCH 1/4] xfrm: increment genid before bumping state genids Timo Teras
@ 2010-03-31 10:17 ` Timo Teras
  2010-03-31 10:54   ` Herbert Xu
  2010-03-31 10:17 ` [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead Timo Teras
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 40+ messages in thread
From: Timo Teras @ 2010-03-31 10:17 UTC (permalink / raw)
  To: netdev; +Cc: Herbert Xu, Timo Teras

Add missing check for policy direction verification. This is
especially important since without this xfrm_user may end up
deleting per-socket policy which is not allowed.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
---
 net/xfrm/xfrm_user.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 6106b72..da5ba86 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1741,6 +1741,10 @@ static int xfrm_add_pol_expire(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (err)
 		return err;
 
+	err = verify_policy_dir(p->dir);
+	if (err)
+		return err;
+
 	if (p->index)
 		xp = xfrm_policy_byid(net, mark, type, p->dir, p->index, 0, &err);
 	else {
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 10:17 [PATCH 0/4] xfrm fixes and flow structurization Timo Teras
                   ` (2 preceding siblings ...)
  2010-03-31 10:17 ` [PATCH 2/4] xfrm_user: verify policy direction at XFRM_MSG_POLEXPIRE handler Timo Teras
@ 2010-03-31 10:17 ` Timo Teras
  2010-03-31 11:03   ` Herbert Xu
  2010-03-31 10:17 ` [PATCH 4/4] flow: structurize flow cache Timo Teras
  2010-04-02  2:42 ` [PATCH 0/4] xfrm fixes and flow structurization David Miller
  5 siblings, 1 reply; 40+ messages in thread
From: Timo Teras @ 2010-03-31 10:17 UTC (permalink / raw)
  To: netdev; +Cc: Herbert Xu, Timo Teras

All of the code considers ->dead as a hint that the cached policy
needs to get refreshed. The read side can just drop the read lock
without any side effects.

The write side needs to make sure that it's written only exactly
once. Only possible race is at xfrm_policy_kill(). This is fixed
by checking result of __xfrm_policy_unlink() when needed. It will
always succeed if the policy object is looked up from the hash
list (so some checks are removed), but it needs to be checked if
we are trying to unlink policy via a reference (appropriate
checks added).

Since policy->walk.dead is written exactly once, it no longer
needs to be protected with a write lock.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
---
 net/xfrm/xfrm_policy.c |   31 +++++++++----------------------
 net/xfrm/xfrm_user.c   |    6 +-----
 2 files changed, 10 insertions(+), 27 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 843e066..82789cf 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -156,7 +156,7 @@ static void xfrm_policy_timer(unsigned long data)
 
 	read_lock(&xp->lock);
 
-	if (xp->walk.dead)
+	if (unlikely(xp->walk.dead))
 		goto out;
 
 	dir = xfrm_policy_id2dir(xp->index);
@@ -297,17 +297,7 @@ static DECLARE_WORK(xfrm_policy_gc_work, xfrm_policy_gc_task);
 
 static void xfrm_policy_kill(struct xfrm_policy *policy)
 {
-	int dead;
-
-	write_lock_bh(&policy->lock);
-	dead = policy->walk.dead;
 	policy->walk.dead = 1;
-	write_unlock_bh(&policy->lock);
-
-	if (unlikely(dead)) {
-		WARN_ON(1);
-		return;
-	}
 
 	spin_lock_bh(&xfrm_policy_gc_lock);
 	hlist_add_head(&policy->bydst, &xfrm_policy_gc_list);
@@ -776,7 +766,6 @@ xfrm_policy_flush_secctx_check(struct net *net, u8 type, struct xfrm_audit *audi
 int xfrm_policy_flush(struct net *net, u8 type, struct xfrm_audit *audit_info)
 {
 	int dir, err = 0, cnt = 0;
-	struct xfrm_policy *dp;
 
 	write_lock_bh(&xfrm_policy_lock);
 
@@ -794,10 +783,9 @@ int xfrm_policy_flush(struct net *net, u8 type, struct xfrm_audit *audit_info)
 				     &net->xfrm.policy_inexact[dir], bydst) {
 			if (pol->type != type)
 				continue;
-			dp = __xfrm_policy_unlink(pol, dir);
+			__xfrm_policy_unlink(pol, dir);
 			write_unlock_bh(&xfrm_policy_lock);
-			if (dp)
-				cnt++;
+			cnt++;
 
 			xfrm_audit_policy_delete(pol, 1, audit_info->loginuid,
 						 audit_info->sessionid,
@@ -816,10 +804,9 @@ int xfrm_policy_flush(struct net *net, u8 type, struct xfrm_audit *audit_info)
 					     bydst) {
 				if (pol->type != type)
 					continue;
-				dp = __xfrm_policy_unlink(pol, dir);
+				__xfrm_policy_unlink(pol, dir);
 				write_unlock_bh(&xfrm_policy_lock);
-				if (dp)
-					cnt++;
+				cnt++;
 
 				xfrm_audit_policy_delete(pol, 1,
 							 audit_info->loginuid,
@@ -1132,6 +1119,9 @@ int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol)
 		__xfrm_policy_link(pol, XFRM_POLICY_MAX+dir);
 	}
 	if (old_pol)
+		/* Unlinking succeeds always. This is the only function
+		 * allowed to delete or replace socket policy.
+		 */
 		__xfrm_policy_unlink(old_pol, XFRM_POLICY_MAX+dir);
 	write_unlock_bh(&xfrm_policy_lock);
 
@@ -1737,11 +1727,8 @@ restart:
 			goto error;
 		}
 
-		for (pi = 0; pi < npols; pi++) {
-			read_lock_bh(&pols[pi]->lock);
+		for (pi = 0; pi < npols; pi++)
 			pol_dead |= pols[pi]->walk.dead;
-			read_unlock_bh(&pols[pi]->lock);
-		}
 
 		write_lock_bh(&policy->lock);
 		if (unlikely(pol_dead || stale_bundle(dst))) {
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index da5ba86..a267fbd 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1770,13 +1770,9 @@ static int xfrm_add_pol_expire(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (xp == NULL)
 		return -ENOENT;
 
-	read_lock(&xp->lock);
-	if (xp->walk.dead) {
-		read_unlock(&xp->lock);
+	if (unlikely(xp->walk.dead))
 		goto out;
-	}
 
-	read_unlock(&xp->lock);
 	err = 0;
 	if (up->hard) {
 		uid_t loginuid = NETLINK_CB(skb).loginuid;
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 4/4] flow: structurize flow cache
  2010-03-31 10:17 [PATCH 0/4] xfrm fixes and flow structurization Timo Teras
                   ` (3 preceding siblings ...)
  2010-03-31 10:17 ` [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead Timo Teras
@ 2010-03-31 10:17 ` Timo Teras
  2010-03-31 11:21   ` Herbert Xu
  2010-04-02  2:42 ` [PATCH 0/4] xfrm fixes and flow structurization David Miller
  5 siblings, 1 reply; 40+ messages in thread
From: Timo Teras @ 2010-03-31 10:17 UTC (permalink / raw)
  To: netdev; +Cc: Herbert Xu, Timo Teras

Group all per-cpu data to one structure instead of having many
globals. Also prepare the internals so that we can have multiple
instances of the flow cache if needed.

Only the kmem_cache is left as a global as all flow caches share
the same element size, and benefit from using a common cache.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
---
 net/core/flow.c |  223 +++++++++++++++++++++++++++++--------------------------
 1 files changed, 119 insertions(+), 104 deletions(-)

diff --git a/net/core/flow.c b/net/core/flow.c
index 9601587..1d27ca6 100644
--- a/net/core/flow.c
+++ b/net/core/flow.c
@@ -35,104 +35,105 @@ struct flow_cache_entry {
 	atomic_t		*object_ref;
 };
 
-atomic_t flow_cache_genid = ATOMIC_INIT(0);
-
-static u32 flow_hash_shift;
-#define flow_hash_size	(1 << flow_hash_shift)
-static DEFINE_PER_CPU(struct flow_cache_entry **, flow_tables) = { NULL };
-
-#define flow_table(cpu) (per_cpu(flow_tables, cpu))
-
-static struct kmem_cache *flow_cachep __read_mostly;
-
-static int flow_lwm, flow_hwm;
-
-struct flow_percpu_info {
-	int hash_rnd_recalc;
-	u32 hash_rnd;
-	int count;
+struct flow_cache_percpu {
+	struct flow_cache_entry **	hash_table;
+	int				hash_count;
+	u32				hash_rnd;
+	int				hash_rnd_recalc;
+	struct tasklet_struct		flush_tasklet;
 };
-static DEFINE_PER_CPU(struct flow_percpu_info, flow_hash_info) = { 0 };
-
-#define flow_hash_rnd_recalc(cpu) \
-	(per_cpu(flow_hash_info, cpu).hash_rnd_recalc)
-#define flow_hash_rnd(cpu) \
-	(per_cpu(flow_hash_info, cpu).hash_rnd)
-#define flow_count(cpu) \
-	(per_cpu(flow_hash_info, cpu).count)
-
-static struct timer_list flow_hash_rnd_timer;
-
-#define FLOW_HASH_RND_PERIOD	(10 * 60 * HZ)
 
 struct flow_flush_info {
-	atomic_t cpuleft;
-	struct completion completion;
+	struct flow_cache *		cache;
+	atomic_t			cpuleft;
+	struct completion		completion;
 };
-static DEFINE_PER_CPU(struct tasklet_struct, flow_flush_tasklets) = { NULL };
 
-#define flow_flush_tasklet(cpu) (&per_cpu(flow_flush_tasklets, cpu))
+struct flow_cache {
+	u32				hash_shift;
+	unsigned long			order;
+	struct flow_cache_percpu *	percpu;
+	struct notifier_block		hotcpu_notifier;
+	int				low_watermark;
+	int				high_watermark;
+	struct timer_list		rnd_timer;
+};
+
+atomic_t flow_cache_genid = ATOMIC_INIT(0);
+static struct flow_cache flow_cache_global;
+static struct kmem_cache *flow_cachep;
+
+#define flow_cache_hash_size(cache)	(1 << (cache)->hash_shift)
+#define FLOW_HASH_RND_PERIOD		(10 * 60 * HZ)
 
 static void flow_cache_new_hashrnd(unsigned long arg)
 {
+	struct flow_cache *fc = (void *) arg;
 	int i;
 
 	for_each_possible_cpu(i)
-		flow_hash_rnd_recalc(i) = 1;
+		per_cpu_ptr(fc->percpu, i)->hash_rnd_recalc = 1;
 
-	flow_hash_rnd_timer.expires = jiffies + FLOW_HASH_RND_PERIOD;
-	add_timer(&flow_hash_rnd_timer);
+	fc->rnd_timer.expires = jiffies + FLOW_HASH_RND_PERIOD;
+	add_timer(&fc->rnd_timer);
 }
 
-static void flow_entry_kill(int cpu, struct flow_cache_entry *fle)
+static void flow_entry_kill(struct flow_cache *fc,
+			    struct flow_cache_percpu *fcp,
+			    struct flow_cache_entry *fle)
 {
 	if (fle->object)
 		atomic_dec(fle->object_ref);
 	kmem_cache_free(flow_cachep, fle);
-	flow_count(cpu)--;
+	fcp->hash_count--;
 }
 
-static void __flow_cache_shrink(int cpu, int shrink_to)
+static void __flow_cache_shrink(struct flow_cache *fc,
+				struct flow_cache_percpu *fcp,
+				int shrink_to)
 {
 	struct flow_cache_entry *fle, **flp;
 	int i;
 
-	for (i = 0; i < flow_hash_size; i++) {
+	for (i = 0; i < flow_cache_hash_size(fc); i++) {
 		int k = 0;
 
-		flp = &flow_table(cpu)[i];
+		flp = &fcp->hash_table[i];
 		while ((fle = *flp) != NULL && k < shrink_to) {
 			k++;
 			flp = &fle->next;
 		}
 		while ((fle = *flp) != NULL) {
 			*flp = fle->next;
-			flow_entry_kill(cpu, fle);
+			flow_entry_kill(fc, fcp, fle);
 		}
 	}
 }
 
-static void flow_cache_shrink(int cpu)
+static void flow_cache_shrink(struct flow_cache *fc,
+			      struct flow_cache_percpu *fcp)
 {
-	int shrink_to = flow_lwm / flow_hash_size;
+	int shrink_to = fc->low_watermark / flow_cache_hash_size(fc);
 
-	__flow_cache_shrink(cpu, shrink_to);
+	__flow_cache_shrink(fc, fcp, shrink_to);
 }
 
-static void flow_new_hash_rnd(int cpu)
+static void flow_new_hash_rnd(struct flow_cache *fc,
+			      struct flow_cache_percpu *fcp)
 {
-	get_random_bytes(&flow_hash_rnd(cpu), sizeof(u32));
-	flow_hash_rnd_recalc(cpu) = 0;
-
-	__flow_cache_shrink(cpu, 0);
+	get_random_bytes(&fcp->hash_rnd, sizeof(u32));
+	fcp->hash_rnd_recalc = 0;
+	__flow_cache_shrink(fc, fcp, 0);
 }
 
-static u32 flow_hash_code(struct flowi *key, int cpu)
+static u32 flow_hash_code(struct flow_cache *fc,
+			  struct flow_cache_percpu *fcp,
+			  struct flowi *key)
 {
 	u32 *k = (u32 *) key;
 
-	return (jhash2(k, (sizeof(*key) / sizeof(u32)), flow_hash_rnd(cpu)) &
-		(flow_hash_size - 1));
+	return (jhash2(k, (sizeof(*key) / sizeof(u32)), fcp->hash_rnd)
+		& (flow_cache_hash_size(fc) - 1));
 }
 
 #if (BITS_PER_LONG == 64)
@@ -168,24 +169,25 @@ static int flow_key_compare(struct flowi *key1, struct flowi *key2)
 void *flow_cache_lookup(struct net *net, struct flowi *key, u16 family, u8 dir,
 			flow_resolve_t resolver)
 {
+	struct flow_cache *fc = &flow_cache_global;
+	struct flow_cache_percpu *fcp;
 	struct flow_cache_entry *fle, **head;
 	unsigned int hash;
-	int cpu;
 
 	local_bh_disable();
-	cpu = smp_processor_id();
+	fcp = per_cpu_ptr(fc->percpu, smp_processor_id());
 
 	fle = NULL;
 	/* Packet really early in init?  Making flow_cache_init a
 	 * pre-smp initcall would solve this.  --RR */
-	if (!flow_table(cpu))
+	if (!fcp->hash_table)
 		goto nocache;
 
-	if (flow_hash_rnd_recalc(cpu))
-		flow_new_hash_rnd(cpu);
-	hash = flow_hash_code(key, cpu);
+	if (fcp->hash_rnd_recalc)
+		flow_new_hash_rnd(fc, fcp);
+	hash = flow_hash_code(fc, fcp, key);
 
-	head = &flow_table(cpu)[hash];
+	head = &fcp->hash_table[hash];
 	for (fle = *head; fle; fle = fle->next) {
 		if (fle->family == family &&
 		    fle->dir == dir &&
@@ -204,8 +206,8 @@ void *flow_cache_lookup(struct net *net, struct flowi *key, u16 family, u8 dir,
 	}
 
 	if (!fle) {
-		if (flow_count(cpu) > flow_hwm)
-			flow_cache_shrink(cpu);
+		if (fcp->hash_count > fc->high_watermark)
+			flow_cache_shrink(fc, fcp);
 
 		fle = kmem_cache_alloc(flow_cachep, GFP_ATOMIC);
 		if (fle) {
@@ -215,7 +217,7 @@ void *flow_cache_lookup(struct net *net, struct flowi *key, u16 family, u8 dir,
 			fle->dir = dir;
 			memcpy(&fle->key, key, sizeof(*key));
 			fle->object = NULL;
-			flow_count(cpu)++;
+			fcp->hash_count++;
 		}
 	}
 
@@ -249,14 +251,15 @@ nocache:
 static void flow_cache_flush_tasklet(unsigned long data)
 {
 	struct flow_flush_info *info = (void *)data;
+	struct flow_cache *fc = info->cache;
+	struct flow_cache_percpu *fcp;
 	int i;
-	int cpu;
 
-	cpu = smp_processor_id();
-	for (i = 0; i < flow_hash_size; i++) {
+	fcp = per_cpu_ptr(fc->percpu, smp_processor_id());
+	for (i = 0; i < flow_cache_hash_size(fc); i++) {
 		struct flow_cache_entry *fle;
 
-		fle = flow_table(cpu)[i];
+		fle = fcp->hash_table[i];
 		for (; fle; fle = fle->next) {
 			unsigned genid = atomic_read(&flow_cache_genid);
 
@@ -272,7 +275,6 @@ static void flow_cache_flush_tasklet(unsigned long data)
 		complete(&info->completion);
 }
 
-static void flow_cache_flush_per_cpu(void *) __attribute__((__unused__));
 static void flow_cache_flush_per_cpu(void *data)
 {
 	struct flow_flush_info *info = data;
@@ -280,8 +282,7 @@ static void flow_cache_flush_per_cpu(void *data)
 	struct tasklet_struct *tasklet;
 
 	cpu = smp_processor_id();
-
-	tasklet = flow_flush_tasklet(cpu);
+	tasklet = &per_cpu_ptr(info->cache->percpu, cpu)->flush_tasklet;
 	tasklet->data = (unsigned long)info;
 	tasklet_schedule(tasklet);
 }
@@ -294,6 +295,7 @@ void flow_cache_flush(void)
 	/* Don't want cpus going down or up during this. */
 	get_online_cpus();
 	mutex_lock(&flow_flush_sem);
+	info.cache = &flow_cache_global;
 	atomic_set(&info.cpuleft, num_online_cpus());
 	init_completion(&info.completion);
 
@@ -307,62 +309,75 @@ void flow_cache_flush(void)
 	put_online_cpus();
 }
 
-static void __init flow_cache_cpu_prepare(int cpu)
+static void __init flow_cache_cpu_prepare(struct flow_cache *fc,
+					  struct flow_cache_percpu *fcp)
 {
-	struct tasklet_struct *tasklet;
-	unsigned long order;
-
-	for (order = 0;
-	     (PAGE_SIZE << order) <
-		     (sizeof(struct flow_cache_entry *)*flow_hash_size);
-	     order++)
-		/* NOTHING */;
-
-	flow_table(cpu) = (struct flow_cache_entry **)
-		__get_free_pages(GFP_KERNEL|__GFP_ZERO, order);
-	if (!flow_table(cpu))
-		panic("NET: failed to allocate flow cache order %lu\n", order);
-
-	flow_hash_rnd_recalc(cpu) = 1;
-	flow_count(cpu) = 0;
-
-	tasklet = flow_flush_tasklet(cpu);
-	tasklet_init(tasklet, flow_cache_flush_tasklet, 0);
+	fcp->hash_table = (struct flow_cache_entry **)
+		__get_free_pages(GFP_KERNEL|__GFP_ZERO, fc->order);
+	if (!fcp->hash_table)
+		panic("NET: failed to allocate flow cache order %lu\n", fc->order);
+
+	fcp->hash_rnd_recalc = 1;
+	fcp->hash_count = 0;
+	tasklet_init(&fcp->flush_tasklet, flow_cache_flush_tasklet, 0);
 }
 
 static int flow_cache_cpu(struct notifier_block *nfb,
 			  unsigned long action,
 			  void *hcpu)
 {
+	struct flow_cache *fc = container_of(nfb, struct flow_cache, hotcpu_notifier);
+	int cpu = (unsigned long) hcpu;
+	struct flow_cache_percpu *fcp = per_cpu_ptr(fc->percpu, cpu);
+
 	if (action == CPU_DEAD || action == CPU_DEAD_FROZEN)
-		__flow_cache_shrink((unsigned long)hcpu, 0);
+		__flow_cache_shrink(fc, fcp, 0);
 	return NOTIFY_OK;
 }
 
-static int __init flow_cache_init(void)
+static int flow_cache_init(struct flow_cache *fc)
 {
+	unsigned long order;
 	int i;
 
-	flow_cachep = kmem_cache_create("flow_cache",
-					sizeof(struct flow_cache_entry),
-					0, SLAB_PANIC,
-					NULL);
-	flow_hash_shift = 10;
-	flow_lwm = 2 * flow_hash_size;
-	flow_hwm = 4 * flow_hash_size;
+	fc->hash_shift = 10;
+	fc->low_watermark = 2 * flow_cache_hash_size(fc);
+	fc->high_watermark = 4 * flow_cache_hash_size(fc);
+
+	for (order = 0;
+	     (PAGE_SIZE << order) <
+		     (sizeof(struct flow_cache_entry *)*flow_cache_hash_size(fc));
+	     order++)
+		/* NOTHING */;
+	fc->order = order;
+	fc->percpu = alloc_percpu(struct flow_cache_percpu);
 
-	setup_timer(&flow_hash_rnd_timer, flow_cache_new_hashrnd, 0);
-	flow_hash_rnd_timer.expires = jiffies + FLOW_HASH_RND_PERIOD;
-	add_timer(&flow_hash_rnd_timer);
+	setup_timer(&fc->rnd_timer, flow_cache_new_hashrnd,
+		    (unsigned long) fc);
+	fc->rnd_timer.expires = jiffies + FLOW_HASH_RND_PERIOD;
+	add_timer(&fc->rnd_timer);
 
 	for_each_possible_cpu(i)
-		flow_cache_cpu_prepare(i);
+		flow_cache_cpu_prepare(fc, per_cpu_ptr(fc->percpu, i));
+
+	fc->hotcpu_notifier = (struct notifier_block){
+		.notifier_call = flow_cache_cpu,
+	};
+	register_hotcpu_notifier(&fc->hotcpu_notifier);
 
-	hotcpu_notifier(flow_cache_cpu, 0);
 	return 0;
 }
 
-module_init(flow_cache_init);
+static int __init flow_cache_init_global(void)
+{
+	flow_cachep = kmem_cache_create("flow_cache",
+					sizeof(struct flow_cache_entry),
+					0, SLAB_PANIC, NULL);
+
+	return flow_cache_init(&flow_cache_global);
+}
+
+module_init(flow_cache_init_global);
 
 EXPORT_SYMBOL(flow_cache_genid);
 EXPORT_SYMBOL(flow_cache_lookup);
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] xfrm: increment genid before bumping state genids
  2010-03-31 10:17 ` [PATCH 1/4] xfrm: increment genid before bumping state genids Timo Teras
@ 2010-03-31 10:50   ` Herbert Xu
  2010-03-31 10:55     ` Timo Teräs
  0 siblings, 1 reply; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 10:50 UTC (permalink / raw)
  To: Timo Teras; +Cc: netdev

On Wed, Mar 31, 2010 at 01:17:03PM +0300, Timo Teras wrote:
> __xfrm_state_bump_genids() is used to update the genid of all
> matching xfrm_state's, so any bundle using the state would get
> refreshed with the newly inserted state.
> 
> However, since __xfrm_state_bump_genids() is called before the
> __xfrm_state_insert() which actually bumps the genid counter,
> it is possible that the genid was not updated at all (if there
> was no state inserts previously).
> 
> This is fixed by moving the genid incrementation to
> __xfrm_state_bump_genids() so the older states are guaranteed
> to get different genid.
> 
> Signed-off-by: Timo Teras <timo.teras@iki.fi>

It would appear that not all xfrm_state_insert calls are preceded
by xfrm_state_bump_genids so this patch isn't correct.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] xfrm_user: verify policy direction at XFRM_MSG_POLEXPIRE handler
  2010-03-31 10:17 ` [PATCH 2/4] xfrm_user: verify policy direction at XFRM_MSG_POLEXPIRE handler Timo Teras
@ 2010-03-31 10:54   ` Herbert Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 10:54 UTC (permalink / raw)
  To: Timo Teras; +Cc: netdev

On Wed, Mar 31, 2010 at 01:17:04PM +0300, Timo Teras wrote:
> Add missing check for policy direction verification. This is
> especially important since without this xfrm_user may end up
> deleting per-socket policy which is not allowed.
> 
> Signed-off-by: Timo Teras <timo.teras@iki.fi>

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] xfrm: increment genid before bumping state genids
  2010-03-31 10:50   ` Herbert Xu
@ 2010-03-31 10:55     ` Timo Teräs
  2010-03-31 11:01       ` Timo Teräs
  0 siblings, 1 reply; 40+ messages in thread
From: Timo Teräs @ 2010-03-31 10:55 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev

Herbert Xu wrote:
> On Wed, Mar 31, 2010 at 01:17:03PM +0300, Timo Teras wrote:
>> __xfrm_state_bump_genids() is used to update the genid of all
>> matching xfrm_state's, so any bundle using the state would get
>> refreshed with the newly inserted state.
>>
>> However, since __xfrm_state_bump_genids() is called before the
>> __xfrm_state_insert() which actually bumps the genid counter,
>> it is possible that the genid was not updated at all (if there
>> was no state inserts previously).
>>
>> This is fixed by moving the genid incrementation to
>> __xfrm_state_bump_genids() so the older states are guaranteed
>> to get different genid.
>>
>> Signed-off-by: Timo Teras <timo.teras@iki.fi>
> 
> It would appear that not all xfrm_state_insert calls are preceded
> by xfrm_state_bump_genids so this patch isn't correct.

Yes, I noticed that there's one. But __xfrm_state_insert() is
called to replace an acquire in that case. Acquires are never
used in bundle, so this is good.

But maybe it'd be more explicit if the genid increment is done
before each __xfrm_state_insert()?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] xfrm: increment genid before bumping state genids
  2010-03-31 10:55     ` Timo Teräs
@ 2010-03-31 11:01       ` Timo Teräs
  2010-03-31 11:19         ` Herbert Xu
  0 siblings, 1 reply; 40+ messages in thread
From: Timo Teräs @ 2010-03-31 11:01 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev

Timo Teräs wrote:
> Herbert Xu wrote:
>> On Wed, Mar 31, 2010 at 01:17:03PM +0300, Timo Teras wrote:
>>> __xfrm_state_bump_genids() is used to update the genid of all
>>> matching xfrm_state's, so any bundle using the state would get
>>> refreshed with the newly inserted state.
>>>
>>> However, since __xfrm_state_bump_genids() is called before the
>>> __xfrm_state_insert() which actually bumps the genid counter,
>>> it is possible that the genid was not updated at all (if there
>>> was no state inserts previously).
>>>
>>> This is fixed by moving the genid incrementation to
>>> __xfrm_state_bump_genids() so the older states are guaranteed
>>> to get different genid.
>>>
>>> Signed-off-by: Timo Teras <timo.teras@iki.fi>
>>
>> It would appear that not all xfrm_state_insert calls are preceded
>> by xfrm_state_bump_genids so this patch isn't correct.
> 
> Yes, I noticed that there's one. But __xfrm_state_insert() is
> called to replace an acquire in that case. Acquires are never
> used in bundle, so this is good.
> 
> But maybe it'd be more explicit if the genid increment is done
> before each __xfrm_state_insert()?

Actually. Search for xfrm_state_genid in xfrm_state.c. It's only
used in two places: __xfrm_state_bumb_genid() and _insert().

The only case when it needs to get incremented is in the bump
function. If we are adding a state for the first time, there's
no need to bump the genid as it's per-matching state. The actual
work to invalidate states is done in the bump function, so it's
the only place that needs to increment it.

If any other xfrm_state_insert place needs to invalidate old
states it needs an additional bumping call. So the bumping function
is the right place to increment the genid.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 10:17 ` [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead Timo Teras
@ 2010-03-31 11:03   ` Herbert Xu
  2010-03-31 13:06     ` jamal
  0 siblings, 1 reply; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 11:03 UTC (permalink / raw)
  To: Timo Teras; +Cc: netdev, Jamal Hadi Salim, David S. Miller

On Wed, Mar 31, 2010 at 01:17:05PM +0300, Timo Teras wrote:
> All of the code considers ->dead as a hint that the cached policy
> needs to get refreshed. The read side can just drop the read lock
> without any side effects.
> 
> The write side needs to make sure that it's written only exactly
> once. Only possible race is at xfrm_policy_kill(). This is fixed
> by checking result of __xfrm_policy_unlink() when needed. It will
> always succeed if the policy object is looked up from the hash
> list (so some checks are removed), but it needs to be checked if
> we are trying to unlink policy via a reference (appropriate
> checks added).
> 
> Since policy->walk.dead is written exactly once, it no longer
> needs to be protected with a write lock.
> 
> Signed-off-by: Timo Teras <timo.teras@iki.fi>

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

> @@ -794,10 +783,9 @@ int xfrm_policy_flush(struct net *net, u8 type, struct xfrm_audit *audit_info)
>  				     &net->xfrm.policy_inexact[dir], bydst) {
>  			if (pol->type != type)
>  				continue;
> -			dp = __xfrm_policy_unlink(pol, dir);
> +			__xfrm_policy_unlink(pol, dir);
>  			write_unlock_bh(&xfrm_policy_lock);
> -			if (dp)
> -				cnt++;
> +			cnt++;
>  
>  			xfrm_audit_policy_delete(pol, 1, audit_info->loginuid,
>  						 audit_info->sessionid,

I was intrigued to find that my local source never had this dp
stuff.

It seems that this was added recently:

commit 2f1eb65f366b81aa3c22c31e6e8db26168777ec5
Author: Jamal Hadi Salim <hadi@cyberus.ca>
Date:   Fri Feb 19 02:00:42 2010 +0000

    xfrm: Flushing empty SPD generates false events

    To see the effect make sure you have an empty SPD.
    On window1 "ip xfrm mon" and on window2 issue "ip xfrm policy flush"
    You get prompt back in window2 and you see the flush event on window1.
    With this fix, you still get prompt on window1 but no event on window2.

    Thanks to Alexey Dobriyan for finding a bug in earlier version
    when using pfkey to do the flushing.

    Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
    Signed-off-by: David S. Miller <davem@davemloft.net>

This seems to be bogus to me.  Just because the DB was empty
before the flush doesn't mean that the flush didn't happen.

We should revert this patch in its entirety.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] xfrm: increment genid before bumping state genids
  2010-03-31 11:01       ` Timo Teräs
@ 2010-03-31 11:19         ` Herbert Xu
  2010-03-31 11:24           ` Timo Teräs
  0 siblings, 1 reply; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 11:19 UTC (permalink / raw)
  To: Timo Teräs; +Cc: netdev, David S. Miller

On Wed, Mar 31, 2010 at 02:01:03PM +0300, Timo Teräs wrote:
>
> If any other xfrm_state_insert place needs to invalidate old
> states it needs an additional bumping call. So the bumping function
> is the right place to increment the genid.

Right.  In fact, this thing doesn't need to be incremented on every
insert.  How about this patch?

xfrm: Remove xfrm_state_genid

The xfrm state genid only needs to be matched against the copy
saved in xfrm_dst.  So we don't need a global genid at all.  In
fact, we don't even need to initialise it.

Based on observation by Timo Teräs.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 17d5b96..71f8f33 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -37,7 +37,6 @@
 static DEFINE_SPINLOCK(xfrm_state_lock);
 
 static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024;
-static unsigned int xfrm_state_genid;
 
 static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family);
 static void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo);
@@ -923,8 +922,6 @@ static void __xfrm_state_insert(struct xfrm_state *x)
 	struct net *net = xs_net(x);
 	unsigned int h;
 
-	x->genid = ++xfrm_state_genid;
-
 	list_add(&x->km.all, &net->xfrm.state_all);
 
 	h = xfrm_dst_hash(net, &x->id.daddr, &x->props.saddr,
@@ -970,7 +967,7 @@ static void __xfrm_state_bump_genids(struct xfrm_state *xnew)
 		    (mark & x->mark.m) == x->mark.v &&
 		    !xfrm_addr_cmp(&x->id.daddr, &xnew->id.daddr, family) &&
 		    !xfrm_addr_cmp(&x->props.saddr, &xnew->props.saddr, family))
-			x->genid = xfrm_state_genid;
+			x->genid++;
 	}
 }
 
Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 4/4] flow: structurize flow cache
  2010-03-31 10:17 ` [PATCH 4/4] flow: structurize flow cache Timo Teras
@ 2010-03-31 11:21   ` Herbert Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 11:21 UTC (permalink / raw)
  To: Timo Teras; +Cc: netdev

On Wed, Mar 31, 2010 at 01:17:06PM +0300, Timo Teras wrote:
> Group all per-cpu data to one structure instead of having many
> globals. Also prepare the internals so that we can have multiple
> instances of the flow cache if needed.
> 
> Only the kmem_cache is left as a global as all flow caches share
> the same element size, and benefit from using a common cache.
> 
> Signed-off-by: Timo Teras <timo.teras@iki.fi>

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] xfrm: increment genid before bumping state genids
  2010-03-31 11:19         ` Herbert Xu
@ 2010-03-31 11:24           ` Timo Teräs
  0 siblings, 0 replies; 40+ messages in thread
From: Timo Teräs @ 2010-03-31 11:24 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, David S. Miller

Herbert Xu wrote:
> On Wed, Mar 31, 2010 at 02:01:03PM +0300, Timo Teräs wrote:
>> If any other xfrm_state_insert place needs to invalidate old
>> states it needs an additional bumping call. So the bumping function
>> is the right place to increment the genid.
> 
> Right.  In fact, this thing doesn't need to be incremented on every
> insert.  How about this patch?
> 
> xfrm: Remove xfrm_state_genid
> 
> The xfrm state genid only needs to be matched against the copy
> saved in xfrm_dst.  So we don't need a global genid at all.  In
> fact, we don't even need to initialise it.
> 
> Based on observation by Timo Teräs.

Ah, yes. This is indeed better fix.

Thanks.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 11:03   ` Herbert Xu
@ 2010-03-31 13:06     ` jamal
  2010-03-31 13:11       ` Herbert Xu
  0 siblings, 1 reply; 40+ messages in thread
From: jamal @ 2010-03-31 13:06 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev, David S. Miller

On Wed, 2010-03-31 at 19:03 +0800, Herbert Xu wrote:

> This seems to be bogus to me.  Just because the DB was empty
> before the flush doesn't mean that the flush didn't happen.

Herbert, If a tree falls in a forest and no one is around to hear it,
does it make a sound? ;->
A flush event is meant to be a signal to user space that what
was once a non-empty table is now empty. 
This is a consistent definition of the semantics everywhere tables
are flushed (not just in Linux)..

What makes the SPD and SAD speacial?

cheers,
jamal

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:06     ` jamal
@ 2010-03-31 13:11       ` Herbert Xu
  2010-03-31 13:26         ` Herbert Xu
  2010-03-31 13:28         ` jamal
  0 siblings, 2 replies; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 13:11 UTC (permalink / raw)
  To: jamal; +Cc: Timo Teras, netdev, David S. Miller

On Wed, Mar 31, 2010 at 09:06:13AM -0400, jamal wrote:
> On Wed, 2010-03-31 at 19:03 +0800, Herbert Xu wrote:
> 
> > This seems to be bogus to me.  Just because the DB was empty
> > before the flush doesn't mean that the flush didn't happen.
> 
> Herbert, If a tree falls in a forest and no one is around to hear it,
> does it make a sound? ;->
> A flush event is meant to be a signal to user space that what
> was once a non-empty table is now empty. 

I disagree.  A flush event is a signal that someone has sent a
flush command.  In any case we've had this semantics for years
and I haven't heard a good reason why this should be changed.

> This is a consistent definition of the semantics everywhere tables
> are flushed (not just in Linux)..

Please give specific examples in the kernel.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:11       ` Herbert Xu
@ 2010-03-31 13:26         ` Herbert Xu
  2010-03-31 13:32           ` jamal
  2010-03-31 13:28         ` jamal
  1 sibling, 1 reply; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 13:26 UTC (permalink / raw)
  To: jamal; +Cc: Timo Teras, netdev, David S. Miller

On Wed, Mar 31, 2010 at 09:11:31PM +0800, Herbert Xu wrote:
> 
> > This is a consistent definition of the semantics everywhere tables
> > are flushed (not just in Linux)..
> 
> Please give specific examples in the kernel.

In fact the previous behaviour is also consistent with RFC2367:

3.1.9 SADB_FLUSH

   The SADB_FLUSH message causes the kernel to delete all entries in its
   key table for a certain sadb_msg_satype.  Only the base header is
   required for a flush message.  If sadb_msg_satype is filled in with a
   specific value, only associations of that type are deleted.  If it is
   filled in with SADB_SATYPE_UNSPEC, ALL associations are deleted.

     The messaging behavior for SADB_FLUSH is:

           Send an SADB_FLUSH message from a user process to the kernel.

           <base>

           The kernel will return an SADB_FLUSH message to all listening
           sockets.

           <base>

           The reply message happens only after the actual flushing
           of security associations has been attempted.

There is no special treatment for an empty DB.

Please send a revert.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:11       ` Herbert Xu
  2010-03-31 13:26         ` Herbert Xu
@ 2010-03-31 13:28         ` jamal
  2010-03-31 13:53           ` Herbert Xu
  1 sibling, 1 reply; 40+ messages in thread
From: jamal @ 2010-03-31 13:28 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev, David S. Miller

On Wed, 2010-03-31 at 21:11 +0800, Herbert Xu wrote:

> I disagree.  A flush event is a signal that someone has sent a
> flush command.  

Sorry - I respectfully disagree.

> In any case we've had this semantics for years
> and I haven't heard a good reason why this should be changed.

It generates unnecessary noise and it is a deviation like i mentioned.

> > This is a consistent definition of the semantics everywhere tables
> > are flushed (not just in Linux)..
> 
> Please give specific examples in the kernel.

Something i can do safely right now without messing my connection; 
Issue iproute commands in one window, observe events in another

-sudo ip route add 192.168.11.100 dev eth0 table 15
generates an event
-sudo ip route flush table 15
generates an event
-sudo ip route flush table 15
No event

But pick anything else in the other netlink knowledgeable subsystem
and youd see similar behavior.

If there was an app depending on this behavior - thats a separate reason
(but thats not the arguement you are making).

cheers,
jamal


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:26         ` Herbert Xu
@ 2010-03-31 13:32           ` jamal
  2010-03-31 13:39             ` jamal
  0 siblings, 1 reply; 40+ messages in thread
From: jamal @ 2010-03-31 13:32 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev, David S. Miller

On Wed, 2010-03-31 at 21:26 +0800, Herbert Xu wrote:


> In fact the previous behaviour is also consistent with RFC2367:

I did not touch pfkey. That behavior remains there. 

cheers,
jamal



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:32           ` jamal
@ 2010-03-31 13:39             ` jamal
  2010-03-31 13:41               ` jamal
  2010-03-31 13:55               ` Herbert Xu
  0 siblings, 2 replies; 40+ messages in thread
From: jamal @ 2010-03-31 13:39 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev, David S. Miller

On Wed, 2010-03-31 at 09:32 -0400, jamal wrote:

> I did not touch pfkey. That behavior remains there. 

Sorry - I lied. I did touch pfkey. Here was my reasoning.

RFC 2367 says flushing behavior should be:
1) user space -> kernel: flush
2) kernel: flush
3) kernel -> user space: flush event to ALL listeners

This is not realistic today in the presence of selinux policies
which may reject the flush etc. So we make the sequence become:
1) user space -> kernel: flush
2) kernel: flush
3) kernel -> user space: flush response to originater from #1
4) if there were no errors then:
kernel -> user space: flush event to ALL listeners

This was in the logs.

cheers,
jamal

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:39             ` jamal
@ 2010-03-31 13:41               ` jamal
  2010-03-31 13:56                 ` Herbert Xu
  2010-03-31 13:55               ` Herbert Xu
  1 sibling, 1 reply; 40+ messages in thread
From: jamal @ 2010-03-31 13:41 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev, David S. Miller


On Wed, 2010-03-31 at 09:39 -0400, jamal wrote:
> 
> On Wed, 2010-03-31 at 09:32 -0400, jamal wrote:
> 
> > I did not touch pfkey. That behavior remains there. 
> 
> Sorry - I lied. I did touch pfkey. Here was my reasoning.

And what I meant by not touching is that "the behavior there
remains as it was before"

cheers,
jamal

> RFC 2367 says flushing behavior should be:
> 1) user space -> kernel: flush
> 2) kernel: flush
> 3) kernel -> user space: flush event to ALL listeners
> 
> This is not realistic today in the presence of selinux policies
> which may reject the flush etc. So we make the sequence become:
> 1) user space -> kernel: flush
> 2) kernel: flush
> 3) kernel -> user space: flush response to originater from #1
> 4) if there were no errors then:
> kernel -> user space: flush event to ALL listeners
> 
> This was in the logs.
> 
> cheers,
> jamal


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:28         ` jamal
@ 2010-03-31 13:53           ` Herbert Xu
  2010-03-31 16:41             ` Patrick McHardy
  2010-03-31 20:52             ` David Miller
  0 siblings, 2 replies; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 13:53 UTC (permalink / raw)
  To: jamal; +Cc: Timo Teras, netdev, David S. Miller

On Wed, Mar 31, 2010 at 09:28:12AM -0400, jamal wrote:
> 
> -sudo ip route add 192.168.11.100 dev eth0 table 15
> generates an event
> -sudo ip route flush table 15
> generates an event
> -sudo ip route flush table 15
> No event

That's completely different.  We don't have a route flush event,
instead we're sending route delete events.  That's why when the
table is empty you get no events.

If we had a route flush event then it would behave exactly the
same.

BTW you've also made xfrm_state_flush inconsistent with respect
to xfrm_policy_flush.

Dave, please revert this patch.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:39             ` jamal
  2010-03-31 13:41               ` jamal
@ 2010-03-31 13:55               ` Herbert Xu
  2010-03-31 14:12                 ` jamal
  1 sibling, 1 reply; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 13:55 UTC (permalink / raw)
  To: jamal; +Cc: Timo Teras, netdev, David S. Miller

On Wed, Mar 31, 2010 at 09:39:55AM -0400, jamal wrote:
> 
> This is not realistic today in the presence of selinux policies
> which may reject the flush etc. So we make the sequence become:
> 1) user space -> kernel: flush
> 2) kernel: flush
> 3) kernel -> user space: flush response to originater from #1
> 4) if there were no errors then:
> kernel -> user space: flush event to ALL listeners

Eliding the notification if SELinux says so is fine, but eliding
it because the table is empty is wrong.

The flush did not fail just because the table was empty to begin
with.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:41               ` jamal
@ 2010-03-31 13:56                 ` Herbert Xu
  2010-03-31 14:15                   ` jamal
  2010-03-31 20:54                   ` David Miller
  0 siblings, 2 replies; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 13:56 UTC (permalink / raw)
  To: jamal; +Cc: Timo Teras, netdev, David S. Miller

On Wed, Mar 31, 2010 at 09:41:23AM -0400, jamal wrote:
> 
> On Wed, 2010-03-31 at 09:39 -0400, jamal wrote:
> > 
> > On Wed, 2010-03-31 at 09:32 -0400, jamal wrote:
> > 
> > > I did not touch pfkey. That behavior remains there. 
> > 
> > Sorry - I lied. I did touch pfkey. Here was my reasoning.
> 
> And what I meant by not touching is that "the behavior there
> remains as it was before"

No you've changed it.  PF_KEY will no longer notify if the policy
table is empty.

This is inconsistent with the behaviour of SADB flushes, and the
spirit of RFC2367.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:55               ` Herbert Xu
@ 2010-03-31 14:12                 ` jamal
  2010-03-31 14:15                   ` Herbert Xu
  0 siblings, 1 reply; 40+ messages in thread
From: jamal @ 2010-03-31 14:12 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev, David S. Miller

On Wed, 2010-03-31 at 21:55 +0800, Herbert Xu wrote:

> Eliding the notification if SELinux says so is fine, but eliding
> it because the table is empty is wrong.
> 
> The flush did not fail just because the table was empty to begin
> with.

Like i said i didnt touch the behavior except for the selinux case
(which sounds very reasonable). I believe there maybe historical legacy
reasons for that semantic in pfkey. 
Can you point to something in the kernel (or anywhere else) that behaves
like this on table flushing? Actually if there was an app that depended
on netlink flush being exposed on empty table - then i think theres
reason for a revert.
Other than that i will say again: i respectfully disagree.

cheers,
jamal

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:56                 ` Herbert Xu
@ 2010-03-31 14:15                   ` jamal
  2010-03-31 20:54                   ` David Miller
  1 sibling, 0 replies; 40+ messages in thread
From: jamal @ 2010-03-31 14:15 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev, David S. Miller

On Wed, 2010-03-31 at 21:56 +0800, Herbert Xu wrote:

> 
> No you've changed it.  PF_KEY will no longer notify if the policy
> table is empty.
> 
> This is inconsistent with the behaviour of SADB flushes, and the
> spirit of RFC2367.

I did not mean to change it for pfkey. I do believe there are apps that
need it.
I will run some tests and if it breaks - I will send a patch.

cheers,
jamal


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 14:12                 ` jamal
@ 2010-03-31 14:15                   ` Herbert Xu
  2010-03-31 14:24                     ` jamal
  2010-03-31 20:57                     ` David Miller
  0 siblings, 2 replies; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 14:15 UTC (permalink / raw)
  To: jamal; +Cc: Timo Teras, netdev, David S. Miller

On Wed, Mar 31, 2010 at 10:12:48AM -0400, jamal wrote:
> On Wed, 2010-03-31 at 21:55 +0800, Herbert Xu wrote:
> 
> > Eliding the notification if SELinux says so is fine, but eliding
> > it because the table is empty is wrong.
> > 
> > The flush did not fail just because the table was empty to begin
> > with.
> 
> Like i said i didnt touch the behavior except for the selinux case
> (which sounds very reasonable). I believe there maybe historical legacy
> reasons for that semantic in pfkey. 
> Can you point to something in the kernel (or anywhere else) that behaves
> like this on table flushing? Actually if there was an app that depended
> on netlink flush being exposed on empty table - then i think theres
> reason for a revert.
> Other than that i will say again: i respectfully disagree.

OK I give up.

Dave can keep or revert this as he likes.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 14:15                   ` Herbert Xu
@ 2010-03-31 14:24                     ` jamal
  2010-03-31 14:29                       ` Herbert Xu
  2010-03-31 20:57                     ` David Miller
  1 sibling, 1 reply; 40+ messages in thread
From: jamal @ 2010-03-31 14:24 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev, David S. Miller

On Wed, 2010-03-31 at 22:15 +0800, Herbert Xu wrote:

> 
> OK I give up.
> 
> Dave can keep or revert this as he likes.

Ok - I will await to see what Dave thinks before i bother doing whatever
setup. What we have here is a philosophical difference.
Herbert note: i realize this is unfair for me to say given you were
likely very busy: I respect your opinion and knowledge and did CC you on
all the many mails when i was RFCing this and when i sent the patches. 

cheers,
jamal

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 14:24                     ` jamal
@ 2010-03-31 14:29                       ` Herbert Xu
  2010-03-31 14:38                         ` jamal
  0 siblings, 1 reply; 40+ messages in thread
From: Herbert Xu @ 2010-03-31 14:29 UTC (permalink / raw)
  To: jamal; +Cc: Timo Teras, netdev, David S. Miller

On Wed, Mar 31, 2010 at 10:24:38AM -0400, jamal wrote:
> On Wed, 2010-03-31 at 22:15 +0800, Herbert Xu wrote:
> 
> > 
> > OK I give up.
> > 
> > Dave can keep or revert this as he likes.
> 
> Ok - I will await to see what Dave thinks before i bother doing whatever
> setup. What we have here is a philosophical difference.
> Herbert note: i realize this is unfair for me to say given you were
> likely very busy: I respect your opinion and knowledge and did CC you on
> all the many mails when i was RFCing this and when i sent the patches. 

Yes you did cc me.  Unfortunately I was rather busy with some other
stuff at the time.  Had I read your patch back then I would've said
something :)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 14:29                       ` Herbert Xu
@ 2010-03-31 14:38                         ` jamal
  0 siblings, 0 replies; 40+ messages in thread
From: jamal @ 2010-03-31 14:38 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev, David S. Miller

On Wed, 2010-03-31 at 22:29 +0800, Herbert Xu wrote:

> Yes you did cc me.  Unfortunately I was rather busy with some other
> stuff at the time.  Had I read your patch back then I would've said
> something :)

makes sense ;-> And it wouldnt have been like the first (or the last)
time you and i have had this heated discussions;->
Lets just wait to hear what Dave says - I dont want make his life more
difficult than it is; so no hard feeling if he decides to revert.

cheers,
jamal

PS:- since you woke me up and i know you are awake, 
i am going to get rid of a jetlag by sending
probably another controversial patch. Hmmm.. where to start.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:53           ` Herbert Xu
@ 2010-03-31 16:41             ` Patrick McHardy
  2010-03-31 17:47               ` jamal
  2010-03-31 20:52             ` David Miller
  1 sibling, 1 reply; 40+ messages in thread
From: Patrick McHardy @ 2010-03-31 16:41 UTC (permalink / raw)
  To: Herbert Xu; +Cc: jamal, Timo Teras, netdev, David S. Miller

Herbert Xu wrote:
> On Wed, Mar 31, 2010 at 09:28:12AM -0400, jamal wrote:
>> -sudo ip route add 192.168.11.100 dev eth0 table 15
>> generates an event
>> -sudo ip route flush table 15
>> generates an event
>> -sudo ip route flush table 15
>> No event
> 
> That's completely different.  We don't have a route flush event,
> instead we're sending route delete events.  That's why when the
> table is empty you get no events.
> 
> If we had a route flush event then it would behave exactly the
> same.

I was just about to say the exact same thing when I noticed your
email. I agree with Herbert, the flush notification indicates that
the table is now empty, independant of its previous state.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 16:41             ` Patrick McHardy
@ 2010-03-31 17:47               ` jamal
  2010-04-01 11:24                 ` Patrick McHardy
  0 siblings, 1 reply; 40+ messages in thread
From: jamal @ 2010-03-31 17:47 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Herbert Xu, Timo Teras, netdev, David S. Miller

On Wed, 2010-03-31 at 18:41 +0200, Patrick McHardy wrote:

> I agree with Herbert, the flush notification indicates that
> the table is now empty, independant of its previous state.

What purpose does it serve?

As an example, if i delete an entry, the fact i deleted an entry is 
of interest to some manager in user space for the purpose of syncing.
If i brought a link down, same thing. If i brought a link down
that was already down - why would that be of interest to generate
as an event? etc.

cheers,
jamal

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:53           ` Herbert Xu
  2010-03-31 16:41             ` Patrick McHardy
@ 2010-03-31 20:52             ` David Miller
  1 sibling, 0 replies; 40+ messages in thread
From: David Miller @ 2010-03-31 20:52 UTC (permalink / raw)
  To: herbert; +Cc: hadi, timo.teras, netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 31 Mar 2010 21:53:31 +0800

> Dave, please revert this patch.

Herbert I actually don't agree with you on this one, sorry :-)

If no semantic change occurred to the policy or state databases,
there is zero reason to report the flush event, it's noise.

Like Jamal, I too am pretty frustrated with how noisy and unusable the
netlink messages are to watch when trying to debug something and this
behavior change should break no applications whatsoever.

When you have evidence of a real existing case failing because of
Jamal's changes (not some case you crafted specifically to show that
an empty flush doesn't get reported), I'll consider a revert or ask
Jamal to fix that case up.

Until then, Jamal's change is staying in the tree.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 13:56                 ` Herbert Xu
  2010-03-31 14:15                   ` jamal
@ 2010-03-31 20:54                   ` David Miller
  1 sibling, 0 replies; 40+ messages in thread
From: David Miller @ 2010-03-31 20:54 UTC (permalink / raw)
  To: herbert; +Cc: hadi, timo.teras, netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 31 Mar 2010 21:56:29 +0800

> This is inconsistent with the behaviour of SADB flushes, and the
> spirit of RFC2367.

Want to have an even larger list of areas where we're not compliant
with that RFC2367 that everyone has to extend and makes changes to in
order to be useful? :-)

Herbert, you have to show that something breaks because of this,
because the new behavior helps in diagnosis and efficiency and the old
behavior has been a serious pet peeve of mine just as it was for
Jamal.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 14:15                   ` Herbert Xu
  2010-03-31 14:24                     ` jamal
@ 2010-03-31 20:57                     ` David Miller
  2010-04-01  0:22                       ` Herbert Xu
  1 sibling, 1 reply; 40+ messages in thread
From: David Miller @ 2010-03-31 20:57 UTC (permalink / raw)
  To: herbert; +Cc: hadi, timo.teras, netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 31 Mar 2010 22:15:25 +0800

> Dave can keep or revert this as he likes.

Herbert, if it doesn't break anything, I feel the gains are very
worthwhile.

When we have a breakage report, I will undo this or fix it
immediately.  Until then, we have to be cognizant of what positives we
get out of this.

Have you actually tried to monitor the IPSEC netlink socket events
when a daemon is running?  It's way too painful before Jamal's
changes, and we already emit way too many semantically empty events in
netlink.

Minimization of the noise is a good thing, if it can legally be done,
which I believe is the case here.  And this applies to pfkey events
too, because processing those is even more expensive on the
application side.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 20:57                     ` David Miller
@ 2010-04-01  0:22                       ` Herbert Xu
  2010-04-01  2:19                         ` jamal
  0 siblings, 1 reply; 40+ messages in thread
From: Herbert Xu @ 2010-04-01  0:22 UTC (permalink / raw)
  To: David Miller; +Cc: hadi, timo.teras, netdev

On Wed, Mar 31, 2010 at 01:57:35PM -0700, David Miller wrote:
> 
> Herbert, if it doesn't break anything, I feel the gains are very
> worthwhile.

OK, in that case please change xfrm_state_flush too.  Right now
it also notifies on empty.

> Have you actually tried to monitor the IPSEC netlink socket events
> when a daemon is running?  It's way too painful before Jamal's
> changes, and we already emit way too many semantically empty events in
> netlink.

Honestly I can't see how eliminating an empty flush event helps.
Flushes are very rare.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-04-01  0:22                       ` Herbert Xu
@ 2010-04-01  2:19                         ` jamal
  2010-04-01 10:53                           ` jamal
  0 siblings, 1 reply; 40+ messages in thread
From: jamal @ 2010-04-01  2:19 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, timo.teras, netdev

On Thu, 2010-04-01 at 08:22 +0800, Herbert Xu wrote:

> OK, in that case please change xfrm_state_flush too.  Right now
> it also notifies on empty.

It shouldnt. I will double check it tomorrow..

cheers,
jamal


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-04-01  2:19                         ` jamal
@ 2010-04-01 10:53                           ` jamal
  0 siblings, 0 replies; 40+ messages in thread
From: jamal @ 2010-04-01 10:53 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, timo.teras, netdev


Ive tried it with net-next which is derived of 2.6.34-rc1
and it seems to work as expected
i.e. only the flush on a non-empty SAD is event-ed.
Probably older kernel you are running?

cheers,
jamal

On Wed, 2010-03-31 at 22:19 -0400, jamal wrote:
> On Thu, 2010-04-01 at 08:22 +0800, Herbert Xu wrote:
> 
> > OK, in that case please change xfrm_state_flush too.  Right now
> > it also notifies on empty.
> 
> It shouldnt. I will double check it tomorrow..
> 
> cheers,
> jamal


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead
  2010-03-31 17:47               ` jamal
@ 2010-04-01 11:24                 ` Patrick McHardy
  0 siblings, 0 replies; 40+ messages in thread
From: Patrick McHardy @ 2010-04-01 11:24 UTC (permalink / raw)
  To: hadi; +Cc: Herbert Xu, Timo Teras, netdev, David S. Miller

jamal wrote:
> On Wed, 2010-03-31 at 18:41 +0200, Patrick McHardy wrote:
> 
>> I agree with Herbert, the flush notification indicates that
>> the table is now empty, independant of its previous state.
> 
> What purpose does it serve?
> 
> As an example, if i delete an entry, the fact i deleted an entry is 
> of interest to some manager in user space for the purpose of syncing.
> If i brought a link down, same thing. If i brought a link down
> that was already down - why would that be of interest to generate
> as an event? etc.

That's true. Since both pfkey and xfrm process messages synchronously,
there shouldn't be any need for this. In fact I couldn't even find
a single keying daemon that cares about this message.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] xfrm fixes and flow structurization
  2010-03-31 10:17 [PATCH 0/4] xfrm fixes and flow structurization Timo Teras
                   ` (4 preceding siblings ...)
  2010-03-31 10:17 ` [PATCH 4/4] flow: structurize flow cache Timo Teras
@ 2010-04-02  2:42 ` David Miller
  5 siblings, 0 replies; 40+ messages in thread
From: David Miller @ 2010-04-02  2:42 UTC (permalink / raw)
  To: timo.teras; +Cc: netdev, herbert

From: Timo Teras <timo.teras@iki.fi>
Date: Wed, 31 Mar 2010 13:17:01 +0300

> These are fixes and cleanups which should be good for merging in.
> Patches 1 and 2 are new. Patches 3 and 4 were previously discussed
> with Herbert.
> 
> Please review and consider committing these.

All applied to net-next-2.6, and I used Herbert version of the genid
change, patch #1.

Thanks!

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2010-04-02  2:42 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-31 10:17 [PATCH 0/4] xfrm fixes and flow structurization Timo Teras
2010-03-31 10:17 ` Timo Teras
2010-03-31 10:17 ` [PATCH 1/4] xfrm: increment genid before bumping state genids Timo Teras
2010-03-31 10:50   ` Herbert Xu
2010-03-31 10:55     ` Timo Teräs
2010-03-31 11:01       ` Timo Teräs
2010-03-31 11:19         ` Herbert Xu
2010-03-31 11:24           ` Timo Teräs
2010-03-31 10:17 ` [PATCH 2/4] xfrm_user: verify policy direction at XFRM_MSG_POLEXPIRE handler Timo Teras
2010-03-31 10:54   ` Herbert Xu
2010-03-31 10:17 ` [PATCH 3/4] xfrm: remove policy lock when accessing policy->walk.dead Timo Teras
2010-03-31 11:03   ` Herbert Xu
2010-03-31 13:06     ` jamal
2010-03-31 13:11       ` Herbert Xu
2010-03-31 13:26         ` Herbert Xu
2010-03-31 13:32           ` jamal
2010-03-31 13:39             ` jamal
2010-03-31 13:41               ` jamal
2010-03-31 13:56                 ` Herbert Xu
2010-03-31 14:15                   ` jamal
2010-03-31 20:54                   ` David Miller
2010-03-31 13:55               ` Herbert Xu
2010-03-31 14:12                 ` jamal
2010-03-31 14:15                   ` Herbert Xu
2010-03-31 14:24                     ` jamal
2010-03-31 14:29                       ` Herbert Xu
2010-03-31 14:38                         ` jamal
2010-03-31 20:57                     ` David Miller
2010-04-01  0:22                       ` Herbert Xu
2010-04-01  2:19                         ` jamal
2010-04-01 10:53                           ` jamal
2010-03-31 13:28         ` jamal
2010-03-31 13:53           ` Herbert Xu
2010-03-31 16:41             ` Patrick McHardy
2010-03-31 17:47               ` jamal
2010-04-01 11:24                 ` Patrick McHardy
2010-03-31 20:52             ` David Miller
2010-03-31 10:17 ` [PATCH 4/4] flow: structurize flow cache Timo Teras
2010-03-31 11:21   ` Herbert Xu
2010-04-02  2:42 ` [PATCH 0/4] xfrm fixes and flow structurization David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).