netdev.vger.kernel.org archive mirror
* [PATCH] net/flow: remove sleeping and deferral mechanism from flow_cache_flush
@ 2011-09-26 17:09 Madalin Bucur
  2011-09-27 19:28 ` David Miller
  2011-09-29 13:24 ` Benjamin Poirier
  0 siblings, 2 replies; 7+ messages in thread
From: Madalin Bucur @ 2011-09-26 17:09 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, davem, timo.teras, Madalin Bucur

flow_cache_flush must not sleep as it can be called in atomic context;
removed the schedule_work as the deferred processing led to the flow
cache gc never actually being run under heavy network load

Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com>
---
 net/core/flow.c |   21 +++++++++------------
 1 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/net/core/flow.c b/net/core/flow.c
index 555a456..0950f97 100644
--- a/net/core/flow.c
+++ b/net/core/flow.c
@@ -14,7 +14,6 @@
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
-#include <linux/completion.h>
 #include <linux/percpu.h>
 #include <linux/bitops.h>
 #include <linux/notifier.h>
@@ -49,7 +48,6 @@ struct flow_cache_percpu {
 struct flow_flush_info {
 	struct flow_cache		*cache;
 	atomic_t			cpuleft;
-	struct completion		completion;
 };
 
 struct flow_cache {
@@ -100,7 +98,7 @@ static void flow_entry_kill(struct flow_cache_entry *fle)
 	kmem_cache_free(flow_cachep, fle);
 }
 
-static void flow_cache_gc_task(struct work_struct *work)
+static void flow_cache_gc_task(void)
 {
 	struct list_head gc_list;
 	struct flow_cache_entry *fce, *n;
@@ -113,7 +111,6 @@ static void flow_cache_gc_task(struct work_struct *work)
 	list_for_each_entry_safe(fce, n, &gc_list, u.gc_list)
 		flow_entry_kill(fce);
 }
-static DECLARE_WORK(flow_cache_gc_work, flow_cache_gc_task);
 
 static void flow_cache_queue_garbage(struct flow_cache_percpu *fcp,
 				     int deleted, struct list_head *gc_list)
@@ -123,7 +120,7 @@ static void flow_cache_queue_garbage(struct flow_cache_percpu *fcp,
 		spin_lock_bh(&flow_cache_gc_lock);
 		list_splice_tail(gc_list, &flow_cache_gc_list);
 		spin_unlock_bh(&flow_cache_gc_lock);
-		schedule_work(&flow_cache_gc_work);
+		flow_cache_gc_task();
 	}
 }
 
@@ -320,8 +317,7 @@ static void flow_cache_flush_tasklet(unsigned long data)
 
 	flow_cache_queue_garbage(fcp, deleted, &gc_list);
 
-	if (atomic_dec_and_test(&info->cpuleft))
-		complete(&info->completion);
+	atomic_dec(&info->cpuleft);
 }
 
 static void flow_cache_flush_per_cpu(void *data)
@@ -339,22 +335,23 @@ static void flow_cache_flush_per_cpu(void *data)
 void flow_cache_flush(void)
 {
 	struct flow_flush_info info;
-	static DEFINE_MUTEX(flow_flush_sem);
+	static DEFINE_SPINLOCK(flow_flush_lock);
 
 	/* Don't want cpus going down or up during this. */
 	get_online_cpus();
-	mutex_lock(&flow_flush_sem);
+	spin_lock_bh(&flow_flush_lock);
 	info.cache = &flow_cache_global;
 	atomic_set(&info.cpuleft, num_online_cpus());
-	init_completion(&info.completion);
 
 	local_bh_disable();
 	smp_call_function(flow_cache_flush_per_cpu, &info, 0);
 	flow_cache_flush_tasklet((unsigned long)&info);
 	local_bh_enable();
 
-	wait_for_completion(&info.completion);
-	mutex_unlock(&flow_flush_sem);
+	while (atomic_read(&info.cpuleft) != 0)
+		cpu_relax();
+
+	spin_unlock_bh(&flow_flush_lock);
 	put_online_cpus();
 }
 
-- 
1.7.0.1

* Re: [PATCH] net/flow: remove sleeping and deferral mechanism from flow_cache_flush
  2011-09-26 17:09 [PATCH] net/flow: remove sleeping and deferral mechanism from flow_cache_flush Madalin Bucur
@ 2011-09-27 19:28 ` David Miller
  2011-09-27 19:31   ` David Miller
  2011-09-29 13:24 ` Benjamin Poirier
  1 sibling, 1 reply; 7+ messages in thread
From: David Miller @ 2011-09-27 19:28 UTC (permalink / raw)
  To: madalin.bucur; +Cc: eric.dumazet, netdev, timo.teras

From: Madalin Bucur <madalin.bucur@freescale.com>
Date: Mon, 26 Sep 2011 20:09:16 +0300

> flow_cache_flush must not sleep as it can be called in atomic context;
> removed the schedule_work as the deferred processing led to the flow
> cache gc never actually being run under heavy network load
> 
> Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com>

How is this called in an atomic context?  The only caller of
flow_cache_flush() is __xfrm_garbage_collect() which is only invoked
during a NETDEV_DOWN event which ought to be non-atomic.

afinfo->garbage_collect is the only other place __xfrm_garbage_collect
is referenced, and that is completely unused and should thus be deleted
(I'll take care of that in net-next).

If the NETDEV_DOWN notifier runs in an atomic context, we need to accommodate
or fix that somehow.

* Re: [PATCH] net/flow: remove sleeping and deferral mechanism from flow_cache_flush
  2011-09-27 19:28 ` David Miller
@ 2011-09-27 19:31   ` David Miller
  2011-12-20  8:23     ` Steffen Klassert
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2011-09-27 19:31 UTC (permalink / raw)
  To: madalin.bucur; +Cc: eric.dumazet, netdev, timo.teras

From: David Miller <davem@davemloft.net>
Date: Tue, 27 Sep 2011 15:28:36 -0400 (EDT)

> afinfo->garbage_collect is the only other place __xfrm_garbage_collect
> is referenced, and that is completely unused and should thus be deleted
> (I'll take care of that in net-next).

Nevermind I see how these are referenced directly via xfrm4_policy.c
and xfrm6_policy.c, sigh...

* Re: [PATCH] net/flow: remove sleeping and deferral mechanism from flow_cache_flush
  2011-09-26 17:09 [PATCH] net/flow: remove sleeping and deferral mechanism from flow_cache_flush Madalin Bucur
  2011-09-27 19:28 ` David Miller
@ 2011-09-29 13:24 ` Benjamin Poirier
  1 sibling, 0 replies; 7+ messages in thread
From: Benjamin Poirier @ 2011-09-29 13:24 UTC (permalink / raw)
  To: Madalin Bucur; +Cc: eric.dumazet, netdev, davem, timo.teras

On 11-09-26 20:09, Madalin Bucur wrote:
> flow_cache_flush must not sleep as it can be called in atomic context;
> removed the schedule_work as the deferred processing led to the flow
> cache gc never actually being run under heavy network load
> 
> Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com>
> ---
>  net/core/flow.c |   21 +++++++++------------
>  1 files changed, 9 insertions(+), 12 deletions(-)
> 
> diff --git a/net/core/flow.c b/net/core/flow.c
> index 555a456..0950f97 100644
> --- a/net/core/flow.c
> +++ b/net/core/flow.c
> @@ -14,7 +14,6 @@
>  #include <linux/init.h>
>  #include <linux/slab.h>
>  #include <linux/smp.h>
> -#include <linux/completion.h>
>  #include <linux/percpu.h>
>  #include <linux/bitops.h>
>  #include <linux/notifier.h>
> @@ -49,7 +48,6 @@ struct flow_cache_percpu {
>  struct flow_flush_info {
>  	struct flow_cache		*cache;
>  	atomic_t			cpuleft;
> -	struct completion		completion;
>  };
>  
>  struct flow_cache {
> @@ -100,7 +98,7 @@ static void flow_entry_kill(struct flow_cache_entry *fle)
>  	kmem_cache_free(flow_cachep, fle);
>  }
>  
> -static void flow_cache_gc_task(struct work_struct *work)
> +static void flow_cache_gc_task(void)
>  {
>  	struct list_head gc_list;
>  	struct flow_cache_entry *fce, *n;
> @@ -113,7 +111,6 @@ static void flow_cache_gc_task(struct work_struct *work)
>  	list_for_each_entry_safe(fce, n, &gc_list, u.gc_list)
>  		flow_entry_kill(fce);
>  }
> -static DECLARE_WORK(flow_cache_gc_work, flow_cache_gc_task);
>  
>  static void flow_cache_queue_garbage(struct flow_cache_percpu *fcp,
>  				     int deleted, struct list_head *gc_list)
> @@ -123,7 +120,7 @@ static void flow_cache_queue_garbage(struct flow_cache_percpu *fcp,
>  		spin_lock_bh(&flow_cache_gc_lock);
>  		list_splice_tail(gc_list, &flow_cache_gc_list);
>  		spin_unlock_bh(&flow_cache_gc_lock);
> -		schedule_work(&flow_cache_gc_work);
> +		flow_cache_gc_task();
>  	}
>  }
>  
> @@ -320,8 +317,7 @@ static void flow_cache_flush_tasklet(unsigned long data)
>  
>  	flow_cache_queue_garbage(fcp, deleted, &gc_list);
>  
> -	if (atomic_dec_and_test(&info->cpuleft))
> -		complete(&info->completion);
> +	atomic_dec(&info->cpuleft);
>  }
>  
>  static void flow_cache_flush_per_cpu(void *data)
> @@ -339,22 +335,23 @@ static void flow_cache_flush_per_cpu(void *data)
>  void flow_cache_flush(void)
>  {
>  	struct flow_flush_info info;
> -	static DEFINE_MUTEX(flow_flush_sem);
> +	static DEFINE_SPINLOCK(flow_flush_lock);
>  
>  	/* Don't want cpus going down or up during this. */
>  	get_online_cpus();
> -	mutex_lock(&flow_flush_sem);
> +	spin_lock_bh(&flow_flush_lock);
>  	info.cache = &flow_cache_global;
>  	atomic_set(&info.cpuleft, num_online_cpus());
> -	init_completion(&info.completion);
>  
>  	local_bh_disable();
local_bh_disable may as well be removed with the change to
spin_lock_bh() just above.

Also, I fail to see why bh_disable is needed for
flow_cache_flush_tasklet(). If you don't mind enlightening me, it'll be
appreciated.

Thanks,
-Ben
>  	smp_call_function(flow_cache_flush_per_cpu, &info, 0);
>  	flow_cache_flush_tasklet((unsigned long)&info);
>  	local_bh_enable();
>  
> -	wait_for_completion(&info.completion);
> -	mutex_unlock(&flow_flush_sem);
> +	while (atomic_read(&info.cpuleft) != 0)
> +		cpu_relax();
> +
> +	spin_unlock_bh(&flow_flush_lock);
>  	put_online_cpus();
>  }
>  
> -- 
> 1.7.0.1
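
The remark about local_bh_disable() being redundant after spin_lock_bh()
can be illustrated with a small user-space model. This is only a sketch:
the model_* names are hypothetical, and the single counter stands in for
the kernel's nested softirq-disable count. It assumes, as in the kernel,
that spin_lock_bh() disables bottom halves before taking the lock and that
disable/enable calls nest.

```c
#include <stdbool.h>

/* Hypothetical model of the nested "bottom halves disabled" count. */
static int bh_depth;

static void model_local_bh_disable(void) { bh_depth++; }
static void model_local_bh_enable(void)  { bh_depth--; }
static bool model_bh_disabled(void)      { return bh_depth > 0; }

/* Hypothetical lock; only tracks whether it is held. */
struct model_lock {
	bool held;
};

/* Like spin_lock_bh(): disable bottom halves first, then take the lock. */
static void model_spin_lock_bh(struct model_lock *l)
{
	model_local_bh_disable();
	l->held = true;
}

/* Like spin_unlock_bh(): drop the lock, then re-enable bottom halves. */
static void model_spin_unlock_bh(struct model_lock *l)
{
	l->held = false;
	model_local_bh_enable();
}
```

In this model, bottom halves are already disabled once the lock is held, so
an extra disable/enable pair inside the locked region only raises and
lowers the nesting depth without changing the outcome.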

* Re: [PATCH] net/flow: remove sleeping and deferral mechanism from flow_cache_flush
  2011-09-27 19:31   ` David Miller
@ 2011-12-20  8:23     ` Steffen Klassert
  2011-12-20  8:41       ` Timo Teräs
  0 siblings, 1 reply; 7+ messages in thread
From: Steffen Klassert @ 2011-12-20  8:23 UTC (permalink / raw)
  To: David Miller; +Cc: madalin.bucur, eric.dumazet, netdev, timo.teras

On Tue, Sep 27, 2011 at 03:31:32PM -0400, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Tue, 27 Sep 2011 15:28:36 -0400 (EDT)
> 
> > afinfo->garbage_collect is the only other place __xfrm_garbage_collect
> > is referenced, and that is completely unused and should thus be deleted
> > (I'll take care of that in net-next).
> 
> Nevermind I see how these are referenced directly via xfrm4_policy.c
> and xfrm6_policy.c, sigh...

Is there any progress in fixing this issue? I've seen this occasionally
on some of our production systems, so I fixed it for us in the meantime
with the patch below. I could submit this for inclusion if no one else
wants to fix it in a different manner.

------
net: Add a flow_cache_flush_deferred function

flow_cache_flush() might sleep but can be called from
atomic context via the xfrm garbage collector. So add
a flow_cache_flush_deferred() function and use this if
the xfrm garbage collector is invoked from within the
packet path.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/net/flow.h     |    1 +
 net/core/flow.c        |   12 ++++++++++++
 net/xfrm/xfrm_policy.c |   18 ++++++++++++++----
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/include/net/flow.h b/include/net/flow.h
index a094477..57f15a7 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -207,6 +207,7 @@ extern struct flow_cache_object *flow_cache_lookup(
 		u8 dir, flow_resolve_t resolver, void *ctx);
 
 extern void flow_cache_flush(void);
+extern void flow_cache_flush_deferred(void);
 extern atomic_t flow_cache_genid;
 
 #endif
diff --git a/net/core/flow.c b/net/core/flow.c
index 8ae42de..e318c7e 100644
--- a/net/core/flow.c
+++ b/net/core/flow.c
@@ -358,6 +358,18 @@ void flow_cache_flush(void)
 	put_online_cpus();
 }
 
+static void flow_cache_flush_task(struct work_struct *work)
+{
+	flow_cache_flush();
+}
+
+static DECLARE_WORK(flow_cache_flush_work, flow_cache_flush_task);
+
+void flow_cache_flush_deferred(void)
+{
+	schedule_work(&flow_cache_flush_work);
+}
+
 static int __cpuinit flow_cache_cpu_prepare(struct flow_cache *fc, int cpu)
 {
 	struct flow_cache_percpu *fcp = per_cpu_ptr(fc->percpu, cpu);
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 2118d64..9049a5c 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2276,8 +2276,6 @@ static void __xfrm_garbage_collect(struct net *net)
 {
 	struct dst_entry *head, *next;
 
-	flow_cache_flush();
-
 	spin_lock_bh(&xfrm_policy_sk_bundle_lock);
 	head = xfrm_policy_sk_bundles;
 	xfrm_policy_sk_bundles = NULL;
@@ -2290,6 +2288,18 @@ static void __xfrm_garbage_collect(struct net *net)
 	}
 }
 
+static void xfrm_garbage_collect(struct net *net)
+{
+	flow_cache_flush();
+	__xfrm_garbage_collect(net);
+}
+
+static void xfrm_garbage_collect_deferred(struct net *net)
+{
+	flow_cache_flush_deferred();
+	__xfrm_garbage_collect(net);
+}
+
 static void xfrm_init_pmtu(struct dst_entry *dst)
 {
 	do {
@@ -2422,7 +2432,7 @@ int xfrm_policy_register_afinfo(struct xfrm_policy_afinfo *afinfo)
 		if (likely(dst_ops->neigh_lookup == NULL))
 			dst_ops->neigh_lookup = xfrm_neigh_lookup;
 		if (likely(afinfo->garbage_collect == NULL))
-			afinfo->garbage_collect = __xfrm_garbage_collect;
+			afinfo->garbage_collect = xfrm_garbage_collect_deferred;
 		xfrm_policy_afinfo[afinfo->family] = afinfo;
 	}
 	write_unlock_bh(&xfrm_policy_afinfo_lock);
@@ -2516,7 +2526,7 @@ static int xfrm_dev_event(struct notifier_block *this, unsigned long event, void
 
 	switch (event) {
 	case NETDEV_DOWN:
-		__xfrm_garbage_collect(dev_net(dev));
+		xfrm_garbage_collect(dev_net(dev));
 	}
 	return NOTIFY_DONE;
 }
-- 
1.7.0.4
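
The deferral idea in the patch above can be sketched in user space. This is
a minimal model under stated assumptions, not kernel code: struct
sketch_work and every sketch_* name are hypothetical, and the "worker" is
simply a function the caller invokes later. It mirrors two properties of
the approach: queueing never blocks, and a work item that is still pending
is not queued a second time, so repeated requests coalesce into one flush.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a kernel work item. */
struct sketch_work {
	void (*fn)(struct sketch_work *w);
	bool pending;
};

/* Counts how often the (possibly sleeping) flush actually ran. */
static int flush_count;

/* Stands in for flow_cache_flush(), which may sleep. */
static void sketch_flush(void)
{
	flush_count++;
}

static void sketch_flush_task(struct sketch_work *w)
{
	(void)w;
	sketch_flush();
}

static struct sketch_work flush_work = { sketch_flush_task, false };

/* Non-blocking: only marks the work pending. Returns false if it already was. */
static bool sketch_schedule(struct sketch_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Run later from a context that is allowed to sleep. */
static void sketch_run_pending(struct sketch_work *w)
{
	if (!w->pending)
		return;
	w->pending = false;
	w->fn(w);
}

/* Safe to call from the "packet path": it never runs the flush itself. */
static bool sketch_flush_deferred(void)
{
	return sketch_schedule(&flush_work);
}
```

Calling sketch_flush_deferred() twice before the worker runs results in a
single flush once sketch_run_pending(&flush_work) is invoked. The trade-off
is that the caller cannot assume the cache is already flushed when the
deferred call returns.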

* Re: [PATCH] net/flow: remove sleeping and deferral mechanism from flow_cache_flush
  2011-12-20  8:23     ` Steffen Klassert
@ 2011-12-20  8:41       ` Timo Teräs
  2011-12-21 21:48         ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Timo Teräs @ 2011-12-20  8:41 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: David Miller, madalin.bucur, eric.dumazet, netdev

On 12/20/2011 10:23 AM, Steffen Klassert wrote:
> On Tue, Sep 27, 2011 at 03:31:32PM -0400, David Miller wrote:
>> From: David Miller <davem@davemloft.net>
>> Date: Tue, 27 Sep 2011 15:28:36 -0400 (EDT)
>>
>>> afinfo->garbage_collect is the only other place __xfrm_garbage_collect
>>> is referenced, and that is completely unused and should thus be deleted
>>> (I'll take care of that in net-next).
>>
>> Nevermind I see how these are referenced directly via xfrm4_policy.c
>> and xfrm6_policy.c, sigh...
> 
> Is there any progress in fixing this issue? I've seen this occasionally
> on some of our production systems, so I fixed it for us in the meantime
> with the patch below. I could submit this for inclusion if no one else
> wants to fix it in a different manner.
> 
> ------
> net: Add a flow_cache_flush_deferred function
> 
> flow_cache_flush() might sleep but can be called from
> atomic context via the xfrm garbage collector. So add
> a flow_cache_flush_deferred() function and use this if
> the xfrm garbage collector is invoked from within the
> packet path.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

Acked-by: Timo Teräs <timo.teras@iki.fi>

I was first thinking if it made sense to run the local CPUs task
immediately on gc. But since all it does is queue the removed nodes to
the second gc that actually frees the dst's, it doesn't really make a
difference.

So this is probably as good as it gets.

* Re: [PATCH] net/flow: remove sleeping and deferral mechanism from flow_cache_flush
  2011-12-20  8:41       ` Timo Teräs
@ 2011-12-21 21:48         ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2011-12-21 21:48 UTC (permalink / raw)
  To: timo.teras; +Cc: steffen.klassert, madalin.bucur, eric.dumazet, netdev

From: Timo Teräs <timo.teras@iki.fi>
Date: Tue, 20 Dec 2011 10:41:40 +0200

> On 12/20/2011 10:23 AM, Steffen Klassert wrote:
>> On Tue, Sep 27, 2011 at 03:31:32PM -0400, David Miller wrote:
>>> From: David Miller <davem@davemloft.net>
>>> Date: Tue, 27 Sep 2011 15:28:36 -0400 (EDT)
>>>
>>>> afinfo->garbage_collect is the only other place __xfrm_garbage_collect
>>>> is referenced, and that is completely unused and should thus be deleted
>>>> (I'll take care of that in net-next).
>>>
>>> Nevermind I see how these are referenced directly via xfrm4_policy.c
>>> and xfrm6_policy.c, sigh...
>> 
>> Is there any progress in fixing this issue? I've seen this occasionally
>> on some of our production systems, so I fixed it for us in the meantime
>> with the patch below. I could submit this for inclusion if no one else
>> wants to fix it in a different manner.
>> 
>> ------
>> net: Add a flow_cache_flush_deferred function
>> 
>> flow_cache_flush() might sleep but can be called from
>> atomic context via the xfrm garbage collector. So add
>> a flow_cache_flush_deferred() function and use this if
>> the xfrm garbage collector is invoked from within the
>> packet path.
>> 
>> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> 
> Acked-by: Timo Teräs <timo.teras@iki.fi>
> 
> I was first thinking if it made sense to run the local CPUs task
> immediately on gc. But since all it does is queue the removed nodes to
> the second gc that actually frees the dst's, it doesn't really make a
> difference.
> 
> So this is probably as good as it gets.

Applied, thanks everyone.
