netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net] openvswitch: Allow deferred action fifo to expand during run time
@ 2016-03-18  4:32 Andy Zhou
       [not found] ` <1458275533-2896-1-git-send-email-azhou-LZ6Gd1LRuIk@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Andy Zhou @ 2016-03-18  4:32 UTC (permalink / raw)
  To: dev-yBygre7rU0SM8Zsap4Y0gw, davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA

Current openvswitch implementation allows up to 10 recirculation actions
for each packet. This limit was sufficient for most use cases in the
past, but with more new features, such as supporting connection
tracking, and testing in larger scale network environment,
This limit may be too restrictive.

An obvious design choice is to increase the hard coded limit
from 10 to a larger number.  However, there are two draw backs to this
approach. First, it is not obvious what the upper limit should be.
Second, the fifos are allocated per CPU core. Statically allocated
large buffers will be over kill for most packet execution, and will
increase the module's memory footprint unnecessarily for some
smaller scale applications.

This patch makes a different design choice that accommodates packets
that require more than 10 recirculation actions without significantly
increase module's memory footprint.  kmalloc() is used to expand
deferred action fifo at run time whenever a packet's action contains
more than 10 recirculation actions. The dynamically allocated fifos are
freed as soon as the packet's actions execution are completed.

Reported-and-tested-by: Ramu Ramamurthy <sramamur@linux.vnet.ibm.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-March/067672.html
Signed-off-by: Andy Zhou <azhou@ovn.org>
---
 net/openvswitch/actions.c | 45 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 2d59df5..eb068c7 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -70,12 +70,14 @@ struct ovs_frag_data {
 
 static DEFINE_PER_CPU(struct ovs_frag_data, ovs_frag_data_storage);
 
-#define DEFERRED_ACTION_FIFO_SIZE 10
+#define DEFERRED_ACTION_FIFO_INIT_SIZE 10
 struct action_fifo {
 	int head;
 	int tail;
 	/* Deferred action fifo queue storage. */
-	struct deferred_action fifo[DEFERRED_ACTION_FIFO_SIZE];
+	struct deferred_action *fifo;
+	size_t size;     /* Current size of the fifo in bytes */
+	struct deferred_action initial_fifo[DEFERRED_ACTION_FIFO_INIT_SIZE];
 };
 
 static struct action_fifo __percpu *action_fifos;
@@ -85,6 +87,16 @@ static void action_fifo_init(struct action_fifo *fifo)
 {
 	fifo->head = 0;
 	fifo->tail = 0;
+	fifo->fifo = fifo->initial_fifo;
+	fifo->size = sizeof(*fifo->fifo) * DEFERRED_ACTION_FIFO_INIT_SIZE;
+}
+
+static void action_fifo_clear(struct action_fifo *fifo)
+{
+	if (fifo->fifo != fifo->initial_fifo)
+		kfree(fifo->fifo);
+
+	action_fifo_init(fifo);
 }
 
 static bool action_fifo_is_empty(const struct action_fifo *fifo)
@@ -102,8 +114,22 @@ static struct deferred_action *action_fifo_get(struct action_fifo *fifo)
 
 static struct deferred_action *action_fifo_put(struct action_fifo *fifo)
 {
-	if (fifo->head >= DEFERRED_ACTION_FIFO_SIZE - 1)
-		return NULL;
+	if (fifo->head >= fifo->size / sizeof(*fifo->fifo) - 1) {
+		/* out of fifo buffer space, try to allocate a new fifo
+		 * buffer. */
+		struct deferred_action *new_fifo;
+		int new_size = fifo->size * 2;
+
+		new_fifo = kmalloc(new_size, GFP_ATOMIC);
+		if (new_fifo) {
+			memcpy(new_fifo, fifo->fifo, fifo->size);
+			if (fifo->fifo != fifo->initial_fifo)
+				kfree(fifo->fifo);
+			fifo->fifo = new_fifo;
+			fifo->size = new_size;
+		} else
+			return NULL;
+	}
 
 	return &fifo->fifo[fifo->head++];
 }
@@ -1152,7 +1178,7 @@ static void process_deferred_actions(struct datapath *dp)
 	} while (!action_fifo_is_empty(fifo));
 
 	/* Reset FIFO for the next packet.  */
-	action_fifo_init(fifo);
+	action_fifo_clear(fifo);
 }
 
 /* Execute a list of actions against 'skb'. */
@@ -1185,10 +1211,19 @@ out:
 
 int action_fifos_init(void)
 {
+	int i;
+
 	action_fifos = alloc_percpu(struct action_fifo);
 	if (!action_fifos)
 		return -ENOMEM;
 
+	/* Use pre allocated fifo. */
+	for_each_possible_cpu(i) {
+		struct action_fifo *action_fifo;
+
+		action_fifo = per_cpu_ptr(action_fifos, i);
+		action_fifo_init(action_fifo);
+	}
 	return 0;
 }
 
-- 
1.9.1

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [net] openvswitch: Allow deferred action fifo to expand during run time
       [not found] ` <1458275533-2896-1-git-send-email-azhou-LZ6Gd1LRuIk@public.gmane.org>
@ 2016-03-18 21:19   ` David Miller
       [not found]     ` <20160318.171909.1177364175347712706.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: David Miller @ 2016-03-18 21:19 UTC (permalink / raw)
  To: azhou-LZ6Gd1LRuIk
  Cc: dev-yBygre7rU0SM8Zsap4Y0gw, netdev-u79uwXL29TY76Z2rM5mHXA

From: Andy Zhou <azhou@ovn.org>
Date: Thu, 17 Mar 2016 21:32:13 -0700

> Current openvswitch implementation allows up to 10 recirculation actions
> for each packet. This limit was sufficient for most use cases in the
> past, but with more new features, such as supporting connection
> tracking, and testing in larger scale network environment,
> This limit may be too restrictive.
 ...

Actions that need to recirculate that many times are extremely poorly
designed, and will have significant performance problems.

I think the way rules are put together and processed should be redone
before we do insane stuff like this.

There is no way I'm applying a patch like this, sorry.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [net] openvswitch: Allow deferred action fifo to expand during run time
       [not found]     ` <20160318.171909.1177364175347712706.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2016-08-24 21:00       ` Lance Richardson
  0 siblings, 0 replies; 3+ messages in thread
From: Lance Richardson @ 2016-08-24 21:00 UTC (permalink / raw)
  To: dev-yBygre7rU0SM8Zsap4Y0gw, netdev-u79uwXL29TY76Z2rM5mHXA; +Cc: David Miller

> From: "David Miller" <davem@davemloft.net>
> To: azhou@ovn.org
> Cc: dev@openvswitch.com, netdev@vger.kernel.org
> Sent: Friday, March 18, 2016 5:19:09 PM
> Subject: Re: [net] openvswitch: Allow deferred action fifo to expand during run time
> 
> From: Andy Zhou <azhou@ovn.org>
> Date: Thu, 17 Mar 2016 21:32:13 -0700
> 
> > Current openvswitch implementation allows up to 10 recirculation actions
> > for each packet. This limit was sufficient for most use cases in the
> > past, but with more new features, such as supporting connection
> > tracking, and testing in larger scale network environment,
> > This limit may be too restrictive.
>  ...
> 
> Actions that need to recirculate that many times are extremely poorly
> designed, and will have significant performance problems.
> 
> I think the way rules are put together and processed should be redone
> before we do insane stuff like this.
> 
> There is no way I'm applying a patch like this, sorry.
> 

Apologies for coming into this thread so late, I happened on it after finding
out that this is actually an issue in some production networks.

The need to buffer so many deferred actions seems to be mostly due to having
relatively simple rules (that have, say, one or two recirculations) that get
multiplied per packet by the number of egress ports.

For example, a configuration with 11 or more OVS bond ports in balance-tcp
mode (which needs one recirculation) will exceed the deferred action fifo limit
of 10 every time a broadcast (or multicast or unknown unicast) is forwarded by
the OVS bridge because one entry will be consumed by each egress port. Since
the order in which egress ports are handled is deterministic, this means
e.g. broadcast ARP requests will only ever make it out the first 10 
bond ports in this scenario.

Note that bonding isn't necessary to have this issue, it just makes for a
relatively straightforward example.

Andy's patch certainly seems to be an improvement on this situation,
but maybe there another/better way.

Regards,

   Lance
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-08-24 21:00 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-18  4:32 [net] openvswitch: Allow deferred action fifo to expand during run time Andy Zhou
     [not found] ` <1458275533-2896-1-git-send-email-azhou-LZ6Gd1LRuIk@public.gmane.org>
2016-03-18 21:19   ` David Miller
     [not found]     ` <20160318.171909.1177364175347712706.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2016-08-24 21:00       ` Lance Richardson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).