netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules from tcf_proto destroy race
@ 2019-10-31 19:08 John Hurley
  2019-11-01 13:01 ` Vlad Buslov
  0 siblings, 1 reply; 5+ messages in thread
From: John Hurley @ 2019-10-31 19:08 UTC (permalink / raw)
  To: vladbu
  Cc: jiri, netdev, simon.horman, jakub.kicinski, louis.peens,
	oss-drivers, John Hurley

When a new filter is added to cls_api, the function
tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
determine if the tcf_proto is duplicated in the chain's hashtable. It then
creates a new entry or continues with an existing one. In cls_flower, this
allows the function fl_ht_insert_unque to determine if a filter is a
duplicate and reject appropriately, meaning that the duplicate will not be
passed to drivers via the offload hooks. However, when a tcf_proto is
destroyed it is removed from its chain before a hardware remove hook is
hit. This can lead to a race whereby the driver has not received the
remove message but duplicate flows can be accepted. This, in turn, can
lead to the offload driver receiving incorrect duplicate flows and out of
order add/delete messages.

Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
hash table per block stores each unique chain/protocol/prio being
destroyed. This entry is only removed when the full destroy (and hardware
offload) has completed. If a new flow is being added with the same
identiers as a tc_proto being detroyed, then the add request is replayed
until the destroy is complete.

v1->v2:
- put tcf_proto into block->proto_destroy_ht rather than creating new key
  (Vlad Buslov)
- add error log for cases when hash entry fails. Previously it failed
  silently with no indication. (Vlad Buslov)

Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reported-by: Louis Peens <louis.peens@netronome.com>
---
 include/net/sch_generic.h |   4 ++
 net/sched/cls_api.c       | 123 ++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 123 insertions(+), 4 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 637548d..56c9b2c 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -362,6 +362,7 @@ struct tcf_proto {
 	bool			deleting;
 	refcount_t		refcnt;
 	struct rcu_head		rcu;
+	struct rhash_head	destroy_ht_node;
 };
 
 struct qdisc_skb_cb {
@@ -414,6 +415,9 @@ struct tcf_block {
 		struct list_head filter_chain_list;
 	} chain0;
 	struct rcu_head rcu;
+	struct rhashtable proto_destroy_ht;
+	struct rhashtable_params proto_destroy_ht_params;
+	struct mutex proto_destroy_lock; /* Lock for proto_destroy hashtable. */
 };
 
 #ifdef CONFIG_PROVE_LOCKING
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 8717c0b..4970fad 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -47,6 +47,93 @@ static LIST_HEAD(tcf_proto_base);
 /* Protects list of registered TC modules. It is pure SMP lock. */
 static DEFINE_RWLOCK(cls_mod_lock);
 
+struct destroy_ht_key {
+	u32 chain_index;
+	u32 prio;
+	__be16 protocol;
+};
+
+static inline void destroy_fill_ht_key(struct destroy_ht_key *key,
+				       const struct tcf_proto *tp)
+{
+	key->chain_index = tp->chain->index;
+	key->prio = tp->prio;
+	key->protocol = tp->protocol;
+}
+
+static inline int destroy_obj_cmpfn(struct rhashtable_compare_arg *arg,
+				    const void *obj)
+{
+	const struct destroy_ht_key *key = arg->key;
+	const struct tcf_proto *tp = obj;
+
+	return key->chain_index != tp->chain->index ||
+	       key->prio != tp->prio ||
+	       key->protocol != tp->protocol;
+}
+
+static inline u32 detroy_obj_hashfn(const void *data, u32 len, u32 seed)
+{
+	const struct tcf_proto *tp = data;
+	struct destroy_ht_key key;
+
+	destroy_fill_ht_key(&key, tp);
+	return jhash(&key, offsetofend(struct destroy_ht_key, protocol), seed);
+}
+
+static const struct rhashtable_params destroy_ht_params = {
+	.head_offset = offsetof(struct tcf_proto, destroy_ht_node),
+	.key_len = offsetofend(struct destroy_ht_key, protocol),
+	.obj_hashfn = detroy_obj_hashfn,
+	.obj_cmpfn = destroy_obj_cmpfn,
+	.automatic_shrinking = true,
+};
+
+static void tcf_proto_signal_destroying(struct tcf_chain *chain,
+					struct tcf_proto *tp)
+{
+	struct tcf_block *block = chain->block;
+	struct destroy_ht_key key;
+	struct tcf_proto *ret;
+
+	destroy_fill_ht_key(&key, tp);
+	mutex_lock(&block->proto_destroy_lock);
+	ret = rhashtable_lookup_get_insert_key(&block->proto_destroy_ht, &key,
+					       &tp->destroy_ht_node,
+					       block->proto_destroy_ht_params);
+	mutex_unlock(&block->proto_destroy_lock);
+
+	if (IS_ERR(ret))
+		net_err_ratelimited("mem failure on destroy hash insert - chain: %u, prio: %u, proto: %u",
+				    key.chain_index, key.prio, key.protocol);
+}
+
+static struct tcf_proto *
+tcf_proto_lookup_destroying(struct tcf_chain *chain, struct tcf_proto *tp)
+{
+	struct tcf_block *block = chain->block;
+	struct destroy_ht_key key;
+
+	destroy_fill_ht_key(&key, tp);
+	return rhashtable_lookup_fast(&block->proto_destroy_ht, &key,
+				      block->proto_destroy_ht_params);
+}
+
+static void
+tcf_proto_signal_destroyed(struct tcf_chain *chain, struct tcf_proto *tp)
+{
+	struct tcf_block *block = chain->block;
+	struct tcf_proto *entry;
+
+	mutex_lock(&block->proto_destroy_lock);
+	entry = tcf_proto_lookup_destroying(chain, tp);
+	if (entry)
+		rhashtable_remove_fast(&block->proto_destroy_ht,
+				       &entry->destroy_ht_node,
+				       block->proto_destroy_ht_params);
+	mutex_unlock(&block->proto_destroy_lock);
+}
+
 /* Find classifier type by string name */
 
 static const struct tcf_proto_ops *__tcf_proto_lookup_ops(const char *kind)
@@ -234,9 +321,11 @@ static void tcf_proto_get(struct tcf_proto *tp)
 static void tcf_chain_put(struct tcf_chain *chain);
 
 static void tcf_proto_destroy(struct tcf_proto *tp, bool rtnl_held,
-			      struct netlink_ext_ack *extack)
+			      bool sig_destroy, struct netlink_ext_ack *extack)
 {
 	tp->ops->destroy(tp, rtnl_held, extack);
+	if (sig_destroy)
+		tcf_proto_signal_destroyed(tp->chain, tp);
 	tcf_chain_put(tp->chain);
 	module_put(tp->ops->owner);
 	kfree_rcu(tp, rcu);
@@ -246,7 +335,7 @@ static void tcf_proto_put(struct tcf_proto *tp, bool rtnl_held,
 			  struct netlink_ext_ack *extack)
 {
 	if (refcount_dec_and_test(&tp->refcnt))
-		tcf_proto_destroy(tp, rtnl_held, extack);
+		tcf_proto_destroy(tp, rtnl_held, true, extack);
 }
 
 static int walker_check_empty(struct tcf_proto *tp, void *fh,
@@ -370,6 +459,8 @@ static bool tcf_chain_detach(struct tcf_chain *chain)
 static void tcf_block_destroy(struct tcf_block *block)
 {
 	mutex_destroy(&block->lock);
+	rhashtable_destroy(&block->proto_destroy_ht);
+	mutex_destroy(&block->proto_destroy_lock);
 	kfree_rcu(block, rcu);
 }
 
@@ -545,6 +636,12 @@ static void tcf_chain_flush(struct tcf_chain *chain, bool rtnl_held)
 
 	mutex_lock(&chain->filter_chain_lock);
 	tp = tcf_chain_dereference(chain->filter_chain, chain);
+	while (tp) {
+		tp_next = rcu_dereference_protected(tp->next, 1);
+		tcf_proto_signal_destroying(chain, tp);
+		tp = tp_next;
+	}
+	tp = tcf_chain_dereference(chain->filter_chain, chain);
 	RCU_INIT_POINTER(chain->filter_chain, NULL);
 	tcf_chain0_head_change(chain, NULL);
 	chain->flushing = true;
@@ -857,6 +954,16 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 	/* Don't store q pointer for blocks which are shared */
 	if (!tcf_block_shared(block))
 		block->q = q;
+
+	block->proto_destroy_ht_params = destroy_ht_params;
+	if (rhashtable_init(&block->proto_destroy_ht,
+			    &block->proto_destroy_ht_params)) {
+		NL_SET_ERR_MSG(extack, "Failed to initialise blocks proto destroy hashtable");
+		kfree(block);
+		return ERR_PTR(-ENOMEM);
+	}
+	mutex_init(&block->proto_destroy_lock);
+
 	return block;
 }
 
@@ -1621,6 +1728,12 @@ static struct tcf_proto *tcf_chain_tp_insert_unique(struct tcf_chain *chain,
 
 	mutex_lock(&chain->filter_chain_lock);
 
+	if (tcf_proto_lookup_destroying(chain, tp_new)) {
+		mutex_unlock(&chain->filter_chain_lock);
+		tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
+		return ERR_PTR(-EAGAIN);
+	}
+
 	tp = tcf_chain_tp_find(chain, &chain_info,
 			       protocol, prio, false);
 	if (!tp)
@@ -1628,10 +1741,10 @@ static struct tcf_proto *tcf_chain_tp_insert_unique(struct tcf_chain *chain,
 	mutex_unlock(&chain->filter_chain_lock);
 
 	if (tp) {
-		tcf_proto_destroy(tp_new, rtnl_held, NULL);
+		tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
 		tp_new = tp;
 	} else if (err) {
-		tcf_proto_destroy(tp_new, rtnl_held, NULL);
+		tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
 		tp_new = ERR_PTR(err);
 	}
 
@@ -1669,6 +1782,7 @@ static void tcf_chain_tp_delete_empty(struct tcf_chain *chain,
 		return;
 	}
 
+	tcf_proto_signal_destroying(chain, tp);
 	next = tcf_chain_dereference(chain_info.next, chain);
 	if (tp == chain->filter_chain)
 		tcf_chain0_head_change(chain, next);
@@ -2188,6 +2302,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 		err = -EINVAL;
 		goto errout_locked;
 	} else if (t->tcm_handle == 0) {
+		tcf_proto_signal_destroying(chain, tp);
 		tcf_chain_tp_remove(chain, &chain_info, tp);
 		mutex_unlock(&chain->filter_chain_lock);
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules from tcf_proto destroy race
  2019-10-31 19:08 [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules from tcf_proto destroy race John Hurley
@ 2019-11-01 13:01 ` Vlad Buslov
  2019-11-01 14:54   ` John Hurley
  0 siblings, 1 reply; 5+ messages in thread
From: Vlad Buslov @ 2019-11-01 13:01 UTC (permalink / raw)
  To: John Hurley
  Cc: Vlad Buslov, Jiri Pirko, netdev@vger.kernel.org,
	simon.horman@netronome.com, jakub.kicinski@netronome.com,
	louis.peens@netronome.com, oss-drivers@netronome.com

[-- Attachment #1: Type: text/plain, Size: 2145 bytes --]


On Thu 31 Oct 2019 at 21:08, John Hurley <john.hurley@netronome.com> wrote:
> When a new filter is added to cls_api, the function
> tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
> determine if the tcf_proto is duplicated in the chain's hashtable. It then
> creates a new entry or continues with an existing one. In cls_flower, this
> allows the function fl_ht_insert_unque to determine if a filter is a
> duplicate and reject appropriately, meaning that the duplicate will not be
> passed to drivers via the offload hooks. However, when a tcf_proto is
> destroyed it is removed from its chain before a hardware remove hook is
> hit. This can lead to a race whereby the driver has not received the
> remove message but duplicate flows can be accepted. This, in turn, can
> lead to the offload driver receiving incorrect duplicate flows and out of
> order add/delete messages.
>
> Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
> hash table per block stores each unique chain/protocol/prio being
> destroyed. This entry is only removed when the full destroy (and hardware
> offload) has completed. If a new flow is being added with the same
> identiers as a tc_proto being detroyed, then the add request is replayed
> until the destroy is complete.
>
> v1->v2:
> - put tcf_proto into block->proto_destroy_ht rather than creating new key
>   (Vlad Buslov)
> - add error log for cases when hash entry fails. Previously it failed
>   silently with no indication. (Vlad Buslov)
>
> Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
> Signed-off-by: John Hurley <john.hurley@netronome.com>
> Reviewed-by: Simon Horman <simon.horman@netronome.com>
> Reported-by: Louis Peens <louis.peens@netronome.com>
> ---

Hi John,

Patch looks good! However, I think we can simplify it even more and
remove duplication of data in tcf_proto (hashtable key contains copy of
data that is already available in the struct itself) and remove all
dynamic memory allocations. I have refactored your patch accordingly
(attached). WDYT?

[...]


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 6325 bytes --]

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 637548d54b3e..e6085811559c 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -15,6 +15,7 @@
 #include <linux/mutex.h>
 #include <linux/rwsem.h>
 #include <linux/atomic.h>
+#include <linux/hashtable.h>
 #include <net/gen_stats.h>
 #include <net/rtnetlink.h>
 #include <net/flow_offload.h>
@@ -362,6 +363,7 @@ struct tcf_proto {
 	bool			deleting;
 	refcount_t		refcnt;
 	struct rcu_head		rcu;
+	struct hlist_node	destroy_ht_node;
 };
 
 struct qdisc_skb_cb {
@@ -414,6 +416,8 @@ struct tcf_block {
 		struct list_head filter_chain_list;
 	} chain0;
 	struct rcu_head rcu;
+	DECLARE_HASHTABLE(proto_destroy_ht, 16);
+	struct mutex proto_destroy_lock; /* Lock for proto_destroy hashtable. */
 };
 
 #ifdef CONFIG_PROVE_LOCKING
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 8717c0b26c90..c8278516f847 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -21,6 +21,7 @@
 #include <linux/slab.h>
 #include <linux/idr.h>
 #include <linux/rhashtable.h>
+#include <linux/jhash.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
 #include <net/netlink.h>
@@ -47,6 +48,61 @@ static LIST_HEAD(tcf_proto_base);
 /* Protects list of registered TC modules. It is pure SMP lock. */
 static DEFINE_RWLOCK(cls_mod_lock);
 
+static u32 destroy_obj_hashfn(const struct tcf_proto *tp)
+{
+	return jhash_3words(tp->chain->index, tp->prio, tp->protocol, 0);
+}
+
+static void tcf_proto_signal_destroying(struct tcf_chain *chain,
+					struct tcf_proto *tp)
+{
+	struct tcf_block *block = chain->block;
+
+	mutex_lock(&block->proto_destroy_lock);
+	hash_add_rcu(block->proto_destroy_ht, &tp->destroy_ht_node,
+		     destroy_obj_hashfn(tp));
+	mutex_unlock(&block->proto_destroy_lock);
+}
+
+static bool tcf_proto_cmp(const struct tcf_proto *tp1,
+			  const struct tcf_proto *tp2)
+{
+	return tp1->chain->index == tp2->chain->index &&
+		tp1->prio == tp2->prio &&
+		tp1->protocol == tp2->protocol;
+}
+
+static bool
+tcf_proto_exists_destroying(struct tcf_chain *chain, struct tcf_proto *tp)
+{
+	u32 hash = destroy_obj_hashfn(tp);
+	struct tcf_proto *iter;
+	bool found = false;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(chain->block->proto_destroy_ht, iter,
+				   destroy_ht_node, hash) {
+		if (tcf_proto_cmp(tp, iter)) {
+			found = true;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return found;
+}
+
+static void
+tcf_proto_signal_destroyed(struct tcf_chain *chain, struct tcf_proto *tp)
+{
+	struct tcf_block *block = chain->block;
+
+	mutex_lock(&block->proto_destroy_lock);
+	if (hash_hashed(&tp->destroy_ht_node))
+		hash_del_rcu(&tp->destroy_ht_node);
+	mutex_unlock(&block->proto_destroy_lock);
+}
+
 /* Find classifier type by string name */
 
 static const struct tcf_proto_ops *__tcf_proto_lookup_ops(const char *kind)
@@ -234,9 +290,11 @@ static void tcf_proto_get(struct tcf_proto *tp)
 static void tcf_chain_put(struct tcf_chain *chain);
 
 static void tcf_proto_destroy(struct tcf_proto *tp, bool rtnl_held,
-			      struct netlink_ext_ack *extack)
+			      bool sig_destroy, struct netlink_ext_ack *extack)
 {
 	tp->ops->destroy(tp, rtnl_held, extack);
+	if (sig_destroy)
+		tcf_proto_signal_destroyed(tp->chain, tp);
 	tcf_chain_put(tp->chain);
 	module_put(tp->ops->owner);
 	kfree_rcu(tp, rcu);
@@ -246,7 +304,7 @@ static void tcf_proto_put(struct tcf_proto *tp, bool rtnl_held,
 			  struct netlink_ext_ack *extack)
 {
 	if (refcount_dec_and_test(&tp->refcnt))
-		tcf_proto_destroy(tp, rtnl_held, extack);
+		tcf_proto_destroy(tp, rtnl_held, true, extack);
 }
 
 static int walker_check_empty(struct tcf_proto *tp, void *fh,
@@ -370,6 +428,7 @@ static bool tcf_chain_detach(struct tcf_chain *chain)
 static void tcf_block_destroy(struct tcf_block *block)
 {
 	mutex_destroy(&block->lock);
+	mutex_destroy(&block->proto_destroy_lock);
 	kfree_rcu(block, rcu);
 }
 
@@ -545,6 +604,12 @@ static void tcf_chain_flush(struct tcf_chain *chain, bool rtnl_held)
 
 	mutex_lock(&chain->filter_chain_lock);
 	tp = tcf_chain_dereference(chain->filter_chain, chain);
+	while (tp) {
+		tp_next = rcu_dereference_protected(tp->next, 1);
+		tcf_proto_signal_destroying(chain, tp);
+		tp = tp_next;
+	}
+	tp = tcf_chain_dereference(chain->filter_chain, chain);
 	RCU_INIT_POINTER(chain->filter_chain, NULL);
 	tcf_chain0_head_change(chain, NULL);
 	chain->flushing = true;
@@ -844,6 +909,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 		return ERR_PTR(-ENOMEM);
 	}
 	mutex_init(&block->lock);
+	mutex_init(&block->proto_destroy_lock);
 	init_rwsem(&block->cb_lock);
 	flow_block_init(&block->flow_block);
 	INIT_LIST_HEAD(&block->chain_list);
@@ -1621,6 +1687,12 @@ static struct tcf_proto *tcf_chain_tp_insert_unique(struct tcf_chain *chain,
 
 	mutex_lock(&chain->filter_chain_lock);
 
+	if (tcf_proto_exists_destroying(chain, tp_new)) {
+		mutex_unlock(&chain->filter_chain_lock);
+		tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
+		return ERR_PTR(-EAGAIN);
+	}
+
 	tp = tcf_chain_tp_find(chain, &chain_info,
 			       protocol, prio, false);
 	if (!tp)
@@ -1628,10 +1700,10 @@ static struct tcf_proto *tcf_chain_tp_insert_unique(struct tcf_chain *chain,
 	mutex_unlock(&chain->filter_chain_lock);
 
 	if (tp) {
-		tcf_proto_destroy(tp_new, rtnl_held, NULL);
+		tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
 		tp_new = tp;
 	} else if (err) {
-		tcf_proto_destroy(tp_new, rtnl_held, NULL);
+		tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
 		tp_new = ERR_PTR(err);
 	}
 
@@ -1669,6 +1741,7 @@ static void tcf_chain_tp_delete_empty(struct tcf_chain *chain,
 		return;
 	}
 
+	tcf_proto_signal_destroying(chain, tp);
 	next = tcf_chain_dereference(chain_info.next, chain);
 	if (tp == chain->filter_chain)
 		tcf_chain0_head_change(chain, next);
@@ -2188,6 +2261,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 		err = -EINVAL;
 		goto errout_locked;
 	} else if (t->tcm_handle == 0) {
+		tcf_proto_signal_destroying(chain, tp);
 		tcf_chain_tp_remove(chain, &chain_info, tp);
 		mutex_unlock(&chain->filter_chain_lock);
 

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules from tcf_proto destroy race
  2019-11-01 13:01 ` Vlad Buslov
@ 2019-11-01 14:54   ` John Hurley
  2019-11-01 15:08     ` Vlad Buslov
  0 siblings, 1 reply; 5+ messages in thread
From: John Hurley @ 2019-11-01 14:54 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev@vger.kernel.org, simon.horman@netronome.com,
	jakub.kicinski@netronome.com, louis.peens@netronome.com,
	oss-drivers@netronome.com

On Fri, Nov 1, 2019 at 1:01 PM Vlad Buslov <vladbu@mellanox.com> wrote:
>
>
> On Thu 31 Oct 2019 at 21:08, John Hurley <john.hurley@netronome.com> wrote:
> > When a new filter is added to cls_api, the function
> > tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
> > determine if the tcf_proto is duplicated in the chain's hashtable. It then
> > creates a new entry or continues with an existing one. In cls_flower, this
> > allows the function fl_ht_insert_unque to determine if a filter is a
> > duplicate and reject appropriately, meaning that the duplicate will not be
> > passed to drivers via the offload hooks. However, when a tcf_proto is
> > destroyed it is removed from its chain before a hardware remove hook is
> > hit. This can lead to a race whereby the driver has not received the
> > remove message but duplicate flows can be accepted. This, in turn, can
> > lead to the offload driver receiving incorrect duplicate flows and out of
> > order add/delete messages.
> >
> > Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
> > hash table per block stores each unique chain/protocol/prio being
> > destroyed. This entry is only removed when the full destroy (and hardware
> > offload) has completed. If a new flow is being added with the same
> > identiers as a tc_proto being detroyed, then the add request is replayed
> > until the destroy is complete.
> >
> > v1->v2:
> > - put tcf_proto into block->proto_destroy_ht rather than creating new key
> >   (Vlad Buslov)
> > - add error log for cases when hash entry fails. Previously it failed
> >   silently with no indication. (Vlad Buslov)
> >
> > Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
> > Signed-off-by: John Hurley <john.hurley@netronome.com>
> > Reviewed-by: Simon Horman <simon.horman@netronome.com>
> > Reported-by: Louis Peens <louis.peens@netronome.com>
> > ---
>
> Hi John,
>
> Patch looks good! However, I think we can simplify it even more and
> remove duplication of data in tcf_proto (hashtable key contains copy of
> data that is already available in the struct itself) and remove all
> dynamic memory allocations. I have refactored your patch accordingly
> (attached). WDYT?
>

Hi Vlad,
Thanks for the review/modifications.
The changes look good to me but I can fire it through our test setup to be sure.

The only thing I'm a bit concerned about here is the size of the
static allocation.
We are now defining quite a large hash table per block (65536 buckets).
It's hard to know exactly how many elements will be in this at the one time.
With flushes of large chains it may well become quite full, but I'd
expect that it will be empty a majority of the time.
Resizable hash seems a bit more appropriate here.
Do you have any opinions on this?

John

> [...]
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules from tcf_proto destroy race
  2019-11-01 14:54   ` John Hurley
@ 2019-11-01 15:08     ` Vlad Buslov
  2019-11-01 15:27       ` John Hurley
  0 siblings, 1 reply; 5+ messages in thread
From: Vlad Buslov @ 2019-11-01 15:08 UTC (permalink / raw)
  To: John Hurley
  Cc: Vlad Buslov, Jiri Pirko, netdev@vger.kernel.org,
	simon.horman@netronome.com, jakub.kicinski@netronome.com,
	louis.peens@netronome.com, oss-drivers@netronome.com


On Fri 01 Nov 2019 at 16:54, John Hurley <john.hurley@netronome.com> wrote:
> On Fri, Nov 1, 2019 at 1:01 PM Vlad Buslov <vladbu@mellanox.com> wrote:
>>
>>
>> On Thu 31 Oct 2019 at 21:08, John Hurley <john.hurley@netronome.com> wrote:
>> > When a new filter is added to cls_api, the function
>> > tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
>> > determine if the tcf_proto is duplicated in the chain's hashtable. It then
>> > creates a new entry or continues with an existing one. In cls_flower, this
>> > allows the function fl_ht_insert_unque to determine if a filter is a
>> > duplicate and reject appropriately, meaning that the duplicate will not be
>> > passed to drivers via the offload hooks. However, when a tcf_proto is
>> > destroyed it is removed from its chain before a hardware remove hook is
>> > hit. This can lead to a race whereby the driver has not received the
>> > remove message but duplicate flows can be accepted. This, in turn, can
>> > lead to the offload driver receiving incorrect duplicate flows and out of
>> > order add/delete messages.
>> >
>> > Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
>> > hash table per block stores each unique chain/protocol/prio being
>> > destroyed. This entry is only removed when the full destroy (and hardware
>> > offload) has completed. If a new flow is being added with the same
>> > identiers as a tc_proto being detroyed, then the add request is replayed
>> > until the destroy is complete.
>> >
>> > v1->v2:
>> > - put tcf_proto into block->proto_destroy_ht rather than creating new key
>> >   (Vlad Buslov)
>> > - add error log for cases when hash entry fails. Previously it failed
>> >   silently with no indication. (Vlad Buslov)
>> >
>> > Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
>> > Signed-off-by: John Hurley <john.hurley@netronome.com>
>> > Reviewed-by: Simon Horman <simon.horman@netronome.com>
>> > Reported-by: Louis Peens <louis.peens@netronome.com>
>> > ---
>>
>> Hi John,
>>
>> Patch looks good! However, I think we can simplify it even more and
>> remove duplication of data in tcf_proto (hashtable key contains copy of
>> data that is already available in the struct itself) and remove all
>> dynamic memory allocations. I have refactored your patch accordingly
>> (attached). WDYT?
>>
>
> Hi Vlad,
> Thanks for the review/modifications.
> The changes look good to me but I can fire it through our test setup to be sure.
>
> The only thing I'm a bit concerned about here is the size of the
> static allocation.
> We are now defining quite a large hash table per block (65536 buckets).
> It's hard to know exactly how many elements will be in this at the one time.
> With flushes of large chains it may well become quite full, but I'd
> expect that it will be empty a majority of the time.
> Resizable hash seems a bit more appropriate here.
> Do you have any opinions on this?
>
> John

Yeah, I agree that 65536 buckets is quite an overkill for this. I think
cls API assumes that there is not too many tp instances because they are
attached to chain through regular linked list and each lookup is a
linear search. With this I assume that proto_destroy_ht size can be
safely reduced to 256 buckets, if not less. Resizable table is an
option, but that also sounds like an overkill to me since linear search
over list of chains attached to block or over list of tp's attached to
chain will be the main bottleneck, if user creates hundreds of thousands
of proto instances.

>
>> [...]
>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules from tcf_proto destroy race
  2019-11-01 15:08     ` Vlad Buslov
@ 2019-11-01 15:27       ` John Hurley
  0 siblings, 0 replies; 5+ messages in thread
From: John Hurley @ 2019-11-01 15:27 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev@vger.kernel.org, simon.horman@netronome.com,
	jakub.kicinski@netronome.com, louis.peens@netronome.com,
	oss-drivers@netronome.com

On Fri, Nov 1, 2019 at 3:08 PM Vlad Buslov <vladbu@mellanox.com> wrote:
>
>
> On Fri 01 Nov 2019 at 16:54, John Hurley <john.hurley@netronome.com> wrote:
> > On Fri, Nov 1, 2019 at 1:01 PM Vlad Buslov <vladbu@mellanox.com> wrote:
> >>
> >>
> >> On Thu 31 Oct 2019 at 21:08, John Hurley <john.hurley@netronome.com> wrote:
> >> > When a new filter is added to cls_api, the function
> >> > tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
> >> > determine if the tcf_proto is duplicated in the chain's hashtable. It then
> >> > creates a new entry or continues with an existing one. In cls_flower, this
> >> > allows the function fl_ht_insert_unque to determine if a filter is a
> >> > duplicate and reject appropriately, meaning that the duplicate will not be
> >> > passed to drivers via the offload hooks. However, when a tcf_proto is
> >> > destroyed it is removed from its chain before a hardware remove hook is
> >> > hit. This can lead to a race whereby the driver has not received the
> >> > remove message but duplicate flows can be accepted. This, in turn, can
> >> > lead to the offload driver receiving incorrect duplicate flows and out of
> >> > order add/delete messages.
> >> >
> >> > Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
> >> > hash table per block stores each unique chain/protocol/prio being
> >> > destroyed. This entry is only removed when the full destroy (and hardware
> >> > offload) has completed. If a new flow is being added with the same
> >> > identiers as a tc_proto being detroyed, then the add request is replayed
> >> > until the destroy is complete.
> >> >
> >> > v1->v2:
> >> > - put tcf_proto into block->proto_destroy_ht rather than creating new key
> >> >   (Vlad Buslov)
> >> > - add error log for cases when hash entry fails. Previously it failed
> >> >   silently with no indication. (Vlad Buslov)
> >> >
> >> > Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
> >> > Signed-off-by: John Hurley <john.hurley@netronome.com>
> >> > Reviewed-by: Simon Horman <simon.horman@netronome.com>
> >> > Reported-by: Louis Peens <louis.peens@netronome.com>
> >> > ---
> >>
> >> Hi John,
> >>
> >> Patch looks good! However, I think we can simplify it even more and
> >> remove duplication of data in tcf_proto (hashtable key contains copy of
> >> data that is already available in the struct itself) and remove all
> >> dynamic memory allocations. I have refactored your patch accordingly
> >> (attached). WDYT?
> >>
> >
> > Hi Vlad,
> > Thanks for the review/modifications.
> > The changes look good to me but I can fire it through our test setup to be sure.
> >
> > The only thing I'm a bit concerned about here is the size of the
> > static allocation.
> > We are now defining quite a large hash table per block (65536 buckets).
> > It's hard to know exactly how many elements will be in this at the one time.
> > With flushes of large chains it may well become quite full, but I'd
> > expect that it will be empty a majority of the time.
> > Resizable hash seems a bit more appropriate here.
> > Do you have any opinions on this?
> >
> > John
>
> Yeah, I agree that 65536 buckets is quite an overkill for this. I think
> cls API assumes that there is not too many tp instances because they are
> attached to chain through regular linked list and each lookup is a
> linear search. With this I assume that proto_destroy_ht size can be
> safely reduced to 256 buckets, if not less. Resizable table is an
> option, but that also sounds like an overkill to me since linear search
> over list of chains attached to block or over list of tp's attached to
> chain will be the main bottleneck, if user creates hundreds of thousands
> of proto instances.
>

Yes, it's a valid point about the tp/chain lookup being the bottleneck
in cases of high usage.
Let's reduce the buckets to say 128 and avoid the need for rhash.
Thanks


> >
> >> [...]
> >>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-11-01 15:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-10-31 19:08 [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules from tcf_proto destroy race John Hurley
2019-11-01 13:01 ` Vlad Buslov
2019-11-01 14:54   ` John Hurley
2019-11-01 15:08     ` Vlad Buslov
2019-11-01 15:27       ` John Hurley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).