* Re: [B.A.T.M.A.N.] [PATCH] Re: batman-adv: Correct rcu refcounting for gw_node
@ 2011-02-02 23:54 jay.busch
0 siblings, 0 replies; 8+ messages in thread
From: jay.busch @ 2011-02-02 23:54 UTC (permalink / raw)
To: The list for a Better Approach To Mobile Ad-hoc Networking
Sent from my HTC PURE™, a Windows® phone from AT&T
-----Original Message-----
From: Marek Lindner <lindner_marek@yahoo.de>
Sent: Wednesday, February 02, 2011 2:49 PM
To: The list for a Better Approach To Mobile Ad-hoc Networking <b.a.t.m.a.n@lists.open-mesh.org>
Subject: Re: [B.A.T.M.A.N.] [PATCH] Re: batman-adv: Correct rcu refcounting for gw_node
On Wednesday 02 February 2011 18:37:18 Linus Lüssing wrote:
> So after some more discussions with Marek and Sven, it looks like we
> have to use the rcu protected macros rcu_dereference() and
> rcu_assign_pointer() for the bat_priv->curr_gw and curr_gw->orig_node.
>
> Changes here also include moving the kref_get() from unicast_send_skb()
> into gw_get_selected(). The orig_node could have been freed already at
> the time the kref_get() was called in unicast_send_skb().
I'd suggest you make a standalone patch because the patches address different
problems.
Thanks,
Marek
^ permalink raw reply [flat|nested] 8+ messages in thread

* [B.A.T.M.A.N.] [PATCH 1/4] batman-adv: Correct rcu refcounting for gw_node
@ 2011-01-30 1:52 Sven Eckelmann
2011-02-02 17:37 ` [B.A.T.M.A.N.] [PATCH] " Linus Lüssing
0 siblings, 1 reply; 8+ messages in thread
From: Sven Eckelmann @ 2011-01-30 1:52 UTC (permalink / raw)
To: b.a.t.m.a.n

<TODO: write a long monologue about every problem we have or could have or
maybe never had and would have when we not have it>

Signed-off-by: Sven Eckelmann <sven@narfation.org>
---
 batman-adv/gateway_client.c |   28 +++++++++++++---------------
 batman-adv/types.h          |    2 +-
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/batman-adv/gateway_client.c b/batman-adv/gateway_client.c
index 429a013..8ce3a63 100644
--- a/batman-adv/gateway_client.c
+++ b/batman-adv/gateway_client.c
@@ -28,20 +28,18 @@
 #include <linux/udp.h>
 #include <linux/if_vlan.h>

-static void gw_node_free_ref(struct kref *refcount)
-{
-	struct gw_node *gw_node;
-
-	gw_node = container_of(refcount, struct gw_node, refcount);
-	kfree(gw_node);
-}
-
 static void gw_node_free_rcu(struct rcu_head *rcu)
 {
 	struct gw_node *gw_node;

 	gw_node = container_of(rcu, struct gw_node, rcu);
-	kref_put(&gw_node->refcount, gw_node_free_ref);
+	kfree(gw_node);
+}
+
+static void gw_node_free_ref(struct gw_node *gw_node)
+{
+	if (atomic_dec_and_test(&gw_node->refcount))
+		call_rcu(&gw_node->rcu, gw_node_free_rcu);
 }

 void *gw_get_selected(struct bat_priv *bat_priv)
@@ -61,7 +59,7 @@ void gw_deselect(struct bat_priv *bat_priv)
 	bat_priv->curr_gw = NULL;

 	if (gw_node)
-		kref_put(&gw_node->refcount, gw_node_free_ref);
+		gw_node_free_ref(gw_node);
 }

 static struct gw_node *gw_select(struct bat_priv *bat_priv,
@@ -69,8 +67,8 @@ static struct gw_node *gw_select(struct bat_priv *bat_priv,
 {
 	struct gw_node *curr_gw_node = bat_priv->curr_gw;

-	if (new_gw_node)
-		kref_get(&new_gw_node->refcount);
+	if (new_gw_node && !atomic_inc_not_zero(&new_gw_node->refcount))
+		return NULL;

 	bat_priv->curr_gw = new_gw_node;
 	return curr_gw_node;
@@ -181,7 +179,7 @@ void gw_election(struct bat_priv *bat_priv)

 	/* the kfree() has to be outside of the rcu lock */
 	if (old_gw_node)
-		kref_put(&old_gw_node->refcount, gw_node_free_ref);
+		gw_node_free_ref(old_gw_node);
 }

 void gw_check_election(struct bat_priv *bat_priv, struct orig_node *orig_node)
@@ -242,7 +240,7 @@ static void gw_node_add(struct bat_priv *bat_priv,
 	memset(gw_node, 0, sizeof(struct gw_node));
 	INIT_HLIST_NODE(&gw_node->list);
 	gw_node->orig_node = orig_node;
-	kref_init(&gw_node->refcount);
+	atomic_set(&gw_node->refcount, 1);

 	spin_lock_bh(&bat_priv->gw_list_lock);
 	hlist_add_head_rcu(&gw_node->list, &bat_priv->gw_list);
@@ -325,7 +323,7 @@ void gw_node_purge(struct bat_priv *bat_priv)
 			gw_deselect(bat_priv);

 		hlist_del_rcu(&gw_node->list);
-		call_rcu(&gw_node->rcu, gw_node_free_rcu);
+		gw_node_free_ref(gw_node);
 	}


diff --git a/batman-adv/types.h b/batman-adv/types.h
index e4a0462..ca5f20a 100644
--- a/batman-adv/types.h
+++ b/batman-adv/types.h
@@ -100,7 +100,7 @@ struct gw_node {
 	struct hlist_node list;
 	struct orig_node *orig_node;
 	unsigned long deleted;
-	struct kref refcount;
+	atomic_t refcount;
 	struct rcu_head rcu;
 };

--
1.7.2.3

^ permalink raw reply related [flat|nested] 8+ messages in thread
* [B.A.T.M.A.N.] [PATCH] Re: batman-adv: Correct rcu refcounting for gw_node
2011-01-30 1:52 [B.A.T.M.A.N.] [PATCH 1/4] " Sven Eckelmann
@ 2011-02-02 17:37 ` Linus Lüssing
2011-02-02 19:49 ` Marek Lindner
2011-02-02 21:42 ` Sven Eckelmann
0 siblings, 2 replies; 8+ messages in thread
From: Linus Lüssing @ 2011-02-02 17:37 UTC (permalink / raw)
To: b.a.t.m.a.n

From: Sven Eckelmann <sven@narfation.org>

Was:
---
<TODO: write a long monologue about every problem we have or could have or
maybe never had and would have when we not have it>

Signed-off-by: Sven Eckelmann <sven@narfation.org>
---

So after some more discussions with Marek and Sven, it looks like we
have to use the rcu protected macros rcu_dereference() and
rcu_assign_pointer() for the bat_priv->curr_gw and curr_gw->orig_node.

Changes here also include moving the kref_get() from unicast_send_skb()
into gw_get_selected(). The orig_node could have been freed already at
the time the kref_get() was called in unicast_send_skb().

Some things that are still not that clear to me:

gw_election():
* can the if-block before gw_deselect() be omitted? We had a null pointer
  check for curr_gw just a couple of lines before during the rcu-lock.

gw_deselect():
* is the refcount at this time always 1 for gw_node, can the null
  pointer check + a rcu_dereference be omitted? (at least that's what
  it looks like when comparing to the rcuref.txt example)

gw_get_selected():
* Probably the orig_node's refcounting has to be made atomic, too?

Cheers, Linus

Not-Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch>
---
 gateway_client.c |  169 +++++++++++++++++++++++++++++++++---------------------
 main.c           |    1 +
 types.h          |    7 +-
 unicast.c        |    1 -
 4 files changed, 109 insertions(+), 69 deletions(-)

diff --git a/gateway_client.c b/gateway_client.c
index 429a013..96a67bc 100644
--- a/gateway_client.c
+++ b/gateway_client.c
@@ -28,40 +28,54 @@
 #include <linux/udp.h>
 #include <linux/if_vlan.h>

-static void gw_node_free_ref(struct kref *refcount)
+static void gw_node_free_rcu(struct rcu_head *rcu)
 {
 	struct gw_node *gw_node;

-	gw_node = container_of(refcount, struct gw_node, refcount);
+	gw_node = container_of(rcu, struct gw_node, rcu);
 	kfree(gw_node);
 }

-static void gw_node_free_rcu(struct rcu_head *rcu)
+static void gw_node_free_ref(struct gw_node *gw_node)
 {
-	struct gw_node *gw_node;
-
-	gw_node = container_of(rcu, struct gw_node, rcu);
-	kref_put(&gw_node->refcount, gw_node_free_ref);
+	if (atomic_dec_and_test(&gw_node->refcount))
+		call_rcu(&gw_node->rcu, gw_node_free_rcu);
 }

+/* increases the returned orig_node's refcount */
 void *gw_get_selected(struct bat_priv *bat_priv)
 {
-	struct gw_node *curr_gateway_tmp = bat_priv->curr_gw;
+	struct gw_node *curr_gateway_tmp;
+	struct orig_node *orig_node;

-	if (!curr_gateway_tmp)
+	rcu_read_lock();
+	curr_gateway_tmp = rcu_dereference(bat_priv->curr_gw);
+	if (!curr_gateway_tmp) {
+		rcu_read_unlock();
 		return NULL;
+	}

-	return curr_gateway_tmp->orig_node;
+	orig_node = rcu_dereference(curr_gateway_tmp->orig_node);
+	if (orig_node) {
+		kref_get(&orig_node->refcount);
+		rcu_read_unlock();
+		return NULL;
+	}
+
+	rcu_read_unlock();
+	return orig_node;
 }

 void gw_deselect(struct bat_priv *bat_priv)
 {
-	struct gw_node *gw_node = bat_priv->curr_gw;
+	struct gw_node *gw_node;

-	bat_priv->curr_gw = NULL;
+	spin_lock_bh(&bat_priv->curr_gw_lock);
+	gw_node = bat_priv->curr_gw;
+	rcu_assign_pointer(bat_priv->curr_gw, NULL);
+	spin_unlock_bh(&bat_priv->curr_gw_lock);

-	if (gw_node)
-		kref_put(&gw_node->refcount, gw_node_free_ref);
+	gw_node_free_ref(gw_node);
 }

 static struct gw_node *gw_select(struct bat_priv *bat_priv,
@@ -69,17 +83,21 @@ static struct gw_node *gw_select(struct bat_priv *bat_priv,
 {
 	struct gw_node *curr_gw_node = bat_priv->curr_gw;

-	if (new_gw_node)
-		kref_get(&new_gw_node->refcount);
+	if (new_gw_node && !atomic_inc_not_zero(&new_gw_node->refcount))
+		return NULL;
+
+	spin_lock_bh(&bat_priv->curr_gw_lock);
+	rcu_assign_pointer(bat_priv->curr_gw, new_gw_node);
+	spin_unlock_bh(&bat_priv->curr_gw_lock);

-	bat_priv->curr_gw = new_gw_node;
 	return curr_gw_node;
 }

 void gw_election(struct bat_priv *bat_priv)
 {
 	struct hlist_node *node;
-	struct gw_node *gw_node, *curr_gw_tmp = NULL, *old_gw_node = NULL;
+	struct gw_node *gw_node, *curr_gw, *curr_gw_tmp = NULL, *old_gw_node = NULL;
+	struct orig_node *orig_node;
 	uint8_t max_tq = 0;
 	uint32_t max_gw_factor = 0, tmp_gw_factor = 0;
 	int down, up;
@@ -93,25 +111,28 @@ void gw_election(struct bat_priv *bat_priv)
 	if (atomic_read(&bat_priv->gw_mode) != GW_MODE_CLIENT)
 		return;

-	if (bat_priv->curr_gw)
-		return;
-
 	rcu_read_lock();
-	if (hlist_empty(&bat_priv->gw_list)) {
+	curr_gw = rcu_dereference(bat_priv->curr_gw);
+	if (curr_gw) {
 		rcu_read_unlock();
+		return;
+	}

-		if (bat_priv->curr_gw) {
+	if (hlist_empty(&bat_priv->gw_list)) {
+		if (curr_gw) {
 			bat_dbg(DBG_BATMAN, bat_priv,
 				"Removing selected gateway - "
 				"no gateway in range\n");
 			gw_deselect(bat_priv);
 		}

+		rcu_read_unlock();
 		return;
 	}

 	hlist_for_each_entry_rcu(gw_node, node, &bat_priv->gw_list, list) {
-		if (!gw_node->orig_node->router)
+		orig_node = rcu_dereference(gw_node->orig_node);
+		if (!orig_node->router)
 			continue;

 		if (gw_node->deleted)
@@ -119,18 +140,17 @@ void gw_election(struct bat_priv *bat_priv)

 		switch (atomic_read(&bat_priv->gw_sel_class)) {
 		case 1: /* fast connection */
-			gw_bandwidth_to_kbit(gw_node->orig_node->gw_flags,
-					     &down, &up);
+			gw_bandwidth_to_kbit(orig_node->gw_flags, &down, &up);

-			tmp_gw_factor = (gw_node->orig_node->router->tq_avg *
-					 gw_node->orig_node->router->tq_avg *
+			tmp_gw_factor = (orig_node->router->tq_avg *
+					 orig_node->router->tq_avg *
 					 down * 100 * 100) /
 					 (TQ_LOCAL_WINDOW_SIZE *
 					 TQ_LOCAL_WINDOW_SIZE * 64);

 			if ((tmp_gw_factor > max_gw_factor) ||
 			    ((tmp_gw_factor == max_gw_factor) &&
-			     (gw_node->orig_node->router->tq_avg > max_tq)))
+			     (orig_node->router->tq_avg > max_tq)))
 				curr_gw_tmp = gw_node;
 			break;

@@ -142,37 +162,38 @@ void gw_election(struct bat_priv *bat_priv)
 			  * soon as a better gateway appears which has
 			  * $routing_class more tq points)
 			  **/
-			if (gw_node->orig_node->router->tq_avg > max_tq)
+			if (orig_node->router->tq_avg > max_tq)
 				curr_gw_tmp = gw_node;
 			break;
 		}

-		if (gw_node->orig_node->router->tq_avg > max_tq)
-			max_tq = gw_node->orig_node->router->tq_avg;
+		if (orig_node->router->tq_avg > max_tq)
+			max_tq = orig_node->router->tq_avg;

 		if (tmp_gw_factor > max_gw_factor)
 			max_gw_factor = tmp_gw_factor;
 	}

-	if (bat_priv->curr_gw != curr_gw_tmp) {
-		if ((bat_priv->curr_gw) && (!curr_gw_tmp))
+	if (curr_gw != curr_gw_tmp) {
+		orig_node = rcu_dereference(curr_gw_tmp->orig_node);
+		if ((curr_gw) && (!curr_gw_tmp))
 			bat_dbg(DBG_BATMAN, bat_priv,
 				"Removing selected gateway - "
 				"no gateway in range\n");
-		else if ((!bat_priv->curr_gw) && (curr_gw_tmp))
+		else if ((!curr_gw) && (curr_gw_tmp))
 			bat_dbg(DBG_BATMAN, bat_priv,
 				"Adding route to gateway %pM "
 				"(gw_flags: %i, tq: %i)\n",
-				curr_gw_tmp->orig_node->orig,
-				curr_gw_tmp->orig_node->gw_flags,
-				curr_gw_tmp->orig_node->router->tq_avg);
+				orig_node->orig,
+				orig_node->gw_flags,
+				orig_node->router->tq_avg);
 		else
 			bat_dbg(DBG_BATMAN, bat_priv,
 				"Changing route to gateway %pM "
 				"(gw_flags: %i, tq: %i)\n",
-				curr_gw_tmp->orig_node->orig,
-				curr_gw_tmp->orig_node->gw_flags,
-				curr_gw_tmp->orig_node->router->tq_avg);
+				orig_node->orig,
+				orig_node->gw_flags,
+				orig_node->router->tq_avg);

 		old_gw_node = gw_select(bat_priv, curr_gw_tmp);
 	}
@@ -181,36 +202,40 @@ void gw_election(struct bat_priv *bat_priv)

 	/* the kfree() has to be outside of the rcu lock */
 	if (old_gw_node)
-		kref_put(&old_gw_node->refcount, gw_node_free_ref);
+		gw_node_free_ref(old_gw_node);
 }

 void gw_check_election(struct bat_priv *bat_priv, struct orig_node *orig_node)
 {
-	struct gw_node *curr_gateway_tmp = bat_priv->curr_gw;
+	struct gw_node *curr_gateway_tmp;
+	struct orig_node *curr_gw_orig;
 	uint8_t gw_tq_avg, orig_tq_avg;

+	rcu_read_lock();
+	curr_gateway_tmp = rcu_dereference(bat_priv->curr_gw);
 	if (!curr_gateway_tmp)
-		return;
+		goto rcu_unlock;

-	if (!curr_gateway_tmp->orig_node)
+	curr_gw_orig = rcu_dereference(curr_gateway_tmp->orig_node);
+	if (!curr_gw_orig)
 		goto deselect;

-	if (!curr_gateway_tmp->orig_node->router)
+	if (!curr_gw_orig->router)
 		goto deselect;

 	/* this node already is the gateway */
-	if (curr_gateway_tmp->orig_node == orig_node)
-		return;
+	if (curr_gw_orig == orig_node)
+		goto deselect;

 	if (!orig_node->router)
-		return;
+		goto rcu_unlock;

-	gw_tq_avg = curr_gateway_tmp->orig_node->router->tq_avg;
+	gw_tq_avg = curr_gw_orig ->router->tq_avg;
 	orig_tq_avg = orig_node->router->tq_avg;

 	/* the TQ value has to be better */
 	if (orig_tq_avg < gw_tq_avg)
-		return;
+		goto rcu_unlock;

 	/**
 	 * if the routing class is greater than 3 the value tells us how much
@@ -218,7 +243,7 @@ void gw_check_election(struct bat_priv *bat_priv, struct orig_node *orig_node)
 	 **/
 	if ((atomic_read(&bat_priv->gw_sel_class) > 3) &&
 	    (orig_tq_avg - gw_tq_avg < atomic_read(&bat_priv->gw_sel_class)))
-		return;
+		goto rcu_unlock;

 	bat_dbg(DBG_BATMAN, bat_priv,
 		"Restarting gateway selection: better gateway found (tq curr: "
@@ -227,6 +252,8 @@ void gw_check_election(struct bat_priv *bat_priv, struct orig_node *orig_node)

 deselect:
 	gw_deselect(bat_priv);
+rcu_unlock:
+	rcu_read_unlock();
 }

 static void gw_node_add(struct bat_priv *bat_priv,
@@ -242,7 +269,7 @@ static void gw_node_add(struct bat_priv *bat_priv,
 	memset(gw_node, 0, sizeof(struct gw_node));
 	INIT_HLIST_NODE(&gw_node->list);
 	gw_node->orig_node = orig_node;
-	kref_init(&gw_node->refcount);
+	atomic_set(&gw_node->refcount, 1);

 	spin_lock_bh(&bat_priv->gw_list_lock);
 	hlist_add_head_rcu(&gw_node->list, &bat_priv->gw_list);
@@ -325,7 +352,7 @@ void gw_node_purge(struct bat_priv *bat_priv)
 			gw_deselect(bat_priv);

 		hlist_del_rcu(&gw_node->list);
-		call_rcu(&gw_node->rcu, gw_node_free_rcu);
+		gw_node_free_ref(gw_node);
 	}


@@ -335,21 +362,29 @@ void gw_node_purge(struct bat_priv *bat_priv)
 static int _write_buffer_text(struct bat_priv *bat_priv,
 			      struct seq_file *seq, struct gw_node *gw_node)
 {
-	int down, up;
+	struct gw_node *curr_gw;
+	struct orig_node *orig_node;
+	int down, up, ret;

-	gw_bandwidth_to_kbit(gw_node->orig_node->gw_flags, &down, &up);
-
-	return seq_printf(seq, "%s %pM (%3i) %pM [%10s]: %3i - %i%s/%i%s\n",
-		       (bat_priv->curr_gw == gw_node ? "=>" : " "),
-		       gw_node->orig_node->orig,
-		       gw_node->orig_node->router->tq_avg,
-		       gw_node->orig_node->router->addr,
-		       gw_node->orig_node->router->if_incoming->net_dev->name,
-		       gw_node->orig_node->gw_flags,
+	rcu_read_lock();
+	curr_gw = rcu_dereference(bat_priv->curr_gw);
+	orig_node = rcu_dereference(gw_node->orig_node);
+	gw_bandwidth_to_kbit(orig_node->gw_flags, &down, &up);
+
+	ret = seq_printf(seq, "%s %pM (%3i) %pM [%10s]: %3i - %i%s/%i%s\n",
+		       (curr_gw == gw_node ? "=>" : " "),
+		       orig_node->orig,
+		       orig_node->router->tq_avg,
+		       orig_node->router->addr,
+		       orig_node->router->if_incoming->net_dev->name,
+		       orig_node->gw_flags,
 		       (down > 2048 ? down / 1024 : down),
 		       (down > 2048 ? "MBit" : "KBit"),
 		       (up > 2048 ? up / 1024 : up),
 		       (up > 2048 ? "MBit" : "KBit"));
+	rcu_read_unlock();
+
+	return ret;
 }

 int gw_client_seq_print_text(struct seq_file *seq, void *offset)
@@ -470,8 +505,12 @@ int gw_is_target(struct bat_priv *bat_priv, struct sk_buff *skb)
 	if (atomic_read(&bat_priv->gw_mode) == GW_MODE_SERVER)
 		return -1;

-	if (!bat_priv->curr_gw)
+	rcu_read_lock();
+	if (!rcu_dereference(bat_priv->curr_gw)) {
+		rcu_read_unlock();
 		return 0;
+	}
+	rcu_read_unlock();

 	return 1;
 }
diff --git a/main.c b/main.c
index e687e7f..8679260 100644
--- a/main.c
+++ b/main.c
@@ -85,6 +85,7 @@ int mesh_init(struct net_device *soft_iface)
 	spin_lock_init(&bat_priv->hna_lhash_lock);
 	spin_lock_init(&bat_priv->hna_ghash_lock);
 	spin_lock_init(&bat_priv->gw_list_lock);
+	spin_lock_init(&bat_priv->curr_gw_lock);
 	spin_lock_init(&bat_priv->vis_hash_lock);
 	spin_lock_init(&bat_priv->vis_list_lock);
 	spin_lock_init(&bat_priv->softif_neigh_lock);
diff --git a/types.h b/types.h
index e4a0462..b9b20b6 100644
--- a/types.h
+++ b/types.h
@@ -98,9 +98,9 @@ struct orig_node {

 struct gw_node {
 	struct hlist_node list;
-	struct orig_node *orig_node;
+	struct orig_node *orig_node; /* rcu protected pointer */
 	unsigned long deleted;
-	struct kref refcount;
+	atomic_t refcount;
 	struct rcu_head rcu;
 };

@@ -163,6 +163,7 @@ struct bat_priv {
 	spinlock_t hna_lhash_lock; /* protects hna_local_hash */
 	spinlock_t hna_ghash_lock; /* protects hna_global_hash */
 	spinlock_t gw_list_lock; /* protects gw_list */
+	spinlock_t curr_gw_lock; /* protects curr_gw updates */
 	spinlock_t vis_hash_lock; /* protects vis_hash */
 	spinlock_t vis_list_lock; /* protects vis_info::recv_list */
 	spinlock_t softif_neigh_lock; /* protects soft-interface neigh list */
@@ -171,7 +172,7 @@ struct bat_priv {
 	struct delayed_work hna_work;
 	struct delayed_work orig_work;
 	struct delayed_work vis_work;
-	struct gw_node *curr_gw;
+	struct gw_node *curr_gw; /* rcu protected pointer */
 	struct vis_info *my_vis_info;
 };

diff --git a/unicast.c b/unicast.c
index 6a9ab61..8816102 100644
--- a/unicast.c
+++ b/unicast.c
@@ -298,7 +298,6 @@ int unicast_send_skb(struct sk_buff *skb, struct bat_priv *bat_priv)
 		if (!orig_node)
 			goto trans_search;

-		kref_get(&orig_node->refcount);
 		goto find_router;
 	} else {
 		rcu_read_lock();
--
1.7.2.3

^ permalink raw reply related [flat|nested] 8+ messages in thread
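
Both patches above replace kref with a plain atomic counter so that a reader can refuse to take a reference on an object whose count has already dropped to zero (atomic_inc_not_zero()) and so that the last put triggers the release (atomic_dec_and_test()). The following is only a minimal userspace sketch of that get/put pattern, assuming C11 <stdatomic.h> in place of the kernel's atomic_t and a plain free() in place of call_rcu(). The names struct node, node_alloc(), node_get(), node_put() and nodes_freed are hypothetical and do not appear in batman-adv.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct gw_node; only the refcount matters here. */
struct node {
	atomic_int refcount;
};

/* Illustration hook: counts how many nodes have been released. */
static int nodes_freed;

/* New objects start with one reference, like atomic_set(&refcount, 1). */
static struct node *node_alloc(void)
{
	struct node *n = calloc(1, sizeof(*n));

	if (n)
		atomic_init(&n->refcount, 1);
	return n;
}

/*
 * Userspace analogue of atomic_inc_not_zero(): take a reference only if
 * the count is still nonzero, otherwise report failure to the caller.
 */
static bool node_get(struct node *n)
{
	int old = atomic_load(&n->refcount);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&n->refcount, &old, old + 1))
			return true;
		/* 'old' was refreshed by the failed compare-exchange; retry */
	}
	return false;
}

/*
 * Userspace analogue of atomic_dec_and_test() followed by the release.
 * Assumption: free() replaces call_rcu(), so this sketch gives no
 * protection to concurrent lock-free readers.
 */
static void node_put(struct node *n)
{
	if (atomic_fetch_sub(&n->refcount, 1) == 1) {
		nodes_freed++;
		free(n);
	}
}
```

A caller would pair every successful node_get() with exactly one node_put(), and treat a failed node_get() the same way as not having found the object at all.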
* Re: [B.A.T.M.A.N.] [PATCH] Re: batman-adv: Correct rcu refcounting for gw_node
2011-02-02 17:37 ` [B.A.T.M.A.N.] [PATCH] " Linus Lüssing
@ 2011-02-02 19:49 ` Marek Lindner
2011-02-02 20:43 ` Linus Lüssing
2011-02-02 21:42 ` Sven Eckelmann
1 sibling, 1 reply; 8+ messages in thread
From: Marek Lindner @ 2011-02-02 19:49 UTC (permalink / raw)
To: The list for a Better Approach To Mobile Ad-hoc Networking

On Wednesday 02 February 2011 18:37:18 Linus Lüssing wrote:
> So after some more discussions with Marek and Sven, it looks like we
> have to use the rcu protected macros rcu_dereference() and
> rcu_assign_pointer() for the bat_priv->curr_gw and curr_gw->orig_node.
>
> Changes here also include moving the kref_get() from unicast_send_skb()
> into gw_get_selected(). The orig_node could have been freed already at
> the time the kref_get() was called in unicast_send_skb().

I'd suggest you make a standalone patch because the patches address different
problems.

Thanks,
Marek

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [B.A.T.M.A.N.] [PATCH] Re: batman-adv: Correct rcu refcounting for gw_node
2011-02-02 19:49 ` Marek Lindner
@ 2011-02-02 20:43 ` Linus Lüssing
0 siblings, 0 replies; 8+ messages in thread
From: Linus Lüssing @ 2011-02-02 20:43 UTC (permalink / raw)
To: The list for a Better Approach To Mobile Ad-hoc Networking

On Wed, Feb 02, 2011 at 08:49:17PM +0100, Marek Lindner wrote:
> On Wednesday 02 February 2011 18:37:18 Linus Lüssing wrote:
> > So after some more discussions with Marek and Sven, it looks like we
> > have to use the rcu protected macros rcu_dereference() and
> > rcu_assign_pointer() for the bat_priv->curr_gw and curr_gw->orig_node.
> >
> > Changes here also include moving the kref_get() from unicast_send_skb()
> > into gw_get_selected(). The orig_node could have been freed already at
> > the time the kref_get() was called in unicast_send_skb().
>
> I'd suggest you make a standalone patch because the patches address different
> problems.
>
> Thanks,
> Marek
>

Oki doki, will do that (wasn't meant to be a clean patch yet anyway - I
wanted to know first if the usage of rcu_dereference/rcu_assign_pointer()
was going in the right direction).

Cheers, Linus

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [B.A.T.M.A.N.] [PATCH] Re: batman-adv: Correct rcu refcounting for gw_node
2011-02-02 17:37 ` [B.A.T.M.A.N.] [PATCH] " Linus Lüssing
2011-02-02 19:49 ` Marek Lindner
@ 2011-02-02 21:42 ` Sven Eckelmann
2011-02-03 0:19 ` Marek Lindner
2011-02-03 9:55 ` Linus Lüssing
1 sibling, 2 replies; 8+ messages in thread
From: Sven Eckelmann @ 2011-02-02 21:42 UTC (permalink / raw)
To: Linus Lüssing; +Cc: b.a.t.m.a.n

[-- Attachment #1: Type: Text/Plain, Size: 4106 bytes --]

On Wednesday 02 February 2011 18:37:18 Linus Lüssing wrote:
> From: Sven Eckelmann <sven@narfation.org>
>
> Was:
> ---
> <TODO: write a long monologue about every problem we have or could have or
> maybe never had and would have when we not have it>
>
> Signed-off-by: Sven Eckelmann <sven@narfation.org>
> ---
>
> So after some more discussions with Marek and Sven, it looks like we
> have to use the rcu protected macros rcu_dereference() and
> rcu_assign_pointer() for the bat_priv->curr_gw and curr_gw->orig_node.
>
> Changes here also include moving the kref_get() from unicast_send_skb()
> into gw_get_selected(). The orig_node could have been freed already at
> the time the kref_get() was called in unicast_send_skb().
>
> Some things that are still not that clear to me:
>
> gw_election():
> * can the if-block before gw_deselect() be omitted? We had a null pointer
>   check for curr_gw just a couple of lines before during the rcu-lock.

I thought that this if block should be moved to gw_select. And your gw_select
still has the bug that the bat_priv->curr_gw isn't set to NULL when
new_gw_node is NULL.

> gw_deselect():
> * is the refcount at this time always 1 for gw_node, can the null
>   pointer check + a rcu_dereference be omitted? (at least that's what
>   it looks like when comparing to the rcuref.txt example)

Why can't it be NULL? And _always_ use rcu_dereference. What example tells you
that it isn't needed? None of the examples has any kind of rcu pointer in it
(just el as pointer which is stored in a struct where the pointer inside the
struct is rcu protected).

> gw_get_selected():
> * Probably the orig_node's refcounting has to be made atomic, too?

This part is still a little bit ugly and I cannot give you an easy answer.
Just think about the following:
* Hash list is a bunch of rcu protected lists
* pointer to originator is stored inside a bucket (list elements inside the
  hash)
* hash bucket wants to get removed - call_rcu; reference count of the
  originator is decremented immediately
* (!!!! lots of reordering of read and write commands inside the cpu!!!! -
  aren't we happy about the added complexity which tries to hide the memory
  latency?)
* the originator was removed, the bucket which is removed in the call_rcu
  still points to the removed originator
* a parallel running operation tries to find an originator, the rcu list
  iterator gets the to-be-deleted bucket to the originator
* the pointer to the already removed originator inside the bucket is
  dereferenced, data is read/written -> Kernel Oops

Does this sound scary? At least it could be used in some horror movies (and I
would watch them).

But that is the other problem I currently have with the state of batman-adv in
trunk - and I think I forgot to tell you about it after the release of
v2011.0.0.

So, a good idea would be the removal of the buckets for the hash. Usage of
"struct hlist_node" inside the hash elements should be a good starting point.
But think about the problem that the different hashes could have the same
element. So you need for each distinct hash an extra "struct hlist_node"
inside the element which should be part of the hash. The hash_add (and
related) functions don't get the actual pointer to the element, but the
pointer to the correct "struct hlist_node" inside the element/struct. The
comparison and hashing function would also receive "struct hlist_node" as
parameter and must get the pointer to the element using the container_of
macro.

> @@ -171,7 +172,7 @@ struct bat_priv {
>  	struct delayed_work hna_work;
>  	struct delayed_work orig_work;
>  	struct delayed_work vis_work;
> -	struct gw_node *curr_gw;
> +	struct gw_node *curr_gw; /* rcu protected pointer */
>  	struct vis_info *my_vis_info;
>  };

Sorry, but I have to say that: FAIL ;)

I think it should look that way:
> -	struct gw_node *curr_gw;
> +	struct gw_node __rcu *curr_gw;

Best regards,
Sven

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply [flat|nested] 8+ messages in thread
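
The bucket-free hash that Sven outlines above (one list node embedded in the element per hash, with the hash and compare callbacks receiving the node and recovering the element through container_of) could look roughly like the userspace sketch below. Everything in it is an assumption for illustration: struct hnode stands in for the kernel's struct hlist_node, and struct hashtable, struct orig, hash_add(), hash_find(), orig_hash() and orig_equal() are hypothetical names, not the batman-adv API. There is no RCU and no locking here, only the data layout.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as the kernel macro: recover the enclosing struct from a member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define HASH_SIZE 8

/* Simplified stand-in for struct hlist_node (singly linked). */
struct hnode {
	struct hnode *next;
};

struct hashtable {
	struct hnode *heads[HASH_SIZE];
};

/* Hypothetical element: it embeds one node per hash it is a member of. */
struct orig {
	unsigned int addr;
	struct hnode hash_entry;
};

/* The callbacks only ever see the embedded node. */
static unsigned int orig_hash(struct hnode *node)
{
	struct orig *o = container_of(node, struct orig, hash_entry);

	return o->addr % HASH_SIZE;
}

static int orig_equal(struct hnode *a, struct hnode *b)
{
	return container_of(a, struct orig, hash_entry)->addr ==
	       container_of(b, struct orig, hash_entry)->addr;
}

/* Link the embedded node itself; no separately allocated bucket exists. */
static void hash_add(struct hashtable *t, struct hnode *node,
		     unsigned int (*hashfn)(struct hnode *))
{
	unsigned int idx = hashfn(node);

	node->next = t->heads[idx];
	t->heads[idx] = node;
}

/* Look up by a key node embedded in a temporary element. */
static struct hnode *hash_find(struct hashtable *t, struct hnode *key,
			       unsigned int (*hashfn)(struct hnode *),
			       int (*cmp)(struct hnode *, struct hnode *))
{
	struct hnode *pos;

	for (pos = t->heads[hashfn(key)]; pos; pos = pos->next)
		if (cmp(pos, key))
			return pos;
	return NULL;
}
```

Because the list node lives inside the element, the node and the element share one lifetime, which is the property the scenario above is missing when a separately freed bucket still points at an already released originator.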
* Re: [B.A.T.M.A.N.] [PATCH] Re: batman-adv: Correct rcu refcounting for gw_node
2011-02-02 21:42 ` Sven Eckelmann
@ 2011-02-03 0:19 ` Marek Lindner
2011-02-03 9:55 ` Linus Lüssing
1 sibling, 0 replies; 8+ messages in thread
From: Marek Lindner @ 2011-02-03 0:19 UTC (permalink / raw)
To: The list for a Better Approach To Mobile Ad-hoc Networking

On Wednesday 02 February 2011 22:42:46 Sven Eckelmann wrote:
> > gw_election():
> > * can the if-block before gw_deselect() be omitted? We had a null pointer
> >   check for curr_gw just a couple of lines before during the rcu-lock.
>
> I thought that this if block should be moved to gw_select. And your
> gw_select still has the bug that the bat_priv->curr_gw isn't set to NULL
> when new_gw_node is NULL.

Yes, but we discussed this without Linus.

@Linus: This section will be changed - I'm working on this.

Regards,
Marek

^ permalink raw reply [flat|nested] 8+ messages in thread
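
The NULL handling debated in this sub-thread (whether gw_deselect() may drop its NULL check, and clearing curr_gw when nothing is selected) can be illustrated with a small userspace sketch. Note the swapped-in technique: instead of the patch's spinlock plus rcu_assign_pointer(), the sketch uses a single C11 atomic_exchange() to clear the pointer and learn the old value in one step. The names struct gw, struct priv, gw_put(), deselect() and released are hypothetical, and the actual free is only counted, not performed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct gw_node. */
struct gw {
	atomic_int refcount;
};

/* Hypothetical stand-in for the curr_gw member of struct bat_priv. */
struct priv {
	_Atomic(struct gw *) curr_gw;
};

/* Illustration hook: counts objects whose last reference was dropped. */
static int released;

static void gw_put(struct gw *gw)
{
	/* Real code would defer the free here, e.g. through call_rcu(). */
	if (atomic_fetch_sub(&gw->refcount, 1) == 1)
		released++;
}

/*
 * Clear the selected gateway and drop the reference the pointer held.
 * The exchange returns whatever was stored before, so two racing callers
 * cannot both see the same non-NULL pointer, and the NULL check keeps a
 * second call from touching a gateway that is no longer selected.
 */
static bool deselect(struct priv *p)
{
	struct gw *old = atomic_exchange(&p->curr_gw, (struct gw *)NULL);

	if (!old)
		return false;

	gw_put(old);
	return true;
}
```

Calling deselect() twice in a row is harmless in this sketch: the second call finds NULL and returns false without dropping a reference it does not own.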
* Re: [B.A.T.M.A.N.] [PATCH] Re: batman-adv: Correct rcu refcounting for gw_node
2011-02-02 21:42 ` Sven Eckelmann
2011-02-03 0:19 ` Marek Lindner
@ 2011-02-03 9:55 ` Linus Lüssing
2011-02-03 10:01 ` Sven Eckelmann
1 sibling, 1 reply; 8+ messages in thread
From: Linus Lüssing @ 2011-02-03 9:55 UTC (permalink / raw)
To: The list for a Better Approach To Mobile Ad-hoc Networking

Hi Sven,

On Wed, Feb 02, 2011 at 10:42:46PM +0100, Sven Eckelmann wrote:
> > gw_deselect():
> > * is the refcount at this time always 1 for gw_node, can the null
> >   pointer check + a rcu_dereference be omitted? (at least that's what
> >   it looks like when comparing to the rcuref.txt example)
>
> Why can't it be NULL? And _always_ use rcu_dereference. What example tells you
> that it isn't needed? None of the examples has any kind of rcu pointer in it
> (just el as pointer which is stored in a struct where the pointer inside the
> struct is rcu protected).

Ok, you got a point there with the always-rcu-dereference pointers.
I somehow was thinking that in between the spin-lock/unlock there
could possibly be no other thread reading/writing to it then - but
I guess at that moment I forgot about the reordering and the whole
point of using the rcu macros between the spinlock there :). So,
yes, you're right with that one, will change it.

For the NULL pointer, guess you're right again. I was looking at
the delete() example in rcuref.txt which was not doing any NULL
pointer check. But either that's the case there because it's more
pseudo-code there or because it's more related to lists, meaning
that after the delete_element there it's not in the list anymore
and not possible for any other thread to have the idea to free the
same thing again.

> > gw_get_selected():
> > * Probably the orig_node's refcounting has to be made atomic, too?
>
> This part is still a little bit ugly and I cannot give you an easy answer.
> Just think about the following:
> * Hash list is a bunch of rcu protected lists
> * pointer to originator is stored inside a bucket (list elements inside the
>   hash)
> * hash bucket wants to get removed - call_rcu; reference count of the
>   originator is decremented immediately
> * (!!!! lots of reordering of read and write commands inside the cpu!!!! -
>   aren't we happy about the added complexity which tries to hide the memory
>   latency?)
> * the originator was removed, the bucket which is removed in the call_rcu
>   still points to the removed originator
> * a parallel running operation tries to find an originator, the rcu list
>   iterator gets the to-be-deleted bucket to the originator
> * the pointer to the already removed originator inside the bucket is
>   dereferenced, data is read/written -> Kernel Oops
>
> Does this sound scary? At least it could be used in some horror movies (and I
> would watch them).
>
> But that is the other problem I currently have with the state of batman-adv in
> trunk - and I think I forgot to tell you about it after the release of
> v2011.0.0.
>
> So, a good idea would be the removal of the buckets for the hash. Usage of
> "struct hlist_node" inside the hash elements should be a good starting point.
> But think about the problem that the different hashes could have the same
> element. So you need for each distinct hash an extra "struct hlist_node"
> inside the element which should be part of the hash. The hash_add (and
> related) functions don't get the actual pointer to the element, but the
> pointer to the correct "struct hlist_node" inside the element/struct. The
> comparison and hashing function would also receive "struct hlist_node" as
> parameter and must get the pointer to the element using the container_of
> macro.
>
> > @@ -171,7 +172,7 @@ struct bat_priv {
> >  	struct delayed_work hna_work;
> >  	struct delayed_work orig_work;
> >  	struct delayed_work vis_work;
> > -	struct gw_node *curr_gw;
> > +	struct gw_node *curr_gw; /* rcu protected pointer */
> >  	struct vis_info *my_vis_info;
> >  };
>
> Sorry, but I have to say that: FAIL ;)
>
> I think it should look that way:
> > -	struct gw_node *curr_gw;
> > +	struct gw_node __rcu *curr_gw;

Eh, had been looking at whatisRCU.txt and there gbl_foo in section
3 did not have a "__rcu" (actually I hadn't seen that in any of the
documentations before).

> Best regards,
> Sven

Cheers, Linus

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [B.A.T.M.A.N.] [PATCH] Re: batman-adv: Correct rcu refcounting for gw_node
2011-02-03 9:55 ` Linus Lüssing
@ 2011-02-03 10:01 ` Sven Eckelmann
0 siblings, 0 replies; 8+ messages in thread
From: Sven Eckelmann @ 2011-02-03 10:01 UTC (permalink / raw)
To: b.a.t.m.a.n

[-- Attachment #1: Type: Text/Plain, Size: 567 bytes --]

Linus Lüssing wrote:
> > I think it should look that way:
> > > -	struct gw_node *curr_gw;
> > > +	struct gw_node __rcu *curr_gw;
>
> Eh, had been looking at whatisRCU.txt and there gbl_foo in section
> 3 did not have a "__rcu" (actually I hadn't seen that in any of the
> documentations before).

$ git show v2.6.37:Documentation/RCU/checklist.txt|grep __rcu

That document explains quite well how the sparse check works and why
you must use __rcu to say "hey, this is a pointer which is used in a special
way".

Best regards,
Sven

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply [flat|nested] 8+ messages in thread
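
To make the __rcu point a little more concrete, here is a small sketch of how such an annotation can be wired up. It is modelled from memory on the kernel's compiler.h of that era and should be read as an assumption, not a copy of it: the macros expand to sparse-only attributes when __CHECKER__ is defined and to nothing under a regular compiler. struct gw_node_sketch, struct bat_priv_sketch and curr_gw_deref() are hypothetical names; the accessor only stands in for rcu_dereference() and contains none of its barriers or lockdep checks.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: only sparse defines __CHECKER__. It then treats __rcu
 * pointers as living in a separate address space, so a direct dereference
 * or an unannotated assignment is reported. A normal compiler sees empty
 * macros and generates identical code.
 */
#ifdef __CHECKER__
# define __rcu   __attribute__((noderef, address_space(4)))
# define __force __attribute__((force))
#else
# define __rcu
# define __force
#endif

struct gw_node_sketch {
	int id;
};

struct bat_priv_sketch {
	/* annotated in the type, not just described in a comment */
	struct gw_node_sketch __rcu *curr_gw;
};

/*
 * Hypothetical accessor standing in for rcu_dereference(): the single
 * place where the annotation is deliberately stripped with __force.
 */
static struct gw_node_sketch *curr_gw_deref(struct bat_priv_sketch *priv)
{
	return (struct gw_node_sketch __force *)priv->curr_gw;
}
```

The design point is that a comment such as "rcu protected pointer" cannot be checked by any tool, while the annotation lets sparse flag every access that bypasses the accessor.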