* [B.A.T.M.A.N.] [PATCH] batman-adv: fix race condition in TT full-table replacement
@ 2012-06-18 16:10 Antonio Quartulli
2012-06-20 12:12 ` [B.A.T.M.A.N.] [PATCHv2] " Antonio Quartulli
0 siblings, 1 reply; 5+ messages in thread
From: Antonio Quartulli @ 2012-06-18 16:10 UTC
To: b.a.t.m.a.n
In the current TT code, when a TT_Response containing a full table is received
from an originator, the node first purges all the clients for that originator in
the global translation-table and then merges the newly received table.
During the purging phase each client deletion is done by means of a call_rcu()
invocation and at the end the global entry counter for that originator is set to
0. However, the invoked RCU callback also decrements the global entry counter
for that originator, and since the RCU callback is likely to be deferred, the
node ends up first setting the counter to 0 and then decrementing it once for
each deleted client.
To solve this problem the counter is not explicitly set to 0 anymore and the
counter decrement is performed right before the invocation of call_rcu().
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
translation-table.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/translation-table.c b/translation-table.c
index 2a6d7d6..b08cb76 100644
--- a/translation-table.c
+++ b/translation-table.c
@@ -145,7 +145,6 @@ static void batadv_tt_orig_list_entry_free_rcu(struct rcu_head *rcu)
struct batadv_tt_orig_list_entry *orig_entry;
orig_entry = container_of(rcu, struct batadv_tt_orig_list_entry, rcu);
- atomic_dec(&orig_entry->orig_node->tt_size);
batadv_orig_node_free_ref(orig_entry->orig_node);
kfree(orig_entry);
}
@@ -153,6 +152,8 @@ static void batadv_tt_orig_list_entry_free_rcu(struct rcu_head *rcu)
static void
batadv_tt_orig_list_entry_free_ref(struct batadv_tt_orig_list_entry *orig_entry)
{
+ /* to avoid race conditions, immediately decrease the tt counter */
+ atomic_dec(&orig_entry->orig_node->tt_size);
call_rcu(&orig_entry->rcu, batadv_tt_orig_list_entry_free_rcu);
}
@@ -1025,7 +1026,6 @@ void batadv_tt_global_del_orig(struct batadv_priv *bat_priv,
}
spin_unlock_bh(list_lock);
}
- atomic_set(&orig_node->tt_size, 0);
orig_node->tt_initialised = false;
}
--
1.7.9.4
* [B.A.T.M.A.N.] [PATCHv2] batman-adv: fix race condition in TT full-table replacement
2012-06-18 16:10 [B.A.T.M.A.N.] [PATCH] batman-adv: fix race condition in TT full-table replacement Antonio Quartulli
@ 2012-06-20 12:12 ` Antonio Quartulli
2012-06-22 18:42 ` [B.A.T.M.A.N.] [PATCHv3] " Antonio Quartulli
2012-06-22 18:50 ` [B.A.T.M.A.N.] [PATCHv2] " Marek Lindner
0 siblings, 2 replies; 5+ messages in thread
From: Antonio Quartulli @ 2012-06-20 12:12 UTC
To: b.a.t.m.a.n
bug introduced with cea194d90b11aff7fc289149e4c7f305fad3535a
In the current TT code, when a TT_Response containing a full table is received
from an originator, the node first purges all the clients for that originator in
the global translation-table and then merges the newly received table.
During the purging phase each client deletion is done by means of a call_rcu()
invocation and at the end of this phase the global entry counter for that
originator is set to 0. However, the invoked RCU callback also decrements the
global entry counter for that originator, and since the RCU callback is likely
to be deferred, the node ends up first setting the counter to 0 and then
decrementing it once for each deleted client.
This bug leads to a wrong global entry counter for the related node, say X.
When the node with the broken counter then answers a TT_REQUEST on behalf of
node X, it creates a faulty TT_RESPONSE that leaves the node that asked for the
full table recovery in an unrecoverable state.
The situation is unrecoverable because the node with the broken counter keeps
answering on behalf of X, since its knowledge about X's state (ttvn + tt_crc)
is correct.
To solve this problem the counter is not explicitly set to 0 anymore and the
counter decrement is performed right before the invocation of call_rcu().
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
v2:
- patch rebased on top of maint
- commit message extended
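As an illustration of the race described above, here is a minimal userspace
sketch in C. It is not batman-adv code: struct orig, defer(), run_deferred()
and the two purge_and_merge_*() helpers are hypothetical names, and the
deferred queue is only a toy stand-in for call_rcu(), assuming all callbacks
run after the new table has been merged.

```c
#define MAX_DEFERRED 16

/* Hypothetical stand-in for an originator: only the counter matters here. */
struct orig {
	int tt_size;
};

/* Toy replacement for call_rcu(): callbacks are queued and run later. */
typedef void (*deferred_fn)(struct orig *);

static deferred_fn queue_fn[MAX_DEFERRED];
static struct orig *queue_arg[MAX_DEFERRED];
static int queued;

static void defer(deferred_fn fn, struct orig *o)
{
	if (queued >= MAX_DEFERRED)
		return; /* toy queue is full, drop the callback */
	queue_fn[queued] = fn;
	queue_arg[queued] = o;
	queued++;
}

/* Simulates the end of the grace period: run every queued callback. */
static void run_deferred(void)
{
	for (int i = 0; i < queued; i++)
		queue_fn[i](queue_arg[i]);
	queued = 0;
}

/* Old behaviour: the deferred callback decrements the counter. */
static void free_cb_old(struct orig *o)
{
	o->tt_size--;
}

/* Patched behaviour: the callback only releases memory (a no-op here). */
static void free_cb_new(struct orig *o)
{
	(void)o;
}

/* Purge, explicit reset to 0, merge, then the late decrements arrive. */
int purge_and_merge_old(int old_clients, int new_clients)
{
	struct orig o = { .tt_size = old_clients };

	for (int i = 0; i < old_clients; i++)
		defer(free_cb_old, &o);
	o.tt_size = 0;            /* explicit reset after the purge */
	o.tt_size += new_clients; /* merge of the new full table */
	run_deferred();           /* deferred decrements corrupt the count */
	return o.tt_size;
}

/* Decrement right before deferring the free, and no explicit reset. */
int purge_and_merge_new(int old_clients, int new_clients)
{
	struct orig o = { .tt_size = old_clients };

	for (int i = 0; i < old_clients; i++) {
		o.tt_size--;      /* immediate decrement */
		defer(free_cb_new, &o);
	}
	o.tt_size += new_clients; /* merge of the new full table */
	run_deferred();           /* callbacks no longer touch the counter */
	return o.tt_size;
}
```

Under these assumptions, purge_and_merge_old(3, 5) returns 2 (the reset is
followed by three late decrements), while purge_and_merge_new(3, 5) returns 5,
matching the number of clients in the new table.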
translation-table.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/translation-table.c b/translation-table.c
index a66c2dc..21e493d 100644
--- a/translation-table.c
+++ b/translation-table.c
@@ -141,13 +141,14 @@ static void tt_orig_list_entry_free_rcu(struct rcu_head *rcu)
struct tt_orig_list_entry *orig_entry;
orig_entry = container_of(rcu, struct tt_orig_list_entry, rcu);
- atomic_dec(&orig_entry->orig_node->tt_size);
orig_node_free_ref(orig_entry->orig_node);
kfree(orig_entry);
}
static void tt_orig_list_entry_free_ref(struct tt_orig_list_entry *orig_entry)
{
+ /* to avoid race conditions, immediately decrease the tt counter */
+ atomic_dec(&orig_entry->orig_node->tt_size);
call_rcu(&orig_entry->rcu, tt_orig_list_entry_free_rcu);
}
@@ -910,7 +911,6 @@ void tt_global_del_orig(struct bat_priv *bat_priv,
}
spin_unlock_bh(list_lock);
}
- atomic_set(&orig_node->tt_size, 0);
orig_node->tt_initialised = false;
}
--
1.7.9.4
* [B.A.T.M.A.N.] [PATCHv3] batman-adv: fix race condition in TT full-table replacement
2012-06-20 12:12 ` [B.A.T.M.A.N.] [PATCHv2] " Antonio Quartulli
@ 2012-06-22 18:42 ` Antonio Quartulli
2012-06-22 18:46 ` Antonio Quartulli
2012-06-22 18:50 ` [B.A.T.M.A.N.] [PATCHv2] " Marek Lindner
1 sibling, 1 reply; 5+ messages in thread
From: Antonio Quartulli @ 2012-06-22 18:42 UTC
To: b.a.t.m.a.n
bug introduced with cea194d90b11aff7fc289149e4c7f305fad3535a
In the current TT code, when a TT_Response containing a full table is received
from an originator, the node first purges all the clients for that originator in
the global translation-table and then merges the newly received table.
During the purging phase each client deletion is done by means of a call_rcu()
invocation and at the end of this phase the global entry counter for that
originator is set to 0. However, the invoked RCU callback also decrements the
global entry counter for that originator, and since the RCU callback is likely
to be deferred, the node ends up first setting the counter to 0 and then
decrementing it once for each deleted client.
This bug leads to a wrong global entry counter for the related node, say X.
When the node with the broken counter then answers a TT_REQUEST on behalf of
node X, it creates a faulty TT_RESPONSE that leaves the node that asked for the
full table recovery in an unrecoverable state.
The situation is unrecoverable because the node with the broken counter keeps
answering on behalf of X, since its knowledge about X's state (ttvn + tt_crc)
is correct.
To solve this problem the counter is not explicitly set to 0.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
translation-table.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/translation-table.c b/translation-table.c
index a66c2dc..19d23b4 100644
--- a/translation-table.c
+++ b/translation-table.c
@@ -910,7 +910,6 @@ void tt_global_del_orig(struct bat_priv *bat_priv,
}
spin_unlock_bh(list_lock);
}
- atomic_set(&orig_node->tt_size, 0);
orig_node->tt_initialised = false;
}
--
1.7.9.4
* Re: [B.A.T.M.A.N.] [PATCHv3] batman-adv: fix race condition in TT full-table replacement
2012-06-22 18:42 ` [B.A.T.M.A.N.] [PATCHv3] " Antonio Quartulli
@ 2012-06-22 18:46 ` Antonio Quartulli
0 siblings, 0 replies; 5+ messages in thread
From: Antonio Quartulli @ 2012-06-22 18:46 UTC
To: b.a.t.m.a.n
On Fri, Jun 22, 2012 at 08:42:23 +0200, Antonio Quartulli wrote:
>
> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
>
Please drop this patch. PATCH v2 was correct.
Thanks
--
Antonio Quartulli
..each of us alone is worth nothing..
Ernesto "Che" Guevara
* Re: [B.A.T.M.A.N.] [PATCHv2] batman-adv: fix race condition in TT full-table replacement
2012-06-20 12:12 ` [B.A.T.M.A.N.] [PATCHv2] " Antonio Quartulli
2012-06-22 18:42 ` [B.A.T.M.A.N.] [PATCHv3] " Antonio Quartulli
@ 2012-06-22 18:50 ` Marek Lindner
1 sibling, 0 replies; 5+ messages in thread
From: Marek Lindner @ 2012-06-22 18:50 UTC
To: The list for a Better Approach To Mobile Ad-hoc Networking
On Wednesday, June 20, 2012 14:12:56 Antonio Quartulli wrote:
> bug introduced with cea194d90b11aff7fc289149e4c7f305fad3535a
>
> In the current TT code, when a TT_Response containing a full table is
> received from an originator, the node first purges all the clients for
> that originator in the global translation-table and then merges the new
> received table. During the purging phase each client deletion is done by
> means of a call_rcu() invocation and at the end of this phase the global
> entry counter for that originator is set to 0. However the invoked rcu
> function decreases by one the global entry counter for that originator too
> and since the rcu invocation is likely to be postponed, the node will end
> up in first setting the counter to 0 and then decreasing it one by one for
> each deleted client.
>
> This bug leads to having a wrong global entry counter for the related node,
> say X. Then when the node with the broken counter will answer to a
> TT_REQUEST on behalf of node X, it will create faulty TT_RESPONSE that
> will generate an unrecoverable situation on the node that asked for the
> full table recover.
>
> The non-recoverability is given by the fact that the node with the broken
> counter will keep answering on behalf of X because its knowledge about X's
> state (ttvn + tt_crc) is correct.
>
> To solve this problem the counter is not explicitly set to 0 anymore and
> the counter decrement is performed right before the invocation of
> call_rcu().
>
> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
> ---
> v2:
> - patch rebased on top of maint
> - commit message extended
>
> translation-table.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Applied in revision d1f13e2.
Thanks,
Marek