public inbox for b.a.t.m.a.n@lists.open-mesh.org
* [B.A.T.M.A.N.] [PATCH] batman-adv: Always synchronize rcu's on module shutdown
From: Linus Lüssing @ 2010-09-05 23:29 UTC
  To: b.a.t.m.a.n

During the module shutdown procedure in batman_exit(), an rcu callback is
being scheduled (batman_exit -> hardif_remove_interfaces ->
hardif_remove_interface -> call_rcu). However, when the kernel unloads
the module, the rcu callback might not have been executed yet, resulting
in an "unable to handle kernel paging request" in __rcu_process_callback
afterwards, causing the kernel to freeze.
Therefore, we should always flush all rcu callback functions scheduled
during the shutdown procedure.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
---
 main.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/main.c b/main.c
index 209a46b..e8acb46 100644
--- a/main.c
+++ b/main.c
@@ -73,6 +73,8 @@ static void __exit batman_exit(void)
 	flush_workqueue(bat_event_workqueue);
 	destroy_workqueue(bat_event_workqueue);
 	bat_event_workqueue = NULL;
+
+	synchronize_net();
 }
 
 int mesh_init(struct net_device *soft_iface)
@@ -135,9 +137,6 @@ void mesh_free(struct net_device *soft_iface)
 	hna_local_free(bat_priv);
 	hna_global_free(bat_priv);
 
-	synchronize_net();
-
-	synchronize_rcu();
 	atomic_set(&bat_priv->mesh_state, MESH_INACTIVE);
 }
 
-- 
1.7.1



* Re: [B.A.T.M.A.N.] [PATCH] batman-adv: Always synchronize rcu's on module shutdown
From: Sven Eckelmann @ 2010-09-06  7:30 UTC
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Mon, Sep 06, 2010 at 01:29:53AM +0200, Linus Lüssing wrote:
> During the module shutdown procedure in batman_exit(), a rcu callback is
> being scheduled (batman_exit -> hardif_remove_interfaces ->
> hardif_remove_interface -> call_rcu). However, when the kernel unloads
> the module, the rcu callback might not have been executed yet, resulting
> in a "unable to handle kernel paging request" in __rcu_process_callback
> afterwards, causing the kernel to freeze.
> Therefore, we should always flush all rcu callback functions scheduled
> during the shutdown procedure.

I am really confused by your patch. I would have expected you to add a
synchronize_rcu in batman_exit and that was it. Instead I see a synchronize_net
added and a synchronize_net/synchronize_rcu pair removed from mesh_free. This
doesn't seem to match at all. Could you please explain further why it is
implemented that way?

thanks,
	Sven


* Re: [B.A.T.M.A.N.] [PATCH] batman-adv: Always synchronize rcu's on module shutdown
From: Linus Lüssing @ 2010-09-06 10:09 UTC
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hi Sven,

synchronize_net already contains a synchronize_rcu at its end, so
the synchronize_rcu in the batman code there has always been
redundant.

I've removed the synchronize_rcu instead of the synchronize_net to
be on the safe side. I guess usually no more packets should arrive
anyway as the batman packet type is not registered anymore. But I
wasn't sure if the might_sleep() of synchronize_net() might be
needed for something, so I didn't dare to remove synchronize_net.

If someone says it'd be ok to remove synchronize_net() instead,
I could make a new patch, no problem.

Cheers, Linus
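
As a rough illustration of the redundancy described above, the following
userspace sketch stubs out the two primitives and counts how often an RCU
synchronization would be requested. It is not kernel code: the toy_ names and
the counter are hypothetical stand-ins, and the body of toy_synchronize_net()
follows only what this mail says about synchronize_net() (a might_sleep()
followed by a synchronize_rcu()).

```c
#include <assert.h>

/* Userspace toy, not kernel code: every toy_ name is a hypothetical stub. */
static int toy_rcu_syncs;	/* how many RCU synchronizations were requested */

/* Stub for might_sleep(); does nothing in this toy. */
static void toy_might_sleep(void)
{
}

/* Stub for synchronize_rcu(): only records that it was called. */
static void toy_synchronize_rcu(void)
{
	toy_rcu_syncs++;
}

/* Shape of synchronize_net() as described in this mail (assumption). */
static void toy_synchronize_net(void)
{
	toy_might_sleep();
	toy_synchronize_rcu();
}

/* Old tail of mesh_free(): synchronize_net() followed by synchronize_rcu(). */
static int toy_old_sequence(void)
{
	toy_rcu_syncs = 0;
	toy_synchronize_net();
	toy_synchronize_rcu();
	return toy_rcu_syncs;
}

/* Variant with a single synchronize_net() and no extra synchronize_rcu(). */
static int toy_new_sequence(void)
{
	toy_rcu_syncs = 0;
	toy_synchronize_net();
	return toy_rcu_syncs;
}
```

In this toy, toy_old_sequence() returns 2 and toy_new_sequence() returns 1,
which is the sense in which the extra synchronize_rcu() call is redundant.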



* Re: [B.A.T.M.A.N.] [PATCH] batman-adv: Always synchronize rcu's on module shutdown
From: Sven Eckelmann @ 2010-09-06 12:07 UTC
  To: b.a.t.m.a.n

Linus Lüssing wrote:
> Hi Sven,
> 
> synchronize_net already contains a synchronize_rcu at its end, so
> the synchronize_rcu in the batman code there has always been
> redundant.
> 
> I've removed the synchronize_rcu instead of the synchronize_net to
> be on the safe side. I guess usually no more packets should arrive
> anyway as the batman packet type is not registered anymore. But I
> wasn't sure if the might_sleep() of synchronize_net() might be
> needed for something, so I didn't dare to remove synchronize_net.
> 
> If someone says it'd be ok to remove synchronize_net() instead,
> I could make a new patch, no problem.

Ok, it would have been nice to state such things in the commit message
(otherwise the stable@kernel.org maintainers will drop such a patch quite
easily). Marek and I have worked out why it only happens in revisions 1765
and 1766, so it will not be a patch for stable.

The might_sleep is only for debugging purposes. But yes, it makes sense to
use synchronize_net here (for example due to the usage of dev_remove_pack
before).

That means that technically the patch seems to be OK, but I didn't like the
explanation, given that we might have had to justify it to the
stable@kernel.org maintainers that way.

So I would ack the patch with a minor change in the commit message. So
instead of

> During the module shutdown procedure in batman_exit(), a rcu callback is
> being scheduled (batman_exit -> hardif_remove_interfaces ->
> hardif_remove_interface -> call_rcu). However, when the kernel unloads
> the module, the rcu callback might not have been executed yet, resulting
> in a "unable to handle kernel paging request" in __rcu_process_callback
> afterwards, causing the kernel to freeze.
> Therefore, we should always flush all rcu callback functions scheduled
> during the shutdown procedure.

something like

> During the module shutdown procedure in batman_exit(), a rcu callback is
> being scheduled (batman_exit -> hardif_remove_interfaces ->
> hardif_remove_interface -> call_rcu). However, when the kernel unloads
> the module, the rcu callback might not have been executed yet, resulting
> in a "unable to handle kernel paging request" in __rcu_process_callback
> afterwards, causing the kernel to freeze.
>
> The synchronize_net and synchronize_rcu in mesh_free are currently
> called before the call_rcu in hardif_remove_interface and have no real
> effect on it.
>
> Therefore, we should always flush all rcu callback functions scheduled
> during the shutdown procedure using synchronize_net. The call to
> synchronize_rcu can be omitted because synchronize_net already calls it.

thanks,
	Sven
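
The ordering problem named in the proposed commit message can be sketched
with a small userspace toy model. It is not kernel code and all toy_ names are
hypothetical: toy_call_deferred() stands in for call_rcu(), and
toy_synchronize() stands in for the synchronization point, which the toy
simply treats as draining the queue of deferred callbacks (the view the
commit message takes; real RCU grace-period semantics are not modelled).

```c
#include <assert.h>
#include <stddef.h>

/* Userspace toy, not kernel code: every toy_ name is hypothetical. */
#define TOY_MAX_PENDING 8

typedef void (*toy_cb_t)(void);

static toy_cb_t toy_pending[TOY_MAX_PENDING];	/* queued, not yet run */
static size_t toy_n_pending;
static int toy_cb_ran;

/* Deferred callback; in the module this would free interface data. */
static void toy_free_cb(void)
{
	toy_cb_ran++;
}

/* Stand-in for call_rcu(): queue a callback to run later. */
static void toy_call_deferred(toy_cb_t cb)
{
	assert(toy_n_pending < TOY_MAX_PENDING);
	toy_pending[toy_n_pending++] = cb;
}

/* Stand-in for the synchronization point: run everything queued so far. */
static void toy_synchronize(void)
{
	size_t i;

	for (i = 0; i < toy_n_pending; i++)
		toy_pending[i]();
	toy_n_pending = 0;
}

/*
 * Model of the shutdown path. Returns the number of callbacks that are
 * still pending at the point where the module would be unloaded.
 */
static size_t toy_exit(int sync_at_end)
{
	toy_n_pending = 0;
	toy_cb_ran = 0;

	if (!sync_at_end)
		toy_synchronize();	/* old spot: mesh_free(), nothing queued yet */

	toy_call_deferred(toy_free_cb);	/* hardif_remove_interface -> call_rcu */

	if (sync_at_end)
		toy_synchronize();	/* patched spot: end of batman_exit() */

	return toy_n_pending;
}
```

In this model toy_exit(0) returns 1, meaning the callback is still queued when
the module would go away, while toy_exit(1) returns 0 because the
synchronization happens after the callback has been queued.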


* Re: [B.A.T.M.A.N.] [PATCH] batman-adv: Always synchronize rcu's on module shutdown
From: Linus Lüssing @ 2010-09-06 12:37 UTC
  To: Sven Eckelmann; +Cc: b.a.t.m.a.n

> So I would ack the patch with a minor change in the commit message. So instead
> of
>
> > During the module shutdown procedure in batman_exit(), a rcu callback is
> > being scheduled (batman_exit -> hardif_remove_interfaces ->
> > hardif_remove_interface -> call_rcu). However, when the kernel unloads
> > the module, the rcu callback might not have been executed yet, resulting
> > in a "unable to handle kernel paging request" in __rcu_process_callback
> > afterwards, causing the kernel to freeze.
> > Therefore, we should always flush all rcu callback functions scheduled
> > during the shutdown procedure.
>
> something like
>
> > During the module shutdown procedure in batman_exit(), a rcu callback is
> > being scheduled (batman_exit -> hardif_remove_interfaces ->
> > hardif_remove_interface -> call_rcu). However, when the kernel unloads
> > the module, the rcu callback might not have been executed yet, resulting
> > in a "unable to handle kernel paging request" in __rcu_process_callback
> > afterwards, causing the kernel to freeze.
> >
> > The synchronize_net and synchronize_rcu in mesh_free are currently
> > called before the call_rcu in hardif_remove_interface and have no real
> > effect on it.
> >
> > Therefore, we should always flush all rcu callback functions scheduled
> > during the shutdown procedure using synchronize_net. The call to
> > synchronize_rcu can be omitted because synchronize_net already calls it.

Yep, sounds good :). Thanks for reviewing and the info about
synchronize_net.

Cheers, Linus

> 
> thanks,
> 	Sven




* [B.A.T.M.A.N.] [PATCHv2] batman-adv: Always synchronize rcu's on module shutdown
From: Sven Eckelmann @ 2010-09-06 12:45 UTC
  To: b.a.t.m.a.n

From: Linus Lüssing <linus.luessing@web.de>

During the module shutdown procedure in batman_exit(), an rcu callback is
being scheduled (batman_exit -> hardif_remove_interfaces ->
hardif_remove_interface -> call_rcu). However, when the kernel unloads
the module, the rcu callback might not have been executed yet, resulting
in an "unable to handle kernel paging request" in __rcu_process_callback
afterwards, causing the kernel to freeze.

The synchronize_net and synchronize_rcu in mesh_free are currently
called before the call_rcu in hardif_remove_interface and have no real
effect on it.

Therefore, we should always flush all rcu callback functions scheduled
during the shutdown procedure using synchronize_net. The call to
synchronize_rcu can be omitted because synchronize_net already calls it.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Acked-by: Sven Eckelmann <sven.eckelmann@gmx.de>
---
 batman-adv/main.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/batman-adv/main.c b/batman-adv/main.c
index 209a46b..e8acb46 100644
--- a/batman-adv/main.c
+++ b/batman-adv/main.c
@@ -73,6 +73,8 @@ static void __exit batman_exit(void)
 	flush_workqueue(bat_event_workqueue);
 	destroy_workqueue(bat_event_workqueue);
 	bat_event_workqueue = NULL;
+
+	synchronize_net();
 }
 
 int mesh_init(struct net_device *soft_iface)
@@ -135,9 +137,6 @@ void mesh_free(struct net_device *soft_iface)
 	hna_local_free(bat_priv);
 	hna_global_free(bat_priv);
 
-	synchronize_net();
-
-	synchronize_rcu();
 	atomic_set(&bat_priv->mesh_state, MESH_INACTIVE);
 }
 
-- 
1.7.1



* Re: [B.A.T.M.A.N.] [PATCHv2] batman-adv: Always synchronize rcu's on module shutdown
From: Marek Lindner @ 2010-09-06 14:09 UTC
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Monday 06 September 2010 14:45:24 Sven Eckelmann wrote:
> Therefore, we should always flush all rcu callback functions scheduled
> during the shutdown procedure using synchronize_net. The call to
> synchronize_rcu can be omitted because synchronize_net already calls it.

Applied in revision 1788.

Thanks,
Marek

