linux-bluetooth.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Bluetooth: Fix hidp disconnect deadlock
@ 2011-06-25 21:32 Peter Hurley
  2011-06-26  7:16 ` Ilia, Kolominsky
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Hurley @ 2011-06-25 21:32 UTC (permalink / raw)
  To: linux-bluetooth

Release reader lock on r/w sem before stopping khidp thread (which needs to
claim the writer lock on sem before unlinking the session).

NB: kthread_stop waits for thread completion.
---
 net/bluetooth/hidp/core.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
index c405a95..16d75d7 100644
--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -1096,6 +1096,7 @@ int hidp_del_connection(struct hidp_conndel_req *req)
 {
 	struct hidp_session *session;
 	int err = 0;
+	bool stop_thread = false;
 
 	BT_DBG("");
 
@@ -1111,12 +1112,14 @@ int hidp_del_connection(struct hidp_conndel_req *req)
 			skb_queue_purge(&session->ctrl_transmit);
 			skb_queue_purge(&session->intr_transmit);
 
-			kthread_stop(session->task);
+			stop_thread = true;
 		}
 	} else
 		err = -ENOENT;
 
 	up_read(&hidp_session_sem);
+	if (stop_thread)
+		kthread_stop(session->task);
 	return err;
 }
 
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: [PATCH] Bluetooth: Fix hidp disconnect deadlock
  2011-06-25 21:32 [PATCH] Bluetooth: Fix hidp disconnect deadlock Peter Hurley
@ 2011-06-26  7:16 ` Ilia, Kolominsky
  2011-06-29 20:24   ` Gustavo F. Padovan
  0 siblings, 1 reply; 9+ messages in thread
From: Ilia, Kolominsky @ 2011-06-26  7:16 UTC (permalink / raw)
  To: Peter Hurley, linux-bluetooth

Hi!
IMHO the fix isnt good due to possible race condition which 
will destroy session/task objects - either by a call to kthread_stop
from the timer func or reentry to hidp_del_connection() on 
smp platforms.
see the intopic comments


> -----Original Message-----
> From: linux-bluetooth-owner@vger.kernel.org [mailto:linux-bluetooth-
> owner@vger.kernel.org] On Behalf Of Peter Hurley
> Sent: Sunday, June 26, 2011 12:33 AM
> To: linux-bluetooth
> Subject: [PATCH] Bluetooth: Fix hidp disconnect deadlock
> 
> Release reader lock on r/w sem before stopping khidp thread (which
> needs to
> claim the writer lock on sem before unlinking the session).
> 
> NB: kthread_stop waits for thread completion.
> ---
>  net/bluetooth/hidp/core.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
> index c405a95..16d75d7 100644
> --- a/net/bluetooth/hidp/core.c
> +++ b/net/bluetooth/hidp/core.c
> @@ -1096,6 +1096,7 @@ int hidp_del_connection(struct hidp_conndel_req
> *req)
>  {
>  	struct hidp_session *session;
>  	int err = 0;
> +	bool stop_thread = false;
> 
>  	BT_DBG("");
> 
> @@ -1111,12 +1112,14 @@ int hidp_del_connection(struct hidp_conndel_req
> *req)
>  			skb_queue_purge(&session->ctrl_transmit);
>  			skb_queue_purge(&session->intr_transmit);
> 
> -			kthread_stop(session->task);
> +			stop_thread = true;
>  		}
>  	} else
>  		err = -ENOENT;
> 
>  	up_read(&hidp_session_sem);
> +	if (stop_thread)

Timer fires here - session is destroyed.

> +		kthread_stop(session->task);
>  	return err;
>  }
> 
> --
> 1.7.4.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-
> bluetooth" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

I am working on a better solution which will ensure the async deletion
of session kthread operates correctly from different exec. contexts

Regards

Ilia Kolominsky
iliak@ti.com
Direct:  +972(9)7906231
Mobile: +972(54)909009


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix hidp disconnect deadlock
  2011-06-26  7:16 ` Ilia, Kolominsky
@ 2011-06-29 20:24   ` Gustavo F. Padovan
  2011-06-29 20:52     ` Gustavo F. Padovan
  0 siblings, 1 reply; 9+ messages in thread
From: Gustavo F. Padovan @ 2011-06-29 20:24 UTC (permalink / raw)
  To: Ilia, Kolominsky; +Cc: Peter Hurley, linux-bluetooth

* Ilia, Kolominsky <iliak@ti.com> [2011-06-26 09:16:58 +0200]:

> Hi!
> IMHO the fix isnt good due to possible race condition which 
> will destroy session/task objects - either by a call to kthread_stop
> from the timer func or reentry to hidp_del_connection() on 
> smp platforms.
> see the intopic comments
> 
> 
> > -----Original Message-----
> > From: linux-bluetooth-owner@vger.kernel.org [mailto:linux-bluetooth-
> > owner@vger.kernel.org] On Behalf Of Peter Hurley
> > Sent: Sunday, June 26, 2011 12:33 AM
> > To: linux-bluetooth
> > Subject: [PATCH] Bluetooth: Fix hidp disconnect deadlock
> > 
> > Release reader lock on r/w sem before stopping khidp thread (which
> > needs to
> > claim the writer lock on sem before unlinking the session).
> > 
> > NB: kthread_stop waits for thread completion.
> > ---
> >  net/bluetooth/hidp/core.c |    5 ++++-
> >  1 files changed, 4 insertions(+), 1 deletions(-)
> > 
> > diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
> > index c405a95..16d75d7 100644
> > --- a/net/bluetooth/hidp/core.c
> > +++ b/net/bluetooth/hidp/core.c
> > @@ -1096,6 +1096,7 @@ int hidp_del_connection(struct hidp_conndel_req
> > *req)
> >  {
> >  	struct hidp_session *session;
> >  	int err = 0;
> > +	bool stop_thread = false;
> > 
> >  	BT_DBG("");
> > 
> > @@ -1111,12 +1112,14 @@ int hidp_del_connection(struct hidp_conndel_req
> > *req)
> >  			skb_queue_purge(&session->ctrl_transmit);
> >  			skb_queue_purge(&session->intr_transmit);
> > 
> > -			kthread_stop(session->task);
> > +			stop_thread = true;
> >  		}
> >  	} else
> >  		err = -ENOENT;
> > 
> >  	up_read(&hidp_session_sem);
> > +	if (stop_thread)
> 
> Timer fires here - session is destroyed.
> 
> > +		kthread_stop(session->task);
> >  	return err;
> >  }
> > 
> > --
> > 1.7.4.1
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-
> > bluetooth" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> I am working on a better solution which will ensure the async deletion
> of session kthread operates correctly from different exec. contexts

Are you going to deliver this soon? This needs to be fixed on 3.0.

	Gustavo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix hidp disconnect deadlock
  2011-06-29 20:24   ` Gustavo F. Padovan
@ 2011-06-29 20:52     ` Gustavo F. Padovan
  2011-06-30 14:34       ` Peter Hurley
  0 siblings, 1 reply; 9+ messages in thread
From: Gustavo F. Padovan @ 2011-06-29 20:52 UTC (permalink / raw)
  To: Ilia, Kolominsky, Peter Hurley, linux-bluetooth

* Gustavo F. Padovan <padovan@profusion.mobi> [2011-06-29 17:24:56 -0300]:

> * Ilia, Kolominsky <iliak@ti.com> [2011-06-26 09:16:58 +0200]:
> 
> > Hi!
> > IMHO the fix isnt good due to possible race condition which 
> > will destroy session/task objects - either by a call to kthread_stop
> > from the timer func or reentry to hidp_del_connection() on 
> > smp platforms.
> > see the intopic comments
> > 
> > 
> > > -----Original Message-----
> > > From: linux-bluetooth-owner@vger.kernel.org [mailto:linux-bluetooth-
> > > owner@vger.kernel.org] On Behalf Of Peter Hurley
> > > Sent: Sunday, June 26, 2011 12:33 AM
> > > To: linux-bluetooth
> > > Subject: [PATCH] Bluetooth: Fix hidp disconnect deadlock
> > > 
> > > Release reader lock on r/w sem before stopping khidp thread (which
> > > needs to
> > > claim the writer lock on sem before unlinking the session).
> > > 
> > > NB: kthread_stop waits for thread completion.
> > > ---
> > >  net/bluetooth/hidp/core.c |    5 ++++-
> > >  1 files changed, 4 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
> > > index c405a95..16d75d7 100644
> > > --- a/net/bluetooth/hidp/core.c
> > > +++ b/net/bluetooth/hidp/core.c
> > > @@ -1096,6 +1096,7 @@ int hidp_del_connection(struct hidp_conndel_req
> > > *req)
> > >  {
> > >  	struct hidp_session *session;
> > >  	int err = 0;
> > > +	bool stop_thread = false;
> > > 
> > >  	BT_DBG("");
> > > 
> > > @@ -1111,12 +1112,14 @@ int hidp_del_connection(struct hidp_conndel_req
> > > *req)
> > >  			skb_queue_purge(&session->ctrl_transmit);
> > >  			skb_queue_purge(&session->intr_transmit);
> > > 
> > > -			kthread_stop(session->task);
> > > +			stop_thread = true;
> > >  		}
> > >  	} else
> > >  		err = -ENOENT;
> > > 
> > >  	up_read(&hidp_session_sem);
> > > +	if (stop_thread)
> > 
> > Timer fires here - session is destroyed.
> > 
> > > +		kthread_stop(session->task);
> > >  	return err;
> > >  }
> > > 
> > > --
> > > 1.7.4.1
> > > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-
> > > bluetooth" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > I am working on a better solution which will ensure the async deletion
> > of session kthread operates correctly from different exec. contexts
> 
> Are you going to deliver this soon? This needs to be fixed on 3.0.


This should fix the timer issue. Please test.

	Gustavo


Author: Gustavo F. Padovan <padovan@profusion.mobi>
Date:   Wed Jun 29 17:45:56 2011 -0300

    Bluetooth: Delete timer before call kthread_stop()
    
    If we keep the timer running we can have a race condition between the
    timer and hidp_del_connection. By deleting the time before call
    kthread_stop() we won't have any race condition.
    
    Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>

diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
index 16d75d7..a86ba69 100644
--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -476,7 +476,7 @@ static void hidp_set_timer(struct hidp_session *session)
 static inline void hidp_del_timer(struct hidp_session *session)
 {
        if (session->idle_to > 0)
-               del_timer(&session->timer);
+               del_timer_sync(&session->timer);
 }
 
 static void hidp_process_handshake(struct hidp_session *session,
@@ -1113,6 +1113,8 @@ int hidp_del_connection(struct hidp_conndel_req *req)
                        skb_queue_purge(&session->intr_transmit);
 
                        stop_thread = true;
+
+                       hidp_del_timer(session);
                }
        } else
                err = -ENOENT;
~


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix hidp disconnect deadlock
  2011-06-29 20:52     ` Gustavo F. Padovan
@ 2011-06-30 14:34       ` Peter Hurley
  2011-06-30 14:46         ` gene heskett
                           ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Peter Hurley @ 2011-06-30 14:34 UTC (permalink / raw)
  To: Gustavo F. Padovan; +Cc: Ilia, Kolominsky, linux-bluetooth

On Wed, 2011-06-29 at 16:52 -0400, Gustavo F. Padovan wrote:
> * Gustavo F. Padovan <padovan@profusion.mobi> [2011-06-29 17:24:56 -0300]:
> 
> > * Ilia, Kolominsky <iliak@ti.com> [2011-06-26 09:16:58 +0200]:
> > 
> > > Hi!
> > > IMHO the fix isnt good due to possible race condition which 
> > > will destroy session/task objects - either by a call to kthread_stop
> > > from the timer func or reentry to hidp_del_connection() on 
> > > smp platforms.
....
> This should fix the timer issue. Please test.
> 
> 	Gustavo

Hi Ilia & Gustavo,

After Ilia pointed out the problem with the timer function, I went back
and reviewed *all* the synchronization code relevant to the hid session
thread.

A number of problems were introduced with commit aabf6f89 - when the
session thread was converted from a kernel_thread to a kthread. Although
a kthread is a better choice for representing the session thread, the
naive conversion of atomic/wakeup to kthread_stop() was inappropriate.

kthread_stop() has usage semantics that different significantly from
atomic/wakeup. As we already know, because kthread_stop() blocks on
thread completion, it can introduce deadlocks in code that already uses
exclusion mechanisms. Even with Ilia's new patch, consider the following
sequence:

Thread 0                 Thread 1                 Thread 2
in hidp_del_connection                            in hidp session
  claim r/w sem                                     .
                         timer triggers             .
                           kthread_stop() --------->.
                           *blocks on thread 2*     exits loop
                                                    *blocks for r/w sem*
  in hidp_del_timer
    del_timer_sync()
      *blocks on thread 1*

Deadlock occurs because:
+ thread 0 holds reader lock but is waiting for the timer function on
thread 1 to finish
+ thread 1 has stopped kthread and is waiting for thread completion
+ thread 2 (aka kthread) is waiting to claim writer lock held by thread
0

In addition to the deadlocks and races, kthread_stop() is being called
by hidp_process_hid_control() which is *in the session thread context* -
kthread_stop() cannot be called on itself!

I've been testing patch v2 since Monday which continues to use kthread
but reverts back to the old behavior of atomic/wakeup. It also fixes the
potential for a lost wakeup. Of course, "testing" means running it and
looking at it carefully - as someone on IRC pointed out, kthread really
needs to get instrumented with lockdep.

Regards,
Peter Hurley

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix hidp disconnect deadlock
  2011-06-30 14:34       ` Peter Hurley
@ 2011-06-30 14:46         ` gene heskett
  2011-06-30 14:55         ` Ilia, Kolominsky
  2011-06-30 17:40         ` Gustavo F. Padovan
  2 siblings, 0 replies; 9+ messages in thread
From: gene heskett @ 2011-06-30 14:46 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Gustavo F. Padovan, Ilia, Kolominsky, linux-bluetooth

On Thursday, June 30, 2011 10:41:51 AM Peter Hurley did opine:

> On Wed, 2011-06-29 at 16:52 -0400, Gustavo F. Padovan wrote:
> > * Gustavo F. Padovan <padovan@profusion.mobi> [2011-06-29 17:24:56 
-0300]:
> > > * Ilia, Kolominsky <iliak@ti.com> [2011-06-26 09:16:58 +0200]:
> > > > Hi!
> > > > IMHO the fix isnt good due to possible race condition which
> > > > will destroy session/task objects - either by a call to
> > > > kthread_stop from the timer func or reentry to
> > > > hidp_del_connection() on smp platforms.
> 
> ....
> 
> > This should fix the timer issue. Please test.
> > 
> > 	Gustavo
> 
> Hi Ilia & Gustavo,
> 
> After Ilia pointed out the problem with the timer function, I went back
> and reviewed *all* the synchronization code relevant to the hid session
> thread.
> 
> A number of problems were introduced with commit aabf6f89 - when the
> session thread was converted from a kernel_thread to a kthread. Although
> a kthread is a better choice for representing the session thread, the
> naive conversion of atomic/wakeup to kthread_stop() was inappropriate.
> 
> kthread_stop() has usage semantics that different significantly from
> atomic/wakeup. As we already know, because kthread_stop() blocks on
> thread completion, it can introduce deadlocks in code that already uses
> exclusion mechanisms. Even with Ilia's new patch, consider the following
> sequence:
> 
> Thread 0                 Thread 1                 Thread 2
> in hidp_del_connection                            in hidp session
>   claim r/w sem                                     .
>                          timer triggers             .
>                            kthread_stop() --------->.
>                            *blocks on thread 2*     exits loop
>                                                     *blocks for r/w sem*
>   in hidp_del_timer
>     del_timer_sync()
>       *blocks on thread 1*
> 
> Deadlock occurs because:
> + thread 0 holds reader lock but is waiting for the timer function on
> thread 1 to finish
> + thread 1 has stopped kthread and is waiting for thread completion
> + thread 2 (aka kthread) is waiting to claim writer lock held by thread
> 0
> 
> In addition to the deadlocks and races, kthread_stop() is being called
> by hidp_process_hid_control() which is *in the session thread context* -
> kthread_stop() cannot be called on itself!
> 
> I've been testing patch v2 since Monday which continues to use kthread
> but reverts back to the old behavior of atomic/wakeup. It also fixes the
> potential for a lost wakeup. Of course, "testing" means running it and
> looking at it carefully - as someone on IRC pointed out, kthread really
> needs to get instrumented with lockdep.
> 
> Regards,
> Peter Hurley

Peter;  You may have hit on a problem I'm seeing here also, to wit:
I can reboot, start bluetoothd by hand, and using blueman-manager make a 
connection that works well despite the signal being reported as a bit less 
than perfect since the two devices are a couple of walls and 20 feet apart.  
It will work for 12 to 18 hours, then die 'deadlocked' and cannot be 
restored without a reboot.

Seems to me we at least have similar ducks, they both waddle.  ;-)

Cheers, gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
When God endowed human beings with brains, He did not intend to guarantee 
them.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] Bluetooth: Fix hidp disconnect deadlock
  2011-06-30 14:34       ` Peter Hurley
  2011-06-30 14:46         ` gene heskett
@ 2011-06-30 14:55         ` Ilia, Kolominsky
  2011-06-30 17:47           ` Gustavo F. Padovan
  2011-06-30 17:40         ` Gustavo F. Padovan
  2 siblings, 1 reply; 9+ messages in thread
From: Ilia, Kolominsky @ 2011-06-30 14:55 UTC (permalink / raw)
  To: Peter Hurley, Gustavo F. Padovan; +Cc: linux-bluetooth

> -----Original Message-----
> From: Peter Hurley [mailto:peter@hurleysoftware.com]
> Sent: Thursday, June 30, 2011 5:35 PM
> To: Gustavo F. Padovan
> Cc: Ilia, Kolominsky; linux-bluetooth
> Subject: Re: [PATCH] Bluetooth: Fix hidp disconnect deadlock
> 
> On Wed, 2011-06-29 at 16:52 -0400, Gustavo F. Padovan wrote:
> > * Gustavo F. Padovan <padovan@profusion.mobi> [2011-06-29 17:24:56 -
> 0300]:
> >
> > > * Ilia, Kolominsky <iliak@ti.com> [2011-06-26 09:16:58 +0200]:
> > >
> > > > Hi!
> > > > IMHO the fix isnt good due to possible race condition which
> > > > will destroy session/task objects - either by a call to
> kthread_stop
> > > > from the timer func or reentry to hidp_del_connection() on
> > > > smp platforms.
> ....
> > This should fix the timer issue. Please test.
> >
> > 	Gustavo
> 
> Hi Ilia & Gustavo,
> 
> After Ilia pointed out the problem with the timer function, I went back
> and reviewed *all* the synchronization code relevant to the hid session
> thread.
> 
> A number of problems were introduced with commit aabf6f89 - when the
> session thread was converted from a kernel_thread to a kthread.
> Although
> a kthread is a better choice for representing the session thread, the
> naive conversion of atomic/wakeup to kthread_stop() was inappropriate.
> 
> kthread_stop() has usage semantics that different significantly from
> atomic/wakeup. As we already know, because kthread_stop() blocks on
> thread completion, it can introduce deadlocks in code that already uses
> exclusion mechanisms. Even with Ilia's new patch, consider the
> following
> sequence:
> 
> Thread 0                 Thread 1                 Thread 2
> in hidp_del_connection                            in hidp session
>   claim r/w sem                                     .
>                          timer triggers             .
>                            kthread_stop() --------->.
>                            *blocks on thread 2*     exits loop
>                                                     *blocks for r/w

Guys, as far as I know - it is not possible to use kthread_stop from
The timer func - since it may sleep and we are in soft_irq...
What I did is added kthread_stop_async in the kthread use it in both 
Timer func and hidp_del_connection(). I am still polishing the code
though...
Gustavo, is adding the new api to kthread feasible?
How to upstream such change?

> sem*
>   in hidp_del_timer
>     del_timer_sync()
>       *blocks on thread 1*
> 
> Deadlock occurs because:
> + thread 0 holds reader lock but is waiting for the timer function on
> thread 1 to finish
> + thread 1 has stopped kthread and is waiting for thread completion
> + thread 2 (aka kthread) is waiting to claim writer lock held by thread
> 0
> 
> In addition to the deadlocks and races, kthread_stop() is being called
> by hidp_process_hid_control() which is *in the session thread context*
> -
> kthread_stop() cannot be called on itself!
> 
> I've been testing patch v2 since Monday which continues to use kthread
> but reverts back to the old behavior of atomic/wakeup. It also fixes
> the
> potential for a lost wakeup. Of course, "testing" means running it and
> looking at it carefully - as someone on IRC pointed out, kthread really
> needs to get instrumented with lockdep.
> 
> Regards,
> Peter Hurley

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix hidp disconnect deadlock
  2011-06-30 14:34       ` Peter Hurley
  2011-06-30 14:46         ` gene heskett
  2011-06-30 14:55         ` Ilia, Kolominsky
@ 2011-06-30 17:40         ` Gustavo F. Padovan
  2 siblings, 0 replies; 9+ messages in thread
From: Gustavo F. Padovan @ 2011-06-30 17:40 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Ilia, Kolominsky, linux-bluetooth

* Peter Hurley <peter@hurleysoftware.com> [2011-06-30 10:34:35 -0400]:

> On Wed, 2011-06-29 at 16:52 -0400, Gustavo F. Padovan wrote:
> > * Gustavo F. Padovan <padovan@profusion.mobi> [2011-06-29 17:24:56 -0300]:
> > 
> > > * Ilia, Kolominsky <iliak@ti.com> [2011-06-26 09:16:58 +0200]:
> > > 
> > > > Hi!
> > > > IMHO the fix isnt good due to possible race condition which 
> > > > will destroy session/task objects - either by a call to kthread_stop
> > > > from the timer func or reentry to hidp_del_connection() on 
> > > > smp platforms.
> ....
> > This should fix the timer issue. Please test.
> > 
> > 	Gustavo
> 
> Hi Ilia & Gustavo,
> 
> After Ilia pointed out the problem with the timer function, I went back
> and reviewed *all* the synchronization code relevant to the hid session
> thread.
> 
> A number of problems were introduced with commit aabf6f89 - when the
> session thread was converted from a kernel_thread to a kthread. Although
> a kthread is a better choice for representing the session thread, the
> naive conversion of atomic/wakeup to kthread_stop() was inappropriate.
> 
> kthread_stop() has usage semantics that different significantly from
> atomic/wakeup. As we already know, because kthread_stop() blocks on
> thread completion, it can introduce deadlocks in code that already uses
> exclusion mechanisms. Even with Ilia's new patch, consider the following
> sequence:
> 
> Thread 0                 Thread 1                 Thread 2
> in hidp_del_connection                            in hidp session
>   claim r/w sem                                     .
>                          timer triggers             .
>                            kthread_stop() --------->.
>                            *blocks on thread 2*     exits loop
>                                                     *blocks for r/w sem*
>   in hidp_del_timer
>     del_timer_sync()
>       *blocks on thread 1*
> 
> Deadlock occurs because:
> + thread 0 holds reader lock but is waiting for the timer function on
> thread 1 to finish
> + thread 1 has stopped kthread and is waiting for thread completion
> + thread 2 (aka kthread) is waiting to claim writer lock held by thread
> 0
> 
> In addition to the deadlocks and races, kthread_stop() is being called
> by hidp_process_hid_control() which is *in the session thread context* -
> kthread_stop() cannot be called on itself!
> 
> I've been testing patch v2 since Monday which continues to use kthread
> but reverts back to the old behavior of atomic/wakeup. It also fixes the
> potential for a lost wakeup. Of course, "testing" means running it and
> looking at it carefully - as someone on IRC pointed out, kthread really
> needs to get instrumented with lockdep.

Where is this v2 patch? I wanna take a look on it.

	Gustavo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix hidp disconnect deadlock
  2011-06-30 14:55         ` Ilia, Kolominsky
@ 2011-06-30 17:47           ` Gustavo F. Padovan
  0 siblings, 0 replies; 9+ messages in thread
From: Gustavo F. Padovan @ 2011-06-30 17:47 UTC (permalink / raw)
  To: Ilia, Kolominsky; +Cc: Peter Hurley, linux-bluetooth

* Ilia, Kolominsky <iliak@ti.com> [2011-06-30 16:55:52 +0200]:

> > -----Original Message-----
> > From: Peter Hurley [mailto:peter@hurleysoftware.com]
> > Sent: Thursday, June 30, 2011 5:35 PM
> > To: Gustavo F. Padovan
> > Cc: Ilia, Kolominsky; linux-bluetooth
> > Subject: Re: [PATCH] Bluetooth: Fix hidp disconnect deadlock
> > 
> > On Wed, 2011-06-29 at 16:52 -0400, Gustavo F. Padovan wrote:
> > > * Gustavo F. Padovan <padovan@profusion.mobi> [2011-06-29 17:24:56 -
> > 0300]:
> > >
> > > > * Ilia, Kolominsky <iliak@ti.com> [2011-06-26 09:16:58 +0200]:
> > > >
> > > > > Hi!
> > > > > IMHO the fix isnt good due to possible race condition which
> > > > > will destroy session/task objects - either by a call to
> > kthread_stop
> > > > > from the timer func or reentry to hidp_del_connection() on
> > > > > smp platforms.
> > ....
> > > This should fix the timer issue. Please test.
> > >
> > > 	Gustavo
> > 
> > Hi Ilia & Gustavo,
> > 
> > After Ilia pointed out the problem with the timer function, I went back
> > and reviewed *all* the synchronization code relevant to the hid session
> > thread.
> > 
> > A number of problems were introduced with commit aabf6f89 - when the
> > session thread was converted from a kernel_thread to a kthread.
> > Although
> > a kthread is a better choice for representing the session thread, the
> > naive conversion of atomic/wakeup to kthread_stop() was inappropriate.
> > 
> > kthread_stop() has usage semantics that different significantly from
> > atomic/wakeup. As we already know, because kthread_stop() blocks on
> > thread completion, it can introduce deadlocks in code that already uses
> > exclusion mechanisms. Even with Ilia's new patch, consider the
> > following
> > sequence:
> > 
> > Thread 0                 Thread 1                 Thread 2
> > in hidp_del_connection                            in hidp session
> >   claim r/w sem                                     .
> >                          timer triggers             .
> >                            kthread_stop() --------->.
> >                            *blocks on thread 2*     exits loop
> >                                                     *blocks for r/w
> 
> Guys, as far as I know - it is not possible to use kthread_stop from
> The timer func - since it may sleep and we are in soft_irq...
> What I did is added kthread_stop_async in the kthread use it in both 
> Timer func and hidp_del_connection(). I am still polishing the code
> though...

Yes, you right, we really can't call kthread_stop there.

> Gustavo, is adding the new api to kthread feasible?
> How to upstream such change?

I don't know the kthread internals to tell you if this is feasible or not. You
have to send the patch to the maintainer of kthread
(./scripts/get_maintainer.pl tells you the maintainer) and explain why you
need this to him.

But we also need a solution in the short term, this week, in order to fix this
in the up coming 3.0 release.

	Gustavo

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-06-30 17:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-25 21:32 [PATCH] Bluetooth: Fix hidp disconnect deadlock Peter Hurley
2011-06-26  7:16 ` Ilia, Kolominsky
2011-06-29 20:24   ` Gustavo F. Padovan
2011-06-29 20:52     ` Gustavo F. Padovan
2011-06-30 14:34       ` Peter Hurley
2011-06-30 14:46         ` gene heskett
2011-06-30 14:55         ` Ilia, Kolominsky
2011-06-30 17:47           ` Gustavo F. Padovan
2011-06-30 17:40         ` Gustavo F. Padovan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).