public inbox for linux-sh@vger.kernel.org
 help / color / mirror / Atom feed
* [bug report] dead lock (?) occur on ap325 board
@ 2009-01-23  9:04 morimoto.kuninori
  2009-01-26  2:20 ` Paul Mundt
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: morimoto.kuninori @ 2009-01-23  9:04 UTC (permalink / raw)
  To: linux-sh


Hi Paul.

I found dead lock (? freeze?) bug again on
------------------------------------------------
commit 77ba93a7ac5fb0d9338bffbf97c787b8efe00806
Author: Paul Mundt <lethal@linux-sh.org>
Date:   Mon Dec 8 11:25:50 2008 +0900

    sh: Fix up the SH-4A mutex fastpath semantics.
------------------------------------------------

I get dead lock even if following patch is added
(this bug will be fixed on 06be3724548a443a99d703ff79f43d6f1e2975f0)
----------------
Author: Paul Mundt <lethal@linux-sh.org>
Date:   Mon Dec 8 17:01:40 2008 +0900

    sh: Fix an off-by-1 check in __mutex_fastpath_unlock().
----------------

c6f17cb2272121475c87592560534b157b17544e is OK for me.
(but YOSHII-san's patch (3b041227f7ef7c7e97f205c68c6069c0c62e5204)
 is still needed.)

*** My envilonment ****

board:     ap325
gcc:       gcc 4.1.2 (Gentoo 4.1.2 p1.0.2)
defconfig: ap325_defconfig

*** Reproduction method ***

> git checkout 77ba93a7ac5fb0d9338bffbf97c787b8efe00806
> patch -p1 < 3b041227f7ef7c7e97f205c68c6069c0c62e5204.patch
> patch -p1 < 06be3724548a443a99d703ff79f43d6f1e2975f0.patch
> make

copy following string by consol application.
-------------
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
HOGE="Morimoto is stupid"
-------------

paste string many times on ap325 board.
------------
> HOGE="Morimoto is stupid"  <= paste string
> HOGE="Morimoto is stupid"
> HOGE="Morimoto is stupid"
...
> HOGE="Morimoto is stupid"  <= paste string
> HOGE="Morimoto is stupid"
> HOGE="Morimoto is stupid"
...
> HOGE="Morimoto is stupid"  <= paste string
> HOGE="Morimoto is stupid"
> HOG                        <= freeze
------------

/Morimoto

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
@ 2009-01-26  2:20 ` Paul Mundt
  2009-01-26  5:01 ` morimoto.kuninori
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Paul Mundt @ 2009-01-26  2:20 UTC (permalink / raw)
  To: linux-sh

On Fri, Jan 23, 2009 at 06:04:20PM +0900, morimoto.kuninori@renesas.com wrote:
> I found dead lock (? freeze?) bug again on
> ------------------------------------------------
> commit 77ba93a7ac5fb0d9338bffbf97c787b8efe00806
> Author: Paul Mundt <lethal@linux-sh.org>
> Date:   Mon Dec 8 11:25:50 2008 +0900
> 
>     sh: Fix up the SH-4A mutex fastpath semantics.
> ------------------------------------------------
> 
> I get dead lock even if following patch is added
> (this bug will be fixed on 06be3724548a443a99d703ff79f43d6f1e2975f0)
> ----------------
> Author: Paul Mundt <lethal@linux-sh.org>
> Date:   Mon Dec 8 17:01:40 2008 +0900
> 
>     sh: Fix an off-by-1 check in __mutex_fastpath_unlock().
> ----------------
> 
> c6f17cb2272121475c87592560534b157b17544e is OK for me.
> (but YOSHII-san's patch (3b041227f7ef7c7e97f205c68c6069c0c62e5204)
>  is still needed.)
> 
I haven't been able to reproduce this yet, but it would be helpful to
know precisely where your lock up occurs. If you aren't using JTAG, you
can try sending an NMI, or send a register and/or stacktrace request via
magic sysrq.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
  2009-01-26  2:20 ` Paul Mundt
@ 2009-01-26  5:01 ` morimoto.kuninori
  2009-01-27  8:26 ` morimoto.kuninori
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: morimoto.kuninori @ 2009-01-26  5:01 UTC (permalink / raw)
  To: linux-sh


Dear Paul

> know precisely where your lock up occurs. If you aren't using JTAG, you
> can try sending an NMI, or send a register and/or stacktrace request via
> magic sysrq.

# In the sample, it seems to happen easily.
# But this problem happens only unusually.
# Therefore, we should try a lot of past string to occur this lock up.

hmm...
I can use JTAG now, but kernel is always running default_idle
when freeze occur.

88003074 <default_idle>:
...
8800309a:	1b 00       	sleep
8800309c:	f2 01       	stc	r7_bank,r1
8800309e:	12 50       	mov.l	@(8,r1),r0
880030a0:	04 c9       	and	#4,r0
880030a2:	08 20       	tst	r0,r0
880030a4:	f9 89       	bt	8800309a

I don't know how to debug it.

Best regards
--
Kuninori Morimoto


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
  2009-01-26  2:20 ` Paul Mundt
  2009-01-26  5:01 ` morimoto.kuninori
@ 2009-01-27  8:26 ` morimoto.kuninori
  2009-01-27  8:32 ` Paul Mundt
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: morimoto.kuninori @ 2009-01-27  8:26 UTC (permalink / raw)
  To: linux-sh


Dear Paul

> I haven't been able to reproduce this yet, but it would be helpful to
> know precisely where your lock up occurs. If you aren't using JTAG, you
> can try sending an NMI, or send a register and/or stacktrace request via
> magic sysrq.

I think I found the reason.

in "__mutex_fastpath_unlock" (arch/sh/include/asm/mutex-llsc.h),
currect operation is following.

-------------------
	__res |= !__ex_flag;
	if (unlikely(__res <= 0))
-------------------

But, This operation is wrong for me.
I think following is correct ?

-------------------
     	if (unlikely(__res <= 0 || __ex_flag = 0))
-------------------

above patch works well and dead lock never occur
on my local environment.
If my opinion is correct, I will send patch.

Best regards
--
Kuninori Morimoto
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
                   ` (2 preceding siblings ...)
  2009-01-27  8:26 ` morimoto.kuninori
@ 2009-01-27  8:32 ` Paul Mundt
  2009-01-27  8:46 ` morimoto.kuninori
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Paul Mundt @ 2009-01-27  8:32 UTC (permalink / raw)
  To: linux-sh

On Tue, Jan 27, 2009 at 05:26:46PM +0900, morimoto.kuninori@renesas.com wrote:
> 
> Dear Paul
> 
> > I haven't been able to reproduce this yet, but it would be helpful to
> > know precisely where your lock up occurs. If you aren't using JTAG, you
> > can try sending an NMI, or send a register and/or stacktrace request via
> > magic sysrq.
> 
> I think I found the reason.
> 
> in "__mutex_fastpath_unlock" (arch/sh/include/asm/mutex-llsc.h),
> currect operation is following.
> 
> -------------------
> 	__res |= !__ex_flag;
> 	if (unlikely(__res <= 0))
> -------------------
> 
> But, This operation is wrong for me.
> I think following is correct ?
> 
> -------------------
>      	if (unlikely(__res <= 0 || __ex_flag = 0))
> -------------------
> 
> above patch works well and dead lock never occur
> on my local environment.
> If my opinion is correct, I will send patch.
> 
Hmm, the !__ex_flag should have handled that case properly, but I wonder
if it is negated in the unlock case. Does switching to __res |= __ex_flag
and leaving the existing test as-is fix it?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
                   ` (3 preceding siblings ...)
  2009-01-27  8:32 ` Paul Mundt
@ 2009-01-27  8:46 ` morimoto.kuninori
  2009-01-27  8:49 ` Paul Mundt
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: morimoto.kuninori @ 2009-01-27  8:46 UTC (permalink / raw)
  To: linux-sh


Dear Paul

> Hmm, the !__ex_flag should have handled that case properly, but I wonder
> if it is negated in the unlock case. Does switching to __res |= __ex_flag
> and leaving the existing test as-is fix it?

Sorry difficult English.
Do you say like this ?

--------------
 	__res |= __ex_flag;
 	if (unlikely(__res <= 0))
--------------

Dead lock occur in this case. kernel can not boot.

Best regards
--
Kuninori Morimoto
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
                   ` (4 preceding siblings ...)
  2009-01-27  8:46 ` morimoto.kuninori
@ 2009-01-27  8:49 ` Paul Mundt
  2009-01-28  1:25 ` morimoto.kuninori
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Paul Mundt @ 2009-01-27  8:49 UTC (permalink / raw)
  To: linux-sh

On Tue, Jan 27, 2009 at 05:46:57PM +0900, morimoto.kuninori@renesas.com wrote:
> > Hmm, the !__ex_flag should have handled that case properly, but I wonder
> > if it is negated in the unlock case. Does switching to __res |= __ex_flag
> > and leaving the existing test as-is fix it?
> 
> Sorry difficult English.
> Do you say like this ?
> 
> --------------
>  	__res |= __ex_flag;
>  	if (unlikely(__res <= 0))
> --------------
> 
> Dead lock occur in this case. kernel can not boot.
> 
Ok, I'll debug it a bit more then, thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
                   ` (5 preceding siblings ...)
  2009-01-27  8:49 ` Paul Mundt
@ 2009-01-28  1:25 ` morimoto.kuninori
  2009-01-28  9:29 ` Takashi Yoshii
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: morimoto.kuninori @ 2009-01-28  1:25 UTC (permalink / raw)
  To: linux-sh


Dear Paul

> Hmm, the !__ex_flag should have handled that case properly, but I wonder
> if it is negated in the unlock case. Does switching to __res |= __ex_flag
> and leaving the existing test as-is fix it?

On my stupid opinion, dead lock will occur 

    if ( __res = 1 &&
         __ex_flag = 0) {

         /* this case, dead lock will occur
          *
          * it mean interrupt occur
          */
    }

      /*
       * __res will be 1 even if __ex_flag = 0
       *
       * __res = 1
       * __ex_flag = 0
       * count->counter = 0  <= not changed
       *
       */
	__res |= !__ex_flag;
 	if (unlikely(__res <= 0))
                 fail_fn(count)    <= this function should be
                                      called if __ex_flag = 0
                                      or __res != count->counter

how about this ?
--------------------------------------------------
-	__res |= !__ex_flag;
-	if (unlikely(__res <= 0))
+	if (unlikely(__res <= 0 || !__ex_flag))
 		fail_fn(count);
--------------------------------------------------

Best regards
--
Kuninori Morimoto
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
                   ` (6 preceding siblings ...)
  2009-01-28  1:25 ` morimoto.kuninori
@ 2009-01-28  9:29 ` Takashi Yoshii
  2009-01-28 10:03 ` Paul Mundt
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Takashi Yoshii @ 2009-01-28  9:29 UTC (permalink / raw)
  To: linux-sh

Kunihiro's fix looks good for me.
I have accidentally(?) fixed this one just same way as him.

I don't know why the original code does 
> 	__res |= !__ex_flag;
to combine the results (counter and exception flag).
Logical-OR to scaler value is logically:) wrong.
# And using movt explicitly results stupid code generated anyway.

Because this __ex_flag is from "t" of movco, which means
 0: interrupted by int or exp.
 1: done successfully.
, it works as "Low-active" signal, and is not easy to handle
as a C-language conditon.

So, I think 
        int __done, __res;
        ....
        if (unlikely(!done || __res != 0))

is easier to read.
This fixes the deadlock issue Kunihiro reported (by coincident:).

# I'm not sure if this is suitable to be marked as "earlyclobber",
# but I leave it as-is, since nothing bad will occur.
/yoshii

Fix conditon checking expression of __mutex_fastpath_*

Signed-off-by: Takashi YOSHII <yoshii.takashi@renesas.com>
---
 arch/sh/include/asm/mutex-llsc.h |   21 +++++++++------------
 1 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index ee839ee..66f045f 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -21,38 +21,36 @@
 static inline void
 __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
 {
-	int __ex_flag, __res;
+	int __done, __res;
 
 	__asm__ __volatile__ (
 		"movli.l	@%2, %0	\n"
 		"add		#-1, %0	\n"
 		"movco.l	%0, @%2	\n"
 		"movt		%1	\n"
-		: "=&z" (__res), "=&r" (__ex_flag)
+		: "=&z" (__res), "=&r" (__done)
 		: "r" (&(count)->counter)
 		: "t");
 
-	__res |= !__ex_flag;
-	if (unlikely(__res != 0))
+	if (unlikely(!__done || __res != 0))
 		fail_fn(count);
 }
 
 static inline int
 __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
 {
-	int __ex_flag, __res;
+	int __done, __res;
 
 	__asm__ __volatile__ (
 		"movli.l	@%2, %0	\n"
 		"add		#-1, %0	\n"
 		"movco.l	%0, @%2	\n"
 		"movt		%1	\n"
-		: "=&z" (__res), "=&r" (__ex_flag)
+		: "=&z" (__res), "=&r" (__done)
 		: "r" (&(count)->counter)
 		: "t");
 
-	__res |= !__ex_flag;
-	if (unlikely(__res != 0))
+	if (unlikely(!__done || __res != 0))
 		__res = fail_fn(count);
 
 	return __res;
@@ -61,19 +59,18 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
 static inline void
 __mutex_fastpath_unlock(atomic_t *count, void (*fail_fn)(atomic_t *))
 {
-	int __ex_flag, __res;
+	int __done, __res;
 
 	__asm__ __volatile__ (
 		"movli.l	@%2, %0	\n\t"
 		"add		#1, %0	\n\t"
 		"movco.l	%0, @%2 \n\t"
 		"movt		%1	\n\t"
-		: "=&z" (__res), "=&r" (__ex_flag)
+		: "=&z" (__res), "=&r" (__done)
 		: "r" (&(count)->counter)
 		: "t");
 
-	__res |= !__ex_flag;
-	if (unlikely(__res <= 0))
+	if (unlikely(!__done || __res <= 0))
 		fail_fn(count);
 }
 
-- 1.6.0.6 

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
                   ` (7 preceding siblings ...)
  2009-01-28  9:29 ` Takashi Yoshii
@ 2009-01-28 10:03 ` Paul Mundt
  2009-01-29  0:34 ` yoshii.takashi
  2009-01-29 23:54 ` morimoto.kuninori
  10 siblings, 0 replies; 12+ messages in thread
From: Paul Mundt @ 2009-01-28 10:03 UTC (permalink / raw)
  To: linux-sh

On Wed, Jan 28, 2009 at 06:29:13PM +0900, Takashi Yoshii wrote:
> Fix conditon checking expression of __mutex_fastpath_*
> 
> Signed-off-by: Takashi YOSHII <yoshii.takashi@renesas.com>

Yes, that looks fine. I'll queue this up, thanks.

If Morimoto-san wants to add his Signed-off-by or Acked-by, I'll add that
also.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
                   ` (8 preceding siblings ...)
  2009-01-28 10:03 ` Paul Mundt
@ 2009-01-29  0:34 ` yoshii.takashi
  2009-01-29 23:54 ` morimoto.kuninori
  10 siblings, 0 replies; 12+ messages in thread
From: yoshii.takashi @ 2009-01-29  0:34 UTC (permalink / raw)
  To: linux-sh

Morimoto-san, I'm sorry for making mistake for your name.

> Kunihiro's fix looks good for me.
Who I meant was you.
The core part of the patch is just the same as you have posted.
Could your please give your signe-off (or ack) as a reply to my patch?
/yoshii

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bug report] dead lock (?) occur on ap325 board
  2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
                   ` (9 preceding siblings ...)
  2009-01-29  0:34 ` yoshii.takashi
@ 2009-01-29 23:54 ` morimoto.kuninori
  10 siblings, 0 replies; 12+ messages in thread
From: morimoto.kuninori @ 2009-01-29 23:54 UTC (permalink / raw)
  To: linux-sh


Dear Paul and yoshii-san.

Thank you yoshii-san for your response.

> > Signed-off-by: Takashi YOSHII <yoshii.takashi@renesas.com>
> 
> Yes, that looks fine. I'll queue this up, thanks.
> 
> If Morimoto-san wants to add his Signed-off-by or Acked-by, I'll add that
> also.

Please add.

>> Kunihiro's fix looks good for me.
> Who I meant was you.

Yes. My name is Kuninori
v(^o^)v         ~~~~~~~~

Best regards
--
Kuninori Morimoto
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2009-01-29 23:54 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-23  9:04 [bug report] dead lock (?) occur on ap325 board morimoto.kuninori
2009-01-26  2:20 ` Paul Mundt
2009-01-26  5:01 ` morimoto.kuninori
2009-01-27  8:26 ` morimoto.kuninori
2009-01-27  8:32 ` Paul Mundt
2009-01-27  8:46 ` morimoto.kuninori
2009-01-27  8:49 ` Paul Mundt
2009-01-28  1:25 ` morimoto.kuninori
2009-01-28  9:29 ` Takashi Yoshii
2009-01-28 10:03 ` Paul Mundt
2009-01-29  0:34 ` yoshii.takashi
2009-01-29 23:54 ` morimoto.kuninori

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox