From: Andrew Athan <aathan@memeplex.com>
To: Darren Hart <dvhltc@us.ibm.com>
Cc: "Andrew Athan" <linux_kernel_aathan@memeplex.com>,
	"Américo Wang" <xiyou.wangcong@gmail.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	linux-kernel@vger.kernel.org,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@elte.hu>, "Gong Cheng" <chengg11@yahoo.com>
Subject: Re: Futex hang/lockup problem in 2.6.30+ on AMD64
Date: Thu, 28 Jan 2010 12:46:50 -0500
Message-ID: <4B61CD8A.50601@memeplex.com>
In-Reply-To: <4B574359.9080308@us.ibm.com>

Darren Hart wrote:
> Andrew Athan wrote:
>> Andrew Athan wrote:
>>> Américo Wang wrote:
>>>> On Tue, Jan 12, 2010 at 10:55 PM, Peter Zijlstra 
>>>> <peterz@infradead.org> wrote:
>>>>  
>>>>> On Tue, 2010-01-12 at 22:52 +0800, Américo Wang wrote:
>>>>>
>>>>>  
>>>>>>> $ uname -a
>>>>>>> Linux UK22 2.6.30-2-amd64 #1 SMP Fri Sep 25 22:16:56 UTC 2009 x86_64 GNU/Linux
>>>>>>>         
>>>>> Does a recent kernel work?
>>>>>
>>>>>
>>>>>     
>>>>
>>>> Ah, I just wanted to ask the same question, adding the original
>>>> reporter Gong Cheng into Cc...
>>>>
>>>> Gong, could you reproduce it on the latest kernel? And what is your 
>>>> .config?
>>>>
>>>> Thanks!
>>> Due to the remote location of the hardware, I haven't been able to 
>>> test a more recent (or older) kernel.  Remote hands have put a KVM 
>>> on the box as of an hour ago, so I hope to have some information for 
>>> you in a day or two.
>>>
>>> A.
>>>
>>
>>
>> I wanted to report that although I have had no luck (so far) running 
>> anything more recent than 2.6.30, I was able to revert to 2.6.26.  
>> Unfortunately, the application hang still occurs.  I also saw a 
>> similar hang of the application running on a 32 bit Intel box, also 
>> under 2.6.26.  So far, the hang *always* involves threads stuck in 
>> pthread_cond_broadcast() on the condition variable's internal lock 
>> while other threads are waiting on the outer "public" lock.
>
>
> Are you using real-time scheduling policy or priority inheritance 
> (PTHREAD_PRIO_INHERIT)? It is possible to suffer an unbounded priority 
> inversion on the internal condvar data lock in the current distro 
> implementations of glibc.
>
>
>> These other threads are *not* yet in (nor about to call) 
>> pthread_cond_wait().  I saw a message from Darren Hart (subject "Re: 
>> Problems with futex") in response to someone who apparently was 
>> having futex problems in 2.6.27, so I'm still operating under the 
>> assumption that this is not an application bug.
>
> Those all turned out to be application issues with one exception which 
> had already been fixed upstream.
>
>
>> Over the next couple of days, I will be running a version of the 
>> application in which I replaced the pthread_cond calls with simpler 
>> locks, in the hopes that it won't hang (because I'm hoping the 
>> underlying implementation in pthreads uses a different set of futex 
>> opcodes).
>>
>> Andrew Athan
>>
>
>

I wanted to report that this application hang is certainly related to 
pthread_cond_* calls.  With them in place, it consistently hangs.  
Without them, it consistently does not.  Whether pthread_cond_* is 
misbehaving because of memory corruption or some other application bug 
is, I suppose, still an open question.
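
As a baseline for comparison, here is a minimal sketch of the conventional
condition variable pattern: the predicate is read and written only under the
outer mutex, waiters loop on the predicate, and the broadcast is issued with
the mutex held.  All names (struct gate, gate_wait, gate_open, gate_selftest)
are hypothetical and are not taken from the application discussed in this
thread; it is only a small test case to run on the affected kernels, not a
reproduction of the hang.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical example type; not from the application in this thread. */
struct gate {
    pthread_mutex_t lock;  /* the outer "public" lock */
    pthread_cond_t  cond;
    bool            open;  /* predicate, protected by lock */
};

static int gate_init(struct gate *g)
{
    g->open = false;
    int rc = pthread_mutex_init(&g->lock, NULL);
    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&g->cond, NULL);
    if (rc != 0)
        pthread_mutex_destroy(&g->lock);
    return rc;
}

/* Block until the gate is open; the loop tolerates spurious wakeups. */
static void gate_wait(struct gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (!g->open)
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

/* Set the predicate and wake every waiter while holding the mutex. */
static void gate_open(struct gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->open = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

static void *waiter(void *arg)
{
    gate_wait(arg);
    return NULL;
}

/* Start up to n waiters, open the gate, and return how many were joined. */
static int gate_selftest(int n)
{
    enum { MAX_WAITERS = 16 };
    pthread_t tid[MAX_WAITERS];
    struct gate g;
    int started = 0, joined = 0;

    if (n > MAX_WAITERS)
        n = MAX_WAITERS;
    if (gate_init(&g) != 0)
        return -1;
    for (int i = 0; i < n; i++) {
        if (pthread_create(&tid[started], NULL, waiter, &g) != 0)
            break;
        started++;
    }
    gate_open(&g);
    for (int i = 0; i < started; i++)
        if (pthread_join(tid[i], NULL) == 0)
            joined++;
    /* Destroy only after every waiter has been joined. */
    pthread_cond_destroy(&g.cond);
    pthread_mutex_destroy(&g.lock);
    return joined;
}
```

Calling gate_selftest(4) from a small main() built with -pthread should return
4 if every waiter was woken and joined.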

We have now experienced several lockups where even a kill -9 of the 
application won't get rid of it.  Does this say anything about the 
nature of the hang?
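
One way to narrow this down is to look at the scheduler state of the stuck
task.  A task in uninterruptible sleep (state D) typically does not act on
SIGKILL until it leaves that state, and a zombie (state Z) has already exited
and is only waiting to be reaped.  The sketch below reads the state character
from /proc/<pid>/stat; the helper names proc_state and self_state are
hypothetical, and it assumes /proc is mounted.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the state character (e.g. 'R', 'S', 'D', 'Z') from
 * /proc/<pid>/stat, or 0 if it cannot be read.  The command name in that
 * file is wrapped in parentheses and may itself contain spaces or ')',
 * so the state is located after the last ')'.
 */
static char proc_state(pid_t pid)
{
    char path[64], buf[512];

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    char *p = strrchr(buf, ')');
    if (!p || p[1] != ' ' || p[2] == '\0')
        return 0;
    return p[2];
}

/* Convenience wrapper: state of the calling process. */
static char self_state(void)
{
    return proc_state(getpid());
}
```

For a multi-threaded application the same file format is available per thread
under /proc/<pid>/task/<tid>/stat, which is probably the more useful place to
look when only some threads are stuck.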

By the way, majordomo stopped sending me emails as of 1/17, so I have not 
seen any updates to this thread sent after that date.  I am not sure why 
this happened, as I never asked to be unsubscribed.  I've resubscribed, 
but I am not sure I will receive anything.  Please make sure I am directly 
cc:ed on any responses.

carlinux138:~# uname -a
Linux carlinux138.thinktradellc.com 2.6.26-2-686 #1 SMP Sun Jun 21 
04:57:38 UTC 2009 i686 GNU/Linux

(I have to go look up the best way to give a system config snapshot, 
e.g., all major library versions, etc.)
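
A possible starting point for such a snapshot, assuming the system uses glibc:
the sketch below formats the kernel release, the glibc version, and the
threading library version into a buffer.  The function name snapshot is
hypothetical; gnu_get_libc_version() and the _CS_GNU_LIBPTHREAD_VERSION
confstr() name are glibc-specific and will not exist on other C libraries.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/utsname.h>
#include <gnu/libc-version.h>

/*
 * Write a short kernel/glibc/libpthread summary into out.
 * Returns the number of characters written, or -1 on failure or if the
 * buffer is too small.
 */
static int snapshot(char *out, size_t len)
{
    struct utsname u;
    char pthr[128] = "unknown";

    if (uname(&u) != 0)
        return -1;
    /* On failure confstr() returns 0 and pthr keeps its default. */
    confstr(_CS_GNU_LIBPTHREAD_VERSION, pthr, sizeof pthr);

    int n = snprintf(out, len,
                     "kernel: %s %s %s\nglibc: %s\nlibpthread: %s\n",
                     u.sysname, u.release, u.machine,
                     gnu_get_libc_version(), pthr);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

Printing the resulting buffer alongside the uname -a output above would record
which threading implementation the application was actually running against.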

Thanks,
Andrew Athan


Thread overview: 10+ messages
2010-01-12  9:18 Futex hang/lockup problem in 2.6.30+ on AMD64 Andrew Athan
2010-01-12 14:52 ` Américo Wang
2010-01-12 14:55   ` Peter Zijlstra
2010-01-12 15:00     ` Américo Wang
2010-01-12 16:25       ` Andrew Athan
2010-01-20  4:53         ` Andrew Athan
2010-01-20 17:54           ` Darren Hart
2010-01-28 17:46             ` Andrew Athan [this message]
2010-01-12 17:53       ` Gong Cheng
2010-01-13 16:03         ` Américo Wang
