From: Jim Fehlig <jfehlig@novell.com>
To: Patrick Colp <pjcolp@gmail.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Subject: Re: help with xenstored 'hang'
Date: Wed, 30 Jun 2010 17:31:40 -0600
Message-ID: <4C2BD3DC.1030008@novell.com>
In-Reply-To: <AANLkTilPlAln8LIAsKBSo8JmACWqZufPZuYtuURxJWd-@mail.gmail.com>
Patrick Colp wrote:
> I was recently struggling with what sounds like a not-too-dissimilar
> problem while working with a disaggregated version of xenstore. The
> ultimate solution for me was to disable pthreads in xenstore/libxs. I
> just commented out the following line in tools/xenstore/Makefile:
>
> xs.opic: CFLAGS += -DUSE_PTHREAD
>
Xen 3.2 predates c/s 17405, which made the use of pthreads optional.
Prior to that, pthreads was always used.
> After I removed that line and rebuilt and installed xenstore, it
> worked just fine. I would be curious to know if this also solves your
> problem.
>
I can see if the user is receptive to testing backported 17405 with
pthreads disabled.
Thanks for the suggestion.
Jim
>
> Patrick
>
>
> On 30 June 2010 15:15, Jim Fehlig <jfehlig@novell.com> wrote:
>
>> I'm trying to debug an 'xm list' hang on a large (~700 hosts) Xen 3.2
>> production installation. The hang occurs randomly, on a random host.
>> User has provided cores of xend and xenstored processes when hang
>> occurs. After poking at these cores I have discovered the following:
>>
>> In xend process, a thread is blocked on a cond variable, waiting for a
>> response to XS_TRANSACTION_START from xenstored. A reader thread
>> responsible for reading from xenstored is blocked on read(2).
>>
>> In the xenstored process, the lone thread is blocked on select(2),
>> waiting for IO. I examined the connections list and see that it contains
>> a connection for the XS_TRANSACTION_START request. Dumping the
>> connection object:
>>
>> (gdb) p *(struct connection *)0x526c70
>> $48 = {list = {next = 0x517c30, prev = 0x5151f0}, fd = 13, id = 0,
>>   can_write = true, in = 0x523600,
>>   out_list = {next = 0x526c98, prev = 0x526c98}, transaction = 0x0,
>>   transaction_list = {next = 0x523560, prev = 0x523560},
>>   next_transaction_id = 60231445, transaction_started = 1,
>>   domain = 0x0, watches = {next = 0x51daa0, prev = 0x5267b0},
>>   write = 0x402460 <writefd>, read = 0x405180 <readfd>}
>>
>> Notice transaction_started is set to 1, but out_list is empty. AFAICT,
>> that means the reply has been sent to xend. The reader thread in xend
>> should have received the response and signaled the cond variable -
>> allowing execution to progress. Ultimately, xend would send an
>> XS_TRANSACTION_END message, freeing the connection object in xenstored
>> and removing it from connections list.
>>
>> Does my understanding of this code sound correct? Anyone have
>> suggestions or further debugging tips? Examining cores is about my only
>> debug option as user does not want to deploy debug patches, enable
>> tracing, etc. across 700 hosts.
>>
>> Interestingly, when user strace's or attaches to xenstored process with
>> gdb, xenstored "awakes", the hung 'xm list' returns, and xenstored
>> continues normally. A new connection to xenstored (e.g. running xmtop)
>> seems to poke it along as well. Would a timeout on select(2) in main
>> loop of xenstored help at all?
>>
>> Thanks for any insights!
>> Jim
>>
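As an illustration of the select(2) timeout idea raised in the original message, here is a minimal sketch of a wait that returns periodically even when no descriptor becomes ready. It is not taken from xenstored; wait_readable() is a hypothetical helper, and whether a timeout would actually cure the hang (rather than mask it) is an open question in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper, not xenstored code: wait until fd is readable,
 * but give up after timeout_ms so the caller's main loop can re-check
 * its own state.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    /* FD_SET is undefined for descriptors outside [0, FD_SETSIZE). */
    assert(fd >= 0 && fd < FD_SETSIZE);

    do {
        /* select() may modify both the set and the timeout, so both
         * are rebuilt on every attempt. A retry after EINTR therefore
         * restarts the full timeout. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}
```

A main loop built on this would treat a return of 0 as a cue to re-examine its connection list before waiting again, instead of sleeping until the next I/O event.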