From: Stefan Bader <stefan.bader@canonical.com>
To: Konrad Rzeszutek Wilk <konrad@darnok.org>
Cc: Olaf Hering <olaf@aepfle.de>,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: xenbus and the message of doom
Date: Thu, 15 Dec 2011 20:45:55 +0100
Message-ID: <4EEA4E73.9030002@canonical.com>
In-Reply-To: <20111215193942.GA7640@andromeda.dapyr.net>
On 15.12.2011 20:39, Konrad Rzeszutek Wilk wrote:
> On Thu, Dec 15, 2011 at 08:20:23PM +0100, Stefan Bader wrote:
>> I was investigating a bug report[1] about newer kernels (>3.1) not booting as
>> HVM guests on Amazon EC2. For some reason git bisect gave me some pain, but
>> it led me at least close, and with some crash dump data I think I figured out
>> the problem.
>
> Stefan, thanks for finding this.
>
I realize I wanted to add the reference to our bug report but completely forgot
to do so. So just for completeness:
http://bugs.launchpad.net/bugs/901305
> Olaf, what are your thoughts? Should I prep a patch to revert the commit
> below, so that we can rethink this for 3.3? The clock is
> ticking for 3.2 and there is not much runway to fix stuff.
>
>>
>> commit ddacf5ef684a655abe2bb50c4b2a5b72ae0d5e05
>> Author: Olaf Hering <olaf@aepfle.de>
>> Date: Thu Sep 22 16:14:49 2011 +0200
>>
>> xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old
>> kernel
>>
>> This change introduced an xs_reset_watches() call. The problem seems to be that
>> there is at least some version of Xen (I was able to reproduce with a 3.4.3
>> version which I admit to deliberately not having updated) for which xenstore
>> will not return any reply.
>
> And oxenstore too, but Ian prepped a patch for this. Perhaps that is
> what Amazon is running.
>>
>> At least the backtraces in crash showed that xs_init had been calling
>> xs_reset_watches() and that was happily idling in read_reply(). Effectively
>> nothing was going on and the boot just hung.
>
> So at least we should have a timeout in read_reply(). But I don't see
> anything in the code that we could immediately use.
>
>> By just not doing that xs_reset_watches() call, I was able to boot under the
>> same host. And for what it is worth there has not been an issue with Xen 4.1.1
>> and a 3.0 dom0 kernel. Just this "older" release is trouble.
>>
>> Now the big question is: should this never happen, so that the host needs urgent
>> updating? Or should xs_talkv() set up a time limit and assume failure when no
>> message arrives within it? I could imagine the latter might lead at least
>> to a more helpful "there is something wrong here, dude" than just hanging around
>> without any response. ;)
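To make the time-limit idea a bit more concrete, here is a rough userspace
sketch. It is not kernel code and not a patch: pthreads stand in for the
kernel's wait queue, and reply_slot, reply_slot_init(), reply_slot_post() and
wait_for_reply() are made-up names for illustration only. The point is just
that the waiter gives up with an error once a deadline passes instead of
sleeping forever.

```c
/* Userspace illustration only; this is NOT the kernel's xenbus code. */
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for whatever the reader sleeps on. */
struct reply_slot {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;   /* set once a reply has arrived */
};

static void reply_slot_init(struct reply_slot *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->ready = false;
}

/* Called by whoever delivers the reply (xenstore, in the real case). */
static void reply_slot_post(struct reply_slot *s)
{
    pthread_mutex_lock(&s->lock);
    s->ready = true;
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->lock);
}

/*
 * Wait up to timeout_ms for a reply.
 * Returns 0 if a reply arrived, -ETIMEDOUT if the wait ended without one.
 */
static int wait_for_reply(struct reply_slot *s, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    /* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&s->lock);
    /* Loop to cope with spurious wakeups; stop on timeout or error. */
    while (!s->ready && rc == 0)
        rc = pthread_cond_timedwait(&s->cond, &s->lock, &deadline);
    if (s->ready) {
        s->ready = false;    /* consume the reply */
        rc = 0;
    }
    pthread_mutex_unlock(&s->lock);

    return rc == 0 ? 0 : -ETIMEDOUT;
}
```

In the hang described above nobody ever posts a reply, so a call such as
wait_for_reply(&slot, 5000) would come back with -ETIMEDOUT after five seconds,
and the caller could print an error rather than sit there silently. How long
the limit should be, and what the caller does afterwards, is left open here.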
>>
>> -Stefan
>>
Thread overview: 24+ messages
2011-12-15 19:20 xenbus and the message of doom Stefan Bader
2011-12-15 19:39 ` Konrad Rzeszutek Wilk
2011-12-15 19:45 ` Stefan Bader [this message]
2011-12-16 11:33 ` Olaf Hering
2011-12-16 15:25 ` Konrad Rzeszutek Wilk
2012-01-02 9:32 ` Stefan Bader
2011-12-20 10:11 ` Ian Campbell
2011-12-20 13:15 ` Olaf Hering
2011-12-20 14:16 ` Konrad Rzeszutek Wilk
2011-12-20 17:29 ` Ian Jackson
2011-12-20 20:19 ` Ian Campbell
2012-01-02 17:16 ` Olaf Hering
2012-01-03 11:01 ` Ian Campbell
2012-01-04 15:57 ` Olaf Hering
2012-01-04 16:22 ` Ian Campbell
2012-01-04 16:27 ` Olaf Hering
2012-01-05 9:26 ` Ian Campbell
2012-01-05 18:43 ` Olaf Hering
2012-01-02 9:29 ` Stefan Bader
2011-12-15 20:53 ` Ian Campbell
2011-12-16 9:18 ` Stefan Bader
2011-12-16 9:31 ` Ian Campbell
2011-12-16 17:01 ` Olaf Hering
2011-12-16 21:26 ` Alessandro Salvatori