xen-devel.lists.xenproject.org archive mirror
* xenbus and the message of doom
From: Stefan Bader @ 2011-12-15 19:20 UTC
  To: xen-devel@lists.xensource.com; +Cc: Olaf Hering, Konrad Rzeszutek Wilk

I was investigating a bug report[1] about newer kernels (>3.1) not booting as
HVM guests on Amazon EC2. For some reason git bisect gave me some pain, but
it led me at least close, and with some crash dump data I think I have figured
out the problem.

commit ddacf5ef684a655abe2bb50c4b2a5b72ae0d5e05
Author: Olaf Hering <olaf@aepfle.de>
Date:   Thu Sep 22 16:14:49 2011 +0200

    xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old
    kernel

This change introduced an xs_reset_watches() call. The problem seems to be that
there is at least one version of Xen (I was able to reproduce this with a 3.4.3
version, which I admit to deliberately not having updated) for which xenstore
does not return any reply.

At least the backtraces in crash showed that xs_init had been calling
xs_reset_watches() and that was happily idling in read_reply(). Effectively
nothing was going on and the boot just hung.
By simply skipping that xs_reset_watches() call, I was able to boot on the
same host. And for what it is worth, there has not been an issue with Xen 4.1.1
and a 3.0 dom0 kernel. Only this "older" release is trouble.

Now the big question is: should this never happen, meaning the host needs urgent
updating? Or should xs_talkv() set up a time limit and assume failure when no
message has been received after that? I could imagine the latter might at least
lead to a more helpful "there is something wrong here, dude" than just hanging
around without any response. ;)
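
To make the second option a bit more concrete, here is a minimal userspace
sketch of a bounded wait for a reply. It is not the kernel's xs_talkv() or
read_reply() code; the helper name read_reply_timeout(), its parameters and
the use of poll() on a file descriptor are all assumptions made for
illustration only. The point is merely that the caller gets an error back
instead of sleeping forever when the other side stays silent.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (NOT the real xenbus code): wait up to timeout_ms
 * for data on fd, then read at most len bytes into buf.
 *
 * Returns the number of bytes read, -ETIMEDOUT if nothing arrived in
 * time, or another negative errno value on failure.
 */
static ssize_t read_reply_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    ssize_t n;
    int rc;

    /* Retry if a signal interrupts poll(); note this restarts the timer. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -errno;          /* poll() itself failed */
    if (rc == 0)
        return -ETIMEDOUT;      /* the peer never answered */

    n = read(fd, buf, len);
    return n < 0 ? -errno : n;
}
```

A caller could then log a warning and carry on when it sees -ETIMEDOUT,
rather than blocking the boot indefinitely. What a sensible time limit would
be is left open here.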

-Stefan


Thread overview: 24+ messages
2011-12-15 19:20 xenbus and the message of doom Stefan Bader
2011-12-15 19:39 ` Konrad Rzeszutek Wilk
2011-12-15 19:45   ` Stefan Bader
2011-12-16 11:33   ` Olaf Hering
2011-12-16 15:25     ` Konrad Rzeszutek Wilk
2012-01-02  9:32       ` Stefan Bader
2011-12-20 10:11     ` Ian Campbell
2011-12-20 13:15       ` Olaf Hering
2011-12-20 14:16         ` Konrad Rzeszutek Wilk
2011-12-20 17:29           ` Ian Jackson
2011-12-20 20:19             ` Ian Campbell
2012-01-02 17:16               ` Olaf Hering
2012-01-03 11:01                 ` Ian Campbell
2012-01-04 15:57                   ` Olaf Hering
2012-01-04 16:22                     ` Ian Campbell
2012-01-04 16:27                       ` Olaf Hering
2012-01-05  9:26                         ` Ian Campbell
2012-01-05 18:43                           ` Olaf Hering
2012-01-02  9:29     ` Stefan Bader
2011-12-15 20:53 ` Ian Campbell
2011-12-16  9:18   ` Stefan Bader
2011-12-16  9:31     ` Ian Campbell
2011-12-16 17:01       ` Olaf Hering
2011-12-16 21:26         ` Alessandro Salvatori
