From: Daniel Stodden <daniel.stodden@citrix.com>
To: MaoXiaoyun <tinnycloud@hotmail.com>
Cc: xen devel <xen-devel@lists.xensource.com>, "keir@xen.org" <keir@xen.org>
Subject: RE: Domain 0 stop response on frequently reboot VMS
Date: Sat, 23 Oct 2010 22:56:51 -0700
Message-ID: <1287899811.4575.32.camel@ramone>
In-Reply-To: <BLU157-w385D9EE38F5058B8F364C2DA400@phx.gbl>
On Sun, 2010-10-24 at 01:48 -0400, MaoXiaoyun wrote:
> Hi Daniel:
>
> Sorry for the late response, and thanks very much for your kind
> suggestion.
> I believe we will upgrade to the latest kernel in the near future,
> but for now we prefer to stay on our current version for stability
> reasons.
>
> Our kernel version is 2.6.31. I am now going through the blktap
> change set to get more detailed information.
NP. Let me know if you have questions.
Daniel
> thanks.
>
> > Subject: RE: [Xen-devel] Domain 0 stop response on frequently reboot VMS
> > From: daniel.stodden@citrix.com
> > To: tinnycloud@hotmail.com; jeremy@goop.org
> > CC: keir@xen.org; xen-devel@lists.xensource.com
> > Date: Mon, 18 Oct 2010 14:17:50 -0700
> >
> >
> > I'd strongly suggest trying to upgrade your kernel, or at least the
> > blktap component. The condition below is new to me, but that
> > wait_queue file and some related code was known to be buggy and has
> > long since been removed.
> >
> > If you choose to upgrade only blktap from tip, let me know what
> > kernel version you're dealing with; you might need to backport some
> > of the device queue macros to match your version's needs.
> >
> > Daniel
> >
> >
> > On Sat, 2010-10-16 at 01:39 -0400, MaoXiaoyun wrote:
> > > Well, thanks Keir.
> > > Fortunately we caught the bug; it turned out to be a tapdisk
> > > problem. Here is a brief explanation for others who might run
> > > into this issue.
> > >
> > > Clearing BLKTAP_DEFERRED on line 19 allows concurrent access to
> > > tap->deferred_queue between lines 24 and 37: once the bit is
> > > cleared, blktap_defer() can re-queue the tap while it is still
> > > linked on the local queue, which eventually corrupts the
> > > tap->deferred_queue pointers and turns the while loop at line 22
> > > into an infinite loop.
> > > Taking the lock around line 24 would be a simple fix; a sketch
> > > follows the listing below.
> > >
> > > /linux-2.6-pvops.git/drivers/xen/blktap/wait_queue.c
> > >  9 void
> > > 10 blktap_run_deferred(void)
> > > 11 {
> > > 12         LIST_HEAD(queue);
> > > 13         struct blktap *tap;
> > > 14         unsigned long flags;
> > > 15
> > > 16         spin_lock_irqsave(&deferred_work_lock, flags);
> > > 17         list_splice_init(&deferred_work_queue, &queue);
> > > 18         list_for_each_entry(tap, &queue, deferred_queue)
> > > 19                 clear_bit(BLKTAP_DEFERRED, &tap->dev_inuse);
> > > 20         spin_unlock_irqrestore(&deferred_work_lock, flags);
> > > 21
> > > 22         while (!list_empty(&queue)) {
> > > 23                 tap = list_entry(queue.next, struct blktap,
> > >                                     deferred_queue);
> > > 24                 list_del_init(&tap->deferred_queue);
> > > 25                 blktap_device_restart(tap);
> > > 26         }
> > > 27 }
> > > 28
> > > 29 void
> > > 30 blktap_defer(struct blktap *tap)
> > > 31 {
> > > 32         unsigned long flags;
> > > 33
> > > 34         spin_lock_irqsave(&deferred_work_lock, flags);
> > > 35         if (!test_bit(BLKTAP_DEFERRED, &tap->dev_inuse)) {
> > > 36                 set_bit(BLKTAP_DEFERRED, &tap->dev_inuse);
> > > 37                 list_add_tail(&tap->deferred_queue, &deferred_work_queue);
> > > 38         }
> > > 39         spin_unlock_irqrestore(&deferred_work_lock, flags);
> > > 40 }
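> > >
> > > A minimal sketch of that fix (untested; it uses only the names
> > > from the listing above, and goes slightly beyond locking line 24
> > > alone by also moving the clear_bit() into the same critical
> > > section):
> > >
> > > void
> > > blktap_run_deferred(void)
> > > {
> > >         LIST_HEAD(queue);
> > >         struct blktap *tap;
> > >         unsigned long flags;
> > >
> > >         spin_lock_irqsave(&deferred_work_lock, flags);
> > >         list_splice_init(&deferred_work_queue, &queue);
> > >         spin_unlock_irqrestore(&deferred_work_lock, flags);
> > >
> > >         while (!list_empty(&queue)) {
> > >                 tap = list_entry(queue.next, struct blktap,
> > >                                  deferred_queue);
> > >                 /*
> > >                  * Unlink the entry and clear BLKTAP_DEFERRED under
> > >                  * the lock, so a concurrent blktap_defer() cannot
> > >                  * re-queue the tap while it is still linked on the
> > >                  * local list.
> > >                  */
> > >                 spin_lock_irqsave(&deferred_work_lock, flags);
> > >                 list_del_init(&tap->deferred_queue);
> > >                 clear_bit(BLKTAP_DEFERRED, &tap->dev_inuse);
> > >                 spin_unlock_irqrestore(&deferred_work_lock, flags);
> > >                 blktap_device_restart(tap);
> > >         }
> > > }
> > >
> > > Since BLKTAP_DEFERRED now stays set until the entry actually
> > > leaves the local queue, a concurrent blktap_defer() sees the bit
> > > and skips its list_add_tail(), so the list pointers stay
> > > consistent.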
> > >
> > >
> > > > Date: Fri, 15 Oct 2010 13:57:09 +0100
> > > > Subject: Re: [Xen-devel] Domain 0 stop response on frequently reboot VMS
> > > > From: keir@xen.org
> > > > To: tinnycloud@hotmail.com; xen-devel@lists.xensource.com
> > > >
> > > > You'll probably want to see if you can get SysRq output from
> > > > dom0 via serial line. It's likely you can if it is alive enough
> > > > to respond to ping. This might tell you things like what all
> > > > the processes are getting blocked on, and thus indicate what is
> > > > stopping dom0 from making progress.
> > > >
> > > > -- Keir
> > > >
> > > > On 15/10/2010 13:43, "MaoXiaoyun" <tinnycloud@hotmail.com> wrote:
> > > >
> > > > >
> > > > > Hi Keir:
> > > > >
> > > > > First, I'd like to express my appreciation for the help you
> > > > > offered before.
> > > > > Well, we recently ran into a rather nasty domain 0
> > > > > no-response problem.
> > > > >
> > > > > We are running a reboot test with 12 HVMs rebooting almost
> > > > > continuously and concurrently on a physical server.
> > > > > A few hours later, the server appears dead. We can only ping
> > > > > the server and get a correct response; Xen itself still
> > > > > works, since we can get debug info from the serial port.
> > > > > Attached is the full debug output.
> > > > > After decoding the domain 0 CPU stack, I find the CPU still
> > > > > works for domain 0, since the stack info changed every time I
> > > > > dumped it.
> > > > >
> > > > > Could you take a look at the attachment to see whether there
> > > > > are any hints for debugging this problem? Thanks in advance.
> > > > >
> > > > > _______________________________________________
> > > > > Xen-devel mailing list
> > > > > Xen-devel@lists.xensource.com
> > > > > http://lists.xensource.com/xen-devel
> > > >
> > > >
> >
> >