From: Daniel Stodden <daniel.stodden@citrix.com>
To: MaoXiaoyun <tinnycloud@hotmail.com>
Cc: xen devel <xen-devel@lists.xensource.com>, "keir@xen.org" <keir@xen.org>
Subject: RE: Domain 0 stop response on frequently reboot VMS
Date: Sat, 23 Oct 2010 22:56:51 -0700
Message-ID: <1287899811.4575.32.camel@ramone>
In-Reply-To: <BLU157-w385D9EE38F5058B8F364C2DA400@phx.gbl>

On Sun, 2010-10-24 at 01:48 -0400, MaoXiaoyun wrote:
> Hi Daniel:
>  
>      Sorry for the late response, and thank you very much for your kind
> suggestion.
>      We do expect to upgrade to the latest kernel in the near future,
> but for now we prefer to stay on the current one for stability reasons.
>  
>     Our kernel version is 2.6.31. I am now going through the blktap
> changesets to get more detailed information.

NP. Let me know if you have questions.

Daniel

>    thanks.
>  
> > Subject: RE: [Xen-devel] Domain 0 stop response on frequently reboot VMS
> > From: daniel.stodden@citrix.com
> > To: tinnycloud@hotmail.com; jeremy@goop.org
> > CC: keir@xen.org; xen-devel@lists.xensource.com
> > Date: Mon, 18 Oct 2010 14:17:50 -0700
> > 
> > 
> > I'd strongly suggest to try upgrading your kernel, or at least the
> > blktap component. The condition below is new to me, but that wait_queue
> > file and some related code was known to be buggy and has long since been
> > removed.
> > 
> > If you choose to only upgrade blktap from tip, let me know what kernel
> > version you're dealing with, you might need to backport some of the
> > device queue macros to match your version's needs.
> > 
> > Daniel
> > 
> > 
> > On Sat, 2010-10-16 at 01:39 -0400, MaoXiaoyun wrote:
> > > Well, thanks Keir.
> > > Fortunately we caught the bug; it turned out to be a tapdisk problem.
> > > Here is a brief explanation for others who might run into this issue.
> > > 
> > > Clearing BLKTAP_DEFERRED on line 19 allows concurrent access to
> > > tap->deferred_queue between line 24 and line 37. This eventually
> > > leaves tap->deferred_queue with bad pointers and causes an infinite
> > > loop in the while clause on line 22.
> > > Taking the lock around line 24 would be a simple fix.
> > > 
> > > /linux-2.6-pvops.git/drivers/xen/blktap/wait_queue.c
> > > 9 void
> > > 10 blktap_run_deferred(void)
> > > 11 {
> > > 12     LIST_HEAD(queue);
> > > 13     struct blktap *tap;
> > > 14     unsigned long flags;
> > > 15 
> > > 16     spin_lock_irqsave(&deferred_work_lock, flags);
> > > 17     list_splice_init(&deferred_work_queue, &queue);
> > > 18     list_for_each_entry(tap, &queue, deferred_queue)
> > > 19         clear_bit(BLKTAP_DEFERRED, &tap->dev_inuse);
> > > 20     spin_unlock_irqrestore(&deferred_work_lock, flags);
> > > 21 
> > > 22     while (!list_empty(&queue)) {
> > > 23         tap = list_entry(queue.next, struct blktap, deferred_queue);
> > > 24         list_del_init(&tap->deferred_queue);
> > > 25         blktap_device_restart(tap);
> > > 26     }
> > > 27 }
> > > 28 
> > > 29 void
> > > 30 blktap_defer(struct blktap *tap)
> > > 31 {
> > > 32     unsigned long flags;
> > > 33 
> > > 34     spin_lock_irqsave(&deferred_work_lock, flags);
> > > 35     if (!test_bit(BLKTAP_DEFERRED, &tap->dev_inuse)) {
> > > 36         set_bit(BLKTAP_DEFERRED, &tap->dev_inuse);
> > > 37         list_add_tail(&tap->deferred_queue, &deferred_work_queue);
> > > 38     }
> > > 39     spin_unlock_irqrestore(&deferred_work_lock, flags);
> > > 40 }
> > > 
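
The race described above can be modelled in user space. What follows is
a minimal sketch under stated assumptions, not the kernel code: a C11
atomic_flag spinlock stands in for deferred_work_lock, a plain bool
stands in for the BLKTAP_DEFERRED bit, and a counter stands in for
blktap_device_restart(). The names tap_init, tap_defer and run_deferred
are hypothetical. The sketch clears the flag only after the entry has
been unlinked, with the lock held. That goes a little further than
locking line 24 alone, and is meant to ensure that the defer path can
never re-queue an entry that is still linked on the local list.

```c
/* User-space model of the deferred-queue handling; NOT the kernel code. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev; n->next = h;
    h->prev->next = n; h->prev = n;
}

static void list_del_init(struct list_head *n)
{
    n->prev->next = n->next; n->next->prev = n->prev;
    list_init(n);
}

/* Move every entry of src onto the empty list dst, then reset src. */
static void list_splice_init(struct list_head *src, struct list_head *dst)
{
    if (list_empty(src)) return;
    dst->next = src->next; dst->prev = src->prev;
    dst->next->prev = dst; dst->prev->next = dst;
    list_init(src);
}

struct tap {
    struct list_head deferred_queue;
    bool deferred;        /* stand-in for the BLKTAP_DEFERRED bit */
    int restart_count;    /* stand-in for blktap_device_restart() */
};
#define tap_of(p) ((struct tap *)((char *)(p) - offsetof(struct tap, deferred_queue)))

/* Assumption: a simple spinlock models deferred_work_lock. */
static atomic_flag deferred_work_lock = ATOMIC_FLAG_INIT;
static struct list_head deferred_work_queue = { &deferred_work_queue, &deferred_work_queue };

static void lock(void) { while (atomic_flag_test_and_set_explicit(&deferred_work_lock, memory_order_acquire)) ; }
static void unlock(void) { atomic_flag_clear_explicit(&deferred_work_lock, memory_order_release); }

void tap_init(struct tap *t)
{
    list_init(&t->deferred_queue);
    t->deferred = false;
    t->restart_count = 0;
}

void tap_defer(struct tap *t)
{
    lock();
    if (!t->deferred) {            /* queue each tap at most once */
        t->deferred = true;
        list_add_tail(&t->deferred_queue, &deferred_work_queue);
    }
    unlock();
}

void run_deferred(void)
{
    struct list_head queue;
    list_init(&queue);

    lock();
    list_splice_init(&deferred_work_queue, &queue);
    unlock();

    for (;;) {
        struct tap *t;

        lock();
        if (list_empty(&queue)) { unlock(); break; }
        t = tap_of(queue.next);
        list_del_init(&t->deferred_queue);  /* unlink while holding the lock */
        t->deferred = false;                /* only now may tap_defer() re-queue it */
        unlock();

        t->restart_count++;                 /* restart runs outside the lock */
    }
}
```

In this model, deferring the same tap twice before a run queues it only
once, and a tap deferred again after it has been processed is simply
picked up by the next call to run_deferred().
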
> > > 
> > > > Date: Fri, 15 Oct 2010 13:57:09 +0100
> > > > Subject: Re: [Xen-devel] Domain 0 stop response on frequently reboot VMS
> > > > From: keir@xen.org
> > > > To: tinnycloud@hotmail.com; xen-devel@lists.xensource.com
> > > > 
> > > > You'll probably want to see if you can get SysRq output from dom0 via serial
> > > > line. It's likely you can if it is alive enough to respond to ping. This
> > > > might tell you things like what all processes are getting blocked on, and
> > > > thus indicate what is stopping dom0 from making progress.
> > > > 
> > > > -- Keir
> > > > 
> > > > On 15/10/2010 13:43, "MaoXiaoyun" <tinnycloud@hotmail.com> wrote:
> > > > 
> > > > > 
> > > > > Hi Keir:
> > > > > 
> > > > > > First, I'd like to express my appreciation for the help you
> > > > > > offered before.
> > > > > > Recently we have run into a rather nasty problem where domain 0
> > > > > > stops responding.
> > > > > > 
> > > > > > We still have 12 HVMs rebooting almost continuously and
> > > > > > concurrently as a test on one physical server.
> > > > > > A few hours later, the server looks dead. We can only ping the
> > > > > > server and get a correct response.
> > > > > > Xen itself works fine, since we can get debug info from the
> > > > > > serial port. Attached is the full debug output.
> > > > > > After decoding the domain 0 CPU stack, I find that the CPU is
> > > > > > still working for domain 0, since the stack info changed every
> > > > > > time I dumped it.
> > > > > > 
> > > > > > Could you take a look at the attachment to see whether there
> > > > > > are any hints for debugging this problem? Thanks in advance.
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > _______________________________________________
> > > > > Xen-devel mailing list
> > > > > Xen-devel@lists.xensource.com
> > > > > http://lists.xensource.com/xen-devel
> > > > 
> > > > 
> > 
> > 
