From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Cc: "ian.campbell@citrix.com" <ian.campbell@citrix.com>,
"stefano.stabellini@eu.citrix.com"
<stefano.stabellini@eu.citrix.com>,
"Zhangbo (Oscar)" <oscar.zhangbo@huawei.com>,
Yanqiangjun <yanqiangjun@huawei.com>,
Luonengjun <luonengjun@huawei.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
"rjw@sisk.pl" <rjw@sisk.pl>,
"rshriram@cs.ubc.ca" <rshriram@cs.ubc.ca>,
"Jinjian (Ken)" <jinjian@huawei.com>
Subject: Re: pvops: Does PVOPS guest os support online "suspend/resume"
Date: Tue, 13 Aug 2013 12:34:46 -0400
Message-ID: <20130813163446.GA12994@phenom.dumpdata.com>
In-Reply-To: <33183CC9F5247A488A2544077AF190208159306C@SZXEMA503-MBS.china.huawei.com>

On Tue, Aug 13, 2013 at 02:38:18PM +0000, Gonglei (Arei) wrote:
> Hi,
> I rechecked the different kernels today and found that I made a mistake before. Sorry for misleading you all :)
>
> All in all, the problems can be summarized in the 2 items below:
> 1 the kernel 2.6.32 PVOPS guest os (I tested RHEL6.1 and RHEL6.3) does have bugs in ONLINE suspend/resume (checkpoint), which were,
> as Shriram mentioned, fixed in:
> http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/xen/manage.c?id=b3e96c0c756211e805c6941d4a6e5f6e1995cb6b
> 2 kernels above 3.0 (I tested Ubuntu12.10 with kernel 3.5 and Ubuntu13.04 with kernel 3.8) seem to have another "bug":
> 1) if we set MULTIPLE VCPUS for the guest os, it has problems resuming (to be precise, thawing).
> In detail:
> <1>set the guest os with 4 vcpus
> in dom1.cfg: vcpus=4
> <2>xl create dom1.cfg
> execute the command "top -d 1" in guest dom1's vnc window
> <3>xl save -c dom1 /opt/dom1.save
> <4>after step <3>, we checked the guest dom1's vnc window and found that:
> kernel threads migration/1, migration/2, migration/3 got their cpu usage up to 100%
> the guest os couldn't respond to any request such as mouse movement or keyboard input.
> no "thaw" messages were printed in dom1's serial output.
>
> 2) if we set only 1 vcpu for the guest os, it thaws and works fine.
> 3) another odd thing is that if we use the saved file generated in 2-1) to restore the guest, and then do online suspend/resume (xl save -c, checkpoint),
> it works fine and no problems occur.
>
> This problem occurs on guest os with kernel 3.5/3.8 (maybe other kernels as well, not tested). I hope that the steps I followed were correct.

Please check with the upstream kernel. There were some CPU hotplug issues in older kernels,
and it would be good to rule those out as the cause here.
Please test with v3.11-rc5.

> Have you ever encountered such a "suspend/resume checkpoint on multi-vcpu guest os" problem?
>
> -------
> PS: BTW, I'm wondering why using freeze/thaw instead of suspend/resume would solve the problem with kernels below 3.0?
> It seems that blkfront_resume is still called if we use the thaw method here, because blkfront has no available pm_op.
>
> static int device_resume(struct device *dev, pm_message_t state, bool async)
> {
> …………
> 	if (dev->bus) {
> 		if (dev->bus->pm) {
> 			info = "bus ";
> 			callback = pm_op(dev->bus->pm, state);
> 		} else if (dev->bus->resume) {
> 			info = "legacy bus ";
> 			callback = dev->bus->resume; /* is blkfront_resume called here? */
> 			goto End;
One easy way to figure this out is to stick printks in here to see if that blkfront code
is indeed called. You can also use 'dump_stack()' to get a nice stack-trace.
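
To make the question concrete, here is a minimal userspace sketch of only the
bus branch quoted above. It is a model, not kernel code: the names
model_pm_ops, model_bus, model_pm_op, select_bus_callback and
demo_legacy_resume are made up for illustration, and it ignores everything
elided by the "…………" in the quote. It only shows that, in that branch, a bus
with no pm ops but a legacy resume hook gets the same callback for thaw as for
resume. Whether the Xen bus actually takes this branch on a given kernel is
exactly what the printk/dump_stack() check would tell you.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for kernel types; NOT the real definitions. */
enum pm_event { EVENT_RESUME, EVENT_THAW, EVENT_RESTORE };

typedef int (*pm_cb)(void);

struct model_pm_ops {               /* models struct dev_pm_ops */
	pm_cb resume;
	pm_cb thaw;
	pm_cb restore;
};

struct model_bus {                  /* models struct bus_type */
	const struct model_pm_ops *pm;  /* NULL when the bus has no pm ops */
	pm_cb resume;                   /* legacy bus resume hook */
};

/* Models pm_op(): pick the callback that matches the PM event. */
pm_cb model_pm_op(const struct model_pm_ops *ops, enum pm_event ev)
{
	switch (ev) {
	case EVENT_RESUME:
		return ops->resume;
	case EVENT_THAW:
		return ops->thaw;
	case EVENT_RESTORE:
		return ops->restore;
	}
	return NULL;
}

/*
 * Models the quoted bus branch. Stores the path taken in *info and
 * returns the selected callback (NULL if none applies).
 */
pm_cb select_bus_callback(const struct model_bus *bus, enum pm_event ev,
			  const char **info)
{
	assert(info != NULL);
	*info = "none";

	if (!bus)
		return NULL;

	if (bus->pm) {
		*info = "bus ";
		return model_pm_op(bus->pm, ev);
	}

	if (bus->resume) {
		*info = "legacy bus ";
		return bus->resume;     /* chosen regardless of ev */
	}

	return NULL;
}

/* Hypothetical stand-in for a legacy hook such as blkfront_resume. */
int legacy_resume_calls;

int demo_legacy_resume(void)
{
	legacy_resume_calls++;
	return 0;
}
```

Calling select_bus_callback() with EVENT_THAW on a model_bus whose pm is NULL
and whose resume is demo_legacy_resume returns demo_legacy_resume, the same as
for EVENT_RESUME. With pm set and no thaw callback in it, EVENT_THAW returns
NULL instead.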