From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: Question about Xen S3 and resume code - Linux dom0 never exits the xen_safe_halt hypercall after resume
Date: Mon, 20 Jun 2011 08:36:26 -0400
Message-ID: <20110620123626.GA2973@dumpdata.com>
References: <20110616225739.GA8714@dumpdata.com> <625BA99ED14B2D499DC4E29D8138F1505D2C2DD530@shsmsx502.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <625BA99ED14B2D499DC4E29D8138F1505D2C2DD530@shsmsx502.ccr.corp.intel.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Tian, Kevin"
Cc: "xen-devel@lists.xensource.com" , "Yu, Ke"
List-Id: xen-devel@lists.xenproject.org

> ideally ACPI S3/S5 has nothing to do with ACPI processor driver which is for Cx/Px.

Right..

>
> >
> > (which is in the devel/acpi-s3.v0 branch).
> >
> > the hypervisor, after an S3 resume sits forever in the default_idle. The
> > Linux dom0 is stuck looping (I think) around SCHEDOP_block hypercall.
> >
> > http://darnok.org/xen/devel.acpi-s3.v1.serial.log
> >
> > If that patch above is present and I've cpufreq=xen on the Xen
> > hypervisor then Linux kernel gets unstuck and returns to userspace:
> >
> > http://darnok.org/xen/devel.acpi-s3.v0.serial.log
>
> Compare your logs, the major difference is:
>
> [ 168.754739] calling i2c-8+ @ 3096
> [ 168.758200] call i2c-8+ returned 0 after 0 usecs
> <<< 1st case stuck here
> [ 168.762882] calling card0-VGA-1+ @ 3096
> [ 168.766867] call card0-VGA-1+ returned 0 after 0 usecs
> [ 168.772085] calling ttm+ @ 3096
> [ 168.775360] call ttm+ returned 0 after 0 usecs
> [ 168.779870] PM: resume of devices complete after 13117.603 msecs
> [ 168.786006] PM: Finishing wakeup.
> <<<2nd case forward progress
>
> It looks that VGA card resume has some problem on resume, which then

In both cases - with the patch and without..

> makes dom0 stay in idle loop and thus block hypercall, and then due to
> no runnable vcpu so Xen most time in idle_loop too. In earlier log there're
> some stack trace in i915 driver. Perhaps you can try a different machine

Or remove the i915 just to eliminate that.

> or try native S3 on same box to make sure it's not mixed with native issues.
>
> >
> > (however, if I set cpuidle=0 cpufreq=none on the hypervisor line and
> > have the 9f301b0a0081676dfc71b7f0898295e6bcba391a patch it still
> > gets stuck).
> >
> > I figured that the primary reason the guest is allowed to
> > exit is SCHEDOP_block loop is b/c the pm_idle call is set to the
> > acp_processor_idle which does "something" extra after the machine comes
> > out of a S3 suspend.
>
> If that's the case I think you should disable CONFIG_ACPI_PROCESSOR in dom0
> before incorporating Xen specific version (the patch you tried). We don't want
> dom0 to play with Cx directly b/c it's the responsibility of Xen.

Huh? You misunderstood me. The 'acpi_processor_idle' is the hypervisor's
idle loop. The hypervisor can be running inside that one or inside the
'default_idle' loop. Hence my question: why would that specific
hypervisor idle loop make dom0 run nicely while the default one would
not? In dom0, regardless of the patches, 'default_idle' is run, which
makes the xen_safe_halt paravirt call.

>
> Of course we still need figure out why same issues occur with cpuidle=0/
> cpufreq=none, which however can be revisited after the basic S3 works. :-)

Right. The end result of those parameters is that the 'default_idle' in
the hypervisor is chosen instead of the 'acpi_processor_idle' one.

>
> >
> > Any ideas?
>
> No other ideas for now.
> From historical view Xen S3 was supported before

Hmm, I am actually tempted to start commenting out code in the
acpi_processor_idle and seeing what will cause it to have the same
failure as 'default_idle'.
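
To make the idle-loop question concrete, the selection discussed in this thread can be modelled as picking one of two handlers through a function pointer. This is only a toy sketch and not the hypervisor's code: `select_idle`, `run_idle_once`, the `idle_kind` enum, and the two boolean inputs are hypothetical names, the handler bodies are placeholders, and the rule that the ACPI routine needs both cpuidle enabled and the dom0 patch present is a simplification of the observations reported above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: the names mirror the thread, the bodies are placeholders. */
enum idle_kind { IDLE_NONE, IDLE_DEFAULT, IDLE_ACPI };

typedef void (*idle_fn)(void);

/* Records which idle routine ran last, so the choice can be observed. */
static enum idle_kind last_idle = IDLE_NONE;

static void default_idle(void)        { last_idle = IDLE_DEFAULT; }
static void acpi_processor_idle(void) { last_idle = IDLE_ACPI; }

/*
 * Hypothetical helper (not a Xen function). Assumption: the ACPI idle
 * routine is used only when cpuidle is enabled and the dom0 patch is
 * present; every other combination falls back to default_idle.
 */
static idle_fn select_idle(bool cpuidle_enabled, bool dom0_patch_present)
{
    if (cpuidle_enabled && dom0_patch_present)
        return acpi_processor_idle;
    return default_idle;
}

/* Hypothetical helper: run one pass of the chosen routine and report it. */
static enum idle_kind run_idle_once(idle_fn fn)
{
    assert(fn != NULL);
    fn();
    return last_idle;
}
```

In this model, `run_idle_once(select_idle(false, true))` corresponds to the cpuidle=0 case with the patch applied, which ends up in `default_idle`. Stubbing out pieces of the real `acpi_processor_idle` one at a time, as proposed above, would be the way to find which step the default loop is missing.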