From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?=
Subject: Re: S3 resume regression [1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu")]
Date: Wed, 18 May 2016 10:24:24 +0300
Message-ID: <20160518072424.GU4329@intel.com>
References: <20160511101920.GZ4329@intel.com>
 <57332171.8070403@linutronix.de>
 <20160511122116.GA4329@intel.com>
 <20160511084445.00030b49@gandalf.local.home>
 <20160511133406.GC4329@intel.com>
 <20160516193910.GL4329@intel.com>
 <573BA5E2.5040506@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Content-Disposition: inline
In-Reply-To: <573BA5E2.5040506@intel.com>
Sender: linux-kernel-owner@vger.kernel.org
To: "Rafael J. Wysocki"
Cc: Steven Rostedt, Sebastian Andrzej Siewior, Thomas Gleixner,
 linux-arch@vger.kernel.org, Rik van Riel, "Srivatsa S. Bhat",
 Peter Zijlstra, Arjan van de Ven, Rusty Russell, Oleg Nesterov,
 Tejun Heo, Andrew Morton, Paul McKenney, Linus Torvalds, Paul Turner,
 linux-kernel@vger.kernel.org, rui.zhang@intel.com, len.brown@intel.com,
 Linux PM, Linux ACPI
List-Id: linux-arch.vger.kernel.org

On Wed, May 18, 2016 at 01:14:42AM +0200, Rafael J. Wysocki wrote:
> On 5/16/2016 9:39 PM, Ville Syrjälä wrote:
> > On Wed, May 11, 2016 at 04:34:06PM +0300, Ville Syrjälä wrote:
> >> On Wed, May 11, 2016 at 08:44:45AM -0400, Steven Rostedt wrote:
> >>> On Wed, 11 May 2016 15:21:16 +0300
> >>> Ville Syrjälä wrote:
> >>>
> >>>> Yeah can't get anything from the machine at that point. netconsole
> >>>> didn't help either, and no serial on this machine. And IIRC I've
> >>>> tried ramoops on this thing in the past but unfortunately the memory
> >>>> got cleared on reboot.
> >>>>
> >>> Can you look at the documentation in the kernel code at
> >>>
> >>> Documentation/power/basic-pm-debugging.txt And follow the procedures
> >>> for testing suspend to RAM (although it requires mostly running the
> >>> same tests as for hibernation suspending).
> >>>
> >>> You can also use the tool s2ram for this as well.
> >>>
> >>> See Documentation/power/s2ram.txt
> >>>
> >>> Perhaps this can give us a bit more light onto the problem.
> >>>
> >>> Basically the above does partial suspend and resume, and can pinpoint
> >>> problem areas down to a more select location.
> >> All the pm_test modes work fine. The only difference between them was
> >> that 'platform' required me to manually wake up the machine (hitting a
> >> key was sufficient), whereas the others woke up without help.
> >>
> >> pm_trace gave me
> >> [ 1.306633] Magic number: 0:185:178
> >> [ 1.322880] hash matches ../drivers/base/power/main.c:1070
> >> [ 1.339270] acpi device:0e: hash matches
> >> [ 1.355414] platform: hash matches
> >>
> >> which is the TRACE_SUSPEND in __device_suspend_noirq(), so no help
> >> there.
> >>
> >> I guess I could try to sprinkle more TRACE_RESUMEs around into some
> >> early resume code. If anyone has good ideas where to put them it
> >> might speed things up a bit.
> > So I did a bunch of that and found that it gets stuck somewhere
> > around executing the _WAK method:
> >  platform_resume_noirq
> >  acpi_pm_finish
> >  acpi_leave_sleep_state
> >  acpi_hw_sleep_dispatch
> >  acpi_hw_legacy_wake
> >  acpi_hw_execute_sleep_method
> >  acpi_evaluate_object
> >  acpi_ns_evaluate
> >  acpi_ps_execute_method
> >  acpi_ps_parse_aml
> >
> > It also seems that adding a few TRACE_RESUME()s or an msleep() right
> > after enable_nonboot_cpus() can avoid the hang, sometimes.
> >
> > I've attached the DSDT in case anyone is interested in looking at it.
> >
>
> What if you comment out the execution of _WAK (line 318 of
> drivers/acpi/acpica/hwsleep.c in 4.6)? Does that make any difference?

Indeed it does. Tried with acpi_idle and intel_idle, and both appear to
resume just fine with that hack.

-	acpi_hw_execute_sleep_method(METHOD_PATHNAME__WAK, sleep_state);
+	//acpi_hw_execute_sleep_method(METHOD_PATHNAME__WAK, sleep_state);
+	printk(KERN_CRIT "skipping _WAK\n");

-- 
Ville Syrjälä
Intel OTC