From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1759314Ab2AMWHW (ORCPT ); Fri, 13 Jan 2012 17:07:22 -0500
Received: from e23smtp04.au.ibm.com ([202.81.31.146]:38992
 "EHLO e23smtp04.au.ibm.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1755782Ab2AMWHT (ORCPT );
 Fri, 13 Jan 2012 17:07:19 -0500
Message-ID: <4F10AB02.1010305@linux.vnet.ibm.com>
Date: Sat, 14 Jan 2012 03:36:58 +0530
From: "Srivatsa S. Bhat"
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:7.0) Gecko/20110927 Thunderbird/7.0
MIME-Version: 1.0
To: "Justin P. Mattock"
CC: Andi Kleen, Linus Torvalds, Ming Lei, Djalal Harouni, Borislav Petkov,
 Tony Luck, Hidetoshi Seto, Ingo Molnar, linux-kernel@vger.kernel.org,
 Greg Kroah-Hartman, Kay Sievers, gouders@et.bocholt.fh-gelsenkirchen.de,
 Marcos Souza, Linux PM mailing list, "Rafael J. Wysocki",
 "tglx@linutronix.de", prasad@linux.vnet.ibm.com, Jeff Chua
Subject: Re: x86/mce: machine check warning during poweroff
References: <20120111000051.GA28874@dztty>
 <4F10929E.8070007@linux.vnet.ibm.com>
 <4F1099CA.7090209@linux.vnet.ibm.com>
 <4F10A0FD.9040504@linux.intel.com>
 <4F10A464.8080603@gmail.com>
In-Reply-To: <4F10A464.8080603@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
x-cbid: 12011311-9264-0000-0000-0000009E982C
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/14/2012 03:08 AM, Justin P. Mattock wrote:

>>
>
> this showed up using no_console_suspend
>
>
> 131.875143] usb 5-1: device descriptor read/64, error -110
> [ 140.599340] PM: Syncing filesystems ... done.
> [ 140.815981] PM: Preparing system for mem sleep
> [ 140.829117] Freezing user space processes ... (elapsed 0.01 seconds)
> done.
> [ 140.840150] Freezing remaining freezable tasks ...
> [ 147.079160] usb 5-1: device descriptor read/64, error -110
> [ 147.282166] usb 5-1: new full-speed USB device number 6 using uhci_hcd
> [ 157.686165] usb 5-1: device not accepting address 6, error -110
> [ 157.788183] usb 5-1: new full-speed USB device number 7 using uhci_hcd
> [ 160.849310]
> [ 160.849320] Freezing of tasks failed after 20.00 seconds (1 tasks
> refusing to freeze, wq_busy=0):
> [ 160.849460] khubd D f5d90020 0 20 2 0x00000000
> [ 160.849471] f5d95d50 00000046 f5d095d0 f5d90020 00000000 c16ec3c0
> bce8e78a 00000024
> [ 160.849488] c16ec3c0 bce7b7f6 00000024 f60063c0 f5d09170 f5d95d20
> c120490b c1039aa6
> [ 160.849505] 00000000 00000046 c1721180 00000296 f5d95d70 f5d95d40
> c1465208 00000000
> [ 160.849521] Call Trace:
> [ 160.849538] [] ? do_raw_spin_lock+0x3b/0xf0
> [ 160.849548] [] ? lock_timer_base.isra.24+0x26/0x50
> [ 160.849558] [] ? _raw_spin_lock_irqsave+0x58/0x70
> [ 160.849567] [] ? do_raw_spin_unlock+0x4e/0x90
> [ 160.849574] [] schedule+0x30/0x50
> [ 160.849582] [] schedule_timeout+0x10f/0x1f0
> [ 160.849589] [] ? usleep_range+0x40/0x40
> [ 160.849597] [] wait_for_common+0xb0/0x120
> [ 160.849605] [] ? try_to_wake_up+0x260/0x260
> [ 160.849614] [] wait_for_completion_timeout+0xd/0x10
> [ 160.849624] [] usb_start_wait_urb+0xb1/0xe0
> [ 160.849632] [] ? sys_swapon+0xab1/0xc50
> [ 160.849640] [] usb_control_msg+0xb8/0xf0
> [ 160.849648] [] ? _dev_info+0x28/0x30
> [ 160.849656] [] hub_port_init+0x627/0x710
> [ 160.849664] [] ? usb_set_device_state+0x76/0x130
> [ 160.849672] [] hub_thread+0x626/0x1080
> [ 160.849681] [] ? finish_task_switch+0x31/0xf0
> [ 160.849688] [] ? __schedule+0x3b0/0x7b0
> [ 160.849698] [] ? __init_waitqueue_head+0x50/0x50
> [ 160.849705] [] ? complete+0x49/0x60
> [ 160.849713] [] ? usb_remote_wakeup+0x40/0x40
> [ 160.849720] [] kthread+0x78/0x80
> [ 160.849728] [] ? __init_kthread_worker+0x60/0x60
> [ 160.849736] [] kernel_thread_helper+0x6/0xd
> [ 160.849755]
> [ 160.849759] Restarting tasks ... done.
> [ 160.865733] power_supply BAT0: uevent
> [ 160.865737] power_supply BAT0: POWER_SUPPLY_NAME=BAT0
> [ 160.886551] power_supply BAT0: prop STATUS=Full
> [ 160.886562] power_supply BAT0: prop PRESENT=1
> [ 160.886570] power_supply BAT0: prop TECHNOLOGY=Unknown
> [ 160.886577] power_supply BAT0: prop CYCLE_COUNT=0
>
> I can supply full dmesg if needed.
> a bisect on this should not take too long, just need the time to do so.
>
> last good kernel I have here is: 3.2.0-06541-gf33180c
>

Freezing failure is a totally different problem. Freezing happens well
before CPUs are taken offline, and even before devices are suspended.
But yes, if freezing fails, suspend fails too (or rather, it is aborted).
Freezing failures are also typically a bit harder to trigger, since they
occur due to race conditions.

The suspend failure discussed earlier in this thread (while discussing
the MCE warnings), by contrast, is deterministic and very easily
reproducible.

Regards,
Srivatsa S. Bhat
IBM Linux Technology Center