From: Francis Moreau
Date: Sun, 24 Nov 2013 10:42:16 +0100
To: "Rafael J. Wysocki"
CC: Jingoo Han, "'Borislav Petkov'", "'Wei WANG'", "'LKML'", "'Thomas Gleixner'", "'Samuel Ortiz'", "'Chris Ball'"
Subject: Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Message-ID: <5291C9F8.5090105@gmail.com>
In-Reply-To: <2363041.q8n2SXViVK@vostro.rjw.lan>
References: <20131117195358.GO27323@pd.tnic> <24318767.PoCLxGCZH4@vostro.rjw.lan> <528FCE57.2000805@gmail.com> <2363041.q8n2SXViVK@vostro.rjw.lan>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello Rafael,

On 11/22/2013 11:08 PM, Rafael J. Wysocki wrote:
> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>> On 11/22/2013 01:54 PM, Rafael J. Wysocki wrote:
>>> On Friday, November 22, 2013 10:57:25 AM Francis Moreau wrote:
>>>> On 22/11/2013 08:43, Francis Moreau wrote:
>>>>> On 21/11/2013 12:17, Jingoo Han wrote:
>>>>> [...]
>>>>>>>
>>>>>>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>>>>>>> and those changes match the issue I'm facing:
>>>>>>>>
>>>>>>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>>>>>>> 09fd867 mfd: rtsx: Copyright modifications
>>>>>>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>>>>>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
>>>>>>>> extra_init_hw
>>>>>>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>>>>>>> 773ccdf mfd: rtsx: Read vendor setting from config space
>>>>>>
>>>>>> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in the Realtek PCIe card
>>>>>> reader driver may make the kernel panic.
>>>>>>
>>>>>> I think that the commit "mfd: rtsx: Configure to enter a deeper
>>>>>> power-saving mode in S3" may be the culprit.
>>>>>
>>>>> Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
>>>>> also reverted 7140812 and 5947c16, but it didn't improve anything.
>>>>>
>>>>> The good news is that I managed to get a "light" kernel configuration
>>>>> which is faster to build and, more importantly, it seems that the bug is
>>>>> almost 100% reproducible now.
>>>>>
>>>>> So I'll try to do another git-bisect session later.
>>>>
>>>> So after bisecting the v3.11..v3.12 range, git bisect told me:
>>>>
>>>> the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0
>>>>
>>>> Merge branch 'acpi-processor'
>>>>
>>>> * acpi-processor:
>>>>   ACPI / processor: Acquire writer lock to update CPU maps
>>>>   ACPI / processor: Remove acpi_processor_get_limit_info()
>>>>
>>>> The two commits brought in by the merge are not the culprits, because
>>>> resetting HEAD to "ACPI / processor: Acquire writer lock to update CPU
>>>> maps" doesn't show the issue anymore.
>>>>
>>>> At that point I'm not sure how to bisect further.
>>>
>>> Does the second parent of this merge (that is, 8462d9df9d50) have the problem?
>>>
>>
>> Yes it does.
>>
>> Ok, I've finally managed to find out the bad commit:
>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>> over system PM transitions
>>
>> I verified that the parent commit doesn't have the problem.
>
> Interesting.
>
>> Rafael, you're the man now ;)
>
> I kind of don't see how that commit may result in behavior that you
> described earlier in the thread.
>
> You get a memory corruption that seems to have started to happen because
> we're holding an additional lock over suspend resume now. Something's fishy
> on that machine and we need to figure out what it is.
>
> Please file a bug at bugzilla.kernel.org against ACPI and assign it to me.
> Please put all of the relevant info in there and attach the output of dmesg
> after a fresh boot and the output of acpidump from the affected machine to
> the bug entry.
>

I just sent a new trace with DEBUG_OBJECTS enabled which seems to give some
interesting traces. If nothing can be found from them, I'll do the bug report.

Thanks.
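
For readers following the bisection steps quoted above, here is a minimal sketch of the "test each parent of the merge, then bisect the bad side" approach. It runs against a throwaway repository, not the kernel tree. The repository layout, the commit messages, and the `state` file (a stand-in for "does the machine panic on resume?") are all hypothetical and chosen only for illustration; on the real tree each test would be a build, boot, and suspend/resume cycle done by hand.

```shell
#!/bin/sh
# Toy illustration only: a scratch repository stands in for the kernel tree,
# and the content of the file "state" stands in for the suspend/resume test.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: commit <message> <file> <content>
commit() {
    echo "$3" > "$2"
    git add "$2"
    git commit -q -m "$1"
}

commit "base" state good
base=$(git rev-parse HEAD)
main_branch=$(git symbolic-ref --short HEAD)

# Side branch: the second commit introduces the (simulated) bug.
git checkout -q -b side
commit "side 1" side.txt one
commit "side 2 (introduces bug)" state bad
commit "side 3" side.txt three

# Mainline moves on, then merges the side branch.
git checkout -q "$main_branch"
commit "main 1" main.txt one
git merge -q --no-ff -m "Merge branch 'side'" side
merge=$(git rev-parse HEAD)

# Step 1: test each parent of the merge separately.
for parent in "$merge^1" "$merge^2"; do
    git checkout -q "$parent"
    if grep -q good state; then verdict=good; else verdict=bad; fi
    echo "$(git log -1 --format=%s): $verdict"
done

# Step 2: the second parent is bad, so bisect between it and the common base.
git bisect start "$merge^2" "$base" >/dev/null
git bisect run grep -q good state >/dev/null
first_bad=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $first_bad"
git bisect reset >/dev/null 2>&1
```

In this toy setup the first parent tests good and the second parent tests bad, which is the situation described in the thread once 8462d9df9d50 was checked. Bisecting from the second parent back to the common ancestor then narrows the search to the side branch instead of stopping at the merge commit.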