Subject: Re: Crash in acpi_ns_validate_handle triggered by soundwire on Linux 5.10
To: Marcin Ślusarz
Cc: "moderated list:SOUND - SOC LAYER / DYNAMIC AUDIO POWER MANAGEM...", "Rafael J. Wysocki", "Rafael J. Wysocki", ACPI Devel Maling List, Vinod Koul, Bard Liao, Len Brown
References: <1f0f7273-597e-cdf0-87d1-908e56c13133@linux.intel.com> <1dc2639a-ecbc-c554-eaf6-930256dcda96@linux.intel.com>
From: Pierre-Louis Bossart
Message-ID: <709fa03c-43b7-45e4-3ddc-aae0d8f4ced4@linux.intel.com>
Date: Wed, 27 Jan 2021 16:02:29 -0600
X-Mailing-List: linux-acpi@vger.kernel.org

On 1/27/21 1:18 PM, Marcin Ślusarz wrote:
> On Wed, 27 Jan 2021 at 18:28, Pierre-Louis Bossart wrote:
>>> Weird, I can't reproduce this problem with my self-compiled kernel :/
>>> I don't even see soundwire modules loaded in. Manually loading them of course
>>> doesn't do much.
>>>
>>> Previously I could boot into the "faulty" kernel by using "recovery mode", but
>>> I can't do that anymore - it crashes too.
>>>
>>> Maybe there's some kind of race and this bug depends on some specific
>>> ordering of events?
>>
>> missing Kconfig?
>> You need CONFIG_SOUNDWIRE and CONFIG_SND_SOC_SOF_INTEL_SOUNDWIRE
>> selected to enter this sdw_intel_acpi_scan() routine.
>
> It was a PEBKAC, but a slightly different one. I won't bore you with
> (embarrassing) details ;).
>
> I reproduced the problem, tested both your and Rafael's patches
> and the kernel still crashes, with the same stack trace.
> (Yes, I'm sure I booted the right kernel :)
>
> Why "recovery mode" stopped working (or worked previously) is still a mystery.
>

Thanks Marcin for the information. If you have a consistent failure
that's better to some extent.

Maybe a bit of explanation of what this routine tries to do: when
SoundWire is enabled in a system, we need to have the following pattern
in the DSDT:

    Scope (_SB.PCI0)
    {
        Device (HDAS)
        {
            Name (_ADR, 0x001F0003)  // _ADR: Address
        }

        Scope (HDAS)
        {
            Device (SNDW)
            {
                Name (_ADR, 0x40000000)  // _ADR: Address

The only thing the code does is to walk through the children and check
if the valid _ADR 0x40000000 is found. You don't have SoundWire in your
device so there should not be any children found. I don't see anything
in the DSDT that looks like _SB.PCI0.HDAS., so in theory we should not
even enter the callback.

The error happens in acpi_bus_get_device(), after we read the adr but
before we check it, so I am wondering if we shouldn't reverse the order
of the checks. Can you try the diff below? I am not sure why there is a
crash and we should root-cause this issue, I am just trying to
triangulate what is happening.

diff --git a/drivers/soundwire/intel_init.c b/drivers/soundwire/intel_init.c
index cabdadb09a1b..6bc87a682fb3 100644
--- a/drivers/soundwire/intel_init.c
+++ b/drivers/soundwire/intel_init.c
@@ -369,13 +369,6 @@ static acpi_status sdw_intel_acpi_cb(acpi_handle handle, u32 level,
 	if (ACPI_FAILURE(status))
 		return AE_OK; /* keep going */
 
-	if (acpi_bus_get_device(handle, &adev)) {
-		pr_err("%s: Couldn't find ACPI handle\n", __func__);
-		return AE_NOT_FOUND;
-	}
-
-	info->handle = handle;
-
 	/*
 	 * On some Intel platforms, multiple children of the HDAS
 	 * device can be found, but only one of them is the SoundWire
@@ -386,6 +379,13 @@ static acpi_status sdw_intel_acpi_cb(acpi_handle handle, u32 level,
 	if (FIELD_GET(GENMASK(31, 28), adr) != SDW_LINK_TYPE)
 		return AE_OK; /* keep going */
 
+	if (acpi_bus_get_device(handle, &adev)) {
+		pr_err("%s: Couldn't find ACPI handle\n", __func__);
+		return AE_NOT_FOUND;
+	}
+
+	info->handle = handle;
+
 	/* device found, stop namespace walk */
 	return AE_CTRL_TERMINATE;
 }
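
For reference, the _ADR test that the callback applies (the FIELD_GET(GENMASK(31, 28), adr) comparison in the diff above) can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the driver code: adr_link_type() and is_sdw_link() are hypothetical helper names, and the value 4 for SDW_LINK_TYPE is inferred from the 0x40000000 address in the DSDT pattern, where bits 31..28 equal 4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: bits 31..28 of the expected _ADR 0x40000000 are 4. */
#define SDW_LINK_TYPE 4u

/*
 * Hypothetical helper: extract bits 31..28 of an _ADR value, i.e. the
 * field the callback reads with FIELD_GET(GENMASK(31, 28), adr).
 */
static inline uint32_t adr_link_type(uint64_t adr)
{
	return (uint32_t)((adr >> 28) & 0xFu);
}

/* Hypothetical helper: true when the _ADR describes a SoundWire link. */
static inline bool is_sdw_link(uint64_t adr)
{
	return adr_link_type(adr) == SDW_LINK_TYPE;
}
```

With these helpers, is_sdw_link(0x40000000) is true for the SNDW device, while an address such as 0x001F0003 gives false. Testing the address first, as the reordered diff does, means a non-matching node is skipped before its handle is passed to acpi_bus_get_device().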