From: Thomas Gleixner
To: Yicong Yang, Anup Patel
Cc: yang.yicong@picoheart.com, anup@brainfault.org, pjw@kernel.org,
    palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr,
    linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
    geshijian@picoheart.com, weidong.wd@picoheart.com,
    Greg Kroah-Hartman, "Rafael J. Wysocki", Danilo Krummrich
Subject: Re: [PATCH] irqchip/riscv-aplic: Register the driver prior to device creation
In-Reply-To: <7b859dd5-9262-4d68-9a8e-e0be0c24ac4a@picoheart.com>
References: <20260114063730.78009-1-yang.yicong@picoheart.com> <7b859dd5-9262-4d68-9a8e-e0be0c24ac4a@picoheart.com>
Date: Wed, 14 Jan 2026 20:50:32 +0100
Message-ID: <877btkht2v.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 14 2026 at 19:48, Yicong Yang wrote:
> On 1/14/26 4:57 PM, Anup Patel wrote:
>> On Wed, Jan 14, 2026 at 12:08 PM Yicong Yang wrote:
>>>
>>> On RISC-V the APLIC serves part of the GSI interrupts, but unlike
>>> other architecture it's initialized a bit late on ACPI based
>>> system:
>>> - the spec only mandates the report in DSDT (riscv-brs rule AML_100)
>>>   so the APLIC is created as platform_device when scanning DSDT
>>> - the driver is registered and initialize the device in device_initcall
>>>   stage
>>>
>>> The creation of devices depends on APLIC is deferred after the APLIC
>>> is initialized (when the driver calls acpi_dev_clear_dependencies),
>>> not like most other devices which is created when scanning the DSDT.
>>> The affected devices include those declare the dependency explicitly
>>> by ACPI _DEP method and _PRT for PCIe host bridge and those require
>>> their interrupts as GSI.
>>> Furthermore, the deferred creation is
>>> performed in an async way (queued in the system_dfl_wq workqueue)
>>> but all contend on the acpi_scan_lock.

The lock contention is irrelevant to the real underlying problem.

>>> Since the deferred device creation is asynchronous and will contend
>>> for the same lock, the order and timing is not certain. And the time
>>> is late enough for the device creation running parallel with the init
>>> task. This will lead to below issues (also observed on our platforms):
>>> - the console/tty device is created lately and sometimes it's not ready
>>>   when init task check for its presence. the system will crash in the
>>>   latter case since the init task always requires a valid console.
>>> - the root device will be probed and registered lately (e.g. NVME,
>>>   after the init task executed) and may run into the rescue shell if
>>>   root device is not found.

And again, you _cannot_ solve this problem completely with initcall
ordering. Deferred probing with delegation to work queues has the systemic
issue that there is no guarantee that all devices which are required to
actually proceed to userspace have been initialized at that point.

Changing the initcall priority of a particular driver papers over the
underlying problem to the extent that _you_ cannot observe it anymore, but
that provides exactly _zero_ guarantee that it is correct under all
circumstances. "Works for me" is the worst engineering principle, as you
might know already.

That said, I still refuse to take random initcall ordering patches unless
somebody comes up with a coherent explanation of the actual guarantee.

But before you start to come up with more fairy tales, let me come back to
your two points from above:

>>> - the console/tty device is created lately and sometimes it's not ready
>>>   when init task check for its presence. the system will crash in the
>>>   latter case since the init task always requires a valid console.
I assume you want to say that console_on_rootfs() fails to open
'/dev/console', right?

That's obvious because console_on_rootfs() is invoked _before_
async_synchronize_full(), which is the call that ensures that all
outstanding initialization work has been completed.

The fix for this is obvious too and it's therefore bloody obvious that
changing the initcall priority of a random driver does not fix that at
all, no?

But that's not sufficient, see below.

>>> - the root device will be probed and registered lately (e.g. NVME,
>>>   after the init task executed) and may run into the rescue shell if
>>>   root device is not found.

You completely fail to explain how outstanding initializations in work
queues survive past the async_synchronize_full() synchronization point.

You are merely describing random observations on your system, but you
stopped right there without trying to decode the underlying root cause.

The root cause is:

  1) As I already said above, deferred probing does not provide any
     guarantees at all.

  2) async_synchronize_full() is obviously not the barrier which it is
     supposed to be (the misplaced console_on_rootfs() call aside).

That needs to be fixed at the conceptual level and not hacked around with
"works for me" patches and fairy tale change logs.

Thanks,

        tglx
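
The two ordering problems named in the mail above can be illustrated with
a small user-space model. This is only a sketch under assumptions, not
kernel code: work_queue, defer(), flush() and boot_model() are hypothetical
names, the "tracked" queue stands in for whatever a barrier such as
async_synchronize_full() actually waits for, and the "untracked" queue
stands in for deferred work that the barrier does not cover. The model
says nothing about where the kernel really queues this work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy user-space model, NOT kernel code. Every name here is hypothetical
 * and only mirrors the roles discussed in the mail.
 */
#define MAX_WORK 8

typedef void (*work_fn)(void);

struct work_queue {
	work_fn items[MAX_WORK];
	size_t count;
};

struct boot_result {
	bool console_seen;	/* did the console check find a console? */
	bool rootdev_seen;	/* did the root device lookup succeed? */
};

static bool console_ready;
static bool rootdev_ready;

static void create_console(void) { console_ready = true; }
static void create_rootdev(void) { rootdev_ready = true; }

/* Defer a device creation: nothing runs until the queue is flushed. */
static void defer(struct work_queue *q, work_fn fn)
{
	assert(q->count < MAX_WORK);
	q->items[q->count++] = fn;
}

/* Run and drop everything that is queued on q. */
static void flush(struct work_queue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->items[i]();
	q->count = 0;
}

/*
 * check_console_late: do the console check after the barrier instead
 *                     of before it.
 * rootdev_tracked:    queue the root device on the queue the barrier
 *                     flushes instead of on the untracked one.
 */
struct boot_result boot_model(bool check_console_late, bool rootdev_tracked)
{
	struct work_queue tracked = { .count = 0 };
	struct work_queue untracked = { .count = 0 };
	struct boot_result res = { false, false };

	console_ready = false;
	rootdev_ready = false;

	/* Driver probe defers the device creation to work queues. */
	defer(&tracked, create_console);
	defer(rootdev_tracked ? &tracked : &untracked, create_rootdev);

	if (!check_console_late)
		res.console_seen = console_ready;	/* before the barrier */

	flush(&tracked);				/* the barrier */

	if (check_console_late)
		res.console_seen = console_ready;	/* after the barrier */

	/* Handoff to userspace: what is not ready now is missing. */
	res.rootdev_seen = rootdev_ready;

	flush(&untracked);	/* untracked work completes too late */
	return res;
}
```

In this model boot_model(false, false) sees neither device, and
boot_model(true, false) sees the console but still misses the root device.
Only boot_model(true, true) sees both, i.e. the check has to sit behind the
barrier and the barrier has to cover all deferred work. Reordering who
queues first changes none of these outcomes.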