From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 764D838FF13 for ; Tue, 13 Jan 2026 13:34:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1768311277; cv=none; b=KzMJUOCuoFNgm8r2fx+Ed1Z3VxrUbRtl6tKDwzI7poo0P8FhPUmuMqpZXu0+40Z5qyl2Ah9WSghvrBPz8JyW3Id0MacP5vL0oPkyc+cl/l+TmyHaholNCSVRrXnIWTFu+X8AWW4d1+sza5d+eFvl41WjQjjyIxFF1yPg0ADGQoI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1768311277; c=relaxed/simple; bh=NhC293pqk1dwPGJ+i0G37CPjxpqW2rP+jihzJHKUlmo=; h=Mime-Version:Content-Type:Date:Message-Id:Subject:Cc:To:From: References:In-Reply-To; b=SgVOf6zXFEwq0pmeQbIuOY6JHijZMPPt3Ay2opPlCMcXyIZRNOFv178cyH5EnBSydMpYF43eupxaw3aVlW5hE26DRqHZGLfij1MUQBn8K+gn0SdwO5nAUusW4ttsplHW1ul6YgfpuomjfJPvw5u5zgKpjZSUUY5fZzmiMM9LOAg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=knBYPDMM; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="knBYPDMM" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 90647C16AAE; Tue, 13 Jan 2026 13:34:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1768311277; bh=NhC293pqk1dwPGJ+i0G37CPjxpqW2rP+jihzJHKUlmo=; h=Date:Subject:Cc:To:From:References:In-Reply-To:From; b=knBYPDMMTcd13OowmVSpPBdrLRBmdnND6u8S5aNRa13+6YbcC3zOJ2Fqvs1963u+P qPQdoQb+xo9mE1cCHkgSTncPcJZQA5P3xlTtGTN/oFJAGoIAdT0/cJCW4lnSj1JPIN Ioi2AUZChr4+lvN3R+BpT6aIIFUx0ePJmsSWugSGxO3g/gzWxX9GjB1Se7tMf68Dub gSNK2O626mn8V+braDdz/xRaW8QO/y1F4keebFsDi1qJl5GiC+t2CwmX627nSFiQ9u 5RYxp80KltvmPeAKVWzj2qEDgcraN81o0/I4b2aKbLsGY/Jlnvxgkz7K+rKshuvcYS crQt/f5iCJnYQ== Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=UTF-8 Date: Tue, 13 Jan 2026 14:34:33 +0100 Message-Id: Subject: Re: [PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device() Cc: , , , , "Qiu-ji Chen" , <2045gemini@gmail.com> To: "Gui-Dong Han" From: "Danilo Krummrich" References: <20251127145753.13080-1-hanguidong02@gmail.com> In-Reply-To: On Tue Jan 13, 2026 at 1:42 PM CET, Gui-Dong Han wrote: > On Tue, Jan 13, 2026 at 5:55=E2=80=AFPM Danilo Krummrich wrote: >> >> On Thu Nov 27, 2025 at 3:57 PM CET, Gui-Dong Han wrote: >> > diff --git a/drivers/base/base.h b/drivers/base/base.h >> > index 86fa7fbb3548..72791125de91 100644 >> > --- a/drivers/base/base.h >> > +++ b/drivers/base/base.h >> > @@ -166,6 +166,9 @@ void device_set_deferred_probe_reason(const struct= device *dev, struct va_format >> > static inline int driver_match_device(const struct device_driver *drv= , >> > struct device *dev) >> > { >> > + /* Protects against driver_set_override() races */ >> > + device_lock_assert(dev); >> > + >> > return drv->bus->match ? drv->bus->match(dev, drv) : 1; >> > } >> >> I am not convinced that this is the correct fix, since >> >> 1. Not all match() callbacks access the driver_override field, >> >> 2. driver_override is accessed in other places as well, >> >> 3. driver_override is a bus device specific field (with a common >> helper admittedly). >> >> I think it would be better to make driver_override a field in the base >> struct device. This way we can not only provide driver_set_override(), b= ut also >> driver_get_override(), which should contain the device_lock_assert() ins= tead. > > Hi Danilo, > > Thanks for the feedback. While I agree that moving driver_override to > struct device is a good idea for future refactoring, I believe this > patch is necessary as the immediate fix due to existing locking > constraints. > > The main constraint is the device_lock. Currently, > driver_match_device() is called from three paths. One of them > (__device_attach_driver) already holds device_lock to protect the > whole binding process. We cannot add locking inside the bus match() > callbacks (or after refactoring, acquiring the lock to call a > hypothetical getter inside match) because it would deadlock this > existing path. Since we cannot remove the lock from > __device_attach_driver, the only safe solution is to ensure the lock > is held at the other two call sites. Moving the field to struct device > would not change this locking asymmetry. The device_lock is a coarse > lock intended for correctness here, and contention is low. Don't get me wrong, I'm not against providing the guarantee that the match(= ) callbacks are always called with the device lock held, I think we should do that. But I do not agree that this is *properly* fixing locking around the driver_override field by itself. > I understand you might be concerned about performance. I tested this > using ftrace. driver_match_device() is a cold path (called ~600 times > during boot in my setup). I observed no measurable boot time > regression. After boot, it is rarely called (e.g., module loading). So > I think the impact is negligible. I'm not concerned about performance at all. I'm concerned about locking cod= e paths instead of locking data. > Regarding other access sites: driver_override is specifically designed > for matching. Apart from sysfs (which I am fixing separately [1]), I > have not found other places that require locking. Other accesses occur > mostly during init/cleanup (e.g., pci_device_probe/pci_release_dev) > where concurrency is not an issue. If there are other specific race > conditions, we can fix them individually as they are not universal. In > contrast, the match() path is the critical concurrent path where the > feature is actually implemented, affecting over 10 buses. Again, the problem is not that match() must be called with the device lock = held because of driver_set_override(). The problem is that currently match() is only *sometimes* called with the d= evice lock held, i.e. it would also work if match() is never called with the devi= ce lock held. So, there are two separate issues: fix the inconsistent locking guarantee o= f match() callbacks and locking design of driver_override. Regarding the latter, it is odd that the driver_override field belongs to t= he bus device structure, which ultimately makes the bus responsible to ensure mutual exclusion, while the driver-core helper driver_set_override() assume= s that the field is protected with the device lock. So, either we leave it to the bus entirely, then the driver_set_override() helper should take a lock pointer provided by the bus or we define that driver_override is protected with the device lock, but then we should move = it to struct device and provide an accessor with proper documentation of the lock= ing rules. Having that said, I think this patch should focus on fixing the inconsisten= t locking locking guarantee of the match() callback. But it should not justif= y that driver_match_device() is called with the device lock held with "Protec= ts against driver_set_override() races". > Even if we refactor (e.g., adding dev_access_driver_override), we > would still need to hold the lock at these call sites for the reasons > above. Sure, but then we have a lockdep assert and the rules are properly document= ed.