Message-ID: <0b2fc70d8fae566c8ca43bafc929e2bd19725924.camel@mailbox.org>
Subject: Re: [PATCH 1/2] dma-fence: Rename dma_fence_is_signaled()
From: Philipp Stanner
Reply-To: phasta@kernel.org
To: Christian König, phasta@kernel.org, Boris Brezillon
Cc: Sumit Semwal, Gustavo Padovan, Felix Kuehling, Alex Deucher,
	Xinhui Pan, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Lucas Stach, Russell King,
	Christian Gmeiner, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Tvrtko Ursulin, Frank Binns, Matt Coster, Qiang Yu, Rob Clark,
	Sean Paul, Konrad Dybcio, Abhinav Kumar, Dmitry Baryshkov,
	Marijn Suijten, Lyude Paul, Danilo Krummrich, Rob Herring,
	Steven Price, Dave Airlie, Gerd Hoffmann, Matthew Brost, Huang Rui,
	Matthew Auld, Melissa Wen, Maíra Canal, Zack Rusin,
	Broadcom internal kernel review list, Lucas De Marchi,
	Thomas Hellström, Bas Nieuwenhuizen, Yang Wang, Jesse Zhang,
	Tim Huang, Sathishkumar S, Saleemkhan Jamadar, Sunil Khatri,
	Lijo Lazar, Hawking Zhang, Ma Jun, Yunxiang Li, Eric Huang,
	Asad Kamal, Srinivasan Shanmugam, Jack Xiao, Friedrich Vock,
	Michel Dänzer, Geert Uytterhoeven, Anna-Maria Behnsen,
	Thomas Gleixner, Frederic Weisbecker, Dan Carpenter,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org, linux-kernel@vger.kernel.org,
	amd-gfx@lists.freedesktop.org, etnaviv@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org, lima@lists.freedesktop.org,
	linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org,
	nouveau@lists.freedesktop.org, virtualization@lists.linux.dev,
	spice-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org
Date: Wed, 09 Apr 2025 17:04:00 +0200
In-Reply-To: <334e843c-d7fe-4e33-b4fc-f3d18226465a@amd.com>
References: <20250409120640.106408-2-phasta@kernel.org>
	<20250409120640.106408-3-phasta@kernel.org>
	<20250409143917.31303d22@collabora.com>
	<73d41cd84c73b296789b654e45125bfce88e0dbf.camel@mailbox.org>
	<72eb974dfea8fa1167cf97e29848672223f6fc5b.camel@mailbox.org>
	<9a90f7f14c22c01aa28d89aa91bf4dfa4049c062.camel@mailbox.org>
	<334e843c-d7fe-4e33-b4fc-f3d18226465a@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2025-04-09 at 16:10 +0200, Christian König wrote:
> On 09.04.25 at 16:01, Philipp Stanner wrote:
> > On Wed, 2025-04-09 at 15:14 +0200, Christian König wrote:
> > > On 09.04.25 at 14:56, Philipp Stanner wrote:
> > > > On Wed, 2025-04-09 at 14:51 +0200, Philipp Stanner wrote:
> > > > > On Wed, 2025-04-09 at 14:39 +0200, Boris Brezillon wrote:
> > > > > > Hi Philipp,
> > > > > >
> > > > > > On Wed, 9 Apr 2025 14:06:37 +0200
> > > > > > Philipp Stanner wrote:
> > > > > >
> > > > > > > dma_fence_is_signaled()'s name strongly reads as if this
> > > > > > > function were intended for
> > > > > > > checking whether a fence is already signaled. Also the
> > > > > > > boolean it returns hints at that.
> > > > > > >
> > > > > > > The function's behavior, however, is more complex: it can
> > > > > > > check with a driver callback whether the hardware's
> > > > > > > sequence number indicates that the fence can already be
> > > > > > > treated as signaled, although the hardware's / driver's
> > > > > > > interrupt handler has not signaled it yet. If that's the
> > > > > > > case, the function also signals the fence.
> > > > > > >
> > > > > > > (Presumably) this has caused a bug in Nouveau (unknown
> > > > > > > commit), where nouveau_fence_done() uses the function to
> > > > > > > check a fence, which causes a race.
> > > > > > >
> > > > > > > Give the function a more obvious name.
> > > > > > This is just my personal view on this, but I find the new
> > > > > > name just as confusing as the old one. It sounds like
> > > > > > something is checked, but it's not clear what, and then the
> > > > > > fence is forcibly signaled like it would be if you call
> > > > > > drm_fence_signal(). Of course, this is clarified by the doc,
> > > > > > but given the goal was to make the function name clearly
> > > > > > reflect what it does, I'm not convinced it's significantly
> > > > > > better.
> > > > > >
> > > > > > Maybe dma_fence_check_hw_state_and_propagate(), though it
> > > > > > might be too long of a name. Oh well, feel free to ignore
> > > > > > these comments if a majority is fine with the new name.
> > > > > Yeah, the name isn't perfect (the perfect name describing the
> > > > > whole behavior would be
> > > > > dma_fence_check_if_already_signaled_then_check_hardware_state_and_propagate()) ^^'
> > > > >
> > > > > My intention here is to have the reader realize "watch out,
> > > > > the fence might get signaled here!", which is probably the
> > > > > most important event regarding fences, which can race, invoke
> > > > > the callbacks and so on.
> > > > >
> > > > > For details readers will then check the documentation.
> > > > >
> > > > > But I'm of course open to see if there's a majority for this
> > > > > or that name.
> > > > how about:
> > > >
> > > > dma_fence_check_hw_and_signal() ?
> > > I don't think that renaming the function is a good idea in the
> > > first place.
> > >
> > > What the function does internally is an implementation detail of
> > > the framework.
> > >
> > > For the code using this function it's completely irrelevant if
> > > the function might also signal the fence, what matters for the
> > > caller is the returned status of the fence. I think this also
> > > counts for the dma_fence_is_signaled() documentation.
> > It does obviously matter. As it's currently implemented, a lot of
> > important things happen implicitly.
>
> Yeah, but that's ok.
>
> The code that calls this is the consumer of the interface and so
> shouldn't need to know this. That's why we have created the DMA fence
> framework in the first place.
>
> For the provider side when a driver or similar implements the
> interface the relevant documentation is the dma_fence_ops structure.
>
> > I only see improvement by making things more obvious.
> >
> > In any case, how would you call a wrapper that just does
> > test_bit(IS_SIGNALED, …) ?
>
> Broken, that was very intentionally removed quite shortly after we
> created the framework.
>
> We have a few cases where implementations do check that for their
> fences, but consumers should never be allowed to touch such
> internals.

There is theory and there is practice. In practice, those internals are
being used by Nouveau, i915, Xe, vmwgfx and radeon.

So it seems that we failed quite a bit at communicating clearly how the
interface should be used.

And, to repeat myself, with both the name and the documentation of that
function, I think it is very easy to misunderstand what it's doing. You
say that it shouldn't matter - and maybe that's true, in theory. In
practice, it does matter. In practice, APIs get misused and have
side-effects. And making that harder is desirable.

In any case, I might have to add another such call to Nouveau, because
the solution preferred by you over the callback causes another race.
Certainly one could solve this in a clean way, but someone has to do
the work, and we're talking about more than a few hours here.

In any case, be so kind and look at patch 2 and tell me there if you're
at least OK with making the documentation more detailed.

P.

>
> Regards,
> Christian.
>
> >
> > P.
> >
> > > What we should improve is the documentation of the
> > > dma_fence_ops->enable_signaling and dma_fence_ops->signaled
> > > callbacks.
> > > Especially see the comment about reference counts on
> > > enable_signaling which is missing on the signaled callback. That
> > > is most likely the root cause why nouveau implemented
> > > enable_signaling correctly but not the other one.
> > >
> > > But putting that aside I think we should make nails with heads
> > > and let the framework guarantee that the fences stay alive until
> > > they are signaled (one way or another).
> > > This completely removes the burden to keep a reference on
> > > unsignaled fences from the drivers / implementations and makes
> > > things overall more defensive.
> > >
> > > Regards,
> > > Christian.
> > >
> > > > P.
> > > >
> > > > > P.
> > > > >
> > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Boris
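
For readers following the thread, the distinction being argued about (a pure flag test versus a query that may signal the fence as a side effect) can be illustrated with a small userspace model. This is a sketch only, not kernel code: every toy_* name is hypothetical, the boolean field merely stands in for the flag the thread calls IS_SIGNALED, and the sequence-number comparison is an assumed stand-in for a driver's signaled callback.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct toy_fence;

struct toy_fence_ops {
	/* Optional: ask the "hardware" whether the fence has completed. */
	bool (*signaled)(struct toy_fence *f);
};

struct toy_fence {
	const struct toy_fence_ops *ops;
	bool signaled_flag;     /* stands in for the IS_SIGNALED flag bit */
	unsigned int seqno;     /* sequence number this fence waits for */
	unsigned int hw_seqno;  /* last sequence number the "hardware" finished */
	int callbacks_run;      /* how many times signaling ran the callbacks */
};

/* Signaling sets the flag and runs the callbacks at most once. */
static void toy_fence_signal(struct toy_fence *f)
{
	if (f->signaled_flag)
		return;
	f->signaled_flag = true;
	f->callbacks_run++;
}

/* Pure query: only reads the flag and never changes the fence. */
static bool toy_fence_test_flag(const struct toy_fence *f)
{
	return f->signaled_flag;
}

/* Query with a side effect: may signal the fence before returning. */
static bool toy_fence_is_signaled(struct toy_fence *f)
{
	if (f->signaled_flag)
		return true;
	if (f->ops->signaled && f->ops->signaled(f)) {
		toy_fence_signal(f);
		return true;
	}
	return false;
}

/* Assumed driver callback: done once the hardware reached our seqno. */
static bool toy_hw_signaled(struct toy_fence *f)
{
	return f->hw_seqno >= f->seqno;
}

static const struct toy_fence_ops toy_ops = { .signaled = toy_hw_signaled };

void toy_demo(void)
{
	struct toy_fence f = { .ops = &toy_ops, .seqno = 5, .hw_seqno = 4 };

	assert(!toy_fence_is_signaled(&f));  /* hardware not there yet */
	f.hw_seqno = 5;                      /* hardware finishes, no interrupt */
	assert(!toy_fence_test_flag(&f));    /* the flag alone still says "no" */
	assert(toy_fence_is_signaled(&f));   /* this call signals the fence */
	assert(toy_fence_test_flag(&f));     /* now the flag is set */
	assert(f.callbacks_run == 1);        /* callbacks ran inside the query */
}
```

In this model, toy_fence_test_flag() and toy_fence_is_signaled() disagree in the window after the hardware has finished but before anything has signaled the fence, and the second one closes that window by running the callbacks itself. That is the implicit behavior the naming discussion above is about.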