From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from fout-a5-smtp.messagingengine.com (fout-a5-smtp.messagingengine.com [103.168.172.148]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DEADE359A97 for ; Fri, 6 Mar 2026 10:13:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.148 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772792006; cv=none; b=RyZ9LhGk1T3DnVVx5dCgDrT/iHVI0w8OMF79AOHpraORnAljqR4yVdgzZraimAKJq5CRfdlRLTWDrdDq1u5eZx5Wwu7P+t+3HtDw++ZiqHiKVzGE0rEdyXvz9/ND3HpSiUqEwS1AF2DGQaDWEJX/OI+B8sxNDWt+hp2Myu3gK/4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772792006; c=relaxed/simple; bh=8iUlFMCBwG1DcpDGPk0dI4+F9rsSor1oacl3ahu8vmE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=OKaU6prKn1UhOORSDeFQuGie7mbZsl9q4cXsO9t5IetGjomiz7PgjiiK9imGVindveKPy1Wzjfl5YEs3Szf0nUcoaNTfj2lSSorDF66LNh36rM+e/dZ043jLdZxRsrLgK1qZDuhGbt437p40/4bOLDuegb7MEEl5VoqEygiGH0Q= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=queasysnail.net; spf=pass smtp.mailfrom=queasysnail.net; dkim=pass (2048-bit key) header.d=queasysnail.net header.i=@queasysnail.net header.b=eeZETnzY; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=idn/Ra19; arc=none smtp.client-ip=103.168.172.148 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=queasysnail.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=queasysnail.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=queasysnail.net header.i=@queasysnail.net header.b="eeZETnzY"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="idn/Ra19" Received: from phl-compute-02.internal (phl-compute-02.internal [10.202.2.42]) by mailfout.phl.internal (Postfix) with ESMTP id F3F18EC0308; Fri, 6 Mar 2026 05:13:21 -0500 (EST) Received: from phl-frontend-04 ([10.202.2.163]) by phl-compute-02.internal (MEProxy); Fri, 06 Mar 2026 05:13:22 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=queasysnail.net; h=cc:cc:content-type:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to; s=fm2; t=1772792001; x= 1772878401; bh=i2BMcW6q5XjQq2zKyCDAwZpe2VVOKvd3c+Wqr5PNqnY=; b=e eZETnzYlqqiHGMfcT0gkgWe/PjiDot38OQKl4y1TZiE2byJ3XQ1TkEv1IBXFSnJP sp/8ULfI6sDn5xIzj6Vlt0HhalOPm0evPlG7VEt8XYV4B1dW4tW9rMtAyp4mOiHY rMbxaBLnWad2ajp8K9BQmRAXhVxwVCjEWU3XNnii+xglIBUiodRCoPbjg1mtM9RA iG9p4quPzymx0oV7MUmickTk0zEsEja1OlScgnFuj0O6V4YTQF8s+Ttdis9I3F80 5D1OIX+NV4i2J4NmMuGI3ZkfZ9sXyJTDBr/3MRJGS7t+6ZPDUXTQcuXtJkEFgQCO J2ez3bzTiWenzSoIlMMJQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t= 1772792001; x=1772878401; bh=i2BMcW6q5XjQq2zKyCDAwZpe2VVOKvd3c+W qr5PNqnY=; b=idn/Ra19p1hhii9uBw/Y8tKNoHayXeTZCv88fIhrwSI01IsQbGT UNn+QkcckilHBRiy/hhwILp0rPcg5OXzZieQGA0yK5R0pSJWspGLp73ac+BvNg4c ZLbTpXi0dvIj5Z+lBr71SZUW0C2pkNWicVXYAtBwAbWaCPt8oz+FJqFuQEpws30x r5Xk7JD19Lgwv9wHVIgB9N0DFemaL7KhgFtwBzmxgNl6iiI8S8uQcKEu4bw1ZD3I FF5SIgnzrtcL1XHbjyEDI4mbL7XSdar9aXl9tTKo9jQTkxUuvcOA4sc5AsW0lWj/ jnGH4lm8E2WhlngiO/oIBOPrTaB64aXbUZw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefgedrtddtgddvieeltddvucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehttdertd dttdejnecuhfhrohhmpefurggsrhhinhgrucffuhgsrhhotggruceoshgusehquhgvrghs hihsnhgrihhlrdhnvghtqeenucggtffrrghtthgvrhhnpeeuhffhfffgfffhfeeuieduge dtfefhkeegteehgeehieffgfeuvdeuffefgfduffenucevlhhushhtvghrufhiiigvpedt necurfgrrhgrmhepmhgrihhlfhhrohhmpehsugesqhhuvggrshihshhnrghilhdrnhgvth dpnhgspghrtghpthhtohepuddtpdhmohguvgepshhmthhpohhuthdprhgtphhtthhopehi mhhvgegsvghlsehgmhgrihhlrdgtohhmpdhrtghpthhtohepuggrvhgvmhesuggrvhgvmh hlohhfthdrnhgvthdprhgtphhtthhopegvughumhgriigvthesghhoohhglhgvrdgtohhm pdhrtghpthhtohepkhhusggrsehkvghrnhgvlhdrohhrghdprhgtphhtthhopehprggsvg hnihesrhgvughhrghtrdgtohhmpdhrtghpthhtohephhhorhhmsheskhgvrhhnvghlrdho rhhgpdhrtghpthhtohepjhhulhhirgdrlhgrfigrlhhlsehinhhrihgrrdhfrhdprhgtph htthhopehlihhnuhigsehtrhgvsghlihhgrdhorhhgpdhrtghpthhtohepnhgrthgvrdhk rghrshhtvghnshesghgrrhhmihhnrdgtohhm X-ME-Proxy: Feedback-ID: i934648bf:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 6 Mar 2026 05:13:21 -0500 (EST) Date: Fri, 6 Mar 2026 11:13:19 +0100 From: Sabrina Dubroca To: Hyunwoo Kim Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, horms@kernel.org, Julia.Lawall@inria.fr, linux@treblig.org, nate.karstens@garmin.com, netdev@vger.kernel.org Subject: Re: [PATCH net v2] strparser: Fix race condition in strp_done() Message-ID: References: Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: 2026-03-06, 09:11:04 +0900, Hyunwoo Kim wrote: > On Fri, Mar 06, 2026 at 12:35:48AM +0100, Sabrina Dubroca wrote: > > Sorry for the delay, I wanted to think about the race condition a bit > > more. > > > > 2026-03-03, 10:50:05 +0900, Hyunwoo Kim wrote: > > > On Tue, Mar 03, 2026 at 12:10:33AM +0100, Sabrina Dubroca wrote: > > > > 2026-02-27, 06:51:10 +0900, Hyunwoo Kim wrote: > > > > > On Mon, Feb 23, 2026 at 06:20:58PM +0100, Sabrina Dubroca wrote: > > > > > > 2026-02-20, 18:29:55 +0900, Hyunwoo Kim wrote: > > > > > > "strp stopped" is not really enough, I think we'd also need to reset > > > > > > the CBs, and then grab bh_lock_sock to make sure a previously-running > > > > > > ->sk_data_ready has completed. This is what kcm does, at least. > > > > > > > > > > It seems that this is not something that should be handled inside strp itself, > > > > > but rather something that each caller of strp_stop() is expected to take care > > > > > of individually. Would that be the right direction? > > > > > > > > Agree. > > > > > > > > > It also appears that ovpn and kcm handle this by implementing their own callback > > > > > restoration logic. > > > > > > > > Right. I tried to look at skmsg/psock (the other user of strp), but > > > > didn't get far enough to verify if it's handling this correctly. > > > > > > > > > > Without that, if strp_recv runs in parallel (not from strp->work) with > > > > > > strp_done, cleaning up skb_head in strp_done seems problematic. > > > > > > > > > > From the espintcp perspective, how about applying a patch along the following lines? > > > > > > > > This is what I was thinking about, yes. > > > > > > In my opinion, it might be cleaner to split the espintcp callback restoration work into > > > a separate patch, rather than merging it into the strparser v3 patch. What do you think? > > > > Sure. But once espintcp is fixed in that way, can the original race > > condition with strparser still occur? release_sock() will wait for any > > If the espintcp callback restoration patch is applied, the strparser > race should no longer occur in espintcp. > > > espintcp_data_ready()/strp_data_ready() that's already running, and a > > sk_data_ready that starts after we've changed the callbacks will not > > end up in strp_data_ready() at all so it won't restart the works that > > are being stopped by strp_done()? > > > > It's quite reasonable to use disable*_work_sync in strp_done, but I'm > > not sure there's a bug other than espintcp not terminating itself > > correctly on the socket. > > That said, the _cancel APIs in strparser still appear to carry some > structural risk, so it might still make sense to switch to the _disable > APIs for the benefit of other strp users or potential future callers. Not really. Every user of strp that is open to the strp_recv vs cancel_* race is also open to the strp_recv vs free race, so switching from cancel_* to disable_* is only a partial fix. But if we took and released the socket lock in strp_done, we would solve the issue for all users even without resetting the callbacks? @@ -503,6 +503,10 @@ void strp_done(struct strparser *strp) { WARN_ON(!strp->stopped); + lock_sock(strp->sk); + /* sync with other code */ + release_sock(strp->sk); + cancel_delayed_work_sync(&strp->msg_timer_work); cancel_work_sync(&strp->work); - strp->stopped so any new call into strp_data_ready will not do anything - lock/release need to take bh_lock_sock so any existing call to strp_data_ready will have to complete before we move on to cancel*_work Or maybe the requirement should be that strp_stop has to be called under lock_sock() (or even just bh_lock_sock), but again I can't figure out if that's ok for sockmap. > With that in mind, perhaps the direct fix for this race could be handled > in the espintcp callback restoration patch. For the strparser patch, I > could instead adjust the commit message to reflect that it removes a > potential hazard by replacing the _cancel APIs with the _disable > variants, and resubmit it in that form. I'm not going to nack a patch doing s/cancel_/disable_/ in strp_done, but it doesn't fully solve the race condition if the caller isn't doing the right thing, and it doesn't do anything if the strp user is handling the teardown correctly. -- Sabrina