From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6FFFE3B27FF for ; Wed, 11 Mar 2026 07:36:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773214600; cv=none; b=cLSQ5+3q7ey5ABc/EOeUhihrb0uqwiJcpuYB5jsms9Vpxyl0CBmrKvdl0fq5zYYtK/TUuzw/oLFWl3X65InYGtMRVEo4yWXiP5jY3kZk5Aqy6EdA6Wi7WKAdzOIEOwAFQGQ11PDRh7xQJRb3ktHpB1WI5YH32EO2nUfTf5qcp5I= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773214600; c=relaxed/simple; bh=HcCA86If4wMPDKdIAoMLlvJ612HBDwwIc90rcnzJwNI=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=t3mmVqEV19i5RzS+37FKOVVkraOpZTJyyPCRVsalfNJEkpDbXbkEkJ5P8sJQfmrSKt5Gp/wQ+BgvxuMwwD2p9M18xZHL7IBByzQHDzzeASY8OnB7QkmWzIzpUEfi4cYNXlFgS6qvdG49hnpVUNygrwmfIFmRz8KCxAKPElR1bH4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MQteAyR9; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MQteAyR9" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 23AC0C4CEF7; Wed, 11 Mar 2026 07:36:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773214599; bh=HcCA86If4wMPDKdIAoMLlvJ612HBDwwIc90rcnzJwNI=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=MQteAyR9G5imXHR3ZpRw4vBPOx29pmZPQplR5qh0oIqnZGtYv7Ya+df0ECw70+b7G IusGbtgVeF3MVcKPgHFitNHIQCiUeG575GN3ePgbjTgD8hZODdO3LNeQV/wm8+pfQ2 rkUxYUBIQ05AtI69hG4DlTl29m/xS5PFRbgEIBquYMypFQn98/CHlr+/Zv4z7V9hE8 O6uAWams+6y75vZdqzaQFot3s0qYEb2JT1rV4V1/71EpmSQsU6UJDlU1uQc6uf3hzc X+Aa1t7OmKIU8Sjq4xE46HosRKVs6Ipd5qTSCPNRddmUWFCNDB8X3ardARiecABH9H IokssWnbu7YBw== Date: Wed, 11 Mar 2026 09:36:35 +0200 From: Leon Romanovsky To: Sabrina Dubroca Cc: Simon Horman , Florian Westphal , netdev@vger.kernel.org, Steffen Klassert , Herbert Xu Subject: Re: [PATCH ipsec-next 01/10] xfrm: state: fix sparse warnings on xfrm_state_hold_rcu Message-ID: <20260311073635.GR12611@unreal> References: <7388df7238672a92be0e4048f0225e6db294e736.1773051558.git.sd@queasysnail.net> <20260310103135.GB12611@unreal> <20260310182012.GF12611@unreal> <20260310194500.GO12611@unreal> <20260310201021.GP12611@unreal> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: On Tue, Mar 10, 2026 at 10:41:46PM +0100, Sabrina Dubroca wrote: > (adding Simon: maybe you can help with the "comment wording" issue here :)) > > 2026-03-10, 22:10:21 +0200, Leon Romanovsky wrote: > > On Tue, Mar 10, 2026 at 08:49:44PM +0100, Florian Westphal wrote: > > > Leon Romanovsky wrote: > > > > > You can only use xfrm_state_hold() if the refcount is already > 0. > > > > > xfrm_state_hold uses refcount_inc(), so you get a UaF warn splat > > > > > if this assuption doesn't hold true. > > > > > > > > I know it, the thing that bothers me is that it is unclear how > > > > xfrm_state_hold_rcu() can have refcount equal to 0. > > > > > > > > xfrm_state_put() decreases refcount and when it is zero, it calls > > > > to __xfrm_state_destroy(). The latter assumes that the state was > > > > already removed from various hlists. > > > > > > Yes, insertion in the table means refcount is 1, but userspace > > > can zap states at any time, e.g.: > > > > > > xfrm_del_sa -> xfrm_state_delete -> __xfrm_state_delete (which > > > unlinks from hash lists). > > > > > > The last xfrm_state_put() in that function may cause 1 -> 0 > > > transition. Parallel lookup can still observe that state, > > > so it has to pretend it wasn't there to begin with. > > Thanks Florian. > > > > Yes, this is possible scenario and this is what is worth to document. > > > We could add something like: > > /* Take a reference to @x, when we know the state has a refcount >= 1. > * In this case, we can avoid refcount_inc_not_zero and the error > * handling it requires. > * In contexts where concurrent state deletion is possible and we > * don't already hold a reference to that state, xfrm_state_hold_rcu > * must be used. > */ > > Though it may not make much sense to refer to xfrm_state_hold_rcu > (implemented in net/xfrm/xfrm_state.c) from include/net/xfrm.h. > > And if we consider the hashtables to be private to > net/xfrm/xfrm_state.c, nothing outside of it should ever see a state > with refcount=0, since they will only ever see states that already > have one reference held by whatever gave them the pointer. > > So maybe it's more xfrm_state_hold_rcu that needs a mention of > "concurrent state deletion could bring the refcount to 0 while we're > doing the lookup"? I don't know, for me it's pretty obvious with the > _rcu suffix that RCU -> unlocked -> could be deleted in parallel. It was the case when xfrm_state had __rcu marker, it is less obvious now. I agree with you that xfrm_state_hold_rcu() needs to be documented and not xfrm_state_hold(). Thanks > > > We also have __xfrm_state_put, the commit message that introduced it > states: > > We often just do an atomic_dec(&x->refcnt) on an xfrm_state object > because we know there is more than 1 reference remaining and thus > we can elide the heavier xfrm_state_put() call. > > so we could add: > > /* Drop a reference to @x, when we know there is more than 1 reference remaining. > * In this case, we can avoid refcount_dec_and_test and just decrement refcnt. > */ > > but maybe someone has a better suggestion. > > > -- > Sabrina