From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-183.mta0.migadu.com (out-183.mta0.migadu.com [91.218.175.183]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 169642E4275 for ; Thu, 3 Jul 2025 12:03:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.183 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751544229; cv=none; b=I/ucK9wdoIMdlx/8U2QObaKnbrkyyrBuk2TAYbBN0HzP1YZx82fqFjtoxC+7/s10/lxEclaHtrRlaEa8mQU82GzrwKJA6M4Mq2KwkftcXMzTAE/oxF2Ng6wasQcshOzFwjEnnmamR/Epx2M5MdYTddeOGIFS8vsPVs2zPAMK4Zo= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751544229; c=relaxed/simple; bh=jerXSrRoqA1qUbkI8oky8jp1ImqFUxyU/bN43PKvfSM=; h=MIME-Version:Date:Content-Type:From:Message-ID:Subject:To:Cc: In-Reply-To:References; b=QZbzg5zAgpUiESifm13sy6/6AxlTnw945Gw0mcMQ/U58POQlH96QE54VMVsXG/ERBKX6qUT11HJdtOCdJzbsR/8iGBBJmQHlgXVZax0hkP5Vlh8Gg9nplPKxsnUbkrwObWWq7iSrGCQX4s9FWyBuFz6Kp4qHajkDeV8xEpPvF+I= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=dco+cCvD; arc=none smtp.client-ip=91.218.175.183 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="dco+cCvD" Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1751544214; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=e/8w/xeDaBVWRw9XzILJTV9gwuQG1T26Mai5piqvf6w=; b=dco+cCvDXtd07E6MtJaUfPxKAou+XRstlbU4pNHpBHSOjS1EfCn/aTXbQ/AlYwzmwrbq8w DbEawoGxejy3A5BRZzMeCXV8UrnPuayAvfIxanMZ+BHtqE0OYKNLwvmcH3MUTFhwVmPB0r fNvqLGQJQ6dxDoAImE6C2OYiahyERFU= Date: Thu, 03 Jul 2025 12:03:33 +0000 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: "Jiayuan Chen" Message-ID: <6724e69057445ab66d70f0b28c115e2d8fb5543b@linux.dev> TLS-Required: No Subject: Re: [PATCH net-next v1] tcp: Correct signedness in skb remaining space calculation To: "Eric Dumazet" Cc: netdev@vger.kernel.org, mrpre@163.com, "Neal Cardwell" , "Kuniyuki Iwashima" , "David S. Miller" , "David Ahern" , "Jakub Kicinski" , "Paolo Abeni" , "Simon Horman" , "David Howells" , linux-kernel@vger.kernel.org In-Reply-To: References: <20250702110039.15038-1-jiayuan.chen@linux.dev> X-Migadu-Flow: FLOW_OUT 2025/7/2 23:34, "Eric Dumazet" =E5=86=99=E5=88=B0: >=20 >=20On Wed, Jul 2, 2025 at 8:28 AM Jiayuan Chen = wrote: >=20 >=20>=20 >=20> July 2, 2025 at 22:02, "Eric Dumazet" wrote: > >=20 >=20> On Wed, Jul 2, 2025 at 6:59 AM Eric Dumazet = wrote: > >=20 >=20> > > >=20 >=20> > On Wed, Jul 2, 2025 at 6:42 AM Jiayuan Chen wrote: > >=20 >=20> > > >=20 >=20> > July 2, 2025 at 19:00, "Jiayuan Chen" w= rote: > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > The calculation for the remaining space, 'copy =3D size_goal -= skb->len', > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > was prone to an integer promotion bug that prevented copy from= ever being > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > negative. > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > The variable types involved are: > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > copy: ssize_t (long) > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > size_goal: int > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > skb->len: unsigned int > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > Due to C's type promotion rules, the signed size_goal is conve= rted to an > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > unsigned int to match skb->len before the subtraction. The res= ult is an > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > unsigned int. > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > When this unsigned int result is then assigned to the s64 copy= variable, > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > it is zero-extended, preserving its non-negative value. Conseq= uently, > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > > copy is always >=3D 0. > >=20 >=20> > > >=20 >=20> > > > >=20 >=20> > > >=20 >=20> > To better explain this problem, consider the following example: > >=20 >=20> > > >=20 >=20> > ''' > >=20 >=20> > > >=20 >=20> > #include > >=20 >=20> > > >=20 >=20> > #include > >=20 >=20> > > >=20 >=20> > int size_goal =3D 536; > >=20 >=20> > > >=20 >=20> > unsigned int skblen =3D 1131; > >=20 >=20> > > >=20 >=20> > void main() { > >=20 >=20> > > >=20 >=20> > ssize_t copy =3D 0; > >=20 >=20> > > >=20 >=20> > copy =3D size_goal - skblen; > >=20 >=20> > > >=20 >=20> > printf("wrong: %zd\n", copy); > >=20 >=20> > > >=20 >=20> > copy =3D size_goal - (ssize_t)skblen; > >=20 >=20> > > >=20 >=20> > printf("correct: %zd\n", copy); > >=20 >=20> > > >=20 >=20> > return; > >=20 >=20> > > >=20 >=20> > } > >=20 >=20> > > >=20 >=20> > ''' > >=20 >=20> > > >=20 >=20> > Output: > >=20 >=20> > > >=20 >=20> > ''' > >=20 >=20> > > >=20 >=20> > wrong: 4294966701 > >=20 >=20> > > >=20 >=20> > correct: -595 > >=20 >=20> > > >=20 >=20> > ''' > >=20 >=20> > > >=20 >=20> > Can you explain how one skb could have more bytes (skb->len) tha= n size_goal ? > >=20 >=20> > > >=20 >=20> > If we are under this condition, we already have a prior bug ? > >=20 >=20> > > >=20 >=20> > Please describe how you caught this issue. > >=20 >=20> > > >=20 >=20> Also, not sure why copy variable had to be changed from "int" to "= ssize_t" > >=20 >=20> A nicer patch (without a cast) would be to make it an "int" again/ > >=20 >=20> I encountered this issue because I had tcp_repair enabled, which u= ses > >=20 >=20> tcp_init_tso_segs to reset the MSS. > >=20 >=20> However, it seems that tcp_bound_to_half_wnd also dynamically adju= sts > >=20 >=20> the value to be smaller than the current size_goal. > >=20 >=20 > Okay, and what was the end result ? >=20 >=20An skb has a limited amount of bytes that can be put into it >=20 >=20(MAX_SKB_FRAGS * 32K) , and I can't see what are the effects of havin= g >=20 Hi=20Eric, I'm working with a reproducer generated by syzkaller [1], and its core logic is roughly as follows: ''' setsockopt(fd, TCP_REPAIR, 1) connect(fd); setsockopt(fd, TCP_REPAIR, -1) send(fd, small); sendmmsg(fd, buffer_2G); ''' First, because TCP_REPAIR is enabled, the send() operation leaves the skb at the tail of the write_queue. Subsequently, sendmmsg is called to send 2GB of data. Due to TCP_REPAIR, the size_goal is reduced, which can cause the copy variable to become negative. However, because of integer promotion bug mentioned in the previous email, this negative value is misinterpreted as a large positive number. Ultimately, copy becomes a huge value, approachi= ng the int32 limit. This, in turn, causes sk->sk_forward_alloc to overflow, which is the exact issue reported by syzkaller. On a related note, even without using TCP_REPAIR, the tcp_bound_to_half_w= nd() function can also reduce size_goal on its own. Therefore, my understandin= g is that under extreme conditions, we might still encounter an overflow in sk->sk_forward_alloc. So, I think we have good reason to change copy to an int.