From: sashiko-bot@kernel.org
Subject: Re: [PATCH net 3/7] xsk: drain continuation descs on invalid descriptor in __xsk_generic_xmit()
Reply-To: sashiko-reviews@lists.linux.dev
To: "Maciej Fijalkowski"
Cc: bpf@vger.kernel.org
In-Reply-To: <20260623133240.1048434-4-maciej.fijalkowski@intel.com>
References: <20260623133240.1048434-1-maciej.fijalkowski@intel.com> <20260623133240.1048434-4-maciej.fijalkowski@intel.com>
Date: Wed, 24 Jun 2026 13:33:36 +0000
Message-Id: <20260624133337.54F3F1F00A3A@smtp.kernel.org>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s)
to consider:

New issues:
- [Medium] The `invalid_descs` statistic is repeatedly incremented for the
  same invalid descriptor if the Completion Queue (CQ) is full.

Pre-existing issues:
- [High] A malformed descriptor (e.g., invalid TX metadata) permanently
  stalls the TX ring because it is never consumed.
--

commit b076f5f0d28ebb521e9e57462363bc0f2250e923
Author: Jason Xing

xsk: drain continuation descs on invalid descriptor in __xsk_generic_xmit()

When the TX loop in __xsk_generic_xmit() encounters an invalid descriptor
mid-packet, the partial skb is dropped and the offending descriptor is
released. This commit ensures remaining continuation descriptors belonging
to the same multi-buffer packet are correctly drained on the next iteration
instead of being treated as a brand-new packet.

> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index e80c035a7af54..c489fadc36087 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c

[Severity: High]
This is a pre-existing issue, but does encountering a non-transient error
during skb building permanently stall the TX ring?

In __xsk_generic_xmit(), if xsk_build_skb() returns a non-transient error
like -EINVAL (for example, from an invalid XDP_TX_METADATA payload), it
intentionally skips releasing the descriptor:

net/xdp/xsk.c:xsk_build_skb() {
    ...
    } else {
        /* Let application retry */
        xsk_cq_cancel_locked(xs->pool, 1);
    }
}

Since the error is not -EOVERFLOW, __xsk_generic_xmit() jumps to out without
releasing the descriptor:

net/xdp/xsk.c:__xsk_generic_xmit() {
    ...
    skb = xsk_build_skb(xs, &desc);
    if (IS_ERR(skb)) {
        err = PTR_ERR(skb);
        if (err != -EOVERFLOW)
            goto out;
    ...
}

Because the descriptor remains at the head of the TX ring and the error is
non-transient, will every subsequent call peek the same descriptor, fail
with -EINVAL, and exit without ever advancing?

[ ... ]

> @@ -1128,6 +1128,7 @@ static int __xsk_generic_xmit(struct sock *sk)
>          if (xs->skb)
>              xsk_drop_skb(xs->skb);
>          xskq_cons_release(xs->tx);
> +        xs->drain_cont = xp_mb_desc(&desc);
>      }
>
>  out:

[Severity: Medium]
If the completion queue is full, does this post-loop cleanup jump to out
without consuming the invalid descriptor from the TX ring?

Looking earlier in this same xskq_has_descs() block in __xsk_generic_xmit(),
if reclaim_desc is true, it attempts to reserve a CQ slot:

net/xdp/xsk.c:__xsk_generic_xmit() {
    ...
    if (xskq_has_descs(xs->tx)) {
        bool reclaim_desc = xs->skb || xs->drain_cont;

        if (reclaim_desc) {
            err = xsk_cq_reserve_locked(xs->pool);
            if (err) {
                err = -EAGAIN;
                goto out;
            }
        }
    ...
}

If xsk_cq_reserve_locked() returns -EAGAIN, we jump to out before reaching
the xskq_cons_release() call in the diff above.

Would the next invocation (via poll or sendmsg) then peek the exact same
unconsumed descriptor?

Since xskq_cons_is_valid_desc() increments invalid_descs every time a
malformed descriptor is peeked:

net/xdp/xsk_queue.h:xskq_cons_is_valid_desc() {
    ...
    if (!xp_validate_desc(pool, d)) {
        q->invalid_descs++;
        return false;
    }
    ...
}

Could this arbitrarily inflate the invalid_descs statistic while the CQ
remains full?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260623133240.1048434-1-maciej.fijalkowski@intel.com?part=3
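
As a footnote to the [Severity: Medium] question above, the suspected
counter inflation can be sketched with a toy userspace model. This is not
kernel code: toy_ring, toy_peek(), toy_xmit() and toy_retry() are
hypothetical names invented for illustration, -1 stands in for -EAGAIN, and
the model assumes the socket stays mid-packet and the CQ state does not
change between attempts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
struct toy_ring {
    const bool *valid;           /* valid[i]: does descriptor i validate? */
    size_t len;                  /* number of descriptors in the ring */
    size_t head;                 /* consumer index */
    unsigned long invalid_descs; /* mirrors the statistic discussed above */
};

/* Peek the head descriptor; every failed peek bumps the counter. */
static bool toy_peek(struct toy_ring *q)
{
    assert(q->head < q->len);
    if (!q->valid[q->head]) {
        q->invalid_descs++;
        return false;
    }
    return true;
}

/* One transmit attempt, loosely following the flow quoted in the review. */
static int toy_xmit(struct toy_ring *q, bool mid_packet, bool cq_full)
{
    /* Main loop: consume descriptors until one fails validation. */
    while (q->head < q->len && toy_peek(q))
        q->head++;

    /* Post-loop cleanup: the loop stopped on an invalid descriptor. */
    if (q->head < q->len) {
        if (mid_packet && cq_full)
            return -1;           /* bail out, descriptor is NOT released */
        q->head++;               /* release the invalid descriptor */
    }
    return 0;
}

/* Retry on a ring of { valid, invalid, valid }; return the final counter. */
unsigned long toy_retry(bool cq_full, int attempts)
{
    static const bool valid[] = { true, false, true };
    struct toy_ring q = { .valid = valid, .len = 3 };

    for (int i = 0; i < attempts; i++)
        toy_xmit(&q, true, cq_full);
    return q.invalid_descs;
}
```

Under these assumptions, toy_retry(true, 4) returns 4 (one increment per
retry for the same descriptor), while toy_retry(false, 4) returns 1. Whether
the real code behaves this way depends on details the model leaves out.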