From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 38A632C9C for ; Tue, 7 Feb 2023 13:09:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id ABB20C4339B; Tue, 7 Feb 2023 13:09:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1675775385; bh=FHYi2wnxVd3sPANXwCth6PmKEY9fnalZG81+2q0wm8Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=De3UE7fR6HI29rBjiO5ZgC6oSnbMejGcXwJNvidPlMvPqMfK8223YezWkx/DiR45i tJRxmL0K2ex/T15D/ah1w4MZfVdo59ZuT4mcGQPz4LcaooAOv2AykqgvfoTOOWGnY1 Upi2jphnhKzCJyIFVznnRF0SJbI2UGwsE7BRki+Q= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Magnus Karlsson , =?UTF-8?q?Toke=20H=C3=B8iland-J=C3=B8rgensen?= , Jakub Kicinski , Sasha Levin Subject: [PATCH 5.15 023/120] qede: execute xdp_do_flush() before napi_complete_done() Date: Tue, 7 Feb 2023 13:56:34 +0100 Message-Id: <20230207125619.774653907@linuxfoundation.org> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230207125618.699726054@linuxfoundation.org> References: <20230207125618.699726054@linuxfoundation.org> User-Agent: quilt/0.67 Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Magnus Karlsson [ Upstream commit 2ccce20d51faa0178086163ccb6c84a099a87ab4 ] Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: d1b25b79e162b ("qede: add .ndo_xdp_xmit() and XDP_REDIRECT support") Signed-off-by: Magnus Karlsson Acked-by: Toke Høiland-Jørgensen Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/ Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/qlogic/qede/qede_fp.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c index 17f895250e04..d67d4e74b326 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_fp.c +++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c @@ -1439,6 +1439,10 @@ int qede_poll(struct napi_struct *napi, int budget) rx_work_done = (likely(fp->type & QEDE_FASTPATH_RX) && qede_has_rx_work(fp->rxq)) ? qede_rx_int(fp, budget) : 0; + + if (fp->xdp_xmit & QEDE_XDP_REDIRECT) + xdp_do_flush(); + /* Handle case where we are called by netpoll with a budget of 0 */ if (rx_work_done < budget || !budget) { if (!qede_poll_is_more_work(fp)) { @@ -1458,9 +1462,6 @@ int qede_poll(struct napi_struct *napi, int budget) qede_update_tx_producer(fp->xdp_tx); } - if (fp->xdp_xmit & QEDE_XDP_REDIRECT) - xdp_do_flush_map(); - return rx_work_done; } -- 2.39.0