From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AAE027E for ; Wed, 2 Nov 2022 02:55:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BA6D5C433C1; Wed, 2 Nov 2022 02:55:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1667357707; bh=pRILxpzAZ9n5CvohsHJzRYFohLuV/DxDTLb1ULcsJTI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=H1+y0DZtVqtTS2QDbEHlRu4Pe4oNAJEfGjcPkMbqiSdsUq0dCxIpzryxnttU90voO LzhKvLXqxEzsn7wRI2xzb2gFsYtH4iRDK6f12zWEnkBy/gtvRAOOjdTOB6fuGxc0GW /nrJrmBoIuF84NSpbbBBXXFZFgaem7d7nYiW7p6Q= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Shakeel Butt , Roman Gushchin , Jakub Kicinski , Sasha Levin Subject: [PATCH 6.0 189/240] net-memcg: avoid stalls when under memory pressure Date: Wed, 2 Nov 2022 03:32:44 +0100 Message-Id: <20221102022115.666150528@linuxfoundation.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221102022111.398283374@linuxfoundation.org> References: <20221102022111.398283374@linuxfoundation.org> User-Agent: quilt/0.67 Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Jakub Kicinski [ Upstream commit 720ca52bcef225b967a339e0fffb6d0c7e962240 ] As Shakeel explains the commit under Fixes had the unintended side-effect of no longer pre-loading the cached memory allowance. Even tho we previously dropped the first packet received when over memory limit - the consecutive ones would get thru by using the cache. The charging was happening in batches of 128kB, so we'd let in 128kB (truesize) worth of packets per one drop. After the change we no longer force charge, there will be no cache filling side effects. This causes significant drops and connection stalls for workloads which use a lot of page cache, since we can't reclaim page cache under GFP_NOWAIT. Some of the latency can be recovered by improving SACK reneg handling but nowhere near enough to get back to the pre-5.15 performance (the application I'm experimenting with still sees 5-10x worst latency). Apply the suggested workaround of using GFP_ATOMIC. We will now be more permissive than previously as we'll drop _no_ packets in softirq when under pressure. But I can't think of any good and simple way to address that within networking. Link: https://lore.kernel.org/all/20221012163300.795e7b86@kernel.org/ Suggested-by: Shakeel Butt Fixes: 4b1327be9fe5 ("net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()") Acked-by: Shakeel Butt Acked-by: Roman Gushchin Link: https://lore.kernel.org/r/20221021160304.1362511-1-kuba@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- include/net/sock.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/net/sock.h b/include/net/sock.h index d08cfe190a78..8a98ea9360fb 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2567,7 +2567,7 @@ static inline gfp_t gfp_any(void) static inline gfp_t gfp_memcg_charge(void) { - return in_softirq() ? GFP_NOWAIT : GFP_KERNEL; + return in_softirq() ? GFP_ATOMIC : GFP_KERNEL; } static inline long sock_rcvtimeo(const struct sock *sk, bool noblock) -- 2.35.1