Message-ID: <849e7b86-b971-47d7-8e31-7eee0918ea33@kernel.org>
Date: Tue, 2 Jul 2024 12:35:12 +0200
Subject: Re: [PATCH V4 2/2] cgroup/rstat: Avoid thundering herd problem by kswapd across NUMA nodes
To: Yosry Ahmed
Cc: Shakeel Butt, tj@kernel.org, cgroups@vger.kernel.org, hannes@cmpxchg.org, lizefan.x@bytedance.com,
 longman@redhat.com, kernel-team@cloudflare.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <171952310959.1810550.17003659816794335660.stgit@firesoul> <171952312320.1810550.13209360603489797077.stgit@firesoul> <4n3qu75efpznkomxytm7irwfiq44hhi4hb5igjbd55ooxgmvwa@tbgmwvcqsy75> <7ecdd625-37a0-49f1-92fc-eef9791fbe5b@kernel.org>
From: Jesper Dangaard Brouer

On 29/06/2024 00.15, Yosry Ahmed wrote:
> [..]
>>>> +	/* Obtained lock, record this cgrp as the ongoing flusher */
>>>> +	if (!READ_ONCE(cgrp_rstat_ongoing_flusher)) {
>>>
>>> Can the above condition will ever be false?
>>>
>>
>> Yes, I think so, because I realized that cgroup_rstat_flush_locked() can
>> release/"yield" the lock. Thus, other CPUs/threads have a chance to
>> call cgroup_rstat_flush, and try to become the "ongoing-flusher".
>
> Right, there may actually be multiple ongoing flushers. I am now
> wondering if it would be better if we drop cgrp_rstat_ongoing_flusher
> completely, add a per-cgroup under_flush boolean/flag, and have the
> cgroup iterate its parents here to check if any of them is under_flush
> and wait for it instead.
>
> Yes, we have to add parent iteration here, but I think it may be fine
> because the flush path is already expensive. This will allow us to
> detect if any ongoing flush is overlapping with us, not just the one
> that happened to update cgrp_rstat_ongoing_flusher first.
>
> WDYT?

No, I don't think we should complicate the code to "support" multiple
ongoing flushers (there is no parallel execution of these). The lock
yielding causes the (I assume) unintended side effect that multiple
ongoing flushers can exist. We should work towards having only a single
ongoing flusher.

With the current kswapd rstat contention issue, yielding the lock in the
loop creates the worst possible case of cache-line thrashing, as these
kthreads run on 12 different NUMA nodes.

I'm working towards changing the rstat lock to a mutex. When doing so, we
should not yield the lock in the loop. This will guarantee that there is
only a single ongoing flusher, and will reduce cache-line thrashing.

--Jesper
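
For illustration only, here is a minimal userspace sketch of the
single-ongoing-flusher idea described above. It is not the kernel patch:
struct cgrp, rstat_mutex, ongoing_flusher, rstat_flush() and
cgrp_is_descendant() are hypothetical stand-ins, a pthread mutex plays
the role of the proposed rstat mutex, and the "flush" just folds a toy
counter. The point it tries to show is that when the lock is never
yielded inside the flush, the holder can only ever observe an empty
ongoing-flusher slot.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct cgroup. */
struct cgrp {
	struct cgrp *parent;	/* NULL for the root */
	int pending;		/* toy counter: stat updates not yet flushed */
	int flushed;		/* toy counter: stat updates already flushed */
};

/* Assumption: one global mutex replaces the yielding rstat lock. */
static pthread_mutex_t rstat_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Only read or written while rstat_mutex is held in this sketch. */
static struct cgrp *ongoing_flusher;

/* True if @c is @ancestor itself or lies below it in the hierarchy. */
static bool cgrp_is_descendant(const struct cgrp *c,
			       const struct cgrp *ancestor)
{
	for (; c; c = c->parent)
		if (c == ancestor)
			return true;
	return false;
}

/*
 * Flush @c while holding the mutex for the whole operation. Because the
 * mutex is never dropped in the middle, no second flusher can register
 * itself, so the slot must be empty whenever we get the lock.
 */
static void rstat_flush(struct cgrp *c)
{
	pthread_mutex_lock(&rstat_mutex);

	assert(ongoing_flusher == NULL);	/* single-flusher invariant */
	ongoing_flusher = c;

	/* Toy flush: fold pending updates into the flushed total. */
	c->flushed += c->pending;
	c->pending = 0;

	ongoing_flusher = NULL;
	pthread_mutex_unlock(&rstat_mutex);
}
```

The cgrp_is_descendant() helper is only there to show what a caller
could ask about an ongoing flusher ("does that flush already cover my
cgroup?"); the sketch does not implement waiting on it, and reading
ongoing_flusher outside the mutex would need proper atomics.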