From: Yosry Ahmed
Date: Wed, 25 Oct 2023 11:36:13 -0700
Subject: Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg
To: Shakeel Butt
Cc: Oliver Sang, Johannes Weiner, Feng Tang, "oe-lkp@lists.linux.dev", lkp, "cgroups@vger.kernel.org", "linux-mm@kvack.org", "Huang, Ying", "Yin, Fengwei", Andrew Morton, Michal Hocko, Roman Gushchin, Muchun Song, Ivan Babrou, Tejun Heo, Michal Koutný, Waiman Long, "kernel-team@cloudflare.com", Wei Xu, Greg Thelen, "linux-kernel@vger.kernel.org", Domenico Cerasuolo
References: <20231010032117.1577496-4-yosryahmed@google.com> <202310202303.c68e7639-oliver.sang@intel.com>

On Wed, Oct 25, 2023 at 10:06 AM Shakeel Butt wrote:
>
> On Tue, Oct 24, 2023 at 11:23 PM Yosry Ahmed wrote:
> >
> [...]
> >
> > Thanks Oliver for running the numbers. If I understand correctly the
> > will-it-scale.fallocate1 microbenchmark is the only one showing
> > significant regression here, is this correct?
> >
> > In my runs, other more representative microbenchmarks like
> > netperf and will-it-scale.page_fault* show minimal regression. I would
> > expect practical workloads to have high concurrency of page faults or
> > networking, but maybe not fallocate/ftruncate.
> >
> > Oliver, in your experience, how often does such a regression in such a
> > microbenchmark translate to a real regression that people care about?
> > (or how often do people dismiss it?)
> >
> > I tried optimizing this further for the fallocate/ftruncate case but
> > without luck. I even tried moving stats_updates into cgroup core
> > (struct cgroup_rstat_cpu) to reuse the existing loop in
> > cgroup_rstat_updated() -- but it somehow made it worse.
> >
> > On the other hand, we do have some machines in production running this
> > series together with a previous optimization for non-hierarchical
> > stats [1] on an older kernel, and we do see significant reduction in
> > cpu time spent on reading the stats. Domenico did a similar experiment
> > with only this series and reported similar results [2].
> >
> > Shakeel, Johannes, (and other memcg folks), I personally think the
> > benefits here outweigh a regression in this particular benchmark, but
> > I am obviously biased. What do you think?
> >
> > [1]https://lore.kernel.org/lkml/20230726153223.821757-2-yosryahmed@google.com/
> > [2]https://lore.kernel.org/lkml/CAFYChMv_kv_KXOMRkrmTN-7MrfgBHMcK3YXv0dPYEL7nK77e2A@mail.gmail.com/
>
> I still am not convinced of the benefits outweighing the regression
> but I would not block this. So, let's do this, skip this open window,
> get the patch series reviewed and hopefully we can work together on
> fixing that regression and we can make an informed decision of
> accepting the regression for this series for the next cycle.

Skipping this open window sounds okay to me.

FWIW, I think with this patch series we can keep the old behavior
(roughly) and hide the changes behind a tunable (config option or
sysfs file). I think the only changes that need to be done to the code
to approximate the previous behavior are:
- Use root when updating the pending stats in memcg_rstat_updated()
  instead of the passed memcg.
- Use root in mem_cgroup_flush_stats() instead of the passed memcg.
- Use mutex_trylock() instead of mutex_lock() in mem_cgroup_flush_stats().

So I think it should be doable to hide most changes behind a tunable,
but let's not do this unless necessary.
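
For illustration only, the three changes listed above could be modeled
in userspace roughly as below. This is a sketch, not kernel code and not
the patch itself: struct memcg_model, legacy_flush_mode, flush_target(),
model_rstat_updated() and model_flush_stats() are all made-up names, a
pthread mutex stands in for the kernel mutex, and the parent walk and the
flushing threshold are left out on purpose.

```c
/* Userspace model only; every identifier here is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct memcg_model {
	long pending_updates;	/* stands in for the pending stats counter */
	int flush_count;	/* number of flushes done on this node */
};

struct memcg_model root_model;
pthread_mutex_t flush_mutex = PTHREAD_MUTEX_INITIALIZER;
bool legacy_flush_mode;		/* hypothetical tunable (config/sysfs) */

/* Changes 1 and 2: with the tunable set, always act on root. */
struct memcg_model *flush_target(struct memcg_model *memcg)
{
	assert(memcg != NULL);
	return legacy_flush_mode ? &root_model : memcg;
}

/* Loosely models the accounting side of memcg_rstat_updated(). */
void model_rstat_updated(struct memcg_model *memcg, long val)
{
	flush_target(memcg)->pending_updates += val;
}

/* Loosely models mem_cgroup_flush_stats(); true if a flush ran. */
bool model_flush_stats(struct memcg_model *memcg)
{
	struct memcg_model *target = flush_target(memcg);

	if (legacy_flush_mode) {
		/* Change 3: give up if someone else is already flushing. */
		if (pthread_mutex_trylock(&flush_mutex) != 0)
			return false;
	} else {
		pthread_mutex_lock(&flush_mutex);
	}

	target->pending_updates = 0;
	target->flush_count++;
	pthread_mutex_unlock(&flush_mutex);
	return true;
}
```

With legacy_flush_mode clear, updates and flushes stay on the memcg that
was passed in and flushers wait for the lock. With it set, both go
through root_model and a contended flush is skipped, which is the rough
approximation of the old behavior described above.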