From: Mateusz Guzik
Date: Thu, 3 Apr 2025 18:39:23 +0200
Subject: Re: [RFC PATCH v2] mm: use per-numa-node atomics instead of percpu_counters
To: Sweet Tea Dorminy
Cc: Andrew Morton, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
 Dennis Zhou, Tejun Heo, Christoph Lameter, Martin Liu, David Rientjes,
 Christian König, Shakeel Butt, Johannes Weiner, Sweet Tea Dorminy,
 Lorenzo Stoakes, Liam R. Howlett, Suren Baghdasaryan, Vlastimil Babka,
 Christian Brauner, Wei Yang, David Hildenbrand, Miaohe Lin, Al Viro,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org, Yu Zhao, Roman Gushchin
In-Reply-To: <20250331223516.7810-2-sweettea-kernel@dorminy.me>
References: <20250331223516.7810-2-sweettea-kernel@dorminy.me>

On Tue, Apr 1, 2025 at 12:36 AM Sweet Tea Dorminy wrote:
>
> [Resend as requested as RFC and minus prereq-patch-id junk]
>
> Recently, several internal services had an RSS usage regression as part of a
> kernel upgrade.
> Previously, they were on a pre-6.2 kernel and were able to
> read RSS statistics in a backup watchdog process to monitor and decide if
> they'd overrun their memory budget. Now, however, a representative service
> with five threads, expected to use about a hundred MB of memory, on a 250-cpu
> machine had memory usage tens of megabytes different from the expected amount
> -- this constituted a significant percentage of inaccuracy, causing the
> watchdog to act.
>
[snip]
> I think the important part is that this improves accuracy; the current
> scheme is difficult to use on many-cored machines. It improves
> performance, but there are tradeoffs; but it tightly bounds the
> inaccuracy so that decisions can actually be reasonably made with the
> resulting numbers.
>

Even disregarding this specific report, a prior patch submission points
to a result which is so far off that it already constitutes a bug:
https://lwn.net/ml/linux-kernel/20220728204511.56348-1-ryncsn@gmail.com/

So something definitely needs to be done to improve the accuracy. But
that will always be a tradeoff against update performance.

This brings me to a question: how often does the watchdog look at the
stats?

I wonder if it would make sense to add another file to proc, similar to
"status", but returning *exact* values. In particular, with percpu
counters it would walk all CPUs to generate the answer. Interested
parties would then still get an accurate count without getting in the
way of updaters, provided they don't read it relentlessly.

-- 
Mateusz Guzik
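
To make the tradeoff concrete, here is a rough userspace sketch of a batched per-CPU counter with a cheap approximate read and an exact read that walks every CPU slot. It is only a toy model, not kernel code: the names (toy_pcp_counter, toy_add, toy_read_fast, toy_read_exact) and the BATCH value of 32 are made up for illustration, NR_CPUS is set to 250 only to mirror the machine in the quoted report, and there is no real concurrency or locking. The kernel's closest analogs are percpu_counter_read() and percpu_counter_sum().

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 250 /* mirrors the 250-cpu machine in the quoted report */
#define BATCH   32  /* hypothetical fold threshold, not the kernel's value */

/* Toy model of a batched per-CPU counter (not the kernel's struct). */
struct toy_pcp_counter {
	long global;         /* folded-in total, cheap to read */
	long local[NR_CPUS]; /* per-CPU deltas not yet folded in */
};

/*
 * Update path: touch only this CPU's slot, and fold it into the global
 * total once the pending delta reaches BATCH in either direction.
 */
void toy_add(struct toy_pcp_counter *c, int cpu, long amount)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->local[cpu] += amount;
	if (labs(c->local[cpu]) >= BATCH) {
		c->global += c->local[cpu];
		c->local[cpu] = 0;
	}
}

/* Approximate read: O(1), ignores every pending per-CPU delta. */
long toy_read_fast(const struct toy_pcp_counter *c)
{
	return c->global;
}

/* Exact read: O(NR_CPUS), adds each pending delta to the global total. */
long toy_read_exact(const struct toy_pcp_counter *c)
{
	long sum = c->global;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

In this model, calling toy_add(&c, cpu, BATCH - 1) once for each of the 250 CPUs on a zeroed counter leaves toy_read_fast() at 0 while toy_read_exact() returns 7750. Assuming 4 KiB pages, that is roughly 30 MiB the fast read does not see, the same order of magnitude as the drift in the report. The exact read costs a walk over all slots, which is why it would only be reasonable for readers that poll rarely.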