From: SeongJae Park
To: Andrew Morton
Cc: SeongJae Park, damon@lists.linux.dev, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 1/6] mm/damon/core: warn and fix nr_accesses[_bp] corruption
Date: Mon, 12 May 2025 17:27:10 -0700
Message-Id: <20250513002715.40126-2-sj@kernel.org>
In-Reply-To: <20250513002715.40126-1-sj@kernel.org>
References: <20250513002715.40126-1-sj@kernel.org>

Due to a bug such as a double aggregation reset[1], ->nr_accesses and/or
->nr_accesses_bp of a damon_region can be corrupted.
Such corruption can make monitoring results quite inaccurate, so the
root cause should be investigated.  Meanwhile, the corruption itself can
easily be fixed, but silently fixing it would hide the bug.  Fix the
corruption as soon as it is found, but also WARN_ONCE() so that we
become aware of the bug while keeping the system running in a saner
state.

[1] https://lore.kernel.org/20250302214145.356806-1-sj@kernel.org

Signed-off-by: SeongJae Park
---
 mm/damon/core.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/mm/damon/core.c b/mm/damon/core.c
index 587fb9a4fef8..0bb71e2ab713 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1391,6 +1391,19 @@ int damos_walk(struct damon_ctx *ctx, struct damos_walk_control *control)
 	return 0;
 }
 
+/*
+ * Warn and fix corrupted ->nr_accesses[_bp] for investigations and preventing
+ * the problem being propagated.
+ */
+static void damon_warn_fix_nr_accesses_corruption(struct damon_region *r)
+{
+	if (r->nr_accesses_bp == r->nr_accesses * 10000)
+		return;
+	WARN_ONCE(true, "invalid nr_accesses_bp at reset: %u %u\n",
+			r->nr_accesses_bp, r->nr_accesses);
+	r->nr_accesses_bp = r->nr_accesses * 10000;
+}
+
 /*
  * Reset the aggregated monitoring results ('nr_accesses' of each region).
  */
@@ -1404,6 +1417,7 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 
 		damon_for_each_region(r, t) {
 			trace_damon_aggregated(ti, r, damon_nr_regions(t));
+			damon_warn_fix_nr_accesses_corruption(r);
 			r->last_nr_accesses = r->nr_accesses;
 			r->nr_accesses = 0;
 		}
-- 
2.39.5
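
For readers who want to try the check outside the kernel, the warn-once-and-fix
logic of damon_warn_fix_nr_accesses_corruption() can be modeled in user space
roughly as below.  This is only a sketch under assumptions: struct region_model,
the 'warned' flag, and warn_fix_nr_accesses() are hypothetical stand-ins for
struct damon_region, the one-shot state of WARN_ONCE(), and the patched helper.
Unlike the kernel helper, the model returns whether a repair happened so that
it is easy to test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for two fields of struct damon_region. */
struct region_model {
	unsigned int nr_accesses;
	unsigned int nr_accesses_bp;	/* expected: nr_accesses * 10000 */
};

/* Plays the role of WARN_ONCE()'s one-shot state (assumption of this model). */
static bool warned;

/*
 * Check the invariant nr_accesses_bp == nr_accesses * 10000.  On a mismatch,
 * print a message the first time only, then repair nr_accesses_bp.
 * Returns true if a mismatch was found and repaired.
 */
static bool warn_fix_nr_accesses(struct region_model *r)
{
	assert(r != NULL);

	if (r->nr_accesses_bp == r->nr_accesses * 10000)
		return false;

	if (!warned) {
		warned = true;
		fprintf(stderr, "invalid nr_accesses_bp at reset: %u %u\n",
			r->nr_accesses_bp, r->nr_accesses);
	}
	r->nr_accesses_bp = r->nr_accesses * 10000;
	return true;
}
```

Calling warn_fix_nr_accesses() on a region with nr_accesses = 3 and
nr_accesses_bp = 12345 repairs nr_accesses_bp to 30000 and prints the message
once; later mismatches on other regions are repaired without further output,
which mirrors the intent of warning without flooding the log.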