Date: Mon, 9 Mar 2026 14:21:42 +0000
From: Dmitry Ilvokhin
To: Matthew Wilcox
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, "Liam R. Howlett",
    Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
    Axel Rasmussen, Yuanchu Xie, Wei Xu, Steven Rostedt, Masami Hiramatsu,
    Mathieu Desnoyers, "Rafael J. Wysocki", Pavel Machek, Len Brown,
    Brendan Jackman, Johannes Weiner, Zi Yan, Oscar Salvador, Qi Zheng,
    Shakeel Butt, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-trace-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Subject: Re: [PATCH v4 0/5] mm: zone lock tracepoint instrumentation

On Mon, Mar 09, 2026 at 01:10:46PM +0000, Matthew Wilcox wrote:
> On Fri, Feb 27, 2026 at 04:00:22PM +0000, Dmitry Ilvokhin wrote:
> > Zone lock contention can significantly impact allocation and
> > reclaim latency, as it is a central synchronization point in
> > the page allocator and reclaim paths. Improved visibility into
> > its behavior is therefore important for diagnosing performance
> > issues in memory-intensive workloads.
> >
> > On some production workloads at Meta, we have observed noticeable
> > zone lock contention. Deeper analysis of lock holders and waiters
> > is currently difficult with existing instrumentation.
> >
> > While generic lock contention_begin/contention_end tracepoints
> > cover the slow path, they do not provide sufficient visibility
> > into lock hold times.
> > In particular, the lack of a release-side
> > event makes it difficult to identify long lock holders and
> > correlate them with waiters. As a result, distinguishing between
> > short bursts of contention and pathological long hold times
> > requires additional instrumentation.
> >
> > This patch series adds dedicated tracepoint instrumentation to
> > zone lock, following the existing mmap_lock tracing model.
>
> I don't like this at all. We have CONFIG_LOCK_STAT. That should be
> improved instead of coming up with one-offs for every single lock
> that someone deems "special".

Thanks for the feedback, Matthew.

CONFIG_LOCK_STAT provides useful statistics, but it is primarily a debug
facility and is generally too heavyweight for production environments.
The motivation for this series was to provide lightweight observability
for the zone lock in production workloads.

I agree that improving generic lock instrumentation would be preferable.
I did consider whether something similar could be done generically for
spinlocks, but the unlock path there is typically just a single atomic
store, so adding generic lightweight instrumentation without affecting
the fast path is difficult.

In parallel, I've been experimenting with improving observability for
sleepable locks by adding a contended_release tracepoint, which would
allow correlating lock holders and waiters in a more generic way. I've
posted an RFC here:

https://lore.kernel.org/all/cover.1772642407.git.d@ilvokhin.com/

I'd appreciate feedback on whether that direction makes sense for
improving the generic lock tracing infrastructure.