Date: Thu, 25 Apr 2024 05:17:12 +0100
From: Matthew Wilcox <willy@infradead.org>
To: John Hubbard
Cc: David Hildenbrand, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-doc@vger.kernel.org, Andrew Morton, Jonathan Corbet,
	"Kirill A. Shutemov", Zi Yan, Yang Shi, Ryan Roberts
Subject: Re: [PATCH v1] mm/khugepaged: replace page_mapcount() check by
	folio_likely_mapped_shared()
In-Reply-To: <73de5556-e574-4ed7-a7fb-c4648e46206b@nvidia.com>
References: <20240424122630.495788-1-david@redhat.com>
	<73de5556-e574-4ed7-a7fb-c4648e46206b@nvidia.com>

On Wed, Apr 24, 2024 at 09:00:50PM -0700, John Hubbard wrote:
> > We want to limit the use of page_mapcount() to places where absolutely
> > required, to prepare for kernel configs where we won't keep track of
> > per-page mapcounts in large folios.
>
> Just curious, can you elaborate on the motivation? I probably missed
> the discussions that explained why page_mapcount() in large folios
> is not desirable. Are we getting rid of a field in struct page/folio?
> Some other reason?

Two reasons. One is that, regardless of anything else, folio_mapcount()
is expensive on large folios because it has to walk every page in the
folio, summing the per-page mapcounts. The more important reason is that
when we move to separately allocated folios, we don't want to allocate
an array of mapcounts just to keep a count for each page, so we're
looking for a more compact scheme that avoids per-page mapcounts
entirely.

> > The khugepage MM selftests keep working as expected, including:
> >
> > Run test: collapse_max_ptes_shared (khugepaged:anon)
> > Allocate huge page... OK
> > Share huge page over fork()... OK
> > Trigger CoW on page 255 of 512... OK
> > Maybe collapse with max_ptes_shared exceeded.... OK
> > Trigger CoW on page 256 of 512... OK
> > Collapse with max_ptes_shared PTEs shared.... OK
> > Check if parent still has huge page... OK
>
> Well, a word of caution! These tests do not (yet) cover either of
> the interesting new cases that folio_likely_mapped_shared() presents:
> KSM or hugetlbfs interactions. In other words, false positives.

Hmm ... KSM never uses large folios, and hugetlbfs is disjoint from
khugepaged, so how would either case arise here?
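To make the cost in the first point concrete, below is a minimal sketch
of the per-page walk described above. It is an illustration only, not
the kernel's actual folio_mapcount() implementation:
sketch_folio_mapcount() is a hypothetical name, and the code assumes the
existing scheme in which every page carries its own _mapcount field
(stored biased by -1, hence the +1 per page).

	#include <linux/mm.h>	/* struct folio, folio_nr_pages(), folio_page() */

	/*
	 * Hedged sketch: sum a stored mapcount for each page in the folio.
	 * The cost is linear in the folio size, and keeping the counts at
	 * all requires one counter per page.
	 */
	static int sketch_folio_mapcount(struct folio *folio)
	{
		long i, nr = folio_nr_pages(folio);
		int mapcount = 0;

		/* O(nr) walk: a 2MB folio of 4KB pages reads 512 counters. */
		for (i = 0; i < nr; i++)
			mapcount += atomic_read(&folio_page(folio, i)->_mapcount) + 1;

		return mapcount;
	}

Dropping the per-page counters removes both this linear walk and, once
folios are separately allocated, the need to allocate one counter per
page.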