Date: Thu, 9 Apr 2026 10:18:07 +0100
From: Lorenzo Stoakes
To: "David Hildenbrand (Arm)"
Cc: xu.xin16@zte.com.cn, hughd@google.com, akpm@linux-foundation.org,
    chengming.zhou@linux.dev, wang.yaxin@zte.com.cn, yang.yang29@zte.com.cn,
    michel@lespinasse.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/2] ksm: Optimize rmap_walk_ksm by passing a suitable address range
References: <9950c6c1-f960-58c0-4312-e4f5ac122043@google.com>
 <20260407142141059pWDasxUAknP5rqvAMl28K@zte.com.cn>
 <8332aedb-e499-4789-8f46-832df8d60224@kernel.org>
In-Reply-To: <8332aedb-e499-4789-8f46-832df8d60224@kernel.org>

On Wed, Apr 08, 2026 at 02:57:10PM +0200, David Hildenbrand (Arm) wrote:
> On 4/7/26 11:36, Lorenzo Stoakes (Oracle) wrote:
> > On Tue, Apr 07, 2026 at 02:21:41PM +0800, xu.xin16@zte.com.cn wrote:
> >>>
> >>> I'd completely forgotten that patch by now! But it's dealing with a
> >>> different issue; and note how it's intentionally leaving MADV_MERGEABLE
> >>> on the vma itself, just using MADV_UNMERGEABLE (with &dummy) as an
> >>> interface to CoW the KSM pages at that time, letting them be remerged
> >>> after.
> >
> > Hmm yeah, we mark them unmergeable but don't update the VMA flags (since
> > using &dummy), so they can just be merged later right?
> >
> > And then the:
> >
> > void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc)
> > {
> > 	...
> > 	const pgoff_t pgoff = rmap_item->address >> PAGE_SHIFT;
> > 	...
> > 	anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
> > 				       pgoff, pgoff) {
> > 		...
> > 	}
> > 	...
> > }
> >
> > Would _assume_ that folio->pgoff == addr >> PAGE_SHIFT, which will no
> > longer be the case here?
>
> I'm wondering whether we could figure the pgoff out, somehow, so we
> wouldn't have to store it elsewhere.
>
> What we need is essentially what __folio_set_anon() would have done for
> the original folio we replaced.
>
> 	folio->index = linear_page_index(vma, address);
>
> Could we obtain that from the anon_vma assigned to our rmap_item?
>
> 	pgoff_t pgoff;
>
> 	pgoff = (rmap_item->address - anon_vma->vma->vm_start) >> PAGE_SHIFT;
> 	pgoff += anon_vma->vma->vm_pgoff;

anon_vma doesn't have a vma field :) it has anon_vma->rb_root, which maps
to all 'related' VMAs.

And we're already looking at what might be covered by the anon_vma by
invoking anon_vma_interval_tree_foreach() on anon_vma->rb_root over
[0, ULONG_MAX).

> It would be the same adjustment everywhere we look in child processes,
> because the moment they would mremap() would be where we would have
> unshared.
>
> Just a thought after reading avc_start_pgoff ...

One interesting thing here is that, in the anon_vma_interval_tree_foreach()
loop, we check:

	if (addr < vma->vm_start || addr >= vma->vm_end)
		continue;

Which is the same as saying 'hey, we are ignoring remaps'.

But... if _we_ got remapped previously (the unsharing is only temporary),
then we'd _still_ have an anon_vma with an old index != addr >> PAGE_SHIFT,
and would still not be able to figure out the correct pgoff after sharing.

I wonder if we could just store the pgoff in the rmap_item, though?
Because we unshare on remap, we'd expect a new share after remapping, at
which point we could account for the remapping by just setting

	rmap_item->pgoff = vma->vm_pgoff

I think? Then we're back in business.
Another way around this issue is to do the rmap_walk_ksm() loop for
(addr >> PAGE_SHIFT) _first_, but that'd only be useful for walkers that
can exit early once they find the mapping they care about, and I worry
about 'somehow' missing remapped cases, so it's probably not actually all
that useful.

> --
> Cheers,
>
> David

Cheers, Lorenzo