From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 9 Apr 2026 10:59:06 +0100
From: Lorenzo Stoakes
To: "David Hildenbrand (Arm)"
Cc: xu.xin16@zte.com.cn, hughd@google.com, akpm@linux-foundation.org,
	chengming.zhou@linux.dev, wang.yaxin@zte.com.cn, yang.yang29@zte.com.cn,
	michel@lespinasse.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/2] ksm: Optimize rmap_walk_ksm by passing a suitable address range
References: <9950c6c1-f960-58c0-4312-e4f5ac122043@google.com>
	<20260407142141059pWDasxUAknP5rqvAMl28K@zte.com.cn>
	<8332aedb-e499-4789-8f46-832df8d60224@kernel.org>
	<015c3268-9c95-4314-b28d-c5e33eb2fb86@kernel.org>
	<5401c1d2-5f42-4288-9dad-2b9768b579c7@kernel.org>
In-Reply-To: <5401c1d2-5f42-4288-9dad-2b9768b579c7@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Apr 09, 2026 at 11:55:10AM +0200, David Hildenbrand (Arm) wrote:
> On 4/9/26 11:41, David Hildenbrand (Arm) wrote:
> > On 4/9/26 11:37, David Hildenbrand (Arm) wrote:
> >> On 4/9/26 11:18, Lorenzo Stoakes wrote:
> >>>
> >>> anon_vma doesn't have a vma field :) it has anon_vma->rb_root which maps to all
> >>> 'related' VMAs.
> >>
> >> Right, anon_vma_chain has. Dammit.
> >>
> >>>
> >>> And we're already looking at what might be covered by the anon_vma by
> >>> invoking anon_vma_interval_tree_foreach() on anon_vma->rb_root in [0,
> >>> ULONG_MAX).
> >>>
> >>>
> >>> One interesting thing here is in the anon_vma_interval_tree_foreach() loop
> >>> we check:
> >>>
> >>> 	if (addr < vma->vm_start || addr >= vma->vm_end)
> >>> 		continue;
> >>>
> >>> Which is the same as saying 'hey we are ignoring remaps'.
> >>>
> >>> But...
> >>> if _we_ got remapped previously (the unsharing is only temporary),
> >>> then we'd _still_ have an anon_vma with an old index != addr >> PAGE_SHIFT,
> >>> and would still not be able to figure out the correct pgoff after sharing.
> >>>
> >>> I wonder if we could just store the pgoff in the rmap_item though?
> >>
> >> That's what I said elsewhere and what I was trying to avoid here.
> >>
> >> It's 64bytes, and adding a new item will increase it to 96 bytes IIUC.
> >
> > As we're using a dedicate kmem cache it might "only" add 8 bytes, not
> > sure. Still an undesired increase given that we need that for each entry
> > in the stable/unstable tree.
> >
>
> Hmm, maybe we could do the following. I think the other members are only
> relevant for the unstable tree.

Nice, will leave the KSM stuff to you to confirm :)

This kind of approach should work fine...

>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 7d5b76478f0b..0c6bfed280f7 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -191,12 +191,13 @@ struct ksm_stable_node {
>   * @nid: NUMA node id of unstable tree in which linked (may not match page)
>   * @mm: the memory structure this rmap_item is pointing into
>   * @address: the virtual address this rmap_item tracks (+ flags in low bits)
> - * @oldchecksum: previous checksum of the page at that virtual address
> + * @oldchecksum: previous checksum of the page at that virtual address (unstable tree)
>   * @node: rb node of this rmap_item in the unstable tree
>   * @head: pointer to stable_node heading this list in the stable tree
>   * @hlist: link into hlist of rmap_items hanging off that stable_node
> - * @age: number of scan iterations since creation
> - * @remaining_skips: how many scans to skip
> + * @age: number of scan iterations since creation (unstable tree)
> + * @remaining_skips: how many scans to skip (unstable tree)
> + * @pgoff: pgoff into @anon_vma where the page is mapped (stable tree)
>   */
>  struct ksm_rmap_item {
>  	struct ksm_rmap_item *rmap_list;
> @@ -208,9 +209,14 @@ struct ksm_rmap_item {
>  	};
>  	struct mm_struct *mm;
>  	unsigned long address;		/* + low bits used for flags below */
> -	unsigned int oldchecksum;	/* when unstable */
> -	rmap_age_t age;
> -	rmap_age_t remaining_skips;
> +	union {
> +		struct {
> +			unsigned int oldchecksum;
> +			rmap_age_t age;
> +			rmap_age_t remaining_skips;
> +		};
> +		pgoff_t pgoff;
> +	};

union to the rescue :)

>  	union {
>  		struct rb_node node;	/* when node of unstable tree */
>  		struct {		/* when listed from stable tree */
> @@ -1600,6 +1606,7 @@ static int try_to_merge_with_ksm_page(struct ksm_rmap_item *rmap_item,
>
>  	/* Must get reference to anon_vma while still holding mmap_lock */
>  	rmap_item->anon_vma = vma->anon_vma;
> +	rmap_item->pgoff = linear_page_index(vma, rmap_item->address);
>  	get_anon_vma(vma->anon_vma);
>  out:
>  	mmap_read_unlock(mm);
> --
> 2.43.0
>
> --
> Cheers,
>
> David

Cheers, Lorenzo
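
For anyone wanting to sanity-check the size argument outside the kernel, here is a rough userspace sketch of the two layouts. It is not the real struct ksm_rmap_item: the stub types, the struct names and the two helper functions are made up for illustration, the anon_vma/nid union is reduced to a plain pointer, and rmap_age_t being a single byte is an assumption. With those caveats, it checks at compile time that overlaying pgoff on the unstable-only members does not grow the item (both layouts should come to 64 bytes on a typical 64-bit ABI).

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* size_t */

/* Userspace stand-ins only; the kernel's definitions differ in detail. */
typedef unsigned char rmap_age_t;	/* assumption: one byte wide */
typedef unsigned long pgoff_t;

struct rb_node_stub { unsigned long parent_color; void *right, *left; };
struct hlist_node_stub { void *next; void **pprev; };

/* Hypothetical model of the layout before the proposed patch. */
struct rmap_item_before {
	struct rmap_item_before *rmap_list;
	void *anon_vma;			/* stands in for the anon_vma/nid union */
	void *mm;
	unsigned long address;
	unsigned int oldchecksum;	/* unstable tree only */
	rmap_age_t age;
	rmap_age_t remaining_skips;
	union {
		struct rb_node_stub node;
		struct { void *head; struct hlist_node_stub hlist; };
	};
};

/* Hypothetical model of the layout with pgoff sharing storage. */
struct rmap_item_after {
	struct rmap_item_after *rmap_list;
	void *anon_vma;
	void *mm;
	unsigned long address;
	union {
		struct {		/* unstable tree only */
			unsigned int oldchecksum;
			rmap_age_t age;
			rmap_age_t remaining_skips;
		};
		pgoff_t pgoff;		/* stable tree only */
	};
	union {
		struct rb_node_stub node;
		struct { void *head; struct hlist_node_stub hlist; };
	};
};

/* The union reuses the bytes (and padding) the three members already took. */
static_assert(sizeof(struct rmap_item_after) == sizeof(struct rmap_item_before),
	      "pgoff union must not grow the item");

size_t rmap_item_size_before(void) { return sizeof(struct rmap_item_before); }
size_t rmap_item_size_after(void) { return sizeof(struct rmap_item_after); }
```

The trade-off in this model is that pgoff and oldchecksum/age/remaining_skips can never be valid at the same time, so it only holds if those members really are unused once an item sits in the stable tree, which is the part that still needs confirming.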