Date: Thu, 9 Apr 2026 10:53:50 +0100
From: Lorenzo Stoakes
To: "David Hildenbrand (Arm)"
Cc: xu.xin16@zte.com.cn, hughd@google.com, akpm@linux-foundation.org,
    chengming.zhou@linux.dev, wang.yaxin@zte.com.cn, yang.yang29@zte.com.cn,
    michel@lespinasse.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/2] ksm: Optimize rmap_walk_ksm by passing a suitable address range

On Thu, Apr 09, 2026 at 11:41:46AM +0200, David Hildenbrand (Arm) wrote:
> On 4/9/26 11:37, David Hildenbrand (Arm) wrote:
> > On 4/9/26 11:18, Lorenzo Stoakes wrote:
> >> On Wed, Apr 08, 2026 at 02:57:10PM +0200, David Hildenbrand (Arm) wrote:
> >>>
> >>> I'm wondering whether we could figure the pgoff out, somehow, so we
> >>> wouldn't have to store it elsewhere.
> >>>
> >>> What we need is essentially what __folio_set_anon() would have done for
> >>> the original folio we replaced.
> >>>
> >>> 	folio->index = linear_page_index(vma, address);
> >>>
> >>> Could we obtain that from the anon_vma assigned to our rmap_item?
> >>>
> >>> 	pgoff_t pgoff;
> >>>
> >>> 	pgoff = (rmap_item->address - anon_vma->vma->vm_start) >> PAGE_SHIFT;
> >>> 	pgoff += anon_vma->vma->vm_pgoff;
> >>
> >> anon_vma doesn't have a vma field :) it has anon_vma->rb_root which maps to all
> >> 'related' VMAs.
> >
> > Right, anon_vma_chain has. Dammit.
> >
> >>
> >> And we're already looking at what might be covered by the anon_vma by
> >> invoking anon_vma_interval_tree_foreach() on anon_vma->rb_root in [0,
> >> ULONG_MAX).
> >>
> >>>
> >>> It would be the same adjustment everywhere we look in child processes,
> >>> because the moment they would mremap() would be where we would have
> >>> unshared.
> >>>
> >>> Just a thought after reading avc_start_pgoff ...
> >>
> >> One interesting thing here is in the anon_vma_interval_tree_foreach() loop
> >> we check:
> >>
> >> 	if (addr < vma->vm_start || addr >= vma->vm_end)
> >> 		continue;
> >>
> >> Which is the same as saying 'hey we are ignoring remaps'.
> >>
> >> But... if _we_ got remapped previously (the unsharing is only temporary),
> >> then we'd _still_ have an anon_vma with an old index != addr >> PAGE_SHIFT,
> >> and would still not be able to figure out the correct pgoff after sharing.
> >>
> >> I wonder if we could just store the pgoff in the rmap_item though?
> >
> > That's what I said elsewhere and what I was trying to avoid here.
> >
> > It's 64bytes, and adding a new item will increase it to 96 bytes IIUC.
>
> As we're using a dedicate kmem cache it might "only" add 8 bytes, not
> sure. Still an undesired increase given that we need that for each entry
> in the stable/unstable tree.

Hm, random idea, but I wonder if we could cram a bit somewhere that indicates
whether a remap has in fact taken place?

	rmap_item->some_field |= !!(vma->vm_start >> PAGE_SHIFT != vma->vm_pgoff);

(yeah obviously _not implemented like that_ but you get the point)

Since remap case should be rare, then if that bit is clear, do the cheap path,
otherwise do expensive?

Longer term, my anon_vma rework should fix this more broadly :)

>
> --
> Cheers,
>
> David

Cheers, Lorenzo
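
To make the "remap bit" idea above concrete, here is a minimal userspace
sketch of the arithmetic being discussed. It is not kernel code and not the
patch: struct sketch_vma, SKETCH_PAGE_SHIFT, struct sketch_range and all
sketch_* helpers are hypothetical names invented for illustration, and the
4 KiB page size is an assumption. It only models the pgoff calculation quoted
earlier in the thread, the (vm_start >> PAGE_SHIFT) != vm_pgoff test, and one
possible way of choosing between a narrow and a full [0, ULONG_MAX] walk.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the few VMA fields the thread refers to. */
struct sketch_vma {
	unsigned long vm_start; /* first byte of the mapping */
	unsigned long vm_end;   /* one past the last byte of the mapping */
	unsigned long vm_pgoff; /* page offset that corresponds to vm_start */
};

/* Hypothetical inclusive pgoff range handed to an interval tree walk. */
struct sketch_range {
	unsigned long first;
	unsigned long last;
};

/* Same arithmetic as the pgoff computation quoted in the thread. */
static unsigned long sketch_linear_page_index(const struct sketch_vma *vma,
					      unsigned long addr)
{
	assert(addr >= vma->vm_start && addr < vma->vm_end);
	return ((addr - vma->vm_start) >> SKETCH_PAGE_SHIFT) + vma->vm_pgoff;
}

/* The test from the mail: vm_pgoff no longer matches vm_start. */
static bool sketch_vma_was_remapped(const struct sketch_vma *vma)
{
	return (vma->vm_start >> SKETCH_PAGE_SHIFT) != vma->vm_pgoff;
}

/*
 * Cheap path: with the bit clear, pgoff is simply addr >> PAGE_SHIFT, so the
 * walk can be limited to that single page offset. Expensive path: with the
 * bit set, fall back to walking the whole range.
 */
static struct sketch_range sketch_walk_range(unsigned long addr, bool remapped)
{
	struct sketch_range r;

	if (remapped) {
		r.first = 0;
		r.last = ULONG_MAX;
	} else {
		r.first = addr >> SKETCH_PAGE_SHIFT;
		r.last = r.first;
	}
	return r;
}
```

With a VMA at 0x10000 and vm_pgoff 0x10 the bit stays clear and the walk
narrows to a single pgoff; moving the same VMA to 0x50000 while keeping
vm_pgoff sets the bit, and the sketch falls back to the full range.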