From: Lorenzo Stoakes <ljs@kernel.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: xu.xin16@zte.com.cn, hughd@google.com, akpm@linux-foundation.org,
chengming.zhou@linux.dev, wang.yaxin@zte.com.cn,
yang.yang29@zte.com.cn, michel@lespinasse.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/2] ksm: Optimize rmap_walk_ksm by passing a suitable address range
Date: Thu, 9 Apr 2026 10:59:06 +0100
Message-ID: <add4PUE4GqWh9j9x@lucifer>
In-Reply-To: <5401c1d2-5f42-4288-9dad-2b9768b579c7@kernel.org>
On Thu, Apr 09, 2026 at 11:55:10AM +0200, David Hildenbrand (Arm) wrote:
> On 4/9/26 11:41, David Hildenbrand (Arm) wrote:
> > On 4/9/26 11:37, David Hildenbrand (Arm) wrote:
> >> On 4/9/26 11:18, Lorenzo Stoakes wrote:
> >>>
> >>> anon_vma doesn't have a vma field :) it has anon_vma->rb_root which maps to all
> >>> 'related' VMAs.
> >>
> >> Right, anon_vma_chain has. Dammit.
> >>
> >>>
> >>> And we're already looking at what might be covered by the anon_vma by
> >>> invoking anon_vma_interval_tree_foreach() on anon_vma->rb_root in [0,
> >>> ULONG_MAX).
> >>>
> >>>
> >>> One interesting thing here is in the anon_vma_interval_tree_foreach() loop
> >>> we check:
> >>>
> >>> 	if (addr < vma->vm_start || addr >= vma->vm_end)
> >>> 		continue;
> >>>
> >>> Which is the same as saying 'hey we are ignoring remaps'.
> >>>
> >>> But... if _we_ got remapped previously (the unsharing is only temporary),
> >>> then we'd _still_ have an anon_vma with an old index != addr >> PAGE_SHIFT,
> >>> and would still not be able to figure out the correct pgoff after sharing.
> >>>
> >>> I wonder if we could just store the pgoff in the rmap_item though?
> >>
> >> That's what I said elsewhere and what I was trying to avoid here.
> >>
> >> It's 64 bytes, and adding a new item will increase it to 96 bytes IIUC.
> >
> > As we're using a dedicated kmem cache it might "only" add 8 bytes, not
> > sure. Still an undesired increase given that we need that for each entry
> > in the stable/unstable tree.
> >
>
> Hmm, maybe we could do the following. I think the other members are only
> relevant for the unstable tree.
Nice, will leave the KSM stuff to you to confirm :)
This kind of approach should work fine...
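
For anyone following along, a minimal userspace sketch of why overlaying
pgoff on the unstable-tree-only fields should not grow the item. This is
an illustration under stated assumptions, not kernel code: rmap_age_t is
assumed to be a one-byte type and pgoff_t an unsigned long, and the
fields_before/fields_after structs and union_growth() are hypothetical
names that mirror only the members touched by the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions: stand-ins for the kernel typedefs, not the real headers. */
typedef uint8_t rmap_age_t;     /* assumed to be one byte */
typedef unsigned long pgoff_t;  /* assumed to be unsigned long */

/* Hypothetical: the three unstable-tree fields as laid out before. */
struct fields_before {
	unsigned int oldchecksum;
	rmap_age_t age;
	rmap_age_t remaining_skips;
};

/* Hypothetical: the same fields overlaid with pgoff, as in the diff. */
struct fields_after {
	union {
		struct {
			unsigned int oldchecksum;
			rmap_age_t age;
			rmap_age_t remaining_skips;
		};
		pgoff_t pgoff;
	};
};

/* Bytes the union adds on top of the unstable-tree-only fields. */
static size_t union_growth(void)
{
	return sizeof(struct fields_after) - sizeof(struct fields_before);
}
```

On a typical LP64 build both structs come out at 8 bytes, so
union_growth() returns 0. The real sizeof(struct ksm_rmap_item) would
still want checking with pahole.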
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 7d5b76478f0b..0c6bfed280f7 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -191,12 +191,13 @@ struct ksm_stable_node {
>   * @nid: NUMA node id of unstable tree in which linked (may not match page)
>   * @mm: the memory structure this rmap_item is pointing into
>   * @address: the virtual address this rmap_item tracks (+ flags in low bits)
> - * @oldchecksum: previous checksum of the page at that virtual address
> + * @oldchecksum: previous checksum of the page at that virtual address (unstable tree)
>   * @node: rb node of this rmap_item in the unstable tree
>   * @head: pointer to stable_node heading this list in the stable tree
>   * @hlist: link into hlist of rmap_items hanging off that stable_node
> - * @age: number of scan iterations since creation
> - * @remaining_skips: how many scans to skip
> + * @age: number of scan iterations since creation (unstable tree)
> + * @remaining_skips: how many scans to skip (unstable tree)
> + * @pgoff: pgoff into @anon_vma where the page is mapped (stable tree)
>   */
>  struct ksm_rmap_item {
>  	struct ksm_rmap_item *rmap_list;
> @@ -208,9 +209,14 @@ struct ksm_rmap_item {
>  	};
>  	struct mm_struct *mm;
>  	unsigned long address; /* + low bits used for flags below */
> -	unsigned int oldchecksum; /* when unstable */
> -	rmap_age_t age;
> -	rmap_age_t remaining_skips;
> +	union {
> +		struct {
> +			unsigned int oldchecksum;
> +			rmap_age_t age;
> +			rmap_age_t remaining_skips;
> +		};
> +		pgoff_t pgoff;
> +	};
union to the rescue :)
>  	union {
>  		struct rb_node node; /* when node of unstable tree */
>  		struct { /* when listed from stable tree */
> @@ -1600,6 +1606,7 @@ static int try_to_merge_with_ksm_page(struct ksm_rmap_item *rmap_item,
>
>  	/* Must get reference to anon_vma while still holding mmap_lock */
>  	rmap_item->anon_vma = vma->anon_vma;
> +	rmap_item->pgoff = linear_page_index(vma, rmap_item->address);
>  	get_anon_vma(vma->anon_vma);
>  out:
>  	mmap_read_unlock(mm);
> --
> 2.43.0
>
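
As a rough illustration of what the linear_page_index() call in the last
hunk computes, and why the result can differ from addr >> PAGE_SHIFT once
a VMA has been moved, here is a small userspace sketch. Everything in it
is an assumption for illustration: struct vma_sketch,
linear_page_index_sketch(), the 4 KiB PAGE_SHIFT and the example
addresses are made up, and it assumes the moved VMA keeps its original
vm_pgoff.

```c
#include <assert.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

typedef unsigned long pgoff_t;

/* Hypothetical cut-down VMA: only the fields the calculation needs. */
struct vma_sketch {
	unsigned long vm_start; /* first address of the mapping */
	unsigned long vm_end;   /* first address past the mapping */
	pgoff_t vm_pgoff;       /* page offset that vm_start corresponds to */
};

/* Same arithmetic as linear_page_index() for a non-hugetlb VMA. */
static pgoff_t linear_page_index_sketch(const struct vma_sketch *vma,
					unsigned long address)
{
	return ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}

/* A never-moved anon VMA: vm_pgoff matches vm_start >> PAGE_SHIFT. */
static const struct vma_sketch fresh = {
	.vm_start = 0x10000000UL,
	.vm_end   = 0x10004000UL,
	.vm_pgoff = 0x10000000UL >> PAGE_SHIFT,
};

/* The same VMA after a move: new addresses, old vm_pgoff (assumed kept). */
static const struct vma_sketch moved = {
	.vm_start = 0x20000000UL,
	.vm_end   = 0x20004000UL,
	.vm_pgoff = 0x10000000UL >> PAGE_SHIFT,
};
```

For the fresh VMA the index of an address equals address >> PAGE_SHIFT.
For the moved one it does not, which is the case where a pgoff recorded
at merge time would carry information the address alone cannot.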
> --
> Cheers,
>
> David
Cheers, Lorenzo