From: David Hildenbrand <david@redhat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
Pedro Falcato <pfalcato@suse.de>, Xu Xin <xu.xin16@zte.com.cn>,
Chengming Zhou <chengming.zhou@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 3/4] mm: prevent KSM from completely breaking VMA merging
Date: Mon, 19 May 2025 20:00:29 +0200 [thread overview]
Message-ID: <e5d0b98f-6d9c-4409-82cd-7d23dc7c3bda@redhat.com> (raw)
In-Reply-To: <418d3edbec3a718a7023f1beed5478f5952fc3df.1747431920.git.lorenzo.stoakes@oracle.com>
On 19.05.25 10:51, Lorenzo Stoakes wrote:
> If a user wishes to enable KSM mergeability for an entire process and all
> fork/exec'd processes that come after it, they use the prctl()
> PR_SET_MEMORY_MERGE operation.
>
> This defaults all newly mapped VMAs to have the VM_MERGEABLE VMA flag set
> (in order to indicate they are KSM mergeable), as well as setting this flag
> for all existing VMAs.
>
> However it also entirely and completely breaks VMA merging for the process
> and all forked (and fork/exec'd) processes.
>
> This is because when a new mapping is proposed, the flags specified will
> never have VM_MERGEABLE set. However all adjacent VMAs will already have
> VM_MERGEABLE set, rendering VMAs unmergeable by default.
>
> To work around this, we try to set the VM_MERGEABLE flag prior to
> attempting a merge. In the case of brk() this can always be done.
>
> However on mmap() things are more complicated - while KSM is not supported
> for file-backed mappings, it is supported for MAP_PRIVATE file-backed
> mappings.
>
> And these mappings may have deprecated .mmap() callbacks specified which
> could, in theory, adjust flags and thus KSM merge eligiblity.
>
> So we check to determine whether this at all possible. If not, we set
> VM_MERGEABLE prior to the merge attempt on mmap(), otherwise we retain the
> previous behaviour.
>
> When .mmap_prepare() is more widely used, we can remove this precaution.
>
> While this doesn't quite cover all cases, it covers a great many (all
> anonymous memory, for instance), meaning we should already see a
> significant improvement in VMA mergeability.
We should add a Fixes: tag.
CCing stable is likely not a good idea at this point (and might be
rather hairy).
[...]
> /**
> - * ksm_add_vma - Mark vma as mergeable if compatible
> + * ksm_vma_flags - Update VMA flags to mark as mergeable if compatible
> *
> - * @vma: Pointer to vma
> + * @mm: Proposed VMA's mm_struct
> + * @file: Proposed VMA's file-backed mapping, if any.
> + * @vm_flags: Proposed VMA"s flags.
> + *
> + * Returns: @vm_flags possibly updated to mark mergeable.
> */
> -void ksm_add_vma(struct vm_area_struct *vma)
> +vm_flags_t ksm_vma_flags(const struct mm_struct *mm, const struct file *file,
> + vm_flags_t vm_flags)
> {
> - struct mm_struct *mm = vma->vm_mm;
> + vm_flags_t ret = vm_flags;
>
> - if (test_bit(MMF_VM_MERGE_ANY, &mm->flags))
> - __ksm_add_vma(vma);
> + if (test_bit(MMF_VM_MERGE_ANY, &mm->flags) &&
> + __ksm_should_add_vma(file, vm_flags))
> + ret |= VM_MERGEABLE;
> +
> + return ret;
> }
No need for ret without harming readability.
if ()
vm_flags |= VM_MERGEABLE
return vm_flags;
[...]
> +/*
> + * Are we guaranteed no driver can change state such as to preclude KSM merging?
> + * If so, let's set the KSM mergeable flag early so we don't break VMA merging.
> + *
> + * This is applicable when PR_SET_MEMORY_MERGE has been set on the mm_struct via
> + * prctl() causing newly mapped VMAs to have the KSM mergeable VMA flag set.
> + *
> + * If this is not the case, then we set the flag after considering mergeability,
> + * which will prevent mergeability as, when PR_SET_MEMORY_MERGE is set, a new
> + * VMA will not have the KSM mergeability VMA flag set, but all other VMAs will,
> + * preventing any merge.
Hmmm, so an ordinary MAP_PRIVATE of any file (executable etc.) will get
VM_MERGEABLE set but not be able to merge?
Probably these are not often expected to be merged ...
Preventing merging should really only happen because of VMA flags that
are getting set: VM_PFNMAP, VM_MIXEDMAP, VM_DONTEXPAND, VM_IO.
I am not 100% sure why we bail out on special mappings: all we have to
do is reliably identify anon pages, and we should be able to do that.
GUP does currently refuses any VM_PFNMAP | VM_IO, and KSM uses GUP,
which might need a tweak then (maybe the solution could be to ... not
use GUP but a folio_walk).
So, assuming we could remove the VM_PFNMAP | VM_IO | VM_DONTEXPAND |
VM_MIXEDMAP constraint from vma_ksm_compatible(), could we simplify?
That is: the other ones must not really be updated during mmap(), right?
(in particular: VM_SHARED | VM_MAYSHARE | VM_HUGETLB | VM_DROPPABLE)
Have to double-check VM_SAO and VM_SPARC_ADI.
> + */
> +static bool can_set_ksm_flags_early(struct mmap_state *map)
> +{
> + struct file *file = map->file;
> +
> + /* Anonymous mappings have no driver which can change them. */
> + if (!file)
> + return true;
> +
> + /* shmem is safe. */
> + if (shmem_file(file))
> + return true;
> +
> + /*
> + * If .mmap_prepare() is specified, then the driver will have already
> + * manipulated state prior to updating KSM flags.
> + */
> + if (file->f_op->mmap_prepare)
> + return true;
> +
> + return false;
> +}
So, long-term (mmap_prepare) this function will essentially go away?
Nothing jumped at me, this definitely improves the situation.
--
Cheers,
David / dhildenb
next prev parent reply other threads:[~2025-05-19 18:00 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-19 8:51 [PATCH 0/4] mm: ksm: prevent KSM from entirely breaking VMA merging Lorenzo Stoakes
2025-05-19 8:51 ` [PATCH 1/4] mm: ksm: have KSM VMA checks not require a VMA pointer Lorenzo Stoakes
2025-05-19 17:40 ` David Hildenbrand
2025-05-20 3:14 ` Chengming Zhou
2025-05-19 8:51 ` [PATCH 2/4] mm: ksm: refer to special VMAs via VM_SPECIAL in ksm_compatible() Lorenzo Stoakes
2025-05-19 17:41 ` David Hildenbrand
2025-05-20 3:15 ` Chengming Zhou
2025-05-19 8:51 ` [PATCH 3/4] mm: prevent KSM from completely breaking VMA merging Lorenzo Stoakes
2025-05-19 13:08 ` Chengming Zhou
2025-05-19 13:13 ` Lorenzo Stoakes
2025-05-19 13:19 ` kernel test robot
2025-05-19 13:36 ` Lorenzo Stoakes
2025-05-19 18:00 ` David Hildenbrand [this message]
2025-05-19 18:04 ` David Hildenbrand
2025-05-19 19:02 ` Lorenzo Stoakes
2025-05-19 19:11 ` David Hildenbrand
2025-05-19 19:26 ` Lorenzo Stoakes
2025-05-19 19:29 ` David Hildenbrand
2025-05-19 18:52 ` Lorenzo Stoakes
2025-05-19 18:59 ` David Hildenbrand
2025-05-19 19:14 ` Lorenzo Stoakes
2025-05-19 19:18 ` Lorenzo Stoakes
2025-05-19 19:28 ` David Hildenbrand
2025-05-19 21:57 ` Andrew Morton
2025-05-20 5:25 ` Lorenzo Stoakes
2025-05-20 3:55 ` Chengming Zhou
2025-05-20 5:24 ` Lorenzo Stoakes
2025-05-19 8:51 ` [PATCH 4/4] tools/testing/selftests: add VMA merge tests for KSM merge Lorenzo Stoakes
2025-05-21 8:07 ` Chengming Zhou
2025-05-21 8:10 ` Lorenzo Stoakes
2025-05-19 11:53 ` [PATCH 0/4] mm: ksm: prevent KSM from entirely breaking VMA merging David Hildenbrand
2025-05-19 11:56 ` Lorenzo Stoakes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e5d0b98f-6d9c-4409-82cd-7d23dc7c3bda@redhat.com \
--to=david@redhat.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=chengming.zhou@linux.dev \
--cc=jack@suse.cz \
--cc=jannh@google.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=pfalcato@suse.de \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
--cc=xu.xin16@zte.com.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).