From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@suse.com>
Cc: cgel.zte@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, vbabka@suse.cz, minchan@kernel.org,
	oleksandr@redhat.com, xu xin <xu.xin16@zte.com.cn>,
	Jann Horn <jannh@google.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH linux-next] mm/madvise: allow KSM hints for process_madvise
Date: Fri, 1 Jul 2022 21:12:56 +0200
Message-ID: <11d28e6d-edb0-7d11-b476-c5808f3b7c5d@redhat.com>
In-Reply-To: <Yr70ZwUAIHNz5VNy@dhcp22.suse.cz>

On 01.07.22 15:19, Michal Hocko wrote:
> On Fri 01-07-22 14:39:24, David Hildenbrand wrote:
>>> I am not sure about exact details of the KSM implementation but if that
>>> is not a desirable behavior then it should be handled on the KSM level.
>>> The very same thing can easily happen in a multithreaded (or in general
>>> multi-process with shared mm) environment as well.
>>
>> I don't quite get what you mean.
> 
> I meant to say that if KSM needs to be aware of a special CoW semantic,
> then it should be handled in the KSM layer regardless of whether KSM
> has been enabled by the process itself or by any other process that has
> access to the MM. process_madvise is just another way to access a
> remote MM other than sharing the full MM.

Okay.

KSM has been a corner-case feature that was restricted to well-defined
and well-tested environments. Until recently, R/O pins of KSM pages
were essentially completely unreliable, and applications don't expect
such surprises. The shared zeropage is most probably the last
problematic piece.

Yes, we're getting to the point where it's a real feature that can see
more (forced) widespread use. However, until the known issues in KSM
have been fixed (e.g., below -- there is a whole list of papers
regarding attacks on memory deduplication), it should be limited to
well-defined environments and applications only -- IMHO.

So what I want to express here is that if we're adding an interface that
can be used to just enable KSM on the whole system easily, it might be a
bit too soon for that. No matter what you document, people will ignore it.

OTOH, if this is a real debug feature that will only be available in
specific debug/test scenarios (kernel config? toggle? whatsoever?), then
it's "better". If that is already the case, good.

>  
> [...]
>>> Are you saying that any remote handling of the KSM has to deal with a
>>> pre-existing semantic as well? Are we aware of any existing application
>>> that really uses MADV_UNMERGEABLE in the hope of disabling KSM for any of
>>> its sensitive memory ranges? My understanding is that this is simply an
>>> on/off knob and a remote way to do the same is in line with the existing
>>> API.
>>
>> "its sensitive memory ranges" that's exactly what I am concerned about.
>> There should be a toggle, and existing applications will not be using it.
> 
> The thing is that most applications (are there any?) do not actively
> say that something is not KSM safe, right? They expect they opt in where

They can't. But knowing about stuff like
https://access.redhat.com/security/cve/cve-2021-3714 makes me sure
that there are applications that don't want this force-enabled ever.
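
For illustration only: the opt-in that exists today is madvise() with
MADV_MERGEABLE / MADV_UNMERGEABLE, issued by a process on its own
private anonymous memory. A minimal sketch, assuming Linux and Python
3.8+ (for mmap.madvise()); the helper names set_ksm_hint() and demo()
are made up for this example, and the call simply reports failure on
kernels built without CONFIG_KSM:

```python
import mmap


def set_ksm_hint(region, enable):
    """Ask the kernel to (un)mark the whole mapping as a KSM candidate.

    Returns True if madvise() accepted the hint, False otherwise
    (e.g. EINVAL on a kernel without CONFIG_KSM, or a platform that
    does not define the constants at all).
    """
    name = "MADV_MERGEABLE" if enable else "MADV_UNMERGEABLE"
    advice = getattr(mmap, name, None)  # only defined where the OS has it
    if advice is None:
        return False
    try:
        region.madvise(advice)  # no start/length: covers the whole mapping
    except OSError:
        return False
    return True


def demo():
    # KSM only considers private anonymous memory, so map exactly that.
    region = mmap.mmap(-1, 16 * mmap.PAGESIZE,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        # Fill every page with identical content; these become merge
        # candidates once ksmd is running (/sys/kernel/mm/ksm/run).
        region.write(b"\x5a" * len(region))
        opted_in = set_ksm_hint(region, True)    # opt in
        opted_out = set_ksm_hint(region, False)  # opt out again
        return opted_in, opted_out
    finally:
        region.close()


if __name__ == "__main__":
    print("MADV_MERGEABLE accepted: %s, MADV_UNMERGEABLE accepted: %s" % demo())
```

Nothing in this sketch touches another process. Applying the same
hints remotely through process_madvise() is what the patch under
discussion proposes.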

> it makes sense. So my question is whether any remote way to opt in for
> KSM has to redefine the existing semantic, or whether the same should
> be achieved by sufficient privileges?
> 
> The former would be really hard to apply to the most likely first
> use case - unmodifiable binaries...

Yes, I know. I also don't have a good answer to all of that.

> 
>>> To be completely honest I do not really buy an argument that this might
>>> break something much more than the original application can do already.
>>
>> How can you get a shared zeropage in a private mapping after a previous
>> write if not via KSM?
> 
> I was not referring to KSM specifically here. My recollection is that
> PTRACE_MODE_READ_FSCREDS is quite powerful already.

Ah, you mean process_madvise() permissions.

-- 
Thanks,

David / dhildenb



