linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Izik Eidus <ieidus@redhat.com>, Rik van Riel <riel@redhat.com>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	chrisw@redhat.com, alan@lxorguk.ukuu.org.uk, device@lanana.org,
	linux-mm@kvack.org, nickpiggin@yahoo.com.au
Subject: Re: [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl.
Date: Wed, 6 May 2009 15:34:34 +0200	[thread overview]
Message-ID: <20090506133434.GX16078@random.random> (raw)
In-Reply-To: <Pine.LNX.4.64.0905061110470.3519@blonde.anvils>

On Wed, May 06, 2009 at 12:16:52PM +0100, Hugh Dickins wrote:
> I'm very much with those who suggested an madvise(), for which Chris
> prepared a patch.  I know Andrea felt uneasy with an madvise() going
> to a possibly-configured-out-or-never-loaded module, but it is just
> advice, so I don't have a problem with that myself, so long as it
> is documented in the manpage.

I don't have so much of a problem with that, but there are a couple of
differences: normally madvise doesn't depend on the admin to start
some kernel thread to be meaningful, and normally madvise isn't a
privileged operation, see below.

> Whereas I do worry just what capability should be required for this:
> can't a greedy app simply fork itself, touch all its pages, and thus
> lock itself into memory in this way?  And I do worry about the cpu

KSM pages are supposed to be swappable in the long run so let's think
longer term.

> cost of all the scanning, if it were to get used more generally -
> it would be a pity if we just advised complainers to tune it out.

Clearly if tons of apps maliciously register themself in ksm, they'll
waste tons of CPU for no good, they'll just populate the unstable tree
with pages that are all equal except for the last 4 bytes slowing down
KSM for nothing. This is also why it's good to have a /dev/ksm ioctl
that the admin can allow only certain users to use for registering
virtual ranges (for example only the kvm/qemu user or all users in
scientific environments). Otherwise we'd need some kind of permissions
settings in sysfs with some API that certainly is less intuitive than
chown/chmod on /dev/ksm. We just can't allow madvise to succeed on any
luser registering itself in KSM, so if it was madvise, it shall return
-EPERM somehow sometime.

> I'm still working my way through ksm.c, and not gone back to look at
> Chris's madvise patch, but doubt it will be sufficient.  There's an
> interesting difference between what you're doing in ksm.c, and how
> madvise usually behaves, regarding unmapped areas: madvice doesn't
> usually apply to an unmapped area, and goes away with an area when
> it is unmapped; whereas in KSM's case, the advice applies to whatever
> happens to get mapped in the area specified, persisting across unmaps.

Given the apps using KSM tends to be quite special, the fact it's
sticky, it doesn't go away with munmap isn't big deal, quite to the
contrary those apps will likely have an easier time thanks to the
registration not going away over munmap/mmap, without requiring
reloading of malloc/new calls.

To skip over holes during virtual scans we just vma->vm_next.

> But I do appreciate the separation you've kept so far,
> and wouldn't want to tie it all together too closely.

The above plus the fact it remains self contained without making the
VM any more complicated, gives some value. Even swapping I'd like to
add it without VM specific knowledge about KSM. tmpfs has an easier
time because it has its own vma type, here we've KSM pages mixed
inside regular anonymous !vma->vm_file regions and !vm_ops.

> p.s.  I wish you'd chosen different name than KSM - the kernel
> has supported shared memory for many years - and notice ksm.c itself
> says "Memory merging driver".  "Merge" would indeed have been a less
> ambiguous term than "Share", but I think too late to change that now
> - except possibly in the MADV_ flag names?

I don't actually care about names, so I leave this to other to discuss.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2009-05-06 13:34 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-04 22:25 [PATCH 0/6] ksm changes (v2) Izik Eidus
2009-05-04 22:25 ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Izik Eidus
2009-05-04 22:25   ` [PATCH 2/6] ksm: dont allow overlap memory addresses registrations Izik Eidus
2009-05-04 22:25     ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Izik Eidus
2009-05-04 22:25       ` [PATCH 4/6] ksm: change the prot handling to use the generic helper functions Izik Eidus
2009-05-04 22:25         ` [PATCH 5/6] ksm: build system make it compile for all archs Izik Eidus
2009-05-04 22:25           ` [PATCH 6/6] ksm: use another miscdevice minor number Izik Eidus
2009-05-06  0:55             ` Rik van Riel
2009-05-06  0:54           ` [PATCH 5/6] ksm: build system make it compile for all archs Rik van Riel
2009-05-06  0:54         ` [PATCH 4/6] ksm: change the prot handling to use the generic helper functions Rik van Riel
2009-05-06  0:53       ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Rik van Riel
2009-05-06  8:38         ` Izik Eidus
2009-05-06 11:16           ` Hugh Dickins
2009-05-06 13:34             ` Andrea Arcangeli [this message]
2009-05-06 13:56               ` Izik Eidus
2009-05-06 16:41                 ` Hugh Dickins
2009-05-06 16:49                   ` Chris Wright
2009-05-06 16:57                     ` Hugh Dickins
2009-05-06 17:47                       ` Andrea Arcangeli
2009-05-06 16:59                     ` Izik Eidus
2009-05-07 11:31                     ` Andrea Arcangeli
2009-05-07 13:13                       ` Hugh Dickins
2009-05-07 13:23                         ` Andrea Arcangeli
2009-05-06 14:25               ` Hugh Dickins
2009-05-06 14:45                 ` Andrea Arcangeli
2009-05-06 15:36                   ` Chris Wright
2009-05-06 15:27             ` Izik Eidus
2009-05-06 16:14               ` Chris Wright
2009-05-06 16:36                 ` Hugh Dickins
2009-05-06 17:09                   ` Chris Wright
2009-05-06 17:54                     ` Hugh Dickins
2009-05-06 16:26               ` Hugh Dickins
2009-05-06 16:58                 ` Izik Eidus
2009-05-06 23:59                   ` Chris Wright
2009-05-07  2:41                     ` Rik van Riel
2009-05-06  0:43     ` [PATCH 2/6] ksm: dont allow overlap memory addresses registrations Rik van Riel
2009-05-06  9:46       ` Izik Eidus
2009-05-06 12:26         ` Rik van Riel
2009-05-06 12:39           ` Izik Eidus
2009-05-06 13:17           ` Andrea Arcangeli
2009-05-06 13:28             ` Hugh Dickins
2009-05-06 14:02               ` Izik Eidus
2009-05-06 17:11                 ` Hugh Dickins
2009-05-06 14:09               ` Andrea Arcangeli
2009-05-06 14:21                 ` Alan Cox
2009-05-06 14:46                   ` Hugh Dickins
2009-05-06 14:56                     ` Andrea Arcangeli
2009-05-06 23:55                       ` Minchan Kim
2009-05-07  0:19                         ` Chris Wright
2009-05-07 10:46                         ` Andrea Arcangeli
2009-05-07 12:01                           ` Minchan Kim
2009-05-06 14:57                     ` Izik Eidus
2009-05-06  0:40   ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Rik van Riel
  -- strict thread matches above, loose matches on Subject: below --
2009-05-02 22:16 [PATCH 0/6] ksm changes Izik Eidus
2009-05-02 22:16 ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Izik Eidus
2009-05-02 22:16   ` [PATCH 2/6] ksm: dont allow overlap memory addresses registrations Izik Eidus
2009-05-02 22:16     ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Izik Eidus
2009-05-04 19:43       ` Hugh Dickins
2009-05-04 20:37         ` Izik Eidus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090506133434.GX16078@random.random \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=chrisw@redhat.com \
    --cc=device@lanana.org \
    --cc=hugh@veritas.com \
    --cc=ieidus@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).