Re: [PATCH] ksm: add vm_stat and meminfo entry to reflect pte mapping to ksm pages

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Nai Xia <nai.xia@gmail.com>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Chris Wright <chrisw@sous-sol.org>,
	Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kernel-janitors@vger.kernel.org
Subject: Re: [PATCH] ksm: add vm_stat and meminfo entry to reflect pte mapping to ksm pages
Date: Sat, 19 Mar 2011 22:55:36 +0800	[thread overview]
Message-ID: <AANLkTimZsGCTJoea-uK9Box5WCXHziGBH7+qFq7yy=PN@mail.gmail.com> (raw)
In-Reply-To: <alpine.LSU.2.00.1103181448100.2092@sister.anvils>

On Sat, Mar 19, 2011 at 6:40 AM, Hugh Dickins <hughd@google.com> wrote:
> On Fri, 18 Mar 2011, Nai Xia wrote:
>> >On Thursday 03 March 2011, at 06:31:42, <Andrew Morton <akpm@linux-foundation.org>> wrote
>> > This patch obviously wasn't tested with CONFIG_KSM=n, which was a
>> > pretty basic patch-testing failure :(
>>
>> Oops, I will be careful to avoid similar mistakes next time.
>>
>> >
>> > I fixed up my tree with the below, but really the amount of ifdeffing
>> > is unacceptable - please find a cleaner way to fix up this patch.
>>
>> Ok, I will have a try in my next patch submit.
>
> A couple of notes on that.
>
> akpm's fixup introduced an #ifdef CONFIG_KSM in mm/ksm.c: that should
> be, er, unnecessary - since ksm.c is only compiled when CONFIG_KSM=y.

This was lately pointed out by me and canceled by another patch in
mm-commits@vger.kernel.org and CCed to your obsolete email
address: hugh.dickins@tiscali.co.uk I think.

>
> And PageKsm(page) evaluates to 0 when CONFIG_KSM is not set, so the
> optimizer should eliminate code from most places without #ifdef:
> though you need to keep the #ifdef around display in /proc/meminfo
> itself, so as not to annoy non-KSM people with an always 0kB line.

This is just what I thought before I introduced NR_KSM_PAGES_SHARING,
which then did break the compiling. My mistake.

>
> But I am uncomfortable with the whole patch.
>
> Can you make a stronger case for it?  KSM is designed to have its own
> cycle, and to keep out of the way of the rest of mm as much as possible
> (not as much as originally hoped, I admit).  Do we really want to show
> its statistics in /proc/meminfo now?  And do we really care that they
> don't keep up with exiting processes when the scan rate is low?

OK, I have to explain, here.
This patch is actually a tiny part of a bunch of code I wrote to improve ksm
in several aspects(This is somewhat off the topic but if you are interested,
please take at look at  http://code.google.com/p/uksm/, It's still on
very early
stage).

In my code, the inconsistency is amplified by non-uniform
scan speed for different VMAs and significantly improved max scan speed.
Then I think this patch may also be helpful to ksm itself.  Just as you said,
I had thought it at least improves the accuracy.

>
> I am not asserting that we don't, nor am I nacking your patch:
> but I would like to hear more support for it, before it adds
> yet another line to our user interface in /proc/meminfo.

Then how about not touching the sexy meminfo and adding a new
interface file in /sys/kernel/mm/ksm/ ?  OK, on condition that the bug
below can be properly solved.

>
> And there is an awkward little bug in your patch, which amplifies
> a more significant and shameful pair of bugs of mine in KSM itself -
> no wonder that I'm anxious about your patch going in!
>
> Your bug is precisely where akpm added the #ifdef in ksm.c.  The
> problem is that page_mapcount() is maintained atomically, generally
> without spinlock or pagelock: so the value of mapcount there, unless
> it is 1, can go up or down racily (as other processes sharing that
> anonymous page fork or unmap at the same time).

You are right,  copy_one_pte does not take page lock. So it's definitely a
bug in my patch, although it did not appear in my tests.  Actually, there is
another issue in my patch: It tries to count all the ptes, while actually only
those changed by ksmd really matter, those added by fork does not mean
memory savings. I had thought not taking the mapcount
, instead, only increase the count by one each time a pte is changed by ksmd,
but It seems also hard to tell a pte mapped to ksm page was previously
changed by ksmd or by fork when it gets unmapped.

So indeed, I have no idea to fix this bug for the time being.


>
> I could hardly complain about that, while suggesting above that more
> approximate numbers are good enough!  Except that, when KSM is turned
> off, there's a chance that you'd be left showing a non-0 kB in
> /proc/meminfo.  Then people will want a fix, and I don't yet know
> what that fix will be.
>
> My first bug is in the break_cow() technique used to get back to
> normal, when merging into a KSM page fails for one reason or another:
> that technique misses other mappings of the page.  I did have a patch
> in progress to fix that a few months ago, but it wasn't quite working,
> and then I realized the second bug: that even when successful, if
> VM_UNMERGEABLE has been used in forked processes, then we could end up
> with a KSM page in a VM_UNMERGEABLE area, which is against the spec.
>
> A solution to all three problems would be to revert to allocating a
> separate KSM page, instead of using one of the pages already there.
> But that feels like a regression, and I don't think anybody is really
> hurting from the current situation, so I've not jumped to fix it yet.
>
> Hugh
>

Yes, I agree on your point. Let's hope there is an efficient and
simple solution.
But for now,  please drop this patch, Andrew.

Nai

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

     prev parent reply	other threads:[~2011-03-19 14:55 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-26 14:56 [PATCH] ksm: add vm_stat and meminfo entry to reflect pte mapping to ksm pages Nai Xia
2011-03-01  2:21 ` Dave Hansen
2011-03-18  6:44   ` xianai
2011-03-01 23:41 ` Andrew Morton
2011-03-18  7:16   ` Nai Xia
2011-03-02 22:31 ` Andrew Morton
2011-03-18  7:29   ` Nai Xia
2011-03-18 22:40     ` Hugh Dickins
2011-03-19 14:55       ` Nai Xia [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='AANLkTimZsGCTJoea-uK9Box5WCXHziGBH7+qFq7yy=PN@mail.gmail.com' \
    --to=nai.xia@gmail.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=chrisw@sous-sol.org \
    --cc=hughd@google.com \
    --cc=kernel-janitors@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).