All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konstantin Khlebnikov <khlebnikov@openvz.org>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/3] mm: adjust rss counters for migration entiries
Date: Tue, 24 Jan 2012 11:08:13 +0400	[thread overview]
Message-ID: <4F1E58DD.6030607@openvz.org> (raw)
In-Reply-To: <alpine.LSU.2.00.1201231719580.14979@eggly.anvils>

Hugh Dickins wrote:
> On Wed, 18 Jan 2012, Andrew Morton wrote:
>> On Wed, 11 Jan 2012 17:41:26 +0900
>> KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>  wrote:
>>> On Wed, 11 Jan 2012 12:23:11 +0400
>>> Konstantin Khlebnikov<khlebnikov@openvz.org>  wrote:
>
> I only just got around to looking at these, sorry.
>
>>
>> Putting "fix" in the patch title text is a good way of handling this.
>>
>> I renamed [3/3] to "mm: fix rss count leakage during migration" and
>> shall queue it for 3.3.  If people think we should also backport it
>> into -stable then please let me know.
>
> I don't think it needs backporting to stable: unless I'm forgetting
> something, the only thing that actually uses these rss counters is the
> OOM killer, and I don't think that will be greatly affected by the bug.
>

Yes, there are only counters. But, I can imagine local DoS attack via this race.

>>
>> I reordered the patches and worked the chagnelogs quite a bit.  I now
>> have:
>>
>> : From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Subject: mm: fix rss count leakage during migration
>> :
>> : Memory migration fills a pte with a migration entry and it doesn't update
>> : the rss counters.  Then it replaces the migration entry with the new page
>> : (or the old one if migration failed).  But between these two passes this
>> : pte can be unmaped, or a task can fork a child and it will get a copy of
>> : this migration entry.  Nobody accounts for this in the rss counters.
>> :
>> : This patch properly adjust rss counters for migration entries in
>> : zap_pte_range() and copy_one_pte().  Thus we avoid extra atomic operations
>> : on the migration fast-path.
>> :
>> : Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Cc: Hugh Dickins<hughd@google.com>
>> : Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>
> That was a good find, Konstantin: thank you.
>
>>
>> and
>>
>> : From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Subject: mm: add rss counters consistency check
>> :
>> : Warn about non-zero rss counters at final mmdrop.
>> :
>> : This check will prevent reoccurences of bugs such as that fixed in "mm:
>> : fix rss count leakage during migration".
>> :
>> : I didn't hide this check under CONFIG_VM_DEBUG because it rather small and
>> : rss counters cover whole page-table management, so this is a good
>> : invariant.
>> :
>> : Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Cc: Hugh Dickins<hughd@google.com>
>> : Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>
> I'd be happier with this one if you do hide the check under
> CONFIG_VM_DEBUG - or even under CONFIG_DEBUG_VM if you want it to
> be compiled in sometimes ;)  I suppose NR_MM_COUNTERS is only 3,
> so it isn't a huge overhead; but I think you're overestimating the
> importance of these counters, and it would look better under DEBUG_VM.

Theoretically, some drivers can touch page tables,
for example if they do that outside of vma we can get some kind of strange memory leaks.

>
>>
>> and
>>
>> : From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Subject: mm: postpone migrated page mapping reset
>> :
>> : Postpone resetting page->mapping until the final remove_migration_ptes().
>> : Otherwise the expression PageAnon(migration_entry_to_page(entry)) does not
>> : work.
>> :
>> : Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Cc: Hugh Dickins<hughd@google.com>
>> : Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>
> Isn't this one actually an essential part of the fix?  It should have
> been part of the same patch, but you split them apart, now Andrew has
> reordered them and pushed one part to 3.3, but this needs to go in too?
>

Oops. I missed that. Yes. race-fix does not work for anon-memory without that patch.
But this is non-fatal, there are no new bugs.

> Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Konstantin Khlebnikov <khlebnikov@openvz.org>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/3] mm: adjust rss counters for migration entiries
Date: Tue, 24 Jan 2012 11:08:13 +0400	[thread overview]
Message-ID: <4F1E58DD.6030607@openvz.org> (raw)
In-Reply-To: <alpine.LSU.2.00.1201231719580.14979@eggly.anvils>

Hugh Dickins wrote:
> On Wed, 18 Jan 2012, Andrew Morton wrote:
>> On Wed, 11 Jan 2012 17:41:26 +0900
>> KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>  wrote:
>>> On Wed, 11 Jan 2012 12:23:11 +0400
>>> Konstantin Khlebnikov<khlebnikov@openvz.org>  wrote:
>
> I only just got around to looking at these, sorry.
>
>>
>> Putting "fix" in the patch title text is a good way of handling this.
>>
>> I renamed [3/3] to "mm: fix rss count leakage during migration" and
>> shall queue it for 3.3.  If people think we should also backport it
>> into -stable then please let me know.
>
> I don't think it needs backporting to stable: unless I'm forgetting
> something, the only thing that actually uses these rss counters is the
> OOM killer, and I don't think that will be greatly affected by the bug.
>

Yes, there are only counters. But, I can imagine local DoS attack via this race.

>>
>> I reordered the patches and worked the chagnelogs quite a bit.  I now
>> have:
>>
>> : From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Subject: mm: fix rss count leakage during migration
>> :
>> : Memory migration fills a pte with a migration entry and it doesn't update
>> : the rss counters.  Then it replaces the migration entry with the new page
>> : (or the old one if migration failed).  But between these two passes this
>> : pte can be unmaped, or a task can fork a child and it will get a copy of
>> : this migration entry.  Nobody accounts for this in the rss counters.
>> :
>> : This patch properly adjust rss counters for migration entries in
>> : zap_pte_range() and copy_one_pte().  Thus we avoid extra atomic operations
>> : on the migration fast-path.
>> :
>> : Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Cc: Hugh Dickins<hughd@google.com>
>> : Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>
> That was a good find, Konstantin: thank you.
>
>>
>> and
>>
>> : From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Subject: mm: add rss counters consistency check
>> :
>> : Warn about non-zero rss counters at final mmdrop.
>> :
>> : This check will prevent reoccurences of bugs such as that fixed in "mm:
>> : fix rss count leakage during migration".
>> :
>> : I didn't hide this check under CONFIG_VM_DEBUG because it rather small and
>> : rss counters cover whole page-table management, so this is a good
>> : invariant.
>> :
>> : Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Cc: Hugh Dickins<hughd@google.com>
>> : Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>
> I'd be happier with this one if you do hide the check under
> CONFIG_VM_DEBUG - or even under CONFIG_DEBUG_VM if you want it to
> be compiled in sometimes ;)  I suppose NR_MM_COUNTERS is only 3,
> so it isn't a huge overhead; but I think you're overestimating the
> importance of these counters, and it would look better under DEBUG_VM.

Theoretically, some drivers can touch page tables,
for example if they do that outside of vma we can get some kind of strange memory leaks.

>
>>
>> and
>>
>> : From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Subject: mm: postpone migrated page mapping reset
>> :
>> : Postpone resetting page->mapping until the final remove_migration_ptes().
>> : Otherwise the expression PageAnon(migration_entry_to_page(entry)) does not
>> : work.
>> :
>> : Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Cc: Hugh Dickins<hughd@google.com>
>> : Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>
> Isn't this one actually an essential part of the fix?  It should have
> been part of the same patch, but you split them apart, now Andrew has
> reordered them and pushed one part to 3.3, but this needs to go in too?
>

Oops. I missed that. Yes. race-fix does not work for anon-memory without that patch.
But this is non-fatal, there are no new bugs.

> Hugh


  reply	other threads:[~2012-01-24  7:08 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-06 17:38 [PATCH 1/3] mm: add rss counters consistency check Konstantin Khlebnikov
2012-01-06 17:38 ` Konstantin Khlebnikov
2012-01-06 17:38 ` [PATCH 2/3] mm: postpone migrated page mapping reset Konstantin Khlebnikov
2012-01-06 17:38   ` Konstantin Khlebnikov
2012-01-06 17:38 ` [PATCH 3/3] mm: adjust rss counters for migration entiries Konstantin Khlebnikov
2012-01-06 17:38   ` Konstantin Khlebnikov
2012-01-06 17:45   ` Konstantin Khlebnikov
2012-01-11  5:41   ` KAMEZAWA Hiroyuki
2012-01-11  5:41     ` KAMEZAWA Hiroyuki
2012-01-11  8:23     ` Konstantin Khlebnikov
2012-01-11  8:23       ` Konstantin Khlebnikov
2012-01-11  8:41       ` KAMEZAWA Hiroyuki
2012-01-11  8:41         ` KAMEZAWA Hiroyuki
2012-01-18 23:21         ` Andrew Morton
2012-01-18 23:21           ` Andrew Morton
2012-01-24  1:42           ` Hugh Dickins
2012-01-24  1:42             ` Hugh Dickins
2012-01-24  7:08             ` Konstantin Khlebnikov [this message]
2012-01-24  7:08               ` Konstantin Khlebnikov
2012-01-25 23:01               ` Hugh Dickins
2012-01-25 23:01                 ` Hugh Dickins
2012-01-26  0:20                 ` Andrew Morton
2012-01-26  0:20                   ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F1E58DD.6030607@openvz.org \
    --to=khlebnikov@openvz.org \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.