From: Oleg Nesterov <oleg@redhat.com>
To: Konstantin Khlebnikov <khlebnikov@openvz.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>,
	LKML <linux-kernel@vger.kernel.org>,
	"markus@trippelsdorf.de" <markus@trippelsdorf.de>,
	"hughd@google.com" <hughd@google.com>,
	"kamezawa.hiroyu@jp.fujitsu.com" <kamezawa.hiroyu@jp.fujitsu.com>,
	Michal Hocko <mhocko@suse.cz>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: 3.4-rc7: BUG: Bad rss-counter state mm:ffff88040b56f800 idx:1 val:-59
Date: Thu, 7 Jun 2012 15:18:48 +0200
Message-ID: <20120607131848.GA19076@redhat.com>
In-Reply-To: <4FD05F75.1050108@openvz.org>

On 06/07, Konstantin Khlebnikov wrote:
>
> Oleg Nesterov wrote:
>>
>> I'll write the changelog and send the patch tomorrow.
>
> Ding! Week is over, or I missed something? )

Pong ;)

I sent the patch on May 31, see
http://marc.info/?l=linux-kernel&m=133848759505805
Also attached below, just in case.

Initially I sent 2 patches, see
http://marc.info/?l=linux-kernel&m=133848784705941
but 2/2 (your patch) was already merged.

-------------------------------------------------------------------------------
[PATCH] correctly synchronize rss-counters at exit/exec

A simplified version of Konstantin Khlebnikov's patch.

do_exit() and exec_mmap() call sync_mm_rss() before mm_release()
does put_user(clear_child_tid), which can update task->rss_stat
and thus leave mm->rss_stat inconsistent. This triggers the "BUG:"
printk in check_mm().

- Move the final sync_mm_rss() from do_exit() to exit_mm(), and
  change exec_mmap() to call sync_mm_rss() after mm_release() to
  make check_mm() happy.

  Perhaps we should simply move it into mm_release() and call it
  unconditionally to catch the "task->rss_stat != 0 && !task->mm"
  bugs.

- Since taskstats_exit() is called before exit_mm(), add another
  sync_mm_rss() into xacct_add_tsk(), which actually uses rss_stat.

  Probably we should also shift acct_update_integrals().

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 fs/exec.c       |    2 +-
 kernel/exit.c   |    5 ++---
 kernel/tsacct.c |    1 +
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 52c9e2f..e49e3c2 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -823,10 +823,10 @@ static int exec_mmap(struct mm_struct *mm)
 	/* Notify parent that we're no longer interested in the old VM */
 	tsk = current;
 	old_mm = current->mm;
-	sync_mm_rss(old_mm);
 	mm_release(tsk, old_mm);
 
 	if (old_mm) {
+		sync_mm_rss(old_mm);
 		/*
 		 * Make sure that if there is a core dump in progress
 		 * for the old mm, we get out and die instead of going
diff --git a/kernel/exit.c b/kernel/exit.c
index ab972a7..b3a84b5 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -655,6 +655,8 @@ static void exit_mm(struct task_struct * tsk)
 	mm_release(tsk, mm);
 	if (!mm)
 		return;
+
+	sync_mm_rss(mm);
 	/*
 	 * Serialize with any possible pending coredump.
 	 * We must hold mmap_sem around checking core_state
@@ -965,9 +967,6 @@ void do_exit(long code)
 				preempt_count());
 
 	acct_update_integrals(tsk);
-	/* sync mm's RSS info before statistics gathering */
-	if (tsk->mm)
-		sync_mm_rss(tsk->mm);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
 		hrtimer_cancel(&tsk->signal->real_timer);
diff --git a/kernel/tsacct.c b/kernel/tsacct.c
index 23b4d78..a64ee90 100644
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -91,6 +91,7 @@ void xacct_add_tsk(struct taskstats *stats, struct task_struct *p)
 	stats->virtmem = p->acct_vm_mem1 * PAGE_SIZE / MB;
 	mm = get_task_mm(p);
 	if (mm) {
+		sync_mm_rss(mm);
 		/* adjust to KB unit */
 		stats->hiwater_rss   = get_mm_hiwater_rss(mm) * PAGE_SIZE / KB;
 		stats->hiwater_vm    = get_mm_hiwater_vm(mm)  * PAGE_SIZE / KB;
-- 
1.5.5.1


