From: Oleg Nesterov <oleg@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>,
akpm@linux-foundation.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
khlebnikov@openvz.org, hughd@google.com,
kamezawa.hiroyu@jp.fujitsu.com, stable@vger.kernel.org
Subject: Re: [patch 12/12] mm: correctly synchronize rss-counters at exit/exec
Date: Fri, 8 Jun 2012 14:18:16 +0200
Message-ID: <20120608121816.GA23147@redhat.com>
In-Reply-To: <CA+55aFwuA3ex+XXW+TzOee8ax0g1NK9Mm5F3nYtY1m6YtvUFhQ@mail.gmail.com>
On 06/07, Linus Torvalds wrote:
>
> It does totally insane things in xacct_add_tsk(). You can't call
> "sync_mm_rss(mm)" on somebody elses mm,

Damn, I am stupid. Yes, I forgot about fill_stats_for_pid().
And I didn't bother to look at get_task_mm() which clearly
shows that this tsk can be !current.

We can add the "p == current" check as Hugh suggested.
But,

> Doing it
> *anywhere* where mm is not clearly "current->mm" is wrong.

Agreed.

How about v2? It adds sync_mm_rss() into taskstats_exit(). Note
that it preserves the "tsk->mm != NULL" check we currently have.
I think it should be removed (see the changelog), but even if I
am right I'd prefer to do this in a separate patch.

------------------------------------------------------------------------------
Subject: [PATCH] correctly synchronize rss-counters at exit/exec

A simplified version of Konstantin Khlebnikov's patch.

do_exit() and exec_mmap() call sync_mm_rss() before mm_release()
does put_user(clear_child_tid), which can update task->rss_stat
and thus make mm->rss_stat inconsistent. This triggers the "BUG:"
printk in check_mm().

- Move the final sync_mm_rss() from do_exit() to exit_mm(), and
  change exec_mmap() to call sync_mm_rss() after mm_release() to
  make check_mm() happy.

  Perhaps we should simply move it into mm_release() and call it
  unconditionally to catch the "task->rss_stat != 0 && !task->mm"
  bugs.

- Since taskstats_exit() is called before exit_mm(), add another
  sync_mm_rss() into taskstats_exit() for xacct_add_tsk(), which
  actually uses rss_stat. As Linus pointed out, it is not sane
  to move it into xacct_add_tsk().

  Probably we should also shift acct_update_integrals(), and the
  "tsk->mm != NULL" check looks equally unneeded.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 fs/exec.c          |    2 +-
 kernel/exit.c      |    5 ++---
 kernel/taskstats.c |    2 ++
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index a79786a..da27b91 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -819,10 +819,10 @@ static int exec_mmap(struct mm_struct *mm)
 	/* Notify parent that we're no longer interested in the old VM */
 	tsk = current;
 	old_mm = current->mm;
-	sync_mm_rss(old_mm);
 	mm_release(tsk, old_mm);
 	if (old_mm) {
+		sync_mm_rss(old_mm);
 		/*
 		 * Make sure that if there is a core dump in progress
 		 * for the old mm, we get out and die instead of going
diff --git a/kernel/exit.c b/kernel/exit.c
index 0e40041..38c4a91 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -655,6 +655,8 @@ static void exit_mm(struct task_struct * tsk)
 	mm_release(tsk, mm);
 	if (!mm)
 		return;
+
+	sync_mm_rss(mm);
 	/*
 	 * Serialize with any possible pending coredump.
 	 * We must hold mmap_sem around checking core_state
@@ -966,9 +968,6 @@ void do_exit(long code)
 				preempt_count());
 	acct_update_integrals(tsk);
-	/* sync mm's RSS info before statistics gathering */
-	if (tsk->mm)
-		sync_mm_rss(tsk->mm);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
 		hrtimer_cancel(&tsk->signal->real_timer);
diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index e660464..55d1103 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -630,6 +630,8 @@ void taskstats_exit(struct task_struct *tsk, int group_dead)
 	if (!stats)
 		goto err;
+	if (tsk->mm)
+		sync_mm_rss(tsk->mm);
 	fill_stats(tsk, stats);
 	/*
--
1.5.5.1