From: Oleg Nesterov <oleg@redhat.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Balbir Singh <balbir@linux.vnet.ibm.com>, Jay Lan <jlan@sgi.com>,
Jiri Pirko <jpirko@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] introduce get_mm_hiwater_xxx(), fix taskstats->hiwater_xxx accounting
Date: Thu, 4 Dec 2008 14:52:50 +0100 [thread overview]
Message-ID: <20081204135250.GA5448@redhat.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0812031957570.19174@blonde.anvils>
On 12/03, Hugh Dickins wrote:
>
> On Wed, 3 Dec 2008, Oleg Nesterov wrote:
>
> > Unless we are going to decrease rss/vm there is no point to call the
> > (racy) update_hiwater_xxx() helpers. Still do_exit() does this, and
>
> I'm puzzled by this comment. exit() _is_ about to decrease rss/vm,
> so isn't it right to be calling update_hiwater_xxx()?
Do you mean exit_mm()->...->exit_mmap() ? But this doesn't matter, this
->mm is going away. Nobody can read these counters when ->mm_users == 0,
no?
> There is a question of who's going to be able to see the result from
> this point on: I forget whether I was doing it for my own satisfaction,
> or for a real observer. Even if there isn't a real observer today,
> I think I'd prefer do_exit() to continue to update_hiwater_xxx(),
> in case an observer is added tomorrow - unless you feel it's
> unjustifiably adding code to and slowing down process exit.
Please see below,
> You say "(racy)": in my view, it was only as racy as whatever might
> cause it to be racy. By that, I mean that if the numbers ended up
> slightly wrong, you could reasonably imagine that the races happened
> in a different sequence which would have ended up with the numbers
> seen. Have you noticed something more serious we need to fix?
But the difference can be huge. Let's suppose the process has 2 threads
T1 and T2.
T1 exits, calls update_hiwater_vm(), notices that mm->hiwater_vm must be
updated, and preempted right before mm->hiwater_vm = new_value.
T2 does free(malloc(A_LOT)) and exits. It sets the correct value for
->hiwater_vm which takes A_LOT into account and disappears.
T1 resumes, and "reverts" ->hiwater_vm to the "new_value" which was
calculated before.
Now, since T1 is the last exiting thread, the whole thread group
exits and we report the wrong ->hiwater_vm to the userspace.
Yes, this race is unlikely. But there is another reason (perhaps
not very good) why I tried to remove update_hiwater_xxx() from
do_exit().
Imho this code looks as if: from now it is "safe" to use ->hiwater_xxx
directly because we already updated it. But it is not. Unless we are
the last thread, we can't trust mm->hiwater_xxx anyway, we should
re-check get_mm_rss/total_vm.
> > Introduce get_mm_hiwater_rss() and get_mm_hiwater_vm() to use instead,
> > and kill the "if (tsk->mm) {}" code in do_exit().
>
> If you're going to add special helper macros (I don't care myself),
> wouldn't it be better to convert fs/proc/task_mmu.c (the original
> consumer) to use them too?
Yes, I was going to convert task_mem(), but noticed that it has
to read get_mm_rss() and ->total_vm anyway. Still, perhaps it
would be more clean to use the new macros anyway, even if this
wiil (unlikely) need a couple of extra cpu ticks.
> And, as I say, I'd _prefer_ that block to remain in do_exit(),
> but don't have strong evidence why it should.
I think that update_hiwater_xxx() should be "private" for vm code which
does zap/unmap, and any observer should use get_mm_hiwater_xxx(). Nobody
should use mm->hiwater_xxx directly.
But I don't (and of course can't) have a strong opinion on that,
will wait for your verdict and re-send.
The patch need the update anyway, I just noticed that if we remove
update_hiwater_xxx() from do_exit(), we should change the comment
in exit_mmap().
> > The first helper will
> > be also used to actually fill/report rusage->ru_maxrss.
>
> Oh, yes, I noticed a mail yesterday in which you claimed to Cc me,
> but didn't (like we all claim to be attaching missing patches ;)
Yes sorry ;) The only problem with mutt is that it is not possible
to change CC while editing the text (unless edit_headers is set).
Oleg.
next prev parent reply other threads:[~2008-12-04 14:00 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-12-03 18:12 [PATCH] introduce get_mm_hiwater_xxx(), fix taskstats->hiwater_xxx accounting Oleg Nesterov
2008-12-03 20:24 ` Hugh Dickins
2008-12-04 13:52 ` Oleg Nesterov [this message]
2008-12-05 12:54 ` Hugh Dickins
2008-12-05 18:41 ` Jay Lan
2008-12-06 8:42 ` Balbir Singh
2008-12-06 9:56 ` Hugh Dickins
2008-12-06 10:39 ` Balbir Singh
2008-12-07 16:17 ` Oleg Nesterov
2008-12-07 16:58 ` Balbir Singh
2008-12-07 17:28 ` Oleg Nesterov
2008-12-07 17:39 ` Balbir Singh
2008-12-06 15:26 ` KOSAKI Motohiro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081204135250.GA5448@redhat.com \
--to=oleg@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=balbir@linux.vnet.ibm.com \
--cc=hugh@veritas.com \
--cc=jlan@sgi.com \
--cc=jpirko@redhat.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox