From: Oleg Nesterov <oleg@redhat.com>
To: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/5] thread_group_cputime: simplify, document the "alive" check
Date: Fri, 11 Jun 2010 18:57:51 +0200 [thread overview]
Message-ID: <20100611165750.GA7019@redhat.com> (raw)
In-Reply-To: <20100611164050.GA19325@dhcp-lab-161.englab.brq.redhat.com>
On 06/11, Stanislaw Gruszka wrote:
>
> On Fri, Jun 11, 2010 at 05:15:33PM +0200, Oleg Nesterov wrote:
> > On 06/11, Stanislaw Gruszka wrote:
> > >
> > > On Fri, Jun 11, 2010 at 01:09:56AM +0200, Oleg Nesterov wrote:
> > > > thread_group_cputime() looks as if it is rcu-safe, but in fact this
> > > > was wrong until ea6d290c which pins task->signal to task_struct.
> > > > It checks ->sighand != NULL under rcu, but this can't help if ->signal
> > > > can go away. Fortunately the caller either holds ->siglock, or it is
> > > > fastpath_timer_check() which uses current and checks exit_state == 0.
> > >
> > > Hmm, I thought we avoided calling thread_group_cputime() from
> > > fastpatch_timer_check(), but seems it is still possible when we
> > > call run_posix_cpu_timers() on two different cpus simultaneously ...
> >
> > No, we can't. thread_group_cputimer() does test-and-set ->running
> > under cputimer->lock.
> >
> > But when I sent these patches, I realized we have another race here
> > (with or without these patches). I am already doing the fix.
>
> Don't know what you catch, I was thinking about:
>
> cpu0 cpu1
>
> fastpath_timer_check():
>
> if (sig->cputimer.running) {
> struct task_cputime group_sample;
> stop_process_timers():
>
> spin_lock_irqsave(&cputimer->lock, flags);
> cputimer->running = 0;
> spin_unlock_irqrestore(&cputimer->lock, flags);
>
> thread_group_cputimer(tsk, &group_sample);
Yes, I was thinking about this race too. Please wait a bit, I'll send
the patch.
In short: it is safe to call thread_group_cputime() lockless, but
thread_group_cputimer() must not be called without siglock/tasklist
(oh, and imho we should rename them somehow, their names are almost
identical). And in fact fastpath_timer_check() does not need
thread_group_cputimer().
> > > > - Since ea6d290c commit tsk->signal is stable, we can read it first
> > > > and avoid the initialization from INIT_CPUTIME.
> > > >
> > > > - Even if tsk->signal is always valid, we still have to check it
> > > > is safe to use next_thread() under rcu_read_lock(). Currently
> > > > the code checks ->sighand != NULL, change it to use pid_alive()
> > > > which is commonly used to ensure the task wasn't unhashed before
> > > > we take rcu_read_lock().
> > >
> > > I'm not sure how important are values of almost dead task, but
> > > perhaps would be better to return times form all threads
> > > using as base sig->curr_target in loop.
> >
> > Could you clarify?
>
> Avoid pid_alive check and loop starting from sig->curr_target:
>
> t = tsk = sig->curr_target;
> do {
> times->utime = cputime_add(times->utime, t->utime);
> times->stime = cputime_add(times->stime, t->stime);
> times->sum_exec_runtime += t->se.sum_exec_runtime;
> } while_each_thread(tsk, t);
>
> I don't know what are rules regarding accessing sig->curr_target, but
> if this is done under sighand->siglock we should be safe. Question
> if if we always have lock taken, we tried to assure that in the past,
> but if we really do?
Ah, you are talking about thread_group_cputime().
Without ->siglock this is not safe. We can change __exit_signal() to
nullify ->curr_target in the group_dead case, then the code above
could check sig->curr_target != NULL.
But this is too subtle imho, and not needed. Instead we should move
group_leader into ->signal (and kill signal->leader_pid). I am going
to do more cleanups in this area "later".
Anyway. This all has nothing to do with this patch. The 4/5 change
in thread_group_cputime() is cleanup, and it ccan help to make
/proc/pid/stat /proc/pid/status lockless.
With or without 5/5 thread_group_cputime() can be called lockless
and race with exit/fork. This is fine by itself, but this is wrong
because the caller sets ->running.
Oleg.
next prev parent reply other threads:[~2010-06-11 16:59 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-06-10 23:09 [PATCH 4/5] thread_group_cputime: simplify, document the "alive" check Oleg Nesterov
2010-06-11 14:35 ` Stanislaw Gruszka
2010-06-11 15:15 ` Oleg Nesterov
2010-06-11 16:40 ` Stanislaw Gruszka
2010-06-11 16:57 ` Oleg Nesterov [this message]
2010-06-18 10:20 ` [tip:sched/core] sched: thread_group_cputime: Simplify, " tip-bot for Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100611165750.GA7019@redhat.com \
--to=oleg@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=sgruszka@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox