From: Oleg Nesterov <oleg@redhat.com>
To: Rik van Riel <riel@redhat.com>
Cc: linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
Frank Mayhar <fmayhar@google.com>,
Frederic Weisbecker <fweisbec@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Sanjay Rao <srao@redhat.com>, Larry Woodman <lwoodman@redhat.com>
Subject: Re: [PATCH RFC] time,signal: protect resource use statistics with seqlock
Date: Thu, 14 Aug 2014 15:22:40 +0200
Message-ID: <20140814132239.GA24465@redhat.com>
In-Reply-To: <20140813170324.544aaf2d@cuia.bos.redhat.com>
On 08/13, Rik van Riel wrote:
>
> On Wed, 13 Aug 2014 20:45:11 +0200
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > That said, it is not that I am really sure that seqcount_t in ->signal
> > is actually worse, not to mention that this is subjective anyway. IOW,
> > I am not going to really fight with your approach ;)
>
> This is what it looks like, on top of your for_each_thread series
> from yesterday:
OK, let's forget about the alternative approach for now. We can reconsider
it later. At least I have to admit that seqlock is more straightforward.
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -646,6 +646,7 @@ struct signal_struct {
> * Live threads maintain their own counters and add to these
> * in __exit_signal, except for the group leader.
> */
> + seqlock_t stats_lock;
Ah. Somehow I thought that you were going to use seqcount_t and fall back
to taking ->siglock if seqcount_retry, but this patch adds the "full blown"
seqlock_t.
OK, I won't argue, this can make the seqbegin_or_lock simpler...
> @@ -288,18 +288,31 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
> struct signal_struct *sig = tsk->signal;
> cputime_t utime, stime;
> struct task_struct *t;
> -
> - times->utime = sig->utime;
> - times->stime = sig->stime;
> - times->sum_exec_runtime = sig->sum_sched_runtime;
> + unsigned int seq, nextseq;
>
> rcu_read_lock();
Almost cosmetic nit, but afaics this patch expands the rcu critical section
for no reason. We only need rcu_read_lock/unlock around for_each_thread()
below.
> + nextseq = 0;
> + do {
> + seq = nextseq;
> + read_seqbegin_or_lock(&sig->stats_lock, &seq);
> + times->utime = sig->utime;
> + times->stime = sig->stime;
> + times->sum_exec_runtime = sig->sum_sched_runtime;
> +
> + for_each_thread(tsk, t) {
> + task_cputime(t, &utime, &stime);
> + times->utime += utime;
> + times->stime += stime;
> + times->sum_exec_runtime += task_sched_runtime(t);
> + }
> + /*
> + * If a writer is currently active, seq will be odd, and
> + * read_seqbegin_or_lock will take the lock.
> + */
> + nextseq = raw_read_seqcount(&sig->stats_lock.seqcount);
> + } while (need_seqretry(&sig->stats_lock, seq));
> + done_seqretry(&sig->stats_lock, seq);
Hmm. It seems that read_seqbegin_or_lock() is not used correctly. I mean,
this code can still livelock in theory. Just suppose that another CPU does
write_seqlock/write_sequnlock right after read_seqbegin_or_lock(). In this
case "seq & 1" will never be true and thus "or_lock" will never happen.
IMO, this should be fixed. Either we should guarantee forward progress
or we should not play with read_seqbegin_or_lock() at all. This code assumes
that sooner or later "nextseq = raw_read_seqcount()" should return an odd
counter, but in theory this can never happen.
And if we want to fix this we do not need 2 counters, we just need to set
"seq = 1" manually after need_seqretry() returns true, like __dentry_path()
does (but unlike __dentry_path() we do not need to worry about rcu_read_unlock,
so the code will be simpler).
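To illustrate the difference, here is a toy single-threaded userspace model,
not kernel code. Everything prefixed with toy_ is a hypothetical stand-in that
only mimics the odd/even convention of read_seqbegin_or_lock(),
need_seqretry() and done_seqretry(); the "writer" is simulated by a function
that completes a full write_seqlock/write_sequnlock pair in the middle of
every lockless pass. The cap on the number of passes exists only so that the
demo terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of seqlock_t: a counter plus a flag standing in for the spinlock. */
struct toy_seqlock {
	unsigned int sequence;	/* even: no writer active, odd: writer active */
	bool locked;		/* true while the reader holds the lock */
};

/* One complete write_seqlock/write_sequnlock pair; blocked by a locked reader. */
static void toy_writer(struct toy_seqlock *sl)
{
	if (sl->locked)
		return;		/* a real writer would spin here */
	sl->sequence += 2;
}

/* Even *seq: lockless pass, snapshot the counter. Odd *seq: take the lock. */
static void toy_read_seqbegin_or_lock(struct toy_seqlock *sl, int *seq)
{
	if (!(*seq & 1))
		*seq = (int)sl->sequence;
	else
		sl->locked = true;
}

/* A locked pass never needs a retry; a lockless one does if a writer ran. */
static bool toy_need_seqretry(const struct toy_seqlock *sl, int seq)
{
	return !(seq & 1) && sl->sequence != (unsigned int)seq;
}

static void toy_done_seqretry(struct toy_seqlock *sl, int seq)
{
	if (seq & 1) {
		assert(sl->locked);
		sl->locked = false;
	}
}

/* Pattern from the patch: the next seq comes from re-reading the counter. */
static int passes_with_nextseq(struct toy_seqlock *sl, int max_passes)
{
	int seq, nextseq = 0, passes = 0;

	do {
		seq = nextseq;
		toy_read_seqbegin_or_lock(sl, &seq);
		passes++;
		toy_writer(sl);			/* writer finishes mid-pass */
		nextseq = (int)sl->sequence;	/* even again, never odd */
	} while (toy_need_seqretry(sl, seq) && passes < max_passes);
	toy_done_seqretry(sl, seq);
	return passes;
}

/* Suggested pattern: after a failed lockless pass, force the locked pass. */
static int passes_with_forced_lock(struct toy_seqlock *sl)
{
	int seq = 0, passes = 0;

restart:
	toy_read_seqbegin_or_lock(sl, &seq);
	passes++;
	toy_writer(sl);				/* blocked on the second pass */
	if (toy_need_seqretry(sl, seq)) {
		seq = 1;
		goto restart;
	}
	toy_done_seqretry(sl, seq);
	return passes;
}
```

In this model the first loop never sees an odd counter and only stops at the
artificial cap, while the second one is bounded: one lockless attempt, then
one pass under the lock.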
I am wondering if it makes sense to introduce
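	bool read_seqretry_or_lock(seqlock_t *sl, int *seq)
	{
		/* odd seq: this pass ran under the lock, so we are done */
		if (*seq & 1) {
			read_sequnlock_excl(sl);
			return false;
		}

		/* lockless pass raced with no writer, so we are done */
		if (!read_seqretry(sl, *seq))
			return false;

		/* raced with a writer: make the next pass take the lock */
		*seq = 1;
		return true;
	}
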
Oleg.