From: Oleg Nesterov <oleg@redhat.com>
To: Rik van Riel <riel@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
	Frank Mayhar <fmayhar@google.com>,
	Frederic Weisbecker <fweisbec@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sanjay Rao <srao@redhat.com>, Larry Woodman <lwoodman@redhat.com>
Subject: Re: [PATCH RFC] time,signal: protect resource use statistics with seqlock
Date: Fri, 15 Aug 2014 16:58:13 +0200
Message-ID: <20140815145813.GA15379@redhat.com>
In-Reply-To: <20140814221447.7b8cf03f@annuminas.surriel.com>

On 08/14, Rik van Riel wrote:
>
> @@ -288,18 +288,31 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
>  	struct signal_struct *sig = tsk->signal;
>  	cputime_t utime, stime;
>  	struct task_struct *t;
> -
> -	times->utime = sig->utime;
> -	times->stime = sig->stime;
> -	times->sum_exec_runtime = sig->sum_sched_runtime;
> +	unsigned int seq, nextseq;
>  
>  	rcu_read_lock();
> -	for_each_thread(tsk, t) {
> -		task_cputime(t, &utime, &stime);
> -		times->utime += utime;
> -		times->stime += stime;
> -		times->sum_exec_runtime += task_sched_runtime(t);
> -	}
> +	/* Attempt a lockless read on the first round. */
> +	nextseq = 0;
> +	do {
> +		seq = nextseq;
> +		read_seqbegin_or_lock(&sig->stats_lock, &seq);
> +		times->utime = sig->utime;
> +		times->stime = sig->stime;
> +		times->sum_exec_runtime = sig->sum_sched_runtime;
> +
> +		for_each_thread(tsk, t) {
> +			task_cputime(t, &utime, &stime);
> +			times->utime += utime;
> +			times->stime += stime;
> +			times->sum_exec_runtime += task_sched_runtime(t);
> +		}
> +		/*
> +		 * If a writer is currently active, seq will be odd, and
> +		 * read_seqbegin_or_lock will take the lock.
> +		 */
> +		nextseq = raw_read_seqcount(&sig->stats_lock.seqcount);
> +	} while (need_seqretry(&sig->stats_lock, seq));
> +	done_seqretry(&sig->stats_lock, seq);
>  	rcu_read_unlock();
>  }

I still think this is not right. Let me quote my previous email,

	> @@ -288,18 +288,31 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
	>  	struct signal_struct *sig = tsk->signal;
	>  	cputime_t utime, stime;
	>  	struct task_struct *t;
	> -
	> -	times->utime = sig->utime;
	> -	times->stime = sig->stime;
	> -	times->sum_exec_runtime = sig->sum_sched_runtime;
	> +	unsigned int seq, nextseq;
	>
	>  	rcu_read_lock();

	Almost cosmetic nit, but afaics this patch expands the rcu critical section
	for no reason. We only need rcu_read_lock/unlock around for_each_thread()
	below.

	> +	nextseq = 0;
	> +	do {
	> +		seq = nextseq;
	> +		read_seqbegin_or_lock(&sig->stats_lock, &seq);
	> +		times->utime = sig->utime;
	> +		times->stime = sig->stime;
	> +		times->sum_exec_runtime = sig->sum_sched_runtime;
	> +
	> +		for_each_thread(tsk, t) {
	> +			task_cputime(t, &utime, &stime);
	> +			times->utime += utime;
	> +			times->stime += stime;
	> +			times->sum_exec_runtime += task_sched_runtime(t);
	> +		}
	> +		/*
	> +		 * If a writer is currently active, seq will be odd, and
	> +		 * read_seqbegin_or_lock will take the lock.
	> +		 */
	> +		nextseq = raw_read_seqcount(&sig->stats_lock.seqcount);
	> +	} while (need_seqretry(&sig->stats_lock, seq));
	> +	done_seqretry(&sig->stats_lock, seq);

	Hmm. It seems that read_seqbegin_or_lock() is not used correctly. I mean,
	this code can still livelock in theory. Just suppose that another CPU does
	write_seqlock/write_sequnlock right after read_seqbegin_or_lock(). In this
	case "seq & 1" will never be true and thus "or_lock" will never happen.

	IMO, this should be fixed. Either we should guarantee forward progress
	or we should not play with read_seqbegin_or_lock() at all. This code assumes
	that sooner or later "nextseq = raw_read_seqcount()" will return an odd
	counter, but in theory this may never happen.
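
The scenario above can be sketched in a single-threaded toy model. This is only an illustration under stated assumptions, not kernel code: `struct model_seqlock` and every `model_*` function are hypothetical stand-ins that mirror the even/odd logic of read_seqbegin_or_lock(), need_seqretry() and done_seqretry(), and the "other CPU" is simulated by a complete writer section that runs during each lockless pass. The loop is capped at `max_passes` so that the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a seqlock; not the kernel type. */
struct model_seqlock {
	unsigned int sequence;	/* even: no writer in progress */
	bool locked;		/* a reader holds the lock exclusively */
};

/* Mirrors read_seqbegin_or_lock(): lockless if *seq is even, else lock. */
static void model_read_seqbegin_or_lock(struct model_seqlock *sl,
					unsigned int *seq)
{
	if (!(*seq & 1))
		*seq = sl->sequence;
	else
		sl->locked = true;
}

/* Mirrors need_seqretry(): a locked pass never needs a retry. */
static bool model_need_seqretry(const struct model_seqlock *sl,
				unsigned int seq)
{
	return !(seq & 1) && sl->sequence != seq;
}

/* Mirrors done_seqretry(): drop the lock only if the last pass took it. */
static void model_done_seqretry(struct model_seqlock *sl, unsigned int seq)
{
	if (seq & 1) {
		assert(sl->locked);
		sl->locked = false;
	}
}

/* A complete write_seqlock()/write_sequnlock() pair: even -> odd -> even. */
static void model_writer(struct model_seqlock *sl)
{
	assert(!sl->locked);
	sl->sequence += 2;
}

/*
 * The loop shape from the patch, capped at max_passes.
 * Returns the number of passes that took the lock.
 */
static unsigned int model_patch_loop(unsigned int max_passes)
{
	struct model_seqlock sl = { 0, false };
	unsigned int seq, nextseq = 0, passes = 0, locked_passes = 0;

	do {
		seq = nextseq;
		model_read_seqbegin_or_lock(&sl, &seq);
		if (seq & 1)
			locked_passes++;
		else
			model_writer(&sl);	/* writer races with this pass */
		nextseq = sl.sequence;		/* the writer is done: even */
	} while (model_need_seqretry(&sl, seq) && ++passes < max_passes);
	model_done_seqretry(&sl, seq);

	return locked_passes;
}
```

In this model, model_patch_loop(1000) retries until the cap is reached and returns 0: every sampled nextseq is even, so the locked path is never taken.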

	And if we want to fix this we do not need 2 counters, we just need to set
	"seq = 1" manually after need_seqretry() == T. Say, like __dentry_path() does.
	(but unlike __dentry_path() we do not need to worry about rcu_read_unlock so
	the code will be simpler).

	I am wondering if it makes sense to introduce

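		bool read_seqretry_or_lock(seqlock_t *sl, int *seq)
		{
			/* The locked pass cannot race with a writer: unlock, done. */
			if (*seq & 1) {
				read_sequnlock_excl(sl);
				return false;
			}

			/* The lockless pass saw no writer: done. */
			if (!read_seqretry(sl, *seq))
				return false;

			/* Raced with a writer: force the locked path on retry. */
			*seq = 1;
			return true;
		}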
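
To illustrate how a reader loop built on such a helper would behave, here is a minimal single-threaded sketch. It is a toy model under stated assumptions, not kernel code: `struct model_seqlock`, `struct model_result` and the `model_*` functions are hypothetical names, and the racing writer is simulated by bumping the sequence by 2 during the lockless pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a seqlock; not the kernel type. */
struct model_seqlock {
	unsigned int sequence;	/* even: no writer in progress */
	bool locked;		/* a reader holds the lock exclusively */
};

/* What the model loop reports back to the caller. */
struct model_result {
	unsigned int passes;		/* total passes over the data */
	unsigned int locked_passes;	/* passes made with the lock held */
	bool still_locked;		/* true would mean a leaked lock */
};

/* Mirrors read_seqbegin_or_lock(): lockless if *seq is even, else lock. */
static void model_read_seqbegin_or_lock(struct model_seqlock *sl, int *seq)
{
	if (!(*seq & 1))
		*seq = (int)sl->sequence;
	else
		sl->locked = true;
}

/* The helper proposed above, transcribed onto the model. */
static bool model_read_seqretry_or_lock(struct model_seqlock *sl, int *seq)
{
	if (*seq & 1) {			/* locked pass: unlock, done */
		assert(sl->locked);
		sl->locked = false;
		return false;
	}
	if (sl->sequence == (unsigned int)*seq)
		return false;		/* no writer interfered: done */
	*seq = 1;			/* force the locked path on retry */
	return true;
}

/* Reader loop in which a writer races with every lockless pass. */
static struct model_result model_helper_loop(void)
{
	struct model_seqlock sl = { 0, false };
	struct model_result res = { 0, 0, false };
	int seq = 0;

	do {
		model_read_seqbegin_or_lock(&sl, &seq);
		res.passes++;
		if (seq & 1)
			res.locked_passes++;	/* writers are excluded here */
		else
			sl.sequence += 2;	/* a full writer section races */
	} while (model_read_seqretry_or_lock(&sl, &seq));

	res.still_locked = sl.locked;
	return res;
}
```

In this model the loop ends after exactly two passes: the lockless pass is invalidated by the writer, the retry takes the lock, and the helper drops the lock before returning false. No separate done_seqretry() step is needed because the helper unlocks on the final pass.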

Or did I miss your reply?

Oleg.

