public inbox for linux-rt-users@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	linux-kernel@vger.kernel.org,
	linux-rt-users <linux-rt-users@vger.kernel.org>,
	Carsten Emde <C.Emde@osadl.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	John Kacur <jkacur@redhat.com>,
	Paul Gortmaker <paul.gortmaker@windriver.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Clark Williams <clark.williams@gmail.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: Re: [RFC][PATCH RT 0/3] RT: Fix trylock deadlock without msleep() hack
Date: Mon, 7 Sep 2015 10:35:25 +0200 (CEST)
Message-ID: <alpine.DEB.2.11.1509070959440.15006@nanos>
In-Reply-To: <20150905120457.GA21338@gmail.com>

On Sat, 5 Sep 2015, Ingo Molnar wrote:
> * Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > So the problem we need to solve is:
> > 
> > retry:
> > 	lock(B);
> > 	if (!try_lock(A)) {
> > 		unlock(B);
> > 		cpu_relax();
> > 		goto retry;
> > 	}
> > 
> > So instead of doing that proposed magic boost, we can do something
> > more straight forward:
> > 
> > retry:
> > 	lock(B);
> > 	if (!try_lock(A)) {
> > 		lock_and_drop(A, B);
> > 		unlock(A);
> > 		goto retry;
> > 	}
> > 
> > lock_and_drop() queues the task as a waiter on A, drops B and then
> > does the PI adjustment on A. 
> > 
> > Thoughts?
> 
> So why not do:
> 
> 	lock(B);
> 	if (!trylock(A)) {
> 		unlock(B);
> 		lock(A);
> 		lock(B);
> 	}
> 
> ?
> 
> Or, if this can be done, why didn't we do:
> 
> 	lock(A);
> 	lock(B);
> 
> to begin with?
> 
> i.e. I'm not sure the problem is properly specified.

Right. I omitted some essential information.

	lock(y->lock);
	x = y->x;
	if (!try_lock(x->lock))
		....

Once we drop y->lock, y->x can change. That's why the retry is there.
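
To make the pattern concrete, here is a minimal userspace sketch of that
retry loop. It assumes pthread mutexes as stand-ins for the kernel locks;
struct x_obj, struct y_obj, relax() and the two helper functions are
hypothetical names used only for illustration and are not taken from the
dcache code or from any patch in this thread.

```c
#include <assert.h>
#include <pthread.h>

struct x_obj {
	pthread_mutex_t lock;
	int value;
};

struct y_obj {
	pthread_mutex_t lock;
	struct x_obj *x;	/* only stable while y->lock is held */
};

/* Hypothetical stand-in for cpu_relax(); a no-op in this sketch. */
static inline void relax(void)
{
}

/*
 * Take y->lock, then the lock of the x that y currently points to.
 * On return both locks are held and the returned x equals y->x.
 */
static struct x_obj *lock_y_and_x(struct y_obj *y)
{
	struct x_obj *x;

	for (;;) {
		pthread_mutex_lock(&y->lock);
		x = y->x;
		if (pthread_mutex_trylock(&x->lock) == 0)
			return x;
		/*
		 * Without y->lock the pointer may change, so x must be
		 * looked up again on the next iteration.
		 */
		pthread_mutex_unlock(&y->lock);
		relax();
	}
}

static void unlock_y_and_x(struct y_obj *y, struct x_obj *x)
{
	pthread_mutex_unlock(&x->lock);
	pthread_mutex_unlock(&y->lock);
}
```

The loop never blocks on x->lock, which is exactly why it spins for as
long as the holder of x->lock does not get to run.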

Now on RT the trylock loop can obviously lead to a livelock if the
trylocker preempted the holder of x->lock.

What Steve is trying to do is to boost the holder of x->lock (task A)
without actually queueing the task (task B) on the lock wait queue of
x->lock. To get out of the try-lock loop he calls sched_yield() from
task B.

While this works by some definition of works, I really do not like the
semantic obscurity of this approach.

1) The boosting is not related to anything.

   If the priority of taskB changes then nothing changes the boosting
   of taskA.

2) The boosting stops

3) sched_yield() makes me shudder

   CPU0			CPU1	

   taskA
     lock(x->lock)

   preemption
   taskC
			taskB
			  lock(y->lock);
			  x = y->x;
			  if (!try_lock(x->lock)) {
			    unlock(y->lock);
			    boost(taskA);
			    sched_yield();  <- returns immediately
			    
   So, if taskC has higher priority than taskB and therefore than
   taskA, taskB will do the lock/trylock/unlock/boost dance in
   circles.

   We can make that worse. If taskB's code looks like this:

			  lock(y->lock);
			  x = y->x;
			  if (!try_lock(x->lock)) {
			    unlock(y->lock);
			    boost(taskA);
			    sched_yield();
			    return -EAGAIN;

   and at the callsite it decides to do something completely different
   than retrying, then taskA stays boosted.

So we already have two scenarios where this clearly violates the PI
rules, and I really have no interest in debugging leaked RT
priorities.
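
The difference can be illustrated with a toy model. This is only a
sketch under simplifying assumptions: a larger number means a higher
priority, there is no scheduler, and every name in it (struct task,
struct pi_lock, boost_detached() and so on) is hypothetical and not the
rtmutex implementation. It contrasts a boost that is attached to nothing
with a priority that is derived from the waiters queued on the lock.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 4

/* Toy convention: a larger number means a higher priority. */
struct task {
	int prio;	/* the task's own priority */
	int boost;	/* detached boost, 0 when not boosted */
};

struct pi_lock {
	struct task *owner;
	struct task *waiters[MAX_WAITERS];
	size_t nr_waiters;
};

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/* Detached boost: nothing records who boosted the holder or why. */
static void boost_detached(struct task *holder, const struct task *booster)
{
	holder->boost = max_int(holder->boost, booster->prio);
}

static int effective_prio_detached(const struct task *t)
{
	return max_int(t->prio, t->boost);
}

/* PI style: queue the waiter so the owner's priority can follow it. */
static int pi_enqueue(struct pi_lock *l, struct task *waiter)
{
	if (l->nr_waiters == MAX_WAITERS)
		return -1;
	l->waiters[l->nr_waiters++] = waiter;
	return 0;
}

static void pi_dequeue(struct pi_lock *l, const struct task *waiter)
{
	for (size_t i = 0; i < l->nr_waiters; i++) {
		if (l->waiters[i] == waiter) {
			/* Fill the hole with the last queued waiter. */
			l->waiters[i] = l->waiters[--l->nr_waiters];
			return;
		}
	}
}

/* The owner's priority is recomputed from whoever is queued right now. */
static int effective_prio_pi(const struct pi_lock *l)
{
	int prio = l->owner->prio;

	for (size_t i = 0; i < l->nr_waiters; i++)
		prio = max_int(prio, l->waiters[i]->prio);
	return prio;
}
```

In the detached variant, lowering taskB's priority or having taskB walk
away with -EAGAIN leaves taskA's boost untouched. In the queued variant
both events show up in effective_prio_pi() because the boost only exists
as long as the waiter is on the lock.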

I agree with Steve that the main case where we have this horrible
msleep() right now - dcache - is complex, but we should rather sit down,
analyze it properly and come up with semantically well-defined
solutions.

Thanks,

	tglx






