linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matt Cowell <matt.cowell@nsn.com>
To: ext Thomas Gleixner <tglx@linutronix.de>
Cc: "linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>
Subject: Re: [PATCH RT] rtmutex: fix deadlock in spin_trylock
Date: Wed, 06 Nov 2013 14:22:18 -0600	[thread overview]
Message-ID: <527AA4FA.2000208@nsn.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1311061832020.23353@ionos.tec.linutronix.de>

>> There is a deadlock condition with RT spin_locks that may use trylock in
>> hardirq context.  The main user of this is get_next_timer_interrupt, which
>> will deadlock if an IRQ preempts a timer wheel modification.
>
> Errm. No.

Are you saying that spin_trylock will/should not be called from hardirq 
context, or that it is not possible for spin_trylock to deadlock in the 
situation I described?  With PREEMPT_RT_FULL and NO_HZ_FULL on v3.10, 
spin_trylock will definitely will be called from hardirq context.  This 
seems like a valid usage to me as well, as an optimization to prevent 
disabling IRQs and preemption in code that holds the timer spinlock.

Even if this code did not cause a deadlock, is there a reason that 
rt_mutex_slowtrylock is using raw_spin_lock instead of raw_spin_trylock? 
  I would think that we would want to ensure that any *trylock call 
would return as fast as possible without spinning or blocking.

> The cpu is in thread context and therefor not idle.
>
> void tick_nohz_irq_exit(void)
> {
>          if (!ts->inidle)
>                  return;
> 	....
> }
>
> So you are papering over the real issue:
>
>     Why is ts->inidle true, if the CPU is not in idle?

The code you cited is from pre-v3.10 (without NO_HZ_FULL changes).  This 
was changed in commit 5811d99 (nohz: Prepare to stop the tick on irq 
exit).  When ts->idle is false (as it was in my trace), this code now 
calls tick_nohz_full_stop_tick().  This leads to the following trace:

	do_raw_spin_lock
	rt_spin_trylock
	spin_do_trylock
	get_next_timer_interrupt
	tick_nohz_stop_sched_tick
	tick_nohz_full_stop_tick
	tick_nohz_irq_exit
	irq_exit

I see two ways to fix this:
1) The way I did in the patch to ensure that spin_trylock never can spin 
on any locks internally.
2) Converting the timer base lock to a raw_spinlock_t so that it 
actually disables hardirqs in lock_timer_base().  I don't think this 
would be a good idea though.

Thanks,
-Matt

  reply	other threads:[~2013-11-06 20:22 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-30 15:53 [PATCH RT] rtmutex: fix deadlock in spin_trylock Cowell, Matt (NSN - US/Arlington Heights)
2013-11-06 17:39 ` Thomas Gleixner
2013-11-06 20:22   ` Matt Cowell [this message]
2013-11-16 13:12     ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=527AA4FA.2000208@nsn.com \
    --to=matt.cowell@nsn.com \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).