From: Vineet Gupta <Vineet.Gupta1@synopsys.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	lkml <linux-kernel@vger.kernel.org>,
	Noam Camus <noamc@ezchip.com>,
	arcml <linux-snps-arc@lists.infradead.org>
Subject: Re: Interesting csd deadlock on ARC
Date: Wed, 24 Feb 2016 10:21:25 +0530
Message-ID: <56CD36CD.2080303@synopsys.com>
In-Reply-To: <20160223095824.GH6356@twins.programming.kicks-ass.net>

On Tuesday 23 February 2016 03:28 PM, Peter Zijlstra wrote:
> On Tue, Feb 23, 2016 at 10:51:42AM +0530, Vineet Gupta wrote:
>> On Friday 19 February 2016 12:17 PM, Vineet Gupta wrote:
>>> Hi Peter,
>>>
>>> I've been debugging a csd_lock_wait() deadlock on SMP+PREEMPT ARC HS38x2 and it
>>> turned out to be a lot more interesting than I'd hoped for. This is stock v4.4.
>>>
>>> Trouble starts with an IPI to self which doesn't get delivered as the inter-core
>>> interrupt providing h/w is not capable of IPI to self (which I found as part of
>>> debugging this). Subsequent IPIs from other cores to this core get elided as well
>>> due to the IPI coalescing optimization in arch/arc/kernel/smp.c: ipi_send_msg_one()
>>>
>>> There are ways to use a different h/w mechanism to solve the trigger issue and I'd
>>> hoped to just implement arch_irq_work_raise(). 
> 
> Yes, there are other architectures that use other means for self-IPI,
> IIRC PowerPC has to program their timer in the past to generate a local
> interrupt.
> 
>>> But the trouble is the call stack
>>> for this issue: IPI to self is triggered from
>>>
>>> sys_sched_setscheduler
>>>    __balance_callback
>>>        pull_rt_task
>>>          irq_work_queue_on  <-- called with @cpu == self
>>>
>>> Looking into irq_work.c, irq_work_queue() is what is semantically needed,
>>> specifically arch_irq_work_raise() will not be called, which means I need
>>> arch_send_call_function_single_ipi() to be able to IPI to self cpu also. Is that
>>> expected from arch code....
>>
>> What I actually meant was: is it OK for irq_work_queue_on() to be called locally
>> (is this a sched bug/optimization)? Further, if it is OK to be called, does it need
>> to behave more like irq_work_queue(), i.e. call arch_irq_work_raise(), or is
>> arch_send_call_function_single_ipi() expected to handle sending an IPI to self?
> 
> Right, so I'm not actually sure we started out with this requirement.
> But you're not the first to run into this, see:
> 
>   lkml.kernel.org/r/CAJZ5v0gLankSuziQq25qTCyNqeOX43yD9jnJu_XXwbdyajfmKg@mail.gmail.com
> 
> Initially I think irq_work_queue_on() was only used remotely, but I
> think it makes sense to allow the current cpu, esp. since people seem to
> be using it like that.

So it seems Russell's question in the thread above still stands. IMO we need to
massage irq_work_queue_on() to handle the case of being called for the local cpu.
This will automatically take care of a CONFIG_SMP kernel running on UP hardware.
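
To illustrate the dispatch I have in mind, here is a rough userspace sketch. It
is only a model under stated assumptions, not a kernel patch: all the model_*
names and the current_cpu variable are made up for illustration, standing in for
arch_irq_work_raise(), arch_send_call_function_single_ipi() and
smp_processor_id() respectively.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-ins for kernel state; none of this is real kernel API. */
static int current_cpu;            /* models smp_processor_id() */
static int local_raises[NR_CPUS];  /* counts modelled arch_irq_work_raise() calls */
static int remote_ipis[NR_CPUS];   /* counts modelled cross-CPU IPIs */

/* Models raising the irq_work interrupt on the calling CPU. */
static void model_irq_work_raise(void)
{
	local_raises[current_cpu]++;
}

/* Models sending a call-function IPI to another CPU. */
static void model_send_single_ipi(int cpu)
{
	remote_ipis[cpu]++;
}

/*
 * Queue work on @cpu. When @cpu is the caller itself, take the local
 * path instead of asking the IPI hardware to interrupt the sender.
 * Returns true if the local path was taken.
 */
static bool model_irq_work_queue_on(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (cpu == current_cpu) {
		model_irq_work_raise();
		return true;
	}

	model_send_single_ipi(cpu);
	return false;
}
```

With current_cpu set to 0, model_irq_work_queue_on(0) takes the local path and
model_irq_work_queue_on(1) goes out as a cross-CPU IPI, so the arch IPI code
never gets asked to interrupt its own core.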

> 
> Now the distinct difference between arch_irq_work_raise() and
> arch_send_call_function_single_ipi() is that arch_irq_work_raise()
> should be NMI-safe.
> 
> So on x86 it has to be extra careful about the lapic state, whereas the
> regular IPI code doesn't.
> 
> I seem to have forgotten the status of NMIs on ARC, but this is
> something to make a note of.
> 
