All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stephen Boyd <sboyd@codeaurora.org>
To: Thomas Gleixner <tglx@linutronix.de>,
	Jason Cooper <jason@lakedaemon.net>
Cc: linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	Nicolas Pitre <nico@linaro.org>
Subject: Re: [PATCH v2] irqchip: gic: Allow gic_arch_extn hooks to call into scheduler
Date: Tue, 05 Aug 2014 10:48:30 -0700	[thread overview]
Message-ID: <53E118EE.5040205@codeaurora.org> (raw)
In-Reply-To: <1407194837-27190-1-git-send-email-sboyd@codeaurora.org>

+nico (sorry dropped CC for v2)

On 08/04/14 16:27, Stephen Boyd wrote:
> Commit 1a6b69b6548c (ARM: gic: add CPU migration support,
> 2012-04-12) introduced an acquisition of the irq_controller_lock
> in gic_raise_softirq() which can lead to a spinlock recursion if
> the gic_arch_extn hooks call into the scheduler (via complete()
> or wake_up(), etc.). This happens because gic_arch_extn hooks are
> normally called with the irq_controller_lock held and calling
> into the scheduler may cause us to call smp_send_reschedule()
> which will grab the irq_controller_lock again. Here's an example
> from a vendor kernel (note that the gic_arch_extn hook code here
> isn't actually in mainline):
>
> BUG: spinlock recursion on CPU#0, swapper/0/1
>  lock: irq_controller_lock+0x0/0x18, .magic: dead4ead, .owner: sw
> er_cpu: 0
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.10-00430-g3d433c4e
>
> Call trace:
> [<ffffffc000087e1c>] dump_backtrace+0x0/0x140
> [<ffffffc000087f6c>] show_stack+0x10/0x1c
> [<ffffffc00064732c>] dump_stack+0x74/0xc4
> [<ffffffc0006446c0>] spin_dump+0x78/0x88
> [<ffffffc0006446f4>] spin_bug+0x24/0x34
> [<ffffffc0000d47d0>] do_raw_spin_lock+0x58/0x148
> [<ffffffc00064d398>] _raw_spin_lock_irqsave+0x24/0x38
> [<ffffffc0002c9d7c>] gic_raise_softirq+0x2c/0xbc
> [<ffffffc00008daa4>] smp_send_reschedule+0x34/0x40
> [<ffffffc0000c1e94>] try_to_wake_up+0x224/0x288
> [<ffffffc0000c1f4c>] default_wake_function+0xc/0x18
> [<ffffffc0000ceef0>] __wake_up_common+0x50/0x8c
> [<ffffffc0000cef3c>] __wake_up_locked+0x10/0x1c
> [<ffffffc0000cf734>] complete+0x3c/0x5c
> [<ffffffc0002f0e78>] msm_mpm_enable_irq_exclusive+0x1b8/0x1c8
> [<ffffffc0002f0f58>] __msm_mpm_enable_irq+0x4c/0x7c
> [<ffffffc0002f0f94>] msm_mpm_enable_irq+0xc/0x18
> [<ffffffc0002c9bb0>] gic_unmask_irq+0x40/0x7c
> [<ffffffc0000de5f4>] irq_enable+0x2c/0x48
> [<ffffffc0000de65c>] irq_startup+0x4c/0x74
> [<ffffffc0000dd2fc>] __setup_irq+0x264/0x3f0
> [<ffffffc0000dd5e0>] request_threaded_irq+0xcc/0x11c
> [<ffffffc0000df254>] devm_request_threaded_irq+0x68/0xb4
> [<ffffffc000471520>] msm_iommu_ctx_probe+0x124/0x2d4
> [<ffffffc000337374>] platform_drv_probe+0x20/0x54
> [<ffffffc00033598c>] driver_probe_device+0x158/0x340
> [<ffffffc000335c20>] __driver_attach+0x60/0x90
> [<ffffffc000333c9c>] bus_for_each_dev+0x6c/0x8c
> [<ffffffc000335304>] driver_attach+0x1c/0x28
> [<ffffffc000334f14>] bus_add_driver+0x120/0x204
> [<ffffffc0003362e4>] driver_register+0xbc/0x10c
> [<ffffffc000337348>] __platform_driver_register+0x5c/0x68
> [<ffffffc00094c478>] msm_iommu_driver_init+0x54/0x7c
> [<ffffffc0000813ec>] do_one_initcall+0xa4/0x130
> [<ffffffc00091d928>] kernel_init_freeable+0x138/0x1dc
> [<ffffffc000642578>] kernel_init+0xc/0xd4
>
> We really just want to synchronize the sending of an SGI with the
> update of the gic_cpu_map[], so introduce a new SGI lock that we
> can use to synchronize the two code paths. Three main events are
> happening that we have to consider:
>
> 	1. We're updating the gic_cpu_mask to point to an
> 	incoming CPU
>
> 	2. We're (potentially) sending an SGI to the outgoing CPU
>
> 	3. We're redirecting any pending SGIs for the outgoing
> 	CPU to the incoming CPU.
>
> Events 1 and 3 are already ordered within the same CPU by means
> of program order and use of I/O accessors. Events 1 and 2 don't
> need to be ordered, but events 2 and 3 do because any SGIs for
> the outgoing CPU need to be pending before we can redirect them.
> Synchronize by acquiring a new lock around event 2 and before
> event 3. Use smp_mb__after_unlock_lock() before event 3 to ensure
> that event 1 is seen before event 3 on other CPUs that may be
> executing event 2.
>
> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
> ---
>
> Changes since v1:
>  * Move gic_sgi_lock definition below gic_cpu_map[]
>  * Just use spinlock for synchronization instead of over the map update
>
>  drivers/irqchip/irq-gic.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index 7c131cf7cc13..00bac4627d2e 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -81,6 +81,9 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
>  #define NR_GIC_CPU_IF 8
>  static u8 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
>  
> +/* Synchronize switching CPU interface and sending SGIs */
> +static DEFINE_RAW_SPINLOCK(gic_sgi_lock);

This also needs an #ifdef CONFIG_SMP

> +
>  /*
>   * Supported arch specific GIC irq extension.
>   * Default make them NULL.
> @@ -658,7 +661,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  	int cpu;
>  	unsigned long flags, map = 0;
>  
> -	raw_spin_lock_irqsave(&irq_controller_lock, flags);
> +	raw_spin_lock_irqsave(&gic_sgi_lock, flags);
>  
>  	/* Convert our logical CPU mask into a physical one. */
>  	for_each_cpu(cpu, mask)
> @@ -673,7 +676,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  	/* this always happens on GIC0 */
>  	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>  
> -	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
> +	raw_spin_unlock_irqrestore(&gic_sgi_lock, flags);
>  }
>  #endif
>  
> @@ -764,6 +767,15 @@ void gic_migrate_target(unsigned int new_cpu_id)
>  
>  	raw_spin_unlock(&irq_controller_lock);
>  
> +	raw_spin_lock(&gic_sgi_lock);
> +	/*
> +	 * Ensure that the gic_cpu_map update above is seen in
> +	 * gic_raise_softirq() before we redirect any pending SGIs that
> +	 * may have been raised for the outgoing CPU (cur_cpu_id)
> +	 */
> +	smp_mb__after_unlock_lock();
> +	raw_spin_unlock(&gic_sgi_lock);
> +
>  	/*
>  	 * Now let's migrate and clear any potential SGIs that might be
>  	 * pending for us (cur_cpu_id).  Since GIC_DIST_SGI_PENDING_SET

I also wonder if we should do something like this:

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index ea3df28043f2..1655c9bcb633 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -663,7 +663,8 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
        int cpu;
        unsigned long flags, map = 0;

-       raw_spin_lock_irqsave(&gic_sgi_lock, flags);
+       if (IS_ENABLED(CONFIG_BL_SWITCHER))
+               raw_spin_lock_irqsave(&gic_sgi_lock, flags);

        /* Convert our logical CPU mask into a physical one. */
        for_each_cpu(cpu, mask)
@@ -678,7 +679,8 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
        /* this always happens on GIC0 */
        writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

-       raw_spin_unlock_irqrestore(&gic_sgi_lock, flags);
+       if (IS_ENABLED(CONFIG_BL_SWITCHER))
+               raw_spin_unlock_irqrestore(&gic_sgi_lock, flags);
 }
 #endif


because we really don't even need to do any locking here if we're not
using the bL switcher code.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

WARNING: multiple messages have this Message-ID (diff)
From: sboyd@codeaurora.org (Stephen Boyd)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2] irqchip: gic: Allow gic_arch_extn hooks to call into scheduler
Date: Tue, 05 Aug 2014 10:48:30 -0700	[thread overview]
Message-ID: <53E118EE.5040205@codeaurora.org> (raw)
In-Reply-To: <1407194837-27190-1-git-send-email-sboyd@codeaurora.org>

+nico (sorry dropped CC for v2)

On 08/04/14 16:27, Stephen Boyd wrote:
> Commit 1a6b69b6548c (ARM: gic: add CPU migration support,
> 2012-04-12) introduced an acquisition of the irq_controller_lock
> in gic_raise_softirq() which can lead to a spinlock recursion if
> the gic_arch_extn hooks call into the scheduler (via complete()
> or wake_up(), etc.). This happens because gic_arch_extn hooks are
> normally called with the irq_controller_lock held and calling
> into the scheduler may cause us to call smp_send_reschedule()
> which will grab the irq_controller_lock again. Here's an example
> from a vendor kernel (note that the gic_arch_extn hook code here
> isn't actually in mainline):
>
> BUG: spinlock recursion on CPU#0, swapper/0/1
>  lock: irq_controller_lock+0x0/0x18, .magic: dead4ead, .owner: sw
> er_cpu: 0
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.10-00430-g3d433c4e
>
> Call trace:
> [<ffffffc000087e1c>] dump_backtrace+0x0/0x140
> [<ffffffc000087f6c>] show_stack+0x10/0x1c
> [<ffffffc00064732c>] dump_stack+0x74/0xc4
> [<ffffffc0006446c0>] spin_dump+0x78/0x88
> [<ffffffc0006446f4>] spin_bug+0x24/0x34
> [<ffffffc0000d47d0>] do_raw_spin_lock+0x58/0x148
> [<ffffffc00064d398>] _raw_spin_lock_irqsave+0x24/0x38
> [<ffffffc0002c9d7c>] gic_raise_softirq+0x2c/0xbc
> [<ffffffc00008daa4>] smp_send_reschedule+0x34/0x40
> [<ffffffc0000c1e94>] try_to_wake_up+0x224/0x288
> [<ffffffc0000c1f4c>] default_wake_function+0xc/0x18
> [<ffffffc0000ceef0>] __wake_up_common+0x50/0x8c
> [<ffffffc0000cef3c>] __wake_up_locked+0x10/0x1c
> [<ffffffc0000cf734>] complete+0x3c/0x5c
> [<ffffffc0002f0e78>] msm_mpm_enable_irq_exclusive+0x1b8/0x1c8
> [<ffffffc0002f0f58>] __msm_mpm_enable_irq+0x4c/0x7c
> [<ffffffc0002f0f94>] msm_mpm_enable_irq+0xc/0x18
> [<ffffffc0002c9bb0>] gic_unmask_irq+0x40/0x7c
> [<ffffffc0000de5f4>] irq_enable+0x2c/0x48
> [<ffffffc0000de65c>] irq_startup+0x4c/0x74
> [<ffffffc0000dd2fc>] __setup_irq+0x264/0x3f0
> [<ffffffc0000dd5e0>] request_threaded_irq+0xcc/0x11c
> [<ffffffc0000df254>] devm_request_threaded_irq+0x68/0xb4
> [<ffffffc000471520>] msm_iommu_ctx_probe+0x124/0x2d4
> [<ffffffc000337374>] platform_drv_probe+0x20/0x54
> [<ffffffc00033598c>] driver_probe_device+0x158/0x340
> [<ffffffc000335c20>] __driver_attach+0x60/0x90
> [<ffffffc000333c9c>] bus_for_each_dev+0x6c/0x8c
> [<ffffffc000335304>] driver_attach+0x1c/0x28
> [<ffffffc000334f14>] bus_add_driver+0x120/0x204
> [<ffffffc0003362e4>] driver_register+0xbc/0x10c
> [<ffffffc000337348>] __platform_driver_register+0x5c/0x68
> [<ffffffc00094c478>] msm_iommu_driver_init+0x54/0x7c
> [<ffffffc0000813ec>] do_one_initcall+0xa4/0x130
> [<ffffffc00091d928>] kernel_init_freeable+0x138/0x1dc
> [<ffffffc000642578>] kernel_init+0xc/0xd4
>
> We really just want to synchronize the sending of an SGI with the
> update of the gic_cpu_map[], so introduce a new SGI lock that we
> can use to synchronize the two code paths. Three main events are
> happening that we have to consider:
>
> 	1. We're updating the gic_cpu_mask to point to an
> 	incoming CPU
>
> 	2. We're (potentially) sending an SGI to the outgoing CPU
>
> 	3. We're redirecting any pending SGIs for the outgoing
> 	CPU to the incoming CPU.
>
> Events 1 and 3 are already ordered within the same CPU by means
> of program order and use of I/O accessors. Events 1 and 2 don't
> need to be ordered, but events 2 and 3 do because any SGIs for
> the outgoing CPU need to be pending before we can redirect them.
> Synchronize by acquiring a new lock around event 2 and before
> event 3. Use smp_mb__after_unlock_lock() before event 3 to ensure
> that event 1 is seen before event 3 on other CPUs that may be
> executing event 2.
>
> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
> ---
>
> Changes since v1:
>  * Move gic_sgi_lock definition below gic_cpu_map[]
>  * Just use spinlock for synchronization instead of over the map update
>
>  drivers/irqchip/irq-gic.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index 7c131cf7cc13..00bac4627d2e 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -81,6 +81,9 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
>  #define NR_GIC_CPU_IF 8
>  static u8 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
>  
> +/* Synchronize switching CPU interface and sending SGIs */
> +static DEFINE_RAW_SPINLOCK(gic_sgi_lock);

This also needs an #ifdef CONFIG_SMP

> +
>  /*
>   * Supported arch specific GIC irq extension.
>   * Default make them NULL.
> @@ -658,7 +661,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  	int cpu;
>  	unsigned long flags, map = 0;
>  
> -	raw_spin_lock_irqsave(&irq_controller_lock, flags);
> +	raw_spin_lock_irqsave(&gic_sgi_lock, flags);
>  
>  	/* Convert our logical CPU mask into a physical one. */
>  	for_each_cpu(cpu, mask)
> @@ -673,7 +676,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  	/* this always happens on GIC0 */
>  	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>  
> -	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
> +	raw_spin_unlock_irqrestore(&gic_sgi_lock, flags);
>  }
>  #endif
>  
> @@ -764,6 +767,15 @@ void gic_migrate_target(unsigned int new_cpu_id)
>  
>  	raw_spin_unlock(&irq_controller_lock);
>  
> +	raw_spin_lock(&gic_sgi_lock);
> +	/*
> +	 * Ensure that the gic_cpu_map update above is seen in
> +	 * gic_raise_softirq() before we redirect any pending SGIs that
> +	 * may have been raised for the outgoing CPU (cur_cpu_id)
> +	 */
> +	smp_mb__after_unlock_lock();
> +	raw_spin_unlock(&gic_sgi_lock);
> +
>  	/*
>  	 * Now let's migrate and clear any potential SGIs that might be
>  	 * pending for us (cur_cpu_id).  Since GIC_DIST_SGI_PENDING_SET

I also wonder if we should do something like this:

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index ea3df28043f2..1655c9bcb633 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -663,7 +663,8 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
        int cpu;
        unsigned long flags, map = 0;

-       raw_spin_lock_irqsave(&gic_sgi_lock, flags);
+       if (IS_ENABLED(CONFIG_BL_SWITCHER))
+               raw_spin_lock_irqsave(&gic_sgi_lock, flags);

        /* Convert our logical CPU mask into a physical one. */
        for_each_cpu(cpu, mask)
@@ -678,7 +679,8 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
        /* this always happens on GIC0 */
        writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

-       raw_spin_unlock_irqrestore(&gic_sgi_lock, flags);
+       if (IS_ENABLED(CONFIG_BL_SWITCHER))
+               raw_spin_unlock_irqrestore(&gic_sgi_lock, flags);
 }
 #endif


because we really don't even need to do any locking here if we're not
using the bL switcher code.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

  reply	other threads:[~2014-08-05 17:48 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-04 22:33 [PATCH] irqchip: gic: Allow gic_arch_extn hooks to call into scheduler Stephen Boyd
2014-08-04 22:33 ` Stephen Boyd
2014-08-04 23:20 ` Nicolas Pitre
2014-08-04 23:20   ` Nicolas Pitre
2014-08-04 23:22   ` Stephen Boyd
2014-08-04 23:22     ` Stephen Boyd
2014-08-04 23:27   ` [PATCH v2] " Stephen Boyd
2014-08-04 23:27     ` Stephen Boyd
2014-08-05 17:48     ` Stephen Boyd [this message]
2014-08-05 17:48       ` Stephen Boyd
2014-08-05 19:50       ` Nicolas Pitre
2014-08-05 19:50         ` Nicolas Pitre
2014-08-05 21:22         ` Stephen Boyd
2014-08-05 21:22           ` Stephen Boyd
2014-08-06  2:34           ` Nicolas Pitre
2014-08-06  2:34             ` Nicolas Pitre
2014-08-12 22:37             ` Stephen Boyd
2014-08-12 22:37               ` Stephen Boyd
2014-08-13  0:39               ` Nicolas Pitre
2014-08-13  0:39                 ` Nicolas Pitre
2014-08-13  0:43                 ` Stephen Boyd
2014-08-13  0:43                   ` Stephen Boyd
2014-08-13  0:49                   ` Nicolas Pitre
2014-08-13  0:49                     ` Nicolas Pitre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53E118EE.5040205@codeaurora.org \
    --to=sboyd@codeaurora.org \
    --cc=jason@lakedaemon.net \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nico@linaro.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.