public inbox for linux-acpi@vger.kernel.org
* [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption
@ 2009-07-29 21:54 Bjorn Helgaas
  2009-07-30  0:59 ` Zhang Rui
  2009-07-30  2:43 ` Shaohua Li
  0 siblings, 2 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2009-07-29 21:54 UTC (permalink / raw)
  To: Len Brown; +Cc: Matthew Garrett, linux-acpi

On some machines, a software-initiated SMI causes corruption unless the
SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
done in GPE-related methods that are run via workqueues, so we can avoid
the known corruption cases by binding the workqueues to CPU 0.

References:
    http://bugzilla.kernel.org/show_bug.cgi?id=13751
    https://bugs.launchpad.net/bugs/157171
    https://bugs.launchpad.net/bugs/157691

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---
 drivers/acpi/osl.c |   25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 7167071..5691f16 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -189,11 +189,36 @@ acpi_status __init acpi_os_initialize(void)
 	return AE_OK;
 }
 
+static void bind_to_cpu0(struct work_struct *work)
+{
+	set_cpus_allowed(current, cpumask_of_cpu(0));
+	kfree(work);
+}
+
+static void bind_workqueue(struct workqueue_struct *wq)
+{
+	struct work_struct *work;
+
+	work = kzalloc(sizeof(struct work_struct), GFP_KERNEL);
+	INIT_WORK(work, bind_to_cpu0);
+	queue_work(wq, work);
+}
+
 acpi_status acpi_os_initialize1(void)
 {
+	/*
+	 * On some machines, a software-initiated SMI causes corruption unless
+	 * the SMI runs on CPU 0.  An SMI can be initiated by any AML, but
+	 * typically it's done in GPE-related methods that are run via
+	 * workqueues, so we can avoid the known corruption cases by binding
+	 * the workqueues to CPU 0.
+	 */
 	kacpid_wq = create_singlethread_workqueue("kacpid");
+	bind_workqueue(kacpid_wq);
 	kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
+	bind_workqueue(kacpi_notify_wq);
 	kacpi_hotplug_wq = create_singlethread_workqueue("kacpi_hotplug");
+	bind_workqueue(kacpi_hotplug_wq);
 	BUG_ON(!kacpid_wq);
 	BUG_ON(!kacpi_notify_wq);
 	BUG_ON(!kacpi_hotplug_wq);
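
The trick in the patch is that bind_to_cpu0() runs as a work item, so
"current" is the workqueue's own worker thread and the affinity change
lands on that thread. What follows is only a userspace analogy of that
idea, not kernel code: a pthread stands in for the single workqueue
thread, and sched_setaffinity() with pid 0 stands in for
set_cpus_allowed(current, ...). The names struct bind_report,
bind_self_to_cpu0(), worker_main() and run_bound_worker() are
hypothetical and exist only for this sketch.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Hypothetical record of what the worker observed after its first job. */
struct bind_report {
	int bind_rc;      /* return value of sched_setaffinity() */
	int allowed_cpus; /* CPUs left in the worker's mask, or -1 */
	int cpu0_allowed; /* nonzero if CPU 0 is in that mask */
};

/*
 * Runs in the worker's own context, like bind_to_cpu0() in the patch.
 * A pid of 0 means "the calling thread", the analogue of `current`.
 */
static int bind_self_to_cpu0(void)
{
	cpu_set_t only0;

	CPU_ZERO(&only0);
	CPU_SET(0, &only0);
	return sched_setaffinity(0, sizeof(only0), &only0);
}

/* Worker body: the binding is the first "work item" it executes. */
static void *worker_main(void *arg)
{
	struct bind_report *rep = arg;
	cpu_set_t now;

	rep->bind_rc = bind_self_to_cpu0();

	/* Read the mask back so the caller can see what took effect. */
	if (sched_getaffinity(0, sizeof(now), &now) == 0) {
		rep->allowed_cpus = CPU_COUNT(&now);
		rep->cpu0_allowed = CPU_ISSET(0, &now);
	} else {
		rep->allowed_cpus = -1;
		rep->cpu0_allowed = 0;
	}
	return NULL;
}

/*
 * Start one worker, let it bind itself, and wait for its report.
 * Returns 0 if the thread ran to completion, nonzero otherwise.
 */
static int run_bound_worker(struct bind_report *rep)
{
	pthread_t tid;

	if (pthread_create(&tid, NULL, worker_main, rep) != 0)
		return -1;
	return pthread_join(tid, NULL);
}
```

A caller would pass a struct bind_report to run_bound_worker() and then
check rep.bind_rc. The binding can fail, for example when CPU 0 is
outside the set of CPUs the process is allowed to use, so the sketch
reports the result instead of assuming success.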



* Re: [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption
  2009-07-29 21:54 [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption Bjorn Helgaas
@ 2009-07-30  0:59 ` Zhang Rui
  2009-07-31 22:47   ` Bjorn Helgaas
  2009-07-30  2:43 ` Shaohua Li
  1 sibling, 1 reply; 9+ messages in thread
From: Zhang Rui @ 2009-07-30  0:59 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Len Brown, Matthew Garrett, linux-acpi@vger.kernel.org

On Thu, 2009-07-30 at 05:54 +0800, Bjorn Helgaas wrote:
> On some machines, a software-initiated SMI causes corruption unless the
> SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> done in GPE-related methods that are run via workqueues, so we can avoid
> the known corruption cases by binding the workqueues to CPU 0.
> 
> References:
>     http://bugzilla.kernel.org/show_bug.cgi?id=13751
>     https://bugs.launchpad.net/bugs/157171
>     https://bugs.launchpad.net/bugs/157691
> 
> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>

Acked-by: Zhang Rui <rui.zhang@intel.com>




* Re: [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption
  2009-07-29 21:54 [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption Bjorn Helgaas
  2009-07-30  0:59 ` Zhang Rui
@ 2009-07-30  2:43 ` Shaohua Li
  2009-07-30  2:55   ` Matthew Garrett
  2009-07-30 17:06   ` Bjorn Helgaas
  1 sibling, 2 replies; 9+ messages in thread
From: Shaohua Li @ 2009-07-30  2:43 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Len Brown, Matthew Garrett, linux-acpi@vger.kernel.org

On Thu, Jul 30, 2009 at 05:54:25AM +0800, Bjorn Helgaas wrote:
> On some machines, a software-initiated SMI causes corruption unless the
> SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> done in GPE-related methods that are run via workqueues, so we can avoid
> the known corruption cases by binding the workqueues to CPU 0.
> 
> References:
>     http://bugzilla.kernel.org/show_bug.cgi?id=13751
>     https://bugs.launchpad.net/bugs/157171
>     https://bugs.launchpad.net/bugs/157691
Good job! Since any AML code can invoke a SMI, I wonder if all ACPICA should be
limited to run on CPU 0?


* Re: [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption
  2009-07-30  2:43 ` Shaohua Li
@ 2009-07-30  2:55   ` Matthew Garrett
  2009-07-30  3:13     ` Shaohua Li
  2009-07-30 17:06   ` Bjorn Helgaas
  1 sibling, 1 reply; 9+ messages in thread
From: Matthew Garrett @ 2009-07-30  2:55 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Bjorn Helgaas, Len Brown, linux-acpi@vger.kernel.org

On Thu, Jul 30, 2009 at 10:43:00AM +0800, Shaohua Li wrote:
> On Thu, Jul 30, 2009 at 05:54:25AM +0800, Bjorn Helgaas wrote:
> > On some machines, a software-initiated SMI causes corruption unless the
> > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > done in GPE-related methods that are run via workqueues, so we can avoid
> > the known corruption cases by binding the workqueues to CPU 0.
> > 
> > References:
> >     http://bugzilla.kernel.org/show_bug.cgi?id=13751
> >     https://bugs.launchpad.net/bugs/157171
> >     https://bugs.launchpad.net/bugs/157691
> Good job! Since any AML code can invoke a SMI, I wonder if all ACPICA should be
> limited to run on CPU 0?

If ACPI is a performance bottleneck then we have other problems, so I 
suspect that we could live with that. We'd probably want to be able to 
disable it at runtime for the small number of users who have 
"interesting" performance requirements, but falling on the side of 
safety over slightly reduced latency under some circumstances seems fair 
to me. It'd be interesting to see if this helps with any of the other 
SMI-related hangs we've seen.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption
  2009-07-30  2:55   ` Matthew Garrett
@ 2009-07-30  3:13     ` Shaohua Li
  2009-07-30  3:17       ` Matthew Garrett
  0 siblings, 1 reply; 9+ messages in thread
From: Shaohua Li @ 2009-07-30  3:13 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Bjorn Helgaas, Len Brown, linux-acpi@vger.kernel.org

On Thu, Jul 30, 2009 at 10:55:54AM +0800, Matthew Garrett wrote:
> On Thu, Jul 30, 2009 at 10:43:00AM +0800, Shaohua Li wrote:
> > On Thu, Jul 30, 2009 at 05:54:25AM +0800, Bjorn Helgaas wrote:
> > > On some machines, a software-initiated SMI causes corruption unless the
> > > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > > done in GPE-related methods that are run via workqueues, so we can avoid
> > > the known corruption cases by binding the workqueues to CPU 0.
> > > 
> > > References:
> > >     http://bugzilla.kernel.org/show_bug.cgi?id=13751
> > >     https://bugs.launchpad.net/bugs/157171
> > >     https://bugs.launchpad.net/bugs/157691
> > Good job! Since any AML code can invoke a SMI, I wonder if all ACPICA should be
> > limited to run on CPU 0?
> 
> If ACPI is a performance bottleneck then we have other problems, so I 
> suspect that we could live with that. We'd probably want to be able to 
> disable it at runtime for the small number of users who have 
> "interesting" performance requirements, but falling on the side of 
> safety over slightly reduced latency under some circumstances seems fair 
> to me. It'd be interesting to see if this helps with any of the other 
> SMI-related hangs we've seen.
ACPICA isn't designed for performance. If it were going to cause performance
issues, it already would have.


* Re: [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption
  2009-07-30  3:13     ` Shaohua Li
@ 2009-07-30  3:17       ` Matthew Garrett
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Garrett @ 2009-07-30  3:17 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Bjorn Helgaas, Len Brown, linux-acpi@vger.kernel.org

On Thu, Jul 30, 2009 at 11:13:48AM +0800, Shaohua Li wrote:

> ACPICA isn't designed for performance. If it were going to cause performance
> issues, it already would have.

Yeah. My point was just that we have some customers who like tuning 
systems heavily - I suspect they'd prefer to be able to control whether
ACPI is running entirely on cpu 0 or not. As you say, it should
make little difference in the real world but some people do have very 
specialised requirements.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption
  2009-07-30  2:43 ` Shaohua Li
  2009-07-30  2:55   ` Matthew Garrett
@ 2009-07-30 17:06   ` Bjorn Helgaas
  1 sibling, 0 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2009-07-30 17:06 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Len Brown, Matthew Garrett, linux-acpi@vger.kernel.org

On Wednesday 29 July 2009 08:43:00 pm Shaohua Li wrote:
> On Thu, Jul 30, 2009 at 05:54:25AM +0800, Bjorn Helgaas wrote:
> > On some machines, a software-initiated SMI causes corruption unless the
> > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > done in GPE-related methods that are run via workqueues, so we can avoid
> > the known corruption cases by binding the workqueues to CPU 0.
> > 
> > References:
> >     http://bugzilla.kernel.org/show_bug.cgi?id=13751
> >     https://bugs.launchpad.net/bugs/157171
> >     https://bugs.launchpad.net/bugs/157691
> Good job! Since any AML code can invoke a SMI, I wonder if all ACPICA should be
> limited to run on CPU 0?

I did look into doing that, but I didn't see an easy way to do it.

My first thought was that we could do a set_cpus_allowed() in
acpi_ex_enter_interpreter() and restore in acpi_ex_exit_interpreter().
But of course, those are ACPI CA functions, so to do it without an
ACPI CA change would mean some kind of hook in acpi_os_wait_semaphore(),
and there, we don't know *which* semaphore means "enter interpreter".

So I gave up for now.  But if somebody has a smarter idea, I agree
that it would be nice to at least have the option to run all AML on
CPU 0.

Bjorn
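
The enter/exit idea described above amounts to four steps: save the
caller's affinity, pin to CPU 0, run the protected section, restore the
saved affinity. What follows is a userspace sketch of that pattern
only, assuming Linux and glibc; it does not touch ACPICA and says
nothing about where such a hook could live in the kernel. The names
run_on_cpu0() and count_allowed_cpus() are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Run fn(ctx) with the calling thread pinned to CPU 0, then restore
 * the mask the caller had on entry.  Returns -1 if the affinity could
 * not be read, set, or restored; otherwise returns fn's result.
 */
static int run_on_cpu0(int (*fn)(void *), void *ctx)
{
	cpu_set_t saved, only0;
	int ret;

	/* "Enter": remember where the caller was allowed to run. */
	if (sched_getaffinity(0, sizeof(saved), &saved) != 0)
		return -1;

	CPU_ZERO(&only0);
	CPU_SET(0, &only0);
	if (sched_setaffinity(0, sizeof(only0), &only0) != 0)
		return -1;	/* mask unchanged, nothing to restore */

	ret = fn(ctx);

	/* "Exit": put the original mask back. */
	if (sched_setaffinity(0, sizeof(saved), &saved) != 0)
		return -1;
	return ret;
}

/* Example section: report how many CPUs the thread may run on now. */
static int count_allowed_cpus(void *ctx)
{
	cpu_set_t now;

	(void)ctx;
	if (sched_getaffinity(0, sizeof(now), &now) != 0)
		return -1;
	return CPU_COUNT(&now);
}
```

Calling run_on_cpu0(count_allowed_cpus, NULL) would be expected to
return 1 when the pin succeeds, and the caller's mask afterwards should
equal the mask it had before the call. The difficulty raised in the
message is separate from this pattern: the sketch assumes the caller
knows exactly where the section begins and ends.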


* Re: [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption
  2009-07-30  0:59 ` Zhang Rui
@ 2009-07-31 22:47   ` Bjorn Helgaas
  2009-08-01 11:01     ` Rafael J. Wysocki
  0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2009-07-31 22:47 UTC (permalink / raw)
  To: Len Brown
  Cc: Zhang Rui, Matthew Garrett, linux-acpi@vger.kernel.org,
	Rafael J. Wysocki

On Wednesday 29 July 2009 06:59:59 pm Zhang Rui wrote:
> On Thu, 2009-07-30 at 05:54 +0800, Bjorn Helgaas wrote:
> > On some machines, a software-initiated SMI causes corruption unless the
> > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > done in GPE-related methods that are run via workqueues, so we can avoid
> > the known corruption cases by binding the workqueues to CPU 0.
> > 
> > References:
> >     http://bugzilla.kernel.org/show_bug.cgi?id=13751
> >     https://bugs.launchpad.net/bugs/157171
> >     https://bugs.launchpad.net/bugs/157691
> > 
> > Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
> 
> Acked-by: Zhang Rui <rui.zhang@intel.com>

In addition to the reports above, I think it's likely this patch
will fix the problems reported below:

  http://bugzilla.kernel.org/show_bug.cgi?id=13412
  http://bugzilla.kernel.org/show_bug.cgi?id=11259
  http://bugzilla.kernel.org/show_bug.cgi?id=12328
  http://bugzilla.kernel.org/show_bug.cgi?id=12106

I think we should consider this patch for 2.6.31.

(Rafael, 13751 is on your "2.6.29 -> 2.6.30" regression list.
I actually think it's been around much longer than that, but
there seem to be many things that affect whether it manifests.)

Bjorn





* Re: [PATCH] ACPI: bind workqueues to CPU 0 to avoid SMI corruption
  2009-07-31 22:47   ` Bjorn Helgaas
@ 2009-08-01 11:01     ` Rafael J. Wysocki
  0 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2009-08-01 11:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Len Brown, Zhang Rui, Matthew Garrett, linux-acpi@vger.kernel.org

On Saturday 01 August 2009, Bjorn Helgaas wrote:
> On Wednesday 29 July 2009 06:59:59 pm Zhang Rui wrote:
> > On Thu, 2009-07-30 at 05:54 +0800, Bjorn Helgaas wrote:
> > > On some machines, a software-initiated SMI causes corruption unless the
> > > SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
> > > done in GPE-related methods that are run via workqueues, so we can avoid
> > > the known corruption cases by binding the workqueues to CPU 0.
> > > 
> > > References:
> > >     http://bugzilla.kernel.org/show_bug.cgi?id=13751
> > >     https://bugs.launchpad.net/bugs/157171
> > >     https://bugs.launchpad.net/bugs/157691
> > > 
> > > Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
> > 
> > Acked-by: Zhang Rui <rui.zhang@intel.com>
> 
> In addition to the reports above, I think it's likely this patch
> will fix the problems reported below:
> 
>   http://bugzilla.kernel.org/show_bug.cgi?id=13412
>   http://bugzilla.kernel.org/show_bug.cgi?id=11259
>   http://bugzilla.kernel.org/show_bug.cgi?id=12328
>   http://bugzilla.kernel.org/show_bug.cgi?id=12106
> 
> I think we should consider this patch for 2.6.31.
> 
> (Rafael, 13751 is on your "2.6.29 -> 2.6.30" regression list.
> I actually think it's been around much longer than that, but
> there seem to be many things that affect whether it manifests.)

I've dropped it from the list, thanks.

Best,
Rafael

