linux-acpi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andi Kleen <andi@firstfloor.org>
To: Lv Zheng <lv.zheng@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Len Brown <len.brown@intel.com>, Lv Zheng <zetalog@gmail.com>,
	linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org
Subject: Re: [PATCH v2] ACPI/EC: Enable storm prevention mechanisms.
Date: Tue, 15 Jul 2014 17:44:25 -0700	[thread overview]
Message-ID: <87pph63zfa.fsf@tassilo.jf.intel.com> (raw)
In-Reply-To: <d43c4857af0aef38ba62fd21f62cb69deea888f3.1405407302.git.lv.zheng@intel.com> (Lv Zheng's message of "Tue, 15 Jul 2014 15:09:01 +0800")

Lv Zheng <lv.zheng@intel.com> writes:

> This patch enables storm prevention mechanisms. After applying this patch,
> when the command/event storms are actually detected, EC driver will be
> switched from the IRQ mode to the polling mode.

It would be far better to fix the root cause of the storms, instead
of just barely curing the symptoms.

The last time I investigated this the main problem was desynchronization
with the EC mailbox protocol because Linux was too fast.

At least on my laptop the old delay patch was fairly successful in
resynchronizing.

[needs some more changes to make the ACPI interrupt threaded]

Author: Andi Kleen <andi@firstfloor.org>
Date:   Fri Nov 11 13:46:22 2011 +0100

    ACPI: EC: Add a limited number of repeats after false EC interrupts
5A    
    My Acer laptop has a large number of false EC interrupts
    (interrupts when the EC indexed data register protocol is in the wrong
    state, expecting input when we should send output or vice versa)
    It seems the hardware triggers the interrupt before it actually
    sets the right status in the register.
    
    With a delay and a repeat it usually works on the second and sometimes
    on the third repeat.
    
    With the threaded interrupt we can do this safely now without
    needing a state machine.
    
    This doesn't completely fix the problem on my system, but makes the
    desynchronizations much less frequent and rare enough that passive
    trips still work.
    
    OPEN: best length of the delay. I picked an arbitary value that may
    be too long.
    
    Signed-off-by: Andi Kleen <ak@linux.intel.com>>

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index e520310..8304cdd 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -88,6 +88,16 @@ static unsigned int ec_delay __read_mostly = ACPI_EC_DELAY;
 module_param(ec_delay, uint, 0644);
 MODULE_PARM_DESC(ec_delay, "Timeout(ms) waited until an EC command completes");
 
+static unsigned int ec_max_repeat __read_mostly = 4;
+module_param(ec_max_repeat, uint, 0644);
+MODULE_PARM_DESC(ec_max_repeat, 
+	"Maximum number of repeats after an false EC interrupt");
+
+static unsigned int ec_repeat_delay __read_mostly = 200;
+module_param(ec_repeat_delay, uint, 0644);
+MODULE_PARM_DESC(ec_repeat_delay,
+	"Delay between each repeat after an false EC interrupt (in us)");
+
 /* If we find an EC via the ECDT, we need to keep a ptr to its context */
 /* External interfaces use first EC only, so remember */
 typedef int (*acpi_ec_query_func) (void *data);
@@ -167,9 +177,13 @@ static void start_transaction(struct acpi_ec *ec)
 	acpi_ec_write_cmd(ec, ec->curr->command);
 }
 
-static void advance_transaction(struct acpi_ec *ec, u8 status)
+static void advance_transaction(struct acpi_ec *ec)
 {
 	unsigned long flags;
+	int repeat = ec_max_repeat;
+	u8 status;
+again:
+	status = acpi_ec_read_status(ec);
 	spin_lock_irqsave(&ec->curr_lock, flags);
 	if (!ec->curr)
 		goto unlock;
@@ -195,8 +209,15 @@ err:
 	trace_acpi_ec_unsynchronized(status, ec->curr->wlen, ec->curr->rlen, 
 				     ec->curr->wi, ec->curr->ri);
 	/* false interrupt, state didn't change */
-	if (in_interrupt())
+	if (in_interrupt()) {
 		++ec->curr->irq_count;
+		if (repeat == 0)
+			goto unlock;
+		spin_unlock_irqrestore(&ec->curr_lock, flags);
+		usleep_range(ec_repeat_delay, ec_repeat_delay + 10);
+		repeat--;
+		goto again;
+	}
 unlock:
 	spin_unlock_irqrestore(&ec->curr_lock, flags);
 }
@@ -231,7 +252,7 @@ static int ec_poll(struct acpi_ec *ec)
 						msecs_to_jiffies(1)))
 					return 0;
 			}
-			advance_transaction(ec, acpi_ec_read_status(ec));
+			advance_transaction(ec);
 		} while (time_before(jiffies, delay));
 		if (acpi_ec_read_status(ec) & ACPI_EC_FLAG_IBF)
 			break;
@@ -623,7 +644,7 @@ static u32 acpi_ec_gpe_handler(acpi_handle gpe_device,
 
 	pr_debug(PREFIX "~~~> interrupt\n");
 
-	advance_transaction(ec, acpi_ec_read_status(ec));
+	advance_transaction(ec);
 	if (ec_transaction_done(ec) &&
 	    (acpi_ec_read_status(ec) & ACPI_EC_FLAG_IBF) == 0) {
 		wake_up(&ec->wait);






>
> If regressions are reported against storm prevention support, this patch
> can be bisected and reverted before issues can be root caused.
>
> By changing the storm threshold to 0 and stops returning from
> advance_transaction() without increasing irq_count on non-error cases, we
> can perform unit test for the storm prevention. The result is as follows:
>   [    4.525321] ACPI : EC: ***** Command(RD_EC) started *****
>   [    4.525321] ACPI : EC: ===== TASK =====
>   [    4.525326] ACPI : EC: EC_SC(R) = 0x00 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=0
>   [    4.525327] ACPI : EC: EC_SC(W) = 0x80
>   [    4.525436] ACPI : EC: ===== IRQ =====
>   [    4.525442] ACPI : EC: EC_SC(R) = 0x00 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=0
>   [    4.525442] ACPI : EC: EC_DATA(W) = 0x23
>   [    4.525448] ACPI : EC: +++++ Polling enabled +++++
>   [    4.525451] ACPI : EC: EC_SC(R) = 0x02 SCI_EVT=0 BURST=0 CMD=0 IBF=1 OBF=0
> # [    4.528954] ACPI : EC: ===== TASK =====
>   [    4.528957] ACPI : EC: EC_SC(R) = 0x01 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=1
>   [    4.528960] ACPI : EC: EC_DATA(R) = 0x2b
>   [    4.528963] ACPI : EC: +++++ Polling disabled +++++
>   [    4.528964] ACPI : EC: ***** Command(RD_EC) stopped *****
>   [    4.528974] ACPI : EC: ===== IRQ =====
>   [    4.528977] ACPI : EC: EC_SC(R) = 0x00 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=0
>   [    4.528980] ACPI : EC: EC_SC(R) = 0x00 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=0
>   [    4.528988] ACPI : EC: EC_SC(R) = 0x00 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=0
>   [    4.529347] ACPI : EC: ***** Command(WR_EC) started *****
>   [    4.529348] ACPI : EC: ===== TASK =====
>   [    4.529352] ACPI : EC: EC_SC(R) = 0x00 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=0
>   [    4.529353] ACPI : EC: EC_SC(W) = 0x81
> * [    4.529467] ACPI : EC: ===== IRQ =====
>   [    4.529473] ACPI : EC: EC_SC(R) = 0x00 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=0
>   [    4.529473] ACPI : EC: EC_DATA(W) = 0x04
>   [    4.529479] ACPI : EC: +++++ Polling enabled +++++
>   [    4.529482] ACPI : EC: EC_SC(R) = 0x02 SCI_EVT=0 BURST=0 CMD=0 IBF=1 OBF=0
>   [    4.532951] ACPI : EC: ===== TASK =====
>   [    4.532957] ACPI : EC: EC_SC(R) = 0x00 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=0
>   [    4.532957] ACPI : EC: EC_DATA(W) = 0x01
>   [    4.536952] ACPI : EC: ===== TASK =====
>   [    4.536957] ACPI : EC: EC_SC(R) = 0x00 SCI_EVT=0 BURST=0 CMD=0 IBF=0 OBF=0
>   [    4.536964] ACPI : EC: +++++ Polling disabled +++++
>   [    4.536965] ACPI : EC: ***** Command(WR_EC) stopped *****
> We can see the command is advanced in the task context after enabling the
> polling mode (#) and the next command can still be started from the high
> performance IRQ mode (*).
>
> Signed-off-by: Lv Zheng <lv.zheng@intel.com>
> ---
>  drivers/acpi/ec.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> index d3b1bd7..f40144e 100644
> --- a/drivers/acpi/ec.c
> +++ b/drivers/acpi/ec.c
> @@ -825,13 +825,17 @@ static u32 acpi_ec_gpe_handler(acpi_handle gpe_device,
>  {
>  	unsigned long flags;
>  	struct acpi_ec *ec = data;
> +	u32 enable = 0;
>  
>  	spin_lock_irqsave(&ec->lock, flags);
>  	if (advance_transaction(ec))
>  		wake_up(&ec->wait);
> +	if (!test_bit(EC_FLAGS_EVENT_STORM, &ec->flags) &&
> +	    !test_bit(EC_FLAGS_COMMAND_STORM, &ec->flags))
> +		enable = ACPI_REENABLE_GPE;
>  	spin_unlock_irqrestore(&ec->lock, flags);
>  	ec_check_sci(ec, acpi_ec_read_status(ec));
> -	return ACPI_INTERRUPT_HANDLED | ACPI_REENABLE_GPE;
> +	return ACPI_INTERRUPT_HANDLED | enable;
>  }
>  
>  /* --------------------------------------------------------------------------

-- 
ak@linux.intel.com -- Speaking for myself only

  reply	other threads:[~2014-07-16  0:44 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-18  3:17 [PATCH 0/9] ACPI/EC: Improve GPE handling model Lv Zheng
2014-06-18  3:17 ` [PATCH 1/9] ACPICA: Events: Reduce indent divergences of events files Lv Zheng
2014-06-18  3:17 ` [PATCH 2/9] ACPICA: Events: Fix an issue that GPE APIs cannot be invoked in atomic context Lv Zheng
2014-06-18  3:17 ` [PATCH 3/9] ACPICA: Events: Introduce acpi_set_gpe()/acpi_finish_gpe() to reduce divergences Lv Zheng
2014-06-18  3:17 ` [PATCH 4/9] ACPICA: Events: Remove acpi_ev_enable_gpe() Lv Zheng
2014-06-18  3:17 ` [PATCH 5/9] ACPICA: Events: Reduce divergences to honor notify handler enabled GPEs Lv Zheng
2014-06-18  3:17 ` [PATCH 6/9] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag Lv Zheng
2014-06-18  3:18 ` [PATCH 7/9] ACPI/EC: Add detailed command/query debugging information Lv Zheng
2014-06-18  3:18 ` [PATCH 8/9] ACPI/EC: Deploy the new GPE handling model Lv Zheng
2014-06-18  3:18 ` [PATCH 9/9] ACPI/EC: Add unit test support for EC driver hotplug Lv Zheng
2014-07-15  3:07 ` [PATCH v2 0/9] ACPICA: Improve GPE handling model Lv Zheng
2014-07-15  3:07   ` [PATCH v2 1/9] ACPICA: Events: Reduce indent divergences of events files Lv Zheng
2014-07-15  3:07   ` [PATCH v2 2/9] ACPICA: Events: Fix an issue that GPE APIs cannot be invoked in atomic context Lv Zheng
2014-07-15  3:07   ` [PATCH v2 3/9] ACPICA: Events: Fix an issue that GPE APIs cannot be invoked in deferred handlers Lv Zheng
2014-07-15  3:07   ` [PATCH v2 4/9] ACPICA: Events: Introduce acpi_set_gpe()/acpi_finish_gpe() to reduce divergences Lv Zheng
2014-07-15  3:07   ` [PATCH v2 5/9] ACPICA: Events: Fix an issue that acpi_set_gpe() cannot unconditionally enable GPEs Lv Zheng
2014-07-15  3:07   ` [PATCH v2 6/9] ACPICA: Events: Fix an issue that status of GPEs are unexpectedly cleared Lv Zheng
2014-07-15  3:07   ` [PATCH v2 7/9] ACPICA: Events: Fix an issue that originally_enabled check is not paired Lv Zheng
2014-07-15  3:08   ` [PATCH v2 8/9] ACPICA: Events: Add a flag to disable internal auto GPE clearing Lv Zheng
2014-07-15  3:08   ` [PATCH v2 9/9] ACPI: Update interrupt handler to honor new ACPI_REENABLE_GPE bit Lv Zheng
2014-07-15 12:25   ` [PATCH v2 0/9] ACPICA: Improve GPE handling model Rafael J. Wysocki
2014-07-16  0:36     ` Zheng, Lv
2014-07-15  3:14 ` [PATCH v2 00/10] ACPI/EC: Add event storm prevention and cleanup command storm prevention Lv Zheng
2014-07-15  3:14   ` [PATCH v2 01/10] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag Lv Zheng
2014-07-15  3:14   ` [PATCH v2 02/10] ACPI/EC: Add detailed command/query debugging information Lv Zheng
2014-07-15  3:14   ` [PATCH v2 03/10] ACPI/EC: Deploy the new GPE handling model Lv Zheng
2014-07-15  3:14   ` [PATCH v2 04/10] ACPI/EC: Refine command storm prevention support Lv Zheng
2014-07-15  3:14   ` [PATCH v2 05/10] ACPI/EC: Add reference counting for query handlers Lv Zheng
2014-07-15  3:15   ` [PATCH v2 06/10] ACPI/EC: Add a warning message to indicate event storms Lv Zheng
2014-07-15  3:15   ` [PATCH v2 07/10] ACPI/EC: Refine event/query debugging messages Lv Zheng
2014-07-15  3:15   ` [PATCH v2 08/10] ACPI/EC: Add command flushing support Lv Zheng
2014-07-15  3:15   ` [PATCH v2 09/10] ACPI/EC: Add event storm prevention support Lv Zheng
2014-07-15  3:15   ` [PATCH v2 10/10] ACPI/EC: Add unit test support for EC driver hotplug Lv Zheng
2014-07-15  7:09 ` [PATCH v2] ACPI/EC: Enable storm prevention mechanisms Lv Zheng
2014-07-16  0:44   ` Andi Kleen [this message]
2014-07-16  0:55     ` Zheng, Lv
2014-07-21  6:04 ` [RFC PATCH v3 00/14] ACPI/EC: Add event storm prevention and cleanup command storm prevention Lv Zheng
2014-07-21  6:05   ` [RFC PATCH v3 01/14] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag Lv Zheng
2014-07-21  6:05   ` [RFC PATCH v3 02/14] ACPI/EC: Add detailed command/query debugging information Lv Zheng
2014-07-21  6:05   ` [RFC PATCH v3 03/14] ACPI/EC: Cleanup command storm prevention using the new GPE handling model Lv Zheng
2014-07-21  6:05   ` [RFC PATCH v3 04/14] ACPI/EC: Refine command storm prevention support Lv Zheng
2014-07-21  6:05   ` [RFC PATCH v3 05/14] ACPI/EC: Add reference counting for query handlers Lv Zheng
2014-07-21  6:05   ` [RFC PATCH v3 06/14] ACPI/EC: Add command flushing support Lv Zheng
2014-07-21  6:05   ` [RFC PATCH v3 07/14] ACPI/EC: Add a warning message to indicate event storms Lv Zheng
2014-07-21  6:05   ` [RFC PATCH v3 08/14] ACPI/EC: Refine event/query debugging messages Lv Zheng
2014-07-21  6:06   ` [RFC PATCH v3 09/14] ACPI/EC: Add CPU ID to " Lv Zheng
2014-07-21  6:06   ` [RFC PATCH v3 10/14] ACPI/EC: Cleanup QR_SC command processing by adding a kernel thread to poll EC events Lv Zheng
2014-07-21  6:06   ` [RFC PATCH v3 11/14] ACPI/EC: Add event storm prevention support Lv Zheng
2014-07-21  6:06   ` [RFC PATCH v3 12/14] ACPI/EC: Add GPE reference counting debugging messages Lv Zheng
2014-07-21  6:06   ` [RFC PATCH v3 13/14] ACPI/EC: Add unit test support for EC driver hotplug Lv Zheng
2014-07-21  6:06   ` [RFC PATCH v3 14/14] ACPI/EC: Cleanup coding style Lv Zheng
2014-07-22  1:11   ` [RFC PATCH v3 00/14] ACPI/EC: Add event storm prevention and cleanup command storm prevention Rafael J. Wysocki
2014-07-22  1:25     ` Zheng, Lv
2014-07-22 22:15       ` Rafael J. Wysocki
2014-07-23  1:31         ` Zheng, Lv
2015-02-05  8:24 ` [PATCH v4 0/6] ACPI / EC: Fix GPE handling related races Lv Zheng
2015-02-05  8:27   ` [PATCH v4 1/6] ACPICA: Events: Introduce ACPI_GPE_DISPATCH_RAW_HANDLER to fix 2 issues for the current GPE APIs Lv Zheng
2015-02-05  8:27   ` [PATCH v4 2/6] ACPICA: Events: Introduce acpi_set_gpe()/acpi_finish_gpe() to reduce divergences Lv Zheng
2015-02-05  8:27   ` [PATCH v4 3/6] ACPICA: Events: Enable APIs to allow interrupt/polling adaptive request based GPE handling model Lv Zheng
2015-02-05  8:27   ` [PATCH v4 4/6] ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode Lv Zheng
2015-02-05  8:27   ` [PATCH v4 5/6] ACPI / EC: Reduce ec_poll() by referencing the last register access timestamp Lv Zheng
2015-02-05  8:27   ` [PATCH v4 6/6] ACPI / EC: Update revision due to raw handler mode Lv Zheng
2015-02-05 15:19   ` [PATCH v4 0/6] ACPI / EC: Fix GPE handling related races Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pph63zfa.fsf@tassilo.jf.intel.com \
    --to=andi@firstfloor.org \
    --cc=len.brown@intel.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lv.zheng@intel.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=zetalog@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).