public inbox for linux-kernel@vger.kernel.org
* [PATCH] ipmi: Run a dummy command before submitting a new command
@ 2010-07-27 16:01 Matthew Garrett
  2010-07-27 17:07 ` Corey Minyard
  0 siblings, 1 reply; 4+ messages in thread
From: Matthew Garrett @ 2010-07-27 16:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Matthew Garrett, Corey Minyard

Newer firmware revisions on HP's ILO3 (1.05 and later) generate state
machine errors with the current IPMI code. Running through the IPMI
timeout handler once before submitting the command avoids this.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: Corey Minyard <cminyard@mvista.com>
---
 drivers/char/ipmi/ipmi_si_intf.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index e39a744..3f06199 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -317,6 +317,7 @@ static int unload_when_empty = 1;
 static int add_smi(struct smi_info *smi);
 static int try_smi_init(struct smi_info *smi);
 static void cleanup_one_si(struct smi_info *to_clean);
+static void smi_timeout(unsigned long data);
 
 static ATOMIC_NOTIFIER_HEAD(xaction_notifier_list);
 static int register_xaction_notifier(struct notifier_block *nb)
@@ -897,6 +898,7 @@ static void sender(void                *send_info,
 #endif
 
 	mod_timer(&smi_info->si_timer, jiffies + SI_TIMEOUT_JIFFIES);
+	smi_timeout((unsigned long)smi_info);
 
 	if (smi_info->thread)
 		wake_up_process(smi_info->thread);
-- 
1.7.1.1



* Re: [PATCH] ipmi: Run a dummy command before submitting a new command
  2010-07-27 16:01 [PATCH] ipmi: Run a dummy command before submitting a new command Matthew Garrett
@ 2010-07-27 17:07 ` Corey Minyard
  2010-07-27 17:21   ` Matthew Garrett
  0 siblings, 1 reply; 4+ messages in thread
From: Corey Minyard @ 2010-07-27 17:07 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel

I don't think this is the right way to handle the problem.  Though it's 
not going to break anything, this change is just a hack.  We need to 
figure out why these machines exhibit this behavior.  If it's a bug in 
the driver, then we need to fix the driver.  If it's a bug in the HP 
firmware, then we need to document it well as such, get HP to fix their 
firmware, and possibly tie it into the xaction handler that's already in 
start_next_msg.

The only interaction with the device that this change should cause is 
one read from the status register, since the device should be idle at 
this point.  If that's the case, and it's not a driver bug, you can try 
adding an xaction that calls smi_info->handlers->event(smi_info->si_sm, 0).

There are debugging flags in the state machines that might help debug 
this, too.
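
A rough user-space sketch of the xaction idea above, for illustration only. The struct layouts, the `idle_event` handler, and the `pre_xaction_poke` helper are hypothetical stand-ins, not the definitions in ipmi_si_intf.c; only the call shape `handlers->event(si_sm, 0)` comes from this thread. It models a hook that pokes an idle state machine with zero elapsed time before a new transaction starts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the kernel definitions. */
enum si_sm_result { SI_SM_IDLE, SI_SM_CALL_WITHOUT_DELAY };

struct si_sm_data {
	int status_reads;	/* counts simulated status-register reads */
};

struct si_sm_handlers {
	/* Same call shape as handlers->event(si_sm, time) quoted above. */
	enum si_sm_result (*event)(struct si_sm_data *sm, long time);
};

struct smi_info {
	struct si_sm_data *si_sm;
	const struct si_sm_handlers *handlers;
};

/* Model of an idle state machine: one status read, nothing else to do. */
static enum si_sm_result idle_event(struct si_sm_data *sm, long time)
{
	(void)time;		/* an idle machine ignores elapsed time here */
	sm->status_reads++;
	return SI_SM_IDLE;
}

/* Hypothetical pre-transaction hook: poke the machine with 0 elapsed time. */
static enum si_sm_result pre_xaction_poke(struct smi_info *smi_info)
{
	assert(smi_info != NULL && smi_info->handlers != NULL);
	return smi_info->handlers->event(smi_info->si_sm, 0);
}
```

In this model, calling `pre_xaction_poke()` on an idle interface performs exactly one simulated status read, which is the only device interaction expected if the interface really is idle.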

-corey

On 07/27/2010 11:01 AM, Matthew Garrett wrote:
> Newer firmware revisions on HP's ILO3 (1.05 and later) generate state
> machine errors with the current IPMI code. Running through the IPMI
> timeout handler once before submitting the command avoids this.
>
> Signed-off-by: Matthew Garrett <mjg@redhat.com>
> Cc: Corey Minyard <cminyard@mvista.com>
> ---
>   drivers/char/ipmi/ipmi_si_intf.c |    2 ++
>   1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> index e39a744..3f06199 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -317,6 +317,7 @@ static int unload_when_empty = 1;
>   static int add_smi(struct smi_info *smi);
>   static int try_smi_init(struct smi_info *smi);
>   static void cleanup_one_si(struct smi_info *to_clean);
> +static void smi_timeout(unsigned long data);
>
>   static ATOMIC_NOTIFIER_HEAD(xaction_notifier_list);
>   static int register_xaction_notifier(struct notifier_block *nb)
> @@ -897,6 +898,7 @@ static void sender(void                *send_info,
>   #endif
>
>   	mod_timer(&smi_info->si_timer, jiffies + SI_TIMEOUT_JIFFIES);
> +	smi_timeout((unsigned long)smi_info);
>
>   	if (smi_info->thread)
>   		wake_up_process(smi_info->thread);
>    



* Re: [PATCH] ipmi: Run a dummy command before submitting a new command
  2010-07-27 17:07 ` Corey Minyard
@ 2010-07-27 17:21   ` Matthew Garrett
  2010-10-26 17:45     ` Matthew Garrett
  0 siblings, 1 reply; 4+ messages in thread
From: Matthew Garrett @ 2010-07-27 17:21 UTC (permalink / raw)
  To: Corey Minyard; +Cc: linux-kernel

On Tue, Jul 27, 2010 at 12:07:11PM -0500, Corey Minyard wrote:
> I don't think this is the right way to handle the problem.  Though it's  
> not going to break anything, this change is just a hack.  We need to  
> figure out why these machines exhibit this behavior.  If it's a bug in  
> the driver, then we need to fix the driver.  If it's a bug in the HP  
> firmware, then we need to document it well as such, get HP to fix their  
> firmware, and possibly tie it into the xaction handler that's already in  
> start_next_msg.

Yeah, I agree that this isn't the optimal approach. I'm waiting to hear 
from HP if they have any idea what happened between 1.01 (which worked) 
and 1.05 (which is broken), which might give some more insight into what 
we're doing wrong.

> The only interaction with the device that this change should cause is  
> one read from the status register, since the device should be idle at  
> this point.  If that's the case, and it's not a driver bug, you can try  
> adding an xaction that calls smi_info->handlers->event(smi_info->si_sm, 
> 0).

I'll try to see what's going on.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [PATCH] ipmi: Run a dummy command before submitting a new command
  2010-07-27 17:21   ` Matthew Garrett
@ 2010-10-26 17:45     ` Matthew Garrett
  0 siblings, 0 replies; 4+ messages in thread
From: Matthew Garrett @ 2010-10-26 17:45 UTC (permalink / raw)
  To: Corey Minyard; +Cc: linux-kernel

I've finally had time to look at this more closely. The following patch 
seems to make things happy, but I still don't have a full understanding 
of what's going on.

(Background: Some HP iLO firmware versions are unhappy due to commit 
3326f4f2276791561af1fd5f2020be0186459813. Running a dummy command in 
schedule() works around this but clearly isn't the right answer. Turns 
out that this also breaks IPMI on some Suns, and Oracle reverted this in 
their kernel but never told me. Thoracle)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index e537610..763af8f 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -901,6 +901,8 @@ static void sender(void                *send_info,
 
 	mod_timer(&smi_info->si_timer, jiffies + SI_TIMEOUT_JIFFIES);
 
+	smi_info->last_timeout_jiffies = jiffies;
+
 	if (smi_info->thread)
 		wake_up_process(smi_info->thread);
 

-- 
Matthew Garrett | mjg59@srcf.ucam.org
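
One possible reading of why the one-line change above helps, offered as an assumption rather than something confirmed in this thread: if the timer path hands the state machine an elapsed time computed from the current jiffies minus `last_timeout_jiffies`, a stale stamp would make a freshly queued message look as if it had already been waiting for a long time. The user-space model below illustrates only that arithmetic. The `timer_model` struct, the helper names, and the microseconds-per-jiffy constant are hypothetical.

```c
#include <assert.h>

/* Hypothetical conversion factor; the real value depends on HZ. */
#define MODEL_USEC_PER_JIFFY 1000L

/* Minimal stand-in for the one field the patch touches. */
struct timer_model {
	unsigned long last_timeout_jiffies;
};

/* Elapsed time the timer path would report to the state machine. */
static long elapsed_usec(const struct timer_model *t, unsigned long now)
{
	return ((long)now - (long)t->last_timeout_jiffies) * MODEL_USEC_PER_JIFFY;
}

/* What the patch does in sender(): restamp when a message is queued. */
static void stamp_on_send(struct timer_model *t, unsigned long now)
{
	assert(t);
	t->last_timeout_jiffies = now;
}
```

With a stamp left at jiffy 100 and a message queued at jiffy 5100, `elapsed_usec()` reports 5000 jiffies' worth of time on the next tick. After `stamp_on_send()`, the same tick reports zero.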

