* Re: [PATCH] [SCSI] mpt2sas: fix for driver fails EEH recovery from injected pci bus error
2012-12-17 21:58 [PATCH] [SCSI] mpt2sas: fix for driver fails EEH recovery from injected pci bus error Sreekanth Reddy
@ 2012-12-17 13:12 ` Tomas Henzl
2012-12-18 5:07 ` Reddy, Sreekanth
0 siblings, 1 reply; 4+ messages in thread
From: Tomas Henzl @ 2012-12-17 13:12 UTC (permalink / raw)
To: Sreekanth Reddy
Cc: jejb, Nagalakshmi.Nandigama, JBottomley, linux-scsi,
sathya.prakash
On 12/17/2012 10:58 PM, Sreekanth Reddy wrote:
> This patch stops the driver from invoking the kthread (which removes the
> dead ioc) for some time once EEH recovery has started.
Thank you for posting this; the issue we have seen is resolved now.
Shouldn't an additional initialization be added, so that
non_operational_loop is reset again after a transient event?
Tomas
diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
index fd3b3d7..480111c 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -208,6 +208,8 @@ _base_fault_reset_work(struct work_struct *work)
return; /* don't rearm timer */
}
+ ioc->non_operational_loop = 0;
+
if ((doorbell & MPI2_IOC_STATE_MASK) == MPI2_IOC_STATE_FAULT) {
rc = mpt2sas_base_hard_reset_handler(ioc, CAN_SLEEP,
FORCE_BIG_HAMMER);
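To illustrate the question above, here is a minimal user-space sketch (not driver code) of how the counter behaves with and without the extra reset. The helper first_removal_poll(), the GRACE_POLLS macro and the boolean "operational" input are hypothetical names introduced only for this illustration; the only things taken from the patch are the counter, the literal 5 and the post-increment test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GRACE_POLLS 5 /* the literal 5 from the quoted patch */

/* Hypothetical helper: replays a sequence of poll results and returns the
 * index of the poll at which the dead ioc would be removed, or -1 if that
 * never happens. 'reset_on_healthy' switches the proposed extra
 * initialization on or off. */
static int first_removal_poll(const bool *operational, size_t n,
			      bool reset_on_healthy)
{
	unsigned int non_operational_loop = 0;

	assert(operational != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (operational[i]) {
			/* Proposed addition: a healthy poll ends the transient event. */
			if (reset_on_healthy)
				non_operational_loop = 0;
			continue;
		}
		/* Same test as in the quoted patch. */
		if (non_operational_loop++ < GRACE_POLLS)
			continue; /* still inside the grace period */
		return (int)i;
	}
	return -1;
}
```

Fed two three-poll transient events separated by a healthy poll, this model reports a removal at the seventh poll when the counter is never reset, and no removal at all when it is reset on every healthy poll. That is only a model of the counting, under the assumption that a healthy poll reaches the added line.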
>
> Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
> ---
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> index ffd85c5..2349531 100755
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> @@ -155,7 +155,7 @@ _base_fault_reset_work(struct work_struct *work)
> struct task_struct *p;
>
> spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags);
> - if (ioc->shost_recovery)
> + if (ioc->shost_recovery || ioc->pci_error_recovery)
> goto rearm_timer;
> spin_unlock_irqrestore(&ioc->ioc_reset_in_progress_lock, flags);
>
> @@ -164,6 +164,20 @@ _base_fault_reset_work(struct work_struct *work)
> printk(MPT2SAS_INFO_FMT "%s : SAS host is non-operational !!!!\n",
> ioc->name, __func__);
>
> + /* It may be possible that EEH recovery can resolve some
> + * PCI bus failure issues, rather than removing the dead ioc
> + * function because the controller is considered to be in a
> + * non-operational state. So priority is given to EEH recovery
> + * here. If it does not resolve the issue, the mpt2sas driver
> + * will consider the controller to be in a non-operational
> + * state and remove the dead ioc function.
> + */
> + if (ioc->non_operational_loop++ < 5) {
> + spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock,
> + flags);
> + goto rearm_timer;
> + }
> +
> /*
> * Call _scsih_flush_pending_cmds callback so that we flush all
> * pending commands back to OS. This call is required to aovid
> @@ -4386,6 +4400,7 @@ mpt2sas_base_attach(struct MPT2SAS_ADAPTER *ioc)
> if (missing_delay[0] != -1 && missing_delay[1] != -1)
> _base_update_missing_delay(ioc, missing_delay[0],
> missing_delay[1]);
> + ioc->non_operational_loop = 0;
>
> return 0;
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index 543d8d6..c6ee7aa 100755
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -835,6 +835,7 @@ struct MPT2SAS_ADAPTER {
> u16 cpu_msix_table_sz;
> u32 ioc_reset_count;
> MPT2SAS_FLUSH_RUNNING_CMDS schedule_dead_ioc_flush_running_cmds;
> + u32 non_operational_loop;
>
> /* internal commands, callback index */
> u8 scsi_io_cb_idx;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* [PATCH] [SCSI] mpt2sas: fix for driver fails EEH recovery from injected pci bus error
@ 2012-12-17 21:58 Sreekanth Reddy
2012-12-17 13:12 ` Tomas Henzl
0 siblings, 1 reply; 4+ messages in thread
From: Sreekanth Reddy @ 2012-12-17 21:58 UTC (permalink / raw)
To: jejb, sreekanth.reddy, Nagalakshmi.Nandigama, JBottomley
Cc: linux-scsi, sathya.prakash
This patch stops the driver from invoking the kthread (which removes the
dead ioc) for some time once EEH recovery has started.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
---
diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
index ffd85c5..2349531 100755
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -155,7 +155,7 @@ _base_fault_reset_work(struct work_struct *work)
struct task_struct *p;
spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags);
- if (ioc->shost_recovery)
+ if (ioc->shost_recovery || ioc->pci_error_recovery)
goto rearm_timer;
spin_unlock_irqrestore(&ioc->ioc_reset_in_progress_lock, flags);
@@ -164,6 +164,20 @@ _base_fault_reset_work(struct work_struct *work)
printk(MPT2SAS_INFO_FMT "%s : SAS host is non-operational !!!!\n",
ioc->name, __func__);
+ /* It may be possible that EEH recovery can resolve some
+ * PCI bus failure issues, rather than removing the dead ioc
+ * function because the controller is considered to be in a
+ * non-operational state. So priority is given to EEH recovery
+ * here. If it does not resolve the issue, the mpt2sas driver
+ * will consider the controller to be in a non-operational
+ * state and remove the dead ioc function.
+ */
+ if (ioc->non_operational_loop++ < 5) {
+ spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock,
+ flags);
+ goto rearm_timer;
+ }
+
/*
* Call _scsih_flush_pending_cmds callback so that we flush all
* pending commands back to OS. This call is required to aovid
@@ -4386,6 +4400,7 @@ mpt2sas_base_attach(struct MPT2SAS_ADAPTER *ioc)
if (missing_delay[0] != -1 && missing_delay[1] != -1)
_base_update_missing_delay(ioc, missing_delay[0],
missing_delay[1]);
+ ioc->non_operational_loop = 0;
return 0;
diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 543d8d6..c6ee7aa 100755
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -835,6 +835,7 @@ struct MPT2SAS_ADAPTER {
u16 cpu_msix_table_sz;
u32 ioc_reset_count;
MPT2SAS_FLUSH_RUNNING_CMDS schedule_dead_ioc_flush_running_cmds;
+ u32 non_operational_loop;
/* internal commands, callback index */
u8 scsi_io_cb_idx;
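The decision logic that the two mpt2sas_base.c hunks add to the poll can be sketched in user space roughly as follows. This is an assumption-laden model, not the driver: struct ioc_model, enum poll_action, poll_once() and the boolean "operational" argument are hypothetical stand-ins, and the real function also takes ioc_reset_in_progress_lock around the recovery-flag check, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the few MPT2SAS_ADAPTER fields
 * the patch touches; this is not the kernel structure. */
struct ioc_model {
	bool shost_recovery;
	bool pci_error_recovery;
	unsigned int non_operational_loop;
};

enum poll_action {
	POLL_REARM,       /* re-arm the timer, do nothing else */
	POLL_CHECK_FAULT, /* controller answers, continue with the fault checks */
	POLL_REMOVE_IOC   /* grace period used up, remove the dead ioc */
};

#define NON_OPERATIONAL_GRACE_POLLS 5 /* the literal 5 from the patch */

/* One pass of the poll, reduced to the decisions the patch changes.
 * 'operational' stands in for the state check, whose details are not
 * part of the posted hunks. */
static enum poll_action poll_once(struct ioc_model *ioc, bool operational)
{
	assert(ioc != NULL);

	/* First hunk: stay out of the way while host or PCI/EEH recovery runs. */
	if (ioc->shost_recovery || ioc->pci_error_recovery)
		return POLL_REARM;

	if (operational)
		return POLL_CHECK_FAULT;

	/* Second hunk: tolerate the first five non-operational polls. */
	if (ioc->non_operational_loop++ < NON_OPERATIONAL_GRACE_POLLS)
		return POLL_REARM;

	return POLL_REMOVE_IOC;
}
```

In this model the first five non-operational polls only re-arm the timer and the sixth one requests removal, while a poll made with pci_error_recovery set returns early and leaves the counter untouched.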
* RE: [PATCH] [SCSI] mpt2sas: fix for driver fails EEH recovery from injected pci bus error
2012-12-17 13:12 ` Tomas Henzl
@ 2012-12-18 5:07 ` Reddy, Sreekanth
2012-12-18 13:36 ` Tomas Henzl
0 siblings, 1 reply; 4+ messages in thread
From: Reddy, Sreekanth @ 2012-12-18 5:07 UTC (permalink / raw)
To: Tomas Henzl
Cc: jejb@kernel.org, Nandigama, Nagalakshmi, JBottomley@Parallels.com,
linux-scsi@vger.kernel.org, Prakash, Sathya
Yes Tomas, we need to reset non_operational_loop to zero after the transient event.
Thanks,
Sreekanth.
* Re: [PATCH] [SCSI] mpt2sas: fix for driver fails EEH recovery from injected pci bus error
2012-12-18 5:07 ` Reddy, Sreekanth
@ 2012-12-18 13:36 ` Tomas Henzl
0 siblings, 0 replies; 4+ messages in thread
From: Tomas Henzl @ 2012-12-18 13:36 UTC (permalink / raw)
To: Reddy, Sreekanth
Cc: jejb@kernel.org, Nandigama, Nagalakshmi, JBottomley@Parallels.com,
linux-scsi@vger.kernel.org, Prakash, Sathya
On 12/18/2012 06:07 AM, Reddy, Sreekanth wrote:
> Yes Tomas, we need to reset non_operational_loop to zero after the transient event.
OK, so let me repost a V2 of the whole patch.