public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
@ 2013-10-16 11:34 Sumit.Saxena
  2013-10-16 21:44 ` James Bottomley
  2013-10-17 14:04 ` Tomas Henzl
  0 siblings, 2 replies; 7+ messages in thread
From: Sumit.Saxena @ 2013-10-16 11:34 UTC (permalink / raw)
  To: linux-scsi; +Cc: jbottomley, kashyap.desai, aradford

There is syncronization problem between sysPD IO path and AEN path. Driver maintains instance->pd_list[] array, which will get updated(by calling function megasas_get_pd_list[]), whenever any of below events occurs-

MR_EVT_PD_INSERTED
MR_EVT_PD_REMOVED
MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED
MR_EVT_FOREIGN_CFG_IMPORTED

At same time running sysPD IO will be accessing the same array instance->pd_list[], which is getting updated in AEN path, because
of this IO may not get correct PD info from instance->pd_list[] array.

Signed-off-by: Adam Radford <adam.radford@lsi.com>
Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com>
---
diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
index 0c73ba4..e9e543c 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -1531,6 +1531,7 @@ struct megasas_instance {
 	struct megasas_register_set __iomem *reg_set;
 	u32 *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY];
 	struct megasas_pd_list          pd_list[MEGASAS_MAX_PD];
+	struct megasas_pd_list          local_pd_list[MEGASAS_MAX_PD];
 	u8     ld_ids[MEGASAS_MAX_LD_IDS];
 	s8 init_id;
 
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index e62ff02..83ebc75 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -3194,21 +3194,23 @@ megasas_get_pd_list(struct megasas_instance *instance)
 	     (le32_to_cpu(ci->count) <
 		  (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) {
 
-		memset(instance->pd_list, 0,
+		memset(instance->local_pd_list, 0,
 			MEGASAS_MAX_PD * sizeof(struct megasas_pd_list));
 
 		for (pd_index = 0; pd_index < le32_to_cpu(ci->count); pd_index++) {
 
-			instance->pd_list[le16_to_cpu(pd_addr->deviceId)].tid	=
+			instance->local_pd_list[le16_to_cpu(pd_addr->deviceId)].tid	=
 				le16_to_cpu(pd_addr->deviceId);
-			instance->pd_list[le16_to_cpu(pd_addr->deviceId)].driveType	=
+			instance->local_pd_list[le16_to_cpu(pd_addr->deviceId)].driveType	=
 							pd_addr->scsiDevType;
-			instance->pd_list[le16_to_cpu(pd_addr->deviceId)].driveState	=
+			instance->local_pd_list[le16_to_cpu(pd_addr->deviceId)].driveState	=
 							MR_PD_STATE_SYSTEM;
 			pd_addr++;
 		}
 	}
 
+	memcpy(instance->pd_list, instance->local_pd_list,
+		sizeof(instance->pd_list));
 	pci_free_consistent(instance->pdev,
 				MEGASAS_MAX_PD * sizeof(struct MR_PD_LIST),
 				ci, ci_h);


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
  2013-10-16 11:34 [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path Sumit.Saxena
@ 2013-10-16 21:44 ` James Bottomley
  2013-10-17  4:15   ` Saxena, Sumit
  2013-10-17 14:04 ` Tomas Henzl
  1 sibling, 1 reply; 7+ messages in thread
From: James Bottomley @ 2013-10-16 21:44 UTC (permalink / raw)
  To: Sumit.Saxena@lsi.com
  Cc: linux-scsi@vger.kernel.org, kashyap.desai@lsi.com,
	aradford@gmail.com

On Wed, 2013-10-16 at 17:04 +0530, Sumit.Saxena@lsi.com wrote:
> There is syncronization problem between sysPD IO path and AEN path. Driver maintains instance->pd_list[] array, which will get updated(by calling function megasas_get_pd_list[]), whenever any of below events occurs-
> 
> MR_EVT_PD_INSERTED
> MR_EVT_PD_REMOVED
> MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED
> MR_EVT_FOREIGN_CFG_IMPORTED
> 
> At same time running sysPD IO will be accessing the same array instance->pd_list[], which is getting updated in AEN path, because
> of this IO may not get correct PD info from instance->pd_list[] array.
> 
> Signed-off-by: Adam Radford <adam.radford@lsi.com>
> Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com>

Explain the signoff chain: is this a joinly authored patch?

James


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
  2013-10-16 21:44 ` James Bottomley
@ 2013-10-17  4:15   ` Saxena, Sumit
  0 siblings, 0 replies; 7+ messages in thread
From: Saxena, Sumit @ 2013-10-17  4:15 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi@vger.kernel.org, Desai, Kashyap, aradford@gmail.com



>-----Original Message-----
>From: James Bottomley [mailto:jbottomley@parallels.com]
>Sent: Thursday, October 17, 2013 3:15 AM
>To: Saxena, Sumit
>Cc: linux-scsi@vger.kernel.org; Desai, Kashyap; aradford@gmail.com
>Subject: Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem
>between sysPD IO path and AEN path
>
>On Wed, 2013-10-16 at 17:04 +0530, Sumit.Saxena@lsi.com wrote:
>> There is syncronization problem between sysPD IO path and AEN path.
>> Driver maintains instance->pd_list[] array, which will get updated(by
>> calling function megasas_get_pd_list[]), whenever any of below events
>> occurs-
>>
>> MR_EVT_PD_INSERTED
>> MR_EVT_PD_REMOVED
>> MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED
>> MR_EVT_FOREIGN_CFG_IMPORTED
>>
>> At same time running sysPD IO will be accessing the same array
>> instance->pd_list[], which is getting updated in AEN path, because of
>this IO may not get correct PD info from instance->pd_list[] array.
>>
>> Signed-off-by: Adam Radford <adam.radford@lsi.com>
>> Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com>
>
>Explain the signoff chain: is this a joinly authored patch?

Yes, it's jointly authored patch. Originally, this patch is authored by Adam Radford, and then modified by me(Sumit Saxena).

>
>James
>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
  2013-10-16 11:34 [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path Sumit.Saxena
  2013-10-16 21:44 ` James Bottomley
@ 2013-10-17 14:04 ` Tomas Henzl
  2013-10-17 15:10   ` Saxena, Sumit
  1 sibling, 1 reply; 7+ messages in thread
From: Tomas Henzl @ 2013-10-17 14:04 UTC (permalink / raw)
  To: Sumit.Saxena, linux-scsi; +Cc: jbottomley, kashyap.desai, aradford

On 10/16/2013 01:34 PM, Sumit.Saxena@lsi.com wrote:
> There is syncronization problem between sysPD IO path and AEN path. Driver maintains instance->pd_list[] array, which will get updated(by calling function megasas_get_pd_list[]), whenever any of below events occurs-

Hi Sumit,
- I'm a bit confused here- there are two threads which might access the same array, but the problem is still there
  when the second thread accesses the array during the final memcpy, I have expected that you will add some locking,
  but maybe I'm missing something.
- now the code zeroes the pd_list even when the 
  (ret == 0 && (ci->count < (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL)))
  is not true. This is I think new - is that intentional?
Thanks, Tomas
 

>
>
>
>
>
>
> MR_EVT_PD_INSERTED
> MR_EVT_PD_REMOVED
> MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED
> MR_EVT_FOREIGN_CFG_IMPORTED
>
> At same time running sysPD IO will be accessing the same array instance->pd_list[], which is getting updated in AEN path, because
> of this IO may not get correct PD info from instance->pd_list[] array.
>
> Signed-off-by: Adam Radford <adam.radford@lsi.com>
> Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com>
> ---
> diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
> index 0c73ba4..e9e543c 100644
> --- a/drivers/scsi/megaraid/megaraid_sas.h
> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> @@ -1531,6 +1531,7 @@ struct megasas_instance {
>  	struct megasas_register_set __iomem *reg_set;
>  	u32 *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY];
>  	struct megasas_pd_list          pd_list[MEGASAS_MAX_PD];
> +	struct megasas_pd_list          local_pd_list[MEGASAS_MAX_PD];
>  	u8     ld_ids[MEGASAS_MAX_LD_IDS];
>  	s8 init_id;
>  
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index e62ff02..83ebc75 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -3194,21 +3194,23 @@ megasas_get_pd_list(struct megasas_instance *instance)
>  	     (le32_to_cpu(ci->count) <
>  		  (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) {
>  
> -		memset(instance->pd_list, 0,
> +		memset(instance->local_pd_list, 0,
>  			MEGASAS_MAX_PD * sizeof(struct megasas_pd_list));
>  
>  		for (pd_index = 0; pd_index < le32_to_cpu(ci->count); pd_index++) {
>  
> -			instance->pd_list[le16_to_cpu(pd_addr->deviceId)].tid	=
> +			instance->local_pd_list[le16_to_cpu(pd_addr->deviceId)].tid	=
>  				le16_to_cpu(pd_addr->deviceId);
> -			instance->pd_list[le16_to_cpu(pd_addr->deviceId)].driveType	=
> +			instance->local_pd_list[le16_to_cpu(pd_addr->deviceId)].driveType	=
>  							pd_addr->scsiDevType;
> -			instance->pd_list[le16_to_cpu(pd_addr->deviceId)].driveState	=
> +			instance->local_pd_list[le16_to_cpu(pd_addr->deviceId)].driveState	=
>  							MR_PD_STATE_SYSTEM;
>  			pd_addr++;
>  		}
>  	}
>  
> +	memcpy(instance->pd_list, instance->local_pd_list,
> +		sizeof(instance->pd_list));
>  	pci_free_consistent(instance->pdev,
>  				MEGASAS_MAX_PD * sizeof(struct MR_PD_LIST),
>  				ci, ci_h);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
  2013-10-17 14:04 ` Tomas Henzl
@ 2013-10-17 15:10   ` Saxena, Sumit
  2013-10-17 15:48     ` Tomas Henzl
  0 siblings, 1 reply; 7+ messages in thread
From: Saxena, Sumit @ 2013-10-17 15:10 UTC (permalink / raw)
  To: Tomas Henzl, linux-scsi@vger.kernel.org
  Cc: jbottomley@parallels.com, Desai, Kashyap, aradford@gmail.com



>-----Original Message-----
>From: Tomas Henzl [mailto:thenzl@redhat.com]
>Sent: Thursday, October 17, 2013 7:35 PM
>To: Saxena, Sumit; linux-scsi@vger.kernel.org
>Cc: jbottomley@parallels.com; Desai, Kashyap; aradford@gmail.com
>Subject: Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem
>between sysPD IO path and AEN path
>
>On 10/16/2013 01:34 PM, Sumit.Saxena@lsi.com wrote:
>> There is syncronization problem between sysPD IO path and AEN path.
>Driver maintains instance->pd_list[] array, which will get updated(by
>calling function megasas_get_pd_list[]), whenever any of below events
>occurs-
>
>Hi Sumit,
>- I'm a bit confused here- there are two threads which might access the
>same array, but the problem is still there
>  when the second thread accesses the array during the final memcpy, I
>have expected that you will add some locking,
>  but maybe I'm missing something.
>- now the code zeroes the pd_list even when the
>  (ret == 0 && (ci->count < (MEGASAS_MAX_PD_CHANNELS *
>MEGASAS_MAX_DEV_PER_CHANNEL)))
>  is not true. This is I think new - is that intentional?

Tomas,

Having lock to synchronize this will be a good choice, but will need changes in multiple places.
Without this patch: driver memsets instance->pd_list[] array to zero, same array will be accessed in sysPD IO path, that creates problem.

To resolve this issue, we introduced a new array "instance->local_pd_list[]" array, which we will be filling from Firmware DMAed data and then finally memcpy that array to the "instance->pd_list[]". Since "instance->pd_list" is accessed in IO path, then no problem in memset zero here(memset is on "instance->local_pd_list").

Final "Memcpy" operation is not saved with locking, reason is: "instance->pd_list" array is of type "struct megasas_pd_list", which is of 32-bit, so single entry in "instance->local_pd_list" array will be copied in one CPU cycle, and with current MR FW design, it will not be a problem even if IO path (or any other thread) is accessing old instance->pd_list[]. so we are safe here in memcpy() here.

Adding lock will add overhead in IO path, which could be avoided is main reason to resolve this issue with this fix.


Thanks,
Sumit


>Thanks, Tomas
>
>
>>
>>
>>
>>
>>
>>
>> MR_EVT_PD_INSERTED
>> MR_EVT_PD_REMOVED
>> MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED
>> MR_EVT_FOREIGN_CFG_IMPORTED
>>
>> At same time running sysPD IO will be accessing the same array
>instance->pd_list[], which is getting updated in AEN path, because
>> of this IO may not get correct PD info from instance->pd_list[] array.
>>
>> Signed-off-by: Adam Radford <adam.radford@lsi.com>
>> Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com>
>> ---
>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>b/drivers/scsi/megaraid/megaraid_sas.h
>> index 0c73ba4..e9e543c 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>> @@ -1531,6 +1531,7 @@ struct megasas_instance {
>>  	struct megasas_register_set __iomem *reg_set;
>>  	u32 *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY];
>>  	struct megasas_pd_list          pd_list[MEGASAS_MAX_PD];
>> +	struct megasas_pd_list          local_pd_list[MEGASAS_MAX_PD];
>>  	u8     ld_ids[MEGASAS_MAX_LD_IDS];
>>  	s8 init_id;
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>b/drivers/scsi/megaraid/megaraid_sas_base.c
>> index e62ff02..83ebc75 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>> @@ -3194,21 +3194,23 @@ megasas_get_pd_list(struct megasas_instance
>*instance)
>>  	     (le32_to_cpu(ci->count) <
>>  		  (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL)))
>{
>>
>> -		memset(instance->pd_list, 0,
>> +		memset(instance->local_pd_list, 0,
>>  			MEGASAS_MAX_PD * sizeof(struct megasas_pd_list));
>>
>>  		for (pd_index = 0; pd_index < le32_to_cpu(ci->count);
>pd_index++) {
>>
>> -			instance->pd_list[le16_to_cpu(pd_addr->deviceId)].tid
>	=
>> +			instance->local_pd_list[le16_to_cpu(pd_addr-
>>deviceId)].tid	=
>>  				le16_to_cpu(pd_addr->deviceId);
>> -			instance->pd_list[le16_to_cpu(pd_addr-
>>deviceId)].driveType	=
>> +			instance->local_pd_list[le16_to_cpu(pd_addr-
>>deviceId)].driveType	=
>>  							pd_addr->scsiDevType;
>> -			instance->pd_list[le16_to_cpu(pd_addr-
>>deviceId)].driveState	=
>> +			instance->local_pd_list[le16_to_cpu(pd_addr-
>>deviceId)].driveState	=
>>  							MR_PD_STATE_SYSTEM;
>>  			pd_addr++;
>>  		}
>>  	}
>>
>> +	memcpy(instance->pd_list, instance->local_pd_list,
>> +		sizeof(instance->pd_list));
>>  	pci_free_consistent(instance->pdev,
>>  				MEGASAS_MAX_PD * sizeof(struct MR_PD_LIST),
>>  				ci, ci_h);
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
>in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
  2013-10-17 15:10   ` Saxena, Sumit
@ 2013-10-17 15:48     ` Tomas Henzl
  2013-10-17 16:05       ` Saxena, Sumit
  0 siblings, 1 reply; 7+ messages in thread
From: Tomas Henzl @ 2013-10-17 15:48 UTC (permalink / raw)
  To: Saxena, Sumit, linux-scsi@vger.kernel.org
  Cc: jbottomley@parallels.com, Desai, Kashyap, aradford@gmail.com

On 10/17/2013 05:10 PM, Saxena, Sumit wrote:
>
>> -----Original Message-----
>> From: Tomas Henzl [mailto:thenzl@redhat.com]
>> Sent: Thursday, October 17, 2013 7:35 PM
>> To: Saxena, Sumit; linux-scsi@vger.kernel.org
>> Cc: jbottomley@parallels.com; Desai, Kashyap; aradford@gmail.com
>> Subject: Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem
>> between sysPD IO path and AEN path
>>
>> On 10/16/2013 01:34 PM, Sumit.Saxena@lsi.com wrote:
>>> There is syncronization problem between sysPD IO path and AEN path.
>> Driver maintains instance->pd_list[] array, which will get updated(by
>> calling function megasas_get_pd_list[]), whenever any of below events
>> occurs-
>>
>> Hi Sumit,
>> - I'm a bit confused here- there are two threads which might access the
>> same array, but the problem is still there
>>  when the second thread accesses the array during the final memcpy, I
>> have expected that you will add some locking,
>>  but maybe I'm missing something.
>> - now the code zeroes the pd_list even when the
>>  (ret == 0 && (ci->count < (MEGASAS_MAX_PD_CHANNELS *
>> MEGASAS_MAX_DEV_PER_CHANNEL)))
>>  is not true. This is I think new - is that intentional?
> Tomas,
>
> Having lock to synchronize this will be a good choice, but will need changes in multiple places.
> Without this patch: driver memsets instance->pd_list[] array to zero, same array will be accessed in sysPD IO path, that creates problem.
>
> To resolve this issue, we introduced a new array "instance->local_pd_list[]" array, which we will be filling from Firmware DMAed data and then finally memcpy that array to the "instance->pd_list[]". Since "instance->pd_list" is accessed in IO path, then no problem in memset zero here(memset is on "instance->local_pd_list").
>
> Final "Memcpy" operation is not saved with locking, reason is: "instance->pd_list" array is of type "struct megasas_pd_list", which is of 32-bit, so single entry in "instance->local_pd_list" array will be copied in one CPU cycle, and with current MR FW design, it will not be a problem even if IO path (or any other thread) is accessing old instance->pd_list[]. so we are safe here in memcpy() here.
>
> Adding lock will add overhead in IO path, which could be avoided is main reason to resolve this issue with this fix.

Thanks, what remains is my second question 
- now the code zeroes the pd_list even when the (ret == 0 && (ci->count < (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL)))
is not true.
This is I think new - is that intentional?

>
>
> Thanks,
> Sumit
>
>
>> Thanks, Tomas
>>
>>
>>>
>>>
>>>
>>>
>>>
>>> MR_EVT_PD_INSERTED
>>> MR_EVT_PD_REMOVED
>>> MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED
>>> MR_EVT_FOREIGN_CFG_IMPORTED
>>>
>>> At same time running sysPD IO will be accessing the same array
>> instance->pd_list[], which is getting updated in AEN path, because
>>> of this IO may not get correct PD info from instance->pd_list[] array.
>>>
>>> Signed-off-by: Adam Radford <adam.radford@lsi.com>
>>> Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com>
>>> ---
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>> b/drivers/scsi/megaraid/megaraid_sas.h
>>> index 0c73ba4..e9e543c 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>> @@ -1531,6 +1531,7 @@ struct megasas_instance {
>>>  	struct megasas_register_set __iomem *reg_set;
>>>  	u32 *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY];
>>>  	struct megasas_pd_list          pd_list[MEGASAS_MAX_PD];
>>> +	struct megasas_pd_list          local_pd_list[MEGASAS_MAX_PD];
>>>  	u8     ld_ids[MEGASAS_MAX_LD_IDS];
>>>  	s8 init_id;
>>>
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> index e62ff02..83ebc75 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> @@ -3194,21 +3194,23 @@ megasas_get_pd_list(struct megasas_instance
>> *instance)
>>>  	     (le32_to_cpu(ci->count) <
>>>  		  (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL)))
>> {
>>> -		memset(instance->pd_list, 0,
>>> +		memset(instance->local_pd_list, 0,
>>>  			MEGASAS_MAX_PD * sizeof(struct megasas_pd_list));
>>>
>>>  		for (pd_index = 0; pd_index < le32_to_cpu(ci->count);
>> pd_index++) {
>>> -			instance->pd_list[le16_to_cpu(pd_addr->deviceId)].tid
>> 	=
>>> +			instance->local_pd_list[le16_to_cpu(pd_addr-
>>> deviceId)].tid	=
>>>  				le16_to_cpu(pd_addr->deviceId);
>>> -			instance->pd_list[le16_to_cpu(pd_addr-
>>> deviceId)].driveType	=
>>> +			instance->local_pd_list[le16_to_cpu(pd_addr-
>>> deviceId)].driveType	=
>>>  							pd_addr->scsiDevType;
>>> -			instance->pd_list[le16_to_cpu(pd_addr-
>>> deviceId)].driveState	=
>>> +			instance->local_pd_list[le16_to_cpu(pd_addr-
>>> deviceId)].driveState	=
>>>  							MR_PD_STATE_SYSTEM;
>>>  			pd_addr++;
>>>  		}
>>>  	}
>>>
>>> +	memcpy(instance->pd_list, instance->local_pd_list,
>>> +		sizeof(instance->pd_list));
>>>  	pci_free_consistent(instance->pdev,
>>>  				MEGASAS_MAX_PD * sizeof(struct MR_PD_LIST),
>>>  				ci, ci_h);
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
>> in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
  2013-10-17 15:48     ` Tomas Henzl
@ 2013-10-17 16:05       ` Saxena, Sumit
  0 siblings, 0 replies; 7+ messages in thread
From: Saxena, Sumit @ 2013-10-17 16:05 UTC (permalink / raw)
  To: Tomas Henzl, linux-scsi@vger.kernel.org
  Cc: jbottomley@parallels.com, Desai, Kashyap, aradford@gmail.com



>-----Original Message-----
>From: Tomas Henzl [mailto:thenzl@redhat.com]
>Sent: Thursday, October 17, 2013 9:18 PM
>To: Saxena, Sumit; linux-scsi@vger.kernel.org
>Cc: jbottomley@parallels.com; Desai, Kashyap; aradford@gmail.com
>Subject: Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem
>between sysPD IO path and AEN path
>
>On 10/17/2013 05:10 PM, Saxena, Sumit wrote:
>>
>>> -----Original Message-----
>>> From: Tomas Henzl [mailto:thenzl@redhat.com]
>>> Sent: Thursday, October 17, 2013 7:35 PM
>>> To: Saxena, Sumit; linux-scsi@vger.kernel.org
>>> Cc: jbottomley@parallels.com; Desai, Kashyap; aradford@gmail.com
>>> Subject: Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem
>>> between sysPD IO path and AEN path
>>>
>>> On 10/16/2013 01:34 PM, Sumit.Saxena@lsi.com wrote:
>>>> There is syncronization problem between sysPD IO path and AEN path.
>>> Driver maintains instance->pd_list[] array, which will get updated(by
>>> calling function megasas_get_pd_list[]), whenever any of below events
>>> occurs-
>>>
>>> Hi Sumit,
>>> - I'm a bit confused here- there are two threads which might access
>>> the same array, but the problem is still there  when the second
>>> thread accesses the array during the final memcpy, I have expected
>>> that you will add some locking,  but maybe I'm missing something.
>>> - now the code zeroes the pd_list even when the  (ret == 0 &&
>>> (ci->count < (MEGASAS_MAX_PD_CHANNELS *
>>> MEGASAS_MAX_DEV_PER_CHANNEL)))
>>>  is not true. This is I think new - is that intentional?
>> Tomas,
>>
>> Having lock to synchronize this will be a good choice, but will need
>changes in multiple places.
>> Without this patch: driver memsets instance->pd_list[] array to zero,
>same array will be accessed in sysPD IO path, that creates problem.
>>
>> To resolve this issue, we introduced a new array "instance-
>>local_pd_list[]" array, which we will be filling from Firmware DMAed
>data and then finally memcpy that array to the "instance->pd_list[]".
>Since "instance->pd_list" is accessed in IO path, then no problem in
>memset zero here(memset is on "instance->local_pd_list").
>>
>> Final "Memcpy" operation is not saved with locking, reason is:
>"instance->pd_list" array is of type "struct megasas_pd_list", which is
>of 32-bit, so single entry in "instance->local_pd_list" array will be
>copied in one CPU cycle, and with current MR FW design, it will not be a
>problem even if IO path (or any other thread) is accessing old instance-
>>pd_list[]. so we are safe here in memcpy() here.
>>
>> Adding lock will add overhead in IO path, which could be avoided is
>main reason to resolve this issue with this fix.
>
>Thanks, what remains is my second question
>- now the code zeroes the pd_list even when the (ret == 0 && (ci->count
>< (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is not true.
>This is I think new - is that intentional?

Thanks for pointing this out, it's unintentional and memcpy() should be done only when (ret == 0 && (ci->count < (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is true, it did not cause problem because if (ret == 0 && (ci->count < (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is not true, still driver will memcpy "instance->local_pd_list" to "instance->pd_list", inspite of both arrays being 
same(extra overhead of memcpy, which is not needed). I will send out the updated patch.
>
>>
>>
>> Thanks,
>> Sumit
>>
>>
>>> Thanks, Tomas
>>>
>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> MR_EVT_PD_INSERTED
>>>> MR_EVT_PD_REMOVED
>>>> MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED
>>>> MR_EVT_FOREIGN_CFG_IMPORTED
>>>>
>>>> At same time running sysPD IO will be accessing the same array
>>> instance->pd_list[], which is getting updated in AEN path, because
>>>> of this IO may not get correct PD info from instance->pd_list[]
>array.
>>>>
>>>> Signed-off-by: Adam Radford <adam.radford@lsi.com>
>>>> Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com>
>>>> ---
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>>> index 0c73ba4..e9e543c 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>>> @@ -1531,6 +1531,7 @@ struct megasas_instance {
>>>>  	struct megasas_register_set __iomem *reg_set;
>>>>  	u32 *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY];
>>>>  	struct megasas_pd_list          pd_list[MEGASAS_MAX_PD];
>>>> +	struct megasas_pd_list          local_pd_list[MEGASAS_MAX_PD];
>>>>  	u8     ld_ids[MEGASAS_MAX_LD_IDS];
>>>>  	s8 init_id;
>>>>
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> index e62ff02..83ebc75 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> @@ -3194,21 +3194,23 @@ megasas_get_pd_list(struct megasas_instance
>>> *instance)
>>>>  	     (le32_to_cpu(ci->count) <
>>>>  		  (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL)))
>>> {
>>>> -		memset(instance->pd_list, 0,
>>>> +		memset(instance->local_pd_list, 0,
>>>>  			MEGASAS_MAX_PD * sizeof(struct megasas_pd_list));
>>>>
>>>>  		for (pd_index = 0; pd_index < le32_to_cpu(ci->count);
>>> pd_index++) {
>>>> -			instance->pd_list[le16_to_cpu(pd_addr->deviceId)].tid
>>> 	=
>>>> +			instance->local_pd_list[le16_to_cpu(pd_addr-
>>>> deviceId)].tid	=
>>>>  				le16_to_cpu(pd_addr->deviceId);
>>>> -			instance->pd_list[le16_to_cpu(pd_addr-
>>>> deviceId)].driveType	=
>>>> +			instance->local_pd_list[le16_to_cpu(pd_addr-
>>>> deviceId)].driveType	=
>>>>  							pd_addr->scsiDevType;
>>>> -			instance->pd_list[le16_to_cpu(pd_addr-
>>>> deviceId)].driveState	=
>>>> +			instance->local_pd_list[le16_to_cpu(pd_addr-
>>>> deviceId)].driveState	=
>>>>  							MR_PD_STATE_SYSTEM;
>>>>  			pd_addr++;
>>>>  		}
>>>>  	}
>>>>
>>>> +	memcpy(instance->pd_list, instance->local_pd_list,
>>>> +		sizeof(instance->pd_list));
>>>>  	pci_free_consistent(instance->pdev,
>>>>  				MEGASAS_MAX_PD * sizeof(struct MR_PD_LIST),
>>>>  				ci, ci_h);
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-
>scsi"
>>> in
>>>> the body of a message to majordomo@vger.kernel.org More majordomo
>>>> info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-10-17 16:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-16 11:34 [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path Sumit.Saxena
2013-10-16 21:44 ` James Bottomley
2013-10-17  4:15   ` Saxena, Sumit
2013-10-17 14:04 ` Tomas Henzl
2013-10-17 15:10   ` Saxena, Sumit
2013-10-17 15:48     ` Tomas Henzl
2013-10-17 16:05       ` Saxena, Sumit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox