* [PATCH V4 00/12] hpsa updates
@ 2017-05-04 22:50 Don Brace
2017-05-04 22:50 ` [PATCH V4 01/12] hpsa: update identify physical device structure Don Brace
` (12 more replies)
0 siblings, 13 replies; 16+ messages in thread
From: Don Brace @ 2017-05-04 22:50 UTC (permalink / raw)
To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett,
Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G,
Justin.Lindley, scott.benesh, POSWALD
Cc: linux-scsi
These patches are based on Linus's tree
These patches are for:
- RedHat BZ: 1404073 - [HPE 7.3 Bug] multipath failover not reliable
- Multipath failover support in general.
The changes are:
- update identify physical device structure
- align with FW
- stop getting enclosure info for externals
- no BMIC support
- update reset handler
- update to match out of box driver
- do not reset enclosures
- reset can sometimes hang
- rescan later if reset in progress
- wait for devices to settle.
- correct resets on retried commands
- was not calling scsi_done on retried completion
- correct queue depth for externals
- Code not in correct function
- separate monitor events from rescan worker
- allows driver to check for controller events more frequently
without affecting controller lockup detection or check
for changes on PTRAID devices.
- send ioaccel requests with 0 length down raid path
- avoid hang issues for customers running older firmware.
- remove abort handler
- align driver with our out of box driver
- bump driver version
- align version with out of box driver for multi-path changes
Changes since V1:
- includes changes based on Martin Wilck <mwilck@suse.com> reviews.
- cleanup reset handler
- added spin_lock protection around h->reset_in_progress
- this has helped solve a multipath issue, thanks Martin.
- renamed patch hpsa-separate-monitor-events-from-hearbeat-worker to
hpsa-separate-monitor-events-from-rescan-worker to more correctly
convey the patches intent.
- split out some commonality between the rescan worker and the
event monitor worker.
- updated the patch description.
Changes since V2:
- includes a change based on Tomas Henzl <thenzl@redhat.com> review
- Corrected placement of spin_unlock_irqrestore in patch
hpsa-clean-up-reset-handler
- Thanks for your attention to hpsa Tomas
Changes since V3:
- Removed patches hpsa-update-pci-ids and hpsa-change-driver-version
because they are already in. Sorry about that.
---
Don Brace (11):
hpsa: update identify physical device structure
hpsa: do not get enclosure info for external devices
hpsa: update reset handler
hpsa: do not reset enclosures
hpsa: rescan later if reset in progress
hpsa: correct resets on retried commands
hpsa: cleanup reset handler
hpsa: correct queue depth for externals
hpsa: send ioaccel requests with 0 length down raid path
hpsa: remove abort handler
hpsa: bump driver version
Scott Teel (1):
hpsa: separate monitor events from rescan worker
drivers/scsi/hpsa.c | 849 +++++++++++------------------------------------
drivers/scsi/hpsa.h | 4
drivers/scsi/hpsa_cmd.h | 20 +
3 files changed, 208 insertions(+), 665 deletions(-)
--
Signature
^ permalink raw reply [flat|nested] 16+ messages in thread* [PATCH V4 01/12] hpsa: update identify physical device structure 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace @ 2017-05-04 22:50 ` Don Brace 2017-05-04 22:50 ` [PATCH V4 02/12] hpsa: do not get enclosure info for external devices Don Brace ` (11 subsequent siblings) 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:50 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi - align with latest spec. - added __attribute((aligned(512))) Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa_cmd.h | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h index 5961705..078afe4 100644 --- a/drivers/scsi/hpsa_cmd.h +++ b/drivers/scsi/hpsa_cmd.h @@ -809,10 +809,7 @@ struct bmic_identify_physical_device { u8 max_temperature_degreesC; u8 logical_blocks_per_phys_block_exp; /* phyblocksize = 512*2^exp */ __le16 current_queue_depth_limit; - u8 switch_name[10]; - __le16 switch_port; - u8 alternate_paths_switch_name[40]; - u8 alternate_paths_switch_port[8]; + u8 reserved_switch_stuff[60]; __le16 power_on_hours; /* valid only if gas gauge supported */ __le16 percent_endurance_used; /* valid only if gas gauge supported. */ #define BMIC_PHYS_DRIVE_SSD_WEAROUT(idphydrv) \ @@ -828,11 +825,22 @@ struct bmic_identify_physical_device { (idphydrv->smart_carrier_authentication == 0x01) u8 smart_carrier_app_fw_version; u8 smart_carrier_bootloader_fw_version; + u8 sanitize_support_flags; + u8 drive_key_flags; u8 encryption_key_name[64]; __le32 misc_drive_flags; __le16 dek_index; - u8 padding[112]; -}; + __le16 hba_drive_encryption_flags; + __le16 max_overwrite_time; + __le16 max_block_erase_time; + __le16 max_crypto_erase_time; + u8 device_connector_info[5]; + u8 connector_name[8][8]; + u8 page_83_id[16]; + u8 max_link_rate[256]; + u8 neg_phys_link_rate[256]; + u8 box_conn_name[8]; +} __attribute((aligned(512))); struct bmic_sense_subsystem_info { u8 primary_slot_number; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 02/12] hpsa: do not get enclosure info for external devices 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace 2017-05-04 22:50 ` [PATCH V4 01/12] hpsa: update identify physical device structure Don Brace @ 2017-05-04 22:50 ` Don Brace 2017-05-04 22:50 ` [PATCH V4 03/12] hpsa: update reset handler Don Brace ` (10 subsequent siblings) 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:50 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi external shelves do not support BMICs. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 73daace..8e22aed 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -3353,6 +3353,11 @@ static void hpsa_get_enclosure_info(struct ctlr_info *h, bmic_device_index = GET_BMIC_DRIVE_NUMBER(&rle->lunid[0]); + if (encl_dev->target == -1 || encl_dev->lun == -1) { + rc = IO_OK; + goto out; + } + if (bmic_device_index == 0xFF00 || MASKED_DEVICE(&rle->lunid[0])) { rc = IO_OK; goto out; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 03/12] hpsa: update reset handler 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace 2017-05-04 22:50 ` [PATCH V4 01/12] hpsa: update identify physical device structure Don Brace 2017-05-04 22:50 ` [PATCH V4 02/12] hpsa: do not get enclosure info for external devices Don Brace @ 2017-05-04 22:50 ` Don Brace 2017-05-04 22:51 ` [PATCH V4 04/12] hpsa: do not reset enclosures Don Brace ` (9 subsequent siblings) 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:50 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi Use the return from TUR as a check for the device state. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.tell@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 8e22aed..9fb30c4 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -3090,7 +3090,7 @@ static int hpsa_do_reset(struct ctlr_info *h, struct hpsa_scsi_dev_t *dev, if (unlikely(rc)) atomic_set(&dev->reset_cmds_out, 0); else - wait_for_device_to_become_ready(h, scsi3addr, 0); + rc = wait_for_device_to_become_ready(h, scsi3addr, 0); mutex_unlock(&h->reset_mutex); return rc; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 04/12] hpsa: do not reset enclosures 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (2 preceding siblings ...) 2017-05-04 22:50 ` [PATCH V4 03/12] hpsa: update reset handler Don Brace @ 2017-05-04 22:51 ` Don Brace 2017-05-04 22:51 ` [PATCH V4 05/12] hpsa: rescan later if reset in progress Don Brace ` (8 subsequent siblings) 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:51 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi Prevent enclosure resets. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.tell@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 9fb30c4..2990897 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -5853,6 +5853,9 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) return FAILED; } + if (dev->devtype == TYPE_ENCLOSURE) + return SUCCESS; + /* if controller locked up, we can guarantee command won't complete */ if (lockup_detected(h)) { snprintf(msg, sizeof(msg), ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 05/12] hpsa: rescan later if reset in progress 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (3 preceding siblings ...) 2017-05-04 22:51 ` [PATCH V4 04/12] hpsa: do not reset enclosures Don Brace @ 2017-05-04 22:51 ` Don Brace 2017-05-04 22:51 ` [PATCH V4 06/12] hpsa: correct resets on retried commands Don Brace ` (7 subsequent siblings) 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:51 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi - schedule another scan. - mark current scan as completed Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 2990897..53a4f34 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -5620,6 +5620,7 @@ static void hpsa_scan_start(struct Scsi_Host *sh) */ if (h->reset_in_progress) { h->drv_req_rescan = 1; + hpsa_scan_complete(h); return; } ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 06/12] hpsa: correct resets on retried commands 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (4 preceding siblings ...) 2017-05-04 22:51 ` [PATCH V4 05/12] hpsa: rescan later if reset in progress Don Brace @ 2017-05-04 22:51 ` Don Brace 2017-05-04 22:51 ` [PATCH V4 07/12] hpsa: cleanup reset handler Don Brace ` (6 subsequent siblings) 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:51 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi - call scsi_done when the command completes. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 53a4f34..a2852da 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -5465,7 +5465,7 @@ static void hpsa_command_resubmit_worker(struct work_struct *work) return hpsa_cmd_free_and_done(c->h, c, cmd); } if (c->reset_pending) - return hpsa_cmd_resolve_and_free(c->h, c); + return hpsa_cmd_free_and_done(c->h, c, cmd); if (c->abort_pending) return hpsa_cmd_abort_and_free(c->h, c, cmd); if (c->cmd_type == CMD_IOACCEL2) { ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 07/12] hpsa: cleanup reset handler 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (5 preceding siblings ...) 2017-05-04 22:51 ` [PATCH V4 06/12] hpsa: correct resets on retried commands Don Brace @ 2017-05-04 22:51 ` Don Brace 2017-05-05 13:27 ` Tomas Henzl 2017-05-04 22:51 ` [PATCH V4 08/12] hpsa: correct queue depth for externals Don Brace ` (5 subsequent siblings) 12 siblings, 1 reply; 16+ messages in thread From: Don Brace @ 2017-05-04 22:51 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi - mark device state sooner. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 59 +++++++++++++++++++++++++++++++++++++++------------ drivers/scsi/hpsa.h | 1 + 2 files changed, 46 insertions(+), 14 deletions(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index a2852da..20b4e83 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -1859,10 +1859,13 @@ static void adjust_hpsa_scsi_table(struct ctlr_info *h, * A reset can cause a device status to change * re-schedule the scan to see what happened. */ + spin_lock_irqsave(&h->reset_lock, flags); if (h->reset_in_progress) { h->drv_req_rescan = 1; + spin_unlock_irqrestore(&h->reset_lock, flags); return; } + spin_unlock_irqrestore(&h->reset_lock, flags); added = kzalloc(sizeof(*added) * HPSA_MAX_DEVICES, GFP_KERNEL); removed = kzalloc(sizeof(*removed) * HPSA_MAX_DEVICES, GFP_KERNEL); @@ -5618,11 +5621,14 @@ static void hpsa_scan_start(struct Scsi_Host *sh) /* * Do the scan after a reset completion */ + spin_lock_irqsave(&h->reset_lock, flags); if (h->reset_in_progress) { h->drv_req_rescan = 1; + spin_unlock_irqrestore(&h->reset_lock, flags); hpsa_scan_complete(h); return; } + spin_unlock_irqrestore(&h->reset_lock, flags); hpsa_update_scsi_devices(h); @@ -5834,28 +5840,38 @@ static int wait_for_device_to_become_ready(struct ctlr_info *h, */ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) { - int rc; + int rc = SUCCESS; struct ctlr_info *h; struct hpsa_scsi_dev_t *dev; u8 reset_type; char msg[48]; + unsigned long flags; /* find the controller to which the command to be aborted was sent */ h = sdev_to_hba(scsicmd->device); if (h == NULL) /* paranoia */ return FAILED; - if (lockup_detected(h)) - return FAILED; + spin_lock_irqsave(&h->reset_lock, flags); + h->reset_in_progress = 1; + spin_unlock_irqrestore(&h->reset_lock, flags); + + if (lockup_detected(h)) { + rc = FAILED; + goto return_reset_status; + } dev = scsicmd->device->hostdata; if (!dev) { dev_err(&h->pdev->dev, "%s: device lookup failed\n", __func__); - return FAILED; + rc = FAILED; + goto return_reset_status; } - if (dev->devtype == TYPE_ENCLOSURE) - return SUCCESS; + if (dev->devtype == TYPE_ENCLOSURE) { + rc = SUCCESS; + goto return_reset_status; + } /* if controller locked up, we can guarantee command won't complete */ if (lockup_detected(h)) { @@ -5863,7 +5879,8 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) "cmd %d RESET FAILED, lockup detected", hpsa_get_cmd_index(scsicmd)); hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); - return FAILED; + rc = FAILED; + goto return_reset_status; } /* this reset request might be the result of a lockup; check */ @@ -5872,12 +5889,15 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) "cmd %d RESET FAILED, new lockup detected", hpsa_get_cmd_index(scsicmd)); hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); - return FAILED; + rc = FAILED; + goto return_reset_status; } /* Do not attempt on controller */ - if (is_hba_lunid(dev->scsi3addr)) - return SUCCESS; + if (is_hba_lunid(dev->scsi3addr)) { + rc = SUCCESS; + goto return_reset_status; + } if (is_logical_dev_addr_mode(dev->scsi3addr)) reset_type = HPSA_DEVICE_RESET_MSG; @@ -5888,17 +5908,24 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) reset_type == HPSA_DEVICE_RESET_MSG ? "logical " : "physical "); hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); - h->reset_in_progress = 1; - /* send a reset to the SCSI LUN which the command was sent to */ rc = hpsa_do_reset(h, dev, dev->scsi3addr, reset_type, DEFAULT_REPLY_QUEUE); + if (rc == 0) + rc = SUCCESS; + else + rc = FAILED; + sprintf(msg, "reset %s %s", reset_type == HPSA_DEVICE_RESET_MSG ? "logical " : "physical ", - rc == 0 ? "completed successfully" : "failed"); + rc == SUCCESS ? "completed successfully" : "failed"); hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); + +return_reset_status: + spin_lock_irqsave(&h->reset_lock, flags); h->reset_in_progress = 0; - return rc == 0 ? SUCCESS : FAILED; + spin_unlock_irqrestore(&h->reset_lock, flags); + return rc; } static void swizzle_abort_tag(u8 *tag) @@ -8649,10 +8676,13 @@ static void hpsa_rescan_ctlr_worker(struct work_struct *work) /* * Do the scan after the reset */ + spin_lock_irqsave(&h->reset_lock, flags); if (h->reset_in_progress) { h->drv_req_rescan = 1; + spin_unlock_irqrestore(&h->reset_lock, flags); return; } + spin_unlock_irqrestore(&h->reset_lock, flags); if (hpsa_ctlr_needs_rescan(h) || hpsa_offline_devices_ready(h)) { scsi_host_get(h->scsi_host); @@ -8759,6 +8789,7 @@ static int hpsa_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) spin_lock_init(&h->lock); spin_lock_init(&h->offline_device_lock); spin_lock_init(&h->scan_lock); + spin_lock_init(&h->reset_lock); atomic_set(&h->passthru_cmds_avail, HPSA_MAX_CONCURRENT_PASSTHRUS); atomic_set(&h->abort_cmds_available, HPSA_CMDS_RESERVED_FOR_ABORTS); diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h index 6f04f2a..5352664 100644 --- a/drivers/scsi/hpsa.h +++ b/drivers/scsi/hpsa.h @@ -301,6 +301,7 @@ struct ctlr_info { struct mutex reset_mutex; u8 reset_in_progress; struct hpsa_sas_node *sas_host; + spinlock_t reset_lock; }; struct offline_device_entry { ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH V4 07/12] hpsa: cleanup reset handler 2017-05-04 22:51 ` [PATCH V4 07/12] hpsa: cleanup reset handler Don Brace @ 2017-05-05 13:27 ` Tomas Henzl 2017-05-05 16:10 ` Don Brace 0 siblings, 1 reply; 16+ messages in thread From: Tomas Henzl @ 2017-05-05 13:27 UTC (permalink / raw) To: Don Brace, joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi On 5.5.2017 00:51, Don Brace wrote: > - mark device state sooner. > > Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> > Reviewed-by: Scott Teel <scott.teel@microsemi.com> > Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> > Signed-off-by: Don Brace <don.brace@microsemi.com> > --- > drivers/scsi/hpsa.c | 59 +++++++++++++++++++++++++++++++++++++++------------ > drivers/scsi/hpsa.h | 1 + > 2 files changed, 46 insertions(+), 14 deletions(-) > > diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c > index a2852da..20b4e83 100644 > --- a/drivers/scsi/hpsa.c > +++ b/drivers/scsi/hpsa.c > @@ -1859,10 +1859,13 @@ static void adjust_hpsa_scsi_table(struct ctlr_info *h, > * A reset can cause a device status to change > * re-schedule the scan to see what happened. > */ > + spin_lock_irqsave(&h->reset_lock, flags); > if (h->reset_in_progress) { > h->drv_req_rescan = 1; > + spin_unlock_irqrestore(&h->reset_lock, flags); > return; > } > + spin_unlock_irqrestore(&h->reset_lock, flags); > > added = kzalloc(sizeof(*added) * HPSA_MAX_DEVICES, GFP_KERNEL); > removed = kzalloc(sizeof(*removed) * HPSA_MAX_DEVICES, GFP_KERNEL); > @@ -5618,11 +5621,14 @@ static void hpsa_scan_start(struct Scsi_Host *sh) > /* > * Do the scan after a reset completion > */ > + spin_lock_irqsave(&h->reset_lock, flags); > if (h->reset_in_progress) { > h->drv_req_rescan = 1; > + spin_unlock_irqrestore(&h->reset_lock, flags); > hpsa_scan_complete(h); > return; > } > + spin_unlock_irqrestore(&h->reset_lock, flags); The name 'reset_in_progress' suggest that you want to protect something else from running in parallel to the hpsa_eh_device_reset_handler. Which code parts are you protecting ? The remainder of this function below doesn't seem to be protected, is that fine ? tomash > > hpsa_update_scsi_devices(h); > > @@ -5834,28 +5840,38 @@ static int wait_for_device_to_become_ready(struct ctlr_info *h, > */ > static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) > { > - int rc; > + int rc = SUCCESS; > struct ctlr_info *h; > struct hpsa_scsi_dev_t *dev; > u8 reset_type; > char msg[48]; > + unsigned long flags; > > /* find the controller to which the command to be aborted was sent */ > h = sdev_to_hba(scsicmd->device); > if (h == NULL) /* paranoia */ > return FAILED; > > - if (lockup_detected(h)) > - return FAILED; > + spin_lock_irqsave(&h->reset_lock, flags); > + h->reset_in_progress = 1; > + spin_unlock_irqrestore(&h->reset_lock, flags); > + > + if (lockup_detected(h)) { > + rc = FAILED; > + goto return_reset_status; > + } > > dev = scsicmd->device->hostdata; > if (!dev) { > dev_err(&h->pdev->dev, "%s: device lookup failed\n", __func__); > - return FAILED; > + rc = FAILED; > + goto return_reset_status; > } > > - if (dev->devtype == TYPE_ENCLOSURE) > - return SUCCESS; > + if (dev->devtype == TYPE_ENCLOSURE) { > + rc = SUCCESS; > + goto return_reset_status; > + } > > /* if controller locked up, we can guarantee command won't complete */ > if (lockup_detected(h)) { > @@ -5863,7 +5879,8 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) > "cmd %d RESET FAILED, lockup detected", > hpsa_get_cmd_index(scsicmd)); > hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); > - return FAILED; > + rc = FAILED; > + goto return_reset_status; > } > > /* this reset request might be the result of a lockup; check */ > @@ -5872,12 +5889,15 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) > "cmd %d RESET FAILED, new lockup detected", > hpsa_get_cmd_index(scsicmd)); > hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); > - return FAILED; > + rc = FAILED; > + goto return_reset_status; > } > > /* Do not attempt on controller */ > - if (is_hba_lunid(dev->scsi3addr)) > - return SUCCESS; > + if (is_hba_lunid(dev->scsi3addr)) { > + rc = SUCCESS; > + goto return_reset_status; > + } > > if (is_logical_dev_addr_mode(dev->scsi3addr)) > reset_type = HPSA_DEVICE_RESET_MSG; > @@ -5888,17 +5908,24 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) > reset_type == HPSA_DEVICE_RESET_MSG ? "logical " : "physical "); > hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); > > - h->reset_in_progress = 1; > - > /* send a reset to the SCSI LUN which the command was sent to */ > rc = hpsa_do_reset(h, dev, dev->scsi3addr, reset_type, > DEFAULT_REPLY_QUEUE); > + if (rc == 0) > + rc = SUCCESS; > + else > + rc = FAILED; > + > sprintf(msg, "reset %s %s", > reset_type == HPSA_DEVICE_RESET_MSG ? "logical " : "physical ", > - rc == 0 ? "completed successfully" : "failed"); > + rc == SUCCESS ? "completed successfully" : "failed"); > hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); > + > +return_reset_status: > + spin_lock_irqsave(&h->reset_lock, flags); > h->reset_in_progress = 0; > - return rc == 0 ? SUCCESS : FAILED; > + spin_unlock_irqrestore(&h->reset_lock, flags); > + return rc; > } > > static void swizzle_abort_tag(u8 *tag) > @@ -8649,10 +8676,13 @@ static void hpsa_rescan_ctlr_worker(struct work_struct *work) > /* > * Do the scan after the reset > */ > + spin_lock_irqsave(&h->reset_lock, flags); > if (h->reset_in_progress) { > h->drv_req_rescan = 1; > + spin_unlock_irqrestore(&h->reset_lock, flags); > return; > } > + spin_unlock_irqrestore(&h->reset_lock, flags); > > if (hpsa_ctlr_needs_rescan(h) || hpsa_offline_devices_ready(h)) { > scsi_host_get(h->scsi_host); > @@ -8759,6 +8789,7 @@ static int hpsa_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) > spin_lock_init(&h->lock); > spin_lock_init(&h->offline_device_lock); > spin_lock_init(&h->scan_lock); > + spin_lock_init(&h->reset_lock); > atomic_set(&h->passthru_cmds_avail, HPSA_MAX_CONCURRENT_PASSTHRUS); > atomic_set(&h->abort_cmds_available, HPSA_CMDS_RESERVED_FOR_ABORTS); > > diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h > index 6f04f2a..5352664 100644 > --- a/drivers/scsi/hpsa.h > +++ b/drivers/scsi/hpsa.h > @@ -301,6 +301,7 @@ struct ctlr_info { > struct mutex reset_mutex; > u8 reset_in_progress; > struct hpsa_sas_node *sas_host; > + spinlock_t reset_lock; > }; > > struct offline_device_entry { > ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH V4 07/12] hpsa: cleanup reset handler 2017-05-05 13:27 ` Tomas Henzl @ 2017-05-05 16:10 ` Don Brace 0 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-05 16:10 UTC (permalink / raw) To: Tomas Henzl, joseph.szczypek@hpe.com, Gerry Morong, John Hall, jejb@linux.vnet.ibm.com, Kevin Barnett, Mahesh Rajashekhara, Bader Ali - Saleh, hch@infradead.org, Scott Teel, Viswas G, Justin Lindley, Scott Benesh, POSWALD@suse.com Cc: linux-scsi@vger.kernel.org > -----Original Message----- > On 5.5.2017 00:51, Don Brace wrote: > > - mark device state sooner. > > > > Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> > > Reviewed-by: Scott Teel <scott.teel@microsemi.com> > > Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> > > Signed-off-by: Don Brace <don.brace@microsemi.com> > > --- > > drivers/scsi/hpsa.c | 59 > +++++++++++++++++++++++++++++++++++++++------------ > > drivers/scsi/hpsa.h | 1 + > > 2 files changed, 46 insertions(+), 14 deletions(-) > > > > diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c > > index a2852da..20b4e83 100644 > > --- a/drivers/scsi/hpsa.c > > +++ b/drivers/scsi/hpsa.c > > @@ -1859,10 +1859,13 @@ static void adjust_hpsa_scsi_table(struct > ctlr_info *h, > > * A reset can cause a device status to change > > * re-schedule the scan to see what happened. > > */ > > + spin_lock_irqsave(&h->reset_lock, flags); > > if (h->reset_in_progress) { > > h->drv_req_rescan = 1; > > + spin_unlock_irqrestore(&h->reset_lock, flags); > > return; > > } > > + spin_unlock_irqrestore(&h->reset_lock, flags); > > > > added = kzalloc(sizeof(*added) * HPSA_MAX_DEVICES, GFP_KERNEL); > > removed = kzalloc(sizeof(*removed) * HPSA_MAX_DEVICES, > GFP_KERNEL); > > @@ -5618,11 +5621,14 @@ static void hpsa_scan_start(struct Scsi_Host > *sh) > > /* > > * Do the scan after a reset completion > > */ > > + spin_lock_irqsave(&h->reset_lock, flags); > > if (h->reset_in_progress) { > > h->drv_req_rescan = 1; > > + spin_unlock_irqrestore(&h->reset_lock, flags); > > hpsa_scan_complete(h); > > return; > > } > > + spin_unlock_irqrestore(&h->reset_lock, flags); > > The name 'reset_in_progress' suggest that you want to protect > something else from running in parallel to the > hpsa_eh_device_reset_handler. > Which code parts are you protecting ? > The remainder of this function below doesn't seem to be protected, is that > fine ? > tomash This code was added to prevent a scan operation from occurring while a reset operation is in progress. It schedules a scan later if a reset operation is detected. There are more checks for this later on in the scan code also. I found this helps during multipath failover testing. So, yes this is ok. I appreciate your input. Thanks, Don Brace ESC - Smart Storage Microsemi Corporation > > > > > hpsa_update_scsi_devices(h); > > > > @@ -5834,28 +5840,38 @@ static int > wait_for_device_to_become_ready(struct ctlr_info *h, > > */ > > static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) > > { > > - int rc; > > + int rc = SUCCESS; > > struct ctlr_info *h; > > struct hpsa_scsi_dev_t *dev; > > u8 reset_type; > > char msg[48]; > > + unsigned long flags; > > > > /* find the controller to which the command to be aborted was sent */ > > h = sdev_to_hba(scsicmd->device); > > if (h == NULL) /* paranoia */ > > return FAILED; > > > > - if (lockup_detected(h)) > > - return FAILED; > > + spin_lock_irqsave(&h->reset_lock, flags); > > + h->reset_in_progress = 1; > > + spin_unlock_irqrestore(&h->reset_lock, flags); > > + > > + if (lockup_detected(h)) { > > + rc = FAILED; > > + goto return_reset_status; > > + } > > > > dev = scsicmd->device->hostdata; > > if (!dev) { > > dev_err(&h->pdev->dev, "%s: device lookup failed\n", __func__); > > - return FAILED; > > + rc = FAILED; > > + goto return_reset_status; > > } > > > > - if (dev->devtype == TYPE_ENCLOSURE) > > - return SUCCESS; > > + if (dev->devtype == TYPE_ENCLOSURE) { > > + rc = SUCCESS; > > + goto return_reset_status; > > + } > > > > /* if controller locked up, we can guarantee command won't complete > */ > > if (lockup_detected(h)) { > > @@ -5863,7 +5879,8 @@ static int hpsa_eh_device_reset_handler(struct > scsi_cmnd *scsicmd) > > "cmd %d RESET FAILED, lockup detected", > > hpsa_get_cmd_index(scsicmd)); > > hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); > > - return FAILED; > > + rc = FAILED; > > + goto return_reset_status; > > } > > > > /* this reset request might be the result of a lockup; check */ > > @@ -5872,12 +5889,15 @@ static int hpsa_eh_device_reset_handler(struct > scsi_cmnd *scsicmd) > > "cmd %d RESET FAILED, new lockup detected", > > hpsa_get_cmd_index(scsicmd)); > > hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); > > - return FAILED; > > + rc = FAILED; > > + goto return_reset_status; > > } > > > > /* Do not attempt on controller */ > > - if (is_hba_lunid(dev->scsi3addr)) > > - return SUCCESS; > > + if (is_hba_lunid(dev->scsi3addr)) { > > + rc = SUCCESS; > > + goto return_reset_status; > > + } > > > > if (is_logical_dev_addr_mode(dev->scsi3addr)) > > reset_type = HPSA_DEVICE_RESET_MSG; > > @@ -5888,17 +5908,24 @@ static int hpsa_eh_device_reset_handler(struct > scsi_cmnd *scsicmd) > > reset_type == HPSA_DEVICE_RESET_MSG ? "logical " : "physical "); > > hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); > > > > - h->reset_in_progress = 1; > > - > > /* send a reset to the SCSI LUN which the command was sent to */ > > rc = hpsa_do_reset(h, dev, dev->scsi3addr, reset_type, > > DEFAULT_REPLY_QUEUE); > > + if (rc == 0) > > + rc = SUCCESS; > > + else > > + rc = FAILED; > > + > > sprintf(msg, "reset %s %s", > > reset_type == HPSA_DEVICE_RESET_MSG ? "logical " : "physical ", > > - rc == 0 ? "completed successfully" : "failed"); > > + rc == SUCCESS ? "completed successfully" : "failed"); > > hpsa_show_dev_msg(KERN_WARNING, h, dev, msg); > > + > > +return_reset_status: > > + spin_lock_irqsave(&h->reset_lock, flags); > > h->reset_in_progress = 0; > > - return rc == 0 ? SUCCESS : FAILED; > > + spin_unlock_irqrestore(&h->reset_lock, flags); > > + return rc; > > } > > > > static void swizzle_abort_tag(u8 *tag) > > @@ -8649,10 +8676,13 @@ static void hpsa_rescan_ctlr_worker(struct > work_struct *work) > > /* > > * Do the scan after the reset > > */ > > + spin_lock_irqsave(&h->reset_lock, flags); > > if (h->reset_in_progress) { > > h->drv_req_rescan = 1; > > + spin_unlock_irqrestore(&h->reset_lock, flags); > > return; > > } > > + spin_unlock_irqrestore(&h->reset_lock, flags); > > > > if (hpsa_ctlr_needs_rescan(h) || hpsa_offline_devices_ready(h)) { > > scsi_host_get(h->scsi_host); > > @@ -8759,6 +8789,7 @@ static int hpsa_init_one(struct pci_dev *pdev, > const struct pci_device_id *ent) > > spin_lock_init(&h->lock); > > spin_lock_init(&h->offline_device_lock); > > spin_lock_init(&h->scan_lock); > > + spin_lock_init(&h->reset_lock); > > atomic_set(&h->passthru_cmds_avail, > HPSA_MAX_CONCURRENT_PASSTHRUS); > > atomic_set(&h->abort_cmds_available, > HPSA_CMDS_RESERVED_FOR_ABORTS); > > > > diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h > > index 6f04f2a..5352664 100644 > > --- a/drivers/scsi/hpsa.h > > +++ b/drivers/scsi/hpsa.h > > @@ -301,6 +301,7 @@ struct ctlr_info { > > struct mutex reset_mutex; > > u8 reset_in_progress; > > struct hpsa_sas_node *sas_host; > > + spinlock_t reset_lock; > > }; > > > > struct offline_device_entry { > > ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH V4 08/12] hpsa: correct queue depth for externals 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (6 preceding siblings ...) 2017-05-04 22:51 ` [PATCH V4 07/12] hpsa: cleanup reset handler Don Brace @ 2017-05-04 22:51 ` Don Brace 2017-05-04 22:51 ` [PATCH V4 09/12] hpsa: separate monitor events from rescan worker Don Brace ` (4 subsequent siblings) 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:51 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi - queue depth assignment not in correct place, had no effect. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 22 ++++++++++------------ drivers/scsi/hpsa.h | 1 + 2 files changed, 11 insertions(+), 12 deletions(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 20b4e83..5b0cc0e 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -2069,10 +2069,13 @@ static int hpsa_slave_configure(struct scsi_device *sdev) sd = sdev->hostdata; sdev->no_uld_attach = !sd || !sd->expose_device; - if (sd) - queue_depth = sd->queue_depth != 0 ? - sd->queue_depth : sdev->host->can_queue; - else + if (sd) { + if (sd->external) + queue_depth = EXTERNAL_QD; + else + queue_depth = sd->queue_depth != 0 ? + sd->queue_depth : sdev->host->can_queue; + } else queue_depth = sdev->host->can_queue; scsi_change_queue_depth(sdev, queue_depth); @@ -3915,6 +3918,9 @@ static int hpsa_update_device_info(struct ctlr_info *h, this_device->queue_depth = h->nr_cmds; } + if (this_device->external) + this_device->queue_depth = EXTERNAL_QD; + if (is_OBDR_device) { /* See if this is a One-Button-Disaster-Recovery device * by looking for "$DR-10" at offset 43 in inquiry data. @@ -4123,14 +4129,6 @@ static void hpsa_get_ioaccel_drive_info(struct ctlr_info *h, int rc; struct ext_report_lun_entry *rle; - /* - * external targets don't support BMIC - */ - if (dev->external) { - dev->queue_depth = 7; - return; - } - rle = &rlep->LUN[rle_index]; dev->ioaccel_handle = rle->ioaccel_handle; diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h index 5352664..61dd54a 100644 --- a/drivers/scsi/hpsa.h +++ b/drivers/scsi/hpsa.h @@ -57,6 +57,7 @@ struct hpsa_sas_phy { bool added_to_port; }; +#define EXTERNAL_QD 7 struct hpsa_scsi_dev_t { unsigned int devtype; int bus, target, lun; /* as presented to the OS */ ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 09/12] hpsa: separate monitor events from rescan worker 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (7 preceding siblings ...) 2017-05-04 22:51 ` [PATCH V4 08/12] hpsa: correct queue depth for externals Don Brace @ 2017-05-04 22:51 ` Don Brace 2017-05-04 22:51 ` [PATCH V4 10/12] hpsa: send ioaccel requests with 0 length down raid path Don Brace ` (3 subsequent siblings) 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:51 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi From: Scott Teel <scott.teel@microsemi.com> create new worker thread to monitor controller events - both the rescan and event monitor workers can cause a rescan to occur however for multipath we have found that we need to respond faster than the normal scheduled rescan interval for path fail-overs. - getting controller events only involves reading a register, but the rescan worker can obtain an updated LUN list when there is a PTRAID device present. - move common code to a separate function. advantages: - detect controller events more frequently. - leave rescan thread interval at 30 seconds. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 76 +++++++++++++++++++++++++++++++++++++++------------ drivers/scsi/hpsa.h | 1 + 2 files changed, 59 insertions(+), 18 deletions(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 5b0cc0e..2ec9079 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -1110,6 +1110,7 @@ static int is_firmware_flash_cmd(u8 *cdb) */ #define HEARTBEAT_SAMPLE_INTERVAL_DURING_FLASH (240 * HZ) #define HEARTBEAT_SAMPLE_INTERVAL (30 * HZ) +#define HPSA_EVENT_MONITOR_INTERVAL (15 * HZ) static void dial_down_lockup_detection_during_fw_flash(struct ctlr_info *h, struct CommandList *c) { @@ -8661,15 +8662,10 @@ static int hpsa_luns_changed(struct ctlr_info *h) return rc; } -static void hpsa_rescan_ctlr_worker(struct work_struct *work) +static void hpsa_perform_rescan(struct ctlr_info *h) { + struct Scsi_Host *sh = NULL; unsigned long flags; - struct ctlr_info *h = container_of(to_delayed_work(work), - struct ctlr_info, rescan_ctlr_work); - - - if (h->remove_in_progress) - return; /* * Do the scan after the reset @@ -8682,23 +8678,63 @@ static void hpsa_rescan_ctlr_worker(struct work_struct *work) } spin_unlock_irqrestore(&h->reset_lock, flags); - if (hpsa_ctlr_needs_rescan(h) || hpsa_offline_devices_ready(h)) { - scsi_host_get(h->scsi_host); + sh = scsi_host_get(h->scsi_host); + if (sh != NULL) { + hpsa_scan_start(sh); + scsi_host_put(sh); + h->drv_req_rescan = 0; + } +} + +/* + * watch for controller events + */ +static void hpsa_event_monitor_worker(struct work_struct *work) +{ + struct ctlr_info *h = container_of(to_delayed_work(work), + struct ctlr_info, event_monitor_work); + unsigned long flags; + + spin_lock_irqsave(&h->lock, flags); + if (h->remove_in_progress) { + spin_unlock_irqrestore(&h->lock, flags); + return; + } + spin_unlock_irqrestore(&h->lock, flags); + + if (hpsa_ctlr_needs_rescan(h)) { hpsa_ack_ctlr_events(h); - hpsa_scan_start(h->scsi_host); - scsi_host_put(h->scsi_host); + hpsa_perform_rescan(h); + } + + spin_lock_irqsave(&h->lock, flags); + if (!h->remove_in_progress) + schedule_delayed_work(&h->event_monitor_work, + HPSA_EVENT_MONITOR_INTERVAL); + spin_unlock_irqrestore(&h->lock, flags); +} + +static void hpsa_rescan_ctlr_worker(struct work_struct *work) +{ + unsigned long flags; + struct ctlr_info *h = container_of(to_delayed_work(work), + struct ctlr_info, rescan_ctlr_work); + + spin_lock_irqsave(&h->lock, flags); + if (h->remove_in_progress) { + spin_unlock_irqrestore(&h->lock, flags); + return; + } + spin_unlock_irqrestore(&h->lock, flags); + + if (h->drv_req_rescan || hpsa_offline_devices_ready(h)) { + hpsa_perform_rescan(h); } else if (h->discovery_polling) { hpsa_disable_rld_caching(h); if (hpsa_luns_changed(h)) { - struct Scsi_Host *sh = NULL; - dev_info(&h->pdev->dev, "driver discovery polling rescan.\n"); - sh = scsi_host_get(h->scsi_host); - if (sh != NULL) { - hpsa_scan_start(sh); - scsi_host_put(sh); - } + hpsa_perform_rescan(h); } } spin_lock_irqsave(&h->lock, flags); @@ -8964,6 +9000,9 @@ static int hpsa_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) INIT_DELAYED_WORK(&h->rescan_ctlr_work, hpsa_rescan_ctlr_worker); queue_delayed_work(h->rescan_ctlr_wq, &h->rescan_ctlr_work, h->heartbeat_sample_interval); + INIT_DELAYED_WORK(&h->event_monitor_work, hpsa_event_monitor_worker); + schedule_delayed_work(&h->event_monitor_work, + HPSA_EVENT_MONITOR_INTERVAL); return 0; clean7: /* perf, sg, cmd, irq, shost, pci, lu, aer/h */ @@ -9132,6 +9171,7 @@ static void hpsa_remove_one(struct pci_dev *pdev) spin_unlock_irqrestore(&h->lock, flags); cancel_delayed_work_sync(&h->monitor_ctlr_work); cancel_delayed_work_sync(&h->rescan_ctlr_work); + cancel_delayed_work_sync(&h->event_monitor_work); destroy_workqueue(h->rescan_ctlr_wq); destroy_workqueue(h->resubmit_wq); diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h index 61dd54a..c9c4927 100644 --- a/drivers/scsi/hpsa.h +++ b/drivers/scsi/hpsa.h @@ -245,6 +245,7 @@ struct ctlr_info { u32 __percpu *lockup_detected; struct delayed_work monitor_ctlr_work; struct delayed_work rescan_ctlr_work; + struct delayed_work event_monitor_work; int remove_in_progress; /* Address of h->q[x] is passed to intr handler to know which queue */ u8 q[MAX_REPLY_QUEUES]; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 10/12] hpsa: send ioaccel requests with 0 length down raid path 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (8 preceding siblings ...) 2017-05-04 22:51 ` [PATCH V4 09/12] hpsa: separate monitor events from rescan worker Don Brace @ 2017-05-04 22:51 ` Don Brace 2017-05-04 22:51 ` [PATCH V4 11/12] hpsa: remove abort handler Don Brace ` (2 subsequent siblings) 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:51 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi - Block I/O requests with 0 length transfers which go down the ioaccel path. This causes lockup issues down in the basecode. - These issues have been fixed, but there are customers who are experiencing the issues when running older firmware. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 61 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 2ec9079..ea778cc 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -4591,7 +4591,55 @@ static int hpsa_scatter_gather(struct ctlr_info *h, return 0; } -#define IO_ACCEL_INELIGIBLE (1) +#define BUFLEN 128 +static inline void warn_zero_length_transfer(struct ctlr_info *h, + u8 *cdb, int cdb_len, + const char *func) +{ + char buf[BUFLEN]; + int outlen; + int i; + + outlen = scnprintf(buf, BUFLEN, + "%s: Blocking zero-length request: CDB:", func); + for (i = 0; i < cdb_len; i++) + outlen += scnprintf(buf+outlen, BUFLEN - outlen, + "%02hhx", cdb[i]); + dev_warn(&h->pdev->dev, "%s\n", buf); +} + +#define IO_ACCEL_INELIGIBLE 1 +/* zero-length transfers trigger hardware errors. */ +static bool is_zero_length_transfer(u8 *cdb) +{ + u32 block_cnt; + + /* Block zero-length transfer sizes on certain commands. */ + switch (cdb[0]) { + case READ_10: + case WRITE_10: + case VERIFY: /* 0x2F */ + case WRITE_VERIFY: /* 0x2E */ + block_cnt = get_unaligned_be16(&cdb[7]); + break; + case READ_12: + case WRITE_12: + case VERIFY_12: /* 0xAF */ + case WRITE_VERIFY_12: /* 0xAE */ + block_cnt = get_unaligned_be32(&cdb[6]); + break; + case READ_16: + case WRITE_16: + case VERIFY_16: /* 0x8F */ + block_cnt = get_unaligned_be32(&cdb[10]); + break; + default: + return false; + } + + return block_cnt == 0; +} + static int fixup_ioaccel_cdb(u8 *cdb, int *cdb_len) { int is_write = 0; @@ -4658,6 +4706,12 @@ static int hpsa_scsi_ioaccel1_queue_command(struct ctlr_info *h, BUG_ON(cmd->cmd_len > IOACCEL1_IOFLAGS_CDBLEN_MAX); + if (is_zero_length_transfer(cdb)) { + warn_zero_length_transfer(h, cdb, cdb_len, __func__); + atomic_dec(&phys_disk->ioaccel_cmds_out); + return IO_ACCEL_INELIGIBLE; + } + if (fixup_ioaccel_cdb(cdb, &cdb_len)) { atomic_dec(&phys_disk->ioaccel_cmds_out); return IO_ACCEL_INELIGIBLE; @@ -4822,6 +4876,12 @@ static int hpsa_scsi_ioaccel2_queue_command(struct ctlr_info *h, BUG_ON(scsi_sg_count(cmd) > h->maxsgentries); + if (is_zero_length_transfer(cdb)) { + warn_zero_length_transfer(h, cdb, cdb_len, __func__); + atomic_dec(&phys_disk->ioaccel_cmds_out); + return IO_ACCEL_INELIGIBLE; + } + if (fixup_ioaccel_cdb(cdb, &cdb_len)) { atomic_dec(&phys_disk->ioaccel_cmds_out); return IO_ACCEL_INELIGIBLE; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 11/12] hpsa: remove abort handler 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (9 preceding siblings ...) 2017-05-04 22:51 ` [PATCH V4 10/12] hpsa: send ioaccel requests with 0 length down raid path Don Brace @ 2017-05-04 22:51 ` Don Brace 2017-05-04 22:51 ` [PATCH V4 12/12] hpsa: bump driver version Don Brace 2017-05-12 3:07 ` [PATCH V4 00/12] hpsa updates Martin K. Petersen 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:51 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi - simplify the driver - there are a lot of quirky racy conditions not handled - causes more aborts/resets when the number of commands to be aborted is large, such as in multi-path fail-overs. - has been turned off in our internal driver since 8/31/2015 Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 621 +-------------------------------------------------- drivers/scsi/hpsa.h | 1 2 files changed, 8 insertions(+), 614 deletions(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index ea778cc..9a631e3 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -258,7 +258,6 @@ static int hpsa_scan_finished(struct Scsi_Host *sh, static int hpsa_change_queue_depth(struct scsi_device *sdev, int qdepth); static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd); -static int hpsa_eh_abort_handler(struct scsi_cmnd *scsicmd); static int hpsa_slave_alloc(struct scsi_device *sdev); static int hpsa_slave_configure(struct scsi_device *sdev); static void hpsa_slave_destroy(struct scsi_device *sdev); @@ -326,7 +325,7 @@ static inline bool hpsa_is_cmd_idle(struct CommandList *c) static inline bool hpsa_is_pending_event(struct CommandList *c) { - return c->abort_pending || c->reset_pending; + return c->reset_pending; } /* extract sense key, asc, and ascq from sense data. -1 means invalid. */ @@ -581,12 +580,6 @@ static u32 soft_unresettable_controller[] = { 0x409D0E11, /* Smart Array 6400 EM */ }; -static u32 needs_abort_tags_swizzled[] = { - 0x323D103C, /* Smart Array P700m */ - 0x324a103C, /* Smart Array P712m */ - 0x324b103C, /* SmartArray P711m */ -}; - static int board_id_in_array(u32 a[], int nelems, u32 board_id) { int i; @@ -615,12 +608,6 @@ static int ctlr_is_resettable(u32 board_id) ctlr_is_soft_resettable(board_id); } -static int ctlr_needs_abort_tags_swizzled(u32 board_id) -{ - return board_id_in_array(needs_abort_tags_swizzled, - ARRAY_SIZE(needs_abort_tags_swizzled), board_id); -} - static ssize_t host_show_resettable(struct device *dev, struct device_attribute *attr, char *buf) { @@ -928,8 +915,8 @@ static struct device_attribute *hpsa_shost_attrs[] = { NULL, }; -#define HPSA_NRESERVED_CMDS (HPSA_CMDS_RESERVED_FOR_ABORTS + \ - HPSA_CMDS_RESERVED_FOR_DRIVER + HPSA_MAX_CONCURRENT_PASSTHRUS) +#define HPSA_NRESERVED_CMDS (HPSA_CMDS_RESERVED_FOR_DRIVER +\ + HPSA_MAX_CONCURRENT_PASSTHRUS) static struct scsi_host_template hpsa_driver_template = { .module = THIS_MODULE, @@ -941,7 +928,6 @@ static struct scsi_host_template hpsa_driver_template = { .change_queue_depth = hpsa_change_queue_depth, .this_id = -1, .use_clustering = ENABLE_CLUSTERING, - .eh_abort_handler = hpsa_eh_abort_handler, .eh_device_reset_handler = hpsa_eh_device_reset_handler, .ioctl = hpsa_ioctl, .slave_alloc = hpsa_slave_alloc, @@ -2361,26 +2347,12 @@ static void hpsa_cmd_resolve_events(struct ctlr_info *h, bool do_wake = false; /* - * Prevent the following race in the abort handler: - * - * 1. LLD is requested to abort a SCSI command - * 2. The SCSI command completes - * 3. The struct CommandList associated with step 2 is made available - * 4. New I/O request to LLD to another LUN re-uses struct CommandList - * 5. Abort handler follows scsi_cmnd->host_scribble and - * finds struct CommandList and tries to aborts it - * Now we have aborted the wrong command. - * - * Reset c->scsi_cmd here so that the abort or reset handler will know + * Reset c->scsi_cmd here so that the reset handler will know * this command has completed. Then, check to see if the handler is * waiting for this command, and, if so, wake it. */ c->scsi_cmd = SCSI_CMD_IDLE; mb(); /* Declare command idle before checking for pending events. */ - if (c->abort_pending) { - do_wake = true; - c->abort_pending = false; - } if (c->reset_pending) { unsigned long flags; struct hpsa_scsi_dev_t *dev; @@ -2423,20 +2395,6 @@ static void hpsa_retry_cmd(struct ctlr_info *h, struct CommandList *c) queue_work_on(raw_smp_processor_id(), h->resubmit_wq, &c->work); } -static void hpsa_set_scsi_cmd_aborted(struct scsi_cmnd *cmd) -{ - cmd->result = DID_ABORT << 16; -} - -static void hpsa_cmd_abort_and_free(struct ctlr_info *h, struct CommandList *c, - struct scsi_cmnd *cmd) -{ - hpsa_set_scsi_cmd_aborted(cmd); - dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n", - c->Request.CDB, c->err_info->ScsiStatus); - hpsa_cmd_resolve_and_free(h, c); -} - static void process_ioaccel2_completion(struct ctlr_info *h, struct CommandList *c, struct scsi_cmnd *cmd, struct hpsa_scsi_dev_t *dev) @@ -2561,12 +2519,9 @@ static void complete_scsi_command(struct CommandList *cp) return hpsa_cmd_free_and_done(h, cp, cmd); } - if ((unlikely(hpsa_is_pending_event(cp)))) { + if ((unlikely(hpsa_is_pending_event(cp)))) if (cp->reset_pending) return hpsa_cmd_free_and_done(h, cp, cmd); - if (cp->abort_pending) - return hpsa_cmd_abort_and_free(h, cp, cmd); - } if (cp->cmd_type == CMD_IOACCEL2) return process_ioaccel2_completion(h, cp, cmd, dev); @@ -2686,8 +2641,8 @@ static void complete_scsi_command(struct CommandList *cp) cp->Request.CDB); break; case CMD_ABORTED: - /* Return now to avoid calling scsi_done(). */ - return hpsa_cmd_abort_and_free(h, cp, cmd); + cmd->result = DID_ABORT << 16; + break; case CMD_ABORT_FAILED: cmd->result = DID_ERROR << 16; dev_warn(&h->pdev->dev, "CDB %16phN : abort failed\n", @@ -3793,53 +3748,6 @@ static unsigned char hpsa_volume_offline(struct ctlr_info *h, return HPSA_LV_OK; } -/* - * Find out if a logical device supports aborts by simply trying one. - * Smart Array may claim not to support aborts on logical drives, but - * if a MSA2000 * is connected, the drives on that will be presented - * by the Smart Array as logical drives, and aborts may be sent to - * those devices successfully. So the simplest way to find out is - * to simply try an abort and see how the device responds. - */ -static int hpsa_device_supports_aborts(struct ctlr_info *h, - unsigned char *scsi3addr) -{ - struct CommandList *c; - struct ErrorInfo *ei; - int rc = 0; - - u64 tag = (u64) -1; /* bogus tag */ - - /* Assume that physical devices support aborts */ - if (!is_logical_dev_addr_mode(scsi3addr)) - return 1; - - c = cmd_alloc(h); - - (void) fill_cmd(c, HPSA_ABORT_MSG, h, &tag, 0, 0, scsi3addr, TYPE_MSG); - (void) hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, - DEFAULT_TIMEOUT); - /* no unmap needed here because no data xfer. */ - ei = c->err_info; - switch (ei->CommandStatus) { - case CMD_INVALID: - rc = 0; - break; - case CMD_UNABORTABLE: - case CMD_ABORT_FAILED: - rc = 1; - break; - case CMD_TMF_STATUS: - rc = hpsa_evaluate_tmf_status(h, c); - break; - default: - rc = 0; - break; - } - cmd_free(h, c); - return rc; -} - static int hpsa_update_device_info(struct ctlr_info *h, unsigned char scsi3addr[], struct hpsa_scsi_dev_t *this_device, unsigned char *is_OBDR_device) @@ -3939,31 +3847,6 @@ static int hpsa_update_device_info(struct ctlr_info *h, return rc; } -static void hpsa_update_device_supports_aborts(struct ctlr_info *h, - struct hpsa_scsi_dev_t *dev, u8 *scsi3addr) -{ - unsigned long flags; - int rc, entry; - /* - * See if this device supports aborts. If we already know - * the device, we already know if it supports aborts, otherwise - * we have to find out if it supports aborts by trying one. - */ - spin_lock_irqsave(&h->devlock, flags); - rc = hpsa_scsi_find_entry(dev, h->dev, h->ndevices, &entry); - if ((rc == DEVICE_SAME || rc == DEVICE_UPDATED) && - entry >= 0 && entry < h->ndevices) { - dev->supports_aborts = h->dev[entry]->supports_aborts; - spin_unlock_irqrestore(&h->devlock, flags); - } else { - spin_unlock_irqrestore(&h->devlock, flags); - dev->supports_aborts = - hpsa_device_supports_aborts(h, scsi3addr); - if (dev->supports_aborts < 0) - dev->supports_aborts = 0; - } -} - /* * Helper function to assign bus, target, lun mapping of devices. * Logical drive target and lun are assigned at this time, but @@ -4001,35 +3884,6 @@ static void figure_bus_target_lun(struct ctlr_info *h, 0, lunid & 0x3fff); } - -/* - * Get address of physical disk used for an ioaccel2 mode command: - * 1. Extract ioaccel2 handle from the command. - * 2. Find a matching ioaccel2 handle from list of physical disks. - * 3. Return: - * 1 and set scsi3addr to address of matching physical - * 0 if no matching physical disk was found. - */ -static int hpsa_get_pdisk_of_ioaccel2(struct ctlr_info *h, - struct CommandList *ioaccel2_cmd_to_abort, unsigned char *scsi3addr) -{ - struct io_accel2_cmd *c2 = - &h->ioaccel2_cmd_pool[ioaccel2_cmd_to_abort->cmdindex]; - unsigned long flags; - int i; - - spin_lock_irqsave(&h->devlock, flags); - for (i = 0; i < h->ndevices; i++) - if (h->dev[i]->ioaccel_handle == le32_to_cpu(c2->scsi_nexus)) { - memcpy(scsi3addr, h->dev[i]->scsi3addr, - sizeof(h->dev[i]->scsi3addr)); - spin_unlock_irqrestore(&h->devlock, flags); - return 1; - } - spin_unlock_irqrestore(&h->devlock, flags); - return 0; -} - static int figure_external_status(struct ctlr_info *h, int raid_ctlr_position, int i, int nphysicals, int nlocal_logicals) { @@ -4394,7 +4248,6 @@ static void hpsa_update_scsi_devices(struct ctlr_info *h) } figure_bus_target_lun(h, lunaddrbytes, tmpdevice); - hpsa_update_device_supports_aborts(h, tmpdevice, lunaddrbytes); this_device = currentsd[ncurrent]; /* Turn on discovery_polling if there are ext target devices. @@ -5528,8 +5381,6 @@ static void hpsa_command_resubmit_worker(struct work_struct *work) } if (c->reset_pending) return hpsa_cmd_free_and_done(c->h, c, cmd); - if (c->abort_pending) - return hpsa_cmd_abort_and_free(c->h, c, cmd); if (c->cmd_type == CMD_IOACCEL2) { struct ctlr_info *h = c->h; struct io_accel2_cmd *c2 = &h->ioaccel2_cmd_pool[c->cmdindex]; @@ -5987,433 +5838,6 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd) return rc; } -static void swizzle_abort_tag(u8 *tag) -{ - u8 original_tag[8]; - - memcpy(original_tag, tag, 8); - tag[0] = original_tag[3]; - tag[1] = original_tag[2]; - tag[2] = original_tag[1]; - tag[3] = original_tag[0]; - tag[4] = original_tag[7]; - tag[5] = original_tag[6]; - tag[6] = original_tag[5]; - tag[7] = original_tag[4]; -} - -static void hpsa_get_tag(struct ctlr_info *h, - struct CommandList *c, __le32 *taglower, __le32 *tagupper) -{ - u64 tag; - if (c->cmd_type == CMD_IOACCEL1) { - struct io_accel1_cmd *cm1 = (struct io_accel1_cmd *) - &h->ioaccel_cmd_pool[c->cmdindex]; - tag = le64_to_cpu(cm1->tag); - *tagupper = cpu_to_le32(tag >> 32); - *taglower = cpu_to_le32(tag); - return; - } - if (c->cmd_type == CMD_IOACCEL2) { - struct io_accel2_cmd *cm2 = (struct io_accel2_cmd *) - &h->ioaccel2_cmd_pool[c->cmdindex]; - /* upper tag not used in ioaccel2 mode */ - memset(tagupper, 0, sizeof(*tagupper)); - *taglower = cm2->Tag; - return; - } - tag = le64_to_cpu(c->Header.tag); - *tagupper = cpu_to_le32(tag >> 32); - *taglower = cpu_to_le32(tag); -} - -static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr, - struct CommandList *abort, int reply_queue) -{ - int rc = IO_OK; - struct CommandList *c; - struct ErrorInfo *ei; - __le32 tagupper, taglower; - - c = cmd_alloc(h); - - /* fill_cmd can't fail here, no buffer to map */ - (void) fill_cmd(c, HPSA_ABORT_MSG, h, &abort->Header.tag, - 0, 0, scsi3addr, TYPE_MSG); - if (h->needs_abort_tags_swizzled) - swizzle_abort_tag(&c->Request.CDB[4]); - (void) hpsa_scsi_do_simple_cmd(h, c, reply_queue, DEFAULT_TIMEOUT); - hpsa_get_tag(h, abort, &taglower, &tagupper); - dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: do_simple_cmd(abort) completed.\n", - __func__, tagupper, taglower); - /* no unmap needed here because no data xfer. */ - - ei = c->err_info; - switch (ei->CommandStatus) { - case CMD_SUCCESS: - break; - case CMD_TMF_STATUS: - rc = hpsa_evaluate_tmf_status(h, c); - break; - case CMD_UNABORTABLE: /* Very common, don't make noise. */ - rc = -1; - break; - default: - dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: interpreting error.\n", - __func__, tagupper, taglower); - hpsa_scsi_interpret_error(h, c); - rc = -1; - break; - } - cmd_free(h, c); - dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: Finished.\n", - __func__, tagupper, taglower); - return rc; -} - -static void setup_ioaccel2_abort_cmd(struct CommandList *c, struct ctlr_info *h, - struct CommandList *command_to_abort, int reply_queue) -{ - struct io_accel2_cmd *c2 = &h->ioaccel2_cmd_pool[c->cmdindex]; - struct hpsa_tmf_struct *ac = (struct hpsa_tmf_struct *) c2; - struct io_accel2_cmd *c2a = - &h->ioaccel2_cmd_pool[command_to_abort->cmdindex]; - struct scsi_cmnd *scmd = command_to_abort->scsi_cmd; - struct hpsa_scsi_dev_t *dev = scmd->device->hostdata; - - if (!dev) - return; - - /* - * We're overlaying struct hpsa_tmf_struct on top of something which - * was allocated as a struct io_accel2_cmd, so we better be sure it - * actually fits, and doesn't overrun the error info space. - */ - BUILD_BUG_ON(sizeof(struct hpsa_tmf_struct) > - sizeof(struct io_accel2_cmd)); - BUG_ON(offsetof(struct io_accel2_cmd, error_data) < - offsetof(struct hpsa_tmf_struct, error_len) + - sizeof(ac->error_len)); - - c->cmd_type = IOACCEL2_TMF; - c->scsi_cmd = SCSI_CMD_BUSY; - - /* Adjust the DMA address to point to the accelerated command buffer */ - c->busaddr = (u32) h->ioaccel2_cmd_pool_dhandle + - (c->cmdindex * sizeof(struct io_accel2_cmd)); - BUG_ON(c->busaddr & 0x0000007F); - - memset(ac, 0, sizeof(*c2)); /* yes this is correct */ - ac->iu_type = IOACCEL2_IU_TMF_TYPE; - ac->reply_queue = reply_queue; - ac->tmf = IOACCEL2_TMF_ABORT; - ac->it_nexus = cpu_to_le32(dev->ioaccel_handle); - memset(ac->lun_id, 0, sizeof(ac->lun_id)); - ac->tag = cpu_to_le64(c->cmdindex << DIRECT_LOOKUP_SHIFT); - ac->abort_tag = cpu_to_le64(le32_to_cpu(c2a->Tag)); - ac->error_ptr = cpu_to_le64(c->busaddr + - offsetof(struct io_accel2_cmd, error_data)); - ac->error_len = cpu_to_le32(sizeof(c2->error_data)); -} - -/* ioaccel2 path firmware cannot handle abort task requests. - * Change abort requests to physical target reset, and send to the - * address of the physical disk used for the ioaccel 2 command. - * Return 0 on success (IO_OK) - * -1 on failure - */ - -static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h, - unsigned char *scsi3addr, struct CommandList *abort, int reply_queue) -{ - int rc = IO_OK; - struct scsi_cmnd *scmd; /* scsi command within request being aborted */ - struct hpsa_scsi_dev_t *dev; /* device to which scsi cmd was sent */ - unsigned char phys_scsi3addr[8]; /* addr of phys disk with volume */ - unsigned char *psa = &phys_scsi3addr[0]; - - /* Get a pointer to the hpsa logical device. */ - scmd = abort->scsi_cmd; - dev = (struct hpsa_scsi_dev_t *)(scmd->device->hostdata); - if (dev == NULL) { - dev_warn(&h->pdev->dev, - "Cannot abort: no device pointer for command.\n"); - return -1; /* not abortable */ - } - - if (h->raid_offload_debug > 0) - dev_info(&h->pdev->dev, - "scsi %d:%d:%d:%d %s scsi3addr 0x%8phN\n", - h->scsi_host->host_no, dev->bus, dev->target, dev->lun, - "Reset as abort", scsi3addr); - - if (!dev->offload_enabled) { - dev_warn(&h->pdev->dev, - "Can't abort: device is not operating in HP SSD Smart Path mode.\n"); - return -1; /* not abortable */ - } - - /* Incoming scsi3addr is logical addr. We need physical disk addr. */ - if (!hpsa_get_pdisk_of_ioaccel2(h, abort, psa)) { - dev_warn(&h->pdev->dev, "Can't abort: Failed lookup of physical address.\n"); - return -1; /* not abortable */ - } - - /* send the reset */ - if (h->raid_offload_debug > 0) - dev_info(&h->pdev->dev, - "Reset as abort: Resetting physical device at scsi3addr 0x%8phN\n", - psa); - rc = hpsa_do_reset(h, dev, psa, HPSA_PHYS_TARGET_RESET, reply_queue); - if (rc != 0) { - dev_warn(&h->pdev->dev, - "Reset as abort: Failed on physical device at scsi3addr 0x%8phN\n", - psa); - return rc; /* failed to reset */ - } - - /* wait for device to recover */ - if (wait_for_device_to_become_ready(h, psa, reply_queue) != 0) { - dev_warn(&h->pdev->dev, - "Reset as abort: Failed: Device never recovered from reset: 0x%8phN\n", - psa); - return -1; /* failed to recover */ - } - - /* device recovered */ - dev_info(&h->pdev->dev, - "Reset as abort: Device recovered from reset: scsi3addr 0x%8phN\n", - psa); - - return rc; /* success */ -} - -static int hpsa_send_abort_ioaccel2(struct ctlr_info *h, - struct CommandList *abort, int reply_queue) -{ - int rc = IO_OK; - struct CommandList *c; - __le32 taglower, tagupper; - struct hpsa_scsi_dev_t *dev; - struct io_accel2_cmd *c2; - - dev = abort->scsi_cmd->device->hostdata; - if (!dev) - return -1; - - if (!dev->offload_enabled && !dev->hba_ioaccel_enabled) - return -1; - - c = cmd_alloc(h); - setup_ioaccel2_abort_cmd(c, h, abort, reply_queue); - c2 = &h->ioaccel2_cmd_pool[c->cmdindex]; - (void) hpsa_scsi_do_simple_cmd(h, c, reply_queue, DEFAULT_TIMEOUT); - hpsa_get_tag(h, abort, &taglower, &tagupper); - dev_dbg(&h->pdev->dev, - "%s: Tag:0x%08x:%08x: do_simple_cmd(ioaccel2 abort) completed.\n", - __func__, tagupper, taglower); - /* no unmap needed here because no data xfer. */ - - dev_dbg(&h->pdev->dev, - "%s: Tag:0x%08x:%08x: abort service response = 0x%02x.\n", - __func__, tagupper, taglower, c2->error_data.serv_response); - switch (c2->error_data.serv_response) { - case IOACCEL2_SERV_RESPONSE_TMF_COMPLETE: - case IOACCEL2_SERV_RESPONSE_TMF_SUCCESS: - rc = 0; - break; - case IOACCEL2_SERV_RESPONSE_TMF_REJECTED: - case IOACCEL2_SERV_RESPONSE_FAILURE: - case IOACCEL2_SERV_RESPONSE_TMF_WRONG_LUN: - rc = -1; - break; - default: - dev_warn(&h->pdev->dev, - "%s: Tag:0x%08x:%08x: unknown abort service response 0x%02x\n", - __func__, tagupper, taglower, - c2->error_data.serv_response); - rc = -1; - } - cmd_free(h, c); - dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: Finished.\n", __func__, - tagupper, taglower); - return rc; -} - -static int hpsa_send_abort_both_ways(struct ctlr_info *h, - struct hpsa_scsi_dev_t *dev, struct CommandList *abort, int reply_queue) -{ - /* - * ioccelerator mode 2 commands should be aborted via the - * accelerated path, since RAID path is unaware of these commands, - * but not all underlying firmware can handle abort TMF. - * Change abort to physical device reset when abort TMF is unsupported. - */ - if (abort->cmd_type == CMD_IOACCEL2) { - if ((HPSATMF_IOACCEL_ENABLED & h->TMFSupportFlags) || - dev->physical_device) - return hpsa_send_abort_ioaccel2(h, abort, - reply_queue); - else - return hpsa_send_reset_as_abort_ioaccel2(h, - dev->scsi3addr, - abort, reply_queue); - } - return hpsa_send_abort(h, dev->scsi3addr, abort, reply_queue); -} - -/* Find out which reply queue a command was meant to return on */ -static int hpsa_extract_reply_queue(struct ctlr_info *h, - struct CommandList *c) -{ - if (c->cmd_type == CMD_IOACCEL2) - return h->ioaccel2_cmd_pool[c->cmdindex].reply_queue; - return c->Header.ReplyQueue; -} - -/* - * Limit concurrency of abort commands to prevent - * over-subscription of commands - */ -static inline int wait_for_available_abort_cmd(struct ctlr_info *h) -{ -#define ABORT_CMD_WAIT_MSECS 5000 - return !wait_event_timeout(h->abort_cmd_wait_queue, - atomic_dec_if_positive(&h->abort_cmds_available) >= 0, - msecs_to_jiffies(ABORT_CMD_WAIT_MSECS)); -} - -/* Send an abort for the specified command. - * If the device and controller support it, - * send a task abort request. - */ -static int hpsa_eh_abort_handler(struct scsi_cmnd *sc) -{ - - int rc; - struct ctlr_info *h; - struct hpsa_scsi_dev_t *dev; - struct CommandList *abort; /* pointer to command to be aborted */ - struct scsi_cmnd *as; /* ptr to scsi cmd inside aborted command. */ - char msg[256]; /* For debug messaging. */ - int ml = 0; - __le32 tagupper, taglower; - int refcount, reply_queue; - - if (sc == NULL) - return FAILED; - - if (sc->device == NULL) - return FAILED; - - /* Find the controller of the command to be aborted */ - h = sdev_to_hba(sc->device); - if (h == NULL) - return FAILED; - - /* Find the device of the command to be aborted */ - dev = sc->device->hostdata; - if (!dev) { - dev_err(&h->pdev->dev, "%s FAILED, Device lookup failed.\n", - msg); - return FAILED; - } - - /* If controller locked up, we can guarantee command won't complete */ - if (lockup_detected(h)) { - hpsa_show_dev_msg(KERN_WARNING, h, dev, - "ABORT FAILED, lockup detected"); - return FAILED; - } - - /* This is a good time to check if controller lockup has occurred */ - if (detect_controller_lockup(h)) { - hpsa_show_dev_msg(KERN_WARNING, h, dev, - "ABORT FAILED, new lockup detected"); - return FAILED; - } - - /* Check that controller supports some kind of task abort */ - if (!(HPSATMF_PHYS_TASK_ABORT & h->TMFSupportFlags) && - !(HPSATMF_LOG_TASK_ABORT & h->TMFSupportFlags)) - return FAILED; - - memset(msg, 0, sizeof(msg)); - ml += sprintf(msg+ml, "scsi %d:%d:%d:%llu %s %p", - h->scsi_host->host_no, sc->device->channel, - sc->device->id, sc->device->lun, - "Aborting command", sc); - - /* Get SCSI command to be aborted */ - abort = (struct CommandList *) sc->host_scribble; - if (abort == NULL) { - /* This can happen if the command already completed. */ - return SUCCESS; - } - refcount = atomic_inc_return(&abort->refcount); - if (refcount == 1) { /* Command is done already. */ - cmd_free(h, abort); - return SUCCESS; - } - - /* Don't bother trying the abort if we know it won't work. */ - if (abort->cmd_type != CMD_IOACCEL2 && - abort->cmd_type != CMD_IOACCEL1 && !dev->supports_aborts) { - cmd_free(h, abort); - return FAILED; - } - - /* - * Check that we're aborting the right command. - * It's possible the CommandList already completed and got re-used. - */ - if (abort->scsi_cmd != sc) { - cmd_free(h, abort); - return SUCCESS; - } - - abort->abort_pending = true; - hpsa_get_tag(h, abort, &taglower, &tagupper); - reply_queue = hpsa_extract_reply_queue(h, abort); - ml += sprintf(msg+ml, "Tag:0x%08x:%08x ", tagupper, taglower); - as = abort->scsi_cmd; - if (as != NULL) - ml += sprintf(msg+ml, - "CDBLen: %d CDB: 0x%02x%02x... SN: 0x%lx ", - as->cmd_len, as->cmnd[0], as->cmnd[1], - as->serial_number); - dev_warn(&h->pdev->dev, "%s BEING SENT\n", msg); - hpsa_show_dev_msg(KERN_WARNING, h, dev, "Aborting command"); - - /* - * Command is in flight, or possibly already completed - * by the firmware (but not to the scsi mid layer) but we can't - * distinguish which. Send the abort down. - */ - if (wait_for_available_abort_cmd(h)) { - dev_warn(&h->pdev->dev, - "%s FAILED, timeout waiting for an abort command to become available.\n", - msg); - cmd_free(h, abort); - return FAILED; - } - rc = hpsa_send_abort_both_ways(h, dev, abort, reply_queue); - atomic_inc(&h->abort_cmds_available); - wake_up_all(&h->abort_cmd_wait_queue); - if (rc != 0) { - dev_warn(&h->pdev->dev, "%s SENT, FAILED\n", msg); - hpsa_show_dev_msg(KERN_WARNING, h, dev, - "FAILED to abort command"); - cmd_free(h, abort); - return FAILED; - } - dev_info(&h->pdev->dev, "%s SENT, SUCCESS\n", msg); - wait_event(h->event_sync_wait_queue, - abort->scsi_cmd != sc || lockup_detected(h)); - cmd_free(h, abort); - return !lockup_detected(h) ? SUCCESS : FAILED; -} - /* * For operations with an associated SCSI command, a command block is allocated * at init, and managed by cmd_tagged_alloc() and cmd_tagged_free() using the @@ -6459,9 +5883,7 @@ static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c) { /* * Release our reference to the block. We don't need to do anything - * else to free it, because it is accessed by index. (There's no point - * in checking the result of the decrement, since we cannot guarantee - * that there isn't a concurrent abort which is also accessing it.) + * else to free it, because it is accessed by index. */ (void)atomic_dec(&c->refcount); } @@ -7000,7 +6422,6 @@ static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h, int cmd_type) { int pci_dir = XFER_NONE; - u64 tag; /* for commands to be aborted */ c->cmd_type = CMD_IOCTL_PEND; c->scsi_cmd = SCSI_CMD_BUSY; @@ -7184,27 +6605,6 @@ static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h, c->Request.CDB[6] = 0x00; c->Request.CDB[7] = 0x00; break; - case HPSA_ABORT_MSG: - memcpy(&tag, buff, sizeof(tag)); - dev_dbg(&h->pdev->dev, - "Abort Tag:0x%016llx using rqst Tag:0x%016llx", - tag, c->Header.tag); - c->Request.CDBLen = 16; - c->Request.type_attr_dir = - TYPE_ATTR_DIR(cmd_type, - ATTR_SIMPLE, XFER_WRITE); - c->Request.Timeout = 0; /* Don't time out */ - c->Request.CDB[0] = HPSA_TASK_MANAGEMENT; - c->Request.CDB[1] = HPSA_TMF_ABORT_TASK; - c->Request.CDB[2] = 0x00; /* reserved */ - c->Request.CDB[3] = 0x00; /* reserved */ - /* Tag to abort goes in CDB[4]-CDB[11] */ - memcpy(&c->Request.CDB[4], &tag, sizeof(tag)); - c->Request.CDB[12] = 0x00; /* reserved */ - c->Request.CDB[13] = 0x00; /* reserved */ - c->Request.CDB[14] = 0x00; /* reserved */ - c->Request.CDB[15] = 0x00; /* reserved */ - break; default: dev_warn(&h->pdev->dev, "unknown message type %d\n", cmd); @@ -8162,9 +7562,6 @@ static int hpsa_pci_init(struct ctlr_info *h) h->product_name = products[prod_index].product_name; h->access = *(products[prod_index].access); - h->needs_abort_tags_swizzled = - ctlr_needs_abort_tags_swizzled(h->board_id); - pci_disable_link_state(h->pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 | PCIE_LINK_STATE_CLKPM); @@ -8885,7 +8282,6 @@ static int hpsa_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) spin_lock_init(&h->scan_lock); spin_lock_init(&h->reset_lock); atomic_set(&h->passthru_cmds_avail, HPSA_MAX_CONCURRENT_PASSTHRUS); - atomic_set(&h->abort_cmds_available, HPSA_CMDS_RESERVED_FOR_ABORTS); /* Allocate and clear per-cpu variable lockup_detected */ h->lockup_detected = alloc_percpu(u32); @@ -8937,7 +8333,6 @@ static int hpsa_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) if (rc) goto clean5; /* cmd, irq, shost, pci, lu, aer/h */ init_waitqueue_head(&h->scan_wait_queue); - init_waitqueue_head(&h->abort_cmd_wait_queue); init_waitqueue_head(&h->event_sync_wait_queue); mutex_init(&h->reset_mutex); h->scan_finished = 1; /* no scan currently in progress */ diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h index c9c4927..1c49741 100644 --- a/drivers/scsi/hpsa.h +++ b/drivers/scsi/hpsa.h @@ -298,7 +298,6 @@ struct ctlr_info { struct workqueue_struct *resubmit_wq; struct workqueue_struct *rescan_ctlr_wq; atomic_t abort_cmds_available; - wait_queue_head_t abort_cmd_wait_queue; wait_queue_head_t event_sync_wait_queue; struct mutex reset_mutex; u8 reset_in_progress; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V4 12/12] hpsa: bump driver version 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (10 preceding siblings ...) 2017-05-04 22:51 ` [PATCH V4 11/12] hpsa: remove abort handler Don Brace @ 2017-05-04 22:51 ` Don Brace 2017-05-12 3:07 ` [PATCH V4 00/12] hpsa updates Martin K. Petersen 12 siblings, 0 replies; 16+ messages in thread From: Don Brace @ 2017-05-04 22:51 UTC (permalink / raw) To: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD Cc: linux-scsi Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> --- drivers/scsi/hpsa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 9a631e3..9934947 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -60,7 +60,7 @@ * HPSA_DRIVER_VERSION must be 3 byte values (0-255) separated by '.' * with an optional trailing '-' followed by a byte value (0-255). */ -#define HPSA_DRIVER_VERSION "3.4.18-0" +#define HPSA_DRIVER_VERSION "3.4.20-0" #define DRIVER_NAME "HP HPSA Driver (v " HPSA_DRIVER_VERSION ")" #define HPSA "hpsa" ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH V4 00/12] hpsa updates 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace ` (11 preceding siblings ...) 2017-05-04 22:51 ` [PATCH V4 12/12] hpsa: bump driver version Don Brace @ 2017-05-12 3:07 ` Martin K. Petersen 12 siblings, 0 replies; 16+ messages in thread From: Martin K. Petersen @ 2017-05-12 3:07 UTC (permalink / raw) To: Don Brace Cc: joseph.szczypek, gerry.morong, john.hall, jejb, Kevin.Barnett, Mahesh.Rajashekhara, bader.alisaleh, hch, scott.teel, Viswas.G, Justin.Lindley, scott.benesh, POSWALD, linux-scsi Don, Applied to 4.13/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2017-05-12 3:08 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-05-04 22:50 [PATCH V4 00/12] hpsa updates Don Brace 2017-05-04 22:50 ` [PATCH V4 01/12] hpsa: update identify physical device structure Don Brace 2017-05-04 22:50 ` [PATCH V4 02/12] hpsa: do not get enclosure info for external devices Don Brace 2017-05-04 22:50 ` [PATCH V4 03/12] hpsa: update reset handler Don Brace 2017-05-04 22:51 ` [PATCH V4 04/12] hpsa: do not reset enclosures Don Brace 2017-05-04 22:51 ` [PATCH V4 05/12] hpsa: rescan later if reset in progress Don Brace 2017-05-04 22:51 ` [PATCH V4 06/12] hpsa: correct resets on retried commands Don Brace 2017-05-04 22:51 ` [PATCH V4 07/12] hpsa: cleanup reset handler Don Brace 2017-05-05 13:27 ` Tomas Henzl 2017-05-05 16:10 ` Don Brace 2017-05-04 22:51 ` [PATCH V4 08/12] hpsa: correct queue depth for externals Don Brace 2017-05-04 22:51 ` [PATCH V4 09/12] hpsa: separate monitor events from rescan worker Don Brace 2017-05-04 22:51 ` [PATCH V4 10/12] hpsa: send ioaccel requests with 0 length down raid path Don Brace 2017-05-04 22:51 ` [PATCH V4 11/12] hpsa: remove abort handler Don Brace 2017-05-04 22:51 ` [PATCH V4 12/12] hpsa: bump driver version Don Brace 2017-05-12 3:07 ` [PATCH V4 00/12] hpsa updates Martin K. Petersen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox