* [PATCH 0/2] Ensure FCoE target interrupts work @ 2024-01-30 16:42 Lee Duncan 2024-01-30 16:42 ` [PATCH 1/2] Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock" Lee Duncan 2024-01-30 16:42 ` [PATCH 2/2] fnic: move fnic_fnic_flush_tx() to a work queue Lee Duncan 0 siblings, 2 replies; 6+ messages in thread From: Lee Duncan @ 2024-01-30 16:42 UTC (permalink / raw) To: linux-scsi; +Cc: linux-kernel, hare, Lee Duncan From: Lee Duncan <lduncan@suse.com> Commit 1a1975551943 "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock" changed locking for fnic/FCoE, but it did so by disabling interrupts where they weren't disabled before, and this caused FCoE targets to go offline. Reverting that patch fixed the issue. But to handle the problem originally addressed by the commit, instead of modifying the locking, move the work to be done into a work queue. Lee Duncan (1): Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock" Hannes Reinecke (1): fnic: move fnic_fnic_flush_tx() to a work queue drivers/scsi/fcoe/fcoe_ctlr.c | 20 ++++++++------------ drivers/scsi/fnic/fnic.h | 3 ++- drivers/scsi/fnic/fnic_fcs.c | 3 ++- drivers/scsi/fnic/fnic_main.c | 1 + drivers/scsi/fnic/fnic_scsi.c | 4 ++-- 5 files changed, 15 insertions(+), 16 deletions(-) -- 2.43.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock" 2024-01-30 16:42 [PATCH 0/2] Ensure FCoE target interrupts work Lee Duncan @ 2024-01-30 16:42 ` Lee Duncan 2024-02-06 1:14 ` Hannes Reinecke 2024-01-30 16:42 ` [PATCH 2/2] fnic: move fnic_fnic_flush_tx() to a work queue Lee Duncan 1 sibling, 1 reply; 6+ messages in thread From: Lee Duncan @ 2024-01-30 16:42 UTC (permalink / raw) To: linux-scsi; +Cc: linux-kernel, hare, Lee Duncan From: Lee Duncan <lduncan@suse.com> This reverts commit 1a1975551943f681772720f639ff42fbaa746212 This commit causes interrupts to be lost for FCoE devices, since it changed sping locks from "bh" to "irqsave". Instead, a work queue should be used, and will be addressed in a separate patch. Fixes: 1a1975551943f681772720f639ff42fbaa746212 Signed-off-by: Lee Duncan <lduncan@suse.com> --- drivers/scsi/fcoe/fcoe_ctlr.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/drivers/scsi/fcoe/fcoe_ctlr.c b/drivers/scsi/fcoe/fcoe_ctlr.c index 19eee108db02..5c8d1ba3f8f3 100644 --- a/drivers/scsi/fcoe/fcoe_ctlr.c +++ b/drivers/scsi/fcoe/fcoe_ctlr.c @@ -319,17 +319,16 @@ static void fcoe_ctlr_announce(struct fcoe_ctlr *fip) { struct fcoe_fcf *sel; struct fcoe_fcf *fcf; - unsigned long flags; mutex_lock(&fip->ctlr_mutex); - spin_lock_irqsave(&fip->ctlr_lock, flags); + spin_lock_bh(&fip->ctlr_lock); kfree_skb(fip->flogi_req); fip->flogi_req = NULL; list_for_each_entry(fcf, &fip->fcfs, list) fcf->flogi_sent = 0; - spin_unlock_irqrestore(&fip->ctlr_lock, flags); + spin_unlock_bh(&fip->ctlr_lock); sel = fip->sel_fcf; if (sel && ether_addr_equal(sel->fcf_mac, fip->dest_addr)) @@ -700,7 +699,6 @@ int fcoe_ctlr_els_send(struct fcoe_ctlr *fip, struct fc_lport *lport, { struct fc_frame *fp; struct fc_frame_header *fh; - unsigned long flags; u16 old_xid; u8 op; u8 mac[ETH_ALEN]; @@ -734,11 +732,11 @@ int fcoe_ctlr_els_send(struct fcoe_ctlr *fip, struct fc_lport *lport, op = FIP_DT_FLOGI; if (fip->mode == FIP_MODE_VN2VN) break; - spin_lock_irqsave(&fip->ctlr_lock, flags); + spin_lock_bh(&fip->ctlr_lock); kfree_skb(fip->flogi_req); fip->flogi_req = skb; fip->flogi_req_send = 1; - spin_unlock_irqrestore(&fip->ctlr_lock, flags); + spin_unlock_bh(&fip->ctlr_lock); schedule_work(&fip->timer_work); return -EINPROGRESS; case ELS_FDISC: @@ -1707,11 +1705,10 @@ static int fcoe_ctlr_flogi_send_locked(struct fcoe_ctlr *fip) static int fcoe_ctlr_flogi_retry(struct fcoe_ctlr *fip) { struct fcoe_fcf *fcf; - unsigned long flags; int error; mutex_lock(&fip->ctlr_mutex); - spin_lock_irqsave(&fip->ctlr_lock, flags); + spin_lock_bh(&fip->ctlr_lock); LIBFCOE_FIP_DBG(fip, "re-sending FLOGI - reselect\n"); fcf = fcoe_ctlr_select(fip); if (!fcf || fcf->flogi_sent) { @@ -1722,7 +1719,7 @@ static int fcoe_ctlr_flogi_retry(struct fcoe_ctlr *fip) fcoe_ctlr_solicit(fip, NULL); error = fcoe_ctlr_flogi_send_locked(fip); } - spin_unlock_irqrestore(&fip->ctlr_lock, flags); + spin_unlock_bh(&fip->ctlr_lock); mutex_unlock(&fip->ctlr_mutex); return error; } @@ -1739,9 +1736,8 @@ static int fcoe_ctlr_flogi_retry(struct fcoe_ctlr *fip) static void fcoe_ctlr_flogi_send(struct fcoe_ctlr *fip) { struct fcoe_fcf *fcf; - unsigned long flags; - spin_lock_irqsave(&fip->ctlr_lock, flags); + spin_lock_bh(&fip->ctlr_lock); fcf = fip->sel_fcf; if (!fcf || !fip->flogi_req_send) goto unlock; @@ -1768,7 +1764,7 @@ static void fcoe_ctlr_flogi_send(struct fcoe_ctlr *fip) } else /* XXX */ LIBFCOE_FIP_DBG(fip, "No FCF selected - defer send\n"); unlock: - spin_unlock_irqrestore(&fip->ctlr_lock, flags); + spin_unlock_bh(&fip->ctlr_lock); } /** -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock" 2024-01-30 16:42 ` [PATCH 1/2] Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock" Lee Duncan @ 2024-02-06 1:14 ` Hannes Reinecke 0 siblings, 0 replies; 6+ messages in thread From: Hannes Reinecke @ 2024-02-06 1:14 UTC (permalink / raw) To: Lee Duncan, linux-scsi; +Cc: linux-kernel, Lee Duncan On 1/31/24 00:42, Lee Duncan wrote: > From: Lee Duncan <lduncan@suse.com> > > This reverts commit 1a1975551943f681772720f639ff42fbaa746212 > > This commit causes interrupts to be lost for FCoE devices, > since it changed sping locks from "bh" to "irqsave". > > Instead, a work queue should be used, and will be addressed > in a separate patch. > > Fixes: 1a1975551943f681772720f639ff42fbaa746212 > Signed-off-by: Lee Duncan <lduncan@suse.com> > --- > drivers/scsi/fcoe/fcoe_ctlr.c | 20 ++++++++------------ > 1 file changed, 8 insertions(+), 12 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald, Werner Knoblich ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 2/2] fnic: move fnic_fnic_flush_tx() to a work queue 2024-01-30 16:42 [PATCH 0/2] Ensure FCoE target interrupts work Lee Duncan 2024-01-30 16:42 ` [PATCH 1/2] Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock" Lee Duncan @ 2024-01-30 16:42 ` Lee Duncan 2024-01-31 11:50 ` kernel test robot 2024-02-06 1:23 ` Hannes Reinecke 1 sibling, 2 replies; 6+ messages in thread From: Lee Duncan @ 2024-01-30 16:42 UTC (permalink / raw) To: linux-scsi; +Cc: linux-kernel, hare, Lee Duncan From: Hannes Reinecke <hare@suse.de> Rather than call 'fnic_flush_tx()' from interrupt context we should be moving it onto a work queue to avoid any locking issues. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Lee Duncan <lduncan@suse.com> --- drivers/scsi/fnic/fnic.h | 3 ++- drivers/scsi/fnic/fnic_fcs.c | 3 ++- drivers/scsi/fnic/fnic_main.c | 1 + drivers/scsi/fnic/fnic_scsi.c | 4 ++-- 4 files changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h index 2074937c05bc..3b8eb7dee500 100644 --- a/drivers/scsi/fnic/fnic.h +++ b/drivers/scsi/fnic/fnic.h @@ -305,6 +305,7 @@ struct fnic { unsigned int copy_wq_base; struct work_struct link_work; struct work_struct frame_work; + struct work_struct flush_work; struct sk_buff_head frame_queue; struct sk_buff_head tx_queue; @@ -363,7 +364,7 @@ void fnic_handle_event(struct work_struct *work); int fnic_rq_cmpl_handler(struct fnic *fnic, int); int fnic_alloc_rq_frame(struct vnic_rq *rq); void fnic_free_rq_buf(struct vnic_rq *rq, struct vnic_rq_buf *buf); -void fnic_flush_tx(struct fnic *); +void fnic_flush_tx(struct work_struct *); void fnic_eth_send(struct fcoe_ctlr *, struct sk_buff *skb); void fnic_set_port_id(struct fc_lport *, u32, struct fc_frame *); void fnic_update_mac(struct fc_lport *, u8 *new); diff --git a/drivers/scsi/fnic/fnic_fcs.c b/drivers/scsi/fnic/fnic_fcs.c index 5e312a55cc7d..1bf1418bbde4 100644 --- a/drivers/scsi/fnic/fnic_fcs.c +++ b/drivers/scsi/fnic/fnic_fcs.c @@ -1190,8 +1190,9 @@ int fnic_send(struct fc_lport *lp, struct fc_frame *fp) * * Called without fnic_lock held. */ -void fnic_flush_tx(struct fnic *fnic) +void fnic_flush_tx(struct work_struct *work) { + struct fnic *fnic = container_of(work, struct fnic, flush_work); struct sk_buff *skb; struct fc_frame *fp; diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c index 5ed1d897311a..29eead383eb9 100644 --- a/drivers/scsi/fnic/fnic_main.c +++ b/drivers/scsi/fnic/fnic_main.c @@ -830,6 +830,7 @@ static int fnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent) spin_lock_init(&fnic->vlans_lock); INIT_WORK(&fnic->fip_frame_work, fnic_handle_fip_frame); INIT_WORK(&fnic->event_work, fnic_handle_event); + INIT_WORK(&fnic->flush_work, fnic_flush_tx); skb_queue_head_init(&fnic->fip_frame_queue); INIT_LIST_HEAD(&fnic->evlist); INIT_LIST_HEAD(&fnic->vlans); diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c index 8d7fc5284293..fc4cee91b175 100644 --- a/drivers/scsi/fnic/fnic_scsi.c +++ b/drivers/scsi/fnic/fnic_scsi.c @@ -680,7 +680,7 @@ static int fnic_fcpio_fw_reset_cmpl_handler(struct fnic *fnic, spin_unlock_irqrestore(&fnic->fnic_lock, flags); - fnic_flush_tx(fnic); + queue_work(fnic_event_queue, &fnic->flush_work); reset_cmpl_handler_end: fnic_clear_state_flags(fnic, FNIC_FLAGS_FWRESET); @@ -736,7 +736,7 @@ static int fnic_fcpio_flogi_reg_cmpl_handler(struct fnic *fnic, } spin_unlock_irqrestore(&fnic->fnic_lock, flags); - fnic_flush_tx(fnic); + queue_work(fnic_event_queue, &fnic->flush_work); queue_work(fnic_event_queue, &fnic->frame_work); } else { spin_unlock_irqrestore(&fnic->fnic_lock, flags); -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] fnic: move fnic_fnic_flush_tx() to a work queue 2024-01-30 16:42 ` [PATCH 2/2] fnic: move fnic_fnic_flush_tx() to a work queue Lee Duncan @ 2024-01-31 11:50 ` kernel test robot 2024-02-06 1:23 ` Hannes Reinecke 1 sibling, 0 replies; 6+ messages in thread From: kernel test robot @ 2024-01-31 11:50 UTC (permalink / raw) To: Lee Duncan, linux-scsi; +Cc: oe-kbuild-all, linux-kernel, hare, Lee Duncan Hi Lee, kernel test robot noticed the following build warnings: [auto build test WARNING on jejb-scsi/for-next] [also build test WARNING on mkp-scsi/for-next linus/master v6.8-rc2 next-20240131] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Lee-Duncan/Revert-scsi-fcoe-Fix-potential-deadlock-on-fip-ctlr_lock/20240131-004656 base: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next patch link: https://lore.kernel.org/r/9c51ef07a04413fb2f2bd20f1534f96e004e4e59.1706632031.git.lduncan%40suse.com patch subject: [PATCH 2/2] fnic: move fnic_fnic_flush_tx() to a work queue config: x86_64-kexec (https://download.01.org/0day-ci/archive/20240131/202401311947.cPDhv2xa-lkp@intel.com/config) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240131/202401311947.cPDhv2xa-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202401311947.cPDhv2xa-lkp@intel.com/ All warnings (new ones prefixed by >>): >> drivers/scsi/fnic/fnic_fcs.c:1194: warning: Function parameter or struct member 'work' not described in 'fnic_flush_tx' >> drivers/scsi/fnic/fnic_fcs.c:1194: warning: Excess function parameter 'fnic' description in 'fnic_flush_tx' vim +1194 drivers/scsi/fnic/fnic_fcs.c 5df6d737dd4b0fe Abhijeet Joglekar 2009-04-17 1182 78112e5558064cb Joe Eykholt 2009-11-03 1183 /** 78112e5558064cb Joe Eykholt 2009-11-03 1184 * fnic_flush_tx() - send queued frames. 78112e5558064cb Joe Eykholt 2009-11-03 1185 * @fnic: fnic device 78112e5558064cb Joe Eykholt 2009-11-03 1186 * 78112e5558064cb Joe Eykholt 2009-11-03 1187 * Send frames that were waiting to go out in FC or Ethernet mode. 78112e5558064cb Joe Eykholt 2009-11-03 1188 * Whenever changing modes we purge queued frames, so these frames should 78112e5558064cb Joe Eykholt 2009-11-03 1189 * be queued for the stable mode that we're in, either FC or Ethernet. 78112e5558064cb Joe Eykholt 2009-11-03 1190 * 78112e5558064cb Joe Eykholt 2009-11-03 1191 * Called without fnic_lock held. 78112e5558064cb Joe Eykholt 2009-11-03 1192 */ 7ea34b1ffb4e1aa Hannes Reinecke 2024-01-30 1193 void fnic_flush_tx(struct work_struct *work) 78112e5558064cb Joe Eykholt 2009-11-03 @1194 { 7ea34b1ffb4e1aa Hannes Reinecke 2024-01-30 1195 struct fnic *fnic = container_of(work, struct fnic, flush_work); 78112e5558064cb Joe Eykholt 2009-11-03 1196 struct sk_buff *skb; 78112e5558064cb Joe Eykholt 2009-11-03 1197 struct fc_frame *fp; 5df6d737dd4b0fe Abhijeet Joglekar 2009-04-17 1198 d9e9ab56b687da0 Brian Uchino 2010-04-09 1199 while ((skb = skb_dequeue(&fnic->tx_queue))) { 78112e5558064cb Joe Eykholt 2009-11-03 1200 fp = (struct fc_frame *)skb; 78112e5558064cb Joe Eykholt 2009-11-03 1201 fnic_send_frame(fnic, fp); 78112e5558064cb Joe Eykholt 2009-11-03 1202 } 78112e5558064cb Joe Eykholt 2009-11-03 1203 } 5df6d737dd4b0fe Abhijeet Joglekar 2009-04-17 1204 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] fnic: move fnic_fnic_flush_tx() to a work queue 2024-01-30 16:42 ` [PATCH 2/2] fnic: move fnic_fnic_flush_tx() to a work queue Lee Duncan 2024-01-31 11:50 ` kernel test robot @ 2024-02-06 1:23 ` Hannes Reinecke 1 sibling, 0 replies; 6+ messages in thread From: Hannes Reinecke @ 2024-02-06 1:23 UTC (permalink / raw) To: Lee Duncan, linux-scsi; +Cc: linux-kernel, Lee Duncan On 1/31/24 00:42, Lee Duncan wrote: > From: Hannes Reinecke <hare@suse.de> > > Rather than call 'fnic_flush_tx()' from interrupt context we should > be moving it onto a work queue to avoid any locking issues. > > Signed-off-by: Hannes Reinecke <hare@suse.de> > Signed-off-by: Lee Duncan <lduncan@suse.com> > --- > drivers/scsi/fnic/fnic.h | 3 ++- > drivers/scsi/fnic/fnic_fcs.c | 3 ++- > drivers/scsi/fnic/fnic_main.c | 1 + > drivers/scsi/fnic/fnic_scsi.c | 4 ++-- > 4 files changed, 7 insertions(+), 4 deletions(-) > Can you please add the 'fixes' tag to this one, too? Just the first patch doesn't fix anything, one really would need to apply both to get the issue fixed. And, of course, fix the kbuild robot warning :-) Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald, Werner Knoblich ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-02-06 1:23 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-01-30 16:42 [PATCH 0/2] Ensure FCoE target interrupts work Lee Duncan 2024-01-30 16:42 ` [PATCH 1/2] Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock" Lee Duncan 2024-02-06 1:14 ` Hannes Reinecke 2024-01-30 16:42 ` [PATCH 2/2] fnic: move fnic_fnic_flush_tx() to a work queue Lee Duncan 2024-01-31 11:50 ` kernel test robot 2024-02-06 1:23 ` Hannes Reinecke
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox