* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
@ 2016-09-23 15:08 Jitendra Bhivare
0 siblings, 0 replies; 11+ messages in thread
From: Jitendra Bhivare @ 2016-09-23 15:08 UTC (permalink / raw)
To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi

Hi Mike,

I could reproduce the hard lockup using the for-next kernel only in the
iscsi_eh_cmd_timed_out path, due to the spin_lock_irqsave taken in
blk_timeout_work. Please refer to the stack trace below.

The _bh versions used for frwd_lock and back_lock do not seem to be
causing any issue similar to the one seen with be2iscsi, after replacing
the _bh versions in be2iscsi.

I am testing it further to confirm in all possible scenarios: NOPs, error
recovery, resets and reconnects. On my setup, I affined all EQ interrupts
to a single CPU. Along with heavy IO, a few of the fio invocations were
pinned to run on the same CPU.

Any call to spin_unlock_bh made with another spin_lock already held can
invoke do_softirq, which might deadlock if a bottom half used by the
driver calls a function that needs that other spin_lock. Is there code
that prevents this issue?
Thanks,
JB

[ 3843.125976] ------------[ cut here ]------------
[ 3843.132217] WARNING: CPU: 20 PID: 1227 at kernel/softirq.c:150 __local_bh_enable_ip+0x6b/0x90
[ 3843.142815] Modules linked in: dm_service_time be2iscsi(E) iscsi_boot_sysfs xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute rpcrdma bridge ib_isert stp iscsi_target_mod llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ib_iser ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 libiscsi nf_nat scsi_transport_iscsi ib_srpt nf_conntrack target_core_mod iptable_mangle iptable_security iptable_raw iptable_filter ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ocrdma ib_core dm_mirror dm_region_hash dm_log intel_rapl sb_edac edac_core x86_pkg_temp_thermal
[ 3843.231162] intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel iTCO_wdt iTCO_vendor_support ipmi_devintf lrw dcdbas mei_me ipmi_ssif gf128mul sg mei glue_helper ablk_helper ioatdma cryptd ipmi_si shpchp nfsd wmi acpi_power_meter ipmi_msghandler pcspkr dca lpc_ich acpi_pad auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath dm_mod ip_tables ext4 jbd2 mbcache sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm tg3 crc32c_intel ptp i2c_core be2net megaraid_sas fjes pps_core
[ 3843.294328] CPU: 20 PID: 1227 Comm: kworker/20:1H Tainted: G E 4.8.0-rc1+ #3
[ 3843.304944] Hardware name: Dell Inc. PowerEdge R720/0X6H47, BIOS 1.4.8 10/25/2012
[ 3843.314798] Workqueue: kblockd blk_timeout_work
[ 3843.321350] 0000000000000086 00000000a32f4533 ffff8802216d7bd8 ffffffff8135c3cf
[ 3843.331146] 0000000000000000 0000000000000000 ffff8802216d7c18 ffffffff8108d661
[ 3843.340918] 00000096216d7c50 0000000000000200 ffff8802d07cc828 ffff8801b3632550
[ 3843.350687] Call Trace:
[ 3843.354866] [<ffffffff8135c3cf>] dump_stack+0x63/0x84
[ 3843.362061] [<ffffffff8108d661>] __warn+0xd1/0xf0
[ 3843.368851] [<ffffffff8108d79d>] warn_slowpath_null+0x1d/0x20
[ 3843.376791] [<ffffffff810930eb>] __local_bh_enable_ip+0x6b/0x90
[ 3843.384903] [<ffffffff816fe7be>] _raw_spin_unlock_bh+0x1e/0x20
[ 3843.392940] [<ffffffffa085f710>] beiscsi_alloc_pdu+0x2f0/0x6e0 [be2iscsi]
[ 3843.402076] [<ffffffffa06bc358>] __iscsi_conn_send_pdu+0xf8/0x370 [libiscsi]
[ 3843.411549] [<ffffffffa06bc6fe>] iscsi_send_nopout+0xbe/0x110 [libiscsi]
[ 3843.420639] [<ffffffffa06bd98b>] iscsi_eh_cmd_timed_out+0x29b/0x2b0 [libiscsi]
[ 3843.430339] [<ffffffff814cd1de>] scsi_times_out+0x5e/0x250
[ 3843.438119] [<ffffffff813374af>] blk_rq_timed_out+0x1f/0x60
[ 3843.446009] [<ffffffff8133759d>] blk_timeout_work+0xad/0x150
[ 3843.454010] [<ffffffff810a6642>] process_one_work+0x152/0x400
[ 3843.462114] [<ffffffff810a6f35>] worker_thread+0x125/0x4b0
[ 3843.469961] [<ffffffff810a6e10>] ? rescuer_thread+0x380/0x380
[ 3843.478116] [<ffffffff810aca28>] kthread+0xd8/0xf0
[ 3843.485212] [<ffffffff816fedff>] ret_from_fork+0x1f/0x40
[ 3843.492908] [<ffffffff810ac950>] ? kthread_park+0x60/0x60
[ 3843.500715] ---[ end trace 57ec0a1d8f0dd3a0 ]---
[ 3852.328667] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
Kernel panic - not syncing: Hard LOCKUP
[ 3852.341357] Modules linked in: dm_service_time be2iscsi(E) iscsi_boot_sysfs xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute rpcrdma bridge ib_isert stp iscsi_target_mod llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ib_iser ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 libiscsi nf_nat scsi_transport_iscsi ib_srpt nf_conntrack target_core_mod iptable_mangle iptable_security iptable_raw iptable_filter ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ocrdma ib_core dm_mirror dm_region_hash dm_log intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel iTCO_wdt iTCO_vendor_support ipmi_devintf lrw dcdbas mei_me ipm
[ 3852.341358] CPU: 1 PID: 1129 Comm: kworker/1:1H Tainted: G W E 4.8.0-rc1+ #3
[ 3852.341359] Hardware name: Dell Inc. PowerEdge R720/0X6H47, BIOS 1.4.8 10/25/2012
[ 3852.341359] Workqueue: kblockd blk_timeout_work
[ 3852.341360] 0000000000000086 00000000acd8294d ffff88042f605bb0 ffffffff8135c3cf
[ 3852.341361] 0000000000000000 0000000000000000 ffff88042f605bc8 ffffffff8114b3b8
[ 3852.341362] ffff8802af358000 ffff88042f605c00 ffffffff8118546c 0000000000000001
[ 3852.341362] Call Trace:
[ 3852.341363] <NMI> [<ffffffff8135c3cf>] dump_stack+0x63/0x84
[ 3852.341363] [<ffffffff8114b3b8>] watchdog_overflow_callback+0xc8/0xf0
[ 3852.341364] [<ffffffff8118546c>] __perf_event_overflow+0x7c/0x1f0
[ 3852.341364] [<ffffffff8118f7b4>] perf_event_overflow+0x14/0x20
[ 3852.341365] [<ffffffff8100c46d>] intel_pmu_handle_irq+0x1dd/0x490
[ 3852.341366] [<ffffffff8135e561>] ? ioremap_page_range+0x2a1/0x400
[ 3852.341366] [<ffffffff811e0e00>] ? vunmap_page_range+0x1e0/0x320
[ 3852.341367] [<ffffffff811e0f51>] ? unmap_kernel_range_noflush+0x11/0x20
[ 3852.341368] [<ffffffff814206c6>] ? ghes_copy_tofrom_phys+0x116/0x1f0
[ 3852.341368] [<ffffffff81053e3f>] ? native_apic_wait_icr_idle+0x1f/0x30
[ 3852.341369] [<ffffffff810057dd>] perf_event_nmi_handler+0x2d/0x50
[ 3852.341369] [<ffffffff810313f1>] nmi_handle+0x61/0x110
[ 3852.341370] [<ffffffff81031954>] default_do_nmi+0x44/0x120
[ 3852.341371] [<ffffffff81031b1b>] do_nmi+0xeb/0x160
[ 3852.341371] [<ffffffff817011b1>] end_repeat_nmi+0x1a/0x1e
[ 3852.341372] [<ffffffff810da737>] ? native_queued_spin_lock_slowpath+0x117/0x1a0
[ 3852.341373] [<ffffffff810da737>] ? native_queued_spin_lock_slowpath+0x117/0x1a0
[ 3852.341373] [<ffffffff810da737>] ? native_queued_spin_lock_slowpath+0x117/0x1a0
[ 3852.341374] <<EOE>> <IRQ> [<ffffffff811983ba>] queued_spin_lock_slowpath+0xb/0xf
[ 3852.341375] [<ffffffff816feb27>] _raw_spin_lock_irqsave+0x37/0x40
[ 3852.341375] [<ffffffff814d041b>] scsi_end_request+0x10b/0x1d0
[ 3852.341376] [<ffffffff814d2143>] scsi_io_completion+0x153/0x650
[ 3852.341377] [<ffffffff814c8faf>] scsi_finish_command+0xcf/0x120
[ 3852.341377] [<ffffffff814d1927>] scsi_softirq_done+0x127/0x150
[ 3852.341378] [<ffffffff813370dc>] blk_done_softirq+0x8c/0xc0
[ 3852.341378] [<ffffffff81701932>] __do_softirq+0xd2/0x27a
[ 3852.341379] [<ffffffff817009bc>] do_softirq_own_stack+0x1c/0x30

^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
@ 2016-09-27 10:18 Jitendra Bhivare
0 siblings, 0 replies; 11+ messages in thread
From: Jitendra Bhivare @ 2016-09-27 10:18 UTC (permalink / raw)
To: cleech, lduncan; +Cc: linux-scsi, Mike Christie, Martin K. Petersen
> -----Original Message-----
> From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com]
> Sent: Friday, September 23, 2016 8:38 PM
> To: 'Mike Christie'; 'Martin K. Petersen'
> Cc: 'linux-scsi@vger.kernel.org'
> Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
>
> Hi Mike,
>
> I could reproduce hard lockup using for-next kernel only in
> iscsi_eh_cmd_timeout path due to spin_lock_irqsave taken in
> blk_timeout_work
> Please refer stack trace below.
>
> The _bh version used for frwd_lock and back_lock does not seem to be
> causing
> any issue similar to seen with be2iscsi after replacing _bh versions in
> be2iscsi.
>
> I am testing it further, to confirm in all possible scenarios... NOPs,
> error
> recovery, resets and reconnects.
> On my setup, I affined all EQ interrupts on a single CPU.
> Along with heavy IO, few of the invocations of fio were pinned to run on
> same
> CPU.
>
> Any call to unlock_bh with another spin_lock already held, invoking
> do_softirq,
> might cause deadlock if bottom half used by driver calls function which
> needs
> that another spin_lock.
> Is there a code which prevents this issue?
>
The only place, I think, where libiscsi can have issues is
__iscsi_complete_pdu, which should be protected under the _bh version of
the frwd and back locks. If the function is executed in process context,
there is a good chance that the base version of spin_lock used for the
frwd/back locks could cause deadlocks when the LLD uses the _bh version
in the alloc_pdu path (__iscsi_complete_pdu -> iscsi_send_nopout ...
alloc_pdu).
Thanks,
JB
^ permalink raw reply [flat|nested] 11+ messages in thread

* [PATCH 00/28] be2iscsi: driver update 11.2.0.0
@ 2016-07-21 9:07 Jitendra Bhivare
2016-07-21 9:07 ` [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore Jitendra Bhivare
0 siblings, 1 reply; 11+ messages in thread
From: Jitendra Bhivare @ 2016-07-21 9:07 UTC (permalink / raw)
To: linux-scsi; +Cc: mchristi, Jitendra Bhivare

This patch is generated against for-next branch.

Jitendra Bhivare (28):
  be2iscsi: Fix to use correct configuration values
  be2iscsi: Replace _bh with _irqsave/irqrestore
  be2iscsi: Replace _bh version for mcc_lock spinlock
  be2iscsi: Reduce driver load/unload time
  be2iscsi: Set and return right iface v4/v6 states
  be2iscsi: Fix gateway APIs to support IPv4 & IPv6
  be2iscsi: Fix release of DHCP IP in static mode
  be2iscsi: Move VLAN code to common iface_set_param
  be2iscsi: Update iface handle before any set param
  be2iscsi: Rename iface get/set/create/destroy APIs
  be2iscsi: Handle only NET_PARAM in iface_get_param
  be2iscsi: Check all zeroes IP before issuing IOCTL
  be2iscsi: Remove alloc_mcc_tag & beiscsi_pci_soft_reset
  be2iscsi: Remove isr_lock and dead code
  be2iscsi: Fix checks for HBA in error state
  be2iscsi: Fix to make boot discovery non-blocking
  be2iscsi: Fix to add timer for UE detection
  be2iscsi: Add IOCTL to check UER supported
  be2iscsi: Move functions to right files
  be2iscsi: Fix POST check and reset sequence
  be2iscsi: Add V1 of EPFW cleanup IOCTL
  be2iscsi: Add TPE recovery feature
  be2iscsi: Fail the sessions immediately after TPE
  be2iscsi: Add FUNCTION_RESET during driver unload
  be2iscsi: Fix async PDU handling path
  be2iscsi: Update copyright information
  be2iscsi: Update the driver version
  MAINTAINERS: Update be2iscsi contact info

 MAINTAINERS                      |   10 +-
 drivers/scsi/be2iscsi/be.h       |   15 +-
 drivers/scsi/be2iscsi/be_cmds.c  | 1096 +++++++++++------
 drivers/scsi/be2iscsi/be_cmds.h  |  142 ++-
 drivers/scsi/be2iscsi/be_iscsi.c |  408 +++----
 drivers/scsi/be2iscsi/be_iscsi.h |   25 +-
 drivers/scsi/be2iscsi/be_main.c  | 2476 +++++++++++++++++++-------------------
 drivers/scsi/be2iscsi/be_main.h  |  220 ++--
 drivers/scsi/be2iscsi/be_mgmt.c  | 1490 +++++++++--------------
 drivers/scsi/be2iscsi/be_mgmt.h  |   51 +-
 10 files changed, 3052 insertions(+), 2881 deletions(-)

-- 
1.9.1

^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-07-21 9:07 [PATCH 00/28] be2iscsi: driver update 11.2.0.0 Jitendra Bhivare
@ 2016-07-21 9:07 ` Jitendra Bhivare
2016-08-05 18:38 ` Mike Christie
0 siblings, 1 reply; 11+ messages in thread
From: Jitendra Bhivare @ 2016-07-21 9:07 UTC (permalink / raw)
To: linux-scsi; +Cc: mchristi, Jitendra Bhivare

Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 21
Pid: 13242, comm: flush-8:80 Tainted: G W -- ------------ 2.6.32-573.el6.x86_64 #1
Call Trace:
<NMI> [<ffffffff81537a84>] ? panic+0xa7/0x16f
[<ffffffff810149c9>] ? sched_clock+0x9/0x10
[<ffffffff810ed4bd>] ? watchdog_overflow_callback+0xcd/0xd0
[<ffffffff81124037>] ? __perf_event_overflow+0xa7/0x240
[<ffffffff8101dc54>] ? x86_perf_event_set_period+0xf4/0x180
[<ffffffff81124684>] ? perf_event_overflow+0x14/0x20
[<ffffffff81024a02>] ? intel_pmu_handle_irq+0x202/0x3f0
[<ffffffff8153cd89>] ? perf_event_nmi_handler+0x39/0xb0
[<ffffffff8153e845>] ? notifier_call_chain+0x55/0x80
[<ffffffff81389ba0>] ? scsi_done+0x0/0x60
[<ffffffff8153e8aa>] ? atomic_notifier_call_chain+0x1a/0x20
[<ffffffff810a788e>] ? notify_die+0x2e/0x30
[<ffffffff8153c503>] ? do_nmi+0x1c3/0x350
[<ffffffff8153bdc0>] ? nmi+0x20/0x30
[<ffffffff81389ba0>] ? scsi_done+0x0/0x60
[<ffffffff81389ba0>] ? scsi_done+0x0/0x60
[<ffffffff8153b62e>] ? _spin_lock+0x1e/0x30
<<EOE>> [<ffffffffa00e2eaf>] ? iscsi_queuecommand+0x7f/0x4e0 [libiscsi]
[<ffffffff81389df5>] ? scsi_dispatch_cmd+0xe5/0x310
[<ffffffff813927be>] ? scsi_request_fn+0x5be/0x750
[<ffffffff81089bad>] ? del_timer+0x7d/0xe0
[<ffffffff81273542>] ? __generic_unplug_device+0x32/0x40
[<ffffffff8126e823>] ? elv_insert+0xd3/0x190
[<ffffffff8126e920>] ? __elv_add_request+0x40/0x90

In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
protecting SGLs and WRBs. _bh versions are needed as the function gets
invoked in process context and BLOCK_IOPOLL softirq.

In spin_unlock_bh, after releasing the lock and enabling BH, do_softirq
is called which executes till last SOFTIRQ.

beiscsi_alloc_pdu is called under session lock. Through block layer,
iSCSI stack in some cases send IOs with interrupts disabled. In such
paths, CPU will get stuck for a while for session lock with interrupts
disabled because in other CPU do_softirq is executing under session lock
thus causing hard lock up.

Use spin_lock_irqsave/spin_unlock_irqrestore as the driver can't be sure
in all paths interrupts are enabled.

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
---
 drivers/scsi/be2iscsi/be_main.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
index f05e773..02d24d5 100644
--- a/drivers/scsi/be2iscsi/be_main.c
+++ b/drivers/scsi/be2iscsi/be_main.c
@@ -1131,8 +1131,9 @@ beiscsi_process_async_pdu(struct beiscsi_conn *beiscsi_conn,
 static struct sgl_handle *alloc_io_sgl_handle(struct beiscsi_hba *phba)
 {
 	struct sgl_handle *psgl_handle;
+	unsigned long flags;
 
-	spin_lock_bh(&phba->io_sgl_lock);
+	spin_lock_irqsave(&phba->io_sgl_lock, flags);
 	if (phba->io_sgl_hndl_avbl) {
 		beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_IO,
 			    "BM_%d : In alloc_io_sgl_handle,"
@@ -1150,14 +1151,16 @@ static struct sgl_handle *alloc_io_sgl_handle(struct beiscsi_hba *phba)
 		phba->io_sgl_alloc_index++;
 	} else
 		psgl_handle = NULL;
-	spin_unlock_bh(&phba->io_sgl_lock);
+	spin_unlock_irqrestore(&phba->io_sgl_lock, flags);
 	return psgl_handle;
 }
 
 static void
 free_io_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 {
-	spin_lock_bh(&phba->io_sgl_lock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&phba->io_sgl_lock, flags);
 	beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_IO,
 		    "BM_%d : In free_,io_sgl_free_index=%d\n",
 		    phba->io_sgl_free_index);
@@ -1172,7 +1175,7 @@ free_io_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 			    "value there=%p\n", phba->io_sgl_free_index,
 			    phba->io_sgl_hndl_base[phba->io_sgl_free_index]);
-		spin_unlock_bh(&phba->io_sgl_lock);
+		spin_unlock_irqrestore(&phba->io_sgl_lock, flags);
 		return;
 	}
 	phba->io_sgl_hndl_base[phba->io_sgl_free_index] = psgl_handle;
@@ -1181,7 +1184,7 @@ free_io_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 		phba->io_sgl_free_index = 0;
 	else
 		phba->io_sgl_free_index++;
-	spin_unlock_bh(&phba->io_sgl_lock);
+	spin_unlock_irqrestore(&phba->io_sgl_lock, flags);
 }
 
 static inline struct wrb_handle *
@@ -1189,15 +1192,16 @@ beiscsi_get_wrb_handle(struct hwi_wrb_context *pwrb_context,
 		       unsigned int wrbs_per_cxn)
 {
 	struct wrb_handle *pwrb_handle;
+	unsigned long flags;
 
-	spin_lock_bh(&pwrb_context->wrb_lock);
+	spin_lock_irqsave(&pwrb_context->wrb_lock, flags);
 	pwrb_handle = pwrb_context->pwrb_handle_base[pwrb_context->alloc_index];
 	pwrb_context->wrb_handles_available--;
 	if (pwrb_context->alloc_index == (wrbs_per_cxn - 1))
 		pwrb_context->alloc_index = 0;
 	else
 		pwrb_context->alloc_index++;
-	spin_unlock_bh(&pwrb_context->wrb_lock);
+	spin_unlock_irqrestore(&pwrb_context->wrb_lock, flags);
 	return pwrb_handle;
 }
 
@@ -1229,14 +1233,16 @@ beiscsi_put_wrb_handle(struct hwi_wrb_context *pwrb_context,
 		       struct wrb_handle *pwrb_handle,
 		       unsigned int wrbs_per_cxn)
 {
-	spin_lock_bh(&pwrb_context->wrb_lock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&pwrb_context->wrb_lock, flags);
 	pwrb_context->pwrb_handle_base[pwrb_context->free_index] = pwrb_handle;
 	pwrb_context->wrb_handles_available++;
 	if (pwrb_context->free_index == (wrbs_per_cxn - 1))
 		pwrb_context->free_index = 0;
 	else
 		pwrb_context->free_index++;
-	spin_unlock_bh(&pwrb_context->wrb_lock);
+	spin_unlock_irqrestore(&pwrb_context->wrb_lock, flags);
 }
 
 /**
@@ -1265,8 +1271,9 @@ free_wrb_handle(struct beiscsi_hba *phba, struct hwi_wrb_context *pwrb_context,
 static struct sgl_handle *alloc_mgmt_sgl_handle(struct beiscsi_hba *phba)
 {
 	struct sgl_handle *psgl_handle;
+	unsigned long flags;
 
-	spin_lock_bh(&phba->mgmt_sgl_lock);
+	spin_lock_irqsave(&phba->mgmt_sgl_lock, flags);
 	if (phba->eh_sgl_hndl_avbl) {
 		psgl_handle = phba->eh_sgl_hndl_base[phba->eh_sgl_alloc_index];
 		phba->eh_sgl_hndl_base[phba->eh_sgl_alloc_index] = NULL;
@@ -1284,14 +1291,16 @@ static struct sgl_handle *alloc_mgmt_sgl_handle(struct beiscsi_hba *phba)
 		phba->eh_sgl_alloc_index++;
 	} else
 		psgl_handle = NULL;
-	spin_unlock_bh(&phba->mgmt_sgl_lock);
+	spin_unlock_irqrestore(&phba->mgmt_sgl_lock, flags);
 	return psgl_handle;
 }
 
 void
 free_mgmt_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 {
-	spin_lock_bh(&phba->mgmt_sgl_lock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&phba->mgmt_sgl_lock, flags);
 	beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_CONFIG,
 		    "BM_%d : In free_mgmt_sgl_handle,"
 		    "eh_sgl_free_index=%d\n",
@@ -1306,7 +1315,7 @@ free_mgmt_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 			    "BM_%d : Double Free in eh SGL ,"
 			    "eh_sgl_free_index=%d\n",
 			    phba->eh_sgl_free_index);
-		spin_unlock_bh(&phba->mgmt_sgl_lock);
+		spin_unlock_irqrestore(&phba->mgmt_sgl_lock, flags);
 		return;
 	}
 	phba->eh_sgl_hndl_base[phba->eh_sgl_free_index] = psgl_handle;
@@ -1316,7 +1325,7 @@ free_mgmt_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 		phba->eh_sgl_free_index = 0;
 	else
 		phba->eh_sgl_free_index++;
-	spin_unlock_bh(&phba->mgmt_sgl_lock);
+	spin_unlock_irqrestore(&phba->mgmt_sgl_lock, flags);
 }
 
 static void
-- 
1.9.1

^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-07-21 9:07 ` [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore Jitendra Bhivare
@ 2016-08-05 18:38 ` Mike Christie
2016-08-09 1:05 ` Martin K. Petersen
0 siblings, 1 reply; 11+ messages in thread
From: Mike Christie @ 2016-08-05 18:38 UTC (permalink / raw)
To: Jitendra Bhivare, linux-scsi

On 07/21/2016 04:07 AM, Jitendra Bhivare wrote:
> Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 21
> Pid: 13242, comm: flush-8:80 Tainted: G W -- ------------
> 2.6.32-573.el6.x86_64 #1
> Call Trace:
> <NMI> [<ffffffff81537a84>] ? panic+0xa7/0x16f
> [<ffffffff810149c9>] ? sched_clock+0x9/0x10
> [<ffffffff810ed4bd>] ? watchdog_overflow_callback+0xcd/0xd0
> [<ffffffff81124037>] ? __perf_event_overflow+0xa7/0x240
> [<ffffffff8101dc54>] ? x86_perf_event_set_period+0xf4/0x180
> [<ffffffff81124684>] ? perf_event_overflow+0x14/0x20
> [<ffffffff81024a02>] ? intel_pmu_handle_irq+0x202/0x3f0
> [<ffffffff8153cd89>] ? perf_event_nmi_handler+0x39/0xb0
> [<ffffffff8153e845>] ? notifier_call_chain+0x55/0x80
> [<ffffffff81389ba0>] ? scsi_done+0x0/0x60
> [<ffffffff8153e8aa>] ? atomic_notifier_call_chain+0x1a/0x20
> [<ffffffff810a788e>] ? notify_die+0x2e/0x30
> [<ffffffff8153c503>] ? do_nmi+0x1c3/0x350
> [<ffffffff8153bdc0>] ? nmi+0x20/0x30
> [<ffffffff81389ba0>] ? scsi_done+0x0/0x60
> [<ffffffff81389ba0>] ? scsi_done+0x0/0x60
> [<ffffffff8153b62e>] ? _spin_lock+0x1e/0x30
> <<EOE>> [<ffffffffa00e2eaf>] ? iscsi_queuecommand+0x7f/0x4e0 [libiscsi]
> [<ffffffff81389df5>] ? scsi_dispatch_cmd+0xe5/0x310
> [<ffffffff813927be>] ? scsi_request_fn+0x5be/0x750
> [<ffffffff81089bad>] ? del_timer+0x7d/0xe0
> [<ffffffff81273542>] ? __generic_unplug_device+0x32/0x40
> [<ffffffff8126e823>] ? elv_insert+0xd3/0x190
> [<ffffffff8126e920>] ? __elv_add_request+0x40/0x90
>
> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
> protecting SGLs and WRBs. _bh versions are needed as the function gets
> invoked in process context and BLOCK_IOPOLL softirq.
>
> In spin_unlock_bh, after releasing the lock and enabling BH, do_softirq
> is called which executes till last SOFTIRQ.
>
> beiscsi_alloc_pdu is called under session lock. Through block layer,
> iSCSI stack in some cases send IOs with interrupts disabled. In such paths,

What path is this? Is this with mq enabled or disabled?

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-08-05 18:38 ` Mike Christie
@ 2016-08-09 1:05 ` Martin K. Petersen
2016-08-09 8:29 ` Jitendra Bhivare
0 siblings, 1 reply; 11+ messages in thread
From: Martin K. Petersen @ 2016-08-09 1:05 UTC (permalink / raw)
To: Mike Christie; +Cc: Jitendra Bhivare, linux-scsi

>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes:

>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
>> protecting SGLs and WRBs. _bh versions are needed as the function
>> gets invoked in process context and BLOCK_IOPOLL softirq.
>>
>> In spin_unlock_bh, after releasing the lock and enabling BH,
>> do_softirq is called which executes till last SOFTIRQ.
>>
>> beiscsi_alloc_pdu is called under session lock. Through block layer,
>> iSCSI stack in some cases send IOs with interrupts disabled. In such
>> paths,

Mike> What path is this? Is this with mq enabled or disabled?

Jitendra?

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-08-09 1:05 ` Martin K. Petersen
@ 2016-08-09 8:29 ` Jitendra Bhivare
2016-08-09 17:48 ` Mike Christie
0 siblings, 1 reply; 11+ messages in thread
From: Jitendra Bhivare @ 2016-08-09 8:29 UTC (permalink / raw)
To: Martin K. Petersen, Mike Christie; +Cc: linux-scsi

> -----Original Message-----
> From: Martin K. Petersen [mailto:martin.petersen@oracle.com]
> Sent: Tuesday, August 09, 2016 6:35 AM
> To: Mike Christie
> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org
> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
>
> >>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes:
>
> >> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
> >> protecting SGLs and WRBs. _bh versions are needed as the function
> >> gets invoked in process context and BLOCK_IOPOLL softirq.
> >>
> >> In spin_unlock_bh, after releasing the lock and enabling BH,
> >> do_softirq is called which executes till last SOFTIRQ.
> >>
> >> beiscsi_alloc_pdu is called under session lock. Through block layer,
> >> iSCSI stack in some cases send IOs with interrupts disabled. In such
> >> paths,
>
> Mike> What path is this? Is this with mq enabled or disabled?
>
> Jitendra?
>
> --
> Martin K. Petersen	Oracle Linux Engineering

[JB] Sorry for the delayed response, there was some issue with my mail
client.
There are paths in the block layer where IRQs are disabled with the
request_queue queue_lock:
- blk_timeout_work: this triggers NOP-OUT thru' iscsi_eh_cmd_timed_out.
- blk_execute_rq_nowait

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-08-09 8:29 ` Jitendra Bhivare
@ 2016-08-09 17:48 ` Mike Christie
2016-08-10 12:46 ` Jitendra Bhivare
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Mike Christie @ 2016-08-09 17:48 UTC (permalink / raw)
To: Jitendra Bhivare, Martin K. Petersen; +Cc: linux-scsi

On 08/09/2016 03:29 AM, Jitendra Bhivare wrote:
>> -----Original Message-----
>> From: Martin K. Petersen [mailto:martin.petersen@oracle.com]
>> Sent: Tuesday, August 09, 2016 6:35 AM
>> To: Mike Christie
>> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org
>> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with
> _irqsave/irqrestore
>>
>>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes:
>>
>>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
>>>> protecting SGLs and WRBs. _bh versions are needed as the function
>>>> gets invoked in process context and BLOCK_IOPOLL softirq.
>>>>
>>>> In spin_unlock_bh, after releasing the lock and enabling BH,
>>>> do_softirq is called which executes till last SOFTIRQ.
>>>>
>>>> beiscsi_alloc_pdu is called under session lock. Through block layer,
>>>> iSCSI stack in some cases send IOs with interrupts disabled. In such
>>>> paths,
>>
>>
>> Mike> What path is this? Is this with mq enabled or disabled?
>>
>> Jitendra?
>>
>> --
>> Martin K. Petersen Oracle Linux Engineering
> [JB] Sorry for the delayed response, there was some issue with my mail
> client.
> There are paths block layer where IRQs are disabled with request_queue
> queue_lock.
> - blk_timeout_work : this triggers NOP-OUT thru' iscsi_eh_cmd_timed_out.
> - blk_execute_rq_nowait

I am not sure I understand the problem/solution. Why does the patch only
need to modify be2iscsi's locking? Why wouldn't you need to also change
the libiscsi session lock locking? You don't need irqs to be running
right? You are just doing this so the locking calls (irq vs bh) match up?

Don't you also need to fix the non-mq IO path? scsi_request_fn needs a
fix to use spin_lock_irqsave/spin_unlock_irqrestore because
blk_execute_rq_nowait already disables them right?

^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-08-09 17:48 ` Mike Christie
@ 2016-08-10 12:46 ` Jitendra Bhivare
2016-08-11 5:42 ` Jitendra Bhivare
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Jitendra Bhivare @ 2016-08-10 12:46 UTC (permalink / raw)
To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi

> -----Original Message-----
> From: Mike Christie [mailto:mchristi@redhat.com]
> Sent: Tuesday, August 09, 2016 11:19 PM
> To: Jitendra Bhivare; Martin K. Petersen
> Cc: linux-scsi@vger.kernel.org
> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
>
> On 08/09/2016 03:29 AM, Jitendra Bhivare wrote:
> >> -----Original Message-----
> >> From: Martin K. Petersen [mailto:martin.petersen@oracle.com]
> >> Sent: Tuesday, August 09, 2016 6:35 AM
> >> To: Mike Christie
> >> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org
> >> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with
> > _irqsave/irqrestore
> >>
> >>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes:
> >>
> >>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
> >>>> protecting SGLs and WRBs. _bh versions are needed as the function
> >>>> gets invoked in process context and BLOCK_IOPOLL softirq.
> >>>>
> >>>> In spin_unlock_bh, after releasing the lock and enabling BH,
> >>>> do_softirq is called which executes till last SOFTIRQ.
> >>>>
> >>>> beiscsi_alloc_pdu is called under session lock. Through block
> >>>> layer, iSCSI stack in some cases send IOs with interrupts disabled.
> >>>> In such paths,
> >>
> >>
> >> Mike> What path is this? Is this with mq enabled or disabled?
> >>
> >> Jitendra?
> >>
> >> --
> >> Martin K. Petersen Oracle Linux Engineering
> > [JB] Sorry for the delayed response, there was some issue with my
> > mail client.
> > There are paths block layer where IRQs are disabled with
> > request_queue queue_lock.
> > - blk_timeout_work : this triggers NOP-OUT thru' iscsi_eh_cmd_timed_out.
> > - blk_execute_rq_nowait
>
> I am not sure I understand the problem/solution. Why does the patch only
> need to modify be2iscsi's locking? Why wouldn't you need to also change
> the libiscsi session lock locking? You don't need irqs to be running
> right? You are just doing this so the locking calls (irq vs bh) matches
> up?
>
> Don't you also need to fix the non-mq IO path. scsi_request_fn needs a
> fix to use spin_lock_irqsave/spin_unlock_irqrestore because
> blk_execute_rq_nowait already disables them right?
>
[JB] The issue is a hard lockup seen on a CPU which is waiting for the
session lock taken by another CPU processing do_softirq. This do_softirq
is getting called from be2iscsi's spin_unlock_bh.

The contention gets easily exposed with be2iscsi as its data structures
are accessed in process context and in IRQ_POLL softirq.

iscsi_eh_cmd_timed_out gets saved because it is using the base spin_lock
version for the session lock (frwd and back).
iscsi_queuecommand is using _bh, which is a bit scary.

Yes, that needs to be fixed too... and then this other path looks buggy
too:
- blk_run_queue (takes queue_lock with irqsave) -> scsi_request_fn
(unlocks it with _irq)

^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore 2016-08-09 17:48 ` Mike Christie 2016-08-10 12:46 ` Jitendra Bhivare @ 2016-08-11 5:42 ` Jitendra Bhivare 2016-08-12 7:55 ` Jitendra Bhivare 2016-08-12 8:05 ` Jitendra Bhivare 3 siblings, 0 replies; 11+ messages in thread From: Jitendra Bhivare @ 2016-08-11 5:42 UTC (permalink / raw) To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi > -----Original Message----- > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > Sent: Wednesday, August 10, 2016 6:16 PM > To: 'Mike Christie'; 'Martin K. Petersen' > Cc: 'linux-scsi@vger.kernel.org' > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore > > > -----Original Message----- > > From: Mike Christie [mailto:mchristi@redhat.com] > > Sent: Tuesday, August 09, 2016 11:19 PM > > To: Jitendra Bhivare; Martin K. Petersen > > Cc: linux-scsi@vger.kernel.org > > Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > _irqsave/irqrestore > > > > On 08/09/2016 03:29 AM, Jitendra Bhivare wrote: > > >> -----Original Message----- > > >> From: Martin K. Petersen [mailto:martin.petersen@oracle.com] > > >> Sent: Tuesday, August 09, 2016 6:35 AM > > >> To: Mike Christie > > >> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org > > >> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > _irqsave/irqrestore > > >> > > >>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes: > > >> > > >>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used > > >>>> for protecting SGLs and WRBs. _bh versions are needed as the > > >>>> function gets invoked in process context and BLOCK_IOPOLL softirq. > > >>>> > > >>>> In spin_unlock_bh, after releasing the lock and enabling BH, > > >>>> do_softirq is called which executes till last SOFTIRQ. > > >>>> > > >>>> beiscsi_alloc_pdu is called under session lock. Through block > > >>>> layer, iSCSI stack in some cases send IOs with interrupts disabled. 
> > >>>> In such paths,
> > >>
> > >> Mike> What path is this? Is this with mq enabled or disabled?
> > >>
> > >> Jitendra?
> > >>
> > >> --
> > >> Martin K. Petersen    Oracle Linux Engineering
> > > [JB] Sorry for the delayed response, there was some issue with my
> > > mail client.
> > > There are paths in the block layer where IRQs are disabled with the
> > > request_queue queue_lock.
> > > - blk_timeout_work : this triggers NOP-OUT thru'
> > > iscsi_eh_cmd_timed_out.
> > > - blk_execute_rq_nowait
> >
> > I am not sure I understand the problem/solution. Why does the patch
> > only need to modify be2iscsi's locking? Why wouldn't you need to also
> > change the libiscsi session lock locking? You don't need irqs to be
> > running right? You are just doing this so the locking calls (irq vs bh)
> > match up?
> >
> > Don't you also need to fix the non-mq IO path. scsi_request_fn needs a
> > fix to use spin_lock_irqsave/spin_unlock_irqrestore because
> > blk_execute_rq_nowait already disables them right?
> >
> [JB] The issue is a hard lockup seen on a CPU which is waiting for the
> session lock taken by another CPU processing do_softirq. This do_softirq
> is getting called from be2iscsi's spin_unlock_bh.
>
> The contention gets easily exposed with be2iscsi as its data structures
> are accessed in both process context and the IRQ_POLL softirq.
>
> iscsi_eh_cmd_timed_out gets saved because it is using the base spin_lock
> version for the session locks (frwd and back).
> iscsi_queuecommand is using _bh, which is a bit scary.
>
> Yes, that needs to be fixed too... and then this other path looks buggy
> too -
> blk_run_queue (takes queue_lock with irqsave)
>  -> scsi_request_fn (unlocks it with _irq)
[JB] In this case, the other CPU executing do_softirq was also sending IO
under the session lock. When be2iscsi does spin_unlock_bh after getting
resources for the IO, unlock_bh finds a pending IRQ_POLL softirq.

The issue is more to do with the driver using _unlock_bh.
It is also causing a WARN_ON in unlock_bh wherever it is used with IRQs disabled.
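The WARN_ON referred to here comes from `__local_bh_enable_ip()` in kernel/softirq.c, which checks that hard IRQs are enabled before it re-enables bottom halves. A minimal userspace model of the check follows; the names mirror the kernel, but these are stubs, not the real implementation.

```c
#include <assert.h>

static int irqs_disabled_state;   /* 1 = hard IRQs off on this CPU */
static int warned;

/* Model of the check in __local_bh_enable_ip(): running do_softirq()
 * with hard IRQs disabled is forbidden, so the kernel warns. */
static void local_bh_enable_stub(void)
{
    if (irqs_disabled_state)
        warned = 1;               /* WARN_ON_ONCE(irqs_disabled()) */
}

/* scsi_times_out() -> iscsi_eh_cmd_timed_out() runs with the queue
 * lock held and IRQs off; any driver spin_unlock_bh() beneath it
 * therefore hits the warning. */
static int timeout_path_with_bh_unlock(void)
{
    irqs_disabled_state = 1;      /* blk timeout work, IRQs off   */
    local_bh_enable_stub();       /* tail of _raw_spin_unlock_bh() */
    irqs_disabled_state = 0;
    return warned;
}
```

This is why libiscsi's plain spin_lock for the session locks "gets saved" on the timeout path while the driver's _bh variants do not: only the _bh unlock tries to re-enable bottom halves.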
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore 2016-08-09 17:48 ` Mike Christie 2016-08-10 12:46 ` Jitendra Bhivare 2016-08-11 5:42 ` Jitendra Bhivare @ 2016-08-12 7:55 ` Jitendra Bhivare 2016-08-12 8:05 ` Jitendra Bhivare 3 siblings, 0 replies; 11+ messages in thread From: Jitendra Bhivare @ 2016-08-12 7:55 UTC (permalink / raw) To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi > -----Original Message----- > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > Sent: Thursday, August 11, 2016 11:12 AM > To: 'Mike Christie'; 'Martin K. Petersen' > Cc: 'linux-scsi@vger.kernel.org' > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore > > > > > -----Original Message----- > > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > > Sent: Wednesday, August 10, 2016 6:16 PM > > To: 'Mike Christie'; 'Martin K. Petersen' > > Cc: 'linux-scsi@vger.kernel.org' > > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with > > _irqsave/irqrestore > > > > > -----Original Message----- > > > From: Mike Christie [mailto:mchristi@redhat.com] > > > Sent: Tuesday, August 09, 2016 11:19 PM > > > To: Jitendra Bhivare; Martin K. Petersen > > > Cc: linux-scsi@vger.kernel.org > > > Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > _irqsave/irqrestore > > > > > > On 08/09/2016 03:29 AM, Jitendra Bhivare wrote: > > > >> -----Original Message----- > > > >> From: Martin K. Petersen [mailto:martin.petersen@oracle.com] > > > >> Sent: Tuesday, August 09, 2016 6:35 AM > > > >> To: Mike Christie > > > >> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org > > > >> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > > _irqsave/irqrestore > > > >> > > > >>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes: > > > >> > > > >>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used > > > >>>> for protecting SGLs and WRBs. 
_bh versions are needed as the > > > >>>> function gets invoked in process context and BLOCK_IOPOLL > > > >>>> softirq. > > > >>>> > > > >>>> In spin_unlock_bh, after releasing the lock and enabling BH, > > > >>>> do_softirq is called which executes till last SOFTIRQ. > > > >>>> > > > >>>> beiscsi_alloc_pdu is called under session lock. Through block > > > >>>> layer, iSCSI stack in some cases send IOs with interrupts > > > >>>> disabled. > > > >>>> In such paths, > > > >> > > > >> > > > >> Mike> What path is this? Is this with mq enabled or disabled? > > > >> > > > >> Jitendra? > > > >> > > > >> -- > > > >> Martin K. Petersen Oracle Linux Engineering > > > > [JB] Sorry for the delayed response, there was some issue with my > > > > mail client. > > > > There are paths block layer where IRQs are disabled with > > > > request_queue queue_lock. > > > > - blk_timeout_work : this triggers NOP-OUT thru' > > > > iscsi_eh_cmd_timed_out. > > > > - blk_execute_rq_nowait > > > > > > I am not sure I understand the problem/solution. Why does the patch > > > only need to modify be2iscsi's locking? Why wouldn't you need to > > > also change the libiscsi session lock locking? You don't need irqs > > > to be running right? You are just doing this so the locking calls > > > (irq vs bh) matches > > up? > > > > > > Don't you also need to fix the non-mq IO path. scsi_request_fn needs > > > a fix to use spin_lock_irqsave/spin_unlock_irqrestore because > > > blk_execute_rq_nowait already disables them right? > > > > > [JB] The issue is hard lockup seen on a CPU which is waiting for > > session lock taken by another CPU processing do_softirq. This > > do_softirq is getting called from be2iscsi spin_unlock_bh. > > > > The contention gets easily exposed with be2iscsi as it data structures > > are accessed in process context and IRQ_POLL softirq. > > > > iscsi_eh_cmd_timed_out gets saved because it is using base spin_lock > > version for session lock (frwd and back). 
> > iscsi_queuecommand is using _bh, which is a bit scary.
> >
> > Yes, that needs to be fixed too... and then this other path looks
> > buggy too -
> > blk_run_queue (takes queue_lock with irqsave)
> >  -> scsi_request_fn (unlocks it with _irq)
> [JB] In this case, the other CPU executing do_softirq was also sending IO
> under the session lock.
> When be2iscsi does spin_unlock_bh after getting resources for the IO,
> unlock_bh finds a pending IRQ_POLL softirq.
>
> The issue is more to do with the driver using _unlock_bh.
> It is also causing a WARN_ON in unlock_bh wherever it is used with IRQs
> disabled.
[JB] My apologies. blk_execute_rq_nowait and the call trace shown earlier are
from an older kernel. The only valid trace with the latest kernel is:

[  180.957062] <IRQ> [<ffffffff81603f36>] dump_stack+0x19/0x1b
[  180.957070] [<ffffffff8106e28b>] warn_slowpath_common+0x6b/0xb0
[  180.957072] [<ffffffff8106e3da>] warn_slowpath_null+0x1a/0x20
[  180.957073] [<ffffffff81076d1a>] local_bh_enable_ip+0x7a/0xa0
[  180.957077] [<ffffffff8160b0eb>] _raw_spin_unlock_bh+0x1b/0x40
[  180.957084] [<ffffffffa037f753>] alloc_mgmt_sgl_handle+0x83/0xe0 [be2iscsi]
[  180.957089] [<ffffffffa0388d9e>] beiscsi_alloc_pdu+0x21e/0x5d0 [be2iscsi]
[  180.957094] [<ffffffffa01bec71>] __iscsi_conn_send_pdu+0x101/0x2e0 [libiscsi]
[  180.957099] [<ffffffffa01bef78>] iscsi_send_nopout+0xb8/0x100 [libiscsi]
[  180.957104] [<ffffffffa01bfca4>] iscsi_eh_cmd_timed_out+0x2a4/0x2d0 [libiscsi]
[  180.957107] [<ffffffff813f56de>] scsi_times_out+0x5e/0x320
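The patch's remedy is mechanical: switch the driver's SGL/WRB locks to the _irqsave/_irqrestore variants, which save and restore the caller's IRQ state instead of running softirqs on unlock, and are therefore safe whether the caller has IRQs on or off. A userspace sketch of the state handling follows; the `*_stub` functions and `alloc_path_fixed` are illustrative stand-ins, not the real kernel or be2iscsi code.

```c
#include <assert.h>

static int irq_on = 1;  /* 1 = hard IRQs enabled on this CPU */

/* spin_lock_irqsave(): remember the caller's IRQ state, then
 * disable IRQs for the critical section. */
static void lock_irqsave_stub(unsigned long *flags)
{
    *flags = (unsigned long)irq_on;
    irq_on = 0;
}

/* spin_unlock_irqrestore(): put the IRQ state back exactly as it
 * was -- no do_softirq() call, so it is safe in any context. */
static void unlock_irqrestore_stub(unsigned long flags)
{
    irq_on = (int)flags;
}

/* Hypothetical sketch of the fixed allocation path: correct both
 * when called from process context (IRQs on) and from the blk
 * timeout path (IRQs already off). */
static int alloc_path_fixed(void)
{
    unsigned long flags;

    lock_irqsave_stub(&flags);      /* was: spin_lock_bh()   */
    /* ... carve out SGL/WRB resources ... */
    unlock_irqrestore_stub(flags);  /* was: spin_unlock_bh() */
    return irq_on;
}
```

Called with IRQs enabled, the path leaves them enabled; called from a context that already disabled IRQs, it leaves them disabled, so nothing in the timeout path warns or runs softirqs under the session lock.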
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore 2016-08-09 17:48 ` Mike Christie ` (2 preceding siblings ...) 2016-08-12 7:55 ` Jitendra Bhivare @ 2016-08-12 8:05 ` Jitendra Bhivare 3 siblings, 0 replies; 11+ messages in thread From: Jitendra Bhivare @ 2016-08-12 8:05 UTC (permalink / raw) To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi > -----Original Message----- > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > Sent: Friday, August 12, 2016 1:26 PM > To: 'Mike Christie'; 'Martin K. Petersen' > Cc: 'linux-scsi@vger.kernel.org' > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore > > > > > -----Original Message----- > > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > > Sent: Thursday, August 11, 2016 11:12 AM > > To: 'Mike Christie'; 'Martin K. Petersen' > > Cc: 'linux-scsi@vger.kernel.org' > > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with > > _irqsave/irqrestore > > > > > > > > > -----Original Message----- > > > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > > > Sent: Wednesday, August 10, 2016 6:16 PM > > > To: 'Mike Christie'; 'Martin K. Petersen' > > > Cc: 'linux-scsi@vger.kernel.org' > > > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with > > > _irqsave/irqrestore > > > > > > > -----Original Message----- > > > > From: Mike Christie [mailto:mchristi@redhat.com] > > > > Sent: Tuesday, August 09, 2016 11:19 PM > > > > To: Jitendra Bhivare; Martin K. Petersen > > > > Cc: linux-scsi@vger.kernel.org > > > > Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > > _irqsave/irqrestore > > > > > > > > On 08/09/2016 03:29 AM, Jitendra Bhivare wrote: > > > > >> -----Original Message----- > > > > >> From: Martin K. 
Petersen [mailto:martin.petersen@oracle.com] > > > > >> Sent: Tuesday, August 09, 2016 6:35 AM > > > > >> To: Mike Christie > > > > >> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org > > > > >> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > > > _irqsave/irqrestore > > > > >> > > > > >>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes: > > > > >> > > > > >>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being > > > > >>>> used for protecting SGLs and WRBs. _bh versions are needed as > > > > >>>> the function gets invoked in process context and BLOCK_IOPOLL > softirq. > > > > >>>> > > > > >>>> In spin_unlock_bh, after releasing the lock and enabling BH, > > > > >>>> do_softirq is called which executes till last SOFTIRQ. > > > > >>>> > > > > >>>> beiscsi_alloc_pdu is called under session lock. Through block > > > > >>>> layer, iSCSI stack in some cases send IOs with interrupts > > > > >>>> disabled. > > > > >>>> In such paths, > > > > >> > > > > >> > > > > >> Mike> What path is this? Is this with mq enabled or disabled? > > > > >> > > > > >> Jitendra? > > > > >> > > > > >> -- > > > > >> Martin K. Petersen Oracle Linux Engineering > > > > > [JB] Sorry for the delayed response, there was some issue with > > > > > my mail client. > > > > > There are paths block layer where IRQs are disabled with > > > > > request_queue queue_lock. > > > > > - blk_timeout_work : this triggers NOP-OUT thru' > iscsi_eh_cmd_timed_out. > > > > > - blk_execute_rq_nowait > > > > > > > > I am not sure I understand the problem/solution. Why does the > > > > patch only need to modify be2iscsi's locking? Why wouldn't you > > > > need to also change the libiscsi session lock locking? You don't > > > > need irqs to be running right? You are just doing this so the > > > > locking calls (irq vs bh) matches > > > up? > > > > > > > > Don't you also need to fix the non-mq IO path. 
scsi_request_fn > > > > needs a fix to use spin_lock_irqsave/spin_unlock_irqrestore > > > > because blk_execute_rq_nowait already disables them right? > > > > > > > [JB] The issue is hard lockup seen on a CPU which is waiting for > > > session lock taken by another CPU processing do_softirq. This > > > do_softirq is getting called from be2iscsi spin_unlock_bh. > > > > > > The contention gets easily exposed with be2iscsi as it data > > > structures are accessed in process context and IRQ_POLL softirq. > > > > > > iscsi_eh_cmd_timed_out gets saved because it is using base spin_lock > > > version for session lock (frwd and back). > > > iscsi_queuecommand is using _bh which is bit scary. > > > > > > Yes, that needs to be fixed too... and then this other path looks > > > buggy too - blk_run_queue (takes queue_lock with irqsave) > > > ->scsi_request_fn (unlocks it with > > > _irq) > > [JB] In this case, the other CPU executing do_softirq was too sending > > IO under session lock. > > When be2iscsi does spin_unlock_bh after getting resources for the IO, > > unlock_bh finds pending IRQ_POLL softirq. > > > > The issue is more to do with driver using _unlock_bh. > > It is causing a WARN_ON too in unlock_bh where ever used for disabled > > IRQs. > [JB] My apologies. blk_execute_rq_nowait and the call trace shown is with > older > kernel. 
> The only valid trace with the latest kernel is:
> [  180.957062] <IRQ> [<ffffffff81603f36>] dump_stack+0x19/0x1b
> [  180.957070] [<ffffffff8106e28b>] warn_slowpath_common+0x6b/0xb0
> [  180.957072] [<ffffffff8106e3da>] warn_slowpath_null+0x1a/0x20
> [  180.957073] [<ffffffff81076d1a>] local_bh_enable_ip+0x7a/0xa0
> [  180.957077] [<ffffffff8160b0eb>] _raw_spin_unlock_bh+0x1b/0x40
> [  180.957084] [<ffffffffa037f753>] alloc_mgmt_sgl_handle+0x83/0xe0 [be2iscsi]
> [  180.957089] [<ffffffffa0388d9e>] beiscsi_alloc_pdu+0x21e/0x5d0 [be2iscsi]
> [  180.957094] [<ffffffffa01bec71>] __iscsi_conn_send_pdu+0x101/0x2e0 [libiscsi]
> [  180.957099] [<ffffffffa01bef78>] iscsi_send_nopout+0xb8/0x100 [libiscsi]
> [  180.957104] [<ffffffffa01bfca4>] iscsi_eh_cmd_timed_out+0x2a4/0x2d0 [libiscsi]
> [  180.957107] [<ffffffff813f56de>] scsi_times_out+0x5e/0x320
[JB] The lockup scenario explained earlier, though likely a soft lockup rather
than a hard one, can still happen, hence the change.
end of thread, other threads: [~2016-09-27 10:18 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2016-09-23 15:08 [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore Jitendra Bhivare
-- strict thread matches above, loose matches on Subject: below --
2016-09-27 10:18 Jitendra Bhivare
2016-07-21  9:07 [PATCH 00/28] be2iscsi: driver update 11.2.0.0 Jitendra Bhivare
2016-07-21  9:07 ` [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore Jitendra Bhivare
2016-08-05 18:38 ` Mike Christie
2016-08-09  1:05 ` Martin K. Petersen
2016-08-09  8:29 ` Jitendra Bhivare
2016-08-09 17:48 ` Mike Christie
2016-08-10 12:46 ` Jitendra Bhivare
2016-08-11  5:42 ` Jitendra Bhivare
2016-08-12  7:55 ` Jitendra Bhivare
2016-08-12  8:05 ` Jitendra Bhivare