* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
@ 2016-09-23 15:08 Jitendra Bhivare
0 siblings, 0 replies; 11+ messages in thread
From: Jitendra Bhivare @ 2016-09-23 15:08 UTC (permalink / raw)
To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi

Hi Mike,

I could reproduce the hard lockup using the for-next kernel only in the
iscsi_eh_cmd_timed_out path, due to the spin_lock_irqsave taken in
blk_timeout_work. Please refer to the stack trace below.

The _bh versions used for frwd_lock and back_lock do not seem to be
causing any issue similar to the one seen with be2iscsi, after replacing
the _bh versions in be2iscsi.

I am testing it further to confirm in all possible scenarios: NOPs, error
recovery, resets and reconnects. On my setup, I affined all EQ interrupts
to a single CPU. Along with heavy IO, a few of the fio invocations were
pinned to run on the same CPU.

Any call to spin_unlock_bh made with another spin_lock already held can
invoke do_softirq, which might deadlock if a bottom half used by the
driver calls a function that needs that other spin_lock. Is there code
that prevents this issue?
Thanks,
JB

[ 3843.125976] ------------[ cut here ]------------
[ 3843.132217] WARNING: CPU: 20 PID: 1227 at kernel/softirq.c:150 __local_bh_enable_ip+0x6b/0x90
[ 3843.142815] Modules linked in: dm_service_time be2iscsi(E) iscsi_boot_sysfs xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute rpcrdma bridge ib_isert stp iscsi_target_mod llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ib_iser ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 libiscsi nf_nat scsi_transport_iscsi ib_srpt nf_conntrack target_core_mod iptable_mangle iptable_security iptable_raw iptable_filter ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ocrdma ib_core dm_mirror dm_region_hash dm_log intel_rapl sb_edac edac_core x86_pkg_temp_thermal
[ 3843.231162] intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel iTCO_wdt iTCO_vendor_support ipmi_devintf lrw dcdbas mei_me ipmi_ssif gf128mul sg mei glue_helper ablk_helper ioatdma cryptd ipmi_si shpchp nfsd wmi acpi_power_meter ipmi_msghandler pcspkr dca lpc_ich acpi_pad auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath dm_mod ip_tables ext4 jbd2 mbcache sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm tg3 crc32c_intel ptp i2c_core be2net megaraid_sas fjes pps_core
[ 3843.294328] CPU: 20 PID: 1227 Comm: kworker/20:1H Tainted: G E 4.8.0-rc1+ #3
[ 3843.304944] Hardware name: Dell Inc. PowerEdge R720/0X6H47, BIOS 1.4.8 10/25/2012
[ 3843.314798] Workqueue: kblockd blk_timeout_work
[ 3843.321350] 0000000000000086 00000000a32f4533 ffff8802216d7bd8 ffffffff8135c3cf
[ 3843.331146] 0000000000000000 0000000000000000 ffff8802216d7c18 ffffffff8108d661
[ 3843.340918] 00000096216d7c50 0000000000000200 ffff8802d07cc828 ffff8801b3632550
[ 3843.350687] Call Trace:
[ 3843.354866] [<ffffffff8135c3cf>] dump_stack+0x63/0x84
[ 3843.362061] [<ffffffff8108d661>] __warn+0xd1/0xf0
[ 3843.368851] [<ffffffff8108d79d>] warn_slowpath_null+0x1d/0x20
[ 3843.376791] [<ffffffff810930eb>] __local_bh_enable_ip+0x6b/0x90
[ 3843.384903] [<ffffffff816fe7be>] _raw_spin_unlock_bh+0x1e/0x20
[ 3843.392940] [<ffffffffa085f710>] beiscsi_alloc_pdu+0x2f0/0x6e0 [be2iscsi]
[ 3843.402076] [<ffffffffa06bc358>] __iscsi_conn_send_pdu+0xf8/0x370 [libiscsi]
[ 3843.411549] [<ffffffffa06bc6fe>] iscsi_send_nopout+0xbe/0x110 [libiscsi]
[ 3843.420639] [<ffffffffa06bd98b>] iscsi_eh_cmd_timed_out+0x29b/0x2b0 [libiscsi]
[ 3843.430339] [<ffffffff814cd1de>] scsi_times_out+0x5e/0x250
[ 3843.438119] [<ffffffff813374af>] blk_rq_timed_out+0x1f/0x60
[ 3843.446009] [<ffffffff8133759d>] blk_timeout_work+0xad/0x150
[ 3843.454010] [<ffffffff810a6642>] process_one_work+0x152/0x400
[ 3843.462114] [<ffffffff810a6f35>] worker_thread+0x125/0x4b0
[ 3843.469961] [<ffffffff810a6e10>] ? rescuer_thread+0x380/0x380
[ 3843.478116] [<ffffffff810aca28>] kthread+0xd8/0xf0
[ 3843.485212] [<ffffffff816fedff>] ret_from_fork+0x1f/0x40
[ 3843.492908] [<ffffffff810ac950>] ? kthread_park+0x60/0x60
[ 3843.500715] ---[ end trace 57ec0a1d8f0dd3a0 ]---
[ 3852.328667] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
Kernel panic - not syncing: Hard LOCKUP
[ 3852.341357] Modules linked in: dm_service_time be2iscsi(E) iscsi_boot_sysfs xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute rpcrdma bridge ib_isert stp iscsi_target_mod llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ib_iser ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 libiscsi nf_nat scsi_transport_iscsi ib_srpt nf_conntrack target_core_mod iptable_mangle iptable_security iptable_raw iptable_filter ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ocrdma ib_core dm_mirror dm_region_hash dm_log intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel iTCO_wdt iTCO_vendor_support ipmi_devintf lrw dcdbas mei_me ipm
[ 3852.341358] CPU: 1 PID: 1129 Comm: kworker/1:1H Tainted: G W E 4.8.0-rc1+ #3
[ 3852.341359] Hardware name: Dell Inc. PowerEdge R720/0X6H47, BIOS 1.4.8 10/25/2012
[ 3852.341359] Workqueue: kblockd blk_timeout_work
[ 3852.341360] 0000000000000086 00000000acd8294d ffff88042f605bb0 ffffffff8135c3cf
[ 3852.341361] 0000000000000000 0000000000000000 ffff88042f605bc8 ffffffff8114b3b8
[ 3852.341362] ffff8802af358000 ffff88042f605c00 ffffffff8118546c 0000000000000001
[ 3852.341362] Call Trace:
[ 3852.341363] <NMI> [<ffffffff8135c3cf>] dump_stack+0x63/0x84
[ 3852.341363] [<ffffffff8114b3b8>] watchdog_overflow_callback+0xc8/0xf0
[ 3852.341364] [<ffffffff8118546c>] __perf_event_overflow+0x7c/0x1f0
[ 3852.341364] [<ffffffff8118f7b4>] perf_event_overflow+0x14/0x20
[ 3852.341365] [<ffffffff8100c46d>] intel_pmu_handle_irq+0x1dd/0x490
[ 3852.341366] [<ffffffff8135e561>] ? ioremap_page_range+0x2a1/0x400
[ 3852.341366] [<ffffffff811e0e00>] ? vunmap_page_range+0x1e0/0x320
[ 3852.341367] [<ffffffff811e0f51>] ? unmap_kernel_range_noflush+0x11/0x20
[ 3852.341368] [<ffffffff814206c6>] ? ghes_copy_tofrom_phys+0x116/0x1f0
[ 3852.341368] [<ffffffff81053e3f>] ? native_apic_wait_icr_idle+0x1f/0x30
[ 3852.341369] [<ffffffff810057dd>] perf_event_nmi_handler+0x2d/0x50
[ 3852.341369] [<ffffffff810313f1>] nmi_handle+0x61/0x110
[ 3852.341370] [<ffffffff81031954>] default_do_nmi+0x44/0x120
[ 3852.341371] [<ffffffff81031b1b>] do_nmi+0xeb/0x160
[ 3852.341371] [<ffffffff817011b1>] end_repeat_nmi+0x1a/0x1e
[ 3852.341372] [<ffffffff810da737>] ? native_queued_spin_lock_slowpath+0x117/0x1a0
[ 3852.341373] [<ffffffff810da737>] ? native_queued_spin_lock_slowpath+0x117/0x1a0
[ 3852.341373] [<ffffffff810da737>] ? native_queued_spin_lock_slowpath+0x117/0x1a0
[ 3852.341374] <<EOE>> <IRQ> [<ffffffff811983ba>] queued_spin_lock_slowpath+0xb/0xf
[ 3852.341375] [<ffffffff816feb27>] _raw_spin_lock_irqsave+0x37/0x40
[ 3852.341375] [<ffffffff814d041b>] scsi_end_request+0x10b/0x1d0
[ 3852.341376] [<ffffffff814d2143>] scsi_io_completion+0x153/0x650
[ 3852.341377] [<ffffffff814c8faf>] scsi_finish_command+0xcf/0x120
[ 3852.341377] [<ffffffff814d1927>] scsi_softirq_done+0x127/0x150
[ 3852.341378] [<ffffffff813370dc>] blk_done_softirq+0x8c/0xc0
[ 3852.341378] [<ffffffff81701932>] __do_softirq+0xd2/0x27a
[ 3852.341379] [<ffffffff817009bc>] do_softirq_own_stack+0x1c/0x30

^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
@ 2016-09-27 10:18 Jitendra Bhivare
0 siblings, 0 replies; 11+ messages in thread
From: Jitendra Bhivare @ 2016-09-27 10:18 UTC (permalink / raw)
To: cleech, lduncan; +Cc: linux-scsi, Mike Christie, Martin K. Petersen
> -----Original Message-----
> From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com]
> Sent: Friday, September 23, 2016 8:38 PM
> To: 'Mike Christie'; 'Martin K. Petersen'
> Cc: 'linux-scsi@vger.kernel.org'
> Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
>
> Hi Mike,
>
> I could reproduce hard lockup using for-next kernel only in
> iscsi_eh_cmd_timeout path due to spin_lock_irqsave taken in
> blk_timeout_work
> Please refer stack trace below.
>
> The _bh version used for frwd_lock and back_lock does not seem to be
> causing
> any issue similar to seen with be2iscsi after replacing _bh versions in
> be2iscsi.
>
> I am testing it further, to confirm in all possible scenarios... NOPs,
> error
> recovery, resets and reconnects.
> On my setup, I affined all EQ interrupts on a single CPU.
> Along with heavy IO, few of the invocations of fio were pinned to run on
> same
> CPU.
>
> Any call to unlock_bh with another spin_lock already held, invoking
> do_softirq,
> might cause deadlock if bottom half used by driver calls function which
> needs
> that another spin_lock.
> Is there a code which prevents this issue?
>
The only place, I think, where libiscsi can have issues is
__iscsi_complete_pdu, which should be protected under the _bh version of
the frwd and back locks. If the function is executed in process context,
there is a good chance that the base version of spin_lock used for the
frwd/back locks could cause deadlocks when the LLD uses the _bh version
in the alloc_pdu path (__iscsi_complete_pdu -> iscsi_send_nopout ...
alloc_pdu).
Thanks,
JB
^ permalink raw reply [flat|nested] 11+ messages in thread

* [PATCH 00/28] be2iscsi: driver update 11.2.0.0
@ 2016-07-21 9:07 Jitendra Bhivare
2016-07-21 9:07 ` [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore Jitendra Bhivare
0 siblings, 1 reply; 11+ messages in thread
From: Jitendra Bhivare @ 2016-07-21 9:07 UTC (permalink / raw)
To: linux-scsi; +Cc: mchristi, Jitendra Bhivare

This patch is generated against for-next branch.

Jitendra Bhivare (28):
  be2iscsi: Fix to use correct configuration values
  be2iscsi: Replace _bh with _irqsave/irqrestore
  be2iscsi: Replace _bh version for mcc_lock spinlock
  be2iscsi: Reduce driver load/unload time
  be2iscsi: Set and return right iface v4/v6 states
  be2iscsi: Fix gateway APIs to support IPv4 & IPv6
  be2iscsi: Fix release of DHCP IP in static mode
  be2iscsi: Move VLAN code to common iface_set_param
  be2iscsi: Update iface handle before any set param
  be2iscsi: Rename iface get/set/create/destroy APIs
  be2iscsi: Handle only NET_PARAM in iface_get_param
  be2iscsi: Check all zeroes IP before issuing IOCTL
  be2iscsi: Remove alloc_mcc_tag & beiscsi_pci_soft_reset
  be2iscsi: Remove isr_lock and dead code
  be2iscsi: Fix checks for HBA in error state
  be2iscsi: Fix to make boot discovery non-blocking
  be2iscsi: Fix to add timer for UE detection
  be2iscsi: Add IOCTL to check UER supported
  be2iscsi: Move functions to right files
  be2iscsi: Fix POST check and reset sequence
  be2iscsi: Add V1 of EPFW cleanup IOCTL
  be2iscsi: Add TPE recovery feature
  be2iscsi: Fail the sessions immediately after TPE
  be2iscsi: Add FUNCTION_RESET during driver unload
  be2iscsi: Fix async PDU handling path
  be2iscsi: Update copyright information
  be2iscsi: Update the driver version
  MAINTAINERS: Update be2iscsi contact info

 MAINTAINERS                      |   10 +-
 drivers/scsi/be2iscsi/be.h       |   15 +-
 drivers/scsi/be2iscsi/be_cmds.c  | 1096 +++++++++++------
 drivers/scsi/be2iscsi/be_cmds.h  |  142 ++-
 drivers/scsi/be2iscsi/be_iscsi.c |  408 +++----
 drivers/scsi/be2iscsi/be_iscsi.h |   25 +-
 drivers/scsi/be2iscsi/be_main.c  | 2476 +++++++++++++++++++-------------------
 drivers/scsi/be2iscsi/be_main.h  |  220 ++--
 drivers/scsi/be2iscsi/be_mgmt.c  | 1490 +++++++++--------------
 drivers/scsi/be2iscsi/be_mgmt.h  |   51 +-
 10 files changed, 3052 insertions(+), 2881 deletions(-)

-- 
1.9.1

^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-07-21 9:07 [PATCH 00/28] be2iscsi: driver update 11.2.0.0 Jitendra Bhivare
@ 2016-07-21 9:07 ` Jitendra Bhivare
2016-08-05 18:38 ` Mike Christie
0 siblings, 1 reply; 11+ messages in thread
From: Jitendra Bhivare @ 2016-07-21 9:07 UTC (permalink / raw)
To: linux-scsi; +Cc: mchristi, Jitendra Bhivare

Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 21
Pid: 13242, comm: flush-8:80 Tainted: G W -- ------------ 2.6.32-573.el6.x86_64 #1
Call Trace:
<NMI> [<ffffffff81537a84>] ? panic+0xa7/0x16f
[<ffffffff810149c9>] ? sched_clock+0x9/0x10
[<ffffffff810ed4bd>] ? watchdog_overflow_callback+0xcd/0xd0
[<ffffffff81124037>] ? __perf_event_overflow+0xa7/0x240
[<ffffffff8101dc54>] ? x86_perf_event_set_period+0xf4/0x180
[<ffffffff81124684>] ? perf_event_overflow+0x14/0x20
[<ffffffff81024a02>] ? intel_pmu_handle_irq+0x202/0x3f0
[<ffffffff8153cd89>] ? perf_event_nmi_handler+0x39/0xb0
[<ffffffff8153e845>] ? notifier_call_chain+0x55/0x80
[<ffffffff81389ba0>] ? scsi_done+0x0/0x60
[<ffffffff8153e8aa>] ? atomic_notifier_call_chain+0x1a/0x20
[<ffffffff810a788e>] ? notify_die+0x2e/0x30
[<ffffffff8153c503>] ? do_nmi+0x1c3/0x350
[<ffffffff8153bdc0>] ? nmi+0x20/0x30
[<ffffffff81389ba0>] ? scsi_done+0x0/0x60
[<ffffffff81389ba0>] ? scsi_done+0x0/0x60
[<ffffffff8153b62e>] ? _spin_lock+0x1e/0x30
<<EOE>> [<ffffffffa00e2eaf>] ? iscsi_queuecommand+0x7f/0x4e0 [libiscsi]
[<ffffffff81389df5>] ? scsi_dispatch_cmd+0xe5/0x310
[<ffffffff813927be>] ? scsi_request_fn+0x5be/0x750
[<ffffffff81089bad>] ? del_timer+0x7d/0xe0
[<ffffffff81273542>] ? __generic_unplug_device+0x32/0x40
[<ffffffff8126e823>] ? elv_insert+0xd3/0x190
[<ffffffff8126e920>] ? __elv_add_request+0x40/0x90

In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
protecting SGLs and WRBs. _bh versions are needed as the function gets
invoked in process context and BLOCK_IOPOLL softirq.

In spin_unlock_bh, after releasing the lock and enabling BH, do_softirq
is called which executes till last SOFTIRQ.

beiscsi_alloc_pdu is called under session lock. Through block layer,
iSCSI stack in some cases send IOs with interrupts disabled. In such
paths, CPU will get stuck for a while for session lock with interrupts
disabled because in other CPU do_softirq is executing under session lock
thus causing hard lock up.

Use spin_lock_irqsave/spin_unlock_irqrestore as the driver can't be sure
in all paths interrupts are enabled.

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
---
 drivers/scsi/be2iscsi/be_main.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
index f05e773..02d24d5 100644
--- a/drivers/scsi/be2iscsi/be_main.c
+++ b/drivers/scsi/be2iscsi/be_main.c
@@ -1131,8 +1131,9 @@ beiscsi_process_async_pdu(struct beiscsi_conn *beiscsi_conn,
 static struct sgl_handle *alloc_io_sgl_handle(struct beiscsi_hba *phba)
 {
 	struct sgl_handle *psgl_handle;
+	unsigned long flags;
 
-	spin_lock_bh(&phba->io_sgl_lock);
+	spin_lock_irqsave(&phba->io_sgl_lock, flags);
 	if (phba->io_sgl_hndl_avbl) {
 		beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_IO,
 			    "BM_%d : In alloc_io_sgl_handle,"
@@ -1150,14 +1151,16 @@ static struct sgl_handle *alloc_io_sgl_handle(struct beiscsi_hba *phba)
 		phba->io_sgl_alloc_index++;
 	} else
 		psgl_handle = NULL;
-	spin_unlock_bh(&phba->io_sgl_lock);
+	spin_unlock_irqrestore(&phba->io_sgl_lock, flags);
 	return psgl_handle;
 }
 
 static void
 free_io_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 {
-	spin_lock_bh(&phba->io_sgl_lock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&phba->io_sgl_lock, flags);
 	beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_IO,
 		    "BM_%d : In free_,io_sgl_free_index=%d\n",
 		    phba->io_sgl_free_index);
@@ -1172,7 +1175,7 @@ free_io_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 			    "value there=%p\n", phba->io_sgl_free_index,
 			    phba->io_sgl_hndl_base[phba->io_sgl_free_index]);
-		spin_unlock_bh(&phba->io_sgl_lock);
+		spin_unlock_irqrestore(&phba->io_sgl_lock, flags);
 		return;
 	}
 	phba->io_sgl_hndl_base[phba->io_sgl_free_index] = psgl_handle;
@@ -1181,7 +1184,7 @@ free_io_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 		phba->io_sgl_free_index = 0;
 	else
 		phba->io_sgl_free_index++;
-	spin_unlock_bh(&phba->io_sgl_lock);
+	spin_unlock_irqrestore(&phba->io_sgl_lock, flags);
 }
 
 static inline struct wrb_handle *
@@ -1189,15 +1192,16 @@ beiscsi_get_wrb_handle(struct hwi_wrb_context *pwrb_context,
 		       unsigned int wrbs_per_cxn)
 {
 	struct wrb_handle *pwrb_handle;
+	unsigned long flags;
 
-	spin_lock_bh(&pwrb_context->wrb_lock);
+	spin_lock_irqsave(&pwrb_context->wrb_lock, flags);
 	pwrb_handle = pwrb_context->pwrb_handle_base[pwrb_context->alloc_index];
 	pwrb_context->wrb_handles_available--;
 	if (pwrb_context->alloc_index == (wrbs_per_cxn - 1))
 		pwrb_context->alloc_index = 0;
 	else
 		pwrb_context->alloc_index++;
-	spin_unlock_bh(&pwrb_context->wrb_lock);
+	spin_unlock_irqrestore(&pwrb_context->wrb_lock, flags);
 	return pwrb_handle;
 }
 
@@ -1229,14 +1233,16 @@ beiscsi_put_wrb_handle(struct hwi_wrb_context *pwrb_context,
 		       struct wrb_handle *pwrb_handle,
 		       unsigned int wrbs_per_cxn)
 {
-	spin_lock_bh(&pwrb_context->wrb_lock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&pwrb_context->wrb_lock, flags);
 	pwrb_context->pwrb_handle_base[pwrb_context->free_index] = pwrb_handle;
 	pwrb_context->wrb_handles_available++;
 	if (pwrb_context->free_index == (wrbs_per_cxn - 1))
 		pwrb_context->free_index = 0;
 	else
 		pwrb_context->free_index++;
-	spin_unlock_bh(&pwrb_context->wrb_lock);
+	spin_unlock_irqrestore(&pwrb_context->wrb_lock, flags);
 }
 
 /**
@@ -1265,8 +1271,9 @@ free_wrb_handle(struct beiscsi_hba *phba, struct hwi_wrb_context *pwrb_context,
 static struct sgl_handle *alloc_mgmt_sgl_handle(struct beiscsi_hba *phba)
 {
 	struct sgl_handle *psgl_handle;
+	unsigned long flags;
 
-	spin_lock_bh(&phba->mgmt_sgl_lock);
+	spin_lock_irqsave(&phba->mgmt_sgl_lock, flags);
 	if (phba->eh_sgl_hndl_avbl) {
 		psgl_handle = phba->eh_sgl_hndl_base[phba->eh_sgl_alloc_index];
 		phba->eh_sgl_hndl_base[phba->eh_sgl_alloc_index] = NULL;
@@ -1284,14 +1291,16 @@ static struct sgl_handle *alloc_mgmt_sgl_handle(struct beiscsi_hba *phba)
 		phba->eh_sgl_alloc_index++;
 	} else
 		psgl_handle = NULL;
-	spin_unlock_bh(&phba->mgmt_sgl_lock);
+	spin_unlock_irqrestore(&phba->mgmt_sgl_lock, flags);
 	return psgl_handle;
 }
 
 void
 free_mgmt_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 {
-	spin_lock_bh(&phba->mgmt_sgl_lock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&phba->mgmt_sgl_lock, flags);
 	beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_CONFIG,
 		    "BM_%d : In free_mgmt_sgl_handle,"
 		    "eh_sgl_free_index=%d\n",
@@ -1306,7 +1315,7 @@ free_mgmt_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 			    "BM_%d : Double Free in eh SGL ,"
 			    "eh_sgl_free_index=%d\n",
 			    phba->eh_sgl_free_index);
-		spin_unlock_bh(&phba->mgmt_sgl_lock);
+		spin_unlock_irqrestore(&phba->mgmt_sgl_lock, flags);
 		return;
 	}
 	phba->eh_sgl_hndl_base[phba->eh_sgl_free_index] = psgl_handle;
@@ -1316,7 +1325,7 @@ free_mgmt_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle)
 		phba->eh_sgl_free_index = 0;
 	else
 		phba->eh_sgl_free_index++;
-	spin_unlock_bh(&phba->mgmt_sgl_lock);
+	spin_unlock_irqrestore(&phba->mgmt_sgl_lock, flags);
 }
 
 static void
-- 
1.9.1

^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-07-21 9:07 ` [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore Jitendra Bhivare
@ 2016-08-05 18:38 ` Mike Christie
2016-08-09 1:05 ` Martin K. Petersen
0 siblings, 1 reply; 11+ messages in thread
From: Mike Christie @ 2016-08-05 18:38 UTC (permalink / raw)
To: Jitendra Bhivare, linux-scsi

On 07/21/2016 04:07 AM, Jitendra Bhivare wrote:
> Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 21
> Pid: 13242, comm: flush-8:80 Tainted: G W -- ------------
> 2.6.32-573.el6.x86_64 #1
> Call Trace:
> <NMI> [<ffffffff81537a84>] ? panic+0xa7/0x16f
> [<ffffffff810149c9>] ? sched_clock+0x9/0x10
> [<ffffffff810ed4bd>] ? watchdog_overflow_callback+0xcd/0xd0
> [<ffffffff81124037>] ? __perf_event_overflow+0xa7/0x240
> [<ffffffff8101dc54>] ? x86_perf_event_set_period+0xf4/0x180
> [<ffffffff81124684>] ? perf_event_overflow+0x14/0x20
> [<ffffffff81024a02>] ? intel_pmu_handle_irq+0x202/0x3f0
> [<ffffffff8153cd89>] ? perf_event_nmi_handler+0x39/0xb0
> [<ffffffff8153e845>] ? notifier_call_chain+0x55/0x80
> [<ffffffff81389ba0>] ? scsi_done+0x0/0x60
> [<ffffffff8153e8aa>] ? atomic_notifier_call_chain+0x1a/0x20
> [<ffffffff810a788e>] ? notify_die+0x2e/0x30
> [<ffffffff8153c503>] ? do_nmi+0x1c3/0x350
> [<ffffffff8153bdc0>] ? nmi+0x20/0x30
> [<ffffffff81389ba0>] ? scsi_done+0x0/0x60
> [<ffffffff81389ba0>] ? scsi_done+0x0/0x60
> [<ffffffff8153b62e>] ? _spin_lock+0x1e/0x30
> <<EOE>> [<ffffffffa00e2eaf>] ? iscsi_queuecommand+0x7f/0x4e0 [libiscsi]
> [<ffffffff81389df5>] ? scsi_dispatch_cmd+0xe5/0x310
> [<ffffffff813927be>] ? scsi_request_fn+0x5be/0x750
> [<ffffffff81089bad>] ? del_timer+0x7d/0xe0
> [<ffffffff81273542>] ? __generic_unplug_device+0x32/0x40
> [<ffffffff8126e823>] ? elv_insert+0xd3/0x190
> [<ffffffff8126e920>] ? __elv_add_request+0x40/0x90
>
> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
> protecting SGLs and WRBs. _bh versions are needed as the function gets
> invoked in process context and BLOCK_IOPOLL softirq.
>
> In spin_unlock_bh, after releasing the lock and enabling BH, do_softirq
> is called which executes till last SOFTIRQ.
>
> beiscsi_alloc_pdu is called under session lock. Through block layer,
> iSCSI stack in some cases send IOs with interrupts disabled. In such paths,

What path is this? Is this with mq enabled or disabled?

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-08-05 18:38 ` Mike Christie
@ 2016-08-09 1:05 ` Martin K. Petersen
2016-08-09 8:29 ` Jitendra Bhivare
0 siblings, 1 reply; 11+ messages in thread
From: Martin K. Petersen @ 2016-08-09 1:05 UTC (permalink / raw)
To: Mike Christie; +Cc: Jitendra Bhivare, linux-scsi

>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes:

>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
>> protecting SGLs and WRBs. _bh versions are needed as the function
>> gets invoked in process context and BLOCK_IOPOLL softirq.
>>
>> In spin_unlock_bh, after releasing the lock and enabling BH,
>> do_softirq is called which executes till last SOFTIRQ.
>>
>> beiscsi_alloc_pdu is called under session lock. Through block layer,
>> iSCSI stack in some cases send IOs with interrupts disabled. In such
>> paths,

Mike> What path is this? Is this with mq enabled or disabled?

Jitendra?

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-08-09 1:05 ` Martin K. Petersen
@ 2016-08-09 8:29 ` Jitendra Bhivare
2016-08-09 17:48 ` Mike Christie
0 siblings, 1 reply; 11+ messages in thread
From: Jitendra Bhivare @ 2016-08-09 8:29 UTC (permalink / raw)
To: Martin K. Petersen, Mike Christie; +Cc: linux-scsi

> -----Original Message-----
> From: Martin K. Petersen [mailto:martin.petersen@oracle.com]
> Sent: Tuesday, August 09, 2016 6:35 AM
> To: Mike Christie
> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org
> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
>
> >>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes:
>
> >> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
> >> protecting SGLs and WRBs. _bh versions are needed as the function
> >> gets invoked in process context and BLOCK_IOPOLL softirq.
> >>
> >> In spin_unlock_bh, after releasing the lock and enabling BH,
> >> do_softirq is called which executes till last SOFTIRQ.
> >>
> >> beiscsi_alloc_pdu is called under session lock. Through block layer,
> >> iSCSI stack in some cases send IOs with interrupts disabled. In such
> >> paths,
>
> Mike> What path is this? Is this with mq enabled or disabled?
>
> Jitendra?
>
> --
> Martin K. Petersen	Oracle Linux Engineering

[JB] Sorry for the delayed response, there was some issue with my mail
client.
There are paths in the block layer where IRQs are disabled with the
request_queue queue_lock:
- blk_timeout_work: this triggers NOP-OUT thru' iscsi_eh_cmd_timed_out.
- blk_execute_rq_nowait

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-08-09 8:29 ` Jitendra Bhivare
@ 2016-08-09 17:48 ` Mike Christie
2016-08-10 12:46 ` Jitendra Bhivare
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Mike Christie @ 2016-08-09 17:48 UTC (permalink / raw)
To: Jitendra Bhivare, Martin K. Petersen; +Cc: linux-scsi

On 08/09/2016 03:29 AM, Jitendra Bhivare wrote:
>> -----Original Message-----
>> From: Martin K. Petersen [mailto:martin.petersen@oracle.com]
>> Sent: Tuesday, August 09, 2016 6:35 AM
>> To: Mike Christie
>> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org
>> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with
> _irqsave/irqrestore
>>
>>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes:
>>
>>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
>>>> protecting SGLs and WRBs. _bh versions are needed as the function
>>>> gets invoked in process context and BLOCK_IOPOLL softirq.
>>>>
>>>> In spin_unlock_bh, after releasing the lock and enabling BH,
>>>> do_softirq is called which executes till last SOFTIRQ.
>>>>
>>>> beiscsi_alloc_pdu is called under session lock. Through block layer,
>>>> iSCSI stack in some cases send IOs with interrupts disabled. In such
>>>> paths,
>>
>>
>> Mike> What path is this? Is this with mq enabled or disabled?
>>
>> Jitendra?
>>
>> --
>> Martin K. Petersen Oracle Linux Engineering
> [JB] Sorry for the delayed response, there was some issue with my mail
> client.
> There are paths block layer where IRQs are disabled with request_queue
> queue_lock.
> - blk_timeout_work : this triggers NOP-OUT thru' iscsi_eh_cmd_timed_out.
> - blk_execute_rq_nowait

I am not sure I understand the problem/solution. Why does the patch only
need to modify be2iscsi's locking? Why wouldn't you need to also change
the libiscsi session lock locking? You don't need irqs to be running
right? You are just doing this so the locking calls (irq vs bh) match up?

Don't you also need to fix the non-mq IO path? scsi_request_fn needs a
fix to use spin_lock_irqsave/spin_unlock_irqrestore because
blk_execute_rq_nowait already disables them right?

^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
2016-08-09 17:48 ` Mike Christie
@ 2016-08-10 12:46 ` Jitendra Bhivare
2016-08-11 5:42 ` Jitendra Bhivare
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Jitendra Bhivare @ 2016-08-10 12:46 UTC (permalink / raw)
To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi

> -----Original Message-----
> From: Mike Christie [mailto:mchristi@redhat.com]
> Sent: Tuesday, August 09, 2016 11:19 PM
> To: Jitendra Bhivare; Martin K. Petersen
> Cc: linux-scsi@vger.kernel.org
> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore
>
> On 08/09/2016 03:29 AM, Jitendra Bhivare wrote:
> >> -----Original Message-----
> >> From: Martin K. Petersen [mailto:martin.petersen@oracle.com]
> >> Sent: Tuesday, August 09, 2016 6:35 AM
> >> To: Mike Christie
> >> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org
> >> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with
> > _irqsave/irqrestore
> >>
> >>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes:
> >>
> >>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used for
> >>>> protecting SGLs and WRBs. _bh versions are needed as the function
> >>>> gets invoked in process context and BLOCK_IOPOLL softirq.
> >>>>
> >>>> In spin_unlock_bh, after releasing the lock and enabling BH,
> >>>> do_softirq is called which executes till last SOFTIRQ.
> >>>>
> >>>> beiscsi_alloc_pdu is called under session lock. Through block
> >>>> layer, iSCSI stack in some cases send IOs with interrupts disabled.
> >>>> In such paths,
> >>
> >>
> >> Mike> What path is this? Is this with mq enabled or disabled?
> >>
> >> Jitendra?
> >>
> >> --
> >> Martin K. Petersen Oracle Linux Engineering
> > [JB] Sorry for the delayed response, there was some issue with my
> > mail client.
> > There are paths block layer where IRQs are disabled with
> > request_queue queue_lock.
> > - blk_timeout_work : this triggers NOP-OUT thru' iscsi_eh_cmd_timed_out.
> > - blk_execute_rq_nowait
>
> I am not sure I understand the problem/solution. Why does the patch only
> need to modify be2iscsi's locking? Why wouldn't you need to also change
> the libiscsi session lock locking? You don't need irqs to be running
> right? You are just doing this so the locking calls (irq vs bh) matches
> up?
>
> Don't you also need to fix the non-mq IO path. scsi_request_fn needs a
> fix to use spin_lock_irqsave/spin_unlock_irqrestore because
> blk_execute_rq_nowait already disables them right?
>
[JB] The issue is a hard lockup seen on a CPU which is waiting for the
session lock taken by another CPU processing do_softirq. This do_softirq
is getting called from be2iscsi's spin_unlock_bh.

The contention gets easily exposed with be2iscsi as its data structures
are accessed in process context and in IRQ_POLL softirq.

iscsi_eh_cmd_timed_out gets saved because it is using the base spin_lock
version for the session lock (frwd and back).
iscsi_queuecommand is using _bh, which is a bit scary.

Yes, that needs to be fixed too... and then this other path looks buggy
too:
- blk_run_queue (takes queue_lock with irqsave) -> scsi_request_fn
(unlocks it with _irq)

^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore 2016-08-09 17:48 ` Mike Christie 2016-08-10 12:46 ` Jitendra Bhivare @ 2016-08-11 5:42 ` Jitendra Bhivare 2016-08-12 7:55 ` Jitendra Bhivare 2016-08-12 8:05 ` Jitendra Bhivare 3 siblings, 0 replies; 11+ messages in thread From: Jitendra Bhivare @ 2016-08-11 5:42 UTC (permalink / raw) To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi > -----Original Message----- > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > Sent: Wednesday, August 10, 2016 6:16 PM > To: 'Mike Christie'; 'Martin K. Petersen' > Cc: 'linux-scsi@vger.kernel.org' > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore > > > -----Original Message----- > > From: Mike Christie [mailto:mchristi@redhat.com] > > Sent: Tuesday, August 09, 2016 11:19 PM > > To: Jitendra Bhivare; Martin K. Petersen > > Cc: linux-scsi@vger.kernel.org > > Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > _irqsave/irqrestore > > > > On 08/09/2016 03:29 AM, Jitendra Bhivare wrote: > > >> -----Original Message----- > > >> From: Martin K. Petersen [mailto:martin.petersen@oracle.com] > > >> Sent: Tuesday, August 09, 2016 6:35 AM > > >> To: Mike Christie > > >> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org > > >> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > _irqsave/irqrestore > > >> > > >>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes: > > >> > > >>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used > > >>>> for protecting SGLs and WRBs. _bh versions are needed as the > > >>>> function gets invoked in process context and BLOCK_IOPOLL softirq. > > >>>> > > >>>> In spin_unlock_bh, after releasing the lock and enabling BH, > > >>>> do_softirq is called which executes till last SOFTIRQ. > > >>>> > > >>>> beiscsi_alloc_pdu is called under session lock. Through block > > >>>> layer, iSCSI stack in some cases send IOs with interrupts disabled. 
> > >>>> In such paths,
> > >>
> > >> Mike> What path is this? Is this with mq enabled or disabled?
> > >>
> > >> Jitendra?
> > >>
> > >> --
> > >> Martin K. Petersen    Oracle Linux Engineering
> > > [JB] Sorry for the delayed response, there was some issue with my
> > > mail client.
> > > There are paths in the block layer where IRQs are disabled with the
> > > request_queue queue_lock.
> > > - blk_timeout_work : this triggers NOP-OUT thru'
> > > iscsi_eh_cmd_timed_out.
> > > - blk_execute_rq_nowait
> >
> > I am not sure I understand the problem/solution. Why does the patch
> > only need to modify be2iscsi's locking? Why wouldn't you need to also
> > change the libiscsi session lock locking? You don't need irqs to be
> > running right? You are just doing this so the locking calls (irq vs bh)
> > match up?
> >
> > Don't you also need to fix the non-mq IO path. scsi_request_fn needs a
> > fix to use spin_lock_irqsave/spin_unlock_irqrestore because
> > blk_execute_rq_nowait already disables them right?
> >
> [JB] The issue is a hard lockup seen on a CPU which is waiting for the
> session lock taken by another CPU processing do_softirq. This do_softirq
> is getting called from be2iscsi's spin_unlock_bh.
>
> The contention gets easily exposed with be2iscsi as its data structures
> are accessed in both process context and the IRQ_POLL softirq.
>
> iscsi_eh_cmd_timed_out gets saved because it is using the base spin_lock
> version for the session locks (frwd and back).
> iscsi_queuecommand is using _bh, which is a bit scary.
>
> Yes, that needs to be fixed too... and then this other path looks buggy
> too -
> blk_run_queue (takes queue_lock with irqsave)
>  -> scsi_request_fn (unlocks it with _irq)
[JB] In this case, the other CPU executing do_softirq was also sending IO
under the session lock. When be2iscsi does spin_unlock_bh after getting
resources for the IO, unlock_bh finds a pending IRQ_POLL softirq.

The issue is more to do with the driver using _unlock_bh.
It is also causing a WARN_ON in unlock_bh wherever it is used with IRQs disabled.
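The WARN_ON referred to here comes from `__local_bh_enable_ip()` in kernel/softirq.c, which checks that hard IRQs are enabled before it re-enables bottom halves. A minimal userspace model of the check follows; the names mirror the kernel, but these are stubs, not the real implementation.

```c
#include <assert.h>

static int irqs_disabled_state;   /* 1 = hard IRQs off on this CPU */
static int warned;

/* Model of the check in __local_bh_enable_ip(): running do_softirq()
 * with hard IRQs disabled is forbidden, so the kernel warns. */
static void local_bh_enable_stub(void)
{
    if (irqs_disabled_state)
        warned = 1;               /* WARN_ON_ONCE(irqs_disabled()) */
}

/* scsi_times_out() -> iscsi_eh_cmd_timed_out() runs with the queue
 * lock held and IRQs off; any driver spin_unlock_bh() beneath it
 * therefore hits the warning. */
static int timeout_path_with_bh_unlock(void)
{
    irqs_disabled_state = 1;      /* blk timeout work, IRQs off   */
    local_bh_enable_stub();       /* tail of _raw_spin_unlock_bh() */
    irqs_disabled_state = 0;
    return warned;
}
```

This is why libiscsi's plain spin_lock for the session locks "gets saved" on the timeout path while the driver's _bh variants do not: only the _bh unlock tries to re-enable bottom halves.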
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore 2016-08-09 17:48 ` Mike Christie 2016-08-10 12:46 ` Jitendra Bhivare 2016-08-11 5:42 ` Jitendra Bhivare @ 2016-08-12 7:55 ` Jitendra Bhivare 2016-08-12 8:05 ` Jitendra Bhivare 3 siblings, 0 replies; 11+ messages in thread From: Jitendra Bhivare @ 2016-08-12 7:55 UTC (permalink / raw) To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi > -----Original Message----- > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > Sent: Thursday, August 11, 2016 11:12 AM > To: 'Mike Christie'; 'Martin K. Petersen' > Cc: 'linux-scsi@vger.kernel.org' > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore > > > > > -----Original Message----- > > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > > Sent: Wednesday, August 10, 2016 6:16 PM > > To: 'Mike Christie'; 'Martin K. Petersen' > > Cc: 'linux-scsi@vger.kernel.org' > > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with > > _irqsave/irqrestore > > > > > -----Original Message----- > > > From: Mike Christie [mailto:mchristi@redhat.com] > > > Sent: Tuesday, August 09, 2016 11:19 PM > > > To: Jitendra Bhivare; Martin K. Petersen > > > Cc: linux-scsi@vger.kernel.org > > > Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > _irqsave/irqrestore > > > > > > On 08/09/2016 03:29 AM, Jitendra Bhivare wrote: > > > >> -----Original Message----- > > > >> From: Martin K. Petersen [mailto:martin.petersen@oracle.com] > > > >> Sent: Tuesday, August 09, 2016 6:35 AM > > > >> To: Mike Christie > > > >> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org > > > >> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > > _irqsave/irqrestore > > > >> > > > >>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes: > > > >> > > > >>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used > > > >>>> for protecting SGLs and WRBs. 
_bh versions are needed as the > > > >>>> function gets invoked in process context and BLOCK_IOPOLL > > > >>>> softirq. > > > >>>> > > > >>>> In spin_unlock_bh, after releasing the lock and enabling BH, > > > >>>> do_softirq is called which executes till last SOFTIRQ. > > > >>>> > > > >>>> beiscsi_alloc_pdu is called under session lock. Through block > > > >>>> layer, iSCSI stack in some cases send IOs with interrupts > > > >>>> disabled. > > > >>>> In such paths, > > > >> > > > >> > > > >> Mike> What path is this? Is this with mq enabled or disabled? > > > >> > > > >> Jitendra? > > > >> > > > >> -- > > > >> Martin K. Petersen Oracle Linux Engineering > > > > [JB] Sorry for the delayed response, there was some issue with my > > > > mail client. > > > > There are paths block layer where IRQs are disabled with > > > > request_queue queue_lock. > > > > - blk_timeout_work : this triggers NOP-OUT thru' > > > > iscsi_eh_cmd_timed_out. > > > > - blk_execute_rq_nowait > > > > > > I am not sure I understand the problem/solution. Why does the patch > > > only need to modify be2iscsi's locking? Why wouldn't you need to > > > also change the libiscsi session lock locking? You don't need irqs > > > to be running right? You are just doing this so the locking calls > > > (irq vs bh) matches > > up? > > > > > > Don't you also need to fix the non-mq IO path. scsi_request_fn needs > > > a fix to use spin_lock_irqsave/spin_unlock_irqrestore because > > > blk_execute_rq_nowait already disables them right? > > > > > [JB] The issue is hard lockup seen on a CPU which is waiting for > > session lock taken by another CPU processing do_softirq. This > > do_softirq is getting called from be2iscsi spin_unlock_bh. > > > > The contention gets easily exposed with be2iscsi as it data structures > > are accessed in process context and IRQ_POLL softirq. > > > > iscsi_eh_cmd_timed_out gets saved because it is using base spin_lock > > version for session lock (frwd and back). 
> > iscsi_queuecommand is using _bh, which is a bit scary.
> >
> > Yes, that needs to be fixed too... and then this other path looks
> > buggy too -
> > blk_run_queue (takes queue_lock with irqsave)
> >  -> scsi_request_fn (unlocks it with _irq)
> [JB] In this case, the other CPU executing do_softirq was also sending IO
> under the session lock.
> When be2iscsi does spin_unlock_bh after getting resources for the IO,
> unlock_bh finds a pending IRQ_POLL softirq.
>
> The issue is more to do with the driver using _unlock_bh.
> It is also causing a WARN_ON in unlock_bh wherever it is used with IRQs
> disabled.
[JB] My apologies. blk_execute_rq_nowait and the call trace shown earlier are
from an older kernel. The only valid trace with the latest kernel is:

[  180.957062] <IRQ> [<ffffffff81603f36>] dump_stack+0x19/0x1b
[  180.957070] [<ffffffff8106e28b>] warn_slowpath_common+0x6b/0xb0
[  180.957072] [<ffffffff8106e3da>] warn_slowpath_null+0x1a/0x20
[  180.957073] [<ffffffff81076d1a>] local_bh_enable_ip+0x7a/0xa0
[  180.957077] [<ffffffff8160b0eb>] _raw_spin_unlock_bh+0x1b/0x40
[  180.957084] [<ffffffffa037f753>] alloc_mgmt_sgl_handle+0x83/0xe0 [be2iscsi]
[  180.957089] [<ffffffffa0388d9e>] beiscsi_alloc_pdu+0x21e/0x5d0 [be2iscsi]
[  180.957094] [<ffffffffa01bec71>] __iscsi_conn_send_pdu+0x101/0x2e0 [libiscsi]
[  180.957099] [<ffffffffa01bef78>] iscsi_send_nopout+0xb8/0x100 [libiscsi]
[  180.957104] [<ffffffffa01bfca4>] iscsi_eh_cmd_timed_out+0x2a4/0x2d0 [libiscsi]
[  180.957107] [<ffffffff813f56de>] scsi_times_out+0x5e/0x320
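The patch's remedy is mechanical: switch the driver's SGL/WRB locks to the _irqsave/_irqrestore variants, which save and restore the caller's IRQ state instead of running softirqs on unlock, and are therefore safe whether the caller has IRQs on or off. A userspace sketch of the state handling follows; the `*_stub` functions and `alloc_path_fixed` are illustrative stand-ins, not the real kernel or be2iscsi code.

```c
#include <assert.h>

static int irq_on = 1;  /* 1 = hard IRQs enabled on this CPU */

/* spin_lock_irqsave(): remember the caller's IRQ state, then
 * disable IRQs for the critical section. */
static void lock_irqsave_stub(unsigned long *flags)
{
    *flags = (unsigned long)irq_on;
    irq_on = 0;
}

/* spin_unlock_irqrestore(): put the IRQ state back exactly as it
 * was -- no do_softirq() call, so it is safe in any context. */
static void unlock_irqrestore_stub(unsigned long flags)
{
    irq_on = (int)flags;
}

/* Hypothetical sketch of the fixed allocation path: correct both
 * when called from process context (IRQs on) and from the blk
 * timeout path (IRQs already off). */
static int alloc_path_fixed(void)
{
    unsigned long flags;

    lock_irqsave_stub(&flags);      /* was: spin_lock_bh()   */
    /* ... carve out SGL/WRB resources ... */
    unlock_irqrestore_stub(flags);  /* was: spin_unlock_bh() */
    return irq_on;
}
```

Called with IRQs enabled, the path leaves them enabled; called from a context that already disabled IRQs, it leaves them disabled, so nothing in the timeout path warns or runs softirqs under the session lock.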
* RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore 2016-08-09 17:48 ` Mike Christie ` (2 preceding siblings ...) 2016-08-12 7:55 ` Jitendra Bhivare @ 2016-08-12 8:05 ` Jitendra Bhivare 3 siblings, 0 replies; 11+ messages in thread From: Jitendra Bhivare @ 2016-08-12 8:05 UTC (permalink / raw) To: Mike Christie, Martin K. Petersen; +Cc: linux-scsi > -----Original Message----- > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > Sent: Friday, August 12, 2016 1:26 PM > To: 'Mike Christie'; 'Martin K. Petersen' > Cc: 'linux-scsi@vger.kernel.org' > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore > > > > > -----Original Message----- > > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > > Sent: Thursday, August 11, 2016 11:12 AM > > To: 'Mike Christie'; 'Martin K. Petersen' > > Cc: 'linux-scsi@vger.kernel.org' > > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with > > _irqsave/irqrestore > > > > > > > > > -----Original Message----- > > > From: Jitendra Bhivare [mailto:jitendra.bhivare@broadcom.com] > > > Sent: Wednesday, August 10, 2016 6:16 PM > > > To: 'Mike Christie'; 'Martin K. Petersen' > > > Cc: 'linux-scsi@vger.kernel.org' > > > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with > > > _irqsave/irqrestore > > > > > > > -----Original Message----- > > > > From: Mike Christie [mailto:mchristi@redhat.com] > > > > Sent: Tuesday, August 09, 2016 11:19 PM > > > > To: Jitendra Bhivare; Martin K. Petersen > > > > Cc: linux-scsi@vger.kernel.org > > > > Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > > _irqsave/irqrestore > > > > > > > > On 08/09/2016 03:29 AM, Jitendra Bhivare wrote: > > > > >> -----Original Message----- > > > > >> From: Martin K. 
Petersen [mailto:martin.petersen@oracle.com] > > > > >> Sent: Tuesday, August 09, 2016 6:35 AM > > > > >> To: Mike Christie > > > > >> Cc: Jitendra Bhivare; linux-scsi@vger.kernel.org > > > > >> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > > > _irqsave/irqrestore > > > > >> > > > > >>>>>>> "Mike" == Mike Christie <mchristi@redhat.com> writes: > > > > >> > > > > >>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being > > > > >>>> used for protecting SGLs and WRBs. _bh versions are needed as > > > > >>>> the function gets invoked in process context and BLOCK_IOPOLL > softirq. > > > > >>>> > > > > >>>> In spin_unlock_bh, after releasing the lock and enabling BH, > > > > >>>> do_softirq is called which executes till last SOFTIRQ. > > > > >>>> > > > > >>>> beiscsi_alloc_pdu is called under session lock. Through block > > > > >>>> layer, iSCSI stack in some cases send IOs with interrupts > > > > >>>> disabled. > > > > >>>> In such paths, > > > > >> > > > > >> > > > > >> Mike> What path is this? Is this with mq enabled or disabled? > > > > >> > > > > >> Jitendra? > > > > >> > > > > >> -- > > > > >> Martin K. Petersen Oracle Linux Engineering > > > > > [JB] Sorry for the delayed response, there was some issue with > > > > > my mail client. > > > > > There are paths block layer where IRQs are disabled with > > > > > request_queue queue_lock. > > > > > - blk_timeout_work : this triggers NOP-OUT thru' > iscsi_eh_cmd_timed_out. > > > > > - blk_execute_rq_nowait > > > > > > > > I am not sure I understand the problem/solution. Why does the > > > > patch only need to modify be2iscsi's locking? Why wouldn't you > > > > need to also change the libiscsi session lock locking? You don't > > > > need irqs to be running right? You are just doing this so the > > > > locking calls (irq vs bh) matches > > > up? > > > > > > > > Don't you also need to fix the non-mq IO path. 
scsi_request_fn > > > > needs a fix to use spin_lock_irqsave/spin_unlock_irqrestore > > > > because blk_execute_rq_nowait already disables them right? > > > > > > > [JB] The issue is hard lockup seen on a CPU which is waiting for > > > session lock taken by another CPU processing do_softirq. This > > > do_softirq is getting called from be2iscsi spin_unlock_bh. > > > > > > The contention gets easily exposed with be2iscsi as it data > > > structures are accessed in process context and IRQ_POLL softirq. > > > > > > iscsi_eh_cmd_timed_out gets saved because it is using base spin_lock > > > version for session lock (frwd and back). > > > iscsi_queuecommand is using _bh which is bit scary. > > > > > > Yes, that needs to be fixed too... and then this other path looks > > > buggy too - blk_run_queue (takes queue_lock with irqsave) > > > ->scsi_request_fn (unlocks it with > > > _irq) > > [JB] In this case, the other CPU executing do_softirq was too sending > > IO under session lock. > > When be2iscsi does spin_unlock_bh after getting resources for the IO, > > unlock_bh finds pending IRQ_POLL softirq. > > > > The issue is more to do with driver using _unlock_bh. > > It is causing a WARN_ON too in unlock_bh where ever used for disabled > > IRQs. > [JB] My apologies. blk_execute_rq_nowait and the call trace shown is with > older > kernel. 
> The only valid trace with the latest kernel is:
> [  180.957062] <IRQ> [<ffffffff81603f36>] dump_stack+0x19/0x1b
> [  180.957070] [<ffffffff8106e28b>] warn_slowpath_common+0x6b/0xb0
> [  180.957072] [<ffffffff8106e3da>] warn_slowpath_null+0x1a/0x20
> [  180.957073] [<ffffffff81076d1a>] local_bh_enable_ip+0x7a/0xa0
> [  180.957077] [<ffffffff8160b0eb>] _raw_spin_unlock_bh+0x1b/0x40
> [  180.957084] [<ffffffffa037f753>] alloc_mgmt_sgl_handle+0x83/0xe0 [be2iscsi]
> [  180.957089] [<ffffffffa0388d9e>] beiscsi_alloc_pdu+0x21e/0x5d0 [be2iscsi]
> [  180.957094] [<ffffffffa01bec71>] __iscsi_conn_send_pdu+0x101/0x2e0 [libiscsi]
> [  180.957099] [<ffffffffa01bef78>] iscsi_send_nopout+0xb8/0x100 [libiscsi]
> [  180.957104] [<ffffffffa01bfca4>] iscsi_eh_cmd_timed_out+0x2a4/0x2d0 [libiscsi]
> [  180.957107] [<ffffffff813f56de>] scsi_times_out+0x5e/0x320
[JB] The lockup scenario explained earlier, though likely a soft lockup rather
than a hard one, can still happen, hence the change.
end of thread, other threads: [~2016-09-27 10:18 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2016-09-23 15:08 [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore Jitendra Bhivare
-- strict thread matches above, loose matches on Subject: below --
2016-09-27 10:18 Jitendra Bhivare
2016-07-21  9:07 [PATCH 00/28] be2iscsi: driver update 11.2.0.0 Jitendra Bhivare
2016-07-21  9:07 ` [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore Jitendra Bhivare
2016-08-05 18:38 ` Mike Christie
2016-08-09  1:05 ` Martin K. Petersen
2016-08-09  8:29 ` Jitendra Bhivare
2016-08-09 17:48 ` Mike Christie
2016-08-10 12:46 ` Jitendra Bhivare
2016-08-11  5:42 ` Jitendra Bhivare
2016-08-12  7:55 ` Jitendra Bhivare
2016-08-12  8:05 ` Jitendra Bhivare