linux-scsi.vger.kernel.org archive mirror
* [PATCH] bnx2fc: Set no_async_abort to 1 in SCSI host template.
From: Chad Dupuis @ 2014-08-05 19:45 UTC (permalink / raw)
  To: linux-scsi, hch; +Cc: giridhar.malavali

Set this to 1 for now as we've observed crashes when this is set to the default
value of 0.

Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
---
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index 785d0d7..4882ad6 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -2789,6 +2789,7 @@ static struct scsi_host_template bnx2fc_shost_template = {
 	.use_clustering		= ENABLE_CLUSTERING,
 	.sg_tablesize		= BNX2FC_MAX_BDS_PER_CMD,
 	.max_sectors		= 1024,
+	.no_async_abort         = 1,
 };
 
 static struct libfc_function_template bnx2fc_libfc_fcn_templ = {
-- 
1.7.1
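
For readers unfamiliar with the flag, here is a minimal userspace sketch of the decision that no_async_abort influences when a command times out. It is a model only, not kernel code: struct host_template_model, enum timeout_path, model_timeout_path() and the two example templates are hypothetical names made up for illustration. It assumes the midlayer of this era first tries to schedule an asynchronous abort and falls back to the host's error handler when the flag is set or the abort cannot be scheduled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scsi_host_template; only the flag matters. */
struct host_template_model {
	const char *name;
	unsigned no_async_abort:1;	/* 1 = do not attempt asynchronous aborts */
};

/* Hypothetical labels for the two ways a timed-out command can be handled. */
enum timeout_path {
	PATH_ASYNC_ABORT,	/* abort work is scheduled for just this command */
	PATH_ERROR_HANDLER,	/* command is handed to the host error handler */
};

/*
 * Model of the timeout decision (an assumption about the midlayer, not a
 * copy of it): the asynchronous abort is used only when the template allows
 * it AND scheduling the abort succeeded; otherwise error handling takes over.
 */
enum timeout_path model_timeout_path(const struct host_template_model *tmpl,
				     bool abort_scheduled_ok)
{
	assert(tmpl != NULL);

	if (!tmpl->no_async_abort && abort_scheduled_ok)
		return PATH_ASYNC_ABORT;

	return PATH_ERROR_HANDLER;
}

/* Default behaviour: flag left at 0. */
const struct host_template_model model_default = {
	.name = "default",
	.no_async_abort = 0,
};

/* Behaviour with the patch above applied: flag set to 1. */
const struct host_template_model model_patched = {
	.name = "bnx2fc (patched)",
	.no_async_abort = 1,
};
```

Under this model, model_timeout_path(&model_patched, true) always yields PATH_ERROR_HANDLER, which is the effect the one-line change is after: the driver never enters the asynchronous abort path in which the crash was observed.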



* Re: [PATCH] bnx2fc: Set no_async_abort to 1 in SCSI host template.
From: Venkatesh Srinivas @ 2014-08-05 20:38 UTC (permalink / raw)
  To: Chad Dupuis; +Cc: linux-scsi, hch, giridhar.malavali

On Tue, Aug 5, 2014 at 12:45 PM, Chad Dupuis <chad.dupuis@qlogic.com> wrote:
> Set this to 1 for now as we've observed crashes when this is set to the default
> value of 0.

What sorts of crashes have you seen with no_async_abort=0 (default)?
At which kernel versions?

Thanks,
-- vs;


* Re: [PATCH] bnx2fc: Set no_async_abort to 1 in SCSI host template.
From: Chad Dupuis @ 2014-08-05 21:02 UTC (permalink / raw)
  To: Venkatesh Srinivas; +Cc: linux-scsi, hch, giridhar.malavali



On Tue, 5 Aug 2014, Venkatesh Srinivas wrote:

> On Tue, Aug 5, 2014 at 12:45 PM, Chad Dupuis <chad.dupuis@qlogic.com> wrote:
>> Set this to 1 for now as we've observed crashes when this is set to the default
>> value of 0.
>
> What sorts of crashes have you seen with no_async_abort=0 (default)?
> At which kernel versions?

We've seen a crash where we use a request that has already been used 
elsewhere when requeuing a SCSI command:

May  1 15:50:48 sl-b109 kernel: ------------[ cut here ]------------
May  1 15:50:48 sl-b109 kernel: kernel BUG at block/blk-core.c:1217!
May  1 15:50:48 sl-b109 kernel: invalid opcode: 0000 [#1] SMP
May  1 15:50:48 sl-b109 kernel: Modules linked in: sg ebtable_nat ebtables 
nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ipt_REJECT 
ip6t_REJECT nf_conntrack_ipv4 nf_conntrack_ipv6 nf_defrag_ipv4 
nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter iptable_filter 
ip_tables ip6_tables dm_round_robin iTCO_wdt iTCO_vendor_support coretemp 
kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel 
ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper 
cryptd pcspkr serio_raw sb_edac edac_core lpc_ich mfd_core hpilo hpwdt 
ioatdma ntb ipmi_si ipmi_msghandler video acpi_power_meter shpchp 
pcc_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd sunrpc uinput xfs 
dm_service_time 8021q garp stp llc mrp sd_mod crc_t10dif crct10dif_common 
mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper ttm igb drm bnx2x
May  1 15:50:48 sl-b109 kernel: ptp pps_core dca i2c_algo_bit i2c_core 
hpsa mdio libcrc32c dm_mirror dm_region_hash dm_log bnx2fc cnic uio fcoe 
libfcoe libfc scsi_transport_fc scsi_tgt dm_multipath dm_mod
May  1 15:50:48 sl-b109 kernel: CPU: 3 PID: 515 Comm: bnx2fc_thread/3 Not tainted 3.10.0-121.el7.x86_64 #1
May  1 15:50:48 sl-b109 kernel: Hardware name: HP ProLiant BL460c Gen8, BIOS I31 12/20/2013
May  1 15:50:48 sl-b109 kernel: task: ffff880232345b00 ti: ffff880036b20000 task.ti: ffff880036b20000
May  1 15:50:48 sl-b109 kernel: RIP: 0010:[<ffffffff8129007d>] [<ffffffff8129007d>] blk_requeue_request+0x8d/0x90
May  1 15:50:48 sl-b109 kernel: RSP: 0018:ffff880237a63e68  EFLAGS: 00010097
May  1 15:50:48 sl-b109 kernel: RAX: ffff880432d9c660 RBX: ffff88043142c800 RCX: dead000000200200
May  1 15:50:48 sl-b109 kernel: RDX: ffff880432d9c660 RSI: ffff880432d9c510 RDI: ffff880432d9c660
May  1 15:50:48 sl-b109 kernel: RBP: ffff880237a63e80 R08: ffff880432d9c660 R09: 0000000000000001
May  1 15:50:48 sl-b109 kernel: R10: ffffffff819f4560 R11: 0000000000001000 R12: ffff880432d9c510
May  1 15:50:48 sl-b109 kernel: R13: ffff880432b9ad00 R14: ffff88021222cc40 R15: ffff880410ba0c28
May  1 15:50:48 sl-b109 kernel: FS:  0000000000000000(0000) GS:ffff880237a60000(0000) knlGS:0000000000000000
May  1 15:50:48 sl-b109 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May  1 15:50:48 sl-b109 kernel: CR2: 00000000020ad4c0 CR3: 000000042b166000 CR4: 00000000000407e0
May  1 15:50:48 sl-b109 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May  1 15:50:48 sl-b109 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May  1 15:50:48 sl-b109 kernel: Stack:
May  1 15:50:48 sl-b109 kernel: ffff88043142c800 ffff880432b9ad00 0000000000000202 ffff880237a63ec8
May  1 15:50:48 sl-b109 kernel: ffffffff813e73c8 ffff880237a63ea0 ffffffff8128f85c ffff88021222cc40
May  1 15:50:48 sl-b109 kernel: 0000000000002001 000000000002bf20 0000000000000006 0000000000000001
May  1 15:50:48 sl-b109 kernel: Call Trace:
May  1 15:50:48 sl-b109 kernel: <IRQ>
May  1 15:50:48 sl-b109 kernel:
May  1 15:50:48 sl-b109 kernel: [<ffffffff813e73c8>] __scsi_queue_insert+0x98/0x120
May  1 15:50:48 sl-b109 kernel: [<ffffffff8128f85c>] ? blk_run_queue_async+0x3c/0x40
May  1 15:50:48 sl-b109 kernel: [<ffffffff813e7542>] scsi_softirq_done+0xd2/0x160
May  1 15:50:48 sl-b109 kernel: [<ffffffff81299b80>] blk_done_softirq+0x90/0xc0
May  1 15:50:48 sl-b109 kernel: [<ffffffff81067047>] __do_softirq+0xf7/0x290
May  1 15:50:48 sl-b109 kernel: [<ffffffff815fe15c>] call_softirq+0x1c/0x30
May  1 15:50:48 sl-b109 kernel: <EOI>
May  1 15:50:48 sl-b109 kernel:
May  1 15:50:48 sl-b109 kernel: [<ffffffff81014d25>] do_softirq+0x55/0x90
May  1 15:50:48 sl-b109 kernel: [<ffffffff81066564>] local_bh_enable_ip+0x94/0xa0
May  1 15:50:48 sl-b109 kernel: [<ffffffff815f378b>] _raw_spin_unlock_bh+0x1b/0x40
May  1 15:50:48 sl-b109 kernel: [<ffffffffa0097f58>] bnx2fc_process_cq_compl+0xf8/0x280 [bnx2fc]
May  1 15:50:48 sl-b109 kernel: [<ffffffffa0092e16>] bnx2fc_percpu_io_thread+0x116/0x1a0 [bnx2fc]
May  1 15:50:48 sl-b109 kernel: [<ffffffffa0092d00>] ? bnx2fc_get_src_mac+0x20/0x20 [bnx2fc]

It's possible that this is related to:
http://marc.info/?l=linux-scsi&m=140266091913551&w=2

However, we observed no crashes with no_async_abort set to 1, so for now we 
want to set it in our host template.

> Thanks,
> -- vs;


* Re: [PATCH] bnx2fc: Set no_async_abort to 1 in SCSI host template.
From: Christoph Hellwig @ 2014-08-06 19:17 UTC (permalink / raw)
  To: Chad Dupuis; +Cc: Venkatesh Srinivas, linux-scsi, hch, giridhar.malavali

On Tue, Aug 05, 2014 at 05:02:46PM -0400, Chad Dupuis wrote:
>
>
> On Tue, 5 Aug 2014, Venkatesh Srinivas wrote:
>
>> On Tue, Aug 5, 2014 at 12:45 PM, Chad Dupuis <chad.dupuis@qlogic.com> wrote:
>>> Set this to 1 for now as we've observed crashes when this is set to the default
>>> value of 0.
>>
>> What sorts of crashes have you seen with no_async_abort=0 (default)?
>> At which kernel versions?
>
> We've seen a crash where we use a request that has already been used 
> elsewhere when requeuing a SCSI command:

What kernel version is this one?  The last issues we saw in this area
were before we pulled fixes for these scenarios in:

d555a2abf3481f81303d835046a5ec2c4fb3ca8e
644373a4219add42123df69c8b7ce6a918475ccd
7daf480483e60898f30e0a2a84fecada7a7cfac0
c69e6f812bab0d5442b40e2f1bfbca48d40bc50b
8922a908908ff947f1f211e07e2e97f1169ad3cb
a33c070bced8b283e22e8dbae35177a033b810bf



* Re: [PATCH] bnx2fc: Set no_async_abort to 1 in SCSI host template.
From: Chad Dupuis @ 2014-08-06 20:01 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Venkatesh Srinivas, linux-scsi, giridhar.malavali



On Wed, 6 Aug 2014, Christoph Hellwig wrote:

> On Tue, Aug 05, 2014 at 05:02:46PM -0400, Chad Dupuis wrote:
>>
>>
>> On Tue, 5 Aug 2014, Venkatesh Srinivas wrote:
>>
>>> On Tue, Aug 5, 2014 at 12:45 PM, Chad Dupuis <chad.dupuis@qlogic.com> wrote:
>>>> Set this to 1 for now as we've observed crashes when this is set to the default
>>>> value of 0.
>>>
>>> What sorts of crashes have you seen with no_async_abort=0 (default)?
>>> At which kernel versions?
>>
>> We've seen a crash where we use a request that has already been used
>> elsewhere when requeuing a SCSI command:
>
> What kernel version is this one?  The last issues we saw in this area
> were before we pulled fixes for these scenarios in:
>
> d555a2abf3481f81303d835046a5ec2c4fb3ca8e
> 644373a4219add42123df69c8b7ce6a918475ccd
> 7daf480483e60898f30e0a2a84fecada7a7cfac0
> c69e6f812bab0d5442b40e2f1bfbca48d40bc50b
> 8922a908908ff947f1f211e07e2e97f1169ad3cb
> a33c070bced8b283e22e8dbae35177a033b810bf
>
>

This was on RHEL 7.  Should we retry the test case with the latest 
mainline?


* Re: [PATCH] bnx2fc: Set no_async_abort to 1 in SCSI host template.
From: Christoph Hellwig @ 2014-08-07  6:43 UTC (permalink / raw)
  To: Chad Dupuis
  Cc: Christoph Hellwig, Venkatesh Srinivas, linux-scsi,
	giridhar.malavali

On Wed, Aug 06, 2014 at 04:01:12PM -0400, Chad Dupuis wrote:
> This was on RHEL 7.  Should we retry the test case with the latest 
> mainline?

Yes.  Please always test your patches against latest mainline.  If that is
for some reason not possible please at least very prominently mention that
the issue was reported against a back-level kernel.

