* nvme-rdma and rdma comp vector affinity problem
@ 2018-07-09 19:25 Steve Wise
  2018-07-12 15:10 ` Steve Wise
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Wise @ 2018-07-09 19:25 UTC (permalink / raw)


Hey Sagi,

I'm adding cxgb4 support for ib_get_vector_affinity(), and I see an
error when connecting via nvme-rdma for certain affinity settings for my
comp vectors.  The error I see is:

[root@stevo1 linux]# nvme connect-all -t rdma -a 172.16.2.1
Failed to write to /dev/nvme-fabrics: Invalid cross-device link

And this gets logged:

[  590.357506] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[  590.364730] nvme nvme0: failed to connect queue: 2 ret=-18

The EXDEV error is being returned by blk_mq_alloc_request_hctx() because
blk_mq_hw_queue_mapped() fails.  This only happens when I set up my vector
affinity such that there is overlap, i.e., if two comp vectors are set up
to the same CPU, then I see this failure.  If they are all mapped each to
their own CPU, then it works.  I added some debug in my cxgb4
get_comp_vector_affinity(), and a WARN_ONCE() in
blk_mq_alloc_request_hctx(); the output is below.
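
For reference, this is roughly where that check and my WARN_ONCE() sit; it
is paraphrased from memory of the 4.18-era blk-mq code plus my debug, not
an exact copy, so treat the surrounding details as approximate:

struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
	unsigned int op, blk_mq_req_flags_t flags, unsigned int hctx_idx)
{
	/* ... tag-set checks and blk_queue_enter() elided ... */

	/*
	 * Check whether the hardware context is actually mapped to any CPU.
	 * If not, tell the caller to skip this queue.
	 */
	alloc_data.hctx = q->queue_hw_ctx[hctx_idx];
	if (!blk_mq_hw_queue_mapped(alloc_data.hctx)) {
		WARN_ONCE(1, "blk_mq_alloc_request_hctx hw_queue not mapped!\n"); /* my debug */
		blk_queue_exit(q);
		return ERR_PTR(-EXDEV); /* surfaces as "Invalid cross-device link" */
	}

	/* ... normal request allocation continues ... */
}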

I would think that the vector affinity shouldn't cause connection
failures.  Any ideas?  Thanks!

Steve.

[  433.528743] nvmet: creating controller 1 for subsystem
nqn.2014-08.org.nvmexpress.discovery for NQN
nqn.2014-08.org.nvmexpress:uuid:228c41cb-86c1-4aca-8a10-6e8d8c7998a0.
[  433.545267] nvme nvme0: new ctrl: NQN
"nqn.2014-08.org.nvmexpress.discovery", addr 172.16.2.1:4420
[  433.554972] nvme nvme0: Removing ctrl: NQN
"nqn.2014-08.org.nvmexpress.discovery"
[  433.604610] nvmet: creating controller 1 for subsystem nvme-nullb0
for NQN
nqn.2014-08.org.nvmexpress:uuid:228c41cb-86c1-4aca-8a10-6e8d8c7998a0.
[  433.619048] nvme nvme0: creating 16 I/O queues.
[  433.643746] iw_cxgb4: comp_vector 0, irq 217 mask 0x100
[  433.649630] iw_cxgb4: comp_vector 1, irq 218 mask 0x200
[  433.655501] iw_cxgb4: comp_vector 2, irq 219 mask 0x400
[  433.661379] iw_cxgb4: comp_vector 3, irq 220 mask 0x800
[  433.667243] iw_cxgb4: comp_vector 4, irq 221 mask 0x1000
[  433.673179] iw_cxgb4: comp_vector 5, irq 222 mask 0x2000
[  433.679110] iw_cxgb4: comp_vector 6, irq 223 mask 0x4000
[  433.685020] iw_cxgb4: comp_vector 7, irq 224 mask 0x8000
[  433.690928] iw_cxgb4: comp_vector 8, irq 225 mask 0x100
[  433.696736] iw_cxgb4: comp_vector 9, irq 226 mask 0x200
[  433.702531] iw_cxgb4: comp_vector 10, irq 227 mask 0x400
[  433.708401] iw_cxgb4: comp_vector 11, irq 228 mask 0x800
[  433.714277] iw_cxgb4: comp_vector 12, irq 229 mask 0x1000
[  433.720208] iw_cxgb4: comp_vector 13, irq 230 mask 0x2000
[  433.726138] iw_cxgb4: comp_vector 14, irq 231 mask 0x4000
[  433.732051] iw_cxgb4: comp_vector 15, irq 232 mask 0x8000
[  433.739894] ------------[ cut here ]------------
[  433.745026] blk_mq_alloc_request_hctx hw_queue not mapped!
[  433.751030] WARNING: CPU: 6 PID: 9950 at block/blk-mq.c:454
blk_mq_alloc_request_hctx+0x163/0x180
[  433.760396] Modules linked in: nvmet_rdma nvmet null_blk nvme_rdma
nvme_fabrics nvme_core mlx5_ib mlx5_core mlxfw rdma_ucm ib_uverbs
iw_cxgb4 rdma_cm iw_cm ib_cm ib_core cxgb4 iscsi_target_mod libiscsi
scsi_transport_iscsi target_core_mod libcxgb vfat fat intel_rapl sb_edac
x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel
crypto_simd cryptd glue_helper iTCO_wdt iTCO_vendor_support mxm_wmi
pcspkr joydev mei_me devlink ipmi_si sg mei i2c_i801 ipmi_devintf
lpc_ich ioatdma ipmi_msghandler wmi nfsd auth_rpcgss nfs_acl lockd grace
sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm isci libsas igb
ahci scsi_transport_sas libahci libata crc32c_intel dca i2c_algo_bit
[  433.835278]  i2c_core [last unloaded: mlxfw]
[  433.840150] CPU: 6 PID: 9950 Comm: nvme Kdump: loaded Tainted:
G        W         4.18.0-rc1+ #131
[  433.849714] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a
07/09/2015
[  433.857301] RIP: 0010:blk_mq_alloc_request_hctx+0x163/0x180
[  433.863493] Code: 0f 0b 48 c7 c0 ea ff ff ff e9 1a ff ff ff 48 c7 c6
e0 34 c8 bd 48 c7 c7 bb e4 ea bd 31 c0 c6 05 bc d1 e8 00 01 e8 bd 96 d1
ff <0f> 0b 48 c7 c0 ee ff ff ff e9 f0 fe ff ff 0f 1f 44 00 00 66 2e 0f
[  433.883625] RSP: 0018:ffffab7f4790bba8 EFLAGS: 00010286
[  433.889481] RAX: 0000000000000000 RBX: ffff918412ab9360 RCX:
0000000000000000
[  433.897252] RDX: 0000000000000001 RSI: ffff91841fd96978 RDI:
ffff91841fd96978
[  433.905014] RBP: 0000000000000001 R08: 0000000000000000 R09:
000000000000057d
[  433.912782] R10: 00000000000003ff R11: 0000000000aaaaaa R12:
0000000000000023
[  433.920555] R13: ffffab7f4790bc50 R14: 0000000000000400 R15:
0000000000000000
[  433.928325] FS:  00007f54566d6780(0000) GS:ffff91841fd80000(0000)
knlGS:0000000000000000
[  433.937040] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  433.943418] CR2: 00007f5456000610 CR3: 0000000858f58003 CR4:
00000000000606e0
[  433.951178] Call Trace:
[  433.954241]  nvme_alloc_request+0x36/0x80 [nvme_core]
[  433.959891]  __nvme_submit_sync_cmd+0x2b/0xd0 [nvme_core]
[  433.965884]  nvmf_connect_io_queue+0x10e/0x170 [nvme_fabrics]
[  433.972215]  nvme_rdma_start_queue+0x21/0x80 [nvme_rdma]
[  433.978100]  nvme_rdma_configure_io_queues+0x196/0x280 [nvme_rdma]
[  433.984846]  nvme_rdma_create_ctrl+0x4f9/0x640 [nvme_rdma]
[  433.990901]  nvmf_dev_write+0x954/0xaf8 [nvme_fabrics]
[  433.996614]  __vfs_write+0x33/0x190
[  434.000681]  ? list_lru_add+0x97/0x140
[  434.005015]  ? __audit_syscall_entry+0xd7/0x160
[  434.010135]  vfs_write+0xad/0x1a0
[  434.014039]  ksys_write+0x52/0xc0
[  434.017959]  do_syscall_64+0x55/0x180
[  434.022222]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  434.027880] RIP: 0033:0x7f5455fda840
[  434.032061] Code: 73 01 c3 48 8b 0d 48 26 2d 00 f7 d8 64 89 01 48 83
c8 ff c3 66 0f 1f 44 00 00 83 3d 3d 87 2d 00 00 75 10 b8 01 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ce c6 01 00 48 89 04 24
[  434.052217] RSP: 002b:00007ffc930111e8 EFLAGS: 00000246 ORIG_RAX:
0000000000000001
[  434.060449] RAX: ffffffffffffffda RBX: 0000000000000003 RCX:
00007f5455fda840
[  434.068266] RDX: 000000000000003d RSI: 00007ffc93012260 RDI:
0000000000000003
[  434.076088] RBP: 00007ffc93012260 R08: 00007f5455f39988 R09:
000000000000000d
[  434.083911] R10: 0000000000000004 R11: 0000000000000246 R12:
000000000000003d
[  434.091736] R13: 0000000000000003 R14: 0000000000000001 R15:
0000000000000001
[  434.099555] ---[ end trace 9f5bec6eef77fae9 ]---
[  434.104864] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[  434.112235] nvme nvme0: failed to connect queue: 2 ret=-18


* nvme-rdma and rdma comp vector affinity problem
  2018-07-09 19:25 nvme-rdma and rdma comp vector affinity problem Steve Wise
@ 2018-07-12 15:10 ` Steve Wise
  2018-07-15  7:58   ` Max Gurtovoy
  2018-07-16  6:51   ` Sagi Grimberg
  0 siblings, 2 replies; 6+ messages in thread
From: Steve Wise @ 2018-07-12 15:10 UTC (permalink / raw)


Hey Sagi and Christoph,

Do you all have any thoughts on this?  It seems like a bug in nvme-rdma
or the blk-mq code.  I can debug it further, if we agree this does look
like a bug...

Thanks,

Steve.


On 7/9/2018 2:25 PM, Steve Wise wrote:
> Hey Sagi,
>
> I'm adding cxgb4 support for ib_get_vector_affinity(), and I see an
> error when connecting via nvme-rdma for certain affinity settings for my
> comp vectors.  The error I see is:
>
> [root@stevo1 linux]# nvme connect-all -t rdma -a 172.16.2.1
> Failed to write to /dev/nvme-fabrics: Invalid cross-device link
>
> [... rest of the report and kernel log snipped; see the first message in
> this thread ...]


* nvme-rdma and rdma comp vector affinity problem
  2018-07-12 15:10 ` Steve Wise
@ 2018-07-15  7:58   ` Max Gurtovoy
  2018-07-16 14:34     ` Steve Wise
  2018-07-16  6:51   ` Sagi Grimberg
  1 sibling, 1 reply; 6+ messages in thread
From: Max Gurtovoy @ 2018-07-15  7:58 UTC (permalink / raw)


Hi Steve,
I didn't go through the cxgb4 implementation, but did you implement the
needed callback, or are you calling the blk mapping function (i.e., falling
back to blk_mq_map_queues())?
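
To make the question concrete, by the "needed callback" I mean a
driver-side .get_vector_affinity hook, roughly like the hypothetical
sketch below (the real cxgb4 change may look different, and the IRQ
lookup helper here is made up for illustration):

static const struct cpumask *
c4iw_get_vector_affinity(struct ib_device *ibdev, int comp_vector)
{
	struct c4iw_dev *dev = to_c4iw_dev(ibdev);
	/* hypothetical helper: find the MSI-X IRQ backing this comp vector */
	int irq = c4iw_comp_vector_to_irq(dev, comp_vector);

	return irq_get_affinity_mask(irq);
}

/* registered on the ib_device at probe time, e.g.:
 *	dev->ibdev.get_vector_affinity = c4iw_get_vector_affinity;
 */

If the hook is absent (or returns NULL), ib_get_vector_affinity() returns
NULL and the RDMA queue mapping falls back to blk_mq_map_queues().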

Regards,
-Max.

On 7/12/2018 6:10 PM, Steve Wise wrote:
> Hey Sagi and Christoph,
> 
> Do you all have any thoughts on this?  It seems like a bug in nvme-rdma
> or the blk-mq code.  I can debug it further, if we agree this does look
> like a bug...
> 
> Thanks,
> 
> Steve.
> 
> 
> On 7/9/2018 2:25 PM, Steve Wise wrote:
>> [... original report and kernel log snipped; see the first message in
>> this thread ...]


* nvme-rdma and rdma comp vector affinity problem
  2018-07-12 15:10 ` Steve Wise
  2018-07-15  7:58   ` Max Gurtovoy
@ 2018-07-16  6:51   ` Sagi Grimberg
  2018-07-16 15:11     ` Steve Wise
  1 sibling, 1 reply; 6+ messages in thread
From: Sagi Grimberg @ 2018-07-16  6:51 UTC (permalink / raw)



> Hey Sagi and Christoph,
> 
> Do you all have any thoughts on this?  It seems like a bug in nvme-rdma
> or the blk-mq code.  I can debug it further, if we agree this does look
> like a bug...

It is a bug... blk-mq expects us to skip unmapped queues, but we fail the
controller altogether...

I assume managed affinity would have taken care of linearization for us..
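
The holes come from the RDMA queue mapping itself; from memory of the
4.18-era block/blk-mq-rdma.c (an approximation, not a verbatim copy),
blk_mq_rdma_map_queues() does roughly this:

int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
		struct ib_device *dev, int first_vec)
{
	const struct cpumask *mask;
	unsigned int queue, cpu;

	for (queue = 0; queue < set->nr_hw_queues; queue++) {
		mask = ib_get_vector_affinity(dev, first_vec + queue);
		if (!mask)
			goto fallback;

		/* a later queue with the same mask overwrites these entries */
		for_each_cpu(cpu, mask)
			set->mq_map[cpu] = queue;
	}

	return 0;

fallback:
	return blk_mq_map_queues(set);
}

So when two completion vectors report the same affinity mask, the later
queue takes the CPUs and the earlier hctx is left with none, which is what
blk_mq_alloc_request_hctx() then reports with -EXDEV.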

Does this quick untested patch work?
--
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 8023054ec83e..766d10acb1b9 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -604,20 +604,33 @@ static int nvme_rdma_start_queue(struct nvme_rdma_ctrl *ctrl, int idx)

  static int nvme_rdma_start_io_queues(struct nvme_rdma_ctrl *ctrl)
  {
-       int i, ret = 0;
+       int i, ret = 0, count = 0;

         for (i = 1; i < ctrl->ctrl.queue_count; i++) {
                 ret = nvme_rdma_start_queue(ctrl, i);
-               if (ret)
+               if (ret) {
+                       if (ret == -EXDEV) {
+                               /* unmapped queue, skip ... */
+                               nvme_rdma_free_queue(&ctrl->queues[i]);
+                               continue;
+                       }
                         goto out_stop_queues;
+               }
+               count++;
         }

+       if (!count)
+               /* no started queues, fail */
+               goto out_stop_queues;
+
+       dev_info(ctrl->ctrl.device, "connected %d I/O queues.\n", count);
+
         return 0;

  out_stop_queues:
         for (i--; i >= 1; i--)
                 nvme_rdma_stop_queue(&ctrl->queues[i]);
-       return ret;
+       return -EIO;
  }

  static int nvme_rdma_alloc_io_queues(struct nvme_rdma_ctrl *ctrl)
--


* nvme-rdma and rdma comp vector affinity problem
  2018-07-15  7:58   ` Max Gurtovoy
@ 2018-07-16 14:34     ` Steve Wise
  0 siblings, 0 replies; 6+ messages in thread
From: Steve Wise @ 2018-07-16 14:34 UTC (permalink / raw)




On 7/15/2018 2:58 AM, Max Gurtovoy wrote:
> Hi Steve,
> I didn't go through the cxgb4 implementation, but did you implement the
> needed callback, or are you calling the blk mapping function (i.e.,
> falling back to blk_mq_map_queues())?
>
> Regards,
> -Max.
>

Hey Max.  I implemented ib_get_vector_affinity() support for cxgb4.
Also, if I change mlx5 to use the same CPUs for each vector, I see the
same error for nvme-rdma over mlx5.

Steve.


* nvme-rdma and rdma comp vector affinity problem
  2018-07-16  6:51   ` Sagi Grimberg
@ 2018-07-16 15:11     ` Steve Wise
  0 siblings, 0 replies; 6+ messages in thread
From: Steve Wise @ 2018-07-16 15:11 UTC (permalink / raw)




On 7/16/2018 1:51 AM, Sagi Grimberg wrote:
>
>> Hey Sagi and Christoph,
>>
>> Do you all have any thoughts on this?  It seems like a bug in nvme-rdma
>> or the blk-mq code.  I can debug it further, if we agree this does look
>> like a bug...
>
> It is a bug... blk-mq expects us to skip unmapped queues, but we fail the
> controller altogether...
>
> I assume managed affinity would have taken care of linearization for us..
>
> Does this quick untested patch work?

Hey Sagi,

I can connect now with your patch, but perhaps these errors shouldn't be
logged?  Also, it apparently connected 9 I/O queues.  I think it should
have connected only 8, right?

Log showing the iw_cxgb4 vector affinity (16 comp vectors configured to
use only CPUs in the same NUMA node, CPUs 8-15):

[  810.387762] iw_cxgb4: comp_vector 0, irq 217 mask 0x100
[  810.393543] iw_cxgb4: comp_vector 1, irq 218 mask 0x200
[  810.399229] iw_cxgb4: comp_vector 2, irq 219 mask 0x400
[  810.404902] iw_cxgb4: comp_vector 3, irq 220 mask 0x800
[  810.410584] iw_cxgb4: comp_vector 4, irq 221 mask 0x1000
[  810.416333] iw_cxgb4: comp_vector 5, irq 222 mask 0x2000
[  810.422085] iw_cxgb4: comp_vector 6, irq 223 mask 0x4000
[  810.427827] iw_cxgb4: comp_vector 7, irq 224 mask 0x8000
[  810.433564] iw_cxgb4: comp_vector 8, irq 225 mask 0x100
[  810.439212] iw_cxgb4: comp_vector 9, irq 226 mask 0x200
[  810.444851] iw_cxgb4: comp_vector 10, irq 227 mask 0x400
[  810.450570] iw_cxgb4: comp_vector 11, irq 228 mask 0x800
[  810.456271] iw_cxgb4: comp_vector 12, irq 229 mask 0x1000
[  810.462057] iw_cxgb4: comp_vector 13, irq 230 mask 0x2000
[  810.467841] iw_cxgb4: comp_vector 14, irq 231 mask 0x4000
[  810.473606] iw_cxgb4: comp_vector 15, irq 232 mask 0x8000

Log showing the nvme queue setup (attempting 16 I/O queues and thus
trying all 16 comp vectors):

[  810.839135] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[  810.846531] nvme nvme0: failed to connect queue: 2 ret=-18
[  810.853330] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[  810.860698] nvme nvme0: failed to connect queue: 3 ret=-18
[  810.867502] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[  810.874834] nvme nvme0: failed to connect queue: 4 ret=-18
[  810.881579] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[  810.888883] nvme nvme0: failed to connect queue: 5 ret=-18
[  810.895617] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[  810.902908] nvme nvme0: failed to connect queue: 6 ret=-18
[  810.909650] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[  810.916936] nvme nvme0: failed to connect queue: 7 ret=-18
[  810.923655] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[  810.930924] nvme nvme0: failed to connect queue: 8 ret=-18
[  810.937818] nvme nvme0: connected 9 I/O queues.
[  810.942902] nvme nvme0: new ctrl: NQN "nvme-nullb0", addr 172.16.2.1:4420

[root@stevo1 linux]# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     db56fecfd36969df     Linux                                    1           1.07  GB /   1.07  GB    512   B +  0 B   4.18.0-r
[root@stevo1 linux]#

