public inbox for linux-rdma@vger.kernel.org
* rmmod mlx4_core panic 3.16-rc1
@ 2014-06-19  3:33 Shirley Ma
       [not found] ` <53A259F3.3040203-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Shirley Ma @ 2014-06-19  3:33 UTC (permalink / raw)
  To: ogerlitz-VPRAkNaXOzVWk0Htik3J/w, linux-rdma

Hello Or,

Two questions here:
1. Are IB VFs supported on ConnectX-2 (mlx4 driver)?

I tried num_vfs={port1,port2,port1+2} when loading the mlx4_core module, and it failed with:
mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet

2. After the mlx4_core module is loaded with num_vfs={} parameters, removing mlx4_core consistently hits the panic below. Is this problem being tracked?

Thanks
Shirley


<mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.2-1 (Feb 2014)
mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
mlx4_core: Initializing 0000:40:00.0
mlx4_core 0000:40:00.0: Enabling SR-IOV with 2 VFs
pci 0000:40:00.1: [15b3:1002] type 00 class 0x0c0600
mlx4_core: Initializing 0000:40:00.1
mlx4_core 0000:40:00.1: enabling device (0000 -> 0002)
mlx4_core 0000:40:00.1: Skipping virtual function:1
pci 0000:40:00.2: [15b3:1002] type 00 class 0x0c0600
mlx4_core: Initializing 0000:40:00.2
mlx4_core 0000:40:00.2: enabling device (0000 -> 0002)
mlx4_core 0000:40:00.2: Skipping virtual function:2
mlx4_core 0000:40:00.0: Running in master mode
mlx4_core 0000:40:00.0: PCIe BW is different than device's capability
mlx4_core 0000:40:00.0: PCIe link speed is 5.0GT/s, device supports 8.0GT/s
mlx4_core 0000:40:00.0: PCIe link width is x8, device supports x8
mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet
BUG: unable to handle kernel NULL pointer dereference at 000000000000038c
IP: [<ffffffffa0350450>] __mlx4_remove_one+0x20/0x380 [mlx4_core]
PGD 45d3ba067 PUD 45ace8067 PMD 0 
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: mlx4_core(-) ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc autofs4 cpufreq_ondemand ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm iTCO_wdt iTCO_vendor_support microcode ipmi_si ipmi_msghandler acpi_cpufreq pcspkr i2c_i801 i2c_core lpc_ich mfd_core shpchp sg ioatdma ib_sa ib_mad ib_core ib_addr ipv6 vxlan ixgbe dca ptp pps_core hwmon mdio ext3 jbd mbcache sd_mod crc_t10dif crct10dif_common usb_storage ahci libahci mpt2sas scsi_transport_sas raid_class [last unloaded: mlx4_core]
CPU: 13 PID: 7212 Comm: rmmod Not tainted 3.16.0-rc1+ #1
Hardware name: Oracle Corporation SUN FIRE X4170 M3     /ASSY,MOTHERBOARD,1U   , BIOS 17050100 08/29/2013
task: ffff880461540110 ti: ffff880465000000 task.ti: ffff880465000000
RIP: 0010:[<ffffffffa0350450>]  [<ffffffffa0350450>] __mlx4_remove_one+0x20/0x380 [mlx4_core]
RSP: 0018:ffff880465003d88  EFLAGS: 00010296
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000026 RSI: 0000000000000292 RDI: ffff880468b8f000
RBP: ffff880465003db8 R08: 0000000000000000 R09: 0000000000000000
R10: 09f911029d74e35b R11: 09f911029d74e35b R12: 0000000000000000
R13: ffff880468b8f000 R14: ffffffffa036de40 R15: 0000000000000001
FS:  00007ff287fc2700(0000) GS:ffff88046fce0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000038c CR3: 000000045cfae000 CR4: 00000000000407e0
Stack:
 ffff880465003da8 ffff880468b8f000 0000000000000000 ffff880468b8f000
 ffffffffa036de40 0000000000000001 ffff880465003dd8 ffffffffa0350805
 ffff880468b8f098 ffffffffa036dd60 ffff880465003e08 ffffffff812ebaa6
Call Trace:
 [<ffffffffa0350805>] mlx4_remove_one+0x25/0x50 [mlx4_core]
 [<ffffffff812ebaa6>] pci_device_remove+0x46/0xc0
 [<ffffffff813ce08f>] __device_release_driver+0x7f/0xf0
 [<ffffffff813ce1c8>] driver_detach+0xc8/0xd0
 [<ffffffff813cced9>] bus_remove_driver+0x59/0xd0
 [<ffffffff813cef80>] driver_unregister+0x30/0x70
 [<ffffffff812ebc13>] pci_unregister_driver+0x23/0x80
 [<ffffffffa03650e4>] mlx4_cleanup+0x10/0x1e [mlx4_core]
 [<ffffffff810ceff9>] SyS_delete_module+0x189/0x210
 [<ffffffff815d2f12>] system_call_fastpath+0x16/0x1b
Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 66 66 66 66 90 48 8b 9f 58 01 00 00 49 89 fd <44> 8b b3 8c 03 00 00 45 85 f6 0f 85 41 02 00 00 f6 43 08 04 44 
RIP  [<ffffffffa0350450>] __mlx4_remove_one+0x20/0x380 [mlx4_core]
 RSP <ffff880465003d88>
CR2: 000000000000038c
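
A note on reading the oops above. This is a hand decode of the Code: bytes, so treat it as a sketch rather than authoritative disassembly. The bytes just before the marked instruction, 48 8b 9f 58 01 00 00, decode to mov 0x158(%rdi),%rbx, and the faulting instruction <44> 8b b3 8c 03 00 00 decodes to mov 0x38c(%rbx),%r14d. With RBX = 0 in the register dump, the load address is 0 + 0x38c, which matches CR2. That is consistent with a NULL pointer being read from the structure passed in RDI and then dereferenced at offset 0x38c; the thread does not say which structure or field this is. The arithmetic, using a hypothetical helper named fault_addr:

```shell
#!/bin/sh
# fault_addr BASE DISP: effective address of a base+displacement load.
# (Hypothetical helper, for illustration only.)
fault_addr() {
    printf '0x%x\n' "$(( $1 + $2 ))"
}

# RBX (the base register) is 0 in the register dump; 0x38c is the
# displacement encoded little-endian as "8c 03 00 00" in the Code: line.
fault_addr 0x0 0x38c   # prints 0x38c, the CR2 value
```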
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: rmmod mlx4_core panic 3.16-rc1
       [not found] ` <53A259F3.3040203-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2014-06-20  3:34   ` Or Gerlitz
       [not found]     ` <CAJZOPZK__Qo2LcTpL3R0=WOzsLKY2FYT_7VkKXThGiPWF8wv_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2014-06-20  3:34 UTC (permalink / raw)
  To: Shirley Ma; +Cc: Or Gerlitz, linux-rdma, Wei Yang, Matan Barak

On Thu, Jun 19, 2014 at 6:33 AM, Shirley Ma <shirley.ma-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>
> 1. Whether IB VFs is supported in ConnectX-2 (mlx4 driver)?
>
> I tried to num_vfs={port1,port2,port1+2} when loading mlx4_core module, it failed with mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet


What do you mean by "port1" and "port2" -- can you give the exact
command line you used?

Single-ported VFs are currently supported only for an Ethernet-only
configuration, that is, not for IB-only and not for VPI; they work only
if you use port_type_array=2,2
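
The rule described above can be sketched as a small shell check. This is only an illustration of the rule as stated in this thread, not the driver's actual validation code. The function name check_vf_syntax is hypothetical, and the reading of the num_vfs triplet as "port1-only, port2-only, both ports" is taken from the question quoted above. Port type values follow the thread's usage: 2 is Ethernet and 1 is IB.

```shell
#!/bin/sh
# Sketch of the rule described in this thread; NOT the driver's real check.
# Usage: check_vf_syntax NUM_VFS PORT_TYPE_ARRAY
#   NUM_VFS:          "N" (dual-ported VFs) or "a,b,c" (single-port syntax)
#   PORT_TYPE_ARRAY:  per-port type, 1 = IB, 2 = Ethernet (e.g. "2,2")
check_vf_syntax() {
    num_vfs=$1
    port_types=$2
    case $num_vfs in
        *,*)
            # Triplet (single-port) syntax: every port must be Ethernet.
            case $port_types in
                2,2) echo "ok" ;;
                *)   echo "rejected" ;;
            esac
            ;;
        *)
            # Plain N requests dual-ported VFs, which this rule does not restrict.
            echo "ok"
            ;;
    esac
}

check_vf_syntax 1,1,1 1,1   # triplet syntax with IB ports
check_vf_syntax 1,1,1 2,2   # triplet syntax with Ethernet-only ports
```

The first call prints "rejected" and the second prints "ok", mirroring the "Invalid syntax of num_vfs/probe_vfs with IB port" message in the log.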



>
>
> 2. After mlx4_core module is being loaded with num_vfs={} parameters, when removing mlx4_core, it consistently hits below panic. Whether this problem is being tracked?


What do you mean by "num_vfs={}": is it num_vfs=N or {N}? Here too,
please send the exact setting you used. The crash you indicated below
is supposed to be fixed by the upstream commit
da1de8dfff09d33d4a5345762c21b487028e25f5 "net/mlx4_core: Keep only one
driver entry release". Are you sure you have this commit in the tree
you are working with?

Or.


* Re: rmmod mlx4_core panic 3.16-rc1
       [not found]     ` <CAJZOPZK__Qo2LcTpL3R0=WOzsLKY2FYT_7VkKXThGiPWF8wv_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-06-20  3:51       ` Wei Yang
  2014-06-20  4:02         ` Or Gerlitz
  2014-06-20  6:17       ` Or Gerlitz
  2014-06-20 17:15       ` Shirley Ma
  2 siblings, 1 reply; 13+ messages in thread
From: Wei Yang @ 2014-06-20  3:51 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Shirley Ma, Or Gerlitz, linux-rdma, Wei Yang, Matan Barak

On Fri, Jun 20, 2014 at 06:34:48AM +0300, Or Gerlitz wrote:
>On Thu, Jun 19, 2014 at 6:33 AM, Shirley Ma <shirley.ma-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>
>> 1. Whether IB VFs is supported in ConnectX-2 (mlx4 driver)?
>>
>> I tried to num_vfs={port1,port2,port1+2} when loading mlx4_core module, it failed with mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet
>
>
>What do you mean by "port1" and "port2" -- can you give the exact
>command line you used?
>
>Single ported VFs are currently supported for Ethernet only
>configuration, that is not for only IB nor for VPI, that is only if
>you use port_type_array=2,2
>
>
>
>>
>>
>> 2. After mlx4_core module is being loaded with num_vfs={} parameters, when removing mlx4_core, it consistently hits below panic. Whether this problem is being tracked?
>
>
>what do you mean by  "num_vfs={}", is it num_vfs=N or {N}, also here,
>please send the exact setting you used. The crash you indicated below
>is supposed to be fixed by the upstream  commit
>da1de8dfff09d33d4a5345762c21b487028e25f5 "net/mlx4_core: Keep only one
>driver entry release" - are you sure to have this commit in the tree
>you are working with?
>

Just checked, this patch is in 3.16-rc1.

>Or.
>
>> mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet
>> BUG: unable to handle kernel NULL pointer dereference at 000000000000038c
>> IP: [<ffffffffa0350450>] __mlx4_remove_one+0x20/0x380 [mlx4_core]

From this log, does it happen during probe?
If not, was any action taken after probe?

-- 
Richard Yang
Help you, Help me


* Re: rmmod mlx4_core panic 3.16-rc1
  2014-06-20  3:51       ` Wei Yang
@ 2014-06-20  4:02         ` Or Gerlitz
       [not found]           ` <CAJZOPZJ8r6CVRtuoeYGgDT=TY_YmmwirysoDePsa8KAYZj97cQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2014-06-20  4:02 UTC (permalink / raw)
  To: Wei Yang; +Cc: Shirley Ma, Or Gerlitz, linux-rdma, Matan Barak

On Fri, Jun 20, 2014 at 6:51 AM, Wei Yang <weiyang-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:

> >> mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet
> >> BUG: unable to handle kernel NULL pointer dereference at 000000000000038c
> >> IP: [<ffffffffa0350450>] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>
> From this log, it happens during probe?
> If not, any action after probe?

Yep, maybe the bug still exists in the error flow of probe. You can probe with

num_vfs=1,1,1 port_type_array=1,1 and see if you hit it.



* Re: rmmod mlx4_core panic 3.16-rc1
       [not found]     ` <CAJZOPZK__Qo2LcTpL3R0=WOzsLKY2FYT_7VkKXThGiPWF8wv_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-06-20  3:51       ` Wei Yang
@ 2014-06-20  6:17       ` Or Gerlitz
       [not found]         ` <CAJZOPZJaStV41bbj-qHZVfTi2m__9J_TqNcYJ1kkc=X12cJWPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-06-20 17:15       ` Shirley Ma
  2 siblings, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2014-06-20  6:17 UTC (permalink / raw)
  To: Shirley Ma; +Cc: Or Gerlitz, linux-rdma, Wei Yang, Matan Barak

On Fri, Jun 20, 2014 at 6:34 AM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Thu, Jun 19, 2014 at 6:33 AM, Shirley Ma <shirley.ma-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>> 1. Whether IB VFs is supported in ConnectX-2 (mlx4 driver)?
>> I tried to num_vfs={port1,port2,port1+2} when loading mlx4_core module, it failed with mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet

> What do you mean by "port1" and "port2" -- can you give the exact
> command line you used?

> Single ported VFs are currently supported for Ethernet only
> configuration, that is not for only IB nor for VPI, that is only if
> you use port_type_array=2,2

Note that you can still use dual-ported VFs with IB, Ethernet, and VPI;
that is, num_vfs=N will create N dual-ported VFs. Are you on IB?

* Re: rmmod mlx4_core panic 3.16-rc1
       [not found]           ` <CAJZOPZJ8r6CVRtuoeYGgDT=TY_YmmwirysoDePsa8KAYZj97cQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-06-20  6:21             ` Wei Yang
  2014-06-20  6:33               ` Or Gerlitz
  0 siblings, 1 reply; 13+ messages in thread
From: Wei Yang @ 2014-06-20  6:21 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Wei Yang, Shirley Ma, Or Gerlitz, linux-rdma, Matan Barak

On Fri, Jun 20, 2014 at 07:02:41AM +0300, Or Gerlitz wrote:
>On Fri, Jun 20, 2014 at 6:51 AM, Wei Yang <weiyang-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>
>> >> mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet
>> >> BUG: unable to handle kernel NULL pointer dereference at 000000000000038c
>> >> IP: [<ffffffffa0350450>] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>>
>> From this log, it happens during probe?
>> If not, any action after probe?
>
>yep, maybe the bug still exists in the error flow of probe? you can probe with
>
>num_vfs=1,1,1 port_type_array=1,1 and see if you hit it
>

I tried this

modprobe mlx4_core num_vfs=3 probe_vf=3 port_type_array=1,1

It looks good to me.

BTW, I didn't test 3.16-rc1, since the SRIOV patch on the Power platform has not
been rebased onto the latest kernel yet.


-- 
Richard Yang
Help you, Help me


* Re: rmmod mlx4_core panic 3.16-rc1
  2014-06-20  6:21             ` Wei Yang
@ 2014-06-20  6:33               ` Or Gerlitz
       [not found]                 ` <CAJZOPZJAFYxTpBEiAq5Xzo5_zCXAP08P_vRdMsqgxfEL0y6iWA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2014-06-20  6:33 UTC (permalink / raw)
  To: Wei Yang; +Cc: Shirley Ma, Or Gerlitz, linux-rdma, Matan Barak

On Fri, Jun 20, 2014 at 9:21 AM, Wei Yang <weiyang-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:

>>> From this log, it happens during probe?
>>> If not, any action after probe?

>>yep, maybe the bug still exists in the error flow of probe? you can probe with
>>num_vfs=1,1,1 port_type_array=1,1 and see if you hit it

> I tried this
> modprobe mlx4_core num_vfs=3 probe_vf=3 port_type_array=1,1
> It looks good to me.

No. I wanted you to hit the error flow where this bug seems to
remain, so you need to try single-ported VFs with IB, e.g.

$ modprobe mlx4_core num_vfs=1,1,1 port_type_array=1,1


> BTW, I didn't test 3.16-rc1, since the SRIOV patch on power platform is not
> rebased to the latest kernel yet.

* Re: rmmod mlx4_core panic 3.16-rc1
       [not found]                 ` <CAJZOPZJAFYxTpBEiAq5Xzo5_zCXAP08P_vRdMsqgxfEL0y6iWA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-06-20  6:34                   ` Or Gerlitz
       [not found]                     ` <CAJZOPZLYBBKK70My5L7++B7_TYKoxrRABPUHegExerBKLcYwcw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2014-06-20  6:34 UTC (permalink / raw)
  To: Wei Yang; +Cc: Shirley Ma, Or Gerlitz, linux-rdma, Matan Barak

On Fri, Jun 20, 2014 at 9:33 AM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, Jun 20, 2014 at 9:21 AM, Wei Yang <weiyang-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>
>>>> From this log, it happens during probe?
>>>> If not, any action after probe?
>
>>>yep, maybe the bug still exists in the error flow of probe? you can probe with
>>>num_vfs=1,1,1 port_type_array=1,1 and see if you hit it
>
>> I tried this
>> modprobe mlx4_core num_vfs=3 probe_vf=3 port_type_array=1,1
>> It looks good to me.
>
> NO. I wanted you to hit the error flow where this bug seems to
> remain... so you need to try and use single ported VFs with IB e.g
> $ modprobe mlx4_core num_vfs=1,1,1 port_type_array=1,1

or

$ modprobe mlx4_core num_vfs=1,1,1 probe_vf=1,1,1 port_type_array=1,1

* Re: rmmod mlx4_core panic 3.16-rc1
       [not found]     ` <CAJZOPZK__Qo2LcTpL3R0=WOzsLKY2FYT_7VkKXThGiPWF8wv_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-06-20  3:51       ` Wei Yang
  2014-06-20  6:17       ` Or Gerlitz
@ 2014-06-20 17:15       ` Shirley Ma
       [not found]         ` <53A46C2B.8030301-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  2 siblings, 1 reply; 13+ messages in thread
From: Shirley Ma @ 2014-06-20 17:15 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Or Gerlitz, linux-rdma, Wei Yang, Matan Barak



On 06/19/2014 08:34 PM, Or Gerlitz wrote:
> On Thu, Jun 19, 2014 at 6:33 AM, Shirley Ma <shirley.ma-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>
>> 1. Whether IB VFs is supported in ConnectX-2 (mlx4 driver)?
>>
>> I tried to num_vfs={port1,port2,port1+2} when loading mlx4_core module, it failed with mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet
> 
> 
> What do you mean by "port1" and "port2" -- can you give the exact
> command line you used?
> 
> Single ported VFs are currently supported for Ethernet only
> configuration, that is not for only IB nor for VPI, that is only if
> you use port_type_array=2,2
> 

I tried command line with num_vfs without port_type_array=2,2.

num_vfs=2 
num_vfs={1,1,2}

both failed.

> 
>>
>>
>> 2. After mlx4_core module is being loaded with num_vfs={} parameters, when removing mlx4_core, it consistently hits below panic. Whether this problem is being tracked?
> 
> 
> what do you mean by  "num_vfs={}", is it num_vfs=N or {N}, also here,
> please send the exact setting you used. The crash you indicated below
> is supposed to be fixed by the upstream  commit
> da1de8dfff09d33d4a5345762c21b487028e25f5 "net/mlx4_core: Keep only one
> driver entry release" - are you sure to have this commit in the tree
> you are working with?
> 
> Or.

Yes, I tried net-next tree with this commit da1de8dfff09d33d4a5345762c21b487028e25f5.

>>
>> <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.2-1 (Feb 2014)
>> mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
>> mlx4_core: Initializing 0000:40:00.0
>> mlx4_core 0000:40:00.0: Enabling SR-IOV with 2 VFs
>> pci 0000:40:00.1: [15b3:1002] type 00 class 0x0c0600
>> mlx4_core: Initializing 0000:40:00.1
>> mlx4_core 0000:40:00.1: enabling device (0000 -> 0002)
>> mlx4_core 0000:40:00.1: Skipping virtual function:1
>> pci 0000:40:00.2: [15b3:1002] type 00 class 0x0c0600
>> mlx4_core: Initializing 0000:40:00.2
>> mlx4_core 0000:40:00.2: enabling device (0000 -> 0002)
>> mlx4_core 0000:40:00.2: Skipping virtual function:2
>> mlx4_core 0000:40:00.0: Running in master mode
>> mlx4_core 0000:40:00.0: PCIe BW is different than device's capability
>> mlx4_core 0000:40:00.0: PCIe link speed is 5.0GT/s, device supports 8.0GT/s
>> mlx4_core 0000:40:00.0: PCIe link width is x8, device supports x8
>> mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet
>> BUG: unable to handle kernel NULL pointer dereference at 000000000000038c
>> IP: [<ffffffffa0350450>] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>> PGD 45d3ba067 PUD 45ace8067 PMD 0
>> Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
>> Modules linked in: mlx4_core(-) ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc autofs4 cpufreq_ondemand ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm iTCO_wdt iTCO_vendor_support microcode ipmi_si ipmi_msghandler acpi_cpufreq pcspkr i2c_i801 i2c_core lpc_ich mfd_core shpchp sg ioatdma ib_sa ib_mad ib_core ib_addr ipv6 vxlan ixgbe dca ptp pps_core hwmon mdio ext3 jbd mbcache sd_mod crc_t10dif crct10dif_common usb_storage ahci libahci mpt2sas scsi_transport_sas raid_class [last unloaded: mlx4_core]
>> CPU: 13 PID: 7212 Comm: rmmod Not tainted 3.16.0-rc1+ #1
>> Hardware name: Oracle Corporation SUN FIRE X4170 M3     /ASSY,MOTHERBOARD,1U   , BIOS 17050100 08/29/2013
>> task: ffff880461540110 ti: ffff880465000000 task.ti: ffff880465000000
>> RIP: 0010:[<ffffffffa0350450>]  [<ffffffffa0350450>] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>> RSP: 0018:ffff880465003d88  EFLAGS: 00010296
>> RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
>> RDX: 0000000000000026 RSI: 0000000000000292 RDI: ffff880468b8f000
>> RBP: ffff880465003db8 R08: 0000000000000000 R09: 0000000000000000
>> R10: 09f911029d74e35b R11: 09f911029d74e35b R12: 0000000000000000
>> R13: ffff880468b8f000 R14: ffffffffa036de40 R15: 0000000000000001
>> FS:  00007ff287fc2700(0000) GS:ffff88046fce0000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 000000000000038c CR3: 000000045cfae000 CR4: 00000000000407e0
>> Stack:
>>  ffff880465003da8 ffff880468b8f000 0000000000000000 ffff880468b8f000
>>  ffffffffa036de40 0000000000000001 ffff880465003dd8 ffffffffa0350805
>>  ffff880468b8f098 ffffffffa036dd60 ffff880465003e08 ffffffff812ebaa6
>> Call Trace:
>>  [<ffffffffa0350805>] mlx4_remove_one+0x25/0x50 [mlx4_core]
>>  [<ffffffff812ebaa6>] pci_device_remove+0x46/0xc0
>>  [<ffffffff813ce08f>] __device_release_driver+0x7f/0xf0
>>  [<ffffffff813ce1c8>] driver_detach+0xc8/0xd0
>>  [<ffffffff813cced9>] bus_remove_driver+0x59/0xd0
>>  [<ffffffff813cef80>] driver_unregister+0x30/0x70
>>  [<ffffffff812ebc13>] pci_unregister_driver+0x23/0x80
>>  [<ffffffffa03650e4>] mlx4_cleanup+0x10/0x1e [mlx4_core]
>>  [<ffffffff810ceff9>] SyS_delete_module+0x189/0x210
>>  [<ffffffff815d2f12>] system_call_fastpath+0x16/0x1b
>> Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 66 66 66 66 90 48 8b 9f 58 01 00 00 49 89 fd <44> 8b b3 8c 03 00 00 45 85 f6 0f 85 41 02 00 00 f6 43 08 04 44
>> RIP  [<ffffffffa0350450>] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>>  RSP <ffff880465003d88>
>> CR2: 000000000000038c

* Re: rmmod mlx4_core panic 3.16-rc1
       [not found]         ` <CAJZOPZJaStV41bbj-qHZVfTi2m__9J_TqNcYJ1kkc=X12cJWPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-06-20 17:16           ` Shirley Ma
  0 siblings, 0 replies; 13+ messages in thread
From: Shirley Ma @ 2014-06-20 17:16 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Or Gerlitz, linux-rdma, Wei Yang, Matan Barak



On 06/19/2014 11:17 PM, Or Gerlitz wrote:
> On Fri, Jun 20, 2014 at 6:34 AM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> On Thu, Jun 19, 2014 at 6:33 AM, Shirley Ma <shirley.ma-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>> 1. Whether IB VFs is supported in ConnectX-2 (mlx4 driver)?
>>> I tried to num_vfs={port1,port2,port1+2} when loading mlx4_core module, it failed with mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet
> 
>> What do you mean by "port1" and "port2" -- can you give the exact
>> command line you used?
> 
>> Single ported VFs are currently supported for Ethernet only
>> configuration, that is not for only IB nor for VPI, that is only if
>> you use port_type_array=2,2
> 
> Note that you can still use dual-ported VFs, for both IB, Ethernet and
> VPI, that is
> num_vfs=N will create N dual-ported VFs, are you on IB?

Yes, I tried num_vfs=N. It failed. I didn't combine with port_type_array=2,2. If it's required, then the code needs to check the arguments.
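
For reference, the argument rule described in the quoted text can be sketched in plain userspace C. This is only an illustration of the rule as stated in this thread (per-port num_vfs/probe_vf syntax is rejected as soon as any port is IB), not the driver's code: `vf_syntax_ok` and `enum port_type` are hypothetical names, and the values 1 (IB) and 2 (Ethernet) are taken from the port_type_array examples above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Port type values as used with port_type_array in this thread. */
enum port_type { PORT_TYPE_IB = 1, PORT_TYPE_ETH = 2 };

/*
 * Hypothetical helper, not a driver function.
 * An argument count above 1 (e.g. num_vfs=1,1,1) means the per-port,
 * "single ported VF" syntax; a count of 1 (num_vfs=N) means N dual-ported VFs.
 * Returns true when the requested syntax is acceptable for the given ports.
 */
static bool vf_syntax_ok(const enum port_type *ports, size_t num_ports,
                         int num_vfs_argc, int probe_vfs_argc)
{
    assert(ports != NULL);

    bool single_port_syntax = num_vfs_argc > 1 || probe_vfs_argc > 1;

    for (size_t i = 0; i < num_ports; i++) {
        /* Any IB port rules out the per-port syntax. */
        if (ports[i] == PORT_TYPE_IB && single_port_syntax)
            return false;
    }
    return true;
}
```

Under this reading, num_vfs=N is accepted for IB, Ethernet and VPI, while num_vfs=1,1,1 is accepted only together with port_type_array=2,2.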


* Re: rmmod mlx4_core panic 3.16-rc1
       [not found]                     ` <CAJZOPZLYBBKK70My5L7++B7_TYKoxrRABPUHegExerBKLcYwcw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-06-20 17:18                       ` Shirley Ma
  0 siblings, 0 replies; 13+ messages in thread
From: Shirley Ma @ 2014-06-20 17:18 UTC (permalink / raw)
  To: Or Gerlitz, Wei Yang; +Cc: Or Gerlitz, linux-rdma, Matan Barak

When loading the modules, the proper arguments need to be checked if one depends on another. You can easily reproduce this problem by only using num_vfs = N.

Shirley

On 06/19/2014 11:34 PM, Or Gerlitz wrote:
> On Fri, Jun 20, 2014 at 9:33 AM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> On Fri, Jun 20, 2014 at 9:21 AM, Wei Yang <weiyang-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>>
>>>>> From this log, it happens during probe?
>>>>> If not, any action after probe?
>>
>>>> yep, maybe the bug still exists in the error flow of probe? you can probe with
>>>> num_vfs=1,1,1 port_type_array=1,1 and see if you hit it
>>
>>> I tried this
>>> modprobe mlx4_core num_vfs=3 probe_vf=3 port_type_array=1,1
>>> It looks good to me.
>>
>> NO. I wanted you to hit the error flow where this bug seems to
>> remain... so you need to try and use single ported VFs with IB e.g
>> $ modprobe mlx4_core num_vfs=1,1,1 port_type_array=1,1
> 
> or
> 
> $ modprobe mlx4_core num_vfs=1,1,1 probe_vf=1,1,1 port_type_array=1,1
> 

* Re: rmmod mlx4_core panic 3.16-rc1
       [not found]         ` <53A46C2B.8030301-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2014-06-22  9:29           ` Or Gerlitz
       [not found]             ` <53A6A1DD.3020907-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2014-06-22  9:29 UTC (permalink / raw)
  To: Shirley Ma
  Cc: Or Gerlitz, linux-rdma, Wei Yang, Matan Barak, Jack Morgenstein

On 20/06/2014 20:15, Shirley Ma wrote:
>
> On 06/19/2014 08:34 PM, Or Gerlitz wrote:
>> On Thu, Jun 19, 2014 at 6:33 AM, Shirley Ma <shirley.ma-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>> 1. Whether IB VFs is supported in ConnectX-2 (mlx4 driver)?
>>>
>>> I tried to num_vfs={port1,port2,port1+2} when loading mlx4_core module, it failed with mlx4_core 0000:40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet
>>
>> What do you mean by "port1" and "port2" -- can you give the exact
>> command line you used?
>>
>> Single ported VFs are currently supported for Ethernet only
>> configuration, that is not for only IB nor for VPI, that is only if
>> you use port_type_array=2,2
>>
> I tried command line with num_vfs without port_type_array=2,2.
>
> num_vfs=2
> num_vfs={1,1,2}
>
> both failed.

Please provide further details on what fails when you use num_vfs=2;
here it works just fine. E.g. send us the output of

$ modprobe -v mlx4_core

and the related dmesg

As for the crash you reported, indeed, it seems we have a bug in the
error flow of module loading when single ported VFs are requested in
conjunction with IB ports. I have it fixed on my system with this patch,
which we will review etc


 From 6dec77e0dc68679e4127de19f6798151cc8b2fe6 Mon Sep 17 00:00:00 2001
From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Date: Sun, 22 Jun 2014 13:46:04 +0300
Subject: [PATCH] net/mlx4_core: Return error when module is loaded with invalid VF configuration

Single ported VF are currently not supported on configurations where
one or both ports are IB. When we hit this case, the module load
function didn't return error, fix that.

Fixes: dd41cc3 ('net/mlx4: Adapt num_vfs/probed_vf params for single port VF')
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
  drivers/net/ethernet/mellanox/mlx4/main.c |    1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 5f42f6d..4f4d48c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2439,6 +2439,7 @@ slave_start:
                             (num_vfs_argc > 1 || probe_vfs_argc > 1)) {
                                 mlx4_err(dev,
                                          "Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet\n");
+                               err = -EINVAL;
                                 goto err_close;
                         }
                         for (i = 0; i < sizeof(nvfs)/sizeof(nvfs[0]); i++) {
--
1.7.1
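
To illustrate why a one-line `err = -EINVAL;` can matter for a later rmmod, here is a minimal userspace sketch of the error flow as the commit message describes it. It is a simulation under assumptions, not driver code: every `fake_*` name is hypothetical, and the "bus core binds the driver when probe returns 0" behaviour is modelled by a single flag. In the reported oops RBX is 0 and the fault address is 0x38c, which is consistent with the remove path following a NULL per-device pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-device state and the PCI device. */
struct fake_priv { int unused; };
struct fake_pdev {
    struct fake_priv *drvdata;  /* per-device pointer set up by probe */
    int bound;                  /* 1 if the core considers the driver bound */
};

static struct fake_priv priv_storage;

/*
 * Simulated probe. With set_err == 0 the invalid-configuration branch
 * jumps to the cleanup label while err is still 0, so the caller sees
 * success. With set_err == 1 (the behaviour after the patch) it returns
 * -EINVAL.
 */
static int fake_probe(struct fake_pdev *pdev, int config_valid, int set_err)
{
    int err = 0;

    assert(pdev != NULL);
    pdev->drvdata = &priv_storage;

    if (!config_valid) {
        if (set_err)
            err = -EINVAL;
        goto err_close;
    }
    return 0;

err_close:
    pdev->drvdata = NULL;       /* error path tears the state down */
    return err;
}

/* Simulated bus core: the driver stays bound only if probe returned 0. */
static void fake_bind(struct fake_pdev *pdev, int config_valid, int set_err)
{
    pdev->bound = (fake_probe(pdev, config_valid, set_err) == 0);
}

/* Returns 1 if a remove call would run and follow a NULL pointer. */
static int fake_remove_would_crash(const struct fake_pdev *pdev)
{
    return pdev->bound && pdev->drvdata == NULL;
}
```

Calling `fake_bind(&dev, 0, 0)` leaves the device bound with a NULL pointer, so the simulated remove would fault; with `fake_bind(&dev, 0, 1)` the device is never bound and remove is not reached for it.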


* Re: rmmod mlx4_core panic 3.16-rc1
       [not found]             ` <53A6A1DD.3020907-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-06-22 10:25               ` Or Gerlitz
  0 siblings, 0 replies; 13+ messages in thread
From: Or Gerlitz @ 2014-06-22 10:25 UTC (permalink / raw)
  To: Shirley Ma; +Cc: linux-rdma, Wei Yang, Matan Barak, Jack Morgenstein

On Sun, Jun 22, 2014 at 12:29 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>
> As for the crash you reported, indeed, it seems we have a bug in the error flow of module loading when single ported VFs are requested in conjunction with IB ports. I have it fixed on my system with this patch, which we will review etc


Just posted a patch fixing the crash you reported to netdev
http://marc.info/?l=linux-netdev&m=140343252214146&w=2

Thanks for the report and tell us if you have further issues.

Or.

end of thread, other threads:[~2014-06-22 10:25 UTC | newest]

Thread overview: 13+ messages
2014-06-19  3:33 rmmod mlx4_core panic 3.16-rc1 Shirley Ma
     [not found] ` <53A259F3.3040203-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-06-20  3:34   ` Or Gerlitz
     [not found]     ` <CAJZOPZK__Qo2LcTpL3R0=WOzsLKY2FYT_7VkKXThGiPWF8wv_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-06-20  3:51       ` Wei Yang
2014-06-20  4:02         ` Or Gerlitz
     [not found]           ` <CAJZOPZJ8r6CVRtuoeYGgDT=TY_YmmwirysoDePsa8KAYZj97cQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-06-20  6:21             ` Wei Yang
2014-06-20  6:33               ` Or Gerlitz
     [not found]                 ` <CAJZOPZJAFYxTpBEiAq5Xzo5_zCXAP08P_vRdMsqgxfEL0y6iWA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-06-20  6:34                   ` Or Gerlitz
     [not found]                     ` <CAJZOPZLYBBKK70My5L7++B7_TYKoxrRABPUHegExerBKLcYwcw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-06-20 17:18                       ` Shirley Ma
2014-06-20  6:17       ` Or Gerlitz
     [not found]         ` <CAJZOPZJaStV41bbj-qHZVfTi2m__9J_TqNcYJ1kkc=X12cJWPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-06-20 17:16           ` Shirley Ma
2014-06-20 17:15       ` Shirley Ma
     [not found]         ` <53A46C2B.8030301-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-06-22  9:29           ` Or Gerlitz
     [not found]             ` <53A6A1DD.3020907-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-06-22 10:25               ` Or Gerlitz
