From: Joao Pinto <Joao.Pinto-HKixBCOQz3hWk0Htik3J/w@public.gmane.org>
To: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
Cc: Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
Joao Pinto <Joao.Pinto-HKixBCOQz3hWk0Htik3J/w@public.gmane.org>,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: Issue with MLX5 IB driver
Date: Thu, 1 Jun 2017 12:18:28 +0100
Message-ID: <fbb4b7cb-e3e4-b540-22e4-5d920857e8fe@synopsys.com>
In-Reply-To: <09d8f6bc-5994-82d1-9a0f-59540b6c525f-HKixBCOQz3hWk0Htik3J/w@public.gmane.org>
On 6/1/2017 at 11:05 AM, Joao Pinto wrote:
>
> Hello,
>
> On 6/1/2017 at 5:30 AM, Leon Romanovsky wrote:
>> On Wed, May 31, 2017 at 12:44:26PM -0700, Christoph Hellwig wrote:
>>> On Wed, May 31, 2017 at 07:18:19PM +0300, Leon Romanovsky wrote:
>>>> I think that you are hitting the side effect of these commits
>>>> 7d0cc6edcc70 ("IB/mlx5: Add MR cache for large UMR regions") and
>>>> 81713d3788d2 ("IB/mlx5: Add implicit MR support")
>>>>
>>>> Do you have CONFIG_INFINIBAND_ON_DEMAND_PAGING on? Can you disable it
>>>> for the test?
>>>
>>> Eww. Please make sure mlx5 gracefully handles cases where it can't use
>>> a crazy amount of memory, including disabling features like the above
>>> at runtime when the required resources aren't available.
>>
>> Right, the real consumer of memory in mlx5_ib is mr_cache, so the
>> question is how can we check in advance if we have enough memory
>> without calling allocations with the __GFP_NOWARN flag.
>>
>
> With CONFIG_INFINIBAND_ON_DEMAND_PAGING disabled:
> Crashes the same way.
>
> With MLX5_DEFAULT_PROF defined as 0:
>
> There is no crash.
>
> mlx5_core 0000:01:00.0: enabling device (0000 -> 0002)
> mlx5_core 0000:01:00.0: Warning: couldn't set 64-bit PCI DMA mask
> mlx5_core 0000:01:00.0: Warning: couldn't set 64-bit consistent PCI DMA mask
> mlx5_core 0000:01:00.0: firmware version: 16.19.21102
> (...)
> mlx5_ib: Mellanox Connect-IB Infiniband driver v2.2-1 (Feb 2014)
> (...)
> mlx5_core 0000:01:00.0: device's health compromised - reached miss count
> mlx5_core 0000:01:00.0: assert_var[0] 0x00000001
> mlx5_core 0000:01:00.0: assert_var[1] 0x00000000
> mlx5_core 0000:01:00.0: assert_var[2] 0x00000000
> mlx5_core 0000:01:00.0: assert_var[3] 0x00000000
> mlx5_core 0000:01:00.0: assert_var[4] 0x00000000
> mlx5_core 0000:01:00.0: assert_exit_ptr 0x006994c0
> mlx5_core 0000:01:00.0: assert_callra 0x00699680
> mlx5_core 0000:01:00.0: fw_ver 16.19.21102
> mlx5_core 0000:01:00.0: hw_id 0x0000020d
> mlx5_core 0000:01:00.0: irisc_index 0
> mlx5_core 0000:01:00.0: synd 0x1: firmware internal error
> mlx5_core 0000:01:00.0: ext_synd 0x11c5
> mlx5_core 0000:01:00.0: raw fw_ver 0x1013526e
>
> lspci -v result:
>
> 01:00.0 Infiniband controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
> Subsystem: Mellanox Technologies Device 0002
> Flags: bus master, fast devsel, latency 0
> Memory at d2000000 (64-bit, prefetchable) [size=32M]
> Capabilities: [60] Express Endpoint, MSI 00
> Capabilities: [48] Vital Product Data
> Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
> Capabilities: [c0] Vendor Specific Information: Len=18 <?>
> Capabilities: [40] Power Management version 3
> Capabilities: [100] Advanced Error Reporting
> Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
> Capabilities: [1c0] #19
> Capabilities: [320] #27
> Kernel driver in use: mlx5_core
>
> Interrupts:
>
> 45: 0 PCI-MSI 0 aerdrv
> 46: 2 PCI-MSI 524288 mlx5_pages_eq@pci:0000:01:00.0
> 47: 347 PCI-MSI 524289 mlx5_cmd_eq@pci:0000:01:00.0
> 48: 0 PCI-MSI 524290 mlx5_async_eq@pci:0000:01:00.0
> 50: 0 PCI-MSI 524292 mlx5_comp0@pci:0000:01:00.0
>
> List of devices:
>
> # ls /dev/infiniband/
> issm0 rdma_cm ucm0 umad0 uverbs0
>
> Shouldn't I be getting some Mellanox devices?
>
> Thanks,
> Joao
>
After searching in /sys, I found the Mellanox device mlx5_0
(/sys/class/infiniband/mlx5_0/) and was able to run ibstat on it:
# ibstat mlx5_0
CA 'mlx5_0'
CA type: MT4121
Number of ports: 1
Firmware version: 16.19.21102
Hardware version: 0
Node GUID: 0x248a070300aa8466
System image GUID: 0x248a070300aa8466
Port 1:
State: Down
Physical state: Disabled
Rate: 10
Base lid: 65535
LMC: 0
SM lid: 0
Capability mask: 0x2651e848
Port GUID: 0x248a070300aa8466
Link layer: InfiniBand
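
As a cross-check on the firmware version shown above: the "raw fw_ver
0x1013526e" value in the quoted health dump appears to decode to the same
16.19.21102 if the word is split into an 8-bit major, 8-bit minor and 16-bit
sub-minor field. That layout is an assumption inferred from these numbers, not
taken from documentation, and the function name is only illustrative. A
minimal sketch:

```python
def decode_fw_ver(raw: int) -> str:
    """Split a 32-bit raw firmware word into major.minor.subminor.

    Assumed layout (inferred from the log, not from a spec):
    bits 31-24 major, bits 23-16 minor, bits 15-0 sub-minor.
    """
    major = (raw >> 24) & 0xFF
    minor = (raw >> 16) & 0xFF
    subminor = raw & 0xFFFF
    return f"{major}.{minor}.{subminor}"


if __name__ == "__main__":
    # Value copied from the "raw fw_ver" line of the health dump.
    print(decode_fw_ver(0x1013526E))  # prints 16.19.21102
```

If the assumption holds, the health dump and ibstat agree on the firmware
version, so the two outputs describe the same firmware image.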
Shouldn't the device be visible in /dev?
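
For reference, the nodes under /dev/infiniband are named by class and index
(uverbs0, umad0, issm0), not by the HCA name, so mlx5_0 would not be expected
to show up there under that name. Each uverbsN node records the device it
belongs to in /sys/class/infiniband_verbs/uverbsN/ibdev. A minimal sketch of
reading that mapping, assuming the usual sysfs layout; the function name and
the sysfs_root parameter are illustrative only:

```python
from pathlib import Path


def map_uverbs_to_ibdev(sysfs_root: str = "/sys/class/infiniband_verbs") -> dict[str, str]:
    """Return {uverbsN: IB device name} read from each node's 'ibdev' attribute."""
    mapping: dict[str, str] = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # No verbs devices registered, or sysfs is not mounted at this path.
        return mapping
    for node in sorted(root.glob("uverbs*")):
        ibdev = node / "ibdev"
        if ibdev.is_file():
            mapping[node.name] = ibdev.read_text().strip()
    return mapping


if __name__ == "__main__":
    for uverbs, ibdev in map_uverbs_to_ibdev().items():
        print(f"/dev/infiniband/{uverbs} -> {ibdev}")
```

Running it on the affected machine would show which IB device, if any, stands
behind uverbs0; that has not been checked here.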
Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html