public inbox for linux-rdma@vger.kernel.org
* Need help with strange mlx4_core error
@ 2014-02-03  3:35 Vasiliy Tolstov
From: Vasiliy Tolstov @ 2014-02-03  3:35 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hi all. After switching to kernel 3.10, I sometimes get errors in dmesg
like this:
[Sun Jan 26 09:58:50 2014] <mlx4_ib> destroy_qp_common: modify QP
007def to RESET failed.
[Sun Jan 26 20:27:52 2014] <mlx4_ib> destroy_qp_common: modify QP
0221ca to RESET failed.
[Mon Jan 27 03:44:20 2014] <mlx4_ib> destroy_qp_common: modify QP
0232ad to RESET failed.
[Mon Jan 27 14:23:25 2014] mlx4_core 0000:03:00.0: command 0x19
failed: fw status = 0x9
[Mon Jan 27 14:23:25 2014] ib0: failed to modify QP to INIT: -9
[Mon Jan 27 16:37:00 2014] <mlx4_ib> destroy_qp_common: modify QP
00258a to RESET failed.
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0: Internal error detected:
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[00]: 001805a5
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[01]: 00000000
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[02]: 20060384
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[03]: 00000000
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[04]: 0018050c
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[05]: 00000001
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[06]: 00002cd4
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[07]: 00000084
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[08]: 0000f8af
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[09]: 00004000
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[0a]: 00000000
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[0b]: 00000000
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[0c]: 00000000
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[0d]: 00000000
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[0e]: 00000000
[Mon Jan 27 16:58:26 2014] mlx4_core 0000:03:00.0:   buf[0f]: 00000000
[Mon Jan 27 16:58:26 2014] mlx4_en 0000:03:00.0: Internal error
detected, restarting device
[Mon Jan 27 16:58:35 2014] <mlx4_ib> destroy_qp_common: modify QP
002cd4 to RESET failed.
[Mon Jan 27 16:58:35 2014] ib0: dev_queue_xmit failed to requeue packet
[Mon Jan 27 16:58:35 2014] ib0: dev_queue_xmit failed to requeue packet
[Mon Jan 27 16:58:35 2014] ib0: dev_queue_xmit failed to requeue packet
[Mon Jan 27 16:58:36 2014] mlx4_core: Initializing 0000:03:00.0
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 59 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 60 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 61 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 62 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 63 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 64 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 65 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 66 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 67 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 68 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 69 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 70 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: irq 71 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core 0000:03:00.0: command 0xc failed:
fw status = 0x40
[Mon Jan 27 16:58:38 2014] mlx4_en 0000:03:00.0: UDP RSS is not
supported on this device.
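
To see how often each kind of failure shows up in a longer capture, the
messages above can be tallied with a short script. This is only a minimal
sketch in Python: the regular expressions are modeled on the message wording
quoted in this mail (other kernel versions may word them differently), it
assumes each message sits on a single line as in raw dmesg output rather than
wrapped as in this mail, and the function name summarize_mlx4_events is
hypothetical.

```python
import re
from collections import Counter

# Patterns modeled on the messages quoted in this mail (an assumption:
# other kernel versions may format these messages differently).
QP_RESET_RE = re.compile(r"destroy_qp_common: modify QP (\w+) to RESET failed")
CMD_FAIL_RE = re.compile(
    r"mlx4_core (\S+): command (0x[0-9a-fA-F]+) failed: "
    r"fw status = (0x[0-9a-fA-F]+)"
)
INTERNAL_ERR_RE = re.compile(r"mlx4_core (\S+): Internal error detected")


def summarize_mlx4_events(log_text):
    """Count mlx4-related failures in unwrapped dmesg text."""
    counts = Counter()
    for line in log_text.splitlines():
        if QP_RESET_RE.search(line):
            counts["qp_reset_failed"] += 1
        match = CMD_FAIL_RE.search(line)
        if match:
            # Key by command opcode and firmware status, e.g. "cmd 0x19 status 0x9".
            counts["cmd %s status %s" % (match.group(2), match.group(3))] += 1
        if INTERNAL_ERR_RE.search(line):
            counts["internal_error"] += 1
    return counts
```

It could be fed with the text of `dmesg` saved to a file, for example
summarize_mlx4_events(open("dmesg.txt").read()), where dmesg.txt is a
placeholder file name.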

I don't know what traffic triggers this (I'm using IPoIB in connected
mode), but I think it may happen when someone sends massive UDP
traffic.
What can I do to fix this issue? When the error appears, the ib0 device
(IPoIB) goes down.
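
For anyone reproducing this, the IPoIB mode of an interface can be confirmed
by reading its sysfs "mode" attribute, which reports "connected" or
"datagram". A minimal sketch in Python follows; the interface name ib0 comes
from the log above, while the function name ipoib_mode and the sysfs_root
parameter (there only so the function can be tried against a scratch
directory) are hypothetical.

```python
from pathlib import Path


def ipoib_mode(ifname="ib0", sysfs_root="/sys/class/net"):
    """Return the IPoIB mode string for ifname, or None if it cannot be read."""
    mode_file = Path(sysfs_root) / ifname / "mode"
    try:
        return mode_file.read_text().strip()
    except OSError:
        # Interface missing, not an IPoIB device, or attribute not readable.
        return None
```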

Many thanks for any help.

-- 
Vasiliy Tolstov,
e-mail: v.tolstov-+9FY0jupvH6HXe+LvDLADg@public.gmane.org
jabber: vase-+9FY0jupvH6HXe+LvDLADg@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: Need help with strange mlx4_core error
@ 2014-06-26 16:59 Mark Lehrer
From: Mark Lehrer @ 2014-06-26 16:59 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Feb 28, 2014, Vasiliy Tolstov wrote:

> I updated the firmware to 2.7.200 (the latest from the Supermicro
> FTP). Now I'm trying to test it.

Which type of HCA do you have?  The firmware I downloaded from
ftp.supermicro.com was 2.9.1000, and once I upgraded to this version
from 2.7.200 I was able to use mlx4_en.

lspci shows my card as: Mellanox Technologies MT26428 [ConnectX VPI
PCIe 2.0 5GT/s - IB QDR / 10GigE]
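
To compare the firmware a card is running against the two versions mentioned
here, the "firmware-version" line printed by `ethtool -i` can be parsed into a
tuple. This is only a sketch in Python: the function name
parse_firmware_version is hypothetical, and 2.9.1000 is simply the version
that worked for me, not a documented minimum.

```python
import re


def parse_firmware_version(ethtool_output):
    """Extract the firmware version from `ethtool -i` output as a tuple of ints."""
    for line in ethtool_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "firmware-version":
            match = re.match(r"\s*(\d+(?:\.\d+)*)", value)
            if match:
                return tuple(int(part) for part in match.group(1).split("."))
    # No firmware-version line, or it did not start with a dotted number.
    return None


# Version from this message; an observation, not an official requirement.
KNOWN_GOOD = (2, 9, 1000)
```

Tuples compare element by element, so parse_firmware_version(text) < KNOWN_GOOD
is true for 2.7.200. The text could come from
subprocess.run(["ethtool", "-i", "ib0"], capture_output=True, text=True).stdout,
where the interface name ib0 is an assumption.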


Mark

Thread overview: 13+ messages
2014-02-03  3:35 Need help with strange mlx4_core error Vasiliy Tolstov
     [not found] ` <CAJZOPZJmdGutUQZ4icBrKRq-YPwD8TqtoLH9SgqNVx_CDjL0TQ@mail.gmail.com>
     [not found]   ` <CAJZOPZJmdGutUQZ4icBrKRq-YPwD8TqtoLH9SgqNVx_CDjL0TQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-02-03  7:27     ` Vasiliy Tolstov
     [not found]       ` <CACaajQtHzF9g2kp9MrAaQsvKUQdr10YVa+Np1danmAcR_rvLqg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-02-03 10:54         ` Or Gerlitz
2014-02-03 21:19           ` Vasiliy Tolstov
     [not found]             ` <CACaajQs9ANqLX7m=TcR2wT6wkiS-M5uVvAnQrvGWPmukZnLqcg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-02-04 14:19               ` Vasiliy Tolstov
     [not found]                 ` <CACaajQuRzBy3oV+-3+NjyE9muGKPN3YQUd5wnkkcoak0GSP=uw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-02-28 18:24                   ` Vasiliy Tolstov
     [not found]                     ` <CACaajQtKvihFu6X-+9N3VvVKK4Jw3eADBOENhuM+-4s8s=PsUQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-02-28 18:35                       ` Vasiliy Tolstov
     [not found]                         ` <CACaajQuQ+mbwhp4jxUt+9r4Rq=14oBUqC-=xx6J3EGpAfecG-Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-03-01 12:50                           ` Bart Van Assche
     [not found]                             ` <5311D799.2090607-HInyCGIudOg@public.gmane.org>
2014-03-01 16:55                               ` Vasiliy Tolstov
     [not found]                                 ` <CACaajQuPrAFNAR8d4NphuVRH+pVGC-Yv=afS1cbkrg=48M8ROg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-03-01 17:16                                   ` Vasiliy Tolstov
     [not found]                                     ` <CACaajQuXn5+-KQh_XtB3bb-1q9C+5ezAEh26C1TYT=5ncYJvCw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-03-01 17:23                                       ` Bart Van Assche
     [not found]                                         ` <531217AD.3030807-HInyCGIudOg@public.gmane.org>
2014-03-01 20:06                                           ` Vasiliy Tolstov
  -- strict thread matches above, loose matches on Subject: below --
2014-06-26 16:59 Mark Lehrer
