* krping problem on 4.15-rc4
From: Olga Kornievskaia @ 2018-01-09 15:30 UTC
  To: linux-rdma

Hi folks,

I have 2 Linux machines with CX-5 cards (Mellanox MCX515A-CCAT, one
port), and krping doesn't work in one direction but works in the other.
rping works in both directions. ib_send_bw works in both directions and
displays 39Gb one way and 36Gb the other way on a 40Gb setup.

krping is upstream commit 4df520c888d80e5370d0f58b2eeac8355e3f2286.

Server is started with:
[kolga@localhost krping]$ sudo echo "server,port=9999,addr=172.20.35.191,count=10,verbose" > /proc/krping
And it displays in /var/log/messages:
Jan 4 14:23:29 localhost kernel: mlx5_0:dump_cqe:277:(pid 0): dump error cqe
Jan 4 14:23:29 localhost kernel: 00000000 00000000 00000000 00000000
Jan 4 14:23:29 localhost kernel: 00000000 00000000 00000000 00000000
Jan 4 14:23:29 localhost kernel: 00000000 00000000 00000000 00000000
Jan 4 14:23:29 localhost kernel: 00000000 93003204 10000122 0005bfd2
Jan 4 14:23:29 localhost kernel: krping: cq completion failed with wr_id 0 status 4 opcode 128 vender_err 32
Jan 4 14:23:29 localhost kernel: krping: cq completion in ERROR state
Jan 4 14:23:29 localhost kernel: krping: wait for RDMA_READ_COMPLETE state 10
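
For readers who want to cross-check the numbers in the log above, here is a
minimal Python sketch that decodes the last row of the error CQE dump. It
assumes the byte layout of struct mlx5_err_cqe from the kernel's
include/linux/mlx5/device.h and the usual mlx5 syndrome values; the helper
name decode_err_cqe_tail and both lookup tables are hypothetical and cover
only a few values, so treat the result as a hint, not a definitive decode.

```python
# Hypothetical helper, not part of krping or mlx5. Field offsets are an
# assumption based on struct mlx5_err_cqe (64-byte CQE, last 16 bytes):
#   byte 6      vendor_err_synd
#   byte 7      syndrome
#   bytes 9-11  QP number (low 24 bits of s_wqe_opcode_qpn)
#   bytes 12-13 wqe_counter
#   byte 15     op_own (CQE opcode in the high nibble)

# Assumed subset of the mlx5 CQE syndrome values.
MLX5_SYNDROME = {
    0x01: "LOCAL_LENGTH_ERR",
    0x02: "LOCAL_QP_OP_ERR",
    0x04: "LOCAL_PROT_ERR",
    0x05: "WR_FLUSH_ERR",
}

# Assumed subset of the mlx5 CQE opcodes that mark an error CQE.
MLX5_CQE_OPCODE = {13: "REQ_ERR", 14: "RESP_ERR"}


def decode_err_cqe_tail(row: str) -> dict:
    """Decode the fourth 16-byte row printed by dump_cqe (big-endian words)."""
    raw = bytes.fromhex("".join(row.split()))
    if len(raw) != 16:
        raise ValueError("expected four 32-bit hex words")
    syndrome = raw[7]
    opcode = raw[15] >> 4
    return {
        "vendor_err_synd": raw[6],
        "syndrome": syndrome,
        "syndrome_name": MLX5_SYNDROME.get(syndrome, "unknown"),
        "qpn": int.from_bytes(raw[9:12], "big"),
        "wqe_counter": int.from_bytes(raw[12:14], "big"),
        "cqe_opcode": opcode,
        "cqe_opcode_name": MLX5_CQE_OPCODE.get(opcode, "unknown"),
    }


if __name__ == "__main__":
    fields = decode_err_cqe_tail("00000000 93003204 10000122 0005bfd2")
    for name, value in fields.items():
        print(f"{name}: {value}")
```

Under these assumptions the syndrome byte is 0x04 (a local protection error,
which lines up with "status 4" in the krping message, since IB_WC_LOC_PROT_ERR
is 4 in enum ib_wc_status) and the vendor syndrome byte is 0x32, which would
match "vender_err 32" if krping prints that field in hex.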

Client is run with:
[kolga@sti-rx200-231-d1 ~]$ sudo echo "client,addr=172.20.35.191,port=9999,verbose,count=10" > /proc/krping
And in /var/log/messages:
Jan 4 14:19:27 localhost kernel: krping: DISCONNECT EVENT...
Jan 4 14:19:27 localhost kernel: krping: wait for RDMA_WRITE_ADV state 10
Jan 4 14:19:28 localhost kernel: krping: cq completion in ERROR state

On the network trace I see (over RRoCE):
CM: ConnectRequest
CM: ConnectReply
CM: ReadyToUse
RC Send Only QP
RC Ack
RC RDMA Read Request
RC RDMA Read Response Only
CM: DisconnectRequest
CM: DisconnectReply

I previously submitted this to Mellanox, but they told me to resubmit
it to the linux-rdma list. They also said that engineering looked at
the CQE error and that its meaning was:
PD (protection domain) violation - error in fetch data in rxs in pd
(send opcodes/ read respond / atomic ack).
