From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: [Bug 190951] New: SoftRoCE Performance Puzzle
Date: Fri, 23 Dec 2016 03:59:25 +0000
Message-ID: <bug-190951-11804@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=190951
Bug ID: 190951
Summary: SoftRoCE Performance Puzzle
Product: Drivers
Version: 2.5
Kernel Version: 4.9
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Infiniband/RDMA
Assignee: drivers_infiniband-rdma-ztI5WcYan/vQLgFONoPN62D2FQJk+8+b@public.gmane.org
Reporter: songweijia-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Regression: No
Created attachment 248401
--> https://bugzilla.kernel.org/attachment.cgi?id=248401&action=edit
SoftRoCE Performance with 10G ethernet
I found that SoftRoCE throughput is much lower than TCP or UDP throughput. I
used two high-end servers, each with a Myricom 10G dual-port NIC, and ran a
CentOS 7 virtual machine on each. I upgraded the virtual machine kernel to the
latest 4.9 release (2016-12-11):
--------------------------------------------------------------------------
[weijia@srvm1 ~]$ uname -a
Linux srvm1 4.9.0 #1 SMP Fri Dec 16 16:35:46 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
--------------------------------------------------------------------------
The two virtual machines use the virtio NIC driver, so the network I/O overhead
is very low. iperf3 shows ~9 Gbit/s peak throughput with both TCP and UDP:
--------------------------------------------------------------------------
[weijia@srvm1 ~]$ iperf3 -c 192.168.30.10
Connecting to host 192.168.30.10, port 5201
[ 4] local 192.168.29.10 port 59986 connected to 192.168.30.10 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 1.06 GBytes 9.12 Gbits/sec 3 1.28 MBytes
[ 4] 1.00-2.00 sec 1.09 GBytes 9.39 Gbits/sec 1 1.81 MBytes
[ 4] 2.00-3.00 sec 1.06 GBytes 9.14 Gbits/sec 0 2.21 MBytes
[ 4] 3.00-4.00 sec 1.09 GBytes 9.36 Gbits/sec 0 2.56 MBytes
[ 4] 4.00-5.00 sec 1.07 GBytes 9.15 Gbits/sec 0 2.85 MBytes
[ 4] 5.00-6.00 sec 1.09 GBytes 9.39 Gbits/sec 0 3.00 MBytes
[ 4] 6.00-7.00 sec 1.07 GBytes 9.21 Gbits/sec 0 3.00 MBytes
[ 4] 7.00-8.00 sec 1.09 GBytes 9.39 Gbits/sec 0 3.00 MBytes
[ 4] 8.00-9.00 sec 1.09 GBytes 9.39 Gbits/sec 0 3.00 MBytes
[ 4] 9.00-10.00 sec 1.09 GBytes 9.38 Gbits/sec 0 3.00 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 10.8 GBytes 9.29 Gbits/sec 4 sender
[ 4] 0.00-10.00 sec 10.8 GBytes 9.29 Gbits/sec receiver
iperf Done.
[weijia@srvm1 ~]$ iperf3 -c 192.168.30.10 -u -b 15000m
Connecting to host 192.168.30.10, port 5201
[ 4] local 192.168.29.10 port 50826 connected to 192.168.30.10 port 5201
[ ID] Interval Transfer Bandwidth Total Datagrams
[ 4] 0.00-1.00 sec 976 MBytes 8.19 Gbits/sec 124931
[ 4] 1.00-2.00 sec 1.00 GBytes 8.63 Gbits/sec 131657
[ 4] 2.00-3.00 sec 1.02 GBytes 8.75 Gbits/sec 133452
[ 4] 3.00-4.00 sec 1.05 GBytes 9.02 Gbits/sec 137581
[ 4] 4.00-5.00 sec 1.05 GBytes 9.02 Gbits/sec 137567
[ 4] 5.00-6.00 sec 1.02 GBytes 8.72 Gbits/sec 133102
[ 4] 6.00-7.00 sec 1.00 GBytes 8.61 Gbits/sec 131386
[ 4] 7.00-8.00 sec 994 MBytes 8.34 Gbits/sec 127229
[ 4] 8.00-9.00 sec 1.04 GBytes 8.94 Gbits/sec 136484
[ 4] 9.00-10.00 sec 839 MBytes 7.04 Gbits/sec 107376
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 4] 0.00-10.00 sec 9.92 GBytes 8.52 Gbits/sec 0.005 ms 323914/1300764 (25%)
[ 4] Sent 1300764 datagrams
iperf Done.
--------------------------------------------------------------------------
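The UDP summary line above also reports datagram loss, which is worth keeping
in mind when reading the ~9 Gbit/s figure. The following is only a small
arithmetic sketch over the numbers printed in that summary; the function name
is made up for illustration, and it does nothing beyond subtracting and
dividing the two counters.

```python
def udp_loss_summary(lost, total):
    """Return (delivered datagrams, loss percentage) from iperf3's Lost/Total counters."""
    delivered = total - lost
    loss_pct = 100.0 * lost / total
    return delivered, loss_pct


# Counters copied from the "Lost/Total Datagrams" column of the UDP run.
delivered, loss_pct = udp_loss_summary(lost=323914, total=1300764)
print(f"delivered datagrams: {delivered}")
print(f"loss: {loss_pct:.1f}%")
```

About a quarter of the datagrams sent at the 8.52 Gbit/s rate were not counted
as received.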
Then I used ibv_rc_pingpong and ibv_uc_pingpong to test the bandwidth between
the two virtual machines. The results are extremely low:
--------------------------------------------------------------------------
[weijia@srvm1 ~]$ ibv_rc_pingpong -s 4096 -g 1 -n 1000000 192.168.30.10
local address: LID 0x0000, QPN 0x000011, PSN 0x3072e0, GID ::ffff:192.168.29.10
remote address: LID 0x0000, QPN 0x000011, PSN 0xa54a62, GID ::ffff:192.168.30.10
8192000000 bytes in 220.23 seconds = 297.58 Mbit/sec
1000000 iters in 220.23 seconds = 220.23 usec/iter
[weijia@srvm1 ~]$ ibv_uc_pingpong -s 4096 -g 1 -n 10000 192.168.30.10
local address: LID 0x0000, QPN 0x000011, PSN 0x7daab0, GID ::ffff:192.168.29.10
remote address: LID 0x0000, QPN 0x000011, PSN 0xdd96cf, GID ::ffff:192.168.30.10
81920000 bytes in 67.86 seconds = 9.66 Mbit/sec
10000 iters in 67.86 seconds = 6786.20 usec/iter
--------------------------------------------------------------------------
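For reference, the figures in the pingpong output above can be reproduced from
the command-line arguments and the elapsed time. This is a minimal sketch, not
the tool's source: the function name is hypothetical, and the factor of 2 is
inferred from the byte totals printed above, which equal 2 x message size x
iterations (each iteration moves one message in each direction).

```python
def pingpong_stats(size_bytes, iters, seconds):
    """Recompute the summary lines printed by ibv_*_pingpong.

    Returns (total bytes, Mbit/sec, usec per iteration).
    """
    # One ping plus one pong per iteration, as the byte totals above suggest.
    total_bytes = 2 * size_bytes * iters
    mbit_per_sec = total_bytes * 8 / seconds / 1e6
    usec_per_iter = seconds * 1e6 / iters
    return total_bytes, mbit_per_sec, usec_per_iter


# RC run: -s 4096 -n 1000000, 220.23 seconds
rc = pingpong_stats(4096, 1_000_000, 220.23)
# UC run: -s 4096 -n 10000, 67.86 seconds
uc = pingpong_stats(4096, 10_000, 67.86)

for name, (nbytes, mbit, usec) in (("rc", rc), ("uc", uc)):
    print(f"{name}: {nbytes} bytes, {mbit:.2f} Mbit/sec, {usec:.2f} usec/iter")
```

Because the seconds value is taken from the rounded output, the per-iteration
time for the UC run comes out marginally different from the 6786.20 usec the
tool printed. Note also that each iteration waits for the peer's reply, so the
Mbit/sec figure reflects round-trip time per message rather than a streaming
rate.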
Then I repeated the ibv_rc_pingpong experiments with different message sizes,
in both polling and event mode, and also measured the CPU utilization of the
ibv_rc_pingpong process. The results are shown in the attached figure. 'poll'
means polling mode, where ibv_rc_pingpong is run without the '-e' option, while
'int' (interrupt mode) means event mode, with '-e' enabled. The CPU appears to
be saturated once SoftRoCE throughput reaches ~2 Gbit/s. This does not make
sense, since UDP and TCP do much better. Could the SoftRoCE implementation be
optimized?
ibv_devinfo information:
--------------------------------------------------------------------------
[weijia@srvm1 ~]$ ibv_devinfo
hca_id: rxe0
transport: InfiniBand (0)
fw_ver: 0.0.0
node_guid: 5054:00ff:fe4b:d859
sys_image_guid: 0000:0000:0000:0000
vendor_id: 0x0000
vendor_part_id: 0
hw_ver: 0x0
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
--------------------------------------------------------------------------