From: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: songweijia-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
Yonatan Cohen <yonatanc-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: [Bug 190951] New: SoftRoCE Performance Puzzle
Date: Sun, 25 Dec 2016 12:00:20 +0200
Message-ID: <20161225100020.GC14356@mtr-leonro.local>
In-Reply-To: <bug-190951-11804-3bo0kxnWaOQUvHkbgXJLS5sdmw4N0Rt+2LY78lusg7I@public.gmane.org/>
On Fri, Dec 23, 2016 at 03:59:25AM +0000, bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=190951
>
> Bug ID: 190951
> Summary: SoftRoCE Performance Puzzle
> Product: Drivers
> Version: 2.5
> Kernel Version: 4.9
> Hardware: All
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Infiniband/RDMA
> Assignee: drivers_infiniband-rdma-ztI5WcYan/vQLgFONoPN62D2FQJk+8+b@public.gmane.org
> Reporter: songweijia-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
> Regression: No
>
> Created attachment 248401
> --> https://bugzilla.kernel.org/attachment.cgi?id=248401&action=edit
> SoftRoCE Performance with 10G ethernet
>
> I found that SoftRoCE throughput is much lower than TCP or UDP. I used two
> high-end servers, each with a Myricom 10G dual-port NIC, and ran a CentOS-7
> virtual machine on each of them. I upgraded the virtual machine kernel to the
> latest 4.9 (2016-12-11) version:
> --------------------------------------------------------------------------
> [weijia@srvm1 ~]$ uname -a
> Linux srvm1 4.9.0 #1 SMP Fri Dec 16 16:35:46 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
> --------------------------------------------------------------------------
> The two virtual machines use the virtio NIC driver, so the network I/O
> overhead is very low. The iperf tool shows ~9Gbps peak throughput with both
> TCP and UDP:
> --------------------------------------------------------------------------
> [weijia@srvm1 ~]$ iperf3 -c 192.168.30.10
> Connecting to host 192.168.30.10, port 5201
> [ 4] local 192.168.29.10 port 59986 connected to 192.168.30.10 port 5201
> [ ID] Interval Transfer Bandwidth Retr Cwnd
> [ 4] 0.00-1.00 sec 1.06 GBytes 9.12 Gbits/sec 3 1.28 MBytes
> [ 4] 1.00-2.00 sec 1.09 GBytes 9.39 Gbits/sec 1 1.81 MBytes
> [ 4] 2.00-3.00 sec 1.06 GBytes 9.14 Gbits/sec 0 2.21 MBytes
> [ 4] 3.00-4.00 sec 1.09 GBytes 9.36 Gbits/sec 0 2.56 MBytes
> [ 4] 4.00-5.00 sec 1.07 GBytes 9.15 Gbits/sec 0 2.85 MBytes
> [ 4] 5.00-6.00 sec 1.09 GBytes 9.39 Gbits/sec 0 3.00 MBytes
> [ 4] 6.00-7.00 sec 1.07 GBytes 9.21 Gbits/sec 0 3.00 MBytes
> [ 4] 7.00-8.00 sec 1.09 GBytes 9.39 Gbits/sec 0 3.00 MBytes
> [ 4] 8.00-9.00 sec 1.09 GBytes 9.39 Gbits/sec 0 3.00 MBytes
> [ 4] 9.00-10.00 sec 1.09 GBytes 9.38 Gbits/sec 0 3.00 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.00 sec 10.8 GBytes 9.29 Gbits/sec 4 sender
> [ 4] 0.00-10.00 sec 10.8 GBytes 9.29 Gbits/sec receiver
>
> iperf Done.
>
> [weijia@srvm1 ~]$ iperf3 -c 192.168.30.10 -u -b 15000m
> Connecting to host 192.168.30.10, port 5201
> [ 4] local 192.168.29.10 port 50826 connected to 192.168.30.10 port 5201
> [ ID] Interval Transfer Bandwidth Total Datagrams
> [ 4] 0.00-1.00 sec 976 MBytes 8.19 Gbits/sec 124931
> [ 4] 1.00-2.00 sec 1.00 GBytes 8.63 Gbits/sec 131657
> [ 4] 2.00-3.00 sec 1.02 GBytes 8.75 Gbits/sec 133452
> [ 4] 3.00-4.00 sec 1.05 GBytes 9.02 Gbits/sec 137581
> [ 4] 4.00-5.00 sec 1.05 GBytes 9.02 Gbits/sec 137567
> [ 4] 5.00-6.00 sec 1.02 GBytes 8.72 Gbits/sec 133102
> [ 4] 6.00-7.00 sec 1.00 GBytes 8.61 Gbits/sec 131386
> [ 4] 7.00-8.00 sec 994 MBytes 8.34 Gbits/sec 127229
> [ 4] 8.00-9.00 sec 1.04 GBytes 8.94 Gbits/sec 136484
> [ 4] 9.00-10.00 sec 839 MBytes 7.04 Gbits/sec 107376
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
> [ 4] 0.00-10.00 sec 9.92 GBytes 8.52 Gbits/sec 0.005 ms 323914/1300764 (25%)
> [ 4] Sent 1300764 datagrams
>
> iperf Done.
> --------------------------------------------------------------------------
>
> Then I used ibv_rc_pingpong to test the bandwidth between the two virtual
> machines. The result is extremely low:
> --------------------------------------------------------------------------
> [weijia@srvm1 ~]$ ibv_rc_pingpong -s 4096 -g 1 -n 1000000 192.168.30.10
> local address: LID 0x0000, QPN 0x000011, PSN 0x3072e0, GID ::ffff:192.168.29.10
> remote address: LID 0x0000, QPN 0x000011, PSN 0xa54a62, GID ::ffff:192.168.30.10
> 8192000000 bytes in 220.23 seconds = 297.58 Mbit/sec
> 1000000 iters in 220.23 seconds = 220.23 usec/iter
> [weijia@srvm1 ~]$ ibv_uc_pingpong -s 4096 -g 1 -n 10000 192.168.30.10
> local address: LID 0x0000, QPN 0x000011, PSN 0x7daab0, GID ::ffff:192.168.29.10
> remote address: LID 0x0000, QPN 0x000011, PSN 0xdd96cf, GID ::ffff:192.168.30.10
> 81920000 bytes in 67.86 seconds = 9.66 Mbit/sec
> 10000 iters in 67.86 seconds = 6786.20 usec/iter
>
> --------------------------------------------------------------------------
>
> Then I repeated the ibv_rc_pingpong experiments with different message sizes,
> in both polling and event modes, and measured the CPU utilization of the
> ibv_rc_pingpong process. The result is shown in the attached figure. 'poll'
> means polling mode, where ibv_rc_pingpong is run without the '-e' option;
> 'int' (interrupt mode) means event mode, with '-e' enabled. It seems the CPU
> is saturated once SoftRoCE throughput reaches ~2Gbit/s. This does not make
> sense, since UDP and TCP can do much better. Could the SoftRoCE
> implementation be optimized?
>
> ibv_devinfo information:
> --------------------------------------------------------------------------
> [weijia@srvm1 ~]$ ibv_devinfo
> hca_id: rxe0
> transport: InfiniBand (0)
> fw_ver: 0.0.0
> node_guid: 5054:00ff:fe4b:d859
> sys_image_guid: 0000:0000:0000:0000
> vendor_id: 0x0000
> vendor_part_id: 0
> hw_ver: 0x0
> phys_port_cnt: 1
> port: 1
> state: PORT_ACTIVE (4)
> max_mtu: 4096 (5)
> active_mtu: 1024 (3)
> sm_lid: 0
> port_lid: 0
> port_lmc: 0x00
> link_layer: Ethernet
>
> --------------------------------------------------------------------------
Thanks for looking into this.
We are working to fix the issue. Right now, Yonatan is adding various
counters to better instrument SoftRoCE.
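
As a cross-check on the pingpong figures quoted above, here is a minimal
sketch of how the reported Mbit/sec values follow from the message size,
iteration count, and elapsed time. It assumes the byte count is
2 x message size x iterations, which matches the output above
(8192000000 = 2 * 4096 * 1000000). The function name is hypothetical and
is not part of any tool.

```python
def pingpong_mbit_per_sec(msg_size: int, iters: int, seconds: float) -> float:
    """Throughput in Mbit/sec for a ping-pong run.

    Assumption: each iteration moves the message once in each direction,
    so the total is 2 * msg_size * iters bytes.
    """
    total_bytes = 2 * msg_size * iters
    return total_bytes * 8 / seconds / 1e6


# Figures taken from the quoted report (-s 4096, elapsed times as printed).
rc = pingpong_mbit_per_sec(4096, 1_000_000, 220.23)  # ibv_rc_pingpong
uc = pingpong_mbit_per_sec(4096, 10_000, 67.86)      # ibv_uc_pingpong

# iperf3 TCP average from the same report, converted to Mbit/sec.
tcp = 9.29 * 1000

print(f"RC: {rc:.2f} Mbit/sec")
print(f"UC: {uc:.2f} Mbit/sec")
print(f"TCP/RC ratio: {tcp / rc:.1f}x")
```

Note that a ping-pong test sends one message and waits for the reply, so
its result is bound by round-trip latency and is not directly comparable
to a streaming iperf3 run.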