From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: [Bug 190951] New: SoftRoCE Performance Puzzle
Date: Fri, 23 Dec 2016 03:59:25 +0000
Message-ID: <bug-190951-11804@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=190951

            Bug ID: 190951
           Summary: SoftRoCE Performance Puzzle
           Product: Drivers
           Version: 2.5
    Kernel Version: 4.9
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Infiniband/RDMA
          Assignee: drivers_infiniband-rdma-ztI5WcYan/vQLgFONoPN62D2FQJk+8+b@public.gmane.org
          Reporter: songweijia-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
        Regression: No

Created attachment 248401
  --> https://bugzilla.kernel.org/attachment.cgi?id=248401&action=edit
SoftRoCE Performance with 10G ethernet

I found that SoftRoCE throughput is much lower than TCP or UDP throughput. I
used two high-end servers, each with a Myricom 10G dual-port NIC, and ran a
CentOS 7 virtual machine on each of them. I upgraded the virtual machine kernel
to the latest 4.9 (2016-12-11) version:
--------------------------------------------------------------------------
[weijia@srvm1 ~]$ uname -a
Linux srvm1 4.9.0 #1 SMP Fri Dec 16 16:35:46 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
--------------------------------------------------------------------------
The two virtual machines use the virtio NIC driver, so the network I/O overhead
is very low. The iperf3 tool shows ~9 Gbps peak throughput with both TCP and UDP:
--------------------------------------------------------------------------
[weijia@srvm1 ~]$ iperf3 -c 192.168.30.10
Connecting to host 192.168.30.10, port 5201
[  4] local 192.168.29.10 port 59986 connected to 192.168.30.10 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  1.06 GBytes  9.12 Gbits/sec    3   1.28 MBytes
[  4]   1.00-2.00   sec  1.09 GBytes  9.39 Gbits/sec    1   1.81 MBytes
[  4]   2.00-3.00   sec  1.06 GBytes  9.14 Gbits/sec    0   2.21 MBytes
[  4]   3.00-4.00   sec  1.09 GBytes  9.36 Gbits/sec    0   2.56 MBytes
[  4]   4.00-5.00   sec  1.07 GBytes  9.15 Gbits/sec    0   2.85 MBytes
[  4]   5.00-6.00   sec  1.09 GBytes  9.39 Gbits/sec    0   3.00 MBytes
[  4]   6.00-7.00   sec  1.07 GBytes  9.21 Gbits/sec    0   3.00 MBytes
[  4]   7.00-8.00   sec  1.09 GBytes  9.39 Gbits/sec    0   3.00 MBytes
[  4]   8.00-9.00   sec  1.09 GBytes  9.39 Gbits/sec    0   3.00 MBytes
[  4]   9.00-10.00  sec  1.09 GBytes  9.38 Gbits/sec    0   3.00 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  10.8 GBytes  9.29 Gbits/sec    4             sender
[  4]   0.00-10.00  sec  10.8 GBytes  9.29 Gbits/sec                  receiver

iperf Done.

[weijia@srvm1 ~]$ iperf3 -c 192.168.30.10 -u -b 15000m
Connecting to host 192.168.30.10, port 5201
[  4] local 192.168.29.10 port 50826 connected to 192.168.30.10 port 5201
[ ID] Interval           Transfer     Bandwidth       Total Datagrams
[  4]   0.00-1.00   sec   976 MBytes  8.19 Gbits/sec  124931
[  4]   1.00-2.00   sec  1.00 GBytes  8.63 Gbits/sec  131657
[  4]   2.00-3.00   sec  1.02 GBytes  8.75 Gbits/sec  133452
[  4]   3.00-4.00   sec  1.05 GBytes  9.02 Gbits/sec  137581
[  4]   4.00-5.00   sec  1.05 GBytes  9.02 Gbits/sec  137567
[  4]   5.00-6.00   sec  1.02 GBytes  8.72 Gbits/sec  133102
[  4]   6.00-7.00   sec  1.00 GBytes  8.61 Gbits/sec  131386
[  4]   7.00-8.00   sec   994 MBytes  8.34 Gbits/sec  127229
[  4]   8.00-9.00   sec  1.04 GBytes  8.94 Gbits/sec  136484
[  4]   9.00-10.00  sec   839 MBytes  7.04 Gbits/sec  107376
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-10.00  sec  9.92 GBytes  8.52 Gbits/sec  0.005 ms  323914/1300764 (25%)
[  4] Sent 1300764 datagrams

iperf Done.
--------------------------------------------------------------------------

Then I used ibv_rc_pingpong and ibv_uc_pingpong to test the bandwidth between
the two virtual machines. The results are extremely low:
--------------------------------------------------------------------------
[weijia@srvm1 ~]$ ibv_rc_pingpong -s 4096 -g 1 -n 1000000 192.168.30.10
  local address:  LID 0x0000, QPN 0x000011, PSN 0x3072e0, GID ::ffff:192.168.29.10
  remote address: LID 0x0000, QPN 0x000011, PSN 0xa54a62, GID ::ffff:192.168.30.10
8192000000 bytes in 220.23 seconds = 297.58 Mbit/sec
1000000 iters in 220.23 seconds = 220.23 usec/iter
[weijia@srvm1 ~]$ ibv_uc_pingpong -s 4096 -g 1 -n 10000 192.168.30.10
  local address:  LID 0x0000, QPN 0x000011, PSN 0x7daab0, GID ::ffff:192.168.29.10
  remote address: LID 0x0000, QPN 0x000011, PSN 0xdd96cf, GID ::ffff:192.168.30.10
81920000 bytes in 67.86 seconds = 9.66 Mbit/sec
10000 iters in 67.86 seconds = 6786.20 usec/iter

--------------------------------------------------------------------------
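
For reference, the figures printed by the pingpong tools above can be reproduced
from the command-line parameters. This is a minimal sketch, assuming each
iteration moves one message of the given size in each direction (the factor of 2
matches the byte totals in the output); the function name is hypothetical and not
part of any tool.

```python
def pingpong_throughput(msg_size, iters, seconds):
    """Recompute the summary lines printed by ibv_*_pingpong.

    Assumption: every iteration carries msg_size bytes in both directions,
    which is consistent with the byte totals shown in the output above.
    """
    total_bytes = msg_size * iters * 2
    mbit_per_sec = total_bytes * 8 / seconds / 1e6
    usec_per_iter = seconds * 1e6 / iters
    return total_bytes, mbit_per_sec, usec_per_iter


# RC run: -s 4096 -n 1000000, 220.23 seconds
rc = pingpong_throughput(4096, 1_000_000, 220.23)
# UC run: -s 4096 -n 10000, 67.86 seconds
uc = pingpong_throughput(4096, 10_000, 67.86)

print("RC: %d bytes, %.2f Mbit/sec, %.2f usec/iter" % rc)
print("UC: %d bytes, %.2f Mbit/sec, %.2f usec/iter" % uc)
```

The usec/iter value for the UC run can differ slightly from the tool's own
output, because the elapsed time shown there is rounded to two decimals.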

Then I repeated the ibv_rc_pingpong experiments with different message sizes,
in both polling and event mode, and measured the CPU utilization of the
ibv_rc_pingpong process. The results are shown in the attached figure. 'poll'
means polling mode, where ibv_rc_pingpong is run without the '-e' option, while
'int' (interrupt mode) denotes event mode with '-e' enabled. It seems the CPU
is saturated when SoftRoCE throughput reaches ~2 Gbit/s. This does not make
sense, since UDP and TCP achieve much higher throughput. Could the SoftRoCE
implementation be optimized?

ibv_devinfo information:
--------------------------------------------------------------------------
[weijia@srvm1 ~]$ ibv_devinfo
hca_id: rxe0
        transport:                      InfiniBand (0)
        fw_ver:                         0.0.0
        node_guid:                      5054:00ff:fe4b:d859
        sys_image_guid:                 0000:0000:0000:0000
        vendor_id:                      0x0000
        vendor_part_id:                 0
        hw_ver:                         0x0
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             1024 (3)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet

--------------------------------------------------------------------------
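
As a side note on the numbers above: with an active_mtu of 1024, a 4096-byte
message is carried in several packets. This is a rough sketch, assuming the
pingpong tests used a path MTU equal to the active_mtu shown and ignoring all
header overhead; the helper name is hypothetical.

```python
import math


def packets_per_message(msg_size, path_mtu):
    """Number of packets needed to carry one message payload.

    Assumption: the payload is split into path_mtu-sized chunks and a
    zero-length message still takes one packet. Headers are ignored.
    """
    return max(1, math.ceil(msg_size / path_mtu))


# 4096-byte messages (-s 4096) over the active_mtu of 1024 reported above
print(packets_per_message(4096, 1024))
```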
