From: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
To: Yan Burman <yanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Wendy Cheng
<s.wendy.cheng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
"Atchley, Scott" <atchleyes-1Heg1YXhbW8@public.gmane.org>,
Tom Tucker
<tom-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: NFS over RDMA benchmark
Date: Sun, 28 Apr 2013 10:42:48 -0400
Message-ID: <20130428144248.GA2037@fieldses.org>
In-Reply-To: <0EE9A1CDC8D6434DB00095CD7DB873462CF9A820-fViJhHBwANKuSA5JZHE7gA@public.gmane.org>
On Sun, Apr 28, 2013 at 06:28:16AM +0000, Yan Burman wrote:
> > > > > > > >> On Wed, Apr 17, 2013 at 7:36 AM, Yan Burman
> > > > > > > >> <yanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > > > > > > >>> I've been trying to do some benchmarks for NFS over RDMA
> > > > > > > >>> and I seem to only get about half of the bandwidth that
> > > > > > > >>> the HW can give me.
> > > > > > > >>> My setup consists of 2 servers each with 16 cores, 32Gb of
> > > > > > > >>> memory, and Mellanox ConnectX3 QDR card over PCI-e gen3.
> > > > > > > >>> These servers are connected to a QDR IB switch. The
> > > > > > > >>> backing storage on the server is tmpfs mounted with noatime.
> > > > > > > >>> I am running kernel 3.5.7.
> > > > > > > >>>
> > > > > > > >>> When running ib_send_bw, I get 4.3-4.5 GB/sec for block
> > > > > > > >>> sizes 4-512K.
> > > > > > > >>> When I run fio over rdma mounted nfs, I get 260-2200MB/sec
> > > > > > > >>> for the same block sizes (4-512K). running over IPoIB-CM,
> > > > > > > >>> I get 200-980MB/sec.
...
> > > > > > I am trying to get maximum performance from a single server - I
> > > > > > used 2 processes in fio test - more than 2 did not show any
> > > > > > performance boost.
> > > > > > I tried running fio from 2 different PCs on 2 different files,
> > > > > > but the sum of the two is more or less the same as running from
> > > > > > single client PC.
> > > > > >
> > > > > > What I did see is that server is sweating a lot more than the
> > > > > > clients and more than that, it has 1 core (CPU5) in 100% softirq
> > > > > > tasklet:
> > > > > > cat /proc/softirqs
...
> > > > Perf top for the CPU with high tasklet count gives:
> > > >
> > > > samples pcnt RIP function DSO
...
> > > > 2787.00 24.1% ffffffff81062a00 mutex_spin_on_owner /root/vmlinux
...
> > Googling around.... I think we want:
> >
> > perf record -a --call-graph
> > (give it a chance to collect some samples, then ^C)
> > perf report --call-graph --stdio
> >
>
> Sorry it took me a while to get perf to show the call trace (did not enable frame pointers in kernel and struggled with perf options...), but what I get is:
> 36.18% nfsd [kernel.kallsyms] [k] mutex_spin_on_owner
> |
> --- mutex_spin_on_owner
> |
> |--99.99%-- __mutex_lock_slowpath
> | mutex_lock
> | |
> | |--85.30%-- generic_file_aio_write
That's the inode i_mutex.
> | | do_sync_readv_writev
> | | do_readv_writev
> | | vfs_writev
> | | nfsd_vfs_write
> | | nfsd_write
> | | nfsd3_proc_write
> | | nfsd_dispatch
> | | svc_process_common
> | | svc_process
> | | nfsd
> | | kthread
> | | kernel_thread_helper
> | |
> | --14.70%-- svc_send
That's the xpt_mutex (ensuring rpc replies aren't interleaved).
> | svc_process
> | nfsd
> | kthread
> | kernel_thread_helper
> --0.01%-- [...]
>
> 9.63% nfsd [kernel.kallsyms] [k] _raw_spin_lock_irqsave
> |
> --- _raw_spin_lock_irqsave
> |
> |--43.97%-- alloc_iova
And that (and __free_iova below) looks like iova_rbtree_lock.
--b.
> | intel_alloc_iova
> | __intel_map_single
> | intel_map_page
> | |
> | |--60.47%-- svc_rdma_sendto
> | | svc_send
> | | svc_process
> | | nfsd
> | | kthread
> | | kernel_thread_helper
> | |
> | |--30.10%-- rdma_read_xdr
> | | svc_rdma_recvfrom
> | | svc_recv
> | | nfsd
> | | kthread
> | | kernel_thread_helper
> | |
> | |--6.69%-- svc_rdma_post_recv
> | | send_reply
> | | svc_rdma_sendto
> | | svc_send
> | | svc_process
> | | nfsd
> | | kthread
> | | kernel_thread_helper
> | |
> | --2.74%-- send_reply
> | svc_rdma_sendto
> | svc_send
> | svc_process
> | nfsd
> | kthread
> | kernel_thread_helper
> |
> |--37.52%-- __free_iova
> | flush_unmaps
> | add_unmap
> | intel_unmap_page
> | |
> | |--97.18%-- svc_rdma_put_frmr
> | | sq_cq_reap
> | | dto_tasklet_func
> | | tasklet_action
> | | __do_softirq
> | | call_softirq
> | | do_softirq
> | | |
> | | |--97.40%-- irq_exit
> | | | |
> | | | |--99.85%-- do_IRQ
> | | | | ret_from_intr
> | | | | |
> | | | | |--40.74%-- generic_file_buffered_write
> | | | | | __generic_file_aio_write
> | | | | | generic_file_aio_write
> | | | | | do_sync_readv_writev
> | | | | | do_readv_writev
> | | | | | vfs_writev
> | | | | | nfsd_vfs_write
> | | | | | nfsd_write
> | | | | | nfsd3_proc_write
> | | | | | nfsd_dispatch
> | | | | | svc_process_common
> | | | | | svc_process
> | | | | | nfsd
> | | | | | kthread
> | | | | | kernel_thread_helper
> | | | | |
> | | | | |--25.21%-- __mutex_lock_slowpath
> | | | | | mutex_lock
> | | | | | |
> | | | | | |--94.84%-- generic_file_aio_write
> | | | | | | do_sync_readv_writev
> | | | | | | do_readv_writev
> | | | | | | vfs_writev
> | | | | | | nfsd_vfs_write
> | | | | | | nfsd_write
> | | | | | | nfsd3_proc_write
> | | | | | | nfsd_dispatch
> | | | | | | svc_process_common
> | | | | | | svc_process
> | | | | | | nfsd
> | | | | | | kthread
> | | | | | | kernel_thread_helper
> | | | | | |
>
> The entire trace is almost 1MB, so send me an off-list message if you want it.
>
> Yan
>
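
To put the three locks on a common scale, the nested percentages in the trace above can be multiplied down each branch. This is only a rough sketch: it assumes the child percentages are relative to their parent entry (the sibling branches shown sum to 100%, which suggests that), and the lock names attached to each branch are the guesses made above, not something perf reports. The helper name absolute_share is made up for illustration.

```python
# Percentages are copied from the quoted perf report. Assumption: each nested
# percentage is relative to its parent entry, so multiplying down a branch
# gives that branch's share of all samples.

def absolute_share(*chain):
    """Multiply a top-level percentage by each nested relative percentage."""
    share = chain[0]
    for pct in chain[1:]:
        share *= pct / 100.0
    return share

# Lock attributions follow the comments in this reply; the iova one is a guess.
hotspots = {
    "i_mutex via generic_file_aio_write": absolute_share(36.18, 99.99, 85.30),
    "xpt_mutex via svc_send": absolute_share(36.18, 99.99, 14.70),
    "iova_rbtree_lock (likely) via alloc_iova": absolute_share(9.63, 43.97),
    "iova_rbtree_lock (likely) via __free_iova": absolute_share(9.63, 37.52),
}

# Print the branches from largest to smallest share of all samples.
for name, share in sorted(hotspots.items(), key=lambda kv: -kv[1]):
    print(f"{share:5.2f}%  {name}")
```

Under that assumption, spinning on the i_mutex in the write path accounts for roughly 31% of all samples, the xpt_mutex for about 5%, and the two iova paths together for a bit under 8%.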