From: Mark Nelson <mnelson@redhat.com>
To: "Curley, Matthew" <matthew.curley@hpe.com>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: Reproducing allocator performance differences
Date: Thu, 01 Oct 2015 12:18:18 -0500
Message-ID: <560D6ADA.4000107@redhat.com>
In-Reply-To: <E011655B53F0CC41BB273E3A0B53F14409B749EC@G1W3777.americas.hpqcorp.net>
On 10/01/2015 10:32 AM, Curley, Matthew wrote:
> We've been trying to reproduce the allocator performance impact on 4K random reads seen in the Hackathon (and more recent tests). At this point though, we're not seeing any significant difference between tcmalloc and jemalloc so we're looking for thoughts on what we're doing wrong. Or at least some suggestions to try out.
>
> More detail here:
> https://drive.google.com/file/d/0B2kp18maR7axTmU5WG9WclNKQlU/view?usp=sharing
>
> Thanks for any input!
Hi Matthew,
I can point out a couple of differences in our setups:
1) I have 4 NVMe cards with 4 OSDs per card in each node, i.e. 16 OSDs
total per node. I'm also running the fio processes on the same nodes as
the OSDs, so there is far less CPU available per OSD in my setup.
2) You have more memory per node than I do (and far more memory per OSD)
3) I'm using fio with the librbd engine, not fio+libaio on kernel RBD.
It would be interesting to know if this is having an effect.
4) I'm using RBD cache (and allowing writeback before flush)
5) I'm not using nobarriers
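To make the difference in point 3 concrete, here is a rough sketch that
renders two fio job sections, one using the librbd engine and one using
libaio against a kernel RBD device. The pool name, image name, client
name, device path and queue depth are placeholders chosen for
illustration, not the settings used in either set of tests.

```python
def fio_job(name, engine_options, iodepth=32):
    """Render one fio job section for a 4K random read test."""
    lines = [f"[{name}]", "rw=randread", "bs=4k", f"iodepth={iodepth}"]
    lines += [f"{key}={value}" for key, value in engine_options.items()]
    return "\n".join(lines) + "\n"


# Userspace path: fio talks to the cluster through librbd.
# clientname, pool and rbdname are placeholder values.
librbd_job = fio_job(
    "librbd-4k-randread",
    {"ioengine": "rbd", "clientname": "admin", "pool": "rbd", "rbdname": "fio_test"},
)

# Kernel path: fio uses libaio with O_DIRECT on a mapped RBD block device.
# /dev/rbd0 is a placeholder device path.
krbd_job = fio_job(
    "krbd-4k-randread",
    {"ioengine": "libaio", "filename": "/dev/rbd0", "direct": 1},
)

print(librbd_job)
print(krbd_job)
```

Running both jobs against the same image at the same queue depth would
show whether the client path alone accounts for part of the difference.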
I suspect that in my setup I am very much bound by things other than the
NVMe cards. I think we should look at this in terms of per-node
throughput rather than per-OSD. What I find very interesting is that
you are seeing much higher per-node tcmalloc performance than I am but
fairly similar per-node jemalloc performance. For 4K random reads I saw
about 14K IOPS per node for tcmalloc+32MB TC and around 40K IOPS per
node with tcmalloc+128MB TC or jemalloc. It appears to me that
for both tcmalloc and jemalloc you saw around 50K IOPS per node in the 4
OSD per card case.
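As a quick illustration of the per-node versus per-OSD view, the sketch
below normalizes the approximate figures quoted above for my setup
(16 OSDs per node). The helper names are made up for this example, and
"TC" is read as the tcmalloc thread cache size.

```python
OSDS_PER_NODE = 16  # 4 NVMe cards x 4 OSDs per card

# Approximate 4K random read IOPS per node, as quoted in this message.
per_node_iops = {
    "tcmalloc, 32MB thread cache": 14_000,
    "tcmalloc 128MB thread cache, or jemalloc": 40_000,
}


def per_osd(iops_per_node, osds_per_node):
    """Average IOPS each OSD contributes to the node total."""
    return iops_per_node / osds_per_node


def speedup(baseline_iops, other_iops):
    """How many times faster other_iops is than baseline_iops."""
    return other_iops / baseline_iops


for label, iops in per_node_iops.items():
    print(f"{label}: {iops} IOPS/node, {per_osd(iops, OSDS_PER_NODE):.0f} IOPS/OSD")

values = list(per_node_iops.values())
print(f"speedup over the 32MB case: {speedup(values[0], values[1]):.2f}x")
```

With 16 OSDs sharing one node the per-OSD numbers look small even when
the node total is reasonable, which is why per-node throughput is the
more useful basis for comparing the two setups.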
A couple of thoughts:
1) Did you happen to record any CPU usage data during your tests?
Perhaps with only 4 OSDs per node there is less CPU contention.
2) Did you test 4K random writes? It would be interesting to see if
those results show the same behavior.
3) Since you saw differences in performance with different queue
depths, I'm going to assume this is O_DIRECT? Did you sync/drop
cache on the OSDs before the tests? Was the data pre-filled on the RBD
volumes?
4) Even given the above, you have a lot more memory available for buffer
cache. Did you happen to look at how many of the IOs were actually
hitting the NVMe devices?
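One rough way to check points 3 and 4 is to drop the page cache on the
OSD nodes before a run and then compare the per-device read counters in
/proc/diskstats before and after it. The sketch below assumes Linux,
root access for the cache drop, and device names starting with "nvme";
the function names are invented for this example.

```python
import os


def drop_page_cache():
    """Flush dirty pages, then drop page cache, dentries and inodes (needs root)."""
    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")


def reads_completed(diskstats_text, prefix="nvme"):
    """Map device name -> reads completed, parsed from /proc/diskstats content."""
    counts = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Fields: major, minor, device name, reads completed, ...
        if len(fields) >= 4 and fields[2].startswith(prefix):
            counts[fields[2]] = int(fields[3])
    return counts


def reads_delta(before, after):
    """Reads that reached each device between two snapshots."""
    return {dev: after[dev] - before.get(dev, 0) for dev in after}


# Intended use on an OSD node (not executed here):
#   drop_page_cache()
#   before = reads_completed(open("/proc/diskstats").read())
#   ... run the fio test ...
#   after = reads_completed(open("/proc/diskstats").read())
#   print(reads_delta(before, after))
```

If the deltas are much smaller than the number of reads fio reports,
most of the IOs were served from buffer cache rather than the NVMe
devices. Note that the prefix match also picks up partitions such as
nvme0n1p1 if they exist.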
Mark
>
> --MC