From: Mark Nelson <mnelson@redhat.com>
To: "Wang, Zhiqiang" <zhiqiang.wang@intel.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: Proxy write performance on fio with zipf distribution
Date: Mon, 18 May 2015 10:21:07 -0500
Message-ID: <555A0363.3020305@redhat.com>
In-Reply-To: <06E7D85B3BA36C4DB207FEDE871C53489E7E4D@SHSMSX101.ccr.corp.intel.com>

On 05/18/2015 02:29 AM, Wang, Zhiqiang wrote:
> Hi all,
>
> This is a follow-up to a previous discussion in the performance weekly meeting on proxy write performance. Several months ago, I tested the performance of proxy write using fio with a uniform random distribution. The performance gain was about 3.5x compared with the case without proxy write. However, a uniform random workload is not an ideal workload for cache tiering. I learned from Mark/Sage that fio can generate non-uniform random (zipf/pareto) workloads using some config options. I did the evaluations. Please allow me to report the results here.
>
> Configurations:
> - 2 Ceph nodes, each with 1x Intel Xeon E3-1275 V2 @3.5GHz CPU, 32GB memory, 10Gb NIC. There are 8 HDDs and 6 Intel DC S3700 SSDs on each node
> - 1 Ceph client, with 2x Intel Xeon X5570 @2.93GHz CPU, 128GB memory and 10Gb NIC, running 20 VMs, each VM runs fio on an RBD
> - Base pool: composed of 16 HDD OSDs, with 4 SSDs acting as the journal
> - Cache pool: composed of 8 SSD OSDs, journals are on the same SSD
> - Data set size: 20x20GB
> - Cache tier configurations: target_max_bytes 100GB, cache_target_dirty_ratio 0.4, cache_target_full_ratio 0.8, write recency 1
> - Code version: proxy write code is at https://github.com/ceph/ceph/pull/3354. The version without proxy write is the same branch with the proxy write commits removed.
> - Fio configuration: 4k random write, random_distribution zipf:1.1, ioengine libaio
> The top 10% of the data is hit over 80% of the time as generated by fio-genzipf. I don't quite understand what the '-b' option means, and fio-genzipf core dumps if I use 4096 for it. So I used 1000000, which seems to be the default.
> # ./fio-genzipf -t zipf -i 1.1 -b 1000000 -g 400 -o 10
> Generating Zipf distribution with 1.100000 input and 400 GB size and 1000000 block_size.
>
>     Rows           Hits %         Sum %           # Hits          Size
> ----------------------------------------------------------------------------------------------------------------------------
> Top  10.00%      82.86%          82.86%           355883        331.44G
> |->  20.00%       4.13%          86.99%            17725         16.51G
> |->  30.00%       2.86%          89.85%            12278         11.43G
> |->  40.00%       1.58%          91.43%             6781          6.32G
> |->  50.00%       1.43%          92.85%             6139          5.72G
> |->  60.00%       1.43%          94.28%             6139          5.72G
> |->  70.00%       1.43%          95.71%             6139          5.72G
> |->  80.00%       1.43%          97.14%             6139          5.72G
> |->  90.00%       1.43%          98.57%             6139          5.72G
> |-> 100.00%       1.43%         100.00%             6134          5.71G
> ---------------------------------------------------------------------------------------------------------------------------
>
> Performance results:
>
> - Without proxy write results
> QD                 1        2        4        8       16
> IOPS             190      200      203      201      198
> Latency (ms)  100.43    191.9   380.57   775.15  1600.66
>
> - With proxy write results
> QD                 1        2        4        8       16
> IOPS             902      896     1067     1207     1486
> Latency (ms)   22.08    44.43    74.83   133.12   217.42
>
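
A quick consistency check on the two result tables, as a sketch only.
Little's law says mean latency equals outstanding I/Os divided by
throughput. The message does not say how IOPS was aggregated, so this
assumes the IOPS rows are totals across all 20 VMs and QD is the
per-VM queue depth, giving 20 * QD I/Os in flight.

```python
# Little's law: mean latency = outstanding I/Os / throughput.
# Assumption (not stated in the message): IOPS is the aggregate over all
# 20 VMs and QD is per VM, so 20 * QD I/Os are in flight at any time.
NUM_VMS = 20
QUEUE_DEPTHS = [1, 2, 4, 8, 16]

# Numbers copied from the two tables in the quoted message.
RESULTS = {
    "without proxy write": {
        "iops": [190, 200, 203, 201, 198],
        "latency_ms": [100.43, 191.9, 380.57, 775.15, 1600.66],
    },
    "with proxy write": {
        "iops": [902, 896, 1067, 1207, 1486],
        "latency_ms": [22.08, 44.43, 74.83, 133.12, 217.42],
    },
}


def predicted_latency_ms(queue_depth, iops, num_vms=NUM_VMS):
    """Latency implied by Little's law, in milliseconds."""
    return num_vms * queue_depth / iops * 1000.0


def max_relative_error(results):
    """Largest relative gap between predicted and reported latency."""
    worst = 0.0
    for data in results.values():
        rows = zip(QUEUE_DEPTHS, data["iops"], data["latency_ms"])
        for queue_depth, iops, measured in rows:
            predicted = predicted_latency_ms(queue_depth, iops)
            worst = max(worst, abs(predicted - measured) / measured)
    return worst
```

Under that assumption every reported latency is within about 5% of the
predicted value. Read that way, the case without proxy write is
saturated at roughly 200 IOPS, so each doubling of QD roughly doubles
latency.
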
> As you can see from above, proxy write improves IOPS from ~200 to between ~900 and ~1500 depending on queue depth, and reduces latency by about 80%.
>
> Any comments or feedback are welcome.
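
The per-queue-depth gain can be worked out directly from the two
tables. The sketch below only redoes the arithmetic on the reported
numbers; the function names are made up for illustration.

```python
# Numbers copied from the two result tables in the quoted message.
QUEUE_DEPTHS = [1, 2, 4, 8, 16]
IOPS_WITHOUT = [190, 200, 203, 201, 198]
IOPS_WITH = [902, 896, 1067, 1207, 1486]
LATENCY_WITHOUT_MS = [100.43, 191.9, 380.57, 775.15, 1600.66]
LATENCY_WITH_MS = [22.08, 44.43, 74.83, 133.12, 217.42]


def iops_speedups():
    """IOPS with proxy write divided by IOPS without, per queue depth."""
    return [w / wo for w, wo in zip(IOPS_WITH, IOPS_WITHOUT)]


def latency_reductions():
    """Fractional latency reduction (0.8 means 80% lower), per queue depth."""
    return [1.0 - w / wo for w, wo in zip(LATENCY_WITH_MS, LATENCY_WITHOUT_MS)]


def report():
    """Return one formatted line per queue depth."""
    lines = []
    for qd, s, r in zip(QUEUE_DEPTHS, iops_speedups(), latency_reductions()):
        lines.append(f"QD {qd:>2}: IOPS x{s:.2f}, latency -{r:.1%}")
    return lines
```

On these numbers the IOPS gain ranges from about 4.5x (QD 2) to about
7.5x (QD 16), and the latency reduction from about 77% to about 86%.
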

Excellent testing, Zhiqiang!  It looks like proxy write helps
dramatically.  Do you happen to know what the performance of the base
pool is without the cache tier involved?  That was the big problem we
saw back in firefly.  We didn't have proxy write, but here are the
results we saw for 4k random writes and 4k random zipf 1.2 writes in 4
different pool configurations (base, base+ssd journal, base+ssd
tiering, ssd as base):

http://nhm.ceph.com/librbdfio-tiering-tests2/randwrite%204K-iops.png

That graph is a bit difficult to make out, but the gist of it is that
in firefly, a spinning disk base pool with an SSD cache pool was
significantly slower than either the base pool or the SSD cache pool
used on its own.  This was true both for 4K random writes and for 4K
random zipf 1.2 distribution writes.  Performance was primarily limited
by the sequential write throughput of the cache tier, because of the
number of 4MB object promotions.
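
To illustrate why promotions can dominate, here is a rough sketch.  It
assumes every cache miss on a 4K write promotes one whole 4MB object;
the miss rate and cache tier bandwidth in the usage note are
hypothetical inputs, not measurements from these tests.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # 4MB object promoted on a cache miss
IO_SIZE = 4 * 1024             # 4K client write


def promotion_amplification(object_size=OBJECT_SIZE, io_size=IO_SIZE):
    """Bytes written to the cache tier per client byte on a miss."""
    return object_size / io_size


def promotion_bound_iops(cache_write_bytes_per_s, miss_rate,
                         object_size=OBJECT_SIZE):
    """Upper bound on client IOPS if promotion traffic alone uses all of
    the cache tier's sequential write bandwidth.

    Both arguments are hypothetical inputs: miss_rate is the fraction of
    client writes that trigger a promotion.
    """
    if not 0.0 < miss_rate <= 1.0:
        raise ValueError("miss_rate must be in (0, 1]")
    return cache_write_bytes_per_s / (miss_rate * object_size)
```

Each miss moves 1024 times more data than the client wrote.  With a
made-up 1 GiB/s of cache tier write bandwidth and a made-up 50% miss
rate, promotion_bound_iops(1024**3, 0.5) gives a ceiling of 512 IOPS.
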

Mark
