From: Mark Nelson <mark.nelson@inktank.com>
To: sheng qiu <herbert1984106@gmail.com>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: some performance issue
Date: Fri, 01 Feb 2013 15:10:33 -0600	[thread overview]
Message-ID: <510C2F49.8010401@inktank.com> (raw)
In-Reply-To: <CAB7xdinwhd6gmqbzeC7Gp+ugDbJhOcr1TwX-1KYg5=geWbT0+g@mail.gmail.com>

On 02/01/2013 02:20 PM, sheng qiu wrote:
> Hi,
>
> I ran an experiment that gave some interesting results.
>
> I created two OSDs (ext4), each on an SSD attached to the same PC. I also
> configured one monitor and one MDS on that PC, so the OSDs, monitor, and
> MDS are all located on the same node.
>
> I set up the Ceph service and mounted Ceph on a local directory on that
> PC, so the client, OSDs, monitor, and MDS are all on the same node. I
> assumed this would exclude the network communication cost.
>
> I ran an fio benchmark that creates one 10 GB file (larger than main
> memory) on the Ceph mount point. It performs sequential read/write and
> random read/write on the file and reports the throughput.
>
> Next I unmounted Ceph and stopped the Ceph service. I created ext4 on the
> same SSD that was used as an OSD before, then ran the same workloads and
> recorded the throughput.
>
> here are the results:
>
> Throughput (KB/s):
>
>       Seq-read  Rand-read  Seq-write  Rand-write
> ceph      7378       4740        790        1211
> ext4     58260      17334      54697       34257
>
> As you can see, Ceph shows a huge performance drop, even though the
> monitor, MDS, client, and OSDs are all on the same physical machine.
> Another interesting thing is that sequential write has lower throughput
> than random write under Ceph. I am not sure why.
>
> Does anyone have an idea why Ceph shows this performance drop?

Hi Sheng,

Are you using RBD or CephFS (and the kernel or userland client)?  How much
replication?  Also, what fio settings?

In general, it is difficult to make distributed storage systems perform 
as well as local storage for small read/write workloads.  You need a lot 
of concurrency to hide the latencies, and if the local storage is 
incredibly fast (like an SSD!) you have a huge uphill battle.
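
To make the concurrency point concrete, here is a minimal sketch based on
Little's law (operations per second = operations in flight / latency per
operation). The 2 ms latency and 4 KB block size are made-up illustrative
values, not measurements from this setup, and the function name is
hypothetical.

```python
def max_throughput_kb_s(queue_depth, latency_s, block_kb):
    """Upper bound on throughput from Little's law.

    ops/s = operations in flight / latency per operation
    """
    iops = queue_depth / latency_s
    return iops * block_kb


if __name__ == "__main__":
    # Assumed values for illustration only: 2 ms per op, 4 KB blocks.
    for qd in (1, 4, 16, 64):
        bound = max_throughput_kb_s(qd, latency_s=0.002, block_kb=4)
        print(f"queue depth {qd:>2}: at most {bound:.0f} KB/s")
```

With a fixed per-operation latency, the bound scales linearly with the
number of operations kept in flight, which is why a low queue depth hurts
a distributed store far more than a local SSD.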

Regarding the network: even though you ran everything on localhost, Ceph
still uses TCP sockets for all of its communication.
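
For a rough feel for what a loopback TCP round trip costs on a given
machine, a sketch like the following can help. It measures only a bare
1-byte echo over 127.0.0.1 and says nothing about Ceph's own messaging
overhead; the function names are hypothetical.

```python
import socket
import threading
import time


def echo_server(server_sock, count):
    """Accept one connection and echo back `count` small messages."""
    conn, _ = server_sock.accept()
    with conn:
        for _ in range(count):
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def loopback_rtt_seconds(count=1000):
    """Return the mean round-trip time of a 1-byte echo over 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    worker = threading.Thread(target=echo_server, args=(server, count), daemon=True)
    worker.start()

    client = socket.create_connection(("127.0.0.1", port))
    # Disable Nagle's algorithm so small writes are sent immediately.
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    start = time.perf_counter()
    for _ in range(count):
        client.sendall(b"x")
        client.recv(64)
    elapsed = time.perf_counter() - start

    client.close()
    worker.join()
    server.close()
    return elapsed / count


if __name__ == "__main__":
    print(f"mean loopback round trip: {loopback_rtt_seconds() * 1e6:.1f} us")
```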

Having said that, I think we can do better than 790 KB/s for sequential
writes, even with 2x replication.  The trick is to find where in the stack
things are getting held up.  You might want to look at tools like iostat
and collectl, and at some of the op latency data exposed through the Ceph
admin socket.  A basic introduction is in Sébastien Han's article here:

http://www.sebastien-han.fr/blog/2012/08/14/ceph-admin-socket/
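
As a sketch of what to do with that data: a perf dump from the admin
socket (for example `ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok
perf dump`, assuming the default socket path) returns JSON in which
latency counters appear as an `avgcount`/`sum` pair. The snippet below
turns each pair into a mean latency. The sample JSON and its numbers are
invented for illustration, and the exact counter names vary between Ceph
versions.

```python
import json

# Invented sample shaped like a perf dump; not real measurements.
SAMPLE = """
{
  "osd": {
    "op": 2000,
    "op_latency":   {"avgcount": 2000, "sum": 9.0},
    "op_w_latency": {"avgcount": 1500, "sum": 8.25}
  }
}
"""


def mean_latencies(perf_dump):
    """Return {"section.counter": mean seconds} for every avgcount/sum pair."""
    result = {}
    for section, counters in perf_dump.items():
        for name, value in counters.items():
            # Plain integer counters are skipped; so are empty averages.
            if isinstance(value, dict) and value.get("avgcount"):
                result[f"{section}.{name}"] = value["sum"] / value["avgcount"]
    return result


if __name__ == "__main__":
    for counter, seconds in sorted(mean_latencies(json.loads(SAMPLE)).items()):
        print(f"{counter}: {seconds * 1000:.2f} ms")
```

Comparing these means across the OSD, journal, and filestore counters is
one way to narrow down which layer is adding the latency.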

>
> Thanks,
> Sheng
>
>


