From: David McBride <dwm37@cam.ac.uk>
To: Denis Fondras <ceph@ledeuns.net>
Cc: ceph-devel@vger.kernel.org
Subject: Re: Ceph performance improvement
Date: Wed, 22 Aug 2012 11:24:20 +0100
Message-ID: <5034B354.1040109@cam.ac.uk>
In-Reply-To: <50349E62.90405@ledeuns.net>

On 22/08/12 09:54, Denis Fondras wrote:

> The only point that prevents me from using it at datacenter-scale is
> performance.

> Here are some figures :
> * Test with "dd" on the OSD server (on drive
> /dev/disk/by-id/scsi-SATA_WDC_WD30EZRX-00_WD-WMAWZ0152201) :
> # dd if=/dev/zero of=testdd bs=4k count=4M
> 17179869184 bytes (17 GB) written, 123,746 s, 139 MB/s

That looks like you're writing to a filesystem on that disk, rather than
the block device itself -- but let's say you've got 139MB/sec
(1112Mbit/sec) of straight-line performance.

Note: this is already faster than your network link can go -- you can, 
at best, only achieve 120MB/sec over your gigabit link.
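
As a quick sanity check on those two figures, the conversion can be done
in shell arithmetic.  This is only a sketch: the variable names are mine,
and the 1000Mbit/sec value is the nominal line rate of gigabit Ethernet,
before any protocol overhead, which is why the practical ceiling sits a
little below the number it prints.

```shell
# Convert the dd result to Mbit/sec and compare it with a gigabit link.
disk_mb_per_sec=139            # from the dd run on the OSD server
link_mbit_per_sec=1000         # nominal gigabit Ethernet (assumed)

disk_mbit_per_sec=$((disk_mb_per_sec * 8))
link_mb_per_sec=$((link_mbit_per_sec / 8))   # raw line rate, no overhead

echo "disk: ${disk_mbit_per_sec} Mbit/sec"
echo "link ceiling: ${link_mb_per_sec} MB/sec before protocol overhead"
```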

> * Test with "dd" from the client using RBD :
> # dd if=/dev/zero of=testdd bs=4k count=4M
> 17179869184 bytes (17 GB) written, 406,941 s, 42,2 MB/s

Is this a dd to the RBD device directly, or is this a write to a file in 
a filesystem created on top of it?

dd will write blocks synchronously -- that is, it will write one block, 
wait for the write to complete, then write the next block, and so on. 
Because of the durability guarantees provided by ceph, this will result 
in dd doing a lot of waiting around while writes are being sent over the 
network and written out on your OSD.

(If you're using the default replication count of 2, each write probably
has to be written twice; I'm not exactly sure what Ceph does when it
only has one OSD to work on.)
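
To put a rough number on that waiting, here is the per-write cost implied
by the RBD figures, under the assumption that the 4k writes really are
issued strictly one after another.  The variable names are illustrative,
and the results use integer division, so they round down.

```shell
# Per-write cost implied by the RBD dd run, assuming strictly serial 4k writes.
total_bytes=17179869184        # bytes written, from the dd output
block_bytes=4096               # bs=4k
elapsed_us=406941000           # 406,941 s expressed in microseconds

writes=$((total_bytes / block_bytes))
us_per_write=$((elapsed_us / writes))              # rounds down
writes_per_sec=$((writes * 1000000 / elapsed_us))  # rounds down

echo "${writes} writes, about ${us_per_write} us each, about ${writes_per_sec} writes/sec"
```

You can compare the per-write figure against the round-trip time you
measure between the client and the OSD.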

> * Test with unpacking and deleting OpenBSD/5.1 src.tar.gz from the
> client using RBD :
> # time tar xzf src.tar.gz
> real    0m26.955s
> user    0m9.233s
> sys     0m11.425s

Just ignoring networking and storage for a moment, this also isn't a
fair test: you're comparing the decompress-and-unpack time of a 139MB
tarball on a 3GHz Pentium 4 with 1GB of RAM against the same job on a
quad-core Xeon E5 that has 8GB.

Even ignoring the relative CPU difference, unless you're doing
something clever that you haven't described, there's no guarantee that
the files in the latter case have actually been written to disk -- you
have enough memory on your server for it to buffer all of those writes
in RAM.  You'd need to add a sync() call or similar at the end of your
timing run to ensure that all of those writes have actually been
committed to disk.
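
One way to fold the flush into the measurement is a small wrapper along
these lines.  It is a sketch only: timed_with_sync is a name I've made up,
it relies on "date +%s" being available (so it only has one-second
resolution), and sync flushes everything on the machine, not just the
files you unpacked.

```shell
# Run a command, then sync, and report the wall-clock time for both.
# timed_with_sync is a hypothetical helper, not part of any tool.
timed_with_sync() {
    start=$(date +%s)
    "$@" && sync               # only sync if the command itself succeeded
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start)) s (including sync)" >&2
    return "$status"
}

# Usage, in the directory holding the tarball:
#   timed_with_sync tar xzf src.tar.gz
```

If your shell's time keyword is enough for you, timing
sh -c 'tar xzf src.tar.gz && sync' in one go amounts to the same thing.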

> * Test with "dd" from the client using CephFS :
> # dd if=/dev/zero of=testdd bs=4k count=4M
> 17179869184 bytes (17 GB) written, 338,29 s, 50,8 MB/s

Again, the synchronous nature of 'dd' is probably severely affecting 
apparent performance.  I'd suggest looking at some other tools, like 
fio, bonnie++, or iozone, which might generate more representative load.

(Or, if you have a specific use-case in mind, something that generates 
an IO pattern like what you'll be using in production would be ideal!)
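
As a starting point with fio, something like the following keeps several
4k writes in flight at once instead of one at a time.  Treat every value
here as an assumption to adjust: the helper name, the job name, the 1g
size, the queue depth of 16 and the example path are all placeholders of
mine, and the libaio engine is Linux-specific.

```shell
# Print the arguments for a sequential 4k write job with a queue depth of 16.
# fio_seq_write_args is a hypothetical helper; all values are illustrative.
fio_seq_write_args() {
    target=$1                  # file or device to write to
    printf '%s\n' \
        --name=seqwrite \
        --filename="$target" \
        --rw=write \
        --bs=4k \
        --size=1g \
        --ioengine=libaio \
        --iodepth=16 \
        --direct=1 \
        --end_fsync=1
}

# To run it (path is a placeholder and must not contain spaces):
#   fio $(fio_seq_write_args /mnt/rbd/testfile)
```

The direct=1 and end_fsync=1 options are there so that the page cache
doesn't flatter the result; dropping iodepth to 1 should give you
something closer to the one-write-at-a-time case.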

Cheers,
David
-- 
David McBride <dwm37@cam.ac.uk>
Unix Specialist, University Computing Service
