* Ceph RBD performance - random writes
@ 2012-08-08  5:19 Mark Kirkwood
  2012-08-08 18:46 ` Josh Durgin
  2012-08-09 14:48 ` Matthew Richardson
  0 siblings, 2 replies; 10+ messages in thread
From: Mark Kirkwood @ 2012-08-08  5:19 UTC
  To: ceph-devel


I've been looking at using Ceph RBD as a block store for database use.
As part of this I'm looking at how IO (particularly random IO) of
smallish (4K, 8K) block sizes performs.

I've set up Ceph with a single osd and mon spread over two SSDs (Intel
520): a 2G journal on one and the osd data on the other (xfs filesystem).
The Intels are pretty fast, and (despite being shackled by a crappy
Nvidia SATA controller) fly for random IO.

However I am not seeing that reflected in the RBD case. I have the 
device mounted on the local machine where the osd and mon are running 
(so network performance should not be a factor here).

Here is what I did:

Create an rbd device of 10G and mount it on /mnt/vol0:

$ rbd create --size 10240 vol0
$ rbd map vol0
$ mkfs.xfs /dev/rbd0
$ mount /dev/rbd0 /mnt/vol0

Make a file:

$ dd if=/dev/zero of=/mnt/vol0/dump/file bs=4k count=300000 conv=fsync
1228800000 bytes (1.2 GB) copied, 13.4361 s, 91.5 MB/s

Performance is OK if file size < journal (2G):

$ dd if=/dev/zero of=/mnt/vol0/dump/file bs=4096k count=200 conv=fsync
838860800 bytes (839 MB) copied, 9.47086 s, 88.6 MB/s

Not so good if file size > journal:

$ dd if=/dev/zero of=/mnt/vol0/dump/file bs=4096k count=1000 conv=fsync
4194304000 bytes (4.2 GB) copied, 279.891 s, 15.0 MB/s

Random writes (see attached file) synced with sync_file_range are OK if
the block size is big:

$ ./writetest /mnt/vol0/dump/file 4194304 0 1
random writes: 292 of: 4194304 bytes elapsed: 9.8397s io rate: 30/s (118.70 MB/s)

$ ./writetest /mnt/vol0/dump/file 1048576 0 1
random writes: 1171 of: 1048576 bytes elapsed: 10.6042s io rate: 110/s (110.43 MB/s)

$ ./writetest /mnt/vol0/dump/file 131072 0 1
random writes: 9375 of: 131072 bytes elapsed: 15.8075s io rate: 593/s (74.13 MB/s)
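
For readers without the attachment, here is a minimal sketch of what a
random-write test synced with sync_file_range could look like. It is not
the code from writetest.c: the function names pick_offset and
random_sync_writes, the argument order, and the choice of flags are all
assumptions made for illustration. It overwrites an existing file with
block-aligned writes at random offsets and waits for each one to reach
the device before issuing the next. Linux only.

```c
#define _GNU_SOURCE          /* exposes sync_file_range() in <fcntl.h> (Linux only) */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map a random number onto a block-aligned offset
 * that lies inside the file (only whole blocks are considered). */
off_t pick_offset(unsigned long r, off_t file_size, size_t block_size)
{
    assert(block_size > 0 && file_size >= (off_t)block_size);
    off_t blocks = file_size / (off_t)block_size;
    return (off_t)(r % (unsigned long)blocks) * (off_t)block_size;
}

/* Hypothetical helper: issue `count` random writes of `block_size` bytes
 * into an existing file, flushing each written range before continuing.
 * Returns the number of writes completed, or -1 if setup fails. */
long random_sync_writes(const char *path, size_t block_size, long count,
                        unsigned int seed)
{
    if (block_size == 0)
        return -1;

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    off_t file_size = lseek(fd, 0, SEEK_END);
    char *buf = malloc(block_size);
    if (file_size < (off_t)block_size || buf == NULL) {
        free(buf);
        close(fd);
        return -1;
    }
    memset(buf, 'x', block_size);

    /* Assumed flag set: start writeback of the range and wait for it. */
    const unsigned int flags = SYNC_FILE_RANGE_WAIT_BEFORE |
                               SYNC_FILE_RANGE_WRITE |
                               SYNC_FILE_RANGE_WAIT_AFTER;

    long done = 0;
    srand(seed);
    while (done < count) {
        off_t off = pick_offset((unsigned long)rand(), file_size, block_size);
        if (pwrite(fd, buf, block_size, off) != (ssize_t)block_size)
            break;
        if (sync_file_range(fd, off, (off_t)block_size, flags) != 0)
            break;
        done++;
    }

    free(buf);
    close(fd);
    return done;
}
```

A caller would time something like
random_sync_writes("/mnt/vol0/dump/file", 8192, 1000, 1) and divide the
returned count by the elapsed seconds to get an io rate. Note that
sync_file_range does not flush file metadata, so a test like this is only
meaningful when overwriting blocks that are already allocated.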


However, a smallish block size is suicide (it triggers the suicide assert
after a while). I see 100 IOPS or less on the actual devices, all at 100%
util:

$ ./writetest /mnt/vol0/dump/file 8192 0 1

I am running into http://tracker.newdream.net/issues/2784 here I think.

Note that the actual SSDs are very fast for this when accessed directly:

$ ./writetest /data1/ceph/1/file 8192 0 1
random writes: 1000000 of: 8192 bytes elapsed: 125.7907s io rate: 7950/s (62.11 MB/s)
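
The rates quoted in this message can be cross-checked from the byte
counts and elapsed times. The sketch below (helper names are made up for
illustration) does that arithmetic. One thing it brings out: the dd
figures match decimal megabytes (10^6 bytes), while the writetest
figures appear consistent with binary megabytes (2^20 bytes), so the two
sets of "MB/s" numbers are not on quite the same scale.

```c
#include <assert.h>

/* Decimal megabytes per second (1 MB = 10^6 bytes), the unit dd prints. */
double mb_per_sec(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds / 1e6;
}

/* Binary megabytes per second (1 MiB = 2^20 bytes); the writetest
 * output above appears to match this unit. */
double mib_per_sec(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds / 1048576.0;
}

/* Completed writes per second. */
double ios_per_sec(double writes, double seconds)
{
    assert(seconds > 0.0);
    return writes / seconds;
}
```

For example, mb_per_sec(4194304000.0, 279.891) gives about 15.0, matching
the slow dd run, and mib_per_sec(292.0 * 4194304.0, 9.8397) gives about
118.7, matching the 4M writetest run. ios_per_sec(1000000.0, 125.7907)
gives about 7950, which is roughly 80 times the 100 IOPS or less seen on
the devices under RBD.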


Thanks for your patience in reading so far - some actual questions now :-)

1/ Why is the appending write from dd so slow when the file size >
journal, despite reasonably capable storage devices?

2/ Is the sudden dramatic drop in random write performance a
manifestation of the "small requests are slow" issue, or is this
something else?


Thanks

Mark



[-- Attachment #2: ceph.conf.gz --]
[-- Type: application/x-gzip, Size: 1237 bytes --]

[-- Attachment #3: writetest.c.gz --]
[-- Type: application/x-gzip, Size: 1656 bytes --]


Thread overview: 10+ messages
2012-08-08  5:19 Ceph RBD performance - random writes Mark Kirkwood
2012-08-08 18:46 ` Josh Durgin
2012-08-08 21:58   ` Mark Nelson
2012-08-08 23:36     ` Mark Kirkwood
2012-08-09  0:43       ` Mark Kirkwood
2012-08-09  3:54         ` Mark Kirkwood
2012-08-09 11:42           ` Mark Nelson
2012-08-09 23:31             ` Mark Kirkwood
2012-08-14  5:41               ` Mark Kirkwood
2012-08-09 14:48 ` Matthew Richardson
