* Rados faster than KVM block device?
From: Stefan Priebe - Profihost AG @ 2012-06-28 13:10 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org
Hello list,
my cluster is now pretty stable; I'm just wondering about the sequential
write values.
With the rados bench command and 16 threads I get totally different values
than with KVM and an rbd block device.
rados -p kvmpool bench 60 write -t 16:
pool size 2: Bandwidth (MB/sec): 1137.294
pool size 3: Bandwidth (MB/sec): 846.983
Inside KVM with fio:
fio --filename=$DISK --direct=1 --rw=write --bs=4M --size=200G
--numjobs=16 --runtime=60 --group_reporting --name=file1:
pool size 2:
write: io=32984MB, bw=562046KB/s, iops=137 , runt= 60094msec
pool size 3:
write: io=29124MB, bw=496024KB/s, iops=121 , runt= 60124msec
Even when I change the pool size to 3, I get 520MB/s with fio.
Any ideas? Is this expected?
Greets
Stefan
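To compare the two sets of figures in one unit, the fio numbers can be converted from KB/s to MB/s. This is a minimal sketch, assuming fio's KB/s uses a factor of 1024 (the io= and runt= values on the same lines are consistent with that) and that the rados bench MB/sec figure is on the same basis, which the thread does not state. The function and variable names are illustrative only.

```python
# Figures copied from the post above.
FIO_KB_PER_S = {2: 562046, 3: 496024}        # fio bw=, keyed by pool size
RADOS_MB_PER_S = {2: 1137.294, 3: 846.983}   # rados bench Bandwidth (MB/sec)


def kb_to_mb(kb_per_s):
    """Convert KB/s to MB/s, assuming 1 MB = 1024 KB."""
    return kb_per_s / 1024


def compare():
    """Return (pool size, fio MB/s, rados-to-fio ratio) for each pool size."""
    rows = []
    for size in sorted(FIO_KB_PER_S):
        fio_mb = kb_to_mb(FIO_KB_PER_S[size])
        ratio = RADOS_MB_PER_S[size] / fio_mb
        rows.append((size, fio_mb, ratio))
    return rows


if __name__ == "__main__":
    for size, fio_mb, ratio in compare():
        print(f"pool size {size}: fio {fio_mb:.1f} MB/s, "
              f"rados bench {ratio:.2f}x higher")
```

Under those assumptions fio reaches roughly 549 MB/s and 484 MB/s, so rados bench is about 2.1 times and 1.7 times higher for pool sizes 2 and 3 respectively.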
* Re: Rados faster than KVM block device?
From: Josh Durgin @ 2012-06-28 16:12 UTC (permalink / raw)
To: Stefan Priebe - Profihost AG; +Cc: ceph-devel@vger.kernel.org
On 06/28/2012 06:10 AM, Stefan Priebe - Profihost AG wrote:
> Hello list,
>
> my cluster is now pretty stable i'm just wondering about the sequential
> write values.
>
> With rados bench command and 16 threads i get totally different values
> than with KVM and rbd block device.
>
> rados -p kvmpool bench 60 write -t 16:
> pool size 2: Bandwidth (MB/sec): 1137.294
> pool size 3: Bandwidth (MB/sec): 846.983
>
> Inside KVM with fio:
>
> fio --filename=$DISK --direct=1 --rw=write --bs=4M --size=200G
> --numjobs=16 --runtime=60 --group_reporting --name=file1:
There are a number of differences between running that in a vm on rbd
and rados bench.
Keep in mind it's running on a filesystem, so requests go through the
guest fs and block layer before getting into librbd. These two layers
can break up those 4M writes, so you end up doing many more small
I/Os, which degrades performance considerably. Running those 16 processes
does not directly translate to 16 I/Os in flight from the guest kernel,
which is what rados bench is doing. If you use blktrace on the guest, or just
add --debug-ms 1, you can track the requests the guest is sending by
looking at the lines with 'osd_op\(.*'.
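Picking out those lines from a captured log could look roughly like the sketch below. It only applies the pattern quoted above; it does not parse the lines, since the thread does not show the log format. The sample lines in the demo are made-up placeholders, not real Ceph output, and the function name is illustrative.

```python
import re

# Pattern quoted in the message above: any line containing "osd_op(".
OSD_OP = re.compile(r"osd_op\(.*")


def osd_op_lines(lines):
    """Return the lines that contain an osd_op( message, newlines stripped."""
    return [line.rstrip("\n") for line in lines if OSD_OP.search(line)]


if __name__ == "__main__":
    # Placeholder input; in practice iterate over the captured log file.
    sample = [
        "placeholder line with osd_op(details elided)\n",
        "placeholder line without a request\n",
    ]
    matches = osd_op_lines(sample)
    print(f"{len(matches)} of {len(sample)} lines contain osd_op(")
```

Counting the matching lines for the same fio run gives a rough idea of how many requests the guest actually sent.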
If you don't use direct I/O, and you enable rbd writeback caching,
librbd will be able to merge many of the smaller requests and
you should see much better throughput.
Josh
* Re: Rados faster than KVM block device?
From: Tommi Virtanen @ 2012-06-28 16:39 UTC (permalink / raw)
To: Josh Durgin; +Cc: Stefan Priebe - Profihost AG, ceph-devel@vger.kernel.org
On Thu, Jun 28, 2012 at 9:12 AM, Josh Durgin <josh.durgin@inktank.com> wrote:
> Keep in mind it's running on a filesystem, so requests go through the
> guest fs and block layer before getting into librbd. These two layers
Quick side note: an easy way to get around the filesystem is to use
rbd as the *second* hard drive for a vm, and not even mkfs it. Run
tests on the raw block device.
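Before starting a run, it is easy to confirm that the benchmark target really is a block device node and not a file on a filesystem. This is a minimal sketch using Python's standard os and stat modules; the DISK environment variable mirrors the $DISK used in the fio command earlier in the thread, and /dev/vdb is a hypothetical name for the second drive.

```python
import os
import stat


def is_block_device(path):
    """Return True if path exists and is a block device node."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission: treat as "not a block device".
        return False
    return stat.S_ISBLK(mode)


if __name__ == "__main__":
    # /dev/vdb is only an assumed default; set DISK to the real device.
    disk = os.environ.get("DISK", "/dev/vdb")
    if is_block_device(disk):
        print(f"{disk} is a block device")
    else:
        print(f"{disk} is not a block device")
```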
* Re: Rados faster than KVM block device?
From: Stefan Priebe @ 2012-06-28 21:17 UTC (permalink / raw)
To: Josh Durgin; +Cc: ceph-devel@vger.kernel.org
On 28.06.2012 18:12, Josh Durgin wrote:
> On 06/28/2012 06:10 AM, Stefan Priebe - Profihost AG wrote:
>> Hello list,
>>
>> my cluster is now pretty stable i'm just wondering about the sequential
>> write values.
>>
>> With rados bench command and 16 threads i get totally different values
>> than with KVM and rbd block device.
>>
>> rados -p kvmpool bench 60 write -t 16:
>> pool size 2: Bandwidth (MB/sec): 1137.294
>> pool size 3: Bandwidth (MB/sec): 846.983
>>
>> Inside KVM with fio:
>>
>> fio --filename=$DISK --direct=1 --rw=write --bs=4M --size=200G
>> --numjobs=16 --runtime=60 --group_reporting --name=file1:
>
> There are a number of differences between running that in a vm on rbd
> and rados bench.
>
> Keep in mind it's running on a filesystem, so requests go through the
> guest fs and block layer before getting into librbd.
No, it doesn't; I'm testing the block device directly.
> If you don't use direct I/O, and you enable rbd writeback caching,
> librbd will be able to merge many of the smaller requests and
> you should see much better throughput.
I'm already using rbd writeback, and it works well for random 4k writes,
but it doesn't make sense for sequential 4M writes.
Stefan
* Re: Rados faster than KVM block device?
From: Gregory Farnum @ 2012-07-02 18:33 UTC (permalink / raw)
To: Stefan Priebe; +Cc: Josh Durgin, ceph-devel@vger.kernel.org
On Thu, Jun 28, 2012 at 2:17 PM, Stefan Priebe <s.priebe@profihost.ag> wrote:
> On 28.06.2012 18:12, Josh Durgin wrote:
>
>> On 06/28/2012 06:10 AM, Stefan Priebe - Profihost AG wrote:
>>>
>>> Hello list,
>>>
>>> my cluster is now pretty stable i'm just wondering about the sequential
>>> write values.
>>>
>>> With rados bench command and 16 threads i get totally different values
>>> than with KVM and rbd block device.
>>>
>>> rados -p kvmpool bench 60 write -t 16:
>>> pool size 2: Bandwidth (MB/sec): 1137.294
>>> pool size 3: Bandwidth (MB/sec): 846.983
>>>
>>> Inside KVM with fio:
>>>
>>> fio --filename=$DISK --direct=1 --rw=write --bs=4M --size=200G
>>> --numjobs=16 --runtime=60 --group_reporting --name=file1:
>>
>>
>> There are a number of differences between running that in a vm on rbd
>> and rados bench.
>>
>> Keep in mind it's running on a filesystem, so requests go through the
>> guest fs and block layer before getting into librbd.
>
> No it doesn't i'm testing directly the block device.
>
>
>> If you don't use direct I/O, and you enable rbd writeback caching,
>> librbd will be able to merge many of the smaller requests and
>> you should see much better throughput.
>
> I'm already using rbd writeback and it works good for random 4k writes, But
> it doesn't make sense for sequential 4M writes.
I haven't been able to come up with a good explanation for this, but
if you're interested in exploring it further, you can gather logging
data that includes the messages KVM is sending out to the OSDs (and do
the same for rados bench). Any differences we see there would be
instructive.
-Greg