optmize librbd for iops

* optmize librbd for iops
@ 2012-11-12 13:50 Stefan Priebe - Profihost AG
  2012-11-13  7:51 ` Josh Durgin
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Priebe - Profihost AG @ 2012-11-12 13:50 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Hello list,

Are there any plans to optimize librbd for IOPS? Right now I'm able to
get 50,000 IOPS via iSCSI and 100,000 IOPS using multipathing with iSCSI.

With librbd I'm stuck at around 18,000 IOPS. Since this scales with more
hosts but not with more disks in a VM, it must be limited by the rbd
implementation in kvm / librbd.

Greets,
Stefan

* Re: optmize librbd for iops
  2012-11-12 13:50 optmize librbd for iops Stefan Priebe - Profihost AG
@ 2012-11-13  7:51 ` Josh Durgin
  2012-11-13  7:55   ` Stefan Priebe
  0 siblings, 1 reply; 5+ messages in thread
From: Josh Durgin @ 2012-11-13  7:51 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: ceph-devel@vger.kernel.org

On 11/12/2012 05:50 AM, Stefan Priebe - Profihost AG wrote:
> Hello list,
>
> Are there any plans to optimize librbd for IOPS? Right now I'm able to
> get 50,000 IOPS via iSCSI and 100,000 IOPS using multipathing with iSCSI.
>
> With librbd I'm stuck at around 18,000 IOPS. Since this scales with more
> hosts but not with more disks in a VM, it must be limited by the rbd
> implementation in kvm / librbd.

It'd be interesting to see which layers are most limiting in this
case - qemu/kvm, librados, or librbd.

How does rados bench perform with 4k writes, and then 4k reads, with
many concurrent IOs?

Unfortunately there's no librbd read benchmark yet.

Josh


* Re: optmize librbd for iops
  2012-11-13  7:51 ` Josh Durgin
@ 2012-11-13  7:55   ` Stefan Priebe
  2012-11-13  8:20     ` Josh Durgin
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Priebe @ 2012-11-13  7:55 UTC (permalink / raw)
  To: Josh Durgin; +Cc: ceph-devel@vger.kernel.org

On 13.11.2012 08:51, Josh Durgin wrote:
> On 11/12/2012 05:50 AM, Stefan Priebe - Profihost AG wrote:
>> Hello list,
>>
>> Are there any plans to optimize librbd for IOPS? Right now I'm able to
>> get 50,000 IOPS via iSCSI and 100,000 IOPS using multipathing with
>> iSCSI.
>>
>> With librbd I'm stuck at around 18,000 IOPS. Since this scales with more
>> hosts but not with more disks in a VM, it must be limited by the rbd
>> implementation in kvm / librbd.
>
> It'd be interesting to see which layers are most limiting in this
> case - qemu/kvm, librados, or librbd.
>
> How does rados bench perform with 4k writes, and then 4k reads, with
> many concurrent IOs?
Right now I'm using qemu-kvm with librbd, and fio inside the guest. How
does rados bench work?

Stefan

* Re: optmize librbd for iops
  2012-11-13  7:55   ` Stefan Priebe
@ 2012-11-13  8:20     ` Josh Durgin
  2012-11-13 10:10       ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 5+ messages in thread
From: Josh Durgin @ 2012-11-13  8:20 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: ceph-devel@vger.kernel.org

On 11/12/2012 11:55 PM, Stefan Priebe wrote:
> On 13.11.2012 08:51, Josh Durgin wrote:
>> On 11/12/2012 05:50 AM, Stefan Priebe - Profihost AG wrote:
>>> Hello list,
>>>
>>> Are there any plans to optimize librbd for IOPS? Right now I'm able to
>>> get 50,000 IOPS via iSCSI and 100,000 IOPS using multipathing with
>>> iSCSI.
>>>
>>> With librbd I'm stuck at around 18,000 IOPS. Since this scales with more
>>> hosts but not with more disks in a VM, it must be limited by the rbd
>>> implementation in kvm / librbd.
>>
>> It'd be interesting to see which layers are most limiting in this
>> case - qemu/kvm, librados, or librbd.
>>
>> How does rados bench perform with 4k writes, and then 4k reads, with
>> many concurrent IOs?
> Right now I'm using qemu-kvm with librbd, and fio inside the guest. How
> does rados bench work?

rados bench uses librados aio, keeping several operations in flight.
IO size is the same as object size for it.

You can do a 4k write benchmark that doesn't delete the objects it
writes, with 32 IOs in flight for 300 seconds:

rados -p data bench 300 write -b 4096 -t 32 --no-cleanup

Then a read benchmark (only sequential is implemented, but with 4k
objects it's similar to random if you flush the osd's page cache before
running it):

rados -p data bench 300 seq -b 4096 -t 32
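
For the page cache flush mentioned above, a minimal sketch, assuming the
OSD hosts run Linux and that you have root on each of them. The function
name flush_page_cache is made up for illustration; the drop_caches
interface itself is the standard Linux one, not anything Ceph-specific:

```shell
# Hypothetical helper: flush the Linux page cache on the local host.
# Assumption: run as root on every OSD host before the seq benchmark.
flush_page_cache() {
    sync                                 # write dirty pages out to disk first
    echo 3 > /proc/sys/vm/drop_caches    # drop page cache, dentries and inodes
}

# Usage (as root, once per OSD host):
#   flush_page_cache
```

Defining the function does nothing by itself; it only takes effect when
called as root. Without the flush, reads of freshly written objects are
likely to be served from the OSD's memory rather than from disk.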

You can divide the avg throughput by IO size to get IOPS.
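
As a rough sketch of that division, assuming the bandwidth figure is
reported in MB/sec with 1 MB = 1024*1024 bytes (worth checking against
your rados version). The function name iops_from_mbps and the sample
figure are made up for illustration:

```shell
# Hypothetical helper: convert an average bandwidth figure to IOPS.
# Assumption: bandwidth is in MB/sec, where 1 MB = 1024*1024 bytes.
iops_from_mbps() {
    mbps="$1"       # average bandwidth in MB/sec
    io_size="$2"    # IO size in bytes, i.e. the value passed to -b
    awk -v mbps="$mbps" -v size="$io_size" \
        'BEGIN { printf "%d\n", mbps * 1024 * 1024 / size }'
}

# Example with an invented figure: 35.15625 MB/sec at 4096-byte IOs.
iops_from_mbps 35.15625 4096
```

With those invented numbers this prints 9000, since
35.15625 * 1048576 / 4096 = 9000.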

Josh

* Re: optmize librbd for iops
  2012-11-13  8:20     ` Josh Durgin
@ 2012-11-13 10:10       ` Stefan Priebe - Profihost AG
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Priebe - Profihost AG @ 2012-11-13 10:10 UTC (permalink / raw)
  To: Josh Durgin; +Cc: ceph-devel@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 743 bytes --]

On 13.11.2012 09:20, Josh Durgin wrote:
> On 11/12/2012 11:55 PM, Stefan Priebe wrote:
> rados bench uses librados aio, keeping several operations in flight.
> IO size is the same as object size for it.
>
> You can do a 4k write benchmark that doesn't delete the objects it
> writes, with 32 IOs in flight for 300 seconds:
>
> rados -p data bench 300 write -b 4096 -t 32 --no-cleanup

This gives me just 9,000 IOPS. Callgraph attached.

> Then a read benchmark (only sequential is implemented, but with 4k
> objects it's similar to random if you flush the osd's page cache before
> running it):
>
> rados -p data bench 300 seq -b 4096 -t 32

This gives me 43,000 IOPS, but I'm sure readahead or a buffer accounts
for the burst here.

Greets,
Stefan

[-- Attachment #2: out.pdf --]
[-- Type: application/pdf, Size: 15753 bytes --]
