From: Jason Dillaman <dillaman@redhat.com>
To: Huan Zhang <huan.zhang.jn@gmail.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: rbd_aio_flush cause guestos sync wirte poor iops?
Date: Fri, 18 Mar 2016 08:02:37 -0400 (EDT)
Message-ID: <1216780323.39858060.1458302557470.JavaMail.zimbra@redhat.com>
In-Reply-To: <CAFU9Con=_qXF37XSre1Ep7Ji5MFKcVstQ9MfQPTk=+mvojcLvw@mail.gmail.com>

There isn't anything slow about the flush -- the flush will complete when your previous writes complete.  If it takes 2.5 ms for your OSDs to ACK a write as safely written to disk, you will only be able to issue ~400 sync writes per second.  
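
To make the arithmetic above concrete, here is a minimal sketch. The function name max_sync_iops is made up for illustration, and it assumes each write must be ACKed before the next write + flush cycle can start:

```python
def max_sync_iops(ack_latency_ms: float) -> float:
    """Upper bound on sync writes per second when every write must be
    ACKed as safely on disk before the next one is issued.

    Hypothetical helper for illustration; not part of librbd or fio.
    """
    if ack_latency_ms <= 0:
        raise ValueError("ack_latency_ms must be positive")
    # One write completes per ACK round trip, so the rate is 1 / latency.
    return 1000.0 / ack_latency_ms


print(max_sync_iops(2.5))  # 400.0
```

Read the other way, a measured ~400 sync IOPS corresponds to roughly 2.5 ms per write + flush cycle.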

The flush issued by your guest OS / QEMU to librbd is designed to ensure that your previous write operations are safely committed to disk.  If flushes were ignored, your data would no longer be crash consistent.  This is nothing unique to RBD -- you would have the same effect with a local disk as well.
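
As a rough local-disk analogue (a sketch only: it uses plain files and fsync(2), not librbd, and write_rate is a hypothetical helper), the following compares buffered writes against writes that are each followed by a flush:

```python
import os
import tempfile
import time


def write_rate(path, count=200, block=4096, flush_each=False):
    """Return writes per second for `count` sequential block writes.

    With flush_each=True, every write is followed by fsync(), which is
    roughly what a guest doing sync writes asks of its storage.
    """
    buf = b"\0" * block
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, buf)
            if flush_each:
                os.fsync(fd)  # block until the data is reported durable
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / max(elapsed, 1e-9)


with tempfile.TemporaryDirectory() as tmpdir:
    probe = os.path.join(tmpdir, "probe.dat")
    buffered = write_rate(probe)
    synced = write_rate(probe, flush_each=True)

print(f"buffered: {buffered:.0f} writes/s, write+fsync: {synced:.0f} writes/s")
```

The numbers depend entirely on the device and filesystem. If the temporary directory lives on tmpfs, fsync is nearly free and the gap will not show, so point it at a real disk to see the effect.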

-- 

Jason Dillaman 

----- Original Message -----
> From: "Huan Zhang" <huan.zhang.jn@gmail.com>
> To: "Jason Dillaman" <dillaman@redhat.com>
> Cc: ceph-devel@vger.kernel.org, haomaiwang@gmail.com
> Sent: Thursday, March 17, 2016 12:58:59 AM
> Subject: Re: rbd_aio_flush cause guestos sync wirte poor iops?
> 
> Hi Jason & Haomai,
>     Thanks for the reply and explanation.
>     Running fio with ioengine=rbd fsync=1 on the physical compute node,
> performance is OK, similar to a normal write (direct=1).
>     ceph --admin-daemon /var/run/ceph/rbd-41837.asok config show |
> grep rbd_cache
>     "rbd_cache": "false"
> 
>     As you mentioned, sync=1 within the guest OS will issue rbd_aio_flush,
> so my questions are:
>     1. Why is rbd_aio_flush so slow even when the rbd cache is off?
>     2. Could we ignore the sync cache command (rbd_aio_flush) issued by the
> guest OS if the rbd cache is off?
> 
> 
> 
> 2016-03-16 21:37 GMT+08:00 Jason Dillaman <dillaman@redhat.com>:
> > As previously mentioned [1], the fio rbd engine ignores the "sync" option.
> > You need to use "fsync=1" to issue a flush after each write to simulate
> > what "sync=1" is doing.  When running fio within a VM against an RBD
> > image, QEMU is not issuing sync writes to RBD -- it's issuing AIO writes
> > and an AIO flush (as instructed by the guest OS).  Looking at the man page
> > for O_SYNC [2], which is what that fio option enables in supported
> > engines, that flag will act "as though each write(2) was followed by a
> > call to fsync(2)".
> >
> > [1]
> > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-February/007780.html
> > [2] http://man7.org/linux/man-pages/man2/open.2.html
> >
> > --
> >
> > Jason Dillaman
> >
> >
> > ----- Original Message -----
> >> From: "Huan Zhang" <huan.zhang.jn@gmail.com>
> >> To: ceph-devel@vger.kernel.org
> >> Sent: Wednesday, March 16, 2016 12:52:33 AM
> >> Subject: rbd_aio_flush cause guestos sync wirte poor iops?
> >>
> >> Hi,
> >>    We tested sync IOPS with fio sync=1 for database workloads in a VM;
> >> the backend is librbd and Ceph (all-SSD setup).
> >>    The result is disappointing: we only get ~400 IOPS for sync randwrite,
> >> from iodepth=1 to iodepth=32.
> >>     But testing on a physical machine with fio ioengine=rbd sync=1, we can
> >> reach ~35K IOPS.
> >> It seems QEMU rbd is the bottleneck.
> >>
> >>     qemu version is 2.1.2 with rbd_aio_flush patched.
> >>     rbd cache is off, qemu cache=none.
> >>
> >>     IMHO, Ceph uses a sync write for every write to disk, so
> >> rbd_aio_flush could ignore the sync cache command if the rbd cache
> >> is off, and we would then get higher IOPS for sync=1 (similar to a
> >> direct=1 write), right?
> >>
> >>    I would very much appreciate your reply!
> 

Thread overview: 8+ messages
2016-03-16  4:52 rbd_aio_flush cause guestos sync wirte poor iops? Huan Zhang
2016-03-16 13:37 ` Jason Dillaman
2016-03-17  4:58   ` Huan Zhang
2016-03-18 12:02     ` Jason Dillaman [this message]
2016-03-21  4:44       ` Huan Zhang
2016-03-21 12:56         ` Jason Dillaman
2016-03-22  6:53           ` Huan Zhang
2016-04-06  4:02             ` Huan Zhang
