From: Wido den Hollander <wido@42on.com>
To: Xing Lin <xinglin@cs.utah.edu>
Cc: Gregory Farnum <greg@inktank.com>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: When ceph synchronizes journal to disk?
Date: Tue, 05 Mar 2013 14:54:42 +0100
Message-ID: <5135F922.9020206@42on.com>
In-Reply-To: <51357589.6080202@cs.utah.edu>
On 03/05/2013 05:33 AM, Xing Lin wrote:
> Hi Gregory,
>
> Thanks for your reply.
>
> On 03/04/2013 09:55 AM, Gregory Farnum wrote:
>> The "journal [min|max] sync interval" values specify how frequently
>> the OSD's "FileStore" sends a sync to the disk. However, data is still
>> written into the normal filesystem as it comes in, and the normal
>> filesystem continues to schedule normal dirty data writeouts. This is
>> good — it means that when we do send a sync down you don't need to
>> wait for all (30 seconds * 100MB/s) 3GB or whatever of data to go to
>> disk before it's completed.
>
> I do not think I understand this well. When the writeahead journal mode
> is in use, would you please explain what happens to a single 4M write
> request? I assume that an entry in the journal will be created for this
> write request and after this entry is flushed to the journal disk, Ceph
> returns successful. There should be no IO to the osd's disk. All IOs are
> supposed to go to the journal disk. At a later time, Ceph will start to
> apply these changes to the normal filesystem by reading from the first
> entry at which its previous synchronization stops. Finally, it will read
> this entry and apply this write change to the normal file system. Could
> you please point out where is wrong in my understanding? Thanks,
>
All the data goes to the disk in write-back mode, so it isn't safe
until the flush is called. That's why it goes into the journal first: so
that it is consistent at all times.
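
To make the ordering concrete, here is a toy model of that write path. It is
only a sketch of the idea described in this thread, not Ceph's actual
FileStore code: the class name, the three lists and the method names are all
made up for illustration.

```python
class ToyFileStore:
    """Hypothetical model: journal first, write-back to the filesystem, periodic sync."""

    def __init__(self):
        self.journal = []     # entries already durable on the journal device
        self.page_cache = []  # written to the filesystem, not yet flushed (unsafe)
        self.on_disk = []     # flushed to the data disk (safe)

    def write(self, data):
        # 1. The entry goes into the journal first, so it can be replayed
        #    if the machine dies before the data disk is flushed.
        self.journal.append(data)
        # 2. The same data is handed to the filesystem in write-back mode;
        #    the kernel writes it out in the background whenever it likes.
        self.page_cache.append(data)
        return "ack"

    def sync(self):
        # The periodic flush: everything written so far becomes safe on the
        # data disk, so the journal entries covering it can be dropped.
        self.on_disk.extend(self.page_cache)
        self.page_cache.clear()
        self.journal.clear()


store = ToyFileStore()
store.write("4M object write")
# Before sync: the data is safe only because it is in the journal.
assert store.journal == ["4M object write"] and store.on_disk == []
store.sync()
# After sync: the data disk holds the data and the journal entry is gone.
assert store.on_disk == ["4M object write"] and store.journal == []
```

The point of the model is that the data disk sees the write right away; the
sync only marks the moment at which it is guaranteed to be there.
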
If you buffered everything in the journal and flushed it all at once, you
would overload the disk for that time.
Let's say you have 300MB in the journal after 10 seconds and you want to
flush that at once. That would mean that specific disk is unable to do
any operations other than writing at 60MB/sec for 5 seconds.
It's better to always write in write-back mode to the disk and flush at
a certain point.
In the meantime the scheduler can do its job to balance between the
reads and the writes.
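
The arithmetic can be sketched in a few lines. The function names are made
up, and the 25MB/sec background write-back rate in the second case is purely
an assumed number for illustration; only the 300MB, 10 seconds and 60MB/sec
figures come from the example above.

```python
def burst_flush_seconds(backlog_mb, disk_mb_per_s):
    """Seconds the disk does nothing but write if the whole backlog is flushed at once."""
    if disk_mb_per_s <= 0:
        raise ValueError("disk throughput must be positive")
    return backlog_mb / disk_mb_per_s


def backlog_left_at_sync_mb(written_mb, interval_s, writeback_mb_per_s):
    """Dirty data still unwritten at sync time when the filesystem has been
    writing back in the background for the whole interval."""
    return max(0.0, written_mb - interval_s * writeback_mb_per_s)


# Everything buffered in the journal, then flushed in one go:
all_at_once = burst_flush_seconds(300, 60)        # 5.0 seconds of pure writing

# Write-back all along (assumed 25MB/sec in the background), sync at the end:
left = backlog_left_at_sync_mb(300, 10, 25)       # 50.0 MB still dirty
with_writeback = burst_flush_seconds(left, 60)    # under 1 second

assert all_at_once == 5.0
assert left == 50.0
assert with_writeback < 1.0
```

So the sync itself only has to push out whatever the background write-back
has not got to yet, and reads can be scheduled in between the whole time.
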
Wido
>>> >I am running 0.48.2. The related configuration is as follows.
>> If you're starting up a new cluster I recommend upgrading to the
>> bobtail series (.56.3) instead of using Argonaut — it's got a number
>> of enhancements you'll appreciate!
>
> Yeah, I would like to use bobtail series. However, I started to make
> small changes with Argonaut (0.48) and had ported my changes once to
> 0.48.2 when it was released. I think I am good to continue with it for
> the moment. I may consider to port my changes to bobtail series at a
> later time. Thanks,
>
> Xing
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Wido den Hollander
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on