From: Luis Pabon <lpabon@redhat.com>
To: Sage Weil <sweil@redhat.com>
Cc: Mark Nelson <mnelson@redhat.com>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: Cache tier READ_FORWARD transition
Date: Wed, 09 Jul 2014 13:46:31 -0400
Message-ID: <53BD7FF7.1090801@redhat.com>
In-Reply-To: <alpine.DEB.2.00.1407080856060.9196@cobra.newdream.net>
This is great information.
Thanks, Sage.
- Luis
On 07/08/2014 12:01 PM, Sage Weil wrote:
> On Mon, 7 Jul 2014, Luis Pabón wrote:
>> What about the following use case (please forgive some of my Ceph architecture
>> ignorance):
>>
>> If it were possible to set up an OSD caching tier at the host (if the host had a
>> dedicated SSD for accelerating I/O), then caching pools could be created to
>> cache VM rbds, since they are inherently exclusive to a single host. Using a
>> write-through (or read-only, depending on the workload) policy would give a
>> major increase in VM IOPS. A write-through or read-only policy would also
>> ensure any writes are first written to the back-end storage tier. Enabling
>> hosts to service most of their VM I/O reads would also increase the overall
>> IOPS of the back-end storage tier.
> This could be accomplished by doing a rados pool per client host. The
> rados caching only works as a writeback cache, though, not
> write-through, so you really need to replicate it for it to be usable in
> practice. So although it's possible, this isn't a particularly attractive
> approach.
>
> What you're describing is really a client-side write-through cache, either
> for librbd or librados. We've discussed this in the past (mostly in the
> context of shared host-wide read-only data, not as write-through), but
> in both cases the caching would plug into the client libraries. There are
> some CDS notes from emperor:
>
> http://wiki.ceph.com/Planning/Sideboard/rbd%3A_shared_read_cache
> http://pad.ceph.com/p/rbd-shared-read-cache
> http://www.youtube.com/watch?v=SVgBdUv_Lv4&t=70m11s
>
> Note that you can also accomplish this with the kernel rbd driver by
> layering dm-cache or bcache or something similar on top and running it in
> write-through mode. Most clients are (KVM+)librbd, though, so eventually
> a userspace implementation for librbd (or maybe librados) makes sense.
>
> sage
>
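To make the client-side write-through idea above concrete, here is a minimal Python sketch. It is not librbd or librados code: the class name WriteThroughCache and the dict-like backend are hypothetical stand-ins for a host-local SSD cache in front of the back-end storage tier.

```python
class WriteThroughCache:
    """Toy write-through cache (hypothetical; not a Ceph API)."""

    def __init__(self, backend):
        self.backend = backend  # stands in for the back-end storage tier
        self.cache = {}         # stands in for the host-local SSD

    def write(self, key, data):
        # Write-through: persist to the backend first, then refresh
        # the local copy, so the cache never holds dirty data.
        self.backend[key] = data
        self.cache[key] = data

    def read(self, key):
        # Serve from the local cache when possible; otherwise fetch
        # from the backend and keep a copy for later reads.
        if key in self.cache:
            return self.cache[key]
        data = self.backend[key]
        self.cache[key] = data
        return data
```

Because every write reaches the backend before it is cached, losing the local cache loses no data. The sketch is only coherent with a single writer per image, which matches the premise above that a VM's rbd is exclusive to one host.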
>
>> Does this make sense?
>>
>> - Luis
>>
>> On 07/07/2014 03:29 PM, Sage Weil wrote:
>>> On Mon, 7 Jul 2014, Luis Pabon wrote:
>>>> Hi all,
>>>> I am working on OSDMonitor.cc:5325 and wanted to confirm the following
>>>> read_forward cache tier transition:
>>>>
>>>> readforward -> forward || writeback || (any && num_objects_dirty == 0)
>>>> forward -> writeback || readforward || (any && num_objects_dirty == 0)
>>>> writeback -> readforward || forward
>>>>
>>>> Is this the correct cache tier state transition?
>>> That looks right to me.
>>>
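The transition rule quoted above can be written as a small check. This is a Python transcription of the rule as stated in the message, not the code in OSDMonitor.cc. The function name is hypothetical, and the "none" target used in the usage note is an assumed example of what "any" might include.

```python
# Transitions that the quoted rule allows unconditionally.
ALWAYS_ALLOWED = {
    "readforward": {"forward", "writeback"},
    "forward": {"writeback", "readforward"},
    "writeback": {"readforward", "forward"},
}

# Source modes that may move to any mode once no dirty objects remain.
ANY_WHEN_CLEAN = {"readforward", "forward"}


def transition_allowed(current, target, num_objects_dirty):
    """Return True if the quoted rule permits current -> target."""
    if current not in ALWAYS_ALLOWED:
        raise ValueError(f"mode not covered by the quoted rule: {current!r}")
    if target in ALWAYS_ALLOWED[current]:
        return True
    return current in ANY_WHEN_CLEAN and num_objects_dirty == 0
```

Under this reading, transition_allowed("forward", "none", 0) is permitted because the cache is clean, while transition_allowed("writeback", "none", 0) is not, since writeback may only move to readforward or forward.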
>>> By the way, I had a thought after we spoke that we probably want something
>>> that is somewhere in between the current writeback behavior (promote on
>>> first read) and the read_forward behavior (never promote on read). I
>>> suspect a good all-around policy is something like promote on second read?
>>> This should probably be rolled into the writeback mode as a tunable...
>>>
>>> sage
>>>
>>>
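One way to picture the "promote on second read" tunable suggested above is a per-object read counter. The Python sketch below is purely illustrative: the class ReadPromotionPolicy and the parameter reads_before_promote are hypothetical names, and nothing here reflects how Ceph tracks object reads.

```python
class ReadPromotionPolicy:
    """Promote an object after a configurable number of reads (hypothetical)."""

    def __init__(self, reads_before_promote=2):
        if reads_before_promote < 1:
            raise ValueError("reads_before_promote must be at least 1")
        self.reads_before_promote = reads_before_promote
        self._read_counts = {}  # reads seen so far for unpromoted objects
        self.promoted = set()   # objects currently held in the cache tier

    def on_read(self, object_id):
        """Record a read; return True if the object is (now) promoted."""
        if object_id in self.promoted:
            return True
        count = self._read_counts.get(object_id, 0) + 1
        if count >= self.reads_before_promote:
            # Threshold reached: promote and stop counting this object.
            self._read_counts.pop(object_id, None)
            self.promoted.add(object_id)
            return True
        self._read_counts[object_id] = count
        return False
```

With reads_before_promote=1 this behaves like the promote-on-first-read case described above; with 2, the first read is forwarded and the second triggers promotion. A real implementation would need to bound the counter table, which this sketch does not attempt.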
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
Thread overview: 11+ messages
2014-07-07 16:29 Cache tier READ_FORWARD transition Luis Pabon
2014-07-07 19:29 ` Sage Weil
2014-07-07 19:38 ` Mark Nelson
2014-07-07 19:43 ` Sage Weil
2014-07-07 21:02 ` Mark Nelson
2014-07-07 19:45 ` Sage Weil
2014-07-07 21:03 ` Luis Pabón
2014-07-07 21:31 ` Luis Pabón
2014-07-08 16:01 ` Sage Weil
2014-07-09 17:46 ` Luis Pabon [this message]
2014-07-10 4:34 ` Alexandre DERUMIER