* Using locally replicated OSDs to reduce Ceph replication
@ 2013-04-17 22:02 Steve Barber
2013-04-17 22:23 ` Gregory Farnum
0 siblings, 1 reply; 4+ messages in thread
From: Steve Barber @ 2013-04-17 22:02 UTC (permalink / raw)
To: Jeff Mitchell; +Cc: ceph-devel
On Wed, Apr 17, 2013 at 04:49:53PM -0400, Jeff Mitchell wrote in
another thread:
> ... If you
> set up the OSDs such that each OSD is based off of a ZFS mirror, you
> get these benefits locally. For some people, especially when heavy on
> reads (due to the intelligent caching), a solution that knocks the
> remote replication level down by one but uses local mirrors for OSDs
> may provide good functionality and safety compromises.
Funny that you mention this today; that's exactly an idea I was thinking
about pursuing yesterday, so that I don't have to do repl=4 for data
protection both between two sites and within each site (i.e. 2 copies of
data at each site).
If anybody is actively doing/trying this (whether via RAID or ZFS or
whatever, although I'm particularly interested in a ZFS/ZoL solution) I'd
love to see some discussion about it.
In particular, has anyone tried making a big RAID set (of any type) and
carving out space (logical volumes, zvols, etc.) to become virtual OSDs?
Any architectural gotchas with this idea?
I'm trying to set up a cluster spread across two server rooms in separate
buildings that can survive an outage of one building and still have
replicated (safe) data in the event of e.g. a disk failure during the
outage. It seems like some local data protection would be much more
efficient than having Ceph manage the extra replicas - subject to testing
of course!
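
As a rough illustration of the two layouts being compared here, the toy
model below counts physical copies and what survives a building outage
plus one disk failure. It is only a sketch under stated assumptions: each
Ceph replica lives on one device group (a plain disk or a local two-way
mirror), failures are counted in the worst case, and the names `Layout`
and `surviving_disks` are invented for this example and are not Ceph APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical description of a two-site placement (not a Ceph type)."""
    name: str
    replicas_per_site: int   # Ceph replicas placed in each building
    disks_per_replica: int   # 1 = plain disk, 2 = local two-way mirror

    @property
    def ceph_size(self) -> int:
        # Pool replication level ("repl") summed over both sites.
        return 2 * self.replicas_per_site

    @property
    def raw_copies(self) -> int:
        # Physical copies of every object across both sites.
        return self.ceph_size * self.disks_per_replica

    def surviving_disks(self, sites_down: int, disk_failures: int) -> int:
        """Disks still holding an object, assuming every failed disk held it."""
        sites_up = 2 - sites_down
        disks = sites_up * self.replicas_per_site * self.disks_per_replica
        return max(disks - disk_failures, 0)


repl4 = Layout("repl=4, plain disks", replicas_per_site=2, disks_per_replica=1)
repl2_mirror = Layout("repl=2, mirrored OSDs", replicas_per_site=1,
                      disks_per_replica=2)

for layout in (repl4, repl2_mirror):
    left = layout.surviving_disks(sites_down=1, disk_failures=1)
    print(f"{layout.name}: ceph size {layout.ceph_size}, "
          f"{layout.raw_copies} raw copies, {left} disk(s) left after "
          f"one building outage plus one disk failure")
```

Under these assumptions both layouts store the same number of physical
copies and both survive the outage-plus-disk-failure case; the difference
is how many of those copies Ceph itself has to manage.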
As a side note, I do like the thought of ZFS ensuring data integrity, and
in the long run it might allow some of the same Ceph optimizations that
btrfs is used for now (snapshots, compression, etc.). As Jeff mentioned,
ZFS also gives you a lot of performance tuning options. I'm thrilled to
see that it's getting some attention.
Steve
* Re: Using locally replicated OSDs to reduce Ceph replication
2013-04-17 22:02 Using locally replicated OSDs to reduce Ceph replication Steve Barber
@ 2013-04-17 22:23 ` Gregory Farnum
2013-04-17 23:02 ` Steve Barber
0 siblings, 1 reply; 4+ messages in thread
From: Gregory Farnum @ 2013-04-17 22:23 UTC (permalink / raw)
To: Steve Barber; +Cc: Jeff Mitchell, ceph-devel
On Wed, Apr 17, 2013 at 3:02 PM, Steve Barber <steve.barber@nist.gov> wrote:
> On Wed, Apr 17, 2013 at 04:49:53PM -0400, Jeff Mitchell wrote in
> another thread:
>> ... If you
>> set up the OSDs such that each OSD is based off of a ZFS mirror, you
>> get these benefits locally. For some people, especially when heavy on
>> reads (due to the intelligent caching), a solution that knocks the
>> remote replication level down by one but uses local mirrors for OSDs
>> may provide good functionality and safety compromises.
>
> Funny that you mention this today; that's exactly an idea I was thinking
> about pursuing yesterday, so that I don't have to do repl=4 for data
> protection both between two sites and within each site (i.e. 2 copies of
> data at each site).
>
> If anybody is actively doing/trying this (whether via RAID or ZFS or
> whatever, although I'm particularly interested in a ZFS/ZoL solution) I'd
> love to see some discussion about it.
>
> In particular, has anyone tried making a big RAID set (of any type) and
> carving out space (logical volumes, zvols, etc.) to become virtual OSDs?
> Any architectural gotchas with this idea?
I believe there are some people running with this architecture;
there's just less knowledge about how it behaves in the long term. It
should be fine subject to the standard issues with RAID5/6 small
writes, which OSDs do a lot of (and I don't know why you'd bother
using a mirroring RAID instead of Ceph replication!).
I can say that there would be little point to carving up the arrays
into multiple OSDs; other than that, have fun. :)
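
The RAID5/6 small-write issue mentioned above can be put into rough
numbers. The sketch below uses the usual textbook write-penalty factors
(back-end I/Os per small random write) and ignores controller caches and
full-stripe writes; it does not model ZFS RAID-Z. The disk count and
per-disk IOPS are hypothetical values chosen only for illustration.

```python
# Textbook back-end I/Os needed for one small (sub-stripe) random write.
WRITE_PENALTY = {
    "raid1": 2,  # write the block to both halves of the mirror
    "raid5": 4,  # read old data + old parity, write new data + new parity
    "raid6": 6,  # same as RAID5, but with two parity blocks
}


def small_write_iops(level: str, disks: int, iops_per_disk: float) -> float:
    """Estimate small random write IOPS an array can sustain."""
    return disks * iops_per_disk / WRITE_PENALTY[level]


# Hypothetical array: 8 disks at 150 random IOPS each.
for level in ("raid1", "raid5", "raid6"):
    estimate = small_write_iops(level, disks=8, iops_per_disk=150)
    print(f"{level}: about {estimate:.0f} small-write IOPS")
```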
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
* Re: Using locally replicated OSDs to reduce Ceph replication
2013-04-17 22:23 ` Gregory Farnum
@ 2013-04-17 23:02 ` Steve Barber
2013-04-17 23:09 ` Gregory Farnum
0 siblings, 1 reply; 4+ messages in thread
From: Steve Barber @ 2013-04-17 23:02 UTC (permalink / raw)
To: Gregory Farnum; +Cc: Jeff Mitchell, ceph-devel
On Wed, Apr 17, 2013 at 06:23:43PM -0400, Gregory Farnum wrote:
> On Wed, Apr 17, 2013 at 3:02 PM, I wrote:
> > In particular, has anyone tried making a big RAID set (of any type) and
> > carving out space (logical volumes, zvols, etc.) to become virtual OSDs?
> > Any architectural gotchas with this idea?
>
> I believe there are some people running with this architecture;
> there's just less knowledge about how it behaves in the long term. It
> should be fine subject to the standard issues with RAID5/6 small
> writes, which OSDs do a lot of (and I don't know why you'd bother
> using a mirroring RAID instead of Ceph replication!).
> I can say that there would be little point to carving up the arrays
> into multiple OSDs; other than that, have fun. :)
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
So you think it'd be worth trying just running a few really large OSDs in
a configuration like that? I wasn't sure that would scale as well, but I'm
still pretty new to Ceph.
About mirroring/RAID vs. Ceph replication, I was under the impression that
there would be a lot of extra network traffic generated by writes with so
many replicas which might not be optimal. True enough about RAID 5/6 small
writes.
Just gotta try it and see I guess.
Thanks for the feedback!
Steve
* Re: Using locally replicated OSDs to reduce Ceph replication
2013-04-17 23:02 ` Steve Barber
@ 2013-04-17 23:09 ` Gregory Farnum
0 siblings, 0 replies; 4+ messages in thread
From: Gregory Farnum @ 2013-04-17 23:09 UTC (permalink / raw)
To: Steve Barber; +Cc: Jeff Mitchell, ceph-devel
On Wed, Apr 17, 2013 at 4:02 PM, Steve Barber <steve.barber@nist.gov> wrote:
> On Wed, Apr 17, 2013 at 06:23:43PM -0400, Gregory Farnum wrote:
>> On Wed, Apr 17, 2013 at 3:02 PM, I wrote:
>> > In particular, has anyone tried making a big RAID set (of any type) and
>> > carving out space (logical volumes, zvols, etc.) to become virtual OSDs?
>> > Any architectural gotchas with this idea?
>>
>> I believe there are some people running with this architecture;
>> there's just less knowledge about how it behaves in the long term. It
>> should be fine subject to the standard issues with RAID5/6 small
>> writes, which OSDs do a lot of (and I don't know why you'd bother
>> using a mirroring RAID instead of Ceph replication!).
>> I can say that there would be little point to carving up the arrays
>> into multiple OSDs; other than that, have fun. :)
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> So you think it'd be worth trying just running a few really large OSDs in
> a configuration like that? I wasn't sure that would scale as well, but I'm
> still pretty new to Ceph.
The scaling ought to be fine, though you might need to go through more
config tuning to scale it up for that level of "disk" underneath.
The bigger concern is that if you lose an OSD, you lose a huge chunk of
data at once; but if the OSDs are all on the same RAID array, you can't
really lose them incrementally anyway.
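
To get a feel for the "huge chunk of data at once" concern, here is a
back-of-the-envelope estimate of how much data has to be re-replicated
when one OSD is lost. This is a sketch, not a model of Ceph's actual
recovery behavior: the capacities, fill ratio, and the flat aggregate
recovery rate are all hypothetical, and `recovery_hours` is a made-up
helper name.

```python
def recovery_hours(osd_capacity_tb: float, fill_ratio: float,
                   recovery_mb_per_s: float) -> float:
    """Hours to re-replicate a lost OSD at a flat aggregate rate.

    Uses decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes).
    """
    data_bytes = osd_capacity_tb * 1e12 * fill_ratio
    seconds = data_bytes / (recovery_mb_per_s * 1e6)
    return seconds / 3600


# Hypothetical comparison: a single-disk OSD vs. one OSD on a large array,
# both 70% full, with an assumed 200 MB/s of total recovery throughput.
for label, capacity_tb in (("4 TB single-disk OSD", 4),
                           ("48 TB RAID-backed OSD", 48)):
    hours = recovery_hours(capacity_tb, fill_ratio=0.7,
                           recovery_mb_per_s=200)
    print(f"{label}: roughly {hours:.1f} hours to recover")
```

The estimate scales linearly with OSD size, which is the point: a failure
of one very large OSD puts proportionally more data at reduced redundancy
until recovery finishes.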
> About mirroring/RAID vs. Ceph replication, I was under the impression that
> there would be a lot of extra network traffic generated by writes with so
> many replicas which might not be optimal. True enough about RAID 5/6 small
> writes.
Well, yeah, by sticking RAID underneath you get more reliability
without having to traverse a network. But you've still got a ceiling
in terms of how many physical nodes are storing the data, etc.
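
The network side of this trade-off can be sketched by counting payload
transfers per client write. The sketch assumes primary-copy replication
(the client sends the write to the primary OSD, which forwards it to each
other replica), replicas split evenly across two sites, and a client in
the same site as the primary. `write_transfers` is a hypothetical helper,
not a Ceph function.

```python
def write_transfers(ceph_size: int, replicas_per_site: int) -> tuple:
    """Return (total, cross_site) payload transfers for one client write."""
    replica_transfers = ceph_size - 1        # primary -> each other replica
    same_site_peers = replicas_per_site - 1  # replicas next to the primary
    cross_site = replica_transfers - same_site_peers
    total = 1 + replica_transfers            # plus client -> primary
    return total, cross_site


for label, size, per_site in (("repl=4, plain disks", 4, 2),
                              ("repl=2, mirrored OSDs", 2, 1)):
    total, cross = write_transfers(size, per_site)
    print(f"{label}: {total} network transfers per write, "
          f"{cross} of them between buildings")
```

The mirrored layout moves fewer copies over the network, but each object
then lives on only two hosts, which is the ceiling on physical nodes
mentioned above.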
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com