* interesting use case for multiple devices and delayed raid?
@ 2009-04-01  9:17 Brian J. Murrell
To: linux-btrfs

I have a use case that I wonder if anyone might find interesting,
involving multiple device support and delayed raid.

Let's say I have a system with two disks of equal size (to make it easy)
which has sporadic, heavy write requirements.  At some points in time
there will be multiple files being appended to simultaneously, and at
other times there will be no activity at all.

The write activity is time sensitive, however, so the filesystem must be
able to provide guaranteed (only in a loose sense -- not looking for
real QoS reservation semantics) bandwidth at times.  Let's say slightly
(but within the realm of reality) less than the bandwidth of the two
disks combined.

I also want both the metadata and file data mirrored between the two
disks so that I can afford to lose one of the disks and not lose (most
of) my data.  It is not a strict requirement that all data be
immediately mirrored, however.

So it seems that, given these requirements, a filesystem should be able
to keep the disks mirrored in a "loose timeframe" so as to provide
redundancy (for all but the currently writing data) but also be able to
provide the full bandwidth of the two disks, yes?

Is this sort of idea on the btrfs roadmap at all?

b.
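[Editor's note: the "stripe now, mirror later" bookkeeping Brian describes can be modelled in a few lines. This is a hypothetical Python sketch -- the class name and the simple pending-copy queue are my own assumptions, not btrfs code:]

```python
from collections import deque

class DelayedMirror:
    """Toy model of delayed raid: each write lands on one disk
    immediately (striping for bandwidth) and is copied to the other
    disk later, during idle time (lazy mirroring)."""

    def __init__(self):
        self.disks = [{}, {}]   # block -> data, one dict per disk
        self.pending = deque()  # (disk, block) pairs not yet mirrored
        self.next_disk = 0

    def write(self, block, data):
        # During a burst, alternate writes across disks for ~2x bandwidth.
        d = self.next_disk
        self.next_disk ^= 1
        self.disks[d][block] = data
        self.pending.append((d, block))

    def mirror_idle(self):
        # Background task run when the filesystem is idle:
        # replay each unmirrored block onto the other disk.
        while self.pending:
            d, block = self.pending.popleft()
            self.disks[1 - d][block] = self.disks[d][block]

    def fully_mirrored(self):
        return not self.pending and self.disks[0] == self.disks[1]

fs = DelayedMirror()
for i in range(4):
    fs.write(i, "data%d" % i)   # burst: writes split across both disks
assert not fs.fully_mirrored()  # mirror is lagging, as allowed
fs.mirror_idle()                # valley: catch the mirror up
assert fs.fully_mirrored()
```

A real implementation would of course track extents and handle crash recovery of the pending queue; the point of the sketch is only that the redundancy window is bounded by how long the queue stays non-empty.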
* Re: interesting use case for multiple devices and delayed raid?
@ 2009-04-01 10:13 Dmitri Nikulin
To: linux-btrfs

On Wed, Apr 1, 2009 at 8:17 PM, Brian J. Murrell <brian@interlinx.bc.ca> wrote:
> I have a use case that I wonder if anyone might find interesting
> involving multiple device support and delayed raid.
>
> Let's say I have a system with two disks of equal size (to make it easy)
> which has sporadic, heavy, write requirements.  At some points in time
> there will be multiple files being appended to simultaneously and at
> other times, there will be no activity at all.
>
> The write activity is time sensitive, however, so the filesystem must be
> able to provide guaranteed (only in a loose sense -- not looking for
> real QoS reservation semantics) bandwidths at times.  Let's say slightly
> (but within the realm of reality) less than the bandwidth of the two
> disks combined.

I assume you mean read bandwidth, since write bandwidth cannot be
increased by mirroring, only striping. If you intend to stripe first,
then mirror later as time permits, this is the kind of sophistication
you will need to write in the program code itself.

A filesystem is a handy abstraction, but you are by no means limited
to using it. If you have very special needs, you can get pretty far by
writing your own meta-filesystem to add semantics you don't have in
your kernel filesystem of choice. That's what every single database
application does. You can get even further by writing a complete
user-space filesystem as part of your program, or a shared daemon, and
the performance isn't really that bad.
> I also want both the metadata and file data mirrored between the two
> disks so that I can afford to lose one of the disks and not lose (most
> of) my data.  It is not a strict requirement that all data be
> immediately mirrored however.

This is handled by DragonFly BSD's HAMMER filesystem. A master gets
written to, and asynchronously updates a slave, even over a network.
It is transactionally consistent and virtually impossible to corrupt
as long as the disk media is stable. However as far as I know it won't
spread reads, so you'll still get the performance of one disk.

A more complete solution, that requires no software changes, would be
to have 3 or 4 disks. A stripe for really fast reads and writes, and
another disk (or another stripe) to act as a slave to the data being
written to the primary stripe. This seems to do what you want, at a
small price premium.

-- 
Dmitri Nikulin

Centre for Synchrotron Science
Monash University
Victoria 3800, Australia
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: interesting use case for multiple devices and delayed raid?
@ 2009-04-01 21:04 Brian J. Murrell
To: linux-btrfs

On Wed, 2009-04-01 at 21:13 +1100, Dmitri Nikulin wrote:
>
> I assume you mean read bandwidth, since write bandwidth cannot be
> increased by mirroring, only striping.

No, I mean write bandwidth.  You can get increased write bandwidth if
you (initially) write to only one side of the mirror -- effectively,
RAID 0 striping.  You would update the other half of the mirror
"lazily" (iow, "delayed") when the filesystem has idle bandwidth.  One
of the stipulations was that the use pattern is peaks and valleys, not
sustained usage.

Yes, you would lose the data that was written to a failed mirror
before the filesystem got a chance to do the lazy mirror updating
later on.  That was a stipulation in my original requirements too.

> If you intend to stripe first, then mirror later as time permits,

Yeah, that's one way to describe it.

> this is the kind of sophistication you will need to write in the
> program code itself.

Why?  A filesystem that already does its own mirroring and striping
(as I understand btrfs does) should be able to handle this itself.
Much better in the filesystem than for each application to have to
handle it itself.

> A filesystem is a handy abstraction, but you are by no means limited
> to using it. If you have very special needs, you can get pretty far by
> writing your own meta-filesystem to add semantics you don't have in
> your kernel filesystem of choice.

Of course.  But I am floating this idea as a feature of btrfs given
that it already has many of the components needed.

> This is handled by DragonFly BSD's HAMMER filesystem.
> A master gets written to, and asynchronously updates a slave, even
> over a network. It is transactionally consistent and virtually
> impossible to corrupt as long as the disk media is stable. However as
> far as I know it won't spread reads, so you'll still get the
> performance of one disk.

More importantly, it won't spread writes.

> A more complete solution, that requires no software changes, would be
> to have 3 or 4 disks. A stripe for really fast reads and writes, and
> another disk (or another stripe) to act as a slave to the data being
> written to the primary stripe. This seems to do what you want, at a
> small price premium.

No.  That's not really what I am describing at all.

I apologize if my original description was unclear.  Hopefully it is
clearer now.

b.
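[Editor's note: Brian's failure stipulation -- that losing a disk before the lazy mirror catches up loses only the not-yet-mirrored writes -- can be made concrete with a toy model. Hypothetical Python; the block placements and sets are invented example data:]

```python
# All blocks that have been written, where each initially landed,
# and which ones the background task has already mirrored.
written = {0: "a", 1: "b", 2: "c", 3: "d"}  # block -> data
placement = {0: 0, 1: 1, 2: 0, 3: 1}        # block -> disk it landed on
mirrored = {0, 1}                            # blocks already copied over

def surviving_blocks(failed_disk):
    # A block survives a disk failure if it was already mirrored,
    # or if it originally landed on the disk that did NOT fail.
    return {b for b in written
            if b in mirrored or placement[b] != failed_disk}

# Only block 2 (unmirrored, landed on disk 0) is exposed to a
# disk 0 failure; only block 3 is exposed to a disk 1 failure.
assert surviving_blocks(0) == {0, 1, 3}
assert surviving_blocks(1) == {0, 1, 2}
```

The exposure window is exactly `written - mirrored`, split across the two disks, which is what makes the scheme acceptable under Brian's "most of my data" requirement.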
* Re: interesting use case for multiple devices and delayed raid?
@ 2009-04-02  5:41 Dmitri Nikulin
To: linux-btrfs

On Thu, Apr 2, 2009 at 8:04 AM, Brian J. Murrell <brian@interlinx.bc.ca> wrote:
>> A more complete solution, that requires no software changes, would be
>> to have 3 or 4 disks. A stripe for really fast reads and writes, and
>> another disk (or another stripe) to act as a slave to the data being
>> written to the primary stripe. This seems to do what you want, at a
>> small price premium.
>
> No.  That's not really what I am describing at all.

Well, you get the bandwidth of two disks when reading and writing, and
the data is still mirrored to a second stripe as time permits. Kind of
like delayed RAID10.

> I apologize if my original description was unclear.  Hopefully it is
> clearer now.

Yes. It'll be up to the actual filesystem devs to weigh in on whether
it's worth implementing.

-- 
Dmitri Nikulin

Centre for Synchrotron Science
Monash University
Victoria 3800, Australia
* Re: interesting use case for multiple devices and delayed raid?
@ 2009-04-02 11:27 Chris Mason
To: Dmitri Nikulin; +Cc: linux-btrfs

On Thu, 2009-04-02 at 16:41 +1100, Dmitri Nikulin wrote:
> On Thu, Apr 2, 2009 at 8:04 AM, Brian J. Murrell <brian@interlinx.bc.ca> wrote:
> >> A more complete solution, that requires no software changes, would be
> >> to have 3 or 4 disks. A stripe for really fast reads and writes, and
> >> another disk (or another stripe) to act as a slave to the data being
> >> written to the primary stripe. This seems to do what you want, at a
> >> small price premium.
> >
> > No.  That's not really what I am describing at all.
>
> Well you get the bandwidth of 2 disks when reading and writing, and
> still mirrored to a second stripe as time permits. Kind of like
> delayed RAID10.
>
> > I apologize if my original description was unclear.  Hopefully it is
> > more so now.
>
> Yes. It'll be up to the actual filesystem devs to weigh in on whether
> it's worth implementing.

It's an interesting idea, but I think we've got fast front end devices
higher up on the todo list.  That will still support the destaging to
slower disks idea, but will be more flexible overall.

-chris
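[Editor's note: the "fast front end device with destaging to slower disks" Chris mentions is essentially a write-back cache tier. A minimal sketch of that idea, assuming a simple oldest-first dirty queue -- hypothetical Python, not the planned btrfs design:]

```python
from collections import OrderedDict

class Destager:
    """Toy write-back tier: writes are acknowledged once they hit the
    fast front-end device, then destaged to slow backing storage in
    the background, oldest dirty block first."""

    def __init__(self):
        self.fast = OrderedDict()  # dirty blocks on the fast device
        self.slow = {}             # slow backing store

    def write(self, block, data):
        self.fast[block] = data    # acknowledged at fast-device speed

    def destage(self, n=1):
        # Background task: move up to n oldest dirty blocks downward.
        for _ in range(min(n, len(self.fast))):
            block, data = self.fast.popitem(last=False)
            self.slow[block] = data

    def read(self, block):
        # Prefer the fast copy if the block is still dirty.
        return self.fast.get(block, self.slow.get(block))

d = Destager()
d.write("x", 1)
d.write("y", 2)
d.destage()                 # "x" migrates to slow storage
assert d.read("x") == 1     # still readable after destaging
assert "y" in d.fast        # "y" is still dirty
```

This is more flexible than delayed mirroring because the front end can be any fast device (e.g. an SSD) rather than half of a mirror pair, which matches Chris's point about supporting the destaging idea more generally.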