* 3.10LTS ok for production?
@ 2013-11-09 3:01 Paul B. Henson
[not found] ` <20131109030128.GJ5474-eJ6RpuielZ6oHZ9hTG1MgCsmlnnoMqry@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Paul B. Henson @ 2013-11-09 3:01 UTC (permalink / raw)
To: linux-bcache-u79uwXL29TY76Z2rM5mHXA
I'd kinda like to use the 3.10 LTS kernel for a virtualization server
I'm building, but it seems like every time somebody reports a problem
the recommendation is to make sure you're using the latest bleeding edge
kernel. Is it intended for bcache to be considered production ready in
the 3.10 LTS branch, or do you pretty much have to run the latest stable
of the week for now if you want to be sure to get all the bcache bugfixes
necessary for a stable system? Specifically, I'd like to use a raid1 of 2
256G SSDs to be a write-back cache for a raid10 of 4 2TB HDs. Occasional
reboots aren't an issue for kernel updates, but I'd prefer to avoid the
potential instability and config churn of tracking the mainline kernel.
Thanks...
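For concreteness, the layout I have in mind would look roughly like this. A sketch only: device names are placeholders, and the make-bcache invocations assume the bcache-tools userspace utilities.

```shell
# Placeholder device names; substitute the real disks.
# Cache: raid1 of the two 256G SSDs
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdX /dev/sdY
# Backing store: raid10 of the four 2TB HDs
mdadm --create /dev/md1 --level=10 --raid-devices=4 /dev/sd[a-d]

# Format for bcache and bind the cache to the backing device
make-bcache -C /dev/md0        # cache device
make-bcache -B /dev/md1        # backing device
echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach
echo writeback > /sys/block/bcache0/bcache/cache_mode
```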
^ permalink raw reply [flat|nested] 7+ messages in thread

[parent not found: <20131109030128.GJ5474-eJ6RpuielZ6oHZ9hTG1MgCsmlnnoMqry@public.gmane.org>]
* Re: 3.10LTS ok for production?
[not found] ` <20131109030128.GJ5474-eJ6RpuielZ6oHZ9hTG1MgCsmlnnoMqry@public.gmane.org>
@ 2013-11-09 5:29 ` Matthew Patton
[not found] ` <op.w59n7e06f3gqgg-r49W/1Cwd2cba4AQcYcrVKxOck334EZe@public.gmane.org>
2013-11-09 6:47 ` Kent Overstreet
1 sibling, 1 reply; 7+ messages in thread
From: Matthew Patton @ 2013-11-09 5:29 UTC (permalink / raw)
To: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Paul B. Henson

The following is opinion, MY opinion.

On Fri, 08 Nov 2013 22:01:28 -0500, Paul B. Henson <henson-HInyCGIudOg@public.gmane.org> wrote:

> kernel. Is it intended for bcache to be considered production ready in
> the 3.10 LTS branch, or do you pretty much have to run the latest stable
> of the week for now if you want to be sure to get all the bcache bugfixes
> necessary for a stable system?

I think that's hard to say. The .10 code wasn't re-worked like the .11 branch and it may well have fewer issues than the .11 series. It's also not clear that EVERY bug uncovered in the .11 branch (that wasn't narrowly specific to .11) has been properly back-ported.

> Specifically, I'd like to use a raid1 of 2
> 256G SSDs to be a write-back cache for a raid10 of 4 2TB HDs. Occasional
> reboots aren't an issue for kernel updates, but I'd prefer to avoid the
> potential instability and config churn of tracking the mainline kernel.

Storage is the LAST place to cut corners. Unless of course your data isn't important, can be thrown away, or recreated without a lot of time and sweat.

Don't get me wrong, I like what bcache is trying to do and I sent Kent $100 of my own money to support his efforts back when continued development seemed to be in jeopardy. Personally I think it needs another 3 months to bake, even in the 3.11.6 guise.

As to your specific example, are WRITE IOPs of critical importance? If not, just use WRITE-THRU and have the SSDs be a READ cache for hot data.

There is no or almost zero risk to your data in that configuration.
Despite all the hand-waving by sysadmins, READ cache is far more useful as a practical matter than WRITE. If you have a heavy WRITE load, then there is no good solution that doesn't cost money. If your 4 disks can't support the desired IOPs, then bite the bullet and get faster disks, more disks, or more cache on the RAID controller, or try the alternative software solutions, both of which are free: EnhanceIO from STEC or the in-kernel MD-hotspot. I have no useful degree of experience with either, however.

Failing that, shell out the money for a ZFS-friendly setup and abstract the storage away from your virtual machines. Indeed that's a much better design anyway.

I personally run LSI controllers with CacheCade (sadly limited to 500GB of SSD cache), or you can spring for an equivalent feature set from the Adaptec 7-series (unlimited SSD cache) for under $800. My other fancy controller is an Areca with 4GB of battery-backed RAM. My storage nodes also have battery-backed 512MB NVRAM boards (dirt cheap on eBay) and I use those as targets for filesystem journals or MD raid1 intent logs.

Lastly maybe forget KVM/Xen and get VMware ESXi as your hypervisor. It supports SSDs as block cache too, but I'm not sure which level of product is needed to activate it. It can be as cheap as $500 for 3 two-socket physical hosts, up to $1500+/socket.

In conclusion, if staying with bcache, use it in write-thru mode.

^ permalink raw reply [flat|nested] 7+ messages in thread
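If you do go that route, the cache mode is switchable at run time through sysfs. A minimal sketch, assuming the backing device has registered as bcache0:

```shell
# Switch an existing bcache device to write-through; reads are still
# cached, but writes go straight to the backing device.
echo writethrough > /sys/block/bcache0/bcache/cache_mode
# The active mode is shown in brackets on readback.
cat /sys/block/bcache0/bcache/cache_mode
```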
[parent not found: <op.w59n7e06f3gqgg-r49W/1Cwd2cba4AQcYcrVKxOck334EZe@public.gmane.org>]
* RE: 3.10LTS ok for production?
[not found] ` <op.w59n7e06f3gqgg-r49W/1Cwd2cba4AQcYcrVKxOck334EZe@public.gmane.org>
@ 2013-11-13 0:17 ` Paul B. Henson
0 siblings, 0 replies; 7+ messages in thread
From: Paul B. Henson @ 2013-11-13 0:17 UTC (permalink / raw)
To: 'Matthew Patton', linux-bcache-u79uwXL29TY76Z2rM5mHXA

> From: Matthew Patton [mailto:pattonme-/E1597aS9LQAvxtiuMwx3w@public.gmane.org]
> Sent: Friday, November 08, 2013 9:29 PM
>
> The following is opinion, MY opinion.

Noted; thanks for taking the time to share it :).

> I think that's hard to say. The .10 code wasn't re-worked like the .11
> branch and it may well have fewer issues than the .11 series.

There was a re-factoring between .10 and .11? I hadn't noticed that.

> Storage is the LAST place to cut corners. Unless of course your data isn't
> important, can be thrown away, or recreated without a lot of time and
> sweat.

Well, technically, this particular deployment is for my house ;), and while I wouldn't really agree with any of those statements for my data, this hobby box has already become ridiculously expensive, and I'd like to make the best of the pieces I already have.

> Personally I think it needs another 3 months to bake, even in the 3.11.6
> guise.

Hmm, won't 3.11 be EOL before then? So presumably the result of that bake time would be in 3.12.

> As to your specific example, are WRITE IOPs of critical importance? If
> not, just use WRITE-THRU and have the SSDs be a READ cache for hot data.
>
> There is no or almost zero risk to your data in that configuration.

Well, I don't know if I'd agree with that; bugs in bcache could result in corrupted data being returned from reads or ending up on the backing devices even in write-through mode. Definitely less risk than write-back, I would think, but none?

> Despite all the hand-waving by sysadmins, READ cache is far more useful as
> a practical matter than WRITE. If you have a heavy WRITE load, then there
> is no good solution that doesn't cost money.
Theoretically, caching the writes through the SSD should decrease latency and turn random IO into a sequential stream for the backing device, resulting in increased performance. Ideally, I'd like to avail myself of that :).

> the alternative software solutions both of which are free: EnhanceIO from
> STEC

It looks like there was some activity back in February about getting that into the staging driver section of the kernel, but I don't see it there, and I don't see any further activity, so I'm not sure what happened there. I'd prefer to use functionality in the standard kernel, as opposed to compiling in outside stuff.

> the in-kernel MD-hotspot

Do you have a reference for that? I can't seem to find anything via Google.

> Failing that, shell out the money for a ZFS-friendly setup and abstract
> the storage away from your virtual machines. Indeed that's a much better
> design anyway.

I actually have a storage server sitting right next to the virtualization server running illumos/ZFS, with roughly 21TB of storage, which is going to provide bulk storage, but I plan to have the VM operating system files and smaller data on the virtualization server itself.

> Lastly maybe forget KVM/Xen and get VMware ESXi as your hypervisor.

We use ESXi at my day job, and it's got a pretty good feature set, but I'm trying to stick with open source for my home deployments...

Thanks for your thoughts.

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: 3.10LTS ok for production?
[not found] ` <20131109030128.GJ5474-eJ6RpuielZ6oHZ9hTG1MgCsmlnnoMqry@public.gmane.org>
2013-11-09 5:29 ` Matthew Patton
@ 2013-11-09 6:47 ` Kent Overstreet
2013-11-09 7:11 ` Stefan Priebe
2013-11-13 0:21 ` Paul B. Henson
1 sibling, 2 replies; 7+ messages in thread
From: Kent Overstreet @ 2013-11-09 6:47 UTC (permalink / raw)
To: Paul B. Henson; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

On Fri, Nov 08, 2013 at 07:01:28PM -0800, Paul B. Henson wrote:
> I'd kinda like to use the 3.10 LTS kernel for a virtualization server
> I'm building, but it seems like every time somebody reports a problem
> the recommendation is to make sure you're using the latest bleeding edge
> kernel. Is it intended for bcache to be considered production ready in
> the 3.10 LTS branch, or do you pretty much have to run the latest stable
> of the week for now if you want to be sure to get all the bcache bugfixes
> necessary for a stable system? Specifically, I'd like to use a raid1 of 2
> 256G SSDs to be a write-back cache for a raid10 of 4 2TB HDs. Occasional
> reboots aren't an issue for kernel updates, but I'd prefer to avoid the
> potential instability and config churn of tracking the mainline kernel.

Yes - 3.10 LTS (or 3.11) has been what you want to be running for awhile now; I've been making sure all the bugfixes get backported quickly. The only bugfix I know of that wasn't backported was a fix for a suspend issue, because it was part of a fairly involved allocator rework.

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: 3.10LTS ok for production?
2013-11-09 6:47 ` Kent Overstreet
@ 2013-11-09 7:11 ` Stefan Priebe
[not found] ` <527DE027.2050606-2Lf/h1ldwEHR5kwTpVNS9A@public.gmane.org>
2013-11-13 0:21 ` Paul B. Henson
1 sibling, 1 reply; 7+ messages in thread
From: Stefan Priebe @ 2013-11-09 7:11 UTC (permalink / raw)
To: Kent Overstreet, Paul B. Henson; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

At least I'm suffering from two problems on 3.10:

1.) the dirty value is often wrong / can go negative
2.) the writeback cache is only cleared / written back when writeback_percent > 0

The first one is already fixed by Kent - just waiting for a backport.

Greets,
Stefan

On 09.11.2013 07:47, Kent Overstreet wrote:
> On Fri, Nov 08, 2013 at 07:01:28PM -0800, Paul B. Henson wrote:
>> I'd kinda like to use the 3.10 LTS kernel for a virtualization server
>> I'm building, but it seems like every time somebody reports a problem
>> the recommendation is to make sure you're using the latest bleeding edge
>> kernel. Is it intended for bcache to be considered production ready in
>> the 3.10 LTS branch, or do you pretty much have to run the latest stable
>> of the week for now if you want to be sure to get all the bcache bugfixes
>> necessary for a stable system? Specifically, I'd like to use a raid1 of 2
>> 256G SSDs to be a write-back cache for a raid10 of 4 2TB HDs. Occasional
>> reboots aren't an issue for kernel updates, but I'd prefer to avoid the
>> potential instability and config churn of tracking the mainline kernel.
>
> Yes - 3.10 LTS (or 3.11) has been what you want to be running for awhile
> now; I've been making sure all the bugfixes get backported quickly. The
> only bugfix I know of that wasn't backported was a fix for a suspend
> issue, because it was part of a fairly involved allocator rework.
^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <527DE027.2050606-2Lf/h1ldwEHR5kwTpVNS9A@public.gmane.org>]
* RE: 3.10LTS ok for production?
[not found] ` <527DE027.2050606-2Lf/h1ldwEHR5kwTpVNS9A@public.gmane.org>
@ 2013-11-13 0:21 ` Paul B. Henson
0 siblings, 0 replies; 7+ messages in thread
From: Paul B. Henson @ 2013-11-13 0:21 UTC (permalink / raw)
To: 'Stefan Priebe'; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

> From: Stefan Priebe [mailto:s.priebe-2Lf/h1ldwEHR5kwTpVNS9A@public.gmane.org]
> Sent: Friday, November 08, 2013 11:12 PM
>
> 1.) dirty value is often wrong / can go negative
> 2.) writeback cache is only cleared / written back when
> writeback_percent > 0

Hmm, neither of those results in data loss or corruption though?

^ permalink raw reply [flat|nested] 7+ messages in thread
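The knobs under discussion are visible directly in sysfs; a rough sketch of inspecting and draining the cache, assuming the device registered as bcache0:

```shell
# Amount of dirty data held by the cache (the value reported as
# sometimes going negative under the first bug):
cat /sys/block/bcache0/bcache/dirty_data

# On the affected kernels background writeback only runs with a
# non-zero writeback_percent, so keep it above zero, or switch to
# writethrough so no new dirty data accumulates:
echo 10 > /sys/block/bcache0/bcache/writeback_percent
echo writethrough > /sys/block/bcache0/bcache/cache_mode
```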
* RE: 3.10LTS ok for production?
2013-11-09 6:47 ` Kent Overstreet
2013-11-09 7:11 ` Stefan Priebe
@ 2013-11-13 0:21 ` Paul B. Henson
1 sibling, 0 replies; 7+ messages in thread
From: Paul B. Henson @ 2013-11-13 0:21 UTC (permalink / raw)
To: 'Kent Overstreet'; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

> From: Kent Overstreet [mailto:kmo-PEzghdH756F8UrSeD/g0lQ@public.gmane.org]
> Sent: Friday, November 08, 2013 10:47 PM
>
> Yes - 3.10 LTS (or 3.11) has been what you want to be running for awhile
> now; I've been making sure all the bugfixes get backported quickly.

Cool, thanks for the feedback. I ended up starting with a 3.11.7 kernel after all; I'm going to play with that and see what happens.

I'm looking forward to the potential support for redundant cache devices within bcache itself, so I won't have to mirror my two SSDs but will still have redundancy for writeback and more overall space for read caching. Not sure what the timeline is for that, but I imagine it wouldn't be backported to 3.10.

^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-11-13 0:21 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-09 3:01 3.10LTS ok for production? Paul B. Henson
[not found] ` <20131109030128.GJ5474-eJ6RpuielZ6oHZ9hTG1MgCsmlnnoMqry@public.gmane.org>
2013-11-09 5:29 ` Matthew Patton
[not found] ` <op.w59n7e06f3gqgg-r49W/1Cwd2cba4AQcYcrVKxOck334EZe@public.gmane.org>
2013-11-13 0:17 ` Paul B. Henson
2013-11-09 6:47 ` Kent Overstreet
2013-11-09 7:11 ` Stefan Priebe
[not found] ` <527DE027.2050606-2Lf/h1ldwEHR5kwTpVNS9A@public.gmane.org>
2013-11-13 0:21 ` Paul B. Henson
2013-11-13 0:21 ` Paul B. Henson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox