* Formatting of backing device
@ 2012-02-01 10:10 Piergiorgio Sartor
From: Piergiorgio Sartor @ 2012-02-01 10:10 UTC
To: linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi all,

first of all, I would like to congratulate you on this
project; I think it is one of the most promising
features the Linux kernel can have.

That said, I have a question about the concept of
formatting the backing device.

As far as I understood, the original concept of bcache
was to simply "register" or "attach" a cache to a
backing device, that is, the backing device did not
have to be formatted.

Lately, again if I understood correctly, this
behaviour was changed and the backing device
now needs to be formatted.

So, the question is:

What about a device that is already in use? Is it still
possible to attach a cache in such a situation?

In general, will it be possible (in the future) to
attach/detach a cache to any already available device?
Or must the caching/backing setup be planned
before the HW is available, so to speak?

It would be useful (and cool too) to be able to
attach/detach the SSD cache on the fly (at run-time)
to any device that needs it.

I hope the questions are clear; if not, please
let me know.

Thanks a lot in advance,

bye,

--

piergiorgio

* Re: Formatting of backing device
@ 2012-02-01 19:12 Adam Berkan
From: Adam Berkan @ 2012-02-01 19:12 UTC
To: Piergiorgio Sartor; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

You can attach bcache to a drive with an existing file system, and it will
continue as normal. If you connect to a drive without a file system, then
it will continue to not have a file system, but you can format it while
attached.

Attach/detach should work while the device is in use. This isn't the most
tested code path, especially with writeback on, but it's supposed to work.
Detaching while the cache is dirty requires flushing all that data so
performance will be bad until the detach completes.

Let us know if you find any bugs.

* Re: Formatting of backing device
@ 2012-02-01 20:54 Piergiorgio Sartor
From: Piergiorgio Sartor @ 2012-02-01 20:54 UTC
To: Adam Berkan; +Cc: Piergiorgio Sartor, linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi Adam,

thanks for the answer, see below.

On Wed, Feb 01, 2012 at 11:04:59AM -0800, Adam Berkan wrote:
> You can attach bcache to a drive with an existing file system, and it will
> continue as normal. If you connect to a drive without a file system, then
> it will continue to not have a file system, but you can format it while
> attached.

Maybe I misused the term "format".

I did not mean filesystem format, but bcache format.

What I understood, maybe I'm wrong, is that the backing
device, before being used, must be "initialized" with
the bcache tool.

From the docs:

Getting started:
You'll need make-bcache from the bcache-tools repository. Both the cache device
and backing device must be formatted before use.
  make-bcache -B /dev/sdb
  make-bcache -C -w2k -b1M -j64 /dev/sdc

I understand this to mean that something gets written
to the backing device (note the term "formatted").

Am I wrong? I hope so...

Thanks again,

bye,

pg

* Re: Formatting of backing device
@ 2012-02-01 21:43 Adam Berkan
From: Adam Berkan @ 2012-02-01 21:43 UTC
To: Piergiorgio Sartor; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

Oh, sorry I misunderstood.

You have to run make-bcache once to add a bcache superblock to the drive.
After that the drive contents are destroyed and it needs to be formatted
with a filesystem.

At that point you can attach or detach the drive while it is in use.
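
The sequence described above (format once, then attach or detach at
run time) can be sketched roughly as follows. This is only a sketch:
the sysfs paths are an assumption based on the kernel's bcache
documentation and may not match the code discussed in this thread, and
the device names, the bcache0 name and the cache set UUID argument are
placeholders. DRY_RUN defaults to 1, so the commands are printed
rather than executed.

```shell
#!/bin/sh
# Sketch of format / register / attach / detach for bcache.
# ASSUMPTIONS (not from the thread): the sysfs paths below, the bcache0
# device name, and the helper names run, bcache_setup, bcache_detach.

DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        sh -c "$*"
    fi
}

# bcache_setup <backing_dev> <cache_dev> <cset_uuid>
bcache_setup() {
    backing=$1
    cache=$2
    cset_uuid=$3
    # Writes a bcache superblock: existing contents of the drive are lost.
    run "make-bcache -B $backing"
    run "make-bcache -C $cache"
    # Tell the kernel about both devices.
    run "echo $backing > /sys/fs/bcache/register"
    run "echo $cache > /sys/fs/bcache/register"
    # Attach the backing device to the cache set.
    run "echo $cset_uuid > /sys/block/bcache0/bcache/attach"
}

# Detach flushes dirty data first, so it can be slow with writeback on.
bcache_detach() {
    run "echo 1 > /sys/block/bcache0/bcache/detach"
}
```

Calling `bcache_setup /dev/sdb /dev/sdc <cset-uuid>` with the default
DRY_RUN=1 only lists the five commands, which is a safe way to review
them before running anything against a real drive.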

* Re: Formatting of backing device
@ 2012-02-01 21:44 Piergiorgio Sartor
From: Piergiorgio Sartor @ 2012-02-01 21:44 UTC
To: Adam Berkan; +Cc: Piergiorgio Sartor, linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi Adam,

On Wed, Feb 01, 2012 at 01:38:12PM -0800, Adam Berkan wrote:
> Oh, sorry I misunderstood.
>
> You have to run make-bcache once to add a bcache superblock to the drive.
> After that the drive contents are destroyed and it needs to be formatted
> with a filesystem.

ah! That's not good...

Is there any plan to have the caching device attachable
and detachable from *any* backing device without prior
"formatting" of this second one?

I think bcache is a very interesting and promising
project, but formatting the backing device is
something, I think, that should be avoided.

bye,

pg

* Re: Formatting of backing device
@ 2012-02-01 23:11 Adam Berkan
From: Adam Berkan @ 2012-02-01 23:11 UTC
To: Piergiorgio Sartor; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

When we make-bcache on a drive we need to replace the filesystem
superblock with a bcache superblock so the kernel knows to load the
drive through bcache, but this destroys the filesystem. We've talked
about hacky ways to hide the bcache superblock somewhere else, but
it's very dangerous stuff that's likely to fail and we don't want to
support it.

Adam
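
As a toy illustration of why replacing the superblock makes the drive
recognisable as bcache while the old filesystem stops being
recognisable, consider the sketch below. It uses a regular file as a
stand-in for the drive and its first 8 bytes as a stand-in for "the
superblock". The signature strings and function names are
hypothetical; real filesystem and bcache superblocks have different
layouts and offsets.

```shell
# Toy model only: FSMAGIC0 and BCACHE00 are made-up 8-byte signatures,
# not the real on-disk magic numbers.

# Report which kind of "superblock" an image file starts with.
identify() {
    sig=$(dd if="$1" bs=1 count=8 2>/dev/null)
    case $sig in
        FSMAGIC0) echo filesystem ;;
        BCACHE00) echo bcache ;;
        *)        echo unknown ;;
    esac
}

# Create an image that looks like a formatted filesystem with some data.
make_fs_image() {
    printf 'FSMAGIC0%s' 'user data lives here' > "$1"
}

# Stand-in for "make-bcache -B": overwrite the signature in place.
toy_make_bcache() {
    printf 'BCACHE00' | dd of="$1" bs=1 conv=notrunc 2>/dev/null
}
```

After `toy_make_bcache`, `identify` reports bcache and nothing in the
image says a filesystem was ever there, even though the bytes after
the signature are untouched.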

* Re: Formatting of backing device
@ 2012-02-02 19:01 Piergiorgio Sartor
From: Piergiorgio Sartor @ 2012-02-02 19:01 UTC
To: Adam Berkan; +Cc: Piergiorgio Sartor, linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi Adam,

On Wed, Feb 01, 2012 at 03:11:54PM -0800, Adam Berkan wrote:
> When we make-bcache on a drive we need to replace the filesystem
> superblock with a bcache superblock so the kernel knows to load the
> drive through bcache, but this destroys the filesystem. We've talked

well, I guess it will destroy the md superblock 1.1 too,
how about LVM metadata?

I think the mismatch is with /dev/bcacheX device.

The first implementation, as far as I remember, was simply
telling the caching device (using UUID) which was the
backing device, i.e. it was registering the backing to
the caching.
Then, still if I got it right, the bcache was caching the
backing device directly, without any need of a third
device (/dev/bcacheX).

I understand that the actual implementation is easier and,
maybe, simpler, since a completely new device is added,
which will have the new caching "features", while the
old one (backing device) is just a further layer.
This is similar to LVM over md over /dev/sdX.

Nevertheless, my opinion is, while still considering
bcache a great project, that it should work on already
existing devices, without touching them.

Anyway, thanks a lot for the chat,

bye,

pg

* Re: Formatting of backing device
@ 2012-02-02 22:11 Kent Overstreet
From: Kent Overstreet @ 2012-02-02 22:11 UTC
To: Piergiorgio Sartor; +Cc: Adam Berkan, linux-bcache-u79uwXL29TY76Z2rM5mHXA

On Thu, Feb 02, 2012 at 08:01:22PM +0100, Piergiorgio Sartor wrote:
> Hi Adam,
>
> On Wed, Feb 01, 2012 at 03:11:54PM -0800, Adam Berkan wrote:
> > When we make-bcache on a drive we need to replace the filesystem
> > superblock with a bcache superblock so the kernel knows to load the
> > drive through bcache, but this destroys the filesystem. We've talked
>
> well, I guess it will destroy the md superblock 1.1 too,
> how about LVM metadata?
>
> I think the mismatch is with /dev/bcacheX device.
>
> The first implementation, as far as I remember, was simply
> telling the caching device (using UUID) which was the
> backing device, i.e. it was registering the backing to
> the caching.
> Then, still if I got it right, the bcache was caching the
> backing device directly, without any need of a third
> device (/dev/bcacheX).
>
> I understand that the actual implementation is easier and,
> maybe, simpler, since a completely new device is added,
> which will have the new caching "features", while the
> old one (backing device) is just a further layer.
> This is similar to LVM over md over /dev/sdX.

The reason for getting rid of transparent caching didn't have anything
to do with ease of implementation: the real reason is that safely doing
persistent caching (and writeback!) is impossible with transparent
caching.

Adding back a mode that caches a device without a bcache superblock but
without the cache being persistent isn't out of the question, but it
wouldn't be terribly useful to us so it's not at all a priority for me.
If someone else wrote the code I'd take patches, though.

* Re: Formatting of backing device
@ 2012-02-02 22:24 Piergiorgio Sartor
From: Piergiorgio Sartor @ 2012-02-02 22:24 UTC
To: Kent Overstreet; +Cc: Piergiorgio Sartor, Adam Berkan, linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi Kent,

nice to have you in this discussion!

> > I understand that the actual implementation is easier and,
> > maybe, simpler, since a completely new device is added,
> > which will have the new caching "features", while the
> > old one (backing device) is just a further layer.
> > This is similar to LVM over md over /dev/sdX.
>
> The reason for getting rid of transparent caching didn't have anything
> to do with ease of implementation: the real reason is that safely doing
> persistent caching (and writeback!) is impossible with transparent
> caching.

Well, it seems to me "impossible" is a big word...
I could imagine it is more "invasive".

> Adding back a mode that caches a device without a bcache superblock but
> without the cache being persistent isn't out of the question, but it

I miss the point, the superblock can be stored in
the caching device, instead of the backing and
the actual device *could* stay the same.
The kernel would have to discover first the caching,
later the backing and then put things together.
So, the cache will be persistent, or?

As I wrote above, I see this as more complex than
adding a further layer; likely I would do the same.

> wouldn't be terribly useful to us so it's not at all a priority for me.
> If someone else wrote the code I'd take patches, though.

No time for that, unfortunately.

I take the opportunity to congratulate you personally
on this project, well done!

bye,

--

piergiorgio
* Re: Formatting of backing device
2012-02-02 22:24 ` Piergiorgio Sartor
@ 2012-02-16 19:42 ` Alex Elsayed
[not found] ` <loom.20120216T200235-190-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
0 siblings, 1 reply; 19+ messages in thread
From: Alex Elsayed @ 2012-02-16 19:42 UTC (permalink / raw)
To: linux-bcache-u79uwXL29TY76Z2rM5mHXA

Piergiorgio Sartor <piergiorgio.sartor@...> writes:

> > The reason for getting rid of transparent caching didn't have anything
> > to do with ease of implementation: the real reason is that safely doing
> > persistent caching (and writeback!) is impossible with transparent
> > caching.
>
> Well, it seems to me "impossible" is a big word...
> I could imagine it is more "invasive".

Not invasive, *horribly unsafe*

> > Adding back a mode that caches a device without a bcache superblock but
> > without the cache being persistent isn't out of the question, but it
>
> I miss the point, the superblock can be stored in
> the caching device, instead of the backing and
> the actual device *could* stay the same.
> The kernel would have to discover first the caching,
> later the backing and then put things together.
> So, the cache will be persistent, or?

Oh sure, the cache is persistent. But device discovery order is undefined, and
if the backing device is no different from one without a cache and writeback
caching is enabled the kernel has no *possible* way to know that a caching
device is needed or even exists. So it mounts it, but it doesn't have any of the
data in the writeback cache meaning it thinks the filesystem is corrupted.
Depending on the filesystem and exactly what is missing, it may run some
in-kernel recovery code that alters the disk. You just lost your data.
^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <loom.20120216T200235-190-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>]
* Re: Formatting of backing device
[not found] ` <loom.20120216T200235-190-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
@ 2012-02-16 20:33 ` Piergiorgio Sartor
[not found] ` <20120216203332.GA6597-W+Wf6LxwHt0@public.gmane.org>
0 siblings, 1 reply; 19+ messages in thread
From: Piergiorgio Sartor @ 2012-02-16 20:33 UTC (permalink / raw)
To: Alex Elsayed; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi Alex,

> Oh sure, the cache is persistent. But device discovery order is undefined, and
> if the backing device is no different from one without a cache and writeback
> caching is enabled the kernel has no *possible* way to know that a caching
> device is needed or even exists. So it mounts it, but it doesn't have any of the
> data in the writeback cache meaning it thinks the filesystem is corrupted.
> Depending on the filesystem and exactly what is missing, it may run some
> in-kernel recovery code that alters the disk. You just lost your data.

nonono, I believe I wrote that the kernel
should *first* look for caching devices
and later for the others...

The formatting thing is, clearly, a much
standard approach, for the current kernel
architecture, but nothing forbids to have
a hierarchical search of devices.
This could be done, for example, by assigning
different classes to each device type, to
be scanned in a specific order.

In this scope (not bcache, but device discovery)
it is already a problem a layered software RAID
with metadata 1.0 together with 1.2 (or 1.1).
Where the first lies at the end and the second
at the beginning of the HDDs, making it difficult
(but not impossible) to find out which is the
outer and which is the inner one.

bye,

--

piergiorgio
^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <20120216203332.GA6597-W+Wf6LxwHt0@public.gmane.org>]
* Re: Formatting of backing device
[not found] ` <20120216203332.GA6597-W+Wf6LxwHt0@public.gmane.org>
@ 2012-02-16 20:50 ` Alex Elsayed
[not found] ` <CA++fp8wcxTDJ=mbsKmWi27+yRZg-tyNdWgmhWU6=UeWgC0TZuw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 19+ messages in thread
From: Alex Elsayed @ 2012-02-16 20:50 UTC (permalink / raw)
To: Piergiorgio Sartor; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

On Thu, Feb 16, 2012 at 12:33 PM, Piergiorgio Sartor
<piergiorgio.sartor-KvP5wT2u2U0@public.gmane.org> wrote:
> Hi Alex,
>
>> Oh sure, the cache is persistent. But device discovery order is undefined, and
>> if the backing device is no different from one without a cache and writeback
>> caching is enabled the kernel has no *possible* way to know that a caching
>> device is needed or even exists. So it mounts it, but it doesn't have any of the
>> data in the writeback cache meaning it thinks the filesystem is corrupted.
>> Depending on the filesystem and exactly what is missing, it may run some
>> in-kernel recovery code that alters the disk. You just lost your data.
>
> nonono, I believe I wrote that the kernel
> should *first* look for caching devices
> and later for the others...
>
> The formatting thing is, clearly, a much
> standard approach, for the current kernel
> architecture, but nothing forbids to have
> a hierarchical search of devices.
> This could be done, for example, by assigning
> different classes to each device type, to
> be scanned in a specific order.
>
> In this scope (not bcache, but device discovery)
> it is already a problem a layered software RAID
> with metadata 1.0 together with 1.2 (or 1.1).
> Where the first lies at the end and the second
> at the beginning of the HDDs, making it difficult
> (but not impossible) to find out which is the
> outer and which is the inner one.

The difference is that for MD devices, both types
of metadata are on the same block device. You're
prioritizing which *type of metadata* is checked
for first in that case. For bcache, you'd have to
scan /dev/sdz before /dev/sda if sdz is the cache
and sda is the backing device. Now consider a
few things:

1.) SCSI/SATA devices may be probed in parallel

2.) udev gets events when each device is probed,
*not* after all devices have been probed

3.) The bcache device may not even be attached
to the system at the time

4.) Even in the MD case, there is still *some*
change to the backing device, there is still some
sort of data there that says "hey, there's more."
A totally unchanged backing device won't do that.
Even if it doesn't invalidate the other metadata, it
still tells the kernel that it's not enough - think of
it as invalidating it at the logical rather than the
physical level

3 and 4 are the really critical ones. If the cable
that connects the SSD to the computer is flaky,
and it never gets probed, and there is *no*
metadata on the backing device, there is
*exactly* zero information available to the kernel
to inform it that a backing device ever existed at all.

Also, you say that the cache must be scanned
before the backing device - but how do you know
it's a cache or a backing device until you've probed it?
You could delay sending any uevents until all
devices are probed, except there are some devices
that take 30sec timeouts and fail, or iscsi, or devices
that get plugged in at runtime, or...

And since you can't do that, you have a chicken
and egg problem. You can't probe the backing
device before the cache, but you don't know which
is the cache until you probe it. And there may be
more than one of each. You can have one cache
and 200 backing devices, in theory. Want to take
the odds that the cache gets probed first at random?
Because the kernel doesn't have enough information
for it to be anything other than random.
^ permalink raw reply [flat|nested] 19+ messages in thread
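The odds mentioned at the end of that message can be put into numbers with a small sketch. It assumes, purely for illustration, that every probe order is equally likely (the thread only says the order is effectively random); the function names are made up for this example and are not part of bcache or udev.

```python
from fractions import Fraction
from itertools import permutations


def cache_first_by_enumeration(n_backing: int) -> Fraction:
    """Enumerate every probe order of one cache plus n_backing backing devices.

    Assumption: all orders are equally likely. Only practical for small n.
    """
    devices = ["cache"] + [f"backing{i}" for i in range(n_backing)]
    orders = list(permutations(devices))
    cache_first = sum(1 for order in orders if order[0] == "cache")
    return Fraction(cache_first, len(orders))


def cache_first_closed_form(n_backing: int) -> Fraction:
    """Same probability without enumeration.

    Under the uniform assumption the cache is equally likely to land in
    any of the n_backing + 1 positions, so it is first with odds 1/(n+1).
    """
    return Fraction(1, n_backing + 1)


if __name__ == "__main__":
    print(cache_first_by_enumeration(3))   # small case, checked by brute force
    print(cache_first_closed_form(200))    # the "one cache, 200 backing devices" case
```

Under that assumption, one cache and 200 backing devices gives a 1 in 201 chance that the cache happens to be probed before everything else.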
[parent not found: <CA++fp8wcxTDJ=mbsKmWi27+yRZg-tyNdWgmhWU6=UeWgC0TZuw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: Formatting of backing device
[not found] ` <CA++fp8wcxTDJ=mbsKmWi27+yRZg-tyNdWgmhWU6=UeWgC0TZuw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-16 20:52 ` Alex Elsayed
2012-02-16 22:35 ` Piergiorgio Sartor
1 sibling, 0 replies; 19+ messages in thread
From: Alex Elsayed @ 2012-02-16 20:52 UTC (permalink / raw)
To: Piergiorgio Sartor; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA

On Thu, Feb 16, 2012 at 12:50 PM, Alex Elsayed
<eternaleye-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> 3 and 4 are the really critical ones. If the cable
> that connects the SSD to the computer is flaky,
> and it never gets probed, and there is *no*
> metadata on the backing device, there is
> *exactly* zero information available to the kernel
> to inform it that a backing device ever existed at all.

Er, to inform it that a *cache* device ever existed
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Formatting of backing device
[not found] ` <CA++fp8wcxTDJ=mbsKmWi27+yRZg-tyNdWgmhWU6=UeWgC0TZuw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-16 20:52 ` Alex Elsayed
@ 2012-02-16 22:35 ` Piergiorgio Sartor
[not found] ` <20120216223554.GA6947-W+Wf6LxwHt0@public.gmane.org>
1 sibling, 1 reply; 19+ messages in thread
From: Piergiorgio Sartor @ 2012-02-16 22:35 UTC (permalink / raw)
To: Alex Elsayed; +Cc: Piergiorgio Sartor, linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi Alex.

> The difference is that for MD devices, both types
> of metadata are on the same block device. You're
> prioritizing which *type of metadata* is checked

how? 1.0 in 1.1 is the same as 1.1 in 1.0...
The only difference would be that one is smaller
than the other, which can hint which is first
and which is second.

> for first in that case. For bcache, you'd have to
> scan /dev/sdz before /dev/sda if sdz is the cache
> and sda is the backing device. Now consider a
> few things:

Again, you scan *all* and check *only* for
cache devices.
After that, if none are found, you've your list of
devices, if some are found, you activate these
first and then the corresponding backing device.

> 1.) SCSI/SATA devices may be probed in parallel

And this does not make any difference, in
this context. Probed does not necessarily mean
activated. Maybe you mean probed as activated.
For me it is different.

> 2.) udev gets events when each device is probed,
> *not* after all devices have been probed

This is a udev issue, which can be fixed... :-)

> 3.) The bcache device may not even be attached
> to the system at the time

Good, so the persistency is not needed, I guess,
in that case...
Or, the backing device cannot be activated,
which might be an option, in the current
architecture, but, maybe a bit borderline.

> 4.) Even in the MD case, there is still *some*
> change to the backing device, there is still some
> sort of data there that says "hey, there's more."

If you mean the 1.1 in 1.0 (or the other way around),
there is no information telling you there's more,
except, as mentioned, the size, which is not directly
related to device probing.

Otherwise, I do not understand what you mean.

> Even if it doesn't invalidate the other metadata, it
> still tells the kernel that it's not enough - think of
> it as invalidating it at the logical rather than the
> physical level
>
> 3 and 4 are the really critical ones. If the cable
> that connects the SSD to the computer is flaky,

In this case you've much more serious problems,
I guess, this is not a use case.
The cable can be flaky also after the probing
and activation, and result in a disaster.

> Also, you say that the cache must be scanned
> before the backing device - but how do you know
> it's a cache or a backing device until you've probed it?

The cache has a "header" with enough information,
namely the UUID(s) of the backing device(s).
So you probe (I use "scan") all devices, sort out
caches, sort out backing and the rest.
Then you activate in proper order.
There are many other alternatives.

> You could delay sending any uevents until all
> devices are probed, except there are some devices
> that take 30sec timeouts and fail, or iscsi, or devices
> that get plugged in at runtime, or...

Those are *all* solvable problems. Some of
them are even too generic. That is, they're
problems in any case.

As I wrote a few posts ago, it is clear why it is
like it is. It is *complex* to implement all the
required changes in order to have the backing
device unformatted. Which has, in the end,
limited advantage.

No problem with that, very fine for me, but
telling that it is not possible, it is just,
well, let's say funny.

> And since you can't do that, you have a chicken
> and egg problem. You can't probe the backing
> device before the cache, but you don't know which
> is the cache until you probe it. And there may be
> more than one of each. You can have one cache
> and 200 backing devices, in theory. Want to take
> the odds that the cache gets probed first at random?
> Because the kernel doesn't have enough information
> for it to be anything other than random.

The kernel, again, has to separate the probing
process, from the activation process.

Furthermore, it could always be possible to
configure the booting process to do so, in
an *explicit* way, like md does usually, i.e.
with a configuration file (in initramfs).

bye,

--

piergiorgio
^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <20120216223554.GA6947-W+Wf6LxwHt0@public.gmane.org>]
* Re: Formatting of backing device
[not found] ` <20120216223554.GA6947-W+Wf6LxwHt0@public.gmane.org>
@ 2012-02-16 23:09 ` Joseph Glanville
[not found] ` <CAOzFzEhO+6ECN-WjvtMK+-2g7Dwo+DPwQMVWuCZG=Y3BVRNEBw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 19+ messages in thread
From: Joseph Glanville @ 2012-02-16 23:09 UTC (permalink / raw)
To: Piergiorgio Sartor; +Cc: Alex Elsayed, linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi Piergiorgio,

Your reasoning is quite sound assuming the cache device is present at
activation time.

In the case where the cache device has failed but the backing device
has persisted the failure then the case looks somewhat more like this:
1) OS probes all devices, searches for caches and finds none.
2) Activate the raw backing device with possibly corrupt data....

This is the primary reason Alex has been trying to convince you of the
necessity of the super block on the backing device: it exists to tell
the kernel not to try to activate it raw if the cache is not found.

Joseph.

--
Founder | Director | VP Research
Orion Virtualisation Solutions | www.orionvm.com.au | Phone: 1300 56 99 52 | Mobile: 0428 754 846
^ permalink raw reply [flat|nested] 19+ messages in thread
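The two-step failure case described in that message can be written down as a tiny decision function. This is only a sketch of the argument, not bcache's actual logic: the function name and the three result strings are invented here, and the "no superblock" branch models the transparent scheme being debated, in which a backing device without a cache looks like any ordinary disk.

```python
def activation_decision(backing_has_superblock: bool, cache_found: bool) -> str:
    """Illustrative model of what can be decided when a backing device shows up.

    Returns one of three hypothetical outcomes:
      "attach" - cache is present, pair it with the backing device
      "hold"   - cache is missing, but the backing device says one is expected
      "raw"    - cache is missing and nothing on the disk hints at it
    """
    if cache_found:
        return "attach"
    if backing_has_superblock:
        # The marker on the backing device is what makes refusing possible.
        return "hold"
    # Indistinguishable from a plain disk: it gets used as-is, even though
    # dirty writeback data may exist only on the missing cache.
    return "raw"


if __name__ == "__main__":
    for has_sb in (True, False):
        for found in (True, False):
            print(has_sb, found, activation_decision(has_sb, found))
```

The only input combination that ends in "raw" is the one with no superblock and no cache, which is the case where stale data could be mounted.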
[parent not found: <CAOzFzEhO+6ECN-WjvtMK+-2g7Dwo+DPwQMVWuCZG=Y3BVRNEBw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: Formatting of backing device
[not found] ` <CAOzFzEhO+6ECN-WjvtMK+-2g7Dwo+DPwQMVWuCZG=Y3BVRNEBw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-16 23:17 ` Piergiorgio Sartor
[not found] ` <20120216231754.GA14206-W+Wf6LxwHt0@public.gmane.org>
0 siblings, 1 reply; 19+ messages in thread
From: Piergiorgio Sartor @ 2012-02-16 23:17 UTC (permalink / raw)
To: Joseph Glanville
Cc: Piergiorgio Sartor, Alex Elsayed, linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi joseph,

> Your reasoning is quite sound assuming the cache device is present at
> activation time.
>
> In the case where the cache device has failed but the backing device
> has persisted the failure then the case looks somewhat more like this:
> 1) OS probes all devices, searches for caches and finds none.
> 2) Activate the raw backing device with possibly corrupt data....

as I mentioned, this is a bit borderline.

One reason is that it would be a failure in
any case, depending on what the system will
do with the backing device.

Second, as per md, the configuration could
be in a file in initramfs, which will allow
to support this type of failure *and* have
the backing device unformatted.

In other words, it does not need to be
activated automatically by the kernel, it
can be done by the user, like md...

As I wrote before, I'm fine with the formatting,
very clear and understandable.

bye,

--

piergiorgio
^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <20120216231754.GA14206-W+Wf6LxwHt0@public.gmane.org>]
* Re: Formatting of backing device
[not found] ` <20120216231754.GA14206-W+Wf6LxwHt0@public.gmane.org>
@ 2012-02-16 23:34 ` Alex Elsayed
[not found] ` <CA++fp8w7_uUd35Tcwy1bwEYpR6tJ+fkWMEg+iVEyJ1H4hqKBKg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 19+ messages in thread
From: Alex Elsayed @ 2012-02-16 23:34 UTC (permalink / raw)
To: Piergiorgio Sartor; +Cc: Joseph Glanville, linux-bcache-u79uwXL29TY76Z2rM5mHXA

On Thu, Feb 16, 2012 at 3:17 PM, Piergiorgio Sartor
<piergiorgio.sartor-KvP5wT2u2U0@public.gmane.org> wrote:
> Hi joseph,
>
>> Your reasoning is quite sound assuming the cache device is present at
>> activation time.
>>
>> In the case where the cache device has failed but the backing device
>> has persisted the failure then the case looks somewhat more like this:
>> 1) OS probes all devices, searches for caches and finds none.
>> 2) Activate the raw backing device with possibly corrupt data....
>
> as I mentioned, this is a bit borderline.
>
> One reason is that it would be a failure in
> any case, depending on what the system will
> do with the backing device.

Perhaps, but there are two types of failures that are
absolutely critical to distinguish between:

Recoverable, and unrecoverable.

If there is a superblock, any error in which the cache
device is not available at activation is recoverable so
long as the cache device can be made available at
some other time.

If there is a superblock, whether such a situation is
recoverable is now undefined, and dependent on the
implementation of the filesystem. This is a recipe for a
horrible disaster.

> Second, as per md, the configuration could
> be in a file in initramfs, which will allow
> to support this type of failure *and* have
> the backing device unformatted.

Actually, in modern initramfs' (see dracut) the
way md devices are set up is via dynamic scanning,
NOT via a static configuration file. This is possible
*because* md devices have a superblock on the
backing devices. This is *desirable* because a generic
initramfs reduces the burden on the user (to know
what they are doing) and on the distribution (to
support users who roll their own initramfs)

And dracut's entire logic is based on acting on devices
as they are detected, so delaying all uevents until
everything has been found would catastrophically
break it. Especially because it acting on those events
can create more devices which also need probed. What
if your cache is on LVM but your backing devices are
whole disks? Waiting until all devices have been
probed before poking userspace means that you will
never find the cache at all.
^ permalink raw reply [flat|nested] 19+ messages in thread
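The LVM point at the end of that message can be illustrated with a toy model. It is a sketch under stated assumptions, not a description of udev or dracut internals: device names such as `sdc`, `vg0` and `vg0/cache` are hypothetical, and `created_by_userspace` simply records which new devices appear once userspace has handled the event for a given device.

```python
from collections import deque


def visible_devices(probed, created_by_userspace, deliver_events_immediately):
    """Return the devices that ever become visible, in discovery order.

    probed: devices the kernel finds on its own (hypothetical names).
    created_by_userspace: maps a device to the devices that userspace sets
        up when it handles that device's event (e.g. activating a volume
        group found on a disk).
    deliver_events_immediately: if False, events are withheld until "all
        devices are probed", so userspace never gets to create anything.
    """
    seen = []
    pending = deque(probed)
    while pending:
        dev = pending.popleft()
        if dev in seen:
            continue
        seen.append(dev)
        if deliver_events_immediately:
            # Handling the event may bring new devices into existence,
            # which then need probing themselves.
            pending.extend(created_by_userspace.get(dev, []))
    return seen


if __name__ == "__main__":
    disks = ["sda", "sdb", "sdc"]
    stack = {"sdc": ["vg0"], "vg0": ["vg0/cache"]}
    print(visible_devices(disks, stack, True))
    print(visible_devices(disks, stack, False))
```

With events delivered as devices appear, the logical volume holding the cache is eventually discovered; with events withheld, only the three whole disks are ever seen.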
[parent not found: <CA++fp8w7_uUd35Tcwy1bwEYpR6tJ+fkWMEg+iVEyJ1H4hqKBKg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: Formatting of backing device
[not found] ` <CA++fp8w7_uUd35Tcwy1bwEYpR6tJ+fkWMEg+iVEyJ1H4hqKBKg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-16 23:35 ` Alex Elsayed
2012-02-17 19:12 ` Piergiorgio Sartor
1 sibling, 0 replies; 19+ messages in thread
From: Alex Elsayed @ 2012-02-16 23:35 UTC (permalink / raw)
To: Piergiorgio Sartor; +Cc: Joseph Glanville, linux-bcache-u79uwXL29TY76Z2rM5mHXA

> If there is a superblock, whether such a situation is
> recoverable is now undefined, and dependent on the

Argh. "If there is *not* a superblock"
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Formatting of backing device
[not found] ` <CA++fp8w7_uUd35Tcwy1bwEYpR6tJ+fkWMEg+iVEyJ1H4hqKBKg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-16 23:35 ` Alex Elsayed
@ 2012-02-17 19:12 ` Piergiorgio Sartor
1 sibling, 0 replies; 19+ messages in thread
From: Piergiorgio Sartor @ 2012-02-17 19:12 UTC (permalink / raw)
To: Alex Elsayed
Cc: Piergiorgio Sartor, Joseph Glanville, linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hi Alex,

> Perhaps, but there are two types of failures that are
> absolutely critical to distinguish between:
>
> Recoverable, and unrecoverable.
>
> If there is a superblock, any error in which the cache
> device is not available at activation is recoverable so
> long as the cache device can be made available at
> some other time.

yes and no, it depends, as I wrote before, on *if*
the cache device is coming and what the
system is doing with the volume.

> If there is a superblock, whether such a situation is
> recoverable is now undefined, and dependent on the
> implementation of the filesystem.

OK, let's say it differently, the superblock *could*
be in a different place than the backing device.

> Actually, in modern initramfs' (see dracut) the
> way md devices are set up is via dynamic scanning,
> NOT via a static configuration file.

Actually, the dynamic scanning is done in user space,
not in kernel space.
It could be done using "mdadm.conf" or, by "udev",
using "mdadm -I" and proper udev rules.

This could be replicated with bcache.

As an example, and please note this is just an
example, so no nitpicking, we can consider the
following.

First of all, what is required is to activate the bcache
system from boot and to be able to use the persistent
caching, maybe even in write back mode (backing not
in sync).

What we need is:

1) udev rule, which is triggered by any storage device
found by the kernel. This should support the skipping
of following rules. AFAIK this is somehow supported in
udev, if not it will require a patch.

2) User space tool, let's call it "bcacheadm", which can
activate bcache devices. This is called by the above
udev rule, must keep state across calls (it could use
the /dev/ fs or daemonize, for example) and should have
proper return codes, in order to allow udev to skip
following rules.

3) Configuration file, which contains, in the simple
case, pairs of device UUIDs, let's say caching-backing.
In case of more complex configurations it could be a
human (un)readable xml file.

Everything is packed into the initramfs, like it is
nowadays done with mdadm.

When a storage device pops up, udev calls, at first, the
bcache rule. bcacheadm will then check if the device is
in the configuration list.
If not, it will just return and the following udev rules
will run.
If yes, it will "copy" the device in the proper slot
(figuratively, slot in the list) and, if the slot is full
(caching and backing present), it will ask the kernel to
create the bcache device (and trigger a further udev
event). If the slot is not full, then it will return and
inform udev to skip the following rules (for this device).

As I wrote above, this all runs in initramfs, like it
happens for md devices.

This is a bit complex, but I'm pretty sure smart people
can do better. More or less the initial requirements
are fulfilled.

This approach will, de facto, detach the superblock from
the backing device and put it in the config file.

What do we gain? A backing device unformatted.

What do we pay? Apart from a little complexity, we
introduce a single point of failure, namely the
configuration file. If this is lost, damaged or changed
unintentionally, we can, potentially, create the
situation where the backing device is activated
without cache.
Of course, the information in this config file could be
redundant and several fail-safe mechanisms could be
considered.

Is this worth it? If you ask me, it is *NOT* worth it.
Nevertheless, my point is that it is *possible*.
Complexity is the limitation, the several dependencies,
the udev weaknesses, and so on.

That's why, I write it again, I fully agree on having
the superblock into the backing device.

Sorry for the long post,

bye,

--

piergiorgio
^ permalink raw reply [flat|nested] 19+ messages in thread
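The slot-filling behaviour proposed for the hypothetical "bcacheadm" tool in that message can be sketched in a few lines. Nothing here is a real tool or API: the class name, the verdict names and the UUID strings are all invented for illustration, and the configuration is assumed to be a simple mapping from each backing-device UUID to the UUID of its cache, so that one cache may serve several backing devices.

```python
from enum import Enum


class Verdict(Enum):
    NOT_OURS = "device not in config: let the following udev rules run"
    HELD = "slot incomplete: tell udev to skip the following rules"
    ACTIVATE = "slot complete: ask the kernel to create the bcache device"


class SlotTable:
    """State kept across calls, one call per device event (hypothetical)."""

    def __init__(self, cache_for_backing):
        # Assumed config format: {backing_uuid: cache_uuid}
        self.cache_for_backing = dict(cache_for_backing)
        self.seen = set()

    def on_device(self, uuid):
        """Handle one device event; return (verdict, [(cache, backing), ...])."""
        caches = set(self.cache_for_backing.values())
        if uuid not in self.cache_for_backing and uuid not in caches:
            return Verdict.NOT_OURS, []
        self.seen.add(uuid)
        if uuid in self.cache_for_backing:
            # A backing device arrived: ready only if its cache is already here.
            cache = self.cache_for_backing[uuid]
            ready = [(cache, uuid)] if cache in self.seen else []
        else:
            # A cache arrived: complete every slot whose backing device is waiting.
            ready = [(uuid, backing)
                     for backing, cache in self.cache_for_backing.items()
                     if cache == uuid and backing in self.seen]
        return (Verdict.ACTIVATE if ready else Verdict.HELD), ready


if __name__ == "__main__":
    table = SlotTable({"back-1": "cache-1", "back-2": "cache-1"})
    for uuid in ["other", "back-1", "cache-1", "back-2"]:
        print(uuid, table.on_device(uuid))
```

In this model a backing device that shows up before its cache is held back rather than exposed, which works only as long as the configuration mapping is intact; that is the single point of failure the message itself names.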
end of thread, other threads:[~2012-02-17 19:12 UTC | newest]
Thread overview: 19+ messages
2012-02-01 10:10 Formatting of backing device Piergiorgio Sartor
[not found] ` <20120201101041.GA2779-W+Wf6LxwHt0@public.gmane.org>
2012-02-01 19:12 ` Adam Berkan
[not found] ` <CAHYUNGYcs3CeRA8Pk-R_3hA6mFHshKzysxRaCcsfm3WLT__B0A@mail.gmail.com>
[not found] ` <CAHYUNGYcs3CeRA8Pk-R_3hA6mFHshKzysxRaCcsfm3WLT__B0A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-01 20:54 ` Piergiorgio Sartor
[not found] ` <20120201205456.GA7669-W+Wf6LxwHt0@public.gmane.org>
2012-02-01 21:43 ` Adam Berkan
[not found] ` <CAHYUNGaB4LCESDWU1tWB1ZJp_kBH_=19e07vCndxXS5T98_xBA@mail.gmail.com>
[not found] ` <CAHYUNGaB4LCESDWU1tWB1ZJp_kBH_=19e07vCndxXS5T98_xBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-01 21:44 ` Piergiorgio Sartor
[not found] ` <20120201214443.GA8544-W+Wf6LxwHt0@public.gmane.org>
2012-02-01 23:11 ` Adam Berkan
2012-02-02 19:01 ` Piergiorgio Sartor
[not found] ` <20120202190122.GA2353-W+Wf6LxwHt0@public.gmane.org>
2012-02-02 22:11 ` Kent Overstreet
[not found] ` <20120202221101.GA26768-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-02-02 22:24 ` Piergiorgio Sartor
2012-02-16 19:42 ` Alex Elsayed
[not found] ` <loom.20120216T200235-190-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
2012-02-16 20:33 ` Piergiorgio Sartor
[not found] ` <20120216203332.GA6597-W+Wf6LxwHt0@public.gmane.org>
2012-02-16 20:50 ` Alex Elsayed
[not found] ` <CA++fp8wcxTDJ=mbsKmWi27+yRZg-tyNdWgmhWU6=UeWgC0TZuw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-16 20:52 ` Alex Elsayed
2012-02-16 22:35 ` Piergiorgio Sartor
[not found] ` <20120216223554.GA6947-W+Wf6LxwHt0@public.gmane.org>
2012-02-16 23:09 ` Joseph Glanville
[not found] ` <CAOzFzEhO+6ECN-WjvtMK+-2g7Dwo+DPwQMVWuCZG=Y3BVRNEBw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-16 23:17 ` Piergiorgio Sartor
[not found] ` <20120216231754.GA14206-W+Wf6LxwHt0@public.gmane.org>
2012-02-16 23:34 ` Alex Elsayed
[not found] ` <CA++fp8w7_uUd35Tcwy1bwEYpR6tJ+fkWMEg+iVEyJ1H4hqKBKg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-16 23:35 ` Alex Elsayed
2012-02-17 19:12 ` Piergiorgio Sartor