* How to use --freeze-reshape and is it safe?
From: Ram Ramesh @ 2014-08-14  5:38 UTC
To: linux-raid

I was browsing through the mdadm man pages to check out the --layout options
when converting a 3-disk raid5 to a 4-disk raid6 and encountered the
--freeze-reshape switch/arg. I did a quick Google search and could not find
much info. Can a user issue this to suspend a reshape for a short while?

Specifically:

1. Is the use (or frequent use) of this switch safe? Recommended?
2. Can the array be mounted when this switch is used?
3. What is the correct syntax for its usage?
4. Can I use this to manage the reshape load on an array? Maybe to let
   the disks cool off after busy hours of seeking during a reshape?
5. Can I use it as a safe method for shutting down the machine?
6. Is there a tutorial/FAQ/manual that explains in detail the use of
   other esoteric mdadm switches (like --layout, which I was searching for)?

Regards
Ramesh

* Re: How to use --freeze-reshape and is it safe?
From: NeilBrown @ 2014-08-14  5:56 UTC
To: Ram Ramesh; +Cc: linux-raid

On Thu, 14 Aug 2014 00:38:43 -0500 Ram Ramesh <rramesh2400@gmail.com> wrote:

> I was browsing through the mdadm man pages to check out the --layout options
> when converting a 3-disk raid5 to a 4-disk raid6 and encountered the
> --freeze-reshape switch/arg. I did a quick Google search and could not find
> much info. Can a user issue this to suspend a reshape for a short while?

As --freeze-reshape is only meaningful in combination with --assemble,
this question doesn't really make sense.

If you are using a sufficiently new kernel and mdadm, so that "data_offset" is
adjusted during reshapes and no 'backup' is needed, then you can
suspend a reshape for a period of time with:

   echo frozen > /sys/block/mdXXX/md/sync_action

This is perfectly safe. When you want to unfreeze, write 'idle'
to 'sync_action'. md will notice that a reshape is pending and will resume
from where it left off.

> Specifically:
>
> 1. Is the use (or frequent use) of this switch safe? Recommended?
> 2. Can the array be mounted when this switch is used?
> 3. What is the correct syntax for its usage?
> 4. Can I use this to manage the reshape load on an array? Maybe to let
>    the disks cool off after busy hours of seeking during a reshape?
> 5. Can I use it as a safe method for shutting down the machine?
> 6. Is there a tutorial/FAQ/manual that explains in detail the use of
>    other esoteric mdadm switches (like --layout, which I was searching for)?

Is it really that esoteric?
If you want to reshape an array, you run "mdadm --grow" and list all the
changes you want to make. Set a new level, a new number of devices, a new
layout, a new chunk size, whatever. mdadm will do it if it can and give an
error if it cannot.
If you want to test it out first then that is extremely sensible. Make some
loop devices and experiment.

NeilBrown
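
The freeze and resume steps described in the reply above can be wrapped in a few small shell helpers. This is only a sketch: the function names and the MD_SYSFS override are hypothetical (they are not part of mdadm), and it assumes a kernel and mdadm recent enough that no backup file is in use, as the reply notes.

```shell
#!/bin/sh
# Sketch: pause and resume an md reshape through sysfs.
# MD_SYSFS is a hypothetical override so the helpers can be tried against a
# scratch directory; real arrays live under /sys/block.
MD_SYSFS="${MD_SYSFS:-/sys/block}"

# Print the sync_action path for an array name such as "md0".
sync_action_path() {
    printf '%s/%s/md/sync_action\n' "$MD_SYSFS" "$1"
}

# Pause whatever sync/reshape activity is running on the array.
reshape_freeze() {
    echo frozen > "$(sync_action_path "$1")"
}

# Unfreeze; md then notices the pending reshape and carries on.
reshape_resume() {
    echo idle > "$(sync_action_path "$1")"
}

# Show the current contents of sync_action.
reshape_status() {
    cat "$(sync_action_path "$1")"
}
```

For example, `reshape_freeze md0` before a busy period and `reshape_resume md0` afterwards. Writing to the real sysfs file requires root.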

* Re: How to use --freeze-reshape and is it safe?
From: Ram Ramesh @ 2014-08-14  6:25 UTC
To: NeilBrown; +Cc: linux-raid

On 08/14/2014 12:56 AM, NeilBrown wrote:
> On Thu, 14 Aug 2014 00:38:43 -0500 Ram Ramesh <rramesh2400@gmail.com> wrote:
>
>> I was browsing through the mdadm man pages to check out the --layout options
>> when converting a 3-disk raid5 to a 4-disk raid6 and encountered the
>> --freeze-reshape switch/arg. I did a quick Google search and could not find
>> much info. Can a user issue this to suspend a reshape for a short while?
> As --freeze-reshape is only meaningful in combination with --assemble,
> this question doesn't really make sense.
>
> If you are using a sufficiently new kernel and mdadm, so that "data_offset" is
> adjusted during reshapes and no 'backup' is needed, then you can
> suspend a reshape for a period of time with:
>
>    echo frozen > /sys/block/mdXXX/md/sync_action
>
> This is perfectly safe. When you want to unfreeze, write 'idle'
> to 'sync_action'. md will notice that a reshape is pending and will resume
> from where it left off.
>
>> Specifically:
>>
>> 1. Is the use (or frequent use) of this switch safe? Recommended?
>> 2. Can the array be mounted when this switch is used?
>> 3. What is the correct syntax for its usage?
>> 4. Can I use this to manage the reshape load on an array? Maybe to let
>>    the disks cool off after busy hours of seeking during a reshape?
>> 5. Can I use it as a safe method for shutting down the machine?
>> 6. Is there a tutorial/FAQ/manual that explains in detail the use of
>>    other esoteric mdadm switches (like --layout, which I was searching for)?
> Is it really that esoteric?
> If you want to reshape an array, you run "mdadm --grow" and list all the
> changes you want to make. Set a new level, a new number of devices, a new
> layout, a new chunk size, whatever. mdadm will do it if it can and give an
> error if it cannot.
> If you want to test it out first then that is extremely sensible. Make some
> loop devices and experiment.
>
> NeilBrown

Thanks. The name --freeze-reshape misled me into thinking that this is
a request to stop a reshape, just like --fail is a request to mark a drive
as failed. I used "esoteric" to mean not routinely used, or not interpretable
from the plain English meaning of the switch/arg name.

While I am at this, let me ask the --layout question as well. Does
conversion from raid5 to raid6 do --layout=left-symmetric-6 first and
then distribute Q in a second pass with --layout=left-symmetric? If
not, will the reshape be faster if I do it in two phases?

Ramesh
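
The suggestion to experiment on loop devices applies directly to the raid5 to raid6 question. The following is a rough rehearsal script, not a tested procedure: the directory /tmp/mdtest, the array name /dev/md100, the 200M image size and the helper names are all hypothetical, and with older kernels or mdadm versions the --grow step may additionally ask for a --backup-file. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: rehearse a 3-disk RAID5 -> 4-disk RAID6 reshape on loop devices.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 and run as
# root to actually execute them. All paths and names below are hypothetical.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in a dry run, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rehearse_raid5_to_raid6() {
    dir=/tmp/mdtest
    md=/dev/md100
    devs=""
    run mkdir -p "$dir"
    for i in 0 1 2 3; do
        run truncate -s 200M "$dir/disk$i.img"
        if [ "$DRY_RUN" = 1 ]; then
            dev="/dev/loopN$i"   # placeholder; nothing is attached in a dry run
        else
            dev=$(losetup --find --show "$dir/disk$i.img")
        fi
        devs="$devs $dev"
    done
    set -- $devs                 # $1..$3 become members, $4 is the extra disk
    run mdadm --create "$md" --level=5 --raid-devices=3 "$1" "$2" "$3"
    run mdadm --wait "$md"       # wait for the initial sync to finish
    run mdadm "$md" --add "$4"   # the extra drive the conversion expects
    run mdadm --grow "$md" --level=6 --raid-devices=4
    run cat /proc/mdstat         # watch the reshape progress
}
```

Calling `rehearse_raid5_to_raid6` with the default settings lists the steps. After a real run, the array would still need to be stopped with `mdadm --stop` and the loop devices detached with `losetup -d`.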

* Re: How to use --freeze-reshape and is it safe?
From: NeilBrown @ 2014-08-14  7:30 UTC
To: Ram Ramesh; +Cc: linux-raid

On Thu, 14 Aug 2014 01:25:28 -0500 Ram Ramesh <rramesh2400@gmail.com> wrote:

> On 08/14/2014 12:56 AM, NeilBrown wrote:
> > On Thu, 14 Aug 2014 00:38:43 -0500 Ram Ramesh <rramesh2400@gmail.com> wrote:
> >
> >> I was browsing through the mdadm man pages to check out the --layout options
> >> when converting a 3-disk raid5 to a 4-disk raid6 and encountered the
> >> --freeze-reshape switch/arg. I did a quick Google search and could not find
> >> much info. Can a user issue this to suspend a reshape for a short while?
> > As --freeze-reshape is only meaningful in combination with --assemble,
> > this question doesn't really make sense.
> >
> > If you are using a sufficiently new kernel and mdadm, so that "data_offset" is
> > adjusted during reshapes and no 'backup' is needed, then you can
> > suspend a reshape for a period of time with:
> >
> >    echo frozen > /sys/block/mdXXX/md/sync_action
> >
> > This is perfectly safe. When you want to unfreeze, write 'idle'
> > to 'sync_action'. md will notice that a reshape is pending and will resume
> > from where it left off.
> >
> >> Specifically:
> >>
> >> 1. Is the use (or frequent use) of this switch safe? Recommended?
> >> 2. Can the array be mounted when this switch is used?
> >> 3. What is the correct syntax for its usage?
> >> 4. Can I use this to manage the reshape load on an array? Maybe to let
> >>    the disks cool off after busy hours of seeking during a reshape?
> >> 5. Can I use it as a safe method for shutting down the machine?
> >> 6. Is there a tutorial/FAQ/manual that explains in detail the use of
> >>    other esoteric mdadm switches (like --layout, which I was searching for)?
> > Is it really that esoteric?
> > If you want to reshape an array, you run "mdadm --grow" and list all the
> > changes you want to make. Set a new level, a new number of devices, a new
> > layout, a new chunk size, whatever. mdadm will do it if it can and give an
> > error if it cannot.
> > If you want to test it out first then that is extremely sensible. Make some
> > loop devices and experiment.
> >
> > NeilBrown
> Thanks. The name --freeze-reshape misled me into thinking that this is
> a request to stop a reshape, just like --fail is a request to mark a drive
> as failed. I used "esoteric" to mean not routinely used, or not interpretable
> from the plain English meaning of the switch/arg name.
>
> While I am at this, let me ask the --layout question as well. Does
> conversion from raid5 to raid6 do --layout=left-symmetric-6 first and
> then distribute Q in a second pass with --layout=left-symmetric? If
> not, will the reshape be faster if I do it in two phases?

When you convert a raid5 to a raid6 it will assume that an extra drive is
being added as well.
Firstly the array is instantaneously converted from an optimal RAID5 in
left-symmetric layout to a degraded RAID6 in left-symmetric-6 layout.

Then the reshape process is started, which reads each stripe in the
left-symmetric-6 layout and writes it back in the raid6:left-symmetric layout.

(If you specify a different number of final devices it all still works in one
pass, but the dance is more complex.)

If this is done without changing the data offset, then every stripe is
written on top of the old location of the same stripe, so if the host crashed
in the middle of the write, data would be lost.
So mdadm copies each stripe to a backup file before allowing the data to be
relocated. This causes a lot more IO than is required to move the data, but is
a lot safer.

With newer kernels (v3.5 and later) and mdadm (v3.3 and later) a reshape can
move the data_offset at the same time, so that it is only ever writing to an
unused area of the devices. This should be much faster.
However it requires that the data_offset is high enough that there is room to
move it backwards. mdadm 3.3 creates arrays with a reasonably large
data_offset. With arrays created earlier you might need to
 - shrink the filesystem
 - shrink the --size of the array

md can either increase or decrease the data offset.
The latter requires free space at the start of the array, so data_offset must
be large. The former requires free space at the end of the array, so size
must be less than the maximum. "mdadm --examine" will report "Unused space"
both "before" and "after", which indicates how far data_offset can be moved.
If either of these is larger than 1 chunk, then mdadm will make use of it.

To answer your question: there is no "second pass". The only way to make it
faster is to have a recent kernel and mdadm and make sure there is sufficient
unused space, either "before" or "after".

NeilBrown
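
The "Unused space" check described above can be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the sed pattern assumes the line looks like "Unused Space : before=N sectors, after=M sectors" (the exact wording may differ between mdadm versions), and sectors are taken to be 512 bytes, so a chunk size given in KiB is doubled to get sectors.

```shell
#!/bin/sh
# Sketch: decide whether data_offset has room to move during a reshape.

# Read `mdadm --examine` output on stdin and print "before after" in sectors.
# Assumed line format: "Unused Space : before=N sectors, after=M sectors".
unused_space() {
    sed -n 's/^ *Unused Space : before=\([0-9]*\) sectors, after=\([0-9]*\) sectors.*/\1 \2/p'
}

# Succeed if either unused area is larger than one chunk.
# Arguments: before_sectors after_sectors chunk_size_in_KiB
has_room_to_move() {
    chunk_sectors=$(( $3 * 2 ))   # 1 KiB = two 512-byte sectors
    [ "$1" -gt "$chunk_sectors" ] || [ "$2" -gt "$chunk_sectors" ]
}
```

A possible use, with /dev/sdb1 standing in for a real member device: `mdadm --examine /dev/sdb1 | unused_space` prints the two values, which can then be passed to `has_room_to_move` together with the array's chunk size in KiB.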

* Re: How to use --freeze-reshape and is it safe?
From: Ram Ramesh @ 2014-08-14 15:59 UTC
To: NeilBrown; +Cc: linux-raid

>
> When you convert a raid5 to a raid6 it will assume that an extra drive is
> being added as well.
> Firstly the array is instantaneously converted from an optimal RAID5 in
> left-symmetric layout to a degraded RAID6 in left-symmetric-6 layout.
>
> Then the reshape process is started, which reads each stripe in the
> left-symmetric-6 layout and writes it back in the raid6:left-symmetric layout.
>
> (If you specify a different number of final devices it all still works in one
> pass, but the dance is more complex.)
>
> If this is done without changing the data offset, then every stripe is
> written on top of the old location of the same stripe, so if the host crashed
> in the middle of the write, data would be lost.
> So mdadm copies each stripe to a backup file before allowing the data to be
> relocated. This causes a lot more IO than is required to move the data, but is
> a lot safer.
>
> With newer kernels (v3.5 and later) and mdadm (v3.3 and later) a reshape can
> move the data_offset at the same time, so that it is only ever writing to an
> unused area of the devices. This should be much faster.
> However it requires that the data_offset is high enough that there is room to
> move it backwards. mdadm 3.3 creates arrays with a reasonably large
> data_offset. With arrays created earlier you might need to
>  - shrink the filesystem
>  - shrink the --size of the array
>
> md can either increase or decrease the data offset.
> The latter requires free space at the start of the array, so data_offset must
> be large. The former requires free space at the end of the array, so size
> must be less than the maximum. "mdadm --examine" will report "Unused space"
> both "before" and "after", which indicates how far data_offset can be moved.
> If either of these is larger than 1 chunk, then mdadm will make use of it.
>
> To answer your question: there is no "second pass". The only way to make it
> faster is to have a recent kernel and mdadm and make sure there is sufficient
> unused space, either "before" or "after".
>
> NeilBrown

I may be wrong here, but wouldn't going back and forth on the same disk
make the operation slow? I mean, computing Q and distributing it will
require a read followed by a write to several disks, making seeks the
bottleneck. Would it not be better to first build Q on the new disk and do
the distribution later, since you may then be able to read multiple blocks,
parallelize reads, and combine writes?

I am not claiming deep knowledge of the disks' inner workings here. Just
bouncing thoughts.

Ramesh

* Re: How to use --freeze-reshape and is it safe?
From: Ethan Wilson @ 2014-08-14 13:51 UTC
To: linux-raid

On 14/08/2014 07:56, NeilBrown wrote:
> ....
>
> As --freeze-reshape is only meaningful in combination with --assemble,
> this question doesn't really make sense.
>
> If you are using a sufficiently new kernel and mdadm, so that "data_offset" is
> adjusted during reshapes and no 'backup' is needed, then you can
> suspend a reshape for a period of time with:
>
>    echo frozen > /sys/block/mdXXX/md/sync_action
>
> This is perfectly safe. When you want to unfreeze, write 'idle'
> to 'sync_action'. md will notice that a reshape is pending and will resume
> from where it left off.
>
> .....
>
> Is it really that esoteric?
> If you want to reshape an array, you run "mdadm --grow" and list all the
> changes you want to make. Set a new level, a new number of devices, a new
> layout, a new chunk size, whatever. mdadm will do it if it can and give an
> error if it cannot.
> If you want to test it out first then that is extremely sensible. Make some
> loop devices and experiment.

I was also interested in how reshape works, and it has never been completely
clear to me.

Can you confirm that the manpage entry for --freeze-reshape is correct?

    --freeze-reshape
        Option is intended to be used in start-up scripts during initrd
        boot phase. When array under reshape is assembled during initrd
        phase, this option stops reshape after reshape critical section
        is being restored. This happens before file system pivot
        operation and avoids loss of file system context. Losing file
        system context would cause reshape to be broken.

Does "would cause reshape to be broken" mean complete data loss?

So, to be safe in case of power loss during a reshape, do we have to make
absolutely sure that our Linux distribution implements --freeze-reshape
before pivot_root? I think Ubuntu doesn't do that. Debian probably doesn't
either.

Is the reshape operation supposed to be safe with regard to power loss
(supposing the distro implements --freeze-reshape)?

Is --freeze-reshape needed only when there is a backup file, or also with
newer kernels (>= 3.5) and mdadm (>= 3.3) and no backup file?

If mdadm lets me initiate a reshape without specifying a backup file, does
that mean it has checked that this is safe in my case?

Thank you
EW
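
For reference, the two-stage sequence that the man page excerpt implies might look like the sketch below. It illustrates syntax only and does not answer the safety questions raised above, nor does it describe what any distribution's initrd actually does. The function names and the dry-run wrapper are hypothetical. The use of "mdadm --grow --continue" to restart a reshape left frozen at assembly follows the mdadm man page; if a backup file was in use, it may need to be passed again with --backup-file.

```shell
#!/bin/sh
# Sketch: assemble with the reshape frozen in early boot, continue it later.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in a dry run, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Early boot (initrd): assemble the arrays but leave any reshape frozen.
initrd_assemble() {
    run mdadm --assemble --scan --freeze-reshape
}

# After the switch to the real root filesystem: let the reshape continue.
# $1 is the array device, for example /dev/md0.
resume_after_boot() {
    run mdadm --grow --continue "$1"
}
```

With the defaults, `initrd_assemble` and `resume_after_boot /dev/md0` just print the two mdadm invocations.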

Thread overview: 6+ messages (newest: 2014-08-14 15:59 UTC)
2014-08-14  5:38 How to use --freeze-reshape and is it safe? Ram Ramesh
2014-08-14  5:56 ` NeilBrown
2014-08-14  6:25   ` Ram Ramesh
2014-08-14  7:30     ` NeilBrown
2014-08-14 15:59       ` Ram Ramesh
2014-08-14 13:51   ` Ethan Wilson