* Is "btrfs balance start" truly asynchronous?
From: Dmitry Katsubo @ 2016-06-20 16:33 UTC
To: linux-btrfs

Dear btrfs community,

I have added a drive to an existing raid1 btrfs volume and decided to
perform balancing so that data is distributed "fairly" among the drives.
I started "btrfs balance start", but it blocked for about 5-10 minutes
while intensively doing the work. After that it printed something like
"had to relocate 50 chunks" and exited. According to drive I/O,
"btrfs balance" did most (if not all) of the work, so by the time it
exited the job was done.

Shouldn't "btrfs balance start" do the operation in the background?

Thanks for any information.

-- 
With best regards,
Dmitry

* Re: Is "btrfs balance start" truly asynchronous?
From: Duncan @ 2016-06-21 8:55 UTC
To: linux-btrfs

Dmitry Katsubo posted on Mon, 20 Jun 2016 18:33:54 +0200 as excerpted:

> Dear btrfs community,
>
> I have added a drive to an existing raid1 btrfs volume and decided to
> perform balancing so that data is distributed "fairly" among the drives.
> I started "btrfs balance start", but it blocked for about 5-10 minutes
> while intensively doing the work. After that it printed something like
> "had to relocate 50 chunks" and exited. According to drive I/O,
> "btrfs balance" did most (if not all) of the work, so by the time it
> exited the job was done.
>
> Shouldn't "btrfs balance start" do the operation in the background?

From the btrfs-balance (8) manpage (from btrfs-progs-4.5.3):

    start [options] <path>
        start the balance operation according to the specified filters,
        no filters will rewrite the entire filesystem. The process runs
        in the foreground.

So the balance start operation runs in the foreground, but as explained
elsewhere in the manpage, the balance is interruptible by unmount and
will automatically restart after a remount. It can also be paused and
resumed or canceled with the appropriate btrfs balance subcommands.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
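
As a minimal sketch of the subcommands mentioned above, the wrapper below dispatches to "btrfs balance pause", "resume", "cancel" and "status". The function name balance_ctl and the BTRFS override variable are illustrative assumptions, not part of btrfs-progs, and the real commands need root and a mounted btrfs filesystem.

```shell
#!/bin/sh
# BTRFS may be overridden (for example BTRFS=echo) to dry-run this sketch.
# The variable is a hypothetical convenience, not a btrfs-progs feature.
BTRFS="${BTRFS:-btrfs}"

# balance_ctl <pause|resume|cancel|status> <mountpoint>
# Hypothetical thin wrapper around the existing balance subcommands.
balance_ctl() {
    action=$1
    mnt=$2
    case "$action" in
        pause|resume|cancel|status)
            "$BTRFS" balance "$action" "$mnt"
            ;;
        *)
            echo "usage: balance_ctl pause|resume|cancel|status <mountpoint>" >&2
            return 2
            ;;
    esac
}
```

For example, running "balance_ctl status /mnt" from a second terminal should report progress while "btrfs balance start /mnt" is still blocking in the first (/mnt is a placeholder mount point).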

* Re: Is "btrfs balance start" truly asynchronous?
From: Austin S. Hemmelgarn @ 2016-06-21 11:24 UTC
To: Duncan, linux-btrfs

On 2016-06-21 04:55, Duncan wrote:
> From the btrfs-balance (8) manpage (from btrfs-progs-4.5.3):
>
>     start [options] <path>
>         start the balance operation according to the specified filters,
>         no filters will rewrite the entire filesystem. The process runs
>         in the foreground.
>
> So the balance start operation runs in the foreground, but as explained
> elsewhere in the manpage, the balance is interruptible by unmount and
> will automatically restart after a remount. It can also be paused and
> resumed or canceled with the appropriate btrfs balance subcommands.

FWIW, there was some talk a while back about possibly providing an
option to run balance in the background. If I end up finding the
time, I may write a patch for this (userland only, I'm not
interested in mucking around with the kernel side of things, and
it's fully possible to do this just using libc functions), as it's
something I'd rather like to have myself, as the current method of
using job control in a shell doesn't really work in some
circumstances (for example, you can't easily start a balance on a
remote system via a ssh command, which is the specific use case I
have).

* Re: Is "btrfs balance start" truly asynchronous?
From: Hugo Mills @ 2016-06-21 11:33 UTC
To: Austin S. Hemmelgarn
Cc: Duncan, linux-btrfs

On Tue, Jun 21, 2016 at 07:24:24AM -0400, Austin S. Hemmelgarn wrote:
> FWIW, there was some talk a while back about possibly providing an
> option to run balance in the background. If I end up finding the
> time, I may write a patch for this (userland only, I'm not
> interested in mucking around with the kernel side of things, and
> it's fully possible to do this just using libc functions), as it's
> something I'd rather like to have myself, as the current method of
> using job control in a shell doesn't really work in some
> circumstances (for example, you can't easily start a balance on a
> remote system via a ssh command, which is the specific use case I
> have).

   There's quite a bit of infrastructure in the userspace tools to
deal with managing an asynchronous scrub. It would probably be worth
looking at that in the first instance to see if it can be reused for
balance.

   Hugo.

-- 
Hugo Mills | hugo@... carfax.org.uk | http://carfax.org.uk/ | PGP: E2AB1DE4

* Re: Is "btrfs balance start" truly asynchronous?
From: Austin S. Hemmelgarn @ 2016-06-21 11:51 UTC
To: Hugo Mills, Duncan, linux-btrfs

On 2016-06-21 07:33, Hugo Mills wrote:
>    There's quite a bit of infrastructure in the userspace tools to
> deal with managing an asynchronous scrub. It would probably be worth
> looking at that in the first instance to see if it can be reused for
> balance.

Yeah, but we've already got most of what's needed for an asynchronous
balance. The kernel itself functionally mutexes balances (at least, I'm
pretty certain it does), we already store state in the filesystem itself
(so that it can be auto-resumed on remount), and we already have
commands for pausing, resuming, canceling, and checking status. The
only thing that appears to be missing is the ability to have the balance
backgrounded by the tools themselves instead of needing POSIX sh job
control or something to daemonize it.

The scrub design works, but the whole state file thing has some rather
irritating side effects and other implications, and developed out of
requirements that aren't present for balance (it might be nice to check
how many chunks actually got balanced after the fact, but it's not
absolutely necessary).
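
The shell job-control approach referred to above might look like the following sketch. It starts "btrfs balance start" as a background job with nohup and returns immediately, leaving "btrfs balance status" as the way to check on it later. The function name balance_background, its argument order and the BTRFS override variable are assumptions made for illustration only.

```shell
#!/bin/sh
# BTRFS may be overridden (for example BTRFS=echo) to dry-run this sketch.
BTRFS="${BTRFS:-btrfs}"

# balance_background <mountpoint> <logfile> [balance start options...]
# Hypothetical helper: runs the balance as a detached background job and
# prints the PID of the backgrounded process.
balance_background() {
    mnt=$1
    log=$2
    shift 2
    # nohup plus the redirections detach the job from the terminal, so the
    # calling shell can exit while the balance keeps running.
    nohup "$BTRFS" balance start "$@" "$mnt" </dev/null >"$log" 2>&1 &
    echo "$!"
}
```

A call such as "balance_background /mnt /tmp/balance.log -dusage=50" (placeholder paths and filter) returns at once; the log file later holds whatever "btrfs balance start" printed. This is still shell-level backgrounding, which is exactly the limitation being discussed, not a replacement for support in the tools themselves.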

* Re: Is "btrfs balance start" truly asynchronous?
From: Graham Cobb @ 2016-06-21 13:17 UTC
To: linux-btrfs

On 21/06/16 12:51, Austin S. Hemmelgarn wrote:
> The scrub design works, but the whole state file thing has some rather
> irritating side effects and other implications, and developed out of
> requirements that aren't present for balance (it might be nice to check
> how many chunks actually got balanced after the fact, but it's not
> absolutely necessary).

Actually, that would be **really** useful. I have been experimenting
with cancelling balances after a certain time (as part of my
"balance-slowly" script). I have got it working, just using bash
scripting, but it means my script does not know whether any work has
actually been done by the balance run which was cancelled (if no work
was done, but it timed out anyway, there is probably no point trying
again with the same timeout later!).

Graham
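
A time-limited balance of the kind described above could be sketched as follows. This is not the "balance-slowly" script, which is not shown in the thread; the function name balance_with_timeout and the BTRFS override variable are hypothetical. Because, as noted elsewhere in this thread, signals sent to the balance process are ignored by the kernel, the sketch asks for cancellation through "btrfs balance cancel" rather than killing the process.

```shell
#!/bin/sh
# BTRFS may be overridden with a stub to dry-run this sketch.
BTRFS="${BTRFS:-btrfs}"

# balance_with_timeout <seconds> <mountpoint> [balance start options...]
# Hypothetical helper: runs a balance and cancels it after <seconds>.
balance_with_timeout() {
    limit=$1
    mnt=$2
    shift 2

    # Run the balance as a background job of this shell.
    "$BTRFS" balance start "$@" "$mnt" &
    bal_pid=$!

    # Watchdog: once the limit expires, request cancellation through the
    # dedicated subcommand instead of signalling the balance process.
    ( sleep "$limit" >/dev/null 2>&1 && "$BTRFS" balance cancel "$mnt" ) &
    watchdog_pid=$!

    # Block until "balance start" returns, whether finished or cancelled.
    status=0
    wait "$bal_pid" || status=$?

    # Stop the watchdog if it is still sleeping; ignore "already gone".
    # (The sleep child itself may linger until its timer runs out.)
    kill "$watchdog_pid" 2>/dev/null || true
    wait "$watchdog_pid" 2>/dev/null || true

    return "$status"
}
```

For instance, "balance_with_timeout 1800 /mnt -dusage=50" would allow at most 30 minutes (values are placeholders). The sketch shares the limitation raised in this message: it cannot tell how much work the cancelled run actually completed.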

* Re: Is "btrfs balance start" truly asynchronous?
From: Lionel Bouton @ 2016-06-21 13:44 UTC
To: Graham Cobb, linux-btrfs

On 21/06/2016 15:17, Graham Cobb wrote:
> On 21/06/16 12:51, Austin S. Hemmelgarn wrote:
>> The scrub design works, but the whole state file thing has some rather
>> irritating side effects and other implications, and developed out of
>> requirements that aren't present for balance (it might be nice to check
>> how many chunks actually got balanced after the fact, but it's not
>> absolutely necessary).
> Actually, that would be **really** useful. I have been experimenting
> with cancelling balances after a certain time (as part of my
> "balance-slowly" script). I have got it working, just using bash
> scripting, but it means my script does not know whether any work has
> actually been done by the balance run which was cancelled (if no work
> was done, but it timed out anyway, there is probably no point trying
> again with the same timeout later!).

I have the exact same use case.

To prevent possible ENOSPC events, we trigger balances when we detect
that the free space is mostly allocated but unused. A balance on busy
disks can slow other I/O, so we try to limit its duration (in our use
case 15 to 30 min max is mostly OK). Trying to emulate this by using
[d|v]range was a possibility too, but I thought it could be hard to get
right. We actually inspect the allocated space before and after to
report the difference, but we don't know whether this difference is
caused by the aborted balance or by other activity (we have to read the
kernel logs to find out).

Lionel
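
The before/after comparison described above could be approximated as in the sketch below. It is not the script used by the poster. It assumes that "btrfs filesystem usage -b <path>" prints a "Device allocated:" line whose third whitespace-separated field is a byte count; the function names and the BTRFS override variable are hypothetical.

```shell
#!/bin/sh
# BTRFS may be overridden with a stub to dry-run this sketch.
BTRFS="${BTRFS:-btrfs}"

# Read "btrfs filesystem usage -b" output on stdin and print the
# "Device allocated" value (assumes that label and layout are present).
allocated_bytes() {
    awk '/Device allocated:/ { print $3; exit }'
}

# allocation_delta <mountpoint> <command...>
# Hypothetical helper: runs <command...> and prints how many bytes of
# allocation were released in the meantime (negative if it grew).
allocation_delta() {
    mnt=$1
    shift
    before=$("$BTRFS" filesystem usage -b "$mnt" | allocated_bytes)
    "$@" || true    # the wrapped command may fail; measure afterwards anyway
    after=$("$BTRFS" filesystem usage -b "$mnt" | allocated_bytes)
    echo $((before - after))
}
```

As the message points out, the resulting number only says that allocation changed while the command ran; it cannot attribute the change to the balance rather than to other activity on the filesystem.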

* Re: Is "btrfs balance start" truly asynchronous?
From: Dmitry Katsubo @ 2016-06-21 23:30 UTC
To: linux-btrfs

On 2016-06-21 15:17, Graham Cobb wrote:
> On 21/06/16 12:51, Austin S. Hemmelgarn wrote:
>> The scrub design works, but the whole state file thing has some rather
>> irritating side effects and other implications, and developed out of
>> requirements that aren't present for balance (it might be nice to check
>> how many chunks actually got balanced after the fact, but it's not
>> absolutely necessary).
>
> Actually, that would be **really** useful. I have been experimenting
> with cancelling balances after a certain time (as part of my
> "balance-slowly" script). I have got it working, just using bash
> scripting, but it means my script does not know whether any work has
> actually been done by the balance run which was cancelled (if no work
> was done, but it timed out anyway, there is probably no point trying
> again with the same timeout later!).

Additionally, it would be nice if balance/scrub reported its status via
/proc in a human-readable manner (similar to /proc/mdstat).

-- 
With best regards,
Dmitry

* Re: Is "btrfs balance start" truly asynchronous?
From: Zygo Blaxell @ 2016-06-21 12:19 UTC
To: Austin S. Hemmelgarn
Cc: Duncan, linux-btrfs

On Tue, Jun 21, 2016 at 07:24:24AM -0400, Austin S. Hemmelgarn wrote:
> (for example, you can't easily start a balance on a remote
> system via a ssh command, which is the specific use case I have).

Wait, what?

	ssh remotehost -n btrfs balance start -d... -m... /foo \&

or

	ssh remotehost -f btrfs balance start -d... -m... /foo

It even works with systemd's auto-kill feature (send btrfs balance all
the SIGKILLs you want, the kernel just ignores them).
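
The first command above can be wrapped in a small helper, sketched below under stated assumptions. The function names and the SSH override variable are hypothetical, and the added nohup and redirections are a common precaution for letting the ssh session return, not something the message itself prescribes. Options containing spaces are not handled.

```shell
#!/bin/sh
# SSH may be overridden with a stub to dry-run this sketch.
SSH="${SSH:-ssh}"

# remote_balance_start <host> <mountpoint> [balance start options...]
# Hypothetical helper: starts a balance on <host> and returns at once.
remote_balance_start() {
    host=$1
    mnt=$2
    shift 2
    # -n keeps ssh from reading stdin; the remote shell backgrounds the job
    # and detaches its stdio so the ssh session does not wait for it.
    "$SSH" -n "$host" "nohup btrfs balance start $* $mnt </dev/null >/dev/null 2>&1 &"
}

# remote_balance_status <host> <mountpoint>
# Queries the progress of a balance started earlier.
remote_balance_status() {
    "$SSH" "$1" "btrfs balance status $2"
}
```

With the placeholders from the message, "remote_balance_start remotehost /foo -dusage=50" would start the balance and "remote_balance_status remotehost /foo" would check on it afterwards.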