linux-btrfs.vger.kernel.org archive mirror
* btrfs send extremely slow (almost stuck)
@ 2016-08-28  3:38 Oliver Freyermuth
  2016-08-28  7:53 ` Duncan
  2016-08-29  2:11 ` Qu Wenruo
  0 siblings, 2 replies; 19+ messages in thread
From: Oliver Freyermuth @ 2016-08-28  3:38 UTC (permalink / raw)
  To: linux-btrfs

Dear btrfs experts, 

I just tried to use btrfs send / receive for incremental backups (using btrbk to simplify the process). 
On both of my machines, btrfs send almost stalls after transferring a few GiB. It does not halt completely, but instead of making full use of the available I/O, I get less than 500 KiB/s on average,
made up of a few "full speed spikes" with many seconds or minutes of no I/O in between. 
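
To illustrate how a few full-speed bursts separated by long stalls lead to such a low average, here is a minimal Python sketch. The `summarize` helper, the stall threshold, and the sample numbers are hypothetical and are not measurements from the machines described here.

```python
def summarize(samples, stall_threshold=1024):
    """Summarize bursty throughput.

    samples: list of (interval_seconds, bytes_transferred) tuples.
    stall_threshold: bytes/s below which an interval counts as stalled
                     (an arbitrary value chosen for this illustration).
    """
    total_time = sum(t for t, _ in samples)
    if total_time <= 0:
        raise ValueError("samples must cover a positive amount of time")
    total_bytes = sum(b for _, b in samples)
    stalled_time = sum(t for t, b in samples if t > 0 and b / t < stall_threshold)
    return {
        "avg_kib_per_s": total_bytes / total_time / 1024,
        "stalled_fraction": stalled_time / total_time,
    }


if __name__ == "__main__":
    # Hypothetical pattern: one second at 48 MiB/s, then 99 seconds of no I/O.
    example = [(1.0, 48 * 1024 * 1024), (99.0, 0)]
    print(summarize(example))
```

With a pattern like this, the average falls below 500 KiB/s even though the device briefly runs at full speed.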

During this "halting", btrfs send eats one full CPU core. 
A "perf top" shows this is spent in "find_parent_nodes" and "__merge_refs" inside the kernel. 
I am using btrfs-progs 4.7 and kernel 4.7.0. 

I searched a bit and found a related patch on patchwork (https://patchwork.kernel.org/patch/9238987/), which seems to work around the high load in this area and mentions that a real solution has been proposed but is not yet available. 

Since this affects two of my machines, and backing up my root volume would take about 80 hours if I extrapolate from the average rate, btrfs send is unusable for me. 
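
As a rough check of that extrapolation, here is a minimal Python sketch of the arithmetic. The volume size is not stated in this message, so the sketch only works backwards from the two figures given (about 80 hours, less than 500 KiB/s). The function names are made up for illustration.

```python
KIB_PER_GIB = 1024 * 1024


def estimated_hours(size_gib, rate_kib_per_s):
    """Hours needed to transfer size_gib at a constant average rate."""
    return size_gib * KIB_PER_GIB / rate_kib_per_s / 3600


def implied_size_gib(hours, rate_kib_per_s):
    """Amount of data (GiB) moved in the given time at the given rate."""
    return hours * 3600 * rate_kib_per_s / KIB_PER_GIB


if __name__ == "__main__":
    # 500 KiB/s is the upper bound quoted above, so this is an upper bound too.
    print(f"{implied_size_gib(80, 500):.1f} GiB")
```

At 500 KiB/s, 80 hours corresponds to roughly 137 GiB. The observed rate was lower than that, so the volume in question would be somewhat smaller.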

Can I assume this is a common issue which will be fixed in a later kernel release (4.8, 4.9), or is there something I can do to my filesystems to work around this issue? 

One FS is only two weeks old, the other is now about one year old. I ran a balance from time to time to get more unallocated space for trimming,
and used duperemove regularly to free space. One FS has skinny extents, the other does not. 

Mount options are "rw,noatime,compress=zlib,ssd,space_cache,commit=120". 

Apart from that: No RAID or any other special configuration involved. 

Cheers and any help appreciated, 
	Oliver

* Re: btrfs send extremely slow (almost stuck)
@ 2016-08-28 16:15 Oliver Freyermuth
  2016-08-28 21:41 ` james harvey
  0 siblings, 1 reply; 19+ messages in thread
From: Oliver Freyermuth @ 2016-08-28 16:15 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Oliver Freyermuth

(sorry if my Message-ID header is missing, I am not subscribed to the mailing list, 
so I reply using mail-archive)

> So a workaround would be reducing your duperemove usage and possibly 
> rewriting (for instance via defrag) the deduped files to kill the 
> multiple reflinks.  Or simply delete the additional reflinked copies, if 
> your use-case allows it.

Sadly, I need the extra space (that's why I was using duperemove in the first place)
and cannot delete all duplicated copies. These are mainly several checkouts of different repositories
with partially common (and partially large binary) content. 

> And thin down your snapshot retention if you have many snapshots per 
> subvolume.  With the geometric scaling issues, thinning to under 300 per 
> subvolume should be quite reasonable in nearly all circumstances, and 
> thinning to under 100 per subvolume may be possible and should result in 
> dramatically reduced scaling issues.

In addition, I have only ~ 5 snapshots for each of those volumes, which should certainly not be too many. 


So in short, this just means btrfs send is (still) unusable
for filesystems which rely on the offline dedupe feature (in the past, 'btrfs send' was broken
after dedupe, which got fixed; now it is just extremely slow). 


For me, this means I have to stay with rsync backups, which are sadly incomplete since special FS attrs
like "C" for nocow are not backed up. 
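
One way to narrow that gap is to record which files carry the "C" attribute before the rsync run. The Python sketch below parses the text printed by `lsattr -R` and returns the paths flagged nocow. The `nocow_paths` helper and the sample listing are hypothetical, and the parsing assumes the usual "flags, whitespace, path" layout of lsattr output.

```python
import string

# Characters that may appear in the flags column of lsattr output.
_FLAG_CHARS = set(string.ascii_letters + "-")


def nocow_paths(lsattr_output):
    """Return paths whose lsattr flags contain uppercase 'C' (No_COW)."""
    paths = []
    for line in lsattr_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank lines and "dir:" headers from -R
        flags, path = parts
        if not set(flags) <= _FLAG_CHARS:
            continue  # not a flags column, e.g. a header containing spaces
        if "C" in flags:  # lowercase 'c' would be the compression flag
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Made-up listing for illustration only.
    sample = (
        "---------------C------ ./vm/disk.img\n"
        "---------------------- ./notes.txt\n"
    )
    print(nocow_paths(sample))
```

The resulting list could then be used to set the attribute again on the destination. Note that on btrfs the "C" attribute is only reliable when set on new or empty files, so it would have to be applied before the data is written.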


Cheers and thanks for your reply, 
	Oliver

* Re: btrfs send extremely slow (almost stuck)
@ 2017-04-14 15:33 J. Hart
  0 siblings, 0 replies; 19+ messages in thread
From: J. Hart @ 2017-04-14 15:33 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo, o.freyermuth

On 30.08.2016 at 02:48, Qu Wenruo wrote:
 > Not the first, but although still few.
 > There is a xfstest case submitted for it, and even before the test
 > case, there are already report from IRC.
 > Anyway, I'll add Cc for you after the new IRC patch is out.

Please count me in.

I see this when backing up a file server I use to hold 
reflinked incrementals from client machines.  Backing up from clients to 
server is very quick (mere seconds, no incrementals there), but backing up 
the server volume itself is very slow, even with limited changes.  
With clone detection enabled, that backup takes nearly seven hours.  
Sending a complete volume to a blank filesystem (so no reflinks are 
present at the destination) takes only a few minutes.

Many thanks to Hermann Schwarzler whose suggestion led me onto this.

J. Hart




Thread overview: 19+ messages
2016-08-28  3:38 btrfs send extremely slow (almost stuck) Oliver Freyermuth
2016-08-28  7:53 ` Duncan
2016-08-29  2:11 ` Qu Wenruo
2016-08-29  2:12   ` Qu Wenruo
2016-08-31  1:35     ` Jeff Mahoney
2016-08-31  1:54       ` Qu Wenruo
2016-08-29 10:02   ` Oliver Freyermuth
2016-08-30  0:48     ` Qu Wenruo
2016-09-04 21:41       ` Oliver Freyermuth
2016-09-05  5:21         ` Qu Wenruo
2016-09-05 21:29           ` Oliver Freyermuth
2016-09-06  2:13             ` Duncan
2016-09-06 22:24               ` Oliver Freyermuth
2016-09-06  2:46             ` Qu Wenruo
2016-09-06 21:53               ` Oliver Freyermuth
  -- strict thread matches above, loose matches on Subject: below --
2016-08-28 16:15 Oliver Freyermuth
2016-08-28 21:41 ` james harvey
2016-08-29 17:50   ` Kai Krakow
2017-04-14 15:33 J. Hart
