* Slow snapshot deletion
@ 2011-07-28 20:04 Bruce Guenter
2011-07-28 20:28 ` Chris Mason
0 siblings, 1 reply; 12+ messages in thread
From: Bruce Guenter @ 2011-07-28 20:04 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 1227 bytes --]
At work we have a backup server system running btrfs. The main backup
pool is a 1.3TB partition (on LVM). Every night, a series of backups
are done with rsync, with each backup onto a separate subvolume, and a
snapshot made of each backup. Compression is used to minimize disk
space used.
When the system was first provisioned, all the operations ran
impressively fast, but now it is almost unusably slow. It has taken 2
days to delete snapshots that recovered 30GB of space, and some of the
backups are taking longer than 24 hours to complete.
Is there any way to monitor the progress of the snapshot deletions?
What could be causing it to run so slow?
What could we do to speed things up?
Kernel is 2.6.39-gentoo-r3. It was last rebooted 2 days ago, and
rebooting does not seem to improve the situation. Mount options
according to /proc are "rw,nosuid,noatime,nodiratime,compress=zlib,noacl"
btrfs filesystem df shows:
Data: total=1.15TB, used=1.10TB
System, DUP: total=8.00MB, used=184.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=49.38GB, used=31.59GB
Metadata: total=8.00MB, used=0.00
--
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-07-28 20:04 Slow snapshot deletion Bruce Guenter
@ 2011-07-28 20:28 ` Chris Mason
2011-07-28 20:51 ` Chris Mason
2011-08-01 22:26 ` Bruce Guenter
0 siblings, 2 replies; 12+ messages in thread
From: Chris Mason @ 2011-07-28 20:28 UTC (permalink / raw)
To: Bruce Guenter; +Cc: linux-btrfs
Excerpts from Bruce Guenter's message of 2011-07-28 16:04:45 -0400:
>
> At work we have a backup server system running btrfs. The main backup
> pool is a 1.3TB partition (on LVM). Every night, a series of backups
> are done with rsync, with each backup onto a separate subvolume, and a
> snapshot made of each backup. Compression is used to minimize disk
> space used.
>
> When the system was first provisioned, all the operations ran
> impressively fast, but now it is almost unusably slow. It has taken 2
> days to delete snapshots that recovered 30GB of space, and some of the
> backups are taking longer than 24 hours to complete.
>
> Is there any way to monitor the progress of the snapshot deletions?
>
> What could be causing it to run so slow?
>
> What could we do to speed things up?
>
> Kernel is 2.6.39-gentoo-r3. It was last rebooted 2 days ago, and
> rebooting does not seem to improve the situation. Mount options
> according to /proc are "rw,nosuid,noatime,nodiratime,compress=zlib,noacl"
> btrfs filesystem df shows:
>
> Data: total=1.15TB, used=1.10TB
> System, DUP: total=8.00MB, used=184.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=49.38GB, used=31.59GB
> Metadata: total=8.00MB, used=0.00
>
The slow performance is probably coming from reading in the metadata
associated with the snapshot extents. The new readahead extentions from
Arne should help once we've adapted them to it. The easiest way to make
sure is to hit sysrq-w a few times while it is slow (or run a patched
latencytop).
-chris
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-07-28 20:28 ` Chris Mason
@ 2011-07-28 20:51 ` Chris Mason
2011-08-04 16:35 ` Bruce Guenter
2011-08-01 22:26 ` Bruce Guenter
1 sibling, 1 reply; 12+ messages in thread
From: Chris Mason @ 2011-07-28 20:51 UTC (permalink / raw)
To: Chris Mason; +Cc: Bruce Guenter, linux-btrfs
Excerpts from Chris Mason's message of 2011-07-28 16:28:04 -0400:
> Excerpts from Bruce Guenter's message of 2011-07-28 16:04:45 -0400:
> >
> > At work we have a backup server system running btrfs. The main backup
> > pool is a 1.3TB partition (on LVM). Every night, a series of backups
> > are done with rsync, with each backup onto a separate subvolume, and a
> > snapshot made of each backup. Compression is used to minimize disk
> > space used.
> >
> > When the system was first provisioned, all the operations ran
> > impressively fast, but now it is almost unusably slow. It has taken 2
> > days to delete snapshots that recovered 30GB of space, and some of the
> > backups are taking longer than 24 hours to complete.
> >
> > Is there any way to monitor the progress of the snapshot deletions?
> >
> > What could be causing it to run so slow?
> >
> > What could we do to speed things up?
> >
> > Kernel is 2.6.39-gentoo-r3. It was last rebooted 2 days ago, and
> > rebooting does not seem to improve the situation. Mount options
> > according to /proc are "rw,nosuid,noatime,nodiratime,compress=zlib,noacl"
> > btrfs filesystem df shows:
> >
> > Data: total=1.15TB, used=1.10TB
> > System, DUP: total=8.00MB, used=184.00KB
> > System: total=4.00MB, used=0.00
> > Metadata, DUP: total=49.38GB, used=31.59GB
> > Metadata: total=8.00MB, used=0.00
> >
>
> The slow performance is probably coming from reading in the metadata
> associated with the snapshot extents. The new readahead extentions from
> Arne should help once we've adapted them to it. The easiest way to make
> sure is to hit sysrq-w a few times while it is slow (or run a patched
> latencytop).
Sorry, hit send too soon. Here is the latencytop patch, after you
recompile run latencytop -c for a few minutes. Send the output here.
http://oss.oracle.com/~mason/latencytop.patch
-chris
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-07-28 20:28 ` Chris Mason
2011-07-28 20:51 ` Chris Mason
@ 2011-08-01 22:26 ` Bruce Guenter
2011-08-01 22:41 ` cwillu
1 sibling, 1 reply; 12+ messages in thread
From: Bruce Guenter @ 2011-08-01 22:26 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 894 bytes --]
On Thu, Jul 28, 2011 at 04:28:04PM -0400, Chris Mason wrote:
> The slow performance is probably coming from reading in the metadata
> associated with the snapshot extents. The new readahead extentions from
> Arne should help once we've adapted them to it. The easiest way to make
> sure is to hit sysrq-w a few times while it is slow (or run a patched
> latencytop).
We shut down the rsync backups for now, and are waiting for the
snapshots to complete deletion so we can unmount the filesystem to
reboot (missing kernel options for latencytop). In the mean time I hit
SysRq-W quite a number of times, and nothing is showing up in kernel
messages. However, btrfs-cleaner and/or btrfs-transaction (sometimes
both, sometimes just one) have been running high in top, taking up all
the CPU time.
--
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-08-01 22:26 ` Bruce Guenter
@ 2011-08-01 22:41 ` cwillu
2011-08-01 22:59 ` Bruce Guenter
0 siblings, 1 reply; 12+ messages in thread
From: cwillu @ 2011-08-01 22:41 UTC (permalink / raw)
To: linux-btrfs
On Mon, Aug 1, 2011 at 4:26 PM, Bruce Guenter <bruce@untroubled.org> wr=
ote:
> On Thu, Jul 28, 2011 at 04:28:04PM -0400, Chris Mason wrote:
>> The slow performance is probably coming from reading in the metadata
>> associated with the snapshot extents. =C2=A0The new readahead extent=
ions from
>> Arne should help once we've adapted them to it. =C2=A0The easiest wa=
y to make
>> sure is to hit sysrq-w a few times while it is slow (or run a patche=
d
>> latencytop).
>
> We shut down the rsync backups for now, and are waiting for the
> snapshots to complete deletion so we can unmount the filesystem to
> reboot (missing kernel options for latencytop). =C2=A0In the mean tim=
e I hit
> SysRq-W quite a number of times, and nothing is showing up in kernel
The keystroke should be alt-sysrq-w.
> messages. =C2=A0However, btrfs-cleaner and/or btrfs-transaction (some=
times
> both, sometimes just one) have been running high in top, taking up al=
l
> the CPU time.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" =
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-08-01 22:41 ` cwillu
@ 2011-08-01 22:59 ` Bruce Guenter
2011-08-02 0:29 ` Chris Mason
0 siblings, 1 reply; 12+ messages in thread
From: Bruce Guenter @ 2011-08-01 22:59 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 409 bytes --]
On Mon, Aug 01, 2011 at 04:41:10PM -0600, cwillu wrote:
> The keystroke should be alt-sysrq-w.
Right. What I meant is that no backtraces are showing up. I was
actually using
echo w > /proc/sysrq-trigger
and all I was seeing is:
SysRq : Show Blocked State
task PC stack pid father
--
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-08-01 22:59 ` Bruce Guenter
@ 2011-08-02 0:29 ` Chris Mason
0 siblings, 0 replies; 12+ messages in thread
From: Chris Mason @ 2011-08-02 0:29 UTC (permalink / raw)
To: Bruce Guenter; +Cc: linux-btrfs
Excerpts from Bruce Guenter's message of 2011-08-01 18:59:15 -0400:
> On Mon, Aug 01, 2011 at 04:41:10PM -0600, cwillu wrote:
> > The keystroke should be alt-sysrq-w.
>
> Right. What I meant is that no backtraces are showing up. I was
> actually using
>
> echo w > /proc/sysrq-trigger
>
> and all I was seeing is:
>
> SysRq : Show Blocked State
> task PC stack pid father
>
# once per boot, run dmesg -n8
dmesg -n8
echo w > /proc/sysrq-trigger
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-07-28 20:51 ` Chris Mason
@ 2011-08-04 16:35 ` Bruce Guenter
2011-08-11 15:04 ` Bruce Guenter
0 siblings, 1 reply; 12+ messages in thread
From: Bruce Guenter @ 2011-08-04 16:35 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 4176 bytes --]
On Thu, Jul 28, 2011 at 04:51:24PM -0400, Chris Mason wrote:
> Sorry, hit send too soon. Here is the latencytop patch, after you
> recompile run latencytop -c for a few minutes. Send the output here.
>
> http://oss.oracle.com/~mason/latencytop.patch
Ok, I ran it for more than a few minutes. Early on (shortly after boot), the output shows this for a long time:
=============== Mon Aug 1 21:09:30 2011
Globals: Cause Maximum Percentage
Process details:
Process ksoftirqd/0 (3) Total: 3.9 msec
[run_ksoftirqd] 3.9 msec 100.0 %
run_ksoftirqd kthread kernel_thread_helper
Process kworker/1:1 (983) Total: 31.2 msec
. 3.9 msec 100.0 %
worker_thread kthread kernel_thread_helper
Process sshd (3948) Total: 3.9 msec
Waiting for event (select) 3.9 msec 100.0 %
poll_schedule_timeout do_select core_sys_select sys_select
system_call_fastpath
Process btrfs-cleaner (4345) Total: 2433.6 msec
[sleep_on_page] 31.3 msec 100.0 %
sleep_on_page wait_on_page_bit read_extent_buffer_pages
btree_read_extent_buffer_pages read_tree_block
read_block_for_search btrfs_search_slot
lookup_inline_extent_backref __btrfs_free_extent
run_clustered_refs btrfs_run_delayed_refs __btrfs_end_transaction
Process btrfs-transacti (4346) Total: 2437.5 msec
[sleep_on_page] 27.3 msec 100.0 %
sleep_on_page wait_on_page_bit read_extent_buffer_pages
btree_read_extent_buffer_pages read_tree_block
read_block_for_search btrfs_search_slot
lookup_inline_extent_backref __btrfs_free_extent
run_clustered_refs btrfs_run_delayed_refs btrfs_commit_transaction
This repeats identically (all numbers and backtrace is very stable) for
a long time. Much later, I see this:
=============== Tue Aug 2 13:46:29 2011
Globals: Cause Maximum Percentage
Process details:
Process btrfs-submit-0 (4332) Total: 199.2 msec
Creating block layer request 58.6 msec 100.0 %
get_request_wait __make_request generic_make_request
submit_bio run_scheduled_bios pending_bios_fn worker_loop
kthread kernel_thread_helper
Process btrfs-cleaner (4345) Total: 3.9 msec
Creating block layer request 3.9 msec 100.0 %
get_request_wait __make_request generic_make_request
submit_bio btrfs_map_bio btree_submit_bio_hook submit_one_bio
read_extent_buffer_pages readahead_tree_block reada_walk_down
do_walk_down walk_down_tree
Process btrfs-endio-met (4700) Total: 35.2 msec
[worker_loop] 3.9 msec 100.0 %
worker_loop kthread kernel_thread_helper
Process kworker/1:2 (26315) Total: 50.8 msec
. 3.9 msec 100.0 %
worker_thread kthread kernel_thread_helper
[...snip...]
=============== Thu Aug 4 12:34:08 2011
Globals: Cause Maximum Percentage
Process details:
Process btrfs-submit-0 (4332) Total: 199.2 msec
Creating block layer request 58.6 msec 100.0 %
get_request_wait __make_request generic_make_request
submit_bio run_scheduled_bios pending_bios_fn worker_loop
kthread kernel_thread_helper
Process btrfs-cleaner (4345) Total: 3.9 msec
Creating block layer request 3.9 msec 100.0 %
get_request_wait __make_request generic_make_request
submit_bio btrfs_map_bio btree_submit_bio_hook submit_one_bio
read_extent_buffer_pages readahead_tree_block reada_walk_down
do_walk_down walk_down_tree
Process btrfs-endio-met (4700) Total: 35.2 msec
[worker_loop] 3.9 msec 100.0 %
worker_loop kthread kernel_thread_helper
--
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-08-04 16:35 ` Bruce Guenter
@ 2011-08-11 15:04 ` Bruce Guenter
2011-08-12 0:40 ` Simon Kirby
0 siblings, 1 reply; 12+ messages in thread
From: Bruce Guenter @ 2011-08-11 15:04 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 318 bytes --]
On Thu, Aug 04, 2011 at 10:35:36AM -0600, Bruce Guenter wrote:
> Ok, I ran it for more than a few minutes. Early on (shortly after boot), the output shows this for a long time:
Does anybody have any ideas what might be going on?
--
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-08-11 15:04 ` Bruce Guenter
@ 2011-08-12 0:40 ` Simon Kirby
2011-08-12 18:21 ` Bruce Guenter
0 siblings, 1 reply; 12+ messages in thread
From: Simon Kirby @ 2011-08-12 0:40 UTC (permalink / raw)
To: linux-btrfs
On Thu, Aug 11, 2011 at 09:04:56AM -0600, Bruce Guenter wrote:
> On Thu, Aug 04, 2011 at 10:35:36AM -0600, Bruce Guenter wrote:
> > Ok, I ran it for more than a few minutes. Early on (shortly after boot), the output shows this for a long time:
>
> Does anybody have any ideas what might be going on?
By chance, does "vmstat 1" or top show a high system cpu% while this is
happening, and if so, does cat /proc/<pid>/stack of the PIDs in R state
sometimes show things under btrfs_run_delayed_refs like
setup_cluster_no_bitmap() and rb_next()?
I haven't been able to build a 2.6.39 to verify yet, but 2.6.38.5 does
not show the problem for some reason. This is on a backup server, too,
with 20 3 TB btrfs volumes.
You should also be able to see a stack trace in this case (NOT the
uninterruptible sleep case) with echo l > /rpoc/sysrq-trigger, or use
oprofile or "perf top", which is what I am using. I see output such as:
samples pcnt function DSO
_______ _____ __________________________________ ______________
13537.00 59.2% rb_next [kernel]
3539.00 15.5% _raw_spin_lock [kernel]
1668.00 7.3% setup_cluster_no_bitmap [kernel]
799.00 3.5% tree_search_offset [kernel]
476.00 2.1% fill_window [kernel]
and and perf record -g; perf report --stdio, which gives me:
47.60% btrfs-transacti [kernel.kallsyms] [k] rb_next
|
--- rb_next
|
|--97.36%-- setup_cluster_no_bitmap
| btrfs_find_space_cluster
| find_free_extent
| btrfs_reserve_extent
| btrfs_alloc_free_block
| __btrfs_cow_block
| btrfs_cow_block
| |
| |--99.95%-- btrfs_search_slot
| | |
| | |--55.01%-- lookup_inline_extent_backref
| | | |
| | | |--97.17%-- __btrfs_free_extent
| | | | run_clustered_refs
| | | | btrfs_run_delayed_refs
| | | | |
| | | | |--88.72%-- btrfs_commit_transaction
| | | | | transaction_kthread
At one point, I rebooted the box without unmounting, and mounting of the
60 TB of btrfs volumes never actually finished within 8 hours. It seemed
to get stuck on one of them about half way through, constantly writing
(slowly) and taking lots of CPU. This is just from mounting, not any use.
Booting back into 2.6.38.5 was fine. The kernel giving problems was 3.0,
and now 3.1-rc1.
Before, when it actually was mounted, "sync" (with nothing else dirtying
pages) also took 4 minutes to run and pegged the CPU with a profile
similar to the above.
I tried started trying to do a bisection but tons of other things have
changed and it takes quite a while to verify each pass. It also doesn't
help that I can't seem to reproduce it on any other non-production boxes.
Simon-
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-08-12 0:40 ` Simon Kirby
@ 2011-08-12 18:21 ` Bruce Guenter
2011-08-12 18:26 ` Simon Kirby
0 siblings, 1 reply; 12+ messages in thread
From: Bruce Guenter @ 2011-08-12 18:21 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 615 bytes --]
On Thu, Aug 11, 2011 at 05:40:36PM -0700, Simon Kirby wrote:
> By chance, does "vmstat 1" or top show a high system cpu% while this is
> happening,
Absolutely, usually around 80-90%
> and if so, does cat /proc/<pid>/stack of the PIDs in R state
> sometimes show things under btrfs_run_delayed_refs like
> setup_cluster_no_bitmap() and rb_next()?
No, I am not seeing what you describe. The stack traces are usually
showing what I originally quoted. I don't know if I've seen
setup_cluster_no_bitmap or rb_* at all.
--
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Slow snapshot deletion
2011-08-12 18:21 ` Bruce Guenter
@ 2011-08-12 18:26 ` Simon Kirby
0 siblings, 0 replies; 12+ messages in thread
From: Simon Kirby @ 2011-08-12 18:26 UTC (permalink / raw)
To: linux-btrfs
On Fri, Aug 12, 2011 at 12:21:44PM -0600, Bruce Guenter wrote:
> On Thu, Aug 11, 2011 at 05:40:36PM -0700, Simon Kirby wrote:
> > By chance, does "vmstat 1" or top show a high system cpu% while this is
> > happening,
>
> Absolutely, usually around 80-90%
...which is perhaps why you had difficult obtaining traces with sysrq-w,
since most of the time there were no D-state processes. Instead, sysrq-l
give a backtrace of each running CPU.
> > and if so, does cat /proc/<pid>/stack of the PIDs in R state
> > sometimes show things under btrfs_run_delayed_refs like
> > setup_cluster_no_bitmap() and rb_next()?
>
> No, I am not seeing what you describe. The stack traces are usually
> showing what I originally quoted. I don't know if I've seen
> setup_cluster_no_bitmap or rb_* at all.
Ok, oh well. :)
Simon-
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2011-08-12 18:26 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-07-28 20:04 Slow snapshot deletion Bruce Guenter
2011-07-28 20:28 ` Chris Mason
2011-07-28 20:51 ` Chris Mason
2011-08-04 16:35 ` Bruce Guenter
2011-08-11 15:04 ` Bruce Guenter
2011-08-12 0:40 ` Simon Kirby
2011-08-12 18:21 ` Bruce Guenter
2011-08-12 18:26 ` Simon Kirby
2011-08-01 22:26 ` Bruce Guenter
2011-08-01 22:41 ` cwillu
2011-08-01 22:59 ` Bruce Guenter
2011-08-02 0:29 ` Chris Mason
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).