* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
@ 2015-09-16 14:31 Glyn Normington
2015-09-16 14:48 ` Holger Hoffstätte
0 siblings, 1 reply; 12+ messages in thread
From: Glyn Normington @ 2015-09-16 14:31 UTC (permalink / raw)
To: linux-btrfs
>On 09/15/15 21:15, Josef Bacik wrote:
>> On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
>>> On 09/15/15 17:50, Holger Hoffstätte wrote:
>>>> This V2 does indeed seem to fix the issues I reported with snapshot
>>>> deletion & concurrent sync. I've now created/filled/deleted countless
>>>> snapshots while issuing sync(s) in parallel, and the problem that I
>>>> saw fairly frequently with V1 no longer seems to occur here.
>>>
>>> Well..I may have spoken too soon:
>>
>> Huh this doesn't seem related to my stuff. Can I have your script you
>> are running to reproduce this? Thanks,
>
>This was manual - deleted two large files, deleted two snapshots (one
>from the modified subvolume and one sibling), let it sit in the background
>while doing something else for a few seconds and found the messages.
>Don't know how to reproduce this as it didn't show up previously.
>
>-h
We are seeing a very similar problem fwiw. Unfortunately we can't
reproduce this reliably, but it is cropping up regularly. Here's a
chore we are using to track our work on this: [2].
Here's the kernel log:
[610194.395845] WARNING: CPU: 3 PID: 29166 at
/build/linux-lts-vivid-BZwsXG/linux-lts-vivid-3.19.0/fs/btrfs/super.c:260
__btrfs_abort_transaction+0x54/0x130 [btrfs]()
[610194.395847] BTRFS: Transaction aborted (error -2)
[610194.395848] Modules linked in: dummy xt_NFLOG nfnetlink_log
nfnetlink xt_tcpudp xt_iprange ipt_MASQUERADE nf_nat_masquerade_ipv4
veth xt_nat nf_log_ipv4 nf_log_common xt_LOG bridge stp llc
iptable_nat nf_nat_ipv4 nf_nat ipt_REJECT nf_reject_ipv4
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
iptable_filter ip_tables x_tables btrfs xor raid6_pq
x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper
ablk_helper cryptd nfsd auth_rpcgss nfs_acl nfs lockd grace sunrpc
fscache
[610194.395881] CPU: 3 PID: 29166 Comm: garden-linux Tainted: G
WC 3.19.0-26-generic #28~14.04.1-Ubuntu
[610194.395883] ffffffffc046ff40 ffff88000ba77968 ffffffff817aeed7
0000000000000000
[610194.395886] ffff88000ba779b8 ffff88000ba779a8 ffffffff81074d8a
ffff8801605fee38
[610194.395889] ffff8801bc069210 ffff880005dd5800 00000000fffffffe
ffffffffc046c390
[610194.395892] Call Trace:
[610194.395899] [<ffffffff817aeed7>] dump_stack+0x45/0x57
[610194.395904] [<ffffffff81074d8a>] warn_slowpath_common+0x8a/0xc0
[610194.395907] [<ffffffff81074e06>] warn_slowpath_fmt+0x46/0x50
[610194.395917] [<ffffffffc03bff14>]
__btrfs_abort_transaction+0x54/0x130 [btrfs]
[610194.395930] [<ffffffffc03ec686>]
create_pending_snapshot+0x5a6/0x9b0 [btrfs]
[610194.395943] [<ffffffffc03ecb19>] create_pending_snapshots+0x89/0xb0 [btrfs]
[610194.395955] [<ffffffffc03ee0ae>]
btrfs_commit_transaction+0x47e/0xa30 [btrfs]
[610194.395970] [<ffffffffc0423070>] ?
btrfs_mksubvol.isra.28+0x310/0x550 [btrfs]
[610194.395983] [<ffffffffc0423297>] btrfs_mksubvol.isra.28+0x537/0x550 [btrfs]
[610194.395987] [<ffffffff810b4e10>] ? prepare_to_wait_event+0x110/0x110
[610194.395999] [<ffffffffc0423436>]
btrfs_ioctl_snap_create_transid+0x186/0x190 [btrfs]
[610194.396011] [<ffffffffc0423596>]
btrfs_ioctl_snap_create_v2+0xe6/0x140 [btrfs]
[610194.396024] [<ffffffffc04259de>] btrfs_ioctl+0xfbe/0x2ac0 [btrfs]
[610194.396028] [<ffffffff811fbe83>] ? path_openat+0x93/0x5a0
[610194.396032] [<ffffffff81061eba>] ? __do_page_fault+0x1ea/0x5b0
[610194.396035] [<ffffffff811ffbf8>] do_vfs_ioctl+0x2f8/0x510
[610194.396038] [<ffffffff811fc3b6>] ? final_putname+0x26/0x50
[610194.396041] [<ffffffff811fc659>] ? putname+0x29/0x40
[610194.396043] [<ffffffff811ffe91>] SyS_ioctl+0x81/0xa0
[610194.396047] [<ffffffff817b688d>] system_call_fastpath+0x16/0x1b
[610194.396049] ---[ end trace 8477fc3336b22b20 ]---
[610194.396075] BTRFS: error (device loop1) in
create_pending_snapshot:1392: errno=-2 No such entry
[610194.396186] BTRFS info (device loop1): forced readonly
[610194.396189] BTRFS warning (device loop1): Skipping commit of
aborted transaction.
[610194.396191] BTRFS: error (device loop1) in
cleanup_transaction:1670: errno=-2 No such entry
[610194.396290] BTRFS info (device loop1): delayed_refs has NO entry
and the version and btrfs info:
$> uname -a
Linux 1987d92f-d5e0-479c-8744-bb9dca7a28a1 3.19.0-26-generic
#28~14.04.1-Ubuntu SMP Wed Aug 12 14:09:17 UTC 2015 x86_64 x86_64
x86_64 GNU/Linux
$> btrfs --version
btrfs-progs v4.1.2
$> btrfs fi show
Label: none uuid: 13d624be-b162-4794-8edb-82270d68295d
Total devices 1 FS bytes used 5.40GiB
devid 1 size 29.75GiB used 10.04GiB path /dev/loop2
btrfs-progs v4.1.2
$> btrfs fi df /var/vcap/data/garden/btrfs_graph
Data, single: total=6.01GiB, used=4.66GiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=1.50GiB, used=752.88MiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=128.00MiB, used=0.00B
These plus the full dmesg log are available in [1].
Regards,
Glyn Normington & George Lestaris
[1] https://www.dropbox.com/s/v9wc9ampjw0fa8b/btrfs-diagnostics.txt?dl=1
[2] https://www.pivotaltracker.com/story/show/103489222
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-16 14:31 [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit Glyn Normington
@ 2015-09-16 14:48 ` Holger Hoffstätte
0 siblings, 0 replies; 12+ messages in thread
From: Holger Hoffstätte @ 2015-09-16 14:48 UTC (permalink / raw)
To: Glyn Normington, linux-btrfs
Hi Glyn :)
On 09/16/15 16:31, Glyn Normington wrote:
> We are seeing a very similar problem fwiw. Unfortunately we can't
> reproduce this reliably, but it is cropping up regularly. Here's a
> chore we are using to track our work on this: [2].
>
> Here's the kernel log:
>
> [610194.395845] WARNING: CPU: 3 PID: 29166 at
> /build/linux-lts-vivid-BZwsXG/linux-lts-vivid-3.19.0/fs/btrfs/super.c:260
> __btrfs_abort_transaction+0x54/0x130 [btrfs]()
This is an entirely different problem and may have long been fixed;
I count at least 6 patches for tx aborts in my 4.1++ tree alone.
We're talking about patches that I can guarantee you are not running.
If Pivotal needs some help with stable btrfs backport fixes I might be
able to resurrect my old 3.19 patch queue (check my Github!), but this
is close to Zombie Olympics.. :)
cheers
Holger
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
@ 2015-09-15 14:07 Josef Bacik
2015-09-15 15:50 ` Holger Hoffstätte
0 siblings, 1 reply; 12+ messages in thread
From: Josef Bacik @ 2015-09-15 14:07 UTC (permalink / raw)
To: linux-btrfs
When dropping a snapshot we need to account for the qgroup changes. If we drop
the snapshot in all one go then the backref code will fail to find blocks from
the snapshot we dropped since it won't be able to find the root in the fs root
cache. This can lead to us failing to find refs from other roots that pointed
at blocks in the now deleted root. To handle this we need to not remove the fs
roots from the cache until after we process the qgroup operations. Do this by
adding dropped roots to a list on the transaction, and letting the transaction
remove the roots at the same time it drops the commit roots. This will keep all
of the backref searching code in sync properly, and fixes a problem Mark was
seeing with snapshot delete and qgroups. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
---
V1->V2:
-clear the trans tag when we are finished dropping the subvol so we don't try to
update the root during commit.
fs/btrfs/extent-tree.c | 2 +-
fs/btrfs/transaction.c | 30 ++++++++++++++++++++++++++++++
fs/btrfs/transaction.h | 5 ++++-
3 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ab81135..2327d4c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8596,7 +8596,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
}
if (test_bit(BTRFS_ROOT_IN_RADIX, &root->state)) {
- btrfs_drop_and_free_fs_root(tree_root->fs_info, root);
+ btrfs_add_dropped_root(trans, root);
} else {
free_extent_buffer(root->node);
free_extent_buffer(root->commit_root);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index f5021fc..cd15f39 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -117,6 +117,16 @@ static noinline void switch_commit_roots(struct btrfs_transaction *trans,
btrfs_unpin_free_ino(root);
clear_btree_io_tree(&root->dirty_log_pages);
}
+
+ /* We can free old roots now. */
+ spin_lock(&trans->dropped_roots_lock);
+ while (!list_empty(&trans->dropped_roots)) {
+ root = list_first_entry(&trans->dropped_roots,
+ struct btrfs_root, root_list);
+ list_del_init(&root->root_list);
+ btrfs_drop_and_free_fs_root(fs_info, root);
+ }
+ spin_unlock(&trans->dropped_roots_lock);
up_write(&fs_info->commit_root_sem);
}
@@ -255,9 +265,11 @@ loop:
INIT_LIST_HEAD(&cur_trans->pending_ordered);
INIT_LIST_HEAD(&cur_trans->dirty_bgs);
INIT_LIST_HEAD(&cur_trans->io_bgs);
+ INIT_LIST_HEAD(&cur_trans->dropped_roots);
mutex_init(&cur_trans->cache_write_mutex);
cur_trans->num_dirty_bgs = 0;
spin_lock_init(&cur_trans->dirty_bgs_lock);
+ spin_lock_init(&cur_trans->dropped_roots_lock);
list_add_tail(&cur_trans->list, &fs_info->trans_list);
extent_io_tree_init(&cur_trans->dirty_pages,
fs_info->btree_inode->i_mapping);
@@ -334,6 +346,24 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
}
+void btrfs_add_dropped_root(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root)
+{
+ struct btrfs_transaction *cur_trans = trans->transaction;
+
+ /* Add ourselves to the transaction dropped list */
+ spin_lock(&cur_trans->dropped_roots_lock);
+ list_add_tail(&root->root_list, &cur_trans->dropped_roots);
+ spin_unlock(&cur_trans->dropped_roots_lock);
+
+ /* Make sure we don't try to update the root at commit time */
+ spin_lock(&root->fs_info->fs_roots_radix_lock);
+ radix_tree_tag_clear(&root->fs_info->fs_roots_radix,
+ (unsigned long)root->root_key.objectid,
+ BTRFS_ROOT_TRANS_TAG);
+ spin_unlock(&root->fs_info->fs_roots_radix_lock);
+}
+
int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
struct btrfs_root *root)
{
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index eb09c20..ef4ff38 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -65,6 +65,7 @@ struct btrfs_transaction {
struct list_head switch_commits;
struct list_head dirty_bgs;
struct list_head io_bgs;
+ struct list_head dropped_roots;
u64 num_dirty_bgs;
/*
@@ -74,6 +75,7 @@ struct btrfs_transaction {
*/
struct mutex cache_write_mutex;
spinlock_t dirty_bgs_lock;
+ spinlock_t dropped_roots_lock;
struct btrfs_delayed_ref_root delayed_refs;
int aborted;
int dirty_bg_run;
@@ -214,5 +216,6 @@ int btrfs_transaction_blocked(struct btrfs_fs_info *info);
int btrfs_transaction_in_commit(struct btrfs_fs_info *info);
void btrfs_put_transaction(struct btrfs_transaction *transaction);
void btrfs_apply_pending_changes(struct btrfs_fs_info *fs_info);
-
+void btrfs_add_dropped_root(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root);
#endif
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-15 14:07 Josef Bacik
@ 2015-09-15 15:50 ` Holger Hoffstätte
2015-09-15 15:56 ` Josef Bacik
2015-09-15 19:08 ` Holger Hoffstätte
0 siblings, 2 replies; 12+ messages in thread
From: Holger Hoffstätte @ 2015-09-15 15:50 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs
On 09/15/15 16:07, Josef Bacik wrote:
> When dropping a snapshot we need to account for the qgroup changes. If we drop
> the snapshot in all one go then the backref code will fail to find blocks from
> the snapshot we dropped since it won't be able to find the root in the fs root
> cache. This can lead to us failing to find refs from other roots that pointed
> at blocks in the now deleted root. To handle this we need to not remove the fs
> roots from the cache until after we process the qgroup operations. Do this by
> adding dropped roots to a list on the transaction, and letting the transaction
> remove the roots at the same time it drops the commit roots. This will keep all
> of the backref searching code in sync properly, and fixes a problem Mark was
> seeing with snapshot delete and qgroups. Thanks,
>
> Signed-off-by: Josef Bacik <jbacik@fb.com>
> ---
> V1->V2:
> -clear the trans tag when we are finished dropping the subvol so we don't try to
> update the root during commit.
This V2 does indeed seem to fix the issues I reported with snapshot deletion &
concurrent sync. I've now created/filled/deleted countless snapshots while issuing
sync(s) in parallel, and the problem that I saw fairly frequently with V1 no longer
seems to occur here. Therefore:
Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Thanks for the quick fix, Josef!
cheers
Holger
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-15 15:50 ` Holger Hoffstätte
@ 2015-09-15 15:56 ` Josef Bacik
2015-09-15 19:08 ` Holger Hoffstätte
1 sibling, 0 replies; 12+ messages in thread
From: Josef Bacik @ 2015-09-15 15:56 UTC (permalink / raw)
To: Holger Hoffstätte, linux-btrfs
On 09/15/2015 11:50 AM, Holger Hoffstätte wrote:
> On 09/15/15 16:07, Josef Bacik wrote:
>> When dropping a snapshot we need to account for the qgroup changes. If we drop
>> the snapshot in all one go then the backref code will fail to find blocks from
>> the snapshot we dropped since it won't be able to find the root in the fs root
>> cache. This can lead to us failing to find refs from other roots that pointed
>> at blocks in the now deleted root. To handle this we need to not remove the fs
>> roots from the cache until after we process the qgroup operations. Do this by
>> adding dropped roots to a list on the transaction, and letting the transaction
>> remove the roots at the same time it drops the commit roots. This will keep all
>> of the backref searching code in sync properly, and fixes a problem Mark was
>> seeing with snapshot delete and qgroups. Thanks,
>>
>> Signed-off-by: Josef Bacik <jbacik@fb.com>
>> ---
>> V1->V2:
>> -clear the trans tag when we are finished dropping the subvol so we don't try to
>> update the root during commit.
>
> This V2 does indeed seem to fix the issues I reported with snapshot deletion &
> concurrent sync. I've now created/filled/deleted countless snapshots while issuing
> sync(s) in parallel, and the problem that I saw fairly frequently with V1 no longer
> seems to occur here. Therefore:
>
> Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
>
> Thanks for the quick fix, Josef!
>
Great thanks,
Josef
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-15 15:50 ` Holger Hoffstätte
2015-09-15 15:56 ` Josef Bacik
@ 2015-09-15 19:08 ` Holger Hoffstätte
2015-09-15 19:15 ` Josef Bacik
1 sibling, 1 reply; 12+ messages in thread
From: Holger Hoffstätte @ 2015-09-15 19:08 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs
On 09/15/15 17:50, Holger Hoffstätte wrote:
> This V2 does indeed seem to fix the issues I reported with snapshot
> deletion & concurrent sync. I've now created/filled/deleted countless
> snapshots while issuing sync(s) in parallel, and the problem that I
> saw fairly frequently with V1 no longer seems to occur here.
Well..I may have spoken too soon:
[20502.470227] ------------[ cut here ]------------
[20502.470247] WARNING: CPU: 0 PID: 27322 at fs/btrfs/transaction.c:66 btrfs_put_transaction+0xb7/0x100 [btrfs]()
[20502.470248] Modules linked in: btrfs nfsd auth_rpcgss oid_registry lockd grace sunrpc sch_fq_codel xor snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp raid6_pq crc32_pclmul i915 crc32c_intel aesni_intel snd_hda_intel aes_x86_64 intel_gtt glue_helper lrw snd_hda_controller i2c_algo_bit gf128mul ablk_helper drm_kms_helper snd_hda_codec cryptd usbhid snd_hda_core drm snd_pcm i2c_i801 snd_timer snd r8169 i2c_core soundcore mii video [last unloaded: btrfs]
[20502.470279] CPU: 0 PID: 27322 Comm: btrfs-transacti Not tainted 4.1.7 #1
[20502.470280] Hardware name: System manufacturer System Product Name/P8Z68-V LX, BIOS 4105 07/01/2013
[20502.470281] ffffffffa0843635 ffff8800c096fd48 ffffffff8156eae1 0000000000000007
[20502.470282] 0000000000000000 ffff8800c096fd88 ffffffff8105098a 0000000000000296
[20502.470284] ffff8802865a1d40 ffff8802865a1d40 ffff88014b5b5678 0000000000000000
[20502.470285] Call Trace:
[20502.470289] [<ffffffff8156eae1>] dump_stack+0x45/0x57
[20502.470292] [<ffffffff8105098a>] warn_slowpath_common+0x8a/0xc0
[20502.470293] [<ffffffff81050a7a>] warn_slowpath_null+0x1a/0x20
[20502.470298] [<ffffffffa07c1027>] btrfs_put_transaction+0xb7/0x100 [btrfs]
[20502.470303] [<ffffffffa07c280d>] btrfs_commit_transaction+0xb3d/0xc10 [btrfs]
[20502.470308] [<ffffffffa07bd305>] transaction_kthread+0x245/0x260 [btrfs]
[20502.470313] [<ffffffffa07bd0c0>] ? btrfs_cleanup_transaction+0x540/0x540 [btrfs]
[20502.470315] [<ffffffff8106ddc9>] kthread+0xc9/0xe0
[20502.470317] [<ffffffff8106dd00>] ? kthread_worker_fn+0x170/0x170
[20502.470319] [<ffffffff81573f52>] ret_from_fork+0x42/0x70
[20502.470320] [<ffffffff8106dd00>] ? kthread_worker_fn+0x170/0x170
[20502.470321] ---[ end trace 0556ec192b60907f ]---
[20532.457806] ------------[ cut here ]------------
[20532.457828] WARNING: CPU: 0 PID: 27322 at fs/btrfs/transaction.c:66 btrfs_put_transaction+0xb7/0x100 [btrfs]()
[20532.457829] Modules linked in: btrfs nfsd auth_rpcgss oid_registry lockd grace sunrpc sch_fq_codel xor snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp raid6_pq crc32_pclmul i915 crc32c_intel aesni_intel snd_hda_intel aes_x86_64 intel_gtt glue_helper lrw snd_hda_controller i2c_algo_bit gf128mul ablk_helper drm_kms_helper snd_hda_codec cryptd usbhid snd_hda_core drm snd_pcm i2c_i801 snd_timer snd r8169 i2c_core soundcore mii video [last unloaded: btrfs]
[20532.457872] CPU: 0 PID: 27322 Comm: btrfs-transacti Tainted: G W 4.1.7 #1
[20532.457873] Hardware name: System manufacturer System Product Name/P8Z68-V LX, BIOS 4105 07/01/2013
[20532.457873] ffffffffa0843635 ffff8800c096fd48 ffffffff8156eae1 0000000000000007
[20532.457875] 0000000000000000 ffff8800c096fd88 ffffffff8105098a 0000000000000296
[20532.457876] ffff8802865a1d40 ffff8802865a1d40 ffff8801123dc678 0000000000000000
[20532.457878] Call Trace:
[20532.457882] [<ffffffff8156eae1>] dump_stack+0x45/0x57
[20532.457884] [<ffffffff8105098a>] warn_slowpath_common+0x8a/0xc0
[20532.457885] [<ffffffff81050a7a>] warn_slowpath_null+0x1a/0x20
[20532.457890] [<ffffffffa07c1027>] btrfs_put_transaction+0xb7/0x100 [btrfs]
[20532.457895] [<ffffffffa07c280d>] btrfs_commit_transaction+0xb3d/0xc10 [btrfs]
[20532.457901] [<ffffffffa07bd305>] transaction_kthread+0x245/0x260 [btrfs]
[20532.457905] [<ffffffffa07bd0c0>] ? btrfs_cleanup_transaction+0x540/0x540 [btrfs]
[20532.457907] [<ffffffff8106ddc9>] kthread+0xc9/0xe0
[20532.457909] [<ffffffff8106dd00>] ? kthread_worker_fn+0x170/0x170
[20532.457911] [<ffffffff81573f52>] ret_from_fork+0x42/0x70
[20532.457912] [<ffffffff8106dd00>] ? kthread_worker_fn+0x170/0x170
[20532.457913] ---[ end trace 0556ec192b609080 ]---
As expected pid 27322 is the [btrfs-transacti] thread. It now seems to happen
(just did twice in a row) when I delete a snapshot but *don't* sync, but instead
just let it smurf along all by itself.
This warning is from:
..
void btrfs_put_transaction(struct btrfs_transaction *transaction)
{
WARN_ON(atomic_read(&transaction->use_count) == 0);
if (atomic_dec_and_test(&transaction->use_count)) {
BUG_ON(!list_empty(&transaction->list));
WARN_ON(!RB_EMPTY_ROOT(&transaction->delayed_refs.href_root));
if (transaction->delayed_refs.pending_csums)
..
Once this happens unmounting results in:
..blabla..
[21618.982176] BTRFS error (device sdc1): unable to find ref byte nr 628535836672 parent 0 root 1 owner 0 offset 0
[21618.982176] ------------[ cut here ]------------
[21618.982181] WARNING: CPU: 1 PID: 387 at fs/btrfs/extent-tree.c:6363 __btrfs_free_extent.isra.33+0x960/0xdf0 [btrfs]()
[21618.982181] BTRFS: Transaction aborted (error -2)
[21618.982182] Modules linked in: btrfs nfsd auth_rpcgss oid_registry lockd grace sunrpc sch_fq_codel xor snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp raid6_pq crc32_pclmul i915 crc32c_intel aesni_intel snd_hda_intel aes_x86_64 intel_gtt glue_helper lrw snd_hda_controller i2c_algo_bit gf128mul ablk_helper drm_kms_helper snd_hda_codec cryptd usbhid snd_hda_core drm snd_pcm i2c_i801 snd_timer snd r8169 i2c_core soundcore mii video [last unloaded: btrfs]
[21618.982196] CPU: 1 PID: 387 Comm: umount Tainted: G W 4.1.7 #1
[21618.982196] Hardware name: System manufacturer System Product Name/P8Z68-V LX, BIOS 4105 07/01/2013
[21618.982197] ffffffffa0842df0 ffff88014336b948 ffffffff8156eae1 0000000000000007
[21618.982198] ffff88014336b998 ffff88014336b988 ffffffff8105098a ffffffffa0842d5a
[21618.982199] 00000000fffffffe 0000009257a7c000 0000000000000000 ffff8803c9350000
[21618.982201] Call Trace:
[21618.982202] [<ffffffff8156eae1>] dump_stack+0x45/0x57
[21618.982203] [<ffffffff8105098a>] warn_slowpath_common+0x8a/0xc0
[21618.982204] [<ffffffff81050a06>] warn_slowpath_fmt+0x46/0x50
[21618.982208] [<ffffffffa07a5330>] __btrfs_free_extent.isra.33+0x960/0xdf0 [btrfs]
[21618.982214] [<ffffffffa080ce1d>] ? btrfs_delayed_ref_lock+0x3d/0x270 [btrfs]
[21618.982219] [<ffffffffa07a9802>] __btrfs_run_delayed_refs+0x9c2/0x1110 [btrfs]
[21618.982224] [<ffffffffa07acdb3>] btrfs_run_delayed_refs.part.38+0x73/0x280 [btrfs]
[21618.982228] [<ffffffffa07b0dbe>] btrfs_start_dirty_block_groups+0x35e/0x410 [btrfs]
[21618.982234] [<ffffffffa07c1e95>] btrfs_commit_transaction+0x1c5/0xc10 [btrfs]
[21618.982239] [<ffffffffa07c2976>] ? start_transaction+0x96/0x510 [btrfs]
[21618.982244] [<ffffffffa07bba23>] btrfs_commit_super+0x93/0xa0 [btrfs]
[21618.982248] [<ffffffffa07bd5b4>] close_ctree+0x294/0x350 [btrfs]
[21618.982252] [<ffffffffa0790dd9>] btrfs_put_super+0x19/0x20 [btrfs]
[21618.982253] [<ffffffff8117535a>] generic_shutdown_super+0x6a/0xf0
[21618.982254] [<ffffffff81175646>] kill_anon_super+0x16/0x30
[21618.982258] [<ffffffffa07906e8>] btrfs_kill_super+0x18/0x110 [btrfs]
[21618.982259] [<ffffffff811759f9>] deactivate_locked_super+0x49/0x80
[21618.982260] [<ffffffff81175e6c>] deactivate_super+0x6c/0x80
[21618.982261] [<ffffffff81192a53>] cleanup_mnt+0x43/0xa0
[21618.982263] [<ffffffff81192b02>] __cleanup_mnt+0x12/0x20
[21618.982264] [<ffffffff8106c387>] task_work_run+0xa7/0xe0
[21618.982265] [<ffffffff81002b9d>] do_notify_resume+0x6d/0x70
[21618.982266] [<ffffffff81573d42>] int_signal+0x12/0x17
[21618.982267] ---[ end trace 0556ec192b609086 ]---
[21618.982269] BTRFS: error (device sdc1) in __btrfs_free_extent:6363: errno=-2 No such entry
[21618.982800] BTRFS info (device sdc1): forced readonly
[21618.982802] BTRFS: error (device sdc1) in btrfs_run_delayed_refs:2863: errno=-2 No such entry
[21618.983332] BTRFS error (device sdc1): commit super ret -2
[21618.983407] BTRFS error (device sdc1): cleaner transaction attach returned -30
Sorry..
-h
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-15 19:08 ` Holger Hoffstätte
@ 2015-09-15 19:15 ` Josef Bacik
2015-09-15 19:39 ` Holger Hoffstätte
2015-09-16 8:58 ` Holger Hoffstätte
0 siblings, 2 replies; 12+ messages in thread
From: Josef Bacik @ 2015-09-15 19:15 UTC (permalink / raw)
To: Holger Hoffstätte, linux-btrfs
On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
> On 09/15/15 17:50, Holger Hoffstätte wrote:
>> This V2 does indeed seem to fix the issues I reported with snapshot
>> deletion & concurrent sync. I've now created/filled/deleted countless
>> snapshots while issuing sync(s) in parallel, and the problem that I
>> saw fairly frequently with V1 no longer seems to occur here.
>
> Well..I may have spoken too soon:
Huh this doesn't seem related to my stuff. Can I have your script you
are running to reproduce this? Thanks,
Josef
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-15 19:15 ` Josef Bacik
@ 2015-09-15 19:39 ` Holger Hoffstätte
2015-09-16 8:58 ` Holger Hoffstätte
1 sibling, 0 replies; 12+ messages in thread
From: Holger Hoffstätte @ 2015-09-15 19:39 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs
On 09/15/15 21:15, Josef Bacik wrote:
> On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
>> On 09/15/15 17:50, Holger Hoffstätte wrote:
>>> This V2 does indeed seem to fix the issues I reported with snapshot
>>> deletion & concurrent sync. I've now created/filled/deleted countless
>>> snapshots while issuing sync(s) in parallel, and the problem that I
>>> saw fairly frequently with V1 no longer seems to occur here.
>>
>> Well..I may have spoken too soon:
>
> Huh this doesn't seem related to my stuff. Can I have your script you
> are running to reproduce this? Thanks,
This was manual - deleted two large files, deleted two snapshots (one
from the modified subvolume and one sibling), let it sit in the background
while doing something else for a few seconds and found the messages.
Don't know how to reproduce this as it didn't show up previously.
-h
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-15 19:15 ` Josef Bacik
2015-09-15 19:39 ` Holger Hoffstätte
@ 2015-09-16 8:58 ` Holger Hoffstätte
2015-09-16 13:50 ` Josef Bacik
1 sibling, 1 reply; 12+ messages in thread
From: Holger Hoffstätte @ 2015-09-16 8:58 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs
On 09/15/15 21:15, Josef Bacik wrote:
> On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
>> On 09/15/15 17:50, Holger Hoffstätte wrote:
>>> This V2 does indeed seem to fix the issues I reported with snapshot
>>> deletion & concurrent sync. I've now created/filled/deleted countless
>>> snapshots while issuing sync(s) in parallel, and the problem that I
>>> saw fairly frequently with V1 no longer seems to occur here.
>>
>> Well..I may have spoken too soon:
>
> Huh this doesn't seem related to my stuff. Can I have your script
> you are running to reproduce this? Thanks,
>
> Josef
It seemed to be related to Omar's free-space-tree - I saw the same bug
a bit later on a different device without snapshot deletion, just
deleting two large files. Sorry for the false positive.
-h
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-16 8:58 ` Holger Hoffstätte
@ 2015-09-16 13:50 ` Josef Bacik
2015-09-16 14:00 ` Holger Hoffstätte
2015-09-18 18:28 ` Omar Sandoval
0 siblings, 2 replies; 12+ messages in thread
From: Josef Bacik @ 2015-09-16 13:50 UTC (permalink / raw)
To: Holger Hoffstätte, linux-btrfs, Omar Sandoval
On 09/16/2015 04:58 AM, Holger Hoffstätte wrote:
> On 09/15/15 21:15, Josef Bacik wrote:
>> On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
>>> On 09/15/15 17:50, Holger Hoffstätte wrote:
>>>> This V2 does indeed seem to fix the issues I reported with snapshot
>>>> deletion & concurrent sync. I've now created/filled/deleted countless
>>>> snapshots while issuing sync(s) in parallel, and the problem that I
>>>> saw fairly frequently with V1 no longer seems to occur here.
>>>
>>> Well..I may have spoken too soon:
>>
>> Huh this doesn't seem related to my stuff. Can I have your script
>> you are running to reproduce this? Thanks,
>>
>> Josef
>
> It seemed to be related to Omar's free-space-tree - I saw the same bug
> a bit later on a different device without snapshot deletion, just
> deleting two large files. Sorry for the false positive.
>
Omar, do you have time to look into this? Were they highly fragmented
files or just giant contiguous files? Thanks,
Josef
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-16 13:50 ` Josef Bacik
@ 2015-09-16 14:00 ` Holger Hoffstätte
2015-09-18 18:28 ` Omar Sandoval
1 sibling, 0 replies; 12+ messages in thread
From: Holger Hoffstätte @ 2015-09-16 14:00 UTC (permalink / raw)
To: Josef Bacik, linux-btrfs, Omar Sandoval
On 09/16/15 15:50, Josef Bacik wrote:
> On 09/16/2015 04:58 AM, Holger Hoffstätte wrote:
>> On 09/15/15 21:15, Josef Bacik wrote:
>>> On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
>>>> On 09/15/15 17:50, Holger Hoffstätte wrote:
>>>>> This V2 does indeed seem to fix the issues I reported with snapshot
>>>>> deletion & concurrent sync. I've now created/filled/deleted countless
>>>>> snapshots while issuing sync(s) in parallel, and the problem that I
>>>>> saw fairly frequently with V1 no longer seems to occur here.
>>>>
>>>> Well..I may have spoken too soon:
>>>
>>> Huh this doesn't seem related to my stuff. Can I have your script
>>> you are running to reproduce this? Thanks,
>>>
>>> Josef
>>
>> It seemed to be related to Omar's free-space-tree - I saw the same bug
>> a bit later on a different device without snapshot deletion, just
>> deleting two large files. Sorry for the false positive.
>>
>
> Omar, do you have time to look into this? Were they highly fragmented
> files or just giant contiguous files? Thanks,
These were very contiguous (2* ~1.3 GB, 1-3 extents each). I already sent
Omar a long klog dump with more background info. The biggest issue is that
after this bug the volume is hosed since the fst sets an incompat flag,
and after mounting rw just stumbles again trying to replay.
I'll keep trying to isolate exactly how and when this happens.
-h
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit
2015-09-16 13:50 ` Josef Bacik
2015-09-16 14:00 ` Holger Hoffstätte
@ 2015-09-18 18:28 ` Omar Sandoval
1 sibling, 0 replies; 12+ messages in thread
From: Omar Sandoval @ 2015-09-18 18:28 UTC (permalink / raw)
To: Josef Bacik; +Cc: Holger Hoffstätte, linux-btrfs
On Wed, Sep 16, 2015 at 09:50:17AM -0400, Josef Bacik wrote:
> On 09/16/2015 04:58 AM, Holger Hoffstätte wrote:
> >On 09/15/15 21:15, Josef Bacik wrote:
> >>On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
> >>>On 09/15/15 17:50, Holger Hoffstätte wrote:
> >>>>This V2 does indeed seem to fix the issues I reported with snapshot
> >>>>deletion & concurrent sync. I've now created/filled/deleted countless
> >>>>snapshots while issuing sync(s) in parallel, and the problem that I
> >>>>saw fairly frequently with V1 no longer seems to occur here.
> >>>
> >>>Well..I may have spoken too soon:
> >>
> >>Huh this doesn't seem related to my stuff. Can I have your script
> >>you are running to reproduce this? Thanks,
> >>
> >>Josef
> >
> >It seemed to be related to Omar's free-space-tree - I saw the same bug
> >a bit later on a different device without snapshot deletion, just
> >deleting two large files. Sorry for the false positive.
> >
>
> Omar, do you have time to look into this? Were they highly fragmented files
> or just giant contiguous files? Thanks,
>
> Josef
>
Yeah, I'll take a look over the weekend or next week at the latest.
--
Omar
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-09-18 18:28 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-16 14:31 [PATCH V2] Btrfs: keep dropped roots in cache until transaction commit Glyn Normington
2015-09-16 14:48 ` Holger Hoffstätte
-- strict thread matches above, loose matches on Subject: below --
2015-09-15 14:07 Josef Bacik
2015-09-15 15:50 ` Holger Hoffstätte
2015-09-15 15:56 ` Josef Bacik
2015-09-15 19:08 ` Holger Hoffstätte
2015-09-15 19:15 ` Josef Bacik
2015-09-15 19:39 ` Holger Hoffstätte
2015-09-16 8:58 ` Holger Hoffstätte
2015-09-16 13:50 ` Josef Bacik
2015-09-16 14:00 ` Holger Hoffstätte
2015-09-18 18:28 ` Omar Sandoval
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).