* [PATCH 1/7] Btrfs: pass locked_page into extent_clear_unlock_delalloc if theres an error
2012-06-01 13:55 Transaction abort fixes Josef Bacik
@ 2012-06-01 13:55 ` Josef Bacik
2012-06-01 13:55 ` [PATCH 2/7] Btrfs: fix locking in btrfs_destroy_delayed_refs Josef Bacik
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2012-06-01 13:55 UTC
To: linux-btrfs
While doing my enospc work I got a transaction abort that resulted in a
panic when we tried to unlock_page() an already unlocked page. This is
because we aren't calling extent_clear_unlock_delalloc with the locked page,
so it was unlocking all the pages in the range. This is wrong since
__extent_writepage expects to have the page locked still unless we return
*page_started as 1. This should keep us from panicking. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/inode.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 46d8732..e91f985 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -830,7 +830,7 @@ static noinline int cow_file_range(struct inode *inode,
if (IS_ERR(trans)) {
extent_clear_unlock_delalloc(inode,
&BTRFS_I(inode)->io_tree,
- start, end, NULL,
+ start, end, locked_page,
EXTENT_CLEAR_UNLOCK_PAGE |
EXTENT_CLEAR_UNLOCK |
EXTENT_CLEAR_DELALLOC |
@@ -963,7 +963,7 @@ out:
out_unlock:
extent_clear_unlock_delalloc(inode,
&BTRFS_I(inode)->io_tree,
- start, end, NULL,
+ start, end, locked_page,
EXTENT_CLEAR_UNLOCK_PAGE |
EXTENT_CLEAR_UNLOCK |
EXTENT_CLEAR_DELALLOC |
--
1.7.7.6
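For readers who want the idea of this patch in isolation, here is a minimal user-space sketch. It is not the btrfs code: struct toy_page and toy_clear_unlock_range() are hypothetical names, and the only thing modelled is the behaviour described in the commit message, namely that a range unlock leaves the caller's locked page alone when that page is passed in, and unlocks everything when NULL is passed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the lock bit is modelled. */
struct toy_page {
	bool locked;
};

/*
 * Unlock every page in [first, last] except locked_page (which may be NULL).
 * Returns the number of pages that were unlocked.
 */
static size_t toy_clear_unlock_range(struct toy_page *pages, size_t first,
				     size_t last,
				     const struct toy_page *locked_page)
{
	size_t unlocked = 0;

	assert(pages != NULL);
	for (size_t i = first; i <= last; i++) {
		if (&pages[i] == locked_page)
			continue;	/* the caller still owns this lock */
		pages[i].locked = false;
		unlocked++;
	}
	return unlocked;
}
```

Passing NULL as the last argument unlocks the caller's page as well, so a later unlock by the caller would hit an already unlocked page, which is the double unlock the commit message describes.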
* [PATCH 2/7] Btrfs: fix locking in btrfs_destroy_delayed_refs
2012-06-01 13:55 Transaction abort fixes Josef Bacik
2012-06-01 13:55 ` [PATCH 1/7] Btrfs: pass locked_page into extent_clear_unlock_delalloc if theres an error Josef Bacik
@ 2012-06-01 13:55 ` Josef Bacik
2012-06-01 13:55 ` [PATCH 3/7] Btrfs: check the return code of btrfs_save_ino_cache Josef Bacik
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2012-06-01 13:55 UTC
To: linux-btrfs
The transaction abort stuff was throwing warnings from the list debugging
code because we do a list_del_init outside of the delayed_refs spin lock.
The delayed refs locking makes baby Jesus cry so it's not hard to get wrong.
We need to take the ref head mutex to make sure the head is not currently
being processed. If it is, we need to drop the spin lock, take and drop the
mutex to wait for it, and then do the search again. If we can take the mutex
then we can safely remove the head from the list and carry on. Now when the
transaction aborts I don't get the list debugging warnings. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/disk-io.c | 30 +++++++++++++++++-------------
1 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b0d49e2..0224c25 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3395,7 +3395,6 @@ int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
delayed_refs = &trans->delayed_refs;
-again:
spin_lock(&delayed_refs->lock);
if (delayed_refs->num_entries == 0) {
spin_unlock(&delayed_refs->lock);
@@ -3403,31 +3402,36 @@ again:
return ret;
}
- node = rb_first(&delayed_refs->root);
- while (node) {
+ while ((node = rb_first(&delayed_refs->root)) != NULL) {
ref = rb_entry(node, struct btrfs_delayed_ref_node, rb_node);
- node = rb_next(node);
-
- ref->in_tree = 0;
- rb_erase(&ref->rb_node, &delayed_refs->root);
- delayed_refs->num_entries--;
atomic_set(&ref->refs, 1);
if (btrfs_delayed_ref_is_head(ref)) {
struct btrfs_delayed_ref_head *head;
head = btrfs_delayed_node_to_head(ref);
- spin_unlock(&delayed_refs->lock);
- mutex_lock(&head->mutex);
+ if (!mutex_trylock(&head->mutex)) {
+ atomic_inc(&ref->refs);
+ spin_unlock(&delayed_refs->lock);
+
+ /* Need to wait for the delayed ref to run */
+ mutex_lock(&head->mutex);
+ mutex_unlock(&head->mutex);
+ btrfs_put_delayed_ref(ref);
+
+ continue;
+ }
+
kfree(head->extent_op);
delayed_refs->num_heads--;
if (list_empty(&head->cluster))
delayed_refs->num_heads_ready--;
list_del_init(&head->cluster);
- mutex_unlock(&head->mutex);
- btrfs_put_delayed_ref(ref);
- goto again;
}
+ ref->in_tree = 0;
+ rb_erase(&ref->rb_node, &delayed_refs->root);
+ delayed_refs->num_entries--;
+
spin_unlock(&delayed_refs->lock);
btrfs_put_delayed_ref(ref);
--
1.7.7.6
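The locking pattern described in this patch (try the inner mutex while holding the outer lock; if it is busy, drop the outer lock, wait on the inner mutex, then rescan from the start) can be sketched in user space with pthreads. This is only an illustration under stated assumptions: struct toy_head, struct toy_refs and toy_destroy_all() are hypothetical names, a pthread mutex stands in for the delayed_refs spin lock, a singly linked list stands in for the rbtree, and nothing else frees the heads, so no reference counting is modelled.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a delayed ref head. */
struct toy_head {
	pthread_mutex_t mutex;		/* held while the head is being processed */
	struct toy_head *next;
};

/* Hypothetical stand-in for the delayed refs root. */
struct toy_refs {
	pthread_mutex_t lock;		/* models the delayed_refs spin lock */
	struct toy_head *first;
	size_t num_entries;
};

/* Remove every head from the list; returns how many were removed. */
static size_t toy_destroy_all(struct toy_refs *refs)
{
	size_t removed = 0;

	pthread_mutex_lock(&refs->lock);
	while (refs->first != NULL) {
		struct toy_head *head = refs->first;

		if (pthread_mutex_trylock(&head->mutex) != 0) {
			/* Busy: never sleep on it while holding the outer lock. */
			pthread_mutex_unlock(&refs->lock);

			/* Wait for whoever is processing the head to finish. */
			pthread_mutex_lock(&head->mutex);
			pthread_mutex_unlock(&head->mutex);

			/* Retake the outer lock and search again from the start. */
			pthread_mutex_lock(&refs->lock);
			continue;
		}

		/* We hold both locks, so unlinking is safe here. */
		refs->first = head->next;
		head->next = NULL;
		refs->num_entries--;
		removed++;
		pthread_mutex_unlock(&head->mutex);
	}
	pthread_mutex_unlock(&refs->lock);
	return removed;
}
```

The point of the shape is that the list is only ever modified with the outer lock held, and the blocking wait on the inner mutex only ever happens with the outer lock dropped.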
* [PATCH 3/7] Btrfs: check the return code of btrfs_save_ino_cache
2012-06-01 13:55 Transaction abort fixes Josef Bacik
2012-06-01 13:55 ` [PATCH 1/7] Btrfs: pass locked_page into extent_clear_unlock_delalloc if theres an error Josef Bacik
2012-06-01 13:55 ` [PATCH 2/7] Btrfs: fix locking in btrfs_destroy_delayed_refs Josef Bacik
@ 2012-06-01 13:55 ` Josef Bacik
2012-06-04 18:07 ` Josef Bacik
2012-06-01 13:55 ` [PATCH 4/7] Btrfs: wake up transaction waiters when aborting a transaction Josef Bacik
` (3 subsequent siblings)
6 siblings, 1 reply; 9+ messages in thread
From: Josef Bacik @ 2012-06-01 13:55 UTC
To: linux-btrfs
In doing my enospc work I would sometimes error out in btrfs_save_ino_cache
which would abort the transaction but we'd still end up with a corrupted
file system. This is because we don't actually check the return value and
so if something goes wrong we just exit out and screw everything up. This
fixes this particular part. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/transaction.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 82b03af..7aed0e8 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -823,7 +823,9 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans,
btrfs_update_reloc_root(trans, root);
btrfs_orphan_commit_root(trans, root);
- btrfs_save_ino_cache(root, trans);
+ err = btrfs_save_ino_cache(root, trans);
+ if (err)
+ goto out;
/* see comments in should_cow_block() */
root->force_cow = 0;
@@ -848,6 +850,7 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans,
}
}
spin_unlock(&fs_info->fs_roots_radix_lock);
+out:
return err;
}
--
1.7.7.6
* Re: [PATCH 3/7] Btrfs: check the return code of btrfs_save_ino_cache
2012-06-01 13:55 ` [PATCH 3/7] Btrfs: check the return code of btrfs_save_ino_cache Josef Bacik
@ 2012-06-04 18:07 ` Josef Bacik
0 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2012-06-04 18:07 UTC
To: Josef Bacik; +Cc: linux-btrfs
On Fri, Jun 01, 2012 at 09:55:51AM -0400, Josef Bacik wrote:
> In doing my enospc work I would sometimes error out in btrfs_save_ino_cache
> which would abort the transaction but we'd still end up with a corrupted
> file system. This is because we don't actually check the return value and
> so if something goes wrong we just exit out and screw everything up. This
> fixes this particular part. Thanks,
Dropping this patch. It doesn't actually matter if the space cache gets written
out or not, and the write actually fails if the caching has not finished, which
can lead to a transaction being aborted for no reason. Thanks,
Josef
* [PATCH 4/7] Btrfs: wake up transaction waiters when aborting a transaction
2012-06-01 13:55 Transaction abort fixes Josef Bacik
` (2 preceding siblings ...)
2012-06-01 13:55 ` [PATCH 3/7] Btrfs: check the return code of btrfs_save_ino_cache Josef Bacik
@ 2012-06-01 13:55 ` Josef Bacik
2012-06-01 13:55 ` [PATCH 5/7] Btrfs: abort the transaction if the commit fails Josef Bacik
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2012-06-01 13:55 UTC
To: linux-btrfs
I was getting lots of hung tasks and a NULL pointer dereference because we
are not cleaning up the transaction properly when it aborts. First we need
to reset the running_transaction to NULL so we don't get a bad dereference
for any start_transaction callers after this. Also we cannot rely on
waitqueue_active() since it's just a list_empty() check with no memory
barrier, so just call wake_up() directly since that does the barrier for
us. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/disk-io.c | 9 +++------
fs/btrfs/transaction.c | 4 ++++
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 0224c25..050db9b 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3584,16 +3584,13 @@ void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
/* FIXME: cleanup wait for commit */
cur_trans->in_commit = 1;
cur_trans->blocked = 1;
- if (waitqueue_active(&root->fs_info->transaction_blocked_wait))
- wake_up(&root->fs_info->transaction_blocked_wait);
+ wake_up(&root->fs_info->transaction_blocked_wait);
cur_trans->blocked = 0;
- if (waitqueue_active(&root->fs_info->transaction_wait))
- wake_up(&root->fs_info->transaction_wait);
+ wake_up(&root->fs_info->transaction_wait);
cur_trans->commit_done = 1;
- if (waitqueue_active(&cur_trans->commit_wait))
- wake_up(&cur_trans->commit_wait);
+ wake_up(&cur_trans->commit_wait);
btrfs_destroy_pending_snapshots(cur_trans);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 7aed0e8..4e6f63e 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1205,6 +1205,10 @@ static void cleanup_transaction(struct btrfs_trans_handle *trans,
spin_lock(&root->fs_info->trans_lock);
list_del_init(&cur_trans->list);
+ if (cur_trans == root->fs_info->running_transaction) {
+ root->fs_info->running_transaction = NULL;
+ root->fs_info->trans_no_join = 0;
+ }
spin_unlock(&root->fs_info->trans_lock);
btrfs_cleanup_one_transaction(trans->transaction, root);
--
1.7.7.6
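As a loose user-space analogy for the two changes in this patch, the sketch below uses a pthread condition variable in place of a kernel wait queue. All names (struct toy_trans, struct toy_fs_info, toy_cleanup_transaction(), toy_wait_for_commit()) are hypothetical, and the mapping to the kernel primitives is an assumption made for illustration: the cleanup path clears the running-transaction pointer if it still points at the aborted transaction, and it wakes waiters unconditionally instead of first peeking at whether anyone appears to be waiting.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a transaction. */
struct toy_trans {
	bool commit_done;
};

/* Hypothetical stand-in for the per-filesystem state. */
struct toy_fs_info {
	pthread_mutex_t trans_lock;
	pthread_cond_t commit_wait;
	struct toy_trans *running_transaction;
};

/* Tear down an aborted transaction and release anyone waiting on it. */
static void toy_cleanup_transaction(struct toy_fs_info *fs,
				    struct toy_trans *cur)
{
	pthread_mutex_lock(&fs->trans_lock);

	/* Later starters must not find a pointer to the dead transaction. */
	if (fs->running_transaction == cur)
		fs->running_transaction = NULL;

	cur->commit_done = true;

	/* Wake unconditionally; do not test for waiters first. */
	pthread_cond_broadcast(&fs->commit_wait);
	pthread_mutex_unlock(&fs->trans_lock);
}

/* Block until the given transaction has been marked done. */
static void toy_wait_for_commit(struct toy_fs_info *fs, struct toy_trans *t)
{
	pthread_mutex_lock(&fs->trans_lock);
	while (!t->commit_done)
		pthread_cond_wait(&fs->commit_wait, &fs->trans_lock);
	pthread_mutex_unlock(&fs->trans_lock);
}
```

Broadcasting with no waiters is harmless, which is why dropping the "is anyone waiting?" check costs little and removes the window in which a waiter could be missed.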
* [PATCH 5/7] Btrfs: abort the transaction if the commit fails
2012-06-01 13:55 Transaction abort fixes Josef Bacik
` (3 preceding siblings ...)
2012-06-01 13:55 ` [PATCH 4/7] Btrfs: wake up transaction waiters when aborting a transaction Josef Bacik
@ 2012-06-01 13:55 ` Josef Bacik
2012-06-01 13:55 ` [PATCH 6/7] Btrfs: fix btrfs_destroy_marked_extents Josef Bacik
2012-06-01 13:55 ` [PATCH 7/7] Btrfs: unlock everything properly in the error case for nocow Josef Bacik
6 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2012-06-01 13:55 UTC
To: linux-btrfs
If a transaction commit fails we don't abort it, so we don't set an error on
the file system. This patch fixes that by actually calling
btrfs_abort_transaction() from cleanup_transaction() and then adding a check
for a fs error in the transaction start code, which returns -EROFS, to make
sure it is caught properly. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/transaction.c | 10 ++++++++--
1 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 4e6f63e..ead64e1 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -99,6 +99,10 @@ loop:
kmem_cache_free(btrfs_transaction_cachep, cur_trans);
cur_trans = root->fs_info->running_transaction;
goto loop;
+ } else if (root->fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR) {
+ spin_unlock(&root->fs_info->trans_lock);
+ kmem_cache_free(btrfs_transaction_cachep, cur_trans);
+ return -EROFS;
}
atomic_set(&cur_trans->num_writers, 1);
@@ -1197,12 +1201,14 @@ int btrfs_commit_transaction_async(struct btrfs_trans_handle *trans,
static void cleanup_transaction(struct btrfs_trans_handle *trans,
- struct btrfs_root *root)
+ struct btrfs_root *root, int err)
{
struct btrfs_transaction *cur_trans = trans->transaction;
WARN_ON(trans->use_count > 1);
+ btrfs_abort_transaction(trans, root, err);
+
spin_lock(&root->fs_info->trans_lock);
list_del_init(&cur_trans->list);
if (cur_trans == root->fs_info->running_transaction) {
@@ -1514,7 +1520,7 @@ cleanup_transaction:
// WARN_ON(1);
if (current->journal_info == trans)
current->journal_info = NULL;
- cleanup_transaction(trans, root);
+ cleanup_transaction(trans, root, ret);
return ret;
}
--
1.7.7.6
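The control flow this patch is after can be shown with a small stand-alone model. The names here (struct toy_fs, TOY_FS_ERROR, toy_start_transaction() and friends) are hypothetical and exist only for this sketch; the model assumes nothing about btrfs beyond what the commit message states: a failed commit marks the file system as errored, and any later attempt to start a transaction sees that mark and fails with -EROFS.

```c
#include <errno.h>

/* Hypothetical error bit, playing the role of BTRFS_SUPER_FLAG_ERROR. */
#define TOY_FS_ERROR (1UL << 0)

/* Hypothetical stand-in for the per-filesystem state. */
struct toy_fs {
	unsigned long fs_state;
	int running;		/* nonzero while a transaction is open */
};

/* Mark the file system as errored and drop the open transaction. */
static void toy_abort_transaction(struct toy_fs *fs)
{
	fs->fs_state |= TOY_FS_ERROR;
	fs->running = 0;
}

/* Refuse to start anything new once an error has been recorded. */
static int toy_start_transaction(struct toy_fs *fs)
{
	if (fs->fs_state & TOY_FS_ERROR)
		return -EROFS;
	fs->running = 1;
	return 0;
}

/* io_err is 0 on success or a negative errno from the simulated commit. */
static int toy_commit_transaction(struct toy_fs *fs, int io_err)
{
	if (io_err) {
		toy_abort_transaction(fs);	/* the step that was missing */
		return io_err;
	}
	fs->running = 0;
	return 0;
}
```

Without the abort call in the failure branch, the error bit would never be set and the next start would succeed on a file system whose last commit failed.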
* [PATCH 6/7] Btrfs: fix btrfs_destroy_marked_extents
2012-06-01 13:55 Transaction abort fixes Josef Bacik
` (4 preceding siblings ...)
2012-06-01 13:55 ` [PATCH 5/7] Btrfs: abort the transaction if the commit fails Josef Bacik
@ 2012-06-01 13:55 ` Josef Bacik
2012-06-01 13:55 ` [PATCH 7/7] Btrfs: unlock everything properly in the error case for nocow Josef Bacik
6 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2012-06-01 13:55 UTC
To: linux-btrfs
We're forcing the eb's to have their ref count set to 1 so invalidatepage
works, but this breaks lots of things, for example root nodes, and is just
plain wrong: we don't need to evict all of this stuff. Also drop the
invalidatepage altogether and add a page_cache_release(). With this patch
we no longer hang when trying to access the root nodes after an aborted
transaction and we no longer leak memory. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/disk-io.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 050db9b..ea2b1d2 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3519,11 +3519,9 @@ static int btrfs_destroy_marked_extents(struct btrfs_root *root,
&(&BTRFS_I(page->mapping->host)->io_tree)->buffer,
offset >> PAGE_CACHE_SHIFT);
spin_unlock(&dirty_pages->buffer_lock);
- if (eb) {
+ if (eb)
ret = test_and_clear_bit(EXTENT_BUFFER_DIRTY,
&eb->bflags);
- atomic_set(&eb->refs, 1);
- }
if (PageWriteback(page))
end_page_writeback(page);
@@ -3537,8 +3535,8 @@ static int btrfs_destroy_marked_extents(struct btrfs_root *root,
spin_unlock_irq(&page->mapping->tree_lock);
}
- page->mapping->a_ops->invalidatepage(page, 0);
unlock_page(page);
+ page_cache_release(page);
}
}
--
1.7.7.6
* [PATCH 7/7] Btrfs: unlock everything properly in the error case for nocow
2012-06-01 13:55 Transaction abort fixes Josef Bacik
` (5 preceding siblings ...)
2012-06-01 13:55 ` [PATCH 6/7] Btrfs: fix btrfs_destroy_marked_extents Josef Bacik
@ 2012-06-01 13:55 ` Josef Bacik
6 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2012-06-01 13:55 UTC
To: linux-btrfs
I was getting hung on umount when a transaction was aborted because a range
of one of the free space inodes was still locked. This is because the nocow
stuff doesn't unlock anything on error. This fixes the problem, and I
verified that this is what was happening. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/inode.c | 37 +++++++++++++++++++++++++++++++++++--
1 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e91f985..96b841d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1136,8 +1136,18 @@ static noinline int run_delalloc_nocow(struct inode *inode,
u64 ino = btrfs_ino(inode);
path = btrfs_alloc_path();
- if (!path)
+ if (!path) {
+ extent_clear_unlock_delalloc(inode,
+ &BTRFS_I(inode)->io_tree,
+ start, end, locked_page,
+ EXTENT_CLEAR_UNLOCK_PAGE |
+ EXTENT_CLEAR_UNLOCK |
+ EXTENT_CLEAR_DELALLOC |
+ EXTENT_CLEAR_DIRTY |
+ EXTENT_SET_WRITEBACK |
+ EXTENT_END_WRITEBACK);
return -ENOMEM;
+ }
nolock = btrfs_is_free_space_inode(root, inode);
@@ -1147,6 +1157,15 @@ static noinline int run_delalloc_nocow(struct inode *inode,
trans = btrfs_join_transaction(root);
if (IS_ERR(trans)) {
+ extent_clear_unlock_delalloc(inode,
+ &BTRFS_I(inode)->io_tree,
+ start, end, locked_page,
+ EXTENT_CLEAR_UNLOCK_PAGE |
+ EXTENT_CLEAR_UNLOCK |
+ EXTENT_CLEAR_DELALLOC |
+ EXTENT_CLEAR_DIRTY |
+ EXTENT_SET_WRITEBACK |
+ EXTENT_END_WRITEBACK);
btrfs_free_path(path);
return PTR_ERR(trans);
}
@@ -1327,8 +1346,11 @@ out_check:
}
btrfs_release_path(path);
- if (cur_offset <= end && cow_start == (u64)-1)
+ if (cur_offset <= end && cow_start == (u64)-1) {
cow_start = cur_offset;
+ cur_offset = end;
+ }
+
if (cow_start != (u64)-1) {
ret = cow_file_range(inode, locked_page, cow_start, end,
page_started, nr_written, 1);
@@ -1347,6 +1369,17 @@ error:
if (!ret)
ret = err;
+ if (ret && cur_offset < end)
+ extent_clear_unlock_delalloc(inode,
+ &BTRFS_I(inode)->io_tree,
+ cur_offset, end, locked_page,
+ EXTENT_CLEAR_UNLOCK_PAGE |
+ EXTENT_CLEAR_UNLOCK |
+ EXTENT_CLEAR_DELALLOC |
+ EXTENT_CLEAR_DIRTY |
+ EXTENT_SET_WRITEBACK |
+ EXTENT_END_WRITEBACK);
+
btrfs_free_path(path);
return ret;
}
--
1.7.7.6
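The general rule behind this patch, that an error exit must release whatever part of the range has not been handled yet, can be illustrated with a toy model. This is a sketch under assumptions, not the nocow code: toy_run_range() and TOY_NO_FAILURE are hypothetical names, the range is an array of lock flags indexed from start to end inclusive, and the failure is injected at a chosen index.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pass this as fail_at to simulate a run with no error. */
#define TOY_NO_FAILURE SIZE_MAX

/*
 * Walk [start, end], unlocking each slot as it is processed. If processing
 * fails at fail_at, unlock the unprocessed remainder before returning the
 * error so that nothing is left locked.
 */
static int toy_run_range(bool *locked, size_t start, size_t end,
			 size_t fail_at)
{
	size_t cur = start;
	int ret = 0;

	while (cur <= end) {
		if (cur == fail_at) {
			ret = -EIO;	/* simulated failure */
			break;
		}
		locked[cur] = false;	/* processed, so release it */
		cur++;
	}

	if (ret) {
		/* Error path: release everything from cur through end. */
		for (size_t i = cur; i <= end; i++)
			locked[i] = false;
	}
	return ret;
}
```

If the error branch is left out, the slots from the failure point onward stay locked forever, which is the kind of leftover lock that would make a later umount hang.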