linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Revert "Btrfs: race free update of commit root for ro snapshots"
@ 2014-10-15 18:37 Filipe Manana
  2014-10-15 21:11 ` Chris Mason
  0 siblings, 1 reply; 5+ messages in thread
From: Filipe Manana @ 2014-10-15 18:37 UTC (permalink / raw)
  To: linux-btrfs; +Cc: stable, clm, Filipe Manana

This reverts commit 9c3b306e1c9e6be4be09e99a8fe2227d1005effc.

Switching only one commit root during a transaction is wrong because it leads
the fs into an inconsistent state. All commit roots should be switched at once,
at transaction commit time, otherwise backref walking can often miss important
references that were only accessible through the old commit root.
Plus, the root item for the snapshot's root wasn't getting updated and preventing
the next transaction commit to do it.

This made several users get into random corruption issues after creation of
readonly snapshots.

A regression test for xfstests will follow soon.

Cc: stable@vger.kernel.org # 3.17
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/inode.c | 36 ------------------------------------
 fs/btrfs/ioctl.c | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 36 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index fc9c043..d23362f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5261,42 +5261,6 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
 			iput(inode);
 			inode = ERR_PTR(ret);
 		}
-		/*
-		 * If orphan cleanup did remove any orphans, it means the tree
-		 * was modified and therefore the commit root is not the same as
-		 * the current root anymore. This is a problem, because send
-		 * uses the commit root and therefore can see inode items that
-		 * don't exist in the current root anymore, and for example make
-		 * calls to btrfs_iget, which will do tree lookups based on the
-		 * current root and not on the commit root. Those lookups will
-		 * fail, returning a -ESTALE error, and making send fail with
-		 * that error. So make sure a send does not see any orphans we
-		 * have just removed, and that it will see the same inodes
-		 * regardless of whether a transaction commit happened before
-		 * it started (meaning that the commit root will be the same as
-		 * the current root) or not.
-		 */
-		if (sub_root->node != sub_root->commit_root) {
-			u64 sub_flags = btrfs_root_flags(&sub_root->root_item);
-
-			if (sub_flags & BTRFS_ROOT_SUBVOL_RDONLY) {
-				struct extent_buffer *eb;
-
-				/*
-				 * Assert we can't have races between dentry
-				 * lookup called through the snapshot creation
-				 * ioctl and the VFS.
-				 */
-				ASSERT(mutex_is_locked(&dir->i_mutex));
-
-				down_write(&root->fs_info->commit_root_sem);
-				eb = sub_root->commit_root;
-				sub_root->commit_root =
-					btrfs_root_node(sub_root);
-				up_write(&root->fs_info->commit_root_sem);
-				free_extent_buffer(eb);
-			}
-		}
 	}
 
 	return inode;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e732274..33c80f5 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -713,6 +713,39 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	if (ret)
 		goto fail;
 
+	ret = btrfs_orphan_cleanup(pending_snapshot->snap);
+	if (ret)
+		goto fail;
+
+	/*
+	 * If orphan cleanup did remove any orphans, it means the tree was
+	 * modified and therefore the commit root is not the same as the
+	 * current root anymore. This is a problem, because send uses the
+	 * commit root and therefore can see inode items that don't exist
+	 * in the current root anymore, and for example make calls to
+	 * btrfs_iget, which will do tree lookups based on the current root
+	 * and not on the commit root. Those lookups will fail, returning a
+	 * -ESTALE error, and making send fail with that error. So make sure
+	 * a send does not see any orphans we have just removed, and that it
+	 * will see the same inodes regardless of whether a transaction
+	 * commit happened before it started (meaning that the commit root
+	 * will be the same as the current root) or not.
+	 */
+	if (readonly && pending_snapshot->snap->node !=
+	    pending_snapshot->snap->commit_root) {
+		trans = btrfs_join_transaction(pending_snapshot->snap);
+		if (IS_ERR(trans) && PTR_ERR(trans) != -ENOENT) {
+			ret = PTR_ERR(trans);
+			goto fail;
+		}
+		if (!IS_ERR(trans)) {
+			ret = btrfs_commit_transaction(trans,
+						       pending_snapshot->snap);
+			if (ret)
+				goto fail;
+		}
+	}
+
 	inode = btrfs_lookup_dentry(dentry->d_parent->d_inode, dentry);
 	if (IS_ERR(inode)) {
 		ret = PTR_ERR(inode);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] Revert "Btrfs: race free update of commit root for ro snapshots"
  2014-10-15 18:37 [PATCH] Revert "Btrfs: race free update of commit root for ro snapshots" Filipe Manana
@ 2014-10-15 21:11 ` Chris Mason
  2014-10-15 22:42   ` john terragon
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Mason @ 2014-10-15 21:11 UTC (permalink / raw)
  To: Filipe Manana; +Cc: linux-btrfs, stable, Filipe Manana



On Wed, Oct 15, 2014 at 2:37 PM, Filipe Manana <fdmanana@gmail.com> 
wrote:
> This reverts commit 9c3b306e1c9e6be4be09e99a8fe2227d1005effc.
> 
> Switching only one commit root during a transaction is wrong because 
> it leads
> the fs into an inconsistent state. All commit roots should be 
> switched at once,
> at transaction commit time, otherwise backref walking can often miss 
> important
> references that were only accessible through the old commit root.
> Plus, the root item for the snapshot's root wasn't getting updated 
> and preventing
> the next transaction commit to do it.
> 
> This made several users get into random corruption issues after 
> creation of
> readonly snapshots.

Thanks Filipe and everyone else that helped nail this down.  I 
convinced myself that working on readonly roots made the original 
commit ok, but clearly I was horribly wrong.  Sorry guys, I'll get this 
off to Linus as soon as I do a quick cook here.

-chris



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Revert "Btrfs: race free update of commit root for ro snapshots"
  2014-10-15 21:11 ` Chris Mason
@ 2014-10-15 22:42   ` john terragon
  2014-10-15 23:05     ` Filipe David Manana
  0 siblings, 1 reply; 5+ messages in thread
From: john terragon @ 2014-10-15 22:42 UTC (permalink / raw)
  To: Chris Mason; +Cc: Filipe Manana, Btrfs BTRFS, stable, Filipe Manana

Hi.

I applied the patch to 3.17.1 but although I haven't seen any
corrupted ro snapshot yet it's still impossible to do btrfs send. As
soon as I start btrfs send I still get

ERROR: send ioctl failed with -12: Cannot allocate memory

even if I redirect btrfs send's output to a file (instead of involving
btrfs receive)

Maybe this time it's actually a btrfs-progs bug?

Thanks
John

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Revert "Btrfs: race free update of commit root for ro snapshots"
  2014-10-15 22:42   ` john terragon
@ 2014-10-15 23:05     ` Filipe David Manana
  2014-10-15 23:49       ` john terragon
  0 siblings, 1 reply; 5+ messages in thread
From: Filipe David Manana @ 2014-10-15 23:05 UTC (permalink / raw)
  To: john terragon; +Cc: Chris Mason, Btrfs BTRFS, stable, Filipe Manana

On Wed, Oct 15, 2014 at 11:42 PM, john terragon <jterragon@gmail.com> wrote:
> Hi.
>
> I applied the patch to 3.17.1 but although I haven't seen any
> corrupted ro snapshot yet it's still impossible to do btrfs send. As
> soon as I start btrfs send I still get
>
> ERROR: send ioctl failed with -12: Cannot allocate memory
>
> even if I redirect btrfs send's output to a file (instead of involving
> btrfs receive)
>
> Maybe this time it's actually a btrfs-progs bug?

Not enough information to tell.

Is it a brand new fs? If not, is it a snapshot created after applying
the patch or before? Does a btrfsck reports any issues with the fs?
Is it an incremental (using -p <parent_snapshot>) or a full send? Do
you see any warning (traces, errors) in syslog (dmesg)?

Either an issue in send or, if it's an fs created/used with unpatched
3.17.0/1, it can be a side effect of the corruption.

thanks

>
> Thanks
> John



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Revert "Btrfs: race free update of commit root for ro snapshots"
  2014-10-15 23:05     ` Filipe David Manana
@ 2014-10-15 23:49       ` john terragon
  0 siblings, 0 replies; 5+ messages in thread
From: john terragon @ 2014-10-15 23:49 UTC (permalink / raw)
  To: Filipe Manana; +Cc: Chris Mason, Btrfs BTRFS, Filipe Manana

-It's not a brand new fs. It has been created four or five days ago
with btrfs-progs 3.16.2 (in fact it was created because of the dead
unremovable ro snapshots in the previous fs)

-the snapshot in question has been created after applying the patch
(and it has not become corrupted so far)

-not an incremental send

-no warnings in dmesg

-btrfs check segfaults (as it did before the patch)

-there are in fact dead unremovable ro snapshots in the filesystem (it
has been used before the patch). But the filesystem seems functional
as long as the dead ro snapshots aren't touched. If one of them is
accessed with ls -l  I get the usual "parent transid verify failed on
X wanted Y found Z". But as I said no warnings of that kind (or any
kind) appear in dmesg when I do the send on the freshly created ro
snapshot.

thanks
john


On Thu, Oct 16, 2014 at 1:05 AM, Filipe David Manana <fdmanana@gmail.com> wrote:
> On Wed, Oct 15, 2014 at 11:42 PM, john terragon <jterragon@gmail.com> wrote:
>> Hi.
>>
>> I applied the patch to 3.17.1 but although I haven't seen any
>> corrupted ro snapshot yet it's still impossible to do btrfs send. As
>> soon as I start btrfs send I still get
>>
>> ERROR: send ioctl failed with -12: Cannot allocate memory
>>
>> even if I redirect btrfs send's output to a file (instead of involving
>> btrfs receive)
>>
>> Maybe this time it's actually a btrfs-progs bug?
>
> Not enough information to tell.
>
> Is it a brand new fs? If not, is it a snapshot created after applying
> the patch or before? Does a btrfsck reports any issues with the fs?
> Is it an incremental (using -p <parent_snapshot>) or a full send? Do
> you see any warning (traces, errors) in syslog (dmesg)?
>
> Either an issue in send or, if it's an fs created/used with unpatched
> 3.17.0/1, it can be a side effect of the corruption.
>
> thanks
>
>>
>> Thanks
>> John
>
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-10-15 23:49 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-15 18:37 [PATCH] Revert "Btrfs: race free update of commit root for ro snapshots" Filipe Manana
2014-10-15 21:11 ` Chris Mason
2014-10-15 22:42   ` john terragon
2014-10-15 23:05     ` Filipe David Manana
2014-10-15 23:49       ` john terragon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).