linux-btrfs.vger.kernel.org archive mirror
From: Anand Jain <anand.jain@oracle.com>
To: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: [bug] its messy when missing device reappears after its been replaced in RAID1
Date: Tue, 07 Jan 2014 00:56:19 +0800
Message-ID: <52CAE033.3020604@oracle.com>


test case:
make a disk disappear, replace the disappeared disk (RAID1),
and then make the disappeared disk reappear.

----
  mkfs.btrfs -f -m raid1 -d raid1 /dev/sdc /dev/sdd
  mount /dev/sdc /btrfs
  dd if=/dev/zero of=/btrfs/tf1 count=1
  btrfs fi sync /btrfs
---

devmgt[1] will help to attach or detach a disk easily

--
  devmgt show
  devmgt detach /dev/sdc
--

btrfs is still unaware that the device is missing.
--
  btrfs fi show -m
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
         Total devices 2 FS bytes used 32.00KiB
         devid    1 size 958.94MiB used 115.88MiB path /dev/sdc <--
         devid    2 size 958.94MiB used 103.88MiB path /dev/sdd

  btrfs rep start -f 1 /dev/sde /btrfs
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
         Total devices 2 FS bytes used 32.00KiB
         devid    1 size 958.94MiB used 115.88MiB path /dev/sde
         devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
--

so far so good. now the missing /dev/sdc comes back.

---
  devmgt attach host2

btrfs fi show -m shows sdc
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
         Total devices 2 FS bytes used 32.00KiB
         devid    1 size 958.94MiB used 115.88MiB path /dev/sdc <- Wrong.
         devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
---

this is wrong, it should be sde. this happened because when the
disk comes back, device_list_add() is called, which invariably
replaces the existing disk with the given disk that has the same fsid/devid.
But the actual IO is still going to sde, not to sdc.
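
To make the mismatch concrete, here is a small userspace model of it. This
is only a sketch, not kernel code: struct model_device, its fields, and
model_scan_unconditional() are hypothetical names invented for illustration.
It assumes the scan path only rewrites the recorded name and leaves the
already opened device alone.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of one entry in the scanned-device list. */
struct model_device {
	uint64_t devid;
	char path[32];     /* name that "btrfs fi show" would report */
	char io_path[32];  /* device that is actually open for IO */
};

/*
 * Models the unconditional behaviour described above: scanning any
 * device with a matching fsid/devid overwrites the recorded path.
 * The device used for IO is not touched.
 */
static void model_scan_unconditional(struct model_device *dev,
				     const char *path)
{
	assert(dev != NULL && path != NULL);

	if (strcmp(dev->path, path) != 0)
		snprintf(dev->path, sizeof(dev->path), "%s", path);
}
```

Starting from an entry with path and io_path both set to /dev/sde and then
calling model_scan_unconditional(&dev, "/dev/sdc") leaves path as /dev/sdc
while io_path is still /dev/sde, which is the state the output above shows.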

Further, when we start fresh (modprobe -r btrfs),
unless it is carefully managed using btrfs dev scan <dev>,
it may pair with the wrong disk.

Need your review of the following proposed fix. This patch
compares the transid (superblock generation) before the disk is substituted.
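
The decision rule the patch is aiming for can be sketched in isolation as
below. This is a userspace illustration under my own assumptions, not the
kernel code: should_adopt_scanned_path() and its parameter cur_missing are
hypothetical names, while found_transid and cur_transid mirror the variables
used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: should a newly scanned path replace the one
 * already recorded for this fsid/devid?
 *
 * found_transid: generation read from the scanned device's superblock
 * cur_transid:   generation of the device currently recorded
 * cur_missing:   true if the recorded device is marked missing
 */
static bool should_adopt_scanned_path(uint64_t found_transid,
				      uint64_t cur_transid,
				      bool cur_missing)
{
	/* A missing device has no superblock to compare against. */
	if (cur_missing)
		cur_transid = 0;

	/* Only a strictly newer generation wins. */
	return found_transid > cur_transid;
}

/* Small self-check showing the intended outcomes. */
static void should_adopt_self_check(void)
{
	assert(!should_adopt_scanned_path(5, 9, false)); /* stale disk */
	assert(should_adopt_scanned_path(9, 5, false));  /* newer disk */
	assert(should_adopt_scanned_path(1, 9, true));   /* fills a missing slot */
}
```

Note that with a strict comparison a device with an equal generation does
not replace the recorded one, so this only helps if the replacement target
has committed at least one transaction after the old disk went away.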

----------------------------------------------------
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2ca91fc..b226284 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -496,14 +496,39 @@ static noinline int device_list_add(const char *path,

                 device->fs_devices = fs_devices;
         } else if (!device->name || strcmp(device->name->str, path)) {
-               name = rcu_string_strdup(path, GFP_NOFS);
-               if (!name)
-                       return -ENOMEM;
-               rcu_string_free(device->name);
-               rcu_assign_pointer(device->name, name);
-               if (device->missing) {
-                       fs_devices->missing_devices--;
-                       device->missing = 0;
+
+               struct buffer_head *bh;
+               struct btrfs_super_block *cur_disk_super;
+               u64 cur_transid;
+
+               if (!device->missing) {
+                       bh = btrfs_read_dev_super(device->bdev);
+                       if (!bh)
+                               return -EINVAL;
+
+                       cur_disk_super = (struct btrfs_super_block *)
+						bh->b_data;
+                       cur_transid = btrfs_super_generation(cur_disk_super);
+               } else
+                       cur_transid = 0;
+
+               if (found_transid > cur_transid) {
+
+                       name = rcu_string_strdup(path, GFP_NOFS);
+                       if (!name)
+                               return -ENOMEM;
+
+                       rcu_string_free(device->name);
+                       rcu_assign_pointer(device->name, name);
+
+                       if (device->missing) {
+                               fs_devices->missing_devices--;
+                               device->missing = 0;
+                       }
+
+       printk_in_rcu(KERN_INFO "%s tran %llu replaced %s tran %llu\n",
+                               path, found_transid,
+                               rcu_str_deref(device->name), cur_transid);
                 }
         }

---------------------------------------


Thanks, Anand


[1] github.com/anajain/devmgt.git



