From: Anand Jain <anand.jain@oracle.com>
To: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: [bug] its messy when missing device reappears after its been replaced in RAID1
Date: Tue, 07 Jan 2014 00:56:19 +0800
Message-ID: <52CAE033.3020604@oracle.com>


Test case:
Make a disk disappear, replace the missing disk (RAID1),
and then make the original disk reappear.

----
  mkfs.btrfs -f -m raid1 -d raid1 /dev/sdc /dev/sdd
  mount /dev/sdc /btrfs
  dd if=/dev/zero of=/btrfs/tf1 count=1
  btrfs fi sync /btrfs
---

devmgt[1] makes it easy to attach or detach a disk.

--
  devmgt show
  devmgt detach /dev/sdc
--

btrfs is still unaware that the device is missing.
--
  btrfs fi show -m
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
         Total devices 2 FS bytes used 32.00KiB
         devid    1 size 958.94MiB used 115.88MiB path /dev/sdc <--
         devid    2 size 958.94MiB used 103.88MiB path /dev/sdd

  btrfs rep start -f 1 /dev/sde /btrfs
  btrfs fi show -m
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
         Total devices 2 FS bytes used 32.00KiB
         devid    1 size 958.94MiB used 115.88MiB path /dev/sde
         devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
--

So far so good. Now the missing /dev/sdc comes back.

---
  devmgt attach host2

btrfs fi show -m shows sdc
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
         Total devices 2 FS bytes used 32.00KiB
         devid    1 size 958.94MiB used 115.88MiB path /dev/sdc <- Wrong.
         devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
---

This is wrong; it should be sde. It happens because when the
disk comes back, device_list_add() is called, which unconditionally
replaces the existing disk with the given disk if it has the same
fsid/devid. But the actual I/O is still going to sde, not to sdc.

Further, when we start fresh (modprobe -r btrfs), the filesystem
may pair with the wrong disk unless scanning is carefully managed
using btrfs dev scan <dev>.

Please review the following proposed fix. The patch compares
the transaction IDs (superblock generations) before the disk is
substituted.

----------------------------------------------------
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2ca91fc..b226284 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -496,14 +496,40 @@ static noinline int device_list_add(const char *path,

                 device->fs_devices = fs_devices;
         } else if (!device->name || strcmp(device->name->str, path)) {
-               name = rcu_string_strdup(path, GFP_NOFS);
-               if (!name)
-                       return -ENOMEM;
-               rcu_string_free(device->name);
-               rcu_assign_pointer(device->name, name);
-               if (device->missing) {
-                       fs_devices->missing_devices--;
-                       device->missing = 0;
+
+               struct buffer_head *bh;
+               struct btrfs_super_block *cur_disk_super;
+               u64 cur_transid;
+
+               if (!device->missing) {
+                       bh = btrfs_read_dev_super(device->bdev);
+                       if (!bh)
+                               return -EINVAL;
+
+                       cur_disk_super = (struct btrfs_super_block *)
+                                               bh->b_data;
+                       cur_transid = btrfs_super_generation(cur_disk_super);
+                       brelse(bh);
+               } else
+                       cur_transid = 0;
+
+               if (found_transid > cur_transid) {
+
+                       name = rcu_string_strdup(path, GFP_NOFS);
+                       if (!name)
+                               return -ENOMEM;
+
+                       rcu_string_free(device->name);
+                       rcu_assign_pointer(device->name, name);
+
+                       if (device->missing) {
+                               fs_devices->missing_devices--;
+                               device->missing = 0;
+                       }
+
+       printk_in_rcu(KERN_INFO "%s tran %llu replaced %s tran %llu\n",
+                               path, found_transid,
+                               rcu_str_deref(device->name), cur_transid);
                 }
         }

---------------------------------------


Thanks, Anand


[1] github.com/anajain/devmgt.git





Thread overview: 3+ messages
2014-01-06 16:56 Anand Jain [this message]
2014-01-13 10:05 ` [bug] its messy when missing device reappears after its been replaced in RAID1 Wang Shilong
2014-01-14 11:43   ` Anand Jain
