linux-btrfs.vger.kernel.org archive mirror
From: Anand Jain <anand.jain@oracle.com>
To: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
Cc: linux-btrfs@vger.kernel.org, dsterba@suse.cz
Subject: Re: [PATCH v4 00/13] Introduce device state 'failed', spare device and auto replace
Date: Thu, 14 Apr 2016 17:57:40 +0800
Message-ID: <570F6994.9030605@oracle.com>
In-Reply-To: <20160414092246.GB17024@jeknote.loshitsa1.net>



On 04/14/2016 05:22 PM, Yauhen Kharuzhy wrote:
> On Thu, Apr 14, 2016 at 04:45:11PM +0800, Anand Jain wrote:
>>
>> Thanks for the report! More below.
>>
>>   You may use the simpler devmgt tool, https://github.com/asj/devmgt
>
> Thanks, will try.
>
>>
>>   You are failing the replace target, presumably while the replace is
>>   still running. Note, however, that this patch set does not fail the
>>   replace target on errors (as of now I have no idea how to do that
>>   without leading to a messy situation), so it follows the original
>>   code path, as it would without this patch set.
>>   Next, originally, without this patch set, we don't close any device
>>   on errors. So when you delete the device at the block layer and
>>   re-attach (scan) it, you most probably get a newer device path
>>   to the block device (which kind of defeats the idea of testing
>>   an intermittently disappearing device). So I doubt whether the test
>>   case is reliable, whether the above panic is btrfs related, and
>>   whether it is related to this patch set.
>
> No, it is fixed by my latest patch (about the s_bdev field in the
> superblock). The actual sequence which leads to the oops is:
> 1) FS is mounted, s_bdev is NULL
> 2) the failed device is closed, s_bdev is untouched
> 3) the missing device is replaced, s_bdev is set to non-NULL: the bdev
>    of the replaced device
> 4) at the second device closing, s_bdev is "changed" to the first device
>    from the device list, but it is... some device, because the closed
>    dev has still not been deleted from the list!
> 5) after the device closing, s_bdev points to an invalid bdev.
> 6) umount -> sync_filesystem() -> sync_blockdev(s_bdev) -> OOPS.
>

  This is wrong. It should be the other way around: s_bdev
  should continue to be NULL. And if s_bdev continues to be NULL,
  the sync thread will fail safe.

  The diff sent in the other thread will fix this.
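
  To make the two outcomes concrete, here is a rough userspace sketch in C.
  It is only a model under stated assumptions, not kernel code: the
  model_* types and functions are hypothetical stand-ins, the NULL check
  in model_sync_blockdev() is an assumption about what "fail-safe" means
  for the sync path, and step 4 follows one reading of the report (s_bdev
  is moved to the head of a device list that still holds a closed device).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's
 * struct block_device or struct super_block. */
struct model_bdev {
	bool open;			/* false once the device is closed */
};

struct model_sb {
	struct model_bdev *s_bdev;
};

enum model_sync {
	MODEL_SYNC_DONE,		/* synced a live device */
	MODEL_SYNC_NOTHING,		/* s_bdev was NULL, nothing to do */
	MODEL_SYNC_STALE		/* touched a closed device: the oops case */
};

/* Assumed behaviour of the sync path: a NULL bdev is simply skipped. */
static enum model_sync model_sync_blockdev(const struct model_bdev *bdev)
{
	if (!bdev)
		return MODEL_SYNC_NOTHING;
	return bdev->open ? MODEL_SYNC_DONE : MODEL_SYNC_STALE;
}

/* Steps 1) to 6) as reported in the quoted sequence. */
static enum model_sync model_reported_sequence(void)
{
	struct model_bdev failed = { .open = true };
	struct model_bdev target = { .open = true };
	/* Device list; a closed device is not removed from it. */
	struct model_bdev *list[] = { &failed, &target };
	struct model_sb sb = { .s_bdev = NULL };	/* 1) mounted */

	failed.open = false;		/* 2) closed, s_bdev untouched */
	sb.s_bdev = &target;		/* 3) replace sets s_bdev */
	target.open = false;		/* 4) second device closing ... */
	sb.s_bdev = list[0];		/*    ... s_bdev moved to list head */
	/* 5) s_bdev now points to a closed device */
	return model_sync_blockdev(sb.s_bdev);	/* 6) umount -> sync */
}

/* Expected behaviour: s_bdev is never set, so it stays NULL. */
static enum model_sync model_expected_sequence(void)
{
	struct model_sb sb = { .s_bdev = NULL };

	return model_sync_blockdev(sb.s_bdev);
}
```

  In this model, model_reported_sequence() ends in the stale case, while
  model_expected_sequence() ends with nothing to sync, which is the
  fail-safe outcome described above.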

Thanks, Anand
