From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from aserp1050.oracle.com ([141.146.126.70]:28022 "EHLO aserp1050.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753331AbdCGIHo (ORCPT ); Tue, 7 Mar 2017 03:07:44 -0500 Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69]) by aserp1050.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id v276rGp0027981 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 7 Mar 2017 06:53:16 GMT Subject: Re: [PATCH v2 4/6] btrfs: Allow barrier_all_devices to do chunk level device check To: Qu Wenruo References: <20170306085855.11403-1-quwenruo@cn.fujitsu.com> <20170306085855.11403-5-quwenruo@cn.fujitsu.com> <10be6a52-355d-d23f-b534-0bb279516a56@oracle.com> <2194e420-62ff-8ee3-6311-d1cc397525fb@cn.fujitsu.com> Cc: linux-btrfs@vger.kernel.org, lists@colorremedies.com From: Anand Jain Message-ID: Date: Tue, 7 Mar 2017 14:55:46 +0800 MIME-Version: 1.0 In-Reply-To: <2194e420-62ff-8ee3-6311-d1cc397525fb@cn.fujitsu.com> Content-Type: text/plain; charset=utf-8; format=flowed Sender: linux-btrfs-owner@vger.kernel.org List-ID: > 1) About reentrancy > In previous version, the err_* bits are still put into btrfs_devices > structure, just timing of resetting these bits are changes. > > So either way, it's not reentrant in theory. > > But that doesn't make a problem, as we have other things to protect when > calling write_all_supers(), the only caller of barrier_all_devices(). > > So would you give me an example why we need to make it reentrant? Its updating the device struct I would avoid such a change for the reasons of this patch. (I notice that in V1 as well). Further btrfs does not handle online intermittent device failure, unless the complete story is taken care, I would avoid such a change. Theoretically this patch is buggy, btrfs_check_rw_degradable() has more consumers than just the barrier_all_devices(). However the dev->err_wait and dev->err_send are managed by only barrier_all_devices(). And the bad news is dev->err_wait and dev->err_send would affect the result of "missing" coming out of btrfs_check_rw_degradable() which is wrong for the threads other than barrier_all_devices(). Further, the only way dev->err_wait and dev->err_send gets reset is by the next call to barrier_all_devices(). And if lock is an answer that would makes it messy and complicated. I won't do that. There is a simple solution as below.. > 2) About using stack variable? pass err_send and err_write to btrfs_check_rw_degradable() through argument so to compute degradable for the barrier_all_devices(). In this way changes are kept local thread specific. Thanks, Anand > Did you mean build a array on stack to record which devices fails to > send/wait and use the array as check condition other than > btrfs_device->err_* and btrfs_device->missing ? > > The only problem is, it sounds more complex than needed. > Despite the err_*, we also needs to check already missing devices, so I > prefer the easy way, just checking btrfs_device->err_* and > btrfs_device->missing. > > Any simple example to explain your suggestion here? > > Thanks, > Qu > >> >> >>> Signed-off-by: Qu Wenruo >>> --- >>> fs/btrfs/disk-io.c | 15 +++++++-------- >>> fs/btrfs/volumes.c | 4 +++- >>> fs/btrfs/volumes.h | 4 ++++ >>> 3 files changed, 14 insertions(+), 9 deletions(-) >>> >>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >>> index c26b8a0b121c..f596bd130524 100644 >>> --- a/fs/btrfs/disk-io.c >>> +++ b/fs/btrfs/disk-io.c >>> @@ -3569,17 +3569,17 @@ static int barrier_all_devices(struct >>> btrfs_fs_info *info) >>> { >>> struct list_head *head; >>> struct btrfs_device *dev; >>> - int errors_send = 0; >>> - int errors_wait = 0; >>> int ret; >>> >>> /* send down all the barriers */ >>> head = &info->fs_devices->devices; >>> list_for_each_entry_rcu(dev, head, dev_list) { >>> + dev->err_wait = false; >>> + dev->err_send = false; >>> if (dev->missing) >>> continue; >>> if (!dev->bdev) { >>> - errors_send++; >>> + dev->err_send = true; >>> continue; >>> } >>> if (!dev->in_fs_metadata || !dev->writeable) >>> @@ -3587,7 +3587,7 @@ static int barrier_all_devices(struct >>> btrfs_fs_info *info) >>> >>> ret = write_dev_flush(dev, 0); >>> if (ret) >>> - errors_send++; >>> + dev->err_send = true; >>> } >>> >>> /* wait for all the barriers */ >>> @@ -3595,7 +3595,7 @@ static int barrier_all_devices(struct >>> btrfs_fs_info *info) >>> if (dev->missing) >>> continue; >>> if (!dev->bdev) { >>> - errors_wait++; >>> + dev->err_wait = true; >>> continue; >>> } >>> if (!dev->in_fs_metadata || !dev->writeable) >>> @@ -3603,10 +3603,9 @@ static int barrier_all_devices(struct >>> btrfs_fs_info *info) >>> >>> ret = write_dev_flush(dev, 1); >>> if (ret) >>> - errors_wait++; >>> + dev->err_wait = true; >>> } >>> - if (errors_send > info->num_tolerated_disk_barrier_failures || >>> - errors_wait > info->num_tolerated_disk_barrier_failures) >>> + if (!btrfs_check_rw_degradable(info)) >>> return -EIO; >>> return 0; >>> } >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>> index dd9dd94d7043..729cbd0d2b60 100644 >>> --- a/fs/btrfs/volumes.c >>> +++ b/fs/btrfs/volumes.c >>> @@ -6796,7 +6796,9 @@ bool btrfs_check_rw_degradable(struct >>> btrfs_fs_info *fs_info) >>> btrfs_get_num_tolerated_disk_barrier_failures( >>> map->type); >>> for (i = 0; i < map->num_stripes; i++) { >>> - if (map->stripes[i].dev->missing) >>> + if (map->stripes[i].dev->missing || >>> + map->stripes[i].dev->err_wait || >>> + map->stripes[i].dev->err_send) >>> missing++; >>> } This is rather wrong. >>> if (missing > max_tolerated) { >>> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h >>> index db1b5ef479cf..112fccacdabc 100644 >>> --- a/fs/btrfs/volumes.h >>> +++ b/fs/btrfs/volumes.h >>> @@ -75,6 +75,10 @@ struct btrfs_device { >>> int can_discard; >>> int is_tgtdev_for_dev_replace; >>> >>> + /* If this devices fails to send/wait dev flush */ >>> + bool err_send; >>> + bool err_wait; >> >> >> >>> #ifdef __BTRFS_NEED_DEVICE_DATA_ORDERED >>> seqcount_t data_seqcount; >>> #endif >>> >> >> > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html