* [PATCH 1/2] btrfs: Simplify snapshot exclusion code
From: Nikolay Borisov @ 2019-04-23 11:42 UTC
To: linux-btrfs; +Cc: Nikolay Borisov

BTRFS sports a mechanism to provide exclusion when a snapshot is about
to be created. This is implemented via btrfs_start_write_no_snapshotting
et al. Currently the implementation of that mechanism is some perverse
amalgamation of a percpu variable, an explicit waitqueue, an atomic_t
variable and an implicit wait bit on said atomic_t via the wait_var_event
family of APIs. And for good measure there is a memory barrier thrown in
the mix...

The astute reader should have concluded by now that it's bordering on
impossible to prove whether this scheme works. What's worse, all of
this is required to achieve something really simple: ensure certain
operations cannot run during snapshot creation. Let's simplify this by
relying on a single atomic_t used as a boolean flag. This commit changes
only the implementation and not the semantics of the existing mechanism.

Now, if the atomic is 1 (snapshot is in progress), callers of
btrfs_start_write_no_snapshotting will get a return value of 0 that
should be handled accordingly.

btrfs_wait_for_snapshot_creation, on the other hand, will block while a
snapshot is in progress, return once that snapshot has finished, and
thereby acquire the right to create a snapshot.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/extent-tree.c | 20 +++++---------------
 fs/btrfs/ioctl.c       |  9 ++-------
 2 files changed, 7 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 8f2b7b29c3fd..d9e2e35700fd 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -11333,25 +11333,15 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
  */
 void btrfs_end_write_no_snapshotting(struct btrfs_root *root)
 {
-	percpu_counter_dec(&root->subv_writers->counter);
-	cond_wake_up(&root->subv_writers->wait);
+	ASSERT(atomic_read(&root->will_be_snapshotted) == 1);
+	if (atomic_dec_and_test(&root->will_be_snapshotted))
+		wake_up_var(&root->will_be_snapshotted);
 }
 
 int btrfs_start_write_no_snapshotting(struct btrfs_root *root)
 {
-	if (atomic_read(&root->will_be_snapshotted))
-		return 0;
-
-	percpu_counter_inc(&root->subv_writers->counter);
-	/*
-	 * Make sure counter is updated before we check for snapshot creation.
-	 */
-	smp_mb();
-	if (atomic_read(&root->will_be_snapshotted)) {
-		btrfs_end_write_no_snapshotting(root);
-		return 0;
-	}
-	return 1;
+	ASSERT(atomic_read(&root->will_be_snapshotted) >= 0);
+	return atomic_add_unless(&root->will_be_snapshotted, 1, 1);
 }
 
 void btrfs_wait_for_snapshot_creation(struct btrfs_root *root)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 8774d4be7c97..f9f66c8a5dad 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -794,11 +794,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	 * possible. This is to avoid later writeback (running dealloc) to
 	 * fallback to COW mode and unexpectedly fail with ENOSPC.
 	 */
-	atomic_inc(&root->will_be_snapshotted);
-	smp_mb__after_atomic();
-	/* wait for no snapshot writes */
-	wait_event(root->subv_writers->wait,
-		   percpu_counter_sum(&root->subv_writers->counter) == 0);
+	btrfs_wait_for_snapshot_creation(root);
 
 	ret = btrfs_start_delalloc_snapshot(root);
 	if (ret)
@@ -878,8 +874,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 dec_and_free:
 	if (snapshot_force_cow)
 		atomic_dec(&root->snapshot_force_cow);
-	if (atomic_dec_and_test(&root->will_be_snapshotted))
-		wake_up_var(&root->will_be_snapshotted);
+	btrfs_end_write_no_snapshotting(root);
 free_pending:
 	kfree(pending_snapshot->root_item);
 	btrfs_free_path(pending_snapshot->path);
-- 
2.17.1

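For readers who want to poke at the flag semantics outside the kernel, here is a rough userspace sketch of what the patched start/end helpers do. It is not kernel code: model_add_unless() is a hypothetical C11 stand-in for atomic_add_unless() (add `a` unless the value equals `u`, report whether the add happened), the model_* names are made up for illustration, and the wake_up_var() call is left out because there is no waiter in this model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the kernel's atomic_add_unless():
 * add `a` to *v unless *v == u, and return true if the add happened.
 */
static bool model_add_unless(atomic_int *v, int a, int u)
{
	int cur = atomic_load(v);

	while (cur != u) {
		/* On failure, cur is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &cur, cur + a))
			return true;
	}
	return false;
}

/* Models the patched btrfs_start_write_no_snapshotting(): try to take the flag. */
static bool model_start_write(atomic_int *flag)
{
	assert(atomic_load(flag) >= 0);
	return model_add_unless(flag, 1, 1);
}

/* Models the patched btrfs_end_write_no_snapshotting(): drop the flag. */
static void model_end_write(atomic_int *flag)
{
	assert(atomic_load(flag) == 1);
	atomic_fetch_sub(flag, 1);
	/* The kernel version wakes waiters via wake_up_var() here; omitted. */
}
```

With the flag at 0, model_start_write() moves it to 1 and returns true; any further call returns false until model_end_write() drops it back to 0. In other words, the atomic behaves like a try-lock.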
* [PATCH 2/2] btrfs: Remove dead code 2019-04-23 11:42 [PATCH 1/2] btrfs: Simplify snapshot exclusion code Nikolay Borisov @ 2019-04-23 11:42 ` Nikolay Borisov 2019-04-23 13:58 ` [PATCH v2] " Nikolay Borisov 2019-04-24 9:49 ` [PATCH 1/2] btrfs: Simplify snapshot exclusion code Filipe Manana 1 sibling, 1 reply; 6+ messages in thread From: Nikolay Borisov @ 2019-04-23 11:42 UTC (permalink / raw) To: linux-btrfs; +Cc: Nikolay Borisov BTRFS no longer relies on btrfs_subvolume_writers for snapshot exclusion. Just remove any code allocating/freeing it and the structure definition itself. Signed-off-by: Nikolay Borisov <nborisov@suse.com> --- fs/btrfs/ctree.h | 6 ------ fs/btrfs/disk-io.c | 10 ---------- 2 files changed, 16 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 05731e4ca358..9fdb7ab74102 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1169,11 +1169,6 @@ static inline struct btrfs_fs_info *btrfs_sb(struct super_block *sb) return sb->s_fs_info; } -struct btrfs_subvolume_writers { - struct percpu_counter counter; - wait_queue_head_t wait; -}; - /* * The state of btrfs root */ @@ -1339,7 +1334,6 @@ struct btrfs_root { * manipulation with the read-only status via SUBVOL_SETFLAGS */ int send_in_progress; - struct btrfs_subvolume_writers *subv_writers; atomic_t will_be_snapshotted; atomic_t snapshot_force_cow; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 005c9f5c6f10..ad2fa12cc654 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1492,7 +1492,6 @@ struct btrfs_root *btrfs_read_fs_root(struct btrfs_root *tree_root, int btrfs_init_fs_root(struct btrfs_root *root) { int ret; - struct btrfs_subvolume_writers *writers; root->free_ino_ctl = kzalloc(sizeof(*root->free_ino_ctl), GFP_NOFS); root->free_ino_pinned = kzalloc(sizeof(*root->free_ino_pinned), @@ -1502,13 +1501,6 @@ int btrfs_init_fs_root(struct btrfs_root *root) goto fail; } - writers = btrfs_alloc_subvolume_writers(); - if (IS_ERR(writers)) { - ret = 
PTR_ERR(writers); - goto fail; - } - root->subv_writers = writers; - btrfs_init_free_ino_ctl(root); spin_lock_init(&root->ino_cache_lock); init_waitqueue_head(&root->ino_cache_wait); @@ -3870,8 +3862,6 @@ void btrfs_free_fs_root(struct btrfs_root *root) WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree)); if (root->anon_dev) free_anon_bdev(root->anon_dev); - if (root->subv_writers) - btrfs_free_subvolume_writers(root->subv_writers); free_extent_buffer(root->node); free_extent_buffer(root->commit_root); kfree(root->free_ino_ctl); -- 2.17.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2] btrfs: Remove dead code 2019-04-23 11:42 ` [PATCH 2/2] btrfs: Remove dead code Nikolay Borisov @ 2019-04-23 13:58 ` Nikolay Borisov 2019-04-23 14:49 ` [PATCH v3] " Nikolay Borisov 0 siblings, 1 reply; 6+ messages in thread From: Nikolay Borisov @ 2019-04-23 13:58 UTC (permalink / raw) To: linux-btrfs; +Cc: Nikolay Borisov BTRFS no longer relies on btrfs_subvolume_writers for snapshot exclusion. Just remove any code allocating/freeing it and the structure definition itself. Signed-off-by: Nikolay Borisov <nborisov@suse.com> --- Changes in v2: * Remove definition of btrfs_alloc_subvolume_writers. fs/btrfs/ctree.h | 6 ------ fs/btrfs/disk-io.c | 29 ----------------------------- 2 files changed, 35 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 05731e4ca358..9fdb7ab74102 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1169,11 +1169,6 @@ static inline struct btrfs_fs_info *btrfs_sb(struct super_block *sb) return sb->s_fs_info; } -struct btrfs_subvolume_writers { - struct percpu_counter counter; - wait_queue_head_t wait; -}; - /* * The state of btrfs root */ @@ -1339,7 +1334,6 @@ struct btrfs_root { * manipulation with the read-only status via SUBVOL_SETFLAGS */ int send_in_progress; - struct btrfs_subvolume_writers *subv_writers; atomic_t will_be_snapshotted; atomic_t snapshot_force_cow; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 005c9f5c6f10..411678c88047 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1130,25 +1130,6 @@ void btrfs_clean_tree_block(struct extent_buffer *buf) } } -static struct btrfs_subvolume_writers *btrfs_alloc_subvolume_writers(void) -{ - struct btrfs_subvolume_writers *writers; - int ret; - - writers = kmalloc(sizeof(*writers), GFP_NOFS); - if (!writers) - return ERR_PTR(-ENOMEM); - - ret = percpu_counter_init(&writers->counter, 0, GFP_NOFS); - if (ret < 0) { - kfree(writers); - return ERR_PTR(ret); - } - - init_waitqueue_head(&writers->wait); - return writers; -} - static void 
btrfs_free_subvolume_writers(struct btrfs_subvolume_writers *writers) { @@ -1492,7 +1473,6 @@ struct btrfs_root *btrfs_read_fs_root(struct btrfs_root *tree_root, int btrfs_init_fs_root(struct btrfs_root *root) { int ret; - struct btrfs_subvolume_writers *writers; root->free_ino_ctl = kzalloc(sizeof(*root->free_ino_ctl), GFP_NOFS); root->free_ino_pinned = kzalloc(sizeof(*root->free_ino_pinned), @@ -1502,13 +1482,6 @@ int btrfs_init_fs_root(struct btrfs_root *root) goto fail; } - writers = btrfs_alloc_subvolume_writers(); - if (IS_ERR(writers)) { - ret = PTR_ERR(writers); - goto fail; - } - root->subv_writers = writers; - btrfs_init_free_ino_ctl(root); spin_lock_init(&root->ino_cache_lock); init_waitqueue_head(&root->ino_cache_wait); @@ -3870,8 +3843,6 @@ void btrfs_free_fs_root(struct btrfs_root *root) WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree)); if (root->anon_dev) free_anon_bdev(root->anon_dev); - if (root->subv_writers) - btrfs_free_subvolume_writers(root->subv_writers); free_extent_buffer(root->node); free_extent_buffer(root->commit_root); kfree(root->free_ino_ctl); -- 2.17.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v3] btrfs: Remove dead code 2019-04-23 13:58 ` [PATCH v2] " Nikolay Borisov @ 2019-04-23 14:49 ` Nikolay Borisov 0 siblings, 0 replies; 6+ messages in thread From: Nikolay Borisov @ 2019-04-23 14:49 UTC (permalink / raw) To: linux-btrfs; +Cc: Nikolay Borisov BTRFS no longer relies on btrfs_subvolume_writers for snapshot exclusion. Just remove any code allocating/freeing it and the structure definition itself. Signed-off-by: Nikolay Borisov <nborisov@suse.com> --- Changes in v3: * Removed -btrfs_free_subvolume_writers. This is really the final piece of the puzzle.... Changes in v2: * Remove definition of btrfs_alloc_subvolume_writers. fs/btrfs/ctree.h | 6 ------ fs/btrfs/disk-io.c | 36 ------------------------------------ 2 files changed, 42 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 05731e4ca358..9fdb7ab74102 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1169,11 +1169,6 @@ static inline struct btrfs_fs_info *btrfs_sb(struct super_block *sb) return sb->s_fs_info; } -struct btrfs_subvolume_writers { - struct percpu_counter counter; - wait_queue_head_t wait; -}; - /* * The state of btrfs root */ @@ -1339,7 +1334,6 @@ struct btrfs_root { * manipulation with the read-only status via SUBVOL_SETFLAGS */ int send_in_progress; - struct btrfs_subvolume_writers *subv_writers; atomic_t will_be_snapshotted; atomic_t snapshot_force_cow; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 005c9f5c6f10..4702227d9ddd 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1130,32 +1130,6 @@ void btrfs_clean_tree_block(struct extent_buffer *buf) } } -static struct btrfs_subvolume_writers *btrfs_alloc_subvolume_writers(void) -{ - struct btrfs_subvolume_writers *writers; - int ret; - - writers = kmalloc(sizeof(*writers), GFP_NOFS); - if (!writers) - return ERR_PTR(-ENOMEM); - - ret = percpu_counter_init(&writers->counter, 0, GFP_NOFS); - if (ret < 0) { - kfree(writers); - return ERR_PTR(ret); - } - - 
init_waitqueue_head(&writers->wait); - return writers; -} - -static void -btrfs_free_subvolume_writers(struct btrfs_subvolume_writers *writers) -{ - percpu_counter_destroy(&writers->counter); - kfree(writers); -} - static void __setup_root(struct btrfs_root *root, struct btrfs_fs_info *fs_info, u64 objectid) { @@ -1492,7 +1466,6 @@ struct btrfs_root *btrfs_read_fs_root(struct btrfs_root *tree_root, int btrfs_init_fs_root(struct btrfs_root *root) { int ret; - struct btrfs_subvolume_writers *writers; root->free_ino_ctl = kzalloc(sizeof(*root->free_ino_ctl), GFP_NOFS); root->free_ino_pinned = kzalloc(sizeof(*root->free_ino_pinned), @@ -1502,13 +1475,6 @@ int btrfs_init_fs_root(struct btrfs_root *root) goto fail; } - writers = btrfs_alloc_subvolume_writers(); - if (IS_ERR(writers)) { - ret = PTR_ERR(writers); - goto fail; - } - root->subv_writers = writers; - btrfs_init_free_ino_ctl(root); spin_lock_init(&root->ino_cache_lock); init_waitqueue_head(&root->ino_cache_wait); @@ -3870,8 +3836,6 @@ void btrfs_free_fs_root(struct btrfs_root *root) WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree)); if (root->anon_dev) free_anon_bdev(root->anon_dev); - if (root->subv_writers) - btrfs_free_subvolume_writers(root->subv_writers); free_extent_buffer(root->node); free_extent_buffer(root->commit_root); kfree(root->free_ino_ctl); -- 2.17.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] btrfs: Simplify snapshot exclusion code
From: Filipe Manana @ 2019-04-24 9:49 UTC
To: Nikolay Borisov; +Cc: linux-btrfs

On Tue, Apr 23, 2019 at 12:43 PM Nikolay Borisov <nborisov@suse.com> wrote:
>
> BTRFS sports a mechanism to provide exclusion when a snapshot is about
> to be created. This is implemented via btrfs_start_write_no_snapshotting
> et al. Currently the implementation of that mechanism is some perverse
> amalgamation of a percpu variable, an explicit waitqueue, an atomic_t
> variable and an implicit wait bit on said atomic_t via wait_var_event
> family of API. And for good measure there is a memory barrier thrown in
> the mix...
>
> Astute reader should have concluded by now that it's bordering on
> impossible to prove whether this scheme works. What's worse - all of
> this is required to achieve something really simple - ensure certain
> operations cannot run during snapshot creation. Let's simplify this by
> relying on a single atomic_t used as a boolean flag.

Nope, it can't work as a boolean, see below.

> This commit changes
> only the implementation and not the semantics of the existing mechanism.
>
> Now, if the atomic is 1 (snapshot is in progress) callers of
> btrfs_start_write_no_snapshotting will get a ret val of 0 that should be
> handled accordingly.
>
> btrfs_wait_for_snapshot_creation OTOH will block until snapshotting is
> in progress and return when current snapshot in progress is finished and
> will acquire the right to create a snapshot.
>
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> ---
>  fs/btrfs/extent-tree.c | 20 +++++---------------
>  fs/btrfs/ioctl.c       |  9 ++-------
>  2 files changed, 7 insertions(+), 22 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 8f2b7b29c3fd..d9e2e35700fd 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -11333,25 +11333,15 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>   */
>  void btrfs_end_write_no_snapshotting(struct btrfs_root *root)
>  {
> -	percpu_counter_dec(&root->subv_writers->counter);
> -	cond_wake_up(&root->subv_writers->wait);
> +	ASSERT(atomic_read(&root->will_be_snapshotted) == 1);
> +	if (atomic_dec_and_test(&root->will_be_snapshotted))
> +		wake_up_var(&root->will_be_snapshotted);
>  }
>
>  int btrfs_start_write_no_snapshotting(struct btrfs_root *root)
>  {
> -	if (atomic_read(&root->will_be_snapshotted))
> -		return 0;
> -
> -	percpu_counter_inc(&root->subv_writers->counter);
> -	/*
> -	 * Make sure counter is updated before we check for snapshot creation.
> -	 */
> -	smp_mb();
> -	if (atomic_read(&root->will_be_snapshotted)) {
> -		btrfs_end_write_no_snapshotting(root);
> -		return 0;
> -	}
> -	return 1;
> +	ASSERT(atomic_read(&root->will_be_snapshotted) >= 0);
> +	return atomic_add_unless(&root->will_be_snapshotted, 1, 1);
>  }

So if two writes call btrfs_start_write_no_snapshotting(), we end up
with root->will_be_snapshotted == 1.

One task calls btrfs_end_write_no_snapshotting(), it decrements it to
0 - we wake up the snapshot creation task while there's still one
nodatacow writer - this is incorrect.
Now the second task calls btrfs_end_write_no_snapshotting(), sees
root->will_be_snapshotted == 0, assertion failure.

>
>  void btrfs_wait_for_snapshot_creation(struct btrfs_root *root)
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 8774d4be7c97..f9f66c8a5dad 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -794,11 +794,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
>  	 * possible. This is to avoid later writeback (running dealloc) to
>  	 * fallback to COW mode and unexpectedly fail with ENOSPC.
>  	 */
> -	atomic_inc(&root->will_be_snapshotted);
> -	smp_mb__after_atomic();
> -	/* wait for no snapshot writes */
> -	wait_event(root->subv_writers->wait,
> -		   percpu_counter_sum(&root->subv_writers->counter) == 0);
> +	btrfs_wait_for_snapshot_creation(root);

This naming is also confusing now. The task that creates a snapshot is
calling btrfs_wait_for_snapshot_creation(), waiting for itself?

>
>  	ret = btrfs_start_delalloc_snapshot(root);
>  	if (ret)
> @@ -878,8 +874,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
>  dec_and_free:
>  	if (snapshot_force_cow)
>  		atomic_dec(&root->snapshot_force_cow);
> -	if (atomic_dec_and_test(&root->will_be_snapshotted))
> -		wake_up_var(&root->will_be_snapshotted);
> +	btrfs_end_write_no_snapshotting(root);

Also confusing. We are not ending a write operation, we are ending
snapshot creation.

Thanks.

>  free_pending:
>  	kfree(pending_snapshot->root_item);
>  	btrfs_free_path(pending_snapshot->path);
> --
> 2.17.1
>

-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

* Re: [PATCH 1/2] btrfs: Simplify snapshot exclusion code
From: Filipe Manana @ 2019-04-24 10:19 UTC
To: Nikolay Borisov; +Cc: linux-btrfs

On Wed, Apr 24, 2019 at 10:49 AM Filipe Manana <fdmanana@gmail.com> wrote:
>
> On Tue, Apr 23, 2019 at 12:43 PM Nikolay Borisov <nborisov@suse.com> wrote:
> >
> > BTRFS sports a mechanism to provide exclusion when a snapshot is about
> > to be created. This is implemented via btrfs_start_write_no_snapshotting
> > et al. Currently the implementation of that mechanism is some perverse
> > amalgamation of a percpu variable, an explicit waitqueue, an atomic_t
> > variable and an implicit wait bit on said atomic_t via wait_var_event
> > family of API. And for good measure there is a memory barrier thrown in
> > the mix...
> >
> > Astute reader should have concluded by now that it's bordering on
> > impossible to prove whether this scheme works. What's worse - all of
> > this is required to achieve something really simple - ensure certain
> > operations cannot run during snapshot creation. Let's simplify this by
> > relying on a single atomic_t used as a boolean flag.
>
> Nop, can't work as a boolean, see below.
>
> > This commit changes
> > only the implementation and not the semantics of the existing mechanism.
> >
> > Now, if the atomic is 1 (snapshot is in progress) callers of
> > btrfs_start_write_no_snapshotting will get a ret val of 0 that should be
> > handled accordingly.
> >
> > btrfs_wait_for_snapshot_creation OTOH will block until snapshotting is
> > in progress and return when current snapshot in progress is finished and
> > will acquire the right to create a snapshot.
> >
> > Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> > ---
> >  fs/btrfs/extent-tree.c | 20 +++++---------------
> >  fs/btrfs/ioctl.c       |  9 ++-------
> >  2 files changed, 7 insertions(+), 22 deletions(-)
> >
> > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> > index 8f2b7b29c3fd..d9e2e35700fd 100644
> > --- a/fs/btrfs/extent-tree.c
> > +++ b/fs/btrfs/extent-tree.c
> > @@ -11333,25 +11333,15 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
> >   */
> >  void btrfs_end_write_no_snapshotting(struct btrfs_root *root)
> >  {
> > -	percpu_counter_dec(&root->subv_writers->counter);
> > -	cond_wake_up(&root->subv_writers->wait);
> > +	ASSERT(atomic_read(&root->will_be_snapshotted) == 1);
> > +	if (atomic_dec_and_test(&root->will_be_snapshotted))
> > +		wake_up_var(&root->will_be_snapshotted);
> >  }
> >
> >  int btrfs_start_write_no_snapshotting(struct btrfs_root *root)
> >  {
> > -	if (atomic_read(&root->will_be_snapshotted))
> > -		return 0;
> > -
> > -	percpu_counter_inc(&root->subv_writers->counter);
> > -	/*
> > -	 * Make sure counter is updated before we check for snapshot creation.
> > -	 */
> > -	smp_mb();
> > -	if (atomic_read(&root->will_be_snapshotted)) {
> > -		btrfs_end_write_no_snapshotting(root);
> > -		return 0;
> > -	}
> > -	return 1;
> > +	ASSERT(atomic_read(&root->will_be_snapshotted) >= 0);
> > +	return atomic_add_unless(&root->will_be_snapshotted, 1, 1);
> >  }
>
> So if two writes call btrfs_start_write_no_snapshotting(), we end up
> with root->will_be_snapshotted == 1.
>
> One task calls btrfs_end_write_no_snapshotting(), it decrements it to
> 1 - we wake up the snapshot creation task while there's still one
> nodatacow writer - this is incorrect.
> Now the second task calls btrfs_end_write_no_snapshotting(), sees
> root->will_be_snapshotted == 0, assertion failure.

Actually, disregard that: I ignored the return value of
atomic_add_unless().

The real problem is that this change no longer allows concurrent
no-snapshot writers. When multiple tasks call
btrfs_start_write_no_snapshotting(), only one succeeds and all the
others fail, so that's a regression from what we currently have.

The comments about the confusing names still apply.

> >
> >  void btrfs_wait_for_snapshot_creation(struct btrfs_root *root)
> > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > index 8774d4be7c97..f9f66c8a5dad 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -794,11 +794,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
> >  	 * possible. This is to avoid later writeback (running dealloc) to
> >  	 * fallback to COW mode and unexpectedly fail with ENOSPC.
> >  	 */
> > -	atomic_inc(&root->will_be_snapshotted);
> > -	smp_mb__after_atomic();
> > -	/* wait for no snapshot writes */
> > -	wait_event(root->subv_writers->wait,
> > -		   percpu_counter_sum(&root->subv_writers->counter) == 0);
> > +	btrfs_wait_for_snapshot_creation(root);
>
> This naming is also confusing now. The task that creates a snapshot is
> calling btrfs_wait_for_snapshot_creation(), waiting for itself?
>
> >
> >  	ret = btrfs_start_delalloc_snapshot(root);
> >  	if (ret)
> > @@ -878,8 +874,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
> >  dec_and_free:
> >  	if (snapshot_force_cow)
> >  		atomic_dec(&root->snapshot_force_cow);
> > -	if (atomic_dec_and_test(&root->will_be_snapshotted))
> > -		wake_up_var(&root->will_be_snapshotted);
> > +	btrfs_end_write_no_snapshotting(root);
>
> Also confusing. We are not ending a write operation, we are ending
> snapshot creation.
>
> Thanks.
>
> >  free_pending:
> >  	kfree(pending_snapshot->root_item);
> >  	btrfs_free_path(pending_snapshot->path);
> > --
> > 2.17.1
> >
>
>
> --
> Filipe David Manana,
>
> “Whether you think you can, or you think you can't — you're right.”

-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

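The regression described in the last reply can be reproduced with a small userspace model. This is only a sketch under stated assumptions: struct model_root and the counter_*/flag_* helpers are hypothetical names, the plain atomic_int `writers` stands in for the percpu counter, the sequentially consistent fetch-add stands in for the increment plus smp_mb(), and the compare-and-swap from 0 to 1 is assumed to match atomic_add_unless(v, 1, 1) only while the value stays within {0, 1}.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical model of the pre-patch state: a pending-snapshot count plus
 * a writer count that stands in for the percpu counter.
 */
struct model_root {
	atomic_int will_be_snapshotted;
	atomic_int writers;
};

/* Pre-patch scheme: any number of writers may start while no snapshot is pending. */
static bool counter_start_write(struct model_root *r)
{
	if (atomic_load(&r->will_be_snapshotted))
		return false;

	/* Sequentially consistent RMW; plays the role of the inc + smp_mb(). */
	atomic_fetch_add(&r->writers, 1);
	if (atomic_load(&r->will_be_snapshotted)) {
		atomic_fetch_sub(&r->writers, 1);
		return false;
	}
	return true;
}

static void counter_end_write(struct model_root *r)
{
	int prev = atomic_fetch_sub(&r->writers, 1);

	assert(prev > 0);
	(void)prev;
}

/* Patched scheme: a single 0/1 flag, so only one caller can hold it at a time. */
static bool flag_start_write(atomic_int *flag)
{
	int expected = 0;

	return atomic_compare_exchange_strong(flag, &expected, 1);
}
```

Calling counter_start_write() twice on a zeroed model_root succeeds both times and leaves writers at 2, whereas a second flag_start_write() on the same flag fails until the first holder releases it. That is the loss of writer concurrency pointed out above.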