* [PATCH v3 0/2] ext4: fix race between writepages and enabling EXT4_EXTENTS_FL
From: Eric Biggers @ 2020-02-19 18:30 UTC
To: linux-ext4, Theodore Ts'o; +Cc: Jan Kara
This series fixes a race between writepages and enabling EXT4_EXTENTS_FL
that could cause a WARN_ON() in ext4_add_complete_io() to be hit. Patch
1 is a trivial renaming in preparation for patch 2 which is the actual
fix. See patch 2 for the full details.

Changed in v3:
    Do the renaming in a separate patch.

Changed in v2:
    Instead of making ext4_writepages() read EXT4_EXTENTS_FL only once,
    make it so that EXT4_EXTENTS_FL can't be changed while
    ext4_writepages() is running.

Eric Biggers (2):
  ext4: rename s_journal_flag_rwsem to s_writepages_rwsem
  ext4: fix race between writepages and enabling EXT4_EXTENTS_FL

 fs/ext4/ext4.h    |  7 +++++--
 fs/ext4/inode.c   | 14 +++++++-------
 fs/ext4/migrate.c | 27 +++++++++++++++++++--------
 fs/ext4/super.c   |  6 +++---
 4 files changed, 34 insertions(+), 20 deletions(-)

base-commit: c96dceeabf765d0b1b1f29c3bf50a5c01315b820
--
2.25.0

* [PATCH v3 1/2] ext4: rename s_journal_flag_rwsem to s_writepages_rwsem
From: Eric Biggers @ 2020-02-19 18:30 UTC
To: linux-ext4, Theodore Ts'o; +Cc: Jan Kara

From: Eric Biggers <ebiggers@google.com>

In preparation for making s_journal_flag_rwsem synchronize
ext4_writepages() with changes to both the EXTENTS and JOURNAL_DATA
flags (rather than just JOURNAL_DATA as it does currently), rename it to
s_writepages_rwsem.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/ext4/ext4.h  |  2 +-
 fs/ext4/inode.c | 14 +++++++-------
 fs/ext4/super.c |  6 +++---
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 4441331d06cc4..487a7b430b9dd 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1553,7 +1553,7 @@ struct ext4_sb_info {
 	struct ratelimit_state s_msg_ratelimit_state;
 
 	/* Barrier between changing inodes' journal flags and writepages ops. */
-	struct percpu_rw_semaphore s_journal_flag_rwsem;
+	struct percpu_rw_semaphore s_writepages_rwsem;
 	struct dax_device *s_daxdev;
 #ifdef CONFIG_EXT4_DEBUG
 	unsigned long s_simulate_fail;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c04a15fc8b6ad..f49c48ea2f170 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2628,7 +2628,7 @@ static int ext4_writepages(struct address_space *mapping,
 	if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
 		return -EIO;
 
-	percpu_down_read(&sbi->s_journal_flag_rwsem);
+	percpu_down_read(&sbi->s_writepages_rwsem);
 	trace_ext4_writepages(inode, wbc);
 
 	/*
@@ -2849,7 +2849,7 @@ static int ext4_writepages(struct address_space *mapping,
 out_writepages:
 	trace_ext4_writepages_result(inode, wbc, ret,
 				     nr_to_write - wbc->nr_to_write);
-	percpu_up_read(&sbi->s_journal_flag_rwsem);
+	percpu_up_read(&sbi->s_writepages_rwsem);
 	return ret;
 }
 
@@ -2864,13 +2864,13 @@ static int ext4_dax_writepages(struct address_space *mapping,
 	if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
 		return -EIO;
 
-	percpu_down_read(&sbi->s_journal_flag_rwsem);
+	percpu_down_read(&sbi->s_writepages_rwsem);
 	trace_ext4_writepages(inode, wbc);
 
 	ret = dax_writeback_mapping_range(mapping, inode->i_sb->s_bdev, wbc);
 	trace_ext4_writepages_result(inode, wbc, ret,
 				     nr_to_write - wbc->nr_to_write);
-	percpu_up_read(&sbi->s_journal_flag_rwsem);
+	percpu_up_read(&sbi->s_writepages_rwsem);
 	return ret;
 }
 
@@ -5861,7 +5861,7 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
 		}
 	}
 
-	percpu_down_write(&sbi->s_journal_flag_rwsem);
+	percpu_down_write(&sbi->s_writepages_rwsem);
 	jbd2_journal_lock_updates(journal);
 
 	/*
@@ -5878,7 +5878,7 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
 		err = jbd2_journal_flush(journal);
 		if (err < 0) {
 			jbd2_journal_unlock_updates(journal);
-			percpu_up_write(&sbi->s_journal_flag_rwsem);
+			percpu_up_write(&sbi->s_writepages_rwsem);
 			return err;
 		}
 		ext4_clear_inode_flag(inode, EXT4_INODE_JOURNAL_DATA);
@@ -5886,7 +5886,7 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
 	ext4_set_aops(inode);
 
 	jbd2_journal_unlock_updates(journal);
-	percpu_up_write(&sbi->s_journal_flag_rwsem);
+	percpu_up_write(&sbi->s_writepages_rwsem);
 
 	if (val)
 		up_write(&EXT4_I(inode)->i_mmap_sem);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b0b9150c97735..feb59c7ad395f 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1054,7 +1054,7 @@ static void ext4_put_super(struct super_block *sb)
 	percpu_counter_destroy(&sbi->s_freeinodes_counter);
 	percpu_counter_destroy(&sbi->s_dirs_counter);
 	percpu_counter_destroy(&sbi->s_dirtyclusters_counter);
-	percpu_free_rwsem(&sbi->s_journal_flag_rwsem);
+	percpu_free_rwsem(&sbi->s_writepages_rwsem);
 #ifdef CONFIG_QUOTA
 	for (i = 0; i < EXT4_MAXQUOTAS; i++)
 		kfree(get_qf_name(sb, sbi, i));
@@ -4600,7 +4600,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		err = percpu_counter_init(&sbi->s_dirtyclusters_counter, 0,
 					  GFP_KERNEL);
 	if (!err)
-		err = percpu_init_rwsem(&sbi->s_journal_flag_rwsem);
+		err = percpu_init_rwsem(&sbi->s_writepages_rwsem);
 
 	if (err) {
 		ext4_msg(sb, KERN_ERR, "insufficient memory");
@@ -4694,7 +4694,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 	percpu_counter_destroy(&sbi->s_freeinodes_counter);
 	percpu_counter_destroy(&sbi->s_dirs_counter);
 	percpu_counter_destroy(&sbi->s_dirtyclusters_counter);
-	percpu_free_rwsem(&sbi->s_journal_flag_rwsem);
+	percpu_free_rwsem(&sbi->s_writepages_rwsem);
 failed_mount5:
 	ext4_ext_release(sb);
 	ext4_release_system_zone(sb);

base-commit: c96dceeabf765d0b1b1f29c3bf50a5c01315b820
--
2.25.0

* Re: [PATCH v3 1/2] ext4: rename s_journal_flag_rwsem to s_writepages_rwsem
From: Jan Kara @ 2020-02-20 9:14 UTC
To: Eric Biggers; +Cc: linux-ext4, Theodore Ts'o, Jan Kara

On Wed 19-02-20 10:30:46, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> In preparation for making s_journal_flag_rwsem synchronize
> ext4_writepages() with changes to both the EXTENTS and JOURNAL_DATA
> flags (rather than just JOURNAL_DATA as it does currently), rename it to
> s_writepages_rwsem.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

The patch looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

--
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v3 1/2] ext4: rename s_journal_flag_rwsem to s_writepages_rwsem
From: Theodore Y. Ts'o @ 2020-02-21 18:53 UTC
To: Jan Kara; +Cc: Eric Biggers, linux-ext4

On Thu, Feb 20, 2020 at 10:14:58AM +0100, Jan Kara wrote:
> On Wed 19-02-20 10:30:46, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> >
> > In preparation for making s_journal_flag_rwsem synchronize
> > ext4_writepages() with changes to both the EXTENTS and JOURNAL_DATA
> > flags (rather than just JOURNAL_DATA as it does currently), rename it to
> > s_writepages_rwsem.
> >
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
>
> The patch looks good to me. You can add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>

Thanks, applied.

					- Ted

* [PATCH v3 2/2] ext4: fix race between writepages and enabling EXT4_EXTENTS_FL
From: Eric Biggers @ 2020-02-19 18:30 UTC
To: linux-ext4, Theodore Ts'o; +Cc: Jan Kara

From: Eric Biggers <ebiggers@google.com>

If EXT4_EXTENTS_FL is set on an inode while ext4_writepages() is running
on it, the following warning in ext4_add_complete_io() can be hit:

WARNING: CPU: 1 PID: 0 at fs/ext4/page-io.c:234 ext4_put_io_end_defer+0xf0/0x120

Here's a minimal reproducer (not 100% reliable) (root isn't required):

        while true; do
                sync
        done &
        while true; do
                rm -f file
                touch file
                chattr -e file
                echo X >> file
                chattr +e file
        done

The problem is that in ext4_writepages(), ext4_should_dioread_nolock()
(which only returns true on extent-based files) is checked once to set
the number of reserved journal credits, and also again later to select
the flags for ext4_map_blocks() and copy the reserved journal handle to
ext4_io_end::handle.  But if EXT4_EXTENTS_FL is being concurrently set,
the first check can see dioread_nolock disabled while the later one can
see it enabled, causing the reserved handle to unexpectedly be NULL.

Since changing EXT4_EXTENTS_FL is uncommon, and there may be other races
related to doing so as well, fix this by synchronizing changing
EXT4_EXTENTS_FL with ext4_writepages() via the existing
s_writepages_rwsem (previously called s_journal_flag_rwsem).

This was originally reported by syzbot without a reproducer at
https://syzkaller.appspot.com/bug?extid=2202a584a00fffd19fbf,
but now that dioread_nolock is the default I also started seeing this
when running syzkaller locally.

Reported-by: syzbot+2202a584a00fffd19fbf@syzkaller.appspotmail.com
Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io")
Cc: stable@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/ext4/ext4.h    |  5 ++++-
 fs/ext4/migrate.c | 27 +++++++++++++++++++--------
 2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 487a7b430b9dd..0a59006c621a0 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1552,7 +1552,10 @@ struct ext4_sb_info {
 	struct ratelimit_state s_warning_ratelimit_state;
 	struct ratelimit_state s_msg_ratelimit_state;
 
-	/* Barrier between changing inodes' journal flags and writepages ops. */
+	/*
+	 * Barrier between writepages ops and changing any inode's JOURNAL_DATA
+	 * or EXTENTS flag.
+	 */
 	struct percpu_rw_semaphore s_writepages_rwsem;
 	struct dax_device *s_daxdev;
 #ifdef CONFIG_EXT4_DEBUG
diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index 89725fa425732..fb6520f371355 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -407,6 +407,7 @@ static int free_ext_block(handle_t *handle, struct inode *inode)
 
 int ext4_ext_migrate(struct inode *inode)
 {
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	handle_t *handle;
 	int retval = 0, i;
 	__le32 *i_data;
@@ -431,6 +432,8 @@ int ext4_ext_migrate(struct inode *inode)
 		 */
 		return retval;
 
+	percpu_down_write(&sbi->s_writepages_rwsem);
+
 	/*
 	 * Worst case we can touch the allocation bitmaps, a bgd
 	 * block, and a block to link in the orphan list.  We do need
@@ -441,7 +444,7 @@ int ext4_ext_migrate(struct inode *inode)
 
 	if (IS_ERR(handle)) {
 		retval = PTR_ERR(handle);
-		return retval;
+		goto out_unlock;
 	}
 	goal = (((inode->i_ino - 1) / EXT4_INODES_PER_GROUP(inode->i_sb)) *
 		EXT4_INODES_PER_GROUP(inode->i_sb)) + 1;
@@ -452,7 +455,7 @@ int ext4_ext_migrate(struct inode *inode)
 	if (IS_ERR(tmp_inode)) {
 		retval = PTR_ERR(tmp_inode);
 		ext4_journal_stop(handle);
-		return retval;
+		goto out_unlock;
 	}
 	i_size_write(tmp_inode, i_size_read(inode));
 	/*
@@ -494,7 +497,7 @@ int ext4_ext_migrate(struct inode *inode)
 		 */
 		ext4_orphan_del(NULL, tmp_inode);
 		retval = PTR_ERR(handle);
-		goto out;
+		goto out_tmp_inode;
 	}
 
 	ei = EXT4_I(inode);
@@ -576,10 +579,11 @@ int ext4_ext_migrate(struct inode *inode)
 	ext4_ext_tree_init(handle, tmp_inode);
 out_stop:
 	ext4_journal_stop(handle);
-out:
+out_tmp_inode:
 	unlock_new_inode(tmp_inode);
 	iput(tmp_inode);
-
+out_unlock:
+	percpu_up_write(&sbi->s_writepages_rwsem);
 	return retval;
 }
 
@@ -589,7 +593,8 @@ int ext4_ext_migrate(struct inode *inode)
 int ext4_ind_migrate(struct inode *inode)
 {
 	struct ext4_extent_header *eh;
-	struct ext4_super_block *es = EXT4_SB(inode->i_sb)->s_es;
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+	struct ext4_super_block *es = sbi->s_es;
 	struct ext4_inode_info *ei = EXT4_I(inode);
 	struct ext4_extent *ex;
 	unsigned int i, len;
@@ -613,9 +618,13 @@ int ext4_ind_migrate(struct inode *inode)
 	if (test_opt(inode->i_sb, DELALLOC))
 		ext4_alloc_da_blocks(inode);
 
+	percpu_down_write(&sbi->s_writepages_rwsem);
+
 	handle = ext4_journal_start(inode, EXT4_HT_MIGRATE, 1);
-	if (IS_ERR(handle))
-		return PTR_ERR(handle);
+	if (IS_ERR(handle)) {
+		ret = PTR_ERR(handle);
+		goto out_unlock;
+	}
 
 	down_write(&EXT4_I(inode)->i_data_sem);
 	ret = ext4_ext_check_inode(inode);
@@ -650,5 +659,7 @@ int ext4_ind_migrate(struct inode *inode)
 errout:
 	ext4_journal_stop(handle);
 	up_write(&EXT4_I(inode)->i_data_sem);
+out_unlock:
+	percpu_up_write(&sbi->s_writepages_rwsem);
 	return ret;
 }
--
2.25.0

* Re: [PATCH v3 2/2] ext4: fix race between writepages and enabling EXT4_EXTENTS_FL
From: Jan Kara @ 2020-02-20 9:15 UTC
To: Eric Biggers; +Cc: linux-ext4, Theodore Ts'o, Jan Kara

On Wed 19-02-20 10:30:47, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> If EXT4_EXTENTS_FL is set on an inode while ext4_writepages() is running
> on it, the following warning in ext4_add_complete_io() can be hit:
>
> WARNING: CPU: 1 PID: 0 at fs/ext4/page-io.c:234 ext4_put_io_end_defer+0xf0/0x120
>
> Here's a minimal reproducer (not 100% reliable) (root isn't required):
>
>         while true; do
>                 sync
>         done &
>         while true; do
>                 rm -f file
>                 touch file
>                 chattr -e file
>                 echo X >> file
>                 chattr +e file
>         done
>
> The problem is that in ext4_writepages(), ext4_should_dioread_nolock()
> (which only returns true on extent-based files) is checked once to set
> the number of reserved journal credits, and also again later to select
> the flags for ext4_map_blocks() and copy the reserved journal handle to
> ext4_io_end::handle.  But if EXT4_EXTENTS_FL is being concurrently set,
> the first check can see dioread_nolock disabled while the later one can
> see it enabled, causing the reserved handle to unexpectedly be NULL.
>
> Since changing EXT4_EXTENTS_FL is uncommon, and there may be other races
> related to doing so as well, fix this by synchronizing changing
> EXT4_EXTENTS_FL with ext4_writepages() via the existing
> s_writepages_rwsem (previously called s_journal_flag_rwsem).
>
> This was originally reported by syzbot without a reproducer at
> https://syzkaller.appspot.com/bug?extid=2202a584a00fffd19fbf,
> but now that dioread_nolock is the default I also started seeing this
> when running syzkaller locally.
>
> Reported-by: syzbot+2202a584a00fffd19fbf@syzkaller.appspotmail.com
> Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io")
> Cc: stable@kernel.org
> Signed-off-by: Eric Biggers <ebiggers@google.com>

The patch looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

--
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v3 2/2] ext4: fix race between writepages and enabling EXT4_EXTENTS_FL
From: Theodore Y. Ts'o @ 2020-02-21 18:53 UTC
To: Jan Kara; +Cc: Eric Biggers, linux-ext4

On Thu, Feb 20, 2020 at 10:15:48AM +0100, Jan Kara wrote:
> On Wed 19-02-20 10:30:47, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> >
> > If EXT4_EXTENTS_FL is set on an inode while ext4_writepages() is running
> > on it, the following warning in ext4_add_complete_io() can be hit:
> >
> > WARNING: CPU: 1 PID: 0 at fs/ext4/page-io.c:234 ext4_put_io_end_defer+0xf0/0x120
> >
> > Here's a minimal reproducer (not 100% reliable) (root isn't required):
> >
> >         while true; do
> >                 sync
> >         done &
> >         while true; do
> >                 rm -f file
> >                 touch file
> >                 chattr -e file
> >                 echo X >> file
> >                 chattr +e file
> >         done
> >
> > The problem is that in ext4_writepages(), ext4_should_dioread_nolock()
> > (which only returns true on extent-based files) is checked once to set
> > the number of reserved journal credits, and also again later to select
> > the flags for ext4_map_blocks() and copy the reserved journal handle to
> > ext4_io_end::handle.  But if EXT4_EXTENTS_FL is being concurrently set,
> > the first check can see dioread_nolock disabled while the later one can
> > see it enabled, causing the reserved handle to unexpectedly be NULL.
> >
> > Since changing EXT4_EXTENTS_FL is uncommon, and there may be other races
> > related to doing so as well, fix this by synchronizing changing
> > EXT4_EXTENTS_FL with ext4_writepages() via the existing
> > s_writepages_rwsem (previously called s_journal_flag_rwsem).
> >
> > This was originally reported by syzbot without a reproducer at
> > https://syzkaller.appspot.com/bug?extid=2202a584a00fffd19fbf,
> > but now that dioread_nolock is the default I also started seeing this
> > when running syzkaller locally.
> >
> > Reported-by: syzbot+2202a584a00fffd19fbf@syzkaller.appspotmail.com
> > Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io")
> > Cc: stable@kernel.org
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
>
> The patch looks good to me. You can add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>

Thanks, applied.

					- Ted