* [PATCH 0/3] Fix ext4 deadlock when running xfstests/269 @ 2013-12-10 10:13 Jan Kara 2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Jan Kara @ 2013-12-10 10:13 UTC (permalink / raw) To: Ted Tso; +Cc: Akira Fujita, linux-ext4 Hello, these three patches fix a deadlock Akira-san has been observing for some time when running xfstest's test 269 for long enough. The first two patches are preparatory and the third patch fixes the real problem - calling ext4_should_retry_alloc() from a wrong locking context. Honza ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed 2013-12-10 10:13 [PATCH 0/3] Fix ext4 deadlock when running xfstests/269 Jan Kara @ 2013-12-10 10:13 ` Jan Kara 2013-12-12 6:44 ` Zheng Liu 2013-12-18 5:45 ` Theodore Ts'o 2013-12-10 10:13 ` [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() Jan Kara 2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara 2 siblings, 2 replies; 12+ messages in thread From: Jan Kara @ 2013-12-10 10:13 UTC (permalink / raw) To: Ted Tso; +Cc: Akira Fujita, linux-ext4, Jan Kara Similarly as other ->write_begin functions in ext4, also ext4_da_write_inline_data_begin() should retry allocation if the conversion failed because of ENOSPC. This avoids returning ENOSPC prematurely because of uncommitted block deletions. Signed-off-by: Jan Kara <jack@suse.cz> --- fs/ext4/inline.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index bae987549dc3..ed6e71fe5e9d 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -849,11 +849,13 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping, handle_t *handle; struct page *page; struct ext4_iloc iloc; + int retries; ret = ext4_get_inode_loc(inode, &iloc); if (ret) return ret; +retry_journal: handle = ext4_journal_start(inode, EXT4_HT_INODE, 1); if (IS_ERR(handle)) { ret = PTR_ERR(handle); @@ -875,6 +877,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping, inode, flags, fsdata); + ext4_journal_stop(handle); + handle = NULL; + if (ret == -ENOSPC && + ext4_should_retry_alloc(inode->i_sb, &retries)) + goto retry_journal; goto out; } -- 1.8.1.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed 2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara @ 2013-12-12 6:44 ` Zheng Liu 2013-12-12 9:39 ` Jan Kara 2013-12-18 5:45 ` Theodore Ts'o 1 sibling, 1 reply; 12+ messages in thread From: Zheng Liu @ 2013-12-12 6:44 UTC (permalink / raw) To: Jan Kara; +Cc: Ted Tso, Akira Fujita, linux-ext4 Hi Jan, On Tue, Dec 10, 2013 at 11:13:58AM +0100, Jan Kara wrote: > Similarly as other ->write_begin functions in ext4, also > ext4_da_write_inline_data_begin() should retry allocation if the > conversion failed because of ENOSPC. This avoids returning ENOSPC > prematurely because of uncommitted block deletions. I don't see why we need to try again here. If there is no enough space in inline data, ext4_da_convert_inline_data_to_extent() just read the inline data into a page and clear EXT4_STATE_MAY_INLINE_DATA flag. If we try again, we will encounter an ENOSPC error again. So why do we need to retry allocation? Thanks, - Zheng > > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/ext4/inline.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c > index bae987549dc3..ed6e71fe5e9d 100644 > --- a/fs/ext4/inline.c > +++ b/fs/ext4/inline.c > @@ -849,11 +849,13 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping, > handle_t *handle; > struct page *page; > struct ext4_iloc iloc; > + int retries; > > ret = ext4_get_inode_loc(inode, &iloc); > if (ret) > return ret; > > +retry_journal: > handle = ext4_journal_start(inode, EXT4_HT_INODE, 1); > if (IS_ERR(handle)) { > ret = PTR_ERR(handle); > @@ -875,6 +877,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping, > inode, > flags, > fsdata); > + ext4_journal_stop(handle); > + handle = NULL; > + if (ret == -ENOSPC && > + ext4_should_retry_alloc(inode->i_sb, &retries)) > + goto retry_journal; > goto out; > } > > -- > 1.8.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed 2013-12-12 6:44 ` Zheng Liu @ 2013-12-12 9:39 ` Jan Kara 2013-12-12 10:30 ` Zheng Liu 0 siblings, 1 reply; 12+ messages in thread From: Jan Kara @ 2013-12-12 9:39 UTC (permalink / raw) To: Zheng Liu; +Cc: Jan Kara, Ted Tso, Akira Fujita, linux-ext4 On Thu 12-12-13 14:44:22, Zheng Liu wrote: > Hi Jan, > > On Tue, Dec 10, 2013 at 11:13:58AM +0100, Jan Kara wrote: > > Similarly as other ->write_begin functions in ext4, also > > ext4_da_write_inline_data_begin() should retry allocation if the > > conversion failed because of ENOSPC. This avoids returning ENOSPC > > prematurely because of uncommitted block deletions. > > I don't see why we need to try again here. If there is no enough space > in inline data, ext4_da_convert_inline_data_to_extent() just read the > inline data into a page and clear EXT4_STATE_MAY_INLINE_DATA flag. If > we try again, we will encounter an ENOSPC error again. So why do we > need to retry allocation? Umm, I don't think so. ext4_da_convert_inline_data_to_extent() calls ext4_da_get_block_prep() and that can falsely fail with ENOSPC if there are blocks freed in currently running transaction. So we call ext4_should_retry_alloc() which forces commit of the current transaction and then we retry the allocation. Honza > > > > Signed-off-by: Jan Kara <jack@suse.cz> > > --- > > fs/ext4/inline.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c > > index bae987549dc3..ed6e71fe5e9d 100644 > > --- a/fs/ext4/inline.c > > +++ b/fs/ext4/inline.c > > @@ -849,11 +849,13 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping, > > handle_t *handle; > > struct page *page; > > struct ext4_iloc iloc; > > + int retries; > > > > ret = ext4_get_inode_loc(inode, &iloc); > > if (ret) > > return ret; > > > > +retry_journal: > > handle = ext4_journal_start(inode, EXT4_HT_INODE, 1); > > if (IS_ERR(handle)) { > > ret = PTR_ERR(handle); > > @@ -875,6 +877,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping, > > inode, > > flags, > > fsdata); > > + ext4_journal_stop(handle); > > + handle = NULL; > > + if (ret == -ENOSPC && > > + ext4_should_retry_alloc(inode->i_sb, &retries)) > > + goto retry_journal; > > goto out; > > } > > > > -- > > 1.8.1.4 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed 2013-12-12 9:39 ` Jan Kara @ 2013-12-12 10:30 ` Zheng Liu 0 siblings, 0 replies; 12+ messages in thread From: Zheng Liu @ 2013-12-12 10:30 UTC (permalink / raw) To: Jan Kara; +Cc: Ted Tso, Akira Fujita, linux-ext4 On Thu, Dec 12, 2013 at 10:39:36AM +0100, Jan Kara wrote: > On Thu 12-12-13 14:44:22, Zheng Liu wrote: > > Hi Jan, > > > > On Tue, Dec 10, 2013 at 11:13:58AM +0100, Jan Kara wrote: > > > Similarly as other ->write_begin functions in ext4, also > > > ext4_da_write_inline_data_begin() should retry allocation if the > > > conversion failed because of ENOSPC. This avoids returning ENOSPC > > > prematurely because of uncommitted block deletions. > > > > I don't see why we need to try again here. If there is no enough space > > in inline data, ext4_da_convert_inline_data_to_extent() just read the > > inline data into a page and clear EXT4_STATE_MAY_INLINE_DATA flag. If > > we try again, we will encounter an ENOSPC error again. So why do we > > need to retry allocation? > Umm, I don't think so. ext4_da_convert_inline_data_to_extent() calls > ext4_da_get_block_prep() and that can falsely fail with ENOSPC if there are > blocks freed in currently running transaction. So we call > ext4_should_retry_alloc() which forces commit of the current transaction > and then we retry the allocation. Ah, I see. Thanks for your explanation. - Zheng > > Honza > > > > > > > Signed-off-by: Jan Kara <jack@suse.cz> > > > --- > > > fs/ext4/inline.c | 7 +++++++ > > > 1 file changed, 7 insertions(+) > > > > > > diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c > > > index bae987549dc3..ed6e71fe5e9d 100644 > > > --- a/fs/ext4/inline.c > > > +++ b/fs/ext4/inline.c > > > @@ -849,11 +849,13 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping, > > > handle_t *handle; > > > struct page *page; > > > struct ext4_iloc iloc; > > > + int retries; > > > > > > ret = ext4_get_inode_loc(inode, &iloc); > > > if (ret) > > > return ret; > > > > > > +retry_journal: > > > handle = ext4_journal_start(inode, EXT4_HT_INODE, 1); > > > if (IS_ERR(handle)) { > > > ret = PTR_ERR(handle); > > > @@ -875,6 +877,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping, > > > inode, > > > flags, > > > fsdata); > > > + ext4_journal_stop(handle); > > > + handle = NULL; > > > + if (ret == -ENOSPC && > > > + ext4_should_retry_alloc(inode->i_sb, &retries)) > > > + goto retry_journal; > > > goto out; > > > } > > > > > > -- > > > 1.8.1.4 > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > Jan Kara <jack@suse.cz> > SUSE Labs, CR ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed 2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara 2013-12-12 6:44 ` Zheng Liu @ 2013-12-18 5:45 ` Theodore Ts'o 1 sibling, 0 replies; 12+ messages in thread From: Theodore Ts'o @ 2013-12-18 5:45 UTC (permalink / raw) To: Jan Kara; +Cc: Akira Fujita, linux-ext4 On Tue, Dec 10, 2013 at 11:13:58AM +0100, Jan Kara wrote: > Similarly as other ->write_begin functions in ext4, also > ext4_da_write_inline_data_begin() should retry allocation if the > conversion failed because of ENOSPC. This avoids returning ENOSPC > prematurely because of uncommitted block deletions. > > Signed-off-by: Jan Kara <jack@suse.cz> Thanks, applied. - Ted ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() 2013-12-10 10:13 [PATCH 0/3] Fix ext4 deadlock when running xfstests/269 Jan Kara 2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara @ 2013-12-10 10:13 ` Jan Kara 2013-12-18 5:45 ` Theodore Ts'o 2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara 2 siblings, 1 reply; 12+ messages in thread From: Jan Kara @ 2013-12-10 10:13 UTC (permalink / raw) To: Ted Tso; +Cc: Akira Fujita, linux-ext4, Jan Kara The function has a bit non-standard (for ext4) error recovery in that it used a mix of 'out' labels and testing for 'handle' being NULL. There isn't a good reason for that in the function so clean it up a bit. Signed-off-by: Jan Kara <jack@suse.cz> --- fs/ext4/inline.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index ed6e71fe5e9d..c417e52d194e 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -859,7 +859,6 @@ retry_journal: handle = ext4_journal_start(inode, EXT4_HT_INODE, 1); if (IS_ERR(handle)) { ret = PTR_ERR(handle); - handle = NULL; goto out; } @@ -869,7 +868,7 @@ retry_journal: if (inline_size >= pos + len) { ret = ext4_prepare_inline_data(handle, inode, pos + len); if (ret && ret != -ENOSPC) - goto out; + goto out_journal; } if (ret == -ENOSPC) { @@ -878,7 +877,6 @@ retry_journal: flags, fsdata); ext4_journal_stop(handle); - handle = NULL; if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) goto retry_journal; @@ -894,7 +892,7 @@ retry_journal: page = grab_cache_page_write_begin(mapping, 0, flags); if (!page) { ret = -ENOMEM; - goto out; + goto out_journal; } down_read(&EXT4_I(inode)->xattr_sem); @@ -911,16 +909,15 @@ retry_journal: up_read(&EXT4_I(inode)->xattr_sem); *pagep = page; - handle = NULL; brelse(iloc.bh); return 1; out_release_page: up_read(&EXT4_I(inode)->xattr_sem); unlock_page(page); page_cache_release(page); +out_journal: + ext4_journal_stop(handle); out: - if (handle) - ext4_journal_stop(handle); brelse(iloc.bh); return ret; } -- 1.8.1.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() 2013-12-10 10:13 ` [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() Jan Kara @ 2013-12-18 5:45 ` Theodore Ts'o 0 siblings, 0 replies; 12+ messages in thread From: Theodore Ts'o @ 2013-12-18 5:45 UTC (permalink / raw) To: Jan Kara; +Cc: Akira Fujita, linux-ext4 On Tue, Dec 10, 2013 at 11:13:59AM +0100, Jan Kara wrote: > The function has a bit non-standard (for ext4) error recovery in that it > used a mix of 'out' labels and testing for 'handle' being NULL. There > isn't a good reason for that in the function so clean it up a bit. > > Signed-off-by: Jan Kara <jack@suse.cz> Thanks, applied. - Ted ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions 2013-12-10 10:13 [PATCH 0/3] Fix ext4 deadlock when running xfstests/269 Jan Kara 2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara 2013-12-10 10:13 ` [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() Jan Kara @ 2013-12-10 10:14 ` Jan Kara 2013-12-12 6:51 ` Zheng Liu 2013-12-18 5:45 ` Theodore Ts'o 2 siblings, 2 replies; 12+ messages in thread From: Jan Kara @ 2013-12-10 10:14 UTC (permalink / raw) To: Ted Tso; +Cc: Akira Fujita, linux-ext4, Jan Kara Akira-san has been reporting rare deadlocks of his machine when running xfstests test 269 on ext4 filesystem. The problem turned out to be in ext4_da_reserve_metadata() and ext4_da_reserve_space() which called ext4_should_retry_alloc() while holding i_data_sem. Since ext4_should_retry_alloc() can force a transaction commit, this is a lock ordering violation and leads to deadlocks. Fix the problem by just removing the retry loops. These functions should just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that function must take care of retrying after dropping all necessary locks. Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/ext4/inode.c | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 075763474118..61d49ff22c81 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1206,7 +1206,6 @@ static int ext4_journalled_write_end(struct file *file, */ static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock) { - int retries = 0; struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); struct ext4_inode_info *ei = EXT4_I(inode); unsigned int md_needed; @@ -1218,7 +1217,6 @@ static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock) * in order to allocate nrblocks * worse case is one extent per block */ -repeat: spin_lock(&ei->i_block_reservation_lock); /* * ext4_calc_metadata_amount() has side effects, which we have @@ -1238,10 +1236,6 @@ repeat: ei->i_da_metadata_calc_len = save_len; ei->i_da_metadata_calc_last_lblock = save_last_lblock; spin_unlock(&ei->i_block_reservation_lock); - if (ext4_should_retry_alloc(inode->i_sb, &retries)) { - cond_resched(); - goto repeat; - } return -ENOSPC; } ei->i_reserved_meta_blocks += md_needed; @@ -1255,7 +1249,6 @@ repeat: */ static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock) { - int retries = 0; struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); struct ext4_inode_info *ei = EXT4_I(inode); unsigned int md_needed; @@ -1277,7 +1270,6 @@ static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock) * in order to allocate nrblocks * worse case is one extent per block */ -repeat: spin_lock(&ei->i_block_reservation_lock); /* * ext4_calc_metadata_amount() has side effects, which we have @@ -1297,10 +1289,6 @@ repeat: ei->i_da_metadata_calc_len = save_len; ei->i_da_metadata_calc_last_lblock = save_last_lblock; spin_unlock(&ei->i_block_reservation_lock); - if (ext4_should_retry_alloc(inode->i_sb, &retries)) { - cond_resched(); - goto repeat; - } dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1)); return -ENOSPC; } -- 1.8.1.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions 2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara @ 2013-12-12 6:51 ` Zheng Liu 2013-12-12 9:41 ` Jan Kara 2013-12-18 5:45 ` Theodore Ts'o 1 sibling, 1 reply; 12+ messages in thread From: Zheng Liu @ 2013-12-12 6:51 UTC (permalink / raw) To: Jan Kara; +Cc: Ted Tso, Akira Fujita, linux-ext4 On Tue, Dec 10, 2013 at 11:14:00AM +0100, Jan Kara wrote: > Akira-san has been reporting rare deadlocks of his machine when running > xfstests test 269 on ext4 filesystem. The problem turned out to be in > ext4_da_reserve_metadata() and ext4_da_reserve_space() which called > ext4_should_retry_alloc() while holding i_data_sem. Since > ext4_should_retry_alloc() can force a transaction commit, this is a > lock ordering violation and leads to deadlocks. > > Fix the problem by just removing the retry loops. These functions should > just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that > function must take care of retrying after dropping all necessary locks. > > Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com> > Signed-off-by: Jan Kara <jack@suse.cz> Thanks for fixing this. The patch looks good to me. You can add: Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> BTW, I have met a deadlock which is caused by ext4_da_reserve_space() in our product system. The calltrace information looks like this. So I want to make sure it is the root cause. But I couldn't reproduce the problem with running xfstest #269. Could you please tell me how to reproduce the deadlock? FWIW, I think we should backport this patch to stable kernel. Thanks, - Zheng > --- > fs/ext4/inode.c | 12 ------------ > 1 file changed, 12 deletions(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 075763474118..61d49ff22c81 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -1206,7 +1206,6 @@ static int ext4_journalled_write_end(struct file *file, > */ > static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock) > { > - int retries = 0; > struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); > struct ext4_inode_info *ei = EXT4_I(inode); > unsigned int md_needed; > @@ -1218,7 +1217,6 @@ static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock) > * in order to allocate nrblocks > * worse case is one extent per block > */ > -repeat: > spin_lock(&ei->i_block_reservation_lock); > /* > * ext4_calc_metadata_amount() has side effects, which we have > @@ -1238,10 +1236,6 @@ repeat: > ei->i_da_metadata_calc_len = save_len; > ei->i_da_metadata_calc_last_lblock = save_last_lblock; > spin_unlock(&ei->i_block_reservation_lock); > - if (ext4_should_retry_alloc(inode->i_sb, &retries)) { > - cond_resched(); > - goto repeat; > - } > return -ENOSPC; > } > ei->i_reserved_meta_blocks += md_needed; > @@ -1255,7 +1249,6 @@ repeat: > */ > static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock) > { > - int retries = 0; > struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); > struct ext4_inode_info *ei = EXT4_I(inode); > unsigned int md_needed; > @@ -1277,7 +1270,6 @@ static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock) > * in order to allocate nrblocks > * worse case is one extent per block > */ > -repeat: > spin_lock(&ei->i_block_reservation_lock); > /* > * ext4_calc_metadata_amount() has side effects, which we have > @@ -1297,10 +1289,6 @@ repeat: > ei->i_da_metadata_calc_len = save_len; > ei->i_da_metadata_calc_last_lblock = save_last_lblock; > spin_unlock(&ei->i_block_reservation_lock); > - if (ext4_should_retry_alloc(inode->i_sb, &retries)) { > - cond_resched(); > - goto repeat; > - } > dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1)); > return -ENOSPC; > } > -- > 1.8.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions 2013-12-12 6:51 ` Zheng Liu @ 2013-12-12 9:41 ` Jan Kara 0 siblings, 0 replies; 12+ messages in thread From: Jan Kara @ 2013-12-12 9:41 UTC (permalink / raw) To: Zheng Liu; +Cc: Jan Kara, Ted Tso, Akira Fujita, linux-ext4 On Thu 12-12-13 14:51:16, Zheng Liu wrote: > On Tue, Dec 10, 2013 at 11:14:00AM +0100, Jan Kara wrote: > > Akira-san has been reporting rare deadlocks of his machine when running > > xfstests test 269 on ext4 filesystem. The problem turned out to be in > > ext4_da_reserve_metadata() and ext4_da_reserve_space() which called > > ext4_should_retry_alloc() while holding i_data_sem. Since > > ext4_should_retry_alloc() can force a transaction commit, this is a > > lock ordering violation and leads to deadlocks. > > > > Fix the problem by just removing the retry loops. These functions should > > just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that > > function must take care of retrying after dropping all necessary locks. > > > > Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com> > > Signed-off-by: Jan Kara <jack@suse.cz> > > Thanks for fixing this. The patch looks good to me. You can add: > Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> > > BTW, I have met a deadlock which is caused by ext4_da_reserve_space() > in our product system. The calltrace information looks like this. So > I want to make sure it is the root cause. But I couldn't reproduce the > problem with running xfstest #269. Could you please tell me how to > reproduce the deadlock? I couldn't reproduce it either but Akira was able to reproduce it (but it took him a long time as well). > FWIW, I think we should backport this patch to stable kernel. Agreed. Honza > > --- > > fs/ext4/inode.c | 12 ------------ > > 1 file changed, 12 deletions(-) > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > index 075763474118..61d49ff22c81 100644 > > --- a/fs/ext4/inode.c > > +++ b/fs/ext4/inode.c > > @@ -1206,7 +1206,6 @@ static int ext4_journalled_write_end(struct file *file, > > */ > > static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock) > > { > > - int retries = 0; > > struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); > > struct ext4_inode_info *ei = EXT4_I(inode); > > unsigned int md_needed; > > @@ -1218,7 +1217,6 @@ static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock) > > * in order to allocate nrblocks > > * worse case is one extent per block > > */ > > -repeat: > > spin_lock(&ei->i_block_reservation_lock); > > /* > > * ext4_calc_metadata_amount() has side effects, which we have > > @@ -1238,10 +1236,6 @@ repeat: > > ei->i_da_metadata_calc_len = save_len; > > ei->i_da_metadata_calc_last_lblock = save_last_lblock; > > spin_unlock(&ei->i_block_reservation_lock); > > - if (ext4_should_retry_alloc(inode->i_sb, &retries)) { > > - cond_resched(); > > - goto repeat; > > - } > > return -ENOSPC; > > } > > ei->i_reserved_meta_blocks += md_needed; > > @@ -1255,7 +1249,6 @@ repeat: > > */ > > static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock) > > { > > - int retries = 0; > > struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); > > struct ext4_inode_info *ei = EXT4_I(inode); > > unsigned int md_needed; > > @@ -1277,7 +1270,6 @@ static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock) > > * in order to allocate nrblocks > > * worse case is one extent per block > > */ > > -repeat: > > spin_lock(&ei->i_block_reservation_lock); > > /* > > * ext4_calc_metadata_amount() has side effects, which we have > > @@ -1297,10 +1289,6 @@ repeat: > > ei->i_da_metadata_calc_len = save_len; > > ei->i_da_metadata_calc_last_lblock = save_last_lblock; > > spin_unlock(&ei->i_block_reservation_lock); > > - if (ext4_should_retry_alloc(inode->i_sb, &retries)) { > > - cond_resched(); > > - goto repeat; > > - } > > dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1)); > > return -ENOSPC; > > } > > -- > > 1.8.1.4 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions 2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara 2013-12-12 6:51 ` Zheng Liu @ 2013-12-18 5:45 ` Theodore Ts'o 1 sibling, 0 replies; 12+ messages in thread From: Theodore Ts'o @ 2013-12-18 5:45 UTC (permalink / raw) To: Jan Kara; +Cc: Akira Fujita, linux-ext4 On Tue, Dec 10, 2013 at 11:14:00AM +0100, Jan Kara wrote: > Akira-san has been reporting rare deadlocks of his machine when running > xfstests test 269 on ext4 filesystem. The problem turned out to be in > ext4_da_reserve_metadata() and ext4_da_reserve_space() which called > ext4_should_retry_alloc() while holding i_data_sem. Since > ext4_should_retry_alloc() can force a transaction commit, this is a > lock ordering violation and leads to deadlocks. > > Fix the problem by just removing the retry loops. These functions should > just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that > function must take care of retrying after dropping all necessary locks. > > Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com> > Signed-off-by: Jan Kara <jack@suse.cz> Thanks, applied. - Ted ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2013-12-18 5:46 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-12-10 10:13 [PATCH 0/3] Fix ext4 deadlock when running xfstests/269 Jan Kara 2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara 2013-12-12 6:44 ` Zheng Liu 2013-12-12 9:39 ` Jan Kara 2013-12-12 10:30 ` Zheng Liu 2013-12-18 5:45 ` Theodore Ts'o 2013-12-10 10:13 ` [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() Jan Kara 2013-12-18 5:45 ` Theodore Ts'o 2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara 2013-12-12 6:51 ` Zheng Liu 2013-12-12 9:41 ` Jan Kara 2013-12-18 5:45 ` Theodore Ts'o
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).