* Should we discard jbddirty bit if BH_Freed is set? @ 2010-01-27 2:39 丁定华 0 siblings, 0 replies; 7+ messages in thread From: 丁定华 @ 2010-01-27 2:39 UTC (permalink / raw) To: linux-ext4; +Cc: Jan Kara Hi all: I'm a little confused about BH_Freed bit. The only place it is set is journal_unmap_buffer, which is called by jbd2_journal_invalidatepage when we want to truncate a file. Since jbd2_journal_invalidatepage is called outside of transaction, We can't make sure whether the "add to orphan" operation belongs to committing transaction or not, so we can't touch the buffer belongs to committing transaction, instead BH_Freed bit is set to indicate that this buffer can be discarded in running transaction. But i think we shouldn't clear BH_JBDdirty in jbd2_journal_commit_transaction, as following codes does: /* A buffer which has been freed while still being * journaled by a previous transaction may end up still * being dirty here, but we want to avoid writing back * that buffer in the future now that the last use has * been committed. That's not only a performance gain, * it also stops aliasing problems if the buffer is left * behind for writeback and gets reallocated for another * use in a different page. */ if (buffer_freed(bh)) { clear_buffer_freed(bh); clear_buffer_jbddirty(bh); } Note that, We can't make sure "current running transaction" can complete commit work. If we clear BH_JBDdirty bit here, this buffer may be freed here, the log space of older transaction may be freed before the "current running transaction" complete commit work, and if this happends and system crashed at this moment, filesystem will be inconsistent. Above is my analysis, please let me know if it's wrong, and if it's a bug, may be I can send a patch out, thanks. -- 丁定华 -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <7bb361261001261832wb4f9ac2u96fdb6460aa45fa2@mail.gmail.com>]
* Re: Should we discard jbddirty bit if BH_Freed is set? [not found] <7bb361261001261832wb4f9ac2u96fdb6460aa45fa2@mail.gmail.com> @ 2010-01-27 12:23 ` Jan Kara 2010-01-28 1:23 ` 丁定华 0 siblings, 1 reply; 7+ messages in thread From: Jan Kara @ 2010-01-27 12:23 UTC (permalink / raw) To: 丁定华; +Cc: linux-ext4, Jan Kara Hi, On Wed 27-01-10 10:32:18, 丁定华 wrote: > I'm a little confused about BH_Freed bit. The only place it is set > is journal_unmap_buffer, which is called by jbd2_journal_invalidatepage when > we want to truncate a file. Since jbd2_journal_invalidatepage is called > outside of transaction, We can't make sure whether the "add to orphan" > operation belongs to committing transaction or not, so we can't touch the > buffer belongs to committing transaction, instead BH_Freed bit is set to > indicate that this buffer can be discarded in running transaction. But i > think we shouldn't clear BH_JBDdirty in jbd2_journal_commit_transaction, as > following codes does: > /* A buffer which has been freed while still being > * journaled by a previous transaction may end up still > * being dirty here, but we want to avoid writing back > * that buffer in the future now that the last use has > * been committed. That's not only a performance gain, > * it also stops aliasing problems if the buffer is left > * behind for writeback and gets reallocated for another > * use in a different page. */ > if (buffer_freed(bh)) { > clear_buffer_freed(bh); > clear_buffer_jbddirty(bh); > } > Note that, *We can't make sure "current running transaction" can complete > commit work.* If we clear BH_JBDdirty bit here, this buffer may be freed > here, the log space of older transaction may be freed before the "current > running transaction" complete commit work, and if this happends, filesystem > will be inconsistent. Let me sketch the situation here: The file F gets truncated. The inode is added to orphan list in some transaction T1, only then jbd2_journal_invalidatepage can be called. As you wrote above, it can happen that jbd2_journal_invalidatepage on buffer B runs when some transaction T2 containing B is being committed and in that case we set BH_Freed. If T2 != T1 - i.e., T2 is being committed and T1 is the running transaction, note that we clear the dirty bit only when T2 is fully committed and we are processing forget list. So buffer has been properly written to T2 and we just won't write it in the transaction T1. And that is fine because as soon as transaction T1 finishes commit, we don't care about what happens with buffers of F because the fact that F is truncated is recorded and in case of crash we finish truncate during journal replay. And if we crash before T1 finishes commit, we don't care about contents of T1 either. If T2 == T1, the above reasoning applies as well and the situation is even simpler. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Should we discard jbddirty bit if BH_Freed is set? 2010-01-27 12:23 ` Jan Kara @ 2010-01-28 1:23 ` 丁定华 2010-01-28 12:53 ` Jan Kara 0 siblings, 1 reply; 7+ messages in thread From: 丁定华 @ 2010-01-28 1:23 UTC (permalink / raw) To: Jan Kara; +Cc: linux-ext4 Hi, As you wrote, if T2!=T1, then T2 is committing transaction while T1 is running transaction, and if T1 complete commit, we don't care about the content of buffers. But there is a prerequisite -->"T1 complete commit", if T1 start commit and another transaction T3 becomes the new running transaction, T3 may need to reuse T2 log space and force checkpoint, and since we have clean the BH_dirty bit of buffers after T2 commits, so T2 may be freed before T1 complete commit, and unfortunately, T1 doesn't complete commit, so after replay, updates of T2 get lost, fs becomes inconsistent. 2010/1/27 Jan Kara <jack@suse.cz>: > Hi, > > On Wed 27-01-10 10:32:18, 丁定华 wrote: >> I'm a little confused about BH_Freed bit. The only place it is set >> is journal_unmap_buffer, which is called by jbd2_journal_invalidatepage when >> we want to truncate a file. Since jbd2_journal_invalidatepage is called >> outside of transaction, We can't make sure whether the "add to orphan" >> operation belongs to committing transaction or not, so we can't touch the >> buffer belongs to committing transaction, instead BH_Freed bit is set to >> indicate that this buffer can be discarded in running transaction. But i >> think we shouldn't clear BH_JBDdirty in jbd2_journal_commit_transaction, as >> following codes does: >> /* A buffer which has been freed while still being >> * journaled by a previous transaction may end up still >> * being dirty here, but we want to avoid writing back >> * that buffer in the future now that the last use has >> * been committed. That's not only a performance gain, >> * it also stops aliasing problems if the buffer is left >> * behind for writeback and gets reallocated for another >> * use in a different page. */ >> if (buffer_freed(bh)) { >> clear_buffer_freed(bh); >> clear_buffer_jbddirty(bh); >> } >> Note that, *We can't make sure "current running transaction" can complete >> commit work.* If we clear BH_JBDdirty bit here, this buffer may be freed >> here, the log space of older transaction may be freed before the "current >> running transaction" complete commit work, and if this happends, filesystem >> will be inconsistent. > Let me sketch the situation here: > The file F gets truncated. The inode is added to orphan list in some > transaction T1, only then jbd2_journal_invalidatepage can be called. > As you wrote above, it can happen that jbd2_journal_invalidatepage on > buffer B runs when some transaction T2 containing B is being committed and > in that case we set BH_Freed. If T2 != T1 - i.e., T2 is being committed > and T1 is the running transaction, note that we clear the dirty bit only > when T2 is fully committed and we are processing forget list. So buffer has > been properly written to T2 and we just won't write it in the transaction > T1. And that is fine because as soon as transaction T1 finishes commit, we > don't care about what happens with buffers of F because the fact that F is > truncated is recorded and in case of crash we finish truncate during > journal replay. And if we crash before T1 finishes commit, we don't care > about contents of T1 either. If T2 == T1, the above reasoning applies as > well and the situation is even simpler. > > Honza > -- > Jan Kara <jack@suse.cz> > SUSE Labs, CR > -- 丁定华 -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Should we discard jbddirty bit if BH_Freed is set? 2010-01-28 1:23 ` 丁定华 @ 2010-01-28 12:53 ` Jan Kara 2010-01-29 1:43 ` 丁定华 0 siblings, 1 reply; 7+ messages in thread From: Jan Kara @ 2010-01-28 12:53 UTC (permalink / raw) To: 丁定华; +Cc: Jan Kara, linux-ext4 Hi, On Thu 28-01-10 09:23:43, 丁定华 wrote: > As you wrote, if T2!=T1, then T2 is committing transaction while T1 > is running transaction, and if T1 complete commit, we don't care > about the content of buffers. But there is a prerequisite -->"T1 > complete commit", if T1 start commit and another transaction T3 > becomes the new running transaction, T3 may need to reuse T2 log > space and force checkpoint, and since we have clean the BH_dirty bit > of buffers after T2 commits, so T2 may be freed before T1 complete > commit, and unfortunately, T1 doesn't complete commit, so after > replay, updates of T2 get lost, fs becomes inconsistent. Hmm, you are right. That could probably happen. It only hits ext3/4 in data=journaled mode but the bug is there. But it's hard to fix it in a reasonable way - if we would just leave the dirty bit set, we will have aliasing problems - buffer B is attached to some page which used to be from file F, so unmap_underlying_metadata will not find it because it looks only into page cache the block device, not to the pagecaches of individual files. So if we reallocate the block of B for some other file G before the buffer B is checkpointed, we have two dirty buffers representing the same block and thus data corruption can happen. Maybe we could handle them by setting b_next_transaction to the transaction that deleted the buffer (in jbd2_journal_unmap_buffer) and setting buffer freed like we do now. Commit code would handle freed buffers like: If b_next_transaction is set, file buffer to forget list of the next transaction. If b_next_transaction isn't set, clear all dirty bits. What do you think? Honza > 2010/1/27 Jan Kara <jack@suse.cz>: > > Hi, > > > > On Wed 27-01-10 10:32:18, 丁定华 wrote: > >> I'm a little confused about BH_Freed bit. The only place it is set > >> is journal_unmap_buffer, which is called by jbd2_journal_invalidatepage when > >> we want to truncate a file. Since jbd2_journal_invalidatepage is called > >> outside of transaction, We can't make sure whether the "add to orphan" > >> operation belongs to committing transaction or not, so we can't touch the > >> buffer belongs to committing transaction, instead BH_Freed bit is set to > >> indicate that this buffer can be discarded in running transaction. But i > >> think we shouldn't clear BH_JBDdirty in jbd2_journal_commit_transaction, as > >> following codes does: > >> /* A buffer which has been freed while still being > >> * journaled by a previous transaction may end up still > >> * being dirty here, but we want to avoid writing back > >> * that buffer in the future now that the last use has > >> * been committed. That's not only a performance gain, > >> * it also stops aliasing problems if the buffer is left > >> * behind for writeback and gets reallocated for another > >> * use in a different page. */ > >> if (buffer_freed(bh)) { > >> clear_buffer_freed(bh); > >> clear_buffer_jbddirty(bh); > >> } > >> Note that, *We can't make sure "current running transaction" can complete > >> commit work.* If we clear BH_JBDdirty bit here, this buffer may be freed > >> here, the log space of older transaction may be freed before the "current > >> running transaction" complete commit work, and if this happends, filesystem > >> will be inconsistent. > > Let me sketch the situation here: > > The file F gets truncated. The inode is added to orphan list in some > > transaction T1, only then jbd2_journal_invalidatepage can be called. > > As you wrote above, it can happen that jbd2_journal_invalidatepage on > > buffer B runs when some transaction T2 containing B is being committed and > > in that case we set BH_Freed. If T2 != T1 - i.e., T2 is being committed > > and T1 is the running transaction, note that we clear the dirty bit only > > when T2 is fully committed and we are processing forget list. So buffer has > > been properly written to T2 and we just won't write it in the transaction > > T1. And that is fine because as soon as transaction T1 finishes commit, we > > don't care about what happens with buffers of F because the fact that F is > > truncated is recorded and in case of crash we finish truncate during > > journal replay. And if we crash before T1 finishes commit, we don't care > > about contents of T1 either. If T2 == T1, the above reasoning applies as > > well and the situation is even simpler. > > > > Honza > > -- > > Jan Kara <jack@suse.cz> > > SUSE Labs, CR > > > > > > -- > 丁定华 -- Jan Kara <jack@suse.cz> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Should we discard jbddirty bit if BH_Freed is set? 2010-01-28 12:53 ` Jan Kara @ 2010-01-29 1:43 ` 丁定华 2010-01-30 6:41 ` 丁定华 0 siblings, 1 reply; 7+ messages in thread From: 丁定华 @ 2010-01-29 1:43 UTC (permalink / raw) To: Jan Kara; +Cc: linux-ext4 Hi, I think what we need to do is to distinguish the buffers which can be discarded after this transaction commits and which we must wait until the next transaction commits, and as you wrote, set b_next_transaction to running transaction and file them into t_forget list is a good way. 2010/1/28 Jan Kara <jack@suse.cz>: > Hi, > > On Thu 28-01-10 09:23:43, 丁定华 wrote: >> As you wrote, if T2!=T1, then T2 is committing transaction while T1 >> is running transaction, and if T1 complete commit, we don't care >> about the content of buffers. But there is a prerequisite -->"T1 >> complete commit", if T1 start commit and another transaction T3 >> becomes the new running transaction, T3 may need to reuse T2 log >> space and force checkpoint, and since we have clean the BH_dirty bit >> of buffers after T2 commits, so T2 may be freed before T1 complete >> commit, and unfortunately, T1 doesn't complete commit, so after >> replay, updates of T2 get lost, fs becomes inconsistent. > Hmm, you are right. That could probably happen. It only hits ext3/4 in > data=journaled mode but the bug is there. But it's hard to fix it in a > reasonable way - if we would just leave the dirty bit set, we will have > aliasing problems - buffer B is attached to some page which used to be from > file F, so unmap_underlying_metadata will not find it because it looks only > into page cache the block device, not to the pagecaches of individual > files. So if we reallocate the block of B for some other file G before the > buffer B is checkpointed, we have two dirty buffers representing the same > block and thus data corruption can happen. > Maybe we could handle them by setting b_next_transaction to the > transaction that deleted the buffer (in jbd2_journal_unmap_buffer) and > setting buffer freed like we do now. Commit code would handle freed > buffers like: > If b_next_transaction is set, file buffer to forget list of the next > transaction. If b_next_transaction isn't set, clear all dirty bits. > What do you think? > > Honza > >> 2010/1/27 Jan Kara <jack@suse.cz>: >> > Hi, >> > >> > On Wed 27-01-10 10:32:18, 丁定华 wrote: >> >> I'm a little confused about BH_Freed bit. The only place it is set >> >> is journal_unmap_buffer, which is called by jbd2_journal_invalidatepage when >> >> we want to truncate a file. Since jbd2_journal_invalidatepage is called >> >> outside of transaction, We can't make sure whether the "add to orphan" >> >> operation belongs to committing transaction or not, so we can't touch the >> >> buffer belongs to committing transaction, instead BH_Freed bit is set to >> >> indicate that this buffer can be discarded in running transaction. But i >> >> think we shouldn't clear BH_JBDdirty in jbd2_journal_commit_transaction, as >> >> following codes does: >> >> /* A buffer which has been freed while still being >> >> * journaled by a previous transaction may end up still >> >> * being dirty here, but we want to avoid writing back >> >> * that buffer in the future now that the last use has >> >> * been committed. That's not only a performance gain, >> >> * it also stops aliasing problems if the buffer is left >> >> * behind for writeback and gets reallocated for another >> >> * use in a different page. */ >> >> if (buffer_freed(bh)) { >> >> clear_buffer_freed(bh); >> >> clear_buffer_jbddirty(bh); >> >> } >> >> Note that, *We can't make sure "current running transaction" can complete >> >> commit work.* If we clear BH_JBDdirty bit here, this buffer may be freed >> >> here, the log space of older transaction may be freed before the "current >> >> running transaction" complete commit work, and if this happends, filesystem >> >> will be inconsistent. >> > Let me sketch the situation here: >> > The file F gets truncated. The inode is added to orphan list in some >> > transaction T1, only then jbd2_journal_invalidatepage can be called. >> > As you wrote above, it can happen that jbd2_journal_invalidatepage on >> > buffer B runs when some transaction T2 containing B is being committed and >> > in that case we set BH_Freed. If T2 != T1 - i.e., T2 is being committed >> > and T1 is the running transaction, note that we clear the dirty bit only >> > when T2 is fully committed and we are processing forget list. So buffer has >> > been properly written to T2 and we just won't write it in the transaction >> > T1. And that is fine because as soon as transaction T1 finishes commit, we >> > don't care about what happens with buffers of F because the fact that F is >> > truncated is recorded and in case of crash we finish truncate during >> > journal replay. And if we crash before T1 finishes commit, we don't care >> > about contents of T1 either. If T2 == T1, the above reasoning applies as >> > well and the situation is even simpler. >> > >> > Honza >> > -- >> > Jan Kara <jack@suse.cz> >> > SUSE Labs, CR >> > >> >> >> >> -- >> 丁定华 > -- > Jan Kara <jack@suse.cz> > SUSE Labs, CR > -- 丁定华 -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Should we discard jbddirty bit if BH_Freed is set? 2010-01-29 1:43 ` 丁定华 @ 2010-01-30 6:41 ` 丁定华 2010-02-02 22:02 ` Jan Kara 0 siblings, 1 reply; 7+ messages in thread From: 丁定华 @ 2010-01-30 6:41 UTC (permalink / raw) To: Jan Kara; +Cc: linux-ext4 [-- Attachment #1: Type: text/plain, Size: 5711 bytes --] Hi, The attached is a patch to fix this problem, please have a look. 2010/1/29 丁定华 <dingdinghua85@gmail.com>: > Hi, > I think what we need to do is to distinguish the buffers which can > be discarded after > this transaction commits and which we must wait until the next transaction > commits, and as you wrote, set b_next_transaction to running > transaction and file them > into t_forget list is a good way. > > 2010/1/28 Jan Kara <jack@suse.cz>: >> Hi, >> >> On Thu 28-01-10 09:23:43, 丁定华 wrote: >>> As you wrote, if T2!=T1, then T2 is committing transaction while T1 >>> is running transaction, and if T1 complete commit, we don't care >>> about the content of buffers. But there is a prerequisite -->"T1 >>> complete commit", if T1 start commit and another transaction T3 >>> becomes the new running transaction, T3 may need to reuse T2 log >>> space and force checkpoint, and since we have clean the BH_dirty bit >>> of buffers after T2 commits, so T2 may be freed before T1 complete >>> commit, and unfortunately, T1 doesn't complete commit, so after >>> replay, updates of T2 get lost, fs becomes inconsistent. >> Hmm, you are right. That could probably happen. It only hits ext3/4 in >> data=journaled mode but the bug is there. But it's hard to fix it in a >> reasonable way - if we would just leave the dirty bit set, we will have >> aliasing problems - buffer B is attached to some page which used to be from >> file F, so unmap_underlying_metadata will not find it because it looks only >> into page cache the block device, not to the pagecaches of individual >> files. So if we reallocate the block of B for some other file G before the >> buffer B is checkpointed, we have two dirty buffers representing the same >> block and thus data corruption can happen. >> Maybe we could handle them by setting b_next_transaction to the >> transaction that deleted the buffer (in jbd2_journal_unmap_buffer) and >> setting buffer freed like we do now. Commit code would handle freed >> buffers like: >> If b_next_transaction is set, file buffer to forget list of the next >> transaction. If b_next_transaction isn't set, clear all dirty bits. >> What do you think? >> >> Honza >> >>> 2010/1/27 Jan Kara <jack@suse.cz>: >>> > Hi, >>> > >>> > On Wed 27-01-10 10:32:18, 丁定华 wrote: >>> >> I'm a little confused about BH_Freed bit. The only place it is set >>> >> is journal_unmap_buffer, which is called by jbd2_journal_invalidatepage when >>> >> we want to truncate a file. Since jbd2_journal_invalidatepage is called >>> >> outside of transaction, We can't make sure whether the "add to orphan" >>> >> operation belongs to committing transaction or not, so we can't touch the >>> >> buffer belongs to committing transaction, instead BH_Freed bit is set to >>> >> indicate that this buffer can be discarded in running transaction. But i >>> >> think we shouldn't clear BH_JBDdirty in jbd2_journal_commit_transaction, as >>> >> following codes does: >>> >> /* A buffer which has been freed while still being >>> >> * journaled by a previous transaction may end up still >>> >> * being dirty here, but we want to avoid writing back >>> >> * that buffer in the future now that the last use has >>> >> * been committed. That's not only a performance gain, >>> >> * it also stops aliasing problems if the buffer is left >>> >> * behind for writeback and gets reallocated for another >>> >> * use in a different page. */ >>> >> if (buffer_freed(bh)) { >>> >> clear_buffer_freed(bh); >>> >> clear_buffer_jbddirty(bh); >>> >> } >>> >> Note that, *We can't make sure "current running transaction" can complete >>> >> commit work.* If we clear BH_JBDdirty bit here, this buffer may be freed >>> >> here, the log space of older transaction may be freed before the "current >>> >> running transaction" complete commit work, and if this happends, filesystem >>> >> will be inconsistent. >>> > Let me sketch the situation here: >>> > The file F gets truncated. The inode is added to orphan list in some >>> > transaction T1, only then jbd2_journal_invalidatepage can be called. >>> > As you wrote above, it can happen that jbd2_journal_invalidatepage on >>> > buffer B runs when some transaction T2 containing B is being committed and >>> > in that case we set BH_Freed. If T2 != T1 - i.e., T2 is being committed >>> > and T1 is the running transaction, note that we clear the dirty bit only >>> > when T2 is fully committed and we are processing forget list. So buffer has >>> > been properly written to T2 and we just won't write it in the transaction >>> > T1. And that is fine because as soon as transaction T1 finishes commit, we >>> > don't care about what happens with buffers of F because the fact that F is >>> > truncated is recorded and in case of crash we finish truncate during >>> > journal replay. And if we crash before T1 finishes commit, we don't care >>> > about contents of T1 either. If T2 == T1, the above reasoning applies as >>> > well and the situation is even simpler. >>> > >>> > Honza >>> > -- >>> > Jan Kara <jack@suse.cz> >>> > SUSE Labs, CR >>> > >>> >>> >>> >>> -- >>> 丁定华 >> -- >> Jan Kara <jack@suse.cz> >> SUSE Labs, CR >> > > > > -- > 丁定华 > -- 丁定华 [-- Attachment #2: 0001-Jbd2-delay-discarding-buffers-in-journal_unmap_buff.patch --] [-- Type: text/x-patch, Size: 2678 bytes --] From eb6567de576ac4a9337c3971a2d0549af64e15f0 Mon Sep 17 00:00:00 2001 From: dingdinghua <dingdinghua@nrchpc.ac.cn> Date: Sat, 30 Jan 2010 14:23:28 +0800 Subject: [PATCH] Jbd2: delay discarding buffers in journal_unmap_buffer We should delay discarding buffers until we know that "add to orphan" operation has definitely been committed, otherwise the log space of committing transation may be freed and reused before truncate get committed, updates may get lost if crash happens. --- fs/jbd2/commit.c | 6 ++++-- fs/jbd2/transaction.c | 18 ++++++++++-------- 2 files changed, 14 insertions(+), 10 deletions(-) diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index 1bc74b6..c5d1c7c 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -936,8 +936,10 @@ restart_loop: * behind for writeback and gets reallocated for another * use in a different page. */ if (buffer_freed(bh)) { - clear_buffer_freed(bh); - clear_buffer_jbddirty(bh); + if (!jh->b_next_transaction) { + clear_buffer_freed(bh); + clear_buffer_jbddirty(bh); + } } if (buffer_jbddirty(bh)) { diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index a051270..0d019f1 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -1788,11 +1788,8 @@ static int journal_unmap_buffer(journal_t *journal, struct buffer_head *bh) * running transaction if that is set, but nothing * else. */ set_buffer_freed(bh); - if (jh->b_next_transaction) { - J_ASSERT(jh->b_next_transaction == - journal->j_running_transaction); - jh->b_next_transaction = NULL; - } + if (journal->j_running_transaction && buffer_jbddirty(bh)) + jh->b_next_transaction = journal->j_running_transaction; jbd2_journal_put_journal_head(jh); spin_unlock(&journal->j_list_lock); jbd_unlock_bh_state(bh); @@ -1969,7 +1966,7 @@ void jbd2_journal_file_buffer(struct journal_head *jh, */ void __jbd2_journal_refile_buffer(struct journal_head *jh) { - int was_dirty; + int was_dirty, jlist; struct buffer_head *bh = jh2bh(jh); J_ASSERT_JH(jh, jbd_is_locked_bh_state(bh)); @@ -1991,8 +1988,13 @@ void __jbd2_journal_refile_buffer(struct journal_head *jh) __jbd2_journal_temp_unlink_buffer(jh); jh->b_transaction = jh->b_next_transaction; jh->b_next_transaction = NULL; - __jbd2_journal_file_buffer(jh, jh->b_transaction, - jh->b_modified ? BJ_Metadata : BJ_Reserved); + if (buffer_freed(bh)) + jlist = BJ_Forget; + else if (jh->b_modified) + jlist = BJ_Metadata; + else + jlist = BJ_Reserved; + __jbd2_journal_file_buffer(jh, jh->b_transaction, jlist); J_ASSERT_JH(jh, jh->b_transaction->t_state == T_RUNNING); if (was_dirty) -- 1.5.5.6 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: Should we discard jbddirty bit if BH_Freed is set? 2010-01-30 6:41 ` 丁定华 @ 2010-02-02 22:02 ` Jan Kara 0 siblings, 0 replies; 7+ messages in thread From: Jan Kara @ 2010-02-02 22:02 UTC (permalink / raw) To: 丁定华; +Cc: Jan Kara, linux-ext4 > From eb6567de576ac4a9337c3971a2d0549af64e15f0 Mon Sep 17 00:00:00 2001 > From: dingdinghua <dingdinghua@nrchpc.ac.cn> > Date: Sat, 30 Jan 2010 14:23:28 +0800 > Subject: [PATCH] Jbd2: delay discarding buffers in journal_unmap_buffer > > We should delay discarding buffers until we know that "add to orphan" > operation has definitely been committed, otherwise the log space of > committing transation may be freed and reused before truncate get > committed, updates may get lost if crash happens. Thanks for the patch. For us to be able to merge the patch, we need a Signed-off-by signature from you. Please add it in the next round of submission. > diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c > index 1bc74b6..c5d1c7c 100644 > --- a/fs/jbd2/commit.c > +++ b/fs/jbd2/commit.c > @@ -936,8 +936,10 @@ restart_loop: > * behind for writeback and gets reallocated for another > * use in a different page. */ > if (buffer_freed(bh)) { > - clear_buffer_freed(bh); > - clear_buffer_jbddirty(bh); > + if (!jh->b_next_transaction) { > + clear_buffer_freed(bh); > + clear_buffer_jbddirty(bh); > + } > } When can merge the 'if' here. Also the comment above this code needs updating after your change. > if (buffer_jbddirty(bh)) { > diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c > index a051270..0d019f1 100644 > --- a/fs/jbd2/transaction.c > +++ b/fs/jbd2/transaction.c > @@ -1788,11 +1788,8 @@ static int journal_unmap_buffer(journal_t *journal, struct buffer_head *bh) > * running transaction if that is set, but nothing > * else. */ > set_buffer_freed(bh); > - if (jh->b_next_transaction) { > - J_ASSERT(jh->b_next_transaction == > - journal->j_running_transaction); > - jh->b_next_transaction = NULL; > - } > + if (journal->j_running_transaction && buffer_jbddirty(bh)) > + jh->b_next_transaction = journal->j_running_transaction; Also here you have to update the comment above set_buffer_freed to match the new reality. Otherwise I'm fine with the patch. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2010-02-02 22:01 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-27 2:39 Should we discard jbddirty bit if BH_Freed is set? 丁定华 [not found] <7bb361261001261832wb4f9ac2u96fdb6460aa45fa2@mail.gmail.com> 2010-01-27 12:23 ` Jan Kara 2010-01-28 1:23 ` 丁定华 2010-01-28 12:53 ` Jan Kara 2010-01-29 1:43 ` 丁定华 2010-01-30 6:41 ` 丁定华 2010-02-02 22:02 ` Jan Kara
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).