* [PATCH 0/2] Fix waiting on transaction in ext3_sync_file
@ 2010-04-26 20:10 Jan Kara
2010-04-26 20:10 ` [PATCH 1/2] jbd: Provide function to check whether transaction will issue data barrier Jan Kara
2010-04-26 20:10 ` [PATCH 2/2] ext3: Fix waiting on transaction during fsync Jan Kara
0 siblings, 2 replies; 5+ messages in thread
From: Jan Kara @ 2010-04-26 20:10 UTC (permalink / raw)
To: linux-ext4
Hi,
the two patches below fix waiting on transaction in ext3_sync_file. The
problem is that log_start_commit() returns 1 only if it has really started
committing. If the transaction is already undergoing commit, it will return 0
and thus we incorrectly proceed although we should have waited for commit to
finish. This problem would be easily solved by always waiting for transaction
commit. But we also need to know whether the commit really happened after we
have entered ext3_sync_file - otherwise we have to send a barrier request to
disk. Sending it in all cases has unnecessarily bad performance.
Attached patch addresses the issue by providing a function
journal_trans_will_send_data_barrier which takes transaction ID and returns
whether the final stage of commit of this transaction has not happened yet.
Ext3 then uses this function to check whether it needs to send an additional
barrier or not.
Any comments or review welcome.
Honza
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/2] jbd: Provide function to check whether transaction will issue data barrier
2010-04-26 20:10 [PATCH 0/2] Fix waiting on transaction in ext3_sync_file Jan Kara
@ 2010-04-26 20:10 ` Jan Kara
2010-04-27 3:42 ` Dmitry Monakhov
2010-04-26 20:10 ` [PATCH 2/2] ext3: Fix waiting on transaction during fsync Jan Kara
1 sibling, 1 reply; 5+ messages in thread
From: Jan Kara @ 2010-04-26 20:10 UTC (permalink / raw)
To: linux-ext4; +Cc: Jan Kara
Provide a function which returns whether a transaction with given tid
will send a barrier to the filesystem device. The function will be used
by ext3 to detect whether fsync needs to send a separate barrier or not.
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/jbd/commit.c | 8 +++++++-
fs/jbd/journal.c | 33 +++++++++++++++++++++++++++++++++
include/linux/jbd.h | 3 ++-
3 files changed, 42 insertions(+), 2 deletions(-)
diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c
index ecb44c9..28a9dda 100644
--- a/fs/jbd/commit.c
+++ b/fs/jbd/commit.c
@@ -786,6 +786,12 @@ wait_for_iobuf:
jbd_debug(3, "JBD: commit phase 6\n");
+ /* All metadata is written, now write commit record and do cleanup */
+ spin_lock(&journal->j_state_lock);
+ J_ASSERT(commit_transaction->t_state == T_COMMIT);
+ commit_transaction->t_state = T_COMMIT_RECORD;
+ spin_unlock(&journal->j_state_lock);
+
if (journal_write_commit_record(journal, commit_transaction))
err = -EIO;
@@ -923,7 +929,7 @@ restart_loop:
jbd_debug(3, "JBD: commit phase 8\n");
- J_ASSERT(commit_transaction->t_state == T_COMMIT);
+ J_ASSERT(commit_transaction->t_state == T_COMMIT_RECORD);
commit_transaction->t_state = T_FINISHED;
J_ASSERT(commit_transaction == journal->j_committing_transaction);
diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c
index bd224ee..99c7194 100644
--- a/fs/jbd/journal.c
+++ b/fs/jbd/journal.c
@@ -565,6 +565,38 @@ int log_wait_commit(journal_t *journal, tid_t tid)
}
/*
+ * Return 1 if a given transaction has not yet sent barrier request
+ * connected with a transaction commit. If 0 is returned, transaction
+ * may or may not have sent the barrier. Used to avoid sending barrier
+ * twice in common cases.
+ */
+int journal_trans_will_send_data_barrier(journal_t *journal, tid_t tid)
+{
+ int ret = 0;
+ transaction_t *commit_trans;
+
+ if (!(journal->j_flags & JFS_BARRIER))
+ return 0;
+ spin_lock(&journal->j_state_lock);
+ /* Transaction already committed? */
+ if (tid_geq(journal->j_commit_sequence, tid))
+ goto out;
+ /*
+ * Transaction is being committed and we already proceeded to
+ * writing commit record?
+ */
+ commit_trans = journal->j_committing_transaction;
+ if (commit_trans && commit_trans->t_tid == tid &&
+ commit_trans->t_state >= T_COMMIT_RECORD)
+ goto out;
+ ret = 1;
+out:
+ spin_unlock(&journal->j_state_lock);
+ return ret;
+}
+EXPORT_SYMBOL(journal_commit_will_send_barrier);
+
+/*
* Log buffer allocation routines:
*/
@@ -1157,6 +1189,7 @@ int journal_destroy(journal_t *journal)
{
int err = 0;
+
/* Wait for the commit thread to wake up and die. */
journal_kill_thread(journal);
diff --git a/include/linux/jbd.h b/include/linux/jbd.h
index 516a2a2..e069650 100644
--- a/include/linux/jbd.h
+++ b/include/linux/jbd.h
@@ -427,9 +427,9 @@ struct transaction_s
enum {
T_RUNNING,
T_LOCKED,
- T_RUNDOWN,
T_FLUSH,
T_COMMIT,
+ T_COMMIT_RECORD,
T_FINISHED
} t_state;
@@ -991,6 +991,7 @@ int journal_start_commit(journal_t *journal, tid_t *tid);
int journal_force_commit_nested(journal_t *journal);
int log_wait_commit(journal_t *journal, tid_t tid);
int log_do_checkpoint(journal_t *journal);
+int journal_trans_will_send_data_barrier(journal_t *journal, tid_t tid);
void __log_wait_for_space(journal_t *journal);
extern void __journal_drop_transaction(journal_t *, transaction_t *);
--
1.6.4.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] ext3: Fix waiting on transaction during fsync
2010-04-26 20:10 [PATCH 0/2] Fix waiting on transaction in ext3_sync_file Jan Kara
2010-04-26 20:10 ` [PATCH 1/2] jbd: Provide function to check whether transaction will issue data barrier Jan Kara
@ 2010-04-26 20:10 ` Jan Kara
1 sibling, 0 replies; 5+ messages in thread
From: Jan Kara @ 2010-04-26 20:10 UTC (permalink / raw)
To: linux-ext4; +Cc: Jan Kara
log_start_commit() returns 1 only when it started a transaction
commit. Thus in case transaction commit is already running, we
fail to wait for the commit to finish. Fix the issue by always
waiting for the commit regardless of the log_start_commit return
value.
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/ext3/fsync.c | 20 +++++++++-----------
fs/jbd/journal.c | 2 +-
2 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/fs/ext3/fsync.c b/fs/ext3/fsync.c
index 8209f26..26289e8 100644
--- a/fs/ext3/fsync.c
+++ b/fs/ext3/fsync.c
@@ -48,7 +48,7 @@ int ext3_sync_file(struct file * file, struct dentry *dentry, int datasync)
struct inode *inode = dentry->d_inode;
struct ext3_inode_info *ei = EXT3_I(inode);
journal_t *journal = EXT3_SB(inode->i_sb)->s_journal;
- int ret = 0;
+ int ret, needs_barrier = 0;
tid_t commit_tid;
if (inode->i_sb->s_flags & MS_RDONLY)
@@ -70,28 +70,26 @@ int ext3_sync_file(struct file * file, struct dentry *dentry, int datasync)
* (they were dirtied by commit). But that's OK - the blocks are
* safe in-journal, which is all fsync() needs to ensure.
*/
- if (ext3_should_journal_data(inode)) {
- ret = ext3_force_commit(inode->i_sb);
- goto out;
- }
+ if (ext3_should_journal_data(inode))
+ return ext3_force_commit(inode->i_sb);
if (datasync)
commit_tid = atomic_read(&ei->i_datasync_tid);
else
commit_tid = atomic_read(&ei->i_sync_tid);
- if (log_start_commit(journal, commit_tid)) {
- log_wait_commit(journal, commit_tid);
- goto out;
- }
+ if (test_opt(inode->i_sb, BARRIER) &&
+ !journal_trans_will_send_data_barrier(journal, commit_tid))
+ needs_barrier = 1;
+ log_start_commit(journal, commit_tid);
+ ret = log_wait_commit(journal, commit_tid);
/*
* In case we didn't commit a transaction, we have to flush
* disk caches manually so that data really is on persistent
* storage
*/
- if (test_opt(inode->i_sb, BARRIER))
+ if (needs_barrier)
blkdev_issue_flush(inode->i_sb->s_bdev, NULL);
-out:
return ret;
}
diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c
index 99c7194..93d1e47 100644
--- a/fs/jbd/journal.c
+++ b/fs/jbd/journal.c
@@ -594,7 +594,7 @@ out:
spin_unlock(&journal->j_state_lock);
return ret;
}
-EXPORT_SYMBOL(journal_commit_will_send_barrier);
+EXPORT_SYMBOL(journal_trans_will_send_data_barrier);
/*
* Log buffer allocation routines:
--
1.6.4.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] jbd: Provide function to check whether transaction will issue data barrier
2010-04-26 20:10 ` [PATCH 1/2] jbd: Provide function to check whether transaction will issue data barrier Jan Kara
@ 2010-04-27 3:42 ` Dmitry Monakhov
2010-04-27 20:48 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Monakhov @ 2010-04-27 3:42 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-ext4
Jan Kara <jack@suse.cz> writes:
> Provide a function which returns whether a transaction with given tid
> will send a barrier to the filesystem device. The function will be used
> by ext3 to detect whether fsync needs to send a separate barrier or not.
Agree. Except the fact that in case of j_dev != j_fs_dev jbd is still
broken. I'm plan to post back-port from jbd2 which makes
journal_trans_will_send_data_barrier() more complex. It have to analyze
commit_transaction->t_flushed_data_blocks.
>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/jbd/commit.c | 8 +++++++-
> fs/jbd/journal.c | 33 +++++++++++++++++++++++++++++++++
> include/linux/jbd.h | 3 ++-
> 3 files changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c
> index ecb44c9..28a9dda 100644
> --- a/fs/jbd/commit.c
> +++ b/fs/jbd/commit.c
> @@ -786,6 +786,12 @@ wait_for_iobuf:
>
> jbd_debug(3, "JBD: commit phase 6\n");
>
> + /* All metadata is written, now write commit record and do cleanup */
> + spin_lock(&journal->j_state_lock);
> + J_ASSERT(commit_transaction->t_state == T_COMMIT);
> + commit_transaction->t_state = T_COMMIT_RECORD;
> + spin_unlock(&journal->j_state_lock);
> +
> if (journal_write_commit_record(journal, commit_transaction))
> err = -EIO;
>
> @@ -923,7 +929,7 @@ restart_loop:
>
> jbd_debug(3, "JBD: commit phase 8\n");
>
> - J_ASSERT(commit_transaction->t_state == T_COMMIT);
> + J_ASSERT(commit_transaction->t_state == T_COMMIT_RECORD);
>
> commit_transaction->t_state = T_FINISHED;
> J_ASSERT(commit_transaction == journal->j_committing_transaction);
> diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c
> index bd224ee..99c7194 100644
> --- a/fs/jbd/journal.c
> +++ b/fs/jbd/journal.c
> @@ -565,6 +565,38 @@ int log_wait_commit(journal_t *journal, tid_t tid)
> }
>
> /*
> + * Return 1 if a given transaction has not yet sent barrier request
> + * connected with a transaction commit. If 0 is returned, transaction
> + * may or may not have sent the barrier. Used to avoid sending barrier
> + * twice in common cases.
> + */
> +int journal_trans_will_send_data_barrier(journal_t *journal, tid_t tid)
> +{
> + int ret = 0;
> + transaction_t *commit_trans;
> +
> + if (!(journal->j_flags & JFS_BARRIER))
> + return 0;
> + spin_lock(&journal->j_state_lock);
> + /* Transaction already committed? */
> + if (tid_geq(journal->j_commit_sequence, tid))
> + goto out;
> + /*
> + * Transaction is being committed and we already proceeded to
> + * writing commit record?
> + */
> + commit_trans = journal->j_committing_transaction;
> + if (commit_trans && commit_trans->t_tid == tid &&
> + commit_trans->t_state >= T_COMMIT_RECORD)
> + goto out;
> + ret = 1;
> +out:
> + spin_unlock(&journal->j_state_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL(journal_commit_will_send_barrier);
> +
> +/*
> * Log buffer allocation routines:
> */
>
> @@ -1157,6 +1189,7 @@ int journal_destroy(journal_t *journal)
> {
> int err = 0;
>
> +
> /* Wait for the commit thread to wake up and die. */
> journal_kill_thread(journal);
>
> diff --git a/include/linux/jbd.h b/include/linux/jbd.h
> index 516a2a2..e069650 100644
> --- a/include/linux/jbd.h
> +++ b/include/linux/jbd.h
> @@ -427,9 +427,9 @@ struct transaction_s
> enum {
> T_RUNNING,
> T_LOCKED,
> - T_RUNDOWN,
> T_FLUSH,
> T_COMMIT,
> + T_COMMIT_RECORD,
> T_FINISHED
> } t_state;
>
> @@ -991,6 +991,7 @@ int journal_start_commit(journal_t *journal, tid_t *tid);
> int journal_force_commit_nested(journal_t *journal);
> int log_wait_commit(journal_t *journal, tid_t tid);
> int log_do_checkpoint(journal_t *journal);
> +int journal_trans_will_send_data_barrier(journal_t *journal, tid_t tid);
>
> void __log_wait_for_space(journal_t *journal);
> extern void __journal_drop_transaction(journal_t *, transaction_t *);
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] jbd: Provide function to check whether transaction will issue data barrier
2010-04-27 3:42 ` Dmitry Monakhov
@ 2010-04-27 20:48 ` Jan Kara
0 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2010-04-27 20:48 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Jan Kara, linux-ext4
On Tue 27-04-10 07:42:49, Dmitry Monakhov wrote:
> Jan Kara <jack@suse.cz> writes:
>
> > Provide a function which returns whether a transaction with given tid
> > will send a barrier to the filesystem device. The function will be used
> > by ext3 to detect whether fsync needs to send a separate barrier or not.
> Agree. Except the fact that in case of j_dev != j_fs_dev jbd is still
> broken.
Yeah, I know about this. I had a look at it but stumbled over a barrier
issue in the checkpointing code
(http://marc.info/?l=linux-ext4&m=127198235617788&w=2) so I though I'll
wait till that gets resolved.
> I'm plan to post back-port from jbd2 which makes
> journal_trans_will_send_data_barrier() more complex. It have to analyze
> commit_transaction->t_flushed_data_blocks.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-04-27 20:48 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-26 20:10 [PATCH 0/2] Fix waiting on transaction in ext3_sync_file Jan Kara
2010-04-26 20:10 ` [PATCH 1/2] jbd: Provide function to check whether transaction will issue data barrier Jan Kara
2010-04-27 3:42 ` Dmitry Monakhov
2010-04-27 20:48 ` Jan Kara
2010-04-26 20:10 ` [PATCH 2/2] ext3: Fix waiting on transaction during fsync Jan Kara
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).