Linux EXT4 FS development
 help / color / mirror / Atom feed
* [PATCH 0/3] jbd2/ext4/ocfs2: READ_ONCE for lockless jinode reads
@ 2026-01-30  3:12 Li Chen
  2026-01-30  3:12 ` [PATCH 1/3] jbd2: use " Li Chen
  2026-01-30  3:12 ` [PATCH 2/3] ext4: " Li Chen
  0 siblings, 2 replies; 6+ messages in thread
From: Li Chen @ 2026-01-30  3:12 UTC (permalink / raw)
  To: Theodore Ts'o, Jan Kara, Mark Fasheh, linux-ext4, ocfs2-devel
  Cc: Andreas Dilger, Joel Becker, Joseph Qi, linux-kernel

This series adds READ_ONCE() for existing lockless reads of
jbd2_inode fields in jbd2 and filesystem callbacks used by ext4 and ocfs2.

This is based on Jan's suggestion in the review of the ext4 jinode
publication race fix. [1]

[1]: https://lore.kernel.org/all/4jxwogttddiaoqbstlgou5ox6zs27ngjjz5ukrxafm2z5ijxod@so4eqnykiegj/

Thanks,
Li

Li Chen (3):
  jbd2: use READ_ONCE for lockless jinode reads
  ext4: use READ_ONCE for lockless jinode reads
  ocfs2: use READ_ONCE for lockless jinode reads

 fs/ext4/inode.c       |  6 ++++--
 fs/ext4/super.c       | 13 ++++++++-----
 fs/jbd2/commit.c      | 39 ++++++++++++++++++++++++++++++++-------
 fs/jbd2/transaction.c |  2 +-
 fs/ocfs2/journal.c    |  7 +++++--
 5 files changed, 50 insertions(+), 17 deletions(-)

-- 
2.52.0

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3] jbd2: use READ_ONCE for lockless jinode reads
  2026-01-30  3:12 [PATCH 0/3] jbd2/ext4/ocfs2: READ_ONCE for lockless jinode reads Li Chen
@ 2026-01-30  3:12 ` Li Chen
  2026-02-02 16:40   ` Jan Kara
  2026-01-30  3:12 ` [PATCH 2/3] ext4: " Li Chen
  1 sibling, 1 reply; 6+ messages in thread
From: Li Chen @ 2026-01-30  3:12 UTC (permalink / raw)
  To: Theodore Ts'o, Jan Kara, linux-ext4, linux-kernel; +Cc: Li Chen

jbd2_inode fields are updated under journal->j_list_lock, but some
paths read them without holding the lock (e.g. fast commit
helpers and the ordered truncate fast path).

Use READ_ONCE() for these lockless reads to correct the
concurrency assumptions.

Suggested-by: Jan Kara <jack@suse.com>
Signed-off-by: Li Chen <me@linux.beauty>
---
 fs/jbd2/commit.c      | 39 ++++++++++++++++++++++++++++++++-------
 fs/jbd2/transaction.c |  2 +-
 2 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 7203d2d2624d..3347d75da2f8 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -180,7 +180,13 @@ static int journal_wait_on_commit_record(journal_t *journal,
 /* Send all the data buffers related to an inode */
 int jbd2_submit_inode_data(journal_t *journal, struct jbd2_inode *jinode)
 {
-	if (!jinode || !(jinode->i_flags & JI_WRITE_DATA))
+	unsigned long flags;
+
+	if (!jinode)
+		return 0;
+
+	flags = READ_ONCE(jinode->i_flags);
+	if (!(flags & JI_WRITE_DATA))
 		return 0;
 
 	trace_jbd2_submit_inode_data(jinode->i_vfs_inode);
@@ -191,12 +197,30 @@ EXPORT_SYMBOL(jbd2_submit_inode_data);
 
 int jbd2_wait_inode_data(journal_t *journal, struct jbd2_inode *jinode)
 {
-	if (!jinode || !(jinode->i_flags & JI_WAIT_DATA) ||
-		!jinode->i_vfs_inode || !jinode->i_vfs_inode->i_mapping)
+	struct address_space *mapping;
+	struct inode *inode;
+	unsigned long flags;
+	loff_t start, end;
+
+	if (!jinode)
+		return 0;
+
+	flags = READ_ONCE(jinode->i_flags);
+	if (!(flags & JI_WAIT_DATA))
+		return 0;
+
+	inode = READ_ONCE(jinode->i_vfs_inode);
+	if (!inode)
+		return 0;
+
+	mapping = inode->i_mapping;
+	start = READ_ONCE(jinode->i_dirty_start);
+	end = READ_ONCE(jinode->i_dirty_end);
+
+	if (!mapping)
 		return 0;
 	return filemap_fdatawait_range_keep_errors(
-		jinode->i_vfs_inode->i_mapping, jinode->i_dirty_start,
-		jinode->i_dirty_end);
+		mapping, start, end);
 }
 EXPORT_SYMBOL(jbd2_wait_inode_data);
 
@@ -240,10 +264,11 @@ static int journal_submit_data_buffers(journal_t *journal,
 int jbd2_journal_finish_inode_data_buffers(struct jbd2_inode *jinode)
 {
 	struct address_space *mapping = jinode->i_vfs_inode->i_mapping;
+	loff_t start = READ_ONCE(jinode->i_dirty_start);
+	loff_t end = READ_ONCE(jinode->i_dirty_end);
 
 	return filemap_fdatawait_range_keep_errors(mapping,
-						   jinode->i_dirty_start,
-						   jinode->i_dirty_end);
+						   start, end);
 }
 
 /*
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index dca4b5d8aaaa..302b2090eea7 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -2739,7 +2739,7 @@ int jbd2_journal_begin_ordered_truncate(journal_t *journal,
 	int ret = 0;
 
 	/* This is a quick check to avoid locking if not necessary */
-	if (!jinode->i_transaction)
+	if (!READ_ONCE(jinode->i_transaction))
 		goto out;
 	/* Locks are here just to force reading of recent values, it is
 	 * enough that the transaction was not committing before we started
-- 
2.52.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] ext4: use READ_ONCE for lockless jinode reads
  2026-01-30  3:12 [PATCH 0/3] jbd2/ext4/ocfs2: READ_ONCE for lockless jinode reads Li Chen
  2026-01-30  3:12 ` [PATCH 1/3] jbd2: use " Li Chen
@ 2026-01-30  3:12 ` Li Chen
  2026-02-02 16:41   ` Jan Kara
  1 sibling, 1 reply; 6+ messages in thread
From: Li Chen @ 2026-01-30  3:12 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger, linux-ext4, linux-kernel
  Cc: Jan Kara, Li Chen

ext4 journal commit callbacks access jbd2_inode fields such as
i_transaction and i_dirty_start/end without holding journal->j_list_lock.

Use READ_ONCE() for these reads to correct the concurrency assumptions.

Suggested-by: Jan Kara <jack@suse.com>
Signed-off-by: Li Chen <me@linux.beauty>
---
 fs/ext4/inode.c |  6 ++++--
 fs/ext4/super.c | 13 ++++++++-----
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d99296d7315f..2d451388e080 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3033,11 +3033,13 @@ static int ext4_writepages(struct address_space *mapping,
 
 int ext4_normal_submit_inode_data_buffers(struct jbd2_inode *jinode)
 {
+	loff_t dirty_start = READ_ONCE(jinode->i_dirty_start);
+	loff_t dirty_end = READ_ONCE(jinode->i_dirty_end);
 	struct writeback_control wbc = {
 		.sync_mode = WB_SYNC_ALL,
 		.nr_to_write = LONG_MAX,
-		.range_start = jinode->i_dirty_start,
-		.range_end = jinode->i_dirty_end,
+		.range_start = dirty_start,
+		.range_end = dirty_end,
 	};
 	struct mpage_da_data mpd = {
 		.inode = jinode->i_vfs_inode,
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 5cf6c2b54bbb..acb2bc016fd4 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -521,6 +521,7 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
 {
 	struct buffer_head *bh, *head;
 	struct journal_head *jh;
+	transaction_t *trans = READ_ONCE(jinode->i_transaction);
 
 	bh = head = folio_buffers(folio);
 	do {
@@ -539,7 +540,7 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
 		 */
 		jh = bh2jh(bh);
 		if (buffer_dirty(bh) ||
-		    (jh && (jh->b_transaction != jinode->i_transaction ||
+		    (jh && (jh->b_transaction != trans ||
 			    jh->b_next_transaction)))
 			return true;
 	} while ((bh = bh->b_this_page) != head);
@@ -550,12 +551,14 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
 static int ext4_journalled_submit_inode_data_buffers(struct jbd2_inode *jinode)
 {
 	struct address_space *mapping = jinode->i_vfs_inode->i_mapping;
+	loff_t dirty_start = READ_ONCE(jinode->i_dirty_start);
+	loff_t dirty_end = READ_ONCE(jinode->i_dirty_end);
 	struct writeback_control wbc = {
-		.sync_mode =  WB_SYNC_ALL,
+		.sync_mode = WB_SYNC_ALL,
 		.nr_to_write = LONG_MAX,
-		.range_start = jinode->i_dirty_start,
-		.range_end = jinode->i_dirty_end,
-        };
+		.range_start = dirty_start,
+		.range_end = dirty_end,
+	};
 	struct folio *folio = NULL;
 	int error;
 
-- 
2.52.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/3] jbd2: use READ_ONCE for lockless jinode reads
  2026-01-30  3:12 ` [PATCH 1/3] jbd2: use " Li Chen
@ 2026-02-02 16:40   ` Jan Kara
  2026-02-02 16:52     ` Jan Kara
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2026-02-02 16:40 UTC (permalink / raw)
  To: Li Chen; +Cc: Theodore Ts'o, Jan Kara, linux-ext4, linux-kernel

On Fri 30-01-26 11:12:30, Li Chen wrote:
> jbd2_inode fields are updated under journal->j_list_lock, but some
> paths read them without holding the lock (e.g. fast commit
> helpers and the ordered truncate fast path).
> 
> Use READ_ONCE() for these lockless reads to correct the
> concurrency assumptions.
> 
> Suggested-by: Jan Kara <jack@suse.com>
> Signed-off-by: Li Chen <me@linux.beauty>

Just one nit below. With that fixed feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

> @@ -191,12 +197,30 @@ EXPORT_SYMBOL(jbd2_submit_inode_data);
>  
>  int jbd2_wait_inode_data(journal_t *journal, struct jbd2_inode *jinode)
>  {
> -	if (!jinode || !(jinode->i_flags & JI_WAIT_DATA) ||
> -		!jinode->i_vfs_inode || !jinode->i_vfs_inode->i_mapping)
> +	struct address_space *mapping;
> +	struct inode *inode;
> +	unsigned long flags;
> +	loff_t start, end;
> +
> +	if (!jinode)
> +		return 0;
> +
> +	flags = READ_ONCE(jinode->i_flags);
> +	if (!(flags & JI_WAIT_DATA))
> +		return 0;
> +
> +	inode = READ_ONCE(jinode->i_vfs_inode);

i_vfs_inode never changes so READ_ONCE is pointless here.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/3] ext4: use READ_ONCE for lockless jinode reads
  2026-01-30  3:12 ` [PATCH 2/3] ext4: " Li Chen
@ 2026-02-02 16:41   ` Jan Kara
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2026-02-02 16:41 UTC (permalink / raw)
  To: Li Chen
  Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, linux-kernel,
	Jan Kara

On Fri 30-01-26 11:12:31, Li Chen wrote:
> ext4 journal commit callbacks access jbd2_inode fields such as
> i_transaction and i_dirty_start/end without holding journal->j_list_lock.
> 
> Use READ_ONCE() for these reads to correct the concurrency assumptions.
> 
> Suggested-by: Jan Kara <jack@suse.com>
> Signed-off-by: Li Chen <me@linux.beauty>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/inode.c |  6 ++++--
>  fs/ext4/super.c | 13 ++++++++-----
>  2 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d99296d7315f..2d451388e080 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3033,11 +3033,13 @@ static int ext4_writepages(struct address_space *mapping,
>  
>  int ext4_normal_submit_inode_data_buffers(struct jbd2_inode *jinode)
>  {
> +	loff_t dirty_start = READ_ONCE(jinode->i_dirty_start);
> +	loff_t dirty_end = READ_ONCE(jinode->i_dirty_end);
>  	struct writeback_control wbc = {
>  		.sync_mode = WB_SYNC_ALL,
>  		.nr_to_write = LONG_MAX,
> -		.range_start = jinode->i_dirty_start,
> -		.range_end = jinode->i_dirty_end,
> +		.range_start = dirty_start,
> +		.range_end = dirty_end,
>  	};
>  	struct mpage_da_data mpd = {
>  		.inode = jinode->i_vfs_inode,
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 5cf6c2b54bbb..acb2bc016fd4 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -521,6 +521,7 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
>  {
>  	struct buffer_head *bh, *head;
>  	struct journal_head *jh;
> +	transaction_t *trans = READ_ONCE(jinode->i_transaction);
>  
>  	bh = head = folio_buffers(folio);
>  	do {
> @@ -539,7 +540,7 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
>  		 */
>  		jh = bh2jh(bh);
>  		if (buffer_dirty(bh) ||
> -		    (jh && (jh->b_transaction != jinode->i_transaction ||
> +		    (jh && (jh->b_transaction != trans ||
>  			    jh->b_next_transaction)))
>  			return true;
>  	} while ((bh = bh->b_this_page) != head);
> @@ -550,12 +551,14 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
>  static int ext4_journalled_submit_inode_data_buffers(struct jbd2_inode *jinode)
>  {
>  	struct address_space *mapping = jinode->i_vfs_inode->i_mapping;
> +	loff_t dirty_start = READ_ONCE(jinode->i_dirty_start);
> +	loff_t dirty_end = READ_ONCE(jinode->i_dirty_end);
>  	struct writeback_control wbc = {
> -		.sync_mode =  WB_SYNC_ALL,
> +		.sync_mode = WB_SYNC_ALL,
>  		.nr_to_write = LONG_MAX,
> -		.range_start = jinode->i_dirty_start,
> -		.range_end = jinode->i_dirty_end,
> -        };
> +		.range_start = dirty_start,
> +		.range_end = dirty_end,
> +	};
>  	struct folio *folio = NULL;
>  	int error;
>  
> -- 
> 2.52.0
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/3] jbd2: use READ_ONCE for lockless jinode reads
  2026-02-02 16:40   ` Jan Kara
@ 2026-02-02 16:52     ` Jan Kara
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2026-02-02 16:52 UTC (permalink / raw)
  To: Li Chen; +Cc: Theodore Ts'o, Jan Kara, linux-ext4, linux-kernel

On Mon 02-02-26 17:40:45, Jan Kara wrote:
> On Fri 30-01-26 11:12:30, Li Chen wrote:
> > jbd2_inode fields are updated under journal->j_list_lock, but some
> > paths read them without holding the lock (e.g. fast commit
> > helpers and the ordered truncate fast path).
> > 
> > Use READ_ONCE() for these lockless reads to correct the
> > concurrency assumptions.
> > 
> > Suggested-by: Jan Kara <jack@suse.com>
> > Signed-off-by: Li Chen <me@linux.beauty>
> 
> Just one nit below. With that fixed feel free to add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> 
> > @@ -191,12 +197,30 @@ EXPORT_SYMBOL(jbd2_submit_inode_data);
> >  
> >  int jbd2_wait_inode_data(journal_t *journal, struct jbd2_inode *jinode)
> >  {
> > -	if (!jinode || !(jinode->i_flags & JI_WAIT_DATA) ||
> > -		!jinode->i_vfs_inode || !jinode->i_vfs_inode->i_mapping)
> > +	struct address_space *mapping;
> > +	struct inode *inode;
> > +	unsigned long flags;
> > +	loff_t start, end;
> > +
> > +	if (!jinode)
> > +		return 0;
> > +
> > +	flags = READ_ONCE(jinode->i_flags);
> > +	if (!(flags & JI_WAIT_DATA))
> > +		return 0;
> > +
> > +	inode = READ_ONCE(jinode->i_vfs_inode);
> 
> i_vfs_inode never changes so READ_ONCE is pointless here.

One more note: I've realized that for this to work you also need to make
jbd2_journal_file_inode() use WRITE_ONCE() when updating i_dirty_start,
i_dirty_end and i_flags.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-02-02 16:52 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-30  3:12 [PATCH 0/3] jbd2/ext4/ocfs2: READ_ONCE for lockless jinode reads Li Chen
2026-01-30  3:12 ` [PATCH 1/3] jbd2: use " Li Chen
2026-02-02 16:40   ` Jan Kara
2026-02-02 16:52     ` Jan Kara
2026-01-30  3:12 ` [PATCH 2/3] ext4: " Li Chen
2026-02-02 16:41   ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox