* [PATCH 0/3] jbd2/ext4/ocfs2: READ_ONCE for lockless jinode reads
@ 2026-01-30 3:12 Li Chen
2026-01-30 3:12 ` [PATCH 1/3] jbd2: use " Li Chen
2026-01-30 3:12 ` [PATCH 2/3] ext4: " Li Chen
0 siblings, 2 replies; 6+ messages in thread
From: Li Chen @ 2026-01-30 3:12 UTC (permalink / raw)
To: Theodore Ts'o, Jan Kara, Mark Fasheh, linux-ext4, ocfs2-devel
Cc: Andreas Dilger, Joel Becker, Joseph Qi, linux-kernel
This series adds READ_ONCE() for existing lockless reads of
jbd2_inode fields in jbd2 and filesystem callbacks used by ext4 and ocfs2.
This is based on Jan's suggestion in the review of the ext4 jinode
publication race fix. [1]
[1]: https://lore.kernel.org/all/4jxwogttddiaoqbstlgou5ox6zs27ngjjz5ukrxafm2z5ijxod@so4eqnykiegj/
Thanks,
Li
Li Chen (3):
jbd2: use READ_ONCE for lockless jinode reads
ext4: use READ_ONCE for lockless jinode reads
ocfs2: use READ_ONCE for lockless jinode reads
fs/ext4/inode.c | 6 ++++--
fs/ext4/super.c | 13 ++++++++-----
fs/jbd2/commit.c | 39 ++++++++++++++++++++++++++++++++-------
fs/jbd2/transaction.c | 2 +-
fs/ocfs2/journal.c | 7 +++++--
5 files changed, 50 insertions(+), 17 deletions(-)
--
2.52.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/3] jbd2: use READ_ONCE for lockless jinode reads
2026-01-30 3:12 [PATCH 0/3] jbd2/ext4/ocfs2: READ_ONCE for lockless jinode reads Li Chen
@ 2026-01-30 3:12 ` Li Chen
2026-02-02 16:40 ` Jan Kara
2026-01-30 3:12 ` [PATCH 2/3] ext4: " Li Chen
1 sibling, 1 reply; 6+ messages in thread
From: Li Chen @ 2026-01-30 3:12 UTC (permalink / raw)
To: Theodore Ts'o, Jan Kara, linux-ext4, linux-kernel; +Cc: Li Chen
jbd2_inode fields are updated under journal->j_list_lock, but some
paths read them without holding the lock (e.g. fast commit
helpers and the ordered truncate fast path).
Use READ_ONCE() for these lockless reads to correct the
concurrency assumptions.
Suggested-by: Jan Kara <jack@suse.com>
Signed-off-by: Li Chen <me@linux.beauty>
---
fs/jbd2/commit.c | 39 ++++++++++++++++++++++++++++++++-------
fs/jbd2/transaction.c | 2 +-
2 files changed, 33 insertions(+), 8 deletions(-)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 7203d2d2624d..3347d75da2f8 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -180,7 +180,13 @@ static int journal_wait_on_commit_record(journal_t *journal,
/* Send all the data buffers related to an inode */
int jbd2_submit_inode_data(journal_t *journal, struct jbd2_inode *jinode)
{
- if (!jinode || !(jinode->i_flags & JI_WRITE_DATA))
+ unsigned long flags;
+
+ if (!jinode)
+ return 0;
+
+ flags = READ_ONCE(jinode->i_flags);
+ if (!(flags & JI_WRITE_DATA))
return 0;
trace_jbd2_submit_inode_data(jinode->i_vfs_inode);
@@ -191,12 +197,30 @@ EXPORT_SYMBOL(jbd2_submit_inode_data);
int jbd2_wait_inode_data(journal_t *journal, struct jbd2_inode *jinode)
{
- if (!jinode || !(jinode->i_flags & JI_WAIT_DATA) ||
- !jinode->i_vfs_inode || !jinode->i_vfs_inode->i_mapping)
+ struct address_space *mapping;
+ struct inode *inode;
+ unsigned long flags;
+ loff_t start, end;
+
+ if (!jinode)
+ return 0;
+
+ flags = READ_ONCE(jinode->i_flags);
+ if (!(flags & JI_WAIT_DATA))
+ return 0;
+
+ inode = READ_ONCE(jinode->i_vfs_inode);
+ if (!inode)
+ return 0;
+
+ mapping = inode->i_mapping;
+ start = READ_ONCE(jinode->i_dirty_start);
+ end = READ_ONCE(jinode->i_dirty_end);
+
+ if (!mapping)
return 0;
return filemap_fdatawait_range_keep_errors(
- jinode->i_vfs_inode->i_mapping, jinode->i_dirty_start,
- jinode->i_dirty_end);
+ mapping, start, end);
}
EXPORT_SYMBOL(jbd2_wait_inode_data);
@@ -240,10 +264,11 @@ static int journal_submit_data_buffers(journal_t *journal,
int jbd2_journal_finish_inode_data_buffers(struct jbd2_inode *jinode)
{
struct address_space *mapping = jinode->i_vfs_inode->i_mapping;
+ loff_t start = READ_ONCE(jinode->i_dirty_start);
+ loff_t end = READ_ONCE(jinode->i_dirty_end);
return filemap_fdatawait_range_keep_errors(mapping,
- jinode->i_dirty_start,
- jinode->i_dirty_end);
+ start, end);
}
/*
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index dca4b5d8aaaa..302b2090eea7 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -2739,7 +2739,7 @@ int jbd2_journal_begin_ordered_truncate(journal_t *journal,
int ret = 0;
/* This is a quick check to avoid locking if not necessary */
- if (!jinode->i_transaction)
+ if (!READ_ONCE(jinode->i_transaction))
goto out;
/* Locks are here just to force reading of recent values, it is
* enough that the transaction was not committing before we started
--
2.52.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/3] ext4: use READ_ONCE for lockless jinode reads
2026-01-30 3:12 [PATCH 0/3] jbd2/ext4/ocfs2: READ_ONCE for lockless jinode reads Li Chen
2026-01-30 3:12 ` [PATCH 1/3] jbd2: use " Li Chen
@ 2026-01-30 3:12 ` Li Chen
2026-02-02 16:41 ` Jan Kara
1 sibling, 1 reply; 6+ messages in thread
From: Li Chen @ 2026-01-30 3:12 UTC (permalink / raw)
To: Theodore Ts'o, Andreas Dilger, linux-ext4, linux-kernel
Cc: Jan Kara, Li Chen
ext4 journal commit callbacks access jbd2_inode fields such as
i_transaction and i_dirty_start/end without holding journal->j_list_lock.
Use READ_ONCE() for these reads to correct the concurrency assumptions.
Suggested-by: Jan Kara <jack@suse.com>
Signed-off-by: Li Chen <me@linux.beauty>
---
fs/ext4/inode.c | 6 ++++--
fs/ext4/super.c | 13 ++++++++-----
2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d99296d7315f..2d451388e080 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3033,11 +3033,13 @@ static int ext4_writepages(struct address_space *mapping,
int ext4_normal_submit_inode_data_buffers(struct jbd2_inode *jinode)
{
+ loff_t dirty_start = READ_ONCE(jinode->i_dirty_start);
+ loff_t dirty_end = READ_ONCE(jinode->i_dirty_end);
struct writeback_control wbc = {
.sync_mode = WB_SYNC_ALL,
.nr_to_write = LONG_MAX,
- .range_start = jinode->i_dirty_start,
- .range_end = jinode->i_dirty_end,
+ .range_start = dirty_start,
+ .range_end = dirty_end,
};
struct mpage_da_data mpd = {
.inode = jinode->i_vfs_inode,
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 5cf6c2b54bbb..acb2bc016fd4 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -521,6 +521,7 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
{
struct buffer_head *bh, *head;
struct journal_head *jh;
+ transaction_t *trans = READ_ONCE(jinode->i_transaction);
bh = head = folio_buffers(folio);
do {
@@ -539,7 +540,7 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
*/
jh = bh2jh(bh);
if (buffer_dirty(bh) ||
- (jh && (jh->b_transaction != jinode->i_transaction ||
+ (jh && (jh->b_transaction != trans ||
jh->b_next_transaction)))
return true;
} while ((bh = bh->b_this_page) != head);
@@ -550,12 +551,14 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
static int ext4_journalled_submit_inode_data_buffers(struct jbd2_inode *jinode)
{
struct address_space *mapping = jinode->i_vfs_inode->i_mapping;
+ loff_t dirty_start = READ_ONCE(jinode->i_dirty_start);
+ loff_t dirty_end = READ_ONCE(jinode->i_dirty_end);
struct writeback_control wbc = {
- .sync_mode = WB_SYNC_ALL,
+ .sync_mode = WB_SYNC_ALL,
.nr_to_write = LONG_MAX,
- .range_start = jinode->i_dirty_start,
- .range_end = jinode->i_dirty_end,
- };
+ .range_start = dirty_start,
+ .range_end = dirty_end,
+ };
struct folio *folio = NULL;
int error;
--
2.52.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/3] jbd2: use READ_ONCE for lockless jinode reads
2026-01-30 3:12 ` [PATCH 1/3] jbd2: use " Li Chen
@ 2026-02-02 16:40 ` Jan Kara
2026-02-02 16:52 ` Jan Kara
0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2026-02-02 16:40 UTC (permalink / raw)
To: Li Chen; +Cc: Theodore Ts'o, Jan Kara, linux-ext4, linux-kernel
On Fri 30-01-26 11:12:30, Li Chen wrote:
> jbd2_inode fields are updated under journal->j_list_lock, but some
> paths read them without holding the lock (e.g. fast commit
> helpers and the ordered truncate fast path).
>
> Use READ_ONCE() for these lockless reads to correct the
> concurrency assumptions.
>
> Suggested-by: Jan Kara <jack@suse.com>
> Signed-off-by: Li Chen <me@linux.beauty>
Just one nit below. With that fixed feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
> @@ -191,12 +197,30 @@ EXPORT_SYMBOL(jbd2_submit_inode_data);
>
> int jbd2_wait_inode_data(journal_t *journal, struct jbd2_inode *jinode)
> {
> - if (!jinode || !(jinode->i_flags & JI_WAIT_DATA) ||
> - !jinode->i_vfs_inode || !jinode->i_vfs_inode->i_mapping)
> + struct address_space *mapping;
> + struct inode *inode;
> + unsigned long flags;
> + loff_t start, end;
> +
> + if (!jinode)
> + return 0;
> +
> + flags = READ_ONCE(jinode->i_flags);
> + if (!(flags & JI_WAIT_DATA))
> + return 0;
> +
> + inode = READ_ONCE(jinode->i_vfs_inode);
i_vfs_inode never changes so READ_ONCE is pointless here.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/3] ext4: use READ_ONCE for lockless jinode reads
2026-01-30 3:12 ` [PATCH 2/3] ext4: " Li Chen
@ 2026-02-02 16:41 ` Jan Kara
0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2026-02-02 16:41 UTC (permalink / raw)
To: Li Chen
Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, linux-kernel,
Jan Kara
On Fri 30-01-26 11:12:31, Li Chen wrote:
> ext4 journal commit callbacks access jbd2_inode fields such as
> i_transaction and i_dirty_start/end without holding journal->j_list_lock.
>
> Use READ_ONCE() for these reads to correct the concurrency assumptions.
>
> Suggested-by: Jan Kara <jack@suse.com>
> Signed-off-by: Li Chen <me@linux.beauty>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/inode.c | 6 ++++--
> fs/ext4/super.c | 13 ++++++++-----
> 2 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d99296d7315f..2d451388e080 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3033,11 +3033,13 @@ static int ext4_writepages(struct address_space *mapping,
>
> int ext4_normal_submit_inode_data_buffers(struct jbd2_inode *jinode)
> {
> + loff_t dirty_start = READ_ONCE(jinode->i_dirty_start);
> + loff_t dirty_end = READ_ONCE(jinode->i_dirty_end);
> struct writeback_control wbc = {
> .sync_mode = WB_SYNC_ALL,
> .nr_to_write = LONG_MAX,
> - .range_start = jinode->i_dirty_start,
> - .range_end = jinode->i_dirty_end,
> + .range_start = dirty_start,
> + .range_end = dirty_end,
> };
> struct mpage_da_data mpd = {
> .inode = jinode->i_vfs_inode,
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 5cf6c2b54bbb..acb2bc016fd4 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -521,6 +521,7 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
> {
> struct buffer_head *bh, *head;
> struct journal_head *jh;
> + transaction_t *trans = READ_ONCE(jinode->i_transaction);
>
> bh = head = folio_buffers(folio);
> do {
> @@ -539,7 +540,7 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
> */
> jh = bh2jh(bh);
> if (buffer_dirty(bh) ||
> - (jh && (jh->b_transaction != jinode->i_transaction ||
> + (jh && (jh->b_transaction != trans ||
> jh->b_next_transaction)))
> return true;
> } while ((bh = bh->b_this_page) != head);
> @@ -550,12 +551,14 @@ static bool ext4_journalled_writepage_needs_redirty(struct jbd2_inode *jinode,
> static int ext4_journalled_submit_inode_data_buffers(struct jbd2_inode *jinode)
> {
> struct address_space *mapping = jinode->i_vfs_inode->i_mapping;
> + loff_t dirty_start = READ_ONCE(jinode->i_dirty_start);
> + loff_t dirty_end = READ_ONCE(jinode->i_dirty_end);
> struct writeback_control wbc = {
> - .sync_mode = WB_SYNC_ALL,
> + .sync_mode = WB_SYNC_ALL,
> .nr_to_write = LONG_MAX,
> - .range_start = jinode->i_dirty_start,
> - .range_end = jinode->i_dirty_end,
> - };
> + .range_start = dirty_start,
> + .range_end = dirty_end,
> + };
> struct folio *folio = NULL;
> int error;
>
> --
> 2.52.0
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/3] jbd2: use READ_ONCE for lockless jinode reads
2026-02-02 16:40 ` Jan Kara
@ 2026-02-02 16:52 ` Jan Kara
0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2026-02-02 16:52 UTC (permalink / raw)
To: Li Chen; +Cc: Theodore Ts'o, Jan Kara, linux-ext4, linux-kernel
On Mon 02-02-26 17:40:45, Jan Kara wrote:
> On Fri 30-01-26 11:12:30, Li Chen wrote:
> > jbd2_inode fields are updated under journal->j_list_lock, but some
> > paths read them without holding the lock (e.g. fast commit
> > helpers and the ordered truncate fast path).
> >
> > Use READ_ONCE() for these lockless reads to correct the
> > concurrency assumptions.
> >
> > Suggested-by: Jan Kara <jack@suse.com>
> > Signed-off-by: Li Chen <me@linux.beauty>
>
> Just one nit below. With that fixed feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
> > @@ -191,12 +197,30 @@ EXPORT_SYMBOL(jbd2_submit_inode_data);
> >
> > int jbd2_wait_inode_data(journal_t *journal, struct jbd2_inode *jinode)
> > {
> > - if (!jinode || !(jinode->i_flags & JI_WAIT_DATA) ||
> > - !jinode->i_vfs_inode || !jinode->i_vfs_inode->i_mapping)
> > + struct address_space *mapping;
> > + struct inode *inode;
> > + unsigned long flags;
> > + loff_t start, end;
> > +
> > + if (!jinode)
> > + return 0;
> > +
> > + flags = READ_ONCE(jinode->i_flags);
> > + if (!(flags & JI_WAIT_DATA))
> > + return 0;
> > +
> > + inode = READ_ONCE(jinode->i_vfs_inode);
>
> i_vfs_inode never changes so READ_ONCE is pointless here.
One more note: I've realized that for this to work you also need to make
jbd2_journal_file_inode() use WRITE_ONCE() when updating i_dirty_start,
i_dirty_end and i_flags.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-02-02 16:52 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-30 3:12 [PATCH 0/3] jbd2/ext4/ocfs2: READ_ONCE for lockless jinode reads Li Chen
2026-01-30 3:12 ` [PATCH 1/3] jbd2: use " Li Chen
2026-02-02 16:40 ` Jan Kara
2026-02-02 16:52 ` Jan Kara
2026-01-30 3:12 ` [PATCH 2/3] ext4: " Li Chen
2026-02-02 16:41 ` Jan Kara
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox