* [PATCH] ocfs2: fix deadlock in dio write orphan cleanup path
From: Deepanshu Kartikey @ 2026-06-20 8:08 UTC
To: mark, jlbec, joseph.qi, bigeasy, clrkwllms, rostedt
Cc: ocfs2-devel, linux-kernel, linux-rt-devel, Deepanshu Kartikey,
syzbot+ce129763ce7d7e914739

PREEMPT_RT's rtmutex PI chain walker detected a lock dependency
cycle in a single thread:

  ocfs2_file_write_iter()
    inode_lock(file_inode)              [Lock A]
    ocfs2_dio_end_io_write()
      ocfs2_inode_lock()                [Lock B]
      ocfs2_del_inode_from_orphan()
        inode_lock(orphan_dir)          [Lock D] <- cycle detected!

The problem is lock ordering. Lock B is held when Lock D is
acquired. Recovery paths acquire these locks in a different
order creating a potential cycle in the lock dependency graph.

Fix this by releasing Lock B (ocfs2_inode_unlock + brelse(di_bh))
BEFORE calling ocfs2_del_inode_from_orphan(). Pass NULL for di_bh
to signal that ocfs2_del_inode_from_orphan() should acquire its
own fresh cluster lock and di_bh internally.

This ensures consistent lock ordering:

  Before: B held -> D acquired           (inconsistent)
  After:  B released -> B' fresh -> D    (consistent)

Reported-by: syzbot+ce129763ce7d7e914739@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ce129763ce7d7e914739
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
fs/ocfs2/aops.c | 21 +++++++++++++++------
fs/ocfs2/namei.c | 17 ++++++++++++++++-
2 files changed, 31 insertions(+), 7 deletions(-)
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 6ec198bdab12..15b059a23ebc 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2280,6 +2280,7 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 	handle_t *handle = NULL;
 	loff_t end = offset + bytes;
 	int ret = 0, credits = 0, batch = 0;
+	bool orphaned = false;
 	ocfs2_init_dealloc_ctxt(&dealloc);
@@ -2371,17 +2372,25 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 	ocfs2_commit_trans(osb, handle);
 unlock:
 	up_write(&oi->ip_alloc_sem);
+	/*
+	 * Release the cluster lock and di_bh BEFORE calling
+	 * ocfs2_del_inode_from_orphan(). That function will acquire
+	 * inode_lock(orphan_dir_inode) which would cause an AB-BA
+	 * deadlock with recovery paths that hold orphan_dir lock
+	 * before acquiring the file inode lock.
+	 */
+	orphaned = (!ret && dwc->dw_orphaned);
+	ocfs2_inode_unlock(inode, 1);
+	brelse(di_bh);
+	di_bh = NULL;
 
-	/* everything looks good, let's start the cleanup */
-	if (!ret && dwc->dw_orphaned) {
+	/* everything looks good, let's start the orphan cleanup */
+	if (orphaned) {
 		BUG_ON(dwc->dw_writer_pid != task_pid_nr(current));
-
-		ret = ocfs2_del_inode_from_orphan(osb, inode, di_bh, 0, 0);
+		ret = ocfs2_del_inode_from_orphan(osb, inode, NULL, 0, 0);
 		if (ret < 0)
 			mlog_errno(ret);
 	}
-	ocfs2_inode_unlock(inode, 1);
-	brelse(di_bh);
 out:
 	if (data_ac)
 		ocfs2_free_alloc_context(data_ac);
diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
index 1277666c77cd..25bbe2a9776e 100644
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -2712,10 +2712,21 @@ int ocfs2_del_inode_from_orphan(struct ocfs2_super *osb,
 {
 	struct inode *orphan_dir_inode = NULL;
 	struct buffer_head *orphan_dir_bh = NULL;
-	struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh->b_data;
+	struct ocfs2_dinode *di;
 	handle_t *handle = NULL;
 	int status = 0;
+	struct buffer_head *local_di_bh = NULL;
 
+	if (!di_bh) {
+		status = ocfs2_inode_lock(inode, &local_di_bh, 1);
+		if (status < 0) {
+			mlog_errno(status);
+			return status;
+		}
+		di_bh = local_di_bh;
+	}
+
+	di = (struct ocfs2_dinode *)di_bh->b_data;
 	orphan_dir_inode = ocfs2_get_system_file_inode(osb,
 			ORPHAN_DIR_SYSTEM_INODE,
 			le16_to_cpu(di->i_dio_orphaned_slot));
@@ -2779,6 +2790,10 @@ int ocfs2_del_inode_from_orphan(struct ocfs2_super *osb,
 	iput(orphan_dir_inode);
 bail:
+	if (local_di_bh) {
+		ocfs2_inode_unlock(inode, 1);
+		brelse(local_di_bh);
+	}
 	return status;
 }
--
2.43.0
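
The NULL di_bh convention in the namei.c hunk (take the lock yourself when the caller passes NULL, and release only what you took) can be illustrated with a minimal userspace sketch. Everything here is a hypothetical stand-in: fake_inode, fake_bh, fake_inode_lock(), fake_inode_unlock(), fake_brelse() and del_from_orphan() only mimic the shape of the patched function and are not the ocfs2 API.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the inode buffer and the cluster-locked inode. */
struct fake_bh {
	int refs;		/* outstanding references to the buffer */
};

struct fake_inode {
	int lock_held;		/* how many times the "cluster lock" is held */
	struct fake_bh bh;
};

/* Take the lock and hand back a referenced buffer, like a lock+read helper. */
static int fake_inode_lock(struct fake_inode *inode, struct fake_bh **bh)
{
	inode->lock_held++;
	inode->bh.refs++;
	*bh = &inode->bh;
	return 0;
}

static void fake_inode_unlock(struct fake_inode *inode)
{
	inode->lock_held--;
}

static void fake_brelse(struct fake_bh *bh)
{
	if (bh)
		bh->refs--;
}

/*
 * Same shape as the patched function: if the caller passed no buffer,
 * acquire lock and buffer locally and remember that in local_di_bh so
 * the exit path releases exactly what this function acquired.
 */
static int del_from_orphan(struct fake_inode *inode, struct fake_bh *di_bh)
{
	struct fake_bh *local_di_bh = NULL;
	int status = 0;

	if (!di_bh) {
		status = fake_inode_lock(inode, &local_di_bh);
		if (status < 0)
			return status;
		di_bh = local_di_bh;
	}

	/* Work that reads di_bh would go here. */
	(void)di_bh;

	if (local_di_bh) {
		fake_inode_unlock(inode);
		fake_brelse(local_di_bh);
	}
	return status;
}
```

In this model a caller that passes NULL sees the lock count and buffer references return to their starting values, while a caller that passes its own buffer keeps ownership of both. Note that the sketch says nothing about lock ordering; it only shows the ownership bookkeeping.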
* Re: [PATCH] ocfs2: fix deadlock in dio write orphan cleanup path
From: sashiko-bot @ 2026-06-20 8:20 UTC
To: Deepanshu Kartikey; +Cc: linux-kernel, linux-rt-devel
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
Pre-existing issues:
- [High] The patch completely fails to resolve the AB-BA deadlock between the write path and the orphan recovery path, as the exact same lock dependency is immediately recreated.
- [High] The identical lock dependency cycle (A -> B -> D) exists unmitigated in the orphan addition path.
- [High] Directory `i_rwsem` operations use the wrong subclass, violating VFS lock ordering rules and hiding parent-child deadlocks.
--
commit a3ea3ef9d81bdb971acfa026b14e772f33af54fa
Author: Deepanshu Kartikey <kartikey406@gmail.com>
ocfs2: fix deadlock in dio write orphan cleanup path
This commit attempts to fix an AB-BA deadlock in the direct IO orphan cleanup
path by releasing the file's cluster lock before calling
ocfs2_del_inode_from_orphan(). It passes NULL for di_bh to signal the
function to acquire its own lock.
Closes: https://syzkaller.appspot.com/bug?extid=ce129763ce7d7e914739
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 6ec198bdab121..095cb3e44c65e 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
[ ... ]
> @@ -2371,17 +2372,25 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
>  	ocfs2_commit_trans(osb, handle);
>  unlock:
>  	up_write(&oi->ip_alloc_sem);
> +	/*
> +	 * Release the cluster lock and di_bh BEFORE calling
> +	 * ocfs2_del_inode_from_orphan(). That function will acquire
> +	 * inode_lock(orphan_dir_inode) which would cause an AB-BA
> +	 * deadlock with recovery paths that hold orphan_dir lock
> +	 * before acquiring the file inode lock.
> +	 */
> +	orphaned = (!ret && dwc->dw_orphaned);
> +	ocfs2_inode_unlock(inode, 1);
> +	brelse(di_bh);
> +	di_bh = NULL;
>
> -	/* everything looks good, let's start the cleanup */
> -	if (!ret && dwc->dw_orphaned) {
> +	/* everything looks good, let's start the orphan cleanup */
> +	if (orphaned) {
>  		BUG_ON(dwc->dw_writer_pid != task_pid_nr(current));
> -
> -		ret = ocfs2_del_inode_from_orphan(osb, inode, di_bh, 0, 0);
> +		ret = ocfs2_del_inode_from_orphan(osb, inode, NULL, 0, 0);
>  		if (ret < 0)
>  			mlog_errno(ret);
>  	}
> -	ocfs2_inode_unlock(inode, 1);
> -	brelse(di_bh);
>  out:
[ ... ]
> diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
> index 1277666c77cd2..25bbe2a9776ef 100644
> --- a/fs/ocfs2/namei.c
> +++ b/fs/ocfs2/namei.c
> @@ -2712,10 +2712,21 @@ int ocfs2_del_inode_from_orphan(struct ocfs2_super *osb,
>  {
>  	struct inode *orphan_dir_inode = NULL;
>  	struct buffer_head *orphan_dir_bh = NULL;
> -	struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh->b_data;
> +	struct ocfs2_dinode *di;
>  	handle_t *handle = NULL;
>  	int status = 0;
> +	struct buffer_head *local_di_bh = NULL;
>
> +	if (!di_bh) {
> +		status = ocfs2_inode_lock(inode, &local_di_bh, 1);

[Severity: High]
This is a pre-existing issue, but is the AB-BA deadlock between the write
path and the orphan recovery path truly resolved?

When di_bh is NULL, this immediately reacquires the exact same
ocfs2_inode_lock on the file inode (Lock B) before the orphan directory's
VFS lock (Lock D) is acquired later in this function:

fs/ocfs2/namei.c:ocfs2_del_inode_from_orphan() {
	...
	inode_lock(orphan_dir_inode);
	...
}

Since the lock acquisition order remains exactly the same (B -> D) as it
was before the patch, and the recovery path acquires these locks in the
reverse order (D -> B), doesn't the deadlock cycle remain unmitigated?

Also, as a related pre-existing issue, couldn't the identical lock
dependency cycle (A -> B -> D) still occur unmitigated in the orphan
addition path?

During direct IO writes, ocfs2_dio_get_block() calls
ocfs2_add_inode_to_orphan() while holding the file's VFS lock (Lock A).
Inside ocfs2_add_inode_to_orphan(), the file's cluster lock (Lock B) is
acquired:

fs/ocfs2/namei.c:ocfs2_add_inode_to_orphan() {
	status = ocfs2_inode_lock(inode, &di_bh, 1);
	...
	status = ocfs2_prepare_orphan_dir(osb, &orphan_dir_inode,
					  OCFS2_I(inode)->ip_blkno,
					  orphan_name,
					  &orphan_insert,
					  true);
}

This calls ocfs2_prepare_orphan_dir(), which acquires the orphan
directory's VFS lock (Lock D) via inode_lock(). This seems to establish
the exact same A -> B -> D deadlock cycle against the recovery path.

Finally, this is a pre-existing issue, but might the use of inode_lock()
on the orphan directory violate VFS lock ordering rules?

When modifying a directory, the VFS requires acquiring the directory's
i_rwsem with subclass I_MUTEX_PARENT. In ocfs2_del_inode_from_orphan()
and ocfs2_lookup_lock_orphan_dir(), the default inode_lock() is used
instead:

fs/ocfs2/namei.c:ocfs2_del_inode_from_orphan() {
	...
	inode_lock(orphan_dir_inode);
	...
}

Because the caller already holds the file's inode_lock (also subclass 0),
acquiring the orphan directory's lock using subclass 0 can create a
lockdep AA self-deadlock or cycle warning. Furthermore, taking a child's
lock before its parent directory's lock intrinsically violates the VFS
hierarchy (I_MUTEX_PARENT -> I_MUTEX_NORMAL). Should this be using
inode_lock_nested(..., I_MUTEX_PARENT) instead?

> +		if (status < 0) {
> +			mlog_errno(status);
> +			return status;
> +		}
> +		di_bh = local_di_bh;
> +	}
> +
> +	di = (struct ocfs2_dinode *)di_bh->b_data;
>  	orphan_dir_inode = ocfs2_get_system_file_inode(osb,
>  			ORPHAN_DIR_SYSTEM_INODE,
>  			le16_to_cpu(di->i_dio_orphaned_slot));
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260620080802.35165-1-kartikey406@gmail.com?part=1
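
The review's first point, that dropping and retaking Lock B does not change the recorded order B -> D, can be checked with a toy lock-order graph. The sketch below is a userspace model only, not lockdep and not the rtmutex chain walker. Lock names follow the thread (A, B, D); the recovery path's D -> B order is taken from the review as an assumption, and every identifier (lock_graph, acquire(), release(), patched_write_path(), recovery_path(), has_cycle()) is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

enum { LOCK_A, LOCK_B, LOCK_D, NLOCKS };

struct lock_graph {
	bool edge[NLOCKS][NLOCKS];	/* edge[x][y]: y was taken while x was held */
	bool held[NLOCKS];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record an ordering edge from every currently held lock to the new one. */
static void acquire(struct lock_graph *g, int lock)
{
	for (int h = 0; h < NLOCKS; h++)
		if (g->held[h])
			g->edge[h][lock] = true;
	g->held[lock] = true;
}

static void release(struct lock_graph *g, int lock)
{
	g->held[lock] = false;
}

/* Depth-first search along recorded edges. */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int n = 0; n < NLOCKS; n++)
		if (g->edge[from][n] && !seen[n] && reachable(g, n, to, seen))
			return true;
	return false;
}

/* A cycle exists if some edge x -> y can be closed by a path y -> ... -> x. */
static bool has_cycle(const struct lock_graph *g)
{
	for (int x = 0; x < NLOCKS; x++)
		for (int y = 0; y < NLOCKS; y++)
			if (g->edge[x][y]) {
				bool seen[NLOCKS] = { false };

				if (reachable(g, y, x, seen))
					return true;
			}
	return false;
}

/* Write path as patched: B is dropped, then retaken before D. */
static void patched_write_path(struct lock_graph *g)
{
	acquire(g, LOCK_A);
	acquire(g, LOCK_B);
	release(g, LOCK_B);	/* caller releases B before orphan cleanup */
	acquire(g, LOCK_B);	/* callee retakes B because di_bh is NULL */
	acquire(g, LOCK_D);
	release(g, LOCK_D);
	release(g, LOCK_B);
	release(g, LOCK_A);
}

/* Recovery path order as described in the review (assumed): D, then B. */
static void recovery_path(struct lock_graph *g)
{
	acquire(g, LOCK_D);
	acquire(g, LOCK_B);
	release(g, LOCK_B);
	release(g, LOCK_D);
}
```

In this model the patched write path still records the edge B -> D, so adding the assumed recovery order D -> B closes a cycle exactly as it would have before the patch. Whether that cycle is a real deadlock or a false positive is a separate question that the model does not answer.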
* Re: [PATCH] ocfs2: fix deadlock in dio write orphan cleanup path
From: Matthew Wilcox @ 2026-06-20 17:59 UTC
To: Deepanshu Kartikey
Cc: mark, jlbec, joseph.qi, bigeasy, clrkwllms, rostedt, ocfs2-devel,
linux-kernel, linux-rt-devel, syzbot+ce129763ce7d7e914739
On Sat, Jun 20, 2026 at 01:38:02PM +0530, Deepanshu Kartikey wrote:
> PREEMPT_RT's rtmutex PI chain walker detected a lock dependency
> cycle in a single thread:
>
>   ocfs2_file_write_iter()
>     inode_lock(file_inode)              [Lock A]
>     ocfs2_dio_end_io_write()
>       ocfs2_inode_lock()                [Lock B]
>       ocfs2_del_inode_from_orphan()
>         inode_lock(orphan_dir)          [Lock D] <- cycle detected!
This seems like a false positive. You can't call write_iter() on
a directory, and orphan_dir is always a directory.
I would suggest that the easiest way to make this warning go away is to
replace inode_lock(orphan_dir) with inode_lock_nested(orphan_dir,
I_MUTEX_NONDIR2). It's a bit quirky because, well, orphan_dir is a
directory. We could add a seventh lock class to
inode_i_mutex_lock_class, but that feels a bit excessive.
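
Why a subclass changes what a lock validator sees can be modelled in a few lines. The sketch assumes a deliberately simplified validator that identifies each held lock by a (class, subclass) pair and complains when the same pair is taken twice in one task. The enum mirrors the order of the kernel's inode_i_mutex_lock_class as an assumption to verify against include/linux/fs.h, and all model_* and MODEL_* names are hypothetical.

```c
#include <stdbool.h>

/* Assumed to follow the order of inode_i_mutex_lock_class (six entries). */
enum model_subclass {
	MODEL_NORMAL,
	MODEL_PARENT,
	MODEL_CHILD,
	MODEL_XATTR,
	MODEL_NONDIR2,
	MODEL_PARENT2,
};

struct model_held_lock {
	int class_key;		/* which lock class, e.g. "inode i_rwsem" */
	int subclass;		/* nesting annotation supplied by the caller */
};

#define MODEL_MAX_HELD 8

struct model_task {
	struct model_held_lock held[MODEL_MAX_HELD];
	int depth;
};

/*
 * Returns false when the same (class, subclass) pair is already held,
 * which is what the simplified validator reports as possible recursive
 * locking. Also returns false if the held-lock table is full.
 */
static bool model_lock_nested(struct model_task *t, int class_key, int subclass)
{
	for (int i = 0; i < t->depth; i++)
		if (t->held[i].class_key == class_key &&
		    t->held[i].subclass == subclass)
			return false;

	if (t->depth == MODEL_MAX_HELD)
		return false;

	t->held[t->depth].class_key = class_key;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return true;
}
```

With one class key standing in for both inodes, taking it twice with MODEL_NORMAL is rejected, while taking the second one with MODEL_NONDIR2 is accepted because the pair differs. That is the effect the suggested inode_lock_nested() annotation relies on; the real validator tracks far more state than this.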
* Re: [PATCH] ocfs2: fix deadlock in dio write orphan cleanup path
From: Deepanshu Kartikey @ 2026-06-20 23:26 UTC
To: Matthew Wilcox
Cc: mark, jlbec, joseph.qi, bigeasy, clrkwllms, rostedt, ocfs2-devel,
linux-kernel, linux-rt-devel, syzbot+ce129763ce7d7e914739
On Sat, Jun 20, 2026 at 11:29 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> This seems like a false positive. You can't call write_iter() on
> a directory, and orphan_dir is always a directory.
>
> I would suggest that the easiest way to make this warning go away is to
> replace inode_lock(orphan_dir) with inode_lock_nested(orphan_dir,
> I_MUTEX_NONDIR2). It's a bit quirky because, well, orphan_dir is a
> directory. We could add a seventh lock class to
> inode_i_mutex_lock_class, but that feels a bit excessive.
>
Thanks for the review. I have sent v2 of the patch, which uses
inode_lock_nested(orphan_dir_inode, I_MUTEX_NONDIR2).