Re: [PATCH] ocfs2: fix deadlock in dio write orphan cleanup path

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Deepanshu Kartikey" <kartikey406@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-rt-devel@lists.linux.dev
Subject: Re: [PATCH] ocfs2: fix deadlock in dio write orphan cleanup path
Date: Sat, 20 Jun 2026 08:20:26 +0000	[thread overview]
Message-ID: <20260620082026.784E51F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260620080802.35165-1-kartikey406@gmail.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] The patch completely fails to resolve the AB-BA deadlock between the write path and the orphan recovery path, as the exact same lock dependency is immediately recreated.
- [High] The identical lock dependency cycle (A -> B -> D) exists unmitigated in the orphan addition path.
- [High] Directory `i_rwsem` operations use the wrong subclass, violating VFS lock ordering rules and hiding parent-child deadlocks.
--

commit a3ea3ef9d81bdb971acfa026b14e772f33af54fa
Author: Deepanshu Kartikey <kartikey406@gmail.com>

ocfs2: fix deadlock in dio write orphan cleanup path

This commit attempts to fix an AB-BA deadlock in the direct IO orphan cleanup
path by releasing the file's cluster lock before calling
ocfs2_del_inode_from_orphan(). It passes NULL for di_bh to signal the
function to acquire its own lock.

Closes: https://syzkaller.appspot.com/bug?extid=ce129763ce7d7e914739

> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 6ec198bdab121..095cb3e44c65e 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c

[ ... ]

> @@ -2371,17 +2372,25 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
>  		ocfs2_commit_trans(osb, handle);
>  unlock:
>  	up_write(&oi->ip_alloc_sem);
> +	/*
> +	 * Release the cluster lock and di_bh BEFORE calling
> +	 * ocfs2_del_inode_from_orphan(). That function will acquire
> +	 * inode_lock(orphan_dir_inode) which would cause an AB-BA
> +	 *  deadlock with recovery paths that hold orphan_dir lock
> +	 * before acquiring the file inode lock.
> +	 */
> +	orphaned = (!ret && dwc->dw_orphaned);
> +	ocfs2_inode_unlock(inode, 1);
> +	brelse(di_bh);
> +	di_bh = NULL;
>  
> -	/* everything looks good, let's start the cleanup */
> -	if (!ret && dwc->dw_orphaned) {
> +	/* everything looks good, let's start the orphan cleanup */
> +	if (orphaned) {
>  		BUG_ON(dwc->dw_writer_pid != task_pid_nr(current));
> -
> -		ret = ocfs2_del_inode_from_orphan(osb, inode, di_bh, 0, 0);
> +		ret = ocfs2_del_inode_from_orphan(osb, inode, NULL, 0, 0);
>  		if (ret < 0)
>  			mlog_errno(ret);
>  	}
> -	ocfs2_inode_unlock(inode, 1);
> -	brelse(di_bh);
>  out:

[ ... ]

> diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
> index 1277666c77cd2..25bbe2a9776ef 100644
> --- a/fs/ocfs2/namei.c
> +++ b/fs/ocfs2/namei.c
> @@ -2712,10 +2712,21 @@ int ocfs2_del_inode_from_orphan(struct ocfs2_super *osb,
>  {
>  	struct inode *orphan_dir_inode = NULL;
>  	struct buffer_head *orphan_dir_bh = NULL;
> -	struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh->b_data;
> +	struct ocfs2_dinode *di;
>  	handle_t *handle = NULL;
>  	int status = 0;
> +	struct buffer_head *local_di_bh = NULL;
>  
> +	if (!di_bh) {
> +		status = ocfs2_inode_lock(inode, &local_di_bh, 1);

[Severity: High]
This is a pre-existing issue, but is the AB-BA deadlock between the write
path and the orphan recovery path truly resolved?

When di_bh is NULL, this immediately reacquires the exact same
ocfs2_inode_lock on the file inode (Lock B) before the orphan directory's
VFS lock (Lock D) is acquired later in this function:

fs/ocfs2/namei.c:ocfs2_del_inode_from_orphan() {
    ...
    inode_lock(orphan_dir_inode);
    ...
}

Since the lock acquisition order remains exactly the same (B -> D) as it
was before the patch, and the recovery path acquires these locks in the
reverse order (D -> B), doesn't the deadlock cycle remain unmitigated?

Also, as a related pre-existing issue, couldn't the identical lock
dependency cycle (A -> B -> D) still occur unmitigated in the orphan
addition path?

During direct IO writes, ocfs2_dio_get_block() calls
ocfs2_add_inode_to_orphan() while holding the file's VFS lock (Lock A).
Inside ocfs2_add_inode_to_orphan(), the file's cluster lock (Lock B) is
acquired:

fs/ocfs2/namei.c:ocfs2_add_inode_to_orphan() {
    status = ocfs2_inode_lock(inode, &di_bh, 1);
    ...
    status = ocfs2_prepare_orphan_dir(osb, &orphan_dir_inode,
                                      OCFS2_I(inode)->ip_blkno,
                                      orphan_name,
                                      &orphan_insert,
                                      true);
}

This calls ocfs2_prepare_orphan_dir(), which acquires the orphan
directory's VFS lock (Lock D) via inode_lock(). This seems to establish
the exact same A -> B -> D deadlock cycle against the recovery path.

Finally, this is a pre-existing issue, but might the use of inode_lock()
on the orphan directory violate VFS lock ordering rules?

When modifying a directory, the VFS requires acquiring the directory's
i_rwsem with subclass I_MUTEX_PARENT. In ocfs2_del_inode_from_orphan()
and ocfs2_lookup_lock_orphan_dir(), the default inode_lock() is used
instead:

fs/ocfs2/namei.c:ocfs2_del_inode_from_orphan() {
    ...
    inode_lock(orphan_dir_inode);
    ...
}

Because the caller already holds the file's inode_lock (also subclass 0),
acquiring the orphan directory's lock using subclass 0 can create a
lockdep AA self-deadlock or cycle warning. Furthermore, taking a child's
lock before its parent directory's lock intrinsically violates the VFS
hierarchy (I_MUTEX_PARENT -> I_MUTEX_NORMAL). Should this be using
inode_lock_nested(..., I_MUTEX_PARENT) instead?

> +		if (status < 0) {
> +			mlog_errno(status);
> +			return status;
> +		}
> +		di_bh = local_di_bh;
> +	}
> +
> +	di = (struct ocfs2_dinode *)di_bh->b_data;
>  	orphan_dir_inode = ocfs2_get_system_file_inode(osb,
>  			ORPHAN_DIR_SYSTEM_INODE,
>  			le16_to_cpu(di->i_dio_orphaned_slot));

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260620080802.35165-1-kartikey406@gmail.com?part=1

next prev parent reply	other threads:[~2026-06-20  8:20 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-20  8:08 [PATCH] ocfs2: fix deadlock in dio write orphan cleanup path Deepanshu Kartikey
2026-06-20  8:20 ` sashiko-bot [this message]
2026-06-20 17:59 ` Matthew Wilcox
2026-06-20 23:26   ` Deepanshu Kartikey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260620082026.784E51F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kartikey406@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-devel@lists.linux.dev \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.