From: Marco Elver <elver@google.com>
To: elver@google.com
Cc: Mark Fasheh <mark@fasheh.com>, Joel Becker <jlbec@evilplan.org>,
	Joseph Qi <joseph.qi@linux.alibaba.com>,
	ocfs2-devel@lists.linux.dev, linux-kernel@vger.kernel.org,
	kasan-dev@googlegroups.com
Subject: [PATCH] ocfs2: fix orphan inode disk leak in ocfs2_dio_end_io() on I/O error
Date: Thu, 11 Jun 2026 17:01:50 +0200
Message-ID: <20260611150341.3964327-1-elver@google.com>

When an extending direct I/O write or a direct I/O write racing with an
unlink is initiated, ocfs2_direct_IO() places the user inode into the
system orphan directory and sets the OCFS2_DIO_ORPHANED_FL flag to
ensure defined behavior and crash consistency.

However, if the direct I/O request fails or is cancelled asynchronously
(bytes <= 0), the completion hook ocfs2_dio_end_io() bypasses
ocfs2_dio_end_io_write() entirely and only calls
ocfs2_dio_free_write_ctx(). The orphan entry is never torn down, which
leaks the user inode in the orphan directory and leaves the
OCFS2_DIO_ORPHANED_FL on-disk flag set.

Because OCFS2_DIO_ORPHANED_FL remains set, the final inode eviction
(ocfs2_delete_inode()) observes the flag, assumes a direct I/O write is
still in progress, and refuses to wipe the inode. The result is a disk
space and resource leak that can only be reclaimed once the cluster
unmounts or crashes.

Fix this by making ocfs2_dio_end_io() check dw_orphaned even when an
I/O error occurs, and call ocfs2_del_inode_from_orphan() to remove the
inode from the orphan directory before freeing the in-memory write
context.

Fixes: 5040f8df56fb ("ocfs2: free up write context when direct IO failed")
Assisted-by: Antigravity:Gemini
Signed-off-by: Marco Elver <elver@google.com>
---
 fs/ocfs2/aops.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 4acdbb70882c..ad3f2057e26e 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2419,11 +2419,24 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
(long long)bytes);
if (private) {
- if (bytes > 0)
+ if (bytes > 0) {
ret = ocfs2_dio_end_io_write(inode, private, offset,
bytes);
- else
+ } else {
+ struct ocfs2_dio_write_ctxt *dwc = private;
+
+ if (dwc->dw_orphaned) {
+ struct buffer_head *di_bh = NULL;
+
+ if (ocfs2_inode_lock(inode, &di_bh, 1) == 0) {
+ ocfs2_del_inode_from_orphan(OCFS2_SB(inode->i_sb),
+ inode, di_bh, 0, 0);
+ ocfs2_inode_unlock(inode, 1);
+ brelse(di_bh);
+ }
+ }
ocfs2_dio_free_write_ctx(inode, private);
+ }
}
ocfs2_iocb_clear_rw_locked(iocb);
--
2.54.0.1099.g489fc7bff1-goog