* [PATCH] ocfs2: fix orphan inode disk leak in ocfs2_dio_end_io() on I/O error
@ 2026-06-11 15:01 Marco Elver
  2026-06-12  1:27 ` Heming Zhao
  0 siblings, 1 reply; 2+ messages in thread
From: Marco Elver @ 2026-06-11 15:01 UTC (permalink / raw)
  To: elver
  Cc: Mark Fasheh, Joel Becker, Joseph Qi, ocfs2-devel, linux-kernel,
	kasan-dev

When an extending direct I/O write or a direct I/O write racing with an
unlink is initiated, ocfs2_direct_IO() places the user inode into the
system orphan directory and sets the OCFS2_DIO_ORPHANED_FL flag to
ensure defined behavior and crash consistency.

However, if the direct I/O request fails or is cancelled asynchronously
(bytes <= 0), the VFS completion hook ocfs2_dio_end_io() bypasses
ocfs2_dio_end_io_write() entirely and only calls
ocfs2_dio_free_write_ctx().  The orphan entry is never torn down, so the
user inode stays in the orphan directory with the on-disk
OCFS2_DIO_ORPHANED_FL flag still set.

Because OCFS2_DIO_ORPHANED_FL remains set, the subsequent final inode
eviction (ocfs2_delete_inode()) observes the flag, assumes a direct I/O
write is still in progress, and refuses to wipe the inode.  The result
is a disk space and resource leak that can only be reclaimed when the
cluster unmounts or crashes.

Fix this by making ocfs2_dio_end_io() check dw_orphaned even when an I/O
error occurs, and call ocfs2_del_inode_from_orphan() to release the
inode before destroying the in-memory write context.

Fixes: 5040f8df56fb ("ocfs2: free up write context when direct IO failed")
Assisted-by: Antigravity:Gemini
Signed-off-by: Marco Elver <elver@google.com>
---
 fs/ocfs2/aops.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 4acdbb70882c..ad3f2057e26e 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2419,11 +2419,24 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
 		mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
 				 (long long)bytes);
 	if (private) {
-		if (bytes > 0)
+		if (bytes > 0) {
 			ret = ocfs2_dio_end_io_write(inode, private, offset,
 						     bytes);
-		else
+		} else {
+			struct ocfs2_dio_write_ctxt *dwc = private;
+
+			if (dwc->dw_orphaned) {
+				struct buffer_head *di_bh = NULL;
+
+				if (ocfs2_inode_lock(inode, &di_bh, 1) == 0) {
+					ocfs2_del_inode_from_orphan(OCFS2_SB(inode->i_sb),
+								    inode, di_bh, 0, 0);
+					ocfs2_inode_unlock(inode, 1);
+					brelse(di_bh);
+				}
+			}
 			ocfs2_dio_free_write_ctx(inode, private);
+		}
 	}
 
 	ocfs2_iocb_clear_rw_locked(iocb);
-- 
2.54.0.1099.g489fc7bff1-goog
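
The error path described in the commit message can be modelled in user
space. The sketch below is a toy state machine, not kernel code: every
toy_* name is hypothetical, the two booleans stand in for the orphan
directory entry and OCFS2_DIO_ORPHANED_FL, and the cleanup_on_error
switch selects between the behaviour before and after this patch. It
assumes, as the commit message implies, that the success path already
removes the orphan entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's types. */
struct toy_inode {
	bool in_orphan_dir;   /* models the orphan directory entry */
	bool dio_orphaned_fl; /* models OCFS2_DIO_ORPHANED_FL */
};

struct toy_write_ctxt {
	bool dw_orphaned;     /* models dwc->dw_orphaned */
};

/* Models the setup the commit message attributes to ocfs2_direct_IO(). */
static void toy_dio_start(struct toy_inode *inode, struct toy_write_ctxt *dwc)
{
	inode->in_orphan_dir = true;
	inode->dio_orphaned_fl = true;
	dwc->dw_orphaned = true;
}

/* Models orphan teardown (stand-in for ocfs2_del_inode_from_orphan()). */
static void toy_del_from_orphan(struct toy_inode *inode)
{
	inode->in_orphan_dir = false;
	inode->dio_orphaned_fl = false;
}

/*
 * Models the completion hook. cleanup_on_error == false is the
 * pre-patch behaviour, true is the behaviour the patch proposes.
 */
static void toy_dio_end_io(struct toy_inode *inode,
			   struct toy_write_ctxt *dwc,
			   long long bytes, bool cleanup_on_error)
{
	assert(inode != NULL && dwc != NULL);

	if (bytes > 0) {
		/* Success path: assumed to tear down the orphan entry. */
		if (dwc->dw_orphaned)
			toy_del_from_orphan(inode);
	} else if (cleanup_on_error && dwc->dw_orphaned) {
		/* Error path with the proposed cleanup. */
		toy_del_from_orphan(inode);
	}

	/* The write context is released on both paths. */
	dwc->dw_orphaned = false;
}

/* Returns true if a failed write leaves orphan state behind. */
static bool toy_leaks_on_error(bool cleanup_on_error)
{
	struct toy_inode inode = { false, false };
	struct toy_write_ctxt dwc = { false };

	toy_dio_start(&inode, &dwc);
	toy_dio_end_io(&inode, &dwc, -5, cleanup_on_error);
	return inode.in_orphan_dir || inode.dio_orphaned_fl;
}
```

In this model toy_leaks_on_error(false) reports leftover orphan state
and toy_leaks_on_error(true) does not. The model says nothing about
locking, journaling, or whether the cleanup is safe in the real
completion context.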


* Re: [PATCH] ocfs2: fix orphan inode disk leak in ocfs2_dio_end_io() on I/O error
  2026-06-11 15:01 [PATCH] ocfs2: fix orphan inode disk leak in ocfs2_dio_end_io() on I/O error Marco Elver
@ 2026-06-12  1:27 ` Heming Zhao
  0 siblings, 0 replies; 2+ messages in thread
From: Heming Zhao @ 2026-06-12  1:27 UTC (permalink / raw)
  To: Marco Elver
  Cc: Mark Fasheh, Joel Becker, Joseph Qi, ocfs2-devel, linux-kernel,
	kasan-dev

On Thu, Jun 11, 2026 at 05:01:50PM +0200, Marco Elver wrote:
> When an extending direct I/O write or a direct I/O write racing with an
> unlink is initiated, ocfs2_direct_IO() places the user inode into the
> system orphan directory and sets the OCFS2_DIO_ORPHANED_FL flag to
> ensure defined behavior and crash consistency.
> 
> However, if the direct I/O request fails or is cancelled asynchronously
> (bytes <= 0), the VFS completion hook ocfs2_dio_end_io() bypasses
> ocfs2_dio_end_io_write() entirely and only calls
> ocfs2_dio_free_write_ctx().  The orphan entry is never torn down, so the
> user inode stays in the orphan directory with the on-disk
> OCFS2_DIO_ORPHANED_FL flag still set.
> 
> Because OCFS2_DIO_ORPHANED_FL remains set, the subsequent final inode
> eviction (ocfs2_delete_inode()) observes the flag, assumes a direct I/O
> write is still in progress, and refuses to wipe the inode.  The result
> is a disk space and resource leak that can only be reclaimed when the
> cluster unmounts or crashes.
> 
> Fix this by making ocfs2_dio_end_io() check dw_orphaned even when an I/O
> error occurs, and call ocfs2_del_inode_from_orphan() to release the
> inode before destroying the in-memory write context.
> 
> Fixes: 5040f8df56fb ("ocfs2: free up write context when direct IO failed")
> Assisted-by: Antigravity:Gemini
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>  fs/ocfs2/aops.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 4acdbb70882c..ad3f2057e26e 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -2419,11 +2419,24 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
>  		mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
>  				 (long long)bytes);
>  	if (private) {
> -		if (bytes > 0)
> +		if (bytes > 0) {
>  			ret = ocfs2_dio_end_io_write(inode, private, offset,
>  						     bytes);
> -		else
> +		} else {
> +			struct ocfs2_dio_write_ctxt *dwc = private;
> +
> +			if (dwc->dw_orphaned) {
> +				struct buffer_head *di_bh = NULL;
> +
> +				if (ocfs2_inode_lock(inode, &di_bh, 1) == 0) {
> +					ocfs2_del_inode_from_orphan(OCFS2_SB(inode->i_sb),
> +								    inode, di_bh, 0, 0);
> +					ocfs2_inode_unlock(inode, 1);
> +					brelse(di_bh);
> +				}

Calling ocfs2_del_inode_from_orphan() without also calling
ocfs2_truncate_file() will leave stale blocks beyond EOF.

I think the existing OCFS2 code already handles the error/crash cases for
orphaned inodes, and this "leaking" behavior is by design.
Please refer to ocfs2_recover_orphans() and ocfs2_add_inode_to_orphan().

Thanks,
Heming
> +			}
>  			ocfs2_dio_free_write_ctx(inode, private);
> +		}
>  	}
>  
>  	ocfs2_iocb_clear_rw_locked(iocb);
> -- 
> 2.54.0.1099.g489fc7bff1-goog
> 
> 
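
The review comment about stale blocks can be illustrated with a small
user-space model. This is only a sketch under assumptions: the toy_*
names are hypothetical, block counts replace real extents, and
toy_truncate_to_isize() merely stands in for the role the reviewer
assigns to ocfs2_truncate_file(). It compares removing the orphan entry
alone with truncating to i_size first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy file: block counts only, no real extents or on-disk layout. */
struct toy_file {
	unsigned long i_size_blocks;    /* blocks covered by i_size */
	unsigned long allocated_blocks; /* blocks actually allocated */
	bool in_orphan_dir;             /* models the orphan entry */
};

/*
 * Models an extending direct write that allocated space and then
 * failed before i_size was updated.
 */
static void toy_failed_extend(struct toy_file *f, unsigned long extra)
{
	f->in_orphan_dir = true;
	f->allocated_blocks += extra;
}

/* Stand-in for truncating the file back to its recorded size. */
static void toy_truncate_to_isize(struct toy_file *f)
{
	if (f->allocated_blocks > f->i_size_blocks)
		f->allocated_blocks = f->i_size_blocks;
}

/* Stand-in for removing the orphan directory entry. */
static void toy_orphan_del(struct toy_file *f)
{
	f->in_orphan_dir = false;
}

/* Blocks allocated past EOF that no i_size covers. */
static unsigned long toy_stale_blocks(const struct toy_file *f)
{
	assert(f->allocated_blocks >= f->i_size_blocks);
	return f->allocated_blocks - f->i_size_blocks;
}

/* Runs one failed extend plus cleanup; returns the stale block count. */
static unsigned long toy_cleanup(bool truncate_first)
{
	struct toy_file f = { 4, 4, false };

	toy_failed_extend(&f, 3);
	if (truncate_first)
		toy_truncate_to_isize(&f);
	toy_orphan_del(&f);
	return toy_stale_blocks(&f);
}
```

In this model toy_cleanup(false) leaves three blocks past EOF while
toy_cleanup(true) leaves none, which is the ordering concern raised in
the review. Whether the real recovery path behaves this way should be
checked against ocfs2_recover_orphans() itself.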
