The Linux Kernel Mailing List
 help / color / mirror / Atom feed
* [PATCH] ocfs2: reject inconsistent inode size before truncate
@ 2026-05-09  9:10 ZhengYuan Huang
  2026-05-10  3:53 ` Joseph Qi
  0 siblings, 1 reply; 3+ messages in thread
From: ZhengYuan Huang @ 2026-05-09  9:10 UTC (permalink / raw)
  To: mark, jlbec, joseph.qi
  Cc: ocfs2-devel, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
	ZhengYuan Huang

[BUG]
openat(..., O_WRONLY|O_CREAT|O_TRUNC) can hit:

kernel BUG at fs/ocfs2/file.c:454!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:ocfs2_truncate_file+0x1204/0x13c0 fs/ocfs2/file.c:454
Call Trace:
 ocfs2_setattr+0xa6d/0x1fd0 fs/ocfs2/file.c:1212
 notify_change+0x4b5/0x1030 fs/attr.c:546
 do_truncate+0x1d2/0x230 fs/open.c:68
 handle_truncate fs/namei.c:3596 [inline]
 do_open fs/namei.c:3979 [inline]
 path_openat+0x260f/0x2ce0 fs/namei.c:4134
 do_filp_open+0x1f6/0x430 fs/namei.c:4161
 do_sys_openat2+0x117/0x1c0 fs/open.c:1437
 do_sys_open fs/open.c:1452 [inline]
 __do_sys_openat fs/open.c:1468 [inline]
 __se_sys_openat fs/open.c:1463 [inline]
 __x64_sys_openat+0x15b/0x220 fs/open.c:1463
 ...

[CAUSE]
ocfs2_truncate_file() treats di_bh->i_size matching inode->i_size as an
internal code invariant and BUGs if it is broken.

That assumption is too strong for corrupted metadata. The dinode block can
still be structurally valid enough to pass ocfs2_read_inode_block() while
no longer matching an already-instantiated VFS inode. On local mounts,
ocfs2_inode_lock_update() skips refresh entirely, so truncate can
observe the mismatch directly and crash instead of rejecting the
corruption.

[FIX]
Turn the BUG_ON into normal OCFS2 corruption handling. If truncate sees
di_bh->i_size disagree with inode->i_size, report it with ocfs2_error() and
abort before touching truncate state.

This keeps the fix at the first boundary that actually requires the
sizes to match and avoids widening checks into hotter generic
inode-lock paths.

Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -444,15 +444,17 @@ int ocfs2_truncate_file(struct inode *inode,
 	int status = 0;
 	struct ocfs2_dinode *fe = NULL;
 	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
 
-	/* We trust di_bh because it comes from ocfs2_inode_lock(), which
-	 * already validated it */
+	/*
+	 * ocfs2_inode_lock() validated the dinode block, but truncation
+	 * still needs to reject an inode state that no longer matches
+	 * di_bh.
+	 */
 	fe = (struct ocfs2_dinode *) di_bh->b_data;
 
 	trace_ocfs2_truncate_file((unsigned long long)OCFS2_I(inode)->ip_blkno,
 				  (unsigned long long)le64_to_cpu(fe->i_size),
 				  (unsigned long long)new_i_size);
 
-	mlog_bug_on_msg(le64_to_cpu(fe->i_size) != i_size_read(inode),
-			"Inode %llu, inode i_size = %lld != di "
-			"i_size = %llu, i_flags = 0x%x\n",
-			(unsigned long long)OCFS2_I(inode)->ip_blkno,
-			i_size_read(inode),
-			(unsigned long long)le64_to_cpu(fe->i_size),
-			le32_to_cpu(fe->i_flags));
+	if (unlikely(le64_to_cpu(fe->i_size) != i_size_read(inode))) {
+		status = ocfs2_error(inode->i_sb,
+				     "Inode %llu has inconsistent i_size: inode = %lld, dinode = %llu, i_flags = 0x%x\n",
+				     (unsigned long long)OCFS2_I(inode)->ip_blkno,
+				     i_size_read(inode),
+				     (unsigned long long)le64_to_cpu(fe->i_size),
+				     le32_to_cpu(fe->i_flags));
+		goto bail;
+	}
 
 	if (new_i_size > le64_to_cpu(fe->i_size)) {
 		trace_ocfs2_truncate_file_error(
 			(unsigned long long)le64_to_cpu(fe->i_size),
-- 
2.43.0

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] ocfs2: reject inconsistent inode size before truncate
  2026-05-09  9:10 [PATCH] ocfs2: reject inconsistent inode size before truncate ZhengYuan Huang
@ 2026-05-10  3:53 ` Joseph Qi
  2026-05-12  2:17   ` ZhengYuan Huang
  0 siblings, 1 reply; 3+ messages in thread
From: Joseph Qi @ 2026-05-10  3:53 UTC (permalink / raw)
  To: ZhengYuan Huang
  Cc: ocfs2-devel, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
	Mark Fasheh, Joel Becker



On 5/9/26 5:10 PM, ZhengYuan Huang wrote:
> [BUG]
> openat(..., O_WRONLY|O_CREAT|O_TRUNC) can hit:
> 
> kernel BUG at fs/ocfs2/file.c:454!
> Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
> RIP: 0010:ocfs2_truncate_file+0x1204/0x13c0 fs/ocfs2/file.c:454
> Call Trace:
>  ocfs2_setattr+0xa6d/0x1fd0 fs/ocfs2/file.c:1212
>  notify_change+0x4b5/0x1030 fs/attr.c:546
>  do_truncate+0x1d2/0x230 fs/open.c:68
>  handle_truncate fs/namei.c:3596 [inline]
>  do_open fs/namei.c:3979 [inline]
>  path_openat+0x260f/0x2ce0 fs/namei.c:4134
>  do_filp_open+0x1f6/0x430 fs/namei.c:4161
>  do_sys_openat2+0x117/0x1c0 fs/open.c:1437
>  do_sys_open fs/open.c:1452 [inline]
>  __do_sys_openat fs/open.c:1468 [inline]
>  __se_sys_openat fs/open.c:1463 [inline]
>  __x64_sys_openat+0x15b/0x220 fs/open.c:1463
>  ...
> 
> [CAUSE]
> ocfs2_truncate_file() treats di_bh->i_size matching inode->i_size as an
> internal code invariant and BUGs if it is broken.
> 
> That assumption is too strong for corrupted metadata. The dinode block can
> still be structurally valid enough to pass ocfs2_read_inode_block() while
> no longer matching an already-instantiated VFS inode. On local mounts,
> ocfs2_inode_lock_update() skips refresh entirely, so truncate can
> observe the mismatch directly and crash instead of rejecting the
> corruption.
> 
> [FIX]
> Turn the BUG_ON into normal OCFS2 corruption handling. If truncate sees
> di_bh->i_size disagree with inode->i_size, report it with ocfs2_error() and
> abort before touching truncate state.
> 
> This keeps the fix at the first boundary that actually requires the
> sizes to match and avoids widening checks into hotter generic
> inode-lock paths.
> 
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -444,15 +444,17 @@ int ocfs2_truncate_file(struct inode *inode,
>  	int status = 0;
>  	struct ocfs2_dinode *fe = NULL;
>  	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>  
> -	/* We trust di_bh because it comes from ocfs2_inode_lock(), which
> -	 * already validated it */
> +	/*
> +	 * ocfs2_inode_lock() validated the dinode block, but truncation
> +	 * still needs to reject an inode state that no longer matches
> +	 * di_bh.
> +	 */

In commit log, you mention the case is local mount.
So here comments should refer it.

Thanks,
Joseph

>  	fe = (struct ocfs2_dinode *) di_bh->b_data;
>  
>  	trace_ocfs2_truncate_file((unsigned long long)OCFS2_I(inode)->ip_blkno,
>  				  (unsigned long long)le64_to_cpu(fe->i_size),
>  				  (unsigned long long)new_i_size);
>  
> -	mlog_bug_on_msg(le64_to_cpu(fe->i_size) != i_size_read(inode),
> -			"Inode %llu, inode i_size = %lld != di "
> -			"i_size = %llu, i_flags = 0x%x\n",
> -			(unsigned long long)OCFS2_I(inode)->ip_blkno,
> -			i_size_read(inode),
> -			(unsigned long long)le64_to_cpu(fe->i_size),
> -			le32_to_cpu(fe->i_flags));
> +	if (unlikely(le64_to_cpu(fe->i_size) != i_size_read(inode))) {
> +		status = ocfs2_error(inode->i_sb,
> +				     "Inode %llu has inconsistent i_size: inode = %lld, dinode = %llu, i_flags = 0x%x\n",
> +				     (unsigned long long)OCFS2_I(inode)->ip_blkno,
> +				     i_size_read(inode),
> +				     (unsigned long long)le64_to_cpu(fe->i_size),
> +				     le32_to_cpu(fe->i_flags));
> +		goto bail;
> +	}
>  
>  	if (new_i_size > le64_to_cpu(fe->i_size)) {
>  		trace_ocfs2_truncate_file_error(
>  			(unsigned long long)le64_to_cpu(fe->i_size),


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] ocfs2: reject inconsistent inode size before truncate
  2026-05-10  3:53 ` Joseph Qi
@ 2026-05-12  2:17   ` ZhengYuan Huang
  0 siblings, 0 replies; 3+ messages in thread
From: ZhengYuan Huang @ 2026-05-12  2:17 UTC (permalink / raw)
  To: Joseph Qi
  Cc: ocfs2-devel, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
	Mark Fasheh, Joel Becker

On Sun, May 10, 2026 at 11:54 AM Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
>
>
>
> On 5/9/26 5:10 PM, ZhengYuan Huang wrote:
> > [BUG]
> > openat(..., O_WRONLY|O_CREAT|O_TRUNC) can hit:
> >
> > kernel BUG at fs/ocfs2/file.c:454!
> > Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
> > RIP: 0010:ocfs2_truncate_file+0x1204/0x13c0 fs/ocfs2/file.c:454
> > Call Trace:
> >  ocfs2_setattr+0xa6d/0x1fd0 fs/ocfs2/file.c:1212
> >  notify_change+0x4b5/0x1030 fs/attr.c:546
> >  do_truncate+0x1d2/0x230 fs/open.c:68
> >  handle_truncate fs/namei.c:3596 [inline]
> >  do_open fs/namei.c:3979 [inline]
> >  path_openat+0x260f/0x2ce0 fs/namei.c:4134
> >  do_filp_open+0x1f6/0x430 fs/namei.c:4161
> >  do_sys_openat2+0x117/0x1c0 fs/open.c:1437
> >  do_sys_open fs/open.c:1452 [inline]
> >  __do_sys_openat fs/open.c:1468 [inline]
> >  __se_sys_openat fs/open.c:1463 [inline]
> >  __x64_sys_openat+0x15b/0x220 fs/open.c:1463
> >  ...
> >
> > [CAUSE]
> > ocfs2_truncate_file() treats di_bh->i_size matching inode->i_size as an
> > internal code invariant and BUGs if it is broken.
> >
> > That assumption is too strong for corrupted metadata. The dinode block can
> > still be structurally valid enough to pass ocfs2_read_inode_block() while
> > no longer matching an already-instantiated VFS inode. On local mounts,
> > ocfs2_inode_lock_update() skips refresh entirely, so truncate can
> > observe the mismatch directly and crash instead of rejecting the
> > corruption.
> >
> > [FIX]
> > Turn the BUG_ON into normal OCFS2 corruption handling. If truncate sees
> > di_bh->i_size disagree with inode->i_size, report it with ocfs2_error() and
> > abort before touching truncate state.
> >
> > This keeps the fix at the first boundary that actually requires the
> > sizes to match and avoids widening checks into hotter generic
> > inode-lock paths.
> >
> > Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> > ---
> > diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> > --- a/fs/ocfs2/file.c
> > +++ b/fs/ocfs2/file.c
> > @@ -444,15 +444,17 @@ int ocfs2_truncate_file(struct inode *inode,
> >       int status = 0;
> >       struct ocfs2_dinode *fe = NULL;
> >       struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> >
> > -     /* We trust di_bh because it comes from ocfs2_inode_lock(), which
> > -      * already validated it */
> > +     /*
> > +      * ocfs2_inode_lock() validated the dinode block, but truncation
> > +      * still needs to reject an inode state that no longer matches
> > +      * di_bh.
> > +      */
>
> In commit log, you mention the case is local mount.
> So here comments should refer it.
>
> Thanks,
> Joseph
>
> >       fe = (struct ocfs2_dinode *) di_bh->b_data;
> >
> >       trace_ocfs2_truncate_file((unsigned long long)OCFS2_I(inode)->ip_blkno,
> >                                 (unsigned long long)le64_to_cpu(fe->i_size),
> >                                 (unsigned long long)new_i_size);
> >
> > -     mlog_bug_on_msg(le64_to_cpu(fe->i_size) != i_size_read(inode),
> > -                     "Inode %llu, inode i_size = %lld != di "
> > -                     "i_size = %llu, i_flags = 0x%x\n",
> > -                     (unsigned long long)OCFS2_I(inode)->ip_blkno,
> > -                     i_size_read(inode),
> > -                     (unsigned long long)le64_to_cpu(fe->i_size),
> > -                     le32_to_cpu(fe->i_flags));
> > +     if (unlikely(le64_to_cpu(fe->i_size) != i_size_read(inode))) {
> > +             status = ocfs2_error(inode->i_sb,
> > +                                  "Inode %llu has inconsistent i_size: inode = %lld, dinode = %llu, i_flags = 0x%x\n",
> > +                                  (unsigned long long)OCFS2_I(inode)->ip_blkno,
> > +                                  i_size_read(inode),
> > +                                  (unsigned long long)le64_to_cpu(fe->i_size),
> > +                                  le32_to_cpu(fe->i_flags));
> > +             goto bail;
> > +     }
> >
> >       if (new_i_size > le64_to_cpu(fe->i_size)) {
> >               trace_ocfs2_truncate_file_error(
> >                       (unsigned long long)le64_to_cpu(fe->i_size),
>

Thanks for the review.

That makes sense. I have updated the comment to mention the local mount
case explicitly and sent a v2 patch.

Thanks,
ZhengYuan Huang

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-05-12  2:18 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-09  9:10 [PATCH] ocfs2: reject inconsistent inode size before truncate ZhengYuan Huang
2026-05-10  3:53 ` Joseph Qi
2026-05-12  2:17   ` ZhengYuan Huang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox