linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* linux-next ext4 inode size 128 corrupted
@ 2023-07-18  3:43 Hugh Dickins
  2023-07-18 10:32 ` Jeff Layton
  2023-07-18 12:54 ` Jeff Layton
  0 siblings, 2 replies; 5+ messages in thread
From: Hugh Dickins @ 2023-07-18  3:43 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Theodore Ts'o, Christian Brauner, Jan Kara, Andreas Dilger,
	linux-fsdevel, linux-ext4, linux-kernel

Hi Jeff,

I've been unable to run my kernel builds on ext4 on loop0 on tmpfs
swapping load on linux-next recently, on one machine: various kinds
of havoc, most common symptoms being ext4_find_dest_de:2107 errors,
systemd-journald errors, segfaults.  But no problem observed running
on a more recent installation.

Bisected yesterday to 979492850abd ("ext4: convert to ctime accessor
functions").

I've mostly averted my eyes from the EXT4_INODE macro changes there,
but I think that's where the problem lies.  Reading the comment in
fs/ext4/ext4.h above EXT4_FITS_IN_INODE() led me to try "tune2fs -l"
and look at /etc/mke2fs.conf.  It's an old installation, its own
inodes are 256, but that old mke2fs.conf does default to 128 for small
FSes, and what I use for the load test is small.  Passing -I 256 to the
mkfs makes the problems go away.

(What's most alarming about the corruption is that it appears to extend
beyond just the throwaway test filesystem: segfaults on bash and libc.so
from the root filesystem.  But no permanent damage done there.)

One oddity I noticed in scrutinizing that commit, didn't help with
the issues above, but there's a hunk in ext4_rename() which changes
-	old.dir->i_ctime = old.dir->i_mtime = current_time(old.dir);
+	old.dir->i_mtime = inode_set_ctime_current(old.inode);

Hugh

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: linux-next ext4 inode size 128 corrupted
  2023-07-18  3:43 linux-next ext4 inode size 128 corrupted Hugh Dickins
@ 2023-07-18 10:32 ` Jeff Layton
  2023-07-18 12:54 ` Jeff Layton
  1 sibling, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2023-07-18 10:32 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Theodore Ts'o, Christian Brauner, Jan Kara, Andreas Dilger,
	linux-fsdevel, linux-ext4, linux-kernel

On Mon, 2023-07-17 at 20:43 -0700, Hugh Dickins wrote:
> Hi Jeff,
> 
> I've been unable to run my kernel builds on ext4 on loop0 on tmpfs
> swapping load on linux-next recently, on one machine: various kinds
> of havoc, most common symptoms being ext4_find_dest_de:2107 errors,
> systemd-journald errors, segfaults.  But no problem observed running
> on a more recent installation.
> 
> Bisected yesterday to 979492850abd ("ext4: convert to ctime accessor
> functions").
> 
> I've mostly averted my eyes from the EXT4_INODE macro changes there,
> but I think that's where the problem lies.  Reading the comment in
> fs/ext4/ext4.h above EXT4_FITS_IN_INODE() led me to try "tune2fs -l"
> and look at /etc/mke2fs.conf.  It's an old installation, its own
> inodes are 256, but that old mke2fs.conf does default to 128 for small
> FSes, and what I use for the load test is small.  Passing -I 256 to the
> mkfs makes the problems go away.
> 

Sounds like something is storing timestamp values in the extended part
of the inode when it shouldn't be. The macros look sane to me, but I'll
go over them again.

> (What's most alarming about the corruption is that it appears to extend
> beyond just the throwaway test filesystem: segfaults on bash and libc.so
> from the root filesystem.  But no permanent damage done there.)
> 
> One oddity I noticed in scrutinizing that commit, didn't help with
> the issues above, but there's a hunk in ext4_rename() which changes
> -	old.dir->i_ctime = old.dir->i_mtime = current_time(old.dir);
> +	old.dir->i_mtime = inode_set_ctime_current(old.inode);
> 

That actually looks fine. We're just setting the in-memory inode
timestamp there. The problem you're having sounds more like something is
going wrong when storing the values to disk. I'll take a closer look.

Thanks!
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: linux-next ext4 inode size 128 corrupted
  2023-07-18  3:43 linux-next ext4 inode size 128 corrupted Hugh Dickins
  2023-07-18 10:32 ` Jeff Layton
@ 2023-07-18 12:54 ` Jeff Layton
  2023-07-18 18:03   ` Hugh Dickins
  1 sibling, 1 reply; 5+ messages in thread
From: Jeff Layton @ 2023-07-18 12:54 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Theodore Ts'o, Christian Brauner, Jan Kara, Andreas Dilger,
	linux-fsdevel, linux-ext4, linux-kernel

On Mon, 2023-07-17 at 20:43 -0700, Hugh Dickins wrote:
> Hi Jeff,
> 
> I've been unable to run my kernel builds on ext4 on loop0 on tmpfs
> swapping load on linux-next recently, on one machine: various kinds
> of havoc, most common symptoms being ext4_find_dest_de:2107 errors,
> systemd-journald errors, segfaults.  But no problem observed running
> on a more recent installation.
> 
> Bisected yesterday to 979492850abd ("ext4: convert to ctime accessor
> functions").
> 
> I've mostly averted my eyes from the EXT4_INODE macro changes there,
> but I think that's where the problem lies.  Reading the comment in
> fs/ext4/ext4.h above EXT4_FITS_IN_INODE() led me to try "tune2fs -l"
> and look at /etc/mke2fs.conf.  It's an old installation, its own
> inodes are 256, but that old mke2fs.conf does default to 128 for small
> FSes, and what I use for the load test is small.  Passing -I 256 to the
> mkfs makes the problems go away.
> 
> (What's most alarming about the corruption is that it appears to extend
> beyond just the throwaway test filesystem: segfaults on bash and libc.so
> from the root filesystem.  But no permanent damage done there.)
> 
> One oddity I noticed in scrutinizing that commit, didn't help with
> the issues above, but there's a hunk in ext4_rename() which changes
> -	old.dir->i_ctime = old.dir->i_mtime = current_time(old.dir);
> +	old.dir->i_mtime = inode_set_ctime_current(old.inode);
> 
> 

I suspect the problem here is the i_crtime, which lives wholly in the
extended part of the inode. The old macros would just not store anything
if the i_crtime didn't fit, but the new ones would still store the
tv_sec field in that case, which could be a memory corruptor. This patch
should fix it, and I'm testing it now.

Hugh, if you're able to give this a spin on your setup, then that would
be most helpful. This is also in the "ctime" branch in my kernel.org
tree if that helps. If this looks good, I'll ask Christian to fold this
into the ext4 conversion patch.

Thanks for the bug report!

---------------------------8<--------------------------

[PATCH] ext4: fix the time handling macros when ext4 is using small inodes

If ext4 is using small on-disk inodes, then it may not be able to store
fine grained timestamps. It also can't store the i_crtime at all in that
case since that fully lives in the extended part of the inode.

979492850abd got the EXT4_EINODE_{GET,SET}_XTIME macros wrong, and would
still store the tv_sec field of the i_crtime into the raw_inode, even
when they were small, corrupting adjacent memory.

This fixes those macros to skip setting anything in the raw_inode if the
tv_sec field doesn't fit. 

Cc: Jan Kara <jack@suse.cz>
Fixes: 979492850abd ("ext4: convert to ctime accessor functions")
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ext4/ext4.h | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 2af347669db7..1e2259d9967d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -900,8 +900,10 @@ do {										\
 #define EXT4_INODE_SET_CTIME(inode, raw_inode)					\
 	EXT4_INODE_SET_XTIME_VAL(i_ctime, inode, raw_inode, inode_get_ctime(inode))
 
-#define EXT4_EINODE_SET_XTIME(xtime, einode, raw_inode)			       \
-	EXT4_INODE_SET_XTIME_VAL(xtime, &((einode)->vfs_inode), raw_inode, (einode)->xtime)
+#define EXT4_EINODE_SET_XTIME(xtime, einode, raw_inode)				\
+	if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime))			\
+		EXT4_INODE_SET_XTIME_VAL(xtime, &((einode)->vfs_inode),		\
+					 raw_inode, (einode)->xtime)
 
 #define EXT4_INODE_GET_XTIME_VAL(xtime, inode, raw_inode)			\
 	(EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra) ?	\
@@ -922,9 +924,14 @@ do {										\
 		EXT4_INODE_GET_XTIME_VAL(i_ctime, inode, raw_inode));		\
 } while (0)
 
-#define EXT4_EINODE_GET_XTIME(xtime, einode, raw_inode)			       \
-do {									       \
-	(einode)->xtime = EXT4_INODE_GET_XTIME_VAL(xtime, &(einode->vfs_inode), raw_inode);	\
+#define EXT4_EINODE_GET_XTIME(xtime, einode, raw_inode)				\
+do {										\
+	if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime)) 			\
+		(einode)->xtime =						\
+			EXT4_INODE_GET_XTIME_VAL(xtime, &(einode->vfs_inode),	\
+						 raw_inode);			\
+	else									\
+		(einode)->xtime = (struct timespec64){0, 0};			\
 } while (0)
 
 #define i_disk_version osd1.linux1.l_i_version
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: linux-next ext4 inode size 128 corrupted
  2023-07-18 12:54 ` Jeff Layton
@ 2023-07-18 18:03   ` Hugh Dickins
  2023-07-18 18:27     ` Jeff Layton
  0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2023-07-18 18:03 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Hugh Dickins, Theodore Ts'o, Christian Brauner, Jan Kara,
	Andreas Dilger, linux-fsdevel, linux-ext4, linux-kernel

On Tue, 18 Jul 2023, Jeff Layton wrote:
> On Mon, 2023-07-17 at 20:43 -0700, Hugh Dickins wrote:
> > Hi Jeff,
> > 
> > I've been unable to run my kernel builds on ext4 on loop0 on tmpfs
> > swapping load on linux-next recently, on one machine: various kinds
> > of havoc, most common symptoms being ext4_find_dest_de:2107 errors,
> > systemd-journald errors, segfaults.  But no problem observed running
> > on a more recent installation.
> > 
> > Bisected yesterday to 979492850abd ("ext4: convert to ctime accessor
> > functions").
> > 
> > I've mostly averted my eyes from the EXT4_INODE macro changes there,
> > but I think that's where the problem lies.  Reading the comment in
> > fs/ext4/ext4.h above EXT4_FITS_IN_INODE() led me to try "tune2fs -l"
> > and look at /etc/mke2fs.conf.  It's an old installation, its own
> > inodes are 256, but that old mke2fs.conf does default to 128 for small
> > FSes, and what I use for the load test is small.  Passing -I 256 to the
> > mkfs makes the problems go away.
> > 
> > (What's most alarming about the corruption is that it appears to extend
> > beyond just the throwaway test filesystem: segfaults on bash and libc.so
> > from the root filesystem.  But no permanent damage done there.)
> > 
> > One oddity I noticed in scrutinizing that commit, didn't help with
> > the issues above, but there's a hunk in ext4_rename() which changes
> > -	old.dir->i_ctime = old.dir->i_mtime = current_time(old.dir);
> > +	old.dir->i_mtime = inode_set_ctime_current(old.inode);
> > 
> > 
> 
> I suspect the problem here is the i_crtime, which lives wholly in the
> extended part of the inode. The old macros would just not store anything
> if the i_crtime didn't fit, but the new ones would still store the
> tv_sec field in that case, which could be a memory corruptor. This patch
> should fix it, and I'm testing it now.

That makes sense.

> 
> Hugh, if you're able to give this a spin on your setup, then that would
> be most helpful. This is also in the "ctime" branch in my kernel.org
> tree if that helps. If this looks good, I'll ask Christian to fold this
> into the ext4 conversion patch.

Yes, it's now running fine on the problem machine, and on the no-problem.

Tested-by: Hugh Dickins <hughd@google.com>

> 
> Thanks for the bug report!

And thanks for the quick turnaround!

But I'm puzzled by your dismissing that
-	old.dir->i_ctime = old.dir->i_mtime = current_time(old.dir);
+	old.dir->i_mtime = inode_set_ctime_current(old.inode);
in ext4_rename() as "actually looks fine".

Different issue, nothing to do with the corruption, sure.  Much less
important, sure.  But updating ctime on the wrong inode is "fine"?

Hugh

> 
> ---------------------------8<--------------------------
> 
> [PATCH] ext4: fix the time handling macros when ext4 is using small inodes
> 
> If ext4 is using small on-disk inodes, then it may not be able to store
> fine grained timestamps. It also can't store the i_crtime at all in that
> case since that fully lives in the extended part of the inode.
> 
> 979492850abd got the EXT4_EINODE_{GET,SET}_XTIME macros wrong, and would
> still store the tv_sec field of the i_crtime into the raw_inode, even
> when they were small, corrupting adjacent memory.
> 
> This fixes those macros to skip setting anything in the raw_inode if the
> tv_sec field doesn't fit. 
> 
> Cc: Jan Kara <jack@suse.cz>
> Fixes: 979492850abd ("ext4: convert to ctime accessor functions")
> Reported-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/ext4/ext4.h | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 2af347669db7..1e2259d9967d 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -900,8 +900,10 @@ do {										\
>  #define EXT4_INODE_SET_CTIME(inode, raw_inode)					\
>  	EXT4_INODE_SET_XTIME_VAL(i_ctime, inode, raw_inode, inode_get_ctime(inode))
>  
> -#define EXT4_EINODE_SET_XTIME(xtime, einode, raw_inode)			       \
> -	EXT4_INODE_SET_XTIME_VAL(xtime, &((einode)->vfs_inode), raw_inode, (einode)->xtime)
> +#define EXT4_EINODE_SET_XTIME(xtime, einode, raw_inode)				\
> +	if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime))			\
> +		EXT4_INODE_SET_XTIME_VAL(xtime, &((einode)->vfs_inode),		\
> +					 raw_inode, (einode)->xtime)
>  
>  #define EXT4_INODE_GET_XTIME_VAL(xtime, inode, raw_inode)			\
>  	(EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra) ?	\
> @@ -922,9 +924,14 @@ do {										\
>  		EXT4_INODE_GET_XTIME_VAL(i_ctime, inode, raw_inode));		\
>  } while (0)
>  
> -#define EXT4_EINODE_GET_XTIME(xtime, einode, raw_inode)			       \
> -do {									       \
> -	(einode)->xtime = EXT4_INODE_GET_XTIME_VAL(xtime, &(einode->vfs_inode), raw_inode);	\
> +#define EXT4_EINODE_GET_XTIME(xtime, einode, raw_inode)				\
> +do {										\
> +	if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime)) 			\
> +		(einode)->xtime =						\
> +			EXT4_INODE_GET_XTIME_VAL(xtime, &(einode->vfs_inode),	\
> +						 raw_inode);			\
> +	else									\
> +		(einode)->xtime = (struct timespec64){0, 0};			\
>  } while (0)
>  
>  #define i_disk_version osd1.linux1.l_i_version
> -- 
> 2.41.0
> 
> 
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: linux-next ext4 inode size 128 corrupted
  2023-07-18 18:03   ` Hugh Dickins
@ 2023-07-18 18:27     ` Jeff Layton
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2023-07-18 18:27 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Theodore Ts'o, Christian Brauner, Jan Kara, Andreas Dilger,
	linux-fsdevel, linux-ext4, linux-kernel

On Tue, 2023-07-18 at 11:03 -0700, Hugh Dickins wrote:
> On Tue, 18 Jul 2023, Jeff Layton wrote:
> > On Mon, 2023-07-17 at 20:43 -0700, Hugh Dickins wrote:
> > > Hi Jeff,
> > > 
> > > I've been unable to run my kernel builds on ext4 on loop0 on tmpfs
> > > swapping load on linux-next recently, on one machine: various kinds
> > > of havoc, most common symptoms being ext4_find_dest_de:2107 errors,
> > > systemd-journald errors, segfaults.  But no problem observed running
> > > on a more recent installation.
> > > 
> > > Bisected yesterday to 979492850abd ("ext4: convert to ctime accessor
> > > functions").
> > > 
> > > I've mostly averted my eyes from the EXT4_INODE macro changes there,
> > > but I think that's where the problem lies.  Reading the comment in
> > > fs/ext4/ext4.h above EXT4_FITS_IN_INODE() led me to try "tune2fs -l"
> > > and look at /etc/mke2fs.conf.  It's an old installation, its own
> > > inodes are 256, but that old mke2fs.conf does default to 128 for small
> > > FSes, and what I use for the load test is small.  Passing -I 256 to the
> > > mkfs makes the problems go away.
> > > 
> > > (What's most alarming about the corruption is that it appears to extend
> > > beyond just the throwaway test filesystem: segfaults on bash and libc.so
> > > from the root filesystem.  But no permanent damage done there.)
> > > 
> > > One oddity I noticed in scrutinizing that commit, didn't help with
> > > the issues above, but there's a hunk in ext4_rename() which changes
> > > -	old.dir->i_ctime = old.dir->i_mtime = current_time(old.dir);
> > > +	old.dir->i_mtime = inode_set_ctime_current(old.inode);
> > > 
> > > 
> > 
> > I suspect the problem here is the i_crtime, which lives wholly in the
> > extended part of the inode. The old macros would just not store anything
> > if the i_crtime didn't fit, but the new ones would still store the
> > tv_sec field in that case, which could be a memory corruptor. This patch
> > should fix it, and I'm testing it now.
> 
> That makes sense.
> 
> > 
> > Hugh, if you're able to give this a spin on your setup, then that would
> > be most helpful. This is also in the "ctime" branch in my kernel.org
> > tree if that helps. If this looks good, I'll ask Christian to fold this
> > into the ext4 conversion patch.
> 
> Yes, it's now running fine on the problem machine, and on the no-problem.
> 
> Tested-by: Hugh Dickins <hughd@google.com>
> 
> > 
> > Thanks for the bug report!
> 
> And thanks for the quick turnaround!
> 
> But I'm puzzled by your dismissing that
> -	old.dir->i_ctime = old.dir->i_mtime = current_time(old.dir);
> +	old.dir->i_mtime = inode_set_ctime_current(old.inode);
> in ext4_rename() as "actually looks fine".
> 
> Different issue, nothing to do with the corruption, sure.  Much less
> important, sure.  But updating ctime on the wrong inode is "fine"?

Ahh , sorry I wasn't looking at that properly. I think you're correct.
The right fix is probably to move ext4 to use generic_rename_timestamp.
I'll test and send another patch for that.

Thanks again!
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-07-18 18:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-18  3:43 linux-next ext4 inode size 128 corrupted Hugh Dickins
2023-07-18 10:32 ` Jeff Layton
2023-07-18 12:54 ` Jeff Layton
2023-07-18 18:03   ` Hugh Dickins
2023-07-18 18:27     ` Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).