linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Sandeen <sandeen@redhat.com>
To: Lukas Czerner <lczerner@redhat.com>
Cc: linux-ext4@vger.kernel.org, tytso@mit.edu
Subject: Re: [PATCH] ext4: ignore EXT4_INODE_JOURNAL_DATA flag with delalloc
Date: Wed, 11 Jan 2012 12:13:49 -0600	[thread overview]
Message-ID: <4F0DD15D.7030706@redhat.com> (raw)
In-Reply-To: <1326266607-15211-1-git-send-email-lczerner@redhat.com>

On 1/11/12 1:23 AM, Lukas Czerner wrote:
> Ext4 does not support data journalling with delayed allocation enabled.
> We even do not allow to mount the file system with delayed allocation
> and data journalling enabled, however it can be set via FS_IOC_SETFLAGS
> so we can hit the inode with EXT4_INODE_JOURNAL_DATA set even on file
> system mounted with delayed allocation (default) and that's where
> problem arises. The easies way to reproduce this problem is with the
> following set of commands:
> 
>  mkfs.ext4 /dev/sdd
>  mount /dev/sdd /mnt/test1
>  dd if=/dev/zero of=/mnt/test1/file bs=1M count=4
>  chattr +j /mnt/test1/file
>  dd if=/dev/zero of=/mnt/test1/file bs=1M count=4 conv=notrunc
>  chattr -j /mnt/test1/file
> 
> Additionally it can be reproduced quite reliably with xfstests 272 and
> 269. In fact the above reproducer is a part of test 272.
> 
> To fix this we should ignore the EXT4_INODE_JOURNAL_DATA inode flag if
> the file system is mounted with delayed allocation. This can be easily
> done by fixing ext4_should_*_data() functions do ignore data journal
> flag when delalloc is set (suggested by Ted). We also have to set the
> appropriate address space operations for the inode (again, ignoring data
> journal flag if delalloc enabled).
> 
> Additionally this commit introduces ext4_inode_journal_mode() function
> because ext4_should_*_data() has already had a lot of common code and
> this change is putting it all into one function so it is easier to
> read.

I suppose I would have preferred that to be 2 different patches - ignore
the journal flag in one step and do the code reorg in another, just for
clarity when looking through git logs and reviewing commits...

Other than that, the logic all seems ok to me, though I have a few
nitpicky comments below.

-Eric

> Successfully tested with xfstest in following configurations:
> 
> delalloc + data=ordered
> delalloc + data=writeback
> data=journal
> nodelalloc + data=ordered
> nodelalloc + data=writeback
> nodelalloc + data=journal
> 
> (I wish we can get rid of the data=journal mode)
> 
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
>  fs/ext4/ext4_jbd2.h |   55 ++++++++++++++++++++++++++-------------------------
>  fs/ext4/inode.c     |   36 ++++++++++++++++++++-------------
>  2 files changed, 50 insertions(+), 41 deletions(-)
> 
> diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
> index 5802fa1..8a1982a 100644
> --- a/fs/ext4/ext4_jbd2.h
> +++ b/fs/ext4/ext4_jbd2.h
> @@ -261,43 +261,44 @@ static inline void ext4_update_inode_fsync_trans(handle_t *handle,
>  /* super.c */
>  int ext4_force_commit(struct super_block *sb);
>  
> -static inline int ext4_should_journal_data(struct inode *inode)
> +/*
> + * Ext4 inode journal modes
> + */
> +#define EXT4_INODE_JOURNAL_DATA_MODE	0x01 /* journal data mode */
> +#define EXT4_INODE_ORDER_DATA_MODE	0x02 /* ordered data mode */

Should probably be EXT4_INODE_ORDERED_DATA_MODE?

> +#define EXT4_INODE_WRITEBACK_DATA_MODE	0x04 /* writeback data mode */
> +
> +static inline int ext4_inode_journal_mode(struct inode *inode)
>  {
>  	if (EXT4_JOURNAL(inode) == NULL)
> -		return 0;
> -	if (!S_ISREG(inode->i_mode))
> -		return 1;
> -	if (test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA)
> -		return 1;
> -	if (ext4_test_inode_flag(inode, EXT4_INODE_JOURNAL_DATA))
> -		return 1;
> -	return 0;
> +		return EXT4_INODE_WRITEBACK_DATA_MODE;	/* writeback */
> +	/* We do not support data journalling with delayed allocation */
> +	if ((!S_ISREG(inode->i_mode)) ||

I wonder if there's any way to make this if() more readable.

I think the first extra set of () makes this harder to read, it's easy to think
the first opening ( extends past the || since it otherwise wouldn't be
necessary?  I guess I'm getting nitpicky here.

> +	   (test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA) ||
> +	   (ext4_test_inode_flag(inode, EXT4_INODE_JOURNAL_DATA) &&
> +	   (!test_opt(inode->i_sb, DELALLOC))))
> +		return EXT4_INODE_JOURNAL_DATA_MODE;	/* journal data */
> +	if (test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA)
> +		return EXT4_INODE_ORDER_DATA_MODE;		/* ordered */

I'd remove a tab there just to line up the comments :)

(actually the comments may not be necessary, they are just restating the
flag, but they stand out a bit more I guess)

> +	if (test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_WRITEBACK_DATA)
> +		return EXT4_INODE_WRITEBACK_DATA_MODE;	/* writeback */
> +	else
> +		BUG();

Hm 2 new BUG() statements in the patch.  If these are at all possible
to hit, maybe a printk at least saying where we are and what state we're
in first?  Should we instead fall back to a common/safe journaling mode,
rather than BUG()?

I guess it's pretty much impossible, if the journaling mode is set to
any of the 3 possible values, we get a normal return.

> +}
> +
> +static inline int ext4_should_journal_data(struct inode *inode)
> +{
> +	return ext4_inode_journal_mode(inode) & EXT4_INODE_JOURNAL_DATA_MODE;
>  }
>  
>  static inline int ext4_should_order_data(struct inode *inode)
>  {
> -	if (EXT4_JOURNAL(inode) == NULL)
> -		return 0;
> -	if (!S_ISREG(inode->i_mode))
> -		return 0;
> -	if (ext4_test_inode_flag(inode, EXT4_INODE_JOURNAL_DATA))
> -		return 0;
> -	if (test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA)
> -		return 1;
> -	return 0;
> +	return ext4_inode_journal_mode(inode) & EXT4_INODE_ORDER_DATA_MODE;
>  }
>  
>  static inline int ext4_should_writeback_data(struct inode *inode)
>  {
> -	if (EXT4_JOURNAL(inode) == NULL)
> -		return 1;
> -	if (!S_ISREG(inode->i_mode))
> -		return 0;
> -	if (ext4_test_inode_flag(inode, EXT4_INODE_JOURNAL_DATA))
> -		return 0;
> -	if (test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_WRITEBACK_DATA)
> -		return 1;
> -	return 0;
> +	return ext4_inode_journal_mode(inode) & EXT4_INODE_WRITEBACK_DATA_MODE;
>  }
>  
>  /*
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index aa8efa6..3b07d22 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2479,13 +2479,14 @@ static int ext4_da_write_end(struct file *file,
>  	int write_mode = (int)(unsigned long)fsdata;
>  
>  	if (write_mode == FALL_BACK_TO_NONDELALLOC) {
> -		if (ext4_should_order_data(inode)) {
> +		switch (ext4_inode_journal_mode(inode)) {
> +		case EXT4_INODE_ORDER_DATA_MODE:
>  			return ext4_ordered_write_end(file, mapping, pos,
>  					len, copied, page, fsdata);
> -		} else if (ext4_should_writeback_data(inode)) {
> +		case EXT4_INODE_WRITEBACK_DATA_MODE:
>  			return ext4_writeback_write_end(file, mapping, pos,
>  					len, copied, page, fsdata);
> -		} else {
> +		default:
>  			BUG();
>  		}
>  	}
> @@ -3083,18 +3084,25 @@ static const struct address_space_operations ext4_da_aops = {
>  
>  void ext4_set_aops(struct inode *inode)
>  {
> -	if (ext4_should_order_data(inode) &&
> -		test_opt(inode->i_sb, DELALLOC))
> -		inode->i_mapping->a_ops = &ext4_da_aops;
> -	else if (ext4_should_order_data(inode))
> -		inode->i_mapping->a_ops = &ext4_ordered_aops;
> -	else if (ext4_should_writeback_data(inode) &&
> -		 test_opt(inode->i_sb, DELALLOC))
> -		inode->i_mapping->a_ops = &ext4_da_aops;
> -	else if (ext4_should_writeback_data(inode))
> -		inode->i_mapping->a_ops = &ext4_writeback_aops;
> -	else
> +	switch (ext4_inode_journal_mode(inode)) {
> +	case EXT4_INODE_ORDER_DATA_MODE:
> +		if (test_opt(inode->i_sb, DELALLOC))
> +			inode->i_mapping->a_ops = &ext4_da_aops;
> +		else
> +			inode->i_mapping->a_ops = &ext4_ordered_aops;
> +		break;
> +	case EXT4_INODE_WRITEBACK_DATA_MODE:
> +		if (test_opt(inode->i_sb, DELALLOC))
> +			inode->i_mapping->a_ops = &ext4_da_aops;
> +		else
> +			inode->i_mapping->a_ops = &ext4_writeback_aops;
> +		break;
> +	case EXT4_INODE_JOURNAL_DATA_MODE:
>  		inode->i_mapping->a_ops = &ext4_journalled_aops;
> +		break;
> +	default:
> +		BUG();
> +	}
>  }
>  
>  


  parent reply	other threads:[~2012-01-11 19:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-11  7:23 [PATCH] ext4: ignore EXT4_INODE_JOURNAL_DATA flag with delalloc Lukas Czerner
2012-01-11  9:20 ` Andreas Dilger
2012-01-11  9:51   ` Lukas Czerner
2012-01-11 18:13 ` Eric Sandeen [this message]
2012-01-12  7:12   ` Lukas Czerner
2012-01-12 11:03 ` [PATCH v2] " Lukas Czerner
2012-01-19 16:50   ` Ted Ts'o
2012-01-20 12:09     ` Lukas Czerner
2012-01-20 12:10     ` [PATCH v3] " Lukas Czerner
2012-01-23 22:47       ` Ted Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F0DD15D.7030706@redhat.com \
    --to=sandeen@redhat.com \
    --cc=lczerner@redhat.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).