linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ext4: fast symlink test should not rely on i_blocks
@ 2017-06-28  0:34 Tahsin Erdogan
  2017-06-28 16:53 ` Andreas Dilger
  0 siblings, 1 reply; 3+ messages in thread
From: Tahsin Erdogan @ 2017-06-28  0:34 UTC (permalink / raw)
  To: Andreas Dilger, Theodore Ts'o, linux-ext4
  Cc: linux-kernel, Tahsin Erdogan

ext4_inode_info->i_data is the storage area for 4 types of data:

  a) Extents data
  b) Inline data
  c) Block map
  d) Fast symlink data (symlink length < 60)

Extents data case is positively identified by EXT4_INODE_EXTENTS flag.
Inline data case is also obvious because of EXT4_INODE_INLINE_DATA
flag.

Distinguishing c) and d) however requires additional logic. This
currently relies on i_blocks count. After subtracting external xattr
block from i_blocks, if it is greater than 0 then we know that some
data blocks exist, so there must be a block map.

This logic got broken after ea_inode feature was added. That feature
charges the data blocks of external xattr inodes to the referencing
inode and so adds them to the i_blocks. To fix this, we could subtract
ea_inode blocks by iterating through all xattr entries and then check
whether remaining i_blocks count is zero. Besides being complicated,
this won't change the fact that the current way of distinguishing
between c) and d) is fragile.

The alternative solution is to test whether i_size is less than 60 to
determine fast symlink case. ext4_symlink() uses the same test to decide
whether to store the symlink in i_data. There is one caveat to address
before this can work though.

If an inode's i_nlink is zero during eviction, its i_size is set to
zero and its data is truncated. If system crashes before inode is removed
from the orphan list, next boot orphan cleanup may find the inode with
zero i_size. So, a symlink that had its data stored in a block may now
appear to be a fast symlink. The solution used in this patch is to treat
i_size = 0 as a non-fast symlink case. A zero sized symlink is not legal
so the only time this can happen is the mentioned scenario. This is also
logically correct because a i_size = 0 symlink has no data stored in
i_data.

Fixes: 74c5bfa651af ("ext4: xattr inode deduplication")

Suggested-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
---
 fs/ext4/inode.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index daed9b38362a..3c600f02673f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -144,16 +144,12 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
 
 /*
  * Test whether an inode is a fast symlink.
+ * A fast symlink has its symlink data stored in ext4_inode_info->i_data.
  */
 int ext4_inode_is_fast_symlink(struct inode *inode)
 {
-        int ea_blocks = EXT4_I(inode)->i_file_acl ?
-		EXT4_CLUSTER_SIZE(inode->i_sb) >> 9 : 0;
-
-	if (ext4_has_inline_data(inode))
-		return 0;

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] ext4: fast symlink test should not rely on i_blocks
  2017-06-28  0:34 [PATCH] ext4: fast symlink test should not rely on i_blocks Tahsin Erdogan
@ 2017-06-28 16:53 ` Andreas Dilger
  2017-07-04  4:11   ` Theodore Ts'o
  0 siblings, 1 reply; 3+ messages in thread
From: Andreas Dilger @ 2017-06-28 16:53 UTC (permalink / raw)
  To: Tahsin Erdogan; +Cc: Theodore Ts'o, linux-ext4, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4060 bytes --]


> On Jun 27, 2017, at 6:34 PM, Tahsin Erdogan <tahsin@google.com> wrote:
> 
> ext4_inode_info->i_data is the storage area for 4 types of data:
> 
>  a) Extents data
>  b) Inline data
>  c) Block map
>  d) Fast symlink data (symlink length < 60)
> 
> Extents data case is positively identified by EXT4_INODE_EXTENTS flag.
> Inline data case is also obvious because of EXT4_INODE_INLINE_DATA
> flag.
> 
> Distinguishing c) and d) however requires additional logic. This
> currently relies on i_blocks count. After subtracting external xattr
> block from i_blocks, if it is greater than 0 then we know that some
> data blocks exist, so there must be a block map.
> 
> This logic got broken after ea_inode feature was added. That feature
> charges the data blocks of external xattr inodes to the referencing
> inode and so adds them to the i_blocks. To fix this, we could subtract
> ea_inode blocks by iterating through all xattr entries and then check
> whether remaining i_blocks count is zero. Besides being complicated,
> this won't change the fact that the current way of distinguishing
> between c) and d) is fragile.
> 
> The alternative solution is to test whether i_size is less than 60 to
> determine fast symlink case. ext4_symlink() uses the same test to decide
> whether to store the symlink in i_data. There is one caveat to address
> before this can work though.
> 
> If an inode's i_nlink is zero during eviction, its i_size is set to
> zero and its data is truncated. If system crashes before inode is removed
> from the orphan list, next boot orphan cleanup may find the inode with
> zero i_size. So, a symlink that had its data stored in a block may now
> appear to be a fast symlink. The solution used in this patch is to treat
> i_size = 0 as a non-fast symlink case. A zero sized symlink is not legal
> so the only time this can happen is the mentioned scenario. This is also
> logically correct because a i_size = 0 symlink has no data stored in
> i_data.
> 
> Fixes: 74c5bfa651af ("ext4: xattr inode deduplication")
> 
> Suggested-by: Andreas Dilger <adilger@dilger.ca>
> Signed-off-by: Tahsin Erdogan <tahsin@google.com>

The unfortunate bit is that this makes the inode impossible to undelete, but
I don't think that is a huge concern for symlinks.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/ext4/inode.c | 20 +++++++++++++-------
> 1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index daed9b38362a..3c600f02673f 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -144,16 +144,12 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
> 
> /*
>  * Test whether an inode is a fast symlink.
> + * A fast symlink has its symlink data stored in ext4_inode_info->i_data.
>  */
> int ext4_inode_is_fast_symlink(struct inode *inode)
> {
> -        int ea_blocks = EXT4_I(inode)->i_file_acl ?
> -		EXT4_CLUSTER_SIZE(inode->i_sb) >> 9 : 0;
> -
> -	if (ext4_has_inline_data(inode))
> -		return 0;
> -
> -	return (S_ISLNK(inode->i_mode) && inode->i_blocks - ea_blocks == 0);
> +	return S_ISLNK(inode->i_mode) && inode->i_size &&
> +	       (inode->i_size < EXT4_N_BLOCKS * 4);
> }
> 
> /*
> @@ -261,6 +257,16 @@ void ext4_evict_inode(struct inode *inode)
> 
> 	if (IS_SYNC(inode))
> 		ext4_handle_sync(handle);
> +
> +	/*
> +	 * Set inode->i_size to 0 before calling ext4_truncate(). We need
> +	 * special handling of symlinks here because i_size is used to
> +	 * determine whether ext4_inode_info->i_data contains symlink data or
> +	 * block mappings. Setting i_size to 0 will remove its fast symlink
> +	 * status. Erase i_data so that it becomes a valid empty block map.
> +	 */
> +	if (ext4_inode_is_fast_symlink(inode))
> +		memset(EXT4_I(inode)->i_data, 0, sizeof(EXT4_I(inode)->i_data));
> 	inode->i_size = 0;
> 	err = ext4_mark_inode_dirty(handle, inode);
> 	if (err) {
> --
> 2.13.2.725.g09c95d1e9-goog
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] ext4: fast symlink test should not rely on i_blocks
  2017-06-28 16:53 ` Andreas Dilger
@ 2017-07-04  4:11   ` Theodore Ts'o
  0 siblings, 0 replies; 3+ messages in thread
From: Theodore Ts'o @ 2017-07-04  4:11 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Tahsin Erdogan, linux-ext4, linux-kernel

On Wed, Jun 28, 2017 at 10:53:31AM -0600, Andreas Dilger wrote:
> 
> > On Jun 27, 2017, at 6:34 PM, Tahsin Erdogan <tahsin@google.com> wrote:
> > 
> > ext4_inode_info->i_data is the storage area for 4 types of data:
> > 
> >  a) Extents data
> >  b) Inline data
> >  c) Block map
> >  d) Fast symlink data (symlink length < 60)
> > 
> > Extents data case is positively identified by EXT4_INODE_EXTENTS flag.
> > Inline data case is also obvious because of EXT4_INODE_INLINE_DATA
> > flag.
> > 
> > Distinguishing c) and d) however requires additional logic. This
> > currently relies on i_blocks count. After subtracting external xattr
> > block from i_blocks, if it is greater than 0 then we know that some
> > data blocks exist, so there must be a block map.
> > 
> > This logic got broken after ea_inode feature was added. That feature
> > charges the data blocks of external xattr inodes to the referencing
> > inode and so adds them to the i_blocks. To fix this, we could subtract
> > ea_inode blocks by iterating through all xattr entries and then check
> > whether remaining i_blocks count is zero. Besides being complicated,
> > this won't change the fact that the current way of distinguishing
> > between c) and d) is fragile.
> > 
> > The alternative solution is to test whether i_size is less than 60 to
> > determine fast symlink case. ext4_symlink() uses the same test to decide
> > whether to store the symlink in i_data. There is one caveat to address
> > before this can work though.
> > 
> > If an inode's i_nlink is zero during eviction, its i_size is set to
> > zero and its data is truncated. If system crashes before inode is removed
> > from the orphan list, next boot orphan cleanup may find the inode with
> > zero i_size. So, a symlink that had its data stored in a block may now
> > appear to be a fast symlink. The solution used in this patch is to treat
> > i_size = 0 as a non-fast symlink case. A zero sized symlink is not legal
> > so the only time this can happen is the mentioned scenario. This is also
> > logically correct because a i_size = 0 symlink has no data stored in
> > i_data.
> > 
> > Fixes: 74c5bfa651af ("ext4: xattr inode deduplication")
> > 
> > Suggested-by: Andreas Dilger <adilger@dilger.ca>
> > Signed-off-by: Tahsin Erdogan <tahsin@google.com>
> 
> The unfortunate bit is that this makes the inode impossible to undelete, but
> I don't think that is a huge concern for symlinks.
> 
> Reviewed-by: Andreas Dilger <adilger@dilger.ca>

Thanks, applied.

						- Ted

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-07-04  4:11 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-28  0:34 [PATCH] ext4: fast symlink test should not rely on i_blocks Tahsin Erdogan
2017-06-28 16:53 ` Andreas Dilger
2017-07-04  4:11   ` Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).