linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
@ 2006-11-06 21:50 Eric Sandeen
  2006-11-06 22:27 ` Dave Kleikamp
  2006-11-06 23:05 ` Christoph Hellwig
  0 siblings, 2 replies; 16+ messages in thread
From: Eric Sandeen @ 2006-11-06 21:50 UTC (permalink / raw)
  To: Linux Kernel Mailing List, linux-fsdevel; +Cc: Theodore Tso

The inode diet patches first did this to generic_fillattr():

-	stat->blksize = inode->i_blksize;
+	stat->blksize = PAGE_CACHE_SIZE;

but by 2.6.19-rc3 this was changed again:

-	stat->blksize = PAGE_CACHE_SIZE;
+	stat->blksize = (1 << inode->i_blkbits);

I can't find for sure why this was done; perhaps because it exposed a bug
in cifs[1]

However, if we are going to pick a default for generic_fillattr, it should probably
be what most filesystems were using before, and let filesystems which need something
else re-set it to according to their needs.  As it stands today, doing readdirs and
the like in block-sized chunks rather than page sized will probably not be the
best thing for performance.

A bit of quick parsing of the original inode diet patch shows that by far most
filesystems were using the page size:

(count) (line which sets i_blksize)
      1 cifs_inode->vfs_inode.i_blksize = CIFS_MAX_MSGSIZE;
      1 i->i_blksize = 512;
      1 inode->i_blksize = attr->va_blocksize;
      1 inode->i_blksize = befs_sb->block_size;
      1 inode->i_blksize = HFSPLUS_SB(inode->i_sb).alloc_blksz;
      1 inode->i_blksize = HFSPLUS_SB(sb).alloc_blksz;
      1 inode->i_blksize = HFS_SB(inode->i_sb)->alloc_blksz;
      1 inode->i_blksize = HFS_SB(sb)->alloc_blksz;
      1 inode->i_blksize = HPAGE_SIZE;
      1 inode->i_blksize = NCP_BLOCK_SIZE;
      1 inode->i_blksize = QNX4_DIR_ENTRY_SIZE;
      1 inode->i_blksize = (u32)osb->s_clustersize;
      1 inode->i_blksize = xfs_preferred_iosize(mp);
      1 ino->i_blksize = i_blksize;
      1 ino->i_blksize = proc_ino->i_blksize;
      1 ip->i_blksize = ip->i_sb->s_blocksize;
      1 ip->i_blksize = PAGE_SIZE;
      2 inode->i_blksize = fattr->du.nfs2.blocksize;
      2 inode->i_blksize = inode->i_sb->s_blocksize;
      2 inode->i_blksize = reiserfs_default_io_size;
      2 inode->i_blksize = sbi->cluster_size;
      2 vi->i_blksize = PAGE_CACHE_SIZE;
      3 inode->i_blksize = sb->s_blocksize;
      3 ret->i_blksize = PAGE_CACHE_SIZE;
      5 inode->i_blksize = 1024;
      7 inode->i_blksize = 0;
     12 inode->i_blksize = PAGE_SIZE;
     22 inode->i_blksize = PAGE_CACHE_SIZE;

so I would propose the following patch to make PAGE_CACHE_SIZE the default (again), 
and let filesystems which need something -else- do that on their own.

[1] http://lists.samba.org/archive/linux-cifs-client/2006-September/001481.html

Thanks,
-Eric

Signed-off-by: Eric Sandeen <sandeen@redhat.com>

--- linux.orig/fs/stat.c	2006-11-05 23:27:04.962482569 -0600
+++ linux/fs/stat.c	2006-11-05 23:27:29.394396050 -0600
@@ -33,7 +33,7 @@
  	stat->ctime = inode->i_ctime;
  	stat->size = i_size_read(inode);
  	stat->blocks = inode->i_blocks;
-	stat->blksize = (1 << inode->i_blkbits);
+	stat->blksize = PAGE_CACHE_SIZE;
  }

  EXPORT_SYMBOL(generic_fillattr);


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-06 21:50 [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE Eric Sandeen
@ 2006-11-06 22:27 ` Dave Kleikamp
  2006-11-06 22:37   ` Eric Sandeen
  2006-11-07  0:26   ` Steve French
  2006-11-06 23:05 ` Christoph Hellwig
  1 sibling, 2 replies; 16+ messages in thread
From: Dave Kleikamp @ 2006-11-06 22:27 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Linux Kernel Mailing List, linux-fsdevel, Theodore Tso,
	Steve French

On Mon, 2006-11-06 at 15:50 -0600, Eric Sandeen wrote:
> The inode diet patches first did this to generic_fillattr():
> 
> -	stat->blksize = inode->i_blksize;
> +	stat->blksize = PAGE_CACHE_SIZE;
> 
> but by 2.6.19-rc3 this was changed again:
> 
> -	stat->blksize = PAGE_CACHE_SIZE;
> +	stat->blksize = (1 << inode->i_blkbits);
> 
> I can't find for sure why this was done; perhaps because it exposed a bug
> in cifs[1]

Steve has fixed the bug in cifs_readdir().

> However, if we are going to pick a default for generic_fillattr, it should probably
> be what most filesystems were using before, and let filesystems which need something
> else re-set it to according to their needs.  As it stands today, doing readdirs and
> the like in block-sized chunks rather than page sized will probably not be the
> best thing for performance.

I agree.

> 
> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again), 
> and let filesystems which need something -else- do that on their own.
> 
> [1] http://lists.samba.org/archive/linux-cifs-client/2006-September/001481.html
> 
> Thanks,
> -Eric
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> 
> --- linux.orig/fs/stat.c	2006-11-05 23:27:04.962482569 -0600
> +++ linux/fs/stat.c	2006-11-05 23:27:29.394396050 -0600
> @@ -33,7 +33,7 @@
>   	stat->ctime = inode->i_ctime;
>   	stat->size = i_size_read(inode);
>   	stat->blocks = inode->i_blocks;
> -	stat->blksize = (1 << inode->i_blkbits);
> +	stat->blksize = PAGE_CACHE_SIZE;
>   }
> 
>   EXPORT_SYMBOL(generic_fillattr);

Looks good.  Since cifs is affected by this patch, I propose that cifs
explicitly set stat->blksize:

CIFS: Explicitly set stat->blksize

CIFS may perform I/O over the network in larger chunks than the page size,
so it should explicitly set stat->blksize to ensure optimal I/O bandwidth

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>

diff -Nurp linux.orig/fs/cifs/inode.c linux/fs/cifs/inode.c
--- linux.orig/fs/cifs/inode.c	2006-11-03 13:44:04.000000000 -0600
+++ linux/fs/cifs/inode.c	2006-11-06 16:11:21.000000000 -0600
@@ -1089,8 +1089,10 @@ int cifs_getattr(struct vfsmount *mnt, s
 	struct kstat *stat)
 {
 	int err = cifs_revalidate(dentry);
-	if (!err)
+	if (!err) {
 		generic_fillattr(dentry->d_inode, stat);
+		stat->blksize = CIFS_MAX_MSGSIZE;
+	}
 	return err;
 }
 

-- 
David Kleikamp
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-06 22:27 ` Dave Kleikamp
@ 2006-11-06 22:37   ` Eric Sandeen
  2006-11-07  0:26   ` Steve French
  1 sibling, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2006-11-06 22:37 UTC (permalink / raw)
  To: Dave Kleikamp
  Cc: Eric Sandeen, Linux Kernel Mailing List, linux-fsdevel,
	Theodore Tso, Steve French

Dave Kleikamp wrote:
ooks good.  Since cifs is affected by this patch, I propose that cifs
> explicitly set stat->blksize:
> 
> CIFS: Explicitly set stat->blksize
> 
> CIFS may perform I/O over the network in larger chunks than the page size,
> so it should explicitly set stat->blksize to ensure optimal I/O bandwidth
> 
> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
> 
> diff -Nurp linux.orig/fs/cifs/inode.c linux/fs/cifs/inode.c
> --- linux.orig/fs/cifs/inode.c	2006-11-03 13:44:04.000000000 -0600
> +++ linux/fs/cifs/inode.c	2006-11-06 16:11:21.000000000 -0600
> @@ -1089,8 +1089,10 @@ int cifs_getattr(struct vfsmount *mnt, s
>  	struct kstat *stat)
>  {
>  	int err = cifs_revalidate(dentry);
> -	if (!err)
> +	if (!err) {
>  		generic_fillattr(dentry->d_inode, stat);
> +		stat->blksize = CIFS_MAX_MSGSIZE;
> +	}
>  	return err;
>  }

Yep, I agree that this should go in too.  Other filesystems probably
need to recover from the crash diet as well :)

Thanks,
-Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-06 21:50 [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE Eric Sandeen
  2006-11-06 22:27 ` Dave Kleikamp
@ 2006-11-06 23:05 ` Christoph Hellwig
  2006-11-06 23:15   ` Eric Sandeen
  1 sibling, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2006-11-06 23:05 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Linux Kernel Mailing List, linux-fsdevel, Theodore Tso

On Mon, Nov 06, 2006 at 03:50:02PM -0600, Eric Sandeen wrote:
> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again), 
> and let filesystems which need something -else- do that on their own.

I agree with the conclusion, but the patch is incomplete.  You went down
all the way to find out what the fileystems do in this messages, so add
the hunks to override the defaults for non-standard filesystems to the
patch aswell to restore the pre-inode diet state.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-06 23:05 ` Christoph Hellwig
@ 2006-11-06 23:15   ` Eric Sandeen
  2006-11-06 23:17     ` Christoph Hellwig
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Eric Sandeen @ 2006-11-06 23:15 UTC (permalink / raw)
  To: Christoph Hellwig, Eric Sandeen, Linux Kernel Mailing List,
	linux-fsdevel, Theodore Tso

Christoph Hellwig wrote:
> On Mon, Nov 06, 2006 at 03:50:02PM -0600, Eric Sandeen wrote:
>> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again), 
>> and let filesystems which need something -else- do that on their own.
> 
> I agree with the conclusion, but the patch is incomplete.  You went down
> all the way to find out what the fileystems do in this messages, so add
> the hunks to override the defaults for non-standard filesystems to the
> patch aswell to restore the pre-inode diet state.

Well, agreed.  I put 80% or more back to pre-patch state, but not all.
:)  So it's less broken with my patch than without, so at least it's
moving forward.  So... Ted's patches get in w/o fixing up all the other
filesystems (left as an exercise to the patch reader) but mine can't? :)

-Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-06 23:15   ` Eric Sandeen
@ 2006-11-06 23:17     ` Christoph Hellwig
  2006-11-06 23:28       ` Eric Sandeen
  2006-11-07  0:08     ` Andreas Dilger
  2006-11-07  3:43     ` Theodore Tso
  2 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2006-11-06 23:17 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Christoph Hellwig, Linux Kernel Mailing List, linux-fsdevel,
	Theodore Tso

On Mon, Nov 06, 2006 at 05:15:27PM -0600, Eric Sandeen wrote:
> > I agree with the conclusion, but the patch is incomplete.  You went down
> > all the way to find out what the fileystems do in this messages, so add
> > the hunks to override the defaults for non-standard filesystems to the
> > patch aswell to restore the pre-inode diet state.
> 
> Well, agreed.  I put 80% or more back to pre-patch state, but not all.
> :)  So it's less broken with my patch than without, so at least it's
> moving forward.  So... Ted's patches get in w/o fixing up all the other
> filesystems (left as an exercise to the patch reader) but mine can't? :)

It's because no reviwer noticed the breakage Ted put in ;-)  I think if
you really refused to do the remaining 20% we can try to pressured Ted to
do it and if that doesn't work I'll just do it myself because I'll be
tired of arguing at that point.. 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-06 23:17     ` Christoph Hellwig
@ 2006-11-06 23:28       ` Eric Sandeen
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2006-11-06 23:28 UTC (permalink / raw)
  To: Christoph Hellwig, Eric Sandeen, Linux Kernel Mailing List,
	linux-fsdevel, Theodore Tso

Christoph Hellwig wrote:
> On Mon, Nov 06, 2006 at 05:15:27PM -0600, Eric Sandeen wrote:
>>> I agree with the conclusion, but the patch is incomplete.  You went down
>>> all the way to find out what the fileystems do in this messages, so add
>>> the hunks to override the defaults for non-standard filesystems to the
>>> patch aswell to restore the pre-inode diet state.
>> Well, agreed.  I put 80% or more back to pre-patch state, but not all.
>> :)  So it's less broken with my patch than without, so at least it's
>> moving forward.  So... Ted's patches get in w/o fixing up all the other
>> filesystems (left as an exercise to the patch reader) but mine can't? :)
> 
> It's because no reviwer noticed the breakage Ted put in ;-)  I think if
> you really refused to do the remaining 20% we can try to pressured Ted to
> do it and if that doesn't work I'll just do it myself because I'll be
> tired of arguing at that point.. 

eh, I'll sign up to do it when I get a moment, I won't refuse :)

Just had to notice that 2 patches already went in without the
requirement to "fix everything else" :)

It would be nice to get the "fix most of it" patch in sooner than later,
though.

-Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-06 23:15   ` Eric Sandeen
  2006-11-06 23:17     ` Christoph Hellwig
@ 2006-11-07  0:08     ` Andreas Dilger
  2006-11-07  0:13       ` Hua Zhong
  2006-11-07  1:39       ` Eric Sandeen
  2006-11-07  3:43     ` Theodore Tso
  2 siblings, 2 replies; 16+ messages in thread
From: Andreas Dilger @ 2006-11-07  0:08 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Christoph Hellwig, Linux Kernel Mailing List, linux-fsdevel,
	Theodore Tso

On Nov 06, 2006  17:15 -0600, Eric Sandeen wrote:
> Christoph Hellwig wrote:
> > I agree with the conclusion, but the patch is incomplete.  You went down
> > all the way to find out what the fileystems do in this messages, so add
> > the hunks to override the defaults for non-standard filesystems to the
> > patch aswell to restore the pre-inode diet state.
> 
> Well, agreed.  I put 80% or more back to pre-patch state, but not all.
> :)  So it's less broken with my patch than without, so at least it's
> moving forward.  So... Ted's patches get in w/o fixing up all the other
> filesystems (left as an exercise to the patch reader) but mine can't? :)

Actually, rather than blindly revert to pre-patch behaviour it would be
worthwhile to determine if PAGE_SIZE isn't the better value.  In some
cases people don't understand that i_blksize is the "optimal IO size"
and instead assume it is the filesystem blocksize.  I saw a few that were
e.g. 512 and that can't be very useful.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-07  0:08     ` Andreas Dilger
@ 2006-11-07  0:13       ` Hua Zhong
  2006-11-07  1:39       ` Eric Sandeen
  1 sibling, 0 replies; 16+ messages in thread
From: Hua Zhong @ 2006-11-07  0:13 UTC (permalink / raw)
  To: 'Andreas Dilger', 'Eric Sandeen'
  Cc: 'Christoph Hellwig', 'Linux Kernel Mailing List',
	'linux-fsdevel', 'Theodore Tso'

> Actually, rather than blindly revert to pre-patch behaviour 
> it would be worthwhile to determine if PAGE_SIZE isn't the 
> better value.  In some cases people don't understand that 
> i_blksize is the "optimal IO size"
> and instead assume it is the filesystem blocksize.  I saw a 
> few that were e.g. 512 and that can't be very useful.

Of course the name i_blksize is very clear on that. :-)
 
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.
> 
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-kernel" in the body of a message to 
> majordomo@vger.kernel.org More majordomo info at  
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-06 22:27 ` Dave Kleikamp
  2006-11-06 22:37   ` Eric Sandeen
@ 2006-11-07  0:26   ` Steve French
  2006-11-07 13:40     ` Dave Kleikamp
  1 sibling, 1 reply; 16+ messages in thread
From: Steve French @ 2006-11-07  0:26 UTC (permalink / raw)
  To: Dave Kleikamp
  Cc: Eric Sandeen, Linux Kernel Mailing List, linux-fsdevel,
	Theodore Tso

Dave Kleikamp wrote:
> On Mon, 2006-11-06 at 15:50 -0600, Eric Sandeen wrote:
>   
>> The inode diet patches first did this to generic_fillattr():
>>
>> -	stat->blksize = inode->i_blksize;
>> +	stat->blksize = PAGE_CACHE_SIZE;
>>
>> but by 2.6.19-rc3 this was changed again:
>>
>> -	stat->blksize = PAGE_CACHE_SIZE;
>> +	stat->blksize = (1 << inode->i_blkbits);
>>
>> I can't find for sure why this was done; perhaps because it exposed a bug
>> in cifs[1]
>>     
>
> Steve has fixed the bug in cifs_readdir().
>
>   
>> However, if we are going to pick a default for generic_fillattr, it should probably
>> be what most filesystems were using before, and let filesystems which need something
>> else re-set it to according to their needs.  As it stands today, doing readdirs and
>> the like in block-sized chunks rather than page sized will probably not be the
>> best thing for performance.
>>     
>
> I agree.
>
>   
>> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again), 
>> and let filesystems which need something -else- do that on their own.
>>
>> [1] http://lists.samba.org/archive/linux-cifs-client/2006-September/001481.html
>>
>> Thanks,
>> -Eric
>>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>>
>> --- linux.orig/fs/stat.c	2006-11-05 23:27:04.962482569 -0600
>> +++ linux/fs/stat.c	2006-11-05 23:27:29.394396050 -0600
>> @@ -33,7 +33,7 @@
>>   	stat->ctime = inode->i_ctime;
>>   	stat->size = i_size_read(inode);
>>   	stat->blocks = inode->i_blocks;
>> -	stat->blksize = (1 << inode->i_blkbits);
>> +	stat->blksize = PAGE_CACHE_SIZE;
>>   }
>>
>>   EXPORT_SYMBOL(generic_fillattr);
>>     
>
> Looks good.  Since cifs is affected by this patch, I propose that cifs
> explicitly set stat->blksize:
>
> CIFS: Explicitly set stat->blksize
>
> CIFS may perform I/O over the network in larger chunks than the page size,
> so it should explicitly set stat->blksize to ensure optimal I/O bandwidth
>
> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
>
> diff -Nurp linux.orig/fs/cifs/inode.c linux/fs/cifs/inode.c
> --- linux.orig/fs/cifs/inode.c	2006-11-03 13:44:04.000000000 -0600
> +++ linux/fs/cifs/inode.c	2006-11-06 16:11:21.000000000 -0600
> @@ -1089,8 +1089,10 @@ int cifs_getattr(struct vfsmount *mnt, s
>  	struct kstat *stat)
>  {
>  	int err = cifs_revalidate(dentry);
> -	if (!err)
> +	if (!err) {
>  		generic_fillattr(dentry->d_inode, stat);
> +		stat->blksize = CIFS_MAX_MSGSIZE;
> +	}
>  	return err;
>  }
>  
>
>   
I assumed that the original intent of the "inode diet patch" was to 
remove fields in the inode,
which most filesystems can get out of the superblock.    If 
inode->blksize and inode->blkbits were
related (2**blkbits == blksize) , it also makes sense to me that someone 
(Ted?) removed one and left the other
as one would be redundant, but some filesystems like cifs have a large 
recommended i/o size (16K),
but if someone wants to remove both from the inode that is fine by me, 
as long as cifs
stat->blksize is set as you suggested on the way out of cifs_getattr.    
Eventually cifs should set
the stat->blksize to  a smaller value for one rarer case.   For the most 
common case
cifs should still set it to 16K (CIFS_MAX_MSGSIZE) as that is the most 
common negotiated
buffer size but if the server does not negotiate large read support, and 
the server also is so old
that it negotiates a buffer size smaller than 16K (e.g. Windows95 
negotiates 2K IIRC)
then we could set stat-blksize to the smaller negotiated buffer size - 
but since those servers are
getting rarer it is probably not that important.    More interesting 
will be the future cases in
which we will be able to set this value larger to more servers, as in 
general for modern
network adapters, the larger network i/o size the better.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-07  0:08     ` Andreas Dilger
  2006-11-07  0:13       ` Hua Zhong
@ 2006-11-07  1:39       ` Eric Sandeen
  1 sibling, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2006-11-07  1:39 UTC (permalink / raw)
  To: Eric Sandeen, Christoph Hellwig, Linux Kernel Mailing List,
	linux-fsdevel, Theodore Tso

Andreas Dilger wrote:
> On Nov 06, 2006  17:15 -0600, Eric Sandeen wrote:
>> Christoph Hellwig wrote:
>>> I agree with the conclusion, but the patch is incomplete.  You went down
>>> all the way to find out what the fileystems do in this messages, so add
>>> the hunks to override the defaults for non-standard filesystems to the
>>> patch aswell to restore the pre-inode diet state.
>> Well, agreed.  I put 80% or more back to pre-patch state, but not all.
>> :)  So it's less broken with my patch than without, so at least it's
>> moving forward.  So... Ted's patches get in w/o fixing up all the other
>> filesystems (left as an exercise to the patch reader) but mine can't? :)
> 
> Actually, rather than blindly revert to pre-patch behaviour it would be
> worthwhile to determine if PAGE_SIZE isn't the better value.  In some
> cases people don't understand that i_blksize is the "optimal IO size"
> and instead assume it is the filesystem blocksize.  I saw a few that were
> e.g. 512 and that can't be very useful.

I'm willing to either revert everyting to pre-inode-diet behavior, or leave it 
at the (newly re-proposed) page size default and let the other fs maintainers 
sort it out for their own codebase, but I don't pretend to know what is best 
for, say, qnx4 etc...  I'd be willing to cc: all maintainers asking them to take 
another long hard look at their code. :)

As we saw with cifs, these changes can have unintended consequences (not picking 
on cifs, it's just one that ran into issues with the broad-stroke change).

-Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-06 23:15   ` Eric Sandeen
  2006-11-06 23:17     ` Christoph Hellwig
  2006-11-07  0:08     ` Andreas Dilger
@ 2006-11-07  3:43     ` Theodore Tso
  2006-11-07  4:02       ` Eric Sandeen
  2 siblings, 1 reply; 16+ messages in thread
From: Theodore Tso @ 2006-11-07  3:43 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Christoph Hellwig, Linux Kernel Mailing List, linux-fsdevel

On Mon, Nov 06, 2006 at 05:15:27PM -0600, Eric Sandeen wrote:
> Christoph Hellwig wrote:
> > On Mon, Nov 06, 2006 at 03:50:02PM -0600, Eric Sandeen wrote:
> >> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again), 
> >> and let filesystems which need something -else- do that on their own.
> > 
> > I agree with the conclusion, but the patch is incomplete.  You went down
> > all the way to find out what the fileystems do in this messages, so add
> > the hunks to override the defaults for non-standard filesystems to the
> > patch aswell to restore the pre-inode diet state.
> 
> Well, agreed.  I put 80% or more back to pre-patch state, but not all.
> :)  So it's less broken with my patch than without, so at least it's
> moving forward.  So... Ted's patches get in w/o fixing up all the other
> filesystems (left as an exercise to the patch reader) but mine can't? :)

Note that *I* wasn't the one who changed it from PAGE_CACHE_SIZE to (1
<< inode->i_blkbits).  This was done by Andrew.  (See below, from an
e-mail dated September 19th).

Given that Steve French was cc'ed, I assume this was done as a hack to
fix CIFS, but it was a bad idea; I agree that PAGE_CACHE_SIZE is a way
better default than (1 << inode->i_blkbits).

As far as fixing all of the other filesystems, I did *try*; I know I
screwed up with XFS, but that's because I still think the code is a
screaming horror of indirections that make it impossible to understand
what the heck is going on, and I guess I screwed up with CIFS.  Some
of the changes away from "pre inode diet" state were deliberate,
though, since some filesystems had very clearly broken "optimal I/O
sizes" of 512, and one even had something incredibly bogus that was
something like 96 bytes (!) if I remember correctly.

					- Ted


Subject: inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix-fix
From: Andrew Morton <akpm@osdl.org>

Cc: <sbenni@gmx.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Steven French <sfrench@us.ibm.com>,
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 fs/stat.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN
fs/stat.c~inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix-fi
x fs/stat.c
---
a/fs/stat.c~inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix-
fix
+++ a/fs/stat.c
@@ -33,7 +33,7 @@ void generic_fillattr(struct inode *inod
        stat->ctime = inode->i_ctime;
        stat->size = i_size_read(inode);
        stat->blocks = inode->i_blocks;
-       stat->blksize = PAGE_CACHE_SIZE;
+       stat->blksize = (1 << inode->i_blkbits);
 }

 EXPORT_SYMBOL(generic_fillattr);
_


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE
  2006-11-07  3:43     ` Theodore Tso
@ 2006-11-07  4:02       ` Eric Sandeen
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Sandeen @ 2006-11-07  4:02 UTC (permalink / raw)
  To: Theodore Tso, Eric Sandeen, Christoph Hellwig,
	Linux Kernel Mailing List, linux-fsdevel

Theodore Tso wrote:
> On Mon, Nov 06, 2006 at 05:15:27PM -0600, Eric Sandeen wrote:
>> Christoph Hellwig wrote:
>>> On Mon, Nov 06, 2006 at 03:50:02PM -0600, Eric Sandeen wrote:
>>>> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again), 
>>>> and let filesystems which need something -else- do that on their own.
>>> I agree with the conclusion, but the patch is incomplete.  You went down
>>> all the way to find out what the fileystems do in this messages, so add
>>> the hunks to override the defaults for non-standard filesystems to the
>>> patch aswell to restore the pre-inode diet state.
>> Well, agreed.  I put 80% or more back to pre-patch state, but not all.
>> :)  So it's less broken with my patch than without, so at least it's
>> moving forward.  So... Ted's patches get in w/o fixing up all the other
>> filesystems (left as an exercise to the patch reader) but mine can't? :)
> 
> Note that *I* wasn't the one who changed it from PAGE_CACHE_SIZE to (1
> << inode->i_blkbits).  This was done by Andrew.  (See below, from an
> e-mail dated September 19th).
> 
> Given that Steve French was cc'ed, I assume this was done as a hack to
> fix CIFS, but it was a bad idea; I agree that PAGE_CACHE_SIZE is a way
> better default than (1 << inode->i_blkbits).
> 
> As far as fixing all of the other filesystems, I did *try*; I know I
> screwed up with XFS, but that's because I still think the code is a
> screaming horror of indirections that make it impossible to understand
> what the heck is going on, and I guess I screwed up with CIFS.  Some
> of the changes away from "pre inode diet" state were deliberate,
> though, since some filesystems had very clearly broken "optimal I/O
> sizes" of 512, and one even had something incredibly bogus that was
> something like 96 bytes (!) if I remember correctly.

In that case can we just go with my original third-round patch to put it back to 
PAGE_SIZE, and let the other filesystems tidy up if necessary.  This -is- just 
supposed to be a hint...

Sorry Ted, "broken" was too strong a word...  I just read:

"Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function."

in the patch, and figured that you were leaving it up to the filesystems to 
re-evaluate whether they needed something other than a page size.  Which I'd be 
happy to do, too.  :)

-Eric


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to   PAGE_CACHE_SIZE
  2006-11-07  0:26   ` Steve French
@ 2006-11-07 13:40     ` Dave Kleikamp
  2006-11-07 15:34       ` Dave Kleikamp
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Kleikamp @ 2006-11-07 13:40 UTC (permalink / raw)
  To: Steve French
  Cc: Eric Sandeen, Linux Kernel Mailing List, linux-fsdevel,
	Theodore Tso

On Mon, 2006-11-06 at 18:26 -0600, Steve French wrote:   
> I assumed that the original intent of the "inode diet patch" was to 
> remove fields in the inode,
> which most filesystems can get out of the superblock. 

I think I may have planted the idea that you could get i_blkbits from
sb->s_blocksize_bits, but I was wrong.  Consider a block device.  It's
blocksize is not related to the superblock of its containing filesystem.

>    If 
> inode->blksize and inode->blkbits were
> related (2**blkbits == blksize) , it also makes sense to me that someone 
> (Ted?) removed one and left the other
> as one would be redundant,

They were never really related.  Some filesystems treated them as if
they were, but the vfs always used i_blkbits for the block size.
i_blksize was only really used for returning stat->blksize.

>  but some filesystems like cifs have a large 
> recommended i/o size (16K),
> but if someone wants to remove both from the inode that is fine by me, 
> as long as cifs
> stat->blksize is set as you suggested on the way out of cifs_getattr.    
> Eventually cifs should set
> the stat->blksize to  a smaller value for one rarer case.   For the most 
> common case
> cifs should still set it to 16K (CIFS_MAX_MSGSIZE) as that is the most 
> common negotiated
> buffer size but if the server does not negotiate large read support, and 
> the server also is so old
> that it negotiates a buffer size smaller than 16K (e.g. Windows95 
> negotiates 2K IIRC)
> then we could set stat-blksize to the smaller negotiated buffer size - 
> but since those servers are
> getting rarer it is probably not that important.    More interesting 
> will be the future cases in
> which we will be able to set this value larger to more servers, as in 
> general for modern
> network adapters, the larger network i/o size the better.

It would probably be best to just set stat->blksize to the negotiated
buffer size.
-- 
David Kleikamp
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to   PAGE_CACHE_SIZE
  2006-11-07 13:40     ` Dave Kleikamp
@ 2006-11-07 15:34       ` Dave Kleikamp
  2006-11-07 15:50         ` Steve French
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Kleikamp @ 2006-11-07 15:34 UTC (permalink / raw)
  To: Steve French
  Cc: Eric Sandeen, Linux Kernel Mailing List, linux-fsdevel,
	Theodore Tso

On Tue, 2006-11-07 at 07:40 -0600, Dave Kleikamp wrote:
> It would probably be best to just set stat->blksize to the negotiated
> buffer size.

But be careful here.  I don't know how applications/glibc may behave if
stat->blksize is not a power of 2.
-- 
David Kleikamp
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to   PAGE_CACHE_SIZE
  2006-11-07 15:34       ` Dave Kleikamp
@ 2006-11-07 15:50         ` Steve French
  0 siblings, 0 replies; 16+ messages in thread
From: Steve French @ 2006-11-07 15:50 UTC (permalink / raw)
  To: Dave Kleikamp
  Cc: Eric Sandeen, Linux Kernel Mailing List, linux-fsdevel,
	Theodore Tso

Dave Kleikamp wrote:
> On Tue, 2006-11-07 at 07:40 -0600, Dave Kleikamp wrote:
>   
>> It would probably be best to just set stat->blksize to the negotiated
>> buffer size.
>>     
>
> But be careful here.  I don't know how applications/glibc may behave if
> stat->blksize is not a power of 2.
>   
The man page is not particularly helpful either as it simply indicates:
    "The st_blksize field gives the preferred blocksize for efficient 
file system  I/O. "
but it appears that blksize would affects readdir performance more than 
read/write
(since read/write go through the pagecache and thus readpages/writepages
will request readahead/writebehind for many pages at a time) unless the 
application
opens the file direct i/o.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2006-11-07 15:50 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-11-06 21:50 [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE Eric Sandeen
2006-11-06 22:27 ` Dave Kleikamp
2006-11-06 22:37   ` Eric Sandeen
2006-11-07  0:26   ` Steve French
2006-11-07 13:40     ` Dave Kleikamp
2006-11-07 15:34       ` Dave Kleikamp
2006-11-07 15:50         ` Steve French
2006-11-06 23:05 ` Christoph Hellwig
2006-11-06 23:15   ` Eric Sandeen
2006-11-06 23:17     ` Christoph Hellwig
2006-11-06 23:28       ` Eric Sandeen
2006-11-07  0:08     ` Andreas Dilger
2006-11-07  0:13       ` Hua Zhong
2006-11-07  1:39       ` Eric Sandeen
2006-11-07  3:43     ` Theodore Tso
2006-11-07  4:02       ` Eric Sandeen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).