linux-ext4.vger.kernel.org archive mirror
* [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
@ 2010-03-30 12:36 jing zhang
  2010-03-30 18:37 ` Aneesh Kumar K. V
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: jing zhang @ 2010-03-30 12:36 UTC
  To: linux-ext4, zj.barak
  Cc: Theodore Ts'o, Andreas Dilger, Dave Kleikamp,
	Aneesh Kumar K. V

From: Jing Zhang <zj.barak@gmail.com>

Date: Tue Mar 30 20:35:22 2010

With the added group cache, block allocation may achieve better
group locality.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Jing Zhang <zj.barak@gmail.com>

---

--- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
+++ ext4_mm_leak/mballoc-13.c	2010-03-30 20:28:08.000000000 +0800
@@ -4183,12 +4183,20 @@ static int ext4_mb_discard_preallocation
 	ext4_group_t i, ngroups = ext4_get_groups_count(sb);
 	int ret;
 	int freed = 0;
+	static ext4_group_t grp_cache = 0;

 	trace_ext4_mb_discard_preallocations(sb, needed);
-	for (i = 0; i < ngroups && needed > 0; i++) {
-		ret = ext4_mb_discard_group_preallocations(sb, i, needed);
+	if (needed <= 0)
+		return freed;
+	for (i = 0; i < ngroups; i++) {
+		if (grp_cache >= ngroups)
+			grp_cache -= ngroups;
+		ret = ext4_mb_discard_group_preallocations(sb, grp_cache, needed);
 		freed += ret;
 		needed -= ret;
+		if (needed <= 0)
+			break;
+		grp_cache++;
 	}

 	return freed;


* Re: [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
  2010-03-30 12:36 [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations() jing zhang
@ 2010-03-30 18:37 ` Aneesh Kumar K. V
  2010-03-31 15:10   ` jing zhang
  2010-03-31 15:03 ` Andreas Dilger
  2010-04-06 18:31 ` tytso
  2 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K. V @ 2010-03-30 18:37 UTC
  To: jing zhang, linux-ext4, zj.barak
  Cc: Theodore Ts'o, Andreas Dilger, Dave Kleikamp

On Tue, 30 Mar 2010 20:36:17 +0800, jing zhang <zj.barak@gmail.com> wrote:
> From: Jing Zhang <zj.barak@gmail.com>
> 
> Date: Tue Mar 30 20:35:22     2010
> 
> With the added cache, better group locality may be earned when
> allocating blocks.
> 
> Cc: Theodore Ts'o <tytso@mit.edu>
> Cc: Andreas Dilger <adilger@sun.com>
> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
> Cc: "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Jing Zhang <zj.barak@gmail.com>
> 
> ---
> 
> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
> +++ ext4_mm_leak/mballoc-13.c	2010-03-30 20:28:08.000000000 +0800
> @@ -4183,12 +4183,20 @@ static int ext4_mb_discard_preallocation
>  	ext4_group_t i, ngroups = ext4_get_groups_count(sb);
>  	int ret;
>  	int freed = 0;
> +	static ext4_group_t grp_cache = 0;
> 
>  	trace_ext4_mb_discard_preallocations(sb, needed);
> -	for (i = 0; i < ngroups && needed > 0; i++) {
> -		ret = ext4_mb_discard_group_preallocations(sb, i, needed);
> +	if (needed <= 0)
> +		return freed;
> +	for (i = 0; i < ngroups; i++) {
> +		if (grp_cache >= ngroups)
> +			grp_cache -= ngroups;
> +		ret = ext4_mb_discard_group_preallocations(sb, grp_cache, needed);
>  		freed += ret;
>  		needed -= ret;
> +		if (needed <= 0)
> +			break;
> +		grp_cache++;
>  	}
> 
>  	return freed;

can you explain this further ?

-aneesh


* Re: [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
  2010-03-30 12:36 [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations() jing zhang
  2010-03-30 18:37 ` Aneesh Kumar K. V
@ 2010-03-31 15:03 ` Andreas Dilger
  2010-04-01 12:34   ` jing zhang
  2010-04-06 18:31 ` tytso
  2 siblings, 1 reply; 10+ messages in thread
From: Andreas Dilger @ 2010-03-31 15:03 UTC
  To: jing zhang
  Cc: linux-ext4, Theodore Ts'o, Dave Kleikamp, Aneesh Kumar K. V

On 2010-03-30, at 06:36, jing zhang wrote:
> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
> +++ ext4_mm_leak/mballoc-13.c	2010-03-30 20:28:08.000000000 +0800
> @@ -4183,12 +4183,20 @@ static int ext4_mb_discard_preallocation
> 	trace_ext4_mb_discard_preallocations(sb, needed);
> -	for (i = 0; i < ngroups && needed > 0; i++) {
> -		ret = ext4_mb_discard_group_preallocations(sb, i, needed);
> +	if (needed <= 0)
> +		return freed;
> +	for (i = 0; i < ngroups; i++) {
> +		if (grp_cache >= ngroups)
> +			grp_cache -= ngroups;
> +		ret = ext4_mb_discard_group_preallocations(sb, grp_cache, needed);


Anything that is walking every group in the filesystem is going to hit  
problems on large filesystems.  This seems like something that needs  
to be fixed in a different way (e.g. keeping a list of preallocations).

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
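
A minimal userspace sketch of the "list of preallocations" idea mentioned
above, assuming each file system tracks which groups currently hold
preallocated blocks. Every name here (fs_model, model_add_pa, model_discard)
is hypothetical and does not exist in ext4; the model also frees a whole
group at a time, which simplifies what the real code does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in ext4. */
#define MAX_GROUPS 1024

struct fs_model {
	unsigned int ngroups;             /* must not exceed MAX_GROUPS */
	int pa_blocks[MAX_GROUPS];        /* preallocated blocks held per group */
	unsigned int pa_list[MAX_GROUPS]; /* groups that currently hold preallocations */
	size_t pa_count;                  /* number of valid entries in pa_list */
	unsigned long visited;            /* groups examined, kept for comparison */
};

/* Record a preallocation; list the group only when it becomes non-empty. */
static void model_add_pa(struct fs_model *fs, unsigned int group, int blocks)
{
	assert(group < fs->ngroups && blocks > 0);
	if (fs->pa_blocks[group] == 0)
		fs->pa_list[fs->pa_count++] = group;
	fs->pa_blocks[group] += blocks;
}

/* Discard from listed groups only, stopping once `needed` blocks are freed. */
static int model_discard(struct fs_model *fs, int needed)
{
	int freed = 0;

	while (needed > 0 && fs->pa_count > 0) {
		unsigned int group = fs->pa_list[--fs->pa_count];

		fs->visited++;
		freed += fs->pa_blocks[group];
		needed -= fs->pa_blocks[group];
		fs->pa_blocks[group] = 0;
	}
	return freed;
}
```

In this model the cost of a discard depends on the number of groups that
hold preallocations rather than on the total group count of the file system.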



* Re: [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
  2010-03-30 18:37 ` Aneesh Kumar K. V
@ 2010-03-31 15:10   ` jing zhang
  0 siblings, 0 replies; 10+ messages in thread
From: jing zhang @ 2010-03-31 15:10 UTC
  To: Aneesh Kumar K. V
  Cc: linux-ext4, Theodore Ts'o, Andreas Dilger, Dave Kleikamp

2010/3/31, Aneesh Kumar K. V <aneesh.kumar@linux.vnet.ibm.com>:
> On Tue, 30 Mar 2010 20:36:17 +0800, jing zhang <zj.barak@gmail.com> wrote:
>> From: Jing Zhang <zj.barak@gmail.com>
>>
>> Date: Tue Mar 30 20:35:22     2010
>>
>> With the added cache, better group locality may be earned when
>> allocating blocks.
>>
>> Cc: Theodore Ts'o <tytso@mit.edu>
>> Cc: Andreas Dilger <adilger@sun.com>
>> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
>> Cc: "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com>
>> Signed-off-by: Jing Zhang <zj.barak@gmail.com>
>>
>> ---
>>
>> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
>> +++ ext4_mm_leak/mballoc-13.c	2010-03-30 20:28:08.000000000 +0800
>> @@ -4183,12 +4183,20 @@ static int ext4_mb_discard_preallocation
>>  	ext4_group_t i, ngroups = ext4_get_groups_count(sb);
>>  	int ret;
>>  	int freed = 0;
>> +	static ext4_group_t grp_cache = 0;
>>
>>  	trace_ext4_mb_discard_preallocations(sb, needed);
>> -	for (i = 0; i < ngroups && needed > 0; i++) {
>> -		ret = ext4_mb_discard_group_preallocations(sb, i, needed);
>> +	if (needed <= 0)
>> +		return freed;
>> +	for (i = 0; i < ngroups; i++) {
>> +		if (grp_cache >= ngroups)
>> +			grp_cache -= ngroups;
>> +		ret = ext4_mb_discard_group_preallocations(sb, grp_cache, needed);
>>  		freed += ret;
>>  		needed -= ret;
>> +		if (needed <= 0)
>> +			break;
>> +		grp_cache++;
>>  	}
>>
>>  	return freed;
>
> can you explain this further ?
>
> -aneesh
>

The added cache checks whether the blocks preallocated in the cached
group are still available. If so, they are discarded and used for the
allocation without switching to another group, so better group locality
can be achieved.

Moreover, ext4_mb_discard_group_preallocations() itself yields so that
as many preallocations as possible can be discarded.

     - zj
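
The loop described above can be modelled in userspace as follows. This is
only a sketch: discard_group() is a hypothetical stand-in for
ext4_mb_discard_group_preallocations() that frees everything in one group,
and the cursor is passed in by the caller instead of being a static
variable.

```c
#include <assert.h>

/* Hypothetical stand-in for ext4_mb_discard_group_preallocations():
 * frees everything preallocated in one group and returns the amount. */
static int discard_group(int *pa_blocks, unsigned int group)
{
	int freed = pa_blocks[group];

	pa_blocks[group] = 0;
	return freed;
}

/* Same loop shape as the patch, with the cursor supplied by the caller. */
static int discard_from_cursor(int *pa_blocks, unsigned int ngroups,
			       unsigned int *cursor, int needed)
{
	unsigned int i;
	int freed = 0;

	assert(ngroups > 0);
	if (needed <= 0)
		return freed;
	for (i = 0; i < ngroups; i++) {
		int ret;

		if (*cursor >= ngroups)
			*cursor -= ngroups;
		ret = discard_group(pa_blocks, *cursor);
		freed += ret;
		needed -= ret;
		if (needed <= 0)
			break;	/* cursor stays on the group that satisfied the request */
		(*cursor)++;
	}
	return freed;
}
```

A second call resumes at the group where the previous call stopped instead
of starting again from group 0.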


* Re: [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
  2010-03-31 15:03 ` Andreas Dilger
@ 2010-04-01 12:34   ` jing zhang
  2010-04-06 18:49     ` tytso
  0 siblings, 1 reply; 10+ messages in thread
From: jing zhang @ 2010-04-01 12:34 UTC
  To: Andreas Dilger
  Cc: linux-ext4, Theodore Ts'o, Dave Kleikamp, Aneesh Kumar K. V

2010/3/31, Andreas Dilger <adilger@sun.com>:
> On 2010-03-30, at 06:36, jing zhang wrote:
>> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
>> +++ ext4_mm_leak/mballoc-13.c	2010-03-30 20:28:08.000000000 +0800
>> @@ -4183,12 +4183,20 @@ static int ext4_mb_discard_preallocation
>> 	trace_ext4_mb_discard_preallocations(sb, needed);
>> -	for (i = 0; i < ngroups && needed > 0; i++) {
>> -		ret = ext4_mb_discard_group_preallocations(sb, i, needed);
>> +	if (needed <= 0)
>> +		return freed;
>> +	for (i = 0; i < ngroups; i++) {
>> +		if (grp_cache >= ngroups)
>> +			grp_cache -= ngroups;
>> +		ret = ext4_mb_discard_group_preallocations(sb, grp_cache, needed);
>
>
> Anything that is walking every group in the filesystem is going to hit
> problems on large filesystems.  This seems like something that needs
> to be fixed in a different way (e.g. keeping a list of preallocations).
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>

Then please also take the following into consideration.

Thanks
            - zj

---

--- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
+++ ext4_mm_leak/mballoc-14.c	2010-04-01 20:35:58.000000000 +0800
@@ -4299,7 +4299,7 @@ repeat:
 		}
 	} else {
 		freed  = ext4_mb_discard_preallocations(sb, ac->ac_o_ex.fe_len);
-		if (freed)
+		if (freed && freed >= ac->ac_o_ex.fe_len)
 			goto repeat;
 		*errp = -ENOSPC;
 		ac->ac_b_ex.fe_len = 0;


* Re: [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
  2010-03-30 12:36 [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations() jing zhang
  2010-03-30 18:37 ` Aneesh Kumar K. V
  2010-03-31 15:03 ` Andreas Dilger
@ 2010-04-06 18:31 ` tytso
  2010-04-07 12:50   ` jing zhang
  2 siblings, 1 reply; 10+ messages in thread
From: tytso @ 2010-04-06 18:31 UTC
  To: jing zhang; +Cc: linux-ext4, Andreas Dilger, Dave Kleikamp, Aneesh Kumar K. V

On Tue, Mar 30, 2010 at 08:36:17PM +0800, jing zhang wrote:
> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
> +++ ext4_mm_leak/mballoc-13.c	2010-03-30 20:28:08.000000000 +0800
> @@ -4183,12 +4183,20 @@ static int ext4_mb_discard_preallocation
>  	ext4_group_t i, ngroups = ext4_get_groups_count(sb);
>  	int ret;
>  	int freed = 0;
> +	static ext4_group_t grp_cache = 0;

This is a problem right there.  Remember that there could be multiple
file systems mounted so a static variable is fundamentally flawed.

In fact, we could have one file system that has more than 3 times
as many groups as another file system.  I'll leave it as an
exercise for the reader why your patch would be fundamentally flawed in
that case.

The other thing to note is that this case only gets hit if the file
system is so full that we need to empty preallocations.  So this means
hitting this case is rare, which raises two questions: (1) is it worth
it to optimize this case in the first place (is it really that
expensive to iterate over all the groups to discard the
preallocations); (2) can we test this case well?

						- Ted
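
One way to see the problem with a cursor shared across file systems is the
sketch below. It is an illustration under assumed numbers, not necessarily
the exact scenario meant above: wrap_once() copies the patch's single
subtraction, while struct fs_state and its discard_cursor field are
hypothetical names for per-file-system state. Locking is ignored.

```c
#include <assert.h>

/* The patch's wrap step: a single subtraction. */
static unsigned int wrap_once(unsigned int cursor, unsigned int ngroups)
{
	if (cursor >= ngroups)
		cursor -= ngroups;
	return cursor;
}

/* Hypothetical per-file-system state; the field name is invented here. */
struct fs_state {
	unsigned int ngroups;
	unsigned int discard_cursor;
};

/* Return the group to scan next and advance this file system's own cursor. */
static unsigned int next_group(struct fs_state *fs)
{
	unsigned int group;

	assert(fs->ngroups > 0);
	group = fs->discard_cursor % fs->ngroups;
	fs->discard_cursor = (group + 1) % fs->ngroups;
	return group;
}
```

If a 100-group file system leaves the shared cursor at 50 and a 10-group
file system then uses it, wrap_once(50, 10) yields 40, which is not a valid
group on the smaller file system. With one cursor per file system, each
scan stays within its own range.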


* Re: [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
  2010-04-01 12:34   ` jing zhang
@ 2010-04-06 18:49     ` tytso
  2010-04-07 12:58       ` jing zhang
  0 siblings, 1 reply; 10+ messages in thread
From: tytso @ 2010-04-06 18:49 UTC
  To: jing zhang; +Cc: Andreas Dilger, linux-ext4, Dave Kleikamp, Aneesh Kumar K. V

On Thu, Apr 01, 2010 at 08:34:41PM +0800, jing zhang wrote:
> 
> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
> +++ ext4_mm_leak/mballoc-14.c	2010-04-01 20:35:58.000000000 +0800
> @@ -4299,7 +4299,7 @@ repeat:
>  		}
>  	} else {
>  		freed  = ext4_mb_discard_preallocations(sb, ac->ac_o_ex.fe_len);
> -		if (freed)
> +		if (freed && freed >= ac->ac_o_ex.fe_len)
>  			goto repeat;
>  		*errp = -ENOSPC;
>  		ac->ac_b_ex.fe_len = 0;

This is just wrong.   

Since you didn't give a justification, I'm not sure why you think it
is correct.

						- Ted


* Re: [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
  2010-04-06 18:31 ` tytso
@ 2010-04-07 12:50   ` jing zhang
  0 siblings, 0 replies; 10+ messages in thread
From: jing zhang @ 2010-04-07 12:50 UTC
  To: tytso; +Cc: linux-ext4, Andreas Dilger, Dave Kleikamp, Aneesh Kumar K. V

2010/4/7, tytso@mit.edu <tytso@mit.edu>:
> On Tue, Mar 30, 2010 at 08:36:17PM +0800, jing zhang wrote:
>> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
>> +++ ext4_mm_leak/mballoc-13.c	2010-03-30 20:28:08.000000000 +0800
>> @@ -4183,12 +4183,20 @@ static int ext4_mb_discard_preallocation
>>  	ext4_group_t i, ngroups = ext4_get_groups_count(sb);
>>  	int ret;
>>  	int freed = 0;
>> +	static ext4_group_t grp_cache = 0;
>
> This is a problem right there.  Remember that there could be multiple
> file systems mounted so a static variable is fundamentally flawed.
>

Cool, the static variable in my patch is a fatal error.

          - zj

> In fact, we could have a one filesystem which has more than 3 times
> the number of groups as another file system.  I'll leave it as an
> exercise to a reader why your patch would be fundamentally flawed in
> that case.
>
> The other thing to note is that this case only gets hit if the file
> system is so full that we need to empty preallocations.  So this means
> hitting this case is rare, which raises two questions: (1) is it worth
> it to optimize this case in the first place (is it really that
> expensive to iterate over all the groups to discard the
> preallocations); (2) can we test this case well?
>
> 						- Ted
>


* Re: [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
  2010-04-06 18:49     ` tytso
@ 2010-04-07 12:58       ` jing zhang
  2010-04-07 14:46         ` tytso
  0 siblings, 1 reply; 10+ messages in thread
From: jing zhang @ 2010-04-07 12:58 UTC
  To: tytso; +Cc: Andreas Dilger, linux-ext4, Dave Kleikamp, Aneesh Kumar K. V

2010/4/7, tytso@mit.edu <tytso@mit.edu>:
> On Thu, Apr 01, 2010 at 08:34:41PM +0800, jing zhang wrote:
>>
>> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
>> +++ ext4_mm_leak/mballoc-14.c	2010-04-01 20:35:58.000000000 +0800
>> @@ -4299,7 +4299,7 @@ repeat:
>>  		}
>>  	} else {
>>  		freed  = ext4_mb_discard_preallocations(sb, ac->ac_o_ex.fe_len);
>> -		if (freed)
>> +		if (freed && freed >= ac->ac_o_ex.fe_len)
>>  			goto repeat;
>>  		*errp = -ENOSPC;
>>  		ac->ac_b_ex.fe_len = 0;
>
> This is just wrong.
>
> Since you didn't give a justification, I'm not sure why you think it
> is correct.
>

Even if some blocks were freed, is the amount freed at least as large
as what is needed? If not, it seems unnecessary to repeat.

      - zj


* Re: [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations()
  2010-04-07 12:58       ` jing zhang
@ 2010-04-07 14:46         ` tytso
  0 siblings, 0 replies; 10+ messages in thread
From: tytso @ 2010-04-07 14:46 UTC
  To: jing zhang; +Cc: Andreas Dilger, linux-ext4, Dave Kleikamp, Aneesh Kumar K. V

On Wed, Apr 07, 2010 at 08:58:36PM +0800, jing zhang wrote:
> 2010/4/7, tytso@mit.edu <tytso@mit.edu>:
> > On Thu, Apr 01, 2010 at 08:34:41PM +0800, jing zhang wrote:
> >>
> >> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
> >> +++ ext4_mm_leak/mballoc-14.c	2010-04-01 20:35:58.000000000 +0800
> >> @@ -4299,7 +4299,7 @@ repeat:
> >>  		}
> >>  	} else {
> >>  		freed  = ext4_mb_discard_preallocations(sb, ac->ac_o_ex.fe_len);
> >> -		if (freed)
> >> +		if (freed && freed >= ac->ac_o_ex.fe_len)
> >>  			goto repeat;
> >>  		*errp = -ENOSPC;
> >>  		ac->ac_b_ex.fe_len = 0;
> >
> > This is just wrong.
> >
> > Since you didn't give a justification, I'm not sure why you think it
> > is correct.
> >
> 
> Though freed, is the amount freed bigger than needed?
> If not, it seems unnecessary to repeat.

You don't understand the code, I think.  If we've freed up any number
of blocks, it makes sense to use those blocks right away.  Mballoc()
is allowed to return fewer blocks than what was requested.

   	      	    	     	 - Ted
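
The difference between the two retry conditions can be sketched with a toy
userspace model. Everything here is hypothetical (struct pool, try_alloc,
discard_all, alloc_blocks); it only assumes, as stated above, that the
allocator may return fewer blocks than requested.

```c
#include <assert.h>

/* Hypothetical model: free_blocks can be allocated now; pa_blocks are held
 * in preallocations and become free only after a discard. */
struct pool {
	int free_blocks;
	int pa_blocks;
};

/* Grant up to `len` blocks; may return fewer than requested. */
static int try_alloc(struct pool *p, int len)
{
	int got = p->free_blocks < len ? p->free_blocks : len;

	p->free_blocks -= got;
	return got;
}

/* Release all preallocated blocks and report how many were freed. */
static int discard_all(struct pool *p)
{
	int freed = p->pa_blocks;

	p->free_blocks += freed;
	p->pa_blocks = 0;
	return freed;
}

/* retry_on_any != 0: retry if anything was freed (the original test).
 * retry_on_any == 0: retry only if freed >= len (the proposed test). */
static int alloc_blocks(struct pool *p, int len, int retry_on_any)
{
	for (;;) {
		int got = try_alloc(p, len);
		int freed;

		if (got > 0)
			return got;
		freed = discard_all(p);
		if (freed > 0 && (retry_on_any || freed >= len))
			continue;
		return 0;	/* stands in for -ENOSPC */
	}
}
```

With 3 blocks held in preallocations and a request for 8, the original
condition retries and returns 3 blocks, while the proposed condition gives
up with nothing even though 3 blocks have just been freed.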


End of thread; newest message 2010-04-07 14:46 UTC

Thread overview: 10+ messages
2010-03-30 12:36 [PATCH] ext4: group cache is added in ext4_mb_discard_preallocations() jing zhang
2010-03-30 18:37 ` Aneesh Kumar K. V
2010-03-31 15:10   ` jing zhang
2010-03-31 15:03 ` Andreas Dilger
2010-04-01 12:34   ` jing zhang
2010-04-06 18:49     ` tytso
2010-04-07 12:58       ` jing zhang
2010-04-07 14:46         ` tytso
2010-04-06 18:31 ` tytso
2010-04-07 12:50   ` jing zhang
