public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Ext3 -mm reservations code: is this fix really correct?
@ 2004-10-15 13:27 Stephen C. Tweedie
  2004-10-15 16:01 ` [Ext2-devel] " mingming cao
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen C. Tweedie @ 2004-10-15 13:27 UTC (permalink / raw)
  To: Mingming Cao, Badari Pulavarty
  Cc: linux-kernel, ext2-devel@lists.sourceforge.net, Stephen Tweedie,
	Andrew Morton

Hi,

In ext3-reservations-window-allocation-fix.patch from -mm, we try to
make sure that we always search for a new reservation from the goal
forwards, not just from the previous window forwards.  I'm assuming this
is done to optimise random writes.

I'm still not convinced we get it right.  In alloc_new_reservation(), we
do:

		if ((my_rsv->rsv_start <= group_end_block) &&
				(my_rsv->rsv_end > group_end_block))
			return -1;

We get into alloc_new_reservation in the first place either when the
goal is outside the window, or we could not allocate inside the window.

Now, in the latter case, the check is correct --- if the window spans
two block groups and we couldn't allocate in the first block group, then
we should continue in the next one.

But what if the goal was in the current block group, but was *prior* to
the window?  The goal is outside the window, yet the above check may
still be true, and we'll incorrectly decide to avoid the current block
group entirely.

I think we need an "&& start_block >= my_rsv->rsv_start" to deal with
this.

If we get past that test --- the reservation window is all entirely
inside one group --- then we have the following chunk in
xt3-reservations-window-allocation-fix.patch:

-		/* remember where we are before we discard the old one */
-		if (my_rsv->rsv_end + 1 > start_block)
-			start_block = my_rsv->rsv_end + 1;
-		search_head = my_rsv;
+		search_head = search_reserve_window(&my_rsv->rsv_node, start_block);

which I'm assuming is trying to pin the search start to the goal block. 
But that's wrong --- search_reserve_window does a downwards-traversing
tree search, and needs to be given the root of the tree in order to find
a given block.  Giving it the current reservation node as the search
start point is not going to allow it to find the right node in all
cases.

Have I misunderstood something?

Fortunately, none of the above should affect the normal hot path of
sequential allocation, but it may well penalise random writes.

Cheers,
 Stephen


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct?
  2004-10-15 13:27 Ext3 -mm reservations code: is this fix really correct? Stephen C. Tweedie
@ 2004-10-15 16:01 ` mingming cao
  2004-10-15 16:40   ` Stephen C. Tweedie
  0 siblings, 1 reply; 16+ messages in thread
From: mingming cao @ 2004-10-15 16:01 UTC (permalink / raw)
  To: Stephen C. Tweedie
  Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net,
	Andrew Morton

On Fri, 2004-10-15 at 06:27, Stephen C. Tweedie wrote:
> Hi,
> 
> In ext3-reservations-window-allocation-fix.patch from -mm, we try to
> make sure that we always search for a new reservation from the goal
> forwards, not just from the previous window forwards.  I'm assuming this
> is done to optimise random writes.
> 
> I'm still not convinced we get it right.  In alloc_new_reservation(), we
> do:
> 
> 		if ((my_rsv->rsv_start <= group_end_block) &&
> 				(my_rsv->rsv_end > group_end_block))
> 			return -1;
> 
> We get into alloc_new_reservation in the first place either when the
> goal is outside the window, or we could not allocate inside the window.
> 
> Now, in the latter case, the check is correct --- if the window spans
> two block groups and we couldn't allocate in the first block group, then
> we should continue in the next one.
> 
> But what if the goal was in the current block group, but was *prior* to
> the window?  The goal is outside the window, yet the above check may
> still be true, and we'll incorrectly decide to avoid the current block
> group entirely.
> 
> I think we need an "&& start_block >= my_rsv->rsv_start" to deal with
> this.
> 

You are right. I think at the beginning we were decide the honor the the
reservation window if the goal is out of the window. Considering the
goal is selected with good reason, we agreed that we should try honor
the goal block all the time.But we missed this case apparently.

> If we get past that test --- the reservation window is all entirely
> inside one group --- then we have the following chunk in
> xt3-reservations-window-allocation-fix.patch:
> 
> -		/* remember where we are before we discard the old one */
> -		if (my_rsv->rsv_end + 1 > start_block)
> -			start_block = my_rsv->rsv_end + 1;
> -		search_head = my_rsv;
> +		search_head = search_reserve_window(&my_rsv->rsv_node, start_block);
> 
> which I'm assuming is trying to pin the search start to the goal block. 
> But that's wrong --- search_reserve_window does a downwards-traversing
> tree search, and needs to be given the root of the tree in order to find
> a given block.  Giving it the current reservation node as the search
> start point is not going to allow it to find the right node in all
> cases.
> 
> Have I misunderstood something?
> 
You are correct, again:) We should do a search_reserve_window() from the
root.

I will post a fix for these two soon.

Mingming


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct?
  2004-10-15 16:01 ` [Ext2-devel] " mingming cao
@ 2004-10-15 16:40   ` Stephen C. Tweedie
  2004-10-15 20:29     ` mingming cao
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen C. Tweedie @ 2004-10-15 16:40 UTC (permalink / raw)
  To: Mingming Cao
  Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net,
	Andrew Morton

Hi,

On Fri, 2004-10-15 at 17:01, mingming cao wrote:

> > Have I misunderstood something?
> > 
> You are correct, again:) We should do a search_reserve_window() from the
> root.
> 
> I will post a fix for these two soon.

Thanks.  I'll be away for a few days so I probably won't be able to look
at the fix until Wednesday next week.

Cheers,
 Stephen


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct?
  2004-10-15 16:40   ` Stephen C. Tweedie
@ 2004-10-15 20:29     ` mingming cao
  2004-10-15 22:20       ` Stephen C. Tweedie
  0 siblings, 1 reply; 16+ messages in thread
From: mingming cao @ 2004-10-15 20:29 UTC (permalink / raw)
  To: Stephen C. Tweedie
  Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net,
	Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 494 bytes --]

On Fri, 2004-10-15 at 09:40, Stephen C. Tweedie wrote:
> Hi,
> 
> On Fri, 2004-10-15 at 17:01, mingming cao wrote:
> 
> > > Have I misunderstood something?
> > > 
> > You are correct, again:) We should do a search_reserve_window() from the
> > root.
> > 
> > I will post a fix for these two soon.
> 
> Thanks.  I'll be away for a few days so I probably won't be able to look
> at the fix until Wednesday next week.
> 

How about this? Haven't test it, will do it shortly.:)


Thanks,
Mingming


[-- Attachment #2: ext3_reservation_window_fix_fix.patch --]
[-- Type: text/plain, Size: 2424 bytes --]

---

 linux-2.6.9-rc1-mm5-ming/fs/ext3/balloc.c |   21 +++++++++------------
 1 files changed, 9 insertions(+), 12 deletions(-)

diff -puN fs/ext3/balloc.c~ext3_reservation_window_fix_fix fs/ext3/balloc.c
--- linux-2.6.9-rc1-mm5/fs/ext3/balloc.c~ext3_reservation_window_fix_fix	2004-10-15 18:23:42.824158856 -0700
+++ linux-2.6.9-rc1-mm5-ming/fs/ext3/balloc.c	2004-10-15 20:16:06.037034680 -0700
@@ -184,9 +184,10 @@ goal_in_my_reservation(struct reserve_wi
  * if the goal is not in any window.
  * Returns NULL if there are no windows or if all windows start after the goal.
  */
-static struct reserve_window_node *search_reserve_window(struct rb_node *n,
+static struct reserve_window_node *search_reserve_window(struct rb_root *root,
 							 unsigned long goal)
 {
+	struct rb_node *n = root->rb_node;
 	struct reserve_window_node *rsv;
 
 	if (!n)
@@ -822,10 +823,10 @@ static int alloc_new_reservation(struct 
 		start_block = goal + group_first_block;
 
 	size = atomic_read(&my_rsv->rsv_goal_size);
-	/* if we have a old reservation, start the search from the old rsv */
 	if (!rsv_is_empty(&my_rsv->rsv_window)) {
 		/*
 		 * if the old reservation is cross group boundary
+		 * and if the goal is inside the old reservation window,
 		 * we will come here when we just failed to allocate from
 		 * the first part of the window. We still have another part
 		 * that belongs to the next group. In this case, there is no
@@ -838,10 +839,10 @@ static int alloc_new_reservation(struct 
 		 */
 
 		if ((my_rsv->rsv_start <= group_end_block) &&
-				(my_rsv->rsv_end > group_end_block))
+				(my_rsv->rsv_end > group_end_block) &&
+				(start_block <= my_rsv->rsv_start))
 			return -1;
 
-		search_head = search_reserve_window(&my_rsv->rsv_node, start_block);
 		if ((atomic_read(&my_rsv->rsv_alloc_hit) >
 		     (my_rsv->rsv_end - my_rsv->rsv_start + 1) / 2)) {
 			/*
@@ -855,14 +856,10 @@ static int alloc_new_reservation(struct 
 			atomic_set(&my_rsv->rsv_goal_size, size);
 		}
 	}
-	else {
-		/*
-		 * we don't have a reservation,
-		 * we set our goal(start_block) and
-		 * the list head for the search
-		 */
-		search_head = search_reserve_window(fs_rsv_root->rb_node, start_block);
-	}
+	/*
+	 * shift the search start to the window near the goal block
+	 */
+	search_head = search_reserve_window(fs_rsv_root, start_block);
 
 	/*
 	 * find_next_reservable_window() simply finds a reservable window

_

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct?
  2004-10-15 20:29     ` mingming cao
@ 2004-10-15 22:20       ` Stephen C. Tweedie
  2004-10-15 22:34         ` mingming cao
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen C. Tweedie @ 2004-10-15 22:20 UTC (permalink / raw)
  To: Mingming Cao
  Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net,
	Andrew Morton, Stephen Tweedie

Hi,

On Fri, 2004-10-15 at 21:29, mingming cao wrote:

> How about this? Haven't test it, will do it shortly.:)

                if ((my_rsv->rsv_start <= group_end_block) &&
-                               (my_rsv->rsv_end > group_end_block))
+                               (my_rsv->rsv_end > group_end_block) &&
+                               (start_block <= my_rsv->rsv_start))
                        return -1;
 
That new "<=" should be a ">=", shouldn't it?

Otherwise, looks OK.

--Stephen


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct?
  2004-10-15 22:20       ` Stephen C. Tweedie
@ 2004-10-15 22:34         ` mingming cao
  2004-10-18 22:55           ` [PATCH 2/3] ext3 reservation allow turn off for specifed file Mingming Cao
  0 siblings, 1 reply; 16+ messages in thread
From: mingming cao @ 2004-10-15 22:34 UTC (permalink / raw)
  To: Stephen C. Tweedie
  Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net,
	Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 622 bytes --]

On Fri, 2004-10-15 at 15:20, Stephen C. Tweedie wrote:
> Hi,
> 
> On Fri, 2004-10-15 at 21:29, mingming cao wrote:
> 
> > How about this? Haven't test it, will do it shortly.:)
> 
>                 if ((my_rsv->rsv_start <= group_end_block) &&
> -                               (my_rsv->rsv_end > group_end_block))
> +                               (my_rsv->rsv_end > group_end_block) &&
> +                               (start_block <= my_rsv->rsv_start))
>                         return -1;
>  
> That new "<=" should be a ">=", shouldn't it?

My bad! It should be ">=":)

Updated patch attached. 

Thanks!

Mingming


[-- Attachment #2: ext3_reservation_window_fix_fix.patch --]
[-- Type: text/plain, Size: 2424 bytes --]

---

 linux-2.6.9-rc1-mm5-ming/fs/ext3/balloc.c |   21 +++++++++------------
 1 files changed, 9 insertions(+), 12 deletions(-)

diff -puN fs/ext3/balloc.c~ext3_reservation_window_fix_fix fs/ext3/balloc.c
--- linux-2.6.9-rc1-mm5/fs/ext3/balloc.c~ext3_reservation_window_fix_fix	2004-10-15 18:23:42.824158856 -0700
+++ linux-2.6.9-rc1-mm5-ming/fs/ext3/balloc.c	2004-10-15 22:38:52.895674208 -0700
@@ -184,9 +184,10 @@ goal_in_my_reservation(struct reserve_wi
  * if the goal is not in any window.
  * Returns NULL if there are no windows or if all windows start after the goal.
  */
-static struct reserve_window_node *search_reserve_window(struct rb_node *n,
+static struct reserve_window_node *search_reserve_window(struct rb_root *root,
 							 unsigned long goal)
 {
+	struct rb_node *n = root->rb_node;
 	struct reserve_window_node *rsv;
 
 	if (!n)
@@ -822,10 +823,10 @@ static int alloc_new_reservation(struct 
 		start_block = goal + group_first_block;
 
 	size = atomic_read(&my_rsv->rsv_goal_size);
-	/* if we have a old reservation, start the search from the old rsv */
 	if (!rsv_is_empty(&my_rsv->rsv_window)) {
 		/*
 		 * if the old reservation is cross group boundary
+		 * and if the goal is inside the old reservation window,
 		 * we will come here when we just failed to allocate from
 		 * the first part of the window. We still have another part
 		 * that belongs to the next group. In this case, there is no
@@ -838,10 +839,10 @@ static int alloc_new_reservation(struct 
 		 */
 
 		if ((my_rsv->rsv_start <= group_end_block) &&
-				(my_rsv->rsv_end > group_end_block))
+				(my_rsv->rsv_end > group_end_block) &&
+				(start_block >= my_rsv->rsv_start))
 			return -1;
 
-		search_head = search_reserve_window(&my_rsv->rsv_node, start_block);
 		if ((atomic_read(&my_rsv->rsv_alloc_hit) >
 		     (my_rsv->rsv_end - my_rsv->rsv_start + 1) / 2)) {
 			/*
@@ -855,14 +856,10 @@ static int alloc_new_reservation(struct 
 			atomic_set(&my_rsv->rsv_goal_size, size);
 		}
 	}
-	else {
-		/*
-		 * we don't have a reservation,
-		 * we set our goal(start_block) and
-		 * the list head for the search
-		 */
-		search_head = search_reserve_window(fs_rsv_root->rb_node, start_block);
-	}
+	/*
+	 * shift the search start to the window near the goal block
+	 */
+	search_head = search_reserve_window(fs_rsv_root, start_block);
 
 	/*
 	 * find_next_reservable_window() simply finds a reservable window

_

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-15 22:34         ` mingming cao
@ 2004-10-18 22:55           ` Mingming Cao
  2004-10-18 23:42             ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Mingming Cao @ 2004-10-18 22:55 UTC (permalink / raw)
  To: akpm, Stephen C. Tweedie
  Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net

Allow user shut down reservation-based allocation(using ioctl) on a specific file(e.g. for seeky random write).


---

 linux-2.6.9-rc4-mm1-ming/fs/ext3/balloc.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff -puN fs/ext3/balloc.c~ext3_shutdown_reservation_per-file fs/ext3/balloc.c
--- linux-2.6.9-rc4-mm1/fs/ext3/balloc.c~ext3_shutdown_reservation_per-file	2004-10-18 22:27:06.333698488 -0700
+++ linux-2.6.9-rc4-mm1-ming/fs/ext3/balloc.c	2004-10-18 22:34:52.825780912 -0700
@@ -1153,6 +1153,7 @@ int ext3_new_block(handle_t *handle, str
 	struct ext3_super_block *es;
 	struct ext3_sb_info *sbi;
 	struct reserve_window_node *my_rsv = NULL;
+	struct reserve_window_node *rsv = &EXT3_I(inode)->i_rsv_window;
 #ifdef EXT3FS_DEBUG
 	static int goal_hits, goal_attempts;
 #endif
@@ -1176,8 +1177,18 @@ int ext3_new_block(handle_t *handle, str
 	sbi = EXT3_SB(sb);
 	es = EXT3_SB(sb)->s_es;
 	ext3_debug("goal=%lu.\n", goal);
-	if (test_opt(sb, RESERVATION) && S_ISREG(inode->i_mode))
-		my_rsv = &EXT3_I(inode)->i_rsv_window;
+	/*
+	 * Allocate a block from reservation only when
+	 * filesystem is mounted with reservation(default,-o reservation), and
+	 * it's a regular file, and
+	 * the desired window size is greater than 0 (One could use ioctl
+	 * command EXT3_IOC_SETRSVSZ to set the window size to 0 to turn off
+	 * reservation on that particular file)
+	 */
+	if (test_opt(sb, RESERVATION) &&
+		S_ISREG(inode->i_mode) &&
+		(atomic_read(&rsv->rsv_goal_size) > 0))
+		my_rsv = rsv;
 	if (!ext3_has_free_blocks(sbi)) {
 		*errp = -ENOSPC;
 		goto out;

_


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-18 22:55           ` [PATCH 2/3] ext3 reservation allow turn off for specifed file Mingming Cao
@ 2004-10-18 23:42             ` Andrew Morton
  2004-10-19  1:01               ` Mingming Cao
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2004-10-18 23:42 UTC (permalink / raw)
  To: Mingming Cao; +Cc: sct, pbadari, linux-kernel, ext2-devel

Mingming Cao <cmm@us.ibm.com> wrote:
>
> Allow user shut down reservation-based allocation(using ioctl) on a specific file(e.g. for seeky random write).

Applications currently pass a seeky-access hint into the kernel via
posix_fadvise(POSIX_FADV_RANDOM).  It would be nice to hook into that
rather than adding an ext3-specific ioctl.  Maybe just peeking at
file->f_ra.ra_pages would suffice.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-18 23:42             ` Andrew Morton
@ 2004-10-19  1:01               ` Mingming Cao
  2004-10-20 17:55                 ` Ray Lee
  0 siblings, 1 reply; 16+ messages in thread
From: Mingming Cao @ 2004-10-19  1:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: sct, pbadari, linux-kernel, ext2-devel

On Mon, 2004-10-18 at 16:42, Andrew Morton wrote:
> Mingming Cao <cmm@us.ibm.com> wrote:
> >
> > Allow user shut down reservation-based allocation(using ioctl) on a specific file(e.g. for seeky random write).
> 
> Applications currently pass a seeky-access hint into the kernel via
> posix_fadvise(POSIX_FADV_RANDOM).  It would be nice to hook into that
> rather than adding an ext3-specific ioctl.  Maybe just peeking at
> file->f_ra.ra_pages would suffice.
> 
> 

Ha, I think I did not make this clear in the description: The patch did
not add a new ext3-specific ioctl. We added two ioctl before for ext3
reservation code to allow user to get and set reservation window size,
so application could set it's desired reservation window size. It allows
the window size to be 0, however the current reservation code just
simply set the window size value to 0, but still try to allocate a size
0 reservation window. We should skip doing reservation-based allocation
at all if the desired window size is 0.

Just thought seeky random write application could use the existing ioctl
to let the kernel know it does not need reservation at all. Isn't that
more straightforward?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-19  1:01               ` Mingming Cao
@ 2004-10-20 17:55                 ` Ray Lee
  2004-10-25 20:33                   ` [Ext2-devel] " Mingming Cao
  0 siblings, 1 reply; 16+ messages in thread
From: Ray Lee @ 2004-10-20 17:55 UTC (permalink / raw)
  To: Mingming Cao; +Cc: Andrew Morton, sct, pbadari, Linux Kernel, ext2-devel

On Mon, 2004-10-18 at 18:01 -0700, Mingming Cao wrote:
> On Mon, 2004-10-18 at 16:42, Andrew Morton wrote:
> > Applications currently pass a seeky-access hint into the kernel via
> > posix_fadvise(POSIX_FADV_RANDOM).  It would be nice to hook into that

[...]

> Just thought seeky random write application could use the existing ioctl
> to let the kernel know it does not need reservation at all. Isn't that
> more straightforward?

Going the ioctl route seems to imply that userspace would have to do a
posix_fadvise() call and the ioctl, as opposed to just the fadvise. No?
I'm betting the fadvise call is a little more portable as well.

Ray


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-20 17:55                 ` Ray Lee
@ 2004-10-25 20:33                   ` Mingming Cao
  2004-10-25 23:05                     ` Mingming Cao
  0 siblings, 1 reply; 16+ messages in thread
From: Mingming Cao @ 2004-10-25 20:33 UTC (permalink / raw)
  To: Ray Lee; +Cc: Andrew Morton, sct, pbadari, Linux Kernel, ext2-devel

On Wed, 2004-10-20 at 10:55, Ray Lee wrote:
> On Mon, 2004-10-18 at 18:01 -0700, Mingming Cao wrote:
> > On Mon, 2004-10-18 at 16:42, Andrew Morton wrote:
> > > Applications currently pass a seeky-access hint into the kernel via
> > > posix_fadvise(POSIX_FADV_RANDOM).  It would be nice to hook into that
> 
> [...]
> 
> > Just thought seeky random write application could use the existing ioctl
> > to let the kernel know it does not need reservation at all. Isn't that
> > more straightforward?
> 
> Going the ioctl route seems to imply that userspace would have to do a
> posix_fadvise() call and the ioctl, as opposed to just the fadvise. No?
> I'm betting the fadvise call is a little more portable as well.

Agreed. How about this: add a check in ext3_prepare_write(), if user passed seeky-access hint through posix_fadvise(via check for file->f_ra.ra_pages == 0), if so, close the reservation window.  But we still need previous patch to handle window size 0(no reservation) in reservation code.

Currently the readahead is turned off if the userspace passed a seeky access hint to kernel by POSIX_FADVISE. It would be nice to turn off the block allocation reservation as well for seeky random write.


---

 linux-2.6.9-rc4-mm1-ming/fs/ext3/inode.c |    9 +++++++++
 1 files changed, 9 insertions(+)

diff -puN fs/ext3/inode.c~ext3_turn_off_reservation_by_fadvise fs/ext3/inode.c
--- linux-2.6.9-rc4-mm1/fs/ext3/inode.c~ext3_turn_off_reservation_by_fadvise	2004-10-25 18:25:44.135751800 -0700
+++ linux-2.6.9-rc4-mm1-ming/fs/ext3/inode.c	2004-10-25 18:34:39.501363856 -0700
@@ -1008,6 +1008,15 @@ retry:
 		ret = PTR_ERR(handle);
 		goto out;
 	}
+
+	/*
+	 * if user passed a seeky-access hint to kernel,
+	 * through POSIX_FADV_RANDOM,(file->r_ra.ra_pages is cleared)
+	 * turn off reservation for block allocation correspondingly.
+	 */
+	if (!file->f_ra.ra_pages)
+		atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, 0);
+
 	ret = block_prepare_write(page, from, to, ext3_get_block);
 	if (ret)
 		goto prepare_write_failed;

_


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-25 20:33                   ` [Ext2-devel] " Mingming Cao
@ 2004-10-25 23:05                     ` Mingming Cao
  2004-10-25 23:45                       ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Mingming Cao @ 2004-10-25 23:05 UTC (permalink / raw)
  To: Mingming Cao
  Cc: Ray Lee, Andrew Morton, sct, pbadari, Linux Kernel, ext2-devel

On Mon, 2004-10-25 at 13:33, Mingming Cao wrote:
> On Wed, 2004-10-20 at 10:55, Ray Lee wrote:
> > On Mon, 2004-10-18 at 18:01 -0700, Mingming Cao wrote:
> > > On Mon, 2004-10-18 at 16:42, Andrew Morton wrote:
> > > > Applications currently pass a seeky-access hint into the kernel via
> > > > posix_fadvise(POSIX_FADV_RANDOM).  It would be nice to hook into that
> > 
> > [...]
> > 
> > > Just thought seeky random write application could use the existing ioctl
> > > to let the kernel know it does not need reservation at all. Isn't that
> > > more straightforward?
> > 
> > Going the ioctl route seems to imply that userspace would have to do a
> > posix_fadvise() call and the ioctl, as opposed to just the fadvise. No?
> > I'm betting the fadvise call is a little more portable as well.
> 
> Agreed. How about this: add a check in ext3_prepare_write(), if user passed seeky-access hint through posix_fadvise(via check for file->f_ra.ra_pages == 0), if so, close the reservation window.  But we still need previous patch to handle window size 0(no reservation) in reservation code.

The previous patch could re-open the reservation in the case user then
switch back to POSIX_FADV_NORMAL or POSIZ_FADV_SEQUENTAIL. Updated
version included. We set the window size to -1 if the user pass the
POSIX_FADV_RANDOM, instead of 0, to differentiate the case when user
really want to close the window by ioctl.  We could re-open the window
only when user pass the POSIX_FADV_SEQUENTAIL (or POSIX_FADV_NORMAL
hint) and the window was closed because of previous seeky access hint.

Currently the readahead is turned off if the userspace passed a seeky access hint to kernel by POSIX_FADVISE. It would be nice to turn off the block allocation reservation as well for seeky random write.


---

 linux-2.6.9-rc4-mm1-ming/fs/ext3/balloc.c |    2 +-
 linux-2.6.9-rc4-mm1-ming/fs/ext3/inode.c  |   17 +++++++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff -puN fs/ext3/inode.c~ext3_turn_off_reservation_by_fadvise fs/ext3/inode.c
--- linux-2.6.9-rc4-mm1/fs/ext3/inode.c~ext3_turn_off_reservation_by_fadvise	2004-10-25 18:25:44.135751800 -0700
+++ linux-2.6.9-rc4-mm1-ming/fs/ext3/inode.c	2004-10-25 21:43:16.194964928 -0700
@@ -1001,6 +1001,7 @@ static int ext3_prepare_write(struct fil
 	int ret, needed_blocks = ext3_writepage_trans_blocks(inode);
 	handle_t *handle;
 	int retries = 0;
+	int windowsz = 0;
 
 retry:
 	handle = ext3_journal_start(inode, needed_blocks);
@@ -1008,6 +1009,22 @@ retry:
 		ret = PTR_ERR(handle);
 		goto out;
 	}
+
+	/*
+	 * if user passed a seeky-access hint to kernel,
+	 * through POSIX_FADV_RANDOM,(file->r_ra.ra_pages is cleared)
+	 * turn off reservation for block allocation correspondingly.
+	 *
+	 * Otherwise, if user switch back to POSIX_FADV_SEQUENTIAL or
+	 * POSIX_FADV_NORMAL, re-open the reservation window.
+	 */
+	windowsz = atomic_read(&EXT3_I(inode)->i_rsv_window.rsv_goal_size);
+	if ((windowsz > 0) && (!file->f_ra.ra_pages))
+		atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, -1);
+	if ((windowsz == -1) && file->f_ra.ra_pages)
+		atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size,
+					EXT3_DEFAULT_RESERVE_BLOCKS);
+
 	ret = block_prepare_write(page, from, to, ext3_get_block);
 	if (ret)
 		goto prepare_write_failed;
diff -puN fs/ext3/balloc.c~ext3_turn_off_reservation_by_fadvise fs/ext3/balloc.c
--- linux-2.6.9-rc4-mm1/fs/ext3/balloc.c~ext3_turn_off_reservation_by_fadvise	2004-10-25 21:23:05.152071432 -0700
+++ linux-2.6.9-rc4-mm1-ming/fs/ext3/balloc.c	2004-10-25 21:24:48.136415432 -0700
@@ -1154,7 +1154,7 @@ int ext3_new_block(handle_t *handle, str
 	struct ext3_sb_info *sbi;
 	struct reserve_window_node *my_rsv = NULL;
 	struct reserve_window_node *rsv = &EXT3_I(inode)->i_rsv_window;
-	unsigned short windowsz = 0;
+	int windowsz = 0;
 #ifdef EXT3FS_DEBUG
 	static int goal_hits, goal_attempts;
 #endif

_


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-25 23:05                     ` Mingming Cao
@ 2004-10-25 23:45                       ` Andrew Morton
  2004-10-26 16:53                         ` Mingming Cao
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2004-10-25 23:45 UTC (permalink / raw)
  To: Mingming Cao; +Cc: cmm, ray-lk, sct, pbadari, linux-kernel, ext2-devel

Mingming Cao <cmm@us.ibm.com> wrote:
>
> 	/*
> +	 * if user passed a seeky-access hint to kernel,
> +	 * through POSIX_FADV_RANDOM,(file->r_ra.ra_pages is cleared)
> +	 * turn off reservation for block allocation correspondingly.
> +	 *
> +	 * Otherwise, if user switch back to POSIX_FADV_SEQUENTIAL or
> +	 * POSIX_FADV_NORMAL, re-open the reservation window.
> +	 */
> +	windowsz = atomic_read(&EXT3_I(inode)->i_rsv_window.rsv_goal_size);
> +	if ((windowsz > 0) && (!file->f_ra.ra_pages))
> +		atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, -1);
> +	if ((windowsz == -1) && file->f_ra.ra_pages)
> +		atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size,
> +					EXT3_DEFAULT_RESERVE_BLOCKS);
> +

It's pretty sad that we add this extra code into ext3_prepare_write() -
it's almost never actually executed.

I wonder how important this optimisation really is?  I bet no applications
are using posix_fadvise(POSIX_FADV_RANDOM) anyway.

It we really see that benefits are available from this approach we probably
need to bite the bullet and add file_operations.fadvise()

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-25 23:45                       ` Andrew Morton
@ 2004-10-26 16:53                         ` Mingming Cao
  2004-10-26 20:18                           ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Mingming Cao @ 2004-10-26 16:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: ray-lk, sct, pbadari, linux-kernel, ext2-devel

On Mon, 2004-10-25 at 16:45, Andrew Morton wrote:
> Mingming Cao <cmm@us.ibm.com> wrote:
> >
> > 	/*
> > +	 * if user passed a seeky-access hint to kernel,
> > +	 * through POSIX_FADV_RANDOM,(file->r_ra.ra_pages is cleared)
> > +	 * turn off reservation for block allocation correspondingly.
> > +	 *
> > +	 * Otherwise, if user switch back to POSIX_FADV_SEQUENTIAL or
> > +	 * POSIX_FADV_NORMAL, re-open the reservation window.
> > +	 */
> > +	windowsz = atomic_read(&EXT3_I(inode)->i_rsv_window.rsv_goal_size);
> > +	if ((windowsz > 0) && (!file->f_ra.ra_pages))
> > +		atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, -1);
> > +	if ((windowsz == -1) && file->f_ra.ra_pages)
> > +		atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size,
> > +					EXT3_DEFAULT_RESERVE_BLOCKS);
> > +
> 
> It's pretty sad that we add this extra code into ext3_prepare_write() -
> it's almost never actually executed.
> 
:(

> I wonder how important this optimisation really is?  I bet no applications
> are using posix_fadvise(POSIX_FADV_RANDOM) anyway.
> 
I don't know if there is application using the POSIX_FADV_RANDOM. No? If
this is the truth, I think we don't need this optimization at present.
Logically reservation does not benefit seeky random write, but there is
no benchmark showing performance issue so far. We have already provided
ways for applications turn off reservation through the existing ioctl
for specified file and -o noreservation mount option for the whole
filesystem.

> It we really see that benefits are available from this approach we probably
> need to bite the bullet and add file_operations.fadvise()
> 
I agree.

Mingming



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-26 16:53                         ` Mingming Cao
@ 2004-10-26 20:18                           ` Andrew Morton
  2004-10-26 23:01                             ` Mingming Cao
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2004-10-26 20:18 UTC (permalink / raw)
  To: Mingming Cao; +Cc: ray-lk, sct, pbadari, linux-kernel, ext2-devel

Mingming Cao <cmm@us.ibm.com> wrote:
>
> > I wonder how important this optimisation really is?  I bet no applications
>  > are using posix_fadvise(POSIX_FADV_RANDOM) anyway.
>  > 
>  I don't know if there is application using the POSIX_FADV_RANDOM. No? If
>  this is the truth, I think we don't need this optimization at present.
>  Logically reservation does not benefit seeky random write, but there is
>  no benchmark showing performance issue so far. We have already provided
>  ways for applications turn off reservation through the existing ioctl
>  for specified file and -o noreservation mount option for the whole
>  filesystem.

Well we definitely don't want to be encouraging application developers to be
adding ext3-specific ioctls.  So we need to work out if any applications
can get significant benefit from manually disabling reservations and if
so, wire up fadvise() into filesystems and do it that way.

Do you know if disabling reservations helps any workloads?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file
  2004-10-26 20:18                           ` Andrew Morton
@ 2004-10-26 23:01                             ` Mingming Cao
  0 siblings, 0 replies; 16+ messages in thread
From: Mingming Cao @ 2004-10-26 23:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: ray-lk, sct, pbadari, linux-kernel, ext2-devel

On Tue, 2004-10-26 at 13:18, Andrew Morton wrote:
> Mingming Cao <cmm@us.ibm.com> wrote:
> >
> > > I wonder how important this optimisation really is?  I bet no applications
> >  > are using posix_fadvise(POSIX_FADV_RANDOM) anyway.
> >  > 
> >  I don't know if there is application using the POSIX_FADV_RANDOM. No? If
> >  this is the truth, I think we don't need this optimization at present.
> >  Logically reservation does not benefit seeky random write, but there is
> >  no benchmark showing performance issue so far. We have already provided
> >  ways for applications turn off reservation through the existing ioctl
> >  for specified file and -o noreservation mount option for the whole
> >  filesystem.
> 
> Well we definitely don't want to be encouraging application developers to be
> adding ext3-specific ioctls.  So we need to work out if any applications
> can get significant benefit from manually disabling reservations and if
> so, wire up fadvise() into filesystems and do it that way.
> 
Okey. Also, if there is such a need to disable reservation, maybe we
could think about closing the reservation window dynamically based on
past I/O pattern, when user application did not give any hint.

> Do you know if disabling reservations helps any workloads?
> 
Not that I aware of.  We have done several microbenchmark on sequential
and random workload, haven't seen regression with reservations so far.
But our testing is certainly not comprehensive.

Mingming


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2004-10-26 22:59 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-10-15 13:27 Ext3 -mm reservations code: is this fix really correct? Stephen C. Tweedie
2004-10-15 16:01 ` [Ext2-devel] " mingming cao
2004-10-15 16:40   ` Stephen C. Tweedie
2004-10-15 20:29     ` mingming cao
2004-10-15 22:20       ` Stephen C. Tweedie
2004-10-15 22:34         ` mingming cao
2004-10-18 22:55           ` [PATCH 2/3] ext3 reservation allow turn off for specifed file Mingming Cao
2004-10-18 23:42             ` Andrew Morton
2004-10-19  1:01               ` Mingming Cao
2004-10-20 17:55                 ` Ray Lee
2004-10-25 20:33                   ` [Ext2-devel] " Mingming Cao
2004-10-25 23:05                     ` Mingming Cao
2004-10-25 23:45                       ` Andrew Morton
2004-10-26 16:53                         ` Mingming Cao
2004-10-26 20:18                           ` Andrew Morton
2004-10-26 23:01                             ` Mingming Cao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox