* Ext3 -mm reservations code: is this fix really correct? @ 2004-10-15 13:27 Stephen C. Tweedie 2004-10-15 16:01 ` [Ext2-devel] " mingming cao 0 siblings, 1 reply; 16+ messages in thread From: Stephen C. Tweedie @ 2004-10-15 13:27 UTC (permalink / raw) To: Mingming Cao, Badari Pulavarty Cc: linux-kernel, ext2-devel@lists.sourceforge.net, Stephen Tweedie, Andrew Morton Hi, In ext3-reservations-window-allocation-fix.patch from -mm, we try to make sure that we always search for a new reservation from the goal forwards, not just from the previous window forwards. I'm assuming this is done to optimise random writes. I'm still not convinced we get it right. In alloc_new_reservation(), we do: if ((my_rsv->rsv_start <= group_end_block) && (my_rsv->rsv_end > group_end_block)) return -1; We get into alloc_new_reservation in the first place either when the goal is outside the window, or we could not allocate inside the window. Now, in the latter case, the check is correct --- if the window spans two block groups and we couldn't allocate in the first block group, then we should continue in the next one. But what if the goal was in the current block group, but was *prior* to the window? The goal is outside the window, yet the above check may still be true, and we'll incorrectly decide to avoid the current block group entirely. I think we need an "&& start_block >= my_rsv->rsv_start" to deal with this. If we get past that test --- the reservation window is all entirely inside one group --- then we have the following chunk in xt3-reservations-window-allocation-fix.patch: - /* remember where we are before we discard the old one */ - if (my_rsv->rsv_end + 1 > start_block) - start_block = my_rsv->rsv_end + 1; - search_head = my_rsv; + search_head = search_reserve_window(&my_rsv->rsv_node, start_block); which I'm assuming is trying to pin the search start to the goal block. But that's wrong --- search_reserve_window does a downwards-traversing tree search, and needs to be given the root of the tree in order to find a given block. Giving it the current reservation node as the search start point is not going to allow it to find the right node in all cases. Have I misunderstood something? Fortunately, none of the above should affect the normal hot path of sequential allocation, but it may well penalise random writes. Cheers, Stephen ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct? 2004-10-15 13:27 Ext3 -mm reservations code: is this fix really correct? Stephen C. Tweedie @ 2004-10-15 16:01 ` mingming cao 2004-10-15 16:40 ` Stephen C. Tweedie 0 siblings, 1 reply; 16+ messages in thread From: mingming cao @ 2004-10-15 16:01 UTC (permalink / raw) To: Stephen C. Tweedie Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net, Andrew Morton On Fri, 2004-10-15 at 06:27, Stephen C. Tweedie wrote: > Hi, > > In ext3-reservations-window-allocation-fix.patch from -mm, we try to > make sure that we always search for a new reservation from the goal > forwards, not just from the previous window forwards. I'm assuming this > is done to optimise random writes. > > I'm still not convinced we get it right. In alloc_new_reservation(), we > do: > > if ((my_rsv->rsv_start <= group_end_block) && > (my_rsv->rsv_end > group_end_block)) > return -1; > > We get into alloc_new_reservation in the first place either when the > goal is outside the window, or we could not allocate inside the window. > > Now, in the latter case, the check is correct --- if the window spans > two block groups and we couldn't allocate in the first block group, then > we should continue in the next one. > > But what if the goal was in the current block group, but was *prior* to > the window? The goal is outside the window, yet the above check may > still be true, and we'll incorrectly decide to avoid the current block > group entirely. > > I think we need an "&& start_block >= my_rsv->rsv_start" to deal with > this. > You are right. I think at the beginning we were decide the honor the the reservation window if the goal is out of the window. Considering the goal is selected with good reason, we agreed that we should try honor the goal block all the time.But we missed this case apparently. > If we get past that test --- the reservation window is all entirely > inside one group --- then we have the following chunk in > xt3-reservations-window-allocation-fix.patch: > > - /* remember where we are before we discard the old one */ > - if (my_rsv->rsv_end + 1 > start_block) > - start_block = my_rsv->rsv_end + 1; > - search_head = my_rsv; > + search_head = search_reserve_window(&my_rsv->rsv_node, start_block); > > which I'm assuming is trying to pin the search start to the goal block. > But that's wrong --- search_reserve_window does a downwards-traversing > tree search, and needs to be given the root of the tree in order to find > a given block. Giving it the current reservation node as the search > start point is not going to allow it to find the right node in all > cases. > > Have I misunderstood something? > You are correct, again:) We should do a search_reserve_window() from the root. I will post a fix for these two soon. Mingming ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct? 2004-10-15 16:01 ` [Ext2-devel] " mingming cao @ 2004-10-15 16:40 ` Stephen C. Tweedie 2004-10-15 20:29 ` mingming cao 0 siblings, 1 reply; 16+ messages in thread From: Stephen C. Tweedie @ 2004-10-15 16:40 UTC (permalink / raw) To: Mingming Cao Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net, Andrew Morton Hi, On Fri, 2004-10-15 at 17:01, mingming cao wrote: > > Have I misunderstood something? > > > You are correct, again:) We should do a search_reserve_window() from the > root. > > I will post a fix for these two soon. Thanks. I'll be away for a few days so I probably won't be able to look at the fix until Wednesday next week. Cheers, Stephen ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct? 2004-10-15 16:40 ` Stephen C. Tweedie @ 2004-10-15 20:29 ` mingming cao 2004-10-15 22:20 ` Stephen C. Tweedie 0 siblings, 1 reply; 16+ messages in thread From: mingming cao @ 2004-10-15 20:29 UTC (permalink / raw) To: Stephen C. Tweedie Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net, Andrew Morton [-- Attachment #1: Type: text/plain, Size: 494 bytes --] On Fri, 2004-10-15 at 09:40, Stephen C. Tweedie wrote: > Hi, > > On Fri, 2004-10-15 at 17:01, mingming cao wrote: > > > > Have I misunderstood something? > > > > > You are correct, again:) We should do a search_reserve_window() from the > > root. > > > > I will post a fix for these two soon. > > Thanks. I'll be away for a few days so I probably won't be able to look > at the fix until Wednesday next week. > How about this? Haven't test it, will do it shortly.:) Thanks, Mingming [-- Attachment #2: ext3_reservation_window_fix_fix.patch --] [-- Type: text/plain, Size: 2424 bytes --] --- linux-2.6.9-rc1-mm5-ming/fs/ext3/balloc.c | 21 +++++++++------------ 1 files changed, 9 insertions(+), 12 deletions(-) diff -puN fs/ext3/balloc.c~ext3_reservation_window_fix_fix fs/ext3/balloc.c --- linux-2.6.9-rc1-mm5/fs/ext3/balloc.c~ext3_reservation_window_fix_fix 2004-10-15 18:23:42.824158856 -0700 +++ linux-2.6.9-rc1-mm5-ming/fs/ext3/balloc.c 2004-10-15 20:16:06.037034680 -0700 @@ -184,9 +184,10 @@ goal_in_my_reservation(struct reserve_wi * if the goal is not in any window. * Returns NULL if there are no windows or if all windows start after the goal. */ -static struct reserve_window_node *search_reserve_window(struct rb_node *n, +static struct reserve_window_node *search_reserve_window(struct rb_root *root, unsigned long goal) { + struct rb_node *n = root->rb_node; struct reserve_window_node *rsv; if (!n) @@ -822,10 +823,10 @@ static int alloc_new_reservation(struct start_block = goal + group_first_block; size = atomic_read(&my_rsv->rsv_goal_size); - /* if we have a old reservation, start the search from the old rsv */ if (!rsv_is_empty(&my_rsv->rsv_window)) { /* * if the old reservation is cross group boundary + * and if the goal is inside the old reservation window, * we will come here when we just failed to allocate from * the first part of the window. We still have another part * that belongs to the next group. In this case, there is no @@ -838,10 +839,10 @@ static int alloc_new_reservation(struct */ if ((my_rsv->rsv_start <= group_end_block) && - (my_rsv->rsv_end > group_end_block)) + (my_rsv->rsv_end > group_end_block) && + (start_block <= my_rsv->rsv_start)) return -1; - search_head = search_reserve_window(&my_rsv->rsv_node, start_block); if ((atomic_read(&my_rsv->rsv_alloc_hit) > (my_rsv->rsv_end - my_rsv->rsv_start + 1) / 2)) { /* @@ -855,14 +856,10 @@ static int alloc_new_reservation(struct atomic_set(&my_rsv->rsv_goal_size, size); } } - else { - /* - * we don't have a reservation, - * we set our goal(start_block) and - * the list head for the search - */ - search_head = search_reserve_window(fs_rsv_root->rb_node, start_block); - } + /* + * shift the search start to the window near the goal block + */ + search_head = search_reserve_window(fs_rsv_root, start_block); /* * find_next_reservable_window() simply finds a reservable window _ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct? 2004-10-15 20:29 ` mingming cao @ 2004-10-15 22:20 ` Stephen C. Tweedie 2004-10-15 22:34 ` mingming cao 0 siblings, 1 reply; 16+ messages in thread From: Stephen C. Tweedie @ 2004-10-15 22:20 UTC (permalink / raw) To: Mingming Cao Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net, Andrew Morton, Stephen Tweedie Hi, On Fri, 2004-10-15 at 21:29, mingming cao wrote: > How about this? Haven't test it, will do it shortly.:) if ((my_rsv->rsv_start <= group_end_block) && - (my_rsv->rsv_end > group_end_block)) + (my_rsv->rsv_end > group_end_block) && + (start_block <= my_rsv->rsv_start)) return -1; That new "<=" should be a ">=", shouldn't it? Otherwise, looks OK. --Stephen ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Ext3 -mm reservations code: is this fix really correct? 2004-10-15 22:20 ` Stephen C. Tweedie @ 2004-10-15 22:34 ` mingming cao 2004-10-18 22:55 ` [PATCH 2/3] ext3 reservation allow turn off for specifed file Mingming Cao 0 siblings, 1 reply; 16+ messages in thread From: mingming cao @ 2004-10-15 22:34 UTC (permalink / raw) To: Stephen C. Tweedie Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net, Andrew Morton [-- Attachment #1: Type: text/plain, Size: 622 bytes --] On Fri, 2004-10-15 at 15:20, Stephen C. Tweedie wrote: > Hi, > > On Fri, 2004-10-15 at 21:29, mingming cao wrote: > > > How about this? Haven't test it, will do it shortly.:) > > if ((my_rsv->rsv_start <= group_end_block) && > - (my_rsv->rsv_end > group_end_block)) > + (my_rsv->rsv_end > group_end_block) && > + (start_block <= my_rsv->rsv_start)) > return -1; > > That new "<=" should be a ">=", shouldn't it? My bad! It should be ">=":) Updated patch attached. Thanks! Mingming [-- Attachment #2: ext3_reservation_window_fix_fix.patch --] [-- Type: text/plain, Size: 2424 bytes --] --- linux-2.6.9-rc1-mm5-ming/fs/ext3/balloc.c | 21 +++++++++------------ 1 files changed, 9 insertions(+), 12 deletions(-) diff -puN fs/ext3/balloc.c~ext3_reservation_window_fix_fix fs/ext3/balloc.c --- linux-2.6.9-rc1-mm5/fs/ext3/balloc.c~ext3_reservation_window_fix_fix 2004-10-15 18:23:42.824158856 -0700 +++ linux-2.6.9-rc1-mm5-ming/fs/ext3/balloc.c 2004-10-15 22:38:52.895674208 -0700 @@ -184,9 +184,10 @@ goal_in_my_reservation(struct reserve_wi * if the goal is not in any window. * Returns NULL if there are no windows or if all windows start after the goal. */ -static struct reserve_window_node *search_reserve_window(struct rb_node *n, +static struct reserve_window_node *search_reserve_window(struct rb_root *root, unsigned long goal) { + struct rb_node *n = root->rb_node; struct reserve_window_node *rsv; if (!n) @@ -822,10 +823,10 @@ static int alloc_new_reservation(struct start_block = goal + group_first_block; size = atomic_read(&my_rsv->rsv_goal_size); - /* if we have a old reservation, start the search from the old rsv */ if (!rsv_is_empty(&my_rsv->rsv_window)) { /* * if the old reservation is cross group boundary + * and if the goal is inside the old reservation window, * we will come here when we just failed to allocate from * the first part of the window. We still have another part * that belongs to the next group. In this case, there is no @@ -838,10 +839,10 @@ static int alloc_new_reservation(struct */ if ((my_rsv->rsv_start <= group_end_block) && - (my_rsv->rsv_end > group_end_block)) + (my_rsv->rsv_end > group_end_block) && + (start_block >= my_rsv->rsv_start)) return -1; - search_head = search_reserve_window(&my_rsv->rsv_node, start_block); if ((atomic_read(&my_rsv->rsv_alloc_hit) > (my_rsv->rsv_end - my_rsv->rsv_start + 1) / 2)) { /* @@ -855,14 +856,10 @@ static int alloc_new_reservation(struct atomic_set(&my_rsv->rsv_goal_size, size); } } - else { - /* - * we don't have a reservation, - * we set our goal(start_block) and - * the list head for the search - */ - search_head = search_reserve_window(fs_rsv_root->rb_node, start_block); - } + /* + * shift the search start to the window near the goal block + */ + search_head = search_reserve_window(fs_rsv_root, start_block); /* * find_next_reservable_window() simply finds a reservable window _ ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-15 22:34 ` mingming cao @ 2004-10-18 22:55 ` Mingming Cao 2004-10-18 23:42 ` Andrew Morton 0 siblings, 1 reply; 16+ messages in thread From: Mingming Cao @ 2004-10-18 22:55 UTC (permalink / raw) To: akpm, Stephen C. Tweedie Cc: Badari Pulavarty, linux-kernel, ext2-devel@lists.sourceforge.net Allow user shut down reservation-based allocation(using ioctl) on a specific file(e.g. for seeky random write). --- linux-2.6.9-rc4-mm1-ming/fs/ext3/balloc.c | 15 +++++++++++++-- 1 files changed, 13 insertions(+), 2 deletions(-) diff -puN fs/ext3/balloc.c~ext3_shutdown_reservation_per-file fs/ext3/balloc.c --- linux-2.6.9-rc4-mm1/fs/ext3/balloc.c~ext3_shutdown_reservation_per-file 2004-10-18 22:27:06.333698488 -0700 +++ linux-2.6.9-rc4-mm1-ming/fs/ext3/balloc.c 2004-10-18 22:34:52.825780912 -0700 @@ -1153,6 +1153,7 @@ int ext3_new_block(handle_t *handle, str struct ext3_super_block *es; struct ext3_sb_info *sbi; struct reserve_window_node *my_rsv = NULL; + struct reserve_window_node *rsv = &EXT3_I(inode)->i_rsv_window; #ifdef EXT3FS_DEBUG static int goal_hits, goal_attempts; #endif @@ -1176,8 +1177,18 @@ int ext3_new_block(handle_t *handle, str sbi = EXT3_SB(sb); es = EXT3_SB(sb)->s_es; ext3_debug("goal=%lu.\n", goal); - if (test_opt(sb, RESERVATION) && S_ISREG(inode->i_mode)) - my_rsv = &EXT3_I(inode)->i_rsv_window; + /* + * Allocate a block from reservation only when + * filesystem is mounted with reservation(default,-o reservation), and + * it's a regular file, and + * the desired window size is greater than 0 (One could use ioctl + * command EXT3_IOC_SETRSVSZ to set the window size to 0 to turn off + * reservation on that particular file) + */ + if (test_opt(sb, RESERVATION) && + S_ISREG(inode->i_mode) && + (atomic_read(&rsv->rsv_goal_size) > 0)) + my_rsv = rsv; if (!ext3_has_free_blocks(sbi)) { *errp = -ENOSPC; goto out; _ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-18 22:55 ` [PATCH 2/3] ext3 reservation allow turn off for specifed file Mingming Cao @ 2004-10-18 23:42 ` Andrew Morton 2004-10-19 1:01 ` Mingming Cao 0 siblings, 1 reply; 16+ messages in thread From: Andrew Morton @ 2004-10-18 23:42 UTC (permalink / raw) To: Mingming Cao; +Cc: sct, pbadari, linux-kernel, ext2-devel Mingming Cao <cmm@us.ibm.com> wrote: > > Allow user shut down reservation-based allocation(using ioctl) on a specific file(e.g. for seeky random write). Applications currently pass a seeky-access hint into the kernel via posix_fadvise(POSIX_FADV_RANDOM). It would be nice to hook into that rather than adding an ext3-specific ioctl. Maybe just peeking at file->f_ra.ra_pages would suffice. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-18 23:42 ` Andrew Morton @ 2004-10-19 1:01 ` Mingming Cao 2004-10-20 17:55 ` Ray Lee 0 siblings, 1 reply; 16+ messages in thread From: Mingming Cao @ 2004-10-19 1:01 UTC (permalink / raw) To: Andrew Morton; +Cc: sct, pbadari, linux-kernel, ext2-devel On Mon, 2004-10-18 at 16:42, Andrew Morton wrote: > Mingming Cao <cmm@us.ibm.com> wrote: > > > > Allow user shut down reservation-based allocation(using ioctl) on a specific file(e.g. for seeky random write). > > Applications currently pass a seeky-access hint into the kernel via > posix_fadvise(POSIX_FADV_RANDOM). It would be nice to hook into that > rather than adding an ext3-specific ioctl. Maybe just peeking at > file->f_ra.ra_pages would suffice. > > Ha, I think I did not make this clear in the description: The patch did not add a new ext3-specific ioctl. We added two ioctl before for ext3 reservation code to allow user to get and set reservation window size, so application could set it's desired reservation window size. It allows the window size to be 0, however the current reservation code just simply set the window size value to 0, but still try to allocate a size 0 reservation window. We should skip doing reservation-based allocation at all if the desired window size is 0. Just thought seeky random write application could use the existing ioctl to let the kernel know it does not need reservation at all. Isn't that more straightforward? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-19 1:01 ` Mingming Cao @ 2004-10-20 17:55 ` Ray Lee 2004-10-25 20:33 ` [Ext2-devel] " Mingming Cao 0 siblings, 1 reply; 16+ messages in thread From: Ray Lee @ 2004-10-20 17:55 UTC (permalink / raw) To: Mingming Cao; +Cc: Andrew Morton, sct, pbadari, Linux Kernel, ext2-devel On Mon, 2004-10-18 at 18:01 -0700, Mingming Cao wrote: > On Mon, 2004-10-18 at 16:42, Andrew Morton wrote: > > Applications currently pass a seeky-access hint into the kernel via > > posix_fadvise(POSIX_FADV_RANDOM). It would be nice to hook into that [...] > Just thought seeky random write application could use the existing ioctl > to let the kernel know it does not need reservation at all. Isn't that > more straightforward? Going the ioctl route seems to imply that userspace would have to do a posix_fadvise() call and the ioctl, as opposed to just the fadvise. No? I'm betting the fadvise call is a little more portable as well. Ray ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-20 17:55 ` Ray Lee @ 2004-10-25 20:33 ` Mingming Cao 2004-10-25 23:05 ` Mingming Cao 0 siblings, 1 reply; 16+ messages in thread From: Mingming Cao @ 2004-10-25 20:33 UTC (permalink / raw) To: Ray Lee; +Cc: Andrew Morton, sct, pbadari, Linux Kernel, ext2-devel On Wed, 2004-10-20 at 10:55, Ray Lee wrote: > On Mon, 2004-10-18 at 18:01 -0700, Mingming Cao wrote: > > On Mon, 2004-10-18 at 16:42, Andrew Morton wrote: > > > Applications currently pass a seeky-access hint into the kernel via > > > posix_fadvise(POSIX_FADV_RANDOM). It would be nice to hook into that > > [...] > > > Just thought seeky random write application could use the existing ioctl > > to let the kernel know it does not need reservation at all. Isn't that > > more straightforward? > > Going the ioctl route seems to imply that userspace would have to do a > posix_fadvise() call and the ioctl, as opposed to just the fadvise. No? > I'm betting the fadvise call is a little more portable as well. Agreed. How about this: add a check in ext3_prepare_write(), if user passed seeky-access hint through posix_fadvise(via check for file->f_ra.ra_pages == 0), if so, close the reservation window. But we still need previous patch to handle window size 0(no reservation) in reservation code. Currently the readahead is turned off if the userspace passed a seeky access hint to kernel by POSIX_FADVISE. It would be nice to turn off the block allocation reservation as well for seeky random write. --- linux-2.6.9-rc4-mm1-ming/fs/ext3/inode.c | 9 +++++++++ 1 files changed, 9 insertions(+) diff -puN fs/ext3/inode.c~ext3_turn_off_reservation_by_fadvise fs/ext3/inode.c --- linux-2.6.9-rc4-mm1/fs/ext3/inode.c~ext3_turn_off_reservation_by_fadvise 2004-10-25 18:25:44.135751800 -0700 +++ linux-2.6.9-rc4-mm1-ming/fs/ext3/inode.c 2004-10-25 18:34:39.501363856 -0700 @@ -1008,6 +1008,15 @@ retry: ret = PTR_ERR(handle); goto out; } + + /* + * if user passed a seeky-access hint to kernel, + * through POSIX_FADV_RANDOM,(file->r_ra.ra_pages is cleared) + * turn off reservation for block allocation correspondingly. + */ + if (!file->f_ra.ra_pages) + atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, 0); + ret = block_prepare_write(page, from, to, ext3_get_block); if (ret) goto prepare_write_failed; _ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-25 20:33 ` [Ext2-devel] " Mingming Cao @ 2004-10-25 23:05 ` Mingming Cao 2004-10-25 23:45 ` Andrew Morton 0 siblings, 1 reply; 16+ messages in thread From: Mingming Cao @ 2004-10-25 23:05 UTC (permalink / raw) To: Mingming Cao Cc: Ray Lee, Andrew Morton, sct, pbadari, Linux Kernel, ext2-devel On Mon, 2004-10-25 at 13:33, Mingming Cao wrote: > On Wed, 2004-10-20 at 10:55, Ray Lee wrote: > > On Mon, 2004-10-18 at 18:01 -0700, Mingming Cao wrote: > > > On Mon, 2004-10-18 at 16:42, Andrew Morton wrote: > > > > Applications currently pass a seeky-access hint into the kernel via > > > > posix_fadvise(POSIX_FADV_RANDOM). It would be nice to hook into that > > > > [...] > > > > > Just thought seeky random write application could use the existing ioctl > > > to let the kernel know it does not need reservation at all. Isn't that > > > more straightforward? > > > > Going the ioctl route seems to imply that userspace would have to do a > > posix_fadvise() call and the ioctl, as opposed to just the fadvise. No? > > I'm betting the fadvise call is a little more portable as well. > > Agreed. How about this: add a check in ext3_prepare_write(), if user passed seeky-access hint through posix_fadvise(via check for file->f_ra.ra_pages == 0), if so, close the reservation window. But we still need previous patch to handle window size 0(no reservation) in reservation code. The previous patch could re-open the reservation in the case user then switch back to POSIX_FADV_NORMAL or POSIZ_FADV_SEQUENTAIL. Updated version included. We set the window size to -1 if the user pass the POSIX_FADV_RANDOM, instead of 0, to differentiate the case when user really want to close the window by ioctl. We could re-open the window only when user pass the POSIX_FADV_SEQUENTAIL (or POSIX_FADV_NORMAL hint) and the window was closed because of previous seeky access hint. Currently the readahead is turned off if the userspace passed a seeky access hint to kernel by POSIX_FADVISE. It would be nice to turn off the block allocation reservation as well for seeky random write. --- linux-2.6.9-rc4-mm1-ming/fs/ext3/balloc.c | 2 +- linux-2.6.9-rc4-mm1-ming/fs/ext3/inode.c | 17 +++++++++++++++++ 2 files changed, 18 insertions(+), 1 deletion(-) diff -puN fs/ext3/inode.c~ext3_turn_off_reservation_by_fadvise fs/ext3/inode.c --- linux-2.6.9-rc4-mm1/fs/ext3/inode.c~ext3_turn_off_reservation_by_fadvise 2004-10-25 18:25:44.135751800 -0700 +++ linux-2.6.9-rc4-mm1-ming/fs/ext3/inode.c 2004-10-25 21:43:16.194964928 -0700 @@ -1001,6 +1001,7 @@ static int ext3_prepare_write(struct fil int ret, needed_blocks = ext3_writepage_trans_blocks(inode); handle_t *handle; int retries = 0; + int windowsz = 0; retry: handle = ext3_journal_start(inode, needed_blocks); @@ -1008,6 +1009,22 @@ retry: ret = PTR_ERR(handle); goto out; } + + /* + * if user passed a seeky-access hint to kernel, + * through POSIX_FADV_RANDOM,(file->r_ra.ra_pages is cleared) + * turn off reservation for block allocation correspondingly. + * + * Otherwise, if user switch back to POSIX_FADV_SEQUENTIAL or + * POSIX_FADV_NORMAL, re-open the reservation window. + */ + windowsz = atomic_read(&EXT3_I(inode)->i_rsv_window.rsv_goal_size); + if ((windowsz > 0) && (!file->f_ra.ra_pages)) + atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, -1); + if ((windowsz == -1) && file->f_ra.ra_pages) + atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, + EXT3_DEFAULT_RESERVE_BLOCKS); + ret = block_prepare_write(page, from, to, ext3_get_block); if (ret) goto prepare_write_failed; diff -puN fs/ext3/balloc.c~ext3_turn_off_reservation_by_fadvise fs/ext3/balloc.c --- linux-2.6.9-rc4-mm1/fs/ext3/balloc.c~ext3_turn_off_reservation_by_fadvise 2004-10-25 21:23:05.152071432 -0700 +++ linux-2.6.9-rc4-mm1-ming/fs/ext3/balloc.c 2004-10-25 21:24:48.136415432 -0700 @@ -1154,7 +1154,7 @@ int ext3_new_block(handle_t *handle, str struct ext3_sb_info *sbi; struct reserve_window_node *my_rsv = NULL; struct reserve_window_node *rsv = &EXT3_I(inode)->i_rsv_window; - unsigned short windowsz = 0; + int windowsz = 0; #ifdef EXT3FS_DEBUG static int goal_hits, goal_attempts; #endif _ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-25 23:05 ` Mingming Cao @ 2004-10-25 23:45 ` Andrew Morton 2004-10-26 16:53 ` Mingming Cao 0 siblings, 1 reply; 16+ messages in thread From: Andrew Morton @ 2004-10-25 23:45 UTC (permalink / raw) To: Mingming Cao; +Cc: cmm, ray-lk, sct, pbadari, linux-kernel, ext2-devel Mingming Cao <cmm@us.ibm.com> wrote: > > /* > + * if user passed a seeky-access hint to kernel, > + * through POSIX_FADV_RANDOM,(file->r_ra.ra_pages is cleared) > + * turn off reservation for block allocation correspondingly. > + * > + * Otherwise, if user switch back to POSIX_FADV_SEQUENTIAL or > + * POSIX_FADV_NORMAL, re-open the reservation window. > + */ > + windowsz = atomic_read(&EXT3_I(inode)->i_rsv_window.rsv_goal_size); > + if ((windowsz > 0) && (!file->f_ra.ra_pages)) > + atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, -1); > + if ((windowsz == -1) && file->f_ra.ra_pages) > + atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, > + EXT3_DEFAULT_RESERVE_BLOCKS); > + It's pretty sad that we add this extra code into ext3_prepare_write() - it's almost never actually executed. I wonder how important this optimisation really is? I bet no applications are using posix_fadvise(POSIX_FADV_RANDOM) anyway. It we really see that benefits are available from this approach we probably need to bite the bullet and add file_operations.fadvise() ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-25 23:45 ` Andrew Morton @ 2004-10-26 16:53 ` Mingming Cao 2004-10-26 20:18 ` Andrew Morton 0 siblings, 1 reply; 16+ messages in thread From: Mingming Cao @ 2004-10-26 16:53 UTC (permalink / raw) To: Andrew Morton; +Cc: ray-lk, sct, pbadari, linux-kernel, ext2-devel On Mon, 2004-10-25 at 16:45, Andrew Morton wrote: > Mingming Cao <cmm@us.ibm.com> wrote: > > > > /* > > + * if user passed a seeky-access hint to kernel, > > + * through POSIX_FADV_RANDOM,(file->r_ra.ra_pages is cleared) > > + * turn off reservation for block allocation correspondingly. > > + * > > + * Otherwise, if user switch back to POSIX_FADV_SEQUENTIAL or > > + * POSIX_FADV_NORMAL, re-open the reservation window. > > + */ > > + windowsz = atomic_read(&EXT3_I(inode)->i_rsv_window.rsv_goal_size); > > + if ((windowsz > 0) && (!file->f_ra.ra_pages)) > > + atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, -1); > > + if ((windowsz == -1) && file->f_ra.ra_pages) > > + atomic_set(&EXT3_I(inode)->i_rsv_window.rsv_goal_size, > > + EXT3_DEFAULT_RESERVE_BLOCKS); > > + > > It's pretty sad that we add this extra code into ext3_prepare_write() - > it's almost never actually executed. > :( > I wonder how important this optimisation really is? I bet no applications > are using posix_fadvise(POSIX_FADV_RANDOM) anyway. > I don't know if there is application using the POSIX_FADV_RANDOM. No? If this is the truth, I think we don't need this optimization at present. Logically reservation does not benefit seeky random write, but there is no benchmark showing performance issue so far. We have already provided ways for applications turn off reservation through the existing ioctl for specified file and -o noreservation mount option for the whole filesystem. > It we really see that benefits are available from this approach we probably > need to bite the bullet and add file_operations.fadvise() > I agree. Mingming ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-26 16:53 ` Mingming Cao @ 2004-10-26 20:18 ` Andrew Morton 2004-10-26 23:01 ` Mingming Cao 0 siblings, 1 reply; 16+ messages in thread From: Andrew Morton @ 2004-10-26 20:18 UTC (permalink / raw) To: Mingming Cao; +Cc: ray-lk, sct, pbadari, linux-kernel, ext2-devel Mingming Cao <cmm@us.ibm.com> wrote: > > > I wonder how important this optimisation really is? I bet no applications > > are using posix_fadvise(POSIX_FADV_RANDOM) anyway. > > > I don't know if there is application using the POSIX_FADV_RANDOM. No? If > this is the truth, I think we don't need this optimization at present. > Logically reservation does not benefit seeky random write, but there is > no benchmark showing performance issue so far. We have already provided > ways for applications turn off reservation through the existing ioctl > for specified file and -o noreservation mount option for the whole > filesystem. Well we definitely don't want to be encouraging application developers to be adding ext3-specific ioctls. So we need to work out if any applications can get significant benefit from manually disabling reservations and if so, wire up fadvise() into filesystems and do it that way. Do you know if disabling reservations helps any workloads? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Ext2-devel] Re: [PATCH 2/3] ext3 reservation allow turn off for specifed file 2004-10-26 20:18 ` Andrew Morton @ 2004-10-26 23:01 ` Mingming Cao 0 siblings, 0 replies; 16+ messages in thread From: Mingming Cao @ 2004-10-26 23:01 UTC (permalink / raw) To: Andrew Morton; +Cc: ray-lk, sct, pbadari, linux-kernel, ext2-devel On Tue, 2004-10-26 at 13:18, Andrew Morton wrote: > Mingming Cao <cmm@us.ibm.com> wrote: > > > > > I wonder how important this optimisation really is? I bet no applications > > > are using posix_fadvise(POSIX_FADV_RANDOM) anyway. > > > > > I don't know if there is application using the POSIX_FADV_RANDOM. No? If > > this is the truth, I think we don't need this optimization at present. > > Logically reservation does not benefit seeky random write, but there is > > no benchmark showing performance issue so far. We have already provided > > ways for applications turn off reservation through the existing ioctl > > for specified file and -o noreservation mount option for the whole > > filesystem. > > Well we definitely don't want to be encouraging application developers to be > adding ext3-specific ioctls. So we need to work out if any applications > can get significant benefit from manually disabling reservations and if > so, wire up fadvise() into filesystems and do it that way. > Okey. Also, if there is such a need to disable reservation, maybe we could think about closing the reservation window dynamically based on past I/O pattern, when user application did not give any hint. > Do you know if disabling reservations helps any workloads? > Not that I aware of. We have done several microbenchmark on sequential and random workload, haven't seen regression with reservations so far. But our testing is certainly not comprehensive. Mingming ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2004-10-26 22:59 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2004-10-15 13:27 Ext3 -mm reservations code: is this fix really correct? Stephen C. Tweedie 2004-10-15 16:01 ` [Ext2-devel] " mingming cao 2004-10-15 16:40 ` Stephen C. Tweedie 2004-10-15 20:29 ` mingming cao 2004-10-15 22:20 ` Stephen C. Tweedie 2004-10-15 22:34 ` mingming cao 2004-10-18 22:55 ` [PATCH 2/3] ext3 reservation allow turn off for specifed file Mingming Cao 2004-10-18 23:42 ` Andrew Morton 2004-10-19 1:01 ` Mingming Cao 2004-10-20 17:55 ` Ray Lee 2004-10-25 20:33 ` [Ext2-devel] " Mingming Cao 2004-10-25 23:05 ` Mingming Cao 2004-10-25 23:45 ` Andrew Morton 2004-10-26 16:53 ` Mingming Cao 2004-10-26 20:18 ` Andrew Morton 2004-10-26 23:01 ` Mingming Cao
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox