* Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()
2007-10-08 16:54 [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite() Peter Zijlstra
@ 2007-10-08 6:37 ` Nick Piggin
2007-10-08 23:36 ` David Chinner
0 siblings, 1 reply; 6+ messages in thread
From: Nick Piggin @ 2007-10-08 6:37 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, Linus Torvalds, Andrew Morton, Christoph Hellwig,
David Howells, Dave Chinner, Trond Myklebust, mark.fasheh, hugh,
stable
On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:
> It seems that with the recent usage of ->page_mkwrite() a little detail
> was overlooked.
>
> .22-rc1 merged OCFS2 usage of this hook
> .23-rc1 merged XFS usage
> .24-rc1 will most likely merge NFS usage
>
> Please consider this for .23 final and maybe even .22.x
>
> ---
> Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()
>
> All the current page_mkwrite() implementations also set the page dirty,
> which results in set_page_dirty_balance() _not_ calling balance_dirty_pages(),
> because the page is already found dirty.
>
> This allows us to dirty a _lot_ of pages without ever hitting
> balance_dirty_pages(). Not good (tm).
>
> Force a balance call if ->page_mkwrite() was successful.
Would it be better to just have the callers set_page_dirty_balance()?
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
> include/linux/writeback.h | 2 +-
> mm/memory.c | 9 +++++++--
> mm/page-writeback.c | 4 ++--
> 3 files changed, 10 insertions(+), 5 deletions(-)
>
> Index: linux-2.6/include/linux/writeback.h
> ===================================================================
> --- linux-2.6.orig/include/linux/writeback.h
> +++ linux-2.6/include/linux/writeback.h
> @@ -137,7 +137,7 @@ int sync_page_range(struct inode *inode,
> loff_t pos, loff_t count);
> int sync_page_range_nolock(struct inode *inode, struct address_space *mapping,
> loff_t pos, loff_t count);
> -void set_page_dirty_balance(struct page *page);
> +void set_page_dirty_balance(struct page *page, int page_mkwrite);
> void writeback_set_ratelimit(void);
>
> /* pdflush.c */
> Index: linux-2.6/mm/memory.c
> ===================================================================
> --- linux-2.6.orig/mm/memory.c
> +++ linux-2.6/mm/memory.c
> @@ -1559,6 +1559,7 @@ static int do_wp_page(struct mm_struct *
> struct page *old_page, *new_page;
> pte_t entry;
> int reuse = 0, ret = 0;
> + int page_mkwrite = 0;
> struct page *dirty_page = NULL;
>
> old_page = vm_normal_page(vma, address, orig_pte);
> @@ -1607,6 +1608,8 @@ static int do_wp_page(struct mm_struct *
> page_cache_release(old_page);
> if (!pte_same(*page_table, orig_pte))
> goto unlock;
> +
> + page_mkwrite = 1;
> }
> dirty_page = old_page;
> get_page(dirty_page);
> @@ -1691,7 +1694,7 @@ unlock:
> * do_no_page is protected similarly.
> */
> wait_on_page_locked(dirty_page);
> - set_page_dirty_balance(dirty_page);
> + set_page_dirty_balance(dirty_page, page_mkwrite);
> put_page(dirty_page);
> }
> return ret;
> @@ -2238,6 +2241,7 @@ static int __do_fault(struct mm_struct *
> struct page *dirty_page = NULL;
> struct vm_fault vmf;
> int ret;
> + int page_mkwrite = 0;
>
> vmf.virtual_address = (void __user *)(address & PAGE_MASK);
> vmf.pgoff = pgoff;
> @@ -2315,6 +2319,7 @@ static int __do_fault(struct mm_struct *
> anon = 1; /* no anon but release vmf.page */
> goto out;
> }
> + page_mkwrite = 1;
> }
> }
>
> @@ -2375,7 +2380,7 @@ out_unlocked:
> if (anon)
> page_cache_release(vmf.page);
> else if (dirty_page) {
> - set_page_dirty_balance(dirty_page);
> + set_page_dirty_balance(dirty_page, page_mkwrite);
> put_page(dirty_page);
> }
>
> Index: linux-2.6/mm/page-writeback.c
> ===================================================================
> --- linux-2.6.orig/mm/page-writeback.c
> +++ linux-2.6/mm/page-writeback.c
> @@ -460,9 +460,9 @@ static void balance_dirty_pages(struct a
> pdflush_operation(background_writeout, 0);
> }
>
> -void set_page_dirty_balance(struct page *page)
> +void set_page_dirty_balance(struct page *page, int page_mkwrite)
> {
> - if (set_page_dirty(page)) {
> + if (set_page_dirty(page) || page_mkwrite) {
> struct address_space *mapping = page_mapping(page);
>
> if (mapping)
* Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()
2007-10-08 23:36 ` David Chinner
@ 2007-10-08 7:47 ` Nick Piggin
2007-10-09 2:12 ` Mark Fasheh
0 siblings, 1 reply; 6+ messages in thread
From: Nick Piggin @ 2007-10-08 7:47 UTC (permalink / raw)
To: David Chinner
Cc: Peter Zijlstra, linux-kernel, Linus Torvalds, Andrew Morton,
Christoph Hellwig, David Howells, Trond Myklebust, mark.fasheh,
hugh, stable
On Tuesday 09 October 2007 09:36, David Chinner wrote:
> On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote:
> > On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:
> > > Force a balance call if ->page_mkwrite() was successful.
> >
> > Would it be better to just have the callers set_page_dirty_balance()?
>
> block_page_mkwrite() is just using generic interfaces to do this,
> same as pretty much any write() system call. The idea was to make it
> as similar to the write() call path as possible...
>
> However, unlike generic_file_buffered_write(), we are not calling
> balance_dirty_pages_ratelimited(mapping) between
> ->prepare/commit_write call pairs. Perhaps this should be added to
> block_page_mkwrite() after the page is unlocked....
That sounds pretty sane, in terms of matching with
generic_file_buffered_write.
* Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()
2007-10-09 2:12 ` Mark Fasheh
@ 2007-10-08 14:50 ` Nick Piggin
0 siblings, 0 replies; 6+ messages in thread
From: Nick Piggin @ 2007-10-08 14:50 UTC (permalink / raw)
To: Mark Fasheh
Cc: David Chinner, Peter Zijlstra, linux-kernel, Linus Torvalds,
Andrew Morton, Christoph Hellwig, David Howells, Trond Myklebust,
hugh, stable
On Tuesday 09 October 2007 12:12, Mark Fasheh wrote:
> On Mon, Oct 08, 2007 at 05:47:52PM +1000, Nick Piggin wrote:
> > > block_page_mkwrite() is just using generic interfaces to do this,
> > > same as pretty much any write() system call. The idea was to make it
> > > as similar to the write() call path as possible...
> > >
> > > However, unlike generic_file_buffered_write(), we are not calling
> > > balance_dirty_pages_ratelimited(mapping) between
> > > ->prepare/commit_write call pairs. Perhaps this should be added to
> > > block_page_mkwrite() after the page is unlocked....
> >
> > That sounds pretty sane, in terms of matching with
> > generic_file_buffered_write.
>
> I agree. We could also insert a call to balance_dirty_pages_ratelimited()
> in __ocfs2_page_mkwrite.
Hmm, Peter's patch got merged -- I suppose that's fine for 2.6.23 though...
* [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()
@ 2007-10-08 16:54 Peter Zijlstra
2007-10-08 6:37 ` Nick Piggin
0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2007-10-08 16:54 UTC (permalink / raw)
To: linux-kernel, Linus Torvalds, Andrew Morton
Cc: Christoph Hellwig, David Howells, Nick Piggin, Dave Chinner,
Trond Myklebust, mark.fasheh, hugh, stable
It seems that with the recent usage of ->page_mkwrite() a little detail
was overlooked.
.22-rc1 merged OCFS2 usage of this hook
.23-rc1 merged XFS usage
.24-rc1 will most likely merge NFS usage
Please consider this for .23 final and maybe even .22.x
---
Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()
All the current page_mkwrite() implementations also set the page dirty, which
results in set_page_dirty_balance() _not_ calling balance_dirty_pages(),
because the page is already found dirty.
This allows us to dirty a _lot_ of pages without ever hitting
balance_dirty_pages(). Not good (tm).
Force a balance call if ->page_mkwrite() was successful.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
include/linux/writeback.h | 2 +-
mm/memory.c | 9 +++++++--
mm/page-writeback.c | 4 ++--
3 files changed, 10 insertions(+), 5 deletions(-)
Index: linux-2.6/include/linux/writeback.h
===================================================================
--- linux-2.6.orig/include/linux/writeback.h
+++ linux-2.6/include/linux/writeback.h
@@ -137,7 +137,7 @@ int sync_page_range(struct inode *inode,
loff_t pos, loff_t count);
int sync_page_range_nolock(struct inode *inode, struct address_space *mapping,
loff_t pos, loff_t count);
-void set_page_dirty_balance(struct page *page);
+void set_page_dirty_balance(struct page *page, int page_mkwrite);
void writeback_set_ratelimit(void);
/* pdflush.c */
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -1559,6 +1559,7 @@ static int do_wp_page(struct mm_struct *
struct page *old_page, *new_page;
pte_t entry;
int reuse = 0, ret = 0;
+ int page_mkwrite = 0;
struct page *dirty_page = NULL;
old_page = vm_normal_page(vma, address, orig_pte);
@@ -1607,6 +1608,8 @@ static int do_wp_page(struct mm_struct *
page_cache_release(old_page);
if (!pte_same(*page_table, orig_pte))
goto unlock;
+
+ page_mkwrite = 1;
}
dirty_page = old_page;
get_page(dirty_page);
@@ -1691,7 +1694,7 @@ unlock:
* do_no_page is protected similarly.
*/
wait_on_page_locked(dirty_page);
- set_page_dirty_balance(dirty_page);
+ set_page_dirty_balance(dirty_page, page_mkwrite);
put_page(dirty_page);
}
return ret;
@@ -2238,6 +2241,7 @@ static int __do_fault(struct mm_struct *
struct page *dirty_page = NULL;
struct vm_fault vmf;
int ret;
+ int page_mkwrite = 0;
vmf.virtual_address = (void __user *)(address & PAGE_MASK);
vmf.pgoff = pgoff;
@@ -2315,6 +2319,7 @@ static int __do_fault(struct mm_struct *
anon = 1; /* no anon but release vmf.page */
goto out;
}
+ page_mkwrite = 1;
}
}
@@ -2375,7 +2380,7 @@ out_unlocked:
if (anon)
page_cache_release(vmf.page);
else if (dirty_page) {
- set_page_dirty_balance(dirty_page);
+ set_page_dirty_balance(dirty_page, page_mkwrite);
put_page(dirty_page);
}
Index: linux-2.6/mm/page-writeback.c
===================================================================
--- linux-2.6.orig/mm/page-writeback.c
+++ linux-2.6/mm/page-writeback.c
@@ -460,9 +460,9 @@ static void balance_dirty_pages(struct a
pdflush_operation(background_writeout, 0);
}
-void set_page_dirty_balance(struct page *page)
+void set_page_dirty_balance(struct page *page, int page_mkwrite)
{
- if (set_page_dirty(page)) {
+ if (set_page_dirty(page) || page_mkwrite) {
struct address_space *mapping = page_mapping(page);
if (mapping)
* Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()
2007-10-08 6:37 ` Nick Piggin
@ 2007-10-08 23:36 ` David Chinner
2007-10-08 7:47 ` Nick Piggin
0 siblings, 1 reply; 6+ messages in thread
From: David Chinner @ 2007-10-08 23:36 UTC (permalink / raw)
To: Nick Piggin
Cc: Peter Zijlstra, linux-kernel, Linus Torvalds, Andrew Morton,
Christoph Hellwig, David Howells, Dave Chinner, Trond Myklebust,
mark.fasheh, hugh, stable
On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote:
> On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:
> > It seems that with the recent usage of ->page_mkwrite() a little detail
> > was overlooked.
> >
> > .22-rc1 merged OCFS2 usage of this hook
> > .23-rc1 merged XFS usage
> > .24-rc1 will most likely merge NFS usage
> >
> > Please consider this for .23 final and maybe even .22.x
> >
> > ---
> > Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()
> >
> > All the current page_mkwrite() implementations also set the page dirty,
> > which results in set_page_dirty_balance() _not_ calling balance_dirty_pages(),
> > because the page is already found dirty.
> >
> > This allows us to dirty a _lot_ of pages without ever hitting
> > balance_dirty_pages(). Not good (tm).
> >
> > Force a balance call if ->page_mkwrite() was successful.
>
> Would it be better to just have the callers set_page_dirty_balance()?
block_page_mkwrite() is just using generic interfaces to do this,
same as pretty much any write() system call. The idea was to make it
as similar to the write() call path as possible...
However, unlike generic_file_buffered_write(), we are not calling
balance_dirty_pages_ratelimited(mapping) between
->prepare/commit_write call pairs. Perhaps this should be added to
block_page_mkwrite() after the page is unlocked....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
* Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()
2007-10-08 7:47 ` Nick Piggin
@ 2007-10-09 2:12 ` Mark Fasheh
2007-10-08 14:50 ` Nick Piggin
0 siblings, 1 reply; 6+ messages in thread
From: Mark Fasheh @ 2007-10-09 2:12 UTC (permalink / raw)
To: Nick Piggin
Cc: David Chinner, Peter Zijlstra, linux-kernel, Linus Torvalds,
Andrew Morton, Christoph Hellwig, David Howells, Trond Myklebust,
hugh, stable
On Mon, Oct 08, 2007 at 05:47:52PM +1000, Nick Piggin wrote:
> > block_page_mkwrite() is just using generic interfaces to do this,
> > same as pretty much any write() system call. The idea was to make it
> > as similar to the write() call path as possible...
> >
> > However, unlike generic_file_buffered_write(), we are not calling
> > balance_dirty_pages_ratelimited(mapping) between
> > ->prepare/commit_write call pairs. Perhaps this should be added to
> > block_page_mkwrite() after the page is unlocked....
>
> That sounds pretty sane, in terms of matching with
> generic_file_buffered_write.
I agree. We could also insert a call to balance_dirty_pages_ratelimited() in
__ocfs2_page_mkwrite.
--Mark
--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com