public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Edward Shishkin <edward.shishkin@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Nick Piggin <npiggin@suse.de>, Ryan Hope <rmh3093@gmail.com>,
	Randy Dunlap <randy.dunlap@oracle.com>,
	linux-kernel@vger.kernel.org,
	ReiserFS Mailing List <reiserfs-devel@vger.kernel.org>
Subject: Re: [patch 2/4] vfs: add set_page_dirty_notag
Date: Sat, 14 Feb 2009 22:11:33 +0100
Message-ID: <1234645893.4695.8.camel@laptop>
In-Reply-To: <18838.49922.215481.399653@edward.zelnet.ru>

On Sat, 2009-02-14 at 16:11 +0300, Edward Shishkin wrote:
> Peter Zijlstra writes:
>  > On Fri, 2009-02-13 at 16:57 +0300, Edward Shishkin wrote:
>  > > >
>  > > > Eew, so reiser4 will totally side-step the regular vm inode
>  > > > writeback paths -- or is this fixed by a more elaborate than
>  > > > usual a_ops->writepages() ?
>  > > >   
>  > > The second.
>  > > 
>  > > reiser4_writepages() catches the anonymous  (tagged)
>  > > pages, captures them mandatory, then commits all atoms
>  > > of the file.
>  > 
>  > OK, can you then make it painfully clear in the function comment, esp.
>  > since you export this function?
> 
> Hello.
> Does it look better?
> 
> Thanks,
> Edward.
> 
> This is a fixup for the following "todo":
> akpm wrote:
> > reiser4_set_page_dirty_internal() pokes around in VFS internals.
> > Use __set_page_dirty_no_buffers() or create a new library function
> > in mm/page-writeback.c.
> 
> Problem:
> 
> In accordance with reiser4 transactional model every dirty page
> should be "captured" by some atom. However, outside reiser4 context
> dirty page can not be captured in some cases, as it is accompanied
> with specific work (jnode creation, etc). Reiser4 recognizes such
> "anonymous" pages (i.e. pages that were dirtied outside of reiser4)
> by the tag PAGECACHE_TAG_DIRTY. Pages dirtied inside reiser4 context
> are not tagged at all: we don't need this. Indeed, once page is
> dirtied and captured, it is attached to a jnode (a special header
> to keep a track of transactions).
> 
> reiser4_set_page_dirty_internal() was the internal reiser4 function
> that set dirty bit without tagging the page. Having such internal
> function led to real problems (incorrect task io accounting, etc.
> because of not updating this internal "friend").
> 
> Solution:
> 
> The following patch adds a core library function that sets a dirty
> bit without tagging the page. It should be modified simultaneously
> with its "friends": __set_page_dirty_nobuffers, __set_page_dirty.
> 
> Signed-off-by: Edward Shishkin<edward.shishkin@gmail.com>
> ---
>  include/linux/mm.h  |    1 +
>  mm/page-writeback.c |   40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 41 insertions(+)
> 
> --- mmotm.orig/include/linux/mm.h
> +++ mmotm/include/linux/mm.h
> @@ -841,6 +841,7 @@ int redirty_page_for_writepage(struct wr
>  				struct page *page);
>  int set_page_dirty(struct page *page);
>  int set_page_dirty_lock(struct page *page);
> +int set_page_dirty_notag(struct page *page);
>  int clear_page_dirty_for_io(struct page *page);
>  
>  extern unsigned long move_page_tables(struct vm_area_struct *vma,
> --- mmotm.orig/mm/page-writeback.c
> +++ mmotm/mm/page-writeback.c
> @@ -1248,6 +1248,46 @@ int __set_page_dirty_nobuffers(struct pa
>  EXPORT_SYMBOL(__set_page_dirty_nobuffers);
>  
>  /*
> + * Some filesystems, which don't use buffers, provide their own
> + * writeback means. And it can happen, that even dirty tag, which
> + * is used by generic methods is not needed. In this case it would
> + * be reasonably to use the following lightweight version of
> + * __set_page_dirty_nobuffers:
> + *
> + * Don't tag page as dirty in its radix tree, just set dirty bit
> + * and update the accounting.
> + * NOTE: This function also doesn't take care of races, i.e. the
> + * caller should guarantee that page can not be truncated.
> + */

Maybe something like

/*
 * set_page_dirty_notag() -- similar to __set_page_dirty_nobuffers()
 * except it doesn't tag the page dirty in the page-cache radix tree.
 * This means that the address space using this cannot use the regular
 * filemap ->writepages() helpers and must provide its own means of
 * tracking and finding dirty pages.
 *
 * NOTE: furthermore, this version also doesn't handle truncate races.
 */

> +int set_page_dirty_notag(struct page *page)
> +{
> +	struct address_space *mapping = page->mapping;
> +
> +	if (!TestSetPageDirty(page)) {
> +		WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
> +		if (mapping_cap_account_dirty(mapping)) {
> +		        /*
> +			 * Since we don't tag the page as dirty,
> +			 * acquiring the tree_lock is replaced
> +			 * with disabling preemption to protect
> +			 * per-cpu data used for accounting.
> +			 */

This should be local_irq_save(flags)

> +			preempt_disable();
> +			__inc_zone_page_state(page, NR_FILE_DIRTY);
> +			__inc_bdi_stat(mapping->backing_dev_info,
> +				       BDI_RECLAIMABLE);
> +			task_dirty_inc(current);
> +			task_io_account_write(PAGE_CACHE_SIZE);
> +			preempt_enable();

local_irq_restore()

These accounting functions rely on being atomic wrt interrupts.

> +		}
> +		__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> +		return 1;
> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL(set_page_dirty_notag);
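
To make the local_irq_save()/local_irq_restore() suggestion above
concrete, here is a minimal userspace sketch of the accounting section.
It is not kernel code: every mock_* name is a hypothetical stand-in so
the fragment compiles outside the kernel, and only one counter is
modelled. It only illustrates the shape of the change, namely that the
non-atomic counter update runs with interrupts disabled and the previous
state is restored afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace stand-ins so the sketch compiles outside the kernel.
 * All mock_* names are hypothetical, not kernel API.
 */
struct mock_page {
	bool dirty;			/* models the PG_dirty bit */
};

static bool mock_irqs_off;		/* models the local interrupt-disable state */
static long mock_nr_file_dirty;		/* models the NR_FILE_DIRTY counter */
static long mock_unsafe_updates;	/* updates seen with "interrupts" enabled */

static void mock_local_irq_save(unsigned long *flags)
{
	*flags = mock_irqs_off;		/* remember the previous state */
	mock_irqs_off = true;
}

static void mock_local_irq_restore(unsigned long flags)
{
	mock_irqs_off = (flags != 0);	/* put the previous state back */
}

/* Models a non-atomic read-modify-write such as __inc_zone_page_state(). */
static void mock_inc_file_dirty(void)
{
	if (!mock_irqs_off)
		mock_unsafe_updates++;	/* would be racy against an interrupt */
	mock_nr_file_dirty++;
}

static int mock_set_page_dirty_notag(struct mock_page *page)
{
	unsigned long flags;

	if (page->dirty)
		return 0;		/* already dirty: nothing to account */
	page->dirty = true;

	mock_local_irq_save(&flags);	/* instead of preempt_disable() */
	mock_inc_file_dirty();
	mock_local_irq_restore(flags);	/* instead of preempt_enable() */
	return 1;
}
```

Saving and restoring the flags, rather than unconditionally re-enabling
interrupts, keeps the helper usable from callers that already run with
interrupts off.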

How much performance gain do you see by avoiding that radix tree op? 

I take it the only reason you don't use the regular
__set_page_dirty_nobuffers(), and then just clear the tag when you do
the write-out by whatever alternative means you have to find the page,
is that skipping the tag gains you some performance.

It would be good to have some numbers to judge this on.
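
For reference, the alternative described above (dirty the page through
the regular tagging path, then clear the tag at write-out) can be
modelled in userspace roughly as follows. This is a sketch under stated
assumptions, not the reiser4 or mm/ implementation: the tag_* names are
hypothetical, the radix tree is reduced to a single per-page flag, and
tag_ops merely counts the tag updates that set_page_dirty_notag() would
avoid.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct tag_page {
	bool dirty;	/* models the PG_dirty bit */
	bool tagged;	/* models PAGECACHE_TAG_DIRTY on this page's slot */
};

static long tag_ops;	/* number of modelled radix-tree tag updates */

/* Models the regular path: set the dirty bit and tag the page. */
static int tag_set_dirty(struct tag_page *page)
{
	if (page->dirty)
		return 0;
	page->dirty = true;
	page->tagged = true;
	tag_ops++;
	return 1;
}

/*
 * Models a filesystem that found the page through its own tracking
 * and clears the tag itself while writing the page out.
 */
static void tag_fs_writeout(struct tag_page *page)
{
	if (page->tagged) {
		page->tagged = false;
		tag_ops++;
	}
	page->dirty = false;
}
```

In this model each dirty/write-out cycle costs two tag updates; whether
avoiding them is measurable on real workloads is exactly the question
the requested numbers would answer.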

