All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Nick Piggin <nickpiggin@yahoo.com.au>
Subject: Re: [PATCH] fix task dirty balancing
Date: Sat, 05 Jul 2008 11:30:02 +0200	[thread overview]
Message-ID: <1215250202.6320.10.camel@lappy.programming.kicks-ass.net> (raw)
In-Reply-To: <20080705150401.8bd28b71.kamezawa.hiroyu@jp.fujitsu.com>

On Sat, 2008-07-05 at 15:04 +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 02 Jul 2008 22:27:18 +0200
> Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> 
> > On Wed, 2008-07-02 at 17:26 +0900, YAMAMOTO Takashi wrote:
> > > hi,
> > > 
> > > task_dirty_inc doesn't seem to be called properly for
> > > filesystems which don't use set_page_dirty for write(2).
> > > eg. ext2 w/o nobh option.
> > 
> > I'm thinking this is an ext2 bug. So I'd rather it'd just call
> > set_page_dirty() like a proper filesystem instead of doing things like
> > this.
> > 
> > And I certainly don't like exporting task_dirty_inc() - filesystems and
> > the like should not have to know about things like that.
> > 
> Hmm, a bit complicated for me.
> 
> At first, there are 2 __set_page_dirty() in the kernel.
>   - mm/page-writeback.c: __set_page_dirty()
>               .... set_page_dirty() calls this.
>   - fs/buffer.c : __set_page_dirty()
>               .... __set_page_dirty_buffers() and mark_buffer_dirty() calls this.
> 
> Why per-task dirty acconitng is done in mm/page-writeback.c::set_page_dirty() ?
> 
> It seems other accounting is done in the fs/buffer.c: __set_page_dirty()
> 
> The purpose of task-dirty accounting is different from others  ?
> 
> = fs/buffer.c
>  697 static int __set_page_dirty(struct page *page,
>  698                 struct address_space *mapping, int warn)
>  699 {
>  700         if (unlikely(!mapping))
>  701                 return !TestSetPageDirty(page);
>  702 
>  703         if (TestSetPageDirty(page))
>  704                 return 0;
>  705 
>  706         write_lock_irq(&mapping->tree_lock);
>  707         if (page->mapping) {    /* Race with truncate? */
>  708                 WARN_ON_ONCE(warn && !PageUptodate(page));
>  709 
>  710                 if (mapping_cap_account_dirty(mapping)) {
>  711                         __inc_zone_page_state(page, NR_FILE_DIRTY);
>  712                         __inc_bdi_stat(mapping->backing_dev_info,
>  713                                         BDI_RECLAIMABLE);
>  714                         task_io_account_write(PAGE_CACHE_SIZE);
>  715                 }
>  716                 radix_tree_tag_set(&mapping->page_tree,
>  717                                 page_index(page), PAGECACHE_TAG_DIRTY);
>  718         }
>  719         write_unlock_irq(&mapping->tree_lock);
>  720         __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
>  721 
>  722         return 1;
> ==
> 
> And task-dirty-limit don't have to take care of following 2 case ?
>   - __set_page_dirty_nobuffers(struct page *page) (increment BDI_RECRAIMABLE)
>   - test_set_page_writeback() (increment BDI_RECLAIMABLE)


Gah - what a mess...

It's in set_page_dirty() so it wouldn't have to be in all the
a_ops->set_page_dirty() functions...

But now it turns out people don't use set_page_dirty() to dirty
pages :-(

For the purpose of task_dirty_inc() I guess we might as well pair it
with task_io_account_write() for each PAGE_CACHE_SIZE (and ignore the
DIO bit, since that doesn't care about the dirty limit anyway).

Might be my ignorance, but _why_ do we have __set_page_dirty_nobuffers()
reimplemented in fs/buffers.c:__set_page_dirty() ? - those two functions
look suspiciously similar.

Also, why was the EXPORT added anyway - fs/buffers.o never ends up in
modules?

Please beat me to cleaning up this stuff - otherwise I'll have to look
at it when I get back from holidays.

Peter


  reply	other threads:[~2008-07-05  9:45 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-02  8:26 [PATCH] fix task dirty balancing YAMAMOTO Takashi
2008-07-02 20:27 ` Peter Zijlstra
2008-07-03  7:33   ` YAMAMOTO Takashi
2008-07-05  6:04   ` KAMEZAWA Hiroyuki
2008-07-05  9:30     ` Peter Zijlstra [this message]
2008-07-07  6:46       ` YAMAMOTO Takashi
2008-07-07 10:36       ` Nick Piggin
2008-07-08 23:38       ` YAMAMOTO Takashi
2008-07-09  7:41         ` Nick Piggin
2008-07-10  3:10           ` YAMAMOTO Takashi
2008-07-10  7:23             ` Nick Piggin
2008-07-24  0:27               ` YAMAMOTO Takashi
2008-07-24 15:08                 ` Nick Piggin
2008-07-25  8:04                   ` YAMAMOTO Takashi
2008-07-25  9:57                     ` Peter Zijlstra
2008-07-28  5:50                       ` YAMAMOTO Takashi
2008-07-28  7:24                         ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1215250202.6320.10.camel@lappy.programming.kicks-ass.net \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=yamamoto@valinux.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.