linux-fsdevel.vger.kernel.org archive mirror
From: Peter Zijlstra <peterz@infradead.org>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch 4/8] mm: allow not updating BDI stats in end_page_writeback()
Date: Tue, 18 Mar 2008 14:08:22 +0100
Message-ID: <1205845702.8514.365.camel@twins>
In-Reply-To: <E1JbbHf-0005rm-R5@pomaz-ex.szeredi.hu>

On Tue, 2008-03-18 at 13:51 +0100, Miklos Szeredi wrote:
> > Yes, it does two things, _however_ those two things are very much
> > related. Your use-case that breaks this relation is an exception - and I
> > haven't really grasped it yet..
> > 
> > I'm in general not too keen about you having to export the BDI
> > accounting stuff and using it explicitly like this, but I'm afraid I
> > don't see a way around it - the danger is that other filesystems will
> > get creative (hence the req for GPL - that excludes the most creative
> > ones).
> > 
> > Yes, it makes sense to delay the write completion accounting until it's
> > actually completed.. but I would suggest delaying all writeback accounting.
> 
> Doesn't work, as long as we have throttle_vm_writeout() waiting for
> NR_WRITEBACK to go below a threshold, delaying the NR_WRITEBACK
> accounting could lead to a deadlock.
> 
> So at least until that's resolved NR_WRITEBACK needs to be
> separate from NR_WRITEBACK_TEMP.  And it makes sense possibly even
> after that, as they are fundamentally different things.  The first one
> is page cache pages being under writeout, the second is just kernel
> buffers (mostly) unrelated to the page cache.

Urgh, throttle_vm_writeout() again.. Agreed, that'll deadlock.

> > So the thing that's in your way is that removing a page from the radix
> > tree doesn't imply it's done writing. So perhaps we should make that
> > distinction instead?
> > 
> > So instead of conditionally doing part of the accounting, never do it and
> > require something like: page_writeback_complete() to be called after a
> > successful test_clear_page_writeback().
> 
> Yes, that's a possibility, but then normal filesystems miss out on the
> small optimization provided by doing the BDI accounting functions
> inside the same IRQ disabled region as the radix tree operation.
> Would that have any significant performance impact?

Yeah, realized that. Don't know, would have to measure it somehow...
some archs are rather slow with disabling IRQs, but we're talking about
writeout which should be dominated by the IO times.

It's just that your proposal exposes too much guts, I'd like the
interface to be a little higher level.
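
To make the two-step idea quoted above concrete, here is a rough userspace
sketch. It is a toy model only, not kernel code: every toy_* name and both
structs are made up for illustration, and it ignores locking, the radix
tree and IRQ disabling entirely. It only shows the shape of the split,
where clearing the writeback flag never touches the BDI counter and the
owner of the page reports completion separately.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real structures. */
struct toy_bdi {
	long writeout_completions;	/* models the per-BDI completion count */
};

struct toy_page {
	bool writeback;			/* models the PG_writeback flag */
	struct toy_bdi *bdi;
};

/*
 * Step 1: clear the writeback flag and report whether it was set.
 * No BDI accounting is done here.
 */
static bool toy_test_clear_page_writeback(struct toy_page *page)
{
	bool was_set = page->writeback;

	page->writeback = false;
	return was_set;
}

/*
 * Step 2: account the completion. The caller invokes this once the data
 * has really been written, which may be later than step 1.
 */
static void toy_page_writeback_complete(struct toy_page *page)
{
	page->bdi->writeout_completions++;
}

/* An ordinary filesystem would simply do both steps back to back. */
static void toy_end_page_writeback(struct toy_page *page)
{
	if (toy_test_clear_page_writeback(page))
		toy_page_writeback_complete(page);
}
```

In this model a normal filesystem keeps calling toy_end_page_writeback(),
while a filesystem that finishes the write later calls the two steps
itself, with whatever delay it needs in between.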