public inbox for linux-kernel@vger.kernel.org
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Eric St-Laurent <ericstl34@sympatico.ca>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
	Fengguang Wu <fengguang.wu@gmail.com>,
	Dave Jones <davej@redhat.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	riel <riel@redhat.com>, Andrew Morton <akpm@linux-foundation.org>,
	Tim Pepper <lnxninja@us.ibm.com>, Chris Snook <csnook@redhat.com>
Subject: Re: [PATCH 0/3] readahead drop behind and size adjustment
Date: Wed, 25 Jul 2007 15:19:59 +1000
Message-ID: <46A6DD7F.1050505@yahoo.com.au>
In-Reply-To: <1185338106.7105.44.camel@perkele>

Eric St-Laurent wrote:
> On Mon, 2007-23-07 at 19:00 +1000, Nick Piggin wrote:
> 
> 
>>I don't like this kind of conditional information going from something
>>like readahead into page reclaim. Unless it is for readahead _specific_
>>data such as "I got these all wrong, so you can reclaim them" (which
>>this isn't).
>>
>>But I don't like it as a use-once thing. The VM should be able to get
>>that right.
>>
> 
> 
> 
> Question: How does the use-once code work in the current kernel? Is
> there any? It doesn't quite work for me...

What *I* think is supposed to happen is that newly read in pages get
put on the inactive list, and unless they get accessed again before
being reclaimed, they are allowed to fall off the end of the list
without disturbing active data too much.
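
As a rough illustration of that behaviour, here is a toy two-list model.
It is only a sketch: the class and method names are made up for this
example, and it is not the code in mm/vmscan.c. New pages go on the
inactive list, a second access promotes them, and reclaim takes from
the tail of the inactive list first.

```python
from collections import OrderedDict


class TwoListLRU:
    """Toy use-once model (hypothetical, not kernel code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inactive = OrderedDict()  # oldest entry first
        self.active = OrderedDict()    # oldest entry first

    def access(self, page):
        if page in self.active:
            # Already hot: just refresh its position.
            self.active.move_to_end(page)
        elif page in self.inactive:
            # Second touch before reclaim: promote to the active list.
            del self.inactive[page]
            self.active[page] = True
        else:
            # First touch: newly read pages start on the inactive list.
            self.inactive[page] = True
            self._reclaim()

    def _reclaim(self):
        while len(self.inactive) + len(self.active) > self.capacity:
            if self.inactive:
                # Use-once pages fall off the end of the inactive list.
                self.inactive.popitem(last=False)
            else:
                self.active.popitem(last=False)


cache = TwoListLRU(capacity=8)
for page in ("a", "b"):
    cache.access(page)
    cache.access(page)          # touched twice, so promoted
for i in range(100):
    cache.access(("stream", i))  # large file read once
print(list(cache.active))        # ['a', 'b']
```

In this model the 100 streamed pages pass through the inactive list
without evicting the two pages that were used twice.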

I think there is a missing piece here: we used to ease the reclaim
pressure off the active list when the inactive list grew much larger
than the active list (which could indicate a lot of use-once pages in
the system).

Andrew got rid of that logic for reasons I don't know, and without it
I can't see that use-once would be terribly effective today (so your
results don't surprise me too much).
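
To make the idea concrete, here is one hypothetical way such a
heuristic could look. The function name, its parameters and the
proportional scaling are all assumptions made for illustration; this
is not the logic that was removed, only a sketch of "scan the active
list less when the inactive list is much larger".

```python
def active_scan_target(nr_active, nr_inactive, nr_to_scan):
    """Hypothetical heuristic: how many active pages to scan this pass."""
    if nr_active == 0:
        return 0
    if nr_inactive <= nr_active:
        # Inactive list is not larger: keep full pressure on active pages.
        return nr_to_scan
    # Inactive list dominates, which may mean many use-once pages:
    # scale active-list scanning down by the active/inactive ratio.
    return nr_to_scan * nr_active // nr_inactive


print(active_scan_target(1000, 1000, 32))  # 32
print(active_scan_target(1000, 8000, 32))  # 4
```

With the lists balanced the active list sees the full scan count; with
an inactive list eight times larger it sees one eighth of it.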

I think I've been banned from touching vmscan.c, but if you're keen to
try a patch, I might be convinced to come out of retirement :)


> See my previous email today, I've done a small test case to demonstrate 
> the problem and the effectiveness of Peter's patch.  The only piece
> missing is the copy case (read once + write once).
> 
> Regardless of how it's implemented, I think a similar mechanism must be
> added. This is a long standing issue.
> 
> In the end, I think it's a pagecache resources allocation problem. The
> VM lacks fair-share limits between processes. The kernel doesn't have
> enough information to make the right decisions.
> 
> You can refine or use more advanced page reclaim, but some fair-share
> splitting (like the CPU scheduler) between the processes must be
> present.  Of course some processes should have large or unlimited VM
> limits, like databases.
> 
> Maybe the "containers" patchset and memory controller can help, with
> some specific configuration and/or a userspace daemon to adjust the
> limits on the fly.
> 
> Independently, the basic large file streaming read (or copy) once cases
> should not trash the pagecache. Can we agree on that?

One man's trash is another's treasure: some people will want the
files to remain in cache because they'll use them again (copy it
somewhere else, or start editing it after being copied or whatever).

But yeah, we can probably do better at the sequential read/write
case.
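
Since only the application knows whether it will reuse the data, one
option available from userspace is for a streaming reader to say so
itself. The sketch below uses Python's os.posix_fadvise() with
POSIX_FADV_SEQUENTIAL and POSIX_FADV_DONTNEED; the function name
stream_file and the 1 MiB chunk size are arbitrary choices for this
example, and the hints are advisory, so the kernel is free to ignore
them.

```python
import os


def _advise(fd, offset, length, advice):
    """Best-effort hint: skip where unsupported, ignore failures."""
    if hasattr(os, "posix_fadvise"):
        try:
            os.posix_fadvise(fd, offset, length, advice)
        except OSError:
            pass


def stream_file(path, chunk_size=1 << 20):
    """Read a file once, dropping each chunk's pages behind the reader."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            # Length 0 means "to the end of the file".
            _advise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        offset = 0
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)  # stand-in for real processing
            if hasattr(os, "posix_fadvise"):
                # Tell the kernel this range will not be needed again.
                _advise(fd, offset, len(chunk), os.POSIX_FADV_DONTNEED)
            offset += len(chunk)
    finally:
        os.close(fd)
    return total
```

A program that does intend to reuse the file simply does not issue the
DONTNEED hint, which leaves the choice with whoever knows the workload.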


> I say, let's add some code to fix the problem.  If we hear about any
> regression in some workloads, we can add a tunable to limit or disable
> its effects, _if_ a better compromise solution cannot be found.

Sure, but let's figure out the workloads and look at all the
alternatives first.

-- 
SUSE Labs, Novell Inc.
