From: Wu Fengguang <fengguang.wu@intel.com>
To: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>,
Jens Axboe <jens.axboe@oracle.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] mm: make VM_MAX_READAHEAD configurable
Date: Mon, 12 Oct 2009 14:23:17 +0800
Message-ID: <20091012062317.GA10719@localhost>
In-Reply-To: <4AD2C43D.1080804@linux.vnet.ibm.com>
On Mon, Oct 12, 2009 at 01:53:01PM +0800, Christian Ehrhardt wrote:
> Wu Fengguang wrote:
> > Hi Martin,
> >
> > On Fri, Oct 09, 2009 at 09:49:50PM +0800, Martin Schwidefsky wrote:
> >
> >> On Fri, 9 Oct 2009 14:29:52 +0200
> >> Jens Axboe <jens.axboe@oracle.com> wrote:
> >>
> >>
> >>> On Fri, Oct 09 2009, Peter Zijlstra wrote:
> >>>
> >>>> On Fri, 2009-10-09 at 13:19 +0200, Ehrhardt Christian wrote:
> >>>>
> >>>>> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
> >>>>>
> >>>>> On the one hand, the VM_MAX_READAHEAD define in include/linux/mm.h is just a
> >>>>> default and can be overridden per block device queue.
> >>>>> On the other hand, a lot of admins never touch it, so it is reasonable to
> >>>>> pick a sensible default.
> >>>>>
> >>>>> This patch allows configuring the value via Kconfig and therefore allows
> >>>>> assigning different defaults depending on other Kconfig symbols.
> >>>>>
> >>>>> Using this, the patch increases the default max readahead for s390, improving
> >>>>> sequential throughput in a lot of scenarios with almost no drawbacks (only
> >>>>> theoretical workloads with a lot of concurrent sequential read patterns on a
> >>>>> very low-memory system suffer, due to the expected page cache thrashing).
> >>>>>
> > [snip]
> >
> >> The patch from Christian fixes a performance regression in the latest
> >> distributions for s390. So we would opt for a larger value; 512KB seems
> >> to be a good one. I have no idea what that will do to the embedded
> >> space, which is why Christian chose to make it configurable. Clearly
> >> the better solution would be some sort of system control that can be
> >> modified at runtime.
> >>
> >
> > May I ask for more details about your performance regression and why
> > it is related to readahead size? (we didn't change VM_MAX_READAHEAD..)
> >
> Sure, the performance regression appeared when comparing Novell SLES10
> vs. SLES11. While you are right, Wu, that the upstream default has never
> changed, SLES10 carried a patch that set it to 512.
I see. I'm curious why SLES11 removed that patch. Did it experience
some regressions with the larger readahead size?
> As mentioned before, I didn't expect a generic 128->512 patch to be
> accepted, hence the configurable solution. But after Peter and Jens
> replied so quickly that changing the default in the kernel would be the
> wrong way to go, I started looking for userspace alternatives. At least
> for my issue I could also fix it with device-specific udev rules.
OK.
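For the record, a per-device-type udev rule along these lines should do
it (the rule file name and the dasd* match are just illustrative, not
necessarily what you used):

  # /etc/udev/rules.d/60-readahead.rules (illustrative)
  SUBSYSTEM=="block", KERNEL=="dasd*", ACTION=="add|change", ATTR{queue/read_ahead_kb}="512"

Or at runtime, for a single device:

  blockdev --setra 1024 /dev/dasda    # --setra counts 512-byte sectors

which is equivalent to writing 512 to /sys/block/dasda/queue/read_ahead_kb.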
> And as Andrew mentioned, the diversity of devices causes any default to
> be wrong for one installation or another. To address that, the udev
> approach can also differentiate between device types (which might be
> easier on s390 than on other architectures, because I only need to take
> care of two disk types atm - and both should get 512).
I guess it's not a general solution for everyone, though. There are so
many device types out there, and we have not yet considered the
memory/workload combinations.
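(For reference, my rough reading of the Kconfig approach described in the
patch is something like the sketch below. This is only illustrative; the
symbol name and the defaults are my guesses, not necessarily what the
actual patch does:

  # mm/Kconfig (illustrative sketch)
  config VM_MAX_READAHEAD_KB
  	int "Default maximum readahead window size in kilobytes"
  	default 512 if S390
  	default 128

  /* include/linux/mm.h */
  #define VM_MAX_READAHEAD	CONFIG_VM_MAX_READAHEAD_KB	/* kbytes */

A compile-time default like that still cannot adapt to memory size or
workload, which is part of my concern.)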
> The testcase for anyone who wants to experiment with it is almost too
> easy; the biggest impact can be seen with single-threaded iozone - I get
> ~40% better throughput when increasing the readahead size to 512 (even
> bigger RA sizes don't help much in my environment, probably due to the
> fast devices).
That's an impressive number - I guess we need a larger default RA size.
But before that, let's learn something from SLES10's experience :)
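If anyone else wants to try it, something along these lines should be
close to the setup (device name, file size and record size here are
illustrative, not necessarily Christian's exact parameters):

  # raise the device readahead from the default 128kB to 512kB
  echo 512 > /sys/block/sda/queue/read_ahead_kb

  # single-threaded sequential write + read; use a file a few times
  # larger than RAM so the read is not served from the page cache
  iozone -s 2g -r 64k -i 0 -i 1 -f /mnt/test/iozone.tmp

Then compare against the run with the default 128kB readahead.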
Thanks,
Fengguang