From: Jens Axboe <jens.axboe@oracle.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jeff Moyer <jmoyer@redhat.com>,
Ralf Gross <rg@STZ-Softwaretechnik.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
linux-fsdevel@vger.kernel.org
Subject: Re: io-scheduler tuning for better read/write ratio
Date: Fri, 26 Jun 2009 12:44:06 +0200
Message-ID: <20090626104406.GK23611@kernel.dk>
In-Reply-To: <20090626021905.GA23981@localhost>
On Fri, Jun 26 2009, Wu Fengguang wrote:
> On Tue, Jun 23, 2009 at 03:42:46AM +0800, Jeff Moyer wrote:
> > Ralf Gross <rg@STZ-Softwaretechnik.com> writes:
> >
> > > Jeff Moyer schrieb:
> > >> Jeff Moyer <jmoyer@redhat.com> writes:
> > >>
> > >> > Ralf Gross <rg@stz-softwaretechnik.com> writes:
> > >> >
> > >> >> Casey Dahlin schrieb:
> > >> >>> On 06/16/2009 02:40 PM, Ralf Gross wrote:
> > >> >>> > David Newall schrieb:
> > >> >>> >> Ralf Gross wrote:
> > >> >>> >>> write throughput is much higher than the read throughput (40 MB/s
> > >> >>> >>> read, 90 MB/s write).
> > >> >>> >
> > >> >>> > Hm, but I get higher read throughput (160-200 MB/s) if I don't write
> > >> >>> > to the device at the same time.
> > >> >>> >
> > >> >>> > Ralf
> > >> >>>
> > >> >>> How specifically are you testing? It could depend a lot on the
> > >> >>> particular access patterns you're using to test.
> > >> >>
> > >> >> I did the basic tests with tiobench. The real test is a test backup
> > >> >> (bacula) with 2 jobs that create two 30 GB spool files on that device.
> > >> >> The jobs partially write to the device in parallel. Depending on which
> > >> >> spool file reaches 30 GB first, one starts reading from that file
> > >> >> and writing to tape, while the other is still spooling.
> > >> >
> > >> > We are missing a lot of details, here. I guess the first thing I'd try
> > >> > would be bumping up the max_readahead_kb parameter, since I'm guessing
> > >> > that your backup application isn't driving very deep queue depths. If
> > >> > that doesn't work, then please provide exact invocations of tiobench
> > >> > that reproduce the problem or some blktrace output for your real test.
> > >>
> > >> Any news, Ralf?
> > >
> > > sorry for the delay. atm there are large backups running and using the
> > > raid device for spooling. So I can't do any tests.
> > >
> > > Re. read ahead: I tested different settings from 8Kb to 65Kb, this
> > > didn't help.
> > >
> > > I'll do some more tests when the backups are done (3-4 more days).
> >
> > The default is 128KB, I believe, so it's strange that you would test
> > smaller values. ;) I would try something along the lines of 1 or 2 MB.
> >
> > I'm CCing Fengguang in case he has any suggestions.
>
> Jeff, thank you for the forwarding (and sorry for the long delay)!
>
> The read:write (or rather sync:async) ratio control is an IO scheduler
> feature. CFQ has parameters slice_sync and slice_async for that.
> What's more, CFQ will let async IO wait if there is any in-flight
> sync IO. This is good, but not quite enough. Normally sync IOs come
> one by one, with some small idle time window in between. If we only
> start dispatching async IOs after the last sync IO has completed for
> e.g. 1ms, then we can hold back the async background write IOs while
> there is an active sync foreground read IO stream.
>
> This simple patch aims to address the writes-push-aside-reads problem.
> Ralf, you can try applying this patch and run your workload with this
> (huge) CFQ parameter:
>
> echo 1000 > /sys/block/sda/queue/iosched/slice_sync
>
> The patch is based on 2.6.30, but can be trivially backported if you
> want to use some old kernel.
>
> It may impact overall (sync+async) IO throughput when there are one or
> more ongoing sync IO streams, so it requires considerable benchmarking
> and tuning.
>
> Thanks,
> Fengguang
> ---
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index a55a9bd..14011b7 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -1064,7 +1064,6 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
>  	if (blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag)
>  		return;
>
> -	WARN_ON(!RB_EMPTY_ROOT(&cfqq->sort_list));
>  	WARN_ON(cfq_cfqq_slice_new(cfqq));
>
>  	/*
> @@ -2175,8 +2174,6 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
>  	 * or if we want to idle in case it has no pending requests.
>  	 */
>  	if (cfqd->active_queue == cfqq) {
> -		const bool cfqq_empty = RB_EMPTY_ROOT(&cfqq->sort_list);
> -
>  		if (cfq_cfqq_slice_new(cfqq)) {
>  			cfq_set_prio_slice(cfqd, cfqq);
>  			cfq_clear_cfqq_slice_new(cfqq);
> @@ -2190,8 +2187,8 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
>  		 */
>  		if (cfq_slice_used(cfqq) || cfq_class_idle(cfqq))
>  			cfq_slice_expired(cfqd, 1);
> -		else if (cfqq_empty && !cfq_close_cooperator(cfqd, cfqq, 1) &&
> -			 sync && !rq_noidle(rq))
> +		else if (sync && !rq_noidle(rq) &&
> +			 !cfq_close_cooperator(cfqd, cfqq, 1))
>  			cfq_arm_slice_timer(cfqd);
>  	}
What's the purpose of this patch? If you have requests pending you don't
want to arm the idle timer and wait, you want to dispatch those.
--
Jens Axboe
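
To make the question above concrete, here is a minimal user-space sketch
(not kernel code) of the one branch the patch changes in
cfq_completed_request(). The struct, enum, and function names are
hypothetical; each field only stands in for the kernel predicate named in
its comment, and any side effects of cfq_close_cooperator() are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state checked when a request completes. */
struct completion_state {
	bool slice_used;       /* cfq_slice_used(cfqq) || cfq_class_idle(cfqq) */
	bool queue_empty;      /* RB_EMPTY_ROOT(&cfqq->sort_list) */
	bool sync;             /* the completed request was synchronous */
	bool noidle;           /* rq_noidle(rq) */
	bool close_cooperator; /* cfq_close_cooperator() returned true */
};

enum action { EXPIRE_SLICE, ARM_IDLE_TIMER, NO_ACTION };

/* Condition as it reads in the '-' lines of the quoted diff (2.6.30). */
static enum action decide_2_6_30(const struct completion_state *s)
{
	if (s->slice_used)
		return EXPIRE_SLICE;
	if (s->queue_empty && !s->close_cooperator && s->sync && !s->noidle)
		return ARM_IDLE_TIMER;
	return NO_ACTION;
}

/* Condition as it reads in the '+' lines: the queue_empty test is gone. */
static enum action decide_patched(const struct completion_state *s)
{
	if (s->slice_used)
		return EXPIRE_SLICE;
	if (s->sync && !s->noidle && !s->close_cooperator)
		return ARM_IDLE_TIMER;
	return NO_ACTION;
}

/* Human-readable label for an action, useful when printing comparisons. */
static const char *action_name(enum action a)
{
	switch (a) {
	case EXPIRE_SLICE:   return "expire slice";
	case ARM_IDLE_TIMER: return "arm idle timer";
	case NO_ACTION:      return "no action";
	}
	assert(!"unknown action");
	return "?";
}
```

For a sync request that completes while the queue still has requests
pending (queue_empty = false, sync = true, all other fields false),
decide_2_6_30() returns NO_ACTION whereas decide_patched() returns
ARM_IDLE_TIMER. That is the case the question is about.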
Thread overview: 20+ messages
2009-06-16 15:43 io-scheduler tuning for better read/write ratio Ralf Gross
2009-06-16 16:41 ` David Newall
2009-06-16 18:40 ` Ralf Gross
2009-06-16 18:43 ` Casey Dahlin
2009-06-16 18:56 ` Ralf Gross
2009-06-16 20:16 ` Jeff Moyer
2009-06-22 14:43 ` Jeff Moyer
2009-06-22 16:31 ` Ralf Gross
2009-06-22 19:42 ` Jeff Moyer
2009-06-23 7:24 ` Ralf Gross
2009-06-23 13:53 ` Jeff Moyer
2009-06-24 7:25 ` Ralf Gross
2009-06-24 7:57 ` Al Boldi
2009-06-25 7:26 ` Ralf Gross
2009-06-25 13:45 ` Al Boldi
2009-06-25 7:27 ` Ralf Gross
2009-06-26 2:19 ` Wu Fengguang
2009-06-26 10:44 ` Jens Axboe [this message]
2009-06-27 3:46 ` Wu Fengguang
2009-06-29 9:47 ` Ralf Gross