Re: LVM on raid10,f2 performance issues

From: "Keld Jørn Simonsen" <keld@dkuug.dk>
To: Peter Rabbitson <rabbit+list@rabbit.us>
Cc: thomas62186218@aol.com, soltys@ziu.info, mauermann@gmail.com,
	linux-raid@vger.kernel.org
Subject: Re: LVM on raid10,f2 performance issues
Date: Mon, 19 Jan 2009 14:59:38 +0100
Message-ID: <20090119135938.GA27466@rap.rap.dk>
In-Reply-To: <49747107.8020607@rabbit.us>

On Mon, Jan 19, 2009 at 01:24:39PM +0100, Peter Rabbitson wrote:
> Keld Jørn Simonsen wrote:
> > Hmm, 
> > 
> > Why is the command
> > 
> >  blockdev --setra 65536 /dev/md0
> > 
> > really needed? I think the kernel should set a reasonable default here.
> 
> The in-kernel default for a block device is 256 (128k) which is way too
> low. The MD subsystem tries to be a bit smarter and assigns the md
> device readahead according to the number of devices/raid level. For
> streaming (i.e. file server) these values are also too low. LVs can take
> a readahead specification at creation time and use that, but this is
> manual.

I would like to have something done automatically in the kernel, so that
you do not need to do it manually. People tend not to know that you need
to add the blockdev statement, e.g. in /etc/rc.local, to get decent
performance. And this is needed even for simpler arrays, such as a
4-drive raid10,f2, which can be set up on many recent motherboards with
SATA-II support directly off the mobo.
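
For reference, here is a minimal sketch of what such an rc.local entry
could look like. blockdev counts readahead in 512-byte sectors, so the
default of 256 is 128 KiB and 65536 is 32 MiB. The device name /dev/md0
comes from the command quoted above; the helper names are made up for
illustration only.

```shell
#!/bin/sh
# blockdev --setra / --getra count in 512-byte sectors.
sectors_to_kib() { echo $(( $1 / 2 )); }
kib_to_sectors() { echo $(( $1 * 2 )); }

# Hypothetical helper: set the readahead on a device, but only if the
# device actually exists, and report the value that is now in effect.
set_readahead() {
    dev=$1
    sectors=$2
    if [ -b "$dev" ]; then
        blockdev --setra "$sectors" "$dev"
        echo "$dev readahead: $(blockdev --getra "$dev") sectors"
    else
        echo "$dev is not a block device, skipping" >&2
        return 1
    fi
}

# Example line for /etc/rc.local (needs root):
# set_readahead /dev/md0 65536
```

The call is left commented out so the snippet can be sourced without
touching any device.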

> It is arguable what the typical workload is, but I would lean towards
> big long linear reads (fileserver) vs short scattered ones (database).

My understanding is that the readahead is only done when the kernel
thinks it is doing sequential reads. This probably is not the case when
doing database operations. So we are kind of safe here, IMHO.
> 
> The real solution to the problem was proposed a long time ago, and it
> seems it got lost in the attic: http://lwn.net/Articles/155510/

Yes, interesting.

The patch may not be ready for inclusion for some time due to complexity
and lack of testing.

So I am wondering if we could come up with a formula to set the readahead
for raid. It seems like a big readahead would not affect random reading.
It would then only be overkill for sequential reading of smallish files.

So how does the kernel detect that it is doing sequential reading?
Maybe it detects that the new block to read for a specific file
descriptor directly follows the previous read on the same FD?

And then we normally read a full chunk for the raid, which is at least
something like 64 KiB? This would take care of most database
transactions. 

I would think we then should find the smallest readahead value for a
given array, from chunk size and drive count, that gets the array to
perform as expected.
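
As a starting point only, one candidate would be a multiple of the full
stripe: chunk size times number of drives times a factor that has to be
found by benchmarking. The factor, its default of 2, and the function
name below are assumptions for illustration, not something measured in
this thread.

```shell
#!/bin/sh
# Candidate readahead in 512-byte sectors:
#   factor * chunk size (KiB) * number of drives, converted to sectors.
# "factor" is a tuning knob to be determined by benchmarking (assumption).
raid_readahead_sectors() {
    chunk_kib=$1
    drives=$2
    factor=${3:-2}
    echo $(( chunk_kib * drives * factor * 2 ))   # 1 KiB = 2 sectors
}

# 4-drive raid10,f2 with 64 KiB chunks and two stripes of readahead:
raid_readahead_sectors 64 4 2    # prints 1024 (512 KiB)
```

The result could then be handed to blockdev --setra and compared against
larger values to see where the throughput stops improving.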

best regards
keld