From: Jody McIntyre <scjody@sun.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-raid@vger.kernel.org, neilb@suse.de
Subject: Re: [PATCH] md: Track raid5/6 statistics
Date: Mon, 11 May 2009 09:36:03 -0400
Message-ID: <20090511133602.GB30561@clouds>
In-Reply-To: <e9c3a7c20905070930n5b7bd3dcy3e837d17865d4540@mail.gmail.com>

On Thu, May 07, 2009 at 09:30:33AM -0700, Dan Williams wrote:

> It would be nice if the kernel could auto-tune stripe_cache_size, but
> I think modifying it in a reactive fashion may do more harm than good.
>  The times when we want write-out to be faster are usually the times
> when the system has too much dirty memory lying around so there is no
> room to increase the cache.  If we are under utilizing the stripe
> cache then there is a good chance the memory could be put to better
> use in the page cache, but then we are putting ourselves in a
> compromised state when a write burst appears.

Yes - it's really too bad that we have this tunable, but I can't think
of a good way to get rid of it.  In some customer issues I've seen,
performance really suffers when the array is out of stripes - enough to
make single IOs take _minutes_ in the worst cases.  This is especially
easy to reproduce during a resync or rebuild, for obvious reasons.

On a related note, there seems to be some confusion surrounding how much
memory is used by the stripe cache.  I've seen users who believed the
value was in kilobytes of memory, whereas the truth is a bit more
complicated.  We could add a stripe_cache_kb entry (writeable even) to
make this clearer, and/or improve Documentation/md.txt.  Also, we
helpfully print the amount allocated when the array is first run():

		printk(KERN_INFO "raid5: allocated %dkB for %s\n",
			memory, mdname(mddev));

but we don't ever provide an update when it changes.  I don't think we
want to printk() every time someone changes the sysfs tunable though -
perhaps we should get rid of the message in run()?
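
To illustrate why the stripe cache value is not a size in kilobytes, here
is a rough sketch of the first-order arithmetic. It assumes each cached
stripe holds one page per member device and ignores the per-stripe
bookkeeping structures, so the figure the kernel prints will be somewhat
higher. The function name stripe_cache_kb() is hypothetical and not taken
from the md code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rough stripe cache footprint in kB.
 * Assumption: one page per member device per stripe; the per-stripe
 * bookkeeping overhead is ignored, so this underestimates the real figure.
 */
static unsigned long long stripe_cache_kb(unsigned long long nr_stripes,
					  unsigned int raid_disks,
					  size_t page_size)
{
	assert(raid_disks > 0 && page_size > 0);
	return nr_stripes * raid_disks * page_size / 1024;
}
```

Under these assumptions, 256 stripes on a 4-disk array with 4096-byte
pages comes to 4096 kB, which is 16 times what a user reading the value
as kilobytes would expect.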

> In the end I agree that having some kind of out_of_stripes
> notification would be useful.  However, I think it would make more
> sense to implement it as a "stripe_cache_active load average".  Then
> for a given workload the operator can see if there are spikes or
> sustained cache saturation.  What do you think?

That makes sense.  It would be a more meaningful number than our current
statistic, which is "at some point since you started the array, we had
to wait for a stripe N times."
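
As a starting point for discussion, a "stripe_cache_active load average"
could be kept as a fixed-point exponentially decayed average, in the
style of the scheduler's load average. The sketch below is only that: the
names sc_load_update() and sc_load_int() are hypothetical, the decay
constant is borrowed from the scheduler's 1-minute average and therefore
assumes one sample every 5 seconds, and where the sampling would hook
into md is left open.

```c
#include <assert.h>
#include <stdint.h>

#define SC_FSHIFT	11			/* bits of fractional precision */
#define SC_FIXED_1	(1u << SC_FSHIFT)	/* 1.0 in fixed point */
#define SC_EXP_1	1884u			/* decay per 5 s sample, 1-minute window */

/*
 * Fold one sample of the active stripe count into the running average.
 * 'load' is in fixed point; 'active' is a plain count.
 */
static uint64_t sc_load_update(uint64_t load, unsigned int active)
{
	uint64_t sample = (uint64_t)active << SC_FSHIFT;

	assert(load < (1ULL << 48));	/* keeps the multiply from overflowing */
	return (load * SC_EXP_1 + sample * (SC_FIXED_1 - SC_EXP_1)) >> SC_FSHIFT;
}

/* Integer part of the fixed-point average, for reporting. */
static unsigned int sc_load_int(uint64_t load)
{
	return (unsigned int)(load >> SC_FSHIFT);
}
```

A sustained sample value pulls the average toward that value over roughly
a minute, while a short spike decays away, which should let the operator
tell bursts from sustained saturation. Longer windows would only need a
different decay constant.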

I'll come up with a patch when I get the chance.

Cheers,
Jody
