From: Neil Brown <neilb@suse.de>
To: Eddy Zhao <eddy.y.zhao@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: BUG REPORT: md RAID5 write throughput will drop for 1~2s every 16s  (under 1Hz sample rate)
Date: Tue, 20 Jul 2010 22:43:27 +1000
Message-ID: <20100720224327.69ff1967@notabene>
In-Reply-To: <AANLkTin4T27FUSAWktl-gAHm1yseXcIMTc-7WEPkCOu6@mail.gmail.com>

On Tue, 20 Jul 2010 19:40:05 +0800
Eddy Zhao <eddy.y.zhao@gmail.com> wrote:

> Hello Neil:
> 
> 
> We observe periodic write throughput drop of md RAID5. See description below
> 
> Configuration
>  - linux 2.6.28.9
>  - 3 Seagate 320GB 7200rpm SATA2.0 disks
>  - md RAID5, 3 disks, 256KB chunk
> 
> Test
>  - open O_DIRECT /dev/md0
>  - sequential write, 512KB write block
>  - refer to "fpt.cpp" (run "ulimit -s unlimited" before running the program)
> 
> Problem
>  - md RAID5 write throughput will drop for 1~2s every 16s (under 1Hz sample
> rate)
>  - refer to "output.txt"
> 
> Do you know the reason for the problem? We want to fix it on our server to
> make the QoS smooth

If I'm interpreting your numbers correctly, it is just an occasional single
write that is slow - not a series of writes during a one second interval that
are each slow.  It would help if you could confirm that.

Two possibilities occur to me, though it could be something else altogether.
You would need to instrument the code to collect internal states to see if it
is one of these or something else.

1/ a scheduler problem could be delaying the running of raid5d from time to
   time so that it either doesn't respond to ready stripes quickly, or cannot
   get CPU time to perform the xor.

2/ For some reason raid5 sometimes decides that it needs to pre-read the
   'other' block to calculate parity rather than waiting for the other block
   to be written.  This is more likely.
   Either this is bad code somewhere, or the raid5 is being 'unplugged'
   prematurely.
   This seems to happen with a period of 30 seconds (I don't know where you
   got 16 from).  The command:
     tr : ' ' < output.txt | sed 's/ms//' | awk '$4 > 100 {print NR, NR-p; p=NR}'
   suggests intervals of 1 or 33 seconds being most common, though you could
   get more precise data out of your program.

   I suspect this aligns with the 30 second periodic 'flush' that Linux does,
   though I'm not 100% certain.  You could possibly put a 'WARN_ON' in 
   raid5_activate_delayed if delayed_list is not empty.  That will give you
   a stack trace showing why the unplug was called.
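
For anyone who wants to tabulate the gaps rather than eyeball them, here is a
rough Python sketch of the same pipeline.  It assumes only what the awk command
itself implies about output.txt: one record per line, and the latency in
milliseconds in the fourth field once ':' is turned into a space and the first
"ms" is dropped.  The function names and the 100ms default are illustrative,
not taken from fpt.cpp.  Unlike awk, it skips lines whose fourth field is not
a number.

```python
from collections import Counter


def slow_write_gaps(lines, threshold_ms=100.0):
    """Return (record_number, records_since_previous_slow_record) pairs."""
    gaps = []
    prev = 0  # awk's unset variable p behaves as 0 for the first match
    for nr, line in enumerate(lines, start=1):
        # Same preprocessing as: tr : ' ' | sed 's/ms//' (first "ms" only)
        fields = line.replace(":", " ").replace("ms", "", 1).split()
        if len(fields) < 4:
            continue
        try:
            latency = float(fields[3])  # assumed: field 4 is latency in ms
        except ValueError:
            continue  # non-numeric field: skipped (awk would compare strings)
        if latency > threshold_ms:
            gaps.append((nr, nr - prev))
            prev = nr
    return gaps


def gap_histogram(gaps):
    """Count how often each gap length occurs, sorted by gap length."""
    return sorted(Counter(gap for _, gap in gaps).items())
```

Calling gap_histogram(slow_write_gaps(open("output.txt"))) gives a list of
(gap, count) pairs, which should make the dominant interval easier to read off
than the raw awk output.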

I'd be keen to hear about any further discoveries you make.

BTW I prefer all such questions be posted to linux-raid@vger.kernel.org
as others may be able to contribute.  I have taken the liberty of
cc:ing this reply there.  I hope you are OK with that.


NeilBrown


> 
> FYI: "Single disk" and "2 disk RAID0" write throughput are all smooth (under
> 1Hz sample rate)
> 
> 
> Thanks
> Eddy

