From: Andrew Burgess <aab@cichlid.com>
To: lrhorer@satx.rr.com
Cc: 'Linux RAID' <linux-raid@vger.kernel.org>
Subject: RE: RAID halting
Date: Sat, 04 Apr 2009 06:01:06 -0700
Message-ID: <1238850066.16200.51.camel@cichlid.com>
In-Reply-To: <20090404055720.QDOH19140.cdptpa-omta03.mail.rr.com@Leslie>
On Sat, 2009-04-04 at 00:57 -0500, Leslie Rhorer wrote:
> >> The issue is the entire array will occasionally pause completely for
> about 40 seconds when a file is created.
>
> >I had symptoms like this once. It turned out to be a defective disk. The
> >disk would never return a read or write error but just intermittently
> >took a really long time to respond.
>
> >I found it by running atop. All the other drives would be running at low
> >utilization and this one drive would be at 100% when the symptoms
> >occurred (which in atop gets colored red so it jumps out at you)
>
> Thanks. I gave this a try, but not being at all familiar with atop, I'm not
> sure what, if anything, the results mean in terms of any additional
> diagnostic data.
It's the same info as iostat, just in color.
> Depending somewhat upon the I/O load on the RAID array,
> atop sometimes reports the drive utilization on several or all of the drives
> to be well in excess of 85% - occasionally even 99%, but never flat 100% at
> any time.
High 90s is what I meant by 100% :-)
> Oddly, even under relatively light loads of 20 or 30 Mbps,
> sometimes the RAID members would show utilization in the high 90s, usually
> on all the drives on a multiplier channel.
I think that's the filesystem buffering writes and then flushing them
all at once. That's normal if it's periodic: do they go briefly to ~100%
and then back to ~0%?
Did you watch the atop display when the problem occurred?
> I don't know if this is ordinary
> behavior for atop, but all the drives also periodically disappear from the
> status display.
That's a config option (and I find the default annoying). atop also
re-sorts the drives by utilization every second, which can be a little
hard to watch. But if you have the problem I had, then that one drive
stays at the top of the list while the problem is occurring.
> Additionally, while atop is running and I am using my usual
> video editor, Video Redo, on a Windows workstation to stream video from the
> server, every time atop updates, the video and audio skip when reading from
> a drive not on the RAID array. I did not notice the same behavior from the
> RAID array. Odd.
I think this is heavy /proc filesystem access, which I have noticed can
disrupt even realtime processes.
> Anyway, on to the diagnostics.
>
> I ran both `atop` and `watch iostat 1 2` concurrently and triggered several
> events while under heavy load ( >450 Mbps, total ). In atop, drives sdb,
> sdd, sde, sdg, and sdi consistently disappeared from atop entirely, and
> writes for the other drives fell to dead zero. Reads fell to a very small
> number. The iostat session returned information in agreement with atop:
> both reads and writes for sdb, sdd, sde, sdg, sdi, and md0 all fell to dead
> zero from nominal values frequently exceeding 20,000 reads / sec and 5000
> writes / sec. Meanwhile, writes to sda, sdc, sdf, sdh, and sdj also dropped
> to dead zero, but reads only fell to between 230 and 256 reads/sec.
I used:

    iostat -t -k -x 1 | egrep -v 'sd.[0-9]'

to get percent utilization and not show each partition but just whole
drives.
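
For anyone who wants the same number without iostat, here is a rough sketch of how percent utilization can be derived from two snapshots of /proc/diskstats. It assumes the standard field layout, in which the 13th whitespace-separated column is the milliseconds the device spent doing I/O. The function names and the `sd[a-z]+` whole-drive pattern are illustrative choices, not anything taken from iostat or atop.

```python
import re

# Whole drives only (sda, sdb, ...), mirroring the egrep filter that drops
# partitions such as sda1. The pattern is an assumption about device naming.
WHOLE_DRIVE = re.compile(r"sd[a-z]+")


def io_ticks(diskstats_text):
    """Map each whole-drive name to its 'ms spent doing I/O' counter."""
    ticks = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 13:
            continue  # not a full stats line
        name = fields[2]
        if WHOLE_DRIVE.fullmatch(name):
            ticks[name] = int(fields[12])
    return ticks


def utilization(before_text, after_text, interval_ms):
    """Percent of the interval each drive was busy, capped at 100."""
    before = io_ticks(before_text)
    after = io_ticks(after_text)
    return {
        name: min(100.0, 100.0 * (after[name] - before[name]) / interval_ms)
        for name in after
        if name in before
    }
```

To use it, read /proc/diskstats, wait one second, read it again, and pass both texts with `interval_ms=1000`. A drive that sits in the high 90s for the whole stall while the rest are near zero is the pattern to look for.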
For atop you want the -f option to 'fixate' the number of lines so
drives with zero utilization don't disappear.
If you didn't get constant 100% utilization while the event occurred
then I guess you don't have the problem I had.
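
The check described above (one drive pinned near 100% while the others stay low) can be sketched as a small function over per-second utilization samples collected during a stall. This is only an illustration: the `busy` and `idle` thresholds are hypothetical values, and the samples are assumed to come from whatever tool is logging %util.

```python
from statistics import median


def stuck_drives(samples, busy=90.0, idle=30.0):
    """Return drives that stay >= busy for every sample while all the
    other drives have a median utilization <= idle.

    samples: {drive name: [%util for each second of the stall]}
    busy, idle: hypothetical thresholds, tune them for your array.
    """
    stuck = []
    for drive, utils in samples.items():
        if not utils or min(utils) < busy:
            continue  # this drive dipped below the threshold at some point
        others = [median(u) for d, u in samples.items() if d != drive and u]
        if others and max(others) <= idle:
            stuck.append(drive)
    return sorted(stuck)
```

A periodic flush, where every drive goes briefly to ~100% and back to ~0%, returns nothing here because no drive stays busy for the whole window. A whole group of drives saturated together also returns nothing, which matches the "you don't have the problem I had" case.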
Does the SATA multiplier have its own driver and if so, is it the
latest? Any other complaints on the net about it? I would think a
problem there would show up as 100% utilization though...
And I think you already said the CPU usage is low when the event occurs?
No single core at near 100%? (atop would show this too...)
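
If atop is not handy, a per-core check can be approximated from two snapshots of /proc/stat. The sketch below assumes the usual column order on the cpuN lines (user, nice, system, idle, iowait, irq, softirq, steal) and counts iowait as idle time, which is a simplification. The function name is made up for illustration.

```python
def core_busy_percent(before_text, after_text):
    """Percent busy per core between two /proc/stat snapshots."""

    def parse(text):
        cores = {}
        for line in text.splitlines():
            fields = line.split()
            # Per-core lines are cpu0, cpu1, ...; skip the aggregate "cpu" line.
            if not fields or not fields[0].startswith("cpu"):
                continue
            if not fields[0][3:].isdigit():
                continue
            # First eight counters only; guest time is already part of user.
            values = [int(v) for v in fields[1:9]]
            if len(values) < 4:
                continue
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            cores[fields[0]] = (sum(values), idle)
        return cores

    before = parse(before_text)
    after = parse(after_text)
    busy = {}
    for core, (total_after, idle_after) in after.items():
        if core not in before:
            continue
        total_before, idle_before = before[core]
        delta_total = total_after - total_before
        if delta_total <= 0:
            continue  # no ticks elapsed, nothing to report
        delta_idle = idle_after - idle_before
        busy[core] = 100.0 * (delta_total - delta_idle) / delta_total
    return busy
```

Take the two snapshots a second or so apart while a stall is in progress. One core near 100% with the rest idle would point at a single-threaded bottleneck rather than at the disks.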