From: Andrew Burgess <aab@cichlid.com>
To: lrhorer@satx.rr.com
Cc: 'Linux RAID' <linux-raid@vger.kernel.org>
Subject: RE: RAID halting
Date: Sat, 04 Apr 2009 06:01:06 -0700
Message-ID: <1238850066.16200.51.camel@cichlid.com>
In-Reply-To: <20090404055720.QDOH19140.cdptpa-omta03.mail.rr.com@Leslie>

On Sat, 2009-04-04 at 00:57 -0500, Lelsie Rhorer wrote:

> >> The issue is the entire array will occasionally pause completely for
> >> about 40 seconds when a file is created.
> 
> >I had symptoms like this once. It turned out to be a defective disk. The
> >disk would never return a read or write error but just intermittently
> >took a really long time to respond.
> 
> >I found it by running atop. All the other drives would be running at low
> >utilization and this one drive would be at 100% when the symptoms
> >occurred (which in atop gets colored red so it jumps out at you)
> 
> Thanks.  I gave this a try, but not being at all familiar with atop, I'm not
> sure what, if anything, the results mean in terms of any additional
> diagnostic data.

It's the same info as iostat, just in color.

> Depending somewhat upon the I/O load on the RAID array,
> atop sometimes reports the drive utilization on several or all of the drives
> to be well in excess of 85% - occasionally even 99%, but never flat 100% at
> any time.  

High 90s is what I meant by 100% :-)

> Oddly, even under relatively light loads of 20 or 30 Mbps,
> sometimes the RAID members would show utilization in the high 90s, usually
> on all the drives on a multiplier channel.

I think that's the filesystem buffering writes and then flushing them
all at once. It's normal if it's periodic: do they go briefly to ~100%
and then back to ~0%?

Did you watch the atop display when the problem occurred?

> I don't know if this is ordinary
> behavior for atop, but all the drives also periodically disappear from the
> status display.

That's a config option (and I find the default annoying). atop also
sorts the drives by utilization every second which can be a little hard
to watch. But if you have the problem I had then that one drive stays at
the top of the list when the problem occurs.

> Additionally, while atop is running and I am using my usual
> video editor, Video Redo, on a Windows workstation to stream video from the
> server, every time atop updates, the video and audio skip when reading from
> a drive not on the RAID array.  I did not notice the same behavior from the
> RAID array.  Odd.

I think this is atop's heavy /proc filesystem access, which I have
noticed can screw up even realtime processes.

> Anyway, on to the diagnostics.
> 
> I ran both `atop` and `watch iostat 1 2` concurrently and triggered several
> events while under heavy load ( >450 Mbps, total ). In atop, drives sdb,
> sdd, sde, sdg, and sdi consistently disappeared from atop entirely, and
> writes for the other drives fell to dead zero.  Reads fell to a very small
> number.  The iostat session returned information in agreement with atop:
> both reads and writes for sdb, sdd, sde, sdg, sdi, and md0 all fell to dead
> zero from nominal values frequently exceeding 20,000 reads / sec and 5000
> writes / sec.  Meanwhile, writes to sda, sdc, sdf, sdh, and sdj also dropped
> to dead zero, but reads only fell to between 230 and 256 reads/sec.

I used:

  iostat -t -k -x 1 | egrep -v 'sd.[0-9]'

to get percent utilization and not show each partition but just whole
drives.

For atop you want the -f option to 'fixate' the number of lines so
drives with zero utilization don't disappear.

If you didn't get constant 100% utilization while the event occurred
then I guess you don't have the problem I had.
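
If you'd rather not eyeball it, here is a rough sketch of the same check
in Python. It takes one report from `iostat -x` as text and flags the
case I had: exactly one whole drive pinned near 100% while all the
others are nearly idle. Treat the details as assumptions: the 90/20
thresholds are arbitrary, the sample report is made up for illustration
(it is not output from your box), and the %util column is located by its
header name since the column layout differs between sysstat versions.

```python
import re

# Whole drives only (sda, sdb, ...), not partitions (sda1) or md devices.
WHOLE_DISK = re.compile(r"^sd[a-z]+$")


def parse_util(report):
    """Return {device: %util} for whole sd* drives in one `iostat -x` report."""
    util = {}
    util_col = None
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("Device"):
            # Find %util by name instead of assuming a fixed position.
            util_col = fields.index("%util") if "%util" in fields else None
            continue
        if (util_col is not None and WHOLE_DISK.match(fields[0])
                and len(fields) > util_col):
            util[fields[0]] = float(fields[util_col])
    return util


def stuck_drives(util, busy=90.0, idle=20.0):
    """Return the one drive at >= busy if every other drive is <= idle."""
    hot = [dev for dev, u in util.items() if u >= busy]
    others = [u for dev, u in util.items() if dev not in hot]
    if len(hot) == 1 and others and all(u <= idle for u in others):
        return hot
    return []


# Made-up sample report, for illustration only.
SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 1.00 2.00 3.00 16.00 24.00 16.00 0.05 1.20 0.80 4.00
sda1 0.00 1.00 2.00 3.00 16.00 24.00 16.00 0.05 1.20 0.80 4.00
sdb 0.00 0.00 0.50 0.20 4.00 1.00 14.29 1.90 2700.00 1400.00 97.50
sdc 0.00 1.00 2.00 3.00 16.00 24.00 16.00 0.04 1.10 0.70 3.10
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
"""

if __name__ == "__main__":
    print(stuck_drives(parse_util(SAMPLE)))
```

An empty result means either nothing is pinned or several drives are
busy at once, which is what ordinary load looks like. A single report is
not enough on its own; the drive has to stay pinned for the whole
duration of the stall.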

Does the SATA multiplier have its own driver and if so, is it the
latest? Any other complaints on the net about it? I would think a
problem there would show up as 100% utilization though...

And I think you already said the CPU usage is low when the event occurs?
No single core near 100%? (atop would show this too...)


