From: Rogier Wolff <R.E.Wolff@BitWizard.nl>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: "Rogier Wolff" <R.E.Wolff@BitWizard.nl>,
	"Greg Freemyer" <greg.freemyer@gmail.com>,
	"Bruno Prémont" <bonbons@linux-vserver.org>,
	linux-kernel@vger.kernel.org, linux-ide@vger.kernel.org
Subject: Re: Slow disks.
Date: Thu, 23 Dec 2010 18:01:09 +0100
Message-ID: <20101223170109.GA31591@bitwizard.nl>
In-Reply-To: <x494oa48scp.fsf@segfault.boston.devel.redhat.com>

On Thu, Dec 23, 2010 at 09:40:54AM -0500, Jeff Moyer wrote:
> > In my performance calculations, 10ms average seek (should be around
> > 7), 4ms average rotational latency for a total of 14ms. This would
> > degrade for read-modify-write to 10+4+8 = 22ms. Still 10 times better
> > than what we observe: service times on the order of 200-300ms. 
> 
> I didn't say it would account for all of your degradation, just that it
> could affect performance.  I'm sorry if I wasn't clear on that.

We can live with a "2x performance degradation" due to a stupid
configuration, but not with the 10x-30x that we're seeing now.
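
For reference, the back-of-the-envelope model quoted above can be written
out as a short Python sketch. The 7200 rpm spindle speed is an assumption:
it is what a roughly 4ms average rotational latency implies, but it is not
stated anywhere in this thread. The function names are made up for
illustration.

```python
# Rough service-time model using the figures quoted above.
# ASSUMPTION: 7200 rpm drives (implied by ~4 ms average rotational latency).

def rotation_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def service_time_ms(seek_ms: float, rpm: float,
                    read_modify_write: bool = False) -> float:
    """Seek time plus average rotational latency (half a turn).

    A read-modify-write is modelled as costing one extra full rotation,
    because the sector has to come around again before it can be written.
    """
    full_turn = rotation_ms(rpm)
    total = seek_ms + full_turn / 2
    if read_modify_write:
        total += full_turn
    return total


plain = service_time_ms(10, 7200)
rmw = service_time_ms(10, 7200, read_modify_write=True)
print(f"plain: {plain:.1f} ms, read-modify-write: {rmw:.1f} ms")

# Compare against the observed 200-300 ms service times.
for observed_ms in (200, 300):
    print(f"observed {observed_ms} ms is about "
          f"{observed_ms / rmw:.0f}x the modelled read-modify-write time")
```

Even with the pessimistic 10ms seek, the model stays in the low tens of
milliseconds, so the chunk-size penalty alone cannot explain the observed
numbers.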

> >> > md1 : active raid5 sda2[0] sdd2[3](S) sdb2[1] sdc2[4]
> >> >       39067648 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3]
> >> > [UUU]
> >> 
> >> A 512KB raid5 chunk with 4KB I/Os?  That is a recipe for inefficiency.
> >> Again, blktrace data would be helpful.
> >
> > Where did you get the 4kb IOs from? You mean from the iostat -x
> > output?
> 
> Yes, since that's all I have to go on at the moment.
> 
> > The system/filesystem decided to do those small IOs. With the
> > throughput we're getting on the filesystem, it better not try to write
> > larger chunks...
> 
> Your logic is a bit flawed, for so many reasons I'm not even going to
> try to enumerate them here.  Anyway, I'll continue to sound like a
> broken record and ask for blktrace data.

Here it is. 

http://prive.bitwizard.nl/blktrace.log

I can't read these traces yet... The manual is unclear.

I'd guess that "D" means "submitted to driver" and "C" means
"completed". I very often see a "D" followed VERY shortly by a "C".
I also see more "C"s than "D"s.

Another way of looking at it was to sort on the "ID" field. I would
expect each "transaction" to follow similar steps. But many IDs occur
only twice, and not with the same steps each time.
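
One way to check those guesses is to pair each "D" with the next "C" for
the same device and start sector, and to count the completions that have
no matching "D". The Python sketch below assumes the default blkparse text
layout (device, cpu, sequence, timestamp, pid, action, rwbs, sector "+"
block count) and keys on the start sector, which may or may not be the
"ID" field meant above. The sample lines are made up for illustration and
are not taken from the posted log.

```python
import re
from collections import Counter

# ASSUMED default blkparse layout:
#   dev  cpu  seq  timestamp  pid  action  rwbs  sector + nblocks [process]
LINE = re.compile(
    r"^\s*(?P<dev>\d+,\d+)\s+(?P<cpu>\d+)\s+(?P<seq>\d+)\s+"
    r"(?P<time>\d+\.\d+)\s+(?P<pid>\d+)\s+(?P<action>[A-Z])\s+"
    r"(?P<rwbs>\S+)\s+(?P<sector>\d+)\s+\+\s+(?P<nblocks>\d+)"
)


def summarize(lines):
    """Count actions and pair each D (issue) with the next C (complete)."""
    counts = Counter()
    issued = {}          # (dev, sector) -> timestamp of the pending D
    latencies = []       # ((dev, sector), seconds from D to C)
    unmatched_completions = 0
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue     # skip summary lines and anything unparsable
        key = (m["dev"], int(m["sector"]))
        stamp = float(m["time"])
        counts[m["action"]] += 1
        if m["action"] == "D":
            issued[key] = stamp
        elif m["action"] == "C":
            if key in issued:
                latencies.append((key, stamp - issued.pop(key)))
            else:
                unmatched_completions += 1
    return counts, latencies, unmatched_completions


# Hypothetical sample lines, for illustration only.
sample = [
    "  8,0    1        1     0.000000000   697  Q   W 223490 + 8 [kjournald]",
    "  8,0    1        2     0.000004000   697  D   W 223490 + 8 [kjournald]",
    "  8,0    1        3     0.250004000     0  C   W 223490 + 8 [0]",
    "  8,16   0        1     0.300000000     0  C   R 1000 + 8 [0]",
]

counts, latencies, unmatched = summarize(sample)
print(dict(counts))
for (dev, sector), seconds in latencies:
    print(f"{dev} sector {sector}: {seconds * 1000:.1f} ms from D to C")
print(f"completions without a matching D: {unmatched}")
```

Sorting the resulting latencies would show whether the 200-300ms service
times are spent between "D" and "C", that is, inside the driver and the
disk, or somewhere earlier in the stack.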

> > I have benchmarked my own "high bandwidth" raid arrays. I benchmarked
> > them with 128k, 256k, 512k and 1024k blocksize. I got the best
> > throughput (for my benchmark: dd if=/dev/md0 of=/dev/null bs=1024k)
> > with 512k blocksize. (and yes that IS a valid benchmark for my
> > usage of the array.)
> 
> Sorry, I'm not sure I understand how this is relevant.  I thought we
> were troubleshooting a problem on someone else's system.  Further, the
> window into the workload we saw via iostat definitely shows that smaller
> I/Os are issued.

My friend confessed to me today that he determined the "optimal" RAID
block size with the exact same test as I had done, and reached the
same conclusion. So that explains his raid blocksize of 512k. 

The system is a mail server running on a RAID array across three of the
disks. Most of the IOs are generated by the mail server software and
pass through the FS driver and the RAID layer. It's not as if we're
running a database that inherently requires 4k IOs; apparently those
small IOs are simply what this workload needs.

	Roger. 

-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ
