From: Bill Fink <billfink@mindspring.com>
To: Justin Maggard <jmaggard10@gmail.com>
Cc: "Ted Ts'o" <tytso@mit.edu>,
	Bill Fink <bill@wizard.sci.gsfc.nasa.gov>,
	"adilger@sun.com" <adilger@sun.com>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
	"Fink, William E. (GSFC-6061)" <william.e.fink@nasa.gov>
Subject: Re: [RFC PATCH] ext4: fix 50% disk write performance regression
Date: Mon, 30 Aug 2010 21:44:20 -0400
Message-ID: <20100830214420.51c920de.billfink@mindspring.com>
In-Reply-To: <AANLkTi=pfVCKmURAnhageAuQFT7yUMy+5NzJ8C9NunXt@mail.gmail.com>

On Mon, 30 Aug 2010, Justin Maggard wrote:

> On Mon, Aug 30, 2010 at 5:37 PM, Ted Ts'o <tytso@mit.edu> wrote:
> > On Mon, Aug 30, 2010 at 04:49:58PM -0400, Bill Fink wrote:
> >> > Thanks for reporting it.  I'm going to have to take a closer look at
> >> > why this makes a difference.  I'm going to guess though that what's
> >> > going on is that we're posting writes in such a way that they're no
> >> > longer aligned or ending at the end of a RAID5 stripe, causing a
> >> > read-modify-write pass.  That would easily explain the write
> >> > performance regression.
> >>
> >> I'm not sure I understand.  How could calling or not calling
> >> ext4_num_dirty_pages() (unpatched versus patched 2.6.35 kernel)
> >> affect the write alignment?
> >
> > Suppose you have 8 disks, with stripe size of 16k.  Assuming that
> > you're only using one parity disk (i.e., RAID 5) and no spare disks,
> > that means the optimal I/O size is 7*16k == 112k.  If we do a write
> > which is smaller than 112k, or which is not a multiple of 112k, then
> > the RAID subsystem will need to do a read-modify-write to update the
> > parity disk.  Furthermore, the write had better be aligned on a 112k
> > byte boundary.  The block allocator will guarantee that block #0 is
> > aligned on a 112k boundary, but writes also have to be the right size
> > in order to avoid the read-modify-write.
> >
> > If we end up doing very small writes, then it can end up being quite
> > disastrous for write performance.
> 
> I'd have to agree that this is likely the case.  Just to add a little
> more data here, I tried the same 32GB dd test against a 12-disk MD
> RAID 6 64k chunk array today with and without the patch (although
> against a 2.6.33.7 kernel), and my write performance dropped from
> ~420MB/sec down to 350MB/sec when I used the patched kernel.
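
For reference, the stripe arithmetic quoted above can be sketched in a few
lines of C.  This is only an illustration, not kernel or md code: the helper
names full_stripe_kib() and is_full_stripe_write() are made up here, and the
model assumes every non-parity disk holds exactly one chunk per stripe.

```c
#include <assert.h>

/*
 * Hypothetical helper: data width of one full stripe in KiB.
 * "parity" is 1 for RAID5 and 2 for RAID6 (assumption: no spares).
 */
static unsigned long full_stripe_kib(unsigned int disks, unsigned int parity,
                                     unsigned long chunk_kib)
{
	assert(disks > parity);	/* need at least one data disk */
	return (unsigned long)(disks - parity) * chunk_kib;
}

/*
 * Hypothetical helper: returns 1 if a write at offset off_kib of length
 * len_kib starts on a stripe boundary and covers only whole stripes,
 * i.e. the case that should not need a read-modify-write pass.
 */
static int is_full_stripe_write(unsigned long off_kib, unsigned long len_kib,
                                unsigned long stripe_kib)
{
	assert(stripe_kib > 0);
	return len_kib > 0 &&
	       off_kib % stripe_kib == 0 &&
	       len_kib % stripe_kib == 0;
}
```

With the numbers from this thread, full_stripe_kib(8, 1, 16) gives 112 for
the 8-disk RAID5 example, and full_stripe_kib(12, 2, 64) gives 640 for the
12-disk RAID6 array with 64k chunks.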

I'm curious.  Since you're using 12 disks where I was only
using 8, I'm wondering what performance you would get if you
changed the multiplier to, say, 16, i.e.

	desired_nr_to_write = wbc->nr_to_write * 16;

It seems you should be getting better than 420 MB/sec on a
12-disk raid, although perhaps the overhead of doing RAID6
is an issue.  I use md RAID0 to combine 2 of the hardware
RAID5 arrays (total of 16 disks), and I'm seeing (with my
patch) 1.3 GB/sec write performance.
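
To put rough numbers on the multiplier suggestion, here is a small C sketch
of how the multiplier scales the amount of data targeted per writeback pass
relative to a stripe.  The assumptions are mine and not taken from the
kernel source: 4 KiB pages (SKETCH_PAGE_SIZE), an nr_to_write of 1024 pages
chosen purely for illustration, and made-up helper names.  It says nothing
about how the block layer actually splits or merges the resulting I/O.

```c
#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical helper: bytes targeted by nr_to_write pages times multiplier. */
static unsigned long writeback_bytes(unsigned long nr_to_write,
                                     unsigned long multiplier)
{
	return nr_to_write * multiplier * SKETCH_PAGE_SIZE;
}

/* Hypothetical helper: number of whole stripes that fit in "bytes". */
static unsigned long whole_stripes(unsigned long bytes,
                                   unsigned long stripe_bytes)
{
	return bytes / stripe_bytes;
}

/* Hypothetical helper: leftover bytes that would form a partial stripe. */
static unsigned long tail_bytes(unsigned long bytes,
                                unsigned long stripe_bytes)
{
	return bytes % stripe_bytes;
}
```

Under those assumptions, a multiplier of 16 targets 64 MiB per pass, which
is 102 whole 640 KiB stripes plus a 256 KiB tail on the 12-disk RAID6 array;
a multiplier of 8 targets 32 MiB, or 51 whole stripes plus a 128 KiB tail.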

					-Bill
