Re: Linux MD? Or an H710p?

From: David Brown <david.brown@hesbynett.no>
To: stan@hardwarefreak.com
Cc: Steve Bergman <sbergman27@gmail.com>, linux-raid@vger.kernel.org
Subject: Re: Linux MD? Or an H710p?
Date: Wed, 23 Oct 2013 09:03:16 +0200
Message-ID: <526774B4.400@hesbynett.no>
In-Reply-To: <5266AE51.4050501@hardwarefreak.com>

On 22/10/13 18:56, Stan Hoeppner wrote:
> On 10/22/2013 2:24 AM, David Brown wrote:
>> On 22/10/13 02:36, Steve Bergman wrote:
>>
>> <snip>
>>
>>> But hey, this is going to be a very nice opportunity for observing XFS's
>>> savvy with parallel i/o.
>>
>> You mentioned using a 6-drive RAID10 in your first email, with XFS on
>> top of that.  Stan is the expert here, but my understanding is that you
>> should go for three 2-drive RAID1 pairs, and then use an md linear
>> "raid" for these pairs and put XFS on top of that in order to get the
>> full benefits of XFS parallelism.
> 
> XFS on a concatenation, which is what you described above, is a very
> workload specific storage architecture.  It is not a general use
> architecture, and almost never good for database workloads.  Here most
> of the data is stored in a single file or a small set of files, in a
> single directory.  With such a DB workload and 3 concatenated mirrors,
> only 1/3rd of the spindles would see the vast majority of the IO.
> 

That's a good point - while I had noted that the OP was running a
database, I forgot it was a virtual Windows machine running an MS SQL
database.  The virtual machine will use a single large file for its
virtual hard disk image, and so RAID10 + XFS will beat RAID1 + concat +
XFS for that workload.
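
To make the single-large-file case concrete, here is a minimal toy model
(not md's actual mapping code).  It assumes three mirror pairs, and it
treats the 6-drive RAID10 as a simple stripe over those pairs.  The pair
size, chunk size and image size are hypothetical, and the file is assumed
to sit contiguously at the start of the array.

```python
# Toy model: which mirror pairs hold the blocks of one large file?
# All sizes below are hypothetical, chosen only for illustration.

GIB = 1024 ** 3
PAIRS = 3                   # three 2-drive RAID1 pairs
PAIR_SIZE = 1000 * GIB      # hypothetical usable size of one pair
CHUNK = 512 * 1024          # hypothetical stripe chunk size


def pair_for_offset_concat(offset):
    """Linear concat: pair 0 holds the first PAIR_SIZE bytes, then pair 1, ..."""
    return offset // PAIR_SIZE


def pair_for_offset_striped(offset):
    """Striping: consecutive chunks rotate across the mirror pairs."""
    return (offset // CHUNK) % PAIRS


def pairs_touched(mapper, file_start, file_size, step=CHUNK):
    """Return the set of mirror pairs that hold any part of the file."""
    return {mapper(off) for off in range(file_start, file_start + file_size, step)}


image_size = 200 * GIB      # hypothetical VM disk image
concat_pairs = pairs_touched(pair_for_offset_concat, 0, image_size)
striped_pairs = pairs_touched(pair_for_offset_striped, 0, image_size)
print("concat :", sorted(concat_pairs))
print("striped:", sorted(striped_pairs))
```

Under these assumptions the concat keeps the whole image on the first
pair, while the stripe spreads it over all three, which is the "1/3rd of
the spindles" effect Stan describes.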

On the other hand, he is also serving 100+ FreeNX desktop users.  As far
as I understand it (and I'm very happy for corrections if I'm wrong),
that will mean a /home directory with 100+ sub-directories for the
different users - and that /is/ one of the ideal cases for concat+XFS
parallelism.
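
A rough sketch of why many home directories suit the concat layout,
assuming a heavily simplified placement rule: each new directory goes to
the next allocation group in turn, and files stay in their directory's
AG.  The real XFS allocator is more involved, and the AG count and user
count here are hypothetical.

```python
# Toy model of directory placement on a concat of mirror pairs.
# Assumes equal-sized pairs and an AG count that is a multiple of PAIRS.
from collections import Counter

PAIRS = 3            # three RAID1 pairs in the linear array
AGS_PER_PAIR = 2     # hypothetical: 6 allocation groups in total
AG_COUNT = PAIRS * AGS_PER_PAIR


def ag_for_new_directory(index):
    """Simplification: the n-th new directory goes to AG n mod AG_COUNT."""
    return index % AG_COUNT


def pair_for_ag(ag):
    """AGs lie in order along the concat, so AGs 0-1 sit on pair 0, etc."""
    return ag // AGS_PER_PAIR


users = [f"user{n:03d}" for n in range(120)]  # hypothetical home directories
load = Counter(pair_for_ag(ag_for_new_directory(i)) for i, _ in enumerate(users))
for pair in sorted(load):
    print(f"pair {pair}: {load[pair]} home directories")
```

With these numbers each pair ends up holding 40 of the 120 home
directories, so independent users tend to hit different spindles.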

Only the OP can say which type of access is going to dominate and where
the balance should go.

As a more general point, I don't know that you can generalise that
database workloads normally store data in a single big file or a small
set of files.  I haven't worked with many databases, and none bigger than
a few hundred MB, so I am theorising here from things I have read rather
than from personal practice.  But certainly with PostgreSQL the data is
split into multiple files - each database has its own directory, and each
table and index has its own file within it.  For very big tables, the
data is split into multiple files (1 GB segments by default) - and at
some point, they will hit the allocation group size and then be split
over multiple AGs, leading to parallelism (with a bit of luck).  I am
guessing other databases are somewhat similar.  Of course, like any
database tuning, this will all be highly load-dependent.
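
As a small illustration of the file splitting, the sketch below lists the
segment files PostgreSQL would use for one table, assuming the default
1 GB segment size.  The filenode number and table size are made up, and
the helper name is hypothetical.

```python
# Sketch: on-disk segment file names for one PostgreSQL relation.
# Assumes the default segment size of 1 GB; the filenode is hypothetical.

SEGMENT_BYTES = 1024 ** 3


def segment_files(filenode, table_bytes):
    """List the file names for a relation of the given size."""
    count = max(1, -(-table_bytes // SEGMENT_BYTES))  # ceiling division
    # The first segment is named after the filenode, later ones get .1, .2, ...
    return [str(filenode)] + [f"{filenode}.{n}" for n in range(1, count)]


files = segment_files(16384, int(3.5 * SEGMENT_BYTES))
print(files)
```

A 3.5 GB table therefore becomes four files in the same database
directory; whether those files end up in different AGs depends on the
allocator and on how full each AG is.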
