From: Joe Landman <joe.landman@gmail.com>
To: Mdadm <linux-raid@vger.kernel.org>
Subject: Re: high throughput storage server?  GPFS w/ 10GB/s throughput to the rescue
Date: Sat, 26 Feb 2011 19:56:52 -0500	[thread overview]
Message-ID: <4D69A154.3030403@gmail.com>
In-Reply-To: <4D6992CE.60803@hardwarefreak.com>

On 02/26/2011 06:54 PM, Stan Hoeppner wrote:
> Joe Landman put forth on 2/24/2011 3:20 PM:

[...]

>> that gets you 50x 117 MB/s or about 5.9 GB/s sustained bandwidth for
>> your IO.  10 machines running at a sustainable 600 MB/s delivered over
>> the network, and a parallel file system atop this, solves this problem.
>
> That's 1 file server for each 5 compute nodes Joe.  That is excessive.

No Stan, it isn't.  As I said, this is our market, we know it pretty 
well.  Matt stated his needs pretty clearly.

He needs 5.9 GB/s of sustained bandwidth.  Local drives (as you suggested 
later on) deliver 75-100 MB/s each.  He'd need 2 drives for a RAID1 
mirror, plus a RAID0 stripe over two such mirrors (i.e. RAID10) for 
usable local bandwidth (150+ MB/s).  That's 4 drives per unit, 50 
units: 200 drives.

Does any admin want to manage 200+ drives spread across 50 chassis?  Or 
administer 50 different file systems?

Oh, and what is the impact if some of those nodes go away?  Would they 
take down the file system?  In the cloud-of-micro-disks model Stan 
suggested, yes they would, which is why you might not want to give that 
advice serious consideration.  Unless you build in replication.  Now we 
are at 400 disks in 50 chassis.
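
To make that arithmetic concrete, here is a minimal back-of-the-envelope 
sketch in Python.  The per-drive and per-node figures are the assumptions 
discussed above, not measurements:

  # Node-local "cloud of disks" design: drive counts and per-node bandwidth.
  # All figures are assumptions from this thread, not benchmark results.
  drive_bw_mbs    = 100   # optimistic per-drive streaming bandwidth, MB/s
  nodes           = 50    # compute nodes
  drives_per_node = 4     # RAID10: two RAID1 mirrors striped together
  replication     = 2     # cross-node replication for any resiliency

  raw_drives        = nodes * drives_per_node    # 200 drives to manage
  replicated_drives = raw_drives * replication   # 400 drives to manage
  per_node_bw_mbs   = (drives_per_node // 2) * drive_bw_mbs  # ~200 MB/s stripe

  print(raw_drives, replicated_drives, per_node_bw_mbs)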

Again, this design keeps getting worse.

> Your business is selling these storage servers, so I can understand this
> recommendation.  What cost is Matt looking at for these 10 storage

Now this is sad, very sad.

Stan started out selling the Nexsan version of things (and why he was 
doing that on the MD RAID list, I wonder?), which would have run into 
the same costs Stan noted later.  Now Stan is selling (actually 
mis-selling) GPFS (again, on an MD RAID list, seemingly having picked it 
off a website), without having a clue as to the pricing, implementation, 
issues, etc.

> servers?  $8-15k apiece?  $80-150K total, not including installation,
> maintenance, service contract, or administration training?  And these
> require a cluster file system.  I'm guessing that's in the territory of
> quotes he's already received from NetApp et al.

I did suggest using GlusterFS, as it addresses a number of these issues 
and has an open source version.  I also suggested (since he seems to 
wish to build it himself) that he pursue a reasonable design to start 
with, and avoid the filer-based designs Stan suggested (two Nexsans and 
some sort of filer head to handle them), or a SAN switch of some sort. 
Neither design works well in his scenario, or, for that matter, in the 
vast majority of HPC situations.

I did make a full disclosure of my interests up front, and people are 
free to take my words with a grain of salt.  Insinuating based upon my 
disclosure?  Sad.


> In that case it makes more sense to simply use direct attached storage
> in each compute node at marginal additional cost, and a truly scalable
> parallel filesystem across the compute nodes, IBM's GPFS.  This will
> give better aggregate performance at substantially lower cost, and
> likely with much easier filesystem administration.

See GlusterFS: open source at zero cost.  However, and this is a large 
however, this design, using local storage as a pooled "cloud" of disks, 
has some frequently problematic issues (resiliency, performance, hot 
spots).  A truly hobby design would use this.  Local disk is fine for 
scratch space and a few other things, but managing disks spread out 
among 50 nodes?  Yeah, it's harder.

I'm going to go out on a limb here and suggest Matt speak with HPC 
cluster and storage people.  He can implement anything from effectively 
zero cost through things which can be quite expensive.  If you are 
talking to NetApp about HPC storage, you should probably move on to a 
real HPC storage shop.  His problem is squarely in the HPC arena.

However, I would strongly advise against designs such as a single 
centralized unit, or a cloud of micro-disks.  The first design is 
decidedly non-scalable, which is in part why the HPC community abandoned 
it years ago.  The second design is very hard to manage, and it is hard 
to guarantee any sort of resiliency with it: in what Stan proposed you 
get all the "benefits" of a RAID0, where losing any one node takes data 
with it.
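
As a rough illustration of that RAID0 analogy, here is a tiny Python 
sketch; the per-node availability figure is made up purely for 
illustration:

  # Data striped across N nodes with no replication behaves like a RAID0
  # of nodes: everything depends on every node being up at once.
  # The per-node availability below is an assumption, not a measurement.
  node_availability = 0.999     # assumed fraction of time a node is up
  nodes = 50
  all_up = node_availability ** nodes
  print(f"chance all {nodes} nodes are up at once: {all_up:.3f}")  # ~0.951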

Start out talking with and working with experts, and it's pretty likely 
you'll come out with a good solution.  The inverse is also true.

MD RAID, which Stan at first dismissed as "hobby RAID", can work well 
for Matt.  GlusterFS can provide the parallel file system atop it. 
Starting from a realistic design, an MD RAID based system (self-built or 
otherwise) could easily provide everything Matt needs, at the data rates 
he needs, using entirely open source technologies.  And good designs.
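
For sizing, a sketch along the same lines as above (the 600 MB/s 
delivered-per-server figure comes from my earlier message in this 
thread; treat it as an assumption, not a guarantee):

  # Rough sizing for a dedicated MD RAID + GlusterFS storage-server design.
  import math
  target_bw_mbs  = 5900   # ~5.9 GB/s sustained aggregate requirement
  per_server_mbs = 600    # sustained, delivered over the network, per server
  servers = math.ceil(target_bw_mbs / per_server_mbs)
  print(f"storage servers needed: {servers}")   # -> 10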

You really won't get good performance out of a bad design.  The folks 
doing HPC work who've responded have largely helped frame good design 
patterns.  The folks who aren't sure what HPC really is, haven't.

Regards,

Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
