From: Joe Landman <joe.landman@gmail.com>
To: Stan Hoeppner <stan@hardwarefreak.com>
Cc: John Robinson <john.robinson@anonymous.org.uk>,
Matt Garman <matthew.garman@gmail.com>,
Mdadm <linux-raid@vger.kernel.org>
Subject: Re: high throughput storage server?
Date: Thu, 17 Feb 2011 17:13:04 -0500
Message-ID: <4D5D9D70.9010603@gmail.com>
In-Reply-To: <4D5D9758.2080400@hardwarefreak.com>
On 02/17/2011 04:47 PM, Stan Hoeppner wrote:
> John Robinson put forth on 2/17/2011 5:07 AM:
>> On 14/02/2011 23:59, Matt Garman wrote:
>> [...]
>>> The requirement is basically this: around 40 to 50 compute machines
>>> act as basically an ad-hoc scientific compute/simulation/analysis
>>> cluster. These machines all need access to a shared 20 TB pool of
>>> storage. Each compute machine has a gigabit network connection, and
>>> it's possible that nearly every machine could simultaneously try to
>>> access a large (100 to 1000 MB) file in the storage pool. In other
>>> words, a 20 TB file store with bandwidth upwards of 50 Gbps.
>>
>> I'd recommend you analyse that requirement more closely. Yes, you have
>> 50 compute machines with GigE connections so it's possible they could
>> all demand data from the file store at once, but in actual use, would they?
>
> This is a very good point and one which I somewhat ignored in my initial
> response, making a silent assumption. I did so based on personal
> experience, and knowledge of what other sites are deploying.
Well, the application area appears to be high performance cluster
computing, and the storage behind it. It's a somewhat more specialized
kind of storage, and not one that a typical IT person runs into
often. The demands placed upon such storage are different, some
profoundly so.
Full disclosure: this is our major market; we make and sell products in
this space, and have for a while. Take what we say with that caveat in
mind, as it does color our opinions.
The spec as stated is 50 Gb/s. It's rare ... exceptionally rare ...
to see cluster computing storage requirements stated in such terms.
Usually they are stated in the MB/s or GB/s regime. Using a basic
conversion of Gb/s to GB/s, the OP is looking for roughly 6 GB/s.
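A back-of-the-envelope version of that conversion, as a quick Python
sketch (the 80% efficiency figure is purely an assumption for protocol
and filesystem overhead, not a measurement):

    # Rough conversion of the OP's requirement from line rate to payload.
    link_gbps = 1                              # GigE per compute node
    nodes = 50                                 # worst case: every node pulling at once
    aggregate_gbps = link_gbps * nodes         # 50 Gb/s on the wire
    aggregate_gBps = aggregate_gbps / 8.0      # ~6.25 GB/s raw
    efficiency = 0.80                          # assumed protocol/filesystem overhead
    print(aggregate_gBps, aggregate_gBps * efficiency)   # 6.25, ~5.0 GB/s usable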
Some basic facts about this.
Fibre Channel (FC-8 in particular) will give you, at best, 1 GB/s per
loop, and that presumes you aren't oversubscribing the loop. The vast
majority of designs we see coming from IT shops do, in fact, badly
oversubscribe the bandwidth, which causes significant contention on the
loops. The Nexsan unit you indicated (they are nominally a competitor
of ours) is an FC device, though we've heard rumblings that they may
also allow direct SAS connections (though as a SAS JBOD chassis it
would be quite cost-ineffective compared to other units, and you still
have the oversubscription problem).
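To put that against the ~6 GB/s figure, here is a rough count of how
many FC-8 loops you would need if none of them were oversubscribed (a
sketch only; the 1 GB/s per loop is the best case noted above):

    import math
    required_gBps = 6.25       # the OP's 50 Gb/s, converted
    per_loop_gBps = 1.0        # best case for an FC-8 loop, no oversubscription
    loops = math.ceil(required_gBps / per_loop_gBps)
    print(loops)               # 7 loops, before any headroom or rebuild traffic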
As I said, high performance storage design is a very ... very ...
different animal from standard IT storage design. There are very
different decision points and design concepts.
> You don't see many deployed filers on the planet with 5 * 10 GbE front
> end connections. In fact, today, you still don't see many deployed
> filers with even one 10 GbE front end connection, but usually multiple
> (often but not always bonded) GbE connections.
In this space, high performance cluster storage, this statement is
incorrect.
Our units (again, not trying to be a commercial here, see .sig if you
want to converse offline) usually ship with either 2x 10GbE, 2x QDR IB,
or combinations of these. QDR IB gets you 3.2 GB/s. Per port.
In high performance computing storage (again, the focus of the OP's
questions), this is a reasonable configuration and request.
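As a quick sanity check of those port counts against the OP's number (a
sketch; the per-port figures are the nominal ones above, not
guarantees):

    required_gBps = 6.25       # the OP's 50 Gb/s requirement
    qdr_ib_port = 3.2          # GB/s per QDR IB port, nominal, as above
    tengbe_port = 1.0          # GB/s sustained per 10GbE port, roughly
    print(2 * qdr_ib_port >= required_gBps)   # True:  2x QDR IB covers ~6.4 GB/s
    print(2 * tengbe_port >= required_gBps)   # False: 2x 10GbE alone does not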
>
> A single 10 GbE front end connection provides a truly enormous amount of
> real world bandwidth, over 1 GB/s aggregate sustained. *This is
> equivalent to transferring a full length dual layer DVD in 10 seconds*
Trust me. This is not *enormous*. Well, ok ... put another way, we
architect systems that scale well beyond 10 GB/s sustained. We have
nice TB sprints and similar sorts of "drag racing", as I call them
(cf. http://scalability.org/?p=2912 http://scalability.org/?p=2356
http://scalability.org/?p=2165 http://scalability.org/?p=1980
http://scalability.org/?p=1756 )
1 GB/s is nothing magical. Again, not a commercial, but our DeltaV
units, running MD RAID, achieve 850-900 MB/s (0.85-0.9 GB/s) for RAID6.
To get good (great) performance you have to start out with a good
(great) design, one that really optimizes the performance on a per-unit
basis.
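For the OP's target, a rough count of how many units of that class you
would have to aggregate (a sketch; it assumes near-linear scaling
across units, which the parallel filesystem layer has to deliver for
this to hold):

    import math
    required_mBps = 6250       # ~50 Gb/s expressed in MB/s
    per_unit_mBps = 850        # conservative end of the RAID6 numbers above
    units = math.ceil(required_mBps / per_unit_mBps)
    print(units)               # 8 units at the conservative figure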
> Few sites/applications actually need this kind of bandwidth, either
> burst or sustained. But, this is the system I spec'd for the OP
> earlier. Sometimes people get caught up in comparing raw bandwidth
> numbers between different platforms and lose sight of the real world
> performance they can get from any one of them.
The sad part is that we often wind up fighting against others'
"marketing numbers". Our real benchmarks are often comparable to their
"strong wind at the back" numbers. Heck, our MD RAID numbers are often
better than others' hardware RAID numbers.
Theoretical bandwidth from the marketing docs doesn't matter. The only
thing that does matter is having a sound design and implementation at
all levels. This is why we do what we do, and why we do use MD RAID.
Regards,
Joe
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615