From: Joe Landman <joe.landman@gmail.com>
To: Stan Hoeppner <stan@hardwarefreak.com>
Cc: John Robinson <john.robinson@anonymous.org.uk>,
Matt Garman <matthew.garman@gmail.com>,
Mdadm <linux-raid@vger.kernel.org>
Subject: Re: high throughput storage server?
Date: Thu, 17 Feb 2011 19:06:00 -0500
Message-ID: <4D5DB7E8.1020003@gmail.com>
In-Reply-To: <4D5DB3F3.8050209@hardwarefreak.com>
On 2/17/2011 6:49 PM, Stan Hoeppner wrote:
> Joe Landman put forth on 2/17/2011 4:13 PM:
>
>> Well, the application area appears to be high performance cluster
>> computing, and the storage behind it. It's a somewhat more specialized
>> version of storage, and not one that a typical IT person runs into
>> often. There are different, some profoundly so, demands placed upon
>> such storage.
>
> The OP's post described an ad hoc collection of 40-50 machines doing
> various types of processing on shared data files. This is not classical
> cluster computing. He didn't describe any kind of _parallel_
> processing. It sounded to me like staged batch processing, the
> bandwidth demands of which are typically much lower than a parallel
> compute cluster.

Semantics at best. He is doing significant processing, in parallel,
doing data analysis, in parallel, across a cluster of machines. Doing
MPI-IO? No. Does not using MPI make this not a cluster? No.

See his original post. He posits his bandwidth demands.
>
>> Full disclosure: this is our major market, we make/sell products in
>> this space, have for a while. Take what we say with that in your mind
>> as a caveat, as it does color our opinions.
>
> Thanks for the disclosure Joe.
>
>> The specs as stated, 50 Gb/s ... it's rare ... exceptionally rare ...
>> that you ever see cluster computing storage requirements stated in such
>> terms. Usually they are stated in the MB/s or GB/s regime. Using a
>> basic conversion of Gb/s to GB/s, the OP is looking for ~6 GB/s support.
>
> Indeed. You typically don't see this kind of storage b/w need outside
> the government labs and supercomputing centers (LLNL, Sandia, NCCS,
> SDSC, etc). Of course those sites' requirements are quite a bit higher
> than a "puny" 6 GB/s.
Heh ... we see it all the time in compute clusters, large data analysis
farms, etc. Not at the big labs.
[...]
> McData, etc. I've not heard of a front end loop being used in many many
> years. Some storage vendors still use loops on the _back_ end to
> connect FC/SAS/SATA expansion chassis to the head controller, IBM and
I am talking about the back end.
> NetApp come to mind, but it's usually dual loops per chassis, so you're
> looking at ~3 GB/s per expansion chassis using 8 Gbit loops. One would
2 GB/s, assuming FC-8, and 20 lower-speed drives are sufficient to
completely fill 2 GB/s. So, as I was saying, the design matters.
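
To spell that arithmetic out (a sketch only; the dual 8 Gb loops are
from your description, and the ~100 MB/s streaming rate per drive is my
assumption for nearline disks of that era):

  # dual 8 Gb FC back end loops vs. what the drives behind them can stream
  loops = 2
  loop_gbyte_per_s = loops * 8 / 8.0       # naive bits -> bytes: ~2 GB/s total
  drive_mbyte_per_s = 100.0                # assumed per-drive streaming rate
  drives_to_saturate = loop_gbyte_per_s * 1000 / drive_mbyte_per_s
  print("back end loop bandwidth: ~%.0f GB/s" % loop_gbyte_per_s)
  print("drives to saturate it:   ~%.0f" % drives_to_saturate)
  # -> ~2 GB/s of loop bandwidth, saturated by ~20 streaming drives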
[...]
> Nexsan doesn't offer direct SAS connection on the big 42/102 drive Beast
> units, only on the Boy units. The Beast units all use dual or quad FC
> front end ports, with a couple front end GbE iSCSI ports thrown in for
> flexibility. The SAS Boy units beat all competitors on price/TB, as do
> all the Nexsan products.
As I joked once, many, many years ago, "broad sweeping generalizations
tend to be incorrect". Yes, it is a recursive joke, but there is a
serious aspect to it. Your proffered pricing per TB, with which you
claim Nexsan beats all comers ... is much higher than ours, and that of
many others. No, they don't beat all, or even many.
> I'd like to note that over subscription isn't intrinsic to a piece of
> hardware. It's indicative of an engineer or storage architect not
> knowing what the blank he's doing.
Oversubscription and its corresponding resource contention, not to
mention poor design of other aspects ... yeah, I agree that this is
indicative of something. One must question why people continue to
deploy architectures which don't scale.
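
To put a rough number on that contention, reusing figures already in
this thread (42 and 102 drive chassis, dual 8 Gb back end loops, and
again my assumed ~100 MB/s streaming rate per drive):

  # oversubscription: aggregate drive bandwidth vs. back end loop bandwidth
  loop_gbyte_per_s = 2.0                   # dual 8 Gb FC loops, as above
  drive_mbyte_per_s = 100.0                # assumed per-drive streaming rate
  for drives in (42, 102):
      aggregate = drives * drive_mbyte_per_s / 1000.0
      ratio = aggregate / loop_gbyte_per_s
      print("%3d drives: ~%.1f GB/s of disk behind ~%.0f GB/s of loops, ~%.1f:1"
            % (drives, aggregate, loop_gbyte_per_s, ratio))
  # -> roughly 2.1:1 oversubscription at 42 drives, 5.1:1 at 102 drives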
>
>> As I said, high performance storage design is a very ... very ...
>> different animal from standard IT storage design. There are very
>> different decision points, and design concepts.
>
> Depends on the segment of the HPC market. It seems you're competing in
> the low end of it. Configurations get a bit exotic at the very high
I noticed this particular tone in your previous responses. I debated
for a while whether to respond, until I saw something I simply needed
to correct. I'll try not to take your bait.
[...]
> So, again, it really depends on the application(s), as always,
> regardless of whether it's HPC or IT, although there are few purely
> streaming IT workloads; EDL of decision support databases comes to mind,
> but these are usually of relatively short duration. They can still put
> some strain on a SAN if not architected correctly.
>
>>> You don't see many deployed filers on the planet with 5 * 10 GbE front
>>> end connections. In fact, today, you still don't see many deployed
>>> filers with even one 10 GbE front end connection, but usually multiple
>>> (often but not always bonded) GbE connections.
>>
>> In this space, high performance cluster storage, this statement is
>> incorrect.
>
> The OP doesn't have a high performance cluster. HPC cluster storage by
Again, semantics. They are doing massive data ingestion and processing.
This is what is called "big data" in HPC circles, and it is *very much*
an HPC problem.
> accepted definition includes highly parallel workloads. This is not
> what the OP described. He described ad hoc staged data analysis.
See above. If you want to argue semantics, be my guest; I won't be
party to such a waste of time. The OP is doing analysis that requires a
high performance architecture. The architecture you suggested is not
one that people in the field would likely recommend.
[rest deleted]
--
joe