From: Stan Hoeppner <stan@hardwarefreak.com>
To: Joe Landman <joe.landman@gmail.com>
Cc: Ed W <lists@wildgooses.com>,
Matt Garman <matthew.garman@gmail.com>,
Mdadm <linux-raid@vger.kernel.org>
Subject: Re: high throughput storage server?
Date: Mon, 28 Feb 2011 17:14:20 -0600
Message-ID: <4D6C2C4C.5090204@hardwarefreak.com>
In-Reply-To: <4D6BC33E.7070703@gmail.com>
Joe Landman put forth on 2/28/2011 9:46 AM:
> On 02/27/2011 04:30 PM, Ed W wrote:
>
> [...]
>
>> It would appear that you can use a much lower powered system to
>> basically push jobs out to the processing machines in advance, this way
>> your bandwidth basically only needs to be:
>> size_of_job * num_machines / time_to_process_jobs
>
> This would be good. Matt's original argument suggested he needed this
> as his sustained bandwidth given the way the analysis proceeded.

And Joe has provided a nice mathematical model for quantifying it.

> If we assume that the processing time is T_p, and the communication time
> is T_c, ignoring other factors, the total time for 1 job is T_j = T_p +
> T_c. If T_c << T_p, then you can effectively ignore bandwidth related
> issues (and use a much smaller bandwidth system). For T_c << T_p, let's
> (for laughs) say T_c = 0.1 x T_p (e.g. communication time is 1/10th the
> processing time). Then even if you halved your bandwidth, and so doubled
> T_c, you would incur only about a 10% increase in your total execution
> time for a job.
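
As a rough check of the 10% figure above, here is a minimal Python sketch.
It assumes communication time scales inversely with bandwidth. The function
names and the sample value T_p = 100 are illustrative only and do not come
from the thread.

```python
def job_time(t_p, t_c):
    """Total time for one job: processing time plus communication time."""
    return t_p + t_c


def relative_slowdown(t_p, t_c, bandwidth_factor):
    """Fractional increase in job time when bandwidth is scaled by bandwidth_factor.

    Assumption: T_c is inversely proportional to bandwidth, so halving the
    bandwidth (bandwidth_factor = 0.5) doubles T_c.
    """
    base = job_time(t_p, t_c)
    degraded = job_time(t_p, t_c / bandwidth_factor)
    return (degraded - base) / base


t_p = 100.0        # illustrative processing time, arbitrary units
t_c = 0.1 * t_p    # communication time is 1/10th of processing time
increase = relative_slowdown(t_p, t_c, 0.5)
print(f"{increase:.1%}")  # prints 9.1%
```

With these numbers T_j goes from 1.1 x T_p to 1.2 x T_p, which is the
"about 10%" increase described in the quoted text.
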
>
> With Nmachines each with Ncores, you have Nmachines x Ncores jobs going
> on all at once. If T_c << T_p (as in the above example), then most of
> the time, on average, the machines will not be communicating. In fact,
> if we do a very rough first pass approximation to an answer (there are
> more accurate statistical models) for this, one would expect the network
> to be used a T_c/T_p fraction of the time by each process. Then the total
> consumption of data for a run (assuming all runs are *approximately* of
> equal duration) is
>
> D = B x T_c
>
> D being the amount of data in MB or GB, and B being the bandwidth
> expressed in MB/s or GB/s. Your effective bandwidth per run, Beff, is
> then given by
>
> D = Beff x T = Beff x (T_c + T_p)
>
> For Nmachines x Ncores jobs, Dtotal is the total data transferred:
>
> Dtotal = Nmachines x Ncores x D = Nmachines x Ncores x Beff
> x (T_c + T_p)
>
>
> You know Dtotal (aggregate data needed for run). You know Nmachines and
> Ncores. You know T_c and T_p (approximately). From this, solve for
> Beff. That's what you have to sustain (approximately).

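The model quoted above can be rearranged for Beff and evaluated directly. The
following is a minimal Python sketch under that model. The function names and
all sample numbers (10 machines, 8 cores, 88,000 MB, T_c = 10 s, T_p = 100 s)
are hypothetical, chosen only to make the arithmetic easy to follow. The
aggregate figure is simply Nmachines x Ncores x Beff, shown here on the
assumption that the storage side has to feed all jobs at once.

```python
def effective_bandwidth_per_job(d_total, n_machines, n_cores, t_c, t_p):
    """Solve Dtotal = Nmachines x Ncores x Beff x (T_c + T_p) for Beff."""
    jobs = n_machines * n_cores
    return d_total / (jobs * (t_c + t_p))


def aggregate_bandwidth(d_total, t_c, t_p):
    """Sum of Beff over all concurrent jobs: Dtotal / (T_c + T_p)."""
    return d_total / (t_c + t_p)


# Hypothetical inputs: data in MB, times in seconds.
d_total = 88_000.0
n_machines, n_cores = 10, 8
t_c, t_p = 10.0, 100.0

per_job = effective_bandwidth_per_job(d_total, n_machines, n_cores, t_c, t_p)
total = aggregate_bandwidth(d_total, t_c, t_p)
print(f"{per_job:.1f} MB/s per job")    # prints 10.0 MB/s per job
print(f"{total:.1f} MB/s aggregate")    # prints 800.0 MB/s aggregate
```
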
This assumes his application is threaded and scales linearly across
multiple cores. If not, running Ncores processes on each node should
achieve a similar result to the threaded case, provided the application
is written so that multiple process instances don't trip over each
other, for example by all using the same scratch file path or name.
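
One way to avoid the scratch file collision mentioned above is to give each
process instance its own scratch directory. The following is a minimal Python
sketch using the standard tempfile module. The helper name make_scratch_dir
and the "job-<pid>-" prefix are assumptions for illustration and say nothing
about how the application in question actually handles scratch space.

```python
import os
import shutil
import tempfile


def make_scratch_dir(base_dir=None):
    """Create a private scratch directory for the calling process.

    mkdtemp() always creates a new, uniquely named directory, so two
    instances never share a path even if they start at the same moment.
    The PID in the prefix is only there to make the owner easy to spot.
    """
    return tempfile.mkdtemp(prefix=f"job-{os.getpid()}-", dir=base_dir)


first = make_scratch_dir()
second = make_scratch_dir()
print(first != second)  # prints True

# Remove the scratch directories once the work is done.
shutil.rmtree(first)
shutil.rmtree(second)
```
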
--
Stan