From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ed W
Subject: Re: high throughput storage server?
Date: Sun, 27 Feb 2011 21:30:48 +0000
Message-ID: <4D6AC288.20101@wildgooses.com>
References: <20110215044434.GA9186@septictank.raw-sewage.fake>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20110215044434.GA9186@septictank.raw-sewage.fake>
Sender: linux-raid-owner@vger.kernel.org
To: Matt Garman , Mdadm
List-Id: linux-raid.ids

Your application appears to be an implementation of a queue processing
system, i.e. each machine pulls a file down, processes it, gets the
next file, and so on?

Can you share some information on:
- the size of the files you pull down (I saw something in another post)
- how long each machine takes to process each file
- whether there is any dependency between the processing machines, e.g.
  can each machine operate completely independently of the others and
  start its job when it wishes, or does it need to sync?

Given the tentative assumptions that:
- processing each file takes many multiples of the time needed to
  download the file, and
- files are processed independently

it would appear that you can use a much lower-powered system to push
jobs out to the processing machines in advance. This way your aggregate
bandwidth only needs to be:

    size_of_job * num_machines / time_to_process_jobs

So if the time to process jobs is significant, then you have quite some
time to push the next job out to each machine's local storage, ready
for when it finishes the current one.

Firstly, is this architecture workable? If so, then you have some new
performance parameters to target for the storage architecture.

Good luck

Ed W
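
PS: the bandwidth estimate above can be sketched as a few lines of Python. This is only a rough illustration. The function names and the example figures (1 GB jobs, 50 machines, 10 minutes per job) are assumptions made up for the sketch, not numbers from this thread, and it assumes every machine is kept continuously busy with jobs of equal size.

```python
def required_bandwidth(size_of_job_bytes, num_machines, time_to_process_s):
    """Aggregate bytes/second the job server must sustain.

    Implements: size_of_job * num_machines / time_to_process_jobs
    """
    if time_to_process_s <= 0:
        raise ValueError("time_to_process_s must be positive")
    return size_of_job_bytes * num_machines / time_to_process_s


def to_megabits_per_second(bytes_per_second):
    """Convert bytes/second to megabits/second (1 Mbit = 10**6 bits)."""
    return bytes_per_second * 8 / 1e6


if __name__ == "__main__":
    # Hypothetical workload: 1 GB per job, 50 machines, 600 s per job.
    bw = required_bandwidth(10**9, 50, 600)
    print(f"{to_megabits_per_second(bw):.0f} Mbit/s aggregate")
```

With those made-up figures the server needs to sustain roughly 667 Mbit/s in aggregate. The longer each job takes to process relative to its size, the lower that number gets.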