From: Jeff Garzik <jeff@garzik.org>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: POHMELFS high performance network filesystem. Transactions, failover, performance.
Date: Tue, 13 May 2008 15:09:06 -0400
Message-ID: <4829E752.8030104@garzik.org>
In-Reply-To: <20080513174523.GA1677@2ka.mipt.ru>

Evgeniy Polyakov wrote:
> Hi.
> 
> I'm pleased to announce the POHMEL high performance network filesystem.
> POHMELFS stands for Parallel Optimized Host Message Exchange Layered File System.
> 
> Development status can be tracked in filesystem section [1].
> 
> This is a high performance network filesystem with local coherent cache of data
> and metadata. Its main goal is distributed parallel processing of data. Network 
> filesystem is a client transport. POHMELFS protocol was proven to be superior to
> NFS in lots of operations (if not all, then it is on the roadmap).
> 
> This release brings the following features:
>  * Fast transactions. System will wrap all writes into transactions, which
>  	will be resent to different (or the same) server in case of failure.
> 	Details in notes [1].
>  * Failover. It is now possible to provide a number of servers to be used in
>  	round-robin fashion when one of them dies. System will automatically
> 	reconnect to others and send transactions to them.
>  * Performance. Super fast (close to wire limit) metadata operations over
>  	the network. By courtesy of writeback cache and transactions the whole
> 	kernel archive can be untarred in 2-3 seconds (including sync) over
> 	a GigE link (wire limit! Not comparable to NFS).
> 
> Basic POHMELFS features:
>     * Local coherent (notes [5]) cache for data and metadata.
>     * Completely async processing of all events (hard and symlinks are the only 
>     	exceptions) including object creation and data reading.
>     * Flexible object architecture optimized for network processing. Ability to
>     	create long paths to an object and remove arbitrarily huge directories in a
> 	single network command.
>     * High performance is one of the main design goals.
>     * Very fast and scalable multithreaded userspace server. Being in userspace
>     	it works with any underlying filesystem and still is much faster than
> 	async in-kernel NFS one.
> 
> Roadmap includes:
>     * Server extension to allow storing data on multiple devices (like creating mirroring),
>     	first by saving data in several local directories (think about server, which mounted
> 	remote dirs over POHMELFS or NFS, and local dirs).
>     * Client/server extension to report lookup and readdir requests not only for local
>     	destination, but also to different addresses, so that reading/writing could be
> 	done from different nodes in parallel.
>     * Strong authentication and possible data encryption in network channel.
>     * Async writing of the data from receiving kernel thread into
>     	userspace pages via copy_to_user() (check development tracking
> 	blog for results).
> 
> One can grab sources from archive or git [2] or check homepage [3].
> Benchmark section can be found in the blog [4].
> 
> The nearest roadmap (scheduled for the end of the month) includes:
>  * Full transaction support for all operations (only writeback is
>  	guarded by transactions currently, default network state
> 	just reconnects to the same server).
>  * Data and metadata coherency extensions (in addition to existing
> 	commented object creation/removal messages). (next week)
>  * Server redundancy.

This continues to be a neat and interesting project :)

Where is the best place to look at client<->server protocol?

Are you planning to support the case where the server filesystem dataset 
does not fit entirely on one server?

What is your opinion of the Paxos algorithm?

	Jeff
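
The failover behaviour described in the quoted announcement (a transaction
is resent to another server, chosen round-robin, when the current one dies)
can be sketched roughly as below. This is only an illustration in Python and
not POHMELFS code: the FailoverClient class, the send() callback, the
ConnectionError convention and the server addresses are all assumptions made
for the example, and the real client lives in the kernel.

```python
from typing import Callable, List, Sequence


class AllServersFailed(Exception):
    """Raised when no configured server accepted the transaction."""


class FailoverClient:
    """Hypothetical client: resends a transaction to the next server
    in round-robin order when the current server fails."""

    def __init__(self, servers: Sequence[str],
                 send: Callable[[str, bytes], None]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self.servers: List[str] = list(servers)
        # Hypothetical transport: must raise ConnectionError on failure.
        self.send = send
        # Index of the server currently in use.
        self.current = 0

    def submit(self, transaction: bytes) -> str:
        """Send the transaction, trying each server at most once.

        Returns the address of the server that accepted it."""
        for _ in range(len(self.servers)):
            server = self.servers[self.current]
            try:
                self.send(server, transaction)
                return server
            except ConnectionError:
                # Treat the server as dead and rotate to the next one.
                self.current = (self.current + 1) % len(self.servers)
        raise AllServersFailed("no server accepted the transaction")


def demo() -> List[str]:
    """Simulate two servers dying one after the other."""
    dead = {"10.0.0.1"}

    def fake_send(server: str, payload: bytes) -> None:
        if server in dead:
            raise ConnectionError(server)

    client = FailoverClient(["10.0.0.1", "10.0.0.2", "10.0.0.3"], fake_send)
    first = client.submit(b"write-1")
    dead.add("10.0.0.2")
    second = client.submit(b"write-2")
    return [first, second]
```

In this sketch the client stays on the server that last succeeded, so a
healthy server is not abandoned on the next transaction. It does not model
the case where every server is down for a while and one later comes back.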




