From: Jeff Garzik <jeff@garzik.org>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Jamie Lokier <jamie@shareable.org>, Sage Weil <sage@newdream.net>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: POHMELFS high performance network filesystem. Transactions, failover, performance.
Date: Wed, 14 May 2008 16:37:20 -0400
Message-ID: <482B4D80.5080808@garzik.org>
In-Reply-To: <20080514193234.GA10165@2ka.mipt.ru>

Evgeniy Polyakov wrote:
> No, server to connect is the server, which stores data. By addition it
> will also store it to some other places according to distributed
> algorithm (like weaver, raid, mirror, whatever).
[...]
> Sure the less number of machines between client and storage we have, the
> faster and more robust we are.
>
> Either client has to write data to all servers, or it has to write it to
> one and wait until that server will broadcast it further (to quorum or any
> number of machines it wants). Having pure client to think to what
> servers it has to put its data is a bit wrong (if not saying more),
> since it has to join not only data network, but also control one, to
> check that some servers are alive or not, to be able not to race, when
> server is recovering and so on...

Quite true. It is a trade-off: additional complexity in the client
permits reduced latency and increased throughput. But is the additional
complexity -- including administrative and access control headaches --
worth it? As you say, the "complex" clients must join the data network.

Hardware manufacturers are putting so much effort into zero-copy and
RDMA. The client-to-many approach mimics that trend by minimizing
latency and data copying (and permitting use of more exotic or unusual
hardware).

But the client-to-many approach is not as complex as you make out. A
key attribute is simply for a client to be able to store new objects and
metadata on multiple servers in parallel. Once the data is stored
redundantly, the metadata controller may take quick action to
commit/abort the transaction. You can even shortcut the process further
by having the replicas send confirmations to the metadata controller.
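To make the flow above concrete, here is a minimal in-memory sketch of a client that stores one object on several servers in parallel and then asks a metadata controller to commit or abort. Everything in it is an assumption made for illustration: StorageServer, MetadataController, client_write and the min_replicas threshold are hypothetical names, not part of POHMELFS or any real protocol, and network I/O is replaced by plain method calls.

```python
from concurrent.futures import ThreadPoolExecutor


class StorageServer:
    """Hypothetical in-memory stand-in for one data server."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.objects = {}

    def store(self, key, data):
        # A real server would persist the data and reply over the network.
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        self.objects[key] = data
        return self.name


class MetadataController:
    """Hypothetical controller that commits once enough replicas exist."""

    def __init__(self, min_replicas):
        self.min_replicas = min_replicas
        self.committed = {}

    def finish(self, key, confirmed):
        # Commit if the object is stored redundantly, otherwise abort.
        # Cleaning up partial copies after an abort is left out here.
        if len(confirmed) >= self.min_replicas:
            self.committed[key] = sorted(confirmed)
            return "commit"
        return "abort"


def client_write(key, data, servers, controller):
    """Store data on all servers in parallel, then request commit/abort."""
    confirmed = []
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [pool.submit(s.store, key, data) for s in servers]
        for future in futures:
            try:
                confirmed.append(future.result())
            except ConnectionError:
                pass  # a failed replica simply does not confirm
    return controller.finish(key, confirmed)
```

In this sketch the client collects the confirmations itself and hands them to the controller. The shortcut mentioned above would instead have each replica report to the controller directly, which saves the client one round of bookkeeping.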

That said, the biggest distributed systems seem to inevitably grow their
own "front end server" layer. Clients connect to N caching/application
servers, each of which behaves as you describe: the caching/app server
connects to the control and data networks, and performs the necessary
load/store operations.

Personally, I think the simplest thing for _users_ is where
semi-smart clients open multiple connections to an amorphous cloud of
servers, where the cloud is self-optimizing, self-balancing, and
self-repairing internally.

Jeff