From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Mike Snitzer <snitzer@gmail.com>
Cc: lkml <linux-kernel@vger.kernel.org>,
netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [0/4] DST: Distributed storage.
Date: Tue, 4 Dec 2007 19:00:40 +0300
Message-ID: <20071204160040.GA8055@2ka.mipt.ru>
In-Reply-To: <170fa0d20712040725o506a47cayb137993041ec3a63@mail.gmail.com>

Hi Mike.

On Tue, Dec 04, 2007 at 10:25:29AM -0500, Mike Snitzer (snitzer@gmail.com) wrote:
> Thanks for your continued work on DST. I'd like to know if you've
> thought further about how synchronous mirroring would be best
> implemented with DST.
>
> You shared your views some time ago via comments on your blog:
> http://tservice.net.ru/~s0mbre/blog/devel/dst/2007_11_05.html
>
> At that time you were saying you'd add a sync bit to the request
> structure that is sent to remote nodes. I'd imagine this would also
> require ordering of the block io, no? Is order guaranteed when the
> requests are submitted over the DST protocol? Otherwise how can you
> ensure a valid remote mirror (in the case of network disconnects,
> etc)?
>
> Guaranteeing consistent data on all members of a mirror is important.
> The main question is: what mechanisms _should_ be used in DST to
> provide this consistency? And do you have a timeframe for when DST
> might support such mechanisms for consistent data?
>
> For the purpose of this discussion please assume that the disk cache
> is either write-through or battery-backed.

In this case the sync bit would only imply waiting until all pending
requests have reached the remote nodes. This is not implemented yet.

The DST core guarantees the order of requests for a given node;
requests for/from different nodes can be performed in parallel.
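
As an illustration of the ordering property described above, here is a
minimal sketch, assuming one FIFO per remote node. The names
(struct node_queue, node_submit(), node_next(), MAX_PENDING) are
hypothetical and are not taken from the DST patches; the sketch only
shows that order is kept within a node while nodes stay independent of
each other.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical per-node FIFO, not a DST structure.
 * MAX_PENDING must be a power of two so that the free-running
 * head/tail counters still index correctly after they wrap.
 */
#define MAX_PENDING 16

struct node_queue {
	unsigned long req[MAX_PENDING];	/* request ids in submission order */
	size_t head;			/* next request to hand out */
	size_t tail;			/* next free slot */
};

/* Queue a request for one node. Returns 0 on success, -1 if full. */
static int node_submit(struct node_queue *q, unsigned long id)
{
	if (q->tail - q->head == MAX_PENDING)
		return -1;
	q->req[q->tail % MAX_PENDING] = id;
	q->tail++;
	return 0;
}

/* Take the oldest pending request of a node. Returns -1 if empty. */
static int node_next(struct node_queue *q, unsigned long *id)
{
	if (q->head == q->tail)
		return -1;
	assert(q->tail - q->head <= MAX_PENDING);
	*id = q->req[q->head % MAX_PENDING];
	q->head++;
	return 0;
}
```

With two such queues, requests submitted to node A always come out of
A's queue in submission order, no matter how they are interleaved with
requests for node B.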

In the more generic case it should wait until the data has reached the
media, i.e. perform a flush.

I did not implement that because no multiple-device system in Linux
actually supports barriers (please note that in this discussion the
sync bit actually means a barrier in the block layer).

The protocol changes are fairly trivial and completely transparent to
the DST core - only the remote targets (both userspace and
kernelspace) need to be changed to invoke the ->issue_flush_fn()
callback for the underlying device when needed, and to not process new
requests until the flush has completed.

Thus the barrier bit can be attached to data packets, or sent as a
standalone request without data.
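
The remote-target behaviour described above could look roughly like the
following sketch. It is a userspace model under stated assumptions, not
the DST protocol: REQ_BARRIER, struct request, struct target and
target_process() are hypothetical names, and target_flush() merely
stands in for invoking ->issue_flush_fn() on the underlying device. The
flush runs synchronously inside target_process(), so no later request
is handled before it completes.

```c
#include <assert.h>
#include <stddef.h>

#define REQ_BARRIER 0x1u	/* hypothetical barrier bit */

struct request {
	unsigned int flags;	/* REQ_BARRIER or 0 */
	size_t len;		/* payload size; 0 for a barrier-only request */
};

struct target {
	size_t written;		/* bytes accepted, possibly still cached */
	size_t durable;		/* bytes known to have reached the media */
	unsigned int flushes;	/* number of flushes issued */
};

/* Stand-in for calling ->issue_flush_fn() on the underlying device. */
static void target_flush(struct target *t)
{
	assert(t->durable <= t->written);
	t->durable = t->written;
	t->flushes++;
}

/*
 * Handle one request. A barrier request (with or without data) is
 * followed by a flush before the function returns, so the caller
 * cannot start the next request until the flush has completed.
 */
static void target_process(struct target *t, const struct request *r)
{
	t->written += r->len;
	if (r->flags & REQ_BARRIER)
		target_flush(t);
}
```

A write carrying the barrier bit and a data-less barrier request go
through the same path here; the only difference is that the latter has
len set to 0.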

DST will continue to collect data, but will not send it to the remote
nodes (actually it can send it, but the data will not be processed and
will stay in the remote's receive queue). This is the main concern with
barriers - whether or not the main node should continue to process
requests while previous ones have not yet reached the media - and it is
why I have not implemented barriers yet.

> regards,
> Mike
--
Evgeniy Polyakov
Thread overview: 20+ messages
[not found] <aqqqqasdzxczc036@2ka.mipt.ru>
2007-11-29 12:53 ` [0/4] dst: Distributed storage Evgeniy Polyakov
2007-11-29 12:53 ` [1/4] dst: Distributed storage documentation Evgeniy Polyakov
2007-11-29 12:53 ` [2/4] dst: Core distributed storage files Evgeniy Polyakov
2007-11-29 12:53 ` [3/4] dst: Network state machine Evgeniy Polyakov
2007-11-29 12:53 ` [4/4] dst: Algorithms used in distributed storage Evgeniy Polyakov
2007-12-03 4:50 ` [1/4] dst: Distributed storage documentation Matt Mackall
2007-12-03 11:16 ` Evgeniy Polyakov
2007-12-04 14:37 ` [0/4] DST: Distributed storage Evgeniy Polyakov
2007-12-04 14:37 ` [1/4] DST: Distributed storage documentation Evgeniy Polyakov
2007-12-04 14:37 ` [2/4] DST: Core distributed storage files Evgeniy Polyakov
2007-12-04 14:37 ` [3/4] DST: Network state machine Evgeniy Polyakov
2007-12-04 14:37 ` [4/4] DST: Algorithms used in distributed storage Evgeniy Polyakov
2007-12-04 15:25 ` [0/4] DST: Distributed storage Mike Snitzer
2007-12-04 16:00 ` Evgeniy Polyakov [this message]
2007-12-04 16:56 ` Christoph Hellwig
2007-12-04 17:21 ` Evgeniy Polyakov
[not found] <11qqqasdzxczc036@2ka.mipt.ru>
2007-12-10 11:47 ` Evgeniy Polyakov
[not found] <20071214133747.GC31971@2ka.mipt.ru>
2007-12-17 15:03 ` Evgeniy Polyakov
2007-12-18 1:00 ` David Chinner
2007-12-18 11:08 ` Evgeniy Polyakov