From: Dave Chinner <david@fromorbit.com>
To: Stewart Smith <stewart@flamingspork.com>
Cc: Grozdan <neutrino8@gmail.com>, xfs@oss.sgi.com
Subject: Re: Transactional XFS?
Date: Thu, 16 Feb 2012 17:42:30 +1100
Message-ID: <20120216064230.GZ14132@dastard>
In-Reply-To: <87ehtvz6bp.fsf@flamingspork.com>
On Thu, Feb 16, 2012 at 04:38:02PM +1100, Stewart Smith wrote:
> On Thu, 16 Feb 2012 12:43:38 +1100, Dave Chinner <david@fromorbit.com> wrote:
> > Oh, so making some set of random user changes to random user data
> > have ACID properties? That's what databases are for, isn't it? :P
>
> Yep :)
>
> > I don't see us implementing anything like this in XFS anytime soon.
> > We are looking to add transaction grouping so that we can make
> > things that currently require multiple transactions (e.g. create a
> > file, add a default ACL) atomic, but I don't have any plans to
> > open the can of worms that is userspace controlled transactions any
> > time soon.
>
> The worst part is working out the semantics as to not break existing apps
> (without completely sacrificing concurrency).

That doesn't seem like a show stopper to me.

The show stopper I see is that it is basically impossible to do
arbitrarily large transactions in a filesystem - they are limited by
the size of the log. e.g. you can't have a user transaction that
writes more data or modifies more data than the log allows in a
single checkpoint/transaction. e.g. you can't just overwrite a 100MB
file in a transaction and expect it to work. It might work if you've
got a 2GB log, but if you've only got a 10MB log, then that
overwrite transaction is full of fail.
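
To put numbers on that, here is a toy model in Python. It is only a sketch and is not how XFS accounts for log space: it assumes every byte the transaction modifies has to fit in the log at once, and it ignores log record overhead entirely. The name fits_in_log is made up for illustration.

```python
MB = 1024 * 1024


def fits_in_log(txn_bytes: int, log_bytes: int) -> bool:
    """Toy check: can a transaction of this size commit in one checkpoint?

    Assumption: everything the transaction modifies must be held in the
    log at the same time, so it can never be larger than the log itself.
    Real logs also need headroom for record headers; that is ignored here.
    """
    return txn_bytes <= log_bytes


overwrite = 100 * MB                         # overwrite a 100MB file
big_log_ok = fits_in_log(overwrite, 2048 * MB)   # 2GB log
small_log_ok = fits_in_log(overwrite, 10 * MB)   # 10MB log
```

With these inputs the 2GB log accepts the overwrite and the 10MB log does not, which is the failure case described above.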

It's issues like that that doom the generic usefulness of
userspace controlled filesystem transactions as part of the normal
filesystem operation. If you need this sort of functionality, it has
to be layered over the top of the filesystem to avoid filesystem
atomicity limitations. i.e. another layer of tracking and
journalling. And at that point you're talking about implementing a
database on top of the filesystem in the filesystem....
> > We already have this upgrade rollback functionality in development
> > with none of that complexity - it uses filesystem snapshots so is
> > effectively filesystem independent and already works with yum and
> > btrfs. You don't need any special application support for this -
> > rollback from a failed upgrade is as simple as a reboot.
>
> The downside being you also roll back your logs and any other changes
> made during that time. On the whole though, it's probably sufficient.
That, IMO, is one of the good things about it. You go back to a
pristine condition, but still have the failed upgrade image that you
can mount and debug. The logs and all the failed state is still
intact in the upgrade image, and when you are done debugging it you
can blow it away and try again....
> > Sure, Microsoft have been trying to make their filesystem a database
> > for years. It's theoretically possible, but in practice they've
> > fallen short in every attempt in the past 15 years.
>
> err... try 20 years :)

Time gets away from me these days ;)

> It's funny in a way, sqlite succeeds at effectively doing this for an
> awful large number of applications.
/me nods
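
For reference, this is roughly what that looks like from an application, using Python's standard sqlite3 module. The table and the names make_db, transfer and balances are made up for illustration; the point is only that a group of changes either all lands or none of it does.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two example rows."""
    conn = sqlite3.connect(":memory:")
    with conn:
        conn.execute(
            "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
        )
    return conn


def balances(conn: sqlite3.Connection) -> dict:
    """Return {name: balance} for every row."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Move amount from src to dst atomically.

    'with conn' commits if the block finishes and rolls back if it raises,
    so a failed transfer leaves both rows untouched.
    """
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        (remaining,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if remaining < 0:
            raise ValueError("insufficient funds")
```

A transfer that would overdraw the source raises ValueError, and both updates are rolled back together.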
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com