* Transactional XFS?
From: Grozdan @ 2012-02-15 19:15 UTC
To: xfs

Hi,

I just finished watching the excellent speech of Dave Chinner at
linux.conf.au and I must say I'm impressed by the recent improvements
to XFS. Towards the end of the talk, Dave talked about upcoming
improvements on Metadata reliability and other features. What I'm
wondering about is if there are any plans in making XFS transactional
(fully atomic) like it is the case with recent NTFS versions on
Windows Vista and higher?

Thanks

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: Transactional XFS?
From: Dave Chinner @ 2012-02-16 0:22 UTC
To: Grozdan; +Cc: xfs

On Wed, Feb 15, 2012 at 08:15:46PM +0100, Grozdan wrote:
> Hi,
>
> I just finished watching the excellent speech of Dave Chinner at
> linux.conf.au and I must say I'm impressed by the recent improvements
> to XFS. Towards the end of the talk, Dave talked about upcoming
> improvements on Metadata reliability and other features. What I'm
> wondering about is if there are any plans in making XFS transactional
> (fully atomic) like it is the case with recent NTFS versions on
> Windows Vista and higher?

What do you mean by "fully atomic"? NTFS is not fully atomic - it
doesn't journal data so can lose data on a crash - so I'm not sure
what you mean here....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: Transactional XFS?
From: Stewart Smith @ 2012-02-16 1:01 UTC
To: Dave Chinner, Grozdan; +Cc: xfs

On Thu, 16 Feb 2012 11:22:37 +1100, Dave Chinner <david@fromorbit.com> wrote:
> On Wed, Feb 15, 2012 at 08:15:46PM +0100, Grozdan wrote:
> > Hi,
> >
> > I just finished watching the excellent speech of Dave Chinner at
> > linux.conf.au and I must say I'm impressed by the recent improvements
> > to XFS. Towards the end of the talk, Dave talked about upcoming
> > improvements on Metadata reliability and other features. What I'm
> > wondering about is if there are any plans in making XFS transactional
> > (fully atomic) like it is the case with recent NTFS versions on
> > Windows Vista and higher?
>
> What do you mean by "fully atomic"? NTFS is not fully atomic - it
> doesn't journal data so can lose data on a crash - so I'm not sure
> what you mean here....

There's another API in Windows that lets you do operations in an
all-or-nothing way. Originally this was scoped to be able to just add a
couple of API calls to the Windows file API and have it all "just work"
(imagine adding just three syscalls: begin(), commit(),
rollback()). This didn't really work out so well, and by the final Vista
release, it was a wholly different set of API calls (more like tx_begin,
tx_open, tx_read, tx_write... so you had to have code explicitly aware
of transactions).

AFAIK the current big user is Windows Update. That is, Windows Update
will either apply all its changes to the system or none. Think of being
able to hit the reset button halfway through a Windows update and have
everything "just work" and come back to a sane state. I've had a Linux
box crash during a dist-upgrade before... not pretty.

It's a neat idea, but as you can imagine, fraught with difficulties.

I think it'd be possible to do.. you know, if you lock a number of FS
and VFS devs in a room with database people for a month or so we may
theoretically solve nearly all the problems....

-- 
Stewart Smith
* Re: Transactional XFS?
From: Dave Chinner @ 2012-02-16 1:43 UTC
To: Stewart Smith; +Cc: Grozdan, xfs

On Thu, Feb 16, 2012 at 12:01:01PM +1100, Stewart Smith wrote:
> On Thu, 16 Feb 2012 11:22:37 +1100, Dave Chinner <david@fromorbit.com> wrote:
> > On Wed, Feb 15, 2012 at 08:15:46PM +0100, Grozdan wrote:
> > > Hi,
> > >
> > > I just finished watching the excellent speech of Dave Chinner at
> > > linux.conf.au and I must say I'm impressed by the recent improvements
> > > to XFS. Towards the end of the talk, Dave talked about upcoming
> > > improvements on Metadata reliability and other features. What I'm
> > > wondering about is if there are any plans in making XFS transactional
> > > (fully atomic) like it is the case with recent NTFS versions on
> > > Windows Vista and higher?
> >
> > What do you mean by "fully atomic"? NTFS is not fully atomic - it
> > doesn't journal data so can lose data on a crash - so I'm not sure
> > what you mean here....
>
> There's another API in Windows that lets you do operations in an
> all-or-nothing way. Originally this was scoped to be able to just add a
> couple of API calls to the Windows file API and have it all "just work"
> (imagine adding just three syscalls: begin(), commit(),
> rollback()). This didn't really work out so well, and by the final Vista
> release, it was a wholly different set of API calls (more like tx_begin,
> tx_open, tx_read, tx_write... so you had to have code explicitly aware
> of transactions).

Oh, so making some set of random user changes to random user data
have ACID properties? That's what databases are for, isn't it? :P

I don't see us implementing anything like this in XFS anytime soon.
We are looking to add transaction grouping so that we can make
things that currently require multiple transactions (e.g. create a
file, add a default ACL) atomic, but I don't have any plans to
open the can of worms that is userspace controlled transactions any
time soon.

> AFAIK the current big user is Windows Update. That is, Windows Update
> will either apply all its changes to the system or none. Think of being
> able to hit the reset button halfway through a Windows update and have
> everything "just work" and come back to a sane state. I've had a Linux
> box crash during a dist-upgrade before... not pretty.
>
> It's a neat idea, but as you can imagine, fraught with difficulties.

We already have this upgrade rollback functionality in development
with none of that complexity - it uses filesystem snapshots so is
effectively filesystem independent and already works with yum and
btrfs. You don't need any special application support for this -
rollback from a failed upgrade is as simple as a reboot.

> I think it'd be possible to do.. you know, if you lock a number of FS
> and VFS devs in a room with database people for a month or so we may
> theoretically solve nearly all the problems....

Sure, Microsoft have been trying to make their filesystem a database
for years. It's theoretically possible, but in practice they've
fallen short in every attempt in the past 15 years.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: Transactional XFS?
From: Stewart Smith @ 2012-02-16 5:38 UTC
To: Dave Chinner; +Cc: Grozdan, xfs

On Thu, 16 Feb 2012 12:43:38 +1100, Dave Chinner <david@fromorbit.com> wrote:
> Oh, so making some set of random user changes to random user data
> have ACID properties? That's what databases are for, isn't it? :P

Yep :)

> I don't see us implementing anything like this in XFS anytime soon.
> We are looking to add transaction grouping so that we can make
> things that currently require multiple transactions (e.g. create a
> file, add a default ACL) atomic, but I don't have any plans to
> open the can of worms that is userspace controlled transactions any
> time soon.

The worst part is working out the semantics so as not to break existing
apps (without completely sacrificing concurrency).

> We already have this upgrade rollback functionality in development
> with none of that complexity - it uses filesystem snapshots so is
> effectively filesystem independent and already works with yum and
> btrfs. You don't need any special application support for this -
> rollback from a failed upgrade is as simple as a reboot.

The downside being you also roll back your logs and any other changes
made during that time. On the whole though, it's probably sufficient.

> Sure, Microsoft have been trying to make their filesystem a database
> for years. It's theoretically possible, but in practice they've
> fallen short in every attempt in the past 15 years.

err... try 20 years :)

It's funny in a way, sqlite succeeds at effectively doing this for an
awfully large number of applications.

-- 
Stewart Smith
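As an illustration of the sqlite remark in the message above: a minimal
Python sketch of an all-or-nothing update done in a database rather than
in the filesystem. The table name `files`, the helpers `apply_changes`
and `snapshot`, and the `fail_after` crash simulation are all
hypothetical names made up for this example; only the standard `sqlite3`
module is assumed.

```python
import sqlite3


def open_store():
    """Create an in-memory store that maps a file name to its content."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files(name TEXT PRIMARY KEY, content TEXT)")
    conn.commit()
    return conn


def apply_changes(conn, changes, fail_after=None):
    """Apply (name, content) pairs as one transaction.

    fail_after is a hypothetical knob that simulates a failure before the
    change at that index. Returns True if everything was committed, False
    if the whole batch was rolled back.
    """
    try:
        # Used as a context manager, the connection commits if the block
        # finishes and rolls back if an exception escapes from it.
        with conn:
            for i, (name, content) in enumerate(changes):
                if fail_after is not None and i == fail_after:
                    raise RuntimeError("simulated failure mid-update")
                conn.execute(
                    "INSERT OR REPLACE INTO files(name, content) VALUES (?, ?)",
                    (name, content),
                )
        return True
    except RuntimeError:
        return False


def snapshot(conn):
    """Return the current contents as a plain dict."""
    return dict(conn.execute("SELECT name, content FROM files"))
```

A batch that fails halfway leaves the earlier state untouched, which is
the behaviour the thread attributes to Windows Update. The transaction
logic lives entirely in sqlite, not in the filesystem underneath it.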
* Re: Transactional XFS?
From: Dave Chinner @ 2012-02-16 6:42 UTC
To: Stewart Smith; +Cc: Grozdan, xfs

On Thu, Feb 16, 2012 at 04:38:02PM +1100, Stewart Smith wrote:
> On Thu, 16 Feb 2012 12:43:38 +1100, Dave Chinner <david@fromorbit.com> wrote:
> > Oh, so making some set of random user changes to random user data
> > have ACID properties? That's what databases are for, isn't it? :P
>
> Yep :)
>
> > I don't see us implementing anything like this in XFS anytime soon.
> > We are looking to add transaction grouping so that we can make
> > things that currently require multiple transactions (e.g. create a
> > file, add a default ACL) atomic, but I don't have any plans to
> > open the can of worms that is userspace controlled transactions any
> > time soon.
>
> The worst part is working out the semantics so as not to break existing
> apps (without completely sacrificing concurrency).

That doesn't seem like a show stopper to me.

The part that I see is that it is basically impossible to do
arbitrarily large transactions in a filesystem - they are limited by
the size of the log. e.g. you can't have a user transaction that
writes more data or modifies more data than the log allows in a
single checkpoint/transaction. e.g. you can't just overwrite a 100MB
file in a transaction and expect it to work. It might work if you've
got a 2GB log, but if you've only got a 10MB log, then that
overwrite transaction is full of fail.

It's issues like that that doom the generic usefulness of
userspace controlled filesystem transactions as part of the normal
filesystem operation. If you need this sort of functionality, it has
to be layered over the top of the filesystem to avoid filesystem
atomicity limitations. i.e. another layer of tracking and
journalling. And at that point you're talking about implementing a
database on top of the filesystem in the filesystem....

> > We already have this upgrade rollback functionality in development
> > with none of that complexity - it uses filesystem snapshots so is
> > effectively filesystem independent and already works with yum and
> > btrfs. You don't need any special application support for this -
> > rollback from a failed upgrade is as simple as a reboot.
>
> The downside being you also roll back your logs and any other changes
> made during that time. On the whole though, it's probably sufficient.

That, IMO, is one of the good things about it. You go back to a
pristine condition, but still have the failed upgrade image that you
can mount and debug. The logs and all the failed state is still
intact in the upgrade image, and when you are done debugging it you
can blow it away and try again....

> > Sure, Microsoft have been trying to make their filesystem a database
> > for years. It's theoretically possible, but in practice they've
> > fallen short in every attempt in the past 15 years.
>
> err... try 20 years :)

Time gets away from me these days ;)

> It's funny in a way, sqlite succeeds at effectively doing this for an
> awfully large number of applications.

/me nods

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
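The point above about layering atomicity over the filesystem can be
illustrated with the common write-to-temporary-file, fsync, then rename
pattern. This is a sketch of a general userspace technique and not
something proposed in the thread. It assumes a POSIX system where
rename within one directory is atomic, and the function name
`atomic_overwrite` is made up for this example. It covers only the
single-file overwrite case; it does not give multi-file transactions.

```python
import os
import tempfile


def atomic_overwrite(path, data):
    """Replace the contents of path with data (bytes), all or nothing.

    Readers see either the complete old file or the complete new file.
    The new data is written to a separate temporary file, so its size is
    not bounded by the filesystem log.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target so
    # that the final step is a rename and not a copy. Note that mkstemp
    # creates it with mode 0600, which may differ from the original file.
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # data is on stable storage before the rename
        os.replace(tmp, path)  # atomic switch from old contents to new
    except BaseException:
        # On any failure the original file is left untouched.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    # Persist the rename itself by syncing the containing directory.
    dfd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
```

The price is a full copy of the file for every update, which is one
example of the extra tracking layer and its performance cost that the
message describes.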
* Re: Transactional XFS?
From: Stewart Smith @ 2012-02-17 4:40 UTC
To: Dave Chinner; +Cc: Grozdan, xfs

On Thu, 16 Feb 2012 17:42:30 +1100, Dave Chinner <david@fromorbit.com> wrote:
> > The worst part is working out the semantics so as not to break existing
> > apps (without completely sacrificing concurrency).
>
> That doesn't seem like a show stopper to me.
>
> The part that I see is that it is basically impossible to do
> arbitrarily large transactions in a filesystem - they are limited by
> the size of the log. e.g. you can't have a user transaction that
> writes more data or modifies more data than the log allows in a
> single checkpoint/transaction. e.g. you can't just overwrite a 100MB
> file in a transaction and expect it to work. It might work if you've
> got a 2GB log, but if you've only got a 10MB log, then that
> overwrite transaction is full of fail.

We have this problem too. None of the solutions are particularly pretty,
and certainly do have a performance impact.

> It's issues like that that doom the generic usefulness of
> userspace controlled filesystem transactions as part of the normal
> filesystem operation. If you need this sort of functionality, it has
> to be layered over the top of the filesystem to avoid filesystem
> atomicity limitations. i.e. another layer of tracking and
> journalling. And at that point you're talking about implementing a
> database on top of the filesystem in the filesystem....

As I said... it's tricky to solve all the problems :)

-- 
Stewart Smith
* Re: Transactional XFS?
From: Peter Grandi @ 2012-02-16 22:10 UTC
To: Linux fs XFS

[ ... ]

> Oh, so making some set of random user changes to random user data
> have ACID properties? That's what databases are for, isn't it? :P

I am going to use this and in particular "That's what databases
are for, isn't it?" as a quote to throw at people who try to use
filesystems as database managers, usually with very many very
small files (also known as "records" to database people), but
not only.

>> I think it'd be possible to do.. you know, if you lock a
>> number of FS and VFS devs in a room with database people for
>> a month or so we may theoretically solve nearly all the
>> problems....

The DBMS people have given up long, long ago. At least since the
article by Stonebraker mentioned here:

  http://WWW.sabi.co.UK/blog/anno05-4th.html#051012d

Anyhow Oracle has sponsored two filesystem designs, one being
OCFS2, which is pretty decent, targeted at DBMS storage and does
not have ACID as such, and one being BTRFS which is not targeted
at DBMS storage and that has snapshots for rollback of failed
transactions.

> Sure, Microsoft have been trying to make their filesystem a
> database for years. It's theoretically possible, but in
> practice they've fallen short in every attempt in the past 15
> years.

I think it would be easier to do the opposite, and there have
been indeed filesystems implemented on top of DBMSes (with the
DBMS storing their data directly on top of block devices).
* Re: Transactional XFS?
From: Matthias Schniedermeyer @ 2012-02-16 12:01 UTC
To: Grozdan; +Cc: xfs

On 15.02.2012 20:15, Grozdan wrote:
> Hi,
>
> I just finished watching the excellent speech of Dave Chinner at
> linux.conf.au and I must say I'm impressed by the recent improvements
> to XFS. Towards the end of the talk, Dave talked about upcoming
> improvements on Metadata reliability and other features. What I'm
> wondering about is if there are any plans in making XFS transactional
> (fully atomic) like it is the case with recent NTFS versions on
> Windows Vista and higher?

You could argue about whether it is NTFS doing the work at all.

I glanced over a document describing it, and as far as I remember the
KTM component does all the work and stores the changes in a
specialized database. So effectively you have a shim at the VFS layer
that lets "others" see the old data while your application can see the
new data, and when you "commit", all the filesystem changes stored in
the database are applied to the filesystem.

As far as I understand it you wouldn't necessarily need support for
that in the filesystem itself, you could do it at the VFS level. So one
of the union/layered "things" should be able to do that. IOW, store all
the changes necessary and "replay" the changes to the actual filesystem
when doing the commit. (Or the opposite, depending on whether you
expect a commit or a rollback as the default operation at transaction
end.)

Or BTRFS should be able to do that, when they implement snapshots at
directory level (AFAIR BTRFS currently supports snapshots at subvolume
level, so if you use a subvolume you could already do that). You would
snapshot the dir, do your work in the snapshot and switch the original
dir with the snapshot on commit. Although I don't know if you can
switch a mounted subvolume, or if it has to be unmounted first. Having
to do an umount might be problematic, depending on use-case.

See you

-- 
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real
Programmer wants a "you asked for it, you got it" text editor --
complicated, cryptic, powerful, unforgiving, dangerous.
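The "store the changes, then replay them on commit" idea from the
message above can be sketched in userspace as a toy overlay. This is
only an illustration under stated assumptions: the class name
`StagedTransaction` and its methods are invented for this example, it is
unrelated to how KTM or any union filesystem is actually implemented,
and the replay in `commit()` is not crash-safe across several files (a
real implementation would need its own journal to finish or undo a
half-done replay).

```python
import os
import shutil
import tempfile


class StagedTransaction:
    """Toy overlay: writes go to a staging dir, commit replays them."""

    def __init__(self, root):
        self.root = os.path.abspath(root)
        # Stage inside the target tree so that it is on the same
        # filesystem and os.replace() is a rename, not a copy.
        self.stage = tempfile.mkdtemp(dir=self.root, prefix=".txn-")
        self.changed = []  # relative paths written in this transaction

    def _staged(self, relpath):
        return os.path.join(self.stage, relpath)

    def write(self, relpath, data):
        """Record a new version of relpath; the real file is untouched."""
        staged = self._staged(relpath)
        os.makedirs(os.path.dirname(staged), exist_ok=True)
        with open(staged, "wb") as f:
            f.write(data)
        if relpath not in self.changed:
            self.changed.append(relpath)

    def read(self, relpath):
        """The transaction sees its own changes; others still see old data."""
        staged = self._staged(relpath)
        path = staged if os.path.exists(staged) else os.path.join(self.root, relpath)
        with open(path, "rb") as f:
            return f.read()

    def commit(self):
        """Replay the staged changes onto the real tree, file by file."""
        for relpath in self.changed:
            target = os.path.join(self.root, relpath)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            os.replace(self._staged(relpath), target)
        self.rollback()  # nothing is left to undo; this only cleans up

    def rollback(self):
        """Discard all staged changes."""
        shutil.rmtree(self.stage, ignore_errors=True)
        self.changed = []
```

Choosing to stage new data and replay on commit makes rollback cheap and
commit expensive. The opposite choice mentioned in the message (modify
in place and keep undo records) would reverse that trade-off.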