* v4 transaction design
@ 2002-09-08 23:31 Tobias Oberstein
2002-09-09 11:08 ` Hans Reiser
0 siblings, 1 reply; 6+ messages in thread
From: Tobias Oberstein @ 2002-09-08 23:31 UTC (permalink / raw)
To: reiserfs-list
I have a couple of questions regarding the v4 design. In particular
with respect to transaction support.
The quotes are taken from this document: http://www.namesys.com/txn-doc.html
OK, .. regarding syntax:
1. how will the filesystem API be extended to support user controlled
transaction management?
* with new syscalls?
* with ioctl()'s?
2. will the new API also provide for 2 phase commits
(so that the filesystem can act as a XA resource)?
Note: even if there is no initial implementation, defining
or planning the hooks now might be a good idea
.. and the semantics:
"Persons familiar with the database literature will note that these
definitions [transcrash] do not imply isolation or serializability
between processes. Isolation requires the ability to undo a sequence
of operations when lock conflicts cause a deadlock to occur."
Let me first give a personal impression: IMHO the term "transcrash"
is misleading and may easily distract people not looking behind the
words. Crash is evil. But I suppose you chose that one because
transcrashes aren't transactions semantically? I admit, naming the
"stuff" transaction could therefore also be misleading.
But now the real question:
Have you considered multi-version concurrency control
(maintaining multiple versions of an object) to provide
some level ("READ COMMITTED") of isolation? This would be
enough for many apps. It's also the default level in Oracle.
Anyway, in database terminology .. what's the isolation level
you intend to support: "READ UNCOMMITTED"?
"Rollback is the ability to abort and undo the effects of the operations
in an uncommitted transcrash. Transcrashes do not provide isolation,
which is needed to support separate rollback of separate transcrashes.
We only support unified rollback of all transcrashes in progress at the
time of crash recovery."
Does this mean an application cannot abort_tx() at will, but
transactions will only be (automatically) rolled back during recovery
(and then all uncommitted transactions will be undone)?
"However, our architecture is designed to support
separate, concurrent atoms so that it can be expanded to implement fully
isolated transactions in the future."
Are you referring to the interface?
greets,
Tobias.
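For readers unfamiliar with the two-phase commit that question 2 asks about, here is a minimal sketch of the protocol shape, in Python. It is not the Reiser4 API, which the thread says does not exist yet; the names `Participant`, `Vote` and `two_phase_commit` are hypothetical and chosen only for illustration. Each participant (for example, a filesystem acting as a resource manager) first votes in a prepare phase, and the coordinator commits only if every vote is positive.

```python
from enum import Enum, auto


class Vote(Enum):
    COMMIT = auto()
    ABORT = auto()


class Participant:
    """Hypothetical resource manager (for example, a filesystem)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: promise that a later commit will succeed, then vote.
        if self.can_commit:
            self.state = "prepared"
            return Vote.COMMIT
        self.state = "aborted"
        return Vote.ABORT

    def commit(self):
        # Phase 2, success path: only legal after a successful prepare.
        assert self.state == "prepared"
        self.state = "committed"

    def rollback(self):
        # Phase 2, failure path: undo whatever was prepared.
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all rolled back."""
    votes = [p.prepare() for p in participants]
    if all(v is Vote.COMMIT for v in votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

The point of the "hooks" mentioned in the note is the separate prepare step: a filesystem that can only commit or not commit in one step cannot take part in such a protocol.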
* Re: v4 transaction design
2002-09-08 23:31 v4 transaction design Tobias Oberstein
@ 2002-09-09 11:08 ` Hans Reiser
2002-09-09 11:25 ` Nikita Danilov
0 siblings, 1 reply; 6+ messages in thread
From: Hans Reiser @ 2002-09-09 11:08 UTC (permalink / raw)
To: Tobias Oberstein; +Cc: reiserfs-list
Tobias Oberstein wrote:
>I have a couple of questions regarding the v4 design. In particular
>with respect to transaction support.
>
>The quotes are taken from this document http://www.namesys.com/txn-doc.html
>
>OK, .. regarding syntax:
>
>1. how will the filesystem API be extended to support user controlled
>   transaction management?
>
>   * with new syscalls?

sys_reiser4(), a new system call.

>   * with ioctl()'s?

That would be uglier.

>2. will the new API also provide for 2 phase commits

yes.

>   (so that the filesystem can act as a XA resource)?

what is that?

>   Note: even if there is no initial implementation, defining
>   or planning the hooks now might be a good idea
>
>.. and the semantics:
>
>"Persons familiar with the database literature will note that these
>definitions [transcrash] do not imply isolation or serializability
>between processes. Isolation requires the ability to undo a sequence
>of operations when lock conflicts cause a deadlock to occur."
>
>Let me first give a personal impression: IMHO the term "transcrash"
>is misleading and may easily distract people not looking behind the
>words. Crash is evil. But I suppose you chose that one because
>transcrashes aren't transactions semantically? I admit, naming the
>"stuff" transaction could therefore also be misleading.

In the paper I am writing I just use the term atomic transaction. Look
for the docs on this to change a lot between now and January....

>But now the real question:
>
>Have you considered multi-version concurrency control
>(maintaining multiple versions of an object) to provide
>some level ("READ COMMITTED") of isolation? This would be
>enough for many apps. It's also the default level in Oracle.

Yes, it is appropriate to have that. We don't have someone implementing
it yet though....

>Anyway, in database terminology .. what's the isolation level
>you intend to support: "READ UNCOMMITTED"?
>
>"Rollback is the ability to abort and undo the effects of the operations
>in an uncommitted transcrash. Transcrashes do not provide isolation,
>which is needed to support separate rollback of separate transcrashes.
>We only support unified rollback of all transcrashes in progress at the
>time of crash recovery."
>
>Does this mean an application cannot abort_tx() at will, but
>transactions will only be (automatically) rolled back during recovery
>(and then all uncommitted transactions will be undone)?

There will be atomic transactions, and isolated transactions, and only
isolated transactions will offer independent rollback. Only isolated
transactions will be suitable for untrusted users.

Atomic transactions are implemented except for the API. Isolated
transactions are farther away.

>"However, our architecture is designed to support
>separate, concurrent atoms so that it can be expanded to implement fully
>isolated transactions in the future."
>
>Are you referring to the interface?

No, the infrastructure.

>greets,
>Tobias.
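To make the multi-version idea from the question concrete, here is a toy Python sketch of READ COMMITTED behavior. It is only an illustration under simplifying assumptions (in-memory dictionaries, no write-write conflict handling), not anything Reiser4 implements; `VersionedStore` and its methods are hypothetical names. Readers never see another transaction's uncommitted writes, and aborting a transaction does not touch anyone else.

```python
class VersionedStore:
    """Toy multi-version store: readers only ever see committed versions."""

    def __init__(self):
        self._committed = {}  # key -> list of committed values, oldest first
        self._pending = {}    # txn id -> {key: uncommitted value}

    def begin(self, txn):
        self._pending[txn] = {}

    def write(self, txn, key, value):
        # Writes stay private to the transaction until commit.
        self._pending[txn][key] = value

    def read(self, txn, key):
        # A transaction sees its own writes, else the newest committed version.
        own = self._pending.get(txn, {})
        if key in own:
            return own[key]
        versions = self._committed.get(key)
        return versions[-1] if versions else None

    def commit(self, txn):
        # Publish every private version as the newest committed one.
        for key, value in self._pending.pop(txn).items():
            self._committed.setdefault(key, []).append(value)

    def abort(self, txn):
        # Independent rollback: just drop the private versions.
        self._pending.pop(txn, None)
```

Because uncommitted data is kept apart per transaction, separate rollback of separate transactions falls out naturally, which is exactly what the quoted document says transcrashes cannot offer.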
* Re: v4 transaction design
2002-09-09 11:08 ` Hans Reiser
@ 2002-09-09 11:25 ` Nikita Danilov
2002-09-11 10:58 ` Xuan Baldauf
0 siblings, 1 reply; 6+ messages in thread
From: Nikita Danilov @ 2002-09-09 11:25 UTC (permalink / raw)
To: Hans Reiser; +Cc: Tobias Oberstein, Reiserfs mail-list
Hans Reiser writes:
> Tobias Oberstein wrote:
> [...]
> >2. will the new API also provide for 2 phase commits
> yes.
> >   (so that the filesystem can act as a XA resource)?
> what is that?

A resource manager able to participate in a distributed transaction. "XA
Open" is a specification by OpenGroup for such a resource manager.

http://www.opengroup.org/products/publications/catalog/c193.htm

> [...]
> >Anyway, in database terminology .. what's the isolation level
> >you intend to support: "READ UNCOMMITTED"?

Because we don't currently support isolation, isolation levels are not
exactly meaningful. But yes, one thread T1 can read data modified by
another thread T2 that hasn't yet committed, but at that moment the "atom"
associated with T1 will "fuse" with the atom of T2, so that they will either
commit or fail -both-.

> [...]

Nikita.
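The fusing behavior Nikita describes can be pictured with a small union-find structure. This is a sketch in Python under the assumption that an atom is simply a set of transaction handles that share one fate; it says nothing about how Reiser4 represents atoms internally, and `AtomTable` and its methods are hypothetical names.

```python
class AtomTable:
    """Toy model: each transaction handle belongs to an atom; atoms can fuse."""

    def __init__(self):
        self._parent = {}  # txn -> parent txn; a root is its own parent

    def begin(self, txn):
        # Every new transaction starts in an atom of its own.
        self._parent[txn] = txn

    def atom_of(self, txn):
        # Find the representative of txn's atom, then compress the path.
        root = txn
        while self._parent[root] != root:
            root = self._parent[root]
        while self._parent[txn] != root:
            nxt = self._parent[txn]
            self._parent[txn] = root
            txn = nxt
        return root

    def fuse(self, a, b):
        # Merge the two atoms so they commit or fail together.
        ra, rb = self.atom_of(a), self.atom_of(b)
        if ra != rb:
            self._parent[rb] = ra

    def read_uncommitted(self, reader, writer):
        # Reading another atom's uncommitted data fuses the two atoms.
        self.fuse(reader, writer)

    def members(self, txn):
        # All transactions that now share a fate with txn.
        root = self.atom_of(txn)
        return {t for t in list(self._parent) if self.atom_of(t) == root}
```

After `read_uncommitted("T1", "T2")`, both handles report the same atom, so a commit or failure applies to the whole set rather than to T1 or T2 alone.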
* Re: v4 transaction design
2002-09-09 11:25 ` Nikita Danilov
@ 2002-09-11 10:58 ` Xuan Baldauf
2002-09-11 11:10 ` Nikita Danilov
2002-09-11 11:16 ` Hans Reiser
0 siblings, 2 replies; 6+ messages in thread
From: Xuan Baldauf @ 2002-09-11 10:58 UTC (permalink / raw)
To: Nikita Danilov; +Cc: Hans Reiser, Tobias Oberstein, Reiserfs mail-list
Nikita Danilov wrote:
> [...]
> Because we don't currently support isolation, isolation levels are not
> exactly meaningful. But yes, one thread T1 can read data modified by
> another thread T2 that hasn't yet committed, but at that moment "atom"
> associated with T1 will "fuse" with atom of T2, so that they will either
> commit or fail -both-.

So this is an implicit join of transactions. How do you ensure that livelocks do
not happen, i.e. that T1 fails due to T2, and T2 also fails because it joined T1,
and that thus a retry would make T1 with T2 fail again...?

Xuân.

> [...]

--
Best regards
Xuân Baldauf
Medium.net Internet Server Software
* Re: v4 transaction design
2002-09-11 10:58 ` Xuan Baldauf
@ 2002-09-11 11:10 ` Nikita Danilov
2002-09-11 11:16 ` Hans Reiser
1 sibling, 0 replies; 6+ messages in thread
From: Nikita Danilov @ 2002-09-11 11:10 UTC (permalink / raw)
To: Xuan Baldauf; +Cc: Hans Reiser, Tobias Oberstein, Reiserfs mail-list
Xuan Baldauf writes:
> [...]
> So this is an implicit join of transactions. How do you ensure that livelocks do
> not happen, i.e. that T1 fails due to T2, and T2 also fails because it joined T1,
> and that thus a retry would make T1 with T2 fail again...?

Currently the transaction service is only available to the kernel, so such
properties are guaranteed, well, by careful coding and system-wide
knowledge. In this particular case, the system call just returns an error
code to the user rather than restarting itself in the case of failure.

The plan is that the transaction API will first be exported to "trusted"
user-level applications that know what they are doing. We are aware that
exporting transactions to the user level is going to pose a lot of complex
problems, to some of which database people haven't found a universal
solution after thirty years of research.

> [...]
* Re: v4 transaction design
2002-09-11 10:58 ` Xuan Baldauf
2002-09-11 11:10 ` Nikita Danilov
@ 2002-09-11 11:16 ` Hans Reiser
1 sibling, 0 replies; 6+ messages in thread
From: Hans Reiser @ 2002-09-11 11:16 UTC (permalink / raw)
To: Xuan Baldauf; +Cc: Nikita Danilov, Tobias Oberstein, Reiserfs mail-list
Xuan Baldauf wrote:
>Nikita Danilov wrote:
>
>>Because we don't currently support isolation, isolation levels are not
>>exactly meaningful. But yes, one thread T1 can read data modified by
>>another thread T2 that hasn't yet committed, but at that moment "atom"
>>associated with T1 will "fuse" with atom of T2, so that they will either
>>commit or fail -both-.

You should mention that we will support isolation, but we won't work on
that until after Halloween.

>So this is an implicit join of transactions. How do you ensure that livelocks do
>not happen, i.e. that T1 fails due to T2, and T2 also fails because it joined T1,
>and that thus a retry would make T1 with T2 fail again...?

We don't. Think of what we are doing as taking the traditional
journaling filesystem approach of not allowing rollback except in
response to crashes, and only allowing trusted plugins to use
transactions. Then understand that we will add isolation and a user API
later after the code we have is debugged.

Hans
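The "rollback only in response to crashes" approach Hans describes can be illustrated with a toy journal replay in Python. This is a sketch under stated assumptions: the journal is a plain list of tuples, and `recover` is a hypothetical helper, not Reiser4's actual recovery code or on-disk format. On recovery, writes belonging to atoms that logged a commit record are applied, and the writes of every atom still in progress at the crash are discarded together.

```python
def recover(journal):
    """Replay a toy journal, keeping only writes from committed atoms.

    `journal` is a list of tuples, either ("write", atom, key, value)
    or ("commit", atom), in the order they were logged.
    """
    # First pass: find every atom that reached its commit record.
    committed = {entry[1] for entry in journal if entry[0] == "commit"}

    # Second pass: apply committed writes in log order; everything else
    # is rolled back simply by never being applied.
    state = {}
    for entry in journal:
        if entry[0] == "write" and entry[1] in committed:
            _, _, key, value = entry
            state[key] = value
    return state
```

In this model there is no way to undo one uncommitted atom while the system is running; undoing happens for all of them at once, and only as a side effect of recovery.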