From: Hans Reiser <reiser@namesys.com>
To: Andreas Dilger <adilger@clusterfs.com>
Cc: Mike Fedyk <mfedyk@matchmail.com>, Andrew Morton <akpm@osdl.org>,
	reiserfs-list@namesys.com, linux-kernel@vger.kernel.org
Subject: Re: precise characterization of ext3 atomicity
Date: Fri, 05 Sep 2003 01:32:41 +0400
Message-ID: <3F57AF79.1040702@namesys.com>
In-Reply-To: <20030904132804.D15623@schatzie.adilger.int>

Andreas Dilger wrote:

>On Sep 04, 2003  22:37 +0400, Hans Reiser wrote:
>>Mike Fedyk wrote:
>>>And how does reiser4 do this [export atomic ops to userspace]
>>>without changing the userspace apps?
>>
>>We don't.  We just make the hovercraft, we don't force you to go over 
>>the water.....
>
>It is possible to do the same with ext3, namely exporting journal_start()
>and journal_stop() (or some interface to them) to userspace so the application
>can start a transaction for multiple operations.  We had discussed this in
>the past, but decided not to do so because user applications can screw up in
>so many ways, and if an application uses these interfaces it is possible to
>deadlock the entire filesystem if the application isn't well behaved.
>
Yup.  That's why we confine it to a finite (#defined) number of 
operations within one sys_reiser4 call.  At some point we will allow 
trusted user space processes to span multiple system calls (mail server 
appliances, database appliances, etc., might find this useful).  You 
might consider supporting sys_reiser4 at some point.
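
The idea of a bounded batch applied within one call can be modelled in
a few lines of userspace C.  This is only an illustrative sketch, not
the sys_reiser4 interface: struct model_fs, struct atom_op,
atom_apply() and ATOM_MAX_OPS are hypothetical names, and the
"filesystem" is just an in-memory array.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define ATOM_MAX_OPS    8   /* hypothetical compile-time cap on ops per call */
#define MODEL_FILES     4   /* hypothetical: number of "files" in the model */
#define MODEL_FILE_SIZE 32  /* hypothetical: bytes per "file" */

/* In-memory stand-in for a filesystem; not a reiser4 structure. */
struct model_fs {
    char data[MODEL_FILES][MODEL_FILE_SIZE];
};

/* One operation in the batch: replace the contents of one "file". */
struct atom_op {
    int file;           /* index into model_fs.data */
    const char *text;   /* NUL-terminated new contents */
};

/*
 * Apply every operation or none of them.  The whole batch arrives in a
 * single call, so no transaction is left open if the caller misbehaves
 * between operations.
 */
int atom_apply(struct model_fs *fs, const struct atom_op *ops, size_t n)
{
    struct model_fs staged;
    size_t i;

    if (n > ATOM_MAX_OPS)
        return -E2BIG;          /* batch exceeds the fixed bound */

    staged = *fs;               /* work on a private copy */
    for (i = 0; i < n; i++) {
        if (ops[i].file < 0 || ops[i].file >= MODEL_FILES)
            return -EBADF;      /* fs itself is untouched */
        if (strlen(ops[i].text) >= MODEL_FILE_SIZE)
            return -EFBIG;
        strcpy(staged.data[ops[i].file], ops[i].text);
    }

    *fs = staged;               /* commit point: all ops become visible */
    return 0;
}
```

In this model a batch of three writes to three files either lands as a
whole or leaves the previous contents intact, and the fixed bound means
the work per call is known in advance.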

>
>If the app doesn't eventually say "end the transaction", the filesystem might
>wait indefinitely.  You could start adding more plumbing like "if the file
>is closed (maybe because the process crashed), cancel the transaction",
>and "if the process doesn't complete the transaction in time, cancel the
>transaction", etc.  How do you guarantee in advance that the application
>will be able to complete all of the operations it needs (i.e. if it runs
>out of space in the filesystem or something)?
>
We will export our space reservation infrastructure code also.  We are 
still thinking about the right API for that.  At first I had some idea 
that we should calculate for the user how much space would be needed, 
but I am getting lazier as we get closer to actually doing it.  I am 
now thinking we can add the helpful, but in some cases complex, 
estimate_sizeof() functions later.  For now we just let the user grab 
space; if they exceed it we return an error, and if they both exceed 
it and run out of disk space we return a nastier error that tells them 
to go cope with their mistake.  I think it will be something that 
takes a 64 bit int that is the number of blocks to grab, compares it 
to their quota if any, and causes the sys_reiser4 call to do nothing 
and error nicely if it can't get it.
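
A rough userspace model of that reservation behaviour might look like
the following.  It is a sketch under stated assumptions, not the
reiser4 API, which was still undecided: struct space_model,
grab_space() and use_space() are hypothetical names, and the error
codes are arbitrary choices for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of free space and one user's quota. */
struct space_model {
    uint64_t free_blocks;   /* unallocated blocks on the device */
    uint64_t quota_left;    /* blocks the user may still take;
                               UINT64_MAX means "no quota" */
    uint64_t reserved;      /* blocks grabbed for the current atom */
    uint64_t used;          /* blocks consumed out of the reservation */
};

/*
 * Grab `blocks` up front.  On failure nothing changes, so the caller
 * can refuse the whole compound operation before doing any work.
 */
int grab_space(struct space_model *m, uint64_t blocks)
{
    if (blocks > m->quota_left)
        return -EDQUOT;
    if (blocks > m->free_blocks)
        return -ENOSPC;

    m->free_blocks -= blocks;
    if (m->quota_left != UINT64_MAX)
        m->quota_left -= blocks;
    m->reserved += blocks;
    return 0;
}

/*
 * Consume blocks from the reservation.  Exceeding the reservation is
 * an error (-ERANGE here, an assumed code); exceeding it when the disk
 * could not cover the overshoot either is the "nastier" -ENOSPC.
 */
int use_space(struct space_model *m, uint64_t blocks)
{
    uint64_t remaining = m->reserved - m->used;

    if (blocks <= remaining) {
        m->used += blocks;
        return 0;
    }
    if (blocks - remaining > m->free_blocks)
        return -ENOSPC;
    return -ERANGE;
}
```

The point of failing in grab_space() before any work starts is that
the caller never ends up halfway through its operations with no space
left to finish them.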

When coding, sometimes you have to be careful not to let the complex 
needs of the 20% prevent you from getting to market something that the 
80% need and would be happy with.  (Learned this one from Dave Hitz of 
NetApp.)

There are a lot of applications that have very simple needs in regard 
to atomicity, like writing 3 things to 3 files as one atom.  If we can 
address that, then later people can do their PhDs on the needs of the 
complex 20%.... and hopefully send some nice patches.....

>
>I suppose at worst, the application doesn't get its multi-op atomicity
>guarantee, but I'm guessing that apps which use this interface depend on
>it working properly or they wouldn't be using it.
>
>Cheers, Andreas
>--
>Andreas Dilger
>http://sourceforge.net/projects/ext2resize/
>http://www-mddsp.enel.ucalgary.ca/People/adilger/


-- 
Hans


