From: Olaf van der Spek <olafvdspek@gmail.com>
To: "Ted Ts'o" <tytso@mit.edu>
Cc: Neil Brown <neilb@suse.de>,
Christian Stroetmann <stroetmann@ontolinux.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-ext4@vger.kernel.org, Nick Piggin <npiggin@gmail.com>
Subject: Re: Atomic non-durable file write API
Date: Wed, 29 Dec 2010 10:09:48 +0100
Message-ID: <AANLkTi=ULuM6fHH1V2zKGpaSjKRrbUJen5oAMKAkAaei@mail.gmail.com>
In-Reply-To: <20101228234216.GJ10149@thunk.org>
On Wed, Dec 29, 2010 at 12:42 AM, Ted Ts'o <tytso@mit.edu> wrote:
> On Tue, Dec 28, 2010 at 11:54:33PM +0100, Olaf van der Spek wrote:
>
>> > Very true. But until such problems are described and understood,
>> > there is not a lot of point trying to implement a
>> > solution. Premature implementation, like premature optimisation,
>> > is unlikely to be fruitful. I know this from experience.
>>
>> The problems seem clear. The implications not yet.
>
> I don't think there's even agreement that it is a problem. A problem
Maybe "problem" isn't the right word, but it does seem to be a corner case / exception.
> implies a use case where such a need is critical, and I haven't
> seen it yet. I'd rather characterize it as a demand for a "solution"
> for a problem that hasn't been proven to exist yet.
>
>> True, I don't understand why people say it will cause a performance
>> hit but then don't want to tell why.
>
> Because I don't want to waste time doing a hypothetical design when (a)
> the specification space hasn't even been fully spec'ed out, and (b) no
> compelling use case has been demonstrated, and (c) no one is paying
> me.
> The last point is a critical one; who's going to do the work? If you
> are going to do the work, then implement it and send us the patches.
> If you expect a technology expert to do the work, it's dirty pool to
> try to force him or her to do a design to "prove" that it's not trivial.
>
> If you're going to pay me $50,000 or $100,000, then it's on the golden
> rule principle (the customer with the gold, makes the rules), and I'll
> happily work on a design even if in my best judgment it's ill-advised,
> and probably will be a waste of money, because, hey, it's the
> customer's money. But if you're going to ask me to spend my time
> working on something which in my professional opinion is a waste of
> time, and do it pro bono, you must be smoking something really good,
> and probably really illegal.
I don't want you to work on something you do not support.
I want to understand why you think it's a bad idea.
> Here are some of the hints though about trouble spots.
>
> 1) What happens in disk full cases? Remember, we can't free the old
> inode until writeback has happened. And if we haven't allocated space
> yet for the file, and space is needed for the new file, what happens?
> What if some other disk write needs the space?
I would expect a no space error.
> 2) How big are the files that you imagine should be supported with
> such a scheme? If the file system is 1 GB, and the file is 600MB, and
> you want to replace it with new contents which is 750MB long, what
> happens? How does the system degrade gracefully in the case of larger
> files? Does the user get any notification that maybe the magic
> O_PONIES semantics might be changing?
No semantics will change; you'll get a no space error.
Just like you would if you use the temp file approach.
> 3) What if the rename is still pending, but in the mean time, some
> other process modifies the file? Do those writes also have to be
> atomic vis-a-vis the rename?
So the rename has already been executed (but has not yet been committed
to disk) and then the file is modified? Those writes would apply to the
new file.
> 4) What if the rename is still pending, but in the meantime, some
> other process creates another new file and renames it over the same
> file name?
The last update would win, if by pending you mean the rename has been
executed already but hasn't been written to disk yet.
> etc.
>
>> >> Where losing meta-data is bad? That should be obvious.
>>
>> In that case meta-data shouldn't be supported in the first place.
>
> Well, hold on a minute. It depends on what the meta-data means. If
> the meta-data is supposed to be a secure indication of who created the
> file, or more importantly if quotas are enforced, to whom the disk
> usage quota should be charged, then it might not be allowable to
> "preserve the metadata in some cases".
I understand you can't just allow chown, but ...
> In general, you can always save the meta data, and restore the meta
> data to the new file --- except when there are security reasons why
> this isn't allowed. For example, file ownership is special, because
> of (a) setuid bit considerations, and (b) file quota considerations.
> If you don't have those issues, then allowing a non-privileged user to
> use chown() is perfectly acceptable. But it's because of these issues
> that chown() is special.
>
> And if quota is enabled, replacing a 10MB file with a 6TB file, while
> preserving the same file "owner", and therefore charging the 6TB to
> the old owner, would be a total evasion of the quota system.
Isn't that already a problem if you have write access to a file you don't own?
Still waiting on an answer to:
> What is the recommended way for atomic (complete) file writes?
Given that (you say) so many get it wrong, it would be nice to know
the right way.
Olaf