From: Pavel Machek <pavel@ucw.cz>
To: Artem Bityutskiy <dedekind@yandex.ru>
Cc: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: replace() system call needed (was Re: EXT4-ish "fixes" in UBIFS)
Date: Sun, 29 Mar 2009 14:50:00 +0200
Message-ID: <20090329124959.GD15492@elf.ucw.cz>
In-Reply-To: <49CF6CBB.7070907@yandex.ru>
>>> We have a problem that user-space people do not want to
>>> use 'fsync()', even when they are pointed to their code
>>> which is doing create/write/rename/close without fsync().
>>
>> Well... they really don't want to spin the disk up for the
>> fsync(). I'm not sure if fsync() is really sensible operation to use
>> there.
>
> I'm personally concerned about hand-held, and in case of UBIFS
> fsync is not too expensive - we work on flash and on fsync() we
> write back only the stuff belonging to inode in question, and
> nothing else.
Well, I'm more concerned about spinning disks; I have one even in my
Zaurus. And I do believe that fsync() will write more data than
necessary even in the flash case.
>>> 1. truncate/write/close leads to empty files
>>
>> this is buggy.
>
> In FS, or in application?
The application is buggy; there is no way the kernel can help there.
>>> 2. create/write/rename leads to empty files
>>
>> ..but this should not be. If we want to make that explicit, we should
>> provide a "replace()" operation, where replace is a rename that makes sure
>> that the source file is completely on media before committing the rename.
>
> Well, OK, we can fsync() before rename, we just need clean rules
> for this, so that all Linux FSes would follow them. Would be nice
> to have final agreement on all this stuff.
My proposal is:
rename() stays.
replace(src, bar) is a rename that ensures that bar will contain valid
data after a power failure.
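To make the intended semantics concrete, here is a minimal userspace
sketch of the fsync()-before-rename() fallback discussed in this thread.
The name replace_fallback() is hypothetical; no replace() system call
exists. It assumes Linux behaviour, where fsync() is accepted on a
descriptor opened read-only, and it does not fsync the parent directory,
so it only orders the file data before the rename rather than making the
rename itself durable.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical userspace stand-in for the proposed replace(src, dst):
 * push src's data to the media first, then rename it over dst.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
int replace_fallback(const char *src, const char *dst)
{
    /* Assumption: fsync() on a read-only descriptor works (true on Linux). */
    int fd = open(src, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The data must be on media before the rename is issued. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    if (close(fd) < 0)
        return -1;

    /* Atomically switch dst over to the new contents. */
    return rename(src, dst);
}
```

Unlike the barrier-style replace() proposed here, this fallback blocks in
fsync() and so does force the disk to spin up immediately.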
>> It is somewhat similar to fsync()/rename(), but does not force the disk
>> to spin up immediately -- it only inserts a "barrier" between the data
>> blocks and the rename. (And yes, it should be implemented as
>> fsync()+rename() for filesystems like xfs. It can be implemented as a
>> plain rename for ext3 and ext4 after the fixes...)
>
> Right. But I guess only a few file-systems would really implement
> this, because it is complex.
Complex yes, but at least ext3+ext4+btrfs should, and they really have
90% of "market share" :-). ext3 and ext4 implementations are already
done :-).
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html