From: Chris Mason <chris.mason@oracle.com>
To: Olaf van der Spek <olafvdspek@gmail.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Atomic file data replace API
Date: Fri, 07 Jan 2011 10:05:15 -0500
Message-ID: <1294412553-sup-9058@think>
In-Reply-To: <AANLkTi=Z1RCwMKSMRzg4fwN55ZQs4F91nNjdEfv6veyb@mail.gmail.com>
Excerpts from Olaf van der Spek's message of 2011-01-07 10:01:59 -0500:
> On Fri, Jan 7, 2011 at 3:58 PM, Chris Mason <chris.mason@oracle.com> wrote:
> > Excerpts from Olaf van der Spek's message of 2011-01-06 15:01:15 -0500:
> >> Hi,
> >>
> >> Does btrfs support atomic file data replaces? Basically, the atomic
> >> variant of this:
> >> // old state
> >> open(O_TRUNC)
> >> write() // 0+ times
> >> close()
> >> // new state
> >
> > Yes and no. We have a best-effort mechanism where we try to guess that
> > since you've done this truncate and the write that you want the writes
> > to show up quickly. But it's a guess.
> >
> > The problem is the write() // 0+ times. The kernel has no idea what
> > new result you want the file to contain because the application isn't
> > telling us.
>
> Isn't it safe for the kernel to wait until the first write or close
> before writing anything to disk?
I'm afraid not. Picture an application that opens a thousand files,
writes 1MB to each of them, and then doesn't close any. If we waited
until close, you'd have 1GB of memory pinned or staged somehow.
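
Since the kernel cannot hold an unbounded amount of data until close(),
applications that need all-or-nothing replacement usually stage the new
contents themselves. The sketch below shows the conventional userspace
pattern (temporary file, fsync, rename). It is not something proposed in
this message; the function names replace_file and write_all are
hypothetical, and the temporary file gets mkstemp's default 0600 mode
rather than the original file's permissions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write all len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Replace the contents of path with data (hypothetical helper).
 * The new bytes are staged in a temporary file next to path, flushed
 * with fsync(), and then moved into place with rename(), which POSIX
 * specifies as atomic with respect to other users of the name.
 * Returns 0 on success, -1 on error.
 */
int replace_file(const char *path, const char *data, size_t len)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;

    int fd = mkstemp(tmp);          /* creates the file with mode 0600 */
    if (fd < 0)
        return -1;

    if (write_all(fd, data, len) != 0 || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0 || rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

Here the staging cost lands on disk in a file the application names, so
the kernel never has to pin the data in memory while waiting for a close.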
>
> > What btrfs can do (but we haven't yet implemented) is make sure that the
> > results of a single file write are on disk atomically, even if they are
> > replacing existing bytes in the file.
> >
> > Because we cow and because we don't update metadata pointers until the
> > IO is complete, we can wait until all the IO for a given write call is
> > on disk before we update any of the metadata.
> >
> > This isn't hard, it's on my TODO list.
>
> What about a new flag: O_ATOMIC that'd take the guesswork out of the kernel?
We can't guess beyond a single write call. Otherwise we get into
the problem above where an application can force the kernel to wait
forever. I'm not against O_ATOMIC to enable the new btrfs
functionality, but it will still be limited to one write.
-chris
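
If the guarantee is limited to one write call, an application that wants
to benefit has to hand over all of its new bytes in a single call. The
sketch below only illustrates that shape: it joins several pieces in
memory and submits them with one pwrite(). The per-write atomicity
described above is stated as not yet implemented, so this code does not
make anything atomic by itself, and O_ATOMIC is not used because no such
flag exists. The name overwrite_in_one_call is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Concatenate nparts strings into one buffer and submit the result
 * with a single pwrite() at offset 0 (hypothetical helper).
 * Returns 0 only if the kernel accepted every byte in that one call.
 * The file is not truncated, so shorter new contents would leave old
 * trailing bytes; shrinking would need a separate ftruncate().
 */
int overwrite_in_one_call(const char *path,
                          const char *const parts[], size_t nparts)
{
    size_t total = 0;
    for (size_t i = 0; i < nparts; i++)
        total += strlen(parts[i]);

    char *buf = malloc(total ? total : 1);
    if (buf == NULL)
        return -1;

    size_t off = 0;
    for (size_t i = 0; i < nparts; i++) {
        size_t n = strlen(parts[i]);
        memcpy(buf + off, parts[i], n);
        off += n;
    }

    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    /* One call carries the whole new contents. */
    ssize_t written = pwrite(fd, buf, total, 0);
    int ok = (written >= 0 && (size_t)written == total);

    free(buf);
    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A short write is reported as failure rather than retried, because a retry
would split the data across two calls.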