From: Jamie Lokier <jamie@shareable.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: Ulrich Drepper <drepper@redhat.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: adding proper O_SYNC/O_DSYNC, was Re: O_DIRECT and barriers
Date: Fri, 28 Aug 2009 17:44:32 +0100
Message-ID: <20090828164432.GA8036@shareable.org>
In-Reply-To: <20090828154647.GA15808@infradead.org>
Christoph Hellwig wrote:
> On Thu, Aug 27, 2009 at 10:24:28AM -0700, Ulrich Drepper wrote:
> > The problem with O_* extensions is that the syscall doesn't fail if the
> > flag is not handled. This is a problem in the open implementation which
> > can only be fixed with a new syscall.
> >
> > Why can't we just go on and say we interpret O_SYNC like O_SYNC and
> > O_SYNC|O_DSYNC like O_DSYNC. The POSIX spec explicitly requires that
> > the latter be handled like O_SYNC.
> >
> > We could handle it by allocating two bits, only one is handled in the
> > kernel. If the O_DSYNC definition for userlevel would be different from
> > the kernel definition then the kernel could interpret O_SYNC|O_DSYNC
> > like O_DSYNC. The libc would then have to translate the userlevel
> > O_DSYNC into the kernel O_DSYNC. If the libc is too old for the kernel
> > and the application, the userlevel flag would be passed to the kernel
> > and nothing bad happens.
>
> What about the following variant:
>
> - given that our current O_SYNC really is and always has been actually
> Posix O_DSYNC keep the numerical value and rename it to O_DSYNC in
> the headers.
> - Add a new O_SYNC definition:
>
> #define O_SYNC (O_DSYNC|O_REALLY_SYNC)
>
> and do full O_SYNC handling in new kernels if O_REALLY_SYNC is
> present.
That looks good for the kernel.
However, for userspace, there's an issue with applications which were
compiled with an old libc and used O_SYNC. Most of them probably
expected O_SYNC behaviour but all they got was O_DSYNC, because Linux
didn't do it right.
But they *didn't know* that.
When using a newer kernel which actually implements O_SYNC behaviour,
I'm thinking those applications which asked for O_SYNC should get it,
even though they're still linked with an old libc.
That's because this thread is the first time I've heard that Linux
O_SYNC was really the weaker O_DSYNC in disguise, and judging from the
many searches I've done about O_SYNC in applications and on different
OSes, it'll be news to other people too.
(I always thought the "#define O_DSYNC O_SYNC" was because Linux
didn't implement the weaker O_DSYNC).
(Oh, and Ulrich: Why is there a "#define O_RSYNC O_SYNC" in the Glibc
headers? That doesn't make sense: O_RSYNC has nothing to do with
writing.)
To achieve that, libc could implement two versions of open() at the
same time as it updates header files. The new libc's __old_open() would
do:
    /* Only O_DSYNC is set for apps built against old libc which
       were compiled with O_SYNC. */
    if (flags & O_DSYNC)
        flags |= O_SYNC;
I'm not exactly sure how symbol versioning works, but perhaps the
header file in the new libc would need __REDIRECT_NTH to map open() to
__new_open(), which just calls the kernel. This is to ensure .o and
.a files built with an old libc's headers but then linked to a new
libc will get __old_open().
Although libc's __new_open() could have this:
    /* Old kernels only look at O_DSYNC. It's better than nothing. */
    if (flags & O_SYNC)
        flags |= O_DSYNC;
Imho, it's better to not do that, and instead have
    #define O_SYNC (O_DSYNC|__O_SYNC_KERNEL)
as Chris suggests, in the libc header the same as the kernel header,
because that way applications which use the syscall() function or have
to invoke a syscall directly (I've seen clone-using code doing it),
won't spontaneously start losing their O_SYNCness on older kernels.
Unless there is some reason why "flags &= ~O_SYNC" is not permitted to
clear the O_DSYNC flag, or other reason why they must be separate flags.
-- Jamie