From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andreas Dilger <adilger@sun.com>
Subject: Re: RFC: O_PONIES semantics (well O_REWRITE)
Date: Wed, 10 Jun 2009 23:53:09 -0600
Message-ID: <20090611055309.GR9002@webber.adilger.int>
References: <4A3057DD.1050703@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; CHARSET=US-ASCII
Content-Transfer-Encoding: 7BIT
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Ray Strode <rstrode@redhat.com>, elb@psg.com
To: Rik van Riel <riel@redhat.com>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from sca-es-mail-1.Sun.COM ([192.18.43.132]:65404 "EHLO
	sca-es-mail-1.sun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754251AbZFKFxX (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Thu, 11 Jun 2009 01:53:23 -0400
Received: from fe-sfbay-10.sun.com ([192.18.43.129])
	by sca-es-mail-1.sun.com (8.13.7+Sun/8.12.9) with ESMTP id n5B5rNEH009728
	for <linux-fsdevel@vger.kernel.org>; Wed, 10 Jun 2009 22:53:23 -0700 (PDT)
Content-disposition: inline
Received: from conversion-daemon.fe-sfbay-10.sun.com by fe-sfbay-10.sun.com
 (Sun Java(tm) System Messaging Server 7u2-7.02 64bit (built Apr 16 2009))
 id <0KL20080089XP700@fe-sfbay-10.sun.com> for linux-fsdevel@vger.kernel.org;
 Wed, 10 Jun 2009 22:53:23 -0700 (PDT)
In-reply-to: <4A3057DD.1050703@redhat.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On Jun 10, 2009  21:03 -0400, Rik van Riel wrote:
> The semantics for O_REWRITE would be:
>
> 1) When opening a file O_REWRITE, the file handle points at
>    a freshly allocated, empty file.  The original file is
>    still available to programs that open the file without
>    O_REWRITE.
>
> 2) O_REWRITE can only be used in conjunction with O_WRONLY,
>    because the file descriptor is not associated with the
>    original file (which has data), but with an empty inode.
>
> 3) The code that implements O_REWRITE (kernel?  glibc?)
>    makes sure that:
>    - the new file is on the same filesystem as the original file
>    - the new file is not linked (so it is automatically freed
>      after a process or system crash)
>    - the new file's ownership, permissions and extended attributes
>      match that of the original file
>
> 4) The application that opens a file O_REWRITE is required
>    to rewrite the entire file.

This is all essentially open(O_CREAT|O_TRUNC|O_WRONLY).

> 5) On close(), the code that implements O_REWRITE makes sure that
>    the file is atomically renamed, so that if a system crash happens,
>    the user will see either the old or the new file contents, but
>    never an empty file.

This would be possible if the kernel set i_size=0 but did not send
the truncate to the filesystem until the file was closed and being
flushed.

> 6) After close(), processes that open the file will get the new
>    content.  Processes that previously opened the file will hold
>    on to the old inode and get old contents.

What is the benefit of (6)?  Of all these semantics, I think this is
the one that would cause the most confusion.
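
For reference, this is the behaviour an application already gets today
when a file is replaced with rename(): a reader that opened the file
before the update keeps reading the old inode.  A minimal sketch of
that case follows.  The function names (write_file, demo_stale_reader)
and the "old"/"new" contents are made up for illustration, and this
shows plain POSIX rename(), not the proposed O_REWRITE:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a NUL-terminated string to a new or truncated file; 0 on success. */
static int write_file(const char *path, const char *text)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0)
        return -1;
    size_t len = strlen(text);
    ssize_t n = write(fd, text, len);
    if (close(fd) != 0 || n != (ssize_t)len)
        return -1;
    return 0;
}

/*
 * Hypothetical demo: open `path` for reading, replace it via rename()
 * from `tmp`, then read through both the old and a fresh descriptor.
 * seen_old/seen_new must each hold at least bufsz bytes.
 * Returns 0 on success, -1 on error.
 */
int demo_stale_reader(const char *path, const char *tmp,
                      char *seen_old, char *seen_new, size_t bufsz)
{
    assert(bufsz > 0);
    memset(seen_old, 0, bufsz);
    memset(seen_new, 0, bufsz);

    if (write_file(path, "old") != 0)
        return -1;

    /* Reader that opened the file before the update. */
    int old_fd = open(path, O_RDONLY);
    if (old_fd < 0)
        return -1;

    /* Writer replaces the file; the old inode loses its name. */
    if (write_file(tmp, "new") != 0 || rename(tmp, path) != 0) {
        close(old_fd);
        return -1;
    }

    /* Reader that opens the file after the update. */
    int new_fd = open(path, O_RDONLY);
    if (new_fd < 0) {
        close(old_fd);
        return -1;
    }

    ssize_t a = read(old_fd, seen_old, bufsz - 1);
    ssize_t b = read(new_fd, seen_new, bufsz - 1);
    close(old_fd);
    close(new_fd);
    return (a < 0 || b < 0) ? -1 : 0;
}
```

The two readers disagree about the contents of the same pathname until
the first one reopens the file, which is the confusion I mean.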

> Here are my questions:
>
> - Are these semantics useful for programs that want to replace
>   config (or other) files with new content?
>
> - Are these semantics sane?
>
> - What would be the best place to implement these semantics?

The main question is whether any applications would use O_REWRITE in
the first place, or whether it would make more sense to have a helper
function in glibc (similar to mktemp) that handles the "atomic update
of config file" case properly.
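
As a rough sketch of what such a helper might look like in userspace
(atomic_replace_file is a made-up name, not an existing glibc
function): write a temporary file in the same directory, fsync it, and
rename it over the original.  This version only carries over the
permission bits; ownership and extended attributes are left out, as is
an fsync of the containing directory, so treat it as an illustration
under those assumptions rather than a finished implementation:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace the contents of `path` with `data` so
 * that a crash leaves either the old or the new contents, never an
 * empty file.  Returns 0 on success, -1 on error with errno set.
 */
int atomic_replace_file(const char *path, const void *data, size_t len)
{
    char tmp[4096];
    struct stat st;
    const char *p = data;
    int fd, saved, have_old;

    assert(path != NULL && (data != NULL || len == 0));

    /* Temporary name next to the target, so rename() stays on one fs. */
    int n = snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof(tmp)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    have_old = (stat(path, &st) == 0);

    fd = mkstemp(tmp);
    if (fd < 0)
        return -1;

    /* Match the original file's permission bits, if there was one. */
    if (have_old && fchmod(fd, st.st_mode & 07777) != 0)
        goto fail;

    /* Write the whole buffer, retrying on short writes and EINTR. */
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += w;
        len -= (size_t)w;
    }

    /* Data must be on disk before the new name becomes visible. */
    if (fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;

    /* Atomically switch the name from the old inode to the new one. */
    if (rename(tmp, path) != 0)
        goto fail;
    return 0;

fail:
    saved = errno;
    if (fd >= 0)
        close(fd);
    unlink(tmp);
    errno = saved;
    return -1;
}
```

A caller would then do something like
atomic_replace_file("app.conf", buf, buflen) instead of open-coding
the temp file, fsync and rename sequence each time.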

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.