public inbox for linux-kernel@vger.kernel.org
* [PATCH] Allow O_SYNC to be set by fcntl(F_SETFL)
@ 2011-02-25 21:52 Steve Rago
  2011-04-07 21:37 ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Rago @ 2011-02-25 21:52 UTC
  To: linux-kernel

This has probably been a problem since day 1 (I ran into this running
the 2.4 kernel years ago; finally got around to fixing it).  The problem
is that fcntl(fd, F_SETFL, flags|O_SYNC) appears to work, but silently
ignores the O_SYNC flag.  Opening the file with O_SYNC works okay, but
setting it later on via fcntl doesn't work.


Signed-off-by: Steve Rago <sar@nec-labs.com>
---
  fs/fcntl.c |    2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index cb10261..afd233a 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -143,7 +143,7 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
         return ret;
  }

-#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME)
+#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME | O_SYNC)

  static int setfl(int fd, struct file * filp, unsigned long arg)
  {
--
1.7.2.1
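
A minimal userspace check of the behaviour described above might look
like the following sketch.  The helper name setfl_sync_sticks() is
hypothetical; it uses only the standard fcntl() F_GETFL/F_SETFL commands
and reports whether O_SYNC is visible in the descriptor flags
afterwards.  On a kernel without this patch it would be expected to
return 0 even though F_SETFL itself reports success.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: try to turn on O_SYNC for an already-open
 * descriptor and report whether the flag actually stuck.
 * Returns 1 if O_SYNC is set afterwards, 0 if it was silently
 * dropped, and -1 if any fcntl() call failed.
 */
int setfl_sync_sticks(int fd)
{
	int flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return -1;

	/* This call succeeds even when the kernel ignores O_SYNC. */
	if (fcntl(fd, F_SETFL, flags | O_SYNC) == -1)
		return -1;

	/* Read the flags back to see what was really applied. */
	int after = fcntl(fd, F_GETFL);
	if (after == -1)
		return -1;

	return (after & O_SYNC) == O_SYNC;
}
```

Calling it on a descriptor that was opened without O_SYNC shows whether
the running kernel honours the request or drops it.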



* Re: [PATCH] Allow O_SYNC to be set by fcntl(F_SETFL)
  2011-02-25 21:52 [PATCH] Allow O_SYNC to be set by fcntl(F_SETFL) Steve Rago
@ 2011-04-07 21:37 ` Andrew Morton
  2011-04-08 15:14   ` Christoph Hellwig
  2011-04-08 17:39   ` Steve Rago
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Morton @ 2011-04-07 21:37 UTC
  To: Steve Rago; +Cc: linux-kernel, linux-fsdevel

(did I ever reply to this?  I meant to ;))

On Fri, 25 Feb 2011 16:52:36 -0500
Steve Rago <sar@nec-labs.com> wrote:

> This has probably been a problem since day 1 (I ran into this running the 2.4 kernel years ago; finally got around to 
> fixing it).  The problem is that fcntl(fd, F_SETFL, flags|O_SYNC) appears to work, but silently ignores the O_SYNC flag. 
>   Opening the file with O_SYNC works okay, but setting it later on via fcntl doesn't work.
> 
> 
> Signed-off-by: Steve Rago <sar@nec-labs.com>
> ---
>   fs/fcntl.c |    2 +-
>   1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/fcntl.c b/fs/fcntl.c
> index cb10261..afd233a 100644
> --- a/fs/fcntl.c
> +++ b/fs/fcntl.c
> @@ -143,7 +143,7 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
>          return ret;
>   }
> 
> -#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME)
> +#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME | O_SYNC)

Does any standard say that we should do this? 
http://pubs.opengroup.org/onlinepubs/007908799/xsh/fcntl.html does, I
guess.

I worry a bit that this change will surprise people.  For example, this
person:
http://koders.com/c/fidA34D8D5EE9AA5D0AB0F3C604678E2E935E5B0246.aspx?s=dupa
is going to wonder why his app suddenly got a lot slower!

Sadly, the kernel silently ignores invalid set bits in `arg', so we
have no reliable way of signaling to the user that our behaviour here
changed.
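
The silent dropping of bits can be illustrated with a small userspace
model of the mask-and-merge step.  This is a simplified sketch, not the
kernel's code: apply_setfl() is a hypothetical name, and the mask is
passed in as a parameter rather than being the real SETFL_MASK.

```c
#include <fcntl.h>

/*
 * Userspace model of the flag update: only bits inside `mask` are
 * taken from the caller's `arg`; every other bit keeps its old value.
 * Bits set in `arg` but outside `mask` are dropped without an error.
 */
unsigned int apply_setfl(unsigned int old_flags, unsigned int arg,
			 unsigned int mask)
{
	return (arg & mask) | (old_flags & ~mask);
}
```

With a mask of (O_APPEND | O_NONBLOCK), a request for O_APPEND | O_SYNC
ends up setting only O_APPEND, and the caller gets no indication that
O_SYNC was discarded.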

I wonder if we should sync the file when someone sets O_SYNC this way. 
If we don't then there is a period during which we have an fd which has
O_SYNC set, but it has pending unwritten data.  An O_SYNC fd should
never be in such a state!

Ho hum.  yes, I guess we should apply the patch.  But it would have
been better to not have screwed this up in the first place!



* Re: [PATCH] Allow O_SYNC to be set by fcntl(F_SETFL)
  2011-04-07 21:37 ` Andrew Morton
@ 2011-04-08 15:14   ` Christoph Hellwig
  2011-04-08 17:39   ` Steve Rago
  1 sibling, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2011-04-08 15:14 UTC
  To: Andrew Morton; +Cc: Steve Rago, linux-kernel, linux-fsdevel

I actually prototyped this patch independently a while ago, and in
addition to the data writeout when removing O_SYNC there are the
following caveats:

 - O_SYNC is not actually one flag, but two: O_DSYNC and __O_SYNC.
   setfl() needs to make sure __O_SYNC cannot be in f_flags without
   O_DSYNC also being present.
 - we need to audit all filesystems to make sure they don't do stupid
   things when the O_SYNC flags appear or disappear during a write,
   that is, make sure it is checked in just one place.  The generic
   write code is fine in that respect, but I didn't go through all
   filesystems to verify it yet.
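
The first caveat could be handled along the lines of the sketch below.
It is a userspace illustration only: normalize_sync_flags() and
STRICT_SYNC_BIT are hypothetical names, the latter standing in for the
kernel-internal __O_SYNC by taking whatever part of O_SYNC is not
O_DSYNC.  Promoting a lone strict bit to full O_SYNC is one possible
policy; rejecting or clearing it would be another.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* assumption: needed for O_DSYNC */
#endif
#include <fcntl.h>

/* The part of O_SYNC that is not O_DSYNC (stand-in for __O_SYNC). */
#define STRICT_SYNC_BIT (O_SYNC & ~O_DSYNC)

/*
 * Make sure the strict-sync bit never appears without O_DSYNC.
 * If a caller passes the strict bit alone, add O_DSYNC as well.
 * Flags without the strict bit are returned unchanged.
 */
int normalize_sync_flags(int flags)
{
	if (flags & STRICT_SYNC_BIT)
		flags |= O_DSYNC;
	return flags;
}
```

Running every F_SETFL argument through such a helper before it is
merged into the stored flags would keep the two bits consistent.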



* Re: [PATCH] Allow O_SYNC to be set by fcntl(F_SETFL)
  2011-04-07 21:37 ` Andrew Morton
  2011-04-08 15:14   ` Christoph Hellwig
@ 2011-04-08 17:39   ` Steve Rago
  2011-04-08 17:56     ` Andrew Morton
  1 sibling, 1 reply; 6+ messages in thread
From: Steve Rago @ 2011-04-08 17:39 UTC
  To: Andrew Morton; +Cc: linux-kernel, linux-fsdevel

On 04/07/2011 05:37 PM, Andrew Morton wrote:
> (did I ever reply to this?  I meant to ;))
>
> On Fri, 25 Feb 2011 16:52:36 -0500
> Steve Rago<sar@nec-labs.com>  wrote:
>
>> This has probably been a problem since day 1 (I ran into this running the 2.4 kernel years ago; finally got around to
>> fixing it).  The problem is that fcntl(fd, F_SETFL, flags|O_SYNC) appears to work, but silently ignores the O_SYNC flag.
>>    Opening the file with O_SYNC works okay, but setting it later on via fcntl doesn't work.
>>
>>
>> Signed-off-by: Steve Rago<sar@nec-labs.com>
>> ---
>>    fs/fcntl.c |    2 +-
>>    1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/fcntl.c b/fs/fcntl.c
>> index cb10261..afd233a 100644
>> --- a/fs/fcntl.c
>> +++ b/fs/fcntl.c
>> @@ -143,7 +143,7 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
>>           return ret;
>>    }
>>
>> -#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME)
>> +#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME | O_SYNC)
>
> Does any standard say that we should do this?
> http://pubs.opengroup.org/onlinepubs/007908799/xsh/fcntl.html does, I
> guess.

It's required by the Single UNIX Specification (POSIX.1).  All other
major platforms allow it to be set via fcntl.  See bugzilla.kernel.org
bug ID #5994.

>
> I worry a bit that this change will surprise people.  For example, this
> person:
> http://koders.com/c/fidA34D8D5EE9AA5D0AB0F3C604678E2E935E5B0246.aspx?s=dupa
> is going to wonder why his app suddenly got a lot slower!
>
> Sadly, the kernel silently ignores invalid set bits in `arg', so we
> have no reliable way of signaling to the user that our behaviour here
> changed.
>
> I wonder if we should sync the file when someone sets O_SYNC this way.
> If we don't then there is a period during which we have an fd which has
> O_SYNC set, but it has pending unwritten data.  An O_SYNC fd should
> never be in such a state!

Why not?  If I write something in non-synchronous mode, then change the
file descriptor to synchronous mode, I should not make any assumptions
about what was written prior to this point.  If I care that much, I'll
call fsync.  All that matters is that the operating system honors the
contract as specified by the system call API.
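
An application that does care about earlier writes could combine the
two steps itself, roughly as in this sketch.  enable_sync_and_flush()
is a hypothetical helper name; it relies only on fcntl() and fsync().
On a kernel that still ignores O_SYNC in F_SETFL, only the fsync() part
has any effect.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch an open descriptor to O_SYNC and then
 * flush whatever was written before the switch, so that no earlier
 * data is left pending once the descriptor is synchronous.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
int enable_sync_and_flush(int fd)
{
	int flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return -1;

	if (fcntl(fd, F_SETFL, flags | O_SYNC) == -1)
		return -1;

	/* Data written in non-synchronous mode may still be cached. */
	return fsync(fd);
}
```

This keeps the policy in userspace: callers who do not need the earlier
data flushed can simply skip the fsync().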

>
> Ho hum.  yes, I guess we should apply the patch.  But it would have
> been better to not have screwed this up in the first place!
>
>

Agreed.  Thanks for not letting this fall through the cracks.

Steve


* Re: [PATCH] Allow O_SYNC to be set by fcntl(F_SETFL)
  2011-04-08 17:39   ` Steve Rago
@ 2011-04-08 17:56     ` Andrew Morton
  2011-04-08 21:08       ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2011-04-08 17:56 UTC
  To: Steve Rago; +Cc: linux-kernel, linux-fsdevel

On Fri, 08 Apr 2011 13:39:16 -0400
Steve Rago <sar@nec-labs.com> wrote:

> > I wonder if we should sync the file when someone sets O_SYNC this way.
> > If we don't then there is a period during which we have an fd which has
> > O_SYNC set, but it has pending unwritten data.  An O_SYNC fd should
> > never be in such a state!
> 
> Why not?

Because it's inconsistent.  An O_SYNC fd never has outstanding writeout. 
Except for in this one new and special time window between a setfl()
and the next write().

It's not a big deal, but it's somewhat ugly and merits thinking about.

>  If I write something in non-synchronous mode, then change the file descriptor to synchronous mode, I should 
> not make any assumptions about what was written prior to this point.  If I care that much, I'll call fsync.

Well.  You can call fsync() after every write() too.

>  All that 
> matters is that the operating system honors the contract as specified by the system call API.

There's a lot more to it than that.  Things like
quality-of-implementation and principle-of-least-surprise.  We used to
have a particular relationship between an O_SYNC fd and the state of
the inode which it represents.  With this patch, that relationship no
longer holds.

As I say: not a big deal IMO, but it should be aired and thought about.


* Re: [PATCH] Allow O_SYNC to be set by fcntl(F_SETFL)
  2011-04-08 17:56     ` Andrew Morton
@ 2011-04-08 21:08       ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2011-04-08 21:08 UTC
  To: Andrew Morton; +Cc: Steve Rago, linux-kernel, linux-fsdevel

On Fri, Apr 08, 2011 at 10:56:02AM -0700, Andrew Morton wrote:
> Because it's inconsistent.  An O_SYNC fd never has outstanding writeout. 
> Except for in this one new and special time window between a setfl()
> and the next write().

It might actually have outstanding writes for as long as it eventually
takes the writeback code to push them out.  O_SYNC only does a range
writeout for the area that was written.

