* Correct behavior on O_DIRECT sparse file writes
From: Chris Mason @ 2007-10-12 20:39 UTC
  To: linux-fsdevel, Andrew Morton

Hello everyone,

The test below creates a sparse file and then fills a hole with
O_DIRECT.  As far as I can tell from reading generic_osync_inode, the
filesystem metadata is only forced to disk if i_size changes during the
file write.  I've tested ext3, xfs and reiserfs and they all skip the
commit when filling holes.

I would argue that filling holes via O_DIRECT is supposed to commit the
metadata required to find those file blocks later.  At least on ext3,
O_SYNC does force a commit when filling holes (haven't tested others).

So, is the current behavior a bug or a feature?

dd if=/dev/zero of=foo bs=1M seek=1 count=1 oflag=direct
dd if=/dev/urandom of=foo bs=4k count=1 conv=notrunc oflag=direct

hexdump foo | head -n 2
0000000 62b1 ea2d 73e8 c64f f5ef 1af5 dd09 8ccd
0000010 75ec 9581 e0ea ae9b e28f b76d a700 4d5b

reboot -nf

(after reboot)

hexdump foo
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0200000

-chris
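
For anyone reproducing this, a minimal sketch of how to confirm that the
first dd really leaves a hole in front of the written megabyte. It assumes
GNU coreutils (dd, stat -c) and uses a throwaway temp file rather than the
"foo" from the test above. oflag=direct is left out on purpose so the sketch
also runs on filesystems that reject O_DIRECT.

```shell
#!/bin/sh
# Sketch only: inspect the sparse layout produced by the first dd in the test.
# The temp file name is an illustration, not part of the original test.
f=$(mktemp)

# Skip the first 1 MiB (seek=1 with bs=1M), then write 1 MiB of zeros.
dd if=/dev/zero of="$f" bs=1M seek=1 count=1 2>/dev/null

size=$(stat -c %s "$f")      # apparent file size in bytes
blocks=$(stat -c %b "$f")    # number of blocks actually allocated
blksz=$(stat -c %B "$f")     # size in bytes of each block counted by %b
allocated=$((blocks * blksz))

echo "apparent size: $size bytes"
echo "allocated:     $allocated bytes"

rm -f "$f"
```

On a filesystem that supports holes, the allocated figure should come out
well below the apparent size, since nothing was written to the first
megabyte. The exact block count depends on the filesystem.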




* Re: Correct behavior on O_DIRECT sparse file writes
From: Andrew Morton @ 2007-10-12 21:02 UTC
  To: Chris Mason; +Cc: linux-fsdevel

On Fri, 12 Oct 2007 16:39:27 -0400
Chris Mason <chris.mason@oracle.com> wrote:

> Hello everyone,
> 
> The test below creates a sparse file and then fills a hole with
> O_DIRECT.  As far as I can tell from reading generic_osync_inode, the
> filesystem metadata is only forced to disk if i_size changes during the
> file write.  I've tested ext3, xfs and reiserfs and they all skip the
> commit when filling holes.
> 
> I would argue that filling holes via O_DIRECT is supposed to commit the
> metadata required to find those file blocks later.  At least on ext3,
> O_SYNC does force a commit when filling holes (haven't tested others).
> 
> So, is the current behavior a bug or a feature?

I don't think it's a bug.  Sure, O_DIRECT is synchronous, but that's
because it is, err, direct.  Not because it provides extra data-integrity
guarantees.  If you want those guarantees, use O_SYNC as well.

> dd if=/dev/zero of=foo bs=1M seek=1 count=1 oflag=direct
> dd if=/dev/urandom of=foo bs=4k count=1 conv=notrunc oflag=direct
> 
> hexdump foo | head -n 2
> 0000000 62b1 ea2d 73e8 c64f f5ef 1af5 dd09 8ccd
> 0000010 75ec 9581 e0ea ae9b e28f b76d a700 4d5b
> 
> reboot -nf
> 
> (after reboot)
> 
> hexdump foo
> 0000000 0000 0000 0000 0000 0000 0000 0000 0000
> *
> 0200000
> 
> -chris
> 


* Re: Correct behavior on O_DIRECT sparse file writes
From: Florian Weimer @ 2007-10-13 11:24 UTC
  To: Andrew Morton; +Cc: Chris Mason, linux-fsdevel

* Andrew Morton:

> I don't think it's a bug.  Sure, O_DIRECT is synchronous, but that's
> because it is, err, direct.  Not because it provides extra data-integrity
> guarantees.  If you want those guarantees, use O_SYNC as well.

This needs to be prominently documented.  Right now, it's far from clear
that you need both O_DIRECT and O_SYNC.
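
As a rough illustration of "both O_DIRECT and O_SYNC", here is a sketch of
the hole-filling write from the original test with the sync flag added. It
assumes GNU dd, whose oflag=direct,sync opens the output with both flags.
The temp file and the fallback to O_SYNC alone (for filesystems that reject
O_DIRECT) are additions for this sketch, not part of the original test. It
only performs the write. Checking that the data survives a crash would still
need something like the reboot -nf step above, and so far only ext3 has been
reported in this thread as committing on O_SYNC.

```shell
#!/bin/sh
# Sketch only: fill a hole using O_DIRECT together with O_SYNC.
# The temp file is an illustration, not the "foo" from the original test.
f=$(mktemp)

# Create the sparse file: 1 MiB hole followed by 1 MiB of zeros.
dd if=/dev/zero of="$f" bs=1M seek=1 count=1 2>/dev/null

# Fill the start of the hole. conv=notrunc keeps the existing 2 MiB size.
mode="direct,sync"
if ! dd if=/dev/urandom of="$f" bs=4k count=1 conv=notrunc \
        oflag=direct,sync 2>/dev/null; then
    # Assumed fallback: some filesystems refuse O_DIRECT, so use O_SYNC alone.
    mode="sync"
    dd if=/dev/urandom of="$f" bs=4k count=1 conv=notrunc \
        oflag=sync 2>/dev/null
fi

size=$(stat -c %s "$f")
# Count the non-NUL bytes in the first 4 KiB to confirm the hole was filled.
nonzero=$(head -c 4096 "$f" | tr -d '\000' | wc -c)

echo "write mode:    $mode"
echo "file size:     $size bytes"
echo "non-NUL bytes: $nonzero of 4096"

rm -f "$f"
```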


* Re: Correct behavior on O_DIRECT sparse file writes
From: Chuck Lever @ 2007-10-15 15:36 UTC
  To: Florian Weimer; +Cc: Andrew Morton, Chris Mason, linux-fsdevel


Florian Weimer wrote:
> * Andrew Morton:
> 
>> I don't think it's a bug.  Sure, O_DIRECT is synchronous, but that's
>> because it is, err, direct.  Not because it provides extra data-integrity
>> guarantees.  If you want those guarantees, use O_SYNC as well.
> 
> This needs to be prominently documented.  Right now, it's far from clear
> that you need both O_DIRECT and O_SYNC.

It's certainly not a requirement for NFS.  O_DIRECT on NFS forces data 
to the server, which always updates a file's metadata on each write, 
including indirect blocks.




* Re: Correct behavior on O_DIRECT sparse file writes
From: Bryan Henderson @ 2007-10-15 16:53 UTC
  To: Andrew Morton; +Cc: Chris Mason, linux-fsdevel

>> The test below creates a sparse file and then fills a hole with
>> O_DIRECT.  As far as I can tell from reading generic_osync_inode, the
>> filesystem metadata is only forced to disk if i_size changes during the
>> file write.  I've tested ext3, xfs and reiserfs and they all skip the
>> commit when filling holes.
>> 
>> I would argue that filling holes via O_DIRECT is supposed to commit the
>> metadata required to find those file blocks later.  At least on ext3,
>> O_SYNC does force a commit when filling holes (haven't tested others).
>> 

>I don't think it's a bug.  Sure, O_DIRECT is synchronous, but that's
>because it is, err, direct.  Not because it provides extra data-integrity
>guarantees.  If you want those guarantees, use O_SYNC as well.

That makes sense, but how do you explain the committing of the size change 
without O_SYNC?  That seems wrong to me.

This does need to be documented carefully, because a person could easily 
believe, even subconsciously,  that O_DIRECT makes the entire file write 
direct, and sloppy documentation might actually use words to that effect.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems


