* Correct behavior on O_DIRECT sparse file writes
@ 2007-10-12 20:39 Chris Mason
2007-10-12 21:02 ` Andrew Morton
0 siblings, 1 reply; 5+ messages in thread
From: Chris Mason @ 2007-10-12 20:39 UTC (permalink / raw)
To: linux-fsdevel, Andrew Morton
Hello everyone,
The test below creates a sparse file and then fills a hole with
O_DIRECT. As far as I can tell from reading generic_osync_inode, the
filesystem metadata is only forced to disk if i_size changes during the
file write. I've tested ext3, xfs and reiserfs and they all skip the
commit when filling holes.
I would argue that filling holes via O_DIRECT is supposed to commit the
metadata required to find those file blocks later. At least on ext3,
O_SYNC does force a commit when filling holes (I haven't tested the others).
So, is the current behavior a bug or a feature?
dd if=/dev/zero of=foo bs=1M seek=1 count=1 oflag=direct
hexdump foo | head -n 2
0000000 62b1 ea2d 73e8 c64f f5ef 1af5 dd09 8ccd
0000010 75ec 9581 e0ea ae9b e28f b76d a700 4d5b
dd if=/dev/urandom of=foo bs=4k count=1 conv=notrunc oflag=direct
reboot -nf
(after reboot)
hexdump foo
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0200000
-chris
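The before-and-after hexdump comparison in the test above can be scripted so the result does not have to be read by eye. A minimal sketch, assuming GNU coreutils and GNU cmp; the helper names `file_size` and `first_block_is_zero` are hypothetical and not part of the original test:

```shell
# Hypothetical helpers for checking the test file after the reboot.

# file_size FILE: print the apparent size of FILE in bytes.
file_size() {
    stat -c %s "$1"
}

# first_block_is_zero FILE: succeed if the first 4 KiB of FILE read back
# as zeros, i.e. the hole fill is not visible in the file.
first_block_is_zero() {
    cmp -s -n 4096 "$1" /dev/zero
}
```

After the reboot, `first_block_is_zero foo && echo "hole fill lost"` would flag the failure shown above, and `file_size foo` prints 2097152 for this test file (0x200000, the final offset in the hexdump output).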
* Re: Correct behavior on O_DIRECT sparse file writes
2007-10-12 20:39 Correct behavior on O_DIRECT sparse file writes Chris Mason
@ 2007-10-12 21:02 ` Andrew Morton
2007-10-13 11:24 ` Florian Weimer
2007-10-15 16:53 ` Bryan Henderson
0 siblings, 2 replies; 5+ messages in thread
From: Andrew Morton @ 2007-10-12 21:02 UTC (permalink / raw)
To: Chris Mason; +Cc: linux-fsdevel
On Fri, 12 Oct 2007 16:39:27 -0400
Chris Mason <chris.mason@oracle.com> wrote:
> Hello everyone,
>
> The test below creates a sparse file and then fills a hole with
> O_DIRECT. As far as I can tell from reading generic_osync_inode, the
> filesystem metadata is only forced to disk if i_size changes during the
> file write. I've tested ext3, xfs and reiserfs and they all skip the
> commit when filling holes.
>
> I would argue that filling holes via O_DIRECT is supposed to commit the
> metadata required to find those file blocks later. At least on ext3,
> O_SYNC does force a commit when filling holes (I haven't tested the others).
>
> So, is the current behavior a bug or a feature?
I don't think it's a bug. Sure, O_DIRECT is synchronous, but that's
because it is, err, direct. Not because it provides extra data-integrity
guarantees. If you want those guarantees, use O_SYNC as well.
> dd if=/dev/zero of=foo bs=1M seek=1 count=1 oflag=direct
>
> hexdump foo | head -n 2
> 0000000 62b1 ea2d 73e8 c64f f5ef 1af5 dd09 8ccd
> 0000010 75ec 9581 e0ea ae9b e28f b76d a700 4d5b
>
> dd if=/dev/urandom of=foo bs=4k count=1 conv=notrunc oflag=direct
> reboot -nf
>
> (after reboot)
>
> hexdump foo
> 0000000 0000 0000 0000 0000 0000 0000 0000 0000
> *
> 0200000
>
> -chris
>
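The suggestion to add O_SYNC can be tried with the same dd-based test, since GNU dd accepts a comma-separated `oflag` list. A minimal sketch, assuming GNU dd; the function name `fill_hole` and its optional second argument are hypothetical:

```shell
# fill_hole FILE [OFLAGS]: overwrite the first 4 KiB of FILE in place.
# OFLAGS is passed to dd's oflag= and defaults to "direct,sync",
# which opens the output with both O_DIRECT and O_SYNC.
fill_hole() {
    dd if=/dev/urandom of="$1" bs=4k count=1 conv=notrunc \
        oflag="${2:-direct,sync}"
}
```

Running `fill_hole foo` in place of the second dd command, followed by `reboot -nf`, would show whether the data survives on a given filesystem; `fill_hole foo direct` reproduces the original test. The thread only reports an O_SYNC result for ext3, so the outcome elsewhere is not established here.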
* Re: Correct behavior on O_DIRECT sparse file writes
2007-10-12 21:02 ` Andrew Morton
@ 2007-10-13 11:24 ` Florian Weimer
2007-10-15 15:36 ` Chuck Lever
2007-10-15 16:53 ` Bryan Henderson
1 sibling, 1 reply; 5+ messages in thread
From: Florian Weimer @ 2007-10-13 11:24 UTC (permalink / raw)
To: Andrew Morton; +Cc: Chris Mason, linux-fsdevel
* Andrew Morton:
> I don't think it's a bug. Sure, O_DIRECT is synchronous, but that's
> because it is, err, direct. Not because it provides extra data-integrity
> guarantees. If you want those guarantees, use O_SYNC as well.
This needs to be prominently documented. Right now, it's far from clear
that you need both O_DIRECT and O_SYNC.
* Re: Correct behavior on O_DIRECT sparse file writes
2007-10-13 11:24 ` Florian Weimer
@ 2007-10-15 15:36 ` Chuck Lever
0 siblings, 0 replies; 5+ messages in thread
From: Chuck Lever @ 2007-10-15 15:36 UTC (permalink / raw)
To: Florian Weimer; +Cc: Andrew Morton, Chris Mason, linux-fsdevel
Florian Weimer wrote:
> * Andrew Morton:
>
>> I don't think it's a bug. Sure, O_DIRECT is synchronous, but that's
>> because it is, err, direct. Not because it provides extra data-integrity
>> guarantees. If you want those guarantees, use O_SYNC as well.
>
> This needs to be prominently documented. Right now, it's far from clear
> that you need both O_DIRECT and O_SYNC.
It's certainly not a requirement for NFS. O_DIRECT on NFS forces data
to the server, which always updates a file's metadata on each write,
including indirect blocks.
* Re: Correct behavior on O_DIRECT sparse file writes
2007-10-12 21:02 ` Andrew Morton
2007-10-13 11:24 ` Florian Weimer
@ 2007-10-15 16:53 ` Bryan Henderson
1 sibling, 0 replies; 5+ messages in thread
From: Bryan Henderson @ 2007-10-15 16:53 UTC (permalink / raw)
To: Andrew Morton; +Cc: Chris Mason, linux-fsdevel
>> The test below creates a sparse file and then fills a hole with
>> O_DIRECT. As far as I can tell from reading generic_osync_inode, the
>> filesystem metadata is only forced to disk if i_size changes during the
>> file write. I've tested ext3, xfs and reiserfs and they all skip the
>> commit when filling holes.
>>
>> I would argue that filling holes via O_DIRECT is supposed to commit the
>> metadata required to find those file blocks later. At least on ext3,
>> O_SYNC does force a commit when filling holes (I haven't tested the others).
>>
>I don't think it's a bug. Sure, O_DIRECT is synchronous, but that's
>because it is, err, direct. Not because it provides extra data-integrity
>guarantees. If you want those guarantees, use O_SYNC as well.
That makes sense, but how do you explain the committing of the size change
without O_SYNC? That seems wrong to me.
This does need to be documented carefully, because a person could easily
believe, even subconsciously, that O_DIRECT makes the entire file write
direct, and sloppy documentation might actually use words to that effect.
--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems