* Possible O_DIRECT problems ?
@ 2001-12-21 0:08 Dave Jones
2001-12-21 0:23 ` Trond Myklebust
0 siblings, 1 reply; 10+ messages in thread
From: Dave Jones @ 2001-12-21 0:08 UTC (permalink / raw)
To: Linux Kernel; +Cc: andrea, davej
Andrea, lk,
I just experimented with O_DIRECT in conjunction with fsx,
and the results aren't pretty.
Over NFS it survives around 921 operations; all local filesystems
(ext2, ext3, reiser tested) survive just 6 operations.
I've put the source to a modified fsx at
http://www.codemonkey.org.uk/cruft/fsx-odirect.c
It's possible I've done something wrong here, so look it over.
Just adding the O_DIRECT flag to open() should be all that's necessary,
correct?
Also note that by changing the flags on line 988 to include O_DIRECT
as well, we get a different failure type.
So, did I get the usage of O_DIRECT correct and find some bugs,
or have I had a little too much xmas spirits already ? 8-)
Dave.
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs .
* Re: Possible O_DIRECT problems ?
2001-12-21 0:08 Possible O_DIRECT problems ? Dave Jones
@ 2001-12-21 0:23 ` Trond Myklebust
2001-12-21 0:39 ` Dave Jones
0 siblings, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2001-12-21 0:23 UTC (permalink / raw)
To: Dave Jones; +Cc: Linux Kernel, andrea, davej, Chuck Lever
>>>>> " " == Dave Jones <davej@suse.de> writes:
> Andrea, lk,
> I just experimented with O_DIRECT in conjunction with fsx,
> and the results aren't pretty.
> Over NFS it survives around 921 operations, all local
> filesystems (ext2,ext3,reiser tested) just 6 operations. I've
> put the source to a modified fsx at
> http://www.codemonkey.org.uk/cruft/fsx-odirect.c
Dave,
O_DIRECT for NFS isn't yet merged into the kernel. Are these Chuck
Lever's NFS patches you've been testing?
Cheers,
Trond
* Re: Possible O_DIRECT problems ?
2001-12-21 0:23 ` Trond Myklebust
@ 2001-12-21 0:39 ` Dave Jones
[not found] ` <w53ellp2out.wl@megaela.fe.dis.titech.ac.jp>
2001-12-29 15:25 ` Andrea Arcangeli
0 siblings, 2 replies; 10+ messages in thread
From: Dave Jones @ 2001-12-21 0:39 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Dave Jones, Linux Kernel, andrea, davej, Chuck Lever
On Fri, Dec 21, 2001 at 01:23:45AM +0100, Trond Myklebust wrote:
> O_DIRECT for NFS isn't yet merged into the kernel. Are these Chuck
> Lever's NFS patches you've been testing?
Nope, stock 2.4.17rc2 & 2.5.1.
I thought NFS might just ignore the O_DIRECT flag if it didn't
understand it yet; I wasn't expecting such a dramatic failure.
I just got reminded of the bugs Andrew Morton & some others
found in O_DIRECT, so this may be hitting the same problems
already found.
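For readers who want the "just ignore the flag" behaviour in an application rather than in the filesystem, here is a minimal Python sketch. It assumes a Linux system where open() reports EINVAL when the filesystem refuses O_DIRECT; the helper name open_maybe_direct is hypothetical and is not part of fsx or of any patch discussed in this thread.

```python
import errno
import os


def open_maybe_direct(path, flags=os.O_RDONLY):
    """Open path with O_DIRECT when possible; return (fd, used_direct).

    Assumption: a filesystem that cannot do direct I/O fails the open()
    with EINVAL. Any other error is passed on to the caller.
    """
    direct = getattr(os, "O_DIRECT", 0)  # 0 on platforms without the flag
    if direct:
        try:
            return os.open(path, flags | direct), True
        except OSError as e:
            if e.errno != errno.EINVAL:
                raise
    # Fall back to ordinary buffered I/O.
    return os.open(path, flags), False
```

A caller would check used_direct before deciding whether its reads and writes have to follow the O_DIRECT alignment rules. Note that a filesystem may accept the flag at open() time and only fail later, at read() or write() time, so this fallback is not a complete guard.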
Dave.
* Re: Possible O_DIRECT problems ?
[not found] ` <w53ellp2out.wl@megaela.fe.dis.titech.ac.jp>
@ 2001-12-21 12:46 ` Trond Myklebust
2001-12-21 16:14 ` Chuck Lever
2001-12-21 16:04 ` Chuck Lever
1 sibling, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2001-12-21 12:46 UTC (permalink / raw)
To: GOTO Masanori, davej, davej, linux-kernel, andrea, cel
On Friday 21. December 2001 05:12, GOTO Masanori wrote:
> At Fri, 21 Dec 2001 00:39:42 +0000,
>
> Dave Jones <davej@codemonkey.org.uk> wrote:
> > On Fri, Dec 21, 2001 at 01:23:45AM +0100, Trond Myklebust wrote:
> > > O_DIRECT for NFS isn't yet merged into the kernel. Are these Chuck
> > > Lever's NFS patches you've been testing?
>
> Where is Chuck's patch? I searched but didn't find it.
I haven't put it up on my own web-site, but it should be available from the
CITI NFS client performance project site. See
http://www.citi.umich.edu/projects/nfs-perf/patches/
> Supporting direct_IO with NFS is meaningful
> for users who have a fast NAS server environment, IMHO.
It can also provide for better data security in some circumstances.
Journaling in databases over NFS can for instance benefit greatly, and has
been one of Chuck's motivations for doing it.
Cheers,
Trond
* Re: Possible O_DIRECT problems ?
[not found] ` <w53ellp2out.wl@megaela.fe.dis.titech.ac.jp>
2001-12-21 12:46 ` Trond Myklebust
@ 2001-12-21 16:04 ` Chuck Lever
1 sibling, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2001-12-21 16:04 UTC (permalink / raw)
To: GOTO Masanori; +Cc: davej, trond.myklebust, davej, linux-kernel, andrea
fyi: the complete patch against 2.4.16 (should work with little or no
modification against 2.4.17) is here:
http://www.citi.umich.edu/projects/nfs-perf/patches/
you'll need to apply inode2file.diff then nfs-odirect11.diff, and it
requires Trond's pathconf patch in order to be completely useful.
because O_DIRECT cannot do small I/O (must be a multiple of a block size),
does fsx work when using it? can someone describe the failures?
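To make the block-size constraint concrete: a tester that picks random offsets and lengths, as fsx does, would have to widen each request to block boundaries before issuing it through O_DIRECT. The Python sketch below shows only that arithmetic. BLOCK_SIZE = 4096 is an assumption for illustration (the real requirement depends on the filesystem and device), and both helper names are hypothetical.

```python
BLOCK_SIZE = 4096  # assumption: the real value depends on filesystem and device


def is_aligned(offset, length, block_size=BLOCK_SIZE):
    """True when both offset and length are multiples of block_size."""
    return offset % block_size == 0 and length % block_size == 0


def align_request(offset, length, block_size=BLOCK_SIZE):
    """Widen (offset, length) to the smallest block-aligned range covering it."""
    start = (offset // block_size) * block_size              # round start down
    end = -(-(offset + length) // block_size) * block_size   # round end up
    return start, end - start
```

For example, a 200-byte request at offset 4000 straddles a block boundary, so align_request(4000, 200) widens it to two full blocks starting at offset 0.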
On Fri, 21 Dec 2001, GOTO Masanori wrote:
> At Fri, 21 Dec 2001 00:39:42 +0000,
> Dave Jones <davej@codemonkey.org.uk> wrote:
> >
> > On Fri, Dec 21, 2001 at 01:23:45AM +0100, Trond Myklebust wrote:
> >
> > > O_DIRECT for NFS isn't yet merged into the kernel. Are these Chuck
> > > Lever's NFS patches you've been testing?
>
> Where is Chuck's patch? I searched but didn't find it.
>
> > Nope, stock 2.4.17rc2 & 2.5.1.
> > I thought NFS might just ignore the O_DIRECT flag if it didn't
> > understand it yet, I wasn't expecting such a dramatic failure.
>
> Supporting direct_IO with NFS is meaningful
> for users who have a fast NAS server environment, IMHO.
>
> > I just got reminded of the bugs Andrew Morton & some others
> > found in O_DIRECT, so this may be hitting the same problems
> > already found.
>
> No, I think it's a different issue, but it may be another bug...
>
> -- gotom
>
- Chuck Lever
--
corporate: <cel@netapp.com>
personal: <chucklever@bigfoot.com>
* Re: Possible O_DIRECT problems ?
2001-12-21 12:46 ` Trond Myklebust
@ 2001-12-21 16:14 ` Chuck Lever
0 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2001-12-21 16:14 UTC (permalink / raw)
To: Trond Myklebust; +Cc: GOTO Masanori, davej, davej, linux-kernel, andrea
On Fri, 21 Dec 2001, Trond Myklebust wrote:
> On Friday 21. December 2001 05:12, GOTO Masanori wrote:
> > At Fri, 21 Dec 2001 00:39:42 +0000,
> >
> > Dave Jones <davej@codemonkey.org.uk> wrote:
> > > On Fri, Dec 21, 2001 at 01:23:45AM +0100, Trond Myklebust wrote:
> > > > O_DIRECT for NFS isn't yet merged into the kernel. Are these Chuck
> > > > Lever's NFS patches you've been testing?
> >
> > Where is Chuck's patch? I searched but didn't find it.
>
> I haven't put it up on my own web-site, but it should be available from the
> CITI NFS client performance project site. See
>
> http://www.citi.umich.edu/projects/nfs-perf/patches/
>
> > Supporting direct_IO with NFS is meaningful
> > for users who have a fast NAS server environment, IMHO.
>
> It can also provide for better data security in some circumstances.
> Journaling in databases over NFS can for instance benefit greatly, and has
> been one of Chuck's motivations for doing it.
the patch is designed for applications that manage their own data cache,
like databases do. but it is also useful for applications that want to
move large datasets without blowing the O/S level data cache.
in the NFS case, because O_DIRECT read() and write() always go back to the
server, you can more easily build clustered and HA applications that share
the data storage backend.
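As an illustration of the "large dataset without blowing the cache" case, the Python sketch below checksums a file through O_DIRECT in large chunks, reusing one page-aligned buffer. It is a rough sketch under assumptions, not part of the patch: CHUNK and checksum_direct are hypothetical names, the chunk size is assumed to be a multiple of whatever block size the filesystem requires, and a short read is assumed to mean end of file.

```python
import mmap
import os
import zlib

CHUNK = 1 << 20  # 1 MiB; assumption: a multiple of the required block size


def checksum_direct(path, chunk=CHUNK):
    """CRC32 a file via O_DIRECT so the scan does not fill the page cache."""
    buf = mmap.mmap(-1, chunk)  # anonymous mapping: page-aligned, reused
    try:
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
        try:
            crc = 0
            offset = 0
            while True:
                n = os.preadv(fd, [buf], offset)
                crc = zlib.crc32(buf[:n], crc)
                if n < chunk:   # short read: treated as end of file
                    break
                offset += n     # stays a multiple of chunk, hence aligned
            return crc
        finally:
            os.close(fd)
    finally:
        buf.close()
```

The loop stops on the first short read instead of retrying at offset + n, because that next offset would generally not be block-aligned and could be rejected.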
- Chuck Lever
--
corporate: <cel@netapp.com>
personal: <chucklever@bigfoot.com>
* Re: Possible O_DIRECT problems ?
2001-12-21 0:39 ` Dave Jones
[not found] ` <w53ellp2out.wl@megaela.fe.dis.titech.ac.jp>
@ 2001-12-29 15:25 ` Andrea Arcangeli
2001-12-29 18:46 ` CJ
1 sibling, 1 reply; 10+ messages in thread
From: Andrea Arcangeli @ 2001-12-29 15:25 UTC (permalink / raw)
To: Dave Jones, Trond Myklebust, Dave Jones, Linux Kernel,
Chuck Lever
On Fri, Dec 21, 2001 at 12:39:42AM +0000, Dave Jones wrote:
> On Fri, Dec 21, 2001 at 01:23:45AM +0100, Trond Myklebust wrote:
>
> > O_DIRECT for NFS isn't yet merged into the kernel. Are these Chuck
> > Lever's NFS patches you've been testing?
>
> Nope, stock 2.4.17rc2 & 2.5.1.
> I thought NFS might just ignore the O_DIRECT flag if it didn't
> understand it yet, I wasn't expecting such a dramatic failure.
The point of O_DIRECT is to do DMA directly into the userspace memory.
Avoiding the VM overhead is a secondary benefit, and with data
journaling we may need to put an anchor into the VM to serialize the
direct I/O with the pagecache I/O in a secondary, slower direct_IO
callback for the data journaling fs.
But to avoid the memory copies you're required to use strict alignment
and size for the userspace buffers, just like rawio.
If you don't, you will get -EINVAL. This ensures people will use O_DIRECT
correctly in their apps. In short, every single bug report like this about
the strict -EINVAL behaviour is proof that we need to be strict and
return -EINVAL :)
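A minimal userspace sketch of what "strict alignment and size" means in practice, written in Python for brevity. It assumes a Linux system with os.O_DIRECT and os.preadv available; direct_read is a hypothetical helper name. The buffer comes from an anonymous mmap, which is page-aligned, and the caller is expected to pass an offset and length that are multiples of the block size.

```python
import mmap
import os


def direct_read(path, offset, length):
    """Read length bytes at offset with O_DIRECT into a page-aligned buffer.

    Returns the bytes read. Raises OSError (typically EINVAL) when the
    filesystem rejects O_DIRECT or the offset/length are not acceptable.
    """
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        buf = mmap.mmap(-1, length)  # anonymous mappings are page-aligned
        try:
            n = os.preadv(fd, [buf], offset)
            return buf[:n]           # copy out before the mapping is closed
        finally:
            buf.close()
    finally:
        os.close(fd)
```

A call such as direct_read(path, 0, 4096) follows the rules. A call with an odd offset, such as direct_read(path, 1, 4096), is the kind of request that would be expected to fail with EINVAL on a filesystem that enforces them.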
Andrea
* Re: Possible O_DIRECT problems ?
2001-12-29 15:25 ` Andrea Arcangeli
@ 2001-12-29 18:46 ` CJ
2001-12-30 5:39 ` Andre Hedrick
0 siblings, 1 reply; 10+ messages in thread
From: CJ @ 2001-12-29 18:46 UTC (permalink / raw)
To: Linux Kernel; +Cc: Dave Jones, Trond Myklebust, Dave Jones, Chuck Lever
Shouldn't O_DIRECT's requirements come from the hardware? If ASPI or
CAM can DMA to SCSI devices at odd addresses and lengths, why not
O_DIRECT? Do tape drives DMA to user buffers? Are O_DIRECT's
current limits gratuitous?
Andrea Arcangeli wrote:
>On Fri, Dec 21, 2001 at 12:39:42AM +0000, Dave Jones wrote:
>
>>On Fri, Dec 21, 2001 at 01:23:45AM +0100, Trond Myklebust wrote:
>>
>> > O_DIRECT for NFS isn't yet merged into the kernel. Are these Chuck
>> > Lever's NFS patches you've been testing?
>>
>>Nope, stock 2.4.17rc2 & 2.5.1.
>>I thought NFS might just ignore the O_DIRECT flag if it didn't
>>understand it yet, I wasn't expecting such a dramatic failure.
>>
>
>The point of O_DIRECT is to do DMA directly into the userspace memory
>(and to avoid the VM overhead but that's a secondary issue and with data
>journaling we may need to put an anchor into the VM to serialize the
>direct I/O with the pagecache I/O in a secondary - slower - direct_IO
>callback for the data journaling fs).
>
>But to avoid the mem copies you're required to use strict alignment and
>size of the userspace buffers, just like rawio.
>
>If you don't you will get -EINVAL. This ensures people will use O_DIRECT
>correctly in their apps. In short every single bugreport like this about
>this -EINVAL strict behaviour is the proof we need to be strict and to
>return -EINVAL :)
>
>Andrea
* Re: Possible O_DIRECT problems ?
2001-12-29 18:46 ` CJ
@ 2001-12-30 5:39 ` Andre Hedrick
2001-12-30 11:16 ` Gérard Roudier
0 siblings, 1 reply; 10+ messages in thread
From: Andre Hedrick @ 2001-12-30 5:39 UTC (permalink / raw)
To: CJ; +Cc: Linux Kernel, Dave Jones, Trond Myklebust, Dave Jones,
Chuck Lever
On Sat, 29 Dec 2001, CJ wrote:
> Shouldn't O_DIRECT's requirements come from the hardware? If ASPI or
> CAM can DMA to SCSI devices at odd addresses and lengths, why not
> O_DIRECT? Do tape drives DMA to user buffers? Are O_DIRECT's
> current limits gratuitous?
CAM is a very bad thing and that is why the X3 committees split.
Andre Hedrick
Linux Disk Certification Project Linux ATA Development
* Re: Possible O_DIRECT problems ?
2001-12-30 5:39 ` Andre Hedrick
@ 2001-12-30 11:16 ` Gérard Roudier
0 siblings, 0 replies; 10+ messages in thread
From: Gérard Roudier @ 2001-12-30 11:16 UTC (permalink / raw)
To: Andre Hedrick
Cc: CJ, Linux Kernel, Dave Jones, Trond Myklebust, Dave Jones,
Chuck Lever
On Sat, 29 Dec 2001, Andre Hedrick wrote:
> On Sat, 29 Dec 2001, CJ wrote:
>
> > Shouldn't O_DIRECT's requirements come from the hardware? If ASPI or
> > CAM can DMA to SCSI devices at odd addresses and lengths, why not
> > O_DIRECT? Do tape drives DMA to user buffers? Are O_DIRECT's
> > current limits gratuitous?
>
> CAM is a very bad thing and that is why the X3 committees split.
There were interesting guidelines in CAM, notably the topology handling
and the error recovery scheme. But it was yet another wheel in a
world where everybody reinvented their own. It also seemed very
DEC-tainted.
Btw, given guys like you in X3 committees, I am not surprised that splits
occur in this place. :-)
Gérard.
PS: Your various email addresses bounce back claiming some ridiculous
text about spammers. Is this yet another show of your apparent
existential complex?