* A problem about DIRECT IO on ext3
@ 2005-08-29 12:15 colin
2005-08-29 13:29 ` Erik Mouw
0 siblings, 1 reply; 15+ messages in thread
From: colin @ 2005-08-29 12:15 UTC (permalink / raw)
To: linux-kernel
Hi all,
I wrote a simple program to test direct io, and found some
strange behavior on "ext3".
My simple program is below. Assume that the executable file name is
"directio". If I do the following:
1. cp directio aaa
2. ./directio directio aaa
The size of aaa is about the same as that of directio. This is wrong.
It should be 3 times the size of directio because there are 2 write
operations and one lseek to the end of the file.
If the second file is not opened with "O_DIRECT", the result is correct.
What is the problem with direct io? I found that if I remove the lseek
call, the result is correct.
Is there a problem with lseek when doing direct io on ext3?
My platform is 2.6.11.
Regards,
Colin
#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <stdlib.h>
int main(int argc, char **argv) {
int fd1, fd2, count;
char *ptr1, *ptr2;
if(argc == 3) {
fd1 = open(argv[1], O_RDONLY | O_DIRECT, S_IRWXU);
fd2 = open(argv[2], O_RDWR | O_CREAT | O_DIRECT);
} else {
printf("Error syntax\n");
exit(1);
}
printf("%d\n", lseek(fd2, 0, SEEK_END));
ptr1 = malloc(4096 + 4096-1);
ptr2 = (void*)((int)ptr1 - (int)ptr1 % 4096 + 4096);
do {
count = read(fd1, ptr2, 4096);
if(!count)
break;
write(fd2, ptr2, 4096);
write(fd2, ptr2, 4096);
} while(count > 0);
free(ptr1);
close(fd1);
close(fd2);
}
* Re: A problem about DIRECT IO on ext3
@ 2005-08-29 13:21 colin
0 siblings, 0 replies; 15+ messages in thread
From: colin @ 2005-08-29 13:21 UTC (permalink / raw)
To: linux-kernel
Hi all,
Sorry, ignore this mail.
I found that I didn't align the block size when doing direct io... :-(
Regards,
Colin
* Re: A problem about DIRECT IO on ext3
2005-08-29 12:15 A problem about DIRECT IO on ext3 colin
@ 2005-08-29 13:29 ` Erik Mouw
2005-08-31 8:07 ` Jens Axboe
0 siblings, 1 reply; 15+ messages in thread
From: Erik Mouw @ 2005-08-29 13:29 UTC (permalink / raw)
To: colin; +Cc: linux-kernel
On Mon, Aug 29, 2005 at 08:15:43PM +0800, colin wrote:
> I wrote a simple program to test direct io, and found that there are some
> strange behaviors of it on "ext3".
> My simple program is below. Assume that the executable file name is
> "directio". If I do the following:
> 1. cp directio aaa
> 2. ./directio directio aaa
>
> The size of aaa is about the same with directio. This is wrong.
No, it's right, but it's not what you expected.
> It should be 3 times the size of directio because there are 2 write
> operations and one lseek to the file end.
I suggest you strace your program to see what happens.
> If the second file is not opened with "O_DIRECT", the result is correct.
>
> What's the problem of direct io? I found that if I remove the instruction of
> lseek, the result is correct.
There are four prerequisites for direct IO:
- the file needs to be opened with O_DIRECT
- the buffer needs to be page aligned (hint: use getpagesize() instead
of assuming that a page is 4k)
- reads and writes need to happen *in* multiples of the soft block size
- reads and writes need to happen *at* multiples of the soft block size
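The alignment rules in the list above can be sketched as a small check. This is only an illustration: the helper name dio_request_ok() is made up, the soft block size is assumed to be a power of two supplied by the caller, and sysconf(_SC_PAGESIZE) is used as the portable equivalent of getpagesize().

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): returns 1 if a request
 * satisfies the alignment rules listed above, 0 otherwise.
 * blksz is the soft block size, assumed to be a power of two. */
static int dio_request_ok(const void *buf, size_t len, off_t off, size_t blksz)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    if (((uintptr_t)buf & (page - 1)) != 0)
        return 0;               /* buffer is not page aligned */
    if (len == 0 || (len & (blksz - 1)) != 0)
        return 0;               /* length is not a multiple of the block size */
    if (off < 0 || ((uintmax_t)off & (blksz - 1)) != 0)
        return 0;               /* offset is not at a multiple of the block size */
    return 1;
}
```

The first rule (opening the file with O_DIRECT) is not something this check can see; it only covers the buffer, length and offset.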
> Is there any problem of lseek when doing direct io on ext3?
> My platform is 2.6.11.
There is no problem.
> Regards,
> Colin
>
> #define _GNU_SOURCE
>
> #include <stdio.h>
> #include <fcntl.h>
> #include <sys/stat.h>
> #include <sys/types.h>
> #include <stdlib.h>
Compile your program with -Wall; you're missing quite a few include
files here.
> int main(int argc, char **argv) {
>
> int fd1, fd2, count;
> char *ptr1, *ptr2;
>
> if(argc == 3) {
> fd1 = open(argv[1], O_RDONLY | O_DIRECT, S_IRWXU);
> fd2 = open(argv[2], O_RDWR | O_CREAT | O_DIRECT);
> } else {
> printf("Error syntax\n");
> exit(1);
> }
> printf("%d\n", lseek(fd2, 0, SEEK_END));
Make that lseek(fd2, 4 * 4096, SEEK_SET);
> ptr1 = malloc(4096 + 4096-1);
> ptr2 = (void*)((int)ptr1 - (int)ptr1 % 4096 + 4096);
Use memalign() or posix_memalign().
> do {
> count = read(fd1, ptr2, 4096);
> if(!count)
> break;
And what happens when count < 0 ?
> write(fd2, ptr2, 4096);
> write(fd2, ptr2, 4096);
Check return values.
> } while(count > 0);
>
> free(ptr1);
> close(fd1);
> close(fd2);
return 0;
> }
With the changes, the result is:
erik@arthur:/tmp > ls -l directio aaa
-rwxr-xr-x 1 erik erik 49152 2005-08-29 15:26 aaa*
-rwxr-xr-x 1 erik erik 12628 2005-08-29 15:26 directio*
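
The 49152 figure follows from the suggested changes: directio is 12628 bytes, which is read as 4 blocks of 4096 (the last one partial), each block is written twice in full, and writing starts at offset 4 * 4096. A minimal sketch of a program with those changes applied, assuming a fixed 4096-byte block as in the original; the function names are made up and this is not the exact code that was run:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for O_DIRECT */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* fallback so the sketch compiles; I/O is then buffered */
#endif

#define BLK 4096                /* assumed block size, as in the original program */

/* Size the destination should end up with: the start offset plus two
 * full blocks for every (possibly partial) block read from the source. */
static off_t expected_size(off_t src_size, off_t start)
{
    off_t blocks = (src_size + BLK - 1) / BLK;

    return start + 2 * blocks * BLK;
}

/* Copy src to dst, writing each block twice, starting at the
 * block-aligned offset start. Returns 0 on success, -1 on error. */
static int copy_doubled(const char *src, const char *dst, off_t start)
{
    void *buf = NULL;
    ssize_t n = -1;
    int rc = -1;
    int in = open(src, O_RDONLY | O_DIRECT);
    int out = open(dst, O_RDWR | O_CREAT | O_DIRECT, 0644);

    if (in < 0 || out < 0)
        goto done;
    if (posix_memalign(&buf, BLK, BLK) != 0) {
        buf = NULL;
        goto done;
    }
    if (lseek(out, start, SEEK_SET) == (off_t)-1)
        goto done;
    /* A short final read still leads to full-block writes, as in the
     * original program; the tail of the buffer holds stale data. */
    while ((n = read(in, buf, BLK)) > 0) {
        if (write(out, buf, BLK) != BLK || write(out, buf, BLK) != BLK)
            goto done;
    }
    if (n == 0)
        rc = 0;                 /* clean end of file */
done:
    free(buf);
    if (in >= 0)
        close(in);
    if (out >= 0)
        close(out);
    return rc;
}
```

For the files above, expected_size(12628, 4 * 4096) gives 16384 + 2 * 4 * 4096 = 49152, matching the ls output.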
Erik
--
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
* Re: A problem about DIRECT IO on ext3
2005-08-29 13:29 ` Erik Mouw
@ 2005-08-31 8:07 ` Jens Axboe
2005-08-31 11:12 ` Erik Mouw
2005-10-17 8:52 ` li nux
0 siblings, 2 replies; 15+ messages in thread
From: Jens Axboe @ 2005-08-31 8:07 UTC (permalink / raw)
To: Erik Mouw; +Cc: colin, linux-kernel
On Mon, Aug 29 2005, Erik Mouw wrote:
> There are four prerequisites for direct IO:
> - the file needs to be opened with O_DIRECT
> - the buffer needs to be page aligned (hint: use getpagesize() instead
> of assuming that a page is 4k
> - reads and writes need to happen *in* multiples of the soft block size
> - reads and writes need to happen *at* multiples of the soft block size
Actually, the buffer only needs to be hard block size aligned, same goes
for the chunk size used for reads/writes.
--
Jens Axboe
* Re: A problem about DIRECT IO on ext3
2005-08-31 8:07 ` Jens Axboe
@ 2005-08-31 11:12 ` Erik Mouw
2005-08-31 11:15 ` Jens Axboe
2005-10-17 8:52 ` li nux
1 sibling, 1 reply; 15+ messages in thread
From: Erik Mouw @ 2005-08-31 11:12 UTC (permalink / raw)
To: Jens Axboe; +Cc: colin, linux-kernel
On Wed, Aug 31, 2005 at 10:07:45AM +0200, Jens Axboe wrote:
> On Mon, Aug 29 2005, Erik Mouw wrote:
> > There are four prerequisites for direct IO:
> > - the file needs to be opened with O_DIRECT
> > - the buffer needs to be page aligned (hint: use getpagesize() instead
> > of assuming that a page is 4k
> > - reads and writes need to happen *in* multiples of the soft block size
> > - reads and writes need to happen *at* multiples of the soft block size
>
> Actually, the buffer only needs to be hard block size aligned, same goes
> for the chunk size used for reads/writes.
OK, so that's different from 2.4 where reads/writes needed to be soft
block aligned and buffers page aligned.
Erik
--
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
| Data lost? Stay calm and contact Harddisk-recovery.com
* Re: A problem about DIRECT IO on ext3
2005-08-31 11:12 ` Erik Mouw
@ 2005-08-31 11:15 ` Jens Axboe
0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2005-08-31 11:15 UTC (permalink / raw)
To: Erik Mouw; +Cc: colin, linux-kernel
On Wed, Aug 31 2005, Erik Mouw wrote:
> On Wed, Aug 31, 2005 at 10:07:45AM +0200, Jens Axboe wrote:
> > On Mon, Aug 29 2005, Erik Mouw wrote:
> > > There are four prerequisites for direct IO:
> > > - the file needs to be opened with O_DIRECT
> > > - the buffer needs to be page aligned (hint: use getpagesize() instead
> > > of assuming that a page is 4k
> > > - reads and writes need to happen *in* multiples of the soft block size
> > > - reads and writes need to happen *at* multiples of the soft block size
> >
> > Actually, the buffer only needs to be hard block size aligned, same goes
> > for the chunk size used for reads/writes.
>
> OK, so that's different from 2.4 where reads/writes needed to be soft
> block aligned and buffers page aligned.
Yes, 2.6 has relaxed the restrictions there somewhat.
--
Jens Axboe
* Re: A problem about DIRECT IO on ext3
2005-08-31 8:07 ` Jens Axboe
2005-08-31 11:12 ` Erik Mouw
@ 2005-10-17 8:52 ` li nux
2005-10-17 8:58 ` li nux
2005-10-17 9:03 ` Jens Axboe
1 sibling, 2 replies; 15+ messages in thread
From: li nux @ 2005-10-17 8:52 UTC (permalink / raw)
To: Jens Axboe, Erik Mouw; +Cc: colin, linux-kernel
--- Jens Axboe <axboe@suse.de> wrote:
> On Mon, Aug 29 2005, Erik Mouw wrote:
> > There are four prerequisites for direct IO:
> > - the file needs to be opened with O_DIRECT
> > - the buffer needs to be page aligned (hint: use
> getpagesize() instead
> > of assuming that a page is 4k
> > - reads and writes need to happen *in* multiples
> of the soft block size
> > - reads and writes need to happen *at* multiples
> of the soft block size
>
> Actually, the buffer only needs to be hard block
> size aligned, same goes
> for the chunk size used for reads/writes.
>
> --
> Jens Axboe
>
On 2.4 the open call succeeds with O_DIRECT,
but read returns -EINVAL for any block size (512, 1024
.. 16384):
open("/tmp/midstress_idx10", O_RDWR|O_CREAT|O_DIRECT|O_LARGEFILE, 01001101270) = 4
read(3, 0xbfffdc40, 16384) = -1 EINVAL (Invalid argument)
How can I correct this problem?
* Re: A problem about DIRECT IO on ext3
2005-10-17 8:52 ` li nux
@ 2005-10-17 8:58 ` li nux
2005-10-17 9:03 ` Jens Axboe
1 sibling, 0 replies; 15+ messages in thread
From: li nux @ 2005-10-17 8:58 UTC (permalink / raw)
To: li nux, Jens Axboe, Erik Mouw; +Cc: colin, linux-kernel
--- li nux <lnxluv@yahoo.com> wrote:
>
>
> --- Jens Axboe <axboe@suse.de> wrote:
>
> > On Mon, Aug 29 2005, Erik Mouw wrote:
> > > There are four prerequisites for direct IO:
> > > - the file needs to be opened with O_DIRECT
> > > - the buffer needs to be page aligned (hint: use
> > getpagesize() instead
> > > of assuming that a page is 4k
> > > - reads and writes need to happen *in* multiples
> > of the soft block size
> > > - reads and writes need to happen *at* multiples
> > of the soft block size
> >
> > Actually, the buffer only needs to be hard block
> > size aligned, same goes
> > for the chunk size used for reads/writes.
> >
> > --
> > Jens Axboe
> >
On 2.4 the open call succeeds with O_DIRECT,
but read returns -EINVAL for any block size (512,
1024 .. 16384):
open("/tmp/midstress_idx10", O_RDWR|O_CREAT|O_DIRECT|O_LARGEFILE, 01001101270) = 3
read(3, 0xbfffdc40, 16384) = -1 EINVAL (Invalid argument)
How can I correct this problem?
* Re: A problem about DIRECT IO on ext3
2005-10-17 8:52 ` li nux
2005-10-17 8:58 ` li nux
@ 2005-10-17 9:03 ` Jens Axboe
2005-10-17 9:15 ` Grzegorz Kulewski
1 sibling, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2005-10-17 9:03 UTC (permalink / raw)
To: li nux; +Cc: Erik Mouw, colin, linux-kernel
On Mon, Oct 17 2005, li nux wrote:
>
>
> --- Jens Axboe <axboe@suse.de> wrote:
>
> > On Mon, Aug 29 2005, Erik Mouw wrote:
> > > There are four prerequisites for direct IO:
> > > - the file needs to be opened with O_DIRECT
> > > - the buffer needs to be page aligned (hint: use
> > getpagesize() instead
> > > of assuming that a page is 4k
> > > - reads and writes need to happen *in* multiples
> > of the soft block size
> > > - reads and writes need to happen *at* multiples
> > of the soft block size
> >
> > Actually, the buffer only needs to be hard block
> > size aligned, same goes
> > for the chunk size used for reads/writes.
> >
> > --
> > Jens Axboe
> >
> On 2.4 the open call succeeds with O_DIRECT
> but read returns -EINVAL for any block size (512, 1024
> ..16384)
>
> open("/tmp/midstress_idx10",
> O_RDWR|O_CREAT|O_DIRECT|O_LARGEFILE, 01001101270) = 4
> read(3, 0xbfffdc40, 16384) = -1 EINVAL (Invalid
> argument)
>
> how to correct this problem ?
See your buffer address, it's not aligned. You need to align that as
well. This is needed because the hardware will dma directly to the user
buffer, and to be on the safe side we require the same alignment as the
block layer will normally generate for file system io.
So in short, just align your read buffer to the same as your block size
and you will be fine. Example:
#define BS (4096)
#define MASK (BS - 1)
#define ALIGN(buf) (((unsigned long) (buf) + MASK) & ~(MASK))
char *ptr = malloc(BS + MASK);
char *buf = (char *) ALIGN(ptr);
read(fd, buf, BS);
--
Jens Axboe
* Re: A problem about DIRECT IO on ext3
2005-10-17 9:03 ` Jens Axboe
@ 2005-10-17 9:15 ` Grzegorz Kulewski
2005-10-17 9:17 ` Jens Axboe
0 siblings, 1 reply; 15+ messages in thread
From: Grzegorz Kulewski @ 2005-10-17 9:15 UTC (permalink / raw)
To: Jens Axboe; +Cc: li nux, Erik Mouw, colin, linux-kernel
On Mon, 17 Oct 2005, Jens Axboe wrote:
>> how to correct this problem ?
>
> See your buffer address, it's not aligned. You need to align that as
> well. This is needed because the hardware will dma directly to the user
> buffer, and to be on the safe side we require the same alignment as the
> block layer will normally generate for file system io.
>
> So in short, just align your read buffer to the same as your block size
> and you will be fine. Example:
>
> #define BS (4096)
> #define MASK (BS - 1)
> #define ALIGN(buf) (((unsigned long) (buf) + MASK) & ~(MASK))
>
> char *ptr = malloc(BS + MASK);
> char *buf = (char *) ALIGN(ptr);
>
> read(fd, buf, BS);
Shouldn't one use posix_memalign(3) for that?
Grzegorz Kulewski
* Re: A problem about DIRECT IO on ext3
2005-10-17 9:15 ` Grzegorz Kulewski
@ 2005-10-17 9:17 ` Jens Axboe
2005-10-17 9:41 ` li nux
0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2005-10-17 9:17 UTC (permalink / raw)
To: Grzegorz Kulewski; +Cc: li nux, Erik Mouw, colin, linux-kernel
On Mon, Oct 17 2005, Grzegorz Kulewski wrote:
> On Mon, 17 Oct 2005, Jens Axboe wrote:
> >>how to correct this problem ?
> >
> >See your buffer address, it's not aligned. You need to align that as
> >well. This is needed because the hardware will dma directly to the user
> >buffer, and to be on the safe side we require the same alignment as the
> >block layer will normally generate for file system io.
> >
> >So in short, just align your read buffer to the same as your block size
> >and you will be fine. Example:
> >
> >#define BS (4096)
> >#define MASK (BS - 1)
> >#define ALIGN(buf) (((unsigned long) (buf) + MASK) & ~(MASK))
> >
> >char *ptr = malloc(BS + MASK);
> >char *buf = (char *) ALIGN(ptr);
> >
> >read(fd, buf, BS);
>
> Shouldn't one use posix_memalign(3) for that?
Dunno if one 'should', one 'can' if one wants to. I prefer to do it
manually so I don't have to jump through #define hoops to get at it
(which, btw, still doesn't expose it on this machine).
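
For comparison, a minimal sketch of the posix_memalign(3) route. On glibc the declaration is exposed when _POSIX_C_SOURCE is at least 200112L, which is presumably the kind of #define meant above. The helper name alloc_dio_buffer() is made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* exposes posix_memalign() on glibc */
#endif
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate size bytes aligned to align, which must
 * be a power of two and a multiple of sizeof(void *).
 * Returns NULL on failure. */
static void *alloc_dio_buffer(size_t align, size_t size)
{
    void *buf = NULL;

    if (posix_memalign(&buf, align, size) != 0)
        return NULL;
    return buf;
}
```

One practical difference from the manual approach: the pointer returned here can be passed directly to free(), whereas with malloc() plus ALIGN() the original unaligned pointer has to be kept around for that.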
--
Jens Axboe
* Re: A problem about DIRECT IO on ext3
2005-10-17 9:17 ` Jens Axboe
@ 2005-10-17 9:41 ` li nux
2005-10-17 9:51 ` Jens Axboe
0 siblings, 1 reply; 15+ messages in thread
From: li nux @ 2005-10-17 9:41 UTC (permalink / raw)
To: Jens Axboe, Grzegorz Kulewski; +Cc: Erik Mouw, colin, linux-kernel
--- Jens Axboe <axboe@suse.de> wrote:
> On Mon, Oct 17 2005, Grzegorz Kulewski wrote:
> > On Mon, 17 Oct 2005, Jens Axboe wrote:
> > >>how to correct this problem ?
> > >
> > >See your buffer address, it's not aligned. You
> need to align that as
> > >well. This is needed because the hardware will
> dma directly to the user
> > >buffer, and to be on the safe side we require the
> same alignment as the
> > >block layer will normally generate for file
> system io.
> > >
> > >So in short, just align your read buffer to the
> same as your block size
> > >and you will be fine. Example:
> > >
> > >#define BS (4096)
> > >#define MASK (BS - 1)
> > >#define ALIGN(buf) (((unsigned long) (buf) +
> MASK) & ~(MASK))
> > >
> > >char *ptr = malloc(BS + MASK);
> > >char *buf = (char *) ALIGN(ptr);
> > >
> > >read(fd, buf, BS);
> >
> > Shouldn't one use posix_memalign(3) for that?
>
> Dunno if one 'should', one 'can' if one wants to. I
> prefer to do it
> manually so I don't have to jump through #define
> hoops to get at it
> (which, btw, still doesn't expose it on this
> machine).
>
> --
> Jens Axboe
Thanx a lot Jens :-)
It's working now.
I did not have to make these adjustments on 2.6.
It looks to be more relaxed.
Can somebody please shed some light on how to find
your system's hard/soft block size?
* Re: A problem about DIRECT IO on ext3
2005-10-17 9:41 ` li nux
@ 2005-10-17 9:51 ` Jens Axboe
2005-10-17 16:36 ` Badari Pulavarty
0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2005-10-17 9:51 UTC (permalink / raw)
To: li nux; +Cc: Grzegorz Kulewski, Erik Mouw, colin, linux-kernel
On Mon, Oct 17 2005, li nux wrote:
>
>
> --- Jens Axboe <axboe@suse.de> wrote:
>
> > On Mon, Oct 17 2005, Grzegorz Kulewski wrote:
> > > On Mon, 17 Oct 2005, Jens Axboe wrote:
> > > >>how to correct this problem ?
> > > >
> > > >See your buffer address, it's not aligned. You
> > need to align that as
> > > >well. This is needed because the hardware will
> > dma directly to the user
> > > >buffer, and to be on the safe side we require the
> > same alignment as the
> > > >block layer will normally generate for file
> > system io.
> > > >
> > > >So in short, just align your read buffer to the
> > same as your block size
> > > >and you will be fine. Example:
> > > >
> > > >#define BS (4096)
> > > >#define MASK (BS - 1)
> > > >#define ALIGN(buf) (((unsigned long) (buf) +
> > MASK) & ~(MASK))
> > > >
> > > >char *ptr = malloc(BS + MASK);
> > > >char *buf = (char *) ALIGN(ptr);
> > > >
> > > >read(fd, buf, BS);
> > >
> > > Shouldn't one use posix_memalign(3) for that?
> >
> > Dunno if one 'should', one 'can' if one wants to. I
> > prefer to do it
> > manually so I don't have to jump through #define
> > hoops to get at it
> > (which, btw, still doesn't expose it on this
> > machine).
> >
> > --
> > Jens Axboe
>
> Thanx a lot Jens :-)
> Its working now.
> I did not have to make these adjustments on 2.6
> Is looks to be having more relaxation.
2.6 does have the option of checking the hardware dma requirement
separately, but for this path you should run into the same restrictions.
Perhaps you just got lucky when testing 2.6?
> Can somebody please throw some light on how to find
> your system's hard/soft block size ?
It's a per-device (or even per-partition, in case of mounted partitions)
setting, you can use the BLKBSZGET and BLKSSZGET ioctls to query for
soft/hard sector sizes.
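
A minimal sketch of such a query, assuming a Linux system with <linux/fs.h> available. The helper name query_block_sizes() is made up, and each result is read here as an int.

```c
#include <fcntl.h>
#include <linux/fs.h>           /* BLKBSZGET, BLKSSZGET */
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: query a block device node for its soft block
 * size and hard sector size. Returns 0 on success, -1 on error
 * (for example if dev cannot be opened or is not a block device). */
static int query_block_sizes(const char *dev, int *soft, int *hard)
{
    int fd = open(dev, O_RDONLY);
    int rc = -1;

    if (fd < 0)
        return -1;
    if (ioctl(fd, BLKBSZGET, soft) == 0 && ioctl(fd, BLKSSZGET, hard) == 0)
        rc = 0;
    close(fd);
    return rc;
}
```

It would be called with the device node that backs the filesystem (for example a path such as /dev/sda1, which is only a placeholder here) and usually needs read permission on that node.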
--
Jens Axboe
* Re: A problem about DIRECT IO on ext3
2005-10-17 9:51 ` Jens Axboe
@ 2005-10-17 16:36 ` Badari Pulavarty
2005-10-17 17:53 ` Jens Axboe
0 siblings, 1 reply; 15+ messages in thread
From: Badari Pulavarty @ 2005-10-17 16:36 UTC (permalink / raw)
To: Jens Axboe; +Cc: li nux, Grzegorz Kulewski, Erik Mouw, colin, lkml
On Mon, 2005-10-17 at 11:51 +0200, Jens Axboe wrote:
> On Mon, Oct 17 2005, li nux wrote:
> >
> >
> > --- Jens Axboe <axboe@suse.de> wrote:
> >
> > > On Mon, Oct 17 2005, Grzegorz Kulewski wrote:
> > > > On Mon, 17 Oct 2005, Jens Axboe wrote:
> > > > >>how to correct this problem ?
> > > > >
> > > > >See your buffer address, it's not aligned. You
> > > need to align that as
> > > > >well. This is needed because the hardware will
> > > dma directly to the user
> > > > >buffer, and to be on the safe side we require the
> > > same alignment as the
> > > > >block layer will normally generate for file
> > > system io.
> > > > >
> > > > >So in short, just align your read buffer to the
> > > same as your block size
> > > > >and you will be fine. Example:
> > > > >
> > > > >#define BS (4096)
> > > > >#define MASK (BS - 1)
> > > > >#define ALIGN(buf) (((unsigned long) (buf) +
> > > MASK) & ~(MASK))
> > > > >
> > > > >char *ptr = malloc(BS + MASK);
> > > > >char *buf = (char *) ALIGN(ptr);
> > > > >
> > > > >read(fd, buf, BS);
> > > >
> > > > Shouldn't one use posix_memalign(3) for that?
> > >
> > > Dunno if one 'should', one 'can' if one wants to. I
> > > prefer to do it
> > > manually so I don't have to jump through #define
> > > hoops to get at it
> > > (which, btw, still doesn't expose it on this
> > > machine).
> > >
> > > --
> > > Jens Axboe
> >
> > Thanx a lot Jens :-)
> > Its working now.
> > I did not have to make these adjustments on 2.6
> > Is looks to be having more relaxation.
>
> 2.6 does have the option of checking the hardware dma requirement
> seperately, but for this path you should run into the same restrictions.
> Perhaps you just got lucky when testing 2.6?
2.6 also has the same restriction. But if the "filesystem
blocksize alignment" (soft block size) check fails, we try to see
if it's aligned with the hard sector size (512). If so, we can do the IO.
2.4 fails if the offset or buffer is NOT filesystem blocksize
aligned. Period.
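
The difference between the two rules can be illustrated in userspace. This is only a sketch of the behaviour as described above, not the kernel's code; the function names are made up and both block sizes are assumed to be powers of two.

```c
#include <stdint.h>

/* Illustration only: 1 if the buffer address and file offset of a
 * request are acceptable under the 2.6 rule described above, else 0. */
static int dio_aligned_26(uintptr_t addr, uint64_t off,
                          unsigned soft_bs, unsigned hard_bs)
{
    uint64_t bits = (uint64_t)addr | off;

    if ((bits & (soft_bs - 1)) == 0)
        return 1;               /* filesystem block size aligned */
    if ((bits & (hard_bs - 1)) == 0)
        return 1;               /* fallback: hard sector size aligned */
    return 0;
}

/* The 2.4 rule as described: no fallback to the hard sector size. */
static int dio_aligned_24(uintptr_t addr, uint64_t off, unsigned soft_bs)
{
    return (((uint64_t)addr | off) & (soft_bs - 1)) == 0;
}
```

A buffer at an address such as 0x1200 (a multiple of 512 but not of 4096) passes the first check with soft_bs = 4096 and hard_bs = 512, and fails the second.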
So it's possible that your buffer is at least 512-byte aligned,
thereby succeeding on 2.6.
BTW, posix_memalign() or valloc() should be safe.
>
> > Can somebody please throw some light on how to find
> > your system's hard/soft block size ?
>
> It's a per-device (or even per-partition, in case of mounted partitions)
> setting, you can use the BLKBSZGET and BLKSSZGET ioctls to query for
> soft/hard sector sizes.
>
Thanks,
Badari
* Re: A problem about DIRECT IO on ext3
2005-10-17 16:36 ` Badari Pulavarty
@ 2005-10-17 17:53 ` Jens Axboe
0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2005-10-17 17:53 UTC (permalink / raw)
To: Badari Pulavarty; +Cc: li nux, Grzegorz Kulewski, Erik Mouw, colin, lkml
On Mon, Oct 17 2005, Badari Pulavarty wrote:
> On Mon, 2005-10-17 at 11:51 +0200, Jens Axboe wrote:
> > On Mon, Oct 17 2005, li nux wrote:
> > >
> > >
> > > --- Jens Axboe <axboe@suse.de> wrote:
> > >
> > > > On Mon, Oct 17 2005, Grzegorz Kulewski wrote:
> > > > > On Mon, 17 Oct 2005, Jens Axboe wrote:
> > > > > >>how to correct this problem ?
> > > > > >
> > > > > >See your buffer address, it's not aligned. You
> > > > need to align that as
> > > > > >well. This is needed because the hardware will
> > > > dma directly to the user
> > > > > >buffer, and to be on the safe side we require the
> > > > same alignment as the
> > > > > >block layer will normally generate for file
> > > > system io.
> > > > > >
> > > > > >So in short, just align your read buffer to the
> > > > same as your block size
> > > > > >and you will be fine. Example:
> > > > > >
> > > > > >#define BS (4096)
> > > > > >#define MASK (BS - 1)
> > > > > >#define ALIGN(buf) (((unsigned long) (buf) +
> > > > MASK) & ~(MASK))
> > > > > >
> > > > > >char *ptr = malloc(BS + MASK);
> > > > > >char *buf = (char *) ALIGN(ptr);
> > > > > >
> > > > > >read(fd, buf, BS);
> > > > >
> > > > > Shouldn't one use posix_memalign(3) for that?
> > > >
> > > > Dunno if one 'should', one 'can' if one wants to. I
> > > > prefer to do it
> > > > manually so I don't have to jump through #define
> > > > hoops to get at it
> > > > (which, btw, still doesn't expose it on this
> > > > machine).
> > > >
> > > > --
> > > > Jens Axboe
> > >
> > > Thanx a lot Jens :-)
> > > Its working now.
> > > I did not have to make these adjustments on 2.6
> > > Is looks to be having more relaxation.
> >
> > 2.6 does have the option of checking the hardware dma requirement
> > seperately, but for this path you should run into the same restrictions.
> > Perhaps you just got lucky when testing 2.6?
>
> 2.6 also has the same restriction. But, if the "filesystem
> blocksize alignment" (soft block size) fails, we try to see
> if its aligned with hard sector size (512). If so, we can do the IO.
>
> 2.4 fails if the offset or buffer is NOT filesystem blocksize
> aligned. Period.
I'm aware of that, however this particular case was about the buffer
alignment (which was 32-bytes in the strace). And that should not work
for 2.6 either.
The block-size alignment is really a separate property of correctness.
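
The alignment of the address in the strace (0xbfffdc40) can be checked by isolating its lowest set bit. A small sketch, with a made-up helper name:

```c
#include <stdint.h>

/* Largest power of two that divides addr, i.e. its natural alignment.
 * addr must be non-zero. */
static uintptr_t alignment_of(uintptr_t addr)
{
    return addr & (~addr + 1);  /* isolate the lowest set bit */
}
```

For 0xbfffdc40 this gives 64, so the address is a multiple of 64 (and therefore of 32) but not of 512, which is why it fails even the hard sector size check.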
> BTW, posix_memalign() or valloc() should be safe.
Certainly.
--
Jens Axboe