* [rfc patch 2/2] direct-io: remove address alignment check
@ 2005-07-13 23:43 Daniel McNeil
2005-07-14 23:16 ` Badari Pulavarty
2005-07-15 0:28 ` Tejun Heo
0 siblings, 2 replies; 19+ messages in thread
From: Daniel McNeil @ 2005-07-13 23:43 UTC
To: linux-aio@kvack.org, Linux Kernel Mailing List
This patch relaxes the direct i/o alignment check so that user addresses
do not have to be a multiple of the device block size.
I've done some preliminary testing and it mostly works on an ext3
file system on an IDE disk. I have seen trouble when the user address
is on an odd byte boundary. Sometimes the data is read back incorrectly
on read and sometimes I get these kernel error messages:
hda: dma_timer_expiry: dma status == 0x60
hda: DMA timeout retry
hda: timeout waiting for DMA
hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: drive not ready for command
Doing direct-io with user addresses on even, non-512 boundaries appears
to be working correctly.
Any additional testing and/or comments welcome.
Signed-off-by: Daniel McNeil <daniel@osdl.org>
--- linux-2.6.12.orig/fs/direct-io.c 2005-06-28 16:39:39.000000000 -0700
+++ linux-2.6.12/fs/direct-io.c 2005-06-28 16:39:59.000000000 -0700
@@ -1147,7 +1147,9 @@ __blockdev_direct_IO(int rw, struct kioc
 		goto out;
 	}
 
-	/* Check the memory alignment. Blocks cannot straddle pages */
+	/*
+	 * Check the i/o. It must be a multiple of device block size.
+	 */
 	for (seg = 0; seg < nr_segs; seg++) {
 		addr = (unsigned long)iov[seg].iov_base;
 		size = iov[seg].iov_len;
@@ -1156,7 +1158,7 @@ __blockdev_direct_IO(int rw, struct kioc
 			if (bdev)
 				blkbits = bdev_blkbits;
 			blocksize_mask = (1 << blkbits) - 1;
-			if ((addr & blocksize_mask) || (size & blocksize_mask))
+			if (size & blocksize_mask)
 				goto out;
 		}
 	}
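
As a rough illustration of what the hunks above change, here is a minimal user-space sketch of the check before and after the patch. The function names dio_check_old() and dio_check_new() are made up for this example and do not appear in fs/direct-io.c; only the mask arithmetic mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Old rule: both the user address and the length must be block-aligned. */
bool dio_check_old(uintptr_t addr, size_t size, unsigned blkbits)
{
    assert(blkbits < 32);                 /* keep the shift well-defined */
    uintptr_t mask = ((uintptr_t)1 << blkbits) - 1;

    return !((addr & mask) || (size & mask));
}

/* Patched rule: only the length must be a multiple of the block size. */
bool dio_check_new(uintptr_t addr, size_t size, unsigned blkbits)
{
    assert(blkbits < 32);
    uintptr_t mask = ((uintptr_t)1 << blkbits) - 1;

    (void)addr;                           /* address is no longer checked */
    return !(size & mask);
}
```

With blkbits = 9 (512-byte blocks), an address of 0x1002 with a 512-byte length fails the old check and passes the new one, while a 100-byte length fails both.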

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Badari Pulavarty @ 2005-07-14 23:16 UTC
To: Daniel McNeil; +Cc: linux-aio@kvack.org, Linux Kernel Mailing List

How does your patch ensure that we meet the driver alignment
restrictions? Like you said, you need at least "even" byte alignment
for IDE etc.

And also, are there any restrictions on how much the "minimum" IO
size has to be? I mean, can I read "1" byte? I guess you are
not relaxing it (yet).

Thanks,
Badari

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Daniel McNeil @ 2005-07-14 23:44 UTC
To: Badari Pulavarty; +Cc: linux-aio@kvack.org, Linux Kernel Mailing List

On Thu, 2005-07-14 at 16:16, Badari Pulavarty wrote:
> How does your patch ensure that we meet the driver alignment
> restrictions? Like you said, you need at least "even" byte alignment
> for IDE etc.
>
> And also, are there any restrictions on how much the "minimum" IO
> size has to be? I mean, can I read "1" byte? I guess you are
> not relaxing it (yet).

This patch does not change the i/o size requirements -- they
must be a multiple of device block size (usually 512).

It only relaxes the address alignment restriction. I do not
know what the driver alignment restrictions are. Without the
1st patch, it was impossible to relax the address space
check and have direct-io generate the correct i/o's to submit.

This 2nd patch is just for testing and generating feedback
to find out what the address alignment issues are. Then
we can decide how to proceed.

Did you look over the 1st patch? Comments?

Thanks,
Daniel

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Badari Pulavarty @ 2005-07-15 5:27 UTC
To: Daniel McNeil; +Cc: linux-aio@kvack.org, Linux Kernel Mailing List

Daniel McNeil wrote:
> It only relaxes the address alignment restriction. I do not
> know what the driver alignment restrictions are. Without the
> 1st patch, it was impossible to relax the address space
> check and have direct-io generate the correct i/o's to submit.
>
> This 2nd patch, is just for testing and generating feedback
> to find out what the address alignment issues are. Then
> we can decide how to proceed.
>
> Did you look over the 1st patch? Comments?

Yes. I did look at the first patch, and my questions were basically
about the first patch. I don't see any enforcement of alignment
with your patch at all. So, we let the driver fail if it can't
handle it?

BTW, I don't think the first patch is really doing the right thing.
You got a little carried away while cleaning up.
You are trying to relax "user buffer" alignment only. If your
"offset" is in the middle of a filesystem block (say 4k), you still
need to zero out the first portion to be able to write into the
middle. That "evil" code is still needed. :(

Thanks,
Badari

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Daniel McNeil @ 2005-07-15 20:06 UTC
To: Badari Pulavarty; +Cc: linux-aio@kvack.org, Linux Kernel Mailing List

On Thu, 2005-07-14 at 22:27, Badari Pulavarty wrote:
> Yes. I did look at the first patch and my questions were basically
> towards the first patch. I don't see any enforcement of alignment
> with your patch at all. So, we let the driver fail if it can't
> handle it ?

The 1st patch rewrites direct-io to handle non-512 aligned addresses.
Without the 2nd patch, it will never see a non-512 aligned user address
and should work the same as before, only with slightly smaller code :).
The drivers will get the same 512-byte aligned addresses. Am I missing
something?

> BTW, I don't think the first patch is really doing the right thing.
> You got little carried away while cleaning up.
> You are trying to relax "user buffer" alignment only. If your
> "offset" is in the middle of a filesystem block (say 4k), you still
> need to zero out the first portion to be able to write into the
> middle. That "evil" code is still needed. :(

The code still does zero out the 1st portion. dio_zero_block() is
still being called twice. Sure looks like it is working to me:

Test program d.c:
------------------------
#define _GNU_SOURCE 1
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <string.h>

int main(void)
{
	int fd;
	char *buf;
	int io_size = 512;
	off_t skip = 512;
	int i;

	/* page-aligned buffer, so the existing alignment check passes */
	if (posix_memalign((void **)&buf, getpagesize(), io_size) != 0) {
		perror("cannot alloc mem");
		exit(1);
	}
	memset(buf, 'a', io_size);
	fd = open("direct_test_file", O_RDWR|O_DIRECT|O_TRUNC|O_CREAT, 0666);
	/* first write lands in the middle of the first filesystem block */
	lseek(fd, skip, SEEK_SET);
	if ((i = write(fd, buf, io_size)) != io_size) {
		perror("bad write");
		exit(2);
	}
	printf("write to direct_test_file %d bytes of 'a' at %ld\n", i, (long)skip);
	memset(buf, 'b', io_size);
	/* second write starts at the next page boundary */
	lseek(fd, getpagesize(), SEEK_SET);
	if ((i = write(fd, buf, io_size)) != io_size) {
		perror("bad write");
		exit(2);
	}
	printf("write to direct_test_file %d bytes of 'b' at %d\n", i, getpagesize());
	return 0;
}
--------------------------
$ ./d
write to direct_test_file 512 bytes of 'a' at 512
write to direct_test_file 512 bytes of 'b' at 4096
$ hexdump direct_test_file
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000200 6161 6161 6161 6161 6161 6161 6161 6161
*
0000400 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000 6262 6262 6262 6262 6262 6262 6262 6262
*
0001200

The 1st 512 bytes are zeroed as well as the bytes between 1k and 4k.

Thanks,
Daniel

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Tejun Heo @ 2005-07-15 0:28 UTC
To: Daniel McNeil; +Cc: linux-aio@kvack.org, Linux Kernel Mailing List

Daniel McNeil wrote:
> This patch relaxes the direct i/o alignment check so that user addresses
> do not have to be a multiple of the device block size.

Hi, Daniel.

I don't think the change is a good idea. We may be able to relax
alignment constraints on some hardware to certain levels, but IMHO it
will be very difficult to verify. All internal block IO code follows
strict block boundary alignment. And as raw IOs (especially unaligned
ones) aren't very common operations, they won't get tested much. Then
when some rare (probably not an open source one) application uses it on
some rare buggy hardware, it may cause *very* strange things.

Also, I don't think it will improve application programmers'
convenience. As each piece of hardware has its own DMA alignment, we need
to implement a way to export the alignment to user space and enforce it.
So, in the end, user applications must do aligned allocation
accordingly. Just following the block boundary will be easier.

I don't know why you wanna relax the alignment requirement, but
wouldn't it be easier to just write/use a block-aligned allocator for such
buffers? It will even make the program more portable.

--
tejun
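
The block-aligned allocator suggested here could look roughly like the following in user space. This is only a sketch: alloc_block_aligned() is a hypothetical helper name, and the caller is assumed to already know the device block size (512 in the usage note below is an assumption, not something the helper discovers).

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a buffer of len bytes whose address is a
 * multiple of blksz, or NULL on failure. Release it with free().
 */
void *alloc_block_aligned(size_t blksz, size_t len)
{
    void *buf = NULL;

    /* posix_memalign() wants a power of two that is a multiple of sizeof(void *) */
    assert(blksz >= sizeof(void *) && (blksz & (blksz - 1)) == 0);

    if (posix_memalign(&buf, blksz, len) != 0)
        return NULL;
    return buf;
}
```

A caller doing O_DIRECT i/o on a 512-byte-block device would call alloc_block_aligned(512, n) with n a multiple of 512, and the resulting address would pass the existing check unchanged.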

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Badari Pulavarty @ 2005-07-15 5:18 UTC
To: Tejun Heo; +Cc: Daniel McNeil, linux-aio@kvack.org, Linux Kernel Mailing List

Tejun Heo wrote:
> I don't know why you wanna relax the alignment requirement, but
> wouldn't it be easier to just write/use block-aligned allocator for such
> buffers? It will even make the program more portable.

I can imagine a reason for relaxing the alignment. I keep getting asked
whether we can do an "O_DIRECT mount option". Database folks want to
make sure all accesses to files in a given filesystem are O_DIRECT
(whether they are accessing them or some random program like ftp, scp, cp
is accessing them). This is mainly to ensure that buffered accesses to
the files don't pollute the pagecache (while the database is using O_DIRECT
access). Seems like a logical request, but not easy to do :(

Thanks,
Badari

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Tejun Heo @ 2005-07-15 8:23 UTC
To: Badari Pulavarty
Cc: Daniel McNeil, linux-aio@kvack.org, Linux Kernel Mailing List

Badari Pulavarty wrote:
> I can imagine a reason for relaxing the alignment. I keep getting asked
> whether we can do "O_DIRECT mount option". Database folks wants to
> make sure all the access to files in a given filesystem are O_DIRECT
> (whether they are accessing or some random program like ftp, scp, cp
> are acessing them). This was mainly to ensure that buffered accesses to
> the file doesn't polute the pagecache (while database is using O_DIRECT
> access). Seems like a logical request, but not easy to do :(

I don't know much about VM, but, if that's necessary, I think that
limiting pagecache size per mounted fs (or by some other applicable
category) is an easier/more complete approach. After all, you cannot mmap
w/ O_DIRECT and many programs (gcc, ld come to mind) mmap a large part of
their memory usage.

Thanks.

--
tejun

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Badari Pulavarty @ 2005-07-15 17:54 UTC
To: Tejun Heo; +Cc: Daniel McNeil, linux-aio@kvack.org, Linux Kernel Mailing List

On Fri, 2005-07-15 at 17:23 +0900, Tejun Heo wrote:
> I don't know much about VM, but, if that's necessary, I think that
> limiting pagecache size per mounted fs (or by some other applicable
> category) is easier/more complete approach. After all, you cannot mmap
> w/ O_DIRECT and many programs (gcc, ld come to mind) mmap large part of
> their memory usage.

I agree. I guess for mmap()ed access we can kick it back to buffered
mode.

I don't think limiting pagecache use per filesystem is an acceptable
option. In fact, database folks want exactly this - to limit the
pagecache use by filesystems - but I don't think it's the right thing to do,
so I am trying to propose mount O_DIRECT as an alternative (if it's
feasible).

Thanks,
Badari

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Tejun Heo @ 2005-07-16 3:50 UTC
To: Badari Pulavarty
Cc: Daniel McNeil, linux-aio@kvack.org, Linux Kernel Mailing List

Badari Pulavarty wrote:
> I agree. I guess for mmap()ed access we can kick it back to buffered
> mode.
>
> I don't think limiting pagecache use per filesystem is an acceptable
> option. In fact, database folks exactly want this - to limit the
> pagecache use by filesystems - but I don't think its right thing to do,
> so I am trying to propose mount O_DIRECT as an alternative (if its
> feasible).

Just out of curiosity, can you tell me why you think limiting
pagecache isn't the right thing to do (tm)? O_DIRECT mount seems to me
an incomplete/complex solution (DMA alignment etc...). Forgive me if this
issue has been discussed to death already.

--
tejun

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Joel Becker @ 2005-07-15 16:56 UTC
To: Badari Pulavarty
Cc: Tejun Heo, Daniel McNeil, linux-aio@kvack.org, Linux Kernel Mailing List

On Thu, Jul 14, 2005 at 10:18:28PM -0700, Badari Pulavarty wrote:
> I can imagine a reason for relaxing the alignment. I keep getting asked
> whether we can do "O_DIRECT mount option". Database folks wants to
> make sure all the access to files in a given filesystem are O_DIRECT

All currently existing "O_DIRECT mount option" implementations
that I know of do:

	if (not-512-aligned)
		bounce_buffer()

That is, no one attempts to support the wacky variations in DMA engines.

Joel

--
Brain: I shall pollute the water supply with this DNAdefibuliser,
       turning everyone into mindless slaves.
Pinky: What about the people who drink bottled water?
Brain: Pinky, people who pay 5 dollars for a bottle of water are
       already mindless slaves.
http://www.jlbec.org/
jlbec@evilplan.org

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Badari Pulavarty @ 2005-07-15 17:50 UTC
To: Joel Becker
Cc: Tejun Heo, Daniel McNeil, linux-aio@kvack.org, Linux Kernel Mailing List

On Fri, 2005-07-15 at 17:56 +0100, Joel Becker wrote:
> All currently existing "O_DIRECT mount option" implementations
> that I know of do:
>
> 	if (not-512-aligned)
> 		bounce_buffer()
>
> That is, no one attempts to support the wacky variations in DMA engines.

I believe some OSs do buffered IO, if there is a problem with alignment.

Thanks,
Badari

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Joel Becker @ 2005-07-15 19:16 UTC
To: Badari Pulavarty
Cc: Tejun Heo, Daniel McNeil, linux-aio@kvack.org, Linux Kernel Mailing List

On Fri, Jul 15, 2005 at 10:50:46AM -0700, Badari Pulavarty wrote:
> I believe some OSs do buffered IO, if there is a problem with alignment.

That's what I said. They all do buffered I/O if the alignment is
not 512B. They do _not_ try to accept alignments that are smaller.
There's no good reason to. It just adds needless complexity.

Joel

--
"I think it would be a good idea."
- Mahatma Gandhi, when asked what he thought of Western civilization
http://www.jlbec.org/
jlbec@evilplan.org
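
The fallback Joel describes lives inside other operating systems' kernels and, per his follow-up, amounts to taking the buffered path. A loose user-space analogue, sketched under the assumption of a 512-byte alignment requirement, copies a misaligned buffer into an aligned one before issuing the write. write_bounced() and DIO_ALIGN are names invented for this example.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIO_ALIGN 512   /* assumed alignment requirement, not queried from the device */

/*
 * Hypothetical wrapper: write len bytes from src to fd. If src is not
 * DIO_ALIGN-aligned, bounce the data through an aligned temporary copy.
 * Returns what write() returns, or -1 if the bounce buffer cannot be
 * allocated.
 */
ssize_t write_bounced(int fd, const void *src, size_t len)
{
    assert(len % DIO_ALIGN == 0);   /* the length rule is not relaxed */

    if (((uintptr_t)src & (DIO_ALIGN - 1)) == 0)
        return write(fd, src, len); /* already aligned: no copy needed */

    void *bounce = NULL;
    if (posix_memalign(&bounce, DIO_ALIGN, len) != 0)
        return -1;

    memcpy(bounce, src, len);
    ssize_t n = write(fd, bounce, len);
    free(bounce);
    return n;
}
```

The extra memcpy() is the price of the fallback: the caller's data still reaches the file, but a misaligned buffer costs one additional copy per write.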
[parent not found: <1121298112.6025.21.camel@ibm-c.pdx.osdl.net.suse.lists.linux.kernel>]

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Andi Kleen @ 2005-07-14 13:18 UTC
To: Daniel McNeil; +Cc: linux-kernel

Daniel McNeil <daniel@osdl.org> writes:

> This patch relaxes the direct i/o alignment check so that user addresses
> do not have to be a multiple of the device block size.

The original reason for this limit was that lots of drivers
(not only IDE) explode when you give them odd sizes. Sometimes
it is even worse.

I doubt all of them have been fixed.

Very risky change.

-Andi

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Daniel McNeil @ 2005-07-14 16:02 UTC
To: Andi Kleen; +Cc: Linux Kernel Mailing List

On Thu, 2005-07-14 at 06:18, Andi Kleen wrote:
> The original reason for this limit was that lots of drivers
> (not only IDE) explode when you give them odd sizes. Sometimes
> it is even worse.
>
> I doubt all of them have been fixed.
>
> Very risky change.

That is exactly why I made this a separate patch, so that we
can test and find out where the problems are and work to fix
them.

Are there problems only with odd sizes, or do drivers have problems
with non-512 sizes?

Allowing 4-byte aligned user addresses would be a good step forward,
since it looks like malloc() returns 4-byte aligned addresses.

Daniel

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Andi Kleen @ 2005-07-14 18:23 UTC
To: Daniel McNeil; +Cc: Andi Kleen, Linux Kernel Mailing List

> That is exactly why I made this a separate patch, so that we
> can test and find out where the problems are and work to fix
> them.

That's pretty hard because there are a lot of block drivers.

And it might not be very nice for people's data.

> Are there problems only with odd sizes, or do drivers have problems
> with non-512 sizes?

I believe they have problems with non-512 sizes (and probably alignments)
too.

-Andi

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Daniel McNeil @ 2005-07-14 20:40 UTC
To: Andi Kleen; +Cc: Linux Kernel Mailing List

On Thu, 2005-07-14 at 11:23, Andi Kleen wrote:
> That's pretty hard because there are a lot of block drivers.
>
> And might not very nice for people's data.
>
> I believe they have problems with non 512 sizes (and probably alignments)
> too.

The check still only allows i/o that is a multiple of the device block
size. That will always be a requirement.

I was trying to ask:

	Do drivers have problems with odd addresses or with
	non-512 addresses?

In my limited testing, I saw problems with odd user space addresses
on IDE (using DMA). When testing 2-byte aligned addresses, I did not
see any problems, and so far, the data looks correct. I am continuing to
test, and this patch allows others to try it out as well. For the most
part, it should be safe because nobody has application code that uses
O_DIRECT with non-aligned addresses. Obviously, it will only be ready
for mainline if/when we fix all the drivers.

Thanks,
Daniel

* Re: [rfc patch 2/2] direct-io: remove address alignment check
From: Andrew Morton @ 2005-07-14 23:39 UTC
To: Daniel McNeil; +Cc: ak, linux-kernel

Daniel McNeil <daniel@osdl.org> wrote:
>
> Do drivers have problems with odd addresses or with
> non-512 addresses?

I do recall hearing rumours that some bus-masters have fairly strict memory
alignment requirements. A cacheline size, perhaps - that would be 32 bytes
given the age of the hardware.

But yeah, it's v. risky to assume that all bus masters can cope with
memory alignments down to two bytes.

It would be sane to put the minimum alignment into ->backing_dev_info,
default to 512, get the device drivers to override that as they are tested.

But this introduces a very very bad problem: people will write applications
which work on their hardware, ship the things and then find that the apps
break on other people's hardware. So we can't do that.

Instead, we need to work out the minimum alignment requirement for all disk
controllers and DMA controllers and motherboards in the world. And that
includes catering for weird ones which appear to work but which
occasionally fail in mysterious ways with finer alignments. That's hard.
It's easier to continue to make application developers jump through hoops.
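
The ->backing_dev_info idea floated above could be pictured as follows. This is a hypothetical user-space model, not kernel code: struct dev_info_sketch, its dio_min_align field, and both helper functions are invented for illustration, and nothing here is taken from the kernel's struct backing_dev_info.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DIO_DEFAULT_ALIGN 512u   /* conservative default for untested drivers */

/* Hypothetical stand-in for a per-device info structure. */
struct dev_info_sketch {
    unsigned int dio_min_align;  /* 0 means the driver did not override it */
};

/* Effective minimum user-address alignment for direct i/o on this device. */
unsigned int dio_effective_align(const struct dev_info_sketch *di)
{
    unsigned int a = di->dio_min_align ? di->dio_min_align : DIO_DEFAULT_ALIGN;

    assert((a & (a - 1)) == 0);  /* alignments must be powers of two */
    return a;
}

/* Would this user address be accepted for direct i/o on this device? */
bool dio_addr_ok(const struct dev_info_sketch *di, uintptr_t addr)
{
    return (addr & (dio_effective_align(di) - 1)) == 0;
}
```

In this model a driver that has been tested down to 4-byte alignment would set dio_min_align = 4, while every other device keeps rejecting anything that is not 512-byte aligned. It also makes the portability problem visible: the same address is accepted on one device and rejected on another.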
* Re: [rfc patch 2/2] direct-io: remove address alignment check
2005-07-14 23:39 ` Andrew Morton
@ 2005-07-15 0:03 ` Daniel McNeil
0 siblings, 0 replies; 19+ messages in thread
From: Daniel McNeil @ 2005-07-15 0:03 UTC (permalink / raw)
To: Andrew Morton; +Cc: ak, Linux Kernel Mailing List
On Thu, 2005-07-14 at 16:39, Andrew Morton wrote:
> Daniel McNeil <daniel@osdl.org> wrote:
> >
> > Do drivers have problems with odd addresses or with
> > non-512 addresses?
>
> I do recall hearing rumours that some bus-masters have fairly strict memory
> alignment requirements. A cacheline size, perhaps - that would be 32 bytes
> given the age of the hardware.
>
> But yeah, it's v. risky to assume that all bus masters can cope with
> memory alignments down to two bytes.
>
> It would be sane to put the minimum alignment into ->backing_dev_info,
> default to 512, get the device drivers to override that as they are tested.
>
> But this introduces a very very bad problem: people will write applications
> which work on their hardware, ship the things and then find that the apps
> break on other people's hardware. So we can't do that.
>
> Instead, we need to work out the minimum alignment requirement for all disk
> controllers and DMA controllers and motherboards in the world. And that
> includes catering for weird ones which appear to work but which
> occasionally fail in mysterious ways with finer alignments. That's hard.
> It's easier to continue to make application developers jump through hoops.
I was hoping this patch would help turn rumors into real data :)
If we did put min alignment into backing_dev_info, we could
implement the equivalent of bounce buffers for direct-io -- or
just fall back to buffered i/o like it does sometimes anyway.
That way applications would not break, just get worse performance
on some hardware.
Right now I just wanted to get the issues on the table, get some test
results, and see how to proceed from there.
Since this patch only affects direct i/o, getting test results
shouldn't cause too many problems.
Thanks,
Daniel
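The bounce-buffer fallback mentioned above can be illustrated with a small user-space analogue. This is a sketch under stated assumptions, not kernel code: the helper name `bounce_for_write` and its parameters are hypothetical, and `min_align` is assumed to be a power of two that is also a multiple of `sizeof(void *)`, as `posix_memalign` requires.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a buffer that meets min_align for a write
 * of len bytes. If user_buf is already aligned it is returned unchanged
 * and *bounced is set to 0. Otherwise the data is copied into a freshly
 * allocated aligned buffer, *bounced is set to 1, and the caller must
 * free() the result. Returns NULL if the allocation fails. */
void *bounce_for_write(const void *user_buf, size_t len,
                       size_t min_align, int *bounced)
{
    void *bounce = NULL;

    if (((uintptr_t)user_buf & (min_align - 1)) == 0) {
        *bounced = 0;
        return (void *)user_buf;
    }

    if (posix_memalign(&bounce, min_align, len) != 0)
        return NULL;

    memcpy(bounce, user_buf, len);
    *bounced = 1;
    return bounce;
}
```

The extra copy is the performance cost referred to above: a misaligned caller still gets correct i/o, but pays for an allocation and a `memcpy` on devices that cannot take the address directly.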
Thread overview: 19+ messages
2005-07-13 23:43 [rfc patch 2/2] direct-io: remove address alignment check Daniel McNeil
2005-07-14 23:16 ` Badari Pulavarty
2005-07-14 23:44 ` Daniel McNeil
2005-07-15 5:27 ` Badari Pulavarty
2005-07-15 20:06 ` Daniel McNeil
2005-07-15 0:28 ` Tejun Heo
2005-07-15 5:18 ` Badari Pulavarty
2005-07-15 8:23 ` Tejun Heo
2005-07-15 17:54 ` Badari Pulavarty
2005-07-16 3:50 ` Tejun Heo
2005-07-15 16:56 ` Joel Becker
2005-07-15 17:50 ` Badari Pulavarty
2005-07-15 19:16 ` Joel Becker
[not found] <1121298112.6025.21.camel@ibm-c.pdx.osdl.net.suse.lists.linux.kernel>
2005-07-14 13:18 ` Andi Kleen
2005-07-14 16:02 ` Daniel McNeil
2005-07-14 18:23 ` Andi Kleen
2005-07-14 20:40 ` Daniel McNeil
2005-07-14 23:39 ` Andrew Morton
2005-07-15 0:03 ` Daniel McNeil