* [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. [not found] <F9014F50-D1F6-4B64-8535-7452CC64B18A@qualexsystems.com> @ 2012-05-20 8:01 ` manish honap 2012-05-20 18:33 ` Linus Torvalds 0 siblings, 1 reply; 11+ messages in thread From: manish honap @ 2012-05-20 8:01 UTC (permalink / raw) To: tytso@mit.edu, adilger.kernel@dilger.ca Cc: torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org From: Manish Honap <manish_honap_vit@yahoo.co.in> The direct-io.c::do_direct_io() returns int and this causes the results to overflow for sizes>=2g; the following patch removes this bug. Signed-off-by: Manish Honap <manish_honap_vit@yahoo.co.in> --- Kernel version - linux-3.3.4 linux-3.3.4/fs/direct-io.c | 6 +++--- linux-3.3.4/fs/ext4/file.c | 2 +- linux-3.3.4/fs/ext4/inode.c | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff -uprN -X linux-3.3.4-vanilla/Documentation/dontdiff linux-3.3.4-vanilla/fs/direct-io.c ../linux-3.3.4/fs/direct-io.c --- linux-3.3.4-vanilla/fs/direct-io.c 2012-04-27 22:47:35.000000000 +0530 +++ ../linux-3.3.4/fs/direct-io.c 2012-05-20 10:34:47.885599682 +0530 @@ -892,14 +892,14 @@ static inline void dio_zero_block(struct * it should set b_size to PAGE_SIZE or more inside get_block(). This gives * fine alignment but still allows this function to work in PAGE_SIZE units. */ -static int do_direct_IO(struct dio *dio, struct dio_submit *sdio, - struct buffer_head *map_bh) +static ssize_t do_direct_IO(struct dio *dio, struct dio_submit *sdio, + struct buffer_head *map_bh) { const unsigned blkbits = sdio->blkbits; const unsigned blocks_per_page = PAGE_SIZE >> blkbits; struct page *page; unsigned block_in_page; - int ret = 0; + ssize_t ret = 0; /* The I/O can start at any block offset within the first page */ block_in_page = sdio->first_block_in_page; diff -uprN -X linux-3.3.4-vanilla/Documentation/dontdiff linux-3.3.4-vanilla/fs/ext4/file.c ../linux-3.3.4/fs/ext4/file.c --- linux-3.3.4-vanilla/fs/ext4/file.c 2012-04-27 22:47:35.000000000 +0530 +++ ../linux-3.3.4/fs/ext4/file.c 2012-05-20 10:29:02.714603023 +0530 @@ -95,7 +95,7 @@ ext4_file_write(struct kiocb *iocb, cons { struct inode *inode = iocb->ki_filp->f_path.dentry->d_inode; int unaligned_aio = 0; - int ret; + ssize_t ret; /* * If we have encountered a bitmap-format file, the size limit diff -uprN -X linux-3.3.4-vanilla/Documentation/dontdiff linux-3.3.4-vanilla/fs/ext4/inode.c ../linux-3.3.4/fs/ext4/inode.c --- linux-3.3.4-vanilla/fs/ext4/inode.c 2012-04-27 22:47:35.000000000 +0530 +++ ../linux-3.3.4/fs/ext4/inode.c 2012-05-20 10:30:15.444622201 +0530 @@ -2750,7 +2750,7 @@ static int ext4_get_block_write(struct i } static void ext4_end_io_dio(struct kiocb *iocb, loff_t offset, - ssize_t size, void *private, int ret, + ssize_t size, void *private, ssize_t ret, bool is_async) { struct inode *inode = iocb->ki_filp->f_path.dentry->d_inode; Thanks & Regards, - Manish -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. 2012-05-20 8:01 ` [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code manish honap @ 2012-05-20 18:33 ` Linus Torvalds 2012-05-21 3:28 ` manish honap 0 siblings, 1 reply; 11+ messages in thread From: Linus Torvalds @ 2012-05-20 18:33 UTC (permalink / raw) To: manish honap Cc: tytso@mit.edu, adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org On Sun, May 20, 2012 at 1:01 AM, manish honap <manish_honap_vit@yahoo.co.in> wrote: > > From: Manish Honap <manish_honap_vit@yahoo.co.in> > > The direct-io.c::do_direct_io() returns int and this causes the results to > overflow for sizes>=2g; the following patch removes this bug. > > Signed-off-by: Manish Honap <manish_honap_vit@yahoo.co.in> It should not be possible to do a write bigger than 2GB - the generic VFS layer should stop it. Exactly because of overflow avoidance issues (and because a single 2GB+ write would be insane and has serious DoS issues anyway). Can you actually *trigger* this issue some way? If so, we should fix that instead. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. 2012-05-20 18:33 ` Linus Torvalds @ 2012-05-21 3:28 ` manish honap 2012-05-21 4:50 ` Linus Torvalds 0 siblings, 1 reply; 11+ messages in thread From: manish honap @ 2012-05-21 3:28 UTC (permalink / raw) To: Linus Torvalds Cc: tytso@mit.edu, adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org Hello Linus, The overflow issue was seen during async dio path Please consider following code, <code> #define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <libaio.h> #include <err.h> #include <string.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #define NR_BYTES 2*1024*1024*1024L int main() { int fd; int ret; struct iocb iocb; struct iocb *piocb = &iocb; io_context_t io_ctx = NULL; void *ptr; struct io_event evout; struct timespec ioq_timeout; ioq_timeout.tv_sec = 0; ioq_timeout.tv_nsec = 1000000000; ret = posix_memalign(&ptr, 1<<12, NR_BYTES); if (ret != 0) { errx(1, "Allocating Memory."); } memset(ptr, 'a', NR_BYTES); fd = open("temp.txt", O_CREAT|O_RDWR|O_DIRECT, 0744); if (fd < 0) { fprintf(stderr, "Error opening file\n"); goto out; } ret = io_setup(10, &io_ctx); if (ret != 0) { fprintf(stderr, "During io_setup\n"); goto out; } piocb->aio_fildes = fd; piocb->u.c.buf = ptr; piocb->u.c.nbytes = NR_BYTES; piocb->u.c.offset = 0; piocb->aio_lio_opcode = IO_CMD_PWRITE; ret = io_submit(io_ctx, 1, &piocb); if (ret != 1) { fprintf(stderr, "During io_submit\n"); goto out; } ret = io_getevents(io_ctx, 1, 1, &evout, &ioq_timeout); if (ret == 1) { printf("%ld (0x%lx) bytes transferred\n", evout.res); } io_destroy(io_ctx); out: free(ptr); return 0; } </code> -2147483648 (0xffffffff80000000) bytes transferred Well, in this case we can see the overflow. Thanks and Regards - Manish ----- Original Message ----- From: Linus Torvalds <torvalds@linux-foundation.org> To: manish honap <manish_honap_vit@yahoo.co.in> Cc: "tytso@mit.edu" <tytso@mit.edu>; "adilger.kernel@dilger.ca" <adilger.kernel@dilger.ca>; "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org> Sent: Monday, 21 May 2012 12:03 AM Subject: Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. On Sun, May 20, 2012 at 1:01 AM, manish honap <manish_honap_vit@yahoo.co.in> wrote: > > From: Manish Honap <manish_honap_vit@yahoo.co.in> > > The direct-io.c::do_direct_io() returns int and this causes the results to > overflow for sizes>=2g; the following patch removes this bug. > > Signed-off-by: Manish Honap <manish_honap_vit@yahoo.co.in> It should not be possible to do a write bigger than 2GB - the generic VFS layer should stop it. Exactly because of overflow avoidance issues (and because a single 2GB+ write would be insane and has serious DoS issues anyway). Can you actually *trigger* this issue some way? If so, we should fix that instead. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. 2012-05-21 3:28 ` manish honap @ 2012-05-21 4:50 ` Linus Torvalds 2012-05-21 22:22 ` Linus Torvalds ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Linus Torvalds @ 2012-05-21 4:50 UTC (permalink / raw) To: manish honap Cc: tytso@mit.edu, adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 806 bytes --] On Sun, May 20, 2012 at 8:28 PM, manish honap <manish_honap_vit@yahoo.co.in> wrote: > Hello Linus, > > The overflow issue was seen during async dio path Christ. fs/aio.c doesn't do the proper rw_verify_area(). As a result, it doesn't check file locks, and it doesn't seem to check offset overflows either. The vector versions kind of get the size limit by mistake (because they at least use rw_copy_check_uvector(), which does limit things to MAX_RW_COUNT), but they don't do the offset overflow check either. Does this patch work for you? What it *should* do is the same that the other read/write paths do (and the vector path for aio already do), namely truncate reads or writes to MAX_RW_COUNT (which is INT_MAX aligned down to a page). This patch is entirely untested, Linus [-- Attachment #2: patch.diff --] [-- Type: application/octet-stream, Size: 2545 bytes --] fs/aio.c | 30 ++++++++++++++---------------- 1 file changed, 14 insertions(+), 16 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 67a6db3e1b6f..e7f2fad7b4ce 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1456,6 +1456,10 @@ static ssize_t aio_setup_vectored_rw(int type, struct kiocb *kiocb, bool compat) if (ret < 0) goto out; + ret = rw_verify_area(type, kiocb->ki_filp, &kiocb->ki_pos, ret); + if (ret < 0) + goto out; + kiocb->ki_nr_segs = kiocb->ki_nbytes; kiocb->ki_cur_seg = 0; /* ki_nbytes/left now reflect bytes instead of segs */ @@ -1467,11 +1471,17 @@ out: return ret; } -static ssize_t aio_setup_single_vector(struct kiocb *kiocb) +static ssize_t aio_setup_single_vector(int type, struct file * file, struct kiocb *kiocb) { + int bytes; + + bytes = rw_verify_area(type, file, &kiocb->ki_pos, kiocb->ki_left); + if (bytes < 0) + return bytes; + kiocb->ki_iovec = &kiocb->ki_inline_vec; kiocb->ki_iovec->iov_base = kiocb->ki_buf; - kiocb->ki_iovec->iov_len = kiocb->ki_left; + kiocb->ki_iovec->iov_len = bytes; kiocb->ki_nr_segs = 1; kiocb->ki_cur_seg = 0; return 0; @@ -1496,10 +1506,7 @@ static ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat) if (unlikely(!access_ok(VERIFY_WRITE, kiocb->ki_buf, kiocb->ki_left))) break; - ret = security_file_permission(file, MAY_READ); - if (unlikely(ret)) - break; - ret = aio_setup_single_vector(kiocb); + ret = aio_setup_single_vector(READ, file, kiocb); if (ret) break; ret = -EINVAL; @@ -1514,10 +1521,7 @@ static ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat) if (unlikely(!access_ok(VERIFY_READ, kiocb->ki_buf, kiocb->ki_left))) break; - ret = security_file_permission(file, MAY_WRITE); - if (unlikely(ret)) - break; - ret = aio_setup_single_vector(kiocb); + ret = aio_setup_single_vector(WRITE, file, kiocb); if (ret) break; ret = -EINVAL; @@ -1528,9 +1532,6 @@ static ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat) ret = -EBADF; if (unlikely(!(file->f_mode & FMODE_READ))) break; - ret = security_file_permission(file, MAY_READ); - if (unlikely(ret)) - break; ret = aio_setup_vectored_rw(READ, kiocb, compat); if (ret) break; @@ -1542,9 +1543,6 @@ static ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat) ret = -EBADF; if (unlikely(!(file->f_mode & FMODE_WRITE))) break; - ret = security_file_permission(file, MAY_WRITE); - if (unlikely(ret)) - break; ret = aio_setup_vectored_rw(WRITE, kiocb, compat); if (ret) break; ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. 2012-05-21 4:50 ` Linus Torvalds @ 2012-05-21 22:22 ` Linus Torvalds 2012-05-21 23:31 ` Ted Ts'o 2012-05-22 19:26 ` [PATCH 1/1] xfstests 286: test for 2G overflows in AIO Eric Sandeen 2012-05-22 20:41 ` [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code Jeff Moyer 2 siblings, 1 reply; 11+ messages in thread From: Linus Torvalds @ 2012-05-21 22:22 UTC (permalink / raw) To: manish honap Cc: tytso@mit.edu, adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org On Sun, May 20, 2012 at 9:50 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Does this patch work for you? What it *should* do is the same that the > other read/write paths do (and the vector path for aio already do), > namely truncate reads or writes to MAX_RW_COUNT (which is INT_MAX > aligned down to a page). > > This patch is entirely untested, Ok, so with that patch, your program gives the expected 2147479552 (0x7ffff000) bytes transferred which is indeed MAX_INT aligned down to a page boundary (ie MAX_RW_COUNT). So I'm pretty sure this patch is what we want, and rw_verify_area() really is required to protect low-level filesystems from these kinds of issues. Not just ext4. At the same time, I would *really* want somebody who actually uses anything AIO to test it out too. Because I want to not only commit it, but also mark it for stable - and it would be nice to have some more testing than me saying "ok, it passes the one test-case sent to me" and "hey, the code looks sane". Anybody? Linus ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. 2012-05-21 22:22 ` Linus Torvalds @ 2012-05-21 23:31 ` Ted Ts'o 2012-05-22 16:11 ` Eric Sandeen 2012-05-22 16:13 ` manish honap 0 siblings, 2 replies; 11+ messages in thread From: Ted Ts'o @ 2012-05-21 23:31 UTC (permalink / raw) To: Linus Torvalds Cc: manish honap, adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org, linux-ext4 On Mon, May 21, 2012 at 03:22:28PM -0700, Linus Torvalds wrote: > > So I'm pretty sure this patch is what we want, and rw_verify_area() > really is required to protect low-level filesystems from these kinds > of issues. Not just ext4. > > At the same time, I would *really* want somebody who actually uses > anything AIO to test it out too. Because I want to not only commit it, > but also mark it for stable - and it would be nice to have some more > testing than me saying "ok, it passes the one test-case sent to me" > and "hey, the code looks sane". Hi Linus, We do use AIO at Google, and I primarily use fio (usually as part of xfstests) to test AIO functionality. If you like I can carry the patch in the ext4 tree and test it as part of the ext4 commits for the merge window. I probably won't be able to get to it until a bit later in the week, though. Things have been really crazy for me this past two months... - Ted ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. 2012-05-21 23:31 ` Ted Ts'o @ 2012-05-22 16:11 ` Eric Sandeen 2012-05-22 19:02 ` Eric Sandeen 2012-05-22 16:13 ` manish honap 1 sibling, 1 reply; 11+ messages in thread From: Eric Sandeen @ 2012-05-22 16:11 UTC (permalink / raw) To: Ted Ts'o Cc: Linus Torvalds, manish honap, adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org, linux-ext4 On 5/21/12 6:31 PM, Ted Ts'o wrote: > On Mon, May 21, 2012 at 03:22:28PM -0700, Linus Torvalds wrote: >> >> So I'm pretty sure this patch is what we want, and rw_verify_area() >> really is required to protect low-level filesystems from these kinds >> of issues. Not just ext4. >> >> At the same time, I would *really* want somebody who actually uses >> anything AIO to test it out too. Because I want to not only commit it, >> but also mark it for stable - and it would be nice to have some more >> testing than me saying "ok, it passes the one test-case sent to me" >> and "hey, the code looks sane". Just FWIW, the ext4_file_write() part of this fix was sent by Zheng Liu on 4/12/12: [PATCH RESEND] ext4: change return value from int to ssize_t in ext4_file_write I'll see about writing an xfstest for this stuff too so it doesn't regress again. -Eric > Hi Linus, > > We do use AIO at Google, and I primarily use fio (usually as part of > xfstests) to test AIO functionality. > > If you like I can carry the patch in the ext4 tree and test it as part > of the ext4 commits for the merge window. > > I probably won't be able to get to it until a bit later in the week, > though. Things have been really crazy for me this past two months... > > - Ted > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. 2012-05-22 16:11 ` Eric Sandeen @ 2012-05-22 19:02 ` Eric Sandeen 0 siblings, 0 replies; 11+ messages in thread From: Eric Sandeen @ 2012-05-22 19:02 UTC (permalink / raw) To: Eric Sandeen Cc: Ted Ts'o, Linus Torvalds, manish honap, adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org, linux-ext4 On 5/22/12 11:11 AM, Eric Sandeen wrote: > On 5/21/12 6:31 PM, Ted Ts'o wrote: >> On Mon, May 21, 2012 at 03:22:28PM -0700, Linus Torvalds wrote: >>> >>> So I'm pretty sure this patch is what we want, and rw_verify_area() >>> really is required to protect low-level filesystems from these kinds >>> of issues. Not just ext4. >>> >>> At the same time, I would *really* want somebody who actually uses >>> anything AIO to test it out too. Because I want to not only commit it, >>> but also mark it for stable - and it would be nice to have some more >>> testing than me saying "ok, it passes the one test-case sent to me" >>> and "hey, the code looks sane". > > Just FWIW, the ext4_file_write() part of this fix was sent by Zheng Liu on 4/12/12: > > [PATCH RESEND] ext4: change return value from int to ssize_t in ext4_file_write Sorry, I was talking about Manish's original patch, not Linus's followup. -Eric > I'll see about writing an xfstest for this stuff too so it doesn't regress again. > > -Eric > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. 2012-05-21 23:31 ` Ted Ts'o 2012-05-22 16:11 ` Eric Sandeen @ 2012-05-22 16:13 ` manish honap 1 sibling, 0 replies; 11+ messages in thread From: manish honap @ 2012-05-22 16:13 UTC (permalink / raw) To: Ted Ts'o, Linus Torvalds Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org Hello Linus, Ted On Mon, May 21, 2012 at 03:22:28PM -0700, Linus Torvalds wrote: > > So I'm pretty sure this patch is what we want, and rw_verify_area() > really is required to protect low-level filesystems from these kinds > of issues. Not just ext4. Patch works on my machine also. > > At the same time, I would *really* want somebody who actually uses > anything AIO to test it out too. Because I want to not only commit it, > but also mark it for stable - and it would be nice to have some more > testing than me saying "ok, it passes the one test-case sent to me" > and "hey, the code looks sane". Used Chris Mason's aio-stress.c - Manish ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/1] xfstests 286: test for 2G overflows in AIO 2012-05-21 4:50 ` Linus Torvalds 2012-05-21 22:22 ` Linus Torvalds @ 2012-05-22 19:26 ` Eric Sandeen 2012-05-22 20:41 ` [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code Jeff Moyer 2 siblings, 0 replies; 11+ messages in thread From: Eric Sandeen @ 2012-05-22 19:26 UTC (permalink / raw) To: Linus Torvalds Cc: manish honap, tytso@mit.edu, adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org, Jeff Moyer On 5/20/12 11:50 PM, Linus Torvalds wrote: > On Sun, May 20, 2012 at 8:28 PM, manish honap > <manish_honap_vit@yahoo.co.in> wrote: >> Hello Linus, >> >> The overflow issue was seen during async dio path > > Christ. fs/aio.c doesn't do the proper rw_verify_area(). > > As a result, it doesn't check file locks, and it doesn't seem to check > offset overflows either. > > The vector versions kind of get the size limit by mistake (because > they at least use rw_copy_check_uvector(), which does limit things to > MAX_RW_COUNT), but they don't do the offset overflow check either. > > Does this patch work for you? What it *should* do is the same that the > other read/write paths do (and the vector path for aio already do), > namely truncate reads or writes to MAX_RW_COUNT (which is INT_MAX > aligned down to a page). > > This patch is entirely untested, > > Linus Here's a testcase for xfstests. --- Add new testcase looking for overflows in AIO code when 2G write requests are issued. Also fix up ltp/aio-stress.c to not overflow before the request ever gets to the kernel... Signed-off-by: Eric Sandeen <sandeen@redhat.com> --- diff --git a/286 b/286 new file mode 100755 index 0000000..f5daa96 --- /dev/null +++ b/286 @@ -0,0 +1,75 @@ +#! /bin/bash +# FS QA Test No. 286 +# +# Check for 2G overflows in AIO +# +#----------------------------------------------------------------------- +# Copyright (c) 2012 Red Hat, Inc. All Rights Reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +#----------------------------------------------------------------------- +# +# creator +owner=sandeen@sandeen.net + +seq=`basename $0` +echo "QA output created by $seq" + +here=`pwd` +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + cd / + rm -f $tmp.* +} + +# get standard environment, filters and checks +. ./common.rc +. ./common.filter + +# real QA test starts here +rm -f $seq.full + +# Modify as appropriate. +_supported_fs generic +_supported_os IRIX Linux +# Because we will be writing some big files +_require_scratch +[ -x $here/ltp/aio-stress ] || _notrun "aio-stress not built for this platform" + +_scratch_mkfs > $seq.full 2>&1 +_scratch_mount +# A little over 4G +_require_fs_space $SCRATCH_MNT 4194500 + +expected=4294967296 + +# 4x 1G IOs, should pass +$here/ltp/aio-stress -d 1 -b 1 -i 1 -O -I 4 -s 4096 -r 1048576 -v $SCRATCH_MNT/aiofile >> $seq.full 2>&1 +size=$(ls -l $SCRATCH_MNT/aiofile | $AWK_PROG '{print $5}') +[ "$size" -ne $expected ] && _fail "2 x 1G IOs: filesize $size not $expected" + +rm -f $SCRATCH_MNT/aiofile + +# 2x 2G IOs, has failed in past +$here/ltp/aio-stress -d 1 -b 1 -i 1 -O -I 2 -s 4096 -r 2097152 -v $SCRATCH_MNT/aiofile >> $seq.full 2>&1 +size=$(ls -l $SCRATCH_MNT/aiofile | $AWK_PROG '{print $5}') +[ "$size" -ne $expected ] && _fail "1 x 2G IOs: filesize $size not $expected" + +# success, all done +status=0 +exit diff --git a/286.out b/286.out new file mode 100644 index 0000000..6415ad8 --- /dev/null +++ b/286.out @@ -0,0 +1 @@ +QA output created by 286 diff --git a/group b/group index 17afdcd..e91abd6 100644 --- a/group +++ b/group @@ -404,3 +404,4 @@ deprecated 283 dump ioctl auto quick 284 auto 285 repair +286 aio auto diff --git a/ltp/aio-stress.c b/ltp/aio-stress.c index 57a2158..40651b4 100644 --- a/ltp/aio-stress.c +++ b/ltp/aio-stress.c @@ -92,7 +92,7 @@ int completion_latency_stats = 0; int io_iter = 8; int iterations = RUN_FOREVER; int max_io_submit = 0; -long rec_len = 64 * 1024; +size_t rec_len = 64 * 1024; int depth = 64; int num_threads = 1; int num_contexts = 1; @@ -102,7 +102,7 @@ int use_shm = 0; int shm_id; char *unaligned_buffer = NULL; char *aligned_buffer = NULL; -int padded_reclen = 0; +size_t padded_reclen = 0; int stonewall = 1; int verify = 0; char *verify_buf = NULL; @@ -661,7 +661,7 @@ finish_oper(struct thread_info *t, struct io_oper *oper) * null on error */ static struct io_oper * -create_oper(int fd, int rw, off_t start, off_t end, int reclen, int depth, +create_oper(int fd, int rw, off_t start, off_t end, size_t reclen, int depth, int iter, char *file_name) { struct io_oper *oper; @@ -925,7 +925,7 @@ void aio_setup(io_context_t *io_ctx, int n) */ int setup_ious(struct thread_info *t, int num_files, int depth, - int reclen, int max_io_submit) { + size_t reclen, int max_io_submit) { int i; size_t bytes = num_files * depth * sizeof(*t->ios); @@ -989,7 +989,7 @@ free_buffers: * buffers to */ int setup_shared_mem(int num_threads, int num_files, int depth, - int reclen, int max_io_submit) + size_t reclen, int max_io_submit) { char *p = NULL; size_t total_ram; ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code. 2012-05-21 4:50 ` Linus Torvalds 2012-05-21 22:22 ` Linus Torvalds 2012-05-22 19:26 ` [PATCH 1/1] xfstests 286: test for 2G overflows in AIO Eric Sandeen @ 2012-05-22 20:41 ` Jeff Moyer 2 siblings, 0 replies; 11+ messages in thread From: Jeff Moyer @ 2012-05-22 20:41 UTC (permalink / raw) To: Linus Torvalds Cc: manish honap, tytso@mit.edu, adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org Linus Torvalds <torvalds@linux-foundation.org> writes: > On Sun, May 20, 2012 at 8:28 PM, manish honap > <manish_honap_vit@yahoo.co.in> wrote: >> Hello Linus, >> >> The overflow issue was seen during async dio path > > Christ. fs/aio.c doesn't do the proper rw_verify_area(). > > As a result, it doesn't check file locks, and it doesn't seem to check > offset overflows either. > > The vector versions kind of get the size limit by mistake (because > they at least use rw_copy_check_uvector(), which does limit things to > MAX_RW_COUNT), but they don't do the offset overflow check either. > > Does this patch work for you? What it *should* do is the same that the > other read/write paths do (and the vector path for aio already do), > namely truncate reads or writes to MAX_RW_COUNT (which is INT_MAX > aligned down to a page). OK, this took me a while to wrap my head around, mostly due to confusion with ki_nbytes and ki_left. When doing p{read|write}v, aio_nbytes is used to store the number of vectors. Thus, we need to set both ki_left and ki_nbytes to the result of the rw_verify_area call, which means that when ki_left goes to 0, we'll exit out of this loop: do { ret = rw_op(iocb, &iocb->ki_iovec[iocb->ki_cur_seg], iocb->ki_nr_segs - iocb->ki_cur_seg, iocb->ki_pos); if (ret > 0) aio_advance_iovec(iocb, ret); /* retry all partial writes. retry partial reads as long as its a * regular file. */ } while (ret > 0 && iocb->ki_left > 0 && (opcode == IOCB_CMD_PWRITEV || (!S_ISFIFO(inode->i_mode) && !S_ISSOCK(inode->i_mode)))); and here: /* This means we must have transferred all that we could */ /* No need to retry anymore */ if ((ret == 0) || (iocb->ki_left == 0)) ret = iocb->ki_nbytes - iocb->ki_left; ki_nbytes will remain at whatever rw_verify_area returned, and ki_left should be zero. Meaning, if we passed in 2G or more, we're capped at the value Linus mentioned earlier. In the case of a single vector, we leave ki_left alone: > - kiocb->ki_iovec->iov_len = kiocb->ki_left; > + kiocb->ki_iovec->iov_len = bytes; Thus, we will exit the while loop due to ret == 0, but this time, ki_nbytes will be the initial value, and ki_left will be non-zero. The end result is that the amount of data transferred is returned, so all is well. Of course, none of that matters for O_DIRECT I/O, since the return value is -EIOCBQUEUED. And, since buffered AIO isn't even asynchronous, nobody /should/ be doing that anyway, so even if the above was broken, nobody /should/ notice. Nevertheless, I did some targeted testing of both the O_DIRECT and the buffered cases, and it seems to be working fine for the single and multiple vector cases. I do have one question, though: why do we limit the total bytes in a vectored operation to 2GB? So long as each segment is below 2G, we shouldn't trip up any code paths in the kernel, right? So that just leaves the "it would take a long time and burn kernel resources" argument. So I guess that's the reason? Anyway, consider this a long-winded: Reviewed-by: Jeff Moyer <jmoyer@redhat.com> ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2012-05-22 20:41 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <F9014F50-D1F6-4B64-8535-7452CC64B18A@qualexsystems.com> 2012-05-20 8:01 ` [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code manish honap 2012-05-20 18:33 ` Linus Torvalds 2012-05-21 3:28 ` manish honap 2012-05-21 4:50 ` Linus Torvalds 2012-05-21 22:22 ` Linus Torvalds 2012-05-21 23:31 ` Ted Ts'o 2012-05-22 16:11 ` Eric Sandeen 2012-05-22 19:02 ` Eric Sandeen 2012-05-22 16:13 ` manish honap 2012-05-22 19:26 ` [PATCH 1/1] xfstests 286: test for 2G overflows in AIO Eric Sandeen 2012-05-22 20:41 ` [PATCH 1/1] ext4, dio: Remove overflow for size >2G in aio-dio code Jeff Moyer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).