* [PATCH] aio: partial write should not return error code.
@ 2008-01-03  9:04 Rusty Russell
  2008-01-03  9:04 ` [PATCH] aio: negative offset should return -EINVAL Rusty Russell
  2008-01-03 20:04 ` [PATCH] aio: partial write should not return error code Zach Brown
  0 siblings, 2 replies; 6+ messages in thread
From: Rusty Russell @ 2008-01-03  9:04 UTC (permalink / raw)
  To: bcrl; +Cc: linux-aio, Andrew Morton, linux-kernel

When an AIO write gets an error after writing some data (e.g. ENOSPC),
it should return the amount already written, not the error, just like
write() is supposed to.

This was found by the libaio test suite.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r 18802689361a fs/aio.c
--- a/fs/aio.c	Thu Jan 03 15:22:24 2008 +1100
+++ b/fs/aio.c	Thu Jan 03 18:05:25 2008 +1100
@@ -1346,6 +1350,13 @@ static ssize_t aio_rw_vect_retry(struct 
 	/* This means we must have transferred all that we could */
 	/* No need to retry anymore */
 	if ((ret == 0) || (iocb->ki_left == 0))
+		ret = iocb->ki_nbytes - iocb->ki_left;
+
+	/* If we managed to write some out we return that, rather than
+	 * the eventual error. */
+	if (opcode == IOCB_CMD_PWRITEV
+	    && ret < 0
+	    && iocb->ki_nbytes - iocb->ki_left)
 		ret = iocb->ki_nbytes - iocb->ki_left;
 
 	return ret;
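
The rule the patch relies on, that a transfer interrupted by an error
reports the bytes already written and reports the error only when nothing
was written, can be sketched in user space.  This is an illustration only,
not kernel code: fake_dev, fake_write_chunk() and fake_write() are
hypothetical names, and the "device" is just a counter of free bytes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical device with a fixed amount of free space. */
struct fake_dev {
	size_t free_bytes;
};

/* Write one chunk; fails with -ENOSPC once the device is full. */
static ssize_t fake_write_chunk(struct fake_dev *dev, size_t len)
{
	if (dev->free_bytes == 0)
		return -ENOSPC;
	if (len > dev->free_bytes)
		len = dev->free_bytes;
	dev->free_bytes -= len;
	return (ssize_t)len;
}

/*
 * Write nbytes in chunk-sized pieces.  If an error interrupts the transfer,
 * return the bytes written so far; return the error only if nothing was
 * written at all.
 */
static ssize_t fake_write(struct fake_dev *dev, size_t nbytes, size_t chunk)
{
	size_t left = nbytes;

	assert(chunk > 0);
	while (left > 0) {
		ssize_t ret = fake_write_chunk(dev, left < chunk ? left : chunk);

		if (ret < 0)
			return (nbytes - left) ? (ssize_t)(nbytes - left) : ret;
		left -= (size_t)ret;
	}
	return (ssize_t)nbytes;
}
```

With 10 free bytes, a 16-byte request in 4-byte chunks returns 10; repeating
the call on the now-full device returns -ENOSPC.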


* [PATCH] aio: negative offset should return -EINVAL
  2008-01-03  9:04 [PATCH] aio: partial write should not return error code Rusty Russell
@ 2008-01-03  9:04 ` Rusty Russell
  2008-01-03 20:17   ` Zach Brown
  2008-01-03 20:04 ` [PATCH] aio: partial write should not return error code Zach Brown
  1 sibling, 1 reply; 6+ messages in thread
From: Rusty Russell @ 2008-01-03  9:04 UTC (permalink / raw)
  To: bcrl; +Cc: linux-aio, Andrew Morton, linux-kernel

An AIO read or write should return -EINVAL if the offset is negative.
This check matches the one in pread and pwrite.

This was found by the libaio test suite.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r 18802689361a fs/aio.c
--- a/fs/aio.c	Thu Jan 03 15:22:24 2008 +1100
+++ b/fs/aio.c	Thu Jan 03 18:05:25 2008 +1100
@@ -1330,6 +1330,10 @@ static ssize_t aio_rw_vect_retry(struct 
 		opcode = IOCB_CMD_PWRITEV;
 	}
 
+	/* This matches the pread()/pwrite() logic */
+	if (iocb->ki_pos < 0)
+		return -EINVAL;
+
 	do {
 		ret = rw_op(iocb, &iocb->ki_iovec[iocb->ki_cur_seg],
 			    iocb->ki_nr_segs - iocb->ki_cur_seg,
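
The pread()/pwrite() behaviour that this check mirrors can be observed from
user space.  The sketch below assumes a POSIX system; pread_errno_at() is a
hypothetical helper that reports the errno from a one-byte pread() at the
given offset on an empty temporary file.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns the errno produced by pread() at the given offset, 0 if the call
 * succeeded, or -1 if the scratch file could not be created.
 */
static int pread_errno_at(off_t offset)
{
	char buf[1];
	FILE *tmp = tmpfile();
	int err = 0;

	if (!tmp)
		return -1;
	errno = 0;
	if (pread(fileno(tmp), buf, sizeof(buf), offset) < 0)
		err = errno;
	fclose(tmp);
	return err;
}
```

pread_errno_at(-1) should yield EINVAL, while offset 0 succeeds (it reads
zero bytes at end of file).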


* Re: [PATCH] aio: partial write should not return error code.
  2008-01-03  9:04 [PATCH] aio: partial write should not return error code Rusty Russell
  2008-01-03  9:04 ` [PATCH] aio: negative offset should return -EINVAL Rusty Russell
@ 2008-01-03 20:04 ` Zach Brown
  2008-01-04  3:10   ` Rusty Russell
  1 sibling, 1 reply; 6+ messages in thread
From: Zach Brown @ 2008-01-03 20:04 UTC (permalink / raw)
  To: Rusty Russell; +Cc: bcrl, linux-aio, Andrew Morton, linux-kernel

Rusty Russell wrote:
> When an AIO write gets an error after writing some data (e.g. ENOSPC),
> it should return the amount already written, not the error, just like
> write() is supposed to.

Andrew, please don't queue this fix.  I think the bug is valid but the
patch is subtly dangerous.

> diff -r 18802689361a fs/aio.c
> --- a/fs/aio.c	Thu Jan 03 15:22:24 2008 +1100
> +++ b/fs/aio.c	Thu Jan 03 18:05:25 2008 +1100
> @@ -1346,6 +1350,13 @@ static ssize_t aio_rw_vect_retry(struct 
>  	/* This means we must have transferred all that we could */
>  	/* No need to retry anymore */
>  	if ((ret == 0) || (iocb->ki_left == 0))
> +		ret = iocb->ki_nbytes - iocb->ki_left;
> +
> +	/* If we managed to write some out we return that, rather than
> +	 * the eventual error. */
> +	if (opcode == IOCB_CMD_PWRITEV
> +	    && ret < 0
> +	    && iocb->ki_nbytes - iocb->ki_left)
>  		ret = iocb->ki_nbytes - iocb->ki_left;

This doesn't account for the (sigh) -EIOCB* error codes.  They must be
returned to the caller so that it can properly handle the iocb reference
counting.  Failure to do so can lead to oopses.

To be fair, I think you'll have a really hard time finding an
->aio_write() implementation which would return partial progress and
*then* one of the magical errnos.  But the infrastructure does allow it.

So maybe we could get a helper in aio.h that abstracts out the

	(ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY)

condition.  Then I think this patch would be fine.

I assigned a bug to remind myself to revisit this if you aren't excited
by continuing with the patch:

  http://bugzilla.kernel.org/show_bug.cgi?id=9681

- z
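
A helper of the kind suggested above might look like the following.  The
name aio_ret_is_final_error() is hypothetical (the thread does not propose
one), and because EIOCBQUEUED and EIOCBRETRY are kernel-internal codes that
user-space headers do not define, the sketch supplies placeholder values so
that it compiles on its own.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Kernel-internal codes, not exposed to user space.  Defined here only so
 * the sketch is self-contained; treat the numeric values as placeholders.
 */
#ifndef EIOCBQUEUED
#define EIOCBQUEUED 529
#endif
#ifndef EIOCBRETRY
#define EIOCBRETRY 530
#endif

/*
 * True when ret is a genuine error, i.e. negative and not one of the
 * "magical" codes the caller needs to see for iocb reference counting.
 */
static inline bool aio_ret_is_final_error(ssize_t ret)
{
	return ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY;
}
```

Keeping the test in one place means a caller that wants to substitute
partial progress for an error cannot forget to exempt the -EIOCB* codes.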


* Re: [PATCH] aio: negative offset should return -EINVAL
  2008-01-03  9:04 ` [PATCH] aio: negative offset should return -EINVAL Rusty Russell
@ 2008-01-03 20:17   ` Zach Brown
  0 siblings, 0 replies; 6+ messages in thread
From: Zach Brown @ 2008-01-03 20:17 UTC (permalink / raw)
  To: Rusty Russell; +Cc: bcrl, linux-aio, Andrew Morton, linux-kernel

Rusty Russell wrote:
> An AIO read or write should return -EINVAL if the offset is negative.
> This check matches the one in pread and pwrite.
> 
> This was found by the libaio test suite.
> 
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

This looks fine to me.

Signed-off-by: Zach Brown <zach.brown@oracle.com>

- z


* Re: [PATCH] aio: partial write should not return error code.
  2008-01-03 20:04 ` [PATCH] aio: partial write should not return error code Zach Brown
@ 2008-01-04  3:10   ` Rusty Russell
  2008-01-04 18:19     ` Zach Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Rusty Russell @ 2008-01-04  3:10 UTC (permalink / raw)
  To: Zach Brown; +Cc: bcrl, linux-aio, Andrew Morton, linux-kernel

On Friday 04 January 2008 07:04:30 Zach Brown wrote:
> Rusty Russell wrote:
> > When an AIO write gets an error after writing some data (e.g. ENOSPC),
> > it should return the amount already written, not the error, just like
> > write() is supposed to.
>
> Andrew, please don't queue this fix.  I think the bug is valid but the
> patch is subtly dangerous.
>
> > diff -r 18802689361a fs/aio.c
> > --- a/fs/aio.c	Thu Jan 03 15:22:24 2008 +1100
> > +++ b/fs/aio.c	Thu Jan 03 18:05:25 2008 +1100
> > @@ -1346,6 +1350,13 @@ static ssize_t aio_rw_vect_retry(struct
> >  	/* This means we must have transferred all that we could */
> >  	/* No need to retry anymore */
> >  	if ((ret == 0) || (iocb->ki_left == 0))
> > +		ret = iocb->ki_nbytes - iocb->ki_left;
> > +
> > +	/* If we managed to write some out we return that, rather than
> > +	 * the eventual error. */
> > +	if (opcode == IOCB_CMD_PWRITEV
> > +	    && ret < 0
> > +	    && iocb->ki_nbytes - iocb->ki_left)
> >  		ret = iocb->ki_nbytes - iocb->ki_left;
>
> This doesn't account for the (sigh) -EIOCB* error codes.  They must be
> returned to the caller so that it can properly handle the iocb reference
> counting.  Failure to do so can lead to oopses.
>
> To be fair, I think you'll have a really hard time finding an
> ->aio_write() implementation which would return partial progress and
> *then* one of the magical errnos.  But the infrastructure does allow it.

Erk, thanks.

> So maybe we could get a helper in aio.h that abstracts out the
>
> 	(ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY)
>
> condition.  Then I think this patch would be fine.
>
> I assigned a bug to remind myself to revisit this if you aren't excited
> by continuing with the patch:
>
>   http://bugzilla.kernel.org/show_bug.cgi?id=9681

No, that's fine, here is the new one:

When an AIO write gets a non-retry error after writing some data
(e.g. ENOSPC), it should return the amount already written, not the
error, just like write() is supposed to.

This was found by the libaio test suite.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 fs/aio.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff -r 18802689361a fs/aio.c
--- a/fs/aio.c	Thu Jan 03 15:22:24 2008 +1100
+++ b/fs/aio.c	Thu Jan 03 18:05:25 2008 +1100
@@ -1346,6 +1350,13 @@ static ssize_t aio_rw_vect_retry(struct 
 	/* This means we must have transferred all that we could */
 	/* No need to retry anymore */
 	if ((ret == 0) || (iocb->ki_left == 0))
+		ret = iocb->ki_nbytes - iocb->ki_left;
+
+	/* If we managed to write some out we return that, rather than
+	 * the eventual error. */
+	if (opcode == IOCB_CMD_PWRITEV
+	    && ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY
+	    && iocb->ki_nbytes - iocb->ki_left)
 		ret = iocb->ki_nbytes - iocb->ki_left;
 
 	return ret;
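
The tail of aio_rw_vect_retry() after this revision can be modelled in
isolation to check each case.  This is a user-space sketch, not the kernel
function: fake_iocb, fake_opcode and settle_result() are hypothetical names,
only the two iocb fields the hunk touches are modelled, and the -EIOCB*
values are placeholders.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Kernel-internal codes; placeholder values so the sketch stands alone. */
#ifndef EIOCBQUEUED
#define EIOCBQUEUED 529
#endif
#ifndef EIOCBRETRY
#define EIOCBRETRY 530
#endif

enum fake_opcode { FAKE_PREADV, FAKE_PWRITEV };

/* Only the fields used by the hunk. */
struct fake_iocb {
	size_t ki_nbytes;	/* bytes originally requested */
	size_t ki_left;		/* bytes not yet transferred */
};

/* Mirrors the two if-statements at the end of the patched function. */
static ssize_t settle_result(enum fake_opcode opcode, ssize_t ret,
			     const struct fake_iocb *iocb)
{
	assert(iocb->ki_left <= iocb->ki_nbytes);
	ssize_t done = (ssize_t)(iocb->ki_nbytes - iocb->ki_left);

	/* Transferred all that we could: report the byte count. */
	if (ret == 0 || iocb->ki_left == 0)
		ret = done;

	/* A write that made progress reports it instead of a real error. */
	if (opcode == FAKE_PWRITEV
	    && ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY
	    && done)
		ret = done;

	return ret;
}
```

For a 100-byte write with 40 bytes left, -ENOSPC becomes 60 while
-EIOCBRETRY and -EIOCBQUEUED pass through unchanged.  With nothing written,
-ENOSPC is still returned, and reads keep their error in every case.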


* Re: [PATCH] aio: partial write should not return error code.
  2008-01-04  3:10   ` Rusty Russell
@ 2008-01-04 18:19     ` Zach Brown
  0 siblings, 0 replies; 6+ messages in thread
From: Zach Brown @ 2008-01-04 18:19 UTC (permalink / raw)
  To: Rusty Russell; +Cc: bcrl, linux-aio, Andrew Morton, linux-kernel


> 
> No, that's fine, here is the new one:
> 
> When an AIO write gets a non-retry error after writing some data
> (eg. ENOSPC), it should return the amount written already, not the
> error.  Just like write() is supposed to.
> 
> This was found by the libaio test suite.
> 
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

This looks good, feel free to push this from your tree.

Acked-by: Zach Brown <zach.brown@oracle.com>

- z

