linux-nfs.vger.kernel.org archive mirror
* [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
@ 2011-01-19 22:36 Chuck Lever
  2011-01-19 23:25 ` Trond Myklebust
  2011-01-21  2:42 ` Wengang Wang
  0 siblings, 2 replies; 13+ messages in thread
From: Chuck Lever @ 2011-01-19 22:36 UTC (permalink / raw)
  To: npiggin; +Cc: linux-fsdevel, linux-nfs, wen.gang.wang

Nick Piggin reports:

> I'm getting use after frees in aio code in NFS
>
> [ 2703.396766] Call Trace:
> [ 2703.396858]  [<ffffffff8100b057>] ? native_sched_clock+0x27/0x80
> [ 2703.396959]  [<ffffffff8108509e>] ? put_lock_stats+0xe/0x40
> [ 2703.397058]  [<ffffffff81088348>] ? lock_release_holdtime+0xa8/0x140
> [ 2703.397159]  [<ffffffff8108a2a5>] lock_acquire+0x95/0x1b0
> [ 2703.397260]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> [ 2703.397361]  [<ffffffff81039701>] ? get_parent_ip+0x11/0x50
> [ 2703.397464]  [<ffffffff81612a31>] _raw_spin_lock_irq+0x41/0x80
> [ 2703.397564]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> [ 2703.397662]  [<ffffffff811627db>] aio_put_req+0x2b/0x60
> [ 2703.397761]  [<ffffffff811647fe>] do_io_submit+0x2be/0x7c0
> [ 2703.397895]  [<ffffffff81164d0b>] sys_io_submit+0xb/0x10
> [ 2703.397995]  [<ffffffff8100307b>] system_call_fastpath+0x16/0x1b
>
> Adding some tracing, it is due to nfs completing the request then
> returning something other than -EIOCBQUEUED, so aio.c
> also completes the request.

To address this, prevent the NFS direct I/O engine from completing
async iocbs when the forward path returns an error other than
EIOCBQUEUED.

This appears to survive ^C during both "xfstest no. 208" and "fsx -Z."

Cc: Stable <stable@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

Here's my take.
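
For anyone not steeped in the aio submission contract, here is a minimal,
self-contained userspace sketch of the failure mode described above.
Everything in it (toy_iocb, toy_submit, the constants) is invented for
illustration; it is not the fs/aio.c or fs/nfs/direct.c code, just a toy
model of the double completion:

/*
 * Toy model: the filesystem completes the request on its error path
 * and then returns something other than "queued", so the generic
 * submission path completes the same request a second time.
 */
#include <assert.h>
#include <stdio.h>

#define TOY_EIOCBQUEUED	(-529)	/* stand-in for -EIOCBQUEUED */
#define TOY_EIO		(-5)	/* stand-in for -EIO */

struct toy_iocb {
	int users;			/* models the aio request refcount */
};

static void toy_complete(struct toy_iocb *req)
{
	assert(req->users > 0);		/* models the BUG_ON in the aio put path */
	req->users--;
}

/* Models the NFS direct I/O forward path before the fix. */
static int toy_forward_path(struct toy_iocb *req, int fail)
{
	if (fail) {
		toy_complete(req);	/* filesystem completes the iocb ... */
		return TOY_EIO;		/* ... and still returns a plain error */
	}
	return TOY_EIOCBQUEUED;		/* completion will happen later */
}

/* Models do_io_submit(): any non-queued return means "complete it here". */
static void toy_submit(int fail)
{
	struct toy_iocb req = { .users = 1 };

	if (toy_forward_path(&req, fail) != TOY_EIOCBQUEUED)
		toy_complete(&req);	/* second completion on the error path */
	/* in the queued case the I/O completion callback would complete
	 * the request later; omitted in this toy model */
}

int main(void)
{
	toy_submit(0);			/* async case: fine */
	toy_submit(1);			/* error case: trips the assert */
	printf("not reached once the double completion happens\n");
	return 0;
}

With the change below, the forward path either hands the iocb off for later
completion (and returns 0), or returns the error without completing it
itself, so the request is only ever completed once.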

 fs/nfs/direct.c |   32 +++++++++++++++++---------------
 1 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index e6ace0d..c2176f4 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -407,15 +407,16 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
 		pos += vec->iov_len;
 	}
 
-	if (put_dreq(dreq))
-		nfs_direct_complete(dreq);
-
-	if (requested_bytes != 0)
-		return 0;
+	/*
+	 * If no bytes were started, return the error, and let the
+	 * generic layer handle the completion.
+	 */
+	if (requested_bytes == 0)
+		return result < 0 ? result : -EIO;
 
-	if (result < 0)
-		return result;
-	return -EIO;
+	if (put_dreq(dreq))
+		nfs_direct_write_complete(dreq, dreq->inode);
+	return 0;
 }
 
 static ssize_t nfs_direct_read(struct kiocb *iocb, const struct iovec *iov,
@@ -841,15 +842,16 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
 		pos += vec->iov_len;
 	}
 
+	/*
+	 * If no bytes were started, return the error, and let the
+	 * generic layer handle the completion.
+	 */
+	if (requested_bytes == 0)
+		return result < 0 ? result : -EIO;
+
 	if (put_dreq(dreq))
 		nfs_direct_write_complete(dreq, dreq->inode);
-
-	if (requested_bytes != 0)
-		return 0;
-
-	if (result < 0)
-		return result;
-	return -EIO;
+	return 0;
 }
 
 static ssize_t nfs_direct_write(struct kiocb *iocb, const struct iovec *iov,



* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 22:36 [PATCH] NFS: Fix "BUG at fs/aio.c:554!" Chuck Lever
@ 2011-01-19 23:25 ` Trond Myklebust
  2011-01-19 23:26   ` Nick Piggin
  2011-01-19 23:31   ` Chuck Lever
  2011-01-21  2:42 ` Wengang Wang
  1 sibling, 2 replies; 13+ messages in thread
From: Trond Myklebust @ 2011-01-19 23:25 UTC (permalink / raw)
  To: Chuck Lever; +Cc: npiggin, linux-fsdevel, linux-nfs, wen.gang.wang

On Wed, 2011-01-19 at 17:36 -0500, Chuck Lever wrote: 
> Nick Piggin reports:
> 
> > I'm getting use after frees in aio code in NFS
> >
> > [ 2703.396766] Call Trace:
> > [ 2703.396858]  [<ffffffff8100b057>] ? native_sched_clock+0x27/0x80
> > [ 2703.396959]  [<ffffffff8108509e>] ? put_lock_stats+0xe/0x40
> > [ 2703.397058]  [<ffffffff81088348>] ? lock_release_holdtime+0xa8/0x140
> > [ 2703.397159]  [<ffffffff8108a2a5>] lock_acquire+0x95/0x1b0
> > [ 2703.397260]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> > [ 2703.397361]  [<ffffffff81039701>] ? get_parent_ip+0x11/0x50
> > [ 2703.397464]  [<ffffffff81612a31>] _raw_spin_lock_irq+0x41/0x80
> > [ 2703.397564]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> > [ 2703.397662]  [<ffffffff811627db>] aio_put_req+0x2b/0x60
> > [ 2703.397761]  [<ffffffff811647fe>] do_io_submit+0x2be/0x7c0
> > [ 2703.397895]  [<ffffffff81164d0b>] sys_io_submit+0xb/0x10
> > [ 2703.397995]  [<ffffffff8100307b>] system_call_fastpath+0x16/0x1b
> >
> > Adding some tracing, it is due to nfs completing the request then
> > returning something other than -EIOCBQUEUED, so aio.c
> > also completes the request.
> 
> To address this, prevent the NFS direct I/O engine from completing
> async iocbs when the forward path returns an error other than
> EIOCBQUEUED.
> 
> This appears to survive ^C during both "xfstest no. 208" and "fsx -Z."
> 
> Cc: Stable <stable@kernel.org>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> 
> Here's my take.
> 
>  fs/nfs/direct.c |   32 +++++++++++++++++---------------
>  1 files changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
> index e6ace0d..c2176f4 100644
> --- a/fs/nfs/direct.c
> +++ b/fs/nfs/direct.c
> @@ -407,15 +407,16 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
>  		pos += vec->iov_len;
>  	}
>  
> -	if (put_dreq(dreq))
> -		nfs_direct_complete(dreq);
> -
> -	if (requested_bytes != 0)
> -		return 0;
> +	/*
> +	 * If no bytes were started, return the error, and let the
> +	 * generic layer handle the completion.
> +	 */
> +	if (requested_bytes == 0)
> +		return result < 0 ? result : -EIO;
>  
> -	if (result < 0)
> -		return result;
> -	return -EIO;
> +	if (put_dreq(dreq))
> +		nfs_direct_write_complete(dreq, dreq->inode);
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In nfs_direct_read_schedule_iovec()? Shouldn't that be 
		nfs_direct_complete(dreq);

Also, why is EIO the correct reply when no bytes were read/written? Why
shouldn't the VFS aio code be able to cope with a zero byte reply?

> +	return 0;
>  }
>  
>  static ssize_t nfs_direct_read(struct kiocb *iocb, const struct iovec *iov,
> @@ -841,15 +842,16 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
>  		pos += vec->iov_len;
>  	}
>  
> +	/*
> +	 * If no bytes were started, return the error, and let the
> +	 * generic layer handle the completion.
> +	 */
> +	if (requested_bytes == 0)
> +		return result < 0 ? result : -EIO;
> +
>  	if (put_dreq(dreq))
>  		nfs_direct_write_complete(dreq, dreq->inode);
> -
> -	if (requested_bytes != 0)
> -		return 0;
> -
> -	if (result < 0)
> -		return result;
> -	return -EIO;
> +	return 0;
>  }
>  
>  static ssize_t nfs_direct_write(struct kiocb *iocb, const struct iovec *iov,
> 

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com



* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:25 ` Trond Myklebust
@ 2011-01-19 23:26   ` Nick Piggin
  2011-01-19 23:31     ` Trond Myklebust
  2011-01-19 23:31   ` Chuck Lever
  1 sibling, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2011-01-19 23:26 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Chuck Lever, linux-fsdevel, linux-nfs, wen.gang.wang

On Thu, Jan 20, 2011 at 10:25 AM, Trond Myklebust
<Trond.Myklebust@netapp.com> wrote:
> On Wed, 2011-01-19 at 17:36 -0500, Chuck Lever wrote:
>> Nick Piggin reports:
>>
>> > I'm getting use after frees in aio code in NFS
>> >
>> > [ 2703.396766] Call Trace:
>> > [ 2703.396858]  [<ffffffff8100b057>] ? native_sched_clock+0x27/0x80
>> > [ 2703.396959]  [<ffffffff8108509e>] ? put_lock_stats+0xe/0x40
>> > [ 2703.397058]  [<ffffffff81088348>] ? lock_release_holdtime+0xa8/0x140
>> > [ 2703.397159]  [<ffffffff8108a2a5>] lock_acquire+0x95/0x1b0
>> > [ 2703.397260]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
>> > [ 2703.397361]  [<ffffffff81039701>] ? get_parent_ip+0x11/0x50
>> > [ 2703.397464]  [<ffffffff81612a31>] _raw_spin_lock_irq+0x41/0x80
>> > [ 2703.397564]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
>> > [ 2703.397662]  [<ffffffff811627db>] aio_put_req+0x2b/0x60
>> > [ 2703.397761]  [<ffffffff811647fe>] do_io_submit+0x2be/0x7c0
>> > [ 2703.397895]  [<ffffffff81164d0b>] sys_io_submit+0xb/0x10
>> > [ 2703.397995]  [<ffffffff8100307b>] system_call_fastpath+0x16/0x1b
>> >
>> > Adding some tracing, it is due to nfs completing the request then
>> > returning something other than -EIOCBQUEUED, so aio.c
>> > also completes the request.
>>
>> To address this, prevent the NFS direct I/O engine from completing
>> async iocbs when the forward path returns an error other than
>> EIOCBQUEUED.
>>
>> This appears to survive ^C during both "xfstest no. 208" and "fsx -Z."
>>
>> Cc: Stable <stable@kernel.org>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>
>> Here's my take.
>>
>>  fs/nfs/direct.c |   32 +++++++++++++++++---------------
>>  1 files changed, 17 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
>> index e6ace0d..c2176f4 100644
>> --- a/fs/nfs/direct.c
>> +++ b/fs/nfs/direct.c
>> @@ -407,15 +407,16 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
>>               pos += vec->iov_len;
>>       }
>>
>> -     if (put_dreq(dreq))
>> -             nfs_direct_complete(dreq);
>> -
>> -     if (requested_bytes != 0)
>> -             return 0;
>> +     /*
>> +      * If no bytes were started, return the error, and let the
>> +      * generic layer handle the completion.
>> +      */
>> +     if (requested_bytes == 0)
>> +             return result < 0 ? result : -EIO;
>>
>> -     if (result < 0)
>> -             return result;
>> -     return -EIO;
>> +     if (put_dreq(dreq))
>> +             nfs_direct_write_complete(dreq, dreq->inode);
>                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> In nfs_direct_read_schedule_iovec()? Shouldn't that be
>                nfs_direct_complete(dreq);
>
> Also, why is EIO the correct reply when no bytes were read/written? Why
> shouldn't the VFS aio code be able to cope with a zero byte reply?

What would it do?


* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:25 ` Trond Myklebust
  2011-01-19 23:26   ` Nick Piggin
@ 2011-01-19 23:31   ` Chuck Lever
  1 sibling, 0 replies; 13+ messages in thread
From: Chuck Lever @ 2011-01-19 23:31 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: npiggin, linux-fsdevel, linux-nfs, wen.gang.wang


On Jan 19, 2011, at 6:25 PM, Trond Myklebust wrote:

> On Wed, 2011-01-19 at 17:36 -0500, Chuck Lever wrote: 
>> Nick Piggin reports:
>> 
>>> I'm getting use after frees in aio code in NFS
>>> 
>>> [ 2703.396766] Call Trace:
>>> [ 2703.396858]  [<ffffffff8100b057>] ? native_sched_clock+0x27/0x80
>>> [ 2703.396959]  [<ffffffff8108509e>] ? put_lock_stats+0xe/0x40
>>> [ 2703.397058]  [<ffffffff81088348>] ? lock_release_holdtime+0xa8/0x140
>>> [ 2703.397159]  [<ffffffff8108a2a5>] lock_acquire+0x95/0x1b0
>>> [ 2703.397260]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
>>> [ 2703.397361]  [<ffffffff81039701>] ? get_parent_ip+0x11/0x50
>>> [ 2703.397464]  [<ffffffff81612a31>] _raw_spin_lock_irq+0x41/0x80
>>> [ 2703.397564]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
>>> [ 2703.397662]  [<ffffffff811627db>] aio_put_req+0x2b/0x60
>>> [ 2703.397761]  [<ffffffff811647fe>] do_io_submit+0x2be/0x7c0
>>> [ 2703.397895]  [<ffffffff81164d0b>] sys_io_submit+0xb/0x10
>>> [ 2703.397995]  [<ffffffff8100307b>] system_call_fastpath+0x16/0x1b
>>> 
>>> Adding some tracing, it is due to nfs completing the request then
>>> returning something other than -EIOCBQUEUED, so aio.c
>>> also completes the request.
>> 
>> To address this, prevent the NFS direct I/O engine from completing
>> async iocbs when the forward path returns an error other than
>> EIOCBQUEUED.
>> 
>> This appears to survive ^C during both "xfstest no. 208" and "fsx -Z."
>> 
>> Cc: Stable <stable@kernel.org>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> 
>> Here's my take.
>> 
>> fs/nfs/direct.c |   32 +++++++++++++++++---------------
>> 1 files changed, 17 insertions(+), 15 deletions(-)
>> 
>> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
>> index e6ace0d..c2176f4 100644
>> --- a/fs/nfs/direct.c
>> +++ b/fs/nfs/direct.c
>> @@ -407,15 +407,16 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
>> 		pos += vec->iov_len;
>> 	}
>> 
>> -	if (put_dreq(dreq))
>> -		nfs_direct_complete(dreq);
>> -
>> -	if (requested_bytes != 0)
>> -		return 0;
>> +	/*
>> +	 * If no bytes were started, return the error, and let the
>> +	 * generic layer handle the completion.
>> +	 */
>> +	if (requested_bytes == 0)
>> +		return result < 0 ? result : -EIO;
>> 
>> -	if (result < 0)
>> -		return result;
>> -	return -EIO;
>> +	if (put_dreq(dreq))
>> +		nfs_direct_write_complete(dreq, dreq->inode);
>                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> In nfs_direct_read_schedule_iovec()? Shouldn't that be 
> 		nfs_direct_complete(dreq);

Yes, copy and paste error.  Thanks for the review.

> Also, why is EIO the correct reply when no bytes were read/written? Why
> shouldn't the VFS aio code be able to cope with a zero byte reply?

-EIO is returned only if no other error code has been provided so far.  If no bytes were transferred, normally there is going to be some error code generated by the lower layer.  Should we make this a BUG instead?

A zero-byte write is allowed, of course, but I think that kind of request is shunted off before we arrive here.

> 
>> +	return 0;
>> }
>> 
>> static ssize_t nfs_direct_read(struct kiocb *iocb, const struct iovec *iov,
>> @@ -841,15 +842,16 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
>> 		pos += vec->iov_len;
>> 	}
>> 
>> +	/*
>> +	 * If no bytes were started, return the error, and let the
>> +	 * generic layer handle the completion.
>> +	 */
>> +	if (requested_bytes == 0)
>> +		return result < 0 ? result : -EIO;
>> +
>> 	if (put_dreq(dreq))
>> 		nfs_direct_write_complete(dreq, dreq->inode);
>> -
>> -	if (requested_bytes != 0)
>> -		return 0;
>> -
>> -	if (result < 0)
>> -		return result;
>> -	return -EIO;
>> +	return 0;
>> }
>> 
>> static ssize_t nfs_direct_write(struct kiocb *iocb, const struct iovec *iov,
>> 
> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer
> 
> NetApp
> Trond.Myklebust@netapp.com
> www.netapp.com
> 

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com






* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:26   ` Nick Piggin
@ 2011-01-19 23:31     ` Trond Myklebust
  2011-01-19 23:37       ` Nick Piggin
  0 siblings, 1 reply; 13+ messages in thread
From: Trond Myklebust @ 2011-01-19 23:31 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Chuck Lever, linux-fsdevel, linux-nfs, wen.gang.wang

On Thu, 2011-01-20 at 10:26 +1100, Nick Piggin wrote: 
> On Thu, Jan 20, 2011 at 10:25 AM, Trond Myklebust
> <Trond.Myklebust@netapp.com> wrote:
> > On Wed, 2011-01-19 at 17:36 -0500, Chuck Lever wrote:
> >> Nick Piggin reports:
> >>
> >> > I'm getting use after frees in aio code in NFS
> >> >
> >> > [ 2703.396766] Call Trace:
> >> > [ 2703.396858]  [<ffffffff8100b057>] ? native_sched_clock+0x27/0x80
> >> > [ 2703.396959]  [<ffffffff8108509e>] ? put_lock_stats+0xe/0x40
> >> > [ 2703.397058]  [<ffffffff81088348>] ? lock_release_holdtime+0xa8/0x140
> >> > [ 2703.397159]  [<ffffffff8108a2a5>] lock_acquire+0x95/0x1b0
> >> > [ 2703.397260]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> >> > [ 2703.397361]  [<ffffffff81039701>] ? get_parent_ip+0x11/0x50
> >> > [ 2703.397464]  [<ffffffff81612a31>] _raw_spin_lock_irq+0x41/0x80
> >> > [ 2703.397564]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> >> > [ 2703.397662]  [<ffffffff811627db>] aio_put_req+0x2b/0x60
> >> > [ 2703.397761]  [<ffffffff811647fe>] do_io_submit+0x2be/0x7c0
> >> > [ 2703.397895]  [<ffffffff81164d0b>] sys_io_submit+0xb/0x10
> >> > [ 2703.397995]  [<ffffffff8100307b>] system_call_fastpath+0x16/0x1b
> >> >
> >> > Adding some tracing, it is due to nfs completing the request then
> >> > returning something other than -EIOCBQUEUED, so aio.c
> >> > also completes the request.
> >>
> >> To address this, prevent the NFS direct I/O engine from completing
> >> async iocbs when the forward path returns an error other than
> >> EIOCBQUEUED.
> >>
> >> This appears to survive ^C during both "xfstest no. 208" and "fsx -Z."
> >>
> >> Cc: Stable <stable@kernel.org>
> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >> ---
> >>
> >> Here's my take.
> >>
> >>  fs/nfs/direct.c |   32 +++++++++++++++++---------------
> >>  1 files changed, 17 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
> >> index e6ace0d..c2176f4 100644
> >> --- a/fs/nfs/direct.c
> >> +++ b/fs/nfs/direct.c
> >> @@ -407,15 +407,16 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
> >>               pos += vec->iov_len;
> >>       }
> >>
> >> -     if (put_dreq(dreq))
> >> -             nfs_direct_complete(dreq);
> >> -
> >> -     if (requested_bytes != 0)
> >> -             return 0;
> >> +     /*
> >> +      * If no bytes were started, return the error, and let the
> >> +      * generic layer handle the completion.
> >> +      */
> >> +     if (requested_bytes == 0)
> >> +             return result < 0 ? result : -EIO;
> >>
> >> -     if (result < 0)
> >> -             return result;
> >> -     return -EIO;
> >> +     if (put_dreq(dreq))
> >> +             nfs_direct_write_complete(dreq, dreq->inode);
> >                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > In nfs_direct_read_schedule_iovec()? Shouldn't that be
> >                nfs_direct_complete(dreq);
> >
> > Also, why is EIO the correct reply when no bytes were read/written? Why
> > shouldn't the VFS aio code be able to cope with a zero byte reply?
> 
> What would it do?

Just return that zero byte reply to userland.

zero bytes is a valid reply for ordinary read() and write(), so why
should we have to do anything different for aio_read()/aio_write()?

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com



* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:31     ` Trond Myklebust
@ 2011-01-19 23:37       ` Nick Piggin
  2011-01-19 23:39         ` Chuck Lever
  2011-01-19 23:49         ` Trond Myklebust
  0 siblings, 2 replies; 13+ messages in thread
From: Nick Piggin @ 2011-01-19 23:37 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Chuck Lever, linux-fsdevel, linux-nfs, wen.gang.wang

On Thu, Jan 20, 2011 at 10:31 AM, Trond Myklebust
<Trond.Myklebust@netapp.com> wrote:
> On Thu, 2011-01-20 at 10:26 +1100, Nick Piggin wrote:
>> On Thu, Jan 20, 2011 at 10:25 AM, Trond Myklebust

>> > Also, why is EIO the correct reply when no bytes were read/written? Why
>> > shouldn't the VFS aio code be able to cope with a zero byte reply?
>>
>> What would it do?
>
> Just return that zero byte reply to userland.
>
> zero bytes is a valid reply for ordinary read() and write(), so why
> should we have to do anything different for aio_read()/aio_write()?

It doesn't give userspace much to do. zero reply from read means
EOF. Zero reply from write is pretty useless, I don't think we do it
in the buffered write path -- we either ensure we write at least
something or have a meaningful error to return.


* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:37       ` Nick Piggin
@ 2011-01-19 23:39         ` Chuck Lever
  2011-01-19 23:42           ` Nick Piggin
  2011-01-19 23:49         ` Trond Myklebust
  1 sibling, 1 reply; 13+ messages in thread
From: Chuck Lever @ 2011-01-19 23:39 UTC (permalink / raw)
  To: Nick Piggin, Trond Myklebust
  Cc: linux-fsdevel, Linux NFS Mailing List, Wengang Wang


On Jan 19, 2011, at 6:37 PM, Nick Piggin wrote:

> On Thu, Jan 20, 2011 at 10:31 AM, Trond Myklebust
> <Trond.Myklebust@netapp.com> wrote:
>> On Thu, 2011-01-20 at 10:26 +1100, Nick Piggin wrote:
>>> On Thu, Jan 20, 2011 at 10:25 AM, Trond Myklebust
> 
>>>> Also, why is EIO the correct reply when no bytes were read/written? Why
>>>> shouldn't the VFS aio code be able to cope with a zero byte reply?
>>> 
>>> What would it do?
>> 
>> Just return that zero byte reply to userland.
>> 
>> zero bytes is a valid reply for ordinary read() and write(), so why
>> should we have to do anything different for aio_read()/aio_write()?
> 
> It doesn't give userspace much to do. zero reply from read means
> EOF. Zero reply from write is pretty useless, I don't think we do it
> in the buffered write path -- we either ensure we write at least
> something or have a meaningful error to return.

I think in this case, the zero-length requests are already shunted off.  No zero-length requests make it down here, IIRC.  So we expect that either some bytes are started, or an error occurs.  If zero bytes were started and no error occurred, that's just... wrong.

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com






* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:39         ` Chuck Lever
@ 2011-01-19 23:42           ` Nick Piggin
  2011-01-19 23:56             ` Trond Myklebust
  0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2011-01-19 23:42 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Trond Myklebust, linux-fsdevel, Linux NFS Mailing List,
	Wengang Wang

On Thu, Jan 20, 2011 at 10:39 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On Jan 19, 2011, at 6:37 PM, Nick Piggin wrote:
>
>> On Thu, Jan 20, 2011 at 10:31 AM, Trond Myklebust
>> <Trond.Myklebust@netapp.com> wrote:
>>> On Thu, 2011-01-20 at 10:26 +1100, Nick Piggin wrote:
>>>> On Thu, Jan 20, 2011 at 10:25 AM, Trond Myklebust
>>
>>>>> Also, why is EIO the correct reply when no bytes were read/written? Why
>>>>> shouldn't the VFS aio code be able to cope with a zero byte reply?
>>>>
>>>> What would it do?
>>>
>>> Just return that zero byte reply to userland.
>>>
>>> zero bytes is a valid reply for ordinary read() and write(), so why
>>> should we have to do anything different for aio_read()/aio_write()?
>>
>> It doesn't give userspace much to do. zero reply from read means
>> EOF. Zero reply from write is pretty useless, I don't think we do it
>> in the buffered write path -- we either ensure we write at least
>> something or have a meaningful error to return.
>
> I think in this case, the zero-length requests are already shunted off.  No zero-length requests make it down here, IIRC.  So we expect that either some bytes are started, or an error occurs.  If zero bytes were started and no error occurred, that's just... wrong.

Right. So in this case you should have an error to return.


* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:37       ` Nick Piggin
  2011-01-19 23:39         ` Chuck Lever
@ 2011-01-19 23:49         ` Trond Myklebust
  2011-01-19 23:50           ` Nick Piggin
  2011-01-19 23:57           ` Chuck Lever
  1 sibling, 2 replies; 13+ messages in thread
From: Trond Myklebust @ 2011-01-19 23:49 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Chuck Lever, linux-fsdevel, linux-nfs, wen.gang.wang

On Thu, 2011-01-20 at 10:37 +1100, Nick Piggin wrote: 
> On Thu, Jan 20, 2011 at 10:31 AM, Trond Myklebust
> <Trond.Myklebust@netapp.com> wrote:
> > On Thu, 2011-01-20 at 10:26 +1100, Nick Piggin wrote:
> >> On Thu, Jan 20, 2011 at 10:25 AM, Trond Myklebust
> 
> >> > Also, why is EIO the correct reply when no bytes were read/written? Why
> >> > shouldn't the VFS aio code be able to cope with a zero byte reply?
> >>
> >> What would it do?
> >
> > Just return that zero byte reply to userland.
> >
> > zero bytes is a valid reply for ordinary read() and write(), so why
> > should we have to do anything different for aio_read()/aio_write()?
> 
> It doesn't give userspace much to do. zero reply from read means
> EOF. Zero reply from write is pretty useless, I don't think we do it
> in the buffered write path -- we either ensure we write at least
> something or have a meaningful error to return.

zero reply from read means EOF _or_ user supplied a zero length buffer.

zero reply from write may also be useless, but it is a valid value. It
can simply mean the user supplied a zero length buffer.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com



* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:49         ` Trond Myklebust
@ 2011-01-19 23:50           ` Nick Piggin
  2011-01-19 23:57           ` Chuck Lever
  1 sibling, 0 replies; 13+ messages in thread
From: Nick Piggin @ 2011-01-19 23:50 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Chuck Lever, linux-fsdevel, linux-nfs, wen.gang.wang

On Thu, Jan 20, 2011 at 10:49 AM, Trond Myklebust
<Trond.Myklebust@netapp.com> wrote:
> On Thu, 2011-01-20 at 10:37 +1100, Nick Piggin wrote:
>> On Thu, Jan 20, 2011 at 10:31 AM, Trond Myklebust
>> <Trond.Myklebust@netapp.com> wrote:
>> > On Thu, 2011-01-20 at 10:26 +1100, Nick Piggin wrote:
>> >> On Thu, Jan 20, 2011 at 10:25 AM, Trond Myklebust
>>
>> >> > Also, why is EIO the correct reply when no bytes were read/written? Why
>> >> > shouldn't the VFS aio code be able to cope with a zero byte reply?
>> >>
>> >> What would it do?
>> >
>> > Just return that zero byte reply to userland.
>> >
>> > zero bytes is a valid reply for ordinary read() and write(), so why
>> > should we have to do anything different for aio_read()/aio_write()?
>>
>> It doesn't give userspace much to do. zero reply from read means
>> EOF. Zero reply from write is pretty useless, I don't think we do it
>> in the buffered write path -- we either ensure we write at least
>> something or have a meaningful error to return.
>
> zero reply from read means EOF _or_ user supplied a zero length buffer.
>
> zero reply from write may also be useless, but it is a valid value. It
> can simply mean the user supplied a zero length buffer.

OK, yes. I'm ignoring the zero-length request case.


* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:42           ` Nick Piggin
@ 2011-01-19 23:56             ` Trond Myklebust
  0 siblings, 0 replies; 13+ messages in thread
From: Trond Myklebust @ 2011-01-19 23:56 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Chuck Lever, linux-fsdevel, Linux NFS Mailing List, Wengang Wang

On Thu, 2011-01-20 at 10:42 +1100, Nick Piggin wrote: 
> On Thu, Jan 20, 2011 at 10:39 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> > On Jan 19, 2011, at 6:37 PM, Nick Piggin wrote:
> >
> >> On Thu, Jan 20, 2011 at 10:31 AM, Trond Myklebust
> >> <Trond.Myklebust@netapp.com> wrote:
> >>> On Thu, 2011-01-20 at 10:26 +1100, Nick Piggin wrote:
> >>>> On Thu, Jan 20, 2011 at 10:25 AM, Trond Myklebust
> >>
> >>>>> Also, why is EIO the correct reply when no bytes were read/written? Why
> >>>>> shouldn't the VFS aio code be able to cope with a zero byte reply?
> >>>>
> >>>> What would it do?
> >>>
> >>> Just return that zero byte reply to userland.
> >>>
> >>> zero bytes is a valid reply for ordinary read() and write(), so why
> >>> should we have to do anything different for aio_read()/aio_write()?
> >>
> >> It doesn't give userspace much to do. zero reply from read means
> >> EOF. Zero reply from write is pretty useless, I don't think we do it
> >> in the buffered write path -- we either ensure we write at least
> >> something or have a meaningful error to return.
> >
> > I think in this case, the zero-length requests are already shunted off.  No zero-length requests make it down here, IIRC.  So we expect that either some bytes are started, or an error occurs.  If zero bytes were started and no error occurred, that's just... wrong.
> 
> Right. So in this case you should have an error to return.

So, which error??? What is going wrong that ensures we don't start any
I/O? Is it the pages that are failing to fault in, or is it the RPC
call that is failing to start (and if so, why)?

I don't like returning EIO when a more specific (and useful) error
exists.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com



* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 23:49         ` Trond Myklebust
  2011-01-19 23:50           ` Nick Piggin
@ 2011-01-19 23:57           ` Chuck Lever
  1 sibling, 0 replies; 13+ messages in thread
From: Chuck Lever @ 2011-01-19 23:57 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Nick Piggin, linux-fsdevel, linux-nfs, wen.gang.wang


On Jan 19, 2011, at 6:49 PM, Trond Myklebust wrote:

> On Thu, 2011-01-20 at 10:37 +1100, Nick Piggin wrote: 
>> On Thu, Jan 20, 2011 at 10:31 AM, Trond Myklebust
>> <Trond.Myklebust@netapp.com> wrote:
>>> On Thu, 2011-01-20 at 10:26 +1100, Nick Piggin wrote:
>>>> On Thu, Jan 20, 2011 at 10:25 AM, Trond Myklebust
>> 
>>>>> Also, why is EIO the correct reply when no bytes were read/written? Why
>>>>> shouldn't the VFS aio code be able to cope with a zero byte reply?
>>>> 
>>>> What would it do?
>>> 
>>> Just return that zero byte reply to userland.
>>> 
>>> zero bytes is a valid reply for ordinary read() and write(), so why
>>> should we have to do anything different for aio_read()/aio_write()?
>> 
>> It doesn't give userspace much to do. zero reply from read means
>> EOF. Zero reply from write is pretty useless, I don't think we do it
>> in the buffered write path -- we either ensure we write at least
>> something or have a meaningful error to return.
> 
> zero reply from read means EOF _or_ user supplied a zero length buffer.
> 
> zero reply from write may also be useless, but it is a valid value. It
> can simply mean the user supplied a zero length buffer.

This code is in the forward path.  So, here we are just dealing with starting the I/O.  If the server replies with a short read or write, that's handled in the callbacks, not here.

So, -EIO is appropriate if we couldn't even start the I/O, right?

Anyway, this is copied from the existing code, not something new with this patch.  If we want to change this part of the logic, maybe it should be part of a different patch?

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com






* Re: [PATCH] NFS: Fix "BUG at fs/aio.c:554!"
  2011-01-19 22:36 [PATCH] NFS: Fix "BUG at fs/aio.c:554!" Chuck Lever
  2011-01-19 23:25 ` Trond Myklebust
@ 2011-01-21  2:42 ` Wengang Wang
  1 sibling, 0 replies; 13+ messages in thread
From: Wengang Wang @ 2011-01-21  2:42 UTC (permalink / raw)
  To: Chuck Lever; +Cc: npiggin, linux-fsdevel, linux-nfs, wen.gang.wang

On 11-01-19 17:36, Chuck Lever wrote:
> Nick Piggin reports:
> 
> > I'm getting use after frees in aio code in NFS
> >
> > [ 2703.396766] Call Trace:
> > [ 2703.396858]  [<ffffffff8100b057>] ? native_sched_clock+0x27/0x80
> > [ 2703.396959]  [<ffffffff8108509e>] ? put_lock_stats+0xe/0x40
> > [ 2703.397058]  [<ffffffff81088348>] ? lock_release_holdtime+0xa8/0x140
> > [ 2703.397159]  [<ffffffff8108a2a5>] lock_acquire+0x95/0x1b0
> > [ 2703.397260]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> > [ 2703.397361]  [<ffffffff81039701>] ? get_parent_ip+0x11/0x50
> > [ 2703.397464]  [<ffffffff81612a31>] _raw_spin_lock_irq+0x41/0x80
> > [ 2703.397564]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> > [ 2703.397662]  [<ffffffff811627db>] aio_put_req+0x2b/0x60
> > [ 2703.397761]  [<ffffffff811647fe>] do_io_submit+0x2be/0x7c0
> > [ 2703.397895]  [<ffffffff81164d0b>] sys_io_submit+0xb/0x10
> > [ 2703.397995]  [<ffffffff8100307b>] system_call_fastpath+0x16/0x1b
> >
> > Adding some tracing, it is due to nfs completing the request then
> > returning something other than -EIOCBQUEUED, so aio.c
> > also completes the request.
> 
> To address this, prevent the NFS direct I/O engine from completing
> async iocbs when the forward path returns an error other than
> EIOCBQUEUED.
> 
> This appears to survive ^C during both "xfstest no. 208" and "fsx -Z."
> 
> Cc: Stable <stable@kernel.org>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> 
> Here's my take.
> 
>  fs/nfs/direct.c |   32 +++++++++++++++++---------------
>  1 files changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
> index e6ace0d..c2176f4 100644
> --- a/fs/nfs/direct.c
> +++ b/fs/nfs/direct.c
> @@ -407,15 +407,16 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
>  		pos += vec->iov_len;
>  	}
>  
> -	if (put_dreq(dreq))
> -		nfs_direct_complete(dreq);
> -
> -	if (requested_bytes != 0)
> -		return 0;
> +	/*
> +	 * If no bytes were started, return the error, and let the
> +	 * generic layer handle the completion.
> +	 */

But the put_dreq() -> nfs_direct_complete() path does more than
complete the aio. It also drops a reference on dreq and does

complete_all(&dreq->completion);
nfs_direct_req_release(dreq);

I think we still need that called somewhere.
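
For illustration only, a minimal untested sketch of one way the
early-error return could still drop that reference (hypothetical, not
something settled on in this thread):

	if (requested_bytes == 0) {
		/* untested sketch: drop the submission path's reference
		 * before bailing out, instead of leaking the dreq */
		nfs_direct_req_release(dreq);
		return result < 0 ? result : -EIO;
	}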

regards,
wengang.

> +	if (requested_bytes == 0)
> +		return result < 0 ? result : -EIO;
>  
> -	if (result < 0)
> -		return result;
> -	return -EIO;
> +	if (put_dreq(dreq))
> +		nfs_direct_write_complete(dreq, dreq->inode);
> +	return 0;
>  }
>  
>  static ssize_t nfs_direct_read(struct kiocb *iocb, const struct iovec *iov,
> @@ -841,15 +842,16 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
>  		pos += vec->iov_len;
>  	}
>  
> +	/*
> +	 * If no bytes were started, return the error, and let the
> +	 * generic layer handle the completion.
> +	 */
> +	if (requested_bytes == 0)
> +		return result < 0 ? result : -EIO;
> +
>  	if (put_dreq(dreq))
>  		nfs_direct_write_complete(dreq, dreq->inode);
> -
> -	if (requested_bytes != 0)
> -		return 0;
> -
> -	if (result < 0)
> -		return result;
> -	return -EIO;
> +	return 0;
>  }
>  
>  static ssize_t nfs_direct_write(struct kiocb *iocb, const struct iovec *iov,
> 


Thread overview: 13+ messages
2011-01-19 22:36 [PATCH] NFS: Fix "BUG at fs/aio.c:554!" Chuck Lever
2011-01-19 23:25 ` Trond Myklebust
2011-01-19 23:26   ` Nick Piggin
2011-01-19 23:31     ` Trond Myklebust
2011-01-19 23:37       ` Nick Piggin
2011-01-19 23:39         ` Chuck Lever
2011-01-19 23:42           ` Nick Piggin
2011-01-19 23:56             ` Trond Myklebust
2011-01-19 23:49         ` Trond Myklebust
2011-01-19 23:50           ` Nick Piggin
2011-01-19 23:57           ` Chuck Lever
2011-01-19 23:31   ` Chuck Lever
2011-01-21  2:42 ` Wengang Wang
