All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] aio: Fix return code of io_submit() (RFC)
@ 2014-10-03 18:08 Kent Overstreet
  2014-10-03 18:13 ` Jens Axboe
  2014-10-03 18:19 ` Benjamin LaHaise
  0 siblings, 2 replies; 9+ messages in thread
From: Kent Overstreet @ 2014-10-03 18:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kent Overstreet, Benjamin LaHaise, Zach Brown, Jeff Moyer,
	Jens Axboe, Slava Pestov

io_submit() could return -EAGAIN on memory allocation failure when it should
really have been returning -ENOMEM. This could confuse applications (i.e. fio)
since -EAGAIN means "too many requests outstanding, wait until completions have
been reaped" and if the application actually was tracking outstanding
completions this wouldn't make a lot of sense.

NOTE:

the man page seems to imply that the current behaviour (-EAGAIN on allocation
failure) has always been the case. I don't think it makes a lot of sense, but
this should probably be discussed more widely in case applications have somehow
come to rely on the current behaviour...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Zach Brown <zab@zabbo.net>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Slava Pestov <sp@daterainc.com>
---
 fs/aio.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 733750096b..556547044b 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -933,23 +933,14 @@ static inline struct kiocb *aio_get_req(struct kioctx *ctx)
 {
 	struct kiocb *req;
 
-	if (!get_reqs_available(ctx)) {
-		user_refill_reqs_available(ctx);
-		if (!get_reqs_available(ctx))
-			return NULL;
-	}
-
 	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
 	if (unlikely(!req))
-		goto out_put;
+		return NULL;
 
 	percpu_ref_get(&ctx->reqs);
 
 	req->ki_ctx = ctx;
 	return req;
-out_put:
-	put_reqs_available(ctx, 1);
-	return NULL;
 }
 
 static void kiocb_free(struct kiocb *req)
@@ -1489,9 +1480,17 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		return -EINVAL;
 	}
 
+	if (!get_reqs_available(ctx)) {
+		user_refill_reqs_available(ctx);
+		if (!get_reqs_available(ctx))
+			return -EAGAIN;
+	}
+
 	req = aio_get_req(ctx);
-	if (unlikely(!req))
-		return -EAGAIN;
+	if (unlikely(!req)) {
+		ret = -ENOMEM;
+		goto out_put;
+	}
 
 	req->ki_filp = fget(iocb->aio_fildes);
 	if (unlikely(!req->ki_filp)) {
@@ -1533,9 +1532,10 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 
 	return 0;
 out_put_req:
-	put_reqs_available(ctx, 1);
 	percpu_ref_put(&ctx->reqs);
 	kiocb_free(req);
+out_put:
+	put_reqs_available(ctx, 1);
 	return ret;
 }
 
-- 
2.1.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] aio: Fix return code of io_submit() (RFC)
  2014-10-03 18:08 [PATCH] aio: Fix return code of io_submit() (RFC) Kent Overstreet
@ 2014-10-03 18:13 ` Jens Axboe
  2014-10-03 18:21   ` Kent Overstreet
  2014-10-03 18:22   ` Benjamin LaHaise
  2014-10-03 18:19 ` Benjamin LaHaise
  1 sibling, 2 replies; 9+ messages in thread
From: Jens Axboe @ 2014-10-03 18:13 UTC (permalink / raw)
  To: Kent Overstreet, linux-kernel
  Cc: Benjamin LaHaise, Zach Brown, Jeff Moyer, Slava Pestov

On 2014-10-03 12:08, Kent Overstreet wrote:
> io_submit() could return -EAGAIN on memory allocation failure when it should
> really have been returning -ENOMEM. This could confuse applications (i.e. fio)
> since -EAGAIN means "too many requests outstanding, wait until completions have
> been reaped" and if the application actually was tracking outstanding
> completions this wouldn't make a lot of sense.
>
> NOTE:
>
> the man page seems to imply that the current behaviour (-EAGAIN on allocation
> failure) has always been the case. I don't think it makes a lot of sense, but
> this should probably be discussed more widely in case applications have somehow
> come to rely on the current behaviour...

We can't really feasibly fix this, is my worry. Fio does track the 
pending requests and does not get into a getevents() forever wait if it 
gets -EAGAIN on submission. But before the fix, it would loop forever in 
submission in -EAGAIN.

How are applications supposed to deal with ENOMEM? I think the answer 
here is that they can't, it would be a fatal condition. AIO must provide 
isn't own guarantee of progress, with a mempool or similar.


-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] aio: Fix return code of io_submit() (RFC)
  2014-10-03 18:08 [PATCH] aio: Fix return code of io_submit() (RFC) Kent Overstreet
  2014-10-03 18:13 ` Jens Axboe
@ 2014-10-03 18:19 ` Benjamin LaHaise
  1 sibling, 0 replies; 9+ messages in thread
From: Benjamin LaHaise @ 2014-10-03 18:19 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-kernel, Zach Brown, Jeff Moyer, Jens Axboe, Slava Pestov

Hi Kent,

On Fri, Oct 03, 2014 at 11:08:13AM -0700, Kent Overstreet wrote:
> io_submit() could return -EAGAIN on memory allocation failure when it should
> really have been returning -ENOMEM. This could confuse applications (i.e. fio)
> since -EAGAIN means "too many requests outstanding, wait until completions have
> been reaped" and if the application actually was tracking outstanding
> completions this wouldn't make a lot of sense.

Wouldn't it be simpler to just return an ERR_PTR with the appropriate 
return code rather than move all that code around?

		-ben

> NOTE:
> 
> the man page seems to imply that the current behaviour (-EAGAIN on allocation
> failure) has always been the case. I don't think it makes a lot of sense, but
> this should probably be discussed more widely in case applications have somehow
> come to rely on the current behaviour...
> 
> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
> Cc: Benjamin LaHaise <bcrl@kvack.org>
> Cc: Zach Brown <zab@zabbo.net>
> Cc: Jeff Moyer <jmoyer@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Slava Pestov <sp@daterainc.com>
> ---
>  fs/aio.c | 26 +++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index 733750096b..556547044b 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -933,23 +933,14 @@ static inline struct kiocb *aio_get_req(struct kioctx *ctx)
>  {
>  	struct kiocb *req;
>  
> -	if (!get_reqs_available(ctx)) {
> -		user_refill_reqs_available(ctx);
> -		if (!get_reqs_available(ctx))
> -			return NULL;
> -	}
> -
>  	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
>  	if (unlikely(!req))
> -		goto out_put;
> +		return NULL;
>  
>  	percpu_ref_get(&ctx->reqs);
>  
>  	req->ki_ctx = ctx;
>  	return req;
> -out_put:
> -	put_reqs_available(ctx, 1);
> -	return NULL;
>  }
>  
>  static void kiocb_free(struct kiocb *req)
> @@ -1489,9 +1480,17 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
>  		return -EINVAL;
>  	}
>  
> +	if (!get_reqs_available(ctx)) {
> +		user_refill_reqs_available(ctx);
> +		if (!get_reqs_available(ctx))
> +			return -EAGAIN;
> +	}
> +
>  	req = aio_get_req(ctx);
> -	if (unlikely(!req))
> -		return -EAGAIN;
> +	if (unlikely(!req)) {
> +		ret = -ENOMEM;
> +		goto out_put;
> +	}
>  
>  	req->ki_filp = fget(iocb->aio_fildes);
>  	if (unlikely(!req->ki_filp)) {
> @@ -1533,9 +1532,10 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
>  
>  	return 0;
>  out_put_req:
> -	put_reqs_available(ctx, 1);
>  	percpu_ref_put(&ctx->reqs);
>  	kiocb_free(req);
> +out_put:
> +	put_reqs_available(ctx, 1);
>  	return ret;
>  }
>  
> -- 
> 2.1.1

-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] aio: Fix return code of io_submit() (RFC)
  2014-10-03 18:13 ` Jens Axboe
@ 2014-10-03 18:21   ` Kent Overstreet
  2014-10-03 18:25     ` Jens Axboe
  2014-10-03 18:22   ` Benjamin LaHaise
  1 sibling, 1 reply; 9+ messages in thread
From: Kent Overstreet @ 2014-10-03 18:21 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-kernel, Benjamin LaHaise, Zach Brown, Jeff Moyer,
	Slava Pestov

On Fri, Oct 03, 2014 at 12:13:39PM -0600, Jens Axboe wrote:
> On 2014-10-03 12:08, Kent Overstreet wrote:
> >io_submit() could return -EAGAIN on memory allocation failure when it should
> >really have been returning -ENOMEM. This could confuse applications (i.e. fio)
> >since -EAGAIN means "too many requests outstanding, wait until completions have
> >been reaped" and if the application actually was tracking outstanding
> >completions this wouldn't make a lot of sense.
> >
> >NOTE:
> >
> >the man page seems to imply that the current behaviour (-EAGAIN on allocation
> >failure) has always been the case. I don't think it makes a lot of sense, but
> >this should probably be discussed more widely in case applications have somehow
> >come to rely on the current behaviour...
> 
> We can't really feasibly fix this, is my worry. Fio does track the pending
> requests and does not get into a getevents() forever wait if it gets -EAGAIN
> on submission. But before the fix, it would loop forever in submission in
> -EAGAIN.
> 
> How are applications supposed to deal with ENOMEM? I think the answer here
> is that they can't, it would be a fatal condition. AIO must provide isn't
> own guarantee of progress, with a mempool or similar.

Well, even though the AIO code doesn't currently return -ENOMEM we definitely do
have random other driver/filesystem code that will return -ENOMEM if a random
GFP_KERNEL allocation fails (e.g. the dio code, if allocating a struct dio
fails). So I think there's precedent for this, and having it be a fatal error
when the system is under major memory pressure is not a crazy thing to do too.

But OTOH maybe we should just use a mempool there.

The argument against making it a mempool would be "we don't want io_submit() to
block; even if that's not the case today, we at least have a chance of fixing it
with the current setup. If we can't allocate memory for our asynchronous state,
we really can't do anything there except block or fail".

I'm not sure I have strong feelings one way or the other.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] aio: Fix return code of io_submit() (RFC)
  2014-10-03 18:13 ` Jens Axboe
  2014-10-03 18:21   ` Kent Overstreet
@ 2014-10-03 18:22   ` Benjamin LaHaise
  2014-10-03 18:31     ` Kent Overstreet
  2014-10-03 18:36     ` Jens Axboe
  1 sibling, 2 replies; 9+ messages in thread
From: Benjamin LaHaise @ 2014-10-03 18:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Kent Overstreet, linux-kernel, Zach Brown, Jeff Moyer,
	Slava Pestov

On Fri, Oct 03, 2014 at 12:13:39PM -0600, Jens Axboe wrote:
> On 2014-10-03 12:08, Kent Overstreet wrote:
> >io_submit() could return -EAGAIN on memory allocation failure when it 
> >should
> >really have been returning -ENOMEM. This could confuse applications (i.e. 
> >fio)
> >since -EAGAIN means "too many requests outstanding, wait until completions 
> >have
> >been reaped" and if the application actually was tracking outstanding
> >completions this wouldn't make a lot of sense.
> >
> >NOTE:
> >
> >the man page seems to imply that the current behaviour (-EAGAIN on 
> >allocation
> >failure) has always been the case. I don't think it makes a lot of sense, 
> >but
> >this should probably be discussed more widely in case applications have 
> >somehow
> >come to rely on the current behaviour...
> 
> We can't really feasibly fix this, is my worry. Fio does track the 
> pending requests and does not get into a getevents() forever wait if it 
> gets -EAGAIN on submission. But before the fix, it would loop forever in 
> submission in -EAGAIN.

There are lots of instances in the kernel where out of memory is potentially 
exposed to the user.  If we're failing a memory allocation that is well under 
1KB, the system is probably completely hosed.

> How are applications supposed to deal with ENOMEM? I think the answer 
> here is that they can't, it would be a fatal condition. AIO must provide 
> isn't own guarantee of progress, with a mempool or similar.

I'm not sure if using a mempool is appropriate for allocations that are 
driven by userland code.  At least with an ENOMEM error, an application 
could free up some of the memory it allocated and possibly recover the 
system.

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] aio: Fix return code of io_submit() (RFC)
  2014-10-03 18:21   ` Kent Overstreet
@ 2014-10-03 18:25     ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2014-10-03 18:25 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-kernel, Benjamin LaHaise, Zach Brown, Jeff Moyer,
	Slava Pestov

On 2014-10-03 12:21, Kent Overstreet wrote:
> On Fri, Oct 03, 2014 at 12:13:39PM -0600, Jens Axboe wrote:
>> On 2014-10-03 12:08, Kent Overstreet wrote:
>>> io_submit() could return -EAGAIN on memory allocation failure when it should
>>> really have been returning -ENOMEM. This could confuse applications (i.e. fio)
>>> since -EAGAIN means "too many requests outstanding, wait until completions have
>>> been reaped" and if the application actually was tracking outstanding
>>> completions this wouldn't make a lot of sense.
>>>
>>> NOTE:
>>>
>>> the man page seems to imply that the current behaviour (-EAGAIN on allocation
>>> failure) has always been the case. I don't think it makes a lot of sense, but
>>> this should probably be discussed more widely in case applications have somehow
>>> come to rely on the current behaviour...
>>
>> We can't really feasibly fix this, is my worry. Fio does track the pending
>> requests and does not get into a getevents() forever wait if it gets -EAGAIN
>> on submission. But before the fix, it would loop forever in submission in
>> -EAGAIN.
>>
>> How are applications supposed to deal with ENOMEM? I think the answer here
>> is that they can't, it would be a fatal condition. AIO must provide isn't
>> own guarantee of progress, with a mempool or similar.
>
> Well, even though the AIO code doesn't currently return -ENOMEM we definitely do
> have random other driver/filesystem code that will return -ENOMEM if a random
> GFP_KERNEL allocation fails (e.g. the dio code, if allocating a struct dio
> fails). So I think there's precedent for this, and having it be a fatal error
> when the system is under major memory pressure is not a crazy thing to do too.
>
> But OTOH maybe we should just use a mempool there.
>
> The argument against making it a mempool would be "we don't want io_submit() to
> block; even if that's not the case today, we at least have a chance of fixing it
> with the current setup. If we can't allocate memory for our asynchronous state,
> we really can't do anything there except block or fail".

It'll block anyway in other places, if we run out of resources there. 
But good point on the other potential -ENOMEM cases, it's not a new 
condition (potentially).

> I'm not sure I have strong feelings one way or the other.

Me neither...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] aio: Fix return code of io_submit() (RFC)
  2014-10-03 18:22   ` Benjamin LaHaise
@ 2014-10-03 18:31     ` Kent Overstreet
  2014-10-03 18:39       ` Jens Axboe
  2014-10-03 18:36     ` Jens Axboe
  1 sibling, 1 reply; 9+ messages in thread
From: Kent Overstreet @ 2014-10-03 18:31 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Jens Axboe, linux-kernel, Zach Brown, Jeff Moyer, Slava Pestov

On Fri, Oct 03, 2014 at 02:22:20PM -0400, Benjamin LaHaise wrote:
> On Fri, Oct 03, 2014 at 12:13:39PM -0600, Jens Axboe wrote:
> > How are applications supposed to deal with ENOMEM? I think the answer 
> > here is that they can't, it would be a fatal condition. AIO must provide 
> > isn't own guarantee of progress, with a mempool or similar.
> 
> I'm not sure if using a mempool is appropriate for allocations that are 
> driven by userland code.  At least with an ENOMEM error, an application 
> could free up some of the memory it allocated and possibly recover the 
> system.

I guess it's going to depend on the application... some applications really want
to always make forward progress (much like a lot of code in the kernel), so
they're going to want the mempool semantics and we in the kernel are in a much
better position to implement that correctly (think of all the applications that
are just going to sleep and retry on -ENOMEM).

we kind of want another flag in the syscall args that's the moral equivalent of
MSG_DONTWAIT but for memory allocations; it'd translate into "mempool +
GFP_KERNEL, or GFP_NOWAIT".

not that I'm actually going to implement that :)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] aio: Fix return code of io_submit() (RFC)
  2014-10-03 18:22   ` Benjamin LaHaise
  2014-10-03 18:31     ` Kent Overstreet
@ 2014-10-03 18:36     ` Jens Axboe
  1 sibling, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2014-10-03 18:36 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Kent Overstreet, linux-kernel, Zach Brown, Jeff Moyer,
	Slava Pestov

On 2014-10-03 12:22, Benjamin LaHaise wrote:
> On Fri, Oct 03, 2014 at 12:13:39PM -0600, Jens Axboe wrote:
>> On 2014-10-03 12:08, Kent Overstreet wrote:
>>> io_submit() could return -EAGAIN on memory allocation failure when it
>>> should
>>> really have been returning -ENOMEM. This could confuse applications (i.e.
>>> fio)
>>> since -EAGAIN means "too many requests outstanding, wait until completions
>>> have
>>> been reaped" and if the application actually was tracking outstanding
>>> completions this wouldn't make a lot of sense.
>>>
>>> NOTE:
>>>
>>> the man page seems to imply that the current behaviour (-EAGAIN on
>>> allocation
>>> failure) has always been the case. I don't think it makes a lot of sense,
>>> but
>>> this should probably be discussed more widely in case applications have
>>> somehow
>>> come to rely on the current behaviour...
>>
>> We can't really feasibly fix this, is my worry. Fio does track the
>> pending requests and does not get into a getevents() forever wait if it
>> gets -EAGAIN on submission. But before the fix, it would loop forever in
>> submission in -EAGAIN.
>
> There are lots of instances in the kernel where out of memory is potentially
> exposed to the user.  If we're failing a memory allocation that is well under
> 1KB, the system is probably completely hosed.
>
>> How are applications supposed to deal with ENOMEM? I think the answer
>> here is that they can't, it would be a fatal condition. AIO must provide
>> isn't own guarantee of progress, with a mempool or similar.
>
> I'm not sure if using a mempool is appropriate for allocations that are
> driven by userland code.  At least with an ENOMEM error, an application
> could free up some of the memory it allocated and possibly recover the
> system.

Since fio just hit this, it has nothing it can potentially free to make 
progress possible. There was no pending IO, so all it can do is quit. 
But I do agree that if a small alloc like that fails, then we are 
probably pretty darn screwed anyway, and it doesn't matter that much 
what we do. My main concern was a potential change in the ABI, but since 
we could already return -ENOMEM from other cases, that is probably a 
moot point.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] aio: Fix return code of io_submit() (RFC)
  2014-10-03 18:31     ` Kent Overstreet
@ 2014-10-03 18:39       ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2014-10-03 18:39 UTC (permalink / raw)
  To: Kent Overstreet, Benjamin LaHaise
  Cc: linux-kernel, Zach Brown, Jeff Moyer, Slava Pestov

On 2014-10-03 12:31, Kent Overstreet wrote:
> On Fri, Oct 03, 2014 at 02:22:20PM -0400, Benjamin LaHaise wrote:
>> On Fri, Oct 03, 2014 at 12:13:39PM -0600, Jens Axboe wrote:
>>> How are applications supposed to deal with ENOMEM? I think the answer
>>> here is that they can't, it would be a fatal condition. AIO must provide
>>> isn't own guarantee of progress, with a mempool or similar.
>>
>> I'm not sure if using a mempool is appropriate for allocations that are
>> driven by userland code.  At least with an ENOMEM error, an application
>> could free up some of the memory it allocated and possibly recover the
>> system.
>
> I guess it's going to depend on the application... some applications really want
> to always make forward progress (much like a lot of code in the kernel), so
> they're going to want the mempool semantics and we in the kernel are in a much
> better position to implement that correctly (think of all the applications that
> are just going to sleep and retry on -ENOMEM).

Precisely, there's no real way to do that in the application. Especially 
if it has no pending IO it can just wait on, it'll be a sleep and retry 
thing

> we kind of want another flag in the syscall args that's the moral equivalent of
> MSG_DONTWAIT but for memory allocations; it'd translate into "mempool +
> GFP_KERNEL, or GFP_NOWAIT".

We do...

> not that I'm actually going to implement that :)

It's worth keeping in mind for if we do extend the API for some reason.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-10-03 18:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-03 18:08 [PATCH] aio: Fix return code of io_submit() (RFC) Kent Overstreet
2014-10-03 18:13 ` Jens Axboe
2014-10-03 18:21   ` Kent Overstreet
2014-10-03 18:25     ` Jens Axboe
2014-10-03 18:22   ` Benjamin LaHaise
2014-10-03 18:31     ` Kent Overstreet
2014-10-03 18:39       ` Jens Axboe
2014-10-03 18:36     ` Jens Axboe
2014-10-03 18:19 ` Benjamin LaHaise

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.