* [Qemu-devel] [PATCH] For AIO return -ENOSPC on short write
@ 2011-02-22 10:17 Jes.Sorensen
2011-02-22 10:18 ` Jes.Sorensen
0 siblings, 1 reply; 10+ messages in thread
From: Jes.Sorensen @ 2011-02-22 10:17 UTC (permalink / raw)
To: qemu-devel; +Cc: hch, kwolf, stefanha
From: Jes Sorensen <Jes.Sorensen@redhat.com>
Hi,
Current AIO code returns -EINVAL on every error, even on write. In
order to be able to report some more sensible error messages, I
suggest we change it to return -ENOSPC if we are failing on a write
request and we had a partial write. It matches more what one would
expect.
One way to reproduce the strange error return is this:
1) setup a loopback device for a file (say 2GB)
2) create an LVM volume over it
3) create a qcow2 image on top of it which is larger than the 2GB
4) try and install a guest.....
Comments? any reason why this is a bad idea?
Cheers,
Jes
Jes Sorensen (1):
For AIO return -ENOSPC on short write
linux-aio.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
--
1.7.4
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH] For AIO return -ENOSPC on short write
2011-02-22 10:17 [Qemu-devel] [PATCH] For AIO return -ENOSPC on short write Jes.Sorensen
@ 2011-02-22 10:18 ` Jes.Sorensen
2011-02-22 11:44 ` [Qemu-devel] " Kevin Wolf
2011-02-22 15:02 ` [Qemu-devel] " Stefan Hajnoczi
0 siblings, 2 replies; 10+ messages in thread
From: Jes.Sorensen @ 2011-02-22 10:18 UTC (permalink / raw)
To: qemu-devel; +Cc: hch, kwolf, stefanha
From: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
---
linux-aio.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/linux-aio.c b/linux-aio.c
index 68f4b3d..d9c0225 100644
--- a/linux-aio.c
+++ b/linux-aio.c
@@ -32,6 +32,7 @@ struct qemu_laiocb {
ssize_t ret;
size_t nbytes;
int async_context_id;
+ int type;
QLIST_ENTRY(qemu_laiocb) node;
};
@@ -62,6 +63,9 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
if (ret != -ECANCELED) {
if (ret == laiocb->nbytes)
ret = 0;
+ else if ((laiocb->type == QEMU_AIO_WRITE) && (ret >= 0) &&
+ (ret < laiocb->nbytes))
+ ret = -ENOSPC;
else if (ret >= 0)
ret = -EINVAL;
@@ -204,6 +208,7 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
laiocb->nbytes = nb_sectors * 512;
laiocb->ctx = s;
laiocb->ret = -EINPROGRESS;
+ laiocb->type = type;
laiocb->async_context_id = get_async_context_id();
iocbs = &laiocb->iocb;
--
1.7.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] Re: [PATCH] For AIO return -ENOSPC on short write
2011-02-22 10:18 ` Jes.Sorensen
@ 2011-02-22 11:44 ` Kevin Wolf
2011-02-22 11:45 ` Jes Sorensen
2011-02-22 15:02 ` [Qemu-devel] " Stefan Hajnoczi
1 sibling, 1 reply; 10+ messages in thread
From: Kevin Wolf @ 2011-02-22 11:44 UTC (permalink / raw)
To: Jes.Sorensen; +Cc: hch, qemu-devel, stefanha
Am 22.02.2011 11:18, schrieb Jes.Sorensen@redhat.com:
> From: Jes Sorensen <Jes.Sorensen@redhat.com>
>
> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
> ---
> linux-aio.c | 5 +++++
> 1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/linux-aio.c b/linux-aio.c
> index 68f4b3d..d9c0225 100644
> --- a/linux-aio.c
> +++ b/linux-aio.c
> @@ -32,6 +32,7 @@ struct qemu_laiocb {
> ssize_t ret;
> size_t nbytes;
> int async_context_id;
> + int type;
> QLIST_ENTRY(qemu_laiocb) node;
> };
>
> @@ -62,6 +63,9 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
> if (ret != -ECANCELED) {
> if (ret == laiocb->nbytes)
> ret = 0;
> + else if ((laiocb->type == QEMU_AIO_WRITE) && (ret >= 0) &&
> + (ret < laiocb->nbytes))
> + ret = -ENOSPC;
> else if (ret >= 0)
> ret = -EINVAL;
Isn't there a way to get the real error code instead of just making it
up more cleverly? Like retrying for the rest of the request?
Kevin
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Qemu-devel] Re: [PATCH] For AIO return -ENOSPC on short write
2011-02-22 11:44 ` [Qemu-devel] " Kevin Wolf
@ 2011-02-22 11:45 ` Jes Sorensen
2011-02-22 13:56 ` Avi Kivity
0 siblings, 1 reply; 10+ messages in thread
From: Jes Sorensen @ 2011-02-22 11:45 UTC (permalink / raw)
To: Kevin Wolf; +Cc: hch, qemu-devel, stefanha
On 02/22/11 12:44, Kevin Wolf wrote:
>> @@ -62,6 +63,9 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
>> if (ret != -ECANCELED) {
>> if (ret == laiocb->nbytes)
>> ret = 0;
>> + else if ((laiocb->type == QEMU_AIO_WRITE) && (ret >= 0) &&
>> + (ret < laiocb->nbytes))
>> + ret = -ENOSPC;
>> else if (ret >= 0)
>> ret = -EINVAL;
>
> Isn't there a way to get the real error code instead of just making it
> up more cleverly? Like retrying for the rest of the request?
>
> Kevin
I guess we could retry the last part of the request, but if we already
have an error, it seems silly to try to rewrite the same stuff again
just to obtain the error code.
I looked through the aio calls and I didn't find any obvious way to
retrieve the error code, but maybe I missed something?
Cheers,
Jes
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] Re: [PATCH] For AIO return -ENOSPC on short write
2011-02-22 11:45 ` Jes Sorensen
@ 2011-02-22 13:56 ` Avi Kivity
0 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2011-02-22 13:56 UTC (permalink / raw)
To: Jes Sorensen; +Cc: Kevin Wolf, hch, qemu-devel, stefanha
On 02/22/2011 01:45 PM, Jes Sorensen wrote:
> On 02/22/11 12:44, Kevin Wolf wrote:
> >> @@ -62,6 +63,9 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
> >> if (ret != -ECANCELED) {
> >> if (ret == laiocb->nbytes)
> >> ret = 0;
> >> + else if ((laiocb->type == QEMU_AIO_WRITE)&& (ret>= 0)&&
> >> + (ret< laiocb->nbytes))
> >> + ret = -ENOSPC;
> >> else if (ret>= 0)
> >> ret = -EINVAL;
> >
> > Isn't there a way to get the real error code instead of just making it
> > up more cleverly? Like retrying for the rest of the request?
> >
> > Kevin
>
> I guess we could retry the last part of the request, but if we already
> have an error, it seems silly to try to rewrite the same stuff again
> just to obtain the error code.
Why? It's the standard Unix idiom. Keep writing until you either
complete your request or get an error. We don't do this here, and
instead invent an error.
Admittedly it's harder to do.
> I looked through the aio calls and I didn't find any obvious way to
> retrieve the error code, but maybe I missed something?
The existing code already has it: if ret is negative, that's what we return.
What you have to do on a short read or write is to schedule a new
request that starts from the point where this completion ends, and let
the completion of the new request return the error (or perhaps succeed).
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [PATCH] For AIO return -ENOSPC on short write
2011-02-22 10:18 ` Jes.Sorensen
2011-02-22 11:44 ` [Qemu-devel] " Kevin Wolf
@ 2011-02-22 15:02 ` Stefan Hajnoczi
2011-02-22 15:11 ` Kevin Wolf
1 sibling, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2011-02-22 15:02 UTC (permalink / raw)
To: Jes.Sorensen; +Cc: hch, kwolf, qemu-devel, stefanha
On Tue, Feb 22, 2011 at 10:18 AM, <Jes.Sorensen@redhat.com> wrote:
> + else if ((laiocb->type == QEMU_AIO_WRITE) && (ret >= 0) &&
> + (ret < laiocb->nbytes))
> + ret = -ENOSPC;
Why is write special?
Why are we even allowing requests that extend beyond the end of the
device? Is the LVM volume marked growable in the QEMU block layer?
Stefan
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [PATCH] For AIO return -ENOSPC on short write
2011-02-22 15:02 ` [Qemu-devel] " Stefan Hajnoczi
@ 2011-02-22 15:11 ` Kevin Wolf
2011-02-22 15:16 ` Stefan Hajnoczi
0 siblings, 1 reply; 10+ messages in thread
From: Kevin Wolf @ 2011-02-22 15:11 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: hch, Jes.Sorensen, qemu-devel, stefanha
Am 22.02.2011 16:02, schrieb Stefan Hajnoczi:
> On Tue, Feb 22, 2011 at 10:18 AM, <Jes.Sorensen@redhat.com> wrote:
>> + else if ((laiocb->type == QEMU_AIO_WRITE) && (ret >= 0) &&
>> + (ret < laiocb->nbytes))
>> + ret = -ENOSPC;
>
> Why is write special?
I think we need the change reads, too. However not to return ENOSPC, but
to return zeros instead (this is what the synchronous raw_read does, and
pwrite relies on it - once we make pwrite async, we'll need this).
> Why are we even allowing requests that extend beyond the end of the
> device? Is the LVM volume marked growable in the QEMU block layer?
Might well be a qcow2 on LVM case that Jes was debugging.
Kevin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [PATCH] For AIO return -ENOSPC on short write
2011-02-22 15:11 ` Kevin Wolf
@ 2011-02-22 15:16 ` Stefan Hajnoczi
2011-02-22 16:59 ` [Qemu-devel] " Paolo Bonzini
0 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2011-02-22 15:16 UTC (permalink / raw)
To: Kevin Wolf; +Cc: hch, Jes.Sorensen, qemu-devel, stefanha
On Tue, Feb 22, 2011 at 3:11 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 22.02.2011 16:02, schrieb Stefan Hajnoczi:
>> Why are we even allowing requests that extend beyond the end of the
>> device? Is the LVM volume marked growable in the QEMU block layer?
>
> Might well be a qcow2 on LVM case that Jes was debugging.
Yes it is. It doesn't explain it though. The code involved here is
linux-aio.c and will be qcow2's bs->file. That ought to be a
host_device and AFAIK that is not growable. So I wanted to figure out
why we're even getting this far. I expected the request to get
rejected in block.c when checking the range against the host_device.
Stefan
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Qemu-devel] Re: [PATCH] For AIO return -ENOSPC on short write
2011-02-22 15:16 ` Stefan Hajnoczi
@ 2011-02-22 16:59 ` Paolo Bonzini
2011-03-01 20:19 ` Christoph Hellwig
0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2011-02-22 16:59 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, hch, stefanha, Jes.Sorensen
On 02/22/2011 04:16 PM, Stefan Hajnoczi wrote:
> Yes it is. It doesn't explain it though. The code involved here is
> linux-aio.c and will be qcow2's bs->file. That ought to be a
> host_device and AFAIK that is not growable. So I wanted to figure out
> why we're even getting this far. I expected the request to get
> rejected in block.c when checking the range against the host_device.
Possibly a COW logical volume can give short writes on disk full?
Paolo
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] Re: [PATCH] For AIO return -ENOSPC on short write
2011-02-22 16:59 ` [Qemu-devel] " Paolo Bonzini
@ 2011-03-01 20:19 ` Christoph Hellwig
0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2011-03-01 20:19 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Kevin Wolf, hch, qemu-devel, Jes.Sorensen, stefanha
On Tue, Feb 22, 2011 at 05:59:01PM +0100, Paolo Bonzini wrote:
> On 02/22/2011 04:16 PM, Stefan Hajnoczi wrote:
> >Yes it is. It doesn't explain it though. The code involved here is
> >linux-aio.c and will be qcow2's bs->file. That ought to be a
> >host_device and AFAIK that is not growable. So I wanted to figure out
> >why we're even getting this far. I expected the request to get
> >rejected in block.c when checking the range against the host_device.
>
> Possibly a COW logical volume can give short writes on disk full?
It shouldn't. If it does that needs fixing.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-03-01 20:19 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-02-22 10:17 [Qemu-devel] [PATCH] For AIO return -ENOSPC on short write Jes.Sorensen
2011-02-22 10:18 ` Jes.Sorensen
2011-02-22 11:44 ` [Qemu-devel] " Kevin Wolf
2011-02-22 11:45 ` Jes Sorensen
2011-02-22 13:56 ` Avi Kivity
2011-02-22 15:02 ` [Qemu-devel] " Stefan Hajnoczi
2011-02-22 15:11 ` Kevin Wolf
2011-02-22 15:16 ` Stefan Hajnoczi
2011-02-22 16:59 ` [Qemu-devel] " Paolo Bonzini
2011-03-01 20:19 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).