* [Qemu-devel] buffer alignment for block backends
@ 2009-04-08 13:35 Stefano Stabellini
2009-04-08 18:53 ` Anthony Liguori
0 siblings, 1 reply; 27+ messages in thread
From: Stefano Stabellini @ 2009-04-08 13:35 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Hi all,
I am having an issue with the alignment of the buffers given to the
block backends. In particular, at the moment we are allocating buffers
512 bytes aligned while the backend I am working on (block-vbd) needs
page alignment.
Given that the 512 bytes alignment comes already from the requirements
of a "special case" (linux O_DIRECT, I guess), would you be willing to
make another exception for another special case and page align the buffers?
If you do not want to do that, another possible solution is to create a
new function called "qemu_blockalign" that would be implemented as
qemu_memalign(512, size); so we don't have to write 512 bytes everywhere
but only in one place, thus making life easier to people like me that
have to change the value for a special case.
Thanks in advance for your sympathy :)
Stefano Stabellini
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-08 13:35 [Qemu-devel] buffer alignment for block backends Stefano Stabellini
@ 2009-04-08 18:53 ` Anthony Liguori
2009-04-09 9:57 ` Stefano Stabellini
2009-04-09 10:07 ` Stefano Stabellini
0 siblings, 2 replies; 27+ messages in thread
From: Anthony Liguori @ 2009-04-08 18:53 UTC (permalink / raw)
To: qemu-devel
Stefano Stabellini wrote:
> Hi all,
> I am having an issue with the alignment of the buffers given to the
> block backends. In particular, at the moment we are allocating buffers
> 512 bytes aligned while the backend I am working on (block-vbd) needs
> page alignment.
>
There is no requirement for this. The block block-raw-posix backend
deals with bouncing the buffers if it has to. A few places do
qemu_memalign() allocs as an optimization to avoid the bouncing but it's
not strictly required.
> If you do not want to do that, another possible solution is to create a
> new function called "qemu_blockalign" that would be implemented as
> qemu_memalign(512, size);
This is fine, but this is purely an optimization, it cannot be relied
upon in the general case.
> so we don't have to write 512 bytes everywhere
> but only in one place, thus making life easier to people like me that
> have to change the value for a special case.
>
> Thanks in advance for your sympathy :)
>
Why does your backend requirement page alignment and who's notion of page?
Regards,
Anthony Liguori
> Stefano Stabellini
>
>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-08 18:53 ` Anthony Liguori
@ 2009-04-09 9:57 ` Stefano Stabellini
2009-04-09 13:11 ` Anthony Liguori
2009-04-09 10:07 ` Stefano Stabellini
1 sibling, 1 reply; 27+ messages in thread
From: Stefano Stabellini @ 2009-04-09 9:57 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Anthony Liguori wrote:
>> If you do not want to do that, another possible solution is to create a
>> new function called "qemu_blockalign" that would be implemented as
>> qemu_memalign(512, size);
>
> This is fine, but this is purely an optimization, it cannot be relied
> upon in the general case.
well, there aren't many places that allocate buffers for the block
backends, I can count only the following for dma operations:
- block.c:bdrv_aio_rw_vector
and this one for other ide read and write operations:
- ide.c:ide_init2
I think it would be important at least for dma operations.
>> so we don't have to write 512 bytes everywhere
>> but only in one place, thus making life easier to people like me that
>> have to change the value for a special case.
>>
>> Thanks in advance for your sympathy :)
>>
>
> Why does your backend requirement page alignment and who's notion of page?
>
my backend (block-vbd) needs page aligned buffers because blkfront needs
page aligned buffers. I could allocate a new page aligned buffer every
time and the memcpy but it is just a waste.
The notion of page is the host notion of page (minios in this case).
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-08 18:53 ` Anthony Liguori
2009-04-09 9:57 ` Stefano Stabellini
@ 2009-04-09 10:07 ` Stefano Stabellini
2009-04-09 13:14 ` Anthony Liguori
2009-04-09 13:19 ` Christoph Hellwig
1 sibling, 2 replies; 27+ messages in thread
From: Stefano Stabellini @ 2009-04-09 10:07 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Anthony Liguori wrote:
>> If you do not want to do that, another possible solution is to create a
>> new function called "qemu_blockalign" that would be implemented as
>> qemu_memalign(512, size);
>
> This is fine, but this is purely an optimization, it cannot be relied
> upon in the general case.
>
If you are OK with this, a simple patch like the following is
acceptable, or do you prefer a more formal approach involving a new
function in the BlockDriver interface?
---
diff --git a/block.c b/block.c
index 74d19ad..1fdcfef 100644
--- a/block.c
+++ b/block.c
@@ -1376,7 +1376,7 @@ static BlockDriverAIOCB *bdrv_aio_rw_vector(BlockDriverState *bs,
acb = qemu_aio_get(bs, cb, opaque);
acb->is_write = is_write;
acb->qiov = qiov;
- acb->bounce = qemu_memalign(512, qiov->size);
+ acb->bounce = qemu_blockalign(qiov->size);
if (!acb->bh)
acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb);
@@ -1626,3 +1626,9 @@ BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
return drv->bdrv_aio_ioctl(bs, req, buf, cb, opaque);
return NULL;
}
+
+void *qemu_blockalign(size_t size)
+{
+ return qemu_memalign(512, size);
+}
+
diff --git a/block.h b/block.h
index ca672a1..dba9f9d 100644
--- a/block.h
+++ b/block.h
@@ -176,4 +176,6 @@ int bdrv_put_buffer(BlockDriverState *bs, const uint8_t *buf,
int bdrv_get_buffer(BlockDriverState *bs, uint8_t *buf, int64_t pos, int size);
+void *qemu_blockalign(size_t size);
+
#endif
diff --git a/hw/ide.c b/hw/ide.c
index f187546..93b90b8 100644
--- a/hw/ide.c
+++ b/hw/ide.c
@@ -2788,7 +2788,7 @@ static void ide_init2(IDEState *ide_state,
for(i = 0; i < 2; i++) {
s = ide_state + i;
- s->io_buffer = qemu_memalign(512, IDE_DMA_BUF_SECTORS*512 + 4);
+ s->io_buffer = qemu_blockalign(IDE_DMA_BUF_SECTORS*512 + 4);
if (i == 0)
s->bs = hd0;
else
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 9:57 ` Stefano Stabellini
@ 2009-04-09 13:11 ` Anthony Liguori
2009-04-09 13:30 ` Stefano Stabellini
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Anthony Liguori @ 2009-04-09 13:11 UTC (permalink / raw)
To: qemu-devel
Stefano Stabellini wrote:
> Anthony Liguori wrote:
>
>
>>> If you do not want to do that, another possible solution is to create a
>>> new function called "qemu_blockalign" that would be implemented as
>>> qemu_memalign(512, size);
>>>
>> This is fine, but this is purely an optimization, it cannot be relied
>> upon in the general case.
>>
>
>
>
> well, there aren't many places that allocate buffers for the block
> backends, I can count only the following for dma operations:
>
> - block.c:bdrv_aio_rw_vector
>
This bounces a scatter/gather list into a single linear buffer since not
all backends handle scatter/gather lists today.
> and this one for other ide read and write operations:
>
> - ide.c:ide_init2
>
This buffer is only used when not doing DMA. When doing DMA, we are
able to do zero-copy IO so the alignment of the request depends on how
the guest aligned the request. I suspect you'll find a lot of guests
that, in practice, do not align requests at 4k boundaries. I don't know
what the requirements are for IDE but I would be surprised if it was 4k.
> I think it would be important at least for dma operations.
>
We have a lot of places with explicit memalign's because the block raw
backend code degrades into synchronous IO when performing non-aligned IO
with cache=off. I've never liked this much personally.
That's not saying that I think we shouldn't try to align DMA buffers
when we're allocating them in QEMU. A block level function to do this
would be pretty nice in fact.
>>> so we don't have to write 512 bytes everywhere
>>> but only in one place, thus making life easier to people like me that
>>> have to change the value for a special case.
>>>
>>> Thanks in advance for your sympathy :)
>>>
>>>
>> Why does your backend requirement page alignment and who's notion of page?
>>
>>
>
> my backend (block-vbd) needs page aligned buffers because blkfront needs
> page aligned buffers. I could allocate a new page aligned buffer every
> time and the memcpy but it is just a waste.
>
You'll need to check the alignment of the request and bounce it if
necessary. In the case that you have zero-copy requests coming from the
guest that aren't page aligned, someone has to bounce the thing to make
it page aligned.
So I presume you're implementing blkfront in userspace? Does minios
provide a userspace interface for grant tables that looks similar to the
interfaces on Linux? Were you planning on submitting this for inclusion
in upstream QEMU?
I think it's a reasonable thing to include provided it's relatively
self-contained.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 10:07 ` Stefano Stabellini
@ 2009-04-09 13:14 ` Anthony Liguori
2009-04-09 13:30 ` Stefano Stabellini
2009-04-09 13:19 ` Christoph Hellwig
1 sibling, 1 reply; 27+ messages in thread
From: Anthony Liguori @ 2009-04-09 13:14 UTC (permalink / raw)
To: qemu-devel
Stefano Stabellini wrote:
> Anthony Liguori wrote:
>
>
>>> If you do not want to do that, another possible solution is to create a
>>> new function called "qemu_blockalign" that would be implemented as
>>> qemu_memalign(512, size);
>>>
>> This is fine, but this is purely an optimization, it cannot be relied
>> upon in the general case.
>>
>>
>
>
>
> If you are OK with this, a simple patch like the following is
> acceptable, or do you prefer a more formal approach involving a new
> function in the BlockDriver interface?
>
Yes, but that's not covering very many of the cases. There are still
places where we allocate bounce buffers that aren't with memalign (like
in block-qcow2.c). Also, please document what the new function does and
when it should be used.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 10:07 ` Stefano Stabellini
2009-04-09 13:14 ` Anthony Liguori
@ 2009-04-09 13:19 ` Christoph Hellwig
2009-04-09 13:30 ` Stefano Stabellini
1 sibling, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-04-09 13:19 UTC (permalink / raw)
To: qemu-devel
On Thu, Apr 09, 2009 at 11:07:28AM +0100, Stefano Stabellini wrote:
> + acb->bounce = qemu_blockalign(qiov->size);
Why don't you pass a BlockDriverState to it so that it automatically
does the best alignment for the underlying driver?
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 13:11 ` Anthony Liguori
@ 2009-04-09 13:30 ` Stefano Stabellini
2009-04-09 13:54 ` Gerd Hoffmann
2009-04-09 13:33 ` Stefano Stabellini
2009-04-09 15:19 ` Samuel Thibault
2 siblings, 1 reply; 27+ messages in thread
From: Stefano Stabellini @ 2009-04-09 13:30 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Anthony Liguori wrote:
>> my backend (block-vbd) needs page aligned buffers because blkfront needs
>> page aligned buffers. I could allocate a new page aligned buffer every
>> time and the memcpy but it is just a waste.
>>
>
> You'll need to check the alignment of the request and bounce it if
> necessary. In the case that you have zero-copy requests coming from the
> guest that aren't page aligned, someone has to bounce the thing to make
> it page aligned.
Yes, but that is exactly what I was trying to avoid.
I can provide a bounce buffer fallback anyway (I actually should) but I
would really like to avoid using it.
> So I presume you're implementing blkfront in userspace? Does minios
> provide a userspace interface for grant tables that looks similar to the
> interfaces on Linux? Were you planning on submitting this for inclusion
> in upstream QEMU?
The userspace\kernelspace distinction does not make much sense in
minios, but yes it provides an interface to communicate with blkback in
dom0. This interface is used by block-vbd in qemu instead of
block-raw-posix when run in minios.
> I think it's a reasonable thing to include provided it's relatively
> self-contained.
>
good!
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 13:14 ` Anthony Liguori
@ 2009-04-09 13:30 ` Stefano Stabellini
0 siblings, 0 replies; 27+ messages in thread
From: Stefano Stabellini @ 2009-04-09 13:30 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Anthony Liguori wrote:
> Stefano Stabellini wrote:
>> Anthony Liguori wrote:
>>
>>
>>>> If you do not want to do that, another possible solution is to create a
>>>> new function called "qemu_blockalign" that would be implemented as
>>>> qemu_memalign(512, size);
>>>>
>>> This is fine, but this is purely an optimization, it cannot be relied
>>> upon in the general case.
>>>
>>>
>>
>>
>> If you are OK with this, a simple patch like the following is
>> acceptable, or do you prefer a more formal approach involving a new
>> function in the BlockDriver interface?
>>
>
> Yes, but that's not covering very many of the cases. There are still
> places where we allocate bounce buffers that aren't with memalign (like
> in block-qcow2.c). Also, please document what the new function does and
> when it should be used.
>
Ok, I'll try to find all those places.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 13:19 ` Christoph Hellwig
@ 2009-04-09 13:30 ` Stefano Stabellini
2009-04-09 15:31 ` Christoph Hellwig
0 siblings, 1 reply; 27+ messages in thread
From: Stefano Stabellini @ 2009-04-09 13:30 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Christoph Hellwig wrote:
> On Thu, Apr 09, 2009 at 11:07:28AM +0100, Stefano Stabellini wrote:
>> + acb->bounce = qemu_blockalign(qiov->size);
>
> Why don't you pass a BlockDriverState to it so that it automatically
> does the best alignment for the underlying driver?
>
This is a good idea.
What if I add a new field to BlockDriverState called 'alignment'
defaulted to 512?
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 13:11 ` Anthony Liguori
2009-04-09 13:30 ` Stefano Stabellini
@ 2009-04-09 13:33 ` Stefano Stabellini
2009-04-09 15:19 ` Samuel Thibault
2 siblings, 0 replies; 27+ messages in thread
From: Stefano Stabellini @ 2009-04-09 13:33 UTC (permalink / raw)
To: qemu-devel@nongnu.org
I forgot to reply to one question:
Anthony Liguori wrote:
> Were you planning on submitting this for inclusion
> in upstream QEMU?
Yes, as part of the xen hvm support.
If you think you can find a good use for it even without xen I can
submit it before.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 13:30 ` Stefano Stabellini
@ 2009-04-09 13:54 ` Gerd Hoffmann
2009-04-09 15:21 ` Samuel Thibault
0 siblings, 1 reply; 27+ messages in thread
From: Gerd Hoffmann @ 2009-04-09 13:54 UTC (permalink / raw)
To: qemu-devel
On 04/09/09 15:30, Stefano Stabellini wrote:
> The userspace\kernelspace distinction does not make much sense in
> minios, but yes it provides an interface to communicate with blkback in
> dom0. This interface is used by block-vbd in qemu instead of
> block-raw-posix when run in minios.
Sit back, look again. Do you really need it being page-aligned?
The xen block protocol can handle unaligned requests just fine. The
granted page is splitted into 512 sectors (8 pieces in case of 4k pages
on x86). You can then request only some of the of the sectors being
transfered. If you have two sectors with a 1024 byte offset to the page
border you'll just ask for sectors 2+3.
cheers,
Gerd
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 13:11 ` Anthony Liguori
2009-04-09 13:30 ` Stefano Stabellini
2009-04-09 13:33 ` Stefano Stabellini
@ 2009-04-09 15:19 ` Samuel Thibault
2009-04-09 15:38 ` Paul Brook
2009-04-09 15:56 ` Anthony Liguori
2 siblings, 2 replies; 27+ messages in thread
From: Samuel Thibault @ 2009-04-09 15:19 UTC (permalink / raw)
To: qemu-devel
Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit :
> >- ide.c:ide_init2
> >
>
> This buffer is only used when not doing DMA. When doing DMA, we are
> able to do zero-copy IO so the alignment of the request depends on how
> the guest aligned the request. I suspect you'll find a lot of guests
> that, in practice, do not align requests at 4k boundaries.
Eeeeerrr, why shoudn't they? Guests usually work on pages, which are
already aligned on 4k boundaries.
> I don't know what the requirements are for IDE but I would be
> surprised if it was 4k.
I believe there is no requirement.
Samuel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 13:54 ` Gerd Hoffmann
@ 2009-04-09 15:21 ` Samuel Thibault
2009-04-09 15:46 ` Stefano Stabellini
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Samuel Thibault @ 2009-04-09 15:21 UTC (permalink / raw)
To: qemu-devel
Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit :
> On 04/09/09 15:30, Stefano Stabellini wrote:
> >The userspace\kernelspace distinction does not make much sense in
> >minios, but yes it provides an interface to communicate with blkback in
> >dom0. This interface is used by block-vbd in qemu instead of
> >block-raw-posix when run in minios.
>
> Sit back, look again. Do you really need it being page-aligned?
Page aligned is an optimization which comes for free when you can
already choose the alignment.
> The xen block protocol can handle unaligned requests just fine.
Not completely arbitrarily aligned requests. They need to be
sector-aligned.
Samuel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 13:30 ` Stefano Stabellini
@ 2009-04-09 15:31 ` Christoph Hellwig
0 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-04-09 15:31 UTC (permalink / raw)
To: qemu-devel
On Thu, Apr 09, 2009 at 02:30:42PM +0100, Stefano Stabellini wrote:
> Christoph Hellwig wrote:
>
> > On Thu, Apr 09, 2009 at 11:07:28AM +0100, Stefano Stabellini wrote:
> >> + acb->bounce = qemu_blockalign(qiov->size);
> >
> > Why don't you pass a BlockDriverState to it so that it automatically
> > does the best alignment for the underlying driver?
> >
>
> This is a good idea.
>
> What if I add a new field to BlockDriverState called 'alignment'
> defaulted to 512?
Let's call it buffer_alignmnet or memory_alignment as we also have an
alignment of the actual sector number (doesn't matter currently, but
there are devices or device configurations that only support larger than
512 byte sectors)
Also please add a large comment describing what it's used for. Right
now that's only a hint for the allocations, but I might re-use it in
future to move the bouncing on mis-alignmnet into the block layer.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:19 ` Samuel Thibault
@ 2009-04-09 15:38 ` Paul Brook
2009-04-09 15:40 ` Samuel Thibault
2009-04-09 15:56 ` Anthony Liguori
1 sibling, 1 reply; 27+ messages in thread
From: Paul Brook @ 2009-04-09 15:38 UTC (permalink / raw)
To: qemu-devel; +Cc: Samuel Thibault
On Thursday 09 April 2009, Samuel Thibault wrote:
> Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit :
> > >- ide.c:ide_init2
> >
> > This buffer is only used when not doing DMA. When doing DMA, we are
> > able to do zero-copy IO so the alignment of the request depends on how
> > the guest aligned the request. I suspect you'll find a lot of guests
> > that, in practice, do not align requests at 4k boundaries.
>
> Eeeeerrr, why shoudn't they? Guests usually work on pages, which are
> already aligned on 4k boundaries.
Assuming that page == 4k is going to get you into all sorts of trouble.
Paul
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:38 ` Paul Brook
@ 2009-04-09 15:40 ` Samuel Thibault
0 siblings, 0 replies; 27+ messages in thread
From: Samuel Thibault @ 2009-04-09 15:40 UTC (permalink / raw)
To: Paul Brook; +Cc: qemu-devel
Paul Brook, le Thu 09 Apr 2009 16:38:51 +0100, a écrit :
> On Thursday 09 April 2009, Samuel Thibault wrote:
> > Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit :
> > > >- ide.c:ide_init2
> > >
> > > This buffer is only used when not doing DMA. When doing DMA, we are
> > > able to do zero-copy IO so the alignment of the request depends on how
> > > the guest aligned the request. I suspect you'll find a lot of guests
> > > that, in practice, do not align requests at 4k boundaries.
> >
> > Eeeeerrr, why shoudn't they? Guests usually work on pages, which are
> > already aligned on 4k boundaries.
>
> Assuming that page == 4k is going to get you into all sorts of trouble.
I'm not assuming it. I'm saying that most of the time that's the case,
and thus a heuristic that works better when that's the case will get
great benefits.
Samuel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:21 ` Samuel Thibault
@ 2009-04-09 15:46 ` Stefano Stabellini
2009-04-09 15:50 ` Gerd Hoffmann
2009-04-09 15:55 ` Jamie Lokier
2 siblings, 0 replies; 27+ messages in thread
From: Stefano Stabellini @ 2009-04-09 15:46 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Samuel Thibault wrote:
> Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit :
>> On 04/09/09 15:30, Stefano Stabellini wrote:
>>> The userspace\kernelspace distinction does not make much sense in
>>> minios, but yes it provides an interface to communicate with blkback in
>>> dom0. This interface is used by block-vbd in qemu instead of
>>> block-raw-posix when run in minios.
>> Sit back, look again. Do you really need it being page-aligned?
>
> Page aligned is an optimization which comes for free when you can
> already choose the alignment.
>
>> The xen block protocol can handle unaligned requests just fine.
>
> Not completely arbitrarily aligned requests. They need to be
> sector-aligned.
>
Samuel is right, I was misled by my test environment.
Blkfront does support sector aligned buffers (even though at the moment
they don't work for me :P ).
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:21 ` Samuel Thibault
2009-04-09 15:46 ` Stefano Stabellini
@ 2009-04-09 15:50 ` Gerd Hoffmann
2009-04-09 16:11 ` Stefano Stabellini
` (2 more replies)
2009-04-09 15:55 ` Jamie Lokier
2 siblings, 3 replies; 27+ messages in thread
From: Gerd Hoffmann @ 2009-04-09 15:50 UTC (permalink / raw)
To: qemu-devel
On 04/09/09 17:21, Samuel Thibault wrote:
> Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit :
>> On 04/09/09 15:30, Stefano Stabellini wrote:
>>> The userspace\kernelspace distinction does not make much sense in
>>> minios, but yes it provides an interface to communicate with blkback in
>>> dom0. This interface is used by block-vbd in qemu instead of
>>> block-raw-posix when run in minios.
>> Sit back, look again. Do you really need it being page-aligned?
>
> Page aligned is an optimization which comes for free when you can
> already choose the alignment.
Ok, so you don't have to.
Does it have an effect positive actually? i.e. do you have requests
which would fit into one page but actually span two due to the misalignment?
At least when watching the linux kernel access pattern I see larger
requests being page aligned anyway. Probably page cache reads and writes.
>> The xen block protocol can handle unaligned requests just fine.
>
> Not completely arbitrarily aligned requests. They need to be
> sector-aligned.
Agreed. Sector-alignment you have anyway, thats why I didn't wrote that
explicitly.
cheers,
Gerd
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:21 ` Samuel Thibault
2009-04-09 15:46 ` Stefano Stabellini
2009-04-09 15:50 ` Gerd Hoffmann
@ 2009-04-09 15:55 ` Jamie Lokier
2009-04-09 16:15 ` Samuel Thibault
2009-04-09 17:29 ` Lennart Sorensen
2 siblings, 2 replies; 27+ messages in thread
From: Jamie Lokier @ 2009-04-09 15:55 UTC (permalink / raw)
To: qemu-devel
Samuel Thibault wrote:
> Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit :
> > On 04/09/09 15:30, Stefano Stabellini wrote:
> > >The userspace\kernelspace distinction does not make much sense in
> > >minios, but yes it provides an interface to communicate with blkback in
> > >dom0. This interface is used by block-vbd in qemu instead of
> > >block-raw-posix when run in minios.
> >
> > Sit back, look again. Do you really need it being page-aligned?
>
> Page aligned is an optimization which comes for free when you can
> already choose the alignment.
>
> > The xen block protocol can handle unaligned requests just fine.
>
> Not completely arbitrarily aligned requests. They need to be
> sector-aligned.
Thought about 4k sectors, to be found in new disks sometime soon?
-- Jamie
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:19 ` Samuel Thibault
2009-04-09 15:38 ` Paul Brook
@ 2009-04-09 15:56 ` Anthony Liguori
2009-04-09 16:19 ` Samuel Thibault
1 sibling, 1 reply; 27+ messages in thread
From: Anthony Liguori @ 2009-04-09 15:56 UTC (permalink / raw)
To: qemu-devel
Samuel Thibault wrote:
> Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit :
>
>>> - ide.c:ide_init2
>>>
>>>
>> This buffer is only used when not doing DMA. When doing DMA, we are
>> able to do zero-copy IO so the alignment of the request depends on how
>> the guest aligned the request. I suspect you'll find a lot of guests
>> that, in practice, do not align requests at 4k boundaries.
>>
>
> Eeeeerrr, why shoudn't they? Guests usually work on pages, which are
> already aligned on 4k boundaries.
>
Like if you do a 512-byte O_DIRECT write (in userspace) to a 512-byte,
but not 4096-byte aligned buffer in Linux. This may not happen all the
time, but if you don't handle this in your backend, you'll get a lot of
lovely bug reports :-)
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:50 ` Gerd Hoffmann
@ 2009-04-09 16:11 ` Stefano Stabellini
2009-04-09 16:13 ` Samuel Thibault
2009-04-09 16:28 ` Anthony Liguori
2 siblings, 0 replies; 27+ messages in thread
From: Stefano Stabellini @ 2009-04-09 16:11 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Gerd Hoffmann wrote:
> On 04/09/09 17:21, Samuel Thibault wrote:
>> Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit :
>>> On 04/09/09 15:30, Stefano Stabellini wrote:
>>>> The userspace\kernelspace distinction does not make much sense in
>>>> minios, but yes it provides an interface to communicate with blkback in
>>>> dom0. This interface is used by block-vbd in qemu instead of
>>>> block-raw-posix when run in minios.
>>> Sit back, look again. Do you really need it being page-aligned?
>> Page aligned is an optimization which comes for free when you can
>> already choose the alignment.
>
> Ok, so you don't have to.
>
> Does it have an effect positive actually? i.e. do you have requests
> which would fit into one page but actually span two due to the misalignment?
Yep, I can see many here.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:50 ` Gerd Hoffmann
2009-04-09 16:11 ` Stefano Stabellini
@ 2009-04-09 16:13 ` Samuel Thibault
2009-04-09 16:28 ` Anthony Liguori
2 siblings, 0 replies; 27+ messages in thread
From: Samuel Thibault @ 2009-04-09 16:13 UTC (permalink / raw)
To: qemu-devel
Gerd Hoffmann, le Thu 09 Apr 2009 17:50:19 +0200, a écrit :
> On 04/09/09 17:21, Samuel Thibault wrote:
> >Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit :
> >>On 04/09/09 15:30, Stefano Stabellini wrote:
> >>>The userspace\kernelspace distinction does not make much sense in
> >>>minios, but yes it provides an interface to communicate with blkback in
> >>>dom0. This interface is used by block-vbd in qemu instead of
> >>>block-raw-posix when run in minios.
> >>Sit back, look again. Do you really need it being page-aligned?
> >
> >Page aligned is an optimization which comes for free when you can
> >already choose the alignment.
>
> Ok, so you don't have to.
>
> Does it have an effect positive actually? i.e. do you have requests
> which would fit into one page but actually span two due to the misalignment?
I/O with small files often fit in just one page. If they're never
aligned the amount of grants to transfer is doubled.
> At least when watching the linux kernel access pattern I see larger
> requests being page aligned anyway.
> Probably page cache reads and writes.
Yes, that's what I meant in another mail. Usually, I/O will already be
aligned. That's not a reason for not trying to optimize other cases.
Samuel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:55 ` Jamie Lokier
@ 2009-04-09 16:15 ` Samuel Thibault
2009-04-09 17:29 ` Lennart Sorensen
1 sibling, 0 replies; 27+ messages in thread
From: Samuel Thibault @ 2009-04-09 16:15 UTC (permalink / raw)
To: qemu-devel
Jamie Lokier, le Thu 09 Apr 2009 16:55:50 +0100, a écrit :
> Samuel Thibault wrote:
> > Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit :
> > > On 04/09/09 15:30, Stefano Stabellini wrote:
> > > >The userspace\kernelspace distinction does not make much sense in
> > > >minios, but yes it provides an interface to communicate with blkback in
> > > >dom0. This interface is used by block-vbd in qemu instead of
> > > >block-raw-posix when run in minios.
> > >
> > > Sit back, look again. Do you really need it being page-aligned?
> >
> > Page aligned is an optimization which comes for free when you can
> > already choose the alignment.
> >
> > > The xen block protocol can handle unaligned requests just fine.
> >
> > Not completely arbitrarily aligned requests. They need to be
> > sector-aligned.
>
> Thought about 4k sectors, to be found in new disks sometime soon?
Same wording: they will have to be sector-aligned (the blkback announces
the size of its sector).
Samuel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:56 ` Anthony Liguori
@ 2009-04-09 16:19 ` Samuel Thibault
0 siblings, 0 replies; 27+ messages in thread
From: Samuel Thibault @ 2009-04-09 16:19 UTC (permalink / raw)
To: qemu-devel
Anthony Liguori, le Thu 09 Apr 2009 10:56:27 -0500, a écrit :
> Samuel Thibault wrote:
> >Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit :
> >>I suspect you'll find a lot of guests that, in practice, do not
> >>align requests at 4k boundaries.
> >
> >Eeeeerrr, why shoudn't they? Guests usually work on pages, which are
> >already aligned on 4k boundaries.
> >
>
> Like if you do a 512-byte O_DIRECT write (in userspace) to a 512-byte,
> but not 4096-byte aligned buffer in Linux. This may not happen all the
> time, but if you don't handle this in your backend, you'll get a lot of
> lovely bug reports :-)
As I said, the requirement is sector alignment. Yes, if the blkback
sector size is greater than 512 bytes we'll have to cope with the
userspace buffer not aligned with 512 bytes, by adding a grant to the
blkback request. In such a case however, I'd suggest having qemu expose
4k sector virtual disks, so that the guest automaticcally chooses to
align to 4k, and thus get full optimization.
Samuel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:50 ` Gerd Hoffmann
2009-04-09 16:11 ` Stefano Stabellini
2009-04-09 16:13 ` Samuel Thibault
@ 2009-04-09 16:28 ` Anthony Liguori
2 siblings, 0 replies; 27+ messages in thread
From: Anthony Liguori @ 2009-04-09 16:28 UTC (permalink / raw)
To: qemu-devel
Gerd Hoffmann wrote:
>>> The xen block protocol can handle unaligned requests just fine.
>>
>> Not completely arbitrarily aligned requests. They need to be
>> sector-aligned.
>
> Agreed. Sector-alignment you have anyway, thats why I didn't wrote
> that explicitly.
But even sector-alignment is not guaranteed for the buffers in the
bdrv_* API.
You can implement a backend that just implements aio_read/aio_write and
you'll always get sector-aligned offsets, but the actual data buffers
may have any alignment.
Regards,
Anthony Liguori
> cheers,
> Gerd
>
>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends
2009-04-09 15:55 ` Jamie Lokier
2009-04-09 16:15 ` Samuel Thibault
@ 2009-04-09 17:29 ` Lennart Sorensen
1 sibling, 0 replies; 27+ messages in thread
From: Lennart Sorensen @ 2009-04-09 17:29 UTC (permalink / raw)
To: qemu-devel
On Thu, Apr 09, 2009 at 04:55:50PM +0100, Jamie Lokier wrote:
> Thought about 4k sectors, to be found in new disks sometime soon?
Some new SAS drives use 4k sectors, and SATA is going to at some point
as well. It seems to be currently in testing, but some of the linux
kernel developers have test drives that use 4k sectors for development.
--
Len Sorensen
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2009-04-09 17:29 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-04-08 13:35 [Qemu-devel] buffer alignment for block backends Stefano Stabellini
2009-04-08 18:53 ` Anthony Liguori
2009-04-09 9:57 ` Stefano Stabellini
2009-04-09 13:11 ` Anthony Liguori
2009-04-09 13:30 ` Stefano Stabellini
2009-04-09 13:54 ` Gerd Hoffmann
2009-04-09 15:21 ` Samuel Thibault
2009-04-09 15:46 ` Stefano Stabellini
2009-04-09 15:50 ` Gerd Hoffmann
2009-04-09 16:11 ` Stefano Stabellini
2009-04-09 16:13 ` Samuel Thibault
2009-04-09 16:28 ` Anthony Liguori
2009-04-09 15:55 ` Jamie Lokier
2009-04-09 16:15 ` Samuel Thibault
2009-04-09 17:29 ` Lennart Sorensen
2009-04-09 13:33 ` Stefano Stabellini
2009-04-09 15:19 ` Samuel Thibault
2009-04-09 15:38 ` Paul Brook
2009-04-09 15:40 ` Samuel Thibault
2009-04-09 15:56 ` Anthony Liguori
2009-04-09 16:19 ` Samuel Thibault
2009-04-09 10:07 ` Stefano Stabellini
2009-04-09 13:14 ` Anthony Liguori
2009-04-09 13:30 ` Stefano Stabellini
2009-04-09 13:19 ` Christoph Hellwig
2009-04-09 13:30 ` Stefano Stabellini
2009-04-09 15:31 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).