* [Qemu-devel] buffer alignment for block backends @ 2009-04-08 13:35 Stefano Stabellini 2009-04-08 18:53 ` Anthony Liguori 0 siblings, 1 reply; 27+ messages in thread From: Stefano Stabellini @ 2009-04-08 13:35 UTC (permalink / raw) To: qemu-devel@nongnu.org Hi all, I am having an issue with the alignment of the buffers given to the block backends. In particular, at the moment we are allocating buffers 512 bytes aligned while the backend I am working on (block-vbd) needs page alignment. Given that the 512 bytes alignment comes already from the requirements of a "special case" (linux O_DIRECT, I guess), would you be willing to make another exception for another special case and page align the buffers? If you do not want to do that, another possible solution is to create a new function called "qemu_blockalign" that would be implemented as qemu_memalign(512, size); so we don't have to write 512 bytes everywhere but only in one place, thus making life easier to people like me that have to change the value for a special case. Thanks in advance for your sympathy :) Stefano Stabellini ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-08 13:35 [Qemu-devel] buffer alignment for block backends Stefano Stabellini @ 2009-04-08 18:53 ` Anthony Liguori 2009-04-09 9:57 ` Stefano Stabellini 2009-04-09 10:07 ` Stefano Stabellini 0 siblings, 2 replies; 27+ messages in thread From: Anthony Liguori @ 2009-04-08 18:53 UTC (permalink / raw) To: qemu-devel Stefano Stabellini wrote: > Hi all, > I am having an issue with the alignment of the buffers given to the > block backends. In particular, at the moment we are allocating buffers > 512 bytes aligned while the backend I am working on (block-vbd) needs > page alignment. > There is no requirement for this. The block block-raw-posix backend deals with bouncing the buffers if it has to. A few places do qemu_memalign() allocs as an optimization to avoid the bouncing but it's not strictly required. > If you do not want to do that, another possible solution is to create a > new function called "qemu_blockalign" that would be implemented as > qemu_memalign(512, size); This is fine, but this is purely an optimization, it cannot be relied upon in the general case. > so we don't have to write 512 bytes everywhere > but only in one place, thus making life easier to people like me that > have to change the value for a special case. > > Thanks in advance for your sympathy :) > Why does your backend requirement page alignment and who's notion of page? Regards, Anthony Liguori > Stefano Stabellini > > > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-08 18:53 ` Anthony Liguori @ 2009-04-09 9:57 ` Stefano Stabellini 2009-04-09 13:11 ` Anthony Liguori 2009-04-09 10:07 ` Stefano Stabellini 1 sibling, 1 reply; 27+ messages in thread From: Stefano Stabellini @ 2009-04-09 9:57 UTC (permalink / raw) To: qemu-devel@nongnu.org Anthony Liguori wrote: >> If you do not want to do that, another possible solution is to create a >> new function called "qemu_blockalign" that would be implemented as >> qemu_memalign(512, size); > > This is fine, but this is purely an optimization, it cannot be relied > upon in the general case. well, there aren't many places that allocate buffers for the block backends, I can count only the following for dma operations: - block.c:bdrv_aio_rw_vector and this one for other ide read and write operations: - ide.c:ide_init2 I think it would be important at least for dma operations. >> so we don't have to write 512 bytes everywhere >> but only in one place, thus making life easier to people like me that >> have to change the value for a special case. >> >> Thanks in advance for your sympathy :) >> > > Why does your backend requirement page alignment and who's notion of page? > my backend (block-vbd) needs page aligned buffers because blkfront needs page aligned buffers. I could allocate a new page aligned buffer every time and the memcpy but it is just a waste. The notion of page is the host notion of page (minios in this case). ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 9:57 ` Stefano Stabellini @ 2009-04-09 13:11 ` Anthony Liguori 2009-04-09 13:30 ` Stefano Stabellini ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: Anthony Liguori @ 2009-04-09 13:11 UTC (permalink / raw) To: qemu-devel Stefano Stabellini wrote: > Anthony Liguori wrote: > > >>> If you do not want to do that, another possible solution is to create a >>> new function called "qemu_blockalign" that would be implemented as >>> qemu_memalign(512, size); >>> >> This is fine, but this is purely an optimization, it cannot be relied >> upon in the general case. >> > > > > well, there aren't many places that allocate buffers for the block > backends, I can count only the following for dma operations: > > - block.c:bdrv_aio_rw_vector > This bounces a scatter/gather list into a single linear buffer since not all backends handle scatter/gather lists today. > and this one for other ide read and write operations: > > - ide.c:ide_init2 > This buffer is only used when not doing DMA. When doing DMA, we are able to do zero-copy IO so the alignment of the request depends on how the guest aligned the request. I suspect you'll find a lot of guests that, in practice, do not align requests at 4k boundaries. I don't know what the requirements are for IDE but I would be surprised if it was 4k. > I think it would be important at least for dma operations. > We have a lot of places with explicit memalign's because the block raw backend code degrades into synchronous IO when performing non-aligned IO with cache=off. I've never liked this much personally. That's not saying that I think we shouldn't try to align DMA buffers when we're allocating them in QEMU. A block level function to do this would be pretty nice in fact. >>> so we don't have to write 512 bytes everywhere >>> but only in one place, thus making life easier to people like me that >>> have to change the value for a special case. >>> >>> Thanks in advance for your sympathy :) >>> >>> >> Why does your backend requirement page alignment and who's notion of page? >> >> > > my backend (block-vbd) needs page aligned buffers because blkfront needs > page aligned buffers. I could allocate a new page aligned buffer every > time and the memcpy but it is just a waste. > You'll need to check the alignment of the request and bounce it if necessary. In the case that you have zero-copy requests coming from the guest that aren't page aligned, someone has to bounce the thing to make it page aligned. So I presume you're implementing blkfront in userspace? Does minios provide a userspace interface for grant tables that looks similar to the interfaces on Linux? Were you planning on submitting this for inclusion in upstream QEMU? I think it's a reasonable thing to include provided it's relatively self-contained. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 13:11 ` Anthony Liguori @ 2009-04-09 13:30 ` Stefano Stabellini 2009-04-09 13:54 ` Gerd Hoffmann 2009-04-09 13:33 ` Stefano Stabellini 2009-04-09 15:19 ` Samuel Thibault 2 siblings, 1 reply; 27+ messages in thread From: Stefano Stabellini @ 2009-04-09 13:30 UTC (permalink / raw) To: qemu-devel@nongnu.org Anthony Liguori wrote: >> my backend (block-vbd) needs page aligned buffers because blkfront needs >> page aligned buffers. I could allocate a new page aligned buffer every >> time and the memcpy but it is just a waste. >> > > You'll need to check the alignment of the request and bounce it if > necessary. In the case that you have zero-copy requests coming from the > guest that aren't page aligned, someone has to bounce the thing to make > it page aligned. Yes, but that is exactly what I was trying to avoid. I can provide a bounce buffer fallback anyway (I actually should) but I would really like to avoid using it. > So I presume you're implementing blkfront in userspace? Does minios > provide a userspace interface for grant tables that looks similar to the > interfaces on Linux? Were you planning on submitting this for inclusion > in upstream QEMU? The userspace\kernelspace distinction does not make much sense in minios, but yes it provides an interface to communicate with blkback in dom0. This interface is used by block-vbd in qemu instead of block-raw-posix when run in minios. > I think it's a reasonable thing to include provided it's relatively > self-contained. > good! ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 13:30 ` Stefano Stabellini @ 2009-04-09 13:54 ` Gerd Hoffmann 2009-04-09 15:21 ` Samuel Thibault 0 siblings, 1 reply; 27+ messages in thread From: Gerd Hoffmann @ 2009-04-09 13:54 UTC (permalink / raw) To: qemu-devel On 04/09/09 15:30, Stefano Stabellini wrote: > The userspace\kernelspace distinction does not make much sense in > minios, but yes it provides an interface to communicate with blkback in > dom0. This interface is used by block-vbd in qemu instead of > block-raw-posix when run in minios. Sit back, look again. Do you really need it being page-aligned? The xen block protocol can handle unaligned requests just fine. The granted page is splitted into 512 sectors (8 pieces in case of 4k pages on x86). You can then request only some of the of the sectors being transfered. If you have two sectors with a 1024 byte offset to the page border you'll just ask for sectors 2+3. cheers, Gerd ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 13:54 ` Gerd Hoffmann @ 2009-04-09 15:21 ` Samuel Thibault 2009-04-09 15:46 ` Stefano Stabellini ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: Samuel Thibault @ 2009-04-09 15:21 UTC (permalink / raw) To: qemu-devel Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit : > On 04/09/09 15:30, Stefano Stabellini wrote: > >The userspace\kernelspace distinction does not make much sense in > >minios, but yes it provides an interface to communicate with blkback in > >dom0. This interface is used by block-vbd in qemu instead of > >block-raw-posix when run in minios. > > Sit back, look again. Do you really need it being page-aligned? Page aligned is an optimization which comes for free when you can already choose the alignment. > The xen block protocol can handle unaligned requests just fine. Not completely arbitrarily aligned requests. They need to be sector-aligned. Samuel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:21 ` Samuel Thibault @ 2009-04-09 15:46 ` Stefano Stabellini 2009-04-09 15:50 ` Gerd Hoffmann 2009-04-09 15:55 ` Jamie Lokier 2 siblings, 0 replies; 27+ messages in thread From: Stefano Stabellini @ 2009-04-09 15:46 UTC (permalink / raw) To: qemu-devel@nongnu.org Samuel Thibault wrote: > Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit : >> On 04/09/09 15:30, Stefano Stabellini wrote: >>> The userspace\kernelspace distinction does not make much sense in >>> minios, but yes it provides an interface to communicate with blkback in >>> dom0. This interface is used by block-vbd in qemu instead of >>> block-raw-posix when run in minios. >> Sit back, look again. Do you really need it being page-aligned? > > Page aligned is an optimization which comes for free when you can > already choose the alignment. > >> The xen block protocol can handle unaligned requests just fine. > > Not completely arbitrarily aligned requests. They need to be > sector-aligned. > Samuel is right, I was misled by my test environment. Blkfront does support sector aligned buffers (even though at the moment they don't work for me :P ). ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:21 ` Samuel Thibault 2009-04-09 15:46 ` Stefano Stabellini @ 2009-04-09 15:50 ` Gerd Hoffmann 2009-04-09 16:11 ` Stefano Stabellini ` (2 more replies) 2009-04-09 15:55 ` Jamie Lokier 2 siblings, 3 replies; 27+ messages in thread From: Gerd Hoffmann @ 2009-04-09 15:50 UTC (permalink / raw) To: qemu-devel On 04/09/09 17:21, Samuel Thibault wrote: > Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit : >> On 04/09/09 15:30, Stefano Stabellini wrote: >>> The userspace\kernelspace distinction does not make much sense in >>> minios, but yes it provides an interface to communicate with blkback in >>> dom0. This interface is used by block-vbd in qemu instead of >>> block-raw-posix when run in minios. >> Sit back, look again. Do you really need it being page-aligned? > > Page aligned is an optimization which comes for free when you can > already choose the alignment. Ok, so you don't have to. Does it have an effect positive actually? i.e. do you have requests which would fit into one page but actually span two due to the misalignment? At least when watching the linux kernel access pattern I see larger requests being page aligned anyway. Probably page cache reads and writes. >> The xen block protocol can handle unaligned requests just fine. > > Not completely arbitrarily aligned requests. They need to be > sector-aligned. Agreed. Sector-alignment you have anyway, thats why I didn't wrote that explicitly. cheers, Gerd ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:50 ` Gerd Hoffmann @ 2009-04-09 16:11 ` Stefano Stabellini 2009-04-09 16:13 ` Samuel Thibault 2009-04-09 16:28 ` Anthony Liguori 2 siblings, 0 replies; 27+ messages in thread From: Stefano Stabellini @ 2009-04-09 16:11 UTC (permalink / raw) To: qemu-devel@nongnu.org Gerd Hoffmann wrote: > On 04/09/09 17:21, Samuel Thibault wrote: >> Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit : >>> On 04/09/09 15:30, Stefano Stabellini wrote: >>>> The userspace\kernelspace distinction does not make much sense in >>>> minios, but yes it provides an interface to communicate with blkback in >>>> dom0. This interface is used by block-vbd in qemu instead of >>>> block-raw-posix when run in minios. >>> Sit back, look again. Do you really need it being page-aligned? >> Page aligned is an optimization which comes for free when you can >> already choose the alignment. > > Ok, so you don't have to. > > Does it have an effect positive actually? i.e. do you have requests > which would fit into one page but actually span two due to the misalignment? Yep, I can see many here. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:50 ` Gerd Hoffmann 2009-04-09 16:11 ` Stefano Stabellini @ 2009-04-09 16:13 ` Samuel Thibault 2009-04-09 16:28 ` Anthony Liguori 2 siblings, 0 replies; 27+ messages in thread From: Samuel Thibault @ 2009-04-09 16:13 UTC (permalink / raw) To: qemu-devel Gerd Hoffmann, le Thu 09 Apr 2009 17:50:19 +0200, a écrit : > On 04/09/09 17:21, Samuel Thibault wrote: > >Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit : > >>On 04/09/09 15:30, Stefano Stabellini wrote: > >>>The userspace\kernelspace distinction does not make much sense in > >>>minios, but yes it provides an interface to communicate with blkback in > >>>dom0. This interface is used by block-vbd in qemu instead of > >>>block-raw-posix when run in minios. > >>Sit back, look again. Do you really need it being page-aligned? > > > >Page aligned is an optimization which comes for free when you can > >already choose the alignment. > > Ok, so you don't have to. > > Does it have an effect positive actually? i.e. do you have requests > which would fit into one page but actually span two due to the misalignment? I/O with small files often fit in just one page. If they're never aligned the amount of grants to transfer is doubled. > At least when watching the linux kernel access pattern I see larger > requests being page aligned anyway. > Probably page cache reads and writes. Yes, that's what I meant in another mail. Usually, I/O will already be aligned. That's not a reason for not trying to optimize other cases. Samuel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:50 ` Gerd Hoffmann 2009-04-09 16:11 ` Stefano Stabellini 2009-04-09 16:13 ` Samuel Thibault @ 2009-04-09 16:28 ` Anthony Liguori 2 siblings, 0 replies; 27+ messages in thread From: Anthony Liguori @ 2009-04-09 16:28 UTC (permalink / raw) To: qemu-devel Gerd Hoffmann wrote: >>> The xen block protocol can handle unaligned requests just fine. >> >> Not completely arbitrarily aligned requests. They need to be >> sector-aligned. > > Agreed. Sector-alignment you have anyway, thats why I didn't wrote > that explicitly. But even sector-alignment is not guaranteed for the buffers in the bdrv_* API. You can implement a backend that just implements aio_read/aio_write and you'll always get sector-aligned offsets, but the actual data buffers may have any alignment. Regards, Anthony Liguori > cheers, > Gerd > > > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:21 ` Samuel Thibault 2009-04-09 15:46 ` Stefano Stabellini 2009-04-09 15:50 ` Gerd Hoffmann @ 2009-04-09 15:55 ` Jamie Lokier 2009-04-09 16:15 ` Samuel Thibault 2009-04-09 17:29 ` Lennart Sorensen 2 siblings, 2 replies; 27+ messages in thread From: Jamie Lokier @ 2009-04-09 15:55 UTC (permalink / raw) To: qemu-devel Samuel Thibault wrote: > Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit : > > On 04/09/09 15:30, Stefano Stabellini wrote: > > >The userspace\kernelspace distinction does not make much sense in > > >minios, but yes it provides an interface to communicate with blkback in > > >dom0. This interface is used by block-vbd in qemu instead of > > >block-raw-posix when run in minios. > > > > Sit back, look again. Do you really need it being page-aligned? > > Page aligned is an optimization which comes for free when you can > already choose the alignment. > > > The xen block protocol can handle unaligned requests just fine. > > Not completely arbitrarily aligned requests. They need to be > sector-aligned. Thought about 4k sectors, to be found in new disks sometime soon? -- Jamie ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:55 ` Jamie Lokier @ 2009-04-09 16:15 ` Samuel Thibault 2009-04-09 17:29 ` Lennart Sorensen 1 sibling, 0 replies; 27+ messages in thread From: Samuel Thibault @ 2009-04-09 16:15 UTC (permalink / raw) To: qemu-devel Jamie Lokier, le Thu 09 Apr 2009 16:55:50 +0100, a écrit : > Samuel Thibault wrote: > > Gerd Hoffmann, le Thu 09 Apr 2009 15:54:34 +0200, a écrit : > > > On 04/09/09 15:30, Stefano Stabellini wrote: > > > >The userspace\kernelspace distinction does not make much sense in > > > >minios, but yes it provides an interface to communicate with blkback in > > > >dom0. This interface is used by block-vbd in qemu instead of > > > >block-raw-posix when run in minios. > > > > > > Sit back, look again. Do you really need it being page-aligned? > > > > Page aligned is an optimization which comes for free when you can > > already choose the alignment. > > > > > The xen block protocol can handle unaligned requests just fine. > > > > Not completely arbitrarily aligned requests. They need to be > > sector-aligned. > > Thought about 4k sectors, to be found in new disks sometime soon? Same wording: they will have to be sector-aligned (the blkback announces the size of its sector). Samuel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:55 ` Jamie Lokier 2009-04-09 16:15 ` Samuel Thibault @ 2009-04-09 17:29 ` Lennart Sorensen 1 sibling, 0 replies; 27+ messages in thread From: Lennart Sorensen @ 2009-04-09 17:29 UTC (permalink / raw) To: qemu-devel On Thu, Apr 09, 2009 at 04:55:50PM +0100, Jamie Lokier wrote: > Thought about 4k sectors, to be found in new disks sometime soon? Some new SAS drives use 4k sectors, and SATA is going to at some point as well. It seems to be currently in testing, but some of the linux kernel developers have test drives that use 4k sectors for development. -- Len Sorensen ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 13:11 ` Anthony Liguori 2009-04-09 13:30 ` Stefano Stabellini @ 2009-04-09 13:33 ` Stefano Stabellini 2009-04-09 15:19 ` Samuel Thibault 2 siblings, 0 replies; 27+ messages in thread From: Stefano Stabellini @ 2009-04-09 13:33 UTC (permalink / raw) To: qemu-devel@nongnu.org I forgot to reply to one question: Anthony Liguori wrote: > Were you planning on submitting this for inclusion > in upstream QEMU? Yes, as part of the xen hvm support. If you think you can find a good use for it even without xen I can submit it before. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 13:11 ` Anthony Liguori 2009-04-09 13:30 ` Stefano Stabellini 2009-04-09 13:33 ` Stefano Stabellini @ 2009-04-09 15:19 ` Samuel Thibault 2009-04-09 15:38 ` Paul Brook 2009-04-09 15:56 ` Anthony Liguori 2 siblings, 2 replies; 27+ messages in thread From: Samuel Thibault @ 2009-04-09 15:19 UTC (permalink / raw) To: qemu-devel Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit : > >- ide.c:ide_init2 > > > > This buffer is only used when not doing DMA. When doing DMA, we are > able to do zero-copy IO so the alignment of the request depends on how > the guest aligned the request. I suspect you'll find a lot of guests > that, in practice, do not align requests at 4k boundaries. Eeeeerrr, why shoudn't they? Guests usually work on pages, which are already aligned on 4k boundaries. > I don't know what the requirements are for IDE but I would be > surprised if it was 4k. I believe there is no requirement. Samuel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:19 ` Samuel Thibault @ 2009-04-09 15:38 ` Paul Brook 2009-04-09 15:40 ` Samuel Thibault 2009-04-09 15:56 ` Anthony Liguori 1 sibling, 1 reply; 27+ messages in thread From: Paul Brook @ 2009-04-09 15:38 UTC (permalink / raw) To: qemu-devel; +Cc: Samuel Thibault On Thursday 09 April 2009, Samuel Thibault wrote: > Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit : > > >- ide.c:ide_init2 > > > > This buffer is only used when not doing DMA. When doing DMA, we are > > able to do zero-copy IO so the alignment of the request depends on how > > the guest aligned the request. I suspect you'll find a lot of guests > > that, in practice, do not align requests at 4k boundaries. > > Eeeeerrr, why shoudn't they? Guests usually work on pages, which are > already aligned on 4k boundaries. Assuming that page == 4k is going to get you into all sorts of trouble. Paul ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:38 ` Paul Brook @ 2009-04-09 15:40 ` Samuel Thibault 0 siblings, 0 replies; 27+ messages in thread From: Samuel Thibault @ 2009-04-09 15:40 UTC (permalink / raw) To: Paul Brook; +Cc: qemu-devel Paul Brook, le Thu 09 Apr 2009 16:38:51 +0100, a écrit : > On Thursday 09 April 2009, Samuel Thibault wrote: > > Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit : > > > >- ide.c:ide_init2 > > > > > > This buffer is only used when not doing DMA. When doing DMA, we are > > > able to do zero-copy IO so the alignment of the request depends on how > > > the guest aligned the request. I suspect you'll find a lot of guests > > > that, in practice, do not align requests at 4k boundaries. > > > > Eeeeerrr, why shoudn't they? Guests usually work on pages, which are > > already aligned on 4k boundaries. > > Assuming that page == 4k is going to get you into all sorts of trouble. I'm not assuming it. I'm saying that most of the time that's the case, and thus a heuristic that works better when that's the case will get great benefits. Samuel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:19 ` Samuel Thibault 2009-04-09 15:38 ` Paul Brook @ 2009-04-09 15:56 ` Anthony Liguori 2009-04-09 16:19 ` Samuel Thibault 1 sibling, 1 reply; 27+ messages in thread From: Anthony Liguori @ 2009-04-09 15:56 UTC (permalink / raw) To: qemu-devel Samuel Thibault wrote: > Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit : > >>> - ide.c:ide_init2 >>> >>> >> This buffer is only used when not doing DMA. When doing DMA, we are >> able to do zero-copy IO so the alignment of the request depends on how >> the guest aligned the request. I suspect you'll find a lot of guests >> that, in practice, do not align requests at 4k boundaries. >> > > Eeeeerrr, why shoudn't they? Guests usually work on pages, which are > already aligned on 4k boundaries. > Like if you do a 512-byte O_DIRECT write (in userspace) to a 512-byte, but not 4096-byte aligned buffer in Linux. This may not happen all the time, but if you don't handle this in your backend, you'll get a lot of lovely bug reports :-) Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 15:56 ` Anthony Liguori @ 2009-04-09 16:19 ` Samuel Thibault 0 siblings, 0 replies; 27+ messages in thread From: Samuel Thibault @ 2009-04-09 16:19 UTC (permalink / raw) To: qemu-devel Anthony Liguori, le Thu 09 Apr 2009 10:56:27 -0500, a écrit : > Samuel Thibault wrote: > >Anthony Liguori, le Thu 09 Apr 2009 08:11:27 -0500, a écrit : > >>I suspect you'll find a lot of guests that, in practice, do not > >>align requests at 4k boundaries. > > > >Eeeeerrr, why shoudn't they? Guests usually work on pages, which are > >already aligned on 4k boundaries. > > > > Like if you do a 512-byte O_DIRECT write (in userspace) to a 512-byte, > but not 4096-byte aligned buffer in Linux. This may not happen all the > time, but if you don't handle this in your backend, you'll get a lot of > lovely bug reports :-) As I said, the requirement is sector alignment. Yes, if the blkback sector size is greater than 512 bytes we'll have to cope with the userspace buffer not aligned with 512 bytes, by adding a grant to the blkback request. In such a case however, I'd suggest having qemu expose 4k sector virtual disks, so that the guest automaticcally chooses to align to 4k, and thus get full optimization. Samuel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-08 18:53 ` Anthony Liguori 2009-04-09 9:57 ` Stefano Stabellini @ 2009-04-09 10:07 ` Stefano Stabellini 2009-04-09 13:14 ` Anthony Liguori 2009-04-09 13:19 ` Christoph Hellwig 1 sibling, 2 replies; 27+ messages in thread From: Stefano Stabellini @ 2009-04-09 10:07 UTC (permalink / raw) To: qemu-devel@nongnu.org Anthony Liguori wrote: >> If you do not want to do that, another possible solution is to create a >> new function called "qemu_blockalign" that would be implemented as >> qemu_memalign(512, size); > > This is fine, but this is purely an optimization, it cannot be relied > upon in the general case. > If you are OK with this, a simple patch like the following is acceptable, or do you prefer a more formal approach involving a new function in the BlockDriver interface? --- diff --git a/block.c b/block.c index 74d19ad..1fdcfef 100644 --- a/block.c +++ b/block.c @@ -1376,7 +1376,7 @@ static BlockDriverAIOCB *bdrv_aio_rw_vector(BlockDriverState *bs, acb = qemu_aio_get(bs, cb, opaque); acb->is_write = is_write; acb->qiov = qiov; - acb->bounce = qemu_memalign(512, qiov->size); + acb->bounce = qemu_blockalign(qiov->size); if (!acb->bh) acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb); @@ -1626,3 +1626,9 @@ BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs, return drv->bdrv_aio_ioctl(bs, req, buf, cb, opaque); return NULL; } + +void *qemu_blockalign(size_t size) +{ + return qemu_memalign(512, size); +} + diff --git a/block.h b/block.h index ca672a1..dba9f9d 100644 --- a/block.h +++ b/block.h @@ -176,4 +176,6 @@ int bdrv_put_buffer(BlockDriverState *bs, const uint8_t *buf, int bdrv_get_buffer(BlockDriverState *bs, uint8_t *buf, int64_t pos, int size); +void *qemu_blockalign(size_t size); + #endif diff --git a/hw/ide.c b/hw/ide.c index f187546..93b90b8 100644 --- a/hw/ide.c +++ b/hw/ide.c @@ -2788,7 +2788,7 @@ static void ide_init2(IDEState *ide_state, for(i = 0; i < 2; i++) { s = ide_state + i; - s->io_buffer = qemu_memalign(512, IDE_DMA_BUF_SECTORS*512 + 4); + s->io_buffer = qemu_blockalign(IDE_DMA_BUF_SECTORS*512 + 4); if (i == 0) s->bs = hd0; else ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 10:07 ` Stefano Stabellini @ 2009-04-09 13:14 ` Anthony Liguori 2009-04-09 13:30 ` Stefano Stabellini 2009-04-09 13:19 ` Christoph Hellwig 1 sibling, 1 reply; 27+ messages in thread From: Anthony Liguori @ 2009-04-09 13:14 UTC (permalink / raw) To: qemu-devel Stefano Stabellini wrote: > Anthony Liguori wrote: > > >>> If you do not want to do that, another possible solution is to create a >>> new function called "qemu_blockalign" that would be implemented as >>> qemu_memalign(512, size); >>> >> This is fine, but this is purely an optimization, it cannot be relied >> upon in the general case. >> >> > > > > If you are OK with this, a simple patch like the following is > acceptable, or do you prefer a more formal approach involving a new > function in the BlockDriver interface? > Yes, but that's not covering very many of the cases. There are still places where we allocate bounce buffers that aren't with memalign (like in block-qcow2.c). Also, please document what the new function does and when it should be used. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 13:14 ` Anthony Liguori @ 2009-04-09 13:30 ` Stefano Stabellini 0 siblings, 0 replies; 27+ messages in thread From: Stefano Stabellini @ 2009-04-09 13:30 UTC (permalink / raw) To: qemu-devel@nongnu.org Anthony Liguori wrote: > Stefano Stabellini wrote: >> Anthony Liguori wrote: >> >> >>>> If you do not want to do that, another possible solution is to create a >>>> new function called "qemu_blockalign" that would be implemented as >>>> qemu_memalign(512, size); >>>> >>> This is fine, but this is purely an optimization, it cannot be relied >>> upon in the general case. >>> >>> >> >> >> If you are OK with this, a simple patch like the following is >> acceptable, or do you prefer a more formal approach involving a new >> function in the BlockDriver interface? >> > > Yes, but that's not covering very many of the cases. There are still > places where we allocate bounce buffers that aren't with memalign (like > in block-qcow2.c). Also, please document what the new function does and > when it should be used. > Ok, I'll try to find all those places. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 10:07 ` Stefano Stabellini 2009-04-09 13:14 ` Anthony Liguori @ 2009-04-09 13:19 ` Christoph Hellwig 2009-04-09 13:30 ` Stefano Stabellini 1 sibling, 1 reply; 27+ messages in thread From: Christoph Hellwig @ 2009-04-09 13:19 UTC (permalink / raw) To: qemu-devel On Thu, Apr 09, 2009 at 11:07:28AM +0100, Stefano Stabellini wrote: > + acb->bounce = qemu_blockalign(qiov->size); Why don't you pass a BlockDriverState to it so that it automatically does the best alignment for the underlying driver? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 13:19 ` Christoph Hellwig @ 2009-04-09 13:30 ` Stefano Stabellini 2009-04-09 15:31 ` Christoph Hellwig 0 siblings, 1 reply; 27+ messages in thread From: Stefano Stabellini @ 2009-04-09 13:30 UTC (permalink / raw) To: qemu-devel@nongnu.org Christoph Hellwig wrote: > On Thu, Apr 09, 2009 at 11:07:28AM +0100, Stefano Stabellini wrote: >> + acb->bounce = qemu_blockalign(qiov->size); > > Why don't you pass a BlockDriverState to it so that it automatically > does the best alignment for the underlying driver? > This is a good idea. What if I add a new field to BlockDriverState called 'alignment' defaulted to 512? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [Qemu-devel] buffer alignment for block backends 2009-04-09 13:30 ` Stefano Stabellini @ 2009-04-09 15:31 ` Christoph Hellwig 0 siblings, 0 replies; 27+ messages in thread From: Christoph Hellwig @ 2009-04-09 15:31 UTC (permalink / raw) To: qemu-devel On Thu, Apr 09, 2009 at 02:30:42PM +0100, Stefano Stabellini wrote: > Christoph Hellwig wrote: > > > On Thu, Apr 09, 2009 at 11:07:28AM +0100, Stefano Stabellini wrote: > >> + acb->bounce = qemu_blockalign(qiov->size); > > > > Why don't you pass a BlockDriverState to it so that it automatically > > does the best alignment for the underlying driver? > > > > This is a good idea. > > What if I add a new field to BlockDriverState called 'alignment' > defaulted to 512? Let's call it buffer_alignmnet or memory_alignment as we also have an alignment of the actual sector number (doesn't matter currently, but there are devices or device configurations that only support larger than 512 byte sectors) Also please add a large comment describing what it's used for. Right now that's only a hint for the allocations, but I might re-use it in future to move the bouncing on mis-alignmnet into the block layer. ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2009-04-09 17:29 UTC | newest] Thread overview: 27+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-04-08 13:35 [Qemu-devel] buffer alignment for block backends Stefano Stabellini 2009-04-08 18:53 ` Anthony Liguori 2009-04-09 9:57 ` Stefano Stabellini 2009-04-09 13:11 ` Anthony Liguori 2009-04-09 13:30 ` Stefano Stabellini 2009-04-09 13:54 ` Gerd Hoffmann 2009-04-09 15:21 ` Samuel Thibault 2009-04-09 15:46 ` Stefano Stabellini 2009-04-09 15:50 ` Gerd Hoffmann 2009-04-09 16:11 ` Stefano Stabellini 2009-04-09 16:13 ` Samuel Thibault 2009-04-09 16:28 ` Anthony Liguori 2009-04-09 15:55 ` Jamie Lokier 2009-04-09 16:15 ` Samuel Thibault 2009-04-09 17:29 ` Lennart Sorensen 2009-04-09 13:33 ` Stefano Stabellini 2009-04-09 15:19 ` Samuel Thibault 2009-04-09 15:38 ` Paul Brook 2009-04-09 15:40 ` Samuel Thibault 2009-04-09 15:56 ` Anthony Liguori 2009-04-09 16:19 ` Samuel Thibault 2009-04-09 10:07 ` Stefano Stabellini 2009-04-09 13:14 ` Anthony Liguori 2009-04-09 13:30 ` Stefano Stabellini 2009-04-09 13:19 ` Christoph Hellwig 2009-04-09 13:30 ` Stefano Stabellini 2009-04-09 15:31 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).