* [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
@ 2012-11-21 8:58 Christian Borntraeger
2012-11-21 9:15 ` Kevin Wolf
0 siblings, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2012-11-21 8:58 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Heinz Graalfs, agraf, Christian Borntraeger, jfrei,
Stefan Hajnoczi
From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
While testing IPL code (booting) for s390x we faced some problems
with cache=none on dasds (4k block size) on bdrv_preads with length
values != block size.
This patch makes sure that bdrv_pread and friends work fine with
unaligned access even with cache=none
- propagate alignment value also into bs->file struct
- modify the size in case of no cache to avoid EINVAL on
pread() etc. (file was opened with O_DIRECT).
This patch seems to cure the problems.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
block.c | 3 +++
block/raw-posix.c | 6 ++++++
2 files changed, 9 insertions(+)
diff --git a/block.c b/block.c
index 854ebd6..f23c562 100644
--- a/block.c
+++ b/block.c
@@ -4242,6 +4242,9 @@ BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
void bdrv_set_buffer_alignment(BlockDriverState *bs, int align)
{
bs->buffer_alignment = align;
+ if ((bs->open_flags & BDRV_O_NOCACHE)) {
+ bs->file->buffer_alignment = align;
+ }
}
void *qemu_blockalign(BlockDriverState *bs, size_t size)
diff --git a/block/raw-posix.c b/block/raw-posix.c
index f2f0404..baebf1d 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -700,6 +700,12 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
acb->aio_nbytes = nb_sectors * 512;
acb->aio_offset = sector_num * 512;
+ /* O_DIRECT also requires an aligned length */
+ if (bs->open_flags & BDRV_O_NOCACHE) {
+ acb->aio_nbytes += acb->bs->buffer_alignment - 1;
+ acb->aio_nbytes &= ~(acb->bs->buffer_alignment - 1);
+ }
+
trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
return thread_pool_submit_aio(aio_worker, acb, cb, opaque);
}
--
1.7.10.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-11-21 8:58 [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered Christian Borntraeger
@ 2012-11-21 9:15 ` Kevin Wolf
2012-11-21 10:00 ` Christian Borntraeger
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Kevin Wolf @ 2012-11-21 9:15 UTC (permalink / raw)
To: Christian Borntraeger
Cc: jfrei, Heinz Graalfs, qemu-devel, Stefan Hajnoczi, agraf
Am 21.11.2012 09:58, schrieb Christian Borntraeger:
> From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
>
> While testing IPL code (booting) for s390x we faced some problems
> with cache=none on dasds (4k block size) on bdrv_preads with length
> values != block size.
>
> This patch makes sure that bdrv_pread and friends work fine with
> unaligned access even with cache=none
> - propagate alignment value also into bs->file struct
> - modify the size in case of no cache to avoid EINVAL on
> pread() etc. (file was opened with O_DIRECT).
>
> This patch seems to cure the problems.
>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
> block.c | 3 +++
> block/raw-posix.c | 6 ++++++
> 2 files changed, 9 insertions(+)
>
> diff --git a/block.c b/block.c
> index 854ebd6..f23c562 100644
> --- a/block.c
> +++ b/block.c
> @@ -4242,6 +4242,9 @@ BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
> void bdrv_set_buffer_alignment(BlockDriverState *bs, int align)
> {
> bs->buffer_alignment = align;
> + if ((bs->open_flags & BDRV_O_NOCACHE)) {
> + bs->file->buffer_alignment = align;
> + }
Any reason to restrict this to BDRV_O_NOCACHE?
There have been patches to change the BDRV_O_NOCACHE flag from the
monitor, in which case bdrv_set_buffer_alignment() wouldn't be called
anew and O_DIRECT requests start to fail again.
> }
>
> void *qemu_blockalign(BlockDriverState *bs, size_t size)
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index f2f0404..baebf1d 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -700,6 +700,12 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
> acb->aio_nbytes = nb_sectors * 512;
> acb->aio_offset = sector_num * 512;
>
> + /* O_DIRECT also requires an aligned length */
> + if (bs->open_flags & BDRV_O_NOCACHE) {
> + acb->aio_nbytes += acb->bs->buffer_alignment - 1;
> + acb->aio_nbytes &= ~(acb->bs->buffer_alignment - 1);
> + }
Modifying aio_nbytes, but not the iov looks wrong to me. This may work
in the handle_aiocb_rw_linear() code path, but not with actual vectored I/O.
Kevin
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-11-21 9:15 ` Kevin Wolf
@ 2012-11-21 10:00 ` Christian Borntraeger
2012-11-21 11:24 ` Heinz Graalfs
2012-11-21 16:03 ` Paolo Bonzini
` (2 subsequent siblings)
3 siblings, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2012-11-21 10:00 UTC (permalink / raw)
To: Kevin Wolf, Heinz Graalfs; +Cc: jfrei, qemu-devel, Stefan Hajnoczi, agraf
On 21/11/12 10:15, Kevin Wolf wrote:
> Am 21.11.2012 09:58, schrieb Christian Borntraeger:
>> From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
>>
>> While testing IPL code (booting) for s390x we faced some problems
>> with cache=none on dasds (4k block size) on bdrv_preads with length
>> values != block size.
>>
>> This patch makes sure that bdrv_pread and friends work fine with
>> unaligned access even with cache=none
>> - propagate alignment value also into bs->file struct
>> - modify the size in case of no cache to avoid EINVAL on
>> pread() etc. (file was opened with O_DIRECT).
>>
>> This patch seems to cure the problems.
>>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>> block.c | 3 +++
>> block/raw-posix.c | 6 ++++++
>> 2 files changed, 9 insertions(+)
>>
>> diff --git a/block.c b/block.c
>> index 854ebd6..f23c562 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -4242,6 +4242,9 @@ BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
>> void bdrv_set_buffer_alignment(BlockDriverState *bs, int align)
>> {
>> bs->buffer_alignment = align;
>> + if ((bs->open_flags & BDRV_O_NOCACHE)) {
>> + bs->file->buffer_alignment = align;
>> + }
>
> Any reason to restrict this to BDRV_O_NOCACHE?
>
> There have been patches to change the BDRV_O_NOCACHE flag from the
> monitor, in which case bdrv_set_buffer_alignment() wouldn't be called
> anew and O_DIRECT requests start to fail again.
Right, should be ok to remove the check.
>
>> }
>>
>> void *qemu_blockalign(BlockDriverState *bs, size_t size)
>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>> index f2f0404..baebf1d 100644
>> --- a/block/raw-posix.c
>> +++ b/block/raw-posix.c
>> @@ -700,6 +700,12 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
>> acb->aio_nbytes = nb_sectors * 512;
>> acb->aio_offset = sector_num * 512;
>>
>> + /* O_DIRECT also requires an aligned length */
>> + if (bs->open_flags & BDRV_O_NOCACHE) {
>> + acb->aio_nbytes += acb->bs->buffer_alignment - 1;
>> + acb->aio_nbytes &= ~(acb->bs->buffer_alignment - 1);
>> + }
>
> Modifying aio_nbytes, but not the iov looks wrong to me. This may work
> in the handle_aiocb_rw_linear() code path, but not with actual vectored I/O.
I think it seemed to work because the vectored I/O cases that we were testing were properly
aligned or were in the QEMU_AIO_MISALIGNED case which does bounce buffering anyway.
But I am not sure...
Heinz can you have a look at this and identify the exact place were it was failing
and why this patch helps? (I just know it does). That might help to understand
if we also need to touch the iovs.
Christian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-11-21 10:00 ` Christian Borntraeger
@ 2012-11-21 11:24 ` Heinz Graalfs
0 siblings, 0 replies; 12+ messages in thread
From: Heinz Graalfs @ 2012-11-21 11:24 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Kevin Wolf, jfrei, qemu-devel, Stefan Hajnoczi, agraf
On Wed, 2012-11-21 at 11:00 +0100, Christian Borntraeger wrote:
> On 21/11/12 10:15, Kevin Wolf wrote:
> > Am 21.11.2012 09:58, schrieb Christian Borntraeger:
> >> From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> >>
> >> While testing IPL code (booting) for s390x we faced some problems
> >> with cache=none on dasds (4k block size) on bdrv_preads with length
> >> values != block size.
> >>
> >> This patch makes sure that bdrv_pread and friends work fine with
> >> unaligned access even with cache=none
> >> - propagate alignment value also into bs->file struct
> >> - modify the size in case of no cache to avoid EINVAL on
> >> pread() etc. (file was opened with O_DIRECT).
> >>
> >> This patch seems to cure the problems.
> >>
> >> CC: Kevin Wolf <kwolf@redhat.com>
> >> CC: Stefan Hajnoczi <stefanha@redhat.com>
> >> Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> >> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> >> ---
> >> block.c | 3 +++
> >> block/raw-posix.c | 6 ++++++
> >> 2 files changed, 9 insertions(+)
> >>
> >> diff --git a/block.c b/block.c
> >> index 854ebd6..f23c562 100644
> >> --- a/block.c
> >> +++ b/block.c
> >> @@ -4242,6 +4242,9 @@ BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
> >> void bdrv_set_buffer_alignment(BlockDriverState *bs, int align)
> >> {
> >> bs->buffer_alignment = align;
> >> + if ((bs->open_flags & BDRV_O_NOCACHE)) {
> >> + bs->file->buffer_alignment = align;
> >> + }
> >
> > Any reason to restrict this to BDRV_O_NOCACHE?
> >
> > There have been patches to change the BDRV_O_NOCACHE flag from the
> > monitor, in which case bdrv_set_buffer_alignment() wouldn't be called
> > anew and O_DIRECT requests start to fail again.
>
>
> Right, should be ok to remove the check.
>
>
> >
> >> }
> >>
> >> void *qemu_blockalign(BlockDriverState *bs, size_t size)
> >> diff --git a/block/raw-posix.c b/block/raw-posix.c
> >> index f2f0404..baebf1d 100644
> >> --- a/block/raw-posix.c
> >> +++ b/block/raw-posix.c
> >> @@ -700,6 +700,12 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
> >> acb->aio_nbytes = nb_sectors * 512;
> >> acb->aio_offset = sector_num * 512;
> >>
> >> + /* O_DIRECT also requires an aligned length */
> >> + if (bs->open_flags & BDRV_O_NOCACHE) {
> >> + acb->aio_nbytes += acb->bs->buffer_alignment - 1;
> >> + acb->aio_nbytes &= ~(acb->bs->buffer_alignment - 1);
> >> + }
> >
> > Modifying aio_nbytes, but not the iov looks wrong to me. This may work
> > in the handle_aiocb_rw_linear() code path, but not with actual vectored I/O.
>
> I think it seemed to work because the vectored I/O cases that we were testing were properly
> aligned or were in the QEMU_AIO_MISALIGNED case which does bounce buffering anyway.
> But I am not sure...
>
> Heinz can you have a look at this and identify the exact place were it was failing
> and why this patch helps? (I just know it does). That might help to understand
> if we also need to touch the iovs.
The pread() call in handle_aiocb_rw_linear() is failing with errno 22.
At least for this path the patch ensures that the length is correctly
set. I need to look into the vectored I/O part in more detail.
> Christian
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-11-21 9:15 ` Kevin Wolf
2012-11-21 10:00 ` Christian Borntraeger
@ 2012-11-21 16:03 ` Paolo Bonzini
2012-11-22 12:03 ` Christian Borntraeger
2012-11-23 10:45 ` Heinz Graalfs
2012-12-07 20:26 ` Heinz Graalfs
3 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2012-11-21 16:03 UTC (permalink / raw)
To: Kevin Wolf
Cc: Heinz Graalfs, agraf, qemu-devel, Christian Borntraeger, jfrei,
Stefan Hajnoczi
Il 21/11/2012 10:15, Kevin Wolf ha scritto:
>> > + if ((bs->open_flags & BDRV_O_NOCACHE)) {
>> > + bs->file->buffer_alignment = align;
>> > + }
> Any reason to restrict this to BDRV_O_NOCACHE?
>
> There have been patches to change the BDRV_O_NOCACHE flag from the
> monitor, in which case bdrv_set_buffer_alignment() wouldn't be called
> anew and O_DIRECT requests start to fail again.
>
bdrv_set_buffer_alignment() is completely broken. It should set host
alignment, but in fact it is passed the guest alignment.
In practice, we only support logical_block_size matching the host's or
bigger (which is unsafe due to torn writes, but works).
So I suggest that we just look at writes outside the device models, and
"fix" them to always read a multiple of 4k.
Paolo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-11-21 16:03 ` Paolo Bonzini
@ 2012-11-22 12:03 ` Christian Borntraeger
0 siblings, 0 replies; 12+ messages in thread
From: Christian Borntraeger @ 2012-11-22 12:03 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Kevin Wolf, Heinz Graalfs, agraf, qemu-devel, jfrei,
Stefan Hajnoczi
On 21/11/12 17:03, Paolo Bonzini wrote:
> Il 21/11/2012 10:15, Kevin Wolf ha scritto:
>>>> + if ((bs->open_flags & BDRV_O_NOCACHE)) {
>>>> + bs->file->buffer_alignment = align;
>>>> + }
>> Any reason to restrict this to BDRV_O_NOCACHE?
>>
>> There have been patches to change the BDRV_O_NOCACHE flag from the
>> monitor, in which case bdrv_set_buffer_alignment() wouldn't be called
>> anew and O_DIRECT requests start to fail again.
>>
>
> bdrv_set_buffer_alignment() is completely broken. It should set host
> alignment, but in fact it is passed the guest alignment.
>
> In practice, we only support logical_block_size matching the host's or
> bigger (which is unsafe due to torn writes, but works).
For other reasons (partition table format) we want to have host block
size == guest block size on s390 anyway - so it would not really matter for
us.
But I certainly agree that it makes more sense to use the host block size
for the alignment checks.
> So I suggest that we just look at writes outside the device models, and
> "fix" them to always read a multiple of 4k.
Wouldnt that cause performance regressions for block devices with 512 byte
block size, because we read more than necessary. Wouldnt that also require
read/update/write combinations for valid 512 byte writes?
Christian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-11-21 9:15 ` Kevin Wolf
2012-11-21 10:00 ` Christian Borntraeger
2012-11-21 16:03 ` Paolo Bonzini
@ 2012-11-23 10:45 ` Heinz Graalfs
2012-12-07 20:26 ` Heinz Graalfs
3 siblings, 0 replies; 12+ messages in thread
From: Heinz Graalfs @ 2012-11-23 10:45 UTC (permalink / raw)
To: Kevin Wolf
Cc: Christian Borntraeger, jfrei, qemu-devel, Stefan Hajnoczi, agraf
On Wed, 2012-11-21 at 10:15 +0100, Kevin Wolf wrote:
> Am 21.11.2012 09:58, schrieb Christian Borntraeger:
> > From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> >
> > While testing IPL code (booting) for s390x we faced some problems
> > with cache=none on dasds (4k block size) on bdrv_preads with length
> > values != block size.
> >
> > This patch makes sure that bdrv_pread and friends work fine with
> > unaligned access even with cache=none
> > - propagate alignment value also into bs->file struct
> > - modify the size in case of no cache to avoid EINVAL on
> > pread() etc. (file was opened with O_DIRECT).
> >
> > This patch seems to cure the problems.
> >
> > CC: Kevin Wolf <kwolf@redhat.com>
> > CC: Stefan Hajnoczi <stefanha@redhat.com>
> > Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> > Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> > ---
> > block.c | 3 +++
> > block/raw-posix.c | 6 ++++++
> > 2 files changed, 9 insertions(+)
> >
> > diff --git a/block.c b/block.c
> > index 854ebd6..f23c562 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -4242,6 +4242,9 @@ BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
> > void bdrv_set_buffer_alignment(BlockDriverState *bs, int align)
> > {
> > bs->buffer_alignment = align;
> > + if ((bs->open_flags & BDRV_O_NOCACHE)) {
> > + bs->file->buffer_alignment = align;
> > + }
>
> Any reason to restrict this to BDRV_O_NOCACHE?
>
> There have been patches to change the BDRV_O_NOCACHE flag from the
> monitor, in which case bdrv_set_buffer_alignment() wouldn't be called
> anew and O_DIRECT requests start to fail again.
>
OK
> > }
> >
> > void *qemu_blockalign(BlockDriverState *bs, size_t size)
> > diff --git a/block/raw-posix.c b/block/raw-posix.c
> > index f2f0404..baebf1d 100644
> > --- a/block/raw-posix.c
> > +++ b/block/raw-posix.c
> > @@ -700,6 +700,12 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
> > acb->aio_nbytes = nb_sectors * 512;
> > acb->aio_offset = sector_num * 512;
> >
> > + /* O_DIRECT also requires an aligned length */
> > + if (bs->open_flags & BDRV_O_NOCACHE) {
> > + acb->aio_nbytes += acb->bs->buffer_alignment - 1;
> > + acb->aio_nbytes &= ~(acb->bs->buffer_alignment - 1);
> > + }
>
> Modifying aio_nbytes, but not the iov looks wrong to me. This may work
> in the handle_aiocb_rw_linear() code path, but not with actual vectored I/O.
>
Current coding ensures that read IO buffers always seem to be aligned
correctly. Whereas read length values are not always appropriate for an
O_DIRECT scenario.
For a 2048 formatted disk I verified that
1. non vectored IO - the length needs to be adapted several times,
which is accomplished now by the patch.
2. vectored IO - the qiov's total length is always a multiple of the
logical block size
(which is also verified in virtio_blk_handle_read())
The particular iov length fields are already correctly setup as a
multiple of the logical block size when processed in
virtio_blk_handle_request().
> Kevin
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-11-21 9:15 ` Kevin Wolf
` (2 preceding siblings ...)
2012-11-23 10:45 ` Heinz Graalfs
@ 2012-12-07 20:26 ` Heinz Graalfs
2012-12-10 8:55 ` Kevin Wolf
3 siblings, 1 reply; 12+ messages in thread
From: Heinz Graalfs @ 2012-12-07 20:26 UTC (permalink / raw)
To: Kevin Wolf
Cc: Christian Borntraeger, jfrei, qemu-devel, Stefan Hajnoczi, agraf
Hello Kevin,
I'm resending my answer as of Nov 23rd.
Is this still on your queue?
Heinz
On Wed, 2012-11-21 at 10:15 +0100, Kevin Wolf wrote:
> Am 21.11.2012 09:58, schrieb Christian Borntraeger:
> > From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> >
> > While testing IPL code (booting) for s390x we faced some problems
> > with cache=none on dasds (4k block size) on bdrv_preads with length
> > values != block size.
> >
> > This patch makes sure that bdrv_pread and friends work fine with
> > unaligned access even with cache=none
> > - propagate alignment value also into bs->file struct
> > - modify the size in case of no cache to avoid EINVAL on
> > pread() etc. (file was opened with O_DIRECT).
> >
> > This patch seems to cure the problems.
> >
> > CC: Kevin Wolf <kwolf@redhat.com>
> > CC: Stefan Hajnoczi <stefanha@redhat.com>
> > Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> > Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> > ---
> > block.c | 3 +++
> > block/raw-posix.c | 6 ++++++
> > 2 files changed, 9 insertions(+)
> >
> > diff --git a/block.c b/block.c
> > index 854ebd6..f23c562 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -4242,6 +4242,9 @@ BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
> > void bdrv_set_buffer_alignment(BlockDriverState *bs, int align)
> > {
> > bs->buffer_alignment = align;
> > + if ((bs->open_flags & BDRV_O_NOCACHE)) {
> > + bs->file->buffer_alignment = align;
> > + }
>
> Any reason to restrict this to BDRV_O_NOCACHE?
OK, can be removed
> There have been patches to change the BDRV_O_NOCACHE flag from the
> monitor, in which case bdrv_set_buffer_alignment() wouldn't be called
> anew and O_DIRECT requests start to fail again.
> > }
> >
> > void *qemu_blockalign(BlockDriverState *bs, size_t size)
> > diff --git a/block/raw-posix.c b/block/raw-posix.c
> > index f2f0404..baebf1d 100644
> > --- a/block/raw-posix.c
> > +++ b/block/raw-posix.c
> > @@ -700,6 +700,12 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
> > acb->aio_nbytes = nb_sectors * 512;
> > acb->aio_offset = sector_num * 512;
> >
> > + /* O_DIRECT also requires an aligned length */
> > + if (bs->open_flags & BDRV_O_NOCACHE) {
> > + acb->aio_nbytes += acb->bs->buffer_alignment - 1;
> > + acb->aio_nbytes &= ~(acb->bs->buffer_alignment - 1);
> > + }
>
> Modifying aio_nbytes, but not the iov looks wrong to me. This may work
> in the handle_aiocb_rw_linear() code path, but not with actual vectored I/O.
Current coding ensures that read IO buffers always seem to be aligned
correctly. Whereas read length values are not always appropriate for an
O_DIRECT scenario.
For a 2048 formatted disk I verified that
1. non vectored IO - the length needs to be adapted several times,
which is accomplished now by the patch.
2. vectored IO - the qiov's total length is always a multiple of the
logical block size
(which is also verified in virtio_blk_handle_read())
The particular iov length fields are already correctly setup as a
multiple of the logical block size when processed in
virtio_blk_handle_request().
>
> Kevin
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-12-07 20:26 ` Heinz Graalfs
@ 2012-12-10 8:55 ` Kevin Wolf
2012-12-11 9:58 ` Heinz Graalfs
0 siblings, 1 reply; 12+ messages in thread
From: Kevin Wolf @ 2012-12-10 8:55 UTC (permalink / raw)
To: Heinz Graalfs
Cc: Christian Borntraeger, jfrei, qemu-devel, Stefan Hajnoczi, agraf
Am 07.12.2012 21:26, schrieb Heinz Graalfs:
> Hello Kevin,
>
> I'm resending my answer as of Nov 23rd.
>
> Is this still on your queue?
No, it wasn't. I guess I was waiting for a new version of the patch.
>>> }
>>>
>>> void *qemu_blockalign(BlockDriverState *bs, size_t size)
>>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>>> index f2f0404..baebf1d 100644
>>> --- a/block/raw-posix.c
>>> +++ b/block/raw-posix.c
>>> @@ -700,6 +700,12 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
>>> acb->aio_nbytes = nb_sectors * 512;
>>> acb->aio_offset = sector_num * 512;
>>>
>>> + /* O_DIRECT also requires an aligned length */
>>> + if (bs->open_flags & BDRV_O_NOCACHE) {
>>> + acb->aio_nbytes += acb->bs->buffer_alignment - 1;
>>> + acb->aio_nbytes &= ~(acb->bs->buffer_alignment - 1);
>>> + }
>>
>> Modifying aio_nbytes, but not the iov looks wrong to me. This may work
>> in the handle_aiocb_rw_linear() code path, but not with actual vectored I/O.
>
> Current coding ensures that read IO buffers always seem to be aligned
> correctly. Whereas read length values are not always appropriate for an
> O_DIRECT scenario.
>
> For a 2048 formatted disk I verified that
>
> 1. non vectored IO - the length needs to be adapted several times,
> which is accomplished now by the patch.
>
> 2. vectored IO - the qiov's total length is always a multiple of the
> logical block size
> (which is also verified in virtio_blk_handle_read())
> The particular iov length fields are already correctly setup as a
> multiple of the logical block size when processed in
> virtio_blk_handle_request().
I must admit that I don't quite understand this. As far as I know,
virtio-blk doesn't make any difference between requests with niov = 1
and real vectored requests. So how can the length of the latter always
be right, whereas the length of the former may be wrong?
The other point is that requests may not even be coming from virtio-blk.
They could be made by other device emulations or they could come from a
block job. (They also could be the result of a merge in the block layer,
though if the original requests were aligned, the result will stay aligned)
Kevin
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-12-10 8:55 ` Kevin Wolf
@ 2012-12-11 9:58 ` Heinz Graalfs
2012-12-11 10:30 ` Kevin Wolf
0 siblings, 1 reply; 12+ messages in thread
From: Heinz Graalfs @ 2012-12-11 9:58 UTC (permalink / raw)
To: Kevin Wolf
Cc: Christian Borntraeger, jfrei, qemu-devel, Stefan Hajnoczi, agraf
Hi Kevin,
I'm using the bdrv_pread() function during boot partition detection ...
In detail:
bdrv_pread() is called to read 32 bytes from a 2048 bytes formatted
disk. This results in setting up a read of 512 bytes (1 sector
multiplied by 512 current code in paio_submit()), which is wrong for a
O_DIRECT opened file, and produces the error.
Heinz
On Mon, 2012-12-10 at 09:55 +0100, Kevin Wolf wrote:
> Am 07.12.2012 21:26, schrieb Heinz Graalfs:
> > Hello Kevin,
> >
> > I'm resending my answer as of Nov 23rd.
> >
> > Is this still on your queue?
>
> No, it wasn't. I guess I was waiting for a new version of the patch.
>
> >>> }
> >>>
> >>> void *qemu_blockalign(BlockDriverState *bs, size_t size)
> >>> diff --git a/block/raw-posix.c b/block/raw-posix.c
> >>> index f2f0404..baebf1d 100644
> >>> --- a/block/raw-posix.c
> >>> +++ b/block/raw-posix.c
> >>> @@ -700,6 +700,12 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
> >>> acb->aio_nbytes = nb_sectors * 512;
> >>> acb->aio_offset = sector_num * 512;
> >>>
> >>> + /* O_DIRECT also requires an aligned length */
> >>> + if (bs->open_flags & BDRV_O_NOCACHE) {
> >>> + acb->aio_nbytes += acb->bs->buffer_alignment - 1;
> >>> + acb->aio_nbytes &= ~(acb->bs->buffer_alignment - 1);
> >>> + }
> >>
> >> Modifying aio_nbytes, but not the iov looks wrong to me. This may work
> >> in the handle_aiocb_rw_linear() code path, but not with actual vectored I/O.
> >
> > Current coding ensures that read IO buffers always seem to be aligned
> > correctly. Whereas read length values are not always appropriate for an
> > O_DIRECT scenario.
> >
> > For a 2048 formatted disk I verified that
> >
> > 1. non vectored IO - the length needs to be adapted several times,
> > which is accomplished now by the patch.
> >
> > 2. vectored IO - the qiov's total length is always a multiple of the
> > logical block size
> > (which is also verified in virtio_blk_handle_read())
> > The particular iov length fields are already correctly setup as a
> > multiple of the logical block size when processed in
> > virtio_blk_handle_request().
>
> I must admit that I don't quite understand this. As far as I know,
> virtio-blk doesn't make any difference between requests with niov = 1
> and real vectored requests. So how can the length of the latter always
> be right, whereas the length of the former may be wrong?
>
> The other point is that requests may not even be coming from virtio-blk.
> They could be made by other device emulations or they could come from a
> block job. (They also could be the result of a merge in the block layer,
> though if the original requests were aligned, the result will stay aligned)
>
> Kevin
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-12-11 9:58 ` Heinz Graalfs
@ 2012-12-11 10:30 ` Kevin Wolf
2012-12-11 13:53 ` Heinz Graalfs
0 siblings, 1 reply; 12+ messages in thread
From: Kevin Wolf @ 2012-12-11 10:30 UTC (permalink / raw)
To: Heinz Graalfs
Cc: Christian Borntraeger, jfrei, qemu-devel, Stefan Hajnoczi, agraf
Am 11.12.2012 10:58, schrieb Heinz Graalfs:
> Hi Kevin,
>
> I'm using the bdrv_pread() function during boot partition detection ...
>
> In detail:
> bdrv_pread() is called to read 32 bytes from a 2048 bytes formatted
> disk. This results in setting up a read of 512 bytes (1 sector
> multiplied by 512 current code in paio_submit()), which is wrong for a
> O_DIRECT opened file, and produces the error.
So this sounds like the real problem: bdrv_pread/pwrite assume 512 byte
sectors. May it's better to fix it there instead of just fixing one code
path in one backend.
In any case this patch as submitted is wrong as it overflows the buffer
passed to paio_submit. Test it with this patch:
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -1718,6 +1718,8 @@ static int openfile(char *name, int flags, int
growable)
bs = NULL;
return 1;
}
+
+ bdrv_set_buffer_alignment(bs, 4096);
}
return 0;
$ ./qemu-io -n -c 'read -p 0 512' /tmp/foo
read 512/512 bytes at offset 0
512 bytes, 1 ops; 0.0001 sec (3.727 MiB/sec and 7633.5878 ops/sec)
*** glibc detected *** ./qemu-io: double free or corruption (out):
0x00007fa22349b000 ***
Kevin
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered
2012-12-11 10:30 ` Kevin Wolf
@ 2012-12-11 13:53 ` Heinz Graalfs
0 siblings, 0 replies; 12+ messages in thread
From: Heinz Graalfs @ 2012-12-11 13:53 UTC (permalink / raw)
To: Kevin Wolf
Cc: Christian Borntraeger, jfrei, qemu-devel, Stefan Hajnoczi, agraf
On Tue, 2012-12-11 at 11:30 +0100, Kevin Wolf wrote:
> Am 11.12.2012 10:58, schrieb Heinz Graalfs:
> > Hi Kevin,
> >
> > I'm using the bdrv_pread() function during boot partition detection ...
> >
> > In detail:
> > bdrv_pread() is called to read 32 bytes from a 2048 bytes formatted
> > disk. This results in setting up a read of 512 bytes (1 sector
> > multiplied by 512 current code in paio_submit()), which is wrong for a
> > O_DIRECT opened file, and produces the error.
>
> So this sounds like the real problem: bdrv_pread/pwrite assume 512 byte
> sectors. May it's better to fix it there instead of just fixing one code
> path in one backend.
>
> In any case this patch as submitted is wrong as it overflows the buffer
> passed to paio_submit. Test it with this patch:
>
> --- a/qemu-io.c
> +++ b/qemu-io.c
> @@ -1718,6 +1718,8 @@ static int openfile(char *name, int flags, int
> growable)
> bs = NULL;
> return 1;
> }
> +
> + bdrv_set_buffer_alignment(bs, 4096);
> }
>
> return 0;
>
>
> $ ./qemu-io -n -c 'read -p 0 512' /tmp/foo
> read 512/512 bytes at offset 0
> 512 bytes, 1 ops; 0.0001 sec (3.727 MiB/sec and 7633.5878 ops/sec)
> *** glibc detected *** ./qemu-io: double free or corruption (out):
> 0x00007fa22349b000 ***
>
> Kevin
>
Kevin, I tried your fix and it solves the free error...
Here is what I get:
# lsdasd
Bus-ID Status Name Device Type BlkSz Size Blocks
==============================================================================
0.0.37a1 active dasdb 94:4 ECKD 2048 6162MB 3155355
0.0.37a0 active dasdc 94:8 ECKD 512 3594MB 7362495
# ./qemu-io -c 'read -p 0 512' /dev/disk/by-path/ccw-0.0.37a0
read 512/512 bytes at offset 0
512 bytes, 1 ops; 0.0000 sec (7.512 MiB/sec and 15384.6154 ops/sec)
# ./qemu-io -n -c 'read -p 0 512' /dev/disk/by-path/ccw-0.0.37a0
read 512/512 bytes at offset 0
512 bytes, 1 ops; 0.0005 sec (904.159 KiB/sec and 1808.3183 ops/sec)
# ./qemu-io -c 'read -p 0 512' /dev/disk/by-path/ccw-0.0.37a1
read 512/512 bytes at offset 0
512 bytes, 1 ops; 0.0000 sec (7.288 MiB/sec and 14925.3731 ops/sec)
# ./qemu-io -n -c 'read -p 0 512' /dev/disk/by-path/ccw-0.0.37a1
read failed: Invalid argument
#
Are you going to fix the rest in bdrv_pread/pwrite too?
Heinz
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2012-12-11 13:53 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-21 8:58 [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered Christian Borntraeger
2012-11-21 9:15 ` Kevin Wolf
2012-11-21 10:00 ` Christian Borntraeger
2012-11-21 11:24 ` Heinz Graalfs
2012-11-21 16:03 ` Paolo Bonzini
2012-11-22 12:03 ` Christian Borntraeger
2012-11-23 10:45 ` Heinz Graalfs
2012-12-07 20:26 ` Heinz Graalfs
2012-12-10 8:55 ` Kevin Wolf
2012-12-11 9:58 ` Heinz Graalfs
2012-12-11 10:30 ` Kevin Wolf
2012-12-11 13:53 ` Heinz Graalfs
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).