* [Qemu-devel] [PATCH 0/2] Switch raw NBD to byte-based @ 2016-06-24 3:58 Eric Blake 2016-06-24 3:58 ` [Qemu-devel] [PATCH 1/2] nbd: Convert to byte-based interface Eric Blake 2016-06-24 3:58 ` [Qemu-devel] [PATCH 2/2] raw_bsd: " Eric Blake 0 siblings, 2 replies; 6+ messages in thread From: Eric Blake @ 2016-06-24 3:58 UTC (permalink / raw) To: qemu-devel; +Cc: pbonzini, qemu-block, kwolf With these two patches, I'm finally able to run: ./qemu-nbd -f raw -x foo file ./qemu-io -f raw -t none nbd://localhost:10809/foo and get true byte-based access over the wire for operations such as 'r 1 1' or 'w 1 1', rather than RMW sector-aligned access. Depends on these series: v3 Byte-based block limits: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg06983.html v1 Auto-fragment large transactions at the block layer: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg05819.html v1 byte-based block discard: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg06491.html Eric Blake (2): nbd: Convert to byte-based interface raw_bsd: Convert to byte-based interface block/nbd-client.h | 8 ++++---- include/block/nbd.h | 1 - block/nbd-client.c | 30 +++++++++++++++++------------- block/nbd.c | 12 ++++++------ block/raw_bsd.c | 35 +++++++++++++++++------------------ 5 files changed, 44 insertions(+), 42 deletions(-) -- 2.5.5 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 1/2] nbd: Convert to byte-based interface 2016-06-24 3:58 [Qemu-devel] [PATCH 0/2] Switch raw NBD to byte-based Eric Blake @ 2016-06-24 3:58 ` Eric Blake 2016-06-27 12:19 ` Paolo Bonzini 2016-06-24 3:58 ` [Qemu-devel] [PATCH 2/2] raw_bsd: " Eric Blake 1 sibling, 1 reply; 6+ messages in thread From: Eric Blake @ 2016-06-24 3:58 UTC (permalink / raw) To: qemu-devel; +Cc: pbonzini, qemu-block, kwolf, Max Reitz The NBD protocol doesn't have any notion of sectors, so it is a fairly easy convertion to use byte-based read and write. Signed-off-by: Eric Blake <eblake@redhat.com> --- block/nbd-client.h | 8 ++++---- include/block/nbd.h | 1 - block/nbd-client.c | 30 +++++++++++++++++------------- block/nbd.c | 12 ++++++------ 4 files changed, 27 insertions(+), 24 deletions(-) diff --git a/block/nbd-client.h b/block/nbd-client.h index 62dec33..fa9817b 100644 --- a/block/nbd-client.h +++ b/block/nbd-client.h @@ -46,10 +46,10 @@ void nbd_client_close(BlockDriverState *bs); int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count); int nbd_client_co_flush(BlockDriverState *bs); -int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num, - int nb_sectors, QEMUIOVector *qiov, int flags); -int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num, - int nb_sectors, QEMUIOVector *qiov); +int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, QEMUIOVector *qiov, int flags); +int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, QEMUIOVector *qiov, int flags); void nbd_client_detach_aio_context(BlockDriverState *bs); void nbd_client_attach_aio_context(BlockDriverState *bs, diff --git a/include/block/nbd.h b/include/block/nbd.h index 503f514..cb91820 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -77,7 +77,6 @@ enum { /* Maximum size of a single READ/WRITE data buffer */ #define NBD_MAX_BUFFER_SIZE (32 * 1024 * 1024) -#define NBD_MAX_SECTORS (NBD_MAX_BUFFER_SIZE / BDRV_SECTOR_SIZE) /* Maximum size of an export name. The NBD spec requires 256 and * suggests that servers support up to 4096, but we stick to only the diff --git a/block/nbd-client.c b/block/nbd-client.c index 070bb9d..647dedd 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -218,17 +218,20 @@ static void nbd_coroutine_end(NbdClientSession *s, } } -int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num, - int nb_sectors, QEMUIOVector *qiov) +int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, QEMUIOVector *qiov, int flags) { NbdClientSession *client = nbd_get_client_session(bs); - struct nbd_request request = { .type = NBD_CMD_READ }; + struct nbd_request request = { + .type = NBD_CMD_READ, + .from = offset, + .len = bytes, + }; struct nbd_reply reply; ssize_t ret; - assert(nb_sectors <= NBD_MAX_SECTORS); - request.from = sector_num * 512; - request.len = nb_sectors * 512; + assert(bytes <= NBD_MAX_BUFFER_SIZE); + assert(!flags); nbd_coroutine_start(client, &request); ret = nbd_co_send_request(bs, &request, NULL); @@ -239,14 +242,17 @@ int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num, } nbd_coroutine_end(client, &request); return -reply.error; - } -int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num, - int nb_sectors, QEMUIOVector *qiov, int flags) +int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, QEMUIOVector *qiov, int flags) { NbdClientSession *client = nbd_get_client_session(bs); - struct nbd_request request = { .type = NBD_CMD_WRITE }; + struct nbd_request request = { + .type = NBD_CMD_WRITE, + .from = offset, + .len = bytes, + }; struct nbd_reply reply; ssize_t ret; @@ -255,9 +261,7 @@ int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num, request.type |= NBD_CMD_FLAG_FUA; } - assert(nb_sectors <= NBD_MAX_SECTORS); - request.from = sector_num * 512; - request.len = nb_sectors * 512; + assert(bytes <= NBD_MAX_BUFFER_SIZE); nbd_coroutine_start(client, &request); ret = nbd_co_send_request(bs, &request, qiov); diff --git a/block/nbd.c b/block/nbd.c index 42cae0e..8d57220 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -438,8 +438,8 @@ static BlockDriver bdrv_nbd = { .instance_size = sizeof(BDRVNBDState), .bdrv_parse_filename = nbd_parse_filename, .bdrv_file_open = nbd_open, - .bdrv_co_readv = nbd_client_co_readv, - .bdrv_co_writev_flags = nbd_client_co_writev, + .bdrv_co_preadv = nbd_client_co_preadv, + .bdrv_co_pwritev = nbd_client_co_pwritev, .bdrv_close = nbd_close, .bdrv_co_flush_to_os = nbd_co_flush, .bdrv_co_pdiscard = nbd_client_co_pdiscard, @@ -456,8 +456,8 @@ static BlockDriver bdrv_nbd_tcp = { .instance_size = sizeof(BDRVNBDState), .bdrv_parse_filename = nbd_parse_filename, .bdrv_file_open = nbd_open, - .bdrv_co_readv = nbd_client_co_readv, - .bdrv_co_writev_flags = nbd_client_co_writev, + .bdrv_co_preadv = nbd_client_co_preadv, + .bdrv_co_pwritev = nbd_client_co_pwritev, .bdrv_close = nbd_close, .bdrv_co_flush_to_os = nbd_co_flush, .bdrv_co_pdiscard = nbd_client_co_pdiscard, @@ -474,8 +474,8 @@ static BlockDriver bdrv_nbd_unix = { .instance_size = sizeof(BDRVNBDState), .bdrv_parse_filename = nbd_parse_filename, .bdrv_file_open = nbd_open, - .bdrv_co_readv = nbd_client_co_readv, - .bdrv_co_writev_flags = nbd_client_co_writev, + .bdrv_co_preadv = nbd_client_co_preadv, + .bdrv_co_pwritev = nbd_client_co_pwritev, .bdrv_close = nbd_close, .bdrv_co_flush_to_os = nbd_co_flush, .bdrv_co_pdiscard = nbd_client_co_pdiscard, -- 2.5.5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] nbd: Convert to byte-based interface 2016-06-24 3:58 ` [Qemu-devel] [PATCH 1/2] nbd: Convert to byte-based interface Eric Blake @ 2016-06-27 12:19 ` Paolo Bonzini 0 siblings, 0 replies; 6+ messages in thread From: Paolo Bonzini @ 2016-06-27 12:19 UTC (permalink / raw) To: Eric Blake, qemu-devel; +Cc: qemu-block, kwolf, Max Reitz On 24/06/2016 05:58, Eric Blake wrote: > The NBD protocol doesn't have any notion of sectors, so it is > a fairly easy convertion to use byte-based read and write. > > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > block/nbd-client.h | 8 ++++---- > include/block/nbd.h | 1 - > block/nbd-client.c | 30 +++++++++++++++++------------- > block/nbd.c | 12 ++++++------ > 4 files changed, 27 insertions(+), 24 deletions(-) > > diff --git a/block/nbd-client.h b/block/nbd-client.h > index 62dec33..fa9817b 100644 > --- a/block/nbd-client.h > +++ b/block/nbd-client.h > @@ -46,10 +46,10 @@ void nbd_client_close(BlockDriverState *bs); > > int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count); > int nbd_client_co_flush(BlockDriverState *bs); > -int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num, > - int nb_sectors, QEMUIOVector *qiov, int flags); > -int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num, > - int nb_sectors, QEMUIOVector *qiov); > +int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, > + uint64_t bytes, QEMUIOVector *qiov, int flags); > +int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, > + uint64_t bytes, QEMUIOVector *qiov, int flags); > > void nbd_client_detach_aio_context(BlockDriverState *bs); > void nbd_client_attach_aio_context(BlockDriverState *bs, > diff --git a/include/block/nbd.h b/include/block/nbd.h > index 503f514..cb91820 100644 > --- a/include/block/nbd.h > +++ b/include/block/nbd.h > @@ -77,7 +77,6 @@ enum { > > /* Maximum size of a single READ/WRITE data buffer */ > #define NBD_MAX_BUFFER_SIZE (32 * 1024 * 1024) > -#define NBD_MAX_SECTORS (NBD_MAX_BUFFER_SIZE / BDRV_SECTOR_SIZE) > > /* Maximum size of an export name. The NBD spec requires 256 and > * suggests that servers support up to 4096, but we stick to only the > diff --git a/block/nbd-client.c b/block/nbd-client.c > index 070bb9d..647dedd 100644 > --- a/block/nbd-client.c > +++ b/block/nbd-client.c > @@ -218,17 +218,20 @@ static void nbd_coroutine_end(NbdClientSession *s, > } > } > > -int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num, > - int nb_sectors, QEMUIOVector *qiov) > +int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, > + uint64_t bytes, QEMUIOVector *qiov, int flags) > { > NbdClientSession *client = nbd_get_client_session(bs); > - struct nbd_request request = { .type = NBD_CMD_READ }; > + struct nbd_request request = { > + .type = NBD_CMD_READ, > + .from = offset, > + .len = bytes, > + }; > struct nbd_reply reply; > ssize_t ret; > > - assert(nb_sectors <= NBD_MAX_SECTORS); > - request.from = sector_num * 512; > - request.len = nb_sectors * 512; > + assert(bytes <= NBD_MAX_BUFFER_SIZE); > + assert(!flags); > > nbd_coroutine_start(client, &request); > ret = nbd_co_send_request(bs, &request, NULL); > @@ -239,14 +242,17 @@ int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num, > } > nbd_coroutine_end(client, &request); > return -reply.error; > - > } > > -int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num, > - int nb_sectors, QEMUIOVector *qiov, int flags) > +int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, > + uint64_t bytes, QEMUIOVector *qiov, int flags) > { > NbdClientSession *client = nbd_get_client_session(bs); > - struct nbd_request request = { .type = NBD_CMD_WRITE }; > + struct nbd_request request = { > + .type = NBD_CMD_WRITE, > + .from = offset, > + .len = bytes, > + }; > struct nbd_reply reply; > ssize_t ret; > > @@ -255,9 +261,7 @@ int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num, > request.type |= NBD_CMD_FLAG_FUA; > } > > - assert(nb_sectors <= NBD_MAX_SECTORS); > - request.from = sector_num * 512; > - request.len = nb_sectors * 512; > + assert(bytes <= NBD_MAX_BUFFER_SIZE); > > nbd_coroutine_start(client, &request); > ret = nbd_co_send_request(bs, &request, qiov); > diff --git a/block/nbd.c b/block/nbd.c > index 42cae0e..8d57220 100644 > --- a/block/nbd.c > +++ b/block/nbd.c > @@ -438,8 +438,8 @@ static BlockDriver bdrv_nbd = { > .instance_size = sizeof(BDRVNBDState), > .bdrv_parse_filename = nbd_parse_filename, > .bdrv_file_open = nbd_open, > - .bdrv_co_readv = nbd_client_co_readv, > - .bdrv_co_writev_flags = nbd_client_co_writev, > + .bdrv_co_preadv = nbd_client_co_preadv, > + .bdrv_co_pwritev = nbd_client_co_pwritev, > .bdrv_close = nbd_close, > .bdrv_co_flush_to_os = nbd_co_flush, > .bdrv_co_pdiscard = nbd_client_co_pdiscard, > @@ -456,8 +456,8 @@ static BlockDriver bdrv_nbd_tcp = { > .instance_size = sizeof(BDRVNBDState), > .bdrv_parse_filename = nbd_parse_filename, > .bdrv_file_open = nbd_open, > - .bdrv_co_readv = nbd_client_co_readv, > - .bdrv_co_writev_flags = nbd_client_co_writev, > + .bdrv_co_preadv = nbd_client_co_preadv, > + .bdrv_co_pwritev = nbd_client_co_pwritev, > .bdrv_close = nbd_close, > .bdrv_co_flush_to_os = nbd_co_flush, > .bdrv_co_pdiscard = nbd_client_co_pdiscard, > @@ -474,8 +474,8 @@ static BlockDriver bdrv_nbd_unix = { > .instance_size = sizeof(BDRVNBDState), > .bdrv_parse_filename = nbd_parse_filename, > .bdrv_file_open = nbd_open, > - .bdrv_co_readv = nbd_client_co_readv, > - .bdrv_co_writev_flags = nbd_client_co_writev, > + .bdrv_co_preadv = nbd_client_co_preadv, > + .bdrv_co_pwritev = nbd_client_co_pwritev, > .bdrv_close = nbd_close, > .bdrv_co_flush_to_os = nbd_co_flush, > .bdrv_co_pdiscard = nbd_client_co_pdiscard, > Acked-by: Paolo Bonzini <pbonzini@redhat.com> ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 2/2] raw_bsd: Convert to byte-based interface 2016-06-24 3:58 [Qemu-devel] [PATCH 0/2] Switch raw NBD to byte-based Eric Blake 2016-06-24 3:58 ` [Qemu-devel] [PATCH 1/2] nbd: Convert to byte-based interface Eric Blake @ 2016-06-24 3:58 ` Eric Blake 2016-06-27 12:19 ` Paolo Bonzini 1 sibling, 1 reply; 6+ messages in thread From: Eric Blake @ 2016-06-24 3:58 UTC (permalink / raw) To: qemu-devel; +Cc: pbonzini, qemu-block, kwolf, Max Reitz Since the raw format driver is just passing things through, we can do byte-based read and write if the underlying protocol does likewise. There's one tricky part - if we probed the image format, we document that we restrict operations on the initial sector. Rather than trying to handle a read-modify-write on the first sector, it's easiest to just include in our restrictions that partial writes to the first sector are not permitted. Signed-off-by: Eric Blake <eblake@redhat.com> --- block/raw_bsd.c | 35 +++++++++++++++++------------------ 1 file changed, 17 insertions(+), 18 deletions(-) diff --git a/block/raw_bsd.c b/block/raw_bsd.c index 365c38a..a417d9a 100644 --- a/block/raw_bsd.c +++ b/block/raw_bsd.c @@ -50,32 +50,32 @@ static int raw_reopen_prepare(BDRVReopenState *reopen_state, return 0; } -static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num, - int nb_sectors, QEMUIOVector *qiov) +static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, QEMUIOVector *qiov, + int flags) { BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO); - return bdrv_co_readv(bs->file->bs, sector_num, nb_sectors, qiov); + return bdrv_co_preadv(bs->file->bs, offset, bytes, qiov, flags); } -static int coroutine_fn -raw_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors, - QEMUIOVector *qiov, int flags) +static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, QEMUIOVector *qiov, + int flags) { void *buf = NULL; BlockDriver *drv; QEMUIOVector local_qiov; int ret; - if (bs->probed && sector_num == 0) { - /* As long as these conditions are true, we can't get partial writes to - * the probe buffer and can just directly check the request. */ + if (bs->probed && offset < BLOCK_PROBE_BUF_SIZE && bytes) { + /* Handling partial writes would be a pain - so we just + * require that the guest cannot write to the first sector + * without writing the entire sector */ QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512); QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512); - - if (nb_sectors == 0) { - /* qemu_iovec_to_buf() would fail, but we want to return success - * instead of -EINVAL in this case. */ - return 0; + if (offset != 0 || bytes < BLOCK_PROBE_BUF_SIZE) { + ret = -EINVAL; + goto fail; } buf = qemu_try_blockalign(bs->file->bs, 512); @@ -105,8 +105,7 @@ raw_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors, } BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO); - ret = bdrv_co_pwritev(bs->file->bs, sector_num * BDRV_SECTOR_SIZE, - nb_sectors * BDRV_SECTOR_SIZE, qiov, flags); + ret = bdrv_co_pwritev(bs->file->bs, offset, bytes, qiov, flags); fail: if (qiov == &local_qiov) { @@ -240,8 +239,8 @@ BlockDriver bdrv_raw = { .bdrv_open = &raw_open, .bdrv_close = &raw_close, .bdrv_create = &raw_create, - .bdrv_co_readv = &raw_co_readv, - .bdrv_co_writev_flags = &raw_co_writev_flags, + .bdrv_co_preadv = &raw_co_preadv, + .bdrv_co_pwritev = &raw_co_pwritev, .bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes, .bdrv_co_pdiscard = &raw_co_pdiscard, .bdrv_co_get_block_status = &raw_co_get_block_status, -- 2.5.5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] raw_bsd: Convert to byte-based interface 2016-06-24 3:58 ` [Qemu-devel] [PATCH 2/2] raw_bsd: " Eric Blake @ 2016-06-27 12:19 ` Paolo Bonzini 2016-06-27 12:51 ` Eric Blake 0 siblings, 1 reply; 6+ messages in thread From: Paolo Bonzini @ 2016-06-27 12:19 UTC (permalink / raw) To: Eric Blake, qemu-devel; +Cc: qemu-block, kwolf, Max Reitz On 24/06/2016 05:58, Eric Blake wrote: > Since the raw format driver is just passing things through, we can > do byte-based read and write if the underlying protocol does > likewise. > > There's one tricky part - if we probed the image format, we document > that we restrict operations on the initial sector. Rather than > trying to handle a read-modify-write on the first sector, it's > easiest to just include in our restrictions that partial writes to > the first sector are not permitted. Can we just set the alignment to at least 512 if probed? It's the practical guest alignment anyway. Paolo ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] raw_bsd: Convert to byte-based interface 2016-06-27 12:19 ` Paolo Bonzini @ 2016-06-27 12:51 ` Eric Blake 0 siblings, 0 replies; 6+ messages in thread From: Eric Blake @ 2016-06-27 12:51 UTC (permalink / raw) To: Paolo Bonzini, qemu-devel; +Cc: qemu-block, kwolf, Max Reitz [-- Attachment #1: Type: text/plain, Size: 846 bytes --] On 06/27/2016 06:19 AM, Paolo Bonzini wrote: > > > On 24/06/2016 05:58, Eric Blake wrote: >> Since the raw format driver is just passing things through, we can >> do byte-based read and write if the underlying protocol does >> likewise. >> >> There's one tricky part - if we probed the image format, we document >> that we restrict operations on the initial sector. Rather than >> trying to handle a read-modify-write on the first sector, it's >> easiest to just include in our restrictions that partial writes to >> the first sector are not permitted. > > Can we just set the alignment to at least 512 if probed? It's the > practical guest alignment anyway. Interesting idea; should be fairly easy to attempt. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-06-27 12:51 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-06-24 3:58 [Qemu-devel] [PATCH 0/2] Switch raw NBD to byte-based Eric Blake 2016-06-24 3:58 ` [Qemu-devel] [PATCH 1/2] nbd: Convert to byte-based interface Eric Blake 2016-06-27 12:19 ` Paolo Bonzini 2016-06-24 3:58 ` [Qemu-devel] [PATCH 2/2] raw_bsd: " Eric Blake 2016-06-27 12:19 ` Paolo Bonzini 2016-06-27 12:51 ` Eric Blake
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).