From mboxrd@z Thu Jan 1 00:00:00 1970
References: <1473244860-6072-1-git-send-email-paulinaszubarczyk@gmail.com>
 <1473244860-6072-3-git-send-email-paulinaszubarczyk@gmail.com>
 <57D08E12.4080607@gmail.com>
From: Paulina Szubarczyk
Message-ID: <57D2ABE9.9070800@gmail.com>
Date: Fri, 9 Sep 2016 14:32:41 +0200
MIME-Version: 1.0
In-Reply-To: <57D08E12.4080607@gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v6 2/2] qdisk - hw/block/xen_disk: grant copy
 implementation
To: roger.pau@citrix.com
Cc: Stefano Stabellini, xen-devel@lists.xenproject.org, wei.liu2@citrix.com,
 ian.jackson@eu.citrix.com, david.vrabel@citrix.com, anthony.perard@citrix.com,
 qemu-devel@nongnu.org

On 09/08/2016 12:00 AM, Paulina Szubarczyk wrote:
>
> On 09/07/2016 10:56 PM, Stefano Stabellini wrote:
>> On Wed, 7 Sep 2016, Paulina Szubarczyk wrote:
>>> Copy data operated on during a request from/to local buffers to/from
>>> the grant references.
>>>
>>> Before the grant copy operation, local buffers must be allocated;
>>> this is done by calling ioreq_init_copy_buffers.
>>> For the 'read' operation, the qemu device first performs the read
>>> into the local buffers; on completion, grant copy is called and the
>>> buffers are freed. For the 'write' operation, grant copy is performed
>>> before the qemu device invokes the write.
>>>
>>> A new value 'feature_grant_copy' is added to recognize when the
>>> grant copy operation is supported by a guest.
>>>
>>> Signed-off-by: Paulina Szubarczyk
>>> ---
>>> Changes since v5:
>>> - added checking of every interface in the configure file. Based on
>>>   Roger's comment that xengnttab_map_grant_ref was added prior to
>>>   xengnttab_grant_copy and thus does not need to be checked again
>>>   here, I dropped this check.
>>>
>>
>> Thank you Paulina, the patch is good. Thanks for your work! Sorry for
>> coming in late in the review; I have a couple of minor suggestions
>> below.
>>
> It would not have been possible without all the help I have received.
> I am very grateful for it.
>
>> In addition to Anthony's ack, it would also be nice to have Roger's ack
>> on this patch.

Roger, will you need more time to take a look at the patch, or may I
send a new version with Stefano's comments applied?

>>
>>>  configure                   | 55 ++++++++++++++++
>>>  hw/block/xen_disk.c         | 157 ++++++++++++++++++++++++++++++++++++++++++--
>>>  include/hw/xen/xen_common.h | 14 ++++
>>>  3 files changed, 221 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/configure b/configure
>>> index 4b808f9..3f44d38 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -1956,6 +1956,61 @@ EOF
>>>  /*
>>>   * If we have stable libs the we don't want the libxc compat
>>>   * layers, regardless of what CFLAGS we may have been given.
>>> + *
>>> + * Also, check if xengnttab_grant_copy_segment_t is defined and
>>> + * grant copy operation is implemented.
>>> + */
>>> +#undef XC_WANT_COMPAT_EVTCHN_API
>>> +#undef XC_WANT_COMPAT_GNTTAB_API
>>> +#undef XC_WANT_COMPAT_MAP_FOREIGN_API
>>> +#include <xenctrl.h>
>>> +#include <xenstore.h>
>>> +#include <xenevtchn.h>
>>> +#include <xengnttab.h>
>>> +#include <xenforeignmemory.h>
>>> +#include <stdint.h>
>>> +#include <xen/hvm/hvm_info_table.h>
>>> +#if !defined(HVM_MAX_VCPUS)
>>> +# error HVM_MAX_VCPUS not defined
>>> +#endif
>>> +int main(void) {
>>> +  xc_interface *xc = NULL;
>>> +  xenforeignmemory_handle *xfmem;
>>> +  xenevtchn_handle *xe;
>>> +  xengnttab_handle *xg;
>>> +  xen_domain_handle_t handle;
>>> +  xengnttab_grant_copy_segment_t* seg = NULL;
>>> +
>>> +  xs_daemon_open();
>>> +
>>> +  xc = xc_interface_open(0, 0, 0);
>>> +  xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
>>> +  xc_domain_add_to_physmap(0, 0, XENMAPSPACE_gmfn, 0, 0);
>>> +  xc_hvm_inject_msi(xc, 0, 0xf0000000, 0x00000000);
>>> +  xc_hvm_create_ioreq_server(xc, 0, HVM_IOREQSRV_BUFIOREQ_ATOMIC, NULL);
>>> +  xc_domain_create(xc, 0, handle, 0, NULL, NULL);
>>> +
>>> +  xfmem = xenforeignmemory_open(0, 0);
>>> +  xenforeignmemory_map(xfmem, 0, 0, 0, 0, 0);
>>> +
>>> +  xe = xenevtchn_open(0, 0);
>>> +  xenevtchn_fd(xe);
>>> +
>>> +  xg = xengnttab_open(0, 0);
>>> +  xengnttab_grant_copy(xg, 0, seg);
>>> +
>>> +  return 0;
>>> +}
>>> +EOF
>>> +      compile_prog "" "$xen_libs $xen_stable_libs"
>>> +    then
>>> +    xen_ctrl_version=480
>>> +    xen=yes
>>> +  elif
>>> +      cat > $TMPC <<EOF &&
>>> +/*
>>> + * If we have stable libs the we don't want the libxc compat
>>> + * layers, regardless of what CFLAGS we may have been given.
>>>   */
>>>  #undef XC_WANT_COMPAT_EVTCHN_API
>>>  #undef XC_WANT_COMPAT_GNTTAB_API
>>> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
>>> index 3b8ad33..3739e13 100644
>>> --- a/hw/block/xen_disk.c
>>> +++ b/hw/block/xen_disk.c
>>> @@ -119,6 +119,9 @@ struct XenBlkDev {
>>>      unsigned int        persistent_gnt_count;
>>>      unsigned int        max_grants;
>>>
>>> +    /* Grant copy */
>>> +    gboolean            feature_grant_copy;
>>> +
>>>      /* qemu block driver */
>>>      DriveInfo           *dinfo;
>>>      BlockBackend        *blk;
>>> @@ -489,6 +492,108 @@ static int ioreq_map(struct ioreq *ioreq)
>>>      return 0;
>>>  }
>>>
>>> +#if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 480
>>
>> In general I prefer to avoid this kind of #ifdef in xen_disk.c, but I
>> see that Anthony suggested it for a good reason. The only alternative I
>> can think of would be to introduce two static inline functions in
>> xen_common.h to set a xengnttab_grant_copy_segment_t seg. But this is
>> also OK.
>>
> The functions take as parameters pointers to struct ioreq, defined in
> xen_disk.c, whose members are used to fill the
> xengnttab_grant_copy_segment_t. There would be some overhead in moving
> the functions to the header.
>>
>>> +static void free_buffers(struct ioreq *ioreq)
>>
>> Please name this function ioreq_free_copy_buffers to make it clearer
>> that it has to do with the same buffers initialized below.
>>
>>
>>> +{
>>> +    int i;
>>> +
>>> +    for (i = 0; i < ioreq->v.niov; i++) {
>>> +        ioreq->page[i] = NULL;
>>> +    }
>>> +
>>> +    qemu_vfree(ioreq->pages);
>>> +}
>>> +
>>> +static int ioreq_init_copy_buffers(struct ioreq *ioreq)
>>> +{
>>> +    int i;
>>> +
>>> +    if (ioreq->v.niov == 0) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    ioreq->pages = qemu_memalign(XC_PAGE_SIZE, ioreq->v.niov * XC_PAGE_SIZE);
>>> +
>>> +    for (i = 0; i < ioreq->v.niov; i++) {
>>> +        ioreq->page[i] = ioreq->pages + i * XC_PAGE_SIZE;
>>> +        ioreq->v.iov[i].iov_base = ioreq->page[i];
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int ioreq_copy(struct ioreq *ioreq)
>>
>> Please name this function in a way that makes it clear that it has
>> something to do with grant copies. Like for example ioreq_grant_copy.
>>
> I will change the names of both functions and remove the blank line
> pointed out below.
>>
>>> +{
>>> +    xengnttab_handle *gnt = ioreq->blkdev->xendev.gnttabdev;
>>> +    xengnttab_grant_copy_segment_t segs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
>>> +    int i, count, rc;
>>> +    int64_t file_blk = ioreq->blkdev->file_blk;
>>> +
>>> +    if (ioreq->v.niov == 0) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    count = ioreq->v.niov;
>>> +
>>> +    for (i = 0; i < count; i++) {
>>> +
>>> +        if (ioreq->req.operation == BLKIF_OP_READ) {
>>> +            segs[i].flags = GNTCOPY_dest_gref;
>>> +            segs[i].dest.foreign.ref = ioreq->refs[i];
>>> +            segs[i].dest.foreign.domid = ioreq->domids[i];
>>> +            segs[i].dest.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
>>> +            segs[i].source.virt = ioreq->v.iov[i].iov_base;
>>> +        } else {
>>> +            segs[i].flags = GNTCOPY_source_gref;
>>> +            segs[i].source.foreign.ref = ioreq->refs[i];
>>> +            segs[i].source.foreign.domid = ioreq->domids[i];
>>> +            segs[i].source.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
>>> +            segs[i].dest.virt = ioreq->v.iov[i].iov_base;
>>> +        }
>>> +        segs[i].len = (ioreq->req.seg[i].last_sect
>>> +                       - ioreq->req.seg[i].first_sect + 1) * file_blk;
>>> +
>>> +    }
>>> +
>>> +    rc = xengnttab_grant_copy(gnt, count, segs);
>>> +
>>> +    if (rc) {
>>> +        xen_be_printf(&ioreq->blkdev->xendev, 0,
>>> +                      "failed to copy data %d\n", rc);
>>> +        ioreq->aio_errors++;
>>> +        return -1;
>>> +    }
>>> +
>>> +    for (i = 0; i < count; i++) {
>>> +        if (segs[i].status != GNTST_okay) {
>>> +            xen_be_printf(&ioreq->blkdev->xendev, 3,
>>> +                          "failed to copy data %d for gref %d, domid %d\n",
>>> +                          segs[i].status, ioreq->refs[i], ioreq->domids[i]);
>>> +            ioreq->aio_errors++;
>>> +            rc = -1;
>>> +        }
>>> +    }
>>> +
>>> +    return rc;
>>> +}
>>> +#else
>>> +static void free_buffers(struct ioreq *ioreq)
>>> +{
>>> +    abort();
>>> +}
>>> +
>>> +static int ioreq_init_copy_buffers(struct ioreq *ioreq)
>>> +{
>>> +    abort();
>>> +}
>>> +
>>> +static int ioreq_copy(struct ioreq *ioreq)
>>> +{
>>> +    abort();
>>> +}
>>> +#endif
>>> +
>>>  static int ioreq_runio_qemu_aio(struct ioreq *ioreq);
>>>
>>>  static void qemu_aio_complete(void *opaque, int ret)
>>> @@ -511,8 +616,31 @@ static void qemu_aio_complete(void *opaque, int ret)
>>>          return;
>>>      }
>>>
>>> +    if (ioreq->blkdev->feature_grant_copy) {
>>> +        switch (ioreq->req.operation) {
>>> +        case BLKIF_OP_READ:
>>> +            /* in case of failure ioreq->aio_errors is increased */
>>> +            if (ret == 0) {
>>> +                ioreq_copy(ioreq);
>>> +            }
>>> +            free_buffers(ioreq);
>>> +            break;
>>> +        case BLKIF_OP_WRITE:
>>> +        case BLKIF_OP_FLUSH_DISKCACHE:
>>> +            if (!ioreq->req.nr_segments) {
>>> +                break;
>>> +            }
>>> +            free_buffers(ioreq);
>>> +            break;
>>> +        default:
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>>      ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY;
>>> -    ioreq_unmap(ioreq);
>>> +    if (!ioreq->blkdev->feature_grant_copy) {
>>> +        ioreq_unmap(ioreq);
>>> +    }
>>>      ioreq_finish(ioreq);
>>>      switch (ioreq->req.operation) {
>>>      case BLKIF_OP_WRITE:
>>> @@ -538,8 +666,20 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>>>  {
>>>      struct XenBlkDev *blkdev = ioreq->blkdev;
>>>
>>> -    if (ioreq->req.nr_segments && ioreq_map(ioreq) == -1) {
>>> -        goto err_no_map;
>>> +    if (ioreq->blkdev->feature_grant_copy) {
>>> +        ioreq_init_copy_buffers(ioreq);
>>> +        if (ioreq->req.nr_segments && (ioreq->req.operation == BLKIF_OP_WRITE ||
>>> +            ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE)) {
>>> +            if (ioreq_copy(ioreq)) {
>>> +                free_buffers(ioreq);
>>> +                goto err;
>>> +            }
>>> +        }
>>> +
>>
>> Please remove this blank line (it looks a bit odd)
>>
>>
>>> +    } else {
>>> +        if (ioreq->req.nr_segments && ioreq_map(ioreq)) {
>>> +            goto err;
>>> +        }
>>>      }
>>>
>>>      ioreq->aio_inflight++;
>>> @@ -582,6 +722,9 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>>>      }
>>>      default:
>>>          /* unknown operation (shouldn't happen -- parse catches this) */
>>> +        if (!ioreq->blkdev->feature_grant_copy) {
>>> +            ioreq_unmap(ioreq);
>>> +        }
>>
>> Why are you adding this? Is it a bug fix? If so, please explain in the
>> commit message.
>>
> It is not a bug fix. In ioreq_runio_qemu_aio there were two error
> labels, 'err_no_map' and 'err', each of them used once. So before the
> patch the labels looked like this:
>
> err:
>     ioreq_unmap(ioreq);
> err_no_map:
>     ioreq_finish(ioreq);
>     ioreq->status = BLKIF_RSP_ERROR;
>     return -1;
>
> I removed the 'err_no_map' label, and from the 'err' section I removed
> the ioreq_unmap. The 'err' label was previously used in that default
> section of the switch because grant_map is called regardless of the
> ioreq->req.operation.
>
> In the patch, there is no need for any special behavior for grant_copy,
> because buffers were not allocated in this case, so there is only a
> jump to 'err'.
>
> An advantage of this change is one unified error path for both grant
> operations.
>>
>>>          goto err;
>>>      }
>>>
>>> @@ -590,8 +733,6 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>>>      return 0;
>>>
>>>  err:
>>> -    ioreq_unmap(ioreq);
>>> -err_no_map:
>>>      ioreq_finish(ioreq);
>>>      ioreq->status = BLKIF_RSP_ERROR;
>>>      return -1;
>>> @@ -1032,6 +1173,12 @@ static int blk_connect(struct XenDevice *xendev)
>>>
>>>      xen_be_bind_evtchn(&blkdev->xendev);
>>>
>>> +    blkdev->feature_grant_copy =
>>> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
>>> +
>>> +    xen_be_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
>>> +                  blkdev->feature_grant_copy ? "enabled" : "disabled");
>>> +
>>>      xen_be_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
>>>                    "remote port %d, local port %d\n",
>>>                    blkdev->xendev.protocol, blkdev->ring_ref,
>>> diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
>>> index bd39287..8e1580d 100644
>>> --- a/include/hw/xen/xen_common.h
>>> +++ b/include/hw/xen/xen_common.h
>>> @@ -424,4 +424,18 @@ static inline int xen_domain_create(xc_interface *xc, uint32_t ssidref,
>>>  #endif
>>>  #endif
>>>
>>> +/* Xen before 4.8 */
>>> +
>>> +#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 480
>>> +
>>> +
>>> +typedef void *xengnttab_grant_copy_segment_t;
>>> +
>>> +static inline int xengnttab_grant_copy(xengnttab_handle *xgt, uint32_t count,
>>> +                                       xengnttab_grant_copy_segment_t *segs)
>>> +{
>>> +    return -ENOSYS;
>>> +}
>>> +#endif
>>> +
>>>  #endif /* QEMU_HW_XEN_COMMON_H */
>>> --
>>> 1.9.1
>>>
>
> Paulina

Paulina