From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35904) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VsrvN-00087Y-Ar for qemu-devel@nongnu.org; Tue, 17 Dec 2013 05:30:58 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VsrvE-0003uR-Rn for qemu-devel@nongnu.org; Tue, 17 Dec 2013 05:30:49 -0500 Received: from mx.ipv6.kamp.de ([2a02:248:0:51::16]:46568 helo=mx01.kamp.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VsrvD-0003tz-WA for qemu-devel@nongnu.org; Tue, 17 Dec 2013 05:30:40 -0500 Message-ID: <52B02809.9020407@kamp.de> Date: Tue, 17 Dec 2013 11:31:37 +0100 From: Peter Lieven MIME-Version: 1.0 References: <1387208069-9302-1-git-send-email-pl@kamp.de> <52AFCDF4.9020804@redhat.com> <52B002F2.4060906@kamp.de> <52B00B4F.9060406@redhat.com> <52B00F7A.5030506@kamp.de> <52B010A5.90704@redhat.com> <52B0119F.8010504@kamp.de> <52B012D5.3010802@redhat.com> <52B01455.3050905@kamp.de> <52B01D75.40404@redhat.com> In-Reply-To: <52B01D75.40404@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [Qemu-devel] [PATCH] block: add native support for NFS List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Fam Zheng Cc: kwolf@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, ronniesahlberg@gmail.com, qemu-devel@nongnu.org On 17.12.2013 10:46, Fam Zheng wrote: > On 2013年12月17日 17:07, Peter Lieven wrote: >> On 17.12.2013 10:01, Fam Zheng wrote: >>> On 2013年12月17日 16:55, Peter Lieven wrote: >>>> On 17.12.2013 09:51, Fam Zheng wrote: >>>>> 于2013年 12月17日 星期二 16时46分50秒,Peter Lieven写到: >>>>>> On 17.12.2013 09:29, Fam Zheng wrote: >>>>>>> On 2013年12月17日 15:53, Peter Lieven wrote: >>>>>>>> Hi Fam, >>>>>>>> >>>>>>>> On 17.12.2013 05:07, Fam Zheng wrote: >>>>>>>>> On 2013年12月16日 23:34, Peter Lieven wrote: >>>>>>>>>> +static int coroutine_fn nfs_co_writev(BlockDriverState *bs, >>>>>>>>>> + int64_t sector_num, int >>>>>>>>>> nb_sectors, >>>>>>>>>> + QEMUIOVector *iov) >>>>>>>>>> +{ >>>>>>>>>> + nfsclient *client = bs->opaque; >>>>>>>>>> + struct NFSTask Task; >>>>>>>>>> + char *buf = NULL; >>>>>>>>>> + >>>>>>>>>> + nfs_co_init_task(client, &Task); >>>>>>>>>> + >>>>>>>>>> + buf = g_malloc(nb_sectors * BDRV_SECTOR_SIZE); >>>>>>>>>> + qemu_iovec_to_buf(iov, 0, buf, nb_sectors * >>>>>>>>>> BDRV_SECTOR_SIZE); >>>>>>>>>> + >>>>>>>>>> + if (nfs_pwrite_async(client->context, client->fh, >>>>>>>>>> + sector_num * BDRV_SECTOR_SIZE, >>>>>>>>>> + nb_sectors * BDRV_SECTOR_SIZE, >>>>>>>>>> + buf, nfs_co_generic_cb, &Task) != 0) { >>>>>>>>>> + g_free(buf); >>>>>>>>>> + return -EIO; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + while (!Task.complete) { >>>>>>>>>> + nfs_set_events(client); >>>>>>>>>> + qemu_coroutine_yield(); >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + g_free(buf); >>>>>>>>>> + >>>>>>>>>> + if (Task.status != nb_sectors * BDRV_SECTOR_SIZE) { >>>>>>>>>> + return -EIO; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + bs->total_sectors = MAX(bs->total_sectors, sector_num + >>>>>>>>>> nb_sectors); >>>>>>>>>> + client->allocated_file_size = -ENOTSUP; >>>>>>>>> >>>>>>>>> Why does allocated_file_size become not supported after a write? >>>>>>>> I thought that someone would ask this ;-) bdrv_allocated_file_size >>>>>>>> is only >>>>>>>> used in image info. I saved some code here implementing an async >>>>>>>> call. >>>>>>>> On open I fstat anyway and store that value. For qemu-img info >>>>>>>> this is >>>>>>>> sufficient, but the allocated size likely changes after a write. >>>>>>>> -ENOTSUP >>>>>>>> is the default if bdrv_allocated_file_size is not implemented. >>>>>>> >>>>>>> OK. Please add some comment here. >>>>>>> >>>>>>>>>> +static int nfs_file_open_common(BlockDriverState *bs, QDict >>>>>>>>>> *options, int flags, >>>>>>>>>> + int open_flags, Error **errp) >>>>>>>>>> +{ >>>>>>>>>> + nfsclient *client = bs->opaque; >>>>>>>>>> + const char *filename; >>>>>>>>>> + int ret = 0; >>>>>>>>>> + QemuOpts *opts; >>>>>>>>>> + Error *local_err = NULL; >>>>>>>>>> + char *server = NULL, *path = NULL, *file = NULL, *strp; >>>>>>>>>> + struct stat st; >>>>>>>>>> + >>>>>>>>>> + opts = qemu_opts_create_nofail(&runtime_opts); >>>>>>>>>> + qemu_opts_absorb_qdict(opts, options, &local_err); >>>>>>>>>> + if (error_is_set(&local_err)) { >>>>>>>>>> + qerror_report_err(local_err); >>>>>>>>>> + error_free(local_err); >>>>>>>>>> + ret = -EINVAL; >>>>>>>>>> + goto fail; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + filename = qemu_opt_get(opts, "filename"); >>>>>>>>>> + >>>>>>>>>> + client->context = nfs_init_context(); >>>>>>>>>> + >>>>>>>>>> + if (client->context == NULL) { >>>>>>>>>> + error_setg(errp, "Failed to init NFS context"); >>>>>>>>>> + ret = -EINVAL; >>>>>>>>>> + goto fail; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + server = g_strdup(filename + 6); >>>>>>>>> >>>>>>>>> Please check the length of filename is longer than 6 before >>>>>>>>> accessing >>>>>>>>> filename[6]. >>>>>>>> Good point. I will check for this, but in fact I think it can't >>>>>>>> happen >>>>>>>> because we will >>>>>>>> never end up there if filename does not start with nfs:// >>>>>>> >>>>>>> True, at least for now, but it doesn't hurt to be defensive, >>>>>>> especially with strings. >>>>>>> >>>>>>>>> >>>>>>>>>> + if (server[0] == '/' || server[0] == '\0') { >>>>>>>>>> + error_setg(errp, "Invalid server in URL"); >>>>>>>>>> + ret = -EINVAL; >>>>>>>>>> + goto fail; >>>>>>>>>> + } >>>>>>>>>> + strp = strchr(server, '/'); >>>>>>>>>> + if (strp == NULL) { >>>>>>>>>> + error_setg(errp, "Invalid URL specified.\n"); >>>>>>>>>> + ret = -EINVAL; >>>>>>>>>> + goto fail; >>>>>>>>>> + } >>>>>>>>>> + path = g_strdup(strp); >>>>>>>>>> + *strp = 0; >>>>>>>>>> + strp = strrchr(path, '/'); >>>>>>>>>> + if (strp == NULL) { >>>>>>>>>> + error_setg(errp, "Invalid URL specified.\n"); >>>>>>>>>> + ret = -EINVAL; >>>>>>>>>> + goto fail; >>>>>>>>>> + } >>>>>>>>>> + file = g_strdup(strp); >>>>>>>>>> + *strp = 0; >>>>>>>>>> + >>>>>>>>>> + if (nfs_mount(client->context, server, path) != 0) { >>>>>>>>>> + error_setg(errp, "Failed to mount nfs share: %s", >>>>>>>>>> + nfs_get_error(client->context)); >>>>>>>>>> + ret = -EINVAL; >>>>>>>>>> + goto fail; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + if (open_flags & O_CREAT) { >>>>>>>>>> + if (nfs_creat(client->context, file, 0600, &client->fh) >>>>>>>>>> != 0) { >>>>>>>>>> + error_setg(errp, "Failed to create file: %s", >>>>>>>>>> + nfs_get_error(client->context)); >>>>>>>>>> + ret = -EINVAL; >>>>>>>>>> + goto fail; >>>>>>>>>> + } >>>>>>>>>> + } else { >>>>>>>>>> + open_flags = (flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY; >>>>>>>>>> + if (nfs_open(client->context, file, open_flags, >>>>>>>>>> &client->fh) >>>>>>>>>> != 0) { >>>>>>>>>> + error_setg(errp, "Failed to open file : %s", >>>>>>>>>> + nfs_get_error(client->context)); >>>>>>>>>> + ret = -EINVAL; >>>>>>>>>> + goto fail; >>>>>>>>>> + } >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + if (nfs_fstat(client->context, client->fh, &st) != 0) { >>>>>>>>>> + error_setg(errp, "Failed to fstat file: %s", >>>>>>>>>> + nfs_get_error(client->context)); >>>>>>>>>> + ret = -EIO; >>>>>>>>>> + goto fail; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + bs->total_sectors = st.st_size / BDRV_SECTOR_SIZE; >>>>>>>>> >>>>>>>>> Please use DIV_ROUND_UP(). Otherwise the remainder in last sector >>>>>>>>> couldn't be read. >>>>>>>> Will do. Can't it happen that we end up reading unallocated sectors? >>>>>>> >>>>>>> Hmm, maybe. Not missing last bytes of unaligned sector is essential >>>>>>> for VMDK description file. But you are right, please add check code >>>>>>> and make sure that we don't read beyond EOF as well. >>>>>> Actually it would like to keep bs->total_sectors = st.st_size / >>>>>> BDRV_SECTOR_SIZE; for now until I have checked how libnfs and the >>>>>> whole stack react on beyond EOF reads. >>>>>> >>>>> >>>>> You don't need to care about libnfs' EOF behavior, nfs_pread_async is >>>>> in bytes granularity, you could just read the last partial sector till >>>>> EOF and zero padding the buffer. >>>> isn't this something that is already in bdrv_co_do_readv? >>>> >>>> do you have an example of an VMDK with descriptor file for me. I would >>>> try the implementation then. >>>> >>> >>> You can create a VMDK with description file with: >>> >>>> $ qemu-img create -f vmdk -o subformat=monolithicFlat foo.vmdk 1G >>>> Formatting 'foo.vmdk', fmt=vmdk size=1073741824 compat6=off >>>> subformat='monolithicFlat' zeroed_grain=off >>> >>>> $ cat foo.vmdk >>>> # Disk DescriptorFile >>>> version=1 >>>> CID=52b01245 >>>> parentCID=ffffffff >>>> createType="monolithicFlat" >>>> >>>> # Extent description >>>> RW 2097152 FLAT "foo-flat.vmdk" 0 >>>> >>>> # The Disk Data Base >>>> #DDB >>>> >>>> ddb.virtualHWVersion = "4" >>>> ddb.geometry.cylinders = "2080" >>>> ddb.geometry.heads = "16" >>>> ddb.geometry.sectors = "63" >>>> ddb.adapterType = "ide" >>> >>> It's a 313 bytes text file which I assume might not work with nfs >>> backend here. Please have a try. >>> >>> Thanks :) >> It works better than expected. Just creating does not work for obvious >> reasons, but if I have those files on NFS it works *g* >> >> image: nfs://172.21.200.61/vcore-dev-cdrom/foo.vmdk >> file format: vmdk >> virtual size: 1.0G (1073741824 bytes) >> disk size: 1.0G >> Format specific information: >> cid: 1387271102 >> parent cid: 4294967295 >> create type: monolithicFlat >> extents: >> [0]: >> virtual size: 1073741824 >> filename: nfs://172.21.200.61/vcore-dev-cdrom/foo-flat.vmdk >> format: FLAT >> > > Without DIV_ROUND_UP? That would be conditional on zero_beyond_eof. I tried this as well, it's not working if total_sectors is not rounded up: no, with round up. I've just sent v2 of the patch. Peter