From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53858) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VsrEw-0005FE-MB for qemu-devel@nongnu.org; Tue, 17 Dec 2013 04:47:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VsrEq-0007nu-FJ for qemu-devel@nongnu.org; Tue, 17 Dec 2013 04:46:58 -0500 Received: from mx1.redhat.com ([209.132.183.28]:15024) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VsrEq-0007nZ-6d for qemu-devel@nongnu.org; Tue, 17 Dec 2013 04:46:52 -0500 Message-ID: <52B01D75.40404@redhat.com> Date: Tue, 17 Dec 2013 17:46:29 +0800 From: Fam Zheng MIME-Version: 1.0 References: <1387208069-9302-1-git-send-email-pl@kamp.de> <52AFCDF4.9020804@redhat.com> <52B002F2.4060906@kamp.de> <52B00B4F.9060406@redhat.com> <52B00F7A.5030506@kamp.de> <52B010A5.90704@redhat.com> <52B0119F.8010504@kamp.de> <52B012D5.3010802@redhat.com> <52B01455.3050905@kamp.de> In-Reply-To: <52B01455.3050905@kamp.de> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH] block: add native support for NFS List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Lieven Cc: kwolf@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, ronniesahlberg@gmail.com, qemu-devel@nongnu.org On 2013=E5=B9=B412=E6=9C=8817=E6=97=A5 17:07, Peter Lieven wrote: > On 17.12.2013 10:01, Fam Zheng wrote: >> On 2013=E5=B9=B412=E6=9C=8817=E6=97=A5 16:55, Peter Lieven wrote: >>> On 17.12.2013 09:51, Fam Zheng wrote: >>>> =E4=BA=8E2013=E5=B9=B4 12=E6=9C=8817=E6=97=A5 =E6=98=9F=E6=9C=9F=E4=BA= =8C 16=E6=97=B646=E5=88=8650=E7=A7=92,Peter Lieven=E5=86=99=E5=88=B0: >>>>> On 17.12.2013 09:29, Fam Zheng wrote: >>>>>> On 2013=E5=B9=B412=E6=9C=8817=E6=97=A5 15:53, Peter Lieven wrote: >>>>>>> Hi Fam, >>>>>>> >>>>>>> On 17.12.2013 05:07, Fam Zheng wrote: >>>>>>>> On 2013=E5=B9=B412=E6=9C=8816=E6=97=A5 23:34, Peter Lieven wrote= : >>>>>>>>> +static int coroutine_fn nfs_co_writev(BlockDriverState *bs, >>>>>>>>> + int64_t sector_num, in= t >>>>>>>>> nb_sectors, >>>>>>>>> + QEMUIOVector *iov) >>>>>>>>> +{ >>>>>>>>> + nfsclient *client =3D bs->opaque; >>>>>>>>> + struct NFSTask Task; >>>>>>>>> + char *buf =3D NULL; >>>>>>>>> + >>>>>>>>> + nfs_co_init_task(client, &Task); >>>>>>>>> + >>>>>>>>> + buf =3D g_malloc(nb_sectors * BDRV_SECTOR_SIZE); >>>>>>>>> + qemu_iovec_to_buf(iov, 0, buf, nb_sectors * >>>>>>>>> BDRV_SECTOR_SIZE); >>>>>>>>> + >>>>>>>>> + if (nfs_pwrite_async(client->context, client->fh, >>>>>>>>> + sector_num * BDRV_SECTOR_SIZE, >>>>>>>>> + nb_sectors * BDRV_SECTOR_SIZE, >>>>>>>>> + buf, nfs_co_generic_cb, &Task) !=3D 0= ) { >>>>>>>>> + g_free(buf); >>>>>>>>> + return -EIO; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + while (!Task.complete) { >>>>>>>>> + nfs_set_events(client); >>>>>>>>> + qemu_coroutine_yield(); >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + g_free(buf); >>>>>>>>> + >>>>>>>>> + if (Task.status !=3D nb_sectors * BDRV_SECTOR_SIZE) { >>>>>>>>> + return -EIO; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + bs->total_sectors =3D MAX(bs->total_sectors, sector_num + >>>>>>>>> nb_sectors); >>>>>>>>> + client->allocated_file_size =3D -ENOTSUP; >>>>>>>> >>>>>>>> Why does allocated_file_size become not supported after a write? >>>>>>> I thought that someone would ask this ;-) bdrv_allocated_file_siz= e >>>>>>> is only >>>>>>> used in image info. I saved some code here implementing an async >>>>>>> call. >>>>>>> On open I fstat anyway and store that value. For qemu-img info >>>>>>> this is >>>>>>> sufficient, but the allocated size likely changes after a write. >>>>>>> -ENOTSUP >>>>>>> is the default if bdrv_allocated_file_size is not implemented. >>>>>> >>>>>> OK. Please add some comment here. >>>>>> >>>>>>>>> +static int nfs_file_open_common(BlockDriverState *bs, QDict >>>>>>>>> *options, int flags, >>>>>>>>> + int open_flags, Error **errp) >>>>>>>>> +{ >>>>>>>>> + nfsclient *client =3D bs->opaque; >>>>>>>>> + const char *filename; >>>>>>>>> + int ret =3D 0; >>>>>>>>> + QemuOpts *opts; >>>>>>>>> + Error *local_err =3D NULL; >>>>>>>>> + char *server =3D NULL, *path =3D NULL, *file =3D NULL, *st= rp; >>>>>>>>> + struct stat st; >>>>>>>>> + >>>>>>>>> + opts =3D qemu_opts_create_nofail(&runtime_opts); >>>>>>>>> + qemu_opts_absorb_qdict(opts, options, &local_err); >>>>>>>>> + if (error_is_set(&local_err)) { >>>>>>>>> + qerror_report_err(local_err); >>>>>>>>> + error_free(local_err); >>>>>>>>> + ret =3D -EINVAL; >>>>>>>>> + goto fail; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + filename =3D qemu_opt_get(opts, "filename"); >>>>>>>>> + >>>>>>>>> + client->context =3D nfs_init_context(); >>>>>>>>> + >>>>>>>>> + if (client->context =3D=3D NULL) { >>>>>>>>> + error_setg(errp, "Failed to init NFS context"); >>>>>>>>> + ret =3D -EINVAL; >>>>>>>>> + goto fail; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + server =3D g_strdup(filename + 6); >>>>>>>> >>>>>>>> Please check the length of filename is longer than 6 before >>>>>>>> accessing >>>>>>>> filename[6]. >>>>>>> Good point. I will check for this, but in fact I think it can't >>>>>>> happen >>>>>>> because we will >>>>>>> never end up there if filename does not start with nfs:// >>>>>> >>>>>> True, at least for now, but it doesn't hurt to be defensive, >>>>>> especially with strings. >>>>>> >>>>>>>> >>>>>>>>> + if (server[0] =3D=3D '/' || server[0] =3D=3D '\0') { >>>>>>>>> + error_setg(errp, "Invalid server in URL"); >>>>>>>>> + ret =3D -EINVAL; >>>>>>>>> + goto fail; >>>>>>>>> + } >>>>>>>>> + strp =3D strchr(server, '/'); >>>>>>>>> + if (strp =3D=3D NULL) { >>>>>>>>> + error_setg(errp, "Invalid URL specified.\n"); >>>>>>>>> + ret =3D -EINVAL; >>>>>>>>> + goto fail; >>>>>>>>> + } >>>>>>>>> + path =3D g_strdup(strp); >>>>>>>>> + *strp =3D 0; >>>>>>>>> + strp =3D strrchr(path, '/'); >>>>>>>>> + if (strp =3D=3D NULL) { >>>>>>>>> + error_setg(errp, "Invalid URL specified.\n"); >>>>>>>>> + ret =3D -EINVAL; >>>>>>>>> + goto fail; >>>>>>>>> + } >>>>>>>>> + file =3D g_strdup(strp); >>>>>>>>> + *strp =3D 0; >>>>>>>>> + >>>>>>>>> + if (nfs_mount(client->context, server, path) !=3D 0) { >>>>>>>>> + error_setg(errp, "Failed to mount nfs share: %s", >>>>>>>>> + nfs_get_error(client->context)); >>>>>>>>> + ret =3D -EINVAL; >>>>>>>>> + goto fail; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + if (open_flags & O_CREAT) { >>>>>>>>> + if (nfs_creat(client->context, file, 0600, &client->fh= ) >>>>>>>>> !=3D 0) { >>>>>>>>> + error_setg(errp, "Failed to create file: %s", >>>>>>>>> + nfs_get_error(client->context)); >>>>>>>>> + ret =3D -EINVAL; >>>>>>>>> + goto fail; >>>>>>>>> + } >>>>>>>>> + } else { >>>>>>>>> + open_flags =3D (flags & BDRV_O_RDWR) ? O_RDWR : O_RDON= LY; >>>>>>>>> + if (nfs_open(client->context, file, open_flags, >>>>>>>>> &client->fh) >>>>>>>>> !=3D 0) { >>>>>>>>> + error_setg(errp, "Failed to open file : %s", >>>>>>>>> + nfs_get_error(client->context)); >>>>>>>>> + ret =3D -EINVAL; >>>>>>>>> + goto fail; >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + if (nfs_fstat(client->context, client->fh, &st) !=3D 0) { >>>>>>>>> + error_setg(errp, "Failed to fstat file: %s", >>>>>>>>> + nfs_get_error(client->context)); >>>>>>>>> + ret =3D -EIO; >>>>>>>>> + goto fail; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + bs->total_sectors =3D st.st_size / BDRV_SECTOR_SIZE; >>>>>>>> >>>>>>>> Please use DIV_ROUND_UP(). Otherwise the remainder in last secto= r >>>>>>>> couldn't be read. >>>>>>> Will do. Can't it happen that we end up reading unallocated secto= rs? >>>>>> >>>>>> Hmm, maybe. Not missing last bytes of unaligned sector is essentia= l >>>>>> for VMDK description file. But you are right, please add check cod= e >>>>>> and make sure that we don't read beyond EOF as well. >>>>> Actually it would like to keep bs->total_sectors =3D st.st_size / >>>>> BDRV_SECTOR_SIZE; for now until I have checked how libnfs and the >>>>> whole stack react on beyond EOF reads. >>>>> >>>> >>>> You don't need to care about libnfs' EOF behavior, nfs_pread_async i= s >>>> in bytes granularity, you could just read the last partial sector ti= ll >>>> EOF and zero padding the buffer. >>> isn't this something that is already in bdrv_co_do_readv? >>> >>> do you have an example of an VMDK with descriptor file for me. I woul= d >>> try the implementation then. >>> >> >> You can create a VMDK with description file with: >> >>> $ qemu-img create -f vmdk -o subformat=3DmonolithicFlat foo.vmdk 1G >>> Formatting 'foo.vmdk', fmt=3Dvmdk size=3D1073741824 compat6=3Doff >>> subformat=3D'monolithicFlat' zeroed_grain=3Doff >> >>> $ cat foo.vmdk >>> # Disk DescriptorFile >>> version=3D1 >>> CID=3D52b01245 >>> parentCID=3Dffffffff >>> createType=3D"monolithicFlat" >>> >>> # Extent description >>> RW 2097152 FLAT "foo-flat.vmdk" 0 >>> >>> # The Disk Data Base >>> #DDB >>> >>> ddb.virtualHWVersion =3D "4" >>> ddb.geometry.cylinders =3D "2080" >>> ddb.geometry.heads =3D "16" >>> ddb.geometry.sectors =3D "63" >>> ddb.adapterType =3D "ide" >> >> It's a 313 bytes text file which I assume might not work with nfs >> backend here. Please have a try. >> >> Thanks :) > It works better than expected. Just creating does not work for obvious > reasons, but if I have those files on NFS it works *g* > > image: nfs://172.21.200.61/vcore-dev-cdrom/foo.vmdk > file format: vmdk > virtual size: 1.0G (1073741824 bytes) > disk size: 1.0G > Format specific information: > cid: 1387271102 > parent cid: 4294967295 > create type: monolithicFlat > extents: > [0]: > virtual size: 1073741824 > filename: nfs://172.21.200.61/vcore-dev-cdrom/foo-flat.vmd= k > format: FLAT > Without DIV_ROUND_UP? That would be conditional on zero_beyond_eof. I=20 tried this as well, it's not working if total_sectors is not rounded up: > $ sudo ./qemu-img info nfs://127.0.0.1/stor/vmdk/foo.vmdk > image: nfs://127.0.0.1/stor/vmdk/foo.vmdk > file format: raw > virtual size: 0 (0 bytes) > disk size: 0 Fam