From: Peter Lieven <pl@kamp.de>
To: Fam Zheng <famz@redhat.com>
Cc: kwolf@redhat.com, pbonzini@redhat.com, stefanha@redhat.com,
ronniesahlberg@gmail.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] block: add native support for NFS
Date: Tue, 17 Dec 2013 11:31:37 +0100 [thread overview]
Message-ID: <52B02809.9020407@kamp.de> (raw)
In-Reply-To: <52B01D75.40404@redhat.com>
On 17.12.2013 10:46, Fam Zheng wrote:
> On 2013年12月17日 17:07, Peter Lieven wrote:
>> On 17.12.2013 10:01, Fam Zheng wrote:
>>> On 2013年12月17日 16:55, Peter Lieven wrote:
>>>> On 17.12.2013 09:51, Fam Zheng wrote:
>>>>> 于2013年 12月17日 星期二 16时46分50秒,Peter Lieven写到:
>>>>>> On 17.12.2013 09:29, Fam Zheng wrote:
>>>>>>> On 2013年12月17日 15:53, Peter Lieven wrote:
>>>>>>>> Hi Fam,
>>>>>>>>
>>>>>>>> On 17.12.2013 05:07, Fam Zheng wrote:
>>>>>>>>> On 2013年12月16日 23:34, Peter Lieven wrote:
>>>>>>>>>> +static int coroutine_fn nfs_co_writev(BlockDriverState *bs,
>>>>>>>>>> + int64_t sector_num, int
>>>>>>>>>> nb_sectors,
>>>>>>>>>> + QEMUIOVector *iov)
>>>>>>>>>> +{
>>>>>>>>>> + nfsclient *client = bs->opaque;
>>>>>>>>>> + struct NFSTask Task;
>>>>>>>>>> + char *buf = NULL;
>>>>>>>>>> +
>>>>>>>>>> + nfs_co_init_task(client, &Task);
>>>>>>>>>> +
>>>>>>>>>> + buf = g_malloc(nb_sectors * BDRV_SECTOR_SIZE);
>>>>>>>>>> + qemu_iovec_to_buf(iov, 0, buf, nb_sectors *
>>>>>>>>>> BDRV_SECTOR_SIZE);
>>>>>>>>>> +
>>>>>>>>>> + if (nfs_pwrite_async(client->context, client->fh,
>>>>>>>>>> + sector_num * BDRV_SECTOR_SIZE,
>>>>>>>>>> + nb_sectors * BDRV_SECTOR_SIZE,
>>>>>>>>>> + buf, nfs_co_generic_cb, &Task) != 0) {
>>>>>>>>>> + g_free(buf);
>>>>>>>>>> + return -EIO;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + while (!Task.complete) {
>>>>>>>>>> + nfs_set_events(client);
>>>>>>>>>> + qemu_coroutine_yield();
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + g_free(buf);
>>>>>>>>>> +
>>>>>>>>>> + if (Task.status != nb_sectors * BDRV_SECTOR_SIZE) {
>>>>>>>>>> + return -EIO;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + bs->total_sectors = MAX(bs->total_sectors, sector_num +
>>>>>>>>>> nb_sectors);
>>>>>>>>>> + client->allocated_file_size = -ENOTSUP;
>>>>>>>>>
>>>>>>>>> Why does allocated_file_size become not supported after a write?
>>>>>>>> I thought that someone would ask this ;-) bdrv_allocated_file_size
>>>>>>>> is only
>>>>>>>> used in image info. I saved some code here implementing an async
>>>>>>>> call.
>>>>>>>> On open I fstat anyway and store that value. For qemu-img info
>>>>>>>> this is
>>>>>>>> sufficient, but the allocated size likely changes after a write.
>>>>>>>> -ENOTSUP
>>>>>>>> is the default if bdrv_allocated_file_size is not implemented.
>>>>>>>
>>>>>>> OK. Please add some comment here.
>>>>>>>
>>>>>>>>>> +static int nfs_file_open_common(BlockDriverState *bs, QDict
>>>>>>>>>> *options, int flags,
>>>>>>>>>> + int open_flags, Error **errp)
>>>>>>>>>> +{
>>>>>>>>>> + nfsclient *client = bs->opaque;
>>>>>>>>>> + const char *filename;
>>>>>>>>>> + int ret = 0;
>>>>>>>>>> + QemuOpts *opts;
>>>>>>>>>> + Error *local_err = NULL;
>>>>>>>>>> + char *server = NULL, *path = NULL, *file = NULL, *strp;
>>>>>>>>>> + struct stat st;
>>>>>>>>>> +
>>>>>>>>>> + opts = qemu_opts_create_nofail(&runtime_opts);
>>>>>>>>>> + qemu_opts_absorb_qdict(opts, options, &local_err);
>>>>>>>>>> + if (error_is_set(&local_err)) {
>>>>>>>>>> + qerror_report_err(local_err);
>>>>>>>>>> + error_free(local_err);
>>>>>>>>>> + ret = -EINVAL;
>>>>>>>>>> + goto fail;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + filename = qemu_opt_get(opts, "filename");
>>>>>>>>>> +
>>>>>>>>>> + client->context = nfs_init_context();
>>>>>>>>>> +
>>>>>>>>>> + if (client->context == NULL) {
>>>>>>>>>> + error_setg(errp, "Failed to init NFS context");
>>>>>>>>>> + ret = -EINVAL;
>>>>>>>>>> + goto fail;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + server = g_strdup(filename + 6);
>>>>>>>>>
>>>>>>>>> Please check the length of filename is longer than 6 before
>>>>>>>>> accessing
>>>>>>>>> filename[6].
>>>>>>>> Good point. I will check for this, but in fact I think it can't
>>>>>>>> happen
>>>>>>>> because we will
>>>>>>>> never end up there if filename does not start with nfs://
>>>>>>>
>>>>>>> True, at least for now, but it doesn't hurt to be defensive,
>>>>>>> especially with strings.
>>>>>>>
>>>>>>>>>
>>>>>>>>>> + if (server[0] == '/' || server[0] == '\0') {
>>>>>>>>>> + error_setg(errp, "Invalid server in URL");
>>>>>>>>>> + ret = -EINVAL;
>>>>>>>>>> + goto fail;
>>>>>>>>>> + }
>>>>>>>>>> + strp = strchr(server, '/');
>>>>>>>>>> + if (strp == NULL) {
>>>>>>>>>> + error_setg(errp, "Invalid URL specified.\n");
>>>>>>>>>> + ret = -EINVAL;
>>>>>>>>>> + goto fail;
>>>>>>>>>> + }
>>>>>>>>>> + path = g_strdup(strp);
>>>>>>>>>> + *strp = 0;
>>>>>>>>>> + strp = strrchr(path, '/');
>>>>>>>>>> + if (strp == NULL) {
>>>>>>>>>> + error_setg(errp, "Invalid URL specified.\n");
>>>>>>>>>> + ret = -EINVAL;
>>>>>>>>>> + goto fail;
>>>>>>>>>> + }
>>>>>>>>>> + file = g_strdup(strp);
>>>>>>>>>> + *strp = 0;
>>>>>>>>>> +
>>>>>>>>>> + if (nfs_mount(client->context, server, path) != 0) {
>>>>>>>>>> + error_setg(errp, "Failed to mount nfs share: %s",
>>>>>>>>>> + nfs_get_error(client->context));
>>>>>>>>>> + ret = -EINVAL;
>>>>>>>>>> + goto fail;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + if (open_flags & O_CREAT) {
>>>>>>>>>> + if (nfs_creat(client->context, file, 0600, &client->fh)
>>>>>>>>>> != 0) {
>>>>>>>>>> + error_setg(errp, "Failed to create file: %s",
>>>>>>>>>> + nfs_get_error(client->context));
>>>>>>>>>> + ret = -EINVAL;
>>>>>>>>>> + goto fail;
>>>>>>>>>> + }
>>>>>>>>>> + } else {
>>>>>>>>>> + open_flags = (flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY;
>>>>>>>>>> + if (nfs_open(client->context, file, open_flags,
>>>>>>>>>> &client->fh)
>>>>>>>>>> != 0) {
>>>>>>>>>> + error_setg(errp, "Failed to open file : %s",
>>>>>>>>>> + nfs_get_error(client->context));
>>>>>>>>>> + ret = -EINVAL;
>>>>>>>>>> + goto fail;
>>>>>>>>>> + }
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + if (nfs_fstat(client->context, client->fh, &st) != 0) {
>>>>>>>>>> + error_setg(errp, "Failed to fstat file: %s",
>>>>>>>>>> + nfs_get_error(client->context));
>>>>>>>>>> + ret = -EIO;
>>>>>>>>>> + goto fail;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + bs->total_sectors = st.st_size / BDRV_SECTOR_SIZE;
>>>>>>>>>
>>>>>>>>> Please use DIV_ROUND_UP(). Otherwise the remainder in last sector
>>>>>>>>> couldn't be read.
>>>>>>>> Will do. Can't it happen that we end up reading unallocated sectors?
>>>>>>>
>>>>>>> Hmm, maybe. Not missing last bytes of unaligned sector is essential
>>>>>>> for VMDK description file. But you are right, please add check code
>>>>>>> and make sure that we don't read beyond EOF as well.
>>>>>> Actually it would like to keep bs->total_sectors = st.st_size /
>>>>>> BDRV_SECTOR_SIZE; for now until I have checked how libnfs and the
>>>>>> whole stack react on beyond EOF reads.
>>>>>>
>>>>>
>>>>> You don't need to care about libnfs' EOF behavior, nfs_pread_async is
>>>>> in bytes granularity, you could just read the last partial sector till
>>>>> EOF and zero padding the buffer.
>>>> isn't this something that is already in bdrv_co_do_readv?
>>>>
>>>> do you have an example of an VMDK with descriptor file for me. I would
>>>> try the implementation then.
>>>>
>>>
>>> You can create a VMDK with description file with:
>>>
>>>> $ qemu-img create -f vmdk -o subformat=monolithicFlat foo.vmdk 1G
>>>> Formatting 'foo.vmdk', fmt=vmdk size=1073741824 compat6=off
>>>> subformat='monolithicFlat' zeroed_grain=off
>>>
>>>> $ cat foo.vmdk
>>>> # Disk DescriptorFile
>>>> version=1
>>>> CID=52b01245
>>>> parentCID=ffffffff
>>>> createType="monolithicFlat"
>>>>
>>>> # Extent description
>>>> RW 2097152 FLAT "foo-flat.vmdk" 0
>>>>
>>>> # The Disk Data Base
>>>> #DDB
>>>>
>>>> ddb.virtualHWVersion = "4"
>>>> ddb.geometry.cylinders = "2080"
>>>> ddb.geometry.heads = "16"
>>>> ddb.geometry.sectors = "63"
>>>> ddb.adapterType = "ide"
>>>
>>> It's a 313 bytes text file which I assume might not work with nfs
>>> backend here. Please have a try.
>>>
>>> Thanks :)
>> It works better than expected. Just creating does not work for obvious
>> reasons, but if I have those files on NFS it works *g*
>>
>> image: nfs://172.21.200.61/vcore-dev-cdrom/foo.vmdk
>> file format: vmdk
>> virtual size: 1.0G (1073741824 bytes)
>> disk size: 1.0G
>> Format specific information:
>> cid: 1387271102
>> parent cid: 4294967295
>> create type: monolithicFlat
>> extents:
>> [0]:
>> virtual size: 1073741824
>> filename: nfs://172.21.200.61/vcore-dev-cdrom/foo-flat.vmdk
>> format: FLAT
>>
>
> Without DIV_ROUND_UP? That would be conditional on zero_beyond_eof. I tried this as well, it's not working if total_sectors is not rounded up:
no, with round up. I've just sent v2 of the patch.
Peter
prev parent reply other threads:[~2013-12-17 10:30 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-16 15:34 [Qemu-devel] [PATCH] block: add native support for NFS Peter Lieven
2013-12-16 16:33 ` ronnie sahlberg
2013-12-16 17:01 ` ronnie sahlberg
2013-12-17 4:07 ` Fam Zheng
2013-12-17 7:53 ` Peter Lieven
2013-12-17 8:29 ` Fam Zheng
2013-12-17 8:46 ` Peter Lieven
2013-12-17 8:51 ` Fam Zheng
2013-12-17 8:55 ` Peter Lieven
2013-12-17 9:01 ` Fam Zheng
2013-12-17 9:07 ` Peter Lieven
2013-12-17 9:46 ` Fam Zheng
2013-12-17 10:31 ` Peter Lieven [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52B02809.9020407@kamp.de \
--to=pl@kamp.de \
--cc=famz@redhat.com \
--cc=kwolf@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=ronniesahlberg@gmail.com \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).