From mboxrd@z Thu Jan 1 00:00:00 1970 From: Xiao Guangrong Subject: Re: [Qemu-devel] [PATCH 1/3] Mediated device Core driver Date: Mon, 4 Jul 2016 15:27:35 +0800 Message-ID: <577A0FE7.5020202@linux.intel.com> References: <1466440308-4961-1-git-send-email-kwankhede@nvidia.com> <1466440308-4961-2-git-send-email-kwankhede@nvidia.com> <5773D24A.4060005@linux.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Cc: shuai.ruan@intel.com, jike.song@intel.com, kvm@vger.kernel.org, kevin.tian@intel.com, qemu-devel@nongnu.org, zhiyuan.lv@intel.com, bjsdjshi@linux.vnet.ibm.com To: Kirti Wankhede , alex.williamson@redhat.com, pbonzini@redhat.com, kraxel@redhat.com, cjia@nvidia.com Return-path: Received: from mga02.intel.com ([134.134.136.20]:31549 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753151AbcGDHbb (ORCPT ); Mon, 4 Jul 2016 03:31:31 -0400 In-Reply-To: Sender: kvm-owner@vger.kernel.org List-ID: On 07/01/2016 02:51 AM, Kirti Wankhede wrote: > > > On 6/29/2016 7:21 PM, Xiao Guangrong wrote: >> >> >> On 06/21/2016 12:31 AM, Kirti Wankhede wrote: >>> Design for Mediated Device Driver: > ... >>> +static int mdev_add_attribute_group(struct device *dev, >>> + const struct attribute_group **groups) >>> +{ >>> + return sysfs_create_groups(&dev->kobj, groups); >>> +} >>> + >>> +static void mdev_remove_attribute_group(struct device *dev, >>> + const struct attribute_group **groups) >>> +{ >>> + sysfs_remove_groups(&dev->kobj, groups); >>> +} >>> + >> >> better use device_add_groups() / device_remove_groups() instead? >> > > These are not exported from base module. They can't be used here. Er, i did not realize it, sorry. > > >>> +} >>> + >>> +static int mdev_device_create_ops(struct mdev_device *mdev, char >>> *mdev_params) >>> +{ >>> + struct parent_device *parent = mdev->parent; >>> + int ret; >>> + >>> + mutex_lock(&parent->ops_lock); >>> + if (parent->ops->create) { >>> + ret = parent->ops->create(mdev->dev.parent, mdev->uuid, >>> + mdev->instance, mdev_params); >> >> I think it is better if we pass @mdev to this callback, then the parent >> driver >> can do its specified operations and associate it with the instance, >> e.g, via mdev->private. >> > > Yes, actually I was also thinking of changing it to > > - ret = parent->ops->create(mdev->dev.parent, mdev->uuid, > - mdev->instance, mdev_params); > + ret = parent->ops->create(mdev, mdev_params); > Good. :) > >>> +int mdev_register_device(struct device *dev, const struct parent_ops >>> *ops) >>> +{ >>> + int ret = 0; >>> + struct parent_device *parent; >>> + >>> + if (!dev || !ops) >>> + return -EINVAL; >>> + >>> + mutex_lock(&parent_devices.list_lock); >>> + >>> + /* Check for duplicate */ >>> + parent = find_parent_device(dev); >>> + if (parent) { >>> + ret = -EEXIST; >>> + goto add_dev_err; >>> + } >>> + >>> + parent = kzalloc(sizeof(*parent), GFP_KERNEL); >>> + if (!parent) { >>> + ret = -ENOMEM; >>> + goto add_dev_err; >>> + } >>> + >>> + kref_init(&parent->ref); >>> + list_add(&parent->next, &parent_devices.dev_list); >>> + mutex_unlock(&parent_devices.list_lock); >> >> It is not safe as Alex's already pointed it out. >> >>> + >>> + parent->dev = dev; >>> + parent->ops = ops; >>> + mutex_init(&parent->ops_lock); >>> + mutex_init(&parent->mdev_list_lock); >>> + INIT_LIST_HEAD(&parent->mdev_list); >>> + init_waitqueue_head(&parent->release_done); >> >> And no lock to protect these operations. >> > > As I replied to Alex also, yes I'm fixing it. > >>> +void mdev_unregister_device(struct device *dev) >>> +{ >>> + struct parent_device *parent; >>> + struct mdev_device *mdev, *n; >>> + int ret; >>> + >>> + mutex_lock(&parent_devices.list_lock); >>> + parent = find_parent_device(dev); >>> + >>> + if (!parent) { >>> + mutex_unlock(&parent_devices.list_lock); >>> + return; >>> + } >>> + dev_info(dev, "MDEV: Unregistering\n"); >>> + >>> + /* >>> + * Remove parent from the list and remove create and destroy sysfs >>> + * files so that no new mediated device could be created for this >>> parent >>> + */ >>> + list_del(&parent->next); >>> + mdev_remove_sysfs_files(dev); >>> + mutex_unlock(&parent_devices.list_lock); >>> + >> >> find_parent_device() does not increase the refcount of the parent-device, >> after releasing the lock, is it still safe to use the device? >> > > Yes. In mdev_register_device(), kref_init() initialises refcount to 1 > and then when mdev child is created refcount is incremented and on child > mdev destroys refcount is decremented. So when all child mdev are > destroyed, refcount will still be 1 until mdev_unregister_device() is > called. So when no mdev device is created, mdev_register_device() hold > parent's refcount and released from mdev_unregister_device(). > > >>> + mutex_lock(&parent->ops_lock); >>> + mdev_remove_attribute_group(dev, >>> + parent->ops->dev_attr_groups); >> >> Why mdev_remove_sysfs_files() and mdev_remove_attribute_group() >> are protected by different locks? >> > > As mentioned in reply to Alex on another thread, removing these locks. > >>> + >>> +int mdev_device_destroy(struct device *dev, uuid_le uuid, uint32_t >>> instance) >>> +{ >>> + struct mdev_device *mdev; >>> + struct parent_device *parent; >>> + int ret; >>> + >>> + parent = mdev_get_parent_by_dev(dev); >>> + if (!parent) { >>> + ret = -EINVAL; >>> + goto destroy_err; >>> + } >>> + >>> + mdev = find_mdev_device(parent, uuid, instance); >>> + if (!mdev) { >>> + ret = -EINVAL; >>> + goto destroy_err; >>> + } >>> + >>> + ret = mdev_device_destroy_ops(mdev, false); >>> + if (ret) >>> + goto destroy_err; >> >> find_mdev_device() does not hold the refcount of mdev, is it safe? >> > > Yes, this function is just to check duplicate entry or existence of mdev > device. > Now, i am getting more confused, the caller of mdev_device_destroy(), e.g, mdev_destroy_store(), does not hold any lock, however, you was walking the list of parent->mdev_list (calling find_mdev_device()), is it really safe? And how to prevent this scenario? CPU 0 CPU 1 in mdev_device_destroy(): in mdev_unregister_device(): mdev = find_mdev_device(parent, uuid, instance); // no refcount hold mdev_device_destroy_ops(mdev, true); list_del(&mdev->next); mdev_put_device(mdev); // last refcount is gone. mdev = find_mdev_device(parent, uuid, instance); !!!!!! PANIC !!!!!! >>> + >>> + mdev_put_parent(parent); >>> + >> >> The refcount of parent-device is released, you can not continue to >> use it. >> > > Removing these locks. > >>> + mutex_lock(&parent->mdev_list_lock); >>> + list_del(&mdev->next); >>> + mutex_unlock(&parent->mdev_list_lock); >>> + >>> + mdev_put_device(mdev); >>> + return ret; >>> + >>> +destroy_err: >>> + mdev_put_parent(parent); >>> + return ret; >>> +} >>> + >>> + >>> +static struct class mdev_class = { >>> + .name = MDEV_CLASS_NAME, >>> + .owner = THIS_MODULE, >>> + .class_attrs = mdev_class_attrs, >> >> These interfaces, start and shutdown, are based on UUID, how >> about if we want to operate on the specified instance? >> > > Do you mean hot-plug a device? Hot-plug and hot-unplug, yes. > >>> +}; >>> + >>> +static int __init mdev_init(void) >>> +{ >>> + int ret; >>> + >>> + mutex_init(&parent_devices.list_lock); >>> + INIT_LIST_HEAD(&parent_devices.dev_list); >>> + >>> + ret = class_register(&mdev_class); >>> + if (ret) { >>> + pr_err("Failed to register mdev class\n"); >>> + return ret; >>> + } >>> + >>> + ret = mdev_bus_register(); >>> + if (ret) { >>> + pr_err("Failed to register mdev bus\n"); >>> + class_unregister(&mdev_class); >>> + return ret; >>> + } >>> + >>> + return ret; >>> +} >>> + >>> +static void __exit mdev_exit(void) >>> +{ >>> + mdev_bus_unregister(); >>> + class_unregister(&mdev_class); >>> +} >> >> Hmm, how to prevent if there are parent-devices existing >> when the module is being unloaded? >> > > If parent device exits that means that other module is using > mdev_register_device()/mdev_unregister_device() from their module. > 'rmmod mdev' would fail until that module is unloaded. > > # rmmod mdev > rmmod: ERROR: Module mdev is in use by: nvidia > So other module need to explicitly increase the refcount of mdev.ko? > > >>> + >>> +static int uuid_parse(const char *str, uuid_le *uuid) >>> +{ >>> + int i; >>> + >>> + if (strlen(str) < UUID_CHAR_LENGTH) >>> + return -EINVAL; >>> + >>> + for (i = 0; i < UUID_BYTE_LENGTH; i++) { >>> + if (!isxdigit(str[0]) || !isxdigit(str[1])) { >>> + pr_err("%s err", __func__); >>> + return -EINVAL; >>> + } >>> + >>> + uuid->b[i] = (hex_to_bin(str[0]) << 4) | hex_to_bin(str[1]); >>> + str += 2; >>> + if (is_uuid_sep(*str)) >>> + str++; >>> + } >>> + >>> + return 0; >>> +} >>> + >> >> Can we use uuid_le_to_bin()? > > I couldn't find this in kernel? It is in lib/uuid.c, i am using kvm tree.