From: Xiao Guangrong <guangrong.xiao@linux.intel.com>
To: Kirti Wankhede <kwankhede@nvidia.com>,
alex.williamson@redhat.com, pbonzini@redhat.com,
kraxel@redhat.com, cjia@nvidia.com
Cc: shuai.ruan@intel.com, jike.song@intel.com, kvm@vger.kernel.org,
kevin.tian@intel.com, qemu-devel@nongnu.org,
zhiyuan.lv@intel.com, bjsdjshi@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH 1/3] Mediated device Core driver
Date: Mon, 4 Jul 2016 15:27:35 +0800 [thread overview]
Message-ID: <577A0FE7.5020202@linux.intel.com> (raw)
In-Reply-To: <ff274e7b-7cdc-cc34-3b18-913072ab545e@nvidia.com>
On 07/01/2016 02:51 AM, Kirti Wankhede wrote:
>
>
> On 6/29/2016 7:21 PM, Xiao Guangrong wrote:
>>
>>
>> On 06/21/2016 12:31 AM, Kirti Wankhede wrote:
>>> Design for Mediated Device Driver:
> ...
>>> +static int mdev_add_attribute_group(struct device *dev,
>>> + const struct attribute_group **groups)
>>> +{
>>> + return sysfs_create_groups(&dev->kobj, groups);
>>> +}
>>> +
>>> +static void mdev_remove_attribute_group(struct device *dev,
>>> + const struct attribute_group **groups)
>>> +{
>>> + sysfs_remove_groups(&dev->kobj, groups);
>>> +}
>>> +
>>
>> better use device_add_groups() / device_remove_groups() instead?
>>
>
> These are not exported from base module. They can't be used here.
Er, i did not realize it, sorry.
>
>
>>> +}
>>> +
>>> +static int mdev_device_create_ops(struct mdev_device *mdev, char
>>> *mdev_params)
>>> +{
>>> + struct parent_device *parent = mdev->parent;
>>> + int ret;
>>> +
>>> + mutex_lock(&parent->ops_lock);
>>> + if (parent->ops->create) {
>>> + ret = parent->ops->create(mdev->dev.parent, mdev->uuid,
>>> + mdev->instance, mdev_params);
>>
>> I think it is better if we pass @mdev to this callback, then the parent
>> driver
>> can do its specified operations and associate it with the instance,
>> e.g, via mdev->private.
>>
>
> Yes, actually I was also thinking of changing it to
>
> - ret = parent->ops->create(mdev->dev.parent, mdev->uuid,
> - mdev->instance, mdev_params);
> + ret = parent->ops->create(mdev, mdev_params);
>
Good. :)
>
>>> +int mdev_register_device(struct device *dev, const struct parent_ops
>>> *ops)
>>> +{
>>> + int ret = 0;
>>> + struct parent_device *parent;
>>> +
>>> + if (!dev || !ops)
>>> + return -EINVAL;
>>> +
>>> + mutex_lock(&parent_devices.list_lock);
>>> +
>>> + /* Check for duplicate */
>>> + parent = find_parent_device(dev);
>>> + if (parent) {
>>> + ret = -EEXIST;
>>> + goto add_dev_err;
>>> + }
>>> +
>>> + parent = kzalloc(sizeof(*parent), GFP_KERNEL);
>>> + if (!parent) {
>>> + ret = -ENOMEM;
>>> + goto add_dev_err;
>>> + }
>>> +
>>> + kref_init(&parent->ref);
>>> + list_add(&parent->next, &parent_devices.dev_list);
>>> + mutex_unlock(&parent_devices.list_lock);
>>
>> It is not safe as Alex's already pointed it out.
>>
>>> +
>>> + parent->dev = dev;
>>> + parent->ops = ops;
>>> + mutex_init(&parent->ops_lock);
>>> + mutex_init(&parent->mdev_list_lock);
>>> + INIT_LIST_HEAD(&parent->mdev_list);
>>> + init_waitqueue_head(&parent->release_done);
>>
>> And no lock to protect these operations.
>>
>
> As I replied to Alex also, yes I'm fixing it.
>
>>> +void mdev_unregister_device(struct device *dev)
>>> +{
>>> + struct parent_device *parent;
>>> + struct mdev_device *mdev, *n;
>>> + int ret;
>>> +
>>> + mutex_lock(&parent_devices.list_lock);
>>> + parent = find_parent_device(dev);
>>> +
>>> + if (!parent) {
>>> + mutex_unlock(&parent_devices.list_lock);
>>> + return;
>>> + }
>>> + dev_info(dev, "MDEV: Unregistering\n");
>>> +
>>> + /*
>>> + * Remove parent from the list and remove create and destroy sysfs
>>> + * files so that no new mediated device could be created for this
>>> parent
>>> + */
>>> + list_del(&parent->next);
>>> + mdev_remove_sysfs_files(dev);
>>> + mutex_unlock(&parent_devices.list_lock);
>>> +
>>
>> find_parent_device() does not increase the refcount of the parent-device,
>> after releasing the lock, is it still safe to use the device?
>>
>
> Yes. In mdev_register_device(), kref_init() initialises refcount to 1
> and then when mdev child is created refcount is incremented and on child
> mdev destroys refcount is decremented. So when all child mdev are
> destroyed, refcount will still be 1 until mdev_unregister_device() is
> called. So when no mdev device is created, mdev_register_device() hold
> parent's refcount and released from mdev_unregister_device().
>
>
>>> + mutex_lock(&parent->ops_lock);
>>> + mdev_remove_attribute_group(dev,
>>> + parent->ops->dev_attr_groups);
>>
>> Why mdev_remove_sysfs_files() and mdev_remove_attribute_group()
>> are protected by different locks?
>>
>
> As mentioned in reply to Alex on another thread, removing these locks.
>
>>> +
>>> +int mdev_device_destroy(struct device *dev, uuid_le uuid, uint32_t
>>> instance)
>>> +{
>>> + struct mdev_device *mdev;
>>> + struct parent_device *parent;
>>> + int ret;
>>> +
>>> + parent = mdev_get_parent_by_dev(dev);
>>> + if (!parent) {
>>> + ret = -EINVAL;
>>> + goto destroy_err;
>>> + }
>>> +
>>> + mdev = find_mdev_device(parent, uuid, instance);
>>> + if (!mdev) {
>>> + ret = -EINVAL;
>>> + goto destroy_err;
>>> + }
>>> +
>>> + ret = mdev_device_destroy_ops(mdev, false);
>>> + if (ret)
>>> + goto destroy_err;
>>
>> find_mdev_device() does not hold the refcount of mdev, is it safe?
>>
>
> Yes, this function is just to check duplicate entry or existence of mdev
> device.
>
Now, i am getting more confused, the caller of mdev_device_destroy(), e.g,
mdev_destroy_store(), does not hold any lock, however, you was walking
the list of parent->mdev_list (calling find_mdev_device()), is it really safe?
And how to prevent this scenario?
CPU 0 CPU 1
in mdev_device_destroy(): in mdev_unregister_device():
mdev = find_mdev_device(parent, uuid, instance); // no refcount hold
mdev_device_destroy_ops(mdev, true);
list_del(&mdev->next);
mdev_put_device(mdev); // last refcount is gone.
mdev = find_mdev_device(parent, uuid, instance); !!!!!! PANIC !!!!!!
>>> +
>>> + mdev_put_parent(parent);
>>> +
>>
>> The refcount of parent-device is released, you can not continue to
>> use it.
>>
>
> Removing these locks.
>
>>> + mutex_lock(&parent->mdev_list_lock);
>>> + list_del(&mdev->next);
>>> + mutex_unlock(&parent->mdev_list_lock);
>>> +
>>> + mdev_put_device(mdev);
>>> + return ret;
>>> +
>>> +destroy_err:
>>> + mdev_put_parent(parent);
>>> + return ret;
>>> +}
>>> +
>>> +
>>> +static struct class mdev_class = {
>>> + .name = MDEV_CLASS_NAME,
>>> + .owner = THIS_MODULE,
>>> + .class_attrs = mdev_class_attrs,
>>
>> These interfaces, start and shutdown, are based on UUID, how
>> about if we want to operate on the specified instance?
>>
>
> Do you mean hot-plug a device?
Hot-plug and hot-unplug, yes.
>
>>> +};
>>> +
>>> +static int __init mdev_init(void)
>>> +{
>>> + int ret;
>>> +
>>> + mutex_init(&parent_devices.list_lock);
>>> + INIT_LIST_HEAD(&parent_devices.dev_list);
>>> +
>>> + ret = class_register(&mdev_class);
>>> + if (ret) {
>>> + pr_err("Failed to register mdev class\n");
>>> + return ret;
>>> + }
>>> +
>>> + ret = mdev_bus_register();
>>> + if (ret) {
>>> + pr_err("Failed to register mdev bus\n");
>>> + class_unregister(&mdev_class);
>>> + return ret;
>>> + }
>>> +
>>> + return ret;
>>> +}
>>> +
>>> +static void __exit mdev_exit(void)
>>> +{
>>> + mdev_bus_unregister();
>>> + class_unregister(&mdev_class);
>>> +}
>>
>> Hmm, how to prevent if there are parent-devices existing
>> when the module is being unloaded?
>>
>
> If parent device exits that means that other module is using
> mdev_register_device()/mdev_unregister_device() from their module.
> 'rmmod mdev' would fail until that module is unloaded.
>
> # rmmod mdev
> rmmod: ERROR: Module mdev is in use by: nvidia
>
So other module need to explicitly increase the refcount of mdev.ko?
>
>
>>> +
>>> +static int uuid_parse(const char *str, uuid_le *uuid)
>>> +{
>>> + int i;
>>> +
>>> + if (strlen(str) < UUID_CHAR_LENGTH)
>>> + return -EINVAL;
>>> +
>>> + for (i = 0; i < UUID_BYTE_LENGTH; i++) {
>>> + if (!isxdigit(str[0]) || !isxdigit(str[1])) {
>>> + pr_err("%s err", __func__);
>>> + return -EINVAL;
>>> + }
>>> +
>>> + uuid->b[i] = (hex_to_bin(str[0]) << 4) | hex_to_bin(str[1]);
>>> + str += 2;
>>> + if (is_uuid_sep(*str))
>>> + str++;
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>
>> Can we use uuid_le_to_bin()?
>
> I couldn't find this in kernel?
It is in lib/uuid.c, i am using kvm tree.
next prev parent reply other threads:[~2016-07-04 7:31 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-20 16:31 [RFC PATCH v5 0/3] Add Mediated device support Kirti Wankhede
2016-06-20 16:31 ` [PATCH 1/3] Mediated device Core driver Kirti Wankhede
2016-06-21 7:38 ` Jike Song
2016-06-21 21:30 ` Alex Williamson
2016-06-24 17:54 ` Kirti Wankhede
2016-06-24 19:40 ` Alex Williamson
2016-06-30 16:48 ` Kirti Wankhede
2016-06-29 13:51 ` Xiao Guangrong
2016-06-30 7:12 ` Jike Song
2016-06-30 18:58 ` [Qemu-devel] " Kirti Wankhede
2016-06-30 18:51 ` Kirti Wankhede
2016-07-04 7:27 ` Xiao Guangrong [this message]
2016-07-04 2:08 ` Jike Song
2016-06-20 16:31 ` [PATCH 2/3] VFIO driver for mediated PCI device Kirti Wankhede
2016-06-21 22:48 ` Alex Williamson
2016-06-24 18:34 ` Kirti Wankhede
2016-06-24 19:45 ` Alex Williamson
2016-06-28 18:45 ` Kirti Wankhede
2016-06-29 2:54 ` Alex Williamson
2016-06-30 16:54 ` Kirti Wankhede
2016-06-30 6:34 ` Xiao Guangrong
2016-06-20 16:31 ` [PATCH 3/3] VFIO Type1 IOMMU: Add support for mediated devices Kirti Wankhede
2016-06-22 3:46 ` Alex Williamson
2016-06-28 13:02 ` Kirti Wankhede
2016-06-29 2:46 ` Alex Williamson
2016-06-30 8:28 ` Tian, Kevin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=577A0FE7.5020202@linux.intel.com \
--to=guangrong.xiao@linux.intel.com \
--cc=alex.williamson@redhat.com \
--cc=bjsdjshi@linux.vnet.ibm.com \
--cc=cjia@nvidia.com \
--cc=jike.song@intel.com \
--cc=kevin.tian@intel.com \
--cc=kraxel@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=kwankhede@nvidia.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=shuai.ruan@intel.com \
--cc=zhiyuan.lv@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox