public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Xiao Guangrong <guangrong.xiao@linux.intel.com>
To: Kirti Wankhede <kwankhede@nvidia.com>,
	alex.williamson@redhat.com, pbonzini@redhat.com,
	kraxel@redhat.com, cjia@nvidia.com
Cc: shuai.ruan@intel.com, jike.song@intel.com, kvm@vger.kernel.org,
	kevin.tian@intel.com, qemu-devel@nongnu.org,
	zhiyuan.lv@intel.com, bjsdjshi@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH 1/3] Mediated device Core driver
Date: Mon, 4 Jul 2016 15:27:35 +0800	[thread overview]
Message-ID: <577A0FE7.5020202@linux.intel.com> (raw)
In-Reply-To: <ff274e7b-7cdc-cc34-3b18-913072ab545e@nvidia.com>



On 07/01/2016 02:51 AM, Kirti Wankhede wrote:
>
>
> On 6/29/2016 7:21 PM, Xiao Guangrong wrote:
>>
>>
>> On 06/21/2016 12:31 AM, Kirti Wankhede wrote:
>>> Design for Mediated Device Driver:
> ...
>>> +static int mdev_add_attribute_group(struct device *dev,
>>> +                    const struct attribute_group **groups)
>>> +{
>>> +    return sysfs_create_groups(&dev->kobj, groups);
>>> +}
>>> +
>>> +static void mdev_remove_attribute_group(struct device *dev,
>>> +                    const struct attribute_group **groups)
>>> +{
>>> +    sysfs_remove_groups(&dev->kobj, groups);
>>> +}
>>> +
>>
>> better use device_add_groups() / device_remove_groups() instead?
>>
>
> These are not exported from base module. They can't be used here.

Er, i did not realize it, sorry.

>
>
>>> +}
>>> +
>>> +static int mdev_device_create_ops(struct mdev_device *mdev, char
>>> *mdev_params)
>>> +{
>>> +    struct parent_device *parent = mdev->parent;
>>> +    int ret;
>>> +
>>> +    mutex_lock(&parent->ops_lock);
>>> +    if (parent->ops->create) {
>>> +        ret = parent->ops->create(mdev->dev.parent, mdev->uuid,
>>> +                    mdev->instance, mdev_params);
>>
>> I think it is better if we pass @mdev to this callback, then the parent
>> driver
>> can do its specified operations and associate it with the instance,
>> e.g, via mdev->private.
>>
>
> Yes, actually I was also thinking of changing it to
>
> -       ret = parent->ops->create(mdev->dev.parent, mdev->uuid,
> -                                 mdev->instance, mdev_params);
> +       ret = parent->ops->create(mdev, mdev_params);
>

Good. :)

>
>>> +int mdev_register_device(struct device *dev, const struct parent_ops
>>> *ops)
>>> +{
>>> +    int ret = 0;
>>> +    struct parent_device *parent;
>>> +
>>> +    if (!dev || !ops)
>>> +        return -EINVAL;
>>> +
>>> +    mutex_lock(&parent_devices.list_lock);
>>> +
>>> +    /* Check for duplicate */
>>> +    parent = find_parent_device(dev);
>>> +    if (parent) {
>>> +        ret = -EEXIST;
>>> +        goto add_dev_err;
>>> +    }
>>> +
>>> +    parent = kzalloc(sizeof(*parent), GFP_KERNEL);
>>> +    if (!parent) {
>>> +        ret = -ENOMEM;
>>> +        goto add_dev_err;
>>> +    }
>>> +
>>> +    kref_init(&parent->ref);
>>> +    list_add(&parent->next, &parent_devices.dev_list);
>>> +    mutex_unlock(&parent_devices.list_lock);
>>
>> It is not safe as Alex's already pointed it out.
>>
>>> +
>>> +    parent->dev = dev;
>>> +    parent->ops = ops;
>>> +    mutex_init(&parent->ops_lock);
>>> +    mutex_init(&parent->mdev_list_lock);
>>> +    INIT_LIST_HEAD(&parent->mdev_list);
>>> +    init_waitqueue_head(&parent->release_done);
>>
>> And no lock to protect these operations.
>>
>
> As I replied to Alex also, yes I'm fixing it.
>
>>> +void mdev_unregister_device(struct device *dev)
>>> +{
>>> +    struct parent_device *parent;
>>> +    struct mdev_device *mdev, *n;
>>> +    int ret;
>>> +
>>> +    mutex_lock(&parent_devices.list_lock);
>>> +    parent = find_parent_device(dev);
>>> +
>>> +    if (!parent) {
>>> +        mutex_unlock(&parent_devices.list_lock);
>>> +        return;
>>> +    }
>>> +    dev_info(dev, "MDEV: Unregistering\n");
>>> +
>>> +    /*
>>> +     * Remove parent from the list and remove create and destroy sysfs
>>> +     * files so that no new mediated device could be created for this
>>> parent
>>> +     */
>>> +    list_del(&parent->next);
>>> +    mdev_remove_sysfs_files(dev);
>>> +    mutex_unlock(&parent_devices.list_lock);
>>> +
>>
>> find_parent_device() does not increase the refcount of the parent-device,
>> after releasing the lock, is it still safe to use the device?
>>
>
> Yes. In mdev_register_device(), kref_init() initialises refcount to 1
> and then when mdev child is created refcount is incremented and on child
> mdev destroys refcount is decremented. So when all child mdev are
> destroyed, refcount will still be 1 until mdev_unregister_device() is
> called. So when no mdev device is created, mdev_register_device() hold
> parent's refcount and released from mdev_unregister_device().
>
>
>>> +    mutex_lock(&parent->ops_lock);
>>> +    mdev_remove_attribute_group(dev,
>>> +                    parent->ops->dev_attr_groups);
>>
>> Why mdev_remove_sysfs_files() and mdev_remove_attribute_group()
>> are protected by different locks?
>>
>
> As mentioned in reply to Alex on another thread, removing these locks.
>
>>> +
>>> +int mdev_device_destroy(struct device *dev, uuid_le uuid, uint32_t
>>> instance)
>>> +{
>>> +    struct mdev_device *mdev;
>>> +    struct parent_device *parent;
>>> +    int ret;
>>> +
>>> +    parent = mdev_get_parent_by_dev(dev);
>>> +    if (!parent) {
>>> +        ret = -EINVAL;
>>> +        goto destroy_err;
>>> +    }
>>> +
>>> +    mdev = find_mdev_device(parent, uuid, instance);
>>> +    if (!mdev) {
>>> +        ret = -EINVAL;
>>> +        goto destroy_err;
>>> +    }
>>> +
>>> +    ret = mdev_device_destroy_ops(mdev, false);
>>> +    if (ret)
>>> +        goto destroy_err;
>>
>> find_mdev_device() does not hold the refcount of mdev, is it safe?
>>
>
> Yes, this function is just to check duplicate entry or existence of mdev
> device.
>

Now, i am getting more confused, the caller of mdev_device_destroy(), e.g,
mdev_destroy_store(), does not hold any lock, however, you was walking
the list of parent->mdev_list (calling find_mdev_device()), is it really safe?

And how to prevent this scenario?

CPU 0                                          CPU 1
in mdev_device_destroy():                in mdev_unregister_device():

mdev = find_mdev_device(parent, uuid, instance); // no refcount hold

                                          mdev_device_destroy_ops(mdev, true);
                                          list_del(&mdev->next);
                                          mdev_put_device(mdev); // last refcount is gone.


mdev = find_mdev_device(parent, uuid, instance); !!!!!! PANIC !!!!!!


>>> +
>>> +    mdev_put_parent(parent);
>>> +
>>
>> The refcount of parent-device is released, you can not continue to
>> use it.
>>
>
> Removing these locks.
>
>>> +    mutex_lock(&parent->mdev_list_lock);
>>> +    list_del(&mdev->next);
>>> +    mutex_unlock(&parent->mdev_list_lock);
>>> +
>>> +    mdev_put_device(mdev);
>>> +    return ret;
>>> +
>>> +destroy_err:
>>> +    mdev_put_parent(parent);
>>> +    return ret;
>>> +}
>>> +
>>> +
>>> +static struct class mdev_class = {
>>> +    .name        = MDEV_CLASS_NAME,
>>> +    .owner        = THIS_MODULE,
>>> +    .class_attrs    = mdev_class_attrs,
>>
>> These interfaces, start and shutdown, are based on UUID, how
>> about if we want to operate on the specified instance?
>>
>
> Do you mean hot-plug a device?

Hot-plug and hot-unplug, yes.

>
>>> +};
>>> +
>>> +static int __init mdev_init(void)
>>> +{
>>> +    int ret;
>>> +
>>> +    mutex_init(&parent_devices.list_lock);
>>> +    INIT_LIST_HEAD(&parent_devices.dev_list);
>>> +
>>> +    ret = class_register(&mdev_class);
>>> +    if (ret) {
>>> +        pr_err("Failed to register mdev class\n");
>>> +        return ret;
>>> +    }
>>> +
>>> +    ret = mdev_bus_register();
>>> +    if (ret) {
>>> +        pr_err("Failed to register mdev bus\n");
>>> +        class_unregister(&mdev_class);
>>> +        return ret;
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>> +
>>> +static void __exit mdev_exit(void)
>>> +{
>>> +    mdev_bus_unregister();
>>> +    class_unregister(&mdev_class);
>>> +}
>>
>> Hmm, how to prevent if there are parent-devices existing
>> when the module is being unloaded?
>>
>
> If parent device exits that means that other module is using
> mdev_register_device()/mdev_unregister_device() from their module.
> 'rmmod mdev' would fail until that module is unloaded.
>
> # rmmod mdev
> rmmod: ERROR: Module mdev is in use by: nvidia
>

So other module need to explicitly increase the refcount of mdev.ko?

>
>
>>> +
>>> +static int uuid_parse(const char *str, uuid_le *uuid)
>>> +{
>>> +    int i;
>>> +
>>> +    if (strlen(str) < UUID_CHAR_LENGTH)
>>> +        return -EINVAL;
>>> +
>>> +    for (i = 0; i < UUID_BYTE_LENGTH; i++) {
>>> +        if (!isxdigit(str[0]) || !isxdigit(str[1])) {
>>> +            pr_err("%s err", __func__);
>>> +            return -EINVAL;
>>> +        }
>>> +
>>> +        uuid->b[i] = (hex_to_bin(str[0]) << 4) | hex_to_bin(str[1]);
>>> +        str += 2;
>>> +        if (is_uuid_sep(*str))
>>> +            str++;
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>
>> Can we use uuid_le_to_bin()?
>
> I couldn't find this in kernel?

It is in lib/uuid.c, i am using kvm tree.


  reply	other threads:[~2016-07-04  7:31 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-20 16:31 [RFC PATCH v5 0/3] Add Mediated device support Kirti Wankhede
2016-06-20 16:31 ` [PATCH 1/3] Mediated device Core driver Kirti Wankhede
2016-06-21  7:38   ` Jike Song
2016-06-21 21:30   ` Alex Williamson
2016-06-24 17:54     ` Kirti Wankhede
2016-06-24 19:40       ` Alex Williamson
2016-06-30 16:48         ` Kirti Wankhede
2016-06-29 13:51   ` Xiao Guangrong
2016-06-30  7:12     ` Jike Song
2016-06-30 18:58       ` [Qemu-devel] " Kirti Wankhede
2016-06-30 18:51     ` Kirti Wankhede
2016-07-04  7:27       ` Xiao Guangrong [this message]
2016-07-04  2:08   ` Jike Song
2016-06-20 16:31 ` [PATCH 2/3] VFIO driver for mediated PCI device Kirti Wankhede
2016-06-21 22:48   ` Alex Williamson
2016-06-24 18:34     ` Kirti Wankhede
2016-06-24 19:45       ` Alex Williamson
2016-06-28 18:45         ` Kirti Wankhede
2016-06-29  2:54           ` Alex Williamson
2016-06-30 16:54             ` Kirti Wankhede
2016-06-30  6:34   ` Xiao Guangrong
2016-06-20 16:31 ` [PATCH 3/3] VFIO Type1 IOMMU: Add support for mediated devices Kirti Wankhede
2016-06-22  3:46   ` Alex Williamson
2016-06-28 13:02     ` Kirti Wankhede
2016-06-29  2:46       ` Alex Williamson
2016-06-30  8:28         ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=577A0FE7.5020202@linux.intel.com \
    --to=guangrong.xiao@linux.intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=bjsdjshi@linux.vnet.ibm.com \
    --cc=cjia@nvidia.com \
    --cc=jike.song@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kraxel@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=shuai.ruan@intel.com \
    --cc=zhiyuan.lv@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox