From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45964)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1bhpFo-0004Fi-MM
	for qemu-devel@nongnu.org; Wed, 07 Sep 2016 22:39:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1bhpFk-0001VG-Et
	for qemu-devel@nongnu.org; Wed, 07 Sep 2016 22:39:51 -0400
Received: from mga09.intel.com ([134.134.136.24]:48237)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1bhpFk-0001VA-2T
	for qemu-devel@nongnu.org; Wed, 07 Sep 2016 22:39:48 -0400
Message-ID: <57D0CF08.6020106@intel.com>
Date: Thu, 08 Sep 2016 10:38:00 +0800
From: Jike Song
MIME-Version: 1.0
References: <1472097235-6332-1-git-send-email-kwankhede@nvidia.com>
	<1472097235-6332-3-git-send-email-kwankhede@nvidia.com>
	<20160825172226.2cd06d71@oc7835276234>
	<72e3baa7-70d1-c5dd-6cd7-3874e7ea4c01@nvidia.com>
In-Reply-To: <72e3baa7-70d1-c5dd-6cd7-3874e7ea4c01@nvidia.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v7 2/4] vfio: VFIO driver for mediated devices
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Kirti Wankhede
Cc: Dong Jia, alex.williamson@redhat.com, pbonzini@redhat.com,
	kraxel@redhat.com, cjia@nvidia.com, qemu-devel@nongnu.org,
	kvm@vger.kernel.org, kevin.tian@intel.com

On 08/26/2016 10:13 PM, Kirti Wankhede wrote:
>
>
> On 8/25/2016 2:52 PM, Dong Jia wrote:
>> On Thu, 25 Aug 2016 09:23:53 +0530
>> Kirti Wankhede wrote:
>>
>> [...]
>>
>> Dear Kirti,
>>
>> I just rebased my vfio-ccw patches to this series.
>> With a little fix, which was pointed it out in my reply to the #3
>> patch, it works fine.
>>
>
> Thanks for update. Glad to know this works for you.
>
>
>>> +static long vfio_mdev_unlocked_ioctl(void *device_data,
>>> +				     unsigned int cmd, unsigned long arg)
>>> +{
>>> +	int ret = 0;
>>> +	struct vfio_mdev *vmdev = device_data;
>>> +	struct parent_device *parent = vmdev->mdev->parent;
>>> +	unsigned long minsz;
>>> +
>>> +	switch (cmd) {
>>> +	case VFIO_DEVICE_GET_INFO:
>>> +	{
>>> +		struct vfio_device_info info;
>>> +
>>> +		minsz = offsetofend(struct vfio_device_info, num_irqs);
>>> +
>>> +		if (copy_from_user(&info, (void __user *)arg, minsz))
>>> +			return -EFAULT;
>>> +
>>> +		if (info.argsz < minsz)
>>> +			return -EINVAL;
>>> +
>>> +		if (parent->ops->get_device_info)
>>> +			ret = parent->ops->get_device_info(vmdev->mdev, &info);
>>> +		else
>>> +			return -EINVAL;
>>> +
>>> +		if (ret)
>>> +			return ret;
>>> +
>>> +		if (parent->ops->reset)
>>> +			info.flags |= VFIO_DEVICE_FLAGS_RESET;
>> Shouldn't this be done inside the get_device_info callback?
>>
>
> I would like Vendor driver to set device type only. Reset flag should be
> set on basis of reset() callback provided.
>
>>> +
>>> +		memcpy(&vmdev->dev_info, &info, sizeof(info));
>>> +
>>> +		return copy_to_user((void __user *)arg, &info, minsz);
>>> +	}
>> [...]
>>
>>> +
>>> +static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
>>> +			      size_t count, loff_t *ppos)
>>> +{
>>> +	struct vfio_mdev *vmdev = device_data;
>>> +	struct mdev_device *mdev = vmdev->mdev;
>>> +	struct parent_device *parent = mdev->parent;
>>> +	unsigned int done = 0;
>>> +	int ret;
>>> +
>>> +	if (!parent->ops->read)
>>> +		return -EINVAL;
>>> +
>>> +	while (count) {
>> Here, I have to say sorry to you guys for that I didn't notice the
>> bad impact of this change to my patches during the v6 discussion.
>>
>> For vfio-ccw, I introduced an I/O region to input/output I/O
>> instruction parameters and results for Qemu. The @count of these data
>> currently is 140. So supporting arbitrary lengths in one shot here, and
>> also in vfio_mdev_write, seems the better option for this case.
>>
>> I believe that if the pci drivers want to iterate in a 4 bytes step, you
>> can do that in the parent read/write callbacks instead.
>>
>> What do you think?
>>
>
> I would like to know Alex's thought on this. He raised concern with this
> approach in v6 reviews:
> "But I think this is exploitable, it lets the user make the kernel
> allocate an arbitrarily sized buffer."

It is impossible to check count here, because one simply doesn't have
the knowledge of this region.

VFIO_DEVICE_GET_REGION_INFO was implemented in vfio-mdev.ko, while
decoding the vfio_mdev_read to a particular MMIO region was expected to
be implemented in the vendor driver, which results in unbalanced
interfaces.

To have balanced interfaces, you either:

	- call ioctl instead of GET_REGION_INFO
	- call read instead of decoding REGION

or:

	- call GET_REGION_INFO instead of ioctl
	- decode REGION in read, and check its validity,
	  call region-specific read function

V6 was the latter, v7 is kind of a mixture of these two, while I believe
the former will completely address this problem :)

--
Thanks,
Jike

>>> +		size_t filled;
>>> +
>>> +		if (count >= 4 && !(*ppos % 4)) {
>>> +			u32 val;
>>> +
>>> +			ret = parent->ops->read(mdev, (char *)&val, sizeof(val),
>>> +						*ppos);
>>> +			if (ret <= 0)
>>> +				goto read_err;
>>> +
>>> +			if (copy_to_user(buf, &val, sizeof(val)))
>>> +				goto read_err;
>>> +
>>> +			filled = 4;
>>> +		} else if (count >= 2 && !(*ppos % 2)) {
>>> +			u16 val;
>>> +
>>> +			ret = parent->ops->read(mdev, (char *)&val, sizeof(val),
>>> +						*ppos);
>>> +			if (ret <= 0)
>>> +				goto read_err;
>>> +
>>> +			if (copy_to_user(buf, &val, sizeof(val)))
>>> +				goto read_err;
>>> +
>>> +			filled = 2;
>>> +		} else {
>>> +			u8 val;
>>> +
>>> +			ret = parent->ops->read(mdev, &val, sizeof(val), *ppos);
>>> +			if (ret <= 0)
>>> +				goto read_err;
>>> +
>>> +			if (copy_to_user(buf, &val, sizeof(val)))
>>> +				goto read_err;
>>> +
>>> +			filled = 1;
>>> +		}
>>> +
>>> +		count -= filled;
>>> +		done += filled;
>>> +		*ppos += filled;
>>> +		buf += filled;
>>> +	}
>>> +
>>> +	return done;
>>> +
>>> +read_err:
>>> +	return -EFAULT;
>>> +}
>> [...]
>>
>> --------
>> Dong Jia
>>
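
For illustration only, here is a minimal userspace sketch of the second
option listed in the reply (decode REGION in read, check its validity,
then call a region-specific read function). Everything in it is
hypothetical and not taken from the patch series: the REGION_SHIFT offset
encoding, struct region, dispatch_read() and demo_read() are invented
for the example, and the 140-byte demo region only borrows the size
Dong Jia quotes for vfio-ccw. The point it tries to show is that the
layer which knows the region size can bound count before any data is
copied.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical encoding: region index in the upper bits of the offset. */
#define REGION_SHIFT		40
#define REGION_INDEX(pos)	((unsigned int)((uint64_t)(pos) >> REGION_SHIFT))
#define REGION_OFFSET(pos)	((uint64_t)(pos) & ((1ULL << REGION_SHIFT) - 1))

/* Hypothetical per-region descriptor. */
struct region {
	size_t size;		/* size a GET_REGION_INFO-style query would report */
	const uint8_t *data;	/* backing store for this demo */
	ssize_t (*read)(const struct region *r, void *buf,
			size_t count, uint64_t off);
};

/* Region-specific read; the caller has already validated off and count. */
static ssize_t mem_region_read(const struct region *r, void *buf,
			       size_t count, uint64_t off)
{
	assert(off + count <= r->size);
	memcpy(buf, r->data + off, count);
	return (ssize_t)count;
}

/* Decode the region from pos, validate the access, then dispatch. */
static ssize_t dispatch_read(const struct region *regions, unsigned int nr,
			     void *buf, size_t count, uint64_t pos)
{
	unsigned int idx = REGION_INDEX(pos);
	uint64_t off = REGION_OFFSET(pos);
	const struct region *r;

	if (idx >= nr)
		return -EINVAL;
	r = &regions[idx];
	if (!r->read || off >= r->size)
		return -EINVAL;
	if (count > r->size - off)	/* bound count by the region size */
		count = (size_t)(r->size - off);
	return r->read(r, buf, count, off);
}

/* Demo: a single 140-byte region at index 0. */
static const uint8_t demo_data[140] = { 0 };
static const struct region demo_regions[] = {
	{ sizeof(demo_data), demo_data, mem_region_read },
};

ssize_t demo_read(void *buf, size_t count, uint64_t pos)
{
	return dispatch_read(demo_regions, 1, buf, count, pos);
}
```

With this table, a 4096-byte read at offset 0 is clamped to 140 bytes,
while an offset at or past the end of the region, or an unknown region
index, returns -EINVAL.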