Re: [PATCH v7 1/1] vfio/nvgpu: Add vfio pci variant module for grace hopper

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jason Gunthorpe <jgg@nvidia.com>
To: ankita@nvidia.com
Cc: alex.williamson@redhat.com, yishaih@nvidia.com,
	shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com,
	aniketa@nvidia.com, cjia@nvidia.com, kwankhede@nvidia.com,
	targupta@nvidia.com, vsethi@nvidia.com, acurrid@nvidia.com,
	apopple@nvidia.com, jhubbard@nvidia.com, danw@nvidia.com,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v7 1/1] vfio/nvgpu: Add vfio pci variant module for grace hopper
Date: Wed, 23 Aug 2023 10:56:07 -0300	[thread overview]
Message-ID: <ZOYP92q1mDQgwnc9@nvidia.com> (raw)
In-Reply-To: <20230822202303.19661-1-ankita@nvidia.com>

On Tue, Aug 22, 2023 at 01:23:03PM -0700, ankita@nvidia.com wrote:

> @@ -0,0 +1,444 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved
> + */
> +
> +#include <linux/pci.h>
> +#include <linux/vfio_pci_core.h>
> +#include <linux/vfio.h>
> +
> +struct nvgrace_gpu_vfio_pci_core_device {
> +	struct vfio_pci_core_device core_device;
> +	u64 hpa;
> +	u64 mem_length;
> +	void *opregion;
> +};

opregion is some word that has meaning for the intel drivers, use
something else please.

'hpa' has no business being in a VFIO driver.

phys_addr_t memphys
size_t memlength
void __iomem *memmap;

> +static long nvgrace_gpu_vfio_pci_ioctl(struct vfio_device *core_vdev,
> +					unsigned int cmd, unsigned long arg)
> +{
> +	struct nvgrace_gpu_vfio_pci_core_device *nvdev = container_of(
> +		core_vdev, struct nvgrace_gpu_vfio_pci_core_device, core_device.vdev);
> +
> +	unsigned long minsz = offsetofend(struct vfio_region_info, offset);
> +	struct vfio_region_info info;
> +
> +	if (cmd == VFIO_DEVICE_GET_REGION_INFO) {

This block is long enough to put in its own function and remove a
level of indenting

> +/*
> + * Read count bytes from the device memory at an offset. The actual device
> + * memory size (available) may not be a power-of-2. So the driver fakes
> + * the size to a power-of-2 (reported) when exposing to a user space driver.
> + *
> + * Read request beyond the actual device size is filled with ~1, while
> + * those beyond the actual reported size is skipped.
> + *
> + * A read from a reported+ offset is considered error conditions and
> + * returned with an -EINVAL. Overflow conditions are also handled.
> + */
> +ssize_t nvgrace_gpu_read_mem(void __user *buf, size_t count, loff_t *ppos,
> +			      const void *addr, size_t available, size_t reported)
> +{
> +	u64 offset = *ppos & VFIO_PCI_OFFSET_MASK;
> +	u64 end;
> +	size_t read_count, i;
> +	u8 val = 0xFF;
> +
> +	if (offset >= reported)
> +		return -EINVAL;
> +
> +	if (check_add_overflow(offset, count, &end))
> +		return -EOVERFLOW;
> +
> +	/* Clip short the read request beyond reported BAR size */
> +	if (end >= reported)
> +		count = reported - offset;
> +
> +	/*
> +	 * Determine how many bytes to be actually read from the device memory.
> +	 * Do not read from the offset beyond available size.
> +	 */
> +	if (offset >= available)
> +		read_count = 0;
> +	else
> +		read_count = min(count, available - (size_t)offset);
> +
> +	/*
> +	 * Handle read on the BAR2 region. Map to the target device memory
> +	 * physical address and copy to the request read buffer.
> +	 */
> +	if (copy_to_user(buf, (u8 *)addr + offset, read_count))
> +		return -EFAULT;
> +
> +	/*
> +	 * Only the device memory present on the hardware is mapped, which may
> +	 * not be power-of-2 aligned. A read to the BAR2 region implies an
> +	 * access outside the available device memory on the hardware. Fill
> +	 * such read request with ~1.
> +	 */
> +	for (i = 0; i < count - read_count; i++)
> +		if (copy_to_user(buf + read_count + i, &val, 1))
> +			return -EFAULT;

Use put_user()

> +static ssize_t nvgrace_gpu_vfio_pci_write(struct vfio_device *core_vdev,
> +					   const char __user *buf, size_t count, loff_t *ppos)
> +{
> +	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
> +	struct nvgrace_gpu_vfio_pci_core_device *nvdev = container_of(
> +		core_vdev, struct nvgrace_gpu_vfio_pci_core_device, core_device.vdev);
> +
> +	if (index == VFIO_PCI_BAR2_REGION_INDEX) {
> +		if (!nvdev->opregion) {
> +			nvdev->opregion = memremap(nvdev->hpa, nvdev->mem_length, MEMREMAP_WB);
> +			if (!nvdev->opregion)
> +				return -ENOMEM;
> +		}

Needs some kind of locking on opregion

> +static int
> +nvgrace_gpu_vfio_pci_fetch_memory_property(struct pci_dev *pdev,
> +					    struct nvgrace_gpu_vfio_pci_core_device *nvdev)
> +{
> +	int ret;
> +
> +	/*
> +	 * The memory information is present in the system ACPI tables as DSD
> +	 * properties nvidia,gpu-mem-base-pa and nvidia,gpu-mem-size.
> +	 */
> +	ret = device_property_read_u64(&pdev->dev, "nvidia,gpu-mem-base-pa",
> +				       &(nvdev->hpa));
> +	if (ret)
> +		return ret;

Technically you need to check that the read_u64 doesn't overflow
phys_addr_t

> +	ret = device_property_read_u64(&pdev->dev, "nvidia,gpu-mem-size",
> +				       &(nvdev->mem_length));
> +	return ret;

And that mem_length doesn't overflow size_t

Jason

next prev parent reply	other threads:[~2023-08-23 13:56 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-22 20:23 [PATCH v7 1/1] vfio/nvgpu: Add vfio pci variant module for grace hopper ankita
2023-08-22 22:29 ` Alex Williamson
2023-08-23 13:56 ` Jason Gunthorpe [this message]
2023-08-23 14:18   ` Ankit Agrawal
2023-08-23 14:25     ` Jason Gunthorpe
2023-08-23 14:26     ` Alex Williamson
2023-08-23 14:50   ` Ankit Agrawal
2023-08-23 15:14     ` Alex Williamson
2023-08-23 15:16       ` Jason Gunthorpe
2023-08-23 15:24         ` Alex Williamson
2023-08-30 13:50 ` Christoph Hellwig
2023-08-30 15:39   ` Jason Gunthorpe
2023-08-30 15:39     ` Jason Gunthorpe
2023-08-31 12:26     ` Christoph Hellwig
2023-08-31 13:51       ` Ankit Agrawal
2023-08-31 13:51         ` Ankit Agrawal
2023-08-31 14:04         ` Christoph Hellwig
2023-08-31 18:21           ` Alex Williamson
2023-08-31 18:21             ` Alex Williamson
2023-09-01  0:44             ` Jason Gunthorpe
2023-09-01  0:44               ` Jason Gunthorpe
2023-09-01  5:48               ` Christoph Hellwig
2023-09-01  5:47             ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZOYP92q1mDQgwnc9@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=acurrid@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=aniketa@nvidia.com \
    --cc=ankita@nvidia.com \
    --cc=apopple@nvidia.com \
    --cc=cjia@nvidia.com \
    --cc=danw@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=targupta@nvidia.com \
    --cc=vsethi@nvidia.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.