linux-pci.vger.kernel.org archive mirror
From: Yishai Hadas <yishaih@nvidia.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: <bhelgaas@google.com>, <corbet@lwn.net>,
	<alex.williamson@redhat.com>, <diana.craciun@oss.nxp.com>,
	<kwankhede@nvidia.com>, <eric.auger@redhat.com>,
	<masahiroy@kernel.org>, <michal.lkml@markovi.net>,
	<linux-pci@vger.kernel.org>, <linux-doc@vger.kernel.org>,
	<kvm@vger.kernel.org>, <linux-s390@vger.kernel.org>,
	<linux-kbuild@vger.kernel.org>, <mgurtovoy@nvidia.com>,
	<jgg@nvidia.com>, <maorg@nvidia.com>, <leonro@nvidia.com>
Subject: Re: [PATCH V2 09/12] PCI: Add 'override_only' bitmap to struct pci_device_id
Date: Thu, 19 Aug 2021 19:16:20 +0300
Message-ID: <41539eec-b6fc-084b-0417-ac39d324189e@nvidia.com>
In-Reply-To: <20210819151549.GA3128368@bjorn-Precision-5520>

On 8/19/2021 6:15 PM, Bjorn Helgaas wrote:
> On Wed, Aug 18, 2021 at 06:16:03PM +0300, Yishai Hadas wrote:
>> From: Max Gurtovoy <mgurtovoy@nvidia.com>
>>
>> Allow device drivers to include match entries in the modules.alias file
>> produced by kbuild that are not used for normal driver autoprobing and
>> module autoloading. Drivers using these match entries can be connected
>> to the PCI device manually, by userspace, using the existing
>> driver_override sysfs.
>>
>> To achieve it, we add the 'override_only' bitmap to struct pci_device_id
>> and a helper macro named 'PCI_DEVICE_DRIVER_OVERRIDE' to enable setting
>> specific bits on it.
>>
>> The first bit (i.e. 'PCI_ID_F_VFIO_DRIVER_OVERRIDE') indicates that the
>> match entry is for the VFIO subsystem, it can be set by another helper
>> macro named 'PCI_DRIVER_OVERRIDE_DEVICE_VFIO'.
>>
>> These match entries are prefixed with "vfio_" in the modules.alias.
>>
>> For example the resulting modules.alias may have:
>>
>>    alias pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_core
>>    alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
>>    alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
>>
>> In this example mlx5_core and mlx5_vfio_pci match to the same PCI
>> device. The kernel will autoload and autobind to mlx5_core but the kernel
>> and udev mechanisms will ignore mlx5_vfio_pci.
>>
>> When userspace wants to change a device to the VFIO subsystem userspace
>> can implement a generic algorithm:
>>
>>     1) Identify the sysfs path to the device:
>>      /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
>>
>>     2) Get the modalias string from the kernel:
>>      $ cat /sys/bus/pci/devices/0000:01:00.0/modalias
>>      pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
>>
>>     3) Prefix it with vfio_:
>>      vfio_pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
>>
>>     4) Search modules.alias for the above string and select the entry that
>>        has the fewest *'s:
>>      alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
>>
>>     5) modprobe the matched module name:
>>      $ modprobe mlx5_vfio_pci
>>
>>     6) cat the matched module name to driver_override:
>>      echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
>>
>>     7) unbind device from original module
>>      echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
>>
>>     8) probe PCI drivers (or explicitly bind to mlx5_vfio_pci)
>>      echo 0000:01:00.0 > /sys/bus/pci/drivers_probe
>>
>> The algorithm is independent of bus type. In future the other buses's with
> s/buses's/buses/


OK

>> VFIO device drivers, like platform and ACPI, can use this algorithm as
>> well.
>>
>> This patch is the infrastructure to provide the information in the
>> modules.alias to userspace. Convert the only VFIO pci_driver which results
>> in one new line in the modules.alias:
>>
>>    alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
>>
>> Later series introduce additional HW specific VFIO PCI drivers, such as
>> mlx5_vfio_pci.
>>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
>> ---
>>   Documentation/PCI/pci.rst         |  1 +
>>   drivers/pci/pci-driver.c          | 27 ++++++++++++++++++++-------
>>   drivers/vfio/pci/vfio_pci.c       |  9 ++++++++-
>>   include/linux/mod_devicetable.h   |  6 ++++++
>>   include/linux/pci.h               | 28 ++++++++++++++++++++++++++++
>>   scripts/mod/devicetable-offsets.c |  1 +
>>   scripts/mod/file2alias.c          |  8 ++++++--
>>   7 files changed, 70 insertions(+), 10 deletions(-)
>>
>> diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
>> index fa651e25d98c..87c6f4a6ca32 100644
>> --- a/Documentation/PCI/pci.rst
>> +++ b/Documentation/PCI/pci.rst
>> @@ -103,6 +103,7 @@ need pass only as many optional fields as necessary:
>>     - subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
>>     - class and classmask fields default to 0
>>     - driver_data defaults to 0UL.
>> +  - override_only field defaults to 0.
>>   
>>   Note that driver_data must match the value used by any of the pci_device_id
>>   entries defined in the driver. This makes the driver_data field mandatory
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 3a72352aa5cf..8a6bd3364127 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -136,7 +136,7 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>>   						    struct pci_dev *dev)
>>   {
>>   	struct pci_dynid *dynid;
>> -	const struct pci_device_id *found_id = NULL;
>> +	const struct pci_device_id *found_id = NULL, *ids;
>>   
>>   	/* When driver_override is set, only bind to the matching driver */
>>   	if (dev->driver_override && strcmp(dev->driver_override, drv->name))
>> @@ -152,14 +152,27 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>>   	}
>>   	spin_unlock(&drv->dynids.lock);
>>   
>> -	if (!found_id)
>> -		found_id = pci_match_id(drv->id_table, dev);
>> +	if (found_id)
>> +		return found_id;
>>   
>> -	/* driver_override will always match, send a dummy id */
>> -	if (!found_id && dev->driver_override)
>> -		found_id = &pci_device_id_any;
>> +	for (ids = drv->id_table; (found_id = pci_match_id(ids, dev));
>> +	     ids = found_id + 1) {
>> +		/*
>> +		 * The match table is split based on driver_override. Check the
>> +		 * override_only as well so that any matching entry is
>> +		 * returned.
>> +		 */
>> +		if (!found_id->override_only || dev->driver_override)
>> +			return found_id;
> The negation makes this short, but IMO, makes this harder to read.
> I'd rather test for the special case directly instead of testing for
> the *absence* of the special case, e.g.,
>
>    if (found_id->override_only) {
>      if (dev->driver_override)
>        return found_id;
>    } else
>      return found_id;


This can be fine as well.
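
For reference, the two forms should be behaviorally identical. A minimal standalone sketch that checks this over all four input combinations is below; struct demo_id and struct demo_dev are hypothetical stand-ins that carry only the fields the condition looks at, not the real kernel structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_device_id. */
struct demo_id {
	unsigned int override_only;	/* non-zero: entry is override-only */
};

/* Hypothetical stand-in for struct pci_dev. */
struct demo_dev {
	const char *driver_override;	/* NULL when no override is set */
};

/* Compact form, as written in the patch. */
bool demo_accept_compact(const struct demo_id *id, const struct demo_dev *dev)
{
	assert(id && dev);
	return !id->override_only || dev->driver_override != NULL;
}

/* Explicit form, as suggested in the review. */
bool demo_accept_explicit(const struct demo_id *id, const struct demo_dev *dev)
{
	assert(id && dev);
	if (id->override_only) {
		if (dev->driver_override)
			return true;
		return false;
	}
	return true;
}

/* Walk all four input combinations and report whether the forms agree. */
bool demo_forms_agree(void)
{
	const unsigned int flags[] = { 0, 1 };
	const char *overrides[] = { NULL, "vfio-pci" };

	for (size_t i = 0; i < 2; i++) {
		for (size_t j = 0; j < 2; j++) {
			struct demo_id id = { .override_only = flags[i] };
			struct demo_dev dev = { .driver_override = overrides[j] };

			if (demo_accept_compact(&id, &dev) !=
			    demo_accept_explicit(&id, &dev))
				return false;
		}
	}
	return true;
}
```

The only combination either form rejects is an override-only entry on a device with no driver_override set, which is the case the patch wants to keep out of normal autoprobing.
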

>> +	}
>>   
>> -	return found_id;
>> +	/*
>> +	 * if no static match, driver_override will always match, send a dummy
>> +	 * id.
> I think the original comment was better.  This comment implies that we
> only checked for static matches above, but we actually checked for
> *both* dynamic IDs and static IDs.


OK

>> +	 */
>> +	if (dev->driver_override)
>> +		return &pci_device_id_any;
>> +	return NULL;
>>   }
>>   
>>   /**
>> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
>> index 07edddf7e6ca..c52620ac5e70 100644
>> --- a/drivers/vfio/pci/vfio_pci.c
>> +++ b/drivers/vfio/pci/vfio_pci.c
>> @@ -180,9 +180,16 @@ static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
>>   	return vfio_pci_core_sriov_configure(pdev, nr_virtfn);
>>   }
>>   
>> +static const struct pci_device_id vfio_pci_table[] = {
>> +	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_ANY_ID, PCI_ANY_ID) }, /* match all by default */
>> +	{}
>> +};
>> +
>> +MODULE_DEVICE_TABLE(pci, vfio_pci_table);
>> +
>>   static struct pci_driver vfio_pci_driver = {
>>   	.name			= "vfio-pci",
>> -	.id_table		= NULL, /* only dynamic ids */
>> +	.id_table		= vfio_pci_table,
>>   	.probe			= vfio_pci_probe,
>>   	.remove			= vfio_pci_remove,
>>   	.sriov_configure	= vfio_pci_sriov_configure,
>> diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
>> index 8e291cfdaf06..39c229a7ab8c 100644
>> --- a/include/linux/mod_devicetable.h
>> +++ b/include/linux/mod_devicetable.h
>> @@ -16,6 +16,10 @@ typedef unsigned long kernel_ulong_t;
>>   
>>   #define PCI_ANY_ID (~0)
>>   
>> +enum {
>> +	PCI_ID_F_VFIO_DRIVER_OVERRIDE	= 1 << 0,
>> +};
>> +
>>   /**
>>    * struct pci_device_id - PCI device ID structure
>>    * @vendor:		Vendor ID to match (or PCI_ANY_ID)
>> @@ -34,12 +38,14 @@ typedef unsigned long kernel_ulong_t;
>>    *			Best practice is to use driver_data as an index
>>    *			into a static list of equivalent device types,
>>    *			instead of using it as a pointer.
>> + * @override_only:	Bitmap for override_only PCI drivers.
> "Match only when dev->driver_override is this driver"?


Just to be aligned here:

This field will stay a __u32 and may have at most one bit set, which
identifies the subsystem/driver the entry belongs to.

That is required later on to set the correct prefix in the modules.alias
file. You are only suggesting to change the comment as above, right?
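
To illustrate why the bit value matters for modules.alias, here is a simplified standalone sketch of the prefix selection, modeled on the file2alias.c hunk in this patch. The DEMO_* names and demo_pci_alias() are hypothetical; the sketch also assumes the subsystem IDs and class are always wildcards, which the real do_pci_entry() does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ANY_ID			(~0u)		/* stands in for PCI_ANY_ID */
#define DEMO_F_VFIO_DRIVER_OVERRIDE	(1u << 0)	/* mirrors bit 0 in the patch */

/* Append "<tag><8 hex digits>", or "<tag>*" when the field is a wildcard. */
static void demo_add(char *alias, size_t size, const char *tag,
		     unsigned int val)
{
	size_t len = strlen(alias);

	if (val != DEMO_ANY_ID)
		snprintf(alias + len, size - len, "%s%08X", tag, val);
	else
		snprintf(alias + len, size - len, "%s*", tag);
}

/*
 * Build a simplified alias string. The prefix depends on the override_only
 * bit; subsystem IDs and class are hard-wired to wildcards in this sketch.
 */
void demo_pci_alias(char *alias, size_t size, unsigned int vendor,
		    unsigned int device, unsigned int override_only)
{
	size_t len;

	assert(alias && size > 0);

	if (override_only & DEMO_F_VFIO_DRIVER_OVERRIDE)
		snprintf(alias, size, "vfio_pci:");
	else
		snprintf(alias, size, "pci:");

	demo_add(alias, size, "v", vendor);
	demo_add(alias, size, "d", device);

	len = strlen(alias);
	snprintf(alias + len, size - len, "sv*sd*bc*sc*i*");
}
```

Called with vendor 0x15B3, device 0x1021 and the VFIO bit set, this produces the same pattern as the mlx5_vfio_pci example in the commit log. With both IDs set to the wildcard value it produces the generic vfio_pci pattern.
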

> As far as PCI core is concerned there's no need for this to be a
> bitmap.
>
> I think this would make more sense if split into two patches.  The
> first would add override_only and change pci_match_device().  Then
> there's no confusion about whether this is specific to VFIO.


Splitting may leave the first patch with dead code in the check below,
since found_id->override_only will always be 0 at that point.

If you still believe this is better, we can do it.

  if (found_id->override_only) {
    if (dev->driver_override)
      return found_id;
  } else
    return found_id;

> The second can add PCI_ID_F_VFIO_DRIVER_OVERRIDE and make the
> file2alias.c changes.  Most of the commit log applies to this part.
>
>>    */
>>   struct pci_device_id {
>>   	__u32 vendor, device;		/* Vendor and device ID or PCI_ANY_ID*/
>>   	__u32 subvendor, subdevice;	/* Subsystem ID's or PCI_ANY_ID */
>>   	__u32 class, class_mask;	/* (class,subclass,prog-if) triplet */
>>   	kernel_ulong_t driver_data;	/* Data private to the driver */
>> +	__u32 override_only;
>>   };
>>   
>>   
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 540b377ca8f6..57f9aa60f3b4 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -901,6 +901,34 @@ struct pci_driver {
>>   	.vendor = (vend), .device = (dev), \
>>   	.subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID
>>   
>> +/**
>> + * PCI_DEVICE_DRIVER_OVERRIDE - macro used to describe a PCI device with
>> + *                              override_only flags.
>> + * @vend: the 16 bit PCI Vendor ID
>> + * @dev: the 16 bit PCI Device ID
>> + * @driver_override: PCI Device override_only bitmap
>> + *
>> + * This macro is used to create a struct pci_device_id that matches a
>> + * specific device. The subvendor and subdevice fields will be set to
>> + * PCI_ANY_ID.
>> + */
>> +#define PCI_DEVICE_DRIVER_OVERRIDE(vend, dev, driver_override) \
>> +	.vendor = (vend), .device = (dev), .subvendor = PCI_ANY_ID, \
>> +	.subdevice = PCI_ANY_ID, .override_only = (driver_override)
>> +
>> +/**
>> + * PCI_DRIVER_OVERRIDE_DEVICE_VFIO - macro used to describe a VFIO
>> + *                                   "driver_override" PCI device.
>> + * @vend: the 16 bit PCI Vendor ID
>> + * @dev: the 16 bit PCI Device ID
>> + *
>> + * This macro is used to create a struct pci_device_id that matches a
>> + * specific device. The subvendor and subdevice fields will be set to
>> + * PCI_ANY_ID and the flags will be set to PCI_ID_F_VFIO_DRIVER_OVERRIDE.
>> + */
>> +#define PCI_DRIVER_OVERRIDE_DEVICE_VFIO(vend, dev) \
>> +	PCI_DEVICE_DRIVER_OVERRIDE(vend, dev, PCI_ID_F_VFIO_DRIVER_OVERRIDE)
>> +
>>   /**
>>    * PCI_DEVICE_SUB - macro used to describe a specific PCI device with subsystem
>>    * @vend: the 16 bit PCI Vendor ID
>> diff --git a/scripts/mod/devicetable-offsets.c b/scripts/mod/devicetable-offsets.c
>> index 9bb6c7edccc4..cc3625617a0e 100644
>> --- a/scripts/mod/devicetable-offsets.c
>> +++ b/scripts/mod/devicetable-offsets.c
>> @@ -42,6 +42,7 @@ int main(void)
>>   	DEVID_FIELD(pci_device_id, subdevice);
>>   	DEVID_FIELD(pci_device_id, class);
>>   	DEVID_FIELD(pci_device_id, class_mask);
>> +	DEVID_FIELD(pci_device_id, override_only);
>>   
>>   	DEVID(ccw_device_id);
>>   	DEVID_FIELD(ccw_device_id, match_flags);
>> diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
>> index 7c97fa8e36bc..c3edbf73157e 100644
>> --- a/scripts/mod/file2alias.c
>> +++ b/scripts/mod/file2alias.c
>> @@ -426,7 +426,7 @@ static int do_ieee1394_entry(const char *filename,
>>   	return 1;
>>   }
>>   
>> -/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
>> +/* Looks like: pci:vNdNsvNsdNbcNscNiN or <prefix>_pci:vNdNsvNsdNbcNscNiN. */
>>   static int do_pci_entry(const char *filename,
>>   			void *symval, char *alias)
>>   {
>> @@ -440,8 +440,12 @@ static int do_pci_entry(const char *filename,
>>   	DEF_FIELD(symval, pci_device_id, subdevice);
>>   	DEF_FIELD(symval, pci_device_id, class);
>>   	DEF_FIELD(symval, pci_device_id, class_mask);
>> +	DEF_FIELD(symval, pci_device_id, override_only);
>>   
>> -	strcpy(alias, "pci:");
>> +	if (override_only & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
>> +		strcpy(alias, "vfio_pci:");
>> +	else
>> +		strcpy(alias, "pci:");
>>   	ADD(alias, "v", vendor != PCI_ANY_ID, vendor);
>>   	ADD(alias, "d", device != PCI_ANY_ID, device);
>>   	ADD(alias, "sv", subvendor != PCI_ANY_ID, subvendor);
>> -- 
>> 2.18.1
>>

