linux-fpga.vger.kernel.org archive mirror
From: "Colberg, Peter" <peter.colberg@intel.com>
To: "yilun.xu@linux.intel.com" <yilun.xu@linux.intel.com>
Cc: "Xu, Yilun" <yilun.xu@intel.com>,
	"linux-fpga@vger.kernel.org" <linux-fpga@vger.kernel.org>,
	"mdf@kernel.org" <mdf@kernel.org>, "Wu, Hao" <hao.wu@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"russ.weight@linux.dev" <russ.weight@linux.dev>,
	"Pagani, Marco" <marpagan@redhat.com>,
	"trix@redhat.com" <trix@redhat.com>,
	"russell.h.weight@intel.com" <russell.h.weight@intel.com>,
	"matthew.gerlach@linux.intel.com"
	<matthew.gerlach@linux.intel.com>
Subject: Re: [PATCH v3 9/9] fpga: dfl: fix kernel warning on port release/assign for SRIOV
Date: Fri, 25 Oct 2024 22:54:16 +0000
Message-ID: <04e415171f4c0af1f9407ad41bd386a7c1ef07cc.camel@intel.com>
In-Reply-To: <ZvJ/6wHoU9VXJKh8@yilunxu-OptiPlex-7050>

On Tue, 2024-09-24 at 17:01 +0800, Xu Yilun wrote:
> On Thu, Sep 19, 2024 at 04:34:30PM -0400, Peter Colberg wrote:
> > From: Xu Yilun <yilun.xu@intel.com>
> > 
> 
> Please describe what this patch does at the beginning. And below
> background descriptions follow.

The description has been fully revised in the new final patch "fpga:
dfl: destroy/recreate feature platform device on port release/assign".

> 
> > With the Intel FPGA PAC D5005, DFL ports are registered as platform
> > devices in PF mode. The port device must be removed from the host when
> > the user wants to configure the port as a VF for use by a user-space
> > driver, e.g., for pass-through to a virtual machine. The FME device
> > ioctls DFL_FPGA_FME_PORT_RELEASE/ASSIGN are assigned for this purpose.
> > 
> > In the previous implementation, the port platform device is not
> > completely destroyed on port release: it is removed from the system by
> > platform_device_del(), but the platform device instance is retained.
> > When DFL_FPGA_FME_PORT_ASSIGN is called, the platform device is added
> > back with platform_device_add(), which conflicts with this comment of
> > device_add(): "Do not call this routine more than once for any device
> > structure", and would previously cause a kernel warning at runtime.
> > 
> > This patch completely unregisters the port platform device on release
> > and registers a new device on assign. But the main work is to remove
> 
> The main work of this series, not this patch.
> 
> > the dependency on struct dfl_feature_platform_data for many internal DFL
> > APIs. This structure holds many DFL enumeration infos for feature
> > devices. Many DFL APIs are expected to work with these infos even when
> > the port platform device is unregistered. But after this change, the
> > platform_data will be freed on port release. Hence this patch introduces
> > a new structure dfl_feature_dev_data, which acts similarly to the
> > previous dfl_feature_platform_data. dfl_feature_platform_data then only
> > needs a pointer to dfl_feature_dev_data to query DFL enumeration infos.
> > 
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > Signed-off-by: Peter Colberg <peter.colberg@intel.com>
> > Reviewed-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
> > ---
> >  drivers/fpga/dfl-fme-br.c |   2 -
> >  drivers/fpga/dfl.c        | 207 ++++++++++++++++++--------------------
> >  drivers/fpga/dfl.h        |  57 +++++++----
> >  3 files changed, 133 insertions(+), 133 deletions(-)
> > 
> > diff --git a/drivers/fpga/dfl-fme-br.c b/drivers/fpga/dfl-fme-br.c
> > index 5c60a38ec76c..a298a041877b 100644
> > --- a/drivers/fpga/dfl-fme-br.c
> > +++ b/drivers/fpga/dfl-fme-br.c
> > @@ -85,8 +85,6 @@ static void fme_br_remove(struct platform_device *pdev)
> >  
> >  	fpga_bridge_unregister(br);
> >  
> > -	if (priv->port_fdata)
> > -		put_device(&priv->port_fdata->dev->dev);
> 
> I can't remember why all the get_device/put_device() are not needed anymore.

This is now detailed in the new patch "fpga: dfl: destroy/recreate
feature platform device on port release/assign".

> 
> >  	if (priv->port_ops)
> >  		dfl_fpga_port_ops_put(priv->port_ops);
> >  }
> > diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
> > index 52f58d029ca4..a77d7692b170 100644
> > --- a/drivers/fpga/dfl.c
> > +++ b/drivers/fpga/dfl.c
> > @@ -160,7 +160,7 @@ struct dfl_fpga_port_ops *dfl_fpga_port_ops_get(struct dfl_feature_dev_data *fda
> >  
> >  	list_for_each_entry(ops, &dfl_port_ops_list, node) {
> >  		/* match port_ops using the name of platform device */
> > -		if (!strcmp(fdata->dev->name, ops->name)) {
> > +		if (!strcmp(fdata->pdev_name, ops->name)) {
> >  			if (!try_module_get(ops->owner))
> >  				ops = NULL;
> >  			goto done;
> > @@ -681,7 +681,6 @@ EXPORT_SYMBOL_GPL(dfl_fpga_dev_ops_unregister);
> >   * @nr_irqs: number of irqs for all feature devices.
> >   * @irq_table: Linux IRQ numbers for all irqs, indexed by local irq index of
> >   *	       this device.
> > - * @feature_dev: current feature device.
> >   * @type: the current FIU type.
> >   * @ioaddr: header register region address of current FIU in enumeration.
> >   * @start: register resource start of current FIU.
> > @@ -695,7 +694,6 @@ struct build_feature_devs_info {
> >  	unsigned int nr_irqs;
> >  	int *irq_table;
> >  
> > -	struct platform_device *feature_dev;
> >  	enum dfl_id_type type;
> >  	void __iomem *ioaddr;
> >  	resource_size_t start;
> > @@ -736,7 +734,6 @@ static void dfl_fpga_cdev_add_port_data(struct dfl_fpga_cdev *cdev,
> >  {
> >  	mutex_lock(&cdev->lock);
> >  	list_add(&fdata->node, &cdev->port_dev_list);
> > -	get_device(&fdata->dev->dev);
> >  	mutex_unlock(&cdev->lock);
> >  }
> >  
> > @@ -744,7 +741,6 @@ static struct dfl_feature_dev_data *
> >  binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
> >  {
> >  	enum dfl_id_type type = binfo->type;
> > -	struct platform_device *fdev = binfo->feature_dev;
> >  	struct dfl_feature_info *finfo, *p;
> >  	struct dfl_feature_dev_data *fdata;
> >  	int ret, index = 0, res_idx = 0;
> > @@ -752,18 +748,27 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
> >  	if (WARN_ON_ONCE(type >= DFL_ID_MAX))
> >  		return ERR_PTR(-EINVAL);
> >  
> > -	/*
> > -	 * we do not need to care for the memory which is associated with
> > -	 * the platform device. After calling platform_device_unregister(),
> > -	 * it will be automatically freed by device's release() callback,
> > -	 * platform_device_release().
> > -	 */
> > -	fdata = kzalloc(struct_size(fdata, features, binfo->feature_num), GFP_KERNEL);
> > +	fdata = devm_kzalloc(binfo->dev, sizeof(*fdata), GFP_KERNEL);
> >  	if (!fdata)
> >  		return ERR_PTR(-ENOMEM);
> >  
> > -	fdata->dev = fdev;
> > +	fdata->features = devm_kcalloc(binfo->dev, binfo->feature_num,
> > +				       sizeof(*fdata->features), GFP_KERNEL);
> > +	if (!fdata->features)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	fdata->resources = devm_kcalloc(binfo->dev, binfo->feature_num,
> > +					sizeof(*fdata->resources), GFP_KERNEL);
> > +	if (!fdata->resources)
> > +		return ERR_PTR(-ENOMEM);
> > +
> >  	fdata->type = type;
> > +
> > +	fdata->pdev_id = dfl_id_alloc(type, binfo->dev);
> > +	if (fdata->pdev_id < 0)
> > +		return ERR_PTR(fdata->pdev_id);
> > +
> > +	fdata->pdev_name = dfl_devs[type].name;
> >  	fdata->num = binfo->feature_num;
> >  	fdata->dfl_cdev = binfo->cdev;
> >  	fdata->id = FEATURE_DEV_ID_UNUSED;
> > @@ -779,15 +784,6 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
> >  	 */
> >  	WARN_ON(fdata->disable_count);
> >  
> > -	fdev->dev.platform_data = fdata;
> > -
> > -	/* each sub feature has one MMIO resource */
> > -	fdev->num_resources = binfo->feature_num;
> > -	fdev->resource = kcalloc(binfo->feature_num, sizeof(*fdev->resource),
> > -				 GFP_KERNEL);
> > -	if (!fdev->resource)
> > -		return ERR_PTR(-ENOMEM);
> > -
> >  	/* fill features and resource information for feature dev */
> >  	list_for_each_entry_safe(finfo, p, &binfo->sub_features, node) {
> >  		struct dfl_feature *feature = &fdata->features[index++];
> > @@ -795,7 +791,6 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
> >  		unsigned int i;
> >  
> >  		/* save resource information for each feature */
> > -		feature->dev = fdev;
> >  		feature->id = finfo->fid;
> >  		feature->revision = finfo->revision;
> >  		feature->dfh_version = finfo->dfh_version;
> > @@ -804,8 +799,10 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
> >  			feature->params = devm_kmemdup(binfo->dev,
> >  						       finfo->params, finfo->param_size,
> >  						       GFP_KERNEL);
> > -			if (!feature->params)
> > -				return ERR_PTR(-ENOMEM);
> > +			if (!feature->params) {
> > +				ret = -ENOMEM;
> > +				goto err_free_id;
> > +			}
> >  
> >  			feature->param_size = finfo->param_size;
> >  		}
> > @@ -823,19 +820,20 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
> >  						      &finfo->mmio_res);
> >  			if (IS_ERR(feature->ioaddr)) {
> >  				ret = PTR_ERR(feature->ioaddr);
> > -				return ERR_PTR(ret);
> > +				goto err_free_id;
> >  			}
> >  		} else {
> >  			feature->resource_index = res_idx;
> > -			fdev->resource[res_idx++] = finfo->mmio_res;
> > +			fdata->resources[res_idx++] = finfo->mmio_res;
> >  		}
> >  
> >  		if (finfo->nr_irqs) {
> >  			ctx = devm_kcalloc(binfo->dev, finfo->nr_irqs,
> >  					   sizeof(*ctx), GFP_KERNEL);
> > -			if (!ctx)
> > -				return ERR_PTR(-ENOMEM);
> > -
> > +			if (!ctx) {
> > +				ret = -ENOMEM;
> > +				goto err_free_id;
> > +			}
> >  			for (i = 0; i < finfo->nr_irqs; i++)
> >  				ctx[i].irq =
> >  					binfo->irq_table[finfo->irq_base + i];
> > @@ -848,36 +846,67 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
> >  		kfree(finfo);
> >  	}
> >  
> > +	fdata->resource_num = res_idx;
> > +
> >  	return fdata;
> > +
> > +err_free_id:
> > +	dfl_id_free(type, fdata->pdev_id);
> > +
> > +	return ERR_PTR(ret);
> >  }
> >  
> > -static int
> > -build_info_create_dev(struct build_feature_devs_info *binfo)
> > +/*
> > + * register current feature device, it is called when we need to switch to
> > + * another feature parsing or we have parsed all features on given device
> > + * feature list.
> > + */
> > +static int feature_dev_register(struct dfl_feature_dev_data *fdata)
> >  {
> > -	enum dfl_id_type type = binfo->type;
> > +	struct dfl_feature_platform_data pdata = {};
> >  	struct platform_device *fdev;
> > +	struct dfl_feature *feature;
> > +	int ret;
> >  
> > -	/*
> > -	 * we use -ENODEV as the initialization indicator which indicates
> > -	 * whether the id need to be reclaimed
> > -	 */
> > -	fdev = platform_device_alloc(dfl_devs[type].name, -ENODEV);
> > +	fdev = platform_device_alloc(fdata->pdev_name, fdata->pdev_id);
> 
> I see a part of the change is to delay the platform_device_alloc() so
> that the feature platform device registration work could be gathered in
> one function. Not sure if it is necessary for the port platform release
> issue, or could be an independent enhancement patch.

This has been factored out into a separate patch "fpga: dfl: allocate
platform device after feature device data". Delaying the call to
platform_device_alloc() to feature_dev_register() is essential to the
overall series since the last patch reuses feature_dev_register() for
the port reassign. If platform_device_alloc() were not part of
feature_dev_register(), the call to platform_device_alloc() would have
to be repeated in dfl_fpga_cdev_assign_port().
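
To illustrate the point, here is a rough user-space sketch of the
lifecycle, not the kernel code. All mock_* names are hypothetical
stand-ins: mock_fdata plays the role of the long-lived feature device
data, and mock_pdev the role of the short-lived platform device.
Because allocation lives inside the register helper, initial
enumeration and a later port assign go through the same call, and a
device instance is never added twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a platform device instance. */
struct mock_pdev {
	const char *name;
	int id;
	int added;	/* set exactly once by mock_pdev_add() */
};

/* Hypothetical stand-in for the feature device data, which outlives
 * any single platform device instance.
 */
struct mock_fdata {
	const char *pdev_name;
	int pdev_id;
	struct mock_pdev *dev;	/* NULL while the port is released */
};

/* Models the rule "do not add the same device structure twice". */
static void mock_pdev_add(struct mock_pdev *pdev)
{
	assert(!pdev->added);
	pdev->added = 1;
}

/* Allocate and add in one place, so every caller gets a fresh device. */
static int mock_feature_dev_register(struct mock_fdata *fdata)
{
	struct mock_pdev *pdev;

	if (fdata->dev)
		return -1;	/* already registered */

	pdev = calloc(1, sizeof(*pdev));
	if (!pdev)
		return -1;

	pdev->name = fdata->pdev_name;
	pdev->id = fdata->pdev_id;
	mock_pdev_add(pdev);
	fdata->dev = pdev;
	return 0;
}

/* Destroy the device instance completely; the data stays behind. */
static void mock_feature_dev_unregister(struct mock_fdata *fdata)
{
	free(fdata->dev);
	fdata->dev = NULL;
}
```

In this model, a release followed by an assign is simply
mock_feature_dev_unregister() followed by mock_feature_dev_register()
on the same mock_fdata, with the name and id carried over from the
data rather than from the old device.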

> 
> I raise this cause I still feel hard to understand all these changes
> in this patch and look for any chance to further split it.

The smaller patches of the v4 series are hopefully much more
straightforward to follow now. The refactoring uncovered a leak of the
feature device id for released ports, which has been fixed in the new
patch "fpga: dfl: store platform device id in feature device data".

> 
> BTW: Overall this series is much improved than the last version.
> 
> Thanks,
> Yilun

Thank you for your quick review and guidance for this series.

Thanks,
Peter

Thread overview (21+ messages):
2024-09-19 20:34 [PATCH v3 0/9] fpga: dfl: fix kernel warning on port release/assign for SRIOV Peter Colberg
2024-09-19 20:34 ` [PATCH v3 1/9] fpga: dfl: omit unneeded argument pdata from dfl_feature_instance_init() Peter Colberg
2024-09-19 20:34 ` [PATCH v3 2/9] fpga: dfl: omit unneeded null pointer check from {afu,fme}_open() Peter Colberg
2024-09-24  6:28   ` Xu Yilun
2024-10-25 22:39     ` Colberg, Peter
2024-09-24  6:41   ` Xu Yilun
2024-10-25 22:40     ` Colberg, Peter
2024-09-19 20:34 ` [PATCH v3 3/9] fpga: dfl: afu: use parent device to log errors on port enable/disable Peter Colberg
2024-09-19 20:34 ` [PATCH v3 4/9] fpga: dfl: afu: define local pointer to feature device Peter Colberg
2024-09-19 20:34 ` [PATCH v3 5/9] fpga: dfl: pass feature platform data instead of device as argument Peter Colberg
2024-09-19 20:34 ` [PATCH v3 6/9] fpga: dfl: factor out feature data creation from build_info_commit_dev() Peter Colberg
2024-09-24  7:15   ` Xu Yilun
2024-10-25 22:43     ` Colberg, Peter
2024-09-19 20:34 ` [PATCH v3 7/9] fpga: dfl: store FIU type in feature platform data Peter Colberg
2024-09-24  7:42   ` Xu Yilun
2024-10-25 22:46     ` Colberg, Peter
2024-09-19 20:34 ` [PATCH v3 8/9] fpga: dfl: refactor functions to take/return feature device data Peter Colberg
2024-09-19 20:34 ` [PATCH v3 9/9] fpga: dfl: fix kernel warning on port release/assign for SRIOV Peter Colberg
2024-09-24  9:01   ` Xu Yilun
2024-10-25 22:54     ` Colberg, Peter [this message]
2024-10-26  1:52       ` Colberg, Peter
