From: Wei Yang <weiyang@linux.vnet.ibm.com>
To: Bjorn Helgaas <bhelgaas@google.com>
Cc: Wei Yang <weiyang@linux.vnet.ibm.com>,
netdev <netdev@vger.kernel.org>,
yevgenyp@mellanox.com, Or Gerlitz <ogerlitz@mellanox.com>,
Amir Vadai <amirv@mellanox.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
David Miller <davem@davemloft.net>
Subject: Re: [PATCH] net/mlx4_core: match pci_device_id including dynids
Date: Thu, 3 Apr 2014 14:54:32 +0800 [thread overview]
Message-ID: <20140403065432.GA18045@richard> (raw)
In-Reply-To: <CAErSpo749gfvcJmwjCw6c6H07S9cvaX2kz4jw1AwUZzbDN6Afw@mail.gmail.com>
On Wed, Apr 02, 2014 at 09:11:18PM -0600, Bjorn Helgaas wrote:
>On Wed, Apr 2, 2014 at 7:51 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>> On Wed, Apr 02, 2014 at 12:28:46PM -0600, Bjorn Helgaas wrote:
>
>>>I looked at all the .error_detected() methods in the tree, and I think
>>>mlx4_pci_err_detected() is the only one that actually throws away the
>>>pci_drvdata(). Most drivers just do pci_disable_device() and some
>>>other housekeeping. Can you do something similar?
>>
>> Change mlx4_remove_one() to have just pci_disable_device() is a big decisioin.
>> I believe Or and Amir will have better ideas.
>
>Oh, I totally agree that you shouldn't make such a radical change just
>for this issue. What I meant was that maybe there's a relatively
>simple way for you to hang on to the pci_drvdata() or at least the
>pci_device_id.driver_data value.
>
Sorry, I misunderstand your point :-)
>BUT just on general principles, you should at least look at the other
>drivers and use the same model unless you need something different. I
>doubt there's anything so special about mlx4 that it needs a totally
>different approach. But again, this is a broad comment, not a
>suggestion for how to solve this particular issue.
>
>>>The mlx4 approach of completely tearing down and rebuilding the device
>>>*is* sort of appealing because I'm a little dubious of assuming that
>>>any driver setup done before the reset is still valid afterwards. But
>>>maybe you could at least hang onto the pci_device_id.driver_data
>>>value? As far as the PCI core is concerned, it supplied that to the
>>>.probe() function, and nothing has changed since then, so there's no
>>>reason for a driver to request it again.
>>
>> Hmm... so you suggest every driver better do what mlx4_core does? Clear/Reset
>> the device? This is reasonable to me, while one case comes into my mind --
>> SRIOV. For example this PF triggers an error and be reported the error. If we
>> tear down the PF, we should remove all the VFs too. This means once the PF
>> gets into an error, all the PF and VFs should be cleared/reset, no matter
>> whether the VFs are healthy or not. So there is no chance to isolate PF and
>> VFs. I guess this is not what we want to achieve for SRIOV. Is my
>> understanding correct?
>
>No, I'm not suggesting that everybody do what mlx4 does. I'm just
>saying that I can see why mlx4 was designed that way.
Yep, I misunderstand it.
>
>>From the PCI core's perspective, after .probe() returns successfully,
>we can call any driver entry point and pass the pci_dev to it, and
>expect it to work. Doing mlx4_remove_one() in mlx4_pci_err_detected()
>sort of breaks that assumption because you clear out pci_drvdata().
>Right now, the only other entry point mlx4 really implements is
>mlx4_remove_one(), and it has a hack that tests whether pci_drvdata()
>is NULL. But that's ... a hack, and you'll have to do the same
>if/when you implement suspend/resume/sriov_configure/etc.
>
>So doing the whole tear-down in mlx4_pci_err_detected() doesn't seem
>like a great design to me. But mlx4_remove_one() probably could be
>refactored to move the bulk of its code into a helper, and then you
>could have both mlx4_remove_one() and mlx4_pci_err_detected() call
>that helper. Clearing pci_set_drvdata() could be done only in
>mlx4_remove_one(), so it could be preserved in
>mlx4_pci_err_detected().
>
>Bjorn
Here is another one based on your comment, which split mlx4_remove_one into
two and named a helper __mlx4_remove_one(). mlx4_pci_err_detected() will just
call __mlx4_remove_one(), which will not release drvdata.
BTW, this is not tested, just want to make sure my understanding is correct.
>From 84a5a9df0604cbea9b70c74b0709258841637946 Mon Sep 17 00:00:00 2001
From: Wei Yang <weiyang@linux.vnet.ibm.com>
Date: Mon, 31 Mar 2014 11:34:57 +0800
Subject: [PATCH] net/mlx4_core: match pci_device_id including dynids
Fix issue introduced by commit: 97a5221 "net/mlx4_core: pass
pci_device_id.driver_data to __mlx4_init_one during reset".
pci_match_id() just match the static pci_device_id, which may return NULL if
someone binds the driver to a device manually using
/sys/bus/pci/drivers/.../new_id.
This patch wrap up a helper function __mlx4_remove_one() which does the tear
down function but preserve the drv_data. Functions like
mlx4_pci_err_detected() and mlx4_restart_one() will call this one with out
releasing drvdata.
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/main.c | 149 ++++++++++++++++-------------
1 file changed, 80 insertions(+), 69 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index aa54ef7..fd1f288 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2268,7 +2268,12 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data)
/* Allow large DMA segments, up to the firmware limit of 1 GB */
dma_set_max_seg_size(&pdev->dev, 1024 * 1024 * 1024);
- priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ dev = pci_get_drvdata(pdev);
+ if (!dev)
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ else
+ priv = mlx4_priv(dev);
+
if (!priv) {
err = -ENOMEM;
goto err_release_regions;
@@ -2525,77 +2530,81 @@ static int mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
return __mlx4_init_one(pdev, id->driver_data);
}
-static void mlx4_remove_one(struct pci_dev *pdev)
+static void __mlx4_remove_one(struct mlx4_dev *dev)
{
- struct mlx4_dev *dev = pci_get_drvdata(pdev);
- struct mlx4_priv *priv = mlx4_priv(dev);
- int p;
+ /* in SRIOV it is not allowed to unload the pf's
+ * driver while there are alive vf's */
+ if (mlx4_is_master(dev)) {
+ if (mlx4_how_many_lives_vf(dev))
+ printk(KERN_ERR "Removing PF when there are assigned VF's !!!\n");
+ }
+ mlx4_stop_sense(dev);
+ mlx4_unregister_device(dev);
- if (dev) {
- /* in SRIOV it is not allowed to unload the pf's
- * driver while there are alive vf's */
- if (mlx4_is_master(dev)) {
- if (mlx4_how_many_lives_vf(dev))
- printk(KERN_ERR "Removing PF when there are assigned VF's !!!\n");
- }
- mlx4_stop_sense(dev);
- mlx4_unregister_device(dev);
+ for (p = 1; p <= dev->caps.num_ports; p++) {
+ mlx4_cleanup_port_info(&priv->port[p]);
+ mlx4_CLOSE_PORT(dev, p);
+ }
- for (p = 1; p <= dev->caps.num_ports; p++) {
- mlx4_cleanup_port_info(&priv->port[p]);
- mlx4_CLOSE_PORT(dev, p);
- }
+ if (mlx4_is_master(dev))
+ mlx4_free_resource_tracker(dev,
+ RES_TR_FREE_SLAVES_ONLY);
- if (mlx4_is_master(dev))
- mlx4_free_resource_tracker(dev,
- RES_TR_FREE_SLAVES_ONLY);
-
- mlx4_cleanup_counters_table(dev);
- mlx4_cleanup_qp_table(dev);
- mlx4_cleanup_srq_table(dev);
- mlx4_cleanup_cq_table(dev);
- mlx4_cmd_use_polling(dev);
- mlx4_cleanup_eq_table(dev);
- mlx4_cleanup_mcg_table(dev);
- mlx4_cleanup_mr_table(dev);
- mlx4_cleanup_xrcd_table(dev);
- mlx4_cleanup_pd_table(dev);
+ mlx4_cleanup_counters_table(dev);
+ mlx4_cleanup_qp_table(dev);
+ mlx4_cleanup_srq_table(dev);
+ mlx4_cleanup_cq_table(dev);
+ mlx4_cmd_use_polling(dev);
+ mlx4_cleanup_eq_table(dev);
+ mlx4_cleanup_mcg_table(dev);
+ mlx4_cleanup_mr_table(dev);
+ mlx4_cleanup_xrcd_table(dev);
+ mlx4_cleanup_pd_table(dev);
- if (mlx4_is_master(dev))
- mlx4_free_resource_tracker(dev,
- RES_TR_FREE_STRUCTS_ONLY);
-
- iounmap(priv->kar);
- mlx4_uar_free(dev, &priv->driver_uar);
- mlx4_cleanup_uar_table(dev);
- if (!mlx4_is_slave(dev))
- mlx4_clear_steering(dev);
- mlx4_free_eq_table(dev);
- if (mlx4_is_master(dev))
- mlx4_multi_func_cleanup(dev);
- mlx4_close_hca(dev);
- if (mlx4_is_slave(dev))
- mlx4_multi_func_cleanup(dev);
- mlx4_cmd_cleanup(dev);
-
- if (dev->flags & MLX4_FLAG_MSI_X)
- pci_disable_msix(pdev);
- if (dev->flags & MLX4_FLAG_SRIOV) {
- mlx4_warn(dev, "Disabling SR-IOV\n");
- pci_disable_sriov(pdev);
- }
+ if (mlx4_is_master(dev))
+ mlx4_free_resource_tracker(dev,
+ RES_TR_FREE_STRUCTS_ONLY);
- if (!mlx4_is_slave(dev))
- mlx4_free_ownership(dev);
+ iounmap(priv->kar);
+ mlx4_uar_free(dev, &priv->driver_uar);
+ mlx4_cleanup_uar_table(dev);
+ if (!mlx4_is_slave(dev))
+ mlx4_clear_steering(dev);
+ mlx4_free_eq_table(dev);
+ if (mlx4_is_master(dev))
+ mlx4_multi_func_cleanup(dev);
+ mlx4_close_hca(dev);
+ if (mlx4_is_slave(dev))
+ mlx4_multi_func_cleanup(dev);
+ mlx4_cmd_cleanup(dev);
+
+ if (dev->flags & MLX4_FLAG_MSI_X)
+ pci_disable_msix(pdev);
+ if (dev->flags & MLX4_FLAG_SRIOV) {
+ mlx4_warn(dev, "Disabling SR-IOV\n");
+ pci_disable_sriov(pdev);
+ }
+
+ if (!mlx4_is_slave(dev))
+ mlx4_free_ownership(dev);
+
+ kfree(dev->caps.qp0_tunnel);
+ kfree(dev->caps.qp0_proxy);
+ kfree(dev->caps.qp1_tunnel);
+ kfree(dev->caps.qp1_proxy);
+ pci_release_regions(pdev);
+ pci_disable_device(pdev);
+}
- kfree(dev->caps.qp0_tunnel);
- kfree(dev->caps.qp0_proxy);
- kfree(dev->caps.qp1_tunnel);
- kfree(dev->caps.qp1_proxy);
+static void mlx4_remove_one(struct pci_dev *pdev)
+{
+ struct mlx4_dev *dev = pci_get_drvdata(pdev);
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int p;
+ if (dev) {
+ __mlx4_remove_one(dev);
kfree(priv);
- pci_release_regions(pdev);
- pci_disable_device(pdev);
pci_set_drvdata(pdev, NULL);
}
}
@@ -2607,7 +2616,7 @@ int mlx4_restart_one(struct pci_dev *pdev)
int pci_dev_data;
pci_dev_data = priv->pci_dev_data;
- mlx4_remove_one(pdev);
+ __mlx4_remove_one(pdev);
return __mlx4_init_one(pdev, pci_dev_data);
}
@@ -2662,7 +2671,7 @@ MODULE_DEVICE_TABLE(pci, mlx4_pci_table);
static pci_ers_result_t mlx4_pci_err_detected(struct pci_dev *pdev,
pci_channel_state_t state)
{
- mlx4_remove_one(pdev);
+ __mlx4_remove_one(pdev);
return state == pci_channel_io_perm_failure ?
PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_NEED_RESET;
@@ -2670,11 +2679,13 @@ static pci_ers_result_t mlx4_pci_err_detected(struct pci_dev *pdev,
static pci_ers_result_t mlx4_pci_slot_reset(struct pci_dev *pdev)
{
- const struct pci_device_id *id;
- int ret;
+ struct mlx4_dev *dev = pci_get_drvdata(pdev);
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int pci_dev_data;
+ int ret;
- id = pci_match_id(mlx4_pci_table, pdev);
- ret = __mlx4_init_one(pdev, id->driver_data);
+ pci_dev_data = priv->pci_dev_data;
+ ret = __mlx4_init_one(pdev, pci_dev_data);
return ret ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
}
--
1.7.9.5
--
Richard Yang
Help you, Help me
next prev parent reply other threads:[~2014-04-03 6:54 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-01 4:36 [PATCH] net/mlx4_core: match pci_device_id including dynids Wei Yang
2014-04-02 2:29 ` David Miller
2014-04-02 18:28 ` Bjorn Helgaas
2014-04-03 1:51 ` Wei Yang
2014-04-03 3:11 ` Bjorn Helgaas
2014-04-03 6:54 ` Wei Yang [this message]
2014-04-03 21:12 ` Bjorn Helgaas
2014-04-04 14:20 ` Wei Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140403065432.GA18045@richard \
--to=weiyang@linux.vnet.ibm.com \
--cc=amirv@mellanox.com \
--cc=bhelgaas@google.com \
--cc=davem@davemloft.net \
--cc=linux-pci@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=ogerlitz@mellanox.com \
--cc=yevgenyp@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.