netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net/mlx4_core: match pci_device_id including dynids
@ 2014-04-01  4:36 Wei Yang
  2014-04-02  2:29 ` David Miller
  2014-04-02 18:28 ` Bjorn Helgaas
  0 siblings, 2 replies; 8+ messages in thread
From: Wei Yang @ 2014-04-01  4:36 UTC (permalink / raw)
  To: netdev, yevgenyp, ogerlitz, amirv, bhelgaas, linux-pci; +Cc: Wei Yang

Fix issue introduced by commit: 97a5221 "net/mlx4_core: pass
pci_device_id.driver_data to __mlx4_init_one during reset".

pci_match_id() just match the static pci_device_id, which may return NULL if
someone binds the driver to a device manually using
/sys/bus/pci/drivers/.../new_id.

This patch match pci_device_id with pci_match_device() to cover both dynids
and static id_table.

Thanks to Bjorn finding this issue.

CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/main.c |    4 +++-
 drivers/pci/pci-driver.c                  |    3 ++-
 include/linux/pci.h                       |    2 ++
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index aa54ef7..b0edb5c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2673,7 +2673,9 @@ static pci_ers_result_t mlx4_pci_slot_reset(struct pci_dev *pdev)
 	const struct pci_device_id *id;
 	int ret;
 
-	id = pci_match_id(mlx4_pci_table, pdev);
+	id = pci_match_device(pci_dev_driver(pdev), pdev);
+	BUG_ON(!id);
+
 	ret = __mlx4_init_one(pdev, id->driver_data);
 
 	return ret ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 25f0bc6..1ee26a1 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -225,7 +225,7 @@ const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,
  * system is in its list of supported devices.  Returns the matching
  * pci_device_id structure or %NULL if there is no match.
  */
-static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
+const struct pci_device_id *pci_match_device(struct pci_driver *drv,
 						    struct pci_dev *dev)
 {
 	struct pci_dynid *dynid;
@@ -1355,6 +1355,7 @@ postcore_initcall(pci_driver_init);
 
 EXPORT_SYMBOL_GPL(pci_add_dynid);
 EXPORT_SYMBOL(pci_match_id);
+EXPORT_SYMBOL(pci_match_device);
 EXPORT_SYMBOL(__pci_register_driver);
 EXPORT_SYMBOL(pci_unregister_driver);
 EXPORT_SYMBOL(pci_dev_driver);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 44f6883..8ede1b5 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1134,6 +1134,8 @@ int pci_add_dynid(struct pci_driver *drv,
 		  unsigned long driver_data);
 const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,
 					 struct pci_dev *dev);
+const struct pci_device_id *pci_match_device(struct pci_driver *drv,
+						    struct pci_dev *dev);
 int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
 		    int pass);
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] net/mlx4_core: match pci_device_id including dynids
  2014-04-01  4:36 [PATCH] net/mlx4_core: match pci_device_id including dynids Wei Yang
@ 2014-04-02  2:29 ` David Miller
  2014-04-02 18:28 ` Bjorn Helgaas
  1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2014-04-02  2:29 UTC (permalink / raw)
  To: weiyang; +Cc: netdev, yevgenyp, ogerlitz, amirv, bhelgaas, linux-pci

From: Wei Yang <weiyang@linux.vnet.ibm.com>
Date: Tue,  1 Apr 2014 12:36:01 +0800

> Fix issue introduced by commit: 97a5221 "net/mlx4_core: pass
> pci_device_id.driver_data to __mlx4_init_one during reset".
> 
> pci_match_id() just match the static pci_device_id, which may return NULL if
> someone binds the driver to a device manually using
> /sys/bus/pci/drivers/.../new_id.
> 
> This patch match pci_device_id with pci_match_device() to cover both dynids
> and static id_table.
> 
> Thanks to Bjorn finding this issue.
> 
> CC: Bjorn Helgaas <bhelgaas@google.com>
> CC: Amir Vadai <amirv@mellanox.com>
> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> Acked-by: Amir Vadai <amirv@mellanox.com>

And ACK from the PCI folks would be greatly appreciated.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] net/mlx4_core: match pci_device_id including dynids
  2014-04-01  4:36 [PATCH] net/mlx4_core: match pci_device_id including dynids Wei Yang
  2014-04-02  2:29 ` David Miller
@ 2014-04-02 18:28 ` Bjorn Helgaas
  2014-04-03  1:51   ` Wei Yang
  1 sibling, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2014-04-02 18:28 UTC (permalink / raw)
  To: Wei Yang
  Cc: netdev, yevgenyp, Or Gerlitz, Amir Vadai,
	linux-pci@vger.kernel.org, David Miller

On Mon, Mar 31, 2014 at 10:36 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
> Fix issue introduced by commit: 97a5221 "net/mlx4_core: pass
> pci_device_id.driver_data to __mlx4_init_one during reset".
>
> pci_match_id() just match the static pci_device_id, which may return NULL if
> someone binds the driver to a device manually using
> /sys/bus/pci/drivers/.../new_id.
>
> This patch match pci_device_id with pci_match_device() to cover both dynids
> and static id_table.
>
> Thanks to Bjorn finding this issue.

I didn't actually find it; I just passed along an issue found by
Coverity.  I usually include "Found by Coverity (CID 1194947)" in the
changelog in case anybody is trying to manage those issues.

> CC: Bjorn Helgaas <bhelgaas@google.com>
> CC: Amir Vadai <amirv@mellanox.com>
> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> Acked-by: Amir Vadai <amirv@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx4/main.c |    4 +++-
>  drivers/pci/pci-driver.c                  |    3 ++-
>  include/linux/pci.h                       |    2 ++
>  3 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
> index aa54ef7..b0edb5c 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/main.c
> @@ -2673,7 +2673,9 @@ static pci_ers_result_t mlx4_pci_slot_reset(struct pci_dev *pdev)
>         const struct pci_device_id *id;
>         int ret;
>
> -       id = pci_match_id(mlx4_pci_table, pdev);
> +       id = pci_match_device(pci_dev_driver(pdev), pdev);
> +       BUG_ON(!id);

This doesn't seem like the best solution to me, because it basically
duplicates part of __pci_device_probe() in the driver.  And of course,
I'd rather not export pci_match_device() if we can avoid it (I don't
have a specific objection; just the general "don't export more than
necessary" rule, and something with a single user always makes me look
twice).

I looked at all the .error_detected() methods in the tree, and I think
mlx4_pci_err_detected() is the only one that actually throws away the
pci_drvdata().  Most drivers just do pci_disable_device() and some
other housekeeping.  Can you do something similar?

The mlx4 approach of completely tearing down and rebuilding the device
*is* sort of appealing because I'm a little dubious of assuming that
any driver setup done before the reset is still valid afterwards.  But
maybe you could at least hang onto the pci_device_id.driver_data
value?  As far as the PCI core is concerned, it supplied that to the
.probe() function, and nothing has changed since then, so there's no
reason for a driver to request it again.

Bjorn

>         ret = __mlx4_init_one(pdev, id->driver_data);
>
>         return ret ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 25f0bc6..1ee26a1 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -225,7 +225,7 @@ const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,
>   * system is in its list of supported devices.  Returns the matching
>   * pci_device_id structure or %NULL if there is no match.
>   */
> -static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
> +const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>                                                     struct pci_dev *dev)
>  {
>         struct pci_dynid *dynid;
> @@ -1355,6 +1355,7 @@ postcore_initcall(pci_driver_init);
>
>  EXPORT_SYMBOL_GPL(pci_add_dynid);
>  EXPORT_SYMBOL(pci_match_id);
> +EXPORT_SYMBOL(pci_match_device);
>  EXPORT_SYMBOL(__pci_register_driver);
>  EXPORT_SYMBOL(pci_unregister_driver);
>  EXPORT_SYMBOL(pci_dev_driver);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 44f6883..8ede1b5 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1134,6 +1134,8 @@ int pci_add_dynid(struct pci_driver *drv,
>                   unsigned long driver_data);
>  const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,
>                                          struct pci_dev *dev);
> +const struct pci_device_id *pci_match_device(struct pci_driver *drv,
> +                                                   struct pci_dev *dev);
>  int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
>                     int pass);
>
> --
> 1.7.9.5
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] net/mlx4_core: match pci_device_id including dynids
  2014-04-02 18:28 ` Bjorn Helgaas
@ 2014-04-03  1:51   ` Wei Yang
  2014-04-03  3:11     ` Bjorn Helgaas
  0 siblings, 1 reply; 8+ messages in thread
From: Wei Yang @ 2014-04-03  1:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Wei Yang, netdev, yevgenyp, Or Gerlitz, Amir Vadai,
	linux-pci@vger.kernel.org, David Miller

On Wed, Apr 02, 2014 at 12:28:46PM -0600, Bjorn Helgaas wrote:
>On Mon, Mar 31, 2014 at 10:36 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>> Fix issue introduced by commit: 97a5221 "net/mlx4_core: pass
>> pci_device_id.driver_data to __mlx4_init_one during reset".
>>
>> pci_match_id() just match the static pci_device_id, which may return NULL if
>> someone binds the driver to a device manually using
>> /sys/bus/pci/drivers/.../new_id.
>>
>> This patch match pci_device_id with pci_match_device() to cover both dynids
>> and static id_table.
>>
>> Thanks to Bjorn finding this issue.
>
>I didn't actually find it; I just passed along an issue found by
>Coverity.  I usually include "Found by Coverity (CID 1194947)" in the
>changelog in case anybody is trying to manage those issues.
>
>> CC: Bjorn Helgaas <bhelgaas@google.com>
>> CC: Amir Vadai <amirv@mellanox.com>
>> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>> Acked-by: Amir Vadai <amirv@mellanox.com>
>> ---
>>  drivers/net/ethernet/mellanox/mlx4/main.c |    4 +++-
>>  drivers/pci/pci-driver.c                  |    3 ++-
>>  include/linux/pci.h                       |    2 ++
>>  3 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
>> index aa54ef7..b0edb5c 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/main.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/main.c
>> @@ -2673,7 +2673,9 @@ static pci_ers_result_t mlx4_pci_slot_reset(struct pci_dev *pdev)
>>         const struct pci_device_id *id;
>>         int ret;
>>
>> -       id = pci_match_id(mlx4_pci_table, pdev);
>> +       id = pci_match_device(pci_dev_driver(pdev), pdev);
>> +       BUG_ON(!id);
>
>This doesn't seem like the best solution to me, because it basically
>duplicates part of __pci_device_probe() in the driver.  And of course,
>I'd rather not export pci_match_device() if we can avoid it (I don't
>have a specific objection; just the general "don't export more than
>necessary" rule, and something with a single user always makes me look
>twice).

Agree, we should be careful to export a function.

>
>I looked at all the .error_detected() methods in the tree, and I think
>mlx4_pci_err_detected() is the only one that actually throws away the
>pci_drvdata().  Most drivers just do pci_disable_device() and some
>other housekeeping.  Can you do something similar?

Change mlx4_remove_one() to have just pci_disable_device() is a big decisioin.
I believe Or and Amir will have better ideas.

>
>The mlx4 approach of completely tearing down and rebuilding the device
>*is* sort of appealing because I'm a little dubious of assuming that
>any driver setup done before the reset is still valid afterwards.  But
>maybe you could at least hang onto the pci_device_id.driver_data
>value?  As far as the PCI core is concerned, it supplied that to the
>.probe() function, and nothing has changed since then, so there's no
>reason for a driver to request it again.

Hmm... so you suggest every driver better do what mlx4_core does? Clear/Reset
the device? This is reasonable to me, while one case comes into my mind --
SRIOV. For example this PF triggers an error and be reported the error. If we
tear down the PF, we should remove all the VFs too. This means once the PF
gets into an error, all the PF and VFs should be cleared/reset, no matter
whether the VFs are healthy or not. So there is no chance to isolate PF and
VFs. I guess this is not what we want to achieve for SRIOV. Is my
understanding correct?

Come back to the issue in this patch, one quick fix in my mind is after tear
down the device and release the driver_data, we build another one and save the
pci_dev_data in it again. Will send this version for comments.

>
>Bjorn
>
>>         ret = __mlx4_init_one(pdev, id->driver_data);
>>
>>         return ret ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 25f0bc6..1ee26a1 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -225,7 +225,7 @@ const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,
>>   * system is in its list of supported devices.  Returns the matching
>>   * pci_device_id structure or %NULL if there is no match.
>>   */
>> -static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>> +const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>>                                                     struct pci_dev *dev)
>>  {
>>         struct pci_dynid *dynid;
>> @@ -1355,6 +1355,7 @@ postcore_initcall(pci_driver_init);
>>
>>  EXPORT_SYMBOL_GPL(pci_add_dynid);
>>  EXPORT_SYMBOL(pci_match_id);
>> +EXPORT_SYMBOL(pci_match_device);
>>  EXPORT_SYMBOL(__pci_register_driver);
>>  EXPORT_SYMBOL(pci_unregister_driver);
>>  EXPORT_SYMBOL(pci_dev_driver);
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 44f6883..8ede1b5 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -1134,6 +1134,8 @@ int pci_add_dynid(struct pci_driver *drv,
>>                   unsigned long driver_data);
>>  const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,
>>                                          struct pci_dev *dev);
>> +const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>> +                                                   struct pci_dev *dev);
>>  int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
>>                     int pass);
>>
>> --
>> 1.7.9.5
>>

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] net/mlx4_core: match pci_device_id including dynids
  2014-04-03  1:51   ` Wei Yang
@ 2014-04-03  3:11     ` Bjorn Helgaas
  2014-04-03  6:54       ` Wei Yang
  0 siblings, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2014-04-03  3:11 UTC (permalink / raw)
  To: Wei Yang
  Cc: netdev, yevgenyp, Or Gerlitz, Amir Vadai,
	linux-pci@vger.kernel.org, David Miller

On Wed, Apr 2, 2014 at 7:51 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
> On Wed, Apr 02, 2014 at 12:28:46PM -0600, Bjorn Helgaas wrote:

>>I looked at all the .error_detected() methods in the tree, and I think
>>mlx4_pci_err_detected() is the only one that actually throws away the
>>pci_drvdata().  Most drivers just do pci_disable_device() and some
>>other housekeeping.  Can you do something similar?
>
> Change mlx4_remove_one() to have just pci_disable_device() is a big decisioin.
> I believe Or and Amir will have better ideas.

Oh, I totally agree that you shouldn't make such a radical change just
for this issue.  What I meant was that maybe there's a relatively
simple way for you to hang on to the pci_drvdata() or at least the
pci_device_id.driver_data value.

BUT just on general principles, you should at least look at the other
drivers and use the same model unless you need something different.  I
doubt there's anything so special about mlx4 that it needs a totally
different approach.  But again, this is a broad comment, not a
suggestion for how to solve this particular issue.

>>The mlx4 approach of completely tearing down and rebuilding the device
>>*is* sort of appealing because I'm a little dubious of assuming that
>>any driver setup done before the reset is still valid afterwards.  But
>>maybe you could at least hang onto the pci_device_id.driver_data
>>value?  As far as the PCI core is concerned, it supplied that to the
>>.probe() function, and nothing has changed since then, so there's no
>>reason for a driver to request it again.
>
> Hmm... so you suggest every driver better do what mlx4_core does? Clear/Reset
> the device? This is reasonable to me, while one case comes into my mind --
> SRIOV. For example this PF triggers an error and be reported the error. If we
> tear down the PF, we should remove all the VFs too. This means once the PF
> gets into an error, all the PF and VFs should be cleared/reset, no matter
> whether the VFs are healthy or not. So there is no chance to isolate PF and
> VFs. I guess this is not what we want to achieve for SRIOV. Is my
> understanding correct?

No, I'm not suggesting that everybody do what mlx4 does.  I'm just
saying that I can see why mlx4 was designed that way.

>From the PCI core's perspective, after .probe() returns successfully,
we can call any driver entry point and pass the pci_dev to it, and
expect it to work.  Doing mlx4_remove_one() in mlx4_pci_err_detected()
sort of breaks that assumption because you clear out pci_drvdata().
Right now, the only other entry point mlx4 really implements is
mlx4_remove_one(), and it has a hack that tests whether pci_drvdata()
is NULL.  But that's ... a hack, and you'll have to do the same
if/when you implement suspend/resume/sriov_configure/etc.

So doing the whole tear-down in mlx4_pci_err_detected() doesn't seem
like a great design to me.  But mlx4_remove_one() probably could be
refactored to move the bulk of its code into a helper, and then you
could have both mlx4_remove_one() and mlx4_pci_err_detected() call
that helper.  Clearing pci_set_drvdata() could be done only in
mlx4_remove_one(), so it could be preserved in
mlx4_pci_err_detected().

Bjorn

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] net/mlx4_core: match pci_device_id including dynids
  2014-04-03  3:11     ` Bjorn Helgaas
@ 2014-04-03  6:54       ` Wei Yang
  2014-04-03 21:12         ` Bjorn Helgaas
  0 siblings, 1 reply; 8+ messages in thread
From: Wei Yang @ 2014-04-03  6:54 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Wei Yang, netdev, yevgenyp, Or Gerlitz, Amir Vadai,
	linux-pci@vger.kernel.org, David Miller

On Wed, Apr 02, 2014 at 09:11:18PM -0600, Bjorn Helgaas wrote:
>On Wed, Apr 2, 2014 at 7:51 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>> On Wed, Apr 02, 2014 at 12:28:46PM -0600, Bjorn Helgaas wrote:
>
>>>I looked at all the .error_detected() methods in the tree, and I think
>>>mlx4_pci_err_detected() is the only one that actually throws away the
>>>pci_drvdata().  Most drivers just do pci_disable_device() and some
>>>other housekeeping.  Can you do something similar?
>>
>> Change mlx4_remove_one() to have just pci_disable_device() is a big decisioin.
>> I believe Or and Amir will have better ideas.
>
>Oh, I totally agree that you shouldn't make such a radical change just
>for this issue.  What I meant was that maybe there's a relatively
>simple way for you to hang on to the pci_drvdata() or at least the
>pci_device_id.driver_data value.
>

Sorry, I misunderstand your point :-)

>BUT just on general principles, you should at least look at the other
>drivers and use the same model unless you need something different.  I
>doubt there's anything so special about mlx4 that it needs a totally
>different approach.  But again, this is a broad comment, not a
>suggestion for how to solve this particular issue.
>
>>>The mlx4 approach of completely tearing down and rebuilding the device
>>>*is* sort of appealing because I'm a little dubious of assuming that
>>>any driver setup done before the reset is still valid afterwards.  But
>>>maybe you could at least hang onto the pci_device_id.driver_data
>>>value?  As far as the PCI core is concerned, it supplied that to the
>>>.probe() function, and nothing has changed since then, so there's no
>>>reason for a driver to request it again.
>>
>> Hmm... so you suggest every driver better do what mlx4_core does? Clear/Reset
>> the device? This is reasonable to me, while one case comes into my mind --
>> SRIOV. For example this PF triggers an error and be reported the error. If we
>> tear down the PF, we should remove all the VFs too. This means once the PF
>> gets into an error, all the PF and VFs should be cleared/reset, no matter
>> whether the VFs are healthy or not. So there is no chance to isolate PF and
>> VFs. I guess this is not what we want to achieve for SRIOV. Is my
>> understanding correct?
>
>No, I'm not suggesting that everybody do what mlx4 does.  I'm just
>saying that I can see why mlx4 was designed that way.

Yep, I misunderstand it.

>
>>From the PCI core's perspective, after .probe() returns successfully,
>we can call any driver entry point and pass the pci_dev to it, and
>expect it to work.  Doing mlx4_remove_one() in mlx4_pci_err_detected()
>sort of breaks that assumption because you clear out pci_drvdata().
>Right now, the only other entry point mlx4 really implements is
>mlx4_remove_one(), and it has a hack that tests whether pci_drvdata()
>is NULL.  But that's ... a hack, and you'll have to do the same
>if/when you implement suspend/resume/sriov_configure/etc.
>
>So doing the whole tear-down in mlx4_pci_err_detected() doesn't seem
>like a great design to me.  But mlx4_remove_one() probably could be
>refactored to move the bulk of its code into a helper, and then you
>could have both mlx4_remove_one() and mlx4_pci_err_detected() call
>that helper.  Clearing pci_set_drvdata() could be done only in
>mlx4_remove_one(), so it could be preserved in
>mlx4_pci_err_detected().
>
>Bjorn

Here is another one based on your comment, which split mlx4_remove_one into
two and named a helper __mlx4_remove_one(). mlx4_pci_err_detected() will just
call __mlx4_remove_one(), which will not release drvdata.

BTW, this is not tested, just want to make sure my understanding is correct.

>From 84a5a9df0604cbea9b70c74b0709258841637946 Mon Sep 17 00:00:00 2001
From: Wei Yang <weiyang@linux.vnet.ibm.com>
Date: Mon, 31 Mar 2014 11:34:57 +0800
Subject: [PATCH] net/mlx4_core: match pci_device_id including dynids

Fix issue introduced by commit: 97a5221 "net/mlx4_core: pass
pci_device_id.driver_data to __mlx4_init_one during reset".

pci_match_id() just match the static pci_device_id, which may return NULL if
someone binds the driver to a device manually using
/sys/bus/pci/drivers/.../new_id.

This patch wrap up a helper function __mlx4_remove_one() which does the tear
down function but preserve the drv_data. Functions like
mlx4_pci_err_detected() and mlx4_restart_one() will call this one with out
releasing drvdata.

CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/main.c |  149 ++++++++++++++++-------------
 1 file changed, 80 insertions(+), 69 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index aa54ef7..fd1f288 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2268,7 +2268,12 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data)
 	/* Allow large DMA segments, up to the firmware limit of 1 GB */
 	dma_set_max_seg_size(&pdev->dev, 1024 * 1024 * 1024);
 
-	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	dev = pci_get_drvdata(pdev);
+	if (!dev)
+		priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	else
+		priv = mlx4_priv(dev);
+
 	if (!priv) {
 		err = -ENOMEM;
 		goto err_release_regions;
@@ -2525,77 +2530,81 @@ static int mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
 	return __mlx4_init_one(pdev, id->driver_data);
 }
 
-static void mlx4_remove_one(struct pci_dev *pdev)
+static void __mlx4_remove_one(struct mlx4_dev *dev)
 {
-	struct mlx4_dev  *dev  = pci_get_drvdata(pdev);
-	struct mlx4_priv *priv = mlx4_priv(dev);
-	int p;
+	/* in SRIOV it is not allowed to unload the pf's
+	 * driver while there are alive vf's */
+	if (mlx4_is_master(dev)) {
+		if (mlx4_how_many_lives_vf(dev))
+			printk(KERN_ERR "Removing PF when there are assigned VF's !!!\n");
+	}
+	mlx4_stop_sense(dev);
+	mlx4_unregister_device(dev);
 
-	if (dev) {
-		/* in SRIOV it is not allowed to unload the pf's
-		 * driver while there are alive vf's */
-		if (mlx4_is_master(dev)) {
-			if (mlx4_how_many_lives_vf(dev))
-				printk(KERN_ERR "Removing PF when there are assigned VF's !!!\n");
-		}
-		mlx4_stop_sense(dev);
-		mlx4_unregister_device(dev);
+	for (p = 1; p <= dev->caps.num_ports; p++) {
+		mlx4_cleanup_port_info(&priv->port[p]);
+		mlx4_CLOSE_PORT(dev, p);
+	}
 
-		for (p = 1; p <= dev->caps.num_ports; p++) {
-			mlx4_cleanup_port_info(&priv->port[p]);
-			mlx4_CLOSE_PORT(dev, p);
-		}
+	if (mlx4_is_master(dev))
+		mlx4_free_resource_tracker(dev,
+					   RES_TR_FREE_SLAVES_ONLY);
 
-		if (mlx4_is_master(dev))
-			mlx4_free_resource_tracker(dev,
-						   RES_TR_FREE_SLAVES_ONLY);
-
-		mlx4_cleanup_counters_table(dev);
-		mlx4_cleanup_qp_table(dev);
-		mlx4_cleanup_srq_table(dev);
-		mlx4_cleanup_cq_table(dev);
-		mlx4_cmd_use_polling(dev);
-		mlx4_cleanup_eq_table(dev);
-		mlx4_cleanup_mcg_table(dev);
-		mlx4_cleanup_mr_table(dev);
-		mlx4_cleanup_xrcd_table(dev);
-		mlx4_cleanup_pd_table(dev);
+	mlx4_cleanup_counters_table(dev);
+	mlx4_cleanup_qp_table(dev);
+	mlx4_cleanup_srq_table(dev);
+	mlx4_cleanup_cq_table(dev);
+	mlx4_cmd_use_polling(dev);
+	mlx4_cleanup_eq_table(dev);
+	mlx4_cleanup_mcg_table(dev);
+	mlx4_cleanup_mr_table(dev);
+	mlx4_cleanup_xrcd_table(dev);
+	mlx4_cleanup_pd_table(dev);
 
-		if (mlx4_is_master(dev))
-			mlx4_free_resource_tracker(dev,
-						   RES_TR_FREE_STRUCTS_ONLY);
-
-		iounmap(priv->kar);
-		mlx4_uar_free(dev, &priv->driver_uar);
-		mlx4_cleanup_uar_table(dev);
-		if (!mlx4_is_slave(dev))
-			mlx4_clear_steering(dev);
-		mlx4_free_eq_table(dev);
-		if (mlx4_is_master(dev))
-			mlx4_multi_func_cleanup(dev);
-		mlx4_close_hca(dev);
-		if (mlx4_is_slave(dev))
-			mlx4_multi_func_cleanup(dev);
-		mlx4_cmd_cleanup(dev);
-
-		if (dev->flags & MLX4_FLAG_MSI_X)
-			pci_disable_msix(pdev);
-		if (dev->flags & MLX4_FLAG_SRIOV) {
-			mlx4_warn(dev, "Disabling SR-IOV\n");
-			pci_disable_sriov(pdev);
-		}
+	if (mlx4_is_master(dev))
+		mlx4_free_resource_tracker(dev,
+					   RES_TR_FREE_STRUCTS_ONLY);
 
-		if (!mlx4_is_slave(dev))
-			mlx4_free_ownership(dev);
+	iounmap(priv->kar);
+	mlx4_uar_free(dev, &priv->driver_uar);
+	mlx4_cleanup_uar_table(dev);
+	if (!mlx4_is_slave(dev))
+		mlx4_clear_steering(dev);
+	mlx4_free_eq_table(dev);
+	if (mlx4_is_master(dev))
+		mlx4_multi_func_cleanup(dev);
+	mlx4_close_hca(dev);
+	if (mlx4_is_slave(dev))
+		mlx4_multi_func_cleanup(dev);
+	mlx4_cmd_cleanup(dev);
+
+	if (dev->flags & MLX4_FLAG_MSI_X)
+		pci_disable_msix(pdev);
+	if (dev->flags & MLX4_FLAG_SRIOV) {
+		mlx4_warn(dev, "Disabling SR-IOV\n");
+		pci_disable_sriov(pdev);
+	}
+
+	if (!mlx4_is_slave(dev))
+		mlx4_free_ownership(dev);
+
+	kfree(dev->caps.qp0_tunnel);
+	kfree(dev->caps.qp0_proxy);
+	kfree(dev->caps.qp1_tunnel);
+	kfree(dev->caps.qp1_proxy);
+	pci_release_regions(pdev);
+	pci_disable_device(pdev);
+}
 
-		kfree(dev->caps.qp0_tunnel);
-		kfree(dev->caps.qp0_proxy);
-		kfree(dev->caps.qp1_tunnel);
-		kfree(dev->caps.qp1_proxy);
+static void mlx4_remove_one(struct pci_dev *pdev)
+{
+	struct mlx4_dev  *dev  = pci_get_drvdata(pdev);
+	struct mlx4_priv *priv = mlx4_priv(dev);
+	int p;
 
+	if (dev) {
+		__mlx4_remove_one(dev);
 		kfree(priv);
-		pci_release_regions(pdev);
-		pci_disable_device(pdev);
 		pci_set_drvdata(pdev, NULL);
 	}
 }
@@ -2607,7 +2616,7 @@ int mlx4_restart_one(struct pci_dev *pdev)
 	int		  pci_dev_data;
 
 	pci_dev_data = priv->pci_dev_data;
-	mlx4_remove_one(pdev);
+	__mlx4_remove_one(pdev);
 	return __mlx4_init_one(pdev, pci_dev_data);
 }
 
@@ -2662,7 +2671,7 @@ MODULE_DEVICE_TABLE(pci, mlx4_pci_table);
 static pci_ers_result_t mlx4_pci_err_detected(struct pci_dev *pdev,
 					      pci_channel_state_t state)
 {
-	mlx4_remove_one(pdev);
+	__mlx4_remove_one(pdev);
 
 	return state == pci_channel_io_perm_failure ?
 		PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_NEED_RESET;
@@ -2670,11 +2679,13 @@ static pci_ers_result_t mlx4_pci_err_detected(struct pci_dev *pdev,
 
 static pci_ers_result_t mlx4_pci_slot_reset(struct pci_dev *pdev)
 {
-	const struct pci_device_id *id;
-	int ret;
+	struct mlx4_dev	 *dev  = pci_get_drvdata(pdev);
+	struct mlx4_priv *priv = mlx4_priv(dev);
+	int		  pci_dev_data;
+	int               ret;
 
-	id = pci_match_id(mlx4_pci_table, pdev);
-	ret = __mlx4_init_one(pdev, id->driver_data);
+	pci_dev_data = priv->pci_dev_data;
+	ret = __mlx4_init_one(pdev, pci_dev_data);
 
 	return ret ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
 }
-- 
1.7.9.5


-- 
Richard Yang
Help you, Help me

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] net/mlx4_core: match pci_device_id including dynids
  2014-04-03  6:54       ` Wei Yang
@ 2014-04-03 21:12         ` Bjorn Helgaas
  2014-04-04 14:20           ` Wei Yang
  0 siblings, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2014-04-03 21:12 UTC (permalink / raw)
  To: Wei Yang
  Cc: netdev, yevgenyp, Or Gerlitz, Amir Vadai,
	linux-pci@vger.kernel.org, David Miller

On Thu, Apr 03, 2014 at 02:54:32PM +0800, Wei Yang wrote:
> Here is another one based on your comment, which split mlx4_remove_one into
> two and named a helper __mlx4_remove_one(). mlx4_pci_err_detected() will just
> call __mlx4_remove_one(), which will not release drvdata.
> 
> BTW, this is not tested, just want to make sure my understanding is correct.

A couple minor comments below, but in general, yes, this is what I was
thinking.

> From 84a5a9df0604cbea9b70c74b0709258841637946 Mon Sep 17 00:00:00 2001
> From: Wei Yang <weiyang@linux.vnet.ibm.com>
> Date: Mon, 31 Mar 2014 11:34:57 +0800
> Subject: [PATCH] net/mlx4_core: match pci_device_id including dynids
> 
> Fix issue introduced by commit: 97a5221 "net/mlx4_core: pass
> pci_device_id.driver_data to __mlx4_init_one during reset".
> 
> pci_match_id() just match the static pci_device_id, which may return NULL if
> someone binds the driver to a device manually using
> /sys/bus/pci/drivers/.../new_id.
> 
> This patch wrap up a helper function __mlx4_remove_one() which does the tear
> down function but preserve the drv_data. Functions like
> mlx4_pci_err_detected() and mlx4_restart_one() will call this one with out
> releasing drvdata.
> 
> CC: Bjorn Helgaas <bhelgaas@google.com>
> CC: Amir Vadai <amirv@mellanox.com>
> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> Acked-by: Amir Vadai <amirv@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx4/main.c |  149 ++++++++++++++++-------------
>  1 file changed, 80 insertions(+), 69 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
> index aa54ef7..fd1f288 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/main.c
> @@ -2268,7 +2268,12 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data)
>  	/* Allow large DMA segments, up to the firmware limit of 1 GB */
>  	dma_set_max_seg_size(&pdev->dev, 1024 * 1024 * 1024);
>  
> -	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> +	dev = pci_get_drvdata(pdev);
> +	if (!dev)
> +		priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> +	else
> +		priv = mlx4_priv(dev);

Why don't you move the priv kzalloc into mlx4_init_one()?  Then it would be
symmetric -- you alloc and call pci_set_drvdata() in mlx4_init_one(), and
you call pci_set_drvdata(NULL) and free it in mlx4_remove_one().  And you
wouldn't need the test here.

> +
>  	if (!priv) {
>  		err = -ENOMEM;
>  		goto err_release_regions;
> @@ -2525,77 +2530,81 @@ static int mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
>  	return __mlx4_init_one(pdev, id->driver_data);
>  }
>  
> -static void mlx4_remove_one(struct pci_dev *pdev)
> +static void __mlx4_remove_one(struct mlx4_dev *dev)
>  {
> -	struct mlx4_dev  *dev  = pci_get_drvdata(pdev);
> -	struct mlx4_priv *priv = mlx4_priv(dev);
> -	int p;
> +	/* in SRIOV it is not allowed to unload the pf's
> +	 * driver while there are alive vf's */
> +	if (mlx4_is_master(dev)) {
> +		if (mlx4_how_many_lives_vf(dev))
> +			printk(KERN_ERR "Removing PF when there are assigned VF's !!!\n");
> +	}
> +	mlx4_stop_sense(dev);
> +	mlx4_unregister_device(dev);
>  
> -	if (dev) {
> -		/* in SRIOV it is not allowed to unload the pf's
> -		 * driver while there are alive vf's */
> -		if (mlx4_is_master(dev)) {
> -			if (mlx4_how_many_lives_vf(dev))
> -				printk(KERN_ERR "Removing PF when there are assigned VF's !!!\n");
> -		}
> -		mlx4_stop_sense(dev);
> -		mlx4_unregister_device(dev);
> +	for (p = 1; p <= dev->caps.num_ports; p++) {
> +		mlx4_cleanup_port_info(&priv->port[p]);
> +		mlx4_CLOSE_PORT(dev, p);
> +	}
>  
> -		for (p = 1; p <= dev->caps.num_ports; p++) {
> -			mlx4_cleanup_port_info(&priv->port[p]);
> -			mlx4_CLOSE_PORT(dev, p);
> -		}
> +	if (mlx4_is_master(dev))
> +		mlx4_free_resource_tracker(dev,
> +					   RES_TR_FREE_SLAVES_ONLY);
>  
> -		if (mlx4_is_master(dev))
> -			mlx4_free_resource_tracker(dev,
> -						   RES_TR_FREE_SLAVES_ONLY);
> -
> -		mlx4_cleanup_counters_table(dev);
> -		mlx4_cleanup_qp_table(dev);
> -		mlx4_cleanup_srq_table(dev);
> -		mlx4_cleanup_cq_table(dev);
> -		mlx4_cmd_use_polling(dev);
> -		mlx4_cleanup_eq_table(dev);
> -		mlx4_cleanup_mcg_table(dev);
> -		mlx4_cleanup_mr_table(dev);
> -		mlx4_cleanup_xrcd_table(dev);
> -		mlx4_cleanup_pd_table(dev);
> +	mlx4_cleanup_counters_table(dev);
> +	mlx4_cleanup_qp_table(dev);
> +	mlx4_cleanup_srq_table(dev);
> +	mlx4_cleanup_cq_table(dev);
> +	mlx4_cmd_use_polling(dev);
> +	mlx4_cleanup_eq_table(dev);
> +	mlx4_cleanup_mcg_table(dev);
> +	mlx4_cleanup_mr_table(dev);
> +	mlx4_cleanup_xrcd_table(dev);
> +	mlx4_cleanup_pd_table(dev);
>  
> -		if (mlx4_is_master(dev))
> -			mlx4_free_resource_tracker(dev,
> -						   RES_TR_FREE_STRUCTS_ONLY);
> -
> -		iounmap(priv->kar);
> -		mlx4_uar_free(dev, &priv->driver_uar);
> -		mlx4_cleanup_uar_table(dev);
> -		if (!mlx4_is_slave(dev))
> -			mlx4_clear_steering(dev);
> -		mlx4_free_eq_table(dev);
> -		if (mlx4_is_master(dev))
> -			mlx4_multi_func_cleanup(dev);
> -		mlx4_close_hca(dev);
> -		if (mlx4_is_slave(dev))
> -			mlx4_multi_func_cleanup(dev);
> -		mlx4_cmd_cleanup(dev);
> -
> -		if (dev->flags & MLX4_FLAG_MSI_X)
> -			pci_disable_msix(pdev);
> -		if (dev->flags & MLX4_FLAG_SRIOV) {
> -			mlx4_warn(dev, "Disabling SR-IOV\n");
> -			pci_disable_sriov(pdev);
> -		}
> +	if (mlx4_is_master(dev))
> +		mlx4_free_resource_tracker(dev,
> +					   RES_TR_FREE_STRUCTS_ONLY);
>  
> -		if (!mlx4_is_slave(dev))
> -			mlx4_free_ownership(dev);
> +	iounmap(priv->kar);
> +	mlx4_uar_free(dev, &priv->driver_uar);
> +	mlx4_cleanup_uar_table(dev);
> +	if (!mlx4_is_slave(dev))
> +		mlx4_clear_steering(dev);
> +	mlx4_free_eq_table(dev);
> +	if (mlx4_is_master(dev))
> +		mlx4_multi_func_cleanup(dev);
> +	mlx4_close_hca(dev);
> +	if (mlx4_is_slave(dev))
> +		mlx4_multi_func_cleanup(dev);
> +	mlx4_cmd_cleanup(dev);
> +
> +	if (dev->flags & MLX4_FLAG_MSI_X)
> +		pci_disable_msix(pdev);
> +	if (dev->flags & MLX4_FLAG_SRIOV) {
> +		mlx4_warn(dev, "Disabling SR-IOV\n");
> +		pci_disable_sriov(pdev);
> +	}
> +
> +	if (!mlx4_is_slave(dev))
> +		mlx4_free_ownership(dev);
> +
> +	kfree(dev->caps.qp0_tunnel);
> +	kfree(dev->caps.qp0_proxy);
> +	kfree(dev->caps.qp1_tunnel);
> +	kfree(dev->caps.qp1_proxy);
> +	pci_release_regions(pdev);
> +	pci_disable_device(pdev);
> +}
>  
> -		kfree(dev->caps.qp0_tunnel);
> -		kfree(dev->caps.qp0_proxy);
> -		kfree(dev->caps.qp1_tunnel);
> -		kfree(dev->caps.qp1_proxy);
> +static void mlx4_remove_one(struct pci_dev *pdev)
> +{
> +	struct mlx4_dev  *dev  = pci_get_drvdata(pdev);
> +	struct mlx4_priv *priv = mlx4_priv(dev);
> +	int p;
>  
> +	if (dev) {

I don't think you should test "dev" here.  What scenario is there where
mlx4_remove_one() would be called with a pci_dev that has no drvdata?

> +		__mlx4_remove_one(dev);
>  		kfree(priv);
> -		pci_release_regions(pdev);
> -		pci_disable_device(pdev);
>  		pci_set_drvdata(pdev, NULL);
>  	}
>  }
> @@ -2607,7 +2616,7 @@ int mlx4_restart_one(struct pci_dev *pdev)
>  	int		  pci_dev_data;
>  
>  	pci_dev_data = priv->pci_dev_data;
> -	mlx4_remove_one(pdev);
> +	__mlx4_remove_one(pdev);
>  	return __mlx4_init_one(pdev, pci_dev_data);
>  }
>  
> @@ -2662,7 +2671,7 @@ MODULE_DEVICE_TABLE(pci, mlx4_pci_table);
>  static pci_ers_result_t mlx4_pci_err_detected(struct pci_dev *pdev,
>  					      pci_channel_state_t state)
>  {
> -	mlx4_remove_one(pdev);
> +	__mlx4_remove_one(pdev);
>  
>  	return state == pci_channel_io_perm_failure ?
>  		PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_NEED_RESET;
> @@ -2670,11 +2679,13 @@ static pci_ers_result_t mlx4_pci_err_detected(struct pci_dev *pdev,
>  
>  static pci_ers_result_t mlx4_pci_slot_reset(struct pci_dev *pdev)
>  {
> -	const struct pci_device_id *id;
> -	int ret;
> +	struct mlx4_dev	 *dev  = pci_get_drvdata(pdev);
> +	struct mlx4_priv *priv = mlx4_priv(dev);
> +	int		  pci_dev_data;
> +	int               ret;
>  
> -	id = pci_match_id(mlx4_pci_table, pdev);
> -	ret = __mlx4_init_one(pdev, id->driver_data);
> +	pci_dev_data = priv->pci_dev_data;
> +	ret = __mlx4_init_one(pdev, pci_dev_data);
>  
>  	return ret ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
>  }
> -- 
> 1.7.9.5
> 
> 
> -- 
> Richard Yang
> Help you, Help me
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] net/mlx4_core: match pci_device_id including dynids
  2014-04-03 21:12         ` Bjorn Helgaas
@ 2014-04-04 14:20           ` Wei Yang
  0 siblings, 0 replies; 8+ messages in thread
From: Wei Yang @ 2014-04-04 14:20 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Wei Yang, netdev, yevgenyp, Or Gerlitz, Amir Vadai,
	linux-pci@vger.kernel.org, David Miller

On Thu, Apr 03, 2014 at 03:12:54PM -0600, Bjorn Helgaas wrote:
>On Thu, Apr 03, 2014 at 02:54:32PM +0800, Wei Yang wrote:
>> Here is another one based on your comment, which split mlx4_remove_one into
>> two and named a helper __mlx4_remove_one(). mlx4_pci_err_detected() will just
>> call __mlx4_remove_one(), which will not release drvdata.
>> 
>> BTW, this is not tested, just want to make sure my understanding is correct.
>
>A couple minor comments below, but in general, yes, this is what I was
>thinking.
>
>> From 84a5a9df0604cbea9b70c74b0709258841637946 Mon Sep 17 00:00:00 2001
>> From: Wei Yang <weiyang@linux.vnet.ibm.com>
>> Date: Mon, 31 Mar 2014 11:34:57 +0800
>> Subject: [PATCH] net/mlx4_core: match pci_device_id including dynids
>> 
>> Fix issue introduced by commit: 97a5221 "net/mlx4_core: pass
>> pci_device_id.driver_data to __mlx4_init_one during reset".
>> 
>> pci_match_id() just match the static pci_device_id, which may return NULL if
>> someone binds the driver to a device manually using
>> /sys/bus/pci/drivers/.../new_id.
>> 
>> This patch wrap up a helper function __mlx4_remove_one() which does the tear
>> down function but preserve the drv_data. Functions like
>> mlx4_pci_err_detected() and mlx4_restart_one() will call this one with out
>> releasing drvdata.
>> 
>> CC: Bjorn Helgaas <bhelgaas@google.com>
>> CC: Amir Vadai <amirv@mellanox.com>
>> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>> Acked-by: Amir Vadai <amirv@mellanox.com>
>> ---
>>  drivers/net/ethernet/mellanox/mlx4/main.c |  149 ++++++++++++++++-------------
>>  1 file changed, 80 insertions(+), 69 deletions(-)
>> 
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
>> index aa54ef7..fd1f288 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/main.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/main.c
>> @@ -2268,7 +2268,12 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data)
>>  	/* Allow large DMA segments, up to the firmware limit of 1 GB */
>>  	dma_set_max_seg_size(&pdev->dev, 1024 * 1024 * 1024);
>>  
>> -	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>> +	dev = pci_get_drvdata(pdev);
>> +	if (!dev)
>> +		priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>> +	else
>> +		priv = mlx4_priv(dev);
>
>Why don't you move the priv kzalloc into mlx4_init_one()?  Then it would be
>symmetric -- you alloc and call pci_set_drvdata() in mlx4_init_one(), and
>you call pci_set_drvdata(NULL) and free it in mlx4_remove_one().  And you
>wouldn't need the test here.
>

Agree, this looks more consistent.

Will write a formal version and send to mail list after verification.

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-04-04 14:20 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-01  4:36 [PATCH] net/mlx4_core: match pci_device_id including dynids Wei Yang
2014-04-02  2:29 ` David Miller
2014-04-02 18:28 ` Bjorn Helgaas
2014-04-03  1:51   ` Wei Yang
2014-04-03  3:11     ` Bjorn Helgaas
2014-04-03  6:54       ` Wei Yang
2014-04-03 21:12         ` Bjorn Helgaas
2014-04-04 14:20           ` Wei Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).