From: Ivan Vecera <ivecera@redhat.com>
To: "Laba, SlawomirX" <slawomirx.laba@intel.com>
Cc: "intel-wired-lan@lists.osuosl.org" <intel-wired-lan@lists.osuosl.org>
Subject: Re: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove
Date: Tue, 11 Oct 2022 15:22:34 +0200
Message-ID: <20221011152234.33904e70@p1.luc.cera.cz>
In-Reply-To: <DM6PR11MB31130998A16B8360E72D354087629@DM6PR11MB3113.namprd11.prod.outlook.com>
On Tue, 9 Aug 2022 21:11:26 +0000
"Laba, SlawomirX" <slawomirx.laba@intel.com> wrote:
> > -----Original Message-----
> > From: Ivan Vecera <ivecera@redhat.com>
> > Sent: Thursday, August 4, 2022 11:38 AM
> > To: mschmidt <mschmidt@redhat.com>
> > Cc: Palczewski, Mateusz <mateusz.palczewski@intel.com>; intel-wired-
> > lan@lists.osuosl.org; Laba, SlawomirX <slawomirx.laba@intel.com>
> > Subject: Re: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition
> > between iavf_shutdown and iavf_remove
> >
> > On Thu, 4 Aug 2022 11:14:58 +0200
> > Michal Schmidt <mschmidt@redhat.com> wrote:
> >
> > > On Tue, Aug 2, 2022 at 1:52 PM Mateusz Palczewski <
> > > mateusz.palczewski@intel.com> wrote:
> > >
> > > > From: Slawomir Laba <slawomirx.laba@intel.com>
> > > >
> > > > > Fix a deadlock introduced by commit
> > > > > 974578017fc1 ("iavf: Add waiting so the port is initialized in
> > > > > remove") due to a race condition between iavf_shutdown and
> > > > > iavf_remove, where iavf_remove gets stuck forever in a while loop
> > > > > because iavf_shutdown has already set the __IAVF_REMOVE adapter state.
> > > > >
> > > > > Fix this by checking whether __IAVF_IN_REMOVE_TASK has already been
> > > > > set and returning if so.
> > > >
> > > > > Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
> > > > Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
> > > > Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
> > > > ---
> > > > v2: Fixed author
> > > > ---
> > > > > drivers/net/ethernet/intel/iavf/iavf_main.c | 19 ++++++++++---------
> > > > > 1 file changed, 10 insertions(+), 9 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > > index 6357dea93b99..d43e8f12d3ad 100644
> > > > --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > > +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > > @@ -4846,23 +4846,24 @@ static int __maybe_unused
> > iavf_resume(struct
> > > > device *dev_d) static void iavf_remove(struct pci_dev *pdev) {
> > > > struct iavf_adapter *adapter = iavf_pdev_to_adapter(pdev);
> > > > - struct net_device *netdev = adapter->netdev;
> > > > struct iavf_fdir_fltr *fdir, *fdirtmp;
> > > > struct iavf_vlan_filter *vlf, *vlftmp;
> > > > + struct iavf_cloud_filter *cf, *cftmp;
> > > > struct iavf_adv_rss *rss, *rsstmp;
> > > > struct iavf_mac_filter *f, *ftmp;
> > > > - struct iavf_cloud_filter *cf, *cftmp;
> > > > - struct iavf_hw *hw = &adapter->hw;
> > > > + struct net_device *netdev;
> > > > + struct iavf_hw *hw;
> > > > int err;
> > > >
> > > > - /* When reboot/shutdown is in progress no need to do anything
> > > > - * as the adapter is already REMOVE state that was set during
> > > > - * iavf_shutdown() callback.
> > > > - */
> > > > - if (adapter->state == __IAVF_REMOVE)
> > > > + if (!adapter)
> > > > + return;
> > > >
> > >
> > > The check for !adapter cannot work. iavf_pdev_to_adapter(pdev) will
> > > never return NULL. It does:
> > > return netdev_priv(pci_get_drvdata(pdev));
> > > Even if pci_get_drvdata(pdev) somehow returns NULL (which I don't
> > > think it will, because the driver never resets the drvdata before
> > > freeing netdev),
> > > netdev_priv() would make a non-NULL value out of it (it adds the
> > > aligned size of struct net_device to the pointer).
> > >
> > > CCing Ivan, who will have more comments about the adapter states.
> > > Michal
> >
> > Yes, to make your patch working correctly you need to adjust
> > iavf_pdev_to_adapter() appropriately and also set pci drvdata to NULL prior
> > free_netdev():
>
> Thanks, Ivan, for the nice fix. The only case in which this check would
> take effect is when iavf_probe fails for lack of memory. We also came to
> the conclusion that the check is not really necessary, so our update to
> this patch would be to simply remove the NULL check on the adapter.
> What do you think about this?
OK, makes sense.
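
For the archive, Michal's point about the original !adapter check can be
reproduced in userspace. This is only a minimal sketch under stated
assumptions: every model_* name is hypothetical, the structure size and the
alignment are arbitrary rather than the kernel's, and addresses are carried
as uintptr_t so that no arithmetic is done on a real NULL pointer. It shows
that "base plus aligned struct size" turns a zero base into a non-zero
value, and that checking the base first (as in the hunk quoted below) is
what makes a NULL result possible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: size and alignment are arbitrary, not the kernel's. */
#define MODEL_ALIGN ((uintptr_t)32)
#define MODEL_ALIGN_UP(x) (((uintptr_t)(x) + MODEL_ALIGN - 1) & ~(MODEL_ALIGN - 1))

struct model_net_device {
	char pad[100];
};

/* Models the idea of netdev_priv(): private area = base + aligned struct size. */
static uintptr_t model_netdev_priv(uintptr_t dev)
{
	return dev + MODEL_ALIGN_UP(sizeof(struct model_net_device));
}

/* Unchecked lookup: a zero drvdata still yields a non-zero "adapter". */
static uintptr_t model_pdev_to_adapter_unchecked(uintptr_t drvdata)
{
	return model_netdev_priv(drvdata);
}

/* Checked lookup: test the base before adding the offset. */
static uintptr_t model_pdev_to_adapter_checked(uintptr_t drvdata)
{
	return drvdata ? model_netdev_priv(drvdata) : 0;
}

/* Exercises both variants with a zero base and an arbitrary non-zero base. */
static void model_null_check_selfcheck(void)
{
	assert(model_pdev_to_adapter_unchecked(0) != 0);
	assert(model_pdev_to_adapter_checked(0) == 0);
	assert(model_pdev_to_adapter_checked(4096) ==
	       model_pdev_to_adapter_unchecked(4096));
}
```

Calling model_null_check_selfcheck() from a small main() exercises both
variants; with the unchecked variant a caller's "if (!adapter)" test can
never fire.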
> > --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > @@ -164,7 +164,9 @@ int virtchnl_status_to_errno(enum virtchnl_status_code v_status)
> > */
> > static struct iavf_adapter *iavf_pdev_to_adapter(struct pci_dev *pdev)
> > {
> > - return netdev_priv(pci_get_drvdata(pdev));
> > + struct net_device *netdev = pci_get_drvdata(pdev);
> > +
> > + return netdev ? netdev_priv(netdev) : NULL;
> > }
> >
> > /**
> > @@ -4899,6 +4901,7 @@ static void iavf_remove(struct pci_dev *pdev)
> > }
> > spin_unlock_bh(&adapter->adv_rss_lock);
> >
> > + pci_set_drvdata(pdev, NULL);
> > free_netdev(netdev);
> >
> > pci_disable_pcie_error_reporting(pdev);
> >
> > Regarding adapter states: __IAVF_REMOVE can be removed as redundant,
> > using only the __IAVF_IN_REMOVE_TASK bit instead.
> >
> > Ivan
>
> I divided the iavf_remove function into two logical pieces. The first
> piece helps the driver survive races between the watchdog init states and
> the iavf_remove call, so that when init fails the driver doesn't reinit
> if remove is triggered. Additionally, __IAVF_IN_REMOVE_TASK was
> introduced in order to fix a race condition between register_netdevice in
> init and unregister_netdevice in remove. The second piece is the cleanup
> of resources after the netdev gets unregistered. I had no better idea of
> how to deal with the unregister_netdevice call without holding crit_lock:
> unregister_netdevice will call iavf_close, which requires this lock in
> order to free the traffic IRQs and clean up the rx and tx resources.
>
> The __IAVF_REMOVE state is used for a different purpose, for example when
> the driver is ready to stop the workqueues (after the netdev gets
> unregistered) and iavf_remove already holds crit_lock for the final
> cleanup.
OK, got it.
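
To spell out the "check whether __IAVF_IN_REMOVE_TASK is already set and
return if so" idea from the commit message, here is a minimal userspace
sketch. It is an assumption-laden model, not the driver code: all model_*
names are hypothetical, C11 atomic_fetch_or stands in for a kernel atomic
test-and-set bit operation, and the teardown itself is reduced to a counter.
The point is only that whichever caller sets the bit first does the work,
and a later caller returns instead of waiting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit standing in for __IAVF_IN_REMOVE_TASK. */
#define MODEL_IN_REMOVE_TASK (1u << 0)

struct model_adapter {
	atomic_uint flags;	/* models the adapter's flag word */
	int teardown_runs;	/* counts how often teardown actually ran */
};

/* Atomically sets the bit; returns true only for the first caller. */
static bool model_enter_remove(struct model_adapter *a)
{
	unsigned int old = atomic_fetch_or(&a->flags, MODEL_IN_REMOVE_TASK);

	return !(old & MODEL_IN_REMOVE_TASK);
}

/* Models a remove/shutdown entry point guarded by the flag. */
static void model_remove(struct model_adapter *a)
{
	if (!model_enter_remove(a))
		return;		/* someone else already started teardown */

	a->teardown_runs++;	/* placeholder for the real cleanup */
}

/* Two calls in a row: only the first one performs the teardown. */
static void model_remove_selfcheck(void)
{
	struct model_adapter a = { .teardown_runs = 0 };

	atomic_init(&a.flags, 0);
	model_remove(&a);
	model_remove(&a);
	assert(a.teardown_runs == 1);
}
```

The early return is what avoids the endless wait described in the commit
message; the second caller never enters a loop that depends on state the
first caller has already changed.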
Thanks,
Ivan