* [Intel-wired-lan] [Patch 0/2] iavf: Fix panics due to active work queues being freed in iavf_remove()
@ 2021-12-08 10:21 Ken Cox
2021-12-08 10:21 ` [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove Ken Cox
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Ken Cox @ 2021-12-08 10:21 UTC (permalink / raw)
To: intel-wired-lan
This series fixes panics that occur after iavf_remove() is called.
The panics occur because the iavf_adapter structure is freed at the end
of iavf_remove(), but new work may already have been scheduled using the
work_structs contained within the iavf_adapter structure. If this occurs,
the system will panic when it later tries to process the work queue.
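The lifetime problem described above can be sketched in userspace C. This is only a toy model, not the kernel workqueue API: every name here (toy_adapter, toy_queue_work, toy_cancel_work and so on) is hypothetical. It shows a work item embedded in its owning structure, and a teardown helper that unlinks the item from the queue before the owner is freed, so the queue never holds a pointer into freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; NOT the kernel workqueue API. */
struct work_item {
	struct work_item *next;	/* link in the queue's pending list */
	bool pending;
};

struct work_queue {
	struct work_item *head;
};

/* The work item is embedded in its owner, in the same way the
 * work_structs live inside struct iavf_adapter. */
struct toy_adapter {
	int id;
	struct work_item task;
};

/* Queue a work item; returns false if it was already pending. */
static bool toy_queue_work(struct work_queue *wq, struct work_item *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->next = wq->head;
	wq->head = w;
	return true;
}

/* Unlink a work item; returns true if it was pending. */
static bool toy_cancel_work(struct work_queue *wq, struct work_item *w)
{
	struct work_item **pp;

	for (pp = &wq->head; *pp; pp = &(*pp)->next) {
		if (*pp == w) {
			*pp = w->next;
			w->next = NULL;
			w->pending = false;
			return true;
		}
	}
	return false;
}

/* Number of items still linked into the queue. */
static size_t toy_pending_count(const struct work_queue *wq)
{
	const struct work_item *w;
	size_t n = 0;

	for (w = wq->head; w; w = w->next)
		n++;
	return n;
}

/* Safe teardown: unlink the embedded work first, then free the owner.
 * Freeing first would leave wq->head pointing into freed memory. */
static void toy_adapter_destroy(struct work_queue *wq, struct toy_adapter *a)
{
	toy_cancel_work(wq, &a->task);
	free(a);
}
```

A caller would allocate a toy_adapter, queue its embedded task, and rely on toy_adapter_destroy() to leave the queue empty before the memory goes away.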
Ken Cox (2):
iavf: Fix panic in iavf_remove
iavf: Prevent reset from being scheduled while adapter is being
removed
drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 7 +++++--
drivers/net/ethernet/intel/iavf/iavf_main.c | 17 +++++++++++------
drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 4 +++-
3 files changed, 19 insertions(+), 9 deletions(-)
--
2.31.1
* [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove
2021-12-08 10:21 [Intel-wired-lan] [Patch 0/2] iavf: Fix panics due to active work queues being freed in iavf_remove() Ken Cox
@ 2021-12-08 10:21 ` Ken Cox
2021-12-11 11:25 ` Jankowski, Konrad0
2021-12-13 18:26 ` Nguyen, Anthony L
2021-12-08 10:21 ` [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being scheduled while adapter is being removed Ken Cox
2021-12-14 13:21 ` [Intel-wired-lan] [Patch 0/2] iavf: Fix panics due to active work queues being freed in iavf_remove() Ken Cox
2 siblings, 2 replies; 12+ messages in thread
From: Ken Cox @ 2021-12-08 10:21 UTC (permalink / raw)
To: intel-wired-lan
It's possible for the client_task to get scheduled by the watchdog
after cancel_delayed_work_sync(&adapter->client_task) has returned. This
can cause a panic because free_netdev() is called with the client_task
still queued on the work queue.
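The ordering issue can be illustrated with a small, single-threaded toy model. It is a sketch under stated assumptions, not driver code: each task is reduced to a "pending" flag, the names (toy_tasks, toy_watchdog_run and so on) are hypothetical, and the watchdog is assumed to schedule the client task whenever it runs. Cancelling the client first leaves a window in which the watchdog can re-arm it; cancelling the watchdog first does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model: each task is just a "pending" flag. */
struct toy_tasks {
	bool watchdog_pending;
	bool client_pending;
};

/* Assumption for this sketch: when the watchdog runs, it schedules
 * the client task. */
static void toy_watchdog_run(struct toy_tasks *t)
{
	if (!t->watchdog_pending)
		return;
	t->watchdog_pending = false;
	t->client_pending = true;
}

static void toy_cancel_watchdog(struct toy_tasks *t)
{
	t->watchdog_pending = false;
}

static void toy_cancel_client(struct toy_tasks *t)
{
	t->client_pending = false;
}

/* Racy order: the client is cancelled while the watchdog can still
 * fire. Returns whether the client is still queued at "free" time. */
static bool toy_teardown_client_first(struct toy_tasks *t)
{
	toy_cancel_client(t);
	toy_watchdog_run(t);		/* watchdog fires in the window */
	toy_cancel_watchdog(t);
	return t->client_pending;
}

/* Safe order: stop the producer first, then cancel what it produces. */
static bool toy_teardown_watchdog_first(struct toy_tasks *t)
{
	toy_cancel_watchdog(t);
	toy_watchdog_run(t);		/* no-op: watchdog already cancelled */
	toy_cancel_client(t);
	return t->client_pending;
}
```

Starting from a state with both tasks pending, toy_teardown_client_first() reports the client as still queued, while toy_teardown_watchdog_first() does not.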
The stack backtrace usually looks similar to:
[ 121.272963] Workqueue: 0x0 (iavf)
[ 121.272969] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
...
[ 121.272980] Call Trace:
[ 121.272985] move_linked_works+0x49/0xa0
[ 121.272988] pwq_activate_delayed_work+0x43/0x100
[ 121.272991] pwq_dec_nr_in_flight+0x5d/0x90
[ 121.272993] worker_thread+0x30/0x370
[ 121.272995] ? process_one_work+0x420/0x420
[ 121.272998] kthread+0x15d/0x180
[ 121.273000] ? __kthread_parkme+0xa0/0xa0
[ 121.273003] ret_from_fork+0x1f/0x40
Signed-off-by: Ken Cox <jkc@redhat.com>
---
drivers/net/ethernet/intel/iavf/iavf_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 6c2afbc8acbcd..63eec7edbf60a 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -3940,7 +3940,6 @@ static void iavf_remove(struct pci_dev *pdev)
set_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section);
cancel_delayed_work_sync(&adapter->init_task);
cancel_work_sync(&adapter->reset_task);
- cancel_delayed_work_sync(&adapter->client_task);
if (adapter->netdev_registered) {
unregister_netdev(netdev);
adapter->netdev_registered = false;
@@ -3974,6 +3973,7 @@ static void iavf_remove(struct pci_dev *pdev)
iavf_free_q_vectors(adapter);
cancel_delayed_work_sync(&adapter->watchdog_task);
+ cancel_delayed_work_sync(&adapter->client_task);
cancel_work_sync(&adapter->adminq_task);
--
2.31.1
* [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being scheduled while adapter is being removed
2021-12-08 10:21 [Intel-wired-lan] [Patch 0/2] iavf: Fix panics due to active work queues being freed in iavf_remove() Ken Cox
2021-12-08 10:21 ` [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove Ken Cox
@ 2021-12-08 10:21 ` Ken Cox
2021-12-11 11:25 ` Jankowski, Konrad0
2021-12-13 18:27 ` Nguyen, Anthony L
2021-12-14 13:21 ` [Intel-wired-lan] [Patch 0/2] iavf: Fix panics due to active work queues being freed in iavf_remove() Ken Cox
2 siblings, 2 replies; 12+ messages in thread
From: Ken Cox @ 2021-12-08 10:21 UTC (permalink / raw)
To: intel-wired-lan
If a reset gets scheduled while the adapter is being removed it can
cause a panic.
The work_struct for the reset_task is contained in the iavf_adapter
structure. iavf_remove() eventually frees the iavf_adapter structure
so if there is active work scheduled it can cause a panic.
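The guard this patch adds at each call site can be sketched in userspace C11. This is a hedged illustration only: toy_dev and its helpers are hypothetical names, in_remove stands in for the __IAVF_IN_REMOVE_TASK bit, reset_needed for IAVF_FLAG_RESET_NEEDED, and a counter stands in for calls that would reach queue_work(). Note that a check followed by a queue is not atomic as a whole, so a guard of this shape narrows the window rather than closing it on its own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the adapter state. */
struct toy_dev {
	atomic_bool in_remove;	/* stand-in for __IAVF_IN_REMOVE_TASK */
	bool reset_needed;	/* stand-in for IAVF_FLAG_RESET_NEEDED */
	int resets_queued;	/* counts would-be queue_work() calls */
};

static void toy_dev_init(struct toy_dev *d)
{
	atomic_init(&d->in_remove, false);
	d->reset_needed = false;
	d->resets_queued = 0;
}

/* Mark the device as being removed; later reset requests are refused. */
static void toy_begin_remove(struct toy_dev *d)
{
	atomic_store(&d->in_remove, true);
}

/* Queue a reset only if none is outstanding and removal has not begun.
 * Returns true if a reset was queued. */
static bool toy_schedule_reset(struct toy_dev *d)
{
	if (d->reset_needed || atomic_load(&d->in_remove))
		return false;
	d->reset_needed = true;
	d->resets_queued++;
	return true;
}
```

Once toy_begin_remove() has been called, toy_schedule_reset() refuses every request, which mirrors the intent of the added test_bit() checks.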
Signed-off-by: Ken Cox <jkc@redhat.com>
---
drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 7 +++++--
drivers/net/ethernet/intel/iavf/iavf_main.c | 15 ++++++++++-----
drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 4 +++-
3 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
index af43fbd8cb75e..3cf1679153604 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
@@ -519,7 +519,9 @@ static int iavf_set_priv_flags(struct net_device *netdev, u32 flags)
/* issue a reset to force legacy-rx change to take effect */
if (changed_flags & IAVF_FLAG_LEGACY_RX) {
- if (netif_running(netdev)) {
+
+ if (netif_running(netdev) &&
+ !test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section)) {
adapter->flags |= IAVF_FLAG_RESET_NEEDED;
queue_work(iavf_wq, &adapter->reset_task);
}
@@ -630,7 +632,8 @@ static int iavf_set_ringparam(struct net_device *netdev,
adapter->tx_desc_count = new_tx_count;
adapter->rx_desc_count = new_rx_count;
- if (netif_running(netdev)) {
+ if (netif_running(netdev) &&
+ !test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section)) {
adapter->flags |= IAVF_FLAG_RESET_NEEDED;
queue_work(iavf_wq, &adapter->reset_task);
}
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 63eec7edbf60a..af2788c997ca2 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -164,7 +164,8 @@ static int iavf_lock_timeout(struct iavf_adapter *adapter,
void iavf_schedule_reset(struct iavf_adapter *adapter)
{
if (!(adapter->flags &
- (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED))) {
+ (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED)) &&
+ !test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section)) {
adapter->flags |= IAVF_FLAG_RESET_NEEDED;
queue_work(iavf_wq, &adapter->reset_task);
}
@@ -2013,7 +2014,8 @@ static void iavf_watchdog_task(struct work_struct *work)
adapter->aq_required = 0;
adapter->current_op = VIRTCHNL_OP_UNKNOWN;
dev_err(&adapter->pdev->dev, "Hardware reset detected\n");
- queue_work(iavf_wq, &adapter->reset_task);
+ if (!test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
+ queue_work(iavf_wq, &adapter->reset_task);
goto watchdog_done;
}
@@ -3348,8 +3350,10 @@ static int iavf_change_mtu(struct net_device *netdev, int new_mtu)
iavf_notify_client_l2_params(&adapter->vsi);
adapter->flags |= IAVF_FLAG_SERVICE_CLIENT_REQUESTED;
}
- adapter->flags |= IAVF_FLAG_RESET_NEEDED;
- queue_work(iavf_wq, &adapter->reset_task);
+ if (!test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section)) {
+ adapter->flags |= IAVF_FLAG_RESET_NEEDED;
+ queue_work(iavf_wq, &adapter->reset_task);
+ }
return 0;
}
@@ -3909,7 +3913,8 @@ static int __maybe_unused iavf_resume(struct device *dev_d)
return err;
}
- queue_work(iavf_wq, &adapter->reset_task);
+ if (!test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
+ queue_work(iavf_wq, &adapter->reset_task);
netif_device_attach(netdev);
diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
index 0eab3c43bdc59..ba973b2ab0547 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
@@ -1470,7 +1470,9 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
break;
case VIRTCHNL_EVENT_RESET_IMPENDING:
dev_info(&adapter->pdev->dev, "Reset warning received from the PF\n");
- if (!(adapter->flags & IAVF_FLAG_RESET_PENDING)) {
+ if (!(adapter->flags & IAVF_FLAG_RESET_PENDING) &&
+ !test_bit(__IAVF_IN_REMOVE_TASK,
+ &adapter->crit_section)) {
adapter->flags |= IAVF_FLAG_RESET_PENDING;
dev_info(&adapter->pdev->dev, "Scheduling reset task\n");
queue_work(iavf_wq, &adapter->reset_task);
--
2.31.1
* [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove
2021-12-08 10:21 ` [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove Ken Cox
@ 2021-12-11 11:25 ` Jankowski, Konrad0
2021-12-13 17:29 ` Jankowski, Konrad0
2021-12-13 18:26 ` Nguyen, Anthony L
1 sibling, 1 reply; 12+ messages in thread
From: Jankowski, Konrad0 @ 2021-12-11 11:25 UTC (permalink / raw)
To: intel-wired-lan
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Ken Cox
> Sent: Wednesday, 8 December 2021 11:22
> To: intel-wired-lan at osuosl.org
> Cc: Ken Cox <jkc@redhat.com>
> Subject: [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove
>
> It's possible for the client_task to get scheduled by the watchdog after
> cancel_delayed_work_sync(&adapter->client_task); This can cause a panic
> because free_netdev() is called with the client_task still queued on the work
> queue.
>
> The stack backtrace usually looks similar to:
>
> [ 121.272963] Workqueue: 0x0 (iavf)
> [ 121.272969] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
> ...
> [ 121.272980] Call Trace:
> [ 121.272985] move_linked_works+0x49/0xa0
> [ 121.272988] pwq_activate_delayed_work+0x43/0x100
> [ 121.272991] pwq_dec_nr_in_flight+0x5d/0x90
> [ 121.272993] worker_thread+0x30/0x370
> [ 121.272995] ? process_one_work+0x420/0x420
> [ 121.272998] kthread+0x15d/0x180
> [ 121.273000] ? __kthread_parkme+0xa0/0xa0
> [ 121.273003] ret_from_fork+0x1f/0x40
>
> Signed-off-by: Ken Cox <jkc@redhat.com>
> ---
> drivers/net/ethernet/intel/iavf/iavf_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 6c2afbc8acbcd..63eec7edbf60a 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -3940,7 +3940,6 @@ static void iavf_remove(struct pci_dev *pdev)
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
* [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being scheduled while adapter is being removed
2021-12-08 10:21 ` [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being scheduled while adapter is being removed Ken Cox
@ 2021-12-11 11:25 ` Jankowski, Konrad0
2021-12-13 17:48 ` Jankowski, Konrad0
2021-12-13 18:27 ` Nguyen, Anthony L
1 sibling, 1 reply; 12+ messages in thread
From: Jankowski, Konrad0 @ 2021-12-11 11:25 UTC (permalink / raw)
To: intel-wired-lan
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Ken Cox
> Sent: Wednesday, 8 December 2021 11:22
> To: intel-wired-lan at osuosl.org
> Cc: Ken Cox <jkc@redhat.com>
> Subject: [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being
> scheduled while adapter is being removed
>
> If a reset gets scheduled while the adapter is being removed it can cause a
> panic.
>
> The work_struct for the reset_task is contained in the iavf_adapter
> structure. iavf_remove() eventually frees the iavf_adapter structure so if
> there is active work scheduled it can cause a panic.
>
> Signed-off-by: Ken Cox <jkc@redhat.com>
> ---
> drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 7 +++++--
> drivers/net/ethernet/intel/iavf/iavf_main.c | 15 ++++++++++-----
> drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 4 +++-
> 3 files changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
> b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
> index af43fbd8cb75e..3cf1679153604 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
> @@ -519,7 +519,9 @@ static int iavf_set_priv_flags(struct net_device
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
* [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove
2021-12-11 11:25 ` Jankowski, Konrad0
@ 2021-12-13 17:29 ` Jankowski, Konrad0
0 siblings, 0 replies; 12+ messages in thread
From: Jankowski, Konrad0 @ 2021-12-13 17:29 UTC (permalink / raw)
To: intel-wired-lan
I'm sorry, this was not sent to me; it is still untested.
* [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being scheduled while adapter is being removed
2021-12-11 11:25 ` Jankowski, Konrad0
@ 2021-12-13 17:48 ` Jankowski, Konrad0
0 siblings, 0 replies; 12+ messages in thread
From: Jankowski, Konrad0 @ 2021-12-13 17:48 UTC (permalink / raw)
To: intel-wired-lan
> -----Original Message-----
> From: Jankowski, Konrad0
> Sent: Saturday, 11 December 2021 12:26
> To: Ken Cox <jkc@redhat.com>; intel-wired-lan at osuosl.org
> Subject: RE: [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being
> scheduled while adapter is being removed
>
>
>
> > -----Original Message-----
> > From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> > Of Ken Cox
> > Sent: Wednesday, 8 December 2021 11:22
> > To: intel-wired-lan at osuosl.org
> > Cc: Ken Cox <jkc@redhat.com>
> > Subject: [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being
> > scheduled while adapter is being removed
> >
> > If a reset gets scheduled while the adapter is being removed it can
> > cause a panic.
> >
> > The work_struct for the reset_task is contained in the iavf_adapter
> > structure. iavf_remove() eventually frees the iavf_adapter structure
> > so if there is active work scheduled it can cause a panic.
> >
> > Signed-off-by: Ken Cox <jkc@redhat.com>
> > ---
> > drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 7 +++++--
> > drivers/net/ethernet/intel/iavf/iavf_main.c | 15 ++++++++++-----
> > drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 4 +++-
> > 3 files changed, 18 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
> > b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
> > index af43fbd8cb75e..3cf1679153604 100644
> > --- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
> > +++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
> > @@ -519,7 +519,9 @@ static int iavf_set_priv_flags(struct net_device
>
I'm sorry, this was not sent to me; it is still untested.
* [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove
2021-12-08 10:21 ` [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove Ken Cox
2021-12-11 11:25 ` Jankowski, Konrad0
@ 2021-12-13 18:26 ` Nguyen, Anthony L
2021-12-14 13:18 ` Ken Cox
1 sibling, 1 reply; 12+ messages in thread
From: Nguyen, Anthony L @ 2021-12-13 18:26 UTC (permalink / raw)
To: intel-wired-lan
On Wed, 2021-12-08 at 04:21 -0600, Ken Cox wrote:
> It's possible for the client_task to get scheduled by the watchdog
> after cancel_delayed_work_sync(&adapter->client_task);  This can cause
> a panic because free_netdev() is called with the client_task still queued
> on the work queue.
>
> The stack backtrace usually looks similar to:
>
> [  121.272963] Workqueue:  0x0 (iavf)
> [  121.272969] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
> ...
> [  121.272980] Call Trace:
> [  121.272985]  move_linked_works+0x49/0xa0
> [  121.272988]  pwq_activate_delayed_work+0x43/0x100
> [  121.272991]  pwq_dec_nr_in_flight+0x5d/0x90
> [  121.272993]  worker_thread+0x30/0x370
> [  121.272995]  ? process_one_work+0x420/0x420
> [  121.272998]  kthread+0x15d/0x180
> [  121.273000]  ? __kthread_parkme+0xa0/0xa0
> [  121.273003]  ret_from_fork+0x1f/0x40
>
> Signed-off-by: Ken Cox <jkc@redhat.com>
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 6c2afbc8acbcd..63eec7edbf60a 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -3940,7 +3940,6 @@ static void iavf_remove(struct pci_dev *pdev)
>         set_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section);
>         cancel_delayed_work_sync(&adapter->init_task);
>         cancel_work_sync(&adapter->reset_task);
> -       cancel_delayed_work_sync(&adapter->client_task);
>         if (adapter->netdev_registered) {
>                 unregister_netdev(netdev);
>                 adapter->netdev_registered = false;
> @@ -3974,6 +3973,7 @@ static void iavf_remove(struct pci_dev *pdev)
>         iavf_free_q_vectors(adapter);
>
>         cancel_delayed_work_sync(&adapter->watchdog_task);
> +       cancel_delayed_work_sync(&adapter->client_task);
>
>         cancel_work_sync(&adapter->adminq_task);
>
Hi Ken,
What tree is this patch based on? This doesn't apply to either of the
IWL trees or the netdev trees.
The ordering looks correct on the kernel tree with watchdog_task being
cancelled before the client_task [1]. However, we do have an extra
'cancel_delayed_work_sync(&adapter->watchdog_task)'. I'll get a patch
together to remove the extra one.
Thanks,
Tony
[1] https://elixir.bootlin.com/linux/v5.16-rc5/source/drivers/net/ethernet/intel/iavf/iavf_main.c#3979
* [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being scheduled while adapter is being removed
2021-12-08 10:21 ` [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being scheduled while adapter is being removed Ken Cox
2021-12-11 11:25 ` Jankowski, Konrad0
@ 2021-12-13 18:27 ` Nguyen, Anthony L
2021-12-14 13:19 ` Ken Cox
1 sibling, 1 reply; 12+ messages in thread
From: Nguyen, Anthony L @ 2021-12-13 18:27 UTC (permalink / raw)
To: intel-wired-lan
On Wed, 2021-12-08 at 04:21 -0600, Ken Cox wrote:
> If a reset gets scheduled while the adapter is being removed it can
> cause a panic.
>
> The work_struct for the reset_task is contained in the iavf_adapter
> structure.  iavf_remove() eventually frees the iavf_adapter structure
> so if there is active work scheduled it can cause a panic.
>
> Signed-off-by: Ken Cox <jkc@redhat.com>
Like the other patch, this one isn't applying.
Thanks,
Tony
* [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove
2021-12-13 18:26 ` Nguyen, Anthony L
@ 2021-12-14 13:18 ` Ken Cox
0 siblings, 0 replies; 12+ messages in thread
From: Ken Cox @ 2021-12-14 13:18 UTC (permalink / raw)
To: intel-wired-lan
On 12/13/21 12:26, Nguyen, Anthony L wrote:
> On Wed, 2021-12-08 at 04:21 -0600, Ken Cox wrote:
>> It's possible for the client_task to get scheduled by the watchdog
>> after cancel_delayed_work_sync(&adapter->client_task);  This can cause
>> a panic because free_netdev() is called with the client_task still queued
>> on the work queue.
>>
>> The stack backtrace usually looks similar to:
>>
>> [  121.272963] Workqueue:  0x0 (iavf)
>> [  121.272969] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
>> ...
>> [  121.272980] Call Trace:
>> [  121.272985]  move_linked_works+0x49/0xa0
>> [  121.272988]  pwq_activate_delayed_work+0x43/0x100
>> [  121.272991]  pwq_dec_nr_in_flight+0x5d/0x90
>> [  121.272993]  worker_thread+0x30/0x370
>> [  121.272995]  ? process_one_work+0x420/0x420
>> [  121.272998]  kthread+0x15d/0x180
>> [  121.273000]  ? __kthread_parkme+0xa0/0xa0
>> [  121.273003]  ret_from_fork+0x1f/0x40
>>
>> Signed-off-by: Ken Cox <jkc@redhat.com>
>> ---
>>  drivers/net/ethernet/intel/iavf/iavf_main.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
>> b/drivers/net/ethernet/intel/iavf/iavf_main.c
>> index 6c2afbc8acbcd..63eec7edbf60a 100644
>> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
>> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
>> @@ -3940,7 +3940,6 @@ static void iavf_remove(struct pci_dev *pdev)
>>         set_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section);
>>         cancel_delayed_work_sync(&adapter->init_task);
>>         cancel_work_sync(&adapter->reset_task);
>> -       cancel_delayed_work_sync(&adapter->client_task);
>>         if (adapter->netdev_registered) {
>>                 unregister_netdev(netdev);
>>                 adapter->netdev_registered = false;
>> @@ -3974,6 +3973,7 @@ static void iavf_remove(struct pci_dev *pdev)
>>         iavf_free_q_vectors(adapter);
>>
>>         cancel_delayed_work_sync(&adapter->watchdog_task);
>> +       cancel_delayed_work_sync(&adapter->client_task);
>>
>>         cancel_work_sync(&adapter->adminq_task);
>>
>
> Hi Ken,
>
> What tree is this patch based on? This doesn't apply to either of the
> IWL trees or the netdev trees.
Sorry, I was in the wrong branch when I generated these patches. Please
disregard. I will re-evaluate and resend if necessary.
>
> The ordering looks correct on the kernel tree with watchdog_task being
> cancelled before the client_task [1]. However, we do have an extra
> 'cancel_delayed_work_sync(&adapter->watchdog_task)'. I'll get a patch
> together to remove the extra one.
>
> Thanks,
> Tony
>
>
> [1] https://elixir.bootlin.com/linux/v5.16-rc5/source/drivers/net/ethernet/intel/iavf/iavf_main.c#3979
>
* [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being scheduled while adapter is being removed
2021-12-13 18:27 ` Nguyen, Anthony L
@ 2021-12-14 13:19 ` Ken Cox
0 siblings, 0 replies; 12+ messages in thread
From: Ken Cox @ 2021-12-14 13:19 UTC (permalink / raw)
To: intel-wired-lan
On 12/13/21 12:27, Nguyen, Anthony L wrote:
> On Wed, 2021-12-08 at 04:21 -0600, Ken Cox wrote:
>> If a reset gets scheduled while the adapter is being removed it can
>> cause a panic.
>>
>> The work_struct for the reset_task is contained in the iavf_adapter
>> structure.  iavf_remove() eventually frees the iavf_adapter structure
>> so if there is active work scheduled it can cause a panic.
>>
>> Signed-off-by: Ken Cox <jkc@redhat.com>
>
> Like the other patch, this one isn't applying.
>
> Thanks,
> Tony
>
Sorry, I was in the wrong branch when I generated these patches. Please
disregard. I will re-evaluate and resend if necessary.
* [Intel-wired-lan] [Patch 0/2] iavf: Fix panics due to active work queues being freed in iavf_remove()
2021-12-08 10:21 [Intel-wired-lan] [Patch 0/2] iavf: Fix panics due to active work queues being freed in iavf_remove() Ken Cox
2021-12-08 10:21 ` [Intel-wired-lan] [Patch 1/2] iavf: Fix panic in iavf_remove Ken Cox
2021-12-08 10:21 ` [Intel-wired-lan] [Patch 2/2] iavf: Prevent reset from being scheduled while adapter is being removed Ken Cox
@ 2021-12-14 13:21 ` Ken Cox
2 siblings, 0 replies; 12+ messages in thread
From: Ken Cox @ 2021-12-14 13:21 UTC (permalink / raw)
To: intel-wired-lan
On 12/8/21 04:21, Ken Cox wrote:
> This series fixes panics that occur after iavf_remove() is called.
>
> The panics occur because the iavf_adapter structure is freed at the end
> of iavf_remove(), but it is possible that new work has been scheduled using
> the work_struct's contained within the iavf_adapter structure. If this occurs, the system will panic when it later tries to process the work queue.
>
> Ken Cox (2):
> iavf: Fix panic in iavf_remove
> iavf: Prevent reset from being scheduled while adapter is being
> removed
>
> drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 7 +++++--
> drivers/net/ethernet/intel/iavf/iavf_main.c | 17 +++++++++++------
> drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 4 +++-
> 3 files changed, 19 insertions(+), 9 deletions(-)
>
NAK for this series.
These patches were generated off of the wrong branch. I will
re-evaluate and re-submit if necessary.