* [PATCH net] ice: fix memory leak of aRFS after resuming from suspend
From: Yongxin Liu @ 2021-03-18 8:15 UTC
To: brett.creeley, madhu.chittim, anthony.l.nguyen, andrewx.bowers,
jeffrey.t.kirsher
Cc: netdev
In ice_suspend(), ice_clear_interrupt_scheme() is called, and
irq_free_descs() is eventually called to free each irq and its descriptor.

In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs.
However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap may
not be freed if the irqs released in ice_suspend() were reassigned
to other devices, because the irq descriptor's affinity_notify is then lost.

So move ice_remove_arfs() before ice_clear_interrupt_scheme() to
make sure all irq_glue and cpu_rmap objects are released before the
corresponding irqs and descriptors are released.

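The ordering problem described above can be illustrated with a small
user-space model in C. This is only a sketch under stated assumptions, not
driver code: struct rmap_model, struct glue_model, clear_notifier() and
leaked_after_suspend_resume() are hypothetical stand-ins for struct cpu_rmap,
struct irq_glue and the clearing of a descriptor's affinity_notify, and
static pools replace kernel allocations so the model only counts objects
that were never released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_IRQS 2

/* Hypothetical stand-in for struct cpu_rmap: a refcounted object. */
struct rmap_model {
	int refs;
	bool live;
};

/* Hypothetical stand-in for struct irq_glue: one per IRQ, pins the rmap. */
struct glue_model {
	struct rmap_model *rmap;
	bool live;
};

static struct rmap_model rmap_pool;
static struct glue_model glue_pool[NUM_IRQS];
/* Models each IRQ descriptor's affinity_notify pointer. */
static struct glue_model *desc_notify[NUM_IRQS];

static void rmap_put(struct rmap_model *rmap)
{
	assert(rmap->refs > 0);
	if (--rmap->refs == 0)
		rmap->live = false;
}

/* The glue is released only if the descriptor still points at it. */
static void clear_notifier(int irq)
{
	struct glue_model *old = desc_notify[irq];

	desc_notify[irq] = NULL;
	if (old) {
		rmap_put(old->rmap);
		old->live = false;
	}
}

/* Allocate the rmap and register one glue object per IRQ. */
static void setup_rmap(void)
{
	rmap_pool.refs = 1;
	rmap_pool.live = true;
	for (int irq = 0; irq < NUM_IRQS; irq++) {
		glue_pool[irq].rmap = &rmap_pool;
		glue_pool[irq].live = true;
		rmap_pool.refs++;
		desc_notify[irq] = &glue_pool[irq];
	}
}

/* Tear down: clear every notifier, then drop the owner's reference. */
static void free_rmap(void)
{
	for (int irq = 0; irq < NUM_IRQS; irq++)
		clear_notifier(irq);
	rmap_put(&rmap_pool);
}

/* Descriptors are freed and handed out again: the notifier pointer is lost. */
static void recycle_irq_descs(void)
{
	for (int irq = 0; irq < NUM_IRQS; irq++)
		desc_notify[irq] = NULL;
}

/* Returns how many modelled objects are still live after the cycle. */
int leaked_after_suspend_resume(bool release_before_irq_free)
{
	int leaked;

	setup_rmap();
	if (release_before_irq_free)
		free_rmap();	/* suspend path: descriptors still valid */
	recycle_irq_descs();
	if (!release_before_irq_free)
		free_rmap();	/* resume path: descriptors already recycled */

	leaked = rmap_pool.live;
	for (int irq = 0; irq < NUM_IRQS; irq++)
		leaked += glue_pool[irq].live;
	return leaked;
}
```

In this model, leaked_after_suspend_resume(false) reports three stranded
objects (the rmap and both glue entries), while
leaked_after_suspend_resume(true) reports none, which is the ordering the
patch aims for.
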
Fix the following memeory leak.
unreferenced object 0xffff95bd951afc00 (size 512):
comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
hex dump (first 32 bytes):
18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff ........p.......
00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff ................
backtrace:
[<0000000072e4b914>] __kmalloc+0x336/0x540
[<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0
[<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice]
[<000000002370a632>] ice_probe+0x941/0x1180 [ice]
[<00000000d692edba>] local_pci_probe+0x47/0xa0
[<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
[<00000000555a9e4a>] process_one_work+0x1dd/0x410
[<000000002c4b414a>] worker_thread+0x221/0x3f0
[<00000000bb2b556b>] kthread+0x14c/0x170
[<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
unreferenced object 0xffff95bd81b0a2a0 (size 96):
comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
hex dump (first 32 bytes):
38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 8...............
b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff ................
backtrace:
[<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0
[<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0
[<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice]
[<000000002370a632>] ice_probe+0x941/0x1180 [ice]
[<00000000d692edba>] local_pci_probe+0x47/0xa0
[<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
[<00000000555a9e4a>] process_one_work+0x1dd/0x410
[<000000002c4b414a>] worker_thread+0x221/0x3f0
[<00000000bb2b556b>] kthread+0x14c/0x170
[<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
---
drivers/net/ethernet/intel/ice/ice_arfs.c | 1 -
drivers/net/ethernet/intel/ice/ice_main.c | 3 +++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c b/drivers/net/ethernet/intel/ice/ice_arfs.c
index 6560acd76c94..c748d0a5c7d4 100644
--- a/drivers/net/ethernet/intel/ice/ice_arfs.c
+++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
@@ -654,7 +654,6 @@ void ice_rebuild_arfs(struct ice_pf *pf)
 	if (!pf_vsi)
 		return;

-	ice_remove_arfs(pf);
 	if (ice_set_cpu_rx_rmap(pf_vsi)) {
 		dev_err(ice_pf_to_dev(pf), "Failed to rebuild aRFS\n");
 		return;
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 2c23c8f468a5..dba901bf2b9b 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -4568,6 +4568,9 @@ static int __maybe_unused ice_suspend(struct device *dev)
 			continue;
 		ice_vsi_free_q_vectors(pf->vsi[v]);
 	}
+	if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
+		ice_remove_arfs(pf);
+	}
 	ice_clear_interrupt_scheme(pf);

 	pci_save_state(pdev);
--
2.14.5
* Re: [PATCH net] ice: fix memory leak of aRFS after resuming from suspend
From: Creeley, Brett @ 2021-03-18 22:20 UTC
To: yongxin.liu@windriver.com, jeffrey.t.kirsher@intel.com,
Chittim, Madhu, Nguyen, Anthony L, andrewx.bowers@intel.com
Cc: netdev@vger.kernel.org
On Thu, 2021-03-18 at 16:15 +0800, Yongxin Liu wrote:
> In ice_suspend(), ice_clear_interrupt_scheme() is called, and
> irq_free_descs() is eventually called to free each irq and its
> descriptor.
>
> In ice_resume(), ice_init_interrupt_scheme() is called to allocate
> new irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct
> cpu_rmap may not be freed if the irqs released in ice_suspend() were
> reassigned to other devices, because the irq descriptor's
> affinity_notify is then lost.
>
> So move ice_remove_arfs() before ice_clear_interrupt_scheme() to
> make sure all irq_glue and cpu_rmap objects are released before the
> corresponding irqs and descriptors are released.
>
> Fix the following memeory leak.
s/memeory/memory
<snip>
> diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c b/drivers/net/ethernet/intel/ice/ice_arfs.c
> index 6560acd76c94..c748d0a5c7d4 100644
> --- a/drivers/net/ethernet/intel/ice/ice_arfs.c
> +++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
> @@ -654,7 +654,6 @@ void ice_rebuild_arfs(struct ice_pf *pf)
>  	if (!pf_vsi)
>  		return;
>
> -	ice_remove_arfs(pf);
This should not be removed. Removing this would break the
reset flows outside of the suspend/remove case.
>  	if (ice_set_cpu_rx_rmap(pf_vsi)) {
>  		dev_err(ice_pf_to_dev(pf), "Failed to rebuild aRFS\n");
>  		return;
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 2c23c8f468a5..dba901bf2b9b 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -4568,6 +4568,9 @@ static int __maybe_unused ice_suspend(struct device *dev)
>  			continue;
>  		ice_vsi_free_q_vectors(pf->vsi[v]);
>  	}
> +	if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
> +		ice_remove_arfs(pf);
> +	}
Braces aren't needed around a single if statement like this.
Also, I don't think this is the right solution. I think a better
approach would be to call ice_free_cpu_rx_rmap() here. With this,
it seems like no other changes are necessary. It also isn't
necessary to check the ICE_FLAG_FD_ENA bit with this change.
>  	ice_clear_interrupt_scheme(pf);
>
>  	pci_save_state(pdev);
* RE: [PATCH net] ice: fix memory leak of aRFS after resuming from suspend
From: Liu, Yongxin @ 2021-03-19 2:33 UTC
To: Creeley, Brett
Cc: netdev@vger.kernel.org, jeffrey.t.kirsher@intel.com,
Chittim, Madhu, Nguyen, Anthony L, andrewx.bowers@intel.com
> -----Original Message-----
> From: Creeley, Brett <brett.creeley@intel.com>
> Sent: Friday, March 19, 2021 06:20
> To: Liu, Yongxin <Yongxin.Liu@windriver.com>; jeffrey.t.kirsher@intel.com;
> Chittim, Madhu <madhu.chittim@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; andrewx.bowers@intel.com
> Cc: netdev@vger.kernel.org
> Subject: Re: [PATCH net] ice: fix memory leak of aRFS after resuming from
> suspend
>
>
> On Thu, 2021-03-18 at 16:15 +0800, Yongxin Liu wrote:
> > In ice_suspend(), ice_clear_interrupt_scheme() is called, and
> > irq_free_descs() is eventually called to free each irq and its
> > descriptor.
> >
> > In ice_resume(), ice_init_interrupt_scheme() is called to allocate new
> > irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct
> > cpu_rmap may not be freed if the irqs released in ice_suspend() were
> > reassigned to other devices, because the irq descriptor's
> > affinity_notify is then lost.
> >
> > So move ice_remove_arfs() before ice_clear_interrupt_scheme() to
> > make sure all irq_glue and cpu_rmap objects are released before the
> > corresponding irqs and descriptors are released.
> >
> > Fix the following memeory leak.
>
> s/memeory/memory
>
> <snip>
>
> > diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c b/drivers/net/ethernet/intel/ice/ice_arfs.c
> > index 6560acd76c94..c748d0a5c7d4 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_arfs.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
> > @@ -654,7 +654,6 @@ void ice_rebuild_arfs(struct ice_pf *pf)
> >  	if (!pf_vsi)
> >  		return;
> >
> > -	ice_remove_arfs(pf);
>
> This should not be removed. Removing this would break the reset flows
> outside of the suspend/remove case.
>
> >  	if (ice_set_cpu_rx_rmap(pf_vsi)) {
> >  		dev_err(ice_pf_to_dev(pf), "Failed to rebuild aRFS\n");
> >  		return;
> > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > index 2c23c8f468a5..dba901bf2b9b 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > @@ -4568,6 +4568,9 @@ static int __maybe_unused ice_suspend(struct device *dev)
> >  			continue;
> >  		ice_vsi_free_q_vectors(pf->vsi[v]);
> >  	}
> > +	if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
> > +		ice_remove_arfs(pf);
> > +	}
>
> Braces aren't needed around a single if statement like this.
>
> Also, I don't think this is the right solution. I think a better approach
> would be to call ice_free_cpu_rx_rmap() here. With this, it seems like no
> other changes are necessary. It also isn't necessary to check the
> ICE_FLAG_FD_ENA bit with this change.
Thanks for your valuable review. I will send V2.
--Yongxin