From: Remi Pommarel <repk@triplefau.lt>
To: Andrew Strohman <andrew@andrewstrohman.com>
Cc: Sven Eckelmann <sven@narfation.org>, b.a.t.m.a.n@lists.open-mesh.org
Subject: Re: [PATCH] batman-adv: fix panic during interface removal
Date: Fri, 10 Jan 2025 14:10:21 +0100 [thread overview]
Message-ID: <Z4EcPQOMU1BUtO07@pilgrim> (raw)
In-Reply-To: <CAA8ajJnVQWO3fhLAjQtEfjEVReY7x-==TEkhrKQKZMrVJms44w@mail.gmail.com>
Hi Andrew,
On Fri, Jan 10, 2025 at 01:02:19AM -0800, Andrew Strohman wrote:
> > I would prefer when you would call cancel_work_sync when metric stuff should
> > be stopped. I was expecting to see this somewhere around
> > batadv_v_elp_iface_disable after the cancel_work_sync but it seems like it is
> > missing there (or in a similar place)
> >
>
> I tried this:
>
> diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
> index 1d704574..b35ded79 100644
> --- a/net/batman-adv/bat_v_elp.c
> +++ b/net/batman-adv/bat_v_elp.c
> @@ -387,8 +387,20 @@ int batadv_v_elp_iface_enable(struct
> batadv_hard_iface *hard_iface)
> */
> void batadv_v_elp_iface_disable(struct batadv_hard_iface *hard_iface)
> {
> + struct batadv_hardif_neigh_node *hardif_neigh;
> +
> cancel_delayed_work_sync(&hard_iface->bat_v.elp_wq);
>
> + rcu_read_lock();
> + hlist_for_each_entry_rcu(hardif_neigh,
> + &hard_iface->neigh_list, list) {
> + if (!kref_get_unless_zero(&hardif_neigh->refcount))
> + continue;
> + cancel_work_sync(&hardif_neigh->bat_v.metric_work);
> + batadv_hardif_neigh_put(hardif_neigh);
> + }
> + rcu_read_unlock();
> +
> dev_kfree_skb(hard_iface->bat_v.elp_skb);
> hard_iface->bat_v.elp_skb = NULL;
> }
>
> But it seems to cause a hang on reboot every once in a while. When the hang
> happens, I'm not able to trigger sysrq over serial.
Quickly looking at that I think that metric_work may need to sleep so
calling cancel_work_sync() on this work does not seem safe while in rcu
protected context.
I would try to frame the cancel_work_sync() call with
rcu_read_unlock()/rcu_read_lock() as below:
rcu_read_unlock();
cancel_work_sync(...);
rcu_read_lock();
batadv_hardif_neigh_put(...);
But be careful as batadv_hardif_neigh_put() could modify the list you
are currently traversing. At first glance it seems safe to realease
rcu read constraint to call cancel_work_sync() as long as you take it
back before calling batadv_hardif_neigh_put(), but this could need more
though on that.
--
Remi
next prev parent reply other threads:[~2025-01-10 13:13 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-09 2:27 [PATCH] batman-adv: fix panic during interface removal Andy Strohman
2025-01-09 7:46 ` Sven Eckelmann
2025-01-09 10:10 ` Andrew Strohman
2025-01-09 10:23 ` Sven Eckelmann
2025-01-10 9:02 ` Andrew Strohman
2025-01-10 13:10 ` Remi Pommarel [this message]
2025-01-13 7:35 ` Andrew Strohman
2025-01-19 22:28 ` Andrew Strohman
2025-01-19 23:03 ` Sven Eckelmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z4EcPQOMU1BUtO07@pilgrim \
--to=repk@triplefau.lt \
--cc=andrew@andrewstrohman.com \
--cc=b.a.t.m.a.n@lists.open-mesh.org \
--cc=sven@narfation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.