B.A.T.M.A.N Archive on lore.kernel.org
From: Remi Pommarel <repk@triplefau.lt>
To: Andrew Strohman <andrew@andrewstrohman.com>
Cc: Sven Eckelmann <sven@narfation.org>, b.a.t.m.a.n@lists.open-mesh.org
Subject: Re: [PATCH] batman-adv: fix panic during interface removal
Date: Fri, 10 Jan 2025 14:10:21 +0100
Message-ID: <Z4EcPQOMU1BUtO07@pilgrim>
In-Reply-To: <CAA8ajJnVQWO3fhLAjQtEfjEVReY7x-==TEkhrKQKZMrVJms44w@mail.gmail.com>

Hi Andrew,

On Fri, Jan 10, 2025 at 01:02:19AM -0800, Andrew Strohman wrote:
> > I would prefer when you would call cancel_work_sync when metric stuff should
> > be stopped. I was expecting to see this somewhere around
> > batadv_v_elp_iface_disable after the cancel_work_sync but it seems like it is
> > missing there (or in a similar place)
> >
> 
> I tried this:
> 
> diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
> index 1d704574..b35ded79 100644
> --- a/net/batman-adv/bat_v_elp.c
> +++ b/net/batman-adv/bat_v_elp.c
> @@ -387,8 +387,20 @@ int batadv_v_elp_iface_enable(struct batadv_hard_iface *hard_iface)
>   */
>  void batadv_v_elp_iface_disable(struct batadv_hard_iface *hard_iface)
>  {
> +       struct batadv_hardif_neigh_node *hardif_neigh;
> +
>         cancel_delayed_work_sync(&hard_iface->bat_v.elp_wq);
> 
> +       rcu_read_lock();
> +       hlist_for_each_entry_rcu(hardif_neigh,
> +                                &hard_iface->neigh_list, list) {
> +               if (!kref_get_unless_zero(&hardif_neigh->refcount))
> +                       continue;
> +               cancel_work_sync(&hardif_neigh->bat_v.metric_work);
> +               batadv_hardif_neigh_put(hardif_neigh);
> +       }
> +       rcu_read_unlock();
> +
>         dev_kfree_skb(hard_iface->bat_v.elp_skb);
>         hard_iface->bat_v.elp_skb = NULL;
>  }
> 
> But it seems to cause a hang on reboot every once in a while. When the hang
> happens, I'm not able to trigger sysrq over serial.

From a quick look, I think metric_work may need to sleep, so calling
cancel_work_sync() on it does not seem safe inside an RCU read-side
critical section.

I would try to frame the cancel_work_sync() call with
rcu_read_unlock()/rcu_read_lock() as below:

  rcu_read_unlock();
  cancel_work_sync(...);
  rcu_read_lock();
  batadv_hardif_neigh_put(...);

But be careful, as batadv_hardif_neigh_put() could modify the list you
are currently traversing. At first glance it seems safe to release the
RCU read lock to call cancel_work_sync() as long as you take it back
before calling batadv_hardif_neigh_put(), but this needs more thought.
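
To make the ordering concrete, here is a small userspace model of the
loop. It is only a sketch and not kernel code: rcu_read_lock(),
rcu_read_unlock() and cancel_work_sync() are stand-ins that merely track
nesting, and rcu_depth, slept_in_rcu, cancelled, struct neigh,
neigh_get_unless_zero(), neigh_put() and iface_disable() are names made
up for the illustration. It does not model concurrent list changes,
which is exactly the part that still needs thought.

```c
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel API: they only track nesting. */
static int rcu_depth;     /* > 0 means inside a read-side section */
static int slept_in_rcu;  /* set if a sleeping call ran under RCU */
static int cancelled;     /* how many works were cancelled */

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* The real cancel_work_sync() may sleep; flag calls made under RCU. */
static void cancel_work_sync(int *work)
{
	if (rcu_depth > 0)
		slept_in_rcu = 1;
	*work = 0;
	cancelled++;
}

/* Hypothetical, simplified neighbor: plain ints replace work/kref. */
struct neigh {
	int metric_work;   /* 1 = work pending */
	int refcount;
	struct neigh *next;
};

/* Mimics the "get unless zero" idea: no reference on a dying entry. */
static int neigh_get_unless_zero(struct neigh *n)
{
	if (n->refcount == 0)
		return 0;
	n->refcount++;
	return 1;
}

static void neigh_put(struct neigh *n) { n->refcount--; }

static void iface_disable(struct neigh *head)
{
	struct neigh *n;

	rcu_read_lock();
	for (n = head; n != NULL; n = n->next) {
		if (!neigh_get_unless_zero(n))
			continue;

		/* Leave the read-side section around the sleeping call. */
		rcu_read_unlock();
		cancel_work_sync(&n->metric_work);
		rcu_read_lock();

		/* Drop the reference only once the lock is held again. */
		neigh_put(n);
	}
	rcu_read_unlock();
}
```

The reference taken before unlocking is what keeps the current entry
alive while the lock is dropped; whether the walk to the next entry is
still valid after re-locking is the open question above.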

-- 
Remi

Thread overview: 9+ messages
2025-01-09  2:27 [PATCH] batman-adv: fix panic during interface removal Andy Strohman
2025-01-09  7:46 ` Sven Eckelmann
2025-01-09 10:10   ` Andrew Strohman
2025-01-09 10:23     ` Sven Eckelmann
2025-01-10  9:02       ` Andrew Strohman
2025-01-10 13:10         ` Remi Pommarel [this message]
2025-01-13  7:35           ` Andrew Strohman
2025-01-19 22:28             ` Andrew Strohman
2025-01-19 23:03             ` Sven Eckelmann
