On Thursday, 9 January 2025 03:27:56 CET Andy Strohman wrote:
> Reference counting is used to ensure that
> batadv_hardif_neigh_node and batadv_hard_iface
> are not freed before/during
> batadv_v_elp_throughput_metric_update work is
> finished.
>
> But there isn't a guarantee that the hard if will
> remain associated with a soft interface up until
> the work is finished.
>
> This fixes a crash triggered by reboot that looks
> like this:
>
> Call trace:
>  batadv_v_mesh_free+0xd0/0x4dc [batman_adv]
>  batadv_v_elp_throughput_metric_update+0x1c/0xa4
>  process_one_work+0x178/0x398
>  worker_thread+0x2e8/0x4d0
>  kthread+0xd8/0xdc
>  ret_from_fork+0x10/0x20
>
> (the batadv_v_mesh_free call is misleading,
> and does not actually happen)

I am not 100% sure how you build batman-adv, but if you are using the external
kernel module, then you can use [1,2]:

  make EXTRA_CFLAGS="-fno-inline -Og -fno-optimize-sibling-calls -fno-reorder-blocks -fno-ipa-cp-clone -fno-partial-inlining" KERNELPATH=...

to get actually useful backtraces. Unfortunately, compile-time checks sometimes
need inlining, and compilation fails on some kernel configurations with
'-fno-inline'. If this happens to you, then you can at least try to use the
rest of the extra flags.

[1] https://www.open-mesh.org/projects/devtools/wiki/Kernel_hacking_Debian_image#Building-the-batman-adv-module
[2] https://www.open-mesh.org/projects/devtools/wiki/Kernel_debugging_with_kgdb#Connecting-gdb

> I was able to make the issue happen more reliably
> by changing hardif_neigh->bat_v.metric_work work
> to be delayed work. This allowed me to track down
> and confirm the fix.
>
> Signed-off-by: Andy Strohman

Please add the following extra line before your Signed-off-by line:

Fixes: 5c3245172c01 ("batman-adv: ELP - compute the metric based on the estimated throughput")

> ---
>  net/batman-adv/bat_v_elp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
> index 1d704574..7daaad9c 100644
> --- a/net/batman-adv/bat_v_elp.c
> +++ b/net/batman-adv/bat_v_elp.c
> @@ -140,7 +140,7 @@ static u32 batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh)
>  	}
>
>  default_throughput:
> -	if (!(hard_iface->bat_v.flags & BATADV_WARNING_DEFAULT)) {
> +	if (!(hard_iface->bat_v.flags & BATADV_WARNING_DEFAULT) && hard_iface->soft_iface) {
>  		batadv_info(hard_iface->soft_iface,
>  			    "WiFi driver or ethtool info does not provide information about link speeds on interface %s, therefore defaulting to hardcoded throughput values of %u.%1u Mbps. Consider overriding the throughput manually or checking your driver.\n",
>  			    hard_iface->net_dev->name,
>

I would prefer something more explanatory instead of adding more conditions at
the end of the actually interesting checks.
Something more like:

diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
index 1d704574..185b063f 100644
--- a/net/batman-adv/bat_v_elp.c
+++ b/net/batman-adv/bat_v_elp.c
@@ -66,12 +66,19 @@ static void batadv_v_elp_start_timer(struct batadv_hard_iface *hard_iface)
 static u32 batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh)
 {
 	struct batadv_hard_iface *hard_iface = neigh->if_incoming;
+	struct net_device *soft_iface = hard_iface->soft_iface;
 	struct ethtool_link_ksettings link_settings;
 	struct net_device *real_netdev;
 	struct station_info sinfo;
 	u32 throughput;
 	int ret;
 
+	/* don't query throughput when no longer associated with any
+	 * batman-adv interface
+	 */
+	if (!soft_iface)
+		return BATADV_THROUGHPUT_DEFAULT_VALUE;
+
 	/* if the user specified a customised value for this interface, then
 	 * return it directly
 	 */
@@ -141,7 +148,7 @@ static u32 batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh)
 
 default_throughput:
 	if (!(hard_iface->bat_v.flags & BATADV_WARNING_DEFAULT)) {
-		batadv_info(hard_iface->soft_iface,
+		batadv_info(soft_iface,
 			    "WiFi driver or ethtool info does not provide information about link speeds on interface %s, therefore defaulting to hardcoded throughput values of %u.%1u Mbps. Consider overriding the throughput manually or checking your driver.\n",
 			    hard_iface->net_dev->name,
 			    BATADV_THROUGHPUT_DEFAULT_VALUE / 10,
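
To make the intent of the diff above easier to see outside the kernel tree,
here is a minimal userspace sketch of the same control flow: read the
association once into a local, return the default early when it is gone, and
use only the local copy afterwards. The struct layouts, the `warnings` counter,
the `get_throughput()` name and the value of `THROUGHPUT_DEFAULT_VALUE` are all
hypothetical stand-ins, not the batman-adv definitions, and the sketch says
nothing about locking or object lifetime.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the batman-adv types. */
struct soft_iface {
	const char *name;
	unsigned int warnings;	/* how often the default fallback was reported */
};

struct hard_iface {
	const char *name;
	struct soft_iface *soft_iface;		/* NULL once detached */
	unsigned int throughput_override;	/* 0 means "not configured" */
};

#define THROUGHPUT_DEFAULT_VALUE 10	/* assumed placeholder value */

static unsigned int get_throughput(struct hard_iface *hard_iface)
{
	/* read the association once and use only the local copy afterwards */
	struct soft_iface *soft_iface = hard_iface->soft_iface;

	/* no longer attached to any soft interface: bail out before any use */
	if (!soft_iface)
		return THROUGHPUT_DEFAULT_VALUE;

	/* a user-configured value wins over everything else */
	if (hard_iface->throughput_override)
		return hard_iface->throughput_override;

	/* fallback path: report via the local copy, which is known non-NULL */
	soft_iface->warnings++;
	return THROUGHPUT_DEFAULT_VALUE;
}
```

With this shape the early return documents why the function gives up, and the
later reporting path no longer needs its own NULL test.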