* unregister_netdevice: waiting for DEV to become free (8)
From: Tetsuo Handa @ 2025-11-19 12:18 UTC
To: syzbot+881d65229ca4f9ae8c84, LKML

Let me check whether a reproducer that can still reproduce this problem is used for testing.

#syz test: git://git.code.sf.net/p/tomoyo/tomoyo.git

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: unregister_netdevice: waiting for DEV to become free (8)
From: syzbot @ 2025-11-19 12:18 UTC
To: penguin-kernel; +Cc: linux-kernel, penguin-kernel, syzkaller-bugs

> Let me check whether a reproducer that can still reproduce this problem is used for testing.
>
> #syz test: git://git.code.sf.net/p/tomoyo/tomoyo.git

I've failed to parse your command.
Did you perhaps forget to provide the branch name, or added an extra ':'?
Please use one of the two supported formats:
1. #syz test
2. #syz test: repo branch-or-commit-hash

Note the lack of ':' in option 1.
* Re: unregister_netdevice: waiting for DEV to become free (8)
From: Tetsuo Handa @ 2025-11-19 12:20 UTC
To: syzbot; +Cc: linux-kernel

#syz test: git://git.code.sf.net/p/tomoyo/tomoyo.git master
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
From: syzbot @ 2025-11-19 13:09 UTC
To: linux-kernel, penguin-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Tested-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com

Tested on:

commit:         5ac798f7 net/can/j1939: add j1939_priv debugging
git tree:       git://git.code.sf.net/p/tomoyo/tomoyo.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=10a5a8b4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=7fd2b6b29c6ab719
dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8

Note: no patches were applied.
Note: testing is done by a robot and is best-effort only.
* Re: unregister_netdevice: waiting for DEV to become free (8)
From: Tetsuo Handa @ 2025-11-19 13:13 UTC
To: syzbot; +Cc: linux-kernel

Trying once more.

#syz test: git://git.code.sf.net/p/tomoyo/tomoyo.git master
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
From: syzbot @ 2025-11-19 13:57 UTC
To: linux-kernel, penguin-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Tested-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com

Tested on:

commit:         5ac798f7 net/can/j1939: add j1939_priv debugging
git tree:       git://git.code.sf.net/p/tomoyo/tomoyo.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=17b5a8b4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=7fd2b6b29c6ab719
dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8

Note: no patches were applied.
Note: testing is done by a robot and is best-effort only.
* Re: unregister_netdevice: waiting for DEV to become free (8)
From: Tetsuo Handa @ 2025-11-19 14:00 UTC
To: syzbot; +Cc: linux-kernel

Too timing-dependent to trigger using a reproducer?
Or, a reproducer for an already-fixed bug is used?

#syz test

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d1a687444b27..798d60b3e2ad 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2084,6 +2084,8 @@ enum netdev_reg_state {
  *
  *	FIXME: cleanup struct net_device such that network protocol info
  *	moves out.
+ *
+ *	@netdev_trace_buffer_list: Linked list for debugging refcount leak.
  */
 
 struct net_device {
@@ -2238,6 +2240,9 @@ struct net_device {
 #if IS_ENABLED(CONFIG_TLS_DEVICE)
 	const struct tlsdev_ops *tlsdev_ops;
 #endif
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+	struct list_head netdev_trace_buffer_list;
+#endif
 
 	unsigned int		operstate;
 	unsigned char		link_mode;
@@ -3166,6 +3171,7 @@ enum netdev_cmd {
 	NETDEV_OFFLOAD_XSTATS_REPORT_USED,
 	NETDEV_OFFLOAD_XSTATS_REPORT_DELTA,
 	NETDEV_XDP_FEAT_CHANGE,
+	NETDEV_DEBUG_UNREGISTER,
 };
 
 const char *netdev_cmd_to_name(enum netdev_cmd cmd);
@@ -4345,9 +4351,15 @@ static inline bool dev_nit_active(const struct net_device *dev)
 
 void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev);
 
+void save_netdev_trace_buffer(struct net_device *dev, int delta);
+int trim_netdev_trace(unsigned long *entries, int nr_entries);
+
 static inline void __dev_put(struct net_device *dev)
 {
 	if (dev) {
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+		save_netdev_trace_buffer(dev, -1);
+#endif
 #ifdef CONFIG_PCPU_DEV_REFCNT
 		this_cpu_dec(*dev->pcpu_refcnt);
 #else
@@ -4359,6 +4371,9 @@ static inline void __dev_put(struct net_device *dev)
 static inline void __dev_hold(struct net_device *dev)
 {
 	if (dev) {
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+		save_netdev_trace_buffer(dev, 1);
+#endif
 #ifdef CONFIG_PCPU_DEV_REFCNT
 		this_cpu_inc(*dev->pcpu_refcnt);
 #else
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 77198911b8dd..5f435c1e48d8 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -576,6 +576,10 @@ static inline bool lockdep_softirq_start(void) { return false; }
 static inline void lockdep_softirq_end(bool in_hardirq) { }
 #endif
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+static noinline void handle_softirqs(bool ksirqd);
+#endif
+
 static void handle_softirqs(bool ksirqd)
 {
 	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 45320e27a16c..e9c654a9d0bb 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3145,6 +3145,10 @@ static bool manage_workers(struct worker *worker)
 	return true;
 }
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+static noinline void process_one_work(struct worker *worker, struct work_struct *work);
+#endif
+
 /**
  * process_one_work - process single work
  * @worker: self
diff --git a/net/can/j1939/main.c b/net/can/j1939/main.c
index a93af55df5fd..66e6624abfa3 100644
--- a/net/can/j1939/main.c
+++ b/net/can/j1939/main.c
@@ -124,6 +124,16 @@ static void j1939_can_recv(struct sk_buff *iskb, void *data)
 
 static DEFINE_MUTEX(j1939_netdev_lock);
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+static void dump_priv_trace_buffer(const struct net_device *ndev);
+static void erase_priv_trace_buffer(struct j1939_priv *priv);
+static noinline void save_priv_trace_buffer(struct j1939_priv *priv, int delta);
+#else
+static inline void dump_priv_trace_buffer(const struct net_device *ndev) { };
+static inline void erase_priv_trace_buffer(struct j1939_priv *priv) { };
+static inline void save_priv_trace_buffer(struct j1939_priv *priv, int delta) { };
+#endif
+
 static struct j1939_priv *j1939_priv_create(struct net_device *ndev)
 {
 	struct j1939_priv *priv;
@@ -137,6 +147,7 @@ static struct j1939_priv *j1939_priv_create(struct net_device *ndev)
 	priv->ndev = ndev;
 	kref_init(&priv->kref);
 	kref_init(&priv->rx_kref);
+	save_priv_trace_buffer(priv, 1);
 	dev_hold(ndev);
 
 	netdev_dbg(priv->ndev, "%s : 0x%p\n", __func__, priv);
@@ -164,17 +175,20 @@ static void __j1939_priv_release(struct kref *kref)
 	WARN_ON_ONCE(!list_empty(&priv->j1939_socks));
 
 	dev_put(ndev);
+	erase_priv_trace_buffer(priv);
 	kfree(priv);
 }
 
 void j1939_priv_put(struct j1939_priv *priv)
 {
+	save_priv_trace_buffer(priv, -1);
 	kref_put(&priv->kref, __j1939_priv_release);
 }
 
 void j1939_priv_get(struct j1939_priv *priv)
 {
 	kref_get(&priv->kref);
+	save_priv_trace_buffer(priv, 1);
 }
 
 static int j1939_can_rx_register(struct j1939_priv *priv)
@@ -282,6 +296,7 @@ struct j1939_priv *j1939_netdev_start(struct net_device *ndev)
 		kref_get(&priv_new->rx_kref);
 		mutex_unlock(&j1939_netdev_lock);
 		dev_put(ndev);
+		erase_priv_trace_buffer(priv);
 		kfree(priv);
 		return priv_new;
 	}
@@ -299,6 +314,7 @@ struct j1939_priv *j1939_netdev_start(struct net_device *ndev)
 	mutex_unlock(&j1939_netdev_lock);
 
 	dev_put(ndev);
+	erase_priv_trace_buffer(priv);
 	kfree(priv);
 
 	return ERR_PTR(ret);
@@ -364,6 +380,9 @@ static int j1939_netdev_notify(struct notifier_block *nb,
 	struct can_ml_priv *can_ml = can_get_ml_priv(ndev);
 	struct j1939_priv *priv;
 
+	if (msg == NETDEV_DEBUG_UNREGISTER)
+		dump_priv_trace_buffer(ndev);
+
 	if (!can_ml)
 		goto notify_done;
 
@@ -428,3 +447,79 @@ static __exit void j1939_module_exit(void)
 
 module_init(j1939_module_init);
 module_exit(j1939_module_exit);
+
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+
+#define PRIV_TRACE_BUFFER_SIZE 1024
+static struct priv_trace_buffer {
+	struct j1939_priv *priv; // no-ref
+	struct net_device *ndev; // no-ref
+	atomic_t count;
+	int nr_entries;
+	unsigned long entries[20];
+} priv_trace_buffer[PRIV_TRACE_BUFFER_SIZE];
+static bool priv_trace_buffer_exhausted;
+
+static void dump_priv_trace_buffer(const struct net_device *ndev)
+{
+	struct priv_trace_buffer *ptr;
+	int count, balance = 0;
+	int i;
+
+	for (i = 0; i < PRIV_TRACE_BUFFER_SIZE; i++) {
+		ptr = &priv_trace_buffer[i];
+		if (!ptr->priv || ptr->ndev != ndev)
+			continue;
+		count = atomic_read(&ptr->count);
+		balance += count;
+		pr_info("Call trace for %s@%p %+d at\n", ndev->name, ptr->priv, count);
+		stack_trace_print(ptr->entries, ptr->nr_entries, 4);
+	}
+	if (!priv_trace_buffer_exhausted)
+		pr_info("balance for %s@j1939_priv is %d\n", ndev->name, balance);
+	else
+		pr_info("balance for %s@j1939_priv is unknown\n", ndev->name);
+}
+
+static void erase_priv_trace_buffer(struct j1939_priv *priv)
+{
+	int i;
+
+	for (i = 0; i < PRIV_TRACE_BUFFER_SIZE; i++)
+		if (priv_trace_buffer[i].priv == priv)
+			priv_trace_buffer[i].priv = NULL;
+}
+
+static noinline void save_priv_trace_buffer(struct j1939_priv *priv, int delta)
+{
+	struct priv_trace_buffer *ptr;
+	unsigned long entries[ARRAY_SIZE(ptr->entries)];
+	unsigned long nr_entries;
+	int i;
+
+	if (in_nmi())
+		return;
+	nr_entries = stack_trace_save(entries, ARRAY_SIZE(ptr->entries), 1);
+	nr_entries = trim_netdev_trace(entries, nr_entries);
+	for (i = 0; i < PRIV_TRACE_BUFFER_SIZE; i++) {
+		ptr = &priv_trace_buffer[i];
+		if (ptr->priv == priv && ptr->nr_entries == nr_entries &&
+		    !memcmp(ptr->entries, entries, nr_entries * sizeof(unsigned long))) {
+			atomic_add(delta, &ptr->count);
+			return;
+		}
+	}
+	for (i = 0; i < PRIV_TRACE_BUFFER_SIZE; i++) {
+		ptr = &priv_trace_buffer[i];
+		if (!ptr->priv && !cmpxchg(&ptr->priv, NULL, priv)) {
+			ptr->ndev = priv->ndev;
+			atomic_set(&ptr->count, delta);
+			ptr->nr_entries = nr_entries;
+			memmove(ptr->entries, entries, nr_entries * sizeof(unsigned long));
+			return;
+		}
+	}
+	priv_trace_buffer_exhausted = true;
+}
+
+#endif
diff --git a/net/core/dev.c b/net/core/dev.c
index 2acfa44927da..c3a62c16fa15 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1854,6 +1854,7 @@ const char *netdev_cmd_to_name(enum netdev_cmd cmd)
 	N(PRE_CHANGEADDR) N(OFFLOAD_XSTATS_ENABLE) N(OFFLOAD_XSTATS_DISABLE)
 	N(OFFLOAD_XSTATS_REPORT_USED) N(OFFLOAD_XSTATS_REPORT_DELTA)
 	N(XDP_FEAT_CHANGE)
+	N(DEBUG_UNREGISTER)
 	}
 #undef N
 	return "UNKNOWN_NETDEV_EVENT";
@@ -11429,6 +11430,14 @@ int netdev_refcnt_read(const struct net_device *dev)
 }
 EXPORT_SYMBOL(netdev_refcnt_read);
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+static void dump_netdev_trace_buffer(const struct net_device *dev);
+static void erase_netdev_trace_buffer(const struct net_device *dev);
+#else
+static inline void dump_netdev_trace_buffer(const struct net_device *dev) { }
+static inline void erase_netdev_trace_buffer(const struct net_device *dev) { }
+#endif
+
 int netdev_unregister_timeout_secs __read_mostly = 10;
 
 #define WAIT_REFS_MIN_MSECS 1
@@ -11502,11 +11511,16 @@ static struct net_device *netdev_wait_allrefs_any(struct list_head *list)
 
 		if (time_after(jiffies, warning_time +
 			       READ_ONCE(netdev_unregister_timeout_secs) * HZ)) {
+			rtnl_lock();
 			list_for_each_entry(dev, list, todo_list) {
 				pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
 					 dev->name, netdev_refcnt_read(dev));
 				ref_tracker_dir_print(&dev->refcnt_tracker, 10);
+				call_netdevice_notifiers(NETDEV_DEBUG_UNREGISTER, dev);
+				dump_netdev_trace_buffer(dev);
 			}
+			__rtnl_unlock();
+			rcu_barrier();
 
 			warning_time = jiffies;
 		}
@@ -11904,6 +11918,9 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 
 	dev->priv_len = sizeof_priv;
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+	INIT_LIST_HEAD(&dev->netdev_trace_buffer_list);
+#endif
 	ref_tracker_dir_init(&dev->refcnt_tracker, 128, "netdev");
 #ifdef CONFIG_PCPU_DEV_REFCNT
 	dev->pcpu_refcnt = alloc_percpu(int);
@@ -12076,6 +12093,8 @@ void free_netdev(struct net_device *dev)
 
 	mutex_destroy(&dev->lock);
 
+	erase_netdev_trace_buffer(dev);
+
 	/* Compatibility with error handling in drivers */
 	if (dev->reg_state == NETREG_UNINITIALIZED ||
 	    dev->reg_state == NETREG_DUMMY) {
@@ -13090,3 +13109,180 @@ static int __init net_dev_init(void)
 }
 
 subsys_initcall(net_dev_init);
+
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+
+#define NETDEV_TRACE_BUFFER_SIZE 32768
+static struct netdev_trace_buffer {
+	struct list_head list;
+	int prev_count;
+	atomic_t count;
+	int nr_entries;
+	unsigned long entries[20];
+} netdev_trace_buffer[NETDEV_TRACE_BUFFER_SIZE];
+static LIST_HEAD(netdev_trace_buffer_list);
+static DEFINE_SPINLOCK(netdev_trace_buffer_lock);
+static bool netdev_trace_buffer_exhausted;
+
+static int netdev_trace_buffer_init(void)
+{
+	int i;
+
+	for (i = 0; i < NETDEV_TRACE_BUFFER_SIZE; i++)
+		list_add_tail(&netdev_trace_buffer[i].list, &netdev_trace_buffer_list);
+	return 0;
+}
+pure_initcall(netdev_trace_buffer_init);
+
+static void dump_netdev_trace_buffer(const struct net_device *dev)
+{
+	struct netdev_trace_buffer *ptr;
+	int count, balance = 0, pos = 0;
+
+	list_for_each_entry_rcu(ptr, &dev->netdev_trace_buffer_list, list,
+				/* list elements can't go away. */ 1) {
+		pos++;
+		count = atomic_read(&ptr->count);
+		balance += count;
+		if (ptr->prev_count == count)
+			continue;
+		ptr->prev_count = count;
+		pr_info("Call trace for %s[%d] %+d at\n", dev->name, pos, count);
+		stack_trace_print(ptr->entries, ptr->nr_entries, 4);
+		cond_resched();
+	}
+	if (!netdev_trace_buffer_exhausted)
+		pr_info("balance as of %s[%d] is %d\n", dev->name, pos, balance);
+}
+
+static void erase_netdev_trace_buffer(const struct net_device *dev)
+{
+	struct netdev_trace_buffer *ptr;
+	unsigned long flags;
+
+	spin_lock_irqsave(&netdev_trace_buffer_lock, flags);
+	while (!list_empty(&dev->netdev_trace_buffer_list)) {
+		ptr = list_first_entry(&dev->netdev_trace_buffer_list, typeof(*ptr), list);
+		list_del(&ptr->list);
+		list_add_tail(&ptr->list, &netdev_trace_buffer_list);
+	}
+	spin_unlock_irqrestore(&netdev_trace_buffer_lock, flags);
+}
+
+#ifdef CONFIG_KALLSYMS
+static noinline unsigned long __find_trim(unsigned long *entries, int nr_entries, const char *name)
+{
+	int i;
+	char buffer[KSYM_SYMBOL_LEN];
+	const int len = strlen(name);
+
+	for (i = 0; i < nr_entries; i++) {
+		snprintf(buffer, sizeof(buffer), "%pS", (void *)entries[i]);
+		if (!strncmp(buffer, name, len) && buffer[len] == '+')
+			return entries[i];
+	}
+	return 0;
+}
+
+static unsigned long caller_handle_softirqs;
+static unsigned long caller_process_one_work;
+static unsigned long caller_ksys_unshare;
+static unsigned long caller___sys_bind;
+static unsigned long caller___sock_sendmsg;
+
+static int __init net_check_symbols(void)
+{
+	if (!kallsyms_lookup_name("handle_softirqs"))
+		caller_handle_softirqs = -1;
+	if (!kallsyms_lookup_name("process_one_work"))
+		caller_process_one_work = -1;
+	if (!kallsyms_lookup_name("ksys_unshare"))
+		caller_ksys_unshare = -1;
+	if (!kallsyms_lookup_name("__sys_bind"))
+		caller___sys_bind = -1;
+	if (!kallsyms_lookup_name("sock_sendmsg_nosec") &&
+	    !kallsyms_lookup_name("__sock_sendmsg"))
+		caller___sock_sendmsg = -1;
+	return 0;
+}
+late_initcall(net_check_symbols);
+#endif
+
+int trim_netdev_trace(unsigned long *entries, int nr_entries)
+{
+#ifdef CONFIG_KALLSYMS
+	int i;
+
+	if (in_softirq()) {
+		if (unlikely(!caller_handle_softirqs))
+			caller_handle_softirqs = __find_trim(entries, nr_entries,
+							     "handle_softirqs");
+		for (i = 0; i < nr_entries; i++)
+			if (entries[i] == caller_handle_softirqs)
+				return i + 1;
+	} else if (current->flags & PF_WQ_WORKER) {
+		if (unlikely(!caller_process_one_work))
+			caller_process_one_work = __find_trim(entries, nr_entries,
+							      "process_one_work");
+		for (i = 0; i < nr_entries; i++)
+			if (entries[i] == caller_process_one_work)
+				return i + 1;
+	} else {
+		if (unlikely(!caller_ksys_unshare))
+			caller_ksys_unshare = __find_trim(entries, nr_entries, "ksys_unshare");
+		if (unlikely(!caller___sys_bind))
+			caller___sys_bind = __find_trim(entries, nr_entries, "__sys_bind");
+		if (unlikely(!caller___sock_sendmsg)) {
+			caller___sock_sendmsg = __find_trim(entries, nr_entries,
+							    "sock_sendmsg_nosec");
+			if (!caller___sock_sendmsg)
+				caller___sock_sendmsg = __find_trim(entries, nr_entries,
+								    "__sock_sendmsg");
+		}
+		for (i = 0; i < nr_entries; i++)
+			if (entries[i] == caller_ksys_unshare ||
+			    entries[i] == caller___sys_bind ||
+			    entries[i] == caller___sock_sendmsg)
+				return i + 1;
+	}
+#endif
+	return nr_entries;
+}
+EXPORT_SYMBOL(trim_netdev_trace);
+
+void save_netdev_trace_buffer(struct net_device *dev, int delta)
+{
+	struct netdev_trace_buffer *ptr;
+	unsigned long entries[ARRAY_SIZE(ptr->entries)];
+	unsigned long nr_entries;
+	unsigned long flags;
+
+	if (in_nmi())
+		return;
+	nr_entries = stack_trace_save(entries, ARRAY_SIZE(ptr->entries), 1);
+	nr_entries = trim_netdev_trace(entries, nr_entries);
+	list_for_each_entry_rcu(ptr, &dev->netdev_trace_buffer_list, list,
+				/* list elements can't go away. */ 1) {
+		if (ptr->nr_entries == nr_entries &&
+		    !memcmp(ptr->entries, entries, nr_entries * sizeof(unsigned long))) {
+			atomic_add(delta, &ptr->count);
+			return;
+		}
+	}
+	spin_lock_irqsave(&netdev_trace_buffer_lock, flags);
+	if (!list_empty(&netdev_trace_buffer_list)) {
+		ptr = list_first_entry(&netdev_trace_buffer_list, typeof(*ptr), list);
+		list_del(&ptr->list);
+		ptr->prev_count = 0;
+		atomic_set(&ptr->count, delta);
+		ptr->nr_entries = nr_entries;
+		memmove(ptr->entries, entries, nr_entries * sizeof(unsigned long));
+		list_add_tail_rcu(&ptr->list, &dev->netdev_trace_buffer_list);
+	} else {
+		netdev_trace_buffer_exhausted = true;
+	}
+	spin_unlock_irqrestore(&netdev_trace_buffer_lock, flags);
+}
+EXPORT_SYMBOL(save_netdev_trace_buffer);
+
+#endif
diff --git a/net/core/lock_debug.c b/net/core/lock_debug.c
index 9e9fb25314b9..78d611bb6d1c 100644
--- a/net/core/lock_debug.c
+++ b/net/core/lock_debug.c
@@ -29,6 +29,7 @@ int netdev_debug_event(struct notifier_block *nb, unsigned long event,
 	case NETDEV_DOWN:
 	case NETDEV_REBOOT:
 	case NETDEV_UNREGISTER:
+	case NETDEV_DEBUG_UNREGISTER:
 	case NETDEV_CHANGEMTU:
 	case NETDEV_CHANGEADDR:
 	case NETDEV_PRE_CHANGEADDR:
diff --git a/net/socket.c b/net/socket.c
index e8892b218708..fce536d2d8b9 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -734,6 +734,10 @@ static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
 	return ret;
 }
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+static noinline int __sock_sendmsg(struct socket *sock, struct msghdr *msg);
+#endif
+
 static int __sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err = security_socket_sendmsg(sock, msg,
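The bookkeeping in the patch above (one counter per distinct call trace, summed into a "balance" when the unregister warning fires) can be illustrated with a minimal userspace sketch. This is not the kernel code: it assumes single-threaded use, so plain ints stand in for atomic_t and cmpxchg(), and the caller passes the trace in rather than capturing it with stack_trace_save(). The names trace_slot, trace_record, trace_balance, MAX_DEPTH and NR_SLOTS are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 20	/* same depth as entries[20] in the patch */
#define NR_SLOTS  64	/* hypothetical size; the patch uses 1024 and 32768 */

struct trace_slot {
	const void *owner;		/* tracked object, NULL while the slot is free */
	int count;			/* net hold/put delta seen from this call trace */
	size_t nr_entries;
	unsigned long entries[MAX_DEPTH];
};

static struct trace_slot slots[NR_SLOTS];
static bool slots_exhausted;

/* Add delta to the slot matching (owner, trace); claim a free slot if none matches. */
static void trace_record(const void *owner, const unsigned long *entries,
			 size_t nr_entries, int delta)
{
	size_t i;

	assert(owner && entries && nr_entries <= MAX_DEPTH);
	for (i = 0; i < NR_SLOTS; i++) {
		struct trace_slot *s = &slots[i];

		if (s->owner == owner && s->nr_entries == nr_entries &&
		    !memcmp(s->entries, entries, nr_entries * sizeof(*entries))) {
			s->count += delta;
			return;
		}
	}
	for (i = 0; i < NR_SLOTS; i++) {
		struct trace_slot *s = &slots[i];

		if (!s->owner) {
			s->owner = owner;
			s->count = delta;
			s->nr_entries = nr_entries;
			memcpy(s->entries, entries, nr_entries * sizeof(*entries));
			return;
		}
	}
	/* No free slot: later balances for any owner are no longer trustworthy. */
	slots_exhausted = true;
}

/* Sum the per-trace deltas recorded for owner. */
static int trace_balance(const void *owner)
{
	int balance = 0;
	size_t i;

	for (i = 0; i < NR_SLOTS; i++)
		if (slots[i].owner == owner)
			balance += slots[i].count;
	return balance;
}
```

A hold and a put that reach the table through the same trimmed call trace cancel out in one slot, so the slots left with a nonzero count point at the call paths whose references were never dropped. That only works if the traces are trimmed to a common boundary first, which is what trim_netdev_trace() is for.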
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
From: syzbot @ 2025-11-19 14:47 UTC
To: linux-kernel, penguin-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Tested-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com

Tested on:

commit:         8b690556 Merge tag 'for-linus' of git://git.kernel.org..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=106da8b4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=7fd2b6b29c6ab719
dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch:          https://syzkaller.appspot.com/x/patch.diff?x=13b77884580000

Note: testing is done by a robot and is best-effort only.
* Re: unregister_netdevice: waiting for DEV to become free (8)
From: Tetsuo Handa @ 2026-03-02 10:56 UTC
To: syzbot; +Cc: linux-kernel

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d4e6e00bb90a..97862830f7d0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2102,6 +2102,8 @@ enum netdev_reg_state {
  *
  *	FIXME: cleanup struct net_device such that network protocol info
  *	moves out.
+ *
+ *	@netdev_trace_buffer_list: Linked list for debugging refcount leak.
  */
 
 struct net_device {
@@ -2257,6 +2259,9 @@ struct net_device {
 #if IS_ENABLED(CONFIG_TLS_DEVICE)
 	const struct tlsdev_ops *tlsdev_ops;
 #endif
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+	struct list_head netdev_trace_buffer_list;
+#endif
 
 	unsigned int		operstate;
 	unsigned char		link_mode;
@@ -3185,6 +3190,7 @@ enum netdev_cmd {
 	NETDEV_OFFLOAD_XSTATS_REPORT_USED,
 	NETDEV_OFFLOAD_XSTATS_REPORT_DELTA,
 	NETDEV_XDP_FEAT_CHANGE,
+	NETDEV_DEBUG_UNREGISTER,
 };
 
 const char *netdev_cmd_to_name(enum netdev_cmd cmd);
@@ -4373,9 +4379,15 @@ static inline bool dev_nit_active(const struct net_device *dev)
 
 void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev);
 
+void save_netdev_trace_buffer(struct net_device *dev, int delta);
+int trim_netdev_trace(unsigned long *entries, int nr_entries);
+
 static inline void __dev_put(struct net_device *dev)
 {
 	if (dev) {
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+		save_netdev_trace_buffer(dev, -1);
+#endif
 #ifdef CONFIG_PCPU_DEV_REFCNT
 		this_cpu_dec(*dev->pcpu_refcnt);
 #else
@@ -4387,6 +4399,9 @@ static inline void __dev_put(struct net_device *dev)
 static inline void __dev_hold(struct net_device *dev)
 {
 	if (dev) {
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+		save_netdev_trace_buffer(dev, 1);
+#endif
 #ifdef CONFIG_PCPU_DEV_REFCNT
 		this_cpu_inc(*dev->pcpu_refcnt);
 #else
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 77198911b8dd..5f435c1e48d8 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -576,6 +576,10 @@ static inline bool lockdep_softirq_start(void) { return false; }
 static inline void lockdep_softirq_end(bool in_hardirq) { }
 #endif
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+static noinline void handle_softirqs(bool ksirqd);
+#endif
+
 static void handle_softirqs(bool ksirqd)
 {
 	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index aeaec79bc09c..66cb4d8c00bb 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3157,6 +3157,10 @@ static bool manage_workers(struct worker *worker)
 	return true;
 }
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+static noinline void process_one_work(struct worker *worker, struct work_struct *work);
+#endif
+
 /**
  * process_one_work - process single work
  * @worker: self
diff --git a/net/core/dev.c b/net/core/dev.c
index c1a9f7fdcffa..fbc205c82256 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1874,6 +1874,7 @@ const char *netdev_cmd_to_name(enum netdev_cmd cmd)
 	N(PRE_CHANGEADDR) N(OFFLOAD_XSTATS_ENABLE) N(OFFLOAD_XSTATS_DISABLE)
 	N(OFFLOAD_XSTATS_REPORT_USED) N(OFFLOAD_XSTATS_REPORT_DELTA)
 	N(XDP_FEAT_CHANGE)
+	N(DEBUG_UNREGISTER)
 	}
 #undef N
 	return "UNKNOWN_NETDEV_EVENT";
@@ -11557,6 +11558,14 @@ int netdev_refcnt_read(const struct net_device *dev)
 }
 EXPORT_SYMBOL(netdev_refcnt_read);
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+static void dump_netdev_trace_buffer(const struct net_device *dev);
+static void erase_netdev_trace_buffer(const struct net_device *dev);
+#else
+static inline void dump_netdev_trace_buffer(const struct net_device *dev) { }
+static inline void erase_netdev_trace_buffer(const struct net_device *dev) { }
+#endif
+
 int netdev_unregister_timeout_secs __read_mostly = 10;
 
 #define WAIT_REFS_MIN_MSECS 1
@@ -11630,11 +11639,16 @@ static struct net_device *netdev_wait_allrefs_any(struct list_head *list)
 
 		if (time_after(jiffies, warning_time +
 			       READ_ONCE(netdev_unregister_timeout_secs) * HZ)) {
+			rtnl_lock();
 			list_for_each_entry(dev, list, todo_list) {
 				pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
 					 dev->name, netdev_refcnt_read(dev));
 				ref_tracker_dir_print(&dev->refcnt_tracker, 10);
+				call_netdevice_notifiers(NETDEV_DEBUG_UNREGISTER, dev);
+				dump_netdev_trace_buffer(dev);
 			}
+			__rtnl_unlock();
+			rcu_barrier();
 
 			warning_time = jiffies;
 		}
@@ -12032,6 +12046,9 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 
 	dev->priv_len = sizeof_priv;
 
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+	INIT_LIST_HEAD(&dev->netdev_trace_buffer_list);
+#endif
 	ref_tracker_dir_init(&dev->refcnt_tracker, 128, "netdev");
 #ifdef CONFIG_PCPU_DEV_REFCNT
 	dev->pcpu_refcnt = alloc_percpu(int);
@@ -12204,6 +12221,8 @@ void free_netdev(struct net_device *dev)
 
 	mutex_destroy(&dev->lock);
 
+	erase_netdev_trace_buffer(dev);
+
 	/* Compatibility with error handling in drivers */
 	if (dev->reg_state == NETREG_UNINITIALIZED ||
 	    dev->reg_state == NETREG_DUMMY) {
@@ -13306,3 +13325,171 @@ static int __init net_dev_init(void)
 }
 
 subsys_initcall(net_dev_init);
+
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+
+#define NETDEV_TRACE_BUFFER_SIZE 32768
+static struct netdev_trace_buffer {
+	struct list_head list;
+	int prev_count;
+	atomic_t count;
+	int nr_entries;
+	unsigned long entries[20];
+} netdev_trace_buffer[NETDEV_TRACE_BUFFER_SIZE];
+static LIST_HEAD(netdev_trace_buffer_list);
+static DEFINE_SPINLOCK(netdev_trace_buffer_lock);
+static bool netdev_trace_buffer_exhausted;
+
+static int netdev_trace_buffer_init(void)
+{
+	int i;
+
+	for (i = 0; i < NETDEV_TRACE_BUFFER_SIZE; i++)
+		list_add_tail(&netdev_trace_buffer[i].list, &netdev_trace_buffer_list);
+	return 0;
+}
+pure_initcall(netdev_trace_buffer_init);
+
+static void dump_netdev_trace_buffer(const struct net_device *dev)
+{
+	struct netdev_trace_buffer *ptr;
+	int count, balance = 0, pos = 0;
+
+	list_for_each_entry_rcu(ptr, &dev->netdev_trace_buffer_list, list,
+				/* list elements can't go away. */ 1) {
+		pos++;
+		count = atomic_read(&ptr->count);
+		balance += count;
+		if (ptr->prev_count == count)
+			continue;
+		ptr->prev_count = count;
+		pr_info("Call trace for %s[%d] %+d at\n", dev->name, pos, count);
+		stack_trace_print(ptr->entries, ptr->nr_entries, 4);
+		cond_resched();
+	}
+	if (!netdev_trace_buffer_exhausted)
+		pr_info("balance as of %s[%d] is %d\n", dev->name, pos, balance);
+}
+
+static void erase_netdev_trace_buffer(const struct net_device *dev)
+{
+	struct netdev_trace_buffer *ptr;
+	unsigned long flags;
+
+	spin_lock_irqsave(&netdev_trace_buffer_lock, flags);
+	while (!list_empty(&dev->netdev_trace_buffer_list)) {
+		ptr = list_first_entry(&dev->netdev_trace_buffer_list, typeof(*ptr), list);
+		list_del(&ptr->list);
+		list_add_tail(&ptr->list, &netdev_trace_buffer_list);
+	}
+	spin_unlock_irqrestore(&netdev_trace_buffer_lock, flags);
+}
+
+int trim_netdev_trace(unsigned long *entries, int nr_entries)
+{
+#ifdef CONFIG_KALLSYMS
+	char buffer[32] = { };
+	char *cp;
+	int i;
+
+	if (in_softirq()) {
+		static unsigned long __data_racy caller;
+
+		if (!caller) {
+			for (i = 0; i < nr_entries; i++) {
+				snprintf(buffer, sizeof(buffer) - 1, "%ps", (void *)entries[i]);
+				cp = strchr(buffer, ' ');
+				if (cp)
+					*cp = '\0';
+				if (!strcmp(buffer, "handle_softirqs")) {
+					caller = entries[i];
+					break;
+				}
+			}
+		}
+		for (i = 0; i < nr_entries; i++)
+			if (entries[i] == caller)
+				return i + 1;
+	} else if (current->flags & PF_WQ_WORKER) {
+		static unsigned long __data_racy caller;
+
+		if (!caller) {
+			for (i = 0; i < nr_entries; i++) {
+				snprintf(buffer, sizeof(buffer) - 1, "%ps", (void *)entries[i]);
+				cp = strchr(buffer, ' ');
+				if (cp)
+					*cp = '\0';
+				if (!strcmp(buffer, "process_one_work")) {
+					caller = entries[i];
+					break;
+				}
+			}
+		}
+		for (i = 0; i < nr_entries; i++)
+			if (entries[i] == caller)
+				return i + 1;
+	} else {
+		for (i = 0; i < nr_entries; i++) {
+			snprintf(buffer, sizeof(buffer) - 1, "%ps", (void *)entries[i]);
+			cp = strchr(buffer, ' ');
+			if (cp)
+				*cp = '\0';
+			if (buffer[0] == 'k') {
+				if (!strcmp(buffer, "ksys_unshare"))
+					return i + 1;
+			} else if (buffer[0] == 's') {
+				if (!strcmp(buffer, "sock_sendmsg_nosec") ||
+				    !strcmp(buffer, "sock_recvmsg_nosec"))
+					return i + 1;
+			} else if (buffer[0] == '_') {
+				if (!strcmp(buffer, "__sys_bind") ||
+				    !strcmp(buffer, "__sock_release") ||
+				    !strcmp(buffer, "__sys_bpf"))
+					return i + 1;
+			} else {
+				if (!strcmp(buffer, "do_sock_setsockopt"))
+					return i + 1;
+			}
+		}
+	}
+#endif
+	return nr_entries;
+}
+EXPORT_SYMBOL(trim_netdev_trace);
+
+void save_netdev_trace_buffer(struct net_device *dev, int delta)
+{
+	struct netdev_trace_buffer *ptr;
+	unsigned long entries[ARRAY_SIZE(ptr->entries)];
+	unsigned long nr_entries;
+	unsigned long flags;
+
+	if (in_nmi())
+		return;
+	nr_entries = stack_trace_save(entries, ARRAY_SIZE(ptr->entries), 1);
+	nr_entries = trim_netdev_trace(entries, nr_entries);
+	list_for_each_entry_rcu(ptr, &dev->netdev_trace_buffer_list, list,
+				/* list elements can't go away. */ 1) {
+		if (ptr->nr_entries == nr_entries &&
+		    !memcmp(ptr->entries, entries, nr_entries * sizeof(unsigned long))) {
+			atomic_add(delta, &ptr->count);
+			return;
+		}
+	}
+	spin_lock_irqsave(&netdev_trace_buffer_lock, flags);
+	if (!list_empty(&netdev_trace_buffer_list)) {
+		ptr = list_first_entry(&netdev_trace_buffer_list, typeof(*ptr), list);
+		list_del(&ptr->list);
+		ptr->prev_count = 0;
+		atomic_set(&ptr->count, delta);
+		ptr->nr_entries = nr_entries;
+		memmove(ptr->entries, entries, nr_entries * sizeof(unsigned long));
+		list_add_tail_rcu(&ptr->list, &dev->netdev_trace_buffer_list);
+	} else {
+		netdev_trace_buffer_exhausted = true;
+	}
+	spin_unlock_irqrestore(&netdev_trace_buffer_lock, flags);
+}
+EXPORT_SYMBOL(save_netdev_trace_buffer);
+
+#endif
diff --git a/net/core/lock_debug.c b/net/core/lock_debug.c
index 9e9fb25314b9..78d611bb6d1c 100644
--- a/net/core/lock_debug.c
+++ b/net/core/lock_debug.c
@@ -29,6 +29,7 @@ int netdev_debug_event(struct notifier_block *nb, unsigned long event,
 	case NETDEV_DOWN:
 	case NETDEV_REBOOT:
 	case NETDEV_UNREGISTER:
+	case NETDEV_DEBUG_UNREGISTER:
 	case NETDEV_CHANGEMTU:
 	case NETDEV_CHANGEADDR:
 	case NETDEV_PRE_CHANGEADDR:
diff --git a/net/socket.c b/net/socket.c
index 05952188127f..53c4b1fd3ef7 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -650,7 +650,11 @@ struct socket *sock_alloc(void)
 }
 EXPORT_SYMBOL(sock_alloc);
 
-static void __sock_release(struct socket *sock, struct inode *inode)
+static
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+noinline
+#endif
+void __sock_release(struct socket *sock, struct inode *inode)
 {
 	const struct proto_ops *ops = READ_ONCE(sock->ops);
 
@@ -722,7 +726,13 @@ static noinline void call_trace_sock_send_length(struct sock *sk, int ret,
 	trace_sock_send_length(sk, ret, 0);
 }
 
-static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
+static
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+noinline
+#else
+inline
+#endif
+int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
 {
 	int ret = INDIRECT_CALL_INET(READ_ONCE(sock->ops)->sendmsg, inet6_sendmsg,
 				     inet_sendmsg, sock, msg,
@@ -1072,8 +1082,13 @@ static noinline void call_trace_sock_recv_length(struct sock *sk, int ret, int f
 	trace_sock_recv_length(sk, ret, flags);
 }
 
-static inline int sock_recvmsg_nosec(struct socket *sock, struct msghdr *msg,
-				     int flags)
+static
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+noinline
+#else
+inline
+#endif
+int sock_recvmsg_nosec(struct socket *sock, struct msghdr *msg, int flags)
 {
 	int ret = INDIRECT_CALL_INET(READ_ONCE(sock->ops)->recvmsg,
 				     inet6_recvmsg,
@@ -2532,9 +2547,12 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
 	return err < 0 ? err : 0;
 }
 
-static int ____sys_sendmsg(struct socket *sock, struct msghdr *msg_sys,
-			   unsigned int flags, struct used_address *used_address,
-			   unsigned int allowed_msghdr_flags)
+static
+#ifdef CONFIG_NET_DEV_REFCNT_TRACKER
+noinline
+#endif
+int ____sys_sendmsg(struct socket *sock, struct msghdr *msg_sys, unsigned int flags,
+		    struct used_address *used_address, unsigned int allowed_msghdr_flags)
 {
 	unsigned char ctl[sizeof(struct cmsghdr) + 20]
 				__aligned(sizeof(__kernel_size_t));
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
2026-03-02 10:56 ` Tetsuo Handa
@ 2026-03-02 11:21 ` syzbot
0 siblings, 0 replies; 18+ messages in thread
From: syzbot @ 2026-03-02 11:21 UTC (permalink / raw)
To: linux-kernel, penguin-kernel, syzkaller-bugs
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Tested-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Tested on:
commit: 11439c46 Linux 7.0-rc2
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=154a7472580000
kernel config: https://syzkaller.appspot.com/x/.config?x=894aa0415989af66
dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=12afd3e6580000
Note: testing is done by a robot and is best-effort only.
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH] can: j1939: implement NETDEV_UNREGISTER notification handler
@ 2025-08-25 11:01 Tetsuo Handa
2025-08-25 13:35 ` [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8) syzbot
0 siblings, 1 reply; 18+ messages in thread
From: Tetsuo Handa @ 2025-08-25 11:01 UTC (permalink / raw)
To: syzbot+881d65229ca4f9ae8c84, LKML
syzbot is reporting the

  unregister_netdevice: waiting for vcan0 to become free. Usage count = 2

problem, because the j1939 protocol has no NETDEV_UNREGISTER notification
handler to undo the changes made by j1939_sk_bind().

Commit 25fe97cb7620 ("can: j1939: move j1939_priv_put() into sk_destruct
callback") assumes that the call to j1939_priv_put() can be unconditionally
delayed until j1939_sk_sock_destruct() is called. But we need to call
j1939_priv_put() on the extra ref taken by j1939_sk_bind() (as a part of
undoing the changes made by j1939_sk_bind()) as soon as the NETDEV_UNREGISTER
notification fires (i.e. before j1939_sk_sock_destruct() is called via
j1939_sk_release()). Otherwise, the extra ref on "struct j1939_priv" held by
the j1939_sk_bind() call prevents "struct net_device" from dropping its usage
count to 1, making it impossible for unregister_netdevice() to continue.

Reported-by: syzbot <syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Fixes: 25fe97cb7620 ("can: j1939: move j1939_priv_put() into sk_destruct callback")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
#syz test
net/can/j1939/j1939-priv.h | 1 +
net/can/j1939/main.c | 3 +++
net/can/j1939/socket.c | 49 ++++++++++++++++++++++++++++++++++++++
3 files changed, 53 insertions(+)
diff --git a/net/can/j1939/j1939-priv.h b/net/can/j1939/j1939-priv.h
index 31a93cae5111..81f58924b4ac 100644
--- a/net/can/j1939/j1939-priv.h
+++ b/net/can/j1939/j1939-priv.h
@@ -212,6 +212,7 @@ void j1939_priv_get(struct j1939_priv *priv);
/* notify/alert all j1939 sockets bound to ifindex */
void j1939_sk_netdev_event_netdown(struct j1939_priv *priv);
+void j1939_sk_netdev_event_unregister(struct j1939_priv *priv);
int j1939_cancel_active_session(struct j1939_priv *priv, struct sock *sk);
void j1939_tp_init(struct j1939_priv *priv);
diff --git a/net/can/j1939/main.c b/net/can/j1939/main.c
index 7e8a20f2fc42..3706a872ecaf 100644
--- a/net/can/j1939/main.c
+++ b/net/can/j1939/main.c
@@ -377,6 +377,9 @@ static int j1939_netdev_notify(struct notifier_block *nb,
 		j1939_sk_netdev_event_netdown(priv);
 		j1939_ecu_unmap_all(priv);
 		break;
+	case NETDEV_UNREGISTER:
+		j1939_sk_netdev_event_unregister(priv);
+		break;
 	}

 	j1939_priv_put(priv);
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 493f49bfaf5d..72c649cec9e1 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1303,6 +1303,55 @@ void j1939_sk_netdev_event_netdown(struct j1939_priv *priv)
 	read_unlock_bh(&priv->j1939_socks_lock);
 }

+void j1939_sk_netdev_event_unregister(struct j1939_priv *priv)
+{
+	struct sock *sk;
+	struct j1939_sock *jsk;
+	bool wait_rcu = false;
+
+ rescan: /* The caller is holding a ref on this "priv" via j1939_priv_get_by_ndev(). */
+	read_lock_bh(&priv->j1939_socks_lock);
+	list_for_each_entry(jsk, &priv->j1939_socks, list) {
+		/* Skip if j1939_jsk_add() is not called on this socket. */
+		if (!(jsk->state & J1939_SOCK_BOUND))
+			continue;
+		sk = &jsk->sk;
+		sock_hold(sk);
+		read_unlock_bh(&priv->j1939_socks_lock);
+		/* Check if j1939_jsk_del() is not yet called on this socket after holding
+		 * socket's lock, for both j1939_sk_bind() and j1939_sk_release() call
+		 * j1939_jsk_del() with socket's lock held.
+		 */
+		lock_sock(sk);
+		if (jsk->state & J1939_SOCK_BOUND) {
+			/* Neither j1939_sk_bind() nor j1939_sk_release() called j1939_jsk_del().
+			 * Make this socket no longer bound, by pretending as if j1939_sk_bind()
+			 * dropped old references but did not get new references.
+			 */
+			j1939_jsk_del(priv, jsk);
+			j1939_local_ecu_put(priv, jsk->addr.src_name, jsk->addr.sa);
+			j1939_netdev_stop(priv);
+			/* Call j1939_priv_put() now and prevent j1939_sk_sock_destruct() from
+			 * calling the corresponding j1939_priv_put().
+			 *
+			 * j1939_sk_sock_destruct() is supposed to call j1939_priv_put() after
+			 * an RCU grace period. But since the caller is holding a ref on this
+			 * "priv", we can defer synchronize_rcu() until immediately before
+			 * the caller calls j1939_priv_put().
+			 */
+			j1939_priv_put(priv);
+			jsk->priv = NULL;
+			wait_rcu = true;
+		}
+		release_sock(sk);
+		sock_put(sk);
+		goto rescan;
+	}
+	read_unlock_bh(&priv->j1939_socks_lock);
+	if (wait_rcu)
+		synchronize_rcu();
+}
+
 static int j1939_sk_no_ioctlcmd(struct socket *sock, unsigned int cmd,
 				unsigned long arg)
 {
--
2.51.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
2025-08-25 11:01 [PATCH] can: j1939: implement NETDEV_UNREGISTER notification handler Tetsuo Handa
@ 2025-08-25 13:35 ` syzbot
0 siblings, 0 replies; 18+ messages in thread
From: syzbot @ 2025-08-25 13:35 UTC (permalink / raw)
To: linux-kernel, penguin-kernel, syzkaller-bugs
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Tested-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Tested on:
commit: 1b237f19 Linux 6.17-rc3
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=144dec42580000
kernel config: https://syzkaller.appspot.com/x/.config?x=d4703ac89d9e185a
dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=10cefa34580000
Note: testing is done by a robot and is best-effort only.
^ permalink raw reply [flat|nested] 18+ messages in thread
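Editor's sketch (not part of the thread): the reference counting described in the patch's commit message can be modelled in userspace roughly as follows. This is a deliberately simplified model under stated assumptions, not the kernel implementation: every `model_*` name and `usage_after_unregister()` is hypothetical, locking and RCU are omitted, and the real j1939 code keeps more reference counts than the single `refs` field used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct model_dev  { int usage; };
struct model_priv { int refs; struct model_dev *dev; };
struct model_sock { struct model_priv *priv; bool bound; };

/* Drop one priv ref; the last put releases priv's ref on the device. */
static void model_priv_put(struct model_priv *priv)
{
	assert(priv->refs > 0);
	if (--priv->refs == 0)
		priv->dev->usage--;
}

/* bind(): the socket takes its own ref on priv. */
static void model_bind(struct model_sock *sk, struct model_priv *priv)
{
	priv->refs++;
	sk->priv = priv;
	sk->bound = true;
}

/* Unregister handler: unbind every socket and drop its priv ref now. */
static void model_event_unregister(struct model_priv *priv,
				   struct model_sock *socks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (socks[i].priv != priv || !socks[i].bound)
			continue;
		socks[i].bound = false;
		socks[i].priv = NULL;	/* destructor must not put again */
		model_priv_put(priv);
	}
}

/* Socket destructor: puts priv only if the handler has not already. */
static void model_sock_destruct(struct model_sock *sk)
{
	if (sk->priv)
		model_priv_put(sk->priv);
	sk->priv = NULL;
}

/* Returns the device usage count seen right after the notification. */
static int usage_after_unregister(bool with_handler)
{
	struct model_dev dev = { .usage = 1 };	/* the registration ref */
	struct model_priv priv = { .refs = 0, .dev = &dev };
	struct model_sock socks[2] = { { NULL, false }, { NULL, false } };
	int usage;

	dev.usage++;			/* priv pins the device */
	model_bind(&socks[0], &priv);
	model_bind(&socks[1], &priv);

	priv.refs++;			/* notifier's temporary ref */
	if (with_handler)
		model_event_unregister(&priv, socks, 2);
	model_priv_put(&priv);		/* notifier drops its ref */
	usage = dev.usage;

	/* The sockets are closed only later, after the wait has begun. */
	model_sock_destruct(&socks[0]);
	model_sock_destruct(&socks[1]);
	return usage;
}
```

In this model, `usage_after_unregister(false)` leaves the device at a usage count of 2 for as long as the bound sockets stay open, which corresponds to the "Usage count = 2" message, while `usage_after_unregister(true)` reaches 1 as soon as the notification has been handled. Clearing the socket's `priv` pointer in the handler is what keeps the later destructor from dropping the same reference twice.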
* [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
@ 2023-06-10 1:34 syzbot
2023-06-21 7:07 ` Ziqi Zhao
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: syzbot @ 2023-06-10 1:34 UTC (permalink / raw)
To: arnd, bridge, davem, edumazet, kuba, linux-kernel, netdev,
nikolay, pabeni, roopa, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 67faabbde36b selftests/bpf: Add missing prototypes for sev..
git tree: bpf-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1381363b280000
kernel config: https://syzkaller.appspot.com/x/.config?x=5335204dcdecfda
dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=132faf93280000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10532add280000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/751a0490d875/disk-67faabbd.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/2c5106cd9f1f/vmlinux-67faabbd.xz
kernel image: https://storage.googleapis.com/syzbot-assets/62c1154294e4/bzImage-67faabbd.xz
The issue was bisected to:
commit ad2f99aedf8fa77f3ae647153284fa63c43d3055
Author: Arnd Bergmann <arnd@arndb.de>
Date: Tue Jul 27 13:45:16 2021 +0000
net: bridge: move bridge ioctls out of .ndo_do_ioctl
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=146de6f1280000
final oops: https://syzkaller.appspot.com/x/report.txt?x=166de6f1280000
console output: https://syzkaller.appspot.com/x/log.txt?x=126de6f1280000
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Fixes: ad2f99aedf8f ("net: bridge: move bridge ioctls out of .ndo_do_ioctl")
unregister_netdevice: waiting for bridge0 to become free. Usage count = 2
leaked reference.
__netdev_tracker_alloc include/linux/netdevice.h:4070 [inline]
netdev_hold include/linux/netdevice.h:4099 [inline]
dev_ifsioc+0xbc0/0xeb0 net/core/dev_ioctl.c:408
dev_ioctl+0x250/0x1090 net/core/dev_ioctl.c:605
sock_do_ioctl+0x15a/0x230 net/socket.c:1215
sock_ioctl+0x1f8/0x680 net/socket.c:1318
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
If you want to change bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
2023-06-10 1:34 syzbot
@ 2023-06-21 7:07 ` Ziqi Zhao
2023-06-21 8:46 ` Dongliang Mu
2025-08-25 12:35 ` Hillf Danton
2025-08-26 1:50 ` Hillf Danton
2 siblings, 1 reply; 18+ messages in thread
From: Ziqi Zhao @ 2023-06-21 7:07 UTC (permalink / raw)
To: syzbot+881d65229ca4f9ae8c84
Cc: arnd, bridge, davem, edumazet, kuba, linux-kernel, netdev, nikolay,
pabeni, roopa, syzkaller-bugs, skhan, ivan.orlov0322
Hi all,

I'm taking a look at this bug as part of the exercise for the Linux
Kernel Bug Fixing Summer 2023 program. Thanks to the help from my
mentor, Ivan Orlov and Shuah Khan, I've already obtained a reproduction
of the issue using the provided C reproducer, and I should be able to
submit a patch by the end of this week to fix the highlighted error. If
you have any information or suggestions, please feel free to reply to
this thread. Any help would be greatly appreciated!

Best regards,
Ziqi
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
2023-06-21 7:07 ` Ziqi Zhao
@ 2023-06-21 8:46 ` Dongliang Mu
0 siblings, 0 replies; 18+ messages in thread
From: Dongliang Mu @ 2023-06-21 8:46 UTC (permalink / raw)
To: Ziqi Zhao
Cc: syzbot+881d65229ca4f9ae8c84, arnd, bridge, davem, edumazet, kuba,
linux-kernel, netdev, nikolay, pabeni, roopa, syzkaller-bugs, skhan,
ivan.orlov0322
On Wed, Jun 21, 2023 at 3:38 PM 'Ziqi Zhao' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
>
> Hi all,
>
> I'm taking a look at this bug as part of the exercise for the Linux
> Kernel Bug Fixing Summer 2023 program. Thanks to the help from my

This is an interesting program. There are many kernel crashes on the
syzbot dashboard, which need help.

> mentor, Ivan Orlov and Shuah Khan, I've already obtained a reproduction
> of the issue using the provided C reproducer, and I should be able to
> submit a patch by the end of this week to fix the highlighted error. If
> you have any information or suggestions, please feel free to reply to
> this thread. Any help would be greatly appreciated!

Please carefully read the guidance on submitting patches to the Linux
kernel [1]. Be careful about your coding style before sending.

Note that syzbot has a patch-testing feature. You can upload and test
your own patch to confirm that it is working properly.

[1] https://docs.kernel.org/process/submitting-patches.html

>
> Best regards,
> Ziqi
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20230621070710.380373-1-astrajoan%40yahoo.com.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
2023-06-10 1:34 syzbot
2023-06-21 7:07 ` Ziqi Zhao
@ 2025-08-25 12:35 ` Hillf Danton
2025-08-25 13:51 ` syzbot
2025-08-26 1:50 ` Hillf Danton
2 siblings, 1 reply; 18+ messages in thread
From: Hillf Danton @ 2025-08-25 12:35 UTC (permalink / raw)
To: syzbot; +Cc: linux-kernel, syzkaller-bugs
> Date: Fri, 09 Jun 2023 18:34:58 -0700 [thread overview]
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 67faabbde36b selftests/bpf: Add missing prototypes for sev..
> git tree: bpf-next
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=1381363b280000
> kernel config: https://syzkaller.appspot.com/x/.config?x=5335204dcdecfda
> dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=132faf93280000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10532add280000

#syz test upstream master

--- a/net/can/j1939/j1939-priv.h
+++ b/net/can/j1939/j1939-priv.h
@@ -59,6 +59,7 @@ struct j1939_priv {
 	/* segments need a lock to protect the above list */
 	rwlock_t lock;

+	int unregistered;
 	struct net_device *ndev;

 	/* list of 256 ecu ptrs, that cache the claimed addresses.
--- a/net/can/j1939/main.c
+++ b/net/can/j1939/main.c
@@ -157,13 +157,15 @@ static void __j1939_priv_release(struct
 	struct j1939_priv *priv = container_of(kref, struct j1939_priv, kref);
 	struct net_device *ndev = priv->ndev;

-	netdev_dbg(priv->ndev, "%s: 0x%p\n", __func__, priv);
+	if (!priv->unregistered)
+		netdev_dbg(priv->ndev, "%s: 0x%p\n", __func__, priv);

 	WARN_ON_ONCE(!list_empty(&priv->active_session_list));
 	WARN_ON_ONCE(!list_empty(&priv->ecus));
 	WARN_ON_ONCE(!list_empty(&priv->j1939_socks));

-	dev_put(ndev);
+	if (!priv->unregistered)
+		dev_put(ndev);
 	kfree(priv);
 }

@@ -377,6 +379,10 @@ static int j1939_netdev_notify(struct no
 		j1939_sk_netdev_event_netdown(priv);
 		j1939_ecu_unmap_all(priv);
 		break;
+	case NETDEV_UNREGISTER:
+		priv->unregistered++;
+		dev_put(priv->ndev);
+		break;
 	}

 	j1939_priv_put(priv);
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
2025-08-25 12:35 ` Hillf Danton
@ 2025-08-25 13:51 ` syzbot
0 siblings, 0 replies; 18+ messages in thread
From: syzbot @ 2025-08-25 13:51 UTC (permalink / raw)
To: hdanton, linux-kernel, syzkaller-bugs
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
KASAN: use-after-free Read in j1939_netdev_stop
==================================================================
BUG: KASAN: use-after-free in read_pnet include/net/net_namespace.h:409 [inline]
BUG: KASAN: use-after-free in dev_net include/linux/netdevice.h:2718 [inline]
BUG: KASAN: use-after-free in j1939_can_rx_unregister net/can/j1939/main.c:202 [inline]
BUG: KASAN: use-after-free in __j1939_rx_release net/can/j1939/main.c:218 [inline]
BUG: KASAN: use-after-free in kref_put_mutex include/linux/kref.h:86 [inline]
BUG: KASAN: use-after-free in j1939_netdev_stop+0x2ab/0x2d0 net/can/j1939/main.c:311
Read of size 8 at addr ffff888023f80108 by task syz.0.17/6524

CPU: 0 UID: 0 PID: 6524 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xcd/0x630 mm/kasan/report.c:482
kasan_report+0xe0/0x110 mm/kasan/report.c:595
read_pnet include/net/net_namespace.h:409 [inline]
dev_net include/linux/netdevice.h:2718 [inline]
j1939_can_rx_unregister net/can/j1939/main.c:202 [inline]
__j1939_rx_release net/can/j1939/main.c:218 [inline]
kref_put_mutex include/linux/kref.h:86 [inline]
j1939_netdev_stop+0x2ab/0x2d0 net/can/j1939/main.c:311
j1939_sk_release+0x5c3/0x8e0 net/can/j1939/socket.c:651
__sock_release+0xb3/0x270 net/socket.c:649
sock_close+0x1c/0x30 net/socket.c:1439
__fput+0x402/0xb70 fs/file_table.c:468
task_work_run+0x14d/0x240 kernel/task_work.c:227
exit_task_work include/linux/task_work.h:40 [inline]
do_exit+0x86f/0x2bf0 kernel/exit.c:961
do_group_exit+0xd3/0x2a0 kernel/exit.c:1102
get_signal+0x2673/0x26d0 kernel/signal.c:3034
arch_do_signal_or_restart+0x8f/0x7d0 arch/x86/kernel/signal.c:337
exit_to_user_mode_loop+0x84/0x110 kernel/entry/common.c:40
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x3f6/0x4c0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f408338ebe9
Code: Unable to access opcode bytes at 0x7f408338ebbf.
RSP: 002b:00007f4084235038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: 0000000000000024 RBX: 00007f40835b5fa0 RCX: 00007f408338ebe9
RDX: 0000000000000000 RSI: 0000200000000200 RDI: 0000000000000003
RBP: 00007f4083411e19 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f40835b6038 R14: 00007f40835b5fa0 R15: 00007ffd085a2ed8
</TASK>

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff888023f80000 pfn:0x23f80
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000000 ffffea000174f608 ffffea0000c2c208 0000000000000000
raw: ffff888023f80000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as freed
page last allocated via order 3, migratetype Unmovable, gfp_mask 0x446dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOWARN|__GFP_RETRY_MAYFAIL|__GFP_COMP), pid 6440, tgid 6440 (syz-executor), ts 122667010586, free_ts 125938931446
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1c0/0x230 mm/page_alloc.c:1851
prep_new_page mm/page_alloc.c:1859 [inline]
get_page_from_freelist+0x132b/0x38e0 mm/page_alloc.c:3858
__alloc_frozen_pages_noprof+0x261/0x23f0 mm/page_alloc.c:5148
alloc_pages_mpol+0x1fb/0x550 mm/mempolicy.c:2416
___kmalloc_large_node+0xed/0x160 mm/slub.c:4306
__kmalloc_large_node_noprof+0x1c/0x70 mm/slub.c:4337
__do_kmalloc_node mm/slub.c:4353 [inline]
__kvmalloc_node_noprof.cold+0xb/0x65 mm/slub.c:5052
alloc_netdev_mqs+0xd2/0x1530 net/core/dev.c:11812
rtnl_create_link+0xc08/0xf90 net/core/rtnetlink.c:3633
vxcan_newlink+0x2f8/0x640 drivers/net/can/vxcan.c:208
rtnl_newlink_create net/core/rtnetlink.c:3825 [inline]
__rtnl_newlink net/core/rtnetlink.c:3942 [inline]
rtnl_newlink+0xc45/0x2000 net/core/rtnetlink.c:4057
rtnetlink_rcv_msg+0x95e/0xe90 net/core/rtnetlink.c:6946
netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2552
netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
netlink_unicast+0x5a7/0x870 net/netlink/af_netlink.c:1346
netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:714 [inline]
__sock_sendmsg net/socket.c:729 [inline]
__sys_sendto+0x4a3/0x520 net/socket.c:2228
page last free pid 6524 tgid 6523 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1395 [inline]
__free_frozen_pages+0x7d5/0x10f0 mm/page_alloc.c:2895
device_release+0xa1/0x240 drivers/base/core.c:2565
kobject_cleanup lib/kobject.c:689 [inline]
kobject_release lib/kobject.c:720 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x1e7/0x5a0 lib/kobject.c:737
netdev_run_todo+0x7e9/0x1320 net/core/dev.c:11513
rtnl_unlock net/core/rtnetlink.c:157 [inline]
rtnl_net_unlock include/linux/rtnetlink.h:135 [inline]
rtnl_dellink+0x3da/0xa80 net/core/rtnetlink.c:3563
rtnetlink_rcv_msg+0x95e/0xe90 net/core/rtnetlink.c:6946
netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2552
netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
netlink_unicast+0x5a7/0x870 net/netlink/af_netlink.c:1346
netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:714 [inline]
__sock_sendmsg net/socket.c:729 [inline]
____sys_sendmsg+0xa95/0xc70 net/socket.c:2614
___sys_sendmsg+0x134/0x1d0 net/socket.c:2668
__sys_sendmsg+0x16d/0x220 net/socket.c:2700
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Memory state around the buggy address:
 ffff888023f80000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888023f80080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff888023f80100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                      ^
 ffff888023f80180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888023f80200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Tested on:
commit: 1b237f19 Linux 6.17-rc3
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10b66862580000
kernel config: https://syzkaller.appspot.com/x/.config?x=d4703ac89d9e185a
dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1417b862580000
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
2023-06-10 1:34 syzbot
2023-06-21 7:07 ` Ziqi Zhao
2025-08-25 12:35 ` Hillf Danton
@ 2025-08-26 1:50 ` Hillf Danton
2025-08-26 2:48 ` syzbot
2 siblings, 1 reply; 18+ messages in thread
From: Hillf Danton @ 2025-08-26 1:50 UTC (permalink / raw)
To: syzbot; +Cc: linux-kernel, syzkaller-bugs
> Date: Fri, 09 Jun 2023 18:34:58 -0700 [thread overview]
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 67faabbde36b selftests/bpf: Add missing prototypes for sev..
> git tree: bpf-next
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=1381363b280000
> kernel config: https://syzkaller.appspot.com/x/.config?x=5335204dcdecfda
> dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=132faf93280000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10532add280000

#syz test upstream master

--- a/net/can/j1939/j1939-priv.h
+++ b/net/can/j1939/j1939-priv.h
@@ -59,6 +59,7 @@ struct j1939_priv {
 	/* segments need a lock to protect the above list */
 	rwlock_t lock;

+	int unregistered;
 	struct net_device *ndev;

 	/* list of 256 ecu ptrs, that cache the claimed addresses.
--- a/net/can/j1939/main.c
+++ b/net/can/j1939/main.c
@@ -157,13 +157,15 @@ static void __j1939_priv_release(struct
 	struct j1939_priv *priv = container_of(kref, struct j1939_priv, kref);
 	struct net_device *ndev = priv->ndev;

-	netdev_dbg(priv->ndev, "%s: 0x%p\n", __func__, priv);
+	if (!priv->unregistered)
+		netdev_dbg(priv->ndev, "%s: 0x%p\n", __func__, priv);

 	WARN_ON_ONCE(!list_empty(&priv->active_session_list));
 	WARN_ON_ONCE(!list_empty(&priv->ecus));
 	WARN_ON_ONCE(!list_empty(&priv->j1939_socks));

-	dev_put(ndev);
+	if (!priv->unregistered)
+		dev_put(ndev);
 	kfree(priv);
 }

@@ -197,7 +199,8 @@ static void j1939_can_rx_unregister(stru
 {
 	struct net_device *ndev = priv->ndev;

-	can_rx_unregister(dev_net(ndev), ndev, J1939_CAN_ID, J1939_CAN_MASK,
+	if (!priv->unregistered)
+		can_rx_unregister(dev_net(ndev), ndev, J1939_CAN_ID, J1939_CAN_MASK,
 			  j1939_can_recv, priv);

 	/* The last reference of priv is dropped by the RCU deferred
@@ -377,6 +380,10 @@ static int j1939_netdev_notify(struct no
 		j1939_sk_netdev_event_netdown(priv);
 		j1939_ecu_unmap_all(priv);
 		break;
+	case NETDEV_UNREGISTER:
+		priv->unregistered++;
+		dev_put(priv->ndev);
+		break;
 	}

 	j1939_priv_put(priv);
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8)
2025-08-26 1:50 ` Hillf Danton
@ 2025-08-26 2:48 ` syzbot
0 siblings, 0 replies; 18+ messages in thread
From: syzbot @ 2025-08-26 2:48 UTC (permalink / raw)
To: hdanton, linux-kernel, syzkaller-bugs
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
KASAN: use-after-free Read in j1939_netdev_stop
==================================================================
BUG: KASAN: use-after-free in netdev_get_ml_priv include/linux/netdevice.h:2692 [inline]
BUG: KASAN: use-after-free in can_get_ml_priv include/linux/can/can-ml.h:71 [inline]
BUG: KASAN: use-after-free in j1939_priv_set net/can/j1939/main.c:150 [inline]
BUG: KASAN: use-after-free in __j1939_rx_release net/can/j1939/main.c:221 [inline]
BUG: KASAN: use-after-free in kref_put_mutex include/linux/kref.h:86 [inline]
BUG: KASAN: use-after-free in j1939_netdev_stop+0x2df/0x320 net/can/j1939/main.c:312
Read of size 4 at addr ffff888062a506b0 by task syz.0.17/6467

CPU: 0 UID: 0 PID: 6467 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xcd/0x630 mm/kasan/report.c:482
kasan_report+0xe0/0x110 mm/kasan/report.c:595
netdev_get_ml_priv include/linux/netdevice.h:2692 [inline]
can_get_ml_priv include/linux/can/can-ml.h:71 [inline]
j1939_priv_set net/can/j1939/main.c:150 [inline]
__j1939_rx_release net/can/j1939/main.c:221 [inline]
kref_put_mutex include/linux/kref.h:86 [inline]
j1939_netdev_stop+0x2df/0x320 net/can/j1939/main.c:312
j1939_sk_release+0x5c3/0x8e0 net/can/j1939/socket.c:651
__sock_release+0xb0/0x270 net/socket.c:649
sock_close+0x1c/0x30 net/socket.c:1439
__fput+0x402/0xb70 fs/file_table.c:468
task_work_run+0x14d/0x240 kernel/task_work.c:227
exit_task_work include/linux/task_work.h:40 [inline]
do_exit+0x86f/0x2bf0 kernel/exit.c:961
do_group_exit+0xd3/0x2a0 kernel/exit.c:1102
get_signal+0x2673/0x26d0 kernel/signal.c:3034
arch_do_signal_or_restart+0x8f/0x7d0 arch/x86/kernel/signal.c:337
exit_to_user_mode_loop+0x84/0x110 kernel/entry/common.c:40
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x3f6/0x4c0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f15b7b8ebe9
Code: Unable to access opcode bytes at 0x7f15b7b8ebbf.
RSP: 002b:00007f15b899d038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: 0000000000000024 RBX: 00007f15b7db5fa0 RCX: 00007f15b7b8ebe9
RDX: 0000000000000000 RSI: 0000200000000200 RDI: 0000000000000003
RBP: 00007f15b7c11e19 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f15b7db6038 R14: 00007f15b7db5fa0 R15: 00007ffec48e9178
</TASK>

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x62a50
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000000 ffffea0001db7008 ffffea00018a9208 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as freed
page last allocated via order 3, migratetype Unmovable, gfp_mask 0x446dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOWARN|__GFP_RETRY_MAYFAIL|__GFP_COMP), pid 6357, tgid 6357 (syz-executor), ts 118494889601, free_ts 123112554125
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1c0/0x230 mm/page_alloc.c:1851
prep_new_page mm/page_alloc.c:1859 [inline]
get_page_from_freelist+0x132b/0x38e0 mm/page_alloc.c:3858
__alloc_frozen_pages_noprof+0x261/0x23f0 mm/page_alloc.c:5148
alloc_pages_mpol+0x1fb/0x550 mm/mempolicy.c:2416
___kmalloc_large_node+0xed/0x160 mm/slub.c:4306
__kmalloc_large_node_noprof+0x1c/0x70 mm/slub.c:4337
__do_kmalloc_node mm/slub.c:4353 [inline]
__kvmalloc_node_noprof.cold+0xb/0x65 mm/slub.c:5052
alloc_netdev_mqs+0xd2/0x1530 net/core/dev.c:11812
rtnl_create_link+0xc08/0xf90 net/core/rtnetlink.c:3633
vxcan_newlink+0x2f8/0x640 drivers/net/can/vxcan.c:208
rtnl_newlink_create net/core/rtnetlink.c:3825 [inline]
__rtnl_newlink net/core/rtnetlink.c:3942 [inline]
rtnl_newlink+0xc45/0x2000 net/core/rtnetlink.c:4057
rtnetlink_rcv_msg+0x95b/0xe90 net/core/rtnetlink.c:6946
netlink_rcv_skb+0x155/0x420 net/netlink/af_netlink.c:2552
netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1346
netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:714 [inline]
__sock_sendmsg net/socket.c:729 [inline]
__sys_sendto+0x4a0/0x520 net/socket.c:2228
page last free pid 6467 tgid 6466 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1395 [inline]
__free_frozen_pages+0x7d5/0x10f0 mm/page_alloc.c:2895
device_release+0xa1/0x240 drivers/base/core.c:2565
kobject_cleanup lib/kobject.c:689 [inline]
kobject_release lib/kobject.c:720 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x1e7/0x5a0 lib/kobject.c:737
netdev_run_todo+0x7e9/0x1320 net/core/dev.c:11513
rtnl_unlock net/core/rtnetlink.c:157 [inline]
rtnl_net_unlock include/linux/rtnetlink.h:135 [inline]
rtnl_dellink+0x3da/0xa80 net/core/rtnetlink.c:3563
rtnetlink_rcv_msg+0x95b/0xe90 net/core/rtnetlink.c:6946
netlink_rcv_skb+0x155/0x420 net/netlink/af_netlink.c:2552
netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1346
netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:714 [inline]
__sock_sendmsg net/socket.c:729 [inline]
____sys_sendmsg+0xa98/0xc70 net/socket.c:2614
___sys_sendmsg+0x134/0x1d0 net/socket.c:2668
__sys_sendmsg+0x16d/0x220 net/socket.c:2700
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Memory state around the buggy address:
 ffff888062a50580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888062a50600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff888062a50680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                     ^
 ffff888062a50700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888062a50780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Tested on:
commit: fab1beda Merge tag 'devicetree-fixes-for-6.17-1' of gi..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16049c42580000
kernel config: https://syzkaller.appspot.com/x/.config?x=d4703ac89d9e185a
dashboard link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=142de862580000
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2026-03-02 11:21 UTC | newest]

Thread overview: 18+ messages
2025-11-19 12:18 unregister_netdevice: waiting for DEV to become free (8) Tetsuo Handa
2025-11-19 12:18 ` syzbot
2025-11-19 12:20 ` Tetsuo Handa
2025-11-19 13:09 ` [syzbot] [net?] " syzbot
2025-11-19 13:13 ` Tetsuo Handa
2025-11-19 13:57 ` [syzbot] [net?] " syzbot
2025-11-19 14:00 ` Tetsuo Handa
2025-11-19 14:47 ` [syzbot] [net?] " syzbot
2026-03-02 10:56 ` Tetsuo Handa
2026-03-02 11:21 ` [syzbot] [net?] " syzbot
-- strict thread matches above, loose matches on Subject: below --
2025-08-25 11:01 [PATCH] can: j1939: implement NETDEV_UNREGISTER notification handler Tetsuo Handa
2025-08-25 13:35 ` [syzbot] [net?] unregister_netdevice: waiting for DEV to become free (8) syzbot
2023-06-10  1:34 syzbot
2023-06-21  7:07 ` Ziqi Zhao
2023-06-21  8:46 ` Dongliang Mu
2025-08-25 12:35 ` Hillf Danton
2025-08-25 13:51 ` syzbot
2025-08-26  1:50 ` Hillf Danton
2025-08-26  2:48 ` syzbot