All of lore.kernel.org
 help / color / mirror / Atom feed
From: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: davem@davemloft.net, netdev@vger.kernel.org,
	Sean Tranchetti <stranche@codeaurora.org>
Subject: Re: [PATCH net] dev: Delay the free of the percpu refcount
Date: Wed, 04 Sep 2019 12:50:32 -0600	[thread overview]
Message-ID: <a539983647c166a28db839dbff32a6bc@codeaurora.org> (raw)
In-Reply-To: <adbe6efaabd34538fa424e028bdc6699@codeaurora.org>

On 2019-08-30 15:03, Subash Abhinov Kasiviswanathan wrote:
>> This looks bogus.
>> 
>> Whatever layer tries to access dev refcnt after free_netdev() has been
>> called is buggy.
>> 
>> I would rather trap early and fix the root cause.
>> 
>> Untested patch :
>> 
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index b5d28dadf964..8080f1305417 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -3723,6 +3723,7 @@ void netdev_run_todo(void);
>>   */
>>  static inline void dev_put(struct net_device *dev)
>>  {
>> +       BUG_ON(!dev->pcpu_refcnt);
>>         this_cpu_dec(*dev->pcpu_refcnt);
>>  }
>> 
>> @@ -3734,6 +3735,7 @@ static inline void dev_put(struct net_device 
>> *dev)
>>   */
>>  static inline void dev_hold(struct net_device *dev)
>>  {
>> +       BUG_ON(!dev->pcpu_refcnt);
>>         this_cpu_inc(*dev->pcpu_refcnt);
>>  }
> 
> Hello Eric
> 
> I am seeing a similar crash with your patch as well.
> The NULL dev->pcpu_refcnt was caught by the BUG you added.
> 
>    786.510217:   <6> kernel BUG at include/linux/netdevice.h:3633!
>    786.510263:   <2> pc : in_dev_finish_destroy+0xcc/0xd0
>    786.510267:   <2> lr : in_dev_finish_destroy+0x2c/0xd0
>    786.511220:   <2> Call trace:
>    786.511225:   <2>  in_dev_finish_destroy+0xcc/0xd0
>    786.511230:   <2>  in_dev_rcu_put+0x24/0x30
>    786.511237:   <2>  rcu_nocb_kthread+0x43c/0x468
>    786.511243:   <2>  kthread+0x118/0x128
>    786.511249:   <2>  ret_from_fork+0x10/0x1c
> 
> This seems to be happening when there is an allocation failure
> in the IPv6 notifier callback only.
> 
> I had added some additional debug to narrow down the refcount
> validity along the callers of the dev_put/dev_hold.
> refcnt valid below shows that the pointer dev->pcpu_refcnt is valid
> while refcnt null shows the case where dev->pcpu_refcnt is NULL.
> The last dev_put happens after free_netdev leading to the
> dev->pcpu_refcnt to be accessed when NULL.
> 
> 309.908501:   <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
> setup_net+0xa0/0x210 -> ops_init+0x88/0x110
> 309.908674:   <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
> register_netdevice+0x29c/0x5b0 -> netdev_register_kobject+0xd8/0x150
> 309.908696:   <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
> register_netdevice+0x29c/0x5b0 -> netdev_register_kobject+0x100/0x150
> 309.908717:   <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
> vti6_init_net+0x188/0x1c0 -> register_netdev+0x28/0x40
> 309.908763:   <6> neighbour: dev_hold() ffffffe13c9df000 ip6_vti0
> refcnt valid inetdev_event+0x43c/0x528 -> inetdev_init+0x80/0x1e0
> 309.908835:   <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
> raw_notifier_call_chain+0x3c/0x68 -> inetdev_event+0x43c/0x528
> 309.908882:   <6> neighbour: dev_hold() ffffffe13c9df000 ip6_vti0
> refcnt valid addrconf_notify+0x42c/0xe58 -> ipv6_add_dev+0xe4/0x588
> 309.908890:   <6> IPv6: dev_hold() ffffffe13c9df000 ip6_vti0 refcnt
> valid raw_notifier_call_chain+0x3c/0x68 -> addrconf_notify+0x42c/0xe58
> 309.908906:   <6> stress-ng-clone: page allocation failure: order:0,
> mode:0x6040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null)
> 309.908910:   <6> stress-ng-clone cpuset=foreground mems_allowed=0
> 309.908925:   <2> Call trace:
> 309.908931:   <2>  dump_backtrace+0x0/0x158
> 309.908934:   <2>  show_stack+0x14/0x20
> 309.908939:   <2>  dump_stack+0xc4/0xfc
> 309.908944:   <2>  warn_alloc+0xf8/0x168
> 309.908947:   <2>  __alloc_pages_nodemask+0xff4/0x1018
> 309.908955:   <2>  new_slab+0x128/0x5b8
> 309.908958:   <2>  ___slab_alloc+0x4cc/0x5f8
> 309.908960:   <2>  kmem_cache_alloc_trace+0x2a4/0x2c0
> 309.908963:   <2>  ipv6_add_dev+0x220/0x588
> 309.908966:   <2>  addrconf_notify+0x42c/0xe58
> 309.908969:   <2>  raw_notifier_call_chain+0x3c/0x68
> 309.908972:   <2>  register_netdevice+0x3c4/0x5b0
> 309.908974:   <2>  register_netdev+0x28/0x40
> 309.908978:   <2>  vti6_init_net+0x188/0x1c0
> 309.908981:   <2>  ops_init+0x88/0x110
> 309.908983:   <2>  setup_net+0xa0/0x210
> 309.908986:   <2>  copy_net_ns+0xa8/0x130
> 309.908990:   <2>  create_new_namespaces+0x138/0x170
> 309.908993:   <2>  unshare_nsproxy_namespaces+0x68/0x90
> 309.908999:   <2>  ksys_unshare+0x17c/0x248
> 309.909001:   <2>  __arm64_sys_unshare+0x10/0x20
> 309.909004:   <2>  el0_svc_common+0xa0/0x158
> 309.909007:   <2>  el0_svc_handler+0x6c/0x88
> 309.909010:   <2>  el0_svc+0x8/0xc
> 309.909021:   <6> neighbour: dev_put() ffffffe13c9df000 ip6_vti0
> refcnt valid addrconf_notify+0x42c/0xe58 -> ipv6_add_dev+0x400/0x588
> 309.909030:   <6> IPv6: dev_put() ffffffe13c9df000 ip6_vti0 refcnt
> valid raw_notifier_call_chain+0x3c/0x68 -> addrconf_notify+0x42c/0xe58
> 309.918097:   <6> neighbour: dev_put() ffffffe13c9df000 ip6_vti0
> refcnt valid raw_notifier_call_chain+0x3c/0x68 ->
> inetdev_event+0x290/0x528
> 309.918249:   <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt valid
> register_netdevice+0x3f8/0x5b0 -> rollback_registered_many+0x488/0x658
> 309.918318:   <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt valid
> net_rx_queue_update_kobjects+0x1ec/0x238 -> kobject_put+0x7c/0xc0
> 309.918405:   <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt valid
> netdev_queue_update_kobjects+0x1dc/0x228 -> kobject_put+0x7c/0xc0
> 309.918759:   <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt valid
> register_netdev+0x28/0x40 -> register_netdevice+0x3f8/0x5b0
> 309.918778:   <6> free_netdev() ffffffe13c9df000 ip6_vti0 refcnt valid
> ops_init+0x88/0x110 -> vti6_init_net+0x1ac/0x1c0
> 309.980671:   <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt null
> rcu_nocb_kthread+0x43c/0x468 -> in_dev_rcu_put+0x24/0x30
> 309.980838:   <6> kernel BUG at include/linux/netdevice.h:3636!

Hello Eric

Could you please review this? The main concern here is that
dev_put() is getting called after free_netdev() causing a NULL
dereference.

309.918778:   <6> free_netdev() ffffffe13c9df000 ip6_vti0 refcnt valid
ops_init+0x88/0x110 -> vti6_init_net+0x1ac/0x1c0
309.980671:   <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt null
rcu_nocb_kthread+0x43c/0x468 -> in_dev_rcu_put+0x24/0x30

I am able to reproduce this consistently and open to testing
out patches.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

      reply	other threads:[~2019-09-04 18:50 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-30  5:23 [PATCH net] dev: Delay the free of the percpu refcount Subash Abhinov Kasiviswanathan
2019-08-30  9:16 ` Eric Dumazet
2019-08-30 21:03   ` Subash Abhinov Kasiviswanathan
2019-09-04 18:50     ` Subash Abhinov Kasiviswanathan [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a539983647c166a28db839dbff32a6bc@codeaurora.org \
    --to=subashab@codeaurora.org \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=stranche@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.