netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ido Schimmel <idosch@nvidia.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Pirko <jiri@resnulli.us>,
	netdev@vger.kernel.org, davem@davemloft.net, pabeni@redhat.com,
	edumazet@google.com, petrm@nvidia.com
Subject: Re: [RFC PATCH net-next 1/2] devlink: Hold a reference on parent device
Date: Wed, 21 Jun 2023 09:31:43 +0300	[thread overview]
Message-ID: <ZJKZT3LHBN3zEUd1@shredder> (raw)
In-Reply-To: <20230620104351.6debe7f1@kernel.org>

On Tue, Jun 20, 2023 at 10:43:51AM -0700, Jakub Kicinski wrote:
> Do we need to hold the reference on the device until release?
> I think you can release it in devlink_free().
> The only valid fields for an unregistered devlink instance are:
>  - lock
>  - refcount
>  - index
> 
> And obviously unregistered devices can't be reloaded.

Thanks for taking a look.

Moving the release to devlink_free() [1] was the first thing I tried and
it indeed solves the problem I mentioned earlier, but creates a new one.
After devlink_free() returns the devlink instance can still be accessed
by user space in devlink_get_from_attrs_lock(). If I reload in a loop
while concurrently removing and adding the device [2], we can hit a UAF
when trying to acquire the device lock [3].

[1]
diff --git a/net/devlink/core.c b/net/devlink/core.c
index a4b6d548e50c..b60a8463d6e0 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -92,7 +92,6 @@ static void devlink_release(struct work_struct *work)
 
        mutex_destroy(&devlink->lock);
        lockdep_unregister_key(&devlink->lock_key);
-       put_device(devlink->dev);
        kfree(devlink);
 }
 
@@ -264,6 +263,8 @@ void devlink_free(struct devlink *devlink)
 
        xa_erase(&devlinks, devlink->index);
 
+       put_device(devlink->dev);
+
        devlink_put(devlink);
 }
 EXPORT_SYMBOL_GPL(devlink_free);

[2]
while true; do devlink dev reload netdevsim/netdevsim10 &> /dev/null; done &
while true; do echo "10 0" > /sys/bus/netdevsim/new_device; echo 10 > /sys/bus/netdevsim/del_device; done

[3]
[   96.081096] ==================================================================
[   96.082158] BUG: KASAN: slab-use-after-free in __mutex_lock+0x18a7/0x1ac0
[   96.083107] Read of size 8 at addr ffff888109caa8d8 by task devlink/429

[   96.084266] CPU: 0 PID: 429 Comm: devlink Not tainted 6.4.0-rc6-custom-gb01b0912311c-dirty #303
[   96.085443] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
[   96.086632] Call Trace:
[   96.086990]  <TASK>
[   96.087314]  dump_stack_lvl+0x91/0xf0
[   96.087852]  print_report+0xcf/0x660
[   96.088384]  ? __virt_addr_valid+0x86/0x360
[   96.088980]  ? __mutex_lock+0x18a7/0x1ac0
[   96.089558]  ? __mutex_lock+0x18a7/0x1ac0
[   96.090132]  kasan_report+0xd6/0x110
[   96.090667]  ? __mutex_lock+0x18a7/0x1ac0
[   96.091245]  __mutex_lock+0x18a7/0x1ac0
[   96.091898]  ? devlink_get_from_attrs_lock+0x2bc/0x460
[   96.092645]  ? mutex_lock_io_nested+0x18b0/0x18b0
[   96.093323]  ? reacquire_held_locks+0x4e0/0x4e0
[   96.093972]  ? devlink_try_get+0x158/0x1e0
[   96.094586]  ? devlink_get_from_attrs_lock+0x460/0x460
[   96.095325]  ? devlink_get_from_attrs_lock+0x2bc/0x460
[   96.096053]  devlink_get_from_attrs_lock+0x2bc/0x460
[   96.096767]  ? devlink_nl_post_doit+0xf0/0xf0
[   96.097409]  ? __nla_parse+0x40/0x50
[   96.097943]  ? devlink_get_from_attrs_lock+0x460/0x460
[   96.098671]  devlink_nl_pre_doit+0xb3/0x480
[   96.099255]  genl_family_rcv_msg_doit.isra.0+0x1b8/0x2e0
[   96.099966]  ? genl_start+0x670/0x670
[   96.100487]  ? ns_capable+0xda/0x110
[   96.100991]  genl_rcv_msg+0x558/0x7f0
[   96.101516]  ? genl_family_rcv_msg_doit.isra.0+0x2e0/0x2e0
[   96.102260]  ? devlink_get_from_attrs_lock+0x460/0x460
[   96.102925]  ? devlink_reload+0x540/0x540
[   96.103429]  ? devlink_pernet_pre_exit+0x340/0x340
[   96.104017]  netlink_rcv_skb+0x170/0x440
[   96.104508]  ? genl_family_rcv_msg_doit.isra.0+0x2e0/0x2e0
[   96.105166]  ? netlink_ack+0x1380/0x1380
[   96.105662]  ? lock_contended+0xc70/0xc70
[   96.106172]  ? rwsem_down_read_slowpath+0xda0/0xda0
[   96.106767]  ? netlink_deliver_tap+0x1b6/0xd90
[   96.107317]  ? is_vmalloc_addr+0x8b/0xb0
[   96.107804]  genl_rcv+0x2d/0x40
[   96.108213]  netlink_unicast+0x53f/0x810
[   96.108698]  ? netlink_attachskb+0x870/0x870
[   96.109230]  ? lock_release+0x3ac/0xbb0
[   96.109714]  netlink_sendmsg+0x95c/0xe80
[   96.110206]  ? netlink_unicast+0x810/0x810
[   96.110711]  ? __might_fault+0x15b/0x190
[   96.111194]  ? _copy_from_user+0x9f/0xd0
[   96.111691]  __sys_sendto+0x2aa/0x420
[   96.112148]  ? __ia32_sys_getpeername+0xb0/0xb0
[   96.112706]  ? reacquire_held_locks+0x4e0/0x4e0
[   96.113275]  ? rcu_is_watching+0x12/0xb0
[   96.113769]  ? blkcg_exit_disk+0x50/0x50
[   96.114260]  __x64_sys_sendto+0xe5/0x1c0
[   96.114742]  ? lockdep_hardirqs_on+0x7d/0x100
[   96.115285]  ? syscall_enter_from_user_mode+0x20/0x50
[   96.115899]  do_syscall_64+0x38/0x80
[   96.116352]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   96.116964] RIP: 0033:0x7fa073f72fa7
[   96.117417] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 80 3d 5d d6 0c 00 00 41 89 ca 74 10 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 71 c3 55 48 83 ec 30 44 89 4c 24 2c 4c 89 44
[   96.119548] RSP: 002b:00007ffca2723938 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[   96.120449] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa073f72fa7
[   96.121287] RDX: 0000000000000034 RSI: 000055883950a460 RDI: 0000000000000003
[   96.122120] RBP: 0000558837836e60 R08: 00007fa074050200 R09: 000000000000000c
[   96.122951] R10: 0000000000000000 R11: 0000000000000202 R12: 000055883950a2a0
[   96.123789] R13: 000055883950a460 R14: 000055883784eeab R15: 000055883950a2a0
[   96.124638]  </TASK>

[   96.125120] Allocated by task 209:
[   96.125543]  kasan_save_stack+0x33/0x50
[   96.125567]  kasan_set_track+0x25/0x30
[   96.125587]  __kasan_kmalloc+0x87/0x90
[   96.125606]  new_device_store+0x22d/0x6a0
[   96.125625]  bus_attr_store+0x7b/0xa0
[   96.125641]  sysfs_kf_write+0x11c/0x170
[   96.125659]  kernfs_fop_write_iter+0x3f7/0x600
[   96.125676]  vfs_write+0x680/0xe90
[   96.125691]  ksys_write+0x143/0x270
[   96.125707]  do_syscall_64+0x38/0x80
[   96.125722]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

[   96.125947] Freed by task 209:
[   96.126332]  kasan_save_stack+0x33/0x50
[   96.126355]  kasan_set_track+0x25/0x30
[   96.126374]  kasan_save_free_info+0x2e/0x40
[   96.126389]  __kasan_slab_free+0x10a/0x180
[   96.126409]  __kmem_cache_free+0x8a/0x1a0
[   96.126429]  device_release+0xa6/0x240
[   96.126449]  kobject_put+0x1f7/0x5b0
[   96.126465]  device_unregister+0x34/0xc0
[   96.126478]  del_device_store+0x308/0x430
[   96.126496]  bus_attr_store+0x7b/0xa0
[   96.126511]  sysfs_kf_write+0x11c/0x170
[   96.126528]  kernfs_fop_write_iter+0x3f7/0x600
[   96.126545]  vfs_write+0x680/0xe90
[   96.126560]  ksys_write+0x143/0x270
[   96.126575]  do_syscall_64+0x38/0x80
[   96.126590]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

[   96.126815] The buggy address belongs to the object at ffff888109caa800
                which belongs to the cache kmalloc-1k of size 1024
[   96.128252] The buggy address is located 216 bytes inside of
                freed 1024-byte region [ffff888109caa800, ffff888109caac00)

[   96.129875] The buggy address belongs to the physical page:
[   96.130540] page:00000000c7912b23 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109ca8
[   96.130562] head:00000000c7912b23 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[   96.130575] flags: 0x200000000010200(slab|head|node=0|zone=2)
[   96.130591] page_type: 0xffffffff()
[   96.130607] raw: 0200000000010200 ffff888100041dc0 dead000000000100 dead000000000122
[   96.130622] raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
[   96.130632] page dumped because: kasan: bad access detected

[   96.130844] Memory state around the buggy address:
[   96.131424]  ffff888109caa780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   96.132266]  ffff888109caa800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   96.133101] >ffff888109caa880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   96.133944]                                                     ^
[   96.134666]  ffff888109caa900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   96.135510]  ffff888109caa980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   96.136351] ==================================================================

  reply	other threads:[~2023-06-21  6:31 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-19 12:50 [RFC PATCH net-next 0/2] devlink: Acquire device lock during reload Ido Schimmel
2023-06-19 12:50 ` [RFC PATCH net-next 1/2] devlink: Hold a reference on parent device Ido Schimmel
2023-06-20  6:23   ` Jiri Pirko
2023-06-20  7:05     ` Ido Schimmel
2023-06-20 17:43       ` Jakub Kicinski
2023-06-21  6:31         ` Ido Schimmel [this message]
2023-06-21 19:03           ` Jakub Kicinski
2023-06-22  6:03             ` Ido Schimmel
2023-06-21 11:48   ` Jiri Pirko
2023-06-21 15:35     ` Ido Schimmel
2023-06-22  6:29       ` Jiri Pirko
2023-06-25 11:55         ` Ido Schimmel
2023-06-27 10:13           ` Jiri Pirko
2023-06-19 12:50 ` [RFC PATCH net-next 2/2] devlink: Acquire device lock during reload Ido Schimmel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZJKZT3LHBN3zEUd1@shredder \
    --to=idosch@nvidia.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jiri@resnulli.us \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=petrm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).