From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: wangyufen <wangyufen@huawei.com>, Jason Gunthorpe <jgg@ziepe.ca>,
Dmitry Vyukov <dvyukov@google.com>
Cc: syzbot <syzbot+5e70d01ee8985ae62a3b@syzkaller.appspotmail.com>,
Leon Romanovsky <leon@kernel.org>,
chenzhongjin@huawei.com,
RDMA mailing list <linux-rdma@vger.kernel.org>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
syzkaller-bugs@googlegroups.com,
Zhu Yanjun <zyjzyj2000@gmail.com>,
Bob Pearson <rpearsonhpe@gmail.com>
Subject: Re: [syzbot] unregister_netdevice: waiting for DEV to become free (7)
Date: Wed, 23 Nov 2022 17:45:53 +0800
Message-ID: <2f54056f-0acf-e088-c6cc-9ffce77bbe24@linux.dev>
In-Reply-To: <ecc8b532-4e80-b7bd-3621-78cd55fd48fa@huawei.com>
On 11/22/22 11:28 AM, wangyufen wrote:
>
> On 2022/11/22 10:13, Jason Gunthorpe wrote:
>> On Fri, Nov 18, 2022 at 02:28:53PM +0100, Dmitry Vyukov wrote:
>>> On Fri, 18 Nov 2022 at 12:39, syzbot
>>> <syzbot+5e70d01ee8985ae62a3b@syzkaller.appspotmail.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit: 9c8774e629a1 net: eql: Use kzalloc instead of
>>>> kmalloc/memset
>>>> git tree: net-next
>>>> console output:
>>>> https://syzkaller.appspot.com/x/log.txt?x=17bf6cc8f00000
>>>> kernel config:
>>>> https://syzkaller.appspot.com/x/.config?x=9eb259db6b1893cf
>>>> dashboard link:
>>>> https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b
>>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU
>>>> Binutils for Debian) 2.35.2
>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1136d592f00000
>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1193ae64f00000
>>>>
>>>> Bisection is inconclusive: the issue happens on the oldest tested
>>>> release.
>>>>
>>>> bisection log:
>>>> https://syzkaller.appspot.com/x/bisect.txt?x=167c33a2f00000
>>>> final oops:
>>>> https://syzkaller.appspot.com/x/report.txt?x=157c33a2f00000
>>>> console output:
>>>> https://syzkaller.appspot.com/x/log.txt?x=117c33a2f00000
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to
>>>> the commit:
>>>> Reported-by: syzbot+5e70d01ee8985ae62a3b@syzkaller.appspotmail.com
>>>>
>>>> iwpm_register_pid: Unable to send a nlmsg (client = 2)
>>>> infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
>>>> unregister_netdevice: waiting for vlan0 to become free. Usage count
>>>> = 2
>>>
>>> +RDMA maintainers
>>>
>>> There are 4 reproducers and all contain:
>>>
>>> r0 = socket$nl_rdma(0x10, 0x3, 0x14)
>>> sendmsg$RDMA_NLDEV_CMD_NEWLINK(...)
>>>
>>> Also the preceding print looks related (a bug in the error handling
>>> path there?):
>>>
>>> infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
>>
>> I'm pretty sure it is an rxe bug
>>
>> ib_device_set_netdev() will hold the netdev until the caller destroys
>> the ib_device
>>
>> rxe calls it during rxe_register_device() because the user asked for a
>> stacked ib_device on top of the netdev
>>
>> Presumably rxe needs to have a notifier to also self destroy the rxe
>> device if the underlying net device is to be destroyed?
>>
>> Can someone from rxe check into this?
>
> The following patch may fix the issue:
>
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -4049,6 +4049,9 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
> 	return 0;
>  err:
> 	id_priv->backlog = 0;
> +	if (id_priv->cma_dev)
> +		cma_release_dev(id_priv);
> +
> 	/*
> 	 * All the failure paths that lead here will not allow the req_handler's
> 	 * to have run.
>
But it is the caller's responsibility to destroy it since commit
dd37d2f59eb8.
> The causes are as follows:
>
> rdma_listen()
>   rdma_bind_addr()
>     cma_acquire_dev_by_src_ip()
>       cma_attach_to_dev()
>         _cma_attach_to_dev()
>           cma_dev_get()
Thanks for the analysis.
The two callers of cma_listen_on_dev also appear to handle failure
differently.
1. cma_listen_on_all calls both
list_del_init(&to_destroy->device_item)
and
rdma_destroy_id(&to_destroy->id).
2. cma_add_one invokes cma_process_remove to delete to_destroy.
cma_process_remove calls both list_del_init(&id_priv->listen_item)
and list_del_init(&id_priv->device_item), but it doesn't call
rdma_destroy_id(&dev_id_priv->id), which also differs from
_cma_cancel_listens.
I am wondering whether the following change is needed.
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index cc2222b85c88..48e283d1389b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -5231,6 +5231,7 @@ static void cma_process_remove(struct cma_device *cma_dev)
 		cma_id_get(id_priv);
 		mutex_unlock(&lock);
 
+		rdma_destroy_id(&dev_id_priv->id);
 		cma_send_device_removal_put(id_priv);
 
 		mutex_lock(&lock);
Thanks,
Guoqing