netdev.vger.kernel.org archive mirror
From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: wangyufen <wangyufen@huawei.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Dmitry Vyukov <dvyukov@google.com>
Cc: syzbot <syzbot+5e70d01ee8985ae62a3b@syzkaller.appspotmail.com>,
	Leon Romanovsky <leon@kernel.org>,
	chenzhongjin@huawei.com,
	RDMA mailing list <linux-rdma@vger.kernel.org>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	syzkaller-bugs@googlegroups.com,
	Zhu Yanjun <zyjzyj2000@gmail.com>,
	Bob Pearson <rpearsonhpe@gmail.com>
Subject: Re: [syzbot] unregister_netdevice: waiting for DEV to become free (7)
Date: Wed, 23 Nov 2022 17:45:53 +0800
Message-ID: <2f54056f-0acf-e088-c6cc-9ffce77bbe24@linux.dev>
In-Reply-To: <ecc8b532-4e80-b7bd-3621-78cd55fd48fa@huawei.com>



On 11/22/22 11:28 AM, wangyufen wrote:
>
> On 2022/11/22 10:13, Jason Gunthorpe wrote:
>> On Fri, Nov 18, 2022 at 02:28:53PM +0100, Dmitry Vyukov wrote:
>>> On Fri, 18 Nov 2022 at 12:39, syzbot
>>> <syzbot+5e70d01ee8985ae62a3b@syzkaller.appspotmail.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit:    9c8774e629a1 net: eql: Use kzalloc instead of kmalloc/memset
>>>> git tree:       net-next
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=17bf6cc8f00000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=9eb259db6b1893cf
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b
>>>> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1136d592f00000
>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1193ae64f00000
>>>>
>>>> Bisection is inconclusive: the issue happens on the oldest tested release.
>>>>
>>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=167c33a2f00000
>>>> final oops:     https://syzkaller.appspot.com/x/report.txt?x=157c33a2f00000
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=117c33a2f00000
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+5e70d01ee8985ae62a3b@syzkaller.appspotmail.com
>>>>
>>>> iwpm_register_pid: Unable to send a nlmsg (client = 2)
>>>> infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
>>>> unregister_netdevice: waiting for vlan0 to become free. Usage count = 2
>>>
>>> +RDMA maintainers
>>>
>>> There are 4 reproducers and all contain:
>>>
>>> r0 = socket$nl_rdma(0x10, 0x3, 0x14)
>>> sendmsg$RDMA_NLDEV_CMD_NEWLINK(...)
>>>
>>> Also the preceding print looks related (a bug in the error handling
>>> path there?):
>>>
>>> infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
>>
>> I'm pretty sure it is an rxe bug
>>
>> ib_device_set_netdev() will hold the netdev until the caller destroys
>> the ib_device
>>
>> rxe calls it during rxe_register_device() because the user asked for a
>> stacked ib_device on top of the netdev
>>
>> Presumably rxe needs to have a notifier to also self destroy the rxe
>> device if the underlying net device is to be destroyed?
>>
>> Can someone from rxe check into this?
>
> The following patch may fix the issue:
>
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -4049,6 +4049,9 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
>         return 0;
>  err:
>         id_priv->backlog = 0;
> +       if (id_priv->cma_dev)
> +               cma_release_dev(id_priv);
> +
>         /*
>          * All the failure paths that lead here will not allow the req_handler's
>          * to have run.
>

But it is the caller's responsibility to destroy it since commit dd37d2f59eb8.

> The causes are as follows:
>
> rdma_listen()
>   rdma_bind_addr()
>     cma_acquire_dev_by_src_ip()
>       cma_attach_to_dev()
>         _cma_attach_to_dev()
>           cma_dev_get()

Thanks for the analysis.
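
To make the reference flow concrete, here is a minimal userspace sketch of
the situation discussed above: the bind path takes a device reference, a
failed listen leaves it in place, and only destroying the ID drops it. All
names (model_dev, model_id, model_bind, ...) are hypothetical and are not
kernel APIs; the model uses a single counter and ignores the real
cma_dev / ib_device / netdev chain and all locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct model_dev {
	int refcount;           /* models the device usage count */
};

struct model_id {
	struct model_dev *dev;  /* models id_priv->cma_dev; NULL when unattached */
};

/* Models the bind path: attaching the ID to a device takes a reference. */
static void model_bind(struct model_id *id, struct model_dev *dev)
{
	dev->refcount++;
	id->dev = dev;
}

/*
 * Models a listen attempt that may fail. On failure the attachment is
 * left untouched, so undoing it is up to the caller. The value -98 is
 * simply the error number printed in the log quoted above.
 */
static int model_listen(struct model_id *id, bool fail)
{
	assert(id->dev != NULL);
	return fail ? -98 : 0;
}

/* Models destroying the ID: the device reference is dropped here. */
static void model_destroy(struct model_id *id)
{
	if (id->dev) {
		id->dev->refcount--;
		id->dev = NULL;
	}
}

/* Models the unregister check: only the base reference may remain. */
static bool model_dev_is_free(const struct model_dev *dev)
{
	return dev->refcount == 1;
}
```

In this model, a caller that sees model_listen() fail and never calls
model_destroy() leaves the count at 2, which is the shape of the
"Usage count = 2" message in the report.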

Also, the two callers of cma_listen_on_dev appear to handle failure
differently:

1. cma_listen_on_all, which calls both
             list_del_init(&to_destroy->device_item)
     and
             rdma_destroy_id(&to_destroy->id)

2. cma_add_one, which invokes cma_process_remove to delete to_destroy.
cma_process_remove calls both list_del_init(&id_priv->listen_item)
and list_del_init(&id_priv->device_item), but it does not call
rdma_destroy_id(&dev_id_priv->id), which also differs from
_cma_cancel_listens.

I am wondering whether the change below is needed.

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index cc2222b85c88..48e283d1389b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -5231,6 +5231,7 @@ static void cma_process_remove(struct cma_device *cma_dev)
                 cma_id_get(id_priv);
                 mutex_unlock(&lock);

+               rdma_destroy_id(&dev_id_priv->id);
                 cma_send_device_removal_put(id_priv);

                 mutex_lock(&lock);

Thanks,
Guoqing
