All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Ding Hui <dinghui@sangfor.com.cn>
Cc: leon@kernel.org, wenglianfa@huawei.com, gustavoars@kernel.org,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	Shifeng Li <lishifeng1992@126.com>,
	Shifeng Li <lishifeng@sangfor.com.cn>
Subject: Re: [PATCH] RDMA/device: Fix a race between mad_client and cm_client init
Date: Mon, 15 Jan 2024 09:47:07 -0400	[thread overview]
Message-ID: <20240115134707.GZ50608@ziepe.ca> (raw)
In-Reply-To: <e029db0a-c515-e61c-d34e-f7f054d51e88@sangfor.com.cn>

On Sat, Jan 06, 2024 at 10:12:17AM +0800, Ding Hui wrote:
> On 2024/1/4 20:37, Jason Gunthorpe wrote:
> > On Thu, Jan 04, 2024 at 02:48:14PM +0800, Shifeng Li wrote:
> > 
> > > The root cause is that mad_client and cm_client may init concurrently
> > > when devices_rwsem write semaphore is downgraded in enable_device_and_get() like:
> > 
> > That can't be true, the module loader infrastructue ensures those two
> > things are sequential.
> > 
> 
> Please consider the sequence again and notice that:
> 
> 1. We agree that dependencies ensure mad_client be registered before cm_client.
> 2. But the mad_client.add() is not invoked in ib_register_client(), since
>    there is no DEVICE_REGISTERED device at that time.
>    Instead, it will be delayed until the device driver init (e.g. mlx5_core)
>    in enable_device_and_get().
> 3. The ib_cm and mlx5_core can be loaded concurrently, after setting DEVICE_REGISTERED
>    and downgrade_write(&devices_rwsem) in enable_device_and_get(), there is a chance
>    that cm_client.add() can be invoked before mad_client.add().
> 
> 
>         T1(ib_core init)      |      T2(device driver init)        |        T3(ib_cm init)
> ---------------------------------------------------------------------------------------------------
> ib_register_client mad_client
>   assign_client_id
>     add clients CLIENT_REGISTERED
>     (with clients_rwsem write)
>   down_read(&devices_rwsem);
>   xa_for_each_marked (&devices, DEVICE_REGISTERED)
>     nop # no devices
>   up_read(&devices_rwsem);
> 
>                               ib_register_device
>                                 enable_device_and_get
>                                   down_write(&devices_rwsem);
>                                   set DEVICE_REGISTERED
>                                   downgrade_write(&devices_rwsem);
>                                                                     ib_register_client cm_client
>                                                                       assign_client_id
>                                                                         add clients CLIENT_REGISTERED
>                                                                         (with clients_rwsem write)
>                                                                       down_read(&devices_rwsem);
>                                                                       xa_for_each_marked (&devices, DEVICE_REGISTERED)
>                                                                         add_client_context
>                                                                           down_write(&device->client_data_rwsem);
>                                                                           get CLIENT_DATA_REGISTERED
>                                                                           downgrade_write(&device->client_data_rwsem);
>                                                                           cm_client.add
>                                                                             cm_add_one
>                                                                               ib_register_mad_agent
>                                                                                 ib_get_mad_port
>                                                                                   __ib_get_mad_port return NULL!
>                                                                           set CLIENT_DATA_REGISTERED
>                                                                           up_read(&device->client_data_rwsem);
>                                                                       up_read(&devices_rwsem);
>                                 down_read(&clients_rwsem);
>                                 xa_for_each_marked (&clients, CLIENT_REGISTERED)
>                                   add_client_context [mad]
>                                     mad_client.add
>                                   add_client_context [cm]
>                                     nop # already CLIENT_DATA_REGISTERED
>                                 up_read(&clients_rwsem);
>                                 up_read(&devices_rwsem);

Take the draft I sent previously and use down_write(&devices_rwsem) in
ib_register_client()

Jason

  reply	other threads:[~2024-01-15 13:47 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-02  3:43 [PATCH] RDMA/device: Fix a race between mad_client and cm_client init Shifeng Li
2024-01-02  8:58 ` Leon Romanovsky
2024-01-02 11:33   ` Shifeng Li
2024-01-02 11:58     ` Leon Romanovsky
2024-01-02 12:30       ` Shifeng Li
2024-01-03 18:48 ` Jason Gunthorpe
     [not found]   ` <80cac9fd-7fed-403e-8889-78e2fc7a49b0@sangfor.com.cn>
2024-01-04 12:37     ` Jason Gunthorpe
2024-01-05  8:15       ` Shifeng Li
2024-01-05 14:19         ` Jason Gunthorpe
2024-01-15  3:27           ` Shifeng Li
2024-01-06  2:12       ` Ding Hui
2024-01-15 13:47         ` Jason Gunthorpe [this message]
     [not found]           ` <354e2bf7-a8b4-629d-3d2d-35951a52e8bd@sangfor.com.cn>
2024-01-26  2:25             ` Shifeng Li
2024-02-01  0:04               ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240115134707.GZ50608@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=dinghui@sangfor.com.cn \
    --cc=gustavoars@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=lishifeng1992@126.com \
    --cc=lishifeng@sangfor.com.cn \
    --cc=wenglianfa@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.