From: "dust.li" <dust.li@linux.alibaba.com>
To: Karsten Graul <kgraul@linux.ibm.com>,
"David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>
Cc: linux-s390@vger.kernel.org, netdev@vger.kernel.org,
Wen Gu <guwen@linux.alibaba.com>,
Tony Lu <tonylu@linux.alibaba.com>
Subject: Re: [PATCH net 2/2] net/smc: fix kernel panic caused by race of smc_sock
Date: Thu, 30 Dec 2021 11:46:57 +0800 [thread overview]
Message-ID: <20211230034657.GB55356@linux.alibaba.com> (raw)
In-Reply-To: <444f15e6-db60-3e86-fe20-32f24928844c@linux.ibm.com>
On Wed, Dec 29, 2021 at 01:33:20PM +0100, Karsten Graul wrote:
>On 28/12/2021 10:03, Dust Li wrote:
>> A crash occurs when smc_cdc_tx_handler() tries to access smc_sock
>> but smc_release() has already freed it.
>
>I am not sure about what happened here.
>Your patch removes the whole dismisser concept that was introduced to
>solve exactly the problem you describe. And you implemented a different approach.
>
>In theory, when smc_cdc_tx_handler() is called but the connection is already
>freed than the connection should have gone through smc_cdc_tx_dismiss_slots(),
>called by smc_conn_kill() or smc_conn_free(). If that happened there would be no
>access to an already freed address in smc_cdc_tx_handler().
>
>Can you explain why the code reached smc_cdc_tx_handler() with cdcpend->conn
>pointing to a connection that is already freed? I think if there is a bug it should
>be fixed instead of replacing the code by a new construct.
>
>Thoughts?
Yes, at first we do try to fix this on the original path, but finally failed.
that's why we turned into this way.
This bug can be reproduced in our environment pretty fast running the
following test:
server:
smc_run nginx
client:
while true; do smc_run wrk -c 1000 -t 4 -d 20 http://smc-server; done
The reason is smc_cdc_tx_handler() checks whether cdcpend->conn == NULL
or not, and will access to the connection if it's not NULL.
But for short TCP flows(transfered to SMC flow), it is likely to close()
the connection very soon. Since smc_cdc_tx_handler() is running in the
soft IRQ context, and close(2) is running in the process context.
There is a chance of race as describe below:
smc_cdc_tx_handler() |smc_release()
if (!conn) |
|
|smc_cdc_tx_dismiss_slots()
| smc_cdc_tx_dismisser()
|
|sock_put(&smc->sk) <- last sock_put,
| smc_sock freed
bh_lock_sock(&smc->sk) (panic) |
If the check at the left passed, and then the application closed the
smc_socket and dismissed the cdcpend with smc_connection.
smc_cdc_tx_handler() still think the smc_sock is alive but actually
it's already released, and further bh_lock_sock(&smc->sk) will cause
the kernel panic.
As long as the check in smc_cdc_tx_handler() is not synchronized with
the smc_release(), this is inevitable. And unfortunately, I haven't
found a good way to synchronized them with the dismisser.
So my final solution is to remove the dissmiser and introduce the
refcount for tx cdc messages to protect the visit from
smc_cdc_tx_handler() from smc_release().
With the refcount, we can make sure the smc_conn_free() will wait for
all pending tx cdc messages done by the underlying RDMA device and
smc_cdc_tx_handler() is always safe to visit the smc_sock.
With this patchset, I run the wrk/nginx test cases for 2 days without
any panic(No link down/up is performed) any more.
next prev parent reply other threads:[~2021-12-30 3:47 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-28 9:03 [PATCH net 0/2] net/smc: fix kernel panic caused by race of smc_sock Dust Li
2021-12-28 9:03 ` [PATCH net 1/2] net/smc: don't send CDC/LLC message if link not ready Dust Li
2021-12-29 12:36 ` Karsten Graul
2021-12-30 3:02 ` dust.li
2021-12-30 18:55 ` Karsten Graul
2021-12-31 3:15 ` dust.li
2022-01-03 10:40 ` Karsten Graul
2021-12-31 6:08 ` [PATCH net] net/smc: add comments for smc_link_{usable|sendable} Dust Li
2022-01-02 16:20 ` patchwork-bot+netdevbpf
2021-12-28 9:03 ` [PATCH net 2/2] net/smc: fix kernel panic caused by race of smc_sock Dust Li
2021-12-29 12:33 ` Karsten Graul
2021-12-30 3:46 ` dust.li [this message]
2021-12-28 12:50 ` [PATCH net 0/2] " patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211230034657.GB55356@linux.alibaba.com \
--to=dust.li@linux.alibaba.com \
--cc=davem@davemloft.net \
--cc=guwen@linux.alibaba.com \
--cc=kgraul@linux.ibm.com \
--cc=kuba@kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=tonylu@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox