From: "D. Wythe" <alibuda@linux.alibaba.com>
To: Wenjia Zhang <wenjia@linux.ibm.com>,
kgraul@linux.ibm.com, jaka@linux.ibm.com, wintera@linux.ibm.com
Cc: kuba@kernel.org, davem@davemloft.net, netdev@vger.kernel.org,
linux-s390@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: Re: [PATCH net 5/5] net/smc: put sk reference if close work was canceled
Date: Fri, 20 Oct 2023 10:41:13 +0800 [thread overview]
Message-ID: <94f89147-cedc-b8b2-415f-942ec14cd670@linux.alibaba.com> (raw)
In-Reply-To: <990a6b09-135a-41fb-a375-c37ffec6fe99@linux.ibm.com>
On 10/20/23 1:40 AM, Wenjia Zhang wrote:
>
>
> On 19.10.23 09:33, D. Wythe wrote:
>>
>>
>> On 10/19/23 4:26 AM, Wenjia Zhang wrote:
>>>
>>>
>>> On 17.10.23 04:06, D. Wythe wrote:
>>>>
>>>>
>>>> On 10/13/23 3:04 AM, Wenjia Zhang wrote:
>>>>>
>>>>>
>>>>> On 11.10.23 09:33, D. Wythe wrote:
>>>>>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>>>>>
>>>>>> Note that we always hold a reference to sock when attempting
>>>>>> to submit close_work.
>>>>> yes
>>>>> Therefore, if we have successfully
>>>>>> canceled close_work from pending, we MUST release that reference
>>>>>> to avoid potential leaks.
>>>>>>
>>>>> Isn't the corresponding reference already released inside the
>>>>> smc_close_passive_work()?
>>>>>
>>>>
>>>> Hi Wenjia,
>>>>
>>>> If we successfully cancel the close work from the pending state,
>>>> it means that smc_close_passive_work() has never been executed.
>>>>
>>>> You can find more details here.
>>>>
>>>> /**
>>>> * cancel_work_sync - cancel a work and wait for it to finish
>>>> * @work:the work to cancel
>>>> *
>>>> * Cancel @work and wait for its execution to finish. This function
>>>> * can be used even if the work re-queues itself or migrates to
>>>> * another workqueue. On return from this function, @work is
>>>> * guaranteed to be not pending or executing on any CPU.
>>>> *
>>>> * cancel_work_sync(&delayed_work->work) must not be used for
>>>> * delayed_work's. Use cancel_delayed_work_sync() instead.
>>>> *
>>>> * The caller must ensure that the workqueue on which @work was last
>>>> * queued can't be destroyed before this function returns.
>>>> *
>>>> * Return:
>>>> * %true if @work was pending, %false otherwise.
>>>> */
>>>> boolcancel_work_sync(structwork_struct *work)
>>>> {
>>>> return__cancel_work_timer(work, false);
>>>> }
>>>>
>>>> Best wishes,
>>>> D. Wythe
>>> As I understand, queue_work() would wake up the work if the work is
>>> not already on the queue. And the sock_hold() is just prio to the
>>> queue_work(). That means, cancel_work_sync() would cancel the work
>>> either before its execution or after. If your fix refers to the
>>> former case, at this moment, I don't think the reference can be
>>> hold, thus it is unnecessary to put it.
>>>>
>>
>> I am quite confuse about why you think when we cancel the work before
>> its execution,
>> the reference can not be hold ?
>>
>>
>> Perhaps the following diagram can describe the problem in better way :
>>
>> smc_close_cancel_work
>> smc_cdc_msg_recv_action
>>
>>
>> sock_hold
>> queue_work
>> if (cancel_work_sync()) // successfully cancel before execution
>> sock_put() // need to put it since we already
>> hold a ref before queue_work()
>>
>>
> ha, I already thought you might ask such question:P
>
> I think here two Problems need to be clarified:
>
> 1) Do you think the bh_lock_sock/bh_unlock_sock in the
> smc_cdc_msg_recv does not protect the smc_cdc_msg_recv_action() from
> cancel_work_sync()?
> Maybe that would go back to the discussion in the other patch on the
> behaviors of the locks.
>
Yes. bh_lock_sock/bh_unlock_sock can not block code execution protected
by lock_sock/unlock(). That is to say, they are not exclusive.
We can use a very simple example to infer that since bh_lock_sock is
type of spin-lock, if bh_lock_sock/bh_unlock_sock can block
lock_sock/unlock(),
then lock_sock/unlock() can also block bh_lock_sock/bh_unlock_sock.
If this is true, when the process context already lock_sock(), the
interrupt context must wait for the process to call
release_sock(). Obviously, this is very unreasonable.
> 2) If the queue_work returns true, as I said in the last main, the
> work should be (being) executed. How could the cancel_work_sync()
> cancel the work before execution successgully?
No, that's not true. In fact, if queue_work returns true, it simply
means that we have added the task to the queue and may schedule a worker
to execute it,
but it does not guarantee that the task will be executed or is being
executed when it returns true,
the task might still in the list and waiting some worker to execute it.
We can make a simple inference,
1. A known fact is that if no special flag (WORK_UNBOUND) is given,
tasks submitted will eventually be executed on the CPU where they were
submitted.
2. If the queue_work returns true, the work should be or is being executed
If all of the above are true, when we invoke queue_work in an interrupt
context, does it mean that the submitted task will be executed in the
interrupt context?
Best wishes,
D. Wythe
>
>>>>>> Fixes: 42bfba9eaa33 ("net/smc: immediate termination for SMCD
>>>>>> link groups")
>>>>>> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>>>>>> ---
>>>>>> net/smc/smc_close.c | 3 ++-
>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/net/smc/smc_close.c b/net/smc/smc_close.c
>>>>>> index 449ef45..10219f5 100644
>>>>>> --- a/net/smc/smc_close.c
>>>>>> +++ b/net/smc/smc_close.c
>>>>>> @@ -116,7 +116,8 @@ static void smc_close_cancel_work(struct
>>>>>> smc_sock *smc)
>>>>>> struct sock *sk = &smc->sk;
>>>>>> release_sock(sk);
>>>>>> - cancel_work_sync(&smc->conn.close_work);
>>>>>> + if (cancel_work_sync(&smc->conn.close_work))
>>>>>> + sock_put(sk);
>>>>>> cancel_delayed_work_sync(&smc->conn.tx_work);
>>>>>> lock_sock(sk);
>>>>>> }
>>>>
>>
>>
next prev parent reply other threads:[~2023-10-20 2:41 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-11 7:33 [PATCH net 0/5] net/smc: bugfixs for smc-r D. Wythe
2023-10-11 7:33 ` [PATCH net 1/5] net/smc: fix dangling sock under state SMC_APPFINCLOSEWAIT D. Wythe
2023-10-11 14:00 ` Dust Li
2023-10-11 20:31 ` Wenjia Zhang
2023-10-12 2:47 ` D. Wythe
[not found] ` <f8089b26-bb11-f82d-8070-222b1f8c1db1@linux.alibaba.com>
2023-10-12 11:51 ` Wenjia Zhang
2023-10-13 5:32 ` Dust Li
2023-10-13 11:52 ` Wenjia Zhang
2023-10-13 12:27 ` Dust Li
2023-10-17 2:00 ` D. Wythe
2023-10-17 8:39 ` Dust Li
2023-10-17 17:03 ` Wenjia Zhang
[not found] ` <4065e94f-f7ea-7943-e2cc-0c7d3f9c788b@linux.alibaba.com>
2023-10-19 11:54 ` Wenjia Zhang
2023-10-23 20:53 ` Wenjia Zhang
2023-10-11 7:33 ` [PATCH net 2/5] net/smc: fix incorrect barrier usage D. Wythe
2023-10-11 8:44 ` Heiko Carstens
2023-10-11 8:57 ` D. Wythe
2023-10-11 7:33 ` [PATCH net 3/5] net/smc: allow cdc msg send rather than drop it with NULL sndbuf_desc D. Wythe
2023-10-11 20:37 ` Wenjia Zhang
2023-10-12 2:49 ` D. Wythe
2023-10-12 15:15 ` Wenjia Zhang
2023-10-11 7:33 ` [PATCH net 4/5] net/smc: protect connection state transitions in listen work D. Wythe
2023-10-12 17:14 ` Wenjia Zhang
2023-10-31 3:04 ` D. Wythe
2023-10-11 7:33 ` [PATCH net 5/5] net/smc: put sk reference if close work was canceled D. Wythe
2023-10-11 14:54 ` Dust Li
2023-10-12 19:04 ` Wenjia Zhang
[not found] ` <ee641ca5-104b-d1ec-5b2a-e20237c5378a@linux.alibaba.com>
2023-10-18 20:26 ` Wenjia Zhang
2023-10-19 7:33 ` D. Wythe
2023-10-19 17:40 ` Wenjia Zhang
2023-10-20 2:41 ` D. Wythe [this message]
2023-10-23 8:19 ` Wenjia Zhang
2023-10-23 8:52 ` D. Wythe
2023-10-23 10:28 ` Wenjia Zhang
2023-10-23 11:56 ` Dust Li
[not found] ` <59c0c75f-e9df-2ef1-ead2-7c5c97f3e750@linux.alibaba.com>
2023-10-23 20:52 ` Wenjia Zhang
2023-10-12 13:43 ` [PATCH net 0/5] net/smc: bugfixs for smc-r Alexandra Winter
2023-10-17 1:56 ` D. Wythe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=94f89147-cedc-b8b2-415f-942ec14cd670@linux.alibaba.com \
--to=alibuda@linux.alibaba.com \
--cc=davem@davemloft.net \
--cc=jaka@linux.ibm.com \
--cc=kgraul@linux.ibm.com \
--cc=kuba@kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=wenjia@linux.ibm.com \
--cc=wintera@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox