From: Bob Pearson <rpearsonhpe@gmail.com>
To: Zhu Yanjun <zyjzyj2000@gmail.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>,
jgg@ziepe.ca, leon@kernel.org, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [for-next PATCH v2 1/2] RDMA/rxe: Remove unnecessary mr testing
Date: Mon, 24 Oct 2022 09:09:32 -0500 [thread overview]
Message-ID: <454431b3-0e65-6181-0849-4b0bc7d7ad3d@gmail.com> (raw)
In-Reply-To: <CAD=hENeTnLGN5KKi_Cwni6FDb77prijHVjCjfdvhrjXdp7bdjQ@mail.gmail.com>
On 10/23/22 21:25, Zhu Yanjun wrote:
> On Mon, Oct 24, 2022 at 2:05 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>>
>> On 10/21/22 20:09, Li Zhijian wrote:
>>>
>>>
>>> On 21/10/2022 22:39, Zhu Yanjun wrote:
>>>> On Fri, Oct 21, 2022 at 3:53 PM Li Zhijian <lizhijian@fujitsu.com> wrote:
>>>>> Before the testing, we already passed it to rxe_mr_copy() where mr could
>>>>> be dereferenced. so this checking is not exactly correct.
>>>>>
>>>>> I tried to figure out the details how/when mr could be NULL, but failed
>>>>> at last. Add a WARN_ON(!mr) to that path to tell us more when it
>>>>> happends.
>>>> If I get you correctly, you confronted a problem,
>>> Not exactly, I removed the mr checking since i think this checking is not correct.
>>> the newly added WARN_ON(!mr) is the only once place where the mr can be NULL but not handled correctly.
>>> At least with/without this patch, once WARN_ON(!mr) is triggered, kernel will go something wrong.
>>>
>>> so i want to place this WARN_ON(!mr) to point to the problem.
>>>
>>> Thanks
>>> Zhijian
>>>
>>>> but you can not figure it out.
>>>> So you send it upstream as a patch?
>>>>
>>>> I am not sure if it is a good idea.
>>>>
>>>> Zhu Yanjun
>>>>
>>>>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>>>>> ---
>>>>> drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
>>>>> index ed5a09e86417..218c14fb07c6 100644
>>>>> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
>>>>> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
>>>>> @@ -778,6 +778,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
>>>>> if (res->state == rdatm_res_state_new) {
>>>>> if (!res->replay) {
>>>>> mr = qp->resp.mr;
>>>>> + WARN_ON(!mr);
>>>>> qp->resp.mr = NULL;
>>>>> } else {
>>>>> mr = rxe_recheck_mr(qp, res->read.rkey);
>>>>> @@ -811,8 +812,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
>>>>>
>>>>> rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
>>>>> payload, RXE_FROM_MR_OBJ);
>>>>> - if (mr)
>>>>> - rxe_put(mr);
>>>>> + rxe_put(mr);
>>>>>
>>>>> if (bth_pad(&ack_pkt)) {
>>>>> u8 *pad = payload_addr(&ack_pkt) + payload
>>>>> --
>>>>> 2.31.1
>>>>>
>>>
>>
>> Li is correct that the only way mr could be NULL is if qp->resp.mr == NULL. So the
>
> What I am concerned about is if "WARN_ON(!mr);" should be added or not.
> IMO, if the root cause remains unclear, this should be a problem.
> Currently this problem is not fixed. It is useless to send a debug
> statement to the maillist.
Li was fixing a bug that no one ever saw. mr is not NULL in this case.
Bob
>
> Zhu Yanjun
>
>> 'if (mr)' is not needed if that is the case. The read_reply subroutine is reached
>> from a new rdma read operation after going through check_rkey or from a previous
>> rdma read operations from get_req if qp->resp.res != NULL or from a duplicate request
>> where the previous responder resource is found. In all these cases the mr is set.
>> Initially in check_rkey where if it can't find the mr it causes an RKEY_VIOLATION.
>> Thereafter the rkey is stored in the responder resources and looked up for each
>> packet to get an mr or cause an RKEY_VIOLATION. So the mr can't be NULL. I think
>> you can leave out the WARN and just drop the if (mr).
>>
>> Bob
>>
next prev parent reply other threads:[~2022-10-24 20:02 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-21 7:52 [for-next PATCH v2 1/2] RDMA/rxe: Remove unnecessary mr testing Li Zhijian
2022-10-21 7:52 ` Li Zhijian
2022-10-21 14:39 ` Zhu Yanjun
2022-10-22 1:09 ` Li Zhijian
2022-10-23 18:05 ` Bob Pearson
2022-10-24 2:25 ` Zhu Yanjun
2022-10-24 3:15 ` Li Zhijian
2022-10-24 14:09 ` Bob Pearson [this message]
2022-10-24 3:26 ` Li Zhijian
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=454431b3-0e65-6181-0849-4b0bc7d7ad3d@gmail.com \
--to=rpearsonhpe@gmail.com \
--cc=jgg@ziepe.ca \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=lizhijian@fujitsu.com \
--cc=zyjzyj2000@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox